Contributed by James Cheng and modified by me.

To use intrinsics to accelerate SHA operations on multiple blocks [1], 
we need to pull a loop out of DigestBase.engineUpdate() into a new 
method, implCompressMultiBlock(), which contains only the loop and can 
be intrinsified.
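
As a rough illustration of the shape of this refactoring (a sketch only, 
not the actual patch): the class names, the blockSize field, and the 
meaning of 'limit' as an exclusive end offset are my assumptions here. 
Only implCompress() and implCompressMultiBlock() are names from the change.

```java
// Hypothetical sketch; everything except the method names implCompress and
// implCompressMultiBlock is illustrative, not the actual DigestBase code.
abstract class BlockDigestSketch {
    protected final int blockSize;

    protected BlockDigestSketch(int blockSize) {
        this.blockSize = blockSize;
    }

    // Compress exactly one block of blockSize bytes starting at b[ofs].
    abstract void implCompress(byte[] b, int ofs);

    // Contains only the block loop, so a JIT can substitute an intrinsic
    // for the whole method. 'limit' is the exclusive end of the input
    // (an assumption of this sketch). Returns the offset of the first
    // byte that was not consumed.
    int implCompressMultiBlock(byte[] b, int ofs, int limit) {
        while (limit - ofs >= blockSize) {
            implCompress(b, ofs);
            ofs += blockSize;
        }
        return ofs;
    }
}

// Toy subclass that only counts blocks, to make the sketch runnable.
final class CountingDigest extends BlockDigestSketch {
    int blocks;

    CountingDigest(int blockSize) {
        super(blockSize);
    }

    @Override
    void implCompress(byte[] b, int ofs) {
        blocks++;
    }
}
```

The caller keeps the buffering of partial blocks and only hands whole 
blocks to the loop method, which is what makes the method a self-contained 
target for an intrinsic.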

On platforms that do not use the intrinsic, the implCompressMultiBlock() 
method will be inlined by the JIT and the same code will be generated as 
before. So no performance regression with the pure Java SUN provider is 
expected.

About the arithmetic change: limit = ofs + len will not overflow an int 
because ofs <= b.length - len (there is a check for this).
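
To spell out the overflow argument, here is a small sketch. The helper 
name checkedLimit and the exact form of the range check are my assumptions 
for illustration. The point is only that once ofs <= b.length - len holds 
with non-negative ofs and len, the sum ofs + len is at most b.length.

```java
// Hypothetical helper illustrating the range check; not the actual code.
final class RangeCheckSketch {
    // Returns the exclusive end offset ofs + len, or throws if the range
    // [ofs, ofs + len) does not lie inside an array of the given length.
    static int checkedLimit(int arrayLength, int ofs, int len) {
        // arrayLength - len cannot overflow: both operands are non-negative
        // by the time the subtraction is evaluated.
        if (ofs < 0 || len < 0 || ofs > arrayLength - len) {
            throw new ArrayIndexOutOfBoundsException();
        }
        // Here 0 <= ofs <= arrayLength - len, hence
        // ofs + len <= arrayLength <= Integer.MAX_VALUE.
        return ofs + len;
    }
}
```

Writing the check as a subtraction rather than as ofs + len > b.length is 
what keeps the check itself free of overflow.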

Tested with the JDK jtreg tests and the new hotspot jtreg test James wrote for 


[1] https://bugs.openjdk.java.net/browse/JDK-8035968

