RFR: 8179004: Add an efficient implementation of the "count trailing zeros" operation
kim.barrett at oracle.com
Thu Apr 20 06:17:55 UTC 2017
Please review this addition of the count_trailing_zeros function.
Unfortunately, there isn't an obvious direct and portable way to write
such a function. But supported hardware architectures generally
provide an efficient implementation, e.g. a single instruction or a
short sequence. Compilers often provide "built-in" or "intrinsic"
access to that hardware implementation, or one can use inline
If a platform doesn't have such a built-in efficient implementation,
the function can be implemented using de Bruijn sequences as discussed
in the literature. But all of the OpenJDK-supported platforms provide
an efficient implementation, so we aren't providing such a fallback.
As part of reviewing this change, please feel free to suggest
alternative ways to structure the code. I'm not completely happy with
the way I've done it, but alternatives I've tried seemed worse.
Unit test for new function.
Experimented with modifying BitMap::get_next_one_offset and the like
to replace the existing for-loop with a call to this new function, and
measured substantial speedups on all tested platforms for a test
program that counts the bits in a bitmap by iterating calls to that
search function. The speedup varies with the density of set bits, with
very sparse or nearly full being similar to the existing code, since
the bit counting is not the dominant factor in those situations. But
in between give speedups of factors of x1.5 to x5 or more, depending
on the density and the platform.
More information about the hotspot-dev