RFR (M): 8143925: Enhancing CounterMode.crypt() for AES
john.r.rose at oracle.com
Tue Jan 12 08:45:16 UTC 2016
On Jan 11, 2016, at 7:54 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>> As a general comment, would it make sense to assume exceptional paths are not taken in most Java code? That is, for code optimization purposes it's probably a reasonable assumption. It seems like having an exceptional path is already a hint that it's not expected to fail; most Java devs know not to use exceptions for expected control flow.
> That sounds reasonable. There’s a BailoutToInterpreterForThrows command line argument that does that (off by default, not available in product builds). I don’t know what the history behind it is.
It is reasonable in *most* code, which means you have to be ready to run into a bit of code which misbehaves, and mark the profile so you treat that bit of code specially. Key example: Null checks are *mostly* uncommon, so we work hard to turn them into implicit (trap-bearing) instructions. But those which are not get rewritten differently, after the trap happens, using the profile marks. In other words, we speculate that a null check that throws uncommonly finds a null (pending evidence to the contrary).
Hence the subtle interactions with profiles, including the profile inside a bytecoded method like checkIndex. With an intrinsic, you can say (just for that intrinsic), "ignore special-case marks in my profile". We don't have a good-enough heuristic (in the absence of split profiles) for detecting which normal methods can be treated this way.
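For concreteness, here is a hedged sketch in plain Java (class and method names are illustrative; it assumes the `Objects.checkIndex(int, int)` API being added for JDK 9) contrasting the hand-written range-check idiom, which the compiler must pattern-match, with the intrinsic form, which carries the hint explicitly:

```java
import java.util.Objects;

public class RangeCheckShapes {
    // Hand-written idiom: two signed comparisons plus an explicit throw.
    // The JIT must recognize this bytecode shape to treat the throw as
    // an uncommon trap, subject to the method's profile marks.
    static int getManual(int[] a, int i) {
        if (i < 0 || i >= a.length) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return a[i];
    }

    // Intrinsic form: Objects.checkIndex returns the index when valid,
    // so the compiler can ignore special-case marks in its profile.
    static int getChecked(int[] a, int i) {
        return a[Objects.checkIndex(i, a.length)];
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 30};
        System.out.println(getManual(a, 1));   // prints 20
        System.out.println(getChecked(a, 2));  // prints 30
    }
}
```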
>> Could bytecode shape just like checkIndex be treated as same hint? Are there cases where something looks like checkIndex but really isn't?
> That sounds like a good suggestion. We would trade:
> 2 comparisons: i < 0 || i >= length
> for:
> 2 comparisons: length < 0 || i >=u length
> so even if it doesn't result in further improvements, we wouldn't lose anything.
In the small scale you don't lose anything, but (as I said earlier) in the IR graph at scale you lose the opportunity to common up certain expressions.
Since Java expressions don't have unsigned comparison operators (and programmers wouldn't use them consistently for index checking, even if they were available), the comparison expressions available in the IR for commoning are signed, except for those which have been converted to unsigned. Converting speculatively from signed to unsigned (as suggested above) would seem to be harmless, but unless it is somehow limited to comparisons that *all* go unsigned, you could get a mix of signed and unsigned versions of the same logic, which would (worst case) double the number of tests in the object code. Using an intrinsic for range checks (which is the current case with aaload and will also be the case with checkIndex) allows us to reduce the number of unsigned comparisons to those which actually pay off.
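The signed-to-unsigned conversion in question can be checked directly at the source level: whenever `length >= 0`, the pair of signed tests collapses to one unsigned test. A minimal sketch (class name is illustrative) using `Integer.compareUnsigned`:

```java
public class UnsignedCheckDemo {
    // Signed form: two comparisons in the IR.
    static boolean inBoundsSigned(int i, int length) {
        return i >= 0 && i < length;
    }

    // Unsigned form: one comparison, equivalent whenever length >= 0,
    // because a negative i reinterprets as a huge unsigned value and
    // therefore compares above any non-negative length.
    static boolean inBoundsUnsigned(int i, int length) {
        return Integer.compareUnsigned(i, length) < 0;
    }

    public static void main(String[] args) {
        int[][] cases = {{3, 10}, {-1, 10}, {10, 10}, {0, 0}};
        for (int[] c : cases) {
            boolean s = inBoundsSigned(c[0], c[1]);
            boolean u = inBoundsUnsigned(c[0], c[1]);
            System.out.println(c[0] + "," + c[1] + " -> " + s + " " + u);
        }
    }
}
```

The mixing problem described above arises when only some occurrences of the signed form are rewritten this way: the two shapes no longer common up, and both survive in the object code.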
I hope everybody understands that I am not arguing *against* strong IR normalization and automatic detection of idioms, but rather observing that, powerful as those desirable techniques are, they are not infallible, and sometimes benefit from user-driven help via explicit operators, like checkIndex. Of course we want to fold user code which "works just like" checkIndex or the aaload check into the same good IR. But we don't want to rely on this auto-detection always, and we want to tread carefully when balancing the various IR normalization rules, which may work either for or against the use case of checkIndex detection.
Open question: Given a "sea of nodes" encoding various configurations of signed and unsigned comparisons, how do we normalize them so that (a) we maximize commoning (U with U and S with S), and (b) we end up with all available clever uses of U mode to fold <=0 checks? Or, more to the point, how do we arrange these choices so (c) the dynamic number of comparisons (of either mode) is minimized? Given that (a) and (b) can sometimes work against each other, what's a good heuristic for binning comparisons into U and S categories, for subsequent CSE? (So, checkIndex is a hint for binning.)
Personal background: About 10 years ago I worked on opto/subnode.cpp to try to switch between S and U modes more vigorously, implementing something probably related to what Vitaly is advocating. I ran into the (a) vs. (b) tradeoff, especially trying to preserve the aggressive matching of dominating tests in opto/ifnode.cpp. I think it can be made better than it is today, but the details are very tricky. Conjecture: It might help if the TypeInt lattice could encode ranges in the uint32 space, just as (today) it encodes ranges in int32, since some of the heuristics are type-driven.
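The conjecture about the TypeInt lattice can be illustrated with a toy sketch (names are hypothetical, not HotSpot's): a signed int32 interval maps to a single uint32 interval exactly when it does not straddle the sign bit, which is the case where a type-driven heuristic could justify the U-mode rewrite.

```java
import java.util.Arrays;

// Hypothetical sketch of a TypeInt-like interval that also derives a
// uint32 view of its signed range.
public class DualRange {
    final int lo, hi;  // signed int32 bounds, lo <= hi

    DualRange(int lo, int hi) {
        if (lo > hi) throw new IllegalArgumentException("empty range");
        this.lo = lo;
        this.hi = hi;
    }

    // The same set of values viewed as uint32: a single interval unless
    // the signed range straddles the sign boundary, in which case the
    // unsigned image splits in two and we give up (return null).
    long[] unsignedView() {
        if (lo >= 0 || hi < 0) {
            return new long[] { Integer.toUnsignedLong(lo),
                                Integer.toUnsignedLong(hi) };
        }
        return null;  // image is [0, hi] plus [2^32 + lo, 2^32 - 1]
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(new DualRange(0, 7).unsignedView()));
        System.out.println(Arrays.toString(new DualRange(-2, -1).unsignedView()));
        System.out.println(new DualRange(-1, 1).unsignedView());  // null
    }
}
```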
More information about the hotspot-compiler-dev mailing list