Use of long in Nashorn
hannes.wallnoefer at oracle.com
Fri Dec 11 15:08:02 UTC 2015
I uploaded a new webrev that includes most of the changes you suggested.
Conversion of long from Java now works without losing precision, using
int, double, or Object arrays. I also added a test for this.
I didn't implement the int/double overloading of array iterator actions.
Unless I missed something, I would have to implement two forEach methods
in each subclass, which seems ugly and error-prone.
Additionally, I removed the ArrayData.set method that takes a long
value, something I had overlooked in my previous patch.
Am 2015-12-06 um 11:12 schrieb Hannes Wallnoefer:
> Thanks for the quick review, Attila. Answers inline.
> Am 2015-12-04 um 18:39 schrieb Attila Szegedi:
>> * In CodeGenerator SHR implementations (both self-assign and
>> ordinary) you have method.shr() in loadStack instead of consumeStack.
>> I was actually staring at this for a while as it seemed wrong to
>> perform an operation in loadStack, but in the end I decided it’s okay
>> like this. After all, it’s the toUint32 that’s the optimistic part
>> here, so this should be fine indeed. I think we had a JIRA issue
>> saying “make SHR optimistic” but I can’t find it now. If it pops up,
>> we can mark it as a duplicate of this one.
> I've looked for that Jira issue but didn't find it either.
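The optimistic SHR discussed above can be sketched roughly as follows. This is not Nashorn's actual code; the class, method names, and the `Deoptimize` exception are invented for illustration. The point is that the int-typed result is only valid while the uint32 value fits in a signed int, and the fallback computes the exact value as a double.

```java
// Sketch of an optimistic unsigned right shift (names invented, not Nashorn code).
public class OptimisticShr {
    /** Stands in for Nashorn's UnwarrantedOptimismException in this sketch. */
    public static class Deoptimize extends RuntimeException {}

    /** Optimistic variant: valid only while the uint32 result fits in int. */
    public static int shrInt(int x, int y) {
        int r = x >>> y;
        if (r < 0) {             // top bit set: uint32 value > Integer.MAX_VALUE
            throw new Deoptimize();
        }
        return r;
    }

    /** Fallback after deoptimization: exact uint32 result as double. */
    public static double shrDouble(int x, int y) {
        // widen to long via the mask so the unsigned value survives
        return (x >>> y) & 0xFFFFFFFFL;
    }
}
```

The int path covers the common case; only results with the top bit set (possible, as noted later in the thread, only when shifting by zero) force the double fallback.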
>> * I see "assert storeType != Type.LONG;” do we even need Type.LONG
>> and LongType class anymore?
> That assert is a leftover from the conversion process, it shouldn't be
> needed anymore. We do still use Type.LONG for creating and handling
> the primitive fields and spill slots with dual fields. That's why I
> had to keep it.
>> * Symbol.java: you could reclaim the HAS_LONG_VALUE bit by shifting
>> the rest down by one
> Will do.
>> * optimization idea: have versions of callback invokers in
>> NativeArray.java for both int and double indices. Since we know the
>> length of the array when we enter forEach etc. we could select the
>> double version when length > maxint and the int version otherwise.
>> Actually, we could even have IteratorAction.forEach be overloaded for
>> int and double, and write the body of IteratorAction.apply() to start
>> out with the int version, and when the index crosses maxint start
>> calling the double version (so even for large arrays we’ll iterate
>> calling int specialization of functions for the cases where it’s
>> short circuited).
> Nice idea, and should be easy to implement. I'll try it out.
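The overloading idea could look something like the sketch below (as noted at the top of this mail, it was ultimately not implemented). All names are invented; the element lookup is elided. The iteration starts with an int index and only switches to a double index once it crosses Integer.MAX_VALUE.

```java
// Hypothetical sketch of the int/double-overloaded iterator action (names invented).
public class IndexedIteration {
    public interface Action {
        void apply(Object element, int index);     // fast path for int indices
        void apply(Object element, double index);  // for indices > Integer.MAX_VALUE
    }

    /** Iterate [0, length): int overload while the index fits, then double. */
    public static void forEach(double length, Action action) {
        long intEnd = Math.min((long) length, (long) Integer.MAX_VALUE + 1);
        for (long i = 0; i < intEnd; i++) {
            action.apply(null, (int) i);   // element lookup elided in this sketch
        }
        for (double d = intEnd; d < length; d++) {
            action.apply(null, d);
        }
    }
}
```

For any array of ordinary length the double loop never runs, so short-circuiting callbacks always see the int specialization.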
>> * array length: could we still have Nashorn APIs that return long?
>> Optimistic filters will deal with these appropriately, won’t they? I
>> guess they should since they also need to be able to handle return
>> values from POJO methods that return long (e.g.
>> System.currentTimeMillis()). Hence, you could have NativeArray.length
>> return “long” and let the optimistic machinery decide whether to cast
>> it as int or double. That would allow you to not have to box the
>> return value of NativeArray.length.
> Yes, we could have things returning long, but it will deoptimize to
> Object. OptimisticReturnFilters (which do the runtime checks) are not
> used for ScriptObject properties.
>> * NativeNumber: unused import?
>> * Uint32ArrayData: getBoxedElementType went from INTEGER to DOUBLE.
>> I’m not sure I understand that. I mean, was INTEGER incorrect before?
> That was obviously incorrect before. Actually, that method is
> only used in NativeArray#concat and will never be invoked on typed
> arrays. Looking at NativeArray#concat, it looks a bit fishy
> to me, assuming all NativeArrays use ContinuousArrayData. I have to
> investigate this further.
> Back to the issue at hand, int.class/Integer.class is definitely wrong
> for element type for Uint32. When returning int.class in
> getElementType, optimistic code that uses the optimistic int getter gets
> incredibly slow when actually deoptimizing to double, because we keep
> trying to handle elements as ints. (I had this in my code at one time
> and found pdfjs slowed down to a crawl when changing the optimistic
> int getter to always deoptimize to double.)
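The element-type problem described above comes down to something like the following sketch (not the actual Uint32ArrayData code; names are invented). A Uint32 element is stored as a raw int, so the optimistic int getter is only valid while the top bit is clear; values above Integer.MAX_VALUE need the double getter for their exact value.

```java
// Sketch of optimistic Uint32 element access (names invented, not Nashorn code).
public class Uint32Elements {
    public static class Deoptimize extends RuntimeException {}

    /** Optimistic int getter: correct only while the stored uint32 fits in int. */
    public static int getInt(int[] storage, int index) {
        int raw = storage[index];
        if (raw < 0) {                 // uint32 value > Integer.MAX_VALUE
            throw new Deoptimize();    // caller must restart with the double getter
        }
        return raw;
    }

    /** Double getter: always yields the exact unsigned value. */
    public static double getDouble(int[] storage, int index) {
        return storage[index] & 0xFFFFFFFFL;
    }
}
```

Reporting int.class as the element type would keep steering callers back to `getInt`, which explains the crawl observed with pdfjs when large values force constant deoptimization.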
> Probably getBoxedElementType should just be a final method instead of
> abstract in ContinuousArrayData and convert getElementType to boxed
> counterpart on the fly.
>> * You didn’t remove LongArrayData.java?
> I think I did:
>> * It looks like long arrays can now lose precision if passed
>> through Java.from(). E.g. if you have Java methods “long[]
>> getLongArray()” and “void setLongArray(long[] arr)” then
>> pojo.setLongArray(Java.from(pojo.getLongArray())) will lose precision
>> because NativeJava.copyArray(long) produces double. Of course, we
>> could argue that this is expected behavior if you explicitly use
>> Java.from. Just passing around and manipulating a raw long won’t
>> have that effect (although it’d be good to confirm that in test), it
>> requires an explicit Java.from(). Still, I wonder if it’d make sense
>> to have copyArray(long) either return Object or choose
>> dynamically between double and Object based on the maximum
>> magnitude of an element (you can start with double and reallocate
>> as Object when you bump into a large long).
> Good catch. I'll see if I can use existing code in ArrayData to
> convert to the narrowest array type.
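The fix being discussed could look roughly like this (method name and shape invented, not the actual NativeJava.copyArray): start from the assumption that double[] suffices, and fall back to Object[] with boxed Longs as soon as any element would lose precision as a double.

```java
// Sketch of precision-preserving long[] copying for Java.from (names invented).
public class LongArrayCopy {
    public static Object copyArray(long[] src) {
        for (long v : src) {
            if ((long) (double) v != v) {   // round trip inexact: |v| beyond 2^53
                Object[] boxed = new Object[src.length];
                for (int i = 0; i < src.length; i++) {
                    boxed[i] = src[i];      // autobox as java.lang.Long
                }
                return boxed;
            }
        }
        double[] doubles = new double[src.length];
        for (int i = 0; i < src.length; i++) {
            doubles[i] = src[i];
        }
        return doubles;
    }
}
```

A variant closer to Attila's suggestion would fill the double[] first and reallocate as Object[] only when a large value is encountered, avoiding the second pass for the common case.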
>> Other than that: great work! Nice to see large swaths of code removed.
>>> On Dec 4, 2015, at 4:27 PM, Hannes Wallnoefer
>>> <hannes.wallnoefer at oracle.com> wrote:
>>> After receiving another long/precision related bug last week I
>>> decided to go ahead with the removal of longs in Nashorn. It's been
>>> quite an effort, but I think it looks good now. Below are the links
>>> to the webrev and Jira issue.
>>> This is a rather big patch, but it mostly removes code. There are
>>> over 2000 deletions vs. 400 insertions. I removed all uses of long
>>> in our code where it represented JS numbers, including ScriptObject
>>> property accessors with longs as key or value, and the LongArrayData
>>> class. With this, all JS numbers are represented as int or double in
>>> Nashorn. Longs will not "naturally" occur anymore and will only be
>>> present as java.lang.Long instances.
>>> As expected, the areas that demanded special care were those where
>>> ES prescribes use of uint32. These are mostly unsigned right shift,
>>> Uint32Array elements, and the length property of arrays. For right
>>> shift and Uint32Array elements, I added optimistic implementations
>>> that return int if possible and deoptimize to double. Pdfjs and
>>> mandreel are benchmarks that make use of these features, and I
>>> didn't notice any observable impact on performance. Even when I
>>> simulated fallback to double, performance was still ok (previously
>>> reported performance regressions for this case were in fact caused
>>> by an error of mine).
>>> For the Array length property, I changed the getter in NativeArray
>>> to return int or double depending on actual value. Previously, the
>>> length getter always returned long. Since the property is actually
>>> evaluated by OptimisticTypesCalculator, for-loops with an array
>>> length as limit now use ints instead of longs to iterate through
>>> array indices, which is probably a good thing.
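The length-getter change described above boils down to something like this sketch (illustrative only, not the actual NativeArray code): the raw uint32 length is returned as an int when it fits, and as a double otherwise, instead of always as a long.

```java
// Sketch of a length getter returning int or double by value (not Nashorn code).
public class ArrayLength {
    /** rawLength is a uint32 value, i.e. in [0, 2^32 - 1]. */
    public static Object length(long rawLength) {
        if (rawLength <= Integer.MAX_VALUE) {
            return (int) rawLength;     // boxes as Integer
        }
        return (double) rawLength;      // boxes as Double, exact for the uint32 range
    }
}
```

Since nearly all arrays have lengths below 2^31, loops bounded by `length` end up iterating with int indices, which is the behavior described above for OptimisticTypesCalculator.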
>>> As for longs returned by Java code, their value is always preserved.
>>> Only when they are used for JS math are they converted to double
>>> as prescribed by ECMA. When running with optimistic types we check
>>> if a long fits into an int or double to avoid deoptimization to
>>> object. Previously we did this only when converting long to
>>> optimistic int, but optimistic double did not use any return filter
>>> for longs, so a long returned for an optimistic double could easily
>>> lose precision.
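The runtime checks described for OptimisticReturnFilters can be sketched as below (names invented, not the actual Nashorn code). A long returned from Java is narrowed to the optimistic type only when the conversion is exact; otherwise the call site deoptimizes and the value survives as a boxed Long.

```java
// Sketch of optimistic return filters for long values (names invented).
public class LongReturnFilter {
    public static class Deoptimize extends RuntimeException {}

    /** Filter for a call site that optimistically expects int. */
    public static int filterInt(long value) {
        if (value != (int) value) {
            throw new Deoptimize();     // too large for int: try double next
        }
        return (int) value;
    }

    /** Filter for a call site that optimistically expects double. */
    public static double filterDouble(long value) {
        if ((long) (double) value != value) {   // magnitude beyond 2^53: inexact
            throw new Deoptimize();             // deoptimize to Object, keep the Long
        }
        return value;
    }
}
```

The double filter is the part that was previously missing: without it, a long above 2^53 returned into an optimistic-double call site would silently round.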
>>> You can find the related changes in OptimisticReturnFilters.java. I
>>> also added tests to make sure long values are preserved in various
>>> optimistic and non-optimistic contexts, some of which would
>>> previously have failed.
>>> A previous version of this patch included functionality to
>>> treat only ints and doubles or their wrapper objects as native JS
>>> numbers (e.g. you could invoke Number.prototype methods only on ints
>>> and doubles). However, this is quite a hairy issue and I reckoned
>>> the patch is large enough without it, so I pulled it out and will
>>> fix this separately as JDK-8143896.
>>> I've been testing and refining this patch for the last few days and
>>> think it looks pretty good. I thought it was a good idea to discuss
>>> this in this existing thread before posting a review request. Please
>>> let me know what you think.
>>> Am 2015-11-13 um 15:06 schrieb Attila Szegedi:
>>>> Well, we could just box them in that case. If we only used int and
>>>> double as primitive number types internally, then a natural
>>>> representation for a long becomes Long. Java methods returning long
>>>> could have an optimistic filter that first checks if the value fits
>>>> in an int (32-bit signed), then in a double (53-bit signed) without
>>>> loss of precision, and finally deoptimizes to Object and uses Long.
>>>> int->long->double->Object becomes int->double->Object, with longs
>>>> of too large magnitude becoming boxed.
>>>>> On Nov 13, 2015, at 2:46 PM, Jim Laskey
>>>>> (Oracle) <james.laskey at oracle.com> wrote:
>>>>> The main thing to watch for here is that longs/Longs need to
>>>>> survive unobstructed between Java calls. Remember we ran into
>>>>> trouble in this area (loss of precision when using longs for
>>>>>> On Nov 13, 2015, at 8:26 AM, Attila
>>>>>> Szegedi <attila.szegedi at oracle.com> wrote:
>>>>>>> On Nov 13, 2015, at 10:31 AM, Hannes
>>>>>>> Wallnoefer <hannes.wallnoefer at oracle.com> wrote:
>>>>>>> Hi all,
>>>>>>> I'm currently questioning our use of longs to represent numbers
>>>>>>> in combination with optimistic typing.
>>>>>> I often feel that the presence of longs is more hassle than
>>>>>> they’re worth. I’d be all for eliminating them.
>>>>>>> After all, there's a pretty large range where long arithmetic
>>>>>>> will be more precise than the doubles required by ECMA.
>>>>>> There are currently several different places in Nashorn where we
>>>>>> try to confine the precision of longs to 53 bits (so they aren’t
>>>>>> more precise than doubles). It’s a bit of a whack-a-mole where
>>>>>> you can’t always be sure whether you found all instances.
>>>>>>> For context see this bug, especially the last two comments (the
>>>>>>> original bug was about number to string conversion which has
>>>>>>> been solved in the meantime):
>>>>>>> So the question is: can we use longs at all and be confident
>>>>>>> that results won't be too precise (and thus, strictly speaking,
>>>>>> Internally, the longs are also used for representing UInt32 even
>>>>>> in non-optimistic setting, which is only really significant for
>>>>>> the >>> operator and array indices and lengths; maybe we should
>>>>>> limit longs to that usage only, or even just use doubles
>>>>>> internally for UInt32 values that can’t be represented as Int32.
>>>>>> FWIW, even for the >>> operator we only need it when shifting by
>>>>>> zero, as in every other case the result will have the topmost bit
>>>>>> set to 0 and thus fit in Int32 too.
>>>>>> I guess once Valhalla rolls around, we could easily have an
>>>>>> unsigned 32 bit int type.
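The observation about shift amounts can be checked directly: for any shift from 1 to 31, `>>>` leaves the top bit clear, so the result fits in Int32; only a shift by zero can produce a uint32 value above Integer.MAX_VALUE. A minimal illustration:

```java
// Checking the observation: only >>> 0 can yield a uint32 that overflows int.
public class ShrTopBit {
    /** True when (x >>> shift) fits in a signed 32-bit int as a uint32 value. */
    public static boolean fitsInt32(int x, int shift) {
        return (x >>> shift) >= 0;   // a negative int means the top bit is set
    }
}
```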