[PATCH] 4511638: Double.toString(double) sometimes produces incorrect results
raffaello.giulietti at gmail.com
Thu Oct 4 16:14:55 UTC 2018
On 2018-10-04 15:28, Ulf Adams wrote:
> On Thu, Sep 27, 2018 at 5:37 PM Raffaello Giulietti
> <raffaello.giulietti at gmail.com> wrote:
>
> Hi Ulf,
>
>
> On 2018-09-27 16:40, Ulf Adams wrote:
> > Hi Raffaello,
> >
> > I am the author of a recent publication on double to string conversion
> > [1] - the Ryu algorithm. I've been aware of the problems with the Jdk
> > for several years, and am very much looking forward to improvements in
> > correctness and performance in this area.
> >
>
> What a coincidence! I'm happy to hear that the quest for better
> floating->string conversions has not stopped. Tomorrow I'll download
> your paper and have a look at it during the weekend.
>
>
> Have you had a chance to take a look?
>
> (I'm traveling for the next ~10 days and at a conference, so don't
> expect too much from me during that time.)
>
>
I had a cursory reading but couldn't dig deeper for now. If nothing
unexpected happens, I should be able to study your paper during the weekend.
>
>
>
> > I have done some testing against my Java implementation of the Ryu
> > algorithm described in the linked paper. Interestingly, I've found a few
> > cases where they output different results. In particular:
> > 1.0E-323 is printed as 9.9E-324
> > 1.0E-322 is printed as 9.9E-323
>
> If Ryu also produces 1 digit long outputs, then your results above are
> correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324,
> for example.
> Even better, it should output 5E-324, 1E-323 and 1E-322 because adding
> the .0 part might confuse a human reader to believe that 2 digits are
> really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to
> the double.
>
>
> The C version produces 1 digit long outputs, and I was trying to follow
> the Java spec in the Java version, but the code to do so isn't quite
> right. Unfortunately, I haven't yet been able to fix it.
>
>
>
> 2 digits are for backward compatibility with the existing spec which
> requires at least one digit to the right of the decimal point.
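The trade-off discussed above can be checked directly. The following is a minimal sketch, not code from the patch or from Ryu; the class name SubnormalDecimalSketch and the helper distance are made up for illustration. It shows that "1.0E-323", "9.9E-324" and "1E-323" all parse back to the same double (twice Double.MIN_VALUE), and that, among the two 2-digit candidates, 9.9E-324 lies closer to the exact value of that double.

```java
import java.math.BigDecimal;

final class SubnormalDecimalSketch {
    // Exact absolute distance between a decimal string and a double.
    // new BigDecimal(double) yields the exact binary value, with no rounding.
    static BigDecimal distance(String decimal, double v) {
        return new BigDecimal(decimal).subtract(new BigDecimal(v)).abs();
    }

    public static void main(String[] args) {
        double v = 2 * Double.MIN_VALUE; // second smallest positive double, 2^-1073

        // All three decimals round to the same double when parsed.
        System.out.println(Double.parseDouble("1.0E-323") == v);
        System.out.println(Double.parseDouble("9.9E-324") == v);
        System.out.println(Double.parseDouble("1E-323") == v);

        // Of the two 2-digit candidates, 9.9E-324 is the closer one.
        System.out.println(
            distance("9.9E-324", v).compareTo(distance("1.0E-323", v)) < 0);
    }
}
```

The sketch deliberately avoids printing Double.toString(v), since that output depends on which JDK version runs it.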
>
>
> >
> > It's likely that there are more such cases - I only ran a sample of
> > double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit
> > output and Ryu is incorrect here. That's what you get when you have a
> > special case for Java without a correctness proof. :-(
> >
> > In terms of performance, this algorithm performs almost exactly the same
> > as my Java implementation of Ryu, although I'd like to point out that my
> > C implementation of Ryu is quite a bit faster (though note that it
> > generates different output, in particular, it only outputs a single
> > digit of precision in the above cases, rather than two), and I didn't
> > backport all the performance improvements from the Java version, yet. It
> > looks like this is not a coincidence - as far as I can see so far, it's
> > algorithmically very similar, although it manages to avoid the loop I'm
> > using in Ryu to find the shortest representation.
> >
> > I have a few comments:
> >
> > * <li> It rounds to {@code v} according to the usual round-to-closest
> > * rule of IEEE 754 floating-point arithmetic.
> > - Since you're spelling out the rounding rules just below, this is
> > duplicated, and by itself, it's unclear since it doesn't specify the
> > specific sub-type (round half even).
> >
>
> I tried to save as much of the original spec wording as possible.
> Perhaps it isn't worthwhile.
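To illustrate the "round half even" sub-type mentioned in the comment above, here is a small sketch. The class name HalfEvenSketch is hypothetical. It uses integers just above 2^53, where adjacent doubles are 2 apart, so every odd integer is an exact tie between two doubles. Both Double.parseDouble and the long-to-double cast resolve the tie toward the neighbour whose significand is even.

```java
final class HalfEvenSketch {
    // Convert via the decimal string, as a reader of Double.toString output would.
    static double viaParse(long n) {
        return Double.parseDouble(Long.toString(n));
    }

    public static void main(String[] args) {
        long base = 1L << 53; // 9007199254740992, exactly representable

        // base + 1 is halfway between base and base + 2: the tie goes to base,
        // whose significand is even.
        System.out.println(viaParse(base + 1) == (double) base);

        // base + 3 is halfway between base + 2 and base + 4: the tie goes to
        // base + 4, again the neighbour with the even significand.
        System.out.println(viaParse(base + 3) == (double) (base + 4));

        // The primitive cast follows the same rule.
        System.out.println((double) (base + 1) == (double) base);
    }
}
```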
>
>
>
> > - Naming: I'd strongly suggest to use variable names that relate to
> > what's stored, e.g., m for mantissa, e for exponent, etc.
> >
>
> I currently prefer to be consistent with a forthcoming paper of mine on
> the subject. But thanks for the suggestion.
>
>
> May I suggest that the paper also uses names that relate to what they're
> referring to? :-) Not that I've managed to do that very well myself...
>
I tend to use short "mathematical" names that still evoke their
semantics. Will see if I manage to be consistent.
>
>
>
> > - What's not clear to me is how the algorithm determines how many digits
> > to print.
> >
>
> You'll have to wait for the paper.
>
>
> Looking forward to it. I tried to reverse engineer the code, but it's
> far from obvious.
>
I don't think it is reversible, even for knowledgeable people like you :-(
I have to write the paper...
>
>
>
> > - Also, it might be nicer to move the long multiplications to a helper
> > method - at least from a short look, it looks like the computations of
> > vn, vnl, and vnr are identical.
> >
>
> I tried several variants: the current one seems to be the fastest with
> the current optimizations of C2. Some day I'll also try with Graal.
>
>
> Sure, but moving it to a method shouldn't affect performance (except if
> you need to return multiple values), and, right now, it looks like
> identical code.
>
>
Where possible, I try not to rely on C2 to inline "long" code like the
8 lines for the multiplication. In the end, it is repeated only 3 times
there and in a limited space, so it shouldn't come as a surprise to a
reader.
I'll certainly add a comment to warn the reader that the code is the
same, and I'll retry my experiments with extracting a method for the
multiplication, although it currently seems more complex than at first
sight. Unfortunately, Java does not yet support 128-bit quantities.
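For readers unfamiliar with the problem, the following sketch shows one common way to obtain the high 64 bits of a 64x64-bit unsigned product without a 128-bit type, packaged as a helper method. It is an illustration only and not the code from the patch: the class MulHighSketch and both method names are invented here, and the patch's actual multiplication may be organised differently (for example, it may need more than one result word, which is the case Ulf mentions where a helper becomes awkward).

```java
import java.math.BigInteger;

final class MulHighSketch {
    /** High 64 bits of the unsigned 128-bit product x * y (schoolbook method). */
    static long unsignedMulHigh(long x, long y) {
        long x0 = x & 0xFFFFFFFFL, x1 = x >>> 32; // low and high 32-bit halves
        long y0 = y & 0xFFFFFFFFL, y1 = y >>> 32;
        long p00 = x0 * y0;
        long p01 = x0 * y1;
        long p10 = x1 * y0;
        long p11 = x1 * y1;
        // Sum of everything that lands in bits 32..63, to extract the carry
        // into the upper word. The sum is below 3 * 2^32, so it cannot overflow.
        long mid = (p00 >>> 32) + (p01 & 0xFFFFFFFFL) + (p10 & 0xFFFFFFFFL);
        return p11 + (p01 >>> 32) + (p10 >>> 32) + (mid >>> 32);
    }

    /** Slow reference result computed with BigInteger, for checking only. */
    static long referenceMulHigh(long x, long y) {
        BigInteger bx = new BigInteger(Long.toUnsignedString(x));
        BigInteger by = new BigInteger(Long.toUnsignedString(y));
        return bx.multiply(by).shiftRight(64).longValue();
    }

    public static void main(String[] args) {
        long x = 0x9E3779B97F4A7C15L; // arbitrary test operands
        long y = 0xFFFFFFFFFFFFFFFFL;
        System.out.println(unsignedMulHigh(x, y) == referenceMulHigh(x, y));
    }
}
```

Since Java 9 the JDK also offers Math.multiplyHigh for the signed variant of this operation. Whether a helper like this gets inlined by C2 in the hot path is exactly the open question in the exchange above, so it would need to be measured rather than assumed.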
Greetings
Raffaello
>
>
>
> > - I looked through the spec, and it looks like all cases are
> > well-defined. Yay!
> >
> > I will need some more time to do a more thorough review of the code and
> > more testing for differences. Unfortunately, I'm also traveling the next
> > two weeks, so this might take a bit of time.
> >
>
> I thank you in advance for your willingness to review the code but my
> understanding is that only the officially appointed reviewers can
> approve OpenJDK contributions, which is of course a good policy.
> Besides, as two engineers from Red Hat, both named Andrew, correctly
> observe, understanding the rationale of the code without the planned
> accompanying paper is hard.
>
>
>
> > I'm not a contributor to the Jdk, and this isn't my full-time job. I was
> > lurking here because I was going to send a patch for the double to
> > string conversion code myself (based on Ryu).
> >
>
> All my efforts on this project are done in my unpaid spare time, too.
>
>
> > Thanks,
> >
> > -- Ulf
> >
> > [1] https://dl.acm.org/citation.cfm?id=3192369
> > [2] https://github.com/google/double-conversion
> > [3] https://en.wikipedia.org/wiki/Rounding
> >
>
>
> Thank you
> Raffaello
>
>
>
>
>
>