[PATCH] 4511638: Double.toString(double) sometimes produces incorrect results
Ulf Adams
ulfjack at google.com
Thu Oct 4 13:28:04 UTC 2018
On Thu, Sep 27, 2018 at 5:37 PM Raffaello Giulietti <
raffaello.giulietti at gmail.com> wrote:
> Hi Ulf,
>
>
> On 2018-09-27 16:40, Ulf Adams wrote:
> > Hi Raffaello,
> >
> > I am the author of a recent publication on double to string conversion
> > [1] - the Ryu algorithm. I've been aware of the problems with the Jdk
> > for several years, and am very much looking forward to improvements in
> > correctness and performance in this area.
> >
>
> What a coincidence! I'm happy to hear that the quest for better
> floating->string conversions has not stopped. Tomorrow I'll download
> your paper and have a look at it during the weekend.
>
Have you had a chance to take a look?
(I'm traveling for the next ~10 days and at a conference, so don't expect
too much from me during that time.)
>
>
>
> > I have done some testing against my Java implementation of the Ryu
> > algorithm described in the linked paper. Interestingly, I've found a few
> > cases where they output different results. In particular:
> > 1.0E-323 is printed as 9.9E-324
> > 1.0E-322 is printed as 9.9E-323
>
> If Ryu also produces 1 digit long outputs, then your results above are
> correct. But then Ryu should also output 5.0E-324 rather than 4.9E-324,
> for example.
> Even better, it should output 5E-324, 1E-323 and 1E-322 because adding
> the .0 part might confuse a human reader to believe that 2 digits are
> really needed. But then 4.9E-324, 9.9E-324 and 9.9E-323 are closer to
> the double.
>
The C version produces 1 digit long outputs, and I was trying to follow the
Java spec in the Java version, but the code to do so isn't quite right.
Unfortunately, I haven't yet been able to fix it.
>
> 2 digits are for backward compatibility with the existing spec which
> requires at least one digit to the right of the decimal point.
>
>
> >
> > It's likely that there are more such cases - I only ran a sample of
> > double-precision numbers. Arguably, 9.9 is the correctly rounded 2-digit
> > output and Ryu is incorrect here. That's what you get when you have a
> > special case for Java without a correctness proof. :-(
> >
> > In terms of performance, this algorithm performs almost exactly the same
> > as my Java implementation of Ryu, although I'd like to point out that my
> > C implementation of Ryu is quite a bit faster (though note that it
> > generates different output, in particular, it only outputs a single
> > digit of precision in the above cases, rather than two), and I didn't
> > backport all the performance improvements from the Java version, yet. It
> > looks like this is not coincidence - as far as I can see so far, it's
> > algorithmically very similar, although it manages to avoid the loop I'm
> > using in Ryu to find the shortest representation.
> >
> > I have a few comments:
> >
> > * <li> It rounds to {@code v} according to the usual
> > round-to-closest
> > * rule of IEEE 754 floating-point arithmetic.
> > - Since you're spelling out the rounding rules just below, this is
> > duplicated, and by itself, it's unclear since it doesn't specify the
> > specific sub-type (round half even).
> >
>
> I tried to save as much of the original spec wording as possible.
> Perhaps it isn't worthwhile.
>
>
>
> > - Naming: I'd strongly suggest to use variable names that relate to
> > what's stored, e.g., m for mantissa, e for exponent, etc.
> >
>
> I currently prefer to be consistent with a forthcoming paper of mine on
> the subject. But thanks for the suggestion.
>
May I suggest that the paper also uses names that relate to what they're
referring to? :-) Not that I've managed to do that very well myself...
>
>
> > - What's not clear to me is how the algorithm determines how many digits
> > to print.
> >
>
> You'll have to wait for the paper.
>
Looking forward to it. I tried to reverse engineer the code, but it's far
from obvious.
>
>
> > - Also, it might be nicer to move the long multiplications to a helper
> > method - at least from a short look, it looks like the computations of
> > vn, vnl, and vnr are identical.
> >
>
> I tried several variants: the current one seems to be the faster with
> the current optimizations of C2. Some day I'll also try with Graal.
>
Sure, but moving it to a method shouldn't affect performance (except if you
need to return multiple values), and, right now, it looks like identical
code.
>
>
>
> > - I looked through the spec, and it looks like all cases are
> > well-defined. Yay!
> >
> > I will need some more time to do a more thorough review of the code and
> > more testing for differences. Unfortunately, I'm also traveling the next
> > two weeks, so this might take a bit of time.
> >
>
> I thank you in advance for your willingness to review the code but my
> understanding is that only the officially appointed reviewers can
> approve OpenJDK contributions, which is of course a good policy.
> Besides, as two Andrews engineers from RedHat correctly observe,
> understanding the rationale of the code without the planned accompanying
> paper is hard.
>
>
>
> > I'm not a contributor to the Jdk, and this isn't my full-time job. I was
> > lurking here because I was going to send a patch for the double to
> > string conversion code myself (based on Ryu).
> >
>
> All my efforts on this projects are done in my unpaid spare time, too.
>
>
> > Thanks,
> >
> > -- Ulf
> >
> > [1] https://dl.acm.org/citation.cfm?id=3192369
> > [2] https://github.com/google/double-conversion
> > [3] https://en.wikipedia.org/wiki/Rounding
> >
>
>
> Thank you
> Raffaello
>
>
>
>
>
>
>
More information about the core-libs-dev
mailing list