sigsegv on porter stemmer (Lucene, but also otherwise)

Uwe Schindler uschindler at
Tue Jul 26 11:57:37 PDT 2011


With Vladimir's help we patched our test platform and verified that the
bug is fixed with Vladimir's latest patches (including already committed
patches not available in the latest Java 7 developer preview), so thanks for
that. I think you may post your patches for review here; we are happy!

I was able to apply the patches to our Jenkins server's Java 7 preview
installation, which runs on FreeBSD. I prepared a combined patch to be
placed in the ports directory and recompiled the openjdk7 package. After
restarting our Jenkins jobs, all was fine; even the other random test
failures without SIGSEGV seem to be resolved (e.g. the Lucene Faceting module
producing a broken index on disk). More information can also be found in Lucene's

You can find the working builds here:, starting
with build #27
(some builds still failed because of another bug in IBM's ICU when the
default locale, which is randomly set by our test framework, contained one
of the new Java 7 ones - we have now a workaround)

We now hope that Java 7 will not be released with these bugs, because quite
a lot of loops get miscompiled without the fixes, which will break lots of
applications. We are afraid that people will use Lucene/Solr with Oracle JDK
1.7.0_0 on July 28th and corrupt their indexes. Is there a chance to
get the fixes in or delay the release?

Also, it would be interesting to know how this affects JDK 1.6.0: _26
did not seem to contain the broken original fix that caused the Porter
stemmer failure, but 1.6.0 is still broken with our readVInt method. We hope
the complete fix will be in _27.
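
For readers unfamiliar with readVInt: it decodes a variable-length integer
stored 7 bits per byte, with the high bit of each byte marking that another
byte follows. The following is only a minimal sketch of that decoding, not
Lucene's actual source. The class name VIntSketch and its method names are
made up for illustration, and java.nio.ByteBuffer stands in for Lucene's
DataInput. It shows the loop form next to a manually unrolled form; both
must return the same value for the same bytes. The sketch does not by itself
reproduce the miscompilation.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration class; not part of Lucene.
public class VIntSketch {

    // Encode: emit 7 bits per byte, low-order group first;
    // the high bit is set on every byte except the last.
    static void writeVInt(ByteBuffer out, int value) {
        while ((value & ~0x7F) != 0) {
            out.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.put((byte) value);
    }

    // Decode, loop form: keep reading while the continuation bit is set.
    static int readVInt(ByteBuffer in) {
        byte b = in.get();
        int i = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = in.get();
            i |= (b & 0x7F) << shift;
        }
        return i;
    }

    // Decode, manually unrolled form: at most five bytes for a 32-bit int.
    // A non-negative byte has the continuation bit clear.
    static int readVIntUnrolled(ByteBuffer in) {
        byte b = in.get();
        if (b >= 0) return b;
        int i = b & 0x7F;
        b = in.get();
        i |= (b & 0x7F) << 7;
        if (b >= 0) return i;
        b = in.get();
        i |= (b & 0x7F) << 14;
        if (b >= 0) return i;
        b = in.get();
        i |= (b & 0x7F) << 21;
        if (b >= 0) return i;
        b = in.get();
        i |= (b & 0x0F) << 28; // only 4 bits remain for a 32-bit value
        if ((b & 0xF0) == 0) return i;
        throw new IllegalStateException("invalid vInt (too many bits)");
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 127, 128, 300, 16384, Integer.MAX_VALUE, -1};
        for (int v : samples) {
            ByteBuffer buf = ByteBuffer.allocate(8);
            writeVInt(buf, v);
            buf.flip();
            int viaLoop = readVInt(buf);
            buf.rewind();
            int viaUnrolled = readVIntUnrolled(buf);
            System.out.println(v + " -> loop=" + viaLoop
                    + " unrolled=" + viaUnrolled);
        }
    }
}
```

A check like the one in main, run repeatedly so that the JIT compiles the
decoding methods, is the kind of test in which a wrong decoded value would
show up as a mismatch between the written and the read integer.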

Finally, we have a list of all issues related to Java7, if you are


> -----Original Message-----
> From: hotspot-compiler-dev-bounces at [mailto:hotspot-
> compiler-dev-bounces at] On Behalf Of Uwe Schindler
> Sent: Monday, July 25, 2011 9:52 PM
> To: vladimir.kozlov at; hotspot-compiler-dev at
> Subject: Re: sigsegv on porter stemmer (Lucene, but also otherwise)
> Hi, thanks for taking care!
> Thanks for the workaround. We already found another workaround to get
> this running in our 2-hourly Lucene builds at:
> It would be nice if you could look into the console logs of the failed
> builds on Saturday - you can see the bug in the earlier builds only (with
> always different stack traces). We drilled it down to one method (not sure if
> information clipped out of the bug report. We then disabled compilation
> of only this affected method, PorterStemmer.ends(...):
> -XX:CompileCommand=exclude,org/apache/lucene/analysis/en/PorterStemmer,ends
> -XX:CompileCommand=exclude,org/apache/lucene/analysis/PorterStemmer,ends
> (we have different class names in stable 3.x branch and trunk).
> We also see other random test failures not happening with Java 5 and Java
> it would be nice, if you could review, too.
> One big bug in loops also affects Java 1.6.0_18 (still not fixed): our
> DataInput.readVInt method was incorrectly compiled in the case that
> MappedByteBuffer.get()/lucene.DataInput.readByte() was inlined, leading
> to simply wrong results (the method returned a decoded integer that
> differed from the expected result). See the unrolled loops in
> ache
> /lucene/store/
> I hope all this helps you find more bad loop optimization bugs; all
> those issues seem to be related to this special optimization in loops. The
> latest Lucene builds also contain a failure in a test case that happens only
> on Java 7 (not on every test run, so not reproducible). So it might be good
> for you to watch our Lucene builds for other bugs as well.
> Some of the other developers already say we should not trust any loops in
> Java anymore and recommend not using Java 7 with Apache Lucene/Solr,
> and that's bad news :(
> Thanks for the help,
> Uwe
> On Mon, Jul 25, 2011 at 7:49 PM, Vladimir Kozlov <vladimir.kozlov at
>> wrote:
> > Thank you very much, Dawid, for providing the test case. The bug was
> > filed in the wrong category, so we did not know about it. I will work on
> > it since it could be my changes in loop optimizations. Use the following
> > flag as a workaround:
> >
> > -XX:-UseLoopPredicate
> >
> > Thanks,
> > Vladimir
> >
> > Dawid Weiss wrote:
> >>
> >> Hello everyone,
> >>
> >> I am an Apache Lucene developer; we've been running tests with Java
> >> 1.7 and this came up:
> >>
> >>
> >>
> >> Porter stemmer is pretty widely used for shallow NLP, not only in
> >> Lucene. It'd be interesting to hear from jit gurus what's causing
> >> this (the problem does not occur in 1.6). Thanks in advance,
> >>
> >> Dawid
> -----
> Uwe Schindler
> uschindler at
> Apache Lucene PMC Member / Committer
> Bremen, Germany
