Bug in sun.nio.ch.SolarisEventPort#port_dissociate
David M. Lloyd
david.lloyd at redhat.com
Fri Jun 16 15:59:42 UTC 2017
On 06/16/2017 10:36 AM, Alan Bateman wrote:
> On 14/06/2017 15:32, David M. Lloyd wrote:
>> It's coming from a user so my information is limited but I can
>> establish that it is happening under load, and I think it corresponds
>> to an open socket being abruptly closed in another thread.
>> I am not sure whether I can get it down to a test case though. I'll
>> see if I can get access to a Solaris system for testing.
> If you get some idea on the conditions when this occurs then it would
> be useful. To make sure there is nothing obvious, I ran JDK tests on a
> Solaris 11.3 system with the port Selector as the default.
I'll see if I can find out more. I have gained access to a test
environment but I haven't been able to reproduce it in isolation either.
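For reference, the scenario I suspect can be sketched roughly as follows. This is only an illustration under my own assumptions, not the user's code: the class name CloseWhileRegistered and the run helper are hypothetical, and it only has a chance of exercising port_dissociate when the JDK on Solaris is using the port-based selector. It registers a connected socket with a Selector, closes it from a second thread while the selecting thread is active, and then lets the selector flush the cancelled key.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stress sketch: close a registered channel from another thread.
public class CloseWhileRegistered {

    // Returns the number of iterations that completed without an exception.
    static int run(int iterations) throws Exception {
        ExecutorService closer = Executors.newSingleThreadExecutor();
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            int completed = 0;
            for (int i = 0; i < iterations; i++) {
                SocketChannel client = SocketChannel.open(server.getLocalAddress());
                SocketChannel accepted = server.accept();
                client.configureBlocking(false);
                client.register(selector, SelectionKey.OP_READ);

                // Abrupt close on another thread, racing with the select below.
                Future<Void> closed = closer.submit(() -> {
                    client.close();
                    return null;
                });
                selector.selectNow();
                closed.get();

                // The next select processes the cancelled key, which is where
                // the selector implementation deregisters the descriptor.
                selector.selectNow();
                accepted.close();
                completed++;
            }
            return completed;
        } finally {
            closer.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("completed iterations: " + run(10_000));
    }
}
```

A single closer thread and loopback connections are almost certainly too gentle compared to the user's load, so a clean run of this proves very little; it is only meant to pin down the sequence of operations I have in mind.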
> All the tests pass. I
> can't think of a scenario where port_dissociate could fail with EBADF.
It's not EBADF but EBADFD, if that makes a difference. I've been
working off of various GC-related hypotheses but without knowing the
exact conditions that precipitate EBADFD, I'm really shooting in the
dark. One would have to examine the kernel sources to get that answer.
> It is correct to ignore ENOENT as that occurs when the file descriptor
> registered with the port is closed by dup'ing.
Is it possible that this operation is non-atomic in the kernel, such
that the descriptor is briefly in an intermediate state before being
replaced by the placeholder?