[12] 8184157: (ch) AsynchronousFileChannel hangs with internal error when reading locked file

Brian Burkhalter brian.burkhalter at oracle.com
Mon Jul 9 18:19:19 UTC 2018


Revisiting a topic initially posted about a year ago [1]; most recent post [2].

Here’s an alternate explanation:

The class sun.nio.ch.PendingIoCache is in a sense a specialized form of Map<Long,Future>. The long value is a pointer to a native OVERLAPPED structure [3]. The Future is a sun.nio.ch.PendingFuture corresponding to a lock, read, or write operation. The PendingFuture has a ResultHandler which varies according to the operation type. The PendingIoCache tries to avoid repeated allocation of OVERLAPPED structures by maintaining a cache array, of length 4, of pointers to OVERLAPPED. When an entry is removed from the internal Map<Long,PendingFuture>, its OVERLAPPED pointer is placed in the cache if there is room. When an entry is added to the PendingIoCache, the array is first checked to see whether a cached pointer is available and, if so, that pointer is re-used as the key of the new entry in the Map<Long,PendingFuture>.
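To make the caching behaviour concrete, here is a minimal sketch of the structure described above. It is not the actual sun.nio.ch.PendingIoCache source: the class name OverlappedCacheSketch, the fake address allocator, and the use of a String in place of PendingFuture are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the cache described in the text.
class OverlappedCacheSketch {
    private static final int CACHE_SIZE = 4;          // cache array length given in the text
    private final long[] cache = new long[CACHE_SIZE];
    private int cached = 0;                           // number of pointers currently cached
    private long nextFakeAddress = 0x1000L;           // stand-in for native allocation
    // String stands in for PendingFuture in this sketch.
    private final Map<Long, String> pending = new HashMap<>();

    // Register an operation: re-use a cached pointer if one is available,
    // otherwise "allocate" a new one.
    synchronized long add(String op) {
        long overlapped = (cached > 0) ? cache[--cached] : allocate();
        pending.put(overlapped, op);
        return overlapped;
    }

    // Remove an operation; keep its pointer for re-use if there is room.
    synchronized String remove(long overlapped) {
        String op = pending.remove(overlapped);
        if (op != null && cached < CACHE_SIZE) {
            cache[cached++] = overlapped;
        }
        return op;
    }

    private long allocate() {
        long address = nextFakeAddress;
        nextFakeAddress += 0x20;
        return address;
    }

    public static void main(String[] args) {
        OverlappedCacheSketch c = new OverlappedCacheSketch();
        long lockKey = c.add("lock");
        c.remove(lockKey);
        long readKey = c.add("read");
        System.out.println(lockKey == readKey);       // prints "true": the pointer is re-used
    }
}
```

The point to notice is that the key handed to the second operation is identical to the key of the operation just removed.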

Each WindowsAsynchronousFileChannelImpl instance is associated with an instance of Iocp. The Iocp instance has a long-running thread which runs an EventHandlerTask which loops over getQueuedCompletionStatus() [4]. The method getQueuedCompletionStatus() blocks until there is an I/O completion packet. When it receives the completion packet, it extracts from it the native OVERLAPPED pointer. The pointer is used as the key to obtain the PendingFuture of the operation from the PendingIoCache. The PendingFuture contains a ResultHandler which defines a completed() method, which is then invoked by the event thread with a value equal to the number of bytes transferred as returned in the lpNumberOfBytes parameter of getQueuedCompletionStatus().
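The dispatch loop just described can be sketched roughly as follows. This is only a model: a BlockingQueue stands in for the native completion port, an IntConsumer stands in for ResultHandler.completed(), and the names EventLoopSketch, Packet, and SHUTDOWN are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.IntConsumer;

// Hypothetical model of the event handler loop; not the Iocp source.
class EventLoopSketch {
    // Stand-in for an I/O completion packet.
    static final class Packet {
        final long overlapped;
        final int bytesTransferred;
        Packet(long overlapped, int bytesTransferred) {
            this.overlapped = overlapped;
            this.bytesTransferred = bytesTransferred;
        }
    }

    static final Packet SHUTDOWN = new Packet(0L, 0);   // sentinel to end the demo loop

    final BlockingQueue<Packet> port = new LinkedBlockingQueue<>();
    final Map<Long, IntConsumer> pending = new ConcurrentHashMap<>();

    void runLoop() throws InterruptedException {
        for (;;) {
            Packet p = port.take();                     // blocks until a packet arrives
            if (p == SHUTDOWN) {
                return;
            }
            // The OVERLAPPED pointer from the packet is the lookup key.
            IntConsumer handler = pending.remove(p.overlapped);
            if (handler != null) {
                handler.accept(p.bytesTransferred);     // analogous to completed()
            }
        }
    }

    // Runs one packet through the loop and returns the value the handler saw.
    static int demo() {
        EventLoopSketch loop = new EventLoopSketch();
        int[] result = new int[1];
        loop.pending.put(0x1000L, n -> result[0] = n);
        loop.port.add(new Packet(0x1000L, 512));
        loop.port.add(SHUTDOWN);
        Thread t = new Thread(() -> {
            try {
                loop.runLoop();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());                     // prints 512
    }
}
```

Note that the loop trusts the pointer in the packet completely: whatever is registered under that key at the moment the packet is dequeued receives the byte count.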

The problem arises, for example, when a lock operation and a read operation occur in sequence. The pointer to the OVERLAPPED structure allocated for the lock operation is re-used as the key of the PendingFuture of the read operation before the completion packet of the lock operation has been received by the call to getQueuedCompletionStatus() in the Iocp EventHandler loop. Then, when the event handler receives the completion packet for the lock operation, it uses as key the OVERLAPPED pointer originally assigned for the lock operation but now being used by the read operation. This ends up retrieving from the PendingIoCache the PendingFuture of the read operation instead of that of the lock operation, which was already removed. The event thread then calls the completed() method of the read operation’s ResultHandler with a garbage value for the number of bytes transferred, as derived from the lock operation’s completion packet. This value is then used by the ReadTask to attempt to set the position of the destination ByteBuffer and, if that position is outside the ByteBuffer, an IllegalArgumentException ensues.
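The sequence above can be replayed deterministically in a single thread. The sketch below is an illustration only: the map, the queue, the pointer value 0x1000, and the byte count 12345 are all made up, and the class name StaleCompletionSketch is hypothetical. The only real API relied on is ByteBuffer.position(int), which throws IllegalArgumentException when the new position exceeds the limit.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical single-threaded replay of the failure sequence.
class StaleCompletionSketch {
    // Stand-in for an I/O completion packet.
    static final class Packet {
        final long overlapped;
        final int bytes;
        Packet(long overlapped, int bytes) {
            this.overlapped = overlapped;
            this.bytes = bytes;
        }
    }

    static String run() {
        Map<Long, String> pending = new HashMap<>();   // String stands in for PendingFuture
        Queue<Packet> port = new ArrayDeque<>();       // stand-in for the completion port
        long k = 0x1000L;                              // made-up OVERLAPPED pointer

        // 1. Lock is registered under k; its packet is queued but not yet dequeued.
        pending.put(k, "lock");
        port.add(new Packet(k, 12345));                // byte count is meaningless for a lock

        // 2. The lock's entry is removed early and its pointer goes to the cache.
        pending.remove(k);
        long cachedPointer = k;

        // 3. The read re-uses the cached pointer as its key.
        pending.put(cachedPointer, "read");
        ByteBuffer dst = ByteBuffer.allocate(16);

        // 4. The event loop now dequeues the lock's packet and looks up its pointer.
        Packet p = port.remove();
        String op = pending.remove(p.overlapped);      // finds "read", not "lock"
        try {
            dst.position(dst.position() + p.bytes);    // 12345 is beyond the limit of 16
            return op + ": ok";
        } catch (IllegalArgumentException e) {
            return op + ": IllegalArgumentException";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());                     // prints "read: IllegalArgumentException"
    }
}
```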

The proposed fix is to disallow re-using the OVERLAPPED structure pointer as a key until after the I/O completion packet of the operation is actually received. When each task completes, it invalidates the corresponding entry by removing it from the PendingIoCache OVERLAPPED-to-PendingFuture map and saves the pointer value in the set “invalidOverlapped.” After receiving an I/O completion packet, the Iocp event handler will invoke remove() on the PendingIoCache which will attempt to re-use the pointer only if it has been previously invalidated, i.e., is found in the invalidOverlapped set.
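As a rough sketch of this idea (not the actual patch), the cache can be modelled as below. Only the name invalidOverlapped comes from the description above; the class name InvalidatingCacheSketch, the invalidate() method name, the fake allocator, and the String stand-in for PendingFuture are assumptions. What happens to a pointer that is removed without having been invalidated is left out of the sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the proposed fix; not the actual change.
class InvalidatingCacheSketch {
    private static final int CACHE_SIZE = 4;
    private final long[] cache = new long[CACHE_SIZE];
    private int cached = 0;
    private long nextFakeAddress = 0x1000L;            // stand-in for native allocation
    private final Map<Long, String> pending = new HashMap<>();
    private final Set<Long> invalidOverlapped = new HashSet<>();

    synchronized long add(String op) {
        long overlapped = (cached > 0) ? cache[--cached] : allocate();
        pending.put(overlapped, op);
        return overlapped;
    }

    // Called when a task completes: drop the map entry, but do NOT recycle
    // the pointer yet, because its completion packet may still be in flight.
    synchronized void invalidate(long overlapped) {
        if (pending.remove(overlapped) != null) {
            invalidOverlapped.add(overlapped);
        }
    }

    // Called by the event handler after the completion packet has arrived.
    // The pointer is recycled only if it was previously invalidated.
    synchronized String remove(long overlapped) {
        String op = pending.remove(overlapped);
        if (invalidOverlapped.remove(overlapped) && cached < CACHE_SIZE) {
            cache[cached++] = overlapped;
        }
        return op;
    }

    private long allocate() {
        long address = nextFakeAddress;
        nextFakeAddress += 0x20;
        return address;
    }

    public static void main(String[] args) {
        InvalidatingCacheSketch c = new InvalidatingCacheSketch();
        long lockKey = c.add("lock");
        c.invalidate(lockKey);                         // lock task done, packet still pending
        long readKey = c.add("read");
        System.out.println(readKey != lockKey);        // prints "true": pointer not re-used yet
        System.out.println(c.remove(lockKey));         // prints "null": stale packet finds no future
        System.out.println(c.add("write") == lockKey); // prints "true": pointer is now recycled
    }
}
```

With this ordering, the late completion packet for the lock no longer resolves to the read operation's entry.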

An alternative, simpler fix would be to dispense with re-using OVERLAPPED structures altogether in PendingIoCache.



[1] http://mail.openjdk.java.net/pipermail/nio-dev/2017-July/004369.html
[2] http://mail.openjdk.java.net/pipermail/nio-dev/2017-August/004388.html
[3] https://docs.microsoft.com/en-us/windows/desktop/api/shobjidl/ns-shobjidl-_overlapped
[4] https://msdn.microsoft.com/en-us/library/windows/desktop/aa364986(v=vs.85).aspx
