RFR: 8022213 Intermittent test failures in java/net/URLClassLoader (Add jdk/testlibrary/FileUtils.java)

Dan Xu dan.xu at oracle.com
Fri Nov 8 02:26:35 UTC 2013

On 11/07/2013 11:04 AM, Alan Bateman wrote:
> On 07/11/2013 14:59, Chris Hegarty wrote:
>> Given both Michael and Alan's comments, I've updated the webrev:
>>   http://cr.openjdk.java.net/~chegar/fileUtils.01/webrev/
>> 1) more descriptive method names
>> 2) deleteXXX methods return if interrupted, leaving the
>>    interrupt status set
>> 3) Use Files.copy with REPLACE_EXISTING
>> 4) Use SimpleFileVisitor, rather than FileVisitor
> This looks better although interrupting the sleep means that the 
> deleteXXX will quietly terminate with the interrupt status set (which 
> could be awkward to diagnose if used with tests that are also using 
> Thread.interrupt). An alternative might be to just throw the 
> IOException with InterruptedException as the cause.
> -Alan.
Hi Chris,

The method deleteFileWithRetry0() assumes that, on Windows, 
Files.delete() throws an IOException if any other process is accessing 
the same file. When I investigated these intermittent test failures, I 
found that this assumption does not always hold.

When Files.delete() is called, it eventually calls the DeleteFile or 
RemoveDirectory function, depending on whether the target is a file or 
a directory. These Windows APIs only mark the target for deletion on 
close and return immediately, without waiting for the operation to 
complete. If another process has the file open in the meantime, the 
deletion does not happen, and the target stays in delete-pending status 
until that open handle is closed. In effect, DeleteFile and 
RemoveDirectory behave like asynchronous operations. Therefore, we 
cannot assume that the file or directory is gone after Files.delete() 
returns or File.delete() returns true.
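
To illustrate the point, here is a minimal sketch of how a test could 
check whether the target is really gone once Files.delete() has 
returned. The class and method names are hypothetical, and it assumes 
that Files.notExists() still fails to confirm absence while the file is 
delete-pending, which I have not verified on every Windows version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteCheck {
    /**
     * Deletes the path and reports whether it is still visible right
     * after Files.delete() returns. Hypothetical helper, illustration only.
     */
    static boolean stillVisibleAfterDelete(Path path) throws IOException {
        // On Windows this may only mark the file as delete-pending.
        Files.delete(path);
        // true means the file could not be confirmed as gone.
        return !Files.notExists(path);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("delete-check", ".tmp");
        System.out.println("still visible: " + stillVisibleAfterDelete(tmp));
    }
}
```

A result of true would mean the delete call succeeded without the file 
actually disappearing, which is the case described above.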

When I checked those intermittent test failures, I found that the 
Files.delete() call itself normally succeeds. But due to interference 
from anti-virus software or other Windows daemon services, the target 
file goes into delete-pending status. The operation that immediately 
follows then fails because the target file still exists, while our 
tests assume it is already gone. Because the delete-pending status of a 
file usually lasts only a very short time, depending on the source of 
the interference, such failures normally happen when we recursively 
delete a folder, or delete and re-create a file with the same name, at 
a high frequency.
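
The delete-and-create pattern can be sketched as a small probe that 
counts how often the create step fails. This is only an illustration 
under my own assumptions: the class and method names are hypothetical, 
and whether any failures show up depends entirely on the machine and on 
what else has the file open.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteCreateLoop {
    /**
     * Repeatedly creates and deletes the same file and returns how many
     * create attempts failed. Hypothetical probe, illustration only.
     */
    static int countFailures(Path file, int iterations) throws IOException {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            try {
                Files.createFile(file);
            } catch (IOException e) {
                // For example, the file from the previous iteration
                // may still be in delete-pending status.
                failures++;
                continue;
            }
            Files.delete(file);
        }
        return failures;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("delete-create-loop");
        Path file = dir.resolve("probe.tmp");
        System.out.println("failed creates: " + countFailures(file, 1000));
        Files.deleteIfExists(file);
        Files.deleteIfExists(dir);
    }
}
```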

It is basically a Windows API design or implementation issue. I have 
logged an enhancement, JDK-8024496, to solve it in the Java library 
layer. Currently, I have two strategies in mind. One is to make the 
delete operation blocking, that is, to make sure the file/directory is 
deleted before returning. The other is to make sure that a 
delete-pending file does not cause subsequent file operations to fail. 
Both have pros and cons.
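
As a rough sketch of what the two strategies could look like at the 
test-helper level (not the proposed library change for JDK-8024496), 
consider the following. All names, the retry count, and the sleep 
interval are my own assumptions. Wrapping InterruptedException in an 
IOException follows Alan's suggestion quoted above.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class PendingDeleteStrategies {
    static final int MAX_TRIES = 10;      // assumed value
    static final long RETRY_MILLIS = 50;  // assumed value

    /** Strategy 1: do not return until the path is confirmed gone. */
    static void deleteBlocking(Path path) throws IOException {
        Files.delete(path);
        for (int i = 0; !Files.notExists(path); i++) {
            if (i >= MAX_TRIES) {
                throw new IOException("Still exists after delete: " + path);
            }
            pause();
        }
    }

    /** Strategy 2: let the follow-up operation tolerate a pending delete. */
    static Path createFileTolerant(Path path) throws IOException {
        IOException last = null;
        for (int i = 0; i <= MAX_TRIES; i++) {
            try {
                return Files.createFile(path);
            } catch (FileAlreadyExistsException | AccessDeniedException e) {
                last = e;  // possibly delete-pending, so wait and retry
                pause();
            }
        }
        throw last;
    }

    /** Sleeps briefly; an interrupt is reported as an IOException. */
    private static void pause() throws IOException {
        try {
            Thread.sleep(RETRY_MILLIS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting", e);
        }
    }
}
```

The first helper costs extra time on every delete. The second only pays 
when a collision actually happens, but each follow-up operation needs 
its own retry wrapper.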


