Unneeded size/fstat system call in sun.nio.ch.FileChannelImpl#transferTo for linux?

David Holmes david.holmes at oracle.com
Thu Aug 20 05:14:21 UTC 2015

Redirecting to NIO-dev


On 16/08/2015 3:27 AM, Adrian wrote:
> Hello,
> I was looking at FileChannelImpl#transferTo, and noticed there were
> always two system calls - fstat and sendfile.
> Looking at the source code (e.g.
> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/sun/nio/ch/FileChannelImpl.java#FileChannelImpl.transferTo%28long%2Clong%2Cjava.nio.channels.WritableByteChannel%29),
> the JVM checks that the position is not past the end of the file to
> read.
> On Linux, the size translates to a fstat system call, and the native
> function transferTo0 translates to a sendfile system call, and I
> believe sendfile already checks the offsets and will only read as many
> bytes as available, similar to the read system call.
> As of Linux 2.6.23, sendfile is implemented using splice.
> Just following the latest source for example:
> - sendfile (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/fs/read_write.c?id=HEAD)
> - do_sendfile
> - do_splice_direct
> (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/fs/splice.c?id=HEAD)
> - splice_direct_to_actor
> - do_splice_to
> - default_file_splice_read
> - kernel_readv
> - vfs_readv (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/fs/read_write.c?id=HEAD)
> This is the same vfs_readv used to implement the read system call.
> I didn't find the exact code where it checks the position and length,
> as vfs_readv calls file->f_op->read, but:
> - the read system call (which also uses vfs_readv) is documented as
> "attempting to read up to count bytes"
> - all the code uses the return value as the "actual length", not the
> parameter passed in
> - the man page for splice says it returns "the number of bytes spliced
> to/from the pipe"
> Manually testing this on Linux and OS X seems to confirm the behaviour
> (that manually checking the position/length is unnecessary because
> sendfile/splice will only return how many bytes it managed to
> read/write) - by calling sendfile with an offset beyond EOF (returns
> 0) or with a count size larger than the size of the file (returns
> number of bytes left).
> I also looked at the JVM Windows source, which uses the TransmitFile API.
> I didn't look into this in detail - the docs are unclear, but it seems
> you will only get an error if the number of bytes to send is greater
> than 2^31 - 1.
> Doing some benchmarks with a 64KB buffer/transfer size (compiling
> minimally modified JDK), FileChannelImpl#size averaged ~12us, and
> FileChannelImpl#transferToDirectly averaged ~33us.
> The whole FileChannelImpl#transferTo took ~48us.
> I know microbenchmarks need to be done very carefully, but I think it
> gives a general idea of the performance impact for something that
> should be (?) unnecessary.
> Of course, I could be wrong, or more needs to be considered, but I
> thought it was worth bringing up to get some feedback.
> Maybe someone could confirm the behaviour of sendfile/splice, or
> explain the reasoning for checking the size in the JVM?
> Thank you for your time!
> Best regards,
> Adrian
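
For anyone who wants to reproduce the two edge cases from the manual test at the Java API level, here is a minimal sketch. The class name TransferToProbe and the helper probe are hypothetical names made up for illustration. Note the assumption: the sink is an in-memory channel, so this most likely does not go through the sendfile path at all. It only shows the observable transferTo contract (0 bytes when the position is past EOF, the remaining bytes when count is larger than what is left) that any change to the size check would need to preserve.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, for illustration only.
public class TransferToProbe {

    // Writes a temporary file of fileSize zero bytes, calls
    // transferTo(position, count, sink) once, and returns the number
    // of bytes the call reports as transferred.
    static long probe(int fileSize, long position, long count) {
        try {
            Path tmp = Files.createTempFile("transferTo-probe", ".bin");
            try {
                Files.write(tmp, new byte[fileSize]);
                // In-memory sink: an assumption made to keep the sketch
                // self-contained; it is not a socket or file target.
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                try (FileChannel src = FileChannel.open(tmp, StandardOpenOption.READ);
                     WritableByteChannel dst = Channels.newChannel(sink)) {
                    return src.transferTo(position, count, dst);
                }
            } finally {
                // Channels are closed by now, so the file can be removed.
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Position beyond EOF of a 10-byte file.
        System.out.println("past EOF:       " + probe(10, 100, 5));
        // Count far larger than the 6 bytes left after position 4.
        System.out.println("oversize count: " + probe(10, 4, 1000));
    }
}
```

To observe the actual system calls discussed above, the target would have to be a socket or file channel and the process run under a tracer such as strace; this sketch does not attempt that.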
