Process.exec with the linux posix_spawn mode has a bug

Thomas Stüfe thomas.stuefe at
Mon May 13 14:11:52 UTC 2019

On Mon, May 13, 2019 at 3:42 PM Thomas Stüfe <thomas.stuefe at> wrote:

> Hi Martin,
> On Mon, May 13, 2019 at 2:08 PM Martin Buchholz <martinrb at>
> wrote:
>>> I am happy this is resolved and the intermittent behavior explained. Yes,
>>> we could improve exception messages, especially since analyzing fork
>>> scenarios is cumbersome.
>> I tried hard back in 2005 to provide pretty good java-level diagnostics
>> when subprocess starting failed somehow (see WhyCantJohnnyExec). At least
>> the errno did get reported.
> I know your code. For many years I wondered who Johnny is :)
> We have a very similar solution in our port: we have our own error codes
> (plus errno mixed in where it makes sense) for the many things that can go
> wrong in the forkhelper. Maybe we can improve upon your solution a bit.
> And/or add tracing for environment etc.
> But here is one thing that I still do not understand about Remi's problem:
> The theory is that the first exec(), starting jspawnhelper, went wrong
> with EACCES, yes?
> Man page for posix_spawn() states:
> <quote>
>        Upon successful completion, posix_spawn() and posix_spawnp() place
>        the PID of the child process in pid, and return 0.  If there is an
>        error before or during the fork(2), then no child is created, the
>        contents of *pid are unspecified, and these functions return an error
>        number as described below.
>        Even when these functions return a success status, the child process
>        may still fail for a plethora of reasons related to its pre-exec()
>        initialization.  In addition, the exec(3) may fail.  In all of these
>        cases, the child process will exit with the exit value of 127.
> </quote>
> To me this looks as if what should have happened is: posix_spawn() should
> have returned with success, since the fork() went through. Then, the child
> process (still inside posix_spawn()) attempts the exec and gets an EACCES.
> Then, the child process should have ended with exit code 127. Your fail pipe
> would never read an error code since we never entered the main function of
> jspawnhelper. To the Java caller it should have looked like a very
> short-lived process with exit code 127.
> Obviously this is not what happened, since Remi reported an IOException
> with an errno. So, where do I understand this wrong?
Hmm, this looks wrong. I just tested (Ubuntu 16.04): removing execute
permission from jspawnhelper does not result in an IOException. Instead,
Runtime.exec() seemingly succeeds. strace shows the exec() of jspawnhelper
failing as expected:

5676 [pid 13796]
["11:14"], [/* 79 vars */]) = -1 EACCES (Permission denied)
5677 [pid 13796] exit_group(127)             = ?
5678 [pid 13780] <... vfork resumed> )       = 13796
5679 [pid 13796] +++ exited with 127 +++
5680 [pid 13780] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED,
si_pid=13796, si_uid=1027, si_status=127, si_utime=0, si_stime=0} ---

But we completely fail to notice.

This is bad. We should fix it.
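
To make the failure mode concrete, here is a minimal C sketch of the same
experiment outside the JDK. It is not the JDK code: the struct and function
names are made up for illustration, and it assumes a writable /tmp. It spawns
a file that has no execute bit and records both the return value of
posix_spawn() and, if a child was created, the child's exit status.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Outcome of one spawn attempt (hypothetical struct, for illustration). */
struct spawn_probe {
    int spawn_rc;    /* return value of posix_spawn(): 0 or an errno value;
                        -1 if the probe could not be set up */
    int child_exit;  /* exit status of the child, or -1 if none was reaped */
};

/* Create a regular file without execute permission, try to spawn it, and
 * record how the exec() failure becomes visible to the parent. */
struct spawn_probe probe_nonexecutable_spawn(void)
{
    struct spawn_probe result = { -1, -1 };
    char path[] = "/tmp/spawn_probe_XXXXXX";   /* assumed writable location */

    int fd = mkstemp(path);    /* file is created with mode 0600: no x bit */
    if (fd < 0)
        return result;
    close(fd);

    char *argv[] = { path, NULL };
    pid_t pid = -1;
    result.spawn_rc = posix_spawn(&pid, path, NULL, NULL, argv, environ);

    if (result.spawn_rc == 0) {
        /* The libc reported success, so a child exists. The exec() failure
         * can now only be seen in the child's exit status. */
        assert(pid > 0);
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result.child_exit = WEXITSTATUS(status);
    }

    unlink(path);
    return result;
}
```

On a libc that behaves like the strace output above, spawn_rc would be 0 and
child_exit would be 127. A libc that reports exec() errors back to the parent
would instead return EACCES from posix_spawn() and leave child_exit at -1.
A caller that checks only the return value misses the first case.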

One more thing: I am not sure whether this is libc-specific. The Open Group
man page for posix_spawn() states:

If posix_spawn() or posix_spawnp() fail for any of the reasons that
would cause fork() or one of the exec family of functions to fail, an
error value shall be returned as described by fork() and exec,
respectively (or, if the error occurs after the calling process
successfully returns, the child process shall exit with exit status 127).

which I interpret to mean that the standard leaves open whether exec()
errors are reported back to the caller of posix_spawn().
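
If that reading is right, a portable caller has to handle both reporting
styles. The following C sketch shows one way to fold them into a single
verdict. The enum and function names are hypothetical, and the second
argument is assumed to be the raw status word filled in by waitpid(). Note
that exit status 127 is only a hint: a program that was exec'ed successfully
can also exit with 127 on its own.

```c
#include <assert.h>
#include <sys/wait.h>

/* Hypothetical verdicts for one posix_spawn() + waitpid() pair. */
enum spawn_verdict {
    SPAWN_FAILED_IN_PARENT,  /* posix_spawn() itself returned an error number */
    SPAWN_CHILD_EXIT_127,    /* child exited with 127: exec() may have failed */
    SPAWN_CHILD_SIGNALED,    /* child was terminated by a signal */
    SPAWN_CHILD_EXITED       /* child exited with some other status */
};

/* spawn_rc:    return value of posix_spawn() (0 or a positive error number)
 * wait_status: status word from waitpid(); ignored when spawn_rc != 0 */
enum spawn_verdict classify_spawn(int spawn_rc, int wait_status)
{
    assert(spawn_rc >= 0);   /* posix_spawn() never returns a negative value */

    if (spawn_rc != 0)
        return SPAWN_FAILED_IN_PARENT;
    if (WIFSIGNALED(wait_status))
        return SPAWN_CHILD_SIGNALED;
    if (WIFEXITED(wait_status) && WEXITSTATUS(wait_status) == 127)
        return SPAWN_CHILD_EXIT_127;
    return SPAWN_CHILD_EXITED;
}
```

With this, the case shown in the strace output above (spawn succeeded, child
exited with 127) maps to SPAWN_CHILD_EXIT_127 instead of looking like a
normal, short-lived process.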


>> I've had this little script around for ages:
>> #!/bin/bash
>> # -v: Print unabbreviated versions of environment, etc
>> exec /usr/bin/strace -f -v -s 256 -e signal=none -e trace=process "$@"
> We had all this as part of spawn traces. But this is a nice and neat idea.
> Does it print current directory?
> Cheers, Thomas
