Misbehaving exit status from Hotspot
Charles Oliver Nutter
headius at headius.com
Thu Jun 28 20:30:32 UTC 2018
On Thu, Jun 28, 2018 at 12:24 AM, David Holmes <david.holmes at oracle.com> wrote:
> On 28/06/2018 1:33 AM, Charles Oliver Nutter wrote:
>> Oops, in my editing of the post I lost the link to sources. Perhaps this
>> will illustrate the problem I'm talking about a bit better!
> Okay so the test outputs:
> $ ./sigtest `which java` Loop
> pid: 22136
> status: 36608
> exited: 1, stop signal: 143, term signal: 0, exit status: 143
> WIFEXITED is defined as:
> "Evaluates to a non-zero value if status was returned for a child process
> that terminated normally."
> So this value is one because the JVM process did exit "normally" - it
> called exit(143);
Why does it terminate "abnormally" if you -XX:+ReduceSignalUsage then? Does
it no longer handle TERM at all?
> WSTOPSIG is defined as:
> "If the value of WIFSTOPPED(stat_val) is non-zero, this macro evaluates to
> the number of the signal that caused the child process to stop."
> But we haven't checked WIFSTOPPED (and the process is terminated not
> stopped) so this is "garbage".
But a TERM signal *did* cause the process to stop, didn't it? This is not
an appropriate value for the stop signal in that case (indeed, garbage).
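To make the macro discussion concrete, here is a minimal sketch of how I understand the status is meant to be decoded: check each predicate (WIFEXITED, WIFSIGNALED, WIFSTOPPED) first, and only then read the matching value macro. The names decode_status, wait_kind and wait_info are hypothetical, made up for this illustration.

```c
#include <sys/wait.h>

/* Hypothetical types for illustration; not part of any standard API. */
enum wait_kind { WAIT_EXITED, WAIT_SIGNALED, WAIT_STOPPED, WAIT_OTHER };

struct wait_info {
    enum wait_kind kind;
    int value;  /* exit status or signal number, depending on kind */
};

/* Decode a waitpid() status, reading each value macro only after its
 * matching predicate has returned non-zero. */
struct wait_info decode_status(int status)
{
    struct wait_info info = { WAIT_OTHER, 0 };

    if (WIFEXITED(status)) {
        info.kind = WAIT_EXITED;
        info.value = WEXITSTATUS(status);   /* low 8 bits passed to exit() */
    } else if (WIFSIGNALED(status)) {
        info.kind = WAIT_SIGNALED;
        info.value = WTERMSIG(status);      /* signal that killed the child */
    } else if (WIFSTOPPED(status)) {
        info.kind = WAIT_STOPPED;
        info.value = WSTOPSIG(status);      /* signal that stopped the child */
    }
    return info;
}
```

Assuming the usual Linux/macOS encoding (which POSIX does not mandate), the status 36608 above is 143 << 8, so this reports WAIT_EXITED with value 143. On those platforms WSTOPSIG reads the same bits as WEXITSTATUS, which would explain why the unguarded "stop signal" also printed 143.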
> WTERMSIG is defined as:
> "If the value of WIFSIGNALED(stat_val) is non-zero, this macro evaluates
> to the number of the signal that caused the termination of the child
> process."
> You haven't checked WIFSIGNALED but it will be zero as the process was not
> terminated by an _uncaught signal_. So the value zero here is fine, but
> could be anything given WIFSIGNALED will be zero.
I guess this is more of the same...if you handle a TERM signal, but then
ultimately do terminate...wasn't the termination in response to the TERM
signal? If I did not send TERM, it would not have shut down.
My docs, on MacOS, say something similar: "True if the process terminated
due to receipt of a signal."
The process *did* terminate due to receipt of a signal. The number of the
signal that caused the termination of the process was 15, TERM. I'm still
not getting something in your logic, I guess?
Bear with me please :-) There are big gaps in the documentation of this
stuff online, and if what Hotspot does is "correct" and "standard" I would
like to know that and see the docs indicating such.
> WEXITSTATUS is defined as:
> "If the value of WIFEXITED(stat_val) is non-zero, this macro evaluates to
> the low-order 8 bits of the status argument that the child process passed
> to _exit() or exit(), or the value the child process returned from main()."
> The JVM called exit(143) so we expect to get 143 and that's exactly what
> we do get.
Here's more confusion for me: WEXITSTATUS is different from what $? would
be at a command line, correct? Because for the C program, WEXITSTATUS is 0
and the exit code at command line is 143.
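As a sketch of why both cases show 143 at the command line: a shell such as bash derives $? from the wait status, using the exit status for a normal exit and 128+N for a death by signal N. The helper name shell_style_status is hypothetical, and the raw values in the note below assume the usual Linux/macOS status encoding.

```c
#include <sys/wait.h>

/* Hypothetical helper: approximate the value a shell such as bash would
 * report in $? for a child with this wait status. */
int shell_style_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);      /* value the child passed to exit() */
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);   /* shell convention for signal deaths */
    return -1;                           /* stopped or unrecognized status */
}
```

Under that assumption, a child that called exit(143) (raw status 36608) and a child killed by an uncaught TERM (raw status 15) both map to 143, even though only the second has WIFSIGNALED set.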
I am not arguing that the exit(143) is really *wrong*...it just doesn't
really seem to match the rest of the state.
Here's my logic:
Command-line exit result of 143 is "ok"...standard says that it should be
128+N for that value when a signal N caused the process to end.
However, none of the OTHER state that would indicate a signal-based
termination are set properly. There's no termsig and stopsig is nonsense.
So we get a non-zero exit code indicating that a signal caused the process
to end, and yet none of the W macros produce the right values to give us
more information. So, did it terminate because of a signal or not? One
result says yes, the other result says no.
If this is standard, I would like to see that standard. This is where our
bug reports come from...these results do not match any other programs our
users are managing via signals+waitpid.
> The flaw with your thinking here is that sending a signal to tell the VM to
> terminate should behave as-if the VM received (and terminated due to) an
> uncaught signal. It doesn't - nor should it.
Is that really the flaw in my thinking? I don't care if these are caught or
uncaught signals...should I? Is there a spec that says *caught* signals
used for a clean shutdown should now indicate the process exited normally? I
don't get it.
It would not have stopped if I had not sent the signal, and so I believe
the macros above should indicate that. If it was a normal termination, I'd
expect the exit status to be zero. But exit status indicates shutdown was
*not* normal...it was due to a signal...but then the signal macros for
waitpid don't also reflect that.
Again, if there's a standard documenting that this is typically how
TERM-handling C programs are supposed to work, I would be happy to be
reeducated. But at the moment, the numbers aren't lining up for me.