ProcessReaper: single thread reaper

Peter Levart peter.levart at
Fri Apr 11 10:47:52 UTC 2014

On 04/09/2014 03:20 PM, roger riggs wrote:
> Hi Peter,
> On a related topic, the request to be able to destroy a Process and 
> all of its children
> might also want to use the group pid to be able to identify all of 
> the children.

Hi Roger,

This would require each child spawned by Process API to be assigned its 
own process group. The grandchildren would inherit this process group. 
You could then send a KILL/TERM signal to the process group in order to 
destroy the child and all its descendants (those that did not change 
their process group in the meantime).
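
A rough sketch of that idea, using Python's os module only as a thin 
wrapper over the POSIX calls (fork, setpgid, killpg). The function names 
are made up for illustration, and the pipe exists only so the caller can 
observe that the grandchild is gone too:

```python
import os
import signal


def spawn_child_with_grandchild():
    """Fork a child that leads a new process group and forks a grandchild.

    Returns (child_pid, read_fd). read_fd reaches EOF only after both the
    child and the grandchild have terminated.
    """
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        try:
            os.close(r)
            os.setpgid(0, 0)          # child: become leader of a new group
            if os.fork() == 0:
                os.write(w, b"g")     # grandchild: inherited the group
            else:
                os.write(w, b"c")
            while True:
                signal.pause()        # both idle until signalled
        finally:
            os._exit(1)
    os.close(w)
    try:
        os.setpgid(child, child)      # parent repeats the call to close the race
    except OSError:
        pass                          # the child got there first
    os.read(r, 1)                     # wait until child and grandchild
    os.read(r, 1)                     # have both reported in
    return child, r


def destroy_group(pgid, sig=signal.SIGTERM):
    """Signal every process that is still a member of group pgid."""
    os.killpg(pgid, sig)
```

Calling destroy_group(child_pid) reaches the grandchild without 
enumerating anything, but only as long as it stayed in the group.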

But we can only group processes for one purpose, since a process can 
only belong to one group at a time. To send signals to a (sub-)tree of 
processes, I think the child-parent relationship is more natural to 
follow: sending a signal involves no waiting or blocking, so enumerating 
the tree and iterating over it is appropriate.
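
To illustrate the enumerate-and-iterate variant, here is a minimal 
sketch in Python. It assumes a {pid: ppid} snapshot of the process table 
is already available; how that snapshot is obtained is platform-specific 
and left open here. Both function names are hypothetical:

```python
import os
import signal
from collections import defaultdict, deque


def descendants(ppid_of, root):
    """Return all descendants of root from a {pid: ppid} snapshot.

    Breadth-first, so every parent is listed before its children.
    """
    children = defaultdict(list)
    for pid, ppid in ppid_of.items():
        children[ppid].append(pid)
    found, seen, queue = [], {root}, deque([root])
    while queue:
        for pid in sorted(children[queue.popleft()]):
            if pid not in seen:       # guard against odd self-parent entries
                seen.add(pid)
                found.append(pid)
                queue.append(pid)
    return found


def signal_tree(ppid_of, root, sig=signal.SIGTERM):
    """Send sig to root and to each descendant found in the snapshot."""
    for pid in [root] + descendants(ppid_of, root):
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass                      # exited since the snapshot was taken
```

The snapshot can go stale between enumeration and signalling (a pid may 
exit, or be reused), so this is best-effort by nature.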

Waiting on children is another purpose where process group(s) could be 
employed and I think they would be better spent this way.
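
As a sketch of what I mean (Python again, as a thin wrapper over fork, 
setpgid and waitpid; spawn_group and reap_group are illustrative names, 
not a proposed API): all children are placed in one group and a single 
loop waits on -pgid, so children that were forked by other code in the 
same process are never reaped by accident.

```python
import os


def spawn_group(exit_codes):
    """Fork one child per exit code, all in one new process group.

    Returns (pgid, pids). The children block on a pipe until the group is
    fully formed, which sidesteps the join-a-live-group race in this demo.
    """
    gate_r, gate_w = os.pipe()
    pids = []
    for code in exit_codes:
        pid = os.fork()
        if pid == 0:
            try:
                os.close(gate_w)
                os.read(gate_r, 1)    # EOF once the parent closes gate_w
            finally:
                os._exit(code)
        pids.append(pid)
        os.setpgid(pid, pids[0])      # first child leads, the rest join
    os.close(gate_r)
    os.close(gate_w)                  # release the children
    return pids[0], pids


def reap_group(pgid):
    """Wait on the group until none of our children remain in it.

    Returns {pid: exit code}, with -signal for children killed by a signal.
    """
    results = {}
    while True:
        try:
            pid, status = os.waitpid(-pgid, 0)
        except ChildProcessError:     # ECHILD: nothing left to wait for
            return results
        if os.WIFEXITED(status):
            results[pid] = os.WEXITSTATUS(status)
        else:
            results[pid] = -os.WTERMSIG(status)
```

A single reaper thread could sit in the reap_group loop; the hard part, 
adding processes to a group whose members may be exiting at the same 
time, is deliberately avoided here by the gate pipe.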

I think I now have a picture of how this could work. See my reply to Martin.

Regards, Peter

> Roger
> On 4/9/2014 2:08 AM, Peter Levart wrote:
>> Hi Martin,
>> As you might have seen in my later reply to Roger, there's still hope 
>> on that front: setpgid() + waitpid(-pgid, ...) might be the answer. I'm 
>> exploring in that direction. Shells are doing it, so why can't JDK?
>> It's a little trickier for Process API, since I imagine that shells 
>> form a group of processes from a pipeline which is known in advance 
>> while Process API will have to add processes to the live group 
>> dynamically. So some races will have to be resolved, but I think it's 
>> doable.
>> Stay tuned.
>> Regards, Peter
>> On 04/08/2014 07:48 PM, Martin Buchholz wrote:
>>> Peter, thank you very much for your deep analysis.
>>> TIL and am horrified: signals on Unix are not queued, not even if 
>>> you specify SA_SIGINFO.  Providing siginfo turns signals into proper 
>>> "messages" each with unique content, and it is unacceptable to 
>>> simply drop some (Especially when proper queueing seems required for 
>>> so-called real-time signals), but at least the Linux kernel does so 
>>> very deliberately.   45 years later, we are still fighting with 
>>> unreliable Unix signals...
>>> We can't call waitpid(WAIT_ANY, ) because we can only wait for 
>>> processes owned by the j.l.Process subsystem.  We can't override 
>>> libc functions like waitpid because the JVM may be a "guest" in some 
>>> other process.
>>> I don't know of any public examples, but it is reasonable to add a 
>>> JVM to a previously pure native code application, similarly to the 
>>> way tcl or lua is often used to provide a higher-level safer 
>>> programming api to native code, and some programs at Google use this 
>>> strategy.
>>> What problem are we actually trying to solve?  The army of reaper 
>>> threads is ugly, but the inefficiency is greatly mitigated by the 
>>> use of small explicit stack sizes.  Redoing the process code is 
>>> always risky, as we have already seen in this thread.
>>> Maintaining a single child helper process which spawns all the 
>>> (grand)child processes seems reasonable, although it would create a 
>>> permanent intermediate entry in the process table (pstree?) which 
>>> might confuse some sysadmin scripts.  Is it worth it?

More information about the core-libs-dev mailing list