RFR (M/L): 8131168: Refactor ProcessHandleImpl_*.c and add implememtation for AIX

Volker Simonis volker.simonis at gmail.com
Mon Jul 20 14:46:56 UTC 2015

Hi Roger,

thanks for looking at the webrev. Please find my comments inline:

On Mon, Jul 20, 2015 at 4:01 PM, Roger Riggs <Roger.Riggs at oracle.com> wrote:
> Hi Volker,
> yes, generally, the approach is an improvement and provides a clearer
> model of os specific variations.
> More comments inline...
> On 7/17/2015 2:28 PM, Volker Simonis wrote:
>> Hi,
>> I worked on refactoring the native ProcessHandleImpl implementation to
>> enable more code sharing and make it easier to add support for new
>> platforms (e.g. AIX) and I think I had a pretty nice version running. But
>> unfortunately I've just realized that the recent changes to
>> ProcessHandleImpl (i.e. "8078099: (process) ProcessHandle should uniquely
>> identify processes" and "8078108: (process) ProcessHandle.isAlive should
>> be
>> robust") have been massive so I have to start over to merge all my changes
>> with the new version.
>> But before doing that I just wanted to post my current changes which
>> cleanly apply to the repo before 8078099 and ask for your opinion:
>> http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/
>> Here's a summary (taken from ProcesHandleImpl_unix.c) of what I've
>> actually
>> done (if the output appears scrambled in you mail better read it in the
>> webrev
>> http://cr.openjdk.java.net/~simonis/webrevs/2015/8131168.v1/src/java.base/unix/native/libjava/ProcessHandleImpl_unix.c.udiff.html
>> ):
>> /**
>>   * This file contains the implementation of the native ProcessHandleImpl
>>   * functions which are common to all Unix variants.
>>   *
>>   * The currently supported Unix variants are Solaris, Linux, MaxOS X and
>> AIX.
>>   * The various similarities and differences between these systems make it
>> hard
>>   * to find a clear boundary between platform specific and shared code.
>>   *
>>   * In order to ease code sharing between the pltforms while still keeping
>> the
>>   * code as clean as possible (i.e. free of preprocessor macros) we use
>> the
>>   * following source code layout (remember that ProcessHandleImpl_unix.c
>> will
>>   * be compiled on EVERY Unix platform while ProcessHandleImpl_<os>.c will
>> be
>>   * only compiled on the specific OS):
>>   *
>>   * - all the JNI wrappers for the ProcessHandleImpl functions go into
>> this
>> file
>>   * - if their implementation is common on ALL the supported Unix
>> platforms
>> it
>>   *   goes right into the JNI wrappers
>>   * - if the whole function or substantial parts are platform dependent,
>> the
>>   *   implementation goes into os_<function_name> functions in
>>   *   ProcessHandleImpl_<os>.c
>>   * - if at least two platforms implement an os_<function_name> function
>> in
>> the
>>   *   same way, this implementation is factored out into
>> unix_<function_name>,
>>   *   placed into this file and called from the corresponding
>> os_<function_name>
>>   *   function.
>>   * - For convenience, all the os_ and unix_ functions are declared in
>>   *   ProcessHandleImpl_unix.h which is included into every
>>   *   ProcessHandleImpl_<os>.c file.
>>   *
>>   * Example 1:
>>   * ----------
>>   * The implementation of
>> Java_java_lang_ProcessHandleImpl_00024Info_initIDs()
>>   * is the same on all platforms except on Linux where it initilizes one
>>   * additional field. So we place the implementation right into
>>   * Java_java_lang_ProcessHandleImpl_00024Info_initIDs() but add call to
>>   * os_init() at the end of the function which is empty on all platforms
>>   * except Linux where it performs the additionally initializations.
>>   *
>>   * Example 2:
>>   * ----------
>>   * The implementation of Java_java_lang_ProcessHandleImpl_00024Info_info0
>> is
>>   * the same on Solaris and AIX but different on Linux and MacOSX. We
>> therefore
>>   * simply call the two helpers os_getStatInfo() and os_getCmdlineInfo().
>> The
>>   * Linux and MaxOS X versions of these functions (in the corresponding
>> files
>>   * ProcessHandleImpl_linux.c and ProcessHandleImpl_macosx.c) directly
>> contain
>>   * the platform specific implementations while the Solaris and AIX
>>   * implementations simply call back to unix_getStatInfo() and
>>   * unix_getCmdlineInfo() which are implemented right in this file.
>>   *
>>   * The term "same implementation" is still a question of interpretation.
>> It
>> my
>>   * be acceptable to have a few ifdef'ed lines if that allows the sharing
>> of
>> a
>>   * huge function. On the other hand, if the platform specific code in a
>> shared
>>   * function grows over a certain limit, it may be better to refactor that
>>   * functionality into corresponding, platform-specific os_ functions.
>>   */
>> This resulted in the new file ProcessHandleImpl_linux.c which now only
>> contains Linux-only code which was previously in ifdefed for Linux in
>> ProcessHandleImpl_linux.c. The advantage is that we now have only one
>> version of:
>>   Java_java_lang_ProcessHandleImpl_00024Info_initIDs
>>   unix_fillArgArray
>>   unix_uidToUser
>> and on version of
>>   unix_getChildren
>> which is shared by Solaris, Linux and AIX and one version of
>>   unix_getStatInfo
>>   unix_getCmdlineInfo
>> which are shared by Solaris and Linux.
>> Additionally I've added the AIX port following the new schema and I've
>> slightly improved the Solaris port to use the information from
>> psinfo.pr_psargs for reporting at least the first 80 characters of the
>> command line arguments. arg[0] from psinfo.pr_psargs is also used as the
>> "command()" string in the case where "/proc/%d/path/a.out" isn't readable.
>> This helps to report a command string for all the processes we do not own,
>> because in that case, "/proc/%d/path/a.out" isn't readable. By the way,
>> using args[0] as a fall-back for cmd beeing null also helps on MacOS X for
>> processes we don't own. Finally I've also done some test improvements.
> My bias was toward omitting information that was incomplete or not usable
> programatically.
> If the commandline is truncated then it is pretty useless to a program, the
> same with
> the executable command.  If the application has to try to guess what's
> missing and how
> to use what is present then it should be omitted from the API.  If there is
> a user present
> then they can use 'ps' or 'tasklist' and interpret the partial information.

But 'ps' is limited to 80 characters for arguments and command line as
well on Solaris as far as I know. Moreover, on AIX I haven't found
something better than this for the command line at all. Anyway we have
to live with the fact that this information is extremely platform
dependent. The least common divisor is surely the PID but for this API
I prefer to expose "as much as possible" for a given platform.

I suspect that once this API is out there in the world, people will
use it for all kind of stuff like monitoring or logging and I'd rather
prefer to see a truncated command line in a log file instad of just
seeing a plain PID. So usefull/useless depends pretty much on the use
case :)

> The focus should be on an API that can be used to control/manage a set of
> applications
> the run together, usually under the same userid or with root privileges.
> Comments on the diffs:
> ProcessHandle.java:
>  - The notes should probably be use @implNote  (though I disagree with
> providing truncated information)

Didn't knew that one but I'll use it. Thanks for the hint.
> ProcessHandleImpl_unix:
>  - I would not have expected to find the Solaris and AIX implementations of
> getStatInfo
> and getCmdlineInfo here; but perhaps it depends on how one considers the
> lineage of the real unix.
>   Since the more common platforms are Linux, Redhat, etc I would keep those
> versions in the
>   ProcessHandleImpl_unix common file.

As I wrote in the summary - I don't think the _unix file should
contain the implementations of the most prominent Unix platform but
rather the implementations shared between most platforms. If a
function implementation is only being used on Linux I think it is
reasonable to put it into a _linux file.

A while ago we discussed if the 'unix' directory should be named
'unix' or 'posix' or whatsoever and ended up with 'unix'. I think
that's a good decision because it honours the fact that there is no
single, true, genuine Unix variant. The 'unix' directory is a
pragmatic approach to share "common" Unix code with "common" in the
sense of "supported by at least two or more Unix variants". At least
that was my understanding.

I'll start to merge in your new changes into my schema and post a new
webrev once I'm done.

Thanks once again for your comments,

> Thanks, Roger
>> I have tested all my changes on Linux/amd64, Linux/ppc64, MacOS X, Solaris
>> and AIX.
>> I'd be happy to forward port my changes to the newest head version if you
>> generally agree with my approach (and give me a short time frame where you
>> promise not to do any massive changes :)
>> What do you think?
>> Thank you and best regards,
>> Volker

More information about the core-libs-dev mailing list