<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">But I also thought you said the delay
and retry fixed the problem. How could it fix the problem if it is
just duplicating something that is already in place?<br>
<br>
Chris<br>
<br>
On 10/4/18 9:48 AM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:5BB64447.4090408@oracle.com">
My delay and retry just duplicated the openDoor retry.<br>
The normal processing of FileNotFoundException(ENOENT) is to retry<br>
several times until the file is available.<br>
<br>
But the original problem reported is a "Permission denied"
(EACCES or EPERM).<br>
Delay and retry will not resolve a permissions error.<br>
<br>
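To illustrate the distinction, here is a minimal Java sketch (not the JDK code). A hypothetical <code>openWithRetry</code> helper retries only while the opener reports FileNotFoundException and lets any other IOException, such as "Permission denied", propagate immediately. The names <code>Opener</code>, <code>openWithRetry</code>, and <code>attempts</code> and the attempt limit are assumptions made for illustration.<br>
<pre>
```java
import java.io.FileNotFoundException;
import java.io.IOException;

class RetryOnMissingFile {
    /** Hypothetical stand-in for the native open call; returns a file descriptor. */
    interface Opener {
        int open() throws IOException;
    }

    /** Number of open attempts made by the most recent openWithRetry call. */
    static int attempts;

    /**
     * Retries while the file is missing (ENOENT surfaces as FileNotFoundException).
     * Any other IOException, e.g. "Permission denied", propagates at once because
     * waiting cannot change the file's permissions.
     */
    static int openWithRetry(Opener opener, int maxAttempts, long delayMillis)
            throws IOException, InterruptedException {
        attempts = 0;
        while (true) {
            attempts++;
            try {
                return opener.open();
            } catch (FileNotFoundException missing) {
                if (attempts >= maxAttempts) {
                    throw missing;
                }
                Thread.sleep(delayMillis);
            }
        }
    }
}
```
</pre>
With an opener that always fails with a plain IOException, the helper makes exactly one attempt, which is why a delay cannot help in the "Permission denied" case.<br>
<br>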
On 10/4/18, 12:30 PM, Chris Plummer wrote:
<blockquote
cite="mid:7115147a-2aae-d168-2db5-d1914837f050@oracle.com"
type="cite">
<div class="moz-cite-prefix">Didn't the retry after 100ms delay
work? If yes, why would it if the problem is that a java_pid
file was not cleaned up?<br>
<br>
Chris<br>
<br>
On 10/4/18 8:54 AM, Gary Adams wrote:<br>
</div>
<blockquote type="cite" cite="mid:5BB637A6.6010008@oracle.com">
First, let me retract the proposed change;<br>
it is not the right solution to the problem originally<br>
reported.<br>
<br>
Second, as a bit of explanation, consider the code fragments
below.<br>
<br>
The high level processing calls openDoor which is willing to
retry <br>
the operation as long as the error is flagged specifically<br>
as a FileNotFoundException.<br>
<br>
 VirtualMachineImpl.java:72<br>
 VirtualMachineImpl.c:81<br>
<br>
During my testing I had added a check at
VirtualMachineImpl.java:214<br>
and, when an IOException was detected, made a call to
checkPermissions<br>
to get more detailed information about the IOException. The
error <br>
I saw was an ENOENT from the stat call, not one of the detailed
checks for<br>
specific permissions issues (VirtualMachineImpl.c:143).<br>
<br>
  VirtualMachineImpl.c:118<br>
  VirtualMachineImpl.c:147<br>
<br>
What I missed in the original proposed solution was that
FileNotFoundException<br>
extends IOException. That means my delay and retry just
duplicates the higher<br>
level retry around the openDoor call.<br>
<br>
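A small Java sketch of why the subclass relationship matters here; it is illustrative only and not taken from the webrev. The class and method names are hypothetical. A handler that names only IOException also receives FileNotFoundException, so the two cases can be told apart only when the more specific catch clause comes first.<br>
<pre>
```java
import java.io.FileNotFoundException;
import java.io.IOException;

class CatchOrderDemo {
    /** Reports which handler runs for a given exception thrown by an open attempt. */
    static String handledBy(IOException thrown) {
        try {
            throw thrown;
        } catch (FileNotFoundException fnf) {
            // The more specific subclass has to be listed before IOException.
            return "FileNotFoundException handler";
        } catch (IOException ioe) {
            return "IOException handler";
        }
    }

    /** A handler that names only IOException sees both kinds of failure. */
    static boolean caughtAsIOException(IOException thrown) {
        try {
            throw thrown;
        } catch (IOException ioe) {
            return true;
        }
    }
}
```
</pre>
<br>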
Third, the original error message logged in the bug report :<br>
<br>
java.io.IOException: Permission denied<br>
at jdk.attach/sun.tools.attach.VirtualMachineImpl.open(Native Method)<br>
<br>
had to have come from<br>
<br>
 VirtualMachineImpl.c:70<br>
 VirtualMachineImpl.c:84<br>
<br>
which means the actual open call reported the file does exist<br>
but the permissions do not allow the file to be accessed.<br>
That also means the normal mechanism of removing leftover <br>
java_pid files would not have cleaned up another user's<br>
java_pid files.<br>
<br>
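The mapping from the native open's errno to the Java exception type can be sketched as follows. This is a Java restatement of the branch at VirtualMachineImpl.c:80-84, with hypothetical class and method names; the numeric errno values are the conventional Solaris and Linux ones and are an assumption here.<br>
<pre>
```java
import java.io.FileNotFoundException;
import java.io.IOException;

class OpenErrorMapping {
    // Conventional errno values on Solaris and Linux; assumed for illustration.
    static final int ENOENT = 2;
    static final int EACCES = 13;

    /**
     * Mirrors the branch in the native open: ENOENT becomes a
     * FileNotFoundException with no message, everything else becomes a plain
     * IOException carrying the strerror text.
     */
    static IOException toException(int err, String strerrorText) {
        if (err == ENOENT) {
            return new FileNotFoundException();
        }
        return new IOException(strerrorText);
    }
}
```
</pre>
An EACCES result therefore can never reach the FileNotFoundException retry path; it surfaces as the bare "Permission denied" IOException seen in the bug report.<br>
<br>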
=====<br>
src/jdk.attach/solaris/classes/sun/tools/attach/VirtualMachineImpl.java:<br>
...<br>
   67          // Opens the door file to the target VM. If
the file is not<br>
   68          // found it might mean that the attach
mechanism isn't started in the<br>
   69          // target VM so we attempt to start it and
retry.<br>
   70          try {<br>
   71              fd = openDoor(pid);<br>
   72          } catch (FileNotFoundException fnf1) {<br>
   73              File f = createAttachFile(pid);<br>
   74              try {<br>
   75                  sigquit(pid);<br>
   76   <br>
   77                  // give the target VM time to start
the attach mechanism<br>
   78                  final int delay_step = 100;<br>
   79                  final long timeout =
attachTimeout();<br>
   80                  long time_spend = 0;<br>
   81                  long delay = 0;<br>
   82                  do {<br>
   83                      // Increase timeout on each
attempt to reduce polling<br>
   84                      delay += delay_step;<br>
   85                      try {<br>
   86                          Thread.sleep(delay);<br>
   87                      } catch (InterruptedException x)
{ }<br>
   88                      try {<br>
   89                          fd = openDoor(pid);<br>
   90                      } catch (FileNotFoundException
fnf2) {<br>
   91                          // pass<br>
   92                      }<br>
   93   <br>
   94                      time_spend += delay;<br>
   95                      if (time_spend > timeout/2
&& fd == -1) {<br>
   96                          // Send QUIT again to give
target VM the last chance to react<br>
   97                          sigquit(pid);<br>
   98                      }<br>
   99                  } while (time_spend <= timeout
&& fd == -1);<br>
  100                  if (fd == -1) {<br>
  101                      throw new
AttachNotSupportedException(<br>
  102                          String.format("Unable to
open door %s: " +<br>
  103                            "target process %d doesn't
respond within %dms " +<br>
  104                            "or HotSpot VM not
loaded", socket_path, pid, time_spend));<br>
  105                  }<br>
...<br>
  212      // The door is attached to .java_pid<pid>
in the temporary directory.<br>
  213      private int openDoor(int pid) throws IOException
{<br>
  214          socket_path = tmpdir + "/.java_pid" + pid;<br>
  215          fd = open(socket_path);<br>
  216   <br>
  217          // Check that the file owner/permission to
avoid attaching to<br>
  218          // bogus process<br>
  219          try {<br>
  220              checkPermissions(socket_path);<br>
  221          } catch (IOException ioe) {<br>
  222              close(fd);<br>
  223              throw ioe;<br>
  224          }<br>
  225          return fd;<br>
  226      }<br>
<br>
=====<br>
src/jdk.attach/solaris/native/libattach/VirtualMachineImpl.c:<br>
...<br>
   59   JNIEXPORT jint JNICALL
Java_sun_tools_attach_VirtualMachineImpl_open<br>
   60    (JNIEnv *env, jclass cls, jstring path)<br>
   61   {<br>
   62      jboolean isCopy;<br>
   63      const char* p = GetStringPlatformChars(env,
path, &isCopy);<br>
   64      if (p == NULL) {<br>
   65          return 0;<br>
   66      } else {<br>
   67          int fd;<br>
   68          int err = 0;<br>
   69   <br>
   70          fd = open(p, O_RDWR);<br>
   71          if (fd == -1) {<br>
   72              err = errno;<br>
   73          }<br>
   74   <br>
   75          if (isCopy) {<br>
   76              JNU_ReleaseStringPlatformChars(env,
path, p);<br>
   77          }<br>
   78   <br>
   79          if (fd == -1) {<br>
   80              if (err == ENOENT) {<br>
   81                  JNU_ThrowByName(env,
"java/io/FileNotFoundException", NULL);<br>
   82              } else {<br>
   83                  char* msg = strdup(strerror(err));<br>
   84                  JNU_ThrowIOException(env, msg);<br>
   85                  if (msg != NULL) {<br>
   86                      free(msg);<br>
   87                  }<br>
   88              }<br>
   89          }<br>
   90          return fd;<br>
   91      }<br>
   92   }<br>
...<br>
   99   JNIEXPORT void JNICALL
Java_sun_tools_attach_VirtualMachineImpl_checkPermissions<br>
  100    (JNIEnv *env, jclass cls, jstring path)<br>
  101   {<br>
  102      jboolean isCopy;<br>
  103      const char* p = GetStringPlatformChars(env,
path, &isCopy);<br>
  104      if (p != NULL) {<br>
  105          struct stat64 sb;<br>
  106          uid_t uid, gid;<br>
  107          int res;<br>
  108   <br>
  109          memset(&sb, 0, sizeof(struct stat64));<br>
  110   <br>
  111          /*<br>
  112           * Check that the path is owned by the
effective uid/gid of this<br>
  113           * process. Also check that group/other
access is not allowed.<br>
  114           */<br>
  115          uid = geteuid();<br>
  116          gid = getegid();<br>
  117   <br>
  118          res = stat64(p, &sb);<br>
  119          if (res != 0) {<br>
  120              /* save errno */<br>
  121              res = errno;<br>
  122          }<br>
  123   <br>
  124          if (res == 0) {<br>
  125              char msg[100];<br>
  126              jboolean isError = JNI_FALSE;<br>
  127              if (sb.st_uid != uid && uid !=
ROOT_UID) {<br>
  128                  snprintf(msg, sizeof(msg),<br>
  129                      "file should be owned by the
current user (which is %d) but is owned by %d", uid,
sb.st_uid);<br>
  130                  isError = JNI_TRUE;<br>
  131              } else if (sb.st_gid != gid &&
uid != ROOT_UID) {<br>
  132                  snprintf(msg, sizeof(msg),<br>
  133                      "file's group should be the
current group (which is %d) but the group is %d", gid,
sb.st_gid);<br>
  134                  isError = JNI_TRUE;<br>
  135              } else if ((sb.st_mode &
(S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH)) != 0) {<br>
  136                  snprintf(msg, sizeof(msg),<br>
  137                      "file should only be readable
and writable by the owner but has 0%03o access", sb.st_mode
& 0777);<br>
  138                  isError = JNI_TRUE;<br>
  139              }<br>
  140              if (isError) {<br>
  141                  char buf[256];<br>
  142                  snprintf(buf, sizeof(buf),
"well-known file %s is not secure: %s", p, msg);<br>
  143                  JNU_ThrowIOException(env, buf);<br>
  144              }<br>
  145          } else {<br>
  146              char* msg = strdup(strerror(res));<br>
  147              JNU_ThrowIOException(env, msg);<br>
  148              if (msg != NULL) {<br>
  149                  free(msg);<br>
  150              }<br>
  151          }<br>
<br>
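The group/other access test in checkPermissions (VirtualMachineImpl.c:135-137) can be sketched in Java as follows. The class and method names are hypothetical, and the mode is passed in as a plain int rather than read from stat64; the bit values are the usual POSIX ones.<br>
<pre>
```java
class DoorFileModeCheck {
    // POSIX mode bits, written in octal as in sys/stat.h.
    static final int S_IRGRP = 0040;
    static final int S_IWGRP = 0020;
    static final int S_IROTH = 0004;
    static final int S_IWOTH = 0002;

    /** True when neither group nor other has read or write access. */
    static boolean ownerOnly(int mode) {
        return (mode & (S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH)) == 0;
    }

    /** Builds the same style of message as the native check, or null if the mode is acceptable. */
    static String describeProblem(int mode) {
        if (ownerOnly(mode)) {
            return null;
        }
        return String.format(
            "file should only be readable and writable by the owner but has 0%03o access",
            mode & 0777);
    }
}
```
</pre>
Note that this check only runs after the open has already succeeded, so it cannot be the source of the "Permission denied" reported from the open call itself.<br>
<br>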
On 10/2/18, 6:23 PM, David Holmes wrote:
<blockquote
cite="mid:e1adb28a-031f-5c28-74a5-451718a7683b@oracle.com"
type="cite">Minor correction: EPERM -> EACCES for Solaris
<br>
<br>
Hard to see how to get a transient EACCES when opening a
file ... though as it is really a door I guess there could
be additional complexity. <br>
<br>
David <br>
<br>
On 3/10/2018 7:54 AM, Chris Plummer wrote: <br>
<blockquote type="cite">On 10/2/18 2:38 PM, David Holmes
wrote: <br>
<blockquote type="cite">Chris, <br>
<br>
On 3/10/2018 6:57 AM, Chris Plummer wrote: <br>
<blockquote type="cite"> <br>
<br>
On 10/2/18 1:44 PM, <a
class="moz-txt-link-abbreviated"
href="mailto:gary.adams@oracle.com"
moz-do-not-send="true">gary.adams@oracle.com</a>
wrote: <br>
<blockquote type="cite">The general attach sequence
... <br>
<br>
src/jdk.attach/solaris/classes/sun/tools/attach/VirtualMachineImpl.java
<br>
<br>
 the attacher creates an attach_pid file in a
directory where the attachee is running <br>
 issues a signal to the attachee <br>
<br>
 loops waiting for the java_pid file to be created
<br>
 default timeout is 10 seconds <br>
<br>
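As a rough worked example of the wait loop, assuming the 10 second default timeout mentioned above and the 100 ms delay_step from the VirtualMachineImpl.java loop quoted in this thread: the hypothetical helper <code>maxRetries</code> below replays only the loop's bookkeeping, without sleeping or opening anything.<br>
<pre>
```java
class AttachPollSchedule {
    /** Counts the openDoor retries the loop makes if the door never appears. */
    static int maxRetries(long timeoutMillis, int delayStepMillis) {
        long timeSpent = 0;
        long delay = 0;
        int retries = 0;
        do {
            delay += delayStepMillis;   // each wait is one step longer than the last
            timeSpent += delay;         // stands for sleep(delay) plus one open attempt
            retries++;
        } while (timeSpent <= timeoutMillis);
        return retries;
    }

    /** Total time slept after the given number of retries: step * n * (n + 1) / 2. */
    static long timeSpentAfter(int retries, int delayStepMillis) {
        return (long) delayStepMillis * retries * (retries + 1) / 2;
    }
}
```
</pre>
With those values the loop gives up after 14 retries, having slept 10500 ms in total.<br>
<br>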
</blockquote>
So getting a FileNotFoundException while in this loop
is OK, but any other IOException is not. <br>
<br>
<blockquote type="cite">src/hotspot/os/solaris/attachListener_solaris.cpp
<br>
<br>
  attachee creates the java_pid file <br>
  listens til the attacher opens the door <br>
<br>
</blockquote>
I don't think this is related, but JDK-8199811 made
a fix in attachListener_solaris.cpp to make it wait up
to 10 seconds for initialization to complete before
failing the enqueue. <br>
<br>
<blockquote type="cite">... <br>
Not sure when a bare IOException is thrown rather
than the <br>
more specific FileNotFoundException. <br>
</blockquote>
Where is the IOException originating from? I wonder if
the issue is that the file is in the process of being
created, but is not fully created yet. Maybe it is
there, but owner/group/permissions have not been set
yet, and this results in an IOException instead of
FileNotFoundException. <br>
</blockquote>
<br>
The exception is shown in the bug report: <br>
<br>
 [java.io.IOException: Permission denied <br>
at
jdk.attach/sun.tools.attach.VirtualMachineImpl.open(Native
Method) <br>
at
jdk.attach/sun.tools.attach.VirtualMachineImpl.openDoor(VirtualMachineImpl.java:215)
<br>
at
jdk.attach/sun.tools.attach.VirtualMachineImpl.&lt;init&gt;(VirtualMachineImpl.java:71)
<br>
at
jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
<br>
at
jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
<br>
at
jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:114)
<br>
at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:98) <br>
<br>
And if you look at the native code the EPERM from open
will cause IOException to be thrown. <br>
<br>
./jdk.attach/solaris/native/libattach/VirtualMachineImpl.c <br>
<br>
JNIEXPORT jint JNICALL
Java_sun_tools_attach_VirtualMachineImpl_open <br>
 (JNIEnv *env, jclass cls, jstring path) <br>
{ <br>
   jboolean isCopy; <br>
   const char* p = GetStringPlatformChars(env, path,
&isCopy); <br>
   if (p == NULL) { <br>
       return 0; <br>
   } else { <br>
       int fd; <br>
       int err = 0; <br>
<br>
       fd = open(p, O_RDWR); <br>
       if (fd == -1) { <br>
           err = errno; <br>
       } <br>
<br>
       if (isCopy) { <br>
           JNU_ReleaseStringPlatformChars(env, path,
p); <br>
       } <br>
<br>
       if (fd == -1) { <br>
           if (err == ENOENT) { <br>
               JNU_ThrowByName(env,
"java/io/FileNotFoundException", NULL); <br>
           } else { <br>
               char* msg = strdup(strerror(err)); <br>
               JNU_ThrowIOException(env, msg); <br>
               if (msg != NULL) { <br>
                   free(msg); <br>
               } <br>
<br>
<br>
We should add the path to the exception message. <br>
<br>
</blockquote>
Thanks David. So if EPERM is the error and a retry 100ms
later works, I think that supports my hypothesis that the
file is not quite fully created. So Gary's fix is probably
fine. The only other possible fix I can think of that
wouldn't require an explicit delay (or multiple retries)
is probably not worth the complexity. It would require
that the attachee create two files, and the attacher try
to open the second file first. When it either opens or
returns EPERM, you know the first file can safely be
opened. <br>
<br>
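A minimal Java sketch of the two-file idea, purely to make the ordering concrete. The class, method, and file names are hypothetical, and <code>Files.exists</code> is used as an approximation of "opens or returns EPERM"; nothing like this exists in the attach code.<br>
<pre>
```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class TwoFileHandshake {
    /** Attachee side: publish the primary file first, then the marker. */
    static void publish(Path primary, Path marker) throws IOException {
        Files.createFile(primary);
        // Ownership and permissions of the primary file would be finalized here.
        Files.createFile(marker);
    }

    /** Attacher side: the primary file is treated as ready only once the marker exists. */
    static boolean primaryReady(Path marker) {
        return Files.exists(marker);
    }
}
```
</pre>
Because the marker is created last, seeing it implies the primary file was fully set up earlier, at the cost of an extra file to create and clean up.<br>
<br>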
Chris <br>
<blockquote type="cite">David <br>
----- <br>
<br>
<blockquote type="cite">Chris <br>
<blockquote type="cite"> <br>
<br>
<br>
On 10/2/18 4:11 PM, Chris Plummer wrote: <br>
<blockquote type="cite">Can you summarize how the
attach handshaking is supposed to work? I'm just
wondering why the attacher would ever be looking
for the file before the attachee has created it.
It seems a proper handshake would prevent this.
Maybe there's some sort of visibility issue where
the attachee has indeed created the file, but it
is not immediately visible to the attacher
process. <br>
<br>
Chris <br>
<br>
On 10/2/18 12:27 PM, <a
class="moz-txt-link-abbreviated"
href="mailto:gary.adams@oracle.com"
moz-do-not-send="true">gary.adams@oracle.com</a>
wrote: <br>
<blockquote type="cite">The problem reproduced
pretty quickly. <br>
I added a call to checkPermissions, which revealed
a <br>
"file not found" from the stat call when the
IOException <br>
was detected. <br>
<br>
There has been some flakiness from the Solaris
test machines today, <br>
so I'll continue with the testing a bit longer.
<br>
<br>
On 10/2/18 3:12 PM, Chris Plummer wrote: <br>
<blockquote type="cite">Without the fix was this
issue easy enough to reproduce that you can be
sure this is resolving it? <br>
<br>
Chris <br>
<br>
On 10/2/18 8:16 AM, Gary Adams wrote: <br>
<blockquote type="cite">Solaris debug builds
are failing tests that use the attach
interface. <br>
An IOException is reported when the java_pid
file cannot be opened. <br>
<br>
It appears that the attempt to attach is
taking place too quickly. <br>
This workaround will allow the open
operation to be retried <br>
after a short pause. <br>
<br>
 Webrev: <a class="moz-txt-link-freetext"
href="http://cr.openjdk.java.net/%7Egadams/8210337/webrev/"
moz-do-not-send="true">http://cr.openjdk.java.net/~gadams/8210337/webrev/</a>
<br>
 Issue: <a class="moz-txt-link-freetext"
href="https://bugs.openjdk.java.net/browse/JDK-8210337"
moz-do-not-send="true">https://bugs.openjdk.java.net/browse/JDK-8210337</a>
<br>
<br>
Testing is in progress. <br>
</blockquote>
<br>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
<br>
<br>
</blockquote>
</blockquote>
<br>
<br>
</blockquote>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</body>
</html>