Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications

yumin.qi at oracle.com yumin.qi at oracle.com
Mon May 9 10:48:17 PDT 2011


  Hi, Ramki

   I think there is no need for lock if all threads supposed to be 
stopped at safepoints except for VMThread and CMS Thread. This is the 
previous version, check if the caller is one for them. In fact, 
rotate_log only called just before ending safepoints, so it is at STW. 
end() is only called by those two threads. I will test this and 
performance affection.

   Write to a file with file handle is thread safe, this is POSIX 
requirement so we don't have to use lock when writing to log file. 
fileStream is a good example for this.
   If at STW, there is only one thread is running, then no need to lock 
for changing file handle.

   I also find:
   In ConcurrentGCThread::stopWorldAndDo(..)s call 
SafepointSynchronizer::begin(), and in begin():

   Thread* myThread = Thread::current();
   assert(myThread->is_VM_thread(), "Only VM thread may execute a 
safepoint");

   This could be a problem since it is called from ConcurrentGCThread. 
Then another search for the calling site of stopWorldAndDo find no 
where.  Is this function ever used? If it is used, it will be a problem 
for this assertion.

   So now, seems only VMThread is the only caller of rotate_log. I will 
test the fastdebug version with assert in rotate_log:

   assert(thread->is_VM_thread(), "Not VM Thread");

to verify my theory.

   Thanks
   Yumin

On 5/9/2011 9:41 AM, Y. Srinivas Ramakrishna wrote:
> Hi Yumin --
>
> It seems on a quick audit of the code that ttyLock is a
> global lock (unless I am missing somrthing). For performance
> one would probably want a per-rotating file stream lock, rather than
> overload a global lock protecting all streams (which
> is what seems to be happening here, and which in the
> general case could all kinds of deadlocking and dependency
> headaches).
>
> I also could not easily see which of the streams used
> locking and for which operations...
>
> So I am not sure your locking here in rotate_log()
> is really giving us any safety here. Did you check
> the paths through which the file descriptor is accessed
> to make sure this (or another) lock protected access in
> any manner? I'd start by putting in appropriate asserts
> in the paths that do the write()'s to the rotating
> stream and work backwards to ensure that locking was
> adequate. If it isn't, I'd suggest introducing a
> per-rotating-stream lock here, for better scalability when
> one might have multiple such streams in the future.
>
> It's also a good idea to measure the performance impact
> of this feature (the rotation and the locking -- or
> reference counting -- which appears to be important
> for safety of rotation in general) comparing both
> no rotation gc logging  and rotation enabled gc logging,
> versus the baseline.
>
> -- ramki
>
> On 5/9/2011 9:21 AM, Y. Srinivas Ramakrishna wrote:
> ...
>> Finally, on locking: did you try using ttyLocker() instead of
>> doing the raw locking you are doing here? I notice a number of
>> cutouts for the locking path in hold(): we'd need to do a careful audit
>> to make sure this does not leave open the possibility of
>> leakage of the file descriptor outside of the lock. I did
>> not have the time to do such an audit, but it would be a good
>> idea to do so. (Have all kinds of logging -- every conceivable kind
>> of GC logging enabled and then run with rotation at the smallest
>> possible size, say just a handful of bytes, enabled -- in order to
>> stress test this.)
>
> ...
>>>> It's fine to do the locking. I have no objection to the locking;
>>>> my question was why the locking was elided for those two cases.
>>>> It would seem to me that you'd need some way to make sure that
>>>> the stream is not under active use when the rotation occurs,
>>>> and the lock would be a fine way to ensure that the stream is
>>>> never accessed except under protection of the lock. I can see
>>>> that the rotation is done at a STW pause when no mutator is
>>>> running. It turns out today that no daemons run at that time
>>>> either, but it is conceivable that some may in the future
>>>> and have access to or attempt tp access that stream/log.
>>>> We should make the code robust in the face of future such
>>>> evolution, by at least putting in enough controls, and
>>>> the lock would seem to me to be a fine mechanism to do so.
>>>> (After auditing the code to make sure that the handle is
>>>> not accessible outside the lock.)
>>>>
>>>> -- ramki
>>
>


More information about the hotspot-gc-dev mailing list