Request for review: 6941923: RFE: Handling large log files produced by long running Java Applications
Y. Srinivas Ramakrishna
y.s.ramakrishna at oracle.com
Mon May 9 12:56:44 PDT 2011
Hi Yumin, If you do not lock, then your counting can go quite awry.
In that case you should use an atomic increment to
do the book-keeping of accumulated count? Or am i being too anal here
and rough counting is OK in this case?
I'll respond to yr other points below once this point has
been taken care of.
May be we can move this discussion off-line so as not to
spam the alias (which i may be already guilty of) because
of my rat-holing? (Folks scream if you want to keep this
discussion out of yr mailboxes, and we'll go off-alias.)
On 5/9/2011 10:48 AM, yumin.qi at oracle.com wrote:
> Hi, Ramki
> I think there is no need for lock if all threads supposed to be stopped at safepoints except for
> VMThread and CMS Thread. This is the previous version, check if the caller is one for them. In fact,
> rotate_log only called just before ending safepoints, so it is at STW. end() is only called by those
> two threads. I will test this and performance affection.
> Write to a file with file handle is thread safe, this is POSIX requirement so we don't have to use
> lock when writing to log file. fileStream is a good example for this.
> If at STW, there is only one thread is running, then no need to lock for changing file handle.
> I also find:
> In ConcurrentGCThread::stopWorldAndDo(..)s call SafepointSynchronizer::begin(), and in begin():
> Thread* myThread = Thread::current();
> assert(myThread->is_VM_thread(), "Only VM thread may execute a safepoint");
> This could be a problem since it is called from ConcurrentGCThread. Then another search for the
> calling site of stopWorldAndDo find no where. Is this function ever used? If it is used, it will be
> a problem for this assertion.
> So now, seems only VMThread is the only caller of rotate_log. I will test the fastdebug version with
> assert in rotate_log:
> assert(thread->is_VM_thread(), "Not VM Thread");
> to verify my theory.
> On 5/9/2011 9:41 AM, Y. Srinivas Ramakrishna wrote:
>> Hi Yumin --
>> It seems on a quick audit of the code that ttyLock is a
>> global lock (unless I am missing somrthing). For performance
>> one would probably want a per-rotating file stream lock, rather than
>> overload a global lock protecting all streams (which
>> is what seems to be happening here, and which in the
>> general case could all kinds of deadlocking and dependency
>> I also could not easily see which of the streams used
>> locking and for which operations...
>> So I am not sure your locking here in rotate_log()
>> is really giving us any safety here. Did you check
>> the paths through which the file descriptor is accessed
>> to make sure this (or another) lock protected access in
>> any manner? I'd start by putting in appropriate asserts
>> in the paths that do the write()'s to the rotating
>> stream and work backwards to ensure that locking was
>> adequate. If it isn't, I'd suggest introducing a
>> per-rotating-stream lock here, for better scalability when
>> one might have multiple such streams in the future.
>> It's also a good idea to measure the performance impact
>> of this feature (the rotation and the locking -- or
>> reference counting -- which appears to be important
>> for safety of rotation in general) comparing both
>> no rotation gc logging and rotation enabled gc logging,
>> versus the baseline.
>> -- ramki
>> On 5/9/2011 9:21 AM, Y. Srinivas Ramakrishna wrote:
>>> Finally, on locking: did you try using ttyLocker() instead of
>>> doing the raw locking you are doing here? I notice a number of
>>> cutouts for the locking path in hold(): we'd need to do a careful audit
>>> to make sure this does not leave open the possibility of
>>> leakage of the file descriptor outside of the lock. I did
>>> not have the time to do such an audit, but it would be a good
>>> idea to do so. (Have all kinds of logging -- every conceivable kind
>>> of GC logging enabled and then run with rotation at the smallest
>>> possible size, say just a handful of bytes, enabled -- in order to
>>> stress test this.)
>>>>> It's fine to do the locking. I have no objection to the locking;
>>>>> my question was why the locking was elided for those two cases.
>>>>> It would seem to me that you'd need some way to make sure that
>>>>> the stream is not under active use when the rotation occurs,
>>>>> and the lock would be a fine way to ensure that the stream is
>>>>> never accessed except under protection of the lock. I can see
>>>>> that the rotation is done at a STW pause when no mutator is
>>>>> running. It turns out today that no daemons run at that time
>>>>> either, but it is conceivable that some may in the future
>>>>> and have access to or attempt tp access that stream/log.
>>>>> We should make the code robust in the face of future such
>>>>> evolution, by at least putting in enough controls, and
>>>>> the lock would seem to me to be a fine mechanism to do so.
>>>>> (After auditing the code to make sure that the handle is
>>>>> not accessible outside the lock.)
>>>>> -- ramki
More information about the hotspot-gc-dev