Code Review Request 7142596: RMI JPRT tests are failing

Darryl Mocek darryl.mocek at
Fri Jul 13 20:40:14 UTC 2012

On 07/12/2012 05:13 PM, Stuart Marks wrote:
> OK, I took a look at the revised webrev. Most stuff is good. A couple 
> changes are necessary though.
> ***
> Not a big deal, but the comment at lines 65-67 is no longer necessary.
Comment removed.
> ***
> I think the reserved port range stuff still needs to be fixed up.
> The comment at lines 81-82 talks about a range 64000-64100 which isn't 
> really present anywhere. The comment should instead say that PORT_MIN 
> and PORT_MAX should be kept adjusted so that they specify the range of 
> ports defined by the constants following. Make sure to do this 
> adjustment; PORT_MAX is 64002 here.
I reserved ports 64000-64100 (even though only 64001-64005 are currently 
being used) for fixed port tests (for tests which require fixed ports) 
so any future tests which require fixed ports can be assured there are 
ports available within a range for their use.  In fixing the tests, I 
saw  that those who wrote tests and used fixed ports grabbed random port 
numbers all over the place.  Providing a reserved range allows them to 
just grab the next number and to keep the fixed port numbers in a group, 
even when new tests are added. If the ports are defined in a central 
place, like TestLibrary, then hopefully developers will come to 
TestLibrary to reserve the ports rather then define them in the tests 
themselves, like was done previously.  I chose 100 ports for 
"future-proofing"...I don't think 100 is too many, but I've reduced it 
to 10 and lowered the max to 64010.  I was also thinking of looking for 
tests which used fixed ports outside of RMI tests and standardizing 
those ports here, but that's possible future work and I'm not sure it's 
> The previous webrev had PORT_MIN and PORT_MAX specify a range of 
> 1024-64000 as the *allowed* range for ports returned by 
> getUnusedRandomPort(). But as I described previously, this is 
> difficult to enforce given the varying port allocation schemes on 
> different OSes.
> Instead I think we need to invert the sense of the range 
> PORT_MIN..PORT_MAX and make it be the range of ports reserved by tests 
> that require fixed ports (64001 through 64005). Then, 
> getUnusedRandomPort() needs to be changed to *disallow* ports in this 
> range, that is, to retry if the port is within this range. Thus the 
> condition on the while-loop has to be inverted, something like
>     while (... unusedRandomPort >= PORT_MIN && unusedRandomPort <= 
Yes, this is correct and it has been changed.
> So, how could getUnusedRandomPort() have worked in its present state? 
> Well I think it ends up retrying 10 times, then at the end of the loop 
> unusedRandomPort is actually some legal port number -- albeit possibly 
> one within the disallowed range -- and almost certainly not -1, and so 
> the method returns it and the caller uses it. So, the success-testing 
> logic here will also have to change so that it retests whether the 
> port is in the disallowed range.
> The comment on getUnusedRandomPort() also needs to be updated to 
> reflect its new policy on the reserved range, as well as throwing an 
> exception on failure. Also (nitpick) it should say "less than" not 
> "less then".
Yes, I always do this. :(
> Looking at getUnusedRandomPort() again more closely [sorry] I think 
> the catch of Exception that does nothing is suspicious. I'm not sure 
> why getting a ServerSocket would fail, but if it does (maybe the 
> system is out of ports??) we should just throw out to the caller. 
> Perhaps ideally we'd just have this method throw IOException, but if 
> that requires all callers to be updated, it might be easier just to 
> catch IOException and throw a RuntimeException that wraps the caught 
> IOException.
I see no harm in trying again if it fails.  If we have run out of ports, 
then possibly one or more used ports will be closed and freed and a 
subsequent try will succeed.  I don't know all of the cases where a 
failure might occur.  Anyway, if this is the prevailing way of handling 
this, I'll change it.
> Similar comments apply to the catch(Exception)/do-nothing code in the 
> other utility methods.
Now throwing RuntimeException in getRegistryPort.
> Certainly getRegistryPort() should just throw (or possibly wrap and 
> rethrow) any exception caught.
> For createRegistryOnUnusedPort(), the catching of ExportException is 
> handled properly. The second catch clause of the outer try-block, and 
> the catch of the inner try-block, both ignore exceptions. The code 
> will then end up retrying. Is it reasonable that retrying in these 
> cases will result in a different outcome? My guess is that it's more 
> likely that something is seriously wrong that will cause all the 
> retries to fail, in which case this method will discard the exceptions 
> it has just caught and throw a *new* RemoteException instance.
This was a holdover from webrev.00 which should have been removed. I 
agree with Alan and you here that multiple tries of 
LocateRegistry.createRegistry(0) is unnecessary.  I've modified this to 
make a single attempt at using LocateRegistry.createRegistry(0) and if 
it fails, make a single attempt at getting an unused random port and 
creating a registry on this port, then failing with a RuntimeException.
> I'm particularly sensitive to this; as you might recall a couple weeks 
> ago I was having a wrestling match with the jstatd tests (I finished 
> the match, but I'm not sure I won). The primary problem was that 
> several layers of code would catch and discard exceptions, which made 
> diagnosing the problem incredibly difficult. So, I'd recommend 
> removing the "do nothing" catch clauses.
> Thinking further about createRegistryOnUnusedPort(), I'm not sure that 
> retrying 10 (or some other number of times) actually makes sense. It 
> does for getUnusedRandomPort(), which has to retry in order to get a 
> port outside the disallowed range. But for creating a registry, 
> createRegistry(0) will usually work the first time. If it throws 
> ExportException, it does so for a specific reason, so we should retry 
> once on an unused random port. But if this still fails, don't think 
> retrying repeatedly makes sense.
See above.

Changes pass all tests.  Updated webrev at

> s'marks
> On 7/10/12 2:14 PM, Darryl Mocek wrote:
>> On 07/09/2012 04:41 PM, Stuart Marks wrote:
>>> OK, here's the review for the second half of the files in the 
>>> webrev. I saw
>>> your reply to the first half (to which I'll reply separately), and I 
>>> don't
>>> think there's anything here that's affected by them.
>>> ***
>>> ***
>>> REGISTRY_PORT should be a local variable; also rename to use mixed 
>>> case.
>> Changed to a private registryPort (see next issue).
>>> Eh, whoops, after looking at I see that it 
>>> accesses
>>> the REGISTRY_PORT field directly. This is why direct field access is 
>>> a bad
>>> idea. :-) Now the question is, has REGISTRY_PORT been initialized 
>>> before
>>> ApplicationServer needs it? It turns out that it has been -- but 
>>> only in some
>>> cases.
>>> It seems like the test is trying to support two modes, one that runs 
>>> in two
>>> threads in the same JVM, and the other that runs in two separate 
>>> JVMs. If
>>> they are in separate JVMs, things will no longer work because in the 
>>> JVM that
>>> runs ApplicationServer.main(), AppleUserImpl.REGISTRY_PORT will be 
>>> -1. I
>>> suspect that our test environment doesn't support the separate JVM 
>>> mode, but
>>> it seems unwise to break it.
>>> I'd suggest that in two-JVM mode the classes fall back to using a
>>> "well-known" default registry port number, which in this case seems 
>>> like 2006.
>>> In single-JVM mode, AppleUserImpl creates an instance of 
>>> ApplicationServer,
>>> so I'd suggest adding a method to ApplicationServer that allows 
>>> AppleUserImpl
>>> to store the randomly-assigned registry port number into it, 
>>> overriding the
>>> default value.
>>> This seems like this is the simplest way to preserve the two modes of
>>> operation but to support the random port selection model we're 
>>> trying to
>>> achieve.
>> Rather then going the "fixed port" route, which is what we're trying 
>> to get
>> away from, I've changed the implementation of both AppletUserImpl's and
>> ApplicationServer so ApplicationServer requires a port and AppleUserImpl
>> supplies the port on construction of ApplicationServer. I thought of 
>> modifying
>> ApplicationServer's constructor to create a port using
>> TestLibrary.getUnusedRandomPort, but decided requiring a port is 
>> better as
>> ApplicationServer's job is to look for already exported AppleUser 
>> objects.
>>> *** activatable/
>>> int registryPort = new 
>>> Integer(System.getProperty("rmi.registry.port"));
>>> I'd suggest using Integer.parseInt() instead of new Integer(). Not a 
>>> huge
>>> deal, but it's probably more conventional to use parseInt() and it 
>>> avoids
>>> boxing.
>>> One could probably do Integer.getInteger("rmi.registry.port") but 
>>> this is
>>> seems pretty obscure to me even though it's more succinct.
>>> The same also applies to the following:
>>> -
>>> - unicast/
>>> -
>>> -
>>> -
>>> -
>>> - dgcDeadLock/
>> Integer.parseInt returns a primitive (which is what the return is 
>> assigned to)
>> and it appears Integer.parseInt is "faster" then creating a new Integer.
>> Changed to Integer.parseInt in all places referenced.
>>> ***
>>> The pattern here is a bit odd, as the test creates the registry, 
>>> throws away
>>> the returned reference, and then calls getRegistry() to get another 
>>> Registry
>>> reference. It *seems* like they're identical references, but in fact 
>>> the
>>> first is apparently a reference to the actual Registry implementation,
>>> whereas the second is a remote stub.
>>> The tests seem to do all the actual work using the remote stub, 
>>> which seems
>>> proper.
>>> This is confusing, though, as it looks like there's a redundant 
>>> Registry
>>> reference now. This might lead someone in the future to "simplify" 
>>> the test
>>> by not getting the remote stub, which in turn might invalidate some 
>>> tests.
>>> (In fact I was going to suggest this but I decided to investigate 
>>> further
>>> first.)
>>> At the very least, I'd suggest renaming the variable that holds the 
>>> newly
>>> created Registry to something like "registryImpl" to make it clear 
>>> that it's
>>> different from the thing returned by getRegistry(), even though they 
>>> have a
>>> the same time.
>>> Another possibility is to rearrange the TestLibrary API so that 
>>> there is a
>>> single utility method that combines createRegistryOnUnusedPort() and
>>> getRegistryPort(). That is, it creates a new registry and simply 
>>> returns the
>>> port on which it was created, not a reference to the registry 
>>> implementation.
>>> I don't think the registry implementation is actually ever used by 
>>> the tests,
>>> and it might simplify things a bit as well.
>>> Possibly similar issues with:
>>> -
>>> -
>>> ***
>>> Unnecessary call to TestLibrary.getUnusedRandomPort()?
>> Looks like extra code left over from the change from using
>> TestLibrary.getUnusedRandomPort/LocateRegistry.createRegistry(randomPort) 
>> to
>> TestLibrary.createRegistryOnUnusedPort...removed.
>>> ***
>>> Mostly pretty straightforward, but I do have some concerns about the 
>>> random
>>> port selection and a potential clash with the "reserved port range" as
>>> defined in this test library.
>>> The getUnusedRandomPort() method attempts to get a socket within the 
>>> range
>>> (1024,64000) and will retry 10 times if it can't. Unfortunately, MacOS
>>> allocates ports more-or-less sequentially in the range [49152, 
>>> 65536) which
>>> means that when the kernel's internal counter gets to 64000,
>>> getUnusedRandomPort()'s retries will fail, causing tests to fail 
>>> until the
>>> counter wraps around.
>>> Other systems behave differently; Linux seems to allocate them 
>>> randomly in
>>> the range [32768,65536) and Windows XP SP3 allocates them 
>>> sequentially in the
>>> range (1024,5000]. So it's probably not a problem for them.
>>> I think the thing to do is to check only for "reserved ports" that are
>>> actually used by tests here. These are in the range [64001,64005]. In
>>> getUnusedRandomPort(), it should only need to retry if the returned 
>>> port is
>>> within this narrow, reserved range. If it's anything else it should 
>>> be OK.
>> I'll try setting the range this narrow, but I don't know how many 
>> sequential
>> tests will be run at a time and I'm concerned 5 is too few. The 
>> -concurrency
>> option on jtreg allows you to specify how many concurrent tests will 
>> be run. We
>> should have enough test ports reserved to satisfy any concurrency 
>> request. I've
>> run the tests with -concurrency=8 (I have a dual-core system showing 
>> 4 CPU's).
>> I tried reducing the port range to 64001/64002 and concurrency=4 and 
>> all passed
>> fine, so maybe we're OK with just 5.
>>> On another topic, the three utility methods here:
>>> - createRegistryOnUnusedPort
>>> - getRegistryPort
>>> - getUnusedRandomPort
>>> all catch exceptions and then return illegal values (null or -1), 
>>> sometimes
>>> after printing some diagnostic information. The problem is that the 
>>> caller
>>> will attempt to soldier on with the illegal return value and will 
>>> stumble
>>> over something later, such as NullPointerException or
>>> IllegalArgumentException. This will probably be obvious but it's 
>>> equally
>>> likely to be confusing.
>>> Since these utilities are all called from test code, and the tests are
>>> relying on them to return valid results, I'd suggest just throwing 
>>> exceptions
>>> from the utility methods if they fail. This will (should) cause the 
>>> test to
>>> error out, but that's OK, as it never could have succeeded anyway if 
>>> the
>>> utility call had failed.
>> I already modified createRegistryOnUnusedPort to throw an exception 
>> as part of
>> the MultipleRegistries change. I'm now throwing a RuntimeException for
>> getRegistryPort and getUnusedRandomPort if they fail.
>> See updated webrev:
>> Darryl
>>> s'marks
>>> On 7/5/12 2:22 PM, Darryl Mocek wrote:
>>>> Hello core-libs. Please review this webrev to fix Bugs #7142596 and 
>>>> 7161503.
>>>> Webrev can be found here: 
>>>> This commit fixes concurrency issues with the RMI tests.
>>>> - Added TestLibrary.createRegistryOnUnusedPort method. This creates an
>>>> RMIRegistry on an unused port. It will try up to 10 times before 
>>>> giving up.
>>>> - Added a TestLibrary.getRegistryPort(Registry) method to get the 
>>>> port number
>>>> of the registry.
>>>> - Changed almost all tests from using hard port numbers to using 
>>>> random port
>>>> numbers for running the RMI Registry and RMID.
>>>> - Removed othervm from those tests which don't need it.
>>>> - Added parameters for tests which spawn a separate VM to pass RMI 
>>>> Registry and
>>>> RMID ports in cases where needed.
>>>> - Added PropertyPermission to security policy files where needed.
>>>> - Removed java/rmi and sun/rmi from tests which cannot be run 
>>>> concurrently.
>>>> - Added java/rmi/Naming to list of tests which cannot be run 
>>>> concurrently.
>>>> Thanks,
>>>> Darryl

More information about the core-libs-dev mailing list