RFR (L) JDK-8195100: Use a low latency hashtable for SymbolTable

Ioi Lam ioi.lam at oracle.com
Wed Aug 1 21:27:16 UTC 2018

Hi Gerard,

Have you tried running HelloWorld to get a sense of the start-up impact? 

     perf stat -r 100 ./old/bin/java -cp . HelloWorld > /dev/null
     perf stat -r 100 ./new/bin/java -cp . HelloWorld > /dev/null

I've looked at all the files except for the GC files. Here are my comments:


   87 // The symbol table holds all Symbol*s and corresponding interned 
   88 // Symbol*s and literal strings should be canonicalized.
   89 //
   90 // The interned strings are created lazily.
   91 //
   92 // %note:
   93 //  - symbolTableEntrys are allocated in blocks to reduce the 
space overhead.
   94 template <class T, class N> class CompactHashtable;
   95 class CompactSymbolTableWriter;
   96 class SerializeClosure;
   98 class SymbolTable;
   99 class SymbolTableConfig;
  100 typedef ConcurrentHashTable<Symbol*,
  101                               SymbolTableConfig, mtSymbol> 
  103 class SymbolTableCreateEntry;
  105 class SymbolTable : public CHeapObj<mtSymbol> {

Line 98 can be deleted.

Lines 87~93 seem to be out of date and should be deleted:
     - The fact that SymbolTable holds symbols should be obvious. Also, it
       no longer holds all the symbols -- there's also the shared symbol 
     - Interned strings are not stored here.
     - There's no definition of what "canonicalized" is.
     - symbolTableEntrys no longer exists.


  342   // Otherwise, add the symbol to table. Copy to a C string first.
  343   char stack_buf[ON_STACK_BUFFER_LENGTH];
  344   ResourceMark rm(THREAD);
  345   if (len <= ON_STACK_BUFFER_LENGTH) {
  346     buffer = stack_buf;
  347   } else {
  348     buffer = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD, char, len);
  349   }
  350   for (int i=0; i<len; i++) {
  351     buffer[i] = name[i];
  352   }
  353   // Make sure there is no safepoint in the code above since name 
can't move.
  354   // We can't include the code in NoSafepointVerifier because of the
  355   // ResourceMark.

The above is dead code and should be removed.

  429 // Suggestion: Push unicode-based lookup all the way into the hashing
  430 // and probing logic, so there is no need for convert_to_utf8 until
  431 // an actual new Symbol* is created.
  432 Symbol* SymbolTable::lookup_unicode(const jchar* name, int 
utf16_length, TRAPS) {

Do you have statistics of how often this function is called? I suspect it's
pretty low for common apps.

  311 Symbol* SymbolTable::lookup(const char* name, int len, TRAPS) {
  312   unsigned int hash = hash_symbol(name, len, SymbolTable::_alt_hash);
  313   Symbol* sym = SymbolTable::the_table()->lookup_common(name, len, 
  314   if (NULL == sym) {
  315     sym = SymbolTable::the_table()->do_add_if_needed((char*)name, 
len, hash, true, CHECK_NULL);
  316   }
  463 void SymbolTable::add(ClassLoaderData* loader_data, const 
constantPoolHandle& cp,
  464                       int names_count, const char** names, int* 
  465                       int* cp_indices, unsigned int* hashValues, 
  471     Symbol* sym = SymbolTable::the_table()->lookup_common(name, 
len, hash);
  472     if (NULL == sym) {
  473       sym = SymbolTable::the_table()->do_add_if_needed(name, len, 
hash, c_heap, CHECK);
  474     }

Is it possible to combine the lookup_common and do_add_if_needed into a 
single operation,
so you can avoid doing the search twice?

  647 // Sharing
  648 #if INCLUDE_CDS

647 can be deleted as it's superfluous.

- Ioi

On 7/30/18 9:19 AM, Gerard Ziemski wrote:
> Please review this Enhancement, which uses the new concurrent hash table for SymbolTable.
> This is an effort similar to the one behind JDK-8195097 "Make it possible to process StringTable outside safepoint” from a while ago.
> The main expected goal here is to eliminate safepoint pauses needed to cleanup the table. This goal was achieved by using “Service Thread” to do the cleaning. Checking whether we need to clean is performed on “VM Thread”, after class unloading (we check the entire table). We also check the bucket into which we happen to be inserting a new item.
> A few things to note:
> - The SymbolTable implementation follows closely that of StringTable - we might be able to factor out common code later
> - There are a few small, but statistically significant, regressions in startup benchmarks (around 2-5%) that will be addressed later, tracked by JDK-8208142
> - There is an outstanding question about whether we can safely walk the table during a safepoint using do_scan, but without locking, tracked by JDK-8208462
> - There is a cleanup opportunity presented now to remove rehashable hash table, tracked by JDK-8208519
> - There is a new test that validates that we free dead entries when we insert new symbols (using short lived symbols via Class.forName() API)
> Tested using Mach5 hs-tier1,2,3,4,5 (final test running right now…)
> Webrev: http://cr.openjdk.java.net/~gziemski/8195100_rev1/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8195100
> Cheers

More information about the hotspot-runtime-dev mailing list