On 11/05/2015 10:20 AM, Andrew Haley wrote:
> On 05/11/15 04:36, David Holmes wrote:
>> I still need some assistance from Aarch64 folk to write their get_thread 
>> function please!

Here you are.

I found a bug in the existing AArch64 get_thread.

The specification of MacroAssembler::get_thread() is that it clobbers
no registers other than its destination.  Most of the vector state is
call-clobbered, so if you want to call a get_thread() helper written
in C you have to save the entire vector state as well as all
call-clobbered general registers.  You have to do this because GCC may
use a vector register for temporary storage.  (This is not just a
theoretical possibility: I have seen AArch64 bugs caused by exactly
this.)  So I wrote code to save all the general and vector registers.
It was horrible.

I scrapped it and instead wrote a little assembly-language routine
which returns the contents of Thread::_thr_current.  This clobbers
only a couple of registers, and everything is much nicer.
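
For anyone not used to reading TLS descriptor sequences, the effect of
the helper in C++ terms is roughly the sketch below.  It is only an
illustration: the Thread struct here is a stand-in, not the HotSpot
class, and register_and_read() is a hypothetical helper that exists
just to show each OS thread seeing its own slot.

```cpp
#include <cassert>
#include <thread>

// Stand-in for HotSpot's Thread class (assumption: the real class has far
// more state; only the thread-local slot matters for this illustration).
struct Thread {
  static thread_local Thread* _thr_current;  // one slot per OS thread
  // What the assembly helper computes: load the calling thread's slot.
  static Thread* current() { return _thr_current; }
};

thread_local Thread* Thread::_thr_current = nullptr;

// Hypothetical helper: start a fresh OS thread, store `self` in that
// thread's slot, and report what current() returns there.
Thread* register_and_read(Thread* self) {
  Thread* seen = nullptr;
  std::thread t([&] {
    Thread::_thr_current = self;  // visible only to this OS thread
    seen = Thread::current();
  });
  t.join();
  return seen;
}
```

The compiler turns the read in current() into a TLS access much like
the hand-written one, but it is free to use any call-clobbered
register while doing so, which is why the patch does it by hand.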

I suppose it might have been possible to change the specification of
MacroAssembler::get_thread() so that it clobbered the vector state,
but never mind: it's done now.
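
To put a number on the difference, the register-set expressions in the
patch below are plain set arithmetic over a bit mask.  This is a
minimal sketch assuming a hypothetical MiniRegSet class; it is not
HotSpot's RegSet, and it models only the general registers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature register set: bit i stands for general register r<i>.
class MiniRegSet {
  uint32_t bits_;
  explicit constexpr MiniRegSet(uint32_t bits) : bits_(bits) {}
public:
  constexpr MiniRegSet() : bits_(0) {}
  static constexpr MiniRegSet of(int r) { return MiniRegSet(uint32_t{1} << r); }
  // Inclusive range r<first>..r<last>, with 0 <= first <= last <= 31.
  static constexpr MiniRegSet range(int first, int last) {
    return MiniRegSet((~uint32_t{0} >> (31 - last)) & (~uint32_t{0} << first));
  }
  constexpr MiniRegSet operator+(MiniRegSet o) const { return MiniRegSet(bits_ | o.bits_); }
  constexpr MiniRegSet operator-(MiniRegSet o) const { return MiniRegSet(bits_ & ~o.bits_); }
  constexpr bool contains(int r) const { return (bits_ >> r) & 1u; }
  // Number of registers in the set.
  int count() const {
    int n = 0;
    for (uint32_t b = bits_; b != 0; b &= b - 1) ++n;  // clear lowest set bit
    return n;
  }
};

constexpr int LR = 30;  // x30 is the link register on AArch64

// Old code: RegSet::range(r0, r20) + lr - dst
MiniRegSet old_saved(int dst) {
  return MiniRegSet::range(0, 20) + MiniRegSet::of(LR) - MiniRegSet::of(dst);
}

// New code: RegSet::range(r0, r1) + lr - dst
MiniRegSet new_saved(int dst) {
  return MiniRegSet::range(0, 1) + MiniRegSet::of(LR) - MiniRegSet::of(dst);
}
```

With dst == r0 the old expression pushes 21 general registers and the
new one pushes 2 (r1 and lr); neither figure includes the vector
state.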

diff --git a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
--- a/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
+++ b/src/cpu/aarch64/vm/macroAssembler_aarch64.cpp
@@ -4651,23 +4651,22 @@
       sub(result, result, len); // Return index where we stopped
-// get_thread can be called anywhere inside generated code so we need
-// to save whatever non-callee save context might get clobbered by the
-// call to Thread::current or, indeed, the call setup code. 
-// x86 appears to save C arg registers.
+// get_thread() can be called anywhere inside generated code so we
+// need to save whatever non-callee save context might get clobbered
+// by the call to JavaThread::aarch64_get_thread_helper() or, indeed,
+// the call setup code.
+// aarch64_get_thread_helper() clobbers only r0, r1, and flags.
 void MacroAssembler::get_thread(Register dst) {
-  // Save all call-clobbered regs except dst, plus r19 and r20.
-  RegSet saved_regs = RegSet::range(r0, r20) + lr - dst;
+  RegSet saved_regs = RegSet::range(r0, r1) + lr - dst;
   push(saved_regs, sp);
-  // FIX-ME: implement
-  // return Thread::current()
+  mov(lr, CAST_FROM_FN_PTR(address, JavaThread::aarch64_get_thread_helper));
+  blrt(lr, 1, 0, 1);
   if (dst != c_rarg0) {
     mov(dst, c_rarg0);
   }
-  // restore pushed registers
   pop(saved_regs, sp);
 }
diff --git a/src/os_cpu/linux_aarch64/vm/threadLS_linux_aarch64.s b/src/os_cpu/linux_aarch64/vm/threadLS_linux_aarch64.s
new file mode 100644
--- /dev/null
+++ b/src/os_cpu/linux_aarch64/vm/threadLS_linux_aarch64.s
@@ -0,0 +1,44 @@
+// Copyright (c) 2015, Red Hat Inc. All rights reserved.
+// This code is free software; you can redistribute it and/or modify it
+// under the terms of the GNU General Public License version 2 only, as
+// published by the Free Software Foundation.
+// This code is distributed in the hope that it will be useful, but WITHOUT
+// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+// FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+// version 2 for more details (a copy is included in the LICENSE file that
+// accompanied this code).
+// You should have received a copy of the GNU General Public License version
+// 2 along with this work; if not, write to the Free Software Foundation,
+// Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+// Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
+// or visit www.oracle.com if you need additional information or have any
+// questions.
+        // JavaThread::aarch64_get_thread_helper()
+        //
+        // Return the current thread pointer in x0.
+        // Clobber x1, flags.
+        // All other registers are preserved.
+	.global	_ZN10JavaThread25aarch64_get_thread_helperEv
+	.type	_ZN10JavaThread25aarch64_get_thread_helperEv, %function
+_ZN10JavaThread25aarch64_get_thread_helperEv:
+	stp x29, x30, [sp, -16]!
+	adrp x0, :tlsdesc:_ZN6Thread12_thr_currentE
+	ldr x1, [x0, #:tlsdesc_lo12:_ZN6Thread12_thr_currentE]
+	add x0, x0, :tlsdesc_lo12:_ZN6Thread12_thr_currentE
+	.tlsdesccall _ZN6Thread12_thr_currentE
+	blr x1
+	mrs x1, tpidr_el0
+	add x0, x1, x0
+	ldr x0, [x0]
+	ldp x29, x30, [sp], 16
+	ret
+	.size _ZN10JavaThread25aarch64_get_thread_helperEv, .-_ZN10JavaThread25aarch64_get_thread_helperEv
diff --git a/src/os_cpu/linux_aarch64/vm/thread_linux_aarch64.hpp b/src/os_cpu/linux_aarch64/vm/thread_linux_aarch64.hpp
--- a/src/os_cpu/linux_aarch64/vm/thread_linux_aarch64.hpp
+++ b/src/os_cpu/linux_aarch64/vm/thread_linux_aarch64.hpp
@@ -77,6 +77,8 @@
   bool pd_get_top_frame(frame* fr_addr, void* ucontext, bool isInJava);
+  static Thread *aarch64_get_thread_helper();
   // These routines are only used on cpu architectures that
   // have separate register stacks (Itanium).
   static bool register_stack_overflow() { return false; }

