Thread stack size issue related to glibc TLS bug
david.holmes at oracle.com
Thu May 30 05:39:19 UTC 2019
On 24/05/2019 8:13 pm, Florian Weimer wrote:
> * David Holmes:
>> My thoughts haven't really changed since 2015 - and sadly neither has
>> there been any change in glibc in that time. Nor, to my recollection,
>> have there been any other reported issues with this.
> The issue gets occasionally reported by people who use small stacks with
> large initial-exec TLS consumers (such as jemalloc). On the glibc side,
> we aren't entirely sure what to do about this. We have recently tweaked
> the stack size computation, so that in many cases, threads now receive
> an additional page. This was necessary to work around a kernel/hardware
> change where context switches started to push substantially more data on
> the stack than before, and minimal stack sizes did not work anymore on
> x86-64 (leading to ntpd crashing during startup, among other things).
> The main concern is that for workloads with carefully tuned stack sizes,
> revamping the stack size computation so that TLS is no longer
> effectively allocated on the stack might result in address space
> exhaustion. (This should only be a concern on 32-bit architectures.)
> Even if we changed this today (or had changed it in 2015), it would take
> a long time for the change to end up with end users, so it's unclear how
> much help it would be.
If it had been fixed in 2012 it wouldn't be an issue today. If it gets
fixed today then it may not be an issue in 2025. If it is not fixed then
it will always be an issue. Stealing the TLS space out of the stack
requested by the user is just not a reasonable thing to do IMHO.
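
To make the accounting concern concrete, here is a minimal sketch of a simplified model, not glibc's actual allocation code. It assumes the static (initial-exec) TLS block is carved out of the same region that the caller's stack size request covers, so only the remainder is available for stack frames. The function name and parameters are hypothetical.

```c
#include <stddef.h>

/* Simplified model (an assumption, not glibc's real logic): the static TLS
   block is taken out of the region sized by the caller's request, so the
   bytes left for actual stack frames shrink by that amount. */
size_t usable_stack_bytes(size_t requested, size_t static_tls)
{
    /* If TLS alone meets or exceeds the request, nothing is left. */
    if (static_tls >= requested)
        return 0;
    return requested - static_tls;
}
```

Under this model a 64 KiB request combined with 16 KiB of static TLS leaves only 48 KiB for the thread's own frames, which is why small stacks paired with large TLS consumers are the cases that fail.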
> Maybe OpenJDK can add a property specifying a stack size reserve, and
> this number is added to all stack size requests? This will at least
> allow users to work around the issue locally.
This would be a low-impact workaround, though as Jiangli points out it
is a bit hard on the end-user as they first have to hit the problem,
then recognize what it is, then realize there's a potential solution and
then determine the right magic number to use. Better than nothing but
Further follow up coming in response to your later email.
> If we change the accounting in glibc, we will have to add a similar
> tunable on the glibc side, too.
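
The proposed workaround, a configurable reserve added to every stack size request, could be sketched roughly as follows. This is an illustration only: the helper name, its parameters, and the decision to round up to a whole page are assumptions, not anything OpenJDK or glibc is known to implement.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add a user-configured reserve to a requested stack
   size and round the total up to a multiple of page_size.
   Returns 0 if page_size is 0 or the arithmetic would overflow. */
size_t stack_size_with_reserve(size_t requested, size_t reserve,
                               size_t page_size)
{
    if (page_size == 0 || requested > SIZE_MAX - reserve)
        return 0;

    size_t total = requested + reserve;
    size_t rem = total % page_size;
    if (rem == 0)
        return total;               /* already page-aligned */

    size_t pad = page_size - rem;   /* bytes needed to reach the next page */
    if (total > SIZE_MAX - pad)
        return 0;
    return total + pad;
}
```

A caller would pass the result to pthread_attr_setstacksize() in place of the raw request. The user still has to choose the reserve, which is the "magic number" problem noted above.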
More information about the core-libs-dev mailing list