Opened 2 years ago

Last modified 21 months ago

#17896 new bug

static (hidden) thread_local values do not work in some cases

Reported by: waddlesplash Owned by: nobody
Priority: normal Milestone: Unscheduled
Component: System/runtime_loader Version: R1/beta3
Keywords: Cc: korli
Blocked By: Blocking:
Platform: All

Description (last modified by waddlesplash)

gnutls makes use of this, and it had to be disabled to fix crashes: https://github.com/haikuports/haikuports/commit/cda76a62712b91647892ba01f1a5d0dcb38cadc5

Attachments (2)

static-vs-non.diff (30.8 KB ) - added by waddlesplash 2 years ago.
test-static-thread_local.c (724 bytes ) - added by waddlesplash 21 months ago.

Download all attachments as: .zip

Change History (23)

comment:1 by waddlesplash, 2 years ago

Cc: korli added

CC korli, perhaps you have some idea about what to do here?

comment:2 by waddlesplash, 2 years ago

Hmm, I guess this counts as "static TLS", which would be #13294. But we should get an error and the program shouldn't load at all in that case...

comment:3 by waddlesplash, 2 years ago

Ah, right, since this is a shared object, it does not count as "static TLS", I don't think, so this should in theory work. But it appears at least in this test application that putting the "_Thread_local" variable outside the function and deleting the "static" doesn't make it work either, in fact in that case I get all 5 having the same pointer value!

comment:4 by waddlesplash, 2 years ago

A bit more testing shows that tls_get_addr in libroot is indeed returning bad values:

__tls_get_addr: thread 2224, module 0, offset (nil) -> 0x48bc539990
__tls_get_addr: thread 2221, module 0, offset (nil) -> 0x48bc5398f0
__tls_get_addr: thread 2225, module 0, offset (nil) -> 0x48bc5398f0
__tls_get_addr: thread 2222, module 0, offset (nil) -> 0x48bc5398f0
__tls_get_addr: thread 2223, module 0, offset (nil) -> 0x48bc5398f0

That pushes the problem up into runtime_loader indeed.

Version 0, edited 2 years ago by waddlesplash (next)

comment:5 by waddlesplash, 2 years ago

This does not seem to be a race condition. Adding snooze(100000) into both for loops actually has the effect that we always seem to get the same pointer for all threads!

comment:6 by waddlesplash, 2 years ago

Got it: the TLS_DYNAMIC_THREAD_VECTOR has different pointers for each thread (good, that would mean base TLS is broken otherwise) but they all point to the same place. Now how did that happen...

comment:7 by waddlesplash, 2 years ago

Ah, so what is happening is that the threads are all exiting before the next one even starts, and so the TLS memory gets freed when the thread exits and then reused when the next thread starts. Adding snooze(10000); to the end of the thread function and removing the snoozes from elsewhere shows all variables have different values indeed.

I then readded the static and things still worked just fine. So I guess I am not sure what the problem with gnutls is that it works when the thread-local variables are non-static...

comment:8 by waddlesplash, 2 years ago

<X512> I found that after writing to TLS variable it read as zero.
<X512> Removing static fixed that.
<X512> I tried to make mores simple case, but it worked fine...
<X512> Write value to static TLS variable and immediately after read it. It will become zero.
<X512> I suspect that GCC use some optimization for static TLS variable that may use TLS model unsupported in Haiku.

comment:9 by waddlesplash, 2 years ago

More likely it seems to me is that somehow the generation-accounting logic in runtime_loader's elf_tls.cpp is not correct, and blocks are getting spuriously deleted and recreated. Comparing it to FreeBSD I note a bunch of differences, but based on commit messages some of these may be intentional.

by waddlesplash, 2 years ago

Attachment: static-vs-non.diff added

comment:10 by waddlesplash, 2 years ago

Here's a difference of the gcc -S output when compiling random.c from GNUTLS with static vs no static on the variable declarations. Note the .value 0x6666: what's that about?

comment:11 by waddlesplash, 2 years ago

I also did find what I am pretty sure is an issue in the TLS DTV handling, but I think it's not necessarily related to this: https://review.haiku-os.org/c/haiku/+/5604

comment:12 by waddlesplash, 2 years ago

Description: modified (diff)
Summary: static (hidden) thread_local values do not actually get unique pointersstatic (hidden) thread_local values do not work in some cases

comment:13 by jessicah, 2 years ago

This actually sounds like our gcc config is broken in some way.

For example, in https://github.com/haiku/buildtools/blob/master/binutils/bfd/elf64-x86-64.c, searching for 0x6666 shows several comments where the code is meant to be transformed from the 0x6666 into proper TLS calls.

The fact that 0x6666 remaining in the generated output indicates mis-compilation, and perhaps that at some point, gcc thinks the static TLS model is available, and then this code fails to do the transform into static TLS.

It definitely needs deeper investigation, but it certainly looks like the fault is not in runtime_loader, unless we actually go and implement static TLS.

comment:14 by waddlesplash, 2 years ago

Here's the full command invocation being used to compile the file:

gcc -DHAVE_CONFIG_H -I. -I.. -DLOCALEDIR=\"/packages/gnutls-3.7.4-1/.self/data/locale\" -I./../gl -I./../gl -I./includes -I./x509 -I./includes -I./includes -I./x509 -I/packages/libtasn1-4.18.0-1/.self/develop/headers -I/packages/p11_kit-0.23.18.1-2/.self/develop/headers/p11-kit-1 -Wtype-limits -fanalyzer -fno-common -Wall -Wbad-function-cast -Wcast-align=strict -Wdate-time -Wdisabled-optimization -Wdouble-promotion -Wduplicated-branches -Wduplicated-cond -Wextra -Winit-self -Winvalid-pch -Wlogical-op -Wmissing-declarations -Wmissing-include-dirs -Wmissing-prototypes -Wnested-externs -Wnull-dereference -Wold-style-definition -Wopenmp-simd -Wpacked -Wpointer-arith -Wshadow -Wstrict-prototypes -Wsuggest-attribute=cold -Wsuggest-attribute=format -Wsuggest-attribute=malloc -Wsuggest-final-methods -Wsuggest-final-types -Wsync-nand -Wtrampolines -Wuninitialized -Wunknown-pragmas -Wunused-macros -Wvariadic-macros -Wvector-operation-performance -Wwrite-strings -Warray-bounds=2 -Wattribute-alias=2 -Wformat-overflow=2 -Wformat=2 -Wformat-truncation=2 -Wimplicit-fallthrough=5 -Wshift-overflow=2 -Wunused-const-variable=2 -Wvla-larger-than=4031 -Wno-analyzer-double-free -Wno-analyzer-malloc-leak -Wno-analyzer-null-dereference -Wno-analyzer-use-after-free -Wno-missing-field-initializers -Wno-unused-parameter -Wno-format-truncation -Wimplicit-fallthrough=2 -Wabi=11 -fdiagnostics-show-option -fno-builtin-strcmp -I/packages/nettle-3.7.3-1/.self/develop/headers -I/packages/libtasn1-4.18.0-1/.self/develop/headers -I/packages/libidn2-2.0.5-2/.self/develop/headers -I/packages/p11_kit-0.23.18.1-2/.self/develop/headers/p11-kit-1 -g -O2 -MT random.lo -MD -MP -MF .Tpo -c random.c  -DPIC -o .libs/random.o

All warnings, includes, defines, etc. excluded, there is only:

-fanalyzer -fno-common -fdiagnostics-show-option -fno-builtin-strcmp -g -O2

None of which should actually affect TLS behavior to my knowledge.

comment:15 by waddlesplash, 2 years ago

The fact that 0x6666 remaining in the generated output indicates mis-compilation

I don't think so, I just tried compiling this same testcase on Linux and I see the 0x6666 in the assembly output.

comment:16 by waddlesplash, 2 years ago

jessicah found this document which explains the different kinds of thread-local storage: https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter8-20.html

<jessicah> can probably identify the TLS accesses being used from the disassembly
<jessicah> from what I can tell, all accesses were LD changed to GD
<jessicah> so that'd imply that our TLS accesses, which I guess runtime_loader is responsible for, are wrong

comment:18 by waddlesplash, 23 months ago

I learned from someone trying to port an application to Haiku that their TLS variable read as 0 after fork. I haven't yet investigated that myself, but it sounds like a place we haven't looked yet.

comment:19 by waddlesplash, 23 months ago

From JustineTunney, this is the workaround added:

extern _Thread_local struct Machine *g_machine;
  pid = fork();
#ifdef __HAIKU__
  g_machine = m;
#endif

I am told that g_machine was only 0 in the child process; it wasn't unset in the parent.

by waddlesplash, 21 months ago

Attachment: test-static-thread_local.c added

comment:20 by waddlesplash, 21 months ago

Added a failing test which exhibits the fork() problem.

comment:21 by waddlesplash, 21 months ago

Testcase fixed in hrev56874. I suppose we should now retest GNUTLS.

Note: See TracTickets for help on using tickets.