Discussion:
Q: How __declspec(thread) consumes TLS indexes
Mladen Turk
2010-08-10 06:42:23 UTC
Hi,

According to MSDN, __declspec(thread) uses the Win32 TLS API
underneath, so it consumes at least one TlsIndex per process.

The question is whether the compiler generates one TlsIndex per
variable declaration, or whether it combines the variables in some
sort of .data section.

E.g.
__declspec(thread) int I;
__declspec(thread) int J;

How many TlsIndexes does that consume?
Since each Win32 process has a maximum of 1088 TlsIndexes,
if the answer is one per declaration, this can easily
lead to TLS_OUT_OF_INDEXES when the standard TlsAlloc is used.

If, OTOH, the compiler merges them into one data section, on what
basis is that section managed? Per process or per execution unit?

E.g.
DLL A
__declspec(thread) int I;
__declspec(thread) int J;

DLL B
__declspec(thread) int K;
__declspec(thread) int L;


If both are loaded in process P, how many TlsIndexes are
left for the user to consume?

Since there is no Win32 API that returns the
number of TlsIndexes available, a test program
to figure that out would require some sort of brute-force
deduction. However, I'm sure there is someone at Microsoft
who could answer this question in a microsecond :)

There is a hint in MSDN that the answer might be one:
"global variable space for a thread is allocated at
run time, ... application plus the requirements of
all DLLs that are statically linked".



Regards
--
^TM
Timothy Madden
2010-08-11 11:36:55 UTC
You can make a program that repeatedly calls TlsAlloc() until the
returned value is TLS_OUT_OF_INDEXES. Count the number of successful
TlsAlloc allocations and then write another program that declares that
many thread-local variables plus 1, and that then tries to TlsAlloc()
again.

I have a feeling that you will see TlsAlloc succeed just as many times
as in the first program, minus 1. Just a feeling.
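
The counting step described above could be sketched roughly as follows. This is only a minimal sketch and nobody in the thread has reported running it. TlsAlloc, TlsFree and TLS_OUT_OF_INDEXES are the real Win32 names; count_free_slots, win_alloc, win_free, MAX_TRACKED and the 8-slot fake_alloc stand-in are hypothetical names made up for illustration. The two __declspec(thread) variables are the ones from the question, so the printed count can be compared against a build without them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical cap on how many slots we track; chosen to sit
   comfortably above the 1088 limit cited in the question. */
#define MAX_TRACKED 4096

typedef unsigned long (*slot_alloc_fn)(void);
typedef void (*slot_free_fn)(unsigned long);

/* Allocate slots until the allocator reports failure, count the
   successes, then release every slot so the probe leaves nothing held. */
static unsigned count_free_slots(slot_alloc_fn alloc, slot_free_fn release,
                                 unsigned long failure)
{
    static unsigned long taken[MAX_TRACKED];
    unsigned n = 0, i;

    assert(alloc != NULL && release != NULL);

    while (n < MAX_TRACKED) {
        unsigned long idx = alloc();
        if (idx == failure)
            break;
        taken[n++] = idx;
    }
    for (i = 0; i < n; ++i)
        release(taken[i]);
    return n;
}

#ifdef _WIN32
#include <windows.h>

/* The thread-local variables from the question. */
__declspec(thread) int I;
__declspec(thread) int J;

static unsigned long win_alloc(void) { return TlsAlloc(); }
static void win_free(unsigned long idx) { TlsFree(idx); }

int main(void)
{
    I = 1;  /* touch the variables so they are really used */
    J = 2;
    printf("TlsAlloc succeeded %u times\n",
           count_free_slots(win_alloc, win_free, TLS_OUT_OF_INDEXES));
    return 0;
}
#else
/* Hypothetical stand-in allocator with 8 slots, present only so the
   counting loop can be exercised on a non-Windows machine. */
#define FAKE_FAILURE 0xFFFFFFFFul
static unsigned long fake_used;
static unsigned long fake_alloc(void)
{
    return fake_used < 8 ? fake_used++ : FAKE_FAILURE;
}
static void fake_free(unsigned long idx)
{
    (void)idx;
    if (fake_used > 0)
        --fake_used;
}
#endif
```

Build it once as shown and once with the two __declspec(thread) lines (and the assignments to I and J) removed, then compare the two counts. Whatever the runtime itself has already allocated at startup is included in both runs, so only the difference between them says anything about the declarations.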
f***@gmail.com
2016-04-05 22:45:43 UTC
Did you try out the test proposed by Timothy?
