How does InheritableThreadLocal work?
Creating a Java thread means instantiating the class Thread. Every Thread instance has a field called “threadLocals”, of type ThreadLocal.ThreadLocalMap, which stores the values of all ThreadLocal variables used by that Thread object.
Every time a ThreadLocal variable is read or written, it internally gets or sets its real value in the “threadLocals” map of the current Thread object. That is why a ThreadLocal variable always manages the value that belongs to the current thread context without interfering with other threads.
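As a minimal sketch of this isolation (the class name, the variable NAME and the helper method are made up for this illustration), a second thread that reads the same ThreadLocal does not see the value set by the first thread:

```java
import java.util.concurrent.atomic.AtomicReference;

final class ThreadLocalIsolationDemo {
    // Hypothetical variable, used only for this illustration.
    private static final ThreadLocal<String> NAME = new ThreadLocal<>();

    /** Returns what a second thread reads after the calling thread has set a value. */
    static String valueSeenByOtherThread() {
        NAME.set("main-value"); // stored in the calling thread's own map
        AtomicReference<String> seen = new AtomicReference<>("unset");
        Thread other = new Thread(() -> seen.set(NAME.get()));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // The other thread has its own, empty map, so it read null.
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("other thread sees: " + valueSeenByOtherThread());
        System.out.println("this thread sees:  " + NAME.get());
    }
}
```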
Besides “threadLocals”, a Thread instance has another field of the same type called “inheritableThreadLocals”. It manages thread-scoped values in the same way as “threadLocals” does, with two differences: it only stores the values of InheritableThreadLocal variables, and the Thread constructor checks whether the parent thread (the thread that creates the new one) has an “inheritableThreadLocals” map and, if so, copies it into the child thread object.
The copy walks the “table” array inside the parent’s “inheritableThreadLocals” map entry by entry. For each entry, the method “childValue” of the InheritableThreadLocal that owns the entry is called with the parent’s value, and the result is stored in the child’s map. The method “childValue” is defined in the class InheritableThreadLocal and can be overridden.
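The sketch below illustrates that the copy happens when the child Thread object is constructed, not later. The class name and the variable TENANT are hypothetical. The parent rebinds the variable after constructing the child, and the child still sees the value that was current at construction time:

```java
import java.util.concurrent.atomic.AtomicReference;

final class InheritanceAtConstructionDemo {
    // Hypothetical variable, used only for this illustration.
    private static final InheritableThreadLocal<String> TENANT = new InheritableThreadLocal<>();

    /** Returns the value the child thread reads from the inherited variable. */
    static String valueSeenByChild() {
        TENANT.set("before");
        AtomicReference<String> seen = new AtomicReference<>();
        // The parent's map is copied here, inside the Thread constructor.
        Thread child = new Thread(() -> seen.set(TENANT.get()));
        // Rebinding in the parent afterwards does not touch the child's map.
        TENANT.set("after");
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("child sees:  " + valueSeenByChild()); // before
        System.out.println("parent sees: " + TENANT.get());       // after
    }
}
```

Note that this only covers rebinding with set(); mutating a shared value object is a different matter.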
Leak Issue Of InheritableThreadLocal
The method “childValue” is called to compute the child’s initial value from the parent’s value for each InheritableThreadLocal variable. The default implementation simply returns the parent’s value, which amounts to a shallow copy, or in other words, a copy of the reference. That means that, unless the value is immutable, the same object is shared by reference between the parent thread and its child threads.
Leak here does not mean memory leak. Each entry in the “table” field of a thread’s “inheritableThreadLocals” or “threadLocals” map is a WeakReference to its ThreadLocal key, so the map does not stop the JVM garbage collector from reclaiming a ThreadLocal object that is no longer in use. Leak here means scope leak and security leak.
First, let’s look at the so-called scope leak, or more accurately, thread-local variable scope leak. The real intention of InheritableThreadLocal, according to its Javadoc, is that “when a child thread is created, the child receives initial values for all inheritable thread-local variables for which the parent has values”. In other words, although the values of the thread-local variables in the child thread are set up from the parent thread, this should happen only at the “initial” time. After that, the variables should still be thread-local in scope and no longer tied to the parent thread. However, since the default “childValue” of the class InheritableThreadLocal does a shallow copy, the same object is referenced by both the child and the parent thread. Changes made to that object in the parent are visible in the child for as long as both threads live, and vice versa. The shallow copy widens the variable’s scope from thread-local to shared across parent and child threads.
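A minimal sketch of the scope leak, using a hypothetical variable ITEMS that holds a mutable list: the child adds an element through its own inherited variable, and the parent sees the change because both threads hold the same list object.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedReferenceDemo {
    // Hypothetical variable, used only for this illustration.
    private static final InheritableThreadLocal<List<String>> ITEMS =
            new InheritableThreadLocal<>();

    /** Returns the parent's list after a child thread has modified its inherited value. */
    static List<String> parentViewAfterChildMutation() {
        List<String> parentList = new ArrayList<>();
        parentList.add("from-parent");
        ITEMS.set(parentList);

        // Default childValue returns the same reference, so the child mutates parentList.
        Thread child = new Thread(() -> ITEMS.get().add("from-child"));
        child.start();
        try {
            child.join(); // join also makes the child's write visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ITEMS.get();
    }

    public static void main(String[] args) {
        System.out.println(parentViewAfterChildMutation()); // [from-parent, from-child]
    }
}
```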
This sometimes causes severe issues and can completely break the business logic. Say we have a complicated banking system in which we create a child thread to process heavy financial data. For convenience, the financial data is defined as an InheritableThreadLocal so that it is inherited from the parent to the child thread automatically. That saves developers the effort of copying the data from parent to child and simplifies the thread logic considerably. However, if for whatever reason the parent thread keeps changing the inherited financial data, whatever the child thread has changed will be overwritten. Most of the time developers do not even know how that happened and have to spend a huge effort investigating the issue, and it introduces very severe inconsistencies into the banking system.
The second leak is what I call a “security leak”: sensitive information of one thread gets exposed to another thread that runs under a different security context. Let’s use an example to describe the issue. We set up a web application that consumes HTTP requests and returns a response after some processing. Usually each request is associated with a user and is assigned to a Java thread by the web application, and that thread is populated with the user’s security context. Most HTTP servers provide thread pool support, so a thread is reused for a request from a different user once it has finished with the previous request. The security context is updated at that point, so the security contexts of different user requests do not get mixed up.
But what if the request thread creates a child thread to handle some asynchronous logic, the parent thread finishes serving the request, and the child thread keeps running? A later request is then sent to the server to check the status of the asynchronous processing, which is a very standard asynchronous processing model. Here comes the issue: when the original parent thread is reused by a new request that belongs to another user, the authentication value is updated, but the reference to the security context object is shared by default between the parent thread and its still-running child thread. If the context object is updated in place, the child thread can see the security context of the other user.
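The scenario can be reproduced in a few lines under some loud assumptions: the class SecurityContext, the variable CONTEXT and the user names are hypothetical, a single-thread executor stands in for the HTTP server’s thread pool, and the “server” updates the context object in place instead of replacing it. It is a sketch of the failure mode, not of any particular server.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

final class SecurityContextLeakDemo {
    /** Hypothetical mutable security context that is updated in place per request. */
    static final class SecurityContext {
        volatile String user;
    }

    private static final InheritableThreadLocal<SecurityContext> CONTEXT =
            new InheritableThreadLocal<SecurityContext>() {
                @Override
                protected SecurityContext initialValue() {
                    return new SecurityContext();
                }
            };

    /** Returns the user that the child of the first request sees after the second request. */
    static String userSeenByChildOfFirstRequest() {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // one reused worker
        CountDownLatch secondRequestDone = new CountDownLatch(1);
        AtomicReference<String> seen = new AtomicReference<>();
        AtomicReference<Thread> childRef = new AtomicReference<>();
        try {
            // Request 1 (alice): start an asynchronous child, then free the worker.
            pool.submit(() -> {
                CONTEXT.get().user = "alice";
                Thread child = new Thread(() -> {
                    try {
                        secondRequestDone.await(); // still running after request 1 ended
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    seen.set(CONTEXT.get().user); // reads the shared context object
                });
                childRef.set(child);
                child.start();
            }).get();

            // Request 2 (bob): same worker thread, context object updated in place.
            pool.submit(() -> { CONTEXT.get().user = "bob"; }).get();

            secondRequestDone.countDown();
            childRef.get().join();
            return seen.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // The child was started on behalf of alice but observes bob.
        System.out.println("child sees user: " + userSeenByChildOfFirstRequest());
    }
}
```

If the second request called CONTEXT.set(...) with a fresh object instead, the child would keep seeing the old object, so the exposure depends on how the context is updated.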
How To Fix The Leak Issue
Do Not Use InheritableThreadLocal
I do not think the class InheritableThreadLocal is a must-have feature in Java. It serves only rare scenarios and offers a very limited benefit, while exposing your system to a potentially severe leak loophole. It is not worth it. So my first advice for avoiding the leak issue of InheritableThreadLocal is: “do not touch it unless it is necessary”.
For most scenarios in which we want data to flow from the parent thread to the child thread automatically, you can find a replacement for InheritableThreadLocal, both in software design and in code. A child thread is almost always used as a separate running task, either asynchronous or parallel, and the Java world offers many safe data processing models for asynchronous and parallel programming. InheritableThreadLocal should be at the bottom of the list of options you consider.
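One such replacement, sketched here with hypothetical names, is to hand the data to the task explicitly: take a defensive, read-only snapshot on the submitting thread and let the task capture it. No thread-local variable is involved, so nothing is shared implicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ExplicitHandOffDemo {
    /** Sums the amounts on a worker thread, using a private snapshot of the input. */
    static int sumInWorker(List<Integer> amounts) {
        // Copy defensively so later changes by the caller cannot reach the task.
        List<Integer> snapshot = Collections.unmodifiableList(new ArrayList<>(amounts));
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Integer> result = pool.submit(() -> {
                int sum = 0;
                for (int amount : snapshot) {
                    sum += amount;
                }
                return sum;
            });
            return result.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumInWorker(Arrays.asList(1, 2, 3))); // 6
    }
}
```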
Deep Copy On InheritableThreadLocal
If you have to use InheritableThreadLocal, or you really like it, then use it safely. The key thing to keep in mind is to deep copy the values of InheritableThreadLocal variables.
There are several ways to deep copy InheritableThreadLocal values. The simplest is to extend the class InheritableThreadLocal and override the method “childValue” so that it returns a deep copy of the value from the parent thread.
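A minimal sketch of this approach, again with a hypothetical variable ITEMS: “childValue” returns a new list, so the child’s change no longer reaches the parent. Copying the list is enough here only because its elements are immutable strings; values with nested mutable state would need their elements copied as well.

```java
import java.util.ArrayList;
import java.util.List;

final class DeepCopyChildValueDemo {
    // Hypothetical variable whose childValue copies instead of sharing.
    private static final InheritableThreadLocal<List<String>> ITEMS =
            new InheritableThreadLocal<List<String>>() {
                @Override
                protected List<String> childValue(List<String> parentValue) {
                    // Strings are immutable, so copying the list is a full copy here.
                    return parentValue == null ? null : new ArrayList<>(parentValue);
                }
            };

    /** Returns the parent's list after a child thread has modified its own copy. */
    static List<String> parentViewAfterChildMutation() {
        List<String> parentList = new ArrayList<>();
        parentList.add("from-parent");
        ITEMS.set(parentList);

        Thread child = new Thread(() -> ITEMS.get().add("from-child"));
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ITEMS.get();
    }

    public static void main(String[] args) {
        System.out.println(parentViewAfterChildMutation()); // [from-parent]
    }
}
```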
If you do not want to extend the InheritableThreadLocal class, you can hook the deep copy in at the point where the child thread’s logic starts executing. This matters especially when a thread pool is involved, because pooled threads are reused. With your own hook functions you control what happens when a child thread is created and when it is reused, and that makes it convenient to inject the deep copy logic.
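One possible shape for such a hook is a task wrapper. The sketch below is an assumption about how it could look, not an established API: the helper “withCopiedContext” and the variable ITEMS are made up, and a plain ThreadLocal is used in place of InheritableThreadLocal. The wrapper copies the caller’s value at submit time, installs the copy on the pooled thread just before the task runs, and restores the previous state afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ContextCopyingTaskDemo {
    // Plain ThreadLocal: the hook below does the propagation explicitly.
    private static final ThreadLocal<List<String>> ITEMS = new ThreadLocal<>();

    /** Hypothetical hook: snapshots the caller's value now, installs it around the task. */
    static Runnable withCopiedContext(Runnable task) {
        List<String> current = ITEMS.get();
        List<String> copy = current == null ? null : new ArrayList<>(current);
        return () -> {
            List<String> previous = ITEMS.get(); // whatever the pooled thread held before
            ITEMS.set(copy);
            try {
                task.run();
            } finally {
                // Clean up so the next task on this pooled thread starts fresh.
                if (previous == null) {
                    ITEMS.remove();
                } else {
                    ITEMS.set(previous);
                }
            }
        };
    }

    /** Returns the caller's list after a pooled task has modified its own copy. */
    static List<String> callerViewAfterPooledTask() {
        List<String> callerList = new ArrayList<>();
        callerList.add("from-caller");
        ITEMS.set(callerList);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(withCopiedContext(() -> ITEMS.get().add("from-task"))).get();
            return ITEMS.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(callerViewAfterPooledTask()); // [from-caller]
    }
}
```

Because the copy is taken per submitted task rather than per created thread, this also works when the pool reuses a worker that was created long before the caller set its value.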
In The End
I am sure there are other, better ways to avoid and fix the leak issues of InheritableThreadLocal, especially in more specific use scenarios. The discussion and solutions above are fairly generic. Once we are clear about the root cause of the InheritableThreadLocal leak issues, we can pick whatever solution suits us to better protect or improve our software system.