Phantom of the Coroutine
Threads are heavyweight and have substance. With threads, we can get a reference to the current Thread object, examine its properties, modify its thread-local variables, and otherwise manipulate it. It is no surprise that people with a background in thread programming who come to coroutines look for some kind of Coroutine object they can get hold of. However, there is none. Coroutines are phantom, ephemeral, insubstantial. There are good reasons for this state of affairs and some non-trivial consequences. Let’s dig in.
The rule of coroutine transparency
Consider the following suspending function foo that performs some work, then writes the resulting data to a database and sends a message with it over a message bus (both operations use the network and are suspending, too).
suspend fun foo() {
    val data = doSomeWork()
    writeToDatabase(data)
    sendMessage(data)
}
We can speed up the foo function by calling writeToDatabase concurrently with the rest of the code that does sendMessage. It is straightforward: just delimit a scope for this concurrent operation and use the launch function:
suspend fun foo() = coroutineScope {
    val data = doSomeWork()
    launch { writeToDatabase(data) } // concurrent now
    sendMessage(data)
}
We don’t have to explicitly wait for the writeToDatabase operation to complete before returning from foo, because the coroutineScope builder does this wait automatically.
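To see the automatic wait in action, here is a small self-contained sketch. The delay-based bodies of writeToDatabase and sendMessage and the events list are stand-ins invented for illustration; only the shape of foo comes from the text above.

```kotlin
import kotlinx.coroutines.*

// Records the order in which things happen (illustrative only).
val events = mutableListOf<String>()

// Stand-in for a suspending database write; delay() simulates network latency.
suspend fun writeToDatabase(data: String) {
    delay(100)
    events += "written $data"
}

// Stand-in for a suspending message send, deliberately faster than the write.
suspend fun sendMessage(data: String) {
    delay(50)
    events += "sent $data"
}

suspend fun foo() = coroutineScope {
    val data = "data"
    launch { writeToDatabase(data) } // runs concurrently with sendMessage
    sendMessage(data)
    // coroutineScope does not return until the launched child has completed
}

fun main() = runBlocking {
    foo()
    events += "foo returned"
    println(events) // [sent data, written data, foo returned]
}
```

Even though sendMessage finishes first, "foo returned" is recorded only after the write has completed, so the caller of foo cannot observe a half-finished operation.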
The rule of coroutine transparency states that neither the caller of foo nor the function writeToDatabase should be aware of or be affected by this introduction of concurrency. Having a separate coroutine should be completely transparent. Stated more narrowly, replacing a direct call to writeToDatabase(data) with a call from another coroutine via coroutineScope { launch { writeToDatabase(data) } } should have as few noticeable effects as possible.
Of course, all software abstractions are leaky and we only have an illusion of true transparency here. If you look carefully, you’ll figure out that the coroutine context of the writeToDatabase function has got a new Job object, as shown in detail in the story on Coroutine Context and Scope, but that is all there is to it. In all the other respects, writeToDatabase executes in the same context. If it fails or gets canceled, then those failures and cancellations are handled so that it is impossible for writeToDatabase and for foo’s caller to notice that it was being called from another coroutine.
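The one visible difference, the new Job, can be observed with a short experiment. This is only a sketch: the snapshot helper and the "request-42" coroutine name are made up for the demonstration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: captures the Job and the CoroutineName of whoever calls it.
suspend fun snapshot(): Pair<Job?, CoroutineName?> {
    val context = currentCoroutineContext()
    return context[Job] to context[CoroutineName]
}

fun main() = runBlocking(CoroutineName("request-42")) {
    // Direct call: runs in the context of the runBlocking coroutine.
    val direct = snapshot()

    // Same call, but from a separate coroutine inside a delimited scope.
    var viaLaunch: Pair<Job?, CoroutineName?>? = null
    coroutineScope { launch { viaLaunch = snapshot() } }

    println(direct.second == viaLaunch?.second) // true: the name element is inherited
    println(direct.first === viaLaunch?.first)  // false: the launched coroutine has its own Job
}
```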
Corollary 1: Immutable context
The first corollary to the rule of coroutine transparency is that the coroutine context is immutable. The context has to be inherited by child coroutines for the purpose of transparency, and if it were mutable, just like a typical thread-local variable, then writeToDatabase mutating the context would race with sendMessage that is trying to read it. It means that programmers who are used to mutating thread-local variables for things like tracking security context, logging diagnostics, etc., will have to learn a way to do so without mutation.
Let’s take a look at logging, for example. You might have had code like this to set the logging context for the doSomeWork call:
MDC.put("login", "user") // updates thread-local behind the scenes
val data = doSomeWork()
// somehow restore the old MDC context after the call
You cannot translate this code to coroutines by mutating the coroutine context instead of the thread-local context. What to do? You cannot change the coroutine context, but you can create a different coroutine with a different context; coroutines are phantom:
// getCopyOfContextMap() may return null when no context has been set yet
val newContextMap = MDC.getCopyOfContextMap().orEmpty() + ("login" to "user")
val data = withContext(MDCContext(newContextMap)) {
    doSomeWork()
}
This is more verbose, but it also avoids the need to restore the context after the call to doSomeWork. If you happen to need it more than a couple of times in your code, you can always introduce your own withMDCContext helper function and write the resulting code like this:
val data = withMDCContext("login", "user") {
    doSomeWork()
}
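One way such a helper might look is sketched below. It assumes the kotlinx-coroutines-slf4j module (which provides MDCContext) and SLF4J are on the classpath; the signature of withMDCContext and the body of doSomeWork are illustrative choices, not something the text prescribes.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.slf4j.MDCContext
import kotlinx.coroutines.withContext
import org.slf4j.MDC

// Hypothetical helper: runs block with one extra MDC entry, leaving the caller's MDC untouched.
suspend fun <T> withMDCContext(
    key: String,
    value: String,
    block: suspend CoroutineScope.() -> T
): T {
    // Build a new map instead of mutating the current one; the copy may be null if nothing is set.
    val newContextMap = MDC.getCopyOfContextMap().orEmpty() + (key to value)
    return withContext(MDCContext(newContextMap), block)
}

// Stand-in for real work that reads the logging context.
suspend fun doSomeWork(): String = "login seen by doSomeWork: ${MDC.get("login")}"

fun main() = runBlocking {
    val data = withMDCContext("login", "user") { doSomeWork() }
    println(data)
    // What MDC.get returns depends on the logging backend bound to SLF4J;
    // with a real backend the entry is visible only inside the block.
    println("login after the call: ${MDC.get("login")}")
}
```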
Corollary 2: Non-reentrant mutex
JVM monitors behind synchronized are reentrant, and so is the ReentrantLock implementation in the JDK, as its name makes explicit. What does it mean for a lock to be reentrant? It means that the function foo can call doSomeWork while holding the lock:
fun foo() {
    lock.lock()
    try { doSomeWork() }
    finally { lock.unlock() }
}
At the same time, doSomeWork can take this lock without problems, too.
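As a runnable illustration, doSomeWork can follow the same lock/try/finally pattern as foo. The body of doSomeWork and the printed diagnostics below are assumptions added for the demonstration.

```kotlin
import java.util.concurrent.locks.ReentrantLock

val lock = ReentrantLock()

fun doSomeWork() {
    lock.lock() // second acquisition by the same thread: succeeds immediately
    try { println("hold count: ${lock.holdCount}") }
    finally { lock.unlock() }
}

fun foo() {
    lock.lock() // first acquisition
    try { doSomeWork() }
    finally { lock.unlock() }
}

fun main() {
    foo()                                          // prints "hold count: 2"
    println("locked after foo: ${lock.isLocked}")  // prints "locked after foo: false"
}
```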
So, it might come as a surprise that a similar Mutex primitive for coroutines is non-reentrant. If you try to rewrite the same code line-by-line to coroutines, it does not work. It hangs due to a deadlock.
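The hang can be reproduced with a line-by-line translation along these lines. The withTimeoutOrNull wrapper in main is an addition for the demonstration only, so that the program terminates instead of hanging forever; the function bodies are placeholders.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Mutex

val mutex = Mutex()

suspend fun doSomeWork() {
    mutex.lock() // suspends forever when called from foo: the mutex is already held
    try { /* do some work */ }
    finally { mutex.unlock() }
}

suspend fun foo() {
    mutex.lock()
    try { doSomeWork() }
    finally { mutex.unlock() }
}

fun main() = runBlocking {
    // Without the timeout this call would never return.
    val result = withTimeoutOrNull(500) { foo(); "completed" }
    println(result ?: "foo did not complete: the nested lock() never returned")
}
```

Mutex has no notion of "the same coroutine", so the nested lock() waits for an unlock() that can only happen after it returns.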
How come? To understand that, think about an implementation of a reentrant lock or take a look at its source. It is reentrant because it internally stores a reference to the owner thread and lets that same thread acquire the lock again without blocking. But this cannot be done with coroutines. There’s no way to store a reference to the current coroutine, because of the rule of coroutine transparency.
So, is there any way to work around it and still have a reentrant mutex with coroutines? Absolutely! Use similar thinking to the previous section. Instead of trying to effectively mutate the current context with two separate lock and unlock functions, define a higher-order withReentrantLock function that creates a new coroutine with a custom context element that marks this mutex as acquired in the context, so that it can skip acquiring the same mutex on the second attempt from the same context. Now it works!
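A minimal sketch of this idea follows. The names ReentrantMutexContextKey and ReentrantMutexContextElement are hypothetical, and the approach is one possible reading of the description above rather than a definitive implementation.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
import kotlin.coroutines.CoroutineContext

// Hypothetical key: one distinct context key per mutex (data class equality compares the mutex).
data class ReentrantMutexContextKey(val mutex: Mutex) :
    CoroutineContext.Key<ReentrantMutexContextElement>

// Hypothetical marker element: its presence in the context means "this mutex is already held".
class ReentrantMutexContextElement(
    override val key: ReentrantMutexContextKey
) : CoroutineContext.Element

suspend fun <T> Mutex.withReentrantLock(block: suspend () -> T): T {
    val key = ReentrantMutexContextKey(this)
    // The marker is present: the mutex is held further up the call chain, so just run the block.
    if (currentCoroutineContext()[key] != null) return block()
    // Otherwise run the block in a new context that carries the marker, holding the mutex.
    return withContext(ReentrantMutexContextElement(key)) {
        withLock { block() }
    }
}

val mutex = Mutex()

suspend fun doSomeWork(): String = mutex.withReentrantLock { "work done" }

suspend fun foo(): String = mutex.withReentrantLock { doSomeWork() }

fun main() = runBlocking {
    println(foo()) // prints "work done"
}
```

One caveat of this sketch: coroutines launched inside the block inherit the marker element, so they would also skip locking. Treat it as an illustration of the context-based approach, not as a drop-in lock.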
Conclusion
Coroutines are not just lightweight threads; coroutines are phantom. You can transparently move a piece of work from one coroutine to another, but with this power comes a responsibility to think about coroutines differently from how you used to think about threads.