Threading From Scratch (#3) - Futexes, mutexes, and memory sychronization

Syscalls, memory, and your first therad
The pointer to self and thread-local storage
Futexes, mutexes, and memory sychronization
Joining threads and dynamic initialization
Cancellation
Scheduling and task priority
RW Locks
Condition variables
Final thoughts

Intro

This part discusses an implementation of a mutex. It will not be a particularly efficient mutex, but it will be an understandable and straightforward one. We will not bother minimizing the number of system calls or implementing lock elision. We will also not handle the case of inter-process communication. Therefore, the process-shared and robust mutexes will not be discussed. If you are interested in these ideas, I recommend the kernel's documentation file on robust mutexes and Intel's blog post on lock elision. The scheduling related parameters will likely be dealt with on another occasion.

Futexes

POSIX defines a bunch of different mutexes. See the manpage for pthread_mutexattr_settype to learn more. On Linux, all of them are implemented using the same locking primitive - a Futex (Fast User-Space Mutex). It is a 4-byte long chunk of aligned memory. Its contents can be updated atomically, and its address can be used to refer to a kernel-level process queue. The kernel interface is defined as follows:

1asmlinkage long sys_futex(u32 __user *uaddr, int op, u32 val,
2                         struct timespec __user *utime, u32 __user *uaddr2,
3                         u32 val3);

We will only use the first four out of the six parameters here. The first one is the address of the futex, and the second one is the type of the operation to be performed. The meaning of the remaining parameters depends on the context. We will need only two of the available operations to implement a mutex:

FUTEX_WAIT puts a thread to sleep if the value passed in val is the same as the value stored in the memory pointed to by *uaddr. Optionally, the sleep time may be limited by passing a pointer to a timespec object. The return values are:
- 0, if the thread was woken up by FUTEX_WAIT.
- EWOULDBLOCK, if the value of the futex was different than val.
- EINTR, if the sleep was interrupted by a signal.
FUTEX_WAKE wakes the number of threads specified in val. In practice, it only makes sense to either wake one or all sleeping threads, so we pass either 1 or INT_MAX respectively.

See the original futex paper for more details.

Normal mutex

We start with a normal mutex because it is possible to implement all the other kinds using the procedures defined for it. If the value of the associated futex is 0, then the mutex is unlocked. Locking it means changing the value to 1. To avoid race conditions, both the checking and the changing need to be done atomically. GCC has a built-in function to do this that results with lock cmpxchgq or similar being emitted in assembly. If the locking fails, we need to wait until another thread releases the mutex and re-check the lock.

1static int lock_normal(tbthread_mutex_t *mutex)
2{
3  while(1) {
4    if(__sync_bool_compare_and_swap(&mutex->futex, 0, 1))
5      return 0;
6    SYSCALL3(__NR_futex, &mutex->futex, FUTEX_WAIT, 1);
7  }
8}

The logic of trylock is essentially the same, except that we return a failure instead of sleeping.

1static int trylock_normal(tbthread_mutex_t *mutex)
2{
3  if(__sync_bool_compare_and_swap(&mutex->futex, 0, 1))
4      return 0;
5  return -EBUSY;
6}

To unlock a mutex, we do the reverse. We change the futex value from 1 to 0 and wake one thread waiting for the futex to be released.

1static int unlock_normal(tbthread_mutex_t *mutex)
2{
3  if(__sync_bool_compare_and_swap(&mutex->futex, 1, 0))
4    SYSCALL3(__NR_futex, &mutex->futex, FUTEX_WAKE, 1);
5  return 0;
6}

Note that the values stored in the futex are application-specific and arbitrary. The kernel does not care and does not change this variable except in one case, which we will discuss in a later chapter.

This mutex is not particularly efficient because we make a system call while unlocking regardless of whether there is a waiting thread or not. To see how the situation could be improved, please refer to Ulrich Drepper's Futexes Are Tricky.

Error-check mutex

These guys are the same as the ones discussed earlier, except that they do additional bookkeeping. An error-check mutex remembers who owns it, to report the following types of errors:

re-locking of a mutex the thread already owns
unlocking a mutex owned by another thread
unlocking a mutex that is not locked

The behavior of normal mutexes in these cases is undefined.

 1static int lock_errorcheck(tbthread_mutex_t *mutex)
 2{
 3  tbthread_t self = tbthread_self();
 4  if(mutex->owner == self)
 5    return -EDEADLK;
 6  lock_normal(mutex);
 7  mutex->owner = self;
 8  return 0;
 9}
10
11static int trylock_errorcheck(tbthread_mutex_t *mutex)
12{
13  int ret = trylock_normal(mutex);
14  if(ret == 0)
15    mutex->owner = tbthread_self();
16  return ret;
17}
18
19static int unlock_errorcheck(tbthread_mutex_t *mutex)
20{
21  if(mutex->owner != tbthread_self() || mutex->futex == 0)
22    return -EPERM;
23  mutex->owner = 0;
24  unlock_normal(mutex);
25  return 0;
26}

Recursive mutex

Recursive mutexes may be locked multiple times by the same thread and require the same numbers of unlock operations to be released. To provide this kind of functionality, we just need to add a counter counter.

 1static int lock_recursive(tbthread_mutex_t *mutex)
 2{
 3  tbthread_t self = tbthread_self();
 4  if(mutex->owner != self) {
 5    lock_normal(mutex);
 6    mutex->owner   = self;
 7  }
 8  if(mutex->counter == (uint64_t)-1)
 9    return -EAGAIN;
10  ++mutex->counter;
11  return 0;
12}
13
14static int trylock_recursive(tbthread_mutex_t *mutex)
15{
16  tbthread_t self = tbthread_self();
17  if(mutex->owner != self && trylock_normal(mutex))
18    return -EBUSY;
19
20  if(mutex->owner != self) {
21    mutex->owner = self;
22    mutex->counter = 1;
23    return 0;
24  }
25
26  if(mutex->counter == (uint64_t)-1)
27    return -EAGAIN;
28
29  ++mutex->counter;
30  return 0;
31}
32
33static int unlock_recursive(tbthread_mutex_t *mutex)
34{
35  if(mutex->owner != tbthread_self())
36    return -EPERM;
37  --mutex->counter;
38  if(mutex->counter == 0) {
39    mutex->owner = 0;
40    return unlock_normal(mutex);
41  }
42  return 0;
43}

Other code

This is pretty much it. The remaining part of the code is not interesting. Both mutex and mutexattr objects need to be initialized to their default values, but the futexes don't need any initialization or cleanup. As always, the full patch is available on GitHub.

If you like this kind of content, you can subscribe to my newsletter, follow me on Twitter, or subscribe to my RSS channel.

2016-02-15

Table of Contents