Table of Contents

  1. Syscalls, memory, and your first thread
  2. The pointer to self and thread-local storage
  3. Futexes, mutexes, and memory synchronization
  4. Joining threads and dynamic initialization
  5. Cancellation
  6. Scheduling and task priority
  7. RW Locks
  8. Condition variables
  9. Final thoughts

Recycling the thread descriptors

How do we know when the thread task has died? This is what the CLONE_CHILD_SETTID and CLONE_CHILD_CLEARTID flags to sys_clone are for. With CLONE_CHILD_SETTID set, the kernel stores the new thread's TID at the location pointed to by the ctid argument (see tb #1). With CLONE_CHILD_CLEARTID set, the kernel writes 0 to that location when the thread terminates and wakes the futex there. It is a convenient way to wait for a thread to finish.

Unfortunately, as far as I can tell, there is no way to unset these flags, which makes implementing tbthread_detach a pain. We can no longer delete the thread descriptor from within the thread it refers to: doing so would cause the kernel to write to a memory location that might have been unmapped or reused. Therefore, we need a cache of thread descriptors, and we must reuse a descriptor only after the thread it previously referred to has exited. Thread Bites uses two linked lists to maintain this cache, and the descriptor allocation function calls the following procedure to wait until the corresponding task is gone:

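static void wait_for_thread(tbthread_t thread)
{
  /* exit_futex holds the TID; the kernel zeroes it when the task exits. */
  uint32_t tid = thread->exit_futex;
  long ret = 0;
  if(tid != 0)
    do {
      /* Sleep for as long as the futex word still contains the TID. */
      ret = SYSCALL3(__NR_futex, &thread->exit_futex, FUTEX_WAIT, tid);
    } while(ret != -EWOULDBLOCK && ret != 0);
}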

See the full patch at GitHub.
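
The two-list cache described above can be sketched roughly as follows. This is not the code from the patch: struct desc, desc_get, desc_release, and wait_until_dead are hypothetical names, the descriptor is reduced to the two fields the cache needs, and the futex call goes through glibc's syscall() wrapper instead of the SYSCALL3 macro. The sketch is also single-threaded; a real cache would need a lock around the lists.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* Hypothetical, trimmed-down descriptor: only what the cache needs. */
struct desc {
  struct desc *next;
  uint32_t exit_futex;   /* holds the TID; the kernel zeroes it on exit */
};

static struct desc *used_list;  /* descriptors currently handed out */
static struct desc *free_list;  /* released; the task may still be alive */

/* Block until the kernel has cleared the TID stored in the descriptor. */
static void wait_until_dead(struct desc *d)
{
  uint32_t tid;
  while((tid = __atomic_load_n(&d->exit_futex, __ATOMIC_ACQUIRE)) != 0)
    syscall(SYS_futex, &d->exit_futex, FUTEX_WAIT, tid, NULL, NULL, 0);
}

static void list_push(struct desc **head, struct desc *d)
{
  d->next = *head;
  *head = d;
}

static void list_remove(struct desc **head, struct desc *d)
{
  for(; *head; head = &(*head)->next)
    if(*head == d) {
      *head = d->next;
      d->next = NULL;
      return;
    }
}

/* Reuse a released descriptor if there is one, otherwise allocate. */
static struct desc *desc_get(void)
{
  struct desc *d = free_list;
  if(d) {
    free_list = d->next;
    wait_until_dead(d);  /* never reuse memory the kernel may still write to */
  } else {
    d = calloc(1, sizeof(*d));
    if(!d)
      return NULL;
  }
  list_push(&used_list, d);
  return d;
}

/* Move a descriptor from the used list to the free list. */
static void desc_release(struct desc *d)
{
  assert(d != NULL);
  list_remove(&used_list, d);
  list_push(&free_list, d);
}
```

The point of the design is that desc_release never frees memory: a descriptor only moves between lists, so the address the kernel was given at clone time stays valid until the task is really gone.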

Joining threads

Joining complicates things a bit further because it does quite a bit of error checking to prevent deadlocks and similar misuse. To perform these checks, the thread calling tbthread_join needs to have a valid thread descriptor obtainable by tbthread_self. The problem is that we have never set this thread descriptor up for the main thread, so we need to do it by hand at the beginning of the program. The original state needs to be restored at the end because glibc uses it internally, and not cleaning things up causes segfaults.

static void *glibc_thread_desc;
void tbthread_init()
{
  /* Remember glibc's descriptor so that it can be restored on exit. */
  glibc_thread_desc = tbthread_self();
  tbthread_t thread = malloc(sizeof(struct tbthread));
  memset(thread, 0, sizeof(struct tbthread));
  thread->self = thread;
  /* Point the FS register at our descriptor for the main thread. */
  SYSCALL2(__NR_arch_prctl, ARCH_SET_FS, thread);
}

void tbthread_finit()
{
  free(tbthread_self());
  /* Hand the original descriptor back to glibc. */
  SYSCALL2(__NR_arch_prctl, ARCH_SET_FS, glibc_thread_desc);
}

After performing all the validity and deadlock checks, the meat of tbthread_join is rather simple:

wait_for_thread(thread);
if(retval)
  *retval = thread->retval;
release_descriptor(thread);
return 0;

See the full patch at GitHub.
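
The checks themselves are not shown here, so the following is only a guess at their shape, modeled on the error conditions POSIX lists for pthread_join (EDEADLK for a detected deadlock or a self-join, EINVAL for a thread that is not joinable). The struct thread_desc type, its joiner and detached fields, and the join_precheck function are all hypothetical; the real struct tbthread may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical descriptor fields; the real struct tbthread may differ. */
struct thread_desc {
  struct thread_desc *joiner;  /* thread currently waiting to join this one */
  int detached;                /* non-zero once the thread has been detached */
};

/* Return 0 if `self` may safely block on `target`, or a negative errno. */
static int join_precheck(struct thread_desc *self, struct thread_desc *target)
{
  assert(self != NULL);         /* the caller must have a valid descriptor */
  if(!target)
    return -EINVAL;
  if(target == self)            /* waiting for ourselves never finishes */
    return -EDEADLK;
  if(target->detached)          /* detached threads are not joinable */
    return -EINVAL;
  if(target->joiner)            /* somebody else is already joining it */
    return -EINVAL;
  if(self->joiner == target)    /* target waits for us: a two-thread cycle */
    return -EDEADLK;
  target->joiner = self;        /* record the wait for later checks */
  return 0;
}
```

The assert on self reflects why the main thread needs a descriptor of its own: without one, there is nothing to compare the target against. A multi-threaded version would have to perform these checks under a lock.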

Dynamic initialization

pthread_once is an interesting beast. Its purpose is to dynamically initialize some resources by calling a designated function exactly once. The fun part is that several threads may attempt the initialization call at the same time. pthread_once_t, therefore, is kind of like a mutex, but has three states instead of two:

  • new: the initialization function has not been called yet; one of the threads needs to call it.
  • in progress: the initialization function is running; the threads are waiting for it to finish.
  • done: the initialization function is done; all the threads may be woken up.

The thread that manages to change the state from new to in progress gets to call the function. All the other threads wait until the done state is reached.

int tbthread_once(tbthread_once_t *once, void (*func)(void))
{
  if(!once || !func)
    return -EINVAL;

  /* Fast path: the initialization has already finished. */
  if(*once == TB_ONCE_DONE)
    return 0;

  /* Only one thread wins the new -> in progress transition. */
  if(__sync_bool_compare_and_swap(once, TB_ONCE_NEW, TB_ONCE_IN_PROGRESS)) {
    (*func)();
    *once = TB_ONCE_DONE;
    /* Wake every thread sleeping on this futex. */
    SYSCALL3(__NR_futex, once, FUTEX_WAKE, INT_MAX);
    return 0;
  }

  /* Everybody else sleeps while the state is still in progress. */
  while(*once != TB_ONCE_DONE)
    SYSCALL3(__NR_futex, once, FUTEX_WAIT, TB_ONCE_IN_PROGRESS);
  return 0;
}
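
One possible variation, not part of the original patch, spells out the memory ordering of the same three-state machine explicitly. It assumes GCC's __atomic builtins and glibc's syscall() wrapper in place of the SYSCALL3 macro. The names once_t, once_run, ONCE_NEW, ONCE_IN_PROGRESS, ONCE_DONE, and the count_init helper are made up for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/futex.h>

/* Hypothetical names; the values stand for the three states listed above. */
enum { ONCE_NEW = 0, ONCE_IN_PROGRESS = 1, ONCE_DONE = 2 };
typedef int once_t;   /* a futex word must be a 32-bit integer */

static int once_run(once_t *once, void (*func)(void))
{
  if(!once || !func)
    return -EINVAL;

  /* Fast path: the acquire load pairs with the release store below, so a
     caller that observes DONE also observes everything func() wrote. */
  if(__atomic_load_n(once, __ATOMIC_ACQUIRE) == ONCE_DONE)
    return 0;

  int expected = ONCE_NEW;
  if(__atomic_compare_exchange_n(once, &expected, ONCE_IN_PROGRESS, 0,
                                 __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE)) {
    func();
    __atomic_store_n(once, ONCE_DONE, __ATOMIC_RELEASE);
    syscall(SYS_futex, once, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
    return 0;
  }

  /* Lost the race: sleep while the word still reads IN_PROGRESS. */
  while(__atomic_load_n(once, __ATOMIC_ACQUIRE) != ONCE_DONE)
    syscall(SYS_futex, once, FUTEX_WAIT, ONCE_IN_PROGRESS, NULL, NULL, 0);
  return 0;
}

/* Example initializer that counts how many times it has been called. */
static int init_calls;
static void count_init(void) { ++init_calls; }
```

Calling once_run(&flag, count_init) repeatedly on the same flag leaves init_calls at 1. The wait loop re-reads the state after every return from FUTEX_WAIT, so spurious wakeups and a wake that arrives before the sleep are both handled.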

Side effects

The original glibc thread descriptor stores the locale information for the thread. Replacing it with ours causes seemingly simple functions, like strerror, to segfault. Therefore, we need to implement strerror ourselves.
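
A replacement does not have to be complicated. The sketch below is an assumption about what such a function could look like, not the one from the patch: tb_strerror is a hypothetical name, it covers only a handful of error codes, and it accepts the negative values that the functions in this series return.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical locale-free strerror covering only a few codes. */
static const char *tb_strerror(int errnum)
{
  if(errnum < 0)          /* accept the negative-errno convention too */
    errnum = -errnum;

  switch(errnum) {
    case 0:       return "Success";
    case EPERM:   return "Operation not permitted";
    case EAGAIN:  return "Resource temporarily unavailable";
    case ENOMEM:  return "Cannot allocate memory";
    case EBUSY:   return "Device or resource busy";
    case EINVAL:  return "Invalid argument";
    case EDEADLK: return "Resource deadlock avoided";
    default:      return "Unknown error";
  }
}
```

Because it returns pointers to string literals and never touches the thread descriptor, a function like this stays safe to call after the FS register has been repointed.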
