Table of Contents

  1. Syscalls, memory, and your first thread
  2. The pointer to self and thread-local storage
  3. Futexes, mutexes, and memory synchronization
  4. Joining threads and dynamic initialization
  5. Cancellation
  6. Scheduling and task priority
  7. RW Locks
  8. Condition variables
  9. Final thoughts

Cancellation

Cancellation boils down to making one thread exit following a request from another thread. It seems that calling tbthread_exit at an appropriate point is enough to implement all of the behavior described in the man pages. We will go this way despite the fact that it is not the approach taken by glibc. Glibc unwinds the stack back to the point invoking the user-supplied thread function. This behavior allows it to simulate an exception if C++ code is using the library. We don't bother with C++ support for the moment and don't always care to supply valid DWARF information. Therefore, we will take the easier approach.

tbthread_setcancelstate and tbthread_setcanceltype are the two functions controlling the response of a thread to a cancellation request. The former enables or disables cancellation altogether, queuing the requests for later handling if necessary. The latter decides whether the thread should abort immediately or at a cancellation point. POSIX has a list of cancellation points, but we will not bother with them. Instead, our only cancellation points will be tbthread_testcancel and the two functions mentioned above.

The thread must not get interrupted after it disables or defers cancellation. An interruption at that point would likely lead to deadlocks due to unreleased mutexes, memory leaks, and such. The trick here is to update all the cancellation-related flags atomically. So, we use one variable to hold the following flags:

  • TB_CANCEL_ENABLED: The cancellation is enabled; if a cancellation request has been queued, reaching a cancellation point will cause the thread to exit.
  • TB_CANCEL_DEFERRED: The cancellation is deferred (not asynchronous); SIGCANCEL will not be sent; see the paragraph on signal handling.
  • TB_CANCELING: A cancellation request has been queued; depending on other flags, SIGCANCEL may be sent.
  • TB_CANCELED: A cancellation request has been taken into account and the thread is in the process of exiting; this flag is used to handle the cases when a cancellation point has been reached before SIGCANCEL has been delivered by the kernel.
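
To make the atomic update concrete, here is a minimal sketch of how the enabled bit could be flipped with a compare-and-swap loop. It is not the code from the patch: the bit values, the SKETCH_CANCEL_* constants, the sketch_setcancelstate name, and the global cancel_status (standing in for thread->cancel_status) are all assumptions made for illustration. A real implementation would probably also check for a queued request right after re-enabling cancellation; the sketch leaves that out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values; the post does not show the real ones. */
#define TB_CANCEL_ENABLED  0x01
#define TB_CANCEL_DEFERRED 0x02
#define TB_CANCELING       0x04
#define TB_CANCELED        0x08

/* Hypothetical state constants for this sketch only. */
enum { SKETCH_CANCEL_ENABLE = 0, SKETCH_CANCEL_DISABLE = 1 };

/* Stand-in for thread->cancel_status of the calling thread. */
static uint8_t cancel_status = TB_CANCEL_ENABLED | TB_CANCEL_DEFERRED;

/* Flip the enabled bit atomically and report the previous state. */
static int sketch_setcancelstate(int state, int *oldstate)
{
  uint8_t val, newval;

  if(state != SKETCH_CANCEL_ENABLE && state != SKETCH_CANCEL_DISABLE)
    return -1;

  /* Retry until no other thread modified the flags between our
     read and our write; the other bits are carried over untouched. */
  do {
    val = cancel_status;
    if(state == SKETCH_CANCEL_ENABLE)
      newval = (uint8_t)(val | TB_CANCEL_ENABLED);
    else
      newval = (uint8_t)(val & ~TB_CANCEL_ENABLED);
  } while(!__sync_bool_compare_and_swap(&cancel_status, val, newval));

  if(oldstate)
    *oldstate = (val & TB_CANCEL_ENABLED) ? SKETCH_CANCEL_ENABLE
                                          : SKETCH_CANCEL_DISABLE;
  return 0;
}
```

Because the whole status lives in one byte, a concurrent request that sets TB_CANCELING can never be lost: either it lands before our read, or the compare-and-swap fails and we retry with the fresh value.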

tbthread_testcancel looks as follows:

void tbthread_testcancel()
{
  tbthread_t thread = tbthread_self();
  uint8_t val, newval;

  while(1) {
    newval = val = thread->cancel_status;
    if(!(val & TB_CANCEL_ENABLED) || !(val & TB_CANCELING) ||
       (val & TB_CANCELED))
      return;
    newval |= TB_CANCELED;
    if(__sync_bool_compare_and_swap(&thread->cancel_status, val, newval))
      break;
  }
  tbthread_exit(TBTHREAD_CANCELED);
}

See the full patch at GitHub.

Clean-up handlers

The user may register a bunch of functions that clean up the mess caused by an unexpected interruption. They are installed with tbthread_cleanup_push and called when the thread exits abnormally. The purpose of these functions is to unlock mutexes, free the heap memory, and such. tbthread_cleanup_pop removes the most recently installed handler and optionally executes it in the process.

void tbthread_cleanup_push(void (*func)(void *), void *arg)
{
  tbthread_t self = tbthread_self();
  struct cleanup_elem *e = malloc(sizeof(struct cleanup_elem));
  e->func = func;
  e->arg = arg;
  list_add_elem(&self->cleanup_handlers, e, 1);
}

void tbthread_cleanup_pop(int execute)
{
  tbthread_t self = tbthread_self();
  list_t *node = self->cleanup_handlers.next;
  if(!node)
    return;
  list_rm(node);
  struct cleanup_elem *e = (struct cleanup_elem*)node->element;
  if(execute)
    (*e->func)(e->arg);
  free(e);
  free(node);
}
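
The part not shown above is what happens at an abnormal exit: the handlers that are still installed get executed, newest first. Below is a self-contained sketch of that behavior. It does not use the library's list_t; the sketch_* names and the singly linked stack are assumptions made so that the example compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-thread handler list. */
struct sketch_cleanup {
  void (*func)(void *);
  void *arg;
  struct sketch_cleanup *next;
};

static struct sketch_cleanup *sketch_handlers = NULL;

/* Install a handler on top of the stack. */
static int sketch_cleanup_push(void (*func)(void *), void *arg)
{
  struct sketch_cleanup *e = malloc(sizeof(*e));
  if(!e)
    return -1;
  e->func = func;
  e->arg = arg;
  e->next = sketch_handlers;
  sketch_handlers = e;
  return 0;
}

/* Remove the newest handler and optionally run it. */
static void sketch_cleanup_pop(int execute)
{
  struct sketch_cleanup *e = sketch_handlers;
  if(!e)
    return;
  sketch_handlers = e->next;
  if(execute)
    e->func(e->arg);
  free(e);
}

/* What an abnormal exit would do: run whatever is left, newest first. */
static void sketch_run_cleanup_handlers(void)
{
  while(sketch_handlers)
    sketch_cleanup_pop(1);
}

/* Demo handler recording the order in which handlers were called. */
static int sketch_order[4];
static int sketch_count = 0;

static void sketch_record(void *arg)
{
  sketch_order[sketch_count++] = *(int *)arg;
}
```

Pushing three handlers, popping one without executing it, and then running the rest calls the remaining two in reverse order of installation, which is the order you want for releasing nested resources.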

See the full patch at GitHub.

Signals and asynchronous cancellation

Asynchronous cancellation uses the first real-time signal, SIGRTMIN, which we call SIGCANCEL here for clarity.

Registering a signal handler is somewhat more tricky than just calling the appropriate syscall. It is so because, on x86_64, we need to provide a function that restores the stack after the signal handler returns. The function is called a signal trampoline and its purpose is to invoke sys_rt_sigreturn. The trampoline is registered with the kernel using a special sigaction flag:

void __restore_rt();
#define SA_RESTORER 0x04000000

int tbsigaction(int signum, struct sigaction *act, struct sigaction *old)
{
  act->sa_flags |= SA_RESTORER;
  act->sa_restorer = __restore_rt;
  return SYSCALL4(__NR_rt_sigaction, signum, act, old, sizeof(sigset_t));
}

The trampoline itself, called __restore_rt here, is defined in assembly as follows:

  .text

  .global __restore_rt
  .type   __restore_rt,@function
  .align  16

__restore_rt:
  movq $__NR_rt_sigreturn, %rax
  syscall

Looking at the corresponding glibc code, you can see that they add the eh_frame info here. The comments say that it is to aid gdb and handle the stack unwinding. I don't know enough DWARF to write one on my own, gdb does not seem to be utterly confused without it, and we won't do stack unwinding, so we just won't bother with it for the moment.

In the cancellation handler, we first check whether it's the right signal and that it has been sent by a thread in the same thread group. We then need to check whether the thread is still in the asynchronous cancellation mode. It might have changed between the time the signal was sent and the time it is delivered. Finally, we call tbthread_testcancel to see if the thread should exit.

void tb_cancel_handler(int sig, siginfo_t *si, void *ctx)
{
  if(sig != SIGCANCEL || si->si_pid != tb_pid || si->si_code != SI_TKILL)
    return;

  tbthread_t self = tbthread_self();
  if(self->cancel_status & TB_CANCEL_DEFERRED)
    return;

  tbthread_testcancel();
}

We invoke sys_tgkill to send the signal:

SYSCALL3(__NR_tgkill, tb_pid, thread->exit_futex, SIGCANCEL);
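
If you want to convince yourself that the si_pid and si_code checks in the handler match what the kernel reports for tgkill, here is a small Linux-only experiment. It is a sketch under assumptions: it uses glibc's sigaction and syscall wrappers instead of tbsigaction and SYSCALL3, the sketch_* names are made up, and it relies on glibc's SIGRTMIN macro, which is not the kernel's first real-time signal because glibc reserves the lowest ones for its own threading code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Values observed by the handler, for inspection afterwards. */
static volatile sig_atomic_t sketch_seen_code = 0;
static volatile sig_atomic_t sketch_seen_pid = 0;

/* Record who sent the signal and how it was sent. */
static void sketch_handler(int sig, siginfo_t *si, void *ctx)
{
  (void)sig;
  (void)ctx;
  sketch_seen_code = si->si_code;
  sketch_seen_pid = si->si_pid;
}

/* Install the handler and send the signal to the calling thread. */
static int sketch_send_to_self(void)
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = sketch_handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  if(sigaction(SIGRTMIN, &sa, NULL) != 0)
    return -1;

  pid_t tgid = getpid();                   /* thread group id */
  pid_t tid = (pid_t)syscall(SYS_gettid);  /* id of this thread */
  return (int)syscall(SYS_tgkill, tgid, tid, SIGRTMIN);
}
```

After sketch_send_to_self returns 0, sketch_seen_code should hold SI_TKILL and sketch_seen_pid the process id, which are the two conditions tb_cancel_handler tests before doing anything else.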

See the full patch at GitHub.

Cancellation of a "once" function

The implementation of tbthread_once gets quite a bit more interesting as well. If the thread invoking the initialization function gets canceled, another thread needs to pick it up. We need to install a cleanup handler that changes the state of the once control back to TB_ONCE_NEW and wakes all the waiting threads so that they can restart from the beginning:

static void once_cleanup(void *arg)
{
  tbthread_once_t *once = (tbthread_once_t *)arg;
  *once = TB_ONCE_NEW;
  SYSCALL3(__NR_futex, once, FUTEX_WAKE, INT_MAX);
}

int tbthread_once(tbthread_once_t *once, void (*func)(void))
{
  if(!once || !func)
    return -EINVAL;

  int cancel_state;

  while(1) {
    if(*once == TB_ONCE_DONE)
      return 0;

    //--------------------------------------------------------------------------
    // The executor
    //--------------------------------------------------------------------------
    tbthread_setcancelstate(TBTHREAD_CANCEL_DISABLE, &cancel_state);
    if(__sync_bool_compare_and_swap(once, TB_ONCE_NEW, TB_ONCE_IN_PROGRESS)) {
      tbthread_cleanup_push(once_cleanup, once);
      tbthread_setcancelstate(cancel_state, 0);

      (*func)();

      tbthread_setcancelstate(TBTHREAD_CANCEL_DISABLE, &cancel_state);
      tbthread_cleanup_pop(0);

      *once = TB_ONCE_DONE;
      SYSCALL3(__NR_futex, once, FUTEX_WAKE, INT_MAX);
      tbthread_setcancelstate(cancel_state, 0);
      return 0;
    }

    tbthread_setcancelstate(cancel_state, 0);

    //--------------------------------------------------------------------------
    // The waiters
    //--------------------------------------------------------------------------
    while(1) {
      SYSCALL3(__NR_futex, once, FUTEX_WAIT, TB_ONCE_IN_PROGRESS);
      if(*once != TB_ONCE_IN_PROGRESS)
        break;
    }
  }
}
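
The state transitions are easier to follow without the futex and cancellation plumbing. The following single-threaded model is only a sketch: the SKETCH_ONCE_* values, the function names, and the cancel_midway parameter (which simulates the executor being canceled inside the initialization function) are assumptions made for illustration and are not part of the library.

```c
#include <assert.h>

/* Hypothetical state values mirroring TB_ONCE_NEW and friends. */
enum { SKETCH_ONCE_NEW, SKETCH_ONCE_IN_PROGRESS, SKETCH_ONCE_DONE };

/* Counts how many times the initialization actually completed. */
static int sketch_init_runs = 0;

/* Mirrors once_cleanup without the futex wake-up. */
static void sketch_once_cleanup(int *once)
{
  *once = SKETCH_ONCE_NEW;
}

/* One pass through the executor branch.
   Returns 0 if already done, 1 if this call completed the
   initialization, -1 if it was "canceled" midway, and -2 if someone
   else is the executor (a real waiter would block on the futex). */
static int sketch_once_attempt(int *once, int cancel_midway)
{
  if(*once == SKETCH_ONCE_DONE)
    return 0;

  if(!__sync_bool_compare_and_swap(once, SKETCH_ONCE_NEW,
                                   SKETCH_ONCE_IN_PROGRESS))
    return -2;

  if(cancel_midway) {
    /* The cleanup handler hands the job back to the next caller. */
    sketch_once_cleanup(once);
    return -1;
  }

  ++sketch_init_runs;
  *once = SKETCH_ONCE_DONE;
  return 1;
}
```

A canceled attempt leaves the control in the new state, so the next caller becomes the executor and the initialization still completes exactly once.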

See the patch at GitHub.
