Chapter 10. Kernel Synchronization Methods

The previous chapter discussed the sources of and solutions to race conditions. The Linux kernel provides a family of synchronization methods, which enable developers to write efficient and race-free code. This chapter discusses these methods and their interfaces, behavior, and use

Atomic Operations

As the foundation on which other synchronization methods are built, atomic operations provide instructions that execute atomically, without interruption. Atomic operators are indivisible instructions. For example, an atomic increment can read and increment a variable by one in a single indivisible and uninterruptible step. Recall the simple race in incrementing an integer that we discussed in the previous chapter:

Thread 1 Thread 2
get i (7) get i (7)
increment i (7 -> 8)
increment i (7 -> 8)
write back i (8)
write back i (8)

With atomic operators, this race does cannot—occur. Instead, the outcome is always one of the following:

Thread 1 Thread 2
get, increment, and store i (7 -> 8)
get, increment, and store i (8 -> 9)

Or:

Thread 1 Thread 2
get, increment, and store i (7 -> 8)
get, increment, and store i (8 -> 9)

The ultimate value, always nine, is correct. It is never possible for the two atomic operations to occur on the same variable concurrently. Therefore, it is not possible for the increments to race.

The kernel provides two sets of interfaces for atomic operations: one that operates on integers and another that operates on individual bits. These interfaces are implemented on every architecture that Linux supports. Most architectures contain instructions that provide atomic versions of simple arithmetic operations. Other architectures, lacking direct atomic operations, provide an operation to lock the memory bus for a single operation, thus guaranteeing that another memory-affecting operation cannot occur simultaneously.

Atomic Integer Operations

The atomic integer methods operate on a special data type, atomic_t. This special type is used, as opposed to having the functions work directly on the C int type, for several reasons:

  1. Having the atomic functions accept only the atomic_t type ensures that the atomic operations are used only with these special types. Likewise, it also ensures that the data types are not passed to any non-atomic functions. Indeed, what good would atomic operations be if they were not consistently used on the data?
  2. The use of atomic_t ensures the compiler does not (erroneously but cleverly) optimize access to the value—it is important the atomic operations receive the correct memory address and not an alias.
  3. Use of atomic_t can hide any architecture-specific differences in its implementation.

The atomic_t type is defined in <linux/types.h> (include/linux/types.h#L192):

typedef struct {
    volatile int counter;
} atomic_t;

Despite being an integer, and thus 32 bits on all the machines that Linux supports, developers and their code once had to assume that an atomic_t was no larger than 24 bits in size. The SPARC port in Linux has an odd implementation of atomic operations: A lock was embedded in the lower 8 bits of the 32-bit int, like in the following figure:

Figure 10.1 Old layout of the 32-bit atomic_t on SPARC.

The lock was used to protect concurrent access to the atomic type because the SPARC architecture lacks appropriate support at the instruction level. Consequently, only 24 usable bits were available on SPARC machines. Although code that assumed that the full 32-bit range existed would work on other machines; it would have failed in strange and subtle ways on SPARC machines. Recently, clever hacks have allowed SPARC to provide a fully usable 32-bit atomic_t, and this limitation is no more.

The declarations needed to use the atomic integer operations are in <asm/atomic.h>. Some architectures provide additional methods that are unique to that architecture, but all architectures provide at least a minimum set of operations that are used throughout the kernel. When you write kernel code, you can ensure that these operations are correctly implemented on all architectures.

Defining an atomic_t is done in the usual manner. Optionally, you can set it to an initial value:

atomic_t v;                   /* define v */
atomic_t u = ATOMIC_INIT(0);  /* define u and initialize it to zero */

Operations are all simple:

atomic_set(&v, 4);  /* v = 4 (atomically) */
atomic_add(2, &v);  /* v = v + 2 = 6 (atomically) */
atomic_inc(&v);     /* v = v + 1 = 7 (atomically) */

If you ever need to convert an atomic_t to an int, use atomic_read():

printk(%d\n, atomic_read(&v)); /* will print "7" */

A common use of the atomic integer operations is to implement counters. Protecting a sole counter with a complex locking scheme is overkill, so instead developers use atomic_inc() and atomic_dec(), which are much lighter in weight.

Another use of the atomic integer operators is atomically performing an operation and testing the result. A common example is the atomic decrement and test:

int atomic_dec_and_test(atomic_t *v)

This function decrements by one the given atomic value. If the result is zero, it returns true; otherwise, it returns false. The following table is a full listing of the standard atomic integer operations (those found on all architectures). All the operations implemented on a specific architecture can be found in <asm/atomic.h>.

Atomic Integer Operation Description
ATOMIC_INIT(int i) At declaration, initialize to i.
int atomic_read(atomic_t *v) Atomically read the integer value of v.
void atomic_set(atomic_t *v, int i) Atomically set v equal to i.
void atomic_add(int i, atomic_t *v) Atomically add i to v.
void atomic_sub(int i, atomic_t *v) Atomically subtract i from v.
void atomic_inc(atomic_t *v) Atomically add one to v.
void atomic_dec(atomic_t *v) Atomically subtract one from v.
int atomic_sub_and_test(int i, atomic_t *v) Atomically subtract i from v and return true if the result is zero; otherwise false.
int atomic_add_negative(int i, atomic_t *v) Atomically add i to v and return true if the result is negative; otherwise false.
int atomic_add_return(int i, atomic_t *v) Atomically add i to v and return the result.
int atomic_sub_return(int i, atomic_t *v) Atomically subtract i from v and return the result.
int atomic_inc_return(int i, atomic_t *v) Atomically increment v by one and return the result.
int atomic_dec_return(int i, atomic_t *v) Atomically decrement v by one and return the result.
int atomic_dec_and_test(atomic_t *v) Atomically decrement v by one and return true if zero; false otherwise.
int atomic_inc_and_test(atomic_t *v) Atomically increment v by one and return true if the result is zero; false otherwise.

The atomic operations are typically implemented as inline functions with inline assembly. In the case where a specific function is inherently atomic, the given function is usually just a macro. For example, on most architectures, a word-sized read is always atomic.That is, a read of a single word cannot complete in the middle of a write to that word. The read always returns the word in a consistent state, either before or after the write completes, but never in the middle. Consequently, atomic_read() is usually just a macro returning the integer value of the atomic_t:

/**
 * atomic_read - read atomic variable
 * @v: pointer of type atomic_t
 *
 * Atomically reads the value of @v.
 */
static inline int atomic_read(const atomic_t *v)
{
    return v->counter;
}
Atomicity Versus Ordering *

The preceding discussion on atomic reading begs a discussion on the differences between atomicity and ordering. As discussed, a word-sized read always occurs atomically. It never interleaves with a write to the same word; the read always returns the word in a consistent state: perhaps before the write completes, perhaps after, but never during. For example, if an integer is initially 42 and then set to 365, a read on the integer always returns 42 or 365 and never some commingling of the two values. We call this atomicity.

However, your code might have more stringent requirements than this. Perhaps you require that the read always occurs before the pending write. This type of requirement is not atomicity, but ordering.

The atomic operations discussed in this section guarantee only atomicity. Ordering is enforced via barrier operations, which is discussed later in this chapter.

In your code, it is usually preferred to choose atomic operations over more complicated locking mechanisms. On most architectures, one or two atomic operations incur less overhead and less cache-line thrashing than a more complicated synchronization method. However, testing multiple approaches is always smart, as with any performance-sensitive code.

64-Bit Atomic Operations

With the rising prevalence of 64-bit architectures, it is no surprise that the Linux kernel developers augmented the 32-bit atomic_t type with a 64-bit variant, atomic64_t. For portability, the size of atomic_t cannot change between architectures, so atomic_t is 32-bit even on 64-bit architectures. Instead, the atomic64_t type provides a 64-bit atomic integer that functions otherwise identical to its 32-bit brother. Usage is exactly the same, except that the usable range of the integer is 64 bits, rather than 32 bits. Nearly all the classic 32-bit atomic operations are implemented in 64-bit variants; they are prefixed with atomic64 in lieu of atomic. The following table is a full listing of the standard operations; some architectures implement more, but they are not portable. As with atomic_t, the atomic64_t type is just a simple wrapper around an integer type a long:

typedef struct {
    volatile long counter;
} atomic64_t;
Atomic Integer Operation Description
ATOMIC64_INIT(long i) At declaration, initialize to i.
long atomic64_read(atomic64_t *v) Atomically read the integer value of v.
void atomic64_set(atomic64_t *v, int i) Atomically set v equal to i.
void atomic64_add(int i, atomic64_t *v) Atomically add i to v.
void atomic64_sub(int i, atomic64_t *v) Atomically subtract i from v.
void atomic64_inc(atomic64_t *v) Atomically add one to v.
void atomic64_dec(atomic64_t *v) Atomically subtract one from v.
int atomic64_sub_and_test(int i, atomic64_t *v) Atomically subtract i from v and return true if the result is zero; otherwise false.
int atomic64_add_negative(int i, atomic64_t *v) Atomically add i to v and return true if the result is negative; otherwise false.
long atomic64_add_return(int i, atomic64_t *v) Atomically add i to v and return the result.
long atomic64_sub_return(int i, atomic64_t *v) Atomically subtract i from v and return the result.
long atomic64_inc_return(int i, atomic64_t *v) Atomically increment v by one and return the result.
long atomic64_dec_return(int i, atomic64_t *v) Atomically decrement v by one and return the result.
int atomic64_dec_and_test(atomic64_t *v) Atomically decrement v by one and return true if zero; false otherwise.
int atomic64_inc_and_test(atomic64_t *v) Atomically increment v by one and return true if the result is zero; false otherwise.

All 64-bit architectures provide atomic64_t and a family of arithmetic functions to operate on it. Most 32-bit architectures do not support atomic64_t, while x86-32 is a notable exception. For portability between all Linux's supported architectures, developers should use the 32-bit atomic_t type. The 64-bit atomic64_t is reserved for code that is both architecture-specific and that requires 64-bits.

Atomic Bitwise Operations

In addition to atomic integer operations, the kernel also provides a family of functions that operate at the bit level, which are architecture-specific and are defined in <asm/bitops.h>.

The bitwise functions have the following properties:

Because the functions operate on a generic pointer, there is no equivalent of the atomic integer's atomic_t type. Instead, you can work with a pointer to whatever data you want.

For example:

set_bit(0, &word); /* bit zero is now set (atomically) */
set_bit(1, &word); /* bit one is now set (atomically) */
printk("%ul\n", word); /* will print "3" */
clear_bit(1, &word); /* bit one is now unset (atomically) */
change_bit(0, &word); /* bit zero is flipped; now it is unset (atomically) */

/* atomically sets bit zero and returns the previous value (zero) */
if (test_and_set_bit(0, &word)) {
    /* never true ... */
}

/* the following is legal; you can mix atomic bit instructions with normal C */
word = 7;

A listing of the standard atomic bit operations is in the following table:

Atomic Bitwise Operation Description
void set_bit(int nr, void *addr) Atomically set the nr-th bit starting from addr.
void clear_bit(int nr, void *addr) Atomically clear the nr-th bit starting from addr.
void change_bit(int nr, void *addr) Atomically flip the value of the nr-th bit starting from addr.
int test_and_set_bit(int nr, void *addr) Atomically set the nr-th bit starting from addr and return the previous value.
int test_and_clear_bit(int nr, void *addr) Atomically clear the nr-th bit starting from addr and return the previous value.
int test_and_change_bit(int nr, void *addr) Atomically flip the nr-th bit starting from addr and return the previous value.
int test_bit(int nr, void *addr) Atomically return the value of the nr-th bit starting from addr.

Conveniently, nonatomic versions of all the bitwise functions are also provided. They behave identically to their atomic siblings, except they do not guarantee atomicity, and their names are prefixed with double underscores. For example, the nonatomic form of test_bit() is __test_bit(). If you do not require atomicity (for example, because a lock already protects your data), these variants of the bitwise functions might be faster.

What the Heck Is a Nonatomic Bit Operation? *

[p183]

Atomicity requires that either instructions succeed in their entirety, uninterrupted, or instructions fail to execute at all. Therefore, if you issue two atomic bit operations, you expect two operations to succeed. After both operations complete, the bit needs to have the value as specified by the second operation. Moreover, however, at some point in time prior to the final operation, the bit needs to hold the value as specified by the first operation. Put more generally, real atomicity requires that all intermediate states be correctly realized.

For example, assume you issue two atomic bit operations: Initially set the bit and then clear the bit. Without atomic operations, the bit might end up cleared, but it might never have been set. The set operation could occur simultaneously with the clear operation and fail. The clear operation would succeed, and the bit would emerge cleared as intended. With atomic operations, however, the set would actually occur: there would be a moment in time when a read would show the bit as set, and then the clear would execute and the bit would be zero.

This behavior can be important, especially when ordering comes into play or when dealing with hardware registers.

The kernel also provides routines to find the first set (or unset) bit starting at a given address:

int find_first_bit(unsigned long *addr, unsigned int size)
int find_first_zero_bit(unsigned long *addr, unsigned int size)

Both functions take a pointer as their first argument and the number of bits in total to search as their second. They return the bit number of the first set or first unset bit, respectively. If your code is searching only a word, the routines __ffs() and ffz(), which take a single parameter of the word in which to search, are optimal.

Unlike the atomic integer operations, code typically has no choice whether to use the bitwise operations; they are the only portable way to set a specific bit. The only question is whether to use the atomic or nonatomic variants. If your code is inherently safe from race conditions, you can use the nonatomic versions, which might be faster depending on the architecture.