The Kernel Concurrency Sanitizer (KCSAN)
========================================

*Kernel Concurrency Sanitizer (KCSAN)* is a dynamic data race detector for
kernel space. KCSAN is a sampling watchpoint-based data race detector. Key
priorities in KCSAN's design are lack of false positives, scalability, and
simplicity. More details can be found in `Implementation Details`_.

KCSAN uses compile-time instrumentation to instrument memory accesses. KCSAN is
supported in both GCC and Clang. With GCC it requires version 7.3.0 or later.
With Clang it requires version 7.0.0 or later.

To enable KCSAN, configure the kernel with::

    CONFIG_KCSAN = y

KCSAN provides several other configuration options to customize behaviour (see
their respective help text for more info).

A typical data race report looks like this::

    ==================================================================
    BUG: KCSAN: data-race in generic_permission / kernfs_refresh_inode

    write to 0xffff8fee4c40700c of 4 bytes by task 175 on cpu 4:
     kernfs_refresh_inode+0x70/0x170
     kernfs_iop_permission+0x4f/0x90
     inode_permission+0x190/0x200
     link_path_walk.part.0+0x503/0x8e0
     path_lookupat.isra.0+0x69/0x4d0
     filename_lookup+0x136/0x280
     user_path_at_empty+0x47/0x60
     __do_sys_newlstat+0x50/0xb0
     __x64_sys_newlstat+0x37/0x50
     do_syscall_64+0x85/0x260
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

    read to 0xffff8fee4c40700c of 4 bytes by task 166 on cpu 6:
     generic_permission+0x5b/0x2a0
     kernfs_iop_permission+0x66/0x90
     inode_permission+0x190/0x200
     link_path_walk.part.0+0x503/0x8e0
     path_lookupat.isra.0+0x69/0x4d0
     filename_lookup+0x136/0x280
     user_path_at_empty+0x47/0x60
     do_faccessat+0x11a/0x390
     __x64_sys_access+0x3c/0x50
     do_syscall_64+0x85/0x260
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 6 PID: 166 Comm: systemd-journal Not tainted 5.3.0-rc7+ #1
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
    ==================================================================

The header of the report provides a short summary of the functions involved in
the race. It is followed by the access types and stack traces of the two
threads involved in the data race.

The other, less common type of data race report looks like this::

    ==================================================================
    BUG: KCSAN: data-race in e1000_clean_rx_irq+0x551/0xb10

    race at unknown origin, with read to 0xffff933db8a2ae6c of 1 bytes by interrupt on cpu 0:
     e1000_clean_rx_irq+0x551/0xb10
     e1000_clean+0x533/0xda0
     net_rx_action+0x329/0x900
     __do_softirq+0xdb/0x2db
     ret_from_intr+0x0/0x18
     default_idle+0x3f/0x220
     arch_cpu_idle+0x21/0x30
     cpu_startup_entry+0x14/0x20
     arch_call_rest_init+0x13/0x2b
     start_kernel+0x6db/0x700

    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc7+ #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
    ==================================================================

This report is generated when the other racing thread could not be determined,
but a race was inferred because the data value of the watched memory location
changed. Such reports can occur either due to missing instrumentation or, for
example, DMA accesses.

It may be desirable to disable data race detection for specific accesses,
functions, compilation units, or entire subsystems. For static blacklisting,
the options below are available:

* KCSAN understands the ``data_race(expr)`` annotation, which tells KCSAN that
  any data races due to accesses in ``expr`` should be ignored and that the
  resulting behaviour when encountering a data race is deemed safe.

* Disabling data race detection for entire functions can be accomplished by
  using the function attribute ``__no_kcsan`` (or ``__no_kcsan_or_inline`` for
  ``__always_inline`` functions). To dynamically control for which functions
  data races are reported, see the `debugfs`_ blacklist/whitelist feature.

* To disable data race detection for a particular compilation unit, add to the
  ``Makefile``::

    KCSAN_SANITIZE_file.o := n

* To disable data race detection for all compilation units listed in a
  ``Makefile``, add to the respective ``Makefile``::

    KCSAN_SANITIZE := n
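
As a rough illustration of the two source-level options above, the sketch
below marks an intentionally racy read with ``data_race()`` and excludes a
whole function with ``__no_kcsan``. The two macro definitions are userspace
stand-ins so that the fragment builds on its own (in the kernel they are
provided by the kernel headers), and ``struct stats``, ``stats_snapshot()``
and ``stats_bump()`` are hypothetical names used only for this example.

.. code-block:: c

    #include <stdint.h>

    /* Stand-ins so the sketch builds outside the kernel (assumption). */
    #ifndef data_race
    #define data_race(expr) (expr)
    #endif
    #ifndef __no_kcsan
    #define __no_kcsan
    #endif

    /* Hypothetical statistics counter that is updated without locking. */
    struct stats {
        uint64_t events;
    };

    /* Approximate read: a stale value is acceptable, so mark the access. */
    static uint64_t stats_snapshot(const struct stats *s)
    {
        return data_race(s->events);
    }

    /* The entire function is excluded from data race detection. */
    static __no_kcsan void stats_bump(struct stats *s)
    {
        s->events++;
    }

Marking a single expression keeps the rest of the function instrumented, so
``data_race()`` is usually the narrower choice when only one access is known
to be benign.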

debugfs
-------

* The file ``/sys/kernel/debug/kcsan`` can be read to get stats.

* KCSAN can be turned on or off by writing ``on`` or ``off`` to
  ``/sys/kernel/debug/kcsan``.

* Writing ``!some_func_name`` to ``/sys/kernel/debug/kcsan`` adds
  ``some_func_name`` to the report filter list, which (by default) blacklists
  reporting data races where either one of the top stackframes is a function
  in the list.

* Writing either ``blacklist`` or ``whitelist`` to ``/sys/kernel/debug/kcsan``
  changes the report filtering behaviour. For example, the blacklist feature
  can be used to silence frequently occurring data races; the whitelist feature
  can help with reproduction and testing of fixes.
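
The commands above are plain strings written to a single file, so they can
also be issued from a test program. The following is a minimal userspace
sketch, assuming debugfs is mounted at ``/sys/kernel/debug`` and the caller
has permission to write the file; ``kcsan_ctl_write()`` is a hypothetical
helper and not part of any kernel or library API.

.. code-block:: c

    #include <stdio.h>

    /*
     * Write one command string (for example "on", "off", "blacklist" or
     * "!some_func_name") to the given control file.
     * Returns 0 on success and -1 on failure.
     */
    static int kcsan_ctl_write(const char *path, const char *cmd)
    {
        FILE *f = fopen(path, "w");
        int ok;

        if (!f)
            return -1;
        ok = fputs(cmd, f) >= 0;
        if (fclose(f) != 0)
            ok = 0;
        return ok ? 0 : -1;
    }

For example, a test harness might call
``kcsan_ctl_write("/sys/kernel/debug/kcsan", "!some_func_name")`` followed by
``kcsan_ctl_write("/sys/kernel/debug/kcsan", "whitelist")`` to restrict reports
to one function while verifying a fix.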

Data Races
----------

Informally, two operations *conflict* if they access the same memory location,
and at least one of them is a write operation. In an execution, two memory
operations from different threads form a **data race** if they *conflict*, at
least one of them is a *plain access* (non-atomic), and they are *unordered* in
the "happens-before" order according to the `LKMM
<../../tools/memory-model/Documentation/explanation.txt>`_.
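
The definition can be restated as a small predicate. This is only a sketch of
the informal wording above: ``struct access`` and both function names are
hypothetical, "same memory location" is approximated as overlapping byte
ranges, and whether two accesses are ordered by happens-before is passed in by
the caller rather than computed.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical description of one memory access. */
    struct access {
        uintptr_t addr;    /* start address */
        size_t size;       /* number of bytes accessed */
        bool is_write;
        bool is_plain;     /* false for marked accesses such as READ_ONCE() */
    };

    /* Conflict: overlapping locations and at least one write. */
    static bool accesses_conflict(const struct access *a,
                                  const struct access *b)
    {
        bool overlap = a->addr < b->addr + b->size &&
                       b->addr < a->addr + a->size;

        return overlap && (a->is_write || b->is_write);
    }

    /* Data race: conflicting, at least one plain access, and unordered. */
    static bool is_data_race(const struct access *a, const struct access *b,
                             bool ordered)
    {
        return accesses_conflict(a, b) &&
               (a->is_plain || b->is_plain) && !ordered;
    }

Under this predicate, a plain write racing with a ``READ_ONCE()`` read is still
a data race, whereas two marked accesses to the same location are not.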

Relationship with the Linux Kernel Memory Model (LKMM)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The LKMM defines the propagation and ordering rules of various memory
operations, which gives developers the ability to reason about concurrent code.
Ultimately this makes it possible to determine the possible executions of
concurrent code, and whether that code is free from data races.

KCSAN is aware of *atomic* accesses (``READ_ONCE``, ``WRITE_ONCE``,
``atomic_*``, etc.), but is oblivious of any ordering guarantees. In other
words, KCSAN assumes that as long as a plain access is not observed to race
with another conflicting access, memory operations are correctly ordered.

This means that KCSAN will not report *potential* data races due to missing
memory ordering. If, however, missing memory ordering (that is observable with
a particular compiler and architecture) leads to an observable data race (e.g.
entering a critical section erroneously), KCSAN would report the resulting
data race.

Race conditions vs. data races
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Race conditions are logic bugs, where unexpected interleaving of racing
concurrent operations results in an erroneous state.

Data races on the other hand are defined at the *memory model/language level*.
Many data races are also harmful race conditions, which a tool like KCSAN
reports! However, not all data races are race conditions and vice versa.
KCSAN's intent is to report data races according to the LKMM. A data race
detector can only work at the memory model/language level.
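
To make the distinction concrete, here is a small userspace sketch of a race
condition that involves no data race. Every access to the shared variable is
atomic (C11 ``<stdatomic.h>`` is used as a stand-in for the kernel's
``atomic_*`` helpers), so a data race detector has nothing to report, yet the
read-then-write sequence can still lose an update. The names and the balance
scenario are hypothetical, and the problematic interleaving is replayed
sequentially instead of using real threads.

.. code-block:: c

    #include <stdatomic.h>

    static atomic_int balance = 100;    /* hypothetical shared state */

    /* Step 1 of a withdrawal: atomically read the current balance. */
    static int withdraw_read(void)
    {
        return atomic_load(&balance);
    }

    /* Step 2: atomically store a new balance based on the earlier read. */
    static void withdraw_write(int seen, int amount)
    {
        atomic_store(&balance, seen - amount);
    }

    /*
     * Replays one interleaving of two withdrawals of 30: both read 100
     * before either writes, so one of the two updates is lost.
     */
    static int replay_lost_update(void)
    {
        int a = withdraw_read();    /* thread A reads 100 */
        int b = withdraw_read();    /* thread B reads 100 */

        withdraw_write(a, 30);      /* A stores 70 */
        withdraw_write(b, 30);      /* B stores 70, A's update is lost */
        return atomic_load(&balance);
    }

The final balance is 70 rather than the intended 40. Fixing this requires
making the read-modify-write a single atomic step or protecting it with a lock,
which is a question of program logic rather than of marking accesses.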

Deeper analysis, to find high-level race conditions only, requires conveying
the intended kernel logic to a tool. This requires (1) the developer writing a
specification or model of their code, and then (2) the tool verifying that the
implementation matches. This has been done for small bits of code using model
checkers and other formal methods, but does not scale to the level of what can
be covered with a dynamic analysis based data race detector such as KCSAN.

For reasons outlined in this `article <https://lwn.net/Articles/793253/>`_,
data races can be much more subtle, but can cause no less harm than high-level
race conditions.

Implementation Details
----------------------

The general approach is inspired by `DataCollider
<http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf>`_.
Unlike DataCollider, KCSAN does not use hardware watchpoints, but instead
relies on compiler instrumentation. Watchpoints are implemented using an
efficient encoding that stores access type, size, and address in a long; the
benefits of using "soft watchpoints" are portability and greater flexibility in
limiting which accesses trigger a watchpoint.
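
One way to picture such an encoding is shown below. This is an illustrative
sketch, not KCSAN's actual layout: it assumes a 64-bit word, uses the top bit
for the access type, the next 7 bits for the size, and the low 56 bits for a
truncated address. All macro and function names are hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Assumed layout of one 64-bit word: [write:1][size:7][address:56]. */
    #define WP_WRITE_BIT   (UINT64_C(1) << 63)
    #define WP_SIZE_SHIFT  56
    #define WP_SIZE_MASK   (UINT64_C(0x7f) << WP_SIZE_SHIFT)
    #define WP_ADDR_MASK   ((UINT64_C(1) << WP_SIZE_SHIFT) - 1)

    /* Pack access type, size, and truncated address into one word. */
    static uint64_t wp_encode(uint64_t addr, size_t size, bool is_write)
    {
        return (is_write ? WP_WRITE_BIT : 0) |
               (((uint64_t)size << WP_SIZE_SHIFT) & WP_SIZE_MASK) |
               (addr & WP_ADDR_MASK);
    }

    /* Unpack a word produced by wp_encode(). */
    static void wp_decode(uint64_t wp, uint64_t *addr, size_t *size,
                          bool *is_write)
    {
        *addr = wp & WP_ADDR_MASK;
        *size = (size_t)((wp & WP_SIZE_MASK) >> WP_SIZE_SHIFT);
        *is_write = (wp & WP_WRITE_BIT) != 0;
    }

Because the whole watchpoint fits in a single word, it can be published and
inspected with single-word atomic operations, which is what makes a lock-free
fast path plausible.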

More specifically, KCSAN requires instrumenting plain (unmarked, non-atomic)
memory operations; for each instrumented plain access:

1. Check if a matching watchpoint exists; if yes, and at least one access is a
   write, then we encountered a racing access.

2. Periodically, if no matching watchpoint exists, set up a watchpoint and
   stall for a small delay.

3. Also check the data value before the delay, and re-check the data value
   after the delay; if the values mismatch, we infer a race of unknown origin.

To detect data races between plain and atomic memory operations, KCSAN also
annotates atomic accesses, but only to check if a watchpoint exists
(``kcsan_check_atomic_*``); i.e. KCSAN never sets up a watchpoint on atomic
accesses.
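
The steps above can be sketched as a single-threaded simulation. This is a
simplified model under stated assumptions, not the kernel implementation:
there is only one watchpoint slot, the delay is replaced by a callback that
stands in for whatever other CPUs or devices do in the meantime, and all
names (``check_plain()``, ``check_atomic()``, ``device_write()``,
``racing_atomic_write()``) are hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    enum check_result { NO_RACE, RACE, RACE_UNKNOWN_ORIGIN };

    /* A single hypothetical watchpoint slot. */
    struct watchpoint {
        bool active;
        bool hit;          /* set by a racing access that found the slot */
        uintptr_t addr;
        size_t size;
        bool is_write;
    };

    static struct watchpoint slot;

    /* Step 1: look for an overlapping watchpoint with a write involved. */
    static bool find_watchpoint(uintptr_t addr, size_t size, bool is_write)
    {
        bool match = slot.active &&
                     addr < slot.addr + slot.size &&
                     slot.addr < addr + size &&
                     (is_write || slot.is_write);

        if (match)
            slot.hit = true;
        return match;
    }

    /* Atomic access: only checks for a watchpoint, never sets one up. */
    static enum check_result check_atomic(void *ptr, size_t size,
                                          bool is_write)
    {
        return find_watchpoint((uintptr_t)ptr, size, is_write) ?
               RACE : NO_RACE;
    }

    /* Plain access: check, then (if sampled) watch across the delay. */
    static enum check_result check_plain(void *ptr, size_t size,
                                         bool is_write, bool sample,
                                         void (*stall)(void *), void *ctx)
    {
        unsigned char before[8];
        enum check_result res = NO_RACE;

        if (find_watchpoint((uintptr_t)ptr, size, is_write))
            return RACE;                           /* step 1 */
        if (!sample || slot.active || size > sizeof(before))
            return NO_RACE;

        slot = (struct watchpoint){ .active = true,
                                    .addr = (uintptr_t)ptr,
                                    .size = size, .is_write = is_write };
        memcpy(before, ptr, size);                 /* step 3: snapshot */
        stall(ctx);                                /* step 2: delay */
        if (slot.hit)
            res = RACE;
        else if (memcmp(before, ptr, size) != 0)
            res = RACE_UNKNOWN_ORIGIN;             /* step 3: re-check */
        slot.active = false;
        return res;
    }

    /* Simulated uninstrumented writer, for example a device doing DMA. */
    static void device_write(void *ctx)
    {
        *(int *)ctx += 1;
    }

    static enum check_result last_seen;

    /* Simulated instrumented atomic write from another CPU. */
    static void racing_atomic_write(void *ctx)
    {
        last_seen = check_atomic(ctx, sizeof(int), true);
    }

Calling ``check_plain()`` on an ``int`` with ``device_write`` as the stall
callback yields ``RACE_UNKNOWN_ORIGIN``, because the value changes without any
instrumented access touching the watchpoint. With ``racing_atomic_write`` both
sides observe ``RACE``, which corresponds to the report type that shows two
stack traces.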

1. **Memory Overhead:** The current implementation uses a small array of longs
   to encode watchpoint information, which is negligible.

2. **Performance Overhead:** KCSAN's runtime aims to be minimal, using an
   efficient watchpoint encoding that does not require acquiring any shared
   locks in the fast-path. For kernel boot on a system with 8 CPUs:

   - 5.0x slow-down with the default KCSAN config;
   - 2.8x slow-down from runtime fast-path overhead only (set very large
     ``KCSAN_SKIP_WATCH`` and unset ``KCSAN_SKIP_WATCH_RANDOMIZE``).

3. **Annotation Overheads:** Minimal annotations are required outside the KCSAN
   runtime. As a result, maintenance overheads are minimal as the kernel
   evolves.

4. **Detects Racy Writes from Devices:** Due to checking data values upon
   setting up watchpoints, racy writes from devices can also be detected.

5. **Memory Ordering:** KCSAN is *not* explicitly aware of the LKMM's ordering
   rules; this may result in missed data races (false negatives).

6. **Analysis Accuracy:** For observed executions, due to using a sampling
   strategy, the analysis is *unsound* (false negatives possible), but aims to
   be complete (no false positives).
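
The sampling behind the performance figures in item 2 can be pictured as a
skip counter: most accesses only decrement a counter, and an occasional one
goes on to set up a watchpoint. The sketch below is an assumption about the
general shape of such a scheme rather than KCSAN's code; ``SKIP_WATCH`` stands
in for ``KCSAN_SKIP_WATCH`` with an assumed value, the counter would be
per-CPU in a real runtime, and no randomization is modelled.

.. code-block:: c

    #include <stdbool.h>

    #define SKIP_WATCH 4000    /* assumed value for illustration */

    static long skip_count = SKIP_WATCH;

    /* Returns true only for the access that should set up a watchpoint. */
    static bool should_watch(void)
    {
        if (--skip_count >= 0)
            return false;      /* fast path: skip this access */
        skip_count = SKIP_WATCH;
        return true;
    }

With this scheme, a very large skip value means that almost every access takes
only the fast path, which is the configuration that the second measurement in
item 2 describes.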

Alternatives Considered
-----------------------

An alternative data race detection approach for the kernel can be found in
`Kernel Thread Sanitizer (KTSAN) <https://github.com/google/ktsan/wiki>`_.
KTSAN is a happens-before data race detector, which explicitly establishes the
happens-before order between memory operations, which can then be used to
determine data races as defined in `Data Races`_. To build a correct
happens-before relation, KTSAN must be aware of all ordering rules of the LKMM
and synchronization primitives. Unfortunately, any omission leads to false
positives, which is especially problematic in the context of the kernel, which
includes numerous custom synchronization mechanisms. Furthermore, KTSAN's
implementation requires metadata for each memory location (shadow memory);
currently, for each page, KTSAN requires 4 pages of shadow memory.