a data race: “two+ accesses to the same memory location, at least one access is a write.”

    // Shared variable
    var count = 0

    func incrementCount() {
        if count == 0 {
            count++
        }
    }

    func main() {
        // Spawn two “threads”
        go incrementCount()
        go incrementCount()
    }

[diagram: possible interleavings of g1’s and g2’s reads (R) and writes (W) of count — one serial interleaving (!concurrent, count = 1) and two concurrent ones (count = 2); the concurrent interleavings are a data race]
!data race — with a lock, the two accesses are ordered:

    Thread 1        Thread 2
    lock(l)
    count = 1
    unlock(l)
                    lock(l)
                    count = 2
                    unlock(l)
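The locked interleaving above can be sketched as a runnable Go version of the counter, guarded by a sync.Mutex. This is a hypothetical fix for illustration, not code from the original slides:

```go
// Racy counter made safe: a mutex orders the two goroutines' accesses,
// so the check and the increment execute as one atomic step.
package main

import (
	"fmt"
	"sync"
)

var (
	mu    sync.Mutex
	count = 0
)

func incrementCount() {
	mu.Lock()
	if count == 0 {
		count++
	}
	mu.Unlock()
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			incrementCount()
		}()
	}
	wg.Wait()
	fmt.Println(count) // always 1: no interleaving can observe count mid-update
}
```

With the lock held, `go run -race` reports nothing: the unlock/lock pair is a synchronization edge between the goroutines.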
Data races are easy to introduce in languages like Go:

“Panic messages from unexpected program crashes are often reported on the Go issue tracker. An overwhelming number of these panics are caused by data races, and an overwhelming number of those reports centre around Go’s built-in map type.” — Dave Cheney
go race detector — built into the Go tool chain:

    > go run -race counter.go

• based on the C/C++ ThreadSanitizer dynamic race detection library
• as of August 2015: 1200+ races found in Google’s codebase, ~100 in the Go stdlib, 100+ in Chromium, plus LLVM, GCC, OpenSSL, WebRTC, Firefox
goroutines

• user-space threads; use as you would threads: > go handle_request(r)
• the Go memory model is specified in terms of goroutines
• within a goroutine: reads and writes are ordered
• with multiple goroutines: shared data must be synchronized… else data races!
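As a small illustration of that last bullet, here is a hedged sketch of channel-based synchronization; the names `setup`, `msg`, and `done` are invented for this example:

```go
// A channel send happens-before the corresponding receive, so main's read
// of msg is ordered after the goroutine's write: no data race.
package main

import "fmt"

var msg string

func setup(done chan<- struct{}) {
	msg = "hello"      // write the shared data
	done <- struct{}{} // send: publishes the write to the receiver
}

func main() {
	done := make(chan struct{})
	go setup(done)
	<-done           // receive: ordered after the send
	fmt.Println(msg) // safe read; prints "hello"
}
```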
but when are two accesses “concurrent”?

[slide repeats the counter example and the g1/g2 interleaving diagram, now asking which interleavings are concurrent: !concurrent, concurrent, concurrent]
happens-before orders events

synchronization via locks, or lock-free sync: mu.Unlock(), ch <- a

X ≺ Y if one of:
— X and Y are in the same goroutine
— X and Y are a synchronization pair
— X ≺ E ≺ Y (transitivity, across goroutines)

if X not ≺ Y and Y not ≺ X: X and Y are concurrent!
threadsanitizer

• tracks state for goroutines → ThreadState
• computes happens-before edges at memory accesses and synchronization events → Shadow State, Meta Map
• compares vector clocks to detect data races
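The vector-clock comparison can be sketched in Go as follows; `VectorClock`, `happensBefore`, and `concurrent` are illustrative names for this note, not TSan’s internals:

```go
// Vector clocks: one slot per thread. Event a happens-before event b iff
// every component of a's clock is <= b's, and at least one is strictly less.
// Clocks of equal length are assumed.
package main

import "fmt"

type VectorClock []uint64

func happensBefore(a, b VectorClock) bool {
	strictlyLess := false
	for i := range a {
		if a[i] > b[i] {
			return false // b has not observed this component of a
		}
		if a[i] < b[i] {
			strictlyLess = true
		}
	}
	return strictlyLess
}

// concurrent: unordered in both directions — the race condition TSan looks for.
func concurrent(a, b VectorClock) bool {
	return !happensBefore(a, b) && !happensBefore(b, a)
}

func main() {
	x := VectorClock{1, 0} // event in goroutine 1
	y := VectorClock{2, 1} // goroutine 2's event after synchronizing with x
	z := VectorClock{0, 1} // goroutine 2's event with no synchronization
	fmt.Println(happensBefore(x, y), concurrent(x, z)) // true true
}
```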
each ThreadState holds a fixed-size vector clock (size == max(# threads)), set up when the goroutine is created:

    // proc.go
    func newproc1() {
        if race.Enabled {
            newg.racectx = racegostart(…)
        }
        ...
    }

the compiler instruments memory accesses: count == 0 becomes a raceread(…) call, which
1. checks for a data race with a previous access
2. stores information about this access for future detections
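Conceptually, the instrumentation looks like the sketch below; `raceread`/`racewrite` here are hypothetical logging stubs standing in for the real runtime entry points, not the actual compiler output:

```go
// What -race instrumentation conceptually does: every shared-memory access
// is preceded by a call into the race runtime with the accessed address.
package main

import "fmt"

var accessLog []string

// Hypothetical stand-ins for the runtime's race entry points.
func raceread(addr *int)  { accessLog = append(accessLog, fmt.Sprintf("read %p", addr)) }
func racewrite(addr *int) { accessLog = append(accessLog, fmt.Sprintf("write %p", addr)) }

var count = 0

// incrementCount as the compiler might instrument it under -race.
func incrementCount() {
	raceread(&count) // inserted before the read in "count == 0"
	if count == 0 {
		racewrite(&count) // inserted before the write in "count++"
		count++
	}
}

func main() {
	incrementCount()
	fmt.Println(len(accessLog)) // 2: one read, one write were intercepted
}
```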
compare (accessor’s ThreadState, new shadow word) with each existing shadow word:
• do the access locations overlap? ✓
• are any of the accesses a write? ✓
• are the TIDs different? ✓
• are they unordered by happens-before? ✓
  (g2’s vector clock: (0, 0); the existing shadow word’s clock: (1, ?))

    shadow word   TID  clock  pos  write
    existing      g1   1      0:8  1
    new           g2   0      0:8  0
compare (accessor’s ThreadState, new shadow word) with each existing shadow word: the access locations overlap ✓, one of the accesses is a write ✓, the TIDs differ ✓, and they are unordered by happens-before ✓ — RACE!
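The four checks can be expressed as a single predicate. The `shadowWord` type and `isRace` function below are illustrative, not TSan’s actual representation:

```go
// Race check between an existing shadow word and a new access:
// overlapping bytes, at least one write, different threads, and no
// happens-before edge (the accessor's clock entry for the old thread is
// behind the old access's epoch).
package main

import "fmt"

type shadowWord struct {
	tid      int
	off, len int    // byte range within the 8-byte granule
	write    bool
	epoch    uint64 // the accessing thread's clock value at access time
}

// seenEpoch: what the current accessor's vector clock records for old.tid.
// If seenEpoch >= old.epoch, the old access happens-before the new one.
func isRace(old, cur shadowWord, seenEpoch uint64) bool {
	overlap := old.off < cur.off+cur.len && cur.off < old.off+old.len
	anyWrite := old.write || cur.write
	diffTID := old.tid != cur.tid
	unordered := seenEpoch < old.epoch
	return overlap && anyWrite && diffTID && unordered
}

func main() {
	g1 := shadowWord{tid: 1, off: 0, len: 8, write: true, epoch: 1}
	g2 := shadowWord{tid: 2, off: 0, len: 8, write: true, epoch: 1}
	// g2 never synchronized with g1: its clock entry for g1 is 0 < 1.
	fmt.Println(isRace(g1, g2, 0)) // true: a data race
}
```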
a note (or two)…

• TSan creates a sync-object instance per synchronization primitive (e.g. one per mutex), stored in the meta map region; each has a vector clock used to establish happens-before edges
• it can track your custom sync primitives too, via dynamic annotations!
• TSan tracks file descriptors, memory allocations, etc. too
• slowdown: 5x–15x; memory usage: 5x–10x
• no false positives (only reports “real races”, though they can be benign)
• can miss races! detection depends on the execution trace
I. Static detectors
• have to augment the source with race annotations (−)
• a single detection pass suffices to determine all possible races (+)
• too many false positives to be practical (−)

II. Lockset-based dynamic detectors
• use an algorithm based on the locks held at each access
• more performant than pure happens-before detectors (+)
• do not recognize synchronization via non-locks, like channels (will report them as races) (−)
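For contrast with the happens-before approach, an Eraser-style lockset check can be sketched in Go. This is illustrative only; it is not how the Go detector works:

```go
// Lockset algorithm sketch: each shared variable keeps the intersection of
// the lock sets held across all its accesses. An empty intersection means
// no single lock consistently protects the variable: a potential race.
package main

import "fmt"

type lockset map[string]bool

func intersect(a, b lockset) lockset {
	out := lockset{}
	for l := range a {
		if b[l] {
			out[l] = true
		}
	}
	return out
}

type checker struct {
	candidates map[string]lockset // candidate locks per shared variable
}

// access refines v's candidate lockset with the locks currently held;
// it returns true when the candidate set becomes empty (potential race).
func (c *checker) access(v string, held lockset) bool {
	if cur, ok := c.candidates[v]; ok {
		c.candidates[v] = intersect(cur, held)
	} else {
		c.candidates[v] = held
	}
	return len(c.candidates[v]) == 0
}

func main() {
	c := &checker{candidates: map[string]lockset{}}
	c.access("count", lockset{"mu": true}) // thread 1 accesses count holding mu
	race := c.access("count", lockset{})   // thread 2 accesses holding nothing
	fmt.Println(race) // true: no common lock protects count
}
```

Note how a channel-synchronized program would hold no locks at either access and be flagged here, which is exactly the lockset false positive the slide describes.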
I. Go specifics
• support as per: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01828.html
• x86_64 required
• Linux, OSX, Windows

II. TSan specifics
• LLVM Clang 3.2, gcc 4.8
• x86_64
• requires ASLR, so compile/link with -fPIE, -pie
• maps (using mmap but does not reserve) tons of virtual address space; tools like top/ulimit may not work as expected
• need: gdb -ex 'set disable-randomization off' --args ./a.out due to the ASLR requirement

Deadlock detection? Kernel TSan?