fix missing synchronization in atomic store on i386 and x86_64

despite being strongly ordered, the x86 memory model does not preclude
reordering of loads across earlier stores. while a plain store
suffices as a release barrier, we actually need a full barrier, since
users of a_store subsequently load a waiter count to determine whether
to issue a futex wait, and using a stale count will result in soft
(fail-to-wake) deadlocks. these deadlocks were observed in malloc and
are possible with stdio locks and other libc-internal locking.
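
as an illustration of the store-then-load pattern described above, here is a minimal sketch written with C11 atomics instead of musl's a_store. the struct and the function names (sketch_lock, sketch_unlock, sketch_wait_prepare) are hypothetical and are not musl's actual locking code; the point is only where the full barrier has to sit.

```c
#include <stdatomic.h>

/* hypothetical lock layout for illustration; not musl's real structures */
struct sketch_lock {
	atomic_int locked;   /* 1 while held, 0 when free */
	atomic_int waiters;  /* threads that are sleeping or about to sleep */
};

/* unlock side: returns nonzero if the caller must issue a futex wake */
static int sketch_unlock(struct sketch_lock *l)
{
	/* a release store is enough to publish the critical section... */
	atomic_store_explicit(&l->locked, 0, memory_order_release);
	/* ...but without a full barrier here, the load below could be
	 * satisfied before the store is visible to other cores, so a
	 * waiter that has just registered itself might be missed */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&l->waiters, memory_order_relaxed) != 0;
}

/* waiter side: register as a waiter, then re-check the lock.
 * returns nonzero if the lock still looks held, i.e. the caller
 * would go on to futex-wait */
static int sketch_wait_prepare(struct sketch_lock *l)
{
	atomic_fetch_add(&l->waiters, 1);      /* seq_cst read-modify-write */
	return atomic_load(&l->locked) != 0;   /* seq_cst load */
}
```

with the fence in place, at least one side sees the other: either the unlocker observes the waiter count and wakes, or the waiter observes the lock as free and does not sleep.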

on i386, an atomic operation on the caller's stack is used as the
barrier rather than performing the store itself using xchg; this
avoids the need to read the cache line on which the store is being
performed. mfence is used on x86_64 and x32, where it's always
available, and could be used on i386 with the appropriate cpu model
checks if it's shown to perform better.
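
for comparison, the per-arch variants can be folded into a single function as a sketch. the name sketch_store and the non-x86 fallback are assumptions made for illustration; musl keeps a separate atomic.h per arch, as the diff below shows, and does not use this form.

```c
/* hypothetical combined form of a_store; illustration only */
static inline void sketch_store(volatile int *p, int x)
{
#if defined(__x86_64__)
	/* covers x86_64 and x32: store, then mfence as the full barrier */
	__asm__ __volatile__( "mov %1, %0 ; mfence"
		: "=m"(*p) : "r"(x) : "memory" );
#elif defined(__i386__)
	/* store, then a locked no-op on the caller's stack as the barrier,
	 * so the cache line holding *p does not have to be read */
	__asm__ __volatile__( "movl %1, %0 ; lock ; orl $0,(%%esp)"
		: "=m"(*p) : "r"(x) : "memory" );
#else
	/* assumption: on other archs, use the compiler's seq_cst store
	 * (GCC/Clang __atomic builtin) */
	__atomic_store_n(p, x, __ATOMIC_SEQ_CST);
#endif
}
```

a caller would use it exactly like a_store, for example sketch_store(&lock_word, 0) followed by a load of the waiter count.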
Rich Felker, 9 years ago
commit 3c43c0761e
3 files changed, 3 insertions(+), 3 deletions(-)
 arch/i386/atomic.h   | 2 +-
 arch/x32/atomic.h    | 2 +-
 arch/x86_64/atomic.h | 2 +-

--- a/arch/i386/atomic.h
+++ b/arch/i386/atomic.h

@@ -88,7 +88,7 @@ static inline void a_dec(volatile int *x)
 
 static inline void a_store(volatile int *p, int x)
 {
-	__asm__( "movl %1, %0" : "=m"(*p) : "r"(x) : "memory" );
+	__asm__( "movl %1, %0 ; lock ; orl $0,(%%esp)" : "=m"(*p) : "r"(x) : "memory" );
 }
 
 static inline void a_spin()

--- a/arch/x32/atomic.h
+++ b/arch/x32/atomic.h

@@ -83,7 +83,7 @@ static inline void a_dec(volatile int *x)
 
 static inline void a_store(volatile int *p, int x)
 {
-	__asm__( "mov %1, %0" : "=m"(*p) : "r"(x) : "memory" );
+	__asm__( "mov %1, %0 ; mfence" : "=m"(*p) : "r"(x) : "memory" );
 }
 
 static inline void a_spin()

--- a/arch/x86_64/atomic.h
+++ b/arch/x86_64/atomic.h

@@ -83,7 +83,7 @@ static inline void a_dec(volatile int *x)
 
 static inline void a_store(volatile int *p, int x)
 {
-	__asm__( "mov %1, %0" : "=m"(*p) : "r"(x) : "memory" );
+	__asm__( "mov %1, %0 ; mfence" : "=m"(*p) : "r"(x) : "memory" );
 }
 
 static inline void a_spin()