ARC: add compiler barrier to LLSC based cmpxchg
When auditing cmpxchg call sites, Chuck noted that gcc was optimizing
away some of the desired LDs.
| do {
|	new = old = *ipi_data_ptr;
|	new |= 1U << msg;
| } while (cmpxchg(ipi_data_ptr, old, new) != old);
was generating the code below:
| 8015cef8:  ld     r2,[r4,0]        <-- First LD
| 8015cefc:  bset   r1,r2,r1
|
| 8015cf00:  llock  r3,[r4]          <-- atomic op
| 8015cf04:  brne   r3,r2,8015cf10
| 8015cf08:  scond  r1,[r4]
| 8015cf0c:  bnz    8015cf00
|
| 8015cf10:  brne   r3,r2,8015cf00   <-- Branch doesn't go to orig LD
Although this was fixed by adding an ACCESS_ONCE at this call site, it
seems safer (for now at least) to add a compiler barrier to the LLSC
based cmpxchg itself, so that every caller gets a fresh load on retry.
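
For illustration only, the effect of such a barrier can be sketched in
portable userspace C. This is not the ARC implementation: the names
cmpxchg_sketch() and ipi_set_msg() are hypothetical, and the gcc
__atomic builtin stands in for the llock/scond sequence. The point is
that the empty asm with a "memory" clobber stops gcc from reusing a
value of *ipi_data_ptr cached in a register across the cmpxchg, so each
retry of the loop re-reads memory.

```c
/*
 * Compiler-only barrier: emits no instruction, but tells gcc that memory
 * may have changed, so values held in registers must be reloaded.
 */
#define barrier() __asm__ __volatile__("" ::: "memory")

/*
 * Hypothetical stand-in for an LLSC based cmpxchg (assumption: the gcc
 * builtin replaces the llock/scond loop). Returns the value found at
 * *ptr, which equals 'expected' only when the swap succeeded.
 */
static inline unsigned long
cmpxchg_sketch(volatile unsigned long *ptr, unsigned long expected,
	       unsigned long new)
{
	unsigned long prev = expected;

	/* On failure the builtin writes the observed value into prev. */
	__atomic_compare_exchange_n(ptr, &prev, new, 0,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	barrier();	/* force callers to re-read memory afterwards */
	return prev;
}

/* Same shape as the call site quoted above (hypothetical name). */
static inline void ipi_set_msg(unsigned long *ipi_data_ptr, int msg)
{
	unsigned long old, new;

	do {
		new = old = *ipi_data_ptr;	/* must be reloaded per retry */
		new |= 1UL << msg;
	} while (cmpxchg_sketch(ipi_data_ptr, old, new) != old);
}
```

With the barrier inside the primitive, call sites do not each need their
own ACCESS_ONCE to get the reload.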
Reported-by: Chuck Jordan <cjordan@synopsys.com>
Cc: <stable@vger.kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>