bpf, arm64: use more scalable stadd over ldxr / stxr loop in xadd