Patches on this branch depend on the branches merged above.
1. General
-----------
- * PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA
- register state and TPIDR2_EL0 are tracked per thread.
+ * PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA and (when
+ present) ZTn register state and TPIDR2_EL0 are tracked per thread.
* The presence of SME is reported to userspace via HWCAP2_SME in the aux vector
AT_HWCAP2 entry. Presence of this flag implies the presence of the SME
instructions and registers, and the Linux-specific system interfaces
described in this document. SME is reported in /proc/cpuinfo as "sme".
+ * The presence of SME2 is reported to userspace via HWCAP2_SME2 in the
+ aux vector AT_HWCAP2 entry. Presence of this flag implies the presence of
+ the SME2 instructions and ZT0, and the Linux-specific system interfaces
+ described in this document. SME2 is reported in /proc/cpuinfo as "sme2".
+
* Support for the execution of SME instructions in userspace can also be
detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS
instruction, and checking that the value of the SME field is nonzero. [3]
HWCAP2_SME_B16F32
HWCAP2_SME_F32F32
HWCAP2_SME_FA64
+ HWCAP2_SME2
This list may be extended over time as the SME architecture evolves.
cpu-feature-registers.txt for details.
* Debuggers should restrict themselves to interacting with the target via the
- NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets. The recommended way
- of detecting support for these regsets is to connect to a target process
+ NT_ARM_SVE, NT_ARM_SSVE, NT_ARM_ZA and NT_ARM_ZT regsets. The recommended
+ way of detecting support for these regsets is to connect to a target process
first and then attempt a
ptrace(PTRACE_GETREGSET, pid, NT_ARM_<regset>, &iov).
-------------------------
* On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
- ZA matrix are preserved.
+ ZA matrix and ZTn (if present) are preserved.
* On syscall PSTATE.SM will be cleared and the SVE registers will be handled
as per the standard SVE ABI.
- * Neither the SVE registers nor ZA are used to pass arguments to or receive
- results from any syscall.
+ * None of the SVE registers, ZA or ZTn are used to pass arguments to
+ or receive results from any syscall.
* On process creation (eg, clone()) the newly created process will have
PSTATE.SM cleared.
* Signal handlers are invoked with streaming mode and ZA disabled.
+* A new signal frame record TPIDR2_MAGIC is added formatted as a struct
+ tpidr2_context to allow access to TPIDR2_EL0 from signal handlers.
+
* A new signal frame record za_context encodes the ZA register contents on
signal delivery. [1]
__reserved[] referencing this space. za_context is then written in the
extra space. Refer to [1] for further details about this mechanism.
+ * If ZTn is supported and PSTATE.ZA==1 then a signal frame record for ZTn will
+ be generated.
+
+ * The signal record for ZTn has magic ZT_MAGIC (0x5a544e01) and consists of a
+ standard signal frame header followed by a struct zt_context specifying
+ the number of ZTn registers supported by the system, then zt_context.nregs
+ blocks of 64 bytes of data per register.
+
5. Signal return
-----------------
the signal frame does not match the current vector length, the signal return
attempt is treated as illegal, resulting in a forced SIGSEGV.
+ * If ZTn is not supported or PSTATE.ZA==0 then it is illegal to have a
+ signal frame record for ZTn, resulting in a forced SIGSEGV.
+
6. prctl extensions
--------------------
vector length that will be applied at the next execve() by the calling
thread.
- * Changing the vector length causes all of ZA, P0..P15, FFR and all bits of
- Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
+ * Changing the vector length causes all of ZA, ZTn, P0..P15, FFR and all
+ bits of Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
unspecified, including both streaming and non-streaming SVE state.
Calling PR_SME_SET_VL with vl equal to the thread's current vector
length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
* The effect of writing a partial, incomplete payload is unspecified.
+ * A new regset NT_ARM_ZT is defined for access to ZTn state via
+ PTRACE_GETREGSET and PTRACE_SETREGSET.
+
+ * The NT_ARM_ZT regset consists of a single 512 bit register.
+
+ * When PSTATE.ZA==0 reads of NT_ARM_ZT will report all bits of ZTn as 0.
+
+ * Writes to NT_ARM_ZT will set PSTATE.ZA to 1.
+
8. ELF coredump extensions
---------------------------
been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread
when the coredump was generated.
+ * A NT_ARM_ZT note will be added to each coredump for each thread of the
+ dumped process. The contents will be equivalent to the data that would have
+ been read if a PTRACE_GETREGSET of NT_ARM_ZT were executed for each thread
+ when the coredump was generated.
+
* The NT_ARM_TLS note will be extended to two registers, the second register
will contain TPIDR2_EL0 on systems that support SME and will be read as
zero with writes ignored otherwise.
For best system performance it is strongly encouraged for software to enable
ZA only when it is actively being used.
+ * A new ZT0 register is introduced when SME2 is present. This is a 512 bit
+ register which is accessible when PSTATE.ZA is set, as ZA itself is.
+
* Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and
SMSTOP instructions or by access to the SVCR system register:
#define SVE_SIG_FLAG_SM 0x1 /* Context describes streaming mode */
+/* TPIDR2_EL0 context */
+#define TPIDR2_MAGIC 0x54504902
+
+struct tpidr2_context {
+ struct _aarch64_ctx head;
+ __u64 tpidr2;
+};
+
#define ZA_MAGIC 0x54366345
struct za_context {
__u16 __reserved[3];
};
+ #define ZT_MAGIC 0x5a544e01
+
+ struct zt_context {
+ struct _aarch64_ctx head;
+ __u16 nregs;
+ __u16 __reserved[3];
+ };
+
#endif /* !__ASSEMBLY__ */
#include <asm/sve_context.h>
#define ZA_SIG_CONTEXT_SIZE(vq) \
(ZA_SIG_REGS_OFFSET + ZA_SIG_REGS_SIZE(vq))
+ #define ZT_SIG_REG_SIZE 512
+
+ #define ZT_SIG_REG_BYTES (ZT_SIG_REG_SIZE / 8)
+
+ #define ZT_SIG_REGS_OFFSET sizeof(struct zt_context)
+
+ #define ZT_SIG_REGS_SIZE(n) (ZT_SIG_REG_BYTES * n)
+
+ #define ZT_SIG_CONTEXT_SIZE(n) \
+ (sizeof(struct zt_context) + ZT_SIG_REGS_SIZE(n))
+
#endif /* _UAPI__ASM_SIGCONTEXT_H */
unsigned long fpsimd_offset;
unsigned long esr_offset;
unsigned long sve_offset;
+ unsigned long tpidr2_offset;
unsigned long za_offset;
+ unsigned long zt_offset;
unsigned long extra_offset;
unsigned long end_offset;
};
struct user_ctxs {
struct fpsimd_context __user *fpsimd;
struct sve_context __user *sve;
+ struct tpidr2_context __user *tpidr2;
struct za_context __user *za;
+ struct zt_context __user *zt;
};
#ifdef CONFIG_ARM64_SVE
#ifdef CONFIG_ARM64_SME
+static int preserve_tpidr2_context(struct tpidr2_context __user *ctx)
+{
+ int err = 0;
+
+ current->thread.tpidr2_el0 = read_sysreg_s(SYS_TPIDR2_EL0);
+
+ __put_user_error(TPIDR2_MAGIC, &ctx->head.magic, err);
+ __put_user_error(sizeof(*ctx), &ctx->head.size, err);
+ __put_user_error(current->thread.tpidr2_el0, &ctx->tpidr2, err);
+
+ return err;
+}
+
+static int restore_tpidr2_context(struct user_ctxs *user)
+{
+ u64 tpidr2_el0;
+ int err = 0;
+
+ /* Magic and size were validated deciding to call this function */
+ __get_user_error(tpidr2_el0, &user->tpidr2->tpidr2, err);
+ if (!err)
+ current->thread.tpidr2_el0 = tpidr2_el0;
+
+ return err;
+}
+
static int preserve_za_context(struct za_context __user *ctx)
{
int err = 0;
* fpsimd_signal_preserve_current_state().
*/
err |= __copy_to_user((char __user *)ctx + ZA_SIG_REGS_OFFSET,
- current->thread.za_state,
+ current->thread.sme_state,
ZA_SIG_REGS_SIZE(vq));
}
/*
* Careful: we are about __copy_from_user() directly into
- * thread.za_state with preemption enabled, so protection is
+ * thread.sme_state with preemption enabled, so protection is
* needed to prevent a racing context switch from writing stale
* registers back over the new data.
*/
/* From now, fpsimd_thread_switch() won't touch thread.sve_state */
sme_alloc(current);
- if (!current->thread.za_state) {
+ if (!current->thread.sme_state) {
current->thread.svcr &= ~SVCR_ZA_MASK;
clear_thread_flag(TIF_SME);
return -ENOMEM;
}
- err = __copy_from_user(current->thread.za_state,
+ err = __copy_from_user(current->thread.sme_state,
(char __user const *)user->za +
ZA_SIG_REGS_OFFSET,
ZA_SIG_REGS_SIZE(vq));
return 0;
}
+
+ static int preserve_zt_context(struct zt_context __user *ctx)
+ {
+ int err = 0;
+ u16 reserved[ARRAY_SIZE(ctx->__reserved)];
+
+ if (WARN_ON(!thread_za_enabled(¤t->thread)))
+ return -EINVAL;
+
+ memset(reserved, 0, sizeof(reserved));
+
+ __put_user_error(ZT_MAGIC, &ctx->head.magic, err);
+ __put_user_error(round_up(ZT_SIG_CONTEXT_SIZE(1), 16),
+ &ctx->head.size, err);
+ __put_user_error(1, &ctx->nregs, err);
+ BUILD_BUG_ON(sizeof(ctx->__reserved) != sizeof(reserved));
+ err |= __copy_to_user(&ctx->__reserved, reserved, sizeof(reserved));
+
+ /*
+ * This assumes that the ZT state has already been saved to
+ * the task struct by calling the function
+ * fpsimd_signal_preserve_current_state().
+ */
+ err |= __copy_to_user((char __user *)ctx + ZT_SIG_REGS_OFFSET,
+ thread_zt_state(¤t->thread),
+ ZT_SIG_REGS_SIZE(1));
+
+ return err ? -EFAULT : 0;
+ }
+
+ static int restore_zt_context(struct user_ctxs *user)
+ {
+ int err;
+ struct zt_context zt;
+
+ /* ZA must be restored first for this check to be valid */
+ if (!thread_za_enabled(¤t->thread))
+ return -EINVAL;
+
+ if (__copy_from_user(&zt, user->zt, sizeof(zt)))
+ return -EFAULT;
+
+ if (zt.nregs != 1)
+ return -EINVAL;
+
+ if (zt.head.size != ZT_SIG_CONTEXT_SIZE(zt.nregs))
+ return -EINVAL;
+
+ /*
+ * Careful: we are about __copy_from_user() directly into
+ * thread.zt_state with preemption enabled, so protection is
+ * needed to prevent a racing context switch from writing stale
+ * registers back over the new data.
+ */
+
+ fpsimd_flush_task_state(current);
+ /* From now, fpsimd_thread_switch() won't touch ZT in thread state */
+
+ err = __copy_from_user(thread_zt_state(¤t->thread),
+ (char __user const *)user->zt +
+ ZT_SIG_REGS_OFFSET,
+ ZT_SIG_REGS_SIZE(1));
+ if (err)
+ return -EFAULT;
+
+ return 0;
+ }
+
#else /* ! CONFIG_ARM64_SME */
/* Turn any non-optimised out attempts to use these into a link error: */
+extern int preserve_tpidr2_context(void __user *ctx);
+extern int restore_tpidr2_context(struct user_ctxs *user);
extern int preserve_za_context(void __user *ctx);
extern int restore_za_context(struct user_ctxs *user);
+ extern int preserve_zt_context(void __user *ctx);
+ extern int restore_zt_context(struct user_ctxs *user);
#endif /* ! CONFIG_ARM64_SME */
user->fpsimd = NULL;
user->sve = NULL;
+ user->tpidr2 = NULL;
user->za = NULL;
+ user->zt = NULL;
if (!IS_ALIGNED((unsigned long)base, 16))
goto invalid;
user->sve = (struct sve_context __user *)head;
break;
+ case TPIDR2_MAGIC:
+ if (!system_supports_sme())
+ goto invalid;
+
+ if (user->tpidr2)
+ goto invalid;
+
+ if (size != sizeof(*user->tpidr2))
+ goto invalid;
+
+ user->tpidr2 = (struct tpidr2_context __user *)head;
+ break;
+
case ZA_MAGIC:
if (!system_supports_sme())
goto invalid;
user->za = (struct za_context __user *)head;
break;
+ case ZT_MAGIC:
+ if (!system_supports_sme2())
+ goto invalid;
+
+ if (user->zt)
+ goto invalid;
+
+ if (size < sizeof(*user->zt))
+ goto invalid;
+
+ user->zt = (struct zt_context __user *)head;
+ break;
+
case EXTRA_MAGIC:
if (have_extra_context)
goto invalid;
err = restore_fpsimd_context(user.fpsimd);
}
+ if (err == 0 && system_supports_sme() && user.tpidr2)
+ err = restore_tpidr2_context(&user);
+
if (err == 0 && system_supports_sme() && user.za)
err = restore_za_context(&user);
+ if (err == 0 && system_supports_sme2() && user.zt)
+ err = restore_zt_context(&user);
+
return err;
}
else
vl = task_get_sme_vl(current);
+ err = sigframe_alloc(user, &user->tpidr2_offset,
+ sizeof(struct tpidr2_context));
+ if (err)
+ return err;
+
if (thread_za_enabled(¤t->thread))
vq = sve_vq_from_vl(vl);
return err;
}
+ if (system_supports_sme2()) {
+ if (add_all || thread_za_enabled(¤t->thread)) {
+ err = sigframe_alloc(user, &user->zt_offset,
+ ZT_SIG_CONTEXT_SIZE(1));
+ if (err)
+ return err;
+ }
+ }
+
return sigframe_alloc_end(user);
}
err |= preserve_sve_context(sve_ctx);
}
+ /* TPIDR2 if supported */
+ if (system_supports_sme() && err == 0) {
+ struct tpidr2_context __user *tpidr2_ctx =
+ apply_user_offset(user, user->tpidr2_offset);
+ err |= preserve_tpidr2_context(tpidr2_ctx);
+ }
+
/* ZA state if present */
if (system_supports_sme() && err == 0 && user->za_offset) {
struct za_context __user *za_ctx =
err |= preserve_za_context(za_ctx);
}
+ /* ZT state if present */
+ if (system_supports_sme2() && err == 0 && user->zt_offset) {
+ struct zt_context __user *zt_ctx =
+ apply_user_offset(user, user->zt_offset);
+ err |= preserve_zt_context(zt_ctx);
+ }
+
if (err == 0 && user->extra_offset) {
char __user *sfp = (char __user *)user->sigframe;
char __user *userp =
sme_*
ssve_*
sve_*
+tpidr2_siginfo
za_*
+ zt_*
!*.[ch]
return true;
}
+ bool validate_zt_context(struct zt_context *zt, char **err)
+ {
+ if (!zt || !err)
+ return false;
+
+ /* If the context is present there should be at least one register */
+ if (zt->nregs == 0) {
+ *err = "no registers";
+ return false;
+ }
+
+ /* Size should agree with the number of registers */
+ if (zt->head.size != ZT_SIG_CONTEXT_SIZE(zt->nregs)) {
+ *err = "register count does not match size";
+ return false;
+ }
+
+ return true;
+ }
+
bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err)
{
bool terminated = false;
struct extra_context *extra = NULL;
struct sve_context *sve = NULL;
struct za_context *za = NULL;
+ struct zt_context *zt = NULL;
struct _aarch64_ctx *head =
(struct _aarch64_ctx *)uc->uc_mcontext.__reserved;
void *extra_data = NULL;
if (head->size != sizeof(struct esr_context))
*err = "Bad size for esr_context";
break;
+ case TPIDR2_MAGIC:
+ if (head->size != sizeof(struct tpidr2_context))
+ *err = "Bad size for tpidr2_context";
+ break;
case SVE_MAGIC:
if (flags & SVE_CTX)
*err = "Multiple SVE_MAGIC";
za = (struct za_context *)head;
new_flags |= ZA_CTX;
break;
+ case ZT_MAGIC:
+ if (flags & ZT_CTX)
+ *err = "Multiple ZT_MAGIC";
+ /* Size is validated in validate_za_context() */
+ zt = (struct zt_context *)head;
+ new_flags |= ZT_CTX;
+ break;
case EXTRA_MAGIC:
if (flags & EXTRA_CTX)
*err = "Multiple EXTRA_MAGIC";
if (new_flags & ZA_CTX)
if (!validate_za_context(za, err))
return false;
+ if (new_flags & ZT_CTX)
+ if (!validate_zt_context(zt, err))
+ return false;
flags |= new_flags;
return false;
}
+ if (terminated && (flags & ZT_CTX) && !(flags & ZA_CTX)) {
+ *err = "ZT context but no ZA context";
+ return false;
+ }
+
return true;
}