linux-2.6-microblaze.git
6 months ago  crypto: sahara - use 'time_left' variable with wait_for_completion_timeout()
Wolfram Sang [Tue, 30 Apr 2024 12:15:51 +0000 (14:15 +0200)]
crypto: sahara - use 'time_left' variable with wait_for_completion_timeout()

There is a confusing pattern in the kernel of using a variable named 'timeout' to
store the result of wait_for_completion_timeout(), causing patterns like:

timeout = wait_for_completion_timeout(...)
if (!timeout) return -ETIMEDOUT;

with all kinds of permutations. Use 'time_left' as the variable name to make the
code self-explanatory.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
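
A minimal sketch of the renamed pattern, assuming a completion 'comp' and a
placeholder TIMEOUT_MS (both names illustrative); the killable variant in the
next patch follows the same shape:

  unsigned long time_left;

  time_left = wait_for_completion_timeout(&comp,
                                          msecs_to_jiffies(TIMEOUT_MS));
  if (!time_left)
          return -ETIMEDOUT;      /* reads naturally: no time left, we timed out */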
6 months ago  crypto: api - use 'time_left' variable with wait_for_completion_killable_timeout()
Wolfram Sang [Tue, 30 Apr 2024 12:14:42 +0000 (14:14 +0200)]
crypto: api - use 'time_left' variable with wait_for_completion_killable_timeout()

There is a confusing pattern in the kernel of using a variable named 'timeout' to
store the result of wait_for_completion_killable_timeout(), causing patterns like:

timeout = wait_for_completion_killable_timeout(...)
if (!timeout) return -ETIMEDOUT;

with all kinds of permutations. Use 'time_left' as the variable name to make the
code self-explanatory.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: caam - i.MX8ULP does not have CAAM page0 access
Pankaj Gupta [Mon, 29 Apr 2024 06:28:54 +0000 (11:58 +0530)]
crypto: caam - i.MX8ULP does not have CAAM page0 access

i.MX8ULP has a secure-enclave hardware IP called the EdgeLock Enclave (ELE),
which controls access to the CAAM controller's register page, i.e., page0.

If the ELE releases access to the CAAM controller's register page at all,
it releases it to the secure world only.

Clocks are turned on automatically for i.MX8ULP. There is a CAAM clock
gating bit, but it is not advised to gate the clock from Linux, as
OP-TEE OS or any other entity might be using it.

Signed-off-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com>
Reviewed-by: Horia Geanta <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: caam - init-clk based on caam-page0-access
Pankaj Gupta [Mon, 29 Apr 2024 06:28:52 +0000 (11:58 +0530)]
crypto: caam - init-clk based on caam-page0-access

CAAM clock initialization is done on the basis of SoC-specific
info stored in struct caam_imx_data:
- caam-page0-access flag
- num_clks

The CAAM driver needs to be aware of the access rights to the CAAM control
page, i.e., page0, to do things differently.

Signed-off-by: Pankaj Gupta <pankaj.gupta@nxp.com>
Reviewed-by: Gaurav Jain <gaurav.jain@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
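
A hedged sketch of the per-SoC data shape this describes; the field names are
illustrative and may not match the driver's exact layout:

  struct caam_imx_data {
          bool page0_access;                /* may Linux touch CAAM page0? */
          const struct clk_bulk_data *clks; /* NULL when clocks are managed elsewhere */
          int num_clks;
  };

On i.MX8ULP the flag would be false and num_clks zero, so the driver skips
both clock init and page0 register setup.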
6 months ago  crypto: starfive - Use fallback for unaligned dma access
Jia Jie Ho [Mon, 29 Apr 2024 06:06:40 +0000 (14:06 +0800)]
crypto: starfive - Use fallback for unaligned dma access

DMA address mapping fails on unaligned scatterlist offsets. Use the
software fallback for these cases.

Signed-off-by: Jia Jie Ho <jiajie.ho@starfivetech.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: starfive - Do not free stack buffer
Jia Jie Ho [Mon, 29 Apr 2024 06:06:39 +0000 (14:06 +0800)]
crypto: starfive - Do not free stack buffer

RSA text data uses a variable-length buffer allocated on the stack.
Calling kfree() on it causes undefined behaviour in subsequent operations.

Cc: <stable@vger.kernel.org> #6.7+
Signed-off-by: Jia Jie Ho <jiajie.ho@starfivetech.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: starfive - Skip unneeded fallback allocation
Jia Jie Ho [Mon, 29 Apr 2024 06:06:38 +0000 (14:06 +0800)]
crypto: starfive - Skip unneeded fallback allocation

Skip the software fallback allocation if the RSA module failed to get a
device handle.

Signed-off-by: Jia Jie Ho <jiajie.ho@starfivetech.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: starfive - Skip dma setup for zeroed message
Jia Jie Ho [Mon, 29 Apr 2024 06:06:37 +0000 (14:06 +0800)]
crypto: starfive - Skip dma setup for zeroed message

Skip DMA setup and mapping in the AES driver if the plaintext is empty.

Signed-off-by: Jia Jie Ho <jiajie.ho@starfivetech.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: hisilicon/sec2 - fix for register offset
Wenkai Lin [Tue, 23 Apr 2024 01:19:22 +0000 (09:19 +0800)]
crypto: hisilicon/sec2 - fix for register offset

The offset of SEC_CORE_ENABLE_BITMAP should be 0 instead of 32; the
wrong offset causes a KASAN shift-out-of-bounds warning. Fix it.

Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: hisilicon/debugfs - mask the unnecessary info from the dump
Chenghai Huang [Tue, 23 Apr 2024 01:19:21 +0000 (09:19 +0800)]
crypto: hisilicon/debugfs - mask the unnecessary info from the dump

Some information shown by the dump function is invalid. Mask
the unnecessary information from the dump file.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
6 months ago  crypto: qat - specify firmware files for 402xx
Giovanni Cabiddu [Mon, 22 Apr 2024 14:13:17 +0000 (15:13 +0100)]
crypto: qat - specify firmware files for 402xx

The 4xxx driver can probe 4xxx and 402xx devices. However, the driver
only specifies the firmware images required for 4xxx.
This might result in external tools failing to include these binaries,
when required, in the initramfs.

Specify the firmware images used by 402xx with the MODULE_FIRMWARE()
macro in the 4xxx driver.

Fixes: a3e8c919b993 ("crypto: qat - add support for 402xx devices")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Damian Muszynski <damian.muszynski@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
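
The mechanism is the standard MODULE_FIRMWARE() annotation; a sketch of the
addition, with the exact 402xx filenames assumed for illustration:

  /* declare the 402xx images so initramfs tooling can pick them up */
  MODULE_FIRMWARE("qat_402xx.bin");
  MODULE_FIRMWARE("qat_402xx_mmp.bin");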
7 months ago  crypto: x86/aes-gcm - simplify GCM hash subkey derivation
Eric Biggers [Sat, 20 Apr 2024 18:20:16 +0000 (11:20 -0700)]
crypto: x86/aes-gcm - simplify GCM hash subkey derivation

Remove a redundant expansion of the AES key, and use rodata for zeroes.
Also rename rfc4106_set_hash_subkey() to aes_gcm_derive_hash_subkey()
because it's used for both versions of AES-GCM, not just RFC4106.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aes-gcm - delete unused GCM assembly code
Eric Biggers [Sat, 20 Apr 2024 05:56:42 +0000 (22:56 -0700)]
crypto: x86/aes-gcm - delete unused GCM assembly code

Delete aesni_gcm_enc() and aesni_gcm_dec() because they are unused.
Only the incremental AES-GCM functions (aesni_gcm_init(),
aesni_gcm_enc_update(), aesni_gcm_finalize()) are actually used.

This saves 17 KB of object code.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aes-xts - simplify loop in xts_crypt_slowpath()
Eric Biggers [Sat, 20 Apr 2024 05:54:55 +0000 (22:54 -0700)]
crypto: x86/aes-xts - simplify loop in xts_crypt_slowpath()

Since the total length processed by the loop in xts_crypt_slowpath() is
a multiple of AES_BLOCK_SIZE, just round the length down to
AES_BLOCK_SIZE even on the last step.  This doesn't change behavior, as
the last step will process a multiple of AES_BLOCK_SIZE regardless.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  hwrng: stm32 - repair clock handling
Marek Vasut [Fri, 19 Apr 2024 05:01:14 +0000 (07:01 +0200)]
hwrng: stm32 - repair clock handling

The clock management in this driver does not seem to be correct. The
struct hwrng .init callback enables the clock, but there is no matching
.cleanup callback to disable the clock. The clock gets disabled at some
later point by the runtime PM suspend callback.

Furthermore, both the runtime PM and sleep suspend callbacks access registers
first and only then disable the clock that is used for register access. If the
IP is already in RPM suspend and the system enters sleep state, the sleep
callback will attempt to access registers while the register clock is
already disabled. This bug has been fixed once before already in commit
9bae54942b13 ("hwrng: stm32 - fix pm_suspend issue"), and regressed in
commit ff4e46104f2e ("hwrng: stm32 - rework power management sequences").

Fix this slightly differently: disable the register clock at the end of the
.init callback, so the IP is disabled after .init. On every access to the
IP, which really is only stm32_rng_read(), call pm_runtime_get_sync(), which
is already done in stm32_rng_read(), to bring the IP out of RPM suspend, and
pm_runtime_mark_last_busy()/pm_runtime_put_sync_autosuspend() to put it
back into RPM suspend.

Change the sleep suspend/resume callbacks to enable and disable the register
clock around register access, as those cannot use the RPM suspend/resume
callbacks due to slightly different initialization in those sleep callbacks.
This way, register access is always performed with the clock enabled.

Fixes: ff4e46104f2e ("hwrng: stm32 - rework power management sequences")
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
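
A sketch of the resulting read path, assuming a private struct with a dev
pointer (names and error handling abbreviated, not the literal driver code):

  static int stm32_rng_read(struct hwrng *rng, void *data, size_t max, bool wait)
  {
          struct stm32_rng_private *priv =
                  container_of(rng, struct stm32_rng_private, rng);
          int retval = 0;

          pm_runtime_get_sync(priv->dev);         /* resume: clock back on */

          /* ... read the data register with the clock guaranteed enabled ... */

          pm_runtime_mark_last_busy(priv->dev);
          pm_runtime_put_sync_autosuspend(priv->dev); /* back to RPM suspend */

          return retval;
  }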
7 months ago  hwrng: stm32 - put IP into RPM suspend on failure
Marek Vasut [Fri, 19 Apr 2024 05:01:13 +0000 (07:01 +0200)]
hwrng: stm32 - put IP into RPM suspend on failure

In case of an irrecoverable failure, put the IP into RPM suspend
to avoid RPM imbalance. I did not trigger this case, but it seems
it should be done based on reading the code.

Fixes: b17bc6eb7c2b ("hwrng: stm32 - rework error handling in stm32_rng_read()")
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  hwrng: stm32 - use logical OR in conditional
Marek Vasut [Fri, 19 Apr 2024 05:01:12 +0000 (07:01 +0200)]
hwrng: stm32 - use logical OR in conditional

The conditional is used to check whether err is non-zero OR whether the
reg variable is non-zero after clearing bits from it. This should be
done using a logical OR, not a bitwise OR. Fix it.

Fixes: 6b85a7e141cb ("hwrng: stm32 - implement STM32MP13x support")
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
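
In miniature, with RNG_SR_MASK standing in as a hypothetical name for the
bits being cleared:

  /* before: bitwise OR mixes an error code with a bit pattern */
  if (err | (reg & ~RNG_SR_MASK))
          ...

  /* after: logical OR states the intent directly */
  if (err || (reg & ~RNG_SR_MASK))
          ...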
7 months ago  crypto: ecdh - Initialize ctx->private_key in proper byte order
Stefan Berger [Thu, 18 Apr 2024 15:24:45 +0000 (11:24 -0400)]
crypto: ecdh - Initialize ctx->private_key in proper byte order

The private key in ctx->private_key is currently initialized in reverse
byte order in ecdh_set_secret and whenever the key is needed in proper
byte order the variable priv is introduced and the bytes from
ctx->private_key are copied into priv while being byte-swapped
(ecc_swap_digits). To get rid of the unnecessary byte swapping initialize
ctx->private_key in proper byte order and clean up all functions that were
previously using priv or were called with ctx->private_key:

- ecc_gen_privkey: Directly initialize the passed ctx->private_key with
  random bytes filling all the digits of the private key. Get rid of the
  priv variable. This function only has ecdh_set_secret as a caller to
  create NIST P192/256/384 private keys.

- crypto_ecdh_shared_secret: Called only from ecdh_compute_value with
  ctx->private_key. Get rid of the priv variable and work with the passed
  private_key directly.

- ecc_make_pub_key: Called only from ecdh_compute_value with
  ctx->private_key. Get rid of the priv variable and work with the passed
  private_key directly.

Cc: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecdh - Pass private key in proper byte order to check valid key
Stefan Berger [Thu, 18 Apr 2024 15:24:44 +0000 (11:24 -0400)]
crypto: ecdh - Pass private key in proper byte order to check valid key

ecc_is_key_valid expects a key with the most significant digit in the last
entry of the digit array. Currently ecdh_set_secret passes a reversed key
to ecc_is_key_valid, which then passes the rather simple test of checking
whether the private key is in the range [2, n-3]. For all currently ecdh-
supported curves (NIST P192/256/384) the 'n' parameter is a rather large
number, so a reversed key easily passes this test.

Throughout the ecdh and ecc codebase the variable 'priv' is used for a
private key holding the bytes in proper byte order. Therefore, introduce
priv in ecdh_set_secret and copy the bytes from ctx->private_key into
priv in proper byte order by using ecc_swap_digits. Pass priv to
ecc_is_key_valid.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
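
Roughly, the set_secret path then looks like this (a sketch; the helper
signatures are approximations and error handling is elided):

  u64 priv[ECC_MAX_DIGITS];

  /* ctx->private_key holds the raw big-endian bytes; build the
   * least-significant-digit-first array ecc_is_key_valid() expects */
  ecc_swap_digits(ctx->private_key, priv, ctx->ndigits);
  ret = ecc_is_key_valid(ctx->curve_id, ctx->ndigits, priv, params.key_size);
  memzero_explicit(priv, sizeof(priv));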
7 months ago  crypto: tegra - Fix some error codes
Dan Carpenter [Wed, 17 Apr 2024 18:12:32 +0000 (21:12 +0300)]
crypto: tegra - Fix some error codes

Return negative -ENOMEM instead of positive ENOMEM.

Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Akhil R <akhilrajeev@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
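
The convention in miniature:

  buf = devm_kzalloc(dev, size, GFP_KERNEL);
  if (!buf)
          return -ENOMEM;   /* negative errno; a positive ENOMEM would look
                             * like a bogus success value to callers */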
7 months ago  dt-bindings: crypto: starfive: Restore sort order
Geert Uytterhoeven [Tue, 16 Apr 2024 15:51:49 +0000 (17:51 +0200)]
dt-bindings: crypto: starfive: Restore sort order

Restore alphabetical sort order of the list of supported compatible
values.

Fixes: 2ccf7a5d9c50f3ea ("dt-bindings: crypto: starfive: Add jh8100 support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: qat - validate slices count returned by FW
Lucas Segarra Fernandez [Tue, 16 Apr 2024 10:33:37 +0000 (12:33 +0200)]
crypto: qat - validate slices count returned by FW

The function adf_send_admin_tl_start() enables the telemetry (TL)
feature on a QAT device by sending the ICP_QAT_FW_TL_START message to
the firmware. This triggers the FW to start writing TL data to a DMA
buffer in memory and returns an array containing the number of
accelerators of each type (slices) supported by this HW.
A pointer to this array is stored in the slice_cnt field of the
adf_tl_hw_data structure.

The slice_cnt array is then used in the function tl_print_dev_data()
to report in debugfs only statistics about the supported accelerators.
An incorrect value of the elements in slice_cnt might lead to an
out-of-bounds memory read.
At the moment, there is no FW implementation that returns a wrong
value, but for robustness validate the slice count array returned by
the FW.

Fixes: 69e7649f7cc2 ("crypto: qat - add support for device telemetry")
Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com>
Reviewed-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: aead,cipher - zeroize key buffer after use
Hailey Mothershead [Mon, 15 Apr 2024 22:19:15 +0000 (22:19 +0000)]
crypto: aead,cipher - zeroize key buffer after use

I.G 9.7.B for FIPS 140-3 specifies that variables temporarily holding
cryptographic information should be zeroized once they are no longer
needed. Accomplish this by using kfree_sensitive for buffers that
previously held the private key.

Signed-off-by: Hailey Mothershead <hailmo@amazon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
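
The pattern, sketched for a temporary copy of key material (names are
illustrative):

  u8 *tmp = kmemdup(key, keylen, GFP_KERNEL);
  if (!tmp)
          return -ENOMEM;

  /* ... use tmp during setkey processing ... */

  kfree_sensitive(tmp);   /* memzero_explicit() followed by kfree() */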
7 months ago  crypto: arm64/aes-ce - Simplify round key load sequence
Ard Biesheuvel [Mon, 15 Apr 2024 13:04:26 +0000 (15:04 +0200)]
crypto: arm64/aes-ce - Simplify round key load sequence

Tweak the round key logic so that the keys can be loaded using a single
branchless sequence of overlapping loads. This is shorter and
simpler, and puts the conditional branches based on the key size further
apart, which might benefit microarchitectures that cannot record taken
branches at every instruction. For these branches, use test-bit-branch
instructions that don't clobber the condition flags.

Note that none of this has any impact on performance, positive or
otherwise (and the branch prediction benefit would only benefit AES-192
which nobody uses). It does make for nicer code, though.

While at it, use \@ to generate the labels inside the macros, which is
more robust than using fixed numbers, which could clash inadvertently.
Also, bring aes-neon.S in line with these changes, including the switch
to test-and-branch instructions, to avoid surprises in the future when
we might start relying on the condition flags being preserved in the
chaining mode wrappers in aes-modes.S.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: tegra - Convert to platform remove callback returning void
Uwe Kleine-König [Mon, 15 Apr 2024 07:34:21 +0000 (09:34 +0200)]
crypto: tegra - Convert to platform remove callback returning void

The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
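
In outline, the conversion is mechanical (a generic sketch of this driver's
remove path, function names assumed for illustration):

  -static int tegra_se_remove(struct platform_device *pdev)
  +static void tegra_se_remove(struct platform_device *pdev)
   {
          ...
  -       return 0;
   }

   static struct platform_driver tegra_se_driver = {
  -       .remove = tegra_se_remove,
  +       .remove_new = tegra_se_remove,
          ...
   };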
7 months ago  crypto: x86/aes-xts - optimize size of instructions operating on lengths
Eric Biggers [Sat, 13 Apr 2024 03:17:28 +0000 (20:17 -0700)]
crypto: x86/aes-xts - optimize size of instructions operating on lengths

x86_64 has the "interesting" property that the instruction size is
generally a bit shorter for instructions that operate on the 32-bit (or
less) part of registers, or registers that are in the original set of 8.

This patch adjusts the AES-XTS code to take advantage of that property
by changing the LEN parameter from size_t to unsigned int (which is all
that's needed and is what the non-AVX implementation uses) and using the
%eax register for KEYLEN.

This decreases the size of aes-xts-avx-x86_64.o by 1.2%.

Note that changing the kmovq to kmovd was going to be needed anyway to
make the AVX10/256 code really work on CPUs that don't support 512-bit
vectors (since the AVX10 spec says that 64-bit opmask instructions will
only be supported on processors that support 512-bit vectors).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aes-xts - eliminate a few more instructions
Eric Biggers [Sat, 13 Apr 2024 03:17:27 +0000 (20:17 -0700)]
crypto: x86/aes-xts - eliminate a few more instructions

- For conditionally subtracting 16 from LEN when decrypting a message
  whose length isn't a multiple of 16, use the cmovnz instruction.

- Fold the addition of 4*VL to LEN into the sub of VL or 16 from LEN.

- Remove an unnecessary test instruction.

This results in slightly shorter code, both source and binary.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aes-xts - handle AES-128 and AES-192 more efficiently
Eric Biggers [Sat, 13 Apr 2024 03:17:26 +0000 (20:17 -0700)]
crypto: x86/aes-xts - handle AES-128 and AES-192 more efficiently

Decrease the amount of code specific to the different AES variants by
"right-aligning" the sequence of round keys, and for AES-128 and AES-192
just skipping irrelevant rounds at the beginning.

This shrinks the size of aes-xts-avx-x86_64.o by 13.3%, and it improves
the efficiency of AES-128 and AES-192.  The tradeoff is that for AES-256
some additional not-taken conditional jumps are now executed.  But these
are predicted well and are cheap on x86.

Note that the ARMv8 CE based AES-XTS implementation uses a similar
strategy to handle the different AES variants.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aesni-xts - deduplicate aesni_xts_enc() and aesni_xts_dec()
Eric Biggers [Sat, 13 Apr 2024 00:09:47 +0000 (17:09 -0700)]
crypto: x86/aesni-xts - deduplicate aesni_xts_enc() and aesni_xts_dec()

Since aesni_xts_enc() and aesni_xts_dec() are very similar, generate
them from a macro that's passed an argument enc=1 or enc=0.  This
reduces the length of aesni-intel_asm.S by 112 lines while still
producing the exact same object file in both 32-bit and 64-bit mode.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aes-xts - handle CTS encryption more efficiently
Eric Biggers [Fri, 12 Apr 2024 15:45:59 +0000 (08:45 -0700)]
crypto: x86/aes-xts - handle CTS encryption more efficiently

When encrypting a message whose length isn't a multiple of 16 bytes,
encrypt the last full block in the main loop.  This works because only
decryption uses the last two tweaks in reverse order, not encryption.

This improves the performance of encrypting messages whose length isn't
a multiple of the AES block length, shrinks the size of
aes-xts-avx-x86_64.o by 5.0%, and eliminates two instructions (a test
and a not-taken conditional jump) when encrypting a message whose length
*is* a multiple of the AES block length.

While it's not super useful to optimize for ciphertext stealing given
that it's rarely needed in practice, the other two benefits mentioned
above make this optimization worthwhile.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: stm32/hash - add full DMA support for stm32mpx
Maxime Méré [Fri, 12 Apr 2024 12:45:45 +0000 (14:45 +0200)]
crypto: stm32/hash - add full DMA support for stm32mpx

Due to a lack of alignment in the data sent by requests, the current DMA
support of the STM32 hash driver only works with digest calls.
This patch, based on the algorithm used in the omap-sham.c driver,
allows DMA to be used in any situation.

It has been functionally tested on STM32MP15, STM32MP13 and STM32MP25.

Checking the performance of this new driver with OpenSSL gave the
following results:

Performance:

(datasize: 4096, number of hashes performed in 10s)

|type   |no DMA    |DMA support|software  |
|-------|----------|-----------|----------|
|md5    |13873.56k |10958.03k  |71163.08k |
|sha1   |13796.15k |10729.47k  |39670.58k |
|sha224 |13737.98k |10775.76k  |22094.64k |
|sha256 |13655.65k |10872.01k  |22075.39k |

CPU Usage:

(algorithm used: sha256, computation time: 20s, measurement taken at
~10s)

|datasize  |no DMA |DMA  | software |
|----------|-------|-----|----------|
|  2048    | 56%   | 49% | 50%      |
|  4096    | 54%   | 46% | 50%      |
|  8192    | 53%   | 40% | 50%      |
| 16384    | 53%   | 33% | 50%      |

Note: this update doesn't change the driver performance without DMA.

As shown, performance with DMA is slightly lower than without, but in
most cases, it will save CPU time.

Signed-off-by: Maxime Méré <maxime.mere@foss.st.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: qat - improve error logging to be consistent across features
Adam Guerin [Fri, 12 Apr 2024 12:24:03 +0000 (13:24 +0100)]
crypto: qat - improve error logging to be consistent across features

Improve error logging in the rate limiting feature, staying consistent
with the error logging found in the telemetry feature.

Fixes: d9fb8408376e ("crypto: qat - add rate limiting feature to qat_4xxx")
Signed-off-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: qat - improve error message in adf_get_arbiter_mapping()
Adam Guerin [Fri, 12 Apr 2024 12:24:02 +0000 (13:24 +0100)]
crypto: qat - improve error message in adf_get_arbiter_mapping()

Improve the error message to be more readable.

Fixes: 5da6a2d5353e ("crypto: qat - generate dynamically arbiter mappings")
Signed-off-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/sha256-ni - simplify do_4rounds
Eric Biggers [Thu, 11 Apr 2024 16:23:59 +0000 (09:23 -0700)]
crypto: x86/sha256-ni - simplify do_4rounds

Instead of loading the message words into both MSG and \m0 and then
adding the round constants to MSG, load the message words into \m0 and
the round constants into MSG and then add \m0 to MSG.  This shortens the
source code slightly.  It changes the instructions slightly, but it
doesn't affect binary code size and doesn't seem to affect performance.

Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/sha256-ni - optimize code size
Eric Biggers [Thu, 11 Apr 2024 16:23:58 +0000 (09:23 -0700)]
crypto: x86/sha256-ni - optimize code size

- Load the SHA-256 round constants relative to a pointer that points
  into the middle of the constants rather than to the beginning.  Since
  x86 instructions use signed offsets, this decreases the instruction
  length required to access some of the later round constants.

- Use punpcklqdq or punpckhqdq instead of longer instructions such as
  pshufd, pblendw, and palignr.  This doesn't harm performance.

The end result is that sha256_ni_transform shrinks from 839 bytes to 791
bytes, with no loss in performance.

Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/sha256-ni - rename some register aliases
Eric Biggers [Thu, 11 Apr 2024 16:23:57 +0000 (09:23 -0700)]
crypto: x86/sha256-ni - rename some register aliases

MSGTMP[0-3] are used to hold the message schedule and are not temporary
registers per se.  MSGTMP4 is used as a temporary register for several
different purposes and isn't really related to MSGTMP[0-3].  Rename them
to MSG[0-3] and TMP accordingly.

Also add a comment that clarifies what MSG is.

Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/sha256-ni - convert to use rounds macros
Eric Biggers [Thu, 11 Apr 2024 16:23:56 +0000 (09:23 -0700)]
crypto: x86/sha256-ni - convert to use rounds macros

To avoid source code duplication, do the SHA-256 rounds using macros.
This reduces the length of sha256_ni_asm.S by 153 lines while still
producing the exact same object file.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: qat - implement dh fallback for primes > 4K
Damian Muszynski [Thu, 11 Apr 2024 09:24:58 +0000 (11:24 +0200)]
crypto: qat - implement dh fallback for primes > 4K

The Intel QAT driver provides support for the Diffie-Hellman (DH)
algorithm, limited to prime numbers up to 4K. This driver is used
by default on platforms with integrated QAT hardware for all DH requests.
This has led to failures with algorithms requiring larger prime sizes,
such as ffdhe6144.

  alg: ffdhe6144(dh): test failed on vector 1, err=-22
  alg: self-tests for ffdhe6144(qat-dh) (ffdhe6144(dh)) failed (rc=-22)

Implement a fallback mechanism when an unsupported request is received.

Signed-off-by: Damian Muszynski <damian.muszynski@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aes-xts - access round keys using single-byte offsets
Eric Biggers [Tue, 9 Apr 2024 00:01:54 +0000 (20:01 -0400)]
crypto: x86/aes-xts - access round keys using single-byte offsets

Access the AES round keys using offsets -7*16 through 7*16, instead of
0*16 through 14*16.  This allows VEX-encoded instructions to address all
round keys using 1-byte offsets, whereas before some needed 4-byte
offsets.  This decreases the code size of aes-xts-avx-x86_64.o by 4.2%.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: octeontx2 - add missing check for dma_map_single
Chen Ni [Mon, 8 Apr 2024 01:59:14 +0000 (01:59 +0000)]
crypto: octeontx2 - add missing check for dma_map_single

Add a check for dma_map_single() and return an error if it fails, in
order to avoid using an invalid DMA address.

Fixes: e92971117c2c ("crypto: octeontx2 - add ctx_val workaround")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Bharat Bhushan <bbhushan2@marvell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
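
The standard form of the added check:

  dma_addr_t dma = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

  if (dma_mapping_error(dev, dma))
          return -ENOMEM;   /* never hand an unchecked address to the device */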
7 months ago  crypto: x86/aes-xts - make non-AVX implementation use new glue code
Eric Biggers [Sun, 7 Apr 2024 21:22:31 +0000 (17:22 -0400)]
crypto: x86/aes-xts - make non-AVX implementation use new glue code

Make the non-AVX implementation of AES-XTS (xts-aes-aesni) use the new
glue code that was introduced for the AVX implementations of AES-XTS.
This reduces code size, and it improves the performance of xts-aes-aesni
due to the optimization for messages that don't span page boundaries.

This required moving the new glue functions higher up in the file and
allowing the IV encryption function to be specified by the caller.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  X.509: Introduce scope-based x509_certificate allocation
Lukas Wunner [Sun, 7 Apr 2024 17:57:40 +0000 (19:57 +0200)]
X.509: Introduce scope-based x509_certificate allocation

Add a DEFINE_FREE() clause for x509_certificate structs and use it in
x509_cert_parse() and x509_key_preparse().  These are the only functions
where scope-based x509_certificate allocation currently makes sense.
A third user will be introduced with the forthcoming SPDM library
(Security Protocol and Data Model) for PCI device authentication.

Unlike most other DEFINE_FREE() clauses, this one checks for IS_ERR()
instead of NULL before calling x509_free_certificate() at end of scope.
That's because the "constructor" of x509_certificate structs,
x509_cert_parse(), returns a valid pointer or an ERR_PTR(), but never
NULL.

Comparing the Assembler output before/after has shown they are identical,
save for the fact that gcc-12 always generates two return paths when
__cleanup() is used, one for the success case and one for the error case.

In x509_cert_parse(), add a hint for the compiler that kzalloc() never
returns an ERR_PTR().  Otherwise the compiler adds a gratuitous IS_ERR()
check on return.  Introduce an assume() macro for this which can be
re-used elsewhere in the kernel to provide hints for the compiler.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Link: https://lore.kernel.org/all/20231003153937.000034ca@Huawei.com/
Link: https://lwn.net/Articles/934679/
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
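
The shape of the change, condensed from the commit text (see linux/cleanup.h
for DEFINE_FREE()/__free(); a sketch, not the full patch):

  DEFINE_FREE(x509_free_certificate, struct x509_certificate *,
              if (!IS_ERR(_T)) x509_free_certificate(_T))

  struct x509_certificate *cert __free(x509_free_certificate) =
          x509_cert_parse(data, datalen);
  if (IS_ERR(cert))
          return PTR_ERR(cert);
  /* cert is freed automatically when it goes out of scope */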
7 months ago  crypto: hisilicon/qm - Add the err memory release process to qm uninit
Chenghai Huang [Sun, 7 Apr 2024 08:00:00 +0000 (16:00 +0800)]
crypto: hisilicon/qm - Add the err memory release process to qm uninit

When qm uninit is executed, the err data needs to be
released to prevent a memory leak. The error information
release operation and uacce_remove are integrated in
qm_remove_uacce.

So add qm_remove_uacce to qm uninit to avoid leaking the err
memory.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon/debugfs - Resolve the problem of applying for redundant space in sq dump
Chenghai Huang [Sun, 7 Apr 2024 07:59:59 +0000 (15:59 +0800)]
crypto: hisilicon/debugfs - Resolve the problem of applying for redundant space in sq dump

When dumping an SQ, only the SQE with the corresponding ID needs to be
dumped; there is no need to allocate memory for the entire SQE
array. Excessive dump allocations waste memory resources.

Therefore, allocate the space corresponding to sqe_id separately
to avoid wasting space.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon/sec - Fix memory leak for sec resource release
Chenghai Huang [Sun, 7 Apr 2024 07:59:58 +0000 (15:59 +0800)]
crypto: hisilicon/sec - Fix memory leak for sec resource release

The AIV is one of the SEC resources. When releasing resources,
the AIV resources need to be released at the same time;
otherwise, memory is leaked.

Add the AIV resource release to the SEC resource release
function.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon - Adjust debugfs creation and release order
Chenghai Huang [Sun, 7 Apr 2024 07:59:57 +0000 (15:59 +0800)]
crypto: hisilicon - Adjust debugfs creation and release order

There is a scenario where the file directory is created but the
file memory is not set. In this case, if a user accesses the
file, an error occurs.

So during debugfs creation, memory should be allocated first,
before creating the directory. In the release process, the
directory should be deleted first, before releasing the memory,
to avoid the situation where the memory no longer exists when
the directory is accessed.

In addition, the directory released by debugfs is a global
variable. When the debugfs of one accelerator fails to be
initialized, releasing the global-variable directory affects the
debugfs initialization of the other accelerators. The debugfs
root directory released by debugfs init should be a member of
qm, not a global variable.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon/qm - Add the default processing branch
Chenghai Huang [Sun, 7 Apr 2024 07:59:56 +0000 (15:59 +0800)]
crypto: hisilicon/qm - Add the default processing branch

The cmd type can be extended. Currently, only four types of cmd
can be processed. Therefore, add a default processing branch
to intercept incorrect parameter input.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon/debugfs - Fix the processing logic issue in the debugfs creation
Chenghai Huang [Sun, 7 Apr 2024 07:59:55 +0000 (15:59 +0800)]
crypto: hisilicon/debugfs - Fix the processing logic issue in the debugfs creation

There is a scenario where the file directory is created but the
file attribute is not set. In this case, if a user accesses the
file, an error occurs.

So adjust the processing logic in the debugfs creation to
prevent the file from being accessed before the file attributes
such as the index are set.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon/sgl - Delete redundant parameter verification
Chenghai Huang [Sun, 7 Apr 2024 07:59:54 +0000 (15:59 +0800)]
crypto: hisilicon/sgl - Delete redundant parameter verification

The input parameter check in acc_get_sgl is redundant; the
caller has already verified the parameters once, and performing
the check multiple times degrades performance.

So delete the redundant parameter verification and move the
index verification to the module entry function.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon/debugfs - Fix debugfs uninit process issue
Chenghai Huang [Sun, 7 Apr 2024 07:59:53 +0000 (15:59 +0800)]
crypto: hisilicon/debugfs - Fix debugfs uninit process issue

During the zip probe process, a debugfs failure does not stop
the probe. When debugfs initialization fails, jumping to the
error branch will also release regs, in addition to the debugfs
code's own rollback operation.

As a result, regs may be released repeatedly during the regs
uninit process. Therefore, a NULL check needs to be added to
the regs uninit process.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: hisilicon/sec - Add the condition for configuring the sriov function
Chenghai Huang [Sun, 7 Apr 2024 07:59:52 +0000 (15:59 +0800)]
crypto: hisilicon/sec - Add the condition for configuring the sriov function

When CONFIG_PCI_IOV is disabled, the SR-IOV configuration
function is not required. An error occurs if this function is
called nonetheless.

Consistent with other modules, add a condition for configuring
the SR-IOV function of sec_pci_driver.

Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
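
A common way to express the condition, assuming the shared hisi_qm helper
serves as the configure callback (sketch):

  static struct pci_driver sec_pci_driver = {
          ...
          .sriov_configure = IS_ENABLED(CONFIG_PCI_IOV) ?
                                     hisi_qm_sriov_configure : NULL,
  };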
7 months ago  crypto: x86/sha512-avx2 - add missing vzeroupper
Eric Biggers [Sat, 6 Apr 2024 00:26:10 +0000 (20:26 -0400)]
crypto: x86/sha512-avx2 - add missing vzeroupper

Since sha512_transform_rorx() uses ymm registers, execute vzeroupper
before returning from it.  This is necessary to avoid reducing the
performance of SSE code.

Fixes: e01d69cb0195 ("crypto: sha512 - Optimized SHA512 x86_64 assembly routine using AVX instructions.")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/sha256-avx2 - add missing vzeroupper
Eric Biggers [Sat, 6 Apr 2024 00:26:09 +0000 (20:26 -0400)]
crypto: x86/sha256-avx2 - add missing vzeroupper

Since sha256_transform_rorx() uses ymm registers, execute vzeroupper
before returning from it.  This is necessary to avoid reducing the
performance of SSE code.

Fixes: d34a460092d8 ("crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/nh-avx2 - add missing vzeroupper
Eric Biggers [Sat, 6 Apr 2024 00:26:08 +0000 (20:26 -0400)]
crypto: x86/nh-avx2 - add missing vzeroupper

Since nh_avx2() uses ymm registers, execute vzeroupper before returning
from it.  This is necessary to avoid reducing the performance of SSE
code.

Fixes: 0f961f9f670e ("crypto: x86/nhpoly1305 - add AVX2 accelerated NHPoly1305")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: iaa - Use cpumask_weight() when rebalancing
Tom Zanussi [Fri, 5 Apr 2024 18:57:30 +0000 (13:57 -0500)]
crypto: iaa - Use cpumask_weight() when rebalancing

If some cpus are offlined, or if the node mask is smaller than
expected, the 'nonexistent cpu' warning in rebalance_wq_table() may be
erroneously triggered.

Use cpumask_weight() to make sure we only iterate over the exact
number of cpus in the mask.

Also use num_possible_cpus() instead of num_online_cpus() to make sure
all slots in the wq table are initialized.

Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
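
Sketched (wq_table_add() and struct wq_entry are hypothetical names, not the
driver's API):

  const struct cpumask *node_cpus = cpumask_of_node(node);
  int cpu;

  /* size the table for every possible cpu, not just the online ones */
  struct wq_entry *table = kcalloc(num_possible_cpus(), sizeof(*table),
                                   GFP_KERNEL);

  /* iterate exactly cpumask_weight(node_cpus) cpus: only the set bits */
  for_each_cpu(cpu, node_cpus)
          wq_table_add(cpu, wq);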
7 months ago  crypto: x509 - Add OID for NIST P521 and extend parser for it
Stefan Berger [Thu, 4 Apr 2024 14:18:56 +0000 (10:18 -0400)]
crypto: x509 - Add OID for NIST P521 and extend parser for it

Enable the x509 parser to accept NIST P521 certificates and add the
OID for ansip521r1, which is the identifier for NIST P521.

Cc: David Howells <dhowells@redhat.com>
Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: asymmetric_keys - Adjust signature size calculation for NIST P521
Stefan Berger [Thu, 4 Apr 2024 14:18:55 +0000 (10:18 -0400)]
crypto: asymmetric_keys - Adjust signature size calculation for NIST P521

Adjust the calculation of the maximum signature size for support of
NIST P521. While existing curves may prepend a 0 byte to their coordinates
(to make the number positive), NIST P521 will not do this since only the
first bit in the most significant byte is used.

If the encoding of the x & y coordinates requires at least 128 bytes then
an additional byte is needed for the encoding of the length. Take this into
account when calculating the maximum signature size.

Reviewed-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecdsa - Register NIST P521 and extend test suite
Stefan Berger [Thu, 4 Apr 2024 14:18:54 +0000 (10:18 -0400)]
crypto: ecdsa - Register NIST P521 and extend test suite

Register NIST P521 as an akcipher and extend the testmgr with
NIST P521-specific test vectors.

Add a module alias so the module gets automatically loaded by the crypto
subsystem when the curve is needed.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecdsa - Rename keylen to bufsize where necessary
Stefan Berger [Thu, 4 Apr 2024 14:18:53 +0000 (10:18 -0400)]
crypto: ecdsa - Rename keylen to bufsize where necessary

In cases where 'keylen' referred to the size of the buffer used by
a curve's digits, it no longer reflects the purpose of the variable
once NIST P521 is used. What it refers to then is the size of the
buffer, which may be a few bytes larger than the size of a key
coordinate. Therefore, rename keylen to bufsize where appropriate.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecdsa - Replace ndigits with nbits where precision is needed
Stefan Berger [Thu, 4 Apr 2024 14:18:52 +0000 (10:18 -0400)]
crypto: ecdsa - Replace ndigits with nbits where precision is needed

Replace the usage of ndigits with nbits where precise space calculations
are needed, such as in ecdsa_max_size where the length of a coordinate is
determined.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecc - Add NIST P521 curve parameters
Stefan Berger [Thu, 4 Apr 2024 14:18:51 +0000 (10:18 -0400)]
crypto: ecc - Add NIST P521 curve parameters

Add the parameters for the NIST P521 curve and define a new curve ID
for it. Make the curve available in ecc_get_curve.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecc - Add special case for NIST P521 in ecc_point_mult
Stefan Berger [Thu, 4 Apr 2024 14:18:50 +0000 (10:18 -0400)]
crypto: ecc - Add special case for NIST P521 in ecc_point_mult

In ecc_point_mult use the number of bits of the NIST P521 curve + 2. The
change is required specifically for NIST P521 to pass mathematical tests
on the public key.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecc - Implement vli_mmod_fast_521 for NIST p521
Stefan Berger [Thu, 4 Apr 2024 14:18:49 +0000 (10:18 -0400)]
crypto: ecc - Implement vli_mmod_fast_521 for NIST p521

Implement vli_mmod_fast_521 following the description for how to calculate
the modulus for NIST P521 in the NIST publication "Recommendations for
Discrete Logarithm-Based Cryptography: Elliptic Curve Domain Parameters"
section G.1.4.

NIST P521 requires 9 64-bit digits, so increase ECC_MAX_DIGITS so that
the vli digit array provides enough elements to fit the larger integers
required by this curve.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
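
The identity being exploited: p = 2^521 - 1, so 2^521 ≡ 1 (mod p) and a
double-width product reduces as (product mod 2^521) + (product >> 521), with
at most a couple of trailing subtractions of p. A hedged sketch in terms of
the ecc code's vli helpers (vli_add/vli_cmp/vli_sub; not the driver's exact
code; prod holds 2 * 9 = 18 little-endian digits):

  #define P521_DIGITS 9   /* ceil(521 / 64) */

  static void p521_mmod_sketch(u64 *res, const u64 *prod, const u64 *p)
  {
          u64 lo[P521_DIGITS], hi[P521_DIGITS];
          int i;

          for (i = 0; i < P521_DIGITS; i++)
                  lo[i] = prod[i];
          lo[8] &= 0x1ff;                    /* keep the low 521 bits */

          for (i = 0; i < P521_DIGITS; i++)  /* hi = prod >> 521 */
                  hi[i] = (prod[i + 8] >> 9) | (prod[i + 9] << 55);

          vli_add(res, lo, hi, P521_DIGITS); /* sum fits in 9 digits */
          while (vli_cmp(res, p, P521_DIGITS) >= 0)
                  vli_sub(res, res, p, P521_DIGITS);
  }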
7 months ago  crypto: ecc - Add nbits field to ecc_curve structure
Stefan Berger [Thu, 4 Apr 2024 14:18:48 +0000 (10:18 -0400)]
crypto: ecc - Add nbits field to ecc_curve structure

Add the number of bits a curve has to the ecc_curve definition to be able
to derive the number of bytes a curve requires for its coordinates from it.
It also allows one to identify a curve by its particular size. Set the
number of bits on all curve definitions.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecdsa - Extend res.x mod n calculation for NIST P521
Stefan Berger [Thu, 4 Apr 2024 14:18:47 +0000 (10:18 -0400)]
crypto: ecdsa - Extend res.x mod n calculation for NIST P521

res.x has been calculated by ecc_point_mult_shamir, which uses
'mod curve_prime' on res.x and therefore p > res.x with 'p' being the
curve_prime. Further, it is true that for the NIST curves p > n with 'n'
being the 'curve_order' and therefore the following may be true as well:
p > res.x >= n.

If res.x >= n then res.x mod n can be calculated by iteratively
subtracting n from res.x until res.x < n. For NIST P192/256/384 this can
be done in a single subtraction. This can also be done in a single
subtraction for NIST P521.

The mathematical reason why a single subtraction is sufficient is due to
the values of 'p' and 'n' of the NIST curves where the following holds
true:

   note: max(res.x) = p - 1

   max(res.x) - n < n
       p - 1  - n < n
       p - 1      < 2n  => holds true for the NIST curves

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecdsa - Adjust tests on length of key parameters
Stefan Berger [Thu, 4 Apr 2024 14:18:46 +0000 (10:18 -0400)]
crypto: ecdsa - Adjust tests on length of key parameters

In preparation for support of NIST P521, adjust the basic tests on the
length of the provided key parameters to only ensure that the length of the
x plus y coordinates parameter array is not an odd number and that each
coordinate fits into an array of 'ndigits' digits. Mathematical tests on
the key's parameters are then done in ecc_is_pubkey_valid_full rejecting
invalid keys.

The change is necessary since NIST P521 keys do not have
coordinates that each require 'full' digits (= all bits in a u64 used).
NIST P521 only requires 2 bytes (9 bits) in the most significant digit,
unlike NIST P192/256/384, which each require multiple 'full' digits.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecdsa - Convert byte arrays with key coordinates to digits
Stefan Berger [Thu, 4 Apr 2024 14:18:45 +0000 (10:18 -0400)]
crypto: ecdsa - Convert byte arrays with key coordinates to digits

For NIST P192/256/384 the public key's x and y parameters could be copied
directly from a given array since both parameters filled 'ndigits' of
digits (a 'digit' is a u64). For support of NIST P521 the key parameters
need to have leading zeros prepended to the most significant digit since
only 2 bytes of the most significant digit are provided.

Therefore, implement ecc_digits_from_bytes to convert a byte array into an
array of digits and use this function in ecdsa_set_pub_key where an input
byte array needs to be converted into digits.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
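
A sketch of what such a conversion does, assuming big-endian input bytes and
the kernel's least-significant-digit-first u64 layout (illustrative, not
necessarily the exact helper):

  static void ecc_digits_from_bytes_sketch(const u8 *in, unsigned int nbytes,
                                           u64 *out, unsigned int ndigits)
  {
          unsigned int i;

          memset(out, 0, ndigits * sizeof(*out));
          /* walk the big-endian input starting from its least significant byte */
          for (i = 0; i < nbytes; i++)
                  out[i / 8] |= (u64)in[nbytes - 1 - i] << (8 * (i % 8));
  }

For a 66-byte NIST P521 coordinate and ndigits = 9, the top six bytes of the
most significant digit stay zero, giving the prepended leading zeros described
above.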
7 months ago  crypto: ecc - Use ECC_CURVE_NIST_P192/256/384_DIGITS where possible
Stefan Berger [Thu, 4 Apr 2024 14:18:44 +0000 (10:18 -0400)]
crypto: ecc - Use ECC_CURVE_NIST_P192/256/384_DIGITS where possible

Replace hard-coded numbers with ECC_CURVE_NIST_P192/256/384_DIGITS where
possible.

Tested-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: tegra - Add Tegra Security Engine driver
Akhil R [Wed, 3 Apr 2024 10:00:37 +0000 (15:30 +0530)]
crypto: tegra - Add Tegra Security Engine driver

Add support for the Tegra Security Engine, which can accelerate various
crypto algorithms. The engine has two separate instances within, for
AES and HASH algorithms respectively.

The driver registers two crypto engines, one for AES and another for
HASH algorithms; these operate independently, and both use the host1x
bus. Additionally, it provides hardware-assisted key protection for up
to 15 symmetric keys, which it can use for cipher operations.

Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  gpu: host1x: Add Tegra SE to SID table
Akhil R [Wed, 3 Apr 2024 10:00:36 +0000 (15:30 +0530)]
gpu: host1x: Add Tegra SE to SID table

Add Tegra Security Engine details to the SID table in host1x driver.
These entries are required to be in place to configure the stream ID
for SE. Register writes to stream ID registers fail otherwise.

Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Acked-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  dt-bindings: crypto: Add Tegra Security Engine
Akhil R [Wed, 3 Apr 2024 10:00:35 +0000 (15:30 +0530)]
dt-bindings: crypto: Add Tegra Security Engine

Add DT binding document for Tegra Security Engine.
The AES and HASH algorithms are handled independently by separate
engines within the Security Engine. These engines are registered
as two separate crypto engine drivers.

Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  padata: Disable BH when taking works lock on MT path
Herbert Xu [Wed, 3 Apr 2024 09:36:18 +0000 (17:36 +0800)]
padata: Disable BH when taking works lock on MT path

As the old padata code can execute in softirq context, disable
softirqs for the new padata_do_multithreaded code too, as otherwise
lockdep will get antsy.

Reported-by: syzbot+0cb5bb0f4bf9e79db3b3@syzkaller.appspotmail.com
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ccp - drop platform ifdef checks
Arnd Bergmann [Wed, 3 Apr 2024 08:06:42 +0000 (10:06 +0200)]
crypto: ccp - drop platform ifdef checks

When both ACPI and OF are disabled, the dev_vdata variable is unused:

drivers/crypto/ccp/sp-platform.c:33:34: error: unused variable 'dev_vdata' [-Werror,-Wunused-const-variable]

This is not a useful configuration, and there is not much point in saving
a few bytes when only one of the two is enabled, so just remove all
these ifdef checks and rely on of_match_node() and acpi_match_device()
returning NULL when these subsystems are disabled.

Fixes: 6c5063434098 ("crypto: ccp - Add ACPI support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: qat - Fix spelling mistake "Invalide" -> "Invalid"
Colin Ian King [Tue, 2 Apr 2024 08:13:55 +0000 (09:13 +0100)]
crypto: qat - Fix spelling mistake "Invalide" -> "Invalid"

There is a spelling mistake in a dev_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: algboss - remove NULL check in cryptomgr_schedule_probe()
Roman Smirnov [Mon, 1 Apr 2024 12:22:58 +0000 (15:22 +0300)]
crypto: algboss - remove NULL check in cryptomgr_schedule_probe()

The for loop will be executed at least once, so i > 0. If the loop
is interrupted before i is incremented (e.g., when checking len for NULL),
i will not be checked.

Found by Linux Verification Center (linuxtesting.org) with Svace.

Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: ecc - remove checks in crypto_ecdh_shared_secret() and ecc_make_pub_key()
Roman Smirnov [Mon, 1 Apr 2024 12:16:23 +0000 (15:16 +0300)]
crypto: ecc - remove checks in crypto_ecdh_shared_secret() and ecc_make_pub_key()

With the current state of the code, ecc_get_curve() cannot return
NULL in crypto_ecdh_shared_secret() and ecc_make_pub_key(). This is
conditioned by the fact that they are only called from ecdh_compute_value(),
which implements the kpp_alg::{generate_public_key,compute_shared_secret}()
methods. Also ecdh implements the kpp_alg::init() method, which is called
before the other methods and sets ecdh_ctx::curve_id to a valid value.

Signed-off-by: Roman Smirnov <r.smirnov@omp.ru>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: jitter - Replace http with https
Thorsten Blum [Fri, 29 Mar 2024 15:53:45 +0000 (16:53 +0100)]
crypto: jitter - Replace http with https

The PDF is also available via https.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: jitter - Remove duplicate word in comment
Thorsten Blum [Fri, 29 Mar 2024 15:44:54 +0000 (16:44 +0100)]
crypto: jitter - Remove duplicate word in comment

s/in//

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months ago  crypto: x86/aes-xts - wire up VAES + AVX10/512 implementation
Eric Biggers [Fri, 29 Mar 2024 08:03:54 +0000 (01:03 -0700)]
crypto: x86/aes-xts - wire up VAES + AVX10/512 implementation

Add an AES-XTS implementation "xts-aes-vaes-avx10_512" for x86_64 CPUs
with the VAES, VPCLMULQDQ, and either AVX10/512 or AVX512BW + AVX512VL
extensions.  This implementation uses zmm registers to operate on four
AES blocks at a time.  The assembly code is instantiated using a macro
so that most of the source code is shared with other implementations.

To avoid downclocking on older Intel CPU models, an exclusion list is
used to prevent this 512-bit implementation from being used by default
on some CPU models.  They will use xts-aes-vaes-avx10_256 instead.  For
now, this exclusion list is simply coded into aesni-intel_glue.c.  It
may make sense to eventually move it into a more central location.

xts-aes-vaes-avx10_512 is slightly faster than xts-aes-vaes-avx10_256 on
some current CPUs.  E.g., on AMD Zen 4, AES-256-XTS decryption
throughput increases by 13% with 4096-byte inputs, or 14% with 512-byte
inputs.  On Intel Sapphire Rapids, AES-256-XTS decryption throughput
increases by 2% with 4096-byte inputs, or 3% with 512-byte inputs.

Future CPUs may provide stronger 512-bit support, in which case a larger
benefit should be seen.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: x86/aes-xts - wire up VAES + AVX10/256 implementation
Eric Biggers [Fri, 29 Mar 2024 08:03:53 +0000 (01:03 -0700)]
crypto: x86/aes-xts - wire up VAES + AVX10/256 implementation

Add an AES-XTS implementation "xts-aes-vaes-avx10_256" for x86_64 CPUs
with the VAES, VPCLMULQDQ, and either AVX10/256 or AVX512BW + AVX512VL
extensions.  This implementation avoids using zmm registers, instead
using ymm registers to operate on two AES blocks at a time.  The
assembly code is instantiated using a macro so that most of the source
code is shared with other implementations.

This is the optimal implementation on CPUs that support VAES and AVX512
but where the zmm registers should not be used due to downclocking
effects, for example Intel's Ice Lake.  It should also be the optimal
implementation on future CPUs that support AVX10/256 but not AVX10/512.

The performance is slightly better than that of xts-aes-vaes-avx2, which
uses the same 256-bit vector length, due to factors such as being able
to use ymm16-ymm31 to cache the AES round keys, and being able to use
the vpternlogd instruction to do XORs more efficiently.  For example, on
Ice Lake, the throughput of decrypting 4096-byte messages with
AES-256-XTS is 6.6% higher with xts-aes-vaes-avx10_256 than with
xts-aes-vaes-avx2.  While this is a small improvement, it is
straightforward to provide this implementation (xts-aes-vaes-avx10_256)
as long as we are providing xts-aes-vaes-avx2 and xts-aes-vaes-avx10_512
anyway, due to the way the _aes_xts_crypt macro is structured.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: x86/aes-xts - wire up VAES + AVX2 implementation
Eric Biggers [Fri, 29 Mar 2024 08:03:52 +0000 (01:03 -0700)]
crypto: x86/aes-xts - wire up VAES + AVX2 implementation

Add an AES-XTS implementation "xts-aes-vaes-avx2" for x86_64 CPUs with
the VAES, VPCLMULQDQ, and AVX2 extensions, but not AVX512 or AVX10.
This implementation uses ymm registers to operate on two AES blocks at a
time.  The assembly code is instantiated using a macro so that most of
the source code is shared with other implementations.

This is the optimal implementation on AMD Zen 3.  It should also be the
optimal implementation on Intel Alder Lake, which similarly supports
VAES but not AVX512.  Compared to xts-aes-aesni-avx on Zen 3,
xts-aes-vaes-avx2 provides 70% higher AES-256-XTS decryption throughput
with 4096-byte messages, or 23% higher with 512-byte messages.

A large improvement is also seen with CPUs that do support AVX512 (e.g.,
98% higher AES-256-XTS decryption throughput on Ice Lake with 4096-byte
messages), though the following patches add AVX512 optimized
implementations to get a bit more performance on those CPUs.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: x86/aes-xts - wire up AESNI + AVX implementation
Eric Biggers [Fri, 29 Mar 2024 08:03:51 +0000 (01:03 -0700)]
crypto: x86/aes-xts - wire up AESNI + AVX implementation

Add an AES-XTS implementation "xts-aes-aesni-avx" for x86_64 CPUs that
have the AES-NI and AVX extensions but not VAES.  It's similar to the
existing xts-aes-aesni in that it uses xmm registers to operate on one AES
block at a time.  It differs from xts-aes-aesni in the following ways:

- It uses the VEX-coded (non-destructive) instructions from AVX.
  This improves performance slightly.
- It incorporates some additional optimizations such as interleaving the
  tweak computation with AES en/decryption, handling single-page
  messages more efficiently, and caching the first round key.
- It supports only 64-bit (x86_64).
- It's generated by an assembly macro that will also be used to generate
  VAES-based implementations.

The performance improvement over xts-aes-aesni varies from small to
large, depending on the CPU and other factors such as the size of the
messages en/decrypted.  For example, the following increases in
AES-256-XTS decryption throughput are seen on these CPUs:

                          | 4096-byte messages | 512-byte messages |
    ----------------------+--------------------+-------------------+
    Intel Skylake         |        6%          |       31%         |
    Intel Cascade Lake    |        4%          |       26%         |
    AMD Zen 1             |        61%         |       73%         |
    AMD Zen 2             |        36%         |       59%         |

(The above CPUs don't support VAES, so they can't use VAES instead.)

While this isn't as large an improvement as what VAES provides, this
still seems worthwhile.  This implementation is fairly easy to provide
based on the assembly macro that's needed for VAES anyway, and it will
be the best implementation on a large number of CPUs (very roughly, the
CPUs launched by Intel and AMD from 2011 to 2018).

This makes the existing xts-aes-aesni *mostly* obsolete.  For now, leave
it in place to support 32-bit kernels and also CPUs like Intel Westmere
that support AES-NI but not AVX.  (We could potentially remove it anyway
and just rely on the indirect acceleration via ecb-aes-aesni in those
cases, but that change will need to be considered separately.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: x86/aes-xts - add AES-XTS assembly macro for modern CPUs
Eric Biggers [Fri, 29 Mar 2024 08:03:50 +0000 (01:03 -0700)]
crypto: x86/aes-xts - add AES-XTS assembly macro for modern CPUs

Add an assembly file aes-xts-avx-x86_64.S which contains a macro that
expands into AES-XTS implementations for x86_64 CPUs that support at
least AES-NI and AVX, optionally also taking advantage of VAES,
VPCLMULQDQ, and AVX512 or AVX10.

This patch doesn't expand the macro at all.  Later patches will do so,
adding each implementation individually so that the motivation and use
case for each individual implementation can be fully presented.

The file also provides a function aes_xts_encrypt_iv() which handles the
encryption of the IV (tweak), using AES-NI and AVX.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agox86: add kconfig symbols for assembler VAES and VPCLMULQDQ support
Eric Biggers [Fri, 29 Mar 2024 08:03:49 +0000 (01:03 -0700)]
x86: add kconfig symbols for assembler VAES and VPCLMULQDQ support

Add config symbols AS_VAES and AS_VPCLMULQDQ that expose whether the
assembler supports the vector AES and carryless multiplication
cryptographic extensions.

Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: ecdh - explicitly zeroize private_key
Joachim Vandersmissen [Thu, 28 Mar 2024 16:24:30 +0000 (11:24 -0500)]
crypto: ecdh - explicitly zeroize private_key

private_key is overwritten with the key parameter passed in by the
caller (if present), or alternatively a newly generated private key.
However, it is possible that the caller provides a key (or the newly
generated key) which is shorter than the previous key. In that
scenario, some key material from the previous key would not be
overwritten. The easiest solution is to explicitly zeroize the entire
private_key array first.
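
A sketch of the resulting shape of the function (abridged; validation,
key generation, and the key copy are elided):

  static int ecdh_set_secret(struct crypto_kpp *tfm, const void *buf,
                             unsigned int len)
  {
          struct ecdh_ctx *ctx = ecdh_get_ctx(tfm);
          struct ecdh params;

          /* wipe first: a shorter new key must not leave stale bytes */
          memset(ctx->private_key, 0, sizeof(ctx->private_key));

          if (crypto_ecdh_decode_key(buf, len, &params) < 0)
                  return -EINVAL;
          /* ... validate or generate, then copy the new key ... */
          return 0;
  }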

Note that this patch slightly changes the behavior of this function:
previously, if ecc_gen_privkey failed, the old private_key would
remain. Now, the private_key is always zeroized. This behavior is
consistent with the case where params.key is set and ecc_is_key_valid
fails.

Signed-off-by: Joachim Vandersmissen <git@jvdsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: fips - Remove the now superfluous sentinel element from ctl_table array
Joel Granados [Thu, 28 Mar 2024 15:57:50 +0000 (16:57 +0100)]
crypto: fips - Remove the now superfluous sentinel element from ctl_table array

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels), which
will reduce the overall build-time size of the kernel and its run-time
memory bloat by ~64 bytes per sentinel (further information:
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Remove sentinel from crypto_sysctl_table
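
For illustration, the shape of the change (entry fields abridged):

  static struct ctl_table crypto_sysctl_table[] = {
          {
                  .procname     = "fips_enabled",
                  .data         = &fips_enabled,
                  .maxlen       = sizeof(int),
                  .mode         = 0444,
                  .proc_handler = proc_dointvec,
          },
          /* the empty {} sentinel element previously here is removed */
  };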

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: jitter - Use kvfree_sensitive() to fix Coccinelle warning
Thorsten Blum [Wed, 27 Mar 2024 22:25:09 +0000 (23:25 +0100)]
crypto: jitter - Use kvfree_sensitive() to fix Coccinelle warning

Replace memzero_explicit() and kvfree() with kvfree_sensitive() to fix
the following Coccinelle/coccicheck warning reported by
kfree_sensitive.cocci:

WARNING opportunity for kfree_sensitive/kvfree_sensitive
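
The transformation, in diff form ('buf' and 'len' are placeholder
names):

  -       memzero_explicit(buf, len);
  -       kvfree(buf);
  +       kvfree_sensitive(buf, len);     /* zeroizes, then frees */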

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agodt-bindings: crypto: ti,omap-sham: Convert to dtschema
Animesh Agarwal [Wed, 27 Mar 2024 05:49:06 +0000 (11:19 +0530)]
dt-bindings: crypto: ti,omap-sham: Convert to dtschema

Convert the OMAP SoC SHA crypto Module bindings to DT Schema.

Signed-off-by: Animesh Agarwal <animeshagarwal28@gmail.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: qat - Avoid -Wflex-array-member-not-at-end warnings
Gustavo A. R. Silva [Mon, 25 Mar 2024 20:44:55 +0000 (14:44 -0600)]
crypto: qat - Avoid -Wflex-array-member-not-at-end warnings

-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
ready to enable it globally.

Use the `__struct_group()` helper to separate the flexible array
from the rest of the members in flexible `struct qat_alg_buf_list`,
through tagged `struct qat_alg_buf_list_hdr`, and avoid embedding the
flexible-array member in the middle of `struct qat_alg_fixed_buf_list`.

Also, use `container_of()` whenever we need to retrieve a pointer to
the flexible structure.
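
An abridged sketch of the resulting layout (member names illustrative,
not the exact qat_bl.h definitions):

  struct qat_alg_buf_list {
          __struct_group(qat_alg_buf_list_hdr, hdr, __packed,
                  u64 resrvd;
                  u32 num_bufs;
          );
          struct qat_alg_buf buffers[];   /* flex array truly at the end */
  } __packed;

  struct qat_alg_fixed_buf_list {
          struct qat_alg_buf_list_hdr sgl_hdr;    /* no embedded flex array */
          struct qat_alg_buf descriptors[QAT_MAX_BUFF_DESC];
  } __packed;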

So, with these changes, fix the following warnings:
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/crypto/intel/qat/qat_common/qat_bl.h:25:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Link: https://github.com/KSPP/linux/issues/202
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agohwrng: mxc-rnga - Drop usage of platform_driver_probe()
Uwe Kleine-König [Sun, 24 Mar 2024 16:12:26 +0000 (17:12 +0100)]
hwrng: mxc-rnga - Drop usage of platform_driver_probe()

There are considerations to drop platform_driver_probe() as a concept
that isn't relevant any more today. It comes with added complexity that
makes many users hold it wrong. (E.g. this driver should have marked
the driver struct with __refdata.)

Convert the driver to the more usual module_platform_driver().
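
A sketch of the conversion (driver details elided; note the probe and
remove callbacks can no longer live in .init.text/.exit.text):

  static struct platform_driver mxc_rnga_driver = {
          .driver = {
                  .name = "mxc_rnga",
          },
          .probe  = mxc_rnga_probe,   /* was passed to platform_driver_probe() */
          .remove = mxc_rnga_remove,
  };

  module_platform_driver(mxc_rnga_driver);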

This fixes a W=1 build warning:

WARNING: modpost: drivers/char/hw_random/mxc-rnga: section mismatch in reference: mxc_rnga_driver+0x10 (section: .data) -> mxc_rnga_remove (section: .exit.text)

with CONFIG_HW_RANDOM_MXC_RNGA=m.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: x86/aesni - Update aesni_set_key() to return void
Chang S. Bae [Fri, 22 Mar 2024 23:04:59 +0000 (16:04 -0700)]
crypto: x86/aesni - Update aesni_set_key() to return void

The aesni_set_key() implementation has no error case, yet its prototype
declares an error-code return value.

Modify the function prototype to return void and adjust the related code.
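
The prototype change is along these lines (diff form):

  -asmlinkage int aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
  -                             unsigned int key_len);
  +asmlinkage void aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
  +                              unsigned int key_len);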

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: x86/aesni - Rearrange AES key size check
Chang S. Bae [Fri, 22 Mar 2024 23:04:58 +0000 (16:04 -0700)]
crypto: x86/aesni - Rearrange AES key size check

aes_expandkey() already includes an AES key size check. If AES-NI is
unusable, invoke the function without a separate size check.

Also, use aes_check_keylen() instead of open-coding the check.
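
A hedged sketch of the rearranged flow (not the verbatim kernel code):

  static int aes_set_key_common(struct crypto_aes_ctx *ctx,
                                const u8 *in_key, unsigned int key_len)
  {
          int err;

          if (!crypto_simd_usable())
                  /* aes_expandkey() performs its own key-length check */
                  return aes_expandkey(ctx, in_key, key_len);

          err = aes_check_keylen(key_len);
          if (err)
                  return err;

          kernel_fpu_begin();
          aesni_set_key(ctx, in_key, key_len);
          kernel_fpu_end();
          return 0;
  }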

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: bcm - Fix pointer arithmetic
Aleksandr Mishin [Fri, 22 Mar 2024 20:59:15 +0000 (23:59 +0300)]
crypto: bcm - Fix pointer arithmetic

In spu2_dump_omd(), the value of ptr is increased by ciph_key_len
instead of hash_iv_len, which could lead to reading beyond the
buffer boundaries.
Fix this bug by changing ciph_key_len to hash_iv_len.
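
The fix, in diff form (surrounding context abridged):

  -       ptr += ciph_key_len;    /* advances by the wrong length */
  +       ptr += hash_iv_len;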

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 9d12ba86f818 ("crypto: brcm - Add Broadcom SPU driver")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: iaa - Fix some errors in IAA documentation
Jerry Snitselaar [Thu, 21 Mar 2024 21:08:46 +0000 (16:08 -0500)]
crypto: iaa - Fix some errors in IAA documentation

This cleans up the following issues I ran into when trying to use the
scripts and commands in the iaa-crypto.rst document.

- Fix incorrect arguments being passed to accel-config
  config-wq.
    - Replace --device_name with --driver-name.
    - Replace --driver_name with --driver-name.
    - Replace --size with --wq-size.
    - Add missing --priority argument.
- Add missing accel-config config-engine command after the
  config-wq commands.
- Fix wq name passed to accel-config config-wq.
- Add rmmod/modprobe of iaa_crypto to the script that disables,
  then enables, all devices and workqueues, to avoid enable-wq
  failing with -EEXIST when trying to register the compression
  algorithm.
- Fix device name in cases where iaa was used instead of iax.

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-crypto@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: ecdsa - Fix module auto-load on add-key
Stefan Berger [Thu, 21 Mar 2024 14:44:33 +0000 (10:44 -0400)]
crypto: ecdsa - Fix module auto-load on add-key

Add module alias with the algorithm cra_name similar to what we have for
RSA-related and other algorithms.

The kernel attempts to modprobe asymmetric algorithms using the names
"crypto-$cra_name" and "crypto-$cra_name-all." However, since these
aliases are currently missing, the modules are not loaded. For instance,
when using the `add_key` function, the hash algorithm is typically
loaded automatically, but the asymmetric algorithm is not.
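
The fix boils down to adding aliases of this form (the exact set of
curve names shown here is illustrative):

  MODULE_ALIAS_CRYPTO("ecdsa-nist-p256");
  MODULE_ALIAS_CRYPTO("ecdsa-nist-p384");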

Steps to test:

1. Create certificate

  openssl req -x509 -sha256 -newkey ec \
  -pkeyopt "ec_paramgen_curve:secp384r1" -keyout key.pem -days 365 \
  -subj '/CN=test' -nodes -outform der -out nist-p384.der

2. Optionally, trace module requests with: trace-cmd stream -e module &

3. Trigger add_key call for the cert:

   # keyctl padd asymmetric "" @u < nist-p384.der
   641069229
   # lsmod | head -2
   Module                  Size  Used by
   ecdsa_generic          16384  0

Fixes: c12d448ba939 ("crypto: ecdsa - Register NIST P384 and extend test suite")
Cc: stable@vger.kernel.org
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Vitaly Chikunov <vt@altlinux.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: ecc - update ecc_gen_privkey for FIPS 186-5
Joachim Vandersmissen [Wed, 20 Mar 2024 05:13:38 +0000 (00:13 -0500)]
crypto: ecc - update ecc_gen_privkey for FIPS 186-5

FIPS 186-5 [1] was released approximately 1 year ago. The most
interesting change for ecc_gen_privkey is the removal of curves with
order < 224 bits. This is minimum is now checked in step 1. It is
unlikely that there is still any benefit in generating private keys for
curves with n < 224, as those curves provide less than 112 bits of
security strength and are therefore unsafe for any modern usage.

This patch also updates the documentation for __ecc_is_key_valid and
ecc_gen_privkey to clarify which FIPS 186-5 method is being used to
generate private keys. Previous documentation mentioned that "extra
random bits" was used. However, this did not match the code. Instead,
the code currently uses (and always has used) the "rejection sampling"
("testing candidates" in FIPS 186-4) method.

[1]: https://doi.org/10.6028/NIST.FIPS.186-5

Signed-off-by: Joachim Vandersmissen <git@jvdsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: ecrdsa - Fix module auto-load on add_key
Vitaly Chikunov [Mon, 18 Mar 2024 00:42:40 +0000 (03:42 +0300)]
crypto: ecrdsa - Fix module auto-load on add_key

Add module alias with the algorithm cra_name similar to what we have for
RSA-related and other algorithms.

The kernel attempts to modprobe asymmetric algorithms using the names
"crypto-$cra_name" and "crypto-$cra_name-all." However, since these
aliases are currently missing, the modules are not loaded. For instance,
when using the `add_key` function, the hash algorithm is typically
loaded automatically, but the asymmetric algorithm is not.

Steps to test:

1. Cert is generated using the ima-evm-utils test suite with
   `gen-keys.sh`, example cert is provided below:

  $ base64 -d >test-gost2012_512-A.cer <<EOF
  MIIB/DCCAWagAwIBAgIUK8+whWevr3FFkSdU9GLDAM7ure8wDAYIKoUDBwEBAwMFADARMQ8wDQYD
  VQQDDAZDQSBLZXkwIBcNMjIwMjAxMjIwOTQxWhgPMjA4MjEyMDUyMjA5NDFaMBExDzANBgNVBAMM
  BkNBIEtleTCBoDAXBggqhQMHAQEBAjALBgkqhQMHAQIBAgEDgYQABIGALXNrTJGgeErBUOov3Cfo
  IrHF9fcj8UjzwGeKCkbCcINzVUbdPmCopeJRHDJEvQBX1CQUPtlwDv6ANjTTRoq5nCk9L5PPFP1H
  z73JIXHT0eRBDVoWy0cWDRz1mmQlCnN2HThMtEloaQI81nTlKZOcEYDtDpi5WODmjEeRNQJMdqCj
  UDBOMAwGA1UdEwQFMAMBAf8wHQYDVR0OBBYEFCwfOITMbE9VisW1i2TYeu1tAo5QMB8GA1UdIwQY
  MBaAFCwfOITMbE9VisW1i2TYeu1tAo5QMAwGCCqFAwcBAQMDBQADgYEAmBfJCMTdC0/NSjz4BBiQ
  qDIEjomO7FEHYlkX5NGulcF8FaJW2jeyyXXtbpnub1IQ8af1KFIpwoS2e93LaaofxpWlpQLlju6m
  KYLOcO4xK3Whwa2hBAz9YbpUSFjvxnkS2/jpH2MsOSXuUEeCruG/RkHHB3ACef9umG6HCNQuAPY=
  EOF

2. Optionally, trace module requests with: trace-cmd stream -e module &

3. Trigger add_key call for the cert:

  # keyctl padd asymmetric "" @u <test-gost2012_512-A.cer
  939910969
  # lsmod | head -3
  Module                  Size  Used by
  ecrdsa_generic         16384  0
  streebog_generic       28672  0

Reported-by: Paul Wolneykien <manowar@altlinux.org>
Cc: stable@vger.kernel.org
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agohwrng: core - Convert sprintf/snprintf to sysfs_emit
Li Zhijian [Thu, 14 Mar 2024 08:45:59 +0000 (16:45 +0800)]
hwrng: core - Convert sprintf/snprintf to sysfs_emit

Per filesystems/sysfs.rst, show() should only use sysfs_emit()
or sysfs_emit_at() when formatting the value to be returned to user space.

coccinelle complains that there are still a couple of functions that use
snprintf(). Convert them to sysfs_emit().

sprintf() calls are converted as well where present.
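
The typical conversion looks like this (function name illustrative):

  static ssize_t foo_show(struct device *dev,
                          struct device_attribute *attr, char *buf)
  {
  -       return snprintf(buf, PAGE_SIZE, "%s\n", name);
  +       return sysfs_emit(buf, "%s\n", name);
  }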

Generally, this patch is generated by:

  make coccicheck M=<path/to/file> MODE=patch \
    COCCI=scripts/coccinelle/api/device_attr_show.cocci

No functional change intended

CC: Olivia Mackall <olivia@selenic.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: linux-crypto@vger.kernel.org
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agodt-bindings: crypto: ice: Document sc7280 inline crypto engine
Luca Weiss [Wed, 13 Mar 2024 12:53:14 +0000 (13:53 +0100)]
dt-bindings: crypto: ice: Document sc7280 inline crypto engine

Document the compatible used for the inline crypto engine found on
SC7280.

Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 months agocrypto: remove CONFIG_CRYPTO_STATS
Eric Biggers [Wed, 13 Mar 2024 03:48:21 +0000 (20:48 -0700)]
crypto: remove CONFIG_CRYPTO_STATS

Remove support for the "Crypto usage statistics" feature
(CONFIG_CRYPTO_STATS).  This feature does not appear to have ever been
used, and it is harmful because it significantly reduces performance and
is a large maintenance burden.

Covering each of these points in detail:

1. Feature is not being used

Since these generic crypto statistics are only readable using netlink,
it's fairly straightforward to look for programs that use them.  I'm
unable to find any evidence that any such programs exist.  For example,
Debian Code Search returns no hits except the kernel header and kernel
code itself and translations of the kernel header:
https://codesearch.debian.net/search?q=CRYPTOCFGA_STAT&literal=1&perpkg=1

The patch series that added this feature in 2018
(https://lore.kernel.org/linux-crypto/1537351855-16618-1-git-send-email-clabbe@baylibre.com/)
said "The goal is to have an ifconfig for crypto device."  This doesn't
appear to have happened.

It's not clear that there is real demand for crypto statistics.  Just
because the kernel provides other types of statistics such as I/O and
networking statistics and some people find those useful does not mean
that crypto statistics are useful too.

Further evidence that programs are not using CONFIG_CRYPTO_STATS is that
it was able to be disabled in RHEL and Fedora as a bug fix
(https://gitlab.com/redhat/centos-stream/src/kernel/centos-stream-9/-/merge_requests/2947).

Even further evidence comes from the fact that there are and have been
bugs in how the stats work, but they were never reported.  For example,
before Linux v6.7 hash stats were double-counted in most cases.

There has also never been any documentation for this feature, so it
might be hard to use even if someone wanted to.

2. CONFIG_CRYPTO_STATS significantly reduces performance

Enabling CONFIG_CRYPTO_STATS significantly reduces the performance of
the crypto API, even if no program ever retrieves the statistics.  This
primarily affects systems with a large number of CPUs.  For example,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039576 reported
that Lustre client encryption performance improved from 21.7GB/s to
48.2GB/s by disabling CONFIG_CRYPTO_STATS.

It can be argued that this means that CONFIG_CRYPTO_STATS should be
optimized with per-cpu counters similar to many of the networking
counters.  But no one has done this in 5+ years.  This is consistent
with the fact that the feature appears to be unused, so there seems to
be little interest in improving it as opposed to just disabling it.
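
For reference, a hypothetical sketch of what that per-cpu approach
could look like; CONFIG_CRYPTO_STATS never did this:

  DEFINE_PER_CPU(u64, encrypt_cnt);

  static inline void stat_encrypt(void)
  {
          this_cpu_inc(encrypt_cnt);      /* no cross-CPU cache-line bouncing */
  }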

It can be argued that because CONFIG_CRYPTO_STATS is off by default,
performance doesn't matter.  But Linux distros tend to err on the side
of enabling options.  The option is enabled in Ubuntu and Arch Linux,
and until recently was enabled in RHEL and Fedora (see above).  So, even
just having the option available is harmful to users.

3. CONFIG_CRYPTO_STATS is a large maintenance burden

There are over 1000 lines of code associated with CONFIG_CRYPTO_STATS,
spread among 32 files.  It significantly complicates much of the
implementation of the crypto API.  After the initial submission, many
fixes and refactorings have consumed the effort of multiple people to keep
this feature "working".  We should be spending this effort elsewhere.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>