Jianpeng Chang [Wed, 15 Oct 2025 02:14:27 +0000 (10:14 +0800)]
net: enetc: fix the deadlock of enetc_mdio_lock
After applying the workaround for err050089, the LS1028A platform
experiences RCU stalls on RT kernel. This issue is caused by the
recursive acquisition of the read lock enetc_mdio_lock. Here list some
of the call stacks identified under the enetc_poll path that may lead to
a deadlock:
enetc_poll
-> enetc_lock_mdio
-> enetc_clean_rx_ring OR napi_complete_done
-> napi_gro_receive
-> enetc_start_xmit
-> enetc_lock_mdio
-> enetc_map_tx_buffs
-> enetc_unlock_mdio
-> enetc_unlock_mdio
After enetc_poll acquires the read lock, a higher-priority writer attempts
to acquire the lock, causing preemption. The writer detects that a
read lock is already held and is scheduled out. However, readers under
enetc_poll cannot acquire the read lock again because a writer is already
waiting, leading to a thread hang.
Currently, the deadlock is avoided by adjusting enetc_lock_mdio to prevent
recursive lock acquisition.
Fixes:
6d36ecdbc441 ("net: enetc: take the MDIO lock only once per NAPI poll cycle")
Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Acked-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251015021427.180757-1-jianpeng.chang.cn@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sebastian Reichel [Tue, 14 Oct 2025 15:49:34 +0000 (17:49 +0200)]
net: stmmac: dwmac-rk: Fix disabling set_clock_selection
On all platforms set_clock_selection() writes to a GRF register. This
requires certain clocks running and thus should happen before the
clocks are disabled.
This has been noticed on RK3576 Sige5, which hangs during system suspend
when trying to suspend the second network interface. Note, that
suspending the first interface works, because the second device ensures
that the necessary clocks for the GRF are enabled.
Cc: stable@vger.kernel.org
Fixes:
2f2b60a0ec28 ("net: ethernet: stmmac: dwmac-rk: Add gmac support for rk3588")
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251014-rockchip-network-clock-fix-v1-1-c257b4afdf75@collabora.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Johannes Wiesböck [Wed, 15 Oct 2025 20:15:43 +0000 (22:15 +0200)]
rtnetlink: Allow deleting FDB entries in user namespace
Creating FDB entries is possible from a non-initial user namespace when
having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive
an EPERM because the capability is always checked against the initial
user namespace. This restricts the FDB management from unprivileged
containers.
Drop the netlink_capable check in rtnl_fdb_del as it was originally
dropped in
c5c351088ae7 and reintroduced in
1690be63a27b without
intention.
This patch was tested using a container on GyroidOS, where it was
possible to delete FDB entries from an unprivileged user namespace and
private network namespace.
Fixes:
1690be63a27b ("bridge: Add vlan support to static neighbors")
Reviewed-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Tested-by: Harshal Gohel <hg@simonwunderlich.de>
Signed-off-by: Johannes Wiesböck <johannes.wiesboeck@aisec.fraunhofer.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bagas Sanjaya [Wed, 15 Oct 2025 09:25:41 +0000 (16:25 +0700)]
net: rmnet: Fix checksum offload header v5 and aggregation packet formatting
Packet format for checksum offload header v5 and aggregation, and header
type table for the former, are shown in normal paragraphs instead.
Use appropriate markup.
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251015092540.32282-2-bagasdotme@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cosmin Ratiu [Wed, 15 Oct 2025 07:05:23 +0000 (10:05 +0300)]
net/mlx5e: psp, avoid 'accel' NULL pointer dereference
The 'accel' parameter of mlx5e_txwqe_build_eseg_csum() and the similar
'state' parameter of mlx5e_accel_tx_ids_len() were NULL when called
from mlx5i_sq_xmit() and were causing kernel panics from that context.
Fix that by passing in a local empty mlx5e_accel_tx_state variable, thus
guaranteeing that 'accel' is never NULL. Also remove an unnecessary
check from mlx5e_tx_wqe_inline_mode().
Fixes:
e5a1861a298e ("net/mlx5e: Implement PSP Tx data path")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/1760511923-890650-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 15 Oct 2025 06:32:21 +0000 (06:32 +0000)]
net: gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb()
Some network drivers assume this field is zero after napi_get_frags().
We must clear it in napi_reuse_skb() otherwise the following can happen:
1) A packet is received, and skb_shinfo(skb)->hwtstamps is populated
because a bit in the receive descriptor announced hwtstamp
availability for this packet.
2) Packet is given to gro layer via napi_gro_frags().
3) Packet is merged to a prior one held in GRO queues.
4) skb is saved after some cleanup in napi->skb via a call
to napi_reuse_skb().
5) Next packet is received 10 seconds later, gets the recycled skb
from napi_get_frags().
6) The receive descriptor does not announce hwtstamp availability.
Driver does not clear shinfo->hwtstamps.
7) We have in shinfo->hwtstamps an old timestamp.
Fixes:
ac45f602ee3d ("net: infrastructure for hardware time stamping")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20251015063221.4171986-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Nathan Chancellor [Tue, 14 Oct 2025 20:46:49 +0000 (13:46 -0700)]
net/mlx5e: Return 1 instead of 0 in invalid case in mlx5e_mpwrq_umr_entry_size()
When building with Clang 20 or newer, there are some objtool warnings
from unexpected fallthroughs to other functions:
vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries()
vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom()
LLVM 20 contains an (admittedly problematic [1]) optimization [2] to
convert divide by zero into the equivalent of __builtin_unreachable(),
which invokes undefined behavior and destroys code generation when it is
encountered in a control flow graph.
mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an
unrecognized mlx5e_mpwrq_umr_mode value. mlx5e_mpwrq_mtts_per_wqe(),
which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of
mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for
zero, so LLVM is able to infer there will be a divide by zero in this
case and invokes undefined behavior. While there is some proposed work
to isolate this undefined behavior and avoid the destructive code
generation that results in these objtool warnings, code should still be
defensive against divide by zero.
As the WARN_ONCE() implies that an invalid value should be handled
gracefully, return 1 instead of 0 in the default case so that the
results of this division operation is always valid.
Fixes:
168723c1f8d6 ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters")
Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@mail.gmail.com/
Link: https://github.com/llvm/llvm-project/commit/37932643abab699e8bb1def08b7eb4eae7ff1448
Closes: https://github.com/ClangBuiltLinux/linux/issues/2131
Closes: https://github.com/ClangBuiltLinux/linux/issues/2132
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_entry_size-v1-1-dc186b8819ef@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michal Pecio [Tue, 14 Oct 2025 18:35:28 +0000 (20:35 +0200)]
net: usb: rtl8150: Fix frame padding
TX frames aren't padded and unknown memory is sent into the ether.
Theoretically, it isn't even guaranteed that the extra memory exists
and can be sent out, which could cause further problems. In practice,
I found that plenty of tailroom exists in the skb itself (in my test
with ping at least) and skb_padto() easily succeeds, so use it here.
In the event of -ENOMEM drop the frame like other drivers do.
The use of one more padding byte instead of a USB zero-length packet
is retained to avoid regression. I have a dodgy Etron xHCI controller
which doesn't seem to support sending ZLPs at all.
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Michal Pecio <michal.pecio@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251014203528.3f9783c4.michal.pecio@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 16 Oct 2025 16:41:21 +0000 (09:41 -0700)]
Merge tag 'net-6.18-rc2' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from CAN
Current release - regressions:
- udp: do not use skb_release_head_state() before
skb_attempt_defer_free()
- gro_cells: use nested-BH locking for gro_cell
- dpll: zl3073x: increase maximum size of flash utility
Previous releases - regressions:
- core: fix lockdep splat on device unregister
- tcp: fix tcp_tso_should_defer() vs large RTT
- tls:
- don't rely on tx_work during send()
- wait for pending async decryptions if tls_strp_msg_hold fails
- can: j1939: add missing calls in NETDEV_UNREGISTER notification
handler
- eth: lan78xx: fix lost EEPROM write timeout in
lan78xx_write_raw_eeprom
Previous releases - always broken:
- ip6_tunnel: prevent perpetual tunnel growth
- dpll: zl3073x: handle missing or corrupted flash configuration
- can: m_can: fix pm_runtime and CAN state handling
- eth:
- ixgbe: fix too early devlink_free() in ixgbe_remove()
- ixgbevf: fix mailbox API compatibility
- gve: Check valid ts bit on RX descriptor before hw timestamping
- idpf: cleanup remaining SKBs in PTP flows
- r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H"
* tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
udp: do not use skb_release_head_state() before skb_attempt_defer_free()
net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset
netdevsim: set the carrier when the device goes up
selftests: tls: add test for short splice due to full skmsg
selftests: net: tls: add tests for cmsg vs MSG_MORE
tls: don't rely on tx_work during send()
tls: wait for pending async decryptions if tls_strp_msg_hold fails
tls: always set record_type in tls_process_cmsg
tls: wait for async encrypt in case of error during latter iterations of sendmsg
tls: trim encrypted message to match the plaintext on short splice
tg3: prevent use of uninitialized remote_adv and local_adv variables
MAINTAINERS: new entry for IPv6 IOAM
gve: Check valid ts bit on RX descriptor before hw timestamping
net: core: fix lockdep splat on device unregister
MAINTAINERS: add myself as maintainer for b53
selftests: net: check jq command is supported
net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit()
tcp: fix tcp_tso_should_defer() vs large RTT
r8152: add error handling in rtl8152_driver_init
usbnet: Fix using smp_processor_id() in preemptible code warnings
...
Linus Torvalds [Thu, 16 Oct 2025 16:39:29 +0000 (09:39 -0700)]
Merge tag 'ata-6.18-rc2' of git://git./linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
- Do not print an error message (and assume that the General Purpose
Log Directory log page is not supported) for a device that reports a
bogus General Purpose Logging Version.
Unsurprisingly, many vendors fail to report the only valid General
Purpose Logging Version (Damien)
* tag 'ata-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: relax checks in ata_read_log_directory()
Eric Dumazet [Wed, 15 Oct 2025 05:27:15 +0000 (05:27 +0000)]
udp: do not use skb_release_head_state() before skb_attempt_defer_free()
Michal reported and bisected an issue after recent adoption
of skb_attempt_defer_free() in UDP.
The issue here is that skb_release_head_state() is called twice per skb,
one time from skb_consume_udp(), then a second time from skb_defer_free_flush()
and napi_consume_skb().
As Sabrina suggested, remove skb_release_head_state() call from
skb_consume_udp().
Add a DEBUG_NET_WARN_ON_ONCE(skb_nfct(skb)) in skb_attempt_defer_free()
Many thanks to Michal, Sabrina, Paolo and Florian for their help.
Fixes:
6471658dc66c ("udp: use skb_attempt_defer_free()")
Reported-and-bisected-by: Michal Kubecek <mkubecek@suse.cz>
Closes: https://lore.kernel.org/netdev/gpjh4lrotyephiqpuldtxxizrsg6job7cvhiqrw72saz2ubs3h@g6fgbvexgl3r/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: Florian Westphal <fw@strlen.de>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/20251015052715.4140493-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
I Viswanath [Mon, 13 Oct 2025 18:16:48 +0000 (23:46 +0530)]
net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset
dev->chipid is used in lan78xx_init_mac_address before it's initialized:
lan78xx_reset() {
lan78xx_init_mac_address()
lan78xx_read_eeprom()
lan78xx_read_raw_eeprom() <- dev->chipid is used here
dev->chipid = ... <- dev->chipid is initialized correctly here
}
Reorder initialization so that dev->chipid is set before calling
lan78xx_init_mac_address().
Fixes:
a0db7d10b76e ("lan78xx: Add to handle mux control per chip id")
Signed-off-by: I Viswanath <viswanathiyyappan@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Khalid Aziz <khalid@kernel.org>
Link: https://patch.msgid.link/20251013181648.35153-1-viswanathiyyappan@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 16 Oct 2025 00:56:20 +0000 (17:56 -0700)]
Merge tag 'linux-can-fixes-for-6.18-
20251014' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-10-14
The first 2 paches are by Celeste Liu and target the gS_usb driver.
The first patch remove the limitation to 3 CAN interface per USB
device. The second patch adds the missing population of
net_device->dev_port.
The next 4 patches are by me and fix the m_can driver. They add a
missing pm_runtime_disable(), fix the CAN state transition back to
Error Active and fix the state after ifup and suspend/resume.
Another patch by me targets the m_can driver, too and replaces Dong
Aisheng's old email address.
The next 2 patches are by Vincent Mailhol and update the CAN
networking Documentation.
Tetsuo Handa contributes the last patch that add missing cleanup calls
in the NETDEV_UNREGISTER notification handler.
* tag 'linux-can-fixes-for-6.18-
20251014' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: j1939: add missing calls in NETDEV_UNREGISTER notification handler
can: add Transmitter Delay Compensation (TDC) documentation
can: remove false statement about 1:1 mapping between DLC and length
can: m_can: replace Dong Aisheng's old email address
can: m_can: fix CAN state in system PM
can: m_can: m_can_chip_config(): bring up interface in correct state
can: m_can: m_can_handle_state_errors(): fix CAN state transition to Error Active
can: m_can: m_can_plat_remove(): add missing pm_runtime_disable()
can: gs_usb: gs_make_candev(): populate net_device->dev_port
can: gs_usb: increase max interface to U8_MAX
====================
Link: https://patch.msgid.link/20251014122140.990472-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Breno Leitao [Tue, 14 Oct 2025 09:17:25 +0000 (02:17 -0700)]
netdevsim: set the carrier when the device goes up
Bringing a linked netdevsim device down and then up causes communication
failure because both interfaces lack carrier. Basically a ifdown/ifup on
the interface make the link broken.
Commit
3762ec05a9fbda ("netdevsim: add NAPI support") added supported
for NAPI, calling netif_carrier_off() in nsim_stop(). This patch
re-enables the carrier symmetrically on nsim_open(), in case the device
is linked and the peer is up.
Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes:
3762ec05a9fbda ("netdevsim: add NAPI support")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20251014-netdevsim_fix-v2-1-53b40590dae1@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 16 Oct 2025 00:41:47 +0000 (17:41 -0700)]
Merge branch 'tls-misc-bugfixes'
Sabrina Dubroca says:
====================
tls: misc bugfixes
Jann Horn reported multiple bugs in kTLS. This series addresses them,
and adds some corresponding selftests for those that are reproducible
(and without failure injection).
====================
Link: https://patch.msgid.link/cover.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Tue, 14 Oct 2025 09:17:02 +0000 (11:17 +0200)]
selftests: tls: add test for short splice due to full skmsg
We don't have a test triggering a partial splice caused by a full
skmsg. Add one, based on a program by Jann Horn.
Use MAX_FRAGS=48 to make sure the skmsg will be full for any allowed
value of CONFIG_MAX_SKB_FRAGS (17..45).
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/1d129a15f526ea3602f3a2b368aa0b6f7e0d35d5.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Tue, 14 Oct 2025 09:17:01 +0000 (11:17 +0200)]
selftests: net: tls: add tests for cmsg vs MSG_MORE
We don't have a test to check that MSG_MORE won't let us merge records
of different types across sendmsg calls.
Add new tests that check:
- MSG_MORE is only allowed for DATA records
- a pending DATA record gets closed and pushed before a non-DATA
record is processed
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/b34feeadefe8a997f068d5ed5617afd0072df3c0.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Tue, 14 Oct 2025 09:17:00 +0000 (11:17 +0200)]
tls: don't rely on tx_work during send()
With async crypto, we rely on tx_work to actually transmit records
once encryption completes. But while send() is running, both the
tx_lock and socket lock are held, so tx_work_handler cannot process
the queue of encrypted records, and simply reschedules itself. During
a large send(), this could last a long time, and use a lot of memory.
Transmit any pending encrypted records before restarting the main
loop of tls_sw_sendmsg_locked.
Fixes:
a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/8396631478f70454b44afb98352237d33f48d34d.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Tue, 14 Oct 2025 09:16:59 +0000 (11:16 +0200)]
tls: wait for pending async decryptions if tls_strp_msg_hold fails
Async decryption calls tls_strp_msg_hold to create a clone of the
input skb to hold references to the memory it uses. If we fail to
allocate that clone, proceeding with async decryption can lead to
various issues (UAF on the skb, writing into userspace memory after
the recv() call has returned).
In this case, wait for all pending decryption requests.
Fixes:
84c61fe1a75b ("tls: rx: do not use the standard strparser")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/b9fe61dcc07dab15da9b35cf4c7d86382a98caf2.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Tue, 14 Oct 2025 09:16:58 +0000 (11:16 +0200)]
tls: always set record_type in tls_process_cmsg
When userspace wants to send a non-DATA record (via the
TLS_SET_RECORD_TYPE cmsg), we need to send any pending data from a
previous MSG_MORE send() as a separate DATA record. If that DATA record
is encrypted asynchronously, tls_handle_open_record will return
-EINPROGRESS. This is currently treated as an error by
tls_process_cmsg, and it will skip setting record_type to the correct
value, but the caller (tls_sw_sendmsg_locked) handles that return
value correctly and proceeds with sending the new message with an
incorrect record_type (DATA instead of whatever was requested in the
cmsg).
Always set record_type before handling the open record. If
tls_handle_open_record returns an error, record_type will be
ignored. If it succeeds, whether with synchronous crypto (returning 0)
or asynchronous (returning -EINPROGRESS), the caller will proceed
correctly.
Fixes:
a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/0457252e578a10a94e40c72ba6288b3a64f31662.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Tue, 14 Oct 2025 09:16:57 +0000 (11:16 +0200)]
tls: wait for async encrypt in case of error during latter iterations of sendmsg
If we hit an error during the main loop of tls_sw_sendmsg_locked (eg
failed allocation), we jump to send_end and immediately
return. Previous iterations may have queued async encryption requests
that are still pending. We should wait for those before returning, as
we could otherwise be reading from memory that userspace believes
we're not using anymore, which would be a sort of use-after-free.
This is similar to what tls_sw_recvmsg already does: failures during
the main loop jump to the "wait for async" code, not straight to the
unlock/return.
Fixes:
a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/c793efe9673b87f808d84fdefc0f732217030c52.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Tue, 14 Oct 2025 09:16:56 +0000 (11:16 +0200)]
tls: trim encrypted message to match the plaintext on short splice
During tls_sw_sendmsg_locked, we pre-allocate the encrypted message
for the size we're expecting to send during the current iteration, but
we may end up sending less, for example when splicing: if we're
getting the data from small fragments of memory, we may fill up all
the slots in the skmsg with less data than expected.
In this case, we need to trim the encrypted message to only the length
we actually need, to avoid pushing uninitialized bytes down the
underlying TCP socket.
Fixes:
fe1e81d4f73b ("tls/sw: Support MSG_SPLICE_PAGES")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/66a0ae99c9efc15f88e9e56c1f58f902f442ce86.1760432043.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexey Simakov [Tue, 14 Oct 2025 16:47:38 +0000 (19:47 +0300)]
tg3: prevent use of uninitialized remote_adv and local_adv variables
Some execution paths that jump to the fiber_setup_done label
could leave the remote_adv and local_adv variables uninitialized
and then use it.
Initialize this variables at the point of definition to avoid this.
Fixes:
85730a631f0c ("tg3: Add SGMII phy support for 5719/5718 serdes")
Co-developed-by: Alexandr Sapozhnikov <alsp705@gmail.com>
Signed-off-by: Alexandr Sapozhnikov <alsp705@gmail.com>
Signed-off-by: Alexey Simakov <bigalex934@gmail.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://patch.msgid.link/20251014164736.5890-1-bigalex934@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Iurman [Tue, 14 Oct 2025 17:06:50 +0000 (19:06 +0200)]
MAINTAINERS: new entry for IPv6 IOAM
Create a maintainer entry for IPv6 IOAM. Add myself as I authored most
if not all of the IPv6 IOAM code in the kernel and actively participate
in the related IETF groups.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Link: https://patch.msgid.link/20251014170650.27679-1-justin.iurman@uliege.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 15 Oct 2025 22:12:58 +0000 (15:12 -0700)]
Merge tag 'vfs-6.18-rc2.fixes' of git://git./linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Handle inode number mismatches in nsfs file handles
- Update the comment to init_file()
- Add documentation link for EBADF in the rust file code
- Skip read lock assertion for read-only filesystems when using dax
- Don't leak disconnected dentries during umount
- Fix new coredump input pattern validation
- Handle ENOIOCTLCMD conversion in vfs_fileattr_{g,s}et() correctly
- Remove redundant IOCB_DIO_CALLER_COMP clearing in overlayfs
* tag 'vfs-6.18-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ovl: remove redundant IOCB_DIO_CALLER_COMP clearing
fs: return EOPNOTSUPP from file_setattr/file_getattr syscalls
Revert "fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP"
coredump: fix core_pattern input validation
vfs: Don't leak disconnected dentries on umount
dax: skip read lock assertion for read-only filesystems
rust: file: add intra-doc link for 'EBADF'
fs: update comment in init_file()
nsfs: handle inode number mismatches gracefully in file handles
Tim Hostetler [Tue, 14 Oct 2025 00:47:39 +0000 (00:47 +0000)]
gve: Check valid ts bit on RX descriptor before hw timestamping
The device returns a valid bit in the LSB of the low timestamp byte in
the completion descriptor that the driver should check before
setting the SKB's hardware timestamp. If the timestamp is not valid, do not
hardware timestamp the SKB.
Cc: stable@vger.kernel.org
Fixes:
b2c7aeb49056 ("gve: Implement ndo_hwtstamp_get/set for RX timestamping")
Reviewed-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Tim Hostetler <thostet@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251014004740.2775957-1-hramamurthy@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 15 Oct 2025 15:48:17 +0000 (08:48 -0700)]
Merge tag 'bitmap-for-v6.18-rc2' of https://github.com/norov/linux
Pull bitmap fixes from Yury Norov:
"A "unnecessary `unsafe`" warning fix for bitmap/rust, and one leftover
patch for FIELD_PREP_WM16() conversion.
- rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning (Miguel)
- FIELD_PREP_WM16() rework leftover (Nicolas)"
* tag 'bitmap-for-v6.18-rc2' of https://github.com/norov/linux:
PM / devfreq: rockchip-dfi: switch to FIELD_PREP_WM16 macro
rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning
Linus Torvalds [Wed, 15 Oct 2025 14:57:28 +0000 (07:57 -0700)]
Remove long-stale ext3 defconfig option
Inspired by commit
c065b6046b34 ("Use CONFIG_EXT4_FS instead of
CONFIG_EXT3_FS in all of the defconfigs") I looked around for any other
left-over EXT3 config options, and found some old defconfig files still
mentioned CONFIG_EXT3_DEFAULTS_TO_ORDERED.
That config option was removed a decade ago in commit
c290ea01abb7 ("fs:
Remove ext3 filesystem driver"). It had a good run, but let's remove it
for good.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 15 Oct 2025 14:51:57 +0000 (07:51 -0700)]
Merge tag 'ext4_for_linus-6.18-rc2' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 bug fixes from Ted Ts'o:
- Fix regression caused by removing CONFIG_EXT3_FS when testing some
very old defconfigs
- Avoid a BUG_ON when opening a file on a maliciously corrupted file
system
- Avoid mm warnings when freeing a very large orphan file metadata
- Avoid a theoretical races between metadata writeback and checkpoints
(it's very hard to hit in practice, since the race requires that the
writeback take a very long time)
* tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs
ext4: free orphan info with kvfree
ext4: detect invalid INLINE_DATA + EXTENTS flag combination
ext4, doc: fix and improve directory hash tree description
ext4: wait for ongoing I/O to complete before freeing blocks
jbd2: ensure that all ongoing I/O complete before freeing blocks
Nicolas Frattaroli [Mon, 13 Oct 2025 07:34:04 +0000 (09:34 +0200)]
PM / devfreq: rockchip-dfi: switch to FIELD_PREP_WM16 macro
The era of hand-rolled HIWORD_UPDATE macros is over, at least for those
drivers that use constant masks.
Like many other Rockchip drivers, rockchip-dfi brings with it its own
HIWORD_UPDATE macro. This variant doesn't shift the value (and like the
others, doesn't do any checking).
Remove it, and replace instances of it with hw_bitfield.h's
FIELD_PREP_WM16. Since FIELD_PREP_WM16 requires contiguous masks and
shifts the value for us, some reshuffling of definitions needs to
happen.
This gives us better compile-time error checking, and in my opinion,
nicer code.
Tested on an RK3568 ODROID-M1 board (LPDDR4X at 1560 MHz, an RK3588
Radxa ROCK 5B board (LPDDR4X at 2112 MHz) and an RK3588 Radxa ROCK 5T
board (LPDDR5 at 2400 MHz). perf measurements were consistent with the
measurements of stress-ng --stream in all cases.
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Miguel Ojeda [Mon, 13 Oct 2025 00:14:22 +0000 (02:14 +0200)]
rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning
Starting with Rust 1.92.0 (expected 2025-12-11), Rust allows to safely
take the address of a union field [1][2]:
CLIPPY L rust/kernel.o
error: unnecessary `unsafe` block
--> rust/kernel/bitmap.rs:169:13
|
169 | unsafe { core::ptr::addr_of!(self.repr.bitmap) }
| ^^^^^^ unnecessary `unsafe` block
|
= note: `-D unused-unsafe` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(unused_unsafe)]`
error: unnecessary `unsafe` block
--> rust/kernel/bitmap.rs:185:13
|
185 | unsafe { core::ptr::addr_of_mut!(self.repr.bitmap) }
| ^^^^^^ unnecessary `unsafe` block
Thus allow both instances to clean the warning in newer compilers.
Link: https://github.com/rust-lang/rust/issues/141264
Link: https://github.com/rust-lang/rust/pull/141469
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Florian Westphal [Mon, 13 Oct 2025 18:50:52 +0000 (20:50 +0200)]
net: core: fix lockdep splat on device unregister
Since blamed commit, unregister_netdevice_many_notify() takes the netdev
mutex if the device needs it.
If the device list is too long, this will lock more device mutexes than
lockdep can handle:
unshare -n \
bash -c 'for i in $(seq 1 100);do ip link add foo$i type dummy;done'
BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48 max: 48!
48 locks held by kworker/u16:1/69:
#0: ..148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work
#1: ..d40 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work
#2: ..bd0 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net
#3: ..aa8 (rtnl_mutex){+.+.}-{4:4}, at: default_device_exit_batch
#4: ..cb0 (&dev_instance_lock_key#3){+.+.}-{4:4}, at: unregister_netdevice_many_notify
[..]
Add a helper to close and then unlock a list of net_devices.
Devices that are not up have to be skipped - netif_close_many always
removes them from the list without any other actions taken, so they'd
remain in locked state.
Close devices whenever we've used up half of the tracking slots or we
processed entire list without hitting the limit.
Fixes:
7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20251013185052.14021-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonas Gorski [Mon, 13 Oct 2025 18:03:47 +0000 (20:03 +0200)]
MAINTAINERS: add myself as maintainer for b53
I wrote the original OpenWrt driver that Florian used as the base for
the dsa driver, I might as well take responsibility for it.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20251013180347.133246-1-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Tue, 14 Oct 2025 16:28:12 +0000 (09:28 -0700)]
Merge tag 'nfsd-6.18-1' of git://git./linux/kernel/git/cel/linux
Pull nfsd fix from Chuck Lever:
- Fix a crasher reported by rtm@csail.mit.edu
* tag 'nfsd-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Define a proc_layoutcommit for the FlexFiles layout type
Linus Torvalds [Tue, 14 Oct 2025 16:15:45 +0000 (09:15 -0700)]
Merge tag 'for-linus-6.18-2' of https://github.com/cminyard/linux-ipmi
Pull IPMI fixes from Corey Minyard:
"A few bug fixes for patches that went in this release: a refcount
error and some missing or incorrect error checks"
* tag 'for-linus-6.18-2' of https://github.com/cminyard/linux-ipmi:
ipmi: Fix handling of messages with provided receive message pointer
mfd: ls2kbmc: check for devm_mfd_add_devices() failure
mfd: ls2kbmc: Fix an IS_ERR() vs NULL check in probe()
Wang Liang [Mon, 13 Oct 2025 08:00:39 +0000 (16:00 +0800)]
selftests: net: check jq command is supported
The jq command is used in vlan_bridge_binding.sh, if it is not supported,
the test will spam the following log.
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# ./vlan_bridge_binding.sh: line 51: jq: command not found
# TEST: Test bridge_binding on->off when lower down [FAIL]
# Got operstate of , expected 0
The rtnetlink.sh has the same problem. It makes sense to check if jq is
installed before running these tests. After this patch, the
vlan_bridge_binding.sh skipped if jq is not supported:
# timeout set to 3600
# selftests: net: vlan_bridge_binding.sh
# TEST: jq not installed [SKIP]
Fixes:
dca12e9ab760 ("selftests: net: Add a VLAN bridge binding selftest")
Fixes:
6a414fd77f61 ("selftests: rtnetlink: Add an address proto test")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251013080039.3035898-1-wangliang74@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lorenzo Bianconi [Sun, 12 Oct 2025 09:19:44 +0000 (11:19 +0200)]
net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit()
Completion napi can free out-of-order tx descriptors if hw QoS is
enabled and packets with different priority are queued to same DMA ring.
Take into account possible out-of-order reports checking if the tx queue
is full using circular buffer head/tail pointer instead of the number of
queued packets.
Fixes:
23020f0493270 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Suggested-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251012-airoha-tx-busy-queue-v2-1-a600b08bab2d@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Eric Dumazet [Sat, 11 Oct 2025 11:57:42 +0000 (11:57 +0000)]
tcp: fix tcp_tso_should_defer() vs large RTT
Neal reported that using neper tcp_stream with TCP_TX_DELAY
set to 50ms would often lead to flows stuck in a small cwnd mode,
regardless of the congestion control.
While tcp_stream sets TCP_TX_DELAY too late after the connect(),
it highlighted two kernel bugs.
The following heuristic in tcp_tso_should_defer() seems wrong
for large RTT:
delta = tp->tcp_clock_cache - head->tstamp;
/* If next ACK is likely to come too late (half srtt), do not defer */
if ((s64)(delta - (u64)NSEC_PER_USEC * (tp->srtt_us >> 4)) < 0)
goto send_now;
If next ACK is expected to come in more than 1 ms, we should
not defer because we prefer a smooth ACK clocking.
While blamed commit was a step in the good direction, it was not
generic enough.
Another patch fixing TCP_TX_DELAY for established flows
will be proposed when net-next reopens.
Fixes:
50c8339e9299 ("tcp: tso: restore IW10 after TSO autosizing")
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20251011115742.1245771-1-edumazet@google.com
[pabeni@redhat.com: fixed whitespace issue]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Yi Cong [Sat, 11 Oct 2025 08:24:15 +0000 (16:24 +0800)]
r8152: add error handling in rtl8152_driver_init
rtl8152_driver_init() is missing the error handling.
When rtl8152_driver registration fails, rtl8152_cfgselector_driver
should be deregistered.
Fixes:
ec51fbd1b8a2 ("r8152: add USB device driver for config selection")
Cc: stable@vger.kernel.org
Signed-off-by: Yi Cong <yicong@kylinos.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251011082415.580740-1-yicongsrfy@163.com
[pabeni@redhat.com: clarified the commit message]
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Zqiang [Sat, 11 Oct 2025 07:05:18 +0000 (15:05 +0800)]
usbnet: Fix using smp_processor_id() in preemptible code warnings
Syzbot reported the following warning:
BUG: using smp_processor_id() in preemptible [
00000000] code: dhcpcd/2879
caller is usbnet_skb_return+0x74/0x490 drivers/net/usb/usbnet.c:331
CPU: 1 UID: 0 PID: 2879 Comm: dhcpcd Not tainted
6.15.0-rc4-syzkaller-00098-g615dca38c2ea #0 PREEMPT(voluntary)
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120
check_preemption_disabled+0xd0/0xe0 lib/smp_processor_id.c:49
usbnet_skb_return+0x74/0x490 drivers/net/usb/usbnet.c:331
usbnet_resume_rx+0x4b/0x170 drivers/net/usb/usbnet.c:708
usbnet_change_mtu+0x1be/0x220 drivers/net/usb/usbnet.c:417
__dev_set_mtu net/core/dev.c:9443 [inline]
netif_set_mtu_ext+0x369/0x5c0 net/core/dev.c:9496
netif_set_mtu+0xb0/0x160 net/core/dev.c:9520
dev_set_mtu+0xae/0x170 net/core/dev_api.c:247
dev_ifsioc+0xa31/0x18d0 net/core/dev_ioctl.c:572
dev_ioctl+0x223/0x10e0 net/core/dev_ioctl.c:821
sock_do_ioctl+0x19d/0x280 net/socket.c:1204
sock_ioctl+0x42f/0x6a0 net/socket.c:1311
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:906 [inline]
__se_sys_ioctl fs/ioctl.c:892 [inline]
__x64_sys_ioctl+0x190/0x200 fs/ioctl.c:892
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0x260 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
For historical and portability reasons, the netif_rx() is usually
run in the softirq or interrupt context, this commit therefore add
local_bh_disable/enable() protection in the usbnet_resume_rx().
Fixes:
43daa96b166c ("usbnet: Stop RX Q on MTU change")
Link: https://syzkaller.appspot.com/bug?id=81f55dfa587ee544baaaa5a359a060512228c1e1
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Link: https://patch.msgid.link/20251011070518.7095-1-qiang.zhang@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Harshit Mogalapalli [Fri, 10 Oct 2025 20:42:39 +0000 (13:42 -0700)]
Octeontx2-af: Fix missing error code in cgx_probe()
When CGX fails mapping to NIX, set the error code to -ENODEV, currently
err is zero and that is treated as success path.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/aLAdlCg2_Yv7Y-3h@stanley.mountain/
Fixes:
d280233fc866 ("Octeontx2-af: Fix NIX X2P calibration failures")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251010204239.94237-1-harshit.m.mogalapalli@oracle.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Raju Rangoju [Fri, 10 Oct 2025 06:51:42 +0000 (12:21 +0530)]
amd-xgbe: Avoid spurious link down messages during interface toggle
During interface toggle operations (ifdown/ifup), the driver currently
resets the local helper variable 'phy_link' to -1. This causes the link
state machine to incorrectly interpret the state as a link change event,
resulting in spurious "Link is down" messages being logged when the
interface is brought back up.
Preserve the phy_link state across interface toggles to avoid treating
the -1 sentinel value as a legitimate link state transition.
Fixes:
88131a812b16 ("amd-xgbe: Perform phy connect/disconnect at dev open/stop")
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Link: https://patch.msgid.link/20251010065142.1189310-1-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Theodore Ts'o [Tue, 14 Oct 2025 01:50:40 +0000 (21:50 -0400)]
Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs
Commit
d6ace46c82fd ("ext4: remove obsolete EXT3 config options")
removed the obsolete EXT3_CONFIG options, since it had been over a
decade since fs/ext3 had been removed. Unfortunately, there were a
number of defconfigs that still used CONFIG_EXT3_FS which the cleanup
commit didn't fix up. This led to a large number of defconfig test
builds to fail. Oops.
Fixes:
d6ace46c82fd ("ext4: remove obsolete EXT3 config options")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Marek Vasut [Sat, 11 Oct 2025 11:02:49 +0000 (13:02 +0200)]
net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present
The driver is currently checking for PHYCR2 register presence in
rtl8211f_config_init(), but it does so after accessing PHYCR2 to
disable EEE. This was introduced in commit
bfc17c165835 ("net:
phy: realtek: disable PHY-mode EEE"). Move the PHYCR2 presence
test before the EEE disablement and simplify the code.
Fixes:
bfc17c165835 ("net: phy: realtek: disable PHY-mode EEE")
Signed-off-by: Marek Vasut <marek.vasut@mailbox.org>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251011110309.12664-1-marek.vasut@mailbox.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 14 Oct 2025 00:44:51 +0000 (17:44 -0700)]
Merge branch 'intel-wired-lan-driver-updates-2025-10-01-idpf-ixgbe-ixgbevf'
Jacob Keller says:
====================
Intel Wired LAN Driver Updates 2025-10-01 (idpf, ixgbe, ixgbevf)
For idpf:
Milena fixes a memory leak in the idpf reset logic when the driver resets
with an outstanding Tx timestamp.
For ixgbe and ixgbevf:
Jedrzej fixes an issue with reporting link speed on E610 VFs.
Jedrzej also fixes the VF mailbox API incompatibilities caused by the
confusion with API v1.4, v1.5, and v1.6. The v1.4 API introduced IPSEC
offload, but this was only supported on Linux hosts. The v1.5 API
introduced a new mailbox API which is necessary to resolve issues on ESX
hosts. The v1.6 API introduced a new link management API for E610. Jedrzej
introduces a new v1.7 API with a feature negotiation which enables properly
checking if features such as IPSEC or the ESX mailbox APIs are supported.
This resolves issues with compatibility on different hosts, and aligns the
API across hosts instead of having Linux require custom mailbox API
versions for IPSEC offload.
Koichiro fixes a KASAN use-after-free bug in ixgbe_remove().
====================
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-0-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Koichiro Den [Fri, 10 Oct 2025 00:03:51 +0000 (17:03 -0700)]
ixgbe: fix too early devlink_free() in ixgbe_remove()
Since ixgbe_adapter is embedded in devlink, calling devlink_free()
prematurely in the ixgbe_remove() path can lead to UAF. Move devlink_free()
to the end.
KASAN report:
BUG: KASAN: use-after-free in ixgbe_reset_interrupt_capability+0x140/0x180 [ixgbe]
Read of size 8 at addr
ffff0000adf813e0 by task bash/2095
CPU: 1 UID: 0 PID: 2095 Comm: bash Tainted: G S 6.17.0-rc2-tnguy.net-queue+ #1 PREEMPT(full)
[...]
Call trace:
show_stack+0x30/0x90 (C)
dump_stack_lvl+0x9c/0xd0
print_address_description.constprop.0+0x90/0x310
print_report+0x104/0x1f0
kasan_report+0x88/0x180
__asan_report_load8_noabort+0x20/0x30
ixgbe_reset_interrupt_capability+0x140/0x180 [ixgbe]
ixgbe_clear_interrupt_scheme+0xf8/0x130 [ixgbe]
ixgbe_remove+0x2d0/0x8c0 [ixgbe]
pci_device_remove+0xa0/0x220
device_remove+0xb8/0x170
device_release_driver_internal+0x318/0x490
device_driver_detach+0x40/0x68
unbind_store+0xec/0x118
drv_attr_store+0x64/0xb8
sysfs_kf_write+0xcc/0x138
kernfs_fop_write_iter+0x294/0x440
new_sync_write+0x1fc/0x588
vfs_write+0x480/0x6a0
ksys_write+0xf0/0x1e0
__arm64_sys_write+0x70/0xc0
invoke_syscall.constprop.0+0xcc/0x280
el0_svc_common.constprop.0+0xa8/0x248
do_el0_svc+0x44/0x68
el0_svc+0x54/0x160
el0t_64_sync_handler+0xa0/0xe8
el0t_64_sync+0x1b0/0x1b8
Fixes:
a0285236ab93 ("ixgbe: add initial devlink support")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Tested-by: Rinitha S <sx.rinitha@intel.com>
Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-6-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jedrzej Jagielski [Fri, 10 Oct 2025 00:03:50 +0000 (17:03 -0700)]
ixgbe: handle IXGBE_VF_FEATURES_NEGOTIATE mbox cmd
Send to VF information about features supported by the PF driver.
Increase API version to 1.7.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-5-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jedrzej Jagielski [Fri, 10 Oct 2025 00:03:49 +0000 (17:03 -0700)]
ixgbevf: fix mailbox API compatibility by negotiating supported features
There was backward compatibility in the terms of mailbox API. Various
drivers from various OSes supporting 10G adapters from Intel portfolio
could easily negotiate mailbox API.
This convention has been broken since introducing API 1.4.
Commit
0062e7cc955e ("ixgbevf: add VF IPsec offload code") added support
for IPSec which is specific only for the kernel ixgbe driver. None of the
rest of the Intel 10G PF/VF drivers supports it. And actually lack of
support was not included in the IPSec implementation - there were no such
code paths. No possibility to negotiate support for the feature was
introduced along with introduction of the feature itself.
Commit
339f28964147 ("ixgbevf: Add support for new mailbox communication
between PF and VF") increasing API version to 1.5 did the same - it
introduced code supported specifically by the PF ESX driver. It altered API
version for the VF driver in the same time not touching the version
defined for the PF ixgbe driver. It led to additional discrepancies,
as the code provided within API 1.6 cannot be supported for Linux ixgbe
driver as it causes crashes.
The issue was noticed some time ago and mitigated by Jake within the commit
d0725312adf5 ("ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5").
As a result we have regression for IPsec support and after increasing API
to version 1.6 ixgbevf driver stopped to support ESX MBX.
To fix this mess add new mailbox op asking PF driver about supported
features. Basing on a response determine whether to set support for IPSec
and ESX-specific enhanced mailbox.
New mailbox op, for compatibility purposes, must be added within new API
revision, as API version of OOT PF & VF drivers is already increased to
1.6 and doesn't incorporate features negotiate op.
Features negotiation mechanism gives possibility to be extended with new
features when needed in the future.
Reported-by: Jacob Keller <jacob.e.keller@intel.com>
Closes: https://lore.kernel.org/intel-wired-lan/
20241101-jk-ixgbevf-mailbox-v1-5-fixes-v1-0-
f556dc9a66ed@intel.com/
Fixes:
0062e7cc955e ("ixgbevf: add VF IPsec offload code")
Fixes:
339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-4-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jedrzej Jagielski [Fri, 10 Oct 2025 00:03:48 +0000 (17:03 -0700)]
ixgbe: handle IXGBE_VF_GET_PF_LINK_STATE mailbox operation
Update supported API version and provide handler for
IXGBE_VF_GET_PF_LINK_STATE cmd.
Simply put stored values of link speed and link_up from adapter context.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Link: https://lore.kernel.org/stable/20250828095227.1857066-3-jedrzej.jagielski%40intel.com
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-3-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jedrzej Jagielski [Fri, 10 Oct 2025 00:03:47 +0000 (17:03 -0700)]
ixgbevf: fix getting link speed data for E610 devices
E610 adapters no longer use the VFLINKS register to read PF's link
speed and linkup state. As a result VF driver cannot get actual link
state and it incorrectly reports 10G which is the default option.
It leads to a situation where even 1G adapters print 10G as actual
link speed. The same happens when PF driver set speed different than 10G.
Add new mailbox operation to let the VF driver request a PF driver
to provide actual link data. Update the mailbox api to v1.6.
Incorporate both ways of getting link status within the legacy
ixgbe_check_mac_link_vf() function.
Fixes:
4c44b450c69b ("ixgbevf: Add support for Intel(R) E610 device")
Co-developed-by: Andrzej Wilczynski <andrzejx.wilczynski@intel.com>
Signed-off-by: Andrzej Wilczynski <andrzejx.wilczynski@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-2-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Milena Olech [Fri, 10 Oct 2025 00:03:46 +0000 (17:03 -0700)]
idpf: cleanup remaining SKBs in PTP flows
When the driver requests Tx timestamp value, one of the first steps is
to clone SKB using skb_get. It increases the reference counter for that
SKB to prevent unexpected freeing by another component.
However, there may be a case where the index is requested, SKB is
assigned and never consumed by PTP flows - for example due to reset during
running PTP apps.
Add a check in release timestamping function to verify if the SKB
assigned to Tx timestamp latch was freed, and release remaining SKBs.
Fixes:
4901e83a94ef ("idpf: add Tx timestamp capabilities negotiation")
Signed-off-by: Milena Olech <milena.olech@intel.com>
Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-1-ef32a425b92a@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dmitry Safonov [Thu, 9 Oct 2025 15:02:19 +0000 (16:02 +0100)]
net/ip6_tunnel: Prevent perpetual tunnel growth
Similarly to ipv4 tunnel, ipv6 version updates dev->needed_headroom, too.
While ipv4 tunnel headroom adjustment growth was limited in
commit
5ae1e9922bbd ("net: ip_tunnel: prevent perpetual headroom growth"),
ipv6 tunnel yet increases the headroom without any ceiling.
Reflect ipv4 tunnel headroom adjustment limit on ipv6 version.
Credits to Francesco Ruggeri, who was originally debugging this issue
and wrote local Arista-specific patch and a reproducer.
Fixes:
8eb30be0352d ("ipv6: Create ip6_tnl_xmit")
Cc: Florian Westphal <fw@strlen.de>
Cc: Francesco Ruggeri <fruggeri05@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Link: https://patch.msgid.link/20251009-ip6_tunnel-headroom-v2-1-8e4dbd8f7e35@arista.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kamil Horák - 2N [Thu, 9 Oct 2025 13:06:56 +0000 (15:06 +0200)]
net: phy: bcm54811: Fix GMII/MII/MII-Lite selection
The Broadcom bcm54811 is hardware-strapped to select among RGMII and
GMII/MII/MII-Lite modes. However, the corresponding bit, RGMII Enable
in Miscellaneous Control Register must be also set to select desired
RGMII or MII(-lite)/GMII mode.
Fixes:
3117a11fff5af9e7 ("net: phy: bcm54811: PHY initialization")
Signed-off-by: Kamil Horák - 2N <kamilh@axis.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251009130656.1308237-2-kamilh@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linmao Li [Thu, 9 Oct 2025 12:25:49 +0000 (20:25 +0800)]
r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H
After resume from S4 (hibernate), RTL8168H/RTL8111H truncates incoming
packets. Packet captures show messages like "IP truncated-ip - 146 bytes
missing!".
The issue is caused by RxConfig not being properly re-initialized after
resume. Re-initializing the RxConfig register before the chip
re-initialization sequence avoids the truncation and restores correct
packet reception.
This follows the same pattern as commit
ef9da46ddef0 ("r8169: fix data
corruption issue on RTL8402").
Fixes:
6e1d0b898818 ("r8169:add support for RTL8168H and RTL8107E")
Signed-off-by: Linmao Li <lilinmao@kylinos.cn>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/20251009122549.3955845-1-lilinmao@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sebastian Andrzej Siewior [Thu, 9 Oct 2025 09:43:38 +0000 (11:43 +0200)]
net: gro_cells: Use nested-BH locking for gro_cell
The gro_cell data structure is per-CPU variable and relies on disabled
BH for its locking. Without per-CPU locking in local_bh_disable() on
PREEMPT_RT this data structure requires explicit locking.
Add a local_lock_t to the data structure and use
local_lock_nested_bh() for locking. This change adds only lockdep
coverage and does not alter the functional behaviour for !PREEMPT_RT.
Reported-by: syzbot+8715dd783e9b0bef43b1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
68c6c3b1.
050a0220.2ff435.0382.GAE@google.com/
Fixes:
3253cb49cbad ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20251009094338.j1jyKfjR@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ivan Vecera [Wed, 8 Oct 2025 14:14:45 +0000 (16:14 +0200)]
dpll: zl3073x: Handle missing or corrupted flash configuration
If the internal flash contains missing or corrupted configuration,
basic communication over the bus still functions, but the device
is not capable of normal operation (for example, using mailboxes).
This condition is indicated in the info register by the ready bit.
If this bit is cleared, the probe procedure times out while fetching
the device state.
Handle this case by checking the ready bit value in zl3073x_dev_start()
and skipping DPLL device and pin registration if it is cleared.
Do not report this condition as an error, allowing the devlink device
to be registered and enabling the user to flash the correct configuration.
Prior this patch:
[ 31.112299] zl3073x-i2c 1-0070: Failed to fetch input state: -ETIMEDOUT
[ 31.116332] zl3073x-i2c 1-0070: error -ETIMEDOUT: Failed to start device
[ 31.136881] zl3073x-i2c 1-0070: probe with driver zl3073x-i2c failed with error -110
After this patch:
[ 41.011438] zl3073x-i2c 1-0070: FW not fully ready - missing or corrupted config
Fixes:
75a71ecc24125 ("dpll: zl3073x: Register DPLL devices and pins")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251008141445.841113-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tetsuo Handa [Sat, 27 Sep 2025 12:11:16 +0000 (21:11 +0900)]
can: j1939: add missing calls in NETDEV_UNREGISTER notification handler
Currently NETDEV_UNREGISTER event handler is not calling
j1939_cancel_active_session() and j1939_sk_queue_drop_all().
This will result in these calls being skipped when j1939_sk_release() is
called. And I guess that the reason syzbot is still reporting
unregister_netdevice: waiting for vcan0 to become free. Usage count = 2
is caused by lack of these calls.
Calling j1939_cancel_active_session(priv, sk) from j1939_sk_release() can
be covered by calling j1939_cancel_active_session(priv, NULL) from
j1939_netdev_notify().
Calling j1939_sk_queue_drop_all() from j1939_sk_release() can be covered
by calling j1939_sk_netdev_event_netdown() from j1939_netdev_notify().
Therefore, we can reuse j1939_cancel_active_session(priv, NULL) and
j1939_sk_netdev_event_netdown(priv) for NETDEV_UNREGISTER event handler.
Fixes:
7fcbe5b2c6a4 ("can: j1939: implement NETDEV_UNREGISTER notification handler")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/3ad3c7f8-5a74-4b07-a193-cb0725823558@I-love.SAKURA.ne.jp
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Mon, 13 Oct 2025 19:26:08 +0000 (21:26 +0200)]
Merge patch series "can: add Transmitter Delay Compensation (TDC) documentation"
Vincent Mailhol <mailhol@kernel.org> says:
TDC was added to the kernel in 2021 but I never took time to update the
documentation. The year is now 2025... As we say: "better late than never"!
The first patch is a small clean up which fixes an incorrect statement
concerning the CAN DLC, the second patch is the real thing and adds the
documentation of how to use the ip tool to configure the TDC.
Link: https://patch.msgid.link/20251013-can-fd-doc-v2-0-5d53bdc8f2ad@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Vincent Mailhol [Mon, 13 Oct 2025 10:10:23 +0000 (19:10 +0900)]
can: add Transmitter Delay Compensation (TDC) documentation
Back in 2021, support for CAN TDC was added to the kernel in series [1]
and in iproute2 in series [2]. However, the documentation was never
updated.
Add a new sub-section under CAN-FD driver support to document how to
configure the TDC using the "ip tool".
[1] add the netlink interface for CAN-FD Transmitter Delay Compensation (TDC)
Link: https://lore.kernel.org/all/20210918095637.20108-1-mailhol.vincent@wanadoo.fr/
[2] iplink_can: cleaning, fixes and adding TDC support
Link: https://lore.kernel.org/all/20211103164428.692722-1-mailhol.vincent@wanadoo.fr/
Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20251013-can-fd-doc-v2-2-5d53bdc8f2ad@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Vincent Mailhol [Mon, 13 Oct 2025 10:10:22 +0000 (19:10 +0900)]
can: remove false statement about 1:1 mapping between DLC and length
The CAN-FD section of can.rst still states that there is a 1:1 mapping
between the Classical CAN DLC and its length. This is only true for
the DLC values up to 8. Beyond that point, the length remains at 8.
For reference, the mapping between the CAN DLC and the length is given
in below table [1]:
DLC value CBFF and CEFF FBFF and FEFF
[decimal] [byte] [byte]
----------------------------------------------
0 0 0
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
6 6 6
7 7 7
8 8 8
9 8 12
10 8 16
11 8 20
12 8 24
13 8 32
14 8 48
15 8 64
Remove the erroneous statement. Instead just state that the length of
a Classical CAN frame ranges from 0 to 8.
[1] ISO 11898-1:2024, Table 5 -- DLC: coding of the four LSB
Signed-off-by: Vincent Mailhol <mailhol@kernel.org>
Link: https://patch.msgid.link/20251013-can-fd-doc-v2-1-5d53bdc8f2ad@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Damien Le Moal [Thu, 9 Oct 2025 10:46:00 +0000 (19:46 +0900)]
ata: libata-core: relax checks in ata_read_log_directory()
Commit
6d4405b16d37 ("ata: libata-core: Cache the general purpose log
directory") introduced caching of a device general purpose log directory
to avoid repeated access to this log page during device scan. This
change also added a check on this log page to verify that the log page
version is 0x0001 as mandated by the ACS specifications.
And it turns out that some devices do not bother reporting this version,
instead reporting a version 0, resulting in error messages such as:
ata6.00: Invalid log directory version 0x0000
and to the device being marked as not supporting the general purpose log
directory log page.
Since before commit
6d4405b16d37 the log page version check did not
exist and things were still working correctly for these devices, relax
ata_read_log_directory() version check and only warn about the invalid
log page version number without disabling access to the log directory
page.
Fixes:
6d4405b16d37 ("ata: libata-core: Cache the general purpose log directory")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220635
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Nicolas Dichtel [Fri, 10 Oct 2025 14:18:59 +0000 (16:18 +0200)]
doc: fix seg6_flowlabel path
This sysctl is not per interface; it's global per netns.
Fixes:
292ecd9f5a94 ("doc: move seg6_flowlabel to seg6-sysctl.rst")
Reported-by: Philippe Guibert <philippe.guibert@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 12 Oct 2025 20:42:36 +0000 (13:42 -0700)]
Linux 6.18-rc1
Linus Torvalds [Sun, 12 Oct 2025 20:27:56 +0000 (13:27 -0700)]
Merge tag 'i2c-for-6.18-rc1-hotfix' of git://git./linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"One revert because of a regression in the I2C core which has sadly not
showed up during its time in -next"
* tag 'i2c-for-6.18-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Revert "i2c: boardinfo: Annotate code used in init phase only"
Jakub Kicinski [Thu, 9 Oct 2025 20:56:29 +0000 (13:56 -0700)]
selftests: drv-net: update remaining Python init files
Convert remaining __init__ files similar to what we did in
commit
b615879dbfea ("selftests: drv-net: make linters happy with our imports")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yeounsu Moon [Thu, 9 Oct 2025 15:57:16 +0000 (00:57 +0900)]
net: dlink: handle dma_map_single() failure properly
There is no error handling for `dma_map_single()` failures.
Add error handling by checking `dma_mapping_error()` and freeing
the `skb` using `dev_kfree_skb()` (process context) when it fails.
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com>
Tested-on: D-Link DGE-550T Rev-A3
Suggested-by: Simon Horman <horms@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rex Lu [Thu, 9 Oct 2025 06:29:34 +0000 (08:29 +0200)]
net: mtk: wed: add dma mask limitation and GFP_DMA32 for device with more than 4GB DRAM
Limit tx/rx buffer address to 32-bit address space for board with more
than 4GB DRAM.
Fixes:
804775dfc2885 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)")
Fixes:
6757d345dd7db ("net: ethernet: mtk_wed: introduce hw_rro support for MT7988")
Tested-by: Daniel Pawlik <pawlik.dan@gmail.com>
Tested-by: Matteo Croce <teknoraver@meta.com>
Signed-off-by: Rex Lu <rex.lu@mediatek.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bhanu Seshu Kumar Valluri [Thu, 9 Oct 2025 05:30:09 +0000 (11:00 +0530)]
net: usb: lan78xx: Fix lost EEPROM write timeout error(-ETIMEDOUT) in lan78xx_write_raw_eeprom
The function lan78xx_write_raw_eeprom failed to properly propagate EEPROM
write timeout errors (-ETIMEDOUT). In the timeout fallthrough path, it first
attempted to restore the pin configuration for LED outputs and then
returned only the status of that restore operation, discarding the
original timeout error saved in ret.
As a result, callers could mistakenly treat EEPROM write operation as
successful even though the EEPROM write had actually timed out with no
or partial data write.
To fix this, handle errors in restoring the LED pin configuration separately.
If the restore succeeds, return any prior EEPROM write timeout error saved
in ret to the caller.
Suggested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Fixes:
8b1b2ca83b20 ("net: usb: lan78xx: Improve error handling in EEPROM and OTP operations")
cc: stable@vger.kernel.org
Signed-off-by: Bhanu Seshu Kumar Valluri <bhanuseshukumar@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 12 Oct 2025 15:45:52 +0000 (08:45 -0700)]
Merge tag 'irq_urgent_for_v6.18_rc1' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Skip interrupt ID 0 in sifive-plic during suspend/resume because
ID 0 is reserved and accessing reserved register space could result
in undefined behavior
- Fix a function's retval check in aspeed-scu-ic
* tag 'irq_urgent_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/sifive-plic: Avoid interrupt ID 0 handling during suspend/resume
irqchip/aspeed-scu-ic: Fix an IS_ERR() vs NULL check
Linus Torvalds [Sat, 11 Oct 2025 23:06:04 +0000 (16:06 -0700)]
Merge tag 'trace-v6.18-3' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
"The previous fix to trace_marker required updating trace_marker_raw as
well. The difference between trace_marker_raw from trace_marker is
that the raw version is for applications to write binary structures
directly into the ring buffer instead of writing ASCII strings. This
is for applications that will read the raw data from the ring buffer
and get the data structures directly. It's a bit quicker than using
the ASCII version.
Unfortunately, it appears that our test suite has several tests that
test writes to the trace_marker file, but lacks any tests to the
trace_marker_raw file (this needs to be remedied). Two issues came
about the update to the trace_marker_raw file that syzbot found:
- Fix tracing_mark_raw_write() to use per CPU buffer
The fix to use the per CPU buffer to copy from user space was
needed for both the trace_maker and trace_maker_raw file.
The fix for reading from user space into per CPU buffers properly
fixed the trace_marker write function, but the trace_marker_raw
file wasn't fixed properly. The user space data was correctly
written into the per CPU buffer, but the code that wrote into the
ring buffer still used the user space pointer and not the per CPU
buffer that had the user space data already written.
- Stop the fortify string warning from writing into trace_marker_raw
After converting the copy_from_user_nofault() into a memcpy(),
another issue appeared. As writes to the trace_marker_raw expects
binary data, the first entry is a 4 byte identifier. The entry
structure is defined as:
struct {
struct trace_entry ent;
int id;
char buf[];
};
The size of this structure is reserved on the ring buffer with:
size = sizeof(*entry) + cnt;
Then it is copied from the buffer into the ring buffer with:
memcpy(&entry->id, buf, cnt);
This use to be a copy_from_user_nofault(), but now converting it to
a memcpy() triggers the fortify-string code, and causes a warning.
The allocated space is actually more than what is copied, as the
cnt used also includes the entry->id portion. Allocating
sizeof(*entry) plus cnt is actually allocating 4 bytes more than
what is needed.
Change the size function to:
size = struct_size(entry, buf, cnt - sizeof(entry->id));
And update the memcpy() to unsafe_memcpy()"
* tag 'trace-v6.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Stop fortify-string from warning in tracing_mark_raw_write()
tracing: Fix tracing_mark_raw_write() to use buf and not ubuf
Linus Torvalds [Sat, 11 Oct 2025 22:47:12 +0000 (15:47 -0700)]
Merge tag 'kbuild-fixes-6.18-1' of git://git./linux/kernel/git/kbuild/linux
Pull Kbuild fixes from Nathan Chancellor:
- Fix UAPI types check in headers_check.pl
- Only enable -Werror for hostprogs with CONFIG_WERROR / W=e
- Ignore fsync() error when output of gen_init_cpio is a pipe
- Several little build fixes for recent modules.builtin.modinfo series
* tag 'kbuild-fixes-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux:
kbuild: Use '--strip-unneeded-symbol' for removing module device table symbols
s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections
kbuild: Add '.rel.*' strip pattern for vmlinux
kbuild: Restore pattern to avoid stripping .rela.dyn from vmlinux
gen_init_cpio: Ignore fsync() returning EINVAL on pipes
scripts/Makefile.extrawarn: Respect CONFIG_WERROR / W=e for hostprogs
kbuild: uapi: Strip comments before size type check
Wolfram Sang [Sat, 11 Oct 2025 10:31:53 +0000 (12:31 +0200)]
Revert "i2c: boardinfo: Annotate code used in init phase only"
This reverts commit
1a2b423be6a89dd07d5fc27ea042be68697a6a49 because we
got a regression report and need time to find out the details.
Reported-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Closes: https://lore.kernel.org/r/
29ec0082-4dd4-4120-acd2-
44b35b4b9487@oss.qualcomm.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Linus Torvalds [Sat, 11 Oct 2025 18:56:47 +0000 (11:56 -0700)]
Merge tag 'rtc-6.18' of git://git./linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"This cycle, we have a new RTC driver, for the SpacemiT P1. The optee
driver gets alarm support. We also get a fix for a race condition that
was fairly rare unless while stress testing the alarms.
Subsystem:
- Fix race when setting alarm
- Ensure alarm irq is enabled when UIE is enabled
- remove unneeded 'fast_io' parameter in regmap_config
New driver:
- SpacemiT P1 RTC
Drivers:
- efi: Remove wakeup functionality
- optee: add alarms support
- s3c: Drop support for S3C2410
- zynqmp: Restore alarm functionality after kexec transition"
* tag 'rtc-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (29 commits)
rtc: interface: Ensure alarm irq is enabled when UIE is enabled
rtc: tps6586x: Fix initial enable_irq/disable_irq balance
rtc: cpcap: Fix initial enable_irq/disable_irq balance
rtc: isl12022: Fix initial enable_irq/disable_irq balance
rtc: interface: Fix long-standing race when setting alarm
rtc: pcf2127: fix watchdog interrupt mask on pcf2131
rtc: zynqmp: Restore alarm functionality after kexec transition
rtc: amlogic-a4: Optimize global variables
rtc: sd2405al: Add I2C address.
rtc: Kconfig: move symbols to proper section
rtc: optee: make optee_rtc_pm_ops static
rtc: optee: Fix error code in optee_rtc_read_alarm()
rtc: optee: fix error code in probe()
dt-bindings: rtc: Convert apm,xgene-rtc to DT schema
rtc: spacemit: support the SpacemiT P1 RTC
rtc: optee: add alarm related rtc ops to optee rtc driver
rtc: optee: remove unnecessary memory operations
rtc: optee: fix memory leak on driver removal
rtc: x1205: Fix Xicor X1205 vendor prefix
dt-bindings: rtc: Fix Xicor X1205 vendor prefix
...
Linus Torvalds [Sat, 11 Oct 2025 18:49:00 +0000 (11:49 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Fixes only in drivers (ufs, mvsas, qla2xxx, target) that came in just
before or during the merge window.
The most important one is the qla2xxx which reverts a conversion to
fix flexible array member warnings, that went up in this merge window
but which turned out on further testing to be causing data corruption"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: core: Include UTP error in INT_FATAL_ERRORS
scsi: ufs: sysfs: Make HID attributes visible
scsi: mvsas: Fix use-after-free bugs in mvs_work_queue
scsi: ufs: core: Fix PM QoS mutex initialization
scsi: ufs: core: Fix runtime suspend error deadlock
Revert "scsi: qla2xxx: Fix memcpy() field-spanning write issue"
scsi: target: target_core_configfs: Add length check to avoid buffer overflow
Linus Torvalds [Sat, 11 Oct 2025 18:19:16 +0000 (11:19 -0700)]
Merge tag 'x86_core_for_v6.18_rc1' of git://git./linux/kernel/git/tip/tip
Pull more x86 updates from Borislav Petkov:
- Remove a bunch of asm implementing condition flags testing in KVM's
emulator in favor of int3_emulate_jcc() which is written in C
- Replace KVM fastops with C-based stubs which avoids problems with the
fastop infra related to latter not adhering to the C ABI due to their
special calling convention and, more importantly, bypassing compiler
control-flow integrity checking because they're written in asm
- Remove wrongly used static branches and other ugliness accumulated
over time in hyperv's hypercall implementation with a proper static
function call to the correct hypervisor call variant
- Add some fixes and modifications to allow running FRED-enabled
kernels in KVM even on non-FRED hardware
- Add kCFI improvements like validating indirect calls and prepare for
enabling kCFI with GCC. Add cmdline params documentation and other
code cleanups
- Use the single-byte 0xd6 insn as the official #UD single-byte
undefined opcode instruction as agreed upon by both x86 vendors
- Other smaller cleanups and touchups all over the place
* tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86,retpoline: Optimize patch_retpoline()
x86,ibt: Use UDB instead of 0xEA
x86/cfi: Remove __noinitretpoline and __noretpoline
x86/cfi: Add "debug" option to "cfi=" bootparam
x86/cfi: Standardize on common "CFI:" prefix for CFI reports
x86/cfi: Document the "cfi=" bootparam options
x86/traps: Clarify KCFI instruction layout
compiler_types.h: Move __nocfi out of compiler-specific header
objtool: Validate kCFI calls
x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y
x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware
x86/fred: Install system vector handlers even if FRED isn't fully enabled
x86/hyperv: Use direct call to hypercall-page
x86/hyperv: Clean up hv_do_hypercall()
KVM: x86: Remove fastops
KVM: x86: Convert em_salc() to C
KVM: x86: Introduce EM_ASM_3WCL
KVM: x86: Introduce EM_ASM_1SRC2
KVM: x86: Introduce EM_ASM_2CL
KVM: x86: Introduce EM_ASM_2W
...
Linus Torvalds [Sat, 11 Oct 2025 17:51:14 +0000 (10:51 -0700)]
Merge tag 'x86_cleanups_for_v6.18_rc1' of git://git./linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:
- Simplify inline asm flag output operands now that the minimum
compiler version supports the =@ccCOND syntax
- Remove a bunch of AS_* Kconfig symbols which detect assembler support
for various instruction mnemonics now that the minimum assembler
version supports them all
- The usual cleanups all over the place
* tag 'x86_cleanups_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm: Remove code depending on __GCC_ASM_FLAG_OUTPUTS__
x86/sgx: Use ENCLS mnemonic in <kernel/cpu/sgx/encls.h>
x86/mtrr: Remove license boilerplate text with bad FSF address
x86/asm: Use RDPKRU and WRPKRU mnemonics in <asm/special_insns.h>
x86/idle: Use MONITORX and MWAITX mnemonics in <asm/mwait.h>
x86/entry/fred: Push __KERNEL_CS directly
x86/kconfig: Remove CONFIG_AS_AVX512
crypto: x86 - Remove CONFIG_AS_VPCLMULQDQ
crypto: X86 - Remove CONFIG_AS_VAES
crypto: x86 - Remove CONFIG_AS_GFNI
x86/kconfig: Drop unused and needless config X86_64_SMP
Linus Torvalds [Sat, 11 Oct 2025 17:40:24 +0000 (10:40 -0700)]
Merge tag 'slab-for-6.18-rc1-hotfix' of git://git./linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
"A NULL pointer deref hotfix"
* tag 'slab-for-6.18-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
slab: fix barn NULL pointer dereference on memoryless nodes
Linus Torvalds [Sat, 11 Oct 2025 17:31:38 +0000 (10:31 -0700)]
Merge tag 'bpf-fixes' of git://git./linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:
- Finish constification of 1st parameter of bpf_d_path() (Rong Tao)
- Harden userspace-supplied xdp_desc validation (Alexander Lobakin)
- Fix metadata_dst leak in __bpf_redirect_neigh_v{4,6}() (Daniel
Borkmann)
- Fix undefined behavior in {get,put}_unaligned_be32() (Eric Biggers)
- Use correct context to unpin bpf hash map with special types (KaFai
Wan)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add test for unpinning htab with internal timer struct
bpf: Avoid RCU context warning when unpinning htab with internal structs
xsk: Harden userspace-supplied xdp_desc validation
bpf: Fix metadata_dst leak __bpf_redirect_neigh_v{4,6}
libbpf: Fix undefined behavior in {get,put}_unaligned_be32()
bpf: Finish constification of 1st parameter of bpf_d_path()
Linus Torvalds [Sat, 11 Oct 2025 17:27:52 +0000 (10:27 -0700)]
Merge tag 'mm-nonmm-stable-2025-10-10-15-03' of git://git./linux/kernel/git/akpm/mm
Pull more updates from Andrew Morton:
"Just one series here - Mike Rappoport has taught KEXEC handover to
preserve vmalloc allocations across handover"
* tag 'mm-nonmm-stable-2025-10-10-15-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
lib/test_kho: use kho_preserve_vmalloc instead of storing addresses in fdt
kho: add support for preserving vmalloc allocations
kho: replace kho_preserve_phys() with kho_preserve_pages()
kho: check if kho is finalized in __kho_preserve_order()
MAINTAINERS, .mailmap: update Umang's email address
Linus Torvalds [Sat, 11 Oct 2025 17:14:55 +0000 (10:14 -0700)]
Merge tag 'mm-hotfixes-stable-2025-10-10-15-00' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"7 hotfixes. All 7 are cc:stable and all 7 are for MM.
All singletons, please see the changelogs for details"
* tag 'mm-hotfixes-stable-2025-10-10-15-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: hugetlb: avoid soft lockup when mprotect to large memory area
fsnotify: pass correct offset to fsnotify_mmap_perm()
mm/ksm: fix flag-dropping behavior in ksm_madvise
mm/damon/vaddr: do not repeat pte_offset_map_lock() until success
mm/rmap: fix soft-dirty and uffd-wp bit loss when remapping zero-filled mTHP subpage to shared zeropage
mm/thp: fix MTE tag mismatch when replacing zero-filled subpages
memcg: skip cgroup_file_notify if spinning is not allowed
Steven Rostedt [Sat, 11 Oct 2025 15:20:32 +0000 (11:20 -0400)]
tracing: Stop fortify-string from warning in tracing_mark_raw_write()
The way tracing_mark_raw_write() records its data is that it has the
following structure:
struct {
struct trace_entry;
int id;
char buf[];
};
But memcpy(&entry->id, buf, size) triggers the following warning when the
size is greater than the id:
------------[ cut here ]------------
memcpy: detected field-spanning write (size 6) of single field "&entry->id" at kernel/trace/trace.c:7458 (size 4)
WARNING: CPU: 7 PID: 995 at kernel/trace/trace.c:7458 write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Modules linked in:
CPU: 7 UID: 0 PID: 995 Comm: bash Not tainted
6.17.0-test-00007-g60b82183e78a-dirty #211 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
RIP: 0010:write_raw_marker_to_buffer.isra.0+0x1f9/0x2e0
Code: 04 00 75 a7 b9 04 00 00 00 48 89 de 48 89 04 24 48 c7 c2 e0 b1 d1 b2 48 c7 c7 40 b2 d1 b2 c6 05 2d 88 6a 04 01 e8 f7 e8 bd ff <0f> 0b 48 8b 04 24 e9 76 ff ff ff 49 8d 7c 24 04 49 8d 5c 24 08 48
RSP: 0018:
ffff888104c3fc78 EFLAGS:
00010292
RAX:
0000000000000000 RBX:
0000000000000006 RCX:
0000000000000000
RDX:
0000000000000000 RSI:
1ffffffff6b363b4 RDI:
0000000000000001
RBP:
ffff888100058a00 R08:
ffffffffb041d459 R09:
ffffed1020987f40
R10:
0000000000000007 R11:
0000000000000001 R12:
ffff888100bb9010
R13:
0000000000000000 R14:
00000000000003e3 R15:
ffff888134800000
FS:
00007fa61d286740(0000) GS:
ffff888286cad000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000560d28d509f1 CR3:
00000001047a4006 CR4:
0000000000172ef0
Call Trace:
<TASK>
tracing_mark_raw_write+0x1fe/0x290
? __pfx_tracing_mark_raw_write+0x10/0x10
? security_file_permission+0x50/0xf0
? rw_verify_area+0x6f/0x4b0
vfs_write+0x1d8/0xdd0
? __pfx_vfs_write+0x10/0x10
? __pfx_css_rstat_updated+0x10/0x10
? count_memcg_events+0xd9/0x410
? fdget_pos+0x53/0x5e0
ksys_write+0x182/0x200
? __pfx_ksys_write+0x10/0x10
? do_user_addr_fault+0x4af/0xa30
do_syscall_64+0x63/0x350
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa61d318687
Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
RSP: 002b:
00007ffd87fe0120 EFLAGS:
00000202 ORIG_RAX:
0000000000000001
RAX:
ffffffffffffffda RBX:
00007fa61d286740 RCX:
00007fa61d318687
RDX:
0000000000000006 RSI:
0000560d28d509f0 RDI:
0000000000000001
RBP:
0000560d28d509f0 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000202 R12:
0000000000000006
R13:
00007fa61d4715c0 R14:
00007fa61d46ee80 R15:
0000000000000000
</TASK>
---[ end trace
0000000000000000 ]---
This is because fortify string sees that the size of entry->id is only 4
bytes, but it is writing more than that. But this is OK as the
dynamic_array is allocated to handle that copy.
The size allocated on the ring buffer was actually a bit too big:
size = sizeof(*entry) + cnt;
But cnt includes the 'id' and the buffer data, so adding cnt to the size
of *entry actually allocates too much on the ring buffer.
Change the allocation to:
size = struct_size(entry, buf, cnt - sizeof(entry->id));
and the memcpy() to unsafe_memcpy() with an added justification.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/20251011112032.77be18e4@gandalf.local.home
Fixes:
64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
68e973f5.
050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Vlastimil Babka [Sat, 11 Oct 2025 08:45:41 +0000 (10:45 +0200)]
slab: fix barn NULL pointer dereference on memoryless nodes
Phil reported a boot failure once sheaves become used in commits
59faa4da7cd4 ("maple_tree: use percpu sheaves for maple_node_cache") and
3accabda4da1 ("mm, vma: use percpu sheaves for vm_area_struct cache"):
BUG: kernel NULL pointer dereference, address:
0000000000000040
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 21 UID: 0 PID: 818 Comm: kworker/u398:0 Not tainted 6.17.0-rc3.slab+ #5 PREEMPT(voluntary)
Hardware name: Dell Inc. PowerEdge R7425/02MJ3T, BIOS 1.26.0 07/30/2025
RIP: 0010:__pcs_replace_empty_main+0x44/0x1d0
Code: ec 08 48 8b 46 10 48 8b 76 08 48 85 c0 74 0b 8b 48 18 85 c9 0f 85 e5 00 00 00 65 48 63 05 e4 ee 50 02 49 8b 84 c6 e0 00 00 00 <4c> 8b 68 40 4c 89 ef e8 b0 81 ff ff 48 89 c5 48 85 c0 74 1d 48 89
RSP: 0018:
ffffd2d10950bdb0 EFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff8a775dab74b0 RCX:
00000000ffffffff
RDX:
0000000000000cc0 RSI:
ffff8a6800804000 RDI:
ffff8a680004e300
RBP:
ffffd2d10950be40 R08:
0000000000000060 R09:
ffffffffb9367388
R10:
00000000000149e8 R11:
ffff8a6f87a38000 R12:
0000000000000cc0
R13:
0000000000000cc0 R14:
ffff8a680004e300 R15:
00000000000000c0
FS:
0000000000000000(0000) GS:
ffff8a77a3541000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000040 CR3:
0000000e1aa24000 CR4:
00000000003506f0
Call Trace:
<TASK>
? srso_return_thunk+0x5/0x5f
? vm_area_alloc+0x1e/0x60
kmem_cache_alloc_noprof+0x4ec/0x5b0
vm_area_alloc+0x1e/0x60
create_init_stack_vma+0x26/0x210
alloc_bprm+0x139/0x200
kernel_execve+0x4a/0x140
call_usermodehelper_exec_async+0xd0/0x190
? __pfx_call_usermodehelper_exec_async+0x10/0x10
ret_from_fork+0xf0/0x110
? __pfx_call_usermodehelper_exec_async+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Modules linked in:
CR2:
0000000000000040
---[ end trace
0000000000000000 ]---
RIP: 0010:__pcs_replace_empty_main+0x44/0x1d0
Code: ec 08 48 8b 46 10 48 8b 76 08 48 85 c0 74 0b 8b 48 18 85 c9 0f 85 e5 00 00 00 65 48 63 05 e4 ee 50 02 49 8b 84 c6 e0 00 00 00 <4c> 8b 68 40 4c 89 ef e8 b0 81 ff ff 48 89 c5 48 85 c0 74 1d 48 89
RSP: 0018:
ffffd2d10950bdb0 EFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff8a775dab74b0 RCX:
00000000ffffffff
RDX:
0000000000000cc0 RSI:
ffff8a6800804000 RDI:
ffff8a680004e300
RBP:
ffffd2d10950be40 R08:
0000000000000060 R09:
ffffffffb9367388
R10:
00000000000149e8 R11:
ffff8a6f87a38000 R12:
0000000000000cc0
R13:
0000000000000cc0 R14:
ffff8a680004e300 R15:
00000000000000c0
FS:
0000000000000000(0000) GS:
ffff8a77a3541000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000040 CR3:
0000000e1aa24000 CR4:
00000000003506f0
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x36a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
---[ end Kernel panic - not syncing: Fatal exception ]---
And noted "this is an AMD EPYC 7401 with 8 NUMA nodes configured such
that memory is only on 2 of them."
# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 8 16 24 32 40 48 56 64 72 80 88
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus: 2 10 18 26 34 42 50 58 66 74 82 90
node 1 size: 31584 MB
node 1 free: 30397 MB
node 2 cpus: 4 12 20 28 36 44 52 60 68 76 84 92
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 6 14 22 30 38 46 54 62 70 78 86 94
node 3 size: 0 MB
node 3 free: 0 MB
node 4 cpus: 1 9 17 25 33 41 49 57 65 73 81 89
node 4 size: 0 MB
node 4 free: 0 MB
node 5 cpus: 3 11 19 27 35 43 51 59 67 75 83 91
node 5 size: 32214 MB
node 5 free: 31625 MB
node 6 cpus: 5 13 21 29 37 45 53 61 69 77 85 93
node 6 size: 0 MB
node 6 free: 0 MB
node 7 cpus: 7 15 23 31 39 47 55 63 71 79 87 95
node 7 size: 0 MB
node 7 free: 0 MB
Linus decoded the stacktrace to get_barn() and get_node() and determined
that kmem_cache->node[numa_mem_id()] is NULL.
The problem is due to a wrong assumption that memoryless nodes only
exist on systems with CONFIG_HAVE_MEMORYLESS_NODES, where numa_mem_id()
points to the nearest node that has memory. SLUB has been allocating its
kmem_cache_node structures only on nodes with memory and so it does with
struct node_barn.
For kmem_cache_node, get_partial_node() checks if get_node() result is
not NULL, which I assumed was for protection from a bogus node id passed
to kmalloc_node() but apparently it's also for systems where
numa_mem_id() (used when no specific node is given) might return a
memoryless node.
Fix the sheaves code the same way by checking the result of get_node()
and bailing out if it's NULL. Note that cpus on such memoryless nodes
will have degraded sheaves performance, which can be improved later,
preferably by making numa_mem_id() work properly on such systems.
Fixes:
2d517aa09bbc ("slab: add opt-in caching layer of percpu sheaves")
Reported-and-tested-by: Phil Auld <pauld@redhat.com>
Closes: https://lore.kernel.org/all/
20251010151116.GA436967@pauld.westford.csb/
Analyzed-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-%3Dwg1xK%2BBr%3DFJ5QipVhzCvq7uQVPt5Prze6HDhQQ%3DQD_BcQ@mail.gmail.com/
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Steven Rostedt [Sat, 11 Oct 2025 03:51:42 +0000 (23:51 -0400)]
tracing: Fix tracing_mark_raw_write() to use buf and not ubuf
The fix to use a per CPU buffer to read user space tested only the writes
to trace_marker. But it appears that the selftests are missing tests to
the trace_maker_raw file. The trace_maker_raw file is used by applications
that writes data structures and not strings into the file, and the tools
read the raw ring buffer to process the structures it writes.
The fix that reads the per CPU buffers passes the new per CPU buffer to
the trace_marker file writes, but the update to the trace_marker_raw write
read the data from user space into the per CPU buffer, but then still used
then passed the user space address to the function that records the data.
Pass in the per CPU buffer and not the user space address.
TODO: Add a test to better test trace_marker_raw.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lore.kernel.org/20251011035243.386098147@kernel.org
Fixes:
64cf7d058a00 ("tracing: Have trace_marker use per-cpu data to read user space")
Reported-by: syzbot+9a2ede1643175f350105@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/
68e973f5.
050a0220.1186a4.0010.GAE@google.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Nathan Chancellor [Fri, 10 Oct 2025 21:49:27 +0000 (14:49 -0700)]
kbuild: Use '--strip-unneeded-symbol' for removing module device table symbols
After commit
5ab23c7923a1 ("modpost: Create modalias for builtin
modules"), relocatable RISC-V kernels with CONFIG_KASAN=y start failing
when attempting to strip the module device table symbols:
riscv64-linux-objcopy: not stripping symbol `__mod_device_table__kmod_irq_starfive_jh8100_intc__of__starfive_intc_irqchip_match_table' because it is named in a relocation
make[4]: *** [scripts/Makefile.vmlinux:97: vmlinux] Error 1
The relocation appears to come from .LASANLOC5 in .data.rel.local:
$ llvm-objdump --disassemble-symbols=.LASANLOC5 --disassemble-all -r drivers/irqchip/irq-starfive-jh8100-intc.o
drivers/irqchip/irq-starfive-jh8100-intc.o: file format elf64-littleriscv
Disassembly of section .data.rel.local:
0000000000000180 <.LASANLOC5>:
...
1d0: 0000 unimp
00000000000001d0: R_RISCV_64 __mod_device_table__kmod_irq_starfive_jh8100_intc__of__starfive_intc_irqchip_match_table
...
This section appears to come from GCC for including additional
information about global variables that may be protected by KASAN.
There appears to be no way to opt out of the generation of these symbols
through either a flag or attribute. Attempting to remove '.LASANLOC*'
with '--strip-symbol' results in the same error as above because these
symbols may refer to (thus have relocation between) each other.
Avoid this build breakage by switching to '--strip-unneeded-symbol' for
removing __mod_device_table__ symbols, as it will only remove the symbol
when there is no relocation pointing to it. While this may result in a
little more bloat in the symbol table in certain configurations, it is
not as bad as outright build failures.
Fixes:
5ab23c7923a1 ("modpost: Create modalias for builtin modules")
Reported-by: Charles Mirabile <cmirabil@redhat.com>
Closes: https://lore.kernel.org/
20251007011637.
2512413-1-cmirabil@redhat.com/
Suggested-by: Alexey Gladkov <legion@kernel.org>
Tested-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Linus Torvalds [Fri, 10 Oct 2025 21:06:02 +0000 (14:06 -0700)]
Merge tag 'for-6.18/hpfs-changes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull hpfs updates from Mikulas Patocka:
- Avoid -Wflex-array-member-not-at-end warnings
- Replace simple_strtoul with kstrtoint
- Fix error code for new_inode() failure
* tag 'for-6.18/hpfs-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
fs/hpfs: Fix error code for new_inode() failure in mkdir/create/mknod/symlink
hpfs: Replace simple_strtoul with kstrtoint in hpfs_parse_param
fs: hpfs: Avoid multiple -Wflex-array-member-not-at-end warnings
Linus Torvalds [Fri, 10 Oct 2025 21:02:14 +0000 (14:02 -0700)]
Merge tag 'drm-next-2025-10-11-1' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm fixes from Dave Airlie:
"Just the follow up fixes for rc1 from the next branch, amdgpu and xe
mostly with a single v3d fix in there.
amdgpu:
- DC DCE6 fixes
- GPU reset fixes
- Secure diplay messaging cleanup
- MES fix
- GPUVM locking fixes
- PMFW messaging cleanup
- PCI US/DS switch handling fix
- VCN queue reset fix
- DC FPU handling fix
- DCN 3.5 fix
- DC mirroring fix
amdkfd:
- Fix kfd process ref leak
- mmap write lock handling fix
- Fix comments in IOCTL
xe:
- Fix build with clang 16
- Fix handling of invalid configfs syntax usage and spell out the
expected syntax in the documentation
- Do not try late bind firmware when running as VF since it shouldn't
handle firmware loading
- Fix idle assertion for local BOs
- Fix uninitialized variable for late binding
- Do not require perfmon_capable to expose free memory at page
granularity. Handle it like other drm drivers do
- Fix lock handling on suspend error path
- Fix I2C controller resume after S3
v3d:
- fix fence locking"
* tag 'drm-next-2025-10-11-1' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
drm/amd/display: Incorrect Mirror Cositing
drm/amd/display: Enable Dynamic DTBCLK Switch
drm/amdgpu: Report individual reset error
drm/amdgpu: partially revert "revert to old status lock handling v3"
drm/amd/display: Fix unsafe uses of kernel mode FPU
drm/amd/pm: Disable VCN queue reset on SMU v13.0.6 due to regression
drm/amdgpu: Fix general protection fault in amdgpu_vm_bo_reset_state_machine
drm/amdgpu: Check swus/ds for switch state save
drm/amdkfd: Fix two comments in kfd_ioctl.h
drm/amd/pm: Avoid interface mismatch messaging
drm/amdgpu: Merge amdgpu_vm_set_pasid into amdgpu_vm_init
drm/amd/amdgpu: Fix the mes version that support inv_tlbs
drm/amd: Check whether secure display TA loaded successfully
drm/amdkfd: Fix mmap write lock not release
drm/amdkfd: Fix kfd process ref leaking when userptr unmapping
drm/amdgpu: Fix for GPU reset being blocked by KIQ I/O.
drm/amd/display: Disable scaling on DCE6 for now
drm/amd/display: Properly disable scaling on DCE6
drm/amd/display: Properly clear SCL_*_FILTER_CONTROL on DCE6
drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs
...
Linus Torvalds [Fri, 10 Oct 2025 20:59:38 +0000 (13:59 -0700)]
Merge tag 'drm-fixes-2025-10-11' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
"Some fixes leftover from our fixes branch, just nouveau and vmwgfx:
nouveau:
- Return errno code from TTM move helper
vmwgfx:
- Fix null-ptr access in cursor code
- Fix UAF in validation
- Use correct iterator in validation"
* tag 'drm-fixes-2025-10-11' of https://gitlab.freedesktop.org/drm/kernel:
drm/nouveau: fix bad ret code in nouveau_bo_move_prep
drm/vmwgfx: Fix copy-paste typo in validation
drm/vmwgfx: Fix Use-after-free in validation
drm/vmwgfx: Fix a null-ptr access in the cursor snooper
Jan Kara [Tue, 7 Oct 2025 13:49:37 +0000 (15:49 +0200)]
ext4: free orphan info with kvfree
Orphan info is now getting allocated with kvmalloc_array(). Free it with
kvfree() instead of kfree() to avoid complaints from mm.
Reported-by: Chris Mason <clm@meta.com>
Fixes:
0a6ce20c1564 ("ext4: verify orphan file size is not too big")
Cc: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Message-ID: <
20251007134936.7291-2-jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Deepanshu Kartikey [Tue, 30 Sep 2025 11:28:10 +0000 (16:58 +0530)]
ext4: detect invalid INLINE_DATA + EXTENTS flag combination
syzbot reported a BUG_ON in ext4_es_cache_extent() when opening a verity
file on a corrupted ext4 filesystem mounted without a journal.
The issue is that the filesystem has an inode with both the INLINE_DATA
and EXTENTS flags set:
EXT4-fs error (device loop0): ext4_cache_extents:545: inode #15:
comm syz.0.17: corrupted extent tree: lblk 0 < prev 66
Investigation revealed that the inode has both flags set:
DEBUG: inode 15 - flag=1, i_inline_off=164, has_inline=1, extents_flag=1
This is an invalid combination since an inode should have either:
- INLINE_DATA: data stored directly in the inode
- EXTENTS: data stored in extent-mapped blocks
Having both flags causes ext4_has_inline_data() to return true, skipping
extent tree validation in __ext4_iget(). The unvalidated out-of-order
extents then trigger a BUG_ON in ext4_es_cache_extent() due to integer
underflow when calculating hole sizes.
Fix this by detecting this invalid flag combination early in ext4_iget()
and rejecting the corrupted inode.
Cc: stable@kernel.org
Reported-and-tested-by: syzbot+038b7bf43423e132b308@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=
038b7bf43423e132b308
Suggested-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Message-ID: <
20250930112810.315095-1-kartikey406@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Dave Airlie [Fri, 10 Oct 2025 20:17:06 +0000 (06:17 +1000)]
Merge tag 'drm-misc-fixes-2025-10-09' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
nouveau:
- Return errno code from TTM move helper
vmwgfx:
- Fix null-ptr access in cursor code
- Fix UAF in validation
- Use correct iterator in validation
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20251009120004.GA17570@linux.fritz.box
Linus Torvalds [Fri, 10 Oct 2025 20:05:40 +0000 (13:05 -0700)]
Merge tag 'devicetree-fixes-for-6.18-1' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Allow child nodes on renesas-bsc bus binding
- Drop node name pattern on allwinner,sun50i-a64-de2 bus binding
- Switch DT patchwork to kernel.org from ozlabs.org
- Fix some typos in docs and bindings
- Fix reference count in PCI node unittest
* tag 'devicetree-fixes-for-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: bus: renesas-bsc: allow additional properties
dt-bindings: bus: allwinner,sun50i-a64-de2: don't check node names
MAINTAINERS: Move DT patchwork to kernel.org
of: unittest: Fix device reference count leak in of_unittest_pci_node_verify
of: doc: Fix typo in doc comments.
dt-bindings: mmc: Correct typo "upto" to "up to"
Wolfram Sang [Thu, 9 Oct 2025 18:34:53 +0000 (20:34 +0200)]
dt-bindings: bus: renesas-bsc: allow additional properties
Allow additional properties to enable devices attached to the bus.
Fixes warnings like these:
arch/arm/boot/dts/renesas/sh73a0-kzm9g.dtb: bus@
fec10000 (renesas,bsc-sh73a0): Unevaluated properties are not allowed ('ethernet@
10000000' was unexpected)
arch/arm/boot/dts/renesas/r8a73a4-ape6evm.dtb: bus@
fec10000 (renesas,bsc-r8a73a4): Unevaluated properties are not allowed ('ethernet@
8000000', 'flash@0' were unexpected)
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Wolfram Sang [Thu, 9 Oct 2025 18:37:43 +0000 (20:37 +0200)]
dt-bindings: bus: allwinner,sun50i-a64-de2: don't check node names
Node names are already and properly checked by the core schema. No need
to do it again.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
[robh: Also drop [A-F] in unit address]
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Linus Torvalds [Fri, 10 Oct 2025 18:30:19 +0000 (11:30 -0700)]
Merge tag 'ceph-for-6.18-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
- some messenger improvements (Eric and Max)
- address an issue (also affected userspace) of incorrect permissions
being granted to users who have access to multiple different CephFS
instances within the same cluster (Kotresh)
- a bunch of assorted CephFS fixes (Slava)
* tag 'ceph-for-6.18-rc1' of https://github.com/ceph/ceph-client:
ceph: add bug tracking system info to MAINTAINERS
ceph: fix multifs mds auth caps issue
ceph: cleanup in ceph_alloc_readdir_reply_buffer()
ceph: fix potential NULL dereference issue in ceph_fill_trace()
libceph: add empty check to ceph_con_get_out_msg()
libceph: pass the message pointer instead of loading con->out_msg
libceph: make ceph_con_get_out_msg() return the message pointer
ceph: fix potential race condition on operations with CEPH_I_ODIRECT flag
ceph: refactor wake_up_bit() pattern of calling
ceph: fix potential race condition in ceph_ioctl_lazyio()
ceph: fix overflowed constant issue in ceph_do_objects_copy()
ceph: fix wrong sizeof argument issue in register_session()
ceph: add checking of wait_for_completion_killable() return value
ceph: make ceph_start_io_*() killable
libceph: Use HMAC-SHA256 library instead of crypto_shash
Linus Torvalds [Fri, 10 Oct 2025 18:23:57 +0000 (11:23 -0700)]
Merge tag 'v6.18-rc-part2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull more smb client updates from Steve French:
- fix i_size in fallocate
- two truncate fixes
- utime fix
- minor cleanups
- SMB1 fixes
- improve error check in read
- improve perf of copy file_range (copy_chunk)
* tag 'v6.18-rc-part2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal version number
cifs: Add comments for DeletePending assignments in open functions
cifs: Add fallback code path for cifs_mkdir_setinfo()
cifs: Allow fallback code in smb_set_file_info() also for directories
cifs: Query EA $LXMOD in cifs_query_path_info() for WSL reparse points
smb: client: remove cfids_invalidation_worker
smb: client: remove redudant assignment in cifs_strict_fsync()
smb: client: fix race with fallocate(2) and AIO+DIO
smb: client: fix missing timestamp updates after utime(2)
smb: client: fix missing timestamp updates after ftruncate(2)
smb: client: fix missing timestamp updates with O_TRUNC
cifs: Fix copy_to_iter return value check
smb: client: batch SRV_COPYCHUNK entries to cut round trips
smb: client: Omit an if branch in smb2_find_smb_tcon()
smb: client: Return directly after a failed genlmsg_new() in cifs_swn_send_register_message()
smb: client: Use common code in cifs_do_create()
smb: client: Improve unlocking of a mutex in cifs_get_swn_reg()
smb: client: Return a status code only as a constant in cifs_spnego_key_instantiate()
smb: client: Use common code in cifs_lookup()
smb: client: Reduce the scopes for a few variables in two functions
Linus Torvalds [Fri, 10 Oct 2025 18:20:19 +0000 (11:20 -0700)]
Merge tag 'xtensa-
20251010' of https://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa updates from Max Filippov:
- minor cleanups
* tag 'xtensa-
20251010' of https://github.com/jcmvbkbc/linux-xtensa:
xtensa: use HZ_PER_MHZ in platform_calibrate_ccount
xtensa: simdisk: add input size check in proc_write_simdisk
Linus Torvalds [Fri, 10 Oct 2025 17:37:13 +0000 (10:37 -0700)]
Merge tag 'block-6.18-
20251009' of git://git./linux/kernel/git/axboe/linux
Pull block fixes from Jens Axboe:
- Don't include __GFP_NOWARN for loop worker allocation, as it already
uses GFP_NOWAIT which has __GFP_NOWARN set already
- Small series cleaning up the recent bio_iov_iter_get_pages() changes
- loop fix for leaking the backing reference file, if validation fails
- Update of a comment pertaining to disk/partition stat locking
* tag 'block-6.18-
20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
loop: remove redundant __GFP_NOWARN flag
block: move bio_iov_iter_get_bdev_pages to block/fops.c
iomap: open code bio_iov_iter_get_bdev_pages
block: rename bio_iov_iter_get_pages_aligned to bio_iov_iter_get_pages
block: remove bio_iov_iter_get_pages
block: Update a comment of disk statistics
loop: fix backing file reference leak on validation error
Zeno Endemann [Thu, 25 Sep 2025 15:24:33 +0000 (17:24 +0200)]
ext4, doc: fix and improve directory hash tree description
Some of the details about how directory hash trees work were confusing or
outright wrong, this patch should fix those.
A note on dx_tail's dt_reserved member, as far as I can tell the kernel
never sets this explicitly, so its content is apparently left-overs from
what was there before (for the dx_root I've seen remnants of a
ext4_dir_entry_tail struct from when the dir was not yet a hash dir).
Signed-off-by: Zeno Endemann <zeno.endemann@mailbox.org>
Message-ID: <
20250925152435.22749-1-zeno.endemann@mailbox.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Zhang Yi [Tue, 16 Sep 2025 09:33:37 +0000 (17:33 +0800)]
ext4: wait for ongoing I/O to complete before freeing blocks
When freeing metadata blocks in nojournal mode, ext4_forget() calls
bforget() to clear the dirty flag on the buffer_head and remvoe
associated mappings. This is acceptable if the metadata has not yet
begun to be written back. However, if the write-back has already started
but is not yet completed, ext4_forget() will have no effect.
Subsequently, ext4_mb_clear_bb() will immediately return the block to
the mb allocator. This block can then be reallocated immediately,
potentially causing an data corruption issue.
Fix this by clearing the buffer's dirty flag and waiting for the ongoing
I/O to complete, ensuring that no further writes to stale data will
occur.
Fixes:
16e08b14a455 ("ext4: cleanup clean_bdev_aliases() calls")
Cc: stable@kernel.org
Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Closes: https://lore.kernel.org/linux-ext4/
a9417096-9549-4441-9878-
b1955b899b4e@huaweicloud.com/
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <
20250916093337.
3161016-3-yi.zhang@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Linus Torvalds [Fri, 10 Oct 2025 17:25:24 +0000 (10:25 -0700)]
Merge tag 'io_uring-6.18-
20251009' of git://git./linux/kernel/git/axboe/linux
Pull io_uring fixes from Jens Axboe:
- Fixup indentation in the UAPI header
- Two fixes for zcrx. One fixes receiving too much in some cases, and
the other deals with not correctly incrementing the source in the
fallback copy loop
- Fix for a race in the IORING_OP_WAITID command, where there was a
small window where the request would be left on the wait_queue_head
list even though it was being canceled/completed
- Update liburing git URL in the kernel tree
* tag 'io_uring-6.18-
20251009' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/zcrx: increment fallback loop src offset
io_uring/zcrx: fix overshooting recv limit
io_uring: use tab indentation for IORING_SEND_VECTORIZED comment
io_uring/waitid: always prune wait queue entry in io_waitid_wait()
io_uring: update liburing git URL