Dmitry Antipov [Wed, 4 Sep 2024 11:54:01 +0000 (14:54 +0300)]
 
net: sched: consistently use rcu_replace_pointer() in taprio_change()
According to Vinicius (and carefully looking through the whole
https://syzkaller.appspot.com/bug?extid=
b65e0af58423fc8a73aa
once again), txtime branch of 'taprio_change()' is not going to
race against 'advance_sched()'. But using 'rcu_replace_pointer()'
in the former may be a good idea as well.
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 7 Sep 2024 01:39:31 +0000 (18:39 -0700)]
 
Merge tag 'nf-next-24-09-06' of git://git./linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
Patch #1 adds ctnetlink support for kernel side filtering for
	 deletions, from Changliang Wu.
Patch #2 updates nft_counter support to Use u64_stats_t,
	 from Sebastian Andrzej Siewior.
Patch #3 uses kmemdup_array() in all xtables frontends,
	 from Yan Zhen.
Patch #4 is a oneliner to use ERR_CAST() in nf_conntrack instead
	 opencoded casting, from Shen Lichuan.
Patch #5 removes unused argument in nftables .validate interface,
	 from Florian Westphal.
Patch #6 is a oneliner to correct a typo in nftables kdoc,
	 from Simon Horman.
Patch #7 fixes missing kdoc in nftables, also from Simon.
Patch #8 updates nftables to handle timeout less than CONFIG_HZ.
Patch #9 rejects element expiration if timeout is zero,
	 otherwise it is silently ignored.
Patch #10 disallows element expiration larger than timeout.
Patch #11 removes unnecessary READ_ONCE annotation while mutex is held.
Patch #12 adds missing READ_ONCE/WRITE_ONCE annotation in dynset.
Patch #13 annotates data-races around element expiration.
Patch #14 allocates timeout and expiration in one single set element
	  extension, they are tighly couple, no reason to keep them
	  separated anymore.
Patch #15 updates nftables to interpret zero timeout element as never
	  times out. Note that it is already possible to declare sets
	  with elements that never time out but this generalizes to all
	  kind of set with timeouts.
Patch #16 supports for element timeout and expiration updates.
* tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: set element timeout update support
  netfilter: nf_tables: zero timeout means element never times out
  netfilter: nf_tables: consolidate timeout extension for elements
  netfilter: nf_tables: annotate data-races around element expiration
  netfilter: nft_dynset: annotate data-races around set timeout
  netfilter: nf_tables: remove annotation to access set timeout while holding lock
  netfilter: nf_tables: reject expiration higher than timeout
  netfilter: nf_tables: reject element expiration with no timeout
  netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
  netfilter: nf_tables: Add missing Kernel doc
  netfilter: nf_tables: Correct spelling in nf_tables.h
  netfilter: nf_tables: drop unused 3rd argument from validate callback ops
  netfilter: conntrack: Convert to use ERR_CAST()
  netfilter: Use kmemdup_array instead of kmemdup for multiple allocation
  netfilter: nft_counter: Use u64_stats_t for statistic.
  netfilter: ctnetlink: support CTA_FILTER for flush
====================
Link: https://patch.msgid.link/20240905232920.5481-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Fedorenko [Thu, 5 Sep 2024 14:00:28 +0000 (14:00 +0000)]
 
ptp: ocp: Improve PCIe delay estimation
The PCIe bus can be pretty busy during boot and probe function can
see excessive delays. Let's find the minimal value out of several
tests and use it as estimated value.
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20240905140028.560454-1-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 5 Sep 2024 08:49:09 +0000 (08:49 +0000)]
 
netpoll: remove netpoll_srcu
netpoll_srcu is currently used from netpoll_poll_disable() and
__netpoll_cleanup()
Both functions run under RTNL, using netpoll_srcu adds confusion
and no additional protection.
Moreover the synchronize_srcu() call in __netpoll_cleanup() is
performed before clearing np->dev->npinfo, which violates RCU rules.
After this patch, netpoll_poll_disable() and netpoll_poll_enable()
simply use rtnl_dereference().
This saves a big chunk of memory (more than 192KB on platforms
with 512 cpus)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240905084909.2082486-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 7 Sep 2024 01:23:51 +0000 (18:23 -0700)]
 
Merge branch 'octeontx2-address-some-warnings'
Simon Horman says:
====================
octeontx2: Address some warnings
This patchset addresses some warnings flagged by Sparse, gcc-14, and
clang-18 in files touched by recent patch submissions.
Although these changes do not alter the functionality of the code, by
addressing them real problems introduced in future which are flagged by
Sparse will stand out more readily.
v1: https://lore.kernel.org/
20240903-octeontx2-sparse-v1-0-
f190309ecb0a@kernel.org
====================
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-0-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 4 Sep 2024 18:29:37 +0000 (19:29 +0100)]
 
octeontx2-pf: Make iplen __be16 in otx2_sqe_add_ext()
In otx2_sqe_add_ext() iplen is used to hold a 16-bit big-endian value,
but it's type is u16, indicating a host byte order integer.
Address this mismatch by changing the type of iplen to __be16.
Flagged by Sparse as:
.../otx2_txrx.c:699:31: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:699:31:    expected unsigned short [usertype] iplen
.../otx2_txrx.c:699:31:    got restricted __be16 [usertype]
.../otx2_txrx.c:701:54: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:701:54:    expected restricted __be16 [usertype] tot_len
.../otx2_txrx.c:701:54:    got unsigned short [usertype] iplen
.../otx2_txrx.c:704:60: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:704:60:    expected restricted __be16 [usertype] payload_len
.../otx2_txrx.c:704:60:    got unsigned short [usertype] iplen
Introduced in
commit 
dc1a9bf2c816 ("octeontx2-pf: Add UDP segmentation offload support")
No functional change intended.
Compile tested only by author.
Tested-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-2-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 4 Sep 2024 18:29:36 +0000 (19:29 +0100)]
 
octeontx2-af: Pass string literal as format argument of alloc_workqueue()
Recently I noticed that both gcc-14 and clang-18 report that passing
a non-string literal as the format argument of alloc_workqueue()
is potentially insecure.
E.g. clang-18 says:
.../rvu.c:2493:32: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
 2493 |         mw->mbox_wq = alloc_workqueue(name,
      |                                       ^~~~
.../rvu.c:2493:32: note: treat the string as an argument to avoid this
 2493 |         mw->mbox_wq = alloc_workqueue(name,
      |                                       ^
      |                                       "%s",
It is always the case where the contents of name is safe to pass as the
format argument. That is, in my understanding, it never contains any
format escape sequences.
But, it seems better to be safe than sorry. And, as a bonus, compiler
output becomes less verbose by addressing this issue as suggested by
clang-18.
Compile tested only by author.
Tested-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-1-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Wed, 4 Sep 2024 20:56:59 +0000 (13:56 -0700)]
 
net: phy: qca83xx: use PHY_ID_MATCH_EXACT
No need for the mask when there's already a macro for this.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240904205659.7470-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Wed, 4 Sep 2024 18:11:56 +0000 (19:11 +0100)]
 
sfc: siena: rip out rss-context dead code
Siena hardware does not support custom RSS contexts, but when the
 driver was forked from sfc.ko, some of the plumbing for them was
 copied across from the common code.  Actually trying to use them
 would lead to EOPNOTSUPP as the relevant efx_nic_type methods were
 not populated.
Remove this dead code from the Siena driver.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904181156.1993666-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 7 Sep 2024 01:21:47 +0000 (18:21 -0700)]
 
Merge branch 'use-functionality-of-irq_get_trigger_type'
Vasileios Amoiridis says:
====================
Use functionality of irq_get_trigger_type()
v1: https://lore.kernel.org/
20240902225534.130383-1-vassilisamir@gmail.com
====================
Link: https://patch.msgid.link/20240904151018.71967-1-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:18 +0000 (17:10 +0200)]
 
net: smc91x: Make use of irq_get_trigger_type()
Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Link: https://patch.msgid.link/20240904151018.71967-4-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:17 +0000 (17:10 +0200)]
 
net: dsa: realtek: rtl8366rb: Make use of irq_get_trigger_type()
Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Link: https://patch.msgid.link/20240904151018.71967-3-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:16 +0000 (17:10 +0200)]
 
net: dsa: realtek: rtl8365mb: Make use of irq_get_trigger_type()
Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Link: https://patch.msgid.link/20240904151018.71967-2-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sascha Hauer [Wed, 4 Sep 2024 12:17:41 +0000 (14:17 +0200)]
 
net: tls: wait for async completion on last message
When asynchronous encryption is used KTLS sends out the final data at
proto->close time. This becomes problematic when the task calling
close() receives a signal. In this case it can happen that
tcp_sendmsg_locked() called at close time returns -ERESTARTSYS and the
final data is not sent.
The described situation happens when KTLS is used in conjunction with
io_uring, as io_uring uses task_work_add() to add work to the current
userspace task. A discussion of the problem along with a reproducer can
be found in [1] and [2]
Fix this by waiting for the asynchronous encryption to be completed on
the final message. With this there is no data left to be sent at close
time.
[1] https://lore.kernel.org/all/
20231010141932.GD3114228@pengutronix.de/
[2] https://lore.kernel.org/all/
20240315100159.
3898944-1-s.hauer@pengutronix.de/
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://patch.msgid.link/20240904-ktls-wait-async-v1-1-a62892833110@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 7 Sep 2024 01:10:26 +0000 (18:10 -0700)]
 
Merge branch 'make-use-of-the-helper-macro-list_head'
Hongbo Li says:
====================
make use of the helper macro LIST_HEAD()
The macro LIST_HEAD() declares a list variable and
initializes it, which can be used to simplify the steps
of list initialization, thereby simplifying the code.
These serials just do some equivalatent substitutions,
and with no functional modifications.
====================
Link: https://patch.msgid.link/20240904093243.3345012-1-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:43 +0000 (17:32 +0800)]
 
net/core: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-6-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:42 +0000 (17:32 +0800)]
 
net/ipv6: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-5-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:41 +0000 (17:32 +0800)]
 
net/netfilter: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240904093243.3345012-4-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:40 +0000 (17:32 +0800)]
 
net/tipc: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-3-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:39 +0000 (17:32 +0800)]
 
net/ipv4: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-2-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:49:51 +0000 (16:49 +0800)]
 
sfc: convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240904084951.1353518-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:40:34 +0000 (16:40 +0800)]
 
sfc/siena: Convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240904084034.1353404-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:17:28 +0000 (16:17 +0800)]
 
ionic: Convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240904081728.1353260-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:08:45 +0000 (16:08 +0800)]
 
net: atlantic: convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904080845.1353144-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Fri, 6 Sep 2024 08:34:19 +0000 (09:34 +0100)]
 
Merge branch 'rx-sw-tstamp-for-all'
Gal Pressman says:
====================
RX software timestamp for all - round 2
Round 1 of drivers conversion was merged [1], this is round 2, more
drivers to follow.
[1] https://lore.kernel.org/netdev/
20240901112803.212753-1-gal@nvidia.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:22 +0000 (10:49 +0300)]
 
bnx2x: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:21 +0000 (10:49 +0300)]
 
cxgb4: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:20 +0000 (10:49 +0300)]
 
ixgbe: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:19 +0000 (10:49 +0300)]
 
igc: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:18 +0000 (10:49 +0300)]
 
igb: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:17 +0000 (10:49 +0300)]
 
ice: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:16 +0000 (10:49 +0300)]
 
i40e: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:15 +0000 (10:49 +0300)]
 
net: netcp: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:14 +0000 (10:49 +0300)]
 
net: ti: icssg-prueth: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:13 +0000 (10:49 +0300)]
 
net: ethernet: ti: cpsw_ethtool: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:12 +0000 (10:49 +0300)]
 
net: ethernet: ti: am65-cpsw-ethtool: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:11 +0000 (10:49 +0300)]
 
mlxsw: spectrum: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:10 +0000 (10:49 +0300)]
 
net: sparx5: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:09 +0000 (10:49 +0300)]
 
net: lan966x: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:08 +0000 (10:49 +0300)]
 
lan743x: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 6 Sep 2024 07:41:36 +0000 (08:41 +0100)]
 
Merge branch 'microchip=ksz8-cleanup'
Pieter Van Trappen says:
====================
net: dsa: microchip: rename and clean ksz8 series files
The first KSZ8 series implementation was done for a KSZ8795 device but
since several other KSZ8 devices have been added. Rename these files
to adhere to the ksz8 naming convention as already used in most
functions and the existing ksz8.h; add an explanatory note.
In addition, clean the files by removing macros that are defined at
more than one place and remove confusion by renaming the KSZ8830
string which in fact is not an existing KSZ series switch.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
---
v4:
 - correct once more Kconfig list of supported switches
v3: https://lore.kernel.org/netdev/
20240903072946.344507-1-vtpieter@gmail.com/
 - rename all KSZ8830 to KSZ88X3 only (not KSZ8863)
 - update Kconfig as per Arun's suggestion
v2: https://lore.kernel.org/netdev/
20240830141250.30425-1-vtpieter@gmail.com/
 - more finegrained description in Kconfig and ksz8.c header
 - add KSZ8830/ksz8830 to KSZ8863/ksz88x3 renaming
v1: https://lore.kernel.org/netdev/
20240828102801.227588-1-vtpieter@gmail.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:42 +0000 (08:27 +0200)]
 
net: dsa: microchip: replace unclear KSZ8830 strings
Replace ksz8830 with ksz88x3 for CHIP_ID definition and other
strings. This due to KSZ8830 not being an actual switch but the Chip
ID shared among KSZ8863/8873 switches, impossible to differentiate
from their Chip ID or Revision ID registers.
Now all KSZ*_CHIP_ID macros refer to actual, existing switches which
removes confusion.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:41 +0000 (08:27 +0200)]
 
net: dsa: microchip: clean up ksz8_reg definition macros
Remove macros that are already defined at more appropriate places.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:40 +0000 (08:27 +0200)]
 
net: dsa: microchip: rename ksz8 series files
The first KSZ8 series implementation was done for a KSZ8795 device but
since several other KSZ8 devices have been added. Rename these files
to adhere to the ksz8 naming convention as already used in most
functions and the existing ksz8.h; add an explanatory note.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 6 Sep 2024 05:02:41 +0000 (22:02 -0700)]
 
Merge branch 'add-realtek-automotive-pcie-driver'
Justin Lai says:
====================
Add Realtek automotive PCIe driver
This series includes adding realtek automotive ethernet driver
and adding rtase ethernet driver entry in MAINTAINERS file.
This ethernet device driver for the PCIe interface of
Realtek Automotive Ethernet Switch,applicable to
RTL9054, RTL9068, RTL9072, RTL9075, RTL9068, RTL9071.
====================
Link: https://patch.msgid.link/20240904032114.247117-1-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:14 +0000 (11:21 +0800)]
 
MAINTAINERS: Add the rtase ethernet driver entry
Add myself and Larry Chiu as the maintainer for the rtase ethernet driver.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-14-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:13 +0000 (11:21 +0800)]
 
realtek: Update the Makefile and Kconfig in the realtek folder
1. Add the RTASE entry in the Kconfig.
2. Add the CONFIG_RTASE entry in the Makefile.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-13-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:12 +0000 (11:21 +0800)]
 
rtase: Add a Makefile in the rtase folder
Add a Makefile in the rtase folder to build rtase driver.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-12-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:11 +0000 (11:21 +0800)]
 
rtase: Implement ethtool function
Implement the ethtool function to support users to obtain network card
information, including obtaining various device settings, Report whether
physical link is up, Report pause parameters, Set pause parameters,
Return extended statistics about the device.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-11-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:10 +0000 (11:21 +0800)]
 
rtase: Implement pci_driver suspend and resume function
Implement the pci_driver suspend function to enable the device
to sleep, and implement the resume function to enable the device
to resume operation.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-10-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:09 +0000 (11:21 +0800)]
 
rtase: Implement net_device_ops
1. Implement .ndo_set_rx_mode so that the device can change address
list filtering.
2. Implement .ndo_set_mac_address so that mac address can be changed.
3. Implement .ndo_change_mtu so that mtu can be changed.
4. Implement .ndo_tx_timeout to perform related processing when the
transmitter does not make any progress.
5. Implement .ndo_get_stats64 to provide statistics that are called
when the user wants to get network device usage.
6. Implement .ndo_vlan_rx_add_vid to register VLAN ID when the device
supports VLAN filtering.
7. Implement .ndo_vlan_rx_kill_vid to unregister VLAN ID when the device
supports VLAN filtering.
8. Implement the .ndo_setup_tc to enable setting any "tc" scheduler,
classifier or action on dev.
9. Implement .ndo_fix_features enables adjusting requested feature flags
based on device-specific constraints.
10. Implement .ndo_set_features enables updating device configuration to
new features.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-9-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:08 +0000 (11:21 +0800)]
 
rtase: Implement a function to receive packets
Implement rx_handler to read the information of the rx descriptor,
thereby checking the packet accordingly and storing the packet
in the socket buffer to complete the reception of the packet.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-8-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:07 +0000 (11:21 +0800)]
 
rtase: Implement .ndo_start_xmit function
Implement .ndo_start_xmit function to fill the information of the packet
to be transmitted into the tx descriptor, and then the hardware will
transmit the packet using the information in the tx descriptor.
In addition, we also implemented the tx_handler function to enable the
tx descriptor to be reused.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-7-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:06 +0000 (11:21 +0800)]
 
rtase: Implement hardware configuration function
Implement rtase_hw_config to set default hardware settings, including
setting interrupt mitigation, tx/rx DMA burst, interframe gap time,
rx packet filter, near fifo threshold and fill descriptor ring and
tally counter address, and enable flow control. When filling the
rx descriptor ring, the first group of queues needs to be processed
separately because the positions of the first group of queues are not
regular with other subsequent groups. The other queues are all newly
added features, but we want to retain the original design. So they were
not put together.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-6-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:05 +0000 (11:21 +0800)]
 
rtase: Implement the interrupt routine and rtase_poll
1. Implement rtase_interrupt to handle txQ0/rxQ0, txQ4~txQ7 interrupts,
and implement rtase_q_interrupt to handle txQ1/rxQ1, txQ2/rxQ2 and
txQ3/rxQ3 interrupts.
2. Implement rtase_poll to call ring_handler to process the tx or
rx packet of each ring. If the returned value is budget,it means that
there is still work of a certain ring that has not yet been completed.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-5-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:04 +0000 (11:21 +0800)]
 
rtase: Implement the rtase_down function
Implement the rtase_down function to disable hardware setting
and interrupt and clear descriptor ring.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-4-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:03 +0000 (11:21 +0800)]
 
rtase: Implement the .ndo_open function
Implement the .ndo_open function to set default hardware settings
and initialize the descriptor ring and interrupts. Among them,
when requesting interrupt, because the first group of interrupts
needs to process more events, the overall structure and interrupt
handler will be different from other groups of interrupts, so it
needs to be handled separately. The first set of interrupt handlers
need to handle the interrupt status of RXQ0 and TXQ0, TXQ4~7,
while other groups of interrupt handlers will handle the interrupt
status of RXQ1&TXQ1 or RXQ2&TXQ2 or RXQ3&TXQ3 according to the
interrupt vector.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-3-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:02 +0000 (11:21 +0800)]
 
rtase: Add support for a pci table in this module
Add support for a pci table in this module, and implement pci_driver
function to initialize this driver, remove this driver, or shutdown
this driver.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-2-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 6 Sep 2024 03:27:09 +0000 (20:27 -0700)]
 
Merge git://git./linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/phy/phy_device.c
  
2560db6ede1a ("net: phy: Fix missing of_node_put() for leds")
  
1dce520abd46 ("net: phy: Use for_each_available_child_of_node_scoped()")
https://lore.kernel.org/
20240904115823.
74333648@canb.auug.org.au
Adjacent changes:
drivers/net/ethernet/xilinx/xilinx_axienet.h
drivers/net/ethernet/xilinx/xilinx_axienet_main.c
  
858430db28a5 ("net: xilinx: axienet: Fix race in axienet_stop")
  
76abb5d675c4 ("net: xilinx: axienet: Add statistics support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Wed, 4 Sep 2024 09:10:24 +0000 (10:10 +0100)]
 
netlink: specs: nftables: allow decode of tailscale ruleset
Fill another small gap in the nftables spec so that it is possible to
dump a tailscale ruleset with:
  tools/net/ynl/cli.py --spec \
     Documentation/netlink/specs/nftables.yaml --dump getrule
This adds support for the 'target' expression.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20240904091024.3138-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joe Damato [Wed, 4 Sep 2024 15:34:30 +0000 (15:34 +0000)]
 
net: napi: Prevent overflow of napi_defer_hard_irqs
In commit 
6f8b12d661d0 ("net: napi: add hard irqs deferral feature")
napi_defer_irqs was added to net_device and napi_defer_irqs_count was
added to napi_struct, both as type int.
This value never goes below zero, so there is not reason for it to be a
signed int. Change the type for both from int to u32, and add an
overflow check to sysfs to limit the value to S32_MAX.
The limit of S32_MAX was chosen because the practical limit before this
patch was S32_MAX (anything larger was an overflow) and thus there are
no behavioral changes introduced. If the extra bit is needed in the
future, the limit can be raised.
Before this patch:
$ sudo bash -c 'echo 
2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
$ cat /sys/class/net/eth4/napi_defer_hard_irqs
-
2147483647
After this patch:
$ sudo bash -c 'echo 
2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
bash: line 0: echo: write error: Numerical result out of range
Similarly, /sys/class/net/XXXXX/tx_queue_len is defined as unsigned:
include/linux/netdevice.h:      unsigned int            tx_queue_len;
And has an overflow check:
dev_change_tx_queue_len(..., unsigned long new_len):
  if (new_len != (unsigned int)new_len)
          return -ERANGE;
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240904153431.307932-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Fri, 6 Sep 2024 00:08:01 +0000 (17:08 -0700)]
 
Merge tag 'net-6.11-rc7' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from can, bluetooth and wireless.
  No known regressions at this point. Another calm week, but chances are
  that has more to do with vacation season than the quality of our work.
  Current release - new code bugs:
   - smc: prevent NULL pointer dereference in txopt_get
   - eth: ti: am65-cpsw: number of XDP-related fixes
  Previous releases - regressions:
   - Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over
     BREDR/LE", it breaks existing user space
   - Bluetooth: qca: if memdump doesn't work, re-enable IBS to avoid
     later problems with suspend
   - can: mcp251x: fix deadlock if an interrupt occurs during
     mcp251x_open
   - eth: r8152: fix the firmware communication error due to use of bulk
     write
   - ptp: ocp: fix serial port information export
   - eth: igb: fix not clearing TimeSync interrupts for 82580
   - Revert "wifi: ath11k: support hibernation", fix suspend on Lenovo
  Previous releases - always broken:
   - eth: intel: fix crashes and bugs when reconfiguration and resets
     happening in parallel
   - wifi: ath11k: fix NULL dereference in ath11k_mac_get_eirp_power()
  Misc:
   - docs: netdev: document guidance on cleanup.h"
* tag 'net-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
  ila: call nf_unregister_net_hooks() sooner
  tools/net/ynl: fix cli.py --subscribe feature
  MAINTAINERS: fix ptp ocp driver maintainers address
  selftests: net: enable bind tests
  net: dsa: vsc73xx: fix possible subblocks range of CAPT block
  sched: sch_cake: fix bulk flow accounting logic for host fairness
  docs: netdev: document guidance on cleanup.h
  net: xilinx: axienet: Fix race in axienet_stop
  net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN
  r8152: fix the firmware doesn't work
  fou: Fix null-ptr-deref in GRO.
  bareudp: Fix device stats updates.
  net: mana: Fix error handling in mana_create_txq/rxq's NAPI cleanup
  bpf, net: Fix a potential race in do_sock_getsockopt()
  net: dqs: Do not use extern for unused dql_group
  sch/netem: fix use after free in netem_dequeue
  usbnet: modern method to get random MAC
  MAINTAINERS: wifi: cw1200: add net-cw1200.h
  ice: do not bring the VSI up, if it was down before the XDP setup
  ice: remove ICE_CFG_BUSY locking from AF_XDP code
  ...
Linus Torvalds [Thu, 5 Sep 2024 23:49:10 +0000 (16:49 -0700)]
 
Merge tag 'spi-fix-v6.11-rc6' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A few small driver specific fixes (including some of the widespread
  work on fixing missing ID tables for module autoloading and the revert
  of some problematic PM work in spi-rockchip), some improvements to the
  MAINTAINERS information for the NXP drivers and the addition of a new
  device ID to spidev"
* tag 'spi-fix-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  MAINTAINERS: SPI: Add mailing list imx@lists.linux.dev for nxp spi drivers
  MAINTAINERS: SPI: Add freescale lpspi maintainer information
  spi: spi-fsl-lpspi: Fix off-by-one in prescale max
  spi: spidev: Add missing spi_device_id for jg10309-01
  spi: bcm63xx: Enable module autoloading
  spi: intel: Add check devm_kasprintf() returned value
  spi: spidev: Add an entry for elgin,jg10309-01
  spi: rockchip: Resolve unbalanced runtime PM / system PM handling
Linus Torvalds [Thu, 5 Sep 2024 23:41:16 +0000 (16:41 -0700)]
 
Merge tag 'regulator-fix-v6.11-stub' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "A fix from Doug Anderson for a missing stub, required to fix the build
  for some newly added users of devm_regulator_bulk_get_const() in
  !REGULATOR configurations"
* tag 'regulator-fix-v6.11-stub' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: Stub devm_regulator_bulk_get_const() if !CONFIG_REGULATOR
Linus Torvalds [Thu, 5 Sep 2024 23:35:57 +0000 (16:35 -0700)]
 
Merge tag 'rust-fixes-6.11-2' of https://github.com/Rust-for-Linux/linux
Pull Rust fixes from Miguel Ojeda:
 "Toolchain and infrastructure:
   - Fix builds for nightly compiler users now that 'new_uninit' was
     split into new features by using an alternative approach for the
     code that used what is now called the 'box_uninit_write' feature
   - Allow the 'stable_features' lint to preempt upcoming warnings about
     them, since soon there will be unstable features that will become
     stable in nightly compilers
   - Export bss symbols too
  'kernel' crate:
   - 'block' module: fix wrong usage of lockdep API
  'macros' crate:
   - Provide correct provenance when constructing 'THIS_MODULE'
  Documentation:
   - Remove unintended indentation (blockquotes) in generated output
   - Fix a couple typos
  MAINTAINERS:
   - Remove Wedson as Rust maintainer
   - Update Andreas' email"
* tag 'rust-fixes-6.11-2' of https://github.com/Rust-for-Linux/linux:
  MAINTAINERS: update Andreas Hindborg's email address
  MAINTAINERS: Remove Wedson as Rust maintainer
  rust: macros: provide correct provenance when constructing THIS_MODULE
  rust: allow `stable_features` lint
  docs: rust: remove unintended blockquote in Quick Start
  rust: alloc: eschew `Box<MaybeUninit<T>>::write`
  rust: kernel: fix typos in code comments
  docs: rust: remove unintended blockquote in Coding Guidelines
  rust: block: fix wrong usage of lockdep API
  rust: kbuild: fix export of bss symbols
Linus Torvalds [Thu, 5 Sep 2024 23:29:41 +0000 (16:29 -0700)]
 
Merge tag 'trace-v6.11-rc4' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
 - Fix adding a new fgraph callback after function graph tracing has
   already started.
   If the new caller does not initialize its hash before registering the
   fgraph_ops, it can cause a NULL pointer dereference. Fix this by
   adding a new parameter to ftrace_graph_enable_direct() passing in the
   newly added gops directly and not rely on using the fgraph_array[],
   as entries in the fgraph_array[] must be initialized.
   Assign the new gops to the fgraph_array[] after it goes through
   ftrace_startup_subops() as that will properly initialize the
   gops->ops and initialize its hashes.
 - Fix a memory leak in fgraph storage memory test.
   If the "multiple fgraph storage on a function" boot up selftest fails
   in the registering of the function graph tracer, it will not free the
   memory it allocated for the filter. Break the loop up into two where
   it allocates the filters first and then registers the functions where
   any errors will do the appropriate clean ups.
 - Only clear the timerlat timers if it has an associated kthread.
   In the rtla tool that uses timerlat, if it was killed just as it was
   shutting down, the signals can free the kthread and the timer. But
   the closing of the timerlat files could cause the hrtimer_cancel() to
   be called on the already freed timer. As the kthread variable is is
   set to NULL when the kthreads are stopped and the timers are freed it
   can be used to know not to call hrtimer_cancel() on the timer if the
   kthread variable is NULL.
 - Use a cpumask to keep track of osnoise/timerlat kthreads
   The timerlat tracer can use user space threads for its analysis. With
   the killing of the rtla tool, the kernel can get confused between if
   it is using a user space thread to analyze or one of its own kernel
   threads. When this confusion happens, kthread_stop() can be called on
   a user space thread and bad things happen. As the kernel threads are
   per-cpu, a bitmask can be used to know when a kernel thread is used
   or when a user space thread is used.
 - Add missing interface_lock to osnoise/timerlat stop_kthread()
   The stop_kthread() function in osnoise/timerlat clears the osnoise
   kthread variable, and if it was a user space thread does a put_task
   on it. But this can race with the closing of the timerlat files that
   also does a put_task on the kthread, and if the race happens the task
   will have put_task called on it twice and oops.
 - Add cond_resched() to the tracing_iter_reset() loop.
   The latency tracers keep writing to the ring buffer without resetting
   when it issues a new "start" event (like interrupts being disabled).
   When reading the buffer with an iterator, the tracing_iter_reset()
   sets its pointer to that start event by walking through all the
   events in the buffer until it gets to the time stamp of the start
   event. In the case of a very large buffer, the loop that looks for
   the start event has been reported taking a very long time with a non
   preempt kernel that it can trigger a soft lock up warning. Add a
   cond_resched() into that loop to make sure that doesn't happen.
 - Use list_del_rcu() for eventfs ei->list variable
   It was reported that running loops of creating and deleting kprobe
   events could cause a crash due to the eventfs list iteration hitting
   a LIST_POISON variable. This is because the list is protected by SRCU
   but when an item is deleted from the list, it was using list_del()
   which poisons the "next" pointer. This is what list_del_rcu() was to
   prevent.
* tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()
  tracing/timerlat: Only clear timer if a kthread exists
  tracing/osnoise: Use a cpumask to know what threads are kthreads
  eventfs: Use list_del_rcu() for SRCU protected list variable
  tracing: Avoid possible softlockup in tracing_iter_reset()
  tracing: Fix memory leak in fgraph storage selftest
  tracing: fgraph: Fix to add new fgraph_ops to array after ftrace_startup_subops()
Eric Dumazet [Wed, 4 Sep 2024 14:44:18 +0000 (14:44 +0000)]
 
ila: call nf_unregister_net_hooks() sooner
syzbot found an use-after-free Read in ila_nf_input [1]
Issue here is that ila_xlat_exit_net() frees the rhashtable,
then call nf_unregister_net_hooks().
It should be done in the reverse way, with a synchronize_rcu().
This is a good match for a pre_exit() method.
[1]
 BUG: KASAN: use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline]
 BUG: KASAN: use-after-free in __rhashtable_lookup include/linux/rhashtable.h:604 [inline]
 BUG: KASAN: use-after-free in rhashtable_lookup include/linux/rhashtable.h:646 [inline]
 BUG: KASAN: use-after-free in rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672
Read of size 4 at addr 
ffff888064620008 by task ksoftirqd/0/16
CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 
6.11.0-rc4-syzkaller-00238-g2ad6d23f465a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:93 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
  print_address_description mm/kasan/report.c:377 [inline]
  print_report+0x169/0x550 mm/kasan/report.c:488
  kasan_report+0x143/0x180 mm/kasan/report.c:601
  rht_key_hashfn include/linux/rhashtable.h:159 [inline]
  __rhashtable_lookup include/linux/rhashtable.h:604 [inline]
  rhashtable_lookup include/linux/rhashtable.h:646 [inline]
  rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672
  ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:132 [inline]
  ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline]
  ila_nf_input+0x1fe/0x3c0 net/ipv6/ila/ila_xlat.c:190
  nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
  nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
  nf_hook include/linux/netfilter.h:269 [inline]
  NF_HOOK+0x29e/0x450 include/linux/netfilter.h:312
  __netif_receive_skb_one_core net/core/dev.c:5661 [inline]
  __netif_receive_skb+0x1ea/0x650 net/core/dev.c:5775
  process_backlog+0x662/0x15b0 net/core/dev.c:6108
  __napi_poll+0xcb/0x490 net/core/dev.c:6772
  napi_poll net/core/dev.c:6841 [inline]
  net_rx_action+0x89b/0x1240 net/core/dev.c:6963
  handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
  run_ksoftirqd+0xca/0x130 kernel/softirq.c:928
  smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164
  kthread+0x2f0/0x390 kernel/kthread.c:389
  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:
0000000000000000 index:0x0 pfn:0x64620
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xbfffffff(buddy)
raw: 
00fff00000000000 ffffea0000959608 ffffea00019d9408 0000000000000000
raw: 
0000000000000000 0000000000000003 00000000bfffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as freed
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO), pid 5242, tgid 5242 (syz-executor), ts 
73611328570, free_ts 
618981657187
  set_page_owner include/linux/page_owner.h:32 [inline]
  post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
  prep_new_page mm/page_alloc.c:1501 [inline]
  get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
  __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
  __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
  alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
  ___kmalloc_large_node+0x8b/0x1d0 mm/slub.c:4103
  __kmalloc_large_node_noprof+0x1a/0x80 mm/slub.c:4130
  __do_kmalloc_node mm/slub.c:4146 [inline]
  __kmalloc_node_noprof+0x2d2/0x440 mm/slub.c:4164
  __kvmalloc_node_noprof+0x72/0x190 mm/util.c:650
  bucket_table_alloc lib/rhashtable.c:186 [inline]
  rhashtable_init_noprof+0x534/0xa60 lib/rhashtable.c:1071
  ila_xlat_init_net+0xa0/0x110 net/ipv6/ila/ila_xlat.c:613
  ops_init+0x359/0x610 net/core/net_namespace.c:139
  setup_net+0x515/0xca0 net/core/net_namespace.c:343
  copy_net_ns+0x4e2/0x7b0 net/core/net_namespace.c:508
  create_new_namespaces+0x425/0x7b0 kernel/nsproxy.c:110
  unshare_nsproxy_namespaces+0x124/0x180 kernel/nsproxy.c:228
  ksys_unshare+0x619/0xc10 kernel/fork.c:3328
  __do_sys_unshare kernel/fork.c:3399 [inline]
  __se_sys_unshare kernel/fork.c:3397 [inline]
  __x64_sys_unshare+0x38/0x40 kernel/fork.c:3397
page last free pid 11846 tgid 11846 stack trace:
  reset_page_owner include/linux/page_owner.h:25 [inline]
  free_pages_prepare mm/page_alloc.c:1094 [inline]
  free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
  __folio_put+0x2c8/0x440 mm/swap.c:128
  folio_put include/linux/mm.h:1486 [inline]
  free_large_kmalloc+0x105/0x1c0 mm/slub.c:4565
  kfree+0x1c4/0x360 mm/slub.c:4588
  rhashtable_free_and_destroy+0x7c6/0x920 lib/rhashtable.c:1169
  ila_xlat_exit_net+0x55/0x110 net/ipv6/ila/ila_xlat.c:626
  ops_exit_list net/core/net_namespace.c:173 [inline]
  cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
  process_one_work kernel/workqueue.c:3231 [inline]
  process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
  worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
  kthread+0x2f0/0x390 kernel/kthread.c:389
  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Memory state around the buggy address:
 
ffff88806461ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 
ffff88806461ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>
ffff888064620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                      ^
 
ffff888064620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 
ffff888064620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Fixes: 
7f00feaf1076 ("ila: Add generic ILA translation facility")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20240904144418.1162839-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arkadiusz Kubalewski [Wed, 4 Sep 2024 13:50:34 +0000 (15:50 +0200)]
 
tools/net/ynl: fix cli.py --subscribe feature
Execution of command:
./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml /
	--subscribe "monitor" --sleep 10
fails with:
  File "/repo/./tools/net/ynl/cli.py", line 109, in main
    ynl.check_ntf()
  File "/repo/tools/net/ynl/lib/ynl.py", line 924, in check_ntf
    op = self.rsp_by_value[nl_msg.cmd()]
KeyError: 19
Parsing Generic Netlink notification messages performs lookup for op in
the message. The message was not yet decoded, and is not yet considered
GenlMsg, thus msg.cmd() returns Generic Netlink family id (19) instead of
proper notification command id (i.e.: DPLL_CMD_PIN_CHANGE_NTF=13).
Allow the op to be obtained within NetlinkProtocol.decode(..) itself if the
op was not passed to the decode function, thus allow parsing of Generic
Netlink notifications without causing the failure.
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/netdev/m2le0n5xpn.fsf@gmail.com/
Fixes: 
0a966d606c68 ("tools/net/ynl: Fix extack decoding for directional ops")
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20240904135034.316033-1-arkadiusz.kubalewski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Fedorenko [Wed, 4 Sep 2024 13:18:55 +0000 (13:18 +0000)]
 
MAINTAINERS: fix ptp ocp driver maintainers address
While checking the latest series for ptp_ocp driver I realised that
MAINTAINERS file has wrong item about email on linux.dev domain.
Fixes: 
795fd9342c62 ("ptp_ocp: adjust MAINTAINERS and mailmap")
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904131855.559078-1-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jamie Bainbridge [Wed, 4 Sep 2024 06:12:26 +0000 (16:12 +1000)]
 
selftests: net: enable bind tests
bind_wildcard is compiled but not run, bind_timewait is not compiled.
These two tests complete in a very short time, use the test harness
properly, and seem reasonable to enable.
The author of the tests confirmed via email that these were
intended to be run.
Enable these two tests.
Fixes: 
13715acf8ab5 ("selftest: Add test for bind() conflicts.")
Fixes: 
2c042e8e54ef ("tcp: Add selftest for bind() and TIME_WAIT.")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/5a009b26cf5fb1ad1512d89c61b37e2fac702323.1725430322.git.jamie.bainbridge@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Frank Li [Thu, 5 Sep 2024 15:52:30 +0000 (11:52 -0400)]
 
MAINTAINERS: SPI: Add mailing list imx@lists.linux.dev for nxp spi drivers
Add mailing list imx@lists.linux.dev for nxp spi drivers(qspi, fspi and
dspi).
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20240905155230.1901787-1-Frank.Li@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Frank Li [Thu, 5 Sep 2024 15:41:24 +0000 (11:41 -0400)]
 
MAINTAINERS: SPI: Add freescale lpspi maintainer information
Add imx@lists.linux.dev and NXP maintainer information for lpspi driver
(drivers/spi/spi-fsl-lpspi.c).
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20240905154124.1901311-1-Frank.Li@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Linus Torvalds [Thu, 5 Sep 2024 16:57:50 +0000 (09:57 -0700)]
 
Merge tag 'platform-drivers-x86-v6.11-6' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
 - amd/pmf: ASUS GA403 quirk matching tweak
 - dell-smbios: Fix to the init function rollback path
* tag 'platform-drivers-x86-v6.11-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/amd: pmf: Make ASUS GA403 quirk generic
  platform/x86: dell-smbios: Fix error path in dell_smbios_init()
Linus Torvalds [Thu, 5 Sep 2024 16:43:38 +0000 (09:43 -0700)]
 
Merge tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull kunit fix fromShuah Khan:
 "One single fix to a use-after-free bug resulting from
  kunit_driver_create() failing to copy the driver name leaving it on
  the stack or freeing it"
* tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Device wrappers should also manage driver name
Steven Rostedt [Thu, 5 Sep 2024 15:33:59 +0000 (11:33 -0400)]
 
tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()
The timerlat interface will get and put the task that is part of the
"kthread" field of the osn_var to keep it around until all references are
released. But here's a race in the "stop_kthread()" code that will call
put_task_struct() on the kthread if it is not a kernel thread. This can
race with the releasing of the references to that task struct and the
put_task_struct() can be called twice when it should have been called just
once.
Take the interface_lock() in stop_kthread() to synchronize this change.
But to do so, the function stop_per_cpu_kthreads() needs to change the
loop from for_each_online_cpu() to for_each_possible_cpu() and remove the
cpu_read_lock(), as the interface_lock can not be taken while the cpu
locks are held. The only side effect of this change is that it may do some
extra work, as the per_cpu variables of the offline CPUs would not be set
anyway, and would simply be skipped in the loop.
Remove unneeded "return;" in stop_kthread().
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240905113359.2b934242@gandalf.local.home
Fixes: 
e88ed227f639e ("tracing/timerlat: Add user-space interface")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Thu, 5 Sep 2024 12:53:30 +0000 (08:53 -0400)]
 
tracing/timerlat: Only clear timer if a kthread exists
The timerlat tracer can use user space threads to check for osnoise and
timer latency. If the program using this is killed via a SIGTERM, the
threads are shutdown one at a time and another tracing instance can start
up resetting the threads before they are fully closed. That causes the
hrtimer assigned to the kthread to be shutdown and freed twice when the
dying thread finally closes the file descriptors, causing a use-after-free
bug.
Only cancel the hrtimer if the associated thread is still around. Also add
the interface_lock around the resetting of the tlat_var->kthread.
Note, this is just a quick fix that can be backported to stable. A real
fix is to have a better synchronization between the shutdown of old
threads and the starting of new ones.
Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240905085330.45985730@gandalf.local.home
Fixes: 
e88ed227f639e ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Wed, 4 Sep 2024 14:34:28 +0000 (10:34 -0400)]
 
tracing/osnoise: Use a cpumask to know what threads are kthreads
The start_kthread() and stop_thread() code was not always called with the
interface_lock held. This means that the kthread variable could be
unexpectedly changed causing the kthread_stop() to be called on it when it
should not have been, leading to:
 while true; do
   rtla timerlat top -u -q & PID=$!;
   sleep 5;
   kill -INT $PID;
   sleep 0.001;
   kill -TERM $PID;
   wait $PID;
  done
Causing the following OOPS:
 Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
 KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
 CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 
6.11.0-rc4-test-00002-gbc754cc76d1b-dirty #125 
a533010b71dab205ad2f507188ce8c82203b0254
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 RIP: 0010:hrtimer_active+0x58/0x300
 Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f
 RSP: 0018:
ffff88811d97f940 EFLAGS: 
00010202
 RAX: 
0000000000000002 RBX: 
ffff88823c6b5b28 RCX: 
ffffed10478d6b6b
 RDX: 
dffffc0000000000 RSI: 
ffffed10478d6b6c RDI: 
ffff88823c6b5b28
 RBP: 
0000000000000000 R08: 
ffff88823c6b5b58 R09: 
ffff88823c6b5b60
 R10: 
ffff88811d97f957 R11: 
0000000000000010 R12: 
00000000000a801d
 R13: 
ffff88810d8b35d8 R14: 
0000000000000010 R15: 
ffff88823c6b5b28
 FS:  
0000000000000000(0000) GS:
ffff88823c680000(0000) knlGS:
0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 
0000000080050033
 CR2: 
0000561858ad7258 CR3: 
000000007729e001 CR4: 
0000000000170ef0
 Call Trace:
  <TASK>
  ? die_addr+0x40/0xa0
  ? exc_general_protection+0x154/0x230
  ? asm_exc_general_protection+0x26/0x30
  ? hrtimer_active+0x58/0x300
  ? __pfx_mutex_lock+0x10/0x10
  ? __pfx_locks_remove_file+0x10/0x10
  hrtimer_cancel+0x15/0x40
  timerlat_fd_release+0x8e/0x1f0
  ? security_file_release+0x43/0x80
  __fput+0x372/0xb10
  task_work_run+0x11e/0x1f0
  ? _raw_spin_lock+0x85/0xe0
  ? __pfx_task_work_run+0x10/0x10
  ? poison_slab_object+0x109/0x170
  ? do_exit+0x7a0/0x24b0
  do_exit+0x7bd/0x24b0
  ? __pfx_migrate_enable+0x10/0x10
  ? __pfx_do_exit+0x10/0x10
  ? __pfx_read_tsc+0x10/0x10
  ? ktime_get+0x64/0x140
  ? _raw_spin_lock_irq+0x86/0xe0
  do_group_exit+0xb0/0x220
  get_signal+0x17ba/0x1b50
  ? vfs_read+0x179/0xa40
  ? timerlat_fd_read+0x30b/0x9d0
  ? __pfx_get_signal+0x10/0x10
  ? __pfx_timerlat_fd_read+0x10/0x10
  arch_do_signal_or_restart+0x8c/0x570
  ? __pfx_arch_do_signal_or_restart+0x10/0x10
  ? vfs_read+0x179/0xa40
  ? ksys_read+0xfe/0x1d0
  ? __pfx_ksys_read+0x10/0x10
  syscall_exit_to_user_mode+0xbc/0x130
  do_syscall_64+0x74/0x110
  ? __pfx___rseq_handle_notify_resume+0x10/0x10
  ? __pfx_ksys_read+0x10/0x10
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? fpregs_restore_userregs+0xdb/0x1e0
  ? syscall_exit_to_user_mode+0x116/0x130
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  ? do_syscall_64+0x74/0x110
  entry_SYSCALL_64_after_hwframe+0x71/0x79
 RIP: 0033:0x7ff0070eca9c
 Code: Unable to access opcode bytes at 0x7ff0070eca72.
 RSP: 002b:
00007ff006dff8c0 EFLAGS: 
00000246 ORIG_RAX: 
0000000000000000
 RAX: 
0000000000000000 RBX: 
0000000000000005 RCX: 
00007ff0070eca9c
 RDX: 
0000000000000400 RSI: 
00007ff006dff9a0 RDI: 
0000000000000003
 RBP: 
00007ff006dffde0 R08: 
0000000000000000 R09: 
00007ff000000ba0
 R10: 
00007ff007004b08 R11: 
0000000000000246 R12: 
0000000000000003
 R13: 
00007ff006dff9a0 R14: 
0000000000000007 R15: 
0000000000000008
  </TASK>
 Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core
 ---[ end trace 
0000000000000000 ]---
This is because it would mistakenly call kthread_stop() on a user space
thread making it "exit" before it actually exits.
Since kthreads are created based on global behavior, use a cpumask to know
when kthreads are running and that they need to be shutdown before
proceeding to do new work.
Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/
This was debugged by using the persistent ring buffer:
Link: https://lore.kernel.org/all/20240823013902.135036960@goodmis.org/
Note, locking was originally used to fix this, but that proved to cause too
many deadlocks to work around:
  https://lore.kernel.org/linux-trace-kernel/
20240823102816.
5e55753b@gandalf.local.home/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240904103428.08efdf4c@gandalf.local.home
Fixes: 
e88ed227f639e ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Steven Rostedt [Wed, 4 Sep 2024 17:16:05 +0000 (13:16 -0400)]
 
eventfs: Use list_del_rcu() for SRCU protected list variable
Chi Zhiling reported:
  We found a null pointer accessing in tracefs[1], the reason is that the
  variable 'ei_child' is set to LIST_POISON1, that means the list was
  removed in eventfs_remove_rec. so when access the ei_child->is_freed, the
  panic triggered.
  by the way, the following script can reproduce this panic
  loop1 (){
      while true
      do
          echo "p:kp submit_bio" > /sys/kernel/debug/tracing/kprobe_events
          echo "" > /sys/kernel/debug/tracing/kprobe_events
      done
  }
  loop2 (){
      while true
      do
          tree /sys/kernel/debug/tracing/events/kprobes/
      done
  }
  loop1 &
  loop2
  [1]:
  [ 1147.959632][T17331] Unable to handle kernel paging request at virtual address 
dead000000000150
  [ 1147.968239][T17331] Mem abort info:
  [ 1147.971739][T17331]   ESR = 0x0000000096000004
  [ 1147.976172][T17331]   EC = 0x25: DABT (current EL), IL = 32 bits
  [ 1147.982171][T17331]   SET = 0, FnV = 0
  [ 1147.985906][T17331]   EA = 0, S1PTW = 0
  [ 1147.989734][T17331]   FSC = 0x04: level 0 translation fault
  [ 1147.995292][T17331] Data abort info:
  [ 1147.998858][T17331]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
  [ 1148.005023][T17331]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  [ 1148.010759][T17331]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  [ 1148.016752][T17331] [
dead000000000150] address between user and kernel address ranges
  [ 1148.024571][T17331] Internal error: Oops: 
0000000096000004 [#1] SMP
  [ 1148.030825][T17331] Modules linked in: team_mode_loadbalance team nlmon act_gact cls_flower sch_ingress bonding tls macvlan dummy ib_core bridge stp llc veth amdgpu amdxcp mfd_core gpu_sched drm_exec drm_buddy radeon crct10dif_ce video drm_suballoc_helper ghash_ce drm_ttm_helper sha2_ce ttm sha256_arm64 i2c_algo_bit sha1_ce sbsa_gwdt cp210x drm_display_helper cec sr_mod cdrom drm_kms_helper binfmt_misc sg loop fuse drm dm_mod nfnetlink ip_tables autofs4 [last unloaded: tls]
  [ 1148.072808][T17331] CPU: 3 PID: 17331 Comm: ls Tainted: G        W         ------- ----  6.6.43 #2
  [ 1148.081751][T17331] Source Version: 
21b3b386e948bedd29369af66f3e98ab01b1c650
  [ 1148.088783][T17331] Hardware name: Greatwall GW-001M1A-FTF/GW-001M1A-FTF, BIOS KunLun BIOS V4.0 07/16/2020
  [ 1148.098419][T17331] pstate: 
20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  [ 1148.106060][T17331] pc : eventfs_iterate+0x2c0/0x398
  [ 1148.111017][T17331] lr : eventfs_iterate+0x2fc/0x398
  [ 1148.115969][T17331] sp : 
ffff80008d56bbd0
  [ 1148.119964][T17331] x29: 
ffff80008d56bbf0 x28: 
ffff001ff5be2600 x27: 
0000000000000000
  [ 1148.127781][T17331] x26: 
ffff001ff52ca4e0 x25: 
0000000000009977 x24: 
dead000000000100
  [ 1148.135598][T17331] x23: 
0000000000000000 x22: 
000000000000000b x21: 
ffff800082645f10
  [ 1148.143415][T17331] x20: 
ffff001fddf87c70 x19: 
ffff80008d56bc90 x18: 
0000000000000000
  [ 1148.151231][T17331] x17: 
0000000000000000 x16: 
0000000000000000 x15: 
ffff001ff52ca4e0
  [ 1148.159048][T17331] x14: 
0000000000000000 x13: 
0000000000000000 x12: 
0000000000000000
  [ 1148.166864][T17331] x11: 
0000000000000000 x10: 
0000000000000000 x9 : 
ffff8000804391d0
  [ 1148.174680][T17331] x8 : 
0000000180000000 x7 : 
0000000000000018 x6 : 
0000aaab04b92862
  [ 1148.182498][T17331] x5 : 
0000aaab04b92862 x4 : 
0000000080000000 x3 : 
0000000000000068
  [ 1148.190314][T17331] x2 : 
000000000000000f x1 : 
0000000000007ea8 x0 : 
0000000000000001
  [ 1148.198131][T17331] Call trace:
  [ 1148.201259][T17331]  eventfs_iterate+0x2c0/0x398
  [ 1148.205864][T17331]  iterate_dir+0x98/0x188
  [ 1148.210036][T17331]  __arm64_sys_getdents64+0x78/0x160
  [ 1148.215161][T17331]  invoke_syscall+0x78/0x108
  [ 1148.219593][T17331]  el0_svc_common.constprop.0+0x48/0xf0
  [ 1148.224977][T17331]  do_el0_svc+0x24/0x38
  [ 1148.228974][T17331]  el0_svc+0x40/0x168
  [ 1148.232798][T17331]  el0t_64_sync_handler+0x120/0x130
  [ 1148.237836][T17331]  el0t_64_sync+0x1a4/0x1a8
  [ 1148.242182][T17331] Code: 
54ffff6c f9400676 910006d6 f9000676 (
b9405300)
  [ 1148.248955][T17331] ---[ end trace 
0000000000000000 ]---
The issue is that list_del() is used on an SRCU protected list variable
before the synchronization occurs. This can poison the list pointers while
there is a reader iterating the list.
This is simply fixed by using list_del_rcu() that is specifically made for
this purpose.
Link: https://lore.kernel.org/linux-trace-kernel/20240829085025.3600021-1-chizhiling@163.com/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lore.kernel.org/20240904131605.640d42b1@gandalf.local.home
Fixes: 
43aa6f97c2d03 ("eventfs: Get rid of dentry pointers without refcounts")
Reported-by: Chi Zhiling <chizhiling@kylinos.cn>
Tested-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Zheng Yejian [Tue, 27 Aug 2024 12:46:54 +0000 (20:46 +0800)]
 
tracing: Avoid possible softlockup in tracing_iter_reset()
In __tracing_open(), when max latency tracers took place on the cpu,
the time start of its buffer would be updated, then event entries with
timestamps being earlier than start of the buffer would be skipped
(see tracing_iter_reset()).
Softlockup will occur if the kernel is non-preemptible and too many
entries were skipped in the loop that reset every cpu buffer, so add
cond_resched() to avoid it.
Cc: stable@vger.kernel.org
Fixes: 
2f26ebd549b9a ("tracing: use timestamp to determine start of latency traces")
Link: https://lore.kernel.org/20240827124654.3817443-1-zhengyejian@huaweicloud.com
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Paolo Abeni [Thu, 5 Sep 2024 13:21:14 +0000 (15:21 +0200)]
 
Merge branch 'add-driver-for-motorcomm-yt8821-2-5g-ethernet-phy'
Frank Sae says:
====================
Add driver for Motorcomm yt8821 2.5G ethernet phy
yt8521 and yt8531s as Gigabit transceiver use bit15:14(bit9 reserved
default 0) as phy speed mask, yt8821 as 2.5G transceiver uses bit9 bit15:14
as phy speed mask.
Be compatible to yt8821, reform phy speed mask and phy speed macro.
Based on update above, add yt8821 2.5G phy driver.
====================
Link: https://patch.msgid.link/20240901083526.163784-1-Frank.Sae@motor-comm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Frank Sae [Sun, 1 Sep 2024 08:35:26 +0000 (01:35 -0700)]
 
net: phy: Add driver for Motorcomm yt8821 2.5G ethernet phy
Add a driver for the motorcomm yt8821 2.5G ethernet phy. Verified the
driver on BPI-R3(with MediaTek MT7986(Filogic 830) SoC) development board,
which is developed by Guangdong Bipai Technology Co., Ltd..
yt8821 2.5G ethernet phy works in AUTO_BX2500_SGMII or FORCE_BX2500
interface, supports 2.5G/1000M/100M/10M speeds, and wol(magic package).
Signed-off-by: Frank Sae <Frank.Sae@motor-comm.com>
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Frank Sae [Sun, 1 Sep 2024 08:35:25 +0000 (01:35 -0700)]
 
net: phy: Optimize phy speed mask to be compatible to yt8821
yt8521 and yt8531s as Gigabit transceiver use bit15:14(bit9 reserved
default 0) as phy speed mask, yt8821 as 2.5G transceiver uses bit9 bit15:14
as phy speed mask.
Be compatible to yt8821, reform phy speed mask and phy speed macro.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Frank Sae <Frank.Sae@motor-comm.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 5 Sep 2024 13:08:58 +0000 (15:08 +0200)]
 
Merge tag 'linux-can-next-for-6.12-
20240904-2' of git://git./linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2024-09-04-2
this is a pull request of 18 patches for net-next/master.
All 18 patches add support for CAN-FD IP core found on Rockchip
RK3568.
The first patch is co-developed by Elaine Zhang and me and adds DT
bindings documentation.
The remaining 17 patches are by me and add the driver in several
stages.
linux-can-next-for-6.12-
20240904-2
* tag 'linux-can-next-for-6.12-
20240904-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: rockchip_canfd: add support for CAN_CTRLMODE_BERR_REPORTING
  can: rockchip_canfd: add support for CAN_CTRLMODE_LOOPBACK
  can: rockchip_canfd: add hardware timestamping support
  can: rockchip_canfd: enable full TX-FIFO depth of 2
  can: rockchip_canfd: prepare to use full TX-FIFO depth
  can: rockchip_canfd: add stats support for errata workarounds
  can: rockchip_canfd: rkcanfd_get_berr_counter_corrected(): work around broken {RX,TX}ERRORCNT register
  can: rockchip_canfd: implement workaround for erratum 12
  can: rockchip_canfd: implement workaround for erratum 6
  can: rockchip_canfd: add TX PATH
  can: rockchip_canfd: rkcanfd_register_done(): add warning for erratum 5
  can: rockchip_canfd: rkcanfd_handle_rx_int_one(): implement workaround for erratum 5: check for empty FIFO
  can: rockchip_canfd: add notes about known issues
  can: rockchip_canfd: add support for rk3568v3
  can: rockchip_canfd: add quirk for broken CAN-FD support
  can: rockchip_canfd: add quirks for errata workarounds
  can: rockchip_canfd: add driver for Rockchip CAN-FD controller
  dt-bindings: can: rockchip_canfd: add rockchip CAN-FD controller
====================
Link: https://patch.msgid.link/20240904130256.1965582-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stefan Wahren [Thu, 5 Sep 2024 11:15:37 +0000 (13:15 +0200)]
 
spi: spi-fsl-lpspi: Fix off-by-one in prescale max
The commit 
783bf5d09f86 ("spi: spi-fsl-lpspi: limit PRESCALE bit in
TCR register") doesn't implement the prescaler maximum as intended.
The maximum allowed value for i.MX93 should be 1 and for i.MX7ULP
it should be 7. So this needs also a adjustment of the comparison
in the scldiv calculation.
Fixes: 
783bf5d09f86 ("spi: spi-fsl-lpspi: limit PRESCALE bit in TCR register")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20240905111537.90389-1-wahrenst@gmx.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Chen Ni [Wed, 4 Sep 2024 01:50:03 +0000 (09:50 +0800)]
 
ptp: ptp_idt82p33: Convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904015003.1065872-1-nichen@iscas.ac.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hongbo Li [Wed, 4 Sep 2024 01:49:56 +0000 (09:49 +0800)]
 
net: dsa: felix: Annotate struct action_gate_entry with __counted_by
Add the __counted_by compiler attribute to the flexible array member
entries to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904014956.2035117-1-lihongbo22@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 5 Sep 2024 10:51:48 +0000 (12:51 +0200)]
 
Merge branch 'bonding-support-new-xfrm-state-offload-functions'
Hangbin Liu says:
====================
Bonding: support new xfrm state offload functions
Add 2 new xfrm state offload functions xdo_dev_state_advance_esn and
xdo_dev_state_update_stats for bonding. The xdo_dev_state_free will be
added by Jianbo's patchset [1]. I will add the bonding xfrm policy offload
in future.
v7: no update, just rebase the code.
v6: Use "Return: " based on ./scripts/kernel-doc (Simon Horman)
v5: Rebase to latest net-next, update function doc (Jakub Kicinski)
v4: Ratelimit pr_warn (Sabrina Dubroca)
v3: Re-format bond_ipsec_dev, use slave_warn instead of WARN_ON (Nikolay Aleksandrov)
    Fix bond_ipsec_dev defination, add *. (Simon Horman, kernel test robot)
    Fix "real" typo (kernel test robot)
v2: Add a function to process the common device checking (Nikolay Aleksandrov)
    Remove unused variable (Simon Horman)
v1: lore.kernel.org/netdev/
20240816035518.203704-1-liuhangbin@gmail.com
====================
Link: https://patch.msgid.link/20240904003457.3847086-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hangbin Liu [Wed, 4 Sep 2024 00:34:57 +0000 (08:34 +0800)]
 
bonding: support xfrm state update
The patch add xfrm statistics update for bonding IPsec offload.
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hangbin Liu [Wed, 4 Sep 2024 00:34:56 +0000 (08:34 +0800)]
 
bonding: Add ESN support to IPSec HW offload
Currently, users can see that bonding supports IPSec HW offload via ethtool.
However, this functionality does not work with NICs like Mellanox cards when
ESN (Extended Sequence Numbers) is enabled, as ESN functions are not yet
supported. This patch adds ESN support to the bonding IPSec device offload,
ensuring proper functionality with NICs that support ESN.
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hangbin Liu [Wed, 4 Sep 2024 00:34:55 +0000 (08:34 +0800)]
 
bonding: add common function to check ipsec device
This patch adds a common function to check the status of IPSec devices.
This function will be useful for future implementations, such as IPSec ESN
and state offload callbacks.
Suggested-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Jay Vosburgh <jv@jvosburgh.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pawel Dembicki [Tue, 3 Sep 2024 20:33:41 +0000 (22:33 +0200)]
 
net: dsa: vsc73xx: fix possible subblocks range of CAPT block
CAPT block (CPU Capture Buffer) have 7 sublocks: 0-3, 4, 6, 7.
Function 'vsc73xx_is_addr_valid' allows to use only block 0 at this
moment.
This patch fix it.
Fixes: 
05bd97fc559d ("net: dsa: Add Vitesse VSC73xx DSA router driver")
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240903203340.1518789-1-paweldembicki@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Toke Høiland-Jørgensen [Tue, 3 Sep 2024 16:08:45 +0000 (18:08 +0200)]
 
sched: sch_cake: fix bulk flow accounting logic for host fairness
In sch_cake, we keep track of the count of active bulk flows per host,
when running in dst/src host fairness mode, which is used as the
round-robin weight when iterating through flows. The count of active
bulk flows is updated whenever a flow changes state.
This has a peculiar interaction with the hash collision handling: when a
hash collision occurs (after the set-associative hashing), the state of
the hash bucket is simply updated to match the new packet that collided,
and if host fairness is enabled, that also means assigning new per-host
state to the flow. For this reason, the bulk flow counters of the
host(s) assigned to the flow are decremented, before new state is
assigned (and the counters, which may not belong to the same host
anymore, are incremented again).
Back when this code was introduced, the host fairness mode was always
enabled, so the decrement was unconditional. When the configuration
flags were introduced the *increment* was made conditional, but
the *decrement* was not. Which of course can lead to a spurious
decrement (and associated wrap-around to U16_MAX).
AFAICT, when host fairness is disabled, the decrement and wrap-around
happens as soon as a hash collision occurs (which is not that common in
itself, due to the set-associative hashing). However, in most cases this
is harmless, as the value is only used when host fairness mode is
enabled. So in order to trigger an array overflow, sch_cake has to first
be configured with host fairness disabled, and while running in this
mode, a hash collision has to occur to cause the overflow. Then, the
qdisc has to be reconfigured to enable host fairness, which leads to the
array out-of-bounds because the wrapped-around value is retained and
used as an array index. It seems that syzbot managed to trigger this,
which is quite impressive in its own right.
This patch fixes the issue by introducing the same conditional check on
decrement as is used on increment.
The original bug predates the upstreaming of cake, but the commit listed
in the Fixes tag touched that code, meaning that this patch won't apply
before that.
Fixes: 
712639929912 ("sch_cake: Make the dual modes fairer")
Reported-by: syzbot+7fe7b81d602cc1e6b94d@syzkaller.appspotmail.com
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20240903160846.20909-1-toke@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tan En De [Sat, 31 Aug 2024 01:11:14 +0000 (09:11 +0800)]
 
net: stmmac: Batch set RX OWN flag and other flags
Minimize access to the RX descriptor by collecting all the flags in a
local variable and then updating the descriptor at once.
Signed-off-by: Tan En De <ende.tan@starfivetech.com>
Link: https://patch.msgid.link/20240831011114.2065912-1-ende.tan@starfivetech.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Fri, 30 Aug 2024 17:14:42 +0000 (10:14 -0700)]
 
docs: netdev: document guidance on cleanup.h
Document what was discussed multiple times on list and various
virtual / in-person conversations. guard() being okay in functions
<= 20 LoC is a bit of my own invention. If the function is trivial
it should be fine, but feel free to disagree :)
We'll obviously revisit this guidance as time passes and we and other
subsystems get more experience.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240830171443.3532077-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Thu, 5 Sep 2024 00:37:37 +0000 (17:37 -0700)]
 
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
ice: fix synchronization between .ndo_bpf() and reset
Larysa Zaremba says:
PF reset can be triggered asynchronously, by tx_timeout or by a user. With some
unfortunate timings both ice_vsi_rebuild() and .ndo_bpf will try to access and
modify XDP rings at the same time, causing system crash.
The first patch factors out rtnl-locked code from VSI rebuild code to avoid
deadlock. The following changes lock rebuild and .ndo_bpf() critical sections
with an internal mutex as well and provide complementary fixes.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: do not bring the VSI up, if it was down before the XDP setup
  ice: remove ICE_CFG_BUSY locking from AF_XDP code
  ice: check ICE_VSI_DOWN under rtnl_lock when preparing for reset
  ice: check for XDP rings instead of bpf program when unconfiguring
  ice: protect XDP configuration with a mutex
  ice: move netif_queue_set_napi to rtnl-protected sections
====================
Link: https://patch.msgid.link/20240903183034.3530411-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 5 Sep 2024 00:20:14 +0000 (17:20 -0700)]
 
Merge tag 'wireless-next-2024-09-04' of git://git./linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
pull-request: wireless-next-2024-09-04
here's a pull request to net-next tree, more info below. Please let me know if
there are any problems.
====================
Conflicts:
drivers/net/wireless/ath/ath12k/hw.c
  
38055789d151 ("wifi: ath12k: use 128 bytes aligned iova in transmit path for WCN7850")
  
8be12629b428 ("wifi: ath12k: restore ASPM for supported hardwares only")
https://lore.kernel.org/87msldyj97.fsf@kernel.org
Link: https://patch.msgid.link/20240904153205.64C11C4CEC2@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 5 Sep 2024 00:14:11 +0000 (17:14 -0700)]
 
Merge tag 'wireless-2024-09-04' of git://git./linux/kernel/git/wireless/wireless
Kalle Valo says:
====================
wireless fixes for v6.11
Hopefully final fixes for v6.11 and this time only fixes to ath11k
driver. We need to revert hibernation support due to reported
regressions and we have a fix for kernel crash introduced in
v6.11-rc1.
* tag 'wireless-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  MAINTAINERS: wifi: cw1200: add net-cw1200.h
  Revert "wifi: ath11k: support hibernation"
  Revert "wifi: ath11k: restore country code during resume"
  wifi: ath11k: fix NULL pointer dereference in ath11k_mac_get_eirp_power()
====================
Link: https://patch.msgid.link/20240904135906.5986EC4CECA@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sean Anderson [Tue, 3 Sep 2024 18:49:12 +0000 (14:49 -0400)]
 
net: cadence: macb: Enable software IRQ coalescing by default
This NIC doesn't have hardware IRQ coalescing. Under high load,
interrupts can adversely affect performance. To mitigate this, enable
software IRQ coalescing by default. On my system this increases receive
throughput with iperf3 from 853 MBit/sec to 934 MBit/s, decreases
interrupts from 69489/sec to 2016/sec, and decreases CPU utilization
from 27% (4x Cortex-A53) to 14%. Latency is not affected (as far as I
can tell).
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240903184912.4151926-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sean Anderson [Tue, 3 Sep 2024 17:51:41 +0000 (13:51 -0400)]
 
net: xilinx: axienet: Fix race in axienet_stop
axienet_dma_err_handler can race with axienet_stop in the following
manner:
CPU 1                       CPU 2
======================      ==================
axienet_stop()
    napi_disable()
    axienet_dma_stop()
                            axienet_dma_err_handler()
                                napi_disable()
                                axienet_dma_stop()
                                axienet_dma_start()
                                napi_enable()
    cancel_work_sync()
    free_irq()
Fix this by setting a flag in axienet_stop telling
axienet_dma_err_handler not to bother doing anything. I chose not to use
disable_work_sync to allow for easier backporting.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Fixes: 
8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Link: https://patch.msgid.link/20240903175141.4132898-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Niklas Söderlund [Tue, 3 Sep 2024 17:15:36 +0000 (19:15 +0200)]
 
net: phy: Check for read errors in SIOCGMIIREG
When reading registers from the PHY using the SIOCGMIIREG IOCTL any
errors returned from either mdiobus_read() or mdiobus_c45_read() are
ignored, and parts of the returned error is passed as the register value
back to user-space.
For example, if mdiobus_c45_read() is used with a bus that do not
implement the read_c45() callback -EOPNOTSUPP is returned. This is
however directly stored in mii_data->val_out and returned as the
registers content. As val_out is a u16 the error code is truncated and
returned as a plausible register value.
Fix this by first checking the return value for errors before returning
it as the register content.
Before this patch,
    # phytool read eth0/0:1/0
    0xffa1
After this change,
    $ phytool read eth0/0:1/0
    error: phy_read (-95)
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20240903171536.628930-1-niklas.soderlund+renesas@ragnatech.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>