Jakub Kicinski [Fri, 6 Sep 2024 16:10:59 +0000 (09:10 -0700)]
 
net: remove dev_pick_tx_cpu_id()
dev_pick_tx_cpu_id() has been introduced with two users by
commit 
a4ea8a3dacc3 ("net: Add generic ndo_select_queue functions").
The use in AF_PACKET has been removed in 2019 by
commit 
b71b5837f871 ("packet: rework packet_pick_tx_queue() to use common code selection")
The other user was a Netlogic XLP driver, removed in 2021 by
commit 
47ac6f567c28 ("staging: Remove Netlogic XLP network driver").
It's relatively unlikely that any modern driver will need an
.ndo_select_queue implementation which picks purely based on CPU ID
and skips XPS, delete dev_pick_tx_cpu_id()
Found by code inspection.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240906161059.715546-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 9 Sep 2024 23:52:06 +0000 (16:52 -0700)]
 
Merge branch 'selftests-mptcp-add-time-per-subtests-in-tap-output'
Matthieu Baerts says:
====================
selftests: mptcp: add time per subtests in TAP output
Patches here add 'time=<N>ms' in the diagnostic data of the TAP output,
e.g.
  ok 1 - pm_netlink: defaults addr list # time=9ms
This addition is useful to quickly identify which subtests are taking a
longer time than the others, or more than expected.
Note that there are no specific formats to follow to show this time
according to the TAP 13, TAP 14 and KTAP specifications, but we follow
the format being parsed by NIPA [1].
Patch 1 modifies mptcp_lib.sh to add this support to all MPTCP
selftests.
Patch 2 removes the now duplicated info in mptcp_connect.sh
Patch 3 slightly improves the precision of the first subtests in all
MPTCP subtests.
Patches 4 and 5 remove duplicated spaces in TAP output, for the TAP
parsers that cannot handle them properly.
v1: https://lore.kernel.org/
20240902-net-next-mptcp-ksft-subtest-time-v1-0-
f1ed499a11b1@kernel.org
Link: https://github.com/linux-netdev/nipa/pull/36
====================
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-0-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:11 +0000 (20:46 +0200)]
 
selftests: mptcp: connect: remove duplicated spaces in TAP output
It is nice to have a visual alignment in the test output to present the
different results, but it makes less sense in the TAP output that is
there for computers.
It sounds then better to remove the duplicated whitespaces in the TAP
output, also because it can cause some issues with TAP parsers expecting
only one space around the directive delimiter (#).
While at it, change the variable name (result_msg) to something more
explicit.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-5-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:10 +0000 (20:46 +0200)]
 
selftests: mptcp: diag: remove trailing whitespace
It doesn't need to be there, and it can cause some issues with TAP
parsers expecting only one space around the directive delimiter (#).
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-4-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:09 +0000 (20:46 +0200)]
 
selftests: mptcp: reset the last TS before the first test
Just to slightly improve the precision of the duration of the first
test.
In mptcp_join.sh, the last append_prev_results is now done as soon as
the last test is over: this will add the last result in the list, and
get a more precise time for this last test.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-3-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:08 +0000 (20:46 +0200)]
 
selftests: mptcp: connect: remote time in TAP output
It is now added by the MPTCP lib automatically, see the parent commit.
The time in the TAP output might be slightly different from the one
displayed before, but that's OK.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-2-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:07 +0000 (20:46 +0200)]
 
selftests: mptcp: lib: add time per subtests in TAP output
It adds 'time=<N>ms' in the diagnostic data of the TAP output, e.g.
  ok 1 - pm_netlink: defaults addr list # time=9ms
This addition is useful to quickly identify which subtests are taking a
longer time than the others, or more than expected.
Note that there are no specific formats to follow to show this time
according to the TAP 13 [1], TAP 14 [2] and KTAP [3] specifications.
Let's then define this one here.
Link: https://testanything.org/tap-version-13-specification.html
Link: https://testanything.org/tap-version-14-specification.html
Link: https://docs.kernel.org/dev-tools/ktap.html
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-1-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jason Xing [Thu, 5 Sep 2024 16:00:35 +0000 (00:00 +0800)]
 
selftests: return failure when timestamps can't be reported
When I was trying to modify the tx timestamping feature, I found that
running "./txtimestamp -4 -C -L 127.0.0.1" didn't reflect the error:
I succeeded to generate timestamp stored in the skb but later failed
to report it to the userspace (which means failed to put css into cmsg).
It can happen when someone writes buggy codes in __sock_recv_timestamp(),
for example.
After adding the check so that running ./txtimestamp will reflect the
result correctly like this if there is a bug in the reporting phase:
protocol:     TCP
payload:      10
server port:  9000
family:       INET
test SND
    USR: 
1725458477 s 667997 us (seq=0, len=0)
Failed to report timestamps
    USR: 
1725458477 s 718128 us (seq=0, len=0)
Failed to report timestamps
    USR: 
1725458477 s 768273 us (seq=0, len=0)
Failed to report timestamps
    USR: 
1725458477 s 818416 us (seq=0, len=0)
Failed to report timestamps
...
In the future, it will help us detect whether the new coming patch has
bugs or not.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240905160035.62407-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Mon, 9 Sep 2024 13:14:54 +0000 (14:14 +0100)]
 
Merge branch 'unmask-dscp-part-four'
Ido Schimmel says:
====================
Unmask upper DSCP bits - part 4 (last)
tl;dr - This patchset finishes to unmask the upper DSCP bits in the IPv4
flow key in preparation for allowing IPv4 FIB rules to match on DSCP. No
functional changes are expected.
The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
lookup to match against the TOS selector in FIB rules and routes.
It is currently impossible for user space to configure FIB rules that
match on the DSCP value as the upper DSCP bits are either masked in the
various call sites that initialize the IPv4 flow key or along the path
to the FIB core.
In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
need to make sure the entire DSCP value is present in the IPv4 flow key.
This patchset finishes to unmask the upper DSCP bits by adjusting all
the callers of ip_route_output_key() to properly initialize the full
DSCP value in the IPv4 flow key.
No functional changes are expected as commit 
1fa3314c14c6 ("ipv4:
Centralize TOS matching") moved the masking of the upper DSCP bits to
the core where 'flowi4_tos' is matched against the TOS selector.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:40 +0000 (19:51 +0300)]
 
sctp: Unmask upper DSCP bits in sctp_v4_get_dst()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Note that the 'tos' variable holds the full DS field.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:39 +0000 (19:51 +0300)]
 
ipv4: udp_tunnel: Unmask upper DSCP bits in udp_tunnel_dst_lookup()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Note that callers of udp_tunnel_dst_lookup() pass the entire DS field in
the 'tos' argument.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:38 +0000 (19:51 +0300)]
 
netfilter: nf_dup4: Unmask upper DSCP bits in nf_dup_ipv4_route()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:37 +0000 (19:51 +0300)]
 
netfilter: nft_flow_offload: Unmask upper DSCP bits in nft_flow_route()
Unmask the upper DSCP bits when calling nf_route() which eventually
calls ip_route_output_key() so that in the future it could perform the
FIB lookup according to the full DSCP value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:36 +0000 (19:51 +0300)]
 
ipv4: netfilter: Unmask upper DSCP bits in ip_route_me_harder()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:35 +0000 (19:51 +0300)]
 
ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_xmit()
Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.
Note that the 'tos' variable includes the full DS field. Either the one
specified as part of the tunnel parameters or the one inherited from the
inner packet.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:34 +0000 (19:51 +0300)]
 
ipv4: ip_tunnel: Unmask upper DSCP bits in ip_md_tunnel_xmit()
Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.
Note that the 'tos' variable includes the full DS field. Either the one
specified via the tunnel key or the one inherited from the inner packet.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:33 +0000 (19:51 +0300)]
 
ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_bind_dev()
Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:32 +0000 (19:51 +0300)]
 
ipv4: icmp: Unmask upper DSCP bits in icmp_reply()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:31 +0000 (19:51 +0300)]
 
bpf: lwtunnel: Unmask upper DSCP bits in bpf_lwt_xmit_reroute()
Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:30 +0000 (19:51 +0300)]
 
ipv4: ip_gre: Unmask upper DSCP bits in ipgre_open()
Unmask the upper DSCP bits when calling ip_route_output_gre() so that in
the future it could perform the FIB lookup according to the full DSCP
value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 5 Sep 2024 16:51:29 +0000 (19:51 +0300)]
 
netfilter: br_netfilter: Unmask upper DSCP bits in br_nf_pre_routing_finish()
Unmask upper DSCP bits when calling ip_route_output() so that in the
future it could perform the FIB lookup according to the full DSCP value.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zijun Hu [Wed, 4 Sep 2024 23:35:38 +0000 (07:35 +0800)]
 
net: sysfs: Fix weird usage of class's namespace relevant fields
Device class has two namespace relevant fields which are associated by
the following usage:
struct class {
	...
	const struct kobj_ns_type_operations *ns_type;
	const void *(*namespace)(const struct device *dev);
	...
}
if (dev->class && dev->class->ns_type)
	dev->class->namespace(dev);
The usage looks weird since it checks @ns_type but calls namespace()
it is found for all existing class definitions that the other filed is
also assigned once one is assigned in current kernel tree, so fix this
weird usage by checking @namespace to call namespace().
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 9 Sep 2024 09:29:05 +0000 (10:29 +0100)]
 
Merge branch 'fs_enet-cleanup'
Maxime Chevallier says:
====================
net: ethernet: fs_enet: Cleanup and phylink conversion
This is V3 of a series that cleans-up fs_enet, with the ultimate goal of
converting it to phylink (patch 8).
The main changes compared to V2 are :
 - Reviewed-by tags from Andrew were gathered
 - Patch 5 now includes the removal of now unused includes, thanks
   Andrew for spotting this
 - Patch 4 is new, it reworks the adjust_link to move the spinlock
   acquisition to a more suitable location. Although this dissapears in
   the actual phylink port, it makes the phylink conversion clearer on
   that point
 - Patch 8 includes fixes in the tx_timeout cancellation, to prevent
   taking rtnl twice when canceling a pending tx_timeout. Thanks Jakub
   for spotting this.
Link to V2: https://lore.kernel.org/netdev/
20240829161531.610874-1-maxime.chevallier@bootlin.com/
Link to V1: https://lore.kernel.org/netdev/
20240828095103.132625-1-maxime.chevallier@bootlin.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:21 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: phylink conversion
fs_enet is a quite old but still used Ethernet driver found on some NXP
devices. It has support for 10/100 Mbps ethernet, with half and full
duplex. Some variants of it can use RMII, while other integrations are
MII-only.
Add phylink support, thus removing custom fixed-link hanldling.
This also allows removing some internal flags such as the use_rmii flag.
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:20 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: simplify clock handling with devm accessors
devm_clock_get_enabled() can be used to simplify clock handling for the
PER register clock.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:19 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: use macros for speed and duplex values
The PHY speed and duplex should be manipulated using the SPEED_XXX and
DUPLEX_XXX macros available. Use it in the fcc, fec and scc MAC for
fs_enet.
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:18 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: drop unused phy_info and mii_if_info
There's no user of the struct phy_info, the 'phy' field and the
mii_if_info in the fs_enet driver, probably dating back when phylib
wasn't as widely used.  Drop these from the driver code.
As the definition for struct mii_if_info is no longer required, drop the
include for linux/mii.h altogether in the driver.
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:17 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: only protect the .restart() call in .adjust_link
When .adjust_link() gets called, it runs in thread context, with the
phydev->lock held. We only need to protect the fep->fecp/fccp/sccp
register that are accessed within the .restart() function from
concurrent access from the interrupts.
These registers are being protected by the fep->lock spinlock, so we can
move the spinlock protection around the .restart() call instead of the
entire adjust_link() call. By doing so, we can simplify further the
.adjust_link() callback and avoid the intermediate helper.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:16 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: drop the .adjust_link custom fs_ops
There's no in-tree user for the fs_ops .adjust_link() function, so we
can always use the generic one in fe_enet-main.
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:15 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: cosmetic cleanups
Due to the age of the driver and the slow recent activity on it, the code
has taken some layers of dust. Clean the main driver file up so that it
passes checkpatch and also conforms with the net coding style.
Changes include :
 - Re-ordering of the variable declarations for RCT
 - Fixing the comment styles to either one-line comments, or net-style
   comments
 - Adding braces around single-statement 'else' clauses
 - Aligning function/macro parameters on the opening parenthesis
 - Simplifying checks for NULL pointers
 - Splitting cascaded assignments into individual assignments
 - Fixing some typos
 - Fixing whitespace issues
This is a cosmetic change and doesn't introduce any change in behaviour.
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Wed, 4 Sep 2024 17:18:14 +0000 (19:18 +0200)]
 
net: ethernet: fs_enet: convert to SPDX
The ENET driver has SPDX tags in the header files, but they were missing
in the C files. Change the licence information to SPDX format.
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Wed, 4 Sep 2024 14:13:05 +0000 (07:13 -0700)]
 
ptp/ioctl: support MONOTONIC{,_RAW} timestamps for PTP_SYS_OFFSET_EXTENDED
The ability to read the PHC (Physical Hardware Clock) alongside
multiple system clocks is currently dependent on the specific
hardware architecture. This limitation restricts the use of
PTP_SYS_OFFSET_PRECISE to certain hardware configurations.
The generic soultion which would work across all architectures
is to read the PHC along with the latency to perform PHC-read as
offered by PTP_SYS_OFFSET_EXTENDED which provides pre and post
timestamps.  However, these timestamps are currently limited
to the CLOCK_REALTIME timebase. Since CLOCK_REALTIME is affected
by NTP (or similar time synchronization services), it can
experience significant jumps forward or backward. This hinders
the precise latency measurements that PTP_SYS_OFFSET_EXTENDED
is designed to provide.
This problem could be addressed by supporting MONOTONIC_RAW
timestamps within PTP_SYS_OFFSET_EXTENDED. Unlike CLOCK_REALTIME
or CLOCK_MONOTONIC, the MONOTONIC_RAW timebase is unaffected
by NTP adjustments.
This enhancement can be implemented by utilizing one of the three
reserved words within the PTP_SYS_OFFSET_EXTENDED struct to pass
the clock-id for timestamps.  The current behavior aligns with
clock-id for CLOCK_REALTIME timebase (value of 0), ensuring
backward compatibility of the UAPI.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Antipov [Wed, 4 Sep 2024 11:54:01 +0000 (14:54 +0300)]
 
net: sched: consistently use rcu_replace_pointer() in taprio_change()
According to Vinicius (and carefully looking through the whole
https://syzkaller.appspot.com/bug?extid=
b65e0af58423fc8a73aa
once again), txtime branch of 'taprio_change()' is not going to
race against 'advance_sched()'. But using 'rcu_replace_pointer()'
in the former may be a good idea as well.
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 7 Sep 2024 01:39:31 +0000 (18:39 -0700)]
 
Merge tag 'nf-next-24-09-06' of git://git./linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
Patch #1 adds ctnetlink support for kernel side filtering for
	 deletions, from Changliang Wu.
Patch #2 updates nft_counter support to Use u64_stats_t,
	 from Sebastian Andrzej Siewior.
Patch #3 uses kmemdup_array() in all xtables frontends,
	 from Yan Zhen.
Patch #4 is a oneliner to use ERR_CAST() in nf_conntrack instead
	 opencoded casting, from Shen Lichuan.
Patch #5 removes unused argument in nftables .validate interface,
	 from Florian Westphal.
Patch #6 is a oneliner to correct a typo in nftables kdoc,
	 from Simon Horman.
Patch #7 fixes missing kdoc in nftables, also from Simon.
Patch #8 updates nftables to handle timeout less than CONFIG_HZ.
Patch #9 rejects element expiration if timeout is zero,
	 otherwise it is silently ignored.
Patch #10 disallows element expiration larger than timeout.
Patch #11 removes unnecessary READ_ONCE annotation while mutex is held.
Patch #12 adds missing READ_ONCE/WRITE_ONCE annotation in dynset.
Patch #13 annotates data-races around element expiration.
Patch #14 allocates timeout and expiration in one single set element
	  extension, they are tighly couple, no reason to keep them
	  separated anymore.
Patch #15 updates nftables to interpret zero timeout element as never
	  times out. Note that it is already possible to declare sets
	  with elements that never time out but this generalizes to all
	  kind of set with timeouts.
Patch #16 supports for element timeout and expiration updates.
* tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: set element timeout update support
  netfilter: nf_tables: zero timeout means element never times out
  netfilter: nf_tables: consolidate timeout extension for elements
  netfilter: nf_tables: annotate data-races around element expiration
  netfilter: nft_dynset: annotate data-races around set timeout
  netfilter: nf_tables: remove annotation to access set timeout while holding lock
  netfilter: nf_tables: reject expiration higher than timeout
  netfilter: nf_tables: reject element expiration with no timeout
  netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
  netfilter: nf_tables: Add missing Kernel doc
  netfilter: nf_tables: Correct spelling in nf_tables.h
  netfilter: nf_tables: drop unused 3rd argument from validate callback ops
  netfilter: conntrack: Convert to use ERR_CAST()
  netfilter: Use kmemdup_array instead of kmemdup for multiple allocation
  netfilter: nft_counter: Use u64_stats_t for statistic.
  netfilter: ctnetlink: support CTA_FILTER for flush
====================
Link: https://patch.msgid.link/20240905232920.5481-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Fedorenko [Thu, 5 Sep 2024 14:00:28 +0000 (14:00 +0000)]
 
ptp: ocp: Improve PCIe delay estimation
The PCIe bus can be pretty busy during boot and probe function can
see excessive delays. Let's find the minimal value out of several
tests and use it as estimated value.
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20240905140028.560454-1-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 5 Sep 2024 08:49:09 +0000 (08:49 +0000)]
 
netpoll: remove netpoll_srcu
netpoll_srcu is currently used from netpoll_poll_disable() and
__netpoll_cleanup()
Both functions run under RTNL, using netpoll_srcu adds confusion
and no additional protection.
Moreover the synchronize_srcu() call in __netpoll_cleanup() is
performed before clearing np->dev->npinfo, which violates RCU rules.
After this patch, netpoll_poll_disable() and netpoll_poll_enable()
simply use rtnl_dereference().
This saves a big chunk of memory (more than 192KB on platforms
with 512 cpus)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240905084909.2082486-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 7 Sep 2024 01:23:51 +0000 (18:23 -0700)]
 
Merge branch 'octeontx2-address-some-warnings'
Simon Horman says:
====================
octeontx2: Address some warnings
This patchset addresses some warnings flagged by Sparse, gcc-14, and
clang-18 in files touched by recent patch submissions.
Although these changes do not alter the functionality of the code, by
addressing them real problems introduced in future which are flagged by
Sparse will stand out more readily.
v1: https://lore.kernel.org/
20240903-octeontx2-sparse-v1-0-
f190309ecb0a@kernel.org
====================
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-0-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 4 Sep 2024 18:29:37 +0000 (19:29 +0100)]
 
octeontx2-pf: Make iplen __be16 in otx2_sqe_add_ext()
In otx2_sqe_add_ext() iplen is used to hold a 16-bit big-endian value,
but it's type is u16, indicating a host byte order integer.
Address this mismatch by changing the type of iplen to __be16.
Flagged by Sparse as:
.../otx2_txrx.c:699:31: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:699:31:    expected unsigned short [usertype] iplen
.../otx2_txrx.c:699:31:    got restricted __be16 [usertype]
.../otx2_txrx.c:701:54: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:701:54:    expected restricted __be16 [usertype] tot_len
.../otx2_txrx.c:701:54:    got unsigned short [usertype] iplen
.../otx2_txrx.c:704:60: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:704:60:    expected restricted __be16 [usertype] payload_len
.../otx2_txrx.c:704:60:    got unsigned short [usertype] iplen
Introduced in
commit 
dc1a9bf2c816 ("octeontx2-pf: Add UDP segmentation offload support")
No functional change intended.
Compile tested only by author.
Tested-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-2-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Wed, 4 Sep 2024 18:29:36 +0000 (19:29 +0100)]
 
octeontx2-af: Pass string literal as format argument of alloc_workqueue()
Recently I noticed that both gcc-14 and clang-18 report that passing
a non-string literal as the format argument of alloc_workqueue()
is potentially insecure.
E.g. clang-18 says:
.../rvu.c:2493:32: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
 2493 |         mw->mbox_wq = alloc_workqueue(name,
      |                                       ^~~~
.../rvu.c:2493:32: note: treat the string as an argument to avoid this
 2493 |         mw->mbox_wq = alloc_workqueue(name,
      |                                       ^
      |                                       "%s",
It is always the case where the contents of name is safe to pass as the
format argument. That is, in my understanding, it never contains any
format escape sequences.
But, it seems better to be safe than sorry. And, as a bonus, compiler
output becomes less verbose by addressing this issue as suggested by
clang-18.
Compile tested only by author.
Tested-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-1-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rosen Penev [Wed, 4 Sep 2024 20:56:59 +0000 (13:56 -0700)]
 
net: phy: qca83xx: use PHY_ID_MATCH_EXACT
No need for the mask when there's already a macro for this.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240904205659.7470-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Edward Cree [Wed, 4 Sep 2024 18:11:56 +0000 (19:11 +0100)]
 
sfc: siena: rip out rss-context dead code
Siena hardware does not support custom RSS contexts, but when the
 driver was forked from sfc.ko, some of the plumbing for them was
 copied across from the common code.  Actually trying to use them
 would lead to EOPNOTSUPP as the relevant efx_nic_type methods were
 not populated.
Remove this dead code from the Siena driver.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904181156.1993666-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 7 Sep 2024 01:21:47 +0000 (18:21 -0700)]
 
Merge branch 'use-functionality-of-irq_get_trigger_type'
Vasileios Amoiridis says:
====================
Use functionality of irq_get_trigger_type()
v1: https://lore.kernel.org/
20240902225534.130383-1-vassilisamir@gmail.com
====================
Link: https://patch.msgid.link/20240904151018.71967-1-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:18 +0000 (17:10 +0200)]
 
net: smc91x: Make use of irq_get_trigger_type()
Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Alvin Å ipraga <alsi@bang-olufsen.dk>
Link: https://patch.msgid.link/20240904151018.71967-4-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:17 +0000 (17:10 +0200)]
 
net: dsa: realtek: rtl8366rb: Make use of irq_get_trigger_type()
Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).
Reviewed-by: Alvin Å ipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Link: https://patch.msgid.link/20240904151018.71967-3-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:16 +0000 (17:10 +0200)]
 
net: dsa: realtek: rtl8365mb: Make use of irq_get_trigger_type()
Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Alvin Å ipraga <alsi@bang-olufsen.dk>
Link: https://patch.msgid.link/20240904151018.71967-2-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sascha Hauer [Wed, 4 Sep 2024 12:17:41 +0000 (14:17 +0200)]
 
net: tls: wait for async completion on last message
When asynchronous encryption is used KTLS sends out the final data at
proto->close time. This becomes problematic when the task calling
close() receives a signal. In this case it can happen that
tcp_sendmsg_locked() called at close time returns -ERESTARTSYS and the
final data is not sent.
The described situation happens when KTLS is used in conjunction with
io_uring, as io_uring uses task_work_add() to add work to the current
userspace task. A discussion of the problem along with a reproducer can
be found in [1] and [2]
Fix this by waiting for the asynchronous encryption to be completed on
the final message. With this there is no data left to be sent at close
time.
[1] https://lore.kernel.org/all/
20231010141932.GD3114228@pengutronix.de/
[2] https://lore.kernel.org/all/
20240315100159.
3898944-1-s.hauer@pengutronix.de/
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://patch.msgid.link/20240904-ktls-wait-async-v1-1-a62892833110@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 7 Sep 2024 01:10:26 +0000 (18:10 -0700)]
 
Merge branch 'make-use-of-the-helper-macro-list_head'
Hongbo Li says:
====================
make use of the helper macro LIST_HEAD()
The macro LIST_HEAD() declares a list variable and
initializes it, which can be used to simplify the steps
of list initialization, thereby simplifying the code.
These serials just do some equivalatent substitutions,
and with no functional modifications.
====================
Link: https://patch.msgid.link/20240904093243.3345012-1-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:43 +0000 (17:32 +0800)]
 
net/core: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-6-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:42 +0000 (17:32 +0800)]
 
net/ipv6: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-5-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:41 +0000 (17:32 +0800)]
 
net/netfilter: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240904093243.3345012-4-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:40 +0000 (17:32 +0800)]
 
net/tipc: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-3-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 4 Sep 2024 09:32:39 +0000 (17:32 +0800)]
 
net/ipv4: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-2-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:49:51 +0000 (16:49 +0800)]
 
sfc: convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240904084951.1353518-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:40:34 +0000 (16:40 +0800)]
 
sfc/siena: Convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240904084034.1353404-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:17:28 +0000 (16:17 +0800)]
 
ionic: Convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240904081728.1353260-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Chen Ni [Wed, 4 Sep 2024 08:08:45 +0000 (16:08 +0800)]
 
net: atlantic: convert comma to semicolon
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904080845.1353144-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Fri, 6 Sep 2024 08:34:19 +0000 (09:34 +0100)]
 
Merge branch 'rx-sw-tstamp-for-all'
Gal Pressman says:
====================
RX software timestamp for all - round 2
Round 1 of drivers conversion was merged [1], this is round 2, more
drivers to follow.
[1] https://lore.kernel.org/netdev/
20240901112803.212753-1-gal@nvidia.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:22 +0000 (10:49 +0300)]
 
bnx2x: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:21 +0000 (10:49 +0300)]
 
cxgb4: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:20 +0000 (10:49 +0300)]
 
ixgbe: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:19 +0000 (10:49 +0300)]
 
igc: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:18 +0000 (10:49 +0300)]
 
igb: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:17 +0000 (10:49 +0300)]
 
ice: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:16 +0000 (10:49 +0300)]
 
i40e: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:15 +0000 (10:49 +0300)]
 
net: netcp: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:14 +0000 (10:49 +0300)]
 
net: ti: icssg-prueth: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:13 +0000 (10:49 +0300)]
 
net: ethernet: ti: cpsw_ethtool: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:12 +0000 (10:49 +0300)]
 
net: ethernet: ti: am65-cpsw-ethtool: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:11 +0000 (10:49 +0300)]
 
mlxsw: spectrum: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:10 +0000 (10:49 +0300)]
 
net: sparx5: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:09 +0000 (10:49 +0300)]
 
net: lan966x: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gal Pressman [Wed, 4 Sep 2024 07:49:08 +0000 (10:49 +0300)]
 
lan743x: Remove setting of RX software timestamp
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 6 Sep 2024 07:41:36 +0000 (08:41 +0100)]
 
Merge branch 'microchip=ksz8-cleanup'
Pieter Van Trappen says:
====================
net: dsa: microchip: rename and clean ksz8 series files
The first KSZ8 series implementation was done for a KSZ8795 device but
since several other KSZ8 devices have been added. Rename these files
to adhere to the ksz8 naming convention as already used in most
functions and the existing ksz8.h; add an explanatory note.
In addition, clean the files by removing macros that are defined at
more than one place and remove confusion by renaming the KSZ8830
string which in fact is not an existing KSZ series switch.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
---
v4:
 - correct once more Kconfig list of supported switches
v3: https://lore.kernel.org/netdev/
20240903072946.344507-1-vtpieter@gmail.com/
 - rename all KSZ8830 to KSZ88X3 only (not KSZ8863)
 - update Kconfig as per Arun's suggestion
v2: https://lore.kernel.org/netdev/
20240830141250.30425-1-vtpieter@gmail.com/
 - more finegrained description in Kconfig and ksz8.c header
 - add KSZ8830/ksz8830 to KSZ8863/ksz88x3 renaming
v1: https://lore.kernel.org/netdev/
20240828102801.227588-1-vtpieter@gmail.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:42 +0000 (08:27 +0200)]
 
net: dsa: microchip: replace unclear KSZ8830 strings
Replace ksz8830 with ksz88x3 for CHIP_ID definition and other
strings. This due to KSZ8830 not being an actual switch but the Chip
ID shared among KSZ8863/8873 switches, impossible to differentiate
from their Chip ID or Revision ID registers.
Now all KSZ*_CHIP_ID macros refer to actual, existing switches which
removes confusion.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:41 +0000 (08:27 +0200)]
 
net: dsa: microchip: clean up ksz8_reg definition macros
Remove macros that are already defined at more appropriate places.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:40 +0000 (08:27 +0200)]
 
net: dsa: microchip: rename ksz8 series files
The first KSZ8 series implementation was done for a KSZ8795 device but
since several other KSZ8 devices have been added. Rename these files
to adhere to the ksz8 naming convention as already used in most
functions and the existing ksz8.h; add an explanatory note.
Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 6 Sep 2024 05:02:41 +0000 (22:02 -0700)]
 
Merge branch 'add-realtek-automotive-pcie-driver'
Justin Lai says:
====================
Add Realtek automotive PCIe driver
This series includes adding realtek automotive ethernet driver
and adding rtase ethernet driver entry in MAINTAINERS file.
This ethernet device driver for the PCIe interface of
Realtek Automotive Ethernet Switch,applicable to
RTL9054, RTL9068, RTL9072, RTL9075, RTL9068, RTL9071.
====================
Link: https://patch.msgid.link/20240904032114.247117-1-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:14 +0000 (11:21 +0800)]
 
MAINTAINERS: Add the rtase ethernet driver entry
Add myself and Larry Chiu as the maintainer for the rtase ethernet driver.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-14-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:13 +0000 (11:21 +0800)]
 
realtek: Update the Makefile and Kconfig in the realtek folder
1. Add the RTASE entry in the Kconfig.
2. Add the CONFIG_RTASE entry in the Makefile.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-13-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:12 +0000 (11:21 +0800)]
 
rtase: Add a Makefile in the rtase folder
Add a Makefile in the rtase folder to build rtase driver.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-12-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:11 +0000 (11:21 +0800)]
 
rtase: Implement ethtool function
Implement the ethtool function to support users to obtain network card
information, including obtaining various device settings, Report whether
physical link is up, Report pause parameters, Set pause parameters,
Return extended statistics about the device.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-11-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:10 +0000 (11:21 +0800)]
 
rtase: Implement pci_driver suspend and resume function
Implement the pci_driver suspend function to enable the device
to sleep, and implement the resume function to enable the device
to resume operation.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-10-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:09 +0000 (11:21 +0800)]
 
rtase: Implement net_device_ops
1. Implement .ndo_set_rx_mode so that the device can change address
list filtering.
2. Implement .ndo_set_mac_address so that mac address can be changed.
3. Implement .ndo_change_mtu so that mtu can be changed.
4. Implement .ndo_tx_timeout to perform related processing when the
transmitter does not make any progress.
5. Implement .ndo_get_stats64 to provide statistics that are called
when the user wants to get network device usage.
6. Implement .ndo_vlan_rx_add_vid to register VLAN ID when the device
supports VLAN filtering.
7. Implement .ndo_vlan_rx_kill_vid to unregister VLAN ID when the device
supports VLAN filtering.
8. Implement the .ndo_setup_tc to enable setting any "tc" scheduler,
classifier or action on dev.
9. Implement .ndo_fix_features enables adjusting requested feature flags
based on device-specific constraints.
10. Implement .ndo_set_features enables updating device configuration to
new features.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-9-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:08 +0000 (11:21 +0800)]
 
rtase: Implement a function to receive packets
Implement rx_handler to read the information of the rx descriptor,
thereby checking the packet accordingly and storing the packet
in the socket buffer to complete the reception of the packet.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-8-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:07 +0000 (11:21 +0800)]
 
rtase: Implement .ndo_start_xmit function
Implement .ndo_start_xmit function to fill the information of the packet
to be transmitted into the tx descriptor, and then the hardware will
transmit the packet using the information in the tx descriptor.
In addition, we also implemented the tx_handler function to enable the
tx descriptor to be reused.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-7-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:06 +0000 (11:21 +0800)]
 
rtase: Implement hardware configuration function
Implement rtase_hw_config to set default hardware settings, including
setting interrupt mitigation, tx/rx DMA burst, interframe gap time,
rx packet filter, near fifo threshold and fill descriptor ring and
tally counter address, and enable flow control. When filling the
rx descriptor ring, the first group of queues needs to be processed
separately because the positions of the first group of queues are not
regular with other subsequent groups. The other queues are all newly
added features, but we want to retain the original design. So they were
not put together.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-6-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:05 +0000 (11:21 +0800)]
 
rtase: Implement the interrupt routine and rtase_poll
1. Implement rtase_interrupt to handle txQ0/rxQ0, txQ4~txQ7 interrupts,
and implement rtase_q_interrupt to handle txQ1/rxQ1, txQ2/rxQ2 and
txQ3/rxQ3 interrupts.
2. Implement rtase_poll to call ring_handler to process the tx or
rx packet of each ring. If the returned value is budget,it means that
there is still work of a certain ring that has not yet been completed.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-5-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:04 +0000 (11:21 +0800)]
 
rtase: Implement the rtase_down function
Implement the rtase_down function to disable hardware setting
and interrupt and clear descriptor ring.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-4-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:03 +0000 (11:21 +0800)]
 
rtase: Implement the .ndo_open function
Implement the .ndo_open function to set default hardware settings
and initialize the descriptor ring and interrupts. Among them,
when requesting interrupt, because the first group of interrupts
needs to process more events, the overall structure and interrupt
handler will be different from other groups of interrupts, so it
needs to be handled separately. The first set of interrupt handlers
need to handle the interrupt status of RXQ0 and TXQ0, TXQ4~7,
while other groups of interrupt handlers will handle the interrupt
status of RXQ1&TXQ1 or RXQ2&TXQ2 or RXQ3&TXQ3 according to the
interrupt vector.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-3-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Justin Lai [Wed, 4 Sep 2024 03:21:02 +0000 (11:21 +0800)]
 
rtase: Add support for a pci table in this module
Add support for a pci table in this module, and implement pci_driver
function to initialize this driver, remove this driver, or shutdown
this driver.
Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-2-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 6 Sep 2024 03:27:09 +0000 (20:27 -0700)]
 
Merge git://git./linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/phy/phy_device.c
  
2560db6ede1a ("net: phy: Fix missing of_node_put() for leds")
  
1dce520abd46 ("net: phy: Use for_each_available_child_of_node_scoped()")
https://lore.kernel.org/
20240904115823.
74333648@canb.auug.org.au
Adjacent changes:
drivers/net/ethernet/xilinx/xilinx_axienet.h
drivers/net/ethernet/xilinx/xilinx_axienet_main.c
  
858430db28a5 ("net: xilinx: axienet: Fix race in axienet_stop")
  
76abb5d675c4 ("net: xilinx: axienet: Add statistics support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Wed, 4 Sep 2024 09:10:24 +0000 (10:10 +0100)]
 
netlink: specs: nftables: allow decode of tailscale ruleset
Fill another small gap in the nftables spec so that it is possible to
dump a tailscale ruleset with:
  tools/net/ynl/cli.py --spec \
     Documentation/netlink/specs/nftables.yaml --dump getrule
This adds support for the 'target' expression.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20240904091024.3138-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joe Damato [Wed, 4 Sep 2024 15:34:30 +0000 (15:34 +0000)]
 
net: napi: Prevent overflow of napi_defer_hard_irqs
In commit 
6f8b12d661d0 ("net: napi: add hard irqs deferral feature")
napi_defer_irqs was added to net_device and napi_defer_irqs_count was
added to napi_struct, both as type int.
This value never goes below zero, so there is not reason for it to be a
signed int. Change the type for both from int to u32, and add an
overflow check to sysfs to limit the value to S32_MAX.
The limit of S32_MAX was chosen because the practical limit before this
patch was S32_MAX (anything larger was an overflow) and thus there are
no behavioral changes introduced. If the extra bit is needed in the
future, the limit can be raised.
Before this patch:
$ sudo bash -c 'echo 
2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
$ cat /sys/class/net/eth4/napi_defer_hard_irqs
-
2147483647
After this patch:
$ sudo bash -c 'echo 
2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
bash: line 0: echo: write error: Numerical result out of range
Similarly, /sys/class/net/XXXXX/tx_queue_len is defined as unsigned:
include/linux/netdevice.h:      unsigned int            tx_queue_len;
And has an overflow check:
dev_change_tx_queue_len(..., unsigned long new_len):
  if (new_len != (unsigned int)new_len)
          return -ERANGE;
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240904153431.307932-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Fri, 6 Sep 2024 00:08:01 +0000 (17:08 -0700)]
 
Merge tag 'net-6.11-rc7' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from can, bluetooth and wireless.
  No known regressions at this point. Another calm week, but chances are
  that has more to do with vacation season than the quality of our work.
  Current release - new code bugs:
   - smc: prevent NULL pointer dereference in txopt_get
   - eth: ti: am65-cpsw: number of XDP-related fixes
  Previous releases - regressions:
   - Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over
     BREDR/LE", it breaks existing user space
   - Bluetooth: qca: if memdump doesn't work, re-enable IBS to avoid
     later problems with suspend
   - can: mcp251x: fix deadlock if an interrupt occurs during
     mcp251x_open
   - eth: r8152: fix the firmware communication error due to use of bulk
     write
   - ptp: ocp: fix serial port information export
   - eth: igb: fix not clearing TimeSync interrupts for 82580
   - Revert "wifi: ath11k: support hibernation", fix suspend on Lenovo
  Previous releases - always broken:
   - eth: intel: fix crashes and bugs when reconfiguration and resets
     happening in parallel
   - wifi: ath11k: fix NULL dereference in ath11k_mac_get_eirp_power()
  Misc:
   - docs: netdev: document guidance on cleanup.h"
* tag 'net-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
  ila: call nf_unregister_net_hooks() sooner
  tools/net/ynl: fix cli.py --subscribe feature
  MAINTAINERS: fix ptp ocp driver maintainers address
  selftests: net: enable bind tests
  net: dsa: vsc73xx: fix possible subblocks range of CAPT block
  sched: sch_cake: fix bulk flow accounting logic for host fairness
  docs: netdev: document guidance on cleanup.h
  net: xilinx: axienet: Fix race in axienet_stop
  net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN
  r8152: fix the firmware doesn't work
  fou: Fix null-ptr-deref in GRO.
  bareudp: Fix device stats updates.
  net: mana: Fix error handling in mana_create_txq/rxq's NAPI cleanup
  bpf, net: Fix a potential race in do_sock_getsockopt()
  net: dqs: Do not use extern for unused dql_group
  sch/netem: fix use after free in netem_dequeue
  usbnet: modern method to get random MAC
  MAINTAINERS: wifi: cw1200: add net-cw1200.h
  ice: do not bring the VSI up, if it was down before the XDP setup
  ice: remove ICE_CFG_BUSY locking from AF_XDP code
  ...
Linus Torvalds [Thu, 5 Sep 2024 23:49:10 +0000 (16:49 -0700)]
 
Merge tag 'spi-fix-v6.11-rc6' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
 "A few small driver specific fixes (including some of the widespread
  work on fixing missing ID tables for module autoloading and the revert
  of some problematic PM work in spi-rockchip), some improvements to the
  MAINTAINERS information for the NXP drivers and the addition of a new
  device ID to spidev"
* tag 'spi-fix-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  MAINTAINERS: SPI: Add mailing list imx@lists.linux.dev for nxp spi drivers
  MAINTAINERS: SPI: Add freescale lpspi maintainer information
  spi: spi-fsl-lpspi: Fix off-by-one in prescale max
  spi: spidev: Add missing spi_device_id for jg10309-01
  spi: bcm63xx: Enable module autoloading
  spi: intel: Add check devm_kasprintf() returned value
  spi: spidev: Add an entry for elgin,jg10309-01
  spi: rockchip: Resolve unbalanced runtime PM / system PM handling
Linus Torvalds [Thu, 5 Sep 2024 23:41:16 +0000 (16:41 -0700)]
 
Merge tag 'regulator-fix-v6.11-stub' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
 "A fix from Doug Anderson for a missing stub, required to fix the build
  for some newly added users of devm_regulator_bulk_get_const() in
  !REGULATOR configurations"
* tag 'regulator-fix-v6.11-stub' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: core: Stub devm_regulator_bulk_get_const() if !CONFIG_REGULATOR
Linus Torvalds [Thu, 5 Sep 2024 23:35:57 +0000 (16:35 -0700)]
 
Merge tag 'rust-fixes-6.11-2' of https://github.com/Rust-for-Linux/linux
Pull Rust fixes from Miguel Ojeda:
 "Toolchain and infrastructure:
   - Fix builds for nightly compiler users now that 'new_uninit' was
     split into new features by using an alternative approach for the
     code that used what is now called the 'box_uninit_write' feature
   - Allow the 'stable_features' lint to preempt upcoming warnings about
     them, since soon there will be unstable features that will become
     stable in nightly compilers
   - Export bss symbols too
  'kernel' crate:
   - 'block' module: fix wrong usage of lockdep API
  'macros' crate:
   - Provide correct provenance when constructing 'THIS_MODULE'
  Documentation:
   - Remove unintended indentation (blockquotes) in generated output
   - Fix a couple typos
  MAINTAINERS:
   - Remove Wedson as Rust maintainer
   - Update Andreas' email"
* tag 'rust-fixes-6.11-2' of https://github.com/Rust-for-Linux/linux:
  MAINTAINERS: update Andreas Hindborg's email address
  MAINTAINERS: Remove Wedson as Rust maintainer
  rust: macros: provide correct provenance when constructing THIS_MODULE
  rust: allow `stable_features` lint
  docs: rust: remove unintended blockquote in Quick Start
  rust: alloc: eschew `Box<MaybeUninit<T>>::write`
  rust: kernel: fix typos in code comments
  docs: rust: remove unintended blockquote in Coding Guidelines
  rust: block: fix wrong usage of lockdep API
  rust: kbuild: fix export of bss symbols
Linus Torvalds [Thu, 5 Sep 2024 23:29:41 +0000 (16:29 -0700)]
 
Merge tag 'trace-v6.11-rc4' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
 - Fix adding a new fgraph callback after function graph tracing has
   already started.
   If the new caller does not initialize its hash before registering the
   fgraph_ops, it can cause a NULL pointer dereference. Fix this by
   adding a new parameter to ftrace_graph_enable_direct() passing in the
   newly added gops directly and not rely on using the fgraph_array[],
   as entries in the fgraph_array[] must be initialized.
   Assign the new gops to the fgraph_array[] after it goes through
   ftrace_startup_subops() as that will properly initialize the
   gops->ops and initialize its hashes.
 - Fix a memory leak in fgraph storage memory test.
   If the "multiple fgraph storage on a function" boot up selftest fails
   in the registering of the function graph tracer, it will not free the
   memory it allocated for the filter. Break the loop up into two where
   it allocates the filters first and then registers the functions where
   any errors will do the appropriate clean ups.
 - Only clear the timerlat timers if it has an associated kthread.
   In the rtla tool that uses timerlat, if it was killed just as it was
   shutting down, the signals can free the kthread and the timer. But
   the closing of the timerlat files could cause the hrtimer_cancel() to
   be called on the already freed timer. As the kthread variable is is
   set to NULL when the kthreads are stopped and the timers are freed it
   can be used to know not to call hrtimer_cancel() on the timer if the
   kthread variable is NULL.
 - Use a cpumask to keep track of osnoise/timerlat kthreads
   The timerlat tracer can use user space threads for its analysis. With
   the killing of the rtla tool, the kernel can get confused between if
   it is using a user space thread to analyze or one of its own kernel
   threads. When this confusion happens, kthread_stop() can be called on
   a user space thread and bad things happen. As the kernel threads are
   per-cpu, a bitmask can be used to know when a kernel thread is used
   or when a user space thread is used.
 - Add missing interface_lock to osnoise/timerlat stop_kthread()
   The stop_kthread() function in osnoise/timerlat clears the osnoise
   kthread variable, and if it was a user space thread does a put_task
   on it. But this can race with the closing of the timerlat files that
   also does a put_task on the kthread, and if the race happens the task
   will have put_task called on it twice and oops.
 - Add cond_resched() to the tracing_iter_reset() loop.
   The latency tracers keep writing to the ring buffer without resetting
   when it issues a new "start" event (like interrupts being disabled).
   When reading the buffer with an iterator, the tracing_iter_reset()
   sets its pointer to that start event by walking through all the
   events in the buffer until it gets to the time stamp of the start
   event. In the case of a very large buffer, the loop that looks for
   the start event has been reported taking a very long time with a non
   preempt kernel that it can trigger a soft lock up warning. Add a
   cond_resched() into that loop to make sure that doesn't happen.
 - Use list_del_rcu() for eventfs ei->list variable
   It was reported that running loops of creating and deleting kprobe
   events could cause a crash due to the eventfs list iteration hitting
   a LIST_POISON variable. This is because the list is protected by SRCU
   but when an item is deleted from the list, it was using list_del()
   which poisons the "next" pointer. This is what list_del_rcu() was to
   prevent.
* tag 'trace-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()
  tracing/timerlat: Only clear timer if a kthread exists
  tracing/osnoise: Use a cpumask to know what threads are kthreads
  eventfs: Use list_del_rcu() for SRCU protected list variable
  tracing: Avoid possible softlockup in tracing_iter_reset()
  tracing: Fix memory leak in fgraph storage selftest
  tracing: fgraph: Fix to add new fgraph_ops to array after ftrace_startup_subops()
Eric Dumazet [Wed, 4 Sep 2024 14:44:18 +0000 (14:44 +0000)]
 
ila: call nf_unregister_net_hooks() sooner
syzbot found an use-after-free Read in ila_nf_input [1]
Issue here is that ila_xlat_exit_net() frees the rhashtable,
then call nf_unregister_net_hooks().
It should be done in the reverse way, with a synchronize_rcu().
This is a good match for a pre_exit() method.
[1]
 BUG: KASAN: use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline]
 BUG: KASAN: use-after-free in __rhashtable_lookup include/linux/rhashtable.h:604 [inline]
 BUG: KASAN: use-after-free in rhashtable_lookup include/linux/rhashtable.h:646 [inline]
 BUG: KASAN: use-after-free in rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672
Read of size 4 at addr 
ffff888064620008 by task ksoftirqd/0/16
CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 
6.11.0-rc4-syzkaller-00238-g2ad6d23f465a #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:93 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
  print_address_description mm/kasan/report.c:377 [inline]
  print_report+0x169/0x550 mm/kasan/report.c:488
  kasan_report+0x143/0x180 mm/kasan/report.c:601
  rht_key_hashfn include/linux/rhashtable.h:159 [inline]
  __rhashtable_lookup include/linux/rhashtable.h:604 [inline]
  rhashtable_lookup include/linux/rhashtable.h:646 [inline]
  rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672
  ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:132 [inline]
  ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline]
  ila_nf_input+0x1fe/0x3c0 net/ipv6/ila/ila_xlat.c:190
  nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
  nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626
  nf_hook include/linux/netfilter.h:269 [inline]
  NF_HOOK+0x29e/0x450 include/linux/netfilter.h:312
  __netif_receive_skb_one_core net/core/dev.c:5661 [inline]
  __netif_receive_skb+0x1ea/0x650 net/core/dev.c:5775
  process_backlog+0x662/0x15b0 net/core/dev.c:6108
  __napi_poll+0xcb/0x490 net/core/dev.c:6772
  napi_poll net/core/dev.c:6841 [inline]
  net_rx_action+0x89b/0x1240 net/core/dev.c:6963
  handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
  run_ksoftirqd+0xca/0x130 kernel/softirq.c:928
  smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164
  kthread+0x2f0/0x390 kernel/kthread.c:389
  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 </TASK>
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:
0000000000000000 index:0x0 pfn:0x64620
flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xbfffffff(buddy)
raw: 
00fff00000000000 ffffea0000959608 ffffea00019d9408 0000000000000000
raw: 
0000000000000000 0000000000000003 00000000bfffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as freed
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO), pid 5242, tgid 5242 (syz-executor), ts 
73611328570, free_ts 
618981657187
  set_page_owner include/linux/page_owner.h:32 [inline]
  post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
  prep_new_page mm/page_alloc.c:1501 [inline]
  get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
  __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
  __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
  alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
  ___kmalloc_large_node+0x8b/0x1d0 mm/slub.c:4103
  __kmalloc_large_node_noprof+0x1a/0x80 mm/slub.c:4130
  __do_kmalloc_node mm/slub.c:4146 [inline]
  __kmalloc_node_noprof+0x2d2/0x440 mm/slub.c:4164
  __kvmalloc_node_noprof+0x72/0x190 mm/util.c:650
  bucket_table_alloc lib/rhashtable.c:186 [inline]
  rhashtable_init_noprof+0x534/0xa60 lib/rhashtable.c:1071
  ila_xlat_init_net+0xa0/0x110 net/ipv6/ila/ila_xlat.c:613
  ops_init+0x359/0x610 net/core/net_namespace.c:139
  setup_net+0x515/0xca0 net/core/net_namespace.c:343
  copy_net_ns+0x4e2/0x7b0 net/core/net_namespace.c:508
  create_new_namespaces+0x425/0x7b0 kernel/nsproxy.c:110
  unshare_nsproxy_namespaces+0x124/0x180 kernel/nsproxy.c:228
  ksys_unshare+0x619/0xc10 kernel/fork.c:3328
  __do_sys_unshare kernel/fork.c:3399 [inline]
  __se_sys_unshare kernel/fork.c:3397 [inline]
  __x64_sys_unshare+0x38/0x40 kernel/fork.c:3397
page last free pid 11846 tgid 11846 stack trace:
  reset_page_owner include/linux/page_owner.h:25 [inline]
  free_pages_prepare mm/page_alloc.c:1094 [inline]
  free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
  __folio_put+0x2c8/0x440 mm/swap.c:128
  folio_put include/linux/mm.h:1486 [inline]
  free_large_kmalloc+0x105/0x1c0 mm/slub.c:4565
  kfree+0x1c4/0x360 mm/slub.c:4588
  rhashtable_free_and_destroy+0x7c6/0x920 lib/rhashtable.c:1169
  ila_xlat_exit_net+0x55/0x110 net/ipv6/ila/ila_xlat.c:626
  ops_exit_list net/core/net_namespace.c:173 [inline]
  cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640
  process_one_work kernel/workqueue.c:3231 [inline]
  process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
  worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
  kthread+0x2f0/0x390 kernel/kthread.c:389
  ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Memory state around the buggy address:
 
ffff88806461ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 
ffff88806461ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>
ffff888064620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                      ^
 
ffff888064620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 
ffff888064620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Fixes: 
7f00feaf1076 ("ila: Add generic ILA translation facility")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20240904144418.1162839-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arkadiusz Kubalewski [Wed, 4 Sep 2024 13:50:34 +0000 (15:50 +0200)]
 
tools/net/ynl: fix cli.py --subscribe feature
Execution of command:
./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml /
	--subscribe "monitor" --sleep 10
fails with:
  File "/repo/./tools/net/ynl/cli.py", line 109, in main
    ynl.check_ntf()
  File "/repo/tools/net/ynl/lib/ynl.py", line 924, in check_ntf
    op = self.rsp_by_value[nl_msg.cmd()]
KeyError: 19
Parsing Generic Netlink notification messages performs lookup for op in
the message. The message was not yet decoded, and is not yet considered
GenlMsg, thus msg.cmd() returns Generic Netlink family id (19) instead of
proper notification command id (i.e.: DPLL_CMD_PIN_CHANGE_NTF=13).
Allow the op to be obtained within NetlinkProtocol.decode(..) itself if the
op was not passed to the decode function, thus allow parsing of Generic
Netlink notifications without causing the failure.
Suggested-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/netdev/m2le0n5xpn.fsf@gmail.com/
Fixes: 
0a966d606c68 ("tools/net/ynl: Fix extack decoding for directional ops")
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20240904135034.316033-1-arkadiusz.kubalewski@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>