David S. Miller [Mon, 20 Feb 2023 11:22:54 +0000 (11:22 +0000)]
Merge branch 'default_rps_mask-follow-up'
Paolo Abeni says:
====================
net: default_rps_mask follow-up
The first patch namespacify the setting. In the common case, once
proper isolation is in place in the main namespace, forwarding
to/from each child netns will allways happen on the desidered CPUs.
Any additional RPS stage inside the child namespace will not provide
additional isolation and could hurt performance badly if picking a
CPU on a remote node.
The 2nd patch adds more self-tests coverage.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 17 Feb 2023 12:28:50 +0000 (13:28 +0100)]
self-tests: more rps self tests
Explicitly check for child netns and main ns independency
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 17 Feb 2023 12:28:49 +0000 (13:28 +0100)]
net: make default_rps_mask a per netns attribute
That really was meant to be a per netns attribute from the beginning.
The idea is that once proper isolation is in place in the main
namespace, additional demux in the child namespaces will be redundant.
Let's make child netns default rps mask empty by default.
To avoid bloating the netns with a possibly large cpumask, allocate
it on-demand during the first write operation.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Feb 2023 11:18:13 +0000 (11:18 +0000)]
Merge tag 'wireless-next-2023-02-17' of git://git./linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.3
Third set of patches for v6.3. This time only a set of small fixes
submitted during the last day or two.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Feb 2023 10:53:56 +0000 (10:53 +0000)]
Merge git://git./linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Add safeguard to check for NULL tupe in objects updates via
NFT_MSG_NEWOBJ, this should not ever happen. From Alok Tiwari.
2) Incorrect pointer check in the new destroy rule command,
from Yang Yingliang.
3) Incorrect status bitcheck in nf_conntrack_udp_packet(),
from Florian Westphal.
4) Simplify seq_print_acct(), from Ilia Gavrilov.
5) Use 2-arg optimal variant of kfree_rcu() in IPVS,
from Julian Anastasov.
6) TCP connection enters CLOSE state in conntrack for locally
originated TCP reset packet from the reject target,
from Florian Westphal.
The fixes #2 and #3 in this series address issues from the previous pull
nf-next request in this net-next cycle.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 17 Feb 2023 09:58:06 +0000 (10:58 +0100)]
net: microchip: sparx5: reduce stack usage
The vcap_admin structures in vcap_api_next_lookup_advanced_test()
take several hundred bytes of stack frame, but when CONFIG_KASAN_STACK
is enabled, each one of them also has extra padding before and after
it, which ends up blowing the warning limit:
In file included from drivers/net/ethernet/microchip/vcap/vcap_api.c:3521:
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c: In function 'vcap_api_next_lookup_advanced_test':
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:1954:1: error: the frame size of 1448 bytes is larger than 1400 bytes [-Werror=frame-larger-than=]
1954 | }
Reduce the total stack usage by replacing the five structures with
an array that only needs one pair of padding areas.
Fixes:
1f741f001160 ("net: microchip: sparx5: Add KUNIT tests for enabling/disabling chains")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 17 Feb 2023 09:56:39 +0000 (10:56 +0100)]
sfc: use IS_ENABLED() checks for CONFIG_SFC_SRIOV
One local variable has become unused after a recent change:
drivers/net/ethernet/sfc/ef100_nic.c: In function 'ef100_probe_netdev_pf':
drivers/net/ethernet/sfc/ef100_nic.c:1155:21: error: unused variable 'net_dev' [-Werror=unused-variable]
struct net_device *net_dev = efx->net_dev;
^~~~~~~
The variable is still used in an #ifdef. Replace the #ifdef with
an if(IS_ENABLED()) check that lets the compiler see where it is
used, rather than adding another #ifdef.
This also fixes an uninitialized return value in ef100_probe_netdev_pf()
that gcc did not spot.
Fixes:
7e056e2360d9 ("sfc: obtain device mac address based on firmware handle for ef100")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Swiatkowski [Fri, 17 Feb 2023 10:50:17 +0000 (11:50 +0100)]
ice: properly alloc ICE_VSI_LB
Devlink reload patchset introduced regression. ICE_VSI_LB wasn't
taken into account when doing default allocation. Fix it by adding a
case for ICE_VSI_LB in ice_vsi_alloc_def().
Fixes:
6624e780a577 ("ice: split ice_vsi_setup into smaller functions")
Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 17 Feb 2023 09:25:28 +0000 (09:25 +0000)]
sfc: Fix spelling mistake "creationg" -> "creating"
There is a spelling mistake in a pci_warn message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geetha sowjanya [Fri, 17 Feb 2023 05:51:12 +0000 (11:21 +0530)]
octeontx2-af: Add NIX Errata workaround on CN10K silicon
This patch adds workaround for below 2 HW erratas
1. Due to improper clock gating, NIXRX may free the same
NPA buffer multiple times.. to avoid this, always enable
NIX RX conditional clock.
2. NIX FIFO does not get initialized on reset, if the SMQ
flush is triggered before the first packet is processed, it
will lead to undefined state. The workaround to perform SMQ
flush only if packet count is non-zero in MDQ.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Fri, 17 Feb 2023 03:15:20 +0000 (04:15 +0100)]
net: phy: Read EEE abilities when using .features
A PHY driver can use a static integer value to indicate what link mode
features it supports, i.e, its abilities.. This is the old way, but
useful when dynamically determining the devices features does not
work, e.g. support of fibre.
EEE support has been moved into phydev->supported_eee. This needs to
be set otherwise the code assumes EEE is not supported. It is normally
set as part of reading the devices abilities. However if a static
integer value was used, the dynamic reading of the abilities is not
performed. Add a call to genphy_c45_read_eee_abilities() to read the
EEE abilities.
Fixes:
8b68710a3121 ("net: phy: start using genphy_c45_ethtool_get/set_eee()")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Feb 2023 10:04:22 +0000 (10:04 +0000)]
Merge branch 'phydev-locks'
Andrew Lunn says:
====================
Add additional phydev locks
The phydev lock should be held when accessing members of phydev, or
calling into the driver. Some of the phy_ethtool_ functions are
missing locks. Add them. To avoid deadlock the marvell driver is
modified since it calls one of the functions which gain locks, which
would result in a deadlock.
The missing locks have not caused noticeable issues, so these patches
are for net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Fri, 17 Feb 2023 03:07:14 +0000 (04:07 +0100)]
net: phy: Add locks to ethtool functions
The phydev lock should be held while accessing members of phydev,
or calling into the driver.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Fri, 17 Feb 2023 03:07:13 +0000 (04:07 +0100)]
net: phy: marvell: Use the unlocked genphy_c45_ethtool_get_eee()
phy_ethtool_get_eee() is about to gain locking of the phydev lock.
This means it cannot be used within a PHY driver without causing a
deadlock. Swap to using genphy_c45_ethtool_get_eee() which assumes the
lock has already been taken.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 20 Feb 2023 08:54:24 +0000 (08:54 +0000)]
Merge branch 'icmp6-drop-reason'
Eric Dumazet says:
====================
ipv6: icmp6: better drop reason support
This series aims to have more precise drop reason reports for icmp6.
This should reduce false positives on most usual cases.
This can be extended as needed later.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:42 +0000 (16:28 +0000)]
ipv6: icmp6: add drop reason support to icmpv6_echo_reply()
Change icmpv6_echo_reply() to return a drop reason.
For the moment, return NOT_SPECIFIED or SKB_CONSUMED.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:41 +0000 (16:28 +0000)]
ipv6: icmp6: add SKB_DROP_REASON_IPV6_NDISC_NS_OTHERHOST
Hosts can often receive neighbour discovery messages
that are not for them.
Use a dedicated drop reason to make clear the packet is dropped
for this normal case.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:40 +0000 (16:28 +0000)]
ipv6: icmp6: add SKB_DROP_REASON_IPV6_NDISC_BAD_OPTIONS
This is a generic drop reason for any error detected
in ndisc_parse_options().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:39 +0000 (16:28 +0000)]
ipv6: icmp6: add drop reason support to ndisc_redirect_rcv()
Change ndisc_redirect_rcv() to return a drop reason.
For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
and values from icmpv6_notify().
More reasons are added later.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:38 +0000 (16:28 +0000)]
ipv6: icmp6: add drop reason support to ndisc_router_discovery()
Change ndisc_router_discovery() to return a drop reason.
For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
and SKB_CONSUMED.
More reasons are added later.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:37 +0000 (16:28 +0000)]
ipv6: icmp6: add drop reason support to ndisc_recv_rs()
Change ndisc_recv_rs() to return a drop reason.
For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
or SKB_CONSUMED. More reasons are added later.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:36 +0000 (16:28 +0000)]
ipv6: icmp6: add drop reason support to ndisc_recv_na()
Change ndisc_recv_na() to return a drop reason.
For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
or SKB_CONSUMED. More reasons are added later.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 16:28:35 +0000 (16:28 +0000)]
ipv6: icmp6: add drop reason support to ndisc_recv_ns()
Change ndisc_recv_ns() to return a drop reason.
For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
or SKB_CONSUMED.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 16 Feb 2023 15:47:18 +0000 (15:47 +0000)]
net: add location to trace_consume_skb()
kfree_skb() includes the location, it makes sense
to add it to consume_skb() as well.
After patch:
taskd_EventMana 8602 [004] 420.406239: skb:consume_skb: skbaddr=0xffff893a4a6d0500 location=unix_stream_read_generic
swapper 0 [011] 422.732607: skb:consume_skb: skbaddr=0xffff89597f68cee0 location=mlx4_en_free_tx_desc
discipline 9141 [043] 423.065653: skb:consume_skb: skbaddr=0xffff893a487e9c00 location=skb_consume_udp
swapper 0 [010] 423.073166: skb:consume_skb: skbaddr=0xffff8949ce9cdb00 location=icmpv6_rcv
borglet 8672 [014] 425.628256: skb:consume_skb: skbaddr=0xffff8949c42e9400 location=netlink_dump
swapper 0 [028] 426.263317: skb:consume_skb: skbaddr=0xffff893b1589dce0 location=net_rx_action
wget 14339 [009] 426.686380: skb:consume_skb: skbaddr=0xffff893a51b552e0 location=tcp_rcv_state_process
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xuan Zhuo [Thu, 16 Feb 2023 08:30:47 +0000 (16:30 +0800)]
xsk: support use vaddr as ring
When we try to start AF_XDP on some machines with long running time, due
to the machine's memory fragmentation problem, there is no sufficient
contiguous physical memory that will cause the start failure.
If the size of the queue is 8 * 1024, then the size of the desc[] is
8 * 1024 * 8 = 16 * PAGE, but we also add struct xdp_ring size, so it is
16page+. This is necessary to apply for a 4-order memory. If there are a
lot of queues, it is difficult to these machine with long running time.
Here, that we actually waste 15 pages. 4-Order memory is 32 pages, but
we only use 17 pages.
This patch replaces __get_free_pages() by vmalloc() to allocate memory
to solve these problems.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Mon, 20 Feb 2023 07:46:59 +0000 (08:46 +0100)]
Merge branch 'taprio-queuemaxsdu-fixes'
Vladimir Oltean says:
====================
taprio queueMaxSDU fixes
This fixes 3 issues noticed while attempting to reoffload the
dynamically calculated queueMaxSDU values. These are:
- Dynamic queueMaxSDU is not calculated correctly due to a lost patch
- Dynamically calculated queueMaxSDU needs to be clamped on the low end
- Dynamically calculated queueMaxSDU needs to be clamped on the high end
====================
Link: https://lore.kernel.org/r/20230215224632.2532685-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vladimir Oltean [Wed, 15 Feb 2023 22:46:32 +0000 (00:46 +0200)]
net/sched: taprio: dynamic max_sdu larger than the max_mtu is unlimited
It makes no sense to keep randomly large max_sdu values, especially if
larger than the device's max_mtu. These are visible in "tc qdisc show".
Such a max_sdu is practically unlimited and will cause no packets for
that traffic class to be dropped on enqueue.
Just set max_sdu_dynamic to U32_MAX, which in the logic below causes
taprio to save a max_frm_len of U32_MAX and a max_sdu presented to user
space of 0 (unlimited).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vladimir Oltean [Wed, 15 Feb 2023 22:46:31 +0000 (00:46 +0200)]
net/sched: taprio: don't allow dynamic max_sdu to go negative after stab adjustment
The overhead specified in the size table comes from the user. With small
time intervals (or gates always closed), the overhead can be larger than
the max interval for that traffic class, and their difference is
negative.
What we want to happen is for max_sdu_dynamic to have the smallest
non-zero value possible (1) which means that all packets on that traffic
class are dropped on enqueue. However, since max_sdu_dynamic is u32, a
negative is represented as a large value and oversized dropping never
happens.
Use max_t with int to force a truncation of max_frm_len to no smaller
than dev->hard_header_len + 1, which in turn makes max_sdu_dynamic no
smaller than 1.
Fixes:
fed87cc6718a ("net/sched: taprio: automatically calculate queueMaxSDU based on TC gate durations")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vladimir Oltean [Wed, 15 Feb 2023 22:46:30 +0000 (00:46 +0200)]
net/sched: taprio: fix calculation of maximum gate durations
taprio_calculate_gate_durations() depends on netdev_get_num_tc() and
this returns 0. So it calculates the maximum gate durations for no
traffic class.
I had tested the blamed commit only with another patch in my tree, one
which in the end I decided isn't valuable enough to submit ("net/sched:
taprio: mask off bits in gate mask that exceed number of TCs").
The problem is that having this patch threw off my testing. By moving
the netdev_set_num_tc() call earlier, we implicitly gave to
taprio_calculate_gate_durations() the information it needed.
Extract only the portion from the unsubmitted change which applies the
mqprio configuration to the netdev earlier.
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230130173145.475943-15-vladimir.oltean@nxp.com/
Fixes:
a306a90c8ffe ("net/sched: taprio: calculate tc gate durations")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
David Howells [Wed, 15 Feb 2023 21:48:05 +0000 (21:48 +0000)]
rxrpc: Fix overproduction of wakeups to recvmsg()
Fix three cases of overproduction of wakeups:
(1) rxrpc_input_split_jumbo() conditionally notifies the app that there's
data for recvmsg() to collect if it queues some data - and then its
only caller, rxrpc_input_data(), goes and wakes up recvmsg() anyway.
Fix the rxrpc_input_data() to only do the wakeup in failure cases.
(2) If a DATA packet is received for a call by the I/O thread whilst
recvmsg() is busy draining the call's rx queue in the app thread, the
call will left on the recvmsg() queue for recvmsg() to pick up, even
though there isn't any data on it.
This can cause an unexpected recvmsg() with a 0 return and no MSG_EOR
set after the reply has been posted to a service call.
Fix this by discarding pending calls from the recvmsg() queue that
don't need servicing yet.
(3) Not-yet-completed calls get requeued after having data read from them,
even if they have no data to read.
Fix this by only requeuing them if they have data waiting on them; if
they don't, the I/O thread will requeue them when data arrives or they
fail.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/3386149.1676497685@warthog.procyon.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Mon, 20 Feb 2023 07:14:22 +0000 (08:14 +0100)]
Merge branch 'net-final-gsi-register-updates'
Alex Elder says:
====================
net: final GSI register updates
I believe this is the last set of changes required to allow IPA v5.0
to be supported. There is a little cleanup work remaining, but that
can happen in the next Linux release cycle. Otherwise we just need
config data and register definitions for IPA v5.0 (and DTS updates).
These are ready but won't be posted without further testing.
The first patch in this series fixes a minor bug in a patch just
posted, which I found too late. The second eliminates the GSI
memory "adjustment"; this was done previously to avoid/delay the
need to implement a more general way to define GSI register offsets.
Note that this patch causes "checkpatch" warnings due to indentation
that aligns with an open parenthesis.
The third patch makes use of the newly-defined register offsets, to
eliminate the need for a function that hid a few details. The next
modifies a different helper function to work properly for IPA v5.0+.
The fifth patch changes the way the event ring size is specified
based on how it's now done for IPA v5.0+. And the last defines a
new register required for IPA v5.0+.
====================
Link: https://lore.kernel.org/r/20230215195352.755744-1-elder@linaro.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alex Elder [Wed, 15 Feb 2023 19:53:52 +0000 (13:53 -0600)]
net: ipa: add HW_PARAM_4 GSI register
Starting at IPA v5.0, the number of event rings per EE is defined
in a field in a new HW_PARAM_4 GSI register rather than HW_PARAM_2.
Define this new register and its fields, and update the code that
checks the number of rings supported by hardware to use the proper
field based on IPA version.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alex Elder [Wed, 15 Feb 2023 19:53:51 +0000 (13:53 -0600)]
net: ipa: support different event ring encoding
Starting with IPA v5.0, a channel's event ring index is encoded in
a field in the CH_C_CNTXT_1 GSI register rather than CH_C_CNTXT_0.
Define a new field ID for the former register and encode the event
ring in the appropriate register.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alex Elder [Wed, 15 Feb 2023 19:53:50 +0000 (13:53 -0600)]
net: ipa: avoid setting an undefined field
The GSI channel protocol field in the CH_C_CNTXT_0 GSI register is
widened starting IPA v5.0, making the CHTYPE_PROTOCOL_MSB field
added in IPA v4.5 unnecessary. Update the code to reflect this.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alex Elder [Wed, 15 Feb 2023 19:53:49 +0000 (13:53 -0600)]
net: ipa: kill ev_ch_e_cntxt_1_length_encode()
Now that we explicitly define each register field width there is no
need to have a special encoding function for the event ring length.
Add a field for this to the EV_CH_E_CNTXT_1 GSI register, and use it
in place of ev_ch_e_cntxt_1_length_encode() (which can be removed).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alex Elder [Wed, 15 Feb 2023 19:53:48 +0000 (13:53 -0600)]
net: ipa: kill gsi->virt_raw
Starting at IPA v4.5, almost all GSI registers had their offsets
changed by a fixed amount (shifted downward by 0xd000). Rather than
defining offsets for all those registers dependent on version, an
adjustment was applied for most register accesses. This was
implemented in commit
cdeee49f3ef7f ("net: ipa: adjust GSI register
addresses"). It was later modified to be a bit more obvious about
the adjusment, in commit
571b1e7e58ad3 ("net: ipa: use a separate
pointer for adjusted GSI memory").
We now are able to define every GSI register with its own offset, so
there's no need to implement this special adjustment.
So get rid of the "virt_raw" pointer, and just maintain "virt" as
the (non-adjusted) base address of I/O mapped GSI register memory.
Redefine the offsets of all GSI registers (other than the INTER_EE
ones, which were not subject to the adjustment) for IPA v4.5+,
subtracting 0xd000 from their defined offsets instead.
Move the ERROR_LOG and ERROR_LOG_CLR definitions further down in the
register definition files so all registers are defined in order of
their offset.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alex Elder [Wed, 15 Feb 2023 19:53:47 +0000 (13:53 -0600)]
net: ipa: fix an incorrect assignment
I spotted an error in a patch posted this week, unfortunately just
after it got accepted. The effect of the bug is that time-based
interrupt moderation is disabled. This is not technically a bug,
but it is not what is intended. The problem is that a |= assignment
got implemented as a simple assignment, so the previously assigned
value was ignored.
Fixes:
edc6158b18af ("net: ipa: define fields for event-ring related registers")
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lorenzo Bianconi [Wed, 15 Feb 2023 14:32:57 +0000 (15:32 +0100)]
net: dpaa2-eth: do not always set xsk support in xdp_features flag
Do not always add NETDEV_XDP_ACT_XSK_ZEROCOPY bit in xdp_features flag
but check if the NIC really supports it.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Link: https://lore.kernel.org/r/3dba6ea42dc343a9f2d7d1a6a6a6c173235e1ebf.1676471386.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Arnd Bergmann [Fri, 17 Feb 2023 09:59:04 +0000 (10:59 +0100)]
wifi: rtl8xxxu: add LEDS_CLASS dependency
rtl8xxxu now unconditionally uses LEDS_CLASS, so a Kconfig dependency
is required to avoid link errors:
aarch64-linux-ld: drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.o: in function `rtl8xxxu_disconnect':
rtl8xxxu_core.c:(.text+0x730): undefined reference to `led_classdev_unregister'
ERROR: modpost: "led_classdev_unregister" [drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.ko] undefined!
ERROR: modpost: "led_classdev_register_ext" [drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu.ko] undefined!
Fixes:
3be01622995b ("wifi: rtl8xxxu: Register the LED and make it blink")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230217095910.2480356-1-arnd@kernel.org
Florian Westphal [Wed, 1 Feb 2023 13:45:22 +0000 (14:45 +0100)]
netfilter: let reset rules clean out conntrack entries
iptables/nftables support responding to tcp packets with tcp resets.
The generated tcp reset packet passes through both output and postrouting
netfilter hooks, but conntrack will never see them because the generated
skb has its ->nfct pointer copied over from the packet that triggered the
reset rule.
If the reset rule is used for established connections, this
may result in the conntrack entry to be around for a very long
time (default timeout is 5 days).
One way to avoid this would be to not copy the nf_conn pointer
so that the rest packet passes through conntrack too.
Problem is that output rules might not have the same conntrack
zone setup as the prerouting ones, so its possible that the
reset skb won't find the correct entry. Generating a template
entry for the skb seems error prone as well.
Add an explicit "closing" function that switches a confirmed
conntrack entry to closed state and wire this up for tcp.
If the entry isn't confirmed, no action is needed because
the conntrack entry will never be committed to the table.
Reported-by: Russel King <linux@armlinux.org.uk>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David S. Miller [Fri, 17 Feb 2023 11:06:39 +0000 (11:06 +0000)]
Merge ra./pub/scm/linux/kernel/git/netdev/net
Some of the devlink bits were tricky, but I think I got it right.
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Thu, 16 Feb 2023 20:34:44 +0000 (21:34 +0100)]
wifi: iwlegacy: avoid fortify warning
There are two different alive messages, the "init" one is
bigger than the other one, so we have a fortify read warn
here. Avoid it by copying from the variable-sized 'raw'
instead.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230216203444.134310-1-johannes@sipsolutions.net
Johannes Berg [Thu, 16 Feb 2023 19:57:54 +0000 (20:57 +0100)]
wifi: iwlwifi: mvm: remove unused iwl_dbgfs_is_match()
This inline function is unused, remove it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230216205754.d500dcc2e90c.Id87df297263f86b5bba002f7cbb387abc13adf53@changeid
Po-Hao Huang [Thu, 16 Feb 2023 08:28:07 +0000 (16:28 +0800)]
wifi: rtw89: fix AP mode authentication transmission failed
For some ICs, packets can't be sent correctly without initializing
CMAC table first. Previous flow do this initialization after
associated, results in authentication response fails to transmit.
Move the initialization up front to a proper place to solve this.
Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230216082807.22285-1-pkshih@realtek.com
Ping-Ke Shih [Thu, 16 Feb 2023 05:36:33 +0000 (13:36 +0800)]
wifi: rtw88: use RTW_FLAG_POWERON flag to prevent to power on/off twice
Use power state to decide whether we can enter or leave IPS accurately,
and then prevent to power on/off twice.
The commit
6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS")
would like to prevent this as well, but it still can't entirely handle all
cases. The exception is that WiFi gets connected and does suspend/resume,
it will power on twice and cause it failed to power on after resuming,
like:
rtw_8723de 0000:03:00.0: failed to poll offset=0x6 mask=0x2 value=0x2
rtw_8723de 0000:03:00.0: mac power on failed
rtw_8723de 0000:03:00.0: failed to power on mac
rtw_8723de 0000:03:00.0: leave idle state failed
rtw_8723de 0000:03:00.0: failed to leave ips state
rtw_8723de 0000:03:00.0: failed to leave idle state
rtw_8723de 0000:03:00.0: failed to send h2c command
To fix this, introduce new flag RTW_FLAG_POWERON to reflect power state,
and call rtw_mac_pre_system_cfg() to configure registers properly between
power-off/-on.
Reported-by: Paul Gover <pmw.gover@yahoo.co.uk>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217016
Fixes:
6bf3a083407b ("wifi: rtw88: add flag check before enter or leave IPS")
Cc: <Stable@vger.kernel.org>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230216053633.20366-1-pkshih@realtek.com
Linus Torvalds [Fri, 17 Feb 2023 04:23:32 +0000 (20:23 -0800)]
Merge tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Just a final collection of misc fixes, the biggest disables the
recently added dynamic debugging support, it has a regression that
needs some bigger fixes.
Otherwise a bunch of fixes across the board, vc4, amdgpu and vmwgfx
mostly, with some smaller i915 and ast fixes.
drm:
- dynamic debug disable for now
fbdev:
- deferred i/o device close fix
amdgpu:
- Fix GC11.x suspend warning
- Fix display warning
vc4:
- YUV planes fix
- hdmi display fix
- crtc reduced blanking fix
ast:
- fix start address computation
vmwgfx:
- fix bo/handle races
i915:
- gen11 WA fix"
* tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Fail atomic_check early on normalize_zpos error
drm/amd/amdgpu: fix warning during suspend
drm/vmwgfx: Do not drop the reference to the handle too soon
drm/vmwgfx: Stop accessing buffer objects which failed init
drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
drm: Disable dynamic debug as broken
drm/ast: Fix start address computation
fbdev: Fix invalid page access after closing deferred I/O devices
drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking
drm/vc4: hdmi: Always enable GCP with AVMUTE cleared
drm/vc4: Fix YUV plane handling when planes are in different buffers
Dave Airlie [Thu, 16 Feb 2023 23:49:12 +0000 (09:49 +1000)]
Merge tag 'drm-intel-fixes-2023-02-16' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Moving gen11 hw wa to the right place. (Matt)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y+47eUvwbafER35/@intel.com
Dave Airlie [Thu, 16 Feb 2023 23:23:43 +0000 (09:23 +1000)]
Merge tag 'drm-misc-fixes-2023-02-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Multiple fixes in vc4 to address issues with YUV planes, HDMI and CRTC;
an invalid page access fix for fbdev, mark dynamic debug as broken, a
double free and refcounting fix for vmwgfx.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216091905.i5wswy4dd74x4br5@houat
Dave Airlie [Thu, 16 Feb 2023 21:34:58 +0000 (07:34 +1000)]
Merge tag 'amd-drm-fixes-6.2-2023-02-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.2-2023-02-15:
amdgpu:
- Fix GC11.x suspend warning
- Fix display warning
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216041122.7714-1-alexander.deucher@amd.com
Linus Torvalds [Thu, 16 Feb 2023 20:13:58 +0000 (12:13 -0800)]
Merge tag 'net-6.2-final' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Fixes from the main networking tree only, probably because all
sub-trees have backed off and haven't submitted their changes.
None of the fixes here are particularly scary and no outstanding
regressions. In an ideal world the "current release" sections would be
empty at this stage but that never happens.
Current release - regressions:
- fix unwanted sign extension in netdev_stats_to_stats64()
Current release - new code bugs:
- initialize net->notrefcnt_tracker earlier
- devlink: fix netdev notifier chain corruption
- nfp: make sure mbox accesses in IPsec code are atomic
- ice: fix check for weight and priority of a scheduling node
Previous releases - regressions:
- ice: xsk: fix cleaning of XDP_TX frame, prevent inf loop
- igb: fix I2C bit banging config with external thermal sensor
Previous releases - always broken:
- sched: tcindex: update imperfect hash filters respecting rcu
- mpls: fix stale pointer if allocation fails during device rename
- dccp/tcp: avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions
- remove WARN_ON_ONCE(sk->sk_forward_alloc) from
sk_stream_kill_queues()
- af_key: fix heap information leak
- ipv6: fix socket connection with DSCP (correct interpretation of
the tclass field vs fib rule matching)
- tipc: fix kernel warning when sending SYN message
- vmxnet3: read RSS information from the correct descriptor (eop)"
* tag 'net-6.2-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
devlink: Fix netdev notifier chain corruption
igb: conditionalize I2C bit banging on external thermal sensor support
net: mpls: fix stale pointer if allocation fails during device rename
net/sched: tcindex: search key must be 16 bits
tipc: fix kernel warning when sending SYN message
igb: Fix PPS input and output using 3rd and 4th SDP
net: use a bounce buffer for copying skb->mark
ixgbe: add double of VLAN header when computing the max MTU
i40e: add double of VLAN header when computing the max MTU
ixgbe: allow to increase MTU to 3K with XDP enabled
net: stmmac: Restrict warning on disabling DMA store and fwd mode
net/sched: act_ctinfo: use percpu stats
net: stmmac: fix order of dwmac5 FlexPPS parametrization sequence
ice: fix lost multicast packets in promisc mode
ice: Fix check for weight and priority of a scheduling node
bnxt_en: Fix mqprio and XDP ring checking logic
net: Fix unwanted sign extension in netdev_stats_to_stats64()
net/usb: kalmia: Don't pass act_len in usb_bulk_msg error path
net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()
af_key: Fix heap information leak
...
Linus Torvalds [Thu, 16 Feb 2023 20:05:33 +0000 (12:05 -0800)]
Merge tag 'block-6.2-2023-02-16' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"Just a few NVMe fixes that should go into the 6.2 release, adding a
quirk and fixing two issues introduced in this release:
- NVMe fixes via Christoph:
- Always return an ERR_PTR from nvme_pci_alloc_dev (Irvin Cote)
- Add bogus ID quirk for ADATA SX6000PNP (Daniel Wagner)
- Set the DMA mask earlier (Christoph Hellwig)"
* tag 'block-6.2-2023-02-16' of git://git.kernel.dk/linux:
nvme-pci: always return an ERR_PTR from nvme_pci_alloc_dev
nvme-pci: set the DMA mask earlier
nvme-pci: add bogus ID quirk for ADATA SX6000PNP
Linus Torvalds [Thu, 16 Feb 2023 20:01:46 +0000 (12:01 -0800)]
Merge tag 'spi-v6.2-rc8-abi' of git://git./linux/kernel/git/broonie/spi
Pull spi fix from Mark Brown:
"One more last minute patch for v6.2 updating the parsing of the newly
added spi-cs-setup-delay-ns.
It's been pointed out that due to the way DT parsing works the change
in property size is ABI visible so let's not let a release go out
without it being fixed. The change got split from some earlier ABI
related fixes to the property since the first version sent had a build
error"
* tag 'spi-v6.2-rc8-abi' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: Use a 32-bit DT property for spi-cs-setup-delay-ns
Linus Torvalds [Thu, 16 Feb 2023 19:57:43 +0000 (11:57 -0800)]
Merge tag 'gpio-fixes-for-v6.2' of git://git./linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix a potential Kconfig issue with gpio-mlxbf2 not selecting
GPIOLIB_IRQCHIP
- another immutable irqchip conversion, this time for gpio-vf610
- fix a wakeup issue on Clevo NH5xAx
* tag 'gpio-fixes-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: mlxbf2: select GPIOLIB_IRQCHIP
gpiolib: acpi: Add a ignore wakeup quirk for Clevo NH5xAx
gpio: vf610: make irq_chip immutable
gpiolib: acpi: remove redundant declaration
Christoph Hellwig [Thu, 16 Feb 2023 06:31:10 +0000 (07:31 +0100)]
stop mainaining UUID
The uuid code is very low maintainance now that the major overhaul
has completed, and doesn't need it's own tree. All the recent work
has been done by Andy who'd like to stay on as a reviewer without an
explicit tree.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Thu, 16 Feb 2023 06:29:22 +0000 (07:29 +0100)]
orphan sysvfs
This code has been stale for years and I have no way to test it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jakub Kicinski [Thu, 16 Feb 2023 19:36:14 +0000 (11:36 -0800)]
Merge branch 'mlx5-next' of https://git./linux/kernel/git/mellanox/linux
Leon Romanovsky says:
====================
mlx5-next changes
Following previous conversations [1] and our clear commitment to do
the TC work [2], please pull mlx5-next shared branch, which includes
low-level steering logic to allow RoCEv2 traffic to be encrypted/
decrypted through IPsec.
[1] https://lore.kernel.org/all/
20230126230815.224239-1-saeed@kernel.org/
[2] https://lore.kernel.org/all/Y+Z7lVVWqnRBiPh2@nvidia.com/
* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Configure IPsec steering for egress RoCEv2 traffic
net/mlx5: Configure IPsec steering for ingress RoCEv2 traffic
net/mlx5: Add IPSec priorities in RDMA namespaces
net/mlx5: Implement new destination type TABLE_TYPE
net/mlx5: Introduce new destination type TABLE_TYPE
====================
Link: https://lore.kernel.org/r/20230215095624.1365200-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 16 Feb 2023 19:33:10 +0000 (11:33 -0800)]
Merge tag 'wireless-next-2023-03-16' of git://git./linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Major stack changes:
* EHT channel puncturing support (client & AP)
* some support for AP MLD without mac80211
* fixes for A-MSDU on mesh connections
Major driver changes:
iwlwifi
* EHT rate reporting
* Bump FW API to 74 for AX devices
* STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware
mt76
* switch to using page pool allocator
* mt7996 EHT (Wi-Fi 7) support
* Wireless Ethernet Dispatch (WED) reset support
libertas
* WPS enrollee support
brcmfmac
* Rename Cypress 89459 to BCM4355
* BCM4355 and BCM4377 support
mwifiex
* SD8978 chipset support
rtl8xxxu
* LED support
ath12k
* new driver for Qualcomm Wi-Fi 7 devices
ath11k
* IPQ5018 support
* Fine Timing Measurement (FTM) responder role support
* channel 177 support
ath10k
* store WLAN firmware version in SMEM image table
* tag 'wireless-next-2023-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (207 commits)
wifi: brcmfmac: p2p: Introduce generic flexible array frame member
wifi: mac80211: add documentation for amsdu_mesh_control
wifi: cfg80211: remove gfp parameter from cfg80211_obss_color_collision_notify description
wifi: mac80211: always initialize link_sta with sta
wifi: mac80211: pass 'sta' to ieee80211_rx_data_set_sta()
wifi: cfg80211: Set SSID if it is not already set
wifi: rtw89: move H2C of del_pkt_offload before polling FW status ready
wifi: rtw89: use readable return 0 in rtw89_mac_cfg_ppdu_status()
wifi: rtw88: usb: drop now unnecessary URB size check
wifi: rtw88: usb: send Zero length packets if necessary
wifi: rtw88: usb: Set qsel correctly
wifi: mac80211: fix off-by-one link setting
wifi: mac80211: Fix for Rx fragmented action frames
wifi: mac80211: avoid u32_encode_bits() warning
wifi: mac80211: Don't translate MLD addresses for multicast
wifi: cfg80211: call reg_notifier for self managed wiphy from driver hint
wifi: cfg80211: get rid of gfp in cfg80211_bss_color_notify
wifi: nl80211: Allow authentication frames and set keys on NAN interface
wifi: mac80211: fix non-MLO station association
wifi: mac80211: Allow NSS change only up to capability
...
====================
Link: https://lore.kernel.org/r/20230216105406.208416-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bartosz Golaszewski [Thu, 16 Feb 2023 12:31:42 +0000 (13:31 +0100)]
Merge tag 'intel-gpio-v6.2-2' of git://git./linux/kernel/git/andy/linux-gpio-intel into gpio/for-current
intel-gpio for v6.2-2
* Ignore spurious wakeup by touchpad on Clevo NH5xAx
* Miscellaneous fix(es)
Paolo Abeni [Thu, 16 Feb 2023 12:30:06 +0000 (13:30 +0100)]
Merge branch 'seg6-add-psp-flavor-support-for-srv6-end-behavior'
Andrea Mayer says:
====================
seg6: add PSP flavor support for SRv6 End behavior
Segment Routing for IPv6 (SRv6 in short) [1] is the instantiation of the
Segment Routing (SR) [2] architecture on the IPv6 dataplane.
In SRv6, the segment identifiers (SID) are IPv6 addresses and the segment list
(SID List) is carried in the Segment Routing Header (SRH). A segment may be
bound to a specific packet processing operation called "behavior". The RFC8986
[3] defines and standardizes the most common/relevant behaviors for network
operators, e.g., End, End.X and End.T and so on.
The RFC8986 also introduces the "flavors" framework aiming to modify or extend
the capabilities of SRv6 End, End.X and End.T behaviors. Specifically, these
behaviors support the following flavors (either individually or in
combinations):
- Penultimate Segment Pop (PSP);
- Ultimate Segment Pop (USP);
- Ultimate Segment Decapsulation (USD).
Such flavors enable an End/End.X/End.T behavior to pop the SRH on the
penultimate/ultimate SR endpoint node listed in the SID List or to perform a
full decapsulation.
Currently, the Linux kernel supports a large subset of behaviors described in
RFC8986, including the End, End.X and End.T. However, PSP, USP and USD flavors
have not yet been implemented.
In this patchset, we extend the SRv6 subsystem to implement the PSP flavor in
the SRv6 End behavior. To accomplish this task, we leverage the flavor
framework previously introduced by another patchset required for supporting the
efficient representation of the SID List through the NEXT-C-SID mechanism [4].
In details, the patchset is made of:
- patch 1/3: seg6: factor out End lookup nexthop processing to a dedicated
function
- patch 2/3: seg6: add PSP flavor support for SRv6 End behavior
- patch 3/3: selftests: seg6: add selftest for PSP flavor in SRv6 End
behavior
From the user space perspective, we do not need to change the iproute2 code to
support the PSP flavor. However, we provide the man page for the PSP flavor in
a separate patch.
Comments, improvements and suggestions are always appreciated.
[1] - RFC8754: https://datatracker.ietf.org/doc/html/rfc8754
[2] - RFC8402: https://datatracker.ietf.org/doc/html/rfc8402
[3] - RFC8986: https://datatracker.ietf.org/doc/html/rfc8986
[4] - https://datatracker.ietf.org/doc/html/draft-ietf-spring-srv6-srh-compression
====================
Link: https://lore.kernel.org/r/20230215134659.7613-1-andrea.mayer@uniroma2.it
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrea Mayer [Wed, 15 Feb 2023 13:46:59 +0000 (14:46 +0100)]
selftests: seg6: add selftest for PSP flavor in SRv6 End behavior
This selftest is designed for testing the PSP flavor in SRv6 End behavior.
It instantiates a virtual network composed of several nodes: hosts and
SRv6 routers. Each node is realized using a network namespace that is
properly interconnected to others through veth pairs.
The test makes use of the SRv6 End behavior and of the PSP flavor needed
for removing the SRH from the IPv6 header at the penultimate node.
The correct execution of the behavior is verified through reachability
tests carried out between hosts.
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Lungaroni <paolo.lungaroni@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrea Mayer [Wed, 15 Feb 2023 13:46:58 +0000 (14:46 +0100)]
seg6: add PSP flavor support for SRv6 End behavior
The "flavors" framework defined in RFC8986 [1] represents additional
operations that can modify or extend a subset of existing behaviors such as
SRv6 End, End.X and End.T. We report these flavors hereafter:
- Penultimate Segment Pop (PSP);
- Ultimate Segment Pop (USP);
- Ultimate Segment Decapsulation (USD).
Depending on how the Segment Routing Header (SRH) has to be handled, an
SRv6 End* behavior can support these flavors either individually or in
combinations.
In this patch, we only consider the PSP flavor for the SRv6 End behavior.
A PSP enabled SRv6 End behavior is used by the Source/Ingress SR node
(i.e., the one applying the SRv6 Policy) when it needs to instruct the
penultimate SR Endpoint node listed in the SID List (carried by the SRH) to
remove the SRH from the IPv6 header.
Specifically, a PSP enabled SRv6 End behavior processes the SRH by:
i) decreasing the Segment Left (SL) from 1 to 0;
ii) copying the Last Segment IDentifier (SID) into the IPv6 Destination
Address (DA);
iii) removing (i.e., popping) the outer SRH from the extension headers
following the IPv6 header.
It is important to note that PSP operation (steps i, ii, iii) takes place
only at a penultimate SR Segment Endpoint node (i.e., when the SL=1) and
does not happen at non-penultimate Endpoint nodes. Indeed, when a SID of
PSP flavor is processed at a non-penultimate SR Segment Endpoint node, the
PSP operation is not performed because it would not be possible to decrease
the SL from 1 to 0.
SL=2 SL=1 SL=0
| | |
For example, given the SRv6 policy (SID List := < X, Y, Z >):
- a PSP enabled SRv6 End behavior bound to SID "Y" will apply the PSP
operation as Segment Left (SL) is 1, corresponding to the Penultimate
Segment of the SID List;
- a PSP enabled SRv6 End behavior bound to SID "X" will *NOT* apply the
PSP operation as the Segment Left is 2. This behavior instance will
apply the "standard" End packet processing, ignoring the configured PSP
flavor at all.
[1] - RFC8986: https://datatracker.ietf.org/doc/html/rfc8986
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrea Mayer [Wed, 15 Feb 2023 13:46:57 +0000 (14:46 +0100)]
seg6: factor out End lookup nexthop processing to a dedicated function
The End nexthop lookup/input operations are moved into a new helper
function named input_action_end_finish(). This avoids duplicating the
code needed to compute the nexthop in the different flavors of the End
behavior.
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lukas Bulwahn [Wed, 15 Feb 2023 10:46:31 +0000 (11:46 +0100)]
net: dsa: ocelot: fix selecting MFD_OCELOT
Commit
3d7316ac81ac ("net: dsa: ocelot: add external ocelot switch
control") adds config NET_DSA_MSCC_OCELOT_EXT, which selects the
non-existing config MFD_OCELOT_CORE.
Replace this select with the intended and existing MFD_OCELOT.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20230215104631.31568-1-lukas.bulwahn@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 16 Feb 2023 11:03:15 +0000 (12:03 +0100)]
Merge branch 'sfc-devlink-support-for-ef100'
Alejandro Lucero says:
====================
sfc: devlink support for ef100
This patchset adds devlink port support for ef100 allowing setting VFs
mac addresses through the VF representor devlink ports.
Basic devlink infrastructure is first introduced, then support for info
command. Next changes for enumerating MAE ports which will be used for
devlink port creation when netdevs are registered.
Adding support for devlink port_function_hw_addr_get requires changes in
the ef100 driver for getting the mac address based on a client handle.
This allows to obtain VFs mac addresses during netdev initialization as
well what is included in patch 6.
Such client handle is used in patches 7 and 8 for getting and setting
devlink port addresses.
====================
Link: https://lore.kernel.org/r/20230215090828.11697-1-alejandro.lucero-palau@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:28 +0000 (09:08 +0000)]
sfc: add support for devlink port_function_hw_addr_set in ef100
Using the builtin client handle id infrastructure, add support for
setting the mac address linked to mports in ef100. This implies to
execute an MCDI command for giving the address to the firmware for
the specific devlink port.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:27 +0000 (09:08 +0000)]
sfc: add support for devlink port_function_hw_addr_get in ef100
Using the builtin client handle id infrastructure, add support for
obtaining the mac address linked to mports in ef100. This implies
to execute an MCDI command for getting the data from the firmware
for each devlink port.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:26 +0000 (09:08 +0000)]
sfc: obtain device mac address based on firmware handle for ef100
Getting device mac address is currently based on a specific MCDI command
only available for the PF. This patch changes the MCDI command to a
generic one for PFs and VFs based on a client handle. This allows both
PFs and VFs to ask for their mac address during initialization using the
CLIENT_HANDLE_SELF.
Moreover, the patch allows other client handles which will be used by
the PF to ask for mac addresses linked to VFs. This is necessary for
suporting the port_function_hw_addr_get devlink function in further
patches.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:25 +0000 (09:08 +0000)]
sfc: add devlink port support for ef100
Using the data when enumerating mports, create devlink ports just before
netdevs are registered and remove those devlink ports after netdev has
been unregistered.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:24 +0000 (09:08 +0000)]
sfc: add mport lookup based on driver's mport data
Obtaining mport id is based on asking the firmware about it. This is
still needed for mport initialization itself, but once the mport data is
now kept by the driver, further mport id request can be satisfied
internally without firmware interaction.
Previous function is just modified in name making clear the firmware
interaction. The new function uses the old name and looks for the data
in the mport data structure.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:23 +0000 (09:08 +0000)]
sfc: enumerate mports in ef100
MAE ports (mports) are the ports on the EF100 embedded switch such
as networking PCIe functions, the physical port, and potentially
others.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:22 +0000 (09:08 +0000)]
sfc: add devlink info support for ef100
Add devlink info support for ef100. The information reported is obtained
through the MCDI interface with the specific meaning defined in new
documentation file.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Alejandro Lucero [Wed, 15 Feb 2023 09:08:21 +0000 (09:08 +0000)]
sfc: add devlink support for ef100
Add devlink infrastructure support. Further patches add devlink
info and devlink port support.
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Ido Schimmel [Wed, 15 Feb 2023 07:31:39 +0000 (09:31 +0200)]
devlink: Fix netdev notifier chain corruption
Cited commit changed devlink to register its netdev notifier block on
the global netdev notifier chain instead of on the per network namespace
one.
However, when changing the network namespace of the devlink instance,
devlink still tries to unregister its notifier block from the chain of
the old namespace and register it on the chain of the new namespace.
This results in corruption of the notifier chains, as the same notifier
block is registered on two different chains: The global one and the per
network namespace one. In turn, this causes other problems such as the
inability to dismantle namespaces due to netdev reference count issues.
Fix by preventing devlink from moving its notifier block between
namespaces.
Reproducer:
# echo "10 1" > /sys/bus/netdevsim/new_device
# ip netns add test123
# devlink dev reload netdevsim/netdevsim10 netns test123
# ip netns del test123
[ 71.935619] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 71.938348] leaked reference.
Fixes:
565b4824c39f ("devlink: change port event netdev notifier from per-net to global")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230215073139.1360108-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 16 Feb 2023 09:39:30 +0000 (10:39 +0100)]
Merge branch 'net-sched-transition-actions-to-pcpu-stats-and-rcu'
Pedro Tammela says:
====================
net/sched: transition actions to pcpu stats and rcu
Following the work done for act_pedit[0], transition the remaining tc
actions to percpu stats and rcu, whenever possible.
Percpu stats make updating the action stats very cheap, while combining
it with rcu action parameters makes it possible to get rid of the per
action lock in the datapath.
For act_connmark and act_nat we run the following tests:
- tc filter add dev ens2f0 ingress matchall action connmark
- tc filter add dev ens2f0 ingress matchall action nat ingress any 10.10.10.10
Our setup consists of a 26 cores Intel CPU and a 25G NIC.
We use TRex to shoot 10mpps TCP packets and take perf measurements.
Both actions improved performance as expected since the datapath lock disappeared.
For act_pedit we move the drop counter to percpu, when available.
For act_gate we move the counters to percpu, when available.
[0] https://lore.kernel.org/all/
20230131145149.
3776656-1-pctammela@mojatatu.com/
====================
Link: https://lore.kernel.org/r/20230214211534.735718-1-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 14 Feb 2023 21:15:34 +0000 (18:15 -0300)]
net/sched: act_pedit: use percpu overlimit counter when available
Since act_pedit now has access to percpu counters, use the
tcf_action_inc_overlimit_qstats wrapper that will use the percpu
counter whenever they are available.
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 14 Feb 2023 21:15:33 +0000 (18:15 -0300)]
net/sched: act_gate: use percpu stats
The tc action act_gate was using shared stats, move it to percpu stats.
tdc results:
1..12
ok 1 5153 - Add gate action with priority and sched-entry
ok 2 7189 - Add gate action with base-time
ok 3 a721 - Add gate action with cycle-time
ok 4 c029 - Add gate action with cycle-time-ext
ok 5 3719 - Replace gate base-time action
ok 6 d821 - Delete gate action with valid index
ok 7 3128 - Delete gate action with invalid index
ok 8 7837 - List gate actions
ok 9 9273 - Flush gate actions
ok 10 c829 - Add gate action with duplicate index
ok 11 3043 - Add gate action with invalid index
ok 12 2930 - Add gate action with cookie
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 14 Feb 2023 21:15:32 +0000 (18:15 -0300)]
net/sched: act_connmark: transition to percpu stats and rcu
The tc action act_connmark was using shared stats and taking the per
action lock in the datapath. Improve it by using percpu stats and rcu.
perf before:
- 13.55% tcf_connmark_act
- 81.18% _raw_spin_lock
80.46% native_queued_spin_lock_slowpath
perf after:
- 2.85% tcf_connmark_act
tdc results:
1..15
ok 1 2002 - Add valid connmark action with defaults
ok 2 56a5 - Add valid connmark action with control pass
ok 3 7c66 - Add valid connmark action with control drop
ok 4 a913 - Add valid connmark action with control pipe
ok 5 bdd8 - Add valid connmark action with control reclassify
ok 6 b8be - Add valid connmark action with control continue
ok 7 d8a6 - Add valid connmark action with control jump
ok 8 aae8 - Add valid connmark action with zone argument
ok 9 2f0b - Add valid connmark action with invalid zone argument
ok 10 9305 - Add connmark action with unsupported argument
ok 11 71ca - Add valid connmark action and replace it
ok 12 5f8f - Add valid connmark action with cookie
ok 13 c506 - Replace connmark with invalid goto chain control
ok 14 6571 - Delete connmark action with valid index
ok 15 3426 - Delete connmark action with invalid index
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 14 Feb 2023 21:15:31 +0000 (18:15 -0300)]
net/sched: act_nat: transition to percpu stats and rcu
The tc action act_nat was using shared stats and taking the per action
lock in the datapath. Improve it by using percpu stats and rcu.
perf before:
- 10.48% tcf_nat_act
- 81.83% _raw_spin_lock
81.08% native_queued_spin_lock_slowpath
perf after:
- 0.48% tcf_nat_act
tdc results:
1..27
ok 1 7565 - Add nat action on ingress with default control action
ok 2 fd79 - Add nat action on ingress with pipe control action
ok 3 eab9 - Add nat action on ingress with continue control action
ok 4 c53a - Add nat action on ingress with reclassify control action
ok 5 76c9 - Add nat action on ingress with jump control action
ok 6 24c6 - Add nat action on ingress with drop control action
ok 7 2120 - Add nat action on ingress with maximum index value
ok 8 3e9d - Add nat action on ingress with invalid index value
ok 9 f6c9 - Add nat action on ingress with invalid IP address
ok 10 be25 - Add nat action on ingress with invalid argument
ok 11 a7bd - Add nat action on ingress with DEFAULT IP address
ok 12 ee1e - Add nat action on ingress with ANY IP address
ok 13 1de8 - Add nat action on ingress with ALL IP address
ok 14 8dba - Add nat action on egress with default control action
ok 15 19a7 - Add nat action on egress with pipe control action
ok 16 f1d9 - Add nat action on egress with continue control action
ok 17 6d4a - Add nat action on egress with reclassify control action
ok 18 b313 - Add nat action on egress with jump control action
ok 19 d9fc - Add nat action on egress with drop control action
ok 20 a895 - Add nat action on egress with DEFAULT IP address
ok 21 2572 - Add nat action on egress with ANY IP address
ok 22 37f3 - Add nat action on egress with ALL IP address
ok 23 6054 - Add nat action on egress with cookie
ok 24 79d6 - Add nat action on ingress with cookie
ok 25 4b12 - Replace nat action with invalid goto chain control
ok 26 b811 - Delete nat action with valid index
ok 27 a521 - Delete nat action with invalid index
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 16 Feb 2023 09:11:29 +0000 (10:11 +0100)]
Merge branch 'net-core-commmon-prints-for-promisc'
Jesse Brandeburg says:
====================
net/core: commmon prints for promisc
Add a print to the kernel log for allmulticast entry and exit, and
standardize the print for entry and exit of promiscuous mode.
These prints are useful to both user and developer and should have the
triggering driver/bus/device info that netdev_info (optionally) gives.
====================
Link: https://lore.kernel.org/r/20230214210117.23123-1-jesse.brandeburg@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jesse Brandeburg [Tue, 14 Feb 2023 21:01:17 +0000 (13:01 -0800)]
net/core: refactor promiscuous mode message
The kernel stack can be more consistent by printing the IFF_PROMISC
aka promiscuous enable/disable messages with the standard netdev_info
message which can include bus and driver info as well as the device.
typical command usage from user space looks like:
ip link set eth0 promisc <on|off>
But lots of utilities such as bridge, tcpdump, etc put the interface into
promiscuous mode.
old message:
[ 406.034418] device eth0 entered promiscuous mode
[ 408.424703] device eth0 left promiscuous mode
new message:
[ 406.034431] ice 0000:17:00.0 eth0: entered promiscuous mode
[ 408.424715] ice 0000:17:00.0 eth0: left promiscuous mode
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jesse Brandeburg [Tue, 14 Feb 2023 21:01:16 +0000 (13:01 -0800)]
net/core: print message for allmulticast
When the user sets or clears the IFF_ALLMULTI flag in the netdev, there are
no log messages printed to the kernel log to indicate anything happened.
This is inexplicably different from most other dev->flags changes, and
could suprise the user.
Typically this occurs from user-space when a user:
ip link set eth0 allmulticast <on|off>
However, other devices like bridge set allmulticast as well, and many
other flows might trigger entry into allmulticast as well.
The new message uses the standard netdev_info print and looks like:
[ 413.246110] ixgbe 0000:17:00.0 eth0: entered allmulticast mode
[ 415.977184] ixgbe 0000:17:00.0 eth0: left allmulticast mode
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Kees Cook [Wed, 15 Feb 2023 22:41:14 +0000 (14:41 -0800)]
wifi: brcmfmac: p2p: Introduce generic flexible array frame member
Silence run-time memcpy() false positive warning when processing
management frames:
memcpy: detected field-spanning write (size 27) of single field "&mgmt_frame->u" at drivers/net/wireless/broadcom/brcm80211/brcmfmac/p2p.c:1469 (size 26)
Due to this (soon to be fixed) GCC bug[1], FORTIFY_SOURCE (via
__builtin_dynamic_object_size) doesn't recognize that the union may end
with a flexible array, and returns "26" (the fixed size of the union),
rather than the remaining size of the allocation. Add an explicit
flexible array member and set it as the destination here, so that we
get the correct coverage for the memcpy().
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Arend van Spriel <aspriel@gmail.com>
Cc: Franky Lin <franky.lin@broadcom.com>
Cc: Hante Meuleman <hante.meuleman@broadcom.com>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Brian Henriquez <brian.henriquez@cypress.com>
Cc: linux-wireless@vger.kernel.org
Cc: brcm80211-dev-list.pdl@broadcom.com
Cc: SHA-cyfmac-dev-list@infineon.com
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230215224110.never.022-kees@kernel.org
[rename 'frame' to 'body']
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Paolo Abeni [Thu, 16 Feb 2023 08:27:09 +0000 (09:27 +0100)]
Merge branch 'net-sched-retire-some-tc-qdiscs-and-classifiers'
Jamal Hadi Salim says:
====================
net/sched: Retire some tc qdiscs and classifiers
The CBQ + dsmark qdiscs and the tcindex + rsvp classifiers have served us for
over 2 decades. Unfortunately, they have not been getting much attention due
to reduced usage. While we dont have a good metric for tabulating how much use
a specific kernel feature gets, for these specific features we observed that
some of the functionality has been broken for some time and no users complained.
In addition, syzkaller has been going to town on most of these and finding
issues; and while we have been fixing those issues, at times it becomes obvious
that we would need to perform bigger surgeries to resolve things found while
getting a syzkaller fix in place. After some discussion we feel that in order
to reduce the maintenance burden it is best to retire them.
This patchset leaves the UAPI alone. I could send another version which deletes
the UAPI as well. AFAIK, this has not been done before - so it wasnt clear what
how to handle UAPI. It seems legit to just delete it but we would need to
coordinate with iproute2 (given they sync up with kernel uapi headers). There
are probably other users we don't know of that copy kernel headers.
If folks feel differently I will resend the patches deleting UAPI for these
qdiscs and classifiers.
I will start another thread on iproute2 before sending any patches to iproute2.
====================
Link: https://lore.kernel.org/r/20230214134915.199004-1-jhs@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:15 +0000 (08:49 -0500)]
net/sched: Retire rsvp classifier
The rsvp classifier has served us well for about a quarter of a century but has
has not been getting much maintenance attention due to lack of known users.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:14 +0000 (08:49 -0500)]
net/sched: Retire tcindex classifier
The tcindex classifier has served us well for about a quarter of a century
but has not been getting much TLC due to lack of known users. Most recently
it has become easy prey to syzkaller. For this reason, we are retiring it.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:13 +0000 (08:49 -0500)]
net/sched: Retire dsmark qdisc
The dsmark qdisc has served us well over the years for diffserv but has not
been getting much attention due to other more popular approaches to do diffserv
services. Most recently it has become a shooting target for syzkaller. For this
reason, we are retiring it.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:12 +0000 (08:49 -0500)]
net/sched: Retire ATM qdisc
The ATM qdisc has served us well over the years but has not been getting much
TLC due to lack of known users. Most recently it has become a shooting target
for syzkaller. For this reason, we are retiring it.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jamal Hadi Salim [Tue, 14 Feb 2023 13:49:11 +0000 (08:49 -0500)]
net/sched: Retire CBQ qdisc
While this amazing qdisc has served us well over the years it has not been
getting any tender love and care and has bitrotted over time.
It has become mostly a shooting target for syzkaller lately.
For this reason, we are retiring it. Goodbye CBQ - we loved you.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 16 Feb 2023 07:59:51 +0000 (08:59 +0100)]
Merge branch 'adding-sparx5-es0-vcap-support'
Steen Hegelund says:
====================
Adding Sparx5 ES0 VCAP support
This provides the Egress Stage 0 (ES0) VCAP (Versatile Content-Aware
Processor) support for the Sparx5 platform.
The ES0 VCAP is an Egress Access Control VCAP that uses frame keyfields and
previously classified keyfields to add, rewrite or remove VLAN tags on the
egress frames, and is therefore often referred to as the rewriter.
The ES0 VCAP also supports trapping frames to the host.
The ES0 VCAP has 1 lookup accessible with this chain id:
- chain
10000000: ES0 Lookup 0
The ES0 VCAP does not do traffic classification to select a keyset, but it
does have two keysets that can be used on all traffic. For now only the
ISDX keyset is used.
The ES0 VCAP can match on an ISDX key (Ingress Service Index) as one of the
frame metadata keyfields, similar to the ES2 VCAP.
The ES0 VCAP uses external counters in the XQS (statistics) group.
====================
Link: https://lore.kernel.org/r/20230214104049.1553059-1-steen.hegelund@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:49 +0000 (11:40 +0100)]
net: microchip: sparx5: Add TC vlan action support for the ES0 VCAP
This provides these 3 actions for rule in the ES0 VCAP:
- action vlan pop
- action vlan modify id X priority Y
- action vlan push id X priority Y protocol Z
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:48 +0000 (11:40 +0100)]
net: microchip: sparx5: Add TC support for the ES0 VCAP
This enables the TC command to use the Sparx5 ES0 VCAP, and handling of
rule links between IS0 and ES0.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:47 +0000 (11:40 +0100)]
net: microchip: sparx5: Add ES0 VCAP keyset configuration for Sparx5
This adds the ES0 VCAP port keyset configuration for Sparx5.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:46 +0000 (11:40 +0100)]
net: microchip: sparx5: Updated register interface with VCAP ES0 access
This provides access to the ES0 VCAP register targets
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:45 +0000 (11:40 +0100)]
net: microchip: sparx5: Add ES0 VCAP model and updated KUNIT VCAP model
This provides the VCAP model for the Sparx5 ES0 (Egress Stage 0) VCAP.
This VCAP provides rewriting functionality in the egress path.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:44 +0000 (11:40 +0100)]
net: microchip: sparx5: Improve the error handling for linked rules
Ensure that an error is returned if the VCAP instance was not found.
The chain offset (diff) is allowed to be zero as this just means that the
user did not request rules to be linked.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:43 +0000 (11:40 +0100)]
net: microchip: sparx5: Use chain ids without offsets when enabling rules
This improves the check performed on linked rules when enabling or
disabling them. The chain id used must be the chain id without the offset
used for linking the rules.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:42 +0000 (11:40 +0100)]
net: microchip: sparx5: Egress VLAN TPID configuration follows IFH
This changes the TPID of the egress frames to use the TPID stored in the
IFH (internal frame header), which ensures that this is the TPID classified
for the frame at ingress.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:41 +0000 (11:40 +0100)]
net: microchip: sparx5: Clear rule counter even if lookup is disabled
The rule counter must be cleared when creating a new rule, even if the VCAP
lookup is currently disabled.
This ensures that rules located in VCAPs that use external counters (such
as Sparx5 IS2 and ES0) will have their counter reset even if the VCAP
lookup is not enabled at the moment.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Fixes:
95fa74148daa ("net: microchip: sparx5: Reset VCAP counter for new rules")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steen Hegelund [Tue, 14 Feb 2023 10:40:40 +0000 (11:40 +0100)]
net: microchip: sparx5: Discard frames with SMAC multicast addresses
A valid frame should never use a multicast address as its source MAC
address, so discard these invalid frames.
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Thu, 16 Feb 2023 05:52:33 +0000 (21:52 -0800)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-02-14 (ice)
This series contains updates to ice driver only.
Karol extends support for GPIO pins to E823 devices.
Daniel Vacek stops processing of PTP packets when link is down.
Pawel adds support for BIG TCP for IPv6.
Tony changes return type of ice_vsi_realloc_stat_arrays() as it always
returns success.
Zhu Yanjun updates kdoc stating supported TLVs.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Mention CEE DCBX in code comment
ice: Change ice_vsi_realloc_stat_arrays() to void
ice: add support BIG TCP on IPv6
ice/ptp: fix the PTP worker retrying indefinitely if the link went down
ice: Add GPIO pin support for E823 products
====================
Link: https://lore.kernel.org/r/20230214213003.2117125-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>