Mengyuan Lou [Tue, 30 May 2023 02:26:30 +0000 (10:26 +0800)]
net: ngbe: Implement vlan add and remove ops
ngbe add ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid.
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mengyuan Lou [Tue, 30 May 2023 02:26:29 +0000 (10:26 +0800)]
net: ngbe: Add netdev features support
Add features and hw_features that ngbe can support.
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mengyuan Lou [Tue, 30 May 2023 02:26:28 +0000 (10:26 +0800)]
net: libwx: Implement xx_set_features ops
Implement wx_set_features function which to support
ndo_set_features.
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mengyuan Lou [Tue, 30 May 2023 02:26:27 +0000 (10:26 +0800)]
net: wangxun: Implement vlan add and kill functions
Implement vlan add/kill functions which add and remove
vlan id in hardware.
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mengyuan Lou [Tue, 30 May 2023 02:26:26 +0000 (10:26 +0800)]
net: wangxun: libwx add rx offload functions
Add rx offload functions for wx_clean_rx_irq
which supports ngbe and txgbe to implement
rx offload function.
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mengyuan Lou [Tue, 30 May 2023 02:26:25 +0000 (10:26 +0800)]
net: wangxun: libwx add tx offload functions
Add tx offload functions for wx_xmit_frame_ring which
includes wx_encode_tx_desc_ptype, wx_tso and wx_tx_csum.
which supports ngbe and txgbe to implement tx offload
function.
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 31 May 2023 01:55:23 +0000 (18:55 -0700)]
devlink: make health report on unregistered instance warn just once
Devlink health is involved in error recovery. Machines in bad
state tend to be fairly unreliable, and occasionally get stuck
in error loops. Even with a reasonable grace period devlink health
may get a thousand reports in an hour.
In case of reporting on an unregistered devlink instance
the subsequent reports don't add much value. Switch to
WARN_ON_ONCE() to avoid flooding dmesg and fleet monitoring
dashboards.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230531015523.48961-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 1 Jun 2023 05:33:46 +0000 (22:33 -0700)]
Merge branch 'add-support-for-vsc85xx-dt-rgmii-delays'
Harini Katakam says:
====================
Add support for VSC85xx DT RGMII delays
Provide an option to change RGMII delay value via devicetree.
====================
Link: https://lore.kernel.org/r/20230529122017.10620-1-harini.katakam@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Harini Katakam [Mon, 29 May 2023 12:20:17 +0000 (17:50 +0530)]
phy: mscc: Add support for RGMII delay configuration
Add support for optional rx/tx-internal-delay-ps from devicetree.
- When rx/tx-internal-delay-ps is/are specified, these take priority
- When either is absent,
1) use 2ns for respective settings if rgmii-id/rxid/txid is/are present
2) use 0.2ns for respective settings if mode is rgmii
Signed-off-by: Harini Katakam <harini.katakam@amd.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Harini Katakam [Mon, 29 May 2023 12:20:16 +0000 (17:50 +0530)]
phy: mscc: Use PHY_ID_MATCH_VENDOR to minimize PHY ID table
All the PHY devices variants specified have the same mask and
hence can be simplified to one vendor look up for 0x00070400.
Any individual config can be identified by PHY_ID_MATCH_EXACT
in the respective structure.
Signed-off-by: Harini Katakam <harini.katakam@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Sun, 28 May 2023 17:39:59 +0000 (19:39 +0200)]
net: don't set sw irq coalescing defaults in case of PREEMPT_RT
If PREEMPT_RT is set, then assume that the user focuses on minimum
latency. Therefore don't set sw irq coalescing defaults.
This affects the defaults only, users can override these settings
via sysfs.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/f9439c7f-c92c-4c2c-703e-110f96d841b7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Wed, 31 May 2023 09:00:30 +0000 (10:00 +0100)]
Merge branch 'xstats-for-tc-taprio'
Vladimir Oltean says:
====================
xstats for tc-taprio
As a result of this discussion:
https://lore.kernel.org/intel-wired-lan/
20230411055543.24177-1-muhammad.husaini.zulkifli@intel.com/
it became apparent that tc-taprio should make an effort to standardize
statistics counters related to the 802.1Qbv scheduling as implemented
by the NIC. I'm presenting here one counter suggested by the standard,
and one counter defined by the NXP ENETC controller from LS1028A. Both
counters are reported globally and per traffic class - drivers get
different callbacks for reporting both of these, and get to choose what
to report in both cases.
The iproute2 counterpart is available here for testing:
https://github.com/vladimiroltean/iproute2/commits/taprio-xstats
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 30 May 2023 09:19:48 +0000 (12:19 +0300)]
net: enetc: report statistics counters for taprio
Report the "win_drop" counter from the unstructured ethtool -S as
TCA_TAPRIO_OFFLOAD_STATS_WINDOW_DROPS to the Qdisc layer. It is
available both as a global counter as well as a per-TC one.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 30 May 2023 09:19:47 +0000 (12:19 +0300)]
net: enetc: refactor enetc_setup_tc_taprio() to have a switch/case for cmd
Make enetc_setup_tc_taprio() more amenable to future extensions, like
reporting statistics.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 30 May 2023 09:19:46 +0000 (12:19 +0300)]
net/sched: taprio: add netlink reporting for offload statistics counters
Offloading drivers may report some additional statistics counters, some
of them even suggested by 802.1Q, like TransmissionOverrun.
In my opinion we don't have to limit ourselves to reporting counters
only globally to the Qdisc/interface, especially if the device has more
detailed reporting (per traffic class), since the more detailed info is
valuable for debugging and can help identifying who is exceeding its
time slot.
But on the other hand, some devices may not be able to report both per
TC and global stats.
So we end up reporting both ways, and use the good old ethtool_put_stat()
strategy to determine which statistics are supported by this NIC.
Statistics which aren't set are simply not reported to netlink. For this
reason, we need something dynamic (a nlattr nest) to be reported through
TCA_STATS_APP, and not something daft like the fixed-size and
inextensible struct tc_codel_xstats. A good model for xstats which are a
nlattr nest rather than a fixed struct seems to be cake.
# Global stats
$ tc -s qdisc show dev eth0 root
# Per-tc stats
$ tc -s class show dev eth0
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 30 May 2023 09:19:45 +0000 (12:19 +0300)]
net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum
Inspired from struct flow_cls_offload :: cmd, in order for taprio to be
able to report statistics (which is future work), it seems that we need
to drill one step further with the ndo_setup_tc(TC_SETUP_QDISC_TAPRIO)
multiplexing, and pass the command as part of the common portion of the
muxed structure.
Since we already have an "enable" variable in tc_taprio_qopt_offload,
refactor all drivers to check for "cmd" instead of "enable", and reject
every other command except "replace" and "destroy" - to be future proof.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> # for lan966x
Acked-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Muhammad Husaini Zulkifli <muhammad.husaini.zulkifli@intel.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 30 May 2023 09:19:44 +0000 (12:19 +0300)]
net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()
In taprio_dump_class_stats() we don't need a reference to the root Qdisc
once we get the reference to the child corresponding to this traffic
class, so it's okay to overwrite "sch". But in a future patch we will
need the root Qdisc too, so create a dedicated "child" pointer variable
to hold the child reference. This also makes the code adhere to a more
conventional coding style.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 31 May 2023 08:56:08 +0000 (09:56 +0100)]
Merge branch 'dsa-marvell-mv88e6071-and-6020-support'
Lukasz Majewski says:
====================
dsa: marvell: Add support for mv88e6071 and 6020 switches
After the commit (SHA1:
7e9517375a14f44ee830ca1c3278076dd65fcc8f);
"net: dsa: mv88e6xxx: fix max_mtu of 1492 on 6165, 6191, 6220, 6250, 6290" the
error when mv88e6020 or mv88e6071 is used is not present anymore.
As a result patches for adding max frame size are not required to provide
working setup with aforementioned switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lukasz Majewski [Tue, 30 May 2023 08:39:16 +0000 (10:39 +0200)]
net: dsa: mv88e6xxx: add support for MV88E6071 switch
A mv88e6250 family switch with 5 internal PHYs, 2 RMIIs
and no PTP support.
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Schiffer [Tue, 30 May 2023 08:39:15 +0000 (10:39 +0200)]
net: dsa: mv88e6xxx: add support for MV88E6020 switch
A mv88e6250 family switch with 2 PHY and RMII ports and
no PTP support.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lukasz Majewski [Tue, 30 May 2023 08:39:14 +0000 (10:39 +0200)]
net: dsa: Define .set_max_frame_size() callback for mv88e6250 SoC family
Switches from mv88e6250 family (including mv88e6020 and mv88e6071) need
the possibility to setup the maximal frame size, as they support frames
up to 2048 bytes.
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Uwe Kleine-König [Tue, 30 May 2023 06:39:36 +0000 (08:39 +0200)]
net: dsa: Switch i2c drivers back to use .probe()
After commit
b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new()
call-back type"), all drivers being converted to .probe_new() and then
03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert
back to (the new) .probe() to be able to eventually drop .probe_new() from
struct i2c_driver.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parav Pandit [Mon, 29 May 2023 13:44:30 +0000 (16:44 +0300)]
net: Make gro complete function to return void
tcp_gro_complete() function only updates the skb fields related to GRO
and it always returns zero. All the 3 drivers which are using it
do not check for the return value either.
Change it to return void instead which simplifies its callers as
error handing becomes unnecessary.
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 31 May 2023 08:42:09 +0000 (09:42 +0100)]
Merge branch 'net-led-hw-control-api'
Christian Marangi says:
====================
leds: introduce new LED hw control APIs
Since this series is cross subsystem between LED and netdev,
a stable branch was created to facilitate merging process.
This is based on top of branch ib-leds-netdev-v6.5 present here [1]
and rebased on top of net-next since the LED stable branch got merged.
This is a continue of [2]. It was decided to take a more gradual
approach to implement LEDs support for switch and phy starting with
basic support and then implementing the hw control part when we have all
the prereq done.
This is the main part of the series, the one that actually implement the
hw control API.
Some history about this feature and why
=======================================
This proposal is highly requested by the entire net community but the API
is not strictly designed for net usage but for a more generic usage.
Initial version were very flexible and designed to try to support every
aspect of the LED driver with many complex function that served multiple
purpose. There was an idea to have sw only and hw only LEDs and sw only
and hw only LEDs.
With some heads up from Andrew from the net mailing list, it was suggested
to implement a more basic yet easy to implement system.
These API strictly work with a designated trigger to offload their
function.
This may be confused with hw blink offload but LED may have an even more
advanced configuration where the entire aspect of the trigger is
offloaded and completely handled by the hardware.
An example of this usage are PHY or switch port LEDs. Almost every of
these kind of device have multiple LED attached and provide info of the
current port state.
Currently we lack any support of them but these device always provide a
way to configure them, from basic feature like turning the LED off or no
(implemented in previous series related to this feature) or even entirely
driven by the hw and power on/off/blink based on some events, like tx/rx
traffic, ethernet cable attached, link speed of 10mbps, 100mbps, 1000mbps
or more. They can also support multiple logic like blink with traffic only
if a particular link speed is attached. (an example of this is when a LED
is designated to be turned on only with 100mbps link speed and configured
to blink on traffic and a secondary LED of a different color is present to
serve the same function but only when the link speed is 1000mbps)
These case are very common for a PHY or a switch but they were never
standardized so OEM support all kind of variant and configuration.
Again with Andrew we compared some feature and we reached a common set
of modes that are for sure present in every kind of devices.
And this concludes history and why.
What is present in this series
==============================
This patch contain the required API to support this feature, I decided on
the name of hw control to quickly describe this feature.
I documented each require API in the related Documentation for leds-class
so I think it might me redundant to expose them here. Feel free to tell me
how to improve it if anything is not clear.
On an abstract idea, this feature require this:
- The trigger needs to make use of it, this is currently implemented
for the netdev trigger but other trigger can be expanded if the
device expose these function. An idea might be a anything that
handle a storage disk and have the LED configurable to blink when
there is any activity to the disk.
- The LED driver needs to expose and implement these new API.
Currently a LED driver supports only a trigger. The trigger should use
the related helper to check if the LED can be driven hy hardware.
The different modes a trigger support are exposed in the kernel include
leds.h header and are used by the LED driver to understand what to do.
From a user standpoint, he should enable modes as usual from sysfs and if
anything is not supported warned.
Final words and missing piece from this series
==============================================
I honestly hope this feature can finally be implemented.
This series originally had also additional modes and logic to add to the
netdev trigger, but I decided to strip them and implement only the API
and support basic tx and rx. After this is merged, I will quickly propose
these additional modes.
Currently this is limited to tx and rx and this is what the current user
qca8k use. Marvell PHY support link and a generic blink with any kind of
traffic (both rx and tx). qca8k switch supports keeping the LED on based on
link speed.
The next series will add the concept of hw control only modes to the netdev
trigger and support for these additional modes:
- link_10
- link_100
- link_1000
- activity
The current implementation is voluntary basic and limited to put the ground
work and have something easy to implement and usable. 99% part of the logic
is done on the trigger side, leaving to the LED driver only the validating
and the apply part.
As shown for the PHY led binding, people are really intrested in this
feature as quickly after they were merged, people were already working on
adding support for it.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/lee/leds.git/?h=ib-leds-netdev-6.5
[2] https://lore.kernel.org/lkml/
20230216013230.22978-1-ansuelsmth@gmail.com/
Changes in v4:
- Added review tag from Andrew.
- Move default interval to a define to keep them synced.
- Apply suggested reword to improve Documentation rst.
Changes in v3:
- Rebased on top of net-next
Changes in v2:
- Drop helper as currently used only by one trigger
- Improve Documentation and document return error of some functions
- Squash some patch to reduce series size
- Drop trigger mode mask as currently not used
- Rework hw control validating function to a simple implementation
Changes from previous v8 series:
- Rewrite Documentation from scratch and move to separate commit
- Strip additional trigger modes (to propose in a different series)
- Strip from qca8k driver additional modes (to implement in the different
series)
- Split the netdev chages to smaller piece to permit easier review
Changelog in the previous v8 series: (stripped of unrelated changes)
v8:
- Improve the documentation of the new feature
- Rename to a more symbolic name
- Fix some bug in netdev trigger (not using BIT())
- Add more define for qca8k-leds driver
- Drop interval support
- Fix many bugs in the validate option in the netdev trigger
v7:
- Fix qca8k leds documentation warning
- Remove RFC tag
v6:
- Back to RFC.
- Drop additional trigger
- Rework netdev trigger to support common modes used by switch and
hardware only triggers
- Refresh qca8k leds logic and driver
v5:
- Move out of RFC. (no comments from Andrew this is the right path?)
- Fix more spelling mistake (thx Randy)
- Fix error reported by kernel test bot
- Drop the additional HW_CONTROL flag. It does simplify CONFIG
handling and hw control should be available anyway to support
triggers as module.
v4:
- Rework implementation and drop hw_configure logic.
We now expand blink_set.
- Address even more spelling mistake. (thx a lot Randy)
- Drop blink option and use blink_set delay.
v3:
- Rework start/stop as Andrew asked.
- Use test_bit API to check flag passed to hw_control_configure.
- Added a new cmd to hw_control_configure to reset any active blink_mode.
- Refactor all the patches to follow this new implementation.
v2:
- Fix spelling mistake (sorry)
- Drop patch 02 "permit to declare supported offload triggers".
Change the logic, now the LED driver declare support for them
using the configure_offload with the cmd TRIGGER_SUPPORTED.
- Rework code to follow this new implementation.
- Update Documentation to better describe how this offload
implementation work.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 29 May 2023 16:32:43 +0000 (18:32 +0200)]
net: dsa: qca8k: add op to get ports netdev
In order that the LED trigger can blink the switch MAC ports LED, it
needs to know the netdev associated to the port. Add the callback to
return the struct device of the netdev.
Add an helper function qca8k_phy_to_port() to convert the phy back to
dsa_port index, as we reference LED port based on the internal PHY
index and needs to be converted back.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:42 +0000 (18:32 +0200)]
net: dsa: qca8k: implement hw_control ops
Implement hw_control ops to drive Switch LEDs based on hardware events.
Netdev trigger is the declared supported trigger for hw control
operation and supports the following mode:
- tx
- rx
When hw_control_set is called, LEDs are set to follow the requested
mode.
Each LEDs will blink at 4Hz by default.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:41 +0000 (18:32 +0200)]
leds: trigger: netdev: expose netdev trigger modes in linux include
Expose netdev trigger modes to make them accessible by LED driver that
will support netdev trigger for hw control.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:40 +0000 (18:32 +0200)]
leds: trigger: netdev: init mode if hw control already active
On netdev trigger activation, hw control may be already active by
default. If this is the case and a device is actually provided by
hw_control_get_device(), init the already active mode and set the
bool to hw_control bool to true to reflect the already set mode in the
trigger_data.
Co-developed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 29 May 2023 16:32:39 +0000 (18:32 +0200)]
leds: trigger: netdev: validate configured netdev
The netdev which the LED should blink for is configurable in
/sys/class/led/foo/device_name. Ensure when offloading that the
configured netdev is the same as the netdev the LED is associated
with. If it is not, only perform software blinking.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:38 +0000 (18:32 +0200)]
leds: trigger: netdev: add support for LED hw control
Add support for LED hw control for the netdev trigger.
The trigger on calling set_baseline_state to configure a new mode, will
do various check to verify if hw control can be used for the requested
mode in can_hw_control() function.
It will first check if the LED driver supports hw control for the netdev
trigger, then will use hw_control_is_supported() and finally will call
hw_control_set() to apply the requested mode.
To use such mode, interval MUST be set to the default value and net_dev
MUST be set. If one of these 2 value are not valid, hw control will
never be used and normal software fallback is used.
The default interval value is moved to a define to make sure they are
always synced.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:37 +0000 (18:32 +0200)]
leds: trigger: netdev: reject interval store for hw_control
Reject interval store with hw_control enabled. It's are currently not
supported and MUST be set to the default value with hw control enabled.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:36 +0000 (18:32 +0200)]
leds: trigger: netdev: add basic check for hw control support
Add basic check for hw control support. Check if the required API are
defined and check if the defined trigger supported in hw control for the
LED driver match netdev.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:35 +0000 (18:32 +0200)]
leds: trigger: netdev: introduce check for possible hw control
Introduce function to check if the requested mode can use hw control in
preparation for hw control support. Currently everything is handled in
software so can_hw_control will always return false.
Add knob with the new value hw_control in trigger_data struct to
set hw control possible. Useful for future implementation to implement
in set_baseline_state() the required function to set the requested mode
using LEDs hw control ops and in other function to reject set if hw
control is currently active.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 29 May 2023 16:32:34 +0000 (18:32 +0200)]
leds: trigger: netdev: refactor code setting device name
Move the code into a helper, ready for it to be called at
other times. No intended behaviour change.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:33 +0000 (18:32 +0200)]
Documentation: leds: leds-class: Document new Hardware driven LEDs APIs
Document new Hardware driven LEDs APIs.
Some LEDs can be programmed to be driven by hardware. This is not
limited to blink but also to turn off or on autonomously.
To support this feature, a LED needs to implement various additional
ops and needs to declare specific support for the supported triggers.
Add documentation for each required value and API to make hw control
possible and implementable by both LEDs and triggers.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 29 May 2023 16:32:32 +0000 (18:32 +0200)]
leds: add API to get attached device for LED hw control
Some specific LED triggers blink the LED based on events from a device
or subsystem.
For example, an LED could be blinked to indicate a network device is
receiving packets, or a disk is reading blocks. To correctly enable and
request the hw control of the LED, the trigger has to check if the
network interface or block device configured via a /sys/class/led file
match the one the LED driver provide for hw control for.
Provide an API call to get the device which the LED blinks for.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Marangi [Mon, 29 May 2023 16:32:31 +0000 (18:32 +0200)]
leds: add APIs for LEDs hw control
Add an option to permit LED driver to declare support for a specific
trigger to use hw control and setup the LED to blink based on specific
provided modes.
Add APIs for LEDs hw control. These functions will be used to activate
hardware control where a LED will use the provided flags, from an
unique defined supported trigger, to setup the LED to be driven by
hardware.
Add hw_control_is_supported() to ask the LED driver if the requested
mode by the trigger are supported and the LED can be setup to follow
the requested modes.
Deactivate hardware blink control by setting brightness to LED_OFF via
the brightness_set() callback.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 29 May 2023 14:52:13 +0000 (10:52 -0400)]
tipc: delete tipc_mtu_bad from tipc_udp_enable
Since commit
a4dfa72d0acd ("tipc: set default MTU for UDP media"), it's
been no longer using dev->mtu for b->mtu, and the issue described in
commit
3de81b758853 ("tipc: check minimum bearer MTU") doesn't exist
in UDP bearer any more.
Besides, dev->mtu can still be changed to a too small mtu after the UDP
bearer is created even with tipc_mtu_bad() check in tipc_udp_enable().
Note that NETDEV_CHANGEMTU event processing in tipc_l2_device_event()
doesn't really work for UDP bearer.
So this patch deletes the unnecessary tipc_mtu_bad from tipc_udp_enable.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/282f1f5cc40e6cad385aa1c60569e6c5b70e2fb3.1685371933.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 31 May 2023 06:54:35 +0000 (23:54 -0700)]
Merge branch 'net-dsa-mv88e6xxx-add-
88e6361-support'
Alexis Lothoré says:
====================
net: dsa: mv88e6xxx: add
88E6361 support
This series brings initial support for Marvell
88E6361 switch.
MV88E6361 is a 8 ports switch with 5 integrated Gigabit PHYs and 3
2.5Gigabit SerDes interfaces. It is in fact a new variant in the
88E639X/88E6193X/88E6191X family with a subset of existing features:
- port 0: MII, RMII, RGMII, 1000BaseX, 2500BaseX
- port 3 to 7: triple speed internal phys
- port 9 and 10: 1000BaseX, 25000BaseX
Since said family is already well supported in mv88e6xxx driver, adding
initial support for this new switch mostly consists in finding the ID
exposed in its identification register, adding a proper description
in switch description tables in mv88e6xxx driver, and enforcing
88E6361
specificities in mv88e6393x_XXX methods.
- first 4 commits introduce an internal phy offset field for switches which
have internal phys but not starting from port 0
- 5th commit is a fix on existing switches based on first commits
- 6th commit is a slight modification to prepare 886361 support
- last commit introduces
88E6361 support in 88E6393X family
This initial support has been tested with two samples of a custom board
with the following hardware configuration:
- a main CPU connected to MV88E6361 using port 0 as CPU port
- port 9 wired to a SFP cage
- port 10 wired to a G.Hn transceiver
The following setup was used:
PC <-ethernet-> (copper SFP) - Board 1 - (G.hn) <-phone line(RJ11)-> (G.hn) Board 2
The unit 1 has been configured to bridge SFP port and G.hn port together,
which allowed to successfully ping Board 2 from PC.
====================
Link: https://lore.kernel.org/r/20230529080246.82953-1-alexis.lothore@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré [Mon, 29 May 2023 08:02:46 +0000 (10:02 +0200)]
net: dsa: mv88e6xxx: enable support for
88E6361 switch
Marvell
88E6361 is an 8-port switch derived from the
88E6393X/88E9193X/88E6191X switches family. It can benefit from the
existing mv88e6xxx driver by simply adding the proper switch description in
the driver. Main differences with other switches from this
family are:
- 8 ports exposed (instead of 11): ports 1, 2 and 8 not available
- No 5GBase-x nor SFI/USXGMII support
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré [Mon, 29 May 2023 08:02:45 +0000 (10:02 +0200)]
net: dsa: mv88e6xxx: pass mv88e6xxx_chip structure to port_max_speed_mode
Some switches families have minor differences on supported link speed for
ports. Instead of redefining a new port_max_speed_mode for each different
configuration, allow to pass mv88e6xxx_chip structure to allow
differentiating those chips by known chip id
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré [Mon, 29 May 2023 08:02:44 +0000 (10:02 +0200)]
net: dsa: mv88e6xxx: fix 88E6393X family internal phys layout
88E6393X/88E6193X/88E6191X switches have in fact 8 internal PHYs, but those
are not present starting at port 0: supported ports go from 1 to 8
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré [Mon, 29 May 2023 08:02:43 +0000 (10:02 +0200)]
net: dsa: mv88e6xxx: add field to specify internal phys layout
mv88e6xxx currently assumes that switch equipped with internal phys have
those phys mapped contiguously starting from port 0 (see
mv88e6xxx_phy_is_internal). However, some switches have internal PHYs but
NOT starting from port 0. For example 88e6393X, 88E6193X and 88E6191X have
integrated PHYs available on ports 1 to 8
To properly support this offset, add a new field to allow specifying an
internal PHYs layout. If field is not set, default layout is assumed (start
at port 0)
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré [Mon, 29 May 2023 08:02:42 +0000 (10:02 +0200)]
net: dsa: mv88e6xxx: use mv88e6xxx_phy_is_internal in mv88e6xxx_port_ppu_updates
Make sure to use existing helper to get internal PHYs count instead of
redoing it manually
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré [Mon, 29 May 2023 08:02:41 +0000 (10:02 +0200)]
net: dsa: mv88e6xxx: pass directly chip structure to mv88e6xxx_phy_is_internal
Since this function is a simple helper, we do not need to pass a full
dsa_switch structure, we can directly pass the mv88e6xxx_chip structure.
Doing so will allow to share this function with any other function
not manipulating dsa_switch structure but needing info about number of
internal phys
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexis Lothoré [Mon, 29 May 2023 08:02:40 +0000 (10:02 +0200)]
dt-bindings: net: dsa: marvell: add MV88E6361 switch to compatibility list
Marvell MV88E6361 is an 8-port switch derived from the
88E6393X/88E9193X/88E6191X switches family. Since its functional behavior
is very close to switches from this family, it can benefit from existing
drivers for this family, so add it to the list of compatible switches
Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 31 May 2023 06:37:02 +0000 (23:37 -0700)]
Merge branch 'add-layer-2-miss-indication-and-filtering'
Ido Schimmel says:
====================
Add layer 2 miss indication and filtering
tl;dr
=====
This patchset adds a single bit to the tc skb extension to indicate that
a packet encountered a layer 2 miss in the bridge and extends flower to
match on this metadata. This is required for non-DF (Designated
Forwarder) filtering in EVPN multi-homing which prevents decapsulated
BUM packets from being forwarded multiple times to the same multi-homed
host.
Background
==========
In a typical EVPN multi-homing setup each host is multi-homed using a
set of links called ES (Ethernet Segment, i.e., LAG) to multiple leaf
switches in a rack. These switches act as VTEPs and are not directly
connected (as opposed to MLAG), but can communicate with each other (as
well as with VTEPs in remote racks) via spine switches over L3.
When a host sends a BUM packet over ES1 to VTEP1, the VTEP will flood it
to other VTEPs in the network, including those connected to the host
over ES1. The receiving VTEPs must drop the packet and not forward it
back to the host. This is called "split-horizon filtering" (SPH) [1].
FRR configures SPH filtering using two tc filters. The first, an ingress
filter that matches on packets received from VTEP1 and marks them using
a fwmark (firewall mark). The second, an egress filter configured on the
LAG interface connected to the host that matches on the fwmark and drops
the packets. Example:
# tc filter add dev vxlan0 ingress pref 1 proto all flower enc_src_ip $VTEP1_IP action skbedit mark 101
# tc filter add dev bond0 egress pref 1 handle 101 fw action drop
Motivation
==========
For each ES, only one VTEP is elected by the control plane as the DF.
The DF is responsible for forwarding decapsulated BUM traffic to the
host over the ES. The non-DF VTEPs must drop such traffic as otherwise
the host will receive multiple copies of BUM traffic. This is called
"non-DF filtering" [2].
Filtering of multicast and broadcast traffic can be achieved using the
following flower filter:
# tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop
Unlike broadcast and multicast traffic, it is not currently possible to
filter unknown unicast traffic. The classification into unknown unicast
is performed by the bridge driver, but is not visible to other layers.
Implementation
==============
The proposed solution is to add a single bit to the tc skb extension
that is set by the bridge for packets that encountered an FDB or MDB
miss. The flower classifier is extended to be able to match on this new
metadata bit in a similar fashion to existing metadata options such as
'indev'.
A bit that is set for every flooded packet would also work, but it does
not allow us to differentiate between registered and unregistered
multicast traffic which might be useful in the future.
A relatively generic name is chosen for this bit - 'l2_miss' - to allow
its use to be extended to other layer 2 devices such as VXLAN, should a
use case arise.
With the above, the control plane can implement a non-DF filter using
the following tc filters:
# tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop
# tc filter add dev bond0 egress pref 2 proto all flower indev vxlan0 l2_miss true action drop
The first drops broadcast and multicast traffic and the second drops
unknown unicast traffic.
Testing
=======
A test exercising the different permutations of the 'l2_miss' bit is
added in patch #8.
Patchset overview
=================
Patch #1 adds the new bit to the tc skb extension and sets it in the
bridge driver for packets that encountered a miss. The marking of the
packets and the use of this extension is protected by the
'tc_skb_ext_tc' static key in order to keep performance impact to a
minimum when the feature is not in use.
Patch #2 extends the flow dissector to dissect this information from the
tc skb extension into the 'FLOW_DISSECTOR_KEY_META' key.
Patch #3 extends the flower classifier to be able to match on the new
layer 2 miss metadata. The classifier enables the 'tc_skb_ext_tc' static
key upon the installation of the first filter that matches on 'l2_miss'
and disables the key upon the removal of the last filter that matches on
it.
Patch #4 rejects matching on the new metadata in drivers that already
support the 'FLOW_DISSECTOR_KEY_META' key.
Patches #5-#6 are small preparations in mlxsw.
Patch #7 extends mlxsw to be able to match on layer 2 miss.
Patch #8 adds a selftest.
iproute2 patches can be found here [3].
[1] https://datatracker.ietf.org/doc/html/rfc7432#section-8.3
[2] https://datatracker.ietf.org/doc/html/rfc7432#section-8.5
[3] https://github.com/idosch/iproute2/tree/submit/non_df_filter_v1
[4] https://lore.kernel.org/netdev/
20230518113328.
1952135-1-idosch@nvidia.com/
[5] https://lore.kernel.org/netdev/
20230509070446.246088-1-idosch@nvidia.com/
====================
Link: https://lore.kernel.org/r/20230529114835.372140-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:35 +0000 (14:48 +0300)]
selftests: forwarding: Add layer 2 miss test cases
Add test cases to verify that the bridge driver correctly marks layer 2
misses only when it should and that the flower classifier can match on
this metadata.
Example output:
# ./tc_flower_l2_miss.sh
TEST: L2 miss - Unicast [ OK ]
TEST: L2 miss - Multicast (IPv4) [ OK ]
TEST: L2 miss - Multicast (IPv6) [ OK ]
TEST: L2 miss - Link-local multicast (IPv4) [ OK ]
TEST: L2 miss - Link-local multicast (IPv6) [ OK ]
TEST: L2 miss - Broadcast [ OK ]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:34 +0000 (14:48 +0300)]
mlxsw: spectrum_flower: Add ability to match on layer 2 miss
Add the 'fdb_miss' key element to supported key blocks and make use of
it to match on layer 2 miss.
The key is only supported on Spectrum-{2,3,4}. An error is returned for
Spectrum-1 since the key element is not present in any of its key
blocks.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:33 +0000 (14:48 +0300)]
mlxsw: spectrum_flower: Do not force matching on iif
Currently, mlxsw only supports the 'ingress_ifindex' field in the
'FLOW_DISSECTOR_KEY_META' key, but subsequent patches are going to add
support for the 'l2_miss' field as well. It is valid to only match on
'l2_miss' without 'ingress_ifindex', so do not force matching on it.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:32 +0000 (14:48 +0300)]
mlxsw: spectrum_flower: Split iif parsing to a separate function
Currently, mlxsw only supports the 'ingress_ifindex' field in the
'FLOW_DISSECTOR_KEY_META' key, but subsequent patches are going to add
support for the 'l2_miss' field as well. Split the parsing of the
'ingress_ifindex' field to a separate function to avoid nesting. No
functional changes intended.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:31 +0000 (14:48 +0300)]
flow_offload: Reject matching on layer 2 miss
Adjust drivers that support the 'FLOW_DISSECTOR_KEY_META' key to reject
filters that try to match on the newly added layer 2 miss field. Add an
extack message to clearly communicate the failure reason to user space.
The following users were not patched:
1. mtk_flow_offload_replace(): Only checks that the key is present, but
does not do anything with it.
2. mlx5_tc_ct_set_tuple_match(): Used as part of netfilter offload,
which does not make use of the new field, unlike tc.
3. get_netdev_from_rule() in nfp: Likewise.
Example:
# tc filter add dev swp1 egress pref 1 proto all flower skip_sw l2_miss true action drop
Error: mlxsw_spectrum: Can't match on "l2_miss".
We have an error talking to the kernel
Acked-by: Elad Nachman <enachman@marvell.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:30 +0000 (14:48 +0300)]
net/sched: flower: Allow matching on layer 2 miss
Add the 'TCA_FLOWER_L2_MISS' netlink attribute that allows user space to
match on packets that encountered a layer 2 miss. The miss indication is
set as metadata in the tc skb extension by the bridge driver upon FDB or
MDB lookup miss and dissected by the flow dissector to the
'FLOW_DISSECTOR_KEY_META' key.
The use of this skb extension is guarded by the 'tc_skb_ext_tc' static
key. As such, enable / disable this key when filters that match on layer
2 miss are added / deleted.
Tested:
# cat tc_skb_ext_tc.py
#!/usr/bin/env -S drgn -s vmlinux
refcount = prog["tc_skb_ext_tc"].key.enabled.counter.value_()
print(f"tc_skb_ext_tc reference count is {refcount}")
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 0
# tc filter add dev swp1 egress proto all handle 101 pref 1 flower src_mac 00:11:22:33:44:55 action drop
# tc filter add dev swp1 egress proto all handle 102 pref 2 flower src_mac 00:11:22:33:44:55 l2_miss true action drop
# tc filter add dev swp1 egress proto all handle 103 pref 3 flower src_mac 00:11:22:33:44:55 l2_miss false action drop
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 2
# tc filter replace dev swp1 egress proto all handle 102 pref 2 flower src_mac 00:01:02:03:04:05 l2_miss false action drop
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 2
# tc filter del dev swp1 egress proto all handle 103 pref 3 flower
# tc filter del dev swp1 egress proto all handle 102 pref 2 flower
# tc filter del dev swp1 egress proto all handle 101 pref 1 flower
# ./tc_skb_ext_tc.py
tc_skb_ext_tc reference count is 0
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:29 +0000 (14:48 +0300)]
flow_dissector: Dissect layer 2 miss from tc skb extension
Extend the 'FLOW_DISSECTOR_KEY_META' key with a new 'l2_miss' field and
populate it from a field with the same name in the tc skb extension.
This field is set by the bridge driver for packets that incur an FDB or
MDB miss.
The next patch will extend the flower classifier to be able to match on
layer 2 misses.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Mon, 29 May 2023 11:48:28 +0000 (14:48 +0300)]
skbuff: bridge: Add layer 2 miss indication
For EVPN non-DF (Designated Forwarder) filtering we need to be able to
prevent decapsulated traffic from being flooded to a multi-homed host.
Filtering of multicast and broadcast traffic can be achieved using the
following flower filter:
# tc filter add dev bond0 egress pref 1 proto all flower indev vxlan0 dst_mac 01:00:00:00:00:00/01:00:00:00:00:00 action drop
Unlike broadcast and multicast traffic, it is not currently possible to
filter unknown unicast traffic. The classification into unknown unicast
is performed by the bridge driver, but is not visible to other layers
such as tc.
Solve this by adding a new 'l2_miss' bit to the tc skb extension. Clear
the bit whenever a packet enters the bridge (received from a bridge port
or transmitted via the bridge) and set it if the packet did not match an
FDB or MDB entry. If there is no skb extension and the bit needs to be
cleared, then do not allocate one as no extension is equivalent to the
bit being cleared. The bit is not set for broadcast packets as they
never perform a lookup and therefore never incur a miss.
A bit that is set for every flooded packet would also work for the
current use case, but it does not allow us to differentiate between
registered and unregistered multicast traffic, which might be useful in
the future.
To keep the performance impact to a minimum, the marking of packets is
guarded by the 'tc_skb_ext_tc' static key. When 'false', the skb is not
touched and an skb extension is not allocated. Instead, only a
5 bytes nop is executed, as demonstrated below for the call site in
br_handle_frame().
Before the patch:
```
memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
c37b09: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12)
c37b10: 00 00
p = br_port_get_rcu(skb->dev);
c37b12: 49 8b 44 24 10 mov 0x10(%r12),%rax
memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
c37b17: 49 c7 44 24 30 00 00 movq $0x0,0x30(%r12)
c37b1e: 00 00
c37b20: 49 c7 44 24 38 00 00 movq $0x0,0x38(%r12)
c37b27: 00 00
```
After the patch (when static key is disabled):
```
memset(skb->cb, 0, sizeof(struct br_input_skb_cb));
c37c29: 49 c7 44 24 28 00 00 movq $0x0,0x28(%r12)
c37c30: 00 00
c37c32: 49 8d 44 24 28 lea 0x28(%r12),%rax
c37c37: 48 c7 40 08 00 00 00 movq $0x0,0x8(%rax)
c37c3e: 00
c37c3f: 48 c7 40 10 00 00 00 movq $0x0,0x10(%rax)
c37c46: 00
#ifdef CONFIG_HAVE_JUMP_LABEL_HACK
static __always_inline bool arch_static_branch(struct static_key *key, bool branch)
{
asm_volatile_goto("1:"
c37c47: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
br_tc_skb_miss_set(skb, false);
p = br_port_get_rcu(skb->dev);
c37c4c: 49 8b 44 24 10 mov 0x10(%r12),%rax
```
Subsequent patches will extend the flower classifier to be able to match
on the new 'l2_miss' bit and enable / disable the static key when
filters that match on it are added / deleted.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 30 May 2023 17:32:22 +0000 (10:32 -0700)]
Merge branch 'devlink-move-port-ops-into-separate-structure'
Jiri Pirko says:
====================
devlink: move port ops into separate structure
In devlink, some of the objects have separate ops registered alongside
with the object itself. Port however have ops in devlink_ops structure.
For drivers what register multiple kinds of ports with different ops
this is not convenient.
This patchset changes does following changes:
1) Introduces devlink_port_ops with functions that allow devlink port
to be registered passing a pointer to driver port ops. (patch #1)
2) Converts drivers to define port_ops and register ports passing the
ops pointer. (patches #2, #3, #4, #6, #8, and #9)
3) Moves ops from devlink_ops struct to devlink_port_ops.
(patches #5, #7, #10-15)
No functional changes.
====================
Link: https://lore.kernel.org/r/20230526102841.2226553-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:41 +0000 (12:28 +0200)]
devlink: save devlink_port_ops into a variable in devlink_port_function_validate()
Now when the original ops variable is removed, introduce it again
but this time for devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:40 +0000 (12:28 +0200)]
devlink: move port_del() to devlink_port_ops
Move port_del() from devlink_ops into newly introduced devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:39 +0000 (12:28 +0200)]
devlink: move port_fn_state_get/set() to devlink_port_ops
Move port_fn_state_get/set() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:38 +0000 (12:28 +0200)]
devlink: move port_fn_migratable_get/set() to devlink_port_ops
Move port_fn_migratable_get/set() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:37 +0000 (12:28 +0200)]
devlink: move port_fn_roce_get/set() to devlink_port_ops
Move port_fn_roce_get/set() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:36 +0000 (12:28 +0200)]
devlink: move port_fn_hw_addr_get/set() to devlink_port_ops
Move port_fn_hw_addr_get/set() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:35 +0000 (12:28 +0200)]
mlx5: register devlink ports with ops
Use newly introduce devlink port registration function variant and
register devlink port passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:34 +0000 (12:28 +0200)]
sfc: register devlink port with ops
Use newly introduce devlink port registration function variant and
register devlink port passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:33 +0000 (12:28 +0200)]
devlink: move port_type_set() op into devlink_port_ops
Move port_type_set() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:32 +0000 (12:28 +0200)]
mlx4: register devlink port with ops
Use newly introduce devlink port registration function variant and
register devlink port passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:31 +0000 (12:28 +0200)]
devlink: move port_split/unsplit() ops into devlink_port_ops
Move port_split/unsplit() from devlink_ops into newly introduced
devlink_port_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:30 +0000 (12:28 +0200)]
nfp: devlink: register devlink port with ops
Use newly introduce devlink port registration function variant and
register devlink port passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:29 +0000 (12:28 +0200)]
mlxsw_core: register devlink port with ops
Use newly introduce devlink port registration function variant and
register devlink port passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Tested-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:28 +0000 (12:28 +0200)]
ice: register devlink port for PF with ops
Use newly introduce devlink port registration function variant and
register devlink port passing ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Michal Wilczynski <michal.wilczynski@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Fri, 26 May 2023 10:28:27 +0000 (12:28 +0200)]
devlink: introduce port ops placeholder
In devlink, some of the objects have separate ops registered alongside
with the object itself. Port however have ops in devlink_ops structure.
For drivers what register multiple kinds of ports with different ops
this is not convenient. Introduce devlink_port_ops and a set
of functions that allow drivers to pass ops pointer during
port registration.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wei Fang [Mon, 29 May 2023 02:26:15 +0000 (10:26 +0800)]
net: fec: remove last_bdp from fec_enet_txq_xmit_frame()
The last_bdp is initialized to bdp, and both last_bdp and bdp are
not changed. That is to say that last_bdp and bdp are always equal.
So bdp can be used directly.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230529022615.669589-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Heiner Kallweit [Sun, 28 May 2023 17:35:12 +0000 (19:35 +0200)]
r8169: check for PCI read error in probe
Check whether first PCI read returns 0xffffffff. Currently, if this is
the case, the user sees the following misleading message:
unknown chip XID fcf, contact r8169 maintainers (see MAINTAINERS file)
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/75b54d23-fefe-2bf4-7e80-c9d3bc91af11@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andy Shevchenko [Sun, 28 May 2023 14:25:31 +0000 (17:25 +0300)]
dsa: lan9303: Remove stray gpiod_unexport() call
There is no gpiod_export() and gpiod_unexport() looks pretty much stray.
The gpiod_export() and gpiod_unexport() shouldn't be used in the code,
GPIO sysfs is deprecated. That said, simply drop the stray call.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230528142531.38602-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Christophe JAILLET [Sat, 27 May 2023 19:40:08 +0000 (21:40 +0200)]
liquidio: Use vzalloc()
Use vzalloc() instead of hand writing it with vmalloc()+memset().
This is less verbose.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/93b010824d9d92376e8d49b9eb396a0fa0c0ac80.1685216322.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 30 May 2023 09:50:07 +0000 (11:50 +0200)]
Merge branch 'microchip_t1s-update-on-microchip-10base-t1s-phy-driver'
Parthiban Veerasooran says:
====================
microchip_t1s: Update on Microchip 10BASE-T1S PHY driver
This patch series contain the below updates,
- Fixes on the Microchip LAN8670/1/2 10BASE-T1S PHYs support in the
net/phy/microchip_t1s.c driver.
- Adds support for the Microchip LAN8650/1 Rev.B0 10BASE-T1S Internal
PHYs in the net/phy/microchip_t1s.c driver.
====================
Link: https://lore.kernel.org/r/20230526152348.70781-1-Parthiban.Veerasooran@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Parthiban Veerasooran [Fri, 26 May 2023 15:23:48 +0000 (20:53 +0530)]
net: phy: microchip_t1s: add support for Microchip LAN865x Rev.B0 PHYs
Add support for the Microchip LAN865x Rev.B0 10BASE-T1S Internal PHYs
(LAN8650/1). The LAN865x combines a Media Access Controller (MAC) and an
internal 10BASE-T1S Ethernet PHY to access 10BASE‑T1S networks. As
LAN867X and LAN865X are using the same function for the read_status,
rename the function as lan86xx_read_status.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Parthiban Veerasooran [Fri, 26 May 2023 15:23:47 +0000 (20:53 +0530)]
net: phy: microchip_t1s: remove unnecessary interrupts disabling code
By default, except Reset Complete interrupt in the Interrupt Mask 2
Register all other interrupts are disabled/masked. As Reset Complete
status is already handled, it doesn't make sense to disable it.
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Tested-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Parthiban Veerasooran [Fri, 26 May 2023 15:23:46 +0000 (20:53 +0530)]
net: phy: microchip_t1s: fix reset complete status handling
As per the datasheet DS-LAN8670-1-2-
60001573C.pdf, the Reset Complete
status bit in the STS2 register has to be checked before proceeding to
the initial configuration. Reading STS2 register will also clear the
Reset Complete interrupt which is non-maskable.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Tested-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Parthiban Veerasooran [Fri, 26 May 2023 15:23:45 +0000 (20:53 +0530)]
net: phy: microchip_t1s: update LAN867x PHY supported revision number
As per AN1699, the initial configuration in the driver applies to LAN867x
Rev.B1 hardware revision. 0x0007C160 (Rev.A0) and 0x0007C161 (Rev.B0)
never released to production and hence they don't need to be supported.
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Parthiban Veerasooran [Fri, 26 May 2023 15:23:44 +0000 (20:53 +0530)]
net: phy: microchip_t1s: replace read-modify-write code with phy_modify_mmd
Replace read-modify-write code in the lan867x_config_init function to
avoid handling data type mismatch and to simplify the code.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Parthiban Veerasooran [Fri, 26 May 2023 15:23:43 +0000 (20:53 +0530)]
net: phy: microchip_t1s: modify driver description to be more generic
Remove LAN867X from the driver description as this driver is common for
all the Microchip 10BASE-T1S PHYs.
Reviewed-by: Ramón Nordin Rodriguez <ramon.nordin.rodriguez@ferroamp.se>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 30 May 2023 07:48:22 +0000 (09:48 +0200)]
Merge branch 'microchip-dsa-driver-improvements'
Oleksij Rempel says:
====================
Microchip DSA Driver Improvements
changes v2:
- set .max_register = U8_MAX, it should be more readable
- clarify in the RMW error handling patch, logging behavior
expectation.
I'd like to share a set of patches for the Microchip DSA driver. These
patches were chosen from a bigger set because they are simpler and
should be easier to review. The goal is to make the code easier to read,
get rid of unused code, and handle errors better.
====================
Link: https://lore.kernel.org/r/20230526073445.668430-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Oleksij Rempel [Fri, 26 May 2023 07:34:45 +0000 (09:34 +0200)]
net: dsa: microchip: Add register access control for KSZ8873 chip
This update introduces specific register access boundaries for the
KSZ8873 and KSZ8863 chips within the DSA Microchip driver. The outlined
ranges target global control registers, port registers, and advanced
control registers.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Oleksij Rempel [Fri, 26 May 2023 07:34:44 +0000 (09:34 +0200)]
net: dsa: microchip: ksz8: Prepare ksz8863_smi for regmap register access validation
This patch prepares the ksz8863_smi part of ksz8 driver to utilize the
regmap register access validation feature.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Oleksij Rempel [Fri, 26 May 2023 07:34:43 +0000 (09:34 +0200)]
net: dsa: microchip: remove ksz_port:on variable
The only place where this variable would be set to false is the
ksz8_config_cpu_port() function. But it is done in a bogus way:
for (i = 0; i < dev->phy_port_cnt; i++) {
if (i == dev->phy_port_cnt) <--- will be never executed.
break;
p->on = 1;
So, we never have a situation where p->on = 0. In this case, we can just
remove it.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vladimir Oltean [Fri, 26 May 2023 07:34:42 +0000 (09:34 +0200)]
net: dsa: microchip: add an enum for regmap widths
It is not immediately obvious that this driver allocates, via the
KSZ_REGMAP_TABLE() macro, 3 regmaps for register access: dev->regmap[0]
for 8-bit access, dev->regmap[1] for 16-bit and dev->regmap[2] for
32-bit access.
In future changes that add support for reg_fields, each field will have
to specify through which of the 3 regmaps it's going to go. Add an enum
now, to denote one of the 3 register access widths, and make the code go
through some wrapper functions for easier review and further
modification.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Oleksij Rempel [Fri, 26 May 2023 07:34:41 +0000 (09:34 +0200)]
net: dsa: microchip: improving error handling for 8-bit register RMW operations
This patch refines the error handling mechanism for 8-bit register
read-modify-write operations. In case of a failure, it now logs an error
message detailing the problematic offset. This enhancement aids in
debugging by providing more precise information when these operations
encounter issues.
Furthermore, the ksz_prmw8() function has been updated to return error
values rather than void, enabling calling functions to appropriately
respond to errors.
Additionally, in case of an error that affects both the current and
future accesses, the PHY driver will log the errors consistently, akin
to the existing behavior in all ksz_read*/ksz_write* helpers.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Tue, 30 May 2023 05:05:40 +0000 (22:05 -0700)]
Merge branch 'netlink-specs-add-ynl-spec-for-ovs_flow'
Donald Hunter says:
====================
netlink: specs: add ynl spec for ovs_flow
Add a ynl specification for ovs_flow. The spec is sufficient to dump ovs
flows but some attrs have been left as binary blobs because ynl doesn't
support C arrays in struct definitions yet.
Patches 1-3 add features for genetlink-legacy specs
Patch 4 is the ovs_flow netlink spec
====================
Link: https://lore.kernel.org/r/20230527133107.68161-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Sat, 27 May 2023 13:31:07 +0000 (14:31 +0100)]
netlink: specs: add ynl spec for ovs_flow
Add a ynl specification for ovs_flow. This spec is sufficient to dump ovs
flows. Some attrs are left as binary blobs because ynl doesn't support C
arrays in struct definitions yet.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Sat, 27 May 2023 13:31:06 +0000 (14:31 +0100)]
tools: ynl: Support enums in struct members in genetlink-legacy
Support decoding scalars as enums in struct members for genetlink-legacy
specs.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Sat, 27 May 2023 13:31:05 +0000 (14:31 +0100)]
tools: ynl: Initialise fixed headers to 0 in genetlink-legacy
This eliminates the need for e.g. --json '{"dp-ifindex":0}' which is not
too big a deal for ovs but will get tiresome for fixed header structs that
have many members.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Sat, 27 May 2023 13:31:04 +0000 (14:31 +0100)]
doc: ynl: Add doc attr to struct members in genetlink-legacy spec
Make it possible to document the meaning of struct member attributes in
genetlink-legacy specs.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Horman [Fri, 26 May 2023 13:45:13 +0000 (15:45 +0200)]
devlink: Spelling corrections
Make some minor spelling corrections in comments.
Found by inspection.
Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230526-devlink-spelling-v1-1-9a3e36cdebc8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dan Carpenter [Fri, 26 May 2023 13:39:15 +0000 (16:39 +0300)]
net: fix signedness bug in skb_splice_from_iter()
The "len" variable needs to be signed for the error handling to work
correctly.
Fixes:
2e910b95329c ("net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/366861a7-87c8-4bbf-9101-69dd41021d07@kili.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Fri, 26 May 2023 11:44:43 +0000 (12:44 +0100)]
net: dpaa2-mac: use correct interface to free mdiodev
Rather than using put_device(&mdiodev->dev), use the proper interface
provided to dispose of the mdiodev - that being mdio_device_free().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/E1q2VsB-008QlZ-El@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 30 May 2023 04:46:55 +0000 (21:46 -0700)]
Merge branch 'net-pcs-add-helpers-to-xpcs-and-lynx-to-manage-mdiodev'
Russell King says:
====================
net: pcs: add helpers to xpcs and lynx to manage mdiodev
This morning, we have had two instances where the destruction of the
MDIO device associated with XPCS and Lynx has been wrong. Rather than
allowing this pattern of errors to continue, let's make it easier for
driver authors to get this right by adding a helper.
The changes are essentially:
1. Add two new mdio device helpers to manage the underlying struct
device reference count. Note that the existing mdio_device_free()
doesn't actually free anything, it merely puts the reference count.
2. Make the existing _create() and _destroy() PCS driver methods
increment and decrement this refcount using these helpers. This
results in no overall change, although drivers may hang on to
the mdio device for a few cycles longer.
3. Add _create_mdiodev() which creates the mdio device before calling
the existing _create() method. Once the _create() method has
returned, we put the reference count on the mdio device.
If _create() was successful, then the reference count taken there
will "hold" the mdio device for the lifetime of the PCS (in other
words, until _destroy() is called.) However, if _create() failed,
then dropping the refcount at this point will free the mdio device.
This is the exact behaviour we desire.
4. Convert users that create a mdio device and then call the PCS's
_create() method over to the new _create_mdiodev() method, and
simplify the cleanup.
We also have DPAA2 and fmem_memac that look up their PCS rather than
creating it. These could also drop their reference count on the MDIO
device immediately after calling lynx_pcs_create(), which would then
mean we wouldn't need lynx_get_mdio_device() and the associated
complexity to put the device in dpaa2_pcs_destroy() and pcs_put().
Note that DPAA2 bypasses the mdio device's abstractions by calling
put_device() directly.
====================
Link: https://lore.kernel.org/r/ZHCGZ8IgAAwr8bla@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Fri, 26 May 2023 10:14:50 +0000 (11:14 +0100)]
net: enetc: use lynx_pcs_create_mdiodev()
Use the newly introduced lynx_pcs_create_mdiodev() which simplifies the
creation and destruction of the lynx PCS.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Fri, 26 May 2023 10:14:44 +0000 (11:14 +0100)]
net: dsa: ocelot: use lynx_pcs_create_mdiodev()
Use the newly introduced lynx_pcs_create_mdiodev() which simplifies the
creation and destruction of the lynx PCS.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Fri, 26 May 2023 10:14:39 +0000 (11:14 +0100)]
net: pcs: lynx: add lynx_pcs_create_mdiodev()
Add lynx_pcs_create_mdiodev() to simplify the creation of the mdio
device associated with lynx PCS. In order to allow lynx_pcs_destroy()
to clean this up, we need to arrange for lynx_pcs_create() to take a
refcount on the mdiodev, and lynx_pcs_destroy() to put it.
Adding the refcounting to lynx_pcs_create()..lynx_pcs_destroy() will
be transparent to existing users of these interfaces.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>