linux-2.6-microblaze.git
3 years agonet: dsa: mv88e6xxx: Make global2 support mandatory
Andrew Lunn [Wed, 27 Jan 2021 00:32:10 +0000 (01:32 +0100)]
net: dsa: mv88e6xxx: Make global2 support mandatory

Early generations of the mv88e6xxx did not have the global 2
registers. In order to keep the driver slim, it was decided to make
the code for these registers optional. Over time, more generations of
switches have been added, always supporting global 2 and adding more
and more registers. No effort has been made to keep these additional
registers also optional to slim the driver down when used for older
generations. Optional global 2 now just gives additional development
and maintenance burden for no real gain.

Make global 2 support always compiled in.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20210127003210.663173-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'mac80211-next-for-net-next-2021-01-27' of git://git.kernel.org/pub/scm...
Jakub Kicinski [Thu, 28 Jan 2021 03:01:06 +0000 (19:01 -0800)]
Merge tag 'mac80211-next-for-net-next-2021-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
More updates:
 * many minstrel improvements, including removal of the old
   minstrel in favour of minstrel_ht
 * speed improvements on FQ
 * support for RX decapsulation (header conversion) offload
 * RTNL reduction: limit RTNL usage in the wireless stack
   mostly to where really needed (regulatory not yet) to
   reduce contention on it

* tag 'mac80211-next-for-net-next-2021-01-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: (24 commits)
  mac80211: minstrel_ht: fix regression in the max_prob_rate fix
  virt_wifi: fix deadlock on RTNL
  cfg80211: avoid holding the RTNL when calling the driver
  cfg80211: change netdev registration/unregistration semantics
  mac80211: minstrel_ht: fix rounding error in throughput calculation
  mac80211: minstrel_ht: increase stats update interval
  mac80211: minstrel_ht: fix max probability rate selection
  mac80211: minstrel_ht: improve sample rate selection
  mac80211: minstrel_ht: improve ampdu length estimation
  mac80211: minstrel_ht: remove old ewma based rate average code
  mac80211: remove legacy minstrel rate control
  mac80211: minstrel_ht: add support for OFDM rates on non-HT clients
  mac80211: minstrel_ht: clean up CCK code
  mac80211: introduce aql_enable node in debugfs
  cfg80211: Add phyrate conversion support for extended MCS in 60GHz band
  cfg80211: add VHT rate entries for MCS-10 and MCS-11
  mac80211: reduce peer HE MCS/NSS to own capabilities
  mac80211: remove NSS number of 160MHz if not support 160MHz for HE
  mac80211_hwsim: add 6GHz channels
  mac80211: add LDPC encoding to ieee80211_parse_tx_radiotap
  ...
====================

Link: https://lore.kernel.org/r/20210127210915.135550-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'linux-can-next-for-5.12-20210127' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Thu, 28 Jan 2021 02:53:09 +0000 (18:53 -0800)]
Merge tag 'linux-can-next-for-5.12-20210127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-01-27

The first two patches are by me and fix typos on the CAN gw protocol and the
flexcan driver.

The next patch is by Vincent Mailhol and targets the CAN driver infrastructure,
it exports the function that converts the CAN state into a human readable
string.

A patch by me, which also targets the CAN driver infrastructure, makes the
calculation in can_fd_len2dlc() more readable.

A patch by Tom Rix fixes a checkpatch warning in the mcba_usb driver.

The next seven patches target the mcp251xfd driver. Su Yanjun's patch replaces
several hardcoded assumptions when calling regmap, by using
regmap_get_val_bytes(). The remaining patches are by me. First an open coded
check is replaced by an existing helper function, then in the TX path the
padding for CAN-FD frames is cleaned up. The next two patches clean up the RTR
frame handling in the RX and TX path. Then support for len8_dlc is added. The
last patch adds BQL support.

* tag 'linux-can-next-for-5.12-20210127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: mcp251xfd: add BQL support
  can: mcp251xfd: add len8_dlc support
  can: mcp251xfd: mcp251xfd_tx_obj_from_skb(): don't copy data for RTR CAN frames in TX-path
  can: mcp251xfd: mcp251xfd_hw_rx_obj_to_skb(): don't copy data for RTR CAN frames in RX-path
  can: mcp251xfd: mcp251xfd_tx_obj_from_skb(): clean up padding of CAN-FD frames
  can: mcp251xfd: mcp251xfd_start_xmit(): use mcp251xfd_get_tx_free() to check TX is is full
  can: mcp251xfd: replace sizeof(u32) with val_bytes in regmap
  can: mcba_usb: remove h from printk format specifier
  can: length: can_fd_len2dlc(): make legnth calculation readable again
  can: dev: export can_get_state_str() function
  can: flexcan: fix typos
  can: gw: fix typo
====================

Link: https://lore.kernel.org/r/20210127092227.2775573-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotipc: remove duplicated code in tipc_msg_create
Hoang Huu Le [Wed, 27 Jan 2021 02:51:23 +0000 (09:51 +0700)]
tipc: remove duplicated code in tipc_msg_create

Remove the duplicated header size check in tipc_msg_create(), as it is
already done in tipc_msg_init().

Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20210127025123.6390-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-bridge-multicast-per-port-eht-hosts-limit'
Jakub Kicinski [Thu, 28 Jan 2021 01:40:36 +0000 (17:40 -0800)]
Merge branch 'net-bridge-multicast-per-port-eht-hosts-limit'

Nikolay Aleksandrov says:

====================
net: bridge: multicast: per-port EHT hosts limit

This set adds a simple configurable per-port EHT tracked hosts limit.
Patch 01 adds a default limit of 512 tracked hosts per port; since the EHT
changes are still only in net-next, that shouldn't be a problem. Then
patch 02 adds the ability to configure and retrieve the hosts limit
and to retrieve the current number of tracked hosts per port.
Let's be on the safe side and limit the number of tracked hosts by
default while allowing the user to increase that limit if needed.
====================

Link: https://lore.kernel.org/r/20210126093533.441338-1-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: bridge: multicast: make tracked EHT hosts limit configurable
Nikolay Aleksandrov [Tue, 26 Jan 2021 09:35:33 +0000 (11:35 +0200)]
net: bridge: multicast: make tracked EHT hosts limit configurable

Add two new port attributes which make EHT hosts limit configurable and
export the current number of tracked EHT hosts:
 - IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT: configure/retrieve current limit
 - IFLA_BRPORT_MCAST_EHT_HOSTS_CNT: current number of tracked hosts
Setting IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT to 0 is currently not allowed.

Note that we have to increase RTNL_SLAVE_MAX_TYPE to at least 38; I've
increased it to 40 to leave space for two more future entries.

v2: move br_multicast_eht_set_hosts_limit() to br_multicast_eht.c,
    no functional change

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: bridge: multicast: add per-port EHT hosts limit
Nikolay Aleksandrov [Tue, 26 Jan 2021 09:35:32 +0000 (11:35 +0200)]
net: bridge: multicast: add per-port EHT hosts limit

Add a default limit of 512 for number of tracked EHT hosts per-port.

Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agorocker: Simplify the calculation of variables
Jiapeng Zhong [Tue, 26 Jan 2021 08:13:03 +0000 (16:13 +0800)]
rocker: Simplify the calculation of variables

Fix the following coccicheck warnings:

./drivers/net/ethernet/rocker/rocker_ofdpa.c:926:34-36: WARNING !A || A
&& B is equivalent to !A || B.
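
The equivalence named in the warning can be checked exhaustively. The
sketch below is illustrative only: the function names are hypothetical and
are not taken from rocker_ofdpa.c. It relies on && binding tighter than ||,
so the flagged expression reads as !A || (A && B).

```python
from itertools import product


def original(a: bool, b: bool) -> bool:
    # Shape flagged by coccicheck: !A || (A && B)
    return (not a) or (a and b)


def simplified(a: bool, b: bool) -> bool:
    # Shape suggested by the warning: !A || B
    return (not a) or b


# Compare both forms over all four combinations of A and B.
equivalent = all(
    original(a, b) == simplified(a, b)
    for a, b in product([False, True], repeat=2)
)
print(equivalent)
```

When A is false both forms are true regardless of B; when A is true both
reduce to B, which is why the inner test of A is redundant.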

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Zhong <abaci-bugfix@linux.alibaba.com>
Link: https://lore.kernel.org/r/1611648783-3916-1-git-send-email-abaci-bugfix@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: remove redundant 'depends on NET'
Masahiro Yamada [Mon, 25 Jan 2021 23:20:26 +0000 (08:20 +0900)]
net: remove redundant 'depends on NET'

These Kconfig files are included from net/Kconfig, inside the
if NET ... endif.

Remove 'depends on NET', which we know is already met.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20210125232026.106855-1-masahiroy@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: l3mdev: use obj-$(CONFIG_NET_L3_MASTER_DEV) form in net/Makefile
Masahiro Yamada [Mon, 25 Jan 2021 23:16:58 +0000 (08:16 +0900)]
net: l3mdev: use obj-$(CONFIG_NET_L3_MASTER_DEV) form in net/Makefile

CONFIG_NET_L3_MASTER_DEV is a bool option. Change the ifeq conditional
to the standard obj-$(CONFIG_NET_L3_MASTER_DEV) form.

Use obj-y in net/l3mdev/Makefile because Kbuild visits this Makefile
only when CONFIG_NET_L3_MASTER_DEV=y.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210125231659.106201-4-masahiroy@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: switchdev: use obj-$(CONFIG_NET_SWITCHDEV) form in net/Makefile
Masahiro Yamada [Mon, 25 Jan 2021 23:16:57 +0000 (08:16 +0900)]
net: switchdev: use obj-$(CONFIG_NET_SWITCHDEV) form in net/Makefile

CONFIG_NET_SWITCHDEV is a bool option. Change the ifeq conditional to
the standard obj-$(CONFIG_NET_SWITCHDEV) form.

Use obj-y in net/switchdev/Makefile because Kbuild visits this Makefile
only when CONFIG_NET_SWITCHDEV=y.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20210125231659.106201-3-masahiroy@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dcb: use obj-$(CONFIG_DCB) form in net/Makefile
Masahiro Yamada [Mon, 25 Jan 2021 23:16:56 +0000 (08:16 +0900)]
net: dcb: use obj-$(CONFIG_DCB) form in net/Makefile

CONFIG_DCB is a bool option. Change the ifeq conditional to the
standard obj-$(CONFIG_DCB) form.

Use obj-y in net/dcb/Makefile because Kbuild visits this Makefile
only when CONFIG_DCB=y.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20210125231659.106201-2-masahiroy@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: move CONFIG_NET guard to top Makefile
Masahiro Yamada [Mon, 25 Jan 2021 23:16:55 +0000 (08:16 +0900)]
net: move CONFIG_NET guard to top Makefile

When CONFIG_NET is disabled, nothing under the net/ directory is
compiled. Move the CONFIG_NET guard to the top Makefile so the net/
directory is entirely skipped.

When Kbuild visits net/Makefile, CONFIG_NET is obviously 'y' because
CONFIG_NET is a bool option. Clean up net/Makefile.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20210125231659.106201-1-masahiroy@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: sysctl: remove redundant #ifdef CONFIG_NET
Masahiro Yamada [Mon, 25 Jan 2021 23:14:21 +0000 (08:14 +0900)]
net: sysctl: remove redundant #ifdef CONFIG_NET

CONFIG_NET is a bool option, and this file is compiled only when
CONFIG_NET=y.

Remove #ifdef CONFIG_NET, which we know is always met.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20210125231421.105936-1-masahiroy@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'mptcp-ipv4-mapped-ipv6-addressing-for-subflows'
Jakub Kicinski [Thu, 28 Jan 2021 00:50:05 +0000 (16:50 -0800)]
Merge branch 'mptcp-ipv4-mapped-ipv6-addressing-for-subflows'

Mat Martineau says:

====================
MPTCP: IPv4-mapped IPv6 addressing for subflows

This patch series from the MPTCP tree adds support for IPv4-mapped IPv6
addressing that was missing when multiple subflows were first
implemented.

Patches 1 and 2 handle the conversion and comparison of the mapped
addresses.

Patch 3 contains a minor refactor in the path manager's handling of
addresses.

Patches 4 and 5 add selftests for the new functionality and adjust the
selftest timeout.
====================

Link: https://lore.kernel.org/r/20210125185904.6997-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests: increase timeout to 10 min
Matthieu Baerts [Mon, 25 Jan 2021 18:59:04 +0000 (10:59 -0800)]
selftests: increase timeout to 10 min

On slow systems with kernel debug settings, we can reach the current
timeout when all tests are executed.

Likely some tests need to be improved to remove some 'sleep' calls and
wait (less) for a specific action. This can also improve stability.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests: mptcp: add IPv4-mapped IPv6 testcases
Geliang Tang [Mon, 25 Jan 2021 18:59:03 +0000 (10:59 -0800)]
selftests: mptcp: add IPv4-mapped IPv6 testcases

Here, we make sure we support IPv4-mapped in IPv6 addresses in different
contexts:

- a v4-mapped address is received by the PM and can be used as v4.
- a v4 address is received by the PM and can be used even with a v4
  mapped socket.

We also make sure we don't try to establish subflows between v4 and v6
addresses, e.g. if a real v6 address ends with a valid v4 address.

Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomptcp: pm nl: reduce variable scope
Matthieu Baerts [Mon, 25 Jan 2021 18:59:02 +0000 (10:59 -0800)]
mptcp: pm nl: reduce variable scope

To avoid confusion like that encountered while working on the previous
patch, it is better to declare and assign this variable only where it is
needed.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomptcp: pm nl: support IPv4 mapped in v6 addresses
Matthieu Baerts [Mon, 25 Jan 2021 18:59:01 +0000 (10:59 -0800)]
mptcp: pm nl: support IPv4 mapped in v6 addresses

On one side, we can allow the creation of subflows between v4 mapped in
v6 and v4 addresses. For that we look for v4mapped addresses between the
local address we want to select and the remote one.

On the other side, we also properly deal with received v4mapped
addresses, either announced ones or set via Netlink.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/122
Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Co-developed-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomptcp: support MPJoin with IPv4 mapped in v6 sk
Matthieu Baerts [Mon, 25 Jan 2021 18:59:00 +0000 (10:59 -0800)]
mptcp: support MPJoin with IPv4 mapped in v6 sk

With an IPv4 mapped in v6 socket, we were trying to call inet6_bind()
with an IPv4 address resulting in a -EINVAL error because the given
addr_len -- size of the address structure -- was too short.

We now make sure to use address structures for the same family as the
MPTCP socket for both the bind() and the connect(). It means we convert
v4 addresses to v4 mapped in v6 or the opposite if needed.
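
The conversion in both directions can be illustrated in user space with
Python's standard ipaddress module. This is only a sketch of the address
arithmetic, not the kernel code; the helper names to_v4_mapped() and
from_v4_mapped() are hypothetical.

```python
import ipaddress
from typing import Optional


def to_v4_mapped(addr: str) -> ipaddress.IPv6Address:
    # Place the 32-bit IPv4 address behind the ::ffff:0:0/96 prefix.
    v4 = ipaddress.IPv4Address(addr)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))


def from_v4_mapped(addr: str) -> Optional[ipaddress.IPv4Address]:
    # Returns the embedded IPv4 address, or None if addr is not v4-mapped.
    return ipaddress.IPv6Address(addr).ipv4_mapped


mapped = to_v4_mapped("10.0.1.1")
print(mapped, from_v4_mapped(str(mapped)))
```

A regular IPv6 address whose low 32 bits happen to look like an IPv4
address is not v4-mapped, so from_v4_mapped() returns None for it.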

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/122
Co-developed-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agor8169: remove not needed call to rtl_wol_enable_rx from rtl_shutdown
Heiner Kallweit [Mon, 25 Jan 2021 16:55:12 +0000 (17:55 +0100)]
r8169: remove not needed call to rtl_wol_enable_rx from rtl_shutdown

rtl_wol_enable_rx() is called via the following call chain if WoL
is enabled:
rtl8169_down()
-> rtl_prepare_power_down()
   -> rtl_wol_enable_rx()
Therefore we don't have to call this function here.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/34ce78e2-596c-e2ac-16aa-c550fa624c22@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agopktgen: fix misuse of BUG_ON() in pktgen_thread_worker()
Di Zhu [Mon, 25 Jan 2021 12:42:29 +0000 (20:42 +0800)]
pktgen: fix misuse of BUG_ON() in pktgen_thread_worker()

pktgen creates threads for all online CPUs and binds each thread to its
respective CPU. When such a thread is first woken up, it compares the CPU
it is currently running on with the CPU specified at creation time; if
the two differ, BUG_ON() triggers and panics the system.
Note that these threads can be migrated to other CPUs before they start
running, because of CPU hotplug after the threads have been created. The
BUG_ON() used here therefore seems unreasonable; replace it with WARN_ON()
to just print a warning rather than panic the system.

Signed-off-by: Di Zhu <zhudi21@huawei.com>
Link: https://lore.kernel.org/r/20210125124229.19334-1-zhudi21@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomac80211: minstrel_ht: fix regression in the max_prob_rate fix
Felix Fietkau [Tue, 26 Jan 2021 15:44:09 +0000 (16:44 +0100)]
mac80211: minstrel_ht: fix regression in the max_prob_rate fix

Since mi->max_prob_rate is overwritten after the loop that calls
minstrel_ht_set_best_prob_rate, the new best rate needs to be written to
*dest.

Fixes: a7fca4e4037f ("mac80211: minstrel_ht: fix max probability rate selection")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210126154409.6755-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
3 years agovirt_wifi: fix deadlock on RTNL
Johannes Berg [Wed, 27 Jan 2021 20:59:42 +0000 (21:59 +0100)]
virt_wifi: fix deadlock on RTNL

Fix a regression where everything in virt_wifi would just hang. This
happened due to overlapping changes between commit a05829a7222e
("cfg80211: avoid holding the RTNL when calling the driver") which
had originally needed to change the locking, but then I introduced
commit 2fe8ef106238 ("cfg80211: change netdev registration/unregistration
semantics") instead. virt_wifi somehow fell through the cracks when
I undid all the previous locking changes. Fix it now.

Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver")
Reported-by: syzbot+3d2d5e6cc3fb15c6a0fd@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20210127215941.2d6a97b09784.I4f1fac32f67045171be50931f44d77e150911bee@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
3 years agocan: mcp251xfd: add BQL support
Marc Kleine-Budde [Sun, 13 Dec 2020 16:25:15 +0000 (17:25 +0100)]
can: mcp251xfd: add BQL support

This patch adds BQL support to the driver. Support for netdev_xmit_more() will
be added in a separate patch series.

Link: https://lore.kernel.org/r/20210114153448.1506901-7-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: mcp251xfd: add len8_dlc support
Marc Kleine-Budde [Sun, 20 Dec 2020 17:47:51 +0000 (18:47 +0100)]
can: mcp251xfd: add len8_dlc support

This patch adds support for the Classical CAN raw DLC functionality to send and
receive DLC values from 9 ... 15 to the mcp251xfd driver.
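
As a rough illustration of the raw DLC idea: in Classical CAN the 4-bit
DLC field can carry values up to 15, but the payload never exceeds 8 bytes,
so raw values 9 to 15 all mean 8 data bytes. The sketch below assumes the
raw value is kept next to the length only when it exceeds 8. The helper
name split_raw_dlc() is hypothetical and this is not the driver code.

```python
from typing import Tuple

CAN_MAX_DLEN = 8      # maximum Classical CAN payload in bytes
CAN_MAX_RAW_DLC = 15  # largest value the 4-bit DLC field can hold


def split_raw_dlc(raw_dlc: int) -> Tuple[int, int]:
    """Return (payload length, len8_dlc) for a raw Classical CAN DLC."""
    if not 0 <= raw_dlc <= CAN_MAX_RAW_DLC:
        raise ValueError("DLC must be in the range 0..15")
    if raw_dlc > CAN_MAX_DLEN:
        # Payload is capped at 8 bytes; remember the raw DLC separately.
        return CAN_MAX_DLEN, raw_dlc
    # For 0..8 the DLC equals the length and no extra value is needed.
    return raw_dlc, 0


print(split_raw_dlc(12))
```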

Link: https://lore.kernel.org/r/20210114153448.1506901-6-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: mcp251xfd: mcp251xfd_tx_obj_from_skb(): don't copy data for RTR CAN frames in...
Marc Kleine-Budde [Mon, 21 Dec 2020 20:34:50 +0000 (21:34 +0100)]
can: mcp251xfd: mcp251xfd_tx_obj_from_skb(): don't copy data for RTR CAN frames in TX-path

In Classical CAN there are RTR frames. RTR frames have the RTR bit set, may
have a dlc != 0, but contain no data.

This patch optimizes the TX-path to not copy any data for RTR frames.

Link: https://lore.kernel.org/r/20210114153448.1506901-5-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: mcp251xfd: mcp251xfd_hw_rx_obj_to_skb(): don't copy data for RTR CAN frames...
Marc Kleine-Budde [Mon, 21 Dec 2020 20:34:50 +0000 (21:34 +0100)]
can: mcp251xfd: mcp251xfd_hw_rx_obj_to_skb(): don't copy data for RTR CAN frames in RX-path

In Classical CAN there are RTR frames. RTR frames have the RTR bit set, may
have a dlc != 0, but contain no data.

This patch changes the RX-path to not copy any data for RTR frames, so that
the data field in the CAN frame stays 0x0.

Link: https://lore.kernel.org/r/20210114153448.1506901-4-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: mcp251xfd: mcp251xfd_tx_obj_from_skb(): clean up padding of CAN-FD frames
Marc Kleine-Budde [Mon, 21 Dec 2020 20:48:20 +0000 (21:48 +0100)]
can: mcp251xfd: mcp251xfd_tx_obj_from_skb(): clean up padding of CAN-FD frames

CAN-FD frames only allow specific frame lengths (0, 1, 2, 3, 4, 5, 6, 7, 8, 12,
16, 20, 24, 32, 48, 64). A CAN-FD frame provided by user space might not cover
the whole CAN-FD frame. To avoid sending garbage over the CAN bus the driver
pads the CAN frame with 0x0 (if MCP251XFD_SANITIZE_CAN is activated).

This patch cleans up the pad len calculation. Rounding to a full u32 brings
no benefit; in the case of CRC transfers, hw_tx_obj->data is not aligned to
u32 anyway.
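
The padding idea can be sketched as follows: round the user-supplied
payload up to the next valid CAN-FD frame length and fill the gap with
zero bytes. This is an illustrative user-space model, not the driver's
implementation; pad_canfd_payload() is a hypothetical name.

```python
# Valid CAN-FD payload lengths, as listed in the commit message.
CANFD_VALID_LENGTHS = (0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 16, 20, 24, 32, 48, 64)


def pad_canfd_payload(data: bytes) -> bytes:
    """Pad data with 0x0 bytes up to the next valid CAN-FD frame length."""
    if len(data) > CANFD_VALID_LENGTHS[-1]:
        raise ValueError("CAN-FD payload cannot exceed 64 bytes")
    # Smallest valid frame length that still holds the whole payload.
    frame_len = next(n for n in CANFD_VALID_LENGTHS if n >= len(data))
    return data + bytes(frame_len - len(data))


padded = pad_canfd_payload(b"\x01" * 10)
print(len(padded))
```

The pad length here is simply the difference between the frame length and
the payload length, with no rounding to a multiple of four bytes.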

Link: https://lore.kernel.org/r/20210114153448.1506901-3-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: mcp251xfd: mcp251xfd_start_xmit(): use mcp251xfd_get_tx_free() to check TX is...
Marc Kleine-Budde [Sun, 20 Dec 2020 13:02:29 +0000 (14:02 +0100)]
can: mcp251xfd: mcp251xfd_start_xmit(): use mcp251xfd_get_tx_free() to check TX is is full

This patch replaces an open-coded check of whether the TX ring is full with
a check of whether mcp251xfd_get_tx_free() returns 0.

Link: https://lore.kernel.org/r/20210114153448.1506901-2-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: mcp251xfd: replace sizeof(u32) with val_bytes in regmap
Su Yanjun [Fri, 22 Jan 2021 08:13:34 +0000 (16:13 +0800)]
can: mcp251xfd: replace sizeof(u32) with val_bytes in regmap

The sizeof(u32) is hardcoded. It's better to use the config value from the
regmap.

This increases the size of the target object, but it is more flexible if a
new MCP chip needs a different val_bytes.

Link: https://lore.kernel.org/r/20210122081334.213957-1-suyanjun218@gmail.com
Signed-off-by: Su Yanjun <suyanjun218@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: mcba_usb: remove h from printk format specifier
Tom Rix [Sun, 24 Jan 2021 15:09:16 +0000 (07:09 -0800)]
can: mcba_usb: remove h from printk format specifier

This change fixes the checkpatch warning described in commit
cbacb5ab0aa0 ("docs: printk-formats: Stop encouraging use of unnecessary
%h[xudi] and %hh[xudi]").

Standard integer promotion is already done, so %hx and %hhx are useless; do
not encourage the use of %hh[xudi] or %h[xudi].

Link: https://lore.kernel.org/r/20210124150916.1920434-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: length: can_fd_len2dlc(): make legnth calculation readable again
Marc Kleine-Budde [Sun, 17 Jan 2021 22:06:28 +0000 (23:06 +0100)]
can: length: can_fd_len2dlc(): make legnth calculation readable again

In commit 652562e5ff06 ("can: length: can_fd_len2dlc(): simplify length
calculcation") the code became less readable and more error prone. To
counteract this, partially revert that patch and replace open-coded
values (of the original code) with proper defines.
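
For orientation, the length-to-DLC mapping that such a function computes
can be sketched with named constants instead of bare numbers. This is a
user-space illustration, not the kernel code; the table layout and the
clamping of oversized lengths to the maximum DLC are assumptions.

```python
CAN_MAX_DLEN = 8     # lengths 0..8 map to the identical DLC value
CANFD_MAX_DLC = 15   # largest CAN-FD DLC value

# (largest payload length covered, DLC) for lengths above 8 bytes.
_LEN_STEPS = ((12, 9), (16, 10), (20, 11), (24, 12), (32, 13), (48, 14), (64, 15))


def fd_len2dlc(length: int) -> int:
    """Map a CAN-FD payload length to the smallest DLC that can hold it."""
    if length < 0:
        raise ValueError("length must not be negative")
    if length <= CAN_MAX_DLEN:
        return length
    for upper, dlc in _LEN_STEPS:
        if length <= upper:
            return dlc
    # Assumption: lengths beyond 64 bytes are clamped to the maximum DLC.
    return CANFD_MAX_DLC


print([fd_len2dlc(n) for n in (8, 9, 13, 33, 64)])
```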

Fixes: 652562e5ff06 ("can: length: can_fd_len2dlc(): simplify length calculcation")
Cc: Vincent MAILHOL <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/r/20210118201346.79422-1-socketcan@hartkopp.net
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: dev: export can_get_state_str() function
Vincent Mailhol [Tue, 19 Jan 2021 17:03:55 +0000 (02:03 +0900)]
can: dev: export can_get_state_str() function

The can_get_state_str() function is also relevant to the drivers. Export the
symbol and make it visible in the can/dev.h header.

Link: https://lore.kernel.org/r/20210119170355.12040-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: flexcan: fix typos
Marc Kleine-Budde [Wed, 27 Jan 2021 08:13:03 +0000 (09:13 +0100)]
can: flexcan: fix typos

This patch fixes two typos found by codespell.

Fixes: 812f0116c66a ("can: flexcan: add CAN wakeup function for i.MX8QM")
Link: https://lore.kernel.org/r/20210127085529.2768537-2-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agocan: gw: fix typo
Marc Kleine-Budde [Wed, 27 Jan 2021 08:13:03 +0000 (09:13 +0100)]
can: gw: fix typo

This patch fixes a typo found by codespell.

Fixes: 94c23097f991 ("can: gw: support modification of Classical CAN DLCs")
Link: https://lore.kernel.org/r/20210127085529.2768537-3-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agonet: allow user to set metric on default route learned via Router Advertisement
Praveen Chaudhary [Mon, 25 Jan 2021 21:44:30 +0000 (13:44 -0800)]
net: allow user to set metric on default route learned via Router Advertisement

For IPv4, the default route is learned via DHCPv4 and the user is allowed to
change its metric using the config in /etc/network/interfaces. But for IPv6,
the default route can be learned via RA, for which a fixed metric value of
1024 is currently used.

Ideally, the user should be able to configure the metric on the default
route for IPv6 similar to IPv4. This patch adds a sysctl for that.

Logs:

For IPv4:

Config in /etc/network/interfaces:
auto eth0
iface eth0 inet dhcp
    metric 4261413864

IPv4 Kernel Route Table:
$ ip route list
default via 172.21.47.1 dev eth0 metric 4261413864

FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over DHCPv4 default route.]
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       > - selected route, * - FIB route

S>* 0.0.0.0/0 [20/0] is directly connected, eth0, 00:00:03
K   0.0.0.0/0 [254/1000] via 172.21.47.1, eth0, 6d08h51m

i.e. the user can prefer a default route learned via a routing protocol in
IPv4. Similar behavior is not possible for IPv6 without this fix.

After fix [for IPv6]:
sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489705

IP monitor: [When IPv6 RA is received]
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705  pref high

Kernel IPv6 routing table
$ ip -6 route list
default via fe80::be16:65ff:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 21sec hoplimit 64 pref high

FRR Table, if a static route is configured:
[In real scenario, it is useful to prefer BGP learned default route over IPv6 RA default route.]
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       > - selected route, * - FIB route

S>* ::/0 [20/0] is directly connected, eth0, 00:00:06
K   ::/0 [119/1001] via fe80::xx16:xxxx:feb3:ce8e, eth0, 6d07h43m

If the metric is changed later, the effect will be seen only when the next
IPv6 RA is received, because the default route must be fully controlled by
the RA msg. Below, the metric is changed from 1996489705 to 1996489704.

$ sudo sysctl -w net.ipv6.conf.eth0.ra_defrtr_metric=1996489704
net.ipv6.conf.eth0.ra_defrtr_metric = 1996489704

IP monitor:
[On next IPv6 RA msg, Kernel deletes prev route and installs new route with updated metric]

Deleted default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489705 expires 3sec hoplimit 64 pref high
default via fe80::xx16:xxxx:feb3:ce8e dev eth0 proto ra metric 1996489704 pref high
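
The bracketed [distance/metric] pairs in the FRR tables above are
consistent with splitting the 32-bit kernel metric into its top 8 bits
(shown as distance) and its low 24 bits (shown as metric). This reading is
inferred from the numbers in the logs and is not stated in the commit
message; split_kernel_metric() is a hypothetical helper used only to check
the arithmetic.

```python
from typing import Tuple


def split_kernel_metric(metric: int) -> Tuple[int, int]:
    """Split a 32-bit kernel route metric into (top 8 bits, low 24 bits)."""
    if not 0 <= metric <= 0xFFFFFFFF:
        raise ValueError("metric must fit in 32 bits")
    distance = metric >> 24
    low_metric = metric & 0xFFFFFF
    return distance, low_metric


# IPv4 example: metric 4261413864 is displayed as [254/1000].
print(split_kernel_metric(4261413864))
# IPv6 example: ra_defrtr_metric 1996489705 is displayed as [119/1001].
print(split_kernel_metric(1996489705))
```

Under that reading, a large metric gives the kernel-learned default route a
distance above the static route's 20, so the static route is the one
selected in both tables.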

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210125214430.24079-1-pchaudhary@linkedin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-usbnet-convert-to-new-tasklet-api'
Jakub Kicinski [Wed, 27 Jan 2021 02:04:28 +0000 (18:04 -0800)]
Merge branch 'net-usbnet-convert-to-new-tasklet-api'

Emil Renner Berthing says:

====================
net: usbnet: convert to new tasklet API

This converts the usbnet driver to use the new tasklet API introduced in
commit 12cc923f1ccc ("tasklet: Introduce new initialization API")

It is split into two commits for ease of reviewing.
====================

Link: https://lore.kernel.org/r/20210123173221.5855-1-esmil@mailme.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: usbnet: use new tasklet API
Emil Renner Berthing [Sat, 23 Jan 2021 17:32:21 +0000 (18:32 +0100)]
net: usbnet: use new tasklet API

This converts the driver to use the new tasklet API introduced in
commit 12cc923f1ccc ("tasklet: Introduce new initialization API")

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: usbnet: initialize tasklet using tasklet_init
Emil Renner Berthing [Sat, 23 Jan 2021 17:32:20 +0000 (18:32 +0100)]
net: usbnet: initialize tasklet using tasklet_init

Initialize tasklet using tasklet_init() rather than open-coding it.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-dsa-mv88e6xxx-remove-some-6250-specific-methods'
Jakub Kicinski [Wed, 27 Jan 2021 01:58:30 +0000 (17:58 -0800)]
Merge branch 'net-dsa-mv88e6xxx-remove-some-6250-specific-methods'

Rasmus Villemoes says:

====================
net: dsa: mv88e6xxx: remove some 6250-specific methods

v2:
 - resend now that the bug-fix patch (87fe04367d84, "net: dsa:
   mv88e6xxx: also read STU state in mv88e6250_g1_vtu_getnext") is in
   net and also merged to net-next.
 - include various tags in patch 1.
 - add second similar patch for loadpurge.
====================

Link: https://lore.kernel.org/r/20210125150449.115032-1-rasmus.villemoes@prevas.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: mv88e6xxx: use mv88e6185_g1_vtu_loadpurge() for the 6250
Rasmus Villemoes [Mon, 25 Jan 2021 15:04:49 +0000 (16:04 +0100)]
net: dsa: mv88e6xxx: use mv88e6185_g1_vtu_loadpurge() for the 6250

Apart from the mask used to get the high bits of the fid,
mv88e6185_g1_vtu_loadpurge() and mv88e6250_g1_vtu_loadpurge() are
identical. Since the entry->fid passed in should never exceed the
number of databases, we can simply use the former as-is as replacement
for the latter.

Suggested-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: mv88e6xxx: use mv88e6185_g1_vtu_getnext() for the 6250
Rasmus Villemoes [Mon, 25 Jan 2021 15:04:48 +0000 (16:04 +0100)]
net: dsa: mv88e6xxx: use mv88e6185_g1_vtu_getnext() for the 6250

mv88e6250_g1_vtu_getnext is almost identical to
mv88e6185_g1_vtu_getnext, except for the 6250 only having 64 databases
instead of 256. We can reduce code duplication by simply masking off
the extra two garbage bits when assembling the fid from VTU op [3:0]
and [11:8].
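
As a rough illustration of the masking described above, the sketch below assembles a FID from VTU op bits [3:0] and [11:8] and then masks it to the number of databases. The function name vtu_op_to_fid() and its parameters are hypothetical and are not taken from the driver; it assumes the database count is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build a FID from a VTU op register value.
 * num_databases is assumed to be a power of two (e.g. 64 or 256).
 */
static uint16_t vtu_op_to_fid(uint16_t op, unsigned int num_databases)
{
	uint16_t fid;

	/* The mask trick below only works for a power-of-two count. */
	assert(num_databases != 0 &&
	       (num_databases & (num_databases - 1)) == 0);

	fid = op & 0x000f;		/* FID[3:0] from op bits [3:0] */
	fid |= (op & 0x0f00) >> 4;	/* FID[7:4] from op bits [11:8] */

	/* With 64 databases this drops the two upper (garbage) bits. */
	return fid & (num_databases - 1);
}
```

With 256 databases the mask is a no-op, so one function can serve both cases.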

Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests: add IPv4 unicast extensions tests
Seth David Schoen [Tue, 26 Jan 2021 04:08:34 +0000 (20:08 -0800)]
selftests: add IPv4 unicast extensions tests

Add selftests for kernel behavior with regard to various classes of
unallocated/reserved IPv4 addresses, checking whether or not these
addresses can be assigned as unicast addresses on links and used in
routing.

Expect the current kernel behavior at the time of this patch. That is:

* 0/8 and 240/4 may be used as unicast, with the exceptions of 0.0.0.0
  and 255.255.255.255;
* the lowest address in a subnet may only be used as a broadcast address;
* 127/8 may not be used as unicast (the route_localnet option, which is
  disabled by default, still leaves it treated slightly specially);
* 224/4 may not be used as unicast.
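
The block-level rules in the list above can be sketched as a small classifier. This is only an illustration, not the selftest itself: the names ipv4() and ipv4_block_allows_unicast() are hypothetical, addresses are in host byte order, and the subnet-dependent rule about the lowest address in a subnet is deliberately left out because it needs a prefix length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address from octets. */
static uint32_t ipv4(unsigned int a, unsigned int b,
		     unsigned int c, unsigned int d)
{
	assert(a < 256 && b < 256 && c < 256 && d < 256);
	return (a << 24) | (b << 16) | (c << 8) | d;
}

/* Hypothetical check covering only the address-block rules. */
static bool ipv4_block_allows_unicast(uint32_t addr)
{
	uint8_t top = addr >> 24;

	if (addr == 0 || addr == 0xffffffffu)
		return false;		/* 0.0.0.0 and 255.255.255.255 */
	if (top == 127)
		return false;		/* 127/8 */
	if ((top & 0xf0) == 0xe0)
		return false;		/* 224/4 */
	return true;			/* includes 0/8 and 240/4 */
}
```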

Signed-off-by: Seth David Schoen <schoen@loyalty.org>
Suggested-by: John Gilmore <gnu@toad.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210126040834.GR24989@frotz.zork.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobonding: add TLS dependency
Arnd Bergmann [Mon, 25 Jan 2021 11:31:59 +0000 (12:31 +0100)]
bonding: add TLS dependency

When TLS is a module, the built-in bonding driver may cause a
link error:

x86_64-linux-ld: drivers/net/bonding/bond_main.o: in function `bond_start_xmit':
bond_main.c:(.text+0xc451): undefined reference to `tls_validate_xmit_skb'

Add a dependency to avoid the problem.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210125113209.2248522-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Convert to use netif_level() helpers.
Michael Chan [Tue, 26 Jan 2021 06:20:24 +0000 (01:20 -0500)]
bnxt_en: Convert to use netif_level() helpers.

Use the various netif_level() helpers to simplify the C code.  This was
suggested by Joe Perches.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/1611642024-3166-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agousbnet: fix the indentation of one code snippet
Dongliang Mu [Sat, 23 Jan 2021 05:11:02 +0000 (13:11 +0800)]
usbnet: fix the indentation of one code snippet

Every line of code should start with a tab (8 characters).

Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Link: https://lore.kernel.org/r/20210123051102.1091541-1-mudongliangabcd@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: bridge: multicast: fix br_multicast_eht_set_entry_lookup indentation
Nikolay Aleksandrov [Mon, 25 Jan 2021 08:20:40 +0000 (10:20 +0200)]
net: bridge: multicast: fix br_multicast_eht_set_entry_lookup indentation

Fix the messed up indentation in br_multicast_eht_set_entry_lookup().

Fixes: baa74d39ca39 ("net: bridge: multicast: add EHT source set handling functions")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20210125082040.13022-1-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agocfg80211: avoid holding the RTNL when calling the driver
Johannes Berg [Fri, 22 Jan 2021 15:19:43 +0000 (16:19 +0100)]
cfg80211: avoid holding the RTNL when calling the driver

Currently, _everything_ in cfg80211 holds the RTNL, and if you
have a slow USB device (or a few) you can get some bad lock
contention on that.

Fix that by re-adding a mutex to each wiphy/rdev as we had at
some point, so we have locking for the wireless_dev lists and
all the other things in there, and also so that drivers still
don't have to worry too much about it (they still won't get
parallel calls for a single device).

Then, we can restrict the RTNL to a few cases where we add or
remove interfaces and really need the added protection. Some
of the global list management still also uses the RTNL, since
we need to have it anyway for netdev management, but we only
hold the RTNL for very short periods of time here.

Link: https://lore.kernel.org/r/20210122161942.81df9f5e047a.I4a8e1a60b18863ea8c5e6d3a0faeafb2d45b2f40@changeid
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> [marvell driver issues]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
3 years agonfc: fix typo
wengjianfeng [Sat, 23 Jan 2021 08:25:50 +0000 (16:25 +0800)]
nfc: fix typo

Change 'regster' to 'register'.

Signed-off-by: wengjianfeng <wengjianfeng@yulong.com>
Acked-by: Mark Greer <mgreer@animalcreek.com>
Link: https://lore.kernel.org/r/20210123082550.3748-1-samirweng1979@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonfc: fdp: fix typo issue
wengjianfeng [Sat, 23 Jan 2021 07:48:35 +0000 (15:48 +0800)]
nfc: fdp: fix typo issue

Change 'paquet' to 'packet'.

Signed-off-by: wengjianfeng <wengjianfeng@yulong.com>
Link: https://lore.kernel.org/r/20210123074835.9448-1-samirweng1979@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'bnxt_en-error-recovery-improvements'
Jakub Kicinski [Tue, 26 Jan 2021 03:20:06 +0000 (19:20 -0800)]
Merge branch 'bnxt_en-error-recovery-improvements'

Michael Chan says:

====================
bnxt_en: Error recovery improvements.

This series contains a number of improvements in the area of error
recovery.  Most error recovery scenarios are tightly coordinated with
the firmware.  A number of patches add retry logic to establish
connection with the firmware if there are indications that the
firmware is still alive and will likely transition back to the
normal state.  Some patches speed up the recovery process and make
it more reliable.  There are some cleanup patches as well.
====================

Link: https://lore.kernel.org/r/1611558501-11022-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Do not process completion entries after fatal condition detected.
Michael Chan [Mon, 25 Jan 2021 07:08:21 +0000 (02:08 -0500)]
bnxt_en: Do not process completion entries after fatal condition detected.

Once the firmware fatal condition is detected, we should cease
communication with the firmware and hardware quickly even if there
are many completion entries in the completion rings.  This will
speed up the recovery process and prevent further I/Os that may
cause further exceptions.

Do not proceed in the NAPI poll function if fatal condition is
detected.  Call napi_complete() and return without arming interrupts.
Cleanup of all rings and reset are imminent.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Consolidate firmware reset event logging.
Michael Chan [Mon, 25 Jan 2021 07:08:20 +0000 (02:08 -0500)]
bnxt_en: Consolidate firmware reset event logging.

Combine the three netdev_warn() calls into a single call, printed at
the NETIF_MSG_HW log level.

Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Improve firmware fatal error shutdown sequence.
Michael Chan [Mon, 25 Jan 2021 07:08:19 +0000 (02:08 -0500)]
bnxt_en: Improve firmware fatal error shutdown sequence.

In the event of a fatal firmware error, firmware will notify the host
and then it will proceed to do core reset when it sees that all functions
have disabled Bus Master.  To prevent Master Aborts and other hard
errors, we need to quiesce all activities in addition to disabling Bus
Master before the chip goes into core reset.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Modify bnxt_disable_int_sync() to be called more than once.
Michael Chan [Mon, 25 Jan 2021 07:08:18 +0000 (02:08 -0500)]
bnxt_en: Modify bnxt_disable_int_sync() to be called more than once.

In the event of a fatal firmware error, we want to disable IRQ early
in the recovery sequence.  This change will allow it to be called
safely again as part of the normal shutdown sequence.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Add a new BNXT_STATE_NAPI_DISABLED flag to keep track of NAPI state.
Michael Chan [Mon, 25 Jan 2021 07:08:17 +0000 (02:08 -0500)]
bnxt_en: Add a new BNXT_STATE_NAPI_DISABLED flag to keep track of NAPI state.

Up until now, we didn't need to keep track of this state because NAPI
was always enabled once and disabled once during bring up and shutdown.
For better error recovery in subsequent patches, we want to quiesce
the device earlier during fatal error conditions.  The normal shutdown
sequence will disable NAPI again and the flag will prevent disabling
NAPI twice.
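
The idea of a state flag guarding against a double disable can be sketched as follows. This is a simplified, single-threaded user-space illustration: struct sketch_dev and both functions are hypothetical, and a real driver would presumably use an atomic bit operation on its state word rather than a plain bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state, not the driver's real structure. */
struct sketch_dev {
	bool napi_disabled;	/* stands in for BNXT_STATE_NAPI_DISABLED */
	int disable_calls;	/* how many times the real disable ran */
};

static void sketch_disable_napi(struct sketch_dev *dev)
{
	assert(dev != NULL);

	/* Already disabled by an earlier quiesce: nothing to do. */
	if (dev->napi_disabled)
		return;

	dev->napi_disabled = true;
	dev->disable_calls++;	/* placeholder for the actual disable */
}

static void sketch_enable_napi(struct sketch_dev *dev)
{
	assert(dev != NULL);
	dev->napi_disabled = false;
}
```

Calling sketch_disable_napi() from both an early error path and the normal shutdown path then performs the underlying disable only once.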

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Add bnxt_fw_reset_timeout() helper.
Michael Chan [Mon, 25 Jan 2021 07:08:16 +0000 (02:08 -0500)]
bnxt_en: Add bnxt_fw_reset_timeout() helper.

This code to check if we have reached the maximum wait time after
firmware reset is used multiple times.  Add a helper function to
do this.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Retry open if firmware is in reset.
Vasundhara Volam [Mon, 25 Jan 2021 07:08:15 +0000 (02:08 -0500)]
bnxt_en: Retry open if firmware is in reset.

Firmware may be in the middle of reset when the driver tries to do ifup.
In that case, firmware will return a special error code and the driver
will retry 10 times with 50 msecs delay after each retry.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: attempt to reinitialize after aborted reset
Edwin Peer [Mon, 25 Jan 2021 07:08:14 +0000 (02:08 -0500)]
bnxt_en: attempt to reinitialize after aborted reset

Drawing a hard line on aborted resets prevents a NIC open in
some scenarios that may otherwise be recoverable. For example,
if a firmware recovery happened while a PF was down and an
attempt was made to bring up an associated VF in this state,
then it was impossible to ever bring up this VF without a
rebind or reload of its driver.

Attempt to reinitialize the firmware when an aborted reset (or
failed init after a reset) is discovered during open - it may
succeed. Also take care to allow the user to retry opening the
NIC even after an aborted reset.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: log firmware debug notifications
Edwin Peer [Mon, 25 Jan 2021 07:08:13 +0000 (02:08 -0500)]
bnxt_en: log firmware debug notifications

Firmware is capable of generating asynchronous debug notifications.
The event data is opaque to the driver and is simply logged. Debug
notifications can be enabled by turning on hardware status messages
using the ethtool msglvl interface.

Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Add an upper bound for all firmware command timeouts.
Vasundhara Volam [Mon, 25 Jan 2021 07:08:12 +0000 (02:08 -0500)]
bnxt_en: Add an upper bound for all firmware command timeouts.

The timeout period for firmware messages is passed to the driver
from the firmware in the response of the first command.  This
timeout period is multiplied by a factor for certain long
running commands such as NVRAM commands.  In some cases, the
timeout period can become really long and it can cause hung task
warnings if firmware has crashed or is not responding.  To avoid
such long delays, cap all firmware commands to a max timeout value
of 40 seconds.
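
A minimal sketch of such a cap, assuming timeouts are tracked in milliseconds. The constant SKETCH_MAX_TIMEOUT_MS and the function capped_timeout_ms() are hypothetical names chosen for illustration, not the driver's identifiers.

```c
#include <assert.h>

/* 40 seconds, per the description above; the name is an assumption. */
#define SKETCH_MAX_TIMEOUT_MS 40000U

/* Scale a base timeout by a factor, then clamp it to the upper bound. */
static unsigned int capped_timeout_ms(unsigned int base_ms,
				      unsigned int factor)
{
	/* Widen before multiplying so a large factor cannot overflow. */
	unsigned long long scaled = (unsigned long long)base_ms * factor;

	assert(factor >= 1);

	if (scaled > SKETCH_MAX_TIMEOUT_MS)
		return SKETCH_MAX_TIMEOUT_MS;
	return (unsigned int)scaled;
}
```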

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Move reading VPD info after successful handshake with fw.
Vasundhara Volam [Mon, 25 Jan 2021 07:08:11 +0000 (02:08 -0500)]
bnxt_en: Move reading VPD info after successful handshake with fw.

If firmware is in reset or in bad state, it won't be able to return
VPD data.  Move bnxt_vpd_read_info() to after bnxt_fw_init_one_p1()
successfully returns.  By then we would have established proper
communications with the firmware.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Retry sending the first message to firmware if it is under reset.
Michael Chan [Mon, 25 Jan 2021 07:08:10 +0000 (02:08 -0500)]
bnxt_en: Retry sending the first message to firmware if it is under reset.

The first HWRM_VER_GET message to firmware during probe may time out if
firmware is under reset.  This can happen during hot-plug for example.
On P5 and newer chips, we can check if firmware is in the boot stage by
reading a status register.  Retry 5 times if the status register shows
that firmware is not ready and not in error state.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: handle CRASH_NO_MASTER during bnxt_open()
Edwin Peer [Mon, 25 Jan 2021 07:08:09 +0000 (02:08 -0500)]
bnxt_en: handle CRASH_NO_MASTER during bnxt_open()

Add missing support for handling NO_MASTER crashes while ports are
administratively down (ifdown). On some SoC platforms, the driver
needs to assist the firmware to recover from a crash via OP-TEE.
This is performed in a similar fashion to what is done during driver
probe.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Define macros for the various health register states.
Michael Chan [Mon, 25 Jan 2021 07:08:08 +0000 (02:08 -0500)]
bnxt_en: Define macros for the various health register states.

Define macros to check for the various states in the lower 16 bits of
the health register.  Replace the C code that checks for these values
with the newly defined macros.

Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobnxt_en: Update firmware interface to 1.10.2.11.
Michael Chan [Mon, 25 Jan 2021 07:08:07 +0000 (02:08 -0500)]
bnxt_en: Update firmware interface to 1.10.2.11.

Updates to backing store APIs, QoS profiles, and push buffer initial
index support.

Since the new HWRM_FUNC_BACKING_STORE_CFG message size has increased,
we need to add some compat. logic to fall back to the smaller legacy
size if firmware cannot accept the larger message size.  The new fields
added to the structure are not used yet.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'dsa-add-mt7530-gpio-support'
Jakub Kicinski [Tue, 26 Jan 2021 02:19:05 +0000 (18:19 -0800)]
Merge branch 'dsa-add-mt7530-gpio-support'

DENG Qingfang says:

====================
dsa: add MT7530 GPIO support

MT7530's LED controller can be used as GPIO controller.
Add support for it.
====================

Link: https://lore.kernel.org/r/20210125044322.6280-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: mt7530: MT7530 optional GPIO support
DENG Qingfang [Mon, 25 Jan 2021 04:43:22 +0000 (12:43 +0800)]
net: dsa: mt7530: MT7530 optional GPIO support

MT7530's LED controller can drive up to 15 LED/GPIOs.

Add support for GPIO control and allow users to use its GPIOs by
setting gpio-controller property in device tree.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodt-bindings: net: dsa: add MT7530 GPIO controller binding
DENG Qingfang [Mon, 25 Jan 2021 04:43:21 +0000 (12:43 +0800)]
dt-bindings: net: dsa: add MT7530 GPIO controller binding

Add device tree binding to support MT7530 GPIO controller.

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ethernet: mediatek: support setting MTU
DENG Qingfang [Mon, 25 Jan 2021 04:20:46 +0000 (12:20 +0800)]
net: ethernet: mediatek: support setting MTU

MT762x HW, except for MT7628, supports frame length up to 2048
(maximum length on GDM), so allow setting MTU up to 2030.

Also set the default frame length to the hardware default 1518.
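
One reading that is consistent with these numbers is that the MTU is the frame length minus the 14-byte Ethernet header and the 4-byte FCS. The sketch below works through that arithmetic; the names are hypothetical and the header/FCS breakdown is an assumption, not something stated in the commit.

```c
#include <assert.h>

enum {
	SKETCH_ETH_HDR_LEN = 14,	/* dst MAC + src MAC + EtherType */
	SKETCH_ETH_FCS_LEN = 4,		/* frame check sequence */
};

/* Largest MTU that fits in a given maximum frame length. */
static int max_mtu_for_frame_len(int max_frame_len)
{
	assert(max_frame_len > SKETCH_ETH_HDR_LEN + SKETCH_ETH_FCS_LEN);
	return max_frame_len - SKETCH_ETH_HDR_LEN - SKETCH_ETH_FCS_LEN;
}
```

Under that assumption, a 2048-byte frame limit gives an MTU of 2030, and the 1518-byte default corresponds to the usual 1500-byte MTU.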

Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210125042046.5599-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobridge: Use PTR_ERR_OR_ZERO instead if(IS_ERR(...)) + PTR_ERR
Jiapeng Zhong [Mon, 25 Jan 2021 02:39:41 +0000 (10:39 +0800)]
bridge: Use PTR_ERR_OR_ZERO instead if(IS_ERR(...)) + PTR_ERR

coccicheck suggested using PTR_ERR_OR_ZERO(), and the code bears that out.

Fix the following coccicheck warnings:

./net/bridge/br_multicast.c:1295:7-13: WARNING: PTR_ERR_OR_ZERO can be
used.
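
For readers unfamiliar with the pattern, the following user-space sketch mimics the error-pointer convention and shows why the open-coded if (IS_ERR(...)) + PTR_ERR form and the single helper are equivalent. All sketch_* names are stand-ins written for this illustration; they are not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *sketch_err_ptr(long err)
{
	assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* Error pointers occupy the top SKETCH_MAX_ERRNO addresses. */
static int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-SKETCH_MAX_ERRNO);
}

static long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Open-coded form that the warning is about. */
static int result_open_coded(const void *p)
{
	if (sketch_is_err(p))
		return (int)sketch_ptr_err(p);
	return 0;
}

/* Equivalent single-expression helper. */
static int sketch_ptr_err_or_zero(const void *p)
{
	return sketch_is_err(p) ? (int)sketch_ptr_err(p) : 0;
}
```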

Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Zhong <abaci-bugfix@linux.alibaba.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/1611542381-91178-1-git-send-email-abaci-bugfix@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoocteontx2-af: Support ESP/AH RSS hashing
Subbaraya Sundeep [Sat, 23 Jan 2021 05:09:12 +0000 (10:39 +0530)]
octeontx2-af: Support ESP/AH RSS hashing

Support hashing the SPI and sequence number fields of the
ESP/AH header for RSS. By default, ESP/AH fields are not
considered for RSS and need to be enabled explicitly as
below:
ethtool -U eth0 rx-flow-hash esp4 sdfn
or
ethtool -U eth0 rx-flow-hash ah4 sdfn
or
ethtool -U eth0 rx-flow-hash esp6 sdfn
or
ethtool -U eth0 rx-flow-hash ah6 sdfn

To disable hashing of ESP/AH fields:
ethtool -U eth0 rx-flow-hash esp4 sd
or
ethtool -U eth0 rx-flow-hash ah4 sd
or
ethtool -U eth0 rx-flow-hash esp6 sd
or
ethtool -U eth0 rx-flow-hash ah6 sd

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Link: https://lore.kernel.org/r/1611378552-13288-1-git-send-email-sundeep.lkml@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotg3: improve PCI VPD access
Heiner Kallweit [Fri, 22 Jan 2021 12:08:22 +0000 (13:08 +0100)]
tg3: improve PCI VPD access

When working on the PCI VPD code I also tested with a Broadcom BCM95719
card. tg3 uses internal NVRAM access with this card, so I forced it to
PCI VPD mode for testing. PCI VPD access fails
(i + PCI_VPD_LRDT_TAG_SIZE + j > len) because only TG3_NVM_VPD_LEN (256)
bytes are read, but PCI VPD has 400 bytes on this card.

So add a constant TG3_NVM_PCI_VPD_MAX_LEN that defines the maximum
PCI VPD size. The actual VPD size is returned by pci_read_vpd().
In addition it's not worth looping over pci_read_vpd(). If we miss the
125ms timeout per VPD dword read then definitely something is wrong,
and if the tg3 module loading is killed then there's also not much
benefit in retrying the VPD read.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/cb9e9113-0861-3904-87e0-d4c4ab3c8860@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-dsa-hellcreek-add-taprio-offloading'
Jakub Kicinski [Sun, 24 Jan 2021 05:25:17 +0000 (21:25 -0800)]
Merge branch 'net-dsa-hellcreek-add-taprio-offloading'

Kurt Kanzenbach says:

====================
net: dsa: hellcreek: Add TAPRIO offloading

The switch has support for the 802.1Qbv Time Aware Shaper (TAS). Traffic
schedules may be configured individually on each front port. Each port
has eight egress queues. The traffic is mapped to a traffic class
respectively via the PCP field of a VLAN tagged frame.

Previous attempts:
 * https://lkml.kernel.org/netdev/20201121115703.23221-1-kurt@linutronix.de/
 * https://lkml.kernel.org/netdev/20210116124922.32356-1-kurt@linutronix.de/
====================

Link: https://lore.kernel.org/r/20210123105633.16753-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: hellcreek: Add TAPRIO offloading support
Kurt Kanzenbach [Sat, 23 Jan 2021 10:56:33 +0000 (11:56 +0100)]
net: dsa: hellcreek: Add TAPRIO offloading support

The switch has support for the 802.1Qbv Time Aware Shaper (TAS). Traffic
schedules may be configured individually on each front port. Each port has eight
egress queues. The traffic is mapped to a traffic class respectively via the PCP
field of a VLAN tagged frame.

The TAPRIO Qdisc already implements that. Therefore, this interface can simply
be reused. Add .port_setup_tc() accordingly.

The activation of a schedule on a port is split into two parts:

 * Programming the necessary gate control list (GCL)
 * Setup delayed work for starting the schedule

The hardware supports starting a schedule up to eight seconds in the future. The
TAPRIO interface provides an absolute base time. Therefore, periodic delayed
work is leveraged to check whether a schedule may be started or not.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: mhi: Set wwan device type
Loic Poulain [Fri, 22 Jan 2021 15:15:54 +0000 (16:15 +0100)]
net: mhi: Set wwan device type

The 'wwan' devtype is meant for devices that require additional
configuration to be used, like WWAN specific APN setup over AT/QMI
commands, rmnet link creation, etc. This is the case for MHI (Modem
Host Interface) netdev which targets modem/WWAN endpoints.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/1611328554-1414-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'udp-allow-forwarding-of-plain-non-fraglisted-udp-gro-packets'
Jakub Kicinski [Sun, 24 Jan 2021 04:16:26 +0000 (20:16 -0800)]
Merge branch 'udp-allow-forwarding-of-plain-non-fraglisted-udp-gro-packets'

Alexander Lobakin says:

====================
udp: allow forwarding of plain (non-fraglisted) UDP GRO packets

This series allows forming UDP GRO packets in cases without sockets
(for forwarding). To not change the current datapath, this is
performed only when the new corresponding netdev feature is enabled
via Ethtool (and fraglisted GRO is disabled).
Prior to this point, only fraglisted UDP GRO was available. Plain UDP
GRO shows better forwarding performance when a target NIC is capable
of GSO UDP offload.

Since v3 [2]:
 - rename introduced netdev feature to reflect that it targets
   forwarding and don't touch fraglisted GRO at all (Willem de Bruijn).

Since v2 [1]:
 - convert to a series;
 - new: add new netdev_feature to explicitly enable/disable UDP GRO
   when there is no socket, defaults to off (Paolo Abeni).

Since v1 [0]:
 - drop redundant 'if (sk)' check (Alexander Duyck);
 - add a ref in the commit message to one more commit that was
   an important step for UDP GRO forwarding.

[0] https://lore.kernel.org/netdev/20210112211536.261172-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210113103232.4761-1-alobakin@pm.me
[2] https://lore.kernel.org/netdev/20210118193122.87271-1-alobakin@pm.me
====================

Link: https://lore.kernel.org/r/20210122181909.36340-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoudp: allow forwarding of plain (non-fraglisted) UDP GRO packets
Alexander Lobakin [Fri, 22 Jan 2021 18:20:02 +0000 (18:20 +0000)]
udp: allow forwarding of plain (non-fraglisted) UDP GRO packets

Commit 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") not only
added support for fraglisted UDP GRO, but also tweaked some logic
so that non-fraglisted UDP GRO started to work for forwarding
too.
Commit 2e4ef10f5850 ("net: add GSO UDP L4 and GSO fraglists to the
list of software-backed types") added GSO UDP L4 to the list of
software GSO to allow virtual netdevs to forward them as is up to
the real drivers.

Tests showed that forwarding and NATing of plain UDP GRO packets
currently work correctly, regardless of whether the target
netdevice supports hardware/driver GSO UDP L4 or not.
Add the last element and allow forming plain UDP GRO packets if
we are on the forwarding path and the new NETIF_F_GRO_UDP_FWD is
enabled on the receiving netdevice.

If both NETIF_F_GRO_FRAGLIST and NETIF_F_GRO_UDP_FWD are set,
fraglisted GRO takes precedence. This keeps the current behaviour
and is generally more optimal for now, as the number of NICs with
hardware USO offload is relatively small.
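
The precedence rule for the forwarding (no socket) case can be sketched as a small decision function. This is an illustration only: the SKETCH_* flag values are arbitrary bit positions rather than the kernel's feature bits, and sketch_udp_fwd_gro_mode() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions, not the kernel's real feature bits. */
#define SKETCH_F_GRO_FRAGLIST	(1u << 0)
#define SKETCH_F_GRO_UDP_FWD	(1u << 1)
#define SKETCH_F_ALL	(SKETCH_F_GRO_FRAGLIST | SKETCH_F_GRO_UDP_FWD)

enum sketch_gro_mode {
	SKETCH_GRO_NONE,	/* no UDP GRO on the forwarding path */
	SKETCH_GRO_FRAGLIST,	/* fraglisted UDP GRO */
	SKETCH_GRO_PLAIN,	/* plain (non-fraglisted) UDP GRO */
};

/* Pick the GRO flavour for a UDP packet that has no local socket. */
static enum sketch_gro_mode sketch_udp_fwd_gro_mode(uint32_t features)
{
	assert((features & ~SKETCH_F_ALL) == 0);

	/* Fraglisted GRO wins when both features are enabled. */
	if (features & SKETCH_F_GRO_FRAGLIST)
		return SKETCH_GRO_FRAGLIST;
	if (features & SKETCH_F_GRO_UDP_FWD)
		return SKETCH_GRO_PLAIN;
	return SKETCH_GRO_NONE;
}
```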

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: introduce a netdev feature for UDP GRO forwarding
Alexander Lobakin [Fri, 22 Jan 2021 18:19:48 +0000 (18:19 +0000)]
net: introduce a netdev feature for UDP GRO forwarding

Introduce a new netdev feature, NETIF_F_GRO_UDP_FWD, to allow the user
to turn UDP GRO on and off for forwarding.
Defaults to off so as not to change the current datapath.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'remove-unneeded-phy-time-stamping-option'
Jakub Kicinski [Sat, 23 Jan 2021 21:21:39 +0000 (13:21 -0800)]
Merge branch 'remove-unneeded-phy-time-stamping-option'

Richard Cochran says:

====================
Remove unneeded PHY time stamping option.

The NETWORK_PHY_TIMESTAMPING configuration option adds additional
checks into the networking hot path, and it is only needed by two
rather esoteric devices, namely the TI DP83640 PHYTER and the ZHAW
InES 1588 IP core.  Very few end users have these devices, and those
that do have them are building specialized embedded systems.

Unfortunately two unrelated drivers depend on this option, and two
defconfigs enable it.  It is probably my fault for not paying enough
attention in reviews.

This series corrects the gratuitous use of NETWORK_PHY_TIMESTAMPING.
====================

Link: https://lore.kernel.org/r/cover.1611198584.git.richardcochran@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: mvpp2: Remove unneeded Kconfig dependency.
Richard Cochran [Thu, 21 Jan 2021 04:06:01 +0000 (20:06 -0800)]
net: mvpp2: Remove unneeded Kconfig dependency.

The mvpp2 is an Ethernet driver, and it implements MAC style time
stamping of PTP frames.  It has no need of the expensive option to
enable PHY time stamping.  Remove the incorrect dependency.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: mv88e6xxx: Remove bogus Kconfig dependency.
Richard Cochran [Thu, 21 Jan 2021 04:06:00 +0000 (20:06 -0800)]
net: dsa: mv88e6xxx: Remove bogus Kconfig dependency.

The mv88e6xxx is a DSA driver, and it implements DSA style time
stamping of PTP frames.  It has no need of the expensive option to
enable PHY time stamping.  Remove the bogus dependency.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Brandon Streiff <brandon.streiff@ni.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-ipa-napi-poll-updates'
Jakub Kicinski [Sat, 23 Jan 2021 21:16:02 +0000 (13:16 -0800)]
Merge branch 'net-ipa-napi-poll-updates'

Alex Elder says:

====================
net: ipa: NAPI poll updates

While reviewing the IPA NAPI polling code in detail I found two
problems.  This series fixes those, and implements a few other
improvements to this part of the code.

The first two patches are minor bug fixes that avoid extra passes
through the poll function.  The third simplifies code inside the
polling loop a bit.

The last two update how interrupts are disabled; previously it was
possible for another I/O completion condition to be recorded before
NAPI got scheduled.
====================

Link: https://lore.kernel.org/r/20210121114821.26495-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: disable IEOB interrupts before clearing
Alex Elder [Thu, 21 Jan 2021 11:48:21 +0000 (05:48 -0600)]
net: ipa: disable IEOB interrupts before clearing

Currently in gsi_isr_ieob(), event ring IEOB interrupts are disabled
one at a time.  The loop disables the IEOB interrupt for all event
rings represented in the event mask.  Instead, just disable them all
at once.

Disable them all *before* clearing the interrupt condition.  This
guarantees we'll schedule NAPI for each event once, before another
IEOB interrupt could be signaled.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: repurpose gsi_irq_ieob_disable()
Alex Elder [Thu, 21 Jan 2021 11:48:20 +0000 (05:48 -0600)]
net: ipa: repurpose gsi_irq_ieob_disable()

Rename gsi_irq_ieob_disable() to be gsi_irq_ieob_disable_one().

Introduce a new function gsi_irq_ieob_disable() that takes a mask of
events to disable rather than a single event id.  This will be used
in the next patch.

Rename gsi_irq_ieob_enable() to be gsi_irq_ieob_enable_one() to be
consistent.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: have gsi_channel_update() return a value
Alex Elder [Thu, 21 Jan 2021 11:48:19 +0000 (05:48 -0600)]
net: ipa: have gsi_channel_update() return a value

Have gsi_channel_update() return the first transaction in the
updated completed transaction list, or NULL if no new transactions
have been added.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: heed napi_complete() return value
Alex Elder [Thu, 21 Jan 2021 11:48:18 +0000 (05:48 -0600)]
net: ipa: heed napi_complete() return value

Pay attention to the return value of napi_complete(), completing
polling only if it returns true.

Just use napi rather than &channel->napi as the argument passed to
napi_complete().
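
As a rough illustration, the tail of a poll function that heeds the return value might look like the sketch below. Everything here is a hypothetical user-space model: struct fake_napi and the fake_* functions are invented for this note, and treating "completing polling" as re-enabling the interrupt is an assumption rather than something stated in the commit message.

```c
#include <stdbool.h>

/* Hypothetical model of the state a poll function cares about. */
struct fake_napi {
    bool repoll_requested;  /* more work arrived while polling */
    bool irq_enabled;       /* interrupt re-enabled, polling finished */
};

/* Stand-in for napi_complete(): false means polling must continue. */
static bool fake_napi_complete(struct fake_napi *napi)
{
    return !napi->repoll_requested;
}

/* Tail of a poll function, given the work done and the budget. */
static void fake_poll_finish(struct fake_napi *napi, int count, int budget)
{
    /* Finish only if the budget was not exhausted AND completion succeeded. */
    if (count < budget && fake_napi_complete(napi))
        napi->irq_enabled = true;
}
```

The point of the check is that the interrupt stays disabled whenever the completion call reports that another poll pass is still needed.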

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: count actual work done in gsi_channel_poll()
Alex Elder [Thu, 21 Jan 2021 11:48:17 +0000 (05:48 -0600)]
net: ipa: count actual work done in gsi_channel_poll()

There is an off-by-one problem in gsi_channel_poll().  The count of
transactions completed is incremented each time through the loop
*before* determining whether there is any more work to do.  As a
result, if we exit the loop early, the counter's value is one more
than the number of transactions actually processed.

Instead, increment the count after processing, to ensure it reflects
the number of processed transactions.  The result is more naturally
described as a for loop rather than a while loop, so change that.
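
The off-by-one can be reproduced with a small stand-alone model. This is a sketch under stated assumptions: struct fake_channel and fake_poll_one() are hypothetical stand-ins for the channel and its per-transaction poll helper, and the two functions only mirror the loop shapes described above.

```c
/* Hypothetical stand-in for a channel's completed-transaction queue. */
struct fake_channel {
    int pending;    /* completed transactions waiting to be processed */
};

/* Take one completed transaction; returns 0 when none remain. */
static int fake_poll_one(struct fake_channel *ch)
{
    if (ch->pending == 0)
        return 0;
    ch->pending--;
    return 1;
}

/* Old shape: the counter is bumped before checking for more work. */
static int poll_count_before(struct fake_channel *ch, int budget)
{
    int count = 0;

    while (count < budget) {
        count++;
        if (!fake_poll_one(ch))
            break;  /* early exit: count is now one too high */
    }
    return count;
}

/* New shape: the counter is bumped only after a transaction is processed. */
static int poll_count_after(struct fake_channel *ch, int budget)
{
    int count;

    for (count = 0; count < budget; count++) {
        if (!fake_poll_one(ch))
            break;
    }
    return count;
}
```

With three pending transactions and a budget of 64, the first function reports 4 while the second reports 3. When the budget is exhausted before the queue is empty, both report the same value.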

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'mlxsw-expose-number-of-physical-ports'
Jakub Kicinski [Sat, 23 Jan 2021 04:42:15 +0000 (20:42 -0800)]
Merge branch 'mlxsw-expose-number-of-physical-ports'

Ido Schimmel says:

====================
mlxsw: Expose number of physical ports

The switch ASIC has a limited capacity of physical ports that it can
support. While each system is brought up with a different number of
ports, this number can be increased via splitting up to the ASIC's
limit.

Expose physical ports as a devlink resource so that user space will have
visibility into the maximum number of ports that can be supported and
the current occupancy. With this resource it is possible, for example,
to write generic (i.e., not platform dependent) tests for port
splitting.

Patch #1 adds the new resource and patch #2 adds a selftest.

v2:
* Add the physical ports resource as a generic devlink resource so that
  it could be re-used by other device drivers
====================

Link: https://lore.kernel.org/r/20210121131024.2656154-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests: mlxsw: Add a scale test for physical ports
Danielle Ratson [Thu, 21 Jan 2021 13:10:24 +0000 (15:10 +0200)]
selftests: mlxsw: Add a scale test for physical ports

Query the maximum number of supported physical ports using devlink-resource
and test that this number can be reached by splitting each of the
splittable ports to its width. Test that an error is returned in case
the maximum number is exceeded.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomlxsw: Register physical ports as a devlink resource
Danielle Ratson [Thu, 21 Jan 2021 13:10:23 +0000 (15:10 +0200)]
mlxsw: Register physical ports as a devlink resource

The switch ASIC has a limited capacity of physical ('flavour physical'
in devlink terminology) ports that it can support. While each system is
brought up with a different number of ports, this number can be
increased via splitting up to the ASIC's limit.

Expose physical ports as a devlink resource so that user space will have
visibility into the maximum number of ports that can be supported and the
current occupancy.

In addition, add a "Generic Resources" section in devlink-resource
documentation so the different drivers will be aligned by the same resource
name when exposing to user space.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'htb-offload'
Jakub Kicinski [Sat, 23 Jan 2021 04:41:31 +0000 (20:41 -0800)]
Merge branch 'htb-offload'

Maxim Mikityanskiy says:

====================
HTB offload

This series adds support for HTB offload to the HTB qdisc, and uses it
in the mlx5 driver.

The previous RFCs are available at [1], [2].

The feature is intended to solve the performance bottleneck caused by
the single lock of the HTB qdisc, which prevents it from scaling well.
The HTB algorithm itself is offloaded to the device, eliminating the
need to take the root lock of HTB on every packet. The classification part
is done in clsact (still in software) to avoid acquiring the lock, which
imposes a limitation that filters can target only leaf classes.

The speedup on Mellanox ConnectX-6 Dx was 14.2 times in the UDP
multi-stream test, compared to software HTB implementation (more details
in the mlx5 patch).

[1]: https://www.spinics.net/lists/netdev/msg628422.html
[2]: https://www.spinics.net/lists/netdev/msg663548.html

v2 changes:

Fixed sparse and smatch warnings. Formatted HTB patches to 80 chars per
line.

v3 changes:

Fixed the CI failure on parisc with 16-bit xchg by replacing it with
WRITE_ONCE. Fixed the capability bits in mlx5_ifc.h and the value of
MLX5E_QOS_MAX_LEAF_NODES.

v4 changes:

Check if HTB is root when offloading. Add extack for hardware errors.
Rephrase explanations of how it works in the commit message. Remove %hu
from format strings. Add resiliency when leaf_del_last fails to create a
new leaf node.
====================

Link: https://lore.kernel.org/r/20210119120815.463334-1-maximmi@mellanox.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet/mlx5e: Support HTB offload
Maxim Mikityanskiy [Tue, 19 Jan 2021 12:08:15 +0000 (14:08 +0200)]
net/mlx5e: Support HTB offload

This commit adds support for HTB offload in the mlx5e driver.

Performance:

  NIC: Mellanox ConnectX-6 Dx
  CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (24 cores with HT)

  100 Gbit/s line rate, 500 UDP streams @ ~200 Mbit/s each
  48 traffic classes, flower used for steering
  No shaping (rate limits set to 4 Gbit/s per TC) - checking for max
  throughput.

  Baseline: 98.7 Gbps, 8.25 Mpps
  HTB: 6.7 Gbps, 0.56 Mpps
  HTB offload: 95.6 Gbps, 8.00 Mpps
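
The figures above can be cross-checked with a little arithmetic. The helpers below are illustrative only; their names are made up for this note and are not part of the patch.

```c
/* Ratio of two throughput figures, e.g. offloaded vs. software HTB. */
static double speedup(double offload_gbps, double software_gbps)
{
    return offload_gbps / software_gbps;
}

/* Total offered load in Gbit/s for a number of equal-rate streams. */
static double offered_load_gbps(unsigned int streams, double mbps_each)
{
    return streams * mbps_each / 1000.0;
}
```

With the numbers reported here, speedup(95.6, 6.7) comes out at about 14.27, in line with the 14.2 times speedup quoted in the series cover letter, and offered_load_gbps(500, 200.0) gives the 100 Gbit/s line rate.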

Limitations:

1. 256 leaf nodes, 3 levels of depth.

2. Granularity for ceil is 1 Mbit/s. Rates are converted to weights, and
the bandwidth is split among the siblings according to these weights.
Other parameters for classes are not supported.

Ethtool statistics support for QoS SQs is also added. The counters are
called qos_txN_*, where N is the QoS queue number (starting from 0, the
numeration is separate from the normal SQs), and * is the counter name
(the counters are the same as for the normal SQs).

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agosch_htb: Stats for offloaded HTB
Maxim Mikityanskiy [Tue, 19 Jan 2021 12:08:14 +0000 (14:08 +0200)]
sch_htb: Stats for offloaded HTB

This commit adds support for statistics of offloaded HTB. Byte and
packet counters for leaf and inner nodes are supported. The values are
taken from per-queue qdiscs, and the numbers that the user sees should
behave the same as with the software (non-offloaded) HTB.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agosch_htb: Hierarchical QoS hardware offload
Maxim Mikityanskiy [Tue, 19 Jan 2021 12:08:13 +0000 (14:08 +0200)]
sch_htb: Hierarchical QoS hardware offload

HTB doesn't scale well because of contention on a single lock, and it
also consumes CPU. This patch adds support for offloading HTB to
hardware that supports hierarchical rate limiting.

In the offload mode, HTB passes control commands to the driver using
ndo_setup_tc. The driver has to replicate the whole hierarchy of classes
and their settings (rate, ceil) in the NIC. Every modification of the
HTB tree caused by the admin results in ndo_setup_tc being called.

After this setup, the HTB algorithm is done completely in the NIC. An SQ
(send queue) is created for every leaf class and attached to the
hierarchy, so that the NIC can calculate and obey aggregated rate
limits, too. In the future, this can be changed so that multiple SQs
back a single leaf class.

ndo_select_queue is responsible for selecting the right queue that
serves the traffic class of each packet.

The data path works as follows: a packet is classified by clsact, the
driver selects a hardware queue according to its class, and the packet
is enqueued into this queue's qdisc.

This solution addresses two main problems of scaling HTB:

1. Contention by flow classification. Currently the filters are attached
to the HTB instance as follows:

    # tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80 \
        classid 1:10

It's possible to move classification to clsact egress hook, which is
thread-safe and lock-free:

    # tc filter add dev eth0 egress protocol ip flower dst_port 80 \
        action skbedit priority 1:10

This way classification still happens in software, but the lock
contention is eliminated, and it happens before selecting the TX queue,
allowing the driver to translate the class to the corresponding hardware
queue in ndo_select_queue.

Note that this is already compatible with non-offloaded HTB and doesn't
require changes to the kernel or iproute2.

2. Contention by handling packets. HTB is not multi-queue: it attaches
to a whole net device, and handling of all packets takes the same lock.
When HTB is offloaded, it registers itself as a multi-queue qdisc,
similarly to mq: HTB is attached to the netdev, and each queue has its
own qdisc.

Some features of HTB may not be supported by particular hardware: for
example, the maximum number of classes may be limited, or the granularity
of the rate and ceil parameters may differ. For that reason the offload is
not enabled by default; a new parameter is used to enable it:

    # tc qdisc replace dev eth0 root handle 1: htb offload

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: sched: Add extack to Qdisc_class_ops.delete
Maxim Mikityanskiy [Tue, 19 Jan 2021 12:08:12 +0000 (14:08 +0200)]
net: sched: Add extack to Qdisc_class_ops.delete

In a following commit, sch_htb will start using extack in the delete
class operation to pass hardware errors in offload mode. This commit
prepares for that by adding the extack parameter to this callback and
converting usage of the existing qdiscs.

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: sched: Add multi-queue support to sch_tree_lock
Maxim Mikityanskiy [Tue, 19 Jan 2021 12:08:11 +0000 (14:08 +0200)]
net: sched: Add multi-queue support to sch_tree_lock

The existing qdiscs that set TCQ_F_MQROOT don't use sch_tree_lock.
However, hardware-offloaded HTB will start setting this flag while also
using sch_tree_lock.

The current implementation of sch_tree_lock basically locks on
qdisc->dev_queue->qdisc, and it works fine when the tree is attached to
some queue. However, it's not the case for MQROOT qdiscs: such a qdisc
is the root itself, and its dev_queue just points to queue 0, while not
actually being used, because there are real per-queue qdiscs.

This patch changes the logic of sch_tree_lock and sch_tree_unlock to
lock the qdisc itself if it's the MQROOT.
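
The lock selection described above can be sketched as a simplified user-space model. The fake_* types and FAKE_TCQ_F_MQROOT are hypothetical stand-ins for the kernel's structures and flag, and a boolean replaces the real spinlock; this is not the kernel implementation.

```c
#include <stdbool.h>

#define FAKE_TCQ_F_MQROOT 0x1   /* stand-in for the kernel's TCQ_F_MQROOT */

struct fake_qdisc;

struct fake_netdev_queue {
    struct fake_qdisc *qdisc;   /* qdisc attached to this queue */
};

struct fake_qdisc {
    unsigned int flags;
    struct fake_netdev_queue *dev_queue;
    bool locked;                /* stand-in for the qdisc spinlock */
};

/* Pick the qdisc whose lock protects the tree that q belongs to. */
static struct fake_qdisc *fake_tree_lock_target(struct fake_qdisc *q)
{
    if (q->flags & FAKE_TCQ_F_MQROOT)
        return q;               /* an MQROOT qdisc is the root itself */
    return q->dev_queue->qdisc; /* otherwise use the queue's qdisc */
}

static void fake_sch_tree_lock(struct fake_qdisc *q)
{
    fake_tree_lock_target(q)->locked = true;
}

static void fake_sch_tree_unlock(struct fake_qdisc *q)
{
    fake_tree_lock_target(q)->locked = false;
}
```

In this model, locking an MQROOT qdisc whose dev_queue points to queue 0 leaves the per-queue qdisc on queue 0 untouched, while locking an ordinary qdisc still takes the lock of the qdisc attached to its queue.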

Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'tcp-add-cmsg-rx-timestamps-to-rx-zerocopy'
Jakub Kicinski [Sat, 23 Jan 2021 04:05:58 +0000 (20:05 -0800)]
Merge branch 'tcp-add-cmsg-rx-timestamps-to-rx-zerocopy'

Arjun Roy says:

====================
tcp: add CMSG+rx timestamps to rx. zerocopy

Provide CMSG and receive timestamp support to TCP
receive zerocopy. Patch 1 refactors CMSG pending state for
tcp_recvmsg() to avoid the use of magic numbers; patch 2 implements
receive timestamp via CMSG support for receive zerocopy, and uses the
constants added in patch 1.
====================

Link: https://lore.kernel.org/r/20210121004148.2340206-1-arjunroy.kdev@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotcp: Add receive timestamp support for receive zerocopy.
Arjun Roy [Thu, 21 Jan 2021 00:41:48 +0000 (16:41 -0800)]
tcp: Add receive timestamp support for receive zerocopy.

tcp_recvmsg() uses the CMSG mechanism to receive control information
like packet receive timestamps. This patch adds CMSG fields to
struct tcp_zerocopy_receive, and provides receive timestamps
if available to the user.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>