linux-2.6-microblaze.git
4 years ago net/ipv6: factor out MCAST_MSFILTER getsockopt helpers
Christoph Hellwig [Fri, 17 Jul 2020 06:23:27 +0000 (08:23 +0200)]
net/ipv6: factor out MCAST_MSFILTER getsockopt helpers

Factor out one helper each for getting the native and compat
version of the MCAST_MSFILTER option.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net/ipv4: remove compat_ip_{get,set}sockopt
Christoph Hellwig [Fri, 17 Jul 2020 06:23:26 +0000 (08:23 +0200)]
net/ipv4: remove compat_ip_{get,set}sockopt

Handle the few cases that need special treatment in-line using
in_compat_syscall().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net/ipv4: factor out mcast join/leave setsockopt helpers
Christoph Hellwig [Fri, 17 Jul 2020 06:23:25 +0000 (08:23 +0200)]
net/ipv4: factor out mcast join/leave setsockopt helpers

Factor out one helper each for setting the native and compat
version of the multicast join/leave options.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net/ipv4: factor out MCAST_MSFILTER setsockopt helpers
Christoph Hellwig [Fri, 17 Jul 2020 06:23:24 +0000 (08:23 +0200)]
net/ipv4: factor out MCAST_MSFILTER setsockopt helpers

Factor out one helper each for setting the native and compat
version of the MCAST_MSFILTER option.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net/ipv4: factor out MCAST_MSFILTER getsockopt helpers
Christoph Hellwig [Fri, 17 Jul 2020 06:23:23 +0000 (08:23 +0200)]
net/ipv4: factor out MCAST_MSFILTER getsockopt helpers

Factor out one helper each for getting the native and compat
version of the MCAST_MSFILTER option.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago netfilter: split nf_sockopt
Christoph Hellwig [Fri, 17 Jul 2020 06:23:22 +0000 (08:23 +0200)]
netfilter: split nf_sockopt

Split nf_sockopt into a getsockopt and setsockopt side as they share
very little code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago netfilter: remove the compat argument to xt_copy_counters_from_user
Christoph Hellwig [Fri, 17 Jul 2020 06:23:21 +0000 (08:23 +0200)]
netfilter: remove the compat argument to xt_copy_counters_from_user

Move the in_compat_syscall() check from the callers into the function
instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago netfilter: remove the compat_{get,set} methods
Christoph Hellwig [Fri, 17 Jul 2020 06:23:20 +0000 (08:23 +0200)]
netfilter: remove the compat_{get,set} methods

All instances handle compat sockopts via in_compat_syscall() now, so
remove the compat_{get,set} methods as well as the
compat_nf_{get,set}sockopt wrappers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago netfilter/ebtables: clean up compat {get, set}sockopt handling
Christoph Hellwig [Fri, 17 Jul 2020 06:23:19 +0000 (08:23 +0200)]
netfilter/ebtables: clean up compat {get, set}sockopt handling

Merge the native and compat {get,set}sockopt handlers using
in_compat_syscall().  Note that this required moving a fair
amount of code around to be done sanely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago netfilter/ip6_tables: clean up compat {get, set}sockopt handling
Christoph Hellwig [Fri, 17 Jul 2020 06:23:18 +0000 (08:23 +0200)]
netfilter/ip6_tables: clean up compat {get, set}sockopt handling

Merge the native and compat {get,set}sockopt handlers using
in_compat_syscall().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago netfilter/ip_tables: clean up compat {get,set}sockopt handling
Christoph Hellwig [Fri, 17 Jul 2020 06:23:17 +0000 (08:23 +0200)]
netfilter/ip_tables: clean up compat {get,set}sockopt handling

Merge the native and compat {get,set}sockopt handlers using
in_compat_syscall().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago netfilter/arp_tables: clean up compat {get, set}sockopt handling
Christoph Hellwig [Fri, 17 Jul 2020 06:23:16 +0000 (08:23 +0200)]
netfilter/arp_tables: clean up compat {get, set}sockopt handling

Merge the native and compat {get,set}sockopt handlers using
in_compat_syscall().

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: remove compat_sys_{get,set}sockopt
Christoph Hellwig [Fri, 17 Jul 2020 06:23:15 +0000 (08:23 +0200)]
net: remove compat_sys_{get,set}sockopt

Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.

This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.

It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong in
a consolidation patch like this one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: remove compat_sock_common_{get,set}sockopt
Christoph Hellwig [Fri, 17 Jul 2020 06:23:14 +0000 (08:23 +0200)]
net: remove compat_sock_common_{get,set}sockopt

Add the compat handling to sock_common_{get,set}sockopt instead,
keyed off in_compat_syscall().  This allows removing the now unused
->compat_{get,set}sockopt methods from struct proto_ops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: simplify cBPF setsockopt compat handling
Christoph Hellwig [Fri, 17 Jul 2020 06:23:13 +0000 (08:23 +0200)]
net: simplify cBPF setsockopt compat handling

Add a helper that copies either a native or compat bpf_fprog from
userspace after verifying the length, and remove the compat setsockopt
handlers that now aren't required.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
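
As an illustration only, the helper described above can be modelled in
userspace roughly as follows. This is a sketch under stated assumptions, not
the kernel code: the struct layouts, the model_* names and the "compat" flag
(standing in for the result of in_compat_syscall()) are all hypothetical, and
memcpy() stands in for the copy from userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the native and the 32-bit compat layout. */
struct model_fprog {
	uint16_t len;		/* number of filter instructions */
	void *filter;		/* native pointer to the instructions */
};

struct model_compat_fprog {
	uint16_t len;
	uint32_t filter;	/* 32-bit user pointer */
};

/*
 * Copy a filter program description from "src" into "dst".
 * "compat" models the result of in_compat_syscall(); "optlen" is the
 * length the caller passed to setsockopt().
 * Returns 0 on success or -EINVAL if the length does not fit the ABI.
 */
static int model_copy_fprog(struct model_fprog *dst, const void *src,
			    size_t optlen, bool compat)
{
	assert(dst && src);

	if (compat) {
		struct model_compat_fprog f32;

		/* Verify the length before touching the source buffer. */
		if (optlen != sizeof(f32))
			return -EINVAL;
		memcpy(&f32, src, sizeof(f32));
		memset(dst, 0, sizeof(*dst));
		dst->len = f32.len;
		/* Widen the 32-bit user pointer to a native pointer. */
		dst->filter = (void *)(uintptr_t)f32.filter;
		return 0;
	}

	if (optlen != sizeof(*dst))
		return -EINVAL;
	memcpy(dst, src, sizeof(*dst));
	return 0;
}
```

With a single helper of this shape, a setsockopt handler no longer needs a
separate compat variant just to translate the structure.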
4 years ago net: streamline __sys_getsockopt
Christoph Hellwig [Fri, 17 Jul 2020 06:23:12 +0000 (08:23 +0200)]
net: streamline __sys_getsockopt

Return early when sockfd_lookup_light fails to reduce a level of
indentation for most of the function body.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
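
The early-return shape described above, as a minimal userspace sketch. The
socket table and the model_* names are hypothetical; model_lookup() merely
stands in for the fd lookup and is not the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical socket table standing in for the fd lookup. */
struct model_sock {
	int opt;
};

static struct model_sock model_socks[2] = { { 10 }, { 20 } };

/* Returns the socket for "fd", or NULL with *err set on failure. */
static struct model_sock *model_lookup(int fd, int *err)
{
	if (fd < 0 || fd >= 2) {
		*err = -EBADF;
		return NULL;
	}
	return &model_socks[fd];
}

static int model_getsockopt(int fd, int *val)
{
	struct model_sock *sock;
	int err = 0;

	assert(val);

	sock = model_lookup(fd, &err);
	if (!sock)
		return err;	/* bail out early on lookup failure */

	/* The main body now sits at the top indentation level. */
	*val = sock->opt;
	return 0;
}
```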
4 years ago net: streamline __sys_setsockopt
Christoph Hellwig [Fri, 17 Jul 2020 06:23:11 +0000 (08:23 +0200)]
net: streamline __sys_setsockopt

Return early when sockfd_lookup_light fails to reduce a level of
indentation for most of the function body.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net/atm: remove the atmdev_ops {get, set}sockopt methods
Christoph Hellwig [Fri, 17 Jul 2020 06:23:10 +0000 (08:23 +0200)]
net/atm: remove the atmdev_ops {get, set}sockopt methods

All implementations of these two methods are dummies that always
return -EINVAL.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: rds: rdma_transport.h: delete duplicated word
Randy Dunlap [Sun, 19 Jul 2020 18:08:24 +0000 (11:08 -0700)]
net: rds: rdma_transport.h: delete duplicated word

Delete the doubled word "be" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: atm: lec_arpc.h: delete duplicated word
Randy Dunlap [Sun, 19 Jul 2020 18:08:01 +0000 (11:08 -0700)]
net: atm: lec_arpc.h: delete duplicated word

Delete the doubled word "the" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: phy: at803x: add mdix configuration support for AR9331 and AR8035
Oleksij Rempel [Sun, 19 Jul 2020 08:05:30 +0000 (10:05 +0200)]
net: phy: at803x: add mdix configuration support for AR9331 and AR8035

This patch adds MDIX configuration ability for AR9331 and AR8035. Theoretically
it should work on other Atheros PHYs, but I was able to test only these
two.

Since I have no certified reference HW able to detect or configure MDIX, this
functionality was confirmed by oscilloscope.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago Merge branch 'net-enetc-remove-bootloader-dependency'
David S. Miller [Mon, 20 Jul 2020 01:05:49 +0000 (18:05 -0700)]
Merge branch 'net-enetc-remove-bootloader-dependency'

Michael Walle says:

====================
net: enetc: remove bootloader dependency

These patches were picked from the following series:
https://lore.kernel.org/netdev/1567779344-30965-1-git-send-email-claudiu.manoil@nxp.com/
They have never been resent. I've picked them up, addressed Andrew's
comments, fixed some more bugs and asked Claudiu if I can keep their SOB
tags; he agreed. I've tested this on our board which happens to have a
bootloader which doesn't do the enetc setup in all cases. Though, only
SGMII mode was tested.

changes since v6:
 - dropped _LPA_ infix for USXGMII constants

changes since v5:
 - fixed pcs->autoneg_complete and pcs->link assignment. Thanks Vladimir.

changes since v4:
 - moved (and renamed) the USXGMII constants to include/uapi/linux/mdio.h.
   Suggested by Russell King.

changes since v3:
 - rebased to latest net-next where devm_mdiobus_free() was removed.
   Replaced it with mdiobus_free(). The internal MDIO bus is optional; if
   there is any error, we try to run with the bootloader default PCS
   settings, thus in the error case, we need to free the mdiobus.

changes since v2:
 - removed SOBs from "net: enetc: Initialize SerDes for SGMII and USXGMII
   protocols" because almost everything has changed.
 - get a phy_device for the internal PCS PHY so we can use the phy_
   functions instead of raw mdiobus writes
 - reuse macros already defined in fsl_mdio.h, move missing bits from
   felix to fsl_mdio.h, because they share the same PCS PHY building
   block
 - added 2500BaseX mode (based on felix init routine)
 - changed xgmii mode to usxgmii mode, because it is actually USXGMII and
   felix does the same.
 - fixed devad, which is 0x1f (MMD_VEND2)

changes since v1:
 - mdiobus id is '"imdio-%s", dev_name(dev)' because the plain dev_name()
   is used by the emdio.
 - use mdiobus_write() instead of imdio->write(imdio, ..), since this is
   already a full featured mdiobus
 - set phy_mask to ~0 to avoid scanning the bus
 - use phy_interface_mode_is_rgmii(phy_mode) to also include the RGMII
   modes with pad delays.
 - move enetc_imdio_init() to enetc_pf.c, there shouldn't be any other
   users, should it?
 - renamed serdes to SerDes
 - printing the error code of mdiobus_register() in the error path
 - call mdiobus_unregister() on _remove()
 - call devm_mdiobus_free() if mdiobus_register() fails, since an
   error is not fatal
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
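
For illustration, the bus naming mentioned in the v1 changelog above could
look roughly like this in userspace C. The buffer size and the model_* name
are assumptions made for this sketch; only the "imdio-%s" format comes from
the changelog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_BUS_ID_SIZE 61	/* assumed buffer size, illustration only */

/*
 * Build a bus id of the form "imdio-<device name>" so that it cannot
 * collide with a bus that uses the plain device name.
 * Returns 0 on success, -1 if the id did not fit into the buffer.
 */
static int model_imdio_bus_id(char *id, size_t size, const char *dev_name)
{
	int n;

	assert(id && dev_name && size > 0);

	n = snprintf(id, size, "imdio-%s", dev_name);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```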
4 years ago net: enetc: Use DT protocol information to set up the ports
Alex Marginean [Sun, 19 Jul 2020 22:03:36 +0000 (00:03 +0200)]
net: enetc: Use DT protocol information to set up the ports

Use DT information rather than in-band information from bootloader to
set up MAC for XGMII. For RGMII use the DT indication in addition to
RGMII defaults in hardware.
However, this implies that PHY connection information needs to be
extracted before netdevice creation, when the ENETC Port MAC is
being configured.

Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: enetc: Initialize SerDes for SGMII and USXGMII protocols
Michael Walle [Sun, 19 Jul 2020 22:03:35 +0000 (00:03 +0200)]
net: enetc: Initialize SerDes for SGMII and USXGMII protocols

ENETC has ethernet MACs capable of SGMII, 2500BaseX and USXGMII. But in
order to use these protocols some SerDes configurations need to be
performed. The SerDes is configurable via an internal PCS PHY which is
connected to an internal MDIO bus at address 0.

This patch basically removes the dependency on bootloader regarding
SerDes initialization.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: dsa: felix: (re)use already existing constants
Michael Walle [Sun, 19 Jul 2020 22:03:34 +0000 (00:03 +0200)]
net: dsa: felix: (re)use already existing constants

Now that there are USXGMII constants available, drop the old definitions
and reuse the generic ones.

Signed-off-by: Michael Walle <michael@walle.cc>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: phy: add USXGMII link partner ability constants
Michael Walle [Sun, 19 Jul 2020 22:03:33 +0000 (00:03 +0200)]
net: phy: add USXGMII link partner ability constants

The constants are taken from the USXGMII Singleport Copper Interface
specification. The naming is based on the SGMII ones, but with an MDIO_
prefix.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago ne2k-pci: Use netif_msg_init to initialize msg_enable bits
Armin Wolf [Fri, 17 Jul 2020 18:21:48 +0000 (20:21 +0200)]
ne2k-pci: Use netif_msg_init to initialize msg_enable bits

Use netif_msg_init() to process param settings.

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago Merge branch 'net-atlantic-add-support-for-FW-4-x'
David S. Miller [Sat, 18 Jul 2020 02:00:54 +0000 (19:00 -0700)]
Merge branch 'net-atlantic-add-support-for-FW-4-x'

Mark Starovoytov says:

====================
net: atlantic: add support for FW 4.x

This patch set adds support for FW 4.x, which is about to go into
production for some products.
4.x is mostly compatible with 3.x, save for soft reset, which requires
the acquisition of 2 additional semaphores.
Other differences (e.g. absence of PTP support) are handled via
capabilities.

Note: 4.x targets specific products only. 3.x is still the main firmware
branch, which should be used by most users (at least for now).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: atlantic: add support for FW 4.x
Dmitry Bogdanov [Fri, 17 Jul 2020 18:01:47 +0000 (21:01 +0300)]
net: atlantic: add support for FW 4.x

This patch adds support for FW 4.x, which is about to go into
production for some products.
4.x is mostly compatible with 3.x, save for soft reset, which requires
the acquisition of 2 additional semaphores.
Other differences (e.g. absence of PTP support) are handled via
capabilities.

Note: 4.x targets specific products only. 3.x is still the main firmware
branch, which should be used by most users (at least for now).

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: atlantic: align return value of ver_match function with function name
Mark Starovoytov [Fri, 17 Jul 2020 18:01:46 +0000 (21:01 +0300)]
net: atlantic: align return value of ver_match function with function name

This patch aligns the return value of hw_atl_utils_ver_match function with
its name.
Change the return type to bool, because it's better aligned with the actual
usage. Return true when the version matches, false otherwise.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
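
A small sketch of why a bool return reads better for a "match" function.
Everything here is hypothetical: the 8.8.16-bit version packing, the matching
rule and the model_* names are assumptions for illustration and are not taken
from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing, illustration only: major.minor.build as 8.8.16 bits. */
static uint32_t model_ver(unsigned int major, unsigned int minor,
			  unsigned int build)
{
	assert(major <= 0xff && minor <= 0xff && build <= 0xffff);
	return ((uint32_t)major << 24) | ((uint32_t)minor << 16) | build;
}

/*
 * Returns true when "actual" matches "expected": same major version and
 * a minor.build that is at least as new. The rule itself is an assumption.
 */
static bool model_ver_match(uint32_t expected, uint32_t actual)
{
	if ((expected >> 24) != (actual >> 24))
		return false;
	return (actual & 0xffffff) >= (expected & 0xffffff);
}
```

A caller can then write "if (model_ver_match(expected, actual))" and the
condition means what the name says, with no inverted 0-on-success test.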
4 years ago net: ethernet: et131x: Remove redundant register read
Mark Einon [Fri, 17 Jul 2020 13:21:35 +0000 (14:21 +0100)]
net: ethernet: et131x: Remove redundant register read

Following the removal of an unused variable assignment (remove
unused variable 'pm_csr') the associated register read can also go,
as the read also occurs in the subsequent et1310_in_phy_coma()
call.

Signed-off-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: ethernet: et131x: Remove unused variable 'pm_csr'
Zhang Changzhong [Fri, 17 Jul 2020 10:33:30 +0000 (18:33 +0800)]
net: ethernet: et131x: Remove unused variable 'pm_csr'

GCC reports the following warnings:

drivers/net/ethernet/agere/et131x.c:953:6: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
  953 |  u32 pm_csr;
      |      ^~~~~~
drivers/net/ethernet/agere/et131x.c:1002:6: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
 1002 |  u32 pm_csr;
      |      ^~~~~~
drivers/net/ethernet/agere/et131x.c:3446:8: warning:
 variable 'pm_csr' set but not used [-Wunused-but-set-variable]
 3446 |    u32 pm_csr;
      |        ^~~~~~

After commit 38df6492eb51 ("et131x: Add PCIe gigabit ethernet driver
et131x to drivers/net"), 'pm_csr' is never used in these functions,
so remove it to avoid the build warnings.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Mark Einon <mark.einon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: bna: Remove unused variable 't'
Zhang Changzhong [Fri, 17 Jul 2020 10:23:04 +0000 (18:23 +0800)]
net: bna: Remove unused variable 't'

GCC reports the following warning:

drivers/net/ethernet/brocade/bna/bfa_ioc.c:1538:6: warning:
 variable 't' set but not used [-Wunused-but-set-variable]
 1538 |  u32 t;
      |      ^

After commit c107ba171f3d ("bna: Firmware Patch Simplification"),
't' is never used, so remove it to avoid the build warning.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: bnxt: don't complain if TC flower can't be supported
Jakub Kicinski [Fri, 17 Jul 2020 20:59:58 +0000 (13:59 -0700)]
net: bnxt: don't complain if TC flower can't be supported

The fact that NETIF_F_HW_TC is not set should be a sufficient
indication to the user that TC offloads are not supported.
No need to bother users of older firmware versions with
pointless warnings on every boot.

Also, since the support is optional, bnxt_init_tc() should not
return an error in case FW is old, similarly to how error
is not returned when CONFIG_BNXT_FLOWER_OFFLOAD is not set.

With that we can add an error message to the caller, to warn
about actual unexpected failures.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
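
The pattern described above (an optional feature whose absence is not an
error, with the caller warning only about genuine failures) might be sketched
in userspace as follows. The struct, its fields and the model_* functions are
hypothetical and do not correspond to the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical device state; none of these fields exist in the driver. */
struct model_dev {
	bool fw_supports_tc;	/* firmware is new enough for TC offload */
	bool alloc_fails;	/* simulates a genuine setup failure */
	bool tc_enabled;
};

/* Optional feature setup: missing firmware support is not an error. */
static int model_init_tc(struct model_dev *dev)
{
	assert(dev);

	if (!dev->fw_supports_tc)
		return 0;		/* stay quiet, leave the feature off */
	if (dev->alloc_fails)
		return -ENOMEM;		/* a real, unexpected failure */

	dev->tc_enabled = true;
	return 0;
}

/* Caller: warn only on unexpected failures. Returns true if it warned. */
static bool model_probe(struct model_dev *dev)
{
	int rc = model_init_tc(dev);

	if (rc) {
		fprintf(stderr, "TC offload init failed: %d\n", rc);
		return true;
	}
	return false;
}
```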
4 years ago Merge tag 'mlx5-updates-2020-07-16' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Fri, 17 Jul 2020 20:04:17 +0000 (13:04 -0700)]
Merge tag 'mlx5-updates-2020-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-07-16

Fixes:
1) Fix build break when CONFIG_XPS is not set
2) Fix missing switch_id for representors

Updates:
1) IPsec XFRM RX offloads from Raed and Huy.
  - Added IPsec RX steering flow tables to NIC RX
  - Refactoring of the existing FPGA IPsec, to add support
    for ConnectX IPsec.
  - RX data path handling for IPsec traffic
  - Synchronize offloading device ESN with xfrm received SN

2) Parav allows E-Switch to switch to switchdev mode directly without
   the need to go through legacy mode first.

3) From Tariq, Misc updates including:
   3.1) indirect calls for RX and XDP handlers
   3.2) Make MLX5_EN_TLS non-prompt as it should always be enabled when
        TLS and MLX5_EN are selected.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: alteon: Avoid some useless memset
Christophe JAILLET [Thu, 16 Jul 2020 20:52:42 +0000 (22:52 +0200)]
net: alteon: Avoid some useless memset

Avoid a memset after a call to 'dma_alloc_coherent()'.
This is useless since
commit 518a2f1925c3 ("dma-mapping: zero memory returned from dma_alloc_*")

Replace a kmalloc+memset with a corresponding kzalloc.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
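
A userspace analogue of the second change, assuming calloc() plays the role
of a zeroing allocator and malloc() plus memset() the role of the old pair.
The ring structure and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor ring used only for this illustration. */
struct model_ring {
	unsigned int head;
	unsigned int tail;
	unsigned char buf[64];
};

/* Before: allocate, then clear the memory by hand. */
static struct model_ring *ring_alloc_old(void)
{
	struct model_ring *r = malloc(sizeof(*r));

	if (!r)
		return NULL;
	memset(r, 0, sizeof(*r));
	return r;
}

/* After: a single call that already returns zeroed memory. */
static struct model_ring *ring_alloc_new(void)
{
	return calloc(1, sizeof(struct model_ring));
}

/* Helper to check that every member of the ring is zero. */
static bool ring_is_zeroed(const struct model_ring *r)
{
	size_t i;

	assert(r);

	if (r->head || r->tail)
		return false;
	for (i = 0; i < sizeof(r->buf); i++)
		if (r->buf[i])
			return false;
	return true;
}
```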
4 years ago net: alteon: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Thu, 16 Jul 2020 20:48:02 +0000 (22:48 +0200)]
net: alteon: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'ace_allocate_descriptors()' and
'ace_init()' GFP_KERNEL can be used because both functions are called from
the probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: sungem: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Thu, 16 Jul 2020 19:28:21 +0000 (21:28 +0200)]
net: sungem: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'gem_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: decnet: af_decnet: Simplify goto loop.
Suraj Upadhyay [Thu, 16 Jul 2020 19:16:45 +0000 (00:46 +0530)]
net: decnet: af_decnet: Simplify goto loop.

Replace goto loop with while loop.

Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
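
The kind of rewrite meant here, shown on a made-up example rather than the
af_decnet code: both functions below return the index of the first free slot,
one with a label and goto, the other with an equivalent while loop. All names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the loop is spelled with a label and a backwards goto. */
static size_t first_free_goto(const int *used, size_t n)
{
	size_t i = 0;

	assert(used);
next:
	if (i < n && used[i]) {
		i++;
		goto next;
	}
	return i;
}

/* After: the same control flow as a while loop. */
static size_t first_free_while(const int *used, size_t n)
{
	size_t i = 0;

	assert(used);

	while (i < n && used[i])
		i++;
	return i;
}
```

Both variants return n when every slot is in use.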
4 years ago Merge branch 'tcp-dsack-multi-seg'
David S. Miller [Fri, 17 Jul 2020 19:54:31 +0000 (12:54 -0700)]
Merge branch 'tcp-dsack-multi-seg'

Priyaranjan Jha says:

====================
tcp: improve handling of DSACK covering multiple segments

Currently, while processing DSACK, we assume DSACK covers only one
segment. This leads to significant underestimation of no. of duplicate
segments with LRO/GRO. Also, the existing SNMP counters, TCPDSACKRecv
and TCPDSACKOfoRecv, make a similar assumption for DSACK, which makes them
unusable for estimating spurious retransmit rates.

This patch series fixes the segment accounting with DSACK, by estimating
number of duplicate segments based on: (DSACKed sequence range) / MSS.
It also introduces a new SNMP counter, TCPDSACKRecvSegs, which tracks
the estimated number of duplicate segments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago tcp: add SNMP counter for no. of duplicate segments reported by DSACK
Priyaranjan Jha [Thu, 16 Jul 2020 19:12:35 +0000 (12:12 -0700)]
tcp: add SNMP counter for no. of duplicate segments reported by DSACK

There are two existing SNMP counters, TCPDSACKRecv and TCPDSACKOfoRecv,
which are incremented depending on whether the DSACKed range is below
the cumulative ACK sequence number or not. Unfortunately, both
implicitly assume each DSACK covers only one segment. This makes these
counters unusable for estimating spurious retransmit rates,
or real/non-spurious loss rate.

This patch introduces a new SNMP counter, TCPDSACKRecvSegs, which tracks
the estimated number of duplicate segments based on:
(DSACKed sequence range) / MSS. This counter is usable for estimating
spurious retransmit rates, or real/non-spurious loss rate.

Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago tcp: fix segment accounting when DSACK range covers multiple segments
Priyaranjan Jha [Thu, 16 Jul 2020 19:12:34 +0000 (12:12 -0700)]
tcp: fix segment accounting when DSACK range covers multiple segments

Currently, while processing DSACK, we assume DSACK covers only one
segment. This leads to significant underestimation of DSACKs with
LRO/GRO. This patch fixes segment accounting with DSACK by estimating
segment count from DSACK sequence range / MSS.

Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
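
The estimate "(DSACKed sequence range) / MSS" can be worked through with a
small userspace sketch. The function name is hypothetical, and rounding up
(so that a partial trailing segment counts as one) is an assumption of this
sketch; the commit message itself only gives the ratio.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many segments the DSACK block [start_seq, end_seq) covers.
 * Returns 0 for an empty range.
 */
static uint32_t model_dsack_segs(uint32_t start_seq, uint32_t end_seq,
				 uint32_t mss)
{
	uint32_t seq_len;

	assert(mss > 0);

	/* Unsigned subtraction gives the right length across a 32-bit wrap. */
	seq_len = end_seq - start_seq;
	if (seq_len == 0)
		return 0;

	/* Round up so that a partial trailing segment still counts as one. */
	return seq_len / mss + (seq_len % mss != 0);
}
```

For example, with an MSS of 1460 bytes a DSACK range of 4380 bytes is counted
as 3 duplicate segments, where the old accounting would have counted 1.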
4 years ago net: sun: cassini: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Thu, 16 Jul 2020 19:03:58 +0000 (21:03 +0200)]
net: sun: cassini: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'cas_tx_tiny_alloc()', GFP_KERNEL can be used
because its only caller also calls 'cas_alloc_rxds()' a few lines below,
and that function makes explicit use of GFP_KERNEL.

When memory is allocated in 'cas_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomptcp: silence warning in subflow_data_ready()
Davide Caratti [Wed, 15 Jul 2020 20:27:05 +0000 (22:27 +0200)]
mptcp: silence warning in subflow_data_ready()

Since commit d47a72152097 ("mptcp: fix race in subflow_data_ready()"), it
is possible to observe a regression in MP_JOIN kselftests. For sockets in
TCP_CLOSE state, it's not sufficient to just wake up the main socket: we
also need to ensure that received data are made available to the reader.
Silence the WARN_ON_ONCE() in these cases: it preserves the syzkaller fix
and restores kselftests when they are run as follows:

  # while true; do
  > make KBUILD_OUTPUT=/tmp/kselftest TARGETS=net/mptcp kselftest
  > done

Reported-by: Florian Westphal <fw@strlen.de>
Fixes: d47a72152097 ("mptcp: fix race in subflow_data_ready()")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/47
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'usbnet-multicast-filter-support-for-cdc-ncm-devices'
David S. Miller [Fri, 17 Jul 2020 19:42:47 +0000 (12:42 -0700)]
Merge branch 'usbnet-multicast-filter-support-for-cdc-ncm-devices'

Bjørn Mork says:

====================
usbnet: multicast filter support for cdc ncm devices

This revives a 2 year old patch set from Miguel Rodríguez
Pérez, which appears to have been lost somewhere along the
way.  I've based it on the last version I found (v4), and
added one patch which I believe must have been missing in
the original.

I kept Oliver's ack on one of the patches, since both the patch and
the motivation are still the same.  Hope this is OK.

Thanks to the anonymous user <wxcafe@wxcafe.net> for bringing up this
problem in https://bugs.debian.org/965074

This is only build and load tested by me.  I don't have any device
where I can test the actual functionality.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cdc_ncm: hook into set_rx_mode to admit multicast traffic
Miguel Rodríguez Pérez [Wed, 15 Jul 2020 18:41:00 +0000 (20:41 +0200)]
net: cdc_ncm: hook into set_rx_mode to admit multicast traffic

We set set_rx_mode to usbnet_cdc_update_filter, provided
by cdc_ether, which simply admits all multicast traffic
if there is more than one multicast filter configured.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cdc_ncm: add .ndo_set_rx_mode to cdc_ncm_netdev_ops
Miguel Rodríguez Pérez [Wed, 15 Jul 2020 18:40:59 +0000 (20:40 +0200)]
net: cdc_ncm: add .ndo_set_rx_mode to cdc_ncm_netdev_ops

The cdc_ncm driver overrides the net_device_ops structure used by usbnet
to be able to hook into .ndo_change_mtu. However, the structure was
missing the .ndo_set_rx_mode field, preventing the driver from
hooking into usbnet's set_rx_mode. This patch adds the missing callback to
usbnet_set_rx_mode in net_device_ops.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: usbnet: export usbnet_set_rx_mode()
Bjørn Mork [Wed, 15 Jul 2020 18:40:58 +0000 (20:40 +0200)]
net: usbnet: export usbnet_set_rx_mode()

This function can be reused by other usbnet minidrivers.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cdc_ether: export usbnet_cdc_update_filter
Miguel Rodríguez Pérez [Wed, 15 Jul 2020 18:40:57 +0000 (20:40 +0200)]
net: cdc_ether: export usbnet_cdc_update_filter

This makes the function available to other drivers, like cdc_ncm.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: cdc_ether: use dev->intf to get interface information
Miguel Rodríguez Pérez [Wed, 15 Jul 2020 18:40:56 +0000 (20:40 +0200)]
net: cdc_ether: use dev->intf to get interface information

usbnet_cdc_update_filter was getting the interface number from the
usb_interface struct in cdc_state->control. However, cdc_ncm does
not initialize that structure in its bind function, but uses
cdc_ncm_ctx instead. Getting intf directly from struct usbnet solves
the problem.

Signed-off-by: Miguel Rodríguez Pérez <miguel@det.uvigo.gal>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: openvswitch: reorder masks array based on usage
Eelco Chaudron [Wed, 15 Jul 2020 12:09:28 +0000 (14:09 +0200)]
net: openvswitch: reorder masks array based on usage

This patch reorders the masks array every 4 seconds based on their
usage count. This greatly reduces the masks per packet hit, and
hence improves the overall performance, especially in the OVS/OVN
case for OpenShift.
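
The reordering idea can be illustrated with a minimal user-space sketch.
It is not the kernel implementation: struct mask_usage, reorder_masks()
and probes_needed() are hypothetical names, and qsort() stands in for
whatever sorting the patch actually uses. It only shows why putting the
most-used masks first lowers the number of masks probed per packet.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-mask bookkeeping; not the kernel's struct layout. */
struct mask_usage {
	int index;	/* position of the mask in the original array */
	uint64_t hits;	/* packets matched through this mask since the last pass */
};

/* qsort() comparator: most-used mask first, ties keep the lower index first. */
static int cmp_usage_desc(const void *a, const void *b)
{
	const struct mask_usage *x = a;
	const struct mask_usage *y = b;

	if (x->hits != y->hits)
		return x->hits < y->hits ? 1 : -1;
	return x->index - y->index;
}

/* Reorder the array so that a linear lookup tries the hottest masks first. */
static void reorder_masks(struct mask_usage *masks, size_t n)
{
	assert(masks != NULL || n == 0);
	qsort(masks, n, sizeof(*masks), cmp_usage_desc);
}

/* Number of masks a linear lookup probes before reaching the given mask. */
static int probes_needed(const struct mask_usage *masks, size_t n, int index)
{
	for (size_t i = 0; i < n; i++)
		if (masks[i].index == index)
			return (int)i + 1;
	return -1;
}
```

In this sketch a mask that starts in second position but carries most of
the traffic moves to the front after one pass, so probes_needed() for it
drops from 2 to 1.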

Here are some results from the OVS/OVN OpenShift test, which uses
8 pods, each pod having 512 uperf connections; each connection
sends a 64-byte request and gets a 1024-byte response (TCP).
All uperf clients are on 1 worker node while all uperf servers are
on the other worker node.

Kernel without this patch     :  7.71 Gbps
Kernel with this patch applied: 14.52 Gbps

We also ran some tests to verify that the rebalance activity does not
lower the flow insertion rate, which it does not.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Tested-by: Andrew Theurer <atheurer@redhat.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: sfp: Cotsworks SFF module EEPROM fixup
Chris Healy [Tue, 14 Jul 2020 17:59:10 +0000 (10:59 -0700)]
net: phy: sfp: Cotsworks SFF module EEPROM fixup

Some Cotsworks SFF have invalid data in the first few bytes of the
module EEPROM.  This results in these modules not being detected as
valid modules.

Address this by poking the correct EEPROM values into the module
EEPROM when the model/PN match and the existing module EEPROM contents
are not correct.

Signed-off-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: continue searching for C45 MMDs even if first returned ffff:ffff
Vladimir Oltean [Sun, 12 Jul 2020 16:48:15 +0000 (19:48 +0300)]
net: phy: continue searching for C45 MMDs even if first returned ffff:ffff

At the time of introduction, in commit bdeced75b13f ("net: dsa: felix:
Add PCS operations for PHYLINK"), support for the Lynx PCS inside Felix
was relying, for USXGMII support, on the fact that get_phy_device() is
able to parse the Lynx PCS "device-in-package" registers for this C45
MDIO device and identify it correctly.

However, this was actually working somewhat by mistake (in the sense
that, even though it was detected, it was detected for the wrong
reasons).

The get_phy_c45_ids() function works by iterating through all MMDs
starting from 1 (MDIO_MMD_PMAPMD) and stops at the first one which
returns a non-zero value in the "device-in-package" register pair,
proceeding to see what that non-zero value is.

For the Felix PCS, the first MMD (1, for the PMA/PMD) returns a non-zero
value of 0xffffffff in the "device-in-package" registers. There is a
code branch which is supposed to treat this case and flag it as wrong,
and normally, this would have caught my attention when adding initial
support for this PCS:

	if ((devs_in_pkg & 0x1fffffff) == 0x1fffffff) {
		/* If mostly Fs, there is no device there, then let's probe
		 * MMD 0, as some 10G PHYs have zero Devices In package,
		 * e.g. Cortina CS4315/CS4340 PHY.
		 */

However, this code never actually kicked in, it seems, because this
snippet from get_phy_c45_devs_in_pkg() was basically sabotaging itself,
by returning 0xfffffffe instead of 0xffffffff:

	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
	*devices_in_package &= ~BIT(0);

Then the rest of the code just carried on thinking "ok, MMD 1 (PMA/PMD)
says that there are 31 devices in that package, each having a device id
of ffff:ffff, that's perfectly fine, let's go ahead and probe this PHY
device".
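
The arithmetic behind this can be checked with a small user-space sketch.
The helper names looks_absent() and read_devs_old() are hypothetical and
only mirror the two quoted snippets; BIT() is redefined locally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1U << (n))

/* The "mostly Fs" test quoted from get_phy_c45_ids(). */
static bool looks_absent(uint32_t devs_in_pkg)
{
	return (devs_in_pkg & 0x1fffffff) == 0x1fffffff;
}

/* Old behaviour: bit 0 is cleared before the caller sees the value. */
static uint32_t read_devs_old(uint32_t raw)
{
	return raw & ~BIT(0);
}

/* Self-check of the values named in the commit message. */
static void check_mask_arithmetic(void)
{
	/* An all-ones read came back as 0xfffffffe ... */
	assert(read_devs_old(0xffffffff) == 0xfffffffe);
	/* ... which no longer satisfies the "mostly Fs" test ... */
	assert(!looks_absent(0xfffffffe));
	/* ... while the unmodified value does. */
	assert(looks_absent(0xffffffff));
}
```

With bit 0 cleared, the masked value is 0x1ffffffe rather than 0x1fffffff,
so the comparison fails and the "no device" branch is never taken.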

But after cleanup commit 320ed3bf9000 ("net: phy: split
devices_in_package"), this got "fixed", and now devs_in_pkg is no longer
0xfffffffe, but 0xffffffff. So now, get_phy_device is returning -ENODEV
for the Lynx PCS, because the semantics have remained mostly unchanged:
the loop stops at the first MMD that returns a non-zero value, and that
is MMD 1.

But the Lynx PCS is simply a clause 37 PCS which implements the required
MAC-side functionality for USXGMII (when operated in C45 mode, which is
where C45 devices-in-package detection is relevant). Of course it
will fail the PMD/PMA test (MMD 1), since it is not a PHY. But it does
implement detection for MDIO_MMD_PCS (3):

- MDIO_DEVS1=0x008a, MDIO_DEVS2=0x0000,
- MDIO_DEVID1=0x0083, MDIO_DEVID2=0xe400

Let get_phy_c45_ids() continue searching for valid MMDs, and don't
assume that every phy_device has a PMA/PMD MMD implemented.
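
A minimal user-space sketch of the intended search follows. It is an
illustration under assumptions, not the code of this patch: the register
reads are replaced by a plain array, find_first_valid_mmd() and
read_devs_in_pkg() are hypothetical names, and the MMD 0 fallback for
PHYs with zero devices in package is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_MMDS 32

/* Hypothetical stand-in for reading the devices-in-package pair of one MMD. */
static uint32_t read_devs_in_pkg(const uint32_t *regs, int mmd)
{
	return regs[mmd];
}

/*
 * Return the first MMD (starting from 1) whose devices-in-package value is
 * neither zero nor mostly Fs, or -1 if no such MMD exists.
 */
static int find_first_valid_mmd(const uint32_t *regs)
{
	assert(regs != NULL);

	for (int mmd = 1; mmd < NUM_MMDS; mmd++) {
		uint32_t devs = read_devs_in_pkg(regs, mmd);

		if (devs == 0)
			continue;	/* MMD not implemented */
		if ((devs & 0x1fffffff) == 0x1fffffff)
			continue;	/* mostly Fs: nothing there, keep looking */
		return mmd;
	}
	return -1;
}
```

For a device that answers 0xffffffff on MMD 1 and 0x0000008a on MMD 3 (the
MDIO_DEVS1/MDIO_DEVS2 values listed above), the sketch returns 3 instead
of giving up at MMD 1.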

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-sched-do-not-drop-root-lock-in-tcf_qevent_handle'
Jakub Kicinski [Thu, 16 Jul 2020 23:48:47 +0000 (16:48 -0700)]
Merge branch 'net-sched-do-not-drop-root-lock-in-tcf_qevent_handle'

Petr Machata says:

====================
net: sched: Do not drop root lock in tcf_qevent_handle()

Mirred currently does not mix well with blocks executed after the qdisc
root lock is taken. This includes classification blocks (such as in PRIO,
ETS, DRR qdiscs) and qevents. The locking caused by the packet mirrored by
mirred can cause deadlocks: either when the thread of execution attempts to
take the lock a second time, or when two threads end up waiting on each
other's locks.

The qevent patchset attempted to not introduce further badness of this
sort, and dropped the lock before executing the qevent block. However this
led to too little locking and races between qdisc configuration and packet
enqueue in the RED qdisc.

Before the deadlock issues are solved in a way that can be applied across
many qdiscs reasonably easily, do for qevents what is done for the
classification blocks and just keep holding the root lock.

That is done in patch #1. Patch #2 then drops the now unnecessary root_lock
argument from Qdisc_ops.enqueue.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoRevert "net: sched: Pass root lock to Qdisc_ops.enqueue"
Petr Machata [Tue, 14 Jul 2020 17:03:08 +0000 (20:03 +0300)]
Revert "net: sched: Pass root lock to Qdisc_ops.enqueue"

This reverts commit aebe4426ccaa4838f36ea805cdf7d76503e65117.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: sched: Do not drop root lock in tcf_qevent_handle()
Petr Machata [Tue, 14 Jul 2020 17:03:07 +0000 (20:03 +0300)]
net: sched: Do not drop root lock in tcf_qevent_handle()

Mirred currently does not mix well with blocks executed after the qdisc
root lock is taken. This includes classification blocks (such as in PRIO,
ETS, DRR qdiscs) and qevents. The locking caused by the packet mirrored by
mirred can cause deadlocks: either when the thread of execution attempts to
take the lock a second time, or when two threads end up waiting on each
other's locks.

The qevent patchset attempted to not introduce further badness of this
sort, and dropped the lock before executing the qevent block. However this
led to too little locking and races between qdisc configuration and packet
enqueue in the RED qdisc.

Before the deadlock issues are solved in a way that can be applied across
many qdiscs reasonably easily, do for qevents what is done for the
classification blocks and just keep holding the root lock.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet/mlx5e: CT: Map 128 bits labels to 32 bit map ID
Eli Britstein [Thu, 18 Jun 2020 15:38:31 +0000 (15:38 +0000)]
net/mlx5e: CT: Map 128 bits labels to 32 bit map ID

The 128 bits ct_label field is matched using a 32 bit hardware register.
As such, only the lower 32 bits of ct_label field are offloaded. Change
this logic to support setting and matching higher bits too.
Map the 128 bits data to a unique 32 bits ID. Matching is done as exact
match of the mapping ID of key & mask.
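
The mapping idea can be sketched in user space as below. This is only an
illustration: struct label_map and label_to_id() are hypothetical, the
fixed-size linear table is chosen for brevity, and the driver's real
mapping structure (which covers both key and mask) is not shown in this
message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LABELS 16

/* Hypothetical table of 128-bit values, each stored as four 32-bit words. */
struct label_map {
	uint32_t labels[MAX_LABELS][4];
	int used;
};

/*
 * Return the 32-bit ID for a 128-bit label, allocating a new ID on first
 * use. IDs start at 1 so that 0 can signal "table full".
 */
static uint32_t label_to_id(struct label_map *map, const uint32_t label[4])
{
	assert(map != NULL && label != NULL);

	for (int i = 0; i < map->used; i++)
		if (memcmp(map->labels[i], label, sizeof(map->labels[i])) == 0)
			return (uint32_t)i + 1;

	if (map->used == MAX_LABELS)
		return 0;

	memcpy(map->labels[map->used], label, sizeof(map->labels[0]));
	return (uint32_t)++map->used;
}
```

Two labels that differ only in their upper bits get different IDs here,
which is the property the lower-32-bits approach could not provide.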

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Maor Dickman <maord@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Do not request completion on every single UMR WQE
Tariq Toukan [Wed, 1 May 2019 12:23:06 +0000 (15:23 +0300)]
net/mlx5e: Do not request completion on every single UMR WQE

UMR WQEs are posted in bulks, and HW is notified once per bulk.
Reduce the number of completions by requesting one only for
the last WQE of the bulk.
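
As a rough user-space sketch of the idea (struct wqe and post_bulk() are
hypothetical; the real WQE carries hardware control fields that are not
modelled here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified work queue entry. */
struct wqe {
	bool request_completion;
};

/*
 * Mark a bulk of WQEs for posting, asking for a completion only on the
 * last entry. Returns the number of completions requested.
 */
static size_t post_bulk(struct wqe *wqes, size_t n)
{
	size_t completions = 0;

	assert(wqes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		wqes[i].request_completion = (i == n - 1);
		if (wqes[i].request_completion)
			completions++;
	}
	return completions;
}
```

A bulk of eight entries then produces one completion instead of eight.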

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: RX, Avoid indirect call in representor CQE handling
Tariq Toukan [Thu, 30 Apr 2020 12:52:53 +0000 (15:52 +0300)]
net/mlx5e: RX, Avoid indirect call in representor CQE handling

Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call
when/if CONFIG_RETPOLINE=y.
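
For readers unfamiliar with the helper, the following is a simplified
user-space imitation of the pattern, not the kernel macro itself: the
pointer is compared against the two most likely targets so that the
common cases become direct calls. The handlers and dispatch() are
hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified imitation; the kernel version also adds branch hints. */
#define INDIRECT_CALL_1(f, f1, ...) \
	((f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))
#define INDIRECT_CALL_2(f, f2, f1, ...) \
	((f) == (f2) ? f2(__VA_ARGS__) : INDIRECT_CALL_1(f, f1, __VA_ARGS__))

typedef int (*handler_t)(int);

/* Hypothetical handlers standing in for the two expected callees. */
static int handle_a(int x) { return x + 1; }
static int handle_b(int x) { return x * 2; }
/* A third handler that only the indirect fallback can reach. */
static int handle_other(int x) { return -x; }

/* Direct call for the two expected targets, indirect call otherwise. */
static int dispatch(handler_t h, int x)
{
	assert(h != NULL);
	return INDIRECT_CALL_2(h, handle_a, handle_b, x);
}
```

The result is the same whichever branch is taken; only the cost of the
call differs when indirect branches are expensive.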

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: XDP, Avoid indirect call in TX flow
Tariq Toukan [Thu, 30 Apr 2020 11:32:45 +0000 (14:32 +0300)]
net/mlx5e: XDP, Avoid indirect call in TX flow

Use INDIRECT_CALL_2() helper to avoid the cost of the indirect call
when/if CONFIG_RETPOLINE=y.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: IPsec: Add Connect-X IPsec ESN update offload support
Raed Salem [Sun, 29 Dec 2019 15:13:53 +0000 (17:13 +0200)]
net/mlx5e: IPsec: Add Connect-X IPsec ESN update offload support

Synchronize offloading device ESN with xfrm received SN
by updating an existing IPsec HW context with the new SN.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload
Raed Salem [Thu, 24 Oct 2019 13:11:28 +0000 (16:11 +0300)]
net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload

On the receive flow, inspect received packets for an IPsec offload
indication using the CQE. For IPsec offloaded packets, propagate the offload
status and stack handle to the stack for further processing.

Supported statuses:
- Offload ok.
- Authentication failure.
- Bad trailer indication.

Connect-X IPsec does not use mlx5e_ipsec_handle_rx_cqe.

For RX only offload, we see a BW gain. Below is the iperf3
performance report on two 24-core servers (Intel(R) Xeon(R)
CPU E5-2620 v3 @ 2.40GHz) with ConnectX6-DX.
We use one thread per IPsec tunnel.

----------------------------------------------------------------------
Mode           | Num tunnels | BW     | Send CPU util | Recv CPU util
               |             | (Gbps) | (Average %)   | (Average %)
----------------------------------------------------------------------
Crypto offload | 1           | 4.6    | 4.2           | 14.5
----------------------------------------------------------------------
Crypto offload | 24          | 38     | 73            | 63
----------------------------------------------------------------------
Non-offload    | 1           | 4      | 4             | 13
----------------------------------------------------------------------
Non-offload    | 24          | 23     | 52            | 67

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: IPsec: Add IPsec steering in local NIC RX
Huy Nguyen [Fri, 5 Jun 2020 21:36:35 +0000 (16:36 -0500)]
net/mlx5e: IPsec: Add IPsec steering in local NIC RX

Introduce decrypt FT, the RX error FT and the default rules.

The IPsec RX decrypt flow table is pointed to by the TTC
(Traffic Type Classifier) ESP steering rules.
The decrypt flow table has two flow groups. The first flow group
keeps the decrypt steering rule programmed via the "ip xfrm s" interface.
The second flow group has a default rule to forward all non-offloaded
ESP packets to the TTC ESP default RSS TIR.

The RX error flow table is the destination of the decrypt steering rules
in the IPsec RX decrypt flow table. It has a fixed rule with a single
copy action that copies ipsec_syndrome to metadata_regB[0:6]. The IPsec
syndrome is used to filter out non-IPsec packets and to return the IPsec
crypto offload status in the Rx flow. The destination of the RX error flow
table is the TTC ESP default RSS TIR.

All the FTs (decrypt FT and error FT) are created only when IPsec SAs
are added. If there are no IPsec SAs, the FTs are removed.

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Add IPsec related Flow steering entry's fields
Huy Nguyen [Thu, 9 Apr 2020 01:09:05 +0000 (20:09 -0500)]
net/mlx5: Add IPsec related Flow steering entry's fields

Add FTE actions IPsec ENCRYPT/DECRYPT
Add ipsec_obj_id field in FTE
Add new action field MLX5_ACTION_IN_FIELD_IPSEC_SYNDROME

Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: IPsec: Add HW crypto offload support
Raed Salem [Wed, 29 Jan 2020 16:15:15 +0000 (18:15 +0200)]
net/mlx5: IPsec: Add HW crypto offload support

This patch adds support for Connect-X IPsec crypto offload
by implementing the routines needed by the IPsec acceleration layer,
which delegates IPsec offloads to Connect-X routines.

In Connect-X IPsec, a Security Association (SA) is added or deleted
via allocating a HW context of an encryption/decryption key and
a HW context of a matching SA (IPsec object).
The Security Policy (SP) is added or deleted by creating matching Tx/Rx
steering rules with an action of encryption/decryption respectively,
executed using the previously allocated SA HW context.

When new xfrm state (SA) is added:
- Use a separate crypto key HW context.
- Create a separate IPsec context in HW to include the SA properties:
 - aes-gcm salt.
 - ICV properties (ICV length, implicit IV).
 - on supported devices also update ESN.
 - associate the allocated crypto key with this IPsec context.

Introduce a new compilation flag MLX5_IPSEC for it.

Downstream patches will implement the Rx and Tx steering
and will add the ESN update.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Accel, Add core IPsec support for the Connect-X family
Raed Salem [Mon, 18 Nov 2019 12:30:20 +0000 (14:30 +0200)]
net/mlx5: Accel, Add core IPsec support for the Connect-X family

This sets the base for downstream patches to support
the new IPsec implementation of the Connect-X family.

The following modifications are made:
- Remove accel layer dependency from MLX5_FPGA_IPSEC.
- Introduce accel_ipsec_ops, each IPsec device will
  have to support these ops.

Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-switch, Reduce dependency on num_vfs during mode set
Parav Pandit [Fri, 26 Jun 2020 20:15:12 +0000 (23:15 +0300)]
net/mlx5: E-switch, Reduce dependency on num_vfs during mode set

Currently only ECPF allows enabling eswitch when SR-IOV is disabled.

Allow the PF to enable eswitch as well when SR-IOV is disabled.
Load VF vports when eswitch is already enabled.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-switch, Avoid function change handler for non ECPF
Parav Pandit [Sat, 27 Jun 2020 11:12:36 +0000 (14:12 +0300)]
net/mlx5: E-switch, Avoid function change handler for non ECPF

For a non-ECPF eswitch manager function, vports are already
enabled/disabled when eswitch is enabled/disabled respectively.
Simplify the function change handler for such an eswitch manager function.

Therefore, ECPF is the only one for which the PF/VF function change
handler remains.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Make MLX5_EN_TLS non-prompt
Tariq Toukan [Thu, 9 Apr 2020 16:53:24 +0000 (19:53 +0300)]
net/mlx5: Make MLX5_EN_TLS non-prompt

TLS runs only over Eth, and the Eth driver is the only user of
the core TLS functionality.
There is no point in having the core functionality without its use
in the Eth driver.
Hence, let both TLS core implementations depend on MLX5_CORE_EN,
and select MLX5_EN_TLS.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix build break when CONFIG_XPS is not set
Saeed Mahameed [Wed, 15 Jul 2020 01:54:46 +0000 (18:54 -0700)]
net/mlx5e: Fix build break when CONFIG_XPS is not set

mlx5e_accel_sk_get_rxq is only used in the ktls_rx.c file, which already
depends on XPS to be compiled. Move it from the generic en_accel.h
header to be local to ktls_rx.c, to fix the build break below:

In file included from
../drivers/net/ethernet/mellanox/mlx5/core/en_main.c:49:0:
../drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h:
In function ‘mlx5e_accel_sk_get_rxq’:
../drivers/net/ethernet/mellanox/mlx5/core/en_accel/en_accel.h:153:12:
error: implicit declaration of function ‘sk_rx_queue_get’ ...
  int rxq = sk_rx_queue_get(sk);
            ^~~~~~~~~~~~~~~

Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
4 years agonet/mlx5e: Fix missing switch_id for representors
Parav Pandit [Fri, 10 Jul 2020 18:12:44 +0000 (21:12 +0300)]
net/mlx5e: Fix missing switch_id for representors

The commit cited in the Fixes tag failed to set the switch id of the PF and
VF ports. Because of this, flows cannot be offloaded; a simple command like
the one below fails to offload with the error shown.

tc filter add dev ens2f0np0 parent ffff: prio 1 flower \
 dst_mac 00:00:00:00:00:00/00:00:00:00:00:00 skip_sw \
 action mirred egress redirect dev ens2f0np0pf0vf0

Error: mlx5_core: devices are not on same switch HW, can't offload forwarding.

Hence, fix it by setting the switch id for each PF and VF representor port,
as was done before the cited commit.

Fixes: 71ad8d55f8e5 ("devlink: Replace devlink_port_attrs_set parameters with a struct")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
4 years agonet: mscc: ocelot: rethink Kconfig dependencies again
Vladimir Oltean [Sun, 12 Jul 2020 21:28:33 +0000 (00:28 +0300)]
net: mscc: ocelot: rethink Kconfig dependencies again

Having the users of MSCC_OCELOT_SWITCH_LIB depend on REGMAP_MMIO was a
bad idea, since that symbol is not user-selectable. So we should have
kept a 'select REGMAP_MMIO'.

When we do that, we run into 2 more problems:

- By depending on GENERIC_PHY, we are causing a recursive dependency.
  But it looks like GENERIC_PHY has no other dependencies, and other
  drivers select it, so we can select it too:

drivers/of/Kconfig:69:error: recursive dependency detected!
drivers/of/Kconfig:69:  symbol OF_IRQ depends on IRQ_DOMAIN
kernel/irq/Kconfig:68:  symbol IRQ_DOMAIN is selected by REGMAP
drivers/base/regmap/Kconfig:7:  symbol REGMAP default is visible depending on REGMAP_MMIO
drivers/base/regmap/Kconfig:39: symbol REGMAP_MMIO is selected by MSCC_OCELOT_SWITCH_LIB
drivers/net/ethernet/mscc/Kconfig:15:   symbol MSCC_OCELOT_SWITCH_LIB is selected by MSCC_OCELOT_SWITCH
drivers/net/ethernet/mscc/Kconfig:22:   symbol MSCC_OCELOT_SWITCH depends on GENERIC_PHY
drivers/phy/Kconfig:8:  symbol GENERIC_PHY is selected by PHY_BCM_NS_USB3
drivers/phy/broadcom/Kconfig:41:        symbol PHY_BCM_NS_USB3 depends on MDIO_BUS
drivers/net/phy/Kconfig:13:     symbol MDIO_BUS depends on MDIO_DEVICE
drivers/net/phy/Kconfig:6:      symbol MDIO_DEVICE is selected by PHYLIB
drivers/net/phy/Kconfig:254:    symbol PHYLIB is selected by ARC_EMAC_CORE
drivers/net/ethernet/arc/Kconfig:19:    symbol ARC_EMAC_CORE is selected by ARC_EMAC
drivers/net/ethernet/arc/Kconfig:25:    symbol ARC_EMAC depends on OF_IRQ

- By depending on PHYLIB, we are causing a recursive dependency. PHYLIB
  only has a single dependency, "depends on NETDEVICES", which we are
  already depending on, so we can again hack our way into conformance by
  turning the PHYLIB dependency into a select.

drivers/of/Kconfig:69:error: recursive dependency detected!
drivers/of/Kconfig:69:  symbol OF_IRQ depends on IRQ_DOMAIN
kernel/irq/Kconfig:68:  symbol IRQ_DOMAIN is selected by REGMAP
drivers/base/regmap/Kconfig:7:  symbol REGMAP default is visible depending on REGMAP_MMIO
drivers/base/regmap/Kconfig:39: symbol REGMAP_MMIO is selected by MSCC_OCELOT_SWITCH_LIB
drivers/net/ethernet/mscc/Kconfig:15:   symbol MSCC_OCELOT_SWITCH_LIB is selected by MSCC_OCELOT_SWITCH
drivers/net/ethernet/mscc/Kconfig:22:   symbol MSCC_OCELOT_SWITCH depends on PHYLIB
drivers/net/phy/Kconfig:254:    symbol PHYLIB is selected by ARC_EMAC_CORE
drivers/net/ethernet/arc/Kconfig:19:    symbol ARC_EMAC_CORE is selected by ARC_EMAC
drivers/net/ethernet/arc/Kconfig:25:    symbol ARC_EMAC depends on OF_IRQ

Fixes: f4d0323bae4e ("net: mscc: ocelot: convert MSCC_OCELOT_SWITCH into a library")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoaf_packet: TPACKET_V3: replace busy-wait loop
John Ogness [Tue, 7 Jul 2020 15:22:04 +0000 (17:28 +0206)]
af_packet: TPACKET_V3: replace busy-wait loop

A busy-wait loop is used to implement waiting for bits to be copied
from the skb to the kernel buffer before retiring a block. This is
a problem on PREEMPT_RT because the copying task could be preempted
by the busy-waiting task and thus live lock in the busy-wait loop.

Replace the busy-wait logic with an rwlock_t. This provides lockdep
coverage and makes the code RT ready.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoMerge branch 'net-fec-a-few-improvements'
Jakub Kicinski [Thu, 16 Jul 2020 18:32:24 +0000 (11:32 -0700)]
Merge branch 'net-fec-a-few-improvements'

Sergey Organov says:

====================
net: fec: a few improvements

This is a collection of simple improvements that reduce and/or
simplify code. They grew out of an attempt to use a DP83640 PTP
PHY connected to the built-in FEC (that has its own PTP support) of the
iMX 6SX micro-controller. The primary bug-fix has now been submitted
separately, and this is the rest of the changes.

NOTE: the patches are developed and tested on 4.9.146, and rebased on
top of recent 'net-next/master', where, besides visual inspection, I
only tested that they do compile.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: fec: replace snprintf() with strlcpy() in fec_ptp_init()
Sergey Organov [Wed, 15 Jul 2020 15:43:00 +0000 (18:43 +0300)]
net: fec: replace snprintf() with strlcpy() in fec_ptp_init()

There is no need to use snprintf() on a constant string, nor was using
a magic constant in the code being fixed a good idea.

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: fec: get rid of redundant code in fec_ptp_set()
Sergey Organov [Wed, 15 Jul 2020 15:42:59 +0000 (18:42 +0300)]
net: fec: get rid of redundant code in fec_ptp_set()

Code of the form "if(x) x = 0" is replaced with "x = 0".

Code of the form "if(x == a) x = a" is removed.
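
Both rewrites preserve behaviour, which can be checked with a small
user-space sketch. The function names are hypothetical, and the sketch
assumes plain variables; it says nothing about stores to hardware
registers, where an extra write could matter.

```c
#include <assert.h>

/* Before: "if(x) x = 0". */
static int clear_before(int x)
{
	if (x)
		x = 0;
	return x;
}

/* After: plain "x = 0"; the result is 0 either way. */
static int clear_after(int x)
{
	x = 0;
	return x;
}

/* Before: "if(x == a) x = a", which never changes x. */
static int assign_before(int x, int a)
{
	if (x == a)
		x = a;
	return x;
}

/* After: the statement is removed. */
static int assign_after(int x, int a)
{
	(void)a;
	return x;
}

/* Self-check that each pair agrees on a small range of values. */
static void check_equivalence(void)
{
	for (int x = -2; x <= 2; x++) {
		assert(clear_before(x) == clear_after(x));
		for (int a = -2; a <= 2; a++)
			assert(assign_before(x, a) == assign_after(x, a));
	}
}
```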

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: fec: initialize clock with 0 rather than current kernel time
Sergey Organov [Wed, 15 Jul 2020 15:42:58 +0000 (18:42 +0300)]
net: fec: initialize clock with 0 rather than current kernel time

Initializing with 0 makes it much easier to identify time stamps from
an otherwise uninitialized clock.

Initialization of the PTP clock with the current kernel time makes little
sense, as the PTP time scale differs from the UTC time scale that kernel
time represents. It only leads to confusion when no actual PTP
initialization happens, as these time scales differ by a small integer
number of seconds (37 at the time of writing).

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: fec: enable to use PPS feature without time stamping
Sergey Organov [Wed, 15 Jul 2020 15:42:57 +0000 (18:42 +0300)]
net: fec: enable to use PPS feature without time stamping

The PPS feature can be useful even when hardware time stamping
of network packets is not in use, so remove the offending check
for this condition from fec_ptp_enable_pps().

Signed-off-by: Sergey Organov <sorganov@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: ipv6: drop duplicate word in comment
Randy Dunlap [Wed, 15 Jul 2020 16:42:46 +0000 (09:42 -0700)]
net: ipv6: drop duplicate word in comment

Drop the doubled word "by" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: sctp: drop duplicate words in comments
Randy Dunlap [Wed, 15 Jul 2020 16:42:45 +0000 (09:42 -0700)]
net: sctp: drop duplicate words in comments

Drop doubled words in several comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: ip6_fib.h: drop duplicate word in comment
Randy Dunlap [Wed, 15 Jul 2020 16:42:44 +0000 (09:42 -0700)]
net: ip6_fib.h: drop duplicate word in comment

Drop doubled word "the" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: dsa.h: drop duplicate word in comment
Randy Dunlap [Wed, 15 Jul 2020 16:42:43 +0000 (09:42 -0700)]
net: dsa.h: drop duplicate word in comment

Drop doubled word "to" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: caif: drop duplicate words in comments
Randy Dunlap [Wed, 15 Jul 2020 16:42:42 +0000 (09:42 -0700)]
net: caif: drop duplicate words in comments

Drop doubled words "or" and "the" in several comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: 9p: drop duplicate word in comment
Randy Dunlap [Wed, 15 Jul 2020 16:42:41 +0000 (09:42 -0700)]
net: 9p: drop duplicate word in comment

Drop doubled word "not" in a comment.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: wimax: fix duplicate words in comments
Randy Dunlap [Wed, 15 Jul 2020 16:42:40 +0000 (09:42 -0700)]
net: wimax: fix duplicate words in comments

Drop doubled words in two comments.
Fix a spello/typo.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: skbuff.h: drop duplicate words in comments
Randy Dunlap [Wed, 15 Jul 2020 16:42:39 +0000 (09:42 -0700)]
net: skbuff.h: drop duplicate words in comments

Drop doubled words in several comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agonet: qed: drop duplicate words in comments
Randy Dunlap [Wed, 15 Jul 2020 16:42:38 +0000 (09:42 -0700)]
net: qed: drop duplicate words in comments

Drop doubled word "the" in two comments.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agodrivers: net: wan: Fix trivial spelling
Kieran Bingham [Wed, 15 Jul 2020 12:48:34 +0000 (13:48 +0100)]
drivers: net: wan: Fix trivial spelling

The word 'descriptor' is misspelled throughout the tree.

Fix it up accordingly:
    decriptor -> descriptor

Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoMerge branch 'mlxsw-reg-add-policer-bandwidth-limits'
Jakub Kicinski [Thu, 16 Jul 2020 01:10:41 +0000 (18:10 -0700)]
Merge branch 'mlxsw-reg-add-policer-bandwidth-limits'

Ido Schimmel says:

====================
mlxsw: Offload tc police action

This patch set adds support for tc police action in mlxsw.

Patches #1-#2 add defines for policer bandwidth limits and resource
identifiers (e.g., maximum number of policers).

Patch #3 adds a common policer core in mlxsw. Currently it is only used
by the policy engine, but future patch sets will use it for trap
policers and storm control policers. The common core allows us to share
common logic between all policer types and abstract certain details from
the various users in mlxsw.

Patch #4 exposes the maximum number of supported policers and their
current usage to user space via devlink-resource. This provides better
visibility and is also used for selftest purposes.

Patches #5-#7 gradually add support for tc police action in the policy
engine by calling into the previously mentioned policer core.

Patch #8 adds a generic selftest for tc-police that can be used with
veth pairs or physical loopbacks.

Patches #9-#11 add mlxsw-specific selftests.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoselftests: mlxsw: Test policers' occupancy
Ido Schimmel [Wed, 15 Jul 2020 08:27:33 +0000 (11:27 +0300)]
selftests: mlxsw: Test policers' occupancy

Test that policers shared by different tc filters are correctly
reference counted by observing policers' occupancy via devlink-resource.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoselftests: mlxsw: Add scale test for tc-police
Ido Schimmel [Wed, 15 Jul 2020 08:27:32 +0000 (11:27 +0300)]
selftests: mlxsw: Add scale test for tc-police

Query the maximum number of supported policers using devlink-resource
and test that this number can be reached by configuring tc filters with
police action. Test that an error is returned in case the maximum number
is exceeded.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoselftests: mlxsw: tc_restrictions: Test tc-police restrictions
Ido Schimmel [Wed, 15 Jul 2020 08:27:31 +0000 (11:27 +0300)]
selftests: mlxsw: tc_restrictions: Test tc-police restrictions

Test that rate and burst size values outside the upper and lower limits
imposed by the device are rejected by the kernel.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agoselftests: forwarding: Add tc-police tests
Ido Schimmel [Wed, 15 Jul 2020 08:27:30 +0000 (11:27 +0300)]
selftests: forwarding: Add tc-police tests

Test tc-police action in various scenarios such as Rx policing, Tx
policing, shared policer and police piped to mirred. The test passes
with both veth pairs and loopbacked ports.

# ./tc_police.sh
TEST: police on rx                                                  [ OK ]
TEST: police on tx                                                  [ OK ]
TEST: police with shared policer - rx                               [ OK ]
TEST: police with shared policer - tx                               [ OK ]
TEST: police rx and mirror                                          [ OK ]
TEST: police tx and mirror                                          [ OK ]

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agomlxsw: spectrum_acl: Offload FLOW_ACTION_POLICE
Ido Schimmel [Wed, 15 Jul 2020 08:27:29 +0000 (11:27 +0300)]
mlxsw: spectrum_acl: Offload FLOW_ACTION_POLICE

Offload action police when used with a flower classifier. The number of
dropped packets is read from the policer and reported to tc.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agomlxsw: core_acl_flex_actions: Add police action
Ido Schimmel [Wed, 15 Jul 2020 08:27:28 +0000 (11:27 +0300)]
mlxsw: core_acl_flex_actions: Add police action

Add core functionality required to support police action in the policy
engine.

The utilized hardware policers are stored in a hash table keyed by the
flow action index. This allows policers to be shared between
multiple ACL rules.
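
As a rough illustration only (this is not the driver code), the sharing
scheme described above can be sketched as a small reference-counted
table keyed by the flow action index. All names here (policer_table,
policer_get(), policer_put(), NBUCKETS) are hypothetical, and a plain
chained table stands in for whatever hash table the driver really uses:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 16 /* assumed table size, for illustration only */

/* Hypothetical entry: one hardware policer shared by several rules. */
struct policer_entry {
	uint32_t fa_index;          /* key: flow action index */
	unsigned int refcount;      /* number of rules using this policer */
	struct policer_entry *next; /* chain within a bucket */
};

struct policer_table {
	struct policer_entry *buckets[NBUCKETS];
	unsigned int occupancy;     /* distinct policers currently in use */
};

/* Look up the policer for fa_index, creating it on first use. */
static struct policer_entry *policer_get(struct policer_table *t,
					 uint32_t fa_index)
{
	struct policer_entry **head = &t->buckets[fa_index % NBUCKETS];
	struct policer_entry *e;

	for (e = *head; e; e = e->next) {
		if (e->fa_index == fa_index) {
			e->refcount++; /* shared by one more rule */
			return e;
		}
	}

	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	e->fa_index = fa_index;
	e->refcount = 1;
	e->next = *head;
	*head = e;
	t->occupancy++;
	return e;
}

/* Drop one reference; free the policer when the last user is gone. */
static void policer_put(struct policer_table *t, struct policer_entry *e)
{
	struct policer_entry **pp;

	if (--e->refcount)
		return;

	for (pp = &t->buckets[e->fa_index % NBUCKETS]; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			*pp = e->next;
			break;
		}
	}
	t->occupancy--;
	free(e);
}
```

Two rules that pass the same index to policer_get() receive the same
entry, so the occupancy only grows when a new index is seen.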

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agomlxsw: core_acl_flex_actions: Work around hardware limitation
Ido Schimmel [Wed, 15 Jul 2020 08:27:27 +0000 (11:27 +0300)]
mlxsw: core_acl_flex_actions: Work around hardware limitation

In the policy engine, each ACL rule points to an action block where the
ACL actions are stored. Each action block consists of one or more action
sets. Each action set holds one or more individual actions, up to a
maximum queried from the device. For example:

                        Action set #1               Action set #2

+----------+          +--------------+            +--------------+
| ACL rule +---------->  Action #1   |      +----->  Action #4   |
+----------+          +--------------+      |     +--------------+
                      |  Action #2   |      |     |  Action #5   |
                      +--------------+      |     +--------------+
                      |  Action #3   +------+     |              |
                      +--------------+            +--------------+

                      <---------+ Action block +----------------->

The hardware has a limitation that prevents a policing action
(MLXSW_AFA_POLCNT_CODE when used with a policer, not a counter) from
being configured in the same action set with a trap action (i.e.,
MLXSW_AFA_TRAP_CODE or MLXSW_AFA_TRAPWU_CODE). Note that the latter is used
to implement multiple actions: 'trap', 'mirred', 'drop'.

Work around this limitation by teaching mlxsw_afa_block_append_action()
to create a new action set not only when there is no more room left in
the current set, but also when there is a conflict between the previously
mentioned actions.
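
The rule described above can be sketched roughly as follows. This is a
simplified model and not the code of mlxsw_afa_block_append_action():
the types, the block_append_action() helper and the fixed
MAX_ACTIONS_PER_SET are assumptions made for illustration (the real
per-set maximum is queried from the device):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action kinds; not the driver's real action codes. */
enum action_kind { ACT_OTHER, ACT_POLICE, ACT_TRAP };

#define MAX_ACTIONS_PER_SET 3 /* assumed value */
#define MAX_SETS 8            /* assumed value */

struct action_set {
	enum action_kind actions[MAX_ACTIONS_PER_SET];
	size_t count;
	bool has_police;
	bool has_trap;
};

struct action_block {
	struct action_set sets[MAX_SETS];
	size_t nsets; /* sets in use; 0 means an empty block */
};

/* True if adding 'kind' would put police and trap in the same set. */
static bool conflicts(const struct action_set *set, enum action_kind kind)
{
	return (kind == ACT_POLICE && set->has_trap) ||
	       (kind == ACT_TRAP && set->has_police);
}

/*
 * Append an action to the block. A new set is opened when the block is
 * empty, when the current set is full, or when the new action conflicts
 * with one already in the current set. Returns 0 on success, -1 if the
 * block has no room for another set.
 */
static int block_append_action(struct action_block *block,
			       enum action_kind kind)
{
	struct action_set *cur = block->nsets ?
				 &block->sets[block->nsets - 1] : NULL;

	if (!cur || cur->count == MAX_ACTIONS_PER_SET || conflicts(cur, kind)) {
		if (block->nsets == MAX_SETS)
			return -1;
		cur = &block->sets[block->nsets++];
		cur->count = 0;
		cur->has_police = false;
		cur->has_trap = false;
	}

	cur->actions[cur->count++] = kind;
	if (kind == ACT_POLICE)
		cur->has_police = true;
	if (kind == ACT_TRAP)
		cur->has_trap = true;
	return 0;
}
```

Appending a police action, another action and then a trap action to an
empty block leaves the first two in set #1 and moves the trap to set #2,
even though set #1 still has room.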

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agomlxsw: spectrum_policer: Add devlink resource support
Ido Schimmel [Wed, 15 Jul 2020 08:27:26 +0000 (11:27 +0300)]
mlxsw: spectrum_policer: Add devlink resource support

Expose via devlink-resource the maximum number of single-rate policers
and their current occupancy. Example:

$ devlink resource show pci/0000:01:00.0
...
  name global_policers size 1000 unit entry dpipe_tables none
    resources:
      name single_rate_policers size 968 occ 0 unit entry dpipe_tables none

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agomlxsw: spectrum_policer: Add policer core
Ido Schimmel [Wed, 15 Jul 2020 08:27:25 +0000 (11:27 +0300)]
mlxsw: spectrum_policer: Add policer core

Add common code to handle all policer-related functionality in mlxsw.
Currently, only policers for the policy engine are supported, but in the
future more policer families will be added, such as CPU (trap) policers
and storm control policers.

The API allows different modules to add / delete policers and read their
drop counter.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agomlxsw: resources: Add resource identifier for global policers
Ido Schimmel [Wed, 15 Jul 2020 08:27:24 +0000 (11:27 +0300)]
mlxsw: resources: Add resource identifier for global policers

Add a resource identifier for maximum global policers so that it could
be later used to query the information from firmware.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
4 years agomlxsw: reg: Add policer bandwidth limits
Ido Schimmel [Wed, 15 Jul 2020 08:27:23 +0000 (11:27 +0300)]
mlxsw: reg: Add policer bandwidth limits

Add policer bandwidth limits for both rate and burst size so that they
could be enforced by a later patch.
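
A minimal sketch of how such limits might be enforced is shown below.
The structure, the helper name and the units are hypothetical and do not
reflect the register definitions added by this patch; the caller is
assumed to fill in the limits from whatever the device supports:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the device limits. */
struct policer_limits {
	uint64_t rate_min, rate_max;   /* assumed unit: bits per second */
	uint32_t burst_min, burst_max; /* assumed unit: bytes */
};

/* Return 0 if both parameters are within limits, -EINVAL otherwise. */
static int policer_params_check(const struct policer_limits *lim,
				uint64_t rate, uint32_t burst)
{
	if (rate < lim->rate_min || rate > lim->rate_max)
		return -EINVAL;
	if (burst < lim->burst_min || burst > lim->burst_max)
		return -EINVAL;
	return 0;
}
```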

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>