linux-2.6-microblaze.git
4 years agobnxt_en: Add fw.mgmt.api version to devlink info_get cb.
Vasundhara Volam [Fri, 27 Mar 2020 09:34:52 +0000 (15:04 +0530)]
bnxt_en: Add fw.mgmt.api version to devlink info_get cb.

Display the minimum version of firmware interface spec supported
between driver and firmware. Also update bnxt.rst documentation file.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: Add macro for "fw.mgmt.api" to info_get cb.
Vasundhara Volam [Fri, 27 Mar 2020 09:34:51 +0000 (15:04 +0530)]
devlink: Add macro for "fw.mgmt.api" to info_get cb.

Add definition and documentation for the new generic info
"fw.mgmt.api". This macro specifies the version of the software
interfaces between driver and firmware.

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'mlxsw-Various-static-checkers-fixes'
David S. Miller [Fri, 27 Mar 2020 22:06:50 +0000 (15:06 -0700)]
Merge branch 'mlxsw-Various-static-checkers-fixes'

Ido Schimmel says:

====================
mlxsw: Various static checkers fixes

Jakub told me he gets some warnings with W=1, so I decided to check with
sparse, smatch and coccinelle as well. This patch set fixes all the
issues found. None are actual bugs / regressions and therefore not
targeted at net.

Patches #1-#2 add missing kernel-doc comments.

Patch #3 removes dead code.

Patch #4 reworks the ACL code to avoid defining a static variable in a
header file.

Patch #5 removes unnecessary conversion to bool that coccinelle warns
about.

Patch #6 avoids false-positive uninitialized symbol errors emitted by
smatch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: spectrum_router: Avoid uninitialized symbol errors
Ido Schimmel [Fri, 27 Mar 2020 08:55:25 +0000 (11:55 +0300)]
mlxsw: spectrum_router: Avoid uninitialized symbol errors

Suppress the following smatch errors. None of these are actually
possible with current code paths.

drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1220
mlxsw_sp_ipip_entry_find_decap() error: uninitialized symbol 'saddrp'.
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1220
mlxsw_sp_ipip_entry_find_decap() error: uninitialized symbol
'saddr_len'.
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1221
mlxsw_sp_ipip_entry_find_decap() error: uninitialized symbol
'saddr_prefix_len'.

drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:1390
mlxsw_sp_netdevice_ipip_ol_reg_event() error: uninitialized symbol
'ipipt'.

drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:3255
mlxsw_sp_nexthop_group_update() error: uninitialized symbol 'err'.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: switchx2: Remove unnecessary conversion to bool
Ido Schimmel [Fri, 27 Mar 2020 08:55:24 +0000 (11:55 +0300)]
mlxsw: switchx2: Remove unnecessary conversion to bool

Suppress following warning from coccinelle:

drivers/net/ethernet/mellanox/mlxsw//switchx2.c:183:63-68: WARNING:
conversion to bool not needed here

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: core_acl: Avoid defining static variable in header file
Ido Schimmel [Fri, 27 Mar 2020 08:55:23 +0000 (11:55 +0300)]
mlxsw: core_acl: Avoid defining static variable in header file

The static array 'mlxsw_afk_element_infos' in 'core_acl_flex_keys.h' is
copied to each file that includes the header, but not all use it. This
results in the following warnings when compiling with W=1:

drivers/net/ethernet/mellanox/mlxsw//core_acl_flex_keys.h:76:44:
warning: ‘mlxsw_afk_element_infos’ defined but not used
[-Wunused-const-variable=]

One way to suppress the warning is to mark the array with
'__maybe_unused', but another option is to remove it from the header
file entirely.

Change 'struct mlxsw_afk_element_inst' to store the key to the array
('element') instead of the array value keyed by 'element'. Adjust the
different users accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: spectrum: Remove unused RIF and FID families
Ido Schimmel [Fri, 27 Mar 2020 08:55:22 +0000 (11:55 +0300)]
mlxsw: spectrum: Remove unused RIF and FID families

In merge commit 50853808ff4a ("Merge branch
'mlxsw-Prepare-for-VLAN-aware-bridge-w-VxLAN'") I flipped mlxsw to use
emulated 802.1Q FIDs and correspondingly emulated VLAN RIFs. This means
that the non-emulated variants are no longer used. Remove them and
suppress the following warnings when compiling with W=1:

drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:7572:38: warning:
‘mlxsw_sp_rif_vlan_ops’ defined but not used [-Wunused-const-variable=]

drivers/net/ethernet/mellanox/mlxsw//spectrum_fid.c:584:41: warning:
‘mlxsw_sp_fid_8021q_family’ defined but not used
[-Wunused-const-variable=]

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: spectrum_router: Add proper function documentation
Ido Schimmel [Fri, 27 Mar 2020 08:55:21 +0000 (11:55 +0300)]
mlxsw: spectrum_router: Add proper function documentation

Suppress following warnings when compiling with W=1:

drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1552: warning:
Function parameter or member 'mlxsw_sp' not described in
'__mlxsw_sp_ipip_entry_update_tunnel'
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1552: warning:
Function parameter or member 'ipip_entry' not described in
'__mlxsw_sp_ipip_entry_update_tunnel'
drivers/net/ethernet/mellanox/mlxsw//spectrum_router.c:1552: warning:
Function parameter or member 'extack' not described in
'__mlxsw_sp_ipip_entry_update_tunnel'

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: i2c: Add missing field documentation
Ido Schimmel [Fri, 27 Mar 2020 08:55:20 +0000 (11:55 +0300)]
mlxsw: i2c: Add missing field documentation

Suppress following warning when compiling with W=1:

drivers/net/ethernet/mellanox/mlxsw//i2c.c:78: warning: Function
parameter or member 'cmd' not described in 'mlxsw_i2c'

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: don't touch suspended flag if there's no suspend/resume callback
Heiner Kallweit [Thu, 26 Mar 2020 17:58:24 +0000 (18:58 +0100)]
net: phy: don't touch suspended flag if there's no suspend/resume callback

So far we set phydev->suspended to true in phy_suspend() even if the
PHY driver doesn't implement the suspend callback. This applies
accordingly for the resume path. The current behavior doesn't cause
any issue I'd be aware of, but it's not logical and misleading,
especially considering the description of the flag:
"suspended: Set to true if this phy has been suspended successfully"

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-atlantic-MACSec-support-for-AQC-devices'
David S. Miller [Fri, 27 Mar 2020 03:17:37 +0000 (20:17 -0700)]
Merge branch 'net-atlantic-MACSec-support-for-AQC-devices'

Igor Russkikh says:

====================
net: atlantic: MACSec support for AQC devices

This patchset introduces MACSec HW offloading support in
Marvell(Aquantia) AQC atlantic driver.

This implementation is a joint effort of Marvell developers on top of
the work started by Antoine Tenart.

v2:
 * clean up the generated code (removed useless bit operations);
 * use WARN_ONCE to avoid log spam;
 * use put_unaligned_be64;
 * removed trailing \0 and length limit for format strings;

v1: https://patchwork.ozlabs.org/cover/1259998/

RFC v2: https://patchwork.ozlabs.org/cover/1252204/

RFC v1: https://patchwork.ozlabs.org/cover/1238082/

Several patches introduce backward-incompatible changes and are
subject for discussion/drop:

1) patch 0007:
  multicast/broadcast when offloading is needed to handle ARP requests,
  because they have broadcast destination address;
  With this patch we also match and encrypt/decrypt packets between macsec
  hw and realdev based on device's mac address.
  This can potentially be used to support multiple macsec offloaded
  interfaces on top of one realdev.
  However in some environments this could lead to problems, e.g. the
  'bridge over macsec' configuration will expect the packets with unknown
  src MAC should come through macsec.
  The patch is questionable, we've used it because our current hw setup
  and requirements both assume that the decryption is done based on mac
  address match only.
  This could be changed by encrypting/decripting all the traffic (except
  control).

2) patch 0009:
  real_dev features are now propagated to macsec device (when HW
  offloading is enabled), otherwise feature set might lead to HW
  reconfiguration during MACSec configuration.
  Also, HW offloaded macsec should be able to keep LRO LSO features,
  since they are transparent for macsec engine (at least in our hardware).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: add XPN handling
Mark Starovoytov [Wed, 25 Mar 2020 12:52:46 +0000 (15:52 +0300)]
net: atlantic: add XPN handling

This patch adds XPN handling.
Our driver doesn't support XPN, but we should still update a couple
of places in the code, because the size of 'next_pn' field has
changed.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec offload statistics implementation
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:45 +0000 (15:52 +0300)]
net: atlantic: MACSec offload statistics implementation

This patch adds support for MACSec statistics on Atlantic network cards.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec offload statistics HW bindings
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:44 +0000 (15:52 +0300)]
net: atlantic: MACSec offload statistics HW bindings

This patch adds the Atlantic HW-specific bindings for MACSec statistics,
e.g. register addresses / structs, helper function, etc, which will be
used by actual callback implementations.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec ingress offload implementation
Mark Starovoytov [Wed, 25 Mar 2020 12:52:43 +0000 (15:52 +0300)]
net: atlantic: MACSec ingress offload implementation

This patch adds support for MACSec ingress HW offloading on Atlantic
network cards.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec ingress offload HW bindings
Mark Starovoytov [Wed, 25 Mar 2020 12:52:42 +0000 (15:52 +0300)]
net: atlantic: MACSec ingress offload HW bindings

This patch adds the Atlantic HW-specific bindings for MACSec ingress, e.g.
register addresses / structs, helper function, etc, which will be used by
actual callback implementations.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec egress offload implementation
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:41 +0000 (15:52 +0300)]
net: atlantic: MACSec egress offload implementation

This patch adds support for MACSec egress HW offloading on Atlantic
network cards.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec egress offload HW bindings
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:40 +0000 (15:52 +0300)]
net: atlantic: MACSec egress offload HW bindings

This patch adds the Atlantic HW-specific bindings for MACSec egress, e.g.
register addresses / structs, helper function, etc, which will be used by
actual callback implementations.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: atlantic: MACSec offload skeleton
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:39 +0000 (15:52 +0300)]
net: atlantic: MACSec offload skeleton

This patch adds basic functionality for MACSec offloading for Atlantic
NICs.

MACSec offloading functionality is enabled if network card has
appropriate FW that has MACSec offloading enabled in config.

Actual functionality (ingress, egress, etc) will be added in follow-up
patches.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: macsec: report real_dev features when HW offloading is enabled
Mark Starovoytov [Wed, 25 Mar 2020 12:52:38 +0000 (15:52 +0300)]
net: macsec: report real_dev features when HW offloading is enabled

This patch makes real_dev_feature propagation by MACSec offloaded device.

Issue description:
real_dev features are disabled upon macsec creation.

Root cause:
Features limitation (specific to SW MACSec limitation) is being applied
to HW offloaded case as well.
This causes 'set_features' request on the real_dev with reduced feature
set due to chain propagation.

Proposed solution:
Report real_dev features when HW offloading is enabled.
NB! MACSec offloaded device does not propagate VLAN offload features at
the moment. This can potentially be added later on as a separate patch.

Note: this patch requires HW offloading to be enabled by default in order
to function properly.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: macsec: add support for getting offloaded stats
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:37 +0000 (15:52 +0300)]
net: macsec: add support for getting offloaded stats

When HW offloading is enabled, offloaded stats should be used, because
s/w stats are wrong and out of sync with the HW in this case.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: macsec: support multicast/broadcast when offloading
Mark Starovoytov [Wed, 25 Mar 2020 12:52:36 +0000 (15:52 +0300)]
net: macsec: support multicast/broadcast when offloading

The idea is simple. If the frame is an exact match for the controlled port
(based on DA comparison), then we simply divert this skb to matching port.

Multicast/broadcast messages are delivered to all ports.

Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: macsec: allow multiple macsec devices with offload
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:35 +0000 (15:52 +0300)]
net: macsec: allow multiple macsec devices with offload

Offload engine can setup several SecY. Each macsec interface shall have
its own mac address. It will filter a traffic by dest mac address.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: macsec: init secy pointer in macsec_context
Dmitry Bogdanov [Wed, 25 Mar 2020 12:52:34 +0000 (15:52 +0300)]
net: macsec: init secy pointer in macsec_context

This patch adds secy pointer initialization in the macsec_context.
It will be used by MAC drivers in offloading operations.

Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: macsec: add support for offloading to the MAC
Antoine Tenart [Wed, 25 Mar 2020 12:52:33 +0000 (15:52 +0300)]
net: macsec: add support for offloading to the MAC

This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC,
allowing a user to select a MAC as a provider for MACsec offloading
operations.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: macsec: allow to reference a netdev from a MACsec context
Antoine Tenart [Wed, 25 Mar 2020 12:52:32 +0000 (15:52 +0300)]
net: macsec: allow to reference a netdev from a MACsec context

This patch allows to reference a net_device from a MACsec context. This
is needed to allow implementing MACsec operations in net device drivers.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: add a reference to MACsec ops in net_device
Antoine Tenart [Wed, 25 Mar 2020 12:52:31 +0000 (15:52 +0300)]
net: add a reference to MACsec ops in net_device

This patch adds a reference to MACsec ops to the net_device structure,
allowing net device drivers to implement offloading operations for
MACsec.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: introduce the MACSEC netdev feature
Antoine Tenart [Wed, 25 Mar 2020 12:52:30 +0000 (15:52 +0300)]
net: introduce the MACSEC netdev feature

This patch introduce a new netdev feature, which will be used by drivers
to state they can perform MACsec transformations in hardware.

The patchset was gathered by Mark, macsec functinality itself
was implemented by Dmitry, Mark and Pavel Belous.

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotaprio: do not use BIT() in TCA_TAPRIO_ATTR_FLAG_* definitions
Eugene Syromiatnikov [Tue, 24 Mar 2020 04:19:20 +0000 (05:19 +0100)]
taprio: do not use BIT() in TCA_TAPRIO_ATTR_FLAG_* definitions

BIT() macro definition is internal to the Linux kernel and is not
to be used in UAPI headers; replace its usage with the _BITUL() macro
that is already used elsewhere in the header.

Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoipv6: ndisc: add support for 'PREF64' dns64 prefix identifier
Maciej Żenczykowski [Tue, 24 Mar 2020 01:10:19 +0000 (18:10 -0700)]
ipv6: ndisc: add support for 'PREF64' dns64 prefix identifier

This is trivial since we already have support for the entirely
identical (from the kernel's point of view) RDNSS, DNSSL, etc. that
also contain opaque data that needs to be passed down to userspace
for further processing.

As specified in draft-ietf-6man-ra-pref64-09 (while it is still a draft,
it is purely waiting on the RFC Editor for cleanups and publishing):
  PREF64 option contains lifetime and a (up to) 96-bit IPv6 prefix.

The 8-bit identifier of the option type as assigned by the IANA is 38.

Since we lack DNS64/NAT64/CLAT support in kernel at the moment,
thus this option should also be passed on to userland.

See:
  https://tools.ietf.org/html/draft-ietf-6man-ra-pref64-09
  https://www.iana.org/assignments/icmpv6-parameters/icmpv6-parameters.xhtml#icmpv6-parameters-5

Cc: Erik Kline <ek@google.com>
Cc: Jen Linkova <furry@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Michael Haro <mharo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-ethernet-ti-add-networking-support-for-k3-am65x-j721e-soc'
David S. Miller [Fri, 27 Mar 2020 03:01:14 +0000 (20:01 -0700)]
Merge branch 'net-ethernet-ti-add-networking-support-for-k3-am65x-j721e-soc'

Grygorii Strashko says:

====================
net: ethernet: ti: add networking support for k3 am65x/j721e soc

This v6 series adds basic networking support support TI K3 AM654x/J721E SoC which
have integrated Gigabit Ethernet MAC (Media Access Controller) into device MCU
domain and named MCU_CPSW0 (CPSW2G NUSS).

Formally TRMs refer CPSW2G NUSS as two-port Gigabit Ethernet Switch subsystem
with port 0 being the CPPI DMA host port and port 1 being the external Ethernet
port, but for 1 external port device it's just Port 0 <-> ALE <-> Port 1 and it's
rather device with HW filtering capabilities then actually switching device.
It's expected to have similar devices, but with more external ports.

The new Host port 0 Communications Port Programming Interface (CPPI5) is
operating by TI AM654x/J721E NAVSS Unified DMA Peripheral Root Complex (UDMA-P)
controller [1].

The CPSW2G contains below modules for which existing code is re-used:
 - MAC SL: cpsw_sl.c
 - Address Lookup Engine (ALE): cpsw_ale.c, basically compatible with K2 66AK2E/G
 - Management Data Input/Output interface (MDIO): davinci_mdio.c, fully
   compatible with TI AM3/4/5 devices

Basic features supported by CPSW2G NUSS driver:
 - VLAN support, 802.1Q compliant, Auto add port VLAN for untagged frames on
   ingress, Auto VLAN removal on egress and auto pad to minimum frame size.
 - multicast filtering
 - promisc mode
 - TX multiq support in Round Robin or Fixed priority modes
 - RX checksum offload for non-fragmented IPv4/IPv6 TCP/UDP packets
 - TX checksum offload support for IPv4/IPv6 TCP/UDP packets (J721E only).

Features under development:
 - Support for IEEE 1588 Clock Synchronization. The CPSW2G NUSS includes new
   version of Common Platform Time Sync (CPTS)
 - tc-mqprio: priority level Quality Of Service (QOS) support (802.1p)
 - tc-cbs: Support for Audio/Video Bridging (P802.1Qav/D6.0) HW shapers
 - tc-taprio: IEEE 802.1Qbv/D2.2 Enhancements for Scheduled Traffic
 - frame preemption: IEEE P902.3br/D2.0 Interspersing Express Traffic, 802.1Qbu
 - extended ALE features: classifier/policers, auto-aging

Patches 1-6 are intended for netdev, Patches 7-11 are intended for K3 Platform
tree and provided here for testing purposes.

Changes in v6:
 - fixed comments from Rob Herring <robh@kernel.org> and added his Reviewed-by.
 - added Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>

Changes in v5:
 - renamed files k3-udma-desc-pool.*  k3-udma-desc-pool to k3-cppi-desc-pool.*,
   and API to k3_cppi_desc_pool_* as requested by Peter Ujfalusi <peter.ujfalusi@ti.com>
 - fixed copy-paste err in am65_cpsw_nuss_ndo_slave_set_rx_mode() which blocked
   recieving of mcast frames.
 - added Tested-by: Murali Karicheri <m-karicheri2@ti.com>

Changes in v4:
 - fixed minor comments from Jakub Kicinski <kuba@kernel.org>
 - dependencies resolved: required phy-rmii-sel changes [2] queued for merge
   except one [3] which is included in this series with Kishon's ask.

Changes in v3:
 - add ARM64 defconfig changes for testing purposes
 - fixed DT yaml definition
 - fixed comments from Jakub Kicinski <kuba@kernel.org>

Changes in v2:
 - fixed DT yaml definition
 - fixed comments from David Miller

v5: https://patchwork.ozlabs.org/cover/1258295/
v4: https://patchwork.ozlabs.org/cover/1256092/
v3: https://patchwork.ozlabs.org/cover/1254568/
v2: https://patchwork.ozlabs.org/cover/1250674/
v1: https://lwn.net/Articles/813087/

TRMs:
 AM654: http://www.ti.com/lit/ug/spruid7e/spruid7e.pdf
 J721E: http://www.ti.com/lit/ug/spruil1a/spruil1a.pdf

Preliminary documentation can be found at:
 http://software-dl.ti.com/processor-sdk-linux/esd/docs/latest/linux/Foundational_Components/Kernel/Kernel_Drivers/Network/K3_CPSW2g.html

[1] https://lwn.net/Articles/808030/
[2] https://lkml.org/lkml/2020/2/22/100
[3] https://lkml.org/lkml/2020/3/3/724
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoarm64: defconfig: ti: k3: enable dma and networking
Grygorii Strashko [Mon, 23 Mar 2020 22:52:54 +0000 (00:52 +0200)]
arm64: defconfig: ti: k3: enable dma and networking

Enable TI K3 AM654x/J721E DMA and networking options.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoarm64: dts: ti: k3-j721e-common-proc-board: add mcu cpsw nuss pinmux and phy defs
Grygorii Strashko [Mon, 23 Mar 2020 22:52:53 +0000 (00:52 +0200)]
arm64: dts: ti: k3-j721e-common-proc-board: add mcu cpsw nuss pinmux and phy defs

The TI J721E EVM base board has TI DP83867 PHY connected to external CPSW
NUSS Port 1 in rgmii-rxid mode.

Hence, add pinmux and Ethernet PHY configuration for TI j721e SoC MCU
Gigabit Ethernet two ports Switch subsystem (CPSW NUSS).

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoarm64: dts: ti: k3-j721e-mcu: add mcu cpsw nuss node
Grygorii Strashko [Mon, 23 Mar 2020 22:52:52 +0000 (00:52 +0200)]
arm64: dts: ti: k3-j721e-mcu: add mcu cpsw nuss node

Add DT node for The TI J721E MCU SoC Gigabit Ethernet
subsystem (MCU CPSW NUSS).

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoarm64: dts: k3-am654-base-board: add mcu cpsw nuss pinmux and phy defs
Grygorii Strashko [Mon, 23 Mar 2020 22:52:51 +0000 (00:52 +0200)]
arm64: dts: k3-am654-base-board: add mcu cpsw nuss pinmux and phy defs

AM654 EVM base board has TI DP83867 PHY connected to external CPSW NUSS
Port 1 in rgmii-rxid mode.

Hence, add pinmux and Ethernet PHY configuration for TI am654 SoC Gigabit
Ethernet two ports Switch subsystem (CPSW NUSS).

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoarm64: dts: ti: k3-am65-mcu: add cpsw nuss node
Grygorii Strashko [Mon, 23 Mar 2020 22:52:50 +0000 (00:52 +0200)]
arm64: dts: ti: k3-am65-mcu: add cpsw nuss node

Add DT node for the TI AM65x SoC Gigabit Ethernet two ports Switch
subsystem (CPSW NUSS).

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver
Grygorii Strashko [Mon, 23 Mar 2020 22:52:49 +0000 (00:52 +0200)]
net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver

The TI AM65x/J721E SoCs Gigabit Ethernet Switch subsystem (CPSW2G NUSS) has
two ports - One Ethernet port (port 1) with selectable RGMII and RMII
interfaces and an internal Communications Port Programming Interface (CPPI)
port (Host port 0) and with ALE in between. It also contains
 - Management Data Input/Output (MDIO) interface for physical layer device
(PHY) management;
 - Updated Address Lookup Engine (ALE) module;
 - (TBD) New version of Common platform time sync (CPTS) module.

On the TI am65x/J721E SoCs CPSW NUSS Ethernet subsystem into device MCU
domain named MCU_CPSW0.

Host Port 0 CPPI Packet Streaming Interface interface supports 8 TX
channels and one RX channels operating by TI am654 NAVSS Unified DMA
Peripheral Root Complex (UDMA-P) controller.

Introduced driver provides standard Linux net_device to user space and supports:
 - ifconfig up/down
 - MAC address configuration
 - ethtool operation:
   --driver
   --change
   --register-dump
   --negotiate phy
   --statistics
   --set-eee phy
   --show-ring
   --show-channels
   --set-channels
 - net_device ioctl mii-control
 - promisc mode

 - rx checksum offload for non-fragmented IPv4/IPv6 TCP/UDP packets.
   The CPSW NUSS can verify IPv4/IPv6 TCP/UDP packets checksum and fills
   csum information for each packet in psdata[2] word:
   - BIT(16) CHECKSUM_ERROR - indicates csum error
   - BIT(17) FRAGMENT -  indicates fragmented packet
   - BIT(18) TCP_UDP_N -  Indicates TCP packet was detected
   - BIT(19) IPV6_VALID,  BIT(20) IPV4_VALID - indicates IPv6/IPv4 packet
   - BIT(15, 0) CHECKSUM_ADD - This is the value that was summed
   during the checksum computation. This value is FFFFh for non fragmented
   IPV4/6 UDP/TCP packets with no checksum error.

   RX csum offload can be disabled:
 ethtool -K <dev> rx-checksum on|off

 - tx checksum offload support for IPv4/IPv6 TCP/UDP packets (J721E only).
   TX csum HW offload  can be enabled/disabled:
 ethtool -K <dev> tx-checksum-ip-generic on|off

 - multiq and switch between round robin/prio modes for cppi tx queues by
   using Netdev private flag "p0-rx-ptype-rrobin" to switch between
   Round Robin and Fixed priority modes:
 # ethtool --show-priv-flags eth0
 Private flags for eth0:
 p0-rx-ptype-rrobin: on
 # ethtool --set-priv-flags eth0 p0-rx-ptype-rrobin off

   Number of TX DMA channels can be changed using "ethtool -L eth0 tx <N>".

 - GRO support: the napi_gro_receive() and napi_complete_done() are used.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodt-binding: ti: am65x: document mcu cpsw nuss
Grygorii Strashko [Mon, 23 Mar 2020 22:52:48 +0000 (00:52 +0200)]
dt-binding: ti: am65x: document mcu cpsw nuss

Document device tree bindings for The TI AM654x/J721E SoC Gigabit Ethernet MAC
(Media Access Controller - CPSW2G NUSS). The CPSW NUSS provides Ethernet packet
communication for the device.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ethernet: ti: ale: am65: add support for default thread cfg
Grygorii Strashko [Mon, 23 Mar 2020 22:52:47 +0000 (00:52 +0200)]
net: ethernet: ti: ale: am65: add support for default thread cfg

Add support for default thread configuration for AM65x CPSW NUSS ALE to
allow route all ingress packets to one default RX UDMA flow.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ethernet: ti: ale: add support for mac-only mode
Grygorii Strashko [Mon, 23 Mar 2020 22:52:46 +0000 (00:52 +0200)]
net: ethernet: ti: ale: add support for mac-only mode

The new CPSW ALE version, available on TI K3 AM654/J721E SoCs family,
allows to switch any external port to MAC only mode. When MAC only mode
enabled this port be treated like a MAC port for the host. All traffic
received is only sent to the host. The host must direct traffic to this
port as the lookup engine will not send traffic to the ports with the
p0_maconly bit set and the p0_no_learn also set. If p0_maconly bit is set
and the p0_no_learn is not set, the host can send non-directed packets that
can be sent to the destination of a MacOnly port. It is also possible that
The host can broadcast to all ports including MacOnly ports in this mode.

This patch add ALE supprt for MAC only mode.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti...
Grygorii Strashko [Mon, 23 Mar 2020 22:52:45 +0000 (00:52 +0200)]
net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled

On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS the unregistered multicast
packets are still can be received with promisc and allmulti disabled.

This happens, because ALE VLAN entries on these SoCs do not contain port
masks for reg/unreg mcast packets, but instead store indexes of
ALE_VLAN_MASK_MUXx_REG registers which intended for store port masks for
reg/unreg mcast packets.

ALE VLAN entry:UNREG_MCAST_FLOOD_INDEX -> ALE_VLAN_MASK_MUXx
ALE VLAN entry:REG_MCAST_FLOOD_INDEX -> ALE_VLAN_MASK_MUXy

The commit b361da837392 ("net: netcp: ale: add proper ale entry mask bits
for netcp switch ALE") update ALE code to support such ALE entries, it is
always used ALE_VLAN_MASK_MUX0_REG index in ALE VLAN entry for unreg mcast
packets mask configuration, which is read-only, at least for AM65xx MCU
CPSW2G NUSS and 66AK2E/L NUSS. As result unreg mcast packets are allowed
always.

Hence, update ALE code to use ALE_VLAN_MASK_MUX1_REG index for ALE VLAN
entries to configure unreg mcast port mask.

Fixes: b361da837392 ("net: netcp: ale: add proper ale entry mask bits for netcp switch ALE")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agophy: ti: gmii-sel: simplify config dependencies between net drivers and gmii phy
Grygorii Strashko [Mon, 23 Mar 2020 22:52:44 +0000 (00:52 +0200)]
phy: ti: gmii-sel: simplify config dependencies between net drivers and gmii phy

The phy-gmii-sel can be only auto selected in Kconfig and now the pretty
complex Kconfig dependencies are defined for phy-gmii-sel driver, which
also need to be updated every time phy-gmii-sel is re-used for any new
networking driver.

Simplify Kconfig definition for phy-gmii-sel PHY driver - drop all
dependencies and from networking drivers and rely on using 'imply
PHY_TI_GMII_SEL' in Kconfig definitions for networking drivers instead.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Tested-by: Murali Karicheri <m-karicheri2@ti.com>
Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'cls_flower-Use-extack-in-fl_set_key'
David S. Miller [Fri, 27 Mar 2020 02:52:31 +0000 (19:52 -0700)]
Merge branch 'cls_flower-Use-extack-in-fl_set_key'

Guillaume Nault says:

====================
cls_flower: Use extack in fl_set_key()

Add missing extack messages in fl_set_key(), so that users can get more
meaningfull error messages when netlink attributes are rejected.

Patch 1 also extends extack in tcf_change_indev() (in pkt_cls.h) since
this function is used by fl_set_key().
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocls_flower: Add extack support for flags key
Guillaume Nault [Mon, 23 Mar 2020 20:48:53 +0000 (21:48 +0100)]
cls_flower: Add extack support for flags key

Pass extack down to fl_set_key_flags() and set message on error.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocls_flower: Add extack support for src and dst port range options
Guillaume Nault [Mon, 23 Mar 2020 20:48:51 +0000 (21:48 +0100)]
cls_flower: Add extack support for src and dst port range options

Pass extack down to fl_set_key_port_range() and set message on error.

Both the min and max ports would qualify as invalid attributes here.
Report the min one as invalid, as it's probably what makes the most
sense from a user point of view.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agocls_flower: Add extack support for mpls options
Guillaume Nault [Mon, 23 Mar 2020 20:48:49 +0000 (21:48 +0100)]
cls_flower: Add extack support for mpls options

Pass extack down to fl_set_key_mpls() and set message on error.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: sched: refine extack messages in tcf_change_indev
Guillaume Nault [Mon, 23 Mar 2020 20:48:47 +0000 (21:48 +0100)]
net: sched: refine extack messages in tcf_change_indev

Add an error message when device wasn't found.
While there, also set the bad attribute's offset in extack.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-phy-marvell-usb-to-mdio-controller'
David S. Miller [Fri, 27 Mar 2020 02:49:34 +0000 (19:49 -0700)]
Merge branch 'net-phy-marvell-usb-to-mdio-controller'

Tobias Waldekranz says:

====================
net: phy: marvell usb to mdio controller

Support for an MDIO controller present on development boards for
Marvell switches from the Link Street (88E6xxx) family.

v3->v4:
- Remove unnecessary dependency on OF_MDIO.

v2->v3:
- Rename driver smi2usb -> mvusb.
- Clean up unused USB references.

v1->v2:
- Reverse christmas tree ordering of local variables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: add marvell usb to mdio controller
Tobias Waldekranz [Mon, 23 Mar 2020 10:14:14 +0000 (11:14 +0100)]
net: phy: add marvell usb to mdio controller

An MDIO controller present on development boards for Marvell switches
from the Link Street (88E6xxx) family.

Using this module, you can use the following setup as a development
platform for switchdev and DSA related work.

   .-------.      .-----------------.
   |      USB----USB                |
   |  SoC  |      |  88E6390X-DB  ETH1-10
   |      ETH----ETH0               |
   '-------'      '-----------------'

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodt-bindings: net: add marvell usb to mdio bindings
Tobias Waldekranz [Mon, 23 Mar 2020 10:14:13 +0000 (11:14 +0100)]
dt-bindings: net: add marvell usb to mdio bindings

Describe how the USB to MDIO controller can optionally use device tree
bindings to reference attached devices such as switches.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: phy: probe PHY drivers synchronously
Heiner Kallweit [Fri, 27 Mar 2020 00:00:22 +0000 (01:00 +0100)]
net: phy: probe PHY drivers synchronously

If we have scenarios like

mdiobus_register()
-> loads PHY driver module(s)
-> registers PHY driver(s)
-> may schedule async probe
phydev = mdiobus_get_phy()
<phydev action involving PHY driver>

or

phydev = phy_device_create()
-> loads PHY driver module
-> registers PHY driver
-> may schedule async probe
<phydev action involving PHY driver>

then we expect the PHY driver to be bound to the phydev when triggering
the action. This may not be the case in case of asynchronous probing.
Therefore ensure that PHY drivers are probed synchronously.

Default still is sync probing, except async probing is explicitly
requested. I saw some comments that the intention is to promote
async probing for more parallelism in boot process and want to be
prepared for the case that the default is changed to async probing.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'implement-DEVLINK_CMD_REGION_NEW'
David S. Miller [Fri, 27 Mar 2020 02:39:27 +0000 (19:39 -0700)]
Merge branch 'implement-DEVLINK_CMD_REGION_NEW'

Jacob Keller says:

====================
implement DEVLINK_CMD_REGION_NEW

This series adds support for the DEVLINK_CMD_REGION_NEW operation, used to
enable userspace requesting a snapshot of a region on demand.

This can be useful to enable adding regions for a driver for which there is
no trigger to create snapshots. By making this a core part of devlink, there
is no need for the drivers to use a separate channel such as debugfs.

The primary intent for this kind of region is to expose device information
that might be useful for diagnostics and information gathering.

The first few patches refactor regions to support a new ops structure for
extending the available operations that regions can perform. This includes
converting the destructor into an op from a function argument.

Next, patches refactor the snapshot id allocation to use an xarray which
tracks the number of current snapshots using a given id. This is done so
that id lifetime can be determined, and ids can be released when no longer
in use.

Without this change, snapshot ids remain used forever, until the snapshot_id
count rolled over UINT_MAX.

Finally, code to enable the previously unused DEVLINK_CMD_REGION_NEW is
added. This code enforces that the snapshot id is always provided, unlike
previous revisions of this series.

Finally, a patch is added to enable using this new command via the .snapshot
callback in both netdevsim and the ice driver.

For the ice driver, a new "nvm-flash" region is added, which will enable
read access to the NVM flash contents. The intention for this is to allow
diagnostics tools to gather information about the device. By using a
snapshot and gathering the NVM contents all at once, the contents can be
atomic.

Links to previous discussions:
1st RFC - https://lore.kernel.org/netdev/20200130225913.1671982-1-jacob.e.keller@intel.com/
2nd RFC - https://lore.kernel.org/netdev/20200214232223.3442651-1-jacob.e.keller@intel.com/
v1 - https://lore.kernel.org/netdev/20200324223445.2077900-1-jacob.e.keller@intel.com/
v2 - https://lore.kernel.org/netdev/20200326035157.2211090-1-jacob.e.keller@intel.com/

Major changes since RFC:
* use an xarray for tracking snapshot ids, rather than an IDR
* remove support for auto-generated snapshot ids in DEVLINK_CMD_REGION_NEW

See each patch for an individual changelog per-patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoice: add a devlink region for dumping NVM contents
Jacob Keller [Thu, 26 Mar 2020 18:37:18 +0000 (11:37 -0700)]
ice: add a devlink region for dumping NVM contents

Add a devlink region for exposing the device's Non Volatime Memory flash
contents.

Support the recently added .snapshot operation, enabling userspace to
request a snapshot of the NVM contents via DEVLINK_CMD_REGION_NEW.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonetdevsim: support taking immediate snapshot via devlink
Jacob Keller [Thu, 26 Mar 2020 18:37:17 +0000 (11:37 -0700)]
netdevsim: support taking immediate snapshot via devlink

Implement the .snapshot region operation for the dummy data region. This
enables a region snapshot to be taken upon request via the new
DEVLINK_CMD_REGION_SNAPSHOT command.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: implement DEVLINK_CMD_REGION_NEW
Jacob Keller [Thu, 26 Mar 2020 18:37:16 +0000 (11:37 -0700)]
devlink: implement DEVLINK_CMD_REGION_NEW

Implement support for the DEVLINK_CMD_REGION_NEW command for creating
snapshots. This new command parallels the existing
DEVLINK_CMD_REGION_DEL.

In order for DEVLINK_CMD_REGION_NEW to work for a region, the new
".snapshot" operation must be implemented in the region's ops structure.

The desired snapshot id must be provided. This helps avoid confusion on
the purpose of DEVLINK_CMD_REGION_NEW, and keeps the API simpler.

The requested id will be inserted into the xarray tracking the number of
snapshots using each id. If this id is already used by another snapshot
on any region, an error will be returned.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: track snapshot id usage count using an xarray
Jacob Keller [Thu, 26 Mar 2020 18:37:15 +0000 (11:37 -0700)]
devlink: track snapshot id usage count using an xarray

Each snapshot created for a devlink region must have an id. These ids
are supposed to be unique per "event" that caused the snapshot to be
created. Drivers call devlink_region_snapshot_id_get to obtain a new id
to use for a new event trigger. The id values are tracked per devlink,
so that the same id number can be used if a triggering event creates
multiple snapshots on different regions.

There is no mechanism for snapshot ids to ever be reused. Introduce an
xarray to store the count of how many snapshots are using a given id,
replacing the snapshot_id field previously used for picking the next id.

The devlink_region_snapshot_id_get() function will use xa_alloc to
insert an initial value of 1 value at an available slot between 0 and
U32_MAX.

The new __devlink_snapshot_id_increment() and
__devlink_snapshot_id_decrement() functions will be used to track how
many snapshots currently use an id.

Drivers must now call devlink_snapshot_id_put() in order to release
their reference of the snapshot id after adding region snapshots.

By tracking the total number of snapshots using a given id, it is
possible for the decrement() function to erase the id from the xarray
when it is not in use.

With this method, a snapshot id can become reused again once all
snapshots that referred to it have been deleted via
DEVLINK_CMD_REGION_DEL, and the driver has finished adding snapshots.

This work also paves the way to introduce a mechanism for userspace to
request a snapshot.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: report error once U32_MAX snapshot ids have been used
Jacob Keller [Thu, 26 Mar 2020 18:37:14 +0000 (11:37 -0700)]
devlink: report error once U32_MAX snapshot ids have been used

The devlink_snapshot_id_get() function returns a snapshot id. The
snapshot id is a u32, so there is no way to indicate an error code.

A future change is going to possibly add additional cases where this
function could fail. Refactor the function to return the snapshot id in
an argument, so that it can return zero or an error value.

This ensures that snapshot ids cannot be confused with error values, and
aids in the future refactor of snapshot id allocation management.

Because there is no current way to release previously used snapshot ids,
add a simple check ensuring that an error is reported in case the
snapshot_id would over flow.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: extract snapshot id allocation to helper function
Jacob Keller [Thu, 26 Mar 2020 18:37:13 +0000 (11:37 -0700)]
devlink: extract snapshot id allocation to helper function

A future change is going to implement a new devlink command to request
a snapshot on demand. As part of this, the logic for handling the
snapshot ids will be refactored. To simplify the snapshot id allocation
function, move it to a separate function prefixed by `__`. This helper
function will assume the lock is held.

While no other callers will exist, it simplifies refactoring the logic
because there is no need to complicate the function with gotos to handle
unlocking on failure.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: use -ENOSPC to indicate no more room for snapshots
Jacob Keller [Thu, 26 Mar 2020 18:37:12 +0000 (11:37 -0700)]
devlink: use -ENOSPC to indicate no more room for snapshots

The devlink_region_snapshot_create function returns -ENOMEM when the
maximum number of snapshots has been reached. This is confusing because
it is not an issue of being out of memory. Change this to use -ENOSPC
instead.

Reported-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: add function to take snapshot while locked
Jacob Keller [Thu, 26 Mar 2020 18:37:11 +0000 (11:37 -0700)]
devlink: add function to take snapshot while locked

A future change is going to add a new devlink command to request
a snapshot on demand. This function will want to call the
devlink_region_snapshot_create function while already holding the
devlink instance lock.

Extract the logic of this function into a static function prefixed by
`__` to indicate that it is an internal helper function. Modify the
original function to be implemented in terms of the new locked
function.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: trivial: fix tab in function documentation
Jacob Keller [Thu, 26 Mar 2020 18:37:10 +0000 (11:37 -0700)]
devlink: trivial: fix tab in function documentation

The function documentation comment for devlink_region_snapshot_create
included a literal tab character between 'future analyses' that was
difficult to spot as it happened to only display as one space wide.

Fix the comment to use a space here instead of a stray tab appearing in
the middle of a sentence.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: convert snapshot destructor callback to region op
Jacob Keller [Thu, 26 Mar 2020 18:37:09 +0000 (11:37 -0700)]
devlink: convert snapshot destructor callback to region op

It does not makes sense that two snapshots for a given region would use
different destructors. Simplify snapshot creation by adding
a .destructor op for regions.

This operation will replace the data_destructor for the snapshot
creation, and makes snapshot creation easier.

Noticed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agodevlink: prepare to support region operations
Jacob Keller [Thu, 26 Mar 2020 18:37:08 +0000 (11:37 -0700)]
devlink: prepare to support region operations

Modify the devlink region code in preparation for adding new operations
on regions.

Create a devlink_region_ops structure, and move the name pointer from
within the devlink_region structure into the ops structure (similar to
the devlink_health_reporter_ops).

This prepares the regions to enable support of additional operations in
the future such as requesting snapshots, or accessing the region
directly without a snapshot.

In order to re-use the constant strings in the mlx4 driver their
declaration must be changed to 'const char * const' to ensure the
compiler realizes that both the data and the pointer cannot change.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'veth-stats'
David S. Miller [Fri, 27 Mar 2020 02:35:13 +0000 (19:35 -0700)]
Merge branch 'veth-stats'

Lorenzo Bianconi says:

====================
veth: move ndo_xdp_xmit stats to peer veth_rq

Move ndo_xdp_xmit ethtool stats accounting to peer veth_rq.
Move XDP_TX accounting to veth_xdp_flush_bq routine.

Changes since v1:
- rename xdp_xmit[_err] counters to peer_tq_xdp_xmit[_err]
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoveth: rely on peer veth_rq for ndo_xdp_xmit accounting
Lorenzo Bianconi [Thu, 26 Mar 2020 22:10:20 +0000 (23:10 +0100)]
veth: rely on peer veth_rq for ndo_xdp_xmit accounting

Rely on 'remote' veth_rq to account ndo_xdp_xmit ethtool counters.
Move XDP_TX accounting to veth_xdp_flush_bq routine.
Remove 'rx' prefix in rx xdp ethool counters

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoveth: rely on veth_rq in veth_xdp_flush_bq signature
Lorenzo Bianconi [Thu, 26 Mar 2020 22:10:19 +0000 (23:10 +0100)]
veth: rely on veth_rq in veth_xdp_flush_bq signature

Substitute net_device point with veth_rq one in veth_xdp_flush_bq,
veth_xdp_flush and veth_xdp_tx signature. This is a preliminary patch
to account xdp_xmit counter on 'receiving' veth_rq

Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosfc: falcon: convert to use i2c_new_client_device()
Wolfram Sang [Thu, 26 Mar 2020 21:10:00 +0000 (22:10 +0100)]
sfc: falcon: convert to use i2c_new_client_device()

Move away from the deprecated API and return the shiny new ERRPTR where
useful.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoigb: convert to use i2c_new_client_device()
Wolfram Sang [Thu, 26 Mar 2020 21:09:59 +0000 (22:09 +0100)]
igb: convert to use i2c_new_client_device()

Move away from the deprecated API and return the shiny new ERRPTR where
useful.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'Implement-stats_update-callback-for-pedit-and-skbedit'
David S. Miller [Fri, 27 Mar 2020 02:20:37 +0000 (19:20 -0700)]
Merge branch 'Implement-stats_update-callback-for-pedit-and-skbedit'

Petr Machata says:

====================
Implement stats_update callback for pedit and skbedit

The stats_update callback is used for adding HW counters to the SW ones.
Both skbedit and pedit actions are actually recognized by flow_offload.h,
but do not implement these callbacks. As a consequence, the reported values
are only the SW ones, even where there is a HW counter available.

Patch #1 adds the callback to action skbedit, patch #2 adds it to action
pedit. Patch #3 tweaks an skbedit selftest with a check that would have
caught this problem.

The pedit test is not likewise tweaked, because the iproute2 pedit action
currently does not support JSON dumping. This will be addressed later.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoselftests: skbedit_priority: Test counters at the skbedit rule
Petr Machata [Thu, 26 Mar 2020 20:45:57 +0000 (22:45 +0200)]
selftests: skbedit_priority: Test counters at the skbedit rule

Currently the test checks the observable effect of skbedit priority:
queueing of packets at the correct qdisc band. It therefore misses the fact
that the counters for offloaded rules are not updated. Add an extra check
for the counter.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosched: act_pedit: Implement stats_update callback
Petr Machata [Thu, 26 Mar 2020 20:45:56 +0000 (22:45 +0200)]
sched: act_pedit: Implement stats_update callback

Implement this callback in order to get the offloaded stats added to the
kernel stats.

Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosched: act_skbedit: Implement stats_update callback
Petr Machata [Thu, 26 Mar 2020 20:45:55 +0000 (22:45 +0200)]
sched: act_skbedit: Implement stats_update callback

Implement this callback in order to get the offloaded stats added to the
kernel stats.

Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'mlxsw-Offload-TC-action-pedit-munge-dsfield'
David S. Miller [Thu, 26 Mar 2020 18:55:41 +0000 (11:55 -0700)]
Merge branch 'mlxsw-Offload-TC-action-pedit-munge-dsfield'

Ido Schimmel says:

====================
mlxsw: Offload TC action pedit munge dsfield

Petr says:

The Spectrum switches allow packet prioritization based on DSCP on ingress,
and update of DSCP on egress. This is configured through the DCB APP rules.
For some use cases, assigning a custom DSCP value based on an ACL match is
a better tool. To that end, offload FLOW_ACTION_MANGLE to permit changing
of dsfield as a whole, or DSCP and ECN values in isolation.

After fixing a commentary nit in patch #1, and mlxsw naming in patch #2,
patches #3 and #4 add the offload to mlxsw.

Patch #5 adds a forwarding selftest for pedit dsfield, applicable to SW as
well as HW datapaths. Patch #6 adds a mlxsw-specific test to verify DSCP
rewrite due to DCB APP rules is not performed on pedited packets.

The tests only cover IPv4 dsfield setting. We have tests for IPv6 as well,
but would like to postpone their contribution until the corresponding
iproute patches have been accepted.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoselftests: mlxsw: qos_dscp_router: Test no DSCP rewrite after pedit
Petr Machata [Thu, 26 Mar 2020 14:01:14 +0000 (16:01 +0200)]
selftests: mlxsw: qos_dscp_router: Test no DSCP rewrite after pedit

When DSCP is updated through an offloaded pedit action, DSCP rewrite on
egress should be disabled. Add a test that check that it is so.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoselftests: forwarding: Add a forwarding test for pedit munge dsfield
Petr Machata [Thu, 26 Mar 2020 14:01:13 +0000 (16:01 +0200)]
selftests: forwarding: Add a forwarding test for pedit munge dsfield

Add a test that runs packets with dsfield set, and test that pedit adjusts
the DSCP or ECN parts or the whole field.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: spectrum_flower: Offload FLOW_ACTION_MANGLE
Petr Machata [Thu, 26 Mar 2020 14:01:12 +0000 (16:01 +0200)]
mlxsw: spectrum_flower: Offload FLOW_ACTION_MANGLE

Offload action pedit ex munge when used with a flower classifier. Only
allow setting of DSCP, ECN, or the whole DSField in IPv4 and IPv6 packets.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: core: Add DSCP, ECN, dscp_rw to QOS_ACTION
Petr Machata [Thu, 26 Mar 2020 14:01:11 +0000 (16:01 +0200)]
mlxsw: core: Add DSCP, ECN, dscp_rw to QOS_ACTION

The QOS_ACTION is used for manipulating the QOS attributes of the packet.
Add the defines and helpers related to DSCP and ECN fields, and dscp_rw.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agomlxsw: core: Rename mlxsw_afa_qos_cmd to mlxsw_afa_qos_switch_prio_cmd
Petr Machata [Thu, 26 Mar 2020 14:01:10 +0000 (16:01 +0200)]
mlxsw: core: Rename mlxsw_afa_qos_cmd to mlxsw_afa_qos_switch_prio_cmd

The original idea was to reuse this set of actions for ECN rewrite as well,
but on second look, it's not such a great idea. These two items should each
have its own command. Rename the existing enum to make it obvious that it
belongs to switch_prio_cmd.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: flow_offload.h: Fix a comment at flow_action_entry.mangle
Petr Machata [Thu, 26 Mar 2020 14:01:09 +0000 (16:01 +0200)]
net: flow_offload.h: Fix a comment at flow_action_entry.mangle

This field references FLOW_ACTION_PACKET_EDIT. Such action does not exist
though. Instead the field is used for FLOW_ACTION_MANGLE and _ADD.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mlx5-updates-2020-03-25' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Thu, 26 Mar 2020 18:38:48 +0000 (11:38 -0700)]
Merge tag 'mlx5-updates-2020-03-25' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-03-25

1) Cleanups from Dan Carpenter and wenxu.

2) Paul and Roi, Some minor updates and fixes to E-Switch to address
issues introduced in the previous reg_c0 updates series.

3) Eli Cohen simplifies and improves flow steering matching group searches
and flow table entries version management.

4) Parav Pandit, improves devlink eswitch mode changes thread safety.
By making devlink rely on driver for thread safety and introducing mlx5
eswitch mode change protection.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoatl2: remove unused variable 'atl2_driver_string'
YueHaibing [Thu, 26 Mar 2020 03:21:50 +0000 (11:21 +0800)]
atl2: remove unused variable 'atl2_driver_string'

drivers/net/ethernet/atheros/atlx/atl2.c:40:19: warning: ‘atl2_driver_string’ defined but not used [-Wunused-const-variable=]
 static const char atl2_driver_string[] = "Atheros(R) L2 Ethernet Driver";
                   ^~~~~~~~~~~~~~~~~~

commit ea973742140b ("net/atheros: Clean atheros code from driver version")
left behind this, remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agotipc: Add a missing case of TIPC_DIRECT_MSG type
Hoang Le [Thu, 26 Mar 2020 02:50:29 +0000 (09:50 +0700)]
tipc: Add a missing case of TIPC_DIRECT_MSG type

In the commit f73b12812a3d
("tipc: improve throughput between nodes in netns"), we're missing a check
to handle TIPC_DIRECT_MSG type, it's still using old sending mechanism for
this message type. So, throughput improvement is not significant as
expected.

Besides that, when sending a large message with that type, we're also
handle wrong receiving queue, it should be enqueued in socket receiving
instead of multicast messages.

Fix this by adding the missing case for TIPC_DIRECT_MSG.

Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Reported-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5: E-switch, Protect eswitch mode changes
Parav Pandit [Wed, 18 Dec 2019 08:51:19 +0000 (02:51 -0600)]
net/mlx5: E-switch, Protect eswitch mode changes

Currently eswitch mode change is occurring from 2 different execution
contexts as below.
1. sriov sysfs enable/disable
2. devlink eswitch set commands

Both of them need to access eswitch related data structures in
synchronized manner.
Without any synchronization below race condition exist.

SR-IOV enable/disable with devlink eswitch mode change:

          cpu-0                         cpu-1
          -----                         -----
mlx5_device_disable_sriov()        mlx5_devlink_eswitch_mode_set()
  mlx5_eswitch_disable()             esw_offloads_stop()
    esw_offloads_disable()             mlx5_eswitch_disable()
                                         esw_offloads_disable()

Hence, they are synchronized using a new mode_lock.
eswitch's state_lock is not used as it can lead to a deadlock scenario
below and state_lock is only for vport and fdb exclusive access.

ip link set vf <param>
  netlink rcv_msg() - Lock A
    rtnl_lock
    vfinfo()
      esw->state_lock() - Lock B
devlink eswitch_set
   devlink_mutex
     esw->state_lock() - Lock B
       attach_netdev()
         register_netdev()
           rtnl_lock - Lock A

Alternatives considered:
1. Acquiring rtnl lock before taking esw->state_lock to follow similar
locking sequence as ip link flow during eswitch mode set.
rtnl lock is not good idea for two reasons.
(a) Holding rtnl lock for several hundred device commands is not good
    idea.
(b) It leads to below and more similar deadlocks.

devlink eswitch_set
   devlink_mutex
     rtnl_lock - Lock A
       esw->state_lock() - Lock B
         eswitch_disable()
           reload()
             ib_register_device()
               ib_cache_setup_one()
                 rtnl_lock()

2. Exporting devlink lock may lead to undesired use of it in vendor
driver(s) in future.

3. Unloading representors outside of the mode_lock requires
serialization with other process trying to enable the eswitch.

4. Differing the representors life cycle to a different workqueue
requires synchronization with func_change_handler workqueue.

Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-switch, Extend eswitch enable to handle num_vfs change
Parav Pandit [Wed, 18 Dec 2019 10:58:58 +0000 (04:58 -0600)]
net/mlx5: E-switch, Extend eswitch enable to handle num_vfs change

Subsequent patch protects eswitch mode changes across sriov and devlink
interfaces. It is desirable for eswitch to provide thread safe eswitch
enable and disable APIs.
Hence, extend eswitch enable API to optionally update num_vfs when
requested.

In subsequent patch, eswitch num_vfs are updated after all the eswitch
users eswitch drops its reference count.

Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Split eswitch mode check to different helper function
Parav Pandit [Sat, 14 Dec 2019 10:09:04 +0000 (04:09 -0600)]
net/mlx5: Split eswitch mode check to different helper function

In order to check eswitch state under a lock, prepare code to split
capability check and eswitch state check into two helper functions.

Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agodevlink: Rely on driver eswitch thread safety instead of devlink
Parav Pandit [Mon, 24 Feb 2020 01:06:56 +0000 (19:06 -0600)]
devlink: Rely on driver eswitch thread safety instead of devlink

devlink_nl_cmd_eswitch_set_doit() doesn't hold devlink->lock mutex while
invoking driver callback. This is likely due to eswitch mode setting
involves adding/remove devlink ports, health reporters or
other devlink objects for a devlink device.

So it is driver responsiblity to ensure thread safe eswitch state
transition happening via either sriov legacy enablement or via devlink
eswitch set callback.

Therefore, get() callback should also be invoked without holding
devlink->lock mutex.
Vendor driver can use same internal lock which it uses during eswitch
mode set() callback.
This makes get() and set() implimentation symmetric in devlink core and
in vendor drivers.

Hence, remove holding devlink->lock mutex during eswitch get() callback.

Failing to do so results into below deadlock scenario when mlx5_core
driver is improved to handle eswitch mode set critical section invoked
by devlink and sriov sysfs interface in subsequent patch.

devlink_nl_cmd_eswitch_set_doit()
   mlx5_eswitch_mode_set()
     mutex_lock(esw->mode_lock) <- Lock A
     [...]
     register_devlink_port()
       mutex_lock(&devlink->lock); <- lock B

mutex_lock(&devlink->lock); <- lock B
devlink_nl_cmd_eswitch_get_doit()
   mlx5_eswitch_mode_get()
   mutex_lock(esw->mode_lock) <- Lock A

In subsequent patch, mlx5_core driver uses its internal lock during
get() and set() eswitch callbacks.

Other drivers have been inspected which returns either constant during
get operations or reads the value from already allocated structure.
Hence it is safe to remove the lock in get( ) callback and let vendor
driver handle it.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Simplify mlx5_unload_one() and its callers
Parav Pandit [Mon, 9 Mar 2020 04:17:37 +0000 (23:17 -0500)]
net/mlx5: Simplify mlx5_unload_one() and its callers

mlx5_unload_one() always returns 0.
Simplify callers of mlx5_unload_one() and remove the dead code.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Simplify mlx5_register_device to return void
Parav Pandit [Mon, 9 Mar 2020 03:19:51 +0000 (22:19 -0500)]
net/mlx5: Simplify mlx5_register_device to return void

mlx5_register_device() doesn't check for any error and always returns 0.
Simplify mlx5_register_device() to return void and its caller, remove
dead code related to it.

Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Avoid group version scan when not necessary
Eli Cohen [Wed, 4 Mar 2020 08:32:56 +0000 (10:32 +0200)]
net/mlx5: Avoid group version scan when not necessary

Group version is used when modifying a rule is allowed
(FLOW_ACT_NO_APPEND is clear) to detect a case where the rule was found
but while the groups where unlocked a new FTE was added. In this case,
the added FTE could be one with identical match value so we need to
attempt again with group lock held.

Change the code so version is retrieved only when FLOW_ACT_NO_APPEND is
cleared. As result, later compare can also be avoided if FLOW_ACT_NO_APPEND
is cleared.

Also improve comments text.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Avoid incrementing FTE version
Eli Cohen [Wed, 4 Mar 2020 08:32:56 +0000 (10:32 +0200)]
net/mlx5: Avoid incrementing FTE version

FTE version is not used anywhere in the code so avoid incrementing it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Fix group version management
Eli Cohen [Wed, 4 Mar 2020 08:32:56 +0000 (10:32 +0200)]
net/mlx5: Fix group version management

When adding a rule to a flow group we need increment the version of the
group. Current code fails to do that and as a result, when trying to add
a rule, we will fail to discover a case where an FTE with the same match
value was added while we scanned the groups of the same match criteria,
thus we may try to add an FTE that was already added.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Simplify matching group searches
Eli Cohen [Wed, 26 Feb 2020 13:03:16 +0000 (15:03 +0200)]
net/mlx5: Simplify matching group searches

Instead of using two different structs for searching groups with the
same match, use a single struct and thus simplify the code, make it more
readable and smaller size which means less code cache misses.

         text    data     bss     dec     hex
before: 35524    2744       0   38268    957c
after:  35038    2744       0   37782    9396

When testing add 70000 rules, delete all the rules, and repeat three
times taking the average, we get (time in seconds):

Before the change: insert 16.80, delete 11.02
After the change:  insert 16.55, delete 10.95

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-Switch, Use correct type for chain, prio and level values
Roi Dayan [Mon, 23 Mar 2020 10:14:58 +0000 (12:14 +0200)]
net/mlx5: E-Switch, Use correct type for chain, prio and level values

The correct type is u32.

Fixes: d18296ffd9cc ("net/mlx5: E-Switch, Introduce global tables")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-Switch, free flow_group_in after creating the restore table
Roi Dayan [Thu, 19 Mar 2020 15:48:18 +0000 (17:48 +0200)]
net/mlx5: E-Switch, free flow_group_in after creating the restore table

We allocate a temporary memory but forget to free it.

Fixes: 11b717d61526 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-Switch, Enable chains only if regs loopback is enabled
Paul Blakey [Wed, 18 Mar 2020 09:43:06 +0000 (11:43 +0200)]
net/mlx5: E-Switch, Enable chains only if regs loopback is enabled

Register c0 loopback is needed to fully support chains and prios.

Enable chains and prio only if loopback (of reg c1 which came together
with c0), is enabled. To be able to check that, move enabling of loopback
before eswitch chains init.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: E-Switch, Enable restore table only if reg_c1 is supported
Paul Blakey [Wed, 18 Mar 2020 08:55:12 +0000 (10:55 +0200)]
net/mlx5: E-Switch, Enable restore table only if reg_c1 is supported

Reg c0/c1 matching, rewrite of regs c0/c1, and copy header of regs c1,B
is needed for the restore table to function, might not be supported by
firmware, and creation of the restore table or the copy header will
fail.

Check reg_c1 loopback support, as firmware which supports this,
should have all of the above.

Fixes: 11b717d61526 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: remove duplicated check chain_index in mlx5e_rep_setup_ft_cb
wenxu [Wed, 18 Mar 2020 09:05:43 +0000 (17:05 +0800)]
net/mlx5e: remove duplicated check chain_index in mlx5e_rep_setup_ft_cb

The function mlx5e_rep_setup_ft_cb check chain_index is zero twice.

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix actions_match_supported() return
Dan Carpenter [Fri, 20 Mar 2020 13:23:05 +0000 (16:23 +0300)]
net/mlx5e: Fix actions_match_supported() return

The actions_match_supported() function returns a bool, true for success
and false for failure.  This error path is returning a negative which
is cast to true but it should return false.

Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
David S. Miller [Thu, 26 Mar 2020 01:58:11 +0000 (18:58 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Overlapping header include additions in macsec.c

A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c

Overlapping test additions in selftests Makefile

Overlapping PCI ID table adjustments in iwlwifi driver.

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Wed, 25 Mar 2020 20:58:05 +0000 (13:58 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

 1) Fix deadlock in bpf_send_signal() from Yonghong Song.

 2) Fix off by one in kTLS offload of mlx5, from Tariq Toukan.

 3) Add missing locking in iwlwifi mvm code, from Avraham Stern.

 4) Fix MSG_WAITALL handling in rxrpc, from David Howells.

 5) Need to hold RTNL mutex in tcindex_partial_destroy_work(), from Cong
    Wang.

 6) Fix producer race condition in AF_PACKET, from Willem de Bruijn.

 7) cls_route removes the wrong filter during change operations, from
    Cong Wang.

 8) Reject unrecognized request flags in ethtool netlink code, from
    Michal Kubecek.

 9) Need to keep MAC in reset until PHY is up in bcmgenet driver, from
    Doug Berger.

10) Don't leak ct zone template in act_ct during replace, from Paul
    Blakey.

11) Fix flushing of offloaded netfilter flowtable flows, also from Paul
    Blakey.

12) Fix throughput drop during tx backpressure in cxgb4, from Rahul
    Lakkireddy.

13) Don't let a non-NULL skb->dev leave the TCP stack, from Eric
    Dumazet.

14) TCP_QUEUE_SEQ socket option has to update tp->copied_seq as well,
    also from Eric Dumazet.

15) Restrict macsec to ethernet devices, from Willem de Bruijn.

16) Fix reference leak in some ethtool *_SET handlers, from Michal
    Kubecek.

17) Fix accidental disabling of MSI for some r8169 chips, from Heiner
    Kallweit.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
  net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
  net: ena: Add PCI shutdown handler to allow safe kexec
  selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED
  selftests/net: add missing tests to Makefile
  r8169: re-enable MSI on RTL8168c
  net: phy: mdio-bcm-unimac: Fix clock handling
  cxgb4/ptp: pass the sign of offset delta in FW CMD
  net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
  net: cbs: Fix software cbs to consider packet sending time
  net/mlx5e: Do not recover from a non-fatal syndrome
  net/mlx5e: Fix ICOSQ recovery flow with Striding RQ
  net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset
  net/mlx5e: Enhance ICOSQ WQE info fields
  net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure
  selftests: netfilter: add nfqueue test case
  netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
  netfilter: nft_fwd_netdev: validate family and chain type
  netfilter: nft_set_rbtree: Detect partial overlaps on insertion
  netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
  netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
  ...