linux-2.6-microblaze.git
3 years agoqede: refactor XDP Tx processing
Alexander Lobakin [Wed, 22 Jul 2020 22:10:44 +0000 (01:10 +0300)]
qede: refactor XDP Tx processing

Current XDP Tx logic is suboptimal and can't be reused for XDP_REDIRECT
path.
Make qede_xdp_{tx_int,xmit}() more universal and effective in general to
allow future expanding.

Misc: use unlikely() hints where appropriate and replace "fallthrough"
comments with pseudo-keywords.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqede: reformat net_device_ops declarations
Alexander Lobakin [Wed, 22 Jul 2020 22:10:43 +0000 (01:10 +0300)]
qede: reformat net_device_ops declarations

Correct the indentation of net_device_ops declarations for fancier look.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqede: reformat several structures in "qede.h"
Alexander Lobakin [Wed, 22 Jul 2020 22:10:42 +0000 (01:10 +0300)]
qede: reformat several structures in "qede.h"

Make the file more readable and easier for adding new fields.

Misc: use IFNAMSIZ and netdev_name() instead of sizeof_field()
and direct net_device::name dereferencing.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: introduce qed_chain_get_elem_used{,u32}()
Alexander Lobakin [Wed, 22 Jul 2020 22:10:41 +0000 (01:10 +0300)]
qed: introduce qed_chain_get_elem_used{,u32}()

Add reverse-variants of qed_chain_get_elem_left{,u32}() to be able to
know current chain occupation. They will be used in the upcoming qede
XDP_REDIRECT code.
They share most of the logics with the mentioned ones, so were reused
to collapse the latters.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: optimize common chain accessors
Alexander Lobakin [Wed, 22 Jul 2020 22:10:40 +0000 (01:10 +0300)]
qed: optimize common chain accessors

Constify chain pointers and refactor qed_chain_get_elem_left{,u32}() a bit.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: add support for different page sizes for chains
Alexander Lobakin [Wed, 22 Jul 2020 22:10:39 +0000 (01:10 +0300)]
qed: add support for different page sizes for chains

Extend current infrastructure to store chain page size in a struct
and use it in all functions instead of fixed QED_CHAIN_PAGE_SIZE.
Its value remains the default one, but can be overridden in
qed_chain_init_params before chain allocation.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: simplify chain allocation with init params struct
Alexander Lobakin [Wed, 22 Jul 2020 22:10:38 +0000 (01:10 +0300)]
qed: simplify chain allocation with init params struct

To simplify qed_chain_alloc() prototype and call sites, introduce struct
qed_chain_init_params to specify chain params, and pass a pointer to
filled struct to the actual qed_chain_alloc() instead of a long list
of separate arguments.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: simplify initialization of the chains with an external PBL
Alexander Lobakin [Wed, 22 Jul 2020 22:10:37 +0000 (01:10 +0300)]
qed: simplify initialization of the chains with an external PBL

Fill PBL table parameters for chains with an external PBL data earlier on
qed_chain_init_params() rather than on allocation itself. This simplifies
allocation code and allows to extend struct ext_pbl for other chain types.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: move chain initialization inlines next to allocation functions
Alexander Lobakin [Wed, 22 Jul 2020 22:10:36 +0000 (01:10 +0300)]
qed: move chain initialization inlines next to allocation functions

qed_chain_init*() are used in one file/place on "cold" path only, so they
can be uninlined and moved next to the call sites.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: sanitize PBL chains allocation
Alexander Lobakin [Wed, 22 Jul 2020 22:10:35 +0000 (01:10 +0300)]
qed: sanitize PBL chains allocation

PBL chain elements are actually DMA addresses stored in __le64, but
currently their size is hardcoded to 8, and DMA addresses are assigned
via cast to variable-sized dma_addr_t without any bitwise conversions.
Change the type of pbl_virt array to match the actual one, add a new
field to store the size of allocated DMA memory and sanitize elements
assignment.

Misc: give more logic names to the members of qed_chain::pbl_sp embedded
struct.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: prevent possible double-frees of the chains
Alexander Lobakin [Wed, 22 Jul 2020 22:10:34 +0000 (01:10 +0300)]
qed: prevent possible double-frees of the chains

Zero-initialize chain on qed_chain_free(), so it couldn't be freed
twice and provoke undefined behaviour.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: move chain methods to a separate file
Alexander Lobakin [Wed, 22 Jul 2020 22:10:33 +0000 (01:10 +0300)]
qed: move chain methods to a separate file

Move chain allocation/freeing functions to a new file to not mix it with
hardware-related code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: reformat Makefile
Alexander Lobakin [Wed, 22 Jul 2020 22:10:32 +0000 (01:10 +0300)]
qed: reformat Makefile

List one entry per line and sort them alphabetically to simplify the
addition of the new ones.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: reformat "qed_chain.h" a bit
Alexander Lobakin [Wed, 22 Jul 2020 22:10:31 +0000 (01:10 +0300)]
qed: reformat "qed_chain.h" a bit

Reformat structs and macros definitions a bit prior to making functional
changes.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qed_hsi.h: Avoid the use of one-element array
Gustavo A. R. Silva [Wed, 22 Jul 2020 18:58:52 +0000 (13:58 -0500)]
net: qed_hsi.h: Avoid the use of one-element array

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type '__le32 reserved1'[2], once it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/qed_hsi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobna: bfi.h: Avoid the use of one-element array
Gustavo A. R. Silva [Wed, 22 Jul 2020 18:50:24 +0000 (13:50 -0500)]
bna: bfi.h: Avoid the use of one-element array

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u8 rsvd'[2], once it seems this is
just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/bfi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotg3: Avoid the use of one-element array
Gustavo A. R. Silva [Wed, 22 Jul 2020 18:43:58 +0000 (13:43 -0500)]
tg3: Avoid the use of one-element array

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u32 reserved2'[2], once it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/tg3-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: fix memory leak of object 'lid'
Colin Ian King [Wed, 22 Jul 2020 17:40:03 +0000 (18:40 +0100)]
ionic: fix memory leak of object 'lid'

Currently when netdev fails to allocate the error return path
fails to free the allocated object 'lid'.  Fix this by setting
err to the return error code and jumping to a new label that
performs the kfree of lid before returning.

Addresses-Coverity: ("Resource leak")
Fixes: 4b03b27349c0 ("ionic: get MTU from lif identity")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'l2tp-cleanup-checkpatch-pl-warnings'
David S. Miller [Thu, 23 Jul 2020 01:08:39 +0000 (18:08 -0700)]
Merge branch 'l2tp-cleanup-checkpatch-pl-warnings'

Tom Parkin says:

====================
l2tp: cleanup checkpatch.pl warnings

l2tp hasn't been kept up to date with the static analysis checks offered
by checkpatch.pl.

This series addresses a range of minor issues which don't involve large
changes to code structure.  The changes include:

 * tweaks to use of whitespace, comment style, line breaks,
   and indentation

 * two minor modifications to code to use a function or macro suggested
   by checkpatch

v1 -> v2

 * combine related patches (patches fixing whitespace issues, patches
   addressing comment style)

 * respin the single large patchset into a multiple smaller series for
   easier review
====================

Reviewed-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: avoid precidence issues in L2TP_SKB_CB macro
Tom Parkin [Wed, 22 Jul 2020 16:32:14 +0000 (17:32 +0100)]
l2tp: avoid precidence issues in L2TP_SKB_CB macro

checkpatch warned about the L2TP_SKB_CB macro's use of its argument: add
braces to avoid the problem.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: line-break long function prototypes
Tom Parkin [Wed, 22 Jul 2020 16:32:13 +0000 (17:32 +0100)]
l2tp: line-break long function prototypes

In l2tp_core.c both l2tp_tunnel_create and l2tp_session_create take
quite a number of arguments and have a correspondingly long prototype.

This is both quite difficult to scan visually, and triggers checkpatch
warnings.

Add a line break to make these function prototypes more readable.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: prefer seq_puts for unformatted output
Tom Parkin [Wed, 22 Jul 2020 16:32:12 +0000 (17:32 +0100)]
l2tp: prefer seq_puts for unformatted output

checkpatch warns about use of seq_printf where seq_puts would do.

Modify l2tp_debugfs accordingly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: prefer using BIT macro
Tom Parkin [Wed, 22 Jul 2020 16:32:11 +0000 (17:32 +0100)]
l2tp: prefer using BIT macro

Use BIT(x) rather than (1<<x), reported by checkpatch.pl.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: add identifier name in function pointer prototype
Tom Parkin [Wed, 22 Jul 2020 16:32:10 +0000 (17:32 +0100)]
l2tp: add identifier name in function pointer prototype

Reported by checkpatch:

        "WARNING: function definition argument 'struct sock *'
         should also have an identifier name"

Add an identifier name to help document the prototype.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup suspect code indent
Tom Parkin [Wed, 22 Jul 2020 16:32:09 +0000 (17:32 +0100)]
l2tp: cleanup suspect code indent

l2tp_core has conditionally compiled code in l2tp_xmit_skb for IPv6
support.  The structure of this code triggered a checkpatch warning
due to incorrect indentation.

Fix up the indentation to address the checkpatch warning.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup wonky alignment of line-broken function calls
Tom Parkin [Wed, 22 Jul 2020 16:32:08 +0000 (17:32 +0100)]
l2tp: cleanup wonky alignment of line-broken function calls

Arguments should be aligned with the function call open parenthesis as
per checkpatch.  Tweak some function calls which were not aligned
correctly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup difficult-to-read line breaks
Tom Parkin [Wed, 22 Jul 2020 16:32:07 +0000 (17:32 +0100)]
l2tp: cleanup difficult-to-read line breaks

Some l2tp code had line breaks which made the code more difficult to
read.  These were originally motivated by the 80-character line width
coding guidelines, but were actually a negative from the perspective of
trying to follow the code.

Remove these linebreaks for clearer code, even if we do exceed 80
characters in width in some places.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup comments
Tom Parkin [Wed, 22 Jul 2020 16:32:06 +0000 (17:32 +0100)]
l2tp: cleanup comments

Modify some l2tp comments to better adhere to kernel coding style, as
reported by checkpatch.pl.

Add descriptive comments for the l2tp per-net spinlocks to document
their use.

Fix an incorrect comment in l2tp_recv_common:

RFC2661 section 5.4 states that:

"The LNS controls enabling and disabling of sequence numbers by sending a
data message with or without sequence numbers present at any time during
the life of a session."

l2tp handles this correctly in l2tp_recv_common, but the comment around
the code was incorrect and confusing.  Fix up the comment accordingly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup whitespace use
Tom Parkin [Wed, 22 Jul 2020 16:32:05 +0000 (17:32 +0100)]
l2tp: cleanup whitespace use

Fix up various whitespace issues as reported by checkpatch.pl:

 * remove spaces around operators where appropriate,
 * add missing blank lines following declarations,
 * remove multiple blank lines, or trailing blank lines at the end of
   functions.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Always use user_ptr[0] for devlink and simplify post_doit
Parav Pandit [Wed, 22 Jul 2020 15:57:11 +0000 (18:57 +0300)]
devlink: Always use user_ptr[0] for devlink and simplify post_doit

Currently devlink instance is searched on all doit() operations.
But it is optionally stored into user_ptr[0]. This requires
rediscovering devlink again doing post_doit().

Few devlink commands related to port shared buffers needs 3 pointers
(devlink, devlink_port, and devlink_sb) while executing doit commands.
Though devlink pointer can be derived from the devlink_port during
post_doit() operation when doit() callback has acquired devlink
instance lock, relying on such scheme to access devlik pointer makes
code very fragile.

Hence, to avoid ambiguity in post_doit() and to avoid searching
devlink instance again, simplify code by always storing devlink
instance in user_ptr[0] and derive devlink_sb pointer in their
respective callback routines.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agohv_netvsc: add support for vlans in AF_PACKET mode
Sriram Krishnan [Wed, 22 Jul 2020 15:38:44 +0000 (21:08 +0530)]
hv_netvsc: add support for vlans in AF_PACKET mode

Vlan tagged packets are getting dropped when used with DPDK that uses
the AF_PACKET interface on a hyperV guest.

The packet layer uses the tpacket interface to communicate the vlans
information to the upper layers. On Rx path, these drivers can read the
vlan info from the tpacket header but on the Tx path, this information
is still within the packet frame and requires the paravirtual drivers to
push this back into the NDIS header which is then used by the host OS to
form the packet.

This transition from the packet frame to NDIS header is currently missing
hence causing the host OS to drop the all vlan tagged packets sent by
the drivers that use AF_PACKET (ETH_P_ALL) such as DPDK.

Here is an overview of the changes in the vlan header in the packet path:

The RX path (userspace handles everything):
  1. RX VLAN packet is stripped by HOST OS and placed in NDIS header
  2. Guest Kernel RX hv_netvsc packets and moves VLAN info from NDIS
     header into kernel SKB
  3. Kernel shares packets with user space application with PACKET_MMAP.
     The SKB VLAN info is copied to tpacket layer and indication set
     TP_STATUS_VLAN_VALID.
  4. The user space application will re-insert the VLAN info into the frame

The TX path:
  1. The user space application has the VLAN info in the frame.
  2. Guest kernel gets packets from the application with PACKET_MMAP.
  3. The kernel later sends the frame to the hv_netvsc driver. The only way
     to send VLANs is when the SKB is setup & the VLAN is stripped from the
     frame.
  4. TX VLAN is re-inserted by HOST OS based on the NDIS header. If it sees
     a VLAN in the frame the packet is dropped.

Cc: xe-linux-external@cisco.com
Cc: Sriram Krishnan <srirakr2@cisco.com>
Signed-off-by: Sriram Krishnan <srirakr2@cisco.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: zero token hash at creation time.
Paolo Abeni [Wed, 22 Jul 2020 15:20:50 +0000 (17:20 +0200)]
mptcp: zero token hash at creation time.

Otherwise the 'chain_len' filed will carry random values,
some token creation calls will fail due to excessive chain
length, causing unexpected fallback to TCP.

Fixes: 2c5ebd001d4f ("mptcp: refactor token container")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agolan743x: remove redundant initialization of variable current_head_index
Colin Ian King [Wed, 22 Jul 2020 15:12:21 +0000 (16:12 +0100)]
lan743x: remove redundant initialization of variable current_head_index

The variable current_head_index is being initialized with a value that
is never read and it is being updated later with a new value.  Replace
the initialization of -1 with the latter assignment.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Remove the imdio bus on PF probe bailout
Claudiu Manoil [Wed, 22 Jul 2020 12:38:48 +0000 (15:38 +0300)]
enetc: Remove the imdio bus on PF probe bailout

enetc_imdio_remove() is missing from the enetc_pf_probe()
bailout path. Not surprisingly because enetc_setup_serdes()
is registering the imdio bus for internal purposes, and it's
not obvious that enetc_imdio_remove() currently performs the
teardown of enetc_setup_serdes().
To fix this, define enetc_teardown_serdes() to wrap
enetc_imdio_remove() (improve code maintenance) and call it
on bailout and remove paths.

Fixes: 975d183ef0ca ("net: enetc: Initialize SerDes for SGMII and USXGMII protocols")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qed: Remove unneeded cast from memory allocation
Wang Hai [Wed, 22 Jul 2020 02:10:27 +0000 (10:10 +0800)]
net: qed: Remove unneeded cast from memory allocation

Remove casting the values returned by memory allocation function.

Coccinelle emits WARNING: casting value returned by memory allocation
unction to (struct roce_destroy_qp_req_output_params *) is useless.

This issue was detected by using the Coccinelle software.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: fix check in get_phy_c45_ids
Vladimir Oltean [Mon, 20 Jul 2020 17:26:54 +0000 (20:26 +0300)]
net: phy: fix check in get_phy_c45_ids

After the patch below, the iteration through the available MMDs is
completely short-circuited, and devs_in_pkg remains set to the initial
value of zero.

Due to devs_in_pkg being zero, the rest of get_phy_c45_ids() is
short-circuited too: the following loop never reaches below this point
either (it executes "continue" for every device in package, failing to
retrieve PHY ID for any of them):

/* Now probe Device Identifiers for each device present. */
for (i = 1; i < num_ids; i++) {
if (!(devs_in_pkg & (1 << i)))
continue;

So c45_ids->device_ids remains populated with zeroes. This causes an
Aquantia AQR412 PHY (same as any C45 PHY would, in fact) to be probed by
the Generic PHY driver.

The issue seems to be a case of submitting partially committed work (and
therefore testing something other than was submitted).

The intention of the patch was to delay exiting the loop until one more
condition is reached (the devs_in_pkg read from hardware is either 0, OR
mostly f's). So fix the patch to reflect that.

Tested with traffic on a LS1028A-QDS, the PHY is now probed correctly
using the Aquantia driver. The devs_in_pkg bit field is set to
0xe000009a, and the MMDs that are present have the following IDs:

[    5.600772] libphy: get_phy_c45_ids: device_ids[1]=0x3a1b662
[    5.618781] libphy: get_phy_c45_ids: device_ids[3]=0x3a1b662
[    5.630797] libphy: get_phy_c45_ids: device_ids[4]=0x3a1b662
[    5.654535] libphy: get_phy_c45_ids: device_ids[7]=0x3a1b662
[    5.791723] libphy: get_phy_c45_ids: device_ids[29]=0x3a1b662
[    5.804050] libphy: get_phy_c45_ids: device_ids[30]=0x3a1b662
[    5.816375] libphy: get_phy_c45_ids: device_ids[31]=0x0

[    7.690237] mscc_felix 0000:00:00.5: PHY [0.5:00] driver [Aquantia AQR412] (irq=POLL)
[    7.704739] mscc_felix 0000:00:00.5: PHY [0.5:01] driver [Aquantia AQR412] (irq=POLL)
[    7.718918] mscc_felix 0000:00:00.5: PHY [0.5:02] driver [Aquantia AQR412] (irq=POLL)
[    7.733044] mscc_felix 0000:00:00.5: PHY [0.5:03] driver [Aquantia AQR412] (irq=POLL)

Fixes: bba238ed037c ("net: phy: continue searching for C45 MMDs even if first returned ffff:ffff")
Reported-by: Colin King <colin.king@canonical.com>
Reported-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dccp: Add SIOCOUTQ IOCTL support (send buffer fill)
Richard Sailer [Mon, 20 Jul 2020 16:06:14 +0000 (18:06 +0200)]
net: dccp: Add SIOCOUTQ IOCTL support (send buffer fill)

This adds support for the SIOCOUTQ IOCTL to get the send buffer fill
of a DCCP socket, like UDP and TCP sockets already have.

Regarding the used data field: DCCP uses per packet sequence numbers,
not per byte, so sequence numbers can't be used like in TCP. sk_wmem_queued
is not used by DCCP and always 0, even in test on highly congested paths.
Therefore this uses sk_wmem_alloc like in UDP.

Signed-off-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'Add-DSA-yaml-binding'
David S. Miller [Wed, 22 Jul 2020 23:56:44 +0000 (16:56 -0700)]
Merge branch 'Add-DSA-yaml-binding'

Kurt Kanzenbach says:

====================
Add DSA yaml binding

as discussed [1] [2] it makes sense to add a DSA yaml binding. This is the
second version and contains now two ways of specifying the switch ports: Either
by "ports" or by "ethernet-ports". That is why the third patch also adjusts the
DSA core for it.

Tested in combination with the hellcreek.yaml file.

Changes since v1:

 * Use select to not match unrelated switches
 * Allow ethernet-port(s)
 * List ethernet-controller properties
 * Include better description
 * Let dsa.txt refer to dsa.yaml

Thanks,
Kurt

[1] - https://lkml.kernel.org/netdev/449f0a03-a91d-ae82-b31f-59dfd1457ec5@gmail.com/
[2] - https://lkml.kernel.org/netdev/20200710090618.28945-1-kurt@linutronix.de/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: of: Allow ethernet-ports as encapsulating node
Kurt Kanzenbach [Mon, 20 Jul 2020 12:49:39 +0000 (14:49 +0200)]
net: dsa: of: Allow ethernet-ports as encapsulating node

Due to unified Ethernet Switch Device Tree Bindings allow for ethernet-ports as
encapsulating node as well.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: dsa: Let dsa.txt refer to dsa.yaml
Kurt Kanzenbach [Mon, 20 Jul 2020 12:49:38 +0000 (14:49 +0200)]
dt-bindings: net: dsa: Let dsa.txt refer to dsa.yaml

The DSA bindings have been converted to YAML. Therefore, the old text style
documentation should refer to that one.

The text file can be removed completely once all the existing DSA switch
bindings have been converted as well.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: dsa: Add DSA yaml binding
Kurt Kanzenbach [Mon, 20 Jul 2020 12:49:37 +0000 (14:49 +0200)]
dt-bindings: net: dsa: Add DSA yaml binding

For future DSA drivers it makes sense to add a generic DSA yaml binding which
can be used then. This was created using the properties from dsa.txt. It
includes the ports and the dsa,member property.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mscc: ocelot: fix non-initialized CPU port on VSC7514
Vladimir Oltean [Wed, 22 Jul 2020 08:08:57 +0000 (11:08 +0300)]
net: mscc: ocelot: fix non-initialized CPU port on VSC7514

The VSC7514 is marketed as a 10-port switch, however it has 11 physical
ports (0->10) in the block diagram:
https://www.microsemi.com/product-directory/ethernet-switches/3992-vsc7514
(also in the device tree at arch/mips/boot/dts/mscc/ocelot.dtsi)

Additionally, by architecture it has one more entry in the analyzer
block, situated right after the physical ports, for the CPU port module.
This is not a physical port, it only represents a channel for frame
injection and extraction. That entry for the CPU port is at index 11 in
the analyzer.

When the register groups for QSYS_SWITCH_PORT_MODE, SYS_PORT_MODE and
SYS_PAUSE_CFG are declared to be replicated 11 times, the 11th entry in
the array of regfields is not initialized, so the CPU port module is not
initialized either.

The documentation of QSYS_SWITCH_PORT_MODE for VSC7514 also says that
this register group is replicated 12 times, so this patch is simply
reflecting that and not introducing any further inconsistency.

Fixes: 886e1387c73d ("net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields")
Fixes: 541132f0961a ("net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield")
Reported-by: Bryan Whitehead <bryan.whitehead@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: explicitly include <linux/compat.h> in net/core/sock.c
Christoph Hellwig [Wed, 22 Jul 2020 07:40:27 +0000 (09:40 +0200)]
net: explicitly include <linux/compat.h> in net/core/sock.c

The buildbot found a config where the header isn't already implicitly
pulled in, so add an explicit include as well.

Fixes: 8c918ffbbad4 ("net: remove compat_sock_common_{get,set}sockopt")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Wed, 22 Jul 2020 19:34:55 +0000 (12:34 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 46 non-merge commits during the last 6 day(s) which contain
a total of 68 files changed, 4929 insertions(+), 526 deletions(-).

The main changes are:

1) Run BPF program on socket lookup, from Jakub.

2) Introduce cpumap, from Lorenzo.

3) s390 JIT fixes, from Ilya.

4) teach riscv JIT to emit compressed insns, from Luke.

5) use build time computed BTF ids in bpf iter, from Yonghong.
====================

Purely independent overlapping changes in both filter.h and xdp.h

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ionic-updates'
David S. Miller [Wed, 22 Jul 2020 01:36:34 +0000 (18:36 -0700)]
Merge branch 'ionic-updates'

Shannon Nelson says:

====================
ionic updates

These are a few odd code tweaks to the ionic driver: FW defined MTU
limits, remove unnecessary code, and other tidiness tweaks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: interface file updates
Shannon Nelson [Tue, 21 Jul 2020 20:34:09 +0000 (13:34 -0700)]
ionic: interface file updates

Add some new interface values and update a few more descriptions.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: rearrange reset and bus-master control
Shannon Nelson [Tue, 21 Jul 2020 20:34:08 +0000 (13:34 -0700)]
ionic: rearrange reset and bus-master control

We can prevent potential incorrect DMA access attempts from the
NIC by enabling bus-master after the reset, and by disabling
bus-master earlier in cleanup.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: update eid test for overflow
Shannon Nelson [Tue, 21 Jul 2020 20:34:07 +0000 (13:34 -0700)]
ionic: update eid test for overflow

Fix up our comparison to better handle a potential (but largely
unlikely) wrap around.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: remove unused ionic_coal_hw_to_usec
Shannon Nelson [Tue, 21 Jul 2020 20:34:06 +0000 (13:34 -0700)]
ionic: remove unused ionic_coal_hw_to_usec

Clean up some unused code.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: set netdev default name
Shannon Nelson [Tue, 21 Jul 2020 20:34:05 +0000 (13:34 -0700)]
ionic: set netdev default name

If the host system's udev fails to set a new name for the
network port, there is no NETDEV_CHANGENAME event to trigger
the driver to send the name down to the firmware.  It is safe
to set the lif name multiple times, so we add a call early on
to set the default netdev name to be sure the FW has something
to use in its internal debug logging.  Then when udev gets
around to changing it we can update it to the actual name the
system will be using.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: get MTU from lif identity
Shannon Nelson [Tue, 21 Jul 2020 20:34:04 +0000 (13:34 -0700)]
ionic: get MTU from lif identity

Change from using hardcoded MTU limits and instead use the
firmware defined limits. The value from the LIF attributes is
the frame size, so we take off the header size to convert to
MTU size.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobareudp: Reverted support to enable & disable rx metadata collection
Martin Varghese [Fri, 17 Jul 2020 02:35:12 +0000 (08:05 +0530)]
bareudp: Reverted support to enable & disable rx metadata collection

The commit fe80536acf83 ("bareudp: Added attribute to enable & disable
rx metadata collection") breaks the the original(5.7) default behavior of
bareudp module to collect RX metadadata at the receive. It was added to
avoid the crash at the kernel neighbour subsytem when packet with metadata
from bareudp is processed. But it is no more needed as the
commit 394de110a733 ("net: Added pointer check for
dst->ops->neigh_lookup in dst_neigh_lookup_skb") solves this crash.

Fixes: fe80536acf83 ("bareudp: Added attribute to enable & disable rx metadata collection")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'dpaa2-eth-add-support-for-TBF-offload'
David S. Miller [Tue, 21 Jul 2020 23:24:05 +0000 (16:24 -0700)]
Merge branch 'dpaa2-eth-add-support-for-TBF-offload'

Ioana Ciornei says:

====================
dpaa2-eth: add support for TBF offload

This patch set adds support for TBF offload in dpaa2-eth.
The first patch restructures how the .ndo_setup_tc() callback is
implemented (each Qdisc is treated in a separate function), the second
patch just adds the necessary APIs for configuring the Tx shaper and the
last one is handling TC_SETUP_QDISC_TBF and configures as requested the
shaper.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-eth: add support for TBF offload
Ioana Ciornei [Tue, 21 Jul 2020 16:38:25 +0000 (19:38 +0300)]
dpaa2-eth: add support for TBF offload

React to TC_SETUP_QDISC_TBF and configure the egress shaper as
appropriate with the maximum rate and burst size requested by the user.
TBF can only be offloaded on DPAA2 when it's the root qdisc, ie it's a
per port shaper.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-eth: add API for Tx shaping
Ioana Ciornei [Tue, 21 Jul 2020 16:38:24 +0000 (19:38 +0300)]
dpaa2-eth: add API for Tx shaping

Add the necessary API (dpni_set_tx_shaping) for configuring the rate and
burst size of a per port shaper in DPAA2.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-eth: move the mqprio setup into a separate function
Ioana Ciornei [Tue, 21 Jul 2020 16:38:23 +0000 (19:38 +0300)]
dpaa2-eth: move the mqprio setup into a separate function

Move the setup done for MQPRIO into a separate function so that
with the addition of another offload we do not crowd
dpaa2_eth_setup_tc(). After this restructuring it's easier to see what
is supported in terms of Qdisc offloading.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: move helper to where its used
Florian Westphal [Tue, 21 Jul 2020 19:08:54 +0000 (21:08 +0200)]
mptcp: move helper to where its used

Only used in token.c.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'devlink-small-improvements'
David S. Miller [Tue, 21 Jul 2020 23:14:58 +0000 (16:14 -0700)]
Merge branch 'devlink-small-improvements'

Parav Pandit says:

====================
devlink small improvements

This short series improves the devlink code for lock commment,
simplifying checks and keeping the scope of mutex lock for necessary
fields.

Patch summary:
Patch-1 Keep the devlink_mutex for only for necessary changes.
Patch-2 Avoids duplicate check for reload flag
Patch-3 Adds missing comment for the scope of devlink instance lock
Patch-4 Constify devlink instance pointer
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Constify devlink instance pointer
Parav Pandit [Tue, 21 Jul 2020 16:53:54 +0000 (19:53 +0300)]
devlink: Constify devlink instance pointer

Constify devlink instance pointer while checking if reload operation is
supported or not.

This helps to review the scope of checks done in reload.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Add comment for devlink instance lock
Parav Pandit [Tue, 21 Jul 2020 16:53:53 +0000 (19:53 +0300)]
devlink: Add comment for devlink instance lock

Add comment to describe the purpose of devlink instance lock.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Avoid duplicate check for reload enabled flag
Parav Pandit [Tue, 21 Jul 2020 16:53:52 +0000 (19:53 +0300)]
devlink: Avoid duplicate check for reload enabled flag

Reload operation is enabled or not is already checked by
devlink_reload(). Hence, remove the duplicate check from
devlink_nl_cmd_reload().

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Do not hold devlink mutex when initializing devlink fields
Parav Pandit [Tue, 21 Jul 2020 16:53:51 +0000 (19:53 +0300)]
devlink: Do not hold devlink mutex when initializing devlink fields

There is no need to hold a device global lock when initializing
devlink device fields of a devlink instance which is not yet part of the
devices list.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agor8169: allow to enable ASPM on RTL8125A
Heiner Kallweit [Tue, 21 Jul 2020 16:22:24 +0000 (18:22 +0200)]
r8169: allow to enable ASPM on RTL8125A

For most chip versions this has been added already. Allow also for
RTL8125A to enable ASPM.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ena-driver-new-features'
David S. Miller [Tue, 21 Jul 2020 22:59:04 +0000 (15:59 -0700)]
Merge branch 'ena-driver-new-features'

Arthur Kiyanovski says:

====================
ENA driver new features

V4 changes:
-----------
Add smp_rmb() to "net: ena: avoid unnecessary rearming of interrupt
vector when busy-polling" to adhere to the linux kernel memory model,
and update the commit message accordingly.

V3 changes:
-----------
1. Add "net: ena: enable support of rss hash key and function
   changes" patch again, with more explanations why it should
   be in net-next in commit message.
2. Add synchronization considerations to "net: ena: avoid unnecessary
   rearming of interrupt vector when busy-polling"

V2 changes:
-----------
1. Update commit messages of 2 patches to be more verbose.
2. Remove "net: ena: enable support of rss hash key and function
   changes" patch. Will be resubmitted net.

V1 cover letter:
----------------
This patchset contains performance improvements, support for new devices
and functionality:

1. Support for upcoming ENA devices
2. Avoid unnecessary IRQ unmasking in busy poll to reduce interrupt rate
3. Enabling device support for RSS function and key manipulation
4. Support for NIC-based traffic mirroring (SPAN port)
5. Additional PCI device ID
6. Cosmetic changes
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: support new LLQ acceleration mode
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:11 +0000 (16:38 +0300)]
net: ena: support new LLQ acceleration mode

New devices add a new hardware acceleration engine, which adds some
restrictions to the driver.
Metadata descriptor must be present for each packet and the maximum
burst size between two doorbells is now limited to a number
advertised by the device.

This patch adds:
1. A handshake protocol between the driver and the device, so the
device will enable the accelerated queues only when both sides
support it.

2. The driver support for the new acceleration engine:
2.1. Send metadata descriptor for each Tx packet.
2.2. Limit the number of packets sent between doorbells.(*)

(*) A previous driver implementation of this feature was comitted in
commit 05d62ca218f8 ("net: ena: add handling of llq max tx burst size")
however the design of the interface between the driver and device
changed since then. This change is reflected in this commit.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: move llq configuration from ena_probe to ena_device_init()
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:10 +0000 (16:38 +0300)]
net: ena: move llq configuration from ena_probe to ena_device_init()

When the ENA device resets to recover from some error state, all LLQ
configuration values are reset to their defaults, because LLQ is
initialized only once during ena_probe().

Changes in this commit:
1. Move the LLQ configuration process into ena_init_device()
which is called from both ena_probe() and ena_restore_device(). This
way, LLQ setup configurations that are different from the default
values will survive resets.

2. Extract the LLQ bar mapping to ena_map_llq_bar(),
and call once in the lifetime of the driver from ena_probe(),
since there is no need to unmap and map the LLQ bar again every reset.

3. Map the LLQ bar if it exists, regardless if initialization of LLQ
placement policy (ENA_ADMIN_PLACEMENT_POLICY_DEV) succeeded
or not. Initialization might fail the first time, falling back to the
ENA_ADMIN_PLACEMENT_POLICY_HOST placement policy, but later succeed
after device reset, in which case the LLQ bar needs to be mapped
already.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: enable support of rss hash key and function changes
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:09 +0000 (16:38 +0300)]
net: ena: enable support of rss hash key and function changes

Add the rss_configurable_function_key bit to driver_supported_feature.

This bit tells the device that the driver in question supports the
retrieving and updating of RSS function and hash key, and therefore
the device should allow RSS function and key manipulation.

This commit turns on  device support for hash key and RSS function
management. Without this commit this feature is turned off at the
device and appears to the user as unsupported.

This commit concludes the following series of already merged commits:
commit 0af3c4e2eab8 ("net: ena: changes to RSS hash key allocation")
commit c1bd17e51c71 ("net: ena: change default RSS hash function to Toeplitz")
commit f66c2ea3b18a ("net: ena: allow setting the hash function without changing the key")
commit e9a1de378dd4 ("net: ena: fix error returning in ena_com_get_hash_function()")
commit 80f8443fcdaa ("net: ena: avoid unnecessary admin command when RSS function set fails")
commit 6a4f7dc82d1e ("net: ena: rss: do not allocate key when not supported")
commit 0d1c3de7b8c7 ("net: ena: fix incorrect default RSS key")

The above commits represent the last part of the implementation of
this feature, and with them merged the feature can be enabled
in the device.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: add support for traffic mirroring
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:08 +0000 (16:38 +0300)]
net: ena: add support for traffic mirroring

Add support for traffic mirroring, where the hardware reads the
buffer from the instance memory directly.

Traffic Mirroring needs access to the rx buffers in the instance.
To have this access, this patch:
1. Changes the code to map and unmap the rx buffers bidirectionally.
2. Enables the relevant bit in driver_supported_features to indicate
   to the FW that this driver supports traffic mirroring.

Rx completion is not generated until mirroring is done to avoid
the situation where the driver changes the buffer before it is
mirrored.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: cosmetic: change ena_com_stats_admin stats to u64
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:07 +0000 (16:38 +0300)]
net: ena: cosmetic: change ena_com_stats_admin stats to u64

The size of the admin statistics in ena_com_stats_admin is changed
from 32bit to 64bit so to align with the sizes of the other statistics
in the driver (i.e. rx_stats, tx_stats and ena_stats_dev).

This is done as part of an effort to create a unified API to read
statistics.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: cosmetic: satisfy gcc warning
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:06 +0000 (16:38 +0300)]
net: ena: cosmetic: satisfy gcc warning

gcc 4.8 reports a warning when initializing with = {0}.
Dropping the "0" from the braces fixes the issue.
This fix is not ANSI compatible but is allowed by gcc.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: add reserved PCI device ID
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:05 +0000 (16:38 +0300)]
net: ena: add reserved PCI device ID

Add a reserved PCI device ID to the driver's table
Used for internal testing purposes.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ena: avoid unnecessary rearming of interrupt vector when busy-polling
Arthur Kiyanovski [Tue, 21 Jul 2020 13:38:04 +0000 (16:38 +0300)]
net: ena: avoid unnecessary rearming of interrupt vector when busy-polling

For an overview of the race created by this patch goto synchronization
label.

In napi busy-poll mode, the kernel invokes the napi handler of the
device repeatedly to poll the NIC's receive queues. This process
repeats until a timeout, specific for each connection, is up.
By polling packets in busy-poll mode the user may gain lower latency
and higher throughput (since the kernel no longer waits for interrupts
to poll the queues) in expense of CPU usage.

Upon completing a napi routine, the driver checks whether
the routine was called by an interrupt handler. If so, the driver
re-enables interrupts for the device. This is needed since an
interrupt routine invocation disables future invocations until
explicitly re-enabled.

The driver avoids re-enabling the interrupts if they were not disabled
in the first place (e.g. if driver in busy mode).
Originally, the driver checked whether interrupt re-enabling is needed
by reading the 'ena_napi->unmask_interrupt' variable. This atomic
variable was set upon interrupt and cleared after re-enabling it.

In the 4.10 Linux version, the 'napi_complete_done' call was changed
so that it returns 'false' when device should not re-enable
interrupts, and 'true' otherwise. The change includes reading the
"NAPIF_STATE_IN_BUSY_POLL" flag to check if the napi call is in
busy-poll mode, and if so, return 'false'.
The driver was changed to re-enable interrupts according to this
routine's return value.
The Linux community rejected the use of the
'ena_napi->unmaunmask_interrupt' variable to determine whether
unmasking is needed, and urged to use napi_napi_complete_done()
return value solely.
See https://lore.kernel.org/patchwork/patch/741149/ for more details

As explained, a busy-poll session exists for a specified timeout
value, after which it exits the busy-poll mode and re-enters it later.
This leads to many invocations of the napi handler where
napi_complete_done() false indicates that interrupts should be
re-enabled.
This creates a bug in which the interrupts are re-enabled
unnecessarily.
To reproduce this bug:
    1) echo 50 | sudo tee /proc/sys/net/core/busy_poll
    2) echo 50 | sudo tee /proc/sys/net/core/busy_read
    3) Add counters that check whether
    'ena_unmask_interrupt(tx_ring, rx_ring);'
    is called without disabling the interrupts in the first
    place (i.e. with calling the interrupt routine
    ena_intr_msix_io())

Steps 1+2 enable busy-poll as the default mode for new connections.

The busy poll routine rearms the interrupts after every session by
design, and so we need to add an extra check that the interrupts were
masked in the first place.

synchronization:
This patch introduces a race between the interrupt handler
ena_intr_msix_io() and the napi routine ena_io_poll().
Some macros and instruction were added to prevent this race from leaving
the interrupts masked. The following specifies the different race
scenarios in this patch:

1) interrupt handler and napi routine run sequentially
    i) interrupt handler is called, sets 'interrupts_masked' flag and
successfully schedules the napi handler via softirq.

    In this scenario the napi routine might not see the flag change
    for several reasons:
a) The flag is stored in a register by the compiler. For this
case the WRITE_ONCE macro which prevents this.
b) The compiler might reorder the instruction. For this the
smp_wmb() instruction was used which implies a compiler memory
barrier.
c) On archs with weak consistency model (like ARM64) the napi
routine might be scheduled and start running before the flag
STORE instruction is committed to cache/memory. To ensure this
doesn't happen, the smp_wmb() instruction was added. It ensures
that the flag set instruction is committed before scheduling
napi.

    ii) compiler reorders the flag's value check in the 'if' with
    the flag set in the napi routine.

    This scenario is prevented by smp_rmb() call after the flag check.

2) interrupt handler and napi routine run in parallel (can happen when
busy poll routine invokes the napi handler)

    i) interrupt handler sets the flag in one core, while the napi
    routine reads it in another core.

    This scenario also is divided into two cases:
a) napi_complete_done() doesn't finish running, in which case
napi_sched() would just set NAPIF_STATE_MISSED and the napi
routine would reschedule itself without changing the flag's value.

b) napi_complete_done() finishes running. In this case the
napi routine might override the flag's value.
This doesn't present any rise since it later unmasks the
interrupt vector.

Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: Fix ILT and XRCD bitmap memory leaks
Yuval Basson [Tue, 21 Jul 2020 11:34:26 +0000 (14:34 +0300)]
qed: Fix ILT and XRCD bitmap memory leaks

- Free ILT lines used for XRC-SRQ's contexts.
- Free XRCD bitmap

Fixes: b8204ad878ce7 ("qed: changes to ILT to support XRC")
Fixes: 7bfb399eca460 ("qed: Add XRC to RoCE")
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Basson <ybason@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'Phylink-PCS-updates'
David S. Miller [Tue, 21 Jul 2020 22:46:51 +0000 (15:46 -0700)]
Merge branch 'Phylink-PCS-updates'

Russell King says:

====================
Phylink PCS updates

This series updates the rudimentary phylink PCS support with the
results of the last four months of development of that.  Phylink
PCS support was initially added back at the end of March, when it
became clear that the current approach of treating everything at
the MAC end as being part of the MAC was inadequate.

However, this rudimentary implementation was fine initially for
mvneta and similar, but in practice had a fair number of issues,
particularly when ethtool interfaces were used to change various
link properties.

It became apparent that relying on the phylink_config structure for
the PCS was also bad when it became clear that the same PCS was used
in DSA drivers as well as in NXPs other offerings, and there was a
desire to re-use that code.

It also became apparent that splitting the "configuration" step on
an interface mode configuration between the MAC and PCS using just
mac_config() and pcs_config() methods was not sufficient for some
setups, as the MAC needed to be "taken down" prior to making changes,
and once all settings were complete, the MAC could only then be
resumed.

This series addresses these points, progressing PCS support, and
has been developed with mvneta and DPAA2 setups, with work on both
those drivers to prove this approach.  It has been rigorously tested
with mvneta, as that provides the most flexibility for testing the
various code paths.

To solve the phylink_config reuse problem, we introduce a struct
phylink_pcs, which contains the minimal information necessary, and it
is intended that this is embedded in the PCS private data structure.

To solve the interface mode configuration problem, we introduce two
new MAC methods, mac_prepare() and mac_finish() which wrap the entire
interface mode configuration only.  This has the additional benefit of
relieving MAC drivers from working out whether an interface change has
occurred, and whether they need to do some major work.

I have not yet updated all the interface documentation for these
changes yet, that work remains, but this patch set is provided in the
hope that those working on PCS support in NXP will find this useful.

Since there is a lot of change here, this is the reason why I strongly
advise that everyone has converted to the mac_link_up() way of
configuring the link parameters when the link comes up, rather than
the old way of using mac_config() - especially as splitting the PCS
changes how and when phylink calls mac_config(). Although no change
for existing users is intended, that is something I no longer am able
to test.

Changes since RFC:
- fix bisect build failure
- add patch to use config.an_enabled
- rename phylink_config_interface to phylink_major_reconfig
- add expanded documentation for phylink_set_pcs()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: add interface to configure clause 22 PCS PHY
Russell King [Tue, 21 Jul 2020 11:04:51 +0000 (12:04 +0100)]
net: phylink: add interface to configure clause 22 PCS PHY

Add an interface to configure the advertisement for a clause 22 PCS
PHY, and set the AN enable flag in the BMCR appropriately.

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: add struct phylink_pcs
Russell King [Tue, 21 Jul 2020 11:04:46 +0000 (12:04 +0100)]
net: phylink: add struct phylink_pcs

Add a way for MAC PCS to have private data while keeping independence
from struct phylink_config, which is used for the MAC itself. We need
this independence as we will have stand-alone code for PCS that is
independent of the MAC.  Introduce struct phylink_pcs, which is
designed to be embedded in a driver private data structure.

This structure does not include a mdio_device as there are PCS
implementations such as the Marvell DSA and network drivers where this
is not necessary.

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: re-implement interface configuration with PCS
Russell King [Tue, 21 Jul 2020 11:04:41 +0000 (12:04 +0100)]
net: phylink: re-implement interface configuration with PCS

With PCS support, how we implement interface reconfiguration (or other
major reconfiguration) is not up to the job; we end up reconfiguring
the PCS for an interface change while the link could potentially be up.
In order to solve this, add two additional MAC methods for major
configuration, one to prepare for the change, and one to finish the
change.

This allows mvneta and mvpp2 to shutdown what they require prior to the
MAC and PCS configuration calls, and then restart as appropriate.

This impacts ksettings_set(), which now needs to identify whether the
change is a minor tweak to the advertisement masks or whether the
interface mode has changed, and call the appropriate function for that
update.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: in-band pause mode advertisement update for PCS
Russell King [Tue, 21 Jul 2020 11:04:36 +0000 (12:04 +0100)]
net: phylink: in-band pause mode advertisement update for PCS

Re-code the pause in-band advertisement update in light of the addition
of PCS support, so that we perform the minimum required; only the PCS
configuration function needs to be called in this case, followed by the
request to trigger a restart of negotiation if the programmed
advertisement changed.

We need to change the pcs_config() signature to pass whether resolved
pause should be passed to the MAC for setups such as mvneta and mvpp2
where doing so overrides the MAC manual flow controls.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: simplify fixed-link case for ksettings_set method
Russell King [Tue, 21 Jul 2020 11:04:31 +0000 (12:04 +0100)]
net: phylink: simplify fixed-link case for ksettings_set method

For fixed links, we only allow the current settings, so this should be
a matter of merely rejecting an attempt to change the settings.  If the
settings agree, then there is nothing more we need to do.

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: use config.an_enabled in ksettings_set method
Russell King [Tue, 21 Jul 2020 11:04:26 +0000 (12:04 +0100)]
net: phylink: use config.an_enabled in ksettings_set method

Rather than recomputing whether AN is enabled, use config.an_enabled.

Suggested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: simplify phy case for ksettings_set method
Russell King [Tue, 21 Jul 2020 11:04:21 +0000 (12:04 +0100)]
net: phylink: simplify phy case for ksettings_set method

When we have a PHY attached, an ethtool ksettings_set() call only
really needs to call through to the phylib equivalent; phylib will
call back to us when the link changes so we can update our state.
Therefore, we can bypass most of our ksettings_set() call for this
case.

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: simplify ksettings_set() implementation
Russell King [Tue, 21 Jul 2020 11:04:16 +0000 (12:04 +0100)]
net: phylink: simplify ksettings_set() implementation

Simplify the ksettings_set() implementation to look more like phylib's
implementation; use a switch() for validating the autoneg setting, and
use the linkmode_modify() helper to set the autoneg bit in the
advertisement mask.

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: avoid mac_config calls
Russell King [Tue, 21 Jul 2020 11:04:10 +0000 (12:04 +0100)]
net: phylink: avoid mac_config calls

Avoid calling mac_config() when using split PCS, and the interface
remains the same.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: update PCS when changing interface during resolution
Russell King [Tue, 21 Jul 2020 11:04:05 +0000 (12:04 +0100)]
net: phylink: update PCS when changing interface during resolution

The only PHYs that are used with phylink which change their interface
are the BCM84881 and MV88X3310 family, both of which only change their
interface modes on link-up events.  This will break when drivers are
converted to split-PCS.  Fix this.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: ensure link is down when changing interface
Russell King [Tue, 21 Jul 2020 11:04:00 +0000 (12:04 +0100)]
net: phylink: ensure link is down when changing interface

The only PHYs that are used with phylink which change their interface
are the BCM84881 and MV88X3310 family, both of which only change their
interface modes on link-up events.  However, rather than relying upon
this behaviour by the PHY, we should give a stronger guarantee when
resolving that the link will be down whenever we change the interface
mode.  This patch implements that stronger guarantee for resolve.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: rearrange resolve mac_config() call
Russell King [Tue, 21 Jul 2020 11:03:55 +0000 (12:03 +0100)]
net: phylink: rearrange resolve mac_config() call

Use a boolean to indicate whether mac_config() should be called during
a resolution. This allows resolution to have a single location where
mac_config() will be called, which will allow us to make decisions
about how and what we do.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: rejig link state tracking
Russell King [Tue, 21 Jul 2020 11:03:50 +0000 (12:03 +0100)]
net: phylink: rejig link state tracking

Rejig the link state tracking, so that we can use the current state
in a future patch.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: update ethtool reporting for fixed-link modes
Russell King [Tue, 21 Jul 2020 11:03:45 +0000 (12:03 +0100)]
net: phylink: update ethtool reporting for fixed-link modes

Comparing the ethtool output from phylink and non-phylink fixed-link
setups shows that we have some differences:

- The "auto-negotiation" fields are different; phylink reports these
  as "No", non-phylink reports these as "Yes" for the supported and
  advertising masks.
- The link partner advertisement is set to the link speed with non-
  phylink, but phylink leaves this unset, causing all link partner
  fields to be omitted.

The phylink ethtool output also disagrees with the software emulated
PHY dump via the MII registers.

Update the phylink fixed-link parsing code so that we better reflect
the behaviour of the non-phylink code that this facility replaces, and
bring the ethtool interface more into line with the report from via the
MII interface.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'enetc-Add-adaptive-interrupt-coalescing'
David S. Miller [Tue, 21 Jul 2020 22:38:30 +0000 (15:38 -0700)]
Merge branch 'enetc-Add-adaptive-interrupt-coalescing'

Claudiu Manoil says:

====================
enetc: Add adaptive interrupt coalescing

Apart from some related cleanup patches, this set
introduces in a straightforward way the support needed
to enable and configure interrupt coalescing for ENETC.

Patch 5 introduces the support needed for configuring the
interrupt coalescing parameters and for switching between
moderated (int. coalescing) and per-packet interrupt modes.
When interrupt coalescing is enabled the Rx/Tx time
thresholds are configurable, packet thresholds are fixed.
To make this work reliably, patch 5 uses the traffic
pause procedure introduced in patch 2.

Patch 6 adds DIM (Dynamic Interrupt Moderation) to implement
adaptive coalescing based on time thresholds, for the Rx 'channel'.
On the Tx side a default optimal value is used instead, optimized for
TCP traffic over 1G and 2.5G links.  This default 'optimal' value can
be overridden anytime via 'ethtool -C tx-usecs'.

netperf -t TCP_MAERTS measurements show a significant CPU load
reduction correlated w/ reduced interrupt rates. For the
measurement results refer to the comments in patch 6.

v2: Replaced Tx DIM with predefined optimal value, giving
better results. This was also suggested by Jakub (cc).
Switched order of patches 4 and 5, for better grouping.

v3: minor cleanup/improvements
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Add adaptive interrupt coalescing
Claudiu Manoil [Tue, 21 Jul 2020 07:55:22 +0000 (10:55 +0300)]
enetc: Add adaptive interrupt coalescing

Use the generic dynamic interrupt moderation (dim)
framework to implement adaptive interrupt coalescing
on Rx.  With the per-packet interrupt scheme, a high
interrupt rate has been noted for moderate traffic flows
leading to high CPU utilization.  The 'dim' scheme
implemented by the current patch addresses this issue
improving CPU utilization while using minimal coalescing
time thresholds in order to preserve a good latency.
On the Tx side use an optimal time threshold value by
default.  This value has been optimized for Tx TCP
streams at a rate of around 85kpps on a 1G link,
at which rate half of the Tx ring size (128) gets filled
in 1500 usecs.  Scaling this down to 2.5G links yields
the current value of 600 usecs, which is conservative
and gives good enough results for 1G links too (see
next).

Below are some measurement results for before and after
this patch (and related dependencies) basically, for a
2 ARM Cortex-A72 @1.3Ghz CPUs system (32 KB L1 data cache),
using 60secs log netperf TCP stream tests @ 1Gbit link
(maximum throughput):

1) 1 Rx TCP flow, both Rx and Tx processed by the same NAPI
thread on the same CPU:
CPU utilization int rate (ints/sec)
Before: 50%-60% (over 50%) 92k
After:  13%-22% 3.5k-12k
Comment:  Major CPU utilization improvement for a single flow
  Rx TCP flow (i.e. netperf -t TCP_MAERTS) on a single
  CPU. Usually settles under 16% for longer tests.

2) 4 Rx TCP flows + 4 Tx TCP flows (+ pings to check the latency):
Total CPU utilization Total int rate (ints/sec)
Before: ~80% (spikes to 90%) ~100k
After:   60% (more steady)   ~4k
Comment:  Important improvement for this load test, while the
  ping test outcome does not show any notable
  difference compared to before.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Add interrupt coalescing support
Claudiu Manoil [Tue, 21 Jul 2020 07:55:21 +0000 (10:55 +0300)]
enetc: Add interrupt coalescing support

Enable programming of the interrupt coalescing registers
and allow manual configuration of the coalescing time
thresholds via ethtool.  Packet thresholds have been fixed
to predetermined values as there's no point in making them
run-time configurable, also anticipating the dynamic interrupt
moderation (DIM) algorithm which uses fixed packet thresholds
as well.  If the interface is up when the operation mode of
traffic interrupt events is changed by the user (i.e. switching
from default per-packet interrupts to coalesced interrupts),
the traffic needs to be paused in the process.
This patch also prepares the ground for introducing DIM on Rx.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Drop redundant ____cacheline_aligned_in_smp
Claudiu Manoil [Tue, 21 Jul 2020 07:55:20 +0000 (10:55 +0300)]
enetc: Drop redundant ____cacheline_aligned_in_smp

'struct enetc_bdr' is already '____cacheline_aligned_in_smp'.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Fix interrupt coalescing register naming
Claudiu Manoil [Tue, 21 Jul 2020 07:55:19 +0000 (10:55 +0300)]
enetc: Fix interrupt coalescing register naming

Interrupt coalescing registers naming in the current revision
of the Ref Man (RM) is ICR, deprecating the ICIR name used
in earlier (draft) versions of the RM.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Factor out the traffic start/stop procedures
Claudiu Manoil [Tue, 21 Jul 2020 07:55:18 +0000 (10:55 +0300)]
enetc: Factor out the traffic start/stop procedures

A reliable traffic pause (and reconfiguration) procedure
is needed to be able to safely make h/w configuration
changes during run-time, like changing the mode in which the
interrupts are operating (i.e. with or without coalescing),
as opposed to making on-the-fly register updates that
may be subject to h/w or s/w concurrency issues.
To this end, the code responsible of the run-time device
configurations that basically starts resp. stops the traffic
flow through the device has been extracted from the
the enetc_open/_close procedures, to the separate standalone
enetc_start/_stop procedures. Traffic stop should be as
graceful as possible, it lets the executing napi threads to
to finish while the interrupts stay disabled.  But since
the napi thread will try to re-enable interrupts by clearing
the device's unmask register, the enable_irq/ disable_irq
API has been used to avoid this potential concurrency issue
and make the traffic pause procedure more reliable.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Refine buffer descriptor ring sizes
Claudiu Manoil [Tue, 21 Jul 2020 07:55:17 +0000 (10:55 +0300)]
enetc: Refine buffer descriptor ring sizes

It's time to differentiate between Rx and Tx ring sizes.
Not only Tx rings are processed differently than Rx rings,
but their default number also differs - i.e. up to 8 Tx rings
per device (8 traffic classes) vs. 2 Rx rings (one per CPU).
So let's set Tx rings sizes to half the size of the Rx rings
for now, to be conservative.
The default ring sizes were decreased as well (to the next
lower power of 2), to reduce the memory footprint, buffering
etc., since the measurements I've made so far show that the
rings are very unlikely to get full.
This change also anticipates the introduction of the
dynamic interrupt moderation (dim) algorithm which operates
on maximum packet thresholds of 256 packets for Rx and 128
packets for Tx.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mdio-mux-gpio: use devm_gpiod_get_array()
Jisheng Zhang [Tue, 21 Jul 2020 07:01:57 +0000 (15:01 +0800)]
net: mdio-mux-gpio: use devm_gpiod_get_array()

Use devm_gpiod_get_array() to simplify the error handling and exit
code path.

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobpftool: Use only nftw for file tree parsing
Tony Ambardar [Tue, 21 Jul 2020 02:48:16 +0000 (19:48 -0700)]
bpftool: Use only nftw for file tree parsing

The bpftool sources include code to walk file trees, but use multiple
frameworks to do so: nftw and fts. While nftw conforms to POSIX/SUSv3 and
is widely available, fts is not conformant and less common, especially on
non-glibc systems. The inconsistent framework usage hampers maintenance
and portability of bpftool, in particular for embedded systems.

Standardize code usage by rewriting one fts-based function to use nftw and
clean up some related function warnings by extending use of "const char *"
arguments. This change helps in building bpftool against musl for OpenWrt.

Also fix an unsafe call to dirname() by duplicating the string to pass,
since some implementations may directly alter it. The same approach is
used in libbpf.c.

Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200721024817.13701-1-Tony.Ambardar@gmail.com
3 years agoMerge branch 'bpf_iter-BTF_ID-at-build-time'
Alexei Starovoitov [Tue, 21 Jul 2020 20:15:01 +0000 (13:15 -0700)]
Merge branch 'bpf_iter-BTF_ID-at-build-time'

Yonghong Song says:

====================
Commit 5a2798ab32ba
("bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros")
implemented a mechanism to compute btf_ids at kernel build
time which can simplify kernel implementation and reduce
runtime overhead by removing in-kernel btf_id calculation.

This patch set tried to use this mechanism to compute
btf_ids for bpf_skc_to_*() helpers and for btf_id_or_null ctx
arguments specified during bpf iterator registration.
Please see individual patch for details.

Changelogs:
  v1 -> v2:
    - v1 ([1]) is only for bpf_skc_to_*() helpers. This version
      expanded it to cover ctx btf_id_or_null arguments
    - abandoned the change of "extern u32 name[]" to
      "static u32 name[]" for BPF_ID_LIST local "name" definition.
      gcc 9 incurred a compilation error.

 [1]: https://lore.kernel.org/bpf/20200717184706.3476992-1-yhs@fb.com/T
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
3 years agosamples/bpf, selftests/bpf: Use bpf_probe_read_kernel
Ilya Leoshkevich [Mon, 20 Jul 2020 11:48:06 +0000 (13:48 +0200)]
samples/bpf, selftests/bpf: Use bpf_probe_read_kernel

A handful of samples and selftests fail to build on s390, because
after commit 0ebeea8ca8a4 ("bpf: Restrict bpf_probe_read{, str}()
only to archs where they work") bpf_probe_read is not available
anymore.

Fix by using bpf_probe_read_kernel.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720114806.88823-1-iii@linux.ibm.com
3 years agobpf: net: Use precomputed btf_id for bpf iterators
Yonghong Song [Mon, 20 Jul 2020 16:34:03 +0000 (09:34 -0700)]
bpf: net: Use precomputed btf_id for bpf iterators

One additional field btf_id is added to struct
bpf_ctx_arg_aux to store the precomputed btf_ids.
The btf_id is computed at build time with
BTF_ID_LIST or BTF_ID_LIST_GLOBAL macro definitions.
All existing bpf iterators are changed to used
pre-compute btf_ids.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163403.1393551-1-yhs@fb.com