linux-2.6-microblaze.git
3 years agoigc: Enable internal i225 PPS
Ederson de Souza [Fri, 19 Feb 2021 01:31:03 +0000 (17:31 -0800)]
igc: Enable internal i225 PPS

The i225 device can produce one interrupt on the full second, much
like i210 - from where this patch is inspired.

This patch sets up the full second interruption on the i225 and when
receiving it, it sends a PPS event to PTP (Precision Time Protocol)
kernel subsystem.

The PTP subsystem exposes the PPS events via ioctl and sysfs, and one
can use the `testptp` tool (tools/testing/selftests/ptp) to check that
the events are being generated.

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigb: Add double-check MTA_REGISTER for i210 and i211
Grzegorz Siwik [Tue, 23 Feb 2021 14:15:27 +0000 (15:15 +0100)]
igb: Add double-check MTA_REGISTER for i210 and i211

Add new function which checks MTA_REGISTER if its filled correctly.
If not then writes again to same register.
There is possibility that i210 and i211 could not accept
MTA_REGISTER settings, specially when you add and remove
many of multicast addresses in short time.
Without this patch there is possibility that multicast settings will be
not always set correctly in hardware.

Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigb: Redistribute memory for transmit packet buffers when in Qav mode
Ederson de Souza [Thu, 4 Feb 2021 01:50:25 +0000 (17:50 -0800)]
igb: Redistribute memory for transmit packet buffers when in Qav mode

i210 has a total of 24KB of transmit packet buffer. When in Qav mode,
this buffer is divided into four pieces, one for each Tx queue.
Currently, 8KB are given to each of the two SR queues and 4KB are given
to each of the two SP queues.

However, it was noticed that such distribution can make best effort
traffic (which would usually go to the SP queues when Qav is enabled, as
the SR queues would be used by ETF or CBS qdiscs for TSN-aware traffic)
perform poorly. Using iperf3 to measure, one could see the performance
of best effort traffic drop by nearly a third (from 935Mbps to 578Mbps),
with no TSN traffic competing.

This patch redistributes the 24KB to each queue equally: 6KB each. On
tests, there was no notable performance reduction of best effort traffic
performance when there was no TSN traffic competing.

Below, more details about the data collected:

All experiments were run using the following qdisc setup:

qdisc taprio 100: root refcnt 9 tc 4 map 3 3 3 2 3 0 0 3 3 3 3 3 3 3 3 3
    queues offset 0 count 1 offset 1 count 1 offset 2 count 1 offset 3 count 1
    clockid TAI base-time 0 cycle-time 10000000 cycle-time-extension 0
    index 0 cmd S gatemask 0xf interval 10000000

qdisc etf 8045: parent 100:1 clockid TAI delta 1000000 offload on
    deadline_mode off skip_sock_check off

TSN traffic, when enabled, had this characteristics:
 Packet size: 1500 bytes
 Transmission interval: 125us

----------------------------------
Without this patch:
----------------------------------
- TCP data:
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.35 GBytes   578 Mbits/sec    0

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.07 GBytes   460 Mbits/sec    1

- TCP data limiting iperf3 buffer size to 4K:
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.35 GBytes   579 Mbits/sec    0

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.08 GBytes   462 Mbits/sec    0

- TCP data limiting iperf3 buffer size to 192 bytes (smallest size without
 serious performance degradation):
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.34 GBytes   577 Mbits/sec    0

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.07 GBytes   461 Mbits/sec    1

- UDP data at 1000Mbit/sec:
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
        [  5]   0.00-20.00  sec  1.36 GBytes   586 Mbits/sec  0.000 ms  0/1011407 (0%)

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
        [  5]   0.00-20.00  sec  1.05 GBytes   451 Mbits/sec  0.000 ms  0/778672 (0%)

----------------------------------
With this patch:
----------------------------------

- TCP data:
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  2.17 GBytes   932 Mbits/sec    0

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.50 GBytes   646 Mbits/sec    1

- TCP data limiting iperf3 buffer size to 4K:
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  2.17 GBytes   931 Mbits/sec    0

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.50 GBytes   645 Mbits/sec    0

- TCP data limiting iperf3 buffer size to 192 bytes (smallest size without
 serious performance degradation):
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  2.17 GBytes   932 Mbits/sec    1

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Retr
        [  5]   0.00-20.00  sec  1.50 GBytes   645 Mbits/sec    0

- UDP data at 1000Mbit/sec:
    - No TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
        [  5]   0.00-20.00  sec  2.23 GBytes   956 Mbits/sec  0.000 ms  0/1650226 (0%)

    - With TSN traffic:
        [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
        [  5]   0.00-20.00  sec  1.51 GBytes   649 Mbits/sec  0.000 ms  0/1120264 (0%)

Signed-off-by: Ederson de Souza <ederson.desouza@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoMerge branch 'ehtool-fec-stats'
David S. Miller [Fri, 16 Apr 2021 00:08:30 +0000 (17:08 -0700)]
Merge branch 'ehtool-fec-stats'

Jakub Kicinski says:

====================
ethtool: add standard FEC statistics

This set adds uAPI for reporting standard FEC statistics, and
implements it in a handful of drivers.

The statistics are taken from the IEEE standard, with one
extra seemingly popular but not standard statistics added.

The implementation is similar to that of the pause frame
statistics, user requests the stats by setting a bit
(ETHTOOL_FLAG_STATS) in the common ethtool header of
ETHTOOL_MSG_FEC_GET.

Since standard defines the statistics per lane what's
reported is both total and per-lane counters:

 # ethtool -I --show-fec eth0
 FEC parameters for eth0:
 Configured FEC encodings: None
 Active FEC encoding: None
 Statistics:
  corrected_blocks: 256
    Lane 0: 255
    Lane 1: 1
  uncorrectable_blocks: 145
    Lane 0: 128
    Lane 1: 17

v2: check for errors in mlx5 register access
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlx5: implement ethtool::get_fec_stats
Jakub Kicinski [Thu, 15 Apr 2021 22:53:18 +0000 (15:53 -0700)]
mlx5: implement ethtool::get_fec_stats

Report corrected bits.

v2: catch reg access errors (Saeed)

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosfc: ef10: implement ethtool::get_fec_stats
Jakub Kicinski [Thu, 15 Apr 2021 22:53:17 +0000 (15:53 -0700)]
sfc: ef10: implement ethtool::get_fec_stats

Report what appears to be the standard block counts:
 - 30.5.1.1.17 aFECCorrectedBlocks
 - 30.5.1.1.18 aFECUncorrectableBlocks

Don't report the per-lane symbol counts, if those really
count symbols they are not what the standard calls for
(even if symbols seem like the most useful thing to count.)

Fingers crossed that fec_corrected_errors is not in symbols.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobnxt: implement ethtool::get_fec_stats
Jakub Kicinski [Thu, 15 Apr 2021 22:53:16 +0000 (15:53 -0700)]
bnxt: implement ethtool::get_fec_stats

Report corrected bits.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoethtool: add FEC statistics
Jakub Kicinski [Thu, 15 Apr 2021 22:53:15 +0000 (15:53 -0700)]
ethtool: add FEC statistics

Similarly to pause statistics add stats for FEC.

The IEEE standard mandates two sets of counters:
 - 30.5.1.1.17 aFECCorrectedBlocks
 - 30.5.1.1.18 aFECUncorrectableBlocks
where block is a block of bits FEC operates on.
Each of these counters is defined per lane (PCS instance).

Multiple vendors provide number of corrected _bits_ rather
than/as well as blocks.

This set adds the 2 standard-based block counters and a extra
one for corrected bits.

Counters are exposed to user space via netlink in new attributes.
Each attribute carries an array of u64s, first element is
the total count, and the following ones are a per-lane break down.

Much like with pause stats the operation will not fail when driver
does not implement the get_fec_stats callback (nor can the driver
fail the operation by returning an error). If stats can't be
reported the relevant attributes will be empty.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoethtool: fec_prepare_data() - jump to error handling
Jakub Kicinski [Thu, 15 Apr 2021 22:53:14 +0000 (15:53 -0700)]
ethtool: fec_prepare_data() - jump to error handling

Refactor fec_prepare_data() a little bit to skip the body
of the function and exit on error. Currently the code
depends on the fact that we only have one call which
may fail between ethnl_ops_begin() and ethnl_ops_complete()
and simply saves the error code. This will get hairy with
the stats also being queried.

No functional changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoethtool: move ethtool_stats_init
Jakub Kicinski [Thu, 15 Apr 2021 22:53:13 +0000 (15:53 -0700)]
ethtool: move ethtool_stats_init

We'll need it for FEC stats as well.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoscm: optimize put_cmsg()
Eric Dumazet [Thu, 15 Apr 2021 17:37:53 +0000 (10:37 -0700)]
scm: optimize put_cmsg()

Calling two copy_to_user() for very small regions has very high overhead.

Switch to inlined unsafe_put_user() to save one stac/clac sequence,
and avoid copy_to_user().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: convert to schedule_work()
Yangbo Lu [Thu, 15 Apr 2021 05:34:55 +0000 (13:34 +0800)]
enetc: convert to schedule_work()

Convert system_wq queue_work() to schedule_work() which is
a wrapper around it, since the former is a rare construct.

Fixes: 7294380c5211 ("enetc: support PTP Sync packet one-step timestamping")
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'hns3-next'
David S. Miller [Thu, 15 Apr 2021 23:51:29 +0000 (16:51 -0700)]
Merge branch 'hns3-next'

Huazhong Tan says:

====================
net: hns3: updates for -next

This series adds support for pushing link status to VFs for
the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: VF not request link status when PF support push link status feature
Guangbin Huang [Thu, 15 Apr 2021 02:20:39 +0000 (10:20 +0800)]
net: hns3: VF not request link status when PF support push link status feature

To reduce the processing of unnecessary mailbox command when PF supports
actively push its link status to VFs, VFs stop sending request link
status command in periodic service task in this case.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: PF add support for pushing link status to VFs
Guangbin Huang [Thu, 15 Apr 2021 02:20:38 +0000 (10:20 +0800)]
net: hns3: PF add support for pushing link status to VFs

Previously, VF updates its link status every second by send query command
to PF in periodic service task. If link stats of PF is changed, VF may
need at most one second to update its link status.

To reduce delay of link status between PF and VFs, PF actively push its
link status to VFs when its link status is updated. And to let VF know
PF supports this new feature, the link status changed mailbox command
adds one bit to indicate it.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: at803x: select correct page on config init
David Bauer [Thu, 15 Apr 2021 01:26:50 +0000 (03:26 +0200)]
net: phy: at803x: select correct page on config init

The Atheros AR8031 and AR8033 expose different registers for SGMII/Fiber
as well as the copper side of the PHY depending on the BT_BX_REG_SEL bit
in the chip configure register.

The driver assumes the copper side is selected on probe, but this might
not be the case depending which page was last selected by the
bootloader. Notably, Ubiquiti UniFi bootloaders show this behavior.

Select the copper page when probing to circumvent this.

Signed-off-by: David Bauer <mail@david-bauer.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
David S. Miller [Thu, 15 Apr 2021 23:45:14 +0000 (16:45 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-04-14

This series contains updates to ice driver only.

Bruce changes and removes open coded values to instead use existing
kernel defines and suppresses false cppcheck issues.

Ani adds new VSI states to track netdev allocation and registration. He
also removes leading underscores in the ice_pf_state enum.

Jesse refactors ITR by introducing helpers to reduce duplicated code and
structures to simplify checking of ITR mode. He also triggers a software
interrupt when exiting napi poll or busy-poll to ensure all work is
processed. Modifies /proc/iomem to display driver name instead of PCI
address. He also changes the checks of vsi->type to use a local variable
in ice_vsi_rebuild() and removes an unneeded struct member.

Jake replaces the driver's adaptive interrupt moderation algorithm to
use the kernel's DIM library implementation.

Scott reworks module reads to reduce the number of reads needed and
remove excessive increment of QSFP page.

Brett sets the vsi->vf_id to invalid for non-VF VSIs.

Paul removes the return value from ice_vsi_manage_rss_lut() as it's not
communicating anything critical. He also reduces the scope of a
variable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoice: reduce scope of variable
Paul M Stillwell Jr [Wed, 31 Mar 2021 21:17:08 +0000 (14:17 -0700)]
ice: reduce scope of variable

The scope of this variable can be reduced so do that.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: remove return variable
Paul M Stillwell Jr [Wed, 31 Mar 2021 21:17:07 +0000 (14:17 -0700)]
ice: remove return variable

We were saving the return value from ice_vsi_manage_rss_lut(), but
the errors from that function are not critical so change it to
return void and remove the code that saved the value.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: suppress false cppcheck issues
Bruce Allan [Wed, 31 Mar 2021 21:17:05 +0000 (14:17 -0700)]
ice: suppress false cppcheck issues

Silence false errors, warnings and style issues reported by cppcheck.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: Set vsi->vf_id as ICE_INVAL_VFID for non VF VSI types
Brett Creeley [Wed, 31 Mar 2021 21:17:04 +0000 (14:17 -0700)]
ice: Set vsi->vf_id as ICE_INVAL_VFID for non VF VSI types

Currently the vsi->vf_id is set only for ICE_VSI_VF and it's left as 0
for all other VSI types. This is confusing and could be problematic
since 0 is a valid vf_id. Fix this by always setting non VF VSI types to
ICE_INVAL_VFID.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: remove unused struct member
Jesse Brandeburg [Wed, 31 Mar 2021 21:17:03 +0000 (14:17 -0700)]
ice: remove unused struct member

The only time you can ever have a rq_last_status is if
a firmware event was somehow reporting a status on the receive
queue, which are generally firmware initiated events or
mailbox messages from a VF.  Mostly this struct member was unused.

Fix this problem by still printing the value of the field in a debug
print, but don't store the value forever in a struct, potentially
creating opportunities for callers to use the wrong struct member.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: use local for consistency
Jesse Brandeburg [Wed, 31 Mar 2021 21:17:02 +0000 (14:17 -0700)]
ice: use local for consistency

Do a minor refactor on ice_vsi_rebuild to use a local
variable to store vsi->type.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: print name in /proc/iomem
Jesse Brandeburg [Wed, 31 Mar 2021 21:17:01 +0000 (14:17 -0700)]
ice: print name in /proc/iomem

The driver previously printed it's PCI address in
the name field for the pci resource, which when displayed
via /proc/iomem, would print the same thing twice.

It's more useful for debugging to see the driver name, as
most other modules do.

Here's a diff of before and after this change:
     99100000-991fffff : 0000:3b:00.1
   9a000000-a04fffff : PCI Bus 0000:3b
     9a000000-9bffffff : 0000:3b:00.1
-      9a000000-9bffffff : 0000:3b:00.1
+      9a000000-9bffffff : ice
     9c000000-9dffffff : 0000:3b:00.0
-      9c000000-9dffffff : 0000:3b:00.0
+      9c000000-9dffffff : ice
     9e000000-9effffff : 0000:3b:00.1
     9f000000-9fffffff : 0000:3b:00.0
     a0000000-a000ffff : 0000:3b:00.1

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: Reimplement module reads used by ethtool
Scott W Taylor [Wed, 31 Mar 2021 21:17:00 +0000 (14:17 -0700)]
ice: Reimplement module reads used by ethtool

There was an excessive increment of the QSFP page, which is
now fixed. Additionally, this new update now reads 8 bytes
at a time and will retry each request if the module/bus is
busy.

Also, prevent reading from upper pages if module does not
support those pages.

Signed-off-by: Scott W Taylor <scott.w.taylor@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: refactor ITR data structures
Jesse Brandeburg [Wed, 31 Mar 2021 21:16:59 +0000 (14:16 -0700)]
ice: refactor ITR data structures

Use a dedicated bitfield in order to both increase
the amount of checking around the length of ITR writes
as well as simplify the checks of dynamic mode.

Basically unpack the "high bit means dynamic" logic
into bitfields.

Also, remove some unused ITR defines.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: manage interrupts during poll exit
Jesse Brandeburg [Wed, 31 Mar 2021 21:16:58 +0000 (14:16 -0700)]
ice: manage interrupts during poll exit

The driver would occasionally miss that there were outstanding
descriptors to clean when exiting busy/napi poll. This issue has
been in the code since the introduction of the ice driver.

Attempt to "catch" any remaining work by triggering a software
interrupt when exiting napi poll or busy-poll. This will not
cause extra interrupts in the case of normal execution.

This issue was found when running sfnt-pingpong, with busy
poll enabled, and typically with larger I/O sizes like > 8192,
the program would occasionally report > 1 second maximums
to complete a ping pong.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: replace custom AIM algorithm with kernel's DIM library
Jacob Keller [Wed, 31 Mar 2021 21:16:57 +0000 (14:16 -0700)]
ice: replace custom AIM algorithm with kernel's DIM library

The ice driver has support for adaptive interrupt moderation, an
algorithm for tuning the interrupt rate dynamically. This algorithm
is based on various assumptions about ring size, socket buffer size,
link speed, SKB overhead, ethernet frame overhead and more.

The Linux kernel has support for a dynamic interrupt moderation
algorithm known as "dimlib". Replace the custom driver-specific
implementation of dynamic interrupt moderation with the kernel's
algorithm.

The Intel hardware has a different hardware implementation than the
originators of the dimlib code had to work with, which requires the
driver to use a slightly different set of inputs for the actual
moderation values, while getting all the advice from dimlib of
better/worse, shift left or right.

The change made for this implementation is to use a pair of values
for each of the 5 "slots" that the dimlib moderation expects, and
the driver will program those pairs when dimlib recommends a slot to
use. The currently implementation uses two tables, one for receive
and one for transmit, and the pairs of values in each slot set the
maximum delay of an interrupt and a maximum number of interrupts per
second (both expressed in microseconds).

There are two separate kinds of bugs fixed by using DIMLIB, one is
UDP single stream send was too slow, and the other is that 8K
ping-pong was going to the most aggressive moderation and has much
too high latency.

The overall result of using DIMLIB is that we meet or exceed our
performance expectations set based on the old algorithm.

Co-developed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: refactor interrupt moderation writes
Jesse Brandeburg [Wed, 31 Mar 2021 21:16:56 +0000 (14:16 -0700)]
ice: refactor interrupt moderation writes

Introduce several new helpers for writing ITR and GLINT_RATE
registers, and refactor the code calling them.  This resulted
in removal of several duplicate functions and rolled a bunch
of simple code back into the calling routines.

In particular this removes some code that was doing both
a store and a set in a helper function, which seems better
done as separate tasks in the caller (and generally takes
less lines of code even with a tiny bit of repetition).

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: Add new VSI states to track netdev alloc/registration
Anirudh Venkataramanan [Tue, 2 Mar 2021 18:15:41 +0000 (10:15 -0800)]
ice: Add new VSI states to track netdev alloc/registration

Add two new VSI states, one to track if a netdev for the VSI has been
allocated and the other to track if the netdev has been registered.
Call unregister_netdev/free_netdev only when the corresponding state
bits are set.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: Drop leading underscores in enum ice_pf_state
Anirudh Venkataramanan [Tue, 2 Mar 2021 18:15:38 +0000 (10:15 -0800)]
ice: Drop leading underscores in enum ice_pf_state

Remove the leading underscores in enum ice_pf_state. This is not really
communicating anything and is unnecessary. No functional change.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoice: use kernel definitions for IANA protocol ports and ether-types
Bruce Allan [Tue, 2 Mar 2021 18:12:06 +0000 (10:12 -0800)]
ice: use kernel definitions for IANA protocol ports and ether-types

The well-known IANA protocol port 3260 (iscsi-target 0x0cbc) and the
ether-types 0x8906 (ETH_P_FCOE) and 0x8914 (ETH_P_FIP) are already defined
in kernel header files.  Use those definitions instead of open-coding the
same.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoMerge tag 'linux-can-next-for-5.13-20210414' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Wed, 14 Apr 2021 21:37:02 +0000 (14:37 -0700)]
Merge tag 'linux-can-next-for-5.13-20210414' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-04-14

this is a pull request of a single patch for net-next/master.

Vincent Mailhol's patch fixes a NULL pointer dereference when handling
error frames in the etas_es58x driver, which has been added in the
previous PR.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/packet: remove data races in fanout operations
Eric Dumazet [Wed, 14 Apr 2021 19:36:44 +0000 (12:36 -0700)]
net/packet: remove data races in fanout operations

af_packet fanout uses RCU rules to ensure f->arr elements
are not dismantled before RCU grace period.

However, it lacks rcu accessors to make sure KCSAN and other tools
wont detect data races. Stupid compilers could also play games.

Fixes: dc99f600698d ("packet: Add fanout support.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: "Gong, Sishuai" <sishuai@purdue.edu>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: bridge: propagate error code and extack from br_mc_disabled_update
Florian Fainelli [Wed, 14 Apr 2021 19:22:57 +0000 (22:22 +0300)]
net: bridge: propagate error code and extack from br_mc_disabled_update

Some Ethernet switches might only be able to support disabling multicast
snooping globally, which is an issue for example when several bridges
span the same physical device and request contradictory settings.

Propagate the return value of br_mc_disabled_update() such that this
limitation is transmitted correctly to user-space.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'mlx5-updates-2021-04-13' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Wed, 14 Apr 2021 21:14:03 +0000 (14:14 -0700)]
Merge tag 'mlx5-updates-2021-04-13' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-04-13

mlx5 core and netdev driver updates

1) E-Switch updates from Parav,
  1.1) Devlink parameter to control mlx5 metadata enablement for E-Switch
  1.2) Trivial cleanups for E-Switch code
  1.3) Dynamically allocate vport steering namespaces only when required

2) From Jianbo, Use variably sized data structures for Software steering

3) Several minor cleanups
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: enetc: fetch MAC address from device tree
Michael Walle [Wed, 14 Apr 2021 14:48:14 +0000 (16:48 +0200)]
net: enetc: fetch MAC address from device tree

Normally, the bootloader will already initialize the MAC address
registers of the ENETC and the driver will just use them or generate a
random one, if it is not initialized.

Add a new way to provide the MAC address: via device tree. Besides the
usual 'mac-address' property, there is also the possibility to fetch it
via a NVMEM provider. The sl28 board stores the MAC address in the SPI
NOR flash OTP region. Having this will allow linux to fetch the MAC
address from there without being dependent on the bootloader.

No in-tree boards have the device tree properties set, thus for these,
this is a no-op.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosfc: Remove duplicate argument
Wan Jiabing [Wed, 14 Apr 2021 11:06:45 +0000 (19:06 +0800)]
sfc: Remove duplicate argument

Fix the following coccicheck warning:

./drivers/net/ethernet/sfc/enum.h:80:7-28: duplicated argument to |

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoskbuff: revert "skbuff: remove some unnecessary operation in skb_segment_list()"
Paolo Abeni [Wed, 14 Apr 2021 10:48:48 +0000 (12:48 +0200)]
skbuff: revert "skbuff: remove some unnecessary operation in skb_segment_list()"

the commit 1ddc3229ad3c ("skbuff: remove some unnecessary operation
in skb_segment_list()") introduces an issue very similar to the
one already fixed by commit 53475c5dd856 ("net: fix use-after-free when
UDP GRO with shared fraglist").

If the GSO skb goes though skb_clone() and pskb_expand_head() before
entering skb_segment_list(), the latter  will unshare the frag_list
skbs and will release the old list. With the reverted commit in place,
when skb_segment_list() completes, skb->next points to the just
released list, and later on the kernel will hit UaF.

Note that since commit e0e3070a9bc9 ("udp: properly complete L4 GRO
over UDP tunnel packet") the critical scenario can be reproduced also
receiving UDP over vxlan traffic with:

NIC (NETIF_F_GRO_FRAGLIST enabled) -> vxlan -> UDP sink

Attaching a packet socket to the NIC will cause skb_clone() and the
tunnel decapsulation will call pskb_expand_head().

Fixes: 1ddc3229ad3c ("skbuff: remove some unnecessary operation in skb_segment_list()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoatm: idt77252: remove unused function
Jiapeng Chong [Wed, 14 Apr 2021 07:13:39 +0000 (15:13 +0800)]
atm: idt77252: remove unused function

Fix the following clang warning:

drivers/atm/idt77252.c:1787:1: warning: unused function
'idt77252_fbq_level' [-Wunused-function].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec...
David S. Miller [Wed, 14 Apr 2021 20:15:12 +0000 (13:15 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2021-04-14

Not much this time:

1) Simplification of some variable calculations in esp4 and esp6.
   From Jiapeng Chong and Junlin Yang.

2) Fix a clang Wformat warning in esp6 and ah6.
   From Arnd Bergmann.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agor8169: add support for pause ethtool ops
Heiner Kallweit [Wed, 14 Apr 2021 06:23:15 +0000 (08:23 +0200)]
r8169: add support for pause ethtool ops

This adds support for the [g|s]et_pauseparam ethtool ops. It considers
that the chip doesn't support pause frame use in jumbo mode.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
David S. Miller [Wed, 14 Apr 2021 19:59:41 +0000 (12:59 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
10GbE Intel Wired LAN Driver Updates 2021-04-13

This series contains updates to ixgbe and ixgbevf driver.

Jostar Yang adds support for BCM54616s PHY for ixgbe.

Chen Lin removes an unused function pointer for ixgbe and ixgbevf.

Bhaskar Chowdhury fixes a typo in ixgbe.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Add support for external trigger timestamping
Tan Tee Min [Wed, 14 Apr 2021 00:16:17 +0000 (08:16 +0800)]
net: stmmac: Add support for external trigger timestamping

The Synopsis MAC controller supports auxiliary snapshot feature that
allows user to store a snapshot of the system time based on an external
event.

This patch add supports to the above mentioned feature. Users will be
able to triggered capturing the time snapshot from user-space using
application such as testptp or any other applications that uses the
PTP_EXTTS_REQUEST ioctl request.

Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Tan Tee Min <tee.min.tan@intel.com>
Co-developed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'marvell-88x2222-improvements'
David S. Miller [Wed, 14 Apr 2021 19:56:44 +0000 (12:56 -0700)]
Merge branch 'marvell-88x2222-improvements'

Ivan Bornyakov says:

====================
net: phy: marvell-88x2222: a couple of improvements

First, there are some SFP modules that only uses RX_LOS for link
indication. Add check that link is operational before actual read of
line-side status.

Second, it is invalid to set 10G speed without autonegotiation,
according to phy_ethtool_ksettings_set(). Implement switching between
10GBase-R and 1000Base-X/SGMII if autonegotiation can't complete but
there is signal in line.

Changelog:
  v1 -> v2:
    * make checking that link is operational more friendly for
      trancievers without SFP cages.
    * split swapping 1G/10G modes into non-functional and functional
      commits for the sake of easier review.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: marvell-88x2222: swap 1G/10G modes on autoneg
Ivan Bornyakov [Tue, 13 Apr 2021 20:54:52 +0000 (23:54 +0300)]
net: phy: marvell-88x2222: swap 1G/10G modes on autoneg

Setting 10G without autonegotiation is invalid according to
phy_ethtool_ksettings_set(). Thus, we need to set it during
autonegotiation.

If 1G autonegotiation can't complete for quite a time, but there is
signal in line, switch line interface type to 10GBase-R, if supported,
in hope for link to be established.

And vice versa. If 10GBase-R link can't be established for quite a time,
and autonegotiation is enabled, and there is signal in line, switch line
interface type to appropriate 1G mode, i.e. 1000Base-X or SGMII, if
supported.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: marvell-88x2222: move read_status after config_aneg
Ivan Bornyakov [Tue, 13 Apr 2021 20:54:51 +0000 (23:54 +0300)]
net: phy: marvell-88x2222: move read_status after config_aneg

No functional changes, just move read link status routines below
autonegotiation configuration to make future functional changes more
distinct.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: marvell-88x2222: check that link is operational
Ivan Bornyakov [Tue, 13 Apr 2021 20:54:50 +0000 (23:54 +0300)]
net: phy: marvell-88x2222: check that link is operational

Some SFP modules uses RX_LOS for link indication. In such cases link
will be always up, even without cable connected. RX_LOS changes will
trigger link_up()/link_down() upstream operations. Thus, check that SFP
link is operational before actual read link status.

If there is no SFP cage connected to the tranciever, check only PMD
Recieve Signal Detect register.

Signed-off-by: Ivan Bornyakov <i.bornyakov@metrotek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/mlx5e: Fix RQ creation flow for queues which doesn't support XDP
Aya Levin [Wed, 7 Apr 2021 11:09:05 +0000 (14:09 +0300)]
net/mlx5e: Fix RQ creation flow for queues which doesn't support XDP

Allow to create an RQ which is not registered as an XDP RQ. For example:
the trap-RQ doesn't register as an XDP RQ.

Fixes: 869c5f926247 ("net/mlx5e: Generalize open RQ")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Replace spaces with tab at the start of a line
Wenpeng Liang [Thu, 1 Apr 2021 13:07:43 +0000 (21:07 +0800)]
net/mlx5: Replace spaces with tab at the start of a line

There should be no spaces at the start of the line.

Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Remove return statement exist at the end of void function
Wenpeng Liang [Thu, 1 Apr 2021 13:07:42 +0000 (21:07 +0800)]
net/mlx5: Remove return statement exist at the end of void function

void function return statements are not generally useful.

Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Add a blank line after declarations
Wenpeng Liang [Thu, 1 Apr 2021 13:07:41 +0000 (21:07 +0800)]
net/mlx5: Add a blank line after declarations

There should be a blank lines after declarations.

Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: Fix bit-wise and with zero
Colin Ian King [Tue, 6 Apr 2021 16:53:46 +0000 (17:53 +0100)]
net/mlx5: Fix bit-wise and with zero

The bit-wise and of the action field with MLX5_ACCEL_ESP_ACTION_DECRYPT
is incorrect as MLX5_ACCEL_ESP_ACTION_DECRYPT is zero and not intended
to be a bit-flag. Fix this by using the == operator as was originally
intended.

Addresses-Coverity: ("Logically dead code")
Fixes: 7dfee4b1d79e ("net/mlx5: IPsec, Refactor SA handle creation and destruction")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: DR, Alloc cmd buffer with kvzalloc() instead of kzalloc()
Roi Dayan [Mon, 15 Mar 2021 08:22:51 +0000 (10:22 +0200)]
net/mlx5: DR, Alloc cmd buffer with kvzalloc() instead of kzalloc()

The cmd size is 8K so use kvzalloc().

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: DR, Use variably sized data structures for different actions
Jianbo Liu [Thu, 12 Nov 2020 01:32:52 +0000 (01:32 +0000)]
net/mlx5: DR, Use variably sized data structures for different actions

mlx5dr_action is a generally used data structure, and there is an
union for different types of actions in it. The size of mlx5dr_action
is about 72 bytes, but for those actions with fewer fields, most of
the allocated memory is wasted.
Remove this union, and mlx5dr_action becomes a generic action header.
Then actions are dynamically allocated with needed memory, the data
for each action is stored right after the header.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, Reuse stored hardware function id
Parav Pandit [Fri, 5 Mar 2021 05:38:02 +0000 (07:38 +0200)]
net/mlx5: SF, Reuse stored hardware function id

SF's hardware function id is already stored in mlx5_sf. Reuse it,
instead of querying the hw table.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: SF, Use device pointer directly
Parav Pandit [Fri, 5 Mar 2021 08:11:21 +0000 (10:11 +0200)]
net/mlx5: SF, Use device pointer directly

At many places in the code, device pointer is directly available. Make
use of it, instead of accessing it from the table.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Initialize eswitch acls ns when eswitch is enabled
Parav Pandit [Mon, 1 Mar 2021 12:12:13 +0000 (14:12 +0200)]
net/mlx5: E-Switch, Initialize eswitch acls ns when eswitch is enabled

Currently eswitch flow steering (FS) namespace of vport's ingress and
egress ACL are enabled when FS layer is initialized. This is done even
when eswitch is diabled. This demands that total eswitch ports to be
known to FS layer without eswitch in use.

Given the FS core is not dependent on eswitch, make namespace init and
cleanup routines as helper routines to be invoked only when eswitch is
needed.

With this change, ingress and egress ACL namespaces are created only
when eswitch legacy/offloads mode is enabled.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Move legacy code to a individual file
Parav Pandit [Fri, 26 Feb 2021 12:30:50 +0000 (14:30 +0200)]
net/mlx5: E-Switch, Move legacy code to a individual file

Currently eswitch offers two modes. Legacy and offloads.
Offloads code is already in its own file eswitch_offloads.c

However eswitch.c contains the eswitch legacy code and common
infrastructure  code.

To enable future extensions and to better manage generic common eswitch
infrastructure code, move the legacy code to its own legacy.c file.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Convert a macro to a helper routine
Parav Pandit [Fri, 26 Feb 2021 12:36:22 +0000 (14:36 +0200)]
net/mlx5: E-Switch, Convert a macro to a helper routine

Convert ESW_ALLOWED macro to a helper routine so that it can be used in
other eswitch files.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch Make cleanup sequence mirror of init
Parav Pandit [Tue, 2 Mar 2021 19:39:01 +0000 (21:39 +0200)]
net/mlx5: E-Switch Make cleanup sequence mirror of init

Make cleanup sequence mirror of init sequence for cleaning up reps
and freeing vports.

Also when reps initialization fails, there is no need to perform reps
cleanup.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Make vport number u16
Parav Pandit [Tue, 2 Mar 2021 19:27:47 +0000 (21:27 +0200)]
net/mlx5: E-Switch, Make vport number u16

Vport number is 16-bit field in hardware. Make it u16.

Move location of vport in the structure so that it reduces a hole
in the structure.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, Skip querying SF enabled bits
Parav Pandit [Mon, 8 Mar 2021 09:50:40 +0000 (11:50 +0200)]
net/mlx5: E-Switch, Skip querying SF enabled bits

With vhca events, SF state is queried through the VHCA events. Device no
longer expects SF bitmap in the query eswitch functions command.

Hence, remove it to simplify the code.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
3 years agonet/mlx5: E-Switch, let user to enable disable metadata
Parav Pandit [Fri, 30 Oct 2020 20:44:16 +0000 (22:44 +0200)]
net/mlx5: E-Switch, let user to enable disable metadata

Currently each packet inserted in eswitch is tagged with a internal
metadata to indicate source vport. Metadata tagging is not always
needed. Metadata insertion is needed for multi-port RoCE, failover
between representors and stacked devices. In many other cases,
metadata enablement is not needed.

Metadata insertion slows down the packet processing rate of the E-switch
when it is in switchdev mode.

Below table show performance gain with metadata disabled for VXLAN
offload rules in both SMFS and DMFS steering mode on ConnectX-5 device.

----------------------------------------------
| steering | metadata | pkt size | rx pps    |
| mode     |          |          | (million) |
----------------------------------------------
| smfs     | disabled | 128Bytes | 42        |
----------------------------------------------
| smfs     | enabled  | 128Bytes | 36        |
----------------------------------------------
| dmfs     | disabled | 128Bytes | 42        |
----------------------------------------------
| dmfs     | enabled  | 128Bytes | 36        |
----------------------------------------------

Hence, allow user to disable metadata using driver specific devlink
parameter. Metadata setting of the eswitch is applicable only for the
switchdev mode.

Example to show and disable metadata before changing eswitch mode:
$ devlink dev param show pci/0000:06:00.0 name esw_port_metadata
pci/0000:06:00.0:
  name esw_port_metadata type driver-specific
    values:
      cmode runtime value true

$ devlink dev param set pci/0000:06:00.0 \
  name esw_port_metadata value false cmode runtime

$ devlink dev eswitch set pci/0000:06:00.0 mode switchdev

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
changelog:
v1->v2:
 - added performance numbers in commit log
 - updated commit log and documentation for switchdev mode
 - added explicit note on when user can disable metadata in
   documentation

3 years agocan: etas_es58x: fix null pointer dereference when handling error frames
Vincent Mailhol [Tue, 13 Apr 2021 11:42:42 +0000 (20:42 +0900)]
can: etas_es58x: fix null pointer dereference when handling error frames

During the handling of CAN bus errors, a CAN error SKB is allocated
using alloc_can_err_skb(). Even if the allocation of the SKB fails,
the function continues in order to do the stats handling.

All access to the can_frame pointer (cf) should be guarded by an if
statement:
if (cf)

However, the increment of the rx_bytes stats:
netdev->stats.rx_bytes += cf->can_dlc;
dereferences the cf pointer and was not guarded by an if condition
leading to a NULL pointer dereference if the can_err_skb() function
failed.

Replacing the cf->can_dlc by the macro CAN_ERR_DLC (which is the
length of any CAN error frames) solves this NULL pointer dereference.

Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN USB interfaces")
Link: https://lore.kernel.org/r/20210413114242.2760-1-mailhol.vincent@wanadoo.fr
Reported-by: Arunachalam Santhanam <arunachalam.santhanam@in.bosch.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
3 years agonet: ethernet: intel: Fix a typo in the file ixgbe_dcb_nl.c
Bhaskar Chowdhury [Wed, 17 Mar 2021 10:00:01 +0000 (15:30 +0530)]
net: ethernet: intel: Fix a typo in the file ixgbe_dcb_nl.c

s/Reprogam/Reprogram/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agonet: intel: Remove unused function pointer typedef ixgbe_mc_addr_itr
Chen Lin [Mon, 15 Feb 2021 12:04:58 +0000 (20:04 +0800)]
net: intel: Remove unused function pointer typedef ixgbe_mc_addr_itr

Remove the 'ixgbe_mc_addr_itr' typedef as it is not used.

Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoixgbe: Support external GBE SerDes PHY BCM54616s
Jostar Yang [Thu, 12 Nov 2020 20:33:56 +0000 (21:33 +0100)]
ixgbe: Support external GBE SerDes PHY BCM54616s

The Broadcom PHY is used in switches, so add the ID, and hook it up.

This upstreams the Linux kernel patch from the network operating system
SONiC from February 2020 [1].

[1]: https://github.com/Azure/sonic-linux-kernel/pull/122

Signed-off-by: Jostar Yang <jostar_yang@accton.com>
Signed-off-by: Guohan Lu <lguohan@gmail.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agonet: Space: remove hp100 probe
Arnd Bergmann [Tue, 13 Apr 2021 14:16:17 +0000 (16:16 +0200)]
net: Space: remove hp100 probe

The driver was removed last year, but the static initialization got left
behind by accident.

Fixes: a10079c66290 ("staging: remove hp100 driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'dpaa2-switch-tc-hw-offload'
David S. Miller [Tue, 13 Apr 2021 22:12:19 +0000 (15:12 -0700)]
Merge branch 'dpaa2-switch-tc-hw-offload'

Ioana Ciornei says:

====================
dpaa2-switch: add tc hardware offload on ingress traffic

This patch set adds tc hardware offload on ingress traffic in
dpaa2-switch. The cls flower and matchall classifiers are supported
using the same ACL infrastructure supported by the dpaa2-switch.

The first patch creates a new structure to hold all the necessary
information related to an ACL table. This structure is used in the next
patches to create a link between each switch port and the table used.
Multiple ports can share the same ACL table when they also share the
ingress tc block. Also, some small changes in the priority of the
default STP trap is done in the second patch.

The support for cls flower is added in the 3rd patch, while the 4th
one builds on top of the infrastructure put in place and adds cls
matchall support.

The following flow keys are supported:
 - Ethernet: dst_mac/src_mac
 - IPv4: dst_ip/src_ip/ip_proto/tos
 - VLAN: vlan_id/vlan_prio/vlan_tpid/vlan_dei
 - L4: dst_port/src_port

Each filter can support only one action from the following list:
 - drop
 - mirred egress redirect
 - trap

With the last patch, we reuse the dpaa2_switch_acl_entry_add() function
added previously instead of open-coding the install of a new ACL entry
into the table.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-switch: reuse dpaa2_switch_acl_entry_add() for STP frames trap
Ioana Ciornei [Tue, 13 Apr 2021 13:24:48 +0000 (16:24 +0300)]
dpaa2-switch: reuse dpaa2_switch_acl_entry_add() for STP frames trap

Since we added the dpaa2_switch_acl_entry_add() function in the previous
patches to hide all the details of actually adding the ACL entry by
issuing a firmware command, let's use it also for adding a CPU trap for
the STP frames.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-switch: add tc matchall filter support
Ioana Ciornei [Tue, 13 Apr 2021 13:24:47 +0000 (16:24 +0300)]
dpaa2-switch: add tc matchall filter support

Add support TC_SETUP_CLSMATCHALL by using the same ACL table entries
framework as for tc flower. Adding a matchall rule is done by installing
an entry which has a mask of all zeroes, thus matching on any packet.

This can be used as a catch-all type of rule if used correctly, ie the
priority of the matchall filter should be kept as the lowest one in the
entire filter block.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-switch: add tc flower hardware offload on ingress traffic
Ioana Ciornei [Tue, 13 Apr 2021 13:24:46 +0000 (16:24 +0300)]
dpaa2-switch: add tc flower hardware offload on ingress traffic

This patch adds support for tc flower hardware offload on the ingress
path. Shared filter blocks are supported by sharing a single ACL table
between multiple ports.

The following flow keys are supported:
 - Ethernet: dst_mac/src_mac
 - IPv4: dst_ip/src_ip/ip_proto/tos
 - VLAN: vlan_id/vlan_prio/vlan_tpid/vlan_dei
 - L4: dst_port/src_port

As per flow actions, the following are supported:
 - drop
 - mirred egress redirect
 - trap
Each ACL entry (filter) can be setup with only one of the listed
actions.

A sorted single linked list is used to keep the ACL entries by their
order of priority. When adding a new filter, this enables us to quickly
ascertain if the new entry has the highest priority of the entire block
or if we should make some space in the ACL table by increasing the
priority of the filters already in the table.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-switch: install default STP trap rule with the highest priority
Ioana Ciornei [Tue, 13 Apr 2021 13:24:45 +0000 (16:24 +0300)]
dpaa2-switch: install default STP trap rule with the highest priority

Change the default ACL trap rule for STP frames to have the highest
priority.

In the same ACL table will reside both default rules added by the driver
for its internal use as well as rules added with tc flower.  In this
case, the default rules such as the STP one that we already have should
have the highest priority.

Also, remove the check for a full ACL table since we already know that
it's sized so that we don't hit this case.  The last thing changes is
that default trap filters will not be counted in the acl_tbl's num_rules
variable since their number doesn't change.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-switch: create a central dpaa2_switch_acl_tbl structure
Ioana Ciornei [Tue, 13 Apr 2021 13:24:44 +0000 (16:24 +0300)]
dpaa2-switch: create a central dpaa2_switch_acl_tbl structure

Introduce a new structure - dpaa2_switch_acl_tbl - to hold all data
related to an ACL table: number of rules added, ACL table id, etc.
This will be used more in the next patches when adding support for
sharing an ACL table between ports.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: return -EFAULT if copy_to_user() fails
Dan Carpenter [Tue, 13 Apr 2021 10:47:59 +0000 (13:47 +0300)]
ionic: return -EFAULT if copy_to_user() fails

The copy_to_user() function returns the number of bytes that it wasn't
able to copy.  We want to return -EFAULT to the user.

Fixes: fee6efce565d ("ionic: add hw timestamp support files")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'stmmac-xdp-zc'
David S. Miller [Tue, 13 Apr 2021 22:06:51 +0000 (15:06 -0700)]
Merge branch 'stmmac-xdp-zc'

Ong Boon Leong says:

====================
stmmac: add XDP ZC support

This is the v2 patch series to add XDP ZC support to stmmac driver.

Summary of v2 patch change:-

6/7: fix synchronize_rcu() is called stmmac_disable_all_queues() that is
     used by ndo_setup_tc().

 ########################################################################

Continuous burst traffics are generated by pktgen script and in the midst
of each packet processing operation by xdpsock the following tc-loop.sh
script is looped continuously:-

 #!/bin/bash
 tc qdisc del dev eth0 parent root
 tc qdisc add dev eth0 ingress
 tc qdisc add dev eth0 root mqprio num_tc 4 map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 1@1 1@2 1@3 hw 0
 tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 0 hw_tc 0
 tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 1 hw_tc 1
 tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 2 hw_tc 2
 tc filter add dev eth0 parent ffff: protocol 802.1Q flower vlan_prio 3 hw_tc 3
 tc qdisc list dev eth0
 tc filter show dev eth0 ingress

 On different ssh terminal
 $ while true; do ./tc-loop.sh; sleep 1; done

The v2 patch series have been tested using the xdpsock app:
 $ ./xdpsock -i eth0 -l -z

From xdpsock poller pps report and dmesg, we don't find any warning
related to rcu and the only difference when the script is executed is
the pps rate drops momentarily.

 sock0@eth0:0 l2fwd xdp-drv
                   pps            pkts           1.00
rx                 436347         191361334
tx                 436411         191361334

 sock0@eth0:0 l2fwd xdp-drv
                   pps            pkts           1.00
rx                 254117         191615476
tx                 254053         191615412

 sock0@eth0:0 l2fwd xdp-drv
                   pps            pkts           1.00
rx                 466395         192081924
tx                 466395         192081860

 sock0@eth0:0 l2fwd xdp-drv
                   pps            pkts           1.00
rx                 287410         192369365
tx                 287474         192369365

 sock0@eth0:0 l2fwd xdp-drv
                   pps            pkts           1.00
rx                 395853         192765329
tx                 395789         192765265

 sock0@eth0:0 l2fwd xdp-drv
                   pps            pkts           1.00
rx                 466132         193231514
tx                 466132         193231450

 ########################################################################

Based on the above result, the fix looks promising. Appreciate that if
community can help to review the patch series and provide me feedback
for improvement.
====================

3 years agonet: stmmac: Add TX via XDP zero-copy socket
Ong Boon Leong [Tue, 13 Apr 2021 09:36:26 +0000 (17:36 +0800)]
net: stmmac: Add TX via XDP zero-copy socket

We add the support of XDP ZC TX submission and cleaning into
stmmac_tx_clean(). The function is made to clean as many TX complete
frames as possible, i.e. limit by priv->dma_tx_size instead of NAPI
budget. For TX ring that is associated with XSK pool, the function
stmmac_xdp_xmit_zc() is introduced to TX frame buffers from XSK pool by
using xsk_tx_peek_desc(). To make stmmac_tx_clean() support the cleaning
of XSK TX frames, STMMAC_TXBUF_T_XSK_TX TX buffer type is introduced.

As stmmac_tx_clean() uses the return value to cue whether NAPI function
should continue to poll, we augment the caller of stmmac_tx_clean() to
pass NAPI budget instead of priv->dma_tx_size through 'budget' input and
made stmmac_tx_clean() to always clean up-to the TX ring size instead.
This allows us to use the return boolean status of stmmac_xdp_xmit_zc()
to decide if XSK TX work is done or not: If true, set 'xmits' to return
'budget - 1' so that NAPI poll may exit. Else, set 'xmits' to return
'budget' to make NAPI poll continue to poll since XSK TX work is not
done. Finally, at the end of stmmac_tx_clean(), the function now take
a maximum value between 'count' and 'xmits' so that status from both
TX cleaning and XSK TX (only for XDP ZC) is considered.

This patch adds a new NAPI poll called stmmac_napi_poll_rxtx() that is
meant to be enabled/disabled for RX and TX ring that are bound to XSK
pool. This NAPI poll function starts with cleaning TX ring, then submits
XSK TX frames to TX ring before proceed to perform RX operations, i.e.
, receiving RX frames and replenishing RX ring with RX free buffers
obtained from XSK pool. Therefore, during XSK RX and TX setup, the driver
enables stmmac_napi_poll_rxtx() for RX and TX operations, then during
XSK RX and TX pool tear-down, the driver reenables the exisiting
independent NAPI poll functions accordingly: stmmac_napi_poll_rx() and
stmmac_napi_poll_tx().

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Enable RX via AF_XDP zero-copy
Ong Boon Leong [Tue, 13 Apr 2021 09:36:25 +0000 (17:36 +0800)]
net: stmmac: Enable RX via AF_XDP zero-copy

This patch adds the support for receiving packet via AF_XDP zero-copy
mechanism.

XDP ZC uses 1:1 mapping of XDP buffer to receive packet, therefore the
use of split header is not used currently. The 'xdp_buff' is declared as
union together with a struct that contains 'page', 'addr' and
'page_offset' that are associated with primary buffer.

RX buffers are now allocated either via page_pool or xsk pool. For RX
buffers from xsk_pool they are allocated and deallocated using below
functions:

 * stmmac_alloc_rx_buffers_zc(struct stmmac_priv *priv, u32 queue)
 * dma_free_rx_xskbufs(struct stmmac_priv *priv, u32 queue)

With above functions now available, we then extend the following driver
functions to support XDP ZC:
 * stmmac_reinit_rx_buffers()
 * __init_dma_rx_desc_rings()
 * init_dma_rx_desc_rings()
 * __free_dma_rx_desc_resources()

Note: stmmac_alloc_rx_buffers_zc() may return -ENOMEM due to RX XDP
buffer pool is not allocated (e.g. samples/bpf/xdpsock TX-only). But,
it is still ok to let TX XDP ZC to continue, therefore, the -ENOMEM
is silently ignored to let the driver succcessfully transition to XDP
ZC mode for the said RX and TX queue.

As XDP ZC buffer size is different, the DMA buffer size is required
to be reprogrammed accordingly for RX DMA/Queue that is populated with
XDP buffer from XSK pool.

Next, to add or remove per-queue XSK pool, stmmac_xdp_setup_pool()
will call stmmac_xdp_enable_pool() or stmmac_xdp_disable_pool()
that in-turn coordinates the tearing down and setting up RX ring via
RX buffers and descriptors removal and reallocation through
stmmac_disable_rx_queue() and stmmac_enable_rx_queue(). In addition,
stmmac_xsk_wakeup() is added to initiate XDP RX buffer replenishing
by signalling user application to add available XDP frames back to
FILL queue.

For RX processing using XDP zero-copy buffer, stmmac_rx_zc() is
introduced which is implemented with the assumption that RX split
header is disabled. For XDP verdict is XDP_PASS, the XDP buffer is
copied into a sk_buff allocated through stmmac_construct_skb_zc()
and sent to Linux network GRO inside stmmac_dispatch_skb_zc(). Free RX
buffers are then replenished using stmmac_rx_refill_zc()

v2: introduce __stmmac_disable_all_queues() to contain the original code
    that does napi_disable() and then make stmmac_setup_tc_block_cb()
    to use it. Move synchronize_rcu() into stmmac_disable_all_queues()
    that eventually calls __stmmac_disable_all_queues(). Then,
    make both stmmac_release() and stmmac_suspend() to use
    stmmac_disable_all_queues(). Thanks David Miller for spotting the
    synchronize_rcu() issue in v1 patch.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Refactor __stmmac_xdp_run_prog for XDP ZC
Ong Boon Leong [Tue, 13 Apr 2021 09:36:24 +0000 (17:36 +0800)]
net: stmmac: Refactor __stmmac_xdp_run_prog for XDP ZC

Prepare stmmac_xdp_run_prog() for AF_XDP zero-copy support which will be
added by upcoming patches by splitting out the XDP verdict processing
into __stmmac_xdp_run_prog() and it callable for XDP ZC path which does
not need to verify bpf_prog is not NULL.

The stmmac_xdp_run_prog() is used for regular XDP Rx path which requires
bpf_prog to be verified.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: rearrange RX and TX desc init into per-queue basis
Ong Boon Leong [Tue, 13 Apr 2021 09:36:23 +0000 (17:36 +0800)]
net: stmmac: rearrange RX and TX desc init into per-queue basis

Below functions are made to be per-queue in preparation of XDP ZC:

 __init_dma_rx_desc_rings(struct stmmac_priv *priv, u32 queue, gfp_t flags)
 __init_dma_tx_desc_rings(struct stmmac_priv *priv, u32 queue)

The original functions below are stay maintained for all queue usage:

 init_dma_rx_desc_rings(struct net_device *dev, gfp_t flags)
 init_dma_tx_desc_rings(struct net_device *dev)

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: refactor stmmac_init_rx_buffers for stmmac_reinit_rx_buffers
Ong Boon Leong [Tue, 13 Apr 2021 09:36:22 +0000 (17:36 +0800)]
net: stmmac: refactor stmmac_init_rx_buffers for stmmac_reinit_rx_buffers

The per-queue RX buffer allocation in stmmac_reinit_rx_buffers() can be
made to use stmmac_alloc_rx_buffers() by merging the page_pool alloc
checks for "buf->page" and "buf->sec_page" in stmmac_init_rx_buffers().

This is in preparation for XSK pool allocation later.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: introduce dma_recycle_rx_skbufs for stmmac_reinit_rx_buffers
Ong Boon Leong [Tue, 13 Apr 2021 09:36:21 +0000 (17:36 +0800)]
net: stmmac: introduce dma_recycle_rx_skbufs for stmmac_reinit_rx_buffers

Rearrange RX buffer page_pool recycling logics into dma_recycle_rx_skbufs,
so that we prepare stmmac_reinit_rx_buffers() for XSK pool expansion.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: rearrange RX buffer allocation and free functions
Ong Boon Leong [Tue, 13 Apr 2021 09:36:20 +0000 (17:36 +0800)]
net: stmmac: rearrange RX buffer allocation and free functions

This patch restructures the per RX queue buffer allocation from page_pool
to stmmac_alloc_rx_buffers().

We also rearrange dma_free_rx_skbufs() so that it can be used in
init_dma_rx_desc_rings() during freeing of RX buffer in the event of
page_pool allocation failure to replace the more efficient method earlier.
The replacement is needed to make the RX buffer alloc and free method
scalable to XDP ZC xsk_pool alloc and free later.

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ipa-SM8350-SoC'
David S. Miller [Tue, 13 Apr 2021 22:02:25 +0000 (15:02 -0700)]
Merge branch 'ipa-SM8350-SoC'

Alex Elder says:

====================
net: ipa: add support for the SM8350 SoC

This small series adds IPA driver support for the Qualcomm SM8350
SoC, which implements IPA v4.9.

The first patch updates the DT binding, and depends on a previous
patch that has already been accepted into net-next.

The second just defines the IPA v4.9 configuration data file.

(Device Tree files to support this SoC will be sent separately and
will go through the Qualcomm tree.)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: ipa: add IPA v4.9 configuration data
Alex Elder [Tue, 13 Apr 2021 16:38:26 +0000 (11:38 -0500)]
net: ipa: add IPA v4.9 configuration data

Add support for the SM8350 SoC, which includes IPA version 4.9.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: qcom,ipa: add support for SM8350
Alex Elder [Tue, 13 Apr 2021 16:38:25 +0000 (11:38 -0500)]
dt-bindings: net: qcom,ipa: add support for SM8350

Add support for "qcom,sm8350-ipa", which uses IPA v4.9.

Use "enum" rather than "oneOf/const ..." to specify compatible
strings, as suggested by Rob Herring.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: git_ts_info bit shifters
Shannon Nelson [Tue, 13 Apr 2021 17:22:16 +0000 (10:22 -0700)]
ionic: git_ts_info bit shifters

All the uses of HWTSTAMP_FILTER_* values need to be
bit shifters, not straight values.

v2: fixed subject and added Cc Dan and SoB Allen

Fixes: f8ba81da73fc ("ionic: add ethtool support for PTP")
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Allen Hubbe <allenbh@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: queue reset work in system_long_wq
Lijun Pan [Tue, 13 Apr 2021 19:33:39 +0000 (14:33 -0500)]
ibmvnic: queue reset work in system_long_wq

The reset process for ibmvnic commonly takes multiple seconds, clearly
making it inappropriate for schedule_work/system_wq. The reason to make
this change is that ibmvnic's use of the default system-wide workqueue
for a relatively long-running work item can negatively affect other
workqueue users. So, queue the relatively slow reset job to the
system_long_wq.

Suggested-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'linux-can-next-for-5.13-20210413' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Tue, 13 Apr 2021 21:53:11 +0000 (14:53 -0700)]
Merge tag 'linux-can-next-for-5.13-20210413' of git://git./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2021-04-13

this is a pull request of 14 patches for net-next/master.

The first patch is by Yoshihiro Shimoda and updates the DT bindings
for the rcar_can driver.

Vincent Mailhol contributes 3 patches that add support for several
ETAS USB CAN adapters.

The final 10 patches are by me and clean up the peak_usb CAN driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agorsi: remove unused including <linux/version.h>
Yang Li [Tue, 13 Apr 2021 09:46:12 +0000 (17:46 +0800)]
rsi: remove unused including <linux/version.h>

Fix the following versioncheck warning:
./drivers/net/wireless/rsi/rsi_91x_ps.c: 19 linux/version.h not needed.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonfc: st-nci: remove unnecessary label
wengjianfeng [Tue, 13 Apr 2021 09:45:30 +0000 (17:45 +0800)]
nfc: st-nci: remove unnecessary label

in st_nci_spi_write function, first assign a value to a variable then
goto exit label. return statement just follow the label and exit label
just used once, so we should directly return and remove exit label.

Signed-off-by: wengjianfeng <wengjianfeng@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: improve failover sysfs entry
Lijun Pan [Tue, 13 Apr 2021 08:31:44 +0000 (03:31 -0500)]
ibmvnic: improve failover sysfs entry

The current implementation relies on H_IOCTL call to issue a
H_SESSION_ERR_DETECTED command to let the hypervisor to send a failover
signal. However, it may not work if there is no backup device or if
the vnic is already in error state,
e.g., "ibmvnic 30000003 env3: rx buffer returned with rc 6".
Add a last resort, that is to schedule a failover reset via CRQ command.

Signed-off-by: Lijun Pan <lijunp213@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoicmp: ICMPV6: pass RFC 8335 reply messages to ping_rcv
Andreas Roeseler [Mon, 12 Apr 2021 21:23:56 +0000 (16:23 -0500)]
icmp: ICMPV6: pass RFC 8335 reply messages to ping_rcv

The current icmp_rcv function drops all unknown ICMP types, including
ICMP_EXT_ECHOREPLY (type 43). In order to parse Extended Echo Reply messages, we have
to pass these packets to the ping_rcv function, which does not do any
other filtering and passes the packet to the designated socket.

Pass incoming RFC 8335 ICMP Extended Echo Reply packets to the ping_rcv
handler instead of discarding the packet.

Signed-off-by: Andreas Roeseler <andreas.a.roeseler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'non-platform-devices-of_get_mac_address'
David S. Miller [Tue, 13 Apr 2021 21:35:02 +0000 (14:35 -0700)]
Merge branch 'non-platform-devices-of_get_mac_address'

Michael Walle says:

====================
of: net: support non-platform devices in of_get_mac_address()

of_get_mac_address() is commonly used to fetch the MAC address
from the device tree. It also supports reading it from a NVMEM
provider. But the latter is only possible for platform devices,
because only platform devices are searched for a matching device
node.

Add a second method to fetch the NVMEM cell by a device tree node
instead of a "struct device".

Moreover, the NVMEM subsystem will return dynamically allocated
data which has to be freed after use. Currently, this is handled
by allocating a device resource manged buffer to store the MAC
address. of_get_mac_address() then returns a pointer to this
buffer. Without a device, this trick is not possible anymore.
Thus, change the of_get_mac_address() API to have the caller
supply a buffer.

It was considered to use the network device to attach the buffer
to, but then the order matters and netdev_register() has to be
called before of_get_mac_address(). No driver does it this way.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoof: net: fix of_get_mac_addr_nvmem() for non-platform devices
Michael Walle [Mon, 12 Apr 2021 17:47:18 +0000 (19:47 +0200)]
of: net: fix of_get_mac_addr_nvmem() for non-platform devices

of_get_mac_address() already supports fetching the MAC address by an
nvmem provider. But until now, it was just working for platform devices.
Esp. it was not working for DSA ports and PCI devices. It gets more
common that PCI devices have a device tree binding since SoCs contain
integrated root complexes.

Use the nvmem of_* binding to fetch the nvmem cells by a struct
device_node. We still have to try to read the cell by device first
because there might be a nvmem_cell_lookup associated with that device.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoof: net: pass the dst buffer to of_get_mac_address()
Michael Walle [Mon, 12 Apr 2021 17:47:17 +0000 (19:47 +0200)]
of: net: pass the dst buffer to of_get_mac_address()

of_get_mac_address() returns a "const void*" pointer to a MAC address.
Lately, support to fetch the MAC address by an NVMEM provider was added.
But this will only work with platform devices. It will not work with
PCI devices (e.g. of an integrated root complex) and esp. not with DSA
ports.

There is an of_* variant of the nvmem binding which works without
devices. The returned data of a nvmem_cell_read() has to be freed after
use. On the other hand the return of_get_mac_address() points to some
static data without a lifetime. The trick for now, was to allocate a
device resource managed buffer which is then returned. This will only
work if we have an actual device.

Change it, so that the caller of of_get_mac_address() has to supply a
buffer where the MAC address is written to. Unfortunately, this will
touch all drivers which use the of_get_mac_address().

Usually the code looks like:

  const char *addr;
  addr = of_get_mac_address(np);
  if (!IS_ERR(addr))
    ether_addr_copy(ndev->dev_addr, addr);

This can then be simply rewritten as:

  of_get_mac_address(np, ndev->dev_addr);

Sometimes is_valid_ether_addr() is used to test the MAC address.
of_get_mac_address() already makes sure, it just returns a valid MAC
address. Thus we can just test its return code. But we have to be
careful if there are still other sources for the MAC address before the
of_get_mac_address(). In this case we have to keep the
is_valid_ether_addr() call.

The following coccinelle patch was used to convert common cases to the
new style. Afterwards, I've manually gone over the drivers and fixed the
return code variable: either used a new one or if one was already
available use that. Mansour Moufid, thanks for that coccinelle patch!

<spml>
@a@
identifier x;
expression y, z;
@@
- x = of_get_mac_address(y);
+ x = of_get_mac_address(y, z);
  <...
- ether_addr_copy(z, x);
  ...>

@@
identifier a.x;
@@
- if (<+... x ...+>) {}

@@
identifier a.x;
@@
  if (<+... x ...+>) {
      ...
  }
- else {}

@@
identifier a.x;
expression e;
@@
- if (<+... x ...+>@e)
-     {}
- else
+ if (!(e))
      {...}

@@
expression x, y, z;
@@
- x = of_get_mac_address(y, z);
+ of_get_mac_address(y, z);
  ... when != x
</spml>

All drivers, except drivers/net/ethernet/aeroflex/greth.c, were
compile-time tested.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: mt7530: Add support for EEE features
René van Dorst [Mon, 12 Apr 2021 06:50:31 +0000 (08:50 +0200)]
net: dsa: mt7530: Add support for EEE features

This patch adds EEE support.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'wireless-drivers-next-2021-04-13' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Tue, 13 Apr 2021 21:12:34 +0000 (14:12 -0700)]
Merge tag 'wireless-drivers-next-2021-04-13' of git://git./linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.13

First set of patches for v5.13. I have been offline for a couple of
and I have a smaller pull request this time. The next one will be
bigger. Nothing really special standing out.

ath11k

* add initial support for QCN9074, but not enabled yet due to firmware problems

* enable radar detection for 160MHz secondary segment

* handle beacon misses in station mode

rtw88

* 8822c: support firmware crash dump

mt7601u

* enable TDLS support
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agocan: peak_usb: pcan_usb: replace open coded endianness conversion of unaligned data
Marc Kleine-Budde [Mon, 5 Apr 2021 12:44:15 +0000 (14:44 +0200)]
can: peak_usb: pcan_usb: replace open coded endianness conversion of unaligned data

This patch replaces the open coded endianness conversion of unaligned
data by the appropriate get/put_unaligned_leXX() variants.

Link: https://lore.kernel.org/r/20210406111622.1874957-11-mkl@pengutronix.de
Acked-by: Stephane Grosjean <s.grosjean@peak-system.com>
Tested-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>