git.monstr.eu Git - linux-2.6-microblaze.git/log

net: mvpp2: restructure "link status" interrupt handling

The "link status" interrupt is used for more than just link status.
Restructure mvpp2_link_status_isr() so we can add additional handling.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'devlink-show-controller-number'

Parav Pandit says:

====================
devlink show controller number

Currently a devlink instance that supports an eswitch handles eswitch
ports of two type of controllers.
(1) controller discovered on same system where eswitch resides.
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external system.
This is the case where a controller is plugged in one system and its
devlink eswitch ports are located in a different system. In this case
devlink instance of the eswitch only have access to ports of the
controller.
However, there is no way to describe that a eswitch devlink port
belongs to which controller (mainly which external host controller).
This problem is more prevalent when port attribute such as PF and VF
numbers are overlapping between multiple controllers of same eswitch.
Due to this, for a specific switch_id, unique phys_port_name cannot
be constructed for such devlink ports.

This short series overcomes this limitation by defining two new
attributes.
(a) external: Indicates if port belongs to external controller
(b) controller number: Indicates a controller number of the port

Based on this a unique phys_port_name is prepared using controller
number.

phys_port_name construction using unique controller number is only
applicable to external controller ports. This ensures that for
non smartnic usecases where there is no external controller,
phys_port_name stays same as before.

Patch summary:
Patch-1 Added mlx5 driver to read controller number
Patch-2 Adds the missing comment for the port attributes
Patch-3 Move structure comments away from structure fields
Patch-4 external attribute added for PCI port flavours
Patch-5 Add controller number
Patch-6 Use controller number to build phys_port_name

---
Changelog:
v2->v3:
- Updated diagram to get rid of controller 'A' and 'B'
- Kept ports of single controller together in diagram
- Updated diagram for pf1's VF and SF and its ports
v1->v2:
- Added text diagram of multiple controllers
- Updated example for a VF
- Addressed comments from Jiri and Jakub
- Moved controller number attribute to PCI port flavours
   This enables to better, hirerchical view with controller and its
    PF, VF numbers
- Split 'external' and 'controller number' attributes as two
   different attributes
- Merged mlx5_core driver to avoid compiliation break
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Use controller while building phys_port_name

Now that controller number attribute is available, use it when
building phsy_port_name for external controller ports.

An example devlink port and representor netdev name consist of controller
annotation for external controller with controller number = 1,
for a VF 1 of PF 0:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0c1pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0c1pf0vf1",
            "flavour": "pcivf",
            "controller": 1,
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Controller number annotation is skipped for non external controllers to
maintain backward compatibility.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Introduce controller number

A devlink port may be for a controller consist of PCI device.
A devlink instance holds ports of two types of controllers.
(1) controller discovered on same system where eswitch resides
This is the case where PCI PF/VF of a controller and devlink eswitch
instance both are located on a single system.
(2) controller located on external host system.
This is the case where a controller is located in one system and its
devlink eswitch ports are located in a different system.

When a devlink eswitch instance serves the devlink ports of both
controllers together, PCI PF/VF numbers may overlap.
Due to this a unique phys_port_name cannot be constructed.

For example in below such system controller-0 and controller-1, each has
PCI PF pf0 whose eswitch ports can be present in controller-0.
These results in phys_port_name as "pf0" for both.
Similar problem exists for VFs and upcoming Sub functions.

An example view of two controller systems:

             ---------------------------------------------------------
             |                                                       |
             |           --------- ---------         ------- ------- |
-----------  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
| server  |  | -------   ----/---- ---/----- ------- ---/--- ---/--- |
| pci rc  |=== | pf0 |______/________/       | pf1 |___/_______/     |
| connect |  | -------                       -------                 |
-----------  |     | controller_num=1 (no eswitch)                   |
             ------|--------------------------------------------------
             (internal wire)
                   |
             ---------------------------------------------------------
             | devlink eswitch ports and reps                        |
             | ----------------------------------------------------- |
             | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
             | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
             | ----------------------------------------------------- |
             | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
             | |pf1    | pf1vfN | pf1sfN | pf1    | pf1vfN |pf0sfN | |
             | ----------------------------------------------------- |
             |                                                       |
             |                                                       |
             |           --------- ---------         ------- ------- |
             |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
             | -------   ----/---- ---/----- ------- ---/--- ---/--- |
             | | pf0 |______/________/       | pf1 |___/_______/     |
             | -------                       -------                 |
             |                                                       |
             |  local controller_num=0 (eswitch)                     |
             ---------------------------------------------------------

An example devlink port for external controller with controller
number = 1 for a VF 1 of PF 0:

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00

$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0pf0vf1",
            "flavour": "pcivf",
            "controller": 1,
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Introduce external controller flag

A devlink eswitch port may represent PCI PF/VF ports of a controller.

A controller either located on same system or it can be an external
controller located in host where such NIC is plugged in.

Add the ability for driver to specify if a port is for external
controller.

Use such flag in the mlx5_core driver.

An example of an external controller having VF1 of PF0 belong to
controller 1.

$ devlink port show pci/0000:06:00.0/2
pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf pfnum 0 vfnum 1 external true splittable false
  function:
    hw_addr 00:00:00:00:00:00
$ devlink port show pci/0000:06:00.0/2 -jp
{
    "port": {
        "pci/0000:06:00.0/2": {
            "type": "eth",
            "netdev": "ens2f0pf0vf1",
            "flavour": "pcivf",
            "pfnum": 0,
            "vfnum": 1,
            "external": true,
            "splittable": false,
            "function": {
                "hw_addr": "00:00:00:00:00:00"
            }
        }
    }
}

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Move structure comments outside of structure

To add more fields to the PCI PF and VF port attributes, follow standard
structure comment format.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

devlink: Add comment block for missing port attributes

Add comment block for physical, PF and VF port attributes.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/mlx5: E-switch, Read controller number from device

ECPF supports one external host controller. Read controller number
from the device.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: dwmac-intel-plat: remove redundant null check before clk_disable_unprepare()

Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: pxa168_eth: remove redundant null check before clk_disable_unprepare()

Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'SMSC-Cleanups-and-clock-setup'

Marco Felsch says:

====================
SMSC: Cleanups and clock setup

this small series cleans the smsc-phy code a bit and adds the support to
specify the phy clock source. Adding the phy clock source support is
also the main purpose of this series.

Each file has its own changelog.

Thanks a lot to Florian and Andrew for reviewing it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: smsc: LAN8710/20: remove PHY_RST_AFTER_CLK_EN flag

Don't reset the phy without respect to the PHY library state machine
because this breaks the phy IRQ mode. The same behaviour can be archived
now by specifying the refclk.

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: smsc: LAN8710/20: add phy refclk in support

Add support to specify the clock provider for the PHY refclk and don't
rely on 'magic' host clock setup. [1] tried to address this by
introducing a flag and fixing the corresponding host. But this commit
breaks the IRQ support since the irq setup during .config_intr() is
thrown away because the reset comes from the side without respecting the
current PHY state within the PHY library state machine. Furthermore the
commit fixed the problem only for FEC based hosts other hosts acting
like the FEC are not covered.

This commit goes the other way around to address the bug fixed by [1].
Instead of resetting the device from the side every time the refclk gets
(re-)enabled it requests and enables the clock till the device gets
removed. Now the PHY library is the only place where the PHY gets reset
to respect the PHY library state machine.

[1] commit 7f64e5b18ebb ("net: phy: smsc: LAN8710/20: add
PHY_RST_AFTER_CLK_EN flag")

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: phy: smsc: document reference clock

Add support to specify the reference clock for the phy.

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: smsc: simplify config_init callback

Exit the driver specific config_init hook early if energy detection is
disabled. We can do this because we don't need to clear the interrupt
status here. Clearing the status should be removed anyway since this is
handled by the phy_enable_interrupts().

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: smsc: skip ENERGYON interrupt if disabled

Don't enable the interrupt if the platform disable the energy detection
by "smsc,disable-energy-detect".

Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cavium: Fix a bunch of kerneldoc parameter issues

Rename ptp to ptp_info.

Fix W=1 compile warnings (invalid kerneldoc):

drivers/net/ethernet/cavium/common/cavium_ptp.c:94: warning: Excess function parameter 'ptp' description in 'cavium_ptp_adjfine'
drivers/net/ethernet/cavium/common/cavium_ptp.c:141: warning: Excess function parameter 'ptp' description in 'cavium_ptp_adjtime'
drivers/net/ethernet/cavium/common/cavium_ptp.c:163: warning: Excess function parameter 'ptp' description in 'cavium_ptp_gettime'
drivers/net/ethernet/cavium/common/cavium_ptp.c:185: warning: Excess function parameter 'ptp' description in 'cavium_ptp_settime'
drivers/net/ethernet/cavium/common/cavium_ptp.c:208: warning: Excess function parameter 'ptp' description in 'cavium_ptp_enable'

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4/ch_ipsec: Registering xfrmdev_ops with cxgb4

As ch_ipsec was removed without clearing xfrmdev_ops and netdev
feature(esp-hw-offload). When a recalculation of netdev feature is
triggered by changing tls feature(tls-hw-tx-offload) from user
request, it causes a page fault due to absence of valid xfrmdev_ops.

Fixes: 6dad4e8ab3ec ("chcr: Add support for Inline IPSec")
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ksz9477-dsa-switch-driver-improvements'

Paul Barker says:

====================
ksz9477 dsa switch driver improvements

These changes were made while debugging the ksz9477 driver for use on a
custom board which uses the ksz9893 switch supported by this driver. The
patches have been runtime tested on top of Linux 5.8.4, I couldn't
runtime test them on top of 5.9-rc3 due to unrelated issues. They have
been build tested on top of net-next.

These changes can also be pulled from:

  https://gitlab.com/pbarker.dev/staging/linux.git
  tag: for-net-next/ksz-v3_2020-09-09

Changes from v2:

  * Fixed incorrect type in assignment error.
Reported-by: kernel test robot <lkp@intel.com>
Changes from v1:

  * Rebased onto net-next.

  * Dropped unnecessary `#include <linux/printk.h>`.

  * Instead of printing the phy mode in `ksz9477_port_setup()`, modify
    the existing print in `ksz9477_config_cpu_port()` to always produce
    output and to be more clear.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: Implement recommended reset timing

The datasheet for the ksz9893 and ksz9477 switches recommend waiting at
least 100us after the de-assertion of reset before trying to program the
device through any interface.

Also switch the existing msleep() call to usleep_range() as recommended
in Documentation/timers/timers-howto.rst. The 2ms range used here is
somewhat arbitrary, as long as the reset is asserted for at least 10ms
we should be ok.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: Disable RGMII in-band status on KSZ9893

We can't assume that the link partner supports the in-band status
reporting which is enabled by default on the KSZ9893 when using RGMII
for the upstream port.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: Improve phy mode message

Always print the selected phy mode for the CPU port when using the
ksz9477 driver. If the phy mode was changed, also print the previous
mode to aid in debugging.

To make the message more clear, prefix it with the port number which it
applies to and improve the language a little.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: microchip: Make switch detection more informative

To make switch detection more informative print the result of the
ksz9477/ksz9893 compatibility check. With debug output enabled also
print the contents of the Chip ID registers as a 40-bit hex string.

As this detection is the first communication with the switch performed
by the driver, making it easy to see any errors here will help identify
issues with SPI data corruption or reset sequencing.

Signed-off-by: Paul Barker <pbarker@konsulko.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Rewrite inner header IPv6 in ICMPv6 messages in ip6t_NPT,
   from Michael Zhou.

2) do_ip_vs_set_ctl() dereferences uninitialized value,
   from Peilin Ye.

3) Support for userdata in tables, from Jose M. Guisado.

4) Do not increment ct error and invalid stats at the same time,
   from Florian Westphal.

5) Remove ct ignore stats, also from Florian.

6) Add ct stats for clash resolution, from Florian Westphal.

7) Bump reference counter bump on ct clash resolution only,
   this is safe because bucket lock is held, again from Florian.

8) Use ip_is_fragment() in xt_HMARK, from YueHaibing.

9) Add wildcard support for nft_socket, from Balazs Scheidler.

10) Remove superfluous IPVS dependency on iptables, from
    Yaroslav Bolyukin.

11) Remove unused definition in ebt_stp, from Wang Hai.

12) Replace CONFIG_NFT_CHAIN_NAT_{IPV4,IPV6} by CONFIG_NFT_NAT
    in selftests/net, from Fabian Frederick.

13) Add userdata support for nft_object, from Jose M. Guisado.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet/neterion/vxge: fix spelling of "functionality"

Fix typo/spello of "functionality".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfc: pn533/usb.c: fix spelling of "functions"

Fix typo/spello of "functions".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-nfc@lists.01.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: add tos reflection in TCP reset and ack

Currently, ipv6 stack does not do any TOS reflection. To make the
behavior consistent with v4 stack, this commit adds TOS reflection in
tcp_v6_reqsk_send_ack() and tcp_v6_send_reset(). We clear the lower
2-bit ECN value of the received TOS in compliance with RFC 3168 6.1.5
robustness principles.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'rxrpc-next-20200908' of git://git./linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Allow more calls to same peer

Here are some development patches for AF_RXRPC that allow more simultaneous
calls to be made to the same peer with the same security parameters.  The
current code allows a maximum of 4 simultaneous calls, which limits the afs
filesystem to that many simultaneous threads.  This increases the limit to
16.

To make this work, the way client connections are limited has to be changed
(incoming call/connection limits are unaffected) as the current code
depends on queuing calls on a connection and then pushing the connection
through a queue.  The limit is on the number of available connections.

This is changed such that there's a limit[*] on the total number of calls
systemwide across all namespaces, but the limit on the number of client
connections is removed.

Once a call is allowed to proceed, it finds a bundle of connections and
tries to grab a call slot.  If there's a spare call slot, fine, otherwise
it will wait.  If there's already a waiter, it will try to create another
connection in the bundle, unless the limit of 4 is reached (4 calls per
connection, giving 16).

A number of things throttle someone trying to set up endless connections:

- Calls that fail immediately have their conns deleted immediately,

- Calls that don't fail immediately have to wait for a timeout,

- Connections normally get automatically reaped if they haven't been used
   for 2m, but this is sped up to 2s if the number of connections rises
   over 900.  This number is tunable by sysctl.

[*] Technically two limits - kernel sockets and userspace rxrpc sockets are
    accounted separately.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: tc35815: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'tc35815_init_queues()' GFP_ATOMIC must be used
because it can be called from 'tc35815_restart()' where some spinlock are
taken.
The call chain is:
  tc35815_restart
    --> tc35815_clear_queues
      --> tc35815_init_queues

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

hippi: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'rr_init_one()' GFP_KERNEL can be used because
it is a probe function and no spinlock is taken in the between.

When memory is allocated in 'rr_open()' GFP_KERNEL can be used because
it is a '.ndo_open' function (see struct net_device_ops) and no spinlock is
taken in the between.
'.ndo_open' functions are synchronized using the rtnl_lock() semaphore.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: coding style cleanups in mcdi_port_common.c

The code recently moved into this file contained a number of coding style
issues, about which checkpatch and xmastree complained. Fix them.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: mcast: fix unused br var when lockdep isn't defined

Stephen reported the following warning:
net/bridge/br_multicast.c: In function 'br_multicast_find_port':
net/bridge/br_multicast.c:1818:21: warning: unused variable 'br' [-Wunused-variable]
1818 | struct net_bridge *br = mp->br;
| ^~

It happens due to bridge's mlock_dereference() when lockdep isn't defined.
Silence the warning by annotating the variable as __maybe_unused.

Fixes: 0436862e417e ("net: bridge: mcast: support for IGMPv3/MLDv2 ALLOW_NEW_SOURCES report")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netlabel: Fix some kernel-doc warnings

Fixes the following W=1 kernel build warning(s):

net/netlabel/netlabel_calipso.c:438: warning: Excess function parameter 'audit_secid' description in 'calipso_doi_remove'
net/netlabel/netlabel_calipso.c:605: warning: Excess function parameter 'reg' description in 'calipso_req_delattr'

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: wimax: i2400m: fix 'msg_skb' kernel-doc warning in i2400m_msg_to_dev()

Fixes the following W=1 kernel build warning(s):

drivers/net/wimax/i2400m/control.c:709: warning: Excess function parameter 'msg_skb' description in 'i2400m_msg_to_dev'

This parameter is not in use. Remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2x: Fix some kernel-doc warnings

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c:4238: warning: Excess function parameter 'netdev' description in 'bnx2x_setup_tc'
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c:4238: warning: Excess function parameter 'tc' description in 'bnx2x_setup_tc'

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cipso: fix 'audit_secid' kernel-doc warning in cipso_ipv4.c

Fixes the following W=1 kernel build warning(s):

net/ipv4/cipso_ipv4.c:510: warning: Excess function parameter 'audit_secid' description in 'cipso_v4_doi_remove'

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: smsc911x: Remove unused variables

Fixes the following W=1 kernel build warning(s):

drivers/net/ethernet/smsc/smsc911x.c: In function ‘smsc911x_rx_fastforward’:
drivers/net/ethernet/smsc/smsc911x.c:1199:16: warning: variable ‘temp’ set but not used [-Wunused-but-set-variable]

drivers/net/ethernet/smsc/smsc911x.c: In function ‘smsc911x_eeprom_write_location’:
drivers/net/ethernet/smsc/smsc911x.c:2058:6: warning: variable ‘temp’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-hns3-misc-updates'

Huazhong Tan says:

====================
net: hns3: misc updates

There are some misc updates for the HNS3 ethernet driver.

====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: remove some unused function hns3_update_promisc_mode()

hns3_update_promisc_mode is defined, but not be used, so remove it.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: remove some unused macros related to queue

There are several macros related queue defined, but never
used, so remove them.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: remove unused field 'tc_num_last_time' in struct hclge_dev

'tc_num_last_time' is defined, but never used, so remove it.

Reported-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: remove unused field 'io_base' in struct hns3_enet_ring

'io_base' has been defined and initialized, but never used,
so remove it.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: fix a typo in struct hclge_mac

The member link of struct hclge_mac stores the link status of
MAC and PHY if PHY exists, but its annotation uses word "exit",
so fix it.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: skip periodic service task if reset failed

When reset fails, if there are some pending jobs for the periodic
service task, it does not do anything except print error each
time the task is scheduled. So skip the periodic service task if
reset failed.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: narrow two local variable range in hclgevf_reset_prepare_wait()

Since variable send_msg and ret only used in if branch, so move
their definition into the if branch.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sched: skip an unnecessay check

Reviewing the error handling in tcf_action_init_1()
most of the early handling uses

err_out:
if (cookie) {
kfree(cookie->data);
kfree(cookie);
}

before cookie could ever be set.

So skip the unnecessay check.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Allow multiple client connections to the same peer

Allow the number of parallel connections to a machine to be expanded from a
single connection to a maximum of four. This allows up to 16 calls to be
in progress at the same time to any particular peer instead of 4.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Rewrite the client connection manager

Rewrite the rxrpc client connection manager so that it can support multiple
connections for a given security key to a peer.  The following changes are
made:

(1) For each open socket, the code currently maintains an rbtree with the
     connections placed into it, keyed by communications parameters.  This
     is tricky to maintain as connections can be culled from the tree or
     replaced within it.  Connections can require replacement for a number
     of reasons, e.g. their IDs span too great a range for the IDR data
     type to represent efficiently, the call ID numbers on that conn would
     overflow or the conn got aborted.

     This is changed so that there's now a connection bundle object placed
     in the tree, keyed on the same parameters.  The bundle, however, does
     not need to be replaced.

(2) An rxrpc_bundle object can now manage the available channels for a set
     of parallel connections.  The lock that manages this is moved there
     from the rxrpc_connection struct (channel_lock).

(3) There'a a dummy bundle for all incoming connections to share so that
     they have a channel_lock too.  It might be better to give each
     incoming connection its own bundle.  This bundle is not needed to
     manage which channels incoming calls are made on because that's the
     solely at whim of the client.

(4) The restrictions on how many client connections are around are
     removed.  Instead, a previous patch limits the number of client calls
     that can be allocated.  Ordinarily, client connections are reaped
     after 2 minutes on the idle queue, but when more than a certain number
     of connections are in existence, the reaper starts reaping them after
     2s of idleness instead to get the numbers back down.

     It could also be made such that new call allocations are forced to
     wait until the number of outstanding connections subsides.

Signed-off-by: David Howells <dhowells@redhat.com>

rxrpc: Impose a maximum number of client calls

Impose a maximum on the number of client rxrpc calls that are allowed
simultaneously. This will be in lieu of a maximum number of client
connections as this is easier to administed as, unlike connections, calls
aren't reusable (to be changed in a subsequent patch)..

This doesn't affect the limits on service calls and connections.

Signed-off-by: David Howells <dhowells@redhat.com>

netfilter: nf_tables: add userdata support for nft_object

Enables storing userdata for nft_object. Initially this will store an
optional comment but can be extended in the future as needed.

Adds new attribute NFTA_OBJ_USERDATA to nft_object.

Signed-off-by: Jose M. Guisado Gomez <guigom@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

selftests/net: replace obsolete NFT_CHAIN configuration

Replace old parameters with global NFT_NAT from commit db8ab38880e0
("netfilter: nf_tables: merge ipv4 and ipv6 nat chain types")

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

netfilter: ebt_stp: Remove unused macro BPDU_TYPE_TCN

BPDU_TYPE_TCN is never used after it was introduced.
So better to remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

net: dsa: don't print non-fatal MTU error if not supported

Commit 72579e14a1d3 ("net: dsa: don't fail to probe if we couldn't set
the MTU") changed, for some reason, the "err && err != -EOPNOTSUPP"
check into a simple "err". This causes the MTU warning to be printed
even for drivers that don't have the MTU operations implemented.
Fix that.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: change PHY error message again

slave_dev->name is only populated at this stage if it was specified
through a label in the device tree. However that is not mandatory.
When it isn't, the error message looks like this:

[    5.037057] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d
[    5.044672] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d
[    5.052275] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d
[    5.059877] fsl_enetc 0000:00:00.2 eth2: error -19 setting up slave PHY for eth%d

which is especially confusing since the error gets printed on behalf of
the DSA master (fsl_enetc in this case).

Printing an error message that contains a valid reference to the DSA
port's name is difficult at this point in the initialization stage, so
at least we should print some info that is more reliable, even if less
user-friendly. That may be the driver name and the hardware port index.

After this change, the error is printed as:

[    6.051587] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 0
[    6.061192] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 1
[    6.070765] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 2
[    6.080324] mscc_felix 0000:00:00.5: error -19 setting up PHY for tree 0, switch 0, port 3

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: tighten the definition of interface statistics

This patch is born out of an investigation into which IEEE statistics
correspond to which struct rtnl_link_stats64 members. Turns out that
there seems to be reasonable consensus on the matter, among many drivers.
To save others the time (and it took more time than I'm comfortable
admitting) I'm adding comments referring to IEEE attributes to
struct rtnl_link_stats64.

Up until now we had two forms of documentation for stats - in
Documentation/ABI/testing/sysfs-class-net-statistics and the comments
on struct rtnl_link_stats64 itself. While the former is very cautious
in defining the expected behavior, the latter feel quite dated and
may not be easy to understand for modern day driver author
(e.g. rx_over_errors). At the same time modern systems are far more
complex and once obvious definitions lost their clarity. For example
- does rx_packet count at the MAC layer (aFramesReceivedOK)?
packets processed correctly by hardware? received by the driver?
or maybe received by the stack?

I tried to clarify the expectations, further clarifications from
others are very welcome.

The part hardest to untangle is rx_over_errors vs rx_fifo_errors
vs rx_missed_errors. After much deliberation I concluded that for
modern HW only two of the counters will make sense. The distinction
between internal FIFO overflow and packets dropped due to back-pressure
from the host is likely too implementation (driver and device) specific
to expose in the standard stats.

Now - which two of those counters we select to use is anyone's pick:

sysfs documentation suggests rx_over_errors counts packets which
did not fit into buffers due to MTU being too small, which I reused.
There don't seem to be many modern drivers using it (well, CAN drivers
seem to love this statistic).

Of the remaining two I picked rx_missed_errors to report device drops.
bnxt reports it and it's folded into "drop"s in procfs (while
rx_fifo_errors is an error, and modern devices usually receive the frame
OK, they just can't admit it into the pipeline).

Of the drivers I looked at only AMD Lance-like and NS8390-like use all
three of these counters. rx_missed_errors counts missed frames,
rx_over_errors counts overflow events, and rx_fifo_errors counts frames
which were truncated because they didn't fit into buffers. This suggests
that rx_fifo_errors may be the correct stat for truncated packets, but
I'd think a FIFO stat counting truncated packets would be very confusing
to a modern reader.

v2:
- add driver developer notes about ethtool stat count and reset
- replace Ethernet with IEEE 802.3 to better indicate source of attrs
- mention byte counters don't count FCS
- clarify RX counter is from device to host
- drop "sightly" from sysfs paragraph
- add examples of ethtool stats
- s/incoming/received/ s/incoming/transmitted/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

rxrpc: Remove unused macro rxrpc_min_rtt_wlen

rxrpc_min_rtt_wlen is never used after it was introduced.
So better to remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'sfc-ethtool-for-EF100-and-related-improvements'

Edward Cree says:

====================
sfc: ethtool for EF100 and related improvements

This series adds the ethtool support to the EF100 driver that was held
back from the original submission as the lack of phy_ops caused issues.
Patch #2, removing the phy_op indirection, deals with this. There are a
lot of checkpatch warnings / xmastree violations but they're all in
pure code movement so I've left the code as it is.
While patch #1 is technically a fix and possibly could go to 'net', I've
put it in this series since it only becomes triggerable with the added
'ethtool --reset' support.
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: simplify DMA mask setting

Christoph says[1] that dma_set_mask_and_coherent() is smart enough to
truncate the mask itself if it's too long. So we can get rid of our
"lop off one bit and retry" loop in efx_init_io().

[1]: https://www.spinics.net/lists/netdev/msg677266.html

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: remove EFX_DRIVER_VERSION

Per-module versions for in-tree drivers are deprecated.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: handle limited FEC support

If the reported PHY capabilities do not include a given FEC mode, don't
attempt to select that FEC mode anyway. If the user tries to set a mode
through ethtool that is not supported, return an error.
The _REQUESTED bits don't appear in the supported caps, but are implied
by the corresponding FEC bits.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: add ethtool ops and miscellaneous ndos to EF100

Mostly just calls to existing common functions.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: remove phy_op indirection

Originally there were several implementations of PHY operations for the
several different PHYs used on Falcon boards. But Falcon is now in a
separate driver, and all sfc NICs since then have had MCDI-managed PHYs.
Thus, there is no need to indirect through function pointers in
efx->phy_op; we can simply call the efx_mcdi_phy_* functions directly.

This also hooks up these functions for EF100, which was previously using
the dummy_phy_ops.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: don't double-down() filters in ef100_reset()

dev_close(), by way of ef100_net_stop(), already brings down the filter
table, so there's no need to do it again (which just causes lots of
WARN_ONs).
Similarly, don't bring it up ourselves, as dev_open() -> ef100_net_open()
will do it, and will fail if it's already been brought up.

Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: dnet: Remove set but unused variable 'len'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/dnet.c: In function dnet_start_xmit
drivers/net/ethernet/dnet.c:511:15: warning: variable ‘len’ set but not used [-Wunused-but-set-variable]

commit 4796417417a6 ("dnet: Dave DNET ethernet controller driver (updated)")
involved this unused variable, remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: dwmac: remove redundant null check before clk_disable_unprepare()

Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: ethernet: fec: remove redundant null check before clk_disable_unprepare()

Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: remove redundant null check before clk_disable_unprepare()

Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: xilinx: remove redundant null check before clk_disable_unprepare()

Because clk_prepare_enable() and clk_disable_unprepare() already checked
NULL clock parameter, so the additional checks are unnecessary, just
remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-bridge-mcast-initial-IGMPv3-MLDv2-support-part-1'

Nikolay Aleksandrov says:

====================
net: bridge: mcast: initial IGMPv3/MLDv2 support (part 1)

This patch-set implements the control plane for initial IGMPv3/MLDv2
support which takes care of include/exclude sets and state transitions
based on the different report types.
Patch 01 arranges the structure better by moving the frequently used
fields together, patch 02 factors out the port group deletion code which is
used in a few places. Patches 03 and 04 add support for source lists and
group modes per port group which are dumped. Patch 05 adds support for
group-and-source specific queries required for IGMPv3/MLDv2. Then patch 06
adds support for group and group-and-source query retransmissions via a new
rexmit timer. Patches 07 and 08 make use of the already present mdb fill
functions when sending notifications so we can have the full mdb entries'
state filled in (with sources, mode etc). Patch 09 takes care of port group
expiration, it switches the group mode to include and deletes it if there
are no sources with active timers. Patches 10-13 are the core changes which
add support for IGMPv3/MLDv2 reports and handle the source list set
operations as per RFCs 3376 and 3810, all IGMPv3/MLDv2 report types with
their transitions should be supported after these patches. I've used RFCs
3376, 3810 and FRR as a reference implementation. The source lists are
capped at 32 entries, we can remove that limitation at a later point which
would require a better data structure to hold them. IGMPv3 processing is
hidden behind the bridge's multicast_igmp_version option which must be set
to 3 in order to enable it. MLDv2 processing is hidden behind the bridge's
multicast_mld_version which must be set to 2 in order to enable it.
Patch 14 improves other querier processing a bit (more about this below).
And finally patch 15 transforms the src gc so it can be used with all mcast
objects since now we have multiple timers that can be running and we
need to make sure they have all finished before freeing the objects.
This is part 1, it only adds control plane support and doesn't change
the fast path. A following patch-set will take care of that.

Here're the sets that will come next (in order):
- Fast path patch-set which adds support for (S, G) mdb entries needed
   for IGMPv3/MLDv2 forwarding, entry add source (kernel, user-space etc)
   needed for IGMPv3/MLDv2 entry management, entry block mode needed for
   IGMPv3/MLDv2 exclude mode. This set will also add iproute2 support for
   manipulating and showing all the new state.
- Selftests patches which will verify all state transitions and forwarding
- Explicit host tracking patch-set, needed for proper fast leave and
   with it fast leave will be enabled for IGMPv3/MLDv2

Not implemented yet:
- Host IGMPv3/MLDv2 filter support (currently we handle only join/leave
   as before)
- Proper other querier source timer and value updates
- IGMPv3/v2 MLDv2/v1 compat (I have a few rough patches for this one)

v4: move old patch 05 to 02 (group del patch), before src lists
    patch 02: set pg's fast leave flag when deleting due to fast leave
    patch 03: now can use the new port del function
              add igmpv2/mldv1 bool which are set when the entry is
              added in those modes (later will be passed as update_timer)
    patch 10: rename update_timer to igmpv2_mldv1 and use the passed
              value from br_multicast_add_group's callers
v3: add IPv6/MLDv2 support, most patches are changed
v2:
patches 03-04: make src lists RCU friendly so they can be traversed
                when dumping, reduce limit to a more conservative 32
                src group entries for a start
patches 11-13: remove helper and directly do bitops
patch      15: force mcast gc on bridge port del to make sure port
                group timers have finished before freeing the port
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: destroy all entries via gc

Since each entry type has timers that can be running simultaneously we need
to make sure that entries are not freed before their timers have finished.
In order to do that generalize the src gc work to mcast gc work and use a
callback to free the entries (mdb, port group or src).

v3: add IPv6 support
v2: force mcast gc on port del to make sure all port group timers have
finished before freeing the bridge port

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: improve IGMPv3/MLDv2 query processing

When an IGMPv3/MLDv2 query is received and we're operating in such mode
then we need to avoid updating group timers if the suppress flag is set.
Also we should update only timers for groups in exclude mode.

v3: add IPv6/MLDv2 support

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: support for IGMPV3/MLDv2 BLOCK_OLD_SOURCES report

We already have all necessary helpers, so process IGMPV3/MLDv2
BLOCK_OLD_SOURCES as per the RFCs.

v3: add IPv6/MLDv2 support
v2: directly do flag bit operations

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: support for IGMPV3/MLDv2 CHANGE_TO_INCLUDE/EXCLUDE report

In order to process IGMPV3/MLDv2 CHANGE_TO_INCLUDE/EXCLUDE report types we
need new helpers which allow us to mark entries based on their timer
state and to query only marked entries.

v3: add IPv6/MLDv2 support, fix other_query checks
v2: directly do flag bit operations

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: support for IGMPV3/MLDv2 MODE_IS_INCLUDE/EXCLUDE report

In order to process IGMPV3/MLDv2_MODE_IS_INCLUDE/EXCLUDE report types we
need some new helpers which allow us to set/clear flags for all current
entries and later delete marked entries after the report sources have been
processed.

v3: add IPv6/MLDv2 support
v2: drop flag helpers and directly do flag bit operations

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: support for IGMPv3/MLDv2 ALLOW_NEW_SOURCES report

This patch adds handling for the ALLOW_NEW_SOURCES IGMPv3/MLDv2 report
types and limits them only when multicast_igmp_version == 3 or
multicast_mld_version == 2 respectively. Now that IGMPv3/MLDv2 handling
functions will be managing timers we need to delay their activation, thus
a new argument is added which controls if the timer should be updated.
We also disable host IGMPv3/MLDv2 handling as it's not yet implemented and
could cause inconsistent group state, the host can only join a group as
EXCLUDE {} or leave it.

v4: rename update_timer to igmpv2_mldv1 and use the passed value from
br_multicast_add_group's callers
v3: Add IPv6/MLDv2 support

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: delete expired port groups without srcs

If an expired port group is in EXCLUDE mode, then we have to turn it
into INCLUDE mode, remove all srcs with zero timer and finally remove
the group itself if there are no more srcs with an active timer.
For IGMPv2 use there would be no sources, so this will reduce to just
removing the group as before.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mdb: use mdb and port entries in notifications

We have to use mdb and port entries when sending mdb notifications in
order to fill in all group attributes properly. Before this change we
would've used a fake br_mdb_entry struct to fill in only partial
information about the mdb. Now we can also reuse the mdb dump fill
function and thus have only a single central place which fills the mdb
attributes.

v3: add IPv6 support

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mdb: push notifications in __br_mdb_add/del

This change is in preparation for using the mdb port group entries when
sending a notification, so their full state and additional attributes can
be filled in.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: add support for group query retransmit

We need to be able to retransmit group-specific and group-and-source
specific queries. The new timer takes care of those.

v3: add IPv6 support

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: add support for group-and-source specific queries

Allows br_multicast_alloc_query to build queries with the port group's
source lists and sends a query for sources over and under lmqt when
necessary as per RFCs 3376 and 3810 with the suppress flag set
appropriately.

v3: add IPv6 support

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: add support for src list and filter mode dumping

Support per port group src list (address and timer) and filter mode
dumping. Protected by either multicast_lock or rcu.

v3: add IPv6 support
v2: require RCU or multicast_lock to traverse src groups

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: add support for group source list

Initial functions for group source lists which are needed for IGMPv3
and MLDv2 include/exclude lists. Both IPv4 and IPv6 sources are supported.
User-added mdb entries are created with exclude filter mode, we can
extend that later to allow user-supplied mode. When group src entries
are deleted, they're freed from a workqueue to make sure their timers
are not still running. Source entries are protected by the multicast_lock
and rcu. The number of src groups per port group is limited to 32.

v4: use the new port group del function directly
add igmpv2/mldv1 bool to denote if the entry was added in those
modes, it will later replace the old update_timer bool
v3: add IPv6 support
v2: allow src groups to be traversed under rcu

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mcast: factor out port group del

In order to avoid future errors and reduce code duplication we should
factor out the port group del sequence. This allows us to have one
function which takes care of all details when removing a port group.

v4: set pg's fast leave flag when deleting due to fast leave
move the patch before adding source lists

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bridge: mdb: arrange internal structs so fast-path fields are close

Before this patch we'd need 2 cache lines for fast-path, now all used
fields are in the first cache line.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: rtl8366rb: Switch to phylink

This switches the RTL8366RB over to using phylink callbacks
instead of .adjust_link(). This is a pretty template
switchover. All we adjust is the CPU port so that is why
the code only inspects this port.

We enhance by adding proper error messages, also disabling
the CPU port on the way down and moving dev_info() to
dev_dbg().

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tipc: fix a deadlock when flushing scheduled work

In the commit fdeba99b1e58
("tipc: fix use-after-free in tipc_bcast_get_mode"), we're trying
to make sure the tipc_net_finalize_work work item finished if it
enqueued. But calling flush_scheduled_work() is not just affecting
above work item but either any scheduled work. This has turned out
to be overkill and caused to deadlock as syzbot reported:

======================================================
WARNING: possible circular locking dependency detected
5.9.0-rc2-next-20200828-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u4:6/349 is trying to acquire lock:
ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: flush_workqueue+0xe1/0x13e0 kernel/workqueue.c:2777

but task is already holding lock:
ffffffff8a879430 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x9b/0xb10 net/core/net_namespace.c:565

[...]
Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(pernet_ops_rwsem);
                               lock(&sb->s_type->i_mutex_key#13);
                               lock(pernet_ops_rwsem);
  lock((wq_completion)events);

*** DEADLOCK ***
[...]

v1:
To fix the original issue, we replace above calling by introducing
a bit flag. When a namespace cleaned-up, bit flag is set to zero and:
- tipc_net_finalize functionial just does return immediately.
- tipc_net_finalize_work does not enqueue into the scheduled work queue.

v2:
Use cancel_work_sync() helper to make sure ONLY the
tipc_net_finalize_work() stopped before releasing bcbase object.

Reported-by: syzbot+d5aa7e0385f6a5d0f4fd@syzkaller.appspotmail.com
Fixes: fdeba99b1e58 ("tipc: fix use-after-free in tipc_bcast_get_mode")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Huu Le <hoang.h.le@dektech.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

enic: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'vnic_dev_classifier()', 'vnic_dev_fw_info()',
'vnic_dev_notify_set()' and 'vnic_dev_stats_dump()' (vnic_dev.c) GFP_ATOMIC
must be used because its callers take a spinlock before calling these
functions.

When memory is allocated in '__enic_set_rsskey()' and 'enic_set_rsscpu()'
GFP_ATOMIC must be used because they can be called with a spinlock.
The call chain is:
  enic_reset                         <-- takes 'enic->enic_api_lock'
    --> enic_set_rss_nic_cfg
      --> enic_set_rsskey
        --> __enic_set_rsskey        <-- uses dma_alloc_coherent
      --> enic_set_rsscpu            <-- uses dma_alloc_coherent

When memory is allocated in 'vnic_dev_init_prov2()' GFP_ATOMIC must be used
because a spinlock is hidden in the ENIC_DEVCMD_PROXY_BY_INDEX macro, when
this function is called in 'enic_set_port_profile()'.

When memory is allocated in 'vnic_dev_alloc_desc_ring()' GFP_KERNEL can be
used because it is only called from 5 functions ('vnic_dev_init_devcmd2()',
'vnic_cq_alloc()', 'vnic_rq_alloc()', 'vnic_wq_alloc()' and
'enic_wq_devcmd2_alloc()'.

  'vnic_dev_init_devcmd2()': already uses GFP_KERNEL and no lock is taken
     in the between.
  'enic_wq_devcmd2_alloc()': is called from ' vnic_dev_init_devcmd2()'
     which already uses GFP_KERNEL and no lock is taken in the between.
  'vnic_cq_alloc()', 'vnic_rq_alloc()', 'vnic_wq_alloc()': are called
     from 'enic_alloc_vnic_resources()'
'enic_alloc_vnic_resources()' has only 2 call chains:

  1) enic_probe
      --> enic_dev_init
        --> enic_alloc_vnic_resources
'enic_probe()' is a probe function and no lock is taken in the between

  2) enic_set_ringparam
      --> enic_alloc_vnic_resources
'enic_set_ringparam()' is a .set_ringparam function (see struct
ethtool_ops). It seems to only take a mutex and no spinlock.

So all paths are safe to use GFP_KERNEL.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: gemini: Clean up phy registration

It's nice if the phy is online before we register the netdev
so try to do that first.

Stop trying to do "second tried" to register the phy, it
works perfectly fine the first time.

Stop remvoving the phy in uninit. Remove it when the
driver is remove():d, symmetric to where it is added, in
probe().

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: Add a missing word

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: rtl8366rb: Support setting MTU

This implements the missing MTU setting for the RTL8366RB
switch.

Apart from supporting jumboframes, this rids us of annoying
boot messages like this:
realtek-smi switch: nonfatal error -95 setting MTU on port 0

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/packet: Remove unused macro BLOCK_PRIV

BLOCK_PRIV is never used after it was introduced.
So better to remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

NFC: digital: Remove two unused macroes

DIGITAL_NFC_DEP_REQ_RES_TAILROOM is never used after it was introduced.
DIGITAL_NFC_DEP_REQ_RES_HEADROOM is no more used after below
commit e8e7f4217564 ("NFC: digital: Remove useless call to skb_reserve()")
Remove them.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

caif: Remove duplicate macro SRVL_CTRL_PKT_SIZE

Remove SRVL_CTRL_PKT_SIZE which is defined more than once.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

MAINTAINERS: repair reference in LYNX PCS MODULE

Commit 0da4c3d393e4 ("net: phy: add Lynx PCS module") added the files in
./drivers/net/pcs/, but the new LYNX PCS MODULE section refers to
./drivers/net/phy/.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains:

warning: no file matches F: drivers/net/phy/pcs-lynx.c

Repair the LYNX PCS MODULE section by referring to the right location.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-dsa-bcm_sf2-Ensure-MDIO-diversion-is-used'

Florian Fainelli says:

====================
net: dsa: bcm_sf2: Ensure MDIO diversion is used

Changes in v2:

- export of_update_property() to permit building bcm_sf2 as a module
- provided a better explanation of the problem being solved after
explaining it to Andrew during the v1 review
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: bcm_sf2: Ensure that MDIO diversion is used

Registering our slave MDIO bus outside of the OF infrastructure is
necessary in order to avoid creating double references of the same
Device Tree nodes, however it is not sufficient to guarantee that the
MDIO bus diversion is used because of_phy_connect() will still resolve
to a valid PHY phandle and it will connect to the PHY using its parent
MDIO bus which is still the SF2 master MDIO bus. The reason for that is
because BCM7445 systems were already shipped with a Device Tree blob
looking like this (irrelevant parts omitted for simplicity):

ports {
#address-cells = <1>;
#size-cells = <0>;

port@1 {
phy-mode = "rgmii-txid";
phy-handle = <&phy0>;
reg = <1>;
label = "rgmii_1";
};
...

mdio@403c0 {
...

phy0: ethernet-phy@0 {
broken-turn-around;
device_type = "ethernet-phy";
max-speed = <0x3e8>;
reg = <0>;
compatible = "brcm,bcm53125", "ethernet-phy-ieee802.3-c22";
};
};

There is a hardware issue with chip revisions (Dx) that lead to the
development of the following commits:

461cd1b03e32 ("net: dsa: bcm_sf2: Register our slave MDIO bus")
536fab5bf582 ("net: dsa: bcm_sf2: Do not register slave MDIO bus with OF")
b8c6cd1d316f ("net: dsa: bcm_sf2: do not use indirect reads and writes for 7445E0")

There should have been an internal MDIO bus node created for the chip
revision (Dx) that suffers from this problem, but it did not happen back
then.

Had that happen, that we should have correctly parented phy@0 (bcm53125
below) as child node of the internal MDIO bus, but the production Device
Tree blob that was shipped with the firmware targeted the fixed version
of the chip, despite both the affected and corrected chips being shipped
into production.

The problem is that of_phy_connect() for port@1 will happily resolve the
'phy-handle' from the mdio@403c0 node, which bypasses the diversion
completely. This results in this double programming that the diversion
refers to and aims to avoid. In order to force of_phy_connect() to fail,
and have DSA call to dsa_slave_phy_connect(), we must deactivate
ethernet-phy@0 from mdio@403c0, and the best way to do that is by
removing the phandle property completely.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

of: Export of_remove_property() to modules

We will need to remove some OF properties in drivers/net/dsa/bcm_sf2.c
with a subsequent commit. Export of_remove_property() to modules so we
can keep bcm_sf2 modular and provide an empty stub for when CONFIG_OF is
disabled to maintain the ability to compile test.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'sfc-TXQ-refactor'

Edward Cree says:

====================
sfc: TXQ refactor

Refactor and unify partner-TXQ handling in the EF100 and legacy drivers.

The main thrust of this series is to remove from the legacy (Siena/EF10)
driver the assumption that a netdev TX queue has precisely two hardware
TXQs (checksummed and unchecksummed) associated with it, so that in
future we can have more (e.g. for handling inner-header checksums) or
fewer (e.g. to free up hardware queues for XDP usage).

Changes from v1:
* better explain patch #1 in the commit message, and rename
xmit_more_available to xmit_pending
* add new patch #2 applying the same approach to ef100, for consistency
====================

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: remove efx_tx_queue_partner

All users of this function are now gone.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

sfc: rewrite efx_tx_may_pio

Use efx_for_each_channel_tx_queue() rather than efx_tx_queue_partner().
Make some related simplifications of efx_nic_tx_is_empty() to remove
entry points that aren't used.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>