linux-2.6-microblaze.git
3 years agomptcp: pm: add lockdep assertions
Florian Westphal [Thu, 4 Feb 2021 23:23:30 +0000 (15:23 -0800)]
mptcp: pm: add lockdep assertions

Add a few assertions to make sure functions are called with the needed
locks held.
Two functions gain might_sleep annotations because they contain
conditional calls to functions that sleep.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests: mptcp: add command line arguments for mptcp_join.sh
Geliang Tang [Thu, 4 Feb 2021 23:23:29 +0000 (15:23 -0800)]
selftests: mptcp: add command line arguments for mptcp_join.sh

Since the mptcp_join script is becoming too big, this patch splits it
into several smaller chunks, each of them has been defined in a function
as a individual test group for several related testcases.

Using bash getopts function to parse command line arguments, and invoke
each function to do the individual test group.

Here are all the arguments:
  -f subflows_tests
  -s signal_address_tests
  -l link_failure_tests
  -t add_addr_timeout_tests
  -r remove_tests
  -a add_tests
  -6 ipv6_tests
  -4 v4mapped_tests
  -b backup_tests
  -p add_addr_ports_tests
  -c syncookies_tests
  -h help

Run mptcp_join.sh with no argument will execute all testcases.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'dpaa2-add-1000base-x-support'
Jakub Kicinski [Sat, 6 Feb 2021 22:35:23 +0000 (14:35 -0800)]
Merge branch 'dpaa2-add-1000base-x-support'

Russell King says:

====================
dpaa2: add 1000base-X support

This patch series adds 1000base-X support to pcs-lynx and DPAA2,
allowing runtime switching between SGMII and 1000base-X. This is
a pre-requisit for SFP module support on the SolidRun ComExpress 7.

v2: updated with Ioana's r-b's, and comment on backplane support
====================

Link: https://lore.kernel.org/r/20210205103859.GH1463@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dpaa2-mac: add backplane link mode support
Russell King [Fri, 5 Feb 2021 10:40:09 +0000 (10:40 +0000)]
net: dpaa2-mac: add backplane link mode support

Add support for backplane link mode, which is, according to discussions
with NXP earlier in the year, is a mode where the OS (Linux) is able to
manage the PCS and Serdes itself.

This commit prepares the ground work for allowing 1G fiber connections
to be used with DPAA2 on the SolidRun CEX7 platforms.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dpaa2-mac: add 1000BASE-X support
Russell King [Fri, 5 Feb 2021 10:40:04 +0000 (10:40 +0000)]
net: dpaa2-mac: add 1000BASE-X support

Now that pcs-lynx supports 1000BASE-X, add support for this interface
mode to dpaa2-mac. pcs-lynx can be switched at runtime between SGMII
and 1000BASE-X mode, so allow dpaa2-mac to switch between these as
well.

This commit prepares the ground work for allowing 1G fiber connections
to be used with DPAA2 on the SolidRun CEX7 platforms.

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: pcs: add pcs-lynx 1000BASE-X support
Russell King [Fri, 5 Feb 2021 10:39:59 +0000 (10:39 +0000)]
net: pcs: add pcs-lynx 1000BASE-X support

Add support for 1000BASE-X to pcs-lynx for the LX2160A.

This commit prepares the ground work for allowing 1G fiber connections
to be used with DPAA2 on the SolidRun CEX7 platforms.

Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: wan: farsync: use new tasklet API
Emil Renner Berthing [Thu, 4 Feb 2021 17:39:47 +0000 (18:39 +0100)]
net: wan: farsync: use new tasklet API

This converts the driver to use the new tasklet API introduced in
commit 12cc923f1ccc ("tasklet: Introduce new initialization API")

The new API changes the argument passed to callback functions,
but fortunately it is unused so it is straight forward to use
DECLARE_TASKLET rather than DECLARE_TASLKLET_OLD.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Link: https://lore.kernel.org/r/20210204173947.92884-1-kernel@esmil.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-avoid-the-memory-waste-in-some-ethernet-drivers'
Jakub Kicinski [Sat, 6 Feb 2021 19:57:30 +0000 (11:57 -0800)]
Merge branch 'net-avoid-the-memory-waste-in-some-ethernet-drivers'

Kevin Hao says:

====================
net: Avoid the memory waste in some Ethernet drivers

In the current implementation of napi_alloc_frag(), it doesn't have any
align guarantee for the returned buffer address. We would have to use
some ugly workarounds to make sure that we can get a align buffer
address for some Ethernet drivers. This patch series tries to introduce
some helper functions to make sure that an align buffer is returned.
Then we can drop the ugly workarounds and avoid the unnecessary memory
waste.
====================

Link: https://lore.kernel.org/r/20210204105638.1584-1-haokexin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dpaa2: Use napi_alloc_frag_align() to avoid the memory waste
Kevin Hao [Thu, 4 Feb 2021 10:56:38 +0000 (18:56 +0800)]
net: dpaa2: Use napi_alloc_frag_align() to avoid the memory waste

The napi_alloc_frag_align() will guarantee that a correctly align
buffer address is returned. So use this function to simplify the buffer
alloc and avoid the unnecessary memory waste.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: octeontx2: Use napi_alloc_frag_align() to avoid the memory waste
Kevin Hao [Thu, 4 Feb 2021 10:56:37 +0000 (18:56 +0800)]
net: octeontx2: Use napi_alloc_frag_align() to avoid the memory waste

The napi_alloc_frag_align() will guarantee that a correctly align
buffer address is returned. So use this function to simplify the buffer
alloc and avoid the unnecessary memory waste.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Tested-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: Introduce {netdev,napi}_alloc_frag_align()
Kevin Hao [Thu, 4 Feb 2021 10:56:36 +0000 (18:56 +0800)]
net: Introduce {netdev,napi}_alloc_frag_align()

In the current implementation of {netdev,napi}_alloc_frag(), it doesn't
have any align guarantee for the returned buffer address, But for some
hardwares they do require the DMA buffer to be aligned correctly,
so we would have to use some workarounds like below if the buffers
allocated by the {netdev,napi}_alloc_frag() are used by these hardwares
for DMA.
    buf = napi_alloc_frag(really_needed_size + align);
    buf = PTR_ALIGN(buf, align);

These codes seems ugly and would waste a lot of memories if the buffers
are used in a network driver for the TX/RX. We have added the align
support for the page_frag functions, so add the corresponding
{netdev,napi}_frag functions.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomm: page_frag: Introduce page_frag_alloc_align()
Kevin Hao [Thu, 4 Feb 2021 10:56:35 +0000 (18:56 +0800)]
mm: page_frag: Introduce page_frag_alloc_align()

In the current implementation of page_frag_alloc(), it doesn't have
any align guarantee for the returned buffer address. But for some
hardwares they do require the DMA buffer to be aligned correctly,
so we would have to use some workarounds like below if the buffers
allocated by the page_frag_alloc() are used by these hardwares for
DMA.
    buf = page_frag_alloc(really_needed_size + align);
    buf = PTR_ALIGN(buf, align);

These codes seems ugly and would waste a lot of memories if the buffers
are used in a network driver for the TX/RX. So introduce
page_frag_alloc_align() to make sure that an aligned buffer address is
returned.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dwc-xlgmac: Fix spelling mistake in function name
Colin Ian King [Thu, 4 Feb 2021 09:49:44 +0000 (09:49 +0000)]
net: dwc-xlgmac: Fix spelling mistake in function name

There is a spelling mistake in the function name alloc_channles_and_rings.
Fix this by renaming it to alloc_channels_and_rings.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210204094944.51460-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: mhi-net: Add re-aggregation of fragmented packets
Loic Poulain [Thu, 4 Feb 2021 08:40:00 +0000 (09:40 +0100)]
net: mhi-net: Add re-aggregation of fragmented packets

When device side MTU is larger than host side MTU, the packets
(typically rmnet packets) are split over multiple MHI transfers.
In that case, fragments must be re-aggregated to recover the packet
before forwarding to upper layer.

A fragmented packet result in -EOVERFLOW MHI transaction status for
each of its fragments, except the final one. Such transfer was
previously considered as error and fragments were simply dropped.

This change adds re-aggregation mechanism using skb chaining, via
skb frag_list.

A warning (once) is printed since this behavior usually comes from
a misconfiguration of the device (e.g. modem MTU).

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/1612428002-12333-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: qualcomm: rmnet: Fix rx_handler for non-linear skbs
Loic Poulain [Thu, 4 Feb 2021 08:40:01 +0000 (09:40 +0100)]
net: qualcomm: rmnet: Fix rx_handler for non-linear skbs

There is no guarantee that rmnet rx_handler is only fed with linear
skbs, but current rmnet implementation does not check that, leading
to crash in case of non linear skbs processed as linear ones.

Fix that by ensuring skb linearization before processing.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Link: https://lore.kernel.org/r/1612428002-12333-2-git-send-email-loic.poulain@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: sched: Return the correct errno code
Zheng Yongjun [Thu, 4 Feb 2021 07:39:50 +0000 (15:39 +0800)]
net: sched: Return the correct errno code

When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Link: https://lore.kernel.org/r/20210204073950.18372-1-zhengyongjun3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodccp: Return the correct errno code
Zheng Yongjun [Thu, 4 Feb 2021 07:28:20 +0000 (15:28 +0800)]
dccp: Return the correct errno code

When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Link: https://lore.kernel.org/r/20210204072820.17723-1-zhengyongjun3@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: bridge: mcast: Use ERR_CAST instead of ERR_PTR(PTR_ERR())
Xu Wang [Thu, 4 Feb 2021 07:05:49 +0000 (07:05 +0000)]
net: bridge: mcast: Use ERR_CAST instead of ERR_PTR(PTR_ERR())

Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)).

net/bridge/br_multicast.c:1246:9-16: WARNING: ERR_CAST can be used with mp
Generated by: scripts/coccinelle/api/err_cast.cocci

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Link: https://lore.kernel.org/r/20210204070549.83636-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ethernet: ti: fix netdevice stats for XDP
Lorenzo Bianconi [Wed, 3 Feb 2021 18:06:17 +0000 (19:06 +0100)]
net: ethernet: ti: fix netdevice stats for XDP

Align netdevice statistics when the device is running in XDP mode
to other upstream drivers. In particular report to user-space rx
packets even if they are not forwarded to the networking stack
(XDP_PASS) but if they are redirected (XDP_REDIRECT), dropped (XDP_DROP)
or sent back using the same interface (XDP_TX). This patch allows the
system administrator to verify the device is receiving data correctly.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/a457cb17dd9c58c116d64ee34c354b2e89c0ff8f.1612375372.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-eth: Simplify the calculation of variables
Jiapeng Chong [Tue, 2 Feb 2021 10:02:37 +0000 (18:02 +0800)]
dpaa2-eth: Simplify the calculation of variables

Fix the following coccicheck warnings:

./drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1651:36-38: WARNING
!A || A && B is equivalent to !A || B.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/1612260157-128026-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'wireless-drivers-next-2021-02-05' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Sat, 6 Feb 2021 17:36:03 +0000 (09:36 -0800)]
Merge tag 'wireless-drivers-next-2021-02-05' of git://git./linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:

====================
wireless-drivers-next patches for v5.12

First set of patches for v5.12. A smaller pull request this time,
biggest feature being a better key handling for ath9k. And of course
the usual fixes and cleanups all over.

Major changes:

ath9k
 * more robust encryption key cache management

brcmfmac
 * support BCM4365E with 43666 ChipCommon chip ID

* tag 'wireless-drivers-next-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next: (35 commits)
  iwl4965: do not process non-QOS frames on txq->sched_retry path
  mt7601u: process tx URBs with status EPROTO properly
  wlcore: Fix command execute failure 19 for wl12xx
  mt7601u: use ieee80211_rx_list to pass frames to the network stack as a batch
  rtw88: 8723de: adjust the LTR setting
  rtlwifi: rtl8821ae: fix bool comparison in expressions
  rtlwifi: rtl8192se: fix bool comparison in expressions
  rtlwifi: rtl8188ee: fix bool comparison in expressions
  rtlwifi: rtl8192c-common: fix bool comparison in expressions
  rtlwifi: rtl_pci: fix bool comparison in expressions
  wlcore: Downgrade exceeded max RX BA sessions to debug
  wilc1000: use flexible-array member instead of zero-length array
  brcmfmac: clear EAP/association status bits on linkdown events
  brcmfmac: Delete useless kfree code
  qtnfmac_pcie: Use module_pci_driver
  mt7601u: check the status of device in calibration
  mt7601u: process URBs in status EPROTO properly
  brcmfmac: support BCM4365E with 43666 ChipCommon chip ID
  wilc1000: fix spelling mistake in Kconfig "devision" -> "division"
  mwifiex: pcie: Drop bogus __refdata annotation
  ...
====================

Link: https://lore.kernel.org/r/20210205161901.C7F83C433ED@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Jakub Kicinski [Fri, 5 Feb 2021 05:26:27 +0000 (21:26 -0800)]
Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2021-02-03

This series contains updates to igc, igb, e1000e, and e1000 drivers.

Sasha adds counting of good transmit packets and reporting of NVM version
and gPHY version in ethtool firmware version. Replaces the use of strlcpy
to the preferred strscpy. Fixes a typo that caused the wrong register to be
output. He also removes an unused function pointer, some unneeded defines,
and a non-applicable comment. All changes for igc.

Gal Hammer fixes a typo which caused the RDBAL register values to be
shown instead of TDBAL for igb.

Nick Lowe enables RSS support for i211 devices for igb.

Tom Rix fixes checkpatch warning by removing h from printk format
specifier for igb.

Kaixu Xia removes setting of a variable that is overwritten before next
use for e1000e.

Sudip Mukherjee removes an unneeded assignment for e1000.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  e1000: drop unneeded assignment in e1000_set_itr()
  e1000e: remove the redundant value assignment in e1000_update_nvm_checksum_spt
  igb: remove h from printk format specifier
  igb: Enable RSS for Intel I211 Ethernet Controller
  igb: fix TDBAL register show incorrect value
  igc: Fix TDBAL register show incorrect value
  igc: Remove unused FUNC_1 mask
  igc: Remove unused local receiver mask
  igc: Prefer strscpy over strlcpy
  igc: Expose the gPHY firmware version
  igc: Expose the NVM version
  igc: Add Host Good Packets Transmitted Count
  igc: Remove MULR mask define
  igc: Remove igc_set_fw_version comment
  igc: Clean up nvm_operations structure
====================

Link: https://lore.kernel.org/r/20210204004259.3662059-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'amend-hv_netvsc-copy-packets-sent-by-hyper-v-out-of-the-receive-buffer'
Jakub Kicinski [Fri, 5 Feb 2021 04:37:07 +0000 (20:37 -0800)]
Merge branch 'amend-hv_netvsc-copy-packets-sent-by-hyper-v-out-of-the-receive-buffer'

Andrea Parri says:

====================
Amend "hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer"

Patch #2 also addresses the Smatch complaint reported here:

   https://lkml.kernel.org/r/YBp2oVIdMe+G%2FliJ@mwanda/
====================

Link: https://lore.kernel.org/r/20210203113513.558864-1-parri.andrea@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agohv_netvsc: Load and store the proper (NBL_HASH_INFO) per-packet info
Andrea Parri (Microsoft) [Wed, 3 Feb 2021 11:35:13 +0000 (12:35 +0100)]
hv_netvsc: Load and store the proper (NBL_HASH_INFO) per-packet info

Fix the typo.

Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Fixes: 0ba35fe91ce34f ("hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agohv_netvsc: Allocate the recv_buf buffers after NVSP_MSG1_TYPE_SEND_RECV_BUF
Andrea Parri (Microsoft) [Wed, 3 Feb 2021 11:35:12 +0000 (12:35 +0100)]
hv_netvsc: Allocate the recv_buf buffers after NVSP_MSG1_TYPE_SEND_RECV_BUF

The recv_buf buffers are allocated in netvsc_device_add().  Later in
netvsc_init_buf() the response to NVSP_MSG1_TYPE_SEND_RECV_BUF allows
the host to set up a recv_section_size that could be bigger than the
(default) value used for that allocation.  The host-controlled value
could be used by a malicious host to bypass the check on the packet's
length in netvsc_receive() and hence to overflow the recv_buf buffer.

Move the allocation of the recv_buf buffers into netvsc_init_but().

Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Fixes: 0ba35fe91ce34f ("hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'r8152-adjust-flow-for-power-cut'
Jakub Kicinski [Fri, 5 Feb 2021 04:36:53 +0000 (20:36 -0800)]
Merge branch 'r8152-adjust-flow-for-power-cut'

Hayes Wang says:

====================
r8152: adjust flow for power cut

The two patches are used to adjust the flow about resuming from
the state of power cut. For the purpose, some functions have to
be updated first.
====================

Link: https://lore.kernel.org/r/1394712342-15778-398-Taiwan-albertk@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agor8152: adjust the flow of power cut for RTL8153B
Hayes Wang [Wed, 3 Feb 2021 09:14:29 +0000 (17:14 +0800)]
r8152: adjust the flow of power cut for RTL8153B

For runtime resuming, the RTL8153B may be resumed from the state
of power cut, when enabling the feature of UPS. Then, the PHY
would be reset, so it is necessary to be initailized again.

Besides, the USB_U1U2_TIMER also has to be set again, so I move
it from r8153b_init() to r8153b_hw_phy_cfg().

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agor8152: replace several functions about phy patch request
Hayes Wang [Wed, 3 Feb 2021 09:14:28 +0000 (17:14 +0800)]
r8152: replace several functions about phy patch request

Replace r8153_patch_request() with rtl_phy_patch_request().
Replace r8153_pre_ram_code() with rtl_pre_ram_code().
Replace r8153_post_ram_code() with rtl_post_ram_code().
Add rtl_patch_key_set().

The new functions have an additional parameter. It is used to wait
the patch request command finished. When the PHY is resumed from
the state of power cut, the PHY is at a safe mode and the
OCP_PHY_PATCH_STAT wouldn't be updated. For this situation, it is
safe to set patch request command without waiting OCP_PHY_PATCH_STAT.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: xrs700x: Correctly address device over I2C
Tobias Waldekranz [Tue, 2 Feb 2021 19:16:45 +0000 (20:16 +0100)]
net: dsa: xrs700x: Correctly address device over I2C

On read, master should send 31 MSB of the register (only even values
are ever used), followed by a 1 to indicate read. Then, reading two
bytes, the device will output the register's value.

On write, master sends 31 MSB of the register, followed by a 0 to
indicate write, followed by two bytes containing the register value.

Flexibilis' documentation (version 1.3) specifies the opposite
polarity (#read/write), but the scope indicates that it is, in fact,
read/#write.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: George McCollister <george.mccollister@gmail.com>
Link: https://lore.kernel.org/r/20210202191645.439-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: bcm_sf2: Check egress tagging of CFP rule with proper accessor
Vladimir Oltean [Wed, 3 Feb 2021 19:39:18 +0000 (21:39 +0200)]
net: dsa: bcm_sf2: Check egress tagging of CFP rule with proper accessor

The flow steering struct ethtool_flow_ext::data field is __be32, so when
the CFP code needs to check the VLAN egress tagging attribute in bit 0,
it does this in CPU native endianness. So logically, the endianness
conversion is set up the other way around, although in practice the same
result is produced.

Gets rid of build warning:

warning: cast from restricted __be32
warning: incorrect type in argument 1 (different base types)
   expected unsigned int [usertype] val
   got restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: cast from restricted __be32
warning: restricted __be32 degrades to integer

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210203193918.2236994-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodrivers: net: ethernet: i825xx: Fix couple of spellings in the file ether1.c
Bhaskar Chowdhury [Thu, 4 Feb 2021 03:16:48 +0000 (08:46 +0530)]
drivers: net: ethernet: i825xx: Fix couple of spellings in the file ether1.c

s/initialsation/initialisation/
s/specifiing/specifying/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210204031648.27300-1-unixbhaskar@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'fix-w-1-compilation-warnings-in-net-folder'
Jakub Kicinski [Fri, 5 Feb 2021 02:38:01 +0000 (18:38 -0800)]
Merge branch 'fix-w-1-compilation-warnings-in-net-folder'

Leon Romanovsky says:

====================
Fix W=1 compilation warnings in net/* folder

Changelog:
v2:
 * Patch 3: Added missing include file.
v1: https://lore.kernel.org/lkml/20210203101612.4004322-1-leon@kernel.org
 * Removed Fixes lines.
 * Changed target from net to be net-next.
 * Patch 1: Moved function declaration to be outside config instead
   games with if/endif.
 * Patch 3: Moved declarations to new header file.
v0: https://lore.kernel.org/lkml/20210202135544.3262383-1-leon@kernel.org
====================

Link: https://lore.kernel.org/r/20210203135112.4083711-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonetfilter: move handlers to net/ip_vs.h
Leon Romanovsky [Wed, 3 Feb 2021 13:51:12 +0000 (15:51 +0200)]
netfilter: move handlers to net/ip_vs.h

Fix the following compilation warnings:
net/netfilter/ipvs/ip_vs_proto_tcp.c:147:1: warning: no previous prototype for 'tcp_snat_handler' [-Wmissing-prototypes]
  147 | tcp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
      | ^~~~~~~~~~~~~~~~
net/netfilter/ipvs/ip_vs_proto_udp.c:136:1: warning: no previous prototype for 'udp_snat_handler' [-Wmissing-prototypes]
  136 | udp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
      | ^~~~~~~~~~~~~~~~

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet/core: move gro function declarations to separate header
Leon Romanovsky [Wed, 3 Feb 2021 13:51:11 +0000 (15:51 +0200)]
net/core: move gro function declarations to separate header

Fir the following compilation warnings:
 1031 | INDIRECT_CALLABLE_SCOPE void udp_v6_early_demux(struct sk_buff *skb)

net/ipv6/ip6_offload.c:182:41: warning: no previous prototype for â€˜ipv6_gro_receive’ [-Wmissing-prototypes]
  182 | INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
      |                                         ^~~~~~~~~~~~~~~~
net/ipv6/ip6_offload.c:320:29: warning: no previous prototype for â€˜ipv6_gro_complete’ [-Wmissing-prototypes]
  320 | INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff)
      |                             ^~~~~~~~~~~~~~~~~
net/ipv6/ip6_offload.c:182:41: warning: no previous prototype for â€˜ipv6_gro_receive’ [-Wmissing-prototypes]
  182 | INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
      |                                         ^~~~~~~~~~~~~~~~
net/ipv6/ip6_offload.c:320:29: warning: no previous prototype for â€˜ipv6_gro_complete’ [-Wmissing-prototypes]
  320 | INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff)

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoipv6: move udp declarations to net/udp.h
Leon Romanovsky [Wed, 3 Feb 2021 13:51:10 +0000 (15:51 +0200)]
ipv6: move udp declarations to net/udp.h

Fix the following compilation warning:

net/ipv6/udp.c:1031:30: warning: no previous prototype for 'udp_v6_early_demux' [-Wmissing-prototypes]
 1031 | INDIRECT_CALLABLE_SCOPE void udp_v6_early_demux(struct sk_buff *skb)
      |                              ^~~~~~~~~~~~~~~~~~
net/ipv6/udp.c:1072:29: warning: no previous prototype for 'udpv6_rcv' [-Wmissing-prototypes]
 1072 | INDIRECT_CALLABLE_SCOPE int udpv6_rcv(struct sk_buff *skb)
      |                             ^~~~~~~~~

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoipv6: silence compilation warning for non-IPV6 builds
Leon Romanovsky [Wed, 3 Feb 2021 13:51:09 +0000 (15:51 +0200)]
ipv6: silence compilation warning for non-IPV6 builds

The W=1 compilation of allmodconfig generates the following warning:

net/ipv6/icmp.c:448:6: warning: no previous prototype for 'icmp6_send' [-Wmissing-prototypes]
  448 | void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
      |      ^~~~~~~~~~

Fix it by providing function declaration for builds with ipv6 as a module.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: hns3: remove redundant null check of an array
Colin Ian King [Wed, 3 Feb 2021 13:10:40 +0000 (13:10 +0000)]
net: hns3: remove redundant null check of an array

The null check of filp->f_path.dentry->d_iname is redundant because
it is an array of DNAME_INLINE_LEN chars and cannot be a null. Fix
this by removing the null check.

Addresses-Coverity: ("Array compared against 0")
Fixes: 04987ca1b9b6 ("net: hns3: add debugfs support for tm nodes, priority and qset info")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20210203131040.21656-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonfc: pn533: Fix typo issue
wengjianfeng [Wed, 3 Feb 2021 09:38:42 +0000 (17:38 +0800)]
nfc: pn533: Fix typo issue

change 'piority' to 'priority'
change 'succesfult' to 'successful'

Signed-off-by: wengjianfeng <wengjianfeng@yulong.com>
Link: https://lore.kernel.org/r/20210203093842.11180-1-samirweng1979@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-enable-udp-v6-sockets-receiving-v4-packets-with-udp'
Jakub Kicinski [Fri, 5 Feb 2021 02:37:18 +0000 (18:37 -0800)]
Merge branch 'net-enable-udp-v6-sockets-receiving-v4-packets-with-udp'

Xin Long says:

====================
net: enable udp v6 sockets receiving v4 packets with UDP

Currently, udp v6 socket can not process v4 packets with UDP GRO, as
udp_encap_needed_key is not increased when udp_tunnel_encap_enable()
is called for v6 socket.

This patchset is to increase it and remove the unnecessary code in
bareudp in Patch 1/2, and improve rxrpc encap_enable by calling
udp_tunnel_encap_enable().
====================

Link: https://lore.kernel.org/r/cover.1612342376.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agorxrpc: call udp_tunnel_encap_enable in rxrpc_open_socket
Xin Long [Wed, 3 Feb 2021 08:54:23 +0000 (16:54 +0800)]
rxrpc: call udp_tunnel_encap_enable in rxrpc_open_socket

When doing encap_enable/increasing encap_needed_key, up->encap_enabled
is not set in rxrpc_open_socket(), and it will cause encap_needed_key
not being decreased in udpv6_destroy_sock().

This patch is to improve it by just calling udp_tunnel_encap_enable()
where it increases both UDP and UDPv6 encap_needed_key and sets
up->encap_enabled.

v4->v5:
  - add the missing '#include <net/udp_tunnel.h>', as David Howells
    noticed.

Acked-and-tested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoudp: call udp_encap_enable for v6 sockets when enabling encap
Xin Long [Wed, 3 Feb 2021 08:54:22 +0000 (16:54 +0800)]
udp: call udp_encap_enable for v6 sockets when enabling encap

When enabling encap for a ipv6 socket without udp_encap_needed_key
increased, UDP GRO won't work for v4 mapped v6 address packets as
sk will be NULL in udp4_gro_receive().

This patch is to enable it by increasing udp_encap_needed_key for
v6 sockets in udp_tunnel_encap_enable(), and correspondingly
decrease udp_encap_needed_key in udpv6_destroy_sock().

v1->v2:
  - add udp_encap_disable() and export it.
v2->v3:
  - add the change for rxrpc and bareudp into one patch, as Alex
    suggested.
v3->v4:
  - move rxrpc part to another patch.

Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet-loopback: set lo dev initial state to UP
Jian Yang [Mon, 1 Feb 2021 23:34:45 +0000 (15:34 -0800)]
net-loopback: set lo dev initial state to UP

Traditionally loopback devices come up with initial state as DOWN for
any new network-namespace. This would mean that anyone needing this
device would have to bring this UP by issuing something like 'ip link
set lo up'. This can be avoided if the initial state is set as UP.

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Jian Yang <jianyang@google.com>
Link: https://lore.kernel.org/r/20210201233445.2044327-1-jianyang.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-consolidate-page_is_pfmemalloc-usage'
Jakub Kicinski [Fri, 5 Feb 2021 02:20:17 +0000 (18:20 -0800)]
Merge branch 'net-consolidate-page_is_pfmemalloc-usage'

Alexander Lobakin says:

====================
net: consolidate page_is_pfmemalloc() usage

page_is_pfmemalloc() is used mostly by networking drivers to test
if a page can be considered for reusing/recycling.
It doesn't write anything to the struct page itself, so its sole
argument can be constified, as well as the first argument of
skb_propagate_pfmemalloc().
In Page Pool core code, it can be simply inlined instead.
Most of the callers from NIC drivers were just doppelgangers of
the same condition tests. Derive them into a new common function
do deduplicate the code.

Resend of v3 [2]:
 - it missed Patchwork and Netdev archives, probably due to server-side
   issues.

Since v2 [1]:
 - use more intuitive name for the new inline function since there's
   nothing "reserved" in remote pages (Jakub Kicinski, John Hubbard);
 - fold likely() inside the helper itself to make driver code a bit
   fancier (Jakub Kicinski);
 - split function introduction and using into two separate commits;
 - collect some more tags (Jesse Brandeburg, David Rientjes).

Since v1 [0]:
 - new: reduce code duplication by introducing a new common function
   to test if a page can be reused/recycled (David Rientjes);
 - collect autographs for Page Pool bits (Jesper Dangaard Brouer,
   Ilias Apalodimas).

[0] https://lore.kernel.org/netdev/20210125164612.243838-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210127201031.98544-1-alobakin@pm.me
[2] https://lore.kernel.org/lkml/20210131120844.7529-1-alobakin@pm.me
====================

Link: https://lore.kernel.org/r/20210202133030.5760-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: page_pool: simplify page recycling condition tests
Alexander Lobakin [Tue, 2 Feb 2021 13:31:46 +0000 (13:31 +0000)]
net: page_pool: simplify page recycling condition tests

pool_page_reusable() is a leftover from pre-NUMA-aware times. For now,
this function is just a redundant wrapper over page_is_pfmemalloc(),
so inline it into its sole call site.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: use the new dev_page_is_reusable() instead of private versions
Alexander Lobakin [Tue, 2 Feb 2021 13:31:35 +0000 (13:31 +0000)]
net: use the new dev_page_is_reusable() instead of private versions

Now we can remove a bunch of identical functions from the drivers and
make them use common dev_page_is_reusable(). All {,un}likely() checks
are omitted since it's already present in this helper.
Also update some comments near the call sites.

Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: introduce common dev_page_is_reusable()
Alexander Lobakin [Tue, 2 Feb 2021 13:31:21 +0000 (13:31 +0000)]
net: introduce common dev_page_is_reusable()

A bunch of drivers test the page before reusing/recycling for two
common conditions:
 - if a page was allocated under memory pressure (pfmemalloc page);
 - if a page was allocated at a distant memory node (to exclude
   slowdowns).

Introduce a new common inline for doing this, with likely() already
folded inside to make driver code a bit simpler.

Suggested-by: David Rientjes <rientjes@google.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: constify skb_propagate_pfmemalloc() "page" argument
Alexander Lobakin [Tue, 2 Feb 2021 13:31:05 +0000 (13:31 +0000)]
skbuff: constify skb_propagate_pfmemalloc() "page" argument

The function doesn't write anything to the page struct itself,
so this argument can be const.

Misc: align second argument to the brace while at it.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomm: constify page_is_pfmemalloc() argument
Alexander Lobakin [Tue, 2 Feb 2021 13:30:54 +0000 (13:30 +0000)]
mm: constify page_is_pfmemalloc() argument

The function only tests for page->index, so its argument should be
const.

Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: fix building errors on powerpc when CONFIG_RETPOLINE is not set
Brian Vazquez [Thu, 4 Feb 2021 18:18:39 +0000 (18:18 +0000)]
net: fix building errors on powerpc when CONFIG_RETPOLINE is not set

This commit fixes the errores reported when building for powerpc:

 ERROR: modpost: "ip6_dst_check" [vmlinux] is a static EXPORT_SYMBOL
 ERROR: modpost: "ipv4_dst_check" [vmlinux] is a static EXPORT_SYMBOL
 ERROR: modpost: "ipv4_mtu" [vmlinux] is a static EXPORT_SYMBOL
 ERROR: modpost: "ip6_mtu" [vmlinux] is a static EXPORT_SYMBOL

Fixes: f67fbeaebdc0 ("net: use indirect call helpers for dst_mtu")
Fixes: bbd807dfbf20 ("net: indirect call helpers for ipv4/ipv6 dst_check functions")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Link: https://lore.kernel.org/r/20210204181839.558951-2-brianvv@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: add EXPORT_INDIRECT_CALLABLE wrapper
Brian Vazquez [Thu, 4 Feb 2021 18:18:38 +0000 (18:18 +0000)]
net: add EXPORT_INDIRECT_CALLABLE wrapper

When a static function is annotated with INDIRECT_CALLABLE_SCOPE and
CONFIG_RETPOLINE is set, the static keyword is removed. Sometimes the
function needs to be exported but EXPORT_SYMBOL can't be used because if
CONFIG_RETPOLINE is not set, we will attempt to export a static symbol.

This patch introduces a new indirect call wrapper:
EXPORT_INDIRECT_CALLABLE. This basically does EXPORT_SYMBOL when
CONFIG_RETPOLINE is set, but does nothing when it's not.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Link: https://lore.kernel.org/r/20210204181839.558951-1-brianvv@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonetlink: add tracepoint at NL_SET_ERR_MSG
Marcelo Ricardo Leitner [Thu, 4 Feb 2021 01:48:16 +0000 (22:48 -0300)]
netlink: add tracepoint at NL_SET_ERR_MSG

Often userspace won't request the extack information, or they don't log it
because of log level or so, and even when they do, sometimes it's not
enough to know exactly what caused the error.

Netlink extack is the standard way of reporting erros with descriptive
error messages. With a trace point on it, we then can know exactly where
the error happened, regardless of userspace app. Also, we can even see if
the err msg was overwritten.

The wrapper do_trace_netlink_extack() is because trace points shouldn't be
called from .h files, as trace points are not that small, and the function
call to do_trace_netlink_extack() on the macros is not protected by
tracepoint_enabled() because the macros are called from modules, and this
would require exporting some trace structs. As this is error path, it's
better to export just the wrapper instead.

v2: removed leftover tracepoint declaration

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/4546b63e67b2989789d146498b13cc09e1fdc543.1612403190.git.marcelo.leitner@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodrivers: net: xen-netfront: Simplify the calculation of variables
Jiapeng Chong [Tue, 2 Feb 2021 10:17:49 +0000 (18:17 +0800)]
drivers: net: xen-netfront: Simplify the calculation of variables

Fix the following coccicheck warnings:

./drivers/net/xen-netfront.c:1816:52-54: WARNING !A || A && B is
equivalent to !A || B.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/1612261069-13315-1-git-send-email-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'gtp'
Jakub Kicinski [Thu, 4 Feb 2021 17:30:00 +0000 (09:30 -0800)]
Merge branch 'gtp'

Jonas Bonn says:

====================
There's ongoing work in this driver to provide support for IPv6, GRO,
GSO, and "collect metadata" mode operation.  In order to facilitate this
work going forward, this short series accumulates already ACK:ed patches
that are ready for the next merge window.

All of these patches should be uncontroversial at this point, including
the first one in the series that reverts a recently added change to
introduce "collect metadata" mode.  As that patch produces 'broken'
packets when common GTP headers are in place, it seems better to revert
it and rethink things a bit before inclusion.
====================

Link: https://lore.kernel.org/r/20210203070805.281321-1-jonas@norrbonn.se
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogtp: update rx_length_errors for abnormally short packets
Jonas Bonn [Wed, 3 Feb 2021 07:08:05 +0000 (08:08 +0100)]
gtp: update rx_length_errors for abnormally short packets

Based on work by Pravin Shelar.

Update appropriate stats when packet transmission isn't possible.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogtp: set device type
Jonas Bonn [Wed, 3 Feb 2021 07:08:04 +0000 (08:08 +0100)]
gtp: set device type

Set the devtype to 'gtp' when setting up the link.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogtp: drop unnecessary call to skb_dst_drop
Jonas Bonn [Wed, 3 Feb 2021 07:08:03 +0000 (08:08 +0100)]
gtp: drop unnecessary call to skb_dst_drop

The call to skb_dst_drop() is already done as part of udp_tunnel_xmit().

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogtp: really check namespaces before xmit
Jonas Bonn [Wed, 3 Feb 2021 07:08:02 +0000 (08:08 +0100)]
gtp: really check namespaces before xmit

Blindly assuming that packet transmission crosses namespaces results in
skb marks being lost in the single namespace case.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogtp: include role in link info
Jonas Bonn [Wed, 3 Feb 2021 07:08:01 +0000 (08:08 +0100)]
gtp: include role in link info

Querying link info for the GTP interface doesn't reveal in which "role" the
device is set to operate.  Include this information in the info query
result.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogtp: set initial MTU
Jonas Bonn [Wed, 3 Feb 2021 07:08:00 +0000 (08:08 +0100)]
gtp: set initial MTU

The GTP link is brought up with a default MTU of zero.  This can lead to
some rather unexpected behaviour for users who are more accustomed to
interfaces coming online with reasonable defaults.

This patch sets an initial MTU for the GTP link of 1500 less worst-case
tunnel overhead.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoRevert "GTP: add support for flow based tunneling API"
Jonas Bonn [Wed, 3 Feb 2021 07:07:59 +0000 (08:07 +0100)]
Revert "GTP: add support for flow based tunneling API"

This reverts commit 9ab7e76aefc97a9aa664accb59d6e8dc5e52514a.

This patch was committed without maintainer approval and despite a number
of unaddressed concerns from review.  There are several issues that
impede the acceptance of this patch and that make a reversion of this
particular instance of these changes the best way forward:

i)  the patch contains several logically separate changes that would be
better served as smaller patches (for review purposes)
ii) functionality like the handling of end markers has been introduced
without further explanation
iii) symmetry between the handling of GTPv0 and GTPv1 has been
unnecessarily broken
iv) the patchset produces 'broken' packets when extension headers are
included
v) there are no available userspace tools to allow for testing this
functionality
vi) there is an unaddressed Coverity report against the patch concering
memory leakage
vii) most importantly, the patch contains a large amount of superfluous
churn that impedes other ongoing work with this driver

This patch will be reworked into a series that aligns with other
ongoing work and facilitates review.

Signed-off-by: Jonas Bonn <jonas@norrbonn.se>
Acked-by: Harald Welte <laforge@gnumonks.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: tracepoint: exposing sk_family in all tcp:tracepoints
Hariharan Ananthakrishnan [Fri, 29 Jan 2021 00:12:10 +0000 (00:12 +0000)]
net: tracepoint: exposing sk_family in all tcp:tracepoints

Similar to sock:inet_sock_set_state tracepoint, expose sk_family to
distinguish AF_INET and AF_INET6 families.

The following tcp tracepoints are updated:
tcp:tcp_destroy_sock
tcp:tcp_rcv_space_adjust
tcp:tcp_retransmit_skb
tcp:tcp_send_reset
tcp:tcp_receive_reset
tcp:tcp_retransmit_synack
tcp:tcp_probe

Signed-off-by: Hariharan Ananthakrishnan <hari@netflix.com>
Signed-off-by: Brendan Gregg <bgregg@netflix.com>
Link: https://lore.kernel.org/r/20210129001210.344438-1-hari@netflix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotcp: use a smaller percpu_counter batch size for sk_alloc
Wei Wang [Tue, 2 Feb 2021 19:34:08 +0000 (11:34 -0800)]
tcp: use a smaller percpu_counter batch size for sk_alloc

Currently, a percpu_counter with the default batch size (2*nr_cpus) is
used to record the total # of active sockets per protocol. This means
sk_sockets_allocated_read_positive() could be off by +/-2*(nr_cpus^2).
This under/over-estimation could lead to wrong memory suppression
conditions in __sk_raise_mem_allocated().
Fix this by using a more reasonable fixed batch size of 16.

See related commit cf86a086a180 ("net/dst: use a smaller percpu_counter
batch for dst entries accounting") that addresses a similar issue.

Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com>
Link: https://lore.kernel.org/r/20210202193408.1171634-1-weiwan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'support-setting-lanes-via-ethtool'
Jakub Kicinski [Thu, 4 Feb 2021 02:37:31 +0000 (18:37 -0800)]
Merge branch 'support-setting-lanes-via-ethtool'

Danielle Ratson says:

====================
Support setting lanes via ethtool

Some speeds can be achieved with different number of lanes. For example,
100Gbps can be achieved using two lanes of 50Gbps or four lanes of
25Gbps. This patchset adds a new selector that allows ethtool to
advertise link modes according to their number of lanes and also force a
specific number of lanes when autonegotiation is off.

Advertising all link modes with a speed of 100Gbps that use two lanes:

$ ethtool -s swp1 speed 100000 lanes 2 autoneg on

Forcing a speed of 100Gbps using four lanes:

$ ethtool -s swp1 speed 100000 lanes 4 autoneg off
====================

Link: https://lore.kernel.org/r/20210202180612.325099-1-danieller@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: selftests: Add lanes setting test
Danielle Ratson [Tue, 2 Feb 2021 18:06:12 +0000 (20:06 +0200)]
net: selftests: Add lanes setting test

Test that setting lanes parameter is working.

Set max speed and max lanes in the list of advertised link modes,
and then try to set max speed with the lanes below max lanes if exists
in the list.

And then, test that setting number of lanes larger than max lanes fails.

Do the above for both autoneg on and off.

$ ./ethtool_lanes.sh

TEST: 4 lanes is autonegotiated                                     [ OK ]
TEST: Lanes number larger than max width is not set                 [ OK ]
TEST: Autoneg off, 4 lanes detected during force mode               [ OK ]
TEST: Lanes number larger than max width is not set                 [ OK ]

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomlxsw: ethtool: Pass link mode in use to ethtool
Danielle Ratson [Tue, 2 Feb 2021 18:06:11 +0000 (20:06 +0200)]
mlxsw: ethtool: Pass link mode in use to ethtool

Currently, when user space queries the link's parameters, as speed and
duplex, each parameter is passed from the driver to ethtool.

Instead, pass the link mode bit in use.
In Spectrum-1, simply pass the bit that is set to '1' from PTYS register.
In Spectrum-2, pass the first link mode bit in the mask of the used
link mode.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomlxsw: ethtool: Add support for setting lanes when autoneg is off
Danielle Ratson [Tue, 2 Feb 2021 18:06:10 +0000 (20:06 +0200)]
mlxsw: ethtool: Add support for setting lanes when autoneg is off

Currently, when auto negotiation is set to off, the user can force a
specific speed or both speed and duplex. The user cannot influence the
number of lanes that will be forced.

Add support for setting speed along with lanes so one would be able
to choose how many lanes will be forced.

When lanes parameter is passed from user space, choose the link mode
that its actual width equals to it.
Otherwise, the default link mode will be the one that supports the width
of the port.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomlxsw: ethtool: Remove max lanes filtering
Danielle Ratson [Tue, 2 Feb 2021 18:06:09 +0000 (20:06 +0200)]
mlxsw: ethtool: Remove max lanes filtering

Currently, when a speed can be supported by different number of lanes,
the supported link modes bitmask contains only link modes with a single
number of lanes.

This was done in order to prevent auto negotiation on number of
lanes after 50G-1-lane and 100G-2-lanes link modes were introduced.

For example, if a port's max width is 4, only link modes with 4 lanes
will be presented as supported by that port, so 100G is always achieved by
4 lanes of 25G.

After the previous patches that allow selection of the number of lanes,
auto negotiation on number of lanes becomes practical.

Remove that filtering of the maximum number of lanes supported link modes,
so indeed all the supported and advertised link modes will be shown.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoethtool: Expose the number of lanes in use
Danielle Ratson [Tue, 2 Feb 2021 18:06:08 +0000 (20:06 +0200)]
ethtool: Expose the number of lanes in use

Currently, ethtool does not expose how many lanes are used when the
link is up.

After adding a possibility to advertise or force a specific number of
lanes, the lanes in use value can be either the maximum width of the port
or below.

Extend ethtool to expose the number of lanes currently in use for
drivers that support it.

For example:

$ ethtool -s swp1 speed 100000 lanes 4
$ ethtool -s swp2 speed 100000 lanes 4
$ ip link set swp1 up
$ ip link set swp2 up
$ ethtool swp1
Settings for swp1:
        Supported ports: [ FIBRE         Backplane ]
        Supported link modes:   1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                50000baseKR/Full
                                50000baseSR/Full
                                50000baseCR/Full
                                50000baseLR_ER_FR/Full
                                50000baseDR/Full
                                100000baseKR2/Full
                                100000baseSR2/Full
                                100000baseCR2/Full
                                100000baseLR2_ER2_FR2/Full
                                100000baseDR2/Full
                                200000baseKR4/Full
                                200000baseSR4/Full
                                200000baseLR4_ER4_FR4/Full
                                200000baseDR4/Full
                                200000baseCR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseT/Full
                                10000baseT/Full
                                1000baseKX/Full
                                1000baseKX/Full
                                10000baseKR/Full
                                10000baseR_FEC
                                40000baseKR4/Full
                                40000baseCR4/Full
                                40000baseSR4/Full
                                40000baseLR4/Full
                                25000baseCR/Full
                                25000baseKR/Full
                                25000baseSR/Full
                                50000baseCR2/Full
                                50000baseKR2/Full
                                100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
                                50000baseSR2/Full
                                10000baseCR/Full
                                10000baseSR/Full
                                10000baseLR/Full
                                10000baseER/Full
                                200000baseKR4/Full
                                200000baseSR4/Full
                                200000baseLR4_ER4_FR4/Full
                                200000baseDR4/Full
                                200000baseCR4/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Advertised link modes:  100000baseKR4/Full
                                100000baseSR4/Full
                                100000baseCR4/Full
                                100000baseLR4_ER4/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Speed: 100000Mb/s
Lanes: 4
Duplex: Full
Auto-negotiation: on
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Link detected: yes

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoethtool: Get link mode in use instead of speed and duplex parameters
Danielle Ratson [Tue, 2 Feb 2021 18:06:07 +0000 (20:06 +0200)]
ethtool: Get link mode in use instead of speed and duplex parameters

Currently, when user space queries the link's parameters, as speed and
duplex, each parameter is passed from the driver to ethtool.

Instead, get the link mode bit in use, and derive each of the parameters
from it in ethtool.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoethtool: Extend link modes settings uAPI with lanes
Danielle Ratson [Tue, 2 Feb 2021 18:06:06 +0000 (20:06 +0200)]
ethtool: Extend link modes settings uAPI with lanes

Currently, when auto negotiation is on, the user can advertise all the
linkmodes which correspond to a specific speed, but does not have a
similar selector for the number of lanes. This is significant when a
specific speed can be achieved using different number of lanes.  For
example, 2x50 or 4x25.

Add 'ETHTOOL_A_LINKMODES_LANES' attribute and expand 'struct
ethtool_link_settings' with lanes field in order to implement a new
lanes-selector that will enable the user to advertise a specific number
of lanes as well.

When auto negotiation is off, lanes parameter can be forced only if the
driver supports it. Add a capability bit in 'struct ethtool_ops' that
allows ethtool know if the driver can handle the lanes parameter when
auto negotiation is off, so if it does not, an error message will be
returned when trying to set lanes.

Example:

$ ethtool -s swp1 lanes 4
$ ethtool swp1
  Settings for swp1:
Supported ports: [ FIBRE ]
        Supported link modes:   1000baseKX/Full
                                10000baseKR/Full
                                40000baseCR4/Full
40000baseSR4/Full
40000baseLR4/Full
                                25000baseCR/Full
                                25000baseSR/Full
50000baseCR2/Full
                                100000baseSR4/Full
100000baseCR4/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  40000baseCR4/Full
40000baseSR4/Full
40000baseLR4/Full
                                100000baseSR4/Full
100000baseCR4/Full
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Port: Direct Attach Copper
        PHYAD: 0
        Transceiver: internal
        Link detected: no

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoethtool: Validate master slave configuration before rtnl_lock()
Danielle Ratson [Tue, 2 Feb 2021 18:06:05 +0000 (20:06 +0200)]
ethtool: Validate master slave configuration before rtnl_lock()

Create a new function for input validations to be called before
rtnl_lock() and move the master slave validation to that function.

This would be a cleanup for next patch that would add another validation
to the new function.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: fix SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING getting ignored
Vladimir Oltean [Tue, 2 Feb 2021 23:31:09 +0000 (01:31 +0200)]
net: dsa: fix SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING getting ignored

The bridge emits VLAN filtering events and quite a few others via
switchdev with orig_dev = br->dev. After the blamed commit, these events
started getting ignored.

The point of the patch was to not offload switchdev objects for ports
that didn't go through dsa_port_bridge_join, because the configuration
is unsupported:
- ports that offload a bonding/team interface go through
  dsa_port_bridge_join when that bonding/team interface is later bridged
  with another switch port or LAG
- ports that don't offload LAG don't get notified of the bridge that is
  on top of that LAG.

Sadly, a check is missing, which is that the orig_dev is equal to the
bridge device. This check is compatible with the original intention,
because ports that don't offload bridging because they use a software
LAG don't have dp->bridge_dev set.

On a semi-related note, we should not offload switchdev objects or
populate dp->bridge_dev if the driver doesn't implement .port_bridge_join
either. However there is no regression associated with that, so it can
be done separately.

Fixes: 5696c8aedfcc ("net: dsa: Don't offload port attributes on standalone ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Tobias Waldekranz <tobias@waldekranz.com>
Link: https://lore.kernel.org/r/20210202233109.1591466-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'chelsio-cxgb-use-threaded-interrupts-for-deferred-work'
Jakub Kicinski [Thu, 4 Feb 2021 01:41:03 +0000 (17:41 -0800)]
Merge branch 'chelsio-cxgb-use-threaded-interrupts-for-deferred-work'

Sebastian Andrzej Siewior says:

====================
chelsio: cxgb: Use threaded interrupts for deferred work

Patch #2 fixes an issue in which del_timer_sync() and tasklet_kill() is
invoked from the interrupt handler. This is probably a rare error case
since it disables interrupts / the card in that case.
Patch #1 converts a worker to use a threaded interrupt which is then
also used in patch #2 instead adding another worker for this task (and
flush_work() to synchronise vs rmmod).
====================

Link: https://lore.kernel.org/r/20210202170104.1909200-1-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochelsio: cxgb: Disable the card on error in threaded interrupt
Sebastian Andrzej Siewior [Tue, 2 Feb 2021 17:01:04 +0000 (18:01 +0100)]
chelsio: cxgb: Disable the card on error in threaded interrupt

t1_fatal_err() is invoked from the interrupt handler. The bad part is
that it invokes (via t1_sge_stop()) del_timer_sync() and tasklet_kill().
Both functions must not be called from an interrupt because it is
possible that it will wait for the completion of the timer/tasklet it
just interrupted.

In case of a fatal error, use t1_interrupts_disable() to disable all
interrupt sources and then wake the interrupt thread with
F_PL_INTR_SGE_ERR as pending flag. The threaded-interrupt will stop the
card via t1_sge_stop() and not re-enable the interrupts again.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochelsio: cxgb: Replace the workqueue with threaded interrupt
Sebastian Andrzej Siewior [Tue, 2 Feb 2021 17:01:03 +0000 (18:01 +0100)]
chelsio: cxgb: Replace the workqueue with threaded interrupt

The external interrupt (F_PL_INTR_EXT) needs to be handled in a process
context and this is accomplished by utilizing a workqueue.

The process context can also be provided by a threaded interrupt instead
of a workqueue. The threaded interrupt can be used later for other
interrupt related processing which require non-atomic context without
using yet another workqueue. free_irq() also ensures that the thread is
done which is currently missing (the worker could continue after the
module has been removed).

Save pending flags in pending_thread_intr. Use the same mechanism
to disable F_PL_INTR_EXT as interrupt source like it is used before the
worker is scheduled. Enable the interrupt again once
t1_elmer0_ext_intr_handler() is done.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'support-for-octeontx2-98xx-cpt-block'
Jakub Kicinski [Thu, 4 Feb 2021 01:31:35 +0000 (17:31 -0800)]
Merge branch 'support-for-octeontx2-98xx-cpt-block'

Srujana Challa says:

====================
Support for OcteonTX2 98xx CPT block.

OcteonTX2 series of silicons have multiple variants, the
98xx variant has two crypto (CPT) blocks to double the crypto
performance. This patchset adds support for new CPT block(CPT1).
====================

Link: https://lore.kernel.org/r/20210202152709.20450-1-schalla@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoocteontx2-af: Handle CPT function level reset
Srujana Challa [Tue, 2 Feb 2021 15:27:09 +0000 (20:57 +0530)]
octeontx2-af: Handle CPT function level reset

When FLR is initiated for a VF (PCI function level reset),
the parent PF gets a interrupt. PF then sends a message to
admin function (AF), which then cleans up all resources
attached to that VF. This patch adds support to handle
CPT FLR.

Signed-off-by: Narayana Prasad Raju Atherya <pathreya@marvell.com>
Signed-off-by: Suheil Chandran <schandran@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoocteontx2-af: Add support for CPT1 in debugfs
Srujana Challa [Tue, 2 Feb 2021 15:27:08 +0000 (20:57 +0530)]
octeontx2-af: Add support for CPT1 in debugfs

Adds support to display block CPT1 stats at
"/sys/kernel/debug/octeontx2/cpt1".

Signed-off-by: Mahipal Challa <mchalla@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoocteontx2-af: Mailbox changes for 98xx CPT block
Srujana Challa [Tue, 2 Feb 2021 15:27:07 +0000 (20:57 +0530)]
octeontx2-af: Mailbox changes for 98xx CPT block

This patch changes CPT mailbox message format to
support new block CPT1 in 98xx silicon.

cpt_rd_wr_reg ->
    Modify cpt_rd_wr_reg mailbox and its handler to
    accommodate new block CPT1.
cpt_lf_alloc ->
    Modify cpt_lf_alloc mailbox and its handler to
    configure LFs from a block address out of multiple
    blocks of same type. If a PF/VF needs to configure
    LFs from both the blocks then this mbox should be
    called twice.

Signed-off-by: Mahipal Challa <mchalla@marvell.com>
Signed-off-by: Srujana Challa <schalla@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: mdiobus: Prevent spike on MDIO bus reset signal
Mike Looijmans [Tue, 2 Feb 2021 14:32:39 +0000 (15:32 +0100)]
net: mdiobus: Prevent spike on MDIO bus reset signal

The mdio_bus reset code first de-asserted the reset by allocating with
GPIOD_OUT_LOW, then asserted and de-asserted again. In other words, if
the reset signal defaulted to asserted, there'd be a short "spike"
before the reset.

Here is what happens depending on the pre-existing state of the reset
signal:
Reset (previously asserted):   ~~~|_|~~~~|_______
Reset (previously deasserted): _____|~~~~|_______
                                  ^ ^    ^
                                  A B    C

At point A, the low going transition is because the reset line is
requested using GPIOD_OUT_LOW. If the line is successfully requested,
the first thing we do is set it high _without_ any delay. This is
point B. So, a glitch occurs between A and B.

We then fsleep() and finally set the GPIO low at point C.

Requesting the line using GPIOD_OUT_HIGH eliminates the A and B
transitions. Instead we get:

Reset (previously asserted)  : ~~~~~~~~~~|______
Reset (previously deasserted): ____|~~~~~|______
                                   ^     ^
                                   A     C

Where A and C are the points described above in the code. Point B
has been eliminated.

The issue was found when we pulled down the reset signal for the
Marvell 88E1512P PHY (because it requires at least 50ms after POR with
an active clock). Looking at the reset signal with a scope revealed a
short spike, point B in the artwork above.

Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20210202143239.10714-1-mike.looijmans@topic.nl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoe1000: drop unneeded assignment in e1000_set_itr()
Sudip Mukherjee [Sun, 11 Oct 2020 21:23:26 +0000 (22:23 +0100)]
e1000: drop unneeded assignment in e1000_set_itr()

The variable 'current_itr' is assigned to 0 before jumping to
'set_itr_now' but it has not been used after the jump. So, remove the
unneeded assignment.

Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoe1000e: remove the redundant value assignment in e1000_update_nvm_checksum_spt
Kaixu Xia [Sat, 21 Nov 2020 10:17:27 +0000 (18:17 +0800)]
e1000e: remove the redundant value assignment in e1000_update_nvm_checksum_spt

Both of the statements are value assignment of the variable act_offset.
The first value assignment is overwritten by the second and is useless.
Remove it.

Reported-by: Tosk Robot <tencent_os_robot@tencent.com>
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigb: remove h from printk format specifier
Tom Rix [Wed, 23 Dec 2020 19:44:25 +0000 (11:44 -0800)]
igb: remove h from printk format specifier

This change fixes the checkpatch warning described in this
commit cbacb5ab0aa0 ("docs: printk-formats: Stop encouraging use of
unnecessary %h[xudi] and %hh[xudi]")

Standard integer promotion is already done and %hx and %hhx is useless
so do not encourage the use of %hh[xudi] or %h[xudi].

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigb: Enable RSS for Intel I211 Ethernet Controller
Nick Lowe [Mon, 21 Dec 2020 22:25:02 +0000 (22:25 +0000)]
igb: Enable RSS for Intel I211 Ethernet Controller

The Intel I211 Ethernet Controller supports 2 Receive Side Scaling (RSS)
queues. It should not be excluded from having this feature enabled.

Via commit c883de9fd787 ("igb: rename igb define to be more generic")
E1000_MRQC_ENABLE_RSS_4Q was renamed to E1000_MRQC_ENABLE_RSS_MQ to
indicate that this is a generic bit flag to enable queues and not
a flag that is specific to devices that support 4 queues

The bit flag enables 2, 4 or 8 queues appropriately depending on the part.

Tested with a multicore CPU and frames were then distributed as expected.

This issue appears to have been introduced because of confusion caused
by the prior name.

Signed-off-by: Nick Lowe <nick.lowe@gmail.com>
Tested-by: David Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigb: fix TDBAL register show incorrect value
Gal Hammer [Wed, 6 Jan 2021 09:28:26 +0000 (11:28 +0200)]
igb: fix TDBAL register show incorrect value

Fixed a typo which caused the registers dump function to read the
RDBAL register when printing TDBAL register values.

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Tested-by: David Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Fix TDBAL register show incorrect value
Sasha Neftin [Wed, 6 Jan 2021 17:27:04 +0000 (19:27 +0200)]
igc: Fix TDBAL register show incorrect value

Fixed a typo which caused the registers dump function to read the
RDBAL register when printing TDBAL register values.
_reg_dump method has been partially derived from i210 and have
same typo.

Suggested-by: Gal Hammer <ghammer@redhat.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agonet: mscc: ocelot: fix error code in mscc_ocelot_probe()
Dan Carpenter [Tue, 2 Feb 2021 09:13:44 +0000 (12:13 +0300)]
net: mscc: ocelot: fix error code in mscc_ocelot_probe()

Probe should return an error code if platform_get_irq_byname() fails
but it returns success instead.

Fixes: 6c30384eb1de ("net: mscc: ocelot: register devlink ports")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/YBkXyFIl4V9hgxYM@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: mscc: ocelot: fix error handling bugs in mscc_ocelot_init_ports()
Dan Carpenter [Tue, 2 Feb 2021 09:12:38 +0000 (12:12 +0300)]
net: mscc: ocelot: fix error handling bugs in mscc_ocelot_init_ports()

There are several error handling bugs in mscc_ocelot_init_ports().  I
went through the code, and carefully audited it and made fixes and
cleanups.

1) The ocelot_probe_port() function didn't have a mirror release function
   so it was hard to follow.  I created the ocelot_release_port()
   function.
2) In the ocelot_probe_port() function, if the register_netdev() call
   failed, then it lead to a double free_netdev(dev) bug.  Fix this by
   setting "ocelot->ports[port] = NULL" on the error path.
3) I was concerned that the "port" which comes from of_property_read_u32()
   might be out of bounds so I added a check for that.
4) In the original code if ocelot_regmap_init() failed then the driver
   tried to continue but I think that should be a fatal error.
5) If ocelot_probe_port() failed then the most recent devlink was leaked.
   The fix for mostly came Vladimir Oltean.  Get rid of "registered_ports"
   and just set a bit in "devlink_ports_registered" to say when the
   devlink port has been registered (and needs to be unregistered on
   error).  There are fewer than 32 ports so a u32 is large enough for
   this purpose.
6) The error handling if the final ocelot_port_devlink_init() failed had
   two problems.  The "while (port-- >= 0)" loop should have been
   "--port" pre-op instead of a post-op to avoid a buffer underflow.
   The "if (!registered_ports[port])" condition was reversed leading to
   resource leaks and double frees.

Fixes: 6c30384eb1de ("net: mscc: ocelot: register devlink ports")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/YBkXhqRxHtRGzSnJ@mwanda
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoigc: Remove unused FUNC_1 mask
Sasha Neftin [Sun, 20 Dec 2020 09:15:36 +0000 (11:15 +0200)]
igc: Remove unused FUNC_1 mask

FUNC_1 mask not in use in i225 device and could be removed

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Remove unused local receiver mask
Sasha Neftin [Sun, 13 Dec 2020 15:25:26 +0000 (17:25 +0200)]
igc: Remove unused local receiver mask

Local receiver mask SR_1000T_LOCAL_RX_STATUS not in use in i225 device
and could be removed

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Prefer strscpy over strlcpy
Sasha Neftin [Sun, 17 Jan 2021 08:57:02 +0000 (10:57 +0200)]
igc: Prefer strscpy over strlcpy

Use the strscpy method instead of strlcpy method.

See: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr
_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Expose the gPHY firmware version
Sasha Neftin [Sun, 20 Dec 2020 09:16:49 +0000 (11:16 +0200)]
igc: Expose the gPHY firmware version

Extend reporting of NVM image version to include the gPHY (i225 PHY)
firmware version.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Expose the NVM version
Sasha Neftin [Thu, 10 Dec 2020 06:42:09 +0000 (08:42 +0200)]
igc: Expose the NVM version

Expose the NVM map version via drvinfo in ethtool
NVM image version is reported as firmware version for i225 device
Minor typo fix - remove space

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Add Host Good Packets Transmitted Count
Sasha Neftin [Sun, 6 Dec 2020 12:07:00 +0000 (14:07 +0200)]
igc: Add Host Good Packets Transmitted Count

This counter counts the number of good (non-erred) packets
transmitted sent by the host.
A good transmit packet is considered one that is 64 or more bytes
in length (from <Destination Address> through <CRC>,
inclusively) in length

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Remove MULR mask define
Sasha Neftin [Sun, 6 Dec 2020 09:37:07 +0000 (11:37 +0200)]
igc: Remove MULR mask define

Multiple Tx Data Read Requests is hardware pipeline feature and
is not controlled by software

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Remove igc_set_fw_version comment
Sasha Neftin [Mon, 30 Nov 2020 17:43:00 +0000 (19:43 +0200)]
igc: Remove igc_set_fw_version comment

i225 device not supported and do not plan to support
configuration of fw version string for ethtool

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoigc: Clean up nvm_operations structure
Sasha Neftin [Mon, 30 Nov 2020 09:24:04 +0000 (11:24 +0200)]
igc: Clean up nvm_operations structure

valid_led_default function pointer not in use and can
be removed from nvm_operations structure.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoMerge branch 'net-use-indirect_call-in-some-dst_ops'
Jakub Kicinski [Wed, 3 Feb 2021 22:51:41 +0000 (14:51 -0800)]
Merge branch 'net-use-indirect_call-in-some-dst_ops'

Brian Vazquez says:

====================
net: use INDIRECT_CALL in some dst_ops

This patch series uses the INDIRECT_CALL wrappers in some dst_ops
functions to mitigate retpoline costs. Benefits depend on the
platform as described below.

Background: The kernel rewrites the retpoline code at
__x86_indirect_thunk_r11 depending on the CPU's requirements.
The INDIRECT_CALL wrappers provide hints on possible targets and
save the retpoline overhead using a direct call in case the
target matches one of the hints.

The retpoline overhead for the following three cases has been
measured by Luigi Rizzo in microbenchmarks, using CPU performance
counters, and cover reasonably well the range of possible retpoline
overheads compared to a plain indirect call (in equal conditions,
specifically with predicted branch, hot cache):

- just "jmp *(%r11)" on modern platforms like Intel Cascadelake.
  In this case the overhead is just 2 clock cycles:

- "lfence; jmp *(%r11)" on e.g. some recent AMD CPUs.
  In this case the lfence is blocked until pending reads complete,
  so the actual overhead depends on previous instructions.
  The best case we have measured 15 clock cycles of overhead.

- worst case, e.g. skylake, the full retpoline is used

    __x86_indirect_thunk_r11:     call set_u_target
    capture_speculation:          pause
                                  lfence
                                  jmp capture_speculation
    .align 16
    set_up_target:                mov %r11, (%rsp)
                                  ret

   In this case the overhead has been measured in 35-40 clock cycles.

The actual time saved hence depends on the platform and current
clock speed (which varies heavily, especially when C-states are active).
Also note that actual benefit might be lower than expected if the
longer retpoline overlaps with some pending memory read.

MEASUREMENTS:
The INDIRECT_CALL wrappers in this patchset involve the processing
of incoming SYN and generation of syncookies. Hence, the test has been
run by configuring a receiving host with a single NIC rx queue, disabling
RPS and RFS so that all processing occurs on the same core.
An external source generates SYN fast enough to saturate the receiving CPU.
We ran two sets of experiments, with and without the dst_output patch,
comparing the number of syncookies generated over a 20s period
in multiple runs.

Assuming the CPU is saturated, the time per packet is
   t = number_of_packets/total_time
and if the two datasets have statistically meaningful difference,
the difference in times between the two cases gives an estimate
of the benefits from one INDIRECT_CALL.

Here are the experimental results:

Skylake     Syncookies over 20s (5 tests)
---------------------------------------------------
indirect    9166325 9182023 9170093 9134014 9171082
retpoline   9099308 9126350 9154841 9056377 9122376

Computing the stats on the ns_pkt = 20e6/total_packets gives the following:

$ ministat -c 95 -w 70 /tmp/sk-indirect /tmp/sk-retp
x /tmp/sk-indirect
+ /tmp/sk-retp
+----------------------------------------------------------------------+
|x     xx x     +          x    + +           +                       +|
||______M__A_______|_|____________M_____A___________________|          |
+----------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5   2.17817e-06   2.18962e-06     2.181e-06  2.182292e-06 4.3252133e-09
+   5   2.18464e-06   2.20839e-06   2.19241e-06  2.194974e-06 8.8695958e-09
Difference at 95.0% confidence
        1.2682e-08 +/- 1.01766e-08
        0.581132% +/- 0.466326%
        (Student's t, pooled s = 6.97772e-09)

This suggests a difference of 13ns +/- 10ns
Our expectation from microbenchmarks was 35-40 cycles per call,
but part of the gains may be eaten by stalls from pending memory reads.

For Cascadelake:
Cascadelake     Syncookies over 20s (5 tests)
---------------------------------------------------------
indirect     10339797 10297547 10366826 10378891 10384854
retpoline    10332674 10366805 10320374 10334272 10374087

Computing the stats on the ns_pkt = 20e6/total_packets gives no
meaningful difference even at just 80% (this was expected):

$ ministat -c 80 -w 70 /tmp/cl-indirect /tmp/cl-retp
x /tmp/cl-indirect
+ /tmp/cl-retp
+----------------------------------------------------------------------+
|   x    x  +     *                   x   + +        +                x|
||______________|_M_________A_____A_______M________|___|               |
+----------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5   1.92588e-06   1.94221e-06   1.92923e-06  1.931716e-06 6.6936746e-09
+   5   1.92788e-06   1.93791e-06   1.93531e-06  1.933188e-06 4.3734106e-09
No difference proven at 80.0% confidence
====================

Link: https://lore.kernel.org/r/20210201174132.3534118-1-brianvv@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: indirect call helpers for ipv4/ipv6 dst_check functions
Brian Vazquez [Mon, 1 Feb 2021 17:41:32 +0000 (17:41 +0000)]
net: indirect call helpers for ipv4/ipv6 dst_check functions

This patch avoids the indirect call for the common case:
ip6_dst_check and ipv4_dst_check

Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: use indirect call helpers for dst_mtu
Brian Vazquez [Mon, 1 Feb 2021 17:41:31 +0000 (17:41 +0000)]
net: use indirect call helpers for dst_mtu

This patch avoids the indirect call for the common case:
ip6_mtu and ipv4_mtu

Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>