Shen Lichuan [Thu, 29 Aug 2024 02:12:53 +0000 (10:12 +0800)]
sfc: Convert to use ERR_CAST()
As opposed to open-code, using the ERR_CAST macro clearly indicates that
this is a pointer to an error value and a type conversion was performed.
Signed-off-by: Shen Lichuan <shenlichuan@vivo.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240829021253.3066-1-shenlichuan@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Fri, 30 Aug 2024 09:27:36 +0000 (10:27 +0100)]
Merge branch 'am-qt2025-phy-rust'
FUJITA Tomonori says:
====================
net: phy: add Applied Micro QT2025 PHY driver
This patchset adds a PHY driver for Applied Micro Circuits Corporation
QT2025.
The first patch adds Rust equivalent to include/linux/sizes.h, makes
code more readable. The 2-5th patches update the PHYLIB Rust bindings.
The 4th and 5th patches have been reviewed previously in a different
thread [1].
QT2025 PHY support was implemented as a part of an Ethernet driver for
Tehuti Networks TN40xx chips. Multiple vendors (DLink, Asus, Edimax,
QNAP, etc) developed adapters based on TN40xx chips. Tehuti Networks
went out of business and the driver wasn't merged into mainline. But
it's still distributed with some of the hardware (and also available
on some vendor sites).
The original driver handles multiple PHY hardware (AMCC QT2025, TI
TLK10232, Aqrate AQR105, and Marvell MV88X3120, MV88X3310, and
MV88E2010). I divided the original driver into MAC and PHY drivers and
implemented a QT2025 PHY driver in Rust.
The MAC driver for Tehuti Networks TN40xx chips was already merged in
6.11-rc1. The MAC and this PHY drivers have been tested with Edimax
EN-9320SFP+ 10G network adapter.
[1] https://lore.kernel.org/rust-for-linux/
20240607052113.69026-1-fujita.tomonori@gmail.com/
v7:
- add Trevor as Reviewer to MAINTAINERS file entry
- add Trevor Reviewed-by
- add/fix comments
- replace uppercase hex with lowercase
- remove unnecessary code
- update the commit message (1st patch)
v6: https://lore.kernel.org/netdev/
20240820225719.91410-1-fujita.tomonori@gmail.com/
- improve comments
- make the logic to load firmware more readable
- add Copy trait to reg::{C22 and C45}
- add Trevor Reviewed-by
v5: https://lore.kernel.org/netdev/
20240819005345.84255-1-fujita.tomonori@gmail.com/
- fix the comments (3th patch)
- add RUST_FW_LOADER_ABSTRACTIONS dependency
- add Andrew and Benno Reviewed-by
v4: https://lore.kernel.org/netdev/
20240817051939.77735-1-fujita.tomonori@gmail.com/
- fix the comments
- add Andrew's Reviewed-by
- fix the order of tags
- remove wrong endianness conversion
v3: https://lore.kernel.org/netdev/
20240804233835.223460-1-fujita.tomonori@gmail.com/
- use addr_of_mut!` to avoid intermediate mutable reference
- update probe callback's Safety comment
- add MODULE_FIRMWARE equivalent
- add Alice's Reviewed-by
v2: https://lore.kernel.org/netdev/
20240731042136.201327-1-fujita.tomonori@gmail.com/
- add comments in accordance with the hw datasheet
- unify C22 and C45 APIs
- load firmware in probe callback instead of config_init
- use firmware API
- handle firmware endian
- check firmware size
- use SZ_*K constants
- avoid confusing phy_id variable
v1: https://lore.kernel.org/netdev/
20240415104701.4772-1-fujita.tomonori@gmail.com/
====================
rom: FUJITA Tomonori <fujita.tomonori@gmail.com>
To: netdev@vger.kernel.org
Cc: rust-for-linux@vger.kernel.org, andrew@lunn.ch,
tmgross@umich.edu, miguel.ojeda.sandonis@gmail.com,
benno.lossin@proton.me, aliceryhl@google.com
Subject: [PATCH net-next v7 0/6] net: phy: add Applied Micro QT2025 PHY driver
Date: Sat, 24 Aug 2024 02:06:10 +0000 [thread overview]
Message-ID: <
20240824020617.113828-1-fujita.tomonori@gmail.com> (raw)
This patchset adds a PHY driver for Applied Micro Circuits Corporation
QT2025.
The first patch adds Rust equivalent to include/linux/sizes.h, makes
code more readable. The 2-5th patches update the PHYLIB Rust bindings.
The 4th and 5th patches have been reviewed previously in a different
thread [1].
QT2025 PHY support was implemented as a part of an Ethernet driver for
Tehuti Networks TN40xx chips. Multiple vendors (DLink, Asus, Edimax,
QNAP, etc) developed adapters based on TN40xx chips. Tehuti Networks
went out of business and the driver wasn't merged into mainline. But
it's still distributed with some of the hardware (and also available
on some vendor sites).
The original driver handles multiple PHY hardware (AMCC QT2025, TI
TLK10232, Aqrate AQR105, and Marvell MV88X3120, MV88X3310, and
MV88E2010). I divided the original driver into MAC and PHY drivers and
implemented a QT2025 PHY driver in Rust.
The MAC driver for Tehuti Networks TN40xx chips was already merged in
6.11-rc1. The MAC and this PHY drivers have been tested with Edimax
EN-9320SFP+ 10G network adapter.
[1] https://lore.kernel.org/rust-for-linux/
20240607052113.69026-1-fujita.tomonori@gmail.com/
v7:
- add Trevor as Reviewer to MAINTAINERS file entry
- add Trevor Reviewed-by
- add/fix comments
- replace uppercase hex with lowercase
- remove unnecessary code
- update the commit message (1st patch)
v6: https://lore.kernel.org/netdev/
20240820225719.91410-1-fujita.tomonori@gmail.com/
- improve comments
- make the logic to load firmware more readable
- add Copy trait to reg::{C22 and C45}
- add Trevor Reviewed-by
v5: https://lore.kernel.org/netdev/
20240819005345.84255-1-fujita.tomonori@gmail.com/
- fix the comments (3th patch)
- add RUST_FW_LOADER_ABSTRACTIONS dependency
- add Andrew and Benno Reviewed-by
v4: https://lore.kernel.org/netdev/
20240817051939.77735-1-fujita.tomonori@gmail.com/
- fix the comments
- add Andrew's Reviewed-by
- fix the order of tags
- remove wrong endianness conversion
v3: https://lore.kernel.org/netdev/
20240804233835.223460-1-fujita.tomonori@gmail.com/
- use addr_of_mut!` to avoid intermediate mutable reference
- update probe callback's Safety comment
- add MODULE_FIRMWARE equivalent
- add Alice's Reviewed-by
v2: https://lore.kernel.org/netdev/
20240731042136.201327-1-fujita.tomonori@gmail.com/
- add comments in accordance with the hw datasheet
- unify C22 and C45 APIs
- load firmware in probe callback instead of config_init
- use firmware API
- handle firmware endian
- check firmware size
- use SZ_*K constants
- avoid confusing phy_id variable
v1: https://lore.kernel.org/netdev/
20240415104701.4772-1-fujita.tomonori@gmail.com/
Signed-off-by: David S. Miller <davem@davemloft.net>
rom: FUJITA Tomonori <fujita.tomonori@gmail.com>
To: netdev@vger.kernel.org
Cc: rust-for-linux@vger.kernel.org, andrew@lunn.ch,
tmgross@umich.edu, miguel.ojeda.sandonis@gmail.com,
benno.lossin@proton.me, aliceryhl@google.com
Subject: [PATCH net-next v7 0/6] net: phy: add Applied Micro QT2025 PHY driver
Date: Sat, 24 Aug 2024 02:06:10 +0000 [thread overview]
Message-ID: <
20240824020617.113828-1-fujita.tomonori@gmail.com> (raw)
This patchset adds a PHY driver for Applied Micro Circuits Corporation
QT2025.
The first patch adds Rust equivalent to include/linux/sizes.h, makes
code more readable. The 2-5th patches update the PHYLIB Rust bindings.
The 4th and 5th patches have been reviewed previously in a different
thread [1].
QT2025 PHY support was implemented as a part of an Ethernet driver for
Tehuti Networks TN40xx chips. Multiple vendors (DLink, Asus, Edimax,
QNAP, etc) developed adapters based on TN40xx chips. Tehuti Networks
went out of business and the driver wasn't merged into mainline. But
it's still distributed with some of the hardware (and also available
on some vendor sites).
The original driver handles multiple PHY hardware (AMCC QT2025, TI
TLK10232, Aqrate AQR105, and Marvell MV88X3120, MV88X3310, and
MV88E2010). I divided the original driver into MAC and PHY drivers and
implemented a QT2025 PHY driver in Rust.
The MAC driver for Tehuti Networks TN40xx chips was already merged in
6.11-rc1. The MAC and this PHY drivers have been tested with Edimax
EN-9320SFP+ 10G network adapter.
[1] https://lore.kernel.org/rust-for-linux/
20240607052113.69026-1-fujita.tomonori@gmail.com/
v7:
- add Trevor as Reviewer to MAINTAINERS file entry
- add Trevor Reviewed-by
- add/fix comments
- replace uppercase hex with lowercase
- remove unnecessary code
- update the commit message (1st patch)
v6: https://lore.kernel.org/netdev/
20240820225719.91410-1-fujita.tomonori@gmail.com/
- improve comments
- make the logic to load firmware more readable
- add Copy trait to reg::{C22 and C45}
- add Trevor Reviewed-by
v5: https://lore.kernel.org/netdev/
20240819005345.84255-1-fujita.tomonori@gmail.com/
- fix the comments (3th patch)
- add RUST_FW_LOADER_ABSTRACTIONS dependency
- add Andrew and Benno Reviewed-by
v4: https://lore.kernel.org/netdev/
20240817051939.77735-1-fujita.tomonori@gmail.com/
- fix the comments
- add Andrew's Reviewed-by
- fix the order of tags
- remove wrong endianness conversion
v3: https://lore.kernel.org/netdev/
20240804233835.223460-1-fujita.tomonori@gmail.com/
- use addr_of_mut!` to avoid intermediate mutable reference
- update probe callback's Safety comment
- add MODULE_FIRMWARE equivalent
- add Alice's Reviewed-by
v2: https://lore.kernel.org/netdev/
20240731042136.201327-1-fujita.tomonori@gmail.com/
- add comments in accordance with the hw datasheet
- unify C22 and C45 APIs
- load firmware in probe callback instead of config_init
- use firmware API
- handle firmware endian
- check firmware size
- use SZ_*K constants
- avoid confusing phy_id variable
v1: https://lore.kernel.org/netdev/
20240415104701.4772-1-fujita.tomonori@gmail.com/
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
FUJITA Tomonori [Wed, 28 Aug 2024 07:35:16 +0000 (07:35 +0000)]
net: phy: add Applied Micro QT2025 PHY driver
This driver supports Applied Micro Circuits Corporation QT2025 PHY,
based on a driver for Tehuti Networks TN40xx chips.
The original driver for TN40xx chips supports multiple PHY hardware
(AMCC QT2025, TI TLK10232, Aqrate AQR105, and Marvell 88X3120,
88X3310, and MV88E2010). This driver is extracted from the original
driver and modified to a PHY driver in Rust.
This has been tested with Edimax EN-9320SFP+ 10G network adapter.
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Wed, 28 Aug 2024 07:35:15 +0000 (07:35 +0000)]
rust: net::phy unified genphy_read_status function for C22 and C45 registers
Add unified genphy_read_status function for C22 and C45
registers. Instead of having genphy_c22 and genphy_c45 methods, this
unifies genphy_read_status functions for C22 and C45.
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Wed, 28 Aug 2024 07:35:14 +0000 (07:35 +0000)]
rust: net::phy unified read/write API for C22 and C45 registers
Add the unified read/write API for C22 and C45 registers. The
abstractions support access to only C22 registers now. Instead of
adding read/write_c45 methods specifically for C45, a new reg module
supports the unified API to access C22 and C45 registers with trait,
by calling an appropriate phylib functions.
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Wed, 28 Aug 2024 07:35:13 +0000 (07:35 +0000)]
rust: net::phy implement AsRef<kernel::device::Device> trait
Implement AsRef<kernel::device::Device> trait for Device. A PHY driver
needs a reference to device::Device to call the firmware API.
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Wed, 28 Aug 2024 07:35:12 +0000 (07:35 +0000)]
rust: net::phy support probe callback
Support phy_driver probe callback, used to set up device-specific
structures.
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FUJITA Tomonori [Wed, 28 Aug 2024 07:35:11 +0000 (07:35 +0000)]
rust: sizes: add commonly used constants
Add rust equivalent to include/linux/sizes.h, makes code more
readable. Only SZ_*K that QT2025 PHY driver uses are added.
Make generated constants accessible with a proper type.
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 29 Aug 2024 22:33:27 +0000 (15:33 -0700)]
Merge branch 'bnxt_en-update-for-net-next'
Michael Chan says:
====================
bnxt_en: Update for net-next
This series starts with 2 patches to support firmware crash dump. The
driver allocates the required DMA memory ahead of time for firmware to
store the crash dump if and when it crashes. Patch 3 adds priority and
TPID for the .ndo_set_vf_vlan() callback. Note that this was rejected
and reverted last year and it is being re-submitted after recent changes
in the guidelines. The remaining patches are MSIX related. Legacy
interrupt is no longer supported by firmware so we remove the support
in the driver. We then convert to use the newer kernel APIs to
allocate and enable MSIX vectors. The last patch adds support for
dynamic MSIX.
v3: https://lore.kernel.org/
20240823195657.31588-1-michael.chan@broadcom.com
v2: https://lore.kernel.org/
20240816212832.185379-1-michael.chan@broadcom.com
v1: https://lore.kernel.org/
20240713234339.70293-1-michael.chan@broadcom.com
====================
Link: https://patch.msgid.link/20240828183235.128948-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 28 Aug 2024 18:32:35 +0000 (11:32 -0700)]
bnxt_en: Support dynamic MSIX
A range of MSIX vectors are allocated at initialization for the number
needed for RocE and L2. During run-time, if the user increases or
decreases the number of L2 rings, all the MSIX vectors have to be
freed and a new range has to be allocated. This is not optimal and
causes disruptions to RoCE traffic every time there is a change in L2
MSIX.
If the system supports dynamic MSIX allocations, use dynamic
allocation to add new L2 MSIX vectors or free unneeded L2 MSIX
vectors. RoCE traffic is not affected using this scheme.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20240828183235.128948-10-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 28 Aug 2024 18:32:34 +0000 (11:32 -0700)]
bnxt_en: Allocate the max bp->irq_tbl size for dynamic msix allocation
If dynamic MSIX allocation is supported, additional MSIX can be
allocated at run-time without reinitializing the existing MSIX entries.
The first step to support this dynamic scheme is to allocate a large
enough bp->irq_tbl if dynamic allocation is supported.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-9-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 28 Aug 2024 18:32:33 +0000 (11:32 -0700)]
bnxt_en: Replace deprecated PCI MSIX APIs
Use the new pci_alloc_irq_vectors() and pci_free_irq_vectors() to
replace the deprecated pci_enable_msix_range() and pci_disable_msix().
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-8-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 28 Aug 2024 18:32:32 +0000 (11:32 -0700)]
bnxt_en: Remove register mapping to support INTX
In legacy INTX mode, a register is mapped so that the INTX handler can
read it to determine if the NIC is the source of the interrupt. This
and all the related macros are no longer needed now that INTX is no
longer supported.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 28 Aug 2024 18:32:31 +0000 (11:32 -0700)]
bnxt_en: Remove BNXT_FLAG_USING_MSIX flag
Now that we only support MSIX, the BNXT_FLAG_USING_MSIX is always true.
Remove it and any if conditions checking for it. Remove the INTX
handler and associated logic.
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Chan [Wed, 28 Aug 2024 18:32:30 +0000 (11:32 -0700)]
bnxt_en: Deprecate support for legacy INTX mode
Firmware has deprecated support for legacy INTX in 2022 (since v2.27)
and INTX hasn't been tested for many years before that. INTX was
only used as a fallback mechansim in case MSIX wasn't available. MSIX
is always supported by all firmware. If MSIX capability in PCI config
space is not found during probe, abort.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sreekanth Reddy [Wed, 28 Aug 2024 18:32:29 +0000 (11:32 -0700)]
bnxt_en: Support QOS and TPID settings for the SRIOV VLAN
With recent changes in the .ndo_set_vf_*() guidelines, resubmitting
this patch that was reverted eariler in 2023:
c27153682eac ("Revert "bnxt_en: Support QOS and TPID settings for the SRIOV VLAN")
Add these missing settings in the .ndo_set_vf_vlan() method.
Older firmware does not support the TPID setting so check for
proper support.
Remove the unused BNXT_VF_QOS flag.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vikas Gupta [Wed, 28 Aug 2024 18:32:28 +0000 (11:32 -0700)]
bnxt_en: add support for retrieving crash dump using ethtool
Add support for retrieving crash dump using ethtool -w on the
supported interface.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vikas Gupta [Wed, 28 Aug 2024 18:32:27 +0000 (11:32 -0700)]
bnxt_en: add support for storing crash dump into host memory
Newer firmware supports automatic DMA of crash dump to host memory
when it crashes. If the feature is supported, allocate the required
memory using the existing context memory infrastructure. Communicate
the page table containing the DMA addresses to the firmware.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240828183235.128948-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 29 Aug 2024 20:00:38 +0000 (13:00 -0700)]
Merge branch 'adding-so_peek_off-for-tcpv6'
Jon Maloy says:
====================
Adding SO_PEEK_OFF for TCPv6
Adding SO_PEEK_OFF for TCPv6 and selftest for both TCPv4 and TCPv6
====================
Link: https://patch.msgid.link/20240828183752.660267-1-jmaloy@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jon Maloy [Wed, 28 Aug 2024 18:37:52 +0000 (14:37 -0400)]
selftests: add selftest for tcp SO_PEEK_OFF support
We add a selftest to check that the new feature added in
commit
05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option")
works correctly.
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Link: https://patch.msgid.link/20240828183752.660267-3-jmaloy@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jon Maloy [Wed, 28 Aug 2024 18:37:51 +0000 (14:37 -0400)]
tcp: add SO_PEEK_OFF socket option tor TCPv6
When doing further testing of the recently added SO_PEEK_OFF feature
for TCP I realized I had omitted to add it for TCP/IPv6.
I do that here.
Fixes:
05ea491641d3 ("tcp: add support for SO_PEEK_OFF socket option")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20240828183752.660267-2-jmaloy@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 28 Aug 2024 17:36:09 +0000 (10:36 -0700)]
tools: ynl: error check scanf() in a sample
Someone reported on GitHub that the YNL NIPA test is failing
when run locally. The test builds the tools, and it hits:
netdev.c:82:9: warning: ignoring return value of ‘scanf’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
82 | scanf("%d", &ifindex);
I can't repro this on my setups but error seems clear enough.
Link: https://github.com/linux-netdev/nipa/discussions/37
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://patch.msgid.link/20240828173609.2951335-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 29 Aug 2024 19:33:14 +0000 (12:33 -0700)]
Merge branch 'replace-deprecated-strcpy-with-strscpy'
Hongbo Li says:
====================
replace deprecated strcpy with strscpy
The deprecated helper strcpy() performs no bounds checking on the
destination buffer. This could result in linear overflows beyond
the end of the buffer, leading to all kinds of misbehaviors.
The safe replacement is strscpy() [1].
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
====================
Link: https://patch.msgid.link/20240828123224.3697672-1-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:32:24 +0000 (20:32 +0800)]
net/ipv4: net: prefer strscpy over strcpy
The deprecated helper strcpy() performs no bounds checking on the
destination buffer. This could result in linear overflows beyond
the end of the buffer, leading to all kinds of misbehaviors.
The safe replacement is strscpy() [1].
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240828123224.3697672-7-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:32:23 +0000 (20:32 +0800)]
net/tipc: replace deprecated strcpy with strscpy
The deprecated helper strcpy() performs no bounds checking on the
destination buffer. This could result in linear overflows beyond
the end of the buffer, leading to all kinds of misbehaviors.
The safe replacement is strscpy() [1].
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240828123224.3697672-6-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:32:21 +0000 (20:32 +0800)]
net/netrom: prefer strscpy over strcpy
The deprecated helper strcpy() performs no bounds checking on the
destination buffer. This could result in linear overflows beyond
the end of the buffer, leading to all kinds of misbehaviors.
The safe replacement is strscpy() [1].
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240828123224.3697672-4-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:32:20 +0000 (20:32 +0800)]
net/ipv6: replace deprecated strcpy with strscpy
The deprecated helper strcpy() performs no bounds checking on the
destination buffer. This could result in linear overflows beyond
the end of the buffer, leading to all kinds of misbehaviors.
The safe replacement is strscpy() [1].
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240828123224.3697672-3-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:32:19 +0000 (20:32 +0800)]
net: prefer strscpy over strcpy
The deprecated helper strcpy() performs no bounds checking on the
destination buffer. This could result in linear overflows beyond
the end of the buffer, leading to all kinds of misbehaviors.
The safe replacement is strscpy() [1].
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 8 Aug 2024 21:03:51 +0000 (14:03 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/faraday/ftgmac100.c
4186c8d9e6af ("net: ftgmac100: Ensure tx descriptor updates are visible")
e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI")
https://lore.kernel.org/
0b851ec5-f91d-4dd3-99da-
e81b98c9ed28@kernel.org
net/ipv4/tcp.c
bac76cf89816 ("tcp: fix forever orphan socket caused by tcp_abort")
edefba66d929 ("tcp: rstreason: introduce SK_RST_REASON_TCP_STATE for active reset")
https://lore.kernel.org/
20240828112207.
5c199d41@canb.auug.org.au
No adjacent changes.
Link: https://patch.msgid.link/20240829130829.39148-1-pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yang Ruibin [Wed, 28 Aug 2024 12:26:49 +0000 (20:26 +0800)]
net: alacritech: Switch to use dev_err_probe()
use dev_err_probe() instead of dev_err() to simplify the error path and
standardize the format of the error code.
Signed-off-by: Yang Ruibin <11162571@vivo.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240828122650.1324246-1-11162571@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:23:36 +0000 (20:23 +0800)]
net: hns: Use IS_ERR_OR_NULL() helper function
Use the IS_ERR_OR_NULL() helper instead of open-coding a
NULL and an error pointer checks to simplify the code and
improve readability.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240828122336.3697176-1-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:18:05 +0000 (20:18 +0800)]
net: dsa: realtek: make use of dev_err_cast_probe()
Using dev_err_cast_probe() to simplify the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Alvin Å ipraga <alsi@bang-olufsen.dk>
Link: https://patch.msgid.link/20240828121805.3696631-1-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hongbo Li [Wed, 28 Aug 2024 12:15:51 +0000 (20:15 +0800)]
net: ipa: make use of dev_err_cast_probe()
Using dev_err_cast_probe() to simplify the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240828121551.3696520-1-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 29 Aug 2024 18:39:39 +0000 (11:39 -0700)]
Merge branch 'net-vertexcom-mse102x-minor-clean-ups'
Stefan Wahren says:
====================
net: vertexcom: mse102x: Minor clean-ups
This series provides some minor clean-ups for the Vertexcom MSE102x
driver.
====================
Link: https://patch.msgid.link/20240827191000.3244-1-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefan Wahren [Tue, 27 Aug 2024 19:10:00 +0000 (21:10 +0200)]
net: vertexcom: mse102x: Use ETH_ZLEN
There is already a define for minimum Ethernet frame length without FCS.
So used this instead of the magic number.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827191000.3244-6-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefan Wahren [Tue, 27 Aug 2024 19:09:59 +0000 (21:09 +0200)]
net: vertexcom: mse102x: Drop log message on remove
This message is a leftover from initial development. It's
unnecessary now and can be dropped.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827191000.3244-5-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefan Wahren [Tue, 27 Aug 2024 19:09:58 +0000 (21:09 +0200)]
net: vertexcom: mse102x: Fix random MAC address log
At the time of MAC address assignment the netdev is not registered yet,
so netdev log functions won't work as expected. While we are at this
downgrade the log level to a warning, because a random MAC address is
not a real error.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827191000.3244-4-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefan Wahren [Tue, 27 Aug 2024 19:09:57 +0000 (21:09 +0200)]
net: vertexcom: mse102x: Silence TX timeout
As long as the MSE102x is not operational, every packet transmission
will run into a TX timeout and flood the kernel log. So log only the
first TX timeout and a user is at least informed about this issue.
The amount of timeouts are still available via netstat.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827191000.3244-3-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefan Wahren [Tue, 27 Aug 2024 19:09:56 +0000 (21:09 +0200)]
net: vertexcom: mse102x: Use DEFINE_SIMPLE_DEV_PM_OPS
This macro has the advantage over SET_SYSTEM_SLEEP_PM_OPS that we don't
have to care about when the functions are actually used.
Also make use of pm_sleep_ptr() to discard all PM_SLEEP related
stuff if CONFIG_PM_SLEEP isn't enabled.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827191000.3244-2-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 29 Aug 2024 18:14:39 +0000 (06:14 +1200)]
Merge tag 'net-6.11-rc6' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth, wireless and netfilter.
No known outstanding regressions.
Current release - regressions:
- wifi: iwlwifi: fix hibernation
- eth: ionic: prevent tx_timeout due to frequent doorbell ringing
Previous releases - regressions:
- sched: fix sch_fq incorrect behavior for small weights
- wifi:
- iwlwifi: take the mutex before running link selection
- wfx: repair open network AP mode
- netfilter: restore IP sanity checks for netdev/egress
- tcp: fix forever orphan socket caused by tcp_abort
- mptcp: close subflow when receiving TCP+FIN
- bluetooth: fix random crash seen while removing btnxpuart driver
Previous releases - always broken:
- mptcp: more fixes for the in-kernel PM
- eth: bonding: change ipsec_lock from spin lock to mutex
- eth: mana: fix race of mana_hwc_post_rx_wqe and new hwc response
Misc:
- documentation: drop special comment style for net code"
* tag 'net-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
nfc: pn533: Add poll mod list filling check
mailmap: update entry for Sriram Yagnaraman
selftests: mptcp: join: check re-re-adding ID 0 signal
mptcp: pm: ADD_ADDR 0 is not a new address
selftests: mptcp: join: validate event numbers
mptcp: avoid duplicated SUB_CLOSED events
selftests: mptcp: join: check re-re-adding ID 0 endp
mptcp: pm: fix ID 0 endp usage after multiple re-creations
mptcp: pm: do not remove already closed subflows
selftests: mptcp: join: no extra msg if no counter
selftests: mptcp: join: check re-adding init endp with != id
mptcp: pm: reset MPC endp ID when re-added
mptcp: pm: skip connecting to already established sf
mptcp: pm: send ACK on an active subflow
selftests: mptcp: join: check removing ID 0 endpoint
mptcp: pm: fix RM_ADDR ID for the initial subflow
mptcp: pm: reuse ID 0 after delete and re-add
net: busy-poll: use ktime_get_ns() instead of local_clock()
sctp: fix association labeling in the duplicate COOKIE-ECHO case
mptcp: pr_debug: add missing \n at the end
...
Aleksandr Mishin [Tue, 27 Aug 2024 08:48:22 +0000 (11:48 +0300)]
nfc: pn533: Add poll mod list filling check
In case of im_protocols value is 1 and tm_protocols value is 0 this
combination successfully passes the check
'if (!im_protocols && !tm_protocols)' in the nfc_start_poll().
But then after pn533_poll_create_mod_list() call in pn533_start_poll()
poll mod list will remain empty and dev->poll_mod_count will remain 0
which lead to division by zero.
Normally no im protocol has value 1 in the mask, so this combination is
not expected by driver. But these protocol values actually come from
userspace via Netlink interface (NFC_CMD_START_POLL operation). So a
broken or malicious program may pass a message containing a "bad"
combination of protocol parameter values so that dev->poll_mod_count
is not incremented inside pn533_poll_create_mod_list(), thus leading
to division by zero.
Call trace looks like:
nfc_genl_start_poll()
nfc_start_poll()
->start_poll()
pn533_start_poll()
Add poll mod list filling check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes:
dfccd0f58044 ("NFC: pn533: Add some polling entropy")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240827084822.18785-1-amishin@t-argos.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 29 Aug 2024 09:35:54 +0000 (11:35 +0200)]
Merge tag 'nf-24-08-28' of git://git./linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
Patch #1 sets on NFT_PKTINFO_L4PROTO for UDP packets less than 4 bytes
payload from netdev/egress by subtracting skb_network_offset() when
validating IPv4 packet length, otherwise 'meta l4proto udp' never
matches.
Patch #2 subtracts skb_network_offset() when validating IPv6 packet
length for netdev/egress.
netfilter pull request 24-08-28
* tag 'nf-24-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables_ipv6: consider network offset in netdev/egress validation
netfilter: nf_tables: restore IP sanity checks for netdev/egress
====================
Link: https://patch.msgid.link/20240828214708.619261-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Sriram Yagnaraman [Wed, 28 Aug 2024 07:24:17 +0000 (09:24 +0200)]
mailmap: update entry for Sriram Yagnaraman
Link my old est.tech address to my active mail address
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://patch.msgid.link/20240828072417.4111996-1-sriram.yagnaraman@ericsson.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 29 Aug 2024 08:39:52 +0000 (10:39 +0200)]
Merge branch 'mptcp-more-fixes-for-the-in-kernel-pm'
Matthieu Baerts says:
====================
mptcp: more fixes for the in-kernel PM
Here is a new batch of fixes for the MPTCP in-kernel path-manager:
Patch 1 ensures the address ID is set to 0 when the path-manager sends
an ADD_ADDR for the address of the initial subflow. The same fix is
applied when a new subflow is created re-using this special address. A
fix for v6.0.
Patch 2 is similar, but for the case where an endpoint is removed: if
this endpoint was used for the initial address, it is important to send
a RM_ADDR with this ID set to 0, and look for existing subflows with the
ID set to 0. A fix for v6.0 as well.
Patch 3 validates the two previous patches.
Patch 4 makes the PM selecting an "active" path to send an address
notification in an ACK, instead of taking the first path in the list. A
fix for v5.11.
Patch 5 fixes skipping the establishment of a new subflow if a previous
subflow using the same pair of addresses is being closed. A fix for
v5.13.
Patch 6 resets the ID linked to the initial subflow when the linked
endpoint is re-added, possibly with a different ID. A fix for v6.0.
Patch 7 validates the three previous patches.
Patch 8 is a small fix for the MPTCP Join selftest, when being used with
older subflows not supporting all MIB counters. A fix for a commit
introduced in v6.4, but backported up to v5.10.
Patch 9 avoids the PM to try to close the initial subflow multiple
times, and increment counters while nothing happened. A fix for v5.10.
Patch 10 stops incrementing local_addr_used and add_addr_accepted
counters when dealing with the address ID 0, because these counters are
not taking into account the initial subflow, and are then not
decremented when the linked addresses are removed. A fix for v6.0.
Patch 11 validates the previous patch.
Patch 12 avoids the PM to send multiple SUB_CLOSED events for the
initial subflow. A fix for v5.12.
Patch 13 validates the previous patch.
Patch 14 stops treating the ADD_ADDR 0 as a new address, and accepts it
in order to re-create the initial subflow if it has been closed, even if
the limit for *new* addresses -- not taking into account the address of
the initial subflow -- has been reached. A fix for v5.10.
Patch 15 validates the previous patch.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Matthieu Baerts (NGI0) (15):
mptcp: pm: reuse ID 0 after delete and re-add
mptcp: pm: fix RM_ADDR ID for the initial subflow
selftests: mptcp: join: check removing ID 0 endpoint
mptcp: pm: send ACK on an active subflow
mptcp: pm: skip connecting to already established sf
mptcp: pm: reset MPC endp ID when re-added
selftests: mptcp: join: check re-adding init endp with != id
selftests: mptcp: join: no extra msg if no counter
mptcp: pm: do not remove already closed subflows
mptcp: pm: fix ID 0 endp usage after multiple re-creations
selftests: mptcp: join: check re-re-adding ID 0 endp
mptcp: avoid duplicated SUB_CLOSED events
selftests: mptcp: join: validate event numbers
mptcp: pm: ADD_ADDR 0 is not a new address
selftests: mptcp: join: check re-re-adding ID 0 signal
net/mptcp/pm.c | 4 +-
net/mptcp/pm_netlink.c | 87 ++++++++++----
net/mptcp/protocol.c | 6 +
net/mptcp/protocol.h | 5 +-
tools/testing/selftests/net/mptcp/mptcp_join.sh | 153 ++++++++++++++++++++----
tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 +
6 files changed, 209 insertions(+), 50 deletions(-)
---
base-commit:
3a0504d54b3b57f0d7bf3d9184a00c9f8887f6d7
change-id:
20240826-net-mptcp-more-pm-fix-
ffa61a36f817
Best regards,
====================
Link: https://patch.msgid.link/20240828-net-mptcp-more-pm-fix-v2-0-7f11b283fff7@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:38 +0000 (08:14 +0200)]
selftests: mptcp: join: check re-re-adding ID 0 signal
This test extends "delete re-add signal" to validate the previous
commit: when the 'signal' endpoint linked to the initial subflow (ID 0)
is re-added multiple times, it will re-send the ADD_ADDR with id 0. The
client should still be able to re-create this subflow, even if the
add_addr_accepted limit has been reached as this special address is not
considered as a new address.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes:
d0876b2284cf ("mptcp: add the incoming RM_ADDR support")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:37 +0000 (08:14 +0200)]
mptcp: pm: ADD_ADDR 0 is not a new address
The ADD_ADDR 0 with the address from the initial subflow should not be
considered as a new address: this is not something new. If the host
receives it, it simply means that the address is available again.
When receiving an ADD_ADDR for the ID 0, the PM already doesn't consider
it as new by not incrementing the 'add_addr_accepted' counter. But the
'accept_addr' might not be set if the limit has already been reached:
this can be bypassed in this case. But before, it is important to check
that this ADD_ADDR for the ID 0 is for the same address as the initial
subflow. If not, it is not something that should happen, and the
ADD_ADDR can be ignored.
Note that if an ADD_ADDR is received while there is already a subflow
opened using the same address, this ADD_ADDR is ignored as well. It
means that if multiple ADD_ADDR for ID 0 are received, there will not be
any duplicated subflows created by the client.
Fixes:
d0876b2284cf ("mptcp: add the incoming RM_ADDR support")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:36 +0000 (08:14 +0200)]
selftests: mptcp: join: validate event numbers
This test extends "delete and re-add" and "delete re-add signal" to
validate the previous commit: the number of MPTCP events are checked to
make sure there are no duplicated or unexpected ones.
A new helper has been introduced to easily check these events. The
missing events have been added to the lib.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes:
b911c97c7dc7 ("mptcp: add netlink event support")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:35 +0000 (08:14 +0200)]
mptcp: avoid duplicated SUB_CLOSED events
The initial subflow might have already been closed, but still in the
connection list. When the worker is instructed to close the subflows
that have been marked as closed, it might then try to close the initial
subflow again.
A consequence of that is that the SUB_CLOSED event can be seen twice:
# ip mptcp endpoint
1.1.1.1 id 1 subflow dev eth0
2.2.2.2 id 2 subflow dev eth1
# ip mptcp monitor &
[ CREATED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9
[ ESTABLISHED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9
[ SF_ESTABLISHED] remid=0 locid=2 saddr4=2.2.2.2 daddr4=9.9.9.9
# ip mptcp endpoint delete id 1
[ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9
[ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9
The first one is coming from mptcp_pm_nl_rm_subflow_received(), and the
second one from __mptcp_close_subflow().
To avoid doing the post-closed processing twice, the subflow is now
marked as closed the first time.
Note that it is not enough to check if we are dealing with the first
subflow and check its sk_state: the subflow might have been reset or
closed before calling mptcp_close_ssk().
Fixes:
b911c97c7dc7 ("mptcp: add netlink event support")
Cc: stable@vger.kernel.org
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:34 +0000 (08:14 +0200)]
selftests: mptcp: join: check re-re-adding ID 0 endp
This test extends "delete and re-add" to validate the previous commit:
when the endpoint linked to the initial subflow (ID 0) is re-added
multiple times, it was no longer being used, because the internal linked
counters are not decremented for this special endpoint: it is not an
additional endpoint.
Here, the "del/add id 0" steps are done 3 times to unsure this case is
validated.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes:
3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:33 +0000 (08:14 +0200)]
mptcp: pm: fix ID 0 endp usage after multiple re-creations
'local_addr_used' and 'add_addr_accepted' are decremented for addresses
not related to the initial subflow (ID0), because the source and
destination addresses of the initial subflows are known from the
beginning: they don't count as "additional local address being used" or
"ADD_ADDR being accepted".
It is then required not to increment them when the entrypoint used by
the initial subflow is removed and re-added during a connection. Without
this modification, this entrypoint cannot be removed and re-added more
than once.
Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/512
Fixes:
3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Reported-by: syzbot+455d38ecd5f655fc45cf@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/
00000000000049861306209237f4@google.com
Cc: stable@vger.kernel.org
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:32 +0000 (08:14 +0200)]
mptcp: pm: do not remove already closed subflows
It is possible to have in the list already closed subflows, e.g. the
initial subflow has been already closed, but still in the list. No need
to try to close it again, and increments the related counters again.
Fixes:
0ee4261a3681 ("mptcp: implement mptcp_pm_remove_subflow")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:31 +0000 (08:14 +0200)]
selftests: mptcp: join: no extra msg if no counter
The checksum and fail counters might not be available. Then no need to
display an extra message with missing info.
While at it, fix the indentation around, which is wrong since the same
commit.
Fixes:
47867f0a7e83 ("selftests: mptcp: join: skip check if MIB counter not supported")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:30 +0000 (08:14 +0200)]
selftests: mptcp: join: check re-adding init endp with != id
The initial subflow has a special local ID: 0. It is specific per
connection.
When a global endpoint is deleted and re-added later, it can have a
different ID, but the kernel should still use the ID 0 if it corresponds
to the initial address.
This test validates this behaviour: the endpoint linked to the initial
subflow is removed, and re-added with a different ID.
Note that removing the initial subflow will not decrement the 'subflows'
counters, which corresponds to the *additional* subflows. On the other
hand, when the same endpoint is re-added, it will increment this
counter, as it will be seen as an additional subflow this time.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes:
3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:29 +0000 (08:14 +0200)]
mptcp: pm: reset MPC endp ID when re-added
The initial subflow has a special local ID: 0. It is specific per
connection.
When a global endpoint is deleted and re-added later, it can have a
different ID -- most services managing the endpoints automatically don't
force the ID to be the same as before. It is then important to track
these modifications to be consistent with the ID being used for the
address used by the initial subflow, not to confuse the other peer or to
send the ID 0 for the wrong address.
Now when removing an endpoint, msk->mpc_endpoint_id is reset if it
corresponds to this endpoint. When adding a new endpoint, the same
variable is updated if the address match the one of the initial subflow.
Fixes:
3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:28 +0000 (08:14 +0200)]
mptcp: pm: skip connecting to already established sf
The lookup_subflow_by_daddr() helper checks if there is already a
subflow connected to this address. But there could be a subflow that is
closing, but taking time due to some reasons: latency, losses, data to
process, etc.
If an ADD_ADDR is received while the endpoint is being closed, it is
better to try connecting to it, instead of rejecting it: the peer which
has sent the ADD_ADDR will not be notified that the ADD_ADDR has been
rejected for this reason, and the expected subflow will not be created
at the end.
This helper should then only look for subflows that are established, or
going to be, but not the ones being closed.
Fixes:
d84ad04941c3 ("mptcp: skip connecting the connected address")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:27 +0000 (08:14 +0200)]
mptcp: pm: send ACK on an active subflow
Taking the first one on the list doesn't work in some cases, e.g. if the
initial subflow is being removed. Pick another one instead of not
sending anything.
Fixes:
84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:26 +0000 (08:14 +0200)]
selftests: mptcp: join: check removing ID 0 endpoint
Removing the endpoint linked to the initial subflow should trigger a
RM_ADDR for the right ID, and the removal of the subflow. That's what is
now being verified in the "delete and re-add" test.
Note that removing the initial subflow will not decrement the 'subflows'
counters, which corresponds to the *additional* subflows. On the other
hand, when the same endpoint is re-added, it will increment this
counter, as it will be seen as an additional subflow this time.
The 'Fixes' tag here below is the same as the one from the previous
commit: this patch here is not fixing anything wrong in the selftests,
but it validates the previous fix for an issue introduced by this commit
ID.
Fixes:
3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:25 +0000 (08:14 +0200)]
mptcp: pm: fix RM_ADDR ID for the initial subflow
The initial subflow has a special local ID: 0. When an endpoint is being
deleted, it is then important to check if its address is not linked to
the initial subflow to send the right ID.
If there was an endpoint linked to the initial subflow, msk's
mpc_endpoint_id field will be set. We can then use this info when an
endpoint is being removed to see if it is linked to the initial subflow.
So now, the correct IDs are passed to mptcp_pm_nl_rm_addr_or_subflow(),
it is no longer needed to use mptcp_local_id_match().
Fixes:
3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts (NGI0) [Wed, 28 Aug 2024 06:14:24 +0000 (08:14 +0200)]
mptcp: pm: reuse ID 0 after delete and re-add
When the endpoint used by the initial subflow is removed and re-added
later, the PM has to force the ID 0, it is a special case imposed by the
MPTCP specs.
Note that the endpoint should then need to be re-added reusing the same
ID.
Fixes:
3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Florian Westphal [Tue, 27 Aug 2024 09:00:12 +0000 (11:00 +0200)]
selftests: netfilter: nft_queue.sh: reduce test file size for debug build
The sctp selftest is very slow on debug kernels.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/
20240826192500.
32efa22c@kernel.org/
Fixes:
4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage")
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240827090023.8917-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Linus Torvalds [Thu, 29 Aug 2024 01:59:18 +0000 (13:59 +1200)]
Merge tag 'random-6.11-rc6-for-linus' of git://git./linux/kernel/git/crng/random
Pull random number generator fix from Jason Donenfeld:
"Reject invalid flags passed to vgetrandom() in the same way that
getrandom() does, so that the behavior is the same, from Yann.
The flags argument to getrandom() only has a behavioral effect on the
function if the RNG isn't initialized yet, so vgetrandom() falls back
to the syscall in that case. But if the RNG is initialized, all of the
flags behave the same way, so vgetrandom() didn't bother checking
them, and just ignored them entirely.
But that doesn't account for invalid flags passed in, which need to be
rejected so we can use them later"
* tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: vDSO: reject unknown getrandom() flags
Jakub Kicinski [Thu, 29 Aug 2024 01:22:02 +0000 (18:22 -0700)]
Merge branch 'net-hisilicon-minor-fixes'
Krzysztof Kozlowski says:
====================
net: hisilicon: minor fixes
Minor fixes for hisilicon ethernet driver which look too trivial to be
considered for current RC.
====================
Link: https://patch.msgid.link/20240827144421.52852-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Krzysztof Kozlowski [Tue, 27 Aug 2024 14:44:21 +0000 (16:44 +0200)]
net: hisilicon: hns_mdio: fix OF node leak in probe()
Driver is leaking OF node reference from
of_parse_phandle_with_fixed_args() in probe().
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827144421.52852-4-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Krzysztof Kozlowski [Tue, 27 Aug 2024 14:44:20 +0000 (16:44 +0200)]
net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()
Driver is leaking OF node reference from
of_parse_phandle_with_fixed_args() in hns_mac_get_info().
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827144421.52852-3-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Krzysztof Kozlowski [Tue, 27 Aug 2024 14:44:19 +0000 (16:44 +0200)]
net: hisilicon: hip04: fix OF node leak in probe()
Driver is leaking OF node reference from
of_parse_phandle_with_fixed_args() in probe().
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240827144421.52852-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andy Shevchenko [Tue, 27 Aug 2024 17:10:05 +0000 (20:10 +0300)]
net: dsa: mv88e6xxx: Remove stale comment
GPIOF_DIR_* definitions are legacy and subject to remove.
Taking this into account, remove stale comment.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240827171005.2301845-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 27 Aug 2024 11:49:16 +0000 (11:49 +0000)]
net: busy-poll: use ktime_get_ns() instead of local_clock()
Typically, busy-polling durations are below 100 usec.
When/if the busy-poller thread migrates to another cpu,
local_clock() can be off by +/-2msec or more for small
values of HZ, depending on the platform.
Use ktimer_get_ns() to ensure deterministic behavior,
which is the whole point of busy-polling.
Fixes:
060212928670 ("net: add low latency socket poll")
Fixes:
9a3c71aa8024 ("net: convert low latency sockets to sched_clock()")
Fixes:
37089834528b ("sched, net: Fixup busy_loop_us_clock()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20240827114916.223377-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maxime Chevallier [Tue, 27 Aug 2024 09:23:13 +0000 (11:23 +0200)]
net: ethtool: cable-test: Release RTNL when the PHY isn't found
Use the correct logic to check for the presence of a PHY device, and
jump to a label that correctly releases RTNL in case of an error, as we
are holding RTNL at that point.
Fixes:
3688ff3077d3 ("net: ethtool: cable-test: Target the command to the requested PHY")
Closes: https://lore.kernel.org/netdev/
20240827104825.
5cbe0602@fedora-3.home/T/#m6bc49cdcc5cfab0d162516b92916b944a01c833f
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20240827092314.2500284-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Erni Sri Satya Vennela [Tue, 27 Aug 2024 05:16:31 +0000 (22:16 -0700)]
net: netvsc: Update default VMBus channels
Change VMBus channels macro (VRSS_CHANNEL_DEFAULT) in
Linux netvsc from 8 to 16 to align with Azure Windows VM
and improve networking throughput.
For VMs having less than 16 vCPUS, the channels depend
on number of vCPUs. For greater than 16 vCPUs,
set the channels to maximum of VRSS_CHANNEL_DEFAULT and
number of physical cores / 2 which is returned by
netif_get_num_default_rss_queues() as a way to optimize CPU
resource utilization and scale for high-end processors with
many cores.
Maximum number of channels are by default set to 64.
Based on this change the channel creation would change as follows:
-----------------------------------------------------------------
| No. of vCPU | dev_info->num_chn | channels created |
-----------------------------------------------------------------
| 1-16 | 16 | vCPU |
| >16 | max(16,#cores/2) | min(64 , max(16,#cores/2)) |
-----------------------------------------------------------------
Performance tests showed significant improvement in throughput:
- 0.54% for 16 vCPUs
- 0.83% for 32 vCPUs
- 0.86% for 48 vCPUs
- 9.72% for 64 vCPUs
- 13.57% for 96 vCPUs
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/1724735791-22815-1-git-send-email-ernis@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jacky Chou [Tue, 27 Aug 2024 03:05:13 +0000 (11:05 +0800)]
net: ftgmac100: Get link speed and duplex for NC-SI
The ethtool of this driver uses the phy API of ethtool
to get the link information from PHY driver.
Because the NC-SI is forced on 100Mbps and full duplex,
the driver connect a fixed-link phy driver for NC-SI.
The ethtool will get the link information from the
fixed-link phy driver.
Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240827030513.481469-1-jacky_chou@aspeedtech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 29 Aug 2024 00:08:20 +0000 (17:08 -0700)]
Merge branch 'tcp-take-better-care-of-tw_substate-and-tw_rcv_nxt'
Eric Dumazet says:
====================
tcp: take better care of tw_substate and tw_rcv_nxt
While reviewing Jason Xing recent commit (
0d9e5df4a257 "tcp: avoid reusing
FIN_WAIT2 when trying to find port in connect() process") I saw
we could remove the volatile qualifier for tw_substate field,
and I also added missing data-race annotations around tcptw->tw_rcv_nxt.
====================
Link: https://patch.msgid.link/20240827015250.3509197-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 27 Aug 2024 01:52:50 +0000 (01:52 +0000)]
tcp: annotate data-races around tcptw->tw_rcv_nxt
No lock protects tcp tw fields.
tcptw->tw_rcv_nxt can be changed from twsk_rcv_nxt_update()
while other threads might read this field.
Add READ_ONCE()/WRITE_ONCE() annotations, and make sure
tcp_timewait_state_process() reads tcptw->tw_rcv_nxt only once.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20240827015250.3509197-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Tue, 27 Aug 2024 01:52:49 +0000 (01:52 +0000)]
tcp: remove volatile qualifier on tw_substate
Using a volatile qualifier for a specific struct field is unusual.
Use instead READ_ONCE()/WRITE_ONCE() where necessary.
tcp_timewait_state_process() can change tw_substate while other
threads are reading this field.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20240827015250.3509197-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jeongjun Park [Thu, 22 Aug 2024 18:11:09 +0000 (03:11 +0900)]
net/xen-netback: prevent UAF in xenvif_flush_hash()
During the list_for_each_entry_rcu iteration call of xenvif_flush_hash,
kfree_rcu does not exist inside the rcu read critical section, so if
kfree_rcu is called when the rcu grace period ends during the iteration,
UAF occurs when accessing head->next after the entry becomes free.
Therefore, to solve this, you need to change it to list_for_each_entry_safe.
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Link: https://patch.msgid.link/20240822181109.2577354-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 28 Aug 2024 23:54:44 +0000 (16:54 -0700)]
Merge tag 'wireless-2024-08-28' of git://git./linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Regressions:
* wfx: fix for open network connection
* iwlwifi: fix for hibernate (due to fast resume feature)
* iwlwifi: fix for a few warnings that were recently added
(had previously been messages not warnings)
Previously broken:
* mwifiex: fix static structures used for per-device data
* iwlwifi: some harmless FW related messages were tagged
too high priority
* iwlwifi: scan buffers weren't checked correctly
* mac80211: SKB leak on beacon error path
* iwlwifi: fix ACPI table interop with certain BIOSes
* iwlwifi: fix locking for link selection
* mac80211: fix SSID comparison in beacon validation
* tag 'wireless-2024-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: iwlwifi: clear trans->state earlier upon error
wifi: wfx: repair open network AP mode
wifi: mac80211: free skb on error path in ieee80211_beacon_get_ap()
wifi: iwlwifi: mvm: don't wait for tx queues if firmware is dead
wifi: iwlwifi: mvm: allow 6 GHz channels in MLO scan
wifi: iwlwifi: mvm: pause TCM when the firmware is stopped
wifi: iwlwifi: fw: fix wgds rev 3 exact size
wifi: iwlwifi: mvm: take the mutex before running link selection
wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room()
wifi: iwlwifi: mvm: fix iwl_mvm_scan_fits() calculation
wifi: iwlwifi: lower message level for FW buffer destination
wifi: iwlwifi: mvm: fix hibernation
wifi: mac80211: fix beacon SSID mismatch handling
wifi: mwifiex: duplicate static structs used in driver instances
====================
Link: https://patch.msgid.link/20240828100151.23662-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 28 Aug 2024 19:15:41 +0000 (07:15 +1200)]
Merge tag 'loongarch-fixes-6.11-2' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Remove the unused dma-direct.h, and some bug & warning fixes"
* tag 'loongarch-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Invalidate guest steal time address on vCPU reset
LoongArch: Add ifdefs to fix LSX and LASX related warnings
LoongArch: Define ARCH_IRQ_INIT_FLAGS as IRQ_NOPROBE
LoongArch: Remove the unused dma-direct.h
Linus Torvalds [Wed, 28 Aug 2024 19:12:02 +0000 (07:12 +1200)]
Merge tag 'platform-drivers-x86-v6.11-5' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:
- platform/x86/amd/pmc: AMD 1Ah model 60h series support (2nd attempt)
- asus-wmi: Prevent spurious rfkill on Asus Zenbook Duo
- x86-android-tablets: Relax DMI match to cover another model
* tag 'platform-drivers-x86-v6.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: x86-android-tablets: Make Lenovo Yoga Tab 3 X90F DMI match less strict
platform/x86: asus-wmi: Fix spurious rfkill on UX8406MA
platform/x86/amd/pmc: Extend support for PMC features on new AMD platform
platform/x86/amd/pmc: Fix SMU command submission path on new AMD platform
Linus Torvalds [Wed, 28 Aug 2024 18:20:44 +0000 (06:20 +1200)]
Merge tag 'nfsd-6.11-2' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix a number of crashers
- Update email address for an NFSD reviewer
* tag 'nfsd-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
fs/nfsd: fix update of inode attrs in CB_GETATTR
nfsd: fix potential UAF in nfsd4_cb_getattr_release
nfsd: hold reference to delegation when updating it for cb_getattr
MAINTAINERS: Update Olga Kornievskaia's email address
nfsd: prevent panic for nfsv4.0 closed files in nfs4_show_open
nfsd: ensure that nfsd4_fattr_args.context is zeroed out
Linus Torvalds [Wed, 28 Aug 2024 18:17:46 +0000 (06:17 +1200)]
Merge tag 'for-6.11-rc5-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix use-after-free when submitting bios for read, after an error and
partially submitted bio the original one is freed while it can be
still be accessed again
- fix fstests case btrfs/301, with enabled quotas wait for delayed
iputs when flushing delalloc
- fix periodic block group reclaim, an unitialized value can be
returned if there are no block groups to reclaim
- fix build warning (-Wmaybe-uninitialized)
* tag 'for-6.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix uninitialized return value from btrfs_reclaim_sweep()
btrfs: fix a use-after-free when hitting errors inside btrfs_submit_chunk()
btrfs: initialize last_extent_end to fix -Wmaybe-uninitialized warning in extent_fiemap()
btrfs: run delayed iputs when flushing delalloc
Linus Torvalds [Wed, 28 Aug 2024 03:05:02 +0000 (15:05 +1200)]
Merge tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- two RDMA/smbdirect fixes and a minor cleanup
- punch hole fix
* tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix FALLOC_FL_PUNCH_HOLE support
smb/client: fix rdma usage in smb2_async_writev()
smb/client: remove unused rq_iter_size from struct smb_rqst
smb/client: avoid dereferencing rdata=NULL in smb2_new_read_req()
Linus Torvalds [Wed, 28 Aug 2024 02:55:48 +0000 (14:55 +1200)]
Merge tag 'tpmdd-next-6.11-rc6' of git://git./linux/kernel/git/jarkko/linux-tpmdd
Pull TPM fix from Jarkko Sakkinen:
"A bug fix for tpm_ibmvtpm driver so that it will take the bus
encryption into use"
* tag 'tpmdd-next-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: ibmvtpm: Call tpm2_sessions_init() to initialize session support
Jakub Kicinski [Tue, 27 Aug 2024 23:35:31 +0000 (16:35 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-08-26 (ice)
This series contains updates to ice driver only.
Jake implements and uses rd32_poll_timeout to replace a jiffies loop for
calling ice_sq_done. The rd32_poll_timeout() function is designed to allow
simplifying other places in the driver where we need to read a register
until it matches a known value.
Jake, Bruce, and Przemek update ice_debug_cq() to be more robust, and more
useful for tracing control queue messages sent and received by the device
driver.
Jake rewords several commands in the ice_control.c file which previously
referred to the "Admin queue" when they were actually generic functions
usable on any control queue.
Jake removes the unused and unnecessary cmd_buf array allocation for send
queues. This logic originally was going to be useful if we ever implemented
asynchronous completion of transmit messages. This support is unlikely to
materialize, so the overhead of allocating a command buffer is unnecessary.
Sergey improves the log messages when the ice driver reports that the NVM
version on the device is not supported by the driver. Now, these messages
include both the discovered NVM version and the requested/expected NVM
version.
Aleksandr Mishin corrects overallocation of memory related to adding
scheduler nodes.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node()
ice: Report NVM version numbers on mismatch during load
ice: remove unnecessary control queue cmd_buf arrays
ice: reword comments referring to control queues
ice: stop intermixing AQ commands/responses debug dumps
ice: do not clutter debug logs with unused data
ice: improve debug print for control queue messages
ice: implement and use rd32_poll_timeout for ice_sq_done timeout
====================
Link: https://patch.msgid.link/20240826224655.133847-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 27 Aug 2024 23:14:20 +0000 (16:14 -0700)]
Merge branch 'net-dsa-microchip-add-ksz8895-ksz8864-switch-support'
Tristram Ha says:
====================
net: dsa: microchip: Add KSZ8895/KSZ8864 switch support
This series of patches is to add KSZ8895/KSZ8864 switch support to the
KSZ DSA driver.
====================
Link: https://patch.msgid.link/BYAPR11MB3558B8A089C88DFFFC09B067EC8B2@BYAPR11MB3558.namprd11.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tristram Ha [Mon, 26 Aug 2024 21:43:08 +0000 (21:43 +0000)]
net: dsa: microchip: Add KSZ8895/KSZ8864 switch support
KSZ8895/KSZ8864 is a switch family between KSZ8863/73 and KSZ8795, so it
shares some registers and functions in those switches already
implemented in the KSZ DSA driver.
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Tested-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tristram Ha [Mon, 26 Aug 2024 21:43:05 +0000 (21:43 +0000)]
dt-bindings: net: dsa: microchip: Add KSZ8895/KSZ8864 switch support
KSZ8895/KSZ8864 is a switch family developed before KSZ8795 and after
KSZ8863, so it shares some registers and functions in those switches.
KSZ8895 has 5 ports and so is more similar to KSZ8795.
KSZ8864 is a 4-port version of KSZ8895. The first port is removed
while port 5 remains as a host port.
Signed-off-by: Tristram Ha <tristram.ha@microchip.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/BYAPR11MB3558FD0717772263FAD86846EC8B2@BYAPR11MB3558.namprd11.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A K M Fazla Mehrab [Mon, 26 Aug 2024 18:26:52 +0000 (18:26 +0000)]
net/handshake: use sockfd_put() helper
Replace fput() with sockfd_put() in handshake_nl_done_doit().
Signed-off-by: A K M Fazla Mehrab <a.mehrab@bytedance.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://patch.msgid.link/20240826182652.2449359-1-a.mehrab@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shradha Gupta [Mon, 26 Aug 2024 16:07:41 +0000 (09:07 -0700)]
net: mana: Implement get_ringparam/set_ringparam for mana
Currently the values of WQs for RX and TX queues for MANA devices
are hardcoded to default sizes.
Allow configuring these values for MANA devices as ringparam
configuration(get/set) through ethtool_ops.
Pre-allocate buffers at the beginning of this operation, to
prevent complete network loss in low-memory conditions.
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Link: https://patch.msgid.link/1724688461-12203-1-git-send-email-shradhagupta@linux.microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ondrej Mosnacek [Mon, 26 Aug 2024 13:07:11 +0000 (15:07 +0200)]
sctp: fix association labeling in the duplicate COOKIE-ECHO case
sctp_sf_do_5_2_4_dupcook() currently calls security_sctp_assoc_request()
on new_asoc, but as it turns out, this association is always discarded
and the LSM labels never get into the final association (asoc).
This can be reproduced by having two SCTP endpoints try to initiate an
association with each other at approximately the same time and then peel
off the association into a new socket, which exposes the unitialized
labels and triggers SELinux denials.
Fix it by calling security_sctp_assoc_request() on asoc instead of
new_asoc. Xin Long also suggested limit calling the hook only to cases
A, B, and D, since in cases C and E the COOKIE ECHO chunk is discarded
and the association doesn't enter the ESTABLISHED state, so rectify that
as well.
One related caveat with SELinux and peer labeling: When an SCTP
connection is set up simultaneously in this way, we will end up with an
association that is initialized with security_sctp_assoc_request() on
both sides, so the MLS component of the security context of the
association will get swapped between the peers, instead of just one side
setting it to the other's MLS component. However, at that point
security_sctp_assoc_request() had already been called on both sides in
sctp_sf_do_unexpected_init() (on a temporary association) and thus if
the exchange didn't fail before due to MLS, it won't fail now either
(most likely both endpoints have the same MLS range).
Tested by:
- reproducer from https://src.fedoraproject.org/tests/selinux/pull-request/530
- selinux-testsuite (https://github.com/SELinuxProject/selinux-testsuite/)
- sctp-tests (https://github.com/sctp/sctp-tests) - no tests failed
that wouldn't fail also without the patch applied
Fixes:
c081d53f97a1 ("security: pass asoc to sctp_assoc_request and sctp_sk_clone")
Suggested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Paul Moore <paul@paul-moore.com> (LSM/SELinux)
Link: https://patch.msgid.link/20240826130711.141271-1-omosnace@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 27 Aug 2024 21:45:18 +0000 (14:45 -0700)]
Merge branch 'mptcp-close-subflow-when-receiving-tcp-fin-and-misc'
Matthieu Baerts says:
====================
mptcp: close subflow when receiving TCP+FIN and misc.
Here are different fixes:
Patch 1 closes the subflow after having received a FIN, instead
of leaving it half-closed until the end of the MPTCP connection.
A fix for v5.12.
Patch 2 validates the previous patch.
Patch 3 is a fix for a recent fix to check both directions for the
backup flag. It can follow the 'Fixes' commit and be backported up
to v5.7.
Patch 4 adds a missing \n at the end of pr_debug(), causing debug
messages to be displayed with a delay, which confuses the debugger.
A fix for v5.6.
====================
Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-0-905199fe1172@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Mon, 26 Aug 2024 17:11:21 +0000 (19:11 +0200)]
mptcp: pr_debug: add missing \n at the end
pr_debug() have been added in various places in MPTCP code to help
developers to debug some situations. With the dynamic debug feature, it
is easy to enable all or some of them, and asks users to reproduce
issues with extra debug.
Many of these pr_debug() don't end with a new line, while no 'pr_cont()'
are used in MPTCP code. So the goal was not to display multiple debug
messages on one line: they were then not missing the '\n' on purpose.
Not having the new line at the end causes these messages to be printed
with a delay, when something else needs to be printed. This issue is not
visible when many messages need to be printed, but it is annoying and
confusing when only specific messages are expected, e.g.
# echo "func mptcp_pm_add_addr_echoed +fmp" \
> /sys/kernel/debug/dynamic_debug/control
# ./mptcp_join.sh "signal address"; \
echo "$(awk '{print $1}' /proc/uptime) - end"; \
sleep 5s; \
echo "$(awk '{print $1}' /proc/uptime) - restart"; \
./mptcp_join.sh "signal address"
013 signal address
(...)
10.75 - end
15.76 - restart
013 signal address
[ 10.367935] mptcp:mptcp_pm_add_addr_echoed: MPTCP: msk=(...)
(...)
=> a delay of 5 seconds: printed with a 10.36 ts, but after 'restart'
which was printed at the 15.76 ts.
The 'Fixes' tag here below points to the first pr_debug() used without
'\n' in net/mptcp. This patch could be split in many small ones, with
different Fixes tag, but it doesn't seem worth it, because it is easy to
re-generate this patch with this simple 'sed' command:
git grep -l pr_debug -- net/mptcp |
xargs sed -i "s/\(pr_debug(\".*[^n]\)\(\"[,)]\)/\1\\\n\2/g"
So in case of conflicts, simply drop the modifications, and launch this
command.
Fixes:
f870fa0b5768 ("mptcp: Add MPTCP socket stubs")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-4-905199fe1172@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Mon, 26 Aug 2024 17:11:20 +0000 (19:11 +0200)]
mptcp: sched: check both backup in retrans
The 'mptcp_subflow_context' structure has two items related to the
backup flags:
- 'backup': the subflow has been marked as backup by the other peer
- 'request_bkup': the backup flag has been set by the host
Looking only at the 'backup' flag can make sense in some cases, but it
is not the behaviour of the default packet scheduler when selecting
paths.
As explained in the commit
b6a66e521a20 ("mptcp: sched: check both
directions for backup"), the packet scheduler should look at both flags,
because that was the behaviour from the beginning: the 'backup' flag was
set by accident instead of the 'request_bkup' one. Now that the latter
has been fixed, get_retrans() needs to be adapted as well.
Fixes:
b6a66e521a20 ("mptcp: sched: check both directions for backup")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-3-905199fe1172@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Mon, 26 Aug 2024 17:11:19 +0000 (19:11 +0200)]
selftests: mptcp: join: cannot rm sf if closed
Thanks to the previous commit, the MPTCP subflows are now closed on both
directions even when only the MPTCP path-manager of one peer asks for
their closure.
In the two tests modified here -- "userspace pm add & remove address"
and "userspace pm create destroy subflow" -- one peer is controlled by
the userspace PM, and the other one by the in-kernel PM. When the
userspace PM sends a RM_ADDR notification, the in-kernel PM will
automatically react by closing all subflows using this address. Now,
thanks to the previous commit, the subflows are properly closed on both
directions, the userspace PM can then no longer closes the same
subflows if they are already closed. Before, it was OK to do that,
because the subflows were still half-opened, still OK to send a RM_ADDR.
In other words, thanks to the previous commit closing the subflows, an
error will be returned to the userspace if it tries to close a subflow
that has already been closed. So no need to run this command, which mean
that the linked counters will then not be incremented.
These tests are then no longer sending both a RM_ADDR, then closing the
linked subflow just after. The test with the userspace PM on the server
side is now removing one subflow linked to one address, then sending
a RM_ADDR for another address. The test with the userspace PM on the
client side is now only removing the subflow that was previously
created.
Fixes:
4369c198e599 ("selftests: mptcp: test userspace pm out of transfer")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-2-905199fe1172@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matthieu Baerts (NGI0) [Mon, 26 Aug 2024 17:11:18 +0000 (19:11 +0200)]
mptcp: close subflow when receiving TCP+FIN
When a peer decides to close one subflow in the middle of a connection
having multiple subflows, the receiver of the first FIN should accept
that, and close the subflow on its side as well. If not, the subflow
will stay half closed, and would even continue to be used until the end
of the MPTCP connection or a reset from the network.
The issue has not been seen before, probably because the in-kernel
path-manager always sends a RM_ADDR before closing the subflow. Upon the
reception of this RM_ADDR, the other peer will initiate the closure on
its side as well. On the other hand, if the RM_ADDR is lost, or if the
path-manager of the other peer only closes the subflow without sending a
RM_ADDR, the subflow would switch to TCP_CLOSE_WAIT, but that's it,
leaving the subflow half-closed.
So now, when the subflow switches to the TCP_CLOSE_WAIT state, and if
the MPTCP connection has not been closed before with a DATA_FIN, the
kernel owning the subflow schedules its worker to initiate the closure
on its side as well.
This issue can be easily reproduced with packetdrill, as visible in [1],
by creating an additional subflow, injecting a FIN+ACK before sending
the DATA_FIN, and expecting a FIN+ACK in return.
Fixes:
40947e13997a ("mptcp: schedule worker when subflow is closed")
Cc: stable@vger.kernel.org
Link: https://github.com/multipath-tcp/packetdrill/pull/154
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-1-905199fe1172@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xueming Feng [Mon, 26 Aug 2024 10:23:27 +0000 (18:23 +0800)]
tcp: fix forever orphan socket caused by tcp_abort
We have some problem closing zero-window fin-wait-1 tcp sockets in our
environment. This patch come from the investigation.
Previously tcp_abort only sends out reset and calls tcp_done when the
socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only
purging the write queue, but not close the socket and left it to the
timer.
While purging the write queue, tp->packets_out and sk->sk_write_queue
is cleared along the way. However tcp_retransmit_timer have early
return based on !tp->packets_out and tcp_probe_timer have early
return based on !sk->sk_write_queue.
This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched
and socket not being killed by the timers, converting a zero-windowed
orphan into a forever orphan.
This patch removes the SOCK_DEAD check in tcp_abort, making it send
reset to peer and close the socket accordingly. Preventing the
timer-less orphan from happening.
According to Lorenzo's email in the v1 thread, the check was there to
prevent force-closing the same socket twice. That situation is handled
by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is
already closed.
The -ENOENT code comes from the associate patch Lorenzo made for
iproute2-ss; link attached below, which also conform to RFC 9293.
At the end of the patch, tcp_write_queue_purge(sk) is removed because it
was already called in tcp_done_with_error().
p.s. This is the same patch with v2. Resent due to mis-labeled "changes
requested" on patchwork.kernel.org.
Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/
Fixes:
c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Xueming Feng <kuro@kuroa.me>
Tested-by: Lorenzo Colitti <lorenzo@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pawel Dembicki [Mon, 26 Aug 2024 09:37:10 +0000 (11:37 +0200)]
net: phy: vitesse: implement MDI-X configuration in vsc73xx
This commit introduces MDI-X configuration support in vsc73xx phys.
Vsc73xx supports only auto mode or forced MDI.
Vsc73xx have auto MDI-X disabled by default in forced speed mode.
This commit enables it.
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240826093710.511837-1-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 27 Aug 2024 21:26:08 +0000 (14:26 -0700)]
Merge branch 'net-fix-module-autoloading'
Liao Chen says:
====================
net: fix module autoloading
This patchset aims to enable autoloading of some net modules.
By registering MDT, the kernel is allowed to automatically bind
modules to devices that match the specified compatible strings.
====================
Link: https://patch.msgid.link/20240826091858.369910-1-liaochen4@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Liao Chen [Mon, 26 Aug 2024 09:18:58 +0000 (09:18 +0000)]
net: airoha: fix module autoloading
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded
based on the alias from of_device_id table.
Signed-off-by: Liao Chen <liaochen4@huawei.com>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20240826091858.369910-4-liaochen4@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Liao Chen [Mon, 26 Aug 2024 09:18:57 +0000 (09:18 +0000)]
net: ag71xx: fix module autoloading
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded
based on the alias from of_device_id table.
Signed-off-by: Liao Chen <liaochen4@huawei.com>
Link: https://patch.msgid.link/20240826091858.369910-3-liaochen4@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Liao Chen [Mon, 26 Aug 2024 09:18:56 +0000 (09:18 +0000)]
net: dm9051: fix module autoloading
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded
based on the alias from of_device_id table.
Signed-off-by: Liao Chen <liaochen4@huawei.com>
Link: https://patch.msgid.link/20240826091858.369910-2-liaochen4@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yu Liao [Mon, 26 Aug 2024 01:21:00 +0000 (09:21 +0800)]
net: txgbe: use pci_dev_id() helper
PCI core API pci_dev_id() can be used to get the BDF number for a PCI
device. We don't need to compose it manually. Use pci_dev_id() to
simplify the code a little bit.
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240826012100.3975175-1-liaoyu15@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>