git.monstr.eu Git - linux-2.6-microblaze.git/log

crypto: af_alg: Use extract_iter_to_sg() to create scatterlists

Use extract_iter_to_sg() to decant the destination iterator into a
scatterlist in af_alg_get_rsgl(). af_alg_make_sg() can then be removed.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

crypto: af_alg: Pin pages rather than ref'ing if appropriate

Convert AF_ALG to use iov_iter_extract_pages() instead of
iov_iter_get_pages(). This will pin pages or leave them unaltered rather
than getting a ref on them as appropriate to the iterator.

The pages need to be pinned for DIO-read rather than having refs taken on
them to prevent VM copy-on-write from malfunctioning during a concurrent
fork() (the result of the I/O would otherwise end up only visible to the
child process and not the parent).

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Move netfs_extract_iter_to_sg() to lib/scatterlist.c

Move netfs_extract_iter_to_sg() to lib/scatterlist.c as it's going to be
used by more than just network filesystems (AF_ALG, for example).

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: netdev@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Wrap lines at 80

Wrap a line at 80 to stop checkpatch complaining.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: Simon Horman <simon.horman@corigine.com>
cc: linux-crypto@vger.kernel.org
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: netdev@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Fix a couple of spelling mistakes

Fix a couple of spelling mistakes in a comment.

Suggested-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZHH2mSRqeL4Gs1ft@corigine.com/
Link: https://lore.kernel.org/r/ZHH1nqZWOGzxlidT@corigine.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: netdev@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Drop the netfs_ prefix from netfs_extract_iter_to_sg()

Rename netfs_extract_iter_to_sg() and its auxiliary functions to drop the
netfs_ prefix.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Steve French <sfrench@samba.org>
cc: Shyam Prasad N <nspmangalore@gmail.com>
cc: Rohith Surabattula <rohiths.msft@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: linux-crypto@vger.kernel.org
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: netdev@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'txgbe-phylink-support'

Jiawen Wu says:

====================
TXGBE PHYLINK support

Implement I2C, SFP, GPIO and PHYLINK support to set up the TXGBE link.

Because our I2C and PCS are based on Synopsys Designware IP-core, extend
the i2c-designware and pcs-xpcs driver to realize our functions.
====================

Link: https://lore.kernel.org/r/20230606092107.764621-1-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: txgbe: Support phylink MAC layer

Add phylink support to Wangxun 10Gb Ethernet controller for the 10GBASE-R
interface.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: txgbe: Implement phylink pcs

Register MDIO bus for PCS layer to use Synopsys designware XPCS, support
10GBASE-R interface to the controller.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: pcs: Add 10GBASE-R mode for Synopsys Designware XPCS

Add basic support for XPCS using 10GBASE-R interface. This mode will
be extended to use interrupt, so set pcs.poll false. And avoid soft
reset so that the device using this mode is in the default configuration.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: txgbe: Support GPIO to SFP socket

Register GPIO chip and handle GPIO IRQ for SFP socket.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: txgbe: Add SFP module identify

Register SFP platform device to get module information.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: txgbe: Register I2C platform device

Register the platform device to use Designware I2C bus master driver.
Use regmap to read/write I2C device region from given base offset.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: txgbe: Register fixed rate clock

In order for I2C to be able to work in standard mode, register a fixed
rate clock for each I2C device.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: txgbe: Add software nodes to support phylink

Register software nodes for GPIO, I2C, SFP and PHYLINK. Define the
device properties.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: dsa: mv88e6xxx: implement USXGMII mode for mv88e6393x

Enable USXGMII mode for mv88e6393x chips. Tested on Marvell 88E6191X.

Signed-off-by: Michal Smulski <michal.smulski@ooma.com>
Link: https://lore.kernel.org/r/20230605174442.12493-1-msmulski2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'followup-fixes-for-the-dwmac-and-altera-lynx-conversion'

Maxime Chevallier says:

====================
Followup fixes for the dwmac and altera lynx conversion

Here's yet another version of the cleanup series for the TSE PCS replacement
by PCS Lynx. It includes Kconfig fixups, some missing initialisations
and a slight rework suggested by Russell for the dwmac cleanup sequence,
along with more explicit zeroing of local structures as per Maciej's
review.
====================

Link: https://lore.kernel.org/r/20230607135941.407054-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dwmac_socfpga: initialize local data for mdio regmap configuration

Explicitly zero the local mdio_regmap_config data, and explicitly
set the .autoscan parameter, as we only have a PCS on this bus.

Fixes: 5d1f3fe7d2d5 ("net: stmmac: dwmac-sogfpga: use the lynx pcs driver")
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: altera_tse: explicitly disable autoscan on the regmap-mdio bus

Set the .autoscan flag to false on the regmap-mdio bus, to avoid using a
random uninitialized value. We don't want autoscan in this case as the
mdio device is a PCS and not a PHY.

Fixes: db48abbaa18e ("net: ethernet: altera-tse: Convert to mdio-regmap and use PCS Lynx")
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: stmmac: make the pcs_lynx cleanup sequence specific to dwmac_socfpga

So far, only the dwmac_socfpga variant of stmmac uses PCS Lynx. Use a
dedicated cleanup sequence for dwmac_socfpga instead of using the
generic stmmac one.

Fixes: 5d1f3fe7d2d5 ("net: stmmac: dwmac-sogfpga: use the lynx pcs driver")
Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: altera_tse: Use the correct Kconfig option for the PCS_LYNX dependency

Use the correct Kconfig dependency for altera_tse as PCS_ALTERA_TSE was
replaced by PCS_LYNX.

Fixes: db48abbaa18e ("net: ethernet: altera-tse: Convert to mdio-regmap and use PCS Lynx")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: altera-tse: Initialize local structs before using them

The regmap_config and mdio_regmap_config objects need to be zeroed before
they are used. Otherwise, spurious errors can occur at probe time because
config->pad_bits contains random uninitialized data.
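
The failure mode can be illustrated outside the kernel with a minimal sketch. The struct and functions below (demo_regmap_config, demo_regmap_init, demo_probe) are hypothetical stand-ins, not the regmap API; the only assumption carried over from the text is that a non-zero pad_bits field makes initialisation fail.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a config struct; NOT the kernel's regmap_config. */
struct demo_regmap_config {
	int reg_bits;
	int pad_bits;
	int val_bits;
};

/* Hypothetical validator: rejects any config that has padding bits set. */
static int demo_regmap_init(const struct demo_regmap_config *cfg)
{
	assert(cfg != NULL);
	return cfg->pad_bits == 0 ? 0 : -1;
}

/* Zero the whole struct first, then fill in only the fields we care about. */
static int demo_probe(void)
{
	struct demo_regmap_config cfg;

	memset(&cfg, 0, sizeof(cfg));	/* without this, pad_bits is stack garbage */
	cfg.reg_bits = 32;
	cfg.val_bits = 32;

	return demo_regmap_init(&cfg);
}
```

Zeroing up front means any field added to the struct later also starts from a known value.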

Fixes: db48abbaa18e ("net: ethernet: altera-tse: Convert to mdio-regmap and use PCS Lynx")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'tools-ynl-generate-code-for-the-handshake-family'

Jakub Kicinski says:

====================
tools: ynl: generate code for the handshake family

Add necessary features and generate user space C code for serializing
/ deserializing messages of the handshake family.

In addition to basics already present in netdev and fou, handshake
has nested attrs and multi-attr u32.
====================

Link: https://lore.kernel.org/r/20230606194302.919343-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: generate code for the handshake family

Generate support for the handshake family.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: improve unwind on parsing errors

When parsing multi-attr we count the objects and then allocate
an array to hold the parsed objects. However, if an attr space has
multiple multi-attr objects and parsing the first array fails,
we leave the object count for the second set even though the second
array was never allocated.

This may cause crashes when freeing objects on error.

Count attributes to a variable on the stack and only set the count
in the object once the memory was allocated.
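
A minimal sketch of the pattern, using made-up names (demo_rsp, demo_rsp_parse, demo_rsp_free) rather than the generated YNL code: counts live in locals and are copied into the object only after the matching array exists, so the free routine never walks an array that was not allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parsed object holding two multi-attr arrays. */
struct demo_rsp {
	unsigned int n_first;
	char **first;
	unsigned int n_second;
	char **second;
};

/* Frees every element, trusting that each count matches its array. */
static void demo_rsp_free(struct demo_rsp *rsp)
{
	unsigned int i;

	for (i = 0; i < rsp->n_first; i++)
		free(rsp->first[i]);
	free(rsp->first);
	for (i = 0; i < rsp->n_second; i++)
		free(rsp->second[i]);
	free(rsp->second);
}

/* fail_first simulates a failure while handling the first array. */
static int demo_rsp_parse(struct demo_rsp *rsp, unsigned int first_seen,
			  unsigned int second_seen, int fail_first)
{
	unsigned int n_first = first_seen;	/* counted on the stack */
	unsigned int n_second = second_seen;

	assert(rsp != NULL);
	if (n_first) {
		if (fail_first)
			return -1;	/* rsp->n_second is still 0 here */
		rsp->first = calloc(n_first, sizeof(*rsp->first));
		if (!rsp->first)
			return -1;
		rsp->n_first = n_first;	/* set only once memory exists */
	}
	if (n_second) {
		rsp->second = calloc(n_second, sizeof(*rsp->second));
		if (!rsp->second)
			return -1;
		rsp->n_second = n_second;
	}
	return 0;
}
```

After a failed parse, demo_rsp_free() is safe to call because both counts still describe what was actually allocated.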

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: fill in support for MultiAttr scalars

The handshake family needs support for MultiAttr scalars.
Right now we only support code gen for MultiAttr nested
types.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: txgbe: Avoid passing uninitialised parameter to pci_wake_from_d3()

txgbe_shutdown() relies on txgbe_dev_shutdown() to initialise
wake by passing it by reference. However, txgbe_dev_shutdown()
doesn't use this parameter at all.

wake is then passed uninitialised by txgbe_shutdown()
to pci_wake_from_d3().

Resolve this problem by:
* Removing the unused parameter from txgbe_dev_shutdown()
* Removing the uninitialised variable wake from txgbe_shutdown()
* Passing false to pci_wake_from_d3() - this assumes that,
  although uninitialised, wake was in practice false (0).
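
The shape of the fix can be sketched in user space. All names below are hypothetical; demo_wake_from_d3() merely records its argument and is not the PCI API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical recorder standing in for pci_wake_from_d3(); NOT the kernel API. */
static int demo_last_wake = -1;

static void demo_wake_from_d3(bool enable)
{
	demo_last_wake = enable ? 1 : 0;
}

/* After the fix: the helper no longer takes an out-parameter it never wrote. */
static void demo_dev_shutdown(void)
{
	/* device teardown would go here */
}

/* The caller passes an explicit constant instead of an unset local. */
static void demo_shutdown(void)
{
	demo_dev_shutdown();
	demo_wake_from_d3(false);
	assert(demo_last_wake == 0);	/* the value is now well defined */
}
```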

I'm not sure that this counts as a bug, as I'm not sure that
it manifests in any unwanted behaviour. But in any case, the issue
was introduced by:

  3ce7547e5b71 ("net: txgbe: Add build support for txgbe")

Flagged by Smatch as:

  .../txgbe_main.c:486 txgbe_shutdown() error: uninitialized symbol 'wake'.

No functional change intended.
Compile tested only.

Signed-off-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: qca8k: remove unnecessary (void*) conversions

Pointer variables of (void*) type do not require type cast.
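
As a small illustration (demo_priv and demo_port_count are invented names, not qca8k code), C converts void * to any object pointer type implicitly, so the assignment below compiles without a cast:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver state. */
struct demo_priv {
	int port_count;
};

/* Hypothetical callback that receives driver state as an opaque pointer. */
static int demo_port_count(void *ctx)
{
	/* No (struct demo_priv *) cast needed: void * converts implicitly in C. */
	struct demo_priv *priv = ctx;

	assert(priv != NULL);
	return priv->port_count;
}
```

Note that this holds for C only; C++ would require the explicit cast.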

Signed-off-by: Atin Bainada <hi@atinb.me>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: liquidio: fix mixed module-builtin object

With CONFIG_LIQUIDIO=m and CONFIG_LIQUIDIO_VF=y (or vice versa),
$(common-objs) are linked to a module and also to vmlinux even though
the expected CFLAGS are different between builtins and modules.

This is the same situation as fixed by commit 637a642f5ca5 ("zstd:
Fixing mixed module-builtin objects").

Introduce the new module, liquidio-core, to provide the common functions
to liquidio and liquidio-vf.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix formatting in sysctl_net_ipv4.c

Fix incorrectly formatted tcp_syn_linear_timeouts sysctl in the
ipv4_net_table.

Fixes: ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear")
Signed-off-by: David Morley <morleyd@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Tested-by: David Morley <morleyd@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: ocelot: unlock on error in vsc9959_qos_port_tas_set()

This error path needs to call mutex_unlock(&ocelot->tas_lock) before
returning.
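
The usual way to avoid this class of bug is a single exit label that releases the lock. The sketch below uses a toy lock (demo_lock) and a made-up function (demo_set_schedule) rather than the kernel mutex API or the ocelot driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock standing in for a mutex; it only tracks held / not held. */
struct demo_lock {
	bool held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Every path after the acquire leaves through out_unlock. */
static int demo_set_schedule(struct demo_lock *lock, int cycle_time)
{
	int err = 0;

	demo_lock_acquire(lock);

	if (cycle_time <= 0) {
		err = -1;
		goto out_unlock;	/* error path still drops the lock */
	}

	/* ... apply the configuration here ... */

out_unlock:
	demo_lock_release(lock);
	return err;
}
```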

Fixes: 2d800bc500fb ("net/sched: taprio: replace tc_taprio_qopt_offload :: enable with a "cmd" enum")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'realtek-external-phy-clock'

Detlev Casanova says:

====================
net: phy: realtek: Support external PHY clock

Some PHYs can use an external clock that must be enabled before
communicating with them.

Changes since v3:
* Do not call genphy_suspend if WoL is enabled.
Changes since v2:
* Reword documentation commit message
Changes since v1:
* Remove the clock name as it is not guaranteed to be identical across
different PHYs
* Disable/Enable the clock when suspending/resuming
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: realtek: Disable clock on suspend

For PHYs that call rtl821x_probe() where an external clock can be
configured, make sure that the clock is disabled
when ->suspend() is called and enabled on resume.

The PHY_ALWAYS_CALL_SUSPEND is added to ensure that the suspend function
is actually always called.

Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: phy: Document support for external PHY clk

Ethernet PHYs can have an external clock that needs to be activated before
communicating with the PHY.

Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: realtek: Add optional external PHY clock

In some cases, the PHY can use an external clock source instead of a
crystal.

Add an optional clock in the phy node to make sure that the clock source
is enabled, if specified, before probing.

Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

hv_netvsc: Allocate rx indirection table size dynamically

Allocate the size of rx indirection table dynamically in netvsc
from the value of size provided by OID_GEN_RECEIVE_SCALE_CAPABILITIES
query instead of using a constant value of ITAB_NUM.
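
A rough user-space sketch of the idea, assuming the table size arrives as a runtime value; demo_alloc_ind_table() and the round-robin fill are illustrative choices, not the netvsc implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate an indirection table whose size is reported at runtime and
 * spread the entries across the queues round-robin.
 * Returns NULL on a zero size or allocation failure; caller frees.
 */
static uint16_t *demo_alloc_ind_table(uint32_t table_size, uint32_t num_queues)
{
	uint16_t *table;
	uint32_t i;

	assert(num_queues > 0);
	if (table_size == 0)
		return NULL;

	table = calloc(table_size, sizeof(*table));
	if (!table)
		return NULL;

	for (i = 0; i < table_size; i++)
		table[i] = (uint16_t)(i % num_queues);

	return table;
}
```

With a dynamic size, the table has to be freed on teardown, which a fixed-size embedded array did not require.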

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Tested-on: Ubuntu22 (azure VM, SKU size: Standard_F72s_v2)
Testcases:
1. ethtool -x eth0 output
2. LISA testcase:PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-Synthetic
3. LISA testcase:PERF-NETWORK-TCP-THROUGHPUT-MULTICONNECTION-NTTTCP-SRIOV
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: replace open-code bearer rcu_dereference access in bearer.c

Replace these open-coded bearer rcu_dereference accesses with bearer_get(),
like other places in bearer.c. While at it, also use tipc_net() instead
of net_generic(net, tipc_net_id) to get "tn" in bearer.c.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/1072588a8691f970bda950c7e2834d1f2983f58e.1685976044.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'ipv4-remove-rt_conn_flags-calls-in-flowi4_init_output'

Guillaume Nault says:

====================
ipv4: Remove RT_CONN_FLAGS() calls in flowi4_init_output().

Remove a few RT_CONN_FLAGS() calls used inside flowi4_init_output().
These users can be easily converted to set the scope properly, instead
of overloading the tos parameter with scope information as done by
RT_CONN_FLAGS().

The objective is to eventually remove RT_CONN_FLAGS() entirely, which
will then also allow removing RTO_ONLINK and finally converting
->flowi4_tos to dscp_t.
====================

Link: https://lore.kernel.org/r/cover.1685999117.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tcp: Set route scope properly in cookie_v4_check().

RT_CONN_FLAGS(sk) overloads flowi4_tos with the RTO_ONLINK bit when
sk has the SOCK_LOCALROUTE flag set. This allows
ip_route_output_key_hash() to eventually adjust flowi4_scope.

Instead of relying on special handling of the RTO_ONLINK bit, we can
just set the route scope correctly. This will eventually allow us to avoid
special interpretation of tos variables and to convert ->flowi4_tos to
dscp_t.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv4: Set correct scope in inet_csk_route_*().

RT_CONN_FLAGS(sk) overloads the tos parameter with the RTO_ONLINK bit
when sk has the SOCK_LOCALROUTE flag set. This is only useful for
ip_route_output_key_hash() to eventually adjust the route scope.

Let's drop RTO_ONLINK and set the correct scope directly to avoid this
special case in the future and to allow converting ->flowi4_tos to
dscp_t.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'move-ksz9477-errata-handling-to-phy-driver'

Robert Hancock says:

====================
Move KSZ9477 errata handling to PHY driver

Patches to move handling for KSZ9477 PHY errata register fixes from
the DSA switch driver into the corresponding PHY driver, for more
proper layering and ordering.
====================

Link: https://lore.kernel.org/r/20230605153943.1060444-1-robert.hancock@calian.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: microchip: remove KSZ9477 PHY errata handling

The KSZ9477 PHY errata handling code has now been moved into the Micrel
PHY driver, so it is no longer needed inside the DSA switch driver.
Remove it.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: micrel: Move KSZ9477 errata fixes to PHY driver

The ksz9477 DSA switch driver is currently updating some MMD registers
on the internal port PHYs to address some chip errata. However, these
errata are really a property of the PHY itself, not the switch they are
part of, so this is kind of a layering violation. It makes more sense for
these writes to be done inside the driver which binds to the PHY and not
the driver for the containing device.

This also addresses some issues where the ordering of when these writes
are done may have been incorrect, causing the link to erratically fail to
come up at the proper speed or at all. Doing this in the PHY driver
during config_init ensures that they happen before anything else tries to
change the state of the PHY on the port.

The new code also ensures that autonegotiation is disabled during the
register writes and re-enabled afterwards, as indicated by the latest
version of the errata documentation from Microchip.
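
The ordering described above can be sketched against a fake register file. The struct layout, register indices and values are invented for illustration and do not come from the Microchip errata or the Micrel driver; only the sequence (autonegotiation off, writes, autonegotiation restored) follows the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative autonegotiation-enable bit in a control register. */
#define DEMO_CTRL_ANEG 0x1000u

/* Hypothetical PHY model: one control register plus four fix-up registers. */
struct demo_phy {
	uint16_t ctrl;
	uint16_t mmd[4];
};

/* One made-up errata write: register index and value. */
struct demo_fix {
	uint8_t reg;
	uint16_t val;
};

static void demo_apply_errata(struct demo_phy *phy,
			      const struct demo_fix *fixes, size_t n)
{
	uint16_t had_aneg = phy->ctrl & DEMO_CTRL_ANEG;
	size_t i;

	/* 1. Turn autonegotiation off before touching the registers. */
	phy->ctrl &= (uint16_t)~DEMO_CTRL_ANEG;

	/* 2. Apply every fix-up write. */
	for (i = 0; i < n; i++) {
		assert(fixes[i].reg < 4);
		phy->mmd[fixes[i].reg] = fixes[i].val;
	}

	/* 3. Restore autonegotiation if it was enabled on entry. */
	phy->ctrl |= had_aneg;
}
```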

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'tools-ynl-user-space-c'

Jakub Kicinski says:

====================
tools: ynl: user space C

Use the code gen which is already in tree to generate a user space
library for a handful of simple families. I find YNL C quite useful
in some WIP projects, and I think others may find it useful, too.
I was hoping someone would pick this work up and finish it...
but it seems that Python YNL has largely stolen the thunder.
Python may not be great for selftest, tho, and actually this lib
is more fully-featured. The Python script was meant as a quick demo,
funny how those things go.

v2: https://lore.kernel.org/all/20230604175843.662084-1-kuba@kernel.org/
v1: https://lore.kernel.org/all/20230603052547.631384-1-kuba@kernel.org/
====================

Link: https://lore.kernel.org/r/20230605190108.809439-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: add sample for netdev

Add a sample application using the C library.
My main goal is to make writing selftests easier but until
I have some of those ready I think it's useful to show off
the functionality and let people poke and tinker.

Sample outputs - dump:

$ ./netdev
Select ifc ($ifindex; or 0 = dump; or -2 ntf check): 0
lo[1] 0:
enp1s0[2] 23: basic redirect rx-sg

Notifications (watching veth pair getting added and deleted):

$ ./netdev
Select ifc ($ifindex; or 0 = dump; or -2 ntf check): -2
[53] 0: (ntf: dev-add-ntf)
[54] 0: (ntf: dev-add-ntf)
[54] 23: basic redirect rx-sg (ntf: dev-change-ntf)
[53] 23: basic redirect rx-sg (ntf: dev-change-ntf)
[53] 23: basic redirect rx-sg (ntf: dev-del-ntf)
[54] 23: basic redirect rx-sg (ntf: dev-del-ntf)

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: support fou and netdev in C

Generate the code for netdev and fou families. They are simple
and already supported by the code gen.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl: user space helpers

Add "fixed" part of the user space Netlink Spec-based library.
This will get linked with the protocol implementations to form
a full API.

Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: clean up stray new lines at the end of reply-less requests

Do not print empty lines before closing brackets.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/pppoe: fix a typo for the PPPOE_HASH_BITS_1 definition

Instead of defining PPPOE_HASH_BITS_1 as intended, commit 96ba44c637b0
("net/pppoe: make number of hash bits configurable") actually defined
config PPPOE_HASH_BITS_2 twice in the ppp Kconfig file due to a typo
in the numbers.

Fix the typo and define PPPOE_HASH_BITS_1.

Fixes: 96ba44c637b0 ("net/pppoe: make number of hash bits configurable")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jaco Kroon <jaco@uls.co.za>
Link: https://lore.kernel.org/r/20230605072743.11247-1-lukas.bulwahn@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

mac_pton: Clean up the header inclusions

Since hex_to_bin() is provided by hex.h there is no need to require
kernel.h. Replace the latter by the former and add missing export.h.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230604132858.6650-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

gro: decrease size of CB

The GRO control block (NAPI_GRO_CB) is currently at its maximum size.
This commit reduces its size by putting two groups of fields that are
used only at different times into a union.

Specifically, the fields frag0 and frag0_len are the fields that make up
the frag0 optimisation mechanism, which is used during the initial
parsing of the SKB.

The fields last and age are used after the initial parsing, while the
SKB is stored in the GRO list, waiting for other packets to arrive.

There was one location in dev_gro_receive that modified the frag0 fields
after setting last and age. I changed this accordingly without altering
the code behaviour.
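
The space saving can be sketched with two toy layouts. These structs are simplified stand-ins (demo_cb_flat, demo_cb_union), not the real napi_gro_cb; the anonymous struct and union members need C11 or a compiler extension.

```c
#include <assert.h>
#include <stddef.h>

struct demo_skb;	/* opaque, hypothetical packet type */

/* Before: both groups of fields occupy their own storage. */
struct demo_cb_flat {
	void *frag0;
	unsigned int frag0_len;
	struct demo_skb *last;
	unsigned long age;
};

/* After: the groups overlap because they are never live at the same time. */
struct demo_cb_union {
	union {
		struct {	/* used during initial parsing */
			void *frag0;
			unsigned int frag0_len;
		};
		struct {	/* used while held in the GRO list */
			struct demo_skb *last;
			unsigned long age;
		};
	};
};

/* Moving to the "held" phase overwrites the parsing-phase fields. */
static void demo_park(struct demo_cb_union *cb, struct demo_skb *last,
		      unsigned long now)
{
	assert(cb != NULL);
	cb->last = last;	/* frag0 must not be read after this point */
	cb->age = now;
}
```

The cost of the union is an ordering constraint: nothing may touch frag0 or frag0_len once last and age have been written, which is why the one out-of-order access had to be moved.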

Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230601161407.GA9253@debian
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge branch 'splice-net-handle-msg_splice_pages-in-af_kcm'

David Howells says:

====================
splice, net: Handle MSG_SPLICE_PAGES in AF_KCM

Here are patches to make AF_KCM handle the MSG_SPLICE_PAGES internal
sendmsg flag.  MSG_SPLICE_PAGES is an internal hint that tells the protocol
that it should splice the pages supplied if it can.  Its sendpage
implementation is then turned into a wrapper around that.

Does anyone actually use AF_KCM?  Upstream it has some issues.  It doesn't
seem able to handle a "message" longer than 113920 bytes without jamming
and doesn't handle the client termination once it is jammed.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6
Link: https://lore.kernel.org/r/20230524144923.3623536-1-dhowells@redhat.com/
====================

Link: https://lore.kernel.org/r/20230531110423.643196-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

kcm: Convert kcm_sendpage() to use MSG_SPLICE_PAGES

Convert kcm_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than
directly splicing in the pages itself.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tom Herbert <tom@herbertland.com>
cc: Tom Herbert <tom@quantonium.net>
cc: Cong Wang <cong.wang@bytedance.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

kcm: Support MSG_SPLICE_PAGES

Make AF_KCM sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
spliced from the source iterator if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tom Herbert <tom@herbertland.com>
cc: Tom Herbert <tom@quantonium.net>
cc: Cong Wang <cong.wang@bytedance.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'mlx5-updates-2023-05-31' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-05-31

net/mlx5: Support 4 ports VF LAG, part 1/2

This series continues the series[1] "Support 4 ports HCAs LAG mode"
by Mark Bloch. This series adds support for 4 ports VF LAG (single FDB
E-Switch).

This series of patches focuses on refactoring different sections of the
code that make assumptions about VF LAG supporting only two ports. For
instance, it assumes that each device can only have one peer.

Patches 1-5:
- Refactor ETH handling of TC rules of eswitches with peers.
Patch 6:
- Refactors peer miss group table.
Patches 7-9:
- Refactor single FDB E-Switch creation.
Patch 10:
- Refactor the DR layer.
Patches 11-14:
- Refactors devcom layer.

Next series will refactor LAG layer and enable 4 ports VF LAG.
This series specifically allows HCAs with 4 ports to create a VF LAG
with only 4 ports. It is not possible to create a VF LAG with 2 or 3
ports using HCAs that have 4 ports.

Currently, the Merged E-Switch feature only supports HCAs with 2 ports.
However, upcoming patches will introduce support for HCAs with 4 ports.

In order to activate VF LAG a user can execute:

devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink dev eswitch set pci/0000:08:00.1 mode switchdev
devlink dev eswitch set pci/0000:08:00.2 mode switchdev
devlink dev eswitch set pci/0000:08:00.3 mode switchdev
ip link add name bond0 type bond
ip link set dev bond0 type bond mode 802.3ad
ip link set dev eth2 master bond0
ip link set dev eth3 master bond0
ip link set dev eth4 master bond0
ip link set dev eth5 master bond0

Where eth2, eth3, eth4 and eth5 are the net-interfaces of pci/0000:08:00.0,
pci/0000:08:00.1, pci/0000:08:00.2 and pci/0000:08:00.3 respectively.

User can verify LAG state and type via debugfs:
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type

[1]
https://lore.kernel.org/netdev/20220510055743.118828-1-saeedm@nvidia.com/

* tag 'mlx5-updates-2023-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Devcom, extend mlx5_devcom_send_event to work with more than two devices
  net/mlx5: Devcom, introduce devcom_for_each_peer_entry
  net/mlx5: E-switch, mark devcom as not ready when all eswitches are unpaired
  net/mlx5: Devcom, Rename paired to ready
  net/mlx5: DR, handle more than one peer domain
  net/mlx5: E-switch, generalize shared FDB creation
  net/mlx5: E-switch, Handle multiple master egress rules
  net/mlx5: E-switch, refactor FDB miss rule add/remove
  net/mlx5: E-switch, enlarge peer miss group table
  net/mlx5e: Handle offloads flows per peer
  net/mlx5e: en_tc, re-factor query route port
  net/mlx5e: rep, store send to vport rules per peer
  net/mlx5e: tc, Refactor peer add/del flow
  net/mlx5e: en_tc, Extend peer flows to a list
====================

Link: https://lore.kernel.org/r/20230602191301.47004-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'drm-i915-use-ref_tracker-library-for-tracking-wakerefs'

Andrzej Hajda says:

====================
drm/i915: use ref_tracker library for tracking wakerefs

This is a reviewed series of ref_tracker patches, ready to merge
via the network tree, rebased on net-next/main.
The i915 patches will be merged later via the intel-gfx tree.
====================

Merge on top of an -rc tag in case it's needed in another tree.

Link: https://lore.kernel.org/r/20230224-track_gt-v9-0-5b47a33f55d1@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

lib/ref_tracker: remove warnings in case of allocation failure

The library can handle allocation failures. To avoid allocation warnings,
__GFP_NOWARN has been added everywhere. Moreover, GFP_ATOMIC has been
replaced with GFP_NOWAIT for the stack allocation in the tracker free
call.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

lib/ref_tracker: add printing to memory buffer

Similar to stack_(depot|trace)_snprint, the patch
adds a helper for printing stats to a memory buffer.
It will be helpful for debugfs.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

lib/ref_tracker: improve printing stats

In case the library is tracking a busy subsystem, simply
printing the stack for every active reference will spam the log
with long, hard to read, redundant stack traces. To improve
readability, the following changes have been made:
- reports are printed per stack_handle - log is more compact,
- added display name for ref_tracker_dir - it will differentiate
multiple subsystems,
- stack trace is printed indented, in the same printk call,
- info about dropped references is printed as well.
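
The per-handle grouping can be sketched in plain C as follows. This is
only an illustration under assumptions: struct fake_ref, fake_report()
and the output format are hypothetical and are not the library's code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical active reference: only the stack handle matters here. */
struct fake_ref {
	unsigned int stack_handle;
};

/*
 * Write one summary line per distinct stack handle into 'buf' instead of
 * one line per reference. Returns the number of distinct handles found.
 */
static int fake_report(const char *name, const struct fake_ref *refs, int n,
		       char *buf, size_t size)
{
	size_t used = 0;
	int i, j, groups = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		int count = 0, seen = 0;

		/* Skip handles already reported by an earlier entry. */
		for (j = 0; j < i; j++)
			if (refs[j].stack_handle == refs[i].stack_handle)
				seen = 1;
		if (seen)
			continue;

		/* Count every reference sharing this handle. */
		for (j = i; j < n; j++)
			if (refs[j].stack_handle == refs[i].stack_handle)
				count++;
		groups++;

		/* Append only while there is room; output may be truncated. */
		if (used < size) {
			int len = snprintf(buf + used, size - used,
					   "%s: handle %u has %d/%d users\n",
					   name, refs[i].stack_handle,
					   count, n);
			if (len > 0)
				used += (size_t)len;
		}
	}
	return groups;
}
```

With four references on handles 7, 9, 7 and 7, this writes two lines
rather than four.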

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

lib/ref_tracker: add unlocked leak print helper

To have reliable detection of leaks, the caller must be able to check
both the tracked counter and the leaks under the same lock. dir.lock is
the natural candidate for such a lock, and the unlocked print helper can
be called with this lock taken.
As a bonus we can reuse this helper in ref_tracker_dir_exit.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mlxsw-selftests-cleanups'

Petr Machata says:

====================
mlxsw, selftests: Cleanups

This patchset consolidates a number of disparate items that can all be
considered cleanups. They are all related to mlxsw in that they are
directly in mlxsw code, or in selftests that mlxsw heavily uses.

- patch #1 fixes a comment, patch #2 propagates an extack

- patches #3 and #4 tweak several loops to query a resource once and cache
  in a local variable instead of querying on each iteration

- patches #5 and #6 fix selftest diagrams, and #7 adds a missing diagram
  into an existing test

- patch #8 disables a PVID on a bridge in a selftest that should not need
  said PVID
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: router_bridge_vlan: Set vlan_default_pvid 0 on the bridge

When everything is configured, VLAN membership on the bridge in this
selftest is as follows:

    # bridge vlan show
    port              vlan-id
    swp2              1 PVID Egress Untagged
                      555
    br1               1 Egress Untagged
                      555 PVID Egress Untagged

Note that it is possible for untagged traffic to just flow through as VLAN
1, instead of using VLAN 555 as intended by the test. This configuration
seems too close to "works by accident", and it would be better to just shut
out VLAN 1 altogether.

To that end, configure vlan_default_pvid of 0:

    # bridge vlan show
    port              vlan-id
    swp2              555
    br1               555 PVID Egress Untagged

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: router_bridge_vlan: Add a diagram

Add a topology diagram to this selftest to make the configuration easier to
understand.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mlxsw: egress_vid_classification: Fix the diagram

The topology diagram implies that $swp1 and $swp2 are members of the bridge
br0, when in fact only their uppers, $swp1.10 and $swp2.10 are. Adjust the
diagram.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mlxsw: ingress_rif_conf_1d: Fix the diagram

The topology diagram implies that $swp1 and $swp2 are members of the bridge
br0, when in fact only their uppers, $swp1.10 and $swp2.10 are. Adjust the
diagram.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Do not query MAX_VRS on each iteration

MLXSW_CORE_RES_GET involves a call to spectrum_core, a separate module.
Instead of making the call on every iteration, cache it up front, and use
the value.
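
A minimal sketch of the pattern in plain C. query_max_vrs() is a
hypothetical stand-in for the MLXSW_CORE_RES_GET() call and the counter
exists only to make the difference visible; none of this is driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the (hypothetical) expensive query runs. */
static unsigned int query_calls;

/* Hypothetical stand-in for a resource query served by another module. */
static unsigned int query_max_vrs(void)
{
	query_calls++;
	return 4;
}

/* Before: the loop condition re-runs the query on every iteration. */
static unsigned int count_used_uncached(const int *used)
{
	unsigned int i, n = 0;

	assert(used != NULL);
	for (i = 0; i < query_max_vrs(); i++)
		n += used[i] ? 1 : 0;
	return n;
}

/* After: query once, keep the result in a local, loop over that. */
static unsigned int count_used_cached(const int *used)
{
	unsigned int max_vrs = query_max_vrs();
	unsigned int i, n = 0;

	assert(used != NULL);
	for (i = 0; i < max_vrs; i++)
		n += used[i] ? 1 : 0;
	return n;
}
```

For a limit of 4, the uncached loop issues five queries (one per
condition check) while the cached loop issues one.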

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Do not query MAX_RIFS on each iteration

MLXSW_CORE_RES_GET involves a call to spectrum_core, a separate module.
Instead of making the call on every iteration, cache it up front, and use
the value.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Use extack in mlxsw_sp~_rif_ipip_lb_configure()

In commit 26029225d992 ("mlxsw: spectrum_router: Propagate extack
further"), the mlxsw_sp_rif_ops.configure callback got a new argument,
extack. However the callbacks that deal with tunnel configuration,
mlxsw_sp1_rif_ipip_lb_configure() and mlxsw_sp2_rif_ipip_lb_configure(),
were never updated to pass the parameter further. Do that now.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Clarify a comment

"Reserved for X" usually means that only X is supposed to use a given
object. Here, it is used in the sense that X should consider the object
"reserved", as in "restricted".

Replace the comment simply by "X", with the implication that that's where
the field is used.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sja1105-cleanups'

Russell King says:

====================
convert sja1105 xpcs creation and remove xpcs_create

This series of three patches converts sja1105 to use the newly
provided xpcs_create_mdiodev(), and as there are then no users of
xpcs_create(), removes this function from the global namespace to
discourage future direct use.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: pcs: xpcs: remove xpcs_create() from public view

There are now no callers of xpcs_create(), so let's remove it from
public view to discourage future direct usage.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: use xpcs_create_mdiodev()

Use the new xpcs_create_mdiodev() creator, which simplifies the
creation and destruction of the mdio device associated with xpcs.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: sja1105: allow XPCS to handle mdiodev lifetime

Put the mdiodev after xpcs_create() so that the XPCS driver can manage
the lifetime of the mdiodev it's using.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'regmap-TSE-PCS'

Maxime Chevallier says:

====================
net: add a regmap-based mdio driver and drop TSE PCS

This is the V4 of a series that follows-up on the work [1] aiming to drop the
altera TSE PCS driver, as it turns out to be a version of the Lynx PCS exposed
as a memory-mapped block, instead of living on an MDIO bus.

One step of this removal involved creating a regmap-based mdio driver
that translates MDIO accesses into the actual underlying bus that
exposes the register. The register layout must of course match the
standard MDIO layout, but we can now account for differences in stride
with recent work on the regmap subsystem [2].

Sorry for repeating this, but I didn't hear anything on this matter in previous
iterations: Mark, net maintainers, this series depends on patch
e12ff2876493, which was recently merged into the regmap tree [3].

For this series to be usable in net-next, this patch must be applied
beforehand. Should Mark create a tag that would then be merged into
net-next? Or should we just wait for the next release to merge this
into net-next?

This series introduces a new MDIO driver, and uses it to convert Altera
TSE from the actual TSE PCS driver to Lynx PCS.

Since it turns out dwmac_socfpga also uses a TSE PCS block, port that
driver to Lynx as well.

Changes in V4 :
- Use new pcs_lynx_create/destroy helpers added by Russell
- Rework the cleanup sequence to avoid leaking data
- Rework a bit KConfig to properly select dependencies
- Fix a few hiccups with misplaced hunks in 2 commits

Changes in V3 :
- Use a dedicated struct for the mii bus's priv data, to avoid
   duplicating the whole struct mdio_regmap_config, from which 2 fields
   only are necessary after init, as suggested by Russell
- Use ~0 instead of ~0UL for the no-scan bitmask, following Simon's
   review.

Changes in V2 :
- Use phy_mask to avoid unnecessarily scanning the whole mdio bus
- Go one step further and completely disable scanning if users
   set the .autoscan flag to false, in case the mdiodevice isn't an
   actual PHY (a PCS for example).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: dwmac-sogfpga: use the lynx pcs driver

dwmac_socfpga re-implements support for the TSE PCS, which is identical
to the already existing TSE PCS, which in turn is the same as the Lynx
PCS. Drop the existing TSE re-implementation and use the Lynx PCS
instead, relying on the regmap-mdio driver to translate MDIO accesses
into mmio accesses.

Add a lynx_pcs reference in the stmmac's internal structure, and use
.mac_select_pcs() to return the relevant PCS to be used.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: pcs: Drop the TSE PCS driver

Now that we can easily create an mdio-device that represents a
memory-mapped device that exposes an MDIO-like register layout, we don't
need the Altera TSE PCS anymore, since we can use the Lynx PCS instead.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ethernet: altera-tse: Convert to mdio-regmap and use PCS Lynx

The newly introduced regmap-based MDIO driver allows for an easy mapping
of an mdiodevice onto the memory-mapped TSE PCS, which is actually a
Lynx PCS.

Convert Altera TSE to use this PCS instead of the pcs-altera-tse, which
is nothing more than a memory-mapped Lynx PCS.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mdio: Introduce a regmap-based mdio driver

There exist several examples today of devices that embed an Ethernet
PHY or PCS directly inside an SoC. In this situation, either the device
is controlled through a vendor-specific register set, or sometimes
exposes the standard 802.3 registers that are typically accessed over
MDIO.

As phylib and phylink are designed to use mdiodevices, this driver
allows creating a virtual MDIO bus, that translates mdiodev register
accesses to regmap accesses.

We use regmap because at least 3 such devices are known today. Two of
them are Altera TSE PCSs, memory-mapped and exposed with a 4-byte
stride in stmmac's dwmac-socfpga variant and a 2-byte stride in
altera-tse. The other one (nxp,sja1110-base-tx-mdio) is exposed over
SPI.
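
The stride difference can be illustrated with a small plain C model.
Everything here is an assumption made for illustration: struct
fake_mmio_pcs and fake_mdio_read() are hypothetical names, the register
window is an ordinary byte array, and the real driver goes through the
regmap API rather than computing offsets itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a memory-mapped block with MDIO-style registers. */
struct fake_mmio_pcs {
	const uint8_t *base;	/* start of the register window */
	size_t size;		/* window size in bytes */
	unsigned int stride;	/* bytes between consecutive registers */
};

/* Read 16-bit register 'regnum'; returns the value, or -1 if out of range. */
static int fake_mdio_read(const struct fake_mmio_pcs *pcs, unsigned int regnum)
{
	size_t off = (size_t)regnum * pcs->stride;
	uint16_t val;

	assert(pcs->stride >= sizeof(val));
	if (off + sizeof(val) > pcs->size)
		return -1;
	/* Host-endian copy; a real bus access would define the byte order. */
	memcpy(&val, pcs->base + off, sizeof(val));
	return val;
}
```

The same register number lands at byte offset 2 with a 2-byte stride
and at byte offset 4 with a 4-byte stride, which is the only difference
the caller of the virtual bus would need to configure in this model.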

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: lower "link become ready"'s level message

The following message is printed on the console each time a network
device configured with an IPv6 address is ready to be used:

ADDRCONF(NETDEV_CHANGE): <iface>: link becomes ready

When netns are being extensively used -- e.g. by re-creating netns' with
veth to communicate with each other for testing purposes, as the
mptcp_join.sh selftest does -- this generates a lot of such messages: more
than 700 when executing mptcp_join.sh with the latest version.

It looks like this message is not that helpful after all: maybe it can
be used as a sign to know if there is something wrong, e.g. if a device
is being regularly reconfigured by accident? But even then, there are
better ways to monitor and diagnose such issues.

When looking at commit 3c21edbd1137 ("[IPV6]: Defer IPv6 device
initialization until the link becomes ready."), which introduced this
message, it seems it was added to verify that the new feature was
working as expected. It could have used a lower level than "info"
from the beginning, but it was fine like that back then, 17 years ago.

It seems then OK today to simply lower its level, similar to commit
7c62b8dd5ca8 ("net/ipv6: lower the level of "link is not ready" messages")
and as suggested by Mat [1], Stephen and David [2].

Link: https://lore.kernel.org/mptcp/614e76ac-184e-c553-af72-084f792e60b0@kernel.org/T/
Link: https://lore.kernel.org/netdev/68035bad-b53e-91cb-0e4a-007f27d62b05@tessares.net/T/
Suggested-by: Mat Martineau <martineau@kernel.org>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Suggested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylib: fix phy_read*_poll_timeout()

Dan Carpenter reported a signedness bug in genphy_loopback(). Andrew
reports that:

"It is common to get this wrong in general with PHY drivers. Dan
regularly posts fixes like this soon after a PHY driver patch it
merged. I really wish we could somehow get the compiler to warn when
the result from phy_read() is stored into a unsigned type. It would
save Dan a lot of work."

Let's make phy_read*_poll_timeout() immune to further issues when "val"
is an unsigned type by storing the read function's result in a signed
int as well as "val", and using the signed variable both to check for
an error and for propagating that error to the caller.

The advantage of this method is we don't change where the cast from
the signed return code to the user's variable occurs - so users will
see no change.

Previously Heiner changed phy_read_poll_timeout() to check for an error
before evaluating the user supplied condition, but didn't update
phy_read_mmd_poll_timeout(). Make that change there too.
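
The idea can be sketched outside the kernel as follows. This is a
simplified illustration, not the macro itself: fake_phy_read() and
fake_poll_step() are hypothetical, and a single poll step stands in for
the whole timeout loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register read: a 16-bit value or a negative errno. */
static int fake_phy_read(bool fail)
{
	return fail ? -EIO : 0x0004;
}

/*
 * One poll step. The caller's variable is unsigned, so a check such as
 * "*val < 0" could never fire; the signed 'ret' carries the error instead.
 */
static int fake_poll_step(bool fail, unsigned int *val)
{
	int ret = fake_phy_read(fail);

	assert(val != NULL);
	*val = ret;	/* same signed-to-unsigned conversion as before */
	if (ret < 0)
		return ret;
	return 0;
}
```

On failure the caller's unsigned variable holds a large positive number,
yet the error still reaches the caller through the signed return value.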

Link: https://lore.kernel.org/r/d7bb312e-2428-45f6-b9b3-59ba544e8b94@kili.mountain
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1q4kX6-00BNuM-Mx@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'tools-ynl-gen-dust-off-the-user-space-code'

Jakub Kicinski says:

====================
tools: ynl-gen: dust off the user space code

Every now and then I wish I had finished the user space part of
the netlink specs. Python scripts kind of stole the show, but
C is useful for selftests and stuff which needs to be fast.
Recently someone asked me how to access devlink and ethtool
from C++ which pushed me over the edge.

Fix things which bit rotted and finish notification handling.
This series contains code gen changes only. I'll follow up
with the fixed component, samples and docs as soon as it's
merged.
====================

Link: https://lore.kernel.org/r/20230602023548.463441-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: generate static descriptions of notifications

Notifications may come in at any time. The family must always be
ready to parse a random incoming notification. Generate a notification
table for parsing and tell YNL which request we're processing
to distinguish responses from notifications.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: switch to family struct

We'll want to store static info about the family soon.
Generate a struct. This changes creation from, e.g.:

	ys = ynl_sock_create("netdev", &yerr);
to:
	ys = ynl_sock_create(&ynl_netdev_family, &yerr);

on the user's side.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: generate alloc and free helpers for req

We expect the user to allocate requests with calloc().
Make things a bit more consistent and provide helpers.
Generate free calls, too.
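
As a rough sketch of what such helpers might look like, assuming a
made-up request type: struct fake_dev_get_req and the fake_* functions
below are hypothetical and are not the generated code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request struct; generated ones are named after the op. */
struct fake_dev_get_req {
	unsigned int ifindex;
	char *name;	/* owned by the request, may be NULL */
};

/* Allocate a zeroed request so unset members read as 0 / NULL. */
static struct fake_dev_get_req *fake_dev_get_req_alloc(void)
{
	return calloc(1, sizeof(struct fake_dev_get_req));
}

/* Copy 'name' into the request; returns 0, or -1 on allocation failure. */
static int fake_dev_get_req_set_name(struct fake_dev_get_req *req,
				     const char *name)
{
	size_t len;
	char *copy;

	assert(req != NULL && name != NULL);
	len = strlen(name) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, name, len);
	free(req->name);
	req->name = copy;
	return 0;
}

/* Release the request and everything it owns; NULL is accepted. */
static void fake_dev_get_req_free(struct fake_dev_get_req *req)
{
	if (!req)
		return;
	free(req->name);
	free(req);
}
```

Pairing the allocator with a matching free helper keeps ownership of
nested members in one place instead of in every caller.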

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: move the response reading logic into YNL

We generate send() and recv() calls and all msg handling for
each operation. It's a lot of repeated code and will only grow
with notification handling. Call back to a helper YNL lib instead.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: generate enum-to-string helpers

It's sometimes useful to print the name of an enum value,
flag or name of the op. Python can do it, add C helper
code gen for getting names of things.

Example:

  static const char * const netdev_xdp_act_strmap[] = {
	[0] = "basic",
	[1] = "redirect",
	[2] = "ndo-xmit",
	[3] = "xsk-zerocopy",
	[4] = "hw-offload",
	[5] = "rx-sg",
	[6] = "ndo-xmit-sg",
  };

  const char *netdev_xdp_act_str(enum netdev_xdp_act value)
  {
	value = ffs(value) - 1;
	if (value < 0 || value >= (int)MNL_ARRAY_SIZE(netdev_xdp_act_strmap))
		return NULL;
	return netdev_xdp_act_strmap[value];
  }

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: add error checking for nested structs

Parsing nested types may return an error; propagate it.
Not marking as a fix, because nothing uses YNL upstream.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: loosen type consistency check for events

Both event and notify types are always consistent. Rewrite
the condition checking if we can reuse reply types to be
less picky and let notify through.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: don't override pure nested struct

For pure structs (parsed nested attributes) we track what
forms of the struct exist in request and reply directions.
Make sure we don't overwrite the recorded struct each time,
otherwise the information is lost.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: fix unused / pad attribute handling

Unused and Pad attributes don't carry information.
Unused should never exist, and should be rejected.
Pad should be silently skipped.
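
The intended behaviour can be sketched with a toy parser. The enum and
fake_parse_attrs() are hypothetical and only model the rule; the
generator emits per-family parsing code that looks different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute kinds; real specs carry many more types. */
enum fake_attr_kind {
	FAKE_ATTR_VALUE,	/* carries data */
	FAKE_ATTR_PAD,		/* alignment only, no data */
	FAKE_ATTR_UNUSED,	/* reserved ID that must never appear */
};

/*
 * Walk the attributes of one message. Returns the number of data-carrying
 * attributes, or -1 if an unused attribute shows up.
 */
static int fake_parse_attrs(const enum fake_attr_kind *kinds, size_t n)
{
	int parsed = 0;
	size_t i;

	assert(kinds != NULL || n == 0);
	for (i = 0; i < n; i++) {
		switch (kinds[i]) {
		case FAKE_ATTR_PAD:
			continue;	/* silently skipped */
		case FAKE_ATTR_UNUSED:
			return -1;	/* rejected */
		case FAKE_ATTR_VALUE:
			parsed++;
			break;
		}
	}
	return parsed;
}
```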

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

tools: ynl-gen: add extra headers for user space

Make sure all relevant headers are included; we allocate memory,
use memcpy() and Linux types without including the headers.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Devcom, extend mlx5_devcom_send_event to work with more than two devices

mlx5_devcom_send_event is used to send an event from one eswitch to the
other. In other words, only one event is sent, which means no error
mechanism is needed.
However, in case devcom has more than two eswitches, a proper error
mechanism is needed. Hence, in case of error, devcom will perform the
error unwind, since devcom knows how many events were successful.
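
The unwind can be sketched in plain C as follows. This is a simplified
model under assumptions: struct fake_peer, the handlers and the event
values are hypothetical, and the real function works on devcom
components rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>

enum { FAKE_EVENT_PAIR = 1, FAKE_EVENT_UNPAIR = 2 };

/* Hypothetical peer with an event handler returning 0 or a negative error. */
struct fake_peer {
	int (*handler)(struct fake_peer *peer, int event);
	int state;
};

/* Example handler that tracks pairing state and always succeeds. */
static int fake_handler_ok(struct fake_peer *peer, int event)
{
	peer->state = (event == FAKE_EVENT_PAIR);
	return 0;
}

/* Example handler that always fails. */
static int fake_handler_fail(struct fake_peer *peer, int event)
{
	(void)peer;
	(void)event;
	return -1;
}

/*
 * Send 'event' to every peer. If one handler fails, send 'rollback_event'
 * to the peers that already succeeded, newest first, and return the error.
 */
static int fake_send_event(struct fake_peer *peers, size_t n,
			   int event, int rollback_event)
{
	size_t i;
	int err;

	assert(peers != NULL || n == 0);
	for (i = 0; i < n; i++) {
		err = peers[i].handler(&peers[i], event);
		if (err)
			goto rollback;
	}
	return 0;

rollback:
	/* 'i' is the failed peer; only peers 0..i-1 need undoing. */
	while (i--)
		peers[i].handler(&peers[i], rollback_event);
	return err;
}
```

Because the sender knows the index at which delivery failed, it can undo
exactly the deliveries that succeeded without any help from the peers.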

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Devcom, introduce devcom_for_each_peer_entry

Introduce generic APIs which retrieve all peers.
These APIs replace mlx5_devcom_get/release_peer_data, which retrieve
only a single peer.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-switch, mark devcom as not ready when all eswitches are unpaired

Whenever an eswitch is unpaired with another, the driver marks devcom
as not ready. While this is correct in case we are pairing only two
eswitches, in order to support pairing of more than two eswitches,
the driver needs to mark devcom as not ready only when all eswitches are
unpaired.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: Devcom, Rename paired to ready

In a downstream patch, devcom will provide support for more than two
devices. Rename the term 'paired' to 'ready' to convey a
more accurate meaning.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: DR, handle more than one peer domain

Currently, the DR domain assumes that each domain can only
have a single peer.
In order to support VF LAG of more than two ports, expand the peer domain
to use an array of peers, and align the code accordingly.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-switch, generalize shared FDB creation

Shared FDB creation is hard coded for only two eswitches.
Generalize shared FDB creation so that any number of eswitches can
create a shared FDB.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-switch, Handle multiple master egress rules

Currently, whenever a shared FDB is created, the slave eswitch
creates a master egress rule to the master eswitch.
In order to support more than two ports, which means there will be
more than one slave eswitch, enlarge bounce_rule, which is used to
create the master egress rule, to an xarray.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-switch, refactor FDB miss rule add/remove

Currently, the E-switch FDB has a single peer miss rule.
In order to support more than one peer, refactor the E-switch FDB to
have a peer miss rule per peer, and change the code to add/remove a
rule from a specific peer.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>

net/mlx5: E-switch, enlarge peer miss group table

There is an implicit assumption that the peer miss group table
needs to handle only a single peer.
Also, there is an assumption that total_vports of the master
is greater than or equal to the total_vports of each peer.
Change the code to support a peer miss group for more than one peer.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>