linux-2.6-microblaze.git
17 months ago sfc: use padding to fix alignment in loopback test
Edward Cree [Fri, 23 Jun 2023 18:38:04 +0000 (19:38 +0100)]
sfc: use padding to fix alignment in loopback test

Add two bytes of padding to the start of struct efx_loopback_payload,
which are not sent on the wire.  This ensures the 'ip' member is
4-byte aligned, preventing the following W=1 warning:
net/ethernet/sfc/selftest.c:46:15: error: field ip within 'struct efx_loopback_payload' is less aligned than 'struct iphdr' and is usually due to 'struct efx_loopback_payload' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
        struct iphdr ip;

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
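
The effect of the two pad bytes can be checked with offsetof().  The
following is a minimal userspace sketch, not the driver's actual
definition: the demo_* types are hypothetical stand-ins for the kernel's
ethhdr and iphdr, and the GCC/Clang attribute syntax is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 14-byte Ethernet header. */
struct demo_ethhdr {
	uint8_t  dst[6];
	uint8_t  src[6];
	uint16_t proto;
} __attribute__((packed));

/* Hypothetical stand-in for a 20-byte header that wants 4-byte alignment. */
struct demo_iphdr {
	uint32_t words[5];
};

/* Without padding, 'ip' lands at offset 14, which is only 2-byte aligned. */
struct demo_payload_unpadded {
	struct demo_ethhdr header;
	struct demo_iphdr  ip;
} __attribute__((packed));

/* With two leading pad bytes, 'ip' lands at offset 16. */
struct demo_payload_padded {
	char               pad[2];	/* never sent on the wire */
	struct demo_ethhdr header;
	struct demo_iphdr  ip;
} __attribute__((packed, aligned(4)));

static_assert(offsetof(struct demo_payload_padded, ip) % 4 == 0,
	      "ip must be 4-byte aligned");

/* Number of bytes that would go on the wire: everything after the pad. */
static size_t demo_wire_len(void)
{
	return sizeof(struct demo_payload_padded) -
	       offsetof(struct demo_payload_padded, header);
}
```

In this sketch, transmission would start at &payload.header and cover
demo_wire_len() bytes, so the padding only affects the in-memory layout.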
17 months ago Merge branch 'splice-net-switch-over-users-of-sendpage-and-remove-it'
Jakub Kicinski [Sat, 24 Jun 2023 22:50:21 +0000 (15:50 -0700)]
Merge branch 'splice-net-switch-over-users-of-sendpage-and-remove-it'

David Howells says:

====================
splice, net: Switch over users of sendpage() and remove it

Here's the final set of patches towards the removal of sendpage.  All the
drivers that use sendpage() get switched over to using sendmsg() with
MSG_SPLICE_PAGES.

The following changes are made:

 (1) Make the protocol drivers behave according to MSG_MORE, not
     MSG_SENDPAGE_NOTLAST.  The latter is restricted to turning on MSG_MORE
     in the sendpage() wrappers.

 (2) Fix ocfs2 to allocate its global protocol buffers with folio_alloc()
     rather than kzalloc() so as not to invoke the !sendpage_ok warning in
     skb_splice_from_iter().

 (3) Make ceph/rds, skb_send_sock, dlm, nvme, smc, ocfs2, drbd and iscsi
     use sendmsg(), not sendpage and make them specify MSG_MORE instead of
     MSG_SENDPAGE_NOTLAST.

 (4) Kill off sendpage and clean up MSG_SENDPAGE_NOTLAST.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6
Link: https://lore.kernel.org/r/20230616161301.622169-1-dhowells@redhat.com/
Link: https://lore.kernel.org/r/20230617121146.716077-1-dhowells@redhat.com/
Link: https://lore.kernel.org/r/20230620145338.1300897-1-dhowells@redhat.com/
====================

Link: https://lore.kernel.org/r/20230623225513.2732256-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
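
Step (1) of the series above can be pictured as a small flag translation
at the sendpage() wrapper boundary.  This is only a sketch: the DEMO_*
flag values and both function names are hypothetical, and clearing the
legacy flag after translation is an assumption of the sketch rather than
something the cover letter states.

```c
#include <assert.h>

/* Illustrative flag values only; the kernel's real values are not used. */
#define DEMO_MSG_MORE              0x1u
#define DEMO_MSG_SENDPAGE_NOTLAST  0x2u
#define DEMO_MSG_DONTWAIT          0x4u

static_assert((DEMO_MSG_MORE & DEMO_MSG_SENDPAGE_NOTLAST) == 0,
	      "demo flags must not overlap");

/*
 * Wrapper side: translate the legacy "not last" flag into MSG_MORE so
 * that nothing below the wrapper has to know about the legacy flag.
 */
static unsigned int demo_wrapper_flags(unsigned int flags)
{
	if (flags & DEMO_MSG_SENDPAGE_NOTLAST) {
		flags &= ~DEMO_MSG_SENDPAGE_NOTLAST;
		flags |= DEMO_MSG_MORE;
	}
	return flags;
}

/* Protocol side: only MSG_MORE decides whether more data is expected. */
static int demo_more_expected(unsigned int flags)
{
	return (flags & DEMO_MSG_MORE) != 0;
}
```

Once every caller passes MSG_MORE directly, the translation in the
wrapper has nothing left to do, which is why step (4) can remove the
legacy flag.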
17 months ago net: Kill MSG_SENDPAGE_NOTLAST
David Howells [Fri, 23 Jun 2023 22:55:13 +0000 (23:55 +0100)]
net: Kill MSG_SENDPAGE_NOTLAST

Now that ->sendpage() has been removed, MSG_SENDPAGE_NOTLAST can be cleaned
up.  Things were converted to use MSG_MORE instead, but the protocol
sendpage stubs still convert MSG_SENDPAGE_NOTLAST to MSG_MORE, which is now
unnecessary.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20230623225513.2732256-17-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
David Howells [Fri, 23 Jun 2023 22:55:12 +0000 (23:55 +0100)]
sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)

Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
multipage folios to be passed through.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago ocfs2: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
David Howells [Fri, 23 Jun 2023 22:55:11 +0000 (23:55 +0100)]
ocfs2: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()

Switch ocfs2 from using sendpage() to using sendmsg() + MSG_SPLICE_PAGES so
that sendpage can be phased out.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mark Fasheh <mark@fasheh.com>
cc: Joel Becker <jlbec@evilplan.org>
cc: Joseph Qi <joseph.qi@linux.alibaba.com>
cc: ocfs2-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-15-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago ocfs2: Fix use of slab data with sendpage
David Howells [Fri, 23 Jun 2023 22:55:10 +0000 (23:55 +0100)]
ocfs2: Fix use of slab data with sendpage

ocfs2 uses kzalloc() to allocate buffers for o2net_hand, o2net_keep_req and
o2net_keep_resp and then passes these to sendpage.  This isn't really
allowed as the lifetime of slab objects is not controlled by page ref -
though in this case it will probably work.  sendmsg() with MSG_SPLICE_PAGES
will, however, print a warning and give an error.

Fix it to use folio_alloc() instead to allocate a buffer for the handshake
message, keepalive request and reply messages.

Fixes: 98211489d414 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mark Fasheh <mark@fasheh.com>
cc: Kurt Hackel <kurt.hackel@oracle.com>
cc: Joel Becker <jlbec@evilplan.org>
cc: Joseph Qi <joseph.qi@linux.alibaba.com>
cc: ocfs2-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-14-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
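
The rule behind this fix can be modelled in a few lines.  The sketch
below is a toy userspace model, not kernel code: struct demo_page and
both functions are hypothetical, and the exact condition (not slab
backed, at least one reference held) is an assumption about the kind of
check the message refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy description of where a buffer's backing page came from. */
struct demo_page {
	bool from_slab;	/* allocated with a kzalloc()-style allocator */
	int  refcount;	/* page references currently held */
};

/* A page may be spliced only if its lifetime is governed by page refs. */
static bool demo_page_ok_to_splice(const struct demo_page *page)
{
	assert(page != NULL);
	return !page->from_slab && page->refcount >= 1;
}

/* Returns 0 if the page would be accepted, -1 if it would be refused. */
static int demo_splice(const struct demo_page *page)
{
	return demo_page_ok_to_splice(page) ? 0 : -1;
}
```

In this model a kzalloc()-style buffer is refused while a folio-backed
buffer with a reference held is accepted, which mirrors the reason for
switching the allocation.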
17 months ago scsi: target: iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
David Howells [Fri, 23 Jun 2023 22:55:09 +0000 (23:55 +0100)]
scsi: target: iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage

Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage.  This allows
multiple pages and multipage folios to be passed through.

TODO: iscsit_fe_sendpage_sg() should perhaps set up a bio_vec array that
covers the entire set of pages it's going to transfer, plus two entries for
the header and trailer (with page fragments to hold them), and then call
sendmsg once for the entire message.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mike Christie <michael.christie@oracle.com>
cc: Maurizio Lombardi <mlombard@redhat.com>
cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
cc: "Martin K. Petersen" <martin.petersen@oracle.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: open-iscsi@googlegroups.com
Link: https://lore.kernel.org/r/20230623225513.2732256-13-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago scsi: iscsi_tcp: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
David Howells [Fri, 23 Jun 2023 22:55:08 +0000 (23:55 +0100)]
scsi: iscsi_tcp: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage

Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage.  This allows
multiple pages and multipage folios to be passed through.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
cc: Lee Duncan <lduncan@suse.com>
cc: Chris Leech <cleech@redhat.com>
cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
cc: "Martin K. Petersen" <martin.petersen@oracle.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: open-iscsi@googlegroups.com
Link: https://lore.kernel.org/r/20230623225513.2732256-12-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago drbd: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
David Howells [Fri, 23 Jun 2023 22:55:07 +0000 (23:55 +0100)]
drbd: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()

Use sendmsg() conditionally with MSG_SPLICE_PAGES in _drbd_send_page()
rather than calling sendpage() or _drbd_no_send_page().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Philipp Reisner <philipp.reisner@linbit.com>
cc: Lars Ellenberg <lars.ellenberg@linbit.com>
cc: "Christoph Böhmwalder" <christoph.boehmwalder@linbit.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: drbd-dev@lists.linbit.com
Link: https://lore.kernel.org/r/20230623225513.2732256-11-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago smc: Drop smc_sendpage() in favour of smc_sendmsg() + MSG_SPLICE_PAGES
David Howells [Fri, 23 Jun 2023 22:55:06 +0000 (23:55 +0100)]
smc: Drop smc_sendpage() in favour of smc_sendmsg() + MSG_SPLICE_PAGES

Drop the smc_sendpage() code as smc_sendmsg() just passes the call down to
the underlying TCP socket and smc_tx_sendpage() is just a wrapper around
its sendmsg implementation.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Karsten Graul <kgraul@linux.ibm.com>
cc: Wenjia Zhang <wenjia@linux.ibm.com>
cc: Jan Karcher <jaka@linux.ibm.com>
cc: "D. Wythe" <alibuda@linux.alibaba.com>
cc: Tony Lu <tonylu@linux.alibaba.com>
cc: Wen Gu <guwen@linux.alibaba.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-10-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago nvmet-tcp: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
David Howells [Fri, 23 Jun 2023 22:55:05 +0000 (23:55 +0100)]
nvmet-tcp: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
copied instead of calling sendpage.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Willem de Bruijn <willemb@google.com>
cc: Keith Busch <kbusch@kernel.org>
cc: Jens Axboe <axboe@fb.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Chaitanya Kulkarni <kch@nvidia.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-nvme@lists.infradead.org
Link: https://lore.kernel.org/r/20230623225513.2732256-9-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago nvme-tcp: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
David Howells [Fri, 23 Jun 2023 22:55:04 +0000 (23:55 +0100)]
nvme-tcp: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage

When transmitting data, call down into TCP using a sendmsg with
MSG_SPLICE_PAGES instead of sendpage.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Willem de Bruijn <willemb@google.com>
cc: Keith Busch <kbusch@kernel.org>
cc: Jens Axboe <axboe@fb.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Chaitanya Kulkarni <kch@nvidia.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-nvme@lists.infradead.org
Link: https://lore.kernel.org/r/20230623225513.2732256-8-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
David Howells [Fri, 23 Jun 2023 22:55:03 +0000 (23:55 +0100)]
dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage

When transmitting data, call down a layer using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
using sendpage.  This allows ->sendpage() to be replaced by something that
can handle multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christine Caulfield <ccaulfie@redhat.com>
cc: David Teigland <teigland@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: cluster-devel@redhat.com
Link: https://lore.kernel.org/r/20230623225513.2732256-7-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
David Howells [Fri, 23 Jun 2023 22:55:02 +0000 (23:55 +0100)]
rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: rds-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-6-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
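
MSG_SPLICE_PAGES and bio_vec are kernel-internal, so they cannot be shown
from userspace.  The closest userspace analogue of "assemble the pieces
in an array, then make one sendmsg call" is a struct iovec array passed
to sendmsg(2).  The sketch below uses that analogue; demo_send_batched()
is a hypothetical helper and this is not the rds code.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send a header and a body with a single sendmsg() call by describing
 * both buffers in one iovec array, instead of issuing one send per piece.
 * Returns the number of bytes sent, or -1 on error.
 */
static ssize_t demo_send_batched(int fd, const void *hdr, size_t hdr_len,
				 const void *body, size_t body_len)
{
	struct iovec iov[2];
	struct msghdr msg;

	assert(hdr != NULL && body != NULL);

	iov[0].iov_base = (void *)hdr;
	iov[0].iov_len = hdr_len;
	iov[1].iov_base = (void *)body;
	iov[1].iov_len = body_len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = iov;
	msg.msg_iovlen = 2;

	return sendmsg(fd, &msg, 0);
}
```

A caller would typically hand this a connected stream socket; both
buffers then reach the transport in one call, which is the property the
kernel conversion is after.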
17 months ago ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
David Howells [Fri, 23 Jun 2023 22:55:01 +0000 (23:55 +0100)]
ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()

Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when
transmitting data.  For the moment, this can only transmit one page at a
time because of the architecture of net/ceph/, but if
write_partial_message_data() can be given a bvec[] at a time by the
iteration code, this would allow pages to be sent in a batch.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-5-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
David Howells [Fri, 23 Jun 2023 22:55:00 +0000 (23:55 +0100)]
ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage

Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when
transmitting data.  For the moment, this can only transmit one page at a
time because of the architecture of net/ceph/, but if
write_partial_message_data() can be given a bvec[] at a time by the
iteration code, this would allow pages to be sent in a batch.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-4-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock()
David Howells [Fri, 23 Jun 2023 22:54:59 +0000 (23:54 +0100)]
net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock()

Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage in
skb_send_sock().  This causes pages to be spliced from the source iterator
if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Note that this could perhaps be improved to fill out a bvec array with all
the frags and then make a single sendmsg call, possibly sticking the header
on the front also.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230623225513.2732256-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago tcp_bpf, smc, tls, espintcp, siw: Reduce MSG_SENDPAGE_NOTLAST usage
David Howells [Fri, 23 Jun 2023 22:54:58 +0000 (23:54 +0100)]
tcp_bpf, smc, tls, espintcp, siw: Reduce MSG_SENDPAGE_NOTLAST usage

As MSG_SENDPAGE_NOTLAST is being phased out along with sendpage(), don't
use it further in than the sendpage methods, but rather translate it to
MSG_MORE and use that instead.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: Bernard Metzler <bmt@zurich.ibm.com>
cc: Jason Gunthorpe <jgg@ziepe.ca>
cc: Leon Romanovsky <leon@kernel.org>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jakub Sitnicki <jakub@cloudflare.com>
cc: David Ahern <dsahern@kernel.org>
cc: Karsten Graul <kgraul@linux.ibm.com>
cc: Wenjia Zhang <wenjia@linux.ibm.com>
cc: Jan Karcher <jaka@linux.ibm.com>
cc: "D. Wythe" <alibuda@linux.alibaba.com>
cc: Tony Lu <tonylu@linux.alibaba.com>
cc: Wen Gu <guwen@linux.alibaba.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: Steffen Klassert <steffen.klassert@secunet.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/20230623225513.2732256-2-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago Merge tag 'mlx5-updates-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Sat, 24 Jun 2023 22:48:04 +0000 (15:48 -0700)]
Merge tag 'mlx5-updates-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-06-21

mlx5 driver minor cleanup and fixes to net-next

* tag 'mlx5-updates-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Remove pointless vport lookup from mlx5_esw_check_port_type()
  net/mlx5: Remove redundant check from mlx5_esw_query_vport_vhca_id()
  net/mlx5: Remove redundant is_mdev_switchdev_mode() check from is_ib_rep_supported()
  net/mlx5: Remove redundant MLX5_ESWITCH_MANAGER() check from is_ib_rep_supported()
  net/mlx5e: E-Switch, Fix shared fdb error flow
  net/mlx5e: Remove redundant comment
  net/mlx5e: E-Switch, Pass other_vport flag if vport is not 0
  net/mlx5e: E-Switch, Use xarray for devcom paired device index
  net/mlx5e: E-Switch, Add peer fdb miss rules for vport manager or ecpf
  net/mlx5e: Use vhca_id for device index in vport rx rules
  net/mlx5: Lag, Remove duplicate code checking lag is supported
  net/mlx5: Fix error code in mlx5_is_reset_now_capable()
  net/mlx5: Fix reserved at offset in hca_cap register
  net/mlx5: Fix SFs kernel documentation error
  net/mlx5: Fix UAF in mlx5_eswitch_cleanup()
====================

Link: https://lore.kernel.org/r/20230623192907.39033-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago Merge branch 'netlink-add-display-hint-to-ynl'
Jakub Kicinski [Sat, 24 Jun 2023 22:45:51 +0000 (15:45 -0700)]
Merge branch 'netlink-add-display-hint-to-ynl'

Donald Hunter says:

====================
netlink: add display-hint to ynl

Add a display-hint property to the netlink schema, to be used by generic
netlink clients as hints about how to display attribute values.

A display-hint on an attribute definition is intended for letting a
client such as ynl know that, for example, a u32 should be rendered as
an ipv4 address. The display-hint enumeration includes a small number of
networking domain-specific value types.
====================

Link: https://lore.kernel.org/r/20230623201928.14275-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
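
To illustrate what a display hint buys a client, here is a small sketch
of hint-driven rendering.  It is written in C for consistency with the
rest of this log, whereas ynl itself is a Python tool; the enum, the
function names and the output formats (dotted quad, colon-separated
hex) are assumptions of the sketch, not the schema's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of hints; the real schema defines its own list. */
enum demo_hint { DEMO_HINT_IPV4, DEMO_HINT_MAC };

/* Write bytes as two-digit hex values joined by 'sep'. */
static int demo_hex_join(const uint8_t *data, size_t len, char sep,
			 char *out, size_t out_len)
{
	size_t i, used = 0;

	for (i = 0; i < len; i++) {
		int n;

		if (i > 0)
			n = snprintf(out + used, out_len - used, "%c%02x",
				     sep, (unsigned int)data[i]);
		else
			n = snprintf(out + used, out_len - used, "%02x",
				     (unsigned int)data[i]);
		if (n < 0 || (size_t)n >= out_len - used)
			return -1;	/* output buffer too small */
		used += (size_t)n;
	}
	return 0;
}

/* Render raw attribute bytes according to the hint; 0 on success. */
static int demo_render(enum demo_hint hint, const uint8_t *data, size_t len,
		       char *out, size_t out_len)
{
	int n;

	assert(data != NULL && out != NULL && out_len > 0);

	switch (hint) {
	case DEMO_HINT_IPV4:
		if (len != 4)
			return -1;
		n = snprintf(out, out_len, "%u.%u.%u.%u",
			     (unsigned int)data[0], (unsigned int)data[1],
			     (unsigned int)data[2], (unsigned int)data[3]);
		return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
	case DEMO_HINT_MAC:
		if (len != 6)
			return -1;
		return demo_hex_join(data, len, ':', out, out_len);
	}
	return -1;
}
```

Without a hint, a generic client can only print the raw integer or byte
string; with one, the same four bytes come out as an address.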
17 months ago netlink: specs: add display hints to ovs_flow
Donald Hunter [Fri, 23 Jun 2023 20:19:28 +0000 (21:19 +0100)]
netlink: specs: add display hints to ovs_flow

Add display hints for mac, ipv4, ipv6, hex and uuid to the ovs_flow
schema.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230623201928.14275-4-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago tools: ynl: add display-hint support to ynl
Donald Hunter [Fri, 23 Jun 2023 20:19:27 +0000 (21:19 +0100)]
tools: ynl: add display-hint support to ynl

Add support to the ynl tool for rendering output based on display-hint
properties.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230623201928.14275-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago netlink: specs: add display-hint to schema definitions
Donald Hunter [Fri, 23 Jun 2023 20:19:26 +0000 (21:19 +0100)]
netlink: specs: add display-hint to schema definitions

Add a display-hint property to the netlink schema that is for providing
optional hints to generic netlink clients about how to display attribute
values. A display-hint on an attribute definition is intended for
letting a client such as ynl know that, for example, a u32 should be
rendered as an ipv4 address. The display-hint enumeration includes a
small number of networking domain-specific value types.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230623201928.14275-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago Merge tag 'ieee802154-for-net-next-2023-06-23' of gitolite.kernel.org:pub/scm/linux...
Jakub Kicinski [Sat, 24 Jun 2023 22:41:46 +0000 (15:41 -0700)]
Merge tag 'ieee802154-for-net-next-2023-06-23' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next

Miquel Raynal says:

====================
Core WPAN changes:
 - Support for active scans
 - Support for answering BEACON_REQ
 - Specific MLME handling for limited devices

WPAN driver changes:
 - ca8210:
   - Flag the devices as limited
   - Remove stray gpiod_unexport() call

* tag 'ieee802154-for-net-next-2023-06-23' of gitolite.kernel.org:pub/scm/linux/kernel/git/wpan/wpan-next:
  ieee802154: ca8210: Remove stray gpiod_unexport() call
  ieee802154: ca8210: Flag the driver as being limited
  net: ieee802154: Handle limited devices with only datagram support
  mac802154: Handle received BEACON_REQ
  ieee802154: Add support for allowing to answer BEACON_REQ
  mac802154: Handle active scanning
  ieee802154: Add support for user active scan requests
====================

Link: https://lore.kernel.org/r/20230623195506.40b87b5f@xps-13
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago Merge branch 'selftests-mptcp-refactoring-and-minor-fixes'
Jakub Kicinski [Sat, 24 Jun 2023 22:38:02 +0000 (15:38 -0700)]
Merge branch 'selftests-mptcp-refactoring-and-minor-fixes'

Mat Martineau says:

====================
selftests: mptcp: Refactoring and minor fixes

Patch 1 moves code around for clarity and improved code reuse.

Patch 2 makes use of new MPTCP info that consolidates MPTCP-level and
subflow-level information.

Patches 3-7 refactor code to favor limited-scope environment vars over
optional parameters.

Patch 8: typo fix
====================

Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-0-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: connect: fix comment typo
Yueh-Shun Li [Fri, 23 Jun 2023 17:34:14 +0000 (10:34 -0700)]
selftests: mptcp: connect: fix comment typo

Spell "transmissions" properly.

Found by searching for keyword "tranm".

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Yueh-Shun Li <shamrocklee@posteo.net>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-8-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: add pm_nl_set_endpoint helper
Geliang Tang [Fri, 23 Jun 2023 17:34:13 +0000 (10:34 -0700)]
selftests: mptcp: add pm_nl_set_endpoint helper

This patch moves the endpoint settings out of do_transfer() into a new
helper, pm_nl_set_endpoint(), and invokes this helper from do_transfer().
This makes the code much clearer.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-7-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: drop sflags parameter
Geliang Tang [Fri, 23 Jun 2023 17:34:12 +0000 (10:34 -0700)]
selftests: mptcp: drop sflags parameter

run_tests() accepts too many optional parameters. Before this modification,
all of them had to be set even when only the last one needed to be
changed. All these 0s are hard to read and make maintenance
harder:

      run_tests $ns1 $ns2 10.0.1.1 1 2 3 slow

Instead, the parameter can be set as an env var with a limited scope:

      foo=1 bar=2 next=3 \
            run_tests $ns1 $ns2 10.0.1.1 slow

This patch switches to key/value "sflags=*" instead of positional parameter
sflags of do_transfer() and run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-6-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: drop addr_nr_ns1/2 parameters
Geliang Tang [Fri, 23 Jun 2023 17:34:11 +0000 (10:34 -0700)]
selftests: mptcp: drop addr_nr_ns1/2 parameters

run_tests() accepts too many optional parameters. Before this modification,
all of them had to be set even when only the last one needed to be
changed. All these 0s are hard to read and make maintenance
harder:

      run_tests $ns1 $ns2 10.0.1.1 1 2 3 slow

Instead, the parameter can be set as an env var with a limited scope:

      foo=1 bar=2 next=3 \
            run_tests $ns1 $ns2 10.0.1.1 slow

This patch switches to key/value "addr_nr_ns1=*, addr_nr_ns2=*" instead
of positional parameters addr_nr_ns1 and addr_nr_ns2 of do_transfer()
and run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-5-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: drop test_linkfail parameter
Geliang Tang [Fri, 23 Jun 2023 17:34:10 +0000 (10:34 -0700)]
selftests: mptcp: drop test_linkfail parameter

run_tests() accepts too many optional parameters. Before this modification,
all of them had to be set even when only the last one needed to be
changed. All these 0s are hard to read and make maintenance
harder:

      run_tests $ns1 $ns2 10.0.1.1 1 2 3 slow

Instead, the parameter can be set as an env var with a limited scope:

      foo=1 bar=2 next=3 \
            run_tests $ns1 $ns2 10.0.1.1 slow

This patch switches to key/value "test_linkfail=*" instead of positional
parameter test_linkfail of do_transfer() and run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-4-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: set FAILING_LINKS in run_tests
Geliang Tang [Fri, 23 Jun 2023 17:34:09 +0000 (10:34 -0700)]
selftests: mptcp: set FAILING_LINKS in run_tests

Set FAILING_LINKS as an env var with a limited scope only when calling
run_tests().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-3-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: check subflow and addr infos
Geliang Tang [Fri, 23 Jun 2023 17:34:08 +0000 (10:34 -0700)]
selftests: mptcp: check subflow and addr infos

New MPTCP info fields are now checked in multiple places to improve code
coverage when using the userspace PM.

This patch makes chk_mptcp_info() more generic so that it can check the
subflows, add_addr_signal and add_addr_accepted info (and more later).
New arguments are now required to get different info from the two
namespaces, because some counters are specific to the client or the
server.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-2-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago selftests: mptcp: test userspace pm out of transfer
Geliang Tang [Fri, 23 Jun 2023 17:34:07 +0000 (10:34 -0700)]
selftests: mptcp: test userspace pm out of transfer

This patch moves the userspace pm tests out of do_transfer(). The "add
address" test moves into a new function, userspace_pm_add_addr(), and the
"remove address" test into userspace_pm_rm_sf_addr_ns1(). The "add
subflow" test moves into userspace_pm_add_sf() and the "remove subflow"
test into userspace_pm_rm_sf_addr_ns2().

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-1-a883213c8ba9@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago Merge branch 'net-stmmac-introduce-devres-helpers-for-stmmac-platform-drivers'
Jakub Kicinski [Sat, 24 Jun 2023 22:36:05 +0000 (15:36 -0700)]
Merge branch 'net-stmmac-introduce-devres-helpers-for-stmmac-platform-drivers'

Bartosz Golaszewski says:

====================
net: stmmac: introduce devres helpers for stmmac platform drivers

The goal of this series is two-fold: to make the API for stmmac platforms more
logically correct (by providing functions that acquire resources with release
counterparts that undo only their actions and nothing more) and to provide
devres variants of commonly used registration functions that allow the
platform drivers to be significantly simplified.

The current pattern for stmmac platform drivers is to call
stmmac_probe_config_dt(), possibly the platform's init() callback and then
call stmmac_drv_probe(). The resources allocated by these calls will then
be released by calling stmmac_pltfr_remove(). This goes against the commonly
accepted way of pairing each function that allocates a resource with a
function that frees it.

First: provide wrappers around platform's init() and exit() callbacks that
spare users from manually checking whether the callbacks exist.

Second: provide stmmac_pltfr_probe() which calls the platform init() callback
and then calls stmmac_drv_probe() together with a variant of
stmmac_pltfr_remove() that DOES NOT call stmmac_remove_config_dt(). For now
this variant is called stmmac_pltfr_remove_no_dt() but once all users of
the old stmmac_pltfr_remove() are converted to the devres helper, it will be
renamed back to stmmac_pltfr_remove() and the no_dt function removed.

Finally, use the devres helpers in dwmac-qcom-ethqos to show how much simpler
the driver's probe() becomes.

This series obviously just starts the conversion process and other platform
drivers will need to be converted once the helpers land in net/.
====================

Link: https://lore.kernel.org/r/20230623100417.93592-1-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
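
The devres idea the series relies on is that each acquire step registers
its own undo step with the device, and detach replays those steps in
reverse.  The following is a minimal userspace sketch of that pattern
only: every demo_* name is hypothetical, and it does not reproduce the
kernel's devres API or the stmmac helpers.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_ACTIONS 8

typedef void (*demo_action_fn)(void *data);

struct demo_action {
	demo_action_fn fn;
	void *data;
};

struct demo_device {
	struct demo_action actions[DEMO_MAX_ACTIONS];
	size_t nr_actions;
};

/* Register a release action that runs automatically at detach. */
static int demo_devm_add_action(struct demo_device *dev, demo_action_fn fn,
				void *data)
{
	assert(fn != NULL);
	if (dev->nr_actions == DEMO_MAX_ACTIONS)
		return -1;
	dev->actions[dev->nr_actions].fn = fn;
	dev->actions[dev->nr_actions].data = data;
	dev->nr_actions++;
	return 0;
}

/* Run every registered action, most recently registered first. */
static void demo_device_detach(struct demo_device *dev)
{
	while (dev->nr_actions > 0) {
		dev->nr_actions--;
		dev->actions[dev->nr_actions].fn(dev->actions[dev->nr_actions].data);
	}
}

/* Release log so the order of the undo steps can be inspected. */
static char demo_log[DEMO_MAX_ACTIONS + 1];
static size_t demo_log_len;

static void demo_release_tag(void *data)
{
	if (demo_log_len < DEMO_MAX_ACTIONS) {
		demo_log[demo_log_len++] = *(const char *)data;
		demo_log[demo_log_len] = '\0';
	}
}

/* A probe with two acquire steps: 'c' (config) first, then 'p' (probe). */
static int demo_probe(struct demo_device *dev)
{
	static const char config_tag = 'c', probe_tag = 'p';

	if (demo_devm_add_action(dev, demo_release_tag, (void *)&config_tag))
		return -1;
	return demo_devm_add_action(dev, demo_release_tag, (void *)&probe_tag);
}
```

Because each step carries its own undo action, the probe function in
this sketch needs no goto-based error unwinding and no remove callback,
which is the simplification the series describes for the platform
drivers.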
17 months ago net: stmmac: dwmac-qcom-ethqos: use devm_stmmac_pltfr_probe()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:17 +0000 (12:04 +0200)]
net: stmmac: dwmac-qcom-ethqos: use devm_stmmac_pltfr_probe()

Use the devres variant of stmmac_pltfr_probe() and finally drop the
remove() callback entirely.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-12-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago net: stmmac: platform: provide devm_stmmac_pltfr_probe()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:16 +0000 (12:04 +0200)]
net: stmmac: platform: provide devm_stmmac_pltfr_probe()

Provide a devres variant of stmmac_pltfr_probe() which allows users to
skip calling stmmac_pltfr_remove() at driver detach.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-11-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months ago net: stmmac: dwmac-qco-ethqos: use devm_stmmac_probe_config_dt()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:15 +0000 (12:04 +0200)]
net: stmmac: dwmac-qco-ethqos: use devm_stmmac_probe_config_dt()

Significantly simplify the driver's probe() function by using the devres
variant of stmmac_probe_config_dt(). This allows dropping the goto jumps
entirely.

The remove_new() callback now needs to be switched to
stmmac_pltfr_remove_no_dt().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-10-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: platform: provide devm_stmmac_probe_config_dt()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:14 +0000 (12:04 +0200)]
net: stmmac: platform: provide devm_stmmac_probe_config_dt()

Provide a devres variant of stmmac_probe_config_dt() that allows users to
skip calling stmmac_remove_config_dt() at driver detach.
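
The devres idea behind this helper can be illustrated with a small userspace sketch. It assumes a hypothetical miniature "device" that records cleanup actions and runs them in reverse registration order on detach. All names here (mini_dev, mini_add_action(), mini_release_all(), record_release()) are invented for illustration and are not part of the stmmac or devres API.

```c
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*release_fn)(void *data);

/* Hypothetical miniature "device" that remembers cleanup actions. */
struct mini_dev {
	release_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

/* Register a cleanup action; returns 0 on success, -1 if the list is full. */
static int mini_add_action(struct mini_dev *dev, release_fn fn, void *data)
{
	if (dev->count == MAX_ACTIONS)
		return -1;
	dev->fn[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

/* Run all registered actions in reverse order, as on driver detach. */
static void mini_release_all(struct mini_dev *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->fn[dev->count](dev->data[dev->count]);
	}
}

/* Example resource: records the order in which releases happen. */
struct order_log {
	int seq[MAX_ACTIONS];
	size_t n;
};

struct tagged {
	struct order_log *log;
	int id;
};

static void record_release(void *data)
{
	struct tagged *t = data;

	t->log->seq[t->log->n++] = t->id;
}
```

With this shape, a probe function only registers what it acquires, and no explicit remove path has to mirror it.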

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-9-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: platform: provide stmmac_pltfr_remove_no_dt()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:13 +0000 (12:04 +0200)]
net: stmmac: platform: provide stmmac_pltfr_remove_no_dt()

Add a variant of stmmac_pltfr_remove() that only frees resources
allocated by stmmac_pltfr_probe() and - unlike stmmac_pltfr_remove() -
does not call stmmac_remove_config_dt().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-8-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: dwmac-generic: use stmmac_pltfr_probe()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:12 +0000 (12:04 +0200)]
net: stmmac: dwmac-generic: use stmmac_pltfr_probe()

Shrink the code and remove labels by using the new stmmac_pltfr_probe()
function.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-7-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: platform: provide stmmac_pltfr_probe()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:11 +0000 (12:04 +0200)]
net: stmmac: platform: provide stmmac_pltfr_probe()

Implement stmmac_pltfr_probe() which is the logical API counterpart
for stmmac_pltfr_remove(). It calls the platform's init() callback and
then probes the stmmac device.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-6-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: dwmac-generic: use stmmac_pltfr_exit()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:10 +0000 (12:04 +0200)]
net: stmmac: dwmac-generic: use stmmac_pltfr_exit()

Shrink the code in dwmac-generic by using the new stmmac_pltfr_exit()
helper.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-5-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: platform: provide stmmac_pltfr_exit()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:09 +0000 (12:04 +0200)]
net: stmmac: platform: provide stmmac_pltfr_exit()

Provide a helper wrapper around calling the platform's exit() callback.
This allows users to skip checking if the callback exists.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-4-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: dwmac-generic: use stmmac_pltfr_init()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:08 +0000 (12:04 +0200)]
net: stmmac: dwmac-generic: use stmmac_pltfr_init()

Shrink the code in dwmac-generic by using the new stmmac_pltfr_init()
helper.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-3-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: stmmac: platform: provide stmmac_pltfr_init()
Bartosz Golaszewski [Fri, 23 Jun 2023 10:04:07 +0000 (12:04 +0200)]
net: stmmac: platform: provide stmmac_pltfr_init()

Provide a helper wrapper around calling the platform's init() callback.
This allows users to skip checking if the callback exists.
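
A minimal userspace sketch of such a wrapper is shown here. It assumes a hypothetical plat_data structure with optional init() and exit() callbacks. The structure, the plat_init()/plat_exit() names and the sample callbacks are illustrative only and do not reproduce the driver's actual definitions. The exit() counterpart from the neighbouring commit follows the same pattern.

```c
#include <stddef.h>

/* Hypothetical stand-in for the platform data (assumption, not the
 * driver's real structure). */
struct plat_data {
	int (*init)(void *priv);   /* optional, may be NULL */
	void (*exit)(void *priv);  /* optional, may be NULL */
	void *priv;
};

/* Call init() if the platform provided one; a missing callback counts
 * as success. */
static int plat_init(const struct plat_data *plat)
{
	if (!plat->init)
		return 0;
	return plat->init(plat->priv);
}

/* Call exit() if the platform provided one; otherwise do nothing. */
static void plat_exit(const struct plat_data *plat)
{
	if (plat->exit)
		plat->exit(plat->priv);
}

/* Sample callbacks used to exercise the wrappers. */
static int sample_init(void *priv)
{
	*(int *)priv += 1;
	return 0;
}

static void sample_exit(void *priv)
{
	*(int *)priv -= 1;
}
```

Callers can then invoke plat_init() and plat_exit() unconditionally instead of repeating the NULL check at every call site.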

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Link: https://lore.kernel.org/r/20230623100417.93592-2-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Sat, 24 Jun 2023 22:32:18 +0000 (15:32 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-06-22 (ice)

This series contains updates to ice driver only.

Jake adds a slight wait on control queue send to reduce wait time for
responses that occur within normal times.

Maciej allows for hot-swapping XDP programs.

Przemek removes unnecessary checks when enabling SR-IOV and freeing
allocated memory.

Christophe Jaillet converts a managed memory allocation to a regular one.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  ice: use ice_down_up() where applicable
  ice: Remove managed memory usage in ice_get_fw_log_cfg()
  ice: remove null checks before devm_kfree() calls
  ice: clean up freeing SR-IOV VFs
  ice: allow hot-swapping XDP programs
  ice: reduce initial wait for control queue messages
====================

Link: https://lore.kernel.org/r/20230622183601.2406499-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet/tcp: optimise locking for blocking splice
Pavel Begunkov [Fri, 23 Jun 2023 12:38:55 +0000 (13:38 +0100)]
net/tcp: optimise locking for blocking splice

Even when tcp_splice_read() reads all it was asked for, for blocking
sockets it'll release and immediately regrab the socket lock, loop
around and break on the while check.

Check tss.len right after we adjust it, and return if we're done.
That saves us one release_sock(); lock_sock(); pair per successful
blocking splice read.
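
The effect can be modelled in userspace with a counter in place of the socket lock. This is only a sketch under stated assumptions: fake_sock, release_and_relock() and splice_read() are hypothetical names, and the loop is a simplified model, not the body of tcp_splice_read().

```c
#include <stddef.h>

/* Hypothetical socket model that only counts release+regrab pairs. */
struct fake_sock {
	int lock_cycles;
};

static void release_and_relock(struct fake_sock *sk)
{
	sk->lock_cycles++;
}

/*
 * Read len bytes in chunks of at most chunk bytes. With early_exit set,
 * the remaining length is checked right after it is adjusted, so the
 * final release+regrab pair is skipped.
 */
static size_t splice_read(struct fake_sock *sk, size_t len, size_t chunk,
			  int early_exit)
{
	size_t done = 0;

	while (len) {
		size_t n = len < chunk ? len : chunk;

		done += n;
		len -= n;
		if (early_exit && !len)
			break;
		release_and_relock(sk);
	}
	return done;
}
```

For a 10-byte read in 4-byte chunks, the model performs three lock cycles without the early check and two with it.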

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/80736a2cc6d478c383ea565ba825eaf4d1abd876.1687523671.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoaf_unix: Call scm_recv() only after scm_set_cred().
Kuniyuki Iwashima [Thu, 22 Jun 2023 18:43:51 +0000 (11:43 -0700)]
af_unix: Call scm_recv() only after scm_set_cred().

syzkaller hit a WARN_ON_ONCE(!scm->pid) in scm_pidfd_recv().

In unix_stream_read_generic(), if there is no skb in the queue, we could
bail out of the do-while loop without calling scm_set_cred():

  1. No skb in the queue
  2. sk is non-blocking
       or
     shutdown(sk, RCV_SHUTDOWN) is called concurrently
       or
     peer calls close()

If the socket is configured with SO_PASSCRED or SO_PASSPIDFD, scm_recv()
would populate cmsg with garbage.

Let's not call scm_recv() unless there is an skb to receive.

WARNING: CPU: 1 PID: 3245 at include/net/scm.h:138 scm_pidfd_recv include/net/scm.h:138 [inline]
WARNING: CPU: 1 PID: 3245 at include/net/scm.h:138 scm_recv.constprop.0+0x754/0x850 include/net/scm.h:177
Modules linked in:
CPU: 1 PID: 3245 Comm: syz-executor.1 Not tainted 6.4.0-rc5-01219-gfa0e21fa4443 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:scm_pidfd_recv include/net/scm.h:138 [inline]
RIP: 0010:scm_recv.constprop.0+0x754/0x850 include/net/scm.h:177
Code: 67 fd e9 55 fd ff ff e8 4a 70 67 fd e9 7f fd ff ff e8 40 70 67 fd e9 3e fb ff ff e8 36 70 67 fd e9 02 fd ff ff e8 8c 3a 20 fd <0f> 0b e9 fe fb ff ff e8 50 70 67 fd e9 2e f9 ff ff e8 46 70 67 fd
RSP: 0018:ffffc90009af7660 EFLAGS: 00010216
RAX: 00000000000000a1 RBX: ffff888041e58a80 RCX: ffffc90003852000
RDX: 0000000000040000 RSI: ffffffff842675b4 RDI: 0000000000000007
RBP: ffffc90009af7810 R08: 0000000000000007 R09: 0000000000000013
R10: 00000000000000f8 R11: 0000000000000001 R12: ffffc90009af7db0
R13: 0000000000000000 R14: ffff888041e58a88 R15: 1ffff9200135eecc
FS:  00007f6b7113f640(0000) GS:ffff88806cf00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6b7111de38 CR3: 0000000012a6e002 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
 <TASK>
 unix_stream_read_generic+0x5fe/0x1f50 net/unix/af_unix.c:2830
 unix_stream_recvmsg+0x194/0x1c0 net/unix/af_unix.c:2880
 sock_recvmsg_nosec net/socket.c:1019 [inline]
 sock_recvmsg+0x188/0x1d0 net/socket.c:1040
 ____sys_recvmsg+0x210/0x610 net/socket.c:2712
 ___sys_recvmsg+0xff/0x190 net/socket.c:2754
 do_recvmmsg+0x25d/0x6c0 net/socket.c:2848
 __sys_recvmmsg net/socket.c:2927 [inline]
 __do_sys_recvmmsg net/socket.c:2950 [inline]
 __se_sys_recvmmsg net/socket.c:2943 [inline]
 __x64_sys_recvmmsg+0x224/0x290 net/socket.c:2943
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f6b71da2e5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007f6b7113ecc8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00000000004bc050 RCX: 00007f6b71da2e5d
RDX: 0000000000000007 RSI: 0000000020006600 RDI: 000000000000000b
RBP: 00000000004bc050 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000120 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f6b71e03530 R15: 0000000000000000
 </TASK>

Fixes: 5e2ff6704a27 ("scm: add SO_PASSPIDFD and SCM_PIDFD")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Alexander Mikhalitsyn <alexander@mihalicyn.com>
Link: https://lore.kernel.org/r/20230622184351.91544-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
Jakub Kicinski [Sat, 24 Jun 2023 22:12:05 +0000 (15:12 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-06-22 (iavf)

This series contains updates to iavf driver only.

Przemek defers removing the previous primary MAC address until after
getting the result of adding its replacement. He also does some cleanup by
removing unused functions and making applicable functions static.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  iavf: make functions static where possible
  iavf: remove some unused functions and pointless wrappers
  iavf: fix err handling for MAC replace
====================

Link: https://lore.kernel.org/r/20230622165914.2203081-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agorevert "s390/net: lcs: use IS_ENABLED() for kconfig detection"
Randy Dunlap [Thu, 22 Jun 2023 15:54:09 +0000 (08:54 -0700)]
revert "s390/net: lcs: use IS_ENABLED() for kconfig detection"

The referenced patch is causing build errors when ETHERNET=y and
FDDI=m. While we work out the preferred patch(es), revert this patch
to make the pain go away.

Fixes: 128272336120 ("s390/net: lcs: use IS_ENABLED() for kconfig detection")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/r/202306202129.pl0AqK8G-lkp@intel.com
Cc: Alexandra Winter <wintera@linux.ibm.com>
Cc: Wenjia Zhang <wenjia@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230622155409.27311-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: phy: broadcom: drop brcm_phy_setbits() and use phy_set_bits() instead
Giulio Benetti [Thu, 22 Jun 2023 18:47:21 +0000 (20:47 +0200)]
net: phy: broadcom: drop brcm_phy_setbits() and use phy_set_bits() instead

Linux provides phy_set_bits() helper so let's drop brcm_phy_setbits() and
use phy_set_bits() in its place.

Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230622184721.24368-1-giulio.benetti@benettiengineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf...
Jakub Kicinski [Sat, 24 Jun 2023 21:52:28 +0000 (14:52 -0700)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-06-23

We've added 49 non-merge commits during the last 24 day(s) which contain
a total of 70 files changed, 1935 insertions(+), 442 deletions(-).

The main changes are:

1) Extend bpf_fib_lookup helper to allow passing the route table ID,
   from Louis DeLosSantos.

2) Fix regsafe() in verifier to call check_ids() for scalar registers,
   from Eduard Zingerman.

3) Extend the set of cpumask kfuncs with bpf_cpumask_first_and()
   and a rework of bpf_cpumask_any*() kfuncs. Additionally,
   add selftests, from David Vernet.

4) Fix socket lookup BPF helpers for tc/XDP to respect VRF bindings,
   from Gilad Sever.

5) Change bpf_link_put() to use workqueue unconditionally to fix it
   under PREEMPT_RT, from Sebastian Andrzej Siewior.

6) Follow-ups to address issues in the bpf_refcount shared ownership
   implementation, from Dave Marchevsky.

7) A few general refactorings to BPF map and program creation permissions
   checks which were part of the BPF token series, from Andrii Nakryiko.

8) Various fixes for benchmark framework and add a new benchmark
   for BPF memory allocator to BPF selftests, from Hou Tao.

9) Documentation improvements around iterators and trusted pointers,
   from Anton Protopopov.

10) Small cleanup in verifier to improve allocated object check,
    from Daniel T. Lee.

11) Improve performance of bpf_xdp_pointer() by avoiding access
    to shared_info when XDP packet does not have frags,
    from Jesper Dangaard Brouer.

12) Silence a harmless syzbot-reported warning in btf_type_id_size(),
    from Yonghong Song.

13) Remove duplicate bpfilter_umh_cleanup in favor of umd_cleanup_helper,
    from Jarkko Sakkinen.

14) Fix BPF selftests build for resolve_btfids under custom HOSTCFLAGS,
    from Viktor Malik.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits)
  bpf, docs: Document existing macros instead of deprecated
  bpf, docs: BPF Iterator Document
  selftests/bpf: Fix compilation failure for prog vrf_socket_lookup
  selftests/bpf: Add vrf_socket_lookup tests
  bpf: Fix bpf socket lookup from tc/xdp to respect socket VRF bindings
  bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC hookpoint
  bpf: Factor out socket lookup functions for the TC hookpoint.
  selftests/bpf: Set the default value of consumer_cnt as 0
  selftests/bpf: Ensure that next_cpu() returns a valid CPU number
  selftests/bpf: Output the correct error code for pthread APIs
  selftests/bpf: Use producer_cnt to allocate local counter array
  xsk: Remove unused inline function xsk_buff_discard()
  bpf: Keep BPF_PROG_LOAD permission checks clear of validations
  bpf: Centralize permissions checks for all BPF map types
  bpf: Inline map creation logic in map_create() function
  bpf: Move unprivileged checks into map_create() and bpf_prog_load()
  bpf: Remove in_atomic() from bpf_link_put().
  selftests/bpf: Verify that check_ids() is used for scalars in regsafe()
  bpf: Verify scalar ids mapping in regsafe() using check_ids()
  selftests/bpf: Check if mark_chain_precision() follows scalar ids
  ...
====================

Link: https://lore.kernel.org/r/20230623211256.8409-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge branch 'mlxsw-maintain-candidate-rifs'
Jakub Kicinski [Sat, 24 Jun 2023 01:59:16 +0000 (18:59 -0700)]
Merge branch 'mlxsw-maintain-candidate-rifs'

Petr Machata says:

====================
mlxsw: Maintain candidate RIFs

The mlxsw driver currently makes the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices need to be added to
the bridge before IP addresses are configured on that bridge or SVI added
on top of it. Enslaving a netdevice to another netdevice that already has
uppers is in fact forbidden by mlxsw for this reason. Despite this safety,
it is rather easy to get into situations where the offloaded configuration
is just plain wrong.

As an example, take a front panel port, configure an IP address: it gets a
RIF. Now enslave the port to the bridge, and the RIF is gone. Remove the
port from the bridge again, but the RIF never comes back. There are a number
of similar situations, where changing the configuration there and back
utterly breaks the offload.

The situation is going to be made better by implementing a range of replays
and post-hoc offloads.

This patch set lays the ground for replay of next hops. The particular
issue that it deals with is that currently, driver-specific bookkeeping for
next hops is hooked off RIF objects, which come and go across the lifetime
of a netdevice. We would rather keep these objects at an entity that
mirrors the lifetime of the netdevice itself. That way they are at hand and
can be offloaded when a RIF is eventually created.

To that end, with this patchset, mlxsw keeps a hash table of CRIFs:
candidate RIFs, persistent handles for netdevices that mlxsw deems
potentially interesting. The lifetime of a CRIF matches that of the
underlying netdevice, and thus a RIF can always assume a CRIF exists. A
CRIF is where next hops are kept, and when RIF is created, these next hops
can be easily offloaded. (Previously only the next hops created after the
RIF was created were offloaded.)

- Patches #1 and #2 are minor adjustments.
- In patches #3 and #4, add CRIF bookkeeping.
- In patch #5, link CRIFs to RIFs such that given a netdevice-backed RIF,
  the corresponding CRIF is easy to look up.
- Patch #6 is a clean-up allowed by the previous patches
- Patches #7 and #8 move next hop tracking to CRIFs

No observable effects are intended as of yet. This will be useful once
there is support for RIF creation for netdevices that become mlxsw uppers,
which will come in following patch sets.
====================

Link: https://lore.kernel.org/r/cover.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Track next hops at CRIFs
Petr Machata [Thu, 22 Jun 2023 13:33:09 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Track next hops at CRIFs

Move the list of next hops from struct mlxsw_sp_rif to mlxsw_sp_crif. The
reason is that eventually, next hops for mlxsw uppers should be offloaded
and unoffloaded on demand as a netdevice becomes an upper, or stops being
one. Currently, next hops are tracked at RIFs, but RIFs do not exist when a
netdevice is not an mlxsw upper. CRIFs are kept track of throughout the
netdevice lifetime.

Correspondingly, track at each next hop not its RIF, but its CRIF (from
which a RIF can always be deduced).

Note that now that next hops are tracked at a CRIF, it is not necessary to
move each over to a new RIF when it is necessary to edit a RIF. Therefore
drop mlxsw_sp_nexthop_rif_migrate() and have mlxsw_sp_rif_migrate_destroy()
call mlxsw_sp_nexthop_rif_update() directly.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/e7c1c0a7dd13883b0f09aeda12c4fcf4d63a70e3.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Split nexthop finalization to two stages
Petr Machata [Thu, 22 Jun 2023 13:33:08 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Split nexthop finalization to two stages

Nexthop finalization consists of two steps: the part where the offload is
removed, because the backing RIF is now gone; and the part where the
association to the RIF is severed.

Extract from mlxsw_sp_nexthop_type_fini() a helper that covers the
unoffloading part, mlxsw_sp_nexthop_type_rif_gone(), so that it can later
be called independently.

Note that this swaps around the ordering of mlxsw_sp_nexthop_ipip_fini()
vs. mlxsw_sp_nexthop_rif_fini(). The current ordering is more of a
historical happenstance than a conscious decision. The two cleanups do not
depend on each other, and this change should have no observable effects.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/7134559534c5f5c4807c3a1569fae56f8887e763.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Use router.lb_crif instead of .lb_rif_index
Petr Machata [Thu, 22 Jun 2023 13:33:07 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Use router.lb_crif instead of .lb_rif_index

A previous patch added a pointer to loopback CRIF to the router data
structure. That makes the loopback RIF index redundant, as everything
necessary can be derived from the CRIF. Drop the field and adjust the code
accordingly.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/8637bf959bc5b6c9d5184b9bd8a0cd53c5132835.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Link CRIFs to RIFs
Petr Machata [Thu, 22 Jun 2023 13:33:06 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Link CRIFs to RIFs

When a RIF is about to be created, the registration of the netdevice that
it should be associated with must have been seen in the past, and a CRIF
created. Therefore make this a hard requirement by looking up the CRIF
during RIF creation, and complaining loudly when there isn't one.

This then allows keeping a link between a RIF and its corresponding
CRIF (and back, as the relationship is one-to-at-most-one), which this
patch does.

The CRIF will later be useful as the objects tracked there will be
offloaded lazily as a result of RIF creation.

CRIFs are created when an "interesting" netdevice is registered, and
destroyed after such device is unregistered. CRIFs are supposed to already
exist when a RIF creation request arises, and exist at least as long as
that RIF exists. This makes for a simple invariant: it is always safe to
dereference CRIF pointer from "its" RIF.

To guarantee this, CRIFs cannot be removed immediately when the UNREGISTER
event is delivered. The reason is that if a RIF's netdevice has an IPv6
address, removal of this address is notified in an atomic block. To remove
the RIF, the IPv6 removal handler schedules a work item. It must be safe
for this work item to access the associated CRIF as well.

Thus when a netdevice that backs the CRIF is removed, if it still has a
RIF, do not actually free the CRIF, only toggle its can_destroy flag, which
this patch adds. Later on, mlxsw_sp_rif_destroy() collects the CRIF.
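
The deferred-free rule described above can be sketched in userspace as follows. The structures are reduced to the two fields that matter for the lifetime rule, and every name except the can_destroy flag is a hypothetical stand-in rather than the driver's real code.

```c
#include <stdbool.h>
#include <stdlib.h>

struct rif;

/* Reduced stand-in for a CRIF (assumption: only lifetime fields kept). */
struct crif {
	struct rif *rif;     /* NULL when no RIF is instantiated */
	bool can_destroy;    /* set once the netdevice has gone away */
};

struct rif {
	struct crif *crif;   /* always valid while the RIF exists */
};

static int crifs_freed;      /* counter used only for demonstration */

static struct crif *crif_alloc(void)
{
	return calloc(1, sizeof(struct crif));
}

static void crif_free(struct crif *crif)
{
	free(crif);
	crifs_freed++;
}

static struct rif *rif_create(struct crif *crif)
{
	struct rif *rif = calloc(1, sizeof(*rif));

	if (!rif)
		return NULL;
	rif->crif = crif;
	crif->rif = rif;
	return rif;
}

/* Netdevice gone: free now only if no RIF still points at the CRIF. */
static void netdev_unregistered(struct crif *crif)
{
	if (crif->rif)
		crif->can_destroy = true;
	else
		crif_free(crif);
}

/* RIF teardown collects a CRIF whose netdevice is already gone. */
static void rif_destroy(struct rif *rif)
{
	struct crif *crif = rif->crif;

	crif->rif = NULL;
	free(rif);
	if (crif->can_destroy)
		crif_free(crif);
}
```

In this model, dereferencing rif->crif is safe for the whole lifetime of the RIF, which is the invariant the commit message states.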

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/68c8e33afa6b8c03c431b435e1685ffdff752e63.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Maintain CRIF for fallback loopback RIF
Petr Machata [Thu, 22 Jun 2023 13:33:05 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Maintain CRIF for fallback loopback RIF

CRIFs are generally not maintained for loopback RIFs. However, the RIF for
the default VRF is used for offloading of blackhole nexthops. Nexthops
expect to have a valid CRIF. Therefore in this patch, add code to maintain
CRIF for the loopback RIF as well.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/7f2b2fcc98770167ed1254a904c3f7f585ba43f0.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Maintain a hash table of CRIFs
Petr Machata [Thu, 22 Jun 2023 13:33:04 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Maintain a hash table of CRIFs

CRIFs are objects that mlxsw maintains for netdevices that may not have an
associated RIF (i.e. they may not have been instantiated in the ASIC), but
if indeed they do not, it is quite possible they will in the future. These
netdevices are candidate RIFs, hence CRIFs. Netdevices for which CRIFs are
created include e.g. bridges, LAGs, or front panel ports. The idea is that
next hops would be kept at CRIFs, not RIFs, and thus it would be easier to
offload and unoffload the entities that have been added before the RIF was
created.

In this patch, add the code for low-level CRIF maintenance: create and
destroy, and keep in a table keyed by the netdevice pointer for easy
recall.
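
A table keyed by the netdevice pointer can be sketched in userspace like this. The sketch swaps in a fixed-size open-addressing table for brevity and is not the data structure the driver uses. The netdev and crif structures and the crif_lookup(), crif_create() and crif_destroy() helpers are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16 /* power of two; fixed size for the sketch */

struct netdev {
	const char *name;  /* stand-in for a real netdevice */
};

enum slot_state { SLOT_EMPTY = 0, SLOT_USED, SLOT_DEAD };

struct crif {
	const struct netdev *key;  /* netdevice pointer used as the key */
	enum slot_state state;
};

struct crif_table {
	struct crif slot[TABLE_SIZE];
};

static size_t hash_ptr(const void *p)
{
	uintptr_t v = (uintptr_t)p;

	return (size_t)((v >> 4) ^ (v >> 12)) & (TABLE_SIZE - 1);
}

/* Find the CRIF for a netdevice, or return NULL. */
static struct crif *crif_lookup(struct crif_table *t, const struct netdev *dev)
{
	size_t i, h = hash_ptr(dev);

	for (i = 0; i < TABLE_SIZE; i++) {
		struct crif *c = &t->slot[(h + i) & (TABLE_SIZE - 1)];

		if (c->state == SLOT_EMPTY)
			return NULL;
		if (c->state == SLOT_USED && c->key == dev)
			return c;
	}
	return NULL;
}

/* Create a CRIF; returns NULL if one exists already or the table is full. */
static struct crif *crif_create(struct crif_table *t, const struct netdev *dev)
{
	size_t i, h = hash_ptr(dev);

	if (crif_lookup(t, dev))
		return NULL;
	for (i = 0; i < TABLE_SIZE; i++) {
		struct crif *c = &t->slot[(h + i) & (TABLE_SIZE - 1)];

		if (c->state != SLOT_USED) {
			c->key = dev;
			c->state = SLOT_USED;
			return c;
		}
	}
	return NULL;
}

/* Mark the slot dead so that probing past it still works. */
static void crif_destroy(struct crif *c)
{
	c->key = NULL;
	c->state = SLOT_DEAD;
}
```

Keying by pointer keeps lookup cheap whenever an event handler already holds the netdevice.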

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/186d44e399c475159da20689f2c540719f2d1ed0.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Use mlxsw_sp_ul_rif_get() to get main VRF LB RIF
Petr Machata [Thu, 22 Jun 2023 13:33:03 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Use mlxsw_sp_ul_rif_get() to get main VRF LB RIF

The current function, mlxsw_sp_router_ul_rif_get(), is a wrapper around the
function mentioned in the subject. As such it forms an external interface
of the router code.

In future patches we will want to maintain connection between RIFs and the
CRIFs (introduced in the next patch) that back them. That will not hold
for the VRF-based loopback netdevices, so the whole CRIF business can be
kept hidden from the rest of mlxsw.

But for the main VRF loopback RIF we do want to keep the RIF-CRIF
connection, because that RIF is used for blackhole next hops, and the next
hop code can be kept simpler by assuming rif->crif is valid.

Hence, instead, call mlxsw_sp_ul_rif_get() to create the main VRF loopback
RIF. This being an internal function will take the CRIF argument anyway.
Furthermore, the function does not lock, which is not necessary at this
point in code yet.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/7a39a011a02a84164cd7f5da7985ec5b2ae01ba5.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agomlxsw: spectrum_router: Add extack argument to mlxsw_sp_lb_rif_init()
Petr Machata [Thu, 22 Jun 2023 13:33:02 +0000 (15:33 +0200)]
mlxsw: spectrum_router: Add extack argument to mlxsw_sp_lb_rif_init()

The extack will be handy in later patches.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Link: https://lore.kernel.org/r/e87ba300121010d580b80a281877573a7b1377ca.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet/mlx5: Remove pointless vport lookup from mlx5_esw_check_port_type()
Jiri Pirko [Thu, 1 Jun 2023 07:17:17 +0000 (09:17 +0200)]
net/mlx5: Remove pointless vport lookup from mlx5_esw_check_port_type()

As xa_get_mark() returns false in case the entry is not present,
no need to redundantly check if vport is present. Remove the lookup.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5: Remove redundant check from mlx5_esw_query_vport_vhca_id()
Jiri Pirko [Wed, 31 May 2023 13:11:07 +0000 (15:11 +0200)]
net/mlx5: Remove redundant check from mlx5_esw_query_vport_vhca_id()

Since mlx5_esw_query_vport_vhca_id() could be called either from
mlx5_esw_vport_enable() or mlx5_esw_vport_disable() where the
check is done, this is always false here.
Remove the redundant check.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5: Remove redundant is_mdev_switchdev_mode() check from is_ib_rep_supported()
Jiri Pirko [Thu, 1 Jun 2023 11:15:12 +0000 (13:15 +0200)]
net/mlx5: Remove redundant is_mdev_switchdev_mode() check from is_ib_rep_supported()

is_mdev_switchdev_mode() check is done in is_eth_rep_supported().
Function is_ib_rep_supported() calls is_eth_rep_supported().
Remove the redundant check from it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5: Remove redundant MLX5_ESWITCH_MANAGER() check from is_ib_rep_supported()
Jiri Pirko [Thu, 1 Jun 2023 11:12:37 +0000 (13:12 +0200)]
net/mlx5: Remove redundant MLX5_ESWITCH_MANAGER() check from is_ib_rep_supported()

MLX5_ESWITCH_MANAGER() check is done in is_eth_rep_supported().
Function is_ib_rep_supported() calls is_eth_rep_supported().
Remove the redundant check from it.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5e: E-Switch, Fix shared fdb error flow
Roi Dayan [Mon, 29 May 2023 06:32:57 +0000 (09:32 +0300)]
net/mlx5e: E-Switch, Fix shared fdb error flow

In the error flow, esw_master_egress_destroy_resources() frees resources
but does not set the pointers to NULL if the error came from creating a
bounce rule. esw_acl_egress_ofld_cleanup() then tries to access the
already freed pointers. Fix it by resetting the pointers to NULL.
Also, if the error came from creating a second or later bounce rule, the
flow group and table are still in use and cannot and should not be freed.
Add a check to destroy the flow group and table only if there are no
bounce rules.
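
The two parts of the fix can be illustrated with a userspace sketch. It assumes a hypothetical egress_acl structure with two heap pointers and a bounce rule count. None of the names below come from the mlx5 driver.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the egress ACL state. */
struct egress_acl {
	void *group;        /* flow group, NULL when not allocated */
	void *table;        /* flow table, NULL when not allocated */
	int bounce_rules;   /* rules that still use group and table */
};

static int frees_done;  /* counts free operations for the demonstration */

/* Free *p once and reset it so later cleanup cannot touch freed memory. */
static void release(void **p)
{
	if (*p) {
		free(*p);
		*p = NULL;
		frees_done++;
	}
}

/* Destroy group and table only when no bounce rule still uses them. */
static void destroy_resources(struct egress_acl *acl)
{
	if (acl->bounce_rules > 0)
		return;
	release(&acl->group);
	release(&acl->table);
}
```

Because the pointers are reset, calling destroy_resources() a second time from a later cleanup path is harmless in this model.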

mlx5_core.sf mlx5_core.sf.2: mlx5_destroy_flow_group:2306:(pid 2235): Flow group 4 wasn't destroyed, refcount > 1
mlx5_core.sf mlx5_core.sf.2: mlx5_destroy_flow_table:2295:(pid 2235): Flow table 3 wasn't destroyed, refcount > 1

Fixes: 5e0202eb49ed ("net/mlx5: E-switch, Handle multiple master egress rules")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5e: Remove redundant comment
Roi Dayan [Mon, 29 May 2023 06:24:54 +0000 (09:24 +0300)]
net/mlx5e: Remove redundant comment

The function comment says what it is and the comment
is redundant.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5e: E-Switch, Pass other_vport flag if vport is not 0
Roi Dayan [Sun, 28 May 2023 14:10:43 +0000 (17:10 +0300)]
net/mlx5e: E-Switch, Pass other_vport flag if vport is not 0

When creating a flow table for shared fdb resources, the other_vport
flag only needs to be passed if the vport is not 0 or if the port is
the ECPF in BlueField.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5e: E-Switch, Use xarray for devcom paired device index
Roi Dayan [Sun, 28 May 2023 09:11:47 +0000 (12:11 +0300)]
net/mlx5e: E-Switch, Use xarray for devcom paired device index

To allow devcom events on an E-Switch that is not a vport group manager,
use the vhca id as an index instead of the device index, which might be
shared between several E-Switches, for example an SF and its PF.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5e: E-Switch, Add peer fdb miss rules for vport manager or ecpf
Roi Dayan [Sun, 28 May 2023 09:10:26 +0000 (12:10 +0300)]
net/mlx5e: E-Switch, Add peer fdb miss rules for vport manager or ecpf

Add peer fdb miss rules for E-Switches that are vport managers or ECPF
devices. They are not needed for other devices.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5e: Use vhca_id for device index in vport rx rules
Roi Dayan [Sun, 28 May 2023 07:58:03 +0000 (10:58 +0300)]
net/mlx5e: Use vhca_id for device index in vport rx rules

The device index is like the PF index and is limited to the max physical ports.
For example, for SFs created under a PF, the device index is the PF device index.
Use vhca_id, which gets the FW index per vport, for vport rx rules
and vport pair events.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5: Lag, Remove duplicate code checking lag is supported
Roi Dayan [Tue, 23 May 2023 09:02:06 +0000 (12:02 +0300)]
net/mlx5: Lag, Remove duplicate code checking lag is supported

Remove duplicate function for checking if device has lag support.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5: Fix error code in mlx5_is_reset_now_capable()
Dan Carpenter [Tue, 20 Jun 2023 13:43:07 +0000 (16:43 +0300)]
net/mlx5: Fix error code in mlx5_is_reset_now_capable()

The mlx5_is_reset_now_capable() function returns bool, not negative
error codes.  So if fast teardown is not supported it should return
false instead of -EOPNOTSUPP.
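
The effect can be reproduced in a few lines of userspace C. This is only
a sketch: reset_capable_buggy() and reset_capable_fixed() are
hypothetical stand-ins, not the driver's functions. Any non-zero value
converts to true when returned from a bool function, so the negative
error code reads as "capable".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the buggy pattern: a bool function that
 * returns a negative errno. Any non-zero value converts to true.
 */
static bool reset_capable_buggy(bool fast_teardown_supported)
{
	if (!fast_teardown_supported)
		return -EOPNOTSUPP;	/* silently becomes true */
	return true;
}

/* Same check with the return value a bool function should use. */
static bool reset_capable_fixed(bool fast_teardown_supported)
{
	if (!fast_teardown_supported)
		return false;
	return true;
}

/* Usage: the buggy variant claims support even when there is none. */
static void check_reset_capable(void)
{
	assert(reset_capable_buggy(false) == true);
	assert(reset_capable_fixed(false) == false);
	assert(reset_capable_fixed(true) == true);
}
```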

Fixes: 92501fa6e421 ("net/mlx5: Ack on sync_reset_request only if PF can do reset_now")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5: Fix reserved at offset in hca_cap register
Lama Kayal [Mon, 12 Jun 2023 13:34:43 +0000 (16:34 +0300)]
net/mlx5: Fix reserved at offset in hca_cap register

A member of struct mlx5_ifc_cmd_hca_cap_bits was mistakenly given the
wrong reserved_at offset. Correct it to the right value, thus
avoiding future miscalculations.

Signed-off-by: Lama Kayal <lkayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agonet/mlx5: Fix SFs kernel documentation error
Shay Drory [Mon, 19 Jun 2023 04:42:07 +0000 (07:42 +0300)]
net/mlx5: Fix SFs kernel documentation error

Indent SFs probe code example in order to fix the below error:

Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst:57: ERROR: Unexpected indentation.
Documentation/networking/device_drivers/ethernet/mellanox/mlx5/switchdev.rst:61: ERROR: Unexpected indentation.

Fixes: e71383fb9cd1 ("net/mlx5: Light probe local SFs")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Automatic Verification <verifier@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
17 months agonet/mlx5: Fix UAF in mlx5_eswitch_cleanup()
Shay Drory [Wed, 14 Jun 2023 13:26:07 +0000 (16:26 +0300)]
net/mlx5: Fix UAF in mlx5_eswitch_cleanup()

mlx5_eswitch_cleanup() uses esw right after freeing it, in order to
release the devlink_param.
Fix it by releasing the devlink_param before freeing the esw, and
adjust the create function accordingly.

Fixes: 3f90840305e2 ("net/mlx5: Move esw multiport devlink param to eswitch code")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Automatic Verification <verifier@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
17 months agodt-bindings: net: altr,tse: Fix error in "compatible" conditional schema
Rob Herring [Wed, 21 Jun 2023 23:10:12 +0000 (17:10 -0600)]
dt-bindings: net: altr,tse: Fix error in "compatible" conditional schema

The conditional if/then schema has an error as the "enum" values have
"const" in them. Drop the "const".

Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230621231012.3816139-1-robh@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agoMerge branch 's390-net-updates-2023-06-10'
Paolo Abeni [Fri, 23 Jun 2023 09:55:56 +0000 (11:55 +0200)]
Merge branch 's390-net-updates-2023-06-10'

Alexandra Winter says:

====================
s390/net: updates 2023-06-10

Please apply the following patch series for s390's ctcm and lcs drivers
to netdev's net-next tree.

Just maintenance patches, no functional changes.
====================

Link: https://lore.kernel.org/r/20230621134921.904217-1-wintera@linux.ibm.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agos390/ctcm: Convert sprintf/snprintf to scnprintf
Thorsten Winkler [Wed, 21 Jun 2023 13:49:21 +0000 (15:49 +0200)]
s390/ctcm: Convert sprintf/snprintf to scnprintf

This LWN article explains why scnprintf is preferred over snprintf
in general:
https://lwn.net/Articles/69419/
I.e. snprintf() returns what *would* be the resulting length, while
scnprintf() returns the actual length.

Note that ctcm_print_statistics() writes the data into the kernel log
and is therefore not suitable for sysfs_emit(). Observable behavior is
not changed, as there may be dependencies.
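
The difference in return values can be shown in userspace. The sketch
below is an approximation, not the kernel implementation:
scnprintf_sketch() is a hypothetical helper built on the C library's
vsnprintf(), and treating a negative vsnprintf() result as 0 is an
assumption made here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace approximation of scnprintf(): returns the number of
 * characters actually stored in buf, not counting the trailing NUL.
 */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;		/* assumption: treat errors as empty */
	if ((size_t)n >= size)
		return (int)(size - 1);	/* output was truncated */
	return n;
}

static void check_scnprintf_sketch(void)
{
	char buf[8];

	/* snprintf() reports the length the full string would have had. */
	assert(snprintf(buf, sizeof(buf), "%s", "0123456789") == 10);
	/* The sketch reports what fits: 7 characters plus the NUL. */
	assert(scnprintf_sketch(buf, sizeof(buf), "%s", "0123456789") == 7);
	assert(buf[7] == '\0');
}
```

Code that advances a write position by the snprintf() return value can
step past the end of the buffer after a truncation. Using the actual
length avoids that.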

Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Thorsten Winkler <twinkler@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agos390/ctcm: Convert sysfs sprintf to sysfs_emit
Thorsten Winkler [Wed, 21 Jun 2023 13:49:20 +0000 (15:49 +0200)]
s390/ctcm: Convert sysfs sprintf to sysfs_emit

Following the advice of the Documentation/filesystems/sysfs.rst.
All sysfs related show()-functions should only use sysfs_emit() or
sysfs_emit_at() when formatting the value to be returned to user space.

Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Thorsten Winkler <twinkler@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agos390/lcs: Convert sprintf to scnprintf
Thorsten Winkler [Wed, 21 Jun 2023 13:49:19 +0000 (15:49 +0200)]
s390/lcs: Convert sprintf to scnprintf

This LWN article explains why scnprintf is preferred over snprintf
in general:
https://lwn.net/Articles/69419/
I.e. snprintf() returns what *would* be the resulting length, while
scnprintf() returns the actual length.

Reported-by: Jules Irenge <jbi.octave@gmail.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Thorsten Winkler <twinkler@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agos390/lcs: Convert sysfs sprintf to sysfs_emit
Thorsten Winkler [Wed, 21 Jun 2023 13:49:18 +0000 (15:49 +0200)]
s390/lcs: Convert sysfs sprintf to sysfs_emit

Following the advice of the Documentation/filesystems/sysfs.rst.
All sysfs related show()-functions should only use sysfs_emit() or
sysfs_emit_at() when formatting the value to be returned to user space.

While at it, follow Linux kernel coding style and unify indentation

Reported-by: Jules Irenge <jbi.octave@gmail.com>
Reported-by: Joe Perches <joe@perches.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Thorsten Winkler <twinkler@linux.ibm.com>
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agoMerge branch 'net-hns3-there-are-some-cleanup-for-the-hns3-ethernet-driver'
Paolo Abeni [Fri, 23 Jun 2023 08:59:21 +0000 (10:59 +0200)]
Merge branch 'net-hns3-there-are-some-cleanup-for-the-hns3-ethernet-driver'

Hao Lan says:

====================
net: hns3: There are some cleanup for the HNS3 ethernet driver

This series contains some cleanups for the HNS3 ethernet driver.
====================

Link: https://lore.kernel.org/r/20230621123309.34381-1-lanhao@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agonet: hns3: clear hns unused parameter alarm
Peiyang Wang [Wed, 21 Jun 2023 12:33:09 +0000 (20:33 +0800)]
net: hns3: clear hns unused parameter alarm

Several functions in the hns3 driver have unused parameters.
The compiler warns about them when hns3 is built with the
-Wunused-parameter option.

Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agonet: hns3: fix strncpy() not using dest-buf length as length issue
Hao Chen [Wed, 21 Jun 2023 12:33:08 +0000 (20:33 +0800)]
net: hns3: fix strncpy() not using dest-buf length as length issue

Currently, strncpy() in hns3_dbg_fill_content() uses the source length as
the copy length, which may overflow the destination buffer.

This patch fixes the compile warning reported by the Intel kernel test
robot with the csky-linux-gcc (GCC) 12.1.0 compiler.

The warning reports as below:

hclge_debugfs.c:92:25: warning: 'strncpy' specified bound depends on
the length of the source argument [-Wstringop-truncation]

strncpy(pos, items[i].name, strlen(items[i].name));

hclge_debugfs.c:90:25: warning: 'strncpy' output truncated before
terminating nul copying as many bytes from a string as its length
[-Wstringop-truncation]

strncpy(pos, result[i], strlen(result[i]));

Because strncpy() uses the source length as the copy length, it may
overflow the destination buffer.

So this patch adds some value checks to avoid this issue.
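
One way to bound such a copy by the destination is sketched below. It is
not the code from this patch: fill_field() is a hypothetical helper, and
truncating to the remaining room is just one possible form of the value
check described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into the window [pos, pos + room) without a terminating NUL,
 * as a column-formatting helper might. Returns the bytes copied.
 */
static size_t fill_field(char *pos, size_t room, const char *src)
{
	size_t n = strlen(src);

	if (n > room)
		n = room;	/* bound by the destination, not the source */
	memcpy(pos, src, n);
	return n;
}

static void check_fill_field(void)
{
	char line[8];

	memset(line, ' ', sizeof(line));
	/* Only 4 bytes of room are left, so the 8-byte name is cut. */
	assert(fill_field(line + 4, sizeof(line) - 4, "abcdefgh") == 4);
	assert(memcmp(line, "    abcd", sizeof(line)) == 0);
}
```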

Signed-off-by: Hao Chen <chenhao418@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/lkml/202207170606.7WtHs9yS-lkp@intel.com/T/
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agonet: hns3: refine the tcam key convert handle
Jian Shen [Wed, 21 Jun 2023 12:33:07 +0000 (20:33 +0800)]
net: hns3: refine the tcam key convert handle

The result of the expression '(k ^ ~v) & k' is exactly
the same as 'k & v', so simplify it.
(k ^ ~v) & k == k & v
The truth table (in non table form):
k == 0, v == 0:
  (k ^ ~v) & k == (0 ^ ~0) & 0 == (0 ^ 1) & 0 == 1 & 0 == 0
  k & v == 0 & 0 == 0

k == 0, v == 1:
  (k ^ ~v) & k == (0 ^ ~1) & 0 == (0 ^ 0) & 0 == 0 & 0 == 0
  k & v == 0 & 1 == 0

k == 1, v == 0:
  (k ^ ~v) & k == (1 ^ ~0) & 1 == (1 ^ 1) & 1 == 0 & 1 == 0
  k & v == 1 & 0 == 0

k == 1, v == 1:
  (k ^ ~v) & k == (1 ^ ~1) & 1 == (1 ^ 0) & 1 == 1 & 1 == 1
  k & v == 1 & 1 == 1
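
Since both expressions are bitwise, the per-bit table above is enough,
but the identity can also be checked exhaustively on bytes. A minimal
sketch follows; the function names are illustrative and are not taken
from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The expression before the simplification. */
static uint8_t tcam_expr_original(uint8_t k, uint8_t v)
{
	return (uint8_t)((k ^ (uint8_t)~v) & k);
}

/* The expression after the simplification. */
static uint8_t tcam_expr_simplified(uint8_t k, uint8_t v)
{
	return k & v;
}

/* Compare both expressions for every pair of byte values. */
static bool tcam_exprs_agree(void)
{
	for (unsigned int k = 0; k < 256; k++)
		for (unsigned int v = 0; v < 256; v++)
			if (tcam_expr_original((uint8_t)k, (uint8_t)v) !=
			    tcam_expr_simplified((uint8_t)k, (uint8_t)v))
				return false;
	return true;
}

static void check_tcam_exprs(void)
{
	assert(tcam_exprs_agree());
}
```
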
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Hao Lan <lanhao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
17 months agoMerge tag 'wireless-next-2023-06-22' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 23 Jun 2023 03:09:13 +0000 (20:09 -0700)]
Merge tag 'wireless-next-2023-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
Notable changes this time around:

MAINTAINERS
 - add missing driver git trees

ath11k
 - factory test mode support

iwlwifi
 - config rework to drop test devices and
   split the different families
 - major update for new firmware and MLO

stack
 - initial multi-link reconfiguration support
 - multi-BSSID and MLO improvements

other
 - fix the last few W=1 warnings from GCC 13
 - merged wireless tree to avoid conflicts

* tag 'wireless-next-2023-06-22' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (245 commits)
  wifi: ieee80211: fix erroneous NSTR bitmap size checks
  wifi: rtlwifi: cleanup USB interface
  wifi: rtlwifi: simplify LED management
  wifi: ath10k: improve structure padding
  wifi: ath9k: convert msecs to jiffies where needed
  wifi: iwlwifi: mvm: Add support for IGTK in D3 resume flow
  wifi: iwlwifi: mvm: update two most recent GTKs on D3 resume flow
  wifi: iwlwifi: mvm: Refactor security key update after D3
  wifi: mac80211: mark keys as uploaded when added by the driver
  wifi: iwlwifi: remove support of A0 version of FM RF
  wifi: iwlwifi: cfg: clean up Bz module firmware lines
  wifi: iwlwifi: pcie: add device id 51F1 for killer 1675
  wifi: iwlwifi: bump FW API to 83 for AX/BZ/SC devices
  wifi: iwlwifi: cfg: remove trailing dash from FW_PRE constants
  wifi: iwlwifi: also unify Ma device configurations
  wifi: iwlwifi: also unify Sc device configurations
  wifi: iwlwifi: unify Bz/Gl device configurations
  wifi: iwlwifi: pcie: also drop jacket from info macro
  wifi: iwlwifi: remove support for *nJ devices
  wifi: iwlwifi: don't load old firmware for 22000
  ...
====================

Link: https://lore.kernel.org/r/20230622185602.147650-2-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge tag 'linux-can-next-for-6.5-20230622' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Fri, 23 Jun 2023 03:05:25 +0000 (20:05 -0700)]
Merge tag 'linux-can-next-for-6.5-20230622' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2023-06-22

The first patch is by Carsten Schmidt, targets the kvaser_usb driver
and adds len8_dlc support.

Marcel Hellwig's patch for the xilinx_can driver adds support for CAN
transceivers via the PHY framework.

Frank Jungclaus contributes 6+2 patches for the esd_usb driver in
preparation for the upcoming CAN-USB/3 support.

The 2 patches by Miquel Raynal for the sja1000 driver work around
overrun stalls on the Renesas SoCs.

The next 3 patches are by me and fix the coding style in the
rx-offload helper and in the m_can and ti_hecc driver.

Vincent Mailhol contributes 3 patches to fix and update the
calculation of the length of CAN frames on the wire.

Oliver Hartkopp's patch moves the CAN_RAW_FILTER_MAX definition into
the correct header.

The remaining 14 patches are by Jimmy Assarsson, target the
kvaser_pciefd driver and bring various updates and improvements.

* tag 'linux-can-next-for-6.5-20230622' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (33 commits)
  can: kvaser_pciefd: Use TX FIFO size read from CAN controller
  can: kvaser_pciefd: Refactor code
  can: kvaser_pciefd: Add len8_dlc support
  can: kvaser_pciefd: Use FIELD_{GET,PREP} and GENMASK where appropriate
  can: kvaser_pciefd: Sort register definitions
  can: kvaser_pciefd: Change return type for kvaser_pciefd_{receive,transmit,set_tx}_irq()
  can: kvaser_pciefd: Rename device ID defines
  can: kvaser_pciefd: Sort includes in alphabetic order
  can: kvaser_pciefd: Remove SPI flash parameter read functionality
  can: uapi: move CAN_RAW_FILTER_MAX definition to raw.h
  can: kvaser_pciefd: Define unsigned constants with type suffix 'U'
  can: kvaser_pciefd: Set hardware timestamp on transmitted packets
  can: kvaser_pciefd: Add function to set skb hwtstamps
  can: kvaser_pciefd: Remove handler for unused KVASER_PCIEFD_PACK_TYPE_EFRAME_ACK
  can: kvaser_pciefd: Remove useless write to interrupt register
  can: length: refactor frame lengths definition to add size in bits
  can: length: fix bitstuffing count
  can: length: fix description of the RRS field
  can: m_can: fix coding style
  can: ti_hecc: fix coding style
  ...
====================

Link: https://lore.kernel.org/r/20230622082658.571150-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: fix net device address assign type
Piotr Gardocki [Wed, 21 Jun 2023 13:21:06 +0000 (15:21 +0200)]
net: fix net device address assign type

Commit ad72c4a06acc introduced optimization to return from function
quickly if the MAC address is not changing at all. It was reported
that such change causes dev->addr_assign_type to not change
to NET_ADDR_SET from _PERM or _RANDOM.
Restore the old behavior and skip only call to ndo_set_mac_address.
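
The restored control flow can be modelled roughly as below. This is a
userspace sketch under stated assumptions: struct fake_netdev, the
ADDR_* constants and set_mac_sketch() are hypothetical stand-ins, and
the ndo_calls counter only simulates calls to ndo_set_mac_address.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for NET_ADDR_PERM, NET_ADDR_RANDOM and NET_ADDR_SET. */
enum { ADDR_PERM, ADDR_RANDOM, ADDR_SET };

struct fake_netdev {
	unsigned char addr[6];
	int addr_assign_type;
	int ndo_calls;	/* counts simulated ndo_set_mac_address calls */
};

static void set_mac_sketch(struct fake_netdev *dev, const unsigned char *mac)
{
	/* Only the driver callback is skipped for an unchanged address. */
	if (memcmp(dev->addr, mac, sizeof(dev->addr)) != 0) {
		memcpy(dev->addr, mac, sizeof(dev->addr));
		dev->ndo_calls++;
	}
	/* Runs even when the address did not change. */
	dev->addr_assign_type = ADDR_SET;
}

static void check_set_mac_sketch(void)
{
	struct fake_netdev dev = { { 2, 0, 0, 0, 0, 1 }, ADDR_RANDOM, 0 };
	const unsigned char same[6] = { 2, 0, 0, 0, 0, 1 };

	set_mac_sketch(&dev, same);
	assert(dev.ndo_calls == 0);
	assert(dev.addr_assign_type == ADDR_SET);
}
```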

Fixes: ad72c4a06acc ("net: add check for current MAC address in dev_set_mac_address")
Reported-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230621132106.991342-1-piotrx.gardocki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agosfc: keep alive neighbour entries while a TC encap action is using them
Edward Cree [Wed, 21 Jun 2023 12:15:04 +0000 (13:15 +0100)]
sfc: keep alive neighbour entries while a TC encap action is using them

When processing counter updates, if any action set using the newly
 incremented counter includes an encap action, prod the corresponding
 neighbouring entry to indicate to the neighbour cache that the entry
 is still in use and passing traffic.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230621121504.17004-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: dsa: qca8k: add support for additional modes for netdev trigger
Christian Marangi [Wed, 21 Jun 2023 09:54:09 +0000 (11:54 +0200)]
net: dsa: qca8k: add support for additional modes for netdev trigger

The QCA8K switch supports additional modes that can be handled in
hardware for the LED netdev trigger.

Add these additional modes to further support the Switch LEDs and
offload more blink modes.

Add additional modes:
- link_10
- link_100
- link_1000
- half_duplex
- full_duplex

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230621095409.25859-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agodocs: ABI: sysfs-class-led-trigger-netdev: add new modes and entry
Christian Marangi [Wed, 21 Jun 2023 09:26:53 +0000 (11:26 +0200)]
docs: ABI: sysfs-class-led-trigger-netdev: add new modes and entry

Document newly introduced modes and entry for the LED netdev trigger.

Add documentation for new modes:
- link_10
- link_100
- link_1000
- half_duplex
- full_duplex

Add documentation for new entry:
- hw_control

Also add additional info for the interval entry and the tx/rx modes with
the special case of hw_control ON.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230621092653.23172-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoigb: Fix igb_down hung on surprise removal
Ying Hsu [Tue, 20 Jun 2023 17:47:32 +0000 (10:47 -0700)]
igb: Fix igb_down hung on surprise removal

In a setup where a Thunderbolt hub connects to Ethernet and a display
through USB Type-C, users may experience a hung task timeout when they
remove the cable between the PC and the Thunderbolt hub.
This is because the igb_down function is called multiple times when
the Thunderbolt hub is unplugged. For example, the igb_io_error_detected
triggers the first call, and the igb_remove triggers the second call.
The second call to igb_down will block at napi_synchronize.
Here's the call trace:
    __schedule+0x3b0/0xddb
    ? __mod_timer+0x164/0x5d3
    schedule+0x44/0xa8
    schedule_timeout+0xb2/0x2a4
    ? run_local_timers+0x4e/0x4e
    msleep+0x31/0x38
    igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __dev_close_many+0x95/0xec
    dev_close_many+0x6e/0x103
    unregister_netdevice_many+0x105/0x5b1
    unregister_netdevice_queue+0xc2/0x10d
    unregister_netdev+0x1c/0x23
    igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    pci_device_remove+0x3f/0x9c
    device_release_driver_internal+0xfe/0x1b4
    pci_stop_bus_device+0x5b/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_and_remove_bus_device+0x12/0x19
    pciehp_unconfigure_device+0x76/0xe9
    pciehp_disable_slot+0x6e/0x131
    pciehp_handle_presence_or_link_change+0x7a/0x3f7
    pciehp_ist+0xbe/0x194
    irq_thread_fn+0x22/0x4d
    ? irq_thread+0x1fd/0x1fd
    irq_thread+0x17b/0x1fd
    ? irq_forced_thread_fn+0x5f/0x5f
    kthread+0x142/0x153
    ? __irq_get_irqchip_state+0x46/0x46
    ? kthread_associate_blkcg+0x71/0x71
    ret_from_fork+0x1f/0x30

In this case, igb_io_error_detected detaches the network interface
and requests a PCIE slot reset, however, the PCIE reset callback is
not being invoked and thus the Ethernet connection breaks down.
As the PCIE error in this case is a non-fatal one, requesting a
slot reset can be avoided.
This patch fixes the task hung issue and preserves Ethernet
connection by ignoring non-fatal PCIE errors.

Signed-off-by: Ying Hsu <yinghsu@chromium.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230620174732.4145155-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge branch 'net-dsa-microchip-fix-writes-to-phy-registers-0x10'
Jakub Kicinski [Fri, 23 Jun 2023 02:48:40 +0000 (19:48 -0700)]
Merge branch 'net-dsa-microchip-fix-writes-to-phy-registers-0x10'

Rasmus Villemoes says:

====================
net: dsa: microchip: fix writes to phy registers >= 0x10

Patch 1 is just a simplification, technically unrelated to the other
two patches. But it would be a bit inconsistent to have the new
ksz_prmw32() introduced in patch 2 use ksz_rmw32() while leaving
ksz_prmw8() as-is.

The actual fix is of course patch 3. I can definitely see some weird
behaviour on our ksz9567 when writing to phy registers 0x1e and 0x1f
(with phytool from userspace), though it does not seem that the effect
is always to write zeroes to the buddy register as the errata sheet
says would be the case. In our case, the switch is connected via i2c;
I hope somebody with other switches and/or the SPI variants can test
this.
====================

Link: https://lore.kernel.org/r/20230620113855.733526-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: dsa: microchip: fix writes to phy registers >= 0x10
Rasmus Villemoes [Tue, 20 Jun 2023 11:38:54 +0000 (13:38 +0200)]
net: dsa: microchip: fix writes to phy registers >= 0x10

According to the errata sheets for ksz9477 and ksz9567, writes to the
PHY registers 0x10-0x1f (i.e. those located at addresses 0xN120 to
0xN13f) must be done as a 32 bit write to the 4-byte aligned address
containing the register, hence requires a RMW in order not to change
the adjacent PHY register.
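
The read-modify-write can be sketched against a simulated register file.
The names below are hypothetical. The sketch assumes the register at the
4-byte aligned address occupies the upper 16 bits of the 32-bit word
(big-endian register layout), and it models only the path for registers
0x10 and above.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated per-port register space, one array slot per 32-bit word. */
static uint32_t port_regs[0x200 / 4];

/* 32-bit read-modify-write: only the bits in mask are changed. */
static void rmw32_sketch(uint32_t addr, uint32_t mask, uint32_t val)
{
	uint32_t *word = &port_regs[addr / 4];

	*word = (*word & ~mask) | (val & mask);
}

/* Write a 16-bit PHY register (0x10..0x1f) via its containing word. */
static void phy_write_sketch(unsigned int reg, uint16_t val)
{
	uint32_t addr = 0x100 + (reg << 1);	/* 0x120..0x13e */
	uint32_t mask = 0xffff;
	uint32_t val32 = val;

	/* Assumption: the aligned register sits in the upper half. */
	if ((addr & 2) == 0) {
		mask <<= 16;
		val32 <<= 16;
	}
	rmw32_sketch(addr & ~3u, mask, val32);
}

static void check_phy_write_sketch(void)
{
	phy_write_sketch(0x1e, 0x1234);
	phy_write_sketch(0x1f, 0xabcd);
	/* Both registers share one word and neither clobbers the other. */
	assert(port_regs[0x13c / 4] == 0x1234abcdu);
}
```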

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230620113855.733526-4-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: dsa: microchip: add ksz_prmw32() helper
Rasmus Villemoes [Tue, 20 Jun 2023 11:38:53 +0000 (13:38 +0200)]
net: dsa: microchip: add ksz_prmw32() helper

This will be used in a subsequent patch fixing an errata for writes to
certain PHY registers.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Link: https://lore.kernel.org/r/20230620113855.733526-3-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: dsa: microchip: simplify ksz_prmw8()
Rasmus Villemoes [Tue, 20 Jun 2023 11:38:52 +0000 (13:38 +0200)]
net: dsa: microchip: simplify ksz_prmw8()

Implement ksz_prmw8() in terms of ksz_rmw8(), just as all the other
ksz_pX are implemented in terms of ksz_X. No functional change.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Link: https://lore.kernel.org/r/20230620113855.733526-2-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agotools: ynl: improve the direct-include header guard logic
Jakub Kicinski [Wed, 21 Jun 2023 23:17:19 +0000 (16:17 -0700)]
tools: ynl: improve the direct-include header guard logic

Przemek suggests that I shouldn't accuse GCC of witchcraft,
there is a simpler explanation for why we need manual define.

scripts/headers_install.sh modifies the guard, removing _UAPI.
That's why including a kernel header from the tree and from
/usr leads to duplicate definitions.

This also solves the mystery of why I needed to include
the header conditionally. I had the wrong guards for most
cases but ethtool.

Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20230621231719.2728928-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agonet: txgbe: remove unused buffer in txgbe_calc_eeprom_checksum
Zhengchao Shao [Tue, 20 Jun 2023 06:25:19 +0000 (14:25 +0800)]
net: txgbe: remove unused buffer in txgbe_calc_eeprom_checksum

Half a year has passed since commit 049fe5365324c ("net: txgbe: Add operations
to interact with firmware") was submitted, and the buffer in
txgbe_calc_eeprom_checksum is still not used. So remove it and the related
branch code.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202306200242.FXsHokaJ-lkp@intel.com/
Reviewed-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230620062519.1575298-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
17 months agoMerge branch 'add-and-use-helper-for-pcs-negotiation-modes'
Jakub Kicinski [Fri, 23 Jun 2023 02:41:13 +0000 (19:41 -0700)]
Merge branch 'add-and-use-helper-for-pcs-negotiation-modes'

Russell King says:

====================
Add and use helper for PCS negotiation modes

Earlier this month, I proposed a helper for deciding whether a PCS
should use inband negotiation modes or not. There was some discussion
around this topic, and I believe there was no disagreement about
providing the helper.

The initial discussion can be found at:

https://lore.kernel.org/r/ZGIkGmyL8yL1q1zp@shell.armlinux.org.uk

Subsequently, I posted a RFC series back in May:

https://lore.kernel.org/r/ZGzhvePzPjJ0v2En@shell.armlinux.org.uk

that added a helper, phylink_pcs_neg_mode() which PCS drivers could use
to parse the state, and updated a bunch of drivers to use it. I got
a couple of bits of feedback to it, including some ACKs.

However, I've decided to take this slightly further and change the
"mode" parameter to both the pcs_config() and pcs_link_up() methods
when a PCS driver opts in to this (by setting "neg_mode" in the
phylink_pcs structure.) If this is not set, we default to the old
behaviour. That said, this series converts all the PCS implementations
I can find currently in net-next.

Doing this has the added benefit that the negotiation mode parameter
is also available to the pcs_link_up() function, which can now know
whether inband negotiation was in fact enabled or not at pcs_config()
time.

It has been posted as RFC at:

https://lore.kernel.org/r/ZIh/CLQ3z89g0Ua0@shell.armlinux.org.uk

and received one reply, thanks Elad, which is a similar amount of
interest to previous postings. Let's post it as non-RFC and see
whether we get more reaction.
====================

Link: https://lore.kernel.org/r/ZIxQIBfO9dH5xFlg@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>