Mat Martineau [Mon, 2 May 2022 20:52:32 +0000 (13:52 -0700)]
selftests: mptcp: ADD_ADDR echo test with missing userspace daemon
Check userspace PM behavior to ensure ADD_ADDR echoes are only sent when
there is an active userspace daemon. If the daemon is restarting or
hasn't loaded yet, the missing echo will cause the peer to retransmit
the ADD_ADDR - and hopefully the daemon will be ready to receive it at
that later time.
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kishen Maloor [Mon, 2 May 2022 20:52:31 +0000 (13:52 -0700)]
mptcp: bypass in-kernel PM restrictions for non-kernel PMs
Current limits on the # of addresses/subflows must apply only to
in-kernel PM managed sockets. Thus this change removes such
restrictions on connections overseen by non-kernel (e.g. userspace)
PMs. This change also ensures that the kernel does not record stats
inside struct mptcp_pm_data updated along kernel code paths when exercised
via non-kernel PMs.
Additionally, address announcements are acknolwedged and subflow
requests are honored only when it's deemed that a userspace path
manager is active at the time.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Tue, 3 May 2022 10:43:41 +0000 (12:43 +0200)]
Merge tag 'mlx5-updates-2022-05-02' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2022-05-02
1) Trivial Misc updates to mlx5 driver
2) From Mark Bloch: Flow steering, general steering refactoring/cleaning
An issue with flow steering deletion flow (when creating a rule without
dests) turned out to be easy to fix but during the fix some issue
with the flow steering creation/deletion flows have been found.
The following patch series tries to fix long standing issues with flow
steering code and hopefully preventing silly future bugs.
A) Fix an issue where a proper dest type wasn't assigned.
B) Refactor and fix dests enums values, refactor deletion
function and do proper bookkeeping of dests.
C) Change mlx5_del_flow_rules() to delete rules when there are no
no more rules attached associated with an FTE.
D) Don't call hard coded deletion function but use the node's
defined one.
E) Add a WARN_ON() to catch future bugs when an FTE with dests
is deleted.
* tag 'mlx5-updates-2022-05-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: fs, an FTE should have no dests when deleted
net/mlx5: fs, call the deletion function of the node
net/mlx5: fs, delete the FTE when there are no rules attached to it
net/mlx5: fs, do proper bookkeeping for forward destinations
net/mlx5: fs, add unused destination type
net/mlx5: fs, jump to exit point and don't fall through
net/mlx5: fs, refactor software deletion rule
net/mlx5: fs, split software and IFC flow destination definitions
net/mlx5e: TC, set proper dest type
net/mlx5e: Remove unused mlx5e_dcbnl_build_rep_netdev function
net/mlx5e: Drop error CQE handling from the XSK RX handler
net/mlx5: Print initializing field in case of timeout
net/mlx5: Delete redundant default assignment of runtime devlink params
net/mlx5: Remove useless kfree
net/mlx5: use kvfree() for kvzalloc() in mlx5_ct_fs_smfs_matcher_create
====================
Link: https://lore.kernel.org/r/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 3 May 2022 10:10:51 +0000 (12:10 +0200)]
Merge branch 'mlxsw-remove-size-limitations-on-egress-descriptor-buffer'
Ido Schimmel says:
====================
mlxsw: Remove size limitations on egress descriptor buffer
Petr says:
Spectrum machines have two resources related to keeping packets in an
internal buffer: bytes (allocated in cell-sized units) for packet payload,
and descriptors, for keeping headers. Currently, mlxsw only configures the
bytes part of the resource management.
Spectrum switches permit a full parallel configuration for the descriptor
resources, including port-pool and port-TC-pool quotas. By default, these
are all configured to use pool 14, with an infinite quota. The ingress pool
14 is then infinite in size.
However, egress pool 14 has finite size by default. The size is chip
dependent, but always much lower than what the chip actually permits. As a
result, we can easily construct workloads that exhaust the configured
descriptor limit.
Going forward, mlxsw will have to fix this issue properly by maintaining
descriptor buffer sizes, TC bindings, and quotas that match the
architecture recommendation. Short term, fix the issue by configuring the
egress descriptor pool to be infinite in size as well. This will maintain
the same configuration philosophy, but will unlock all chip resources to be
usable.
In this patchset, patch #1 first adds the "desc" field into the pool
configuration register. Then in patch #2, the new field is used to
configure both ingress and egress pool 14 as infinite.
In patches #3 and #4, add a selftest that verifies that a large burst
can be absorbed by the shared buffer. This test specifically exercises a
scenario where descriptor buffer is the limiting factor and the test
fails without the above patches.
====================
Link: https://lore.kernel.org/r/20220502084926.365268-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Machata [Mon, 2 May 2022 08:49:26 +0000 (11:49 +0300)]
selftests: mlxsw: Add a test for soaking up a burst of traffic
Add a test that sends 1Gbps of traffic through the switch, into which it
then injects a burst of traffic and tests that there are no drops.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Machata [Mon, 2 May 2022 08:49:25 +0000 (11:49 +0300)]
selftests: forwarding: lib: Add start_traffic_pktsize() helpers
Add two helpers, start_traffic_pktsize() and start_tcp_traffic_pktsize(),
that allow explicit overriding of packet size. Change start_traffic() and
start_tcp_traffic() to dispatch through these helpers with the default
packet size.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Machata [Mon, 2 May 2022 08:49:24 +0000 (11:49 +0300)]
mlxsw: Configure descriptor buffers
Spectrum machines have two resources related to keeping packets in an
internal buffer: bytes (allocated in cell-sized units) for packet payload,
and descriptors, for keeping metadata. Currently, mlxsw only configures the
bytes part of the resource management.
Spectrum switches permit a full parallel configuration for the descriptor
resources, including port-pool and port-TC-pool quotas. By default, these
are all configured to use pool 14, with an infinite quota. The ingress pool
14 is then infinite in size.
However, egress pool 14 has finite size by default. The size is chip
dependent, but always much lower than what the chip actually permits. As a
result, we can easily construct workloads that exhaust the configured
descriptor limit.
Fix the issue by configuring the egress descriptor pool to be infinite in
size as well. This will maintain the configuration philosophy of the
default configuration, but will unlock all chip resources to be usable.
In the code, include both the configuration of ingress and ingress, mostly
for clarity.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Machata [Mon, 2 May 2022 08:49:23 +0000 (11:49 +0300)]
mlxsw: reg: Add "desc" field to SBPR
SBPR, or Shared Buffer Pools Register, configures and retrieves the shared
buffer pools and configuration. The desc field determines whether the
configuration relates to the byte pool or the descriptor pool.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 3 May 2022 08:15:09 +0000 (10:15 +0200)]
Merge branch 'use-standard-sysctl-macro'
Tonghao Zhang says:
====================
use standard sysctl macro
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
This patchset introduce sysctl macro or replace var
with macro.
====================
Link: https://lore.kernel.org/r/20220501035524.91205-1-xiangxia.m.yue@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tonghao Zhang [Sun, 1 May 2022 03:55:24 +0000 (11:55 +0800)]
selftests/sysctl: add sysctl macro test
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tonghao Zhang [Sun, 1 May 2022 03:55:23 +0000 (11:55 +0800)]
net: sysctl: introduce sysctl SYSCTL_THREE
This patch introdues the SYSCTL_THREE.
KUnit:
[00:10:14] ================ sysctl_test (10 subtests) =================
[00:10:14] [PASSED] sysctl_test_api_dointvec_null_tbl_data
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_maxlen_unset
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_len_is_zero
[00:10:14] [PASSED] sysctl_test_api_dointvec_table_read_but_position_set
[00:10:14] [PASSED] sysctl_test_dointvec_read_happy_single_positive
[00:10:14] [PASSED] sysctl_test_dointvec_read_happy_single_negative
[00:10:14] [PASSED] sysctl_test_dointvec_write_happy_single_positive
[00:10:14] [PASSED] sysctl_test_dointvec_write_happy_single_negative
[00:10:14] [PASSED] sysctl_test_api_dointvec_write_single_less_int_min
[00:10:14] [PASSED] sysctl_test_api_dointvec_write_single_greater_int_max
[00:10:14] =================== [PASSED] sysctl_test ===================
./run_kselftest.sh -c sysctl
...
ok 1 selftests: sysctl: sysctl.sh
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tonghao Zhang [Sun, 1 May 2022 03:55:22 +0000 (11:55 +0800)]
net: sysctl: use shared sysctl macro
This patch replace two, four and long_one to SYSCTL_XXX.
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Lorenz Bauer <lmb@cloudflare.com>
Cc: Akhmat Karakotov <hmukos@yandex-team.ru>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Mark Bloch [Tue, 22 Mar 2022 12:58:02 +0000 (12:58 +0000)]
net/mlx5: fs, an FTE should have no dests when deleted
When deleting an FTE it should have no dests, which means
fte->dests_size should be 0. Add a WARN_ON() to catch bugs
where the proper tracking wasn't done.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 15 Mar 2022 11:51:55 +0000 (11:51 +0000)]
net/mlx5: fs, call the deletion function of the node
Don't call del_hw_fte() directly, instead use the hardware deletion
function set. This is just a small cleanup and doesn't change anything
as for an FTE the deletion function is already set to del_hw_fte().
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 15 Mar 2022 11:23:40 +0000 (11:23 +0000)]
net/mlx5: fs, delete the FTE when there are no rules attached to it
When an FTE has no children is means all the rules where removed
and the FTE can be deleted regardless of the dests_size value.
While dests_size should be 0 when there are no children
be extra careful not to leak memory or get firmware syndrome
if the proper bookkeeping of dests_size wasn't done.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 15 Mar 2022 11:22:17 +0000 (11:22 +0000)]
net/mlx5: fs, do proper bookkeeping for forward destinations
Keep track after destinations that are forward destinations.
When a forward destinations is removed from an FTE check if
the actions bits need to be updated.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 15 Mar 2022 14:08:23 +0000 (14:08 +0000)]
net/mlx5: fs, add unused destination type
When the caller doesn't pass a destination fs_core will create a unused
rule just so a context can be returned. This unused rule
is zeroed out and its type is 0 which can be mixed up with
MLX5_FLOW_DESTINATION_TYPE_VPORT.
Create a dedicated type to differentiate between the two
named MLX5_FLOW_DESTINATION_TYPE_NONE.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 15 Mar 2022 10:45:00 +0000 (10:45 +0000)]
net/mlx5: fs, jump to exit point and don't fall through
For code clarity and to prevent future bugs make sure to jump
to the exit point once done handling that specific type.
This aligns the code with the rest logic in the function.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 15 Mar 2022 10:44:14 +0000 (10:44 +0000)]
net/mlx5: fs, refactor software deletion rule
When deleting a rule make sure that for every type dests_size is
decreased only once and no other logic is executed.
Without this dests_size might be decreased twice when dests_size == 1
so the if for that type won't be entered and if action has
MLX5_FLOW_CONTEXT_ACTION_FWD_DEST set dests_size will be decreased again.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 22 Mar 2022 09:16:39 +0000 (09:16 +0000)]
net/mlx5: fs, split software and IFC flow destination definitions
Separate flow destinations between software and IFC.
Flow destination type passed by callers was used as the input in
firmware commands and over the years software only types were added
which resulted in mixing between the two.
Create an IFC enum that contains only the flow destinations defined
when talking to the firmware.
Now that there is a proper software only enum for flow destinations
the hardcoded values can be removed as the values are no longer used
in firmware commands.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Mark Bloch [Tue, 22 Mar 2022 16:14:51 +0000 (16:14 +0000)]
net/mlx5e: TC, set proper dest type
Dest type isn't set, this works only because
MLX5_FLOW_DESTINATION_TYPE_VPORT is zero. Set the proper type.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Gal Pressman [Tue, 29 Mar 2022 12:24:33 +0000 (15:24 +0300)]
net/mlx5e: Remove unused mlx5e_dcbnl_build_rep_netdev function
Commit
7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode")
removed the usage of mlx5e_dcbnl_build_rep_netdev() from the driver,
delete the function.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Thu, 20 Jan 2022 09:32:04 +0000 (11:32 +0200)]
net/mlx5e: Drop error CQE handling from the XSK RX handler
This commit removes the redundant check and removes the unused cqe parameter
of skb_from_cqe handlers.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Shay Drory [Tue, 22 Mar 2022 14:55:58 +0000 (16:55 +0200)]
net/mlx5: Print initializing field in case of timeout
Print the initializing field in case of FW couldn't initialize before
timeout. This will help to better understand the root cause in some
cases.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Shay Drory [Wed, 3 Nov 2021 10:18:35 +0000 (12:18 +0200)]
net/mlx5: Delete redundant default assignment of runtime devlink params
Runtime devlink params always read their values from the get() callbacks.
Also, it is an error to set driverinit_value for params which don't
support driverinit cmode. Delete such assignments.
In addition, move the set of default matching mode inside eswitch code.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Haowen Bai [Fri, 22 Apr 2022 06:08:32 +0000 (14:08 +0800)]
net/mlx5: Remove useless kfree
After alloc fail, we do not need to kfree.
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Ziyang Xuan [Wed, 20 Apr 2022 10:36:17 +0000 (18:36 +0800)]
net/mlx5: use kvfree() for kvzalloc() in mlx5_ct_fs_smfs_matcher_create
The memory of spec is allocated with kvzalloc(), the corresponding
release function should not be kfree(), use kvfree() instead.
Generated by: scripts/coccinelle/api/kfree_mismatch.cocci
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Jakub Kicinski [Mon, 2 May 2022 23:04:36 +0000 (16:04 -0700)]
Merge branch 'vsock-virtio-add-support-for-device-suspend-resume'
Stefano Garzarella says:
====================
vsock/virtio: add support for device suspend/resume
Vilas reported that virtio-vsock no longer worked properly after
suspend/resume (echo mem >/sys/power/state).
It was impossible to connect to the host and vice versa.
Indeed, the support has never been implemented.
This series implement .freeze and .restore callbacks of struct virtio_driver
to support device suspend/resume.
The first patch factors our the code to initialize and delete VQs.
The second patch uses that code to support device suspend/resume.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
====================
Link: https://lore.kernel.org/r/20220428132241.152679-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefano Garzarella [Thu, 28 Apr 2022 13:22:41 +0000 (15:22 +0200)]
vsock/virtio: add support for device suspend/resume
Implement .freeze and .restore callbacks of struct virtio_driver
to support device suspend/resume.
During suspension all connected sockets are reset and VQs deleted.
During resume the VQs are re-initialized.
Reported by: Vilas R K <vilas.r.k@intel.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefano Garzarella [Thu, 28 Apr 2022 13:22:40 +0000 (15:22 +0200)]
vsock/virtio: factor our the code to initialize and delete VQs
Add virtio_vsock_vqs_init() and virtio_vsock_vqs_del() with the code
that was in virtio_vsock_probe() and virtio_vsock_remove to initialize
and delete VQs.
These new functions will be used in the next commit to support device
suspend/resume
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sun, 1 May 2022 11:29:53 +0000 (14:29 +0300)]
selftests: forwarding: add Per-Stream Filtering and Policing test for Ocelot
The Felix VSC9959 switch in NXP LS1028A supports the tc-gate action
which enforced time-based access control per stream. A stream as seen by
this switch is identified by {MAC DA, VID}.
We use the standard forwarding selftest topology with 2 host interfaces
and 2 switch interfaces. The host ports must require timestamping non-IP
packets and supporting tc-etf offload, for isochron to work. The
isochron program monitors network sync status (ptp4l, phc2sys) and
deterministically transmits packets to the switch such that the tc-gate
action either (a) always accepts them based on its schedule, or
(b) always drops them.
I tried to keep as much of the logic that isn't specific to the NXP
LS1028A in a new tsn_lib.sh, for future reuse. This covers
synchronization using ptp4l and phc2sys, and isochron.
The cycle-time chosen for this selftest isn't particularly impressive
(and the focus is the functionality of the switch), but I didn't really
know what to do better, considering that it will mostly be run during
debugging sessions, various kernel bloatware would be enabled, like
lockdep, KASAN, etc, and we certainly can't run any races with those on.
I tried to look through the kselftest framework for other real time
applications and didn't really find any, so I'm not sure how better to
prepare the environment in case we want to go for a lower cycle time.
At the moment, the only thing the selftest is ensuring is that dynamic
frequency scaling is disabled on the CPU that isochron runs on. It would
probably be useful to have a blacklist of kernel config options (checked
through zcat /proc/config.gz) and some cyclictest scripts to run
beforehand, but I saw none of those.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20220501112953.3298973-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
jianghaoran [Fri, 29 Apr 2022 05:38:02 +0000 (13:38 +0800)]
ipv6: Don't send rs packets to the interface of ARPHRD_TUNNEL
ARPHRD_TUNNEL interface can't process rs packets
and will generate TX errors
ex:
ip tunnel add ethn mode ipip local 192.168.1.1 remote 192.168.1.2
ifconfig ethn x.x.x.x
ethn: flags=209<UP,POINTOPOINT,RUNNING,NOARP> mtu 1480
inet x.x.x.x netmask 255.255.255.255 destination x.x.x.x
inet6 fe80::5efe:ac1e:3cdb prefixlen 64 scopeid 0x20<link>
tunnel txqueuelen 1000 (IPIP Tunnel)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 3 dropped 0 overruns 0 carrier 0 collisions 0
Signed-off-by: jianghaoran <jianghaoran@kylinos.cn>
Link: https://lore.kernel.org/r/20220429053802.246681-1-jianghaoran@kylinos.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pavel Begunkov [Thu, 28 Apr 2022 10:57:46 +0000 (11:57 +0100)]
tcp: optimise skb_zerocopy_iter_stream()
It's expensive to make a copy of 40B struct iov_iter to the point it
was taking 0.2-0.5% of all cycles in my tests. iov_iter_revert() should
be fine as it's a simple case without nested reverts/truncates.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/a7e1690c00c5dfe700c30eb9a8a81ec59f6545dd.1650884401.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Niels Dossche [Sat, 30 Apr 2022 19:46:56 +0000 (21:46 +0200)]
octeontx2-af: debugfs: fix error return of allocations
Current memory failure code in the debugfs returns -ENOSPC. This is
normally used for indicating that there is no space left on the
device and is not applicable for memory allocation failures.
Replace this with -ENOMEM.
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Link: https://lore.kernel.org/r/20220430194656.44357-1-dossche.niels@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 2 May 2022 21:04:19 +0000 (14:04 -0700)]
Merge branch 'ocelot-stats-improvement'
Colin Foster says:
====================
ocelot stats improvement
A couple of pick-ups after
f187bfa6f35 ("net: ethernet: ocelot: remove
the need for num_stats initializer") - one addresses a warning
patchwork flagged about operator precedence when using macro arguments.
The other is a reduction of unnecessary memory allocation.
====================
Link: https://lore.kernel.org/r/20220430232327.4091825-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Colin Foster [Sat, 30 Apr 2022 23:23:27 +0000 (16:23 -0700)]
net: mscc: ocelot: add missed parentheses around macro argument
Commit
2f187bfa6f35 ("net: ethernet: ocelot: remove the need for num_stats
initializer") added a macro that patchwork warned it lacked parentheses
around an argument. Correct this mistake.
Fixes:
2f187bfa6f35 ("net: ethernet: ocelot: remove the need for num_stats initializer")
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Colin Foster [Sat, 30 Apr 2022 23:23:26 +0000 (16:23 -0700)]
net: mscc: ocelot: remove unnecessary variable
Commit
2f187bfa6f35 ("net: ethernet: ocelot: remove the need for num_stats
initializer") added a flags field to the ocelot stats structure. The same
behavior can be achieved without this additional field taking up extra
memory.
Remove this structure element to free up RAM
Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 2 May 2022 20:57:54 +0000 (13:57 -0700)]
Stefan Schmidt says:
====================
pull-request: ieee802154-next 2022-05-01
Miquel Raynal landed two patch series bundled in this pull request.
The first series re-works the symbol duration handling to better
accommodate the needs of the various phy layers in ieee802154.
In the second series Miquel improves th errors handling from drivers
up mac802154. THis streamlines the error handling throughout the
ieee/mac802154 stack in preparation for sync TX to be introduced for
MLME frames.
====================
Link: https://lore.kernel.org/r/20220501194614.1198325-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Mon, 2 May 2022 13:14:22 +0000 (15:14 +0200)]
Merge branch 'net-more-heap-allocation-and-split-of-rtnl_newlink'
Jakub Kicinski says:
====================
net: more heap allocation and split of rtnl_newlink()
Small refactoring of rtnl_newlink() to fix a stack usage warning
and make the function shorter.
====================
Link: https://lore.kernel.org/r/20220429235508.268349-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Fri, 29 Apr 2022 23:55:08 +0000 (16:55 -0700)]
rtnl: move rtnl_newlink_create()
Pure code move.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Fri, 29 Apr 2022 23:55:07 +0000 (16:55 -0700)]
rtnl: split __rtnl_newlink() into two functions
__rtnl_newlink() is 250LoC, but has a few clear sections.
Move the part which creates a new netdev to a separate
function.
For ease of review code will be moved in the next change.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Fri, 29 Apr 2022 23:55:06 +0000 (16:55 -0700)]
rtnl: allocate more attr tables on the heap
Commit
a293974590cf ("rtnetlink: avoid frame size warning in rtnl_newlink()")
moved to allocating the largest attribute array of rtnl_newlink()
on the heap. Kalle reports the stack has grown above 1k again:
net/core/rtnetlink.c:3557:1: error: the frame size of 1104 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
Move more attrs to the heap, wrap them in a struct.
Don't bother with linkinfo, it's referenced a lot and we take
its size so it's awkward to move, plus it's small (6 elements).
Reported-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Mon, 2 May 2022 11:21:40 +0000 (13:21 +0200)]
Merge branch 'use-mmd-c45-helpers'
Andrew Lunn says:
====================
Use MMD/C45 helpers
MDIO busses can perform two sorts of bus transaction, defined in
clause 22 and clause 45 of 802.3. This results in two register
addresses spaces. The current driver structure for indicating if C22
or C45 should be used is messy, and many C22 only bus drivers will
wrongly interpret a C45 transaction as a C22 transaction.
This patchset is a preparation step to cleanup the situation. It
converts MDIO bus users to make use of existing _mmd and _c45 helpers
to perform accesses to C45 registers. This will later allow C45 and
C22 to be kept separate.
====================
Link: https://lore.kernel.org/r/20220430173037.156823-1-andrew@lunn.ch
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrew Lunn [Sat, 30 Apr 2022 17:30:37 +0000 (19:30 +0200)]
net: pcs: pcs-xpcs: Convert to mdiobus_c45_read
Stop using the helpers to construct a special mdio address which
indicates C45. Instead use the C45 accessors, which will call the
busses C45 specific read/write API.
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrew Lunn [Sat, 30 Apr 2022 17:30:36 +0000 (19:30 +0200)]
net: dsa: sja1105: Convert to mdiobus_c45_read
Stop using the helpers to construct a special phy address which
indicates C45. Instead use the C45 accessors, which will call the
busses C45 specific read/write API.
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrew Lunn [Sat, 30 Apr 2022 17:30:35 +0000 (19:30 +0200)]
net: phy: bcm87xx: Use mmd helpers
Rather than construct special phy device addresses to access C45
registers, use the mmd helpers. These will directly call the C45 API
of the MDIO bus driver.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrew Lunn [Sat, 30 Apr 2022 17:30:34 +0000 (19:30 +0200)]
net: phy: Convert to mdiobus_c45_{read|write}
Stop using the helpers to construct a special phy address which
indicates C45. Instead use the C45 accessors, which will call the
busses C45 specific read/write API.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Andrew Lunn [Sat, 30 Apr 2022 17:30:33 +0000 (19:30 +0200)]
net: phylink: Convert to mdiobus_c45_{read|write}
Stop using the helpers to construct a special phy address which
indicates C45. Instead use the C45 accessors, which will call the
busses C45 specific read/write API.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Mon, 2 May 2022 11:03:50 +0000 (13:03 +0200)]
Merge tag 'linux-can-next-for-5.19-
20220502' of git://git./linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2022-05-02
this is a pull request of 9 patches for net-next/master.
The first patch is by Biju Das and documents renesas,r9a07g043-canfd
support in the renesas,rcar-canfd bindings document.
Jakub Kicinski's patch removes a copy of the NAPI_POLL_WEIGHT define
from the m_can driver.
The last 7 patches all target the ctucanfd driver. Pavel Pisa provides
2 patch which update the documentation. 2 patches by Jiapeng Chong
remove unneeded includes and error messages. And another 3 patches by
Pavel Pisa to further clean up the driver (remove inline keyword,
remove unneeded debug statements, and remove unneeded module parameters).
linux-can-next-for-5.19-
20220502
* tag 'linux-can-next-for-5.19-
20220502' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: ctucanfd: remove PCI module debug parameters
can: ctucanfd: remove debug statements
can: ctucanfd: remove inline keyword from local static functions
can: ctucanfd: ctucan_platform_probe(): remove unnecessary print function dev_err()
can: ctucanfd: remove unused including <linux/version.h>
docs: networking: device drivers: can: ctucanfd: update author e-mail
docs: networking: device drivers: can: add ctucanfd to index
can: m_can: remove a copy of the NAPI_POLL_WEIGHT define
dt-bindings: can: renesas,rcar-canfd: Document RZ/G2UL support
====================
Link: https://lore.kernel.org/r/20220502075914.1905039-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Fei Qin [Sat, 30 Apr 2022 23:11:50 +0000 (08:11 +0900)]
nfp: support VxLAN inner TSO with GSO_PARTIAL offload
VxLAN belongs to UDP-based encapsulation protocol. Inner TSO for VxLAN
packet with udpcsum requires offloading of outer header csum.
The device doesn't support outer header csum offload. However, inner TSO
for VxLAN with udpcsum can still work with GSO_PARTIAL offload, which
means outer udp csum computed by stack and inner tcp segmentation finished
by hardware. Thus, the patch enable features "NETIF_F_GSO_UDP_TUNNEL_CSUM"
and "NETIF_F_GSO_PARTIAL" and set gso_partial_features.
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220430231150.175270-1-simon.horman@corigine.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jaehee Park [Fri, 29 Apr 2022 16:46:58 +0000 (12:46 -0400)]
selftests: net: vrf_strict_mode_test: add support to select a test to run
Add a boilerplate test loop to run all tests in
vrf_strict_mode_test.sh. Add a -t flag that allows a selected test to
run. Remove the vrf_strict_mode_tests function which is now unused.
Signed-off-by: Jaehee Park <jhpark1013@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220429164658.GA656707@jaehee-ThinkPad-X1-Extreme
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Mon, 2 May 2022 08:30:36 +0000 (10:30 +0200)]
Merge branch 'devices-always-netif_f_lltx'
Peilin Ye says:
====================
devices always NETIF_F_LLTX
v1: https://lore.kernel.org/netdev/cover.
1650580763.git.peilin.ye@bytedance.com/
change since v1:
- deleted "depends on patch..." in [1/2]'s commit message
This patchset depends on these fixes [1], which has been merged into
net-next. Since o_seqno is now atomic_t, we can always turn on
NETIF_F_LLTX for [IP6]GRE[TAP] devices, since we no longer need the TX
lock (&txq->_xmit_lock).
We could probably do the same thing to [IP6]ERSPAN devices as well, but
I'm not familiar with them yet. For example, ERSPAN devices are
initialized as |= GRE_FEATURES in erspan_tunnel_init(), but I don't see
IP6ERSPAN devices being initialized as |= GRE6_FEATURES. Where should we
initialize IP6ERSPAN devices' ->features? Please suggest if I'm missing
something, thanks!
[1] https://lore.kernel.org/netdev/cover.
1650575919.git.peilin.ye@bytedance.com/
====================
Link: https://lore.kernel.org/r/cover.1651207788.git.peilin.ye@bytedance.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Peilin Ye [Fri, 29 Apr 2022 05:25:47 +0000 (22:25 -0700)]
ip6_gre: Make IP6GRE and IP6GRETAP devices always NETIF_F_LLTX
Recently we made o_seqno atomic_t. Stop special-casing TUNNEL_SEQ, and
always mark IP6GRE[TAP] devices as NETIF_F_LLTX, since we no longer need
the TX lock (&txq->_xmit_lock).
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Peilin Ye [Fri, 29 Apr 2022 05:25:21 +0000 (22:25 -0700)]
ip_gre: Make GRE and GRETAP devices always NETIF_F_LLTX
Recently we made o_seqno atomic_t. Stop special-casing TUNNEL_SEQ, and
always mark GRE[TAP] devices as NETIF_F_LLTX, since we no longer need
the TX lock (&txq->_xmit_lock).
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pavel Pisa [Sun, 24 Apr 2022 16:28:08 +0000 (18:28 +0200)]
can: ctucanfd: remove PCI module debug parameters
This patch removes the PCI module debug parameters, which are not
needed anymore, to make both checkpatch.pl and patchwork happy.
Link: https://lore.kernel.org/all/1fd684bcf5ddb0346aad234072f54e976a5210fb.1650816929.git.pisa@cmp.felk.cvut.cz
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
[mkl: split into separate patches]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Pisa [Sun, 24 Apr 2022 16:28:08 +0000 (18:28 +0200)]
can: ctucanfd: remove debug statements
This patch removes the debug statements from the driver to make
checkpatch.pl and patchwork happy.
Link: https://lore.kernel.org/all/1fd684bcf5ddb0346aad234072f54e976a5210fb.1650816929.git.pisa@cmp.felk.cvut.cz
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
[mkl: split into separate patches]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Pisa [Sun, 24 Apr 2022 16:28:08 +0000 (18:28 +0200)]
can: ctucanfd: remove inline keyword from local static functions
This patch removes the inline keywords from the local static functions
to make both checkpatch.pl and patchwork happy.
Link: https://lore.kernel.org/all/1fd684bcf5ddb0346aad234072f54e976a5210fb.1650816929.git.pisa@cmp.felk.cvut.cz
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
[mkl: split into separate patches]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Jiapeng Chong [Thu, 21 Apr 2022 20:32:42 +0000 (04:32 +0800)]
can: ctucanfd: ctucan_platform_probe(): remove unnecessary print function dev_err()
The print function dev_err() is redundant because platform_get_irq()
already prints an error.
Eliminate the follow coccicheck warnings:
| drivers/net/can/ctucanfd/ctucanfd_platform.c:67:2-9:
| line 67 is redundant because platform_get_irq() already prints an error.
Link: https://lore.kernel.org/all/20220421203242.7335-1-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Pave Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Jiapeng Chong [Thu, 21 Apr 2022 20:28:52 +0000 (04:28 +0800)]
can: ctucanfd: remove unused including <linux/version.h>
Eliminate the follow versioncheck warning:
| drivers/net/can/ctucanfd/ctucanfd_base.c: 34 linux/version.h not needed.
Link: https://lore.kernel.org/all/20220421202852.2693-1-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Pave Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Pisa [Sun, 24 Apr 2022 16:28:11 +0000 (18:28 +0200)]
docs: networking: device drivers: can: ctucanfd: update author e-mail
This patch updates the author's email address.
Link: https://lore.kernel.org/all/e4396244da6b008c671def9f50bb983a10389863.1650816929.git.pisa@cmp.felk.cvut.cz
Cc: Odrej Ille <ondrej.ille@gmail.com>
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
[mkl: split into separate patches]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Pisa [Sun, 24 Apr 2022 16:28:11 +0000 (18:28 +0200)]
docs: networking: device drivers: can: add ctucanfd to index
This patch adds the ctucanfd-driver document to the index.
Link: https://lore.kernel.org/all/e4396244da6b008c671def9f50bb983a10389863.1650816929.git.pisa@cmp.felk.cvut.cz
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
[mkl: split into separate patches]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Jakub Kicinski [Fri, 29 Apr 2022 17:44:46 +0000 (10:44 -0700)]
can: m_can: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same values in
the drivers just makes refactoring harder.
Link: https://lore.kernel.org/all/20220429174446.196655-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Biju Das [Sat, 23 Apr 2022 13:07:43 +0000 (14:07 +0100)]
dt-bindings: can: renesas,rcar-canfd: Document RZ/G2UL support
Add CANFD binding documentation for Renesas R9A07G043 (RZ/G2UL) SoC.
Link: https://lore.kernel.org/all/20220423130743.123198-1-biju.das.jz@bp.renesas.com
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
David S. Miller [Sun, 1 May 2022 16:45:35 +0000 (17:45 +0100)]
Merge branch 'adin1100-industrial-PHY-support'
Alexandru Tachici says:
====================
net: phy: adin1100: Add initial support for ADIN1100 industrial PHY
The ADIN1100 is a low power single port 10BASE-T1L transceiver designed for
industrial Ethernet applications and is compliant with the IEEE 802.3cg
Ethernet standard for long reach 10 Mb/s Single Pair Ethernet.
The ADIN1100 uses Auto-Negotiation capability in accordance
with IEEE 802.3 Clause 98, providing a mechanism for
exchanging information between PHYs to allow link partners to
agree to a common mode of operation.
The concluded operating mode is the transmit amplitude mode and
master/slave preference common across the two devices.
Both device and LP advertise their ability and request for
increased transmit at:
- BASE-T1 autonegotiation advertisement register [47:32]\
Clause 45.2.7.21 of Standard 802.3
- BIT(13) - 10BASE-T1L High Level Transmit Operating Mode Ability
- BIT(12) - 10BASE-T1L High Level Transmit Operating Mode Request
For 2.4 Vpp (high level transmit) operation, both devices need
to have the High Level Transmit Operating Mode Ability bit set,
and only one of them needs to have the High Level Transmit
Operating Mode Request bit set. Otherwise 1.0 Vpp transmit level
will be used.
Settings for eth1:
Supported ports: [ TP MII ]
Supported link modes: 10baseT1L/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes: 10baseT1L/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Advertised FEC modes: Not reported
Link partner advertised link modes: 10baseT1L/Full
Link partner advertised pause frame use: No
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 10Mb/s
Duplex: Full
Auto-negotiation: on
master-slave cfg: preferred slave
master-slave status: slave
Port: Twisted Pair
PHYAD: 0
Transceiver: external
MDI-X: Unknown
Link detected: yes
SQI: 7/7
1. Add basic support for ADIN1100.
Alexandru Ardelean (1):
net: phy: adin1100: Add initial support for ADIN1100 industrial PHY
1. Added 10baset-T1L link modes.
2. Added 10-BasetT1L registers.
3. Added Base-T1 auto-negotiation registers. For Base-T1 these
registers decide master/slave status and TX voltage of the
device and link partner.
4. Added 10BASE-T1L support in phy-c45.c. Now genphy functions will call
Base-T1 functions where registers don't match, like the auto-negotiation ones.
5. Convert MSE to SQI using a predefined table and allow user access
through ethtool.
6. DT bindings for the 2.4 Vpp transmit mode.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandru Tachici [Fri, 29 Apr 2022 15:34:37 +0000 (18:34 +0300)]
dt-bindings: net: phy: Add 10-baseT1L 2.4 Vpp
Add a tristate property to advertise desired transmit level.
If the device supports the 2.4 Vpp operating mode for 10BASE-T1L,
as defined in 802.3gc, and the 2.4 Vpp transmit voltage operation
is desired, property should be set to 1. This property is used
to select whether Auto-Negotiation advertises a request to
operate the 10BASE-T1L PHY in increased transmit level mode.
If property is set to 1, the PHY shall advertise a request
to operate the 10BASE-T1L PHY in increased transmit level mode.
If property is set to zero, the PHY shall not advertise
a request to operate the 10BASE-T1L PHY in increased transmit level mode.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandru Tachici [Fri, 29 Apr 2022 15:34:36 +0000 (18:34 +0300)]
net: phy: adin1100: Add SQI support
Determine the SQI from MSE using a predefined table
for the 10BASE-T1L.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandru Ardelean [Fri, 29 Apr 2022 15:34:35 +0000 (18:34 +0300)]
net: phy: adin1100: Add initial support for ADIN1100 industrial PHY
The ADIN1100 is a low power single port 10BASE-T1L transceiver designed for
industrial Ethernet applications and is compliant with the IEEE 802.3cg
Ethernet standard for long reach 10 Mb/s Single Pair Ethernet.
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandru Tachici [Fri, 29 Apr 2022 15:34:34 +0000 (18:34 +0300)]
net: phy: Add 10BASE-T1L support in phy-c45
This patch is needed because the BASE-T1 uses different registers
for status, control and advertisement to those already
employed in the existing phy-c45 functions.
Where required, genphy_c45 functions will now check whether
the device supports BASE-T1 and use the specific registers
instead: 45.2.7.19 BASE-T1 AN control register,
45.2.7.20 BASE-T1 AN status, 45.2.7.21 BASE-T1 AN
advertisement register, 45.2.7.22 BASE-T1 AN LP Base
Page ability register, 45.2.1.185 BASE-T1 PMA/PMD control
register.
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandru Tachici [Fri, 29 Apr 2022 15:34:33 +0000 (18:34 +0300)]
net: phy: Add BaseT1 auto-negotiation registers
Added BASE-T1 AN advertisement register (Registers 7.514, 7.515, and
7.516) and BASE-T1 AN LP Base Page ability register (Registers 7.517,
7.518, and 7.519).
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandru Tachici [Fri, 29 Apr 2022 15:34:32 +0000 (18:34 +0300)]
net: phy: Add 10-BaseT1L registers
The 802.3gc specification defines the 10-BaseT1L link
mode for ethernet trafic on twisted wire pair.
PMA status register can be used to detect if the phy supports
2.4 V TX level and PCS control register can be used to
enable/disable PCS level loopback.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandru Tachici [Fri, 29 Apr 2022 15:34:31 +0000 (18:34 +0300)]
ethtool: Add 10base-T1L link mode entry
Add entry for the 10base-T1L full duplex mode.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Fri, 29 Apr 2022 14:32:14 +0000 (16:32 +0200)]
net: dsa: mv88e6xxx: Cosmetic change spaces to tabs in dsa_switch_ops
All but 5 methods in dsa_swith_ops use tabs for indentation.
Change the 5 methods that break this rule.
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasily Averin [Fri, 29 Apr 2022 05:17:35 +0000 (08:17 +0300)]
net: enable memcg accounting for veth queues
veth netdevice defines own rx queues and allocates array containing
up to 4095 ~750-bytes-long 'struct veth_rq' elements. Such allocation
is quite huge and should be accounted to memcg.
Signed-off-by: Vasily Averin <vvs@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 1 May 2022 11:19:01 +0000 (12:19 +0100)]
Merge branch 'UDP-sock_wfree-opts'
Pavel Begunkov says:
====================
UDP sock_wfree optimisations
The series is not UDP specific but that the main beneficiary. 2/3 saves one
atomic in sock_wfree() and on top 3/3 removes an extra barrier.
Tested with UDP over dummy netdev,
2038491 ->
2099071 req/s (or around +3%).
note: in regards to 1/3, there is a "Should agree with poll..." comment
that I don't completely get, and there is no git history to explain it.
Though I can't see how it could rely on having the second check without
racing with tasks woken by wake_up*().
The series was split from a larger patchset, see
https://lore.kernel.org/netdev/cover.
1648981570.git.asml.silence@gmail.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:19 +0000 (11:58 +0100)]
sock: optimise sock_def_write_space barriers
Now we have a separate path for sock_def_write_space() and can go one
step further. When it's called from sock_wfree() we know that there is a
preceding atomic for putting down ->sk_wmem_alloc. We can use it to
replace to replace smb_mb() with a less expensive
smp_mb__after_atomic(). It also removes an extra RCU read lock/unlock as
a small bonus.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:18 +0000 (11:58 +0100)]
sock: optimise UDP sock_wfree() refcounting
For non SOCK_USE_WRITE_QUEUE sockets, sock_wfree() (atomically) puts
->sk_wmem_alloc twice. It's needed to keep the socket alive while
calling ->sk_write_space() after the first put.
However, some sockets, such as UDP, are freed by RCU
(i.e. SOCK_RCU_FREE) and use already RCU-safe sock_def_write_space().
Carve a fast path for such sockets, put down all refs in one go before
calling sock_def_write_space() but guard the socket from being freed
by an RCU read section.
note: because TCP sockets are marked with SOCK_USE_WRITE_QUEUE it
doesn't add extra checks in its path.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:17 +0000 (11:58 +0100)]
sock: dedup sock_def_write_space wmem_alloc checks
Except for minor rounding differences the first ->sk_wmem_alloc test in
sock_def_write_space() is a hand coded version of sock_writeable().
Replace it with the helper, and also kill the following if duplicating
the check.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Hancock [Wed, 27 Apr 2022 19:39:28 +0000 (13:39 -0600)]
net: phy: marvell: update abilities and advertising when switching to SGMII
With some SFP modules, such as Finisar FCLF8522P2BTL, the PHY hardware
strapping defaults to 1000BaseX mode, but the kernel prefers to set them
for SGMII mode. When this happens and the PHY is soft reset, the BMSR
status register is updated, but this happens after the kernel has already
read the PHY abilities during probing. This results in support not being
detected for, and the PHY not advertising support for, 10 and 100 Mbps
modes, preventing the link from working with a non-gigabit link partner.
When the PHY is being configured for SGMII mode, call genphy_read_abilities
again in order to re-read the capabilities, and update the advertising
field accordingly.
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Miquel Raynal [Thu, 28 Apr 2022 16:41:40 +0000 (18:41 +0200)]
net: mac802154: Fix symbol durations
There are two major issues in the logic calculating the symbol durations
based on the page/channel:
- The page number is used in place of the channel value.
- The BIT() macro is missing because we want to check the channel
value against a bitmask.
Fix these two errors and apologize loudly for this mistake.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/r/20220428164140.251965-1-miquel.raynal@bootlin.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Horatiu Vultur [Fri, 29 Apr 2022 07:19:53 +0000 (09:19 +0200)]
net: lan966x: Fix compilation error
Starting from the blamed commit, the lan966x build fails with the
following compilation error:
drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c:342:9: error: implicit declaration of function ‘ptp_find_pin_unlocked’ [-Werror=implicit-function-declaration]
342 | pin = ptp_find_pin_unlocked(phc->clock, PTP_PF_EXTTS, 0);
The issue is that there is no stub function for ptp_find_pin_unlocked
in case CONFIG_PTP_1588_CLOCK is not selected. Therefore add one.
Reported-by: kernel test robot <lkp@intel.com>
Fixes:
f3d8e0a9c28ba0 ("net: lan966x: Add support for PTP_PF_EXTTS")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yu Zhe [Fri, 29 Apr 2022 02:14:04 +0000 (19:14 -0700)]
ipv4: remove unnecessary type castings
remove unnecessary void* type castings.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 29 Apr 2022 17:43:30 +0000 (10:43 -0700)]
eth: remove remaining copies of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
This patch covers three more drivers which I missed in
commit
5f012b40ef63 ("eth: remove copies of the NAPI_POLL_WEIGHT define").
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pengcheng Yang [Fri, 29 Apr 2022 10:32:56 +0000 (18:32 +0800)]
tcp: use tcp_skb_sent_after() instead in RACK
This patch doesn't change any functionality.
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Tested-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minghao Chi [Fri, 29 Apr 2022 09:01:04 +0000 (09:01 +0000)]
net/funeth: simplify the return expression of fun_dl_info_get()
Simplify the return expression.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prabhakar Kushwaha [Sat, 30 Apr 2022 01:05:13 +0000 (04:05 +0300)]
qede: Reduce verbosity of ptp tx timestamp
Reduce verbosity of ptp tx timestamp error to reduce excessive log
messages.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Foster [Fri, 29 Apr 2022 21:30:36 +0000 (14:30 -0700)]
net: ethernet: ocelot: remove the need for num_stats initializer
There is a desire to share the oclot_stats_layout struct outside of the
current vsc7514 driver. In order to do so, the length of the array needs to
be known at compile time, and defined in the struct ocelot and struct
felix_info.
Since the array is defined in a .c file and would be declared in the header
file via:
extern struct ocelot_stat_layout[];
the size of the array will not be known at compile time to outside modules.
To fix this, remove the need for defining the number of stats at compile
time and allow this number to be determined at initialization.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 30 Apr 2022 01:15:23 +0000 (18:15 -0700)]
tcp: drop skb dst in tcp_rcv_established()
In commit
f84af32cbca7 ("net: ip_queue_rcv_skb() helper")
I dropped the skb dst in tcp_data_queue().
This only dealt with so-called TCP input slow path.
When fast path is taken, tcp_rcv_established() calls
tcp_queue_rcv() while skb still has a dst.
This was mostly fine, because most dsts at this point
are not refcounted (thanks to early demux)
However, TCP packets sent over loopback have refcounted dst.
Then commit
68822bdf76f1 ("net: generalize skb freeing
deferral to per-cpu lists") came and had the effect
of delaying skb freeing for an arbitrary time.
If during this time the involved netns is dismantled, cleanup_net()
frees the struct net with embedded net->ipv6.ip6_dst_ops.
Then when eventually dst_destroy_rcu() is called,
if (dst->ops->destroy) ... triggers an use-after-free.
It is not clear if ip6_route_net_exit() lacks a rcu_barrier()
as syzbot reported similar issues before the blamed commit.
( https://groups.google.com/g/syzkaller-bugs/c/CofzW4eeA9A/m/009WjumTAAAJ )
Fixes:
68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 30 Apr 2022 12:09:26 +0000 (13:09 +0100)]
Merge branch 'lan966x-phy-reset-remove'
Michael Walle says:
====================
net: lan966x: remove PHY reset support
Remove the unneeded PHY reset node as well as the driver support for it.
This was already discussed [1] and I expect Microchip to Ack on this
removal. Since there is no user, no breakage is expected.
I'm not sure it this should go through net or net-next and if the patches
should have a Fixes: tag or not. In upstream linux there was never any user
of it, so there is no bug to be fixed. But OTOH if the schema fix isn't
backported, then there might be an older schema version still containing
the reset node. Thoughts?
The patches needed for the GPIO part are just waiting to be picked up by
Linus [2,3]. This patch and the GPIO parts are the last pieces of the
puzzle to get ethernet working on the LAN9668 on upstream linux.
[1] https://lore.kernel.org/netdev/
20220330110210.
3374165-1-michael@walle.cc/
[2] https://lore.kernel.org/linux-gpio/CACRpkdbxmN+SWt95aGHjA2ZGnN61aWaA7c5S4PaG+WePAj=htg@mail.gmail.com/
[3] https://lore.kernel.org/linux-gpio/
20220420191926.
3411830-1-michael@walle.cc/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Walle [Thu, 28 Apr 2022 11:40:49 +0000 (13:40 +0200)]
net: lan966x: remove PHY reset support
The PHY subsystem as well as the MIIM mdio driver (in case of the
integrated PHYs) will take care of the resets. A separate reset driver
isn't needed. There is no in-tree user of this feature. Remove the
support.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Walle [Thu, 28 Apr 2022 11:40:48 +0000 (13:40 +0200)]
dt-bindings: net: lan966x: remove PHY reset
The PHY reset was intended to be a phandle for a special PHY reset
driver for the integrated PHYs as well as any external PHYs. It turns
out, that the culprit is how the reset of the switch device is done.
In particular, the switch reset also affects other subsystems like
the GPIO and the SGPIO block and it happens to be the case that the
reset lines of the external PHYs are connected to a common GPIO line.
Thus as soon as the switch issues a reset during probe time, all the
external PHYs will go into reset because all the GPIO lines will
switch to input and the pull-down on that signal will take effect.
So even if there was a special PHY reset driver, it (1) won't fix
the root cause of the problem and (2) it won't fix all the other
consumers of GPIO lines which will also be reset.
It turns out, the Ocelot SoC has the same weird behavior (or the
lack of a dedicated switch reset) and there the problem is already
solved and all the bits and pieces are already there and this PHY
reset property isn't not needed at all.
There are no users of this binding. Just remove it.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 30 Apr 2022 11:58:45 +0000 (12:58 +0100)]
Merge branch 'ipv6-net-opts'
Pavel Begunkov says:
====================
generic net and ipv6 minor optimisations
1-3 inline simple functions that only reshuffle arguments possibly adding
extra zero args, and call another function. It was benchmarked before with
a bunch of extra patches, see for details
https://lore.kernel.org/netdev/cover.
1648981570.git.asml.silence@gmail.com/
It may increase the binary size, but it's the right thing to do and at least
without modules it actually sheds some bytes for some standard-ish config.
text data bss dec hex filename
9627200 0 0
9627200 92e640 ./arch/x86_64/boot/bzImage
text data bss dec hex filename
9627104 0 0
9627104 92e5e0 ./arch/x86_64/boot/bzImage
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:48 +0000 (11:58 +0100)]
ipv6: refactor ip6_finish_output2()
Throw neigh checks in ip6_finish_output2() under a single slow path if,
so we don't have the overhead in the hot path.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:47 +0000 (11:58 +0100)]
ipv6: help __ip6_finish_output() inlining
There are two callers of __ip6_finish_output(), both are in
ip6_finish_output(). We can combine the call sites into one and handle
return code after, that will inline __ip6_finish_output().
Note, error handling under NET_XMIT_CN will only return 0 if
__ip6_finish_output() succeded, and in this case it return 0.
Considering that NET_XMIT_SUCCESS is 0, it'll be returning exactly the
same result for it as before.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:46 +0000 (11:58 +0100)]
net: inline dev_queue_xmit()
Inline dev_queue_xmit() and dev_queue_xmit_accel(), they both are small
proxy functions doing nothing but redirecting the control flow to
__dev_queue_xmit().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:45 +0000 (11:58 +0100)]
net: inline skb_zerocopy_iter_dgram
skb_zerocopy_iter_dgram() is a small proxy function, inline it. For
that, move __zerocopy_sg_from_iter into linux/skbuff.h
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Begunkov [Thu, 28 Apr 2022 10:58:44 +0000 (11:58 +0100)]
net: inline sock_alloc_send_skb
sock_alloc_send_skb() is simple and just proxying to another function,
so we can inline it and cut associated overhead.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 30 Apr 2022 02:12:05 +0000 (19:12 -0700)]
Merge branch 'tcp-pass-back-data-left-in-socket-after-receive' of git://git./linux/kernel/git/kuba/linux
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jens Axboe [Fri, 29 Apr 2022 00:45:06 +0000 (18:45 -0600)]
tcp: pass back data left in socket after receive
This is currently done for CMSG_INQ, add an ability to do so via struct
msghdr as well and have CMSG_INQ use that too. If the caller sets
msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq.
Rearrange struct msghdr a bit so we can add this member while shrinking
it at the same time. On a 64-bit build, it was 96 bytes before this
change and 88 bytes afterwards.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/650c22ca-cffc-0255-9a05-2413a1e20826@kernel.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yinjun Zhang [Fri, 29 Apr 2022 07:51:24 +0000 (09:51 +0200)]
nfp: flower: utilize the tuple iifidx in offloading ct flows
The device info from which conntrack originates is stored in metadata
field of the ct flow to offload now, driver can utilize it to reduce
the number of offloaded flows.
v2: Drop inline keyword from get_netdev_from_rule() signature.
The compiler can decide.
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220429075124.128589-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pieter Jansen van Vuuren [Thu, 28 Apr 2022 11:39:33 +0000 (12:39 +0100)]
sfc: add EF100 VF support via a write to sriov_numvfs
This patch extends the EF100 PF driver by adding .sriov_configure()
which would allow users to enable and disable virtual functions
using the sriov sysfs.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/75e74d9e-14ce-0524-9668-5ab735a7cf62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>