linux-2.6-microblaze.git
3 years agonet/ipv4: switch do_ip_setsockopt to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:58 +0000 (08:08 +0200)]
net/ipv4: switch do_ip_setsockopt to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ipv4: merge ip_options_get and ip_options_get_from_user
Christoph Hellwig [Thu, 23 Jul 2020 06:08:57 +0000 (08:08 +0200)]
net/ipv4: merge ip_options_get and ip_options_get_from_user

Use the sockptr_t type to merge the versions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ipv4: switch ip_mroute_setsockopt to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:56 +0000 (08:08 +0200)]
net/ipv4: switch ip_mroute_setsockopt to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobpfilter: switch bpfilter_ip_set_sockopt to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:55 +0000 (08:08 +0200)]
bpfilter: switch bpfilter_ip_set_sockopt to sockptr_t

This is mostly to prepare for cleaning up the callers, as bpfilter by
design can't handle kernel pointers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetfilter: switch nf_setsockopt to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:54 +0000 (08:08 +0200)]
netfilter: switch nf_setsockopt to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetfilter: switch xt_copy_counters to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:53 +0000 (08:08 +0200)]
netfilter: switch xt_copy_counters to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonetfilter: remove the unused user argument to do_update_counters
Christoph Hellwig [Thu, 23 Jul 2020 06:08:52 +0000 (08:08 +0200)]
netfilter: remove the unused user argument to do_update_counters

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/xfrm: switch xfrm_user_policy to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:51 +0000 (08:08 +0200)]
net/xfrm: switch xfrm_user_policy to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: switch sock_set_timeout to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:50 +0000 (08:08 +0200)]
net: switch sock_set_timeout to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: switch sock_set_timeout to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:49 +0000 (08:08 +0200)]
net: switch sock_set_timeout to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: switch sock_setbindtodevice to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:48 +0000 (08:08 +0200)]
net: switch sock_setbindtodevice to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: switch copy_bpf_fprog_from_user to sockptr_t
Christoph Hellwig [Thu, 23 Jul 2020 06:08:47 +0000 (08:08 +0200)]
net: switch copy_bpf_fprog_from_user to sockptr_t

Pass a sockptr_t to prepare for set_fs-less handling of the kernel
pointer from bpf-cgroup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: add a new sockptr_t type
Christoph Hellwig [Thu, 23 Jul 2020 06:08:46 +0000 (08:08 +0200)]
net: add a new sockptr_t type

Add a uptr_t type that can hold a pointer to either a user or kernel
memory region, and simply helpers to copy to and from it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobpfilter: reject kernel addresses
Christoph Hellwig [Thu, 23 Jul 2020 06:08:45 +0000 (08:08 +0200)]
bpfilter: reject kernel addresses

The bpfilter user mode helper processes the optval address using
process_vm_readv.  Don't send it kernel addresses fed under
set_fs(KERNEL_DS) as that won't work.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/bpfilter: split __bpfilter_process_sockopt
Christoph Hellwig [Thu, 23 Jul 2020 06:08:44 +0000 (08:08 +0200)]
net/bpfilter: split __bpfilter_process_sockopt

Split __bpfilter_process_sockopt into a low-level send request routine and
the actual setsockopt hook to split the init time ping from the actual
setsockopt processing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobpfilter: fix up a sparse annotation
Christoph Hellwig [Thu, 23 Jul 2020 06:08:43 +0000 (08:08 +0200)]
bpfilter: fix up a sparse annotation

The __user doesn't make sense when casting to an integer type, just
switch to a uintptr_t cast which also removes the need for the __force.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'TC-datapath-hash-api'
David S. Miller [Fri, 24 Jul 2020 22:23:31 +0000 (15:23 -0700)]
Merge branch 'TC-datapath-hash-api'

Ariel Levkovich says:

====================
TC datapath hash api

Hash based packet classification allows user to set up rules that
provide load balancing of traffic across multiple vports and
for ECMP path selection while keeping the number of rule at minimum.

Instead of matching on exact flow spec, which requires a rule per
flow, user can define rules based on a their hash value and distribute
the flows to different buckets. The number of rules
in this case will be constant and equal to the number of buckets.

The series introduces an extention to the cls flower classifier
and allows user to add rules that match on the hash value that
is stored in skb->hash while assuming the value was set prior to
the classification.

Setting the skb->hash can be done in various ways and is not defined
in this series - for example:
1. By the device driver upon processing an rx packet.
2. Using tc action bpf with a program which computes and sets the
skb->hash value.

$ tc filter add dev ens1f0_0 ingress \
prio 1 chain 2 proto ip \
flower hash 0x0/0xf  \
action mirred egress redirect dev ens1f0_1

$ tc filter add dev ens1f0_0 ingress \
prio 1 chain 2 proto ip \
flower hash 0x1/0xf  \
action mirred egress redirect dev ens1f0_2

v3 -> v4:
 *Drop hash setting code leaving only the classidication parts.
  Setting the hash will be possible via existing tc action bpf.

v2 -> v3:
 *Split hash algorithm option into 2 different actions.
  Asym_l4 available via act_skbedit and bpf via new act_hash.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/sched: cls_flower: Add hash info to flow classification
Ariel Levkovich [Wed, 22 Jul 2020 22:03:01 +0000 (01:03 +0300)]
net/sched: cls_flower: Add hash info to flow classification

Adding new cls flower keys for hash value and hash
mask and dissect the hash info from the skb into
the flow key towards flow classication.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/flow_dissector: add packet hash dissection
Ariel Levkovich [Wed, 22 Jul 2020 22:03:00 +0000 (01:03 +0300)]
net/flow_dissector: add packet hash dissection

Retreive a hash value from the SKB and store it
in the dissector key for future matching.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hyperv: dump TX indirection table to ethtool regs
Chi Song [Fri, 24 Jul 2020 04:14:26 +0000 (21:14 -0700)]
net: hyperv: dump TX indirection table to ethtool regs

An imbalanced TX indirection table causes netvsc to have low
performance. This table is created and managed during runtime. To help
better diagnose performance issues caused by imbalanced tables, it needs
make TX indirection tables visible.

Because TX indirection table is driver specified information, so
display it via ethtool register dump.

Signed-off-by: Chi Song <chisong@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agovrf: Handle CONFIG_SYSCTL not set
David Ahern [Thu, 23 Jul 2020 23:23:09 +0000 (17:23 -0600)]
vrf: Handle CONFIG_SYSCTL not set

Randy reported compile failure when CONFIG_SYSCTL is not set/enabled:

ERROR: modpost: "sysctl_vals" [drivers/net/vrf.ko] undefined!

Fix by splitting out the sysctl init and cleanup into helpers that
can be set to do nothing when CONFIG_SYSCTL is disabled. In addition,
move vrf_strict_mode and vrf_strict_mode_change to above
vrf_shared_table_handler (code move only) and wrap all of it
in the ifdef CONFIG_SYSCTL.

Update the strict mode tests to check for the existence of the
/proc/sys entry.

Fixes: 33306f1aaf82 ("vrf: add sysctl parameter for strict mode")
Cc: Andrea Mayer <andrea.mayer@uniroma2.it>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: stop overriding master's ndo_get_phys_port_name
Vladimir Oltean [Wed, 22 Jul 2020 22:43:12 +0000 (01:43 +0300)]
net: dsa: stop overriding master's ndo_get_phys_port_name

The purpose of this override is to give the user an indication of what
the number of the CPU port is (in DSA, the CPU port is a hardware
implementation detail and not a network interface capable of traffic).

However, it has always failed (by design) at providing this information
to the user in a reliable fashion.

Prior to commit 3369afba1e46 ("net: Call into DSA netdevice_ops
wrappers"), the behavior was to only override this callback if it was
not provided by the DSA master.

That was its first failure: if the DSA master itself was a DSA port or a
switchdev, then the user would not see the number of the CPU port in
/sys/class/net/eth0/phys_port_name, but the number of the DSA master
port within its respective physical switch.

But that was actually ok in a way. The commit mentioned above changed
that behavior, and now overrides the master's ndo_get_phys_port_name
unconditionally. That comes with problems of its own, which are worse in
a way.

The idea is that it's typical for switchdev users to have udev rules for
consistent interface naming. These are based, among other things, on
the phys_port_name attribute. If we let the DSA switch at the bottom
to start randomly overriding ndo_get_phys_port_name with its own CPU
port, we basically lose any predictability in interface naming, or even
uniqueness, for that matter.

So, there are reasons to let DSA override the master's callback (to
provide a consistent interface, a number which has a clear meaning and
must not be interpreted according to context), and there are reasons to
not let DSA override it (it breaks udev matching for the DSA master).

But, there is an alternative method for users to retrieve the number of
the CPU port of each DSA switch in the system:

  $ devlink port
  pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0
  pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2
  pci/0000:00:00.5/4: type notset flavour cpu port 4
  spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0
  spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1
  spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2
  spi/spi2.0/4: type notset flavour cpu port 4
  spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0
  spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1
  spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2
  spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3
  spi/spi2.1/4: type notset flavour cpu port 4

So remove this duplicated, unreliable and troublesome method. From this
patch on, the phys_port_name attribute of the DSA master will only
contain information about itself (if at all). If the users need reliable
information about the CPU port they're probably using devlink anyway.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agocxgb4: add loopback ethtool self-test
Vishal Kulkarni [Thu, 23 Jul 2020 12:49:50 +0000 (18:19 +0530)]
cxgb4: add loopback ethtool self-test

In this test, loopback pkt is created and sent on default queue.
The packet goes until the Multi Port Switch (MPS) just before
the MAC and based on the specified channel number, it either
goes outside the wire on one of the physical ports or looped
back to Rx path by MPS. In this case, we're specifying loopback
channel, instead of physical ports, so the packet gets looped
back to Rx path, instead of getting transmitted on the wire.

v3:
- Modify commit message to include test details.
v2:
- Add only loopback self-test.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'l2tp-further-checkpatch-pl-cleanups'
David S. Miller [Thu, 23 Jul 2020 18:54:41 +0000 (11:54 -0700)]
Merge branch 'l2tp-further-checkpatch-pl-cleanups'

Tom Parkin says:

====================
l2tp: further checkpatch.pl cleanups

l2tp hasn't been kept up to date with the static analysis checks offered
by checkpatch.pl.

This patchset builds on the series "l2tp: cleanup checkpatch.pl
warnings".  It includes small refactoring changes which improve code
quality and resolve a subset of the checkpatch warnings for the l2tp
codebase.
====================

Reviewed-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup kzalloc calls
Tom Parkin [Thu, 23 Jul 2020 11:29:55 +0000 (12:29 +0100)]
l2tp: cleanup kzalloc calls

Passing "sizeof(struct blah)" in kzalloc calls is less readable,
potentially prone to future bugs if the type of the pointer is changed,
and triggers checkpatch warnings.

Tweak the kzalloc calls in l2tp which use this form to avoid the
warning.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup netlink tunnel create address handling
Tom Parkin [Thu, 23 Jul 2020 11:29:54 +0000 (12:29 +0100)]
l2tp: cleanup netlink tunnel create address handling

When creating an L2TP tunnel using the netlink API, userspace must
either pass a socket FD for the tunnel to use (for managed tunnels),
or specify the tunnel source/destination address (for unmanaged
tunnels).

Since source/destination addresses may be AF_INET or AF_INET6, the l2tp
netlink code has conditionally compiled blocks to support IPv6.

Rather than embedding these directly into l2tp_nl_cmd_tunnel_create
(where it makes the code difficult to read and confuses checkpatch to
boot) split the handling of address-related attributes into a separate
function.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup netlink send of tunnel address information
Tom Parkin [Thu, 23 Jul 2020 11:29:53 +0000 (12:29 +0100)]
l2tp: cleanup netlink send of tunnel address information

l2tp_nl_tunnel_send has conditionally compiled code to support AF_INET6,
which makes the code difficult to follow and triggers checkpatch
warnings.

Split the code out into functions to handle the AF_INET v.s. AF_INET6
cases, which both improves readability and resolves the checkpatch
warnings.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: check socket address type in l2tp_dfs_seq_tunnel_show
Tom Parkin [Thu, 23 Jul 2020 11:29:52 +0000 (12:29 +0100)]
l2tp: check socket address type in l2tp_dfs_seq_tunnel_show

checkpatch warns about indentation and brace balancing around the
conditionally compiled code for AF_INET6 support in
l2tp_dfs_seq_tunnel_show.

By adding another check on the socket address type we can make the code
more readable while removing the checkpatch warning.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup unnecessary braces in if statements
Tom Parkin [Thu, 23 Jul 2020 11:29:51 +0000 (12:29 +0100)]
l2tp: cleanup unnecessary braces in if statements

These checks are all simple and don't benefit from extra braces to
clarify intent.  Remove them for easier-reading code.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup comparisons to NULL
Tom Parkin [Thu, 23 Jul 2020 11:29:50 +0000 (12:29 +0100)]
l2tp: cleanup comparisons to NULL

checkpatch warns about comparisons to NULL, e.g.

        CHECK: Comparison to NULL could be written "!rt"
        #474: FILE: net/l2tp/l2tp_ip.c:474:
        +       if (rt == NULL) {

These sort of comparisons are generally clearer and more readable
the way checkpatch suggests, so update l2tp accordingly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet/ncsi: use eth_zero_addr() to clear mac address
Miaohe Lin [Thu, 23 Jul 2020 11:13:43 +0000 (19:13 +0800)]
net/ncsi: use eth_zero_addr() to clear mac address

Use eth_zero_addr() to clear mac address insetad of memset().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agocxgb4: use eth_zero_addr() to clear mac address
Miaohe Lin [Thu, 23 Jul 2020 11:05:00 +0000 (19:05 +0800)]
cxgb4: use eth_zero_addr() to clear mac address

Use eth_zero_addr() to clear mac address insetad of memset().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mptcp-non-backup-subflows-pre-reqs'
David S. Miller [Thu, 23 Jul 2020 18:47:25 +0000 (11:47 -0700)]
Merge branch 'mptcp-non-backup-subflows-pre-reqs'

Paolo Abeni says:

====================
mptcp: non backup subflows pre-reqs

This series contains a bunch of MPTCP improvements loosely related to
concurrent subflows xmit usage, currently under development.

The first 3 patches are actually bugfixes for issues that will become apparent
as soon as we will enable the above feature.

The later patches improve the handling of incoming additional subflows,
improving significantly the performances in stress tests based on a high new
connection rate.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosubflow: introduce and use mptcp_can_accept_new_subflow()
Paolo Abeni [Thu, 23 Jul 2020 11:02:36 +0000 (13:02 +0200)]
subflow: introduce and use mptcp_can_accept_new_subflow()

So that we can easily perform some basic PM-related
adimission checks before creating the child socket.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosubflow: use rsk_ops->send_reset()
Paolo Abeni [Thu, 23 Jul 2020 11:02:35 +0000 (13:02 +0200)]
subflow: use rsk_ops->send_reset()

tcp_send_active_reset() is more prone to transient errors
(memory allocation or xmit queue full): in stress conditions
the kernel may drop the egress packet, and the client will be
stuck.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosubflow: explicitly check for plain tcp rsk
Paolo Abeni [Thu, 23 Jul 2020 11:02:34 +0000 (13:02 +0200)]
subflow: explicitly check for plain tcp rsk

When syncookie are in use, the TCP stack may feed into
subflow_syn_recv_sock() plain TCP request sockets. We can't
access mptcp_subflow_request_sock-specific fields on such
sockets. Explicitly check the rsk ops to do safe accesses.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: cleanup subflow_finish_connect()
Paolo Abeni [Thu, 23 Jul 2020 11:02:33 +0000 (13:02 +0200)]
mptcp: cleanup subflow_finish_connect()

The mentioned function has several unneeded branches,
handle each case - MP_CAPABLE, MP_JOIN, fallback -
under a single conditional and drop quite a bit of
duplicate code.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: explicitly track the fully established status
Paolo Abeni [Thu, 23 Jul 2020 11:02:32 +0000 (13:02 +0200)]
mptcp: explicitly track the fully established status

Currently accepted msk sockets become established only after
accept() returns the new sk to user-space.

As MP_JOIN request are refused as per RFC spec on non fully
established socket, the above causes mp_join self-tests
instabilities.

This change lets the msk entering the established status
as soon as it receives the 3rd ack and propagates the first
subflow fully established status on the msk socket.

Finally we can change the subflow acceptance condition to
take in account both the sock state and the msk fully
established flag.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: mark as fallback even early ones
Paolo Abeni [Thu, 23 Jul 2020 11:02:31 +0000 (13:02 +0200)]
mptcp: mark as fallback even early ones

In the unlikely event of a failure at connect time,
we currently clear the request_mptcp flag - so that
the MPC handshake is not started at all, but the msk
is not explicitly marked as fallback.

This would lead to later insertion of wrong DSS options
in the xmitted packets, in violation of RFC specs and
possibly fooling the peer.

Fixes: e1ff9e82e2ea ("net: mptcp: improve fallback to TCP")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: avoid data corruption on reinsert
Paolo Abeni [Thu, 23 Jul 2020 11:02:30 +0000 (13:02 +0200)]
mptcp: avoid data corruption on reinsert

When updating a partially acked data fragment, we
actually corrupt it. This is irrelevant till we send
data on a single subflow, as retransmitted data, if
any are discarded by the peer as duplicate, but it
will cause data corruption as soon as we will start
creating non backup subflows.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosubflow: always init 'rel_write_seq'
Paolo Abeni [Thu, 23 Jul 2020 11:02:29 +0000 (13:02 +0200)]
subflow: always init 'rel_write_seq'

Currently we do not init the subflow write sequence for
MP_JOIN subflows. This will cause bad mapping being
generated as soon as we will use non backup subflow.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosfc: convert to new udp_tunnel infrastructure
Jakub Kicinski [Wed, 22 Jul 2020 19:05:10 +0000 (12:05 -0700)]
sfc: convert to new udp_tunnel infrastructure

Check MC_CMD_DRV_ATTACH_EXT_OUT_FLAG_TRUSTED, before setting
the info, which will hopefully protect us from -EPERM errors
the previous code was gracefully ignoring. Ed reports this
is not the 100% correct bit, but it's the best approximation
we have. Shared code reports the port information back to user
space, so we really want to know what was added and what failed.
Ignoring -EPERM is not an option.

The driver does not call udp_tunnel_get_rx_info(), so its own
management of table state is not really all that problematic,
we can leave it be. This allows the driver to continue with its
copious table syncing, and matching the ports to TX frames,
which it will reportedly do one day.

Leave the feature checking in the callbacks, as the device may
remove the capabilities on reset.

Inline the loop from __efx_ef10_udp_tnl_lookup_port() into
efx_ef10_udp_tnl_has_port(), since it's the only caller now.

With new infra this driver gains port replace - when space frees
up in a full table a new port will be selected for offload.
Plus efx will no longer sleep in an atomic context.

v2:
 - amend the commit message about TRUSTED not being 100%
 - add TUNNEL_ENCAP_UDP_PORT_ENTRY_INVALID to mark unsed
   entries

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-By: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'qed-qede-improve-chain-API-and-add-XDP_REDIRECT-support'
David S. Miller [Thu, 23 Jul 2020 01:19:03 +0000 (18:19 -0700)]
Merge branch 'qed-qede-improve-chain-API-and-add-XDP_REDIRECT-support'

Alexander Lobakin says:

====================
qed, qede: improve chain API and add XDP_REDIRECT support

This series adds missing XDP_REDIRECT case handling in QLogic Everest
Ethernet driver with all necessary prerequisites and ops.
QEDE Tx relies heavily on chain API, so make sure it is in its best
at first.

v2 (from [1]):
 - add missing includes to #003 to pass the build on Alpha;
 - no functional changes.

[1] https://lore.kernel.org/netdev/20200722155349.747-1-alobakin@marvell.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqede: add .ndo_xdp_xmit() and XDP_REDIRECT support
Alexander Lobakin [Wed, 22 Jul 2020 22:10:45 +0000 (01:10 +0300)]
qede: add .ndo_xdp_xmit() and XDP_REDIRECT support

Add XDP_REDIRECT case handling and the corresponding NDO to support
redirecting XDP frames. This also includes registering driver memory
model (currently order-0 page mode) in BPF subsystem.
The total number of XDP queues is usually 1:1 with Rx ones.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqede: refactor XDP Tx processing
Alexander Lobakin [Wed, 22 Jul 2020 22:10:44 +0000 (01:10 +0300)]
qede: refactor XDP Tx processing

Current XDP Tx logic is suboptimal and can't be reused for XDP_REDIRECT
path.
Make qede_xdp_{tx_int,xmit}() more universal and effective in general to
allow future expanding.

Misc: use unlikely() hints where appropriate and replace "fallthrough"
comments with pseudo-keywords.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqede: reformat net_device_ops declarations
Alexander Lobakin [Wed, 22 Jul 2020 22:10:43 +0000 (01:10 +0300)]
qede: reformat net_device_ops declarations

Correct the indentation of net_device_ops declarations for fancier look.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqede: reformat several structures in "qede.h"
Alexander Lobakin [Wed, 22 Jul 2020 22:10:42 +0000 (01:10 +0300)]
qede: reformat several structures in "qede.h"

Make the file more readable and easier for adding new fields.

Misc: use IFNAMSIZ and netdev_name() instead of sizeof_field()
and direct net_device::name dereferencing.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: introduce qed_chain_get_elem_used{,u32}()
Alexander Lobakin [Wed, 22 Jul 2020 22:10:41 +0000 (01:10 +0300)]
qed: introduce qed_chain_get_elem_used{,u32}()

Add reverse-variants of qed_chain_get_elem_left{,u32}() to be able to
know current chain occupation. They will be used in the upcoming qede
XDP_REDIRECT code.
They share most of the logics with the mentioned ones, so were reused
to collapse the latters.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: optimize common chain accessors
Alexander Lobakin [Wed, 22 Jul 2020 22:10:40 +0000 (01:10 +0300)]
qed: optimize common chain accessors

Constify chain pointers and refactor qed_chain_get_elem_left{,u32}() a bit.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: add support for different page sizes for chains
Alexander Lobakin [Wed, 22 Jul 2020 22:10:39 +0000 (01:10 +0300)]
qed: add support for different page sizes for chains

Extend current infrastructure to store chain page size in a struct
and use it in all functions instead of fixed QED_CHAIN_PAGE_SIZE.
Its value remains the default one, but can be overridden in
qed_chain_init_params before chain allocation.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: simplify chain allocation with init params struct
Alexander Lobakin [Wed, 22 Jul 2020 22:10:38 +0000 (01:10 +0300)]
qed: simplify chain allocation with init params struct

To simplify qed_chain_alloc() prototype and call sites, introduce struct
qed_chain_init_params to specify chain params, and pass a pointer to
filled struct to the actual qed_chain_alloc() instead of a long list
of separate arguments.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: simplify initialization of the chains with an external PBL
Alexander Lobakin [Wed, 22 Jul 2020 22:10:37 +0000 (01:10 +0300)]
qed: simplify initialization of the chains with an external PBL

Fill PBL table parameters for chains with an external PBL data earlier on
qed_chain_init_params() rather than on allocation itself. This simplifies
allocation code and allows to extend struct ext_pbl for other chain types.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: move chain initialization inlines next to allocation functions
Alexander Lobakin [Wed, 22 Jul 2020 22:10:36 +0000 (01:10 +0300)]
qed: move chain initialization inlines next to allocation functions

qed_chain_init*() are used in one file/place on "cold" path only, so they
can be uninlined and moved next to the call sites.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: sanitize PBL chains allocation
Alexander Lobakin [Wed, 22 Jul 2020 22:10:35 +0000 (01:10 +0300)]
qed: sanitize PBL chains allocation

PBL chain elements are actually DMA addresses stored in __le64, but
currently their size is hardcoded to 8, and DMA addresses are assigned
via cast to variable-sized dma_addr_t without any bitwise conversions.
Change the type of pbl_virt array to match the actual one, add a new
field to store the size of allocated DMA memory and sanitize elements
assignment.

Misc: give more logic names to the members of qed_chain::pbl_sp embedded
struct.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: prevent possible double-frees of the chains
Alexander Lobakin [Wed, 22 Jul 2020 22:10:34 +0000 (01:10 +0300)]
qed: prevent possible double-frees of the chains

Zero-initialize chain on qed_chain_free(), so it couldn't be freed
twice and provoke undefined behaviour.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: move chain methods to a separate file
Alexander Lobakin [Wed, 22 Jul 2020 22:10:33 +0000 (01:10 +0300)]
qed: move chain methods to a separate file

Move chain allocation/freeing functions to a new file to not mix it with
hardware-related code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: reformat Makefile
Alexander Lobakin [Wed, 22 Jul 2020 22:10:32 +0000 (01:10 +0300)]
qed: reformat Makefile

List one entry per line and sort them alphabetically to simplify the
addition of the new ones.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: reformat "qed_chain.h" a bit
Alexander Lobakin [Wed, 22 Jul 2020 22:10:31 +0000 (01:10 +0300)]
qed: reformat "qed_chain.h" a bit

Reformat structs and macros definitions a bit prior to making functional
changes.

Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qed_hsi.h: Avoid the use of one-element array
Gustavo A. R. Silva [Wed, 22 Jul 2020 18:58:52 +0000 (13:58 -0500)]
net: qed_hsi.h: Avoid the use of one-element array

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type '__le32 reserved1'[2], once it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/qed_hsi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobna: bfi.h: Avoid the use of one-element array
Gustavo A. R. Silva [Wed, 22 Jul 2020 18:50:24 +0000 (13:50 -0500)]
bna: bfi.h: Avoid the use of one-element array

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u8 rsvd'[2], once it seems this is
just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/bfi-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotg3: Avoid the use of one-element array
Gustavo A. R. Silva [Wed, 22 Jul 2020 18:43:58 +0000 (13:43 -0500)]
tg3: Avoid the use of one-element array

One-element arrays are being deprecated[1]. Replace the one-element
array with a simple value type 'u32 reserved2'[2], once it seems
this is just a placeholder for alignment.

[1] https://github.com/KSPP/linux/issues/79
[2] https://github.com/KSPP/linux/issues/86

Tested-by: kernel test robot <lkp@intel.com>
Link: https://github.com/GustavoARSilva/linux-hardening/blob/master/cii/0-day/tg3-20200718.md
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: fix memory leak of object 'lid'
Colin Ian King [Wed, 22 Jul 2020 17:40:03 +0000 (18:40 +0100)]
ionic: fix memory leak of object 'lid'

Currently when netdev fails to allocate the error return path
fails to free the allocated object 'lid'.  Fix this by setting
err to the return error code and jumping to a new label that
performs the kfree of lid before returning.

Addresses-Coverity: ("Resource leak")
Fixes: 4b03b27349c0 ("ionic: get MTU from lif identity")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'l2tp-cleanup-checkpatch-pl-warnings'
David S. Miller [Thu, 23 Jul 2020 01:08:39 +0000 (18:08 -0700)]
Merge branch 'l2tp-cleanup-checkpatch-pl-warnings'

Tom Parkin says:

====================
l2tp: cleanup checkpatch.pl warnings

l2tp hasn't been kept up to date with the static analysis checks offered
by checkpatch.pl.

This series addresses a range of minor issues which don't involve large
changes to code structure.  The changes include:

 * tweaks to use of whitespace, comment style, line breaks,
   and indentation

 * two minor modifications to code to use a function or macro suggested
   by checkpatch

v1 -> v2

 * combine related patches (patches fixing whitespace issues, patches
   addressing comment style)

 * respin the single large patchset into a multiple smaller series for
   easier review
====================

Reviewed-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: avoid precidence issues in L2TP_SKB_CB macro
Tom Parkin [Wed, 22 Jul 2020 16:32:14 +0000 (17:32 +0100)]
l2tp: avoid precidence issues in L2TP_SKB_CB macro

checkpatch warned about the L2TP_SKB_CB macro's use of its argument: add
braces to avoid the problem.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: line-break long function prototypes
Tom Parkin [Wed, 22 Jul 2020 16:32:13 +0000 (17:32 +0100)]
l2tp: line-break long function prototypes

In l2tp_core.c both l2tp_tunnel_create and l2tp_session_create take
quite a number of arguments and have a correspondingly long prototype.

This is both quite difficult to scan visually, and triggers checkpatch
warnings.

Add a line break to make these function prototypes more readable.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: prefer seq_puts for unformatted output
Tom Parkin [Wed, 22 Jul 2020 16:32:12 +0000 (17:32 +0100)]
l2tp: prefer seq_puts for unformatted output

checkpatch warns about use of seq_printf where seq_puts would do.

Modify l2tp_debugfs accordingly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: prefer using BIT macro
Tom Parkin [Wed, 22 Jul 2020 16:32:11 +0000 (17:32 +0100)]
l2tp: prefer using BIT macro

Use BIT(x) rather than (1<<x), reported by checkpatch.pl.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: add identifier name in function pointer prototype
Tom Parkin [Wed, 22 Jul 2020 16:32:10 +0000 (17:32 +0100)]
l2tp: add identifier name in function pointer prototype

Reported by checkpatch:

        "WARNING: function definition argument 'struct sock *'
         should also have an identifier name"

Add an identifier name to help document the prototype.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup suspect code indent
Tom Parkin [Wed, 22 Jul 2020 16:32:09 +0000 (17:32 +0100)]
l2tp: cleanup suspect code indent

l2tp_core has conditionally compiled code in l2tp_xmit_skb for IPv6
support.  The structure of this code triggered a checkpatch warning
due to incorrect indentation.

Fix up the indentation to address the checkpatch warning.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup wonky alignment of line-broken function calls
Tom Parkin [Wed, 22 Jul 2020 16:32:08 +0000 (17:32 +0100)]
l2tp: cleanup wonky alignment of line-broken function calls

Arguments should be aligned with the function call open parenthesis as
per checkpatch.  Tweak some function calls which were not aligned
correctly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup difficult-to-read line breaks
Tom Parkin [Wed, 22 Jul 2020 16:32:07 +0000 (17:32 +0100)]
l2tp: cleanup difficult-to-read line breaks

Some l2tp code had line breaks which made the code more difficult to
read.  These were originally motivated by the 80-character line width
coding guidelines, but were actually a negative from the perspective of
trying to follow the code.

Remove these linebreaks for clearer code, even if we do exceed 80
characters in width in some places.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup comments
Tom Parkin [Wed, 22 Jul 2020 16:32:06 +0000 (17:32 +0100)]
l2tp: cleanup comments

Modify some l2tp comments to better adhere to kernel coding style, as
reported by checkpatch.pl.

Add descriptive comments for the l2tp per-net spinlocks to document
their use.

Fix an incorrect comment in l2tp_recv_common:

RFC2661 section 5.4 states that:

"The LNS controls enabling and disabling of sequence numbers by sending a
data message with or without sequence numbers present at any time during
the life of a session."

l2tp handles this correctly in l2tp_recv_common, but the comment around
the code was incorrect and confusing.  Fix up the comment accordingly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agol2tp: cleanup whitespace use
Tom Parkin [Wed, 22 Jul 2020 16:32:05 +0000 (17:32 +0100)]
l2tp: cleanup whitespace use

Fix up various whitespace issues as reported by checkpatch.pl:

 * remove spaces around operators where appropriate,
 * add missing blank lines following declarations,
 * remove multiple blank lines, or trailing blank lines at the end of
   functions.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodevlink: Always use user_ptr[0] for devlink and simplify post_doit
Parav Pandit [Wed, 22 Jul 2020 15:57:11 +0000 (18:57 +0300)]
devlink: Always use user_ptr[0] for devlink and simplify post_doit

Currently devlink instance is searched on all doit() operations.
But it is optionally stored into user_ptr[0]. This requires
rediscovering devlink again doing post_doit().

Few devlink commands related to port shared buffers needs 3 pointers
(devlink, devlink_port, and devlink_sb) while executing doit commands.
Though devlink pointer can be derived from the devlink_port during
post_doit() operation when doit() callback has acquired devlink
instance lock, relying on such scheme to access devlik pointer makes
code very fragile.

Hence, to avoid ambiguity in post_doit() and to avoid searching
devlink instance again, simplify code by always storing devlink
instance in user_ptr[0] and derive devlink_sb pointer in their
respective callback routines.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agohv_netvsc: add support for vlans in AF_PACKET mode
Sriram Krishnan [Wed, 22 Jul 2020 15:38:44 +0000 (21:08 +0530)]
hv_netvsc: add support for vlans in AF_PACKET mode

Vlan tagged packets are getting dropped when used with DPDK that uses
the AF_PACKET interface on a hyperV guest.

The packet layer uses the tpacket interface to communicate the vlans
information to the upper layers. On Rx path, these drivers can read the
vlan info from the tpacket header but on the Tx path, this information
is still within the packet frame and requires the paravirtual drivers to
push this back into the NDIS header which is then used by the host OS to
form the packet.

This transition from the packet frame to NDIS header is currently missing
hence causing the host OS to drop the all vlan tagged packets sent by
the drivers that use AF_PACKET (ETH_P_ALL) such as DPDK.

Here is an overview of the changes in the vlan header in the packet path:

The RX path (userspace handles everything):
  1. RX VLAN packet is stripped by HOST OS and placed in NDIS header
  2. Guest Kernel RX hv_netvsc packets and moves VLAN info from NDIS
     header into kernel SKB
  3. Kernel shares packets with user space application with PACKET_MMAP.
     The SKB VLAN info is copied to tpacket layer and indication set
     TP_STATUS_VLAN_VALID.
  4. The user space application will re-insert the VLAN info into the frame

The TX path:
  1. The user space application has the VLAN info in the frame.
  2. Guest kernel gets packets from the application with PACKET_MMAP.
  3. The kernel later sends the frame to the hv_netvsc driver. The only way
     to send VLANs is when the SKB is setup & the VLAN is stripped from the
     frame.
  4. TX VLAN is re-inserted by HOST OS based on the NDIS header. If it sees
     a VLAN in the frame the packet is dropped.

Cc: xe-linux-external@cisco.com
Cc: Sriram Krishnan <srirakr2@cisco.com>
Signed-off-by: Sriram Krishnan <srirakr2@cisco.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomptcp: zero token hash at creation time.
Paolo Abeni [Wed, 22 Jul 2020 15:20:50 +0000 (17:20 +0200)]
mptcp: zero token hash at creation time.

Otherwise the 'chain_len' filed will carry random values,
some token creation calls will fail due to excessive chain
length, causing unexpected fallback to TCP.

Fixes: 2c5ebd001d4f ("mptcp: refactor token container")
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Tested-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agolan743x: remove redundant initialization of variable current_head_index
Colin Ian King [Wed, 22 Jul 2020 15:12:21 +0000 (16:12 +0100)]
lan743x: remove redundant initialization of variable current_head_index

The variable current_head_index is being initialized with a value that
is never read and it is being updated later with a new value.  Replace
the initialization of -1 with the latter assignment.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoenetc: Remove the imdio bus on PF probe bailout
Claudiu Manoil [Wed, 22 Jul 2020 12:38:48 +0000 (15:38 +0300)]
enetc: Remove the imdio bus on PF probe bailout

enetc_imdio_remove() is missing from the enetc_pf_probe()
bailout path. Not surprisingly because enetc_setup_serdes()
is registering the imdio bus for internal purposes, and it's
not obvious that enetc_imdio_remove() currently performs the
teardown of enetc_setup_serdes().
To fix this, define enetc_teardown_serdes() to wrap
enetc_imdio_remove() (improve code maintenance) and call it
on bailout and remove paths.

Fixes: 975d183ef0ca ("net: enetc: Initialize SerDes for SGMII and USXGMII protocols")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: qed: Remove unneeded cast from memory allocation
Wang Hai [Wed, 22 Jul 2020 02:10:27 +0000 (10:10 +0800)]
net: qed: Remove unneeded cast from memory allocation

Remove casting the values returned by memory allocation function.

Coccinelle emits WARNING: casting value returned by memory allocation
unction to (struct roce_destroy_qp_req_output_params *) is useless.

This issue was detected by using the Coccinelle software.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: fix check in get_phy_c45_ids
Vladimir Oltean [Mon, 20 Jul 2020 17:26:54 +0000 (20:26 +0300)]
net: phy: fix check in get_phy_c45_ids

After the patch below, the iteration through the available MMDs is
completely short-circuited, and devs_in_pkg remains set to the initial
value of zero.

Due to devs_in_pkg being zero, the rest of get_phy_c45_ids() is
short-circuited too: the following loop never reaches below this point
either (it executes "continue" for every device in package, failing to
retrieve PHY ID for any of them):

/* Now probe Device Identifiers for each device present. */
for (i = 1; i < num_ids; i++) {
if (!(devs_in_pkg & (1 << i)))
continue;

So c45_ids->device_ids remains populated with zeroes. This causes an
Aquantia AQR412 PHY (same as any C45 PHY would, in fact) to be probed by
the Generic PHY driver.

The issue seems to be a case of submitting partially committed work (and
therefore testing something other than was submitted).

The intention of the patch was to delay exiting the loop until one more
condition is reached (the devs_in_pkg read from hardware is either 0, OR
mostly f's). So fix the patch to reflect that.

Tested with traffic on a LS1028A-QDS, the PHY is now probed correctly
using the Aquantia driver. The devs_in_pkg bit field is set to
0xe000009a, and the MMDs that are present have the following IDs:

[    5.600772] libphy: get_phy_c45_ids: device_ids[1]=0x3a1b662
[    5.618781] libphy: get_phy_c45_ids: device_ids[3]=0x3a1b662
[    5.630797] libphy: get_phy_c45_ids: device_ids[4]=0x3a1b662
[    5.654535] libphy: get_phy_c45_ids: device_ids[7]=0x3a1b662
[    5.791723] libphy: get_phy_c45_ids: device_ids[29]=0x3a1b662
[    5.804050] libphy: get_phy_c45_ids: device_ids[30]=0x3a1b662
[    5.816375] libphy: get_phy_c45_ids: device_ids[31]=0x0

[    7.690237] mscc_felix 0000:00:00.5: PHY [0.5:00] driver [Aquantia AQR412] (irq=POLL)
[    7.704739] mscc_felix 0000:00:00.5: PHY [0.5:01] driver [Aquantia AQR412] (irq=POLL)
[    7.718918] mscc_felix 0000:00:00.5: PHY [0.5:02] driver [Aquantia AQR412] (irq=POLL)
[    7.733044] mscc_felix 0000:00:00.5: PHY [0.5:03] driver [Aquantia AQR412] (irq=POLL)

Fixes: bba238ed037c ("net: phy: continue searching for C45 MMDs even if first returned ffff:ffff")
Reported-by: Colin King <colin.king@canonical.com>
Reported-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dccp: Add SIOCOUTQ IOCTL support (send buffer fill)
Richard Sailer [Mon, 20 Jul 2020 16:06:14 +0000 (18:06 +0200)]
net: dccp: Add SIOCOUTQ IOCTL support (send buffer fill)

This adds support for the SIOCOUTQ IOCTL to get the send buffer fill
of a DCCP socket, like UDP and TCP sockets already have.

Regarding the used data field: DCCP uses per packet sequence numbers,
not per byte, so sequence numbers can't be used like in TCP. sk_wmem_queued
is not used by DCCP and always 0, even in test on highly congested paths.
Therefore this uses sk_wmem_alloc like in UDP.

Signed-off-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'Add-DSA-yaml-binding'
David S. Miller [Wed, 22 Jul 2020 23:56:44 +0000 (16:56 -0700)]
Merge branch 'Add-DSA-yaml-binding'

Kurt Kanzenbach says:

====================
Add DSA yaml binding

as discussed [1] [2] it makes sense to add a DSA yaml binding. This is the
second version and contains now two ways of specifying the switch ports: Either
by "ports" or by "ethernet-ports". That is why the third patch also adjusts the
DSA core for it.

Tested in combination with the hellcreek.yaml file.

Changes since v1:

 * Use select to not match unrelated switches
 * Allow ethernet-port(s)
 * List ethernet-controller properties
 * Include better description
 * Let dsa.txt refer to dsa.yaml

Thanks,
Kurt

[1] - https://lkml.kernel.org/netdev/449f0a03-a91d-ae82-b31f-59dfd1457ec5@gmail.com/
[2] - https://lkml.kernel.org/netdev/20200710090618.28945-1-kurt@linutronix.de/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: of: Allow ethernet-ports as encapsulating node
Kurt Kanzenbach [Mon, 20 Jul 2020 12:49:39 +0000 (14:49 +0200)]
net: dsa: of: Allow ethernet-ports as encapsulating node

Due to unified Ethernet Switch Device Tree Bindings allow for ethernet-ports as
encapsulating node as well.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: dsa: Let dsa.txt refer to dsa.yaml
Kurt Kanzenbach [Mon, 20 Jul 2020 12:49:38 +0000 (14:49 +0200)]
dt-bindings: net: dsa: Let dsa.txt refer to dsa.yaml

The DSA bindings have been converted to YAML. Therefore, the old text style
documentation should refer to that one.

The text file can be removed completely once all the existing DSA switch
bindings have been converted as well.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: dsa: Add DSA yaml binding
Kurt Kanzenbach [Mon, 20 Jul 2020 12:49:37 +0000 (14:49 +0200)]
dt-bindings: net: dsa: Add DSA yaml binding

For future DSA drivers it makes sense to add a generic DSA yaml binding which
can be used then. This was created using the properties from dsa.txt. It
includes the ports and the dsa,member property.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mscc: ocelot: fix non-initialized CPU port on VSC7514
Vladimir Oltean [Wed, 22 Jul 2020 08:08:57 +0000 (11:08 +0300)]
net: mscc: ocelot: fix non-initialized CPU port on VSC7514

The VSC7514 is marketed as a 10-port switch, however it has 11 physical
ports (0->10) in the block diagram:
https://www.microsemi.com/product-directory/ethernet-switches/3992-vsc7514
(also in the device tree at arch/mips/boot/dts/mscc/ocelot.dtsi)

Additionally, by architecture it has one more entry in the analyzer
block, situated right after the physical ports, for the CPU port module.
This is not a physical port, it only represents a channel for frame
injection and extraction. That entry for the CPU port is at index 11 in
the analyzer.

When the register groups for QSYS_SWITCH_PORT_MODE, SYS_PORT_MODE and
SYS_PAUSE_CFG are declared to be replicated 11 times, the 11th entry in
the array of regfields is not initialized, so the CPU port module is not
initialized either.

The documentation of QSYS_SWITCH_PORT_MODE for VSC7514 also says that
this register group is replicated 12 times, so this patch is simply
reflecting that and not introducing any further inconsistency.

Fixes: 886e1387c73d ("net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields")
Fixes: 541132f0961a ("net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield")
Reported-by: Bryan Whitehead <bryan.whitehead@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: explicitly include <linux/compat.h> in net/core/sock.c
Christoph Hellwig [Wed, 22 Jul 2020 07:40:27 +0000 (09:40 +0200)]
net: explicitly include <linux/compat.h> in net/core/sock.c

The buildbot found a config where the header isn't already implicitly
pulled in, so add an explicit include as well.

Fixes: 8c918ffbbad4 ("net: remove compat_sock_common_{get,set}sockopt")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Wed, 22 Jul 2020 19:34:55 +0000 (12:34 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 46 non-merge commits during the last 6 day(s) which contain
a total of 68 files changed, 4929 insertions(+), 526 deletions(-).

The main changes are:

1) Run BPF program on socket lookup, from Jakub.

2) Introduce cpumap, from Lorenzo.

3) s390 JIT fixes, from Ilya.

4) teach riscv JIT to emit compressed insns, from Luke.

5) use build time computed BTF ids in bpf iter, from Yonghong.
====================

Purely independent overlapping changes in both filter.h and xdp.h

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ionic-updates'
David S. Miller [Wed, 22 Jul 2020 01:36:34 +0000 (18:36 -0700)]
Merge branch 'ionic-updates'

Shannon Nelson says:

====================
ionic updates

These are a few odd code tweaks to the ionic driver: FW defined MTU
limits, remove unnecessary code, and other tidiness tweaks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: interface file updates
Shannon Nelson [Tue, 21 Jul 2020 20:34:09 +0000 (13:34 -0700)]
ionic: interface file updates

Add some new interface values and update a few more descriptions.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: rearrange reset and bus-master control
Shannon Nelson [Tue, 21 Jul 2020 20:34:08 +0000 (13:34 -0700)]
ionic: rearrange reset and bus-master control

We can prevent potential incorrect DMA access attempts from the
NIC by enabling bus-master after the reset, and by disabling
bus-master earlier in cleanup.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: update eid test for overflow
Shannon Nelson [Tue, 21 Jul 2020 20:34:07 +0000 (13:34 -0700)]
ionic: update eid test for overflow

Fix up our comparison to better handle a potential (but largely
unlikely) wrap around.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: remove unused ionic_coal_hw_to_usec
Shannon Nelson [Tue, 21 Jul 2020 20:34:06 +0000 (13:34 -0700)]
ionic: remove unused ionic_coal_hw_to_usec

Clean up some unused code.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: set netdev default name
Shannon Nelson [Tue, 21 Jul 2020 20:34:05 +0000 (13:34 -0700)]
ionic: set netdev default name

If the host system's udev fails to set a new name for the
network port, there is no NETDEV_CHANGENAME event to trigger
the driver to send the name down to the firmware.  It is safe
to set the lif name multiple times, so we add a call early on
to set the default netdev name to be sure the FW has something
to use in its internal debug logging.  Then when udev gets
around to changing it we can update it to the actual name the
system will be using.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoionic: get MTU from lif identity
Shannon Nelson [Tue, 21 Jul 2020 20:34:04 +0000 (13:34 -0700)]
ionic: get MTU from lif identity

Change from using hardcoded MTU limits and instead use the
firmware defined limits. The value from the LIF attributes is
the frame size, so we take off the header size to convert to
MTU size.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobareudp: Reverted support to enable & disable rx metadata collection
Martin Varghese [Fri, 17 Jul 2020 02:35:12 +0000 (08:05 +0530)]
bareudp: Reverted support to enable & disable rx metadata collection

The commit fe80536acf83 ("bareudp: Added attribute to enable & disable
rx metadata collection") breaks the the original(5.7) default behavior of
bareudp module to collect RX metadadata at the receive. It was added to
avoid the crash at the kernel neighbour subsytem when packet with metadata
from bareudp is processed. But it is no more needed as the
commit 394de110a733 ("net: Added pointer check for
dst->ops->neigh_lookup in dst_neigh_lookup_skb") solves this crash.

Fixes: fe80536acf83 ("bareudp: Added attribute to enable & disable rx metadata collection")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'dpaa2-eth-add-support-for-TBF-offload'
David S. Miller [Tue, 21 Jul 2020 23:24:05 +0000 (16:24 -0700)]
Merge branch 'dpaa2-eth-add-support-for-TBF-offload'

Ioana Ciornei says:

====================
dpaa2-eth: add support for TBF offload

This patch set adds support for TBF offload in dpaa2-eth.
The first patch restructures how the .ndo_setup_tc() callback is
implemented (each Qdisc is treated in a separate function), the second
patch just adds the necessary APIs for configuring the Tx shaper and the
last one is handling TC_SETUP_QDISC_TBF and configures as requested the
shaper.
====================

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-eth: add support for TBF offload
Ioana Ciornei [Tue, 21 Jul 2020 16:38:25 +0000 (19:38 +0300)]
dpaa2-eth: add support for TBF offload

React to TC_SETUP_QDISC_TBF and configure the egress shaper as
appropriate with the maximum rate and burst size requested by the user.
TBF can only be offloaded on DPAA2 when it's the root qdisc, ie it's a
per port shaper.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-eth: add API for Tx shaping
Ioana Ciornei [Tue, 21 Jul 2020 16:38:24 +0000 (19:38 +0300)]
dpaa2-eth: add API for Tx shaping

Add the necessary API (dpni_set_tx_shaping) for configuring the rate and
burst size of a per port shaper in DPAA2.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodpaa2-eth: move the mqprio setup into a separate function
Ioana Ciornei [Tue, 21 Jul 2020 16:38:23 +0000 (19:38 +0300)]
dpaa2-eth: move the mqprio setup into a separate function

Move the setup done for MQPRIO into a separate function so that
with the addition of another offload we do not crowd
dpaa2_eth_setup_tc(). After this restructuring it's easier to see what
is supported in terms of Qdisc offloading.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>