linux-2.6-microblaze.git
3 years agoMerge branch 'net-small-csum-optimizations'
Jakub Kicinski [Fri, 26 Nov 2021 05:03:33 +0000 (21:03 -0800)]
Merge branch 'net-small-csum-optimizations'

Eric Dumazet says:

====================
net: small csum optimizations

After the recent x86 csum_partial() optimizations, kernel profiles show
more clearly the cost of add/adc operations that could be avoided
by feeding a non-zero third argument to csum_partial().
====================

Link: https://lore.kernel.org/r/20211124202446.2917972-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: optimize skb_postpull_rcsum()
Eric Dumazet [Wed, 24 Nov 2021 20:24:46 +0000 (12:24 -0800)]
net: optimize skb_postpull_rcsum()

Remove one pair of add/adc instructions and their dependency
against carry flag.

We can leverage third argument to csum_partial():

  X = csum_block_sub(X, csum_partial(start, len, 0), 0);

  -->

  X = csum_block_add(X, ~csum_partial(start, len, 0), 0);

  -->

  X = ~csum_partial(start, len, ~X);
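
The rewrite above relies on ones'-complement arithmetic, where subtracting a
value is the same as adding its bitwise complement. A minimal userspace
sketch of that identity follows. The helpers csum_add32(), csum_sub32(),
csum_partial_sketch(), csum_canon(), postpull_old() and postpull_new() are
hypothetical stand-ins written for this illustration, not the kernel
implementations, and the 16-bit word order is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit ones'-complement addition with end-around carry. */
static uint32_t csum_add32(uint32_t a, uint32_t b)
{
	uint32_t s = a + b;

	return s + (s < b);
}

/* Subtraction is addition of the bitwise complement. */
static uint32_t csum_sub32(uint32_t a, uint32_t b)
{
	return csum_add32(a, ~b);
}

/* Simplified stand-in for csum_partial(): sums big-endian 16-bit words. */
static uint32_t csum_partial_sketch(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum = csum_add32(sum, ((uint32_t)buf[i] << 8) | buf[i + 1]);
	if (len & 1)
		sum = csum_add32(sum, (uint32_t)buf[len - 1] << 8);
	return sum;
}

/* Fold to 16 bits and reduce modulo 0xffff so +0 and -0 compare equal. */
static uint16_t csum_canon(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)(sum % 0xffff);
}

/* Old form: subtract a separately computed partial sum. */
static uint32_t postpull_old(uint32_t x, const uint8_t *start, size_t len)
{
	return csum_sub32(x, csum_partial_sketch(start, len, 0));
}

/* New form: seed the partial sum with ~x and complement the result. */
static uint32_t postpull_new(uint32_t x, const uint8_t *start, size_t len)
{
	return ~csum_partial_sketch(start, len, ~x);
}
```

The two forms can differ in which representation of zero they produce, so
the comparison is done on the folded, canonical value rather than bit for bit.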

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogro: optimize skb_gro_postpull_rcsum()
Eric Dumazet [Wed, 24 Nov 2021 20:24:45 +0000 (12:24 -0800)]
gro: optimize skb_gro_postpull_rcsum()

We can leverage third argument to csum_partial():

  X = csum_sub(X, csum_partial(start, len, 0));

  -->

  X = csum_add(X, ~csum_partial(start, len, 0));

  -->

  X = ~csum_partial(start, len, ~X);

This removes one add/adc pair and its dependency against the carry flag.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agosctp: make the raise timer more simple and accurate
Xin Long [Wed, 24 Nov 2021 19:26:14 +0000 (14:26 -0500)]
sctp: make the raise timer more simple and accurate

Currently, the probe timer is reused as the raise timer when PLPMTUD is in
the Search Complete state. raise_count was introduced to count how many
times the probe timer has timed out. When raise_count reaches 30, the
raise timer handler will be triggered.

Throughout this process, the timer keeps timing out every
probe_interval. This is wasteful in the Search Complete state, as the raise
timer only needs to time out after 30 * probe_interval.

Since the raise timer and probe timer are never used at the same time,
there is no need to keep the probe timer 'alive' in the Search Complete
state. This patch introduces sctp_transport_reset_raise_timer() to start the timer
as the raise timer when entering the Search Complete state. When entering
the other states, sctp_transport_reset_probe_timer() will still be called
to reset the timer to the probe timer.

raise_count can be removed from sctp_transport, as there is no longer any
need to count probe timer timeouts to derive the raise timer timeout.
last_rtx_chunks can be removed as
sctp_transport_reset_probe_timer() can be called in the place where asoc
rtx_data_chunks is changed.
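
The timer selection described above can be sketched in plain C. This is an
illustration only: struct transport_sketch, its fields and next_timeout()
are hypothetical names, and the real code arms a kernel timer rather than
returning a value.

```c
#include <stdbool.h>

#define RAISE_MULTIPLIER 30 /* from the text: raise after 30 * probe_interval */

/* Hypothetical, reduced view of a transport for this sketch. */
struct transport_sketch {
	unsigned long probe_interval; /* in timer ticks */
	bool search_complete;         /* PLPMTUD is in the Search Complete state */
};

/* Ticks until the single shared timer should fire next. */
static unsigned long next_timeout(const struct transport_sketch *t)
{
	if (t->search_complete)
		return t->probe_interval * RAISE_MULTIPLIER; /* acts as raise timer */
	return t->probe_interval;                            /* acts as probe timer */
}
```

With this shape the timer fires once per raise period in Search Complete,
instead of 30 times with a counter.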

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/edb0e48988ea85997488478b705b11ddc1ba724a.1637781974.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotipc: delete the unlikely branch in tipc_aead_encrypt
Xin Long [Wed, 24 Nov 2021 17:11:12 +0000 (12:11 -0500)]
tipc: delete the unlikely branch in tipc_aead_encrypt

When a skb comes to tipc_aead_encrypt(), it's always linear. The
unlikely check 'skb_cloned(skb) && tailen <= skb_tailroom(skb)'
can completely be taken care of in skb_cow_data() by the code
in branch "if (!skb_has_frag_list())".

Also, remove the 'TODO:' annotation, as the pages in skbs are not
writable, see more on commit 3cf4375a0904 ("tipc: do not write
skb_shinfo frags when doing decrytion").

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/47a478da0b6095b76e3cbe7a75cbd25d9da1df9a.1637773872.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-ipa-gsi-channel-flow-control'
Jakub Kicinski [Fri, 26 Nov 2021 04:05:34 +0000 (20:05 -0800)]
Merge branch 'net-ipa-gsi-channel-flow-control'

Alex Elder says:

====================
net: ipa: GSI channel flow control

Starting with IPA v4.2, endpoint DELAY mode (which prevents data
transfer on TX endpoints) does not work properly.  To address this,
changes were made to allow underlying GSI channels to be put into
a "flow controlled" state, which achieves a similar objective.
The first patch in this series implements the flow controlled
channel state and the commands used to control it.  It arranges
to use the new mechanism--instead of DELAY mode--for IPA v4.2+.

In IPA v4.11, the notion of GSI channel flow control was enhanced,
and implemented in a slightly different way.  For the most part this
doesn't affect the way the IPA driver uses flow control, but the
second patch adds support for the newer mechanism.
====================

Link: https://lore.kernel.org/r/20211124194416.707007-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: support enhanced channel flow control
Alex Elder [Wed, 24 Nov 2021 19:44:16 +0000 (13:44 -0600)]
net: ipa: support enhanced channel flow control

IPA v4.2 introduced GSI channel flow control, used instead of IPA
endpoint DELAY mode to prevent a TX channel from injecting packets
into the IPA core.  It used a new FLOW_CONTROLLED channel state
which could be entered using GSI generic commands.

IPA v4.11 extended the channel flow control model.  Rather than
having a distinct FLOW_CONTROLLED channel state, each channel has a
"flow control" property that can be enabled or not--independent of
the channel state.  The AP (or modem) can modify this property using
the same GSI generic commands as before.

The AP only uses channel flow control on modem TX channels, and only
when recovering from a modem crash.  The AP has no way to discover
the state of a modem channel, so the fact that (starting with IPA
v4.11) flow control no longer uses a distinct channel state is
invisible to the AP.  So enhanced flow control generally does not
change the way the AP uses flow control.

There are a few small differences, however:
  - There is a notion of "primary" or "secondary" flow control, and
    when enabling or disabling flow control that must be specified
    in a new field in the GSI generic command register.  For now, we
    always specify 0 (meaning "primary").
  - When disabling flow control, it's possible a request will need
    to be retried.  We retry up to 5 times in this case.
  - Another new generic command allows the current flow control
    state to be queried.  We do not use this.

Other than the need for retries, the code essentially works the same
way as before.
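
The retry behaviour mentioned above might look roughly like the following
sketch. The callback type disable_cmd_fn, its return convention, and the
fake_hw test double are assumptions made for illustration; they are not the
driver's actual interfaces.

```c
#define FLOW_CONTROL_RETRIES 5 /* "We retry up to 5 times" */

/* Hypothetical callback issuing one disable command. Returns 0 on success,
 * a positive value if the request should be retried, negative on error. */
typedef int (*disable_cmd_fn)(void *ctx);

/* One initial attempt plus up to FLOW_CONTROL_RETRIES retries.
 * Returns the last result, so a positive value means "gave up". */
static int flow_control_disable(disable_cmd_fn cmd, void *ctx)
{
	int ret = 0;
	int attempt;

	for (attempt = 0; attempt <= FLOW_CONTROL_RETRIES; attempt++) {
		ret = cmd(ctx);
		if (ret <= 0)
			break;
	}
	return ret;
}

/* Test double: asks for a retry busy_replies times, then succeeds. */
struct fake_hw {
	int busy_replies;
	int calls;
};

static int fake_disable(void *ctx)
{
	struct fake_hw *hw = ctx;

	hw->calls++;
	if (hw->busy_replies > 0) {
		hw->busy_replies--;
		return 1;
	}
	return 0;
}
```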

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: introduce channel flow control
Alex Elder [Wed, 24 Nov 2021 19:44:15 +0000 (13:44 -0600)]
net: ipa: introduce channel flow control

One quirk for certain versions of IPA is that endpoint DELAY mode
does not work properly.  IPA DELAY mode prevents any packets from
being delivered to the IPA core for processing on a TX endpoint.
The AP uses DELAY mode when the modem crashes, to prevent modem TX
endpoints from generating traffic during crash recovery.  Without
this, there is a chance the hardware will stall during recovery from
a modem crash.

To achieve a similar effect, a GSI FLOW_CONTROLLED channel state
was created.  A STARTED TX channel can be placed in FLOW_CONTROLLED
state, which prevents the transfer of any more packets.  A channel
in FLOW_CONTROLLED state can be either returned to STARTED state, or
can be transitioned to STOPPED state.
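
As a small illustration of the transitions just described, the sketch below
encodes only the ones that involve FLOW_CONTROLLED. The enum and function
names are hypothetical, and other GSI channel transitions are deliberately
left out.

```c
#include <stdbool.h>

/* Hypothetical subset of channel states named in the text. */
enum chan_state { CH_STARTED, CH_FLOW_CONTROLLED, CH_STOPPED };

/* True if from -> to is one of the flow-control transitions in the text. */
static bool is_flow_control_transition(enum chan_state from, enum chan_state to)
{
	switch (from) {
	case CH_STARTED:
		return to == CH_FLOW_CONTROLLED;
	case CH_FLOW_CONTROLLED:
		return to == CH_STARTED || to == CH_STOPPED;
	default:
		return false;
	}
}
```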

Because this operates on GSI channels, two generic commands were
added to allow the AP to control this state for modem channels
(similar to the ALLOCATE and HALT channel commands).

Previously the code assumed this quirk only applied to IPA v4.2.
In fact, channel flow control (rather than endpoint DELAY mode)
should be used for all versions *starting* with IPA v4.2.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'mctp-serial-minor-fixes'
Jakub Kicinski [Fri, 26 Nov 2021 03:40:41 +0000 (19:40 -0800)]
Merge branch 'mctp-serial-minor-fixes'

Jeremy Kerr says:

====================
mctp serial minor fixes

We had a few minor fixes queued for a v4 of the original series, so
they're sent here as separate changes.

v2:
 - fix ordering of cancel_work vs. unregister_netdev.
====================

Link: https://lore.kernel.org/r/20211125060739.3023442-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomctp: serial: remove unnecessary ldisc data check
Jeremy Kerr [Thu, 25 Nov 2021 06:07:39 +0000 (14:07 +0800)]
mctp: serial: remove unnecessary ldisc data check

Jiri assures me that a ldisc->open with tty->disc_data set should never
happen, so this check doesn't do anything.

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomctp: serial: enforce fixed MTU
Jeremy Kerr [Thu, 25 Nov 2021 06:07:38 +0000 (14:07 +0800)]
mctp: serial: enforce fixed MTU

The current serial driver requires a maximum MTU of 68, and it doesn't
make sense to set an MTU below the MCTP-required baseline (of 68) either.

This change sets the min_mtu & max_mtu of the mctp netdev, essentially
disallowing changes. By using these instead of a ndo_change_mtu op, we
get the netlink extacks reported too.
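
Setting both limits to the same value leaves exactly one acceptable MTU. A
minimal sketch of the resulting range check, using a hypothetical
struct netdev_sketch and change_mtu() in place of the real netdev core:

```c
#include <errno.h>

#define MCTP_SERIAL_MTU 68 /* value taken from the text above */

/* Hypothetical, reduced stand-in for the relevant net_device fields. */
struct netdev_sketch {
	unsigned int mtu;
	unsigned int min_mtu;
	unsigned int max_mtu;
};

/* Generic range check: with min_mtu == max_mtu, only one value passes. */
static int change_mtu(struct netdev_sketch *dev, unsigned int new_mtu)
{
	if (new_mtu < dev->min_mtu || new_mtu > dev->max_mtu)
		return -EINVAL;
	dev->mtu = new_mtu;
	return 0;
}
```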

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomctp: serial: cancel tx work on ldisc close
Jeremy Kerr [Thu, 25 Nov 2021 06:07:37 +0000 (14:07 +0800)]
mctp: serial: cancel tx work on ldisc close

We want to ensure that the tx work has finished before returning from
the ldisc close op, so do a synchronous cancel.

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-ipa-small-collected-improvements'
Jakub Kicinski [Fri, 26 Nov 2021 03:37:37 +0000 (19:37 -0800)]
Merge branch 'net-ipa-small-collected-improvements'

Alex Elder says:

====================
net: ipa: small collected improvements

This series contains a somewhat unrelated set of changes, some
inspired by some recent work posted for back-port.  For the most
part they're meant to improve the code without changing its
functionality.  Each basically stands on its own.
====================

Link: https://lore.kernel.org/r/20211124202511.862588-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: rearrange GSI structure fields
Alex Elder [Wed, 24 Nov 2021 20:25:11 +0000 (14:25 -0600)]
net: ipa: rearrange GSI structure fields

The dummy net_device is a large field in the GSI structure, but it
is not at all interesting from the perspective of debugging.  Move
it to the end of the GSI structure so the other fields are easier to
find in memory.

The channel and event ring arrays are also very large, so move them
near the end of the structure as well.

Swap the position of the result and completion fields to improve
structure packing.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: GSI only needs one completion
Alex Elder [Wed, 24 Nov 2021 20:25:10 +0000 (14:25 -0600)]
net: ipa: GSI only needs one completion

A mutex ensures we never submit more than one GSI command of any
kind at once.  This means the per-channel and per-event ring
completion structures provide no benefit.  Instead, just use the
single (existing) GSI completion to signal the completion of GSI
commands of all types.

This makes gsi_evt_ring_init() a trivial function with no inverse,
so open-code it in its sole caller and get rid of the function.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: skip SKB copy if no netdev
Alex Elder [Wed, 24 Nov 2021 20:25:09 +0000 (14:25 -0600)]
net: ipa: skip SKB copy if no netdev

In ipa_endpoint_skb_copy(), a new socket buffer structure is
allocated so that some data can be copied into it.  However, after
doing this, if the endpoint has a null netdev pointer, we just
free the socket buffer.

Instead, check endpoint->netdev pointer first, and just return early
if it's null.  Also return early if the SKB allocation fails, to
avoid the deeper indentation in the normal path.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: explicitly disable HOLB drop during setup
Alex Elder [Wed, 24 Nov 2021 20:25:08 +0000 (14:25 -0600)]
net: ipa: explicitly disable HOLB drop during setup

During setup, ipa_endpoint_program() programs each endpoint with
various configuration parameters.  One of those registers defines
whether to drop packets when a head-of-line blocking condition is
detected on an RX endpoint.  We currently assume this is disabled;
instead, explicitly set it to be disabled.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: rework how HOL_BLOCK handling is specified
Alex Elder [Wed, 24 Nov 2021 20:25:07 +0000 (14:25 -0600)]
net: ipa: rework how HOL_BLOCK handling is specified

The head-of-line block (HOLB) drop timer is only meaningful when
dropping packets due to blocking is enabled.  Given that, redefine
the interface so the timer is specified when enabling HOLB drop, and
use a different function when disabling.

To enable and disable HOLB drop, these functions will now be used:
  ipa_endpoint_init_hol_block_enable(endpoint, milliseconds)
  ipa_endpoint_init_hol_block_disable(endpoint)

The existing ipa_endpoint_init_hol_block_enable() becomes a helper
function, renamed ipa_endpoint_init_hol_block_en(), and used with
ipa_endpoint_init_hol_block_timer() to enable HOLB drop on an
endpoint.
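
The reworked interface can be pictured as two public entry points layered on
two helpers. The sketch below uses a hypothetical struct endpoint_sketch with
plain fields in place of hardware registers, shortened function names, and an
assumed ordering (timer first, then enable).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two endpoint registers involved. */
struct endpoint_sketch {
	bool hol_block_en;
	uint32_t hol_block_timer_ms;
};

/* Helper: program the drop timer. */
static void hol_block_timer(struct endpoint_sketch *ep, uint32_t ms)
{
	ep->hol_block_timer_ms = ms;
}

/* Helper: set or clear the enable bit. */
static void hol_block_en(struct endpoint_sketch *ep, bool enable)
{
	ep->hol_block_en = enable;
}

/* Enabling always supplies a timer value. */
static void hol_block_enable(struct endpoint_sketch *ep, uint32_t ms)
{
	hol_block_timer(ep, ms);
	hol_block_en(ep, true);
}

/* Disabling needs no timer argument at all. */
static void hol_block_disable(struct endpoint_sketch *ep)
{
	hol_block_en(ep, false);
}
```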

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: zero unused portions of filter table memory
Alex Elder [Wed, 24 Nov 2021 20:25:06 +0000 (14:25 -0600)]
net: ipa: zero unused portions of filter table memory

Not all filter table entries are used.  Only certain endpoints
support filtering, and the table begins with a bitmap indicating
which endpoints use the "slots" that follow for filter rules.

Currently, unused filter table entries are not initialized.
Instead, zero-fill the entire unused portion of the filter table
memory regions, to make it more obvious that memory is unused (and
not subsequently modified).

This is not strictly necessary, but the result is reassuring when
looking at filter table memory.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: kill ipa_modem_init()
Alex Elder [Wed, 24 Nov 2021 20:25:05 +0000 (14:25 -0600)]
net: ipa: kill ipa_modem_init()

A recent commit made disabling the SMP2P "setup ready" interrupt
unrelated to ipa_modem_stop().  Given that, it seems fitting to get
rid of ipa_modem_init() and ipa_modem_exit() (which are trivial
wrapper functions), and call ipa_smp2p_init() and ipa_smp2p_exit()
directly instead.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: felix: enable cut-through forwarding between ports by default
Vladimir Oltean [Thu, 25 Nov 2021 12:58:08 +0000 (14:58 +0200)]
net: dsa: felix: enable cut-through forwarding between ports by default

The VSC9959 switch embedded within NXP LS1028A (and that version of
Ocelot switches only) supports cut-through forwarding - meaning it can
start the process of looking up the destination ports for a packet, and
forward towards those ports, before the entire packet has been received
(as opposed to the store-and-forward mode).

The up side is having lower forwarding latency for large packets. The
down side is that frames with FCS errors are forwarded instead of being
dropped. However, erroneous frames do not result in incorrect updates of
the FDB or incorrect policer updates, since these processes are deferred
inside the switch to the end of frame. Since the switch starts the
cut-through forwarding process after all packet headers (including IP,
if any) have been processed, packets with large headers and small
payload do not see the benefit of lower forwarding latency.

There are two cases that need special attention.

The first is when a packet is multicast (or flooded) to multiple
destinations, one of which doesn't have cut-through forwarding enabled.
The switch deals with this automatically by disabling cut-through
forwarding for the frame towards all destination ports.

The second is when a packet is forwarded from a port of lower link speed
towards a port of higher link speed. This is not handled by the hardware
and needs software intervention.

Since we practically need to update the cut-through forwarding domain
from paths that aren't serialized by the rtnl_mutex (phylink
mac_link_down/mac_link_up ops), this means we need to serialize physical
link events with user space updates of bonding/bridging domains.

Enabling cut-through forwarding is done per {egress port, traffic class}.
I don't see any reason why this would be a configurable option as long
as it works without issues, and there doesn't appear to be any user
space configuration tool to toggle this on/off, so this patch enables
cut-through forwarding on all eligible ports and traffic classes.
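
One way to picture the software intervention for the speed-mismatch case is
the sketch below: an egress port keeps cut-through enabled only if no
linked-up port in its forwarding domain is slower. The struct, the fixed
port count, and the 8-traffic-class mask are assumptions for illustration,
not the driver's data structures.

```c
#include <stdint.h>

#define NUM_PORTS 4
#define ALL_TCS 0xffu /* assumes 8 traffic classes per port */

/* Hypothetical per-port view for this sketch. */
struct port_sketch {
	int speed_mbps;    /* 0 or less means link down */
	uint32_t fwd_mask; /* bitmask of the other ports in the forwarding domain */
};

/* Fill ct_tcs[port] with the traffic classes that may use cut-through. */
static void update_cut_through(const struct port_sketch *ports, uint8_t *ct_tcs)
{
	for (int port = 0; port < NUM_PORTS; port++) {
		int min_speed = ports[port].speed_mbps;

		ct_tcs[port] = 0;
		if (min_speed <= 0)
			continue; /* link down: keep cut-through disabled */

		for (int other = 0; other < NUM_PORTS; other++) {
			if (other == port || !(ports[port].fwd_mask & (1u << other)))
				continue;
			if (ports[other].speed_mbps <= 0)
				continue; /* ignore ports that are down */
			if (ports[other].speed_mbps < min_speed)
				min_speed = ports[other].speed_mbps;
		}

		/* Only the slowest port(s) of a domain keep cut-through. */
		if (ports[port].speed_mbps == min_speed)
			ct_tcs[port] = ALL_TCS;
	}
}
```

Because link speed feeds into the result, a function like this would have to
be re-run on every link up/down event as well as on bridge or bond changes,
which is why the text calls for serializing those paths.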

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211125125808.2383984-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ocelot: remove "bridge" argument from ocelot_get_bridge_fwd_mask
Vladimir Oltean [Thu, 25 Nov 2021 12:58:07 +0000 (14:58 +0200)]
net: ocelot: remove "bridge" argument from ocelot_get_bridge_fwd_mask

The only caller takes ocelot_port->bridge and passes it as the "bridge"
argument to this function, which then compares it with
ocelot_port->bridge. This is not useful.

Instead, we would like this function to return 0 if ocelot_port->bridge
is not present, which is what this patch does.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211125125808.2383984-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: qca8k: Fix spelling mistake "Mismateched" -> "Mismatched"
Colin Ian King [Thu, 25 Nov 2021 00:29:32 +0000 (00:29 +0000)]
net: dsa: qca8k: Fix spelling mistake "Mismateched" -> "Mismatched"

There is a spelling mistake in a netdev_err error message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20211125002932.49217-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: stmmac: perserve TX and RX coalesce value during XDP setup
Ong Boon Leong [Wed, 24 Nov 2021 11:40:19 +0000 (19:40 +0800)]
net: stmmac: perserve TX and RX coalesce value during XDP setup

When an XDP program is loaded, it is desirable that the previous TX and RX
coalesce values are not reset to their defaults. This avoids
unnecessarily reconfiguring coalesce values that were working fine
before.

Fixes: ac746c8520d9 ("net: stmmac: enhance XDP ZC driver level switching performance")
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20211124114019.3949125-1-boon.leong.ong@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotsnep: Add missing of_node_put() in tsnep_mdio_init()
Yang Yingliang [Wed, 24 Nov 2021 08:40:48 +0000 (16:40 +0800)]
tsnep: Add missing of_node_put() in tsnep_mdio_init()

The node pointer is returned by of_get_child_by_name() with
refcount incremented in tsnep_mdio_init(). Call of_node_put()
to avoid the refcount leak in tsnep_mdio_init().

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211124084048.175456-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoveth: use ethtool_sprintf instead of snprintf
Tonghao Zhang [Thu, 25 Nov 2021 02:54:44 +0000 (10:54 +0800)]
veth: use ethtool_sprintf instead of snprintf

Use the ethtool API ethtool_sprintf() instead of snprintf().
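
For readers unfamiliar with the helper, the userspace sketch below mimics
what ethtool_sprintf() does: format into a fixed-size string slot, then
advance the cursor to the next slot. gstring_sprintf() and GSTRING_LEN are
stand-in names for this illustration; the stat names in the usage are made up.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define GSTRING_LEN 32 /* same size as the kernel's ETH_GSTRING_LEN */

/* Format one stat name into the current slot and move to the next slot. */
static void gstring_sprintf(char **data, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(*data, GSTRING_LEN, fmt, args);
	va_end(args);

	*data += GSTRING_LEN;
}
```

Compared with open-coded snprintf() calls, the caller no longer has to repeat
the slot size or advance the pointer by hand after each string.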

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Link: https://lore.kernel.org/r/20211125025444.13115-1-xiangxia.m.yue@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: macb: convert to phylink_generic_validate()
Russell King (Oracle) [Wed, 24 Nov 2021 15:44:43 +0000 (15:44 +0000)]
net: macb: convert to phylink_generic_validate()

Populate the supported interfaces bitmap and MAC capabilities mask for
the macb driver and remove the old validate implementation.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1mpuRv-00D4rb-Lz@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agor8169: disable detection of chip version 60
Heiner Kallweit [Wed, 24 Nov 2021 20:44:40 +0000 (21:44 +0100)]
r8169: disable detection of chip version 60

It seems only XID 609 made it to the mass market. Therefore let's
disable detection of the other RTL8125a XIDs. If nobody complains
we can remove support for RTL_GIGA_MAC_VER_60 later.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/2cd3df01-5f8b-08dd-6def-3f31a3014bde@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet-ipv6: changes to ->tclass (via IPV6_TCLASS) should sk_dst_reset()
Maciej Żenczykowski [Tue, 23 Nov 2021 22:32:08 +0000 (14:32 -0800)]
net-ipv6: changes to ->tclass (via IPV6_TCLASS) should sk_dst_reset()

This is to match ipv4 behaviour, see __ip_sock_set_tos()
implementation.

Technically for ipv6 this might not be required because normally we
do not allow tclass to influence routing, yet the cli tooling does
support it:

lpk11:~# ip -6 rule add pref 5 tos 45 lookup 5
lpk11:~# ip -6 rule
5:      from all tos 0x45 lookup 5

and in general dscp/tclass based routing does make sense.

We already have cases where dscp can affect vlan priority and/or
transmit queue (especially on wifi).

So let's just make things match.  Easier to reason about and no harm.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20211123223208.1117871-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet-ipv6: do not allow IPV6_TCLASS to muck with tcp's ECN
Maciej Żenczykowski [Tue, 23 Nov 2021 22:31:54 +0000 (14:31 -0800)]
net-ipv6: do not allow IPV6_TCLASS to muck with tcp's ECN

This is to match ipv4 behaviour, see __ip_sock_set_tos()
implementation at ipv4/ip_sockglue.c:579

void __ip_sock_set_tos(struct sock *sk, int val)
{
  if (sk->sk_type == SOCK_STREAM) {
    val &= ~INET_ECN_MASK;
    val |= inet_sk(sk)->tos & INET_ECN_MASK;
  }
  if (inet_sk(sk)->tos != val) {
    inet_sk(sk)->tos = val;
    sk->sk_priority = rt_tos2priority(val);
    sk_dst_reset(sk);
  }
}
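
The ECN-preserving step in the function above can be isolated as a pure
function, which is roughly what the IPv6 side needs for IPV6_TCLASS. The
sketch below is illustrative: tclass_keep_ecn() is a hypothetical name, and
ECN_MASK mirrors the value of INET_ECN_MASK (the low two bits).

```c
#include <stdbool.h>

#define ECN_MASK 3 /* low two bits of TOS/TCLASS carry ECN */

/* Return the traffic class to store: for stream sockets, keep the ECN
 * bits of the old value and take only the DSCP bits from the new one. */
static int tclass_keep_ecn(int old_tclass, int val, bool is_stream)
{
	if (is_stream) {
		val &= ~ECN_MASK;
		val |= old_tclass & ECN_MASK;
	}
	return val;
}
```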

Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211123223154.1117794-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: allow SO_MARK with CAP_NET_RAW
Maciej Żenczykowski [Tue, 23 Nov 2021 20:37:15 +0000 (12:37 -0800)]
net: allow SO_MARK with CAP_NET_RAW

A CAP_NET_RAW capable process can already spoof (on transmit) anything
it desires via raw packet sockets...  There is no good reason to not
allow it to also be able to play routing tricks on packets from its
own normal sockets.

There is a desire to be able to use SO_MARK for routing table selection
(via ip rule fwmark) from within a user process without having to run
it as root.  Granting it CAP_NET_RAW is much less dangerous than
CAP_NET_ADMIN (CAP_NET_RAW doesn't permit persistent state change,
while CAP_NET_ADMIN does - by for example allowing the reconfiguration
of the routing tables and/or bringing up/down devices).

Let's keep CAP_NET_ADMIN for persistent state changes,
while using CAP_NET_RAW for non-configuration related stuff.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20211123203715.193413-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: allow CAP_NET_RAW to setsockopt SO_PRIORITY
Maciej Żenczykowski [Tue, 23 Nov 2021 20:37:02 +0000 (12:37 -0800)]
net: allow CAP_NET_RAW to setsockopt SO_PRIORITY

CAP_NET_ADMIN is and should continue to be about configuring the
system as a whole, not about configuring per-socket or per-packet
parameters.
Sending and receiving raw packets is what CAP_NET_RAW is all about.

It can already send packets with any VLAN tag, and any IPv4 TOS
mark, and any IPv6 TCLASS mark, simply by virtue of building
such a raw packet.  Not to mention using any protocol and
source/destination ip address/port tuple.

These are the fields that networking gear uses to prioritize packets.

Hence, a CAP_NET_RAW process is already capable of affecting traffic
prioritization after it hits the wire.  This change makes it capable
of affecting traffic prioritization even in the host at the nic and
before that in the queueing disciplines (provided skb->priority is
actually being used for prioritization, and not the TOS/TCLASS field).

Hence it makes sense to allow a CAP_NET_RAW process to set the
priority of sockets and thus packets it sends.
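
The resulting permission rule can be sketched as a pure function. This
assumes the long-standing behaviour that priorities 0 through 6 need no
privilege at all; may_set_priority() and its boolean capability arguments
are hypothetical simplifications of the kernel's capability checks.

```c
#include <stdbool.h>

/* True if a socket owner with the given capabilities may set prio. */
static bool may_set_priority(int prio, bool cap_net_raw, bool cap_net_admin)
{
	if (prio >= 0 && prio <= 6)
		return true; /* unprivileged range (assumption, see above) */
	return cap_net_raw || cap_net_admin;
}
```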

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20211123203702.193221-1-zenczykowski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: qca8k: fix warning in LAG feature
Ansuel Smith [Tue, 23 Nov 2021 15:44:46 +0000 (16:44 +0100)]
net: dsa: qca8k: fix warning in LAG feature

Fix warning reported by bot.
Make sure hash is initialized to 0 and fix the wrong logic for hash_type in
qca8k_lag_can_offload.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: def975307c01 ("net: dsa: qca8k: add LAG support")
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211123154446.31019-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agocxgb4: allow reading unrecognized port module eeprom
Rahul Lakkireddy [Tue, 23 Nov 2021 15:47:17 +0000 (21:17 +0530)]
cxgb4: allow reading unrecognized port module eeprom

Even if firmware fails to recognize the plugged-in port module type,
allow reading port module EEPROM anyway. This helps in obtaining
necessary diagnostics information for debugging and analysis.

Signed-off-by: Manoj Malviya <manojmalviya@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Link: https://lore.kernel.org/r/1637682437-31407-1-git-send-email-rahul.lakkireddy@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: bridge: Allow base 16 inputs in sysfs
Ido Schimmel [Wed, 24 Nov 2021 10:11:22 +0000 (12:11 +0200)]
net: bridge: Allow base 16 inputs in sysfs

Cited commit converted simple_strtoul() to kstrtoul() as suggested by
the former's documentation. However, it also forced all the inputs to be
decimal resulting in user space breakage.

Fix by setting the base to '0' so that the base is automatically
detected.

Before:

 # ip link add name br0 type bridge vlan_filtering 1
 # echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol
 bash: echo: write error: Invalid argument

After:

 # ip link add name br0 type bridge vlan_filtering 1
 # echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol
 # echo $?
 0
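
The effect of the base argument can be reproduced in user space with
strtoul(), which follows the same convention that base 0 means "detect from
the prefix". parse_ulong() below is a hypothetical strict wrapper written
for this illustration; unlike the kernel helper it does not accept a
trailing newline.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse the whole string as a number in the given base (0 = auto-detect).
 * Returns 0 and stores the value, or -EINVAL on any parse problem. */
static int parse_ulong(const char *s, int base, unsigned long *res)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &end, base);
	if (errno || end == s || *end != '\0')
		return -EINVAL;

	*res = v;
	return 0;
}
```

With base 10 the string "0x88a8" stops parsing at the 'x' and is rejected,
while base 0 accepts it as hexadecimal. One side effect of auto-detection is
that a leading 0 selects octal.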

Fixes: 520fbdf7fb19 ("net/bridge: replace simple_strtoul to kstrtol")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20211124101122.3321496-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'gro-remove-redundant-rcu_read_lock'
Jakub Kicinski [Thu, 25 Nov 2021 01:21:46 +0000 (17:21 -0800)]
Merge branch 'gro-remove-redundant-rcu_read_lock'

Eric Dumazet says:

====================
gro: remove redundant rcu_read_lock

Recent trees have seen an increase in rcu_read_{lock|unlock} costs;
it is time to get rid of the unneeded pairs.
====================

Link: https://lore.kernel.org/r/20211123225608.2155163-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers
Eric Dumazet [Tue, 23 Nov 2021 22:56:08 +0000 (14:56 -0800)]
gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers

All gro_complete() handlers are called from napi_gro_complete()
while rcu_read_lock() has been called.

There is no point stacking more rcu_read_lock()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agogro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers
Eric Dumazet [Tue, 23 Nov 2021 22:56:07 +0000 (14:56 -0800)]
gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlers

All gro_receive() handlers are called from dev_gro_receive()
while rcu_read_lock() has been called.

There is no point stacking more rcu_read_lock()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotsnep: Fix resource_size cocci warning
Gerhard Engleder [Wed, 24 Nov 2021 20:52:25 +0000 (21:52 +0100)]
tsnep: Fix resource_size cocci warning

The following warning is fixed, by removing the unused resource size:

drivers/net/ethernet/engleder/tsnep_main.c:1155:21-24:
WARNING: Suspicious code. resource_size is maybe missing with io

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/20211124205225.13985-1-gerhard@engleder-embedded.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotsnep: fix platform_no_drv_owner.cocci warning
Yang Li [Wed, 24 Nov 2021 02:36:24 +0000 (10:36 +0800)]
tsnep: fix platform_no_drv_owner.cocci warning

Remove .owner field if calls are used which set it automatically

Eliminate the following coccicheck warning:
./drivers/net/ethernet/engleder/tsnep_main.c:1263:3-8: No need to set
.owner here. The core will do it.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Link: https://lore.kernel.org/r/1637721384-70836-2-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'hns3-next'
David S. Miller [Wed, 24 Nov 2021 14:12:26 +0000 (14:12 +0000)]
Merge branch 'hns3-next'

Guangbin Huang says:

====================
net: hns3: updates for -next

This series includes some updates for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add dql info when tx timeout
Yufeng Mo [Wed, 24 Nov 2021 01:06:54 +0000 (09:06 +0800)]
net: hns3: add dql info when tx timeout

When a tx timeout occurs, the dql info may be helpful, so print
it in hns3_get_tx_timeo_queue_info().

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: debugfs add drop packet statistics of multicast and broadcast for igu
Jie Wang [Wed, 24 Nov 2021 01:06:53 +0000 (09:06 +0800)]
net: hns3: debugfs add drop packet statistics of multicast and broadcast for igu

Currently, there is no way to get the number of dropped multicast and
broadcast packets in the IGU hardware module, which makes it hard to
find the problem when a multicast or broadcast packet is dropped in IGU.
So this patch adds statistics for them in debugfs.

Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: format the output of the MAC address
Yufeng Mo [Wed, 24 Nov 2021 01:06:52 +0000 (09:06 +0800)]
net: hns3: format the output of the MAC address

Printing the whole MAC address may bring security risks. Therefore,
the MAC address is partially encrypted to improve security.
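
A minimal sketch of what a partially hidden MAC address could look like in
a log line. It assumes plain masking of the middle three bytes with
asterisks, which may differ from what the patch does; the helper name
format_mac_masked() and the output layout are illustrative assumptions.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): prints the first byte and the
 * last two bytes of a 6-byte MAC address and hides the middle three bytes.
 * Returns the number of characters written, as snprintf() does.
 */
static int format_mac_masked(char *buf, size_t len, const unsigned char *mac)
{
	return snprintf(buf, len, "%02x:**:**:**:%02x:%02x",
			mac[0], mac[4], mac[5]);
}
```

With this layout an address such as 00:11:22:33:44:55 would appear as
00:**:**:**:44:55, which is still enough to tell ports apart in a log.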

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: hns3: add log for workqueue scheduled late
Yufeng Mo [Wed, 24 Nov 2021 01:06:51 +0000 (09:06 +0800)]
net: hns3: add log for workqueue scheduled late

When the mbx or reset message arrives, the driver is informed
through an interrupt. This task can be processed only after
the workqueue is scheduled. In some cases, this workqueue
scheduling takes a long time. As a result, the mbx or reset
service task cannot be processed in time. So add a warning
message to make this case easier to debug.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agolan78xx: Clean up some inconsistent indenting
Jiapeng Chong [Wed, 24 Nov 2021 10:09:56 +0000 (18:09 +0800)]
lan78xx: Clean up some inconsistent indenting

Eliminate the following smatch warning:

drivers/net/usb/lan78xx.c:4961 lan78xx_resume() warn: inconsistent
indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'dccp-tcp-minor-fixes-for-inet_csk_listen_start'
Jakub Kicinski [Wed, 24 Nov 2021 04:16:22 +0000 (20:16 -0800)]
Merge branch 'dccp-tcp-minor-fixes-for-inet_csk_listen_start'

Kuniyuki Iwashima says:

====================
dccp/tcp: Minor fixes for inet_csk_listen_start().

The first patch removes an unused argument, and the second removes a stale
comment.
====================

Link: https://lore.kernel.org/r/20211122101622.50572-1-kuniyu@amazon.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodccp: Inline dccp_listen_start().
Kuniyuki Iwashima [Mon, 22 Nov 2021 10:16:22 +0000 (19:16 +0900)]
dccp: Inline dccp_listen_start().

This patch inlines dccp_listen_start() and removes a stale comment in
inet_dccp_listen() so that it looks like inet_listen().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Reviewed-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodccp/tcp: Remove an unused argument in inet_csk_listen_start().
Kuniyuki Iwashima [Mon, 22 Nov 2021 10:16:21 +0000 (19:16 +0900)]
dccp/tcp: Remove an unused argument in inet_csk_listen_start().

The commit 1295e2cf3065 ("inet: minor optimization for backlog setting in
listen(2)") made a change so that sk_max_ack_backlog is initialised earlier
in inet_dccp_listen() and inet_listen().  Since then, we no longer use
backlog in inet_csk_listen_start(), so let's remove it.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Reviewed-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: stmmac: Calculate CDC error only once
Kurt Kanzenbach [Mon, 22 Nov 2021 11:19:31 +0000 (12:19 +0100)]
net: stmmac: Calculate CDC error only once

The clock domain crossing error (CDC) is calculated at every fetch of Tx or Rx
timestamps. It includes a division. Especially on arm32 based systems it is
expensive. It also requires two conditionals in the hotpath.

Add a compensation value cache to struct plat_stmmacenet_data and subtract it
unconditionally in the RX/TX functions which spares the conditionals.

The value is initialized to 0 and if supported calculated in the PTP
initialization code.
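
The compute-once idea can be sketched in user-space C as follows. This is
an illustration under assumptions, not the driver code: the struct and
function names are hypothetical, and the formula (two PTP clock periods in
nanoseconds) is an assumed example of a value that needs a division.

```c
#include <stdint.h>
#include <stdbool.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the platform data that holds the cache. */
struct ts_ctx {
	uint64_t clk_ptp_rate_hz; /* PTP reference clock rate */
	bool has_cdc_error;       /* whether the hardware needs the correction */
	uint64_t cdc_error_adj;   /* cached compensation in ns, 0 by default */
};

/* Called once at init time: the division happens only here. */
static void ts_init(struct ts_ctx *c)
{
	c->cdc_error_adj = 0;
	if (c->has_cdc_error && c->clk_ptp_rate_hz)
		c->cdc_error_adj = (2 * NSEC_PER_SEC) / c->clk_ptp_rate_hz;
}

/* Hot path: an unconditional subtraction, no division and no branch. */
static uint64_t ts_adjust(const struct ts_ctx *c, uint64_t raw_ns)
{
	return raw_ns - c->cdc_error_adj;
}
```

Because the cached value stays 0 when the correction is not supported, the
hot path can subtract it without checking a capability flag first.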

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://lore.kernel.org/r/20211122111931.135135-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: remove .ndo_change_proto_down
Jakub Kicinski [Tue, 23 Nov 2021 01:24:47 +0000 (17:24 -0800)]
net: remove .ndo_change_proto_down

.ndo_change_proto_down was added seemingly to enable out-of-tree
implementations. Over 2.5yrs later we still have no real users
upstream. Hardwire the generic implementation for now, we can
revert once real users materialize. (rocker is a test vehicle,
not a user.)

We need to drop the optimization on the sysfs side, because
unlike ndos priv_flags will be changed at runtime, so we'd
need READ_ONCE/WRITE_ONCE everywhere..

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next...
David S. Miller [Tue, 23 Nov 2021 12:17:24 +0000 (12:17 +0000)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2021-11-22

Shiraz Saleem says:

Currently E800 devices come up as RoCEv2 devices by default.

This series adds support for users to configure iWARP or RoCEv2 functionality
per PCI function. devlink parameters are used to realize this, keyed
off similar work in [1].

[1] https://lore.kernel.org/linux-rdma/20210810132424.9129-1-parav@nvidia.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mvpp2-5gbase-r-support'
David S. Miller [Tue, 23 Nov 2021 12:14:49 +0000 (12:14 +0000)]
Merge branch 'mvpp2-5gbase-r-support'

Marek Behún says:

====================
Add 5gbase-r support for mvpp2

This adds support for 5gbase-r to the mvpp2 driver. Current versions of
TF-A firmware support changing the PHY to 5gbase-r via SMC calls, at
least on Macchiatobin.

Tested on Macchiatobin.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: marvell: mvpp2: Add support for 5gbase-r
Marek Behún [Mon, 22 Nov 2021 20:51:11 +0000 (21:51 +0100)]
net: marvell: mvpp2: Add support for 5gbase-r

Add support for PHY_INTERFACE_MODE_5GBASER.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agophy: marvell: phy-mvebu-cp110-comphy: add support for 5gbase-r
Marek Behún [Mon, 22 Nov 2021 20:51:10 +0000 (21:51 +0100)]
phy: marvell: phy-mvebu-cp110-comphy: add support for 5gbase-r

Add support for PHY_INTERFACE_MODE_5GBASER mode within the Marvell CP110
common PHY driver.

This is currently only supported via SMC calls to TF-A. Legacy support
may be added later, if needed.

Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotsnep: Fix set MAC address
Gerhard Engleder [Mon, 22 Nov 2021 20:32:25 +0000 (21:32 +0100)]
tsnep: Fix set MAC address

Commit 4dfb9982644b ("tsn:  Fix build.") fixed compilation with const
dev_addr. In tsnep_netdev_set_mac_address() the call of ether_addr_copy()
was replaced with dev_set_mac_address(), which calls
ndo_set_mac_address(). This results in an endless recursive loop because
ndo_set_mac_address is set to tsnep_netdev_set_mac_address.

Call eth_hw_addr_set() instead of dev_set_mac_address() in
ndo_set_mac_address()/tsnep_netdev_set_mac_address() to copy the address
as intended.
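
The recursion and its fix can be modelled in a few lines of user-space C.
This is only a model: struct model_dev and both functions are hypothetical
stand-ins for the net_device, dev_set_mac_address() and the driver's
ndo_set_mac_address callback.

```c
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical, simplified device with a driver callback. */
struct model_dev {
	unsigned char dev_addr[ETH_ALEN];
	int (*set_mac)(struct model_dev *dev, const unsigned char *addr);
};

/* Stand-in for the core entry point: it dispatches to the driver callback. */
static int model_dev_set_mac_address(struct model_dev *dev,
				     const unsigned char *addr)
{
	return dev->set_mac(dev, addr);
}

/*
 * Fixed driver callback: it copies the address itself. Calling
 * model_dev_set_mac_address() from here instead would re-enter this
 * callback without end, which is the pattern behind the stack overflow.
 */
static int model_driver_set_mac(struct model_dev *dev,
				const unsigned char *addr)
{
	memcpy(dev->dev_addr, addr, ETH_ALEN);
	return 0;
}
```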

[   26.563303] Insufficient stack space to handle exception!
[   26.563312] ESR: 0x96000047 -- DABT (current EL)
[   26.563317] FAR: 0xffff80000a507fc0
[   26.563320] Task stack:     [0xffff80000a508000..0xffff80000a50c000]
[   26.563324] IRQ stack:      [0xffff80000a0c0000..0xffff80000a0c4000]
[   26.563327] Overflow stack: [0xffff00007fbaf2b0..0xffff00007fbb02b0]
[   26.563333] CPU: 3 PID: 381 Comm: ifconfig Not tainted 5.16.0-rc1-zynqmp #60
[   26.563340] Hardware name: TSN endpoint (DT)
[   26.563343] pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   26.563351] pc : inetdev_event+0x4/0x560
[   26.563364] lr : raw_notifier_call_chain+0x54/0x78
[   26.563372] sp : ffff80000a508040
[   26.563374] x29: ffff80000a508040 x28: ffff00000132b800 x27: 0000000000000000
[   26.563386] x26: 0000000000000000 x25: ffff800000ea5058 x24: 0904030201020001
[   26.563396] x23: ffff800000ea5058 x22: ffff80000a5080e0 x21: 0000000000000009
[   26.563405] x20: 00000000fffffffa x19: ffff80000a009510 x18: 0000000000000000
[   26.563414] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffd1341030
[   26.563422] x14: ffffffffffffffff x13: 0000000000000020 x12: 0101010101010101
[   26.563432] x11: 0000000000000020 x10: 0101010101010101 x9 : 7f7f7f7f7f7f7f7f
[   26.563441] x8 : 7f7f7f7f7f7f7f7f x7 : fefefeff30677364 x6 : 0000000080808080
[   26.563450] x5 : 0000000000000000 x4 : ffff800008dee170 x3 : ffff80000a50bd42
[   26.563459] x2 : ffff80000a5080e0 x1 : 0000000000000009 x0 : ffff80000a0092d0
[   26.563470] Kernel panic - not syncing: kernel stack overflow
[   26.563474] CPU: 3 PID: 381 Comm: ifconfig Not tainted 5.16.0-rc1-zynqmp #60
[   26.563481] Hardware name: TSN endpoint (DT)
[   26.563484] Call trace:
[   26.563486]  dump_backtrace+0x0/0x1b0
[   26.563497]  show_stack+0x18/0x68
[   26.563504]  dump_stack_lvl+0x68/0x84
[   26.563513]  dump_stack+0x18/0x34
[   26.563519]  panic+0x164/0x324
[   26.563524]  nmi_panic+0x64/0x98
[   26.563533]  panic_bad_stack+0x108/0x128
[   26.563539]  handle_bad_stack+0x38/0x68
[   26.563548]  __bad_stack+0x88/0x8c
[   26.563553]  inetdev_event+0x4/0x560
[   26.563560]  call_netdevice_notifiers_info+0x58/0xa8
[   26.563569]  dev_set_mac_address+0x78/0x110
[   26.563576]  tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
[   26.563591]  dev_set_mac_address+0xc4/0x110
[   26.563599]  tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
...
[   26.565444]  dev_set_mac_address+0xc4/0x110
[   26.565452]  tsnep_netdev_set_mac_address+0x38/0x60 [tsnep]
[   26.565462]  dev_set_mac_address+0xc4/0x110
[   26.565469]  dev_set_mac_address_user+0x44/0x68
[   26.565477]  dev_ifsioc+0x30c/0x568
[   26.565483]  dev_ioctl+0x124/0x3f0
[   26.565489]  sock_do_ioctl+0xb4/0xf8
[   26.565497]  sock_ioctl+0x2f4/0x398
[   26.565503]  __arm64_sys_ioctl+0xa8/0xe8
[   26.565511]  invoke_syscall+0x44/0x108
[   26.565520]  el0_svc_common.constprop.3+0x94/0xf8
[   26.565527]  do_el0_svc+0x24/0x88
[   26.565534]  el0_svc+0x20/0x50
[   26.565541]  el0t_64_sync_handler+0x90/0xb8
[   26.565548]  el0t_64_sync+0x180/0x184
[   26.565556] SMP: stopping secondary CPUs
[   26.565622] Kernel Offset: disabled
[   26.565624] CPU features: 0x0,00004002,00000846
[   26.565628] Memory Limit: none
[   27.843428] ---[ end Kernel panic - not syncing: kernel stack overflow ]---

Fixes: 4dfb9982644b ("tsn:  Fix build.")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'qca8k-mirror-and-lag-support'
David S. Miller [Tue, 23 Nov 2021 11:53:16 +0000 (11:53 +0000)]
Merge branch 'qca8k-mirror-and-lag-support'

Ansuel Smith says:

====================
Add mirror and LAG support to qca8k

Continuing the work of adding 'Multiple feature to qca8k'.

The switch supports mirror mode and LAG.
In mirror mode a port is set as the mirror port and the other ports are
configured in ingress or egress mode. With no port configured for mirroring,
the mirror port is disabled and reverted to a normal port.

The switch supports a maximum of 4 LAGs with up to 4 members each.
The currently supported mode is Hash mode, in either L2 or L2+3 mode.
The hash mode implementation is problematic: the mode is global, so
with multiple LAGs configured, they have to be removed before the
hash mode can be changed.
When a member of the LAG is disconnected, the traffic is redirected
to the other ports.

Some checkpatch warnings are present but can't really be fixed,
as doing so would make the regs less readable.
(They really did their best with the LAG reg logic and complexity)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: add LAG support
Ansuel Smith [Tue, 23 Nov 2021 02:59:11 +0000 (03:59 +0100)]
net: dsa: qca8k: add LAG support

Add LAG support to this switch. In the documentation this is described as
trunk mode. A max of 4 LAGs are supported and each can support up to 4
ports. The tx mode currently supported is Hash mode, in both L2 and L2+3
mode.
When no ports are present in the trunk, the trunk is disabled in the
switch.
When a port is disconnected, the traffic is redirected to the other
available ports.
The hash mode is global and each LAG is required to have the same hash mode
set. To change the hash mode when multiple LAGs are configured, it's
required to remove each LAG and set the desired hash mode on the last one.
An error is printed when an unsupported hash mode is requested.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: add support for mirror mode
Ansuel Smith [Tue, 23 Nov 2021 02:59:10 +0000 (03:59 +0100)]
net: dsa: qca8k: add support for mirror mode

The switch supports mirror mode. Only one port can be set as the mirror
port, and every other port can be set to both ingress and egress mode. The
mirror port is disabled and reverted to normal operation once every port is
removed from sending packets to it.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoneigh: introduce neigh_confirm() helper function
Yajun Deng [Tue, 23 Nov 2021 02:54:30 +0000 (10:54 +0800)]
neigh: introduce neigh_confirm() helper function

Add neigh_confirm() for the confirmed member in struct neighbour,
it can be called as an independent unit by other functions.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomctp: Add MCTP-over-serial transport binding
Jeremy Kerr [Tue, 23 Nov 2021 07:50:46 +0000 (15:50 +0800)]
mctp: Add MCTP-over-serial transport binding

This change adds a MCTP Serial transport binding, as defined by DMTF
specification DSP0253 - "MCTP Serial Transport Binding". This is
implemented as a new serial line discipline, and can be attached to
arbitrary tty devices.

From the Kconfig description:

  This driver provides an MCTP-over-serial interface, through a
  serial line-discipline, as defined by DMTF specification "DSP0253 -
  MCTP Serial Transport Binding". By attaching the ldisc to a serial
  device, we get a new net device to transport MCTP packets.

  This allows communication with external MCTP endpoints which use
  serial as their transport. It can also be used as an easy way to
  provide MCTP connectivity between virtual machines, by forwarding
  data between simple virtual serial devices.

  Say y here if you need to connect to MCTP endpoints over serial. To
  compile as a module, use m; the module will be called mctp-serial.

Once the N_MCTP line discipline is set [using ioctl(TIOCSETD)], we get a
new netdev suitable for MCTP communication.
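
Setting the line discipline by hand could look like the sketch below. The
function name attach_mctp_ldisc() is hypothetical, and the fallback value
28 for N_MCTP is an assumption for build environments whose headers do not
define it yet; the call only succeeds on a kernel that provides the MCTP
serial line discipline.

```c
#include <sys/ioctl.h>

#ifndef N_MCTP
#define N_MCTP 28 /* assumption: fallback when the headers lack N_MCTP */
#endif

/*
 * Hypothetical helper: switch an already opened tty to the MCTP line
 * discipline. Returns 0 on success and -1 on failure (errno is set by
 * ioctl()).
 */
static int attach_mctp_ldisc(int tty_fd)
{
	int ldisc = N_MCTP;

	return ioctl(tty_fd, TIOCSETD, &ldisc);
}
```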

The 'mctp' utility[1] provides a simple wrapper for this ioctl, using
'link serial <device>':

  # mctp link serial /dev/ttyS0 &
  # mctp link
  dev lo index 1 address 0x00:00:00:00:00:00 net 1 mtu 65536 up
  dev mctpserial0 index 5 address 0x(no-addr) net 1 mtu 68 down

[1]: https://github.com/CodeConstruct/mctp

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mlxsw-updates'
David S. Miller [Tue, 23 Nov 2021 11:44:31 +0000 (11:44 +0000)]
Merge branch 'mlxsw-updates'

Ido Schimmel says:

====================
mlxsw: Various updates

Patch #1 removes deadcode reported by Coverity.

Patch #2 adds a shutdown method in the PCI driver to ensure the kexeced
kernel starts working with a device that is in a sane state.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: pci: Add shutdown method in PCI driver
Danielle Ratson [Tue, 23 Nov 2021 07:54:47 +0000 (09:54 +0200)]
mlxsw: pci: Add shutdown method in PCI driver

On an arm64 platform with the Spectrum ASIC, after loading and executing
a new kernel via kexec, the following trace [1] is observed. This seems
to be caused by the fact that the device is not properly shut down before
executing the new kernel.

Fix this by implementing a shutdown method which mirrors the remove
method, as recommended by the kexec maintainer [2][3].

[1]
BUG: Bad page state in process devlink pfn:22f73d
page:fffffe00089dcf40 refcount:-1 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x2ffff00000000000()
raw: 2ffff00000000000 0000000000000000 ffffffff089d0201 0000000000000000
raw: 0000000000000000 0000000000000000 ffffffffffffffff 0000000000000000
page dumped because: nonzero _refcount
Modules linked in:
CPU: 1 PID: 16346 Comm: devlink Tainted: G B 5.8.0-rc6-custom-273020-gac6b365b1bf5 #44
Hardware name: Marvell Armada 7040 TX4810M (DT)
Call trace:
 dump_backtrace+0x0/0x1d0
 show_stack+0x1c/0x28
 dump_stack+0xbc/0x118
 bad_page+0xcc/0xf8
 check_free_page_bad+0x80/0x88
 __free_pages_ok+0x3f8/0x418
 __free_pages+0x38/0x60
 kmem_freepages+0x200/0x2a8
 slab_destroy+0x28/0x68
 slabs_destroy+0x60/0x90
 ___cache_free+0x1b4/0x358
 kfree+0xc0/0x1d0
 skb_free_head+0x2c/0x38
 skb_release_data+0x110/0x1a0
 skb_release_all+0x2c/0x38
 consume_skb+0x38/0x130
 __dev_kfree_skb_any+0x44/0x50
 mlxsw_pci_rdq_fini+0x8c/0xb0
 mlxsw_pci_queue_fini.isra.0+0x28/0x58
 mlxsw_pci_queue_group_fini+0x58/0x88
 mlxsw_pci_aqs_fini+0x2c/0x60
 mlxsw_pci_fini+0x34/0x50
 mlxsw_core_bus_device_unregister+0x104/0x1d0
 mlxsw_devlink_core_bus_device_reload_down+0x2c/0x48
 devlink_reload+0x44/0x158
 devlink_nl_cmd_reload+0x270/0x290
 genl_rcv_msg+0x188/0x2f0
 netlink_rcv_skb+0x5c/0x118
 genl_rcv+0x3c/0x50
 netlink_unicast+0x1bc/0x278
 netlink_sendmsg+0x194/0x390
 __sys_sendto+0xe0/0x158
 __arm64_sys_sendto+0x2c/0x38
 el0_svc_common.constprop.0+0x70/0x168
 do_el0_svc+0x28/0x88
 el0_sync_handler+0x88/0x190
 el0_sync+0x140/0x180

[2]
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1195432.html

[3]
https://patchwork.kernel.org/project/linux-scsi/patch/20170212214920.28866-1-anton@ozlabs.org/#20116693

Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agomlxsw: spectrum_router: Remove deadcode in mlxsw_sp_rif_mac_profile_find
Danielle Ratson [Tue, 23 Nov 2021 07:54:46 +0000 (09:54 +0200)]
mlxsw: spectrum_router: Remove deadcode in mlxsw_sp_rif_mac_profile_find

The function idr_for_each_entry() already checks that the next entry in
the IDR is not NULL.

Therefore, checking that again in every iteration leads to deadcode.

Remove the unnecessary check in order to avoid that.

Addresses-Coverity: ("Logically dead code")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoRDMA/irdma: Set protocol based on PF rdma_mode flag
Shiraz Saleem [Mon, 18 Oct 2021 23:16:03 +0000 (18:16 -0500)]
RDMA/irdma: Set protocol based on PF rdma_mode flag

Set the RDMA protocol to use at driver bind time based on the ice PF's
rdma_mode flag.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agonet/ice: Add support for enable_iwarp and enable_roce devlink param
Shiraz Saleem [Mon, 18 Oct 2021 23:16:02 +0000 (18:16 -0500)]
net/ice: Add support for enable_iwarp and enable_roce devlink param

Allow support for 'enable_iwarp' and 'enable_roce' devlink params to turn
on/off iWARP or RoCE protocol support for E800 devices.

For example, a user can turn on iWARP functionality with,

devlink dev param set pci/0000:07:00.0 name enable_iwarp value true cmode runtime

This adds an iWARP auxiliary rdma device, ice.iwarp.<>, under this PF.

A user request to enable both iWARP and RoCE under the same PF is rejected
since this device does not support both protocols simultaneously on the
same port.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agodevlink: Add 'enable_iwarp' generic device param
Shiraz Saleem [Mon, 18 Oct 2021 23:16:01 +0000 (18:16 -0500)]
devlink: Add 'enable_iwarp' generic device param

Add a new device generic parameter to enable and disable
iWARP functionality on a multi-protocol RDMA device.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Tested-by: Leszek Kaliszczuk <leszek.kaliszczuk@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoMerge branch 'qca8k-next'
David S. Miller [Mon, 22 Nov 2021 15:35:17 +0000 (15:35 +0000)]
Merge branch 'qca8k-next'

Ansuel Smith says:

====================
Multiple cleanup and feature for qca8k

This is a reduced version of the old massive series.
Refer to the changelog to know what is removed from this.

We clean up and convert the driver to GENMASK/FIELD_PREP to unify the
various naming schemes in use (for example, we have a mix of _MASK, _S, _M
and other names). The idea is to "simplify" and remove some shift and
data handling by using the FIELD API.
The patch triggers various checkpatch warnings, which are ignored to avoid
creating more mess in the header file (fixing the too-long-line warnings
would make the reg definitions less readable).

We convert the driver to the regmap API as the ipq40xx SoC is based on the
same reg structure and we need to generalize the read/write access to split
the driver into common and specific code.

The conversion to regmap is NOT done for the read/write/rmw operations;
those functions are reworked to use the regmap helper instead.
This is done to keep the patch delta low, and the full conversion will come
sooner or later when the code is split.
Any new feature will directly use the regmap helper, and the reg
set/clear and busy wait functions are migrated to the regmap helper as
these functions have few users.

We also add a minor patch for the MIB counters, as qca8337 is missing 2
extra counters, plus support for mdb and ageing settings.

v3:
- Try to reduce regmap conversion patch
v2:
- Move regmap init to sw_probe instead of moving switch id check.
- Removed LAGs, mirror extra features will come later in another
  smaller series.
- Squash 2 ageing patch
- Add more description about the regmap patch
- Rework mdb patch to do operation under the same lock
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: add support for mdb_add/del
Ansuel Smith [Mon, 22 Nov 2021 15:23:48 +0000 (16:23 +0100)]
net: dsa: qca8k: add support for mdb_add/del

Add support for the mdb add/del functions. The ARL table is used to insert
the rule. The rule is searched for, deleted and reinserted with the
port mask updated. The function checks whether the rule has to be updated
or can be inserted directly with no deletion of the old rule.
If every port is removed from the port mask, the rule is removed.
The rule is set STATIC in the ARL table (i.e. it doesn't age) so that it
is not flushed by the fast age function.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: add set_ageing_time support
Ansuel Smith [Mon, 22 Nov 2021 15:23:47 +0000 (16:23 +0100)]
net: dsa: qca8k: add set_ageing_time support

qca8k supports setting the ageing time in steps of 7s. Add support for it
and set the maximum accepted value to 7645m.
The documentation talks about support for 10000m, but that value doesn't
make sense as it doesn't match the max value in the reg.
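
As a worked example of the arithmetic, assuming the register field is 16
bits wide (an assumption, chosen because 65535 * 7s = 458745s, i.e. about
7645.75 minutes, which matches the 7645m limit above), the conversion could
be sketched as follows. The macro and function names are hypothetical.

```c
#include <stdint.h>

#define AGE_STEP_SECS 7u      /* one register unit is 7 seconds */
#define AGE_REG_MAX   0xffffu /* assumption: 16-bit register field */

/*
 * Hypothetical conversion: returns the register value for the requested
 * ageing time in milliseconds, or -1 when the value does not fit.
 */
static long ageing_msecs_to_reg(uint32_t msecs)
{
	uint32_t secs = msecs / 1000;
	uint32_t val = secs / AGE_STEP_SECS;

	if (val > AGE_REG_MAX)
		return -1;
	return (long)val;
}
```

A request for 10000 minutes would need 600000 / 7 = 85714 units, which
does not fit in 16 bits, in line with the remark about the documentation.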

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: add support for port fast aging
Ansuel Smith [Mon, 22 Nov 2021 15:23:46 +0000 (16:23 +0100)]
net: dsa: qca8k: add support for port fast aging

The switch supports fast aging by flushing any rule in the ARL
table for a specific port.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: add additional MIB counter and make it dynamic
Ansuel Smith [Mon, 22 Nov 2021 15:23:45 +0000 (16:23 +0100)]
net: dsa: qca8k: add additional MIB counter and make it dynamic

We are currently missing 2 additional MIB counters present in the QCA833x
switch.
The QCA832x switch has 39 MIB counters and QCA833x has 41 MIB counters.
Add the additional MIB counters and rework the MIB function to print the
correct supported counters from the match_data struct.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: initial conversion to regmap helper
Ansuel Smith [Mon, 22 Nov 2021 15:23:44 +0000 (16:23 +0100)]
net: dsa: qca8k: initial conversion to regmap helper

Convert any qca8k set/clear/poll to the regmap helper and add the
missing config to the regmap_config struct.
Read/write/rmw operations are reworked to use the regmap helper
internally to keep the delta of this patch low. These additional
functions will then be dropped when the code split is proposed.

The Ipq40xx SoC has an internal switch based on the qca8k regmap but uses
mmio for read/write/rmw operations instead of mdio.
In preparation for the support of this internal switch, convert the
driver to the regmap API to later split the driver into common and specific
code. The overhead introduced by the use of the regmap API is marginal as the
internal mdio will bypass it by using its direct access and regmap will be
used only by configuration functions or fdb access.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: move regmap init in probe and set it mandatory
Ansuel Smith [Mon, 22 Nov 2021 15:23:43 +0000 (16:23 +0100)]
net: dsa: qca8k: move regmap init in probe and set it mandatory

In preparation for regmap conversion, move regmap init in the probe
function and make it mandatory as any read/write/rmw operation will be
converted to regmap API.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: remove extra mutex_init in qca8k_setup
Ansuel Smith [Mon, 22 Nov 2021 15:23:42 +0000 (16:23 +0100)]
net: dsa: qca8k: remove extra mutex_init in qca8k_setup

Mutex is already init in sw_probe. Remove the extra init in qca8k_setup.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: convert to GENMASK/FIELD_PREP/FIELD_GET
Ansuel Smith [Mon, 22 Nov 2021 15:23:41 +0000 (16:23 +0100)]
net: dsa: qca8k: convert to GENMASK/FIELD_PREP/FIELD_GET

Convert and try to standardize bit fields using
GENMASK/FIELD_PREP/FIELD_GET macros. Rework some logic to support the
standard macro and tidy things up. No functional change intended.
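
For readers unfamiliar with these macros, here is a simplified user-space
approximation of how they behave for 32-bit registers. The *32 names and
DEMO_FIELD are hypothetical, the kernel versions add compile-time checks
that are omitted here, and __builtin_ctz() assumes GCC or Clang.

```c
#include <stdint.h>

/* Contiguous mask covering bits h..l (inclusive) of a 32-bit value. */
#define GENMASK32(h, l) \
	((uint32_t)(((~UINT32_C(0)) << (l)) & ((~UINT32_C(0)) >> (31 - (h)))))

/* Position of the lowest set bit of the mask, i.e. the field shift. */
#define FIELD_SHIFT(mask)     ((unsigned int)__builtin_ctz(mask))

/* Place a value into the field described by mask. */
#define FIELD_PREP32(mask, v) ((((uint32_t)(v)) << FIELD_SHIFT(mask)) & (mask))

/* Extract the field described by mask from a register value. */
#define FIELD_GET32(mask, r)  ((((uint32_t)(r)) & (mask)) >> FIELD_SHIFT(mask))

/* Hypothetical register field occupying bits 11..8. */
#define DEMO_FIELD GENMASK32(11, 8)
```

The benefit is that a field is described once by its mask, so separate
_S (shift) and _M (mask) definitions no longer have to be kept in sync.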

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: qca8k: remove redundant check in parse_port_config
Ansuel Smith [Mon, 22 Nov 2021 15:23:40 +0000 (16:23 +0100)]
net: dsa: qca8k: remove redundant check in parse_port_config

The very next check for port 0 and 6 already makes sure we don't go out
of bounds with the ports_config delay table.
Remove the redundant check.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'skbuff-struct-group'
David S. Miller [Mon, 22 Nov 2021 15:13:54 +0000 (15:13 +0000)]
Merge branch 'skbuff-struct-group'

Kees Cook says:

====================
skbuff: Switch structure bounds to struct_group()

This is a pair of patches to add struct_group() to struct sk_buff. The
first is needed to work around sparse-specific complaints, and is new
for v2. The second patch is the same as originally sent as v1.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoskbuff: Switch structure bounds to struct_group()
Kees Cook [Sun, 21 Nov 2021 00:31:49 +0000 (16:31 -0800)]
skbuff: Switch structure bounds to struct_group()

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy(), memmove(), and memset(), avoid
intentionally writing across neighboring fields.

Replace the existing empty member position markers "headers_start" and
"headers_end" with a struct_group(). This will allow memcpy() and sizeof()
to more easily reason about sizes, and improve readability.

"pahole" shows no size nor member offset changes to struct sk_buff.
"objdump -d" shows no object code changes (outside of WARNs affected by
source line number changes).
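
The idea behind struct_group() can be sketched with a simplified macro. It
is not the kernel definition, and struct pkt and its fields are
hypothetical. The members are declared twice inside a union, once
anonymously and once under a name, so they stay directly accessible while
the named view gives memcpy() and sizeof() a single bounded object.

```c
#include <stddef.h>
#include <string.h>

/* Simplified sketch of the grouping macro (C11 anonymous struct/union). */
#define struct_group_sketch(NAME, ...) \
	union { \
		struct { __VA_ARGS__ }; \
		struct { __VA_ARGS__ } NAME; \
	}

/* Hypothetical structure with a group of header fields. */
struct pkt {
	int before;
	struct_group_sketch(headers,
		unsigned short protocol;
		unsigned short flags;
		unsigned int mark;
	);
	int after;
};

/* Copies only the grouped fields; the size comes from the group itself. */
static void pkt_copy_headers(struct pkt *dst, const struct pkt *src)
{
	memcpy(&dst->headers, &src->headers, sizeof(dst->headers));
}
```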

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com> # drivers/net/wireguard/*
Link: https://lore.kernel.org/lkml/20210728035006.GD35706@embeddedor
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoskbuff: Move conditional preprocessor directives out of struct sk_buff
Kees Cook [Sun, 21 Nov 2021 00:31:48 +0000 (16:31 -0800)]
skbuff: Move conditional preprocessor directives out of struct sk_buff

In preparation for using the struct_group() macro in struct sk_buff,
move the conditional preprocessor directives out of the region of struct
sk_buff that will be enclosed by struct_group(). While GCC and Clang are
happy with conditional preprocessor directives here, sparse is not, even
under -Wno-directive-within-macro[1], as would be seen under a C=1 build:

net/core/filter.c: note: in included file (through include/linux/netlink.h, include/linux/sock_diag.h):
./include/linux/skbuff.h:820:1: warning: directive in macro's argument list
./include/linux/skbuff.h:822:1: warning: directive in macro's argument list
./include/linux/skbuff.h:846:1: warning: directive in macro's argument list
./include/linux/skbuff.h:848:1: warning: directive in macro's argument list

Additionally remove empty macro argument definitions and usage.

"objdump -d" shows no object code differences.

[1] https://www.spinics.net/lists/linux-sparse/msg10857.html

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosections: global data can be in .bss
Antoine Tenart [Mon, 22 Nov 2021 14:24:56 +0000 (15:24 +0100)]
sections: global data can be in .bss

When checking an address is located in a global data section also check
for the .bss section as global variables initialized to 0 can be in
there (-fzero-initialized-in-bss).

This was found when looking at ensure_safe_net_sysctl which was failing
to detect non-init sysctl pointing to a global data section when the
data was in the .bss section.
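
The idea can be sketched in user space as a pair of range checks. Everything below is hypothetical: struct demo_section and the demo_* helpers stand in for the linker-provided section bounds and are not the kernel's actual helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, end) of a linker section. */
struct demo_section {
	uintptr_t start;
	uintptr_t end;
};

static bool demo_in_section(const struct demo_section *s, uintptr_t addr)
{
	assert(s->start <= s->end);
	return addr >= s->start && addr < s->end;
}

/* An object counts as global data if it sits in .data or in .bss. */
static bool demo_is_global_data(const struct demo_section *data,
				const struct demo_section *bss,
				uintptr_t addr)
{
	return demo_in_section(data, addr) || demo_in_section(bss, addr);
}
```

A check that only consulted the first range would miss any zero-initialized global that the compiler placed in the second one.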

Signed-off-by: Antoine Tenart <atenart@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoarp: Remove #ifdef CONFIG_PROC_FS
Yajun Deng [Mon, 22 Nov 2021 07:02:36 +0000 (15:02 +0800)]
arp: Remove #ifdef CONFIG_PROC_FS

proc_create_net() and remove_proc_entry() already handle the case where
CONFIG_PROC_FS is not defined, so remove #ifdef CONFIG_PROC_FS.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agohv_netvsc: Use bitmap_zalloc() when applicable
Christophe JAILLET [Sun, 21 Nov 2021 21:56:39 +0000 (22:56 +0100)]
hv_netvsc: Use bitmap_zalloc() when applicable

'send_section_map' is a bitmap. So use 'bitmap_zalloc()' to simplify code,
improve the semantics and avoid some open-coded arithmetic in allocator
arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

While at it, change an '== NULL' test into a '!'.
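
For readers unfamiliar with the pattern, the following user-space sketch shows the kind of open-coded arithmetic that a bitmap allocator hides. The demo_* names and DEMO_* macros are hypothetical analogues built on calloc() and free(); they are not the kernel implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
/* Number of longs needed to hold n bits, rounding up. */
#define DEMO_BITS_TO_LONGS(n)	\
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Hypothetical analogue of bitmap_zalloc(): zeroed storage for nbits bits. */
static unsigned long *demo_bitmap_zalloc(size_t nbits)
{
	assert(nbits > 0);
	return calloc(DEMO_BITS_TO_LONGS(nbits), sizeof(unsigned long));
}

/* Matching release helper, so allocation and free read as a pair. */
static void demo_bitmap_free(unsigned long *bitmap)
{
	free(bitmap);
}
```

Callers then pass only the number of bits and never repeat the rounding computation at each allocation site.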

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoqed: Use the bitmap API to simplify some functions
Christophe JAILLET [Sun, 21 Nov 2021 19:12:54 +0000 (20:12 +0100)]
qed: Use the bitmap API to simplify some functions

'cid_map' is a bitmap. So use 'bitmap_zalloc()' to simplify code,
improve the semantics and avoid some open-coded arithmetic in allocator
arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Also change some 'memset()' into 'bitmap_zero()' to keep consistency. This
is also much less verbose.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet-sysfs: Slightly optimize 'xps_queue_show()'
Christophe JAILLET [Sun, 21 Nov 2021 18:01:03 +0000 (19:01 +0100)]
net-sysfs: Slightly optimize 'xps_queue_show()'

The 'mask' bitmap is local to this function. So the non-atomic
'__set_bit()' can be used to save a few cycles.
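
The difference between the two flavours can be sketched in user space with C11 atomics. The demo_* functions are hypothetical illustrations, not the kernel's set_bit()/__set_bit() implementations.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Atomic variant: an atomic read-modify-write, safe against other writers. */
static void demo_set_bit_atomic(size_t nr, atomic_ulong *map)
{
	atomic_fetch_or(&map[nr / DEMO_BITS_PER_LONG],
			1UL << (nr % DEMO_BITS_PER_LONG));
}

/* Non-atomic variant: a plain OR, valid only if nobody else touches map. */
static void demo_set_bit_local(size_t nr, unsigned long *map)
{
	map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

/* The mask lives on this function's stack, so the plain variant suffices. */
static unsigned long demo_build_mask(const size_t *bits, size_t count)
{
	unsigned long mask[1] = { 0 };
	size_t i;

	for (i = 0; i < count; i++) {
		assert(bits[i] < DEMO_BITS_PER_LONG);
		demo_set_bit_local(bits[i], mask);
	}
	return mask[0];
}
```

Because no other thread can observe the stack-local mask while it is being built, the cheaper plain store gives the same result as the atomic one.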

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agords: Fix a typo in a comment
Christophe JAILLET [Sun, 21 Nov 2021 15:32:04 +0000 (16:32 +0100)]
rds: Fix a typo in a comment

s/cold/could/

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-By: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoFix coverity issue 'Uninitialized scalar variable"
Yacov Simhony [Sun, 21 Nov 2021 15:02:53 +0000 (17:02 +0200)]
Fix coverity issue 'Uninitialized scalar variable"

There are three boolean variables which were not initialized but were
later used in the code.

Signed-off-by: Yacov Simhony <ysimhony@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agopcmcia: hide the MAC address helpers if !NET
Jakub Kicinski [Sat, 20 Nov 2021 17:15:10 +0000 (09:15 -0800)]
pcmcia: hide the MAC address helpers if !NET

pcmcia_get_mac_from_cis is only called from networking and
recent changes made it call dev_addr_mod() which is itself
only defined if NET.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: adeef3e32146 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotsn: Fix build.
David S. Miller [Mon, 22 Nov 2021 13:56:22 +0000 (13:56 +0000)]
tsn: Fix build.

Due to const dev_addr changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: wwan: iosm: device trace collection using relayfs
M Chetan Kumar [Sat, 20 Nov 2021 16:21:55 +0000 (21:51 +0530)]
net: wwan: iosm: device trace collection using relayfs

This patch brings in support for device trace collection.
It implements relayfs interface for pushing device trace
from kernel space to user space.

The driver gets the debugfs base directory associated with the WWAN
device and creates the trace_control and trace debugfs entries for
device tracing. Both trace_control & trace debugfs entries are
created under /sys/kernel/debug/wwan/wwan0/.

In order to collect device trace on the trace0 interface, the user
needs to write 1 to the trace_ctl interface.

Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: wwan: common debugfs base dir for wwan device
M Chetan Kumar [Sat, 20 Nov 2021 16:21:54 +0000 (21:51 +0530)]
net: wwan: common debugfs base dir for wwan device

This patch set brings in a common debugfs base directory,
i.e. /sys/kernel/debug/wwan/, in the WWAN subsystem for a
WWAN device instance. This keeps drivers from polluting the
debugfs root with unrelated directories and avoids possible
name collisions.

Having a common debugfs base directory for WWAN drivers makes it
easier for users to match control devices with debugfs entries.

WWAN Subsystem creates dentry (/sys/kernel/debug/wwan)
on module load & removes dentry on module unload.

When driver registers a new wwan device, dentry (wwanX)
is created for WWAN device instance & on driver unregister
dentry is removed.

New API is introduced to return the wwan device instance
dentry so that driver can create debugfs entries under it.

Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoocteon: constify netdev->dev_addr
Jakub Kicinski [Sat, 20 Nov 2021 15:31:19 +0000 (07:31 -0800)]
octeon: constify netdev->dev_addr

Argument of a helper is missing a const.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: adeef3e32146 ("net: constify netdev->dev_addr")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mana: Add XDP support
Haiyang Zhang [Sat, 20 Nov 2021 00:29:53 +0000 (16:29 -0800)]
net: mana: Add XDP support

Add support of XDP for the MANA driver.

Supported XDP actions:
XDP_PASS, XDP_TX, XDP_DROP, XDP_ABORTED

XDP actions not yet supported:
XDP_REDIRECT

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'tsn-endpoint-driver'
David S. Miller [Mon, 22 Nov 2021 13:19:04 +0000 (13:19 +0000)]
Merge branch 'tsn-endpoint-driver'

Gerhard Engleder says:

====================
TSN endpoint Ethernet MAC driver

This series adds a driver for my FPGA based TSN endpoint Ethernet MAC.
It also includes the device tree binding.

The device is designed as Ethernet MAC for TSN networks. It will be used
in PLCs with real-time requirements up to isochronous communication with
protocols like OPC UA Pub/Sub.

v3:
 - set MAC mode based on PHY information (Andrew Lunn)
 - remove/postpone loopback mode interface (Andrew Lunn)
 - add suppress_preamble node support (Andrew Lunn)
 - add mdio timeout (Andrew Lunn)
 - no need to call phy_start_aneg (Andrew Lunn)
 - remove unreachable code (Andrew Lunn)
 - move 'struct napi_struct' closer to queues (Vinicius Costa Gomes)
 - remove unused variable (kernel test robot)
 - switch from mdio interrupt to polling
 - mdio register without PHY address flag
 - thread safe interrupt enable register
 - add PTP_1588_CLOCK_OPTIONAL dependency to Kconfig
 - introduce dmadev for DMA allocation
 - mdiobus for platforms without device tree
 - prepare MAC address support for platforms without device tree
 - add missing interrupt disable to probe error path

v2:
 - add C45 check (Andrew Lunn)
 - forward phy_connect_direct() return value (Andrew Lunn)
 - use phy_remove_link_mode() (Andrew Lunn)
 - do not touch PHY directly, use PHY subsystem (Andrew Lunn)
 - remove management data lock (Andrew Lunn)
 - use phy_loopback (Andrew Lunn)
 - remove GMII2RGMII handling, use xgmiitorgmii (Andrew Lunn)
 - remove char device for direct TX/RX queue access (Andrew Lunn)
 - mdio node for mdiobus (Rob Herring)
 - simplify compatible node (Rob Herring)
 - limit number of items of reg and interrupts nodes (Rob Herring)
 - restrict phy-connection-type node (Rob Herring)
 - reference to mdio.yaml under mdio node (Rob Herring)
 - remove device tree (Michal Simek)
 - fix %llx warning (kernel test robot)
 - fix unused tmp variable warning (kernel test robot)
 - add missing of_node_put() for of_parse_phandle()
 - use devm_mdiobus_alloc()
 - simplify mdiobus read/write
 - reduce required nodes
 - ethtool priv flags interface for loopback
 - add missing static for some functions
 - remove obsolete hardware defines
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotsnep: Add TSN endpoint Ethernet MAC driver
Gerhard Engleder [Fri, 19 Nov 2021 22:58:26 +0000 (23:58 +0100)]
tsnep: Add TSN endpoint Ethernet MAC driver

The TSN endpoint Ethernet MAC is an FPGA based network device for
real-time communication.

It is integrated as Ethernet controller with ethtool and PTP support.
For real-time communication TC_SETUP_QDISC_TAPRIO is supported.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: net: Add tsnep Ethernet controller
Gerhard Engleder [Fri, 19 Nov 2021 22:58:25 +0000 (23:58 +0100)]
dt-bindings: net: Add tsnep Ethernet controller

The TSN endpoint Ethernet MAC is an FPGA based network device for
real-time communication.

It is integrated as normal Ethernet controller with
ethernet-controller.yaml and mdio.yaml.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodt-bindings: Add vendor prefix for Engleder
Gerhard Engleder [Fri, 19 Nov 2021 22:58:24 +0000 (23:58 +0100)]
dt-bindings: Add vendor prefix for Engleder

Engleder develops FPGA based controllers for real-time communication.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: handle NA interface mode in phylink_fwnode_phy_connect()
Russell King (Oracle) [Fri, 19 Nov 2021 16:28:06 +0000 (16:28 +0000)]
net: phylink: handle NA interface mode in phylink_fwnode_phy_connect()

Commit 4904b6ea1f9db ("net: phy: phylink: Use PHY device interface if
N/A") introduced handling for the phy interface mode where this is not
known at phylink creation time. This was never added to the OF/fwnode
paths, but is necessary when the phy is present in DT, but the phy-mode
is not specified.

Add this handling.
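
The fallback amounts to the following, shown here as a user-space sketch. The enum and demo_resolve_interface() are hypothetical and only mirror the idea; they are not the phylink code itself.

```c
#include <assert.h>

/* Hypothetical subset of interface modes; NA means "not specified". */
enum demo_interface {
	DEMO_MODE_NA,
	DEMO_MODE_SGMII,
	DEMO_MODE_RGMII,
};

/* NA is the zero value, so a zero-initialized field reads as "unknown". */
static_assert(DEMO_MODE_NA == 0, "NA must be the default value");

/*
 * If the link mode was not known when the link was created (for example,
 * no phy-mode in the device tree), fall back to the mode the PHY reports.
 */
static enum demo_interface
demo_resolve_interface(enum demo_interface link_mode,
		       enum demo_interface phy_mode)
{
	if (link_mode == DEMO_MODE_NA)
		return phy_mode;
	return link_mode;
}
```

An explicitly configured mode still wins; the PHY's own mode is consulted only when nothing was specified.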

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phylink: Add helpers for c22 registers without MDIO
Sean Anderson [Fri, 19 Nov 2021 15:58:09 +0000 (10:58 -0500)]
net: phylink: Add helpers for c22 registers without MDIO

Some devices expose memory-mapped c22-compliant PHYs. Because these
devices do not have an MDIO bus, we cannot use the existing helpers.
Refactor the existing helpers to allow supplying the values for c22
registers directly, instead of using MDIO to access them. Only get_state
and set_advertisement are converted, since they contain the most complex
logic. Because set_advertisement is never actually used outside
phylink_mii_c22_pcs_config, move the MDIO-writing part into that
function. Because some modes do not need the advertisement register set
at all, we use -EINVAL to indicate that case.

Additionally, a new function phylink_pcs_enable_an is provided to
determine whether to enable autonegotiation.
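
The shape of such a refactoring can be sketched in user space: the decoding logic takes raw register values, and a thin wrapper decides how they are read. The struct, the demo_* functions, and the array-backed "memory-mapped" PHY are hypothetical; only the two status bits follow the clause 22 basic status register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Clause 22 basic status register (register 1) bits. */
#define DEMO_BMSR_LSTATUS	0x0004	/* link status */
#define DEMO_BMSR_ANEGCOMPLETE	0x0020	/* autonegotiation complete */

struct demo_link_state {
	bool link;
	bool an_complete;
};

/* Decode from a raw value; the caller decides how it was obtained. */
static void demo_decode_state(struct demo_link_state *state, uint16_t bmsr)
{
	assert(state != NULL);
	state->link = !!(bmsr & DEMO_BMSR_LSTATUS);
	state->an_complete = !!(bmsr & DEMO_BMSR_ANEGCOMPLETE);
}

/* Hypothetical memory-mapped PHY: registers are 16-bit words in an array. */
static void demo_mmio_get_state(const volatile uint16_t *regs,
				struct demo_link_state *state)
{
	demo_decode_state(state, regs[1]);	/* register 1 is the BMSR */
}
```

An MDIO-based caller would read the same register over the bus and pass the value to the same decoder, so the bit handling is written only once.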

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: annotate accesses to dev->gso_max_segs
Eric Dumazet [Fri, 19 Nov 2021 15:43:32 +0000 (07:43 -0800)]
net: annotate accesses to dev->gso_max_segs

dev->gso_max_segs is written under RTNL protection, or when the device is
not yet visible, but is read locklessly.

Add netif_set_gso_max_segs() helper.

Add the READ_ONCE()/WRITE_ONCE() pairs, and use netif_set_gso_max_segs()
where we can to better document what is going on.
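
As a rough user-space illustration of the annotation pattern, the sketch below funnels every write through one setter and marks the lockless read. The DEMO_* macros are simplified volatile-cast stand-ins (relying on the GCC/Clang __typeof__ extension), and struct demo_dev and the demo_* helpers are hypothetical; none of this is the kernel's actual definition.

```c
#include <assert.h>
#include <limits.h>

/* Simplified stand-ins: force exactly one access through a volatile lvalue. */
#define DEMO_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

struct demo_dev {
	unsigned short gso_max_segs;
};

/* Writer side: assumed to run under a lock or before the device is visible. */
static void demo_set_gso_max_segs(struct demo_dev *dev, unsigned int segs)
{
	assert(segs <= USHRT_MAX);
	DEMO_WRITE_ONCE(dev->gso_max_segs, segs);
}

/* Reader side: lockless, so the load is annotated as well. */
static unsigned int demo_get_gso_max_segs(struct demo_dev *dev)
{
	return DEMO_READ_ONCE(dev->gso_max_segs);
}
```

Keeping the store in one helper makes the pairing with the annotated loads easy to audit.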

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>