Vladimir Oltean [Fri, 11 Jun 2021 19:01:24 +0000 (22:01 +0300)]
net: dsa: generalize overhead for taggers that use both headers and trailers
Some really really weird switches just couldn't decide whether to use a
normal or a tail tagger, so they just did both.
This creates problems for DSA, because we only have the concept of an
'overhead' which can be applied to the headroom or to the tailroom of
the skb (like for example during the central TX reallocation procedure),
depending on the value of bool tail_tag, but not to both.
We need to generalize DSA to cater for these odd switches by
transforming the 'overhead / tail_tag' pair into 'needed_headroom /
needed_tailroom'.
The DSA master's MTU is increased to account for both.
The flow dissector code is modified such that it only calls the DSA
adjustment callback if the tagger has a non-zero header length.
Taggers are trivially modified to declare either needed_headroom or
needed_tailroom, based on the tail_tag value that they currently
declare.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 11 Jun 2021 19:01:23 +0000 (22:01 +0300)]
net: dsa: sja1105: allow RX timestamps to be taken on all ports for SJA1110
On SJA1105, there is support for a cascade port which is presumably
connected to a downstream SJA1105 switch. The upstream one does not take
PTP timestamps for packets received on this port, presumably because the
downstream switch already did (and for PTP, it only makes sense for the
leaf nodes in a DSA switch tree to do that).
I haven't been able to validate that feature in a fully assembled setup,
so I am disabling the feature by setting the cascade port to an unused
port value (ds->num_ports).
In SJA1110, multiple cascade ports are supported, and CASC_PORT became
a bit mask from a port number. So when CASC_PORT is set to ds->num_ports
(which is 11 on SJA1110), it is actually set to 0b1011, so ports 3, 1
and 0 are configured as cascade ports and we cannot take RX timestamps
on them.
So we need to introduce a check for SJA1110 and set things differently
(to zero there), so that the cascading feature is properly disabled and
RX timestamps can be taken on all ports.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 11 Jun 2021 19:01:22 +0000 (22:01 +0300)]
net: dsa: sja1105: enable the TTEthernet engine on SJA1110
As opposed to SJA1105 where there are parts with TTEthernet and parts
without, in SJA1110 all parts support it, but it must be enabled in the
static config. So enable it unconditionally. We use it for the tc-taprio
and tc-gate offload.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 11 Jun 2021 19:43:16 +0000 (12:43 -0700)]
Merge branch 'hns3-ptp'
Guangbin Huang says:
====================
net: hns3: add support for PTP
This series adds PTP support for the HNS3 ethernet driver.
change log:
V1 -> V2:
1. use spinlock to prevent concurrency
2. add the handling when ptp_clock_register() returns NULL
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Thu, 10 Jun 2021 13:38:57 +0000 (21:38 +0800)]
net: hns3: add debugfs support for ptp info
Add a debugfs interface for dumping ptp information, which
is helpful for debugging.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Thu, 10 Jun 2021 13:38:56 +0000 (21:38 +0800)]
net: hns3: add support for PTP
Adds PTP support for HNS3 ethernet driver.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 10 Jun 2021 21:50:08 +0000 (14:50 -0700)]
Merge branch 'ipa-mem-2'
Alex Elder says:
====================
net: ipa: memory region rework, part 2
This is the second portion of a set of patches updating the IPA
memory region code.
In this portion (part 2), the focus is on adjusting the code so that
it no longer assumes the memory region descriptor array is indexed
by the region identifier. This brings with it some related cleanup.
Three loops are changed so their loop index variable is an unsigned
rather than an enumerated type.
A set of functions is changed so a region identifier (rather than a
memory region descriptor pointer) is passed as argument, to simplify
their call sites. This isn't entirely related or required, but I
think it improves the code.
A validation function for filter and route table memory regions is
changed to take memory region IDs, rather than determining which
region to validate based on a set of Boolean flags.
Finally, ipa_mem_find() is created to abstract getting a memory
descriptor based on its ID, and it is used everywhere rather than
indexing the array. With that implemented, all of the memory
regions can be defined by arrays of entries defined without
providing index designators.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:08 +0000 (14:23 -0500)]
net: ipa: don't index mem data array by ID
Finally the code handles the IPA memory region array in the
configuration data without assuming it is indexed by region ID.
Get rid of the array index designators where these arrays are
initialized. As a result, there's no more need to define an
explicitly undefined memory region ID, so get rid of that.
Change ipa_mem_find() so it no longer assumes the ipa->mem[] array
is indexed by memory region ID. Instead, have it search the array
for the entry having the requested memory ID, and return the address
of the descriptor if found. Otherwise return NULL.
Stop allowing memory regions to be defined with zero size and zero
canary value. Check for this condition in ipa_mem_valid_one().
As a result, it is not necessary to check for this case in
ipa_mem_config().
Finally, there is no need for IPA_MEM_UNDEFINED to be defined any
more, so get rid of it.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:07 +0000 (14:23 -0500)]
net: ipa: introduce ipa_mem_find()
Introduce a new function that abstracts finding information about a
region in IPA-local memory, given its memory region ID. For now it
simply uses the region ID as an index into the IPA memory array.
If the region is not defined, ipa_mem_find() returns a null pointer.
Update all code that accesses the ipa->mem[] array directly to use
ipa_mem_find() instead. The return value must be checked for null
when optional memory regions are sought.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:06 +0000 (14:23 -0500)]
net: ipa: pass memory id to ipa_table_valid_one()
Stop passing most of the Boolean flags to ipa_table_valid_one(), and
just pass a memory region ID to it instead. We still need to
indicate whether we're operating on a routing or filter table.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:05 +0000 (14:23 -0500)]
net: ipa: pass mem_id to ipa_table_reset_add()
Pass a memory region ID rather than the address of a memory region
descriptor to ipa_table_reset_add() to simplify callers. Similarly,
pass memory region IDs to ipa_table_init_add().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:04 +0000 (14:23 -0500)]
net: ipa: pass mem ID to ipa_mem_zero_region_add()
Pass a memory region ID rather than the address of a memory region
descriptor to ipa_mem_zero_region_add() to simplify callers.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:03 +0000 (14:23 -0500)]
net: ipa: pass mem_id to ipa_filter_reset_table()
Pass a memory region ID rather than the address of a memory region
descriptor to ipa_filter_reset_table(), to simplify callers.
We can eliminate the check for a zero region size in this function
because ipa_table_reset_add() checks that before adding anything to
the transaction.
Note that here and in subsequent commits there is no need to check
whether a memory region exists, because we will have already
verified that during initialization.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:02 +0000 (14:23 -0500)]
net: ipa: clean up header memory validation
Do some general cleanup in ipa_cmd_header_valid():
- Delay assigning the mem variable until just before it's used.
- Assign the maximum offset and size values together.
- Improve comments explaining the single range of memory being
made up of a modem portion and an AP portion.
- Record the offset of the combined range in a local variable.
- Do the initial size assignment right after assigning the offset.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 10 Jun 2021 19:23:01 +0000 (14:23 -0500)]
net: ipa: don't assume mem array indexed by ID
Change ipa_mem_valid() to iterate over the entries using a u32 index
variable rather than using a memory region ID. Use the ID found
inside the memory descriptor rather than the loop index.
Change ipa_mem_size_valid() to iterate over the entries but without
assuming the array index is the memory region ID. "Empty" entries
will have zero size; and we'll temporarily assume such entries have
zero offset as well (they all do, currently).
Similarly, don't assume the mem[] array is indexed by ID in
ipa_mem_config(). There, "empty" entries will have a zero canary
count, so no special assumptions are needed to handle them correctly.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cristobal Forno [Thu, 10 Jun 2021 17:08:35 +0000 (11:08 -0600)]
ibmvnic: Allow device probe if the device is not ready at boot
Allow the device to be initialized at a later time if
it is not available at boot. The device will be allowed to probe but
will be given a "down" state. After completing device probe and
registering the net device, the driver will await an interrupt signal
from its partner device, indicating that it is ready for boot. The
driver will schedule a work event to perform the necessary procedure
and begin operation.
Co-developed-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Cristobal Forno <cforno12@linux.ibm.com>
Acked-by: Lijun Pan <lijunp213@gmail.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 10 Jun 2021 21:20:44 +0000 (14:20 -0700)]
Merge branch 'marvell-prestera-lag'
Vadym Kochan says:
====================
net: marvell: prestera: add LAG support
The following features are supported:
- LAG basic operations
- create/delete LAG
- add/remove a member to LAG
- enable/disable member in LAG
- LAG Bridge support
- LAG VLAN support
- LAG FDB support
Limitations:
- Only HASH lag tx type is supported
- The Hash parameters are not configurable. They are applied
during the LAG creation stage.
- Enslaving a port to the LAG device that already has an
upper device is not supported.
Changes extracted from:
https://lkml.org/lkml/2021/2/3/877
and marked with "v2".
v2:
There are 2 additional preparation patches which simplifies the
netdev topology handling.
1) Initialize 'lag' with NULL in prestera_lag_create() [suggested by Vladimir Oltean]
2) Use -ENOSPC in prestera_lag_port_add() if max lag [suggested by Vladimir Oltean]
numbers were reached.
3) Do not propagate netdev events to prestera_switchdev [suggested by Vladimir Oltean]
but call bridge specific funcs. It simplifies the code.
4) Check on info->link_up in prestera_netdev_port_lower_event() [suggested by Vladimir Oltean]
5) Return -EOPNOTSUPP in prestera_netdev_port_event() in case [suggested by Vladimir Oltean]
LAG hashing mode is not supported.
6) Do not pass "lower" netdev to bridge join/leave functions. [suggested by Vladimir Oltean]
It is not need as offloading settings applied on particular
physical port. It requires to do extra upper dev lookup
in case port is in the LAG which is in the bridge on vlans add/del.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Serhiy Boiko [Thu, 10 Jun 2021 15:43:11 +0000 (18:43 +0300)]
net: marvell: prestera: add LAG support
The following features are supported:
- LAG basic operations
- create/delete LAG
- add/remove a member to LAG
- enable/disable member in LAG
- LAG Bridge support
- LAG VLAN support
- LAG FDB support
Limitations:
- Only HASH lag tx type is supported
- The Hash parameters are not configurable. They are applied
during the LAG creation stage.
- Enslaving a port to the LAG device that already has an
upper device is not supported.
Co-developed-by: Andrii Savka <andrii.savka@plvision.eu>
Signed-off-by: Andrii Savka <andrii.savka@plvision.eu>
Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu>
Co-developed-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadym Kochan [Thu, 10 Jun 2021 15:43:10 +0000 (18:43 +0300)]
net: marvell: prestera: do not propagate netdev events to prestera_switchdev.c
Replace prestera_bridge_port_event(...) by
prestera_bridge_port_join(...) and prestera_bridge_port_leave().
It simplifies the code by reading netdev event specific handling only
once in prestera_main.c
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
CC: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadym Kochan [Thu, 10 Jun 2021 15:43:09 +0000 (18:43 +0300)]
net: marvell: prestera: move netdev topology validation to prestera_main
Move handling of PRECHANGEUPPER event from prestera_switchdev to
prestera_main which is responsible for basic netdev events handling
and routing them to related module.
Signed-off-by: Vadym Kochan <vkochan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tan Zhongjun [Thu, 10 Jun 2021 14:01:18 +0000 (22:01 +0800)]
soc: qcom: ipa: Remove superfluous error message around platform_get_irq()
The platform_get_irq() prints error message telling that interrupt is
missing,hence there is no need to duplicated that message in the
drivers.
Signed-off-by: Tan Zhongjun <tanzhongjun@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Baokun Li [Thu, 10 Jun 2021 13:26:03 +0000 (21:26 +0800)]
dccp: tfrc: fix doc warnings in tfrc_equation.c
Add description for `tfrc_invert_loss_event_rate` to fix the W=1 warnings:
net/dccp/ccids/lib/tfrc_equation.c:695: warning: Function parameter or
member 'loss_event_rate' not described in 'tfrc_invert_loss_event_rate'
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Richard Sailer <richard_siegfried@systemli.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Thu, 10 Jun 2021 13:03:55 +0000 (21:03 +0800)]
atm: Use list_for_each_entry() to simplify code in resources.c
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Thu, 10 Jun 2021 12:54:17 +0000 (20:54 +0800)]
ibmvnic: Use list_for_each_entry() to simplify code in ibmvnic.c
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Acked-by: Lijun Pan <lijunp213@gmail.com>
Reviewed-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Thu, 10 Jun 2021 12:48:26 +0000 (20:48 +0800)]
net: x25: Use list_for_each_entry() to simplify code in x25_route.c
Convert list_for_each() to list_for_each_entry() where
applicable. This simplifies the code.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Thu, 10 Jun 2021 09:25:35 +0000 (17:25 +0800)]
mt76: mt7615: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Thu, 10 Jun 2021 09:17:12 +0000 (17:17 +0800)]
net: mido: mdio-mux-bcm-iproc: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code and avoid a null-ptr-deref by checking 'res' in it.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wong Vee Khee [Thu, 10 Jun 2021 08:53:54 +0000 (16:53 +0800)]
net: stmmac: Fix mixed enum type warning
The commit
5a5586112b92 ("net: stmmac: support FPE link partner
hand-shaking procedure") introduced the following coverity warning:
"Parse warning (PW.MIXED_ENUM_TYPE)"
"1. mixed_enum_type: enumerated type mixed with another type"
This is due to both "lo_state" and "lp_sate" which their datatype are
enum stmmac_fpe_state type, and being assigned with "FPE_EVENT_UNKNOWN"
which is a macro-defined of 0. Fixed this by assigned both these
variables with the correct enum value.
Fixes:
5a5586112b92 ("net: stmmac: support FPE link partner hand-shaking procedure")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Thu, 10 Jun 2021 08:02:43 +0000 (16:02 +0800)]
fjes: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Thu, 10 Jun 2021 07:36:22 +0000 (15:36 +0800)]
net: axienet: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Thu, 10 Jun 2021 07:29:33 +0000 (15:29 +0800)]
net: w5100: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 10 Jun 2021 20:50:43 +0000 (13:50 -0700)]
Merge branch 'ixp4xxx_hss-cleanups'
Peng Li says:
====================
net: ixp4xx_hss: clean up some code style issues
This patchset clean up some code style issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:20:05 +0000 (15:20 +0800)]
net: ixp4xx_hss: add braces {} to all arms of the statement
Braces {} should be used on all arms of this statement.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:20:04 +0000 (15:20 +0800)]
net: ixp4xx_hss: fix the comments style issue
Networking block comments don't use an empty /* line,
use /* Comment...
Block comments use * on subsequent lines.
Block comments use a trailing */ on a separate line.
This patch fixes the comments style issues.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:20:03 +0000 (15:20 +0800)]
net: ixp4xx_hss: remove redundant spaces
According to the chackpatch.pl,
space prohibited after that open parenthesis '('.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:20:02 +0000 (15:20 +0800)]
net: ixp4xx_hss: add some required spaces
Add space required before the open parenthesis '('.
Add space required after that close brace '}'.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:20:01 +0000 (15:20 +0800)]
net: ixp4xx_hss: move out assignment in if condition
Should not use assignment in if condition.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:20:00 +0000 (15:20 +0800)]
net: ixp4xx_hss: fix the code style issue about "foo* bar"
Fix the checkpatch error as "foo* bar" and should be "foo *bar",
and "(foo*)" should be "(foo *)".
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:19:59 +0000 (15:19 +0800)]
net: ixp4xx_hss: add blank line after declarations
This patch fixes the checkpatch error about missing a blank line
after declarations.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Thu, 10 Jun 2021 07:19:58 +0000 (15:19 +0800)]
net: ixp4xx_hss: remove redundant blank lines
This patch removes some redundant blank lines.
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gushengxian [Thu, 10 Jun 2021 06:29:58 +0000 (23:29 -0700)]
tipc:subscr.c: fix a spelling mistake
Fix a spelling mistake.
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gushengxian [Thu, 10 Jun 2021 06:18:53 +0000 (23:18 -0700)]
tipc: socket.c: fix the use of copular verb
Fix the use of copular verb.
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gushengxian [Thu, 10 Jun 2021 05:50:46 +0000 (22:50 -0700)]
node.c: fix the use of indefinite article
Fix the use of indefinite article.
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gushengxian [Thu, 10 Jun 2021 03:09:35 +0000 (20:09 -0700)]
af_unix: remove the repeated word "and"
Remove the repeated word "and".
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
gushengxian [Thu, 10 Jun 2021 01:11:59 +0000 (18:11 -0700)]
vsock/vmci: remove the repeated word "be"
Remove the repeated word "be".
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 10 Jun 2021 20:36:37 +0000 (13:36 -0700)]
Merge tag 'mlx5-updates-2021-06-09' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-06-09
Introduce steering header insert/remove and switchdev bridge offloads
1) From Yevgeny, Steering header insert/remove support
ConnectX supports offloading of various encapsulations and decapsulations
(e.g. VXLAN), which are performed by 'Packet Reformat' action.
Starting with ConnectX-6 DX, a new reformat type is supported - INSERT_HEADER.
This reformat allows inserting an arbitrary size buffer at a selected location
in the packet on RX flows.
The insert/remove header support are needed as a prerequisite for the
bridge offloads vlan pop/push supprt, see below.
2) From Vlad, Support for bridge offloads for switchdev mode
This change implements bridge offloads with VLAN-support that works on top
of mlx5 representors in switchdev mode.
HIGH-LEVEL OVERVIEW
Hardware supported by mlx5 driver doesn't provide dynamic learning or aging
functionality and requires the driver to emulate all switch-like behavior
in software. As such, all packets by default go through miss path, appear
on representor and get to software bridge, if it is the upper device of the
representor. This causes bridge to process packet in software, learn the
MAC address to FDB and send SWITCHDEV_FDB_ADD_TO_DEVICE event to all
subscribers. Upon reception of SWITCHDEV_FDB_ADD_TO_DEVICE notification
mlx5 bridge offloads the FDB to hardware and sends back
SWITCHDEV_FDB_ADD_TO_BRIDGE notification to prevent such entries from being
aged out by kernel bridge. Leaving aging to kernel bridge would result
deletion of offloaded dynamic FDB entries every aging_time period due to
packets being processed by hardware and, consecutively, 'used' timestamp
for FDB entry not being updated. Hardware aging is emulated in driver by
running periodic workqueue task that manually updates the rules according
to their hardware counter:
- If hardware counter has changed since last update, the handler updates
'used' timestamp in kernel bridge dynamic entry by sending
SWITCHDEV_FDB_ADD_TO_BRIDGE notification for the entry.
- If FDB entry wasn't updated for user-controllable aging_time period,
then the FDB entry is unoffloaded from hardware and corresponding
SWITCHDEV_FDB_DEL_TO_BRIDGE notification is sent to kernel bridge.
The mlx5 bridge offload implementation fully supports port VLAN objects,
including PVID (vlan push) and "Egress Untagged" (vlan pop).
SOFTWARE ARCHITECTURE
Mlx5_eswitch is extended with pointer to new mlx5_esw_bridge_offloads
structure which has a linked list of mlx5_esw_bridge objects. Struct
mlx5_esw_bridge is the main switch object in mlx5 that holds all data for
offloaded FDB entries and metadata for bridge ports and their vlans. The
mlx5_esw_bridge object is created when first representor of eswitch vport
is added to bridge and deleted when the last representor is detached from
it. Bridge FDB entries are saved in linked list (to iterate over all FDB
entries in aging workqueue task) and also in hashtable for quick lookup by
MAC+VLAN tuple. Bridge FDB entries are saved in linked list (to iterate
over all FDB entries in aging workqueue task) and in hashtable for quick
lookup by MAC+VLAN tuple. Port metadata is stored in struct
mlx5_esw_bridge_port that is saved in xarray to allow quick lookup by vport
number. Part of the port metadata is the set of port vlans that are
represented by mlx5_esw_bridge_vlan structure. The vlan structure points to
all FDBs on vlan/port via fdb_list linked list.
Simplified diagram of mlx5 bridge objects:
+------------------+
| mxl5_eswitch |
| |
| br_offloads |
+--------+---------+
|
+--------v-------------------+
| mlx5_esw_bridge_offloads |
| |
+--> bridges |
| +-------+--------------------+
| |
| |
| +---v---------------+
| | mlx5_esw_bridge |
| | |
| | vports |
| | |
| | fdb_ht |
| +---+---------------+
| |
| +---v---------------+
+------+ mlx5_esw_bridge |
| |
+-------------------------+ vports |
| | |
| | fdb_ht +------------------------------------------+
| +-------------------+ |
| |
| |
| +----------------------+ +---------------------------+ |
+-> mlx5_esw_bridge_port | +--> mlx5_esw_bridge_fdb_entry <-+
| | | +----------------------+ | +--+------------------------+ |
| | vlans +--+-> mlx5_esw_bridge_vlan | | | |
| | | | | | | +--v------------------------+ |
| +----------------------+ | | fdb_list +--+ | mlx5_esw_bridge_fdb_entry <-+
| | +-------^--------------+ +--+------------------------+ |
| +----------------------+ | | | |
+-> mlx5_esw_bridge_port | | +-----------------------+ |
| | | |
| vlans | | -----------------------+ |
| | +-> mlx5_esw_bridge_vlan | |
+----------------------+ | | +---------------------------+ |
| fdb_list +-----> mlx5_esw_bridge_fdb_entry <-+
+-------^--------------+ +--+------------------------+
| |
+-----------------------+
HARDWARE REPRESENTATION
In order to adhere to kernel software datapath model bridge offloads must
come after TC and NF FDBs. However, since netfilter offload in mlx5 is
implemented with unmanaged tables, its miss path is not automatically
connected to next priority and requires the code to manually connect with
slow table. To keep bridge offloads encapsulated and not mix it with
eswitch offloads new FDB_TC_MISS priority is created between FDB_FT_OFFLOAD
and FDB_SLOW_PATH which allows bridge offloads to be created without
exposing its internal tables to any other modules since miss path of
managed TC-miss table is automatically wired to next priority.
The bridge tables are created with new priority FDB_BR_OFFLOAD in FDB
namespace. The new priority is between tc-miss and slow path priorities.
Priority consist of two levels: the ingress table that is global per
eswitch and matches incoming packets by src_mac/vid and redirects them to
next level (egress table) that is chosen according to ingress port bridge
membership and matches on dst_mac/vid in order to redirect packet to vport
according to the following diagram:
+
|
+---------v----------+
| |
| FDB_TC_OFFLOAD |
| |
+---------+----------+
|
|
+---------v----------+
| |
| FDB_FT_OFFLOAD |
| |
+---------+----------+
|
|
+---------v----------+
| |
| FDB_TC_MISS |
| |
+---------+----------+
|
+--------------------------------------+
| | |
| +------+ |
| | |
| +------v--------+ FDB_BR_OFFLOAD |
| | INGRESS_TABLE | |
| +------+---+----+ |
| | | match |
| | +---------+ |
| | | | +-------+
| | +-------v-------+ match | | |
| | | EGRESS_TABLE +------------> vport |
| | +-------+-------+ | | |
| | | | +-------+
| | miss | |
| +------+------+ |
| | |
+--------------------------------------+
|
|
+---------v----------+
| |
| FDB_SLOW_PATH |
| |
+---------+----------+
|
v
PATCHES OVERVIEW
1-3 - Miscellaneous refactorings and infrastructure changes.
4 - Mlx5 bridge offload infrastructure and dedicated fs_core
namespace/tables implementation.
5 - FDB entry offload.
6 - Dynamic FDB entry aging.
7-10 - VLAN filtering offload.
11 - Tracepoints for main mlx5 bridge offload events (FDB entry
offload/unoffload, VLAN add/delete, etc.)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
--
Yang Yingliang [Wed, 9 Jun 2021 14:17:44 +0000 (22:17 +0800)]
net: davinci_emac: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code and avoid a null-ptr-deref by checking 'res' in it.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Wed, 9 Jun 2021 14:01:52 +0000 (22:01 +0800)]
net: ethernet: ti: cpsw: Use devm_platform_get_and_ioremap_resource()
Use devm_platform_get_and_ioremap_resource() to simplify
code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wong Vee Khee [Mon, 7 Jun 2021 02:36:45 +0000 (10:36 +0800)]
net: phy: probe for C45 PHYs that return PHY ID of zero in C22 space
PHY devices such as the Marvell Alaska
88E2110 does not return a valid
PHY ID when probed using Clause-22. The current implementation treats
PHY ID of zero as a non-error and valid PHY ID, and causing the PHY
device failed to bind to the Marvell driver.
For such devices, we do an additional probe in the Clause-45 space,
if a valid PHY ID is returned, we then proceed to attach the PHY
device to the matching PHY ID driver.
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chen Li [Mon, 7 Jun 2021 01:44:35 +0000 (09:44 +0800)]
netlink: simplify NLMSG_DATA with NLMSG_HDRLEN
The NLMSG_LENGTH(0) may confuse the API users,
NLMSG_HDRLEN is much more clear.
Besides, some code style problems are also fixed.
Signed-off-by: Chen Li <chenli@uniontech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Fri, 2 Apr 2021 18:16:13 +0000 (21:16 +0300)]
net/mlx5: Bridge, add tracepoints
Move private bridge structures to dedicated headers that is accessible to
bridge tracepoint header. Implemented following tracepoints:
- Initialize FDB entry.
- Refresh FDB entry.
- Cleanup FDB entry.
- Create VLAN.
- Cleanup VLAN.
- Attach port to bridge.
- Detach port from bridge.
Usage example:
># cd /sys/kernel/debug/tracing
># echo mlx5:mlx5_esw_bridge_fdb_entry_init >> set_event
># cat trace
...
kworker/u20:1-96 [001] .... 231.892503: mlx5_esw_bridge_fdb_entry_init: net_device=enp8s0f0_0 addr=e4:fd:05:08:00:02 vid=3 flags=0 lastuse=
4294895695
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Thu, 18 Mar 2021 19:05:20 +0000 (21:05 +0200)]
net/mlx5: Bridge, filter tagged packets that didn't match tagged fg
With support for pvid vlans in mlx5 bridge it is possible to have rules in
untagged flow group when vlan filtering is enabled. However, such rules can
also match tagged packets that didn't match anything in tagged flow group.
Filter such packets by introducing additional flow group between tagged and
untagged groups. When filtering is enabled on the bridge create additional
flow in vlan filtering flow group and matches tagged packets with specified
source MAC address and redirects them to new "skip" table. The skip table
is new lowest-level empty table that is used to skip all further processing
on packet in bridge priority.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Fri, 2 Apr 2021 12:09:06 +0000 (15:09 +0300)]
net/mlx5: Bridge, support pvid and untagged vlan configurations
Implement support for pushing vlan header into untagged packet on ingress
of port that has pvid configured and support for popping vlan on egress of
port that has the matching vlan configured as untagged. To support such
configurations packet reformat contexts of {INSERT|REMOVE}_HEADER types are
created per such vlan and saved to struct mlx5_esw_bridge_vlan which allows
all FDB entries on particular vlan to share single packet reformat
instance. When initializing FDB entries with pvid or untagged vlan type set
its mlx5_flow_act->pkt_reformat action accordingly.
Flush all flows when removing vlan from port. This is necessary because
even though software bridge removes all FDB entries before removing their
vlan, mlx5 bridge implementation deletes their corresponding flow entries
from hardware in asynchronous workqueue task, which will cause firmware
error if vlan packet reformat context is deleted before all flows that
point to it.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Fri, 12 Mar 2021 11:37:46 +0000 (13:37 +0200)]
net/mlx5: Bridge, match FDB entry vlan tag
Add support for FDB vlan-tagged entries. Extend ingress and egress flow
tables with flow groups to match packet vlan tag. Modify the flow creation
code to include vlan tag, if vlan is configured on port and vlan
configuration is supported for offload.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Tue, 30 Mar 2021 15:46:31 +0000 (18:46 +0300)]
net/mlx5: Bridge, implement infrastructure for vlans
Establish all the necessary infrastructure for implementing vlan matching
and vlan push/pop in following patches:
- Add new per-vport struct mlx5_esw_bridge_port that is used to store
metadata for all port vlans. Initialize and cleanup the instance of the
structure when port representor is linked/unliked to bridge. Use xarray to
allow quick vport metadata lookup by vport number.
- Add new per-port-vlan struct mlx5_esw_bridge_vlan that is used to store
vlan-specific data (vid, flags). Handle SWITCHDEV_PORT_OBJ_{ADD|DEL}
switchdev blocking event for SWITCHDEV_OBJ_ID_PORT_VLAN object by
creating/deleting the vlan structure and saving it in per-vport xarray for
quick lookup.
- Implement support for SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING object
attribute that is used to toggle vlan filtering. Remove all FDB entries
from hardware when vlan filtering state is changed.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Tue, 30 Mar 2021 17:08:57 +0000 (20:08 +0300)]
net/mlx5: Bridge, dynamic entry ageing
Dynamic FDB entries require capability to age out unused entries. Such
entries are either aged out by kernel software bridge implementation or by
hardware switch that offloaded them (and notified the kernel to mark them
as SWITCHDEV_FDB_ADD_TO_BRIDGE). Leaving ageing to kernel bridge would
result it deleting offloaded dynamic FDB entries every ageing_time period
due to packets being processed by hardware and, consecutively, 'used'
timestamp for FDB entry not being updated. However, since hardware doesn't
support ageing, software solution inside the driver is required.
In order to emulate hardware ageing in driver, extend bridge FDB ingress
flows with counter and create delayed br_offloads->update_work task on
bridge offloads workqueue. Run the task every second, update 'used'
timestamp in software bridge dynamic entry by sending
SWITCHDEV_FDB_ADD_TO_BRIDGE for the entry, if it flow hardware counter
lastuse field was changed since last update. If lastuse wasn't changed for
ageing_time period, then delete the FDB entry and notify kernel bridge by
sending SWITCHDEV_FDB_DEL_TO_BRIDGE notification.
Register blocking switchdev notifier callback and handle attribute set
SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME event to allow user to dynamically
configure bridge FDB entry ageing timeout. Save the value per-bridge in
struct mlx5_esw_bridge. Silently ignore
SWITCHDEV_ATTR_ID_PORT_{PRE_}BRIDGE_FLAGS switchdev event since mlx5 bridge
implementation relies on software bridge for implementing necessary
behavior for all of these flags.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Mon, 7 Jun 2021 14:42:49 +0000 (17:42 +0300)]
net/mlx5: Bridge, handle FDB events
Hardware supported by mlx5 driver doesn't provide learning and requires the
driver to emulate all switch-like behavior in software. As such, all
packets by default go through miss path, appear on representor and get to
software bridge, if it is the upper device of the representor. This causes
bridge to process packet in software, learn the MAC address to FDB and send
SWITCHDEV_FDB_ADD_TO_DEVICE event to all subscribers.
In order to offload FDB entries in mlx5, register switchdev notifier
callback and implement support for both 'added_by_user' and dynamic FDB
entry SWITCHDEV_FDB_ADD_TO_DEVICE events asynchronously using new
mlx5_esw_bridge_offloads->wq ordered workqueue. In workqueue callback
offload the ingress rule (matching FDB entry MAC as packet source MAC) and
egress table rule (matching FDB entry MAC as destination MAC). For ingress
table rule also match source vport to ensure that only traffic coming from
expected bridge port is matched by offloaded rule. Save all the relevant
FDB entry data in struct mlx5_esw_bridge_fdb_entry instance and insert the
instance in new mlx5_esw_bridge->fdb_list list (for traversing all entries
by software ageing implementation in following patch) and in new
mlx5_esw_bridge->fdb_ht hash table for fast retrieval. Notify the bridge
that FDB entry has been offloaded by sending SWITCHDEV_FDB_OFFLOADED
notification.
Delete FDB entry on reception of SWITCHDEV_FDB_DEL_TO_DEVICE event.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Fri, 2 Apr 2021 12:57:02 +0000 (15:57 +0300)]
net/mlx5: Bridge, add offload infrastructure
Create new files bridge.{c|h} in en/rep directory that implement bridge
interaction with representor netdevices and handle required
events/notifications, bridge.{c|h} in esw directory that implement all
necessary eswitch offloading infrastructure and works on vport/eswitch
level. Provide new kconfig MLX5_BRIDGE which is automatically selected when
both kernel bridge and mlx5 eswitch configs are enabled.
Provide basic infrastructure for bridge offloads:
- struct mlx5_esw_bridge_offloads - per-eswitch bridge offload structure
that encapsulates generic bridge-offloads data (notifier blocks, ingress
flow table/group, etc.) that is created/deleted on enable/disable eswitch
offloads.
- struct mlx5_esw_bridge - per-bridge structure that encapsulates
per-bridge data (reference counter, FDB, egress flow table/group, etc.)
that is created when first eswitch represetor is attached to new bridge and
deleted when last representor is removed from the bridge as a result of
NETDEV_CHANGEUPPER event.
The bridge tables are created with new priority FDB_BR_OFFLOAD in FDB
namespace. The new priority is between tc-miss and slow path priorities.
Priority consist of two levels: the ingress table that is global per
eswitch and matches incoming packets by src_mac/vid and redirects them to
next level (egress table) that is chosen according to ingress port bridge
membership and matches on dst_mac/vid in order to redirect packet to vport
according to the following diagram:
+
|
+---------v----------+
| |
| FDB_TC_OFFLOAD |
| |
+---------+----------+
|
|
+---------v----------+
| |
| FDB_FT_OFFLOAD |
| |
+---------+----------+
|
|
+---------v----------+
| |
| FDB_TC_MISS |
| |
+---------+----------+
|
+--------------------------------------+
| | |
| +------+ |
| | |
| +------v--------+ FDB_BR_OFFLOAD |
| | INGRESS_TABLE | |
| +------+---+----+ |
| | | match |
| | +---------+ |
| | | | +-------+
| | +-------v-------+ match | | |
| | | EGRESS_TABLE +------------> vport |
| | +-------+-------+ | | |
| | | | +-------+
| | miss | |
| +------+------+ |
| | |
+--------------------------------------+
|
|
+---------v----------+
| |
| FDB_SLOW_PATH |
| |
+---------+----------+
|
v
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Wed, 21 Apr 2021 18:52:43 +0000 (21:52 +0300)]
net/mlx5e: Refactor mlx5e_eswitch_{*}rep() helpers
Change the helper to functions to accept constant pointer to struct
net_device. This is necessary for following patches in series that pass
mlx5e_eswitch_rep() as a callback to kernel bridge infrastructure code.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Vlad Buslov [Thu, 4 Mar 2021 11:09:53 +0000 (13:09 +0200)]
net/mlx5: Create TC-miss priority and table
In order to adhere to kernel software datapath model bridge offloads must
come after TC and NF FDBs. Following patches in this series add new FDB
priority for bridge after FDB_FT_OFFLOAD. However, since netfilter offload
is implemented with unmanaged tables, its miss path is not automatically
connected to next priority and requires the code to manually connect with
slow table. To keep bridge offloads encapsulated and not mix it with
eswitch offloads, create a new FDB_TC_MISS priority between FDB_FT_OFFLOAD
and FDB_SLOW_PATH:
+
|
+---------v----------+
| |
| FDB_TC_OFFLOAD |
| |
+---------+----------+
|
|
|
+---------v----------+
| |
| FDB_FT_OFFLOAD |
| |
+---------+----------+
|
|
|
+---------v----------+
| |
| FDB_TC_MISS |
| |
+---------+----------+
|
|
|
+---------v----------+
| |
| FDB_SLOW_PATH |
| |
+---------+----------+
|
v
Initialize the new priority with single default empty managed table and use
the table as TC/NF miss patch instead of slow table. This approach allows
bridge offloads to be created as new FDB namespace priority between
FDB_TC_MISS and FDB_SLOW_PATH without exposing its internal tables to any
other modules since miss path of managed TC-miss table is automatically
wired to next priority.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Yevgeny Kliteynik [Sun, 14 Mar 2021 01:08:28 +0000 (03:08 +0200)]
net/mlx5: DR, Support EMD tag in modify header for STEv1
Add support for EMD tag in modify header set/copy actions
on device that supports STEv1.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Yevgeny Kliteynik [Thu, 28 Jan 2021 20:12:07 +0000 (22:12 +0200)]
net/mlx5: DR, Added support for INSERT_HEADER reformat type
Add support for INSERT_HEADER packet reformat context type
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Yevgeny Kliteynik [Tue, 9 Mar 2021 01:30:44 +0000 (03:30 +0200)]
net/mlx5: Added new parameters to reformat context
Adding new reformat context type (INSERT_HEADER) requires adding two new
parameters to reformat context - reformat_param_0 and reformat_param_1.
As defined by HW spec, these parameters have different meaning for
different reformat context type.
The first parameter (reformat_param_0) is not new to HW spec, but it
wasn't used by any of the supported reformats. The second parameter
(reformat_param_1) is new to the HW spec - it was added to allow
supporting INSERT_HEADER.
For NSERT_HEADER, reformat_param_0 indicates the header used to
reference the location of the inserted header, and reformat_param_1
indicates the offset of the inserted header from the reference point
defined by reformat_param_0.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Yevgeny Kliteynik [Wed, 10 Mar 2021 02:38:06 +0000 (04:38 +0200)]
net/mlx5: DR, Allow encap action for RX for supporting devices
Encap actions on RX flow were not supported on older devices.
However, this is no longer the case in devices that support STEv1.
This patch adds support for encap l3/l2 on RX flow for supported
devices: update actions state machine by adding the newely supported
transitions and add the required support in STEv0/1 files.
The new transitions that are supported are:
- from decap/modify-header/pop-vlan to encap
- from encap to termination table
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Yevgeny Kliteynik [Wed, 10 Mar 2021 02:12:28 +0000 (04:12 +0200)]
net/mlx5: DR, Split reformat state to Encap and Decap
Split single reformat state into two separate states for encap and decap.
This will allow adding actions to the specific domain, such as encap on RX.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Yevgeny Kliteynik [Tue, 9 Mar 2021 01:29:16 +0000 (03:29 +0200)]
net/mlx5: mlx5_ifc support for header insert/remove
Add support for HCA caps 2 that contains capabilities for the new
insert/remove header actions.
Added the required definitions for supporting the new reformat type:
added packet reformat parameters, reformat anchors and definitions
to allow copy/set into the inserted EMD (Embedded MetaData) tag.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
David S. Miller [Wed, 9 Jun 2021 22:59:34 +0000 (15:59 -0700)]
Merge branch 'ipa-mem-1'
Alex Elder says:
====================
net: ipa: memory region rework, part 1
This is the first portion of a very long series of patches that has
been split in two. Once these patches are accepted, I'll post the
remaining patches.
The combined series reworks the way memory regions are defined in
the configuration data, and in the process solidifies code that
ensures configurations are valid.
In this portion (part 1), most of the focus is on improving
validation of code. This validation is now done unconditionally
(something I promised Leon Romanovsky I would work on). Validation
will occur earlier than before, catching configuration problems as
early as possible and permitting the rest of the driver to avoid
needing to do some error checking. There will now be checks to
ensure all defined regions are supported by the hardware, that
required regions are all defined, and that there are no duplicate
regions.
The second portion (part 2) is mainly a set of small but pervasive
changes whose result is to have the memory region array not be
indexed by region ID. I'll provide further explanation when I post
that series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:35:03 +0000 (17:35 -0500)]
net: ipa: use bitmap to check for missing regions
In ipa_mem_valid(), wait until regions have been marked in the memory
region bitmap, and check all that are not found there to ensure they
are not required.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:35:02 +0000 (17:35 -0500)]
net: ipa: flag duplicate memory regions
Add a test in ipa_mem_valid() to ensure no memory region is defined
more than once, using a bitmap to record each defined memory region.
Skip over undefined regions when checking (we can have any number of
those).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:35:01 +0000 (17:35 -0500)]
net: ipa: validate memory regions based on version
Introduce ipa_mem_id_valid(), and use it to check defined memory
regions to ensure they are valid for a given version of IPA.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:35:00 +0000 (17:35 -0500)]
net: ipa: introduce ipa_mem_id_optional()
Introduce a new function that indicates whether a given memory
region is required for a given version of IPA hardware. Use it to
verify that all required regions are present during initialization.
Reorder the definitions of the memory region IDs to be based on
the version in which they're first defined. Use "+" rather than
"and above" where defining the IPA versions in which memory IDs are
used, and indicate which regions are optional (many are not).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:34:59 +0000 (17:34 -0500)]
net: ipa: pass memory configuration data to ipa_mem_valid()
Pass the memory configuration data array to ipa_mem_valid() for
validation, and use that rather than assuming it's already been
recorded in the IPA structure. Move the memory data array size
check into ipa_mem_valid().
Call ipa_mem_valid() early in ipa_mem_init(), and only proceed with
assigning the memory array pointer and size if it is found to be
valid.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:34:58 +0000 (17:34 -0500)]
net: ipa: validate memory regions at init time
Move the memory region validation check so it happens earlier when
initializing the driver, at init time rather than config time (i.e.,
before access to hardware is required).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:34:57 +0000 (17:34 -0500)]
net: ipa: separate region range check from other validation
The only thing done by ipa_mem_valid_one() that requires hardware
access is the check for whether all regions fit within the size of
IPA local memory specified by an IPA register.
Introduce ipa_mem_size_valid() to implement this verification and
stop doing so in ipa_mem_valid_one(). Call the new function from
ipa_mem_config() (which is also the caller of ipa_mem_valid()).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:34:56 +0000 (17:34 -0500)]
net: ipa: separate memory validation from initialization
Currently, memory regions are validated in the loop that initializes
them. Instead, validate them separately.
Rename ipa_mem_valid() to be ipa_mem_valid_one(). Define a *new*
function named ipa_mem_valid() that performs validation of the array
of memory regions provided. This function calls ipa_mem_valid_one()
for each region in turn.
Skip validation for any "empty" region descriptors, which have zero
size and are not preceded by any canary values. Issue a warning for
such descriptors if the offset is non-zero.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:34:55 +0000 (17:34 -0500)]
net: ipa: validate memory regions unconditionally
Do memory region descriptor validation unconditionally, rather than
having it depend on IPA_VALIDATION being defined.
Pass the address of a memory region descriptor rather than a memory
ID to ipa_mem_valid().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:34:54 +0000 (17:34 -0500)]
net: ipa: store memory region id in descriptor
Store the memory region ID in the memory descriptor structure. This
is a move toward *not* indexing the array by the ID, but for now we
must still specify those index values. Define an explicitly
undefined region ID, value 0, so uninitialized entries in the array
won't use an otherwise valid ID.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Wed, 9 Jun 2021 22:34:53 +0000 (17:34 -0500)]
net: ipa: define IPA_MEM_END_MARKER
Define a new pseudo memory region identifer that specifies the
offset at the end of IPA resident memory. Use it instead of
IPA_MEM_UC_EVENT_RING in places where the size of that region was
defined to be 0.
The size of the IPA_MEM_END_MARKER pseudo region must be zero.
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 9 Jun 2021 17:43:53 +0000 (18:43 +0100)]
net: dsa: sja1105: Fix assigned yet unused return code rc
The return code variable rc is being set to return error values in two
places in sja1105_mdiobus_base_tx_register and yet it is not being
returned, the function always returns 0 instead. Fix this by replacing
the return 0 with the return code rc.
Addresses-Coverity: ("Unused value")
Fixes:
5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Wed, 9 Jun 2021 17:23:03 +0000 (19:23 +0200)]
stmmac: prefetch right address
To support XDP, a headroom is prepended to the packet data.
Consider this offset when doing a prefetch.
Fixes:
da5ec7f22a0f ("net: stmmac: refactor stmmac_init_rx_buffers for stmmac_reinit_rx_buffers")
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 9 Jun 2021 17:56:57 +0000 (18:56 +0100)]
mlxsw: thermal: Fix null dereference of NULL temperature parameter
The call to mlxsw_thermal_module_temp_and_thresholds_get passes a NULL
pointer for the temperature and this can be dereferenced in this function
if the mlxsw_reg_query call fails. The simplist fix is to pass the
address of dummy temperature variable instead of a NULL pointer.
Addresses-Coverity: ("Explicit null dereferenced")
Fixes:
72a64c2fe9d8 ("mlxsw: thermal: Read module temperature thresholds using MTMP register")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kristian Evensen [Wed, 9 Jun 2021 14:32:49 +0000 (16:32 +0200)]
net: ethernet: rmnet: Always subtract MAP header
Commit
e1d9a90a9bfd ("net: ethernet: rmnet: Support for ingress MAPv5
checksum offload") broke ingress handling for devices where
RMNET_FLAGS_INGRESS_MAP_CKSUMV5 or RMNET_FLAGS_INGRESS_MAP_CKSUMV4 are
not set. Unless either of these flags are set, the MAP header is not
removed. This commit restores the original logic by ensuring that the
MAP header is removed for all MAP packets.
Fixes:
e1d9a90a9bfd ("net: ethernet: rmnet: Support for ingress MAPv5 checksum offload")
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Wed, 9 Jun 2021 14:25:06 +0000 (14:25 +0000)]
net: ena: make symbol 'ena_alloc_map_page' static
The sparse tool complains as follows:
drivers/net/ethernet/amazon/ena/ena_netdev.c:978:13: warning:
symbol 'ena_alloc_map_page' was not declared. Should it be static?
This symbol is not used outside of ena_netdev.c, so marks it static.
Fixes:
947c54c395cb ("net: ena: Use dev_alloc() in RX buffer allocation")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 9 Jun 2021 17:17:48 +0000 (18:17 +0100)]
net: phy: realtek: net: Fix less than zero comparison of a u16
The comparisons of the u16 values priv->phycr1 and priv->phycr2 to less
than zero always false because they are unsigned. Fix this by using an
int for the assignment and less than zero check.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes:
0a4355c2b7f8 ("net: phy: realtek: add dt property to disable CLKOUT clock")
Fixes:
d90db36a9e74 ("net: phy: realtek: add dt property to enable ALDPS mode")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 9 Jun 2021 17:05:12 +0000 (18:05 +0100)]
net: stmmac: Fix missing { } around two statements in an if statement
There are missing { } around a block of code on an if statement. Fix this
by adding them in.
Addresses-Coverity: ("Nesting level does not match indentation")
Fixes:
46682cb86a37 ("net: stmmac: enable Intel mGbE 2.5Gbps link speed")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Wed, 9 Jun 2021 13:51:38 +0000 (21:51 +0800)]
net: ethernet: ti: cpsw-phy-sel: Use devm_platform_ioremap_resource_byname()
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 9 Jun 2021 22:26:50 +0000 (15:26 -0700)]
Merge branch 'mvpp2-prefetch'
Matteo Croce says:
====================
mvpp2: prefetch data early
These two patches prefetch some data from RAM so to reduce stall
and speedup the packet processing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Wed, 9 Jun 2021 13:47:14 +0000 (15:47 +0200)]
mvpp2: prefetch page
Most of the time during the RX is caused by the compound_head() call
done at the end of the RX loop:
│ build_skb():
[...]
│ static inline struct page *compound_head(struct page *page)
│ {
│ unsigned long head = READ_ONCE(page->compound_head);
65.23 │ ldr x2, [x1, #8]
Prefetch the page struct as soon as possible, to speedup the RX path
noticeabily by a ~3-4% packet rate in a drop test.
│ build_skb():
[...]
│ static inline struct page *compound_head(struct page *page)
│ {
│ unsigned long head = READ_ONCE(page->compound_head);
17.92 │ ldr x2, [x1, #8]
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Wed, 9 Jun 2021 13:47:13 +0000 (15:47 +0200)]
mvpp2: prefetch right address
In the RX buffer, the received data starts after a headroom used to
align the IP header and to allow prepending headers efficiently.
The prefetch() should take this into account, and prefetch from
the very start of the received data.
We can see that ether_addr_equal_64bits(), which is the first function
to access the data, drops from the top of the perf top output.
prefetch(data):
Overhead Shared Object Symbol
11.64% [kernel] [k] eth_type_trans
prefetch(data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM):
Overhead Shared Object Symbol
13.42% [kernel] [k] build_skb
10.35% [mvpp2] [k] mvpp2_rx
9.35% [kernel] [k] __netif_receive_skb_core
8.24% [kernel] [k] kmem_cache_free
7.97% [kernel] [k] dev_gro_receive
7.68% [kernel] [k] page_pool_put_page
7.32% [kernel] [k] kmem_cache_alloc
7.09% [mvpp2] [k] mvpp2_bm_pool_put
3.36% [kernel] [k] eth_type_trans
Also, move the eth_type_trans() call a bit down, to give the RAM more
time to prefetch the data.
Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Wed, 9 Jun 2021 13:45:37 +0000 (21:45 +0800)]
net: ethernet: ti: am65-cpts: Use devm_platform_ioremap_resource_byname()
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Wed, 9 Jun 2021 13:36:55 +0000 (21:36 +0800)]
net: stmmac: Use devm_platform_ioremap_resource_byname()
Use the devm_platform_ioremap_resource_byname() helper instead of
calling platform_get_resource_byname() and devm_ioremap_resource()
separately.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Wed, 9 Jun 2021 13:25:15 +0000 (21:25 +0800)]
net: sgi: ioc3-eth: check return value after calling platform_get_resource()
It will cause null-ptr-deref if platform_get_resource() returns NULL,
we need check the return value.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shai Malin [Wed, 9 Jun 2021 10:49:18 +0000 (13:49 +0300)]
Revert "nvme-tcp-offload: ULP Series"
This reverts commits:
-
762411542050dbe27c7c96f13c57f93da5d9b89a
nvme: NVME_TCP_OFFLOAD should not default to m
-
5ff5622ea1f16d535f1be4e478e712ef48fe183b:
Merge branch 'NVMeTCP-Offload-ULP'
As requested on the mailing-list: https://lore.kernel.org/netdev/SJ0PR18MB3882C20793EA35A3E8DAE300CC379@SJ0PR18MB3882.namprd18.prod.outlook.com/
This patch will revert the nvme-tcp-offload ULP from net-next.
The nvme-tcp-offload ULP series will continue to be considered only on
linux-nvme@lists.infradead.org.
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 9 Jun 2021 10:24:48 +0000 (11:24 +0100)]
net: usb: asix: ax88772: Fix less than zero comparison of a u16
The comparison of the u16 priv->phy_addr < 0 is always false because
phy_addr is unsigned. Fix this by assigning the return from the call
to function asix_read_phy_addr to int ret and using this for the
less than zero error check comparison.
Fixes:
7e88b11a862a ("net: usb: asix: refactor asix_read_phy_addr() and handle errors on return")
Addresses-Coverity: ("Unsigned compared against 0")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 9 Jun 2021 10:24:47 +0000 (11:24 +0100)]
net: usb: asix: Fix less than zero comparison of a u16
The comparison of the u16 priv->phy_addr < 0 is always false because
phy_addr is unsigned. Fix this by assigning the return from the call
to function asix_read_phy_addr to int ret and using this for the
less than zero error check comparison.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes:
e532a096be0e ("net: usb: asix: ax88772: add phylib support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 9 Jun 2021 21:50:35 +0000 (14:50 -0700)]
Merge git://git./linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Add nfgenmsg field to nfnetlink's struct nfnl_info and use it.
2) Remove nft_ctx_init_from_elemattr() and nft_ctx_init_from_setattr()
helper functions.
3) Add the nf_ct_pernet() helper function to fetch the conntrack
pernetns data area.
4) Expose TCP and UDP flowtable offload timeouts through sysctl,
from Oz Shlomo.
5) Add nfnetlink_hook subsystem to fetch the netfilter hook
pipeline configuration, from Florian Westphal. This also includes
a new field to annotate the hook type as metadata.
6) Fix unsafe memory access to non-linear skbuff in the new SCTP
chunk support for nft_exthdr, from Phil Sutter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 9 Jun 2021 09:56:45 +0000 (12:56 +0300)]
netdevsim: delete unnecessary debugfs checking
In normal situations where the driver doesn't dereference
"nsim_node->ddir" or "nsim_node->rate_parent" itself then we are not
supposed to check the return from debugfs functions. In the case of
debugfs_create_dir() the check was wrong as well because it doesn't
return NULL, it returns error pointers.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 9 Jun 2021 09:54:31 +0000 (12:54 +0300)]
devlink: Fix error message in devlink_rate_set_ops_supported()
The WARN_ON() macro takes a condition, it doesn't take a message. Use
WARN() instead.
Fixes:
1897db2ec310 ("devlink: Allow setting tx rate for devlink rate leaf objects")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 9 Jun 2021 09:53:03 +0000 (12:53 +0300)]
net: dsa: qca8k: check the correct variable in qca8k_set_mac_eee()
This code check "reg" but "ret" was intended so the error handling will
never trigger.
Fixes:
7c9896e37807 ("net: dsa: qca8k: check return value of read functions correctly")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 9 Jun 2021 09:52:12 +0000 (12:52 +0300)]
net: dsa: qca8k: fix an endian bug in qca8k_get_ethtool_stats()
The "hi" variable is a u64 but the qca8k_read() writes to the top 32
bits of it. That will work on little endian systems but it's a bit
subtle. It's cleaner to make declare "hi" as a u32. We will still need
to cast it when we shift it later on in the function but that's fine.
Fixes:
7c9896e37807 ("net: dsa: qca8k: check return value of read functions correctly")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>