phy-mode:
$ref: "#/properties/phy-connection-type"
+ pcs-handle:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ Specifies a reference to a node representing a PCS PHY device on a MDIO
+ bus to link with an external PHY (phy-handle) if exists.
+
phy-handle:
$ref: /schemas/types.yaml#/definitions/phandle
description:
In fiber mode, auto-negotiation is disabled and the PHY can only work in
100base-fx (full and half duplex) modes.
-
- - lan8814,ignore-ts: If present the PHY will not support timestamping.
-
- This option acts as check whether Timestamping is supported by
- hardware or not. LAN8814 phy support hardware tmestamping.
-
- - lan8814,latency_rx_10: Configures Latency value of phy in ingress at 10 Mbps.
-
- - lan8814,latency_tx_10: Configures Latency value of phy in egress at 10 Mbps.
-
- - lan8814,latency_rx_100: Configures Latency value of phy in ingress at 100 Mbps.
-
- - lan8814,latency_tx_100: Configures Latency value of phy in egress at 100 Mbps.
-
- - lan8814,latency_rx_1000: Configures Latency value of phy in ingress at 1000 Mbps.
-
- - lan8814,latency_tx_1000: Configures Latency value of phy in egress at 1000 Mbps.
specified, the TX/RX DMA interrupts should be on that node
instead, and only the Ethernet core interrupt is optionally
specified here.
-- phy-handle : Should point to the external phy device.
+- phy-handle : Should point to the external phy device if exists. Pointing
+ this to the PCS/PMA PHY is deprecated and should be avoided.
See ethernet.txt file in the same directory.
- xlnx,rxmem : Set to allocated memory buffer for Rx/Tx in the hardware
required through the core's MDIO interface (i.e. always,
unless the PHY is accessed through a different bus).
+ - pcs-handle: Phandle to the internal PCS/PMA PHY in SGMII or 1000Base-X
+ modes, where "pcs-handle" should be used to point
+ to the PCS/PMA PHY, and "phy-handle" should point to an
+ external PHY if exists.
+
Example:
axi_ethernet_eth: ethernet@40c00000 {
compatible = "xlnx,axi-ethernet-1.00.a";
Design principles
=================
-The Distributed Switch Architecture is a subsystem which was primarily designed
-to support Marvell Ethernet switches (MV88E6xxx, a.k.a Linkstreet product line)
-using Linux, but has since evolved to support other vendors as well.
+The Distributed Switch Architecture subsystem was primarily designed to
+support Marvell Ethernet switches (MV88E6xxx, a.k.a. Link Street product
+line) using Linux, but has since evolved to support other vendors as well.
The original philosophy behind this design was to be able to use unmodified
Linux tools such as bridge, iproute2, ifconfig to work transparently whether
they configured/queried a switch port network device or a regular network
device.
-An Ethernet switch is typically comprised of multiple front-panel ports, and one
-or more CPU or management port. The DSA subsystem currently relies on the
+An Ethernet switch typically comprises multiple front-panel ports and one
+or more CPU or management ports. The DSA subsystem currently relies on the
presence of a management port connected to an Ethernet controller capable of
receiving Ethernet frames from the switch. This is a very common setup for all
kinds of Ethernet switches found in Small Home and Office products: routers,
-gateways, or even top-of-the rack switches. This host Ethernet controller will
+gateways, or even top-of-rack switches. This host Ethernet controller will
be later referred to as "master" and "cpu" in DSA terminology and code.
The D in DSA stands for Distributed, because the subsystem has been designed
ports are referred to as "dsa" ports in DSA terminology and code. A collection
of multiple switches connected to each other is called a "switch tree".
-For each front-panel port, DSA will create specialized network devices which are
+For each front-panel port, DSA creates specialized network devices which are
used as controlling and data-flowing endpoints for use by the Linux networking
stack. These specialized network interfaces are referred to as "slave" network
interfaces in DSA terminology and code.
The ideal case for using DSA is when an Ethernet switch supports a "switch tag"
which is a hardware feature making the switch insert a specific tag for each
-Ethernet frames it received to/from specific ports to help the management
+Ethernet frame it receives to/from specific ports to help the management
interface figure out:
- what port is this frame coming from
ports must decapsulate the packet.
Note that in certain cases, it might be the case that the tagging format used
-by a leaf switch (not connected directly to the CPU) to not be the same as what
+by a leaf switch (not connected directly to the CPU) is not the same as what
the network stack sees. This can be seen with Marvell switch trees, where the
CPU port can be configured to use either the DSA or the Ethertype DSA (EDSA)
format, but the DSA links are configured to use the shorter (without Ethertype)
to/from specific switch ports
- query the switch for ethtool operations: statistics, link state,
Wake-on-LAN, register dumps...
-- external/internal PHY management: link, auto-negotiation etc.
+- manage external/internal PHY: link, auto-negotiation, etc.
These slave network devices have custom net_device_ops and ethtool_ops function
pointers which allow DSA to introduce a level of layering between the networking
-stack/ethtool, and the switch driver implementation.
+stack/ethtool and the switch driver implementation.
Upon frame transmission from these slave network devices, DSA will look up which
-switch tagging protocol is currently registered with these network devices, and
+switch tagging protocol is currently registered with these network devices and
invoke a specific transmit routine which takes care of adding the relevant
switch tag in the Ethernet frames.
These frames are then queued for transmission using the master network device
-``ndo_start_xmit()`` function, since they contain the appropriate switch tag, the
+``ndo_start_xmit()`` function. Since they contain the appropriate switch tag, the
Ethernet switch will be able to process these incoming frames from the
-management interface and delivers these frames to the physical switch port.
+management interface and deliver them to the physical switch port.
Graphical representation
------------------------
switches, these functions would utilize direct or indirect PHY addressing mode
to return standard MII registers from the switch builtin PHYs, allowing the PHY
library and/or to return link status, link partner pages, auto-negotiation
-results etc..
+results, etc.
-For Ethernet switches which have both external and internal MDIO busses, the
+For Ethernet switches which have both external and internal MDIO buses, the
slave MII bus can be utilized to mux/demux MDIO reads and writes towards either
internal or external MDIO devices this switch might be connected to: internal
PHYs, external PHYs, or even external switches.
table indication (when cascading switches)
- ``dsa_platform_data``: platform device configuration data which can reference
- a collection of dsa_chip_data structure if multiples switches are cascaded,
+ a collection of dsa_chip_data structures if multiple switches are cascaded,
the master network device this switch tree is attached to needs to be
referenced
"phy-handle" property, if found, this PHY device is created and registered
using ``of_phy_connect()``
-- if Device Tree is used, and the PHY device is "fixed", that is, conforms to
+- if Device Tree is used and the PHY device is "fixed", that is, conforms to
the definition of a non-MDIO managed PHY as defined in
``Documentation/devicetree/bindings/net/fixed-link.txt``, the PHY is registered
and connected transparently using the special fixed MDIO bus driver
DSA features a standardized binding which is documented in
``Documentation/devicetree/bindings/net/dsa/dsa.txt``. PHY/MDIO library helper
functions such as ``of_get_phy_mode()``, ``of_phy_connect()`` are also used to query
-per-port PHY specific details: interface connection, MDIO bus location etc..
+per-port PHY specific details: interface connection, MDIO bus location, etc.
Driver development
==================
- ``setup``: setup function for the switch, this function is responsible for setting
up the ``dsa_switch_ops`` private structure with all it needs: register maps,
- interrupts, mutexes, locks etc.. This function is also expected to properly
+ interrupts, mutexes, locks, etc. This function is also expected to properly
configure the switch to separate all network interfaces from each other, that
is, they should be isolated by the switch hardware itself, typically by creating
a Port-based VLAN ID for each port and allowing only the CPU port and the
- ``get_phy_flags``: Some switches are interfaced to various kinds of Ethernet PHYs,
if the PHY library PHY driver needs to know about information it cannot obtain
on its own (e.g.: coming from switch memory mapped registers), this function
- should return a 32-bits bitmask of "flags", that is private between the switch
+ should return a 32-bit bitmask of "flags" that is private between the switch
driver and the Ethernet PHY driver in ``drivers/net/phy/\*``.
- ``phy_read``: Function invoked by the DSA slave MDIO bus when attempting to read
the switch port MDIO registers. If unavailable, return 0xffff for each read.
For builtin switch Ethernet PHYs, this function should allow reading the link
- status, auto-negotiation results, link partner pages etc..
+ status, auto-negotiation results, link partner pages, etc.
- ``phy_write``: Function invoked by the DSA slave MDIO bus when attempting to write
to the switch port MDIO registers. If unavailable return a negative error
------------------
- ``get_strings``: ethtool function used to query the driver's strings, will
- typically return statistics strings, private flags strings etc.
+ typically return statistics strings, private flags strings, etc.
- ``get_ethtool_stats``: ethtool function used to query per-port statistics and
return their values. DSA overlays slave network devices general statistics:
- ``get_sset_count``: ethtool function used to query the number of statistics items
- ``get_wol``: ethtool function used to obtain Wake-on-LAN settings per-port, this
- function may, for certain implementations also query the master network device
+ function may for certain implementations also query the master network device
Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN
- ``set_wol``: ethtool function used to configure Wake-on-LAN settings per-port,
in a fully active state
- ``port_enable``: function invoked by the DSA slave network device ndo_open
- function when a port is administratively brought up, this function should be
- fully enabling a given switch port. DSA takes care of marking the port with
+ function when a port is administratively brought up, this function should
+ fully enable a given switch port. DSA takes care of marking the port with
``BR_STATE_BLOCKING`` if the port is a bridge member, or ``BR_STATE_FORWARDING`` if it
was not, and propagating these changes down to the hardware
- ``port_disable``: function invoked by the DSA slave network device ndo_close
- function when a port is administratively brought down, this function should be
- fully disabling a given switch port. DSA takes care of marking the port with
+ function when a port is administratively brought down, this function should
+ fully disable a given switch port. DSA takes care of marking the port with
``BR_STATE_DISABLED`` and propagating changes to the hardware if this port is
disabled while being a bridge member
------------
- ``port_bridge_join``: bridge layer function invoked when a given switch port is
- added to a bridge, this function should be doing the necessary at the switch
- level to permit the joining port from being added to the relevant logical
+ added to a bridge, this function should do what's necessary at the switch
+ level to permit the joining port to be added to the relevant logical
domain for it to ingress/egress traffic with other members of the bridge.
- ``port_bridge_leave``: bridge layer function invoked when a given switch port is
- removed from a bridge, this function should be doing the necessary at the
+ removed from a bridge, this function should do what's necessary at the
switch level to deny the leaving port from ingress/egress traffic from the
remaining bridge members. When the port leaves the bridge, it should be aged
out at the switch hardware for the switch to (re) learn MAC addresses behind
point for drivers that need to configure the hardware for enabling this
feature.
-- ``port_bridge_tx_fwd_unoffload``: bridge layer function invoken when a driver
+- ``port_bridge_tx_fwd_unoffload``: bridge layer function invoked when a driver
leaves a bridge port which had the TX forwarding offload feature enabled.
Bridge VLAN filtering
}
qidx = bp->tc_to_qidx[j];
ring->queue_id = bp->q_info[qidx].queue_id;
+ spin_lock_init(&txr->xdp_tx_lock);
if (i < bp->tx_nr_rings_xdp)
continue;
if (i % bp->tx_nr_rings_per_tc == (bp->tx_nr_rings_per_tc - 1))
if (irq_re_init)
udp_tunnel_nic_reset_ntf(bp->dev);
+ if (bp->tx_nr_rings_xdp < num_possible_cpus()) {
+ if (!static_key_enabled(&bnxt_xdp_locking_key))
+ static_branch_enable(&bnxt_xdp_locking_key);
+ } else if (static_key_enabled(&bnxt_xdp_locking_key)) {
+ static_branch_disable(&bnxt_xdp_locking_key);
+ }
set_bit(BNXT_STATE_OPEN, &bp->state);
bnxt_enable_int(bp);
/* Enable TX queues */
#define BNXT_MAX_MTU 9500
#define BNXT_MAX_PAGE_MODE_MTU \
((unsigned int)PAGE_SIZE - VLAN_ETH_HLEN - NET_IP_ALIGN - \
- XDP_PACKET_HEADROOM)
+ XDP_PACKET_HEADROOM - \
+ SKB_DATA_ALIGN((unsigned int)sizeof(struct skb_shared_info)))
#define BNXT_MIN_PKT_SIZE 52
u32 dev_state;
struct bnxt_ring_struct tx_ring_struct;
+ /* Synchronize simultaneous xdp_xmit on same ring */
+ spinlock_t xdp_tx_lock;
};
#define BNXT_LEGACY_COAL_CMPL_PARAMS \
#include "bnxt.h"
#include "bnxt_xdp.h"
+DEFINE_STATIC_KEY_FALSE(bnxt_xdp_locking_key);
+
struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp,
struct bnxt_tx_ring_info *txr,
dma_addr_t mapping, u32 len)
ring = smp_processor_id() % bp->tx_nr_rings_xdp;
txr = &bp->tx_ring[ring];
+ if (READ_ONCE(txr->dev_state) == BNXT_DEV_STATE_CLOSING)
+ return -EINVAL;
+
+ if (static_branch_unlikely(&bnxt_xdp_locking_key))
+ spin_lock(&txr->xdp_tx_lock);
+
for (i = 0; i < num_frames; i++) {
struct xdp_frame *xdp = frames[i];
- if (!txr || !bnxt_tx_avail(bp, txr) ||
- !(bp->bnapi[ring]->flags & BNXT_NAPI_FLAG_XDP))
+ if (!bnxt_tx_avail(bp, txr))
break;
mapping = dma_map_single(&pdev->dev, xdp->data, xdp->len,
bnxt_db_write(bp, &txr->tx_db, txr->tx_prod);
}
+ if (static_branch_unlikely(&bnxt_xdp_locking_key))
+ spin_unlock(&txr->xdp_tx_lock);
+
return nxmit;
}
#ifndef BNXT_XDP_H
#define BNXT_XDP_H
+DECLARE_STATIC_KEY_FALSE(bnxt_xdp_locking_key);
+
struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp,
struct bnxt_tx_ring_info *txr,
dma_addr_t mapping, u32 len);
base = of_iomap(node, 0);
if (!base) {
err = -ENOMEM;
- goto err_close;
+ goto err_put;
}
err = fsl_mc_allocate_irqs(mc_dev);
fsl_mc_free_irqs(mc_dev);
err_unmap:
iounmap(base);
+err_put:
+ of_node_put(node);
err_close:
dprtc_close(mc_dev->mc_io, 0, mc_dev->mc_handle);
err_free_mcp:
/* Calculate the max QID based on SQ/CQ/doorbell counts.
* SQ/CQ doorbells alternate.
*/
- num_dbs = (pci_resource_len(pdev, 0) - NVME_REG_DBS) /
- (fdev->db_stride * 4);
+ num_dbs = (pci_resource_len(pdev, 0) - NVME_REG_DBS) >>
+ (2 + NVME_CAP_STRIDE(fdev->cap_reg));
fdev->max_qid = min3(cq_count, sq_count, num_dbs / 2) - 1;
fdev->kern_end_qid = fdev->max_qid + 1;
return 0;
ICE_VSI_NETDEV_REGISTERED,
ICE_VSI_UMAC_FLTR_CHANGED,
ICE_VSI_MMAC_FLTR_CHANGED,
- ICE_VSI_VLAN_FLTR_CHANGED,
ICE_VSI_PROMISC_CHANGED,
ICE_VSI_STATE_NBITS /* must be last */
};
static inline bool ice_is_xdp_ena_vsi(struct ice_vsi *vsi)
{
- return !!vsi->xdp_prog;
+ return !!READ_ONCE(vsi->xdp_prog);
}
static inline void ice_set_ring_xdp(struct ice_tx_ring *ring)
ice_fltr_set_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi,
u8 promisc_mask)
{
- return ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, false);
+ struct ice_pf *pf = hw->back;
+ int result;
+
+ result = ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, false);
+ if (result)
+ dev_err(ice_pf_to_dev(pf),
+ "Error setting promisc mode on VSI %i (rc=%d)\n",
+ vsi->vsi_num, result);
+
+ return result;
}
/**
ice_fltr_clear_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi,
u8 promisc_mask)
{
- return ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, true);
+ struct ice_pf *pf = hw->back;
+ int result;
+
+ result = ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, true);
+ if (result)
+ dev_err(ice_pf_to_dev(pf),
+ "Error clearing promisc mode on VSI %i (rc=%d)\n",
+ vsi->vsi_num, result);
+
+ return result;
}
/**
ice_fltr_clear_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask,
u16 vid)
{
- return ice_clear_vsi_promisc(hw, vsi_handle, promisc_mask, vid);
+ struct ice_pf *pf = hw->back;
+ int result;
+
+ result = ice_clear_vsi_promisc(hw, vsi_handle, promisc_mask, vid);
+ if (result)
+ dev_err(ice_pf_to_dev(pf),
+ "Error clearing promisc mode on VSI %i for VID %u (rc=%d)\n",
+ ice_get_hw_vsi_num(hw, vsi_handle), vid, result);
+
+ return result;
}
/**
ice_fltr_set_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask,
u16 vid)
{
- return ice_set_vsi_promisc(hw, vsi_handle, promisc_mask, vid);
+ struct ice_pf *pf = hw->back;
+ int result;
+
+ result = ice_set_vsi_promisc(hw, vsi_handle, promisc_mask, vid);
+ if (result)
+ dev_err(ice_pf_to_dev(pf),
+ "Error setting promisc mode on VSI %i for VID %u (rc=%d)\n",
+ ice_get_hw_vsi_num(hw, vsi_handle), vid, result);
+
+ return result;
}
/**
ring->tx_tstamps = &pf->ptp.port.tx;
ring->dev = dev;
ring->count = vsi->num_tx_desc;
+ ring->txq_teid = ICE_INVAL_TEID;
if (dvm_ena)
ring->flags |= ICE_TX_FLAGS_RING_VLAN_L2TAG2;
else
}
}
+ if (ice_is_vsi_dflt_vsi(pf->first_sw, vsi))
+ ice_clear_dflt_vsi(pf->first_sw);
ice_fltr_remove_all(vsi);
ice_rm_vsi_lan_cfg(vsi->port_info, vsi->idx);
err = ice_rm_vsi_rdma_cfg(vsi->port_info, vsi->idx);
static bool ice_vsi_fltr_changed(struct ice_vsi *vsi)
{
return test_bit(ICE_VSI_UMAC_FLTR_CHANGED, vsi->state) ||
- test_bit(ICE_VSI_MMAC_FLTR_CHANGED, vsi->state) ||
- test_bit(ICE_VSI_VLAN_FLTR_CHANGED, vsi->state);
+ test_bit(ICE_VSI_MMAC_FLTR_CHANGED, vsi->state);
}
/**
if (vsi->type != ICE_VSI_PF)
return 0;
- if (ice_vsi_has_non_zero_vlans(vsi))
- status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi, promisc_m);
- else
- status = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0);
+ if (ice_vsi_has_non_zero_vlans(vsi)) {
+ promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX);
+ status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi,
+ promisc_m);
+ } else {
+ status = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx,
+ promisc_m, 0);
+ }
+
return status;
}
if (vsi->type != ICE_VSI_PF)
return 0;
- if (ice_vsi_has_non_zero_vlans(vsi))
- status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi, promisc_m);
- else
- status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0);
+ if (ice_vsi_has_non_zero_vlans(vsi)) {
+ promisc_m |= (ICE_PROMISC_VLAN_RX | ICE_PROMISC_VLAN_TX);
+ status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi,
+ promisc_m);
+ } else {
+ status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx,
+ promisc_m, 0);
+ }
+
return status;
}
struct ice_pf *pf = vsi->back;
struct ice_hw *hw = &pf->hw;
u32 changed_flags = 0;
- u8 promisc_m;
int err;
if (!vsi->netdev)
if (ice_vsi_fltr_changed(vsi)) {
clear_bit(ICE_VSI_UMAC_FLTR_CHANGED, vsi->state);
clear_bit(ICE_VSI_MMAC_FLTR_CHANGED, vsi->state);
- clear_bit(ICE_VSI_VLAN_FLTR_CHANGED, vsi->state);
/* grab the netdev's addr_list_lock */
netif_addr_lock_bh(netdev);
/* check for changes in promiscuous modes */
if (changed_flags & IFF_ALLMULTI) {
if (vsi->current_netdev_flags & IFF_ALLMULTI) {
- if (ice_vsi_has_non_zero_vlans(vsi))
- promisc_m = ICE_MCAST_VLAN_PROMISC_BITS;
- else
- promisc_m = ICE_MCAST_PROMISC_BITS;
-
- err = ice_set_promisc(vsi, promisc_m);
+ err = ice_set_promisc(vsi, ICE_MCAST_PROMISC_BITS);
if (err) {
- netdev_err(netdev, "Error setting Multicast promiscuous mode on VSI %i\n",
- vsi->vsi_num);
vsi->current_netdev_flags &= ~IFF_ALLMULTI;
goto out_promisc;
}
} else {
/* !(vsi->current_netdev_flags & IFF_ALLMULTI) */
- if (ice_vsi_has_non_zero_vlans(vsi))
- promisc_m = ICE_MCAST_VLAN_PROMISC_BITS;
- else
- promisc_m = ICE_MCAST_PROMISC_BITS;
-
- err = ice_clear_promisc(vsi, promisc_m);
+ err = ice_clear_promisc(vsi, ICE_MCAST_PROMISC_BITS);
if (err) {
- netdev_err(netdev, "Error clearing Multicast promiscuous mode on VSI %i\n",
- vsi->vsi_num);
vsi->current_netdev_flags |= IFF_ALLMULTI;
goto out_promisc;
}
spin_lock_init(&xdp_ring->tx_lock);
for (j = 0; j < xdp_ring->count; j++) {
tx_desc = ICE_TX_DESC(xdp_ring, j);
- tx_desc->cmd_type_offset_bsz = cpu_to_le64(ICE_TX_DESC_DTYPE_DESC_DONE);
+ tx_desc->cmd_type_offset_bsz = 0;
}
}
ice_for_each_xdp_txq(vsi, i)
if (vsi->xdp_rings[i]) {
- if (vsi->xdp_rings[i]->desc)
+ if (vsi->xdp_rings[i]->desc) {
+ synchronize_rcu();
ice_free_tx_ring(vsi->xdp_rings[i]);
+ }
kfree_rcu(vsi->xdp_rings[i], rcu);
vsi->xdp_rings[i] = NULL;
}
if (!vid)
return 0;
+ while (test_and_set_bit(ICE_CFG_BUSY, vsi->state))
+ usleep_range(1000, 2000);
+
+ /* Add multicast promisc rule for the VLAN ID to be added if
+ * all-multicast is currently enabled.
+ */
+ if (vsi->current_netdev_flags & IFF_ALLMULTI) {
+ ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx,
+ ICE_MCAST_VLAN_PROMISC_BITS,
+ vid);
+ if (ret)
+ goto finish;
+ }
+
vlan_ops = ice_get_compat_vsi_vlan_ops(vsi);
/* Add a switch rule for this VLAN ID so its corresponding VLAN tagged
*/
vlan = ICE_VLAN(be16_to_cpu(proto), vid, 0);
ret = vlan_ops->add_vlan(vsi, &vlan);
- if (!ret)
- set_bit(ICE_VSI_VLAN_FLTR_CHANGED, vsi->state);
+ if (ret)
+ goto finish;
+
+ /* If all-multicast is currently enabled and this VLAN ID is only one
+ * besides VLAN-0 we have to update look-up type of multicast promisc
+ * rule for VLAN-0 from ICE_SW_LKUP_PROMISC to ICE_SW_LKUP_PROMISC_VLAN.
+ */
+ if ((vsi->current_netdev_flags & IFF_ALLMULTI) &&
+ ice_vsi_num_non_zero_vlans(vsi) == 1) {
+ ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx,
+ ICE_MCAST_PROMISC_BITS, 0);
+ ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx,
+ ICE_MCAST_VLAN_PROMISC_BITS, 0);
+ }
+
+finish:
+ clear_bit(ICE_CFG_BUSY, vsi->state);
return ret;
}
if (!vid)
return 0;
+ while (test_and_set_bit(ICE_CFG_BUSY, vsi->state))
+ usleep_range(1000, 2000);
+
vlan_ops = ice_get_compat_vsi_vlan_ops(vsi);
/* Make sure VLAN delete is successful before updating VLAN
vlan = ICE_VLAN(be16_to_cpu(proto), vid, 0);
ret = vlan_ops->del_vlan(vsi, &vlan);
if (ret)
- return ret;
+ goto finish;
- set_bit(ICE_VSI_VLAN_FLTR_CHANGED, vsi->state);
- return 0;
+ /* Remove multicast promisc rule for the removed VLAN ID if
+ * all-multicast is enabled.
+ */
+ if (vsi->current_netdev_flags & IFF_ALLMULTI)
+ ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx,
+ ICE_MCAST_VLAN_PROMISC_BITS, vid);
+
+ if (!ice_vsi_has_non_zero_vlans(vsi)) {
+ /* Update look-up type of multicast promisc rule for VLAN 0
+ * from ICE_SW_LKUP_PROMISC_VLAN to ICE_SW_LKUP_PROMISC when
+ * all-multicast is enabled and VLAN 0 is the only VLAN rule.
+ */
+ if (vsi->current_netdev_flags & IFF_ALLMULTI) {
+ ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx,
+ ICE_MCAST_VLAN_PROMISC_BITS,
+ 0);
+ ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx,
+ ICE_MCAST_PROMISC_BITS, 0);
+ }
+ }
+
+finish:
+ clear_bit(ICE_CFG_BUSY, vsi->state);
+
+ return ret;
}
/**
/* Add filter for new MAC. If filter exists, return success */
err = ice_fltr_add_mac(vsi, mac, ICE_FWD_TO_VSI);
- if (err == -EEXIST)
+ if (err == -EEXIST) {
/* Although this MAC filter is already present in hardware it's
* possible in some cases (e.g. bonding) that dev_addr was
* modified outside of the driver and needs to be restored back
* to this value.
*/
netdev_dbg(netdev, "filter for MAC %pM already exists\n", mac);
- else if (err)
+
+ return 0;
+ } else if (err) {
/* error if the new filter addition failed */
err = -EADDRNOTAVAIL;
+ }
err_update_filters:
if (err) {
goto error_param;
}
- /* Skip queue if not enabled */
if (!test_bit(vf_q_id, vf->txq_ena))
- continue;
+ dev_dbg(ice_pf_to_dev(vsi->back), "Queue %u on VSI %u is not enabled, but stopping it anyway\n",
+ vf_q_id, vsi->vsi_num);
ice_fill_txq_meta(vsi, ring, &txq_meta);
static void ice_qp_clean_rings(struct ice_vsi *vsi, u16 q_idx)
{
ice_clean_tx_ring(vsi->tx_rings[q_idx]);
- if (ice_is_xdp_ena_vsi(vsi))
+ if (ice_is_xdp_ena_vsi(vsi)) {
+ synchronize_rcu();
ice_clean_tx_ring(vsi->xdp_rings[q_idx]);
+ }
ice_clean_rx_ring(vsi->rx_rings[q_idx]);
}
struct ice_vsi *vsi = np->vsi;
struct ice_tx_ring *ring;
- if (test_bit(ICE_DOWN, vsi->state))
+ if (test_bit(ICE_VSI_DOWN, vsi->state))
return -ENETDOWN;
if (!ice_is_xdp_ena_vsi(vsi))
}
ret = of_get_mac_address(pnp, ppd.mac_addr);
- if (ret)
+ if (ret == -EPROBE_DEFER)
return ret;
mv643xx_eth_property(pnp, "tx-queue-size", ppd.tx_queue_size);
config KS8851
tristate "Micrel KS8851 SPI"
depends on SPI
+ depends on PTP_1588_CLOCK_OPTIONAL
select MII
select CRC32
select EEPROM_93CX6
config KS8851_MLL
tristate "Micrel KS8851 MLL"
depends on HAS_IOMEM
+ depends on PTP_1588_CLOCK_OPTIONAL
select MII
select CRC32
select EEPROM_93CX6
status = myri10ge_xmit(curr, dev);
if (status != 0) {
dev_kfree_skb_any(curr);
- if (segs != NULL) {
- curr = segs;
- segs = next;
+ skb_list_walk_safe(next, curr, next) {
curr->next = NULL;
- dev_kfree_skb_any(segs);
+ dev_kfree_skb_any(curr);
}
goto drop;
}
#define STATIC_DEBUG_LINE_DWORDS 9
-#define NUM_COMMON_GLOBAL_PARAMS 11
+#define NUM_COMMON_GLOBAL_PARAMS 10
#define MAX_RECURSION_DEPTH 10
buf = page_address(bd->data) + bd->page_offset;
skb = build_skb(buf, rxq->rx_buf_seg_size);
+ if (unlikely(!skb))
+ return NULL;
+
skb_reserve(skb, pad);
skb_put(skb, len);
kfree(efx->xdp_tx_queues);
}
+static int efx_set_xdp_tx_queue(struct efx_nic *efx, int xdp_queue_number,
+ struct efx_tx_queue *tx_queue)
+{
+ if (xdp_queue_number >= efx->xdp_tx_queue_count)
+ return -EINVAL;
+
+ netif_dbg(efx, drv, efx->net_dev,
+ "Channel %u TXQ %u is XDP %u, HW %u\n",
+ tx_queue->channel->channel, tx_queue->label,
+ xdp_queue_number, tx_queue->queue);
+ efx->xdp_tx_queues[xdp_queue_number] = tx_queue;
+ return 0;
+}
+
+static void efx_set_xdp_channels(struct efx_nic *efx)
+{
+ struct efx_tx_queue *tx_queue;
+ struct efx_channel *channel;
+ unsigned int next_queue = 0;
+ int xdp_queue_number = 0;
+ int rc;
+
+ /* We need to mark which channels really have RX and TX
+ * queues, and adjust the TX queue numbers if we have separate
+ * RX-only and TX-only channels.
+ */
+ efx_for_each_channel(channel, efx) {
+ if (channel->channel < efx->tx_channel_offset)
+ continue;
+
+ if (efx_channel_is_xdp_tx(channel)) {
+ efx_for_each_channel_tx_queue(tx_queue, channel) {
+ tx_queue->queue = next_queue++;
+ rc = efx_set_xdp_tx_queue(efx, xdp_queue_number,
+ tx_queue);
+ if (rc == 0)
+ xdp_queue_number++;
+ }
+ } else {
+ efx_for_each_channel_tx_queue(tx_queue, channel) {
+ tx_queue->queue = next_queue++;
+ netif_dbg(efx, drv, efx->net_dev,
+ "Channel %u TXQ %u is HW %u\n",
+ channel->channel, tx_queue->label,
+ tx_queue->queue);
+ }
+
+ /* If XDP is borrowing queues from net stack, it must
+ * use the queue with no csum offload, which is the
+ * first one of the channel
+ * (note: tx_queue_by_type is not initialized yet)
+ */
+ if (efx->xdp_txq_queues_mode ==
+ EFX_XDP_TX_QUEUES_BORROWED) {
+ tx_queue = &channel->tx_queue[0];
+ rc = efx_set_xdp_tx_queue(efx, xdp_queue_number,
+ tx_queue);
+ if (rc == 0)
+ xdp_queue_number++;
+ }
+ }
+ }
+ WARN_ON(efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_DEDICATED &&
+ xdp_queue_number != efx->xdp_tx_queue_count);
+ WARN_ON(efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED &&
+ xdp_queue_number > efx->xdp_tx_queue_count);
+
+ /* If we have more CPUs than assigned XDP TX queues, assign the already
+ * existing queues to the exceeding CPUs
+ */
+ next_queue = 0;
+ while (xdp_queue_number < efx->xdp_tx_queue_count) {
+ tx_queue = efx->xdp_tx_queues[next_queue++];
+ rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue);
+ if (rc == 0)
+ xdp_queue_number++;
+ }
+}
+
int efx_realloc_channels(struct efx_nic *efx, u32 rxq_entries, u32 txq_entries)
{
struct efx_channel *other_channel[EFX_MAX_CHANNELS], *channel;
efx_init_napi_channel(efx->channel[i]);
}
+ efx_set_xdp_channels(efx);
out:
/* Destroy unused channel structures */
for (i = 0; i < efx->n_channels; i++) {
goto out;
}
-static inline int
-efx_set_xdp_tx_queue(struct efx_nic *efx, int xdp_queue_number,
- struct efx_tx_queue *tx_queue)
-{
- if (xdp_queue_number >= efx->xdp_tx_queue_count)
- return -EINVAL;
-
- netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n",
- tx_queue->channel->channel, tx_queue->label,
- xdp_queue_number, tx_queue->queue);
- efx->xdp_tx_queues[xdp_queue_number] = tx_queue;
- return 0;
-}
-
int efx_set_channels(struct efx_nic *efx)
{
- struct efx_tx_queue *tx_queue;
struct efx_channel *channel;
- unsigned int next_queue = 0;
- int xdp_queue_number;
int rc;
efx->tx_channel_offset =
return -ENOMEM;
}
- /* We need to mark which channels really have RX and TX
- * queues, and adjust the TX queue numbers if we have separate
- * RX-only and TX-only channels.
- */
- xdp_queue_number = 0;
efx_for_each_channel(channel, efx) {
if (channel->channel < efx->n_rx_channels)
channel->rx_queue.core_index = channel->channel;
else
channel->rx_queue.core_index = -1;
-
- if (channel->channel >= efx->tx_channel_offset) {
- if (efx_channel_is_xdp_tx(channel)) {
- efx_for_each_channel_tx_queue(tx_queue, channel) {
- tx_queue->queue = next_queue++;
- rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue);
- if (rc == 0)
- xdp_queue_number++;
- }
- } else {
- efx_for_each_channel_tx_queue(tx_queue, channel) {
- tx_queue->queue = next_queue++;
- netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is HW %u\n",
- channel->channel, tx_queue->label,
- tx_queue->queue);
- }
-
- /* If XDP is borrowing queues from net stack, it must use the queue
- * with no csum offload, which is the first one of the channel
- * (note: channel->tx_queue_by_type is not initialized yet)
- */
- if (efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_BORROWED) {
- tx_queue = &channel->tx_queue[0];
- rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue);
- if (rc == 0)
- xdp_queue_number++;
- }
- }
- }
}
- WARN_ON(efx->xdp_txq_queues_mode == EFX_XDP_TX_QUEUES_DEDICATED &&
- xdp_queue_number != efx->xdp_tx_queue_count);
- WARN_ON(efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED &&
- xdp_queue_number > efx->xdp_tx_queue_count);
- /* If we have more CPUs than assigned XDP TX queues, assign the already
- * existing queues to the exceeding CPUs
- */
- next_queue = 0;
- while (xdp_queue_number < efx->xdp_tx_queue_count) {
- tx_queue = efx->xdp_tx_queues[next_queue++];
- rc = efx_set_xdp_tx_queue(efx, xdp_queue_number, tx_queue);
- if (rc == 0)
- xdp_queue_number++;
- }
+ efx_set_xdp_channels(efx);
rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels);
if (rc)
struct efx_rx_queue *rx_queue;
struct efx_channel *channel;
- efx_for_each_channel(channel, efx) {
+ efx_for_each_channel_rev(channel, efx) {
efx_for_each_channel_tx_queue(tx_queue, channel) {
efx_init_tx_queue(tx_queue);
atomic_inc(&efx->active_queues);
struct efx_nic *efx = rx_queue->efx;
int i;
+ if (unlikely(!rx_queue->page_ring))
+ return;
+
/* Unmap and release the pages in the recycle ring. Remove the ring. */
for (i = 0; i <= rx_queue->page_ptr_mask; i++) {
struct page *page = rx_queue->page_ring[i];
if (unlikely(!tx_queue))
return -EINVAL;
+ if (!tx_queue->initialised)
+ return -EINVAL;
+
if (efx->xdp_txq_queues_mode != EFX_XDP_TX_QUEUES_DEDICATED)
HARD_TX_LOCK(efx->net_dev, tx_queue->core_txq, cpu);
netif_dbg(tx_queue->efx, drv, tx_queue->efx->net_dev,
"shutting down TX queue %d\n", tx_queue->queue);
+ tx_queue->initialised = false;
+
if (!tx_queue->buffer)
return;
};
MODULE_DEVICE_TABLE(pci, loongson_dwmac_id_table);
-struct pci_driver loongson_dwmac_driver = {
+static struct pci_driver loongson_dwmac_driver = {
.name = "dwmac-loongson-pci",
.id_table = loongson_dwmac_id_table,
.probe = loongson_dwmac_probe,
plat->phylink_node = np;
/* Get max speed of operation from device tree */
- if (of_property_read_u32(np, "max-speed", &plat->max_speed))
- plat->max_speed = -1;
+ of_property_read_u32(np, "max-speed", &plat->max_speed);
plat->bus_id = of_alias_get_id(np, "ethernet");
if (plat->bus_id < 0)
struct net_device *ndev;
struct device *dev;
- struct device_node *phy_node;
-
struct phylink *phylink;
struct phylink_config phylink_config;
if (ret)
goto cleanup_clk;
- lp->phy_node = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0);
- if (lp->phy_node) {
- ret = axienet_mdio_setup(lp);
- if (ret)
- dev_warn(&pdev->dev,
- "error registering MDIO bus: %d\n", ret);
- }
+ ret = axienet_mdio_setup(lp);
+ if (ret)
+ dev_warn(&pdev->dev,
+ "error registering MDIO bus: %d\n", ret);
+
if (lp->phy_mode == PHY_INTERFACE_MODE_SGMII ||
lp->phy_mode == PHY_INTERFACE_MODE_1000BASEX) {
- if (!lp->phy_node) {
- dev_err(&pdev->dev, "phy-handle required for 1000BaseX/SGMII\n");
+ np = of_parse_phandle(pdev->dev.of_node, "pcs-handle", 0);
+ if (!np) {
+ /* Deprecated: Always use "pcs-handle" for pcs_phy.
+ * Falling back to "phy-handle" here is only for
+ * backward compatibility with old device trees.
+ */
+ np = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0);
+ }
+ if (!np) {
+ dev_err(&pdev->dev, "pcs-handle (preferred) or phy-handle required for 1000BaseX/SGMII\n");
ret = -EINVAL;
goto cleanup_mdio;
}
- lp->pcs_phy = of_mdio_find_device(lp->phy_node);
+ lp->pcs_phy = of_mdio_find_device(np);
if (!lp->pcs_phy) {
ret = -EPROBE_DEFER;
+ of_node_put(np);
goto cleanup_mdio;
}
+ of_node_put(np);
lp->pcs.ops = &axienet_pcs_ops;
lp->pcs.poll = true;
}
put_device(&lp->pcs_phy->dev);
if (lp->mii_bus)
axienet_mdio_teardown(lp);
- of_node_put(lp->phy_node);
-
cleanup_clk:
clk_bulk_disable_unprepare(XAE_NUM_MISC_CLOCKS, lp->misc_clks);
clk_disable_unprepare(lp->axi_clk);
clk_bulk_disable_unprepare(XAE_NUM_MISC_CLOCKS, lp->misc_clks);
clk_disable_unprepare(lp->axi_clk);
- of_node_put(lp->phy_node);
- lp->phy_node = NULL;
-
free_netdev(ndev);
return 0;
hdr->source_slave = ((llsrc << 1) & 0xff) | 0x01;
mhdr->ver = 0x01;
- return 0;
+ return sizeof(struct mctp_i2c_hdr);
}
static int mctp_i2c_tx_thread(void *data)
u32 val;
int ret;
+ if (regnum & MII_ADDR_C45)
+ return -EOPNOTSUPP;
+
ret = mscc_miim_wait_pending(bus);
if (ret)
goto out;
struct mscc_miim_dev *miim = bus->priv;
int ret;
+ if (regnum & MII_ADDR_C45)
+ return -EOPNOTSUPP;
+
ret = mscc_miim_wait_pending(bus);
if (ret < 0)
goto out;
#define PTP_TIMESTAMP_EN_PDREQ_ BIT(2)
#define PTP_TIMESTAMP_EN_PDRES_ BIT(3)
-#define PTP_RX_LATENCY_1000 0x0224
-#define PTP_TX_LATENCY_1000 0x0225
-
-#define PTP_RX_LATENCY_100 0x0222
-#define PTP_TX_LATENCY_100 0x0223
-
-#define PTP_RX_LATENCY_10 0x0220
-#define PTP_TX_LATENCY_10 0x0221
-
#define PTP_TX_PARSE_L2_ADDR_EN 0x0284
#define PTP_RX_PARSE_L2_ADDR_EN 0x0244
u16 seq_id;
};
-struct kszphy_latencies {
- u16 rx_10;
- u16 tx_10;
- u16 rx_100;
- u16 tx_100;
- u16 rx_1000;
- u16 tx_1000;
-};
-
struct kszphy_ptp_priv {
struct mii_timestamper mii_ts;
struct phy_device *phydev;
struct kszphy_priv {
struct kszphy_ptp_priv ptp_priv;
- struct kszphy_latencies latencies;
const struct kszphy_type *type;
int led_mode;
bool rmii_ref_clk_sel;
u64 stats[ARRAY_SIZE(kszphy_hw_stats)];
};
-static struct kszphy_latencies lan8814_latencies = {
- .rx_10 = 0x22AA,
- .tx_10 = 0x2E4A,
- .rx_100 = 0x092A,
- .tx_100 = 0x02C1,
- .rx_1000 = 0x01AD,
- .tx_1000 = 0x00C9,
-};
static const struct kszphy_type ksz8021_type = {
.led_mode_reg = MII_KSZPHY_CTRL_2,
.has_broadcast_disable = true,
return 0;
}
-static int lan8814_read_status(struct phy_device *phydev)
-{
- struct kszphy_priv *priv = phydev->priv;
- struct kszphy_latencies *latencies = &priv->latencies;
- int err;
- int regval;
-
- err = genphy_read_status(phydev);
- if (err)
- return err;
-
- switch (phydev->speed) {
- case SPEED_1000:
- lanphy_write_page_reg(phydev, 5, PTP_RX_LATENCY_1000,
- latencies->rx_1000);
- lanphy_write_page_reg(phydev, 5, PTP_TX_LATENCY_1000,
- latencies->tx_1000);
- break;
- case SPEED_100:
- lanphy_write_page_reg(phydev, 5, PTP_RX_LATENCY_100,
- latencies->rx_100);
- lanphy_write_page_reg(phydev, 5, PTP_TX_LATENCY_100,
- latencies->tx_100);
- break;
- case SPEED_10:
- lanphy_write_page_reg(phydev, 5, PTP_RX_LATENCY_10,
- latencies->rx_10);
- lanphy_write_page_reg(phydev, 5, PTP_TX_LATENCY_10,
- latencies->tx_10);
- break;
- default:
- break;
- }
-
- /* Make sure the PHY is not broken. Read idle error count,
- * and reset the PHY if it is maxed out.
- */
- regval = phy_read(phydev, MII_STAT1000);
- if ((regval & 0xFF) == 0xFF) {
- phy_init_hw(phydev);
- phydev->link = 0;
- if (phydev->drv->config_intr && phy_interrupt_is_valid(phydev))
- phydev->drv->config_intr(phydev);
- return genphy_config_aneg(phydev);
- }
-
- return 0;
-}
-
static int lan8814_config_init(struct phy_device *phydev)
{
int val;
return 0;
}
-static void lan8814_parse_latency(struct phy_device *phydev)
-{
- const struct device_node *np = phydev->mdio.dev.of_node;
- struct kszphy_priv *priv = phydev->priv;
- struct kszphy_latencies *latency = &priv->latencies;
- u32 val;
-
- if (!of_property_read_u32(np, "lan8814,latency_rx_10", &val))
- latency->rx_10 = val;
- if (!of_property_read_u32(np, "lan8814,latency_tx_10", &val))
- latency->tx_10 = val;
- if (!of_property_read_u32(np, "lan8814,latency_rx_100", &val))
- latency->rx_100 = val;
- if (!of_property_read_u32(np, "lan8814,latency_tx_100", &val))
- latency->tx_100 = val;
- if (!of_property_read_u32(np, "lan8814,latency_rx_1000", &val))
- latency->rx_1000 = val;
- if (!of_property_read_u32(np, "lan8814,latency_tx_1000", &val))
- latency->tx_1000 = val;
-}
-
static int lan8814_probe(struct phy_device *phydev)
{
- const struct device_node *np = phydev->mdio.dev.of_node;
struct kszphy_priv *priv;
u16 addr;
int err;
priv->led_mode = -1;
- priv->latencies = lan8814_latencies;
-
phydev->priv = priv;
if (!IS_ENABLED(CONFIG_PTP_1588_CLOCK) ||
- !IS_ENABLED(CONFIG_NETWORK_PHY_TIMESTAMPING) ||
- of_property_read_bool(np, "lan8814,ignore-ts"))
+ !IS_ENABLED(CONFIG_NETWORK_PHY_TIMESTAMPING))
return 0;
/* Strap-in value for PHY address, below register read gives starting
return err;
}
- lan8814_parse_latency(phydev);
lan8814_ptp_init(phydev);
return 0;
.config_init = lan8814_config_init,
.probe = lan8814_probe,
.soft_reset = genphy_soft_reset,
- .read_status = lan8814_read_status,
+ .read_status = ksz9031_read_status,
.get_sset_count = kszphy_get_sset_count,
.get_strings = kszphy_get_strings,
.get_stats = kszphy_get_stats,
spin_lock(&sl->lock);
if (netif_queue_stopped(dev)) {
- if (!netif_running(dev))
+ if (!netif_running(dev) || !sl->tty)
goto out;
/* May be we must check transmitter timeout here ?
if (start_of_descs != desc_offset)
goto err;
- /* self check desc_offset from header*/
- if (desc_offset >= skb_len)
+ /* self check desc_offset from header and make sure that the
+ * bounds of the metadata array are inside the SKB
+ */
+ if (pkt_count * 2 + desc_offset >= skb_len)
goto err;
+ /* Packets must not overlap the metadata array */
+ skb_trim(skb, desc_offset);
+
if (pkt_count == 0)
goto err;
eth = (struct ethhdr *)skb->data;
skb_reset_mac_header(skb);
+ skb_reset_mac_len(skb);
/* we set the ethernet destination and the source addresses to the
* address of the VRF device.
*/
static int vrf_add_mac_header_if_unset(struct sk_buff *skb,
struct net_device *vrf_dev,
- u16 proto)
+ u16 proto, struct net_device *orig_dev)
{
- if (skb_mac_header_was_set(skb))
+ if (skb_mac_header_was_set(skb) && dev_has_header(orig_dev))
return 0;
return vrf_prepare_mac_header(skb, vrf_dev, proto);
/* if packet is NDISC then keep the ingress interface */
if (!is_ndisc) {
+ struct net_device *orig_dev = skb->dev;
+
vrf_rx_stats(vrf_dev, skb->len);
skb->dev = vrf_dev;
skb->skb_iif = vrf_dev->ifindex;
int err;
err = vrf_add_mac_header_if_unset(skb, vrf_dev,
- ETH_P_IPV6);
+ ETH_P_IPV6,
+ orig_dev);
if (likely(!err)) {
skb_push(skb, skb->mac_len);
dev_queue_xmit_nit(skb, vrf_dev);
static struct sk_buff *vrf_ip_rcv(struct net_device *vrf_dev,
struct sk_buff *skb)
{
+ struct net_device *orig_dev = skb->dev;
+
skb->dev = vrf_dev;
skb->skb_iif = vrf_dev->ifindex;
IPCB(skb)->flags |= IPSKB_L3SLAVE;
if (!list_empty(&vrf_dev->ptype_all)) {
int err;
- err = vrf_add_mac_header_if_unset(skb, vrf_dev, ETH_P_IP);
+ err = vrf_add_mac_header_if_unset(skb, vrf_dev, ETH_P_IP,
+ orig_dev);
if (likely(!err)) {
skb_push(skb, skb->mac_len);
dev_queue_xmit_nit(skb, vrf_dev);
return type & ~BPF_BASE_TYPE_MASK;
}
+/* only use after check_attach_btf_id() */
static inline enum bpf_prog_type resolve_prog_type(struct bpf_prog *prog)
{
- return prog->aux->dst_prog ? prog->aux->dst_prog->type : prog->type;
+ return prog->type == BPF_PROG_TYPE_EXT ?
+ prog->aux->dst_prog->type : prog->type;
}
#endif /* _LINUX_BPF_VERIFIER_H */
#define MCTP_HDR_TAG_SHIFT 0
#define MCTP_HDR_TAG_MASK GENMASK(2, 0)
-#define MCTP_HEADER_MAXLEN 4
-
#define MCTP_INITIAL_DEFAULT_NET 1
static inline bool mctp_address_unicast(mctp_eid_t eid)
}
static int
-kprobe_multi_resolve_syms(const void *usyms, u32 cnt,
+kprobe_multi_resolve_syms(const void __user *usyms, u32 cnt,
unsigned long *addrs)
{
unsigned long addr, size;
- const char **syms;
+ const char __user **syms;
int err = -ENOMEM;
unsigned int i;
char *func;
*/
void rethook_free(struct rethook *rh)
{
- rcu_assign_pointer(rh->handler, NULL);
+ WRITE_ONCE(rh->handler, NULL);
call_rcu(&rh->rcu, rethook_free_rcu);
}
if (!th->ack || th->rst || th->syn)
return -ENOENT;
+ if (unlikely(iph_len < sizeof(struct iphdr)))
+ return -EINVAL;
+
if (tcp_synq_no_recent_overflow(sk))
return -ENOENT;
cookie = ntohl(th->ack_seq) - 1;
- switch (sk->sk_family) {
- case AF_INET:
- if (unlikely(iph_len < sizeof(struct iphdr)))
+ /* Both struct iphdr and struct ipv6hdr have the version field at the
+ * same offset so we can cast to the shorter header (struct iphdr).
+ */
+ switch (((struct iphdr *)iph)->version) {
+ case 4:
+ if (sk->sk_family == AF_INET6 && ipv6_only_sock(sk))
return -EINVAL;
ret = __cookie_v4_check((struct iphdr *)iph, th, cookie);
break;
#if IS_BUILTIN(CONFIG_IPV6)
- case AF_INET6:
+ case 6:
if (unlikely(iph_len < sizeof(struct ipv6hdr)))
return -EINVAL;
+ if (sk->sk_family != AF_INET6)
+ return -EINVAL;
+
ret = __cookie_v6_check((struct ipv6hdr *)iph, th, cookie);
break;
#endif /* CONFIG_IPV6 */
if (skb_cloned(to))
return false;
- /* The page pool signature of struct page will eventually figure out
- * which pages can be recycled or not but for now let's prohibit slab
- * allocated and page_pool allocated SKBs from being coalesced.
+ /* In general, avoid mixing slab allocated and page_pool allocated
+ * pages within the same SKB. However when @to is not pp_recycle and
+ * @from is cloned, we can transition frag pages from page_pool to
+ * reference counted.
+ *
+ * On the other hand, don't allow coalescing two pp_recycle SKBs if
+ * @from is cloned, in case the SKB is using page_pool fragment
+ * references (PP_FLAG_PAGE_FRAG). Since we only take full page
+ * references for cloned SKBs at the moment that would result in
+ * inconsistent reference counts.
*/
- if (to->pp_recycle != from->pp_recycle)
+ if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from)))
return false;
if (len <= skb_tailroom(to)) {
.attrs = dsa_slave_attrs,
};
+static void dsa_master_reset_mtu(struct net_device *dev)
+{
+ int err;
+
+ err = dev_set_mtu(dev, ETH_DATA_LEN);
+ if (err)
+ netdev_dbg(dev,
+ "Unable to reset MTU to exclude DSA overheads\n");
+}
+
int dsa_master_setup(struct net_device *dev, struct dsa_port *cpu_dp)
{
+ const struct dsa_device_ops *tag_ops = cpu_dp->tag_ops;
struct dsa_switch *ds = cpu_dp->ds;
struct device_link *consumer_link;
- int ret;
+ int mtu, ret;
+
+ mtu = ETH_DATA_LEN + dsa_tag_protocol_overhead(tag_ops);
/* The DSA master must use SET_NETDEV_DEV for this to work. */
consumer_link = device_link_add(ds->dev, dev->dev.parent,
"Failed to create a device link to DSA switch %s\n",
dev_name(ds->dev));
+ /* The switch driver may not implement ->port_change_mtu(), case in
+ * which dsa_slave_change_mtu() will not update the master MTU either,
+ * so we need to do that here.
+ */
+ ret = dev_set_mtu(dev, mtu);
+ if (ret)
+ netdev_warn(dev, "error %d setting MTU to %d to include DSA overhead\n",
+ ret, mtu);
+
/* If we use a tagging format that doesn't have an ethertype
* field, make sure that all packets from this point on get
* sent to the tag format's receive function.
sysfs_remove_group(&dev->dev.kobj, &dsa_group);
dsa_netdev_ops_set(dev, NULL);
dsa_master_ethtool_teardown(dev);
+ dsa_master_reset_mtu(dev);
dsa_master_set_promiscuity(dev, -1);
dev->dsa_ptr = NULL;
}
if (cfg->fc_oif || cfg->fc_gw_family) {
- struct fib_nh *nh = fib_info_nh(fi, 0);
+ struct fib_nh *nh;
+
+ /* cannot match on nexthop object attributes */
+ if (fi->nh)
+ return 1;
+ nh = fib_info_nh(fi, 0);
if (cfg->fc_encap) {
if (fib_encap_match(net, cfg->fc_encap_type,
cfg->fc_encap, nh, cfg, extack))
mifi_t mifi;
struct net *net = sock_net(sk);
struct mr_table *mrt;
- bool do_wrmifwhole;
if (sk->sk_type != SOCK_RAW ||
inet_sk(sk)->inet_num != IPPROTO_ICMPV6)
#ifdef CONFIG_IPV6_PIMSM_V2
case MRT6_PIM:
{
+ bool do_wrmifwhole;
int v;
if (optlen != sizeof(v))
struct inet6_dev *idev;
int type;
- if (netif_is_l3_master(skb->dev) &&
+ if (netif_is_l3_master(skb->dev) ||
dst->dev == net->loopback_dev)
idev = __in6_dev_get_safely(dev_get_by_index_rcu(net, IP6CB(skb)->iif));
else
static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
{
DECLARE_SOCKADDR(struct sockaddr_mctp *, addr, msg->msg_name);
- const int hlen = MCTP_HEADER_MAXLEN + sizeof(struct mctp_hdr);
int rc, addrlen = msg->msg_namelen;
struct sock *sk = sock->sk;
struct mctp_sock *msk = container_of(sk, struct mctp_sock, sk);
struct mctp_skb_cb *cb;
struct mctp_route *rt;
- struct sk_buff *skb;
+ struct sk_buff *skb = NULL;
+ int hlen;
if (addr) {
const u8 tagbits = MCTP_TAG_MASK | MCTP_TAG_OWNER |
if (addr->smctp_network == MCTP_NET_ANY)
addr->smctp_network = mctp_default_net(sock_net(sk));
+ /* direct addressing */
+ if (msk->addr_ext && addrlen >= sizeof(struct sockaddr_mctp_ext)) {
+ DECLARE_SOCKADDR(struct sockaddr_mctp_ext *,
+ extaddr, msg->msg_name);
+ struct net_device *dev;
+
+ rc = -EINVAL;
+ rcu_read_lock();
+ dev = dev_get_by_index_rcu(sock_net(sk), extaddr->smctp_ifindex);
+ /* check for correct halen */
+ if (dev && extaddr->smctp_halen == dev->addr_len) {
+ hlen = LL_RESERVED_SPACE(dev) + sizeof(struct mctp_hdr);
+ rc = 0;
+ }
+ rcu_read_unlock();
+ if (rc)
+ goto err_free;
+ rt = NULL;
+ } else {
+ rt = mctp_route_lookup(sock_net(sk), addr->smctp_network,
+ addr->smctp_addr.s_addr);
+ if (!rt) {
+ rc = -EHOSTUNREACH;
+ goto err_free;
+ }
+ hlen = LL_RESERVED_SPACE(rt->dev->dev) + sizeof(struct mctp_hdr);
+ }
+
skb = sock_alloc_send_skb(sk, hlen + 1 + len,
msg->msg_flags & MSG_DONTWAIT, &rc);
if (!skb)
cb = __mctp_cb(skb);
cb->net = addr->smctp_network;
- /* direct addressing */
- if (msk->addr_ext && addrlen >= sizeof(struct sockaddr_mctp_ext)) {
+ if (!rt) {
+ /* fill extended address in cb */
DECLARE_SOCKADDR(struct sockaddr_mctp_ext *,
extaddr, msg->msg_name);
}
cb->ifindex = extaddr->smctp_ifindex;
+ /* smctp_halen is checked above */
cb->halen = extaddr->smctp_halen;
memcpy(cb->haddr, extaddr->smctp_haddr, cb->halen);
-
- rt = NULL;
- } else {
- rt = mctp_route_lookup(sock_net(sk), addr->smctp_network,
- addr->smctp_addr.s_addr);
- if (!rt) {
- rc = -EHOSTUNREACH;
- goto err_free;
- }
}
rc = mctp_local_output(sk, rt, skb, addr->smctp_addr.s_addr,
if (cb->ifindex) {
/* direct route; use the hwaddr we stashed in sendmsg */
+ if (cb->halen != skb->dev->addr_len) {
+ /* sanity check, sendmsg should have already caught this */
+ kfree_skb(skb);
+ return -EMSGSIZE;
+ }
daddr = cb->haddr;
} else {
/* If lookup fails let the device handle daddr==NULL */
rc = dev_hard_header(skb, skb->dev, ntohs(skb->protocol),
daddr, skb->dev->dev_addr, skb->len);
- if (rc) {
+ if (rc < 0) {
kfree_skb(skb);
return -EHOSTUNREACH;
}
{
const unsigned int hlen = sizeof(struct mctp_hdr);
struct mctp_hdr *hdr, *hdr2;
- unsigned int pos, size;
+ unsigned int pos, size, headroom;
struct sk_buff *skb2;
int rc;
u8 seq;
return -EMSGSIZE;
}
+ /* keep same headroom as the original skb */
+ headroom = skb_headroom(skb);
+
/* we've got the header */
skb_pull(skb, hlen);
/* size of message payload */
size = min(mtu - hlen, skb->len - pos);
- skb2 = alloc_skb(MCTP_HEADER_MAXLEN + hlen + size, GFP_KERNEL);
+ skb2 = alloc_skb(headroom + hlen + size, GFP_KERNEL);
if (!skb2) {
rc = -ENOMEM;
break;
skb_set_owner_w(skb2, skb->sk);
/* establish packet */
- skb_reserve(skb2, MCTP_HEADER_MAXLEN);
+ skb_reserve(skb2, headroom);
skb_reset_network_header(skb2);
skb_put(skb2, hlen + size);
skb2->transport_header = skb2->network_header + hlen;
int err, i, k;
for (i = 0; i < set->num_exprs; i++) {
- expr = kzalloc(set->exprs[i]->ops->size, GFP_KERNEL);
+ expr = kzalloc(set->exprs[i]->ops->size, GFP_KERNEL_ACCOUNT);
if (!expr)
goto err_expr;
if (!track->regs[priv->sreg].selector)
return false;
- bitwise = nft_expr_priv(expr);
+ bitwise = nft_expr_priv(track->regs[priv->dreg].selector);
if (track->regs[priv->sreg].selector == track->regs[priv->dreg].selector &&
track->regs[priv->sreg].num_reg == 0 &&
track->regs[priv->dreg].bitwise &&
if (!track->regs[priv->sreg].selector)
return false;
- bitwise = nft_expr_priv(expr);
+ bitwise = nft_expr_priv(track->regs[priv->dreg].selector);
if (track->regs[priv->sreg].selector == track->regs[priv->dreg].selector &&
track->regs[priv->dreg].bitwise &&
track->regs[priv->dreg].bitwise->ops == expr->ops &&
invert = true;
}
- priv->list = kmalloc(sizeof(*priv->list), GFP_KERNEL);
+ priv->list = kmalloc(sizeof(*priv->list), GFP_KERNEL_ACCOUNT);
if (!priv->list)
return -ENOMEM;
struct nft_counter __percpu *cpu_stats;
struct nft_counter *this_cpu;
- cpu_stats = alloc_percpu(struct nft_counter);
+ cpu_stats = alloc_percpu_gfp(struct nft_counter, GFP_KERNEL_ACCOUNT);
if (cpu_stats == NULL)
return -ENOMEM;
u64 last_jiffies;
int err;
- last = kzalloc(sizeof(*last), GFP_KERNEL);
+ last = kzalloc(sizeof(*last), GFP_KERNEL_ACCOUNT);
if (!last)
return -ENOMEM;
priv->rate);
}
- priv->limit = kmalloc(sizeof(*priv->limit), GFP_KERNEL);
+ priv->limit = kmalloc(sizeof(*priv->limit), GFP_KERNEL_ACCOUNT);
if (!priv->limit)
return -ENOMEM;
return -EOPNOTSUPP;
}
- priv->consumed = kmalloc(sizeof(*priv->consumed), GFP_KERNEL);
+ priv->consumed = kmalloc(sizeof(*priv->consumed), GFP_KERNEL_ACCOUNT);
if (!priv->consumed)
return -ENOMEM;
int rem = nla_len(attr);
bool dont_clone_flow_key;
- /* The first action is always 'OVS_CLONE_ATTR_ARG'. */
+ /* The first action is always 'OVS_CLONE_ATTR_EXEC'. */
clone_arg = nla_data(attr);
dont_clone_flow_key = nla_get_u32(clone_arg);
actions = nla_next(clone_arg, &rem);
return sfa;
}
+static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len);
+
+static void ovs_nla_free_check_pkt_len_action(const struct nlattr *action)
+{
+ const struct nlattr *a;
+ int rem;
+
+ nla_for_each_nested(a, action, rem) {
+ switch (nla_type(a)) {
+ case OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL:
+ case OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER:
+ ovs_nla_free_nested_actions(nla_data(a), nla_len(a));
+ break;
+ }
+ }
+}
+
+static void ovs_nla_free_clone_action(const struct nlattr *action)
+{
+ const struct nlattr *a = nla_data(action);
+ int rem = nla_len(action);
+
+ switch (nla_type(a)) {
+ case OVS_CLONE_ATTR_EXEC:
+ /* The real list of actions follows this attribute. */
+ a = nla_next(a, &rem);
+ ovs_nla_free_nested_actions(a, rem);
+ break;
+ }
+}
+
+static void ovs_nla_free_dec_ttl_action(const struct nlattr *action)
+{
+ const struct nlattr *a = nla_data(action);
+
+ switch (nla_type(a)) {
+ case OVS_DEC_TTL_ATTR_ACTION:
+ ovs_nla_free_nested_actions(nla_data(a), nla_len(a));
+ break;
+ }
+}
+
+static void ovs_nla_free_sample_action(const struct nlattr *action)
+{
+ const struct nlattr *a = nla_data(action);
+ int rem = nla_len(action);
+
+ switch (nla_type(a)) {
+ case OVS_SAMPLE_ATTR_ARG:
+ /* The real list of actions follows this attribute. */
+ a = nla_next(a, &rem);
+ ovs_nla_free_nested_actions(a, rem);
+ break;
+ }
+}
+
static void ovs_nla_free_set_action(const struct nlattr *a)
{
const struct nlattr *ovs_key = nla_data(a);
}
}
-void ovs_nla_free_flow_actions(struct sw_flow_actions *sf_acts)
+static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len)
{
const struct nlattr *a;
int rem;
- if (!sf_acts)
+ /* Whenever new actions are added, the need to update this
+ * function should be considered.
+ */
+ BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 23);
+
+ if (!actions)
return;
- nla_for_each_attr(a, sf_acts->actions, sf_acts->actions_len, rem) {
+ nla_for_each_attr(a, actions, len, rem) {
switch (nla_type(a)) {
- case OVS_ACTION_ATTR_SET:
- ovs_nla_free_set_action(a);
+ case OVS_ACTION_ATTR_CHECK_PKT_LEN:
+ ovs_nla_free_check_pkt_len_action(a);
+ break;
+
+ case OVS_ACTION_ATTR_CLONE:
+ ovs_nla_free_clone_action(a);
break;
+
case OVS_ACTION_ATTR_CT:
ovs_ct_free_action(a);
break;
+
+ case OVS_ACTION_ATTR_DEC_TTL:
+ ovs_nla_free_dec_ttl_action(a);
+ break;
+
+ case OVS_ACTION_ATTR_SAMPLE:
+ ovs_nla_free_sample_action(a);
+ break;
+
+ case OVS_ACTION_ATTR_SET:
+ ovs_nla_free_set_action(a);
+ break;
}
}
+}
+
+void ovs_nla_free_flow_actions(struct sw_flow_actions *sf_acts)
+{
+ if (!sf_acts)
+ return;
+ ovs_nla_free_nested_actions(sf_acts->actions, sf_acts->actions_len);
kfree(sf_acts);
}
if (!start)
return -EMSGSIZE;
- err = ovs_nla_put_actions(nla_data(attr), rem, skb);
+ /* Skipping the OVS_CLONE_ATTR_EXEC that is always the first attribute. */
+ attr = nla_next(nla_data(attr), &rem);
+ err = ovs_nla_put_actions(attr, rem, skb);
if (err)
nla_nest_cancel(skb, start);
struct rxrpc_net *rxnet = rxrpc_net(net);
rxnet->live = false;
- del_timer_sync(&rxnet->peer_keepalive_timer);
cancel_work_sync(&rxnet->peer_keepalive_work);
+ del_timer_sync(&rxnet->peer_keepalive_timer);
rxrpc_destroy_all_calls(rxnet);
rxrpc_destroy_all_connections(rxnet);
rxrpc_destroy_all_peers(rxnet);
ctx->asoc->base.sk->sk_err = -error;
return;
}
+ ctx->asoc->stats.octrlchunks++;
break;
case SCTP_CID_ABORT:
case SCTP_CID_HEARTBEAT:
if (chunk->pmtu_probe) {
- sctp_packet_singleton(ctx->transport, chunk, ctx->gfp);
+ error = sctp_packet_singleton(ctx->transport,
+ chunk, ctx->gfp);
+ if (!error)
+ ctx->asoc->stats.octrlchunks++;
break;
}
fallthrough;
if (prot->version == TLS_1_3_VERSION ||
prot->cipher_type == TLS_CIPHER_CHACHA20_POLY1305)
memcpy(iv + iv_offset, tls_ctx->rx.iv,
- crypto_aead_ivsize(ctx->aead_recv));
+ prot->iv_size + prot->salt_size);
else
memcpy(iv + iv_offset, tls_ctx->rx.iv, prot->salt_size);
s->map_cnt = %zu; \n\
s->map_skel_sz = sizeof(*s->maps); \n\
s->maps = (struct bpf_map_skeleton *)calloc(s->map_cnt, s->map_skel_sz);\n\
- if (!s->maps) \n\
+ if (!s->maps) { \n\
+ err = -ENOMEM; \n\
goto err; \n\
+ } \n\
",
map_cnt
);
s->prog_cnt = %zu; \n\
s->prog_skel_sz = sizeof(*s->progs); \n\
s->progs = (struct bpf_prog_skeleton *)calloc(s->prog_cnt, s->prog_skel_sz);\n\
- if (!s->progs) \n\
+ if (!s->progs) { \n\
+ err = -ENOMEM; \n\
goto err; \n\
+ } \n\
",
prog_cnt
);
%1$s__create_skeleton(struct %1$s *obj) \n\
{ \n\
struct bpf_object_skeleton *s; \n\
+ int err; \n\
\n\
s = (struct bpf_object_skeleton *)calloc(1, sizeof(*s));\n\
- if (!s) \n\
+ if (!s) { \n\
+ err = -ENOMEM; \n\
goto err; \n\
+ } \n\
\n\
s->sz = sizeof(*s); \n\
s->name = \"%1$s\"; \n\
return 0; \n\
err: \n\
bpf_object__destroy_skeleton(s); \n\
- return -ENOMEM; \n\
+ return err; \n\
} \n\
\n\
static inline const void *%2$s__elf_bytes(size_t *sz) \n\
\n\
obj = (struct %1$s *)calloc(1, sizeof(*obj)); \n\
if (!obj) { \n\
- errno = ENOMEM; \n\
+ err = -ENOMEM; \n\
goto err; \n\
} \n\
s = (struct bpf_object_subskeleton *)calloc(1, sizeof(*s));\n\
if (!s) { \n\
- errno = ENOMEM; \n\
+ err = -ENOMEM; \n\
goto err; \n\
} \n\
s->sz = sizeof(*s); \n\
s->var_cnt = %2$d; \n\
s->vars = (struct bpf_var_skeleton *)calloc(%2$d, sizeof(*s->vars));\n\
if (!s->vars) { \n\
- errno = ENOMEM; \n\
+ err = -ENOMEM; \n\
goto err; \n\
} \n\
",
return obj; \n\
err: \n\
%1$s__destroy(obj); \n\
+ errno = -err; \n\
return NULL; \n\
} \n\
\n\
/* Copyright (C) 2021. Huawei Technologies Co., Ltd */
#include <test_progs.h>
#include "dummy_st_ops.skel.h"
+#include "trace_dummy_st_ops.skel.h"
/* Need to keep consistent with definition in include/linux/bpf.h */
struct bpf_dummy_ops_state {
.ctx_in = args,
.ctx_size_in = sizeof(args),
);
+ struct trace_dummy_st_ops *trace_skel;
struct dummy_st_ops *skel;
int fd, err;
return;
fd = bpf_program__fd(skel->progs.test_1);
+
+ trace_skel = trace_dummy_st_ops__open();
+ if (!ASSERT_OK_PTR(trace_skel, "trace_dummy_st_ops__open"))
+ goto done;
+
+ err = bpf_program__set_attach_target(trace_skel->progs.fentry_test_1,
+ fd, "test_1");
+ if (!ASSERT_OK(err, "set_attach_target(fentry_test_1)"))
+ goto done;
+
+ err = trace_dummy_st_ops__load(trace_skel);
+ if (!ASSERT_OK(err, "load(trace_skel)"))
+ goto done;
+
+ err = trace_dummy_st_ops__attach(trace_skel);
+ if (!ASSERT_OK(err, "attach(trace_skel)"))
+ goto done;
+
err = bpf_prog_test_run_opts(fd, &attr);
ASSERT_OK(err, "test_run");
ASSERT_EQ(in_state.val, 0x5a, "test_ptr_ret");
ASSERT_EQ(attr.retval, exp_retval, "test_ret");
+ ASSERT_EQ(trace_skel->bss->val, exp_retval, "fentry_val");
+done:
dummy_st_ops__destroy(skel);
+ trace_dummy_st_ops__destroy(trace_skel);
}
static void test_dummy_multiple_args(void)
VERIFY(check_default(&array_of_maps->map, map));
inner_map = bpf_map_lookup_elem(array_of_maps, &key);
- VERIFY(inner_map != 0);
+ VERIFY(inner_map != NULL);
VERIFY(inner_map->map.max_entries == INNER_MAX_ENTRIES);
return 1;
VERIFY(check_default(&hash_of_maps->map, map));
inner_map = bpf_map_lookup_elem(hash_of_maps, &key);
- VERIFY(inner_map != 0);
+ VERIFY(inner_map != NULL);
VERIFY(inner_map->map.max_entries == INNER_MAX_ENTRIES);
return 1;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+int val = 0;
+
+SEC("fentry/test_1")
+int BPF_PROG(fentry_test_1, __u64 *st_ops_ctx)
+{
+ __u64 state;
+
+ /* Read the traced st_ops arg1 which is a pointer */
+ bpf_probe_read_kernel(&state, sizeof(__u64), (void *)st_ops_ctx);
+ /* Read state->val */
+ bpf_probe_read_kernel(&val, sizeof(__u32), (void *)state);
+
+ return 0;
+}
+
+char _license[] SEC("license") = "GPL";
#include "bpf_rlimit.h"
#include "cgroup_helpers.h"
-static int start_server(const struct sockaddr *addr, socklen_t len)
+static int start_server(const struct sockaddr *addr, socklen_t len, bool dual)
{
+ int mode = !dual;
int fd;
fd = socket(addr->sa_family, SOCK_STREAM, 0);
goto out;
}
+ if (addr->sa_family == AF_INET6) {
+ if (setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, (char *)&mode,
+ sizeof(mode)) == -1) {
+ log_err("Failed to set the dual-stack mode");
+ goto close_out;
+ }
+ }
+
if (bind(fd, addr, len) == -1) {
log_err("Failed to bind server socket");
goto close_out;
return fd;
}
-static int connect_to_server(int server_fd)
+static int connect_to_server(const struct sockaddr *addr, socklen_t len)
{
- struct sockaddr_storage addr;
- socklen_t len = sizeof(addr);
int fd = -1;
- if (getsockname(server_fd, (struct sockaddr *)&addr, &len)) {
- log_err("Failed to get server addr");
- goto out;
- }
-
- fd = socket(addr.ss_family, SOCK_STREAM, 0);
+ fd = socket(addr->sa_family, SOCK_STREAM, 0);
if (fd == -1) {
log_err("Failed to create client socket");
goto out;
}
- if (connect(fd, (const struct sockaddr *)&addr, len) == -1) {
+ if (connect(fd, (const struct sockaddr *)addr, len) == -1) {
log_err("Fail to connect to server");
goto close_out;
}
return map_fd;
}
-static int run_test(int server_fd, int results_fd, bool xdp)
+static int run_test(int server_fd, int results_fd, bool xdp,
+ const struct sockaddr *addr, socklen_t len)
{
int client = -1, srv_client = -1;
int ret = 0;
goto err;
}
- client = connect_to_server(server_fd);
+ client = connect_to_server(addr, len);
if (client == -1)
goto err;
return ret;
}
+static bool get_port(int server_fd, in_port_t *port)
+{
+ struct sockaddr_in addr;
+ socklen_t len = sizeof(addr);
+
+ if (getsockname(server_fd, (struct sockaddr *)&addr, &len)) {
+ log_err("Failed to get server addr");
+ return false;
+ }
+
+ /* sin_port and sin6_port are located at the same offset. */
+ *port = addr.sin_port;
+ return true;
+}
+
int main(int argc, char **argv)
{
struct sockaddr_in addr4;
struct sockaddr_in6 addr6;
+ struct sockaddr_in addr4dual;
+ struct sockaddr_in6 addr6dual;
int server = -1;
int server_v6 = -1;
+ int server_dual = -1;
int results = -1;
int err = 0;
bool xdp;
addr4.sin_family = AF_INET;
addr4.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
addr4.sin_port = 0;
+ memcpy(&addr4dual, &addr4, sizeof(addr4dual));
memset(&addr6, 0, sizeof(addr6));
addr6.sin6_family = AF_INET6;
addr6.sin6_addr = in6addr_loopback;
addr6.sin6_port = 0;
- server = start_server((const struct sockaddr *)&addr4, sizeof(addr4));
- if (server == -1)
+ memset(&addr6dual, 0, sizeof(addr6dual));
+ addr6dual.sin6_family = AF_INET6;
+ addr6dual.sin6_addr = in6addr_any;
+ addr6dual.sin6_port = 0;
+
+ server = start_server((const struct sockaddr *)&addr4, sizeof(addr4),
+ false);
+ if (server == -1 || !get_port(server, &addr4.sin_port))
goto err;
server_v6 = start_server((const struct sockaddr *)&addr6,
- sizeof(addr6));
- if (server_v6 == -1)
+ sizeof(addr6), false);
+ if (server_v6 == -1 || !get_port(server_v6, &addr6.sin6_port))
+ goto err;
+
+ server_dual = start_server((const struct sockaddr *)&addr6dual,
+ sizeof(addr6dual), true);
+ if (server_dual == -1 || !get_port(server_dual, &addr4dual.sin_port))
+ goto err;
+
+ if (run_test(server, results, xdp,
+ (const struct sockaddr *)&addr4, sizeof(addr4)))
goto err;
- if (run_test(server, results, xdp))
+ if (run_test(server_v6, results, xdp,
+ (const struct sockaddr *)&addr6, sizeof(addr6)))
goto err;
- if (run_test(server_v6, results, xdp))
+ if (run_test(server_dual, results, xdp,
+ (const struct sockaddr *)&addr4dual, sizeof(addr4dual)))
goto err;
printf("ok\n");
out:
close(server);
close(server_v6);
+ close(server_dual);
close(results);
return err;
}
set +e
check_nexthop "dev veth1" ""
log_test $? 0 "Nexthops removed on admin down"
+
+ # nexthop route delete warning: route add with nhid and delete
+ # using device
+ run_cmd "$IP li set dev veth1 up"
+ run_cmd "$IP nexthop add id 12 via 172.16.1.3 dev veth1"
+ out1=`dmesg | grep "WARNING:.*fib_nh_match.*" | wc -l`
+ run_cmd "$IP route add 172.16.101.1/32 nhid 12"
+ run_cmd "$IP route delete 172.16.101.1/32 dev veth1"
+ out2=`dmesg | grep "WARNING:.*fib_nh_match.*" | wc -l`
+ [ $out1 -eq $out2 ]
+ rc=$?
+ log_test $rc 0 "Delete nexthop route warning"
+ run_cmd "$IP route delete 172.16.101.1/32 nhid 12"
+ run_cmd "$IP nexthop del id 12"
}
ipv4_grp_fcnal()