linux-2.6-microblaze.git
3 years agoMerge tag 'x86_sgx_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Sun, 21 Feb 2021 03:13:18 +0000 (19:13 -0800)]
Merge tag 'x86_sgx_for_v5.12' of git://git./linux/kernel/git/tip/tip

Pull x86 SGX fixes from Borislav Petkov:
 "Random small fixes which missed the initial SGX submission. Also, some
  procedural clarifications"

* tag 'x86_sgx_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Add Dave Hansen as reviewer for INTEL SGX
  x86/sgx: Drop racy follow_pfn() check
  MAINTAINERS: Fix the tree location for INTEL SGX patches
  x86/sgx: Fix the return type of sgx_init()

3 years agoMerge tag 'efi-next-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds [Sun, 21 Feb 2021 03:09:26 +0000 (19:09 -0800)]
Merge tag 'efi-next-for-v5.12' of git://git./linux/kernel/git/tip/tip

Pull EFI updates from Ard Biesheuvel via Borislav Petkov:
 "A few cleanups left and right, some of which were part of a initrd
  measured boot series that needs some more work, and so only the
  cleanup patches have been included for this release"

* tag 'efi-next-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/arm64: Update debug prints to reflect other entropy sources
  efi: x86: clean up previous struct mm switching
  efi: x86: move mixed mode stack PA variable out of 'efi_scratch'
  efi/libstub: move TPM related prototypes into efistub.h
  efi/libstub: fix prototype of efi_tcg2_protocol::get_event_log()
  efi/libstub: whitespace cleanup
  efi: ia64: move IA64-only declarations to new asm/efi.h header

3 years agoMerge tag 'ras_updates_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Feb 2021 03:06:34 +0000 (19:06 -0800)]
Merge tag 'ras_updates_for_v5.12' of git://git./linux/kernel/git/tip/tip

Pull RAS updates from Borislav Petkov:

 - move therm_throt.c to the thermal framework, where it belongs.

 - identify CPUs which miss to enter the broadcast handler, as an
   additional debugging aid.

* tag 'ras_updates_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  thermal: Move therm_throt there from x86/mce
  x86/mce: Get rid of mcheck_intel_therm_init()
  x86/mce: Make mce_timed_out() identify holdout CPUs

3 years agoMerge tag 'edac_updates_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Feb 2021 03:02:28 +0000 (19:02 -0800)]
Merge tag 'edac_updates_for_v5.12' of git://git./linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - a couple of fixes/improvements to amd64_edac:
    * merge debugging and error injection functionality into the main driver
    * tone down info/error output
    * do not attempt to load it on F15h client hw

 - misc fixes to other drivers

* tag 'edac_updates_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/amd64: Issue probing messages only on properly detected hardware
  EDAC/xgene: Do not print a failure message to get an IRQ twice
  EDAC/ppc4xx: Convert comma to semicolon
  EDAC/amd64: Limit error injection functionality to supported hw
  EDAC/amd64: Merge error injection sysfs facilities
  EDAC/amd64: Merge sysfs debugging attributes setup code
  EDAC/amd64: Tone down messages about missing PCI IDs
  EDAC/amd64: Do not load on family 0x15, model 0x13

3 years agoMerge tag 'arm-drivers-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sun, 21 Feb 2021 02:42:28 +0000 (18:42 -0800)]
Merge tag 'arm-drivers-v5.12' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC driver updates from Arnd Bergmann:
 "Updates for SoC specific drivers include a few subsystems that have
  their own maintainers but send them through the soc tree:

  SCMI firmware:
   - add support for a completion interrupt

  Reset controllers:
   - new driver for BCM4908
   - new devm_reset_control_get_optional_exclusive_released() function

  Memory controllers:
   - Renesas RZ/G2 support
   - Tegra124 interconnect support
   - Allow more drivers to be loadable modules

  TEE/optee firmware:
   - minor code cleanup

  The other half of this is SoC specific drivers that do not belong into
  any other subsystem, most of them living in drivers/soc:

   - Allwinner/sunxi power management work
   - Allwinner H616 support

   - ASpeed AST2600 system identification support

   - AT91 SAMA7G5 SoC ID driver
   - AT91 SoC driver cleanups

   - Broadcom BCM4908 power management bus support

   - Marvell mbus cleanups

   - Mediatek MT8167 power domain support

   - Qualcomm socinfo driver support for PMIC
   - Qualcomm SoC identification for many more products

   - TI Keystone driver cleanups for PRUSS and elsewhere"

* tag 'arm-drivers-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (89 commits)
  soc: aspeed: socinfo: Add new systems
  soc: aspeed: snoop: Add clock control logic
  memory: tegra186-emc: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
  memory: samsung: exynos5422-dmc: Correct function names in kerneldoc
  memory: ti-emif-pm: Drop of_match_ptr from of_device_id table
  optee: simplify i2c access
  drivers: soc: atmel: fix type for same7
  tee: optee: remove need_resched() before cond_resched()
  soc: qcom: ocmem: don't return NULL in of_get_ocmem
  optee: sync OP-TEE headers
  tee: optee: fix 'physical' typos
  drivers: optee: use flexible-array member instead of zero-length array
  tee: fix some comment typos in header files
  soc: ti: k3-ringacc: Use of_device_get_match_data()
  soc: ti: pruss: Refactor the CFG sub-module init
  soc: mediatek: pm-domains: Don't print an error if child domain is deferred
  soc: mediatek: pm-domains: Add domain regulator supply
  dt-bindings: power: Add domain regulator supply
  soc: mediatek: cmdq: Remove cmdq_pkt_flush()
  soc: mediatek: pm-domains: Add support for mt8167
  ...

3 years agoMerge tag 'arm-dt-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sun, 21 Feb 2021 02:34:53 +0000 (18:34 -0800)]
Merge tag 'arm-dt-v5.12' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC devicetree updates from Arnd Bergmann:
 "After the last release contained a surprising amount of new 32-bit
  machines, this time two thirds of the code changes are for 64-bit.

  The usual updates to existing files include:

   - Device tree compiler warning fixes for Berlin, Renesas, SoCFPGA,
     nomadik, stm32, Allwinner, TI Keystone

   - Support for additional devices on existing machines on Renesas,
     SoCFPGA, at91, hisilicon, OMAP, Tegra, TI K3, Allwinner, Broadcom,
     ux500, Mediatek, Marvell Armada, Marvell MMP, ZynqMP, AMLogic,
     Qualcomm, i.MX, Layerscape, Actions, ASpeed, Toshiba

   - Cleanups and minor fixes for Renesas, at91, mstar, ux500, Samsung,
     stm32, Tegra, Broadcom, Mediatek, Marvell MMP, AMLogic, Qualcomm,
     i.MX, Rockchip, ASpeed, Zynq

  Only three new SoCs this time, but a number of boards across:

  Renesas:
   - Two Beacon EmbeddedWorks boards (RZ/G2H and RZ/G2N based)

  Intel SoCFPGA:
   - eASIC N5X board (N5X)

  ST-Ericsson Ux500:
   - Samsung GT-I9070 (Janice) phone (u8500)

  TI OMAP:
   - MYIR Tech Limited development board (AM335X)

  Allwinner/sunxi:
   - SL631 Action Camera (V3)
   - PineTab Early Adopter tablet (A64)

  Broadcom:
   - BCM4906 networking chip
   - Netgear R8000P router (BCM4906)

  AMLogic:
   - Hardkernel ODROID-HC4 development board (SM1)
   - Beelink GS-King-X TV Box (S922X)

  Qualcomm:
   - Snapdragon 888 / SM8350 high-end phone SoC
   - Qualcomm SDX55 5G modem as standalone SoC
   - Snapdragon MTP reference board (SM8350)
   - Snapdragon MTP reference board (SDX55)
   - Sony Kitakami phones: Xperia Z3+/Z4/Z5 (APQ8094)
   - Alcatel Idol 3 phone (MSM8916)
   - ASUS Zenfone 2 Laser phone (MSM8916)
   - BQ Aquaris X5 aka Longcheer L8910 phone (MSM8916)
   - OnePlus6 phone (SDM845)
   - OnePlus6T phone (SDM845)
   - Alfa Network AP120C-AC access point (IPQ4018)

  NXP i.MX6 (32-bit):
   - Plymovent BAS base system controller for filter systems (imx6dl)
   - Protonic MVT industrial touchscreen terminals (imx6dl)
   - Protonic PRTI6G reference board (imx6ul)
   - Kverneland UT1, UT1Q, UT1P, TGO agricultural terminals (imx6q/dl/qp)

  NXP i.MX8 (64-bit)
   - Beacon i.MX8M Nano development kit (imx8mn)
   - Boundary Devices i.MX8MM Nitrogen SBC (imx8mm)
   - Gateworks Venice i.MX 8M Mini Development Kits (imx8mm)
   - phyBOARD-Pollux-i.MX8MP (imx8mp)
   - Purism Librem5 Evergreen phone (imx8mp)
   - Kontron SMARC-sAL28 system-on-module(imx8mp)

  Rockchip:
   - NanoPi M4B Single-board computer (RK3399)
   - Radxa Rock Pi E router SBC (RK3328)

  ASpeed:
   - Ampere Mt. Jade, a BMC for an x86 server (AST2500)
   - IBM Everest, a BMC for a Power10 server (AST2600)
   - Supermicro x11spi, a BMC for an ARM server (AST2500)

  Zynq:
   - Ebang EBAZ4205, FPGA board (Zynq-7000)
   - ZynqMP zcu104 revC reference platform (ZynqMP)"

* tag 'arm-dt-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (584 commits)
  ARM: dts: aspeed: align GPIO hog names with dtschema
  ARM: dts: aspeed: fix PCA95xx GPIO expander properties on Portwell
  dt-bindings: spi: zynq: Convert Zynq QSPI binding to yaml
  arm: dts: visconti: Add DT support for Toshiba Visconti5 GPIO driver
  ARM: dts: aspeed: ast2600evb: Add enable ehci and uhci
  ARM: dts: aspeed: mowgli: Add i2c rtc device
  ARM: dts: aspeed: amd-ethanolx: Enable secondary LPC snooping address
  dt-bindings: arm: xilinx: Add missing Zturn boards
  ARM: dts: ebaz4205: add pinctrl entries for switches
  ARM: dts: add Ebang EBAZ4205 device tree
  dt-bindings: arm: add Ebang EBAZ4205 board
  dt-bindings: add ebang vendor prefix
  ARM: dts: aspeed: Add Everest BMC machine
  ARM: dts: aspeed: inspur-fp5280g2: Add ipsps1 driver
  ARM: dts: aspeed: inspur-fp5280g2: Add GPIO line names
  ARM: dts: aspeed: Add Supermicro x11spi BMC machine
  ARM: dts: aspeed: g220a: Fix some gpio
  ARM: dts: aspeed: g220a: Enable ipmb
  ARM: dts: aspeed: rainier: Add eMMC clock phase compensation
  ARM: dts: aspeed: Add LCLK to lpc-snoop
  ...

3 years agoMerge tag 'arm-defconfig-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Feb 2021 02:23:15 +0000 (18:23 -0800)]
Merge tag 'arm-defconfig-v5.12' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC defconfig updates from Arnd Bergmann:
 "As usual, a number of additional device drivers that were added to the
  kernel are now enabled in the respective configuration files.

  A few of the files get updated to match the current Kconfig files for
  removed or renamed options"

* tag 'arm-defconfig-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (50 commits)
  ARM: configs: sama5_defconfig: add QSPI driver
  ARM: configs: at91_dt_defconfig: add ov7740 module
  ARM: configs: at91_dt_defconfig: add useful helper options
  ARM: configs: at91: DT/ATAG defconfig modifications
  ARM: configs: sama5_defconfig: update and remove unneeded options
  ARM: configs: at91: enable drivers for sam9x60
  arm64: defconfig: Enable RT5659
  arm64: configs: Support DEVAPC on MediaTek platforms
  arm64: configs: Support pwrap on Mediatek MT6779 platform
  arm64: defconfig: Enable PF8x00 as builtin
  ARM: multi_v7_defconfig: add STM32 CEC support
  arm64: defconfig: Enable vibra-pwm
  ARM: omap2plus_defconfig: Update for dropped options
  ARM: omap2plus_defconfig: Update for moved options
  ARM: multi_v7_defconfig: Enable nvmem's rmem driver
  arm64: defconfig: Enable nvmem's rmem driver
  ARM: multi_v7_defconfig: Enable support for the ADC thermal sensor
  ARM: qcom_defconfig: Enable Command DB driver
  ARM: qcom_defconfig: Enable RPMh power domain driver
  ARM: qcom_defconfig: Enable ARM PSCI support
  ...

3 years agoMerge tag 'arm-soc-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sun, 21 Feb 2021 02:20:06 +0000 (18:20 -0800)]
Merge tag 'arm-soc-v5.12' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC updates from Arnd Bergmann:
 "This is mostly 32-bit code for SoC platforms, and looks smaller than
  any such branch I remember from previous kernels, as most of this is
  now handled in other subsystems for modern platforms:

   - Minor bugfixes and Kconfig updates for Tegra, Broadcom, i.MX,
     Renesas, and Samsung

   - Updates to the MAINTAINERS listing for Actions, OMAP, and Samsung

   - Samsung SoC driver updates to make them loadable modules"

* tag 'arm-soc-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  MAINTAINERS: arm: samsung: include S3C headers in platform entry
  MAINTAINERS: Add linux-actions ML for Actions Semi Arch
  ARM: s3c: irq-s3c24xx: staticize local functions
  ARM: s3c: irq-s3c24xx: include headers for missing declarations
  ARM: s3c: fix fiq for clang IAS
  ARM: imx: Remove unused IMX_GPIO_NR() macro
  soc: renesas: rcar-sysc: Mark device node OF_POPULATED after init
  ARM: OMAP2+: fix spellint typo
  MAINTAINERS: Update address for OMAP GPMC driver
  soc: renesas: rcar-sysc: Use readl_poll_timeout_atomic()
  ARM: bcm: Select BRCMSTB_L2_IRQ for bcm2835
  ARM: brcmstb: Add debug UART entry for 72116
  ARM: tegra: Don't enable unused PLLs on resume from suspend
  soc: samsung: pm_domains: Convert to regular platform driver
  soc: samsung: exynos-chipid: correct helpers __init annotation
  ARM: mach-imx: imx6ul: Print SOC revision on boot
  ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup
  soc: samsung: exynos-chipid: convert to driver and merge exynos-asv
  soc: samsung: exynos-asv: handle reading revision register error
  soc: samsung: exynos-asv: don't defer early on not-supported SoCs

3 years agoMerge tag 'arm-platform-removal-v5.12' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 21 Feb 2021 02:16:30 +0000 (18:16 -0800)]
Merge tag 'arm-platform-removal-v5.12' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC platform removals from Arnd Bergmann:
 "There are a lot of platforms that have not seen any interesting code
  changes in the past five years or more.

  I made a list and asked around which ones are no longer in use, and
  received confirmation about six ARM platforms and the TI C6x
  architecture that have all reached the end of their life upstream,
  with no known users remaining:

   - efm32 - added in 2011, first Cortex-M, no notable changes after 2013

   - picoxcell - added in 2011, abandoned after 2012 acquisition

   - prima2 - added in 20111, no notable changes since 2015

   - tango - added in 2015, sporadic changes until 2017, but abandoned

   - u300 - added in 2009, no notable changes since 2013

   - zx - added in 2015 for both 32, 2017 for 64 bit, no notable changes

   - arch/c6x - added in 2011, but work stalled soon after that

  A number of other platforms on the original list turned out to still
  have users. In some cases there are out-of-tree patches and users that
  plan to contribute them in the future, in other cases the code is
  complete and works reliably"

Link: https://lore.kernel.org/lkml/CAK8P3a2DZ8xQp7R=H=wewHnT2=a_=M53QsZOueMVEf7tOZLKNg@mail.gmail.com/
* tag 'arm-platform-removal-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: remove u300 platform
  ARM: remove tango platform
  ARM: remove zte zx platform
  ARM: remove sirf prima2/atlas platforms
  c6x: remove architecture
  MAINTAINERS: Remove deleted platform efm32
  ARM: drop efm32 platform
  ARM: Remove PicoXcell platform support
  ARM: dts: Remove PicoXcell platforms

3 years agoMerge tag 'arm-fixes-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sun, 21 Feb 2021 02:09:40 +0000 (18:09 -0800)]
Merge tag 'arm-fixes-v5.12' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "There are only two left-over remaining non-urgent ARM SoC bug fixes:

   - A build fix for the Atmel SAM9 platform to allow building with the
     clang integrated assembler

   - A DT fix for ethernet on Intel SoCFPGA, this has been broken since
     it was added in v5.4"

* tag 'arm-fixes-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: at91: use proper asm syntax in pm_suspend
  arm64: dts: agilex: fix phy interface bit shift for gmac1 and gmac2

3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Linus Torvalds [Sun, 21 Feb 2021 01:45:32 +0000 (17:45 -0800)]
Merge git://git./linux/kernel/git/netdev/net-next

Pull networking updates from David Miller:
 "Here is what we have this merge window:

   1) Support SW steering for mlx5 Connect-X6Dx, from Yevgeny Kliteynik.

   2) Add RSS multi group support to octeontx2-pf driver, from Geetha
      Sowjanya.

   3) Add support for KS8851 PHY. From Marek Vasut.

   4) Add support for GarfieldPeak bluetooth controller from Kiran K.

   5) Add support for half-duplex tcan4x5x can controllers.

   6) Add batch skb rx processing to bcrm63xx_enet, from Sieng Piaw
      Liew.

   7) Rework RX port offload infrastructure, particularly wrt, UDP
      tunneling, from Jakub Kicinski.

   8) Add BCM72116 PHY support, from Florian Fainelli.

   9) Remove Dsa specific notifiers, they are unnecessary. From Vladimir
      Oltean.

  10) Add support for picosecond rx delay in dwmac-meson8b chips. From
      Martin Blumenstingl.

  11) Support TSO on xfrm interfaces from Eyal Birger.

  12) Add support for MP_PRIO to mptcp stack, from Geliang Tang.

  13) Support BCM4908 integrated switch, from Rafał Miłecki.

  14) Support for directly accessing kernel module variables via module
      BTF info, from Andrii Naryiko.

  15) Add DASH (esktop and mobile Architecture for System Hardware)
      support to r8169 driver, from Heiner Kallweit.

  16) Add rx vlan filtering to dpaa2-eth, from Ionut-robert Aron.

  17) Add support for 100 base0x SFP devices, from Bjarni Jonasson.

  18) Support link aggregation in DSA, from Tobias Waldekranz.

  19) Support for bitwidse atomics in bpf, from Brendan Jackman.

  20) SmartEEE support in at803x driver, from Russell King.

  21) Add support for flow based tunneling to GTP, from Pravin B Shelar.

  22) Allow arbitrary number of interconnrcts in ipa, from Alex Elder.

  23) TLS RX offload for bonding, from Tariq Toukan.

  24) RX decap offklload support in mac80211, from Felix Fietkou.

  25) devlink health saupport in octeontx2-af, from George Cherian.

  26) Add TTL attr to SCM_TIMESTAMP_OPT_STATS, from Yousuk Seung

  27) Delegated actionss support in mptcp, from Paolo Abeni.

  28) Support receive timestamping when doin zerocopy tcp receive. From
      Arjun Ray.

  29) HTB offload support for mlx5, from Maxim Mikityanskiy.

  30) UDP GRO forwarding, from Maxim Mikityanskiy.

  31) TAPRIO offloading in dsa hellcreek driver, from Kurt Kanzenbach.

  32) Weighted random twos choice algorithm for ipvs, from Darby Payne.

  33) Fix netdev registration deadlock, from Johannes Berg.

  34) Various conversions to new tasklet api, from EmilRenner Berthing.

  35) Bulk skb allocations in veth, from Lorenzo Bianconi.

  36) New ethtool interface for lane setting, from Danielle Ratson.

  37) Offload failiure notifications for routes, from Amit Cohen.

  38) BCM4908 support, from Rafał Miłecki.

  39) Support several new iwlwifi chips, from Ihab Zhaika.

  40) Flow drector support for ipv6 in i40e, from Przemyslaw Patynowski.

  41) Support for mhi prrotocols, from Loic Poulain.

  42) Optimize bpf program stats.

  43) Implement RFC6056, for better port randomization, from Eric
      Dumazet.

  44) hsr tag offloading support from George McCollister.

  45) Netpoll support in qede, from Bhaskar Upadhaya.

  46) 2005/400g speed support in bonding 3ad mode, from Nikolay
      Aleksandrov.

  47) Netlink event support in mptcp, from Florian Westphal.

  48) Better skbuff caching, from Alexander Lobakin.

  49) MRP (Media Redundancy Protocol) offloading in DSA and a few
      drivers, from Horatiu Vultur.

  50) mqprio saupport in mvneta, from Maxime Chevallier.

  51) Remove of_phy_attach, no longer needed, from Florian Fainelli"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1766 commits)
  octeontx2-pf: Fix otx2_get_fecparam()
  cteontx2-pf: cn10k: Prevent harmless double shift bugs
  net: stmmac: Add PCI bus info to ethtool driver query output
  ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary
  ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.
  ptp: ptp_clockmatrix: Coding style - tighten vertical spacing.
  ptp: ptp_clockmatrix: Clean-up dev_*() messages.
  ptp: ptp_clockmatrix: Remove unused header declarations.
  ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.
  ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.
  net: stmmac: dwmac-sun8i: Add a shutdown callback
  net: stmmac: dwmac-sun8i: Minor probe function cleanup
  net: stmmac: dwmac-sun8i: Use reset_control_reset
  net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
  net: stmmac: dwmac-sun8i: Return void from PHY unpower
  r8169: use macro pm_ptr
  net: mdio: Remove of_phy_attach()
  net: mscc: ocelot: select PACKING in the Kconfig
  net: re-solve some conflicts after net -> net-next merge
  net: dsa: tag_rtl4_a: Support also egress tags
  ...

3 years agofix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy*
Al Viro [Mon, 15 Feb 2021 17:03:23 +0000 (12:03 -0500)]
fix handling of nd->depth on LOOKUP_CACHED failures in try_to_unlazy*

After switching to non-RCU mode, we want nd->depth to match the number
of entries in nd->stack[] that need eventual path_put().
legitimize_links() takes care of that on failures; unfortunately,
failure exits added for LOOKUP_CACHED do not.

We could add the logics for that into those failure exits, both in
try_to_unlazy() and in try_to_unlazy_next(), but since both checks
are immediately followed by legitimize_links() and there's no calls
of legitimize_links() other than those two...  It's easier to
move the check (and required handling of nd->depth on failure) into
legitimize_links() itself.

[caught by Jens: ... and since we are zeroing ->depth here, we need
to do drop_links() first]

Fixes: 6c6ec2b0a3e0 "fs: add support for LOOKUP_CACHED"
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 years agoi40e: Fix endianness conversions
Norbert Ciosek [Fri, 5 Feb 2021 08:48:52 +0000 (08:48 +0000)]
i40e: Fix endianness conversions

Fixes the following sparse warnings:
i40e_main.c:5953:32: warning: cast from restricted __le16
i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
i40e_main.c:8008:29:    expected unsigned int [assigned] [usertype] ipa
i40e_main.c:8008:29:    got restricted __le32 [usertype]
i40e_main.c:8008:29: warning: incorrect type in assignment (different base types)
i40e_main.c:8008:29:    expected unsigned int [assigned] [usertype] ipa
i40e_main.c:8008:29:    got restricted __le32 [usertype]
i40e_txrx.c:1950:59: warning: incorrect type in initializer (different base types)
i40e_txrx.c:1950:59:    expected unsigned short [usertype] vlan_tag
i40e_txrx.c:1950:59:    got restricted __le16 [usertype] l2tag1
i40e_txrx.c:1953:40: warning: cast to restricted __le16
i40e_xsk.c:448:38: warning: invalid assignment: |=
i40e_xsk.c:448:38:    left side has type restricted __le64
i40e_xsk.c:448:38:    right side has type int

Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower")
Fixes: 2a508c64ad27 ("i40e: fix VLAN.TCI == 0 RX HW offload")
Fixes: 3106c580fb7c ("i40e: Use batched xsk Tx interfaces to increase performance")
Fixes: 8f88b3034db3 ("i40e: Add infrastructure for queue channel support")
Signed-off-by: Norbert Ciosek <norbertx.ciosek@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoi40e: Fix add TC filter for IPv6
Mateusz Palczewski [Mon, 28 Dec 2020 10:38:00 +0000 (11:38 +0100)]
i40e: Fix add TC filter for IPv6

Fix insufficient distinction between IPv4 and IPv6 addresses
when creating a filter.
IPv4 and IPv6 are kept in the same memory area. If IPv6 is added,
then it's caught by IPv4 check, which leads to err -95.

Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agopstore: Fix typo in compression option name
Jiri Bohac [Thu, 18 Feb 2021 11:15:47 +0000 (12:15 +0100)]
pstore: Fix typo in compression option name

Both pstore_compress() and decompress_record() use a mistyped config
option name ("PSTORE_COMPRESSION" instead of "PSTORE_COMPRESS"). As
a result compression and decompression of pstore records was always
disabled.

Use the correct config option name.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Fixes: fd49e03280e5 ("pstore: Fix linking when crypto API disabled")
Acked-by: Matteo Croce <mcroce@microsoft.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210218111547.johvp5klpv3xrpnn@dwarf.suse.cz
3 years agoi40e: Fix VFs not created
Sylwester Dziedziuch [Fri, 27 Nov 2020 11:23:01 +0000 (11:23 +0000)]
i40e: Fix VFs not created

When creating VFs they were sometimes not getting resources.
It was caused by not executing i40e_reset_all_vfs due to
flag __I40E_VF_DISABLE being set on PF. Because of this
IAVF was never able to finish setup sequence never
getting reset indication from PF.
Changed test_and_set_bit __I40E_VF_DISABLE in
i40e_sync_filters_subtask to test_bit and removed clear_bit.
This function should not set this bit it should only check
if it hasn't been already set.

Fixes: a7542b876075 ("i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask")
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoi40e: Fix addition of RX filters after enabling FW LLDP agent
Mateusz Palczewski [Fri, 27 Nov 2020 10:39:03 +0000 (10:39 +0000)]
i40e: Fix addition of RX filters after enabling FW LLDP agent

Fix addition of VLAN filter for PF after enabling FW LLDP agent.
Changing LLDP Agent causes FW to re-initialize per NVM settings.
Remove default PF filter and move "Enable/Disable" to currently used
reset flag.
Without this patch PF would try to add MAC VLAN filter with default
switch filter present. This causes AQ error and sets promiscuous mode
on.

Fixes: c65e78f87f81 ("i40e: Further implementation of LLDP")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoi40e: Fix overwriting flow control settings during driver loading
Mateusz Palczewski [Tue, 24 Nov 2020 15:08:27 +0000 (15:08 +0000)]
i40e: Fix overwriting flow control settings during driver loading

During driver loading flow control settings were written to FW
using a variable which was always zero, since it was being set
only by ethtool. This behavior has been corrected and driver
no longer overwrites the default FW/NVM settings.

Fixes: 373149fc99a0 ("i40e: Decrease the scope of rtnl lock")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoi40e: Add zero-initialization of AQ command structures
Mateusz Palczewski [Fri, 20 Nov 2020 10:35:37 +0000 (10:35 +0000)]
i40e: Add zero-initialization of AQ command structures

Zero-initialize AQ command data structures to comply with
API specifications.

Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower")
Fixes: f4492db16df8 ("i40e: Add NPAR BW get and set functions")
Signed-off-by: Andrzej Sawuła <andrzej.sawula@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoi40e: Fix memory leak in i40e_probe
Keita Suzuki [Fri, 30 Oct 2020 07:14:30 +0000 (07:14 +0000)]
i40e: Fix memory leak in i40e_probe

Struct i40e_veb is allocated in function i40e_setup_pf_switch, and
stored to an array field veb inside struct i40e_pf. However when
i40e_setup_misc_vector fails, this memory leaks.

Fix this by calling exit and teardown functions.

Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoi40e: Fix flow for IPv6 next header (extension header)
Slawomir Laba [Thu, 10 Sep 2020 07:57:04 +0000 (07:57 +0000)]
i40e: Fix flow for IPv6 next header (extension header)

When a packet contains an IPv6 header with next header which is
an extension header and not a protocol one, the kernel function
skb_transport_header called with such sk_buff will return a
pointer to the extension header and not to the TCP one.

The above explained call caused a problem with packet processing
for skb with encapsulation for tunnel with I40E_TX_CTX_EXT_IP_IPV6.
The extension header was not skipped at all.

The ipv6_skip_exthdr function does check if next header of the IPV6
header is an extension header and doesn't modify the l4_proto pointer
if it points to a protocol header value so its safe to omit the
comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can
return value -1. This means that the skipping process failed
and there is something wrong with the packet so it will be dropped.

Fixes: a3fd9d8876a5 ("i40e/i40evf: Handle IPv6 extension headers in checksum offload")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
3 years agoocteontx2-pf: Fix otx2_get_fecparam()
Dan Carpenter [Wed, 17 Feb 2021 07:41:39 +0000 (10:41 +0300)]
octeontx2-pf: Fix otx2_get_fecparam()

Static checkers complained about an off by one read overflow in
otx2_get_fecparam() and we applied two conflicting fixes for it.

Correct: b0aae0bde26f ("octeontx2: Fix condition.")
  Wrong: 93efb0c65683 ("octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()")

Revert the incorrect fix.

Fixes: 93efb0c65683 ("octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agocteontx2-pf: cn10k: Prevent harmless double shift bugs
Dan Carpenter [Wed, 17 Feb 2021 06:16:20 +0000 (09:16 +0300)]
cteontx2-pf: cn10k: Prevent harmless double shift bugs

These defines are used with set_bit() and test_bit() which take a bit
number.  In other words, the code is doing:

if (BIT(BIT(1)) & pf->hw.cap_flag) {

This was done consistently so it did not cause a problem at runtime but
it's still worth fixing.

Fixes: facede8209ef ("octeontx2-pf: cn10k: Add mbox support for CN10K")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: Add PCI bus info to ethtool driver query output
Wong Vee Khee [Wed, 17 Feb 2021 09:57:05 +0000 (17:57 +0800)]
net: stmmac: Add PCI bus info to ethtool driver query output

This patch populates the PCI bus info in the ethtool driver query data.

Users will be able to view PCI bus info using 'ethtool -i <interface>'.

Signed-off-by: Wong Vee Khee <vee.khee.wong@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ptp-ptp_clockmatrix-Fix-output-1-PPS-alignment'
David S. Miller [Wed, 17 Feb 2021 21:49:26 +0000 (13:49 -0800)]
Merge branch 'ptp-ptp_clockmatrix-Fix-output-1-PPS-alignment'

Vincent Cheng says:

====================
ptp: ptp_clockmatrix: Fix output 1 PPS alignment.

This series fixes a race condition that may result in the output clock
not aligned to internal 1 PPS clock.

Part of device initialization is to align the rising edge of output
clocks to the internal rising edge of the 1 PPS clock.  If the system
APLL and DPLL are not locked when this alignment occurs, the alignment
fails and a fixed offset between the internal 1 PPS clock and the
output clock occurs.

If a clock is dynamically enabled after power-up, the output clock
also needs to be aligned to the internal 1 PPS clock.

v3:
Suggested by: Jakub Kicinski <kuba@kernel.org>
- Remove unnecessary 'err' variable
- Increase msleep()/loop accuracy by using jiffies in while()
- No empty lines between variables
- No empty lines between call and the if
- parenthesis around a == b are unnecessary
- Inconsistent \n usage in dev_()
- Remove unnecessary empty line
- Leave string format in place so static code checkers can
  validate arguments

v2:
Suggested by: Richard Cochran <richardcochran@gmail.com>
- Added const to "char * fmt"
- Break unrelated header change into separate patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary
Vincent Cheng [Wed, 17 Feb 2021 05:42:18 +0000 (00:42 -0500)]
ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary

Code clean-up.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.
Vincent Cheng [Wed, 17 Feb 2021 05:42:17 +0000 (00:42 -0500)]
ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.

Code clean-up.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_clockmatrix: Coding style - tighten vertical spacing.
Vincent Cheng [Wed, 17 Feb 2021 05:42:16 +0000 (00:42 -0500)]
ptp: ptp_clockmatrix: Coding style - tighten vertical spacing.

Code clean-up.

* Remove blank line between variable declarations.
* Remove blank line between:
err = blah(...)

if (err)
...
* Remove unnecessary blank line before/after loop constructs.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_clockmatrix: Clean-up dev_*() messages.
Vincent Cheng [Wed, 17 Feb 2021 05:42:15 +0000 (00:42 -0500)]
ptp: ptp_clockmatrix: Clean-up dev_*() messages.

Code clean-up.

* Remove unnecessary \n termination from dev_*() messages.
* Remove 'char *fmt' to define strings to stay within 80 column
  limit.  Not needed since coding guidelines increased to
  100 columns limit.
  Keeping format in place allows static code checkers to
  validate the arguments.
* Tighten up vertical spacing.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_clockmatrix: Remove unused header declarations.
Vincent Cheng [Wed, 17 Feb 2021 05:42:14 +0000 (00:42 -0500)]
ptp: ptp_clockmatrix: Remove unused header declarations.

Removed unused header declarations.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.
Vincent Cheng [Wed, 17 Feb 2021 05:42:13 +0000 (00:42 -0500)]
ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.

When enabling output using PTP_CLK_REQ_PEROUT, need to align the output
clock to the internal 1 PPS clock.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.
Vincent Cheng [Wed, 17 Feb 2021 05:42:12 +0000 (00:42 -0500)]
ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.

Part of the device initialization aligns the rising edge of the output
clock to the internal 1 PPS clock. If the system APLL and DPLL is not
locked, then the alignment will fail and there will be a fixed offset
between the internal 1 PPS clock and the output clock.

After loading the device firmware, poll the system APLL and DPLL for
locked state prior to initialization, timing out after 2 seconds.

Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'ddwmac-sun8i-cleanup-and-shutdown-hook'
David S. Miller [Wed, 17 Feb 2021 21:42:56 +0000 (13:42 -0800)]
Merge branch 'ddwmac-sun8i-cleanup-and-shutdown-hook'

Samuel Holland says:

====================
dwmac-sun8i cleanup and shutdown hook

These patches clean up some things I noticed while fixing suspend/resume
behavior. The first four are minor code improvements. The last one adds
a shutdown hook to minimize power consumption on boards without a PMIC.

Changes v1 to v2:
  - Note the assumption of exclusive reset controller access in patch 3
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: dwmac-sun8i: Add a shutdown callback
Samuel Holland [Wed, 17 Feb 2021 04:20:06 +0000 (22:20 -0600)]
net: stmmac: dwmac-sun8i: Add a shutdown callback

The Ethernet MAC and PHY are usually major consumers of power on boards
which may not be able to fully power off (those with no PMIC). Powering
down the MAC and internal PHY saves power while these boards are "off".

Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: dwmac-sun8i: Minor probe function cleanup
Samuel Holland [Wed, 17 Feb 2021 04:20:05 +0000 (22:20 -0600)]
net: stmmac: dwmac-sun8i: Minor probe function cleanup

Adjust the spacing and use an explicit "return 0" in the success path
to make the function easier to parse.

Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: dwmac-sun8i: Use reset_control_reset
Samuel Holland [Wed, 17 Feb 2021 04:20:04 +0000 (22:20 -0600)]
net: stmmac: dwmac-sun8i: Use reset_control_reset

Use the appropriate function instead of reimplementing it,
and update the error message to match the code.

Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
Samuel Holland [Wed, 17 Feb 2021 04:20:03 +0000 (22:20 -0600)]
net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check

sun8i_dwmac_unpower_internal_phy already checks if the PHY is powered,
so there is no need to do it again here.

Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: stmmac: dwmac-sun8i: Return void from PHY unpower
Samuel Holland [Wed, 17 Feb 2021 04:20:02 +0000 (22:20 -0600)]
net: stmmac: dwmac-sun8i: Return void from PHY unpower

This is a deinitialization function that always returned zero, and that
return value was always ignored. Have it return void instead.

Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agor8169: use macro pm_ptr
Heiner Kallweit [Wed, 17 Feb 2021 21:23:58 +0000 (22:23 +0100)]
r8169: use macro pm_ptr

Use macro pm_ptr(), this helps to avoid some ifdeffery.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Wed, 17 Feb 2021 21:19:24 +0000 (13:19 -0800)]
Merge git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Add two helper functions to release one table and hooks from
   the netns and netlink event path.

2) Add table ownership infrastructure, this new infrastructure allows
   users to bind a table (and its content) to a process through the
   netlink socket.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mdio: Remove of_phy_attach()
Florian Fainelli [Wed, 17 Feb 2021 20:25:57 +0000 (12:25 -0800)]
net: mdio: Remove of_phy_attach()

We have no in-tree users, also update the sfp-phylink.rst documentation
to indicate that phy_attach_direct() is used instead of of_phy_attach().

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mscc: ocelot: select PACKING in the Kconfig
Vladimir Oltean [Wed, 17 Feb 2021 20:33:48 +0000 (22:33 +0200)]
net: mscc: ocelot: select PACKING in the Kconfig

Ocelot now uses include/linux/dsa/ocelot.h which makes use of
CONFIG_PACKING to pack/unpack bits into the Injection/Extraction Frame
Headers. So it needs to explicitly select it, otherwise there might be
build errors due to the missing dependency.

Fixes: 40d3f295b5fe ("net: mscc: ocelot: use common tag parsing code with DSA")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosched,x86: Allow !PREEMPT_DYNAMIC
Peter Zijlstra [Tue, 9 Feb 2021 21:02:33 +0000 (22:02 +0100)]
sched,x86: Allow !PREEMPT_DYNAMIC

Allow building x86 with PREEMPT_DYNAMIC=n, this is needed for
PREEMPT_RT as it makes no sense to not have full preemption on
PREEMPT_RT.

Fixes: 8c98e8cf723c ("preempt/dynamic: Provide preempt_schedule[_notrace]() static calls")
Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Mike Galbraith <efault@gmx.de>
Link: https://lkml.kernel.org/r/YCK1+JyFNxQnWeXK@hirez.programming.kicks-ass.net
3 years agoentry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point
Frederic Weisbecker [Sun, 31 Jan 2021 23:05:48 +0000 (00:05 +0100)]
entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point

Following the idle loop model, cleanly check for pending rcuog wakeup
before the last rescheduling point upon resuming to guest mode. This
way we can avoid to do it from rcu_user_enter() with the last resort
self-IPI hack that enforces rescheduling.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-6-frederic@kernel.org
3 years agoentry: Explicitly flush pending rcuog wakeup before last rescheduling point
Frederic Weisbecker [Sun, 31 Jan 2021 23:05:47 +0000 (00:05 +0100)]
entry: Explicitly flush pending rcuog wakeup before last rescheduling point

Following the idle loop model, cleanly check for pending rcuog wakeup
before the last rescheduling point on resuming to user mode. This
way we can avoid to do it from rcu_user_enter() with the last resort
self-IPI hack that enforces rescheduling.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-5-frederic@kernel.org
3 years agorcu/nocb: Trigger self-IPI on late deferred wake up before user resume
Frederic Weisbecker [Sun, 31 Jan 2021 23:05:46 +0000 (00:05 +0100)]
rcu/nocb: Trigger self-IPI on late deferred wake up before user resume

Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP
kthread (rcuog) to be serviced.

Unfortunately the call to rcu_user_enter() is already past the last
rescheduling opportunity before we resume to userspace or to guest mode.
We may escape there with the woken task ignored.

The ultimate resort to fix every callsites is to trigger a self-IPI
(nohz_full depends on arch to implement arch_irq_work_raise()) that will
trigger a reschedule on IRQ tail or guest exit.

Eventually every site that want a saner treatment will need to carefully
place a call to rcu_nocb_flush_deferred_wakeup() before the last explicit
need_resched() check upon resume.

Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf)
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-4-frederic@kernel.org
3 years agorcu/nocb: Perform deferred wake up before last idle's need_resched() check
Frederic Weisbecker [Sun, 31 Jan 2021 23:05:45 +0000 (00:05 +0100)]
rcu/nocb: Perform deferred wake up before last idle's need_resched() check

Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP
kthread (rcuog) to be serviced.

Usually a local wake up happening while running the idle task is handled
in one of the need_resched() checks carefully placed within the idle
loop that can break to the scheduler.

Unfortunately the call to rcu_idle_enter() is already beyond the last
generic need_resched() check and we may halt the CPU with a resched
request unhandled, leaving the task hanging.

Fix this with splitting the rcuog wakeup handling from rcu_idle_enter()
and place it before the last generic need_resched() check in the idle
loop. It is then assumed that no call to call_rcu() will be performed
after that in the idle loop until the CPU is put in low power mode.

Fixes: 96d3fd0d315a (rcu: Break call_rcu() deadlock involving scheduler and perf)
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-3-frederic@kernel.org
3 years agorcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers
Frederic Weisbecker [Sun, 31 Jan 2021 23:05:44 +0000 (00:05 +0100)]
rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers

Deferred wakeup of rcuog kthreads upon RCU idle mode entry is going to
be handled differently whether initiated by idle, user or guest. Prepare
with pulling that control up to rcu_eqs_enter() callers.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210131230548.32970-2-frederic@kernel.org
3 years agosched/features: Distinguish between NORMAL and DEADLINE hrtick
Juri Lelli [Mon, 8 Feb 2021 07:35:54 +0000 (08:35 +0100)]
sched/features: Distinguish between NORMAL and DEADLINE hrtick

The HRTICK feature has traditionally been servicing configurations that
need precise preemptions point for NORMAL tasks. More recently, the
feature has been extended to also service DEADLINE tasks with stringent
runtime enforcement needs (e.g., runtime < 1ms with HZ=1000).

Enabling HRTICK sched feature currently enables the additional timer and
task tick for both classes, which might introduced undesired overhead
for no additional benefit if one needed it only for one of the cases.

Separate HRTICK sched feature in two (and leave the traditional case
name unmodified) so that it can be selectively enabled when needed.

With:

  $ echo HRTICK > /sys/kernel/debug/sched_features

the NORMAL/fair hrtick gets enabled.

With:

  $ echo HRTICK_DL > /sys/kernel/debug/sched_features

the DEADLINE hrtick gets enabled.

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210208073554.14629-3-juri.lelli@redhat.com
3 years agosched/features: Fix hrtick reprogramming
Juri Lelli [Mon, 8 Feb 2021 07:35:53 +0000 (08:35 +0100)]
sched/features: Fix hrtick reprogramming

Hung tasks and RCU stall cases were reported on systems which were not
100% busy. Investigation of such unexpected cases (no sign of potential
starvation caused by tasks hogging the system) pointed out that the
periodic sched tick timer wasn't serviced anymore after a certain point
and that caused all machinery that depends on it (timers, RCU, etc.) to
stop working as well. This issues was however only reproducible if
HRTICK was enabled.

Looking at core dumps it was found that the rbtree of the hrtimer base
used also for the hrtick was corrupted (i.e. next as seen from the base
root and actual leftmost obtained by traversing the tree are different).
Same base is also used for periodic tick hrtimer, which might get "lost"
if the rbtree gets corrupted.

Much alike what described in commit 1f71addd34f4c ("tick/sched: Do not
mess with an enqueued hrtimer") there is a race window between
hrtimer_set_expires() in hrtick_start and hrtimer_start_expires() in
__hrtick_restart() in which the former might be operating on an already
queued hrtick hrtimer, which might lead to corruption of the base.

Use hrtick_start() (which removes the timer before enqueuing it back) to
ensure hrtick hrtimer reprogramming is entirely guarded by the base
lock, so that no race conditions can occur.

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210208073554.14629-2-juri.lelli@redhat.com
3 years agosched/deadline: Reduce rq lock contention in dl_add_task_root_domain()
Dietmar Eggemann [Tue, 19 Jan 2021 08:35:42 +0000 (09:35 +0100)]
sched/deadline: Reduce rq lock contention in dl_add_task_root_domain()

dl_add_task_root_domain() is called during sched domain rebuild:

  rebuild_sched_domains_locked()
    partition_and_rebuild_sched_domains()
      rebuild_root_domains()
         for all top_cpuset descendants:
           update_tasks_root_domain()
             for all tasks of cpuset:
               dl_add_task_root_domain()

Change it so that only the task pi lock is taken to check if the task
has a SCHED_DEADLINE (DL) policy. In case that p is a DL task take the
rq lock as well to be able to safely de-reference root domain's DL
bandwidth structure.

Most of the tasks will have another policy (namely SCHED_NORMAL) and
can now bail without taking the rq lock.

One thing to note here: Even in case that there aren't any DL user
tasks, a slow frequency switching system with cpufreq gov schedutil has
a DL task (sugov) per frequency domain running which participates in DL
bandwidth management.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20210119083542.19856-1-dietmar.eggemann@arm.com
3 years agouprobes: (Re)add missing get_uprobe() in __find_uprobe()
Sven Schnelle [Tue, 9 Feb 2021 15:07:11 +0000 (16:07 +0100)]
uprobes: (Re)add missing get_uprobe() in __find_uprobe()

commit c6bc9bd06dff ("rbtree, uprobes: Use rbtree helpers")
accidentally removed the refcount increase. Add it again.

Fixes: c6bc9bd06dff ("rbtree, uprobes: Use rbtree helpers")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210209150711.36778-1-svens@linux.ibm.com
3 years agosmp: Process pending softirqs in flush_smp_call_function_from_idle()
Sebastian Andrzej Siewior [Sat, 23 Jan 2021 20:10:25 +0000 (21:10 +0100)]
smp: Process pending softirqs in flush_smp_call_function_from_idle()

send_call_function_single_ipi() may wake an idle CPU without sending an
IPI. The woken up CPU will process the SMP-functions in
flush_smp_call_function_from_idle(). Any raised softirq from within the
SMP-function call will not be processed.
Should the CPU have no tasks assigned, then it will go back to idle with
pending softirqs and the NOHZ will rightfully complain.

Process pending softirqs on return from flush_smp_call_function_queue().

Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210123201027.3262800-2-bigeasy@linutronix.de
3 years agosched: Harden PREEMPT_DYNAMIC
Peter Zijlstra [Mon, 25 Jan 2021 15:26:50 +0000 (16:26 +0100)]
sched: Harden PREEMPT_DYNAMIC

Use the new EXPORT_STATIC_CALL_TRAMP() / static_call_mod() to unexport
the static_call_key for the PREEMPT_DYNAMIC calls such that modules
can no longer update these calls.

Having modules change/hi-jack the preemption calls would be horrible.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
3 years agostatic_call: Allow module use without exposing static_call_key
Josh Poimboeuf [Wed, 27 Jan 2021 23:18:37 +0000 (17:18 -0600)]
static_call: Allow module use without exposing static_call_key

When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module
can use static_call_update() to change the function called.  This is
not desirable in general.

Not exporting static_call_key however also disallows usage of
static_call(), since objtool needs the key to construct the
static_call_site.

Solve this by allowing objtool to create the static_call_site using
the trampoline address when it builds a module and cannot find the
static_call_key symbol. The module loader will then try and map the
trampole back to a key before it constructs the normal sites list.

Doing this requires a trampoline -> key associsation, so add another
magic section that keeps those.

Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
3 years agosched: Add /debug/sched_preempt
Peter Zijlstra [Fri, 22 Jan 2021 12:01:58 +0000 (13:01 +0100)]
sched: Add /debug/sched_preempt

Add a debugfs file to muck about with the preempt mode at runtime.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/YAsGiUYf6NyaTplX@hirez.programming.kicks-ass.net
3 years agopreempt/dynamic: Support dynamic preempt with preempt= boot option
Peter Zijlstra (Intel) [Mon, 18 Jan 2021 14:12:23 +0000 (15:12 +0100)]
preempt/dynamic: Support dynamic preempt with preempt= boot option

Support the preempt= boot option and patch the static call sites
accordingly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-9-frederic@kernel.org
3 years agopreempt/dynamic: Provide irqentry_exit_cond_resched() static call
Peter Zijlstra (Intel) [Mon, 18 Jan 2021 14:12:22 +0000 (15:12 +0100)]
preempt/dynamic: Provide irqentry_exit_cond_resched() static call

Provide static call to control IRQ preemption (called in CONFIG_PREEMPT)
so that we can override its behaviour when preempt= is overriden.

Since the default behaviour is full preemption, its call is
initialized to provide IRQ preemption when preempt= isn't passed.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-8-frederic@kernel.org
3 years agopreempt/dynamic: Provide preempt_schedule[_notrace]() static calls
Peter Zijlstra (Intel) [Mon, 18 Jan 2021 14:12:21 +0000 (15:12 +0100)]
preempt/dynamic: Provide preempt_schedule[_notrace]() static calls

Provide static calls to control preempt_schedule[_notrace]()
(called in CONFIG_PREEMPT) so that we can override their behaviour when
preempt= is overriden.

Since the default behaviour is full preemption, both their calls are
initialized to the arch provided wrapper, if any.

[fweisbec: only define static calls when PREEMPT_DYNAMIC, make it less
           dependent on x86 with __preempt_schedule_func]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-7-frederic@kernel.org
3 years agopreempt/dynamic: Provide cond_resched() and might_resched() static calls
Peter Zijlstra (Intel) [Mon, 18 Jan 2021 14:12:20 +0000 (15:12 +0100)]
preempt/dynamic: Provide cond_resched() and might_resched() static calls

Provide static calls to control cond_resched() (called in !CONFIG_PREEMPT)
and might_resched() (called in CONFIG_PREEMPT_VOLUNTARY) to that we
can override their behaviour when preempt= is overriden.

Since the default behaviour is full preemption, both their calls are
ignored when preempt= isn't passed.

  [fweisbec: branch might_resched() directly to __cond_resched(), only
             define static calls when PREEMPT_DYNAMIC]

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-6-frederic@kernel.org
3 years agopreempt: Introduce CONFIG_PREEMPT_DYNAMIC
Michal Hocko [Mon, 18 Jan 2021 14:12:19 +0000 (15:12 +0100)]
preempt: Introduce CONFIG_PREEMPT_DYNAMIC

Preemption mode selection is currently hardcoded on Kconfig choices.
Introduce a dedicated option to tune preemption flavour at boot time,

This will be only available on architectures efficiently supporting
static calls in order not to tempt with the feature against additional
overhead that might be prohibitive or undesirable.

CONFIG_PREEMPT_DYNAMIC is automatically selected by CONFIG_PREEMPT if
the architecture provides the necessary support (CONFIG_STATIC_CALL_INLINE,
CONFIG_GENERIC_ENTRY, and provide with __preempt_schedule_function() /
__preempt_schedule_notrace_function()).

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
[peterz: relax requirement to HAVE_STATIC_CALL]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-5-frederic@kernel.org
3 years agostatic_call: Provide DEFINE_STATIC_CALL_RET0()
Frederic Weisbecker [Mon, 18 Jan 2021 14:12:17 +0000 (15:12 +0100)]
static_call: Provide DEFINE_STATIC_CALL_RET0()

DECLARE_STATIC_CALL() must pass the original function targeted for a
given static call. But DEFINE_STATIC_CALL() may want to initialize it as
off. In this case we can't pass NULL (for functions without return value)
or __static_call_return0 (for functions returning a value) directly
to DEFINE_STATIC_CALL() as that may trigger a static call redeclaration
with a different function prototype. Type casts neither can work around
that as they don't get along with typeof().

The proper way to do that for functions that don't return a value is
to use DEFINE_STATIC_CALL_NULL(). But functions returning a actual value
don't have an equivalent yet.

Provide DEFINE_STATIC_CALL_RET0() to solve this situation.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-3-frederic@kernel.org
3 years agostatic_call/x86: Add __static_call_return0()
Peter Zijlstra [Mon, 18 Jan 2021 14:12:16 +0000 (15:12 +0100)]
static_call/x86: Add __static_call_return0()

Provide a stub function that return 0 and wire up the static call site
patching to replace the CALL with a single 5 byte instruction that
clears %RAX, the return value register.

The function can be cast to any function pointer type that has a
single %RAX return (including pointers). Also provide a version that
returns an int for convenience. We are clearing the entire %RAX register
in any case, whether the return value is 32 or 64 bits, since %RAX is
always a scratch register anyway.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-2-frederic@kernel.org
3 years agostatic_call: Pull some static_call declarations to the type headers
Peter Zijlstra [Mon, 18 Jan 2021 14:12:18 +0000 (15:12 +0100)]
static_call: Pull some static_call declarations to the type headers

Some static call declarations are going to be needed on low level header
files. Move the necessary material to the dedicated static call types
header to avoid inclusion dependency hell.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210118141223.123667-4-frederic@kernel.org
3 years agosched/core: Update task_prio() function header
Dietmar Eggemann [Thu, 28 Jan 2021 13:10:40 +0000 (14:10 +0100)]
sched/core: Update task_prio() function header

The description of the RT offset and the values for 'normal' tasks needs
update. Moreover there are DL tasks now.
task_prio() has to stay like it is to guarantee compatibility with the
/proc/<pid>/stat priority field:

  # cat /proc/<pid>/stat | awk '{ print $18; }'

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-4-dietmar.eggemann@arm.com
3 years agosched: Remove USER_PRIO, TASK_USER_PRIO and MAX_USER_PRIO
Dietmar Eggemann [Thu, 28 Jan 2021 13:10:39 +0000 (14:10 +0100)]
sched: Remove USER_PRIO, TASK_USER_PRIO and MAX_USER_PRIO

The only remaining use of MAX_USER_PRIO (and USER_PRIO) is the
SCALE_PRIO() definition in the PowerPC Cell architecture's Synergistic
Processor Unit (SPU) scheduler. TASK_USER_PRIO isn't used anymore.

Commit fe443ef2ac42 ("[POWERPC] spusched: Dynamic timeslicing for
SCHED_OTHER") copied SCALE_PRIO() from the task scheduler in v2.6.23.

Commit a4ec24b48dde ("sched: tidy up SCHED_RR") removed it from the task
scheduler in v2.6.24.

Commit 3ee237dddcd8 ("sched/prio: Add 3 macros of MAX_NICE, MIN_NICE and
NICE_WIDTH in prio.h") introduced NICE_WIDTH much later.

With:

  MAX_USER_PRIO = USER_PRIO(MAX_PRIO)

                = MAX_PRIO - MAX_RT_PRIO

       MAX_PRIO = MAX_RT_PRIO + NICE_WIDTH

  MAX_USER_PRIO = MAX_RT_PRIO + NICE_WIDTH - MAX_RT_PRIO

  MAX_USER_PRIO = NICE_WIDTH

MAX_USER_PRIO can be replaced by NICE_WIDTH to be able to remove all the
{*_}USER_PRIO defines.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-3-dietmar.eggemann@arm.com
3 years agosched: Remove MAX_USER_RT_PRIO
Dietmar Eggemann [Thu, 28 Jan 2021 13:10:38 +0000 (14:10 +0100)]
sched: Remove MAX_USER_RT_PRIO

Commit d46523ea32a7 ("[PATCH] fix MAX_USER_RT_PRIO and MAX_RT_PRIO")
was introduced due to a a small time period in which the realtime patch
set was using different values for MAX_USER_RT_PRIO and MAX_RT_PRIO.

This is no longer true, i.e. now MAX_RT_PRIO == MAX_USER_RT_PRIO.

Get rid of MAX_USER_RT_PRIO and make everything use MAX_RT_PRIO
instead.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20210128131040.296856-2-dietmar.eggemann@arm.com
3 years agosched/topology: Fix sched_domain_topology_level alloc in sched_init_numa()
Dietmar Eggemann [Mon, 1 Feb 2021 09:53:53 +0000 (10:53 +0100)]
sched/topology: Fix sched_domain_topology_level alloc in sched_init_numa()

Commit "sched/topology: Make sched_init_numa() use a set for the
deduplicating sort" allocates 'i + nr_levels (level)' instead of
'i + nr_levels + 1' sched_domain_topology_level.

This led to an Oops (on Arm64 juno with CONFIG_SCHED_DEBUG):

sched_init_domains
  build_sched_domains()
    __free_domain_allocs()
      __sdt_free() {
...
        for_each_sd_topology(tl)
  ...
          sd = *per_cpu_ptr(sdd->sd, j); <--
  ...
      }

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Barry Song <song.bao.hua@hisilicon.com>
Link: https://lkml.kernel.org/r/6000e39e-7d28-c360-9cd6-8798fd22a9bf@arm.com
3 years agorbtree, timerqueue: Use rb_add_cached()
Peter Zijlstra [Wed, 29 Apr 2020 15:07:53 +0000 (17:07 +0200)]
rbtree, timerqueue: Use rb_add_cached()

Reduce rbtree boiler plate by using the new helpers.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
3 years agorbtree, rtmutex: Use rb_add_cached()
Peter Zijlstra [Wed, 29 Apr 2020 15:29:58 +0000 (17:29 +0200)]
rbtree, rtmutex: Use rb_add_cached()

Reduce rbtree boiler plate by using the new helpers.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
3 years agorbtree, uprobes: Use rbtree helpers
Peter Zijlstra [Wed, 29 Apr 2020 15:06:27 +0000 (17:06 +0200)]
rbtree, uprobes: Use rbtree helpers

Reduce rbtree boilerplate by using the new helpers.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
3 years agorbtree, perf: Use new rbtree helpers
Peter Zijlstra [Wed, 29 Apr 2020 15:05:15 +0000 (17:05 +0200)]
rbtree, perf: Use new rbtree helpers

Reduce rbtree boiler plate by using the new helpers.

One noteworthy change is unification of the various (partial) compare
functions. We construct a subtree match by forcing the sub-order to
always match, see __group_cmp().

Due to 'const' we had to touch cgroup_id().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
3 years agorbtree, sched/deadline: Use rb_add_cached()
Peter Zijlstra [Wed, 29 Apr 2020 15:04:41 +0000 (17:04 +0200)]
rbtree, sched/deadline: Use rb_add_cached()

Reduce rbtree boiler plate by using the new helpers.

Make rb_add_cached() / rb_erase_cached() return a pointer to the
leftmost node to aid in updating additional state.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
3 years agorbtree, sched/fair: Use rb_add_cached()
Peter Zijlstra [Wed, 29 Apr 2020 15:04:12 +0000 (17:04 +0200)]
rbtree, sched/fair: Use rb_add_cached()

Reduce rbtree boiler plate by using the new helper function.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
3 years agorbtree: Add generic add and find helpers
Peter Zijlstra [Wed, 29 Apr 2020 15:03:22 +0000 (17:03 +0200)]
rbtree: Add generic add and find helpers

I've always been bothered by the endless (fragile) boilerplate for
rbtree, and I recently wrote some rbtree helpers for objtool and
figured I should lift them into the kernel and use them more widely.

Provide:

partial-order; less() based:
 - rb_add(): add a new entry to the rbtree
 - rb_add_cached(): like rb_add(), but for a rb_root_cached

total-order; cmp() based:
 - rb_find(): find an entry in an rbtree
 - rb_find_add(): find an entry, and add if not found

 - rb_find_first(): find the first (leftmost) matching entry
 - rb_next_match(): continue from rb_find_first()
 - rb_for_each(): iterate a sub-tree using the previous two

Inlining and constant propagation should see the compiler inline the
whole thing, including the various compare functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
3 years agosched/fair: Merge select_idle_core/cpu()
Mel Gorman [Wed, 27 Jan 2021 13:52:03 +0000 (13:52 +0000)]
sched/fair: Merge select_idle_core/cpu()

Both select_idle_core() and select_idle_cpu() do a loop over the same
cpumask. Observe that by clearing the already visited CPUs, we can
fold the iteration and iterate a core at a time.

All we need to do is remember any non-idle CPU we encountered while
scanning for an idle core. This way we'll only iterate every CPU once.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210127135203.19633-5-mgorman@techsingularity.net
3 years agosched/fair: Remove select_idle_smt()
Mel Gorman [Mon, 25 Jan 2021 08:59:08 +0000 (08:59 +0000)]
sched/fair: Remove select_idle_smt()

In order to make the next patch more readable, and to quantify the
actual effectiveness of this pass, start by removing it.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20210125085909.4600-4-mgorman@techsingularity.net
3 years agoMerge tag 'v5.11' into sched/core, to pick up fixes & refresh the branch
Ingo Molnar [Wed, 17 Feb 2021 13:04:39 +0000 (14:04 +0100)]
Merge tag 'v5.11' into sched/core, to pick up fixes & refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
3 years agoMerge branch 'perf/kprobes' into perf/core, to pick up finished branch
Ingo Molnar [Wed, 17 Feb 2021 10:50:11 +0000 (11:50 +0100)]
Merge branch 'perf/kprobes' into perf/core, to pick up finished branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
3 years agonet: re-solve some conflicts after net -> net-next merge
Jakub Kicinski [Wed, 17 Feb 2021 06:58:44 +0000 (22:58 -0800)]
net: re-solve some conflicts after net -> net-next merge

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
David S. Miller [Wed, 17 Feb 2021 01:30:20 +0000 (17:30 -0800)]
Merge git://git./linux/kernel/git/netdev/net

3 years agonet: dsa: tag_rtl4_a: Support also egress tags
Linus Walleij [Tue, 16 Feb 2021 23:55:42 +0000 (00:55 +0100)]
net: dsa: tag_rtl4_a: Support also egress tags

Support also transmitting frames using the custom "8899 A"
4 byte tag.

Qingfang came up with the solution: we need to pad the
ethernet frame to 60 bytes using eth_skb_pad(), then the
switch will happily accept frames with custom tags.

Cc: Mauri Sandberg <sandberg@mailfence.com>
Reported-by: DENG Qingfang <dqfext@gmail.com>
Fixes: efd7fe68f0c6 ("net: dsa: tag_rtl4_a: Implement Realtek 4 byte A tag")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'broadcom-next'
David S. Miller [Tue, 16 Feb 2021 23:23:24 +0000 (15:23 -0800)]
Merge branch 'broadcom-next'

Robert Hancock says:

====================
Broadcom PHY driver updates

Updates to the Broadcom PHY driver related to use with copper SFP modules.

Changed since v3:
-fixed kerneldoc error

Changed since v2:
-Create flag for PHY on SFP module and use that rather than accessing
 attached_dev directly in PHY driver

Changed since v1:
-Reversed conditional to reduce indentation
-Added missing setting of MII_BCM54XX_AUXCTL_MISC_WREN in
 MII_BCM54XX_AUXCTL_SHDWSEL_MISC register
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: broadcom: Do not modify LED configuration for SFP module PHYs
Robert Hancock [Tue, 16 Feb 2021 22:54:54 +0000 (16:54 -0600)]
net: phy: broadcom: Do not modify LED configuration for SFP module PHYs

bcm54xx_config_init was modifying the PHY LED configuration to enable link
and activity indications. However, some SFP modules (such as Bel-Fuse
SFP-1GBT-06) have no LEDs but use the LED outputs to control the SFP LOS
signal, and modifying the LED settings will cause the LOS output to
malfunction. Skip this configuration for PHYs which are bound to an SFP
bus.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: Add is_on_sfp_module flag and phy_on_sfp helper
Robert Hancock [Tue, 16 Feb 2021 22:54:53 +0000 (16:54 -0600)]
net: phy: Add is_on_sfp_module flag and phy_on_sfp helper

Add a flag and helper function to indicate that a PHY device is part of
an SFP module, which is set on attach. This can be used by PHY drivers
to handle SFP-specific quirks or behavior.

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: phy: broadcom: Set proper 1000BaseX/SGMII interface mode for BCM54616S
Robert Hancock [Tue, 16 Feb 2021 22:54:52 +0000 (16:54 -0600)]
net: phy: broadcom: Set proper 1000BaseX/SGMII interface mode for BCM54616S

The default configuration for the BCM54616S PHY may not match the desired
mode when using 1000BaseX or SGMII interface modes, such as when it is on
an SFP module. Add code to explicitly set the correct mode using
programming sequences provided by Bel-Fuse:

https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-05-series.pdf
https://www.belfuse.com/resources/datasheets/powersolutions/ds-bps-sfp-1gbt-06-series.pdf

Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agolan743x: sync only the received area of an rx ring buffer
Sven Van Asbroeck [Tue, 16 Feb 2021 01:08:03 +0000 (20:08 -0500)]
lan743x: sync only the received area of an rx ring buffer

On cpu architectures w/o dma cache snooping, dma_unmap() is a
is a very expensive operation, because its resulting sync
needs to invalidate cpu caches.

Increase efficiency/performance by syncing only those sections
of the lan743x's rx ring buffers that are actually in use.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agolan743x: boost performance on cpu archs w/o dma cache snooping
Sven Van Asbroeck [Tue, 16 Feb 2021 01:08:02 +0000 (20:08 -0500)]
lan743x: boost performance on cpu archs w/o dma cache snooping

The buffers in the lan743x driver's receive ring are always 9K,
even when the largest packet that can be received (the mtu) is
much smaller. This performs particularly badly on cpu archs
without dma cache snooping (such as ARM): each received packet
results in a 9K dma_{map|unmap} operation, which is very expensive
because cpu caches need to be invalidated.

Careful measurement of the driver rx path on armv7 reveals that
the cpu spends the majority of its time waiting for cache
invalidation.

Optimize by keeping the rx ring buffer size as close as possible
to the mtu. This limits the amount of cache that requires
invalidation.

This optimization would normally force us to re-allocate all
ring buffers when the mtu is changed - a disruptive event,
because it can only happen when the network interface is down.

Remove the need to re-allocate all ring buffers by adding support
for multi-buffer frames. Now any combination of mtu and ring
buffer size will work. When the mtu changes from mtu1 to mtu2,
consumed buffers of size mtu1 are lazily replaced by newly
allocated buffers of size mtu2.

These optimizations double the rx performance on armv7.
Third parties report 3x rx speedup on armv8.

Tested with iperf3 on a freescale imx6qp + lan7430, both sides
set to mtu 1500 bytes, measure rx performance:

Before:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-20.00  sec   550 MBytes   231 Mbits/sec    0
After:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-20.00  sec  1.33 GBytes   570 Mbits/sec    0

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Reviewed-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: enetc: fix destroyed phylink dereference during unbind
Vladimir Oltean [Tue, 16 Feb 2021 10:16:28 +0000 (12:16 +0200)]
net: enetc: fix destroyed phylink dereference during unbind

The following call path suggests that calling unregister_netdev on an
interface that is up will first bring it down.

enetc_pf_remove
-> unregister_netdev
   -> unregister_netdevice_queue
      -> unregister_netdevice_many
         -> dev_close_many
            -> __dev_close_many
               -> enetc_close
                  -> enetc_stop
                     -> phylink_stop

However, enetc first destroys the phylink instance, then calls
unregister_netdev. This is already dissimilar to the setup (and error
path teardown path) from enetc_pf_probe, but more than that, it is buggy
because it is invalid to call phylink_stop after phylink_destroy.

So let's first unregister the netdev (and let the .ndo_stop events
consume themselves), then destroy the phylink instance, then free the
netdev.

Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'net-mvneta-implement-basic-MQPrio-support'
David S. Miller [Tue, 16 Feb 2021 23:03:26 +0000 (15:03 -0800)]
Merge branch 'net-mvneta-implement-basic-MQPrio-support'

Maxime Chevallier says:

====================
net: mvneta: implement basic MQPrio support

This is V2 for the MQPrio support in mvneta.

This small series adds basic support for mqprio offloading, by having
the rx queueing mirroring the TCs based on VLAN prio fields.

This was tested on Armada 3700, and proves useful to make sure
high-priority traffic has a better chance not getting dropped when
there's lots of packets incoming.

The first patch of the series deals with the per-cpu interrupts on the
armada 3700. Since they don't work, there were already some patches
applied to keep all queue mappings to CPU0, but there still were some
remaining mappings left to be dealt with.

The second patch implements the MQPrio offloading for the receive path.

Changes in V2 :
 - Add a Fixes tag for the first patch
 - Fix some warnings and the xmas tree in the second patch
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mvneta: Implement mqprio support
Maxime Chevallier [Tue, 16 Feb 2021 09:25:36 +0000 (10:25 +0100)]
net: mvneta: Implement mqprio support

Implement a basic MQPrio support, inserting rules in RX that translate
the TC to prio mapping into vlan prio to queues.

The TX logic stays the same as when we don't offload the qdisc.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mvneta: Remove per-cpu queue mapping for Armada 3700
Maxime Chevallier [Tue, 16 Feb 2021 09:25:35 +0000 (10:25 +0100)]
net: mvneta: Remove per-cpu queue mapping for Armada 3700

According to Errata #23 "The per-CPU GbE interrupt is limited to Core
0", we can't use the per-cpu interrupt mechanism on the Armada 3700
familly.

This is correctly checked for RSS configuration, but the initial queue
mapping is still done by having the queues spread across all the CPUs in
the system, both in the init path and in the cpu_hotplug path.

Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: sched: fix police ext initialization
Vlad Buslov [Tue, 16 Feb 2021 16:22:00 +0000 (18:22 +0200)]
net: sched: fix police ext initialization

When police action is created by cls API tcf_exts_validate() first
conditional that calls tcf_action_init_1() directly, the action idr is not
updated according to latest changes in action API that require caller to
commit newly created action to idr with tcf_idr_insert_many(). This results
such action not being accessible through act API and causes crash reported
by syzbot:

==================================================================
BUG: KASAN: null-ptr-deref in instrument_atomic_read include/linux/instrumented.h:71 [inline]
BUG: KASAN: null-ptr-deref in atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
BUG: KASAN: null-ptr-deref in __tcf_idr_release net/sched/act_api.c:178 [inline]
BUG: KASAN: null-ptr-deref in tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
Read of size 4 at addr 0000000000000010 by task kworker/u4:5/204

CPU: 0 PID: 204 Comm: kworker/u4:5 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 __kasan_report mm/kasan/report.c:400 [inline]
 kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413
 check_memory_region_inline mm/kasan/generic.c:179 [inline]
 check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
 instrument_atomic_read include/linux/instrumented.h:71 [inline]
 atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
 __tcf_idr_release net/sched/act_api.c:178 [inline]
 tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
 tc_action_net_exit include/net/act_api.h:151 [inline]
 police_exit_net+0x168/0x360 net/sched/act_police.c:390
 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
==================================================================
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 204 Comm: kworker/u4:5 Tainted: G    B             5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 panic+0x306/0x73d kernel/panic.c:231
 end_report+0x58/0x5e mm/kasan/report.c:100
 __kasan_report mm/kasan/report.c:403 [inline]
 kasan_report.cold+0x67/0xd5 mm/kasan/report.c:413
 check_memory_region_inline mm/kasan/generic.c:179 [inline]
 check_memory_region+0x13d/0x180 mm/kasan/generic.c:185
 instrument_atomic_read include/linux/instrumented.h:71 [inline]
 atomic_read include/asm-generic/atomic-instrumented.h:27 [inline]
 __tcf_idr_release net/sched/act_api.c:178 [inline]
 tcf_idrinfo_destroy+0x129/0x1d0 net/sched/act_api.c:598
 tc_action_net_exit include/net/act_api.h:151 [inline]
 police_exit_net+0x168/0x360 net/sched/act_police.c:390
 ops_exit_list+0x10d/0x160 net/core/net_namespace.c:190
 cleanup_net+0x4ea/0xb10 net/core/net_namespace.c:604
 process_one_work+0x98d/0x15f0 kernel/workqueue.c:2275
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421
 kthread+0x3b1/0x4a0 kernel/kthread.c:292
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Kernel Offset: disabled

Fix the issue by calling tcf_idr_insert_many() after successful action
initialization.

Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together")
Reported-by: syzbot+151e3e714d34ae4ce7e8@syzkaller.appspotmail.com
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox...
David S. Miller [Tue, 16 Feb 2021 22:53:30 +0000 (14:53 -0800)]
Merge branch 'mlx5-next' of git://git./linux/kernel/git/mellanox/linux

Saeed Mahameed says:
====================
pull-request: mlx5-next 2021-02-16

The patches in this pr are already submitted and reviewed through the
netdev and rdma mailing lists.

The series includes mlx5 HW bits and definitions for mlx5 real time clock
translation and handling in the mlx5 driver clock module to enable and
support such mode [1]

[1] https://patchwork.kernel.org/project/netdevbpf/patch/20210212223042.449816-7-saeed@kernel.org/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodrivers: net: xilinx_emaclite: remove arch limitation
Gary Guo [Tue, 16 Feb 2021 22:33:42 +0000 (22:33 +0000)]
drivers: net: xilinx_emaclite: remove arch limitation

The changes made in eccd540 is enough for xilinx_emaclite to run
without problem on 64-bit systems. I have tested it on a Xilinx
FPGA with RV64 softcore. The architecture limitation in Kconfig
seems no longer necessary.

A small change is included to print address with %lx instead of
casting to int and print with %x.

Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge branch 'bridge-mrp-Extend-br_mrp_switchdev_'
David S. Miller [Tue, 16 Feb 2021 22:47:46 +0000 (14:47 -0800)]
Merge branch 'bridge-mrp-Extend-br_mrp_switchdev_'

Horatiu Vulturv says:

====================
bridge: mrp: Extend br_mrp_switchdev_*

This patch series extends MRP switchdev to allow the SW to have a better
understanding if the HW can implement the MRP functionality or it needs
to help the HW to run it. There are 3 cases:
- when HW can't implement at all the functionality.
- when HW can implement a part of the functionality but needs the SW
  implement the rest. For example if it can't detect when it stops
  receiving MRP Test frames but it can copy the MRP frames to CPU to
  allow the SW to determine this.  Another example is generating the MRP
  Test frames. If HW can't do that then the SW is used as backup.
- when HW can implement completely the functionality.

So, initially the SW tries to offload the entire functionality in HW, if
that fails it tries offload parts of the functionality in HW and use the
SW as helper and if also this fails then MRP can't run on this HW.

Based on these new calls, implement the switchdev for Ocelot driver. This
is an example where the HW can't run completely the functionality but it
can help the SW to run it, by trapping all MRP frames to CPU.

Also this patch series adds MRP support to DSA and implements the Felix
driver which just reuse the Ocelot functions. This part was just compiled
tested because I don't have any HW on which to do the actual tests.

v4:
 - remove ifdef MRP from include/net/switchdev.h
 - move MRP implementation for Ocelot in a different file such that
   Felix driver can use it.
 - extend DSA with MRP support
 - implement MRP support for Felix.
v3:
 - implement the switchdev calls needed by Ocelot driver.
v2:
 - fix typos in comments and in commit messages
 - remove some of the comments
 - move repeated code in helper function
 - fix issue when deleting a node when sw_backup was true
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: felix: Add support for MRP
Horatiu Vultur [Tue, 16 Feb 2021 21:42:05 +0000 (22:42 +0100)]
net: dsa: felix: Add support for MRP

Implement functions 'port_mrp_add', 'port_mrp_del',
'port_mrp_add_ring_role' and 'port_mrp_del_ring_role' to call the mrp
functions from ocelot.

Also all MRP frames that arrive to CPU on queue number OCELOT_MRP_CPUQ
will be forward by the SW.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: add MRP support
Horatiu Vultur [Tue, 16 Feb 2021 21:42:04 +0000 (22:42 +0100)]
net: dsa: add MRP support

Add support for offloading MRP in HW. Currently implement the switchdev
calls 'SWITCHDEV_OBJ_ID_MRP', 'SWITCHDEV_OBJ_ID_RING_ROLE_MRP',
to allow to create MRP instances and to set the role of these instances.

Add DSA_NOTIFIER_MRP_ADD/DEL and DSA_NOTIFIER_MRP_ADD/DEL_RING_ROLE
which calls to .port_mrp_add/del and .port_mrp_add/del_ring_role in the
DSA driver for the switch.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: mscc: ocelot: Add support for MRP
Horatiu Vultur [Tue, 16 Feb 2021 21:42:03 +0000 (22:42 +0100)]
net: mscc: ocelot: Add support for MRP

Add basic support for MRP. The HW will just trap all MRP frames on the
ring ports to CPU and allow the SW to process them. In this way it is
possible to for this node to behave both as MRM and MRC.

Current limitations are:
- it doesn't support Interconnect roles.
- it supports only a single ring.
- the HW should be able to do forwarding of MRP Test frames so the SW
  will not need to do this. So it would be able to have the role MRC
  without SW support.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agobridge: mrp: Update br_mrp to use new return values of br_mrp_switchdev
Horatiu Vultur [Tue, 16 Feb 2021 21:42:02 +0000 (22:42 +0100)]
bridge: mrp: Update br_mrp to use new return values of br_mrp_switchdev

Check the return values of the br_mrp_switchdev function.
In case of:
- BR_MRP_NONE, return the error to userspace,
- BR_MRP_SW, continue with SW implementation,
- BR_MRP_HW, continue without SW implementation,

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>