Pavel Begunkov [Mon, 9 Aug 2021 12:04:07 +0000 (13:04 +0100)]
io-wq: improve wq_list_add_tail()
Prepare nodes that we're going to add before actually linking them, it's
always safer and costs us nothing.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/f7e53f0c84c02ed6748c488ed0789b98f8cc6185.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 9 Aug 2021 12:04:06 +0000 (13:04 +0100)]
io_uring: remove unnecessary PF_EXITING check
We prefer nornal task_works even if it would fail requests inside. Kill
a PF_EXITING check in io_req_task_work_add(), task_work_add() handles
well dying tasks, i.e. return error when can't enqueue due to late
stages of do_exit().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/fc14297e8441cd8f5d1743a2488cf0df09bf48ac.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 9 Aug 2021 12:04:05 +0000 (13:04 +0100)]
io_uring: clean io-wq callbacks
Move io-wq callbacks closer to each other, so it's easier to work with
them, and rename io_free_work() into io_wq_free_work() for consistency.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/851bbc7f0f86f206d8c1333efee8bcb9c26e419f.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 9 Aug 2021 12:04:04 +0000 (13:04 +0100)]
io_uring: avoid touching inode in rw prep
If we use fixed files, we can be sure (almost) that REQ_F_ISREG is set.
However, for non-reg files io_prep_rw() still will look into inode to
double check, and that's expensive and can be avoided.
The only caveat is that it only currently works with 64+ bit
architectures, see FFS_ISREG, so we should consider that.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0a62780c491ca2522cd52db4ae3f16e03aafed0f.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 9 Aug 2021 12:04:03 +0000 (13:04 +0100)]
io_uring: rename io_file_supports_async()
io_file_supports_async() checks whether a file supports nowait
operations, so "async" in the name is misleading. Rename it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/33d55b5ce43aa1884c637c1957f1e30d30dc3bec.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 9 Aug 2021 12:04:02 +0000 (13:04 +0100)]
io_uring: inline fixed part of io_file_get()
Optimise io_file_get() with registered files, which is in a hot path,
by inlining parts of the function. Saves a function call, and
inefficiencies of passing arguments, e.g. evaluating
(sqe_flags & IOSQE_FIXED_FILE).
It couldn't have been done before as compilers were refusing to inline
it because of the function size.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/52115cd6ce28f33bd0923149c0e6cb611084a0b1.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Mon, 9 Aug 2021 12:04:01 +0000 (13:04 +0100)]
io_uring: use kvmalloc for fixed files
Instead of hand-coded two-level tables for registered files, allocate
them with kvmalloc(). In many cases small enough tables are enough, and
so can be kmalloc()'ed removing an extra memory load and a bunch of bit
logic instructions from the hot path. If the table is larger, we trade
off all the pros with a TLB-assisted memory lookup.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/280421d3b48775dabab773006bb5588c7b2dabc0.1628471125.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 6 Aug 2021 20:04:31 +0000 (14:04 -0600)]
io_uring: be smarter about waking multiple CQ ring waiters
Currently we only wake the first waiter, even if we have enough entries
posted to satisfy multiple waiters. Improve that situation so that
every waiter knows how much the CQ tail has to advance before they can
be safely woken up.
With this change, if we have N waiters each asking for 1 event and we get
4 completions, then we wake up 4 waiters. If we have N waiters asking
for 2 completions and we get 4 completions, then we wake up the first
two. Previously, only the first waiter would've been woken up.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 4 Aug 2021 14:37:25 +0000 (08:37 -0600)]
io-wq: remove GFP_ATOMIC allocation off schedule out path
Daniel reports that the v5.14-rc4-rt4 kernel throws a BUG when running
stress-ng:
| [ 90.202543] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:35
| [ 90.202549] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2047, name: iou-wrk-2041
| [ 90.202555] CPU: 5 PID: 2047 Comm: iou-wrk-2041 Tainted: G W 5.14.0-rc4-rt4+ #89
| [ 90.202559] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
| [ 90.202561] Call Trace:
| [ 90.202577] dump_stack_lvl+0x34/0x44
| [ 90.202584] ___might_sleep.cold+0x87/0x94
| [ 90.202588] rt_spin_lock+0x19/0x70
| [ 90.202593] ___slab_alloc+0xcb/0x7d0
| [ 90.202598] ? newidle_balance.constprop.0+0xf5/0x3b0
| [ 90.202603] ? dequeue_entity+0xc3/0x290
| [ 90.202605] ? io_wqe_dec_running.isra.0+0x98/0xe0
| [ 90.202610] ? pick_next_task_fair+0xb9/0x330
| [ 90.202612] ? __schedule+0x670/0x1410
| [ 90.202615] ? io_wqe_dec_running.isra.0+0x98/0xe0
| [ 90.202618] kmem_cache_alloc_trace+0x79/0x1f0
| [ 90.202621] io_wqe_dec_running.isra.0+0x98/0xe0
| [ 90.202625] io_wq_worker_sleeping+0x37/0x50
| [ 90.202628] schedule+0x30/0xd0
| [ 90.202630] schedule_timeout+0x8f/0x1a0
| [ 90.202634] ? __bpf_trace_tick_stop+0x10/0x10
| [ 90.202637] io_wqe_worker+0xfd/0x320
| [ 90.202641] ? finish_task_switch.isra.0+0xd3/0x290
| [ 90.202644] ? io_worker_handle_work+0x670/0x670
| [ 90.202646] ? io_worker_handle_work+0x670/0x670
| [ 90.202649] ret_from_fork+0x22/0x30
which is due to the RT kernel not liking a GFP_ATOMIC allocation inside
a raw spinlock. Besides that not working on RT, doing any kind of
allocation from inside schedule() is kind of nasty and should be avoided
if at all possible.
This particular path happens when an io-wq worker goes to sleep, and we
need a new worker to handle pending work. We currently allocate a small
data item to hold the information we need to create a new worker, but we
can instead include this data in the io_worker struct itself and just
protect it with a single bit lock. We only really need one per worker
anyway, as we will have run pending work between to sleep cycles.
https://lore.kernel.org/lkml/
20210804082418.fbibprcwtzyt5qax@beryllium.lan/
Reported-by: Daniel Wagner <dwagner@suse.de>
Tested-by: Daniel Wagner <dwagner@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sun, 22 Aug 2021 21:24:56 +0000 (14:24 -0700)]
Linux 5.14-rc7
Linus Torvalds [Sun, 22 Aug 2021 16:49:31 +0000 (09:49 -0700)]
Merge tag 'powerpc-5.14-6' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix random crashes on some 32-bit CPUs by adding isync() after
locking/unlocking KUEP
- Fix intermittent crashes when loading modules with strict module RWX
- Fix a section mismatch introduce by a previous fix.
Thanks to Christophe Leroy, Fabiano Rosas, Laurent Vivier, Murilo
Opsfelder Araújo, Nathan Chancellor, and Stan Johnson.
h# -----BEGIN PGP SIGNATURE-----
* tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix set_memory_*() against concurrent accesses
powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP
powerpc/xive: Do not mark xive_request_ipi() as __init
Linus Torvalds [Sat, 21 Aug 2021 18:27:16 +0000 (11:27 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk driver fixes from Stephen Boyd:
- Make the regulator state match the GDSC power domain state at boot on
Qualcomm SoCs so that the regulator isn't turned off inadvertently.
- Fix earlycon on i.MX6Q SoCs
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: gdsc: Ensure regulator init state matches GDSC state
clk: imx6q: fix uart earlycon unwork
Linus Torvalds [Sat, 21 Aug 2021 18:22:10 +0000 (11:22 -0700)]
Merge tag 'char-misc-5.14-rc7' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small driver fixes for 5.14-rc7.
They consist of:
- revert for an interconnect patch that was found to have problems
- ipack tpci200 driver fixes for reported problems
- slimbus messaging and ngd fixes for reported problems
All are small and have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
ipack: tpci200: fix memory leak in the tpci200_register
ipack: tpci200: fix many double free issues in tpci200_pci_probe
slimbus: ngd: reset dma setup during runtime pm
slimbus: ngd: set correct device for pm
slimbus: messaging: check for valid transaction id
slimbus: messaging: start transaction ids from 1 instead of zero
Revert "interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate"
Linus Torvalds [Sat, 21 Aug 2021 18:10:06 +0000 (11:10 -0700)]
Merge tag 'usb-5.14-rc7' of git://git./linux/kernel/git/gregkh/usb
Pull USB fix from Greg KH:
"Here is a single USB typec tcpm fix for a reported problem for
5.14-rc7. It showed up in 5.13 and resolves an issue that Hans found.
It has been in linux-next this week with no reported problems"
* tag 'usb-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers
Linus Torvalds [Sat, 21 Aug 2021 18:04:26 +0000 (11:04 -0700)]
Merge tag 'riscv-for-linus-5.14-rc7' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- fix the sifive-l2-cache device tree bindings for json-schema
compatibility. This does not change the intended behavior of the
binding.
- avoid improperly freeing necessary resources during early boot.
* tag 'riscv-for-linus-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix a number of free'd resources in init_resources()
dt-bindings: sifive-l2-cache: Fix 'select' matching
Linus Torvalds [Sat, 21 Aug 2021 17:56:06 +0000 (10:56 -0700)]
Merge tag 's390-5.14-5' of git://git./linux/kernel/git/s390/linux
Pull s390 fix from Vasily Gorbik:
- fix use after free of zpci_dev in pci code
* tag 's390-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pci: fix use after free of zpci_dev
Linus Torvalds [Sat, 21 Aug 2021 17:50:22 +0000 (10:50 -0700)]
Merge tag 'locks-v5.14' of git://git./linux/kernel/git/jlayton/linux
Pull mandatory file locking deprecation warning from Jeff Layton:
"As discussed on the list, this patch just adds a new warning for folks
who still have mandatory locking enabled and actually mount with '-o
mand'. I'd like to get this in for v5.14 so we can push this out into
stable kernels and hopefully reach folks who have mounts with -o mand.
For now, I'm operating under the assumption that we'll fully remove
this support in v5.15, but we can move that out if any legitimate
users of this facility speak up between now and then"
* tag 'locks-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fs: warn about impending deprecation of mandatory locks
Linus Torvalds [Sat, 21 Aug 2021 15:11:22 +0000 (08:11 -0700)]
Merge tag 'block-5.14-2021-08-20' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Three fixes from Ming Lei that should go into 5.14:
- Fix for a kernel panic when iterating over tags for some cases
where a flush request is present, a regression in this cycle.
- Request timeout fix
- Fix flush request checking"
* tag 'block-5.14-2021-08-20' of git://git.kernel.dk/linux-block:
blk-mq: fix is_flush_rq
blk-mq: fix kernel panic during iterating over flush request
blk-mq: don't grab rq's refcount in blk_mq_check_expired()
Linus Torvalds [Sat, 21 Aug 2021 15:06:26 +0000 (08:06 -0700)]
Merge tag 'io_uring-5.14-2021-08-20' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A few small fixes that should go into this release:
- Fix never re-assigning an initial error value for io_uring_enter()
for SQPOLL, if asked to do nothing
- Fix xa_alloc_cycle() return value checking, for cases where we have
wrapped around
- Fix for a ctx pin issue introduced in this cycle (Pavel)"
* tag 'io_uring-5.14-2021-08-20' of git://git.kernel.dk/linux-block:
io_uring: fix xa_alloc_cycle() error return value check
io_uring: pin ctx on fallback execution
io_uring: only assign io_uring_enter() SQPOLL error in actual error case
Jeff Layton [Fri, 20 Aug 2021 13:29:50 +0000 (09:29 -0400)]
fs: warn about impending deprecation of mandatory locks
We've had CONFIG_MANDATORY_FILE_LOCKING since 2015 and a lot of distros
have disabled it. Warn the stragglers that still use "-o mand" that
we'll be dropping support for that mount option.
Cc: stable@vger.kernel.org
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Jens Axboe [Fri, 20 Aug 2021 20:53:59 +0000 (14:53 -0600)]
io_uring: fix xa_alloc_cycle() error return value check
We currently check for ret != 0 to indicate error, but '1' is a valid
return and just indicates that the allocation succeeded with a wrap.
Correct the check to be for < 0, like it was before the xarray
conversion.
Cc: stable@vger.kernel.org
Fixes:
61cf93700fe6 ("io_uring: Convert personality_idr to XArray")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 20 Aug 2021 20:44:25 +0000 (13:44 -0700)]
Merge tag 'acpi-5.14-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix two mistakes in new code.
Specifics:
- Prevent confusing messages from being printed if the PRMT table is
not present or there are no PRM modules (Aubrey Li).
- Fix the handling of suspend-to-idle entry and exit in the case when
the Microsoft UUID is used with the Low-Power S0 Idle _DSM
interface (Mario Limonciello)"
* tag 'acpi-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: s2idle: Invert Microsoft UUID entry and exit
ACPI: PRM: Deal with table not present or no module found
Linus Torvalds [Fri, 20 Aug 2021 20:38:42 +0000 (13:38 -0700)]
Merge tag 'pm-5.14-rc7' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix some issues in the ARM cpufreq drivers and in the operating
performance points (OPP) framework.
Specifics:
- Fix useless WARN() in the OPP core and prevent a noisy warning
from being printed by OPP _put functions (Dmitry Osipenko).
- Fix error path when allocation failed in the arm_scmi cpufreq
driver (Lukasz Luba).
- Blacklist Qualcomm sc8180x and Qualcomm sm8150 in
cpufreq-dt-platdev (Bjorn Andersson, Thara Gopinath).
- Forbid cpufreq for 1.2 GHz variant in the armada-37xx cpufreq
driver (Marek Behún)"
* tag 'pm-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
opp: Drop empty-table checks from _put functions
cpufreq: armada-37xx: forbid cpufreq for 1.2 GHz variant
cpufreq: blocklist Qualcomm sm8150 in cpufreq-dt-platdev
cpufreq: arm_scmi: Fix error path when allocation failed
opp: remove WARN when no valid OPPs remain
cpufreq: blacklist Qualcomm sc8180x in cpufreq-dt-platdev
Linus Torvalds [Fri, 20 Aug 2021 20:08:56 +0000 (13:08 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"10 patches.
Subsystems affected by this patch series: MAINTAINERS and mm (shmem,
pagealloc, tracing, memcg, memory-failure, vmscan, kfence, and
hugetlb)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
hugetlb: don't pass page cache pages to restore_reserve_on_error
kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
mm: vmscan: fix missing psi annotation for node_reclaim()
mm/hwpoison: retry with shake_page() for unhandlable pages
mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
MAINTAINERS: update ClangBuiltLinux IRC chat
mmflags.h: add missing __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON names
mm/page_alloc: don't corrupt pcppage_migratetype
Revert "mm: swap: check if swap backing device is congested or not"
Revert "mm/shmem: fix shmem_swapin() race with swapoff"
Linus Torvalds [Fri, 20 Aug 2021 19:59:54 +0000 (12:59 -0700)]
Merge tag 'drm-fixes-2021-08-20-3' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regularly scheduled fixes. The ttm one solves a problem of GPU drivers
failing to load if debugfs is off in Kconfig, otherwise the i915 and
mediatek, and amdgpu fixes all fairly normal.
Nouveau has a couple of display fixes, but it has a fix for a
longstanding race condition in it's memory manager code, and the fix
mostly removes some code that wasn't working properly and has no
userspace users. This fix makes the diffstat kinda larger but in a
good (negative line-count) way.
core:
- fix drm_wait_vblank uapi copying bug
ttm:
- fix debugfs init when debugfs is off
amdgpu:
- vega10 SMU workload fix
- DCN VM fix
- DCN 3.01 watermark fix
amdkfd:
- SVM fix
nouveau:
- ampere display fixes
- remove MM misfeature to fix a longstanding race condition
i915:
- tweaked display workaround for all PCHs
- eDP MSO pipe sanity for ADL-P fix
- remove unused symbol export
mediatek:
- AAL output size setting
- Delete component in remove function"
* tag 'drm-fixes-2021-08-20-3' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Use DCN30 watermark calc for DCN301
drm/i915/dp: remove superfluous EXPORT_SYMBOL()
drm/i915/edp: fix eDP MSO pipe sanity checks for ADL-P
drm/i915: Tweaked Wa_14010685332 for all PCHs
drm/nouveau: rip out nvkm_client.super
drm/nouveau: block a bunch of classes from userspace
drm/nouveau/fifo/nv50-: rip out dma channels
drm/nouveau/kms/nv50: workaround EFI GOP window channel format differences
drm/nouveau/disp: power down unused DP links during init
drm/nouveau: recognise GA107
drm: Copy drm_wait_vblank to user before returning
drm/amd/display: Ensure DCN save after VM setup
drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failure
drm/amd/pm: change the workload type for some cards
Revert "drm/amd/pm: fix workload mismatch on vega10"
drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir fails
drm/mediatek: Add component_del in OVL and COLOR remove function
drm/mediatek: Add AAL output size configuration
Linus Torvalds [Fri, 20 Aug 2021 19:51:37 +0000 (12:51 -0700)]
Merge tag 'pci-v5.14-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Add Rahul Tanwar as Intel LGM Gateway PCIe maintainer (Rahul Tanwar)
- Add Jim Quinlan et al as Broadcom STB PCIe maintainers (Jim Quinlan)
- Increase D3hot-to-D0 delay for AMD Renoir/Cezanne XHCI (Marcin
Bachry)
- Correct iomem_get_mapping() usage for legacy_mem sysfs (Krzysztof
Wilczyński)
* tag 'pci-v5.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/sysfs: Use correct variable for the legacy_mem sysfs object
PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
MAINTAINERS: Add Jim Quinlan et al as Broadcom STB PCIe maintainers
MAINTAINERS: Add Rahul Tanwar as Intel LGM Gateway PCIe maintainer
Linus Torvalds [Fri, 20 Aug 2021 19:46:00 +0000 (12:46 -0700)]
Merge tag 'mmc-v5.14-rc4' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- dw_mmc: Fix hang on data CRC error
- mmci: Fix voltage switch procedure for the stm32 variant
- sdhci-iproc: Fix some clock issues for BCM2711
- sdhci-msm: Fixup software timeout value
* tag 'mmc-v5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711
mmc: sdhci-iproc: Cap min clock frequency on BCM2711
mmc: sdhci-msm: Update the software timeout value for sdhc
mmc: mmci: stm32: Check when the voltage switch procedure should be done
mmc: dw_mmc: Fix hang on data CRC error
Linus Torvalds [Fri, 20 Aug 2021 19:31:10 +0000 (12:31 -0700)]
Merge tag 'sound-5.14-rc7-2' of git://git./linux/kernel/git/tiwai/sound
Pull more sound fixes from Takashi Iwai:
"This is a quick follow up for 5.14: a fix for a very recently
introduced regression on ASoC Intel Atom driver, and another trivial
HD-audio quirk for HP laptops"
* tag 'sound-5.14-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: intel: atom: Fix breakage for PCM buffer address setup
ALSA: hda/realtek: Limit mic boost on HP ProBook 445 G8
Linus Torvalds [Fri, 20 Aug 2021 19:18:49 +0000 (12:18 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
- Fix cleaning of vDSO directories
- Ensure CNTHCTL_EL2 is fully initialised when booting at EL2
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: initialize all of CNTHCTL_EL2
arm64: clean vdso & vdso32 files
Rafael J. Wysocki [Fri, 20 Aug 2021 19:11:43 +0000 (21:11 +0200)]
Merge branch 'acpi-pm'
* acpi-pm:
ACPI: PM: s2idle: Invert Microsoft UUID entry and exit
Linus Torvalds [Fri, 20 Aug 2021 19:11:33 +0000 (12:11 -0700)]
Merge tag 'iommu-fixes-v5.14-rc6' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Fix for a potential NULL-ptr dereference in IOMMU core code
- Two resource leak fixes
- Cache flush fix in the Intel VT-d driver
* tag 'iommu-fixes-v5.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Fix incomplete cache flush in intel_pasid_tear_down_entry()
iommu/vt-d: Fix PASID reference leak
iommu: Check if group is NULL before remove device
iommu/dma: Fix leak in non-contiguous API
Rafael J. Wysocki [Fri, 20 Aug 2021 19:11:16 +0000 (21:11 +0200)]
Merge branch 'pm-opp'
* pm-opp:
opp: Drop empty-table checks from _put functions
opp: remove WARN when no valid OPPs remain
Mike Kravetz [Fri, 20 Aug 2021 02:04:33 +0000 (19:04 -0700)]
hugetlb: don't pass page cache pages to restore_reserve_on_error
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].
This BUG triggers if the HPageRestoreReserve flag is set on a page in
the page cache. It should never be set, as the routine
huge_add_to_page_cache explicitly clears the flag after adding a page to
the cache.
The only code other than huge page allocation which sets the flag is
restore_reserve_on_error. It will potentially set the flag in rare out
of memory conditions. syzbot was injecting errors to cause memory
allocation errors which exercised this specific path.
The code in restore_reserve_on_error is doing the right thing. However,
there are instances where pages in the page cache were being passed to
restore_reserve_on_error. This is incorrect, as once a page goes into
the cache reservation information will not be modified for the page
until it is removed from the cache. Error paths do not remove pages
from the cache, so even in the case of error, the page will remain in
the cache and no reservation adjustment is needed.
Modify routines that potentially call restore_reserve_on_error with a
page cache page to no longer do so.
Note on fixes tag: Prior to commit
846be08578ed ("mm/hugetlb: expand
restore_reserve_on_error functionality") the routine would not process
page cache pages because the HPageRestoreReserve flag is not set on such
pages. Therefore, this issue could not be trigggered. The code added
by commit
846be08578ed ("mm/hugetlb: expand restore_reserve_on_error
functionality") is needed and correct. It exposed incorrect calls to
restore_reserve_on_error which is the root cause addressed by this
commit.
[1] https://lore.kernel.org/linux-mm/
00000000000050776d05c9b7c7f0@google.com/
Link: https://lkml.kernel.org/r/20210818213304.37038-1-mike.kravetz@oracle.com
Fixes:
846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: <syzbot+67654e51e54455f1c585@syzkaller.appspotmail.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Marco Elver [Fri, 20 Aug 2021 02:04:30 +0000 (19:04 -0700)]
kfence: fix is_kfence_address() for addresses below KFENCE_POOL_SIZE
Originally the addr != NULL check was meant to take care of the case
where __kfence_pool == NULL (KFENCE is disabled). However, this does
not work for addresses where addr > 0 && addr < KFENCE_POOL_SIZE.
This can be the case on NULL-deref where addr > 0 && addr < PAGE_SIZE or
any other faulting access with addr < KFENCE_POOL_SIZE. While the
kernel would likely crash, the stack traces and report might be
confusing due to double faults upon KFENCE's attempt to unprotect such
an address.
Fix it by just checking that __kfence_pool != NULL instead.
Link: https://lkml.kernel.org/r/20210818130300.2482437-1-elver@google.com
Fixes:
0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <elver@google.com>
Reported-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 20 Aug 2021 02:04:27 +0000 (19:04 -0700)]
mm: vmscan: fix missing psi annotation for node_reclaim()
In a debugging session the other day, Rik noticed that node_reclaim()
was missing memstall annotations. This means we'll miss pressure and
lost productivity resulting from reclaim on an overloaded local NUMA
node when vm.zone_reclaim_mode is enabled.
There haven't been any reports, but that's likely because
vm.zone_reclaim_mode hasn't been a commonly used feature recently, and
the intersection between such setups and psi users is probably nil.
But secondary memory such as CXL-connected DIMMS, persistent memory etc,
and the page demotion patches that handle them
(https://lore.kernel.org/lkml/
20210401183216.
443C4443@viggo.jf.intel.com/)
could soon make this a more common codepath again.
Link: https://lkml.kernel.org/r/20210818152457.35846-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Fri, 20 Aug 2021 02:04:24 +0000 (19:04 -0700)]
mm/hwpoison: retry with shake_page() for unhandlable pages
HWPoisonHandlable() sometimes returns false for typical user pages due
to races with average memory events like transfers over LRU lists. This
causes failures in hwpoison handling.
There's retry code for such a case but does not work because the retry
loop reaches the retry limit too quickly before the page settles down to
handlable state. Let get_any_page() call shake_page() to fix it.
[naoya.horiguchi@nec.com: get_any_page(): return -EIO when retry limit reached]
Link: https://lkml.kernel.org/r/20210819001958.2365157-1-naoya.horiguchi@linux.dev
Link: https://lkml.kernel.org/r/20210817053703.2267588-1-naoya.horiguchi@linux.dev
Fixes:
25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org> [5.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 20 Aug 2021 02:04:21 +0000 (19:04 -0700)]
mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
We've noticed occasional OOM killing when memory.low settings are in
effect for cgroups. This is unexpected and undesirable as memory.low is
supposed to express non-OOMing memory priorities between cgroups.
The reason for this is proportional memory.low reclaim. When cgroups
are below their memory.low threshold, reclaim passes them over in the
first round, and then retries if it couldn't find pages anywhere else.
But when cgroups are slightly above their memory.low setting, page scan
force is scaled down and diminished in proportion to the overage, to the
point where it can cause reclaim to fail as well - only in that case we
currently don't retry, and instead trigger OOM.
To fix this, hook proportional reclaim into the same retry logic we have
in place for when cgroups are skipped entirely. This way if reclaim
fails and some cgroups were scanned with diminished pressure, we'll try
another full-force cycle before giving up and OOMing.
[akpm@linux-foundation.org: coding-style fixes]
Link: https://lkml.kernel.org/r/20210817180506.220056-1-hannes@cmpxchg.org
Fixes:
9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Leon Yang <lnyng@fb.com>
Reviewed-by: Rik van Riel <riel@surriel.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nathan Chancellor [Fri, 20 Aug 2021 02:04:18 +0000 (19:04 -0700)]
MAINTAINERS: update ClangBuiltLinux IRC chat
Everyone has moved from Freenode to Libera so updated the channel entry
for MAINTAINERS.
Link: https://github.com/ClangBuiltLinux/linux/issues/1402
Link: https://lkml.kernel.org/r/20210818022339.3863058-1-nathan@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 20 Aug 2021 02:04:15 +0000 (19:04 -0700)]
mmflags.h: add missing __GFP_ZEROTAGS and __GFP_SKIP_KASAN_POISON names
printk("%pGg") outputs these two flags as hexadecimal number, rather
than as a string, e.g:
GFP_KERNEL|0x1800000
Fix this by adding missing names of __GFP_ZEROTAGS and
__GFP_SKIP_KASAN_POISON flags to __def_gfpflag_names.
Link: https://lkml.kernel.org/r/20210816133502.590-1-rppt@kernel.org
Fixes:
013bb59dbb7c ("arm64: mte: handle tags zeroing at page allocation time")
Fixes:
c275c5c6d50a ("kasan: disable freed user page poisoning with HW tags")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Doug Berger [Fri, 20 Aug 2021 02:04:12 +0000 (19:04 -0700)]
mm/page_alloc: don't corrupt pcppage_migratetype
When placing pages on a pcp list, migratetype values over
MIGRATE_PCPTYPES get added to the MIGRATE_MOVABLE pcp list.
However, the actual migratetype is preserved in the page and should
not be changed to MIGRATE_MOVABLE or the page may end up on the wrong
free_list.
The impact is that HIGHATOMIC or CMA pages getting bulk freed from the
PCP lists could potentially end up on the wrong buddy list. There are
various consequences but minimally NR_FREE_CMA_PAGES accounting could
get screwed up.
[mgorman@techsingularity.net: changelog update]
Link: https://lkml.kernel.org/r/20210811182917.2607994-1-opendmb@gmail.com
Fixes:
df1acc856923 ("mm/page_alloc: avoid conflating IRQs disabled with zone->lock")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Shi [Fri, 20 Aug 2021 02:04:09 +0000 (19:04 -0700)]
Revert "mm: swap: check if swap backing device is congested or not"
Due to the change about how block layer detects congestion the
justification of commit
8fd2e0b505d1 ("mm: swap: check if swap backing
device is congested or not") doesn't stand anymore, so the commit could
be just reverted in order to solve the race reported by commit
2efa33fc7f6e ("mm/shmem: fix shmem_swapin() race with swapoff"). The
fix was reverted by the previous patch.
Link: https://lkml.kernel.org/r/20210810202936.2672-3-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Shi [Fri, 20 Aug 2021 02:04:05 +0000 (19:04 -0700)]
Revert "mm/shmem: fix shmem_swapin() race with swapoff"
Due to the change about how block layer detects congestion the
justification of commit
8fd2e0b505d1 ("mm: swap: check if swap backing
device is congested or not") doesn't stand anymore, so the commit could
be just reverted in order to solve the race reported by commit
2efa33fc7f6e ("mm/shmem: fix shmem_swapin() race with swapoff"), so the
fix commit could be just reverted as well.
And that fix is also kind of buggy as discussed by [1] and [2].
[1] https://lore.kernel.org/linux-mm/
24187e5e-069-9f3f-cefe-
39ac70783753@google.com/
[2] https://lore.kernel.org/linux-mm/
e82380b9-3ad4-4a52-be50-
6d45c7f2b5da@google.com/
Link: https://lkml.kernel.org/r/20210810202936.2672-2-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Petr Pavlu [Sat, 7 Aug 2021 17:54:50 +0000 (19:54 +0200)]
riscv: Fix a number of free'd resources in init_resources()
Function init_resources() allocates a boot memory block to hold an array of
resources which it adds to iomem_resource. The array is filled in from its
end and the function then attempts to free any unused memory at the
beginning. The problem is that size of the unused memory is incorrectly
calculated and this can result in releasing memory which is in use by
active resources. Their data then gets corrupted later when the memory is
reused by a different part of the system.
Fix the size of the released memory to correctly match the number of unused
resource entries.
Fixes:
ffe0e5261268 ("RISC-V: Improve init_resources()")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>
Tested-by: Sunil V L <sunilvl@ventanamicro.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Dave Airlie [Fri, 20 Aug 2021 05:13:56 +0000 (15:13 +1000)]
Merge tag 'amd-drm-fixes-5.14-2021-08-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.14-2021-08-18:
amdgpu:
- vega10 SMU workload fix
- DCN VM fix
- DCN 3.01 watermark fix
amdkfd:
- SVM fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210818225137.4070-1-alexander.deucher@amd.com
Rob Herring [Tue, 17 Aug 2021 17:47:55 +0000 (12:47 -0500)]
dt-bindings: sifive-l2-cache: Fix 'select' matching
When the schema fixups are applied to 'select' the result is a single
entry is required for a match, but that will never match as there should
be 2 entries. Also, a 'select' schema should have the widest possible
match, so use 'contains' which matches the compatible string(s) in any
position and not just the first position.
Fixes:
993dcfac64eb ("dt-bindings: riscv: sifive-l2-cache: convert bindings to json-schema")
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Dave Airlie [Fri, 20 Aug 2021 00:09:42 +0000 (10:09 +1000)]
Merge tag 'mediatek-drm-fixes-5.14-2' of https://git./linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes for Linux 5.14-2
1. Fix AAL output size setting.
2. Delete component in remove function.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210819001635.14803-1-chunkuang.hu@kernel.org
Dave Airlie [Thu, 19 Aug 2021 23:38:30 +0000 (09:38 +1000)]
Merge tag 'drm-intel-fixes-2021-08-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Expand a tweaked display workaround for all PCHs. (Anshuman)
- Fix eDP MSO pipe sanity checks for ADL-P. (Jani)
- Remove superfluous EXPORT_SYMBOL(). (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YR137zkSAIbun1Ed@intel.com
Dave Airlie [Thu, 19 Aug 2021 20:57:44 +0000 (06:57 +1000)]
Merge branch 'linux-5.14' of git://github.com/skeggsb/linux into drm-fixes
- Ampere display fixes
- Fix longstanding MM race issue by removing unused code.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv5jtUFkHsGe-pf-=RceDOgKygjPnCi=6d5vCLM_f5aeMQ@mail.gmail.com
Linus Torvalds [Thu, 19 Aug 2021 22:32:58 +0000 (15:32 -0700)]
Merge tag 'soc-fixes-5.14-3' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Not much to see here. Half the fixes this time are for Qualcomm dts
files, fixing small mistakes on certain machines. The other fixes are:
- A 5.13 regression fix for freescale QE interrupt controller\
- A fix for TI OMAP gpt12 timer error handling
- A randconfig build regression fix for ixp4xx
- Another defconfig fix following the CONFIG_FB dependency rework"
* tag 'soc-fixes-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
soc: fsl: qe: fix static checker warning
ARM: ixp4xx: fix building both pci drivers
ARM: configs: Update the nhk8815_defconfig
bus: ti-sysc: Fix error handling for sysc_check_active_timer()
soc: fsl: qe: convert QE interrupt controller to platform_device
arm64: dts: qcom: sdm845-oneplus: fix reserved-mem
arm64: dts: qcom: msm8994-angler: Disable cont_splash_mem
arm64: dts: qcom: sc7280: Fixup cpufreq domain info for cpu7
arm64: dts: qcom: msm8992-bullhead: Fix cont_splash_mem mapping
arm64: dts: qcom: msm8992-bullhead: Remove PSCI
arm64: dts: qcom: c630: fix correct powerdown pin for WSA881x
Dave Airlie [Thu, 19 Aug 2021 07:39:33 +0000 (17:39 +1000)]
Merge tag 'drm-misc-fixes-2021-08-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:
* UAPI: Return results for failed drm_wait_vblank_ioctl()
* ttm: Fix debugfs initialization
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YR1c7cG1IaL+g8EN@linux-uq9g.fritz.box
Linus Torvalds [Thu, 19 Aug 2021 19:33:43 +0000 (12:33 -0700)]
Merge tag 'net-5.14-rc7' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from bpf, wireless and mac80211
trees.
Current release - regressions:
- tipc: call tipc_wait_for_connect only when dlen is not 0
- mac80211: fix locking in ieee80211_restart_work()
Current release - new code bugs:
- bpf: add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id()
- ethernet: ice: fix perout start time rounding
- wwan: iosm: prevent underflow in ipc_chnl_cfg_get()
Previous releases - regressions:
- bpf: clear zext_dst of dead insns
- sch_cake: fix srchost/dsthost hashing mode
- vrf: reset skb conntrack connection on VRF rcv
- net/rds: dma_map_sg is entitled to merge entries
Previous releases - always broken:
- ethernet: bnxt: fix Tx path locking and races, add Rx path
barriers"
* tag 'net-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (42 commits)
net: dpaa2-switch: disable the control interface on error path
Revert "flow_offload: action should not be NULL when it is referenced"
iavf: Fix ping is lost after untrusted VF had tried to change MAC
i40e: Fix ATR queue selection
r8152: fix the maximum number of PLA bp for RTL8153C
r8152: fix writing USB_BP2_EN
mptcp: full fully established support after ADD_ADDR
mptcp: fix memory leak on address flush
net/rds: dma_map_sg is entitled to merge entries
net: mscc: ocelot: allow forwarding from bridge ports to the tag_8021q CPU port
net: asix: fix uninit value bugs
ovs: clear skb->tstamp in forwarding path
net: mdio-mux: Handle -EPROBE_DEFER correctly
net: mdio-mux: Don't ignore memory allocation errors
net: mdio-mux: Delete unnecessary devm_kfree
net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse
sch_cake: fix srchost/dsthost hashing mode
ixgbe, xsk: clean up the resources in ixgbe_xsk_pool_enable error path
net: qlcnic: add missed unlock in qlcnic_83xx_flash_read32
mac80211: fix locking in ieee80211_restart_work()
...
Linus Torvalds [Thu, 19 Aug 2021 19:19:58 +0000 (12:19 -0700)]
Merge tag 'platform-drivers-x86-v5.14-4' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
- Enable SW_TABLET_MODE support for the TP200s
- Enable WMI on two more Gigabyte motherboards
* tag 'platform-drivers-x86-v5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: gigabyte-wmi: add support for B450M S2H V2
platform/x86: gigabyte-wmi: add support for X570 GAMING X
platform/x86: asus-nb-wmi: Add tablet_mode_sw=lid-flip quirk for the TP200s
platform/x86: asus-nb-wmi: Allow configuring SW_TABLET_MODE method with a module option
Vladimir Oltean [Thu, 19 Aug 2021 14:17:55 +0000 (17:17 +0300)]
net: dpaa2-switch: disable the control interface on error path
Currently dpaa2_switch_takedown has a funny name and does not do the
opposite of dpaa2_switch_init, which makes probing fail when we need to
handle an -EPROBE_DEFER.
A sketch of what dpaa2_switch_init does:
dpsw_open
dpaa2_switch_detect_features
dpsw_reset
for (i = 0; i < ethsw->sw_attr.num_ifs; i++) {
dpsw_if_disable
dpsw_if_set_stp
dpsw_vlan_remove_if_untagged
dpsw_if_set_tci
dpsw_vlan_remove_if
}
dpsw_vlan_remove
alloc_ordered_workqueue
dpsw_fdb_remove
dpaa2_switch_ctrl_if_setup
When dpaa2_switch_takedown is called from the error path of
dpaa2_switch_probe(), the control interface, enabled by
dpaa2_switch_ctrl_if_setup from dpaa2_switch_init, remains enabled,
because dpaa2_switch_takedown does not call
dpaa2_switch_ctrl_if_teardown.
Since dpaa2_switch_probe might fail due to EPROBE_DEFER of a PHY, this
means that a second probe of the driver will happen with the control
interface directly enabled.
This will trigger a second error:
[ 93.273528] fsl_dpaa2_switch dpsw.0: dpsw_ctrl_if_set_pools() failed
[ 93.281966] fsl_dpaa2_switch dpsw.0: fsl_mc_driver_probe failed: -13
[ 93.288323] fsl_dpaa2_switch: probe of dpsw.0 failed with error -13
Which if we investigate the /dev/dpaa2_mc_console log, we find out is
caused by:
[E, ctrl_if_set_pools:2211, DPMNG] ctrl_if must be disabled
So make dpaa2_switch_takedown do the opposite of dpaa2_switch_init (in
reasonable limits, no reason to change STP state, re-add VLANs etc), and
rename it to something more conventional, like dpaa2_switch_teardown.
Fixes:
613c0a5810b7 ("staging: dpaa2-switch: enable the control interface")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20210819141755.1931423-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Thu, 19 Aug 2021 10:58:42 +0000 (13:58 +0300)]
Revert "flow_offload: action should not be NULL when it is referenced"
This reverts commit
9ea3e52c5bc8bb4a084938dc1e3160643438927a.
Cited commit added a check to make sure 'action' is not NULL, but
'action' is already dereferenced before the check, when calling
flow_offload_has_one_action().
Therefore, the check does not make any sense and results in a smatch
warning:
include/net/flow_offload.h:322 flow_action_mixed_hw_stats_check() warn:
variable dereferenced before check 'action' (see line 319)
Fix by reverting this commit.
Cc: gushengxian <gushengxian@yulong.com>
Fixes:
9ea3e52c5bc8 ("flow_offload: action should not be NULL when it is referenced")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20210819105842.1315705-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 19 Aug 2021 16:56:42 +0000 (09:56 -0700)]
Merge branch 'intel-wired-lan-driver-updates-2021-08-18'
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-08-18
This series contains updates to i40e and iavf drivers.
Arkadiusz fixes Flow Director not using the correct queue due to calling
the wrong pick Tx function for i40e.
Sylwester resolves traffic loss for iavf when it attempts to change its
MAC address when it does not have permissions to do so.
====================
Link: https://lore.kernel.org/r/20210818174217.4138922-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sylwester Dziedziuch [Wed, 18 Aug 2021 17:42:17 +0000 (10:42 -0700)]
iavf: Fix ping is lost after untrusted VF had tried to change MAC
Make changes to MAC address dependent on the response of PF.
Disallow changes to HW MAC address and MAC filter from untrusted
VF, thanks to that ping is not lost if VF tries to change MAC.
Add a new field in iavf_mac_filter, to indicate whether there
was response from PF for given filter. Based on this field pass
or discard the filter.
If untrusted VF tried to change it's address, it's not changed.
Still filter was changed, because of that ping couldn't go through.
Fixes:
c5c922b3e09b ("iavf: fix MAC address setting for VFs when filter is rejected")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan G <Gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arkadiusz Kubalewski [Wed, 18 Aug 2021 17:42:16 +0000 (10:42 -0700)]
i40e: Fix ATR queue selection
Without this patch, ATR does not work. Receive/transmit uses queue
selection based on SW DCB hashing method.
If traffic classes are not configured for PF, then use
netdev_pick_tx function for selecting queue for packet transmission.
Instead of calling i40e_swdcb_skb_tx_hash, call netdev_pick_tx,
which ensures that packet is transmitted/received from CPU that is
running the application.
Reproduction steps:
1. Load i40e driver
2. Map each MSI interrupt of i40e port for each CPU
3. Disable ntuple, enable ATR i.e.:
ethtool -K $interface ntuple off
ethtool --set-priv-flags $interface flow-director-atr
4. Run application that is generating traffic and is bound to a
single CPU, i.e.:
taskset -c 9 netperf -H 1.1.1.1 -t TCP_RR -l 10
5. Observe behavior:
Application's traffic should be restricted to the CPU provided in
taskset.
Fixes:
89ec1f0886c1 ("i40e: Fix queue-to-TC mapping on Tx")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Dave Switzer <david.switzer@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 19 Aug 2021 15:58:16 +0000 (08:58 -0700)]
Merge https://git./linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2021-08-19
We've added 3 non-merge commits during the last 3 day(s) which contain
a total of 3 files changed, 29 insertions(+), 6 deletions(-).
The main changes are:
1) Fix to clear zext_dst for dead instructions which was causing invalid program
rejections on JITs with bpf_jit_needs_zext such as s390x, from Ilya Leoshkevich.
2) Fix RCU splat in bpf_get_current_{ancestor_,}cgroup_id() helpers when they are
invoked from sleepable programs, from Yonghong Song.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests, bpf: Test that dead ldx_w insns are accepted
bpf: Clear zext_dst of dead insns
bpf: Add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id() helpers
====================
Link: https://lore.kernel.org/r/20210819144904.20069-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Takashi Iwai [Thu, 19 Aug 2021 15:29:45 +0000 (17:29 +0200)]
ASoC: intel: atom: Fix breakage for PCM buffer address setup
The commit
2e6b836312a4 ("ASoC: intel: atom: Fix reference to PCM
buffer address") changed the reference of PCM buffer address to
substream->runtime->dma_addr as the buffer address may change
dynamically. However, I forgot that the dma_addr field is still not
set up for the CONTINUOUS buffer type (that this driver uses) yet in
5.14 and earlier kernels, and it resulted in garbage I/O. The problem
will be fixed in 5.15, but we need to address it quickly for now.
The fix is to deduce the address again from the DMA pointer with
virt_to_phys(), but from the right one, substream->runtime->dma_area.
Fixes:
2e6b836312a4 ("ASoC: intel: atom: Fix reference to PCM buffer address")
Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: <stable@vger.kernel.org>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/2048c6aa-2187-46bd-6772-36a4fb3c5aeb@redhat.com
Link: https://lore.kernel.org/r/20210819152945.8510-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Kai-Heng Feng [Wed, 18 Aug 2021 14:41:18 +0000 (22:41 +0800)]
ALSA: hda/realtek: Limit mic boost on HP ProBook 445 G8
The mic has lots of noises if mic boost is enabled. So disable mic boost
to get crystal clear audio capture.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210818144119.121738-1-kai.heng.feng@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Arnd Bergmann [Thu, 19 Aug 2021 15:22:46 +0000 (17:22 +0200)]
Merge tag 'omap-for-v5.14/gpt12-fix-signed' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes
Fix for omap gpt12 timer error handling
Two of the recent fixes for ti-sysc driver had bad interaction for a
function return value that caused one of the fixes to not work so we
need to change the return value handling. Otherwise early beagleboard
variants still have a boot issue.
* tag 'omap-for-v5.14/gpt12-fix-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
bus: ti-sysc: Fix error handling for sysc_check_active_timer()
Link: https://lore.kernel.org/r/pull-1629354796-830948@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Krzysztof Wilczyński [Thu, 12 Aug 2021 13:21:44 +0000 (13:21 +0000)]
PCI/sysfs: Use correct variable for the legacy_mem sysfs object
Two legacy PCI sysfs objects "legacy_io" and "legacy_mem" were updated
to use an unified address space in the commit
636b21b50152 ("PCI: Revoke
mappings like devmem"). This allows for revocations to be managed from
a single place when drivers want to take over and mmap() a /dev/mem
range.
Following the update, both of the sysfs objects should leverage the
iomem_get_mapping() function to get an appropriate address range, but
only the "legacy_io" has been correctly updated - the second attribute
seems to be using a wrong variable to pass the iomem_get_mapping()
function to.
Thus, correct the variable name used so that the "legacy_mem" sysfs
object would also correctly call the iomem_get_mapping() function.
Fixes:
636b21b50152 ("PCI: Revoke mappings like devmem")
Link: https://lore.kernel.org/r/20210812132144.791268-1-kw@linux.com
Signed-off-by: Krzysztof Wilczyński <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Marcin Bachry [Thu, 22 Jul 2021 02:58:58 +0000 (22:58 -0400)]
PCI: Increase D3 delay for AMD Renoir/Cezanne XHCI
The Renoir XHCI controller apparently doesn't resume reliably with the
standard D3hot-to-D0 delay. Increase it to 20ms.
[Alex: I talked to the AMD USB hardware team and the AMD Windows team and
they are not aware of any HW errata or specific issues. The HW works fine
in Windows. I was told Windows uses a rather generous default delay of
100ms for PCI state transitions.]
Link: https://lore.kernel.org/r/20210722025858.220064-1-alexander.deucher@amd.com
Signed-off-by: Marcin Bachry <hegel666@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Prike Liang <prike.liang@amd.com>
Cc: Shyam Sundar S K <shyam-sundar.s-k@amd.com>
Jim Quinlan [Wed, 18 Aug 2021 22:50:30 +0000 (18:50 -0400)]
MAINTAINERS: Add Jim Quinlan et al as Broadcom STB PCIe maintainers
Add Jim Quinlan, Nicolas Saenz Julienne, and Florian Fainelli as
maintainers of the Broadcom STB PCIe controller driver.
This driver is also included in these entries:
BROADCOM BCM2711/BCM2835 ARM ARCHITECTURE
BROADCOM BCM7XXX ARM ARCHITECTURE
which cover the Raspberry Pi specifics of the PCIe driver.
Link: https://lore.kernel.org/r/20210818225031.8502-1-jim2101024@gmail.com
Signed-off-by: Jim Quinlan <jim2101024@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
David S. Miller [Thu, 19 Aug 2021 11:19:30 +0000 (12:19 +0100)]
Merge branch 'r8152-bp-settings'
Hayes Wang says:
====================
r8152: fix bp settings
Fix the wrong bp settings of the firmware.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang [Thu, 19 Aug 2021 03:05:37 +0000 (11:05 +0800)]
r8152: fix the maximum number of PLA bp for RTL8153C
The maximum PLA bp number of RTL8153C is 16, not 8. That is, the
bp 0 ~ 15 are at 0xfc28 ~ 0xfc46, and the bp_en is at 0xfc48.
Fixes:
195aae321c82 ("r8152: support new chips")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang [Thu, 19 Aug 2021 03:05:36 +0000 (11:05 +0800)]
r8152: fix writing USB_BP2_EN
The register of USB_BP2_EN is 16 bits, so we should use
ocp_write_word(), not ocp_write_byte().
Fixes:
9370f2d05a2a ("support request_firmware for RTL8153")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 19 Aug 2021 11:17:05 +0000 (12:17 +0100)]
Merge branch 'mptcp-fixes'
Mat Martineau says:
====================
mptcp: Bug fixes
Here are two bug fixes for the net tree:
Patch 1 fixes a memory leak that could be encountered when clearing the
list of advertised MPTCP addresses.
Patch 2 fixes a protocol issue early in an MPTCP connection, to ensure
both peers correctly understand that the full MPTCP connection handshake
has completed even when the server side quickly sends an ADD_ADDR
option.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthieu Baerts [Wed, 18 Aug 2021 23:42:37 +0000 (16:42 -0700)]
mptcp: full fully established support after ADD_ADDR
If directly after an MP_CAPABLE 3WHS, the client receives an ADD_ADDR
with HMAC from the server, it is enough to switch to a "fully
established" mode because it has received more MPTCP options.
It was then OK to enable the "fully_established" flag on the MPTCP
socket. Still, best to check if the ADD_ADDR looks valid by looking if
it contains an HMAC (no 'echo' bit). If an ADD_ADDR echo is received
while we are not in "fully established" mode, it is strange and then
we should not switch to this mode now.
But that is not enough. On one hand, the path-manager has be notified
the state has changed. On the other hand, the "fully_established" flag
on the subflow socket should be turned on as well not to re-send the
MP_CAPABLE 3rd ACK content with the next ACK.
Fixes:
84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Wed, 18 Aug 2021 23:42:36 +0000 (16:42 -0700)]
mptcp: fix memory leak on address flush
The endpoint cleanup path is prone to a memory leak, as reported
by syzkaller:
BUG: memory leak
unreferenced object 0xffff88810680ea00 (size 64):
comm "syz-executor.6", pid 6191, jiffies
4295756280 (age 24.138s)
hex dump (first 32 bytes):
58 75 7d 3c 80 88 ff ff 22 01 00 00 00 00 ad de Xu}<....".......
01 00 02 00 00 00 00 00 ac 1e 00 07 00 00 00 00 ................
backtrace:
[<
0000000072a9f72a>] kmalloc include/linux/slab.h:591 [inline]
[<
0000000072a9f72a>] mptcp_nl_cmd_add_addr+0x287/0x9f0 net/mptcp/pm_netlink.c:1170
[<
00000000f6e931bf>] genl_family_rcv_msg_doit.isra.0+0x225/0x340 net/netlink/genetlink.c:731
[<
00000000f1504a2c>] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
[<
00000000f1504a2c>] genl_rcv_msg+0x341/0x5b0 net/netlink/genetlink.c:792
[<
0000000097e76f6a>] netlink_rcv_skb+0x148/0x430 net/netlink/af_netlink.c:2504
[<
00000000ceefa2b8>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
[<
000000008ff91aec>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
[<
000000008ff91aec>] netlink_unicast+0x537/0x750 net/netlink/af_netlink.c:1340
[<
0000000041682c35>] netlink_sendmsg+0x846/0xd80 net/netlink/af_netlink.c:1929
[<
00000000df3aa8e7>] sock_sendmsg_nosec net/socket.c:704 [inline]
[<
00000000df3aa8e7>] sock_sendmsg+0x14e/0x190 net/socket.c:724
[<
000000002154c54c>] ____sys_sendmsg+0x709/0x870 net/socket.c:2403
[<
000000001aab01d7>] ___sys_sendmsg+0xff/0x170 net/socket.c:2457
[<
00000000fa3b1446>] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486
[<
00000000db2ee9c7>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<
00000000db2ee9c7>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
[<
000000005873517d>] entry_SYSCALL_64_after_hwframe+0x44/0xae
We should not require an allocation to cleanup stuff.
Rework the code a bit so that the additional RCU work is no more needed.
Fixes:
1729cf186d8a ("mptcp: create the listening socket for new port")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Rutland [Wed, 18 Aug 2021 16:15:35 +0000 (17:15 +0100)]
arm64: initialize all of CNTHCTL_EL2
In __init_el2_timers we initialize CNTHCTL_EL2.{EL1PCEN,EL1PCTEN} with a
RMW sequence, leaving all other bits UNKNOWN.
In general, we should initialize all bits in a register rather than
using an RMW sequence, since most bits are UNKNOWN out of reset, and as
new bits are added to the reigster their reset value might not result in
expected behaviour.
In the case of CNTHCTL_EL2, FEAT_ECV added a number of new control bits
in previously RES0 bits, which reset to UNKNOWN values, and may cause
issues for EL1 and EL0:
* CNTHCTL_EL2.ECV enables the CNTPOFF_EL2 offset (which itself resets to
an UNKNOWN value) at EL0 and EL1. Since the offset could reset to
distinct values across CPUs, when the control bit resets to 1 this
could break timekeeping generally.
* CNTHCTL_EL2.{EL1TVT,EL1TVCT} trap EL0 and EL1 accesses to the EL1
virtual timer/counter registers to EL2. When reset to 1, this could
cause unexpected traps to EL2.
Initializing these bits to zero avoids these problems, and all other
bits in CNTHCTL_EL2 other than EL1PCEN and EL1PCTEN can safely be reset
to zero.
This patch ensures we initialize CNTHCTL_EL2 accordingly, only setting
EL1PCEN and EL1PCTEN, and setting all other bits to zero.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oupton@google.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Oliver Upton <oupton@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210818161535.52786-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Michael Ellerman [Sun, 15 Aug 2021 04:10:24 +0000 (14:10 +1000)]
powerpc/mm: Fix set_memory_*() against concurrent accesses
Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:
kernel tried to execute exec-protected page (
c008000004073278) - exploit attempt? (uid: 0)
BUG: Unable to handle kernel instruction fetch
Faulting instruction address: 0xc008000004073278
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
Workqueue: events control_work_handler [virtio_console]
NIP:
c008000004073278 LR:
c008000004073278 CTR:
c0000000001e9de0
REGS:
c00000002e4ef7e0 TRAP: 0400 Not tainted (5.14.0-rc4+)
MSR:
800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR:
24002822 XER:
200400cf
...
NIP fill_queue+0xf0/0x210 [virtio_console]
LR fill_queue+0xf0/0x210 [virtio_console]
Call Trace:
fill_queue+0xb4/0x210 [virtio_console] (unreliable)
add_port+0x1a8/0x470 [virtio_console]
control_work_handler+0xbc/0x1e8 [virtio_console]
process_one_work+0x290/0x590
worker_thread+0x88/0x620
kthread+0x194/0x1a0
ret_from_kernel_thread+0x5c/0x64
Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.
Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page, if another CPU tries to access the
page at that time we see something like the fault above.
We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].
But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.
[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
Fixes:
1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Murilo Opsfelder Araújo <muriloo@linux.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818120518.3603172-1-mpe@ellerman.id.au
Christophe Leroy [Wed, 18 Aug 2021 06:49:29 +0000 (06:49 +0000)]
powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP
Commit
b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
removed the 'isync' instruction after adding/removing NX bit in user
segments. The reasoning behind this change was that when setting the
NX bit we don't mind it taking effect with delay as the kernel never
executes text from userspace, and when clearing the NX bit this is
to return to userspace and then the 'rfi' should synchronise the
context.
However, it looks like on book3s/32 having a hash page table, at least
on the G3 processor, we get an unexpected fault from userspace, then
this is followed by something wrong in the verification of MSR_PR
at end of another interrupt.
This is fixed by adding back the removed isync() following update
of NX bit in user segment registers. Only do it for cores with an
hash table, as 603 cores don't exhibit that problem and the two isync
increase ./null_syscall selftest by 6 cycles on an MPC 832x.
First problem: unexpected WARN_ON() for mysterious PROTFAULT
WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0
Modules linked in:
CPU: 0 PID: 1660 Comm: Xorg Not tainted
5.13.0-pmac-00028-gb3c15b60339a #40
NIP:
c001b5c8 LR:
c001b6f8 CTR:
00000000
REGS:
e2d09e40 TRAP: 0700 Not tainted (
5.13.0-pmac-00028-gb3c15b60339a)
MSR:
00021032 <ME,IR,DR,RI> CR:
42d04f30 XER:
20000000
GPR00:
c000424c e2d09f00 c301b680 e2d09f40 0000001e 42000000 00cba028 00000000
GPR08:
08000000 48000010 c301b680 e2d09f30 22d09f30 00c1fff0 00cba000 a7b7ba4c
GPR16:
00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
GPR24:
a7b7b64c a7b7d2f0 00000004 00000000 c1efa6c0 00cba02c 00000300 e2d09f40
NIP [
c001b5c8] do_page_fault+0x6c/0x5b0
LR [
c001b6f8] do_page_fault+0x19c/0x5b0
Call Trace:
[
e2d09f00] [
e2d09f04] 0xe2d09f04 (unreliable)
[
e2d09f30] [
c000424c] DataAccess_virt+0xd4/0xe4
--- interrupt: 300 at 0xa7a261dc
NIP:
a7a261dc LR:
a7a253bc CTR:
00000000
REGS:
e2d09f40 TRAP: 0300 Not tainted (
5.13.0-pmac-00028-gb3c15b60339a)
MSR:
0000d032 <EE,PR,ME,IR,DR,RI> CR:
228428e2 XER:
20000000
DAR:
00cba02c DSISR:
42000000
GPR00:
a7a27448 afa6b0e0 a74c35c0 a7b7b614 0000001e a7b7b614 00cba028 00000000
GPR08:
00020fd9 00000031 00cb9ff8 a7a273b0 220028e2 00c1fff0 00cba000 a7b7ba4c
GPR16:
00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010
GPR24:
a7b7b64c a7b7d2f0 00000004 00000002 0000001e a7b7b614 a7b7aff4 00000030
NIP [
a7a261dc] 0xa7a261dc
LR [
a7a253bc] 0xa7a253bc
--- interrupt: 300
Instruction dump:
7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 551e0036 4082038c
2e1b0000 40920228 75280800 41820220 <
0fe00000>
3b600000 41920214 81420594
Second problem: MSR PR is seen unset allthough the interrupt frame shows it set
kernel BUG at arch/powerpc/kernel/interrupt.c:458!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1660 Comm: Xorg Tainted: G W
5.13.0-pmac-00028-gb3c15b60339a #40
NIP:
c0011434 LR:
c001629c CTR:
00000000
REGS:
e2d09e70 TRAP: 0700 Tainted: G W (
5.13.0-pmac-00028-gb3c15b60339a)
MSR:
00029032 <EE,ME,IR,DR,RI> CR:
42d09f30 XER:
00000000
GPR00:
00000000 e2d09f30 c301b680 e2d09f40 83440000 c44d0e68 e2d09e8c 00000000
GPR08:
00000002 00dc228a 00004000 e2d09f30 22d09f30 00c1fff0 afa6ceb4 00c26144
GPR16:
00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
GPR24:
00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
NIP [
c0011434] interrupt_exit_kernel_prepare+0x10/0x70
LR [
c001629c] interrupt_return+0x9c/0x144
Call Trace:
[
e2d09f30] [
c000424c] DataAccess_virt+0xd4/0xe4 (unreliable)
--- interrupt: 300 at 0xa09be008
NIP:
a09be008 LR:
a09bdfe8 CTR:
a09bdfc0
REGS:
e2d09f40 TRAP: 0300 Tainted: G W (
5.13.0-pmac-00028-gb3c15b60339a)
MSR:
0000d032 <EE,PR,ME,IR,DR,RI> CR:
420028e2 XER:
20000000
DAR:
a539a308 DSISR:
0a000000
GPR00:
a7b90d50 afa6b2d0 a74c35c0 a0a8b690 a0a8b698 a5365d70 a4fa82a8 00000004
GPR08:
00000000 a09bdfc0 00000000 a5360000 a09bde7c 00c1fff0 afa6ceb4 00c26144
GPR16:
00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000
GPR24:
00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089
NIP [
a09be008] 0xa09be008
LR [
a09bdfe8] 0xa09bdfe8
--- interrupt: 300
Instruction dump:
80010024 83e1001c 7c0803a6 4bffff80 3bc00800 4bffffd0 486b42fd 4bffffcc
81430084 71480002 41820038 554a0462 <
0f0a0000>
80620060 74630001 40820034
Fixes:
b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C")
Cc: stable@vger.kernel.org # v5.13+
Reported-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4856f5574906e2aec0522be17bf3848a22b2cd0b.1629269345.git.christophe.leroy@csgroup.eu
Gerd Rausch [Tue, 17 Aug 2021 17:04:37 +0000 (10:04 -0700)]
net/rds: dma_map_sg is entitled to merge entries
Function "dma_map_sg" is entitled to merge adjacent entries
and return a value smaller than what was passed as "nents".
Subsequently "ib_map_mr_sg" needs to work with this value ("sg_dma_len")
rather than the original "nents" parameter ("sg_len").
This old RDS bug was exposed and reliably causes kernel panics
(using RDMA operations "rds-stress -D") on x86_64 starting with:
commit
c588072bba6b ("iommu/vt-d: Convert intel iommu driver to the iommu ops")
Simply put: Linux 5.11 and later.
Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Link: https://lore.kernel.org/r/60efc69f-1f35-529d-a7ef-da0549cad143@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Tue, 17 Aug 2021 16:04:25 +0000 (19:04 +0300)]
net: mscc: ocelot: allow forwarding from bridge ports to the tag_8021q CPU port
Currently we are unable to ping a bridge on top of a felix switch which
uses the ocelot-8021q tagger. The packets are dropped on the ingress of
the user port and the 'drop_local' counter increments (the counter which
denotes drops due to no valid destinations).
Dumping the PGID tables, it becomes clear that the PGID_SRC of the user
port is zero, so it has no valid destinations.
But looking at the code, the cpu_fwd_mask (the bit mask of DSA tag_8021q
ports) is clearly missing from the forwarding mask of ports that are
under a bridge. So this has always been broken.
Looking at the version history of the patch, in v7
https://patchwork.kernel.org/project/netdevbpf/patch/
20210125220333.
1004365-12-olteanv@gmail.com/
the code looked like this:
/* Standalone ports forward only to DSA tag_8021q CPU ports */
unsigned long mask = cpu_fwd_mask;
(...)
} else if (ocelot->bridge_fwd_mask & BIT(port)) {
mask |= ocelot->bridge_fwd_mask & ~BIT(port);
while in v8 (the merged version)
https://patchwork.kernel.org/project/netdevbpf/patch/
20210129010009.
3959398-12-olteanv@gmail.com/
it looked like this:
unsigned long mask;
(...)
} else if (ocelot->bridge_fwd_mask & BIT(port)) {
mask = ocelot->bridge_fwd_mask & ~BIT(port);
So the breakage was introduced between v7 and v8 of the patch.
Fixes:
e21268efbe26 ("net: dsa: felix: perform switch setup for tag_8021q")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20210817160425.3702809-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhan Liu [Fri, 13 Aug 2021 15:31:04 +0000 (08:31 -0700)]
drm/amd/display: Use DCN30 watermark calc for DCN301
[why]
dcn301_calculate_wm_and_dl() causes flickering when external monitor is
connected.
This issue has been fixed before by commit
0e4c0ae59d7e
("drm/amdgpu/display: drop dcn301_calculate_wm_and_dl for now"), however
part of the fix was gone after commit
2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next").
[how]
Use dcn30_calculate_wm_and_dlg() instead as in the original fix.
Fixes:
2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next")
Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Zhan Liu <zhan.liu@amd.com>
Tested-by: Oliver Logush <oliver.logush@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Linus Torvalds [Wed, 18 Aug 2021 19:06:42 +0000 (12:06 -0700)]
Merge tag 'for-5.14-rc6-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One more fix for cross-rename, adding a missing check for directory
and subvolume, this could lead to a crash"
* tag 'for-5.14-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: prevent rename2 from exchanging a subvol with a directory from different parents
Linus Torvalds [Wed, 18 Aug 2021 19:00:27 +0000 (12:00 -0700)]
Merge tag 'sound-5.14-rc7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Only a few regression fixes and trivial device quirks"
* tag 'sound-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/via: Apply runtime PM workaround for ASUS B23E
ALSA: hda: Fix hang during shutdown due to link reset
ALSA: hda/realtek: Enable 4-speaker output for Dell XPS 15 9510 laptop
ALSA: oxfw: fix functioal regression for silence in Apogee Duet FireWire
ALSA: hda - fix the 'Capture Switch' value change notifications
Linus Torvalds [Wed, 18 Aug 2021 18:55:50 +0000 (11:55 -0700)]
Merge tag 'cfi-v5.14-rc7' of git://git./linux/kernel/git/kees/linux
Pull clang cfi fix from Kees Cook:
- Use rcu_read_{un}lock_sched_notrace to avoid recursion (Elliot Berman)
* tag 'cfi-v5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
cfi: Use rcu_read_{un}lock_sched_notrace
Linus Torvalds [Thu, 5 Aug 2021 17:04:43 +0000 (10:04 -0700)]
pipe: avoid unnecessary EPOLLET wakeups under normal loads
I had forgotten just how sensitive hackbench is to extra pipe wakeups,
and commit
3a34b13a88ca ("pipe: make pipe writes always wake up
readers") ended up causing a quite noticeable regression on larger
machines.
Now, hackbench isn't necessarily a hugely meaningful benchmark, and it's
not clear that this matters in real life all that much, but as Mel
points out, it's used often enough when comparing kernels and so the
performance regression shows up like a sore thumb.
It's easy enough to fix at least for the common cases where pipes are
used purely for data transfer, and you never have any exciting poll
usage at all. So set a special 'poll_usage' flag when there is polling
activity, and make the ugly "EPOLLET has crazy legacy expectations"
semantics explicit to only that case.
I would love to limit it to just the broken EPOLLET case, but the pipe
code can't see the difference between epoll and regular select/poll, so
any non-read/write waiting will trigger the extra wakeup behavior. That
is sufficient for at least the hackbench case.
Apart from making the odd extra wakeup cases more explicitly about
EPOLLET, this also makes the extra wakeup be at the _end_ of the pipe
write, not at the first write chunk. That is actually much saner
semantics (as much as you can call any of the legacy edge-triggered
expectations for EPOLLET "sane") since it means that you know the wakeup
will happen once the write is done, rather than possibly in the middle
of one.
[ For stable people: I'm putting a "Fixes" tag on this, but I leave it
up to you to decide whether you actually want to backport it or not.
It likely has no impact outside of synthetic benchmarks - Linus ]
Link: https://lore.kernel.org/lkml/20210802024945.GA8372@xsang-OptiPlex-9020/
Fixes:
3a34b13a88ca ("pipe: make pipe writes always wake up readers")
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Sandeep Patil <sspatil@android.com>
Tested-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Weißschuh [Wed, 18 Aug 2021 16:44:35 +0000 (18:44 +0200)]
platform/x86: gigabyte-wmi: add support for B450M S2H V2
Reported as working here:
https://github.com/t-8ch/linux-gigabyte-wmi-driver/issues/1#issuecomment-
901207693
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20210818164435.99821-1-linux@weissschuh.net
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Hans de Goede [Mon, 16 Aug 2021 15:46:32 +0000 (17:46 +0200)]
usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers
Commit
a20dcf53ea98 ("usb: typec: tcpm: Respond Not_Supported if no
snk_vdo"), stops tcpm_pd_data_request() calling tcpm_handle_vdm_request()
when port->nr_snk_vdo is not set. But the VDM might be intended for an
altmode-driver, in which case nr_snk_vdo does not matter.
This change breaks the forwarding of connector hotplug (HPD) events
for displayport altmode on devices which don't set nr_snk_vdo.
tcpm_pd_data_request() is the only caller of tcpm_handle_vdm_request(),
so we can move the nr_snk_vdo check to inside it, at which point we
have already looked up the altmode device so we can check for this too.
Doing this check here also ensures that vdm_state gets set to
VDM_STATE_DONE if it was VDM_STATE_BUSY, even if we end up with
responding with PD_MSG_CTRL_NOT_SUPP later.
Note that tcpm_handle_vdm_request() was already sending
PD_MSG_CTRL_NOT_SUPP in some circumstances, after moving the nr_snk_vdo
check the same error-path is now taken when that check fails. So that
we have only one error-path for this and not two. Replace the
tcpm_queue_message(PD_MSG_CTRL_NOT_SUPP) used by the existing error-path
with the more robust tcpm_pd_handle_msg() from the (now removed) second
error-path.
Fixes:
a20dcf53ea98 ("usb: typec: tcpm: Respond Not_Supported if no snk_vdo")
Cc: stable <stable@vger.kernel.org>
Cc: Kyle Tso <kyletso@google.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Kyle Tso <kyletso@google.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20210816154632.381968-1-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nathan Chancellor [Mon, 16 Aug 2021 18:57:11 +0000 (11:57 -0700)]
powerpc/xive: Do not mark xive_request_ipi() as __init
Compiling ppc64le_defconfig with clang-14 shows a modpost warning:
WARNING: modpost: vmlinux.o(.text+0xa74e0): Section mismatch in
reference from the function xive_setup_cpu_ipi() to the function
.init.text:xive_request_ipi()
The function xive_setup_cpu_ipi() references
the function __init xive_request_ipi().
This is often because xive_setup_cpu_ipi lacks a __init
annotation or the annotation of xive_request_ipi is wrong.
xive_request_ipi() is called from xive_setup_cpu_ipi(), which is not
__init, so xive_request_ipi() should not be marked __init. Remove the
attribute so there is no more warning.
Fixes:
cbc06f051c52 ("powerpc/xive: Do not skip CPU-less nodes when creating the IPIs")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210816185711.21563-1-nathan@kernel.org
Jani Nikula [Mon, 16 Aug 2021 07:17:37 +0000 (10:17 +0300)]
drm/i915/dp: remove superfluous EXPORT_SYMBOL()
The symbol isn't needed outside of i915.ko.
Fixes:
b30edfd8d0b4 ("drm/i915: Switch to LTTPR non-transparent mode link training")
Fixes:
264613b406eb ("drm/i915: Disable LTTPR support when the DPCD rev < 1.4")
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210816071737.2917-1-jani.nikula@intel.com
(cherry picked from commit
d8959fb33890ba1956c142e83398e89812450ffc)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Jani Nikula [Thu, 12 Aug 2021 13:23:54 +0000 (16:23 +0300)]
drm/i915/edp: fix eDP MSO pipe sanity checks for ADL-P
ADL-P supports stream splitter on pipe B in addition to pipe A. Update
the sanity check in intel_ddi_mso_get_config() to reflect this, and
remove the check in intel_ddi_mso_configure() as redundant with
encoder->pipe_mask. Abstract the splitter pipe mask to a single point of
truth while at it to avoid similar mistakes in the future.
Fixes:
7bc188cc2c8c ("drm/i915/adl_p: enable MSO on pipe B")
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Swati Sharma <swati2.sharma@intel.com>
Tested-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812132354.10885-1-jani.nikula@intel.com
(cherry picked from commit
f6864b27d6d324771d979694de7ca455afbad32a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Anshuman Gupta [Tue, 10 Aug 2021 11:31:12 +0000 (17:01 +0530)]
drm/i915: Tweaked Wa_14010685332 for all PCHs
dispcnlunit1_cp_xosc_clkreq clock observed to be active on TGL-H platform
despite Wa_14010685332 original sequence,
thus blocks entry to deeper s0ix state.
The Tweaked Wa_14010685332 sequence fixes this issue, therefore use tweaked
Wa_14010685332 sequence for every PCH since PCH_CNP.
v2:
- removed RKL from comment and simplified condition. [Rodrigo]
Fixes:
b896898c7369 ("drm/i915: Tweaked Wa_14010685332 for PCHs used on gen11 platforms")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210810113112.31739-2-anshuman.gupta@intel.com
(cherry picked from commit
8b46cc6577f4bbef7e5909bb926da31d705f350f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Liu Yi L [Tue, 17 Aug 2021 12:43:21 +0000 (20:43 +0800)]
iommu/vt-d: Fix incomplete cache flush in intel_pasid_tear_down_entry()
This fixes improper iotlb invalidation in intel_pasid_tear_down_entry().
When a PASID was used as nested mode, released and reused, the following
error message will appear:
[ 180.187556] Unexpected page request in Privilege Mode
[ 180.187565] Unexpected page request in Privilege Mode
[ 180.279933] Unexpected page request in Privilege Mode
[ 180.279937] Unexpected page request in Privilege Mode
Per chapter 6.5.3.3 of VT-d spec 3.3, when tear down a pasid entry, the
software should use Domain selective IOTLB flush if the PGTT of the pasid
entry is SL only or Nested, while for the pasid entries whose PGTT is FL
only or PT using PASID-based IOTLB flush is enough.
Fixes:
2cd1311a26673 ("iommu/vt-d: Add set domain DOMAIN_ATTR_NESTING attr")
Signed-off-by: Kumar Sanjay K <sanjay.k.kumar@intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Tested-by: Yi Sun <yi.y.sun@intel.com>
Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Fenghua Yu [Tue, 17 Aug 2021 12:43:20 +0000 (20:43 +0800)]
iommu/vt-d: Fix PASID reference leak
A PASID reference is increased whenever a device is bound to an mm (and
its PASID) successfully (i.e. the device's sdev user count is increased).
But the reference is not dropped every time the device is unbound
successfully from the mm (i.e. the device's sdev user count is decreased).
The reference is dropped only once by calling intel_svm_free_pasid() when
there isn't any device bound to the mm. intel_svm_free_pasid() drops the
reference and only frees the PASID on zero reference.
Fix the issue by dropping the PASID reference and freeing the PASID when
no reference on successful unbinding the device by calling
intel_svm_free_pasid() .
Fixes:
4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers")
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/r/20210813181345.1870742-1-fenghua.yu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210817124321.1517985-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pavel Skripkin [Tue, 17 Aug 2021 16:37:23 +0000 (19:37 +0300)]
net: asix: fix uninit value bugs
Syzbot reported uninit-value in asix_mdio_read(). The problem was in
missing error handling. asix_read_cmd() should initialize passed stack
variable smsr, but it can fail in some cases. Then while condidition
checks possibly uninit smsr variable.
Since smsr is uninitialized stack variable, driver can misbehave,
because smsr will be random in case of asix_read_cmd() failure.
Fix it by adding error handling and just continue the loop instead of
checking uninit value.
Added helper function for checking Host_En bit, since wrong loop was used
in 4 functions and there is no need in copy-pasting code parts.
Cc: Robert Foss <robert.foss@collabora.com>
Fixes:
d9fe64e51114 ("net: asix: Add in_pm parameter")
Reported-by: syzbot+a631ec9e717fb0423053@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kaixi.fan [Wed, 18 Aug 2021 02:22:15 +0000 (10:22 +0800)]
ovs: clear skb->tstamp in forwarding path
fq qdisc requires tstamp to be cleared in the forwarding path. Now ovs
doesn't clear skb->tstamp. We encountered a problem with linux
version 5.4.56 and ovs version 2.14.1, and packets failed to
dequeue from qdisc when fq qdisc was attached to ovs port.
Fixes:
fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: kaixi.fan <fankaixi.li@bytedance.com>
Signed-off-by: xiexiaohui <xiexiaohui.xxh@bytedance.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Aug 2021 09:48:52 +0000 (10:48 +0100)]
Merge branch 'mdio-fixes'
Saravana Kannan says:
====================
Clean up and fix error handling in mdio_mux_init()
This patch series was started due to -EPROBE_DEFER not being handled
correctly in mdio_mux_init() and causing issues [1]. While at it, I also
did some more error handling fixes and clean ups. The -EPROBE_DEFER fix is
the last patch.
Ideally, in the last patch we'd treat any error similar to -EPROBE_DEFER
but I'm not sure if it'll break any board/platforms where some child
mdiobus never successfully registers. If we treated all errors similar to
-EPROBE_DEFER, then none of the child mdiobus will work and that might be a
regression. If people are sure this is not a real case, then I can fix up
the last patch to always fail the entire mdio-mux init if any of the child
mdiobus registration fails.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Saravana Kannan [Wed, 18 Aug 2021 03:38:03 +0000 (20:38 -0700)]
net: mdio-mux: Handle -EPROBE_DEFER correctly
When registering mdiobus children, if we get an -EPROBE_DEFER, we shouldn't
ignore it and continue registering the rest of the mdiobus children. This
would permanently prevent the deferring child mdiobus from working instead
of reattempting it in the future. So, if a child mdiobus needs to be
reattempted in the future, defer the entire mdio-mux initialization.
This fixes the issue where PHYs sitting under the mdio-mux aren't
initialized correctly if the PHY's interrupt controller is not yet ready
when the mdio-mux is being probed. Additional context in the link below.
Fixes:
0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.")
Link: https://lore.kernel.org/lkml/CAGETcx95kHrv8wA-O+-JtfH7H9biJEGJtijuPVN0V5dUKUAB3A@mail.gmail.com/#t
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saravana Kannan [Wed, 18 Aug 2021 03:38:02 +0000 (20:38 -0700)]
net: mdio-mux: Don't ignore memory allocation errors
If we are seeing memory allocation errors, don't try to continue
registering child mdiobus devices. It's unlikely they'll succeed.
Fixes:
342fa1964439 ("mdio: mux: make child bus walking more permissive and errors more verbose")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saravana Kannan [Wed, 18 Aug 2021 03:38:01 +0000 (20:38 -0700)]
net: mdio-mux: Delete unnecessary devm_kfree
The whole point of devm_* APIs is that you don't have to undo them if you
are returning an error that's going to get propagated out of a probe()
function. So delete unnecessary devm_kfree() call in the error return path.
Fixes:
b60161668199 ("mdio: mux: Correct mdio_mux_init error path issues")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 17 Aug 2021 14:52:45 +0000 (17:52 +0300)]
net: dsa: sja1105: fix use-after-free after calling of_find_compatible_node, or worse
It seems that of_find_compatible_node has a weird calling convention in
which it calls of_node_put() on the "from" node argument, instead of
leaving that up to the caller. This comes from the fact that
of_find_compatible_node with a non-NULL "from" argument it only supposed
to be used as the iterator function of for_each_compatible_node(). OF
iterator functions call of_node_get on the next OF node and of_node_put()
on the previous one.
When of_find_compatible_node calls of_node_put, it actually never
expects the refcount to drop to zero, because the call is done under the
atomic devtree_lock context, and when the refcount drops to zero it
triggers a kobject and a sysfs file deletion, which assume blocking
context.
So any driver call to of_find_compatible_node is probably buggy because
an unexpected of_node_put() takes place.
What should be done is to use the of_get_compatible_child() function.
Fixes:
5a8f09748ee7 ("net: dsa: sja1105: register the MDIO buses for 100base-T1 and 100base-TX")
Link: https://lore.kernel.org/netdev/20210814010139.kzryimmp4rizlznt@skbuf/
Suggested-by: Frank Rowand <frowand.list@gmail.com>
Suggested-by: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toke Høiland-Jørgensen [Mon, 16 Aug 2021 11:59:17 +0000 (13:59 +0200)]
sch_cake: fix srchost/dsthost hashing mode
When adding support for using the skb->hash value as the flow hash in CAKE,
I accidentally introduced a logic error that broke the host-only isolation
modes of CAKE (srchost and dsthost keywords). Specifically, the flow_hash
variable should stay initialised to 0 in cake_hash() in pure host-based
hashing mode. Add a check for this before using the skb->hash value as
flow_hash.
Fixes:
b0c19ed6088a ("sch_cake: Take advantage of skb->hash where appropriate")
Reported-by: Pete Heist <pete@heistp.net>
Tested-by: Pete Heist <pete@heistp.net>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Skeggs [Thu, 4 Mar 2021 09:53:01 +0000 (19:53 +1000)]
drm/nouveau: rip out nvkm_client.super
No longer required now that userspace can't touch anything that might
need it, and should fix DRM MM operations racing with each other, and
the random hangs/crashes that come with that.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Ben Skeggs [Thu, 4 Mar 2021 09:16:18 +0000 (19:16 +1000)]
drm/nouveau: block a bunch of classes from userspace
Long ago, there had been plans for making use of a bunch of these APIs
from userspace and there's various checks in place to stop misbehaving.
Countless other projects have occurred in the meantime, and the pieces
didn't finish falling into place for that to happen.
They will (hopefully) in the not-too-distant future, but it won't look
quite as insane. The super checks are causing problems right now, and
are going to be removed.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Ben Skeggs [Thu, 4 Mar 2021 08:52:52 +0000 (18:52 +1000)]
drm/nouveau/fifo/nv50-: rip out dma channels
I honestly don't even know why... These have never been used.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Ben Skeggs [Tue, 10 Aug 2021 09:29:57 +0000 (19:29 +1000)]
drm/nouveau/kms/nv50: workaround EFI GOP window channel format differences
Should fix some initial modeset failures on (at least) Ampere boards.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>