linux-2.6-microblaze.git
3 years agonvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a
Dmitry Monakhov [Wed, 10 Mar 2021 12:06:41 +0000 (12:06 +0000)]
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a

This adds a quirk for Samsung PM1725a drive which fixes timeouts and
I/O errors due to the fact that the controller does not properly
handle the Write Zeroes command, dmesg log:

nvme nvme0: I/O 528 QID 10 timeout, aborting
nvme nvme0: I/O 529 QID 10 timeout, aborting
nvme nvme0: I/O 530 QID 10 timeout, aborting
nvme nvme0: I/O 531 QID 10 timeout, aborting
nvme nvme0: I/O 532 QID 10 timeout, aborting
nvme nvme0: I/O 533 QID 10 timeout, aborting
nvme nvme0: I/O 534 QID 10 timeout, aborting
nvme nvme0: I/O 535 QID 10 timeout, aborting
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: Abort status: 0x0
nvme nvme0: I/O 528 QID 10 timeout, reset controller
nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
nvme nvme0: Device not ready; aborting reset, CSTS=0x3
nvme nvme0: Device not ready; aborting reset, CSTS=0x3
nvme nvme0: Removing after probe failure status: -19
nvme0n1: detected capacity change from 6251233968 to 0
blk_update_request: I/O error, dev nvme0n1, sector 32776 op 0x1:(WRITE) flags 0x3000 phys_seg 6 prio class 0
blk_update_request: I/O error, dev nvme0n1, sector 113319936 op 0x9:(WRITE_ZEROES) flags 0x800 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 1, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113319680 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 2, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113319424 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 3, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113319168 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 4, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113318912 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 5, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113318656 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
Buffer I/O error on dev nvme0n1p2, logical block 6, lost async page write
blk_update_request: I/O error, dev nvme0n1, sector 113318400 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
blk_update_request: I/O error, dev nvme0n1, sector 113318144 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
blk_update_request: I/O error, dev nvme0n1, sector 113317888 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0

Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme-rdma: Fix a use after free in nvmet_rdma_write_data_done
Lv Yunlong [Thu, 11 Mar 2021 05:44:13 +0000 (21:44 -0800)]
nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done

In nvmet_rdma_write_data_done, rsp is recoverd by wc->wr_cqe and freed by
nvmet_rdma_release_rsp(). But after that, pr_info() used the freed
chunk's member object and could leak the freed chunk address with
wc->wr_cqe by computing the offset.

Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme-core: check ctrl css before setting up zns
Chaitanya Kulkarni [Tue, 9 Mar 2021 04:58:21 +0000 (20:58 -0800)]
nvme-core: check ctrl css before setting up zns

Ensure multiple Command Sets are supported before starting to setup a
ZNS namespace.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
[hch: move the check around a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme-fc: fix racing controller reset and create association
James Smart [Tue, 9 Mar 2021 00:51:26 +0000 (16:51 -0800)]
nvme-fc: fix racing controller reset and create association

Recent patch to prevent calling __nvme_fc_abort_outstanding_ios in
interrupt context results in a possible race condition. A controller
reset results in errored io completions, which schedules error
work. The change of error work to a work element allows it to fire
after the ctrl state transition to NVME_CTRL_CONNECTING, causing
any outstanding io (used to initialize the controller) to fail and
cause problems for connect_work.

Add a state check to only schedule error work if not in the RESETTING
state.

Fixes: 19fce0470f05 ("nvme-fc: avoid calling _nvme_fc_abort_outstanding_ios from interrupt context")
Signed-off-by: Nigel Kirkland <nkirkland2304@gmail.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted
Hannes Reinecke [Fri, 26 Feb 2021 07:17:28 +0000 (08:17 +0100)]
nvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted

When a command has been aborted we should return NVME_SC_HOST_ABORTED_CMD
to be consistent with the other transports.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange()
Hannes Reinecke [Fri, 26 Feb 2021 07:17:27 +0000 (08:17 +0100)]
nvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange()

nvme_fc_terminate_exchange() is being called when exchanges are
being deleted, and as such we should be setting the NVME_REQ_CANCELLED
flag to have identical behaviour on all transports.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()
Hannes Reinecke [Fri, 26 Feb 2021 07:17:26 +0000 (08:17 +0100)]
nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()

NVME_REQ_CANCELLED is translated into -EINTR in nvme_submit_sync_cmd(),
so we should be setting this flags during nvme_cancel_request() to
ensure that the callers to nvme_submit_sync_cmd() will get the correct
error code when the controller is reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chao Leng <lengchao@huawei.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme: simplify error logic in nvme_validate_ns()
Hannes Reinecke [Fri, 26 Feb 2021 07:17:25 +0000 (08:17 +0100)]
nvme: simplify error logic in nvme_validate_ns()

We only should remove namespaces when we get fatal error back from
the device or when the namespace IDs have changed.
So instead of painfully masking out error numbers which might indicate
that the error should be ignored we could use an NVME status code
to indicated when the namespace should be removed.
That simplifies the final logic and makes it less error-prone.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agonvme: set max_zone_append_sectors nvme_revalidate_zones
Chaitanya Kulkarni [Wed, 3 Mar 2021 22:47:17 +0000 (14:47 -0800)]
nvme: set max_zone_append_sectors nvme_revalidate_zones

The chunk_sectors value affects max_zone_append_sectors.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Tested-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
3 years agomedia: rkisp1: params: fix wrong bits settings
Dafna Hirschfeld [Mon, 1 Mar 2021 17:18:35 +0000 (18:18 +0100)]
media: rkisp1: params: fix wrong bits settings

The histogram mode is set using 'rkisp1_params_set_bits'.
Only the bits of the mode should be the value argument for
that function. Otherwise bits outside the mode mask are
turned on which is not what was intended.

Fixes: bae1155cf579 ("media: staging: rkisp1: add output device for parameters")
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
3 years agomedia: v4l: vsp1: Fix uif null pointer access
Biju Das [Mon, 1 Mar 2021 12:08:28 +0000 (13:08 +0100)]
media: v4l: vsp1: Fix uif null pointer access

RZ/G2L SoC has no UIF. This patch fixes null pointer access, when UIF
module is not used.

Fixes: 5e824f989e6e8("media: v4l: vsp1: Integrate DISCOM in display pipeline")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
3 years agomedia: v4l: vsp1: Fix bru null pointer access
Biju Das [Mon, 1 Mar 2021 12:08:27 +0000 (13:08 +0100)]
media: v4l: vsp1: Fix bru null pointer access

RZ/G2L SoC has only BRS. This patch fixes null pointer access,when only
BRS is enabled.

Fixes: cbb7fa49c7466("media: v4l: vsp1: Rename BRU to BRx")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
3 years agomedia: usbtv: Fix deadlock on suspend
Maxim Mikityanskiy [Fri, 5 Feb 2021 22:51:39 +0000 (23:51 +0100)]
media: usbtv: Fix deadlock on suspend

usbtv doesn't support power management, so on system suspend the
.disconnect callback of the driver is called. The teardown sequence
includes a call to snd_card_free. Its implementation waits until the
refcount of the sound card device drops to zero, however, if its file is
open, snd_card_file_add takes a reference, which can't be dropped during
the suspend, because the userspace processes are already frozen at this
point. snd_card_free waits for completion forever, leading to a hang on
suspend.

This commit fixes this deadlock condition by replacing snd_card_free
with snd_card_free_when_closed, that doesn't wait until all references
are released, allowing suspend to progress.

Fixes: 63ddf68de52e ("[media] usbtv: add audio support")
Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
3 years agomedia: rc: compile rc-cec.c into rc-core
Hans Verkuil [Fri, 26 Feb 2021 10:37:47 +0000 (11:37 +0100)]
media: rc: compile rc-cec.c into rc-core

The rc-cec keymap is unusual in that it can't be built as a module,
instead it is registered directly in rc-main.c if CONFIG_MEDIA_CEC_RC
is set. This is because it can be called from drm_dp_cec_set_edid() via
cec_register_adapter() in an asynchronous context, and it is not
allowed to use request_module() to load rc-cec.ko in that case. Trying to
do so results in a 'WARN_ON_ONCE(wait && current_is_async())'.

Since this keymap is only used if CONFIG_MEDIA_CEC_RC is set, we
just compile this keymap into the rc-core module and never as a
separate module.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 2c6d1fffa1d9 (drm: add support for DisplayPort CEC-Tunneling-over-AUX)
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
3 years agodrm/compat: Clear bounce structures
Daniel Vetter [Mon, 22 Feb 2021 10:06:43 +0000 (11:06 +0100)]
drm/compat: Clear bounce structures

Some of them have gaps, or fields we don't clear. Native ioctl code
does full copies plus zero-extends on size mismatch, so nothing can
leak. But compat is more hand-rolled so need to be careful.

None of these matter for performance, so just memset.

Also I didn't fix up the CONFIG_DRM_LEGACY or CONFIG_DRM_AGP ioctl, those
are security holes anyway.

Acked-by: Maxime Ripard <mripard@kernel.org>
Reported-by: syzbot+620cf21140fc7e772a5d@syzkaller.appspotmail.com # vblank ioctl
Cc: syzbot+620cf21140fc7e772a5d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210222100643.400935-1-daniel.vetter@ffwll.ch
(cherry picked from commit e926c474ebee404441c838d18224cd6f246a71b7)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/shmem-helpers: vunmap: Don't put pages for dma-buf
Noralf Trønnes [Fri, 19 Feb 2021 12:22:03 +0000 (13:22 +0100)]
drm/shmem-helpers: vunmap: Don't put pages for dma-buf

dma-buf importing was reworked in commit 7d2cd72a9aa3
("drm/shmem-helpers: Simplify dma-buf importing"). Before that commit
drm_gem_shmem_prime_import_sg_table() did set ->pages_use_count=1 and
drm_gem_shmem_vunmap_locked() could call drm_gem_shmem_put_pages()
unconditionally. Now without the use count set, put pages is called also
on dma-bufs. Fix this by only putting pages if it's not imported.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Fixes: 7d2cd72a9aa3 ("drm/shmem-helpers: Simplify dma-buf importing")
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Tested-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210219122203.51130-1-noralf@tronnes.org
(cherry picked from commit cdea72518a2b38207146e92e1c9e2fac15975679)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm: meson_drv add shutdown function
Artem Lapkin [Tue, 2 Mar 2021 04:22:02 +0000 (12:22 +0800)]
drm: meson_drv add shutdown function

Problem: random stucks on reboot stage about 1/20 stuck/reboots
// debug kernel log
[    4.496660] reboot: kernel restart prepare CMD:(null)
[    4.498114] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown begin
[    4.503949] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown domain 0:VPU...
...STUCK...

Solution: add shutdown function to meson_drm driver
// debug kernel log
[    5.231896] reboot: kernel restart prepare CMD:(null)
[    5.246135] [drm:meson_drv_shutdown]
...
[    5.259271] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown begin
[    5.274688] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown domain 0:VPU...
[    5.338331] reboot: Restarting system
[    5.358293] psci: PSCI_0_2_FN_SYSTEM_RESET reboot_mode:0 cmd:(null)
bl31 reboot reason: 0xd
bl31 reboot reason: 0x0
system cmd  1.
...REBOOT...

Tested: on VIM1 VIM2 VIM3 VIM3L khadas sbcs - 1000+ successful reboots
and Odroid boards, WeTek Play2 (GXBB)

Fixes: bbbe775ec5b5 ("drm: Add support for Amlogic Meson Graphic Controller")
Signed-off-by: Artem Lapkin <art@khadas.com>
Tested-by: Christian Hewitt <christianshewitt@gmail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210302042202.3728113-1-art@khadas.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/shmem-helper: Don't remove the offset in vm_area_struct pgoff
Neil Roberts [Tue, 23 Feb 2021 15:51:25 +0000 (16:51 +0100)]
drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff

When mmapping the shmem, it would previously adjust the pgoff in the
vm_area_struct to remove the fake offset that is added to be able to
identify the buffer. This patch removes the adjustment and makes the
fault handler use the vm_fault address to calculate the page offset
instead. Although using this address is apparently discouraged, several
DRM drivers seem to be doing it anyway.

The problem with removing the pgoff is that it prevents
drm_vma_node_unmap from working because that searches the mapping tree
by address. That doesn't work because all of the mappings are at offset
0. drm_vma_node_unmap is being used by the shmem helpers when purging
the buffer.

This fixes a bug in Panfrost which is using drm_gem_shmem_purge. Without
this the mapping for the purged buffer can still be accessed which might
mean it would access random pages from other buffers

v2: Don't check whether the unsigned page_offset is less than 0.

Cc: stable@vger.kernel.org
Fixes: 17acb9f35ed7 ("drm/shmem: Add madvise state and purge helpers")
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210223155125.199577-3-nroberts@igalia.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/shmem-helper: Check for purged buffers in fault handler
Neil Roberts [Tue, 23 Feb 2021 15:51:24 +0000 (16:51 +0100)]
drm/shmem-helper: Check for purged buffers in fault handler

When a buffer is madvised as not needed and then purged, any attempts to
access the buffer from user-space should cause a bus fault. This patch
adds a check for that.

Cc: stable@vger.kernel.org
Fixes: 17acb9f35ed7 ("drm/shmem: Add madvise state and purge helpers")
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210223155125.199577-2-nroberts@igalia.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agoqxl: Fix uninitialised struct field head.surface_id
Colin Ian King [Thu, 4 Mar 2021 09:49:28 +0000 (09:49 +0000)]
qxl: Fix uninitialised struct field head.surface_id

The surface_id struct field in head is not being initialized and
static analysis warns that this is being passed through to
dev->monitors_config->heads[i] on an assignment. Clear up this
warning by initializing it to zero.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: a6d3c4d79822 ("qxl: hook monitors_config updates into crtc, not encoder.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210304094928.2280722-1-colin.king@canonical.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/ttm: Fix TTM page pool accounting
Anthony DeRossi [Wed, 3 Mar 2021 01:17:25 +0000 (17:17 -0800)]
drm/ttm: Fix TTM page pool accounting

Freed pages are not subtracted from the allocated_pages counter in
ttm_pool_type_fini(), causing a leak in the count on device removal.
The next shrinker invocation loops forever trying to free pages that are
no longer in the pool:

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu:  3-....: (9998 ticks this GP) idle=54e/1/0x4000000000000000 softirq=434857/434857 fqs=2237
    (t=10001 jiffies g=2194533 q=49211)
  NMI backtrace for cpu 3
  CPU: 3 PID: 1034 Comm: kswapd0 Tainted: P           O      5.11.0-com #1
  Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019
  Call Trace:
   <IRQ>
   ...
   </IRQ>
   sysvec_apic_timer_interrupt+0x77/0x80
   asm_sysvec_apic_timer_interrupt+0x12/0x20
  RIP: 0010:mutex_unlock+0x16/0x20
  Code: e7 48 8b 70 10 e8 7a 53 77 ff eb aa e8 43 6c ff ff 0f 1f 00 65 48 8b 14 25 00 6d 01 00 31 c9 48 89 d0 f0 48 0f b1 0f 48 39 c2 <74> 05 e9 e3 fe ff ff c3 66 90 48 8b 47 20 48 85 c0 74 0f 8b 50 10
  RSP: 0018:ffffbdb840797be8 EFLAGS: 00000246
  RAX: ffff9ff445a41c00 RBX: ffffffffc02a9ef8 RCX: 0000000000000000
  RDX: ffff9ff445a41c00 RSI: ffffbdb840797c78 RDI: ffffffffc02a9ac0
  RBP: 0000000000000080 R08: 0000000000000000 R09: ffffbdb840797c80
  R10: 0000000000000000 R11: fffffffffffffff5 R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000000000084 R15: ffffffffc02a9a60
   ttm_pool_shrink+0x7d/0x90 [ttm]
   ttm_pool_shrinker_scan+0x5/0x20 [ttm]
   do_shrink_slab+0x13a/0x1a0
...

debugfs shows the incorrect total:

  $ cat /sys/kernel/debug/dri/0/ttm_page_pool
            --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10---
  wc      :        0        0        0        0        0        0        0        0        0        0        0
  uc      :        0        0        0        0        0        0        0        0        0        0        0
  wc 32   :        0        0        0        0        0        0        0        0        0        0        0
  uc 32   :        0        0        0        0        0        0        0        0        0        0        0
  DMA uc  :        0        0        0        0        0        0        0        0        0        0        0
  DMA wc  :        0        0        0        0        0        0        0        0        0        0        0
  DMA     :        0        0        0        0        0        0        0        0        0        0        0

  total   :     3029 of  8244261

Using ttm_pool_type_take() to remove pages from the pool before freeing
them correctly accounts for the freed pages.

Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3")
Signed-off-by: Anthony DeRossi <ajderossi@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303011723.22512-1-ajderossi@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/ttm: soften TTM warnings
Christian König [Wed, 3 Mar 2021 14:47:14 +0000 (15:47 +0100)]
drm/ttm: soften TTM warnings

QXL indeed unrefs pinned BOs and the warnings are spamming peoples log files.

Make sure we warn only once until the QXL driver is fixed.

Signed-off-by: Christian König <christian.koenig@amd.com>
References: https://lore.kernel.org/lkml/YD+eYcMMcdlXB8PY@alley/
Link: https://patchwork.freedesktop.org/patch/422834/
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm: Use USB controller's DMA mask when importing dmabufs
Thomas Zimmermann [Wed, 3 Mar 2021 13:32:29 +0000 (14:32 +0100)]
drm: Use USB controller's DMA mask when importing dmabufs

USB devices cannot perform DMA and hence have no dma_mask set in their
device structure. Therefore importing dmabuf into a USB-based driver
fails, which breaks joining and mirroring of display in X11.

For USB devices, pick the associated USB controller as attachment device.
This allows the DRM import helpers to perform the DMA setup. If the DMA
controller does not support DMA transfers, we're out of luck and cannot
import. Our current USB-based DRM drivers don't use DMA, so the actual
DMA device is not important.

Tested by joining/mirroring displays of udl and radeon under Gnome/X11.

v8:
* release dmadev if device initialization fails (Noralf)
* fix commit description (Noralf)
v7:
* fix use-before-init bug in gm12u320 (Dan)
v6:
* implement workaround in DRM drivers and hold reference to
  DMA device while USB device is in use
* remove dev_is_usb() (Greg)
* collapse USB helper into usb_intf_get_dma_device() (Alan)
* integrate Daniel's TODO statement (Daniel)
* fix typos (Greg)
v5:
* provide a helper for USB interfaces (Alan)
* add FIXME item to documentation and TODO list (Daniel)
v4:
* implement workaround with USB helper functions (Greg)
* use struct usb_device->bus->sysdev as DMA device (Takashi)
v3:
* drop gem_create_object
* use DMA mask of USB controller, if any (Daniel, Christian, Noralf)
v2:
* move fix to importer side (Christian, Daniel)
* update SHMEM and CMA helpers for new PRIME callbacks

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 6eb0233ec2d0 ("usb: don't inherity DMA properties for USB devices")
Tested-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210303133229.3288-1-tzimmermann@suse.de
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agoMAINTAINERS: update drm bug reporting URL
Pavel Turinský [Sun, 28 Feb 2021 16:36:58 +0000 (17:36 +0100)]
MAINTAINERS: update drm bug reporting URL

The original bugzilla seems to be read-only now, linking to the gitlab
for new bugs.

Signed-off-by: Pavel Turinský <ledoian@kam.mff.cuni.cz>
Cc: trivial@kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210228163658.54962-1-ledoian@kam.mff.cuni.cz
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agofbdev: atyfb: use LCD management functions for PPC_PMAC also
Randy Dunlap [Fri, 26 Feb 2021 17:30:08 +0000 (09:30 -0800)]
fbdev: atyfb: use LCD management functions for PPC_PMAC also

Include PPC_PMAC in the configs that use aty_ld_lcd() and
aty_st_lcd() implementations so that the PM code may work
correctly for PPC_PMAC.

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-fbdev@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210226173008.18236-1-rdunlap@infradead.org
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agofbdev: atyfb: always declare aty_{ld,st}_lcd()
Randy Dunlap [Wed, 24 Feb 2021 21:55:28 +0000 (13:55 -0800)]
fbdev: atyfb: always declare aty_{ld,st}_lcd()

The previously added stubs for aty_{ld,}st_lcd() make it
so that these functions are used regardless of the config
options that were guarding them, so remove the #ifdef/#endif
lines and make their declarations always visible.
This fixes build warnings that were reported by clang:

   drivers/video/fbdev/aty/atyfb_base.c:180:6: warning: no previous prototype for function 'aty_st_lcd' [-Wmissing-prototypes]
   void aty_st_lcd(int index, u32 val, const struct atyfb_par *par)
        ^
   drivers/video/fbdev/aty/atyfb_base.c:180:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void aty_st_lcd(int index, u32 val, const struct atyfb_par *par)

   drivers/video/fbdev/aty/atyfb_base.c:183:5: warning: no previous prototype for function 'aty_ld_lcd' [-Wmissing-prototypes]
   u32 aty_ld_lcd(int index, const struct atyfb_par *par)
       ^
   drivers/video/fbdev/aty/atyfb_base.c:183:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   u32 aty_ld_lcd(int index, const struct atyfb_par *par)

They should not be marked as static since they are used in
mach64_ct.c.

Fixes: bfa5782b9caa ("fbdev: atyfb: add stubs for aty_{ld,st}_lcd()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: linux-fbdev@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210224215528.822-1-rdunlap@infradead.org
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/qxl: fix lockdep issue in qxl_alloc_release_reserved
Gerd Hoffmann [Wed, 17 Feb 2021 12:32:06 +0000 (13:32 +0100)]
drm/qxl: fix lockdep issue in qxl_alloc_release_reserved

Call qxl_bo_unpin (which does a reservation) without holding the
release_mutex lock.  Fixes lockdep (correctly) warning on a possible
deadlock.

Fixes: e8dd3506dcf3 ("drm/qxl: unpin release objects")
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20210217123213.2199186-5-kraxel@redhat.com
(cherry picked from commit 19089b760e56c97458c272e90e43da761b05cf12)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/qxl: unpin release objects
Gerd Hoffmann [Thu, 4 Feb 2021 14:57:05 +0000 (15:57 +0100)]
drm/qxl: unpin release objects

Balances the qxl_create_bo(..., pinned=true, ...);
call in qxl_release_bo_alloc().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20210204145712.1531203-5-kraxel@redhat.com
(cherry picked from commit 65ffea3c6e738f37bb15ff3ee480415c793df893)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agodrm/fb-helper: only unmap if buffer not null
Tong Zhang [Sun, 28 Feb 2021 04:46:25 +0000 (23:46 -0500)]
drm/fb-helper: only unmap if buffer not null

drm_fbdev_cleanup() can be called when fb_helper->buffer is null, hence
fb_helper->buffer should be checked before calling
drm_client_buffer_vunmap(). This buffer is also checked in
drm_client_framebuffer_delete(), so we should also do the same thing for
drm_client_buffer_vunmap().

[  199.128742] RIP: 0010:drm_client_buffer_vunmap+0xd/0x20
[  199.129031] Code: 43 18 48 8b 53 20 49 89 45 00 49 89 55 08 5b 44 89 e0 41 5c 41 5d 41 5e 5d
c3 0f 1f 00 53 48 89 fb 48 8d 7f 10 e8 73 7d a1 ff <48> 8b 7b 10 48 8d 73 18 5b e9 75 53 fc ff 0
f 1f 44 00 00 48 b8 00
[  199.130041] RSP: 0018:ffff888103f3fc88 EFLAGS: 00010282
[  199.130329] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff8214d46d
[  199.130733] RDX: 1ffffffff079c6b9 RSI: 0000000000000246 RDI: ffffffff83ce35c8
[  199.131119] RBP: ffff888103d25458 R08: 0000000000000001 R09: fffffbfff0791761
[  199.131505] R10: ffffffff83c8bb07 R11: fffffbfff0791760 R12: 0000000000000000
[  199.131891] R13: ffff888103d25468 R14: ffff888103d25418 R15: ffff888103f18120
[  199.132277] FS:  00007f36fdcbb6a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000
[  199.132721] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  199.133033] CR2: 0000000000000010 CR3: 0000000103d26000 CR4: 00000000000006f0
[  199.133420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  199.133807] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  199.134195] Call Trace:
[  199.134333]  drm_fbdev_cleanup+0x179/0x1a0
[  199.134562]  drm_fbdev_client_unregister+0x2b/0x40
[  199.134828]  drm_client_dev_unregister+0xa8/0x180
[  199.135088]  drm_dev_unregister+0x61/0x110
[  199.135315]  mgag200_pci_remove+0x38/0x52 [mgag200]
[  199.135586]  pci_device_remove+0x62/0xe0
[  199.135806]  device_release_driver_internal+0x148/0x270
[  199.136094]  driver_detach+0x76/0xe0
[  199.136294]  bus_remove_driver+0x7e/0x100
[  199.136521]  pci_unregister_driver+0x28/0xf0
[  199.136759]  __x64_sys_delete_module+0x268/0x300
[  199.137016]  ? __ia32_sys_delete_module+0x300/0x300
[  199.137285]  ? call_rcu+0x3e4/0x580
[  199.137481]  ? fpregs_assert_state_consistent+0x4d/0x60
[  199.137767]  ? exit_to_user_mode_prepare+0x2f/0x130
[  199.138037]  do_syscall_64+0x33/0x40
[  199.138237]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  199.138517] RIP: 0033:0x7f36fdc3dcf7

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Fixes: 763aea17bf57 ("drm/fb-helper: Unmap client buffer during shutdown")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.11+
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210228044625.171151-1-ztong0001@gmail.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
3 years agokbuild: remove meaningless parameter to $(call if_changed_rule,dtc)
Masahiro Yamada [Thu, 11 Mar 2021 06:30:54 +0000 (15:30 +0900)]
kbuild: remove meaningless parameter to $(call if_changed_rule,dtc)

This is a remnant of commit 78046fabe6e7 ("kbuild: determine the output
format of DTC by the target suffix").

The parameter "yaml" is meaningless because cmd_dtc no loner takes $(2).

Reported-by: Rob Herring <robh@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agoMerge tag 'usb-serial-5.12-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Thu, 11 Mar 2021 08:53:34 +0000 (09:53 +0100)]
Merge tag 'usb-serial-5.12-rc3' of https://git./linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for 5.12-rc3

Here's a fix for a long-standing memory leak after probe failure in
io_edgeport and a fix for a NULL-deref on disconnect in the new xr
driver.

Included are also some new device ids.

All have been in linux-next with no reported issues.

* tag 'usb-serial-5.12-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: io_edgeport: fix memory leak in edge_startup
  USB: serial: ch341: add new Product ID
  USB: serial: xr: fix NULL-deref on disconnect
  USB: serial: cp210x: add some more GE USB IDs
  USB: serial: cp210x: add ID for Acuity Brands nLight Air Adapter

3 years agokbuild: remove LLVM=1 test from HAS_LTO_CLANG
Masahiro Yamada [Wed, 10 Mar 2021 13:54:22 +0000 (22:54 +0900)]
kbuild: remove LLVM=1 test from HAS_LTO_CLANG

As Documentation/kbuild/llvm.rst notes, LLVM=1 switches the default of
tools, but you can still override CC, LD, etc. individually. This LLVM=1
check is unneeded because each tool is already checked separately.

"make CC=clang LD=ld.lld NM=llvm-nm AR=llvm-ar LLVM_IAS=1 menuconfig"
should be able to enable Clang LTO.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
3 years agokbuild: remove unneeded -O option to dtc
Masahiro Yamada [Wed, 10 Mar 2021 11:08:24 +0000 (20:08 +0900)]
kbuild: remove unneeded -O option to dtc

This piece of code converts the target suffix to the dtc -O option:

    *.dtb      ->  -O dtb
    *.dt.yaml  ->  -O yaml

Commit ce88c9c79455 ("kbuild: Add support to build overlays (%.dtbo)")
added the third case:

    *.dtbo     ->  -O dtbo

This works thanks to commit 163f0469bf2e ("dtc: Allow overlays to have
.dtbo extension") in the upstream DTC, which has already been pulled in
the kernel.

However, I think it is a bit odd because "dtbo" is not a format name.
At least, it does not show up in the help message of dtc.

$ scripts/dtc/dtc --help
  [ snip ]
  -O, --out-format <arg>
        Output formats are:
                dts - device tree source text
                dtb - device tree blob
                yaml - device tree encoded as YAML
                asm - assembler source

So, I am not a big fan of the second hunk of that change:

        } else if (streq(outform, "dtbo")) {
                dt_to_blob(outf, dti, outversion);

Anyway, we did not need to do this in Makefile in the first place.

guess_type_by_name() had already understood ".yaml" before commit
4f0e3a57d6eb ("kbuild: Add support for DT binding schema checks"),
and now does ".dtbo" as well.

Makefile does not need to duplicate the same logic. Let's leave it
to dtc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
3 years agokbuild: dummy-tools: adjust to scripts/cc-version.sh
Masahiro Yamada [Tue, 9 Mar 2021 16:25:45 +0000 (01:25 +0900)]
kbuild: dummy-tools: adjust to scripts/cc-version.sh

Commit aec6c60a01d3 ("kbuild: check the minimum compiler version in
Kconfig") changed how the script detects the compiler version.

Get 'make CROSS_COMPILE=scripts/dummy-tools/' back working again.

Fixes: aec6c60a01d3 ("kbuild: check the minimum compiler version in Kconfig")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
3 years agokbuild: Allow LTO to be selected with KASAN_HW_TAGS
Sami Tolvanen [Mon, 8 Mar 2021 18:46:56 +0000 (10:46 -0800)]
kbuild: Allow LTO to be selected with KASAN_HW_TAGS

While LTO with KASAN is normally not useful, hardware tag-based KASAN
can be used also in production kernels with ARM64_MTE. Therefore, allow
KASAN_HW_TAGS to be selected together with HAS_LTO_CLANG.

Reported-by: Alistair Delva <adelva@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agokbuild: dummy-tools: support MPROFILE_KERNEL checks for ppc
Jiri Slaby [Mon, 8 Mar 2021 06:28:20 +0000 (07:28 +0100)]
kbuild: dummy-tools: support MPROFILE_KERNEL checks for ppc

ppc64le checks for -mprofile-kernel to define MPROFILE_KERNEL Kconfig.
Kconfig calls arch/powerpc/tools/gcc-check-mprofile-kernel.sh for that
purpose. This script performs two checks:
1) build with -mprofile-kernel should contain "_mcount"
2) build with -mprofile-kernel with a function marked as "notrace"
   should not produce "_mcount"

So support this in dummy-tools' gcc, so that we have MPROFILE_KERNEL
always true.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agokbuild: rebuild GCC plugins when the compiler is upgraded
Masahiro Yamada [Thu, 4 Mar 2021 11:37:08 +0000 (20:37 +0900)]
kbuild: rebuild GCC plugins when the compiler is upgraded

Linus reported a build error due to the GCC plugin incompatibility
when the compiler is upgraded. [1]

GCC plugins are tied to a particular GCC version. So, they must be
rebuilt when the compiler is upgraded.

This seems to be a long-standing flaw since the initial support of
GCC plugins.

Extend commit 8b59cd81dc5e ("kbuild: ensure full rebuild when the
compiler is updated"), so that GCC plugins are covered by the
compiler upgrade detection.

[1]: https://lore.kernel.org/lkml/CAHk-=wieoN5ttOy7SnsGwZv+Fni3R6m-Ut=oxih6bbZ28G+4dw@mail.gmail.com/

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
3 years agoXen/gntdev: don't needlessly use kvcalloc()
Jan Beulich [Wed, 10 Mar 2021 10:46:13 +0000 (11:46 +0100)]
Xen/gntdev: don't needlessly use kvcalloc()

Requesting zeroed memory when all of it will be overwritten subsequently
by all ones is a waste of processing bandwidth. In fact, rather than
recording zeroed ->grants[], fill that array too with more appropriate
"invalid" indicators.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/9a726be2-4893-8ffe-0ef1-b70dd1c229b1@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
3 years agoXen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF}
Jan Beulich [Wed, 10 Mar 2021 10:45:26 +0000 (11:45 +0100)]
Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF}

It's not helpful if every driver has to cook its own. Generalize
xenbus'es INVALID_GRANT_HANDLE and pcifront's INVALID_GRANT_REF (which
shouldn't have expanded to zero to begin with). Use the constants in
p2m.c and gntdev.c right away, and update field types where necessary so
they would match with the constants' types (albeit without touching
struct ioctl_gntdev_grant_ref's ref field, as that's part of the public
interface of the kernel and would require introducing a dependency on
Xen's grant_table.h public header).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/db7c38a5-0d75-d5d1-19de-e5fe9f0b9c48@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
3 years agoXen/gntdev: don't needlessly allocate k{,un}map_ops[]
Jan Beulich [Wed, 10 Mar 2021 10:45:00 +0000 (11:45 +0100)]
Xen/gntdev: don't needlessly allocate k{,un}map_ops[]

They're needed only in the not-auto-translate (i.e. PV) case; there's no
point in allocating memory that's never going to get accessed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/180d50cb-5531-8952-4bf0-d65c554638ed@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
3 years agoXen: drop exports of {set,clear}_foreign_p2m_mapping()
Jan Beulich [Tue, 9 Mar 2021 17:00:44 +0000 (18:00 +0100)]
Xen: drop exports of {set,clear}_foreign_p2m_mapping()

They're only used internally, and the layering violation they contain
(x86) or imply (Arm) of calling HYPERVISOR_grant_table_op() strongly
advise against any (uncontrolled) use from a module. The functions also
never had users except the ones from drivers/xen/grant-table.c forever
since their introduction in 3.15.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/746a5cd6-1446-eda4-8b23-03c1cac30b8d@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
3 years agoxen/events: avoid handling the same event on two cpus at the same time
Juergen Gross [Sat, 6 Mar 2021 16:18:33 +0000 (17:18 +0100)]
xen/events: avoid handling the same event on two cpus at the same time

When changing the cpu affinity of an event it can happen today that
(with some unlucky timing) the same event will be handled on the old
and the new cpu at the same time.

Avoid that by adding an "event active" flag to the per-event data and
call the handler only if this flag isn't set.

Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
3 years agoxen/events: don't unmask an event channel when an eoi is pending
Juergen Gross [Sat, 6 Mar 2021 16:18:32 +0000 (17:18 +0100)]
xen/events: don't unmask an event channel when an eoi is pending

An event channel should be kept masked when an eoi is pending for it.
When being migrated to another cpu it might be unmasked, though.

In order to avoid this keep three different flags for each event channel
to be able to distinguish "normal" masking/unmasking from eoi related
masking/unmasking and temporary masking. The event channel should only
be able to generate an interrupt if all flags are cleared.

Cc: stable@vger.kernel.org
Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework")
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Link: https://lore.kernel.org/r/20210306161833.4552-3-jgross@suse.com
[boris -- corrected Fixed tag format]

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
3 years agodrm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m
Alex Deucher [Wed, 10 Mar 2021 03:58:47 +0000 (22:58 -0500)]
drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m

Need to check the module variant as well.

Acked-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/radeon: fix AGP dependency
Christian König [Mon, 8 Mar 2021 18:22:13 +0000 (19:22 +0100)]
drm/radeon: fix AGP dependency

When AGP is compiled as module radeon must be compiled as module as
well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_table
Christian König [Mon, 8 Mar 2021 18:35:14 +0000 (19:35 +0100)]
drm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_table

Otherwise we will run into a NULL ptr deref.

Signed-off-by: Christian König <christian.koenig@amd.com>
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=212137
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.11.x
3 years agodrm/amd/pm: correct the watermark settings for Polaris
Evan Quan [Fri, 5 Mar 2021 06:21:26 +0000 (14:21 +0800)]
drm/amd/pm: correct the watermark settings for Polaris

The "/ 10" should be applied to the right-hand operand instead of
the left-hand one.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Noticed-by: Georgios Toptsidis <gtoptsid@gmail.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amd/pm: bug fix for pcie dpm
Kenneth Feng [Tue, 9 Mar 2021 13:10:16 +0000 (21:10 +0800)]
drm/amd/pm: bug fix for pcie dpm

Currently the pcie dpm has two problems.
1. Only the high dpm level speed/width can be overrided
if the requested values are out of the pcie capability.
2. The high dpm level is always overrided though sometimes
it's not necesarry.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amdgpu: fb BO should be ttm_bo_type_device
Nirmoy Das [Mon, 8 Mar 2021 14:22:22 +0000 (15:22 +0100)]
drm/amdgpu: fb BO should be ttm_bo_type_device

FB BO should not be ttm_bo_type_kernel type and
amdgpufb_create_pinned_object() pins the FB BO anyway.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/display: Use wm_table.entries for dcn301 calculate_wm
Zhan Liu [Tue, 9 Mar 2021 01:28:22 +0000 (20:28 -0500)]
drm/amdgpu/display: Use wm_table.entries for dcn301 calculate_wm

[Why]
For DGPU Navi, the wm_table.nv_entries are used. These entires are not
populated for DCN301 Vangogh APU, but instead wm_table.entries are.

[How]
Use DCN21 Renoir style wm calculations.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Enabled pipe harvesting in dcn30
Dillon Varone [Fri, 19 Feb 2021 23:15:30 +0000 (18:15 -0500)]
drm/amd/display: Enabled pipe harvesting in dcn30

[Why & How]
Ported logic from dcn21 for reading in pipe fusing to dcn30.
Supported configurations are 1 and 6 pipes. Invalid fusing
will revert to 1 pipe being enabled.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Revert dram_clock_change_latency for DCN2.1
Sung Lee [Fri, 26 Feb 2021 18:20:43 +0000 (13:20 -0500)]
drm/amd/display: Revert dram_clock_change_latency for DCN2.1

[WHY & HOW]
Using values provided by DF for latency may cause hangs in
multi display configurations. Revert change to previous value.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Haonan Wang <Haonan.Wang2@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agoMerge tag 's390-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Wed, 10 Mar 2021 21:15:16 +0000 (13:15 -0800)]
Merge tag 's390-5.12-3' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

 - fix various user space visible copy_to_user() instances which return
   the number of bytes left to copy instead of -EFAULT

 - make TMPFS_INODE64 available again for s390 and alpha, now that both
   architectures have been switched to 64-bit ino_t (see commit
   96c0a6a72d18: "s390,alpha: switch to 64-bit ino_t")

 - make sure to release a shared hypervisor resource within the zcore
   device driver also on restart and power down; also remove unneeded
   surrounding debugfs_create return value checks

 - for the new hardware counter set device driver rename the uapi header
   file to be a bit more generic; also remove 60 second read limit which
   is not really necessary and without the limit the interface can be
   easier tested

 - some small cleanups, the largest being to convert all long long in
   our time and idle code to longs

 - update defconfigs

* tag 's390-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: remove IBM_PARTITION and CONFIGFS_FS from zfcpdump defconfig
  s390: update defconfigs
  s390,alpha: make TMPFS_INODE64 available again
  s390/cio: return -EFAULT if copy_to_user() fails
  s390/tty3270: avoid comma separated statements
  s390/cpumf: remove unneeded semicolon
  s390/crypto: return -EFAULT if copy_to_user() fails
  s390/cio: return -EFAULT if copy_to_user() fails
  s390/cpumf: rename header file to hwctrset.h
  s390/zcore: release dump save area on restart or power down
  s390/zcore: no need to check return value of debugfs_create functions
  s390/cpumf: remove 60 seconds read limit
  s390/topology: remove always false if check
  s390/time,idle: get rid of unsigned long long

3 years agodrm/amd/display: Enable pflip interrupt upon pipe enable
Qingqing Zhuo [Fri, 19 Feb 2021 22:17:50 +0000 (17:17 -0500)]
drm/amd/display: Enable pflip interrupt upon pipe enable

[Why]
pflip interrupt would not be enabled promptly if a pipe is disabled
and re-enabled, causing flip_done timeout error during DP
compliance tests

[How]
Enable pflip interrupt upon pipe enablement

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu/display: use GFP_ATOMIC in dcn21_validate_bandwidth_fp()
Holger Hoffstätte [Fri, 5 Mar 2021 14:23:18 +0000 (15:23 +0100)]
drm/amdgpu/display: use GFP_ATOMIC in dcn21_validate_bandwidth_fp()

After fixing nested FPU contexts caused by 41401ac67791 we're still seeing
complaints about spurious kernel_fpu_end(). As it turns out this was
already fixed for dcn20 in commit f41ed88cbd ("drm/amdgpu/display:
use GFP_ATOMIC in dcn20_validate_bandwidth_internal") but never moved
forward to dcn21.

Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amd/display: Fix nested FPU context in dcn21_validate_bandwidth()
Holger Hoffstätte [Fri, 5 Mar 2021 11:39:21 +0000 (12:39 +0100)]
drm/amd/display: Fix nested FPU context in dcn21_validate_bandwidth()

Commit 41401ac67791 added FPU wrappers to dcn21_validate_bandwidth(),
which was correct. Unfortunately a nested function alredy contained
DC_FP_START()/DC_FP_END() calls, which results in nested FPU context
enter/exit and complaints by kernel_fpu_begin_mask().
This can be observed e.g. with 5.10.20, which backported 41401ac67791
and now emits the following warning on boot:

WARNING: CPU: 6 PID: 858 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa5/0xc0
Call Trace:
 dcn21_calculate_wm+0x47/0xa90 [amdgpu]
 dcn21_validate_bandwidth_fp+0x15d/0x2b0 [amdgpu]
 dcn21_validate_bandwidth+0x29/0x40 [amdgpu]
 dc_validate_global_state+0x3c7/0x4c0 [amdgpu]

The warning is emitted due to the additional DC_FP_START/END calls in
patch_bounding_box(), which is inlined into dcn21_calculate_wm(),
its only caller. Removing the calls brings the code in line with
dcn20 and makes the warning disappear.

Fixes: 41401ac67791 ("drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()")
Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amd/display: Add a backlight module option
Takashi Iwai [Wed, 3 Feb 2021 12:42:41 +0000 (13:42 +0100)]
drm/amd/display: Add a backlight module option

There seem devices that don't work with the aux channel backlight
control.  For allowing such users to test with the other backlight
control method, provide a new module option, aux_backlight, to specify
enabling or disabling the aux backport support explicitly.  As
default, the aux support is detected by the hardware capability.

v2: make the backlight option generic in case we add future
backlight types (Alex)

BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amdgpu/display: handle aux backlight in backlight_get_brightness
Alex Deucher [Thu, 10 Dec 2020 06:45:12 +0000 (01:45 -0500)]
drm/amdgpu/display: handle aux backlight in backlight_get_brightness

Need to fetch it via aux.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amdgpu/display: don't assert in set backlight function
Alex Deucher [Thu, 10 Dec 2020 06:20:08 +0000 (01:20 -0500)]
drm/amdgpu/display: don't assert in set backlight function

It just spams the logs.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amdgpu/display: simplify backlight setting
Alex Deucher [Thu, 10 Dec 2020 06:18:40 +0000 (01:18 -0500)]
drm/amdgpu/display: simplify backlight setting

Avoid the extra wrapper function.

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agousbip: fix vudc usbip_sockfd_store races leading to gpf
Shuah Khan [Mon, 8 Mar 2021 03:53:31 +0000 (20:53 -0700)]
usbip: fix vudc usbip_sockfd_store races leading to gpf

usbip_sockfd_store() is invoked when user requests attach (import)
detach (unimport) usb gadget device from usbip host. vhci_hcd sends
import request and usbip_sockfd_store() exports the device if it is
free for export.

Export and unexport are governed by local state and shared state
- Shared state (usbip device status, sockfd) - sockfd and Device
  status are used to determine if stub should be brought up or shut
  down. Device status is shared between host and client.
- Local state (tcp_socket, rx and tx thread task_struct ptrs)
  A valid tcp_socket controls rx and tx thread operations while the
  device is in exported state.
- While the device is exported, device status is marked used and socket,
  sockfd, and thread pointers are valid.

Export sequence (stub-up) includes validating the socket and creating
receive (rx) and transmit (tx) threads to talk to the client to provide
access to the exported device. rx and tx threads depends on local and
shared state to be correct and in sync.

Unexport (stub-down) sequence shuts the socket down and stops the rx and
tx threads. Stub-down sequence relies on local and shared states to be
in sync.

There are races in updating the local and shared status in the current
stub-up sequence resulting in crashes. These stem from starting rx and
tx threads before local and global state is updated correctly to be in
sync.

1. Doesn't handle kthread_create() error and saves invalid ptr in local
   state that drives rx and tx threads.
2. Updates tcp_socket and sockfd,  starts stub_rx and stub_tx threads
   before updating usbip_device status to SDEV_ST_USED. This opens up a
   race condition between the threads and usbip_sockfd_store() stub up
   and down handling.

Fix the above problems:
- Stop using kthread_get_run() macro to create/start threads.
- Create threads and get task struct reference.
- Add kthread_create() failure handling and bail out.
- Hold usbip_device lock to update local and shared states after
  creating rx and tx threads.
- Update usbip_device status to SDEV_ST_USED.
- Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
- Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
  and status) is complete.

Credit goes to syzbot and Tetsuo Handa for finding and root-causing the
kthread_get_run() improper error handling problem and others. This is a
hard problem to find and debug since the races aren't seen in a normal
case. Fuzzing forces the race window to be small enough for the
kthread_get_run() error path bug and starting threads before updating the
local and shared state bug in the stub-up sequence.

Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread")
Cc: stable@vger.kernel.org
Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/b1c08b983ffa185449c9f0f7d1021dc8c8454b60.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousbip: fix vhci_hcd attach_store() races leading to gpf
Shuah Khan [Mon, 8 Mar 2021 03:53:30 +0000 (20:53 -0700)]
usbip: fix vhci_hcd attach_store() races leading to gpf

attach_store() is invoked when user requests import (attach) a device
from usbip host.

Attach and detach are governed by local state and shared state
- Shared state (usbip device status) - Device status is used to manage
  the attach and detach operations on import-able devices.
- Local state (tcp_socket, rx and tx thread task_struct ptrs)
  A valid tcp_socket controls rx and tx thread operations while the
  device is in exported state.
- Device has to be in the right state to be attached and detached.

Attach sequence includes validating the socket and creating receive (rx)
and transmit (tx) threads to talk to the host to get access to the
imported device. rx and tx threads depends on local and shared state to
be correct and in sync.

Detach sequence shuts the socket down and stops the rx and tx threads.
Detach sequence relies on local and shared states to be in sync.

There are races in updating the local and shared status in the current
attach sequence resulting in crashes. These stem from starting rx and
tx threads before local and global state is updated correctly to be in
sync.

1. Doesn't handle kthread_create() error and saves invalid ptr in local
   state that drives rx and tx threads.
2. Updates tcp_socket and sockfd,  starts stub_rx and stub_tx threads
   before updating usbip_device status to VDEV_ST_NOTASSIGNED. This opens
   up a race condition between the threads, port connect, and detach
   handling.

Fix the above problems:
- Stop using kthread_get_run() macro to create/start threads.
- Create threads and get task struct reference.
- Add kthread_create() failure handling and bail out.
- Hold vhci and usbip_device locks to update local and shared states after
  creating rx and tx threads.
- Update usbip_device status to VDEV_ST_NOTASSIGNED.
- Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
- Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
  and status) is complete.

Credit goes to syzbot and Tetsuo Handa for finding and root-causing the
kthread_get_run() improper error handling problem and others. This is
hard problem to find and debug since the races aren't seen in a normal
case. Fuzzing forces the race window to be small enough for the
kthread_get_run() error path bug and starting threads before updating the
local and shared state bug in the attach sequence.
- Update usbip_device tcp_rx and tcp_tx pointers holding vhci and
  usbip_device locks.

Tested with syzbot reproducer:
- https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000

Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread")
Cc: stable@vger.kernel.org
Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/bb434bd5d7a64fbec38b5ecfb838a6baef6eb12b.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousbip: fix stub_dev usbip_sockfd_store() races leading to gpf
Shuah Khan [Mon, 8 Mar 2021 03:53:29 +0000 (20:53 -0700)]
usbip: fix stub_dev usbip_sockfd_store() races leading to gpf

usbip_sockfd_store() is invoked when user requests attach (import)
detach (unimport) usb device from usbip host. vhci_hcd sends import
request and usbip_sockfd_store() exports the device if it is free
for export.

Export and unexport are governed by local state and shared state
- Shared state (usbip device status, sockfd) - sockfd and Device
  status are used to determine if stub should be brought up or shut
  down.
- Local state (tcp_socket, rx and tx thread task_struct ptrs)
  A valid tcp_socket controls rx and tx thread operations while the
  device is in exported state.
- While the device is exported, device status is marked used and socket,
  sockfd, and thread pointers are valid.

Export sequence (stub-up) includes validating the socket and creating
receive (rx) and transmit (tx) threads to talk to the client to provide
access to the exported device. rx and tx threads depends on local and
shared state to be correct and in sync.

Unexport (stub-down) sequence shuts the socket down and stops the rx and
tx threads. Stub-down sequence relies on local and shared states to be
in sync.

There are races in updating the local and shared status in the current
stub-up sequence resulting in crashes. These stem from starting rx and
tx threads before local and global state is updated correctly to be in
sync.

1. Doesn't handle kthread_create() error and saves invalid ptr in local
   state that drives rx and tx threads.
2. Updates tcp_socket and sockfd,  starts stub_rx and stub_tx threads
   before updating usbip_device status to SDEV_ST_USED. This opens up a
   race condition between the threads and usbip_sockfd_store() stub up
   and down handling.

Fix the above problems:
- Stop using kthread_get_run() macro to create/start threads.
- Create threads and get task struct reference.
- Add kthread_create() failure handling and bail out.
- Hold usbip_device lock to update local and shared states after
  creating rx and tx threads.
- Update usbip_device status to SDEV_ST_USED.
- Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx
- Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx,
  and status) is complete.

Credit goes to syzbot and Tetsuo Handa for finding and root-causing the
kthread_get_run() improper error handling problem and others. This is a
hard problem to find and debug since the races aren't seen in a normal
case. Fuzzing forces the race window to be small enough for the
kthread_get_run() error path bug and starting threads before updating the
local and shared state bug in the stub-up sequence.

Tested with syzbot reproducer:
- https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000

Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread")
Cc: stable@vger.kernel.org
Reported-by: syzbot <syzbot+a93fba6d384346a761e3@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+bf1a360e305ee719e364@syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+95ce4b142579611ef0a9@syzkaller.appspotmail.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/268a0668144d5ff36ec7d87fdfa90faf583b7ccc.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousbip: fix vudc to check for stream socket
Shuah Khan [Mon, 8 Mar 2021 03:53:28 +0000 (20:53 -0700)]
usbip: fix vudc to check for stream socket

Fix usbip_sockfd_store() to validate the passed in file descriptor is
a stream socket. If the file descriptor passed was a SOCK_DGRAM socket,
sock_recvmsg() can't detect end of stream.

Cc: stable@vger.kernel.org
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/387a670316002324113ac7ea1e8b53f4085d0c95.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousbip: fix vhci_hcd to check for stream socket
Shuah Khan [Mon, 8 Mar 2021 03:53:27 +0000 (20:53 -0700)]
usbip: fix vhci_hcd to check for stream socket

Fix attach_store() to validate the passed in file descriptor is a
stream socket. If the file descriptor passed was a SOCK_DGRAM socket,
sock_recvmsg() can't detect end of stream.

Cc: stable@vger.kernel.org
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/52712aa308915bda02cece1589e04ee8b401d1f3.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agousbip: fix stub_dev to check for stream socket
Shuah Khan [Mon, 8 Mar 2021 03:53:26 +0000 (20:53 -0700)]
usbip: fix stub_dev to check for stream socket

Fix usbip_sockfd_store() to validate the passed in file descriptor is
a stream socket. If the file descriptor passed was a SOCK_DGRAM socket,
sock_recvmsg() can't detect end of stream.

Cc: stable@vger.kernel.org
Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/e942d2bd03afb8e8552bd2a5d84e18d17670d521.1615171203.git.skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoRevert "mm, slub: consider rest of partial list if acquire_slab() fails"
Linus Torvalds [Wed, 10 Mar 2021 18:18:04 +0000 (10:18 -0800)]
Revert "mm, slub: consider rest of partial list if acquire_slab() fails"

This reverts commit 8ff60eb052eeba95cfb3efe16b08c9199f8121cf.

The kernel test robot reports a huge performance regression due to the
commit, and the reason seems fairly straightforward: when there is
contention on the page list (which is what causes acquire_slab() to
fail), we do _not_ want to just loop and try again, because that will
transfer the contention to the 'n->list_lock' spinlock we hold, and
just make things even worse.

This is admittedly likely a problem only on big machines - the kernel
test robot report comes from a 96-thread dual socket Intel Xeon Gold
6252 setup, but the regression there really is quite noticeable:

   -47.9% regression of stress-ng.rawpkt.ops_per_sec

and the commit that was marked as being fixed (7ced37197196: "slub:
Acquire_slab() avoid loop") actually did the loop exit early very
intentionally (the hint being that "avoid loop" part of that commit
message), exactly to avoid this issue.

The correct thing to do may be to pick some kind of reasonable middle
ground: instead of breaking out of the loop on the very first sign of
contention, or trying over and over and over again, the right thing may
be to re-try _once_, and then give up on the second failure (or pick
your favorite value for "once"..).

Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/lkml/20210301080404.GF12822@xsang-OptiPlex-9020/
Cc: Jann Horn <jannh@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge tag 'for-linus-2021-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 10 Mar 2021 18:01:35 +0000 (10:01 -0800)]
Merge tag 'for-linus-2021-03-10' of git://git./linux/kernel/git/brauner/linux

Pull detached mounts fix from Christian Brauner:
 "Creating a series of detached mounts, attaching them to the
  filesystem, and unmounting them can be used to trigger an integer
  overflow in ns->mounts causing the kernel to block any new mounts in
  count_mounts() and returning ENOSPC because it falsely assumes that
  the maximum number of mounts in the mount namespace has been reached,
  i.e. it thinks it can't fit the new mounts into the mount namespace
  anymore.

  Without this fix heavy use of the new mount API with move_mount() will
  cause the host to become unuseable and thus blocks some xfstest
  patches I want to resend.

  Depending on the number of mounts in your system, this can be
  reproduced on any kernel that supportes open_tree() and move_mount().

  A reproducer has been sent for inclusion with xfstests. It takes care
  to do this in another mount namespace, not in the host's mount
  namespace so there shouldn't be any risk in running it but if one did
  run it on the host it would require a reboot in order to be able to
  mount again. See

      https://lore.kernel.org/fstests/20210309121041.753359-1-christian.brauner@ubuntu.com

  The root cause of this is that detached mounts aren't handled
  correctly when source and target mount are identical and reside on a
  shared mount causing a broken mount tree where the detached source
  itself is propagated which propagation prevents for regular
  bind-mounts and new mounts.

  This ultimately leads to a miscalculation of the number of mounts in
  the mount namespace.

  Detached mounts created via 'open_tree(fd, path, OPEN_TREE_CLONE)' are
  essentially like an unattached bind-mount. They can then later on be
  attached to the filesystem via move_mount() which calls into
  attach_recursive_mount().

  Part of attaching it to the filesystem is making sure that mounts get
  correctly propagated in case the destination mountpoint is MS_SHARED,
  i.e. is a shared mountpoint. This is done by calling into
  propagate_mnt() which walks the list of peers calling propagate_one()
  on each mount in this list making sure it receives the propagation
  event. The propagate_one() function thereby skips both new mounts and
  bind mounts to not propagate them "into themselves". Both are
  identified by checking whether the mount is already attached to any
  mount namespace in mnt->mnt_ns. The is what the IS_MNT_NEW() helper is
  responsible for.

  However, detached mounts have an anonymous mount namespace attached to
  them stashed in mnt->mnt_ns which means that IS_MNT_NEW() doesn't
  realize they need to be skipped causing the mount to propagate "into
  itself" breaking the mount table and causing a disconnect between the
  number of mounts recorded as being beneath or reachable from the
  target mountpoint and the number of mounts actually recorded/counted
  in ns->mounts ultimately causing an overflow which in turn prevents
  any new mounts via the ENOSPC issue.

  So teach propagation to handle detached mounts by making it aware of
  them. I've been tracking this issue down for the last couple of days
  and then verifying that the fix is correct by unmounting everything in
  my current mount table leaving only /proc and /sys mounted and running
  the reproducer above overnight verifying the number of mounts counted
  in ns->mounts. With this fix the counts are correct and the ENOSPC
  issue can't be reproduced.

  This change will only have an effect on mounts created with the new
  mount API since detached mounts cannot be created with the old mount
  API so regressions are extremely unlikely.

  Here's an illustration:

    #### mount():
    ubuntu@f1-vm:~$ sudo mount --bind /mnt/ /mnt/
    ubuntu@f1-vm:~$ findmnt  | grep -i mnt
    ├─/mnt                                /dev/sda2[/mnt] ext4       rw,relatime

    #### open_tree(OPEN_TREE_CLONE) + move_mount() with bug:
    ubuntu@f1-vm:~$ sudo ./mount-new /mnt/ /mnt/
    ubuntu@f1-vm:~$ findmnt  | grep -i mnt
    ├─/mnt                                /dev/sda2[/mnt] ext4       rw,relatime
    │ └─/mnt                              /dev/sda2[/mnt] ext4       rw,relatime

    #### open_tree(OPEN_TREE_CLONE) + move_mount() with the fix:
    ubuntu@f1-vm:~$ sudo ./mount-new /mnt /mnt
    ubuntu@f1-vm:~$ findmnt | grep -i mnt
    └─/mnt                                /dev/sda2[/mnt] ext4       rw,relatime"

* tag 'for-linus-2021-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  mount: fix mounting of detached mounts onto targets that reside on shared mounts

3 years agoMerge tag '5.12-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Wed, 10 Mar 2021 17:55:06 +0000 (09:55 -0800)]
Merge tag '5.12-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Six cifs/smb3 fixes, three of them for stable, including some
  important mulitchannel crediting fixes, and a fix for statfs error
  handling"

* tag '5.12-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: do not send close in compound create+close requests
  cifs: return proper error code in statfs(2)
  cifs: change noisy error message to FYI
  cifs: print MIDs in decimal notation
  cifs: ask for more credit on async read/write code paths
  cifs: fix credit accounting for extra channel

3 years agomisc/pvpanic: Export module FDT device table
Shile Zhang [Thu, 18 Feb 2021 12:31:16 +0000 (20:31 +0800)]
misc/pvpanic: Export module FDT device table

Export the module FDT device table to ensure the FDT compatible strings
are listed in the module alias. This help the pvpanic driver can be
loaded on boot automatically not only the ACPI device, but also the FDT
device.

Fixes: 46f934c9a12fc ("misc/pvpanic: add support to get pvpanic device info FDT")
Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20210218123116.207751-1-shile.zhang@linux.alibaba.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agomisc: fastrpc: restrict user apps from sending kernel RPC messages
Dmitry Baryshkov [Fri, 12 Feb 2021 19:26:58 +0000 (22:26 +0300)]
misc: fastrpc: restrict user apps from sending kernel RPC messages

Verify that user applications are not using the kernel RPC message
handle to restrict them from directly attaching to guest OS on the
remote subsystem. This is a port of CVE-2019-2308 fix.

Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method")
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Jonathan Marek <jonathan@marek.ca>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20210212192658.3476137-1-dmitry.baryshkov@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agovirt: acrn: Correct type casting of argument of copy_from_user()
Shuo Liu [Wed, 10 Mar 2021 15:37:08 +0000 (23:37 +0800)]
virt: acrn: Correct type casting of argument of copy_from_user()

hsm.c:336:50: warning: incorrect type in argument 2 (different address spaces)
hsm.c:336:50:    expected void const [noderef] __user *from
hsm.c:336:50:    got void *

This patch fixes above sparse warning.

Fixes: 3d679d5aec64 ("virt: acrn: Introduce interfaces to query C-states and P-states allowed by hypervisor")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Shuo Liu <shuo.a.liu@intel.com>
Link: https://lore.kernel.org/r/20210310153708.17451-1-shuo.a.liu@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agox86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" case
Sean Christopherson [Tue, 9 Mar 2021 17:10:19 +0000 (09:10 -0800)]
x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" case

Initialize x86_pmu.guest_get_msrs to return 0/NULL to handle the "nop"
case.  Patching in perf_guest_get_msrs_nop() during setup does not work
if there is no PMU, as setup bails before updating the static calls,
leaving x86_pmu.guest_get_msrs NULL and thus a complete nop.  Ultimately,
this causes VMX abort on VM-Exit due to KVM putting random garbage from
the stack into the MSR load list.

Add a comment in KVM to note that nr_msrs is valid if and only if the
return value is non-NULL.

Fixes: abd562df94d1 ("x86/perf: Use static_call for x86_pmu.guest_get_msrs")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+cce9ef2dd25246f815ee@syzkaller.appspotmail.com
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210309171019.1125243-1-seanjc@google.com
3 years agoblock: rsxx: fix error return code of rsxx_pci_probe()
Jia-Ju Bai [Wed, 10 Mar 2021 03:30:17 +0000 (19:30 -0800)]
block: rsxx: fix error return code of rsxx_pci_probe()

When create_singlethread_workqueue returns NULL to card->event_wq, no
error return code of rsxx_pci_probe() is assigned.

To fix this bug, st is assigned with -ENOMEM in this case.

Fixes: 8722ff8cdbfa ("block: IBM RamSan 70/80 device driver")
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Link: https://lore.kernel.org/r/20210310033017.4023-1-baijiaju1990@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: Fix REQ_OP_ZONE_RESET_ALL handling
Damien Le Moal [Wed, 10 Mar 2021 09:09:19 +0000 (18:09 +0900)]
block: Fix REQ_OP_ZONE_RESET_ALL handling

Similarly to a single zone reset operation (REQ_OP_ZONE_RESET), execute
REQ_OP_ZONE_RESET_ALL operations with REQ_SYNC set.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: remove indirect ctx into sqo injection
Pavel Begunkov [Wed, 10 Mar 2021 13:13:54 +0000 (13:13 +0000)]
io_uring: remove indirect ctx into sqo injection

We use ->ctx_new_list to notify sqo about new ctx pending, then sqo
should stop and splice it to its sqd->ctx_list, paired with
->sq_thread_comp.

The last one is broken because nobody reinitialises it, and trying to
fix it would only add more complexity and bugs. And the first isn't
really needed as is done under park(), that protects from races well.
Add ctx into sqd->ctx_list directly (under park()), it's much simpler
and allows to kill both, ctx_new_list and sq_thread_comp.

note: apparently there is no real problem at the moment, because
sq_thread_comp is used only by io_sq_thread_finish() followed by
parking, where list_del(&ctx->sqd_list) removes it well regardless
whether it's in the new or the active list.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: fix invalid ctx->sq_thread_idle
Pavel Begunkov [Wed, 10 Mar 2021 13:13:53 +0000 (13:13 +0000)]
io_uring: fix invalid ctx->sq_thread_idle

We have to set ctx->sq_thread_idle before adding a ring to an SQ task,
otherwise sqd races for seeing zero and accounting it as such.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agokernel: make IO threads unfreezable by default
Jens Axboe [Wed, 10 Mar 2021 02:49:02 +0000 (19:49 -0700)]
kernel: make IO threads unfreezable by default

The io-wq threads were already marked as no-freeze, but the manager was
not. On resume, we perpetually have signal_pending() being true, and
hence the manager will loop and spin 100% of the time.

Just mark the tasks created by create_io_thread() as PF_NOFREEZE by
default, and remove any knowledge of it in io-wq and io_uring.

Reported-by: Kevin Locke <kevin@kevinlocke.name>
Tested-by: Kevin Locke <kevin@kevinlocke.name>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: always wait for sqd exited when stopping SQPOLL thread
Jens Axboe [Tue, 9 Mar 2021 23:32:13 +0000 (16:32 -0700)]
io_uring: always wait for sqd exited when stopping SQPOLL thread

We have a tiny race where io_put_sq_data() calls io_sq_thead_stop()
and finds the thread gone, but the thread has indeed not fully
exited or called complete() yet. Close it up by always having
io_sq_thread_stop() wait on completion of the exit event.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: remove unneeded variable 'ret'
Yang Li [Tue, 9 Mar 2021 06:30:41 +0000 (14:30 +0800)]
io_uring: remove unneeded variable 'ret'

Fix the following coccicheck warning:
./fs/io_uring.c:8984:5-8: Unneeded variable: "ret". Return "0" on line
8998

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/1615271441-33649-1-git-send-email-yang.lee@linux.alibaba.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: move all io_kiocb init early in io_init_req()
Jens Axboe [Tue, 9 Mar 2021 14:02:21 +0000 (07:02 -0700)]
io_uring: move all io_kiocb init early in io_init_req()

If we hit an error path in the function, make sure that the io_kiocb is
fully initialized at that point so that freeing the request always sees
a valid state.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio-wq: fix ref leak for req in case of exit cancelations
yangerkun [Tue, 9 Mar 2021 03:04:10 +0000 (11:04 +0800)]
io-wq: fix ref leak for req in case of exit cancelations

do_work such as io_wq_submit_work that cancel the work may leave a ref of
req as 1 if we have links. Fix it by call io_run_cancel.

Fixes: 4fb6ac326204 ("io-wq: improve manager/worker handling over exec")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Link: https://lore.kernel.org/r/20210309030410.3294078-1-yangerkun@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: fix complete_post races for linked req
Pavel Begunkov [Tue, 9 Mar 2021 00:37:59 +0000 (00:37 +0000)]
io_uring: fix complete_post races for linked req

Calling io_queue_next() after spin_unlock in io_req_complete_post()
races with the other side extracting and reusing this request. Hand
coded parts of io_req_find_next() considering that io_disarm_next()
and io_req_task_queue() have (and safe) to be called with
completion_lock held.

It already does io_commit_cqring() and io_cqring_ev_posted(), so just
reuse it for post io_disarm_next().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/5672a62f3150ee7c55849f40c0037655c4f2840f.1615250156.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: add io_disarm_next() helper
Pavel Begunkov [Tue, 9 Mar 2021 00:37:58 +0000 (00:37 +0000)]
io_uring: add io_disarm_next() helper

A preparation patch placing all preparations before extracting a next
request into a separate helper io_disarm_next().

Also, don't spuriously do ev_posted in a rare case where REQ_F_FAIL_LINK
is set but there are no requests linked (i.e. after cancelling a linked
timeout or setting IOSQE_IO_LINK on a last request of a submission
batch).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/44ecff68d6b47e1c4e6b891bdde1ddc08cfc3590.1615250156.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: fix io_sq_offload_create error handling
Pavel Begunkov [Mon, 8 Mar 2021 17:30:54 +0000 (17:30 +0000)]
io_uring: fix io_sq_offload_create error handling

Don't set IO_SQ_THREAD_SHOULD_STOP when io_sq_offload_create() has
failed on io_uring_alloc_task_context() but leave everything to
io_sq_thread_finish(), because currently io_sq_thread_finish()
hangs on trying to park it. That's great it stalls there, because
otherwise the following io_sq_thread_stop() would be skipped on
IO_SQ_THREAD_SHOULD_STOP check and the sqo would race for sqd with
freeing ctx.

A simple error injection gives something like this.

[  245.463955] INFO: task sqpoll-test-hang:523 blocked for more than 122 seconds.
[  245.463983] Call Trace:
[  245.463990]  __schedule+0x36b/0x950
[  245.464005]  schedule+0x68/0xe0
[  245.464013]  schedule_timeout+0x209/0x2a0
[  245.464032]  wait_for_completion+0x8b/0xf0
[  245.464043]  io_sq_thread_finish+0x44/0x1a0
[  245.464049]  io_uring_setup+0x9ea/0xc80
[  245.464058]  __x64_sys_io_uring_setup+0x16/0x20
[  245.464064]  do_syscall_64+0x38/0x50
[  245.464073]  entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio-wq: remove unused 'user' member of io_wq
Jens Axboe [Mon, 8 Mar 2021 16:34:43 +0000 (09:34 -0700)]
io-wq: remove unused 'user' member of io_wq

Previous patches killed the last user of this, now it's just a dead member
in the struct. Get rid of it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: Convert personality_idr to XArray
Matthew Wilcox (Oracle) [Mon, 8 Mar 2021 14:16:16 +0000 (14:16 +0000)]
io_uring: Convert personality_idr to XArray

You can't call idr_remove() from within a idr_for_each() callback,
but you can call xa_erase() from an xa_for_each() loop, so switch the
entire personality_idr from the IDR to the XArray.  This manifests as a
use-after-free as idr_for_each() attempts to walk the rest of the node
after removing the last entry from it.

Fixes: 071698e13ac6 ("io_uring: allow registering credentials")
Cc: stable@vger.kernel.org # 5.6+
Reported-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[Pavel: rebased (creds load was moved into io_init_req())]
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/7ccff36e1375f2b0ebf73d957f037b43becc0dde.1615212806.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: clean R_DISABLED startup mess
Pavel Begunkov [Mon, 8 Mar 2021 13:20:57 +0000 (13:20 +0000)]
io_uring: clean R_DISABLED startup mess

There are enough of problems with IORING_SETUP_R_DISABLED, including the
burden of checking and kicking off the SQO task all over the codebase --
for exit/cancel/etc.

Rework it, always start the thread but don't do submit unless the flag
is gone, that's much easier.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: fix unrelated ctx reqs cancellation
Pavel Begunkov [Mon, 8 Mar 2021 12:14:14 +0000 (12:14 +0000)]
io_uring: fix unrelated ctx reqs cancellation

io-wq now is per-task, so cancellations now should match against
request's ctx.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: SQPOLL parking fixes
Jens Axboe [Sat, 6 Mar 2021 20:58:48 +0000 (13:58 -0700)]
io_uring: SQPOLL parking fixes

We keep running into weird dependency issues between the sqd lock and
the parking state. Disentangle the SQPOLL thread from the last bits of
the kthread parking inheritance, and just replace the parking state,
and two associated locks, with a single rw mutex. The SQPOLL thread
keeps the mutex for read all the time, except if someone has marked us
needing to park. Then we drop/re-acquire and try again.

This greatly simplifies the parking state machine (by just getting rid
of it), and makes it a lot more obvious how it works - if you need to
modify the ctx list, then you simply park the thread which will grab
the lock for writing.

Fold in fix from Hillf Danton on not setting STOP on a fatal signal.

Fixes: e54945ae947f ("io_uring: SQPOLL stop error handling fixes")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agosoftware node: Fix device_add_software_node()
Heikki Krogerus [Mon, 1 Mar 2021 14:30:12 +0000 (17:30 +0300)]
software node: Fix device_add_software_node()

The function device_add_software_node() was meant to
register the node supplied to it, but only if that node
wasn't already registered. Right now the function attempts
to always register the node. That will cause a failure with
nodes that are already registered.

Fixing that by incrementing the reference count of the nodes
that have already been registered, and only registering the
new nodes. Also, clarifying the behaviour in the function
documentation.

Fixes: e68d0119e328 ("software node: Introduce device_add_software_node()")
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
3 years agosoftware node: Fix node registration
Heikki Krogerus [Mon, 1 Mar 2021 14:30:11 +0000 (17:30 +0300)]
software node: Fix node registration

Software node can not be registered before its parent.

Fixes: 80488a6b1d3c ("software node: Add support for static node descriptors")
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
3 years agoxen/events: reset affinity of 2-level event when tearing it down
Juergen Gross [Sat, 6 Mar 2021 16:18:31 +0000 (17:18 +0100)]
xen/events: reset affinity of 2-level event when tearing it down

When creating a new event channel with 2-level events the affinity
needs to be reset initially in order to avoid using an old affinity
from earlier usage of the event channel port. So when tearing an event
channel down reset all affinity bits.

The same applies to the affinity when onlining a vcpu: all old
affinity settings for this vcpu must be reset. As percpu events get
initialized before the percpu event channel hook is called,
resetting of the affinities happens after offlining a vcpu (this is
working, as initial percpu memory is zeroed out).

Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Link: https://lore.kernel.org/r/20210306161833.4552-2-jgross@suse.com
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
3 years agoregulator: rt4831: Fix return value check in rt4831_regulator_probe()
Wei Yongjun [Fri, 5 Mar 2021 03:49:30 +0000 (03:49 +0000)]
regulator: rt4831: Fix return value check in rt4831_regulator_probe()

In case of error, the function dev_get_regmap() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value
check should be replaced with NULL test.

Fixes: 9351ab8b0cb6 ("regulator: rt4831: Adds support for Richtek RT4831 DSV regulator")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20210305034930.3236099-1-weiyongjun1@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
3 years agoregulator: pca9450: Clear PRESET_EN bit to fix BUCK1/2/3 voltage setting
Frieder Schrempf [Mon, 22 Feb 2021 11:52:20 +0000 (12:52 +0100)]
regulator: pca9450: Clear PRESET_EN bit to fix BUCK1/2/3 voltage setting

The driver uses the DVS registers PCA9450_REG_BUCKxOUT_DVS0 to set the
voltage for the buck regulators 1, 2 and 3. This has no effect as the
PRESET_EN bit is set by default and therefore the preset values are used
instead, which are set to 850 mV.

To fix this we clear the PRESET_EN bit at time of initialization.

Fixes: 0935ff5f1f0a ("regulator: pca9450: add pca9450 pmic driver")
Cc: <stable@vger.kernel.org>
Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Link: https://lore.kernel.org/r/20210222115229.166620-1-frieder.schrempf@kontron.de
Signed-off-by: Mark Brown <broonie@kernel.org>
3 years agoregulator: qcom-rpmh: Use correct buck for S1C regulator
satya priya [Wed, 24 Feb 2021 08:33:11 +0000 (14:03 +0530)]
regulator: qcom-rpmh: Use correct buck for S1C regulator

Use correct buck, that is, pmic5_hfsmps515 for S1C regulator
of PM8350C PMIC.

Signed-off-by: satya priya <skakit@codeaurora.org>
Link: https://lore.kernel.org/r/1614155592-14060-7-git-send-email-skakit@codeaurora.org
Signed-off-by: Mark Brown <broonie@kernel.org>
3 years agoregulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck
satya priya [Wed, 24 Feb 2021 08:33:08 +0000 (14:03 +0530)]
regulator: qcom-rpmh: Correct the pmic5_hfsmps515 buck

Correct the REGULATOR_LINEAR_RANGE and n_voltges for
pmic5_hfsmps515 buck.

Signed-off-by: satya priya <skakit@codeaurora.org>
Link: https://lore.kernel.org/r/1614155592-14060-4-git-send-email-skakit@codeaurora.org
Signed-off-by: Mark Brown <broonie@kernel.org>
3 years agoregulator: pca9450: Fix return value when failing to get sd-vsel GPIO
Frieder Schrempf [Mon, 22 Feb 2021 15:08:04 +0000 (16:08 +0100)]
regulator: pca9450: Fix return value when failing to get sd-vsel GPIO

This fixes the return value of pca9450_i2c_probe() to use the correct
error code when getting the sd-vsel GPIO fails.

Signed-off-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Link: https://lore.kernel.org/r/20210222150809.208942-1-frieder.schrempf@kontron.de
Signed-off-by: Mark Brown <broonie@kernel.org>
3 years agoregulator: mt6315: Return REGULATOR_MODE_INVALID for invalid mode
Axel Lin [Mon, 15 Feb 2021 03:48:13 +0000 (11:48 +0800)]
regulator: mt6315: Return REGULATOR_MODE_INVALID for invalid mode

-EINVAL is not a valid return value for .of_map_mode, return
REGULATOR_MODE_INVALID instead.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Link: https://lore.kernel.org/r/20210215034813.45510-1-axel.lin@ingics.com
Signed-off-by: Mark Brown <broonie@kernel.org>
3 years agoALSA: hda/hdmi: Cancel pending works before suspend
Takashi Iwai [Wed, 10 Mar 2021 11:28:09 +0000 (12:28 +0100)]
ALSA: hda/hdmi: Cancel pending works before suspend

The per_pin->work might be still floating at the suspend, and this may
hit the access to the hardware at an unexpected timing.  Cancel the
work properly at the suspend callback for avoiding the buggy access.

Note that the bug doesn't trigger easily in the recent kernels since
the work is queued only when the repoll count is set, and usually it's
only at the resume callback, but it's still possible to hit in
theory.

BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1182377
Reported-and-tested-by: Abhishek Sahu <abhsahu@nvidia.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210310112809.9215-4-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>