Keith Busch [Fri, 27 Oct 2023 17:58:12 +0000 (10:58 -0700)]
nvme: ensure reset state check ordering
A different CPU may be setting the ctrl->state value, so ensure proper
barriers to prevent optimizing to a stale state. Normally it isn't a
problem to observe the wrong state as it is merely advisory to take a
quicker path during initialization and error recovery, but seeing an old
state can report unexpected ENETRESET errors when a reset request was in
fact successful.
Reported-by: Minh Hoang <mh2022@meta.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Keith Busch [Mon, 30 Oct 2023 15:13:09 +0000 (08:13 -0700)]
nvme: introduce helper function to get ctrl state
The controller state is typically written by another CPU, so reading it
should ensure no optimizations are taken. This is a repeated pattern in
the driver, so start with adding a convenience function that returns the
controller state with READ_ONCE().
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Jens Axboe [Sat, 2 Dec 2023 01:37:24 +0000 (18:37 -0700)]
Merge tag 'md-fixes-
20231201-1' of https://git./linux/kernel/git/song/md into block-6.7
Pull MD fix from Song:
"This change fixes issue with raid456 reshape."
* tag 'md-fixes-
20231201-1' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md/raid6: use valid sector values to determine if an I/O should wait on the reshape
David Jeffery [Tue, 28 Nov 2023 18:11:39 +0000 (13:11 -0500)]
md/raid6: use valid sector values to determine if an I/O should wait on the reshape
During a reshape or a RAID6 array such as expanding by adding an additional
disk, I/Os to the region of the array which have not yet been reshaped can
stall indefinitely. This is from errors in the stripe_ahead_of_reshape
function causing md to think the I/O is to a region in the actively
undergoing the reshape.
stripe_ahead_of_reshape fails to account for the q disk having a sector
value of 0. By not excluding the q disk from the for loop, raid6 will always
generate a min_sector value of 0, causing a return value which stalls.
The function's max_sector calculation also uses min() when it should use
max(), causing the max_sector value to always be 0. During a backwards
rebuild this can cause the opposite problem where it allows I/O to advance
when it should wait.
Fixing these errors will allow safe I/O to advance in a timely manner and
delay only I/O which is unsafe due to stripes in the middle of undergoing
the reshape.
Fixes:
486f60558607 ("md/raid5: Check all disks in a stripe_head for reshape progress")
Cc: stable@vger.kernel.org # v6.0+
Signed-off-by: David Jeffery <djeffery@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20231128181233.6187-1-djeffery@redhat.com
Jens Axboe [Fri, 1 Dec 2023 16:09:16 +0000 (09:09 -0700)]
Merge tag 'nvme-6.7-2023-12-01' of git://git.infradead.org/nvme into block-6.7
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.7
- Invalid namespace identification error handling (Marizio Ewan, Keith)
- Fabrics keep-alive tuning (Mark)"
* tag 'nvme-6.7-2023-12-01' of git://git.infradead.org/nvme:
nvme-core: check for too small lba shift
nvme: check for valid nvme_identify_ns() before using it
nvme-core: fix a memory leak in nvme_ns_info_from_identify()
nvme: fine-tune sending of first keep-alive
Keith Busch [Tue, 28 Nov 2023 17:36:04 +0000 (09:36 -0800)]
nvme-core: check for too small lba shift
The block layer doesn't support logical block sizes smaller than 512
bytes. The nvme spec doesn't support that small either, but the driver
isn't checking to make sure the device responded with usable data.
Failing to catch this will result in a kernel bug, either from a
division by zero when stacking, or a zero length bio.
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Ming Lei [Fri, 1 Dec 2023 08:56:05 +0000 (16:56 +0800)]
blk-mq: don't count completed flush data request as inflight in case of quiesce
Request queue quiesce may interrupt flush sequence, and the original request
may have been marked as COMPLETE, but can't get finished because of
queue quiesce.
This way is fine from driver viewpoint, because flush sequence is block
layer concept, and it isn't related with driver.
However, driver(such as dm-rq) can call blk_mq_queue_inflight() to count &
drain inflight requests, then the wait & drain never gets done because
the completed & not-finished flush request is counted as inflight.
Fix this issue by not counting completed flush data request as inflight in
case of quiesce.
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: David Jeffery <djeffery@redhat.com>
Cc: John Pittman <jpittman@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20231201085605.577730-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bart Van Assche [Tue, 28 Nov 2023 19:40:19 +0000 (11:40 -0800)]
block: Document the role of the two attribute groups
It is nontrivial to derive the role of the two attribute groups in source
file block/blk-sysfs.c. Hence add a comment that explains their roles. See
also commit
6d85ebf95c44 ("blk-sysfs: add a new attr_group for blk_mq").
Cc: Christoph Hellwig <hch@lst.de>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20231128194019.72762-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Yu Kuai [Tue, 28 Nov 2023 12:30:27 +0000 (20:30 +0800)]
block: warn once for each partition in bio_check_ro()
Commit
1b0a151c10a6 ("blk-core: use pr_warn_ratelimited() in
bio_check_ro()") fix message storm by limit the rate, however, there
will still be lots of message in the long term. Fix it better by warn
once for each partition.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231128123027.971610-3-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Tue, 28 Nov 2023 12:30:26 +0000 (20:30 +0800)]
block: move .bd_inode into 1st cacheline of block_device
The .bd_inode field of block_device is used in IO fast path of
blkdev_write_iter() and blkdev_llseek(), so it is more efficient to keep
it into the 1st cacheline.
.bd_openers is only touched in open()/close(), and .bd_size_lock is only
for updating bdev capacity, which is in slow path too.
So swap .bd_inode layout with .bd_openers & .bd_size_lock to move
.bd_inode into the 1st cache line.
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20231128123027.971610-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ewan D. Milne [Mon, 27 Nov 2023 20:56:57 +0000 (15:56 -0500)]
nvme: check for valid nvme_identify_ns() before using it
When scanning namespaces, it is possible to get valid data from the first
call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second
call in nvme_update_ns_info_block(). In particular, if the NSID becomes
inactive between the two commands, a storage device may return a buffer
filled with zero as per 4.1.5.1. In this case, we can get a kernel crash
due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will
be set to zero.
PID: 326 TASK:
ffff95fec3cd8000 CPU: 29 COMMAND: "kworker/u98:10"
#0 [
ffffad8f8702f9e0] machine_kexec at
ffffffff91c76ec7
#1 [
ffffad8f8702fa38] __crash_kexec at
ffffffff91dea4fa
#2 [
ffffad8f8702faf8] crash_kexec at
ffffffff91deb788
#3 [
ffffad8f8702fb00] oops_end at
ffffffff91c2e4bb
#4 [
ffffad8f8702fb20] do_trap at
ffffffff91c2a4ce
#5 [
ffffad8f8702fb70] do_error_trap at
ffffffff91c2a595
#6 [
ffffad8f8702fbb0] exc_divide_error at
ffffffff928506e6
#7 [
ffffad8f8702fbd0] asm_exc_divide_error at
ffffffff92a00926
[exception RIP: blk_stack_limits+434]
RIP:
ffffffff92191872 RSP:
ffffad8f8702fc80 RFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff95efa0c91800 RCX:
0000000000000001
RDX:
0000000000000000 RSI:
0000000000000001 RDI:
0000000000000001
RBP:
00000000ffffffff R8:
ffff95fec7df35a8 R9:
0000000000000000
R10:
0000000000000000 R11:
0000000000000001 R12:
0000000000000000
R13:
0000000000000000 R14:
0000000000000000 R15:
ffff95fed33c09a8
ORIG_RAX:
ffffffffffffffff CS: 0010 SS: 0018
#8 [
ffffad8f8702fce0] nvme_update_ns_info_block at
ffffffffc06d3533 [nvme_core]
#9 [
ffffad8f8702fd18] nvme_scan_ns at
ffffffffc06d6fa7 [nvme_core]
This happened when the check for valid data was moved out of nvme_identify_ns()
into one of the callers. Fix this by checking in both callers.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218186
Fixes:
0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan")
Cc: stable@vger.kernel.org
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Maurizio Lombardi [Thu, 23 Nov 2023 14:07:41 +0000 (15:07 +0100)]
nvme-core: fix a memory leak in nvme_ns_info_from_identify()
In case of error, free the nvme_id_ns structure that was allocated
by nvme_identify_ns().
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Mark O'Donovan [Fri, 24 Nov 2023 20:56:59 +0000 (20:56 +0000)]
nvme: fine-tune sending of first keep-alive
Keep-alive commands are sent half-way through the kato period.
This normally works well but fails when the keep-alive system is
started when we are more than half way through the kato.
This can happen on larger setups or due to host delays.
With this change we now time the initial keep-alive command from
the controller initialisation time, rather than the keep-alive
mechanism activation time.
Signed-off-by: Mark O'Donovan <shiftee@posteo.net>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Markus Weippert [Fri, 24 Nov 2023 15:14:37 +0000 (16:14 +0100)]
bcache: revert replacing IS_ERR_OR_NULL with IS_ERR
Commit
028ddcac477b ("bcache: Remove unnecessary NULL point check in
node allocations") replaced IS_ERR_OR_NULL by IS_ERR. This leads to a
NULL pointer dereference.
BUG: kernel NULL pointer dereference, address:
0000000000000080
Call Trace:
? __die_body.cold+0x1a/0x1f
? page_fault_oops+0xd2/0x2b0
? exc_page_fault+0x70/0x170
? asm_exc_page_fault+0x22/0x30
? btree_node_free+0xf/0x160 [bcache]
? up_write+0x32/0x60
btree_gc_coalesce+0x2aa/0x890 [bcache]
? bch_extent_bad+0x70/0x170 [bcache]
btree_gc_recurse+0x130/0x390 [bcache]
? btree_gc_mark_node+0x72/0x230 [bcache]
bch_btree_gc+0x5da/0x600 [bcache]
? cpuusage_read+0x10/0x10
? bch_btree_gc+0x600/0x600 [bcache]
bch_gc_thread+0x135/0x180 [bcache]
The relevant code starts with:
new_nodes[0] = NULL;
for (i = 0; i < nodes; i++) {
if (__bch_keylist_realloc(&keylist, bkey_u64s(&r[i].b->key)))
goto out_nocoalesce;
// ...
out_nocoalesce:
// ...
for (i = 0; i < nodes; i++)
if (!IS_ERR(new_nodes[i])) { // IS_ERR_OR_NULL before
028ddcac477b
btree_node_free(new_nodes[i]); // new_nodes[0] is NULL
rw_unlock(true, new_nodes[i]);
}
This patch replaces IS_ERR() by IS_ERR_OR_NULL() to fix this.
Fixes:
028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations")
Link: https://lore.kernel.org/all/3DF4A87A-2AC1-4893-AE5F-E921478419A9@suse.de/
Cc: stable@vger.kernel.org
Cc: Zheng Wang <zyytlz.wz@163.com>
Cc: Coly Li <colyli@suse.de>
Signed-off-by: Markus Weippert <markus@gekmihesg.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Arnd Bergmann [Wed, 22 Nov 2023 22:47:19 +0000 (23:47 +0100)]
nvme: tcp: fix compile-time checks for TLS mode
When CONFIG_NVME_KEYRING is enabled as a loadable module, but the TCP
host code is built-in, it fails to link:
arm-linux-gnueabi-ld: drivers/nvme/host/tcp.o: in function `nvme_tcp_setup_ctrl':
tcp.c:(.text+0x1940): undefined reference to `nvme_tls_psk_default'
The problem is that the compile-time conditionals are inconsistent here,
using a mix of #ifdef CONFIG_NVME_TCP_TLS, IS_ENABLED(CONFIG_NVME_TCP_TLS)
and IS_ENABLED(CONFIG_NVME_KEYRING) checks, with CONFIG_NVME_KEYRING
controlling whether the implementation is actually built.
Change it to use IS_ENABLED(CONFIG_NVME_KEYRING) checks consistently,
which should help readability and make it less error-prone. Combining
it with the check for the ctrl->opts->tls flag lets the compiler drop
all the TLS code in configurations without this feature, which also
helps runtime behavior in addition to avoiding the link failure.
To make it possible for the compiler to build the dead code, both
the tls_handshake_timeout variable and the TLS specific members
of nvme_tcp_queue need to be moved out of the #ifdef block as well,
but at least the former of these gets optimized out again.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231122224719.4042108-4-arnd@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Arnd Bergmann [Wed, 22 Nov 2023 22:47:18 +0000 (23:47 +0100)]
nvme: target: fix Kconfig select statements
When the NVME target code is built-in but its TCP frontend is a loadable
module, enabling keyring support causes a link failure:
x86_64-linux-ld: vmlinux.o: in function `nvmet_ports_make':
configfs.c:(.text+0x100a211): undefined reference to `nvme_keyring_id'
The problem is that CONFIG_NVME_TARGET_TCP_TLS is a 'bool' symbol that
depends on the tristate CONFIG_NVME_TARGET_TCP, so any 'select' from
it inherits the state of the tristate symbol rather than the intended
CONFIG_NVME_TARGET one that contains the actual call.
The same thing is true for CONFIG_KEYS, which itself is required for
NVME_KEYRING.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231122224719.4042108-3-arnd@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Arnd Bergmann [Wed, 22 Nov 2023 22:47:17 +0000 (23:47 +0100)]
nvme: target: fix nvme_keyring_id() references
In configurations without CONFIG_NVME_TARGET_TCP_TLS, the keyring
code might not be available, or using it will result in a runtime
failure:
x86_64-linux-ld: vmlinux.o: in function `nvmet_ports_make':
configfs.c:(.text+0x100a211): undefined reference to `nvme_keyring_id'
Add a check to ensure we only check the keyring if there is a chance
of it being used, which avoids both the runtime and link-time
problems.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231122224719.4042108-2-arnd@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Wed, 22 Nov 2023 17:19:27 +0000 (10:19 -0700)]
Merge tag 'nvme-6.7-2023-11-22' of git://git.infradead.org/nvme into block-6.7
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.7
- TCP TLS fixes (Hannes)
- Authentifaction fixes (Mark, Hannes)
- Properly terminate target names (Christoph)"
* tag 'nvme-6.7-2023-11-22' of git://git.infradead.org/nvme:
nvme: move nvme_stop_keep_alive() back to original position
nvmet-tcp: always initialize tls_handshake_tmo_work
nvmet: nul-terminate the NQNs passed in the connect command
nvme: blank out authentication fabrics options if not configured
nvme: catch errors from nvme_configure_metadata()
nvme-tcp: only evaluate 'tls' option if TLS is selected
nvme-auth: set explanation code for failure2 msgs
nvme-auth: unlock mutex in one place only
Hannes Reinecke [Tue, 21 Nov 2023 08:01:03 +0000 (09:01 +0100)]
nvme: move nvme_stop_keep_alive() back to original position
Stopping keep-alive not only stops the keep-alive workqueue,
but also needs to be synchronized with I/O termination as we
must not send a keep-alive command after all I/O had been
terminated.
So to avoid any regressions move the call to stop_keep_alive()
back to its original position and ensure that keep-alive is
correctly stopped failing to setup the admin queue.
Fixes:
4733b65d82bd ("nvme: start keep-alive after admin queue setup")
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Li Nan [Mon, 11 Sep 2023 02:33:08 +0000 (10:33 +0800)]
nbd: pass nbd_sock to nbd_read_reply() instead of index
If a socket is processing ioctl 'NBD_SET_SOCK', config->socks might be
krealloc in nbd_add_socket(), and a garbage request is received now, a UAF
may occurs.
T1
nbd_ioctl
__nbd_ioctl
nbd_add_socket
blk_mq_freeze_queue
T2
recv_work
nbd_read_reply
sock_xmit
krealloc config->socks
def config->socks
Pass nbd_sock to nbd_read_reply(). And introduce a new function
sock_xmit_recv(), which differs from sock_xmit only in the way it get
socket.
==================================================================
BUG: KASAN: use-after-free in sock_xmit+0x525/0x550
Read of size 8 at addr
ffff8880188ec428 by task kworker/u12:1/18779
Workqueue: knbd4-recv recv_work
Call Trace:
__dump_stack
dump_stack+0xbe/0xfd
print_address_description.constprop.0+0x19/0x170
__kasan_report.cold+0x6c/0x84
kasan_report+0x3a/0x50
sock_xmit+0x525/0x550
nbd_read_reply+0xfe/0x2c0
recv_work+0x1c2/0x750
process_one_work+0x6b6/0xf10
worker_thread+0xdd/0xd80
kthread+0x30a/0x410
ret_from_fork+0x22/0x30
Allocated by task 18784:
kasan_save_stack+0x1b/0x40
kasan_set_track
set_alloc_info
__kasan_kmalloc
__kasan_kmalloc.constprop.0+0xf0/0x130
slab_post_alloc_hook
slab_alloc_node
slab_alloc
__kmalloc_track_caller+0x157/0x550
__do_krealloc
krealloc+0x37/0xb0
nbd_add_socket
+0x2d3/0x880
__nbd_ioctl
nbd_ioctl+0x584/0x8e0
__blkdev_driver_ioctl
blkdev_ioctl+0x2a0/0x6e0
block_ioctl+0xee/0x130
vfs_ioctl
__do_sys_ioctl
__se_sys_ioctl+0x138/0x190
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
Freed by task 18784:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x40
__kasan_slab_free.part.0+0x13f/0x1b0
slab_free_hook
slab_free_freelist_hook
slab_free
kfree+0xcb/0x6c0
krealloc+0x56/0xb0
nbd_add_socket+0x2d3/0x880
__nbd_ioctl
nbd_ioctl+0x584/0x8e0
__blkdev_driver_ioctl
blkdev_ioctl+0x2a0/0x6e0
block_ioctl+0xee/0x130
vfs_ioctl
__do_sys_ioctl
__se_sys_ioctl+0x138/0x190
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230911023308.3467802-1-linan666@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jan Höppner [Wed, 25 Oct 2023 13:24:37 +0000 (15:24 +0200)]
s390/dasd: protect device queue against concurrent access
In dasd_profile_start() the amount of requests on the device queue are
counted. The access to the device queue is unprotected against
concurrent access. With a lot of parallel I/O, especially with alias
devices enabled, the device queue can change while dasd_profile_start()
is accessing the queue. In the worst case this leads to a kernel panic
due to incorrect pointer accesses.
Fix this by taking the device lock before accessing the queue and
counting the requests. Additionally the check for a valid profile data
pointer can be done earlier to avoid unnecessary locking in a hot path.
Cc: <stable@vger.kernel.org>
Fixes:
4fa52aa7a82f ("[S390] dasd: add enhanced DASD statistics interface")
Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Link: https://lore.kernel.org/r/20231025132437.1223363-3-sth@linux.ibm.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Muhammad Muzammil [Wed, 25 Oct 2023 13:24:36 +0000 (15:24 +0200)]
s390/dasd: resolve spelling mistake
resolve typing mistake from pimary to primary
Signed-off-by: Muhammad Muzammil <m.muzzammilashraf@gmail.com>
Link: https://lore.kernel.org/r/20231010043140.28416-1-m.muzzammilashraf@gmail.com
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Link: https://lore.kernel.org/r/20231025132437.1223363-2-sth@linux.ibm.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Chengming Zhou [Mon, 20 Nov 2023 03:25:21 +0000 (03:25 +0000)]
block/null_blk: Fix double blk_mq_start_request() warning
When CONFIG_BLK_DEV_NULL_BLK_FAULT_INJECTION is enabled, null_queue_rq()
would return BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE for the request,
which has been marked as MQ_RQ_IN_FLIGHT by blk_mq_start_request().
Then null_queue_rqs() put these requests in the rqlist, return back to
the block layer core, which would try to queue them individually again,
so the warning in blk_mq_start_request() triggered.
Fix it by splitting the null_queue_rq() into two parts: the first is the
preparation of request, the second is the handling of request. We put
the blk_mq_start_request() after the preparation part, which may fail
and return back to the block layer core.
The throttling also belongs to the preparation part, so move it before
blk_mq_start_request(). And change the return type of null_handle_cmd()
to void, since it always return BLK_STS_OK now.
Reported-by: <syzbot+fcc47ba2476570cbbeb0@syzkaller.appspotmail.com>
Closes: https://lore.kernel.org/all/
0000000000000e6aac06098aee0c@google.com/
Fixes:
d78bfa1346ab ("block/null_blk: add queue_rqs() support")
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Link: https://lore.kernel.org/r/20231120032521.1012037-1-chengming.zhou@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Hannes Reinecke [Fri, 20 Oct 2023 05:06:06 +0000 (07:06 +0200)]
nvmet-tcp: always initialize tls_handshake_tmo_work
The TLS handshake timeout work item should always be
initialized to avoid a crash when cancelling the workqueue.
Fixes:
675b453e0241 ("nvmet-tcp: enable TLS handshake upcall")
Suggested-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Christoph Hellwig [Fri, 17 Nov 2023 13:13:36 +0000 (08:13 -0500)]
nvmet: nul-terminate the NQNs passed in the connect command
The host and subsystem NQNs are passed in the connect command payload and
interpreted as nul-terminated strings. Ensure they actually are
nul-terminated before using them.
Fixes:
a07b4970f464 "nvmet: add a generic NVMe target")
Reported-by: Alon Zahavi <zahavi.alon@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Hannes Reinecke [Thu, 16 Nov 2023 12:14:35 +0000 (13:14 +0100)]
nvme: blank out authentication fabrics options if not configured
If the config option NVME_HOST_AUTH is not selected we should not
accept the corresponding fabrics options. This allows userspace
to detect if NVMe authentication has been enabled for the kernel.
Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes:
f50fff73d620 ("nvme: implement In-Band authentication")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Hannes Reinecke [Tue, 14 Nov 2023 13:27:01 +0000 (14:27 +0100)]
nvme: catch errors from nvme_configure_metadata()
nvme_configure_metadata() is issuing I/O, so we might incur an I/O
error which will cause the connection to be reset.
But in that case any further probing will race with reset and
cause UAF errors.
So return a status from nvme_configure_metadata() and abort
probing if there was an I/O error.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Hannes Reinecke [Tue, 14 Nov 2023 13:18:21 +0000 (14:18 +0100)]
nvme-tcp: only evaluate 'tls' option if TLS is selected
We only need to evaluate the 'tls' connect option if TLS is
enabled; otherwise we might be getting a link error.
Fixes:
706add13676d ("nvme: keyring: fix conditional compilation")
Reported-by: kernel test robot <yujie.liu@intel.com>
Closes: https://lore.kernel.org/r/
202311140426.0eHrTXBr-lkp@intel.com/
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Mark O'Donovan [Wed, 11 Oct 2023 08:45:12 +0000 (08:45 +0000)]
nvme-auth: set explanation code for failure2 msgs
Some error cases were not setting an auth-failure-reason-code-explanation.
This means an AUTH_Failure2 message will be sent with an explanation value
of 0 which is a reserved value.
Signed-off-by: Mark O'Donovan <shiftee@posteo.net>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Mark O'Donovan [Wed, 11 Oct 2023 08:45:11 +0000 (08:45 +0000)]
nvme-auth: unlock mutex in one place only
Signed-off-by: Mark O'Donovan <shiftee@posteo.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Damien Le Moal [Mon, 20 Nov 2023 07:06:11 +0000 (16:06 +0900)]
block: Remove blk_set_runtime_active()
The function blk_set_runtime_active() is called only from
blk_post_runtime_resume(), so there is no need for that function to be
exported. Open-code this function directly in blk_post_runtime_resume()
and remove it.
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20231120070611.33951-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Li Nan [Thu, 16 Nov 2023 16:23:16 +0000 (00:23 +0800)]
nbd: fix null-ptr-dereference while accessing 'nbd->config'
Memory reordering may occur in nbd_genl_connect(), causing config_refs
to be set to 1 while nbd->config is still empty. Opening nbd at this
time will cause null-ptr-dereference.
T1 T2
nbd_open
nbd_get_config_unlocked
nbd_genl_connect
nbd_alloc_and_init_config
//memory reordered
refcount_set(&nbd->config_refs, 1) // 2
nbd->config
->null point
nbd->config = config // 1
Fix it by adding smp barrier to guarantee the execution sequence.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20231116162316.1740402-4-linan666@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Li Nan [Thu, 16 Nov 2023 16:23:15 +0000 (00:23 +0800)]
nbd: factor out a helper to get nbd_config without holding 'config_lock'
There are no functional changes, just to make code cleaner and prepare
to fix null-ptr-dereference while accessing 'nbd->config'.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20231116162316.1740402-3-linan666@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Li Nan [Thu, 16 Nov 2023 16:23:14 +0000 (00:23 +0800)]
nbd: fold nbd config initialization into nbd_alloc_config()
There are no functional changes, make the code cleaner and prepare to
fix null-ptr-dereference while accessing 'nbd->config'.
Signed-off-by: Li Nan <linan122@huawei.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20231116162316.1740402-2-linan666@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 20 Nov 2023 16:45:31 +0000 (09:45 -0700)]
Merge tag 'md-fixes-
20231120' of https://git./linux/kernel/git/song/md into block-6.7
Pull MD fix from Song.
* tag 'md-fixes-
20231120' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: fix bi_status reporting in md_end_clone_io
Coly Li [Mon, 20 Nov 2023 05:25:03 +0000 (13:25 +0800)]
bcache: avoid NULL checking to c->root in run_cache_set()
In run_cache_set() after c->root returned from bch_btree_node_get(), it
is checked by IS_ERR_OR_NULL(). Indeed it is unncessary to check NULL
because bch_btree_node_get() will not return NULL pointer to caller.
This patch replaces IS_ERR_OR_NULL() by IS_ERR() for the above reason.
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-11-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Coly Li [Mon, 20 Nov 2023 05:25:02 +0000 (13:25 +0800)]
bcache: add code comments for bch_btree_node_get() and __bch_btree_node_alloc()
This patch adds code comments to bch_btree_node_get() and
__bch_btree_node_alloc() that NULL pointer will not be returned and it
is unnecessary to check NULL pointer by the callers of these routines.
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-10-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Coly Li [Mon, 20 Nov 2023 05:25:01 +0000 (13:25 +0800)]
bcache: replace a mistaken IS_ERR() by IS_ERR_OR_NULL() in btree_gc_coalesce()
Commit
028ddcac477b ("bcache: Remove unnecessary NULL point check in
node allocations") do the following change inside btree_gc_coalesce(),
31 @@ -1340,7 +1340,7 @@ static int btree_gc_coalesce(
32 memset(new_nodes, 0, sizeof(new_nodes));
33 closure_init_stack(&cl);
34
35 - while (nodes < GC_MERGE_NODES && !IS_ERR_OR_NULL(r[nodes].b))
36 + while (nodes < GC_MERGE_NODES && !IS_ERR(r[nodes].b))
37 keys += r[nodes++].keys;
38
39 blocks = btree_default_blocks(b->c) * 2 / 3;
At line 35 the original r[nodes].b is not always allocatored from
__bch_btree_node_alloc(), and possibly initialized as NULL pointer by
caller of btree_gc_coalesce(). Therefore the change at line 36 is not
correct.
This patch replaces the mistaken IS_ERR() by IS_ERR_OR_NULL() to avoid
potential issue.
Fixes:
028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations")
Cc: <stable@vger.kernel.org> # 6.5+
Cc: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-9-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mingzhe Zou [Mon, 20 Nov 2023 05:25:00 +0000 (13:25 +0800)]
bcache: fixup multi-threaded bch_sectors_dirty_init() wake-up race
We get a kernel crash about "unable to handle kernel paging request":
```dmesg
[368033.032005] BUG: unable to handle kernel paging request at
ffffffffad9ae4b5
[368033.032007] PGD
fc3a0d067 P4D
fc3a0d067 PUD
fc3a0e063 PMD
8000000fc38000e1
[368033.032012] Oops: 0003 [#1] SMP PTI
[368033.032015] CPU: 23 PID: 55090 Comm: bch_dirtcnt[0] Kdump: loaded Tainted: G OE --------- - - 4.18.0-147.5.1.es8_24.x86_64 #1
[368033.032017] Hardware name: Tsinghua Tongfang THTF Chaoqiang Server/072T6D, BIOS 2.4.3 01/17/2017
[368033.032027] RIP: 0010:native_queued_spin_lock_slowpath+0x183/0x1d0
[368033.032029] Code: 8b 02 48 85 c0 74 f6 48 89 c1 eb d0 c1 e9 12 83 e0
03 83 e9 01 48 c1 e0 05 48 63 c9 48 05 c0 3d 02 00 48 03 04 cd 60 68 93
ad <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 02
[368033.032031] RSP: 0018:
ffffbb48852abe00 EFLAGS:
00010082
[368033.032032] RAX:
ffffffffad9ae4b5 RBX:
0000000000000246 RCX:
0000000000003bf3
[368033.032033] RDX:
ffff97b0ff8e3dc0 RSI:
0000000000600000 RDI:
ffffbb4884743c68
[368033.032034] RBP:
0000000000000001 R08:
0000000000000000 R09:
000007ffffffffff
[368033.032035] R10:
ffffbb486bb01000 R11:
0000000000000001 R12:
ffffffffc068da70
[368033.032036] R13:
0000000000000003 R14:
0000000000000000 R15:
0000000000000000
[368033.032038] FS:
0000000000000000(0000) GS:
ffff97b0ff8c0000(0000) knlGS:
0000000000000000
[368033.032039] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[368033.032040] CR2:
ffffffffad9ae4b5 CR3:
0000000fc3a0a002 CR4:
00000000003626e0
[368033.032042] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[368033.032043] bcache: bch_cached_dev_attach() Caching rbd479 as bcache462 on set
8cff3c36-4a76-4242-afaa-
7630206bc70b
[368033.032045] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[368033.032046] Call Trace:
[368033.032054] _raw_spin_lock_irqsave+0x32/0x40
[368033.032061] __wake_up_common_lock+0x63/0xc0
[368033.032073] ? bch_ptr_invalid+0x10/0x10 [bcache]
[368033.033502] bch_dirty_init_thread+0x14c/0x160 [bcache]
[368033.033511] ? read_dirty_submit+0x60/0x60 [bcache]
[368033.033516] kthread+0x112/0x130
[368033.033520] ? kthread_flush_work_fn+0x10/0x10
[368033.034505] ret_from_fork+0x35/0x40
```
The crash occurred when call wake_up(&state->wait), and then we want
to look at the value in the state. However, bch_sectors_dirty_init()
is not found in the stack of any task. Since state is allocated on
the stack, we guess that bch_sectors_dirty_init() has exited, causing
bch_dirty_init_thread() to be unable to handle kernel paging request.
In order to verify this idea, we added some printing information during
wake_up(&state->wait). We find that "wake up" is printed twice, however
we only expect the last thread to wake up once.
```dmesg
[ 994.641004] alcache: bch_dirty_init_thread() wake up
[ 994.641018] alcache: bch_dirty_init_thread() wake up
[ 994.641523] alcache: bch_sectors_dirty_init() init exit
```
There is a race. If bch_sectors_dirty_init() exits after the first wake
up, the second wake up will trigger this bug("unable to handle kernel
paging request").
Proceed as follows:
bch_sectors_dirty_init
kthread_run ==============> bch_dirty_init_thread(bch_dirtcnt[0])
... ...
atomic_inc(&state.started) ...
... ...
atomic_read(&state.enough) ...
... atomic_set(&state->enough, 1)
kthread_run ======================================================> bch_dirty_init_thread(bch_dirtcnt[1])
... atomic_dec_and_test(&state->started) ...
atomic_inc(&state.started) ... ...
... wake_up(&state->wait) ...
atomic_read(&state.enough) atomic_dec_and_test(&state->started)
... ...
wait_event(state.wait, atomic_read(&state.started) == 0) ...
return ...
wake_up(&state->wait)
We believe it is very common to wake up twice if there is no dirty, but
crash is an extremely low probability event. It's hard for us to reproduce
this issue. We attached and detached continuously for a week, with a total
of more than one million attaches and only one crash.
Putting atomic_inc(&state.started) before kthread_run() can avoid waking
up twice.
Fixes:
b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded")
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-8-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mingzhe Zou [Mon, 20 Nov 2023 05:24:59 +0000 (13:24 +0800)]
bcache: fixup lock c->root error
We had a problem with io hung because it was waiting for c->root to
release the lock.
crash> cache_set.root -l cache_set.list
ffffa03fde4c0050
root = 0xffff802ef454c800
crash> btree -o 0xffff802ef454c800 | grep rw_semaphore
[
ffff802ef454c858] struct rw_semaphore lock;
crash> struct rw_semaphore
ffff802ef454c858
struct rw_semaphore {
count = {
counter = -
4294967297
},
wait_list = {
next = 0xffff00006786fc28,
prev = 0xffff00005d0efac8
},
wait_lock = {
raw_lock = {
{
val = {
counter = 0
},
{
locked = 0 '\000',
pending = 0 '\000'
},
{
locked_pending = 0,
tail = 0
}
}
}
},
osq = {
tail = {
counter = 0
}
},
owner = 0xffffa03fdc586603
}
The "counter = -
4294967297" means that lock count is -1 and a write lock
is being attempted. Then, we found that there is a btree with a counter
of 1 in btree_cache_freeable.
crash> cache_set -l cache_set.list
ffffa03fde4c0050 -o|grep btree_cache
[
ffffa03fde4c1140] struct list_head btree_cache;
[
ffffa03fde4c1150] struct list_head btree_cache_freeable;
[
ffffa03fde4c1160] struct list_head btree_cache_freed;
[
ffffa03fde4c1170] unsigned int btree_cache_used;
[
ffffa03fde4c1178] wait_queue_head_t btree_cache_wait;
[
ffffa03fde4c1190] struct task_struct *btree_cache_alloc_lock;
crash> list -H
ffffa03fde4c1140|wc -l
973
crash> list -H
ffffa03fde4c1150|wc -l
1123
crash> cache_set.btree_cache_used -l cache_set.list
ffffa03fde4c0050
btree_cache_used = 2097
crash> list -s btree -l btree.list -H
ffffa03fde4c1140|grep -E -A2 "^ lock = {" > btree_cache.txt
crash> list -s btree -l btree.list -H
ffffa03fde4c1150|grep -E -A2 "^ lock = {" > btree_cache_freeable.txt
[root@node-3 127.0.0.1-2023-08-04-16:40:28]# pwd
/var/crash/127.0.0.1-2023-08-04-16:40:28
[root@node-3 127.0.0.1-2023-08-04-16:40:28]# cat btree_cache.txt|grep counter|grep -v "counter = 0"
[root@node-3 127.0.0.1-2023-08-04-16:40:28]# cat btree_cache_freeable.txt|grep counter|grep -v "counter = 0"
counter = 1
We found that this is a bug in bch_sectors_dirty_init() when locking c->root:
(1). Thread X has locked c->root(A) write.
(2). Thread Y failed to lock c->root(A), waiting for the lock(c->root A).
(3). Thread X bch_btree_set_root() changes c->root from A to B.
(4). Thread X releases the lock(c->root A).
(5). Thread Y successfully locks c->root(A).
(6). Thread Y releases the lock(c->root B).
down_write locked ---(1)----------------------┐
| |
| down_read waiting ---(2)----┐ |
| | ┌-------------┐ ┌-------------┐
bch_btree_set_root ===(3)========>> | c->root A | | c->root B |
| | └-------------┘ └-------------┘
up_write ---(4)---------------------┘ | |
| | |
down_read locked ---(5)-----------┘ |
| |
up_read ---(6)-----------------------------┘
Since c->root may change, the correct steps to lock c->root should be
the same as bch_root_usage(), compare after locking.
static unsigned int bch_root_usage(struct cache_set *c)
{
unsigned int bytes = 0;
struct bkey *k;
struct btree *b;
struct btree_iter iter;
goto lock_root;
do {
rw_unlock(false, b);
lock_root:
b = c->root;
rw_lock(false, b, b->level);
} while (b != c->root);
for_each_key_filter(&b->keys, k, &iter, bch_ptr_bad)
bytes += bkey_bytes(k);
rw_unlock(false, b);
return (bytes * 100) / btree_bytes(c);
}
Fixes:
b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded")
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-7-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mingzhe Zou [Mon, 20 Nov 2023 05:24:58 +0000 (13:24 +0800)]
bcache: fixup init dirty data errors
We found that after long run, the dirty_data of the bcache device
will have errors. This error cannot be eliminated unless re-register.
We also found that reattach after detach, this error can accumulate.
In bch_sectors_dirty_init(), all inode <= d->id keys will be recounted
again. This is wrong, we only need to count the keys of the current
device.
Fixes:
b144e45fc576 ("bcache: make bch_sectors_dirty_init() to be multithreaded")
Signed-off-by: Mingzhe Zou <mingzhe.zou@easystack.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-6-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rand Deeb [Mon, 20 Nov 2023 05:24:57 +0000 (13:24 +0800)]
bcache: prevent potential division by zero error
In SHOW(), the variable 'n' is of type 'size_t.' While there is a
conditional check to verify that 'n' is not equal to zero before
executing the 'do_div' macro, concerns arise regarding potential
division by zero error in 64-bit environments.
The concern arises when 'n' is 64 bits in size, greater than zero, and
the lower 32 bits of it are zeros. In such cases, the conditional check
passes because 'n' is non-zero, but the 'do_div' macro casts 'n' to
'uint32_t,' effectively truncating it to its lower 32 bits.
Consequently, the 'n' value becomes zero.
To fix this potential division by zero error and ensure precise
division handling, this commit replaces the 'do_div' macro with
div64_u64(). div64_u64() is designed to work with 64-bit operands,
guaranteeing that division is performed correctly.
This change enhances the robustness of the code, ensuring that division
operations yield accurate results in all scenarios, eliminating the
possibility of division by zero, and improving compatibility across
different 64-bit environments.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Rand Deeb <rand.sec96@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-5-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Colin Ian King [Mon, 20 Nov 2023 05:24:56 +0000 (13:24 +0800)]
bcache: remove redundant assignment to variable cur_idx
Variable cur_idx is being initialized with a value that is never read,
it is being re-assigned later in a while-loop. Remove the redundant
assignment. Cleans up clang scan build warning:
drivers/md/bcache/writeback.c:916:2: warning: Value stored to 'cur_idx'
is never read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Coly Li <colyli@suse.de>
Link: https://lore.kernel.org/r/20231120052503.6122-4-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Coly Li [Mon, 20 Nov 2023 05:24:55 +0000 (13:24 +0800)]
bcache: check return value from btree_node_alloc_replacement()
In btree_gc_rewrite_node(), pointer 'n' is not checked after it returns
from btree_gc_rewrite_node(). There is potential possibility that 'n' is
a non NULL ERR_PTR(), referencing such error code is not permitted in
following code. Therefore a return value checking is necessary after 'n'
is back from btree_node_alloc_replacement().
Signed-off-by: Coly Li <colyli@suse.de>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20231120052503.6122-3-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Coly Li [Mon, 20 Nov 2023 05:24:54 +0000 (13:24 +0800)]
bcache: avoid oversize memory allocation by small stripe_size
Arraies bcache->stripe_sectors_dirty and bcache->full_dirty_stripes are
used for dirty data writeback, their sizes are decided by backing device
capacity and stripe size. Larger backing device capacity or smaller
stripe size make these two arraies occupies more dynamic memory space.
Currently bcache->stripe_size is directly inherited from
queue->limits.io_opt of underlying storage device. For normal hard
drives, its limits.io_opt is 0, and bcache sets the corresponding
stripe_size to 1TB (1<<31 sectors), it works fine 10+ years. But for
devices do declare value for queue->limits.io_opt, small stripe_size
(comparing to 1TB) becomes an issue for oversize memory allocations of
bcache->stripe_sectors_dirty and bcache->full_dirty_stripes, while the
capacity of hard drives gets much larger in recent decade.
For example a raid5 array assembled by three 20TB hardrives, the raid
device capacity is 40TB with typical 512KB limits.io_opt. After the math
calculation in bcache code, these two arraies will occupy 400MB dynamic
memory. Even worse Andrea Tomassetti reports that a 4KB limits.io_opt is
declared on a new 2TB hard drive, then these two arraies request 2GB and
512MB dynamic memory from kzalloc(). The result is that bcache device
always fails to initialize on his system.
To avoid the oversize memory allocation, bcache->stripe_size should not
directly inherited by queue->limits.io_opt from the underlying device.
This patch defines BCH_MIN_STRIPE_SZ (4MB) as minimal bcache stripe size
and set bcache device's stripe size against the declared limits.io_opt
value from the underlying storage device,
- If the declared limits.io_opt > BCH_MIN_STRIPE_SZ, bcache device will
set its stripe size directly by this limits.io_opt value.
- If the declared limits.io_opt < BCH_MIN_STRIPE_SZ, bcache device will
set its stripe size by a value multiplying limits.io_opt and euqal or
large than BCH_MIN_STRIPE_SZ.
Then the minimal stripe size of a bcache device will always be >= 4MB.
For a 40TB raid5 device with 512KB limits.io_opt, memory occupied by
bcache->stripe_sectors_dirty and bcache->full_dirty_stripes will be 50MB
in total. For a 2TB hard drive with 4KB limits.io_opt, memory occupied
by these two arraies will be 2.5MB in total.
Such mount of memory allocated for bcache->stripe_sectors_dirty and
bcache->full_dirty_stripes is reasonable for most of storage devices.
Reported-by: Andrea Tomassetti <andrea.tomassetti-opensource@devo.com>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Eric Wheeler <bcache@lists.ewheeler.net>
Link: https://lore.kernel.org/r/20231120052503.6122-2-colyli@suse.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Song Liu [Fri, 17 Nov 2023 23:56:30 +0000 (15:56 -0800)]
md: fix bi_status reporting in md_end_clone_io
md_end_clone_io() may overwrite error status in orig_bio->bi_status with
BLK_STS_OK. This could happen when orig_bio has BIO_CHAIN (split by
md_submit_bio => bio_split_to_limits, for example). As a result, upper
layer may miss error reported from md (or the device) and consider the
failed IO was successful.
Fix this by only update orig_bio->bi_status when current bio reports
error and orig_bio is BLK_STS_OK. This is the same behavior as
__bio_chain_endio().
Fixes:
10764815ff47 ("md: add io accounting for raid0 and raid5")
Cc: stable@vger.kernel.org # v5.14+
Reported-by: Bhanu Victor DiCara <00bvd0+linux@gmail.com>
Closes: https://lore.kernel.org/regressions/
5727380.DvuYhMxLoT@bvd0/
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Xiao Ni <xni@redhat.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Ming Lei [Fri, 17 Nov 2023 02:35:24 +0000 (10:35 +0800)]
blk-cgroup: bypass blkcg_deactivate_policy after destroying
blkcg_deactivate_policy() can be called after blkg_destroy_all()
returns, and it isn't necessary since blkg_destroy_all has covered
policy deactivation.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20231117023527.3188627-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Fri, 17 Nov 2023 02:35:23 +0000 (10:35 +0800)]
blk-cgroup: avoid to warn !rcu_read_lock_held() in blkg_lookup()
So far, all callers either holds spin lock or rcu read explicitly, and
most of the caller has added WARN_ON_ONCE(!rcu_read_lock_held()) or
lockdep_assert_held(&disk->queue->queue_lock).
Remove WARN_ON_ONCE(!rcu_read_lock_held()) from blkg_lookup() for
killing the false positive warning from blkg_conf_prep().
Reported-by: Changhui Zhong <czhong@redhat.com>
Fixes:
83462a6c971c ("blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20231117023527.3188627-3-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Fri, 17 Nov 2023 02:35:22 +0000 (10:35 +0800)]
blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!"
Inside blkg_for_each_descendant_pre(), both
css_for_each_descendant_pre() and blkg_lookup() requires RCU read lock,
and either cgroup_assert_mutex_or_rcu_locked() or rcu_read_lock_held()
is called.
Fix the warning by adding rcu read lock.
Reported-by: Changhui Zhong <czhong@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20231117023527.3188627-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Christoph Hellwig [Mon, 13 Nov 2023 03:52:31 +0000 (11:52 +0800)]
blk-mq: make sure active queue usage is held for bio_integrity_prep()
blk_integrity_unregister() can come if queue usage counter isn't held
for one bio with integrity prepared, so this request may be completed with
calling profile->complete_fn, then kernel panic.
Another constraint is that bio_integrity_prep() needs to be called
before bio merge.
Fix the issue by:
- call bio_integrity_prep() with one queue usage counter grabbed reliably
- call bio_integrity_prep() before bio merge
Fixes:
900e080752025f00 ("block: move queue enter logic into blk_mq_submit_bio()")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Link: https://lore.kernel.org/r/20231113035231.2708053-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Mon, 13 Nov 2023 00:19:07 +0000 (16:19 -0800)]
Linux 6.7-rc1
Miri Korenblit [Sun, 12 Nov 2023 14:36:20 +0000 (16:36 +0200)]
wifi: iwlwifi: fix system commands group ordering
The commands should be sorted inside the group definition.
Fix the ordering so we won't get following warning:
WARN_ON(iwl_cmd_groups_verify_sorted(trans_cfg))
Link: https://lore.kernel.org/regressions/2fa930bb-54dd-4942-a88d-05a47c8e9731@gmail.com/
Link: https://lore.kernel.org/linux-wireless/CAHk-=wix6kqQ5vHZXjOPpZBfM7mMm9bBZxi2Jh7XnaKCqVf94w@mail.gmail.com/
Fixes:
b6e3d1ba4fcf ("wifi: iwlwifi: mvm: implement new firmware API for statistics")
Tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
Tested-by: Damian Tometzki <damian@riscv-rocks.de>
Acked-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 12 Nov 2023 19:05:31 +0000 (11:05 -0800)]
Merge tag 'parisc-for-6.7-rc1-2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
- Include the upper 5 address bits when inserting TLB entries on a
64-bit kernel.
On physical machines those are ignored, but in qemu it's nice to have
them included and to be correct.
- Stop the 64-bit kernel and show a warning if someone tries to boot on
a machine with a 32-bit CPU
- Fix a "no previous prototype" warning in parport-gsc
* tag 'parisc-for-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Prevent booting 64-bit kernels on PA1.x machines
parport: gsc: mark init function static
parisc/pgtable: Do not drop upper 5 address bits of physical address
Linus Torvalds [Sun, 12 Nov 2023 18:58:08 +0000 (10:58 -0800)]
Merge tag 'loongarch-6.7' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- support PREEMPT_DYNAMIC with static keys
- relax memory ordering for atomic operations
- support BPF CPU v4 instructions for LoongArch
- some build and runtime warning fixes
* tag 'loongarch-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
selftests/bpf: Enable cpu v4 tests for LoongArch
LoongArch: BPF: Support signed mod instructions
LoongArch: BPF: Support signed div instructions
LoongArch: BPF: Support 32-bit offset jmp instructions
LoongArch: BPF: Support unconditional bswap instructions
LoongArch: BPF: Support sign-extension mov instructions
LoongArch: BPF: Support sign-extension load instructions
LoongArch: Add more instruction opcodes and emit_* helpers
LoongArch/smp: Call rcutree_report_cpu_starting() earlier
LoongArch: Relax memory ordering for atomic operations
LoongArch: Mark __percpu functions as always inline
LoongArch: Disable module from accessing external data directly
LoongArch: Support PREEMPT_DYNAMIC with static keys
Linus Torvalds [Sun, 12 Nov 2023 18:50:38 +0000 (10:50 -0800)]
Merge tag 'powerpc-6.7-2' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Finish a refactor of pgprot_framebuffer() which dependend
on some changes that were merged via the drm tree
- Fix some kernel-doc warnings to quieten the bots
Thanks to Nathan Lynch and Thomas Zimmermann.
* tag 'powerpc-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/rtas: Fix ppc_rtas_rmo_buf_show() kernel-doc
powerpc/pseries/rtas-work-area: Fix rtas_work_area_reserve_arena() kernel-doc
powerpc/fb: Call internal __phys_mem_access_prot() in fbdev code
powerpc: Remove file parameter from phys_mem_access_prot()
powerpc/machdep: Remove trailing whitespaces
Linus Torvalds [Sun, 12 Nov 2023 01:17:22 +0000 (17:17 -0800)]
Merge tag '6.7-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- ctime caching fix (for setxattr)
- encryption fix
- DNS resolver mount fix
- debugging improvements
- multichannel fixes including cases where server stops or starts
supporting multichannel after mount
- reconnect fix
- minor cleanups
* tag '6.7-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number for cifs.ko
cifs: handle when server stops supporting multichannel
cifs: handle when server starts supporting multichannel
Missing field not being returned in ioctl CIFS_IOC_GET_MNT_INFO
smb3: allow dumping session and tcon id to improve stats analysis and debugging
smb: client: fix mount when dns_resolver key is not available
smb3: fix caching of ctime on setxattr
smb3: minor cleanup of session handling code
cifs: reconnect work should have reference on server struct
cifs: do not pass cifs_sb when trying to add channels
cifs: account for primary channel in the interface list
cifs: distribute channels across interfaces based on speed
cifs: handle cases where a channel is closed
smb3: more minor cleanups for session handling routines
smb3: minor RDMA cleanup
cifs: Fix encryption of cleared, but unset rq_iter data buffers
Linus Torvalds [Sat, 11 Nov 2023 00:35:04 +0000 (16:35 -0800)]
Merge tag 'probes-fixes-v6.7-rc1' of git://git./linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- Documentation update: Add a note about argument and return value
fetching is the best effort because it depends on the type.
- objpool: Fix to make internal global variables static in
test_objpool.c.
- kprobes: Unify kprobes_exceptions_nofify() prototypes. There are the
same prototypes in asm/kprobes.h for some architectures, but some of
them are missing the prototype and it causes a warning. So move the
prototype into linux/kprobes.h.
- tracing: Fix to check the tracepoint event and return event at
parsing stage. The tracepoint event doesn't support %return but if
$retval exists, it will be converted to %return silently. This finds
that case and rejects it.
- tracing: Fix the order of the descriptions about the parameters of
__kprobe_event_gen_cmd_start() to be consistent with the argument
list of the function.
* tag 'probes-fixes-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/kprobes: Fix the order of argument descriptions
tracing: fprobe-event: Fix to check tracepoint event and return
kprobes: unify kprobes_exceptions_nofify() prototypes
lib: test_objpool: make global variables static
Documentation: tracing: Add a note about argument and retval access
Linus Torvalds [Fri, 10 Nov 2023 23:07:01 +0000 (15:07 -0800)]
Merge tag 'fbdev-for-6.7-rc1' of git://git./linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes and cleanups from Helge Deller:
- fix double free and resource leaks in imsttfb
- lots of remove callback cleanups and section mismatch fixes in
omapfb, amifb and atmel_lcdfb
- error code fix and memparse simplification in omapfb
* tag 'fbdev-for-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: (31 commits)
fbdev: fsl-diu-fb: mark wr_reg_wa() static
fbdev: amifb: Convert to platform remove callback returning void
fbdev: amifb: Mark driver struct with __refdata to prevent section mismatch warning
fbdev: hyperv_fb: fix uninitialized local variable use
fbdev: omapfb/tpd12s015: Convert to platform remove callback returning void
fbdev: omapfb/tfp410: Convert to platform remove callback returning void
fbdev: omapfb/sharp-ls037v7dw01: Convert to platform remove callback returning void
fbdev: omapfb/opa362: Convert to platform remove callback returning void
fbdev: omapfb/hdmi: Convert to platform remove callback returning void
fbdev: omapfb/dvi: Convert to platform remove callback returning void
fbdev: omapfb/dsi-cm: Convert to platform remove callback returning void
fbdev: omapfb/dpi: Convert to platform remove callback returning void
fbdev: omapfb/analog-tv: Convert to platform remove callback returning void
fbdev: atmel_lcdfb: Convert to platform remove callback returning void
fbdev: omapfb/tpd12s015: Don't put .remove() in .exit.text and drop suppress_bind_attrs
fbdev: omapfb/tfp410: Don't put .remove() in .exit.text and drop suppress_bind_attrs
fbdev: omapfb/sharp-ls037v7dw01: Don't put .remove() in .exit.text and drop suppress_bind_attrs
fbdev: omapfb/opa362: Don't put .remove() in .exit.text and drop suppress_bind_attrs
fbdev: omapfb/hdmi: Don't put .remove() in .exit.text and drop suppress_bind_attrs
fbdev: omapfb/dvi: Don't put .remove() in .exit.text and drop suppress_bind_attrs
...
Yujie Liu [Tue, 31 Oct 2023 04:13:05 +0000 (12:13 +0800)]
tracing/kprobes: Fix the order of argument descriptions
The order of descriptions should be consistent with the argument list of
the function, so "kretprobe" should be the second one.
int __kprobe_event_gen_cmd_start(struct dynevent_cmd *cmd, bool kretprobe,
const char *name, const char *loc, ...)
Link: https://lore.kernel.org/all/20231031041305.3363712-1-yujie.liu@intel.com/
Fixes:
2a588dd1d5d6 ("tracing: Add kprobe event command generation functions")
Suggested-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Linus Torvalds [Fri, 10 Nov 2023 22:59:30 +0000 (14:59 -0800)]
Merge tag 'drm-next-2023-11-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
"Dave's VPN to the big machine died, so it's on me to do fixes pr this
and next week while everyone else is at plumbers.
- big pile of amd fixes, but mostly for hw support newly added in 6.7
- i915 fixes, mostly minor things
- qxl memory leak fix
- vc4 uaf fix in mock helpers
- syncobj fix for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE"
* tag 'drm-next-2023-11-10' of git://anongit.freedesktop.org/drm/drm: (78 commits)
drm/amdgpu: fix error handling in amdgpu_vm_init
drm/amdgpu: Fix possible null pointer dereference
drm/amdgpu: move UVD and VCE sched entity init after sched init
drm/amdgpu: move kfd_resume before the ip late init
drm/amd: Explicitly check for GFXOFF to be enabled for s0ix
drm/amdgpu: Change WREG32_RLC to WREG32_SOC15_RLC where inst != 0 (v2)
drm/amdgpu: Use correct KIQ MEC engine for gfx9.4.3 (v5)
drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support
drm/amdgpu: fix software pci_unplug on some chips
drm/amd/display: remove duplicated argument
drm/amdgpu: correct mca debugfs dump reg list
drm/amdgpu: correct acclerator check architecutre dump
drm/amdgpu: add pcs xgmi v6.4.0 ras support
drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3
drm/amdgpu: disable smu v13.0.6 mca debug mode by default
drm/amdgpu: Support multiple error query modes
drm/amdgpu: refine smu v13.0.6 mca dump driver
drm/amdgpu: Do not program PF-only regs in hdp_v4_0.c under SRIOV (v2)
drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV
drm: amd: Resolve Sphinx unexpected indentation warning
...
Linus Torvalds [Fri, 10 Nov 2023 20:22:14 +0000 (12:22 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"Mostly PMU fixes and a reworking of the pseudo-NMI disabling on broken
MediaTek firmware:
- Move the MediaTek GIC quirk handling from irqchip to core. Before
the merging window commit
44bd78dd2b88 ("irqchip/gic-v3: Disable
pseudo NMIs on MediaTek devices w/ firmware issues") temporarily
addressed this issue. Fixed now at a deeper level in the arch code
- Reject events meant for other PMUs in the CoreSight PMU driver,
otherwise some of the core PMU events would disappear
- Fix the Armv8 PMUv3 driver driver to not truncate 64-bit registers,
causing some events to be invisible
- Remove duplicate declaration of __arm64_sys##name following the
patch to avoid prototype warning for syscalls
- Typos in the elf_hwcap documentation"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/syscall: Remove duplicate declaration
Revert "arm64: smp: avoid NMI IPIs with broken MediaTek FW"
arm64: Move MediaTek GIC quirk handling from irqchip to core
arm64/arm: arm_pmuv3: perf: Don't truncate 64-bit registers
perf: arm_cspmu: Reject events meant for other PMUs
Documentation/arm64: Fix typos in elf_hwcaps
Linus Torvalds [Fri, 10 Nov 2023 19:57:51 +0000 (11:57 -0800)]
Merge tag 'sound-fix-6.7-rc1' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of fixes for rc1.
The majority of changes are various ASoC driver-specific small fixes
and usual HD-audio quirks, while there are a couple of core changes: a
fix in ALSA core procfs code to avoid deadlocks at disconnection and
an ASoC core fix for DAPM clock widgets"
* tag 'sound-fix-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
OSS: dmasound/paula: Convert to platform remove callback returning void
ALSA: hda: ASUS UM5302LA: Added quirks for cs35L41/
10431A83 on i2c bus
ALSA: info: Fix potential deadlock at disconnection
ASoC: nau8540: Add self recovery to improve capture quility
ALSA: hda/realtek: Add support dual speaker for Dell
ALSA: hda: Add ASRock X670E Taichi to denylist
ALSA: hda/realtek: Add quirk for ASUS UX7602ZM
ASoC: SOF: sof-client: trivial: fix comment typo
ASoC: dapm: fix clock get name
ASoC: hdmi-codec: register hpd callback on component probe
ASoC: mediatek: mt8186_mt6366_rt1019_rt5682s: trivial: fix error messages
ASoC: da7219: Improve system suspend and resume handling
ASoC: codecs: Modify macro value error
ASoC: codecs: Modify the wrong judgment of re value
ASoC: codecs: Modify the maximum value of calib
ASoC: amd: acp: fix for i2s mode register field update
ASoC: codecs: aw88399: Fix -Wuninitialized in aw_dev_set_vcalb()
ASoC: rt712-sdca: fix speaker route missing issue
ASoC: rockchip: Fix unused rockchip_i2s_tdm_match warning for !CONFIG_OF
ASoC: ti: omap-mcbsp: Fix runtime PM underflow warnings
Daniel Vetter [Fri, 10 Nov 2023 19:51:37 +0000 (20:51 +0100)]
Merge tag 'amd-drm-next-6.7-2023-11-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.7-2023-11-10:
amdgpu:
- SR-IOV fixes
- DMCUB fixes
- DCN3.5 fixes
- DP2 fixes
- SubVP fixes
- SMU14 fixes
- SDMA4.x fixes
- Suspend/resume fixes
- AGP regression fix
- UAF fixes for some error cases
- SMU 13.0.6 fixes
- Documentation fixes
- RAS fixes
- Hotplug fixes
- Scheduling entity ordering fix
- GPUVM fixes
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231110190703.4741-1-alexander.deucher@amd.com
Linus Torvalds [Fri, 10 Nov 2023 19:44:38 +0000 (11:44 -0800)]
Merge tag 'spi-fix-v6.7-merge-window' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of fixes that came in during the merge window: one Kconfig
dependency fix and another fix for a long standing issue where a sync
transfer races with system suspend"
* tag 'spi-fix-v6.7-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: Fix null dereference on suspend
spi: spi-zynq-qspi: add spi-mem to driver kconfig dependencies
Linus Torvalds [Fri, 10 Nov 2023 19:40:38 +0000 (11:40 -0800)]
Merge tag 'mmc-v6.7-2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix broken cache-flush support for Micron eMMCs
- Revert 'mmc: core: Capture correct oemid-bits for eMMC cards'
MMC host:
- sdhci_am654: Fix TAP value parsing for legacy speed mode
- sdhci-pci-gli: Fix support for ASPM mode for GL9755/GL9750
- vub300: Fix an error path in probe"
* tag 'mmc-v6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-pci-gli: GL9750: Mask the replay timer timeout of AER
mmc: sdhci-pci-gli: GL9755: Mask the replay timer timeout of AER
Revert "mmc: core: Capture correct oemid-bits for eMMC cards"
mmc: vub300: fix an error code
mmc: Add quirk MMC_QUIRK_BROKEN_CACHE_FLUSH for Micron eMMC Q2J54A
mmc: sdhci_am654: fix start loop index for TAP value parsing
Linus Torvalds [Fri, 10 Nov 2023 19:34:16 +0000 (11:34 -0800)]
Merge tag 'pwm/for-6.7-rc1-fixes' of git://git./linux/kernel/git/thierry.reding/linux-pwm
Pull pwm fixes from Thierry Reding:
"This contains two very small fixes that I failed to include in the
main pull request"
* tag 'pwm/for-6.7-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: Fix double shift bug
pwm: samsung: Fix a bit test in pwm_samsung_resume()
Linus Torvalds [Fri, 10 Nov 2023 19:25:58 +0000 (11:25 -0800)]
Merge tag 'io_uring-6.7-2023-11-10' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
"Mostly just a few fixes and cleanups caused by the read multishot
support.
Outside of that, a stable fix for how a connect retry is done"
* tag 'io_uring-6.7-2023-11-10' of git://git.kernel.dk/linux:
io_uring: do not clamp read length for multishot read
io_uring: do not allow multishot read to set addr or len
io_uring: indicate if io_kbuf_recycle did recycle anything
io_uring/rw: add separate prep handler for fixed read/write
io_uring/rw: add separate prep handler for readv/writev
io_uring/net: ensure socket is marked connected on connect retry
io_uring/rw: don't attempt to allocate async data if opcode doesn't need it
Linus Torvalds [Fri, 10 Nov 2023 19:20:33 +0000 (11:20 -0800)]
Merge tag 'block-6.7-2023-11-10' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- nvme keyring config compile fixes (Hannes and Arnd)
- fabrics keep alive fixes (Hannes)
- tcp authentication fixes (Mark)
- io_uring_cmd error handling fix (Anuj)
- stale firmware attribute fix (Daniel)
- tcp memory leak (Christophe)
- crypto library usage simplification (Eric)
- nbd use-after-free fix. May need a followup, but at least it's better
than what it was before (Li)
- Rate limit write on read-only device warnings (Yu)
* tag 'block-6.7-2023-11-10' of git://git.kernel.dk/linux:
nvme: keyring: fix conditional compilation
nvme: common: make keyring and auth separate modules
blk-core: use pr_warn_ratelimited() in bio_check_ro()
nbd: fix uaf in nbd_open
nvme: start keep-alive after admin queue setup
nvme-loop: always quiesce and cancel commands before destroying admin q
nvme-tcp: avoid open-coding nvme_tcp_teardown_admin_queue()
nvme-auth: always set valid seq_num in dhchap reply
nvme-auth: add flag for bi-directional auth
nvme-auth: auth success1 msg always includes resp
nvme: fix error-handling for io_uring nvme-passthrough
nvme: update firmware version after commit
nvme-tcp: Fix a memory leak
nvme-auth: use crypto_shash_tfm_digest()
Linus Torvalds [Fri, 10 Nov 2023 19:15:34 +0000 (11:15 -0800)]
Merge tag 'ata-6.7-rc1-2' of git://git./linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
- Revert a change in ata_pci_shutdown_one() to suspend disks on
shutdown as this is now done using the manage_shutdown scsi device
flag (me)
- Change the pata_falcon and pata_gayle drivers to stop using
module_platform_driver_probe(). This makes these drivers more inline
with all other drivers (allowing bind/unbind) and suppress a
compilation warning (Uwe)
- Convert the pata_falcon and pata_gayle drivers to the new
.remove_new() void-return callback. These 2 drivers are the last ones
needing this change (Uwe)
* tag 'ata-6.7-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: pata_gayle: Convert to platform remove callback returning void
ata: pata_falcon: Convert to platform remove callback returning void
ata: pata_gayle: Stop using module_platform_driver_probe()
ata: pata_falcon: Stop using module_platform_driver_probe()
ata: libata-core: Fix ata_pci_shutdown_one()
Linus Torvalds [Fri, 10 Nov 2023 19:09:07 +0000 (11:09 -0800)]
Merge tag 'dma-mapping-6.7-2023-11-10' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
- don't leave pages decrypted for DMA in encrypted memory setups linger
around on failure (Petr Tesarik)
- fix an out of bounds access in the new dynamic swiotlb code (Petr
Tesarik)
- fix dma_addressing_limited for systems with weird physical memory
layouts (Jia He)
* tag 'dma-mapping-6.7-2023-11-10' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: fix out-of-bounds TLB allocations with CONFIG_SWIOTLB_DYNAMIC
dma-mapping: fix dma_addressing_limited() if dma_range_map can't cover all system RAM
dma-mapping: move dma_addressing_limited() out of line
swiotlb: do not free decrypted pages if dynamic
Linus Torvalds [Fri, 10 Nov 2023 18:58:49 +0000 (10:58 -0800)]
Merge tag 'lsm-pr-
20231109' of git://git./linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
"We've got two small patches to correct the default return
value of two LSM hooks: security_vm_enough_memory_mm() and
security_inode_getsecctx()"
* tag 'lsm-pr-
20231109' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: fix default return value for inode_getsecctx
lsm: fix default return value for vm_enough_memory
Linus Torvalds [Fri, 10 Nov 2023 18:23:53 +0000 (10:23 -0800)]
Merge tag '6.7-rc-smb3-server-part2' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- slab out of bounds fix in ACL handling
- fix malformed request oops
- minor doc fix
* tag '6.7-rc-smb3-server-part2' of git://git.samba.org/ksmbd:
ksmbd: handle malformed smb1 message
ksmbd: fix kernel-doc comment of ksmbd_vfs_kern_path_locked()
ksmbd: fix slab out of bounds write in smb_inherit_dacl()
Linus Torvalds [Fri, 10 Nov 2023 17:52:56 +0000 (09:52 -0800)]
Merge tag 'ceph-for-6.7-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
- support for idmapped mounts in CephFS (Christian Brauner, Alexander
Mikhalitsyn).
The series was originally developed by Christian and later picked up
and brought over the finish line by Alexander, who also contributed
an enabler on the MDS side (separate owner_{u,g}id fields on the
wire).
The required exports for mnt_idmap_{get,put}() in VFS have been acked
by Christian and received no objection from Christoph.
- a churny change in CephFS logging to include cluster and client
identifiers in log and debug messages (Xiubo Li).
This would help in scenarios with dozens of CephFS mounts on the same
node which are getting increasingly common, especially in the
Kubernetes world.
* tag 'ceph-for-6.7-rc1' of https://github.com/ceph/ceph-client:
ceph: allow idmapped mounts
ceph: allow idmapped atomic_open inode op
ceph: allow idmapped set_acl inode op
ceph: allow idmapped setattr inode op
ceph: pass idmap to __ceph_setattr
ceph: allow idmapped permission inode op
ceph: allow idmapped getattr inode op
ceph: pass an idmapping to mknod/symlink/mkdir
ceph: add enable_unsafe_idmap module parameter
ceph: handle idmapped mounts in create_request_message()
ceph: stash idmapping in mdsc request
fs: export mnt_idmap_get/mnt_idmap_put
libceph, ceph: move mdsmap.h to fs/ceph
ceph: print cluster fsid and client global_id in all debug logs
ceph: rename _to_client() to _to_fs_client()
ceph: pass the mdsc to several helpers
libceph: add doutc and *_client debug macros support
Linus Torvalds [Fri, 10 Nov 2023 17:23:17 +0000 (09:23 -0800)]
Merge tag 'riscv-for-linus-6.7-mw2' of git://git./linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:
- Support for handling misaligned accesses in S-mode
- Probing for misaligned access support is now properly cached and
handled in parallel
- PTDUMP now reflects the SW reserved bits, as well as the PBMT and
NAPOT extensions
- Performance improvements for TLB flushing
- Support for many new relocations in the module loader
- Various bug fixes and cleanups
* tag 'riscv-for-linus-6.7-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
riscv: Optimize bitops with Zbb extension
riscv: Rearrange hwcap.h and cpufeature.h
drivers: perf: Do not broadcast to other cpus when starting a counter
drivers: perf: Check find_first_bit() return value
of: property: Add fw_devlink support for msi-parent
RISC-V: Don't fail in riscv_of_parent_hartid() for disabled HARTs
riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings
riscv: Don't use PGD entries for the linear mapping
RISC-V: Probe misaligned access speed in parallel
RISC-V: Remove __init on unaligned_emulation_finish()
RISC-V: Show accurate per-hart isa in /proc/cpuinfo
RISC-V: Don't rely on positional structure initialization
riscv: Add tests for riscv module loading
riscv: Add remaining module relocations
riscv: Avoid unaligned access when relocating modules
riscv: split cache ops out of dma-noncoherent.c
riscv: Improve flush_tlb_kernel_range()
riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb
riscv: Improve flush_tlb_range() for hugetlb pages
riscv: Improve tlb_flush()
...
Linus Torvalds [Fri, 10 Nov 2023 17:19:46 +0000 (09:19 -0800)]
Merge tag 'mips_6.7' of git://git./linux/kernel/git/mips/linux
Pull MIPS updates from Thomas Bogendoerfer:
- removed AR7 platform support
- cleanups and fixes
* tag 'mips_6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: AR7: remove platform
watchdog: ar7_wdt: remove driver to prepare for platform removal
vlynq: remove bus driver
mtd: parsers: ar7: remove support
serial: 8250: remove AR7 support
arch: mips: remove ReiserFS from defconfig
MIPS: lantiq: Remove unnecessary include of <linux/of_irq.h>
MIPS: lantiq: Fix pcibios_plat_dev_init() "no previous prototype" warning
MIPS: KVM: Fix a build warning about variable set but not used
MIPS: Remove dead code in relocate_new_kernel
mips: dts: ralink: mt7621: rename to GnuBee GB-PC1 and GnuBee GB-PC2
mips: dts: ralink: mt7621: define each reset as an item
mips: dts: ingenic: Remove unneeded probe-type properties
MIPS: loongson32: Remove dma.h and nand.h
Christian König [Tue, 31 Oct 2023 14:35:27 +0000 (15:35 +0100)]
drm/amdgpu: fix error handling in amdgpu_vm_init
When clearing the root PD fails we need to properly release it again.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Felix Kuehling [Tue, 31 Oct 2023 17:30:00 +0000 (13:30 -0400)]
drm/amdgpu: Fix possible null pointer dereference
mem = bo->tbo.resource may be NULL in amdgpu_vm_bo_update.
Fixes:
180253782038 ("drm/ttm: stop allocating dummy resources during BO creation")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Wed, 8 Nov 2023 14:40:44 +0000 (09:40 -0500)]
drm/amdgpu: move UVD and VCE sched entity init after sched init
We need kernel scheduling entities to deal with handle clean up
if apps are not cleaned up properly. With commit
56e449603f0ac5
("drm/sched: Convert the GPU scheduler to variable number of run-queues")
the scheduler entities have to be created after scheduler init, so
change the ordering to fix this.
v2: Leave logic in UVD and VCE code
Fixes:
56e449603f0a ("drm/sched: Convert the GPU scheduler to variable number of run-queues")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: ltuikov89@gmail.com
Tim Huang [Thu, 19 Oct 2023 07:50:43 +0000 (15:50 +0800)]
drm/amdgpu: move kfd_resume before the ip late init
The kfd_resume needs to touch GC registers to enable the interrupts,
it needs to be done before GFXOFF is enabled to ensure that the GFX is
not off and GC registers can be touched. So move kfd_resume before the
amdgpu_device_ip_late_init which enables the CGPG/GFXOFF.
Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mario Limonciello [Thu, 9 Nov 2023 17:23:46 +0000 (11:23 -0600)]
drm/amd: Explicitly check for GFXOFF to be enabled for s0ix
If a user has disabled GFXOFF this may cause problems for the suspend
sequence. Ensure that it is enabled in amdgpu_acpi_is_s0ix_active().
The system won't reach the deepest state but it also won't hang.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Daniel Vetter [Fri, 10 Nov 2023 15:54:41 +0000 (16:54 +0100)]
Merge tag 'drm-misc-fixes-2023-11-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-fixes for v6.7-rc1:
qxl:
- qxl memory leak fix.
syncobj:
- Fix waiting for DRM_SYNCOBJ_WAIT_FLAGS_WAIT_AVAILABLE
vc4:
- Fix UAF in mock helpers
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
[sima: Stitch together both changelogs from Maarten. Also because of
branch history this contains a few more bugfixes which are already in
v6.6, but I didn't feel like this justifies some backmerge since there
wasn't any real conflict.]
Link: https://patchwork.freedesktop.org/patch/msgid/bc8598ee-d427-4616-8ebd-64107ab9a2d8@linux.intel.com
Daniel Vetter [Fri, 10 Nov 2023 15:43:44 +0000 (16:43 +0100)]
Merge tag 'drm-intel-next-fixes-2023-11-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 fixes for v6.7-rc1:
- Fix null dereference when perf interface is not available
- Fix a -Wstringop-overflow warning
- Fix a -Wformat-truncation warning in intel_tc_port_init
- Flush WC GGTT only on required platforms
- Fix MTL HBR3 rate support on C10 phy and eDP
- Fix MTL notify_guc for multi-GT
- Bump GLK CDCLK frequency when driving multiple pipes
- Fix potential spectre vulnerability
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/878r78xrxd.fsf@intel.com
Steve French [Thu, 20 Jul 2023 13:30:32 +0000 (08:30 -0500)]
cifs: update internal module version number for cifs.ko
From 2.45 to 2.46
Signed-off-by: Steve French <stfrench@microsoft.com>
Shyam Prasad N [Fri, 13 Oct 2023 11:40:09 +0000 (11:40 +0000)]
cifs: handle when server stops supporting multichannel
When a server stops supporting multichannel, we will
keep attempting reconnects to the secondary channels today.
Avoid this by freeing extra channels when negotiate
returns no multichannel support.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Shyam Prasad N [Fri, 13 Oct 2023 11:33:21 +0000 (11:33 +0000)]
cifs: handle when server starts supporting multichannel
When the user mounts with multichannel option, but the
server does not support it, there can be a time in future
where it can be supported.
With this change, such a case is handled.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Steve French [Fri, 10 Nov 2023 07:24:16 +0000 (01:24 -0600)]
Missing field not being returned in ioctl CIFS_IOC_GET_MNT_INFO
The tcon_flags field was always being set to zero in the information
about the mount returned by the ioctl CIFS_IOC_GET_MNT_INFO instead
of being set to the value of the Flags field in the tree connection
structure as intended.
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Helge Deller [Fri, 10 Nov 2023 15:13:15 +0000 (16:13 +0100)]
parisc: Prevent booting 64-bit kernels on PA1.x machines
Bail out early with error message when trying to boot a 64-bit kernel on
32-bit machines. This fixes the previous commit to include the check for
true 64-bit kernels as well.
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes:
591d2108f3abc ("parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines")
Cc: <stable@vger.kernel.org> # v6.0+
Mark Hasemeyer [Tue, 7 Nov 2023 21:47:43 +0000 (14:47 -0700)]
spi: Fix null dereference on suspend
A race condition exists where a synchronous (noqueue) transfer can be
active during a system suspend. This can cause a null pointer
dereference exception to occur when the system resumes.
Example order of events leading to the exception:
1. spi_sync() calls __spi_transfer_message_noqueue() which sets
ctlr->cur_msg
2. Spi transfer begins via spi_transfer_one_message()
3. System is suspended interrupting the transfer context
4. System is resumed
6. spi_controller_resume() calls spi_start_queue() which resets cur_msg
to NULL
7. Spi transfer context resumes and spi_finalize_current_message() is
called which dereferences cur_msg (which is now NULL)
Wait for synchronous transfers to complete before suspending by
acquiring the bus mutex and setting/checking a suspend flag.
Signed-off-by: Mark Hasemeyer <markhas@chromium.org>
Link: https://lore.kernel.org/r/20231107144743.v1.1.I7987f05f61901f567f7661763646cb7d7919b528@changeid
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@kernel.org
Masami Hiramatsu (Google) [Wed, 8 Nov 2023 12:12:39 +0000 (21:12 +0900)]
tracing: fprobe-event: Fix to check tracepoint event and return
Fix to check the tracepoint event is not valid with $retval.
The commit
08c9306fc2e3 ("tracing/fprobe-event: Assume fprobe is
a return event by $retval") introduced automatic return probe
conversion with $retval. But since tracepoint event does not
support return probe, $retval is not acceptable.
Without this fix, ftracetest, tprobe_syntax_errors.tc fails;
[22] Tracepoint probe event parser error log check [FAIL]
----
# tail 22-tprobe_syntax_errors.tc-log.mRKroL
+ ftrace_errlog_check trace_fprobe t kfree ^$retval dynamic_events
+ printf %s t kfree
+ wc -c
+ pos=8
+ printf %s t kfree ^$retval
+ tr -d ^
+ command=t kfree $retval
+ echo Test command: t kfree $retval
Test command: t kfree $retval
+ echo
----
So 't kfree $retval' should fail (tracepoint doesn't support
return probe) but passed it.
Link: https://lore.kernel.org/all/169944555933.45057.12831706585287704173.stgit@devnote2/
Fixes:
08c9306fc2e3 ("tracing/fprobe-event: Assume fprobe is a return event by $retval")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Arnd Bergmann [Fri, 10 Nov 2023 10:59:05 +0000 (19:59 +0900)]
kprobes: unify kprobes_exceptions_nofify() prototypes
Most architectures that support kprobes declare this function in their
own asm/kprobes.h header and provide an override, but some are missing
the prototype, which causes a warning for the __weak stub implementation:
kernel/kprobes.c:1865:12: error: no previous prototype for 'kprobe_exceptions_notify' [-Werror=missing-prototypes]
1865 | int __weak kprobe_exceptions_notify(struct notifier_block *self,
Move the prototype into linux/kprobes.h so it is visible to all
the definitions.
Link: https://lore.kernel.org/all/20231108125843.3806765-4-arnd@kernel.org/
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
wuqiang.matt [Fri, 10 Nov 2023 10:59:04 +0000 (19:59 +0900)]
lib: test_objpool: make global variables static
Kernel test robot reported build warnings that structures g_ot_sync_ops,
g_ot_async_ops and g_testcases should be static. These definitions are
only used in test_objpool.c, so make them static
Link: https://lore.kernel.org/all/20231108012248.313574-1-wuqiang.matt@bytedance.com/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/
202311071229.WGrWUjM1-lkp@intel.com/
Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Masami Hiramatsu (Google) [Fri, 10 Nov 2023 10:59:03 +0000 (19:59 +0900)]
Documentation: tracing: Add a note about argument and retval access
Add a note about the argument and return value accecss will be best
effort. Depending on the type, it will be passed via stack or a
pair of the registers, but $argN and $retval only support the
single register access.
Link: https://lore.kernel.org/all/169556269377.146934.14829235476649685954.stgit@devnote2/
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Dan Carpenter [Wed, 25 Oct 2023 11:58:18 +0000 (14:58 +0300)]
pwm: Fix double shift bug
These enums are passed to set/test_bit(). The set/test_bit() functions
take a bit number instead of a shifted value. Passing a shifted value
is a double shift bug like doing BIT(BIT(1)). The double shift bug
doesn't cause a problem here because we are only checking 0 and 1 but
if the value was 5 or above then it can lead to a buffer overflow.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Dan Carpenter [Wed, 25 Oct 2023 11:57:34 +0000 (14:57 +0300)]
pwm: samsung: Fix a bit test in pwm_samsung_resume()
The PWMF_REQUESTED enum is supposed to be used with test_bit() and not
used as in a bitwise AND. In this specific code the flag will never be
set so the function is effectively a no-op.
Fixes:
e3fe982b2e4e ("pwm: samsung: Put per-channel data into driver data")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Arnd Bergmann [Wed, 8 Nov 2023 12:58:42 +0000 (13:58 +0100)]
fbdev: fsl-diu-fb: mark wr_reg_wa() static
wr_reg_wa() is not an appropriate name for a global function, and doesn't need
to be global anyway, so mark it static and avoid the warning:
drivers/video/fbdev/fsl-diu-fb.c:493:6: error: no previous prototype for 'wr_reg_wa' [-Werror=missing-prototypes]
Fixes:
0d9dab39fbbe ("powerpc/5121: fsl-diu-fb: fix issue with re-enabling DIU area descriptor")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Uwe Kleine-König [Thu, 9 Nov 2023 22:01:54 +0000 (23:01 +0100)]
fbdev: amifb: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Uwe Kleine-König [Thu, 9 Nov 2023 22:01:53 +0000 (23:01 +0100)]
fbdev: amifb: Mark driver struct with __refdata to prevent section mismatch warning
As described in the added code comment, a reference to .exit.text is ok
for drivers registered via module_platform_driver_probe(). Make this
explicit to prevent a section mismatch warning.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Steve French [Thu, 9 Nov 2023 21:28:12 +0000 (15:28 -0600)]
smb3: allow dumping session and tcon id to improve stats analysis and debugging
When multiple mounts are to the same share from the same client it was not
possible to determine which section of /proc/fs/cifs/Stats (and DebugData)
correspond to that mount. In some recent examples this turned out to be
a significant problem when trying to analyze performance data - since
there are many cases where unless we know the tree id and session id we
can't figure out which stats (e.g. number of SMB3.1.1 requests by type,
the total time they take, which is slowest, how many fail etc.) apply to
which mount. The only existing loosely related ioctl CIFS_IOC_GET_MNT_INFO
does not return the information needed to uniquely identify which tcon
is which mount although it does return various flags and device info.
Add a cifs.ko ioctl CIFS_IOC_GET_TCON_INFO (0x800ccf0c) to return tid,
session id, tree connect count.
Cc: stable@vger.kernel.org
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Arnd Bergmann [Wed, 8 Nov 2023 12:58:26 +0000 (13:58 +0100)]
parport: gsc: mark init function static
This is only used locally, so mark it static to avoid a warning:
drivers/parport/parport_gsc.c:395:5: error: no previous prototype for 'parport_gsc_init' [-Werror=missing-prototypes]
Acked-by: Helge Deller <deller@gmx.de>
Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Arnd Bergmann [Wed, 8 Nov 2023 14:58:13 +0000 (15:58 +0100)]
fbdev: hyperv_fb: fix uninitialized local variable use
When CONFIG_SYSFB is disabled, the hyperv_fb driver can now run into
undefined behavior on a gen2 VM, as indicated by this smatch warning:
drivers/video/fbdev/hyperv_fb.c:1077 hvfb_getmem() error: uninitialized symbol 'base'.
drivers/video/fbdev/hyperv_fb.c:1077 hvfb_getmem() error: uninitialized symbol 'size'.
Since there is no way to know the actual framebuffer in this configuration,
just return an allocation failure here, which should avoid the build
warning and the undefined behavior.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/
202311070802.YCpvehaz-lkp@intel.com/
Fixes:
a07b50d80ab6 ("hyperv: avoid dependency on screen_info")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Helge Deller <deller@gmx.de>