linux-2.6-microblaze.git
7 months agobcachefs: Fix NULL ptr dereference in btree_node_iter_and_journal_peek
Piotr Zalewski [Sun, 27 Oct 2024 19:46:52 +0000 (19:46 +0000)]
bcachefs: Fix NULL ptr dereference in btree_node_iter_and_journal_peek

Add NULL check for key returned from bch2_btree_and_journal_iter_peek in
btree_node_iter_and_journal_peek to avoid NULL ptr dereference in
bch2_bkey_buf_reassemble.

When key returned from bch2_btree_and_journal_iter_peek is NULL it means
that btree topology needs repair. Print topology error message with
position at which node wasn't found, its parent node information and
btree_id with level.

Return error code returned by bch2_topology_error to ensure that topology
error is handled properly by recovery.

Reported-by: syzbot+005ef9aa519f30d97657@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=005ef9aa519f30d97657
Fixes: 5222a4607cd8 ("bcachefs: BTREE_ITER_WITH_JOURNAL")
Suggested-by: Alan Huang <mmpgouride@gmail.com>
Suggested-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix possible null-ptr-deref in __bch2_ec_stripe_head_get()
Gaosheng Cui [Sat, 26 Oct 2024 10:26:58 +0000 (18:26 +0800)]
bcachefs: fix possible null-ptr-deref in __bch2_ec_stripe_head_get()

The function ec_new_stripe_head_alloc() returns nullptr if kzalloc()
fails. It is crucial to verify its return value before dereferencing
it to avoid a potential nullptr dereference.

Fixes: 035d72f72c91 ("bcachefs: bch2_ec_stripe_head_get() now checks for change in rw devices")
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Fix deadlock on -ENOSPC w.r.t. partial open buckets
Kent Overstreet [Sun, 27 Oct 2024 00:21:41 +0000 (20:21 -0400)]
bcachefs: Fix deadlock on -ENOSPC w.r.t. partial open buckets

Open buckets on the partial list should not count as allocated when
we're trying to allocate from the partial list.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Don't filter partial list buckets in open_buckets_to_text()
Kent Overstreet [Fri, 18 Oct 2024 06:26:59 +0000 (02:26 -0400)]
bcachefs: Don't filter partial list buckets in open_buckets_to_text()

these are an important source of stranded buckets we need to be able to
watch

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Don't keep tons of cached pointers around
Kent Overstreet [Mon, 21 Oct 2024 00:02:09 +0000 (20:02 -0400)]
bcachefs: Don't keep tons of cached pointers around

We had a bug report where the data update path was creating an extent
that failed to validate because it had too many pointers; almost all of
them were cached.

To fix this, we have:
- want_cached_ptr(), a new helper that checks if we even want a cached
  pointer (is on appropriate target, device is readable).

- bch2_extent_set_ptr_cached() now only sets a pointer cached if we want
  it.

- bch2_extent_normalize_by_opts() now ensures that we only have a single
  cached pointer that we want.

While working on this, it was noticed that this doesn't work well with
reflinked data and per-file options. Another patch series is coming that
plumbs through additional io path options through bch_extent_rebalance,
with improved option handling.

Reported-by: Reed Riley <reed@riley.engineer>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: init freespace inited bits to 0 in bch2_fs_initialize
Piotr Zalewski [Sat, 26 Oct 2024 00:15:49 +0000 (00:15 +0000)]
bcachefs: init freespace inited bits to 0 in bch2_fs_initialize

Initialize freespace_initialized bits to 0 in member's flags and update
member's cached version for each device in bch2_fs_initialize.

It's possible for the bits to be set to 1 before fs is initialized and if
call to bch2_trans_mark_dev_sbs (just before bch2_fs_freespace_init) fails
bits remain to be 1 which can later indirectly trigger BUG condition in
bch2_bucket_alloc_freelist during shutdown.

Reported-by: syzbot+2b6a17991a6af64f9489@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=2b6a17991a6af64f9489
Fixes: bbe682c76789 ("bcachefs: Ensure devices are always correctly initialized")
Suggested-by: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Fix unhandled transaction restart in fallocate
Kent Overstreet [Sat, 26 Oct 2024 00:18:48 +0000 (20:18 -0400)]
bcachefs: Fix unhandled transaction restart in fallocate

This used to not matter, but now we're being more strict.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Fix UAF in bch2_reconstruct_alloc()
Kent Overstreet [Fri, 25 Oct 2024 17:13:05 +0000 (13:13 -0400)]
bcachefs: Fix UAF in bch2_reconstruct_alloc()

write_super() -> sb_counters_from_cpu() may reallocate the superblock

Reported-by: syzbot+9fc4dac4775d07bcfe34@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix null-ptr-deref in have_stripes()
Jeongjun Park [Wed, 23 Oct 2024 16:13:45 +0000 (01:13 +0900)]
bcachefs: fix null-ptr-deref in have_stripes()

c->btree_roots_known[i].b can be NULL. In this case, a NULL pointer dereference
occurs, so you need to add code to check the variable.

Reported-by: syzbot+b468b9fef56949c3b528@syzkaller.appspotmail.com
Fixes: 7773df19c35f ("bcachefs: metadata version bucket_stripe_sectors")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix shift oob in alloc_lru_idx_fragmentation
Jeongjun Park [Mon, 21 Oct 2024 15:43:56 +0000 (00:43 +0900)]
bcachefs: fix shift oob in alloc_lru_idx_fragmentation

The size of a.data_type is set abnormally large, causing shift-out-of-bounds.
To fix this, we need to add validation on a.data_type in
alloc_lru_idx_fragmentation().

Reported-by: syzbot+7f45fa9805c40db3f108@syzkaller.appspotmail.com
Fixes: 260af1562ec1 ("bcachefs: Kill alloc_v4.fragmentation_lru")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Fix invalid shift in validate_sb_layout()
Gianfranco Trad [Wed, 23 Oct 2024 21:30:44 +0000 (23:30 +0200)]
bcachefs: Fix invalid shift in validate_sb_layout()

Add check on layout->sb_max_size_bits against BCH_SB_LAYOUT_SIZE_BITS_MAX
to prevent UBSAN shift-out-of-bounds in validate_sb_layout().

Reported-by: syzbot+089fad5a3a5e77825426@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=089fad5a3a5e77825426
Fixes: 03ef80b469d5 ("bcachefs: Ignore unknown mount options")
Tested-by: syzbot+089fad5a3a5e77825426@syzkaller.appspotmail.com
Signed-off-by: Gianfranco Trad <gianf.trad@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Set bch_inode_unpacked.bi_snapshot in old inode path
Kent Overstreet [Sun, 20 Oct 2024 22:00:13 +0000 (18:00 -0400)]
bcachefs: Set bch_inode_unpacked.bi_snapshot in old inode path

This fixes a fsck bug on a very old filesystem (pre mainline merge).

Fixes: 72350ee0ea22 ("bcachefs: Kill snapshot arg to fsck_write_inode()")
Reported-by: Marcin Mirosław <marcin@mejor.pl>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Mark more errors as AUTOFIX
Kent Overstreet [Sun, 20 Oct 2024 20:48:31 +0000 (16:48 -0400)]
bcachefs: Mark more errors as AUTOFIX

Reported-by: Marcin Mirosław <marcin@mejor.pl>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Workaround for kvmalloc() not supporting > INT_MAX allocations
Kent Overstreet [Sat, 19 Oct 2024 22:27:09 +0000 (18:27 -0400)]
bcachefs: Workaround for kvmalloc() not supporting > INT_MAX allocations

kvmalloc() doesn't support allocations > INT_MAX, but vmalloc() does -
the limit should be lifted, but we can work around this for now.

A user with a 75 TB filesystem reported the following journal replay
error:
https://github.com/koverstreet/bcachefs/issues/769

In journal replay we have to sort and dedup all the keys from the
journal, which means we need a large contiguous allocation. Given that
the user has 128GB of ram, the 2GB limit on allocation size has become
far too small.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Don't use wait_event_interruptible() in recovery
Kent Overstreet [Sat, 19 Oct 2024 21:50:41 +0000 (17:50 -0400)]
bcachefs: Don't use wait_event_interruptible() in recovery

Fix a bug where mount was failing with -ERESTARTSYS:
https://github.com/koverstreet/bcachefs/issues/741

We only want the interruptible wait when called from fsync.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Fix __bch2_fsck_err() warning
Kent Overstreet [Sat, 19 Oct 2024 21:23:10 +0000 (17:23 -0400)]
bcachefs: Fix __bch2_fsck_err() warning

We only warn about having a btree_trans that wasn't passed in if we'll
be prompting.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fsck: Improve hash_check_key()
Kent Overstreet [Fri, 18 Oct 2024 04:22:09 +0000 (00:22 -0400)]
bcachefs: fsck: Improve hash_check_key()

hash_check_key() checks and repairs the hash table btrees: dirents and
xattrs are open addressing hash tables.

We recently had a corruption reported where the hash type on an inode
somehow got flipped, which made the existing dirents invisible and
allowed new ones to be created with the same name.

Now, hash_check_key() can repair duplicates: it will delete one of them,
if it has an xattr or dangling dirent, but if it has two valid dirents
one of them gets renamed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: bch2_hash_set_or_get_in_snapshot()
Kent Overstreet [Fri, 18 Oct 2024 04:19:12 +0000 (00:19 -0400)]
bcachefs: bch2_hash_set_or_get_in_snapshot()

Add a variant of bch2_hash_set_in_snapshot() that returns the existing
key on -EEXIST.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Repair mismatches in inode hash seed, type
Kent Overstreet [Fri, 18 Oct 2024 03:00:14 +0000 (23:00 -0400)]
bcachefs: Repair mismatches in inode hash seed, type

Different versions of the same inode (same inode number, different
snapshot ID) must have the same hash seed and type - lookups require
this, since they see keys from different snapshots simultaneously.

To repair we only need to make the inodes consistent, hash_check_key()
will do the rest.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Add hash seed, type to inode_to_text()
Kent Overstreet [Sat, 28 Sep 2024 18:44:06 +0000 (14:44 -0400)]
bcachefs: Add hash seed, type to inode_to_text()

This helped with discovering some filesystem corruption fsck has having
trouble with: the str_hash type had gotten flipped on one snapshot's
version of an inode.

All versions of a given inode number have the same hash seed and hash
type, since lookups will be done with a single hash/seed and type and
see dirents/xattrs from multiple snapshots.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: INODE_STR_HASH() for bch_inode_unpacked
Kent Overstreet [Fri, 18 Oct 2024 02:55:59 +0000 (22:55 -0400)]
bcachefs: INODE_STR_HASH() for bch_inode_unpacked

Trivial cleanup - add a normal BITMASK() helper for bch_inode_unpacked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Run in-kernel offline fsck without ratelimit errors
Kent Overstreet [Fri, 18 Oct 2024 02:29:23 +0000 (22:29 -0400)]
bcachefs: Run in-kernel offline fsck without ratelimit errors

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: skip mount option handle for empty string.
Hongbo Li [Thu, 10 Oct 2024 04:01:48 +0000 (12:01 +0800)]
bcachefs: skip mount option handle for empty string.

The options parse in get_tree will split the options buffer, it will
get the empty string for last one by strsep(). After commit
ea0eeb89b1d5 ("bcachefs: reject unknown mount options") is merged,
unknown mount options is not allowed (here is empty string), and this
causes this errors. This can be reproduced just by the following steps:

    bcachefs format /dev/loop
    mount -t bcachefs -o metadata_target=loop1 /dev/loop1 /mnt/bcachefs/

Fixes: ea0eeb89b1d5 ("bcachefs: reject unknown mount options")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix incorrect show_options results
Hongbo Li [Thu, 26 Sep 2024 02:00:01 +0000 (10:00 +0800)]
bcachefs: fix incorrect show_options results

When call show_options in bcachefs, the options buffer is appeneded
to the seq variable. In fact, it requires an additional comma to be
appended first. This will affect the remount process when reading
existing mount options.

Fixes: 9305cf91d05e ("bcachefs: bch2_opts_to_text()")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Fix data corruption on -ENOSPC in buffered write path
Kent Overstreet [Thu, 17 Oct 2024 05:10:49 +0000 (01:10 -0400)]
bcachefs: Fix data corruption on -ENOSPC in buffered write path

Found by generic/299: When we have to truncate a write due to -ENOSPC,
we may have to read in the folio we're writing to if we're now no longer
doing a complete write to a !uptodate folio.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: bch2_folio_reservation_get_partial() is now better behaved
Kent Overstreet [Thu, 17 Oct 2024 05:05:17 +0000 (01:05 -0400)]
bcachefs: bch2_folio_reservation_get_partial() is now better behaved

bch2_folio_reservation_get_partial(), on partial success, will now
return a reservation that's aligned to the filesystem blocksize.

This is a partial fix for fstests generic/299 - fio verify is badly
behaved in the presence of short writes that aren't aligned to its
blocksize.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix disk reservation accounting in bch2_folio_reservation_get()
Kent Overstreet [Mon, 14 Oct 2024 21:55:48 +0000 (17:55 -0400)]
bcachefs: fix disk reservation accounting in bch2_folio_reservation_get()

bch2_disk_reservation_put() zeroes out the reservation - oops.

This fixes a disk reservation leak when getting a quota reservation
returned an error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefS: ec: fix data type on stripe deletion
Kent Overstreet [Wed, 16 Oct 2024 10:32:12 +0000 (06:32 -0400)]
bcachefS: ec: fix data type on stripe deletion

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: Don't use commit_do() unnecessarily
Kent Overstreet [Mon, 14 Oct 2024 01:53:26 +0000 (21:53 -0400)]
bcachefs: Don't use commit_do() unnecessarily

Using commit_do() to call alloc_sectors_start_trans() breaks when we're
randomly injecting transaction restarts - the restart in the commit
causes us to leak the lock that alloc_sectorS_start_trans() takes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: handle restarts in bch2_bucket_io_time_reset()
Kent Overstreet [Tue, 15 Oct 2024 03:58:45 +0000 (23:58 -0400)]
bcachefs: handle restarts in bch2_bucket_io_time_reset()

bch2_bucket_io_time_reset() doesn't need to succeed, which is why it
didn't previously retry on transaction restart - but we're now treating
these as errors.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix restart handling in __bch2_resume_logged_op_finsert()
Kent Overstreet [Wed, 16 Oct 2024 08:11:15 +0000 (04:11 -0400)]
bcachefs: fix restart handling in __bch2_resume_logged_op_finsert()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix restart handling in bch2_alloc_write_key()
Kent Overstreet [Wed, 16 Oct 2024 07:36:40 +0000 (03:36 -0400)]
bcachefs: fix restart handling in bch2_alloc_write_key()

This is ugly:

We may discover in alloc_write_key that the data type we calculated is
wrong, because BCH_DATA_need_discard is checked/set elsewhere, and the
disk accounting counters we calculated need to be updated.

But bch2_alloc_key_to_dev_counters(..., BTREE_TRIGGER_gc) is not safe
w.r.t. transaction restarts, so we need to propagate the fixup back to
our gc state in case we take a transaction restart.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix restart handling in bch2_do_invalidates_work()
Kent Overstreet [Tue, 15 Oct 2024 06:13:22 +0000 (02:13 -0400)]
bcachefs: fix restart handling in bch2_do_invalidates_work()

this one is fairly harmless since the invalidate worker will just run
again later if it needs to, but still worth fixing

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix missing restart handling in bch2_read_retry_nodecode()
Kent Overstreet [Tue, 15 Oct 2024 03:52:38 +0000 (23:52 -0400)]
bcachefs: fix missing restart handling in bch2_read_retry_nodecode()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix restart handling in bch2_fiemap()
Kent Overstreet [Tue, 15 Oct 2024 03:32:23 +0000 (23:32 -0400)]
bcachefs: fix restart handling in bch2_fiemap()

We were leaking transaction restart errors to userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix bch2_hash_delete() error path
Kent Overstreet [Tue, 15 Oct 2024 02:40:20 +0000 (22:40 -0400)]
bcachefs: fix bch2_hash_delete() error path

we were exiting an iterator that hadn't been initialized

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
7 months agobcachefs: fix restart handling in bch2_rename2()
Kent Overstreet [Tue, 15 Oct 2024 02:18:12 +0000 (22:18 -0400)]
bcachefs: fix restart handling in bch2_rename2()

This should be impossible to hit in practice; the first lookup within a
transaction won't return a restart due to lock ordering, but we're
adding fault injection for transaction restarts and shaking out bugs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix sysfs warning in fstests generic/730,731
Kent Overstreet [Sat, 12 Oct 2024 18:36:38 +0000 (14:36 -0400)]
bcachefs: Fix sysfs warning in fstests generic/730,731

sysfs warns if we're removing a symlink from a directory that's no
longer in sysfs; this is triggered by fstests generic/730, which
simulates hot removal of a block device.

This patch is however not a correct fix, since checking
kobj->state_in_sysfs on a kobj owned by another subsystem is racy.

A better fix would be to add the appropriate check to
sysfs_remove_link() - and sysfs_create_link() as well.

But kobject_add_internal()/kobject_del() do not as of today have locking
that would support that.

Note that the block/holder.c code appears to be subject to this race as
well.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev
Kent Overstreet [Sun, 13 Oct 2024 23:38:00 +0000 (19:38 -0400)]
bcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev

When creating a new stripe, we may reuse an existing stripe that has
some empty and some nonempty blocks.

Generally, the existing stripe won't change underneath us - except for
block sector counts, which we copy to the new key in
ec_stripe_key_update.

But the device removal path can now invalidate stripe pointers to a
device, and that can race with stripe reuse.

Change ec_stripe_key_update() to check for and resolve this
inconsistency.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix kasan splat in new_stripe_alloc_buckets()
Kent Overstreet [Mon, 14 Oct 2024 00:16:45 +0000 (20:16 -0400)]
bcachefs: Fix kasan splat in new_stripe_alloc_buckets()

Update for BCH_SB_MEMBER_INVALID.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Add missing validation for bch_stripe.csum_granularity_bits
Kent Overstreet [Sat, 12 Oct 2024 21:03:30 +0000 (17:03 -0400)]
bcachefs: Add missing validation for bch_stripe.csum_granularity_bits

Reported-by: syzbot+f8c98a50c323635be65d@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix missing bounds checks in bch2_alloc_read()
Kent Overstreet [Sat, 12 Oct 2024 19:49:23 +0000 (15:49 -0400)]
bcachefs: Fix missing bounds checks in bch2_alloc_read()

We were checking that the alloc key was for a valid device, but not a
valid bucket.

This is the upgrade path from versions prior to bcachefs being mainlined.

Reported-by: syzbot+a1b59c8e1a3f022fd301@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: fix uaf in bch2_dio_write_done()
Kent Overstreet [Sat, 12 Oct 2024 19:38:33 +0000 (15:38 -0400)]
bcachefs: fix uaf in bch2_dio_write_done()

Reported-by: syzbot+19ad84d5133871207377@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Improve check_snapshot_exists()
Kent Overstreet [Thu, 10 Oct 2024 01:28:11 +0000 (21:28 -0400)]
bcachefs: Improve check_snapshot_exists()

Check if we have snapshot_trees or subvolumes that refer to the snapshot
node being reconstructed, and use them.

With this, the kill_btree_root test that blows away the snapshots btree
now passes, and we're able to successfully reconstruct.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix bkey_nocow_lock()
Kent Overstreet [Sat, 12 Oct 2024 09:00:26 +0000 (05:00 -0400)]
bcachefs: Fix bkey_nocow_lock()

This fixes an assertion pop in nocow_locking.c

00243 kernel BUG at fs/bcachefs/nocow_locking.c:41!
00243 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
00243 Modules linked in:
00243 Hardware name: linux,dummy-virt (DT)
00243 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
00244 pc : bch2_bucket_nocow_unlock (/home/testdashboard/linux-7/fs/bcachefs/nocow_locking.c:41)
00244 lr : bkey_nocow_lock (/home/testdashboard/linux-7/fs/bcachefs/data_update.c:79)
00244 sp : ffffff80c82373b0
00244 x29: ffffff80c82373b0 x28: ffffff80e08958c0 x27: ffffff80e0880000
00244 x26: ffffff80c8237a98 x25: 00000000000000a0 x24: ffffff80c8237ab0
00244 x23: 00000000000000c0 x22: 0000000000000008 x21: 0000000000000000
00244 x20: ffffff80c8237a98 x19: 0000000000000018 x18: 0000000000000000
00244 x17: 0000000000000000 x16: 000000000000003f x15: 0000000000000000
00244 x14: 0000000000000008 x13: 0000000000000018 x12: 0000000000000000
00244 x11: 0000000000000000 x10: ffffff80e0880000 x9 : ffffffc0803ac1a4
00244 x8 : 0000000000000018 x7 : ffffff80c8237a88 x6 : ffffff80c8237ab0
00244 x5 : ffffff80e08988d0 x4 : 00000000ffffffff x3 : 0000000000000000
00244 x2 : 0000000000000004 x1 : 0003000000000d1e x0 : ffffff80e08988c0
00244 Call trace:
00244 bch2_bucket_nocow_unlock (/home/testdashboard/linux-7/fs/bcachefs/nocow_locking.c:41)
00245 bch2_data_update_init (/home/testdashboard/linux-7/fs/bcachefs/data_update.c:627 (discriminator 1))
00245 promote_alloc.isra.0 (/home/testdashboard/linux-7/fs/bcachefs/io_read.c:242 /home/testdashboard/linux-7/fs/bcachefs/io_read.c:304)
00245 __bch2_read_extent (/home/testdashboard/linux-7/fs/bcachefs/io_read.c:949)
00246 __bch2_read (/home/testdashboard/linux-7/fs/bcachefs/io_read.c:1215)
00246 bch2_direct_IO_read (/home/testdashboard/linux-7/fs/bcachefs/fs-io-direct.c:132)
00246 bch2_read_iter (/home/testdashboard/linux-7/fs/bcachefs/fs-io-direct.c:201)
00247 aio_read.constprop.0 (/home/testdashboard/linux-7/fs/aio.c:1602)
00247 io_submit_one.constprop.0 (/home/testdashboard/linux-7/fs/aio.c:2003 /home/testdashboard/linux-7/fs/aio.c:2052)
00248 __arm64_sys_io_submit (/home/testdashboard/linux-7/fs/aio.c:2111 /home/testdashboard/linux-7/fs/aio.c:2081 /home/testdashboard/linux-7/fs/aio.c:2081)
00248 invoke_syscall.constprop.0 (/home/testdashboard/linux-7/arch/arm64/include/asm/syscall.h:61 /home/testdashboard/linux-7/arch/arm64/kernel/syscall.c:54)
00248 ========= FAILED TIMEOUT tiering_variable_buckets_replicas in 1200s

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix accounting replay flags
Kent Overstreet [Sat, 12 Oct 2024 06:44:38 +0000 (02:44 -0400)]
bcachefs: Fix accounting replay flags

BCH_TRANS_COMMIT_journal_reclaim without BCH_WATERMARK_reclaim means
"return an error if low on journal space" - but accounting replay must
succeed.

Fixes https://github.com/koverstreet/bcachefs/issues/656

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix invalid shift in member_to_text()
Kent Overstreet [Sat, 12 Oct 2024 02:06:58 +0000 (22:06 -0400)]
bcachefs: Fix invalid shift in member_to_text()

Reported-by: syzbot+064ce437a1ad63d3f6ef@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID
Kent Overstreet [Sat, 12 Oct 2024 02:00:44 +0000 (22:00 -0400)]
bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID

This fixes a kasan splat in the ec device removal tests.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry
Kent Overstreet [Wed, 9 Oct 2024 20:21:00 +0000 (16:21 -0400)]
bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry

inode_bit_waitqueue() is changing - this update clears the way for
sched changes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Check if stuck in journal_res_get()
Kent Overstreet [Mon, 7 Oct 2024 20:55:34 +0000 (16:55 -0400)]
bcachefs: Check if stuck in journal_res_get()

Like how we already do when the allocator seems to be stuck, check if
we're waiting too long for a journal reservation and print some debug
info.

This is specifically to track down
https://github.com/koverstreet/bcachefs/issues/656

which is showing up in userspace where we don't have sysfs/debugfs to
get the journal debug info.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agoclosures: Add closure_wait_event_timeout()
Kent Overstreet [Mon, 7 Oct 2024 20:54:11 +0000 (16:54 -0400)]
closures: Add closure_wait_event_timeout()

Add a closure version of wait_event_timeout(), with the same semantics.

The closure version is useful because unlike wait_event(), it allows
blocking code to run in the conditional expression.

Cc: Coly Li <colyli@suse.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix state lock involved deadlock
Alan Huang [Tue, 8 Oct 2024 17:33:05 +0000 (01:33 +0800)]
bcachefs: Fix state lock involved deadlock

We increased write ref, if the fs went to RO, that would lead to
a deadlock, it actually happens:

00171 ========= TEST   generic/279
00171
00172 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=nocow
00172 bcachefs (vdb): recovering from clean shutdown, journal seq 35
00172 bcachefs (vdb): accounting_read... done
00172 bcachefs (vdb): alloc_read... done
00172 bcachefs (vdb): stripes_read... done
00172 bcachefs (vdb): snapshots_read... done
00172 bcachefs (vdb): journal_replay... done
00172 bcachefs (vdb): resume_logged_ops... done
00172 bcachefs (vdb): going read-write
00172 bcachefs (vdb): done starting filesystem
00172 FSTYP         -- bcachefs
00172 PLATFORM      -- Linux/aarch64 farm3-kvm 6.11.0-rc1-ktest-g3e290a0b8e34 #7030 SMP Tue Oct  8 14:15:12 UTC 2024
00172 MKFS_OPTIONS  -- --nocow /dev/vdc
00172 MOUNT_OPTIONS -- /dev/vdc /mnt/scratch
00172
00172 bcachefs (vdc): starting version 1.12: rebalance_work_acct_fix opts=nocow
00172 bcachefs (vdc): initializing new filesystem
00172 bcachefs (vdc): going read-write
00172 bcachefs (vdc): marking superblocks
00172 bcachefs (vdc): initializing freespace
00172 bcachefs (vdc): done initializing freespace
00172 bcachefs (vdc): reading snapshots table
00172 bcachefs (vdc): reading snapshots done
00172 bcachefs (vdc): done starting filesystem
00173 bcachefs (vdc): shutting down
00173 bcachefs (vdc): going read-only
00173 bcachefs (vdc): finished waiting for writes to stop
00173 bcachefs (vdc): flushing journal and stopping allocators, journal seq 4
00173 bcachefs (vdc): flushing journal and stopping allocators complete, journal seq 6
00173 bcachefs (vdc): shutdown complete, journal seq 7
00173 bcachefs (vdc): marking filesystem clean
00173 bcachefs (vdc): shutdown complete
00173 bcachefs (vdb): shutting down
00173 bcachefs (vdb): going read-only
00361 INFO: task umount:6180 blocked for more than 122 seconds.
00361 Not tainted 6.11.0-rc1-ktest-g3e290a0b8e34 #7030
00361 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
00361 task:umount          state:D stack:0     pid:6180  tgid:6180  ppid:6176   flags:0x00000004
00361 Call trace:
00362 __switch_to (arch/arm64/kernel/process.c:556)
00362 __schedule (kernel/sched/core.c:5191 kernel/sched/core.c:6529)
00363 schedule (include/asm-generic/bitops/generic-non-atomic.h:128 include/linux/thread_info.h:192 include/linux/sched.h:2084 kernel/sched/core.c:6608 kernel/sched/core.c:6621)
00365 bch2_fs_read_only (fs/bcachefs/super.c:346 (discriminator 41))
00367 __bch2_fs_stop (fs/bcachefs/super.c:620)
00368 bch2_put_super (fs/bcachefs/fs.c:1942)
00369 generic_shutdown_super (include/linux/list.h:373 (discriminator 2) fs/super.c:650 (discriminator 2))
00371 bch2_kill_sb (fs/bcachefs/fs.c:2170)
00372 deactivate_locked_super (fs/super.c:434 fs/super.c:475)
00373 deactivate_super (fs/super.c:508)
00374 cleanup_mnt (fs/namespace.c:250 fs/namespace.c:1374)
00376 __cleanup_mnt (fs/namespace.c:1381)
00376 task_work_run (include/linux/sched.h:2024 kernel/task_work.c:224)
00377 do_notify_resume (include/linux/resume_user_mode.h:50 arch/arm64/kernel/entry-common.c:151)
00377 el0_svc (arch/arm64/include/asm/daifflags.h:28 arch/arm64/kernel/entry-common.c:171 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
00377 el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731)
00378 el0t_64_sync (arch/arm64/kernel/entry.S:598)
00378 INFO: task tee:6182 blocked for more than 122 seconds.
00378 Not tainted 6.11.0-rc1-ktest-g3e290a0b8e34 #7030
00378 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
00378 task:tee             state:D stack:0     pid:6182  tgid:6182  ppid:533    flags:0x00000004
00378 Call trace:
00378 __switch_to (arch/arm64/kernel/process.c:556)
00378 __schedule (kernel/sched/core.c:5191 kernel/sched/core.c:6529)
00378 schedule (include/asm-generic/bitops/generic-non-atomic.h:128 include/linux/thread_info.h:192 include/linux/sched.h:2084 kernel/sched/core.c:6608 kernel/sched/core.c:6621)
00378 schedule_preempt_disabled (kernel/sched/core.c:6680)
00379 rwsem_down_read_slowpath (kernel/locking/rwsem.c:1073 (discriminator 1))
00379 down_read (kernel/locking/rwsem.c:1529)
00381 bch2_gc_gens (fs/bcachefs/sb-members.h:77 fs/bcachefs/sb-members.h:88 fs/bcachefs/sb-members.h:128 fs/bcachefs/btree_gc.c:1240)
00383 bch2_fs_store_inner (fs/bcachefs/sysfs.c:473)
00385 bch2_fs_internal_store (fs/bcachefs/sysfs.c:417 fs/bcachefs/sysfs.c:580 fs/bcachefs/sysfs.c:576)
00386 sysfs_kf_write (fs/sysfs/file.c:137)
00387 kernfs_fop_write_iter (fs/kernfs/file.c:334)
00389 vfs_write (fs/read_write.c:497 fs/read_write.c:590)
00390 ksys_write (fs/read_write.c:643)
00391 __arm64_sys_write (fs/read_write.c:652)
00391 invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
00392 do_el0_svc (include/linux/thread_info.h:127 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
00392 el0_svc (arch/arm64/include/asm/irqflags.h:55 arch/arm64/include/asm/irqflags.h:76 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
00392 el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731)
00392 el0t_64_sync (arch/arm64/kernel/entry.S:598)

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix NULL pointer dereference in bch2_opt_to_text
Mohammed Anees [Sat, 5 Oct 2024 13:02:29 +0000 (18:32 +0530)]
bcachefs: Fix NULL pointer dereference in bch2_opt_to_text

This patch adds a bounds check to the bch2_opt_to_text function to prevent
NULL pointer dereferences when accessing the opt->choices array. This
ensures that the index used is within valid bounds before dereferencing.
The new version enhances the readability.

Reported-and-tested-by: syzbot+37186860aa7812b331d5@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=37186860aa7812b331d5
Signed-off-by: Mohammed Anees <pvmohammedanees2003@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Release transaction before wake up
Alan Huang [Tue, 8 Oct 2024 16:59:08 +0000 (00:59 +0800)]
bcachefs: Release transaction before wake up

We will get this if we wake up first:

Kernel panic - not syncing: btree_node_write_done leaked btree_trans

since there are still transactions waiting for cycle detectors after
BTREE_NODE_write_in_flight is cleared.

Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: add check for btree id against max in try read node
Piotr Zalewski [Sun, 29 Sep 2024 14:26:45 +0000 (14:26 +0000)]
bcachefs: add check for btree id against max in try read node

Add check for read node's btree_id against BTREE_ID_NR_MAX in
try_read_btree_node to prevent triggering EBUG_ON condition in
bch2_btree_id_root[1].

[1] https://syzkaller.appspot.com/bug?extid=cf7b2215b5d70600ec00

Reported-by: syzbot+cf7b2215b5d70600ec00@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cf7b2215b5d70600ec00
Fixes: 4409b8081d16 ("bcachefs: Repair pass for scanning for btree nodes")
Signed-off-by: Piotr Zalewski <pZ010001011111@proton.me>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Disk accounting device validation fixes
Kent Overstreet [Mon, 7 Oct 2024 22:04:21 +0000 (18:04 -0400)]
bcachefs: Disk accounting device validation fixes

- Fix failure to validate that accounting replicas entries point to
  valid devices: this wasn't a real bug since they'd be cleaned up by
  GC, but is still something we should know about

- Fix failure to validate that dev_data_type entries point to valid
  devices: this does fix a real bug, since bch2_accounting_read() would
  then try to copy the counters to that device and pop an inconsistent
  error when the device didn't exist

- Remove accounting entries that are zeroed or invalid: if we're not
  validating them we need to get rid of them: they might not exist in
  the superblock, so we need the to trigger the superblock mark path
  when they're readded.

  This fixes the replication.ktest rereplicate test, which was failing
  with "superblock not marked for replicas..."

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: bch2_inode_or_descendents_is_open()
Kent Overstreet [Thu, 3 Oct 2024 01:23:41 +0000 (21:23 -0400)]
bcachefs: bch2_inode_or_descendents_is_open()

fsck can now correctly check if inodes in interior snapshot nodes are
open/in use.

- Tweak the vfs inode rhashtable so that the subvolume ID isn't hashed,
  meaning inums in different subvolumes will hash to the same slot. Note
  that this is a hack, and will cause problems if anyone ever has the
  same file in many different snapshots open all at the same time.

- Then check if any of those subvolumes is a descendent of the snapshot
  ID being checked

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Kill bch2_propagate_key_to_snapshot_leaves()
Kent Overstreet [Mon, 30 Sep 2024 04:38:13 +0000 (00:38 -0400)]
bcachefs: Kill bch2_propagate_key_to_snapshot_leaves()

Dead code now.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: bcachefs_metadata_version_inode_has_child_snapshots
Kent Overstreet [Mon, 30 Sep 2024 02:11:37 +0000 (22:11 -0400)]
bcachefs: bcachefs_metadata_version_inode_has_child_snapshots

There's an inherent race in taking a snapshot while an unlinked file is
open, and then reattaching it in the child snapshot.

In the interior snapshot node the file will appear unlinked, as though
it should be deleted - it's not referenced by anything in that snapshot
- but we can't delete it, because the file data is referenced by the
child snapshot.

This was being handled incorrectly with
propagate_key_to_snapshot_leaves() - but that doesn't resolve the
fundamental inconsistency of "this file looks like it should be deleted
according to normal rules, but - ".

To fix this, we need to fix the rule for when an inode is deleted. The
previous rule, ignoring snapshots (there was no well-defined rule
for with snapshots) was:
  Unlinked, non open files are deleted, either at recovery time or
  during online fsck

The new rule is:
  Unlinked, non open files, that do not exist in child snapshots, are
  deleted.

To make this work transactionally, we add a new inode flag,
BCH_INODE_has_child_snapshot; it overrides BCH_INODE_unlinked when
considering whether to delete an inode, or put it on the deleted list.

For transactional consistency, clearing it handled by the inode trigger:
when deleting an inode we check if there are parent inodes which can now
have the BCH_INODE_has_child_snapshot flag cleared.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Delete vestigal check_inode() checks
Kent Overstreet [Sun, 29 Sep 2024 03:30:05 +0000 (23:30 -0400)]
bcachefs: Delete vestigal check_inode() checks

BCH_INODE_i_size_dirty dates from before we had logged operations for
truncate (as well as finsert) - it hasn't been needed since before
bcachefs was mainlined.

BCH_INODE_i_sectors_dirty hasn't been needed since we started always
updating i_sectors transactionally - it's been unused for even longer.

BCH_INODE_backptr_untrusted also hasn't been used since prior to
mainlining; when unlinking a hardling, we zero out the backpointer
fields if they're for the dirent being removed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: btree_iter_peek_upto() now handles BTREE_ITER_all_snapshots
Kent Overstreet [Sat, 5 Oct 2024 01:40:13 +0000 (21:40 -0400)]
bcachefs: btree_iter_peek_upto() now handles BTREE_ITER_all_snapshots

end_pos now compares against snapshot ID when required

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: reattach_inode() now correctly handles interior snapshot nodes
Kent Overstreet [Mon, 30 Sep 2024 23:03:19 +0000 (19:03 -0400)]
bcachefs: reattach_inode() now correctly handles interior snapshot nodes

When we find an unreachable inode, we now reattach it in the oldest
version that needs to be reattached (thus avoiding redundant work
reattaching every single version), and we now fix up inode -> dirent
backpointers in newer versions as needed - or white out the reattaching
dirent in newer versions, if the newer version isn't supposed to be
reattached.

This results in the second verify fsck now passing cleanly after
repairing on a user-provided filesystem image with thousands of
different snapshots.

Reported-by: Christopher Snowhill <chris@kode54.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Split out check_unreachable_inodes() pass
Kent Overstreet [Mon, 30 Sep 2024 03:40:28 +0000 (23:40 -0400)]
bcachefs: Split out check_unreachable_inodes() pass

With inode backpointers, we can write a very simple
check_unreachable_inodes() pass that only looks for non-unlinked inodes
that are missing backpointers, and reattaches them.

This simplifies check_directory_structure() so that it's now only
checking for directory structure loops,

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix lockdep splat in bch2_accounting_read
Kent Overstreet [Sat, 5 Oct 2024 21:37:02 +0000 (17:37 -0400)]
bcachefs: Fix lockdep splat in bch2_accounting_read

We can't take sb_lock while holding mark_lock, so split out
replicas_entry_validate() and replicas_entry_sb_validate() -
replicas_entry_validate() now uses the normal online device interface.

00039 ========= TEST   set_option
00039
00039 WATCHDOG 30
00040 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=errors=panic
00040 bcachefs (vdb): initializing new filesystem
00040 bcachefs (vdb): going read-write
00040 bcachefs (vdb): marking superblocks
00040 bcachefs (vdb): initializing freespace
00040 bcachefs (vdb): done initializing freespace
00040 bcachefs (vdb): reading snapshots table
00040 bcachefs (vdb): reading snapshots done
00040 bcachefs (vdb): done starting filesystem
00040 zstd
00041 bcachefs (vdb): shutting down
00041 bcachefs (vdb): going read-only
00041 bcachefs (vdb): finished waiting for writes to stop
00041 bcachefs (vdb): flushing journal and stopping allocators, journal seq 3
00041 bcachefs (vdb): flushing journal and stopping allocators complete, journal seq 11
00041 bcachefs (vdb): shutdown complete, journal seq 12
00041 bcachefs (vdb): marking filesystem clean
00041 bcachefs (vdb): shutdown complete
00041 Setting option on offline fs
00041 bch2_write_super(): fatal error : attempting to write superblock that wasn't version downgraded (1.12: (unknown version) > 1.10: disk_accounting_v3)
00041 fatal error - emergency read only
00041 bch2_write_super(): fatal error : attempting to write superblock that wasn't version downgraded (1.12: (unknown version) > 1.10: disk_accounting_v3)
00042 bcachefs (vdb): starting version 1.12: rebalance_work_acct_fix opts=errors=panic,compression=zstd
00042 bcachefs (vdb): recovering from clean shutdown, journal seq 12
00042 bcachefs (vdb): accounting_read...
00042
00042 ======================================================
00042 WARNING: possible circular locking dependency detected
00042 6.12.0-rc1-ktest-g805e938a8502 #6807 Not tainted
00042 ------------------------------------------------------
00042 mount.bcachefs/665 is trying to acquire lock:
00045 ffffff80cc280908 (&c->sb_lock){+.+.}-{3:3}, at: bch2_replicas_entry_validate (fs/bcachefs/replicas.c:102)
00045
00045 but task is already holding lock:
00048 ffffff80cc284870 (&c->mark_lock){++++}-{0:0}, at: bch2_accounting_read (fs/bcachefs/disk_accounting.c:670 (discriminator 1))
00048
00048 which lock already depends on the new lock.
00048
00048
00048 the existing dependency chain (in reverse order) is:
00048
00048 -> #1 (&c->mark_lock){++++}-{0:0}:
00049 percpu_down_write (kernel/locking/percpu-rwsem.c:232)
00052 bch2_sb_replicas_to_cpu_replicas (fs/bcachefs/replicas.c:583)
00055 bch2_sb_to_fs (fs/bcachefs/super-io.c:614)
00057 bch2_fs_open (fs/bcachefs/super.c:828 fs/bcachefs/super.c:2050)
00060 bch2_fs_get_tree (fs/bcachefs/fs.c:2067)
00062 vfs_get_tree (fs/super.c:1801)
00064 path_mount (fs/namespace.c:3507 fs/namespace.c:3834)
00066 __arm64_sys_mount (fs/namespace.c:3847 fs/namespace.c:4055 fs/namespace.c:4032 fs/namespace.c:4032)
00067 invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
00068 do_el0_svc (include/linux/thread_info.h:127 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
00069 el0_svc (arch/arm64/include/asm/irqflags.h:82 arch/arm64/include/asm/irqflags.h:123 arch/arm64/include/asm/irqflags.h:136 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
00069 ========= FAILED TIMEOUT set_option in 30s

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Rework logged op error handling
Kent Overstreet [Tue, 24 Sep 2024 02:06:58 +0000 (22:06 -0400)]
bcachefs: Rework logged op error handling

Initially it was thought that we just wanted to ignore errors from
logged op replay, but it turns out we do need to catch -EROFS, or we'll
go into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Add warn param to subvol_get_snapshot, peek_inode
Kent Overstreet [Tue, 24 Sep 2024 09:33:07 +0000 (05:33 -0400)]
bcachefs: Add warn param to subvol_get_snapshot, peek_inode

These shouldn't always be fatal errors - logged op resume, in
particular, and we want it as a parameter there.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Kill snapshot arg to fsck_write_inode()
Kent Overstreet [Mon, 30 Sep 2024 04:00:33 +0000 (00:00 -0400)]
bcachefs: Kill snapshot arg to fsck_write_inode()

It was initially believed that it would be better to be explicit about
the snapshot we're updating when writing inodes in fsck; however, it
turns out that passing around the snapshot separately is more error
prone and we're usually updating the inode in the same snapshow we read
it from.

This is different from normal filesystem paths, where we do the update
in the snapshot of the subvolume we're in.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Check for unlinked, non-empty dirs in check_inode()
Kent Overstreet [Mon, 30 Sep 2024 03:38:37 +0000 (23:38 -0400)]
bcachefs: Check for unlinked, non-empty dirs in check_inode()

We want to check for this early so it can be reattached if necessary in
check_unreachable_inodes(); better than letting it be deleted and having
the children reattached, losing their filenames.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Check for unlinked inodes with dirents
Kent Overstreet [Mon, 30 Sep 2024 02:38:04 +0000 (22:38 -0400)]
bcachefs: Check for unlinked inodes with dirents

link count works differently in bcachefs - it's only nonzero for files
with multiple hardlinks, which means we can also avoid checking it
except for files that are known to have hardlinks.

That means we need a few different checks instead; in particular, we
don't want fsck to delet a file that has a dirent pointing to it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Check for directories with no backpointers
Kent Overstreet [Sat, 28 Sep 2024 19:27:37 +0000 (15:27 -0400)]
bcachefs: Check for directories with no backpointers

It's legal for regular files to have missing backpointers (due to
hardlinks), and fsck should automatically add them, but for directories
this is an error that should be flagged.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Kill alloc_v4.fragmentation_lru
Kent Overstreet [Tue, 1 Oct 2024 23:08:37 +0000 (19:08 -0400)]
bcachefs: Kill alloc_v4.fragmentation_lru

The fragmentation_lru field hasn't been needed since we reworked the LRU
btrees to use the btree write buffer; previously it was used to resolve
collisions, but the revised LRU btree uses the backpointer (the bucket)
as part of the key.

It should have been deleted at the time of the LRU rework; since it
wasn't, that left places for bugs to hide, in check/repair.

This fixes LRU fsck on a filesystem image helpfully provided by a user
who disappeared before I could get his name for the reported-by.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: minor lru fsck fixes
Kent Overstreet [Tue, 1 Oct 2024 20:40:33 +0000 (16:40 -0400)]
bcachefs: minor lru fsck fixes

check_lru_key() wasn't using write buffer updates for deleting bad lru
entries - dating from before the lru btree used the btree write buffer.

And when possibly flushing the btree write buffer (to make sure we're
seeing a real inconsistency), we need to be using the modern
bch2_btree_write_buffer_maybe_flush().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Mark more errors AUTOFIX
Kent Overstreet [Tue, 1 Oct 2024 20:26:21 +0000 (16:26 -0400)]
bcachefs: Mark more errors AUTOFIX

Errors are getting marked as AUTOFIX once they've been (re)-tested and
audited.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Make sure we print error that causes fsck to bail out
Kent Overstreet [Tue, 1 Oct 2024 20:26:02 +0000 (16:26 -0400)]
bcachefs: Make sure we print error that causes fsck to bail out

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: bkey errors are only AUTOFIX during read
Kent Overstreet [Fri, 4 Oct 2024 19:05:40 +0000 (15:05 -0400)]
bcachefs: bkey errors are only AUTOFIX during read

Newly generated keys, in the transaction commit path or write path,
should not be AUTOFIX; those indicate bugs that we need to fail fast
for.

Fixes: 5612daafb764 ("bcachefs: Fix fsck warnings from bkey validation")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Create lost+found in correct snapshot
Kent Overstreet [Sat, 28 Sep 2024 19:33:08 +0000 (15:33 -0400)]
bcachefs: Create lost+found in correct snapshot

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix reattach_inode()
Kent Overstreet [Sat, 28 Sep 2024 06:44:12 +0000 (02:44 -0400)]
bcachefs: Fix reattach_inode()

Ensure a copy of the lost+found inode exists in the snapshot that we're
reattaching, so that we don't trigger warnings in
lookup_inode_for_snapshot() later.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Add missing wakeup to bch2_inode_hash_remove()
Kent Overstreet [Fri, 4 Oct 2024 23:44:32 +0000 (19:44 -0400)]
bcachefs: Add missing wakeup to bch2_inode_hash_remove()

This fixes two different bugs:

- Looser locking with the rhashtable means we need to recheck if the
  inode is still hashed after prepare_to_wait(), and add a corresponding
  wakeup after removing from the hash table.

da18ecbf0fb6 ("fs: add i_state helpers") changed the bit waitqueues
  used for inodes, and bcachefs wasn't updated and thus broke; this
  updates bcachefs to the new helper.

Fixes: 112d21fd1a12 ("bcachefs: switch to rhashtable for vfs inodes hash")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix trans_commit disk accounting revert
Kent Overstreet [Thu, 3 Oct 2024 01:35:38 +0000 (21:35 -0400)]
bcachefs: Fix trans_commit disk accounting revert

We only are applying JSET_ENTRY_TYPE_write_buffer_keys, revert path was
missed.

Fixes: a3581ca35d2b ("bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix bch2_inode_is_open() check
Kent Overstreet [Thu, 3 Oct 2024 01:31:31 +0000 (21:31 -0400)]
bcachefs: Fix bch2_inode_is_open() check

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix return type of dirent_points_to_inode_nowarn()
Kent Overstreet [Tue, 1 Oct 2024 21:43:36 +0000 (17:43 -0400)]
bcachefs: Fix return type of dirent_points_to_inode_nowarn()

we're returning an error code now, not a bool

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: Fix bad shift in bch2_read_flag_list()
Kent Overstreet [Mon, 30 Sep 2024 22:46:48 +0000 (18:46 -0400)]
bcachefs: Fix bad shift in bch2_read_flag_list()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agobcachefs: rename version -> bversion for big endian builds
Guenter Roeck [Mon, 30 Sep 2024 00:39:02 +0000 (17:39 -0700)]
bcachefs: rename version -> bversion for big endian builds

Builds on big endian systems fail as follows.

fs/bcachefs/bkey.h: In function 'bch2_bkey_format_add_key':
fs/bcachefs/bkey.h:557:41: error:
'const struct bkey' has no member named 'bversion'

The original commit only renamed the variable for little endian builds.
Rename it for big endian builds as well to fix the problem.

Fixes: cf49f8a8c277 ("bcachefs: rename version -> bversion")
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
8 months agoLinux 6.12-rc1
Linus Torvalds [Sun, 29 Sep 2024 22:06:19 +0000 (15:06 -0700)]
Linux 6.12-rc1

8 months agox86: kvm: fix build error
Linus Torvalds [Sun, 29 Sep 2024 21:47:33 +0000 (14:47 -0700)]
x86: kvm: fix build error

The cpu_emergency_register_virt_callback() function is used
unconditionally by the x86 kvm code, but it is declared (and defined)
conditionally:

  #if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
  void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
  ...

leading to a build error when neither KVM_INTEL nor KVM_AMD support is
enabled:

  arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’:
  arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration]
  12517 |         cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’:
  arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration]
  12522 |         cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix the build by defining empty helper functions the same way the old
cpu_emergency_disable_virtualization() function was dealt with for the
same situation.

Maybe we could instead have made the call sites conditional, since the
callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak
fallback.  I'll leave that to the kvm people to argue about, this at
least gets the build going for that particular config.

Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 months agoMerge tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar...
Linus Torvalds [Sun, 29 Sep 2024 16:53:04 +0000 (09:53 -0700)]
Merge tag 'mailbox-v6.12' of git://git./linux/kernel/git/jassibrar/mailbox

Pull mailbox updates from Jassi Brar:

 - fix kconfig dependencies (mhu-v3, omap2+)

 - use devie name instead of genereic imx_mu_chan as interrupt name
   (imx)

 - enable sa8255p and qcs8300 ipc controllers (qcom)

 - Fix timeout during suspend mode (bcm2835)

 - convert to use use of_property_match_string (mailbox)

 - enable mt8188 (mediatek)

 - use devm_clk_get_enabled helpers (spreadtrum)

 - fix device-id typo (rockchip)

* tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
  mailbox, remoteproc: omap2+: fix compile testing
  dt-bindings: mailbox: qcom-ipcc: Document QCS8300 IPCC
  dt-bindings: mailbox: qcom-ipcc: document the support for SA8255p
  dt-bindings: mailbox: mtk,adsp-mbox: Add compatible for MT8188
  mailbox: Use of_property_match_string() instead of open-coding
  mailbox: bcm2835: Fix timeout during suspend mode
  mailbox: sprd: Use devm_clk_get_enabled() helpers
  mailbox: rockchip: fix a typo in module autoloading
  mailbox: imx: use device name in interrupt name
  mailbox: ARM_MHU_V3 should depend on ARM64

8 months agoMerge tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sun, 29 Sep 2024 16:47:33 +0000 (09:47 -0700)]
Merge tag 'i2c-for-6.12-rc1-additional_fixes' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - fix DesignWare driver ENABLE-ABORT sequence, ensuring ABORT can
   always be sent when needed

 - check for PCLK in the SynQuacer controller as an optional clock,
   allowing ACPI to directly provide the clock rate

 - KEBA driver Kconfig dependency fix

 - fix XIIC driver power suspend sequence

* tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
  i2c: keba: I2C_KEBA should depend on KEBA_CP500
  i2c: synquacer: Deal with optional PCLK correctly
  i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled

8 months agoMerge tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma...
Linus Torvalds [Sun, 29 Sep 2024 16:35:10 +0000 (09:35 -0700)]
Merge tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:

 - handle chained SGLs in the new tracing code (Christoph Hellwig)

* tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: fix DMA API tracing for chained scatterlists

8 months agoMerge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sun, 29 Sep 2024 16:22:34 +0000 (09:22 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "These are mostly minor updates.

  There are two drivers (lpfc and mpi3mr) which missed the initial
  pull and a core change to retry a start/stop unit which affect
  suspend/resume"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
  scsi: lpfc: Update lpfc version to 14.4.0.5
  scsi: lpfc: Support loopback tests with VMID enabled
  scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
  scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
  scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
  scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
  scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
  scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
  scsi: mpi3mr: Update driver version to 8.12.0.0.50
  scsi: mpi3mr: Improve wait logic while controller transitions to READY state
  scsi: mpi3mr: Update MPI Headers to revision 34
  scsi: mpi3mr: Use firmware-provided timestamp update interval
  scsi: mpi3mr: Enhance the Enable Controller retry logic
  scsi: sd: Fix off-by-one error in sd_read_block_characteristics()
  scsi: pm8001: Do not overwrite PCI queue mapping
  scsi: scsi_debug: Remove a useless memset()
  scsi: pmcraid: Convert comma to semicolon
  scsi: sd: Retry START STOP UNIT commands
  scsi: mpi3mr: A performance fix
  scsi: ufs: qcom: Update MODE_MAX cfg_bw value
  ...

8 months agoMerge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs
Linus Torvalds [Sun, 29 Sep 2024 16:17:44 +0000 (09:17 -0700)]
Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs

Pull more bcachefs updates from Kent Overstreet:
 "Assorted minor syzbot fixes, and for bigger stuff:

  Fix two disk accounting rewrite bugs:

   - Disk accounting keys use the version field of bkey so that journal
     replay can tell which updates have been applied to the btree.

     This is set in the transaction commit path, after we've gotten our
     journal reservation (and our time ordering), but the
     BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay
     uses was incorrectly skipping this for new updates generated prior
     to journal replay.

     This fixes the underlying cause of an assertion pop in
     disk_accounting_read.

   - A couple of fixes for disk accounting + device removal.

     Checking if acocunting replicas entries were marked in the
     superblock was being done at the wrong point, when deltas in the
     journal could still zero them out, and then additionally we'd try
     to add a missing replicas entry to the superblock without checking
     if it referred to an invalid (removed) device.

  A whole slew of repair fixes:

   - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
     an infinite loop when repairing a filesystem with many snapshots

   - fix incorrect transaction restart handling leading to occasional
     "fsck counted ..." warnings

   - fix warning in __bch2_fsck_err() for bkey fsck errors

   - check_inode() in fsck now correctly checks if the filesystem was
     clean

   - there shouldn't be pending logged ops if the fs was clean, we now
     check for this

   - remove_backpointer() doesn't remove a dirent that doesn't actually
     point to the inode

   - many more fsck errors are AUTOFIX"

* tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits)
  bcachefs: check_subvol_path() now prints subvol root inode
  bcachefs: remove_backpointer() now checks if dirent points to inode
  bcachefs: dirent_points_to_inode() now warns on mismatch
  bcachefs: Fix lost wake up
  bcachefs: Check for logged ops when clean
  bcachefs: BCH_FS_clean_recovery
  bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
  bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
  bcachefs: Check for accounting keys with bversion=0
  bcachefs: rename version -> bversion
  bcachefs: Don't delete unlinked inodes before logged op resume
  bcachefs: Fix BCH_SB_ERRS() so we can reorder
  bcachefs: Fix fsck warnings from bkey validation
  bcachefs: Move transaction commit path validation to as late as possible
  bcachefs: Fix disk accounting attempting to mark invalid replicas entry
  bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
  bcachefs: Fix accounting read + device removal
  bcachefs: bch_accounting_mode
  bcachefs: fix transaction restart handling in check_extents(), check_dirents()
  bcachefs: kill inode_walker_entry.seen_this_pos
  ...

8 months agoMerge tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 29 Sep 2024 16:10:00 +0000 (09:10 -0700)]
Merge tag 'x86-urgent-2024-09-29' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Fix TDX MMIO #VE fault handling, and add two new Intel model numbers
  for 'Pantherlake' and 'Diamond Rapids'"

* tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Add two Intel CPU model numbers
  x86/tdx: Fix "in-kernel MMIO" check

8 months agoMerge tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 29 Sep 2024 15:51:30 +0000 (08:51 -0700)]
Merge tag 'locking-urgent-2024-09-29' of git://git./linux/kernel/git/tip/tip

Pull locking updates from Ingo Molnar:
 "lockdep:
    - Fix potential deadlock between lockdep and RCU (Zhiguo Niu)
    - Use str_plural() to address Coccinelle warning (Thorsten Blum)
    - Add debuggability enhancement (Luis Claudio R. Goncalves)

  static keys & calls:
    - Fix static_key_slow_dec() yet again (Peter Zijlstra)
    - Handle module init failure correctly in static_call_del_module()
      (Thomas Gleixner)
    - Replace pointless WARN_ON() in static_call_module_notify() (Thomas
      Gleixner)

  <linux/cleanup.h>:
    - Add usage and style documentation (Dan Williams)

  rwsems:
    - Move is_rwsem_reader_owned() and rwsem_owner() under
      CONFIG_DEBUG_RWSEMS (Waiman Long)

  atomic ops, x86:
    - Redeclare x86_32 arch_atomic64_{add,sub}() as void (Uros Bizjak)
    - Introduce the read64_nonatomic macro to x86_32 with cx8 (Uros
      Bizjak)"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
* tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Move is_rwsem_reader_owned() and rwsem_owner() under CONFIG_DEBUG_RWSEMS
  jump_label: Fix static_key_slow_dec() yet again
  static_call: Replace pointless WARN_ON() in static_call_module_notify()
  static_call: Handle module init failure correctly in static_call_del_module()
  locking/lockdep: Simplify character output in seq_line()
  lockdep: fix deadlock issue between lockdep and rcu
  lockdep: Use str_plural() to fix Coccinelle warning
  cleanup: Add usage and style documentation
  lockdep: suggest the fix for "lockdep bfs error:-1" on print_bfs_bug
  locking/atomic/x86: Redeclare x86_32 arch_atomic64_{add,sub}() as void
  locking/atomic/x86: Introduce the read64_nonatomic macro to x86_32 with cx8

8 months agoMerge tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall...
Linus Torvalds [Sun, 29 Sep 2024 15:44:28 +0000 (08:44 -0700)]
Merge tag 'cocci-for-6.12' of git://git./linux/kernel/git/jlawall/linux

Pull coccinelle updates from Julia Lawall:
 "Extend string_choices.cocci to use more available helpers

  Ten patches from Hongbo Li extending string_choices.cocci with the
  complete set of functions offered by include/linux/string_choices.h.

  One patch from myself reducing the number of redundant cases that are
  checked by Coccinelle, giving a small performance improvement"

* tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
  Reduce Coccinelle choices in string_choices.cocci
  coccinelle: Remove unnecessary parentheses for only one possible change.
  coccinelle: Add rules to find str_yes_no() replacements
  coccinelle: Add rules to find str_on_off() replacements
  coccinelle: Add rules to find str_write_read() replacements
  coccinelle: Add rules to find str_read_write() replacements
  coccinelle: Add rules to find str_enable{d}_disable{d}() replacements
  coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements
  coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements
  coccinelle: Add rules to find str_false_true() replacements
  coccinelle: Add rules to find str_true_false() replacements

8 months agoMerge tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Sun, 29 Sep 2024 15:37:03 +0000 (08:37 -0700)]
Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kselftest fix from Shuah Khan:
 "One urgent fix to vDSO as automated testing is failing due to this
  bug"

* tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: vDSO: align stack for O2-optimized memcpy

8 months agoMerge branch 'locking/core' into locking/urgent, to pick up pending commits
Ingo Molnar [Sun, 29 Sep 2024 06:57:18 +0000 (08:57 +0200)]
Merge branch 'locking/core' into locking/urgent, to pick up pending commits

Merge all pending locking commits into a single branch.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 months agoReduce Coccinelle choices in string_choices.cocci
Julia Lawall [Sat, 28 Sep 2024 19:26:22 +0000 (21:26 +0200)]
Reduce Coccinelle choices in string_choices.cocci

The isomorphism neg_if_exp negates the test of a ?: conditional,
making it unnecessary to have an explicit case for a negated test
with the branches inverted.

At the same time, we can disable neg_if_exp in cases where a
different API function may be more suitable for a negated test.

Finally, in the non-patch cases, E matches an expression with
parentheses around it, so there is no need to mention ()
explicitly in the pattern.  The () are still needed in the patch
cases, because we want to drop them, if they are present.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
8 months agococcinelle: Remove unnecessary parentheses for only one possible change.
Hongbo Li [Wed, 11 Sep 2024 01:09:27 +0000 (09:09 +0800)]
coccinelle: Remove unnecessary parentheses for only one possible change.

The parentheses are only needed if there is a disjunction, ie a
set of possible changes. If there is only one pattern, we can
remove these parentheses. Just like the format:

  -  x
  +  y

not:

  (
  -  x
  +  y
  )

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
8 months agococcinelle: Add rules to find str_yes_no() replacements
Hongbo Li [Wed, 11 Sep 2024 01:09:26 +0000 (09:09 +0800)]
coccinelle: Add rules to find str_yes_no() replacements

As other rules done, we add rules for str_yes_no()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
8 months agococcinelle: Add rules to find str_on_off() replacements
Hongbo Li [Wed, 11 Sep 2024 01:09:25 +0000 (09:09 +0800)]
coccinelle: Add rules to find str_on_off() replacements

As other rules done, we add rules for str_on_off()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
8 months agococcinelle: Add rules to find str_write_read() replacements
Hongbo Li [Wed, 11 Sep 2024 01:09:24 +0000 (09:09 +0800)]
coccinelle: Add rules to find str_write_read() replacements

As other rules done, we add rules for str_write_read()
to check the relative opportunities.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>