linux-2.6-microblaze.git
10 months agobcachefs: Fix printing of device durability
Kent Overstreet [Sat, 30 Dec 2023 00:16:14 +0000 (19:16 -0500)]
bcachefs: Fix printing of device durability

BCH_MEMBER_DURABILITY() was not present initially; a value of 0 means
use the default, nonzero means use v - 1.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
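
The encoding described in the commit above (0 means "use the default", nonzero means v - 1) can be sketched as follows. This is a minimal illustration, not the kernel code: the helper names and the default value of 1 are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: the default durability is 1. */
#define SKETCH_DEFAULT_DURABILITY 1u

/* Decode the raw on-disk field: 0 = field absent, use the default;
 * any other value v encodes a durability of v - 1. */
static unsigned sketch_decode_durability(uint64_t raw)
{
	return raw ? (unsigned)(raw - 1) : SKETCH_DEFAULT_DURABILITY;
}

/* Encode an explicit durability so that it is never mistaken for "absent". */
static uint64_t sketch_encode_durability(unsigned durability)
{
	return (uint64_t)durability + 1;
}
```

The bias of one is what lets an explicit durability of 0 be stored without colliding with "field not present".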
10 months agobcachefs: __bch2_journal_key_to_wb -> bch2_journal_key_to_wb_slowpath
Kent Overstreet [Thu, 28 Dec 2023 01:26:30 +0000 (20:26 -0500)]
bcachefs: __bch2_journal_key_to_wb -> bch2_journal_key_to_wb_slowpath

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: __journal_keys_sort() refactoring
Kent Overstreet [Thu, 28 Dec 2023 01:31:21 +0000 (20:31 -0500)]
bcachefs: __journal_keys_sort() refactoring

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: wb_key_cmp -> wb_key_ref_cmp
Kent Overstreet [Wed, 27 Dec 2023 23:23:34 +0000 (18:23 -0500)]
bcachefs: wb_key_cmp -> wb_key_ref_cmp

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: track transaction durations
Kent Overstreet [Sun, 24 Dec 2023 03:43:33 +0000 (22:43 -0500)]
bcachefs: track transaction durations

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: btree_trans always has stats
Kent Overstreet [Sun, 24 Dec 2023 04:08:45 +0000 (23:08 -0500)]
bcachefs: btree_trans always has stats

Reserve slot 0 for "unknown" (used when we overflow), to avoid some branches.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
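
A rough sketch of the "slot 0 is the unknown slot" idea from the commit above. All names and the table size are hypothetical; the point is only that a lookup which always returns a valid index lets the accounting code skip a "do we have stats?" branch.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NR_STATS 4	/* hypothetical table size; slot 0 is "unknown" */

struct sketch_stats {
	unsigned long	count;
};

static const char		*sketch_names[SKETCH_NR_STATS];
static struct sketch_stats	sketch_table[SKETCH_NR_STATS];

/* Find or claim a slot for @fn. When the table overflows we return 0,
 * the shared "unknown" slot, rather than an error. */
static unsigned sketch_stats_idx(const char *fn)
{
	for (unsigned i = 1; i < SKETCH_NR_STATS; i++) {
		if (!sketch_names[i]) {
			sketch_names[i] = fn;
			return i;
		}
		if (!strcmp(sketch_names[i], fn))
			return i;
	}
	return 0;
}

/* Every index is valid, so no NULL or range check is needed here. */
static void sketch_stats_account(unsigned idx)
{
	sketch_table[idx].count++;
}
```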
10 months agobcachefs: Split brain detection
Kent Overstreet [Wed, 28 Jun 2023 01:02:27 +0000 (21:02 -0400)]
bcachefs: Split brain detection

Use the new bch_member->seq, sb->write_time fields to detect split brain
and kick out devices when necessary.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch_member->seq
Kent Overstreet [Wed, 28 Jun 2023 01:02:27 +0000 (21:02 -0400)]
bcachefs: bch_member->seq

Add new fields for split brain detection:

 - bch_member->seq, which tracks the sequence number of the last superblock
   write that happened to each member device

 - bch_sb->write_time, which tracks the time of the last superblock write,
   to allow detection of when two members have diverged but had the same
   number of superblock writes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
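
To illustrate why both fields are useful, here is one plausible comparison of two superblock copies. It is not the kernel's detection logic, which the commit message does not show; the struct and enum names are invented for the example.

```c
#include <stdint.h>

/* Hypothetical, simplified view of one device's superblock. */
struct sketch_sb {
	uint64_t	seq;		/* number of superblock writes seen */
	uint64_t	write_time;	/* time of the last superblock write */
};

enum sketch_sb_relation {
	SKETCH_SB_SAME,
	SKETCH_SB_OLDER,
	SKETCH_SB_NEWER,
	SKETCH_SB_DIVERGED,
};

/* Compare our copy against another device's copy. Equal sequence numbers
 * with different write times mean both sides were written independently
 * the same number of times, which seq alone cannot detect. */
static enum sketch_sb_relation sketch_sb_compare(const struct sketch_sb *ours,
						 const struct sketch_sb *theirs)
{
	if (ours->seq < theirs->seq)
		return SKETCH_SB_OLDER;
	if (ours->seq > theirs->seq)
		return SKETCH_SB_NEWER;
	return ours->write_time == theirs->write_time
		? SKETCH_SB_SAME
		: SKETCH_SB_DIVERGED;
}
```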
10 months agobcachefs: Fix nochanges/read_only interaction
Kent Overstreet [Sat, 23 Dec 2023 22:50:29 +0000 (17:50 -0500)]
bcachefs: Fix nochanges/read_only interaction

nochanges means "we cannot issue writes at all"; it's possible to go
into a pseudo read-write mode where we pin dirty metadata in memory,
which is used for fsck in dry run mode and doing journal replay on a
read only mount, but we do not want to allow an actual read-write mount
in nochanges mode.

But we do always want to allow early read-write, during recovery - this
patch clarifies that.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Check journal entries for invalid keys in trans commit path
Kent Overstreet [Thu, 21 Dec 2023 05:16:32 +0000 (00:16 -0500)]
bcachefs: Check journal entries for invalid keys in trans commit path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: check_directory_structure() can now be run online
Kent Overstreet [Mon, 11 Dec 2023 03:52:43 +0000 (22:52 -0500)]
bcachefs: check_directory_structure() can now be run online

Now that we have dynamically resizable btree paths,
check_directory_structure() can check one path - inode up to the root -
in a single transaction.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Fix reattach_inode() for snapshots
Kent Overstreet [Fri, 15 Dec 2023 19:13:48 +0000 (14:13 -0500)]
bcachefs: Fix reattach_inode() for snapshots

reattach_inode() was broken w.r.t. snapshots - we'd look up the subvolume
in order to look up lost+found, but if we're in an interior node snapshot
that didn't make any sense.

Instead, this adds a dirent path for creating in a specific snapshot,
skipping the subvolume; and we also make sure to create lost+found in
the root snapshot, to avoid conflicts with lost+found being created in
overlapping snapshots.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_trans_peek_slot_updates
Kent Overstreet [Sun, 17 Dec 2023 05:57:37 +0000 (00:57 -0500)]
bcachefs: bch2_btree_trans_peek_slot_updates

refactoring the BTREE_ITER_WITH_UPDATES code, prep for removing the flag
and making it always-on

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_trans_peek_prev_updates
Kent Overstreet [Sun, 17 Dec 2023 05:57:37 +0000 (00:57 -0500)]
bcachefs: bch2_btree_trans_peek_prev_updates

bch2_btree_iter_peek_prev() now supports BTREE_ITER_WITH_UPDATES

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_trans_peek_updates
Kent Overstreet [Sun, 17 Dec 2023 05:57:37 +0000 (00:57 -0500)]
bcachefs: bch2_btree_trans_peek_updates

refactoring the BTREE_ITER_WITH_UPDATES code, prep for removing the flag
and making it always-on

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: growable btree_paths
Kent Overstreet [Mon, 11 Dec 2023 00:26:30 +0000 (19:26 -0500)]
bcachefs: growable btree_paths

XXX: we're allocating memory with btree locks held - bad

We need to plumb through an error path so we can do
allocate_dropping_locks() - but we're merging this now because it fixes
a transaction path overflow caused by indirect extent fragmentation, and
the resize path is rare.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Fix interior update path btree_path uses
Kent Overstreet [Fri, 15 Dec 2023 20:21:40 +0000 (15:21 -0500)]
bcachefs: Fix interior update path btree_path uses

Since the btree_paths array is now about to become growable, we have to
be careful not to refer to paths by pointer across contexts where they
may be reallocated.

This fixes the remaining btree_interior_update() paths - split and
merge.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: trans->nr_paths
Kent Overstreet [Sun, 10 Dec 2023 22:10:31 +0000 (17:10 -0500)]
bcachefs: trans->nr_paths

Start to plumb through dynamically growable btree_paths; this patch
replaces most BTREE_ITER_MAX references with trans->nr_paths.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: trans->updates will also be resizable
Kent Overstreet [Wed, 13 Dec 2023 01:30:44 +0000 (20:30 -0500)]
bcachefs: trans->updates will also be resizable

the reflink triggers are also bumping up against the maximum number of
paths in a transaction - and generating proportional numbers of updates.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONS
Kent Overstreet [Mon, 11 Dec 2023 16:11:22 +0000 (11:11 -0500)]
bcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONS

 - Some tweaks to greatly reduce locking overhead for the list of btree
   transactions, so that it can always be enabled: leave btree_trans
   objects on the list when they're on the percpu single item freelist,
   and only check for duplicates in the same process when
   CONFIG_BCACHEFS_DEBUG is enabled

 - don't zero out the full btree_trans() unless we allocated it from
   the mempool

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: rcu protect trans->paths
Kent Overstreet [Wed, 13 Dec 2023 01:08:29 +0000 (20:08 -0500)]
bcachefs: rcu protect trans->paths

Upcoming patches are going to be changing trans->paths to a
reallocatable buffer. We need to guard against use after free when it's
used by other threads; this introduces RCU protection to those paths and
changes them to check for trans->paths == NULL

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Clean up btree_trans
Kent Overstreet [Mon, 11 Dec 2023 07:31:12 +0000 (02:31 -0500)]
bcachefs: Clean up btree_trans

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: kill btree_path.idx
Kent Overstreet [Mon, 11 Dec 2023 05:23:33 +0000 (00:23 -0500)]
bcachefs: kill btree_path.idx

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: get_unlocked_mut_path() -> btree_path_idx_t
Kent Overstreet [Mon, 11 Dec 2023 05:17:17 +0000 (00:17 -0500)]
bcachefs: get_unlocked_mut_path() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_iter_peek_prev() no longer uses path->idx
Kent Overstreet [Mon, 11 Dec 2023 05:03:44 +0000 (00:03 -0500)]
bcachefs: bch2_btree_iter_peek_prev() no longer uses path->idx

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_path_get() no longer uses path->idx
Kent Overstreet [Mon, 11 Dec 2023 05:02:07 +0000 (00:02 -0500)]
bcachefs: bch2_path_get() no longer uses path->idx

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: trans_for_each_path_with_node() no longer uses path->idx
Kent Overstreet [Mon, 11 Dec 2023 04:57:50 +0000 (23:57 -0500)]
bcachefs: trans_for_each_path_with_node() no longer uses path->idx

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: trans_for_each_path() no longer uses path->idx
Kent Overstreet [Mon, 11 Dec 2023 04:37:45 +0000 (23:37 -0500)]
bcachefs: trans_for_each_path() no longer uses path->idx

path->idx is now a code smell: we should be using path_idx_t, since it's
stable across btree path reallocation.

This is also a bit faster, using the same loop counter vs. fetching
path->idx from each path we iterate over.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
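
The reason an index is preferred over a pointer can be shown with a small userspace sketch. The types and helpers below are hypothetical stand-ins, not the bcachefs definitions: once the array of paths can be reallocated, a saved pointer may dangle, while a saved index still names the same element.

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned short sketch_path_idx_t;	/* hypothetical index type */

struct sketch_path {
	int			level;
};

struct sketch_trans {
	struct sketch_path	*paths;
	unsigned		nr_paths;
};

static int sketch_trans_init(struct sketch_trans *t, unsigned nr)
{
	t->paths = calloc(nr, sizeof(*t->paths));
	t->nr_paths = t->paths ? nr : 0;
	return t->paths ? 0 : -1;
}

/* Double the array. The old buffer is freed, so any struct sketch_path *
 * taken before this call must not be used afterwards. */
static int sketch_trans_grow(struct sketch_trans *t)
{
	unsigned nr = t->nr_paths * 2;
	struct sketch_path *p = calloc(nr, sizeof(*p));

	if (!p)
		return -1;
	memcpy(p, t->paths, t->nr_paths * sizeof(*p));
	free(t->paths);
	t->paths = p;
	t->nr_paths = nr;
	return 0;
}

/* Re-derive the pointer from the index each time it is needed. */
static struct sketch_path *sketch_trans_path(struct sketch_trans *t,
					     sketch_path_idx_t idx)
{
	return idx < t->nr_paths ? &t->paths[idx] : NULL;
}
```

Callers keep the index across anything that might grow the array and call the lookup helper again afterwards.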
10 months agobcachefs: kill trans_for_each_path_from()
Kent Overstreet [Sun, 10 Dec 2023 22:54:02 +0000 (17:54 -0500)]
bcachefs: kill trans_for_each_path_from()

dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_path_to_text() -> btree_path_idx_t
Kent Overstreet [Mon, 11 Dec 2023 04:29:06 +0000 (23:29 -0500)]
bcachefs: bch2_btree_path_to_text() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: struct trans_for_each_path_inorder_iter
Kent Overstreet [Sun, 10 Dec 2023 21:35:45 +0000 (16:35 -0500)]
bcachefs: struct trans_for_each_path_inorder_iter

reducing our usage of path->idx

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: btree_insert_entry -> btree_path_idx_t
Kent Overstreet [Sun, 10 Dec 2023 21:10:24 +0000 (16:10 -0500)]
bcachefs: btree_insert_entry -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: btree_iter -> btree_path_idx_t
Kent Overstreet [Mon, 4 Dec 2023 05:39:38 +0000 (00:39 -0500)]
bcachefs: btree_iter -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: btree_path_alloc() -> btree_path_idx_t
Kent Overstreet [Fri, 8 Dec 2023 22:02:16 +0000 (17:02 -0500)]
bcachefs: btree_path_alloc() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_path_traverse() -> btree_path_idx_t
Kent Overstreet [Fri, 8 Dec 2023 08:02:43 +0000 (03:02 -0500)]
bcachefs: bch2_btree_path_traverse() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_path_make_mut() -> btree_path_idx_t
Kent Overstreet [Fri, 8 Dec 2023 07:24:05 +0000 (02:24 -0500)]
bcachefs: bch2_btree_path_make_mut() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_btree_path_set_pos() -> btree_path_idx_t
Kent Overstreet [Fri, 8 Dec 2023 07:10:23 +0000 (02:10 -0500)]
bcachefs: bch2_btree_path_set_pos() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_path_put() -> btree_path_idx_t
Kent Overstreet [Mon, 11 Dec 2023 04:18:52 +0000 (23:18 -0500)]
bcachefs: bch2_path_put() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_path_get() -> btree_path_idx_t
Kent Overstreet [Fri, 8 Dec 2023 07:00:43 +0000 (02:00 -0500)]
bcachefs: bch2_path_get() -> btree_path_idx_t

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: minor bch2_btree_path_set_pos() optimization
Kent Overstreet [Fri, 8 Dec 2023 06:51:04 +0000 (01:51 -0500)]
bcachefs: minor bch2_btree_path_set_pos() optimization

bpos_eq() is cheaper than bpos_cmp()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
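
A simplified sketch of why an equality test can be cheaper than a three-way compare. The struct below only resembles a btree position (inode, offset, snapshot) and the function names are made up: equality reduces to a few XOR/OR operations with a single test, while ordering needs a chain of dependent comparisons.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a btree position. */
struct sketch_pos {
	uint64_t	inode;
	uint64_t	offset;
	uint32_t	snapshot;
};

/* Equality: branch-free, all fields folded into one value. */
static inline bool sketch_pos_eq(struct sketch_pos l, struct sketch_pos r)
{
	return !((l.inode ^ r.inode) |
		 (l.offset ^ r.offset) |
		 (l.snapshot ^ r.snapshot));
}

static inline int sketch_cmp_u64(uint64_t l, uint64_t r)
{
	return (l > r) - (l < r);
}

/* Ordering: each field is only consulted if the previous ones were equal. */
static inline int sketch_pos_cmp(struct sketch_pos l, struct sketch_pos r)
{
	int c = sketch_cmp_u64(l.inode, r.inode);

	if (c)
		return c;
	c = sketch_cmp_u64(l.offset, r.offset);
	if (c)
		return c;
	return sketch_cmp_u64(l.snapshot, r.snapshot);
}
```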
10 months agobcachefs: Kill GFP_NOFAIL usage in readahead path
Kent Overstreet [Wed, 20 Dec 2023 06:20:53 +0000 (01:20 -0500)]
bcachefs: Kill GFP_NOFAIL usage in readahead path

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Convert split_devs() to darray
Kent Overstreet [Sat, 23 Dec 2023 02:10:32 +0000 (21:10 -0500)]
bcachefs: Convert split_devs() to darray

Bit of cleanup & modernization: also moving this code to util.c, it'll
be used by userspace as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: skip journal more often in key cache reclaim
Kent Overstreet [Wed, 20 Dec 2023 01:54:11 +0000 (20:54 -0500)]
bcachefs: skip journal more often in key cache reclaim

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: for_each_keylist_key() declares loop iter
Kent Overstreet [Fri, 22 Dec 2023 03:24:46 +0000 (22:24 -0500)]
bcachefs: for_each_keylist_key() declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bkey_for_each_ptr() now declares loop iter
Kent Overstreet [Thu, 21 Dec 2023 20:47:15 +0000 (15:47 -0500)]
bcachefs: bkey_for_each_ptr() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: kill __bch2_btree_iter_peek_upto_and_restart()
Kent Overstreet [Sun, 17 Dec 2023 08:39:03 +0000 (03:39 -0500)]
bcachefs: kill __bch2_btree_iter_peek_upto_and_restart()

dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: fsck -> bch2_trans_run()
Kent Overstreet [Sun, 17 Dec 2023 08:07:26 +0000 (03:07 -0500)]
bcachefs: fsck -> bch2_trans_run()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: vstruct_for_each() now declares loop iter
Kent Overstreet [Sun, 17 Dec 2023 07:19:23 +0000 (02:19 -0500)]
bcachefs: vstruct_for_each() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: for_each_member_device_rcu() now declares loop iter
Kent Overstreet [Sun, 17 Dec 2023 07:34:05 +0000 (02:34 -0500)]
bcachefs: for_each_member_device_rcu() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: for_each_member_device() now declares loop iter
Kent Overstreet [Sun, 17 Dec 2023 04:47:29 +0000 (23:47 -0500)]
bcachefs: for_each_member_device() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: for_each_btree_key() now declares loop iter
Kent Overstreet [Sun, 17 Dec 2023 03:30:09 +0000 (22:30 -0500)]
bcachefs: for_each_btree_key() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: kill for_each_btree_key_norestart()
Kent Overstreet [Sun, 17 Dec 2023 02:55:12 +0000 (21:55 -0500)]
bcachefs: kill for_each_btree_key_norestart()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: kill for_each_btree_key_old_upto()
Kent Overstreet [Sun, 17 Dec 2023 02:51:34 +0000 (21:51 -0500)]
bcachefs: kill for_each_btree_key_old_upto()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: for_each_btree_key_upto() -> for_each_btree_key_old_upto()
Kent Overstreet [Sun, 17 Dec 2023 02:46:23 +0000 (21:46 -0500)]
bcachefs: for_each_btree_key_upto() -> for_each_btree_key_old_upto()

And for_each_btree_key2_upto -> for_each_btree_key_upto

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_dirent_lookup() -> lockrestart_do()
Kent Overstreet [Sun, 17 Dec 2023 08:05:30 +0000 (03:05 -0500)]
bcachefs: bch2_dirent_lookup() -> lockrestart_do()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_trans_srcu_lock() should be static
Kent Overstreet [Mon, 11 Dec 2023 23:04:29 +0000 (18:04 -0500)]
bcachefs: bch2_trans_srcu_lock() should be static

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: use track_event_change() for allocator blocked stats
Kent Overstreet [Mon, 11 Dec 2023 15:15:18 +0000 (10:15 -0500)]
bcachefs: use track_event_change() for allocator blocked stats

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: fix warning about uninitialized time_stats
Kent Overstreet [Sun, 24 Dec 2023 03:55:05 +0000 (22:55 -0500)]
bcachefs: fix warning about uninitialized time_stats

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: add more verbose logging
Kent Overstreet [Sun, 24 Dec 2023 02:39:45 +0000 (21:39 -0500)]
bcachefs: add more verbose logging

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: better error message in btree_node_write_work()
Kent Overstreet [Sun, 24 Dec 2023 02:09:34 +0000 (21:09 -0500)]
bcachefs: better error message in btree_node_write_work()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: simplify bch_devs_list
Kent Overstreet [Sun, 24 Dec 2023 02:02:45 +0000 (21:02 -0500)]
bcachefs: simplify bch_devs_list

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: darray_for_each() now declares loop iter
Kent Overstreet [Sun, 17 Dec 2023 02:40:26 +0000 (21:40 -0500)]
bcachefs: darray_for_each() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: trans_for_each_update() now declares loop iter
Kent Overstreet [Sun, 17 Dec 2023 02:31:26 +0000 (21:31 -0500)]
bcachefs: trans_for_each_update() now declares loop iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Improve the nopromote tracepoint
Kent Overstreet [Wed, 20 Dec 2023 21:49:43 +0000 (16:49 -0500)]
bcachefs: Improve the nopromote tracepoint

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Use GFP_KERNEL for promote allocations
Kent Overstreet [Wed, 20 Dec 2023 07:38:10 +0000 (02:38 -0500)]
bcachefs: Use GFP_KERNEL for promote allocations

We already have btree locks dropped here - no need for GFP_NOFS.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: mean and variance: fix kernel-doc for function params
Randy Dunlap [Wed, 20 Dec 2023 06:44:09 +0000 (22:44 -0800)]
bcachefs: mean and variance: fix kernel-doc for function params

Add missing function parameter descriptions in mean_and_variance.c.
This also eliminates the "Excess function parameter" warnings.

Prevents these kernel-doc warnings:

mean_and_variance.c:67: warning: Function parameter or member 's' not described in 'mean_and_variance_get_mean'
mean_and_variance.c:78: warning: Function parameter or member 's1' not described in 'mean_and_variance_get_variance'
mean_and_variance.c:94: warning: Function parameter or member 's' not described in 'mean_and_variance_get_stddev'
mean_and_variance.c:108: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_update'
mean_and_variance.c:108: warning: Function parameter or member 'x' not described in 'mean_and_variance_weighted_update'
mean_and_variance.c:108: warning: Excess function parameter 's1' description in 'mean_and_variance_weighted_update'
mean_and_variance.c:108: warning: Excess function parameter 's2' description in 'mean_and_variance_weighted_update'
mean_and_variance.c:134: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_get_mean'
mean_and_variance.c:143: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_get_variance'
mean_and_variance.c:153: warning: Function parameter or member 's' not described in 'mean_and_variance_weighted_get_stddev'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Brian Foster <bfoster@redhat.com>
Cc: linux-bcachefs@vger.kernel.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: check for failure to downgrade
Kent Overstreet [Sat, 23 Dec 2023 02:58:43 +0000 (21:58 -0500)]
bcachefs: check for failure to downgrade

With the upcoming member seq patch, it's now critical that we don't ever
write to a superblock that hasn't been version downgraded - failure to
update member seq fields will cause split brain detection to fire
erroneously.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Fixes for rust bindgen
Kent Overstreet [Fri, 22 Dec 2023 00:47:55 +0000 (19:47 -0500)]
bcachefs: Fixes for rust bindgen

bindgen doesn't seem to like u128 or DECLARE_FLEX_ARRAY(), but we can
hack around them.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Delete dio read alignment check
Kent Overstreet [Wed, 20 Dec 2023 02:58:20 +0000 (21:58 -0500)]
bcachefs: Delete dio read alignment check

We'll typically format devices with the physical blocksize supported, but
the logical blocksize will be smaller.

There's no real need to check the blocksize at the filesystem level
anyway - the block layer has to check this regardless.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agoMAINTAINERS: Update my email address
Kent Overstreet [Fri, 22 Dec 2023 04:57:22 +0000 (23:57 -0500)]
MAINTAINERS: Update my email address

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: clean up some dead fallocate code
Brian Foster [Tue, 19 Dec 2023 14:02:15 +0000 (09:02 -0500)]
bcachefs: clean up some dead fallocate code

The have_reservation local variable in bch2_extent_fallocate() is
initialized to false and set to true further down in the function.
Between these two points, one branch of code checks for negative
value and one for positive, and nothing ever checks the variable
after it is set to true. Clean up some of the unnecessary logic and
code.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Make sure allocation failure errors are logged
Kent Overstreet [Tue, 19 Dec 2023 23:08:19 +0000 (18:08 -0500)]
bcachefs: Make sure allocation failure errors are logged

The previous patch fixed a bug in allocation path error handling, and it
would've been noticed sooner had it been logged properly.

Generally speaking, errors that shouldn't happen in normal operation and
are being returned up the stack should be logged: the write path was
already logging IO errors, but non IO errors were missed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: drop extra semicolon
Kent Overstreet [Tue, 19 Dec 2023 21:27:38 +0000 (16:27 -0500)]
bcachefs: drop extra semicolon

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Replace zero-length array with flex-array member and use __counted_by
Gustavo A. R. Silva [Tue, 19 Dec 2023 00:24:53 +0000 (18:24 -0600)]
bcachefs: Replace zero-length array with flex-array member and use __counted_by

Fake flexible arrays (zero-length and one-element arrays) are
deprecated, and should be replaced by flexible-array members.
So, replace zero-length array with a flexible-array member in
`struct bch_ioctl_fsck_offline`.

Also annotate array `devs` with `__counted_by()` to prepare for the
coming implementation by GCC and Clang of the `__counted_by` attribute.
Flexible array members annotated with `__counted_by` can have their
accesses bounds-checked at run-time via `CONFIG_UBSAN_BOUNDS` (for
array indexing) and `CONFIG_FORTIFY_SOURCE` (for strcpy/memcpy-family
functions).

This fixes the following -Warray-bounds warnings:
fs/bcachefs/chardev.c: In function 'bch2_ioctl_fsck_offline':
fs/bcachefs/chardev.c:363:34: warning: array subscript 0 is outside array bounds of '__u64[0]' {aka 'long long unsigned int[]'} [-Warray-bounds=]
  363 |         if (copy_from_user(devs, &user_arg->devs[0], sizeof(user_arg->devs[0]) * arg.nr_devs)) {
      |                                  ^~~~~~~~~~~~~~~~~~
In file included from fs/bcachefs/chardev.c:5:
fs/bcachefs/bcachefs_ioctl.h:400:33: note: while referencing 'devs'
  400 |         __u64                   devs[0];

This results in no differences in binary output.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
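
A userspace sketch of the pattern the commit above describes. The struct is a simplified stand-in for the ioctl argument, and COUNTED_BY is a local macro (an assumption of this example) that expands to the attribute only when the compiler supports it.

```c
#include <stdint.h>
#include <stdlib.h>

/* Use the counted_by attribute where available, otherwise nothing. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member)	__attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Simplified stand-in for an ioctl argument with a trailing array. */
struct sketch_fsck_args {
	uint64_t	flags;
	uint64_t	nr_devs;
	uint64_t	devs[] COUNTED_BY(nr_devs);	/* flexible array, not devs[0] */
};

/* Allocate the header plus nr_devs trailing elements. The counter is set
 * before devs[] is touched, which bounds checking relies on. */
static struct sketch_fsck_args *sketch_fsck_args_alloc(uint64_t nr_devs)
{
	struct sketch_fsck_args *a;

	if (nr_devs > (SIZE_MAX - sizeof(*a)) / sizeof(a->devs[0]))
		return NULL;

	a = calloc(1, sizeof(*a) + (size_t)nr_devs * sizeof(a->devs[0]));
	if (a)
		a->nr_devs = nr_devs;
	return a;
}
```

With a true flexible array member the compiler no longer treats `devs` as a zero-sized object, so indexing it does not trip -Warray-bounds.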
10 months agobcachefs: Use array_size() in call to copy_from_user()
Gustavo A. R. Silva [Tue, 19 Dec 2023 00:26:26 +0000 (18:26 -0600)]
bcachefs: Use array_size() in call to copy_from_user()

Use array_size() helper, instead of the open-coded version in
call to copy_from_user().

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
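
For readers unfamiliar with the helper: the kernel's array_size() multiplies two sizes and saturates to SIZE_MAX on overflow, so a subsequent allocation or copy fails instead of using a wrapped, too-small size. A userspace approximation, assuming the GCC/Clang __builtin_mul_overflow builtin and a made-up function name, might look like:

```c
#include <stddef.h>
#include <stdint.h>

/* Overflow-checked a * b; returns SIZE_MAX if the product does not fit. */
static inline size_t sketch_array_size(size_t a, size_t b)
{
	size_t bytes;

	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;
	return bytes;
}
```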
10 months agobcachefs: qstr_eq()
Kent Overstreet [Sun, 17 Dec 2023 02:16:34 +0000 (21:16 -0500)]
bcachefs: qstr_eq()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch_err_(fn|msg) check if should print
Kent Overstreet [Sun, 17 Dec 2023 03:43:41 +0000 (22:43 -0500)]
bcachefs: bch_err_(fn|msg) check if should print

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: fix userspace build errors
Kent Overstreet [Sat, 16 Dec 2023 03:16:51 +0000 (22:16 -0500)]
bcachefs: fix userspace build errors

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Drop journal entry compaction
Kent Overstreet [Mon, 11 Dec 2023 07:13:33 +0000 (02:13 -0500)]
bcachefs: Drop journal entry compaction

Previously, we dropped empty journal entries and coalesced entries that
could be - but it's not worth the overhead; we very rarely leave unused
journal entries after getting a journal reservation.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: kill btree_trans->wb_updates
Kent Overstreet [Sun, 12 Nov 2023 02:43:47 +0000 (21:43 -0500)]
bcachefs: kill btree_trans->wb_updates

the btree write buffer path now creates a journal entry directly

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: check_root() can now be run online
Kent Overstreet [Mon, 11 Dec 2023 03:51:16 +0000 (22:51 -0500)]
bcachefs: check_root() can now be run online

check_root() is simple enough to run as one single transaction, so it is
trivial to run online.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Inline btree write buffer sort
Kent Overstreet [Sat, 4 Nov 2023 04:06:56 +0000 (00:06 -0400)]
bcachefs: Inline btree write buffer sort

The sort in the btree write buffer flush path is a very hot path, and
it's particularly performance sensitive since it's single threaded and
can block every other thread on a multithreaded write workload.

It's well worth doing a sort with inlined cmp and swap functions.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
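
The idea of a sort with inlined cmp and swap can be sketched as below. This is a plain heapsort over a hypothetical key-reference struct, not the write buffer's actual sort; what matters is that the comparison and swap are static inline functions the compiler can fold into the loop, rather than function pointers called per element as with a generic sort.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical small reference to a buffered key. */
struct sketch_key_ref {
	uint64_t	pos;
	uint32_t	idx;
};

static inline int sketch_key_ref_cmp(const struct sketch_key_ref *l,
				     const struct sketch_key_ref *r)
{
	return (l->pos > r->pos) - (l->pos < r->pos);
}

static inline void sketch_key_ref_swap(struct sketch_key_ref *a,
				       struct sketch_key_ref *b)
{
	struct sketch_key_ref tmp = *a;

	*a = *b;
	*b = tmp;
}

/* Restore the max-heap property below @root for the first @n elements. */
static void sketch_sift_down(struct sketch_key_ref *v, size_t root, size_t n)
{
	for (;;) {
		size_t child = 2 * root + 1;

		if (child >= n)
			break;
		if (child + 1 < n &&
		    sketch_key_ref_cmp(&v[child], &v[child + 1]) < 0)
			child++;
		if (sketch_key_ref_cmp(&v[root], &v[child]) >= 0)
			break;
		sketch_key_ref_swap(&v[root], &v[child]);
		root = child;
	}
}

/* In-place ascending heapsort by pos. */
static void sketch_key_ref_sort(struct sketch_key_ref *v, size_t n)
{
	for (size_t i = n / 2; i-- > 0;)
		sketch_sift_down(v, i, n);

	for (size_t end = n; end-- > 1;) {
		sketch_key_ref_swap(&v[0], &v[end]);
		sketch_sift_down(v, 0, end);
	}
}
```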
10 months agobcachefs: btree write buffer now slurps keys from journal
Kent Overstreet [Thu, 2 Nov 2023 22:57:19 +0000 (18:57 -0400)]
bcachefs: btree write buffer now slurps keys from journal

Previously, the transaction commit path would have to add keys to the
btree write buffer as a separate operation, requiring additional global
synchronization.

This patch introduces a new journal entry type, which indicates that the
keys need to be copied into the btree write buffer prior to being
written out. We switch the journal entry type back to
JSET_ENTRY_btree_keys prior to write, so this is not an on disk format
change.

Flushing the btree write buffer may require pulling keys out of journal
entries yet to be written, and quiescing outstanding journal
reservations; we previously added journal->buf_lock for synchronization
with the journal write path.

We also can't put strict bounds on the number of keys in the journal
destined for the write buffer, which means we might overflow the size of
the preallocated buffer and have to reallocate - this introduces a
potentially fatal memory allocation failure. This is something we'll
have to watch for; if it becomes an issue in practice we can add
further mitigation.

The transaction commit path no longer has to explicitly check if the
write buffer is full and wait on flushing; this is another performance
optimization. Instead, when the btree write buffer is close to full we
change the journal watermark, so that only reservations for journal
reclaim are allowed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: journal->buf_lock
Kent Overstreet [Fri, 3 Nov 2023 01:06:52 +0000 (21:06 -0400)]
bcachefs: journal->buf_lock

Add a new lock for synchronizing between journal IO path and btree write
buffer flush.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Unwritten journal buffers are always dirty
Kent Overstreet [Tue, 7 Nov 2023 23:08:38 +0000 (18:08 -0500)]
bcachefs: Unwritten journal buffers are always dirty

Ensure that journal bufs that haven't been written can't be reclaimed
from the journal pin fifo, and can thus have new pins taken.

Prep work for changing the btree write buffer to pull keys from the
journal directly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: bch2_trans_node_add no longer uses trans_for_each_path()
Kent Overstreet [Sun, 10 Dec 2023 22:44:04 +0000 (17:44 -0500)]
bcachefs: bch2_trans_node_add no longer uses trans_for_each_path()

In the future we'll be making trans->paths resizable and potentially
having _many_ more paths (for fsck); we need to start fixing algorithms
that walk each path in a transaction where possible.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Improve trans->extra_journal_entries
Kent Overstreet [Sun, 10 Dec 2023 21:48:22 +0000 (16:48 -0500)]
bcachefs: Improve trans->extra_journal_entries

Instead of using a darray, we now allocate journal entries for the
transaction commit path with our normal bump allocator - with an inlined
fastpath, and using btree_transaction_stats to remember how much to
initially allocate so as to avoid transaction restarts.

This is prep work for converting write buffer updates to use this
mechanism.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
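
A rough userspace sketch of a bump allocator with an inlined fastpath and a remembered high-water mark, along the lines the commit above describes. Names, the initial size of 64 bytes, and the stats struct are assumptions for the example. In this sketch a reallocation invalidates pointers handed out earlier, which is the kind of event that sizing the buffer correctly up front avoids.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_bump {
	char	*mem;
	size_t	size;
	size_t	used;
};

/* Persistent per-call-site stats: how much the last runs needed. */
struct sketch_bump_stats {
	size_t	max_used;
};

/* Size the buffer from the remembered high-water mark, if any. */
static int sketch_bump_init(struct sketch_bump *b,
			    const struct sketch_bump_stats *s)
{
	b->size = s->max_used ? s->max_used : 64;
	b->used = 0;
	b->mem = malloc(b->size);
	return b->mem ? 0 : -1;
}

/* Slowpath: grow the buffer. Earlier pointers become invalid here. */
static void *sketch_bump_alloc_slowpath(struct sketch_bump *b, size_t bytes)
{
	size_t new_size = b->size;
	char *mem;
	void *p;

	if (bytes > SIZE_MAX / 2 - b->used)
		return NULL;
	while (new_size - b->used < bytes)
		new_size *= 2;

	mem = realloc(b->mem, new_size);
	if (!mem)
		return NULL;
	b->mem = mem;
	b->size = new_size;

	p = b->mem + b->used;
	b->used += bytes;
	return p;
}

/* Fastpath: a bounds check and a pointer bump. */
static inline void *sketch_bump_alloc(struct sketch_bump *b, size_t bytes)
{
	if (bytes <= b->size - b->used) {
		void *p = b->mem + b->used;

		b->used += bytes;
		return p;
	}
	return sketch_bump_alloc_slowpath(b, bytes);
}

/* Record the high-water mark so the next init can avoid the slowpath. */
static void sketch_bump_exit(struct sketch_bump *b, struct sketch_bump_stats *s)
{
	if (b->used > s->max_used)
		s->max_used = b->used;
	free(b->mem);
	b->mem = NULL;
}
```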
10 months agobcachefs: kill bch2_btree_key_cache_flush()
Kent Overstreet [Sun, 10 Dec 2023 22:52:58 +0000 (17:52 -0500)]
bcachefs: kill bch2_btree_key_cache_flush()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: kill btree_path->(alloc_seq|downgrade_seq)
Kent Overstreet [Sun, 10 Dec 2023 21:12:24 +0000 (16:12 -0500)]
bcachefs: kill btree_path->(alloc_seq|downgrade_seq)

These were for extra info in tracepoints for debugging a specialized
issue - we do not want to bloat btree_path for this, at least in release
builds.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months agobcachefs: Fix snapshot.c assertion for online fsck
Kent Overstreet [Sun, 10 Dec 2023 17:42:49 +0000 (12:42 -0500)]
bcachefs: Fix snapshot.c assertion for online fsck

c->curr_recovery_pass can go backwards; this adds a non-rewinding
version, c->recovery_pass_done.
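
A minimal sketch of the distinction, under assumed semantics: one counter
may be rewound, the other only records the furthest point reached. The two
field names come from the message above; the struct, the helper functions,
and the exact update rule are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the filesystem object. */
struct fs_sketch {
	unsigned curr_recovery_pass;	/* may be rewound */
	unsigned recovery_pass_done;	/* only ever increases */
};

/* Called when the current pass completes. */
static void pass_finished(struct fs_sketch *c)
{
	if (c->curr_recovery_pass > c->recovery_pass_done)
		c->recovery_pass_done = c->curr_recovery_pass;
	c->curr_recovery_pass++;
}

/* Called when an earlier pass has to be run again. */
static void rewind_to(struct fs_sketch *c, unsigned pass)
{
	if (pass < c->curr_recovery_pass)
		c->curr_recovery_pass = pass;
}
```

An assertion that checks "has pass N ever completed?" can then test
recovery_pass_done and remain true across a rewind.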

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: six lock: fix typos
Randy Dunlap [Sun, 10 Dec 2023 06:06:44 +0000 (22:06 -0800)]
bcachefs: six lock: fix typos

Fix a few typos in the six.h header file.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Brian Foster <bfoster@redhat.com>
Cc: linux-bcachefs@vger.kernel.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: reserve path idx 0 for sentinal
Kent Overstreet [Thu, 7 Dec 2023 18:11:44 +0000 (13:11 -0500)]
bcachefs: reserve path idx 0 for sentinal

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()
Kent Overstreet [Fri, 8 Dec 2023 04:33:11 +0000 (23:33 -0500)]
bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: Kill for_each_btree_key()
Kent Overstreet [Fri, 8 Dec 2023 04:28:26 +0000 (23:28 -0500)]
bcachefs: Kill for_each_btree_key()

for_each_btree_key() handles transaction restarts, like
for_each_btree_key2(), but only calls bch2_trans_begin() after a
transaction restart - for_each_btree_key2() wraps every loop iteration
in a transaction.

The for_each_btree_key() behaviour is problematic when it leads to
holding the SRCU lock that prevents key cache reclaim for an unbounded
amount of time - there's no real need to keep it around.
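
A toy model of why the two loop styles differ, assuming that each "begin"
is the point at which the read-side lock can be dropped and retaken. It
only counts how many keys are visited between two begins; it does not model
real transactions or restarts, and all names are hypothetical.

```c
#include <assert.h>

struct walk_sketch {
	unsigned held_for;	/* keys visited since the last begin */
	unsigned max_held;	/* longest stretch observed */
};

/* Models dropping and retaking the lock. */
static void sketch_begin(struct walk_sketch *w)
{
	w->held_for = 0;
}

static void sketch_visit(struct walk_sketch *w)
{
	if (++w->held_for > w->max_held)
		w->max_held = w->held_for;
}

/*
 * begin_every_iter == 0: begin once up front, as a loop that only begins
 * again after a restart would do when no restart happens.
 * begin_every_iter != 0: begin at the top of every iteration.
 */
static unsigned sketch_walk(unsigned nr_keys, int begin_every_iter)
{
	struct walk_sketch w = { 0, 0 };
	unsigned i;

	sketch_begin(&w);
	for (i = 0; i < nr_keys; i++) {
		if (begin_every_iter)
			sketch_begin(&w);
		sketch_visit(&w);
	}
	return w.max_held;
}
```

With a begin per iteration the longest hold is one key regardless of how
many keys are walked; with a begin only on restart it grows with the size
of the walk.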

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: continue now works in for_each_btree_key2()
Kent Overstreet [Fri, 8 Dec 2023 05:10:25 +0000 (00:10 -0500)]
bcachefs: continue now works in for_each_btree_key2()

continue now works as in any other loop

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: Fix bch2_read_btree()
Kent Overstreet [Fri, 8 Dec 2023 04:50:38 +0000 (23:50 -0500)]
bcachefs: Fix bch2_read_btree()

In the debugfs code, we had an incorrect use of drop_locks_do(); on
transaction restart we don't want to restart the current loop iteration,
since we've already emitted the current key to the buffer for userspace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: Fix open coded set_btree_iter_dontneed()
Kent Overstreet [Wed, 6 Dec 2023 22:53:59 +0000 (17:53 -0500)]
bcachefs: Fix open coded set_btree_iter_dontneed()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: BCH_IOCTL_FSCK_ONLINE
Kent Overstreet [Mon, 4 Dec 2023 18:45:33 +0000 (13:45 -0500)]
bcachefs: BCH_IOCTL_FSCK_ONLINE

This adds a new ioctl for running fsck on a mounted, in-use filesystem.

This reuses the fsck_thread code from the previous patch for running
fsck on an offline, unmounted filesystem, so that log messages for the
fsck thread are redirected to userspace.

Only one running fsck instance is allowed at a time; a new semaphore
(since the lock will be taken by one thread and released by another) is
added for this.
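
A userspace sketch of the locking pattern, using POSIX semaphores and
threads rather than kernel primitives. A mutex normally has to be unlocked
by the thread that locked it, whereas a semaphore can be taken by the
thread that starts the work and released by the worker when it finishes.
The names fsck_running, fsck_thread_fn and start_fsck are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

/* Count 1 means no fsck is running; initialise with sem_init(.., 0, 1). */
static sem_t fsck_running;

static void *fsck_thread_fn(void *arg)
{
	(void) arg;
	/* ... the fsck work would happen here ... */
	sem_post(&fsck_running);	/* released by the worker thread */
	return NULL;
}

/*
 * Returns 0 if a worker was started, -1 if one is already running or the
 * thread could not be created.
 */
static int start_fsck(pthread_t *thr)
{
	if (sem_trywait(&fsck_running))	/* taken by the calling thread */
		return -1;

	if (pthread_create(thr, NULL, fsck_thread_fn, NULL)) {
		sem_post(&fsck_running);
		return -1;
	}
	return 0;
}
```

A second call to start_fsck() while the worker is still running fails at
sem_trywait() instead of blocking, which gives the "only one instance at a
time" behaviour.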

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: BCH_IOCTL_FSCK_OFFLINE
Kent Overstreet [Wed, 12 Jul 2023 03:23:40 +0000 (23:23 -0400)]
bcachefs: BCH_IOCTL_FSCK_OFFLINE

This adds a new ioctl for running fsck on a list of devices.

Normally, if we wish to use the kernel's implementation of fsck we'd run
it at mount time with -o fsck. This ioctl lets us run fsck without
mounting, so that userspace bcachefs-tools can transparently switch to
the kernel's implementation of fsck when appropriate - primarily if the
kernel version of bcachefs better matches the filesystem on disk.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
10 months ago bcachefs: bch2_run_online_recovery_passes()
Kent Overstreet [Wed, 6 Dec 2023 19:36:18 +0000 (14:36 -0500)]
bcachefs: bch2_run_online_recovery_passes()

Add a new helper for running online recovery passes - i.e. online fsck.
This is a subset of our normal recovery passes, and does not - for now -
use or follow c->curr_recovery_pass.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>