btrfs: zoned: unset dedicated block group on allocation failure
author Naohiro Aota <naohiro.aota@wdc.com>
Tue, 7 Dec 2021 15:35:47 +0000 (00:35 +0900)
committer David Sterba <dsterba@suse.com>
Fri, 7 Jan 2022 13:18:26 +0000 (14:18 +0100)
Allocating an extent from a block group can fail for various reasons.
When an allocation from a dedicated block group (for tree-log or
relocation data) fails, we need to unregister it as a dedicated one so
that a new block group can be allocated to take over that role.

However, we return early when the block group is read-only, fully used,
or its zone cannot be activated. As a result, we keep the unusable block
group registered as a dedicated one, leading to further allocation
failures. With many block groups, the allocator loops hopelessly looking
for a free extent, resulting in a hung task.

Fix the issue by delaying the return and doing the proper cleanups.

CC: stable@vger.kernel.org # 5.16
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fs/btrfs/extent-tree.c

index 7f08c9e..5362b54 100644
@@ -3805,23 +3805,35 @@ static int do_allocation_zoned(struct btrfs_block_group *block_group,
        spin_unlock(&fs_info->relocation_bg_lock);
        if (skip)
                return 1;
+
        /* Check RO and no space case before trying to activate it */
        spin_lock(&block_group->lock);
        if (block_group->ro ||
            block_group->alloc_offset == block_group->zone_capacity) {
-               spin_unlock(&block_group->lock);
-               return 1;
+               ret = 1;
+               /*
+                * May need to clear fs_info->{treelog,data_reloc}_bg.
+                * Return the error after taking the locks.
+                */
        }
        spin_unlock(&block_group->lock);
 
-       if (!btrfs_zone_activate(block_group))
-               return 1;
+       if (!ret && !btrfs_zone_activate(block_group)) {
+               ret = 1;
+               /*
+                * May need to clear fs_info->{treelog,data_reloc}_bg.
+                * Return the error after taking the locks.
+                */
+       }
 
        spin_lock(&space_info->lock);
        spin_lock(&block_group->lock);
        spin_lock(&fs_info->treelog_bg_lock);
        spin_lock(&fs_info->relocation_bg_lock);
 
+       if (ret)
+               goto out;
+
        ASSERT(!ffe_ctl->for_treelog ||
               block_group->start == fs_info->treelog_bg ||
               fs_info->treelog_bg == 0);
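
The deferred-return pattern in the hunk above can be sketched in userspace as follows. This is a minimal illustration, not the kernel code: `struct fs_info`, `struct block_group`, and `alloc_for_treelog()` are hypothetical, simplified stand-ins, locking is omitted, and only the tree-log case is modelled. The point it tries to show is that every failure path records `ret = 1` and falls through to a single `out` label, where the dedicated block group is unregistered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct fs_info {
	uint64_t treelog_bg;	/* start of the dedicated block group, 0 if none */
};

struct block_group {
	uint64_t start;
	uint64_t alloc_offset;	/* assumed to be <= zone_capacity */
	uint64_t zone_capacity;
	bool ro;
};

/*
 * Returns 0 on success, 1 if the caller should try another block group.
 * Locking is omitted here; the kernel does the cleanup under spinlocks.
 */
static int alloc_for_treelog(struct fs_info *fs, struct block_group *bg,
			     uint64_t num_bytes)
{
	int ret = 0;

	/* Another block group is already dedicated: skip this one untouched. */
	if (fs->treelog_bg != 0 && fs->treelog_bg != bg->start)
		return 1;

	/* Record the failure instead of returning early. */
	if (bg->ro || bg->alloc_offset == bg->zone_capacity)
		ret = 1;

	if (ret)
		goto out;

	/* Only the registered block group, or none yet, may reach here. */
	assert(fs->treelog_bg == 0 || fs->treelog_bg == bg->start);

	if (bg->zone_capacity - bg->alloc_offset < num_bytes) {
		ret = 1;
		goto out;
	}

	fs->treelog_bg = bg->start;	/* register as the dedicated group */
	bg->alloc_offset += num_bytes;

out:
	/* On any failure, unregister so a fresh group can be chosen later. */
	if (ret && fs->treelog_bg == bg->start)
		fs->treelog_bg = 0;
	return ret;
}
```

In this sketch, a call on a full block group that is still registered in `treelog_bg` returns 1 and resets `treelog_bg` to 0, so the next call on a different block group is no longer skipped. With an early `return 1` in place of `ret = 1`, the stale registration would survive and every other block group would be skipped, which is the looping behaviour the commit message describes.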