Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"
author Junxiao Bi <junxiao.bi@oracle.com>
Wed, 8 Nov 2023 18:22:16 +0000 (10:22 -0800)
committer Song Liu <song@kernel.org>
Mon, 27 Nov 2023 23:46:51 +0000 (15:46 -0800)
This reverts commit 5e2cf333b7bd5d3e62595a44d598a254c697cd74.

That commit introduced the following race, which can cause the system to hang.

 md_write_start:             raid5d:
 // mddev->in_sync == 1
 set "MD_SB_CHANGE_PENDING"
                             // runs before md_write_start wakes it up
                             waits for "MD_SB_CHANGE_PENDING" to clear
                             >>>>>>>>> hangs
 wake up mddev->thread
 ...
 waits for "MD_SB_CHANGE_PENDING" to clear
 >>>> hangs: raid5d should clear this flag,
 but is itself hung on the same flag.
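
The interleaving above can be sketched as a small userspace model. This
is only an illustration under simplifying assumptions, not kernel code:
struct model, raid5d_step(), writer_step(), deadlocks() and
check_model() are hypothetical names, and the "clear the flag" step
merely stands in for the superblock update that raid5d would otherwise
reach on its next pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
	bool sb_change_pending;	/* stands in for MD_SB_CHANGE_PENDING */
	bool wait_in_raid5d;	/* true: behaviour of the reverted commit */
};

enum state { BLOCKED, DONE };

/* The daemon side, entered after the writer has already set the flag. */
static enum state raid5d_step(struct model *m)
{
	/* With the extra wait, the daemon sleeps on the flag it should clear. */
	if (m->wait_in_raid5d && m->sb_change_pending)
		return BLOCKED;

	/* Assumption: models the superblock update that clears the flag. */
	m->sb_change_pending = false;
	return DONE;
}

/* The writer side: it set the flag and now waits for it to be cleared. */
static enum state writer_step(const struct model *m)
{
	return m->sb_change_pending ? BLOCKED : DONE;
}

/* Returns true when both sides end up blocked on the same flag. */
static bool deadlocks(bool wait_in_raid5d)
{
	struct model m = {
		.sb_change_pending = true,	/* writer saw in_sync == 1 */
		.wait_in_raid5d = wait_in_raid5d,
	};
	enum state daemon = raid5d_step(&m);
	enum state writer = writer_step(&m);

	return daemon == BLOCKED && writer == BLOCKED;
}

/* Sanity check of the model: only the extra wait produces the hang. */
static void check_model(void)
{
	assert(deadlocks(true));	/* with the wait in raid5d */
	assert(!deadlocks(false));	/* with the wait removed */
}
```

Calling check_model() from a test harness exercises both cases: with
the wait in place neither side can make progress, and with the wait
removed the daemon clears the flag and the writer continues.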

The issue that the reverted commit was fixing is fixed in a new way by the last patch.

Fixes: 5e2cf333b7bd ("md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d")
Cc: stable@vger.kernel.org # v5.19+
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20231108182216.73611-2-junxiao.bi@oracle.com
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index dc031d4..fcc8a44 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -36,7 +36,6 @@
  */
 
 #include <linux/blkdev.h>
-#include <linux/delay.h>
 #include <linux/kthread.h>
 #include <linux/raid/pq.h>
 #include <linux/async_tx.h>
@@ -6820,18 +6819,7 @@ static void raid5d(struct md_thread *thread)
                        spin_unlock_irq(&conf->device_lock);
                        md_check_recovery(mddev);
                        spin_lock_irq(&conf->device_lock);
-
-                       /*
-                        * Waiting on MD_SB_CHANGE_PENDING below may deadlock
-                        * seeing md_check_recovery() is needed to clear
-                        * the flag when using mdmon.
-                        */
-                       continue;
                }
-
-               wait_event_lock_irq(mddev->sb_wait,
-                       !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags),
-                       conf->device_lock);
        }
        pr_debug("%d stripes handled\n", handled);