ubi: wl: Put source PEB into correct list if trying locking LEB failed
authorZhihao Cheng <chengzhihao1@huawei.com>
Mon, 19 Aug 2024 03:26:21 +0000 (11:26 +0800)
committerRichard Weinberger <richard@nod.at>
Thu, 14 Nov 2024 16:41:30 +0000 (17:41 +0100)
commitd610020f030bec819f42de327c2bd5437d2766b3
tree3f74c809e41ef0fee197d9ba4630555fa1a02133
parent3c50701fd37ffc265d28e7cb4db2ff0cd33241d7
ubi: wl: Put source PEB into correct list if trying locking LEB failed

During wear-leveing work, the source PEB will be moved into scrub list
when source LEB cannot be locked in ubi_eba_copy_leb(), which is wrong
for non-scrub type source PEB. The problem could bring extra and
ineffective wear-leveing jobs, which makes more or less negative effects
for the life time of flash. Specifically, the process is divided 2 steps:
1. wear_leveling_worker // generate false scrub type PEB
     ubi_eba_copy_leb // MOVE_RETRY is returned
       leb_write_trylock // trylock failed
     scrubbing = 1;
     e1 is put into ubi->scrub
2. wear_leveling_worker // schedule false scrub type PEB for wl
     scrubbing = 1
     e1 = rb_entry(rb_first(&ubi->scrub))

The problem can be reproduced easily by running fsstress on a small
UBIFS partition(<64M, simulated by nandsim) for 5~10mins
(CONFIG_MTD_UBI_FASTMAP=y,CONFIG_MTD_UBI_WL_THRESHOLD=50). Following
message is shown:
 ubi0: scrubbed PEB 66 (LEB 0:10), data moved to PEB 165

Since scrub type source PEB has set variable scrubbing as '1', and
variable scrubbing is checked before variable keep, so the problem can
be fixed by setting keep variable as 1 directly if the source LEB cannot
be locked.

Fixes: e801e128b220 ("UBI: fix missing scrub when there is a bit-flip")
CC: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
drivers/mtd/ubi/wl.c