xfs: shut down filesystem if we xfs_trans_cancel with deferred work items
author Darrick J. Wong <djwong@kernel.org>
Wed, 15 Dec 2021 19:53:14 +0000 (11:53 -0800)
committer Darrick J. Wong <djwong@kernel.org>
Tue, 21 Dec 2021 17:49:41 +0000 (09:49 -0800)
While debugging some very strange rmap corruption reports in connection
with the online directory repair code, I root-caused the error to the
following incorrect sequence:

<start repair transaction>
<expand directory, causing a deferred rmap to be queued>
<roll transaction>
<cancel transaction>

Obviously, we should have committed the transaction instead of
cancelling it.  Thinking more broadly, however, xfs_trans_cancel should
have warned us that we were throwing away work items that we already
committed to performing.  This is not correct, and we need to shut down
the filesystem.

Change xfs_trans_cancel to complain in the loudest manner if we're
cancelling any transaction with deferred work items attached.
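
As a rough illustration of the rule described above, here is a minimal
userspace sketch in C. It is not the kernel code: every toy_* name is
hypothetical, the deferred-work list is reduced to a counter, and the
roll step is assumed to hand pending deferred items to the next
transaction while clearing the dirty state. The point it shows is that
a cancel which finds deferred items still attached treats the
transaction as dirty and shuts the (toy) filesystem down.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the mount structure. */
struct toy_mount {
	bool shutdown;		/* set once the filesystem is shut down */
};

/* Hypothetical stand-in for a transaction. */
struct toy_trans {
	struct toy_mount *mp;
	int ndeferred;		/* deferred work items still attached */
	bool dirty;		/* transaction has logged changes */
};

/* Queue one deferred work item and mark the transaction dirty. */
static void toy_defer_add(struct toy_trans *tp)
{
	tp->ndeferred++;
	tp->dirty = true;
}

/*
 * Assumed model of a roll: the logged changes are committed, so the
 * transaction looks clean again, but deferred items stay attached.
 */
static void toy_trans_roll(struct toy_trans *tp)
{
	tp->dirty = false;
}

/* Cancel: deferred items left behind count as dirty state. */
static void toy_trans_cancel(struct toy_trans *tp)
{
	bool dirty = tp->dirty;

	if (tp->ndeferred > 0) {
		fprintf(stderr, "cancel with %d deferred item(s)\n",
			tp->ndeferred);
		dirty = true;
		tp->ndeferred = 0;	/* free the in-memory items */
	}

	if (dirty)
		tp->mp->shutdown = true;
	tp->dirty = false;
}
```

Replaying the faulty sequence against this model (toy_defer_add, then
toy_trans_roll, then toy_trans_cancel) leaves mp->shutdown set, whereas
a cancel that only looked at the dirty flag would have silently dropped
the queued item.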

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
fs/xfs/xfs_trans.c

index 234a9d9..59e2f90 100644
@@ -942,8 +942,17 @@ xfs_trans_cancel(
 
        trace_xfs_trans_cancel(tp, _RET_IP_);
 
-       if (tp->t_flags & XFS_TRANS_PERM_LOG_RES)
+       /*
+        * It's never valid to cancel a transaction with deferred ops attached,
+        * because the transaction is effectively dirty.  Complain about this
+        * loudly before freeing the in-memory defer items.
+        */
+       if (!list_empty(&tp->t_dfops)) {
+               ASSERT(xfs_is_shutdown(mp) || list_empty(&tp->t_dfops));
+               ASSERT(tp->t_flags & XFS_TRANS_PERM_LOG_RES);
+               dirty = true;
                xfs_defer_cancel(tp);
+       }
 
        /*
         * See if the caller is relying on us to shut down the