NFSv4: When recovering state fails with EAGAIN, retry the same recovery
author Trond Myklebust <trond.myklebust@hammerspace.com>
Mon, 22 Jul 2019 08:54:29 +0000 (09:54 +0100)
committer Trond Myklebust <trond.myklebust@hammerspace.com>
Mon, 5 Aug 2019 02:35:40 +0000 (22:35 -0400)

If the server returns EAGAIN while we are trying to recover from
a server reboot, we currently delay for 1 second, but then mark the
stateid as needing recovery after the grace period has expired.

Instead, we should just retry the same recovery process immediately
after the 1 second delay. Break out of the loop after 10 retries.

Fixes: 35a61606a612 ("NFS: Reduce indentation of the switch statement...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
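
For illustration only, the retry policy described above can be sketched in
userspace C. This is a simplified model under stated assumptions, not the
kernel code: struct fake_state, fake_recover() and reclaim_all() are
hypothetical names, the 1 second sleep is omitted, and the kernel's
"mark for later recovery" step is reduced to setting a flag. What it does
mirror from the patch is the shared counter, the `loop++ < 10` test, and the
reset to zero on success.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_EAGAIN_RETRIES 10   /* same bound as the patch */

/* Hypothetical stand-in for one piece of open state to be reclaimed. */
struct fake_state {
	int eagain_left;           /* how many more times recovery answers -EAGAIN */
	int attempts;              /* how many recovery attempts were made */
	int needs_later_recovery;  /* set when we stop retrying this state */
};

/* Hypothetical recovery attempt: fails with -EAGAIN until eagain_left is 0. */
static int fake_recover(struct fake_state *st)
{
	st->attempts++;
	if (st->eagain_left > 0) {
		st->eagain_left--;
		return -EAGAIN;
	}
	return 0;
}

/* Walk all states, retrying the same state on -EAGAIN up to the bound. */
static void reclaim_all(struct fake_state *states, size_t n)
{
	unsigned int loop = 0;     /* shared across states, as in the patch */
	size_t i = 0;

	while (i < n) {
		int status = fake_recover(&states[i]);

		if (status >= 0) {
			loop = 0;  /* success: the next state gets a full retry budget */
			i++;
			continue;
		}
		if (status == -EAGAIN && loop++ < MAX_EAGAIN_RETRIES) {
			/* The kernel sleeps for 1 second here; omitted in this sketch. */
			continue;  /* retry the same state */
		}
		/* Retries exhausted or another error: defer this state and move on. */
		states[i].needs_later_recovery = 1;
		i++;
	}
}
```

Because the test is `loop++ < 10`, a state that keeps returning -EAGAIN is
attempted 11 times in total (one initial attempt plus 10 retries) before the
code stops retrying it.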
fs/nfs/nfs4state.c

index a71a61e..d03b9cf 100644
@@ -1607,6 +1607,7 @@ static int __nfs4_reclaim_open_state(struct nfs4_state_owner *sp, struct nfs4_st
 static int nfs4_reclaim_open_state(struct nfs4_state_owner *sp, const struct nfs4_state_recovery_ops *ops)
 {
        struct nfs4_state *state;
+       unsigned int loop = 0;
        int status = 0;
 
        /* Note: we rely on the sp->so_states list being ordered 
@@ -1633,8 +1634,10 @@ restart:
 
                switch (status) {
                default:
-                       if (status >= 0)
+                       if (status >= 0) {
+                               loop = 0;
                                break;
+                       }
                        printk(KERN_ERR "NFS: %s: unhandled error %d\n", __func__, status);
                        /* Fall through */
                case -ENOENT:
@@ -1648,6 +1651,10 @@ restart:
                        break;
                case -EAGAIN:
                        ssleep(1);
+                       if (loop++ < 10) {
+                               set_bit(ops->state_flag_bit, &state->flags);
+                               break;
+                       }
                        /* Fall through */
                case -NFS4ERR_ADMIN_REVOKED:
                case -NFS4ERR_STALE_STATEID: