Andy Adamson [Wed, 8 May 2013 20:21:18 +0000 (16:21 -0400)]
NFS4.1 Fix data server connection race
Unlike meta data server mounts which support multiple mount points to
the same server via struct nfs_server, data servers support a single connection.
Concurrent calls to setup the data server connection can race where the first
call allocates the nfs_client struct, and before the cache struct nfs_client
pointer can be set, a second call also tries to setup the connection, finds the
already allocated nfs_client, bumps the reference count, re-initializes the
session,etc. This results in a hanging data server session after umount.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Weston Andros Adamson [Mon, 6 May 2013 21:12:13 +0000 (17:12 -0400)]
NFSv3: match sec= flavor against server list
Older linux clients match the 'sec=' mount option flavor against the server's
flavor list (if available) and return EPERM if the specified flavor or AUTH_NULL
(which "matches" any flavor) is not found.
Recent changes skip this step and allow the vfs mount even though no operations
will succeed, creating a 'dud' mount.
This patch reverts back to the old behavior of matching specified flavors
against the server list and also returns EPERM when no sec= is specified and
none of the flavors returned by the server are supported by the client.
Example of behavior change:
the server's /etc/exports:
/export/krb5 *(sec=krb5,rw,no_root_squash)
old client behavior:
$ uname -a
Linux one.apikia.fake 3.8.8-202.fc18.x86_64 #1 SMP Wed Apr 17 23:25:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ sudo mount -v -o sec=sys,vers=3 zero:/export/krb5 /mnt
mount.nfs: timeout set for Sun May 5 17:32:04 2013
mount.nfs: trying text-based options 'sec=sys,vers=3,addr=192.168.100.10'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 192.168.100.10 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 192.168.100.10 prog 100005 vers 3 prot UDP port 20048
mount.nfs: mount(2): Permission denied
mount.nfs: access denied by server while mounting zero:/export/krb5
recently changed behavior:
$ uname -a
Linux one.apikia.fake 3.9.0-testing+ #2 SMP Fri May 3 20:29:32 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux
$ sudo mount -v -o sec=sys,vers=3 zero:/export/krb5 /mnt
mount.nfs: timeout set for Sun May 5 17:37:17 2013
mount.nfs: trying text-based options 'sec=sys,vers=3,addr=192.168.100.10'
mount.nfs: prog 100003, trying vers=3, prot=6
mount.nfs: trying 192.168.100.10 prog 100003 vers 3 prot TCP port 2049
mount.nfs: prog 100005, trying vers=3, prot=17
mount.nfs: trying 192.168.100.10 prog 100005 vers 3 prot UDP port 20048
$ ls /mnt
ls: cannot open directory /mnt: Permission denied
$ sudo ls /mnt
ls: cannot open directory /mnt: Permission denied
$ sudo df /mnt
df: ‘/mnt’: Permission denied
df: no file systems processed
$ sudo umount /mnt
$
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 3 May 2013 20:22:55 +0000 (16:22 -0400)]
NFSv4.1: Ensure that we free the lock stateid on the server
This ensures that the server doesn't need to keep huge numbers of
lock stateids waiting around for the final CLOSE.
See section 8.2.4 in RFC5661.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 3 May 2013 18:40:01 +0000 (14:40 -0400)]
NFSv4: Convert nfs41_free_stateid to use an asynchronous RPC call
The main reason for doing this is will be to allow for an asynchronous
RPC mode that we can use for freeing lock stateids as per section
8.2.4 of RFC5661.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 1 May 2013 13:17:50 +0000 (09:17 -0400)]
SUNRPC: Don't spam syslog with "Pseudoflavor not found" messages
Just convert those messages to dprintk()s so that they can be used
when debugging.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 30 Apr 2013 16:43:42 +0000 (12:43 -0400)]
NFSv4.x: Fix handling of partially delegated locks
If a NFS client receives a delegation for a file after it has taken
a lock on that file, we can currently end up in a situation where
we mistakenly skip unlocking that file.
The following patch swaps an erroneous check in nfs4_proc_unlck for
whether or not the file has a delegation to one which checks whether
or not we hold a lock stateid for that file.
Reported-by: Chuck Lever <Chuck.Lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>=3.7]
Tested-by: Chuck Lever <Chuck.Lever@oracle.com>
Trond Myklebust [Mon, 29 Apr 2013 15:11:58 +0000 (11:11 -0400)]
NFSv4: Warn once about servers that incorrectly apply open mode to setattr
Debugging aid to help identify servers that incorrectly apply open mode
checks to setattr requests that are not changing the file size.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 29 Apr 2013 14:35:36 +0000 (10:35 -0400)]
NFSv4: Servers should only check SETATTR stateid open mode on size change
The NFSv4 and NFSv4.1 specs are both clear that the server should only check
stateid open mode if a SETATTR specifies the size attribute. If the
open mode is not one that allows writing, then it returns NFS4ERR_OPENMODE.
In the case where the SETATTR is not changing the size, the client will
still pass it the delegation stateid to ensure that the server does not
recall that delegation. In that case, the server should _ignore_ the
delegation open mode, and simply apply standard permission checks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 23 Apr 2013 19:52:14 +0000 (15:52 -0400)]
Merge branch 'bugfixes' into linux-next
Fix up a conflict between the linux-next branch and mainline.
Conflicts:
fs/nfs/nfs4proc.c
Trond Myklebust [Tue, 23 Apr 2013 19:40:40 +0000 (15:40 -0400)]
Merge branch 'rpcsec_gss-from_cel' into linux-next
* rpcsec_gss-from_cel: (21 commits)
NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
NFSv4: Don't clear the machine cred when client establish returns EACCES
NFSv4: Fix issues in nfs4_discover_server_trunking
NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
NFS: Use server-recommended security flavor by default (NFSv3)
SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
NFS: Use "krb5i" to establish NFSv4 state whenever possible
NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
NFS: Use static list of security flavors during root FH lookup recovery
NFS: Avoid PUTROOTFH when managing leases
NFS: Clean up nfs4_proc_get_rootfh
NFS: Handle missing rpc.gssd when looking up root FH
SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
SUNRPC: Make gss_mech_get() static
SUNRPC: Refactor nfsd4_do_encode_secinfo()
SUNRPC: Consider qop when looking up pseudoflavors
SUNRPC: Load GSS kernel module by OID
SUNRPC: Introduce rpcauth_get_pseudoflavor()
SUNRPC: Define rpcsec_gss_info structure
NFS: Remove unneeded forward declaration
...
Trond Myklebust [Tue, 23 Apr 2013 18:52:44 +0000 (14:52 -0400)]
NFSv4: Don't recheck permissions on open in case of recovery cached open
If we already checked the user access permissions on the original open,
then don't bother checking again on recovery. Doing so can cause a
deadlock with NFSv4.1, since the may_open() operation is not privileged.
Furthermore, we can't report an access permission failure here anyway.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 23 Apr 2013 18:46:25 +0000 (14:46 -0400)]
NFSv4.1: Don't do a delegated open for NFS4_OPEN_CLAIM_DELEG_CUR_FH modes
If we're in a delegation recall situation, we can't do a delegated open.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 23 Apr 2013 18:31:19 +0000 (14:31 -0400)]
NFSv4.1: Use the more efficient open_noattr call for open-by-filehandle
When we're doing open-by-filehandle in NFSv4.1, we shouldn't need to
do the cache consistency revalidation on the directory. It is
therefore more efficient to just use open_noattr, which returns the
file attributes, but not the directory attributes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Mon, 22 Apr 2013 19:42:48 +0000 (15:42 -0400)]
NFS: Retry SETCLIENTID with AUTH_SYS instead of AUTH_NONE
Recently I changed the SETCLIENTID code to use AUTH_GSS(krb5i), and
then retry with AUTH_NONE if that didn't work. This was to enable
Kerberos NFS mounts to work without forcing Linux NFS clients to
have a keytab on hand.
Rick Macklem reports that the FreeBSD server accepts AUTH_NONE only
for NULL operations (thus certainly not for SETCLIENTID). Falling
back to AUTH_NONE means our proposed 3.10 NFS client will not
interoperate with FreeBSD servers over NFSv4 unless Kerberos is
fully configured on both ends.
If the Linux client falls back to using AUTH_SYS instead for
SETCLIENTID, all should work fine as long as the NFS server is
configured to allow AUTH_SYS for SETCLIENTID.
This may still prevent access to Kerberos-only FreeBSD servers by
Linux clients with no keytab. Rick is of the opinion that the
security settings the server applies to its pseudo-fs should also
apply to the SETCLIENTID operation.
Linux and Solaris NFS servers do not place that limitation on
SETCLIENTID. The security settings for the server's pseudo-fs are
determined automatically as the union of security flavors allowed on
real exports, as recommended by RFC 3530bis; and the flavors allowed
for SETCLIENTID are all flavors supported by the respective server
implementation.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 22 Apr 2013 15:29:51 +0000 (11:29 -0400)]
NFSv4: Ensure that we clear the NFS_OPEN_STATE flag when appropriate
We should always clear it before initiating file recovery.
Also ensure that we clear it after a CLOSE and/or after TEST_STATEID fails.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 21 Apr 2013 22:01:06 +0000 (18:01 -0400)]
LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot
After a server reboot, the reclaimer thread will recover all the existing
locks. For locks that are blocked, however, it will change the value
of block->b_status to nlm_lck_denied_grace_period in order to signal that
they need to wake up and resend the original blocking lock request.
Due to a bug, however, the block->b_status never gets reset after the
blocked locks have been woken up, and so the process goes into an
infinite loop of resends until the blocked lock is satisfied.
Reported-by: Marc Eshel <eshel@us.ibm.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Trond Myklebust [Sat, 20 Apr 2013 05:30:53 +0000 (01:30 -0400)]
NFSv4: Ensure the LOCK call cannot use the delegation stateid
Defensive patch to ensure that we copy the state->open_stateid, which
can never be set to the delegation stateid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 20 Apr 2013 05:25:45 +0000 (01:25 -0400)]
NFSv4: Use the open stateid if the delegation has the wrong mode
Fix nfs4_select_rw_stateid() so that it chooses the open stateid
(or an all-zero stateid) if the delegation does not match the selected
read/write mode.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Bryan Schumaker [Fri, 19 Apr 2013 20:09:37 +0000 (16:09 -0400)]
nfs: Send atime and mtime as a 64bit value
RFC 3530 says that the seconds value of a nfstime4 structure is a 64bit
value, but we are instead sending a 32-bit 0 and then a 32bit conversion
of the 64bit Linux value. This means that if we try to set atime to a
value before the epoch (touch -t
196001010101) the client will only send
part of the new value due to lost precision.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 16 Apr 2013 22:42:34 +0000 (18:42 -0400)]
NFSv4: Record the OPEN create mode used in the nfs4_opendata structure
If we're doing NFSv4.1 against a server that has persistent sessions,
then we should not need to call SETATTR in order to reset the file
attributes immediately after doing an exclusive create.
Note that since the create mode depends on the type of session that
has been negotiated with the server, we should not choose the
mode until after we've got a session slot.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 14 Apr 2013 15:49:51 +0000 (11:49 -0400)]
NFSv4.1: Set the RPC_CLNT_CREATE_INFINITE_SLOTS flag for NFSv4.1 transports
This ensures that the RPC layer doesn't override the NFS session
negotiation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 14 Apr 2013 15:42:00 +0000 (11:42 -0400)]
SUNRPC: Allow rpc_create() to request that TCP slots be unlimited
This is mainly for use by NFSv4.1, where the session negotiation
ultimately wants to decide how many RPC slots we can fill.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 14 Apr 2013 14:49:37 +0000 (10:49 -0400)]
SUNRPC: Fix a livelock problem in the xprt->backlog queue
This patch ensures that we throttle new RPC requests if there are
requests already waiting in the xprt->backlog queue. The reason for
doing this is to fix livelock issues that can occur when an existing
(high priority) task is waiting in the backlog queue, gets woken up
by xprt_free_slot(), but a new task then steals the slot.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 12 Apr 2013 19:04:51 +0000 (15:04 -0400)]
NFSv4: Fix handling of revoked delegations by setattr
Currently, _nfs4_do_setattr() will use the delegation stateid if no
writeable open file stateid is available.
If the server revokes that delegation stateid, then the call to
nfs4_handle_exception() will fail to handle the error due to the
lack of a struct nfs4_state, and will just convert the error into
an EIO.
This patch just removes the requirement that we must have a
struct nfs4_state in order to invalidate the delegation and
retry.
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Andy Adamson [Thu, 11 Apr 2013 13:28:45 +0000 (09:28 -0400)]
NFSv4 release the sequence id in the return on close case
Otherwise we deadlock if state recovery is initiated while we
sleep.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Wed, 10 Apr 2013 19:36:48 +0000 (15:36 -0400)]
nfs: remove unnecessary check for NULL inode->i_flock from nfs_delegation_claim_locks
The second check was added in commit
65b62a29 but it will never be true.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 10 Apr 2013 16:44:18 +0000 (12:44 -0400)]
NFSv4: Doh! Typo in the fix to nfs41_walk_client_list
Make sure that we set the status to 0 on success. Missed in testing
because it never appears when doing multiple mounts to _different_
servers.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: <stable@vger.kernel.org> # 3.7.x: 7b1f1fd: NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
Trond Myklebust [Tue, 9 Apr 2013 16:56:52 +0000 (12:56 -0400)]
NFSv4: Fix another potential state manager deadlock
Don't hold the NFSv4 sequence id while we check for open permission.
The call to ACCESS may block due to reboot recovery.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 9 Apr 2013 01:49:53 +0000 (21:49 -0400)]
NFS: Ensure that NFS file unlock waits for readahead to complete
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 9 Apr 2013 01:38:12 +0000 (21:38 -0400)]
NFS: Add functionality to allow waiting on all outstanding reads to complete
This will later allow NFS locking code to wait for readahead to complete
before releasing byte range locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 8 Apr 2013 21:50:28 +0000 (17:50 -0400)]
NFSv4: Handle timeouts correctly when probing for lease validity
When we send a RENEW or SEQUENCE operation in order to probe if the
lease is still valid, we want it to be able to time out since the
lease we are probing is likely to time out too. Currently, because
we use soft mount semantics for these RPC calls, the return value
is EIO, which causes the state manager to exit with an "unhandled
error" message.
This patch changes the call semantics, so that the RPC layer returns
ETIMEDOUT instead of EIO. We then have the state manager default to
a simple retry instead of exiting.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 3 Apr 2013 23:27:52 +0000 (19:27 -0400)]
NFSv4: Fix CB_RECALL_ANY to only return delegations that are not in use
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 3 Apr 2013 23:23:58 +0000 (19:23 -0400)]
NFSv4: Clean up nfs_expire_all_delegations
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 3 Apr 2013 23:04:58 +0000 (19:04 -0400)]
NFSv4: Fix nfs_server_return_all_delegations
If the state manager thread is already running, we may end up
racing with it in nfs_client_return_marked_delegations. Better to
just allow the state manager thread to do the job.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 3 Apr 2013 18:33:49 +0000 (14:33 -0400)]
NFSv4: Be less aggressive about returning delegations for open files
Currently, if the application that holds the file open isn't doing
I/O, we may end up returning the delegation. This means that we can
no longer cache the file as aggressively, and often also that we
multiply the state that both the server and the client needs to track.
This patch adds a check for open files to the routine that scans
for delegations that are unreferenced.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 1 Apr 2013 19:56:46 +0000 (15:56 -0400)]
NFSv4: Clean up delegation recall error handling
Unify the error handling in nfs4_open_delegation_recall and
nfs4_lock_delegation_recall.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 1 Apr 2013 19:40:44 +0000 (15:40 -0400)]
NFSv4: Clean up nfs4_open_delegation_recall
Make it symmetric with nfs4_lock_delegation_recall
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 1 Apr 2013 18:47:22 +0000 (14:47 -0400)]
NFSv4: Clean up nfs4_lock_delegation_recall
All error cases are handled by the switch() statement, meaning that the
call to nfs4_handle_exception() is unreachable.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 1 Apr 2013 19:34:05 +0000 (15:34 -0400)]
NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_open_delegation_recall
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the open in this
instance
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Trond Myklebust [Mon, 1 Apr 2013 18:27:29 +0000 (14:27 -0400)]
NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_lock_delegation_recall
A server shouldn't normally return NFS4ERR_GRACE if the client holds a
delegation, since no conflicting lock reclaims can be granted, however
the spec does not require the server to grant the lock in this
instance.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Paul Bolle [Sat, 9 Mar 2013 16:02:31 +0000 (17:02 +0100)]
sunrpc: drop "select NETVM"
The Kconfig entry for SUNRPC_SWAP selects NETVM. That select statement
was added in commit
a564b8f0398636ba30b07c0eaebdef7ff7837249 ("nfs:
enable swap on NFS"). But there's no Kconfig symbol NETVM. It apparently
was only in used in development versions of the swap over nfs
functionality but never entered mainline. Anyhow, it is a nop and can
safely be dropped.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Jeff Layton [Mon, 25 Mar 2013 11:59:57 +0000 (07:59 -0400)]
nfs: allow the v4.1 callback thread to freeze
The v4.1 callback thread has set_freezable() at the top, but it doesn't
ever try to freeze within the loop. Have it call try_to_freeze() at the
top of the loop. If a freeze event occurs, recheck kthread_should_stop()
after thawing.
Reported-by: Yanchuan Nian <ycnian@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 5 Apr 2013 18:13:21 +0000 (14:13 -0400)]
SUNRPC: Fix a potential memory leak in rpc_new_client
If the call to rpciod_up() fails, we currently leak a reference to the
struct rpc_xprt.
As part of the fix, we also remove the redundant check for xprt!=NULL.
This is already taken care of by the callers.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 5 Apr 2013 20:11:11 +0000 (16:11 -0400)]
NFSv4/4.1: Fix bugs in nfs4[01]_walk_client_list
It is unsafe to use list_for_each_entry_safe() here, because
when we drop the nn->nfs_client_lock, we pin the _current_ list
entry and ensure that it stays in the list, but we don't do the
same for the _next_ list entry. Use of list_for_each_entry() is
therefore the correct thing to do.
Also fix the refcounting in nfs41_walk_client_list().
Finally, ensure that the nfs_client has finished being initialised
and, in the case of NFSv4.1, that the session is set up.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@vger.kernel.org [>= 3.7]
Trond Myklebust [Thu, 4 Apr 2013 19:55:00 +0000 (15:55 -0400)]
NFSv4: Fix a memory leak in nfs4_discover_server_trunking
When we assign a new rpc_client to clp->cl_rpcclient, we need to destroy
the old one.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org [>=3.7]
Chuck Lever [Fri, 22 Mar 2013 16:52:59 +0000 (12:52 -0400)]
SUNRPC: Remove extra xprt_put()
While testing error cases where rpc_new_client() fails, I saw
some oopses.
If rpc_new_client() fails, it already invokes xprt_put(). Thus
__rpc_clone_client() does not need to invoke it again.
Introduced by commit
1b63a751 "SUNRPC: Refactor rpc_clone_client()"
Fri Sep 14, 2012.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org [>=3.7]
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 5 Apr 2013 19:37:04 +0000 (15:37 -0400)]
NFSv4: Don't clear the machine cred when client establish returns EACCES
The expected behaviour is that the client will decide at mount time
whether or not to use a krb5i machine cred, or AUTH_NULL.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Trond Myklebust [Fri, 5 Apr 2013 17:22:50 +0000 (13:22 -0400)]
NFSv4: Fix issues in nfs4_discover_server_trunking
- Ensure that we exit with ENOENT if the call to ops->get_clid_cred()
fails.
- Handle the case where ops->detect_trunking() exits with an
unexpected error, and return EIO.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 4 Apr 2013 20:14:11 +0000 (16:14 -0400)]
NFSv4: Fix the fallback to AUTH_NULL if krb5i is not available
If the rpcsec_gss_krb5 module cannot be loaded, the attempt to create
an rpc_client in nfs4_init_client will currently fail with an EINVAL.
Fix is to retry with AUTH_NULL.
Regression introduced by the commit "NFS: Use "krb5i" to establish NFSv4
state whenever possible"
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Chuck Lever [Fri, 22 Mar 2013 16:53:17 +0000 (12:53 -0400)]
NFS: Use server-recommended security flavor by default (NFSv3)
Since commit
ec88f28d in 2009, checking if the user-specified flavor
is in the server's flavor list has been the source of a few
noticeable regressions (now fixed), but there is one that is still
vexing.
An NFS server can list AUTH_NULL in its flavor list, which suggests
a client should try to mount the server with the flavor of the
client's choice, but the server will squash all accesses. In some
cases, our client fails to mount a server because of this check,
when the mount could have proceeded successfully.
Skip this check if the user has specified "sec=" on the mount
command line. But do consult the server-provided flavor list to
choose a security flavor if no sec= option is specified on the mount
command.
If a server lists Kerberos pseudoflavors before "sys" in its export
options, our client now chooses Kerberos over AUTH_UNIX for mount
points, when no security flavor is specified by the mount command.
This could be surprising to some administrators or users, who would
then need to have Kerberos credentials to access the export.
Or, a client administrator may not have enabled rpc.gssd. In this
case, auth_rpcgss.ko might still be loadable, which is enough for
the new logic to choose Kerberos over AUTH_UNIX. But the mount
would fail since no GSS context can be created without rpc.gssd
running.
To retain the use of AUTH_UNIX by default:
o The server administrator can ensure that "sys" is listed before
Kerberos flavors in its export security options (see
exports(5)),
o The client administrator can explicitly specify "sec=sys" on
its mount command line (see nfs(5)),
o The client administrator can use "Sec=sys" in an appropriate
section of /etc/nfsmount.conf (see nfsmount.conf(5)), or
o The client administrator can blacklist auth_rpcgss.ko.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Fri, 22 Mar 2013 16:53:08 +0000 (12:53 -0400)]
SUNRPC: Don't recognize RPC_AUTH_MAXFLAVOR
RPC_AUTH_MAXFLAVOR is an invalid flavor, on purpose. Don't allow
any processing whatsoever if a caller passes it to rpcauth_create()
or rpcauth_get_gssinfo().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:56:20 +0000 (15:56 -0400)]
NFS: Use "krb5i" to establish NFSv4 state whenever possible
Currently our client uses AUTH_UNIX for state management on Kerberos
NFS mounts in some cases. For example, if the first mount of a
server specifies "sec=sys," the SETCLIENTID operation is performed
with AUTH_UNIX. Subsequent mounts using stronger security flavors
can not change the flavor used for lease establishment. This might
be less security than an administrator was expecting.
Dave Noveck's migration issues draft recommends the use of an
integrity-protecting security flavor for the SETCLIENTID operation.
Let's ignore the mount's sec= setting and use krb5i as the default
security flavor for SETCLIENTID.
If our client can't establish a GSS context (eg. because it doesn't
have a keytab or the server doesn't support Kerberos) we fall back
to using AUTH_NULL. For an operation that requires a
machine credential (which never represents a particular user)
AUTH_NULL is as secure as AUTH_UNIX.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:56:11 +0000 (15:56 -0400)]
NFS: Try AUTH_UNIX when PUTROOTFH gets NFS4ERR_WRONGSEC
Most NFSv4 servers implement AUTH_UNIX, and administrators will
prefer this over AUTH_NULL. It is harmless for our client to try
this flavor in addition to the flavors mandated by RFC 3530/5661.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:56:02 +0000 (15:56 -0400)]
NFS: Use static list of security flavors during root FH lookup recovery
If the Linux NFS client receives an NFS4ERR_WRONGSEC error while
trying to look up an NFS server's root file handle, it retries the
lookup operation with various security flavors to see what flavor
the NFS server will accept for pseudo-fs access.
The list of flavors the client uses during retry consists only of
flavors that are currently registered in the kernel RPC client.
This list may not include any GSS pseudoflavors if auth_rpcgss.ko
has not yet been loaded.
Let's instead use a static list of security flavors that the NFS
standard requires the server to implement (RFC 3530bis, section
3.2.1). The RPC client should now be able to load support for
these dynamically; if not, they are skipped.
Recovery behavior here is prescribed by RFC 3530bis, section
15.33.5:
> For LOOKUPP, PUTROOTFH and PUTPUBFH, the client will be unable to
> use the SECINFO operation since SECINFO requires a current
> filehandle and none exist for these two [sic] operations. Therefore,
> the client must iterate through the security triples available at
> the client and reattempt the PUTROOTFH or PUTPUBFH operation. In
> the unfortunate event none of the MANDATORY security triples are
> supported by the client and server, the client SHOULD try using
> others that support integrity. Failing that, the client can try
> using AUTH_NONE, but because such forms lack integrity checks,
> this puts the client at risk.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:55:53 +0000 (15:55 -0400)]
NFS: Avoid PUTROOTFH when managing leases
Currently, the compound operation the Linux NFS client sends to the
server to confirm a client ID looks like this:
{ SETCLIENTID_CONFIRM; PUTROOTFH; GETATTR(lease_time) }
Once the lease is confirmed, it makes sense to know how long before
the client will have to renew it. And, performing these operations
in the same compound saves a round trip.
Unfortunately, this arrangement assumes that the security flavor
used for establishing a client ID can also be used to access the
server's pseudo-fs.
If the server requires a different security flavor to access its
pseudo-fs than it allowed for the client's SETCLIENTID operation,
the PUTROOTFH in this compound fails with NFS4ERR_WRONGSEC. Even
though the SETCLIENTID_CONFIRM succeeded, our client's trunking
detection logic interprets the failure of the compound as a failure
by the server to confirm the client ID.
As part of server trunking detection, the client then begins another
SETCLIENTID pass with the same nfs4_client_id. This fails with
NFS4ERR_CLID_INUSE because the first SETCLIENTID/SETCLIENTID_CONFIRM
already succeeded in confirming that client ID -- it was the
PUTROOTFH operation that caused the SETCLIENTID_CONFIRM compound to
fail.
To address this issue, separate the "establish client ID" step from
the "accessing the server's pseudo-fs root" step. The first access
of the server's pseudo-fs may require retrying the PUTROOTFH
operation with different security flavors. This access is done in
nfs4_proc_get_rootfh().
That leaves the matter of how to retrieve the server's lease time.
nfs4_proc_fsinfo() already retrieves the lease time value, though
none of its callers do anything with the retrieved value (nor do
they mark the lease as "renewed").
Note that NFSv4.1 state recovery invokes nfs4_proc_get_lease_time()
using the lease management security flavor. This may cause some
heartburn if that security flavor isn't the same as the security
flavor the server requires for accessing the pseudo-fs.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:55:45 +0000 (15:55 -0400)]
NFS: Clean up nfs4_proc_get_rootfh
The long lines with no vertical white space make this function
difficult for humans to read. Add a proper documenting comment
while we're here.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:55:36 +0000 (15:55 -0400)]
NFS: Handle missing rpc.gssd when looking up root FH
When rpc.gssd is not running, any NFS operation that needs to use a
GSS security flavor of course does not work.
If looking up a server's root file handle results in an
NFS4ERR_WRONGSEC, nfs4_find_root_sec() is called to try a bunch of
security flavors until one works or all reasonable flavors have
been tried. When rpc.gssd isn't running, this loop seems to fail
immediately after rpcauth_create() craps out on the first GSS
flavor.
When the rpcauth_create() call in nfs4_lookup_root_sec() fails
because rpc.gssd is not available, nfs4_lookup_root_sec()
unconditionally returns -EIO. This prevents nfs4_find_root_sec()
from retrying any other flavors; it drops out of its loop and fails
immediately.
Having nfs4_lookup_root_sec() return -EACCES instead allows
nfs4_find_root_sec() to try all flavors in its list.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:55:27 +0000 (15:55 -0400)]
SUNRPC: Remove EXPORT_SYMBOL_GPL() from GSS mech switch
Clean up: Reduce the symbol table footprint for auth_rpcgss.ko by
removing exported symbols for functions that are no longer used
outside of auth_rpcgss.ko.
The remaining two EXPORTs in gss_mech_switch.c get documenting
comments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:55:19 +0000 (15:55 -0400)]
SUNRPC: Make gss_mech_get() static
gss_mech_get() is no longer used outside of gss_mech_switch.c.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:55:10 +0000 (15:55 -0400)]
SUNRPC: Refactor nfsd4_do_encode_secinfo()
Clean up. This matches a similar API for the client side, and
keeps ULP fingers out the of the GSS mech switch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:55:01 +0000 (15:55 -0400)]
SUNRPC: Consider qop when looking up pseudoflavors
The NFSv4 SECINFO operation returns a list of security flavors that
the server supports for a particular share. An NFSv4 client is
supposed to pick a pseudoflavor it supports that corresponds to one
of the flavors returned by the server.
GSS flavors in this list have a GSS tuple that identify a specific
GSS pseudoflavor.
Currently our client ignores the GSS tuple's "qop" value. A
matching pseudoflavor is chosen based only on the OID and service
value.
So far this omission has not had much effect on Linux. The NFSv4
protocol currently supports only one qop value: GSS_C_QOP_DEFAULT,
also known as zero.
However, if an NFSv4 server happens to return something other than
zero in the qop field, our client won't notice. This could cause
the client to behave in incorrect ways that could have security
implications.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:54:52 +0000 (15:54 -0400)]
SUNRPC: Load GSS kernel module by OID
The current GSS mech switch can find and load GSS pseudoflavor
modules by name ("krb5") or pseudoflavor number ("390003"), but
cannot find GSS modules by GSS tuple:
[ "1.2.840.113554.1.2.2", GSS_C_QOP_DEFAULT, RPC_GSS_SVC_NONE ]
This is important when dealing with a SECINFO request. A SECINFO
reply contains a list of flavors the server supports for the
requested export, but GSS flavors also have a GSS tuple that maps
to a pseudoflavor (like 390003 for krb5).
If the GSS module that supports the OID in the tuple is not loaded,
our client is not able to load that module dynamically to support
that pseudoflavor.
Add a way for the GSS mech switch to load GSS pseudoflavor support
by OID before searching for the pseudoflavor that matches the OID
and service.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:54:43 +0000 (15:54 -0400)]
SUNRPC: Introduce rpcauth_get_pseudoflavor()
A SECINFO reply may contain flavors whose kernel module is not
yet loaded by the client's kernel. A new RPC client API, called
rpcauth_get_pseudoflavor(), is introduced to do proper checking
for support of a security flavor.
When this API is invoked, the RPC client now tries to load the
module for each flavor first before performing the "is this
supported?" check. This means if a module is available on the
client, but has not been loaded yet, it will be loaded and
registered automatically when the SECINFO reply is processed.
The new API can take a full GSS tuple (OID, QoP, and service).
Previously only the OID and service were considered.
nfs_find_best_sec() is updated to verify all flavors requested in a
SECINFO reply, including AUTH_NULL and AUTH_UNIX. Previously these
two flavors were simply assumed to be supported without consulting
the RPC client.
Note that the replaced version of nfs_find_best_sec() can return
RPC_AUTH_MAXFLAVOR if the server returns a recognized OID but an
unsupported "service" value. nfs_find_best_sec() now returns
RPC_AUTH_UNIX in this case.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:54:34 +0000 (15:54 -0400)]
SUNRPC: Define rpcsec_gss_info structure
The NFSv4 SECINFO procedure returns a list of security flavors. Any
GSS flavor also has a GSS tuple containing an OID, a quality-of-
protection value, and a service value, which specifies a particular
GSS pseudoflavor.
For simplicity and efficiency, I'd like to return each GSS tuple
from the NFSv4 SECINFO XDR decoder and pass it straight into the RPC
client.
Define a data structure that is visible to both the NFS client and
the RPC client. Take structure and field names from the relevant
standards to avoid confusion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:54:25 +0000 (15:54 -0400)]
NFS: Remove unneeded forward declaration
I've built with NFSv4 enabled and disabled. This forward
declaration does not seem to be required.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Chuck Lever [Sat, 16 Mar 2013 19:54:16 +0000 (15:54 -0400)]
SUNRPC: Missing module alias for auth_rpcgss.ko
Commit
f344f6df "SUNRPC: Auto-load RPC authentication kernel
modules", Mon Mar 20 13:44:08 2006, adds a request_module() call
in rpcauth_create() to auto-load RPC security modules when a ULP
tries to create a credential of that flavor.
In rpcauth_create(), the name of the module to load is built like
this:
request_module("rpc-auth-%u", flavor);
This means that for, say, RPC_AUTH_GSS, request_module() is looking
for a module or alias called "rpc-auth-6".
The GSS module is named "auth_rpcgss", and commit
f344f6df does not
add any new module aliases. There is also no such alias provided in
/etc/modprobe.d on my system (Fedora 16). Without this alias, the
GSS module is not loaded on demand.
This is used by rpcauth_create(). The pseudoflavor_to_flavor() call
can return RPC_AUTH_GSS, which is passed to request_module().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 27 Mar 2013 15:54:45 +0000 (11:54 -0400)]
NFSv4: Fix Oopses in the fs_locations code
If the server sends us a pathname with more components than the client
limit of NFS4_PATHNAME_MAXCOMPONENTS, more server entries than the client
limit of NFS4_FS_LOCATION_MAXSERVERS, or sends a total number of
fs_locations entries than the client limit of NFS4_FS_LOCATIONS_MAXENTRIES
then we will currently Oops because the limit checks are done _after_ we've
decoded the data into the arrays.
Reported-by: fanchaoting<fanchaoting@cn.fujitsu.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 28 Mar 2013 18:01:33 +0000 (14:01 -0400)]
NFSv4: Fix another reboot recovery race
If the open_context for the file is not yet fully initialised,
then open recovery cannot succeed, and since nfs4_state_find_open_context
returns an ENOENT, we end up treating the file as being irrecoverable.
What we really want to do, is just defer the recovery until later.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sat, 23 Mar 2013 19:22:45 +0000 (15:22 -0400)]
NFSv4: Add a mapping for NFS4ERR_FILE_OPEN in nfs4_map_errors
With unlink is an asynchronous operation in the sillyrename case, it
expects nfs4_async_handle_error() to map the error correctly.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 18 Mar 2013 16:50:59 +0000 (12:50 -0400)]
NFSv4.1: Use CLAIM_DELEG_CUR_FH opens when available
Now that we do CLAIM_FH opens, we may run into situations where we
get a delegation but don't have perfect knowledge of the file path.
When returning the delegation, we might therefore not be able to
us CLAIM_DELEGATE_CUR opens to convert the delegation into OPEN
stateids and locks.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 15 Mar 2013 20:44:28 +0000 (16:44 -0400)]
NFSv4.1: Enable open-by-filehandle
Sometimes, we actually _want_ to do open-by-filehandle, for instance
when recovering opens after a network partition, or when called
from nfs4_file_open.
Enable that functionality using a new capability NFS_CAP_ATOMIC_OPEN_V1,
and which is only enabled for NFSv4.1 servers that support it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 15 Mar 2013 19:39:06 +0000 (15:39 -0400)]
NFSv4.1: Add xdr support for CLAIM_FH and CLAIM_DELEG_CUR_FH opens
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 15 Mar 2013 18:57:33 +0000 (14:57 -0400)]
NFSv4: Clean up nfs4_opendata_alloc in preparation for NFSv4.1 open modes
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 17 Mar 2013 19:31:15 +0000 (15:31 -0400)]
NFSv4.1: Select the "most recent locking state" for read/write/setattr stateids
Follow the practice described in section 8.2.2 of RFC5661: When sending a
read/write or setattr stateid, set the seqid field to zero in order to
signal that the NFS server should apply the most recent locking state.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 15 Mar 2013 20:11:57 +0000 (16:11 -0400)]
NFSv4: Prepare for minorversion-specific nfs_server capabilities
Clean up the setting of the nfs_server->caps, by shoving it all
into nfs4_server_common_setup().
Then add an 'initial capabilities' field into struct nfs4_minor_version_ops.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 17 Mar 2013 00:54:34 +0000 (20:54 -0400)]
NFSv4: Resend the READ/WRITE RPC call if a stateid change causes an error
Adds logic to ensure that if the server returns a BAD_STATEID,
or other state related error, then we check if the stateid has
already changed. If it has, then rather than start state recovery,
we should just resend the failed RPC call with the new stateid.
Allow nfs4_select_rw_stateid to notify that the stateid is unstable by
having it return -EWOULDBLOCK if an RPC is underway that might change the
stateid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Sun, 17 Mar 2013 19:52:00 +0000 (15:52 -0400)]
NFSv4: The stateid must remain the same for replayed RPC calls
If we replay a READ or WRITE call, we should not be changing the
stateid. Currently, we may end up doing so, because the stateid
is only selected at xdr encode time.
This patch ensures that we select the stateid after we get an NFSv4.1
session slot, and that we keep that same stateid across retries.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 15 Mar 2013 22:11:31 +0000 (18:11 -0400)]
NFS: __nfs_find_lock_context needs to check ctx->lock_context for a match too
Currently, we're forcing an unnecessary duplication of the
initial nfs_lock_context in calls to nfs_get_lock_context, since
__nfs_find_lock_context ignores the ctx->lock_context.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 18 Mar 2013 23:45:14 +0000 (19:45 -0400)]
NFS: Don't accept more reads/writes if the open context recovery failed
If the state recovery failed, we want to ensure that the application
doesn't try to use the same file descriptor for more reads or writes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Thu, 14 Mar 2013 20:57:48 +0000 (16:57 -0400)]
NFSv4: Fail I/O if the state recovery fails irrevocably
If state recovery fails with an ESTALE or a ENOENT, then we shouldn't
keep retrying. Instead, mark the stateid as being invalid and
fail the I/O with an EIO error.
For other operations such as POSIX and BSD file locking, truncate
etc, fail with an EBADF to indicate that this file descriptor is no
longer valid.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 4 Mar 2013 22:29:33 +0000 (17:29 -0500)]
SUNRPC: Report network/connection errors correctly for SOFTCONN rpc tasks
In the case of a SOFTCONN rpc task, we really want to ensure that it
reports errors like ENETUNREACH back to the caller. Currently, only
some of these errors are being reported back (connect errors are not),
and they are being converted by the RPC layer into EIO.
Reported-by: Jan Engelhardt <jengelh@inai.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Mon, 25 Mar 2013 15:23:40 +0000 (11:23 -0400)]
SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked
We need to be careful when testing task->tk_waitqueue in
rpc_wake_up_task_queue_locked, because it can be changed while we
are holding the queue->lock.
By adding appropriate memory barriers, we can ensure that it is safe to
test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
Trond Myklebust [Wed, 20 Mar 2013 17:23:33 +0000 (13:23 -0400)]
NFSv4.1: Add a helper pnfs_commit_and_return_layout
In order to be able to safely return the layout in nfs4_proc_setattr,
we need to block new uses of the layout, wait for all outstanding
users of the layout to complete, commit the layout and then return it.
This patch adds a helper in order to do all this safely.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Boaz Harrosh <bharrosh@panasas.com>
Trond Myklebust [Wed, 20 Mar 2013 17:03:00 +0000 (13:03 -0400)]
NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn
Note that clearing NFS_INO_LAYOUTCOMMIT is tricky, since it requires
you to also clear the NFS_LSEG_LAYOUTCOMMIT bits from the layout
segments.
The only two sites that need to do this are the ones that call
pnfs_return_layout() without first doing a layout commit.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org
Trond Myklebust [Wed, 20 Mar 2013 16:34:32 +0000 (12:34 -0400)]
NFSv4.1: Fix a race in pNFS layoutcommit
We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the
NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations
where the two are out of sync.
The first half of the problem is to ensure that pnfs_layoutcommit_inode
clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg.
We still need to keep the reference to those segments until the RPC call
is finished, so in order to make it clear _where_ those references come
from, we add a helper pnfs_list_write_lseg_done() that cleans up after
pnfs_list_write_lseg.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Benny Halevy <bhalevy@tonian.com>
Cc: stable@vger.kernel.org
fanchaoting [Thu, 21 Mar 2013 01:15:30 +0000 (09:15 +0800)]
pnfs-block: removing DM device maybe cause oops when call dev_remove
when pnfs block using device mapper,if umounting later,it maybe
cause oops. we apply "1 + sizeof(bl_umount_request)" memory for
msg->data, the memory maybe overflow when we do "memcpy(&dataptr
[sizeof(bl_msg)], &bl_umount_request, sizeof(bl_umount_request))",
because the size of bl_msg is more than 1 byte.
Signed-off-by: fanchaoting<fanchaoting@cn.fujitsu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Fri, 8 Mar 2013 17:56:37 +0000 (12:56 -0500)]
NFSv4: Fix the string length returned by the idmapper
Functions like nfs_map_uid_to_name() and nfs_map_gid_to_group() are
expected to return a string without any terminating NUL character.
Regression introduced by commit
57e62324e469e092ecc6c94a7a86fe4bd6ac5172
(NFS: Store the legacy idmapper result in the keyring).
Reported-by: Dave Chiluk <dave.chiluk@canonical.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Bryan Schumaker <bjschuma@netapp.com>
Cc: stable@vger.kernel.org [>=3.4]
Linus Torvalds [Sun, 3 Mar 2013 23:11:05 +0000 (15:11 -0800)]
Linux 3.9-rc1
Linus Torvalds [Sun, 3 Mar 2013 22:24:59 +0000 (14:24 -0800)]
Merge tag 'disintegrate-fbdev-
20121220' of git://git.infradead.org/users/dhowells/linux-headers
Pull fbdev UAPI disintegration from David Howells:
"You'll be glad to here that the end is nigh for the UAPI patches.
Only the fbdev/framebuffer piece remains now that the SCSI stuff has
gone in.
Here are the UAPI disintegration bits for the fbdev drivers. It
appears that Florian hasn't had time to deal with my patch, but back
in December he did say he didn't mind if I pushed it forward."
Yay. No more uapi movement. And hopefully no more big header file
cleanups coming up either, it just tends to be very painful.
* tag 'disintegrate-fbdev-
20121220' of git://git.infradead.org/users/dhowells/linux-headers:
UAPI: (Scripted) Disintegrate include/video
Linus Torvalds [Sun, 3 Mar 2013 22:22:53 +0000 (14:22 -0800)]
Merge tag 'stable/for-linus-3.9-rc1-tag' of git://git./linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
- Update the Xen ACPI memory and CPU hotplug locking mechanism.
- Fix PAT issues wherein various applications would not start
- Fix handling of multiple MSI as AHCI now does it.
- Fix ARM compile failures.
* tag 'stable/for-linus-3.9-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xenbus: fix compile failure on ARM with Xen enabled
xen/pci: We don't do multiple MSI's.
xen/pat: Disable PAT using pat_enabled value.
xen/acpi: xen cpu hotplug minor updates
xen/acpi: xen memory hotplug minor updates
Linus Torvalds [Sun, 3 Mar 2013 21:23:02 +0000 (13:23 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull more VFS bits from Al Viro:
"Unfortunately, it looks like xattr series will have to wait until the
next cycle ;-/
This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
more file_inode() work"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
constify path_get/path_put and fs_struct.c stuff
fix nommu breakage in shmem.c
cache the value of file_inode() in struct file
9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
9p: make sure ->lookup() adds fid to the right dentry
9p: untangle ->lookup() a bit
9p: double iput() in ->lookup() if d_materialise_unique() fails
9p: v9fs_fid_add() can't fail now
v9fs: get rid of v9fs_dentry
9p: turn fid->dlist into hlist
9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
more file_inode() open-coded instances
selinux: opened file can't have NULL or negative ->f_path.dentry
(In the meantime, the hlist traversal macros have changed, so this
required a semantic conflict fixup for the newly hlistified fid->dlist)
Linus Torvalds [Sun, 3 Mar 2013 21:13:20 +0000 (13:13 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixup from Chris Mason:
"Geert and James both sent this one in, sorry guys"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs/raid56: Add missing #include <linux/vmalloc.h>
Linus Torvalds [Sun, 3 Mar 2013 20:58:43 +0000 (12:58 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull second set of s390 patches from Martin Schwidefsky:
"The main part of this merge are Heikos uaccess patches. Together with
commit
09884964335e ("mm: do not grow the stack vma just because of an
overrun on preceding vma") the user string access is hopefully fixed
for good.
In addition some bug fixes and two cleanup patches."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/module: fix compile warning
qdio: remove unused parameters
s390/uaccess: fix kernel ds access for page table walk
s390/uaccess: fix strncpy_from_user string length check
input: disable i8042 PC Keyboard controller for s390
s390/dis: Fix invalid array size
s390/uaccess: remove pointless access_ok() checks
s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen case
s390/uaccess: shorten strncpy_from_user/strnlen_user
s390/dasd: fix unresponsive device after all channel paths were lost
s390/mm: ignore change bit for vmemmap
s390/page table dumper: add support for change-recording override bit
Linus Torvalds [Sun, 3 Mar 2013 20:57:38 +0000 (12:57 -0800)]
Merge branch 'fixes-for-3.9-latest' of git://git./linux/kernel/git/deller/parisc-linux
Pull second round of PARISC updates from Helge Deller:
"The most important fix in this branch is the switch of io_setup,
io_getevents and io_submit syscalls to use the available compat
syscalls when running 32bit userspace on 64bit kernel. Other than
that it's mostly removal of compile warnings."
* 'fixes-for-3.9-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix redefinition of SET_PERSONALITY
parisc: do not install modules when installing kernel
parisc: fix compile warnings triggered by atomic_sub(sizeof(),v)
parisc: check return value of down_interruptible() in hp_sdc_rtc.c
parisc: avoid unitialized variable warning in pa_memcpy()
parisc: remove unused variable 'compat_val'
parisc: switch to compat_functions of io_setup, io_getevents and io_submit
parisc: select ARCH_WANT_FRAME_POINTERS
Linus Torvalds [Sun, 3 Mar 2013 20:06:09 +0000 (12:06 -0800)]
Merge tag 'metag-v3.9-rc1-v4' of git://git./linux/kernel/git/jhogan/metag
Pull new ImgTec Meta architecture from James Hogan:
"This adds core architecture support for Imagination's Meta processor
cores, followed by some later miscellaneous arch/metag cleanups and
fixes which I kept separate to ease review:
- Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
- A few fixes all over, particularly for symbol prefixes
- A few privilege protection fixes
- Several cleanups (setup.c includes, split out a lot of
metag_ksyms.c)
- Fix some missing exports
- Convert hugetlb to use vm_unmapped_area()
- Copy device tree to non-init memory
- Provide dma_get_sgtable()"
* tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
metag: Provide dma_get_sgtable()
metag: prom.h: remove declaration of metag_dt_memblock_reserve()
metag: copy devicetree to non-init memory
metag: cleanup metag_ksyms.c includes
metag: move mm/init.c exports out of metag_ksyms.c
metag: move usercopy.c exports out of metag_ksyms.c
metag: move setup.c exports out of metag_ksyms.c
metag: move kick.c exports out of metag_ksyms.c
metag: move traps.c exports out of metag_ksyms.c
metag: move irq enable out of irqflags.h on SMP
genksyms: fix metag symbol prefix on crc symbols
metag: hugetlb: convert to vm_unmapped_area()
metag: export clear_page and copy_page
metag: export metag_code_cache_flush_all
metag: protect more non-MMU memory regions
metag: make TXPRIVEXT bits explicit
metag: kernel/setup.c: sort includes
perf: Enable building perf tools for Meta
metag: add boot time LNKGET/LNKSET check
metag: add __init to metag_cache_probe()
...
Linus Torvalds [Sun, 3 Mar 2013 19:54:39 +0000 (11:54 -0800)]
Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull late ARM updates from Russell King:
"Here is the late set of ARM updates for this merge window; in here is:
- The ARM parts of the broadcast timer support, core parts merged
through tglx's tree. This was left over from the previous merge to
allow the dependency on tglx's tree to be resolved.
- A fix to the VFP code which shows up on Raspberry Pi's, as well as
fixing the fallout from a previous commit in this area.
- A number of smaller fixes scattered throughout the ARM tree"
* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm:
ARM: Fix broken commit
0cc41e4a21d43 corrupting kernel messages
ARM: fix scheduling while atomic warning in alignment handling code
ARM: VFP: fix emulation of second VFP instruction
ARM: 7656/1: uImage: Error out on build of multiplatform without LOADADDR
ARM: 7640/1: memory: tegra_ahb_enable_smmu() depends on TEGRA_IOMMU_SMMU
ARM: 7654/1: Preserve L_PTE_VALID in pte_modify()
ARM: 7653/2: do not scale loops_per_jiffy when using a constant delay clock
ARM: 7651/1: remove unused smp_timer_broadcast #define
Linus Torvalds [Sun, 3 Mar 2013 18:25:47 +0000 (10:25 -0800)]
Merge tag 'char-misc-3.9-rc1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc patch from Greg Kroah-Hartman:
"Here is one remaining patch for 3.9-rc1. It is for the hyper-v
drivers, and had to wait until some other patches went in through the
x86 tree."
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tag 'char-misc-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Use the new infrastructure for delivering VMBUS interrupts
Linus Torvalds [Sun, 3 Mar 2013 18:24:57 +0000 (10:24 -0800)]
Merge tag 'usb-3.9-rc1' of git://git./linux/kernel/git/gregkh/usb
Pull USB patch revert from Greg Kroah-Hartman:
"Here is one remaining USB patch for 3.9-rc1, it reverts a 3.8 patch
that has caused a lot of regressions for some VIA EHCI controllers."
* tag 'usb-3.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: EHCI: revert "remove ASS/PSS polling timeout"
Linus Torvalds [Sun, 3 Mar 2013 18:23:29 +0000 (10:23 -0800)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
"This contains:
- fixes and improvements
- devicetree bindings
- conversion to watchdog generic framework of the following drivers:
- booke_wdt
- bcm47xx_wdt.c
- at91sam9_wdt
- Removal of old STMP3xxx driver
- Addition of following new drivers:
- new driver for STMP3xxx and i.MX23/28
- Retu watchdog driver"
* git://www.linux-watchdog.org/linux-watchdog: (30 commits)
watchdog: sp805_wdt depends on ARM
watchdog: davinci_wdt: update to devm_* API
watchdog: davinci_wdt: use devm managed clk get
watchdog: at91rm9200: add DT support
watchdog: add timeout-sec property binding
watchdog: at91sam9_wdt: Convert to use the watchdog framework
watchdog: omap_wdt: Add option nowayout
watchdog: core: dt: add support for the timeout-sec dt property
watchdog: bcm47xx_wdt.c: add hard timer
watchdog: bcm47xx_wdt.c: rename wdt_time to timeout
watchdog: bcm47xx_wdt.c: rename ops methods
watchdog: bcm47xx_wdt.c: use platform device
watchdog: bcm47xx_wdt.c: convert to watchdog core api
watchdog: Convert BookE watchdog driver to watchdog infrastructure
watchdog: s3c2410_wdt: Use devm_* functions
watchdog: remove old STMP3xxx driver
watchdog: add new driver for STMP3xxx and i.MX23/28
rtc: stmp3xxx: add wdt-accessor function
watchdog: introduce retu_wdt driver
watchdog: intel_scu_watchdog: fix Kconfig dependency
...
Linus Torvalds [Sun, 3 Mar 2013 18:20:22 +0000 (10:20 -0800)]
Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma
Pull second set of slave-dmaengine updates from Vinod Koul:
"Arnd's patch moves the dw_dmac to use generic DMA binding. I agreed
to merge this late as it will avoid the conflicts between trees.
The second patch from Matt adding a dma_request_slave_channel_compat
API was supposed to be picked up, but somehow never got picked up.
Some patches dependent on this are already in -next :("
* 'next' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: dw_dmac: move to generic DMA binding
dmaengine: add dma_request_slave_channel_compat()