Chuck Lever [Fri, 26 Jan 2024 17:46:26 +0000 (12:46 -0500)]
NFSD: Remove BUG_ON in nfsd4_process_cb_update()
Don't kill the kworker thread, and don't panic while cl_lock is
held. There's no need for scorching the earth here.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:46:20 +0000 (12:46 -0500)]
NFSD: Replace comment with lockdep assertion
Convert a code comment into a real assertion.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:46:13 +0000 (12:46 -0500)]
NFSD: Remove unused @reason argument
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:46:07 +0000 (12:46 -0500)]
SUNRPC: Remove EXPORT_SYMBOL_GPL for svc_process_bc()
svc_process_bc(), previously known as bc_svc_process(), was
added in commit
4d6bbb6233c9 ("nfs41: Backchannel bc_svc_process()")
but there has never been a call site outside of the sunrpc.ko
module.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:46:01 +0000 (12:46 -0500)]
NFSD: Add callback operation lifetime trace points
Help observe the flow of callback operations.
bc_shutdown() records exactly when the backchannel RPC client is
destroyed and cl_cb_client is replaced with NULL.
Examples include:
nfsd-955 [004] 650.013997: nfsd_cb_queue: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 cb=0xffff8881134b02f8 (first try)
kworker/u21:4-497 [004] 650.014050: nfsd_cb_seq_status: task:
00000001@
00000001 sessionid=
65b3c5b8:
f541f749:
00000001:
00000000 tk_status=-107 seq_status=1
kworker/u21:4-497 [004] 650.014051: nfsd_cb_restart: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 cb=0xffff88810e39f400 (first try)
kworker/u21:4-497 [004] 650.014066: nfsd_cb_queue: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 cb=0xffff88810e39f400 (need restart)
kworker/u16:0-10 [006] 650.065750: nfsd_cb_start: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=UNKNOWN
kworker/u16:0-10 [006] 650.065752: nfsd_cb_bc_update: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 cb=0xffff8881134b02f8 (first try)
kworker/u16:0-10 [006] 650.065754: nfsd_cb_bc_shutdown: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 cb=0xffff8881134b02f8 (first try)
kworker/u16:0-10 [006] 650.065810: nfsd_cb_new_state: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=DOWN
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:45:54 +0000 (12:45 -0500)]
NFSD: Rename nfsd_cb_state trace point
Make it clear where backchannel state is updated.
Example trace point output:
kworker/u16:0-10 [006] 2800.080404: nfsd_cb_new_state: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=UP
nfsd-940 [003] 2800.478368: nfsd_cb_new_state: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=UNKNOWN
kworker/u16:0-10 [003] 2800.478828: nfsd_cb_new_state: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=DOWN
kworker/u16:0-10 [005] 2802.039724: nfsd_cb_start: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=UP
kworker/u16:0-10 [005] 2810.611452: nfsd_cb_start: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=FAULT
kworker/u16:0-10 [005] 2810.616832: nfsd_cb_start: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=UNKNOWN
kworker/u16:0-10 [005] 2810.616931: nfsd_cb_start: addr=192.168.122.6:0 client
65b3c5b8:
f541f749 state=DOWN
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:45:48 +0000 (12:45 -0500)]
NFSD: Replace dprintks in nfsd4_cb_sequence_done()
Improve observability of backchannel session operation.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:45:42 +0000 (12:45 -0500)]
NFSD: Add nfsd_seq4_status trace event
Add a trace point that records SEQ4_STATUS flags returned in an
NFSv4.1 SEQUENCE response. SEQ4_STATUS flags report backchannel
issues and changes to lease state to clients. Knowing what the
server is reporting to clients is useful for debugging both
configuration and operational issues in real time.
For example, upcoming patches will enable server administrators to
revoke parts of a client's lease; that revocation is indicated to
the client when a subsequent SEQUENCE operation has one or more
SEQ4_STATUS flags that are set.
Sample trace records:
nfsd-927 [006] 615.581821: nfsd_seq4_status: xid=0x095ded07 sessionid=
65a032c3:
b7845faf:
00000001:
00000000 status_flags=BACKCHANNEL_FAULT
nfsd-927 [006] 615.588043: nfsd_seq4_status: xid=0x0a5ded07 sessionid=
65a032c3:
b7845faf:
00000001:
00000000 status_flags=BACKCHANNEL_FAULT
nfsd-928 [003] 615.588448: nfsd_seq4_status: xid=0x0b5ded07 sessionid=
65a032c3:
b7845faf:
00000001:
00000000 status_flags=BACKCHANNEL_FAULT
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:45:36 +0000 (12:45 -0500)]
NFSD: Retransmit callbacks after client reconnects
NFSv4.1 clients assume that if they disconnect, that will force the
server to resend pending callback operations once a fresh connection
has been established.
Turns out NFSD has not been resending after reconnect.
Fixes:
7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:45:29 +0000 (12:45 -0500)]
NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down
As part of managing a client disconnect, NFSD closes down and
replaces the backchannel rpc_clnt.
If a callback operation is pending when the backchannel rpc_clnt is
shut down, currently nfsd4_run_cb_work() just discards that
callback. But there are multiple cases to deal with here:
o The client's lease is getting destroyed. Throw the CB away.
o The client disconnected. It might be forcing a retransmit of
CB operations, or it could have disconnected for other reasons.
Reschedule the CB so it is retransmitted when the client
reconnects.
Since callback operations can now be rescheduled, ensure that
cb_ops->prepare can be called only once by moving the
cb_ops->prepare paragraph down to just before the rpc_call_async()
call.
Fixes:
2bbfed98a4d8 ("nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:45:23 +0000 (12:45 -0500)]
NFSD: Convert the callback workqueue to use delayed_work
Normally, NFSv4 callback operations are supposed to be sent to the
client as soon as they are queued up.
In a moment, I will introduce a recovery path where the server has
to wait for the client to reconnect. We don't want a hard busy wait
here -- the callback should be requeued to try again in several
milliseconds.
For now, convert nfsd4_callback from struct work_struct to struct
delayed_work, and queue with a zero delay argument. This should
avoid behavior changes for current operation.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Fri, 26 Jan 2024 17:45:17 +0000 (12:45 -0500)]
NFSD: Reset cb_seq_status after NFS4ERR_DELAY
I noticed that once an NFSv4.1 callback operation gets a
NFS4ERR_DELAY status on CB_SEQUENCE and then the connection is lost,
the callback client loops, resending it indefinitely.
The switch arm in nfsd4_cb_sequence_done() that handles
NFS4ERR_DELAY uses rpc_restart_call() to rearm the RPC state machine
for the retransmit, but that path does not call the rpc_prepare_call
callback again. Thus cb_seq_status is set to -10008 by the first
NFS4ERR_DELAY result, but is never set back to 1 for the retransmits.
nfsd4_cb_sequence_done() thinks it's getting nothing but a
long series of CB_SEQUENCE NFS4ERR_DELAY replies.
Fixes:
7ba6cad6c88f ("nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:49 +0000 (10:39 -0500)]
nfsd: make svc_stat per-network namespace instead of global
The final bit of stats that is global is the rpc svc_stat. Move this
into the nfsd_net struct and use that everywhere instead of the global
struct. Remove the unused global struct.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:48 +0000 (10:39 -0500)]
nfsd: remove nfsd_stats, make th_cnt a global counter
This is the last global stat, take it out of the nfsd_stats struct and
make it a global part of nfsd, report it the same as always.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:47 +0000 (10:39 -0500)]
nfsd: make all of the nfsd stats per-network namespace
We have a global set of counters that we modify for all of the nfsd
operations, but now that we're exposing these stats across all network
namespaces we need to make the stats also be per-network namespace. We
already have some caching stats that are per-network namespace, so move
these definitions into the same counter and then adjust all the helpers
and users of these stats to provide the appropriate nfsd_net struct so
that the stats are maintained for the per-network namespace objects.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:46 +0000 (10:39 -0500)]
nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
We are running nfsd servers inside of containers with their own network
namespace, and we want to monitor these services using the stats found
in /proc. However these are not exposed in the proc inside of the
container, so we have to bind mount the host /proc into our containers
to get at this information.
Separate out the stat counters init and the proc registration, and move
the proc registration into the pernet operations entry and exit points
so that these stats can be exposed inside of network namespaces.
This is an intermediate step, this just exposes the global counters in
the network namespace. Subsequent patches will move these counters into
the per-network namespace container.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:45 +0000 (10:39 -0500)]
nfsd: rename NFSD_NET_* to NFSD_STATS_*
We're going to merge the stats all into per network namespace in
subsequent patches, rename these nn counters to be consistent with the
rest of the stats.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:44 +0000 (10:39 -0500)]
sunrpc: use the struct net as the svc proc private
nfsd is the only thing using this helper, and it doesn't use the private
currently. When we switch to per-network namespace stats we will need
the struct net * in order to get to the nfsd_net. Use the net as the
proc private so we can utilize this when we make the switch over.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:43 +0000 (10:39 -0500)]
sunrpc: remove ->pg_stats from svc_program
Now that this isn't used anywhere, remove it.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:42 +0000 (10:39 -0500)]
sunrpc: pass in the sv_stats struct through svc_create_pooled
Since only one service actually reports the rpc stats there's not much
of a reason to have a pointer to it in the svc_program struct. Adjust
the svc_create_pooled function to take the sv_stats as an argument and
pass the struct through there as desired instead of getting it from the
svc_program->pg_stats.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:41 +0000 (10:39 -0500)]
nfsd: stop setting ->pg_stats for unused stats
A lot of places are setting a blank svc_stats in ->pg_stats and never
utilizing these stats. Remove all of these extra structs as we're not
reporting these stats anywhere.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Josef Bacik [Fri, 26 Jan 2024 15:39:40 +0000 (10:39 -0500)]
sunrpc: don't change ->sv_stats if it doesn't exist
We check for the existence of ->sv_stats elsewhere except in the core
processing code. It appears that only nfsd actual exports these values
anywhere, everybody else just has a write only copy of sv_stats in their
svc_program. Add a check for ->sv_stats before every adjustment to
allow us to eliminate the stats struct from all the users who don't
report the stats.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jorge Mora [Thu, 25 Jan 2024 14:42:23 +0000 (07:42 -0700)]
NFSD: fix LISTXATTRS returning more bytes than maxcount
The maxcount is the maximum number of bytes for the LISTXATTRS4resok
result. This includes the cookie and the count for the name array,
thus subtract 12 bytes from the maxcount: 8 (cookie) + 4 (array count)
when filling up the name array.
Fixes:
23e50fe3a5e6 ("nfsd: implement the xattr functions and en/decode logic")
Signed-off-by: Jorge Mora <mora@netapp.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jorge Mora [Thu, 25 Jan 2024 14:45:28 +0000 (07:45 -0700)]
NFSD: fix LISTXATTRS returning a short list with eof=TRUE
If the XDR buffer is not large enough to fit all attributes
and the remaining bytes left in the XDR buffer (xdrleft) is
equal to the number of bytes for the current attribute, then
the loop will prematurely exit without setting eof to FALSE.
Also in this case, adding the eof flag to the buffer will
make the reply 4 bytes larger than lsxa_maxcount.
Need to check if there are enough bytes to fit not only the
next attribute name but also the eof as well.
Fixes:
23e50fe3a5e6 ("nfsd: implement the xattr functions and en/decode logic")
Signed-off-by: Jorge Mora <mora@netapp.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jorge Mora [Thu, 25 Jan 2024 14:46:12 +0000 (07:46 -0700)]
NFSD: change LISTXATTRS cookie encoding to big-endian
Function nfsd4_listxattr_validate_cookie() expects the cookie
as an offset to the list thus it needs to be encoded in big-endian.
Fixes:
23e50fe3a5e6 ("nfsd: implement the xattr functions and en/decode logic")
Signed-off-by: Jorge Mora <mora@netapp.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Jorge Mora [Thu, 25 Jan 2024 14:46:54 +0000 (07:46 -0700)]
NFSD: fix nfsd4_listxattr_validate_cookie
If LISTXATTRS is sent with a correct cookie but a small maxcount,
this could lead function nfsd4_listxattr_validate_cookie to
return NFS4ERR_BAD_COOKIE. If maxcount = 20, then second check
on function gives RHS = 3 thus any cookie larger than 3 returns
NFS4ERR_BAD_COOKIE.
There is no need to validate the cookie on the return XDR buffer
since attribute referenced by cookie will be the first in the
return buffer.
Fixes:
23e50fe3a5e6 ("nfsd: implement the xattr functions and en/decode logic")
Signed-off-by: Jorge Mora <mora@netapp.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NeilBrown [Fri, 15 Dec 2023 01:18:31 +0000 (12:18 +1100)]
nfsd: use __fput_sync() to avoid delayed closing of files.
Calling fput() directly or though filp_close() from a kernel thread like
nfsd causes the final __fput() (if necessary) to be called from a
workqueue. This means that nfsd is not forced to wait for any work to
complete. If the ->release or ->destroy_inode function is slow for any
reason, this can result in nfsd closing files more quickly than the
workqueue can complete the close and the queue of pending closes can
grow without bounces (30 million has been seen at one customer site,
though this was in part due to a slowness in xfs which has since been
fixed).
nfsd does not need this. It is quite appropriate and safe for nfsd to
do its own close work. There is no reason that close should ever wait
for nfsd, so no deadlock can occur.
It should be safe and sensible to change all fput() calls to
__fput_sync(). However in the interests of caution this patch only
changes two - the two that can be most directly affected by client
behaviour and could occur at high frequency.
- the fput() implicitly in flip_close() is changed to __fput_sync()
by calling get_file() first to ensure filp_close() doesn't do
the final fput() itself. If is where files opened for IO are closed.
- the fput() in nfsd_read() is also changed. This is where directories
opened for readdir are closed.
This ensure that minimal fput work is queued to the workqueue.
This removes the need for the flush_delayed_fput() call in
nfsd_file_close_inode_sync()
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NeilBrown [Fri, 15 Dec 2023 01:18:30 +0000 (12:18 +1100)]
nfsd: Don't leave work of closing files to a work queue
The work of closing a file can have non-trivial cost. Doing it in a
separate work queue thread means that cost isn't imposed on the nfsd
threads and an imbalance can be created. This can result in files being
queued for the work queue more quickly that the work queue can process
them, resulting in unbounded growth of the queue and memory exhaustion.
To avoid this work imbalance that exhausts memory, this patch moves all
closing of files into the nfsd threads. This means that when the work
imposes a cost, that cost appears where it would be expected - in the
work of the nfsd thread. A subsequent patch will ensure the final
__fput() is called in the same (nfsd) thread which calls filp_close().
Files opened for NFSv3 are never explicitly closed by the client and are
kept open by the server in the "filecache", which responds to memory
pressure, is garbage collected even when there is no pressure, and
sometimes closes files when there is particular need such as for rename.
These files currently have filp_close() called in a dedicated work
queue, so their __fput() can have no effect on nfsd threads.
This patch discards the work queue and instead has each nfsd thread call
flip_close() on as many as 8 files from the filecache each time it acts
on a client request (or finds there are no pending client requests). If
there are more to be closed, more threads are woken. This spreads the
work of __fput() over multiple threads and imposes any cost on those
threads.
The number 8 is somewhat arbitrary. It needs to be greater than 1 to
ensure that files are closed more quickly than they can be added to the
cache. It needs to be small enough to limit the per-request delays that
will be imposed on clients when all threads are busy closing files.
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Mon, 1 Jan 2024 16:37:45 +0000 (11:37 -0500)]
SUNRPC: Use a static buffer for the checksum initialization vector
Allocating and zeroing a buffer during every call to
krb5_etm_checksum() is inefficient. Instead, set aside a static
buffer that is the maximum crypto block size, and use a portion
(or all) of that.
Reported-by: Markus Elfring <Markus.Elfring@web.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Zhipeng Lu [Tue, 2 Jan 2024 05:38:13 +0000 (13:38 +0800)]
SUNRPC: fix some memleaks in gssx_dec_option_array
The creds and oa->data need to be freed in the error-handling paths after
their allocation. So this patch add these deallocations in the
corresponding paths.
Fixes:
1d658336b05f ("SUNRPC: Add RPC based upcall mechanism for RPCGSS auth")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Zhipeng Lu [Sun, 24 Dec 2023 08:20:33 +0000 (16:20 +0800)]
SUNRPC: fix a memleak in gss_import_v2_context
The ctx->mech_used.data allocated by kmemdup is not freed in neither
gss_import_v2_context nor it only caller gss_krb5_import_sec_context,
which frees ctx on error.
Thus, this patch reform the last call of gss_import_v2_context to the
gss_krb5_import_ctx_v2, preventing the memleak while keepping the return
formation.
Fixes:
47d848077629 ("gss_krb5: handle new context format from gssd")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Linus Torvalds [Sun, 25 Feb 2024 23:46:06 +0000 (15:46 -0800)]
Linux 6.8-rc6
Linus Torvalds [Sun, 25 Feb 2024 23:31:57 +0000 (15:31 -0800)]
Merge tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
"Some more mostly boring fixes, but some not
User reported ones:
- the BTREE_ITER_FILTER_SNAPSHOTS one fixes a really nasty
performance bug; user reported an untar initially taking two
seconds and then ~2 minutes
- kill a __GFP_NOFAIL in the buffered read path; this was a leftover
from the trickier fix to kill __GFP_NOFAIL in readahead, where we
can't return errors (and have to silently truncate the read
ourselves).
bcachefs can't use GFP_NOFAIL for folio state unlike iomap based
filesystems because our folio state is just barely too big, 2MB
hugepages cause us to exceed the 2 page threshhold for GFP_NOFAIL.
additionally, the flags argument was just buggy, we weren't
supplying GFP_KERNEL previously (!)"
* tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs:
bcachefs: fix bch2_save_backtrace()
bcachefs: Fix check_snapshot() memcpy
bcachefs: Fix bch2_journal_flush_device_pins()
bcachefs: fix iov_iter count underflow on sub-block dio read
bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree
bcachefs: Kill __GFP_NOFAIL in buffered read path
bcachefs: fix backpointer_to_text() when dev does not exist
Kent Overstreet [Sun, 25 Feb 2024 20:45:34 +0000 (15:45 -0500)]
bcachefs: fix bch2_save_backtrace()
Missed a call in the previous fix.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sun, 25 Feb 2024 18:58:12 +0000 (10:58 -0800)]
Merge tag 'docs-6.8-fixes3' of git://git.lwn.net/linux
Pull two documentation build fixes from Jonathan Corbet:
- The XFS online fsck documentation uses incredibly deeply nested
subsection and list nesting; that broke the PDF docs build. Tweak a
parameter to tell LaTeX to allow the deeper nesting.
- Fix a 6.8 PDF-build regression
* tag 'docs-6.8-fixes3' of git://git.lwn.net/linux:
docs: translations: use attribute to store current language
docs: Instruct LaTeX to cope with deeper nesting
Linus Torvalds [Sun, 25 Feb 2024 18:41:57 +0000 (10:41 -0800)]
Merge tag 'usb-6.8-rc6' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 6.8-rc6 to resolve some reported
problems. These include:
- regression fixes with typec tpcm code as reported by many
- cdnsp and cdns3 driver fixes
- usb role setting code bugfixes
- build fix for uhci driver
- ncm gadget driver bugfix
- MAINTAINERS entry update
All of these have been in linux-next all week with no reported issues
and there is at least one fix in here that is in Thorsten's regression
list that is being tracked"
* tag 'usb-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: tpcm: Fix issues with power being removed during reset
MAINTAINERS: Drop myself as maintainer of TYPEC port controller drivers
usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
Revert "usb: typec: tcpm: reset counter when enter into unattached state after try role"
usb: gadget: omap_udc: fix USB gadget regression on Palm TE
usb: dwc3: gadget: Don't disconnect if not started
usb: cdns3: fix memory double free when handle zero packet
usb: cdns3: fixed memory use after free at cdns3_gadget_ep_disable()
usb: roles: don't get/set_role() when usb_role_switch is unregistered
usb: roles: fix NULL pointer issue when put module's reference
usb: cdnsp: fixed issue with incorrect detecting CDNSP family controllers
usb: cdnsp: blocked some cdns3 specific code
usb: uhci-grlib: Explicitly include linux/platform_device.h
Linus Torvalds [Sun, 25 Feb 2024 18:35:41 +0000 (10:35 -0800)]
Merge tag 'tty-6.8-rc6' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are three small serial/tty driver fixes for 6.8-rc6 that resolve
the following reported errors:
- riscv hvc console driver fix that was reported by many
- amba-pl011 serial driver fix for RS485 mode
- stm32 serial driver fix for RS485 mode
All of these have been in linux-next all week with no reported
problems"
* tag 'tty-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: amba-pl011: Fix DMA transmission in RS485 mode
serial: stm32: do not always set SER_RS485_RX_DURING_TX if RS485 is enabled
tty: hvc: Don't enable the RISC-V SBI console by default
Linus Torvalds [Sun, 25 Feb 2024 18:22:21 +0000 (10:22 -0800)]
Merge tag 'x86_urgent_for_v6.8_rc6' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Make sure clearing CPU buffers using VERW happens at the latest
possible point in the return-to-userspace path, otherwise memory
accesses after the VERW execution could cause data to land in CPU
buffers again
* tag 'x86_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
KVM/VMX: Move VERW closer to VMentry for MDS mitigation
KVM/VMX: Use BT+JNC, i.e. EFLAGS.CF to select VMRESUME vs. VMLAUNCH
x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key
x86/entry_32: Add VERW just before userspace transition
x86/entry_64: Add VERW just before userspace transition
x86/bugs: Add asm helpers for executing VERW
Linus Torvalds [Sun, 25 Feb 2024 18:14:12 +0000 (10:14 -0800)]
Merge tag 'irq_urgent_for_v6.8_rc6' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Make sure GICv4 always gets initialized to prevent a kexec-ed kernel
from silently failing to set it up
- Do not call bus_get_dev_root() for the mbigen irqchip as it always
returns NULL - use NULL directly
- Fix hardware interrupt number truncation when assigning MSI
interrupts
- Correct sending end-of-interrupt messages to disabled interrupts
lines on RISC-V PLIC
* tag 'irq_urgent_for_v6.8_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Do not assume vPE tables are preallocated
irqchip/mbigen: Don't use bus_get_dev_root() to find the parent
PCI/MSI: Prevent MSI hardware interrupt number truncation
irqchip/sifive-plic: Enable interrupt if needed before EOI
Linus Torvalds [Sun, 25 Feb 2024 17:53:13 +0000 (09:53 -0800)]
Merge tag 'erofs-for-6.8-rc6-fixes' of git://git./linux/kernel/git/xiang/erofs
Pull erofs fix from Gao Xiang:
- Fix page refcount leak when looking up specific inodes
introduced by metabuf reworking
* tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix refcount on the metabuf used for inode lookup
Linus Torvalds [Sun, 25 Feb 2024 17:29:05 +0000 (09:29 -0800)]
Merge tag 'pull-fixes.pathwalk-rcu-2' of git://git./linux/kernel/git/viro/vfs
Pull RCU pathwalk fixes from Al Viro:
"We still have some races in filesystem methods when exposed to RCU
pathwalk. This series is a result of code audit (the second round of
it) and it should deal with most of that stuff.
Still pending: ntfs3 ->d_hash()/->d_compare() and ceph_d_revalidate().
Up to maintainers (a note for NTFS folks - when documentation says
that a method may not block, it *does* imply that blocking allocations
are to be avoided. Really)"
[ More explanations for people who aren't familiar with the vagaries of
RCU path walking: most of it is hidden from filesystems, but if a
filesystem actively participates in the low-level path walking it
needs to make sure the fields involved in that walk are RCU-safe.
That "actively participate in low-level path walking" includes things
like having its own ->d_hash()/->d_compare() routines, or by having
its own directory permission function that doesn't just use the common
helpers. Having a ->d_revalidate() function will also have this issue.
Note that instead of making everything RCU safe you can also choose to
abort the RCU pathwalk if your operation cannot be done safely under
RCU, but that obviously comes with a performance penalty. One common
pattern is to allow the simple cases under RCU, and abort only if you
need to do something more complicated.
So not everything needs to be RCU-safe, and things like the inode etc
that the VFS itself maintains obviously already are. But these fixes
tend to be about properly RCU-delaying things like ->s_fs_info that
are maintained by the filesystem and that got potentially released too
early. - Linus ]
* tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ext4_get_link(): fix breakage in RCU mode
cifs_get_link(): bail out in unsafe case
fuse: fix UAF in rcu pathwalks
procfs: make freeing proc_fs_info rcu-delayed
procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()
nfs: fix UAF on pathwalk running into umount
nfs: make nfs_set_verifier() safe for use in RCU pathwalk
afs: fix __afs_break_callback() / afs_drop_open_mmap() race
hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info
exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper
affs: free affs_sb_info with kfree_rcu()
rcu pathwalk: prevent bogus hard errors from may_lookup()
fs/super.c: don't drop ->s_user_ns until we free struct super_block itself
Linus Torvalds [Sun, 25 Feb 2024 17:17:15 +0000 (09:17 -0800)]
Merge tag 'pull-fixes' of git://git./linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"A couple of fixes - revert of regression from this cycle and a fix for
erofs failure exit breakage (had been there since way back)"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
erofs: fix handling kern_mount() failure
Revert "get rid of DCACHE_GENOCIDE"
Al Viro [Sat, 3 Feb 2024 06:17:34 +0000 (01:17 -0500)]
ext4_get_link(): fix breakage in RCU mode
1) errors from ext4_getblk() should not be propagated to caller
unless we are really sure that we would've gotten the same error
in non-RCU pathwalk.
2) we leak buffer_heads if ext4_getblk() is successful, but bh is
not uptodate.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 20 Sep 2023 02:28:16 +0000 (22:28 -0400)]
cifs_get_link(): bail out in unsafe case
->d_revalidate() bails out there, anyway. It's not enough
to prevent getting into ->get_link() in RCU mode, but that
could happen only in a very contrieved setup. Not worth
trying to do anything fancy here unless ->d_revalidate()
stops kicking out of RCU mode at least in some cases.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 28 Sep 2023 04:19:39 +0000 (00:19 -0400)]
fuse: fix UAF in rcu pathwalks
->permission(), ->get_link() and ->inode_get_acl() might dereference
->s_fs_info (and, in case of ->permission(), ->s_fs_info->fc->user_ns
as well) when called from rcu pathwalk.
Freeing ->s_fs_info->fc is rcu-delayed; we need to make freeing ->s_fs_info
and dropping ->user_ns rcu-delayed too.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 20 Sep 2023 04:12:00 +0000 (00:12 -0400)]
procfs: make freeing proc_fs_info rcu-delayed
makes proc_pid_ns() safe from rcu pathwalk (put_pid_ns()
is still synchronous, but that's not a problem - it does
rcu-delay everything that needs to be)
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 20 Sep 2023 03:52:58 +0000 (23:52 -0400)]
procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()
that keeps both around until struct inode is freed, making access
to them safe from rcu-pathwalk
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 28 Sep 2023 02:11:26 +0000 (22:11 -0400)]
nfs: fix UAF on pathwalk running into umount
NFS ->d_revalidate(), ->permission() and ->get_link() need to access
some parts of nfs_server when called in RCU mode:
server->flags
server->caps
*(server->io_stats)
and, worst of all, call
server->nfs_client->rpc_ops->have_delegation
(the last one - as NFS_PROTO(inode)->have_delegation()). We really
don't want to RCU-delay the entire nfs_free_server() (it would have
to be done with schedule_work() from RCU callback, since it can't
be made to run from interrupt context), but actual freeing of
nfs_server and ->io_stats can be done via call_rcu() just fine.
nfs_client part is handled simply by making nfs_free_client() use
kfree_rcu().
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Thu, 28 Sep 2023 01:50:25 +0000 (21:50 -0400)]
nfs: make nfs_set_verifier() safe for use in RCU pathwalk
nfs_set_verifier() relies upon dentry being pinned; if that's
the case, grabbing ->d_lock stabilizes ->d_parent and guarantees
that ->d_parent points to a positive dentry. For something
we'd run into in RCU mode that is *not* true - dentry might've
been through dentry_kill() just as we grabbed ->d_lock, with
its parent going through the same just as we get to into
nfs_set_verifier_locked(). It might get to detaching inode
(and zeroing ->d_inode) before nfs_set_verifier_locked() gets
to fetching that; we get an oops as the result.
That can happen in nfs{,4} ->d_revalidate(); the call chain in
question is nfs_set_verifier_locked() <- nfs_set_verifier() <-
nfs_lookup_revalidate_delegated() <- nfs{,4}_do_lookup_revalidate().
We have checked that the parent had been positive, but that's
done before we get to nfs_set_verifier() and it's possible for
memory pressure to pick our dentry as eviction candidate by that
time. If that happens, back-to-back attempts to kill dentry and
its parent are quite normal. Sure, in case of eviction we'll
fail the ->d_seq check in the caller, but we need to survive
until we return there...
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 30 Sep 2023 00:24:34 +0000 (20:24 -0400)]
afs: fix __afs_break_callback() / afs_drop_open_mmap() race
In __afs_break_callback() we might check ->cb_nr_mmap and if it's non-zero
do queue_work(&vnode->cb_work). In afs_drop_open_mmap() we decrement
->cb_nr_mmap and do flush_work(&vnode->cb_work) if it reaches zero.
The trouble is, there's nothing to prevent __afs_break_callback() from
seeing ->cb_nr_mmap before the decrement and do queue_work() after both
the decrement and flush_work(). If that happens, we might be in trouble -
vnode might get freed before the queued work runs.
__afs_break_callback() is always done under ->cb_lock, so let's make
sure that ->cb_nr_mmap can change from non-zero to zero while holding
->cb_lock (the spinlock component of it - it's a seqlock and we don't
need to mess with the counter).
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Wed, 20 Sep 2023 00:18:59 +0000 (20:18 -0400)]
hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info
->d_hash() and ->d_compare() use those, so we need to delay freeing
them.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 19 Sep 2023 19:53:32 +0000 (15:53 -0400)]
exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper
That stuff can be accessed by ->d_hash()/->d_compare(); as it is, we have
a hard-to-hit UAF if rcu pathwalk manages to get into ->d_hash() on a filesystem
that is in process of getting shut down.
Besides, having nls and upcase table cleanup moved from ->put_super() towards
the place where sbi is freed makes for simpler failure exits.
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 19 Sep 2023 23:36:07 +0000 (19:36 -0400)]
affs: free affs_sb_info with kfree_rcu()
one of the flags in it is used by ->d_hash()/->d_compare()
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 30 Sep 2023 01:11:41 +0000 (21:11 -0400)]
rcu pathwalk: prevent bogus hard errors from may_lookup()
If lazy call of ->permission() returns a hard error, check that
try_to_unlazy() succeeds before returning it. That both makes
life easier for ->permission() instances and closes the race
in ENOTDIR handling - it is possible that positive d_can_lookup()
seen in link_path_walk() applies to the state *after* unlink() +
mkdir(), while nd->inode matches the state prior to that.
Normally seeing e.g. EACCES from permission check in rcu pathwalk
means that with some timings non-rcu pathwalk would've run into
the same; however, running into a non-executable regular file
in the middle of a pathname would not get to permission check -
it would fail with ENOTDIR instead.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 2 Feb 2024 02:10:01 +0000 (21:10 -0500)]
fs/super.c: don't drop ->s_user_ns until we free struct super_block itself
Avoids fun races in RCU pathwalk... Same goes for freeing LSM shite
hanging off super_block's arse.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Kent Overstreet [Sat, 24 Feb 2024 06:18:45 +0000 (01:18 -0500)]
bcachefs: Fix check_snapshot() memcpy
check_snapshot() copies the bch_snapshot to a temporary to easily handle
older versions that don't have all the fields of the current version,
but it lacked a min() to correctly handle keys newer and larger than the
current version.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 18 Feb 2024 01:38:47 +0000 (20:38 -0500)]
bcachefs: Fix bch2_journal_flush_device_pins()
If a journal write errored, the list of devices it was written to could
be empty - we're not supposed to mark an empty replicas list.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Brian Foster [Thu, 15 Feb 2024 17:16:05 +0000 (12:16 -0500)]
bcachefs: fix iov_iter count underflow on sub-block dio read
bch2_direct_IO_read() checks the request offset and size for sector
alignment and then falls through to a couple calculations to shrink
the size of the request based on the inode size. The problem is that
these checks round up to the fs block size, which runs the risk of
underflowing iter->count if the block size happens to be large
enough. This is triggered by fstest generic/361 with a 4k block
size, which subsequently leads to a crash. To avoid this crash,
check that the shorten length doesn't exceed the overall length of
the iter.
Fixes:
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Su Yue <glass.su@suse.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 25 Feb 2024 00:14:36 +0000 (19:14 -0500)]
bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree
If we're in FILTER_SNAPSHOTS mode and we start scanning a range of the
keyspace where no keys are visible in the current snapshot, we have a
problem - we'll scan for a very long time before scanning terminates.
Awhile back, this was fixed for most cases with peek_upto() (and
assertions that enforce that it's being used).
But the fix missed the fact that the inodes btree is different - every
key offset is in a different snapshot tree, not just the inode field.
Fixes:
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 23 Feb 2024 02:39:13 +0000 (21:39 -0500)]
bcachefs: Kill __GFP_NOFAIL in buffered read path
Recently, we fixed our __GFP_NOFAIL usage in the readahead path, but the
easy one in read_single_folio() (where wa can return an error) was
missed - oops.
Fixes:
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 21 Feb 2024 03:16:00 +0000 (22:16 -0500)]
bcachefs: fix backpointer_to_text() when dev does not exist
Fixes:
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Linus Torvalds [Sun, 25 Feb 2024 00:49:51 +0000 (16:49 -0800)]
Merge tag 'powerpc-6.8-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix a crash when hot adding a PCI device to an LPAR since
recent changes
- Fix nested KVM level-2 guest reboot failure due to empty
'arch_compat'
Thanks to Amit Machhiwal, Aneesh Kumar K.V (IBM), Brian King, Gaurav
Batra, and Vaibhav Jain.
* tag 'powerpc-6.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat'
powerpc/pseries/iommu: DLPAR add doesn't completely initialize pci_controller
Linus Torvalds [Sat, 24 Feb 2024 23:59:26 +0000 (15:59 -0800)]
Merge tag 'iommu-fixes-v6.8-rc5' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Intel VT-d fixes for nested domain handling:
- Cache invalidation for changes in a parent domain
- Dirty tracking setting for parent and nested domains
- Fix a constant-out-of-range warning
- ARM SMMU fixes:
- Fix CD allocation from atomic context when using SVA with SMMUv3
- Revert the conversion of SMMUv2 to domain_alloc_paging(), as it
breaks the boot for Qualcomm MSM8996 devices
- Restore SVA handle sharing in core code as it turned out there are
still drivers relying on it
* tag 'iommu-fixes-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/sva: Restore SVA handle sharing
iommu/arm-smmu-v3: Do not use GFP_KERNEL under as spinlock
iommu/vt-d: Fix constant-out-of-range warning
iommu/vt-d: Set SSADE when attaching to a parent with dirty tracking
iommu/vt-d: Add missing dirty tracking set for parent domain
iommu/vt-d: Wrap the dirty tracking loop to be a helper
iommu/vt-d: Remove domain parameter for intel_pasid_setup_dirty_tracking()
iommu/vt-d: Add missing device iotlb flush for parent domain
iommu/vt-d: Update iotlb in nested domain attach
iommu/vt-d: Add missing iotlb flush for parent domain
iommu/vt-d: Add __iommu_flush_iotlb_psi()
iommu/vt-d: Track nested domains in parent
Revert "iommu/arm-smmu: Convert to domain_alloc_paging()"
Linus Torvalds [Sat, 24 Feb 2024 23:53:40 +0000 (15:53 -0800)]
Merge tag 'cxl-fixes-6.8-rc6' of git://git./linux/kernel/git/cxl/cxl
Pull cxl fixes from Dan Williams:
"A collection of significant fixes for the CXL subsystem.
The largest change in this set, that bordered on "new development", is
the fix for the fact that the location of the new qos_class attribute
did not match the Documentation. The fix ends up deleting more code
than it added, and it has a new unit test to backstop basic errors in
this interface going forward. So the "red-diff" and unit test saved
the "rip it out and try again" response.
In contrast, the new notification path for firmware reported CXL
errors (CXL CPER notifications) has a locking context bug that can not
be fixed with a red-diff. Given where the release cycle stands, it is
not comfortable to squeeze in that fix in these waning days. So, that
receives the "back it out and try again later" treatment.
There is a regression fix in the code that establishes memory NUMA
nodes for platform CXL regions. That has an ack from x86 folks. There
are a couple more fixups for Linux to understand (reassemble) CXL
regions instantiated by platform firmware. The policy around platforms
that do not match host-physical-address with system-physical-address
(i.e. systems that have an address translation mechanism between the
address range reported in the ACPI CEDT.CFMWS and endpoint decoders)
has been softened to abort driver load rather than teardown the memory
range (can cause system hangs). Lastly, there is a robustness /
regression fix for cases where the driver would previously continue in
the face of error, and a fixup for PCI error notification handling.
Summary:
- Fix NUMA initialization from ACPI CEDT.CFMWS
- Fix region assembly failures due to async init order
- Fix / simplify export of qos_class information
- Fix cxl_acpi initialization vs single-window-init failures
- Fix handling of repeated 'pci_channel_io_frozen' notifications
- Workaround platforms that violate host-physical-address ==
system-physical address assumptions
- Defer CXL CPER notification handling to v6.9"
* tag 'cxl-fixes-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/acpi: Fix load failures due to single window creation failure
acpi/ghes: Remove CXL CPER notifications
cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window
cxl/test: Add support for qos_class checking
cxl: Fix sysfs export of qos_class for memdev
cxl: Remove unnecessary type cast in cxl_qos_class_verify()
cxl: Change 'struct cxl_memdev_state' *_perf_list to single 'struct cxl_dpa_perf'
cxl/region: Allow out of order assembly of autodiscovered regions
cxl/region: Handle endpoint decoders in cxl_region_find_decoder()
x86/numa: Fix the sort compare func used in numa_fill_memblks()
x86/numa: Fix the address overlap check in numa_fill_memblks()
cxl/pci: Skip to handle RAS errors if CXL.mem device is detached
Linus Torvalds [Sat, 24 Feb 2024 17:55:29 +0000 (09:55 -0800)]
Merge tag 'for-6.8/dm-fix-3' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fix from Mike Snitzer:
- Fix DM integrity and verity targets to not use excessive stack when
they recheck in the error path.
* tag 'for-6.8/dm-fix-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm-integrity, dm-verity: reduce stack usage for recheck
Linus Torvalds [Sat, 24 Feb 2024 17:49:16 +0000 (09:49 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Six fixes: the four driver ones are pretty trivial.
The larger two core changes are to try to fix various USB attached
devices which have somewhat eccentric ways of handling the VPD and
other mode pages which necessitate multiple revalidates (that were
removed in the interests of efficiency) and updating the heuristic for
supported VPD pages"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: jazz_esp: Only build if SCSI core is builtin
scsi: smartpqi: Fix disable_managed_interrupts
scsi: ufs: Uninitialized variable in ufshcd_devfreq_target()
scsi: target: pscsi: Fix bio_put() for error case
scsi: core: Consult supported VPD page list prior to fetching page
scsi: sd: usb_storage: uas: Access media prior to querying device properties
Linus Torvalds [Sat, 24 Feb 2024 17:46:05 +0000 (09:46 -0800)]
Merge tag 'i2c-for-6.8-rc6' of git://git./linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"A bugfix for host drivers"
* tag 'i2c-for-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: imx: when being a target, mark the last read as processed
Linus Torvalds [Sat, 24 Feb 2024 17:36:35 +0000 (09:36 -0800)]
Merge tag 'loongarch-fixes-6.8-3' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix two cpu-hotplug issues, fix the init sequence about FDT system,
fix the coding style of dts, and fix the wrong CPUCFG ID handling of
KVM"
* tag 'loongarch-fixes-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Streamline kvm_check_cpucfg() and improve comments
LoongArch: KVM: Rename _kvm_get_cpucfg() to _kvm_get_cpucfg_mask()
LoongArch: KVM: Fix input validation of _kvm_get_cpucfg() & kvm_check_cpucfg()
LoongArch: dts: Minor whitespace cleanup
LoongArch: Call early_init_fdt_scan_reserved_mem() earlier
LoongArch: Update cpu_sibling_map when disabling nonboot CPUs
LoongArch: Disable IRQ before init_fn() for nonboot CPUs
Arnd Bergmann [Sat, 24 Feb 2024 13:48:03 +0000 (14:48 +0100)]
dm-integrity, dm-verity: reduce stack usage for recheck
The newly added integrity_recheck() function has another larger stack
allocation, just like its caller integrity_metadata(). When it gets
inlined, the combination of the two exceeds the warning limit for 32-bit
architectures and possibly risks an overflow when this is called from
a deep call chain through a file system:
drivers/md/dm-integrity.c:1767:13: error: stack frame size (1048) exceeds limit (1024) in 'integrity_metadata' [-Werror,-Wframe-larger-than]
1767 | static void integrity_metadata(struct work_struct *w)
Since the caller at this point is done using its checksum buffer,
just reuse the same buffer in the new function to avoid the double
allocation.
[Mikulas: add "noinline" to integrity_recheck and verity_recheck.
These functions are only called on error, so they shouldn't bloat the
stack frame or code size of the caller.]
Fixes:
c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure")
Fixes:
9177f3c0dea6 ("dm-verity: recheck the hash after a failure")
Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Corey Minyard [Wed, 21 Feb 2024 19:27:13 +0000 (20:27 +0100)]
i2c: imx: when being a target, mark the last read as processed
When being a target, NAK from the controller means that all bytes have
been transferred. So, the last byte needs also to be marked as
'processed'. Otherwise index registers of backends may not increase.
Fixes:
f7414cd6923f ("i2c: imx: support slave mode for imx I2C driver")
Signed-off-by: Corey Minyard <minyard@acm.org>
Tested-by: Andrew Manley <andrew.manley@sealingtech.com>
Reviewed-by: Andrew Manley <andrew.manley@sealingtech.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
[wsa: fixed comment and commit message to properly describe the case]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Linus Torvalds [Fri, 23 Feb 2024 18:40:20 +0000 (10:40 -0800)]
Merge tag 'parisc-for-6.8-rc6' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
"Fixes CPU hotplug, the parisc stack unwinder and two possible build
errors in kprobes and ftrace area:
- Fix CPU hotplug
- Fix unaligned accesses and faults in stack unwinder
- Fix potential build errors by always including asm-generic/kprobes.h
- Fix build bug by add missing CONFIG_DYNAMIC_FTRACE check"
* tag 'parisc-for-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix stack unwinder
parisc/kprobes: always include asm-generic/kprobes.h
parisc/ftrace: add missing CONFIG_DYNAMIC_FTRACE check
Revert "parisc: Only list existing CPUs in cpu_possible_mask"
Linus Torvalds [Fri, 23 Feb 2024 18:31:28 +0000 (10:31 -0800)]
Merge tag 'arm-fixes-6.8-2' of git://git./linux/kernel/git/soc/soc
Pull arm and RISC-V SoC fixes from Arnd Bergmann:
"The Rockchip and IMX8 platforms get a number of fixes for dts files in
order to address some misconfigurations, including a regression for
USB-C support on some boards.
The other dts fixes are part of a series by Rob Herring to clean up
another class of dtc compiler warnings across all platforms, with a
few others helping out as well. With this, we can enable the warning
for the coming merge window without introducing regressions.
Conor Dooley has collected fixes for RISC-V platforms, both for the
dts files and for platofrm specific drivers.
The ep93xx platform gets a regression for for its gpio descriptors"
* tag 'arm-fixes-6.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (28 commits)
ARM: dts: renesas: rcar-gen2: Add missing #interrupt-cells to DA9063 nodes
cache: ax45mp_cache: Align end size to cache boundary in ax45mp_dma_cache_wback()
arm64: dts: qcom: Fix interrupt-map cell sizes
arm: dts: Fix dtc interrupt_map warnings
arm64: dts: Fix dtc interrupt_provider warnings
arm: dts: Fix dtc interrupt_provider warnings
arm64: dts: freescale: Disable interrupt_map check
ARM: ep93xx: Add terminator to gpiod_lookup_table
riscv: dts: sifive: add missing #interrupt-cells to pmic
arm64: dts: rockchip: Correct Indiedroid Nova GPIO Names
arm64: dts: rockchip: Drop interrupts property from rk3328 pwm-rockchip node
arm64: dts: rockchip: set num-cs property for spi on px30
arm64: dts: rockchip: minor rk3588 whitespace cleanup
riscv: dts: starfive: replace underscores in node names
bus: imx-weim: fix valid range check
Revert "arm64: dts: imx8mn-var-som-symphony: Describe the USB-C connector"
Revert "arm64: dts: imx8mp-dhcom-pdk3: Describe the USB-C connector"
arm64: dts: tqma8mpql: fix audio codec iov-supply
arm64: dts: rockchip: drop unneeded status from rk3588-jaguar gpio-leds
ARM: dts: rockchip: Drop interrupts property from pwm-rockchip nodes
...
Linus Torvalds [Fri, 23 Feb 2024 18:26:43 +0000 (10:26 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A simple fix to a definition in the CXL PMU driver, a couple of
patches to restore SME control registers on the resume path (since
Arm's fast model now clears them) and a revert for our jump label asm
constraints after Geert noticed they broke the build with GCC 5.5.
There was then the ensuing discussion about raising the minimum GCC
(and corresponding binutils) versions at [1], but for now we'll keep
things working as they were until that goes ahead.
- Revert fix to jump label asm constraints, as it regresses the build
with some GCC 5.5 toolchains.
- Restore SME control registers when resuming from suspend
- Fix incorrect filter definition in CXL PMU driver"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/sme: Restore SMCR_EL1.EZT0 on exit from suspend
arm64/sme: Restore SME registers on exit from suspend
Revert "arm64: jump_label: use constraints "Si" instead of "i""
perf: CXL: fix CPMU filter value mask length
Linus Torvalds [Fri, 23 Feb 2024 17:54:13 +0000 (09:54 -0800)]
Merge tag 's390-6.8-3' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Fix invalid -EBUSY on ccw_device_start() which can lead to failing
device initialization
- Add missing multiplication by 8 in __iowrite64_copy() to get the
correct byte length before calling zpci_memcpy_toio()
- Various config updates
* tag 's390-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: fix invalid -EBUSY on ccw_device_start
s390: use the correct count for __iowrite64_copy()
s390/configs: update default configurations
s390/configs: enable INIT_STACK_ALL_ZERO in all configurations
s390/configs: provide compat topic configuration target
Linus Torvalds [Fri, 23 Feb 2024 17:43:21 +0000 (09:43 -0800)]
Merge tag 'mm-hotfixes-stable-2024-02-22-15-02' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"A batch of MM (and one non-MM) hotfixes.
Ten are cc:stable and the remainder address post-6.7 issues or aren't
considered appropriate for backporting"
* tag 'mm-hotfixes-stable-2024-02-22-15-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
kasan: guard release_free_meta() shadow access with kasan_arch_is_ready()
mm/damon/lru_sort: fix quota status loss due to online tunings
mm/damon/reclaim: fix quota stauts loss due to online tunings
MAINTAINERS: mailmap: update Shakeel's email address
mm/damon/sysfs-schemes: handle schemes sysfs dir removal before commit_schemes_quota_goals
mm: memcontrol: clarify swapaccount=0 deprecation warning
mm/memblock: add MEMBLOCK_RSRV_NOINIT into flagname[] array
mm/zswap: invalidate duplicate entry when !zswap_enabled
lib/Kconfig.debug: TEST_IOV_ITER depends on MMU
mm/swap: fix race when skipping swapcache
mm/swap_state: update zswap LRU's protection range with the folio locked
selftests/mm: uffd-unit-test check if huge page size is 0
mm/damon/core: check apply interval in damon_do_apply_schemes()
mm: zswap: fix missing folio cleanup in writeback race path
Linus Torvalds [Fri, 23 Feb 2024 17:23:54 +0000 (09:23 -0800)]
Merge tag 'for-6.8/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Stable fixes for 3 DM targets (integrity, verity and crypt) to
address systemic failure that can occur if user provided pages map to
the same block.
- Fix DM crypt to not allow modifying data that being encrypted for
authenticated encryption.
- Fix DM crypt and verity targets to align their respective bvec_iter
struct members to avoid the need for byte level access (due to
__packed attribute) that is costly on some arches (like RISC).
* tag 'for-6.8/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm-crypt, dm-integrity, dm-verity: bump target version
dm-verity, dm-crypt: align "struct bvec_iter" correctly
dm-crypt: recheck the integrity tag after a failure
dm-crypt: don't modify the data when using authenticated encryption
dm-verity: recheck the hash after a failure
dm-integrity: recheck the integrity tag after a failure
Linus Torvalds [Fri, 23 Feb 2024 17:17:47 +0000 (09:17 -0800)]
Merge tag 'drm-fixes-2024-02-23' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"This is the weekly drm fixes. Non-drivers there is a fbdev/sparc fix,
syncobj, ttm and buddy fixes.
On the driver side, ivpu, meson, i915 have a small fix each. Then
amdgpu and xe have a bunch. Nouveau has some minor uapi additions to
give userspace some useful info along with a Kconfig change to allow
the new GSP firmware paths to be used by default on the GPUs it
supports.
Seems about the usual amount for this time of release cycle.
fbdev:
- fix sparc undefined reference
syncobj:
- fix sync obj fence waiting
- handle NULL fence in syncobj eventfd code
ttm:
- fix invalid free
buddy:
- fix list handling
- fix 32-bit build
meson:
- don't remove bridges from other drivers
nouveau:
- fix build warnings
- add two minor info parameters
- add a Kconfig to allow GSP by default on some GPUs
ivpu:
- allow fw to do initial tile config
i915:
- fix TV mode
amdgpu:
- Suspend/resume fixes
- Backlight error fix
- DCN 3.5 fixes
- Misc fixes
xe:
- Remove support for persistent exec_queues
- Drop a reduntant sysfs newline printout
- A three-patch fix for a VM_BIND rebind optimization path
- Fix a modpost warning on an xe KUNIT module"
* tag 'drm-fixes-2024-02-23' of git://anongit.freedesktop.org/drm/drm: (27 commits)
nouveau: add an ioctl to report vram usage
nouveau: add an ioctl to return vram bar size.
nouveau/gsp: add kconfig option to enable GSP paths by default
drm/amdgpu: Fix the runtime resume failure issue
drm/amd/display: fix null-pointer dereference on edid reading
drm/amd/display: Fix memory leak in dm_sw_fini()
drm/amd/display: fix input states translation error for dcn35 & dcn351
drm/amd/display: Fix potential null pointer dereference in dc_dmub_srv
drm/amd/display: Only allow dig mapping to pwrseq in new asic
drm/amd/display: adjust few initialization order in dm
drm/syncobj: handle NULL fence in syncobj_eventfd_entry_func
drm/syncobj: call drm_syncobj_fence_add_wait when WAIT_AVAILABLE flag is set
drm/ttm: Fix an invalid freeing on already freed page in error path
sparc: Fix undefined reference to fb_is_primary_device
drm/xe: Fix modpost warning on xe_mocs kunit module
drm/xe/xe_gt_idle: Drop redundant newline in name
drm/xe: Return 2MB page size for compact 64k PTEs
drm/xe: Add XE_VMA_PTE_64K VMA flag
drm/xe: Fix xe_vma_set_pte_size
drm/xe/uapi: Remove support for persistent exec_queues
...
Linus Torvalds [Fri, 23 Feb 2024 17:05:56 +0000 (09:05 -0800)]
Merge tag 'ata-6.8-rc6' of git://git./linux/kernel/git/libata/linux
Pull ata fixes from Niklas Cassel:
- Do not try to set a sleeping device to standby. Sleep is a deeper
sleep state than standby, and needs a reset to wake up the drive. A
system resume will reset the port. Sending a command other than reset
to a sleeping device is not wise, as the command will timeout (Damien
Le Moal)
- Do not try to put a device to standby twice during system shutdown.
ata_dev_power_set_standby() is currently called twice during
shutdown, once after the scsi device is removed, and another when
ata_pci_shutdown_one() executes. Modify ata_dev_power_set_standby()
to do nothing if the device is already in standby (Damien Le Moal)
- Add a quirk for ASM1064 to fixup the number of implemented ports. We
probe all ports that the hardware reports to be implemented. Probing
ports that are not implemented causes significantly increased boot
time (Andrey Jr. Melnikov)
- Fix error handling for the ahci_ceva driver. Ensure that the
ahci_ceva driver does a proper cleanup of its resources in the error
path (Radhey Shyam Pandey)
* tag 'ata-6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: Do not call ata_dev_power_set_standby() twice
ata: ahci_ceva: fix error handling for Xilinx GT PHY support
ahci: asm1064: correct count of reported ports
ata: libata-core: Do not try to set sleeping devices to standby
Linus Torvalds [Fri, 23 Feb 2024 17:01:35 +0000 (09:01 -0800)]
Merge tag 'gpio-fixes-for-v6.8-rc6' of git://git./linux/kernel/git/brgl/linux
Pull gpio fix from Bartosz Golaszewski:
- fix a use-case where no pins are mapped to GPIOs in
gpiochip_generic_config()
* tag 'gpio-fixes-for-v6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: Handle no pin_ranges in gpiochip_generic_config()
Linus Torvalds [Fri, 23 Feb 2024 16:58:47 +0000 (08:58 -0800)]
Merge tag 'hwmon-for-v6.8-rc6' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix a global-out-of-bounds bug in nct6775 driver"
* tag 'hwmon-for-v6.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (nct6775) Fix access to temperature configuration registers
Jason Gunthorpe [Thu, 22 Feb 2024 14:07:41 +0000 (10:07 -0400)]
iommu/sva: Restore SVA handle sharing
Prior to commit
092edaddb660 ("iommu: Support mm PASID 1:n with sva
domains") the code allowed a SVA handle to be bound multiple times to the
same (mm, device) pair. This was alluded to in the kdoc comment, but we
had understood this to be more a remark about allowing multiple devices,
not a literal same-driver re-opening the same SVA.
It turns out uacce and idxd were both relying on the core code to handle
reference counting for same-device same-mm scenarios. As this looks hard
to resolve in the drivers bring it back to the core code.
The new design has changed the meaning of the domain->users refcount to
refer to the number of devices that are sharing that domain for the same
mm. This is part of the design to lift the SVA domain de-duplication out
of the drivers.
Return the old behavior by explicitly de-duplicating the struct iommu_sva
handle. The same (mm, device) will return the same handle pointer and the
core code will handle tracking this. The last unbind of the handle will
destroy it.
Fixes:
092edaddb660 ("iommu: Support mm PASID 1:n with sva domains")
Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Closes: https://lore.kernel.org/all/
20240221110658.529-1-zhangfei.gao@linaro.org/
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/0-v1-9455fc497a6f+3b4-iommu_sva_sharing_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Joerg Roedel [Fri, 23 Feb 2024 15:43:53 +0000 (16:43 +0100)]
Merge tag 'arm-smmu-fixes' of git://git./linux/kernel/git/will/linux into iommu/fixes
Arm SMMU fixes for 6.8
- Fix CD allocation from atomic context when using SVA with SMMUv3
- Revert the conversion of SMMUv2 to domain_alloc_paging(), as it
breaks the boot for Qualcomm MSM8996 devices
Arnd Bergmann [Fri, 23 Feb 2024 12:54:36 +0000 (13:54 +0100)]
Merge tag 'renesas-fixes-for-v6.8-tag1' of git://git./linux/kernel/git/geert/renesas-devel into arm/fixes
Renesas fixes for v6.8
- Add missing #interrupt-cells to DA9063 nodes.
* tag 'renesas-fixes-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
ARM: dts: renesas: rcar-gen2: Add missing #interrupt-cells to DA9063 nodes
Link: https://lore.kernel.org/r/cover.1708597150.git.geert+renesas@glider.be
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 23 Feb 2024 12:54:07 +0000 (13:54 +0100)]
Merge tag 'riscv-dt-fixes-for-v6.8-rc6' of https://git./linux/kernel/git/conor/linux into arm/fixes
RISC-V Devicetree fixes for v6.8-rc6
Two fixes for W=2 issues in devicetrees, which should constitute fixes
for all reasonable-to-fix W=2 problems on RISC-V. The others are caused
by standard USB and MMC property names containing underscores that are
not likely to ever change.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-dt-fixes-for-v6.8-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
riscv: dts: sifive: add missing #interrupt-cells to pmic
riscv: dts: starfive: replace underscores in node names
Link: https://lore.kernel.org/r/20240221-foil-glade-09dbf1aa3fe2@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 23 Feb 2024 12:53:54 +0000 (13:53 +0100)]
Merge tag 'riscv-soc-drivers-fixes-for-v6.8-rc6' of https://git./linux/kernel/git/conor/linux into arm/fixes
RISC-V SoC driver fixes for v6.8-rc6
A fix for a kconfig symbol whose help text has been unhelpful since its
introduction.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-soc-drivers-fixes-for-v6.8-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
soc: microchip: Fix POLARFIRE_SOC_SYS_CTRL input prompt
Link: https://lore.kernel.org/r/20240221-irate-outrage-cf7f96f83074@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 23 Feb 2024 12:53:43 +0000 (13:53 +0100)]
Merge tag 'riscv-firmware-fixes-for-v6.8-rc6' of https://git./linux/kernel/git/conor/linux into arm/fixes
Microchip firmware driver fixes for v6.8-rc6
A single fix for me incorrectly using sizeof().
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-firmware-fixes-for-v6.8-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
firmware: microchip: fix wrong sizeof argument
Link: https://lore.kernel.org/r/20240221-recognize-dust-4bb575f4e67b@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 23 Feb 2024 12:53:30 +0000 (13:53 +0100)]
Merge tag 'riscv-cache-fixes-for-v6.8-rc6' of https://git./linux/kernel/git/conor/linux into arm/fixes
RISC-V Cache driver fixes for v6.8-rc6
A single fix for an inconsistency reported during CIP review by Pavel in
the newly added ax45mp cache driver.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-cache-fixes-for-v6.8-rc6' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
cache: ax45mp_cache: Align end size to cache boundary in ax45mp_dma_cache_wback()
Link: https://lore.kernel.org/r/20240221-keenness-handheld-b930aaa77708@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
WANG Xuerui [Fri, 23 Feb 2024 06:36:31 +0000 (14:36 +0800)]
LoongArch: KVM: Streamline kvm_check_cpucfg() and improve comments
All the checks currently done in kvm_check_cpucfg can be realized with
early returns, so just do that to avoid extra cognitive burden related
to the return value handling.
While at it, clean up comments of _kvm_get_cpucfg_mask() and
kvm_check_cpucfg(), by removing comments that are merely restatement of
the code nearby, and paraphrasing the rest so they read more natural for
English speakers (that likely are not familiar with the actual Chinese-
influenced grammar).
No functional changes intended.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
WANG Xuerui [Fri, 23 Feb 2024 06:36:31 +0000 (14:36 +0800)]
LoongArch: KVM: Rename _kvm_get_cpucfg() to _kvm_get_cpucfg_mask()
The function is not actually a getter of guest CPUCFG, but rather
validation of the input CPUCFG ID plus information about the supported
bit flags of that CPUCFG leaf. So rename it to avoid confusion.
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
WANG Xuerui [Fri, 23 Feb 2024 06:36:31 +0000 (14:36 +0800)]
LoongArch: KVM: Fix input validation of _kvm_get_cpucfg() & kvm_check_cpucfg()
The range check for the CPUCFG ID is wrong (should have been a ||
instead of &&) and useless in effect, so fix the obvious mistake.
Furthermore, the juggling of the temp return value is unnecessary,
because it is semantically equivalent and more readable to just
return at every switch case's end. This is done too to avoid potential
bugs in the future related to the unwanted complexity.
Also, the return value of _kvm_get_cpucfg is meant to be checked, but
this was not done, so bad CPUCFG IDs wrongly fall back to the default
case and 0 is incorrectly returned; check the return value to fix the
UAPI behavior.
While at it, also remove the redundant range check in kvm_check_cpucfg,
because out-of-range CPUCFG IDs are already rejected by the -EINVAL
as returned by _kvm_get_cpucfg().
Fixes:
db1ecca22edf ("LoongArch: KVM: Add LSX (128bit SIMD) support")
Fixes:
118e10cd893d ("LoongArch: KVM: Add LASX (256bit SIMD) support")
Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Krzysztof Kozlowski [Fri, 23 Feb 2024 06:36:31 +0000 (14:36 +0800)]
LoongArch: dts: Minor whitespace cleanup
The DTS code coding style expects exactly one space before '{'
character.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Fri, 23 Feb 2024 06:36:31 +0000 (14:36 +0800)]
LoongArch: Call early_init_fdt_scan_reserved_mem() earlier
The unflatten_and_copy_device_tree() function contains a call to
memblock_alloc(). This means that memblock is allocating memory before
any of the reserved memory regions are set aside in the arch_mem_init()
function which calls early_init_fdt_scan_reserved_mem(). Therefore,
there is a possibility for memblock to allocate from any of the
reserved memory regions.
Hence, move the call to early_init_fdt_scan_reserved_mem() to be earlier
in the init sequence, so that the reserved memory regions are set aside
before any allocations are done using memblock.
Cc: stable@vger.kernel.org
Fixes:
88d4d957edc707e ("LoongArch: Add FDT booting support from efi system table")
Signed-off-by: Oreoluwa Babatunde <quic_obabatun@quicinc.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Huacai Chen [Fri, 23 Feb 2024 06:36:31 +0000 (14:36 +0800)]
Huacai Chen [Fri, 23 Feb 2024 06:36:31 +0000 (14:36 +0800)]
Dave Airlie [Wed, 24 Jan 2024 04:24:25 +0000 (14:24 +1000)]
nouveau: add an ioctl to report vram usage
This reports the currently used vram allocations.
userspace using this has been proposed for nvk, but
it's a rather trivial uapi addition.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 24 Jan 2024 03:50:58 +0000 (13:50 +1000)]
nouveau: add an ioctl to return vram bar size.
This returns the BAR resources size so userspace can make
decisions based on rebar support.
userspace using this has been proposed for nvk, but
it's a rather trivial uapi addition.
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 14 Feb 2024 04:06:32 +0000 (14:06 +1000)]
nouveau/gsp: add kconfig option to enable GSP paths by default
Turing and Ampere will continue to use the old paths by default,
but we should allow distros to decide what the policy is.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240214040632.661069-1-airlied@gmail.com
Dave Airlie [Thu, 22 Feb 2024 23:44:44 +0000 (09:44 +1000)]
Merge tag 'drm-xe-fixes-2024-02-22' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Remove support for persistent exec_queues
- Drop a reduntant sysfs newline printout
Cross-subsystem Changes:
Core Changes:
Driver Changes:
- A three-patch fix for a VM_BIND rebind optimization path
- Fix a modpost warning on an xe KUNIT module
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZdcsNrxdWMMM417v@fedora
Dave Airlie [Thu, 22 Feb 2024 22:36:46 +0000 (08:36 +1000)]
Merge tag 'amd-drm-fixes-6.8-2024-02-22' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.8-2024-02-22:
amdgpu:
- Suspend/resume fixes
- Backlight error fix
- DCN 3.5 fixes
- Misc fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240222195338.5809-1-alexander.deucher@amd.com
Dave Airlie [Thu, 22 Feb 2024 22:30:20 +0000 (08:30 +1000)]
Merge tag 'drm-intel-fixes-2024-02-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fixup for TV mode
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZdcwT9kltvEgJZZE@jlahtine-mobl.ger.corp.intel.com