linux-2.6-microblaze.git
2 years agox86/ibt: Add IBT feature, MSR and #CP handling
Peter Zijlstra [Tue, 8 Mar 2022 15:30:35 +0000 (16:30 +0100)]
x86/ibt: Add IBT feature, MSR and #CP handling

The bits required to make the hardware go.. Of note is that, provided
the syscall entry points are covered with ENDBR, #CP doesn't need to
be an IST because we'll never hit the syscall gap.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.582331711@infradead.org
2 years agox86/ibt,ftrace: Add ENDBR to samples/ftrace
Peter Zijlstra [Tue, 8 Mar 2022 15:30:34 +0000 (16:30 +0100)]
x86/ibt,ftrace: Add ENDBR to samples/ftrace

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.523421433@infradead.org
2 years agox86/ibt,bpf: Add ENDBR instructions to prologue and trampoline
Peter Zijlstra [Tue, 8 Mar 2022 15:30:33 +0000 (16:30 +0100)]
x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline

With IBT enabled builds we need ENDBR instructions at indirect jump
target sites, since we start execution of the JIT'ed code through an
indirect jump, the very first instruction needs to be ENDBR.

Similarly, since eBPF tail-calls use indirect branches, their landing
site needs to be an ENDBR too.

The trampolines need similar adjustment.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Fixed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.464998838@infradead.org
2 years agox86/ibt,kprobes: Cure sym+0 equals fentry woes
Peter Zijlstra [Tue, 8 Mar 2022 15:30:32 +0000 (16:30 +0100)]
x86/ibt,kprobes: Cure sym+0 equals fentry woes

In order to allow kprobes to skip the ENDBR instructions at sym+0 for
X86_KERNEL_IBT builds, change _kprobe_addr() to take an architecture
callback to inspect the function at hand and modify the offset if
needed.

This streamlines the existing interface to cover more cases and
require less hooks. Once PowerPC gets fully converted there will only
be the one arch hook.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.405947704@infradead.org
2 years agox86/ibt,ftrace: Make function-graph play nice
Peter Zijlstra [Tue, 8 Mar 2022 15:30:31 +0000 (16:30 +0100)]
x86/ibt,ftrace: Make function-graph play nice

Return trampoline must not use indirect branch to return; while this
preserves the RSB, it is fundamentally incompatible with IBT. Instead
use a retpoline like ROP gadget that defeats IBT while not unbalancing
the RSB.

And since ftrace_stub is no longer a plain RET, don't use it to copy
from. Since RET is a trivial instruction, poke it directly.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org
2 years agox86/livepatch: Validate __fentry__ location
Peter Zijlstra [Tue, 8 Mar 2022 15:30:30 +0000 (16:30 +0100)]
x86/livepatch: Validate __fentry__ location

Currently livepatch assumes __fentry__ lives at func+0, which is most
likely untrue with IBT on. Instead make it use ftrace_location() by
default which both validates and finds the actual ip if there is any
in the same symbol.

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.285971256@infradead.org
2 years agox86/ibt,ftrace: Search for __fentry__ location
Peter Zijlstra [Tue, 8 Mar 2022 15:30:29 +0000 (16:30 +0100)]
x86/ibt,ftrace: Search for __fentry__ location

Currently a lot of ftrace code assumes __fentry__ is at sym+0. However
with Intel IBT enabled the first instruction of a function will most
likely be ENDBR.

Change ftrace_location() to not only return the __fentry__ location
when called for the __fentry__ location, but also when called for the
sym+0 location.

Then audit/update all callsites of this function to consistently use
these new semantics.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.227581603@infradead.org
2 years agox86/ibt,kvm: Add ENDBR to fastops
Peter Zijlstra [Tue, 8 Mar 2022 15:30:28 +0000 (16:30 +0100)]
x86/ibt,kvm: Add ENDBR to fastops

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.168850084@infradead.org
2 years agox86/ibt,crypto: Add ENDBR for the jump-table entries
Peter Zijlstra [Tue, 8 Mar 2022 15:30:27 +0000 (16:30 +0100)]
x86/ibt,crypto: Add ENDBR for the jump-table entries

The code does:

## branch into array
mov     jump_table(,%rax,8), %bufp
JMP_NOSPEC bufp

resulting in needing to mark the jump-table entries with ENDBR.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.110500806@infradead.org
2 years agox86/ibt,paravirt: Sprinkle ENDBR
Peter Zijlstra [Tue, 8 Mar 2022 15:30:26 +0000 (16:30 +0100)]
x86/ibt,paravirt: Sprinkle ENDBR

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.051635891@infradead.org
2 years agox86/linkage: Add ENDBR to SYM_FUNC_START*()
Peter Zijlstra [Tue, 8 Mar 2022 15:30:25 +0000 (16:30 +0100)]
x86/linkage: Add ENDBR to SYM_FUNC_START*()

Ensure the ASM functions have ENDBR on for IBT builds, this follows
the ARM64 example. Unlike ARM64, we'll likely end up overwriting them
with poison.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.992708941@infradead.org
2 years agox86/ibt,entry: Sprinkle ENDBR dust
Peter Zijlstra [Tue, 8 Mar 2022 15:30:24 +0000 (16:30 +0100)]
x86/ibt,entry: Sprinkle ENDBR dust

Kernel entry points should be having ENDBR on for IBT configs.

The SYSCALL entry points are found through taking their respective
address in order to program them in the MSRs, while the exception
entry points are found through UNWIND_HINT_IRET_REGS.

The rule is that any UNWIND_HINT_IRET_REGS at sym+0 should have an
ENDBR, see the later objtool ibt validation patch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.933157479@infradead.org
2 years agox86/ibt,xen: Sprinkle the ENDBR
Peter Zijlstra [Tue, 8 Mar 2022 15:30:23 +0000 (16:30 +0100)]
x86/ibt,xen: Sprinkle the ENDBR

Even though Xen currently doesn't advertise IBT, prepare for when it
will eventually do so and sprinkle the ENDBR dust accordingly.

Even though most of the entry points are IRET like, the CPL0
Hypervisor can set WAIT-FOR-ENDBR and demand ENDBR at these sites.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.873919996@infradead.org
2 years agox86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()
Peter Zijlstra [Tue, 8 Mar 2022 15:30:22 +0000 (16:30 +0100)]
x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()

By doing an early rewrite of 'jmp native_iret` in
restore_regs_and_return_to_kernel() we can get rid of the last
INTERRUPT_RETURN user and paravirt_iret.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.815039833@infradead.org
2 years agox86/entry: Cleanup PARAVIRT
Peter Zijlstra [Tue, 8 Mar 2022 15:30:21 +0000 (16:30 +0100)]
x86/entry: Cleanup PARAVIRT

Since commit 5c8f6a2e316e ("x86/xen: Add
xenpv_restore_regs_and_return_to_usermode()") Xen will no longer reach
this code and we can do away with the paravirt
SWAPGS/INTERRUPT_RETURN.

Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.756014488@infradead.org
2 years agox86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()
Peter Zijlstra [Tue, 8 Mar 2022 15:30:20 +0000 (16:30 +0100)]
x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()

Less duplication is more better.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.697253958@infradead.org
2 years agox86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR
Peter Zijlstra [Tue, 8 Mar 2022 15:30:19 +0000 (16:30 +0100)]
x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.638561109@infradead.org
2 years agox86/ibt: Add ANNOTATE_NOENDBR
Peter Zijlstra [Tue, 8 Mar 2022 15:30:18 +0000 (16:30 +0100)]
x86/ibt: Add ANNOTATE_NOENDBR

In order to have objtool warn about code references to !ENDBR
instruction, we need an annotation to allow this for non-control-flow
instances -- consider text range checks, text patching, or return
trampolines etc.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.578968224@infradead.org
2 years agox86/ibt: Base IBT bits
Peter Zijlstra [Tue, 8 Mar 2022 15:30:17 +0000 (16:30 +0100)]
x86/ibt: Base IBT bits

Add Kconfig, Makefile and basic instruction support for x86 IBT.

(Ab)use __DISABLE_EXPORTS to disable IBT since it's already employed
to mark compressed and purgatory. Additionally mark realmode with it
as well to avoid inserting ENDBR instructions there. While ENDBR is
technically a NOP, inserting them was causing some grief due to code
growth. There's also a problem with using __noendbr in code compiled
without -fcf-protection=branch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.519875203@infradead.org
2 years agoobjtool: Have WARN_FUNC fall back to sym+off
Peter Zijlstra [Tue, 8 Mar 2022 15:30:16 +0000 (16:30 +0100)]
objtool: Have WARN_FUNC fall back to sym+off

Currently WARN_FUNC() either prints func+off and failing that prints
sec+off, add an intermediate sym+off. This is useful when playing
around with entry code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.461283840@infradead.org
2 years agoobjtool,efi: Update __efi64_thunk annotation
Peter Zijlstra [Tue, 8 Mar 2022 15:30:15 +0000 (16:30 +0100)]
objtool,efi: Update __efi64_thunk annotation

The current annotation relies on not running objtool on the file; this
won't work when running objtool on vmlinux.o. Instead explicitly mark
__efi64_thunk() to be ignored.

This preserves the status quo, which is somewhat unfortunate. Luckily
this code is hardly ever used.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.402118218@infradead.org
2 years agoobjtool: Default ignore INT3 for unreachable
Peter Zijlstra [Tue, 8 Mar 2022 15:30:14 +0000 (16:30 +0100)]
objtool: Default ignore INT3 for unreachable

Ignore all INT3 instructions for unreachable code warnings, similar to NOP.
This allows using INT3 for various paddings instead of NOPs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.343312938@infradead.org
2 years agoobjtool: Add --dry-run
Peter Zijlstra [Tue, 8 Mar 2022 15:30:13 +0000 (16:30 +0100)]
objtool: Add --dry-run

Add a --dry-run argument to skip writing the modifications. This is
convenient for debugging.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.282720146@infradead.org
2 years agostatic_call: Avoid building empty .static_call_sites
Peter Zijlstra [Tue, 8 Mar 2022 15:30:12 +0000 (16:30 +0100)]
static_call: Avoid building empty .static_call_sites

Without CONFIG_HAVE_STATIC_CALL_INLINE there's no point in creating
the .static_call_sites section and it's related symbols.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.223798256@infradead.org
2 years agoMerge branch 'arm64/for-next/linkage'
Peter Zijlstra [Tue, 15 Mar 2022 09:32:31 +0000 (10:32 +0100)]
Merge branch 'arm64/for-next/linkage'

Enjoy the cleanups and avoid conflicts vs linkage

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2 years agotools/objtool: Check for use of the ENQCMD instruction in the kernel
Fenghua Yu [Mon, 7 Feb 2022 23:02:53 +0000 (15:02 -0800)]
tools/objtool: Check for use of the ENQCMD instruction in the kernel

The ENQCMD instruction implicitly accesses the PASID_MSR to fill in the
pasid field of the descriptor being submitted to an accelerator. But
there is no precise (and stable across kernel changes) point at which
the PASID_MSR is updated from the value for one task to the next.

Kernel code that uses accelerators must always use the ENQCMDS instruction
which does not access the PASID_MSR.

Check for use of the ENQCMD instruction in the kernel and warn on its
usage.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220207230254.3342514-11-fenghua.yu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2 years agoLinux 5.17-rc8
Linus Torvalds [Sun, 13 Mar 2022 20:23:37 +0000 (13:23 -0700)]
Linux 5.17-rc8

2 years agoMerge tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 13 Mar 2022 17:36:38 +0000 (10:36 -0700)]
Merge tag 'x86_urgent_for_v5.17_rc8' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Free shmem backing storage for SGX enclave pages when those are
   swapped back into EPC memory

 - Prevent do_int3() from being kprobed, to avoid recursion

 - Remap setup_data and setup_indirect structures properly when
   accessing their members

 - Correct the alternatives patching order for modules too

* tag 'x86_urgent_for_v5.17_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Free backing memory after faulting the enclave page
  x86/traps: Mark do_int3() NOKPROBE_SYMBOL
  x86/boot: Add setup_indirect support in early_memremap_is_setup_data()
  x86/boot: Fix memremap of setup_indirect structures
  x86/module: Fix the paravirt vs alternative order

2 years agoMerge tag 'perf-tools-fixes-for-v5.17-2022-03-12' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sat, 12 Mar 2022 18:29:25 +0000 (10:29 -0800)]
Merge tag 'perf-tools-fixes-for-v5.17-2022-03-12' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix event parser error for hybrid systems

 - Fix NULL check against wrong variable in 'perf bench' and in the
   parsing code

 - Update arm64 KVM headers from the kernel sources

 - Sync cpufeatures header with the kernel sources

* tag 'perf-tools-fixes-for-v5.17-2022-03-12' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf parse: Fix event parser error for hybrid systems
  perf bench: Fix NULL check against wrong variable
  perf parse-events: Fix NULL check against wrong variable
  tools headers cpufeatures: Sync with the kernel sources
  tools kvm headers arm64: Update KVM headers from the kernel sources

2 years agoMerge tag 'drm-fixes-2022-03-12' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Sat, 12 Mar 2022 18:22:43 +0000 (10:22 -0800)]
Merge tag 'drm-fixes-2022-03-12' of git://anongit.freedesktop.org/drm/drm

Pull drm kconfig fix from Dave Airlie:
 "Thorsten pointed out this had fallen down the cracks and was in -next
  only, I've picked it out, fixed up it's Fixes: line.

   - fix regression in Kconfig"

* tag 'drm-fixes-2022-03-12' of git://anongit.freedesktop.org/drm/drm:
  drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP

2 years agoperf parse: Fix event parser error for hybrid systems
Zhengjun Xing [Mon, 7 Mar 2022 15:16:27 +0000 (23:16 +0800)]
perf parse: Fix event parser error for hybrid systems

This bug happened on hybrid systems when both cpu_core and cpu_atom
have the same event name such as "UOPS_RETIRED.MS" while their event
terms are different, then during perf stat, the event for cpu_atom
will parse fail and then no output for cpu_atom.

UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/

It is because event terms in the "head" of parse_events_multi_pmu_add
will be changed to event terms for cpu_core after parsing UOPS_RETIRED.MS
for cpu_core, then when parsing the same event for cpu_atom, it still
uses the event terms for cpu_core, but event terms for cpu_atom are
different with cpu_core, the event parses for cpu_atom will fail. This
patch fixes it, the event terms should be parsed from the original
event.

This patch can work for the hybrid systems that have the same event
in more than 2 PMUs. It also can work in non-hybrid systems.

Before:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 2737845 16068518485 16068518485

 Performance counter stats for 'system wide':

         2,737,845      cpu_core/UOPS_RETIRED.MS/

       1.002553850 seconds time elapsed

After:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 1977555 16076950711 16076950711
  UOPS_RETIRED.MS: 568684 8038694234 8038694234

 Performance counter stats for 'system wide':

         1,977,555      cpu_core/UOPS_RETIRED.MS/
           568,684      cpu_atom/UOPS_RETIRED.MS/

       1.004758259 seconds time elapsed

Fixes: fb0811535e92c6c1 ("perf parse-events: Allow config on kernel PMU events")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220307151627.30049-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf bench: Fix NULL check against wrong variable
Weiguo Li [Fri, 11 Mar 2022 13:07:16 +0000 (21:07 +0800)]
perf bench: Fix NULL check against wrong variable

We did a NULL check after "epollfdp = calloc(...)", but we checked
"epollfd" instead of "epollfdp".

Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_B5D64530EB9C7DBB8D2C88A0C790F1489D0A@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf parse-events: Fix NULL check against wrong variable
Weiguo Li [Fri, 11 Mar 2022 13:06:57 +0000 (21:06 +0800)]
perf parse-events: Fix NULL check against wrong variable

We did a null check after "tmp->symbol = strdup(...)", but we checked
"list->symbol" other than "tmp->symbol".

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_DF39269807EC9425E24787E6DB632441A405@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agotools headers cpufeatures: Sync with the kernel sources
Arnaldo Carvalho de Melo [Thu, 1 Jul 2021 16:39:15 +0000 (13:39 -0300)]
tools headers cpufeatures: Sync with the kernel sources

To pick the changes from:

  d45476d983240937 ("x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE")

Its just a comment fixup.

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YiyiHatGaJQM7l/Y@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agotools kvm headers arm64: Update KVM headers from the kernel sources
Arnaldo Carvalho de Melo [Mon, 21 Dec 2020 15:53:44 +0000 (12:53 -0300)]
tools kvm headers arm64: Update KVM headers from the kernel sources

To pick the changes from:

  a5905d6af492ee6a ("KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated")

That don't causes any changes in tooling (when built on x86), only
addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/lkml/YiyhAK6sVPc83FaI@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agodrm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP
Thomas Zimmermann [Thu, 3 Feb 2022 09:39:22 +0000 (10:39 +0100)]
drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP

As reported in [1], DRM_PANEL_EDP depends on DRM_DP_HELPER. Select
the option to fix the build failure. The error message is shown
below.

  arm-linux-gnueabihf-ld: drivers/gpu/drm/panel/panel-edp.o: in function
    `panel_edp_probe': panel-edp.c:(.text+0xb74): undefined reference to
    `drm_panel_dp_aux_backlight'
  make[1]: *** [/builds/linux/Makefile:1222: vmlinux] Error 1

The issue has been reported before, when DisplayPort helpers were
hidden behind the option CONFIG_DRM_KMS_HELPER. [2]

v2:
* fix and expand commit description (Arnd)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 9d6366e743f3 ("drm: fb_helper: improve CONFIG_FB dependency")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://lore.kernel.org/dri-devel/CA+G9fYvN0NyaVkRQmA1O6rX7H8PPaZrUAD7=RDy33QY9rUU-9g@mail.gmail.com/
Link: https://lore.kernel.org/all/20211117062704.14671-1-rdunlap@infradead.org/
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220203093922.20754-1-tzimmermann@suse.de
Signed-off-by: Dave Airlie <airlied@redhat.com>
2 years agoARM: Spectre-BHB: provide empty stub for non-config
Randy Dunlap [Fri, 11 Mar 2022 19:49:12 +0000 (11:49 -0800)]
ARM: Spectre-BHB: provide empty stub for non-config

When CONFIG_GENERIC_CPU_VULNERABILITIES is not set, references
to spectre_v2_update_state() cause a build error, so provide an
empty stub for that function when the Kconfig option is not set.

Fixes this build error:

  arm-linux-gnueabi-ld: arch/arm/mm/proc-v7-bugs.o: in function `cpu_v7_bugs_init':
  proc-v7-bugs.c:(.text+0x52): undefined reference to `spectre_v2_update_state'
  arm-linux-gnueabi-ld: proc-v7-bugs.c:(.text+0x82): undefined reference to `spectre_v2_update_state'

Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: patches@armlinux.org.uk
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 11 Mar 2022 20:28:21 +0000 (12:28 -0800)]
Merge tag 'riscv-for-linus-5.17-rc8' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - prevent users from enabling the alternatives framework (and thus
   errata handling) on XIP kernels, where runtime code patching does not
   function correctly.

 - properly detect offset overflow for AUIPC-based relocations in
   modules. This may manifest as modules calling arbitrary invalid
   addresses, depending on the address allocated when a module is
   loaded.

* tag 'riscv-for-linus-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix auipc+jalr relocation range checks
  riscv: alternative only works on !XIP_KERNEL

2 years agoMerge tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Fri, 11 Mar 2022 19:50:36 +0000 (11:50 -0800)]
Merge tag 'powerpc-5.17-6' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
 "Fix STACKTRACE=n build, in particular for skiroot_defconfig"

* tag 'powerpc-5.17-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc: Fix STACKTRACE=n build

2 years agoARM: fix Thumb2 regression with Spectre BHB
Russell King (Oracle) [Fri, 11 Mar 2022 17:13:17 +0000 (17:13 +0000)]
ARM: fix Thumb2 regression with Spectre BHB

When building for Thumb2, the vectors make use of a local label. Sadly,
the Spectre BHB code also uses a local label with the same number which
results in the Thumb2 reference pointing at the wrong place. Fix this
by changing the number used for the Spectre BHB local label.

Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround")
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 11 Mar 2022 19:24:58 +0000 (11:24 -0800)]
Merge tag 'mmc-v5.17-rc6' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Restore (mostly) the busy polling for MMC_SEND_OP_COND

  MMC host:
   - meson-gx: Fix DMA usage of meson_mmc_post_req()"

* tag 'mmc-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Restore (almost) the busy polling for MMC_SEND_OP_COND
  mmc: meson: Fix usage of meson_mmc_post_req()

2 years agox86/sgx: Free backing memory after faulting the enclave page
Jarkko Sakkinen [Thu, 3 Mar 2022 22:38:58 +0000 (00:38 +0200)]
x86/sgx: Free backing memory after faulting the enclave page

There is a limited amount of SGX memory (EPC) on each system.  When that
memory is used up, SGX has its own swapping mechanism which is similar
in concept but totally separate from the core mm/* code.  Instead of
swapping to disk, SGX swaps from EPC to normal RAM.  That normal RAM
comes from a shared memory pseudo-file and can itself be swapped by the
core mm code.  There is a hierarchy like this:

EPC <-> shmem <-> disk

After data is swapped back in from shmem to EPC, the shmem backing
storage needs to be freed.  Currently, the backing shmem is not freed.
This effectively wastes the shmem while the enclave is running.  The
memory is recovered when the enclave is destroyed and the backing
storage freed.

Sort this out by freeing memory with shmem_truncate_range(), as soon as
a page is faulted back to the EPC.  In addition, free the memory for
PCMD pages as soon as all PCMD's in a page have been marked as unused
by zeroing its contents.

Cc: stable@vger.kernel.org
Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Reported-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220303223859.273187-1-jarkko@kernel.org
2 years agoMerge branch 'davidh' (fixes from David Howells)
Linus Torvalds [Fri, 11 Mar 2022 18:28:32 +0000 (10:28 -0800)]
Merge branch 'davidh' (fixes from David Howells)

Merge misc fixes from David Howells:
 "A set of patches for watch_queue filter issues noted by Jann. I've
  added in a cleanup patch from Christophe Jaillet to convert to using
  formal bitmap specifiers for the note allocation bitmap.

  Also two filesystem fixes (afs and cachefiles)"

* emailed patches from David Howells <dhowells@redhat.com>:
  cachefiles: Fix volume coherency attribute
  afs: Fix potential thrashing in afs writeback
  watch_queue: Make comment about setting ->defunct more accurate
  watch_queue: Fix lack of barrier/sync/lock between post and read
  watch_queue: Free the alloc bitmap when the watch_queue is torn down
  watch_queue: Fix the alloc bitmap size to reflect notes allocated
  watch_queue: Use the bitmap API when applicable
  watch_queue: Fix to always request a pow-of-2 pipe ring size
  watch_queue: Fix to release page in ->release()
  watch_queue, pipe: Free watchqueue state after clearing pipe ring
  watch_queue: Fix filter limit check

2 years agocachefiles: Fix volume coherency attribute
David Howells [Fri, 11 Mar 2022 16:02:18 +0000 (16:02 +0000)]
cachefiles: Fix volume coherency attribute

A network filesystem may set coherency data on a volume cookie, and if
given, cachefiles will store this in an xattr on the directory in the
cache corresponding to the volume.

The function that sets the xattr just stores the contents of the volume
coherency buffer directly into the xattr, with nothing added; the
checking function, on the other hand, has a cut'n'paste error whereby it
tries to interpret the xattr contents as would be the xattr on an
ordinary file (using the cachefiles_xattr struct).  This results in a
failure to match the coherency data because the buffer ends up being
shifted by 18 bytes.

Fix this by defining a structure specifically for the volume xattr and
making both the setting and checking functions use it.

Since the volume coherency doesn't work if used, take the opportunity to
insert a reserved field for future use, set it to 0 and check that it is
0.  Log mismatch through the appropriate tracepoint.

Note that this only affects cifs; 9p, afs, ceph and nfs don't use the
volume coherency data at the moment.

Fixes: 32e150037dce ("fscache, cachefiles: Store the volume coherency data")
Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: Steve French <smfrench@gmail.com>
cc: linux-cifs@vger.kernel.org
cc: linux-cachefs@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoafs: Fix potential thrashing in afs writeback
David Howells [Fri, 11 Mar 2022 15:58:21 +0000 (15:58 +0000)]
afs: Fix potential thrashing in afs writeback

In afs_writepages_region(), if the dirty page we find is undergoing
writeback or write to cache, but the sync_mode is WB_SYNC_NONE, we go
round the loop trying the same page again and again with no pausing or
waiting unless and until another thread manages to clear the writeback
and fscache flags.

Fix this with three measures:

 (1) Advance start to after the page we found.

 (2) Break out of the loop and return if rescheduling is requested.

 (3) Arbitrarily give up after a maximum of 5 skips.

Fixes: 31143d5d515e ("AFS: implement basic file write support")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
Acked-by: Marc Dionne <marc.dionne@auristor.com>
Link: https://lore.kernel.org/r/164692725757.2097000.2060513769492301854.stgit@warthog.procyon.org.uk/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agox86/traps: Mark do_int3() NOKPROBE_SYMBOL
Li Huafei [Thu, 10 Mar 2022 12:09:15 +0000 (20:09 +0800)]
x86/traps: Mark do_int3() NOKPROBE_SYMBOL

Since kprobe_int3_handler() is called in do_int3(), probing do_int3()
can cause a breakpoint recursion and crash the kernel. Therefore,
do_int3() should be marked as NOKPROBE_SYMBOL.

Fixes: 21e28290b317 ("x86/traps: Split int3 handler up")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220310120915.63349-1-lihuafei1@huawei.com
2 years agowatch_queue: Make comment about setting ->defunct more accurate
David Howells [Fri, 11 Mar 2022 13:24:47 +0000 (13:24 +0000)]
watch_queue: Make comment about setting ->defunct more accurate

watch_queue_clear() has a comment stating that setting ->defunct to true
preventing new additions as well as preventing notifications.  Whilst
the latter is true, the first bit is superfluous since at the time this
function is called, the pipe cannot be accessed to add new event
sources.

Remove the "new additions" bit from the comment.

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue: Fix lack of barrier/sync/lock between post and read
David Howells [Fri, 11 Mar 2022 13:24:36 +0000 (13:24 +0000)]
watch_queue: Fix lack of barrier/sync/lock between post and read

There's nothing to synchronise post_one_notification() versus
pipe_read().  Whilst posting is done under pipe->rd_wait.lock, the
reader only takes pipe->mutex which cannot bar notification posting as
that may need to be made from contexts that cannot sleep.

Fix this by setting pipe->head with a barrier in post_one_notification()
and reading pipe->head with a barrier in pipe_read().

If that's not sufficient, the rd_wait.lock will need to be taken,
possibly in a ->confirm() op so that it only applies to notifications.
The lock would, however, have to be dropped before copy_page_to_iter()
is invoked.

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue: Free the alloc bitmap when the watch_queue is torn down
David Howells [Fri, 11 Mar 2022 13:24:29 +0000 (13:24 +0000)]
watch_queue: Free the alloc bitmap when the watch_queue is torn down

Free the watch_queue note allocation bitmap when the watch_queue is
destroyed.

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue: Fix the alloc bitmap size to reflect notes allocated
David Howells [Fri, 11 Mar 2022 13:24:22 +0000 (13:24 +0000)]
watch_queue: Fix the alloc bitmap size to reflect notes allocated

Currently, watch_queue_set_size() sets the number of notes available in
wqueue->nr_notes according to the number of notes allocated, but sets
the size of the bitmap to the unrounded number of notes originally asked
for.

Fix this by setting the bitmap size to the number of notes we're
actually going to make available (ie. the number allocated).

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue: Use the bitmap API when applicable
Christophe JAILLET [Fri, 11 Mar 2022 13:24:15 +0000 (13:24 +0000)]
watch_queue: Use the bitmap API when applicable

Use bitmap_alloc() to simplify code, improve the semantic and reduce
some open-coded arithmetic in allocator arguments.

Also change a memset(0xff) into an equivalent bitmap_fill() to keep
consistency.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue: Fix to always request a pow-of-2 pipe ring size
David Howells [Fri, 11 Mar 2022 13:24:08 +0000 (13:24 +0000)]
watch_queue: Fix to always request a pow-of-2 pipe ring size

The pipe ring size must always be a power of 2 as the head and tail
pointers are masked off by AND'ing with the size of the ring - 1.
watch_queue_set_size(), however, lets you specify any number of notes
between 1 and 511.  This number is passed through to pipe_resize_ring()
without checking/forcing its alignment.

Fix this by rounding the number of slots required up to the nearest
power of two.  The request is meant to guarantee that at least that many
notifications can be generated before the queue is full, so rounding
down isn't an option, but, alternatively, it may be better to give an
error if we aren't allowed to allocate that much ring space.

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue: Fix to release page in ->release()
David Howells [Fri, 11 Mar 2022 13:23:46 +0000 (13:23 +0000)]
watch_queue: Fix to release page in ->release()

When a pipe ring descriptor points to a notification message, the
refcount on the backing page is incremented by the generic get function,
but the release function, which marks the bitmap, doesn't drop the page
ref.

Fix this by calling generic_pipe_buf_release() at the end of
watch_queue_pipe_buf_release().

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue, pipe: Free watchqueue state after clearing pipe ring
David Howells [Fri, 11 Mar 2022 13:23:38 +0000 (13:23 +0000)]
watch_queue, pipe: Free watchqueue state after clearing pipe ring

In free_pipe_info(), free the watchqueue state after clearing the pipe
ring as each pipe ring descriptor has a release function, and in the
case of a notification message, this is watch_queue_pipe_buf_release()
which tries to mark the allocation bitmap that was previously released.

Fix this by moving the put of the pipe's ref on the watch queue to after
the ring has been cleared.  We still need to call watch_queue_clear()
before doing that to make sure that the pipe is disconnected from any
notification sources first.

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch_queue: Fix filter limit check
David Howells [Fri, 11 Mar 2022 13:23:31 +0000 (13:23 +0000)]
watch_queue: Fix filter limit check

In watch_queue_set_filter(), there are a couple of places where we check
that the filter type value does not exceed what the type_filter bitmap
can hold.  One place calculates the number of bits by:

   if (tf[i].type >= sizeof(wfilter->type_filter) * 8)

which is fine, but the second does:

   if (tf[i].type >= sizeof(wfilter->type_filter) * BITS_PER_LONG)

which is not.  This can lead to a couple of out-of-bounds writes due to
a too-large type:

 (1) __set_bit() on wfilter->type_filter
 (2) Writing more elements in wfilter->filters[] than we allocated.

Fix this by just using the proper WATCH_TYPE__NR instead, which is the
number of types we actually know about.

The bug may cause an oops looking something like:

  BUG: KASAN: slab-out-of-bounds in watch_queue_set_filter+0x659/0x740
  Write of size 4 at addr ffff88800d2c66bc by task watch_queue_oob/611
  ...
  Call Trace:
   <TASK>
   dump_stack_lvl+0x45/0x59
   print_address_description.constprop.0+0x1f/0x150
   ...
   kasan_report.cold+0x7f/0x11b
   ...
   watch_queue_set_filter+0x659/0x740
   ...
   __x64_sys_ioctl+0x127/0x190
   do_syscall_64+0x43/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

  Allocated by task 611:
   kasan_save_stack+0x1e/0x40
   __kasan_kmalloc+0x81/0xa0
   watch_queue_set_filter+0x23a/0x740
   __x64_sys_ioctl+0x127/0x190
   do_syscall_64+0x43/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xae

  The buggy address belongs to the object at ffff88800d2c66a0
   which belongs to the cache kmalloc-32 of size 32
  The buggy address is located 28 bytes inside of
   32-byte region [ffff88800d2c66a0ffff88800d2c66c0)

Fixes: c73be61cede5 ("pipe: Add general notification queue support")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'drm-fixes-2022-03-11' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 11 Mar 2022 05:15:42 +0000 (21:15 -0800)]
Merge tag 'drm-fixes-2022-03-11' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "As expected at this stage its pretty quiet, one sun4i mixer fix and
  one i915 display flicker fix:

  i915:
   - fix psr screen flicker

  sun4i:
   - mixer format fix"

* tag 'drm-fixes-2022-03-11' of git://anongit.freedesktop.org/drm/drm:
  drm/sun4i: mixer: Fix P010 and P210 format numbers
  drm/i915/psr: Set "SF Partial Frame Enable" also on full update

2 years agoriscv: Fix auipc+jalr relocation range checks
Emil Renner Berthing [Wed, 23 Feb 2022 19:12:57 +0000 (20:12 +0100)]
riscv: Fix auipc+jalr relocation range checks

RISC-V can do PC-relative jumps with a 32bit range using the following
two instructions:

auipc t0, imm20 ; t0 = PC + imm20 * 2^12
jalr ra, t0, imm12 ; ra = PC + 4, PC = t0 + imm12

Crucially both the 20bit immediate imm20 and the 12bit immediate imm12
are treated as two's-complement signed values. For this reason the
immediates are usually calculated like this:

imm20 = (offset + 0x800) >> 12
imm12 = offset & 0xfff

..where offset is the signed offset from the auipc instruction. When
the 11th bit of offset is 0 the addition of 0x800 doesn't change the top
20 bits and imm12 considered positive. When the 11th bit is 1 the carry
of the addition by 0x800 means imm20 is one higher, but since imm12 is
then considered negative the two's complement representation means it
all cancels out nicely.

However, this addition by 0x800 (2^11) means an offset greater than or
equal to 2^31 - 2^11 would overflow so imm20 is considered negative and
result in a backwards jump. Similarly the lower range of offset is also
moved down by 2^11 and hence the true 32bit range is

[-2^31 - 2^11, 2^31 - 2^11)

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2 years agoMerge tag 'drm-intel-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 11 Mar 2022 03:26:18 +0000 (13:26 +1000)]
Merge tag 'drm-intel-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix PSR2 when selective fetch is enabled and cursor at (-1, -1) (Jouni Högander)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YinTFSFg++HvuFpZ@tursulin-mobl2
2 years agoMerge tag 'drm-misc-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 11 Mar 2022 00:37:16 +0000 (10:37 +1000)]
Merge tag 'drm-misc-fixes-2022-03-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

 * drm/sun4i: Fix P010 and P210 format numbers

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YipS65Iuu7RMMlAa@linux-uq9g
2 years agoMerge tag 'trace-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Fri, 11 Mar 2022 01:23:08 +0000 (17:23 -0800)]
Merge tag 'trace-v5.17-rc6' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Minor tracing fixes:

   - Fix unregistering the same event twice. A user could disable the
     same event that osnoise will disable on unregistering.

   - Inform RCU of a quiescent state in the osnoise testing thread.

   - Fix some kerneldoc comments"

* tag 'trace-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix some W=1 warnings in kernel doc comments
  tracing/osnoise: Force quiescent states while tracing
  tracing/osnoise: Do not unregister events twice

2 years agoMerge tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 11 Mar 2022 00:47:58 +0000 (16:47 -0800)]
Merge tag 'net-5.17-rc8' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, and ipsec.

  Current release - regressions:

   - Bluetooth: fix unbalanced unlock in set_device_flags()

   - Bluetooth: fix not processing all entries on cmd_sync_work, make
     connect with qualcomm and intel adapters reliable

   - Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"

   - xdp: xdp_mem_allocator can be NULL in trace_mem_connect()

   - eth: ice: fix race condition and deadlock during interface enslave

  Current release - new code bugs:

   - tipc: fix incorrect order of state message data sanity check

  Previous releases - regressions:

   - esp: fix possible buffer overflow in ESP transformation

   - dsa: unlock the rtnl_mutex when dsa_master_setup() fails

   - phy: meson-gxl: fix interrupt handling in forced mode

   - smsc95xx: ignore -ENODEV errors when device is unplugged

  Previous releases - always broken:

   - xfrm: fix tunnel mode fragmentation behavior

   - esp: fix inter address family tunneling on GSO

   - tipc: fix null-deref due to race when enabling bearer

   - sctp: fix kernel-infoleak for SCTP sockets

   - eth: macb: fix lost RX packet wakeup race in NAPI receive

   - eth: intel stop disabling VFs due to PF error responses

   - eth: bcmgenet: don't claim WOL when its not available"

* tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
  ice: Fix race condition during interface enslave
  net: phy: meson-gxl: improve link-up behavior
  net: bcmgenet: Don't claim WOL when its not available
  net: arc_emac: Fix use after free in arc_mdio_probe()
  sctp: fix kernel-infoleak for SCTP sockets
  net: phy: correct spelling error of media in documentation
  net: phy: DP83822: clear MISR2 register to disable interrupts
  gianfar: ethtool: Fix refcount leak in gfar_get_ts_info
  selftests: pmtu.sh: Kill nettest processes launched in subshell.
  selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
  NFC: port100: fix use-after-free in port100_send_complete
  net/mlx5e: SHAMPO, reduce TIR indication
  net/mlx5e: Lag, Only handle events from highest priority multipath entry
  net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
  net/mlx5: Fix a race on command flush flow
  net/mlx5: Fix size field in bufferx_reg struct
  ax25: Fix NULL pointer dereference in ax25_kill_by_device
  net: marvell: prestera: Add missing of_node_put() in prestera_switch_set_base_mac_addr
  net: ethernet: lpc_eth: Handle error for clk_enable
  ...

2 years agoxdp: xdp_mem_allocator can be NULL in trace_mem_connect().
Sebastian Andrzej Siewior [Wed, 9 Mar 2022 22:13:45 +0000 (23:13 +0100)]
xdp: xdp_mem_allocator can be NULL in trace_mem_connect().

Since the commit mentioned below __xdp_reg_mem_model() can return a NULL
pointer. This pointer is dereferenced in trace_mem_connect() which leads
to segfault.

The trace points (mem_connect + mem_disconnect) were put in place to
pair connect/disconnect using the IDs. The ID is only assigned if
__xdp_reg_mem_model() does not return NULL. That connect trace point is
of no use if there is no ID.

Skip that connect trace point if xdp_alloc is NULL.

[ Toke Høiland-Jørgensen delivered the reasoning for skipping the trace
  point ]

Fixes: 4a48ef70b93b8 ("xdp: Allow registering memory model without rxq reference")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/YikmmXsffE+QajTB@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoice: Fix race condition during interface enslave
Ivan Vecera [Thu, 10 Mar 2022 17:16:41 +0000 (18:16 +0100)]
ice: Fix race condition during interface enslave

Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
auxiliary device") changes a process of re-creation of aux device
so ice_plug_aux_dev() is called from ice_service_task() context.
This unfortunately opens a race window that can result in dead-lock
when interface has left LAG and immediately enters LAG again.

Reproducer:
```
#!/bin/sh

ip link add lag0 type bond mode 1 miimon 100
ip link set lag0

for n in {1..10}; do
        echo Cycle: $n
        ip link set ens7f0 master lag0
        sleep 1
        ip link set ens7f0 nomaster
done
```

This results in:
[20976.208697] Workqueue: ice ice_service_task [ice]
[20976.213422] Call Trace:
[20976.215871]  __schedule+0x2d1/0x830
[20976.219364]  schedule+0x35/0xa0
[20976.222510]  schedule_preempt_disabled+0xa/0x10
[20976.227043]  __mutex_lock.isra.7+0x310/0x420
[20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
[20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
[20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
[20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
[20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
[20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
[20976.285691]  auxiliary_bus_probe+0x45/0x70
[20976.289790]  really_probe+0x1f2/0x480
[20976.298509]  driver_probe_device+0x49/0xc0
[20976.302609]  bus_for_each_drv+0x79/0xc0
[20976.306448]  __device_attach+0xdc/0x160
[20976.310286]  bus_probe_device+0x9d/0xb0
[20976.314128]  device_add+0x43c/0x890
[20976.321287]  __auxiliary_device_add+0x43/0x60
[20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
[20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
[20976.342591]  process_one_work+0x1a7/0x360
[20976.350536]  worker_thread+0x30/0x390
[20976.358128]  kthread+0x10a/0x120
[20976.365547]  ret_from_fork+0x1f/0x40
...
[20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
[20976.446469] Call Trace:
[20976.448921]  __schedule+0x2d1/0x830
[20976.452414]  schedule+0x35/0xa0
[20976.455559]  schedule_preempt_disabled+0xa/0x10
[20976.460090]  __mutex_lock.isra.7+0x310/0x420
[20976.464364]  device_del+0x36/0x3c0
[20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
[20976.477288]  notifier_call_chain+0x47/0x70
[20976.481386]  __netdev_upper_dev_link+0x18b/0x280
[20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
[20976.494475]  do_setlink+0x336/0xf50
[20976.502517]  __rtnl_newlink+0x529/0x8b0
[20976.543441]  rtnl_newlink+0x43/0x60
[20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
[20976.559238]  netlink_rcv_skb+0x4c/0x120
[20976.563079]  netlink_unicast+0x196/0x230
[20976.567005]  netlink_sendmsg+0x204/0x3d0
[20976.570930]  sock_sendmsg+0x4c/0x50
[20976.574423]  ____sys_sendmsg+0x1eb/0x250
[20976.586807]  ___sys_sendmsg+0x7c/0xc0
[20976.606353]  __sys_sendmsg+0x57/0xa0
[20976.609930]  do_syscall_64+0x5b/0x1a0
[20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca

1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev()
   is called from ice_service_task() context, aux device is created
   and associated device->lock is taken.
2. Command 'ip link ... set master...' calls ice's notifier under
   RTNL lock and that notifier calls ice_unplug_aux_dev(). That
   function tries to take aux device->lock but this is already taken
   by ice_plug_aux_dev() in step 1
3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
   taken in step 2
4. Dead-lock

The patch fixes this issue by following changes:
- Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
  call in ice_service_task()
- The bit is checked in ice_clear_rdma_cap() and only if it is not set
  then ice_unplug_aux_dev() is called. If it is set (in other words
  plugging of aux device was requested and ice_plug_aux_dev() is
  potentially running) then the function only clears the bit
- Once ice_plug_aux_dev() call (in ice_service_task) is finished
  the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
  whether it was already cleared by ice_clear_rdma_cap(). If so then
  aux device is unplugged.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Co-developed-by: Petr Oros <poros@redhat.com>
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: phy: meson-gxl: improve link-up behavior
Heiner Kallweit [Wed, 9 Mar 2022 21:04:47 +0000 (22:04 +0100)]
net: phy: meson-gxl: improve link-up behavior

Sometimes the link comes up but no data flows. This patch fixes
this behavior. It's not clear what's the root cause of the issue.

According to the tests one other link-up issue remains.
In very rare cases the link isn't even reported as up.

Fixes: 84c8f773d2dc ("net: phy: meson-gxl: remove the use of .ack_callback()")
Tested-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/e3473452-a1f9-efcf-5fdd-02b6f44c3fcd@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: bcmgenet: Don't claim WOL when its not available
Jeremy Linton [Thu, 10 Mar 2022 04:55:35 +0000 (22:55 -0600)]
net: bcmgenet: Don't claim WOL when its not available

Some of the bcmgenet platforms don't correctly support WOL, yet
ethtool returns:

"Supports Wake-on: gsf"

which is false.

Ideally if there isn't a wol_irq, or there is something else that
keeps the device from being able to wakeup it should display:

"Supports Wake-on: d"

This patch checks whether the device can wakup, before using the
hard-coded supported flags. This corrects the ethtool reporting, as
well as the WOL configuration because ethtool verifies that the mode
is supported before attempting it.

Fixes: c51de7f3976b ("net: bcmgenet: add Wake-on-LAN support code")
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220310045535.224450-1-jeremy.linton@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: arc_emac: Fix use after free in arc_mdio_probe()
Jianglei Nie [Wed, 9 Mar 2022 12:18:24 +0000 (20:18 +0800)]
net: arc_emac: Fix use after free in arc_mdio_probe()

If bus->state is equal to MDIOBUS_ALLOCATED, mdiobus_free(bus) will free
the "bus". But bus->name is still used in the next line, which will lead
to a use after free.

We can fix it by putting the name in a local variable and make the
bus->name point to the rodata section "name",then use the name in the
error message without referring to bus to avoid the uaf.

Fixes: 95b5fc03c189 ("net: arc_emac: Make use of the helper function dev_err_probe()")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Link: https://lore.kernel.org/r/20220309121824.36529-1-niejianglei2021@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosctp: fix kernel-infoleak for SCTP sockets
Eric Dumazet [Thu, 10 Mar 2022 00:11:45 +0000 (16:11 -0800)]
sctp: fix kernel-infoleak for SCTP sockets

syzbot reported a kernel infoleak [1] of 4 bytes.

After analysis, it turned out r->idiag_expires is not initialized
if inet_sctp_diag_fill() calls inet_diag_msg_common_fill()

Make sure to clear idiag_timer/idiag_retrans/idiag_expires
and let inet_diag_msg_sctpasoc_fill() fill them again if needed.

[1]

BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in copyout lib/iov_iter.c:154 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668
 instrument_copy_to_user include/linux/instrumented.h:121 [inline]
 copyout lib/iov_iter.c:154 [inline]
 _copy_to_iter+0x6ef/0x25a0 lib/iov_iter.c:668
 copy_to_iter include/linux/uio.h:162 [inline]
 simple_copy_to_iter+0xf3/0x140 net/core/datagram.c:519
 __skb_datagram_iter+0x2d5/0x11b0 net/core/datagram.c:425
 skb_copy_datagram_iter+0xdc/0x270 net/core/datagram.c:533
 skb_copy_datagram_msg include/linux/skbuff.h:3696 [inline]
 netlink_recvmsg+0x669/0x1c80 net/netlink/af_netlink.c:1977
 sock_recvmsg_nosec net/socket.c:948 [inline]
 sock_recvmsg net/socket.c:966 [inline]
 __sys_recvfrom+0x795/0xa10 net/socket.c:2097
 __do_sys_recvfrom net/socket.c:2115 [inline]
 __se_sys_recvfrom net/socket.c:2111 [inline]
 __x64_sys_recvfrom+0x19d/0x210 net/socket.c:2111
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Uninit was created at:
 slab_post_alloc_hook mm/slab.h:737 [inline]
 slab_alloc_node mm/slub.c:3247 [inline]
 __kmalloc_node_track_caller+0xe0c/0x1510 mm/slub.c:4975
 kmalloc_reserve net/core/skbuff.c:354 [inline]
 __alloc_skb+0x545/0xf90 net/core/skbuff.c:426
 alloc_skb include/linux/skbuff.h:1158 [inline]
 netlink_dump+0x3e5/0x16c0 net/netlink/af_netlink.c:2248
 __netlink_dump_start+0xcf8/0xe90 net/netlink/af_netlink.c:2373
 netlink_dump_start include/linux/netlink.h:254 [inline]
 inet_diag_handler_cmd+0x2e7/0x400 net/ipv4/inet_diag.c:1341
 sock_diag_rcv_msg+0x24a/0x620
 netlink_rcv_skb+0x40c/0x7e0 net/netlink/af_netlink.c:2494
 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:277
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x1093/0x1360 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0x14d9/0x1720 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg net/socket.c:725 [inline]
 sock_write_iter+0x594/0x690 net/socket.c:1061
 do_iter_readv_writev+0xa7f/0xc70
 do_iter_write+0x52c/0x1500 fs/read_write.c:851
 vfs_writev fs/read_write.c:924 [inline]
 do_writev+0x645/0xe00 fs/read_write.c:967
 __do_sys_writev fs/read_write.c:1040 [inline]
 __se_sys_writev fs/read_write.c:1037 [inline]
 __x64_sys_writev+0xe5/0x120 fs/read_write.c:1037
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Bytes 68-71 of 2508 are uninitialized
Memory access of size 2508 starts at ffff888114f9b000
Data copied to user address 00007f7fe09ff2e0

CPU: 1 PID: 3478 Comm: syz-executor306 Not tainted 5.17.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20220310001145.297371-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: phy: correct spelling error of media in documentation
Colin Foster [Wed, 9 Mar 2022 06:25:44 +0000 (22:25 -0800)]
net: phy: correct spelling error of media in documentation

The header file incorrectly referenced "median-independant interface"
instead of media. Correct this typo.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Fixes: 4069a572d423 ("net: phy: Document core PHY structures")
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220309062544.3073-1-colin.foster@in-advantage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'mlx5-fixes-2022-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Thu, 10 Mar 2022 22:32:32 +0000 (14:32 -0800)]
Merge tag 'mlx5-fixes-2022-03-09' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2022-03-09

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2022-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: SHAMPO, reduce TIR indication
  net/mlx5e: Lag, Only handle events from highest priority multipath entry
  net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
  net/mlx5: Fix a race on command flush flow
  net/mlx5: Fix size field in bufferx_reg struct
====================

Link: https://lore.kernel.org/r/20220309201517.589132-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'block-5.17-2022-03-10' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 10 Mar 2022 20:56:36 +0000 (12:56 -0800)]
Merge tag 'block-5.17-2022-03-10' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
 "Just a single fix for a regression that occured in this merge window"

* tag 'block-5.17-2022-03-10' of git://git.kernel.dk/linux-block:
  block: fix blk_mq_attempt_bio_merge and rq_qos_throttle protection

2 years agoMerge tag 'staging-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Thu, 10 Mar 2022 20:43:06 +0000 (12:43 -0800)]
Merge tag 'staging-5.17-rc8' of git://git./linux/kernel/git/gregkh/staging

Pull staging driver fixes from Greg KH:
 "Here are three small fixes for staging drivers for 5.17-rc8 or -final,
  which ever comes next.

  They resolve some reported problems:

   - rtl8723bs wifi driver deadlock fix for reported problem that is a
     revert of a previous patch. Also a documentation fix is added so
     that the same problem hopefully can not come back again.

   - gdm724x driver use-after-free fix for a reported problem.

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'staging-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8723bs: Improve the comment explaining the locking rules
  staging: rtl8723bs: Fix access-point mode deadlock
  staging: gdm724x: fix use after free in gdm_lte_rx()

2 years agonet: phy: DP83822: clear MISR2 register to disable interrupts
Clément Léger [Wed, 9 Mar 2022 14:22:28 +0000 (15:22 +0100)]
net: phy: DP83822: clear MISR2 register to disable interrupts

MISR1 was cleared twice but the original author intention was probably
to clear MISR1 & MISR2 to completely disable interrupts. Fix it to
clear MISR2.

Fixes: 87461f7a58ab ("net: phy: DP83822 initial driver submission")
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220309142228.761153-1-clement.leger@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agogianfar: ethtool: Fix refcount leak in gfar_get_ts_info
Miaoqian Lin [Thu, 10 Mar 2022 01:53:13 +0000 (01:53 +0000)]
gianfar: ethtool: Fix refcount leak in gfar_get_ts_info

The of_find_compatible_node() function returns a node pointer with
refcount incremented, We should use of_node_put() on it when done
Add the missing of_node_put() to release the refcount.

Fixes: 7349a74ea75c ("net: ethernet: gianfar_ethtool: get phc index through drvdata")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20220310015313.14938-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'soc-fixes-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Thu, 10 Mar 2022 19:43:01 +0000 (11:43 -0800)]
Merge tag 'soc-fixes-5.17-3' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Here is a third set of fixes for the soc tree, well within the
  expected set of changes.

  Maintainer list changes:
   - Krzysztof Kozlowski and Jisheng Zhang both have new email addresses
   - Broadcom iProc has a new git tree

  Regressions:
   - Robert Foss sends a revert for a Mediatek DPI bridge patch that
     caused an inadvertent break in the DT binding
   - mstar timers need to be included in Kconfig

  Devicetree fixes for:
   - Aspeed ast2600 spi pinmux
   - Tegra eDP panels on Nyan FHD
   - Tegra display IOMMU
   - Qualcomm sm8350 UFS clocks
   - minor DT changes for Marvell Armada, Qualcomm sdx65, Qualcomm
     sm8450, and Broadcom BCM2711"

* tag 'soc-fixes-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0
  MAINTAINERS: Update Jisheng's email address
  Revert "arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint"
  dt-bindings: drm/bridge: anx7625: Revert DPI support
  ARM: dts: aspeed: Fix AST2600 quad spi group
  MAINTAINERS: update Krzysztof Kozlowski's email
  MAINTAINERS: Update git tree for Broadcom iProc SoCs
  ARM: tegra: Move Nyan FHD panels to AUX bus
  arm64: dts: armada-3720-turris-mox: Add missing ethernet0 alias
  ARM: mstar: Select HAVE_ARM_ARCH_TIMER
  soc: mediatek: mt8192-mmsys: Fix dither to dsi0 path's input sel
  arm64: dts: mt8183: jacuzzi: Fix bus properties in anx's DSI endpoint
  ARM: boot: dts: bcm2711: Fix HVS register range
  arm64: dts: qcom: c630: disable crypto due to serror
  arm64: dts: qcom: sm8450: fix apps_smmu interrupts
  arm64: dts: qcom: sm8450: enable GCC_USB3_0_CLKREF_EN for usb
  arm64: dts: qcom: sm8350: Correct UFS symbol clocks
  arm64: tegra: Disable ISO SMMU for Tegra194
  Revert "dt-bindings: arm: qcom: Document SDX65 platform and boards"

2 years agomm: gup: make fault_in_safe_writeable() use fixup_user_fault()
Linus Torvalds [Tue, 8 Mar 2022 19:55:48 +0000 (11:55 -0800)]
mm: gup: make fault_in_safe_writeable() use fixup_user_fault()

Instead of using GUP, make fault_in_safe_writeable() actually force a
'handle_mm_fault()' using the same fixup_user_fault() machinery that
futexes already use.

Using the GUP machinery meant that fault_in_safe_writeable() did not do
everything that a real fault would do, ranging from not auto-expanding
the stack segment, to not updating accessed or dirty flags in the page
tables (GUP sets those flags on the pages themselves).

The latter causes problems on architectures (like s390) that do accessed
bit handling in software, which meant that fault_in_safe_writeable()
didn't actually do all the fault handling it needed to, and trying to
access the user address afterwards would still cause faults.

Reported-and-tested-by: Andreas Gruenbacher <agruenba@redhat.com>
Fixes: cdd591fc86e3 ("iov_iter: Introduce fault_in_iov_iter_writeable")
Link: https://lore.kernel.org/all/CAHc6FU5nP+nziNGG0JAF1FUx-GV7kKFvM7aZuU_XD2_1v4vnvg@mail.gmail.com/
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoriscv: alternative only works on !XIP_KERNEL
Jisheng Zhang [Thu, 10 Feb 2022 16:49:43 +0000 (00:49 +0800)]
riscv: alternative only works on !XIP_KERNEL

The alternative mechanism needs runtime code patching, it can't work
on XIP_KERNEL. And the errata workarounds are implemented via the
alternative mechanism. So add !XIP_KERNEL dependency for alternative
and erratas.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Fixes: 44c922572952 ("RISC-V: enable XIP")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2 years agoMerge tag 'mvebu-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gclem...
Arnd Bergmann [Thu, 10 Mar 2022 14:25:45 +0000 (15:25 +0100)]
Merge tag 'mvebu-fixes-5.17-2' of git://git./linux/kernel/git/gclement/mvebu into arm/fixes

mvebu fixes for 5.17 (part 2)

Allow using old PCIe card on Armada 37xx

* tag 'mvebu-fixes-5.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu:
  arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0

Link: https://lore.kernel.org/r/87bkydj4fn.fsf@BL-laptop
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2 years agoarm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0
Pali Rohár [Thu, 10 Mar 2022 10:39:23 +0000 (11:39 +0100)]
arm64: dts: marvell: armada-37xx: Remap IO space to bus address 0x0

Legacy and old PCI I/O based cards do not support 32-bit I/O addressing.

Since commit 64f160e19e92 ("PCI: aardvark: Configure PCIe resources from
'ranges' DT property") kernel can set different PCIe address on CPU and
different on the bus for the one A37xx address mapping without any firmware
support in case the bus address does not conflict with other A37xx mapping.

So remap I/O space to the bus address 0x0 to enable support for old legacy
I/O port based cards which have hardcoded I/O ports in low address space.

Note that DDR on A37xx is mapped to bus address 0x0. And mapping of I/O
space can be set to address 0x0 too because MEM space and I/O space are
separate and so do not conflict.

Remapping IO space on Turris Mox to different address is not possible to
due bootloader bug.

Signed-off-by: Pali Rohár <pali@kernel.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 76f6386b25cc ("arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700")
Cc: stable@vger.kernel.org # 64f160e19e92 ("PCI: aardvark: Configure PCIe resources from 'ranges' DT property")
Cc: stable@vger.kernel.org # 514ef1e62d65 ("arm64: dts: marvell: armada-37xx: Extend PCIe MEM space")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
2 years agoMerge tag 'spi-fix-v5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Thu, 10 Mar 2022 12:15:09 +0000 (04:15 -0800)]
Merge tag 'spi-fix-v5.17-rc7' of git://git./linux/kernel/git/broonie/spi

Pull spi fix from Mark Brown:
 "One fix for type conversion issues when working out maximum
  scatter/gather segment sizes.

  It caused problems for some systems where the limits overflow
  due to the type conversion"

* tag 'spi-fix-v5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: Fix invalid sgs value

2 years agoARM: fix build warning in proc-v7-bugs.c
Russell King (Oracle) [Thu, 10 Mar 2022 10:22:14 +0000 (10:22 +0000)]
ARM: fix build warning in proc-v7-bugs.c

The kernel test robot discovered that building without
HARDEN_BRANCH_PREDICTOR issues a warning due to a missing
argument to pr_info().

Add the missing argument.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 9dd78194a372 ("ARM: report Spectre v2 status through sysfs")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'gpio-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 10 Mar 2022 11:55:33 +0000 (03:55 -0800)]
Merge tag 'gpio-fixes-for-v5.17' of git://git./linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix a probe failure for Tegra241 GPIO controller in gpio-tegra186

 - revert changes that caused a regression in the sysfs user-space
   interface

 - correct the debounce time conversion in GPIO ACPI

 - statify a struct in gpio-sim and fix a typo

 - update registers in correct order (hardware quirk) in gpio-ts4900

* tag 'gpio-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sim: fix a typo
  gpio: ts4900: Do not set DAT and OE together
  gpio: sim: Declare gpio_sim_hog_config_item_ops static
  gpiolib: acpi: Convert ACPI value of debounce to microseconds
  gpio: Revert regression in sysfs-gpio (gpiolib.c)
  gpio: tegra186: Add IRQ per bank for Tegra241

2 years agogpio: sim: fix a typo
Bartosz Golaszewski [Tue, 8 Mar 2022 08:44:54 +0000 (09:44 +0100)]
gpio: sim: fix a typo

Just noticed this when applying Andy's patch. s/childred/children/

Fixes: cb8c474e79be ("gpio: sim: new testing module")
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2 years agogpio: ts4900: Do not set DAT and OE together
Mark Featherston [Thu, 10 Mar 2022 01:16:16 +0000 (17:16 -0800)]
gpio: ts4900: Do not set DAT and OE together

This works around an issue with the hardware where both OE and
DAT are exposed in the same register. If both are updated
simultaneously, the harware makes no guarantees that OE or DAT
will actually change in any given order and may result in a
glitch of a few ns on a GPIO pin when changing direction and value
in a single write.

Setting direction to input now only affects OE bit. Setting
direction to output updates DAT first, then OE.

Fixes: 9c6686322d74 ("gpio: add Technologic I2C-FPGA gpio support")
Signed-off-by: Mark Featherston <mark@embeddedTS.com>
Signed-off-by: Kris Bahnsen <kris@embeddedTS.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 10 Mar 2022 04:58:29 +0000 (20:58 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "One more small batch of clk driver fixes:

   - A fix for the Qualcomm GDSC power domain delays that avoids black
     screens at boot on some more recent SoCs that use a different delay
     than the hard-coded delays in the driver.

   - A build fix LAN966X clk driver that let it be built on
     architectures that didn't have IOMEM"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: lan966x: Fix linking error
  clk: qcom: dispcc: Update the transition delay for MDSS GDSC
  clk: qcom: gdsc: Add support to update GDSC transition delay

2 years agoMerge tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Linus Torvalds [Thu, 10 Mar 2022 04:44:17 +0000 (20:44 -0800)]
Merge tag 'xsa396-5.17-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "Several Linux PV device frontends are using the grant table interfaces
  for removing access rights of the backends in ways being subject to
  race conditions, resulting in potential data leaks, data corruption by
  malicious backends, and denial of service triggered by malicious
  backends:

   - blkfront, netfront, scsifront and the gntalloc driver are testing
     whether a grant reference is still in use. If this is not the case,
     they assume that a following removal of the granted access will
     always succeed, which is not true in case the backend has mapped
     the granted page between those two operations.

     As a result the backend can keep access to the memory page of the
     guest no matter how the page will be used after the frontend I/O
     has finished. The xenbus driver has a similar problem, as it
     doesn't check the success of removing the granted access of a
     shared ring buffer.

   - blkfront, netfront, scsifront, usbfront, dmabuf, xenbus, 9p,
     kbdfront, and pvcalls are using a functionality to delay freeing a
     grant reference until it is no longer in use, but the freeing of
     the related data page is not synchronized with dropping the granted
     access.

     As a result the backend can keep access to the memory page even
     after it has been freed and then re-used for a different purpose.

   - netfront will fail a BUG_ON() assertion if it fails to revoke
     access in the rx path.

     This will result in a Denial of Service (DoS) situation of the
     guest which can be triggered by the backend"

* tag 'xsa396-5.17-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/netfront: react properly to failing gnttab_end_foreign_access_ref()
  xen/gnttab: fix gnttab_end_foreign_access() without page specified
  xen/pvcalls: use alloc/free_pages_exact()
  xen/9p: use alloc/free_pages_exact()
  xen/usb: don't use gnttab_end_foreign_access() in xenhcd_gnttab_done()
  xen: remove gnttab_query_foreign_access()
  xen/gntalloc: don't use gnttab_query_foreign_access()
  xen/scsifront: don't use gnttab_query_foreign_access() for mapped status
  xen/netfront: don't use gnttab_query_foreign_access() for mapped status
  xen/blkfront: don't use gnttab_query_foreign_access() for mapped status
  xen/grant-table: add gnttab_try_end_foreign_access()
  xen/xenbus: don't let xenbus_grant_ring() remove grants in error case

2 years agoMerge branch 'selftests-pmtu-sh-fix-cleanup-of-processes-launched-in-subshell'
Jakub Kicinski [Thu, 10 Mar 2022 04:23:37 +0000 (20:23 -0800)]
Merge branch 'selftests-pmtu-sh-fix-cleanup-of-processes-launched-in-subshell'

Guillaume Nault says:

====================
selftests: pmtu.sh: Fix cleanup of processes launched in subshell.

Depending on the options used, pmtu.sh may launch tcpdump and nettest
processes in the background. However it fails to clean them up after
the tests complete.

Patch 1 allows the cleanup() function to read the list of PIDs launched
by the tests.
Patch 2 fixes the way the nettest PIDs are retrieved.
====================

Link: https://lore.kernel.org/r/cover.1646776561.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: pmtu.sh: Kill nettest processes launched in subshell.
Guillaume Nault [Tue, 8 Mar 2022 22:15:03 +0000 (23:15 +0100)]
selftests: pmtu.sh: Kill nettest processes launched in subshell.

When using "run_cmd <command> &", then "$!" refers to the PID of the
subshell used to run <command>, not the command itself. Therefore
nettest_pids actually doesn't contain the list of the nettest commands
running in the background. So cleanup() can't kill them and the nettest
processes run until completion (fortunately they have a 5s timeout).

Fix this by defining a new command for running processes in the
background, for which "$!" really refers to the PID of the command run.

Also, double quote variables on the modified lines, to avoid shellcheck
warnings.

Fixes: ece1278a9b81 ("selftests: net: add ESP-in-UDP PMTU test")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: pmtu.sh: Kill tcpdump processes launched by subshell.
Guillaume Nault [Tue, 8 Mar 2022 22:15:00 +0000 (23:15 +0100)]
selftests: pmtu.sh: Kill tcpdump processes launched by subshell.

The cleanup() function takes care of killing processes launched by the
test functions. It relies on variables like ${tcpdump_pids} to get the
relevant PIDs. But tests are run in their own subshell, so updated
*_pids values are invisible to other shells. Therefore cleanup() never
sees any process to kill:

$ ./tools/testing/selftests/net/pmtu.sh -t pmtu_ipv4_exception
TEST: ipv4: PMTU exceptions                                         [ OK ]
TEST: ipv4: PMTU exceptions - nexthop objects                       [ OK ]

$ pgrep -af tcpdump
6084 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
6085 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
6086 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
6087 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
6088 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
6089 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
6090 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
6091 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap
6228 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
6229 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
6230 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
6231 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
6232 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
6233 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
6234 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
6235 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap

Fix this by running cleanup() in the context of the test subshell.
Now that each test cleans the environment after completion, there's no
need for calling cleanup() again when the next test starts. So let's
drop it from the setup() function. This is okay because cleanup() is
also called when pmtu.sh starts, so even the first test starts in a
clean environment.

Also, use tcpdump's immediate mode. Otherwise it might not have time to
process buffered packets, resulting in missing packets or even empty
pcap files for short tests.

Note: PAUSE_ON_FAIL is still evaluated before cleanup(), so one can
still inspect the test environment upon failure when using -p.

Fixes: a92a0a7b8e7c ("selftests: pmtu: Simplify cleanup and namespace names")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoNFC: port100: fix use-after-free in port100_send_complete
Pavel Skripkin [Tue, 8 Mar 2022 18:50:07 +0000 (21:50 +0300)]
NFC: port100: fix use-after-free in port100_send_complete

Syzbot reported UAF in port100_send_complete(). The root case is in
missing usb_kill_urb() calls on error handling path of ->probe function.

port100_send_complete() accesses devm allocated memory which will be
freed on probe failure. We should kill this urbs before returning an
error from probe function to prevent reported use-after-free

Fail log:

BUG: KASAN: use-after-free in port100_send_complete+0x16e/0x1a0 drivers/nfc/port100.c:935
Read of size 1 at addr ffff88801bb59540 by task ksoftirqd/2/26
...
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255
 __kasan_report mm/kasan/report.c:442 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 port100_send_complete+0x16e/0x1a0 drivers/nfc/port100.c:935
 __usb_hcd_giveback_urb+0x2b0/0x5c0 drivers/usb/core/hcd.c:1670

...

Allocated by task 1255:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:45 [inline]
 set_alloc_info mm/kasan/common.c:436 [inline]
 ____kasan_kmalloc mm/kasan/common.c:515 [inline]
 ____kasan_kmalloc mm/kasan/common.c:474 [inline]
 __kasan_kmalloc+0xa6/0xd0 mm/kasan/common.c:524
 alloc_dr drivers/base/devres.c:116 [inline]
 devm_kmalloc+0x96/0x1d0 drivers/base/devres.c:823
 devm_kzalloc include/linux/device.h:209 [inline]
 port100_probe+0x8a/0x1320 drivers/nfc/port100.c:1502

Freed by task 1255:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:45
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:366 [inline]
 ____kasan_slab_free+0xff/0x140 mm/kasan/common.c:328
 kasan_slab_free include/linux/kasan.h:236 [inline]
 __cache_free mm/slab.c:3437 [inline]
 kfree+0xf8/0x2b0 mm/slab.c:3794
 release_nodes+0x112/0x1a0 drivers/base/devres.c:501
 devres_release_all+0x114/0x190 drivers/base/devres.c:530
 really_probe+0x626/0xcc0 drivers/base/dd.c:670

Reported-and-tested-by: syzbot+16bcb127fb73baeecb14@syzkaller.appspotmail.com
Fixes: 0347a6ab300a ("NFC: port100: Commands mechanism implementation")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Link: https://lore.kernel.org/r/20220308185007.6987-1-paskripkin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Wed, 9 Mar 2022 22:30:09 +0000 (14:30 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 build fix from Catalin Marinas:
 "Fix kernel build with clang LTO after the inclusion of the Spectre BHB
  arm64 mitigations"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Do not include __READ_ONCE() block in assembly files

2 years agoARM: Do not use NOCROSSREFS directive with ld.lld
Nathan Chancellor [Wed, 9 Mar 2022 22:07:27 +0000 (15:07 -0700)]
ARM: Do not use NOCROSSREFS directive with ld.lld

ld.lld does not support the NOCROSSREFS directive at the moment, which
breaks the build after commit b9baf5c8c5c3 ("ARM: Spectre-BHB
workaround"):

  ld.lld: error: ./arch/arm/kernel/vmlinux.lds:34: AT expected, but got NOCROSSREFS

Support for this directive will eventually be implemented, at which
point a version check can be added. To avoid breaking the build in the
meantime, just define NOCROSSREFS to nothing when using ld.lld, with a
link to the issue for tracking.

Cc: stable@vger.kernel.org
Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround")
Link: https://github.com/ClangBuiltLinux/linux/issues/1609
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoarm64: Do not include __READ_ONCE() block in assembly files
Nathan Chancellor [Wed, 9 Mar 2022 19:16:34 +0000 (12:16 -0700)]
arm64: Do not include __READ_ONCE() block in assembly files

When building arm64 defconfig + CONFIG_LTO_CLANG_{FULL,THIN}=y after
commit 558c303c9734 ("arm64: Mitigate spectre style branch history side
channels"), the following error occurs:

  <instantiation>:4:2: error: invalid fixup for movz/movk instruction
   mov w0, #ARM_SMCCC_ARCH_WORKAROUND_3
   ^

Marc figured out that moving "#include <linux/init.h>" in
include/linux/arm-smccc.h into a !__ASSEMBLY__ block resolves it. The
full include chain with CONFIG_LTO=y from include/linux/arm-smccc.h:

include/linux/init.h
include/linux/compiler.h
arch/arm64/include/asm/rwonce.h
arch/arm64/include/asm/alternative-macros.h
arch/arm64/include/asm/assembler.h

The asm/alternative-macros.h include in asm/rwonce.h only happens when
CONFIG_LTO is set, which ultimately casues asm/assembler.h to be
included before the definition of ARM_SMCCC_ARCH_WORKAROUND_3. As a
result, the preprocessor does not expand ARM_SMCCC_ARCH_WORKAROUND_3 in
__mitigate_spectre_bhb_fw, which results in the error above.

Avoid this problem by just avoiding the CONFIG_LTO=y __READ_ONCE() block
in asm/rwonce.h with assembly files, as nothing in that block is useful
to assembly files, which allows ARM_SMCCC_ARCH_WORKAROUND_3 to be
properly expanded with CONFIG_LTO=y builds.

Fixes: e35123d83ee3 ("arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y")
Cc: <stable@vger.kernel.org> # 5.11.x
Link: https://lore.kernel.org/r/20220309155716.3988480-1-maz@kernel.org/
Reported-by: Marc Zyngier <maz@kernel.org>
Acked-by: James Morse <james.morse@arm.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20220309191633.2307110-1-nathan@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Linus Torvalds [Wed, 9 Mar 2022 21:47:12 +0000 (13:47 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hid/hid

Pull HID fixes from Jiri Kosina:

 - sysfs attributes leak fix for Google Vivaldi driver (Dmitry Torokhov)

 - fix for potential out-of-bounds read in Thrustmaster driver (Pavel
   Skripkin)

 - error handling reference leak in Elo driver (Jiri Kosina)

 - a few new device IDs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: nintendo: check the return value of alloc_workqueue()
  HID: vivaldi: fix sysfs attributes leak
  HID: hid-thrustmaster: fix OOB read in thrustmaster_interrupts
  HID: elo: Revert USB reference counting
  HID: Add support for open wheel and no attachment to T300
  HID: logitech-dj: add new lightspeed receiver id

2 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Wed, 9 Mar 2022 20:59:21 +0000 (12:59 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - Fix compilation of eBPF object files that indirectly include
   mte-kasan.h.

 - Fix test for execute-only permissions with EPAN (Enhanced Privileged
   Access Never, ARMv8.7 feature).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kasan: fix include error in MTE functions
  arm64: Ensure execute-only permissions are not allowed without EPAN

2 years agoARM: fix co-processor register typo
Russell King (Oracle) [Wed, 9 Mar 2022 19:08:42 +0000 (19:08 +0000)]
ARM: fix co-processor register typo

In the recent Spectre BHB patches, there was a typo that is only
exposed in certain configurations: mcr p15,0,XX,c7,r5,4 should have
been mcr p15,0,XX,c7,c5,4

Reported-by: kernel test robot <lkp@intel.com>
Fixes: b9baf5c8c5c3 ("ARM: Spectre-BHB workaround")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agonet/mlx5e: SHAMPO, reduce TIR indication
Ben Ben-Ishay [Wed, 2 Mar 2022 15:07:08 +0000 (17:07 +0200)]
net/mlx5e: SHAMPO, reduce TIR indication

SHAMPO is an RQ / WQ feature, an indication was added to the TIR in the
first place to enforce suitability between connected TIR and RQ, this
enforcement does not exist in current the Firmware implementation and was
redundant in the first place.

Fixes: 83439f3c37aa ("net/mlx5e: Add HW-GRO offload")
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: Lag, Only handle events from highest priority multipath entry
Roi Dayan [Wed, 16 Feb 2022 11:56:57 +0000 (13:56 +0200)]
net/mlx5e: Lag, Only handle events from highest priority multipath entry

There could be multiple multipath entries but changing the port affinity
for each one doesn't make much sense and there should be a default one.
So only track the entry with lowest priority value.
The commit doesn't affect existing users with a single entry.

Fixes: 544fe7c2e654 ("net/mlx5e: Activate HW multipath and handle port affinity based on FIB events")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
Dima Chumak [Mon, 17 Jan 2022 13:32:16 +0000 (15:32 +0200)]
net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE

Only prio 1 is supported for nic mode when there is no ignore flow level
support in firmware. But for switchdev mode, which supports fixed number
of statically pre-allocated prios, this restriction is not relevant so
it can be relaxed.

Fixes: d671e109bd85 ("net/mlx5: Fix tc max supported prio for nic mode")
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Fix a race on command flush flow
Moshe Shemesh [Fri, 4 Feb 2022 09:47:44 +0000 (11:47 +0200)]
net/mlx5: Fix a race on command flush flow

Fix a refcount use after free warning due to a race on command entry.
Such race occurs when one of the commands releases its last refcount and
frees its index and entry while another process running command flush
flow takes refcount to this command entry. The process which handles
commands flush may see this command as needed to be flushed if the other
process released its refcount but didn't release the index yet. Fix it
by adding the needed spin lock.

It fixes the following warning trace:

refcount_t: addition on 0; use-after-free.
WARNING: CPU: 11 PID: 540311 at lib/refcount.c:25 refcount_warn_saturate+0x80/0xe0
...
RIP: 0010:refcount_warn_saturate+0x80/0xe0
...
Call Trace:
 <TASK>
 mlx5_cmd_trigger_completions+0x293/0x340 [mlx5_core]
 mlx5_cmd_flush+0x3a/0xf0 [mlx5_core]
 enter_error_state+0x44/0x80 [mlx5_core]
 mlx5_fw_fatal_reporter_err_work+0x37/0xe0 [mlx5_core]
 process_one_work+0x1be/0x390
 worker_thread+0x4d/0x3d0
 ? rescuer_thread+0x350/0x350
 kthread+0x141/0x160
 ? set_kthread_struct+0x40/0x40
 ret_from_fork+0x1f/0x30
 </TASK>

Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Fix size field in bufferx_reg struct
Mohammad Kabat [Thu, 25 Mar 2021 12:38:55 +0000 (14:38 +0200)]
net/mlx5: Fix size field in bufferx_reg struct

According to HW spec the field "size" should be 16 bits
in bufferx register.

Fixes: e281682bf294 ("net/mlx5_core: HW data structs/types definitions cleanup")
Signed-off-by: Mohammad Kabat <mohammadkab@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>