linux-2.6-microblaze.git
6 years agoperf/x86: Always store regs->ip in perf_callchain_kernel()
Song Liu [Thu, 27 Jun 2019 00:33:52 +0000 (19:33 -0500)]
perf/x86: Always store regs->ip in perf_callchain_kernel()

The stacktrace_map_raw_tp BPF selftest is failing because the RIP saved by
perf_arch_fetch_caller_regs() isn't getting saved by perf_callchain_kernel().

This was broken by the following commit:

  d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")

With that change, when starting with non-HW regs, the unwinder starts
with the current stack frame and unwinds until it passes up the frame
which called perf_arch_fetch_caller_regs().  So regs->ip needs to be
saved deliberately.

Fixes: d15d356887e7 ("perf/x86: Make perf callchains work without CONFIG_FRAME_POINTER")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Kairui Song <kasong@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/3975a298fa52b506fea32666d8ff6a13467eee6d.1561595111.git.jpoimboe@redhat.com
6 years agox86/speculation: Allow guests to use SSBD even if host does not
Alejandro Jimenez [Mon, 10 Jun 2019 17:20:10 +0000 (13:20 -0400)]
x86/speculation: Allow guests to use SSBD even if host does not

The bits set in x86_spec_ctrl_mask are used to calculate the guest's value
of SPEC_CTRL that is written to the MSR before VMENTRY, and control which
mitigations the guest can enable.  In the case of SSBD, unless the host has
enabled SSBD always on mode (by passing "spec_store_bypass_disable=on" in
the kernel parameters), the SSBD bit is not set in the mask and the guest
can not properly enable the SSBD always on mitigation mode.

This has been confirmed by running the SSBD PoC on a guest using the SSBD
always on mitigation mode (booted with kernel parameter
"spec_store_bypass_disable=on"), and verifying that the guest is vulnerable
unless the host is also using SSBD always on mode. In addition, the guest
OS incorrectly reports the SSB vulnerability as mitigated.

Always set the SSBD bit in x86_spec_ctrl_mask when the host CPU supports
it, allowing the guest to use SSBD whether or not the host has chosen to
enable the mitigation in any of its modes.

Fixes: be6fcb5478e9 ("x86/bugs: Rework spec_ctrl base and mask logic")
Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: bp@alien8.de
Cc: rkrcmar@redhat.com
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1560187210-11054-1-git-send-email-alejandro.j.jimenez@oracle.com
6 years agox86/mm: Handle physical-virtual alignment mismatch in phys_p4d_init()
Kirill A. Shutemov [Mon, 24 Jun 2019 12:31:50 +0000 (15:31 +0300)]
x86/mm: Handle physical-virtual alignment mismatch in phys_p4d_init()

Kyle has reported occasional crashes when booting a kernel in 5-level
paging mode with KASLR enabled:

  WARNING: CPU: 0 PID: 0 at arch/x86/mm/init_64.c:87 phys_p4d_init+0x1d4/0x1ea
  RIP: 0010:phys_p4d_init+0x1d4/0x1ea
  Call Trace:
   __kernel_physical_mapping_init+0x10a/0x35c
   kernel_physical_mapping_init+0xe/0x10
   init_memory_mapping+0x1aa/0x3b0
   init_range_memory_mapping+0xc8/0x116
   init_mem_mapping+0x225/0x2eb
   setup_arch+0x6ff/0xcf5
   start_kernel+0x64/0x53b
   ? copy_bootdata+0x1f/0xce
   x86_64_start_reservations+0x24/0x26
   x86_64_start_kernel+0x8a/0x8d
   secondary_startup_64+0xb6/0xc0

which causes later:

  BUG: unable to handle page fault for address: ff484d019580eff8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  BAD
  Oops: 0000 [#1] SMP NOPTI
  RIP: 0010:fill_pud+0x13/0x130
  Call Trace:
   set_pte_vaddr_p4d+0x2e/0x50
   set_pte_vaddr+0x6f/0xb0
   __native_set_fixmap+0x28/0x40
   native_set_fixmap+0x39/0x70
   register_lapic_address+0x49/0xb6
   early_acpi_boot_init+0xa5/0xde
   setup_arch+0x944/0xcf5
   start_kernel+0x64/0x53b

Kyle bisected the issue to commit b569c1843498 ("x86/mm/KASLR: Reduce
randomization granularity for 5-level paging to 1GB")

Before this commit PAGE_OFFSET was always aligned to P4D_SIZE when booting
5-level paging mode. But now only PUD_SIZE alignment is guaranteed.

In the case I was able to reproduce the following vaddr/paddr values were
observed in phys_p4d_init():

Iteration     vaddr paddr
   1        0xff4228027fe00000  0x033fe00000
   2       0xff42287f40000000 0x8000000000

'vaddr' in both cases belongs to the same p4d entry.

But due to the original assumption that PAGE_OFFSET is aligned to P4D_SIZE
this overlap cannot be handled correctly. The code assumes strictly aligned
entries and unconditionally increments the index into the P4D table, which
creates false duplicate entries. Once the index reaches the end, the last
entry in the page table is missing.

Aside of that the 'paddr >= paddr_end' condition can evaluate wrong which
causes an P4D entry to be cleared incorrectly.

Change the loop in phys_p4d_init() to walk purely based on virtual
addresses like __kernel_physical_mapping_init() does. This makes it work
correctly with unaligned virtual addresses.

Fixes: b569c1843498 ("x86/mm/KASLR: Reduce randomization granularity for 5-level paging to 1GB")
Reported-by: Kyle Pelton <kyle.d.pelton@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kyle Pelton <kyle.d.pelton@intel.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190624123150.920-1-kirill.shutemov@linux.intel.com
6 years agox86/boot/64: Add missing fixup_pointer() for next_early_pgt access
Kirill A. Shutemov [Thu, 20 Jun 2019 11:24:22 +0000 (14:24 +0300)]
x86/boot/64: Add missing fixup_pointer() for next_early_pgt access

__startup_64() uses fixup_pointer() to access global variables in a
position-independent fashion. Access to next_early_pgt was wrapped into the
helper, but one instance in the 5-level paging branch was missed.

GCC generates a R_X86_64_PC32 PC-relative relocation for the access which
doesn't trigger the issue, but Clang emmits a R_X86_64_32S which leads to
an invalid memory access and system reboot.

Fixes: 187e91fe5e91 ("x86/boot/64/clang: Use fixup_pointer() to access 'next_early_pgt'")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Potapenko <glider@google.com>
Link: https://lkml.kernel.org/r/20190620112422.29264-1-kirill.shutemov@linux.intel.com
6 years agox86/boot/64: Fix crash if kernel image crosses page table boundary
Kirill A. Shutemov [Thu, 20 Jun 2019 11:23:45 +0000 (14:23 +0300)]
x86/boot/64: Fix crash if kernel image crosses page table boundary

A kernel which boots in 5-level paging mode crashes in a small percentage
of cases if KASLR is enabled.

This issue was tracked down to the case when the kernel image unpacks in a
way that it crosses an 1G boundary. The crash is caused by an overrun of
the PMD page table in __startup_64() and corruption of P4D page table
allocated next to it. This particular issue is not visible with 4-level
paging as P4D page tables are not used.

But the P4D and the PUD calculation have similar problems.

The PMD index calculation is wrong due to operator precedence, which fails
to confine the PMDs in the PMD array on wrap around.

The P4D calculation for 5-level paging and the PUD calculation calculate
the first index correctly, but then blindly increment it which causes the
same issue when a kernel image is located across a 512G and for 5-level
paging across a 46T boundary.

This wrap around mishandling was introduced when these parts moved from
assembly to C.

Restore it to the correct behaviour.

Fixes: c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190620112345.28833-1-kirill.shutemov@linux.intel.com
6 years agox86/apic: Fix integer overflow on 10 bit left shift of cpu_khz
Colin Ian King [Wed, 19 Jun 2019 18:14:46 +0000 (19:14 +0100)]
x86/apic: Fix integer overflow on 10 bit left shift of cpu_khz

The left shift of unsigned int cpu_khz will overflow for large values of
cpu_khz, so cast it to a long long before shifting it to avoid overvlow.
For example, this can happen when cpu_khz is 4194305, i.e. ~4.2 GHz.

Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 8c3ba8d04924 ("x86, apic: ack all pending irqs when crashed/on kexec")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: kernel-janitors@vger.kernel.org
Link: https://lkml.kernel.org/r/20190619181446.13635-1-colin.king@canonical.com
6 years agox86/resctrl: Prevent possible overrun during bitmap operations
Reinette Chatre [Wed, 19 Jun 2019 20:27:16 +0000 (13:27 -0700)]
x86/resctrl: Prevent possible overrun during bitmap operations

While the DOC at the beginning of lib/bitmap.c explicitly states that
"The number of valid bits in a given bitmap does _not_ need to be an
exact multiple of BITS_PER_LONG.", some of the bitmap operations do
indeed access BITS_PER_LONG portions of the provided bitmap no matter
the size of the provided bitmap.

For example, if find_first_bit() is provided with an 8 bit bitmap the
operation will access BITS_PER_LONG bits from the provided bitmap. While
the operation ensures that these extra bits do not affect the result,
the memory is still accessed.

The capacity bitmasks (CBMs) are typically stored in u32 since they
can never exceed 32 bits. A few instances exist where a bitmap_*
operation is performed on a CBM by simply pointing the bitmap operation
to the stored u32 value.

The consequence of this pattern is that some bitmap_* operations will
access out-of-bounds memory when interacting with the provided CBM.

This same issue has previously been addressed with commit 49e00eee0061
("x86/intel_rdt: Fix out-of-bounds memory access in CBM tests")
but at that time not all instances of the issue were fixed.

Fix this by using an unsigned long to store the capacity bitmask data
that is passed to bitmap functions.

Fixes: e651901187ab ("x86/intel_rdt: Introduce "bit_usage" to display cache allocations details")
Fixes: f4e80d67a527 ("x86/intel_rdt: Resctrl files reflect pseudo-locked information")
Fixes: 95f0b77efa57 ("x86/intel_rdt: Initialize new resource group with sane defaults")
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/58c9b6081fd9bf599af0dfc01a6fdd335768efef.1560975645.git.reinette.chatre@intel.com
6 years agox86/microcode: Fix the microcode load on CPU hotplug for real
Thomas Gleixner [Tue, 18 Jun 2019 20:31:40 +0000 (22:31 +0200)]
x86/microcode: Fix the microcode load on CPU hotplug for real

A recent change moved the microcode loader hotplug callback into the early
startup phase which is running with interrupts disabled. It missed that
the callbacks invoke sysfs functions which might sleep causing nice 'might
sleep' splats with proper debugging enabled.

Split the callbacks and only load the microcode in the early startup phase
and move the sysfs handling back into the later threaded and preemptible
bringup phase where it was before.

Fixes: 78f4e932f776 ("x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906182228350.1766@nanos.tec.linutronix.de
6 years agox86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback
Borislav Petkov [Thu, 13 Jun 2019 13:49:02 +0000 (15:49 +0200)]
x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback

Adric Blake reported the following warning during suspend-resume:

  Enabling non-boot CPUs ...
  x86: Booting SMP configuration:
  smpboot: Booting Node 0 Processor 1 APIC 0x2
  unchecked MSR access error: WRMSR to 0x10f (tried to write 0x0000000000000000) \
   at rIP: 0xffffffff8d267924 (native_write_msr+0x4/0x20)
  Call Trace:
   intel_set_tfa
   intel_pmu_cpu_starting
   ? x86_pmu_dead_cpu
   x86_pmu_starting_cpu
   cpuhp_invoke_callback
   ? _raw_spin_lock_irqsave
   notify_cpu_starting
   start_secondary
   secondary_startup_64
  microcode: sig=0x806ea, pf=0x80, revision=0x96
  microcode: updated to revision 0xb4, date = 2019-04-01
  CPU1 is up

The MSR in question is MSR_TFA_RTM_FORCE_ABORT and that MSR is emulated
by microcode. The log above shows that the microcode loader callback
happens after the PMU restoration, leading to the conjecture that
because the microcode hasn't been updated yet, that MSR is not present
yet, leading to the #GP.

Add a microcode loader-specific hotplug vector which comes before
the PERF vectors and thus executes earlier and makes sure the MSR is
present.

Fixes: 400816f60c54 ("perf/x86/intel: Implement support for TSX Force Abort")
Reported-by: Adric Blake <promarbler14@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: x86@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203637
6 years agox86/kasan: Fix boot with 5-level paging and KASAN
Andrey Ryabinin [Fri, 14 Jun 2019 14:31:49 +0000 (17:31 +0300)]
x86/kasan: Fix boot with 5-level paging and KASAN

Since commit d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on
5-level paging") kernel doesn't boot with KASAN on 5-level paging machines.
The bug is actually in early_p4d_offset() and introduced by commit
12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging")

early_p4d_offset() tries to convert pgd_val(*pgd) value to a physical
address. This doesn't make sense because pgd_val() already contains the
physical address.

It did work prior to commit d52888aa2753 because the result of
"__pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK" was the same as "pgd_val(*pgd)
& PTE_PFN_MASK". __pa_nodebug() just set some high bits which were masked
out by applying PTE_PFN_MASK.

After the change of the PAGE_OFFSET offset in commit d52888aa2753
__pa_nodebug(pgd_val(*pgd)) started to return a value with more high bits
set and PTE_PFN_MASK wasn't enough to mask out all of them. So it returns a
wrong not even canonical address and crashes on the attempt to dereference
it.

Switch back to pgd_val() & PTE_PFN_MASK to cure the issue.

Fixes: 12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging")
Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: kasan-dev@googlegroups.com
Cc: stable@vger.kernel.org
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190614143149.2227-1-aryabinin@virtuozzo.com
6 years agox86/fpu: Don't use current->mm to check for a kthread
Christoph Hellwig [Tue, 4 Jun 2019 17:54:12 +0000 (19:54 +0200)]
x86/fpu: Don't use current->mm to check for a kthread

current->mm can be non-NULL if a kthread calls use_mm(). Check for
PF_KTHREAD instead to decide when to store user mode FP state.

Fixes: 2722146eb784 ("x86/fpu: Remove fpu->initialized")
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190604175411.GA27477@lst.de
6 years agox86/kgdb: Return 0 from kgdb_arch_set_breakpoint()
Matt Mullins [Fri, 31 May 2019 19:47:54 +0000 (12:47 -0700)]
x86/kgdb: Return 0 from kgdb_arch_set_breakpoint()

err must be nonzero in order to reach text_poke(), which caused kgdb to
fail to set breakpoints:

  (gdb) break __x64_sys_sync
  Breakpoint 1 at 0xffffffff81288910: file ../fs/sync.c, line 124.
  (gdb) c
  Continuing.
  Warning:
  Cannot insert breakpoint 1.
  Cannot access memory at address 0xffffffff81288910

  Command aborted.

Fixes: 86a22057127d ("x86/kgdb: Avoid redundant comparison of patched code")
Signed-off-by: Matt Mullins <mmullins@fb.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nadav Amit <namit@vmware.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190531194755.6320-1-mmullins@fb.com
6 years agox86/resctrl: Prevent NULL pointer dereference when local MBM is disabled
Prarit Bhargava [Mon, 10 Jun 2019 17:15:44 +0000 (13:15 -0400)]
x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled

Booting with kernel parameter "rdt=cmt,mbmtotal,memlocal,l3cat,mba" and
executing "mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl" results in
a NULL pointer dereference on systems which do not have local MBM support
enabled..

BUG: kernel NULL pointer dereference, address: 0000000000000020
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 0 PID: 722 Comm: kworker/0:3 Not tainted 5.2.0-0.rc3.git0.1.el7_UNSUPPORTED.x86_64 #2
Workqueue: events mbm_handle_overflow
RIP: 0010:mbm_handle_overflow+0x150/0x2b0

Only enter the bandwith update loop if the system has local MBM enabled.

Fixes: de73f38f7680 ("x86/intel_rdt/mba_sc: Feedback loop to dynamically update mem bandwidth")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190610171544.13474-1-prarit@redhat.com
6 years agox86/resctrl: Don't stop walking closids when a locksetup group is found
James Morse [Mon, 3 Jun 2019 17:25:31 +0000 (18:25 +0100)]
x86/resctrl: Don't stop walking closids when a locksetup group is found

When a new control group is created __init_one_rdt_domain() walks all
the other closids to calculate the sets of used and unused bits.

If it discovers a pseudo_locksetup group, it breaks out of the loop.  This
means any later closid doesn't get its used bits added to used_b.  These
bits will then get set in unused_b, and added to the new control group's
configuration, even if they were marked as exclusive for a later closid.

When encountering a pseudo_locksetup group, we should continue. This is
because "a resource group enters 'pseudo-locked' mode after the schemata is
written while the resource group is in 'pseudo-locksetup' mode." When we
find a pseudo_locksetup group, its configuration is expected to be
overwritten, we can skip it.

Fixes: dfe9674b04ff6 ("x86/intel_rdt: Enable entering of pseudo-locksetup mode")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H Peter Avin <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190603172531.178830-1-james.morse@arm.com
6 years agox86/fpu: Update kernel's FPU state before using for the fsave header
Sebastian Andrzej Siewior [Fri, 7 Jun 2019 14:29:16 +0000 (16:29 +0200)]
x86/fpu: Update kernel's FPU state before using for the fsave header

In commit

  39388e80f9b0c ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()")

I removed the statement

|       if (ia32_fxstate)
|               copy_fxregs_to_kernel(fpu);

and argued that it was wrongly merged because the content was already
saved in kernel's state.

This was wrong: It is required to write it back because it is only
saved on the user-stack and save_fsave_header() reads it from task's
FPU-state. I missed that part…

Save x87 FPU state unless thread's FPU registers are already up to date.

Fixes: 39388e80f9b0c ("x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()")
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190607142915.y52mfmgk5lvhll7n@linutronix.de
6 years agox86/mm/KASLR: Compute the size of the vmemmap section properly
Baoquan He [Thu, 23 May 2019 02:57:44 +0000 (10:57 +0800)]
x86/mm/KASLR: Compute the size of the vmemmap section properly

The size of the vmemmap section is hardcoded to 1 TB to support the
maximum amount of system RAM in 4-level paging mode - 64 TB.

However, 1 TB is not enough for vmemmap in 5-level paging mode. Assuming
the size of struct page is 64 Bytes, to support 4 PB system RAM in 5-level,
64 TB of vmemmap area is needed:

  4 * 1000^5 PB / 4096 bytes page size * 64 bytes per page struct / 1000^4 TB = 62.5 TB.

This hardcoding may cause vmemmap to corrupt the following
cpu_entry_area section, if KASLR puts vmemmap very close to it and the
actual vmemmap size is bigger than 1 TB.

So calculate the actual size of the vmemmap region needed and then align
it up to 1 TB boundary.

In 4-level paging mode it is always 1 TB. In 5-level it's adjusted on
demand. The current code reserves 0.5 PB for vmemmap on 5-level. With
this change, the space can be saved and thus used to increase entropy
for the randomization.

 [ bp: Spell out how the 64 TB needed for vmemmap is computed and massage commit
   message. ]

Fixes: eedb92abb9bb ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y")
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Kirill A. Shutemov <kirill@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: kirill.shutemov@linux.intel.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190523025744.3756-1-bhe@redhat.com
6 years agox86/fpu: Use fault_in_pages_writeable() for pre-faulting
Hugh Dickins [Wed, 29 May 2019 07:25:40 +0000 (09:25 +0200)]
x86/fpu: Use fault_in_pages_writeable() for pre-faulting

Since commit

   d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails")

get_user_pages_unlocked() pre-faults user's memory if a write generates
a page fault while the handler is disabled.

This works in general and uncovered a bug as reported by Mike
Rapoport¹. It has been pointed out that this function may be fragile
and a simple pre-fault as in fault_in_pages_writeable() would be a
better solution. Better as in taste and simplicity: that write (as
performed by the alternative function) performs exactly the same
faulting of memory as before. This was suggested by Hugh Dickins and
Andrew Morton.

Use fault_in_pages_writeable() for pre-faulting user's stack.

  [ bigeasy: Write commit message. ]
  [ bp: Massage some. ]

¹ https://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com

Fixes: d9c9ce34ed5c8 ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails")
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: linux-mm <linux-mm@kvack.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190529072540.g46j4kfeae37a3iu@linutronix.de
Link: https://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com
6 years agox86/CPU: Add more Icelake model numbers
Kan Liang [Mon, 3 Jun 2019 13:41:20 +0000 (06:41 -0700)]
x86/CPU: Add more Icelake model numbers

Add the CPUID model numbers of Icelake (ICL) desktop and server
processors to the Intel family list.

 [ Qiuxu: Sort the macros by model number. ]

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Rajneesh Bhardwaj <rajneesh.bhardwaj@linux.intel.com>
Cc: rui.zhang@intel.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190603134122.13853-1-kan.liang@linux.intel.com
6 years agomm/vmalloc: Avoid rare case of flushing TLB with weird arguments
Rick Edgecombe [Mon, 27 May 2019 21:10:58 +0000 (14:10 -0700)]
mm/vmalloc: Avoid rare case of flushing TLB with weird arguments

In a rare case, flush_tlb_kernel_range() could be called with a start
higher than the end.

In vm_remove_mappings(), in case page_address() returns 0 for all pages
(for example they were all in highmem), _vm_unmap_aliases() will be
called with start = ULONG_MAX, end = 0 and flush = 1.

If at the same time, the vmalloc purge operation is triggered by something
else while the current operation is between remove_vm_area() and
_vm_unmap_aliases(), then the vm mapping just removed will be already
purged. In this case the call of vm_unmap_aliases() may not find any other
mappings to flush and so end up flushing start = ULONG_MAX, end = 0. So
only set flush = true if we find something in the direct mapping that we
need to flush, and this way this can't happen.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Meelis Roos <mroos@linux.ee>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 868b104d7379 ("mm/vmalloc: Add flag for freeing of special permsissions")
Link: https://lkml.kernel.org/r/20190527211058.2729-3-rick.p.edgecombe@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agomm/vmalloc: Fix calculation of direct map addr range
Rick Edgecombe [Mon, 27 May 2019 21:10:57 +0000 (14:10 -0700)]
mm/vmalloc: Fix calculation of direct map addr range

The calculation of the direct map address range to flush was wrong.
This could cause the RO direct map alias to not get flushed. Today
this shouldn't be a problem because this flush is only needed on x86
right now and the spurious fault handler will fix cached RO->RW
translations. In the future though, it could cause the permissions
to remain RO in the TLB for the direct map alias, and then the page
would return from the page allocator to some other component as RO
and cause a crash.

So fix fix the address range calculation so the flush will include the
direct map range.

Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Meelis Roos <mroos@linux.ee>
Cc: Nadav Amit <namit@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 868b104d7379 ("mm/vmalloc: Add flag for freeing of special permsissions")
Link: https://lkml.kernel.org/r/20190527211058.2729-2-rick.p.edgecombe@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoLinux 5.2-rc3
Linus Torvalds [Sun, 2 Jun 2019 20:55:33 +0000 (13:55 -0700)]
Linux 5.2-rc3

6 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Jun 2019 18:10:01 +0000 (11:10 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Two fixes: a quirk for KVM guests running on certain AMD CPUs, and a
  KASAN related build fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Don't force the CPB cap when running under a hypervisor
  x86/boot: Provide KASAN compatible aliases for string routines

6 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Jun 2019 18:08:12 +0000 (11:08 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "On the kernel side there's a bunch of ring-buffer ordering fixes for a
  reproducible bug, plus a PEBS constraints regression fix.

  Plus tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools headers UAPI: Sync kvm.h headers with the kernel sources
  perf record: Fix s390 missing module symbol and warning for non-root users
  perf machine: Read also the end of the kernel
  perf test vmlinux-kallsyms: Ignore aliases to _etext when searching on kallsyms
  perf session: Add missing swap ops for namespace events
  perf namespace: Protect reading thread's namespace
  tools headers UAPI: Sync drm/drm.h with the kernel
  tools headers UAPI: Sync drm/i915_drm.h with the kernel
  tools headers UAPI: Sync linux/fs.h with the kernel
  tools headers UAPI: Sync linux/sched.h with the kernel
  tools arch x86: Sync asm/cpufeatures.h with the with the kernel
  tools include UAPI: Update copy of files related to new fspick, fsmount, fsconfig, fsopen, move_mount and open_tree syscalls
  perf arm64: Fix mksyscalltbl when system kernel headers are ahead of the kernel
  perf data: Fix 'strncat may truncate' build failure with recent gcc
  perf/ring-buffer: Use regular variables for nesting
  perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb->user_page data
  perf/ring_buffer: Add ordering to rb->nest increment
  perf/ring_buffer: Fix exposing a temporarily decreased data_head
  perf/x86/intel/ds: Fix EVENT vs. UEVENT PEBS constraints

6 years agoMerge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Jun 2019 18:06:13 +0000 (11:06 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull EFI fixes from Ingo Molnar:
 "Two EFI fixes: a quirk for weird systabs, plus add more robust error
  handling in the old 1:1 mapping code"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: Allow the number of EFI configuration tables entries to be zero
  efi/x86/Add missing error handling to old_memmap 1:1 mapping code

6 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 2 Jun 2019 18:04:42 +0000 (11:04 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull stacktrace fix from Ingo Molnar:
 "Fix a stack_trace_save_tsk_reliable() regression"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  stacktrace: Unbreak stack_trace_save_tsk_reliable()

6 years agoMerge tag 'spdx-5.2-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 2 Jun 2019 17:22:38 +0000 (10:22 -0700)]
Merge tag 'spdx-5.2-rc3-2' of git://git./linux/kernel/git/gregkh/driver-core

Pull SPDX fixes from Greg KH:
 "Here are just two small patches, that fix up some found SPDX
  identifier issues.

  The first patch fixes an error in a previous SPDX fixup patch, that
  causes build errors when doing 'make clean' on the tree (the fact that
  almost no one noticed it reflects the fact that kernel developers
  don't like doing that option very often...)

  The second patch fixes up a number of places in the tree where people
  mistyped the string "SPDX-License-Identifier". Given that people can
  not even type their own name all the time without mistakes, this was
  bound to happen, and odds are, we will have to add some type of check
  for this to checkpatch.pl to catch this happening in the future.

  Both of these have passed testing by 0-day"

* tag 'spdx-5.2-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  treewide: fix typos of SPDX-License-Identifier
  crypto: ux500 - fix license comment syntax error

6 years agoMerge tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 2 Jun 2019 17:21:04 +0000 (10:21 -0700)]
Merge tag 'powerpc-5.2-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "A minor fix to our IMC PMU code to print a less confusing error
  message when the driver can't initialise properly.

  A fix for a bug where a user requesting an unsupported branch sampling
  filter can corrupt PMU state, preventing the PMU from counting
  properly.

  And finally a fix for a bug in our support for kexec_file_load(),
  which prevented loading a kernel and initramfs. Most versions of kexec
  don't yet use kexec_file_load().

  Thanks to: Anju T Sudhakar, Dave Young, Madhavan Srinivasan, Ravi
  Bangoria, Thiago Jung Bauermann"

* tag 'powerpc-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/kexec: Fix loading of kernel + initramfs with kexec_file_load()
  powerpc/perf: Fix MMCRA corruption by bhrb_filter
  powerpc/powernv: Return for invalid IMC domain

6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 2 Jun 2019 17:19:39 +0000 (10:19 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Fixes for PPC and s390"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Restore SPRG3 in kvmhv_p9_guest_entry()
  KVM: PPC: Book3S HV: Fix lockdep warning when entering guest on POWER9
  KVM: PPC: Book3S HV: XIVE: Fix page offset when clearing ESB pages
  KVM: PPC: Book3S HV: XIVE: Take the srcu read lock when accessing memslots
  KVM: PPC: Book3S HV: XIVE: Do not clear IRQ data of passthrough interrupts
  KVM: PPC: Book3S HV: XIVE: Introduce a new mutex for the XIVE device
  KVM: PPC: Book3S HV: XIVE: Fix the enforced limit on the vCPU identifier
  KVM: PPC: Book3S HV: XIVE: Do not test the EQ flag validity when resetting
  KVM: PPC: Book3S HV: XIVE: Clear file mapping when device is released
  KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu
  KVM: PPC: Book3S: Use new mutex to synchronize access to rtas token list
  KVM: PPC: Book3S HV: Use new mutex to synchronize MMU setup
  KVM: PPC: Book3S HV: Avoid touching arch.mmu_ready in XIVE release functions
  KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID
  kvm: fix compile on s390 part 2

6 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 2 Jun 2019 17:18:11 +0000 (10:18 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "A memleak fix for the core, two driver bugfixes, as well as fixing
  missing file patterns to MAINTAINERS"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: add I2C DT bindings to ARM platforms
  MAINTAINERS: add DT bindings to i2c drivers
  i2c: synquacer: fix synquacer_i2c_doxfer() return value
  i2c: mlxcpld: Fix wrong initialization order in probe
  i2c: dev: fix potential memory leak in i2cdev_ioctl_rdwr

6 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux...
Linus Torvalds [Sun, 2 Jun 2019 17:16:09 +0000 (10:16 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/evalenti/linux-soc-thermal

Pull thermal SoC fix from Eduardo Valentin:
 "A single revert, detected to cause issues on the tsens driver"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  Revert "drivers: thermal: tsens: Add new operation to check if a sensor is enabled"

6 years agoMerge tag 'led-fixes-for-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 2 Jun 2019 17:14:25 +0000 (10:14 -0700)]
Merge tag 'led-fixes-for-5.2-rc3' of git://git./linux/kernel/git/j.anaszewski/linux-leds

Pull LED fix from Jacek Anaszewski:
 "Fix for a recent change in LED core, that didn't take into account the
  possibility of calling led_blink_setup() from atomic context"

* tag 'led-fixes-for-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  leds: avoid flush_work in atomic context

6 years agoMerge tag 'for-linus-20190601' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 2 Jun 2019 16:27:44 +0000 (09:27 -0700)]
Merge tag 'for-linus-20190601' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - A set of patches fixing code comments / kerneldoc (Bart)

 - Don't allow loop file change for exclusive open (Jan)

 - Fix revalidate of hidden genhd (Jan)

 - Init queue failure memory free fix (Jes)

 - Improve rq limits failure print (John)

 - Fixup for queue removal/addition (Ming)

 - Missed error progagation for io_uring buffer registration (Pavel)

* tag 'for-linus-20190601' of git://git.kernel.dk/linux-block:
  block: print offending values when cloned rq limits are exceeded
  blk-mq: Document the blk_mq_hw_queue_to_node() arguments
  blk-mq: Fix spelling in a source code comment
  block: Fix bsg_setup_queue() kernel-doc header
  block: Fix rq_qos_wait() kernel-doc header
  block: Fix blk_mq_*_map_queues() kernel-doc headers
  block: Fix throtl_pending_timer_fn() kernel-doc header
  block: Convert blk_invalidate_devt() header into a non-kernel-doc header
  block/partitions/ldm: Convert a kernel-doc header into a non-kernel-doc header
  blk-mq: Fix memory leak in error handling
  block: don't protect generic_make_request_checks with blk_queue_enter
  block: move blk_exit_queue into __blk_release_queue
  block: Don't revalidate bdev of hidden gendisk
  loop: Don't change loop device under exclusive opener
  io_uring: Fix __io_uring_register() false success

6 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sun, 2 Jun 2019 16:26:34 +0000 (09:26 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Six minor fixes to device drivers and one to the multipath alua
  handler.

  The most extensive fix is the zfcp port remove prevention one, but
  it's impact is only s390"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: libsas: delete sas port if expander discover failed
  scsi: libsas: only clear phy->in_shutdown after shutdown event done
  scsi: scsi_dh_alua: Fix possible null-ptr-deref
  scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask
  scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs)
  scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove
  scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route()

6 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sun, 2 Jun 2019 15:51:30 +0000 (08:51 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "Various fixes and followups"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, compaction: make sure we isolate a valid PFN
  include/linux/generic-radix-tree.h: fix kerneldoc comment
  kernel/signal.c: trace_signal_deliver when signal_group_exit
  drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used
  spdxcheck.py: fix directory structures
  kasan: initialize tag to 0xff in __kasan_kmalloc
  z3fold: fix sheduling while atomic
  scripts/gdb: fix invocation when CONFIG_COMMON_CLK is not set
  mm/gup: continue VM_FAULT_RETRY processing even for pre-faults
  ocfs2: fix error path kobject memory leak
  memcg: make it work on sparse non-0-node systems
  mm, memcg: consider subtrees in memory.events
  prctl_set_mm: downgrade mmap_sem to read lock
  prctl_set_mm: refactor checks from validate_prctl_map
  kernel/fork.c: make max_threads symbol static
  arch/arm/boot/compressed/decompress.c: fix build error due to lz4 changes
  arch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAK
  mm/vmalloc.c: fix typo in comment
  lib/sort.c: fix kernel-doc notation warnings
  mm: fix Documentation/vm/hmm.rst Sphinx warnings

6 years agomm, compaction: make sure we isolate a valid PFN
Suzuki K Poulose [Sat, 1 Jun 2019 05:30:59 +0000 (22:30 -0700)]
mm, compaction: make sure we isolate a valid PFN

When we have holes in a normal memory zone, we could endup having
cached_migrate_pfns which may not necessarily be valid, under heavy memory
pressure with swapping enabled ( via __reset_isolation_suitable(),
triggered by kswapd).

Later if we fail to find a page via fast_isolate_freepages(), we may end
up using the migrate_pfn we started the search with, as valid page.  This
could lead to accessing NULL pointer derefernces like below, due to an
invalid mem_section pointer.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 [47/1825]
 Mem abort info:
   ESR = 0x96000004
   Exception class = DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
 Data abort info:
   ISV = 0, ISS = 0x00000004
   CM = 0, WnR = 0
 user pgtable: 4k pages, 48-bit VAs, pgdp = 0000000082f94ae9
 [0000000000000008] pgd=0000000000000000
 Internal error: Oops: 96000004 [#1] SMP
 ...
 CPU: 10 PID: 6080 Comm: qemu-system-aar Not tainted 510-rc1+ #6
 Hardware name: AmpereComputing(R) OSPREY EV-883832-X3-0001/OSPREY, BIOS 4819 09/25/2018
 pstate: 60000005 (nZCv daif -PAN -UAO)
 pc : set_pfnblock_flags_mask+0x58/0xe8
 lr : compaction_alloc+0x300/0x950
 [...]
 Process qemu-system-aar (pid: 6080, stack limit = 0x0000000095070da5)
 Call trace:
  set_pfnblock_flags_mask+0x58/0xe8
  compaction_alloc+0x300/0x950
  migrate_pages+0x1a4/0xbb0
  compact_zone+0x750/0xde8
  compact_zone_order+0xd8/0x118
  try_to_compact_pages+0xb4/0x290
  __alloc_pages_direct_compact+0x84/0x1e0
  __alloc_pages_nodemask+0x5e0/0xe18
  alloc_pages_vma+0x1cc/0x210
  do_huge_pmd_anonymous_page+0x108/0x7c8
  __handle_mm_fault+0xdd4/0x1190
  handle_mm_fault+0x114/0x1c0
  __get_user_pages+0x198/0x3c0
  get_user_pages_unlocked+0xb4/0x1d8
  __gfn_to_pfn_memslot+0x12c/0x3b8
  gfn_to_pfn_prot+0x4c/0x60
  kvm_handle_guest_abort+0x4b0/0xcd8
  handle_exit+0x140/0x1b8
  kvm_arch_vcpu_ioctl_run+0x260/0x768
  kvm_vcpu_ioctl+0x490/0x898
  do_vfs_ioctl+0xc4/0x898
  ksys_ioctl+0x8c/0xa0
  __arm64_sys_ioctl+0x28/0x38
  el0_svc_common+0x74/0x118
  el0_svc_handler+0x38/0x78
  el0_svc+0x8/0xc
 Code: f8607840 f100001f 8b011401 9a801020 (f9400400)
 ---[ end trace af6a35219325a9b6 ]---

The issue was reported on an arm64 server with 128GB with holes in the
zone (e.g, [32GB@4GB, 96GB@544GB]), with a swap device enabled, while
running 100 KVM guest instances.

This patch fixes the issue by ensuring that the page belongs to a valid
PFN when we fallback to using the lower limit of the scan range upon
failure in fast_isolate_freepages().

Link: http://lkml.kernel.org/r/1558711908-15688-1-git-send-email-suzuki.poulose@arm.com
Fixes: 5a811889de10f1eb ("mm, compaction: use free lists to quickly locate a migration target")
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoinclude/linux/generic-radix-tree.h: fix kerneldoc comment
Jonathan Corbet [Sat, 1 Jun 2019 05:30:55 +0000 (22:30 -0700)]
include/linux/generic-radix-tree.h: fix kerneldoc comment

The DOC comment block section in include/linux/generic-radix-tree.h
contained a spurious colon, causing this warning in the documentation
build:

  include/linux/generic-radix-tree.h:1: warning: no structured comments found

Remove the colon and make the docs build happy.

Link: http://lkml.kernel.org/r/20190524141933.74ae9050@lwn.net
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agokernel/signal.c: trace_signal_deliver when signal_group_exit
Zhenliang Wei [Sat, 1 Jun 2019 05:30:52 +0000 (22:30 -0700)]
kernel/signal.c: trace_signal_deliver when signal_group_exit

In the fixes commit, removing SIGKILL from each thread signal mask and
executing "goto fatal" directly will skip the call to
"trace_signal_deliver".  At this point, the delivery tracking of the
SIGKILL signal will be inaccurate.

Therefore, we need to add trace_signal_deliver before "goto fatal" after
executing sigdelset.

Note: SEND_SIG_NOINFO matches the fact that SIGKILL doesn't have any info.

Link: http://lkml.kernel.org/r/20190425025812.91424-1-weizhenliang@huawei.com
Fixes: cf43a757fd4944 ("signal: Restore the stop PTRACE_EVENT_EXIT")
Signed-off-by: Zhenliang Wei <weizhenliang@huawei.com>
Reviewed-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Ivan Delalande <colona@arista.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agodrivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used
Qian Cai [Sat, 1 Jun 2019 05:30:49 +0000 (22:30 -0700)]
drivers/iommu/intel-iommu.c: fix variable 'iommu' set but not used

Commit cf04eee8bf0e ("iommu/vt-d: Include ACPI devices in iommu=pt")
added for_each_active_iommu() in iommu_prepare_static_identity_mapping()
but never used the each element, i.e, "drhd->iommu".

drivers/iommu/intel-iommu.c: In function
'iommu_prepare_static_identity_mapping':
drivers/iommu/intel-iommu.c:3037:22: warning: variable 'iommu' set but
not used [-Wunused-but-set-variable]
 struct intel_iommu *iommu;

Fixed the warning by appending a compiler attribute __maybe_unused for it.

Link: http://lkml.kernel.org/r/20190523013314.2732-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agospdxcheck.py: fix directory structures
Vincenzo Frascino [Sat, 1 Jun 2019 05:30:45 +0000 (22:30 -0700)]
spdxcheck.py: fix directory structures

The LICENSE directory has recently changed structure and this makes
spdxcheck fails as per below:

FAIL: "Blob or Tree named 'other' not found"
Traceback (most recent call last):
  File "scripts/spdxcheck.py", line 240, in <module>
spdx = read_spdxdata(repo)
  File "scripts/spdxcheck.py", line 41, in read_spdxdata
for el in lictree[d].traverse():
[...]
KeyError: "Blob or Tree named 'other' not found"

Fix the script to restore the correctness on checkpatch License checking.

References: 62be257e986d ("LICENSES: Rename other to deprecated")
References: 8ea8814fcdcb ("LICENSES: Clearly mark dual license only licenses")
Link: http://lkml.kernel.org/r/20190523084755.56739-1-vincenzo.frascino@arm.com
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agokasan: initialize tag to 0xff in __kasan_kmalloc
Nathan Chancellor [Sat, 1 Jun 2019 05:30:42 +0000 (22:30 -0700)]
kasan: initialize tag to 0xff in __kasan_kmalloc

When building with -Wuninitialized and CONFIG_KASAN_SW_TAGS unset, Clang
warns:

mm/kasan/common.c:484:40: warning: variable 'tag' is uninitialized when
used here [-Wuninitialized]
        kasan_unpoison_shadow(set_tag(object, tag), size);
                                              ^~~

set_tag ignores tag in this configuration but clang doesn't realize it at
this point in its pipeline, as it points to arch_kasan_set_tag as being
the point where it is used, which will later be expanded to (void
*)(object) without a use of tag.  Initialize tag to 0xff, as it removes
this warning and doesn't change the meaning of the code.

Link: https://github.com/ClangBuiltLinux/linux/issues/465
Link: http://lkml.kernel.org/r/20190502163057.6603-1-natechancellor@gmail.com
Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoz3fold: fix sheduling while atomic
Vitaly Wool [Sat, 1 Jun 2019 05:30:39 +0000 (22:30 -0700)]
z3fold: fix sheduling while atomic

kmem_cache_alloc() may be called from z3fold_alloc() in atomic context, so
we need to pass correct gfp flags to avoid "scheduling while atomic" bug.

Link: http://lkml.kernel.org/r/20190523153245.119dfeed55927e8755250ddd@gmail.com
Fixes: 7c2b8baa61fe5 ("mm/z3fold.c: add structure for buddy handles")
Signed-off-by: Vitaly Wool <vitaly.vul@sony.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoscripts/gdb: fix invocation when CONFIG_COMMON_CLK is not set
Fabiano Rosas [Sat, 1 Jun 2019 05:30:36 +0000 (22:30 -0700)]
scripts/gdb: fix invocation when CONFIG_COMMON_CLK is not set

CLK_GET_RATE_NOCACHE depends on CONFIG_COMMON_CLK.  Importing constants.py
when CONFIG_COMMON_CLK is not defined causes:

  (gdb) lx-symbols
  (...)
    File "scripts/gdb/linux/proc.py", line 15, in <module>
      from linux import constants
    File "scripts/gdb/linux/constants.py", line 2, in <module>
      LX_CLK_GET_RATE_NOCACHE = gdb.parse_and_eval("CLK_GET_RATE_NOCACHE")
  gdb.error: No symbol "CLK_GET_RATE_NOCACHE" in current context.

Link: http://lkml.kernel.org/r/20190523195313.24701-1-farosas@linux.ibm.com
Fixes: e7e6f462c1be ("scripts/gdb: print cached rate in lx-clk-summary")
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Leonard Crestez <leonard.crestez@nxp.com>
Cc: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agomm/gup: continue VM_FAULT_RETRY processing even for pre-faults
Mike Rapoport [Sat, 1 Jun 2019 05:30:33 +0000 (22:30 -0700)]
mm/gup: continue VM_FAULT_RETRY processing even for pre-faults

When get_user_pages*() is called with pages = NULL, the processing of
VM_FAULT_RETRY terminates early without actually retrying to fault-in all
the pages.

If the pages in the requested range belong to a VMA that has userfaultfd
registered, handle_userfault() returns VM_FAULT_RETRY *after* user space
has populated the page, but for the gup pre-fault case there's no actual
retry and the caller will get no pages although they are present.

This issue was uncovered when running post-copy memory restore in CRIU
after d9c9ce34ed5c ("x86/fpu: Fault-in user stack if
copy_fpstate_to_sigframe() fails").

After this change, the copying of FPU state to the sigframe switched from
copy_to_user() variants which caused a real page fault to get_user_pages()
with pages parameter set to NULL.

In post-copy mode of CRIU, the destination memory is managed with
userfaultfd and lack of the retry for pre-fault case in get_user_pages()
causes a crash of the restored process.

Making the pre-fault behavior of get_user_pages() the same as the "normal"
one fixes the issue.

Link: http://lkml.kernel.org/r/1557844195-18882-1-git-send-email-rppt@linux.ibm.com
Fixes: d9c9ce34ed5c ("x86/fpu: Fault-in user stack if copy_fpstate_to_sigframe() fails")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Andrei Vagin <avagin@gmail.com> [https://travis-ci.org/avagin/linux/builds/533184940]
Tested-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoocfs2: fix error path kobject memory leak
Tobin C. Harding [Sat, 1 Jun 2019 05:30:29 +0000 (22:30 -0700)]
ocfs2: fix error path kobject memory leak

If a call to kobject_init_and_add() fails we should call kobject_put()
otherwise we leak memory.

Add call to kobject_put() in the error path of call to
kobject_init_and_add().  Please note, this has the side effect that the
release method is called if kobject_init_and_add() fails.

Link: http://lkml.kernel.org/r/20190513033458.2824-1-tobin@kernel.org
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agomemcg: make it work on sparse non-0-node systems
Jiri Slaby [Sat, 1 Jun 2019 05:30:26 +0000 (22:30 -0700)]
memcg: make it work on sparse non-0-node systems

We have a single node system with node 0 disabled:
  Scanning NUMA topology in Northbridge 24
  Number of physical nodes 2
  Skipping disabled node 0
  Node 1 MemBase 0000000000000000 Limit 00000000fbff0000
  NODE_DATA(1) allocated [mem 0xfbfda000-0xfbfeffff]

This causes crashes in memcg when system boots:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  #PF error: [normal kernel read fault]
...
  RIP: 0010:list_lru_add+0x94/0x170
...
  Call Trace:
   d_lru_add+0x44/0x50
   dput.part.34+0xfc/0x110
   __fput+0x108/0x230
   task_work_run+0x9f/0xc0
   exit_to_usermode_loop+0xf5/0x100

It is reproducible as far as 4.12.  I did not try older kernels.  You have
to have a new enough systemd, e.g.  241 (the reason is unknown -- was not
investigated).  Cannot be reproduced with systemd 234.

The system crashes because the size of lru array is never updated in
memcg_update_all_list_lrus and the reads are past the zero-sized array,
causing dereferences of random memory.

The root cause are list_lru_memcg_aware checks in the list_lru code.  The
test in list_lru_memcg_aware is broken: it assumes node 0 is always
present, but it is not true on some systems as can be seen above.

So fix this by avoiding checks on node 0.  Remember the memcg-awareness by
a bool flag in struct list_lru.

Link: http://lkml.kernel.org/r/20190522091940.3615-1-jslaby@suse.cz
Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agomm, memcg: consider subtrees in memory.events
Chris Down [Sat, 1 Jun 2019 05:30:22 +0000 (22:30 -0700)]
mm, memcg: consider subtrees in memory.events

memory.stat and other files already consider subtrees in their output, and
we should too in order to not present an inconsistent interface.

The current situation is fairly confusing, because people interacting with
cgroups expect hierarchical behaviour in the vein of memory.stat,
cgroup.events, and other files.  For example, this causes confusion when
debugging reclaim events under low, as currently these always read "0" at
non-leaf memcg nodes, which frequently causes people to misdiagnose breach
behaviour.  The same confusion applies to other counters in this file when
debugging issues.

Aggregation is done at write time instead of at read-time since these
counters aren't hot (unlike memory.stat which is per-page, so it does it
at read time), and it makes sense to bundle this with the file
notifications.

After this patch, events are propagated up the hierarchy:

    [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
    low 0
    high 0
    max 0
    oom 0
    oom_kill 0
    [root@ktst ~]# systemd-run -p MemoryMax=1 true
    Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service
    [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
    low 0
    high 0
    max 7
    oom 1
    oom_kill 1

As this is a change in behaviour, this can be reverted to the old
behaviour by mounting with the `memory_localevents' flag set.  However, we
use the new behaviour by default as there's a lack of evidence that there
are any current users of memory.events that would find this change
undesirable.

akpm: this is a behaviour change, so Cc:stable.  THis is so that
forthcoming distros which use cgroup v2 are more likely to pick up the
revised behaviour.

Link: http://lkml.kernel.org/r/20190208224419.GA24772@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoprctl_set_mm: downgrade mmap_sem to read lock
Michal Koutný [Sat, 1 Jun 2019 05:30:19 +0000 (22:30 -0700)]
prctl_set_mm: downgrade mmap_sem to read lock

The commit a3b609ef9f8b ("proc read mm's {arg,env}_{start,end} with mmap
semaphore taken.") added synchronization of reading argument/environment
boundaries under mmap_sem.  Later commit 88aa7cc688d4 ("mm: introduce
arg_lock to protect arg_start|end and env_start|end in mm_struct") avoided
the coarse use of mmap_sem in similar situations.  But there still
remained two places that (mis)use mmap_sem.

get_cmdline should also use arg_lock instead of mmap_sem when it reads the
boundaries.

The second place that should use arg_lock is in prctl_set_mm.  By
protecting the boundaries fields with the arg_lock, we can downgrade
mmap_sem to reader lock (analogous to what we already do in
prctl_set_mm_map).

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/20190502125203.24014-3-mkoutny@suse.com
Fixes: 88aa7cc688d4 ("mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct")
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Co-developed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoprctl_set_mm: refactor checks from validate_prctl_map
Michal Koutný [Sat, 1 Jun 2019 05:30:16 +0000 (22:30 -0700)]
prctl_set_mm: refactor checks from validate_prctl_map

Despite comment of validate_prctl_map claims there are no capability
checks, it is not completely true since commit 4d28df6152aa ("prctl: Allow
local CAP_SYS_ADMIN changing exe_file").  Extract the check out of the
function and make the function perform purely arithmetic checks.

This patch should not change any behavior, it is mere refactoring for
following patch.

[akpm@linux-foundation.org: coding style fixes]
Link: http://lkml.kernel.org/r/20190502125203.24014-2-mkoutny@suse.com
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agokernel/fork.c: make max_threads symbol static
Kefeng Wang [Sat, 1 Jun 2019 05:30:12 +0000 (22:30 -0700)]
kernel/fork.c: make max_threads symbol static

Fix build warning,
kernel/fork.c:125:5: warning: symbol 'max_threads' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20190516015118.140561-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoarch/arm/boot/compressed/decompress.c: fix build error due to lz4 changes
Sebastian Andrzej Siewior [Sat, 1 Jun 2019 05:30:09 +0000 (22:30 -0700)]
arch/arm/boot/compressed/decompress.c: fix build error due to lz4 changes

  include/linux/cpumask.h: In function 'cpumask_parse':
  include/linux/cpumask.h:636:21: error: implicit declaration of function 'strchrnul'; did you mean 'strchr'? [-Werror=implicit-function-declaration]

Because arch/arm/boot/compressed/decompress.c does

#define _LINUX_STRING_H_

preventing linux/string.h from providing strchrnul.  It also #includes
asm/string.h, which for arm has a declaration of strchr(), explaining why
this didn't use to fail.

Link: http://lkml.kernel.org/r/20190528115346.f5a7kn3hdnuf5rts@linutronix.de
Fixes: 3713a4e1fdb8d ("include/linux/cpumask.h: fix double string traverse in cpumask_parse")
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Yury Norov <ynorov@marvell.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoarch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAK
David Rientjes [Sat, 1 Jun 2019 05:30:06 +0000 (22:30 -0700)]
arch/parisc/configs/c8000_defconfig: remove obsoleted CONFIG_DEBUG_SLAB_LEAK

CONFIG_DEBUG_SLAB_LEAK has been removed, so remove it from defconfig.

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1905201015460.96074@chino.kir.corp.google.com
Fixes: 7878c231dae0 ("slab: remove /proc/slab_allocators")
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agomm/vmalloc.c: fix typo in comment
Andrew Morton [Sat, 1 Jun 2019 05:30:03 +0000 (22:30 -0700)]
mm/vmalloc.c: fix typo in comment

Reported-by: Nicholas Joll <najoll@posteo.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib/sort.c: fix kernel-doc notation warnings
Randy Dunlap [Sat, 1 Jun 2019 05:30:00 +0000 (22:30 -0700)]
lib/sort.c: fix kernel-doc notation warnings

Fix kernel-doc notation in lib/sort.c by using correct function parameter
names.

  lib/sort.c:59: warning: Excess function parameter 'size' description in 'swap_words_32'
  lib/sort.c:83: warning: Excess function parameter 'size' description in 'swap_words_64'
  lib/sort.c:110: warning: Excess function parameter 'size' description in 'swap_bytes'

Link: http://lkml.kernel.org/r/60e25d3d-68d1-bde2-3b39-e4baa0b14907@infradead.org
Fixes: 37d0ec34d111a ("lib/sort: make swap functions more generic")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: George Spelvin <lkml@sdf.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agomm: fix Documentation/vm/hmm.rst Sphinx warnings
Randy Dunlap [Sat, 1 Jun 2019 05:29:57 +0000 (22:29 -0700)]
mm: fix Documentation/vm/hmm.rst Sphinx warnings

Fix Sphinx warnings in Documentation/vm/hmm.rst by using "::" notation and
inserting a blank line.  Also add a missing ';'.

  Documentation/vm/hmm.rst:292: WARNING: Unexpected indentation.
  Documentation/vm/hmm.rst:300: WARNING: Unexpected indentation.

Link: http://lkml.kernel.org/r/c5995359-7c82-4e47-c7be-b58a4dda0953@infradead.org
Fixes: 023a019a9b4e ("mm/hmm: add default fault flags to avoid the need to pre-fill pfns arrays")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agotreewide: fix typos of SPDX-License-Identifier
Masahiro Yamada [Sat, 1 Jun 2019 03:22:42 +0000 (12:22 +0900)]
treewide: fix typos of SPDX-License-Identifier

Prior to the adoption of SPDX, it was difficult for tools to determine
the correct license due to incomplete or badly formatted license text.
The SPDX solves this issue, assuming people can correctly spell
"SPDX-License-Identifier" although this assumption is broken in some
places.

Since scripts/spdxcheck.py parses only lines that exactly matches to
the correct tag, it cannot (should not) detect this kind of error.

If the correct tag is missing, scripts/checkpatch.pl warns like this:

 WARNING: Missing or malformed SPDX-License-Identifier tag in line *

So, people should notice it before the patch submission, but in reality
broken tags sometimes slip in. The checkpatch warning is not useful for
checking the committed files globally since large number of files still
have no SPDX tag.

Also, I am not sure about the legal effect when the SPDX tag is broken.

Anyway, these typos are absolutely worth fixing. It is pretty easy to
find suspicious lines by grep.

  $ git grep --not -e SPDX-License-Identifier --and -e SPDX- -- \
    :^LICENSES :^scripts/spdxcheck.py :^*/license-rules.rst
  arch/arm/kernel/bugs.c:// SPDX-Identifier: GPL-2.0
  drivers/phy/st/phy-stm32-usbphyc.c:// SPDX-Licence-Identifier: GPL-2.0
  drivers/pinctrl/sh-pfc/pfc-r8a77980.c:// SPDX-Lincense-Identifier: GPL 2.0
  lib/test_stackinit.c:// SPDX-Licenses: GPLv2
  sound/soc/codecs/max9759.c:// SPDX-Licence-Identifier: GPL-2.0

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 years agocrypto: ux500 - fix license comment syntax error
Alex Xu (Hello71) [Sat, 1 Jun 2019 14:49:43 +0000 (10:49 -0400)]
crypto: ux500 - fix license comment syntax error

Causes error: drivers/crypto/ux500/cryp/Makefile:5: *** missing
separator.  Stop.

Fixes: af873fcecef5 ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 194")
Signed-off-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 years agoMAINTAINERS: add I2C DT bindings to ARM platforms
Wolfram Sang [Tue, 21 May 2019 08:21:30 +0000 (10:21 +0200)]
MAINTAINERS: add I2C DT bindings to ARM platforms

Reviewed-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
6 years agoMAINTAINERS: add DT bindings to i2c drivers
Wolfram Sang [Tue, 21 May 2019 08:15:04 +0000 (10:15 +0200)]
MAINTAINERS: add DT bindings to i2c drivers

Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
6 years agoMerge tag 'kvm-s390-master-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Fri, 31 May 2019 22:49:02 +0000 (00:49 +0200)]
Merge tag 'kvm-s390-master-5.2-2' of git://git./linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fixes

- fix compilation for !CONFIG_PCI
- fix the output of KVM_CAP_MAX_VCPU_ID

6 years agoMerge tag 'kvm-ppc-fixes-5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Paolo Bonzini [Fri, 31 May 2019 22:48:45 +0000 (00:48 +0200)]
Merge tag 'kvm-ppc-fixes-5.2-1' of git://git./linux/kernel/git/paulus/powerpc into kvm-master

PPC KVM fixes for 5.2

- Several bug fixes for the new XIVE-native code.
- Replace kvm->lock by other mutexes in several places where we hold a
  vcpu mutex, to avoid lock order inversions.
- Fix a lockdep warning on guest entry for radix-mode guests.
- Fix a bug causing user-visible corruption of SPRG3 on the host.

6 years agoblock: print offending values when cloned rq limits are exceeded
John Pittman [Thu, 23 May 2019 21:49:39 +0000 (17:49 -0400)]
block: print offending values when cloned rq limits are exceeded

While troubleshooting issues where cloned request limits have been
exceeded, it is often beneficial to know the actual values that
have been breached.  Print these values, assisting in ease of
identification of root cause of the breach.

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblk-mq: Document the blk_mq_hw_queue_to_node() arguments
Bart Van Assche [Fri, 31 May 2019 00:00:53 +0000 (17:00 -0700)]
blk-mq: Document the blk_mq_hw_queue_to_node() arguments

Document the meaning of the blk_mq_hw_queue_to_node() arguments.

Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblk-mq: Fix spelling in a source code comment
Bart Van Assche [Fri, 31 May 2019 00:00:52 +0000 (17:00 -0700)]
blk-mq: Fix spelling in a source code comment

Change one occurrence of 'performace' into 'performance'.

Cc: Max Gurtovoy <maxg@mellanox.com>
Fixes: fe631457ff3e ("blk-mq: map all HWQ also in hyperthreaded system") # v4.13.
Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblock: Fix bsg_setup_queue() kernel-doc header
Bart Van Assche [Fri, 31 May 2019 00:00:51 +0000 (17:00 -0700)]
block: Fix bsg_setup_queue() kernel-doc header

Document all bsg_setup_queue() arguments as required.

Fixes: aae3b069d5ce ("bsg: pass in desired timeout handler") # v5.0.
Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblock: Fix rq_qos_wait() kernel-doc header
Bart Van Assche [Fri, 31 May 2019 00:00:50 +0000 (17:00 -0700)]
block: Fix rq_qos_wait() kernel-doc header

Add documentation for the @rqw argument and change " - " into ": ".

Fixes: 84f603246db9 ("block: add rq_qos_wait to rq_qos") # v5.0-rc1~52^2~140.
Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblock: Fix blk_mq_*_map_queues() kernel-doc headers
Bart Van Assche [Fri, 31 May 2019 00:00:49 +0000 (17:00 -0700)]
block: Fix blk_mq_*_map_queues() kernel-doc headers

This patch avoids that the kernel-doc script complains about these
function headers when building with W=1.

Cc: Hannes Reinecke <hare@suse.com>
Cc: Keith Busch <keith.busch@intel.com>
Fixes: ed76e329d74a ("blk-mq: abstract out queue map") # v5.0.
Fixes: e42b3867de4b ("blk-mq-rdma: pass in queue map to blk_mq_rdma_map_queues") # v5.0.
Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblock: Fix throtl_pending_timer_fn() kernel-doc header
Bart Van Assche [Fri, 31 May 2019 00:00:48 +0000 (17:00 -0700)]
block: Fix throtl_pending_timer_fn() kernel-doc header

Commit e99e88a9d2b0 renamed a function argument without updating the
corresponding kernel-doc header. Update the kernel-doc header.

Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Fixes: e99e88a9d2b0 ("treewide: setup_timer() -> timer_setup()") # v4.15.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblock: Convert blk_invalidate_devt() header into a non-kernel-doc header
Bart Van Assche [Fri, 31 May 2019 00:00:47 +0000 (17:00 -0700)]
block: Convert blk_invalidate_devt() header into a non-kernel-doc header

This patch avoids that the kernel-doc tool warns about this function
header when building with W=1.

Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblock/partitions/ldm: Convert a kernel-doc header into a non-kernel-doc header
Bart Van Assche [Fri, 31 May 2019 00:00:46 +0000 (17:00 -0700)]
block/partitions/ldm: Convert a kernel-doc header into a non-kernel-doc header

This patch avoids that the kernel-doc tool warns about this function
header when building with W=1.

Reviewed-by: Chaitanya Kulkarni <chiatanya.kulkarni@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoMerge tag 'nfsd-5.2-1' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Fri, 31 May 2019 20:51:16 +0000 (13:51 -0700)]
Merge tag 'nfsd-5.2-1' of git://linux-nfs.org/~bfields/linux

Pull nfsd fix from Bruce Fields:
 "This reverts a minor fix which could cause us to treat conflicting NLM
  locks as nonconflicting.

  We have proper fix queued up for 5.3. In the meantime, a quick revert
  seems best for 5.2 and stable"

* tag 'nfsd-5.2-1' of git://linux-nfs.org/~bfields/linux:
  Revert "lockd: Show pid of lockd for remote locks"

6 years agoMerge tag 'v5.2-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Fri, 31 May 2019 20:49:50 +0000 (13:49 -0700)]
Merge tag 'v5.2-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Four small smb3 fixes, one for stable"

* tag 'v5.2-rc2-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: cifs_read_allocate_pages: don't iterate through whole page array on ENOMEM
  dfs_cache: fix a wrong use of kfree in flush_cache_ent()
  fs/cifs/smb2pdu.c: fix buffer free in SMB2_ioctl_free
  cifs: fix memory leak of pneg_inbuf on -EOPNOTSUPP ioctl case

6 years agoleds: avoid flush_work in atomic context
Pavel Machek [Sun, 26 May 2019 07:38:55 +0000 (09:38 +0200)]
leds: avoid flush_work in atomic context

It turns out that various triggers use led_blink_setup() from atomic
context, so we can't do a flush_work there. Flush is still needed for
slow LEDs, but we can move it to sysfs code where it is safe.

    WARNING: inconsistent lock state
    5.2.0-rc1 #1 Tainted: G        W
    --------------------------------
    inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
    000000006e30541b
    ((work_completion)(&led_cdev->set_brightness_work)){+.?.}, at:
    +__flush_work+0x3b/0x38a
    {SOFTIRQ-ON-W} state was registered at:
      lock_acquire+0x146/0x1a1
     __flush_work+0x5b/0x38a
     flush_work+0xb/0xd
     led_blink_setup+0x1e/0xd3
     led_blink_set+0x3f/0x44
     tpt_trig_timer+0xdb/0x106
     ieee80211_mod_tpt_led_trig+0xed/0x112

Fixes: 0db37915d912 ("leds: avoid races with workqueue")
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
6 years agoMerge branch 'next-fixes-for-5.2-rc' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 31 May 2019 18:08:44 +0000 (11:08 -0700)]
Merge branch 'next-fixes-for-5.2-rc' of git://git./linux/kernel/git/zohar/linux-integrity

Pull integrity subsystem fixes from Mimi Zohar:
 "Four bug fixes, none 5.2-specific, all marked for stable.

  The first two are related to the architecture specific IMA policy
  support. The other two patches, one is related to EVM signatures,
  based on additional hash algorithms, and the other is related to
  displaying the IMA policy"

* 'next-fixes-for-5.2-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: show rules with IMA_INMASK correctly
  evm: check hash algorithm passed to init_desc()
  ima: fix wrong signed policy requirement when not appraising
  x86/ima: Check EFI_RUNTIME_SERVICES before using

6 years agoMerge tag 'for-linus-5.2b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 31 May 2019 17:53:34 +0000 (10:53 -0700)]
Merge tag 'for-linus-5.2b-rc3-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "One minor cleanup patch and a fix for handling of live migration when
  running as Xen guest"

* tag 'for-linus-5.2b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xenbus: Avoid deadlock during suspend due to open transactions
  xen/pvcalls: Remove set but not used variable

6 years agoMerge tag 's390-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 31 May 2019 17:49:25 +0000 (10:49 -0700)]
Merge tag 's390-5.2-3' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

 - Farewell Martin Schwidefsky: add Martin to CREDITS and remove him
   from MAINTAINERS

 - Vasily Gorbik and Christian Borntraeger join as maintainers for s390

 - Fix locking bug in ctr(aes) and ctr(des) s390 specific ciphers

 - A rather large patch which fixes gcm-aes-s390 scatter gather handling

 - Fix zcrypt wrong dispatching for control domain CPRBs

 - Fix assignment of bus resources in PCI code

 - Fix structure definition for set PCI function

 - Fix one compile error and one compile warning seen when
   CONFIG_OPTIMIZE_INLINING is enabled

* tag 's390-5.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  MAINTAINERS: add Vasily Gorbik and Christian Borntraeger for s390
  MAINTAINERS: Farewell Martin Schwidefsky
  s390/crypto: fix possible sleep during spinlock aquired
  s390/crypto: fix gcm-aes-s390 selftest failures
  s390/zcrypt: Fix wrong dispatching for control domain CPRBs
  s390/pci: fix assignment of bus resources
  s390/pci: fix struct definition for set PCI function
  s390: mark __cpacf_check_opcode() and cpacf_query_func() as __always_inline
  s390: add unreachable() to dump_fault_info() to fix -Wmaybe-uninitialized

6 years agoMerge tag 'pm-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 31 May 2019 17:38:35 +0000 (10:38 -0700)]
Merge tag 'pm-5.2-rc3' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix three issues in the system-wide suspend and hibernation area
  related to PCI device PM handling by suspend-to-idle, device wakeup
  optimizations and arbitrary differences between suspend and
  hiberantion.

  Specifics:

   - Modify the PCI bus type's PM code to avoid putting devices left by
     their drivers in D0 on purpose during suspend to idle into
     low-power states as doing that may confuse the system resume
     callbacks of the drivers in question (Rafael Wysocki).

   - Avoid checking ACPI wakeup configuration during system-wide suspend
     for suspended devices that do not use ACPI-based wakeup to allow
     them to stay in suspend more often (Rafael Wysocki).

   - The last phase of hibernation is analogous to system-wide suspend
     also because on platforms with ACPI it passes control to the
     platform firmware to complete the transision, so make it indicate
     that by calling pm_set_suspend_via_firmware() to allow the drivers
     that care about this to do the right thing (Rafael Wysocki)"

* tag 'pm-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PCI: PM: Avoid possible suspend-to-idle issue
  ACPI: PM: Call pm_set_suspend_via_firmware() during hibernation
  ACPI/PCI: PM: Add missing wakeup.flags.valid checks

6 years agoMerge tag 'gcc-plugins-v5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 31 May 2019 17:26:05 +0000 (10:26 -0700)]
Merge tag 'gcc-plugins-v5.2-rc3' of git://git./linux/kernel/git/kees/linux

Pull gcc-plugins fix from Kees Cook:
 "Handle unusual header environment, fixing a redefined macro error
  under a Darwin build host"

* tag 'gcc-plugins-v5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  gcc-plugins: Fix build failures under Darwin host

6 years agoMerge tag 'spdx-5.2-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 31 May 2019 15:34:32 +0000 (08:34 -0700)]
Merge tag 'spdx-5.2-rc3-1' of git://git./linux/kernel/git/gregkh/driver-core

Pull yet more SPDX updates from Greg KH:
 "Here is another set of reviewed patches that adds SPDX tags to
  different kernel files, based on a set of rules that are being used to
  parse the comments to try to determine that the license of the file is
  "GPL-2.0-or-later" or "GPL-2.0-only". Only the "obvious" versions of
  these matches are included here, a number of "non-obvious" variants of
  text have been found but those have been postponed for later review
  and analysis.

  There is also a patch in here to add the proper SPDX header to a bunch
  of Kbuild files that we have missed in the past due to new files being
  added and forgetting that Kbuild uses two different file names for
  Makefiles. This issue was reported by the Kbuild maintainer.

  These patches have been out for review on the linux-spdx@vger mailing
  list, and while they were created by automatic tools, they were
  hand-verified by a bunch of different people, all whom names are on
  the patches are reviewers"

* tag 'spdx-5.2-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (82 commits)
  treewide: Add SPDX license identifier - Kbuild
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 225
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 224
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 223
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 222
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 221
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 220
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 218
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 217
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 216
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 215
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 214
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 213
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 211
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 210
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 207
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 203
  treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201
  ...

6 years agoMerge tag 'staging-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Fri, 31 May 2019 15:31:45 +0000 (08:31 -0700)]
Merge tag 'staging-5.2-rc3' of git://git./linux/kernel/git/gregkh/staging

Pull staging and IIO driver fixes from Greg KH:
 "Here are some Staging and IIO driver fixes to resolve some reported
  problems for 5.2-rc3.

  Nothing major here, just some tiny changes, full details are in the
  shortlog.

  All have been in linux-next for a while with no reported issues"

* tag 'staging-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: kpc2000: Add dependency on MFD_CORE to kconfig symbol 'KPC2000'
  staging: wilc1000: Fix some double unlock bugs in wilc_wlan_cleanup()
  staging: vc04_services: prevent integer overflow in create_pagelist()
  Staging: vc04_services: Fix a couple error codes
  staging: wlan-ng: fix adapter initialization failure
  staging: kpc2000: double unlock in error handling in kpc_dma_transfer()
  staging: kpc2000: Fix build error without CONFIG_UIO
  staging: kpc2000: fix build error on xtensa
  staging: erofs: set sb->s_root to NULL when failing from __getname()
  iio: adc: ti-ads8688: fix timestamp is not updated in buffer
  iio: dac: ds4422/ds4424 fix chip verification
  iio: imu: mpu6050: Fix FIFO layout for ICM20602
  iio: adc: ads124: avoid buffer overflow
  iio: adc: modify NPCM ADC read reference voltage

6 years agoMerge tag 'tty-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Fri, 31 May 2019 15:28:44 +0000 (08:28 -0700)]
Merge tag 'tty-5.2-rc3' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are some small serial and TTY driver fixes for 5.2-rc3.

  Nothing major, just a number of fixes for reported issues. The fbcon
  core fix also resolves an issue, and was acked by the relevant
  maintainer to go through this tree.

  All of these have been in linux-next with no reported issues"

* tag 'tty-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  vt/fbcon: deinitialize resources in visual_init() after failed memory allocation
  tty: max310x: Fix external crystal register setup
  serial: sh-sci: disable DMA for uart_console
  serial: imx: remove log spamming error message
  tty: serial: msm_serial: Fix XON/XOFF

6 years agoMerge tag 'usb-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 31 May 2019 15:16:31 +0000 (08:16 -0700)]
Merge tag 'usb-5.2-rc3' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some tiny USB fixes for a number of reported issues for
  5.2-rc3.

  Nothing huge here, just a small collection of xhci and other driver
  bugs that syzbot has been finding in some drivers. There is also a
  usbip fix and a fix for the usbip fix in here :)

  All have been in linux-next with no reported issues"

* tag 'usb-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usbip: usbip_host: fix stub_dev lock context imbalance regression
  media: smsusb: better handle optional alignment
  xhci: Use %zu for printing size_t type
  xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic()
  xhci: Fix immediate data transfer if buffer is already DMA mapped
  usb: xhci: avoid null pointer deref when bos field is NULL
  usb: xhci: Fix a potential null pointer dereference in xhci_debugfs_create_endpoint()
  xhci: update bounce buffer with correct sg num
  media: usb: siano: Fix false-positive "uninitialized variable" warning
  USB: rio500: update Documentation
  USB: rio500: simplify locking
  USB: rio500: fix memory leak in close after disconnect
  USB: rio500: refuse more than one device at a time
  usbip: usbip_host: fix BUG: sleeping function called from invalid context
  USB: sisusbvga: fix oops in error path of sisusb_probe
  USB: Add LPM quirk for Surface Dock GigE adapter
  media: usb: siano: Fix general protection fault in smsusb
  usb: mtu3: fix up undefined reference to usb_debug_root
  USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor

6 years agoMerge tag 'drm-fixes-2019-05-31' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 31 May 2019 15:14:16 +0000 (08:14 -0700)]
Merge tag 'drm-fixes-2019-05-31' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Nothing too crazy, pretty quiet, maybe too quiet.

  amdgpu:
   - a fixed version of the raven firmware fix we previously reverted
   - stolen memory fix

  imx:
   - regression fix

  qxl:
   - remove a bad warning

  etnaviv:
   - VM locking fix"

* tag 'drm-fixes-2019-05-31' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: reserve stollen vram for raven series
  drm/etnaviv: lock MMU while dumping core
  drm/imx: ipuv3-plane: fix atomic update status query for non-plus i.MX6Q
  drm/qxl: drop WARN_ONCE()
  drm/amd/display: Don't load DMCU for Raven 1 (v2)

6 years agoRevert "lockd: Show pid of lockd for remote locks"
Benjamin Coddington [Mon, 20 May 2019 14:33:07 +0000 (10:33 -0400)]
Revert "lockd: Show pid of lockd for remote locks"

This reverts most of commit b8eee0e90f97 ("lockd: Show pid of lockd for
remote locks"), which caused remote locks to not be differentiated between
remote processes for NLM.

We retain the fixup for setting the client's fl_pid to a negative value.

Fixes: b8eee0e90f97 ("lockd: Show pid of lockd for remote locks")
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: XueWei Zhang <xueweiz@google.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
6 years agoMAINTAINERS: add Vasily Gorbik and Christian Borntraeger for s390
Heiko Carstens [Fri, 31 May 2019 07:51:35 +0000 (09:51 +0200)]
MAINTAINERS: add Vasily Gorbik and Christian Borntraeger for s390

Add Vasily Gorbik and Christian Borntraeger as additional maintainers
for s390.

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agoMAINTAINERS: Farewell Martin Schwidefsky
Heiko Carstens [Wed, 29 May 2019 14:58:44 +0000 (16:58 +0200)]
MAINTAINERS: Farewell Martin Schwidefsky

After two decades of significant contributions to the s390
architecture, we must say goodbye to our dear colleague.

Blue skies, Martin!

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 31 May 2019 04:11:22 +0000 (21:11 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix OOPS during nf_tables rule dump, from Florian Westphal.

 2) Use after free in ip_vs_in, from Yue Haibing.

 3) Fix various kTLS bugs (NULL deref during device removal resync,
    netdev notification ignoring, etc.) From Jakub Kicinski.

 4) Fix ipv6 redirects with VRF, from David Ahern.

 5) Memory leak fix in igmpv3_del_delrec(), from Eric Dumazet.

 6) Missing memory allocation failure check in ip6_ra_control(), from
    Gen Zhang. And likewise fix ip_ra_control().

 7) TX clean budget logic error in aquantia, from Igor Russkikh.

 8) SKB leak in llc_build_and_send_ui_pkt(), from Eric Dumazet.

 9) Double frees in mlx5, from Parav Pandit.

10) Fix lost MAC address in r8169 during PCI D3, from Heiner Kallweit.

11) Fix botched register access in mvpp2, from Antoine Tenart.

12) Use after free in napi_gro_frags(), from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (89 commits)
  net: correct zerocopy refcnt with udp MSG_MORE
  ethtool: Check for vlan etype or vlan tci when parsing flow_rule
  net: don't clear sock->sk early to avoid trouble in strparser
  net-gro: fix use-after-free read in napi_gro_frags()
  net: dsa: tag_8021q: Create a stable binary format
  net: dsa: tag_8021q: Change order of rx_vid setup
  net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
  ipv4: tcp_input: fix stack out of bounds when parsing TCP options.
  mlxsw: spectrum: Prevent force of 56G
  mlxsw: spectrum_acl: Avoid warning after identical rules insertion
  net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT
  r8169: fix MAC address being lost in PCI D3
  net: core: support XDP generic on stacked devices.
  netvsc: unshare skb in VF rx handler
  udp: Avoid post-GRO UDP checksum recalculation
  net: phy: dp83867: Set up RGMII TX delay
  net: phy: dp83867: do not call config_init twice
  net: phy: dp83867: increase SGMII autoneg timer duration
  net: phy: dp83867: fix speed 10 in sgmii mode
  net: phy: marvell10g: report if the PHY fails to boot firmware
  ...

6 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 31 May 2019 04:05:23 +0000 (21:05 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "The fixes are still trickling in for arm64, but the only really
  significant one here is actually fixing a regression in the botched
  module relocation range checking merged for -rc2.

  Hopefully we've nailed it this time.

   - Fix implementation of our set_personality() system call, which
     wasn't being wrapped properly

   - Fix system call function types to keep CFI happy

   - Fix siginfo layout when delivering SIGKILL after a kernel fault

   - Really fix module relocation range checking"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: use the correct function type for __arm64_sys_ni_syscall
  arm64: use the correct function type in SYSCALL_DEFINE0
  arm64: fix syscall_fn_t type
  signal/arm64: Use force_sig not force_sig_fault for SIGKILL
  arm64/module: revert to unsigned interpretation of ABS16/32 relocations
  arm64: Fix the arm64_personality() syscall wrapper redirection

6 years agoMerge tag 'for-5.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Fri, 31 May 2019 03:52:40 +0000 (20:52 -0700)]
Merge tag 'for-5.2-rc2-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few more fixes for bugs reported by users, fuzzing tools and
  regressions:

   - fix crashes in relocation:
       + resuming interrupted balance operation does not properly clean
         up orphan trees
       + with enabled qgroups, resuming needs to be more careful about
         block groups due to limited context when updating qgroups

   - fsync and logging fixes found by fuzzing

   - incremental send fixes for no-holes and clone

   - fix spin lock type used in timer function for zstd"

* tag 'for-5.2-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix race updating log root item during fsync
  Btrfs: fix wrong ctime and mtime of a directory after log replay
  Btrfs: fix fsync not persisting changed attributes of a directory
  btrfs: qgroup: Check bg while resuming relocation to avoid NULL pointer dereference
  btrfs: reloc: Also queue orphan reloc tree for cleanup to avoid BUG_ON()
  Btrfs: incremental send, fix emission of invalid clone operations
  Btrfs: incremental send, fix file corruption when no-holes feature is enabled
  btrfs: correct zstd workspace manager lock to use spin_lock_bh()
  btrfs: Ensure replaced device doesn't have pending chunk allocation

6 years agoMerge tag 'configfs-for-5.2-2' of git://git.infradead.org/users/hch/configfs
Linus Torvalds [Fri, 31 May 2019 03:35:48 +0000 (20:35 -0700)]
Merge tag 'configfs-for-5.2-2' of git://git.infradead.org/users/hch/configfs

Pull configs fix from Christoph Hellwig:

 - fix a use after free in configfs_d_iput (Sahitya Tummala)

* tag 'configfs-for-5.2-2' of git://git.infradead.org/users/hch/configfs:
  configfs: Fix use-after-free when accessing sd->s_dentry

6 years agoMerge tag 'sound-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 31 May 2019 02:58:59 +0000 (19:58 -0700)]
Merge tag 'sound-5.2-rc3' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "No big surprises here, just a few device-specific fixes.

  HD-audio received several fixes for Acer, Dell, Huawei and other
  laptops as well as the workaround for the new Intel chipset. One
  significant one-liner fix is the disablement of the node-power saving
  on Realtek codecs, which may potentially cover annoying bugs like the
  background noises or click noises on many devices.

  Other than that, a fix for FireWire bit definitions, and another fix
  for LINE6 USB audio bug that was discovered by syzkaller"

* tag 'sound-5.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: fireface: Use ULL suffixes for 64-bit constants
  ALSA: hda/realtek - Improve the headset mic for Acer Aspire laptops
  ALSA: line6: Assure canceling delayed work at disconnection
  ALSA: hda - Force polling mode on CNL for fixing codec communication
  ALSA: hda/realtek - Enable micmute LED for Huawei laptops
  ALSA: hda/realtek - Set default power save node to 0
  ALSA: hda/realtek - Check headset type by unplug and resume

6 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 30 May 2019 23:33:37 +0000 (16:33 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk driver fixes from Stephen Boyd:

 - Don't expose the SiFive clk driver on non-RISCV architectures

 - Fix some bits describing clks in the imx8mm driver

 - Always call clk domain code in the TI driver so non-legacy platforms
   work

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: ti: clkctrl: Fix clkdm_clk handling
  clk: imx: imx8mm: fix int pll clk gate
  clk: sifive: restrict Kconfig scope for the FU540 PRCI driver

6 years agoMerge tag 'imx-drm-fixes-2019-05-29' of git://git.pengutronix.de/git/pza/linux into...
Dave Airlie [Thu, 30 May 2019 23:15:14 +0000 (09:15 +1000)]
Merge tag 'imx-drm-fixes-2019-05-29' of git://git.pengutronix.de/git/pza/linux into drm-fixes

drm/imx: ipuv3-plane: fix frame rate regression on non-plus i.MX6Q

Fix a regression introduced by 70e8a0c71e9 ("drm/imx: ipuv3-plane: add
function to query atomic update status") that halves the frame rate on
non-plus i.MX6Q, because the pending check always returns "pending"
even if an update is actually applied.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/1559128738.3651.4.camel@pengutronix.de
6 years agonet: correct zerocopy refcnt with udp MSG_MORE
Willem de Bruijn [Thu, 30 May 2019 22:01:21 +0000 (18:01 -0400)]
net: correct zerocopy refcnt with udp MSG_MORE

TCP zerocopy takes a uarg reference for every skb, plus one for the
tcp_sendmsg_locked datapath temporarily, to avoid reaching refcnt zero
as it builds, sends and frees skbs inside its inner loop.

UDP and RAW zerocopy do not send inside the inner loop so do not need
the extra sock_zerocopy_get + sock_zerocopy_put pair. Commit
52900d22288ed ("udp: elide zerocopy operation in hot path") introduced
extra_uref to pass the initial reference taken in sock_zerocopy_alloc
to the first generated skb.

But, sock_zerocopy_realloc takes this extra reference at the start of
every call. With MSG_MORE, no new skb may be generated to attach the
extra_uref to, so refcnt is incorrectly 2 with only one skb.

Do not take the extra ref if uarg && !tcp, which implies MSG_MORE.
Update extra_uref accordingly.

This conditional assignment triggers a false positive may be used
uninitialized warning, so have to initialize extra_uref at define.

Changes v1->v2: fix typo in Fixes SHA1

Fixes: 52900d22288e7 ("udp: elide zerocopy operation in hot path")
Reported-by: syzbot <syzkaller@googlegroups.com>
Diagnosed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoethtool: Check for vlan etype or vlan tci when parsing flow_rule
Maxime Chevallier [Thu, 30 May 2019 14:08:40 +0000 (16:08 +0200)]
ethtool: Check for vlan etype or vlan tci when parsing flow_rule

When parsing an ethtool flow spec to build a flow_rule, the code checks
if both the vlan etype and the vlan tci are specified by the user to add
a FLOW_DISSECTOR_KEY_VLAN match.

However, when the user only specified a vlan etype or a vlan tci, this
check silently ignores these parameters.

For example, the following rule :

ethtool -N eth0 flow-type udp4 vlan 0x0010 action -1 loc 0

will result in no error being issued, but the equivalent rule will be
created and passed to the NIC driver :

ethtool -N eth0 flow-type udp4 action -1 loc 0

In the end, neither the NIC driver using the rule nor the end user have
a way to know that these keys were dropped along the way, or that
incorrect parameters were entered.

This kind of check should be left to either the driver, or the ethtool
flow spec layer.

This commit makes so that ethtool parameters are forwarded as-is to the
NIC driver.

Since none of the users of ethtool_rx_flow_rule_create are using the
VLAN dissector, I don't think this qualifies as a regression.

Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Acked-by: Pablo Neira Ayuso <pablo@gnumonks.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: don't clear sock->sk early to avoid trouble in strparser
Jakub Kicinski [Wed, 29 May 2019 23:33:23 +0000 (16:33 -0700)]
net: don't clear sock->sk early to avoid trouble in strparser

af_inet sets sock->sk to NULL which trips strparser over:

BUG: kernel NULL pointer dereference, address: 0000000000000012
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.2.0-rc1-00139-g14629453a6d3 #21
RIP: 0010:tcp_peek_len+0x10/0x60
RSP: 0018:ffffc02e41c54b98 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9cf924c4e030 RCX: 0000000000000051
RDX: 0000000000000000 RSI: 000000000000000c RDI: ffff9cf97128f480
RBP: ffff9cf9365e0300 R08: ffff9cf94fe7d2c0 R09: 0000000000000000
R10: 000000000000036b R11: ffff9cf939735e00 R12: ffff9cf91ad9ae40
R13: ffff9cf924c4e000 R14: ffff9cf9a8fcbaae R15: 0000000000000020
FS: 0000000000000000(0000) GS:ffff9cf9af7c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 000000013920a003 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
 <IRQ>
 strp_data_ready+0x48/0x90
 tls_data_ready+0x22/0xd0 [tls]
 tcp_rcv_established+0x569/0x620
 tcp_v4_do_rcv+0x127/0x1e0
 tcp_v4_rcv+0xad7/0xbf0
 ip_protocol_deliver_rcu+0x2c/0x1c0
 ip_local_deliver_finish+0x41/0x50
 ip_local_deliver+0x6b/0xe0
 ? ip_protocol_deliver_rcu+0x1c0/0x1c0
 ip_rcv+0x52/0xd0
 ? ip_rcv_finish_core.isra.20+0x380/0x380
 __netif_receive_skb_one_core+0x7e/0x90
 netif_receive_skb_internal+0x42/0xf0
 napi_gro_receive+0xed/0x150
 nfp_net_poll+0x7a2/0xd30 [nfp]
 ? kmem_cache_free_bulk+0x286/0x310
 net_rx_action+0x149/0x3b0
 __do_softirq+0xe3/0x30a
 ? handle_irq_event_percpu+0x6a/0x80
 irq_exit+0xe8/0xf0
 do_IRQ+0x85/0xd0
 common_interrupt+0xf/0xf
 </IRQ>
RIP: 0010:cpuidle_enter_state+0xbc/0x450

To avoid this issue set sock->sk after sk_prot->close.
My grepping and testing did not discover any code which
would depend on the current behaviour.

Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet-gro: fix use-after-free read in napi_gro_frags()
Eric Dumazet [Wed, 29 May 2019 22:36:10 +0000 (15:36 -0700)]
net-gro: fix use-after-free read in napi_gro_frags()

If a network driver provides to napi_gro_frags() an
skb with a page fragment of exactly 14 bytes, the call
to gro_pull_from_frag0() will 'consume' the fragment
by calling skb_frag_unref(skb, 0), and the page might
be freed and reused.

Reading eth->h_proto at the end of napi_frags_skb() might
read mangled data, or crash under specific debugging features.

BUG: KASAN: use-after-free in napi_frags_skb net/core/dev.c:5833 [inline]
BUG: KASAN: use-after-free in napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
Read of size 2 at addr ffff88809366840c by task syz-executor599/8957

CPU: 1 PID: 8957 Comm: syz-executor599 Not tainted 5.2.0-rc1+ #32
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:188
 __kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 kasan_report+0x12/0x20 mm/kasan/common.c:614
 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:142
 napi_frags_skb net/core/dev.c:5833 [inline]
 napi_gro_frags+0xc6f/0xd10 net/core/dev.c:5841
 tun_get_user+0x2f3c/0x3ff0 drivers/net/tun.c:1991
 tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037
 call_write_iter include/linux/fs.h:1872 [inline]
 do_iter_readv_writev+0x5f8/0x8f0 fs/read_write.c:693
 do_iter_write fs/read_write.c:970 [inline]
 do_iter_write+0x184/0x610 fs/read_write.c:951
 vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015
 do_writev+0x15b/0x330 fs/read_write.c:1058

Fixes: a50e233c50db ("net-gro: restore frag0 optimization")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'Fixes-for-DSA-tagging-using-802-1Q'
David S. Miller [Thu, 30 May 2019 21:47:14 +0000 (14:47 -0700)]
Merge branch 'Fixes-for-DSA-tagging-using-802-1Q'

Vladimir Oltean says:

====================
Fixes for DSA tagging using 802.1Q

During the prototyping for the "Decoupling PHYLINK from struct
net_device" patchset, the CPU port of the sja1105 driver was moved to a
different spot.  This uncovered an issue in the tag_8021q DSA code,
which used to work by mistake - the CPU port was the last hardware port
numerically, and this was masking an ordering issue which is very likely
to be seen in other drivers that make use of 802.1Q tags.

A question was also raised whether the VID numbers bear any meaning, and
the conclusion was that they don't, at least not in an absolute sense.
The second patch defines bit fields inside the DSA 802.1Q VID so that
tcpdump can decode it unambiguously (although the meaning is now clear
even by visual inspection).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: tag_8021q: Create a stable binary format
Vladimir Oltean [Wed, 29 May 2019 21:42:31 +0000 (00:42 +0300)]
net: dsa: tag_8021q: Create a stable binary format

Tools like tcpdump need to be able to decode the significance of fake
VLAN headers that DSA uses to separate switch ports.

But currently these have no global significance - they are simply an
ordered list of DSA_MAX_SWITCHES x DSA_MAX_PORTS numbers ending at 4095.

The reason why this is submitted as a fix is that the existing mapping
of VIDs should not enter into a stable kernel, so we can pretend that
only the new format exists. This way tcpdump won't need to try to make
something out of the VLAN tags on 5.2 kernels.

Fixes: f9bbe4477c30 ("net: dsa: Optional VLAN-based port separation for switches without tagging")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: tag_8021q: Change order of rx_vid setup
Ioana Ciornei [Wed, 29 May 2019 21:42:30 +0000 (00:42 +0300)]
net: dsa: tag_8021q: Change order of rx_vid setup

The 802.1Q tagging performs an unbalanced setup in terms of RX VIDs on
the CPU port. For the ingress path of a 802.1Q switch to work, the RX
VID of a port needs to be seen as tagged egress on the CPU port.

While configuring the other front-panel ports to be part of this VID,
for bridge scenarios, the untagged flag is applied even on the CPU port
in dsa_switch_vlan_add.  This happens because DSA applies the same flags
on the CPU port as on the (bridge-controlled) slave ports, and the
effect in this case is that the CPU port tagged settings get deleted.

Instead of fixing DSA by introducing a way to control VLAN flags on the
CPU port (and hence stop inheriting from the slave ports) - a hard,
perhaps intractable problem - avoid this situation by moving the setup
part of the RX VID on the CPU port after all the other front-panel ports
have been added to the VID.

Fixes: f9bbe4477c30 ("net: dsa: Optional VLAN-based port separation for switches without tagging")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value
Antoine Tenart [Wed, 29 May 2019 13:59:48 +0000 (15:59 +0200)]
net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value

MVPP2_TXQ_SCHED_TOKEN_CNTR_REG() expects the logical queue id but
the current code is passing the global tx queue offset, so it ends
up writing to unknown registers (between 0x8280 and 0x82fc, which
seemed to be unused by the hardware). This fixes the issue by using
the logical queue id instead.

Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>