linux-2.6-microblaze.git
3 years agopowerpc/numa: remove arch_update_cpu_topology
Nathan Lynch [Fri, 12 Jun 2020 05:12:34 +0000 (00:12 -0500)]
powerpc/numa: remove arch_update_cpu_topology

Since arch_update_cpu_topology() doesn't do anything on powerpc now,
remove it and associated dead code.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-15-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove prrn_is_enabled()
Nathan Lynch [Fri, 12 Jun 2020 05:12:33 +0000 (00:12 -0500)]
powerpc/numa: remove prrn_is_enabled()

All users of this prrn_is_enabled() are gone; remove it.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-14-nathanl@linux.ibm.com
3 years agopowerpc/rtasd: simplify handle_rtas_event(), emit message on events
Nathan Lynch [Fri, 12 Jun 2020 05:12:32 +0000 (00:12 -0500)]
powerpc/rtasd: simplify handle_rtas_event(), emit message on events

prrn_is_enabled() always returns false/0, so handle_rtas_event() can
be simplified and some dead code can be removed. Use machine_is()
instead of #ifdef to run this code only on pseries, and add an
informational ratelimited message that we are ignoring the
events. PRRN events are relatively rare in normal operation and
usually arise from operator-initiated actions such as a DPO (Dynamic
Platform Optimizer) run.

Eventually we do want to consume these events and update the device
tree, but that needs more care to be safe vs LPM and DLPAR.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-13-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove start/stop_topology_update()
Nathan Lynch [Fri, 12 Jun 2020 05:12:31 +0000 (00:12 -0500)]
powerpc/numa: remove start/stop_topology_update()

These APIs have become no-ops, so remove them and all call sites.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-12-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove timed_topology_update()
Nathan Lynch [Fri, 12 Jun 2020 05:12:30 +0000 (00:12 -0500)]
powerpc/numa: remove timed_topology_update()

timed_topology_update is a no-op now, so remove it and all call sites.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-11-nathanl@linux.ibm.com
3 years agopowerpc/numa: stub out numa_update_cpu_topology()
Nathan Lynch [Fri, 12 Jun 2020 05:12:29 +0000 (00:12 -0500)]
powerpc/numa: stub out numa_update_cpu_topology()

Previous changes have removed the code which sets bits in
cpu_associativity_changes_mask and thus it is never modifed at
runtime. From this we can reason that numa_update_cpu_topology()
always returns 0 without doing anything. Remove the body of
numa_update_cpu_topology() and remove all code which becomes
unreachable as a result.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-10-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove vphn_enabled and prrn_enabled internal flags
Nathan Lynch [Fri, 12 Jun 2020 05:12:28 +0000 (00:12 -0500)]
powerpc/numa: remove vphn_enabled and prrn_enabled internal flags

These flags are always zero now; remove them and suitably adjust the
remaining references to them.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-9-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove unreachable topology workqueue code
Nathan Lynch [Fri, 12 Jun 2020 05:12:27 +0000 (00:12 -0500)]
powerpc/numa: remove unreachable topology workqueue code

Since vphn_enabled is always 0, we can remove the call to
topology_schedule_update() and remove the code which becomes
unreachable as a result.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-8-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove unreachable topology timer code
Nathan Lynch [Fri, 12 Jun 2020 05:12:26 +0000 (00:12 -0500)]
powerpc/numa: remove unreachable topology timer code

Since vphn_enabled is always 0, we can stub out
timed_topology_update() and remove the code which becomes unreachable.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-7-nathanl@linux.ibm.com
3 years agopowerpc/numa: make vphn_enabled, prrn_enabled flags const
Nathan Lynch [Fri, 12 Jun 2020 05:12:25 +0000 (00:12 -0500)]
powerpc/numa: make vphn_enabled, prrn_enabled flags const

Previous changes have made it so these flags are never changed;
enforce this by making them const.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-6-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove unreachable topology update code
Nathan Lynch [Fri, 12 Jun 2020 05:12:24 +0000 (00:12 -0500)]
powerpc/numa: remove unreachable topology update code

Since the topology_updates_enabled flag is now always false, remove it
and the code which has become unreachable. This is the minimum change
that prevents 'defined but unused' warnings emitted by the compiler
after stubbing out the start/stop_topology_updates() functions.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-5-nathanl@linux.ibm.com
3 years agopowerpc/numa: remove ability to enable topology updates
Nathan Lynch [Fri, 12 Jun 2020 05:12:23 +0000 (00:12 -0500)]
powerpc/numa: remove ability to enable topology updates

Remove the /proc/powerpc/topology_updates interface and the
topology_updates=on/off command line argument. The internal
topology_updates_enabled flag remains for now, but always false.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-4-nathanl@linux.ibm.com
3 years agopowerpc/rtas: don't online CPUs for partition suspend
Nathan Lynch [Fri, 12 Jun 2020 05:12:22 +0000 (00:12 -0500)]
powerpc/rtas: don't online CPUs for partition suspend

Partition suspension, used for hibernation and migration, requires
that the OS place all but one of the LPAR's processor threads into one
of two states prior to calling the ibm,suspend-me RTAS function:

  * the architected offline state (via RTAS stop-self); or
  * the H_JOIN hcall, which does not return until the partition
    resumes execution

Using H_CEDE as the offline mode, introduced by
commit 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU into
an appropriate offline state"), means that any threads which are
offline from Linux's point of view must be moved to one of those two
states before a partition suspension can proceed.

This was eventually addressed in commit 120496ac2d2d ("powerpc: Bring
all threads online prior to migration/hibernation"), which added code
to temporarily bring up any offline processor threads so they can call
H_JOIN. Conceptually this is fine, but the implementation has had
multiple races with cpu hotplug operations initiated from user
space[1][2][3], the error handling is fragile, and it generates
user-visible cpu hotplug events which is a lot of noise for a platform
feature that's supposed to minimize disruption to workloads.

With commit 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU
into an appropriate offline state") reverted, this code becomes
unnecessary, so remove it. Since any offline CPUs now are truly
offline from the platform's point of view, it is no longer necessary
to bring up CPUs only to have them call H_JOIN and then go offline
again upon resuming. Only active threads are required to call H_JOIN;
stopped threads can be left alone.

[1] commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and
    serialization during LPM")
[2] commit 9fb603050ffd ("powerpc/rtas: retry when cpu offline races
    with suspend/migration")
[3] commit dfd718a2ed1f ("powerpc/rtas: Fix a potential race between
    CPU-Offline & Migration")

Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-3-nathanl@linux.ibm.com
3 years agopowerpc/pseries: remove cede offline state for CPUs
Nathan Lynch [Fri, 12 Jun 2020 05:12:21 +0000 (00:12 -0500)]
powerpc/pseries: remove cede offline state for CPUs

This effectively reverts commit 3aa565f53c39 ("powerpc/pseries: Add
hooks to put the CPU into an appropriate offline state"), which added
an offline mode for CPUs which uses the H_CEDE hcall instead of the
architected stop-self RTAS function in order to facilitate "folding"
of dedicated mode processors on PowerVM platforms to achieve energy
savings. This has been the default offline mode since its
introduction.

There's nothing about stop-self that would prevent the hypervisor from
achieving the energy savings available via H_CEDE, so the original
premise of this change appears to be flawed.

I also have encountered the claim that the transition to and from
ceded state is much faster than stop-self/start-cpu. Certainly we
would not want to use stop-self as an *idle* mode. That is what H_CEDE
is for. However, this difference is insignificant in the context of
Linux CPU hotplug, where the latency of an offline or online operation
on current systems is on the order of 100ms, mainly attributable to
all the various subsystems' cpuhp callbacks.

The cede offline mode also prevents accurate accounting, as discussed
before:
https://lore.kernel.org/linuxppc-dev/1571740391-3251-1-git-send-email-ego@linux.vnet.ibm.com/

Unconditionally use stop-self to offline processor threads. This is
the architected method for offlining CPUs on PAPR systems.

The "cede_offline" boot parameter is rendered obsolete.

Removing this code enables the removal of the partition suspend code
which temporarily onlines all present CPUs.

Fixes: 3aa565f53c39 ("powerpc/pseries: Add hooks to put the CPU into an appropriate offline state")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612051238.1007764-2-nathanl@linux.ibm.com
3 years agopowerpc/security: Allow for processors that flush the link stack using the special...
Nicholas Piggin [Tue, 9 Jun 2020 07:06:09 +0000 (17:06 +1000)]
powerpc/security: Allow for processors that flush the link stack using the special bcctr

If both count cache and link stack are to be flushed, and can be flushed
with the special bcctr, patch that in directly to the flush/branch nop
site.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200609070610.846703-7-npiggin@gmail.com
3 years agopowerpc/64s: Move branch cache flushing bcctr variant to ppc-ops.h
Nicholas Piggin [Tue, 9 Jun 2020 07:06:08 +0000 (17:06 +1000)]
powerpc/64s: Move branch cache flushing bcctr variant to ppc-ops.h

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200609070610.846703-6-npiggin@gmail.com
3 years agopowerpc/security: split branch cache flush toggle from code patching
Nicholas Piggin [Tue, 9 Jun 2020 07:06:07 +0000 (17:06 +1000)]
powerpc/security: split branch cache flush toggle from code patching

Branch cache flushing code patching has inter-dependencies on both the
link stack and the count cache flushing state.

To make the code clearer and to separate the link stack and count
cache handling, split the "toggle" (setting up variables and printing
enable/disable) from the code patching.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Always print something, even if the flush is disabled]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200609070610.846703-5-npiggin@gmail.com
3 years agopowerpc/security: make display of branch cache flush more consistent
Nicholas Piggin [Tue, 9 Jun 2020 07:06:06 +0000 (17:06 +1000)]
powerpc/security: make display of branch cache flush more consistent

Make the count-cache and link-stack messages look the same

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200609070610.846703-4-npiggin@gmail.com
3 years agopowerpc/security: change link stack flush state to the flush type enum
Nicholas Piggin [Tue, 9 Jun 2020 07:06:05 +0000 (17:06 +1000)]
powerpc/security: change link stack flush state to the flush type enum

Prepare to allow for hardware link stack flushing by using the
none/sw/hw type, same as the count cache state.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200609070610.846703-3-npiggin@gmail.com
3 years agopowerpc/security: re-name count cache flush to branch cache flush
Nicholas Piggin [Tue, 9 Jun 2020 07:06:04 +0000 (17:06 +1000)]
powerpc/security: re-name count cache flush to branch cache flush

The count cache flush mostly refers to both count cache and link stack
flushing. As a first step to untangling these a bit, re-name the bits
that apply to both.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200609070610.846703-2-npiggin@gmail.com
3 years agopowerpc: re-initialise lazy FPU/VEC counters on every fault
Nicholas Piggin [Tue, 23 Jun 2020 23:41:39 +0000 (09:41 +1000)]
powerpc: re-initialise lazy FPU/VEC counters on every fault

When a FP/VEC/VSX unavailable fault loads registers and enables the
facility in the MSR, re-set the lazy restore counters to 1 rather
than incrementing them so every fault gets the same number of
restores before the next fault.

This probably shouldn't be a practical change because if a lazy counter
was non-zero then it should have been restored and would not cause a
fault when userspace tries to access it. However the code and comment
implies otherwise so that's misleading and unnecessary.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200623234139.2262227-3-npiggin@gmail.com
3 years agopowerpc/64s: Fix restore_math unnecessarily changing MSR
Nicholas Piggin [Tue, 23 Jun 2020 23:41:38 +0000 (09:41 +1000)]
powerpc/64s: Fix restore_math unnecessarily changing MSR

Before returning to user, if there are missing FP/VEC/VSX bits from the
user MSR then those registers had been saved and must be restored again
before use. restore_math will decide whether to restore immediately, or
skip the restore and let fp/vec/vsx unavailable faults demand load the
registers.

Each time restore_math restores one of the FP/VSX or VEC register sets
is loaded, an 8-bit counter is incremented (load_fp and load_vec). When
these wrap to zero, restore_math no longer restores that register set
until after they are next demand faulted.

It's quite usual for those counters to have different values, so if one
wraps to zero and restore_math no longer restores its registers or user
MSR bit but the other is not zero yet does not need to be restored
(because the kernel is not frequently using the FPU), then restore_math
will be called and it will also not return in the early exit check.
This causes msr_check_and_set to test and set the MSR at every kernel
exit despite having no work to do.

This can cause workloads (e.g., a NULL syscall microbenchmark) to run
fast for a time while both counters are non-zero, then slow down when
one of the counters reaches zero, then speed up again after the second
counter reaches zero. The cost is significant, about 10% slowdown on a
NULL syscall benchmark, and the jittery behaviour is very undesirable.

Fix this by having restore_math test all conditions first, and only
update MSR if we will be loading registers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200623234139.2262227-2-npiggin@gmail.com
3 years agopowerpc/64s: restore_math remove TM test
Nicholas Piggin [Tue, 23 Jun 2020 23:41:37 +0000 (09:41 +1000)]
powerpc/64s: restore_math remove TM test

The TM test in restore_math added by commit dc16b553c949e ("powerpc:
Always restore FPU/VEC/VSX if hardware transactional memory in use") is
no longer necessary after commit a8318c13e79ba ("powerpc/tm: Fix
restoring FP/VMX facility incorrectly on interrupts"), which removed
the cases where restore_math has to restore if TM is active.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200623234139.2262227-1-npiggin@gmail.com
3 years agopowerpc/pmem: Initialize pmem device on newer hardware
Aneesh Kumar K.V [Wed, 1 Jul 2020 07:22:35 +0000 (12:52 +0530)]
powerpc/pmem: Initialize pmem device on newer hardware

With kernel now supporting new pmem flush/sync instructions, we can now
enable the kernel to initialize the device. On P10 these devices would
appear with a new compatible string. For PAPR device we have

compatible       "ibm,pmemory-v2"

and for OF pmem device we have

compatible       "pmem-region-v2"

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-8-aneesh.kumar@linux.ibm.com
3 years agopowerpc/pmem: Avoid the barrier in flush routines
Aneesh Kumar K.V [Wed, 1 Jul 2020 07:22:34 +0000 (12:52 +0530)]
powerpc/pmem: Avoid the barrier in flush routines

nvdimm expect the flush routines to just mark the cache clean. The barrier
that mark the store globally visible is done in nvdimm_flush().

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-7-aneesh.kumar@linux.ibm.com
3 years agopowerpc/pmem: Update ppc64 to use the new barrier instruction.
Aneesh Kumar K.V [Wed, 1 Jul 2020 07:22:33 +0000 (12:52 +0530)]
powerpc/pmem: Update ppc64 to use the new barrier instruction.

pmem on POWER10 can now use phwsync instead of hwsync to ensure
all previous writes are architecturally visible for the platform
buffer flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-6-aneesh.kumar@linux.ibm.com
3 years agolibnvdimm/nvdimm/flush: Allow architecture to override the flush barrier
Aneesh Kumar K.V [Wed, 1 Jul 2020 07:22:32 +0000 (12:52 +0530)]
libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier

Architectures like ppc64 provide persistent memory specific barriers
that will ensure that all stores for which the modifications are
written to persistent storage by preceding dcbfps and dcbstps
instructions have updated persistent storage before any data
access or data transfer caused by subsequent instructions is initiated.
This is in addition to the ordering done by wmb()

Update nvdimm core such that architecture can use barriers other than
wmb to ensure all previous writes are architecturally visible for
the platform buffer flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-5-aneesh.kumar@linux.ibm.com
3 years agopowerpc/pmem: Add flush routines using new pmem store and sync instruction
Aneesh Kumar K.V [Wed, 1 Jul 2020 07:22:31 +0000 (12:52 +0530)]
powerpc/pmem: Add flush routines using new pmem store and sync instruction

Start using dcbstps; phwsync; sequence for flushing persistent memory range.
The new instructions are implemented as a variant of dcbf and hwsync and on
P8 and P9 they will be executed as those instructions. We avoid using them on
older hardware. This helps to avoid difficult to debug bugs.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-4-aneesh.kumar@linux.ibm.com
3 years agopowerpc/pmem: Add new instructions for persistent storage and sync
Aneesh Kumar K.V [Wed, 1 Jul 2020 07:22:30 +0000 (12:52 +0530)]
powerpc/pmem: Add new instructions for persistent storage and sync

POWER10 introduces two new variants of dcbf instructions (dcbstps and dcbfps)
that can be used to write modified locations back to persistent storage.

Additionally, POWER10 also introduce phwsync and plwsync which can be used
to establish order of these writes to persistent storage.

This patch exposes these instructions to the rest of the kernel. The existing
dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
synchronization with OpenCAPI-hosted persistent storage. Hence the new
instructions are added as a variant of the old ones that old hardware
won't differentiate.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-3-aneesh.kumar@linux.ibm.com
3 years agopowerpc/pmem: Restrict papr_scm to P8 and above.
Aneesh Kumar K.V [Wed, 1 Jul 2020 07:22:29 +0000 (12:52 +0530)]
powerpc/pmem: Restrict papr_scm to P8 and above.

The PAPR based virtualized persistent memory devices are only supported on
POWER9 and above. In the followup patch, the kernel will switch the persistent
memory cache flush functions to use a new `dcbf` variant instruction. The new
instructions even though added in ISA 3.1 works even on P8 and P9 because these
are implemented as a variant of existing `dcbf` and `hwsync` and on P8 and
P9 behaves as such.

Considering these devices are only supported on P8 and above,  update the driver
to prevent a P7-compat guest from using persistent memory devices.

We don't update of_pmem driver with the same condition, because, on bare-metal,
the firmware enables pmem support only on P9 and above. There the kernel depends
on OPAL firmware to restrict exposing persistent memory related device tree
entries on older hardware. of_pmem.ko is written without any arch dependency and
we don't want to add ppc64 specific cpu feature check in of_pmem driver.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200701072235.223558-2-aneesh.kumar@linux.ibm.com
3 years agopowerpc/mm/book3s64/radix: Off-load TLB invalidations to host when !GTSE
Nicholas Piggin [Fri, 3 Jul 2020 05:36:08 +0000 (11:06 +0530)]
powerpc/mm/book3s64/radix: Off-load TLB invalidations to host when !GTSE

When platform doesn't support GTSE, let TLB invalidation requests
for radix guests be off-loaded to the host using H_RPT_INVALIDATE
hcall.

[hcall wrapper, error path handling and renames]

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200703053608.12884-4-bharata@linux.ibm.com
3 years agopowerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if enabled
Bharata B Rao [Fri, 3 Jul 2020 05:36:07 +0000 (11:06 +0530)]
powerpc/pseries: H_REGISTER_PROC_TBL should ask for GTSE only if enabled

H_REGISTER_PROC_TBL asks for GTSE by default. GTSE flag bit should
be set only when GTSE is supported.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200703053608.12884-3-bharata@linux.ibm.com
3 years agopowerpc/mm: Enable radix GTSE only if supported.
Bharata B Rao [Fri, 3 Jul 2020 05:36:06 +0000 (11:06 +0530)]
powerpc/mm: Enable radix GTSE only if supported.

Make GTSE an MMU feature and enable it by default for radix.
However for guest, conditionally enable it if hypervisor supports
it via OV5 vector. Let prom_init ask for radix GTSE only if the
support exists.

Having GTSE as an MMU feature will make it easy to enable radix
without GTSE. Currently radix assumes GTSE is enabled by default.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200703053608.12884-2-bharata@linux.ibm.com
3 years agopowerpc/vdso64: Switch from __get_datapage() to get_datapage inline macro
Christophe Leroy [Tue, 28 Apr 2020 13:16:47 +0000 (13:16 +0000)]
powerpc/vdso64: Switch from __get_datapage() to get_datapage inline macro

On the same way as already done on PPC32, drop __get_datapage()
function and use get_datapage inline macro instead.

See commit ec0895f08f99 ("powerpc/vdso32: inline __get_datapage()")

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/e13d95312e0b9792556b19b4bb8955cc1ff19fc7.1588079622.git.christophe.leroy@c-s.fr
3 years agopowerpc/signal64: Don't opencode page prefaulting
Christophe Leroy [Tue, 7 Jul 2020 18:32:25 +0000 (18:32 +0000)]
powerpc/signal64: Don't opencode page prefaulting

Instead of doing a __get_user() from the first and last location
into a tmp var which won't be used, use fault_in_pages_readable()

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/810bd8840ef990a200f58c9dea9abe767ca02a3a.1594146723.git.christophe.leroy@csgroup.eu
3 years agopowerpc/signal_32: Simplify loop in PPC64 save_general_regs()
Christophe Leroy [Tue, 7 Jul 2020 12:33:36 +0000 (12:33 +0000)]
powerpc/signal_32: Simplify loop in PPC64 save_general_regs()

save_general_regs() which does special handling when i == PT_SOFTE.

Rewrite it to minimise the specific part, especially the __put_user()
and associated error handling is the same so make it common.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Use a regular if rather than ternary operator]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/47a38df46cae5a5a88a558a64d71f75e9c4d9950.1594125164.git.christophe.leroy@csgroup.eu
3 years agopowerpc/signal_32: Remove !FULL_REGS() special handling in PPC64 save_general_regs()
Christophe Leroy [Tue, 7 Jul 2020 12:33:35 +0000 (12:33 +0000)]
powerpc/signal_32: Remove !FULL_REGS() special handling in PPC64 save_general_regs()

Since commit ("1bd79336a426 powerpc: Fix various
syscall/signal/swapcontext bugs"), getting save_general_regs() called
without FULL_REGS() is very unlikely and generates a warning.

The 32-bit version of save_general_regs() doesn't take care of it
at all and copies all registers anyway since that commit.

Moreover, commit 965dd3ad3076 ("powerpc/64/syscall: Remove
non-volatile GPR save optimisation") is another reason why it would
never happen.

So the same with 64-bit, don't worry about FULL_REGS() and copy
all registers all the time.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/173de3b659fa3a5f126a0eb170522cccd909950f.1594125164.git.christophe.leroy@csgroup.eu
3 years agopowerpc/kasan: Fix shadow pages allocation failure
Christophe Leroy [Thu, 2 Jul 2020 11:52:03 +0000 (11:52 +0000)]
powerpc/kasan: Fix shadow pages allocation failure

Doing kasan pages allocation in MMU_init is too early, kernel doesn't
have access yet to the entire memory space and memblock_alloc() fails
when the kernel is a bit big.

Do it from kasan_init() instead.

Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Fixes: d2a91cef9bbd ("powerpc/kasan: Fix shadow pages allocation failure")
Cc: stable@vger.kernel.org
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208181
Link: https://lore.kernel.org/r/63048fcea8a1c02f75429ba3152f80f7853f87fc.1593690707.git.christophe.leroy@csgroup.eu
3 years agoRevert "powerpc/kasan: Fix shadow pages allocation failure"
Christophe Leroy [Thu, 2 Jul 2020 11:52:02 +0000 (11:52 +0000)]
Revert "powerpc/kasan: Fix shadow pages allocation failure"

This reverts commit d2a91cef9bbdeb87b7449fdab1a6be6000930210.

This commit moved too much work in kasan_init(). The allocation
of shadow pages has to be moved for the reason explained in that
patch, but the allocation of page tables still need to be done
before switching to the final hash table.

First revert the incorrect commit, following patch redoes it
properly.

Fixes: d2a91cef9bbd ("powerpc/kasan: Fix shadow pages allocation failure")
Cc: stable@vger.kernel.org
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208181
Link: https://lore.kernel.org/r/3667deb0911affbf999b99f87c31c77d5e870cd2.1593690707.git.christophe.leroy@csgroup.eu
3 years agodocs: powerpc: Clarify book3s/32 MMU families
Christophe Leroy [Thu, 2 Jul 2020 14:09:21 +0000 (14:09 +0000)]
docs: powerpc: Clarify book3s/32 MMU families

Documentation wrongly tells that book3s/32 CPU have hash MMU.

603 and e300 core only have software loaded TLB.

755, 7450 family and e600 core have both hash MMU and software loaded
TLB. This can be selected by setting a bit in HID2 (755) or
HID0 (others). At the time being this is not supported by the kernel.

Make this explicit in the documentation.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/261923c075d1cb49d02493685e8585d4ea2a5197.1593698951.git.christophe.leroy@csgroup.eu
3 years agoselftests/powerpc: Add FPU denormal test
Nicholas Piggin [Wed, 8 Jul 2020 07:49:42 +0000 (17:49 +1000)]
selftests/powerpc: Add FPU denormal test

Add a testcase that tries to trigger the FPU denormal exception on
Power8 or earlier CPUs.

Prior to commit 4557ac6b344b ("powerpc/64s/exception: Fix 0x1500
interrupt handler crash") this would trigger a crash such as:

  Oops: Exception in kernel mode, sig: 5 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in: iptable_mangle xt_MASQUERADE iptable_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc ip6table_filter ip6_tables iptable_filter fuse kvm_hv binfmt_misc squashfs mlx4_ib ib_uverbs dm_multipath scsi_dh_rdac scsi_dh_alua ib_core mlx4_en sr_mod cdrom bnx2x lpfc mlx4_core crc_t10dif scsi_transport_fc sg mdio vmx_crypto crct10dif_vpmsum leds_powernv powernv_rng rng_core led_class powernv_op_panel sunrpc ip_tables x_tables autofs4
  CPU: 159 PID: 6854 Comm: fpu_denormal Not tainted 5.8.0-rc2-gcc-8.2.0-00092-g4ec7aaab0828 #192
  NIP:  c0000000000100ec LR: c00000000001b85c CTR: 0000000000000000
  REGS: c000001dd818f770 TRAP: 1500   Not tainted  (5.8.0-rc2-gcc-8.2.0-00092-g4ec7aaab0828)
  MSR:  900000000290b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002884  XER: 20000000
  CFAR: c00000000001005c IRQMASK: 1
  GPR00: c00000000001c4c8 c000001dd818fa00 c00000000171c200 c000001dd8101570
  GPR04: 0000000000000000 c000001dd818fe90 c000001dd8101590 000000000000001d
  GPR08: 0000000000000010 0000000000002000 c000001dd818fe90 fffffffffc48ac60
  GPR12: 0000000000002200 c000001ffff4f480 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 00007fffab225b40 0000000000000001 c000000001757168
  GPR24: c000001dd8101570 c0000018027b00f0 c000001dd8101570 c000000001496098
  GPR28: c00000000174ad05 c000001dd8100000 c000001dd8100000 c000001dd8100000
  NIP save_fpu+0xa8/0x2ac
  LR  __giveup_fpu+0x2c/0xd0
  Call Trace:
    0xc000001dd818fa80 (unreliable)
    giveup_all+0x118/0x120
    __switch_to+0x124/0x6c0
    __schedule+0x390/0xaf0
    do_task_dead+0x70/0x80
    do_exit+0x8fc/0xe10
    do_group_exit+0x64/0xd0
    sys_exit_group+0x24/0x30
    system_call_exception+0x164/0x270
    system_call_common+0xf0/0x278

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Split out of fix patch, add oops log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200708074942.1713396-1-npiggin@gmail.com
3 years agopowerpc/64/signal: Balance return predictor stack in signal trampoline
Nicholas Piggin [Mon, 11 May 2020 10:19:52 +0000 (20:19 +1000)]
powerpc/64/signal: Balance return predictor stack in signal trampoline

Returning from an interrupt or syscall to a signal handler currently
begins execution directly at the handler's entry point, with LR set to
the address of the sigreturn trampoline. When the signal handler
function returns, it runs the trampoline. It looks like this:

    # interrupt at user address xyz
    # kernel stuff... signal is raised
    rfid
    # void handler(int sig)
    addis 2,12,.TOC.-.LCF0@ha
    addi 2,2,.TOC.-.LCF0@l
    mflr 0
    std 0,16(1)
    stdu 1,-96(1)
    # handler stuff
    ld 0,16(1)
    mtlr 0
    blr
    # __kernel_sigtramp_rt64
    addi    r1,r1,__SIGNAL_FRAMESIZE
    li      r0,__NR_rt_sigreturn
    sc
    # kernel executes rt_sigreturn
    rfid
    # back to user address xyz

Note the blr with no matching bl. This can corrupt the return
predictor.

Solve this by instead resuming execution at the signal trampoline
which then calls the signal handler. qtrace-tools link_stack checker
confirms the entire user/kernel/vdso cycle is balanced after this
patch, whereas it's not upstream.

Alan confirms the dwarf unwind info still looks good. gdb still
recognises the signal frame and can step into parent frames if it
break inside a signal handler.

Performance is pretty noisy, not a very significant change on a POWER9
here, but branch misses are consistently a lot lower on a
microbenchmark:

 Performance counter stats for './signal':

       13,085.72 msec task-clock                #    1.000 CPUs utilized
  45,024,760,101      cycles                    #    3.441 GHz
  65,102,895,542      instructions              #    1.45  insn per cycle
  11,271,673,787      branches                  #  861.372 M/sec
      59,468,979      branch-misses             #    0.53% of all branches

       12,989.09 msec task-clock                #    1.000 CPUs utilized
  44,692,719,559      cycles                    #    3.441 GHz
  65,109,984,964      instructions              #    1.46  insn per cycle
  11,282,136,057      branches                  #  868.585 M/sec
      39,786,942      branch-misses             #    0.35% of all branches

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200511101952.1463138-1-npiggin@gmail.com
3 years agopowerpc/spufs: add CONFIG_COREDUMP dependency
Arnd Bergmann [Mon, 6 Jul 2020 13:22:46 +0000 (15:22 +0200)]
powerpc/spufs: add CONFIG_COREDUMP dependency

The kernel test robot pointed out a slightly different error message
after recent commit 5456ffdee666 ("powerpc/spufs: simplify spufs core
dumping") to spufs for a configuration that never worked:

   powerpc64-linux-ld: arch/powerpc/platforms/cell/spufs/file.o: in function `.spufs_proxydma_info_dump':
>> file.c:(.text+0x4c68): undefined reference to `.dump_emit'
   powerpc64-linux-ld: arch/powerpc/platforms/cell/spufs/file.o: in function `.spufs_dma_info_dump':
   file.c:(.text+0x4d70): undefined reference to `.dump_emit'
   powerpc64-linux-ld: arch/powerpc/platforms/cell/spufs/file.o: in function `.spufs_wbox_info_dump':
   file.c:(.text+0x4df4): undefined reference to `.dump_emit'

Add a Kconfig dependency to prevent this from happening again.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200706132302.3885935-1-arnd@arndb.de
3 years agopowerpc/powernv: Move pnv_ioda_setup_bus_dma under CONFIG_IOMMU_API
Oliver O'Halloran [Sun, 5 Jul 2020 13:35:57 +0000 (23:35 +1000)]
powerpc/powernv: Move pnv_ioda_setup_bus_dma under CONFIG_IOMMU_API

pnv_ioda_setup_bus_dma() is only used when a passed through PE is
returned to the host. If the kernel is built without IOMMU support
this is dead code. Move it under the #ifdef with the rest of the
IOMMU API support.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200705133557.443607-2-oohall@gmail.com
3 years agopowerpc/powernv: Make pnv_pci_sriov_enable() and friends static
Oliver O'Halloran [Sun, 5 Jul 2020 13:35:56 +0000 (23:35 +1000)]
powerpc/powernv: Make pnv_pci_sriov_enable() and friends static

The kernel test robot noticed these are non-static which causes Clang to
print some warnings. These are called via ppc_md function pointers so
there's no need for them to be non-static.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200705133557.443607-1-oohall@gmail.com
3 years agocpuidle/powernv : Remove dead code block
Abhishek Goel [Mon, 6 Jul 2020 05:32:58 +0000 (00:32 -0500)]
cpuidle/powernv : Remove dead code block

Commit 1961acad2f88559c2cdd2ef67c58c3627f1f6e54 removes usage of
function "validate_dt_prop_sizes". This patch removes this unused
function.

Signed-off-by: Abhishek Goel <huntbag@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200706053258.121475-1-huntbag@linux.vnet.ibm.com
3 years agopowerpc/cacheinfo: Add per cpu per index shared_cpu_list
Srikar Dronamraju [Mon, 29 Jun 2020 10:37:03 +0000 (16:07 +0530)]
powerpc/cacheinfo: Add per cpu per index shared_cpu_list

Unlike drivers/base/cacheinfo, powerpc cacheinfo code is not exposing
shared_cpu_list under /sys/devices/system/cpu/cpu<n>/cache/index<m>

Add shared_cpu_list to per cpu per index directory to maintain parity
with x86. Some scripts (example: mmtests
https://github.com/gormanm/mmtests) seem to be looking for
shared_cpu_list instead of shared_cpu_map.

Before this patch:
  # ls /sys/devices/system/cpu0/cache/index1
  coherency_line_size  number_of_sets  size  ways_of_associativity
  level                shared_cpu_map  type
  # cat /sys/devices/system/cpu0/cache/index1/shared_cpu_map
  00ff
  #

After this patch:
  # ls /sys/devices/system/cpu0/cache/index1
  coherency_line_size  number_of_sets   shared_cpu_map  type
  level                shared_cpu_list  size            ways_of_associativity
  # cat /sys/devices/system/cpu0/cache/index1/shared_cpu_map
  00ff
  # cat /sys/devices/system/cpu0/cache/index1/shared_cpu_list
  0-7
  #

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200629103703.4538-4-srikar@linux.vnet.ibm.com
3 years agopowerpc/cacheinfo: Make cpumap_show code reusable
Srikar Dronamraju [Mon, 29 Jun 2020 10:37:02 +0000 (16:07 +0530)]
powerpc/cacheinfo: Make cpumap_show code reusable

In anticipation of implementing shared_cpu_list, move code under
shared_cpu_map_show() to a common function.

No functional changes.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200629103703.4538-3-srikar@linux.vnet.ibm.com
3 years agopowerpc/cacheinfo: Use cpumap_print to print cpumap
Srikar Dronamraju [Mon, 29 Jun 2020 10:37:01 +0000 (16:07 +0530)]
powerpc/cacheinfo: Use cpumap_print to print cpumap

Tejun Heo had modified shared_cpu_map_show() to use scnprintf instead
of cpumap_print during support for *pb[l] format. Refer commit
0c118b7bd09a ("powerpc: use %*pb[l] to print bitmaps including
cpumasks and nodemasks").

cpumap_print_to_pagebuf() is a standard function to print cpumap. With
commit 9cf79d115f0d ("bitmap: remove explicit newline handling using
scnprintf format string"), there is no need to print explicit newline
and trailing null character. cpumap_print_to_pagebuf() internally uses
scnprintf(). Hence replace scnprintf() with cpumap_print_to_pagebuf().

Note: shared_cpu_map_show() in drivers/base/cacheinfo.c already uses
cpumap_print_to_pagebuf().

Before this patch:
  # cat /sys/devices/system/cpu0/cache/index1/shared_cpu_map
  00ff

  #

(Notice the extra blank line).

After this patch:
  # cat /sys/devices/system/cpu0/cache/index1/shared_cpu_map
  00ff
  #

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200629103703.4538-2-srikar@linux.vnet.ibm.com
3 years agoocxl: control via sysfs whether the FPGA is reloaded on a link reset
Philippe Bergheaud [Fri, 19 Jun 2020 14:04:39 +0000 (16:04 +0200)]
ocxl: control via sysfs whether the FPGA is reloaded on a link reset

Some opencapi FPGA images allow to control if the FPGA should be reloaded
on the next adapter reset. If it is supported, the image specifies it
through a Vendor Specific DVSEC in the config space of function 0.

Signed-off-by: Philippe Bergheaud <felix@linux.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200619140439.153962-1-fbarrat@linux.ibm.com
3 years agoMAINTAINERS: Remove self from powerpc EEH
Sam Bobroff [Mon, 29 Jun 2020 22:50:44 +0000 (08:50 +1000)]
MAINTAINERS: Remove self from powerpc EEH

I'm sorry to say I can no longer maintain this position.

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/aec7d729c28e35c7fa9969ec50229080c771195c.1593471043.git.sbobroff@linux.ibm.com
3 years agopowerpc/xmon: Reset RCU and soft lockup watchdogs
Anton Blanchard [Tue, 30 Jun 2020 00:02:18 +0000 (10:02 +1000)]
powerpc/xmon: Reset RCU and soft lockup watchdogs

I'm seeing RCU warnings when exiting xmon. xmon resets the NMI
watchdog, but does nothing with the RCU stall or soft lockup
watchdogs. Add a helper function that handles all three.

Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200630100218.62a3c3fb@kryten.localdomain
3 years agoselftests/powerpc: Purge extra count_pmc() calls of ebb selftests
Desnes A. Nunes do Rosario [Fri, 26 Jun 2020 16:47:37 +0000 (13:47 -0300)]
selftests/powerpc: Purge extra count_pmc() calls of ebb selftests

An extra count on ebb_state.stats.pmc_count[PMC_INDEX(pmc)] is being per-
formed when count_pmc() is used to reset PMCs on a few selftests. This
extra pmc_count can occasionally invalidate results, such as the ones from
cycles_test shown hereafter. The ebb_check_count() failed with an above
the upper limit error due to the extra value on ebb_state.stats.pmc_count.

Furthermore, this extra count is also indicated by extra PMC1 trace_log on
the output of the cycle test (as well as on pmc56_overflow_test):

==========
   ...
   [21]: counter = 8
   [22]: register SPRN_MMCR0 = 0x0000000080000080
   [23]: register SPRN_PMC1  = 0x0000000080000004
   [24]: counter = 9
   [25]: register SPRN_MMCR0 = 0x0000000080000080
   [26]: register SPRN_PMC1  = 0x0000000080000004
   [27]: counter = 10
   [28]: register SPRN_MMCR0 = 0x0000000080000080
   [29]: register SPRN_PMC1  = 0x0000000080000004
>> [30]: register SPRN_PMC1  = 0x000000004000051e
PMC1 count (0x280000546) above upper limit 0x2800003e8 (+0x15e)
[FAIL] Test FAILED on line 52
failure: cycles
==========

Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200626164737.21943-1-desnesn@linux.ibm.com
3 years agopowerpc: Drop CONFIG_MTD_M25P80 in 85xx-hw.config
Bin Meng [Sat, 2 May 2020 04:44:54 +0000 (21:44 -0700)]
powerpc: Drop CONFIG_MTD_M25P80 in 85xx-hw.config

Drop CONFIG_MTD_M25P80 that was removed in
commit b35b9a10362d ("mtd: spi-nor: Move m25p80 code in spi-nor.c")

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1588394694-517-1-git-send-email-bmeng.cn@gmail.com
3 years agopowerpc/boot/dts: Fix dtc "pciex" warnings
Michael Ellerman [Tue, 23 Jun 2020 13:03:20 +0000 (23:03 +1000)]
powerpc/boot/dts: Fix dtc "pciex" warnings

With CONFIG_OF_ALL_DTBS=y, as set by eg. allmodconfig, we see lots of
warnings about our dts files, such as:

  arch/powerpc/boot/dts/glacier.dts:492.26-532.5:
  Warning (pci_bridge): /plb/pciex@d00000000: node name is not "pci"
  or "pcie"

The node name should not particularly matter, it's just a name, and
AFAICS there's no kernel code that cares whether nodes are *named*
"pciex" or "pcie". So shutup these warnings by converting to the name
dtc wants.

As always there's some risk this could break something obscure that
does rely on the name, in which case we can revert.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200623130320.405852-1-mpe@ellerman.id.au
3 years agopowerpc/boot: Use address-of operator on section symbols
Nathan Chancellor [Wed, 24 Jun 2020 03:59:20 +0000 (20:59 -0700)]
powerpc/boot: Use address-of operator on section symbols

Clang warns:

arch/powerpc/boot/main.c:107:18: warning: array comparison always
evaluates to a constant [-Wtautological-compare]
        if (_initrd_end > _initrd_start) {
                        ^
arch/powerpc/boot/main.c:155:20: warning: array comparison always
evaluates to a constant [-Wtautological-compare]
        if (_esm_blob_end <= _esm_blob_start)
                          ^
2 warnings generated.

These are not true arrays, they are linker defined symbols, which are
just addresses.  Using the address of operator silences the warning
and does not change the resulting assembly with either clang/ld.lld
or gcc/ld (tested with diff + objdump -Dr).

Reported-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200624035920.835571-1-natechancellor@gmail.com
3 years agoselftests/powerpc: Add test for execute-disabled pkeys
Sandipan Das [Thu, 4 Jun 2020 12:56:10 +0000 (18:26 +0530)]
selftests/powerpc: Add test for execute-disabled pkeys

Apart from read and write access, memory protection keys can
also be used for restricting execute permission of pages on
powerpc. This adds a test to verify if the feature works as
expected.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200604125610.649668-4-sandipan@linux.ibm.com
3 years agoselftests/powerpc: Move Hash MMU check to utilities
Sandipan Das [Thu, 4 Jun 2020 12:56:09 +0000 (18:26 +0530)]
selftests/powerpc: Move Hash MMU check to utilities

This moves a function to test if the MMU is in Hash mode
under the generic test utilities.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200604125610.649668-3-sandipan@linux.ibm.com
3 years agoselftests/powerpc: Fix pkey access right updates
Sandipan Das [Thu, 4 Jun 2020 12:56:08 +0000 (18:26 +0530)]
selftests/powerpc: Fix pkey access right updates

The Power ISA mandates that all writes to the Authority
Mask Register (AMR) must always be preceded as well as
succeeded by a context synchronizing instruction.

This makes sure that the tests follow this requirement
when attempting to update a pkey's access rights.

Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200604125610.649668-2-sandipan@linux.ibm.com
3 years agopowerpc/8xx: Modify ptep_get()
Christophe Leroy [Thu, 18 Jun 2020 12:07:46 +0000 (12:07 +0000)]
powerpc/8xx: Modify ptep_get()

Move ptep_get() close to pte_update(), in an ifdef section already
dedicated to powerpc 8xx. This section contains explanation about
the layout of page table entries.

Also modify it to return 4 times the pte value instead of padding
with zeroes.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/9f2df6621fcaf9eba15fadc61c169d0c8e2fb849.1592481938.git.christophe.leroy@csgroup.eu
3 years agopowerpc/mm/book3s64: Skip 16G page reservation with radix
Aneesh Kumar K.V [Mon, 22 Jun 2020 06:40:19 +0000 (12:10 +0530)]
powerpc/mm/book3s64: Skip 16G page reservation with radix

With hash translation, the hypervisor can hint the LPAR about 16GB contiguous range
via ibm,expected#pages. The kernel marks the range specified in the device tree
as reserved. Avoid doing this when using radix translation. Radix translation
only supports 1G gigantic hugepage and kernel can do the 1G gigantic hugepage
allocation via early memblock reservation. This can be done because with radix
translation pages are not required to be contiguous on the host.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200622064019.16682-1-aneesh.kumar@linux.ibm.com
3 years agopowerpc/4xx: ppc4xx compile flag optimizations
Imre Kaloz [Thu, 22 Dec 2016 08:06:08 +0000 (09:06 +0100)]
powerpc/4xx: ppc4xx compile flag optimizations

This patch splits up the compile flags between ppc40x and ppc44x.

Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: Imre Kaloz <kaloz@openwrt.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1482393968-60623-1-git-send-email-john@phrozen.org
3 years agopowerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256k
Christophe Leroy [Mon, 15 Jun 2020 07:48:25 +0000 (07:48 +0000)]
powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256k

FIX_EARLY_DEBUG_BASE reserves a 128k area for debuging.

When page size is 256k, the calculation results in a 0 number of
pages, leading to the following failure:

  CC      arch/powerpc/kernel/asm-offsets.s
In file included from ./arch/powerpc/include/asm/nohash/32/pgtable.h:77:0,
                 from ./arch/powerpc/include/asm/nohash/pgtable.h:8,
                 from ./arch/powerpc/include/asm/pgtable.h:20,
                 from ./include/linux/pgtable.h:6,
                 from ./arch/powerpc/include/asm/kup.h:42,
                 from ./arch/powerpc/include/asm/uaccess.h:9,
                 from ./include/linux/uaccess.h:11,
                 from ./include/linux/crypto.h:21,
                 from ./include/crypto/hash.h:11,
                 from ./include/linux/uio.h:10,
                 from ./include/linux/socket.h:8,
                 from ./include/linux/compat.h:15,
                 from arch/powerpc/kernel/asm-offsets.c:14:
./arch/powerpc/include/asm/fixmap.h:75:2: error: overflow in enumeration values
  __end_of_permanent_fixed_addresses,
  ^
make[2]: *** [arch/powerpc/kernel/asm-offsets.s] Error 1

Ensure the debug area is at least one page.

Fixes: b8e8efaa8639 ("powerpc: reserve fixmap entries for early debug")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ca8c9f8249f523b1fab873e67b81b11989d46553.1592207216.git.christophe.leroy@csgroup.eu
3 years agoselftests/powerpc: Add prefixed loads/stores to alignment_handler test
Jordan Niethe [Wed, 20 May 2020 02:11:03 +0000 (12:11 +1000)]
selftests/powerpc: Add prefixed loads/stores to alignment_handler test

Extend the alignment handler selftest to exercise prefixed load store
instructions. Add tests for prefixed VSX, floating point and integer
instructions.

Skip prefix tests if ISA version does not support prefixed instructions.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Tested-by: Alistair Popple <alistair@popple.id.au>
[mpe: Fixup PPC_FEATURE2_ARCH_3_1 naming as noted by Alistair]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200520021103.19798-2-jniethe5@gmail.com
3 years agoselftests/powerpc: Allow choice of CI memory location in alignment_handler test
Jordan Niethe [Wed, 20 May 2020 02:11:02 +0000 (12:11 +1000)]
selftests/powerpc: Allow choice of CI memory location in alignment_handler test

The alignment handler selftest needs cache-inhibited memory and
currently /dev/fb0 is relied on to provided this. This prevents running
the test on systems without /dev/fb0 (e.g., mambo). Read the commandline
arguments for an optional path to be used instead, as well as an
optional offset to be for mmaping this path.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Tested-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200520021103.19798-1-jniethe5@gmail.com
3 years agopowerpc/powernv/ioda: Return correct error if TCE level allocation failed
Alexey Kardashevskiy [Wed, 17 Jun 2020 00:38:35 +0000 (10:38 +1000)]
powerpc/powernv/ioda: Return correct error if TCE level allocation failed

The iommu_table_ops::xchg_no_kill() callback updates TCE. It is quite
possible that not entire table is allocated if it is huge and multilevel
so xchg may also allocate subtables. If failed, it returns H_HARDWARE
for failed allocation and H_TOO_HARD if it needs it but cannot do because
the alloc parameter is "false" (set when called with MMU=off to force
retry with MMU=on).

The problem is that having separate errors only matters in real mode
(MMU=off) but the only caller with alloc="false" does not check the exact
error code and simply returns H_TOO_HARD; and for every other mode
alloc is "true". Also, the function is also called from the ioctl()
handler of the VFIO SPAPR TCE IOMMU subdriver which does not expect
hypervisor error codes (H_xxx) and will expose them to the userspace.

This converts wrong error codes to -ENOMEM.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200617003835.48831-1-aik@ozlabs.ru
3 years agopowerpc/pseries/svm: Drop unused align argument in alloc_shared_lppaca() function
Satheesh Rajendran [Fri, 12 Jun 2020 14:29:53 +0000 (19:59 +0530)]
powerpc/pseries/svm: Drop unused align argument in alloc_shared_lppaca() function

Argument "align" in alloc_shared_lppaca() was unused inside the
function. Let's drop it and update code comment for page alignment.

Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
[mpe: Massage comment wording/formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612142953.135408-1-sathnaga@linux.vnet.ibm.com
3 years agopowerpc/ptdump: Fix build failure in hashpagetable.c
Christophe Leroy [Mon, 15 Jun 2020 13:18:39 +0000 (13:18 +0000)]
powerpc/ptdump: Fix build failure in hashpagetable.c

H_SUCCESS is only defined when CONFIG_PPC_PSERIES is defined.

!= H_SUCCESS means != 0. Modify the test accordingly.

Fixes: 65e701b2d2a8 ("powerpc/ptdump: drop non vital #ifdefs")
Cc: stable@vger.kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/795158fc1d2b3dff3bf7347881947a887ea9391a.1592227105.git.christophe.leroy@csgroup.eu
3 years agopowerpc/mm: Fix typo in IS_ENABLED()
Joe Perches [Fri, 5 Jun 2020 14:18:06 +0000 (07:18 -0700)]
powerpc/mm: Fix typo in IS_ENABLED()

IS_ENABLED() matches names exactly, so the missing "CONFIG_" prefix
means this code would never be built.

Also fixes a missing newline in pr_warn().

Fixes: 970d54f99cea ("powerpc/book3s64/hash: Disable 16M linear mapping size if not aligned")
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/202006050717.A2F9809E@keescook
3 years agopowerpc/xive: Ignore kmemleak false positives
Alexey Kardashevskiy [Fri, 12 Jun 2020 04:33:03 +0000 (14:33 +1000)]
powerpc/xive: Ignore kmemleak false positives

xive_native_provision_pages() allocates memory and passes the pointer to
OPAL so kmemleak cannot find the pointer usage in the kernel memory and
produces a false positive report (below) (even if the kernel did scan
OPAL memory, it is unable to deal with __pa() addresses anyway).

This silences the warning.

unreferenced object 0xc000200350c40000 (size 65536):
  comm "qemu-system-ppc", pid 2725, jiffies 4294946414 (age 70776.530s)
  hex dump (first 32 bytes):
    02 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00  ....P...........
    01 00 08 07 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<0000000081ff046c>] xive_native_alloc_vp_block+0x120/0x250
    [<00000000d555d524>] kvmppc_xive_compute_vp_id+0x248/0x350 [kvm]
    [<00000000d69b9c9f>] kvmppc_xive_connect_vcpu+0xc0/0x520 [kvm]
    [<000000006acbc81c>] kvm_arch_vcpu_ioctl+0x308/0x580 [kvm]
    [<0000000089c69580>] kvm_vcpu_ioctl+0x19c/0xae0 [kvm]
    [<00000000902ae91e>] ksys_ioctl+0x184/0x1b0
    [<00000000f3e68bd7>] sys_ioctl+0x48/0xb0
    [<0000000001b2c127>] system_call_exception+0x124/0x1f0
    [<00000000d2b2ee40>] system_call_common+0xe8/0x214

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200612043303.84894-1-aik@ozlabs.ru
3 years agopowerpc/configs: Remove CMDLINE_BOOL
Chris Packham [Thu, 11 Jun 2020 22:42:20 +0000 (10:42 +1200)]
powerpc/configs: Remove CMDLINE_BOOL

Regenerate defconfigs to remove CONFIG_CMDLINE_BOOL and the default
CONFIG_CMDLINE where applicable.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200611224220.25066-3-chris.packham@alliedtelesis.co.nz
3 years agopowerpc: Remove inaccessible CMDLINE default
Chris Packham [Thu, 11 Jun 2020 22:42:19 +0000 (10:42 +1200)]
powerpc: Remove inaccessible CMDLINE default

Since commit cbe46bd4f510 ("powerpc: remove CONFIG_CMDLINE #ifdef mess")
CONFIG_CMDLINE has always had a value regardless of CONFIG_CMDLINE_BOOL.

For example:

 $ make ARCH=powerpc defconfig
 $ cat .config
 # CONFIG_CMDLINE_BOOL is not set
 CONFIG_CMDLINE=""

When enabling CONFIG_CMDLINE_BOOL this value is kept making the 'default
"..." if CONFIG_CMDLINE_BOOL' ineffective.

 $ ./scripts/config --enable CONFIG_CMDLINE_BOOL
 $ cat .config
 CONFIG_CMDLINE_BOOL=y
 CONFIG_CMDLINE=""

Remove CONFIG_CMDLINE_BOOL and the inaccessible default.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200611224220.25066-2-chris.packham@alliedtelesis.co.nz
3 years agopowerpc/dt_cpu_ftrs: Make use of macro ISA_V3_1
Murilo Opsfelder Araujo [Wed, 10 Jun 2020 21:51:14 +0000 (18:51 -0300)]
powerpc/dt_cpu_ftrs: Make use of macro ISA_V3_1

Macro ISA_V3_1 was defined but never used.  Use it instead of literal.

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200610215114.167544-4-muriloo@linux.ibm.com
3 years agopowerpc/dt_cpu_ftrs: Make use of macro ISA_V3_0B
Murilo Opsfelder Araujo [Wed, 10 Jun 2020 21:51:13 +0000 (18:51 -0300)]
powerpc/dt_cpu_ftrs: Make use of macro ISA_V3_0B

Macro ISA_V3_0B was defined but never used.  Use it instead of
literal.

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200610215114.167544-3-muriloo@linux.ibm.com
3 years agopowerpc/dt_cpu_ftrs: Remove unused macro ISA_V2_07B
Murilo Opsfelder Araujo [Wed, 10 Jun 2020 21:51:12 +0000 (18:51 -0300)]
powerpc/dt_cpu_ftrs: Remove unused macro ISA_V2_07B

Macro ISA_V2_07B is defined but not used anywhere else in the code.

Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200610215114.167544-2-muriloo@linux.ibm.com
3 years agopowerpc/64: indirect function call use bctrl rather than blrl in ret_from_kernel_thread
Nicholas Piggin [Thu, 11 Jun 2020 12:11:19 +0000 (22:11 +1000)]
powerpc/64: indirect function call use bctrl rather than blrl in ret_from_kernel_thread

blrl is not recommended to use as an indirect function call, as it may
corrupt the link stack predictor.

This is not a performance critical path but this should be fixed for
consistency.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200611121119.1015740-1-npiggin@gmail.com
3 years agoLinux 5.8-rc2
Linus Torvalds [Sun, 21 Jun 2020 22:45:29 +0000 (15:45 -0700)]
Linux 5.8-rc2

3 years agoMerge tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Jun 2020 22:41:24 +0000 (15:41 -0700)]
Merge tag 'selinux-pr-20200621' of git://git./linux/kernel/git/pcmoore/selinux

Pull SELinux fixes from Paul Moore:
 "Three small patches to fix problems in the SELinux code, all found via
  clang.

  Two patches fix potential double-free conditions and one fixes an
  undefined return value"

* tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
  selinux: fix undefined return of cond_evaluate_expr
  selinux: fix a double free in cond_read_node()/cond_read_list()
  selinux: fix double free

3 years agoMerge tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Sun, 21 Jun 2020 20:04:57 +0000 (13:04 -0700)]
Merge tag 'pinctrl-v5.8-2' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Some early fixes collected during the first week after the merge
  window, all pretty self-evident, with the details below. The revert is
  the crucial thing.

   - Fix a warning on the Qualcomm SPMI GPIO chip being instatiated
     twice without a unique irqchip struct

   - Use the noirq variants of the suspend and resume callbacks in the
     Tegra driver

   - Clean up the errorpath on the MCP23s08 driver

   - Revert the use of devm_of_iomap() in the Freescale driver as it was
     regressing the platform

   - Add some missing pins in the Qualcomm IPQ6018 driver

   - Fix a simple documentation bug in the pinctrl-single driver"

* tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: single: fix function name in documentation
  pinctrl: qcom: ipq6018 Add missing pins in qpic pin group
  Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'"
  pinctrl: mcp23s08: Split to three parts: fix ptr_ret.cocci warnings
  pinctrl: tegra: Use noirq suspend/resume callbacks
  pinctrl: qcom: spmi-gpio: fix warning about irq chip reusage

3 years agoMerge tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahi...
Linus Torvalds [Sun, 21 Jun 2020 19:44:52 +0000 (12:44 -0700)]
Merge tag 'kbuild-fixes-v5.8' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - fix -gz=zlib compiler option test for CONFIG_DEBUG_INFO_COMPRESSED

 - improve cc-option in scripts/Kbuild.include to clean up temp files

 - improve cc-option in scripts/Kconfig.include for more reliable
   compile option test

 - do not copy modules.builtin by 'make install' because it would break
   existing systems

 - use 'userprogs' syntax for watch_queue sample

* tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  samples: watch_queue: build sample program for target architecture
  Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n"
  scripts: Fix typo in headers_install.sh
  kconfig: unify cc-option and as-option
  kbuild: improve cc-option to clean up all temporary files
  Makefile: Improve compressed debug info support detection

3 years agoMerge tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 21 Jun 2020 17:02:53 +0000 (10:02 -0700)]
Merge tag 'powerpc-5.8-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - One fix for the interrupt rework we did last release which broke
   KVM-PR

 - Three commits fixing some fallout from the READ_ONCE() changes
   interacting badly with our 8xx 16K pages support, which uses a pte_t
   that is a structure of 4 actual PTEs

 - A cleanup of the 8xx pte_update() to use the newly added pmd_off()

 - A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is
   enabled

 - A minor fix for the SPU syscall generation

Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike
Rapoport, Nicholas Piggin.

* tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/8xx: Provide ptep_get() with 16k pages
  mm: Allow arches to provide ptep_get()
  mm/gup: Use huge_ptep_get() in gup_hugepte()
  powerpc/syscalls: Use the number when building SPU syscall table
  powerpc/8xx: use pmd_off() to access a PMD entry in pte_update()
  powerpc/64s: Fix KVM interrupt using wrong save area
  powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL

3 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Sun, 21 Jun 2020 17:01:03 +0000 (10:01 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:

 - NULL dereference in octeontx

 - PM reference imbalance in ks-sa

 - deadlock in crypto manager

 - memory leak in drbg

 - missing socket limit check on receive SG list size in algif_skcipher

 - typos in caam

 - warnings in ccp and hisilicon

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: drbg - always try to free Jitter RNG instance
  crypto: marvell/octeontx - Fix a potential NULL dereference
  crypto: algboss - don't wait during notifier callback
  crypto: caam - fix typos
  crypto: ccp - Fix sparse warnings in sev-dev
  crypto: hisilicon - Cap block size at 2^31
  crypto: algif_skcipher - Cap recv SG list at ctx->used
  hwrng: ks-sa - Fix runtime PM imbalance on error

3 years agosamples: watch_queue: build sample program for target architecture
Masahiro Yamada [Wed, 17 Jun 2020 02:08:38 +0000 (11:08 +0900)]
samples: watch_queue: build sample program for target architecture

This userspace program includes UAPI headers exported to usr/include/.
'make headers' always works for the target architecture (i.e. the same
architecture as the kernel), so the sample program should be built for
the target as well. Kbuild now supports 'userprogs' for that.

I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
$(CC) may not provide libc.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 years agoRevert "Makefile: install modules.builtin even if CONFIG_MODULES=n"
Masahiro Yamada [Fri, 19 Jun 2020 15:09:55 +0000 (00:09 +0900)]
Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n"

This reverts commit e0b250b57dcf403529081e5898a9de717f96b76b,
which broke build systems that need to install files to a certain
path, but do not set INSTALL_MOD_PATH when invoking 'make install'.

  $ make INSTALL_PATH=/tmp/destdir install
  mkdir: cannot create directory ‘/lib/modules/5.8.0-rc1+/’: Permission denied
  Makefile:1342: recipe for target '_builtin_inst_' failed
  make: *** [_builtin_inst_] Error 1

While modules.builtin is useful also for CONFIG_MODULES=n, this change
in the behavior is quite unexpected. Maybe "make modules_install"
can install modules.builtin irrespective of CONFIG_MODULES as Jonas
originally suggested.

Anyway, that commit should be reverted ASAP.

Reported-by: Douglas Anderson <dianders@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Jonas Karlman <jonas@kwiboo.se>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
3 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sun, 21 Jun 2020 02:23:13 +0000 (19:23 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "One minor fix and two patches reworking the ata dma drain for the
  !CONFIG_LIBATA case. The latter is a 5.7 regression fix"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers
  scsi: libata: Provide an ata_scsi_dma_need_drain stub for !CONFIG_ATA
  scsi: ufs-bsg: Fix runtime PM imbalance on error

3 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 21 Jun 2020 02:18:27 +0000 (19:18 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - a small collection of remaining API conversion patches (all acked)
   which allow to finally remove the deprecated API

 - some documentation fixes and a MAINTAINERS addition

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: Add robert and myself as qcom i2c cci maintainers
  i2c: smbus: Fix spelling mistake in the comments
  Documentation/i2c: SMBus start signal is S not A
  i2c: remove deprecated i2c_new_device API
  Documentation: media: convert to use i2c_new_client_device()
  video: backlight: tosa_lcd: convert to use i2c_new_client_device()
  x86/platform/intel-mid: convert to use i2c_new_client_device()
  drm: encoder_slave: use new I2C API
  drm: encoder_slave: fix refcouting error for modules

3 years agopinctrl: single: fix function name in documentation
Drew Fustini [Fri, 12 Jun 2020 11:27:58 +0000 (13:27 +0200)]
pinctrl: single: fix function name in documentation

Use the correct the function name in the documentation for
"pcs_parse_one_pinctrl_entry()".

"smux_parse_one_pinctrl_entry()" appears to be an artifact from the
development of a prior patch series ("simple pinmux driver") which
transformed into pinctrl-single.

Signed-off-by: Drew Fustini <drew@beagleboard.org>
Link: https://lore.kernel.org/r/20200612112758.GA3407886@x1
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
3 years agoMerge tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Sat, 20 Jun 2020 20:17:47 +0000 (13:17 -0700)]
Merge tag 'trace-v5.8-rc1' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Have recordmcount work with > 64K sections (to support LTO)

 - kprobe RCU fixes

 - Correct a kprobe critical section with missing mutex

 - Remove redundant arch_disarm_kprobe() call

 - Fix lockup when kretprobe triggers within kprobe_flush_task()

 - Fix memory leak in fetch_op_data operations

 - Fix sleep in atomic in ftrace trace array sample code

 - Free up memory on failure in sample trace array code

 - Fix incorrect reporting of function_graph fields in format file

 - Fix quote within quote parsing in bootconfig

 - Fix return value of bootconfig tool

 - Add testcases for bootconfig tool

 - Fix maybe uninitialized warning in ftrace pid file code

 - Remove unused variable in tracing_iter_reset()

 - Fix some typos

* tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix maybe-uninitialized compiler warning
  tools/bootconfig: Add testcase for show-command and quotes test
  tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig
  tools/bootconfig: Fix to use correct quotes for value
  proc/bootconfig: Fix to use correct quotes for value
  tracing: Remove unused event variable in tracing_iter_reset
  tracing/probe: Fix memleak in fetch_op_data operations
  trace: Fix typo in allocate_ftrace_ops()'s comment
  tracing: Make ftrace packed events have align of 1
  sample-trace-array: Remove trace_array 'sample-instance'
  sample-trace-array: Fix sleeping function called from invalid context
  kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
  kprobes: Remove redundant arch_disarm_kprobe() call
  kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex
  kprobes: Use non RCU traversal APIs on kprobe_tables if possible
  kprobes: Suppress the suspicious RCU warning on kprobes
  recordmcount: support >64k sections

3 years agoMerge tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 20 Jun 2020 20:13:21 +0000 (13:13 -0700)]
Merge tag 'libnvdimm-for-5.8-rc2' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "A feature (papr_scm health retrieval) and a fix (sysfs attribute
  visibility) for v5.8.

  Vaibhav explains in the merge commit below why missing v5.8 would be
  painful and I agreed to try a -rc2 pull because only cosmetics kept
  this out of -rc1 and his initial versions were posted in more than
  enough time for v5.8 consideration:

   'These patches are tied to specific features that were committed to
    customers in upcoming distros releases (RHEL and SLES) whose
    time-lines are tied to 5.8 kernel release.

    Being able to track the health of an nvdimm is critical for our
    customers that are running workloads leveraging papr-scm nvdimms.
    Missing the 5.8 kernel would mean missing the distro timelines and
    shifting forward the availability of this feature in distro kernels
    by at least 6 months'

  Summary:

   - Fix the visibility of the region 'align' attribute.

     The new unit tests for region alignment handling caught a corner
     case where the alignment cannot be specified if the region is
     converted from static to dynamic provisioning at runtime.

   - Add support for device health retrieval for the persistent memory
     supported by the papr_scm driver.

     This includes both the standard sysfs "health flags" that the nfit
     persistent memory driver publishes and a mechanism for the ndctl
     tool to retrieve a health-command payload"

* tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm/region: always show the 'align' attribute
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  seq_buf: Export seq_buf_printf
  powerpc: Document details on H_SCM_HEALTH hcall

3 years agopinctrl: qcom: ipq6018 Add missing pins in qpic pin group
Sivaprakash Murugesan [Fri, 19 Jun 2020 04:31:29 +0000 (10:01 +0530)]
pinctrl: qcom: ipq6018 Add missing pins in qpic pin group

The patch adds missing qpic data pins to qpic pingroup. These pins are
necessary for the qpic nand to work.

Fixes: ef1ea54eab0e ("pinctrl: qcom: Add ipq6018 pinctrl driver")
Signed-off-by: Sivaprakash Murugesan <sivaprak@codeaurora.org>
Link: https://lore.kernel.org/r/1592541089-17700-1-git-send-email-sivaprak@codeaurora.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
3 years agoRevert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in...
Haibo Chen [Tue, 9 Jun 2020 03:27:03 +0000 (11:27 +0800)]
Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'"

This reverts commit ba403242615c2c99e27af7984b1650771a2cc2c9.

After commit 26d8cde5260b ("pinctrl: freescale: imx: add shared
input select reg support"). i.MX7D has two iomux controllers
iomuxc and iomuxc-lpsr which share select_input register for
daisy chain settings.
If use 'devm_of_iomap()', when probe the iomuxc-lpsr, will call
devm_request_mem_region() for the region <0x30330000-0x3033ffff>
for the first time. Then, next time when probe the iomuxc, API
devm_platform_ioremap_resource() will also use the API
devm_request_mem_region() for the share region <0x30330000-0x3033ffff>
again, then cause issue, log like below:

[    0.179561] imx7d-pinctrl 302c0000.iomuxc-lpsr: initialized IMX pinctrl driver
[    0.191742] imx7d-pinctrl 30330000.pinctrl: can't request region for resource [mem 0x30330000-0x3033ffff]
[    0.191842] imx7d-pinctrl: probe of 30330000.pinctrl failed with error -16

Fixes: ba403242615c ("pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Link: https://lore.kernel.org/r/1591673223-1680-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
3 years agoMerge tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sat, 20 Jun 2020 19:31:08 +0000 (12:31 -0700)]
Merge tag 's390-5.8-2' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - a few ptrace fixes mostly for strace and seccomp_bpf kernel tests
   findings

 - cleanup unused pm callbacks in virtio ccw

 - replace kmalloc + memset with kzalloc in crypto

 - use $(LD) for vDSO linkage to make clang happy

 - fix vDSO clock_getres() to preserve the same behaviour as
   posix_get_hrtimer_res()

 - fix workqueue cpumask warning when NUMA=n and nr_node_ids=2

 - reduce SLSB writes during input processing, improve warnings and
   cleanup qdio_data usage in qdio

 - a few fixes to use scnprintf() instead of snprintf()

* tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: fix syscall_get_error for compat processes
  s390/qdio: warn about unexpected SLSB states
  s390/qdio: clean up usage of qdio_data
  s390/numa: let NODES_SHIFT depend on NEED_MULTIPLE_NODES
  s390/vdso: fix vDSO clock_getres()
  s390/vdso: Use $(LD) instead of $(CC) to link vDSO
  s390/protvirt: use scnprintf() instead of snprintf()
  s390: use scnprintf() in sys_##_prefix##_##_name##_show
  s390/crypto: use scnprintf() instead of snprintf()
  s390/zcrypt: use kzalloc
  s390/virtio: remove unused pm callbacks
  s390/qdio: reduce SLSB writes during Input Queue processing
  selftests/seccomp: s390 shares the syscall and return value register
  s390/ptrace: fix setting syscall number
  s390/ptrace: pass invalid syscall numbers to tracing
  s390/ptrace: return -ENOSYS when invalid syscall is supplied
  s390/seccomp: pass syscall arguments via seccomp_data
  s390/qdio: fine-tune SLSB update

3 years agoMerge tag 'riscv-for-linus-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jun 2020 19:14:29 +0000 (12:14 -0700)]
Merge tag 'riscv-for-linus-5.8-rc2' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - a workaround for a compiler surprise related to the "r" inline
   assembly that allows LLVM to boot.

 - a fix to avoid WX-only mappings, which the ISA does not allow. While
   this probably manifests in many ways, the bug was found in stress-ng.

 - a missing lock in set_direct_map_*(), which due to a recent lockdep
   change started asserting.

* tag 'riscv-for-linus-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Acquire mmap lock before invoking walk_page_range
  RISC-V: Don't allow write+exec only page mapping request in mmap
  riscv/atomic: Fix sign extension for RV64I

3 years agoMerge tag 'linux-kselftest-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Jun 2020 19:10:09 +0000 (12:10 -0700)]
Merge tag 'linux-kselftest-5.8-rc2' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kselftest cleanups from Shuah Khan:

 - ftrace "requires:" list for simplifying and unifying requirement
   checks for each test case, adding "requires:" line instead of
   checking required ftrace interfaces in each test case.

 - a minor spelling correction patch

* tag 'linux-kselftest-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/ftrace: Support ":README" suffix for requires
  selftests/ftrace: Support ":tracer" suffix for requires
  selftests/ftrace: Convert check_filter_file() with requires list
  selftests/ftrace: Convert required interface checks into requires list
  selftests/ftrace: Add "requires:" list support
  selftests/ftrace: Return unsupported for the unconfigured features
  selftests/ftrace: Allow ":" in description
  tools: testing: ftrace: trigger: fix spelling mistake

3 years agoafs: Fix hang on rmmod due to outstanding timer
David Howells [Fri, 19 Jun 2020 22:39:36 +0000 (23:39 +0100)]
afs: Fix hang on rmmod due to outstanding timer

The fileserver probe timer, net->fs_probe_timer, isn't cancelled when
the kafs module is being removed and so the count it holds on
net->servers_outstanding doesn't get dropped..

This causes rmmod to wait forever.  The hung process shows a stack like:

afs_purge_servers+0x1b5/0x23c [kafs]
afs_net_exit+0x44/0x6e [kafs]
ops_exit_list+0x72/0x93
unregister_pernet_operations+0x14c/0x1ba
unregister_pernet_subsys+0x1d/0x2a
afs_exit+0x29/0x6f [kafs]
__do_sys_delete_module.isra.0+0x1a2/0x24b
do_syscall_64+0x51/0x95
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fix this by:

 (1) Attempting to cancel the probe timer and, if successful, drop the
     count that the timer was holding.

 (2) Make the timer function just drop the count and not schedule the
     prober if the afs portion of net namespace is being destroyed.

Also, whilst we're at it, make the following changes:

 (3) Initialise net->servers_outstanding to 1 and decrement it before
     waiting on it so that it doesn't generate wake up events by being
     decremented to 0 until we're cleaning up.

 (4) Switch the atomic_dec() on ->servers_outstanding for ->fs_timer in
     afs_purge_servers() to use the helper function for that.

Fixes: f6cbb368bcb0 ("afs: Actively poll fileservers to maintain NAT or firewall openings")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoafs: Fix afs_do_lookup() to call correct fetch-status op variant
David Howells [Thu, 18 Jun 2020 23:01:28 +0000 (00:01 +0100)]
afs: Fix afs_do_lookup() to call correct fetch-status op variant

Fix afs_do_lookup()'s fallback case for when FS.InlineBulkStatus isn't
supported by the server.

In the fallback, it calls FS.FetchStatus for the specific vnode it's
meant to be looking up.  Commit b6489a49f7b7 broke this by renaming one
of the two identically-named afs_fetch_status_operation descriptors to
something else so that one of them could be made non-static.  The site
that used the renamed one, however, wasn't renamed and didn't produce
any warning because the other was declared in a header.

Fix this by making afs_do_lookup() use the renamed variant.

Note that there are two variants of the success method because one is
called from ->lookup() where we may or may not have an inode, but can't
call iget until after we've talked to the server - whereas the other is
called from within iget where we have an inode, but it may or may not be
initialised.

The latter variant expects there to be an inode, but because it's being
called from there former case, there might not be - resulting in an oops
like the following:

  BUG: kernel NULL pointer dereference, address: 00000000000000b0
  ...
  RIP: 0010:afs_fetch_status_success+0x27/0x7e
  ...
  Call Trace:
    afs_wait_for_operation+0xda/0x234
    afs_do_lookup+0x2fe/0x3c1
    afs_lookup+0x3c5/0x4bd
    __lookup_slow+0xcd/0x10f
    walk_component+0xa2/0x10c
    path_lookupat.isra.0+0x80/0x110
    filename_lookup+0x81/0x104
    vfs_statx+0x76/0x109
    __do_sys_newlstat+0x39/0x6b
    do_syscall_64+0x4c/0x78
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: b6489a49f7b7 ("afs: Fix silly rename")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agopowerpc/8xx: Provide ptep_get() with 16k pages
Christophe Leroy [Mon, 15 Jun 2020 12:57:59 +0000 (12:57 +0000)]
powerpc/8xx: Provide ptep_get() with 16k pages

READ_ONCE() now enforces atomic read, which leads to:

  CC      mm/gup.o
In file included from ./include/linux/kernel.h:11:0,
                 from mm/gup.c:2:
In function 'gup_hugepte.constprop',
    inlined from 'gup_huge_pd.isra.79' at mm/gup.c:2465:8:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_222' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
    prefix ## suffix();    \
    ^
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
  ^
./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
  compiletime_assert_rwonce_type(x);    \
  ^
mm/gup.c:2428:8: note: in expansion of macro 'READ_ONCE'
  pte = READ_ONCE(*ptep);
        ^
In function 'gup_get_pte',
    inlined from 'gup_pte_range' at mm/gup.c:2228:9,
    inlined from 'gup_pmd_range' at mm/gup.c:2613:15,
    inlined from 'gup_pud_range' at mm/gup.c:2641:15,
    inlined from 'gup_p4d_range' at mm/gup.c:2666:15,
    inlined from 'gup_pgd_range' at mm/gup.c:2694:15,
    inlined from 'internal_get_user_pages_fast' at mm/gup.c:2795:3:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_219' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                      ^
./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
    prefix ## suffix();    \
    ^
./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
  ^
./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
  compiletime_assert_rwonce_type(x);    \
  ^
mm/gup.c:2199:9: note: in expansion of macro 'READ_ONCE'
  return READ_ONCE(*ptep);
         ^
make[2]: *** [mm/gup.o] Error 1

Define ptep_get() on 8xx when using 16k pages.

Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/341688399c1b102756046d19ea6ce39db1ae4742.1592225558.git.christophe.leroy@csgroup.eu
3 years agomm: Allow arches to provide ptep_get()
Christophe Leroy [Mon, 15 Jun 2020 12:57:58 +0000 (12:57 +0000)]
mm: Allow arches to provide ptep_get()

Since commit 9e343b467c70 ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses") it is not possible anymore to
use READ_ONCE() to access complex page table entries like the one
defined for powerpc 8xx with 16k size pages.

Define a ptep_get() helper that architectures can override instead
of performing a READ_ONCE() on the page table entry pointer.

Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/087fa12b6e920e32315136b998aa834f99242695.1592225558.git.christophe.leroy@csgroup.eu
3 years agomm/gup: Use huge_ptep_get() in gup_hugepte()
Christophe Leroy [Mon, 15 Jun 2020 12:57:57 +0000 (12:57 +0000)]
mm/gup: Use huge_ptep_get() in gup_hugepte()

gup_hugepte() reads hugepage table entries, it can't read
them directly, huge_ptep_get() must be used.

Fixes: 9e343b467c70 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/ffc3714334c3bfaca6f13788ad039e8759ae413f.1592225558.git.christophe.leroy@csgroup.eu
3 years agoMerge branch 'for-5.8/papr_scm' into libnvdimm-for-next
Dan Williams [Fri, 19 Jun 2020 21:18:51 +0000 (14:18 -0700)]
Merge branch 'for-5.8/papr_scm' into libnvdimm-for-next

Include the papr_scm health retrieval feature for v5.8-rc2. The
functionality was initially posted well in advance of the merge window,
but review comments and a late build-bot warning kept them out of the
v5.8-rc1 libnvdimm pull request.

Vaibhav notes:
These patches are tied to specific features that were committed to
customers in upcoming distros releases (RHEL and SLES) whose time-lines
are tied to 5.8 kernel release.

Being able to track the health of an nvdimm is critical for our
customers that are running workloads leveraging papr-scm nvdimms.
Missing the 5.8 kernel would mean missing the distro timelines and
shifting forward the availability of this feature in distro kernels by
at least 6 months.