linux-2.6-microblaze.git
9 months ago powerpc/83xx: Fix style problems in usb.c and remove unnecessary includes from mpc83xx.h
Christophe Leroy [Wed, 16 Aug 2023 15:22:16 +0000 (17:22 +0200)]
powerpc/83xx: Fix style problems in usb.c and remove unnecessary includes from mpc83xx.h

Replace printk(KERN_WARN with pr_warn(

Remove a couple of blank lines

Re-align multi-line code.

Replace asm/io.h by linux/io.h

mpc83xx.h doesn't need linux/device.h or asm/pci-bridge.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/2cb498f637e082a4af8032311fad3cae84d6aa5d.1692199324.git.christophe.leroy@csgroup.eu
9 months ago powerpc/fsl_pci: Make fsl_add_bridge() static
Christophe Leroy [Wed, 16 Aug 2023 15:19:54 +0000 (17:19 +0200)]
powerpc/fsl_pci: Make fsl_add_bridge() static

Since commit 905e75c46dba ("powerpc/fsl-pci: Unify pci/pcie initialization code")
fsl_add_bridge() is not used anymore outside of fsl_pci.c

Make it static.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/2115e3597d81e72a865820af54f0e290d0fd2b3a.1692199186.git.christophe.leroy@csgroup.eu
9 months ago powerpc/512x: Make mpc512x_select_reset_compat() static
Christophe Leroy [Fri, 18 Aug 2023 06:51:48 +0000 (08:51 +0200)]
powerpc/512x: Make mpc512x_select_reset_compat() static

mpc512x_select_reset_compat() is only used in the file in which it
is defined.

Make it static.

Move mpc512x_restart_init() after mpc512x_select_reset_compat().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/36a19e13025dbf17e92e832dd24150642b0e9bad.1692341499.git.christophe.leroy@csgroup.eu
9 months ago powerpc/ps3: refactor strncpy usage
Justin Stitt [Wed, 16 Aug 2023 21:39:24 +0000 (21:39 +0000)]
powerpc/ps3: refactor strncpy usage

`strncpy` is deprecated for use on NUL-terminated destination strings [1].

`make_first_field()` should use an implementation similar to `make_field()`,
because memcpy has more obvious behavior here. The end result yields
the same behavior as the previous `strncpy`-based implementation
including the NUL-padding.
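
A minimal user-space sketch of the equivalence claimed above, assuming an
8-byte field packed into a 64-bit integer. The names bounded_len(),
pack_field_strncpy() and pack_field_memcpy() are hypothetical and are not
the kernel's make_first_field(); bounded_len() stands in for strnlen().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Length of text, capped at max (hypothetical stand-in for strnlen()). */
static size_t bounded_len(const char *text, size_t max)
{
	size_t len = 0;

	while (len < max && text[len] != '\0')
		len++;
	return len;
}

/* Old style: strncpy() copies at most 8 bytes and NUL-pads the remainder. */
static uint64_t pack_field_strncpy(const char *text)
{
	uint64_t n = 0;

	strncpy((char *)&n, text, sizeof(n));
	return n;
}

/*
 * New style: n starts out zeroed, so copying only the text bytes leaves
 * the same NUL padding behind without relying on strncpy() semantics.
 */
static uint64_t pack_field_memcpy(const char *text)
{
	uint64_t n = 0;

	memcpy(&n, text, bounded_len(text, sizeof(n)));
	return n;
}
```

Both helpers should return the same value for any input string, whether it
is shorter than, equal to, or longer than eight bytes.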

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230816-strncpy-arch-powerpc-platforms-ps3-repository-v1-1-88283b02fb09@google.com
9 months ago powerpc/ptrace: Split gpr32_set_common
Christophe Leroy [Thu, 22 Jun 2023 10:01:23 +0000 (12:01 +0200)]
powerpc/ptrace: Split gpr32_set_common

objtool reports the following warning:

  arch/powerpc/kernel/ptrace/ptrace-view.o: warning: objtool:
    gpr32_set_common+0x23c (.text+0x860): redundant UACCESS disable

gpr32_set_common() conditionally opens and closes UACCESS based on
whether kbuf pointer is NULL or not. This is fragile.

Split gpr32_set_common() into two functions, one for user and one for
kernel.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix oops in gpr32_set_common_user() due to NULL kbuf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/b8d6ae4483fcfd17524e79d803c969694a85cc02.1687428075.git.christophe.leroy@csgroup.eu
9 months ago Documentation/powerpc: Fix ptrace request names
Benjamin Gray [Tue, 25 Jul 2023 00:58:38 +0000 (10:58 +1000)]
Documentation/powerpc: Fix ptrace request names

The documented ptrace request names are currently wrong/incomplete.
Fix this to improve correctness and searchability.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230725005841.28854-2-bgray@linux.ibm.com
9 months ago perf/hw_breakpoint: Remove arch breakpoint hooks
Benjamin Gray [Tue, 1 Aug 2023 01:17:44 +0000 (11:17 +1000)]
perf/hw_breakpoint: Remove arch breakpoint hooks

PowerPC was the only user of these hooks, and has been refactored to no
longer require them. There is no need to keep them around, so remove
them to reduce complexity.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-8-bgray@linux.ibm.com
9 months ago selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest
Benjamin Gray [Tue, 1 Aug 2023 01:17:43 +0000 (11:17 +1000)]
selftests/powerpc/ptrace: Update ptrace-perf watchpoint selftest

Now that ptrace and perf are no longer exclusive, update the
test to exercise interesting interactions.

An assembly file is used for the children to allow precise instruction
choice and addresses, while avoiding any compiler quirks.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-7-bgray@linux.ibm.com
9 months ago powerpc/watchpoints: Remove ptrace/perf exclusion tracking
Benjamin Gray [Tue, 1 Aug 2023 01:17:42 +0000 (11:17 +1000)]
powerpc/watchpoints: Remove ptrace/perf exclusion tracking

ptrace and perf watchpoints were considered incompatible in
commit 29da4f91c0c1 ("powerpc/watchpoint: Don't allow concurrent perf
and ptrace events"), but the logic in that commit doesn't really apply.

Ptrace doesn't automatically single step; the ptracer must request this
explicitly. And the ptracer can do so regardless of whether a
ptrace/perf watchpoint triggered or not: it could single step every
instruction if it wanted to. Whatever stopped the ptracee before
executing the instruction that would trigger the perf watchpoint is no
longer relevant by this point.

To get correct behaviour when perf and ptrace are watching the same
data we must ignore the perf watchpoint. After all, ptrace has
before-execute semantics, and perf is after-execute, so perf doesn't
actually care about the watchpoint trigger at this point in time.
Pausing before execution does not mean we will actually end up executing
the instruction.

Importantly though, we don't remove the perf watchpoint yet. This is
key.

The ptracer is free to do whatever it likes right now. E.g., it can
continue the process, single step, or even set the child PC somewhere
completely different.

If it does try to execute the instruction though, without reinserting
the watchpoint (in which case we go back to the start of this example),
the perf watchpoint would immediately trigger. This time there is no
ptrace watchpoint, so we can safely perform a single step and increment
the perf counter. Upon receiving the single step exception, the existing
code already handles propagating or consuming it based on whether
another subsystem (e.g. ptrace) requested a single step. Again, this is
needed with or without perf/ptrace exclusion, because ptrace could be
single stepping this instruction regardless of whether a watchpoint is
involved.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-6-bgray@linux.ibm.com
9 months ago powerpc/watchpoints: Simplify watchpoint reinsertion
Benjamin Gray [Tue, 1 Aug 2023 01:17:41 +0000 (11:17 +1000)]
powerpc/watchpoints: Simplify watchpoint reinsertion

We only remove watchpoints when they have the perf_single_step flag set,
so we can reinsert them during the first iteration.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-5-bgray@linux.ibm.com
9 months ago powerpc/watchpoints: Track perf single step directly on the breakpoint
Benjamin Gray [Tue, 1 Aug 2023 01:17:40 +0000 (11:17 +1000)]
powerpc/watchpoints: Track perf single step directly on the breakpoint

There is a bug in the current watchpoint tracking logic, where the
teardown in arch_unregister_hw_breakpoint() uses bp->ctx->task, which it
does not hold a reference to and which parallel threads may be in the
process of destroying. This was partially addressed in commit fb822e6076d9
("powerpc/hw_breakpoint: Fix oops when destroying hw_breakpoint event"),
but the underlying issue of accessing a struct member in an unknown
state still remained. Syzkaller managed to trigger a null pointer
dereference due to the race between the task destructor and the code
that checks the pointer and then dereferences it in the loop.

While this null pointer dereference could be fixed by using READ_ONCE
to access the task up front, that just changes the error to manipulating
possibly freed memory.

Instead, the breakpoint logic needs to be reworked to remove any
dependency on a context or task struct during breakpoint removal.

The reason we have this currently is to clear thread.last_hit_ubp. This
member is used to differentiate the perf DAWR single-step sequence from
other causes of single-step, such as userspace just calling
ptrace(PTRACE_SINGLESTEP, ...). We need to differentiate them because,
when the single step interrupt is received, we need to know whether to
re-insert the DAWR breakpoint (perf) or not (ptrace / other).

arch_unregister_hw_breakpoint() needs to clear this information to
prevent dangling pointers to possibly freed memory. These pointers are
dereferenced in single_step_dabr_instruction() without a way to check
their validity.

This patch moves the tracking of this information to the breakpoint
itself. This means we no longer have to do anything special to clean up.
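
The idea can be illustrated with a small user-space model. This is only a
sketch under simplifying assumptions: struct sketch_bp, sketch_slots and
the sketch_*() functions are hypothetical names, and the real code deals
with per-CPU slots and hardware registers rather than a plain array.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_SLOTS 2	/* hypothetical number of watchpoint slots */

/* Toy breakpoint: the single-step state lives on the breakpoint itself. */
struct sketch_bp {
	bool installed;		/* currently armed */
	bool perf_single_step;	/* removed by perf, waiting for its step */
};

/* Stand-in for the table of breakpoints currently in use. */
static struct sketch_bp *sketch_slots[SKETCH_NR_SLOTS];

/* Watchpoint hit: disarm it and remember that perf owes it a re-insert. */
static void sketch_on_hit(struct sketch_bp *bp)
{
	bp->installed = false;
	bp->perf_single_step = true;
}

/*
 * Single-step interrupt: re-insert only breakpoints that perf stepped over.
 * Returns true if the step belonged to perf, false if it came from
 * elsewhere (for example a ptrace single step).
 */
static bool sketch_on_single_step(void)
{
	bool ours = false;
	size_t i;

	for (i = 0; i < SKETCH_NR_SLOTS; i++) {
		struct sketch_bp *bp = sketch_slots[i];

		if (!bp || !bp->perf_single_step)
			continue;
		bp->perf_single_step = false;
		bp->installed = true;
		ours = true;
	}
	return ours;
}

/* Unregister: clearing the slot is enough, no task struct is touched. */
static void sketch_unregister(size_t slot)
{
	sketch_slots[slot] = NULL;
}
```

Because the flag travels with the breakpoint, removing the breakpoint also
removes the state, and nothing is left pointing at freed memory.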

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-4-bgray@linux.ibm.com
9 months ago powerpc/watchpoints: Don't track info persistently
Benjamin Gray [Tue, 1 Aug 2023 01:17:39 +0000 (11:17 +1000)]
powerpc/watchpoints: Don't track info persistently

info is cheap to retrieve, and is likely optimised by the compiler
anyway. On the other hand, propagating it across the functions makes it
possible to be inconsistent and adds needless complexity.

Remove it, and invoke counter_arch_bp() when we need to work with it.

As we don't persist it, we just use the local bp array to track whether
we are ignoring a breakpoint.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-3-bgray@linux.ibm.com
9 months ago powerpc/watchpoints: Explain thread_change_pc() more
Benjamin Gray [Tue, 1 Aug 2023 01:17:38 +0000 (11:17 +1000)]
powerpc/watchpoints: Explain thread_change_pc() more

The behaviour of the thread_change_pc() function is a bit cryptic
without being more familiar with how the watchpoint logic handles
perf's after-execute semantics.

Expand the comment to explain why we can re-insert the breakpoint and
unset the perf_single_step flag.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230801011744.153973-2-bgray@linux.ibm.com
9 months ago docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_parti...
Kajol Jain [Sat, 29 Jul 2023 07:34:55 +0000 (13:04 +0530)]
docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_partition sysfs interface file

Add details of the new hv-gpci interface file called
"affinity_domain_via_partition" in the ABI documentation.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-11-kjain@linux.ibm.com
9 months ago powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via...
Kajol Jain [Sat, 29 Jul 2023 07:34:54 +0000 (13:04 +0530)]
powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via partition information

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
AFFINITY_DOMAIN_INFORMATION_BY_PARTITION(0XB1), can be used to get
the system affinity domain via partition information. To expose the system
affinity domain via partition information, patch adds sysfs file called
"affinity_domain_via_partition" to the "/sys/devices/hv_gpci/interface/"
of hv_gpci pmu driver.

Add new entry for AFFINITY_DOMAIN_VIA_PAR in sysinfo_counter_request
array, which points to the counter request value
"affinity_domain_via_partition" in hv-gpci.c file. Also add a
new function called "affinity_domain_via_partition_result_parse" to parse
the hcall result and store it in output buffer.

The affinity_domain_via_partition sysfs file is only available for power10
and above platforms. Add a macro called
INTERFACE_AFFINITY_DOMAIN_VIA_PAR_ATTR, which points to the index of NULL
placeholder, for affinity_domain_via_partition attribute in
interface_attrs array. Also updated the value of INTERFACE_NULL_ATTR
macro in hv-gpci.c file.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-10-kjain@linux.ibm.com
9 months ago docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_domai...
Kajol Jain [Sat, 29 Jul 2023 07:34:53 +0000 (13:04 +0530)]
docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_domain sysfs interface file

Add details of the new hv-gpci interface file called
"affinity_domain_via_domain" in the ABI documentation.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-9-kjain@linux.ibm.com
9 months ago powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via...
Kajol Jain [Sat, 29 Jul 2023 07:34:52 +0000 (13:04 +0530)]
powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via domain information

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
AFFINITY_DOMAIN_INFORMATION_BY_DOMAIN(0XB0), can be used to get
the system affinity domain via domain information. To expose the system
affinity domain via domain information, patch adds sysfs file called
"affinity_domain_via_domain" to the "/sys/devices/hv_gpci/interface/"
of hv_gpci pmu driver.

Add new entry for AFFINITY_DOMAIN_VIA_DOM in sysinfo_counter_request
array, which points to the counter request value
"affinity_domain_via_domain" in hv-gpci.c file.

The affinity_domain_via_domain sysfs file is only available for power10
and above platforms. Add a macro called
INTERFACE_AFFINITY_DOMAIN_VIA_DOM_ATTR, which points to the index of NULL
placeholder, for affinity_domain_via_domain attribute in interface_attrs
array. Also updated the value of INTERFACE_NULL_ATTR macro in hv-gpci.c
file.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-8-kjain@linux.ibm.com
9 months ago docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_virtu...
Kajol Jain [Sat, 29 Jul 2023 07:34:51 +0000 (13:04 +0530)]
docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document affinity_domain_via_virtual_processor sysfs interface file

Add details of the new hv-gpci interface file called
"affinity_domain_via_virtual_processor" in the ABI documentation.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-7-kjain@linux.ibm.com
9 months ago powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via...
Kajol Jain [Sat, 29 Jul 2023 07:34:50 +0000 (13:04 +0530)]
powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via virtual processor information

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
AFFINITY_DOMAIN_INFORMATION_BY_VIRTUAL_PROCESSOR(0XA0), can be used to get
the system affinity domain via virtual processor information. To expose
the system affinity domain via virtual processor information, patch adds
sysfs file called "affinity_domain_via_virtual_processor" to the
"/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver.

The affinity_domain_via_virtual_processor sysfs file is only available for
power10 and above platforms. Add a macro called
INTERFACE_AFFINITY_DOMAIN_VIA_VP_ATTR, which points to the index of NULL
placeholder, for affinity_domain_via_virtual_processor attribute in
interface_attrs array. Also updated the value of INTERFACE_NULL_ATTR macro
in hv-gpci.c file.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-6-kjain@linux.ibm.com
9 months ago docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document processor_config sysfs...
Kajol Jain [Sat, 29 Jul 2023 07:34:49 +0000 (13:04 +0530)]
docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document processor_config sysfs interface file

Add details of the new hv-gpci interface file called
"processor_config" in the ABI documentation.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-5-kjain@linux.ibm.com
9 months ago powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor config inform...
Kajol Jain [Sat, 29 Jul 2023 07:34:48 +0000 (13:04 +0530)]
powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor config information

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
PROCESSOR_CONFIG(0X90), can be used to get the system
processor configuration information. To expose the system
processor config information, patch adds sysfs file called
"processor_config" to the "/sys/devices/hv_gpci/interface/"
of hv_gpci pmu driver.

Add enum and sysinfo_counter_request array to get required
counter request value in hv-gpci.c file.
Also add a new function called "sysinfo_device_attr_create",
which will create and return required device attribute to the
add_sysinfo_interface_files function.

The processor_config sysfs file is only available for power10
and above platforms. Add a new macro called
INTERFACE_PROCESSOR_CONFIG_ATTR, which points to the index of
NULL placeholder, for processor_config attribute in the interface_attrs
array. Also add macro INTERFACE_NULL_ATTR which points to index of NULL
attribute in interface_attrs array.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-4-kjain@linux.ibm.com
9 months ago docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document processor_bus_topology...
Kajol Jain [Sat, 29 Jul 2023 07:34:47 +0000 (13:04 +0530)]
docs: ABI: sysfs-bus-event_source-devices-hv_gpci: Document processor_bus_topology sysfs interface file

Add details of the new hv-gpci interface file called
"processor_bus_topology" in the ABI documentation.

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-3-kjain@linux.ibm.com
9 months ago powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor bus topology...
Kajol Jain [Sat, 29 Jul 2023 07:34:46 +0000 (13:04 +0530)]
powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor bus topology information

The hcall H_GET_PERF_COUNTER_INFO with counter request value as
PROCESSOR_BUS_TOPOLOGY(0XD0), can be used to get the system
topology information. To expose the system topology information,
patch adds sysfs file called "processor_bus_topology" to the
"/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver.

Add macro for PROCESSOR_BUS_TOPOLOGY counter request value
in hv-gpci.c file. Also add a new function called
"systeminfo_gpci_request", to make the H_GET_PERF_COUNTER_INFO hcall
with added macro and populates the output buffer.

The processor_bus_topology sysfs file is only available for power10
and above platforms. Add a new function called
"add_sysinfo_interface_files", which will add processor_bus_topology
attribute in the interface_attrs array, only for power10 and
above platforms.
Also add macro INTERFACE_PROCESSOR_BUS_TOPOLOGY_ATTR in hv-gpci.c
file, which points to the index of NULL placeholder, for
processor_bus_topology attribute.
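
The NULL-placeholder scheme described above can be sketched in user space
as follows. All names here (struct sketch_attr, SKETCH_BUS_TOPOLOGY_ATTR,
sketch_add_sysinfo_files() and so on) and the index value are hypothetical;
the real array in hv-gpci.c holds more attributes and uses the kernel's
attribute types.

```c
#include <stddef.h>

/* Hypothetical, simplified attribute type. */
struct sketch_attr {
	const char *name;
};

static struct sketch_attr sketch_version = { "version" };
static struct sketch_attr sketch_bus_topology = { "processor_bus_topology" };

/* Index of the NULL placeholder reserved for the optional attribute. */
#define SKETCH_BUS_TOPOLOGY_ATTR 1

/* NULL-terminated list: slot 1 is a placeholder, slot 2 the terminator. */
static struct sketch_attr *sketch_interface_attrs[] = {
	&sketch_version,
	NULL,	/* placeholder for processor_bus_topology */
	NULL,	/* list terminator */
};

/* Fill the placeholder only on platforms that support the file. */
static void sketch_add_sysinfo_files(int is_power10_or_later)
{
	if (is_power10_or_later)
		sketch_interface_attrs[SKETCH_BUS_TOPOLOGY_ATTR] =
			&sketch_bus_topology;
}

/* Count the attributes a consumer sees when it stops at the first NULL. */
static size_t sketch_count_attrs(void)
{
	size_t n = 0;

	while (sketch_interface_attrs[n])
		n++;
	return n;
}
```

On an older platform the placeholder stays NULL, so a consumer that walks
the list up to the first NULL never sees the optional file.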

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230729073455.7918-2-kjain@linux.ibm.com
9 months ago powerpc: Make virt_to_pfn() a static inline
Linus Walleij [Wed, 9 Aug 2023 08:07:13 +0000 (10:07 +0200)]
powerpc: Make virt_to_pfn() a static inline

Making virt_to_pfn() a static inline taking a strongly typed
(const void *) makes the contract of passing a pointer of that
type to the function explicit and exposes any misuse of the
macro virt_to_pfn() acting polymorphic and accepting many types
such as (void *), (uintptr_t) or (unsigned long) as arguments
without warnings.
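
A user-space sketch of the difference between the two forms. The constants
SKETCH_PAGE_SHIFT and SKETCH_PAGE_OFFSET and all sketch_* names are
assumptions made for illustration; the kernel's real definitions go
through __pa() and are architecture-specific.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch only. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_OFFSET ((uintptr_t)0x10000000)

/* Macro form: the cast silently accepts pointers and integers alike. */
#define SKETCH_VIRT_TO_PFN_MACRO(kaddr) \
	(((uintptr_t)(kaddr) - SKETCH_PAGE_OFFSET) >> SKETCH_PAGE_SHIFT)

/*
 * Static inline form: the parameter type documents the contract, and
 * passing an integer instead of a pointer draws a compiler diagnostic.
 */
static inline unsigned long sketch_virt_to_pfn(const void *kaddr)
{
	return (unsigned long)(((uintptr_t)kaddr - SKETCH_PAGE_OFFSET)
			       >> SKETCH_PAGE_SHIFT);
}

/* The inverse direction, typed the same way for symmetry. */
static inline const void *sketch_pfn_to_kaddr(unsigned long pfn)
{
	return (const void *)(((uintptr_t)pfn << SKETCH_PAGE_SHIFT)
			      + SKETCH_PAGE_OFFSET);
}
```

With the macro, a call such as SKETCH_VIRT_TO_PFN_MACRO(some_unsigned_long)
compiles silently; with the inline function the same call needs an
explicit cast, which makes the conversion visible at the call site.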

Move the virt_to_pfn() and related functions below the
declaration of __pa() so it compiles.

For symmetry do the same with pfn_to_kaddr().

As the file is included right into the linker file, we need
to surround the functions with ifndef __ASSEMBLY__ so we
don't cause compilation errors.

The conversion moreover exposes the fact that pmd_page_vaddr()
was returning an unsigned long rather than a const void * as
could be expected, so all the sites defining pmd_page_vaddr()
had to be augmented as well.

Finally the KVM code in book3s_64_mmu_hv.c was passing an
unsigned int to virt_to_phys() so fix that up with a cast so the
result compiles.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[mpe: Fixup kfence.h, simplify pfn_to_kaddr() & pmd_page_vaddr()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230809-virt-to-phys-powerpc-v1-1-12e912a7d439@linaro.org
9 months ago powerpc/powernv/pci: use pci_dev_id() to simplify the code
Xiongfeng Wang [Fri, 4 Aug 2023 08:04:35 +0000 (16:04 +0800)]
powerpc/powernv/pci: use pci_dev_id() to simplify the code

PCI core API pci_dev_id() can be used to get the BDF number for a pci
device. We don't need to compose it manually. Use pci_dev_id() to
simplify the code a little bit.
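
For readers unfamiliar with the BDF layout, here is a user-space sketch of
what such a helper computes. The sketch_* types and functions are
hypothetical stand-ins for struct pci_bus, struct pci_dev, PCI_DEVFN() and
pci_dev_id(), reduced to the fields needed here.

```c
#include <stdint.h>

/* Toy stand-ins for the kernel structures (only the fields used here). */
struct sketch_pci_bus {
	uint8_t number;
};

struct sketch_pci_dev {
	struct sketch_pci_bus *bus;
	uint8_t devfn;
};

/* devfn packs a 5-bit device (slot) number and a 3-bit function number. */
static inline uint8_t sketch_devfn(unsigned int slot, unsigned int func)
{
	return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/*
 * BDF: bus number in the upper byte, devfn in the lower byte. Wrapping
 * this in one helper replaces the open-coded shift-and-or at call sites.
 */
static inline uint16_t sketch_pci_dev_id(const struct sketch_pci_dev *dev)
{
	return (uint16_t)((dev->bus->number << 8) | dev->devfn);
}
```

For bus 0x01, device 2, function 3 the helper yields 0x0113.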

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230804080435.191196-1-wangxiongfeng2@huawei.com
9 months ago powerpc/xics: Remove unnecessary endian conversion
Gautam Menghani [Mon, 31 Jul 2023 11:55:39 +0000 (17:25 +0530)]
powerpc/xics: Remove unnecessary endian conversion

Remove an unnecessary piece of code that does an endianness conversion
but does not use the result. The following warning was reported by
Clang's static analyzer:

  arch/powerpc/sysdev/xics/ics-opal.c:114:2: warning: Value stored to
  'server' is never read [deadcode.DeadStores]
  server = be16_to_cpu(oserver);

'server' was used as a parameter to opal_get_xive() in commit
5c7c1e9444d8 ("powerpc/powernv: Add OPAL ICS backend") when it was
introduced. 'server' was also used in an error message for the call to
opal_get_xive().

'server' was always later set by a call to ics_opal_mangle_server()
before being used.

Commit bf8e0f891a32 ("powerpc/powernv: Fix endian issues in OPAL ICS
backend") used a new variable 'oserver' as the parameter to
opal_get_xive() instead of 'server' for endian correctness. It also
removed 'server' from the error message for the call to opal_get_xive().

Fix the warning by removing the server variable assignment.

Fixes: bf8e0f891a32 ("powerpc/powernv: Fix endian issues in OPAL ICS backend")
Reviewed-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230731115543.36991-1-gautam@linux.ibm.com
9 months ago powerpc/pseries: fix possible memory leak in ibmebus_bus_init()
ruanjinjie [Thu, 10 Nov 2022 01:19:29 +0000 (09:19 +0800)]
powerpc/pseries: fix possible memory leak in ibmebus_bus_init()

If device_register() returns an error in ibmebus_bus_init(), the kobject
name allocated by dev_set_name(), which is called from device_add(), is leaked.

As the comment on device_add() says, the caller should call put_device() to drop
the reference count that was set in device_initialize() when it fails,
so the name can be freed in kobject_cleanup().
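
The pattern can be modelled in user space with a toy reference count. This
is a sketch under simplifying assumptions, not the driver core: struct
sketch_device and every sketch_*() function are hypothetical, and the
failure is injected through a parameter.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model of a refcounted device; not the kernel's struct device. */
struct sketch_device {
	int refcount;
	char *name;	/* heap-allocated, freed when the last reference drops */
};

static struct sketch_device sketch_bus_device;

/* Drop one reference; the name is released on the last put. */
static void sketch_put_device(struct sketch_device *dev)
{
	if (--dev->refcount == 0) {
		free(dev->name);
		dev->name = NULL;
	}
}

/* Takes the initial reference, allocates the name, then optionally fails. */
static int sketch_device_register(struct sketch_device *dev,
				  const char *name, int fail)
{
	size_t len = strlen(name) + 1;

	dev->refcount = 1;
	dev->name = malloc(len);
	if (!dev->name)
		return -1;
	memcpy(dev->name, name, len);
	return fail ? -1 : 0;
}

/* Caller pattern: on failure, put the device rather than just returning. */
static int sketch_bus_init(int fail)
{
	int err = sketch_device_register(&sketch_bus_device, "ibmebus", fail);

	if (err) {
		sketch_put_device(&sketch_bus_device);	/* frees the name */
		return err;
	}
	return 0;
}
```

Without the put on the error path, the name allocated during registration
would stay allocated with nothing left to free it.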

Signed-off-by: ruanjinjie <ruanjinjie@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20221110011929.3709774-1-ruanjinjie@huawei.com
9 months ago powerpc: remove <asm/export.h>
Masahiro Yamada [Sun, 6 Aug 2023 15:09:54 +0000 (00:09 +0900)]
powerpc: remove <asm/export.h>

All *.S files under arch/powerpc/ have been converted to include
<linux/export.h> instead of <asm/export.h>.

Remove <asm/export.h>.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230806150954.394189-3-masahiroy@kernel.org
9 months ago powerpc: replace #include <asm/export.h> with #include <linux/export.h>
Masahiro Yamada [Sun, 6 Aug 2023 15:09:53 +0000 (00:09 +0900)]
powerpc: replace #include <asm/export.h> with #include <linux/export.h>

Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.

Replace #include <asm/export.h> with #include <linux/export.h>.

After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
[mpe: Fixup selftests that stub asm/export.h]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230806150954.394189-2-masahiroy@kernel.org
9 months ago powerpc: remove unneeded #include <asm/export.h>
Masahiro Yamada [Sun, 6 Aug 2023 15:09:52 +0000 (00:09 +0900)]
powerpc: remove unneeded #include <asm/export.h>

There is no EXPORT_SYMBOL line there, hence #include <asm/export.h>
is unneeded.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230806150954.394189-1-masahiroy@kernel.org
9 months ago powerpc/inst: add PPC_TLBILX_LPID
Nick Desaulniers [Thu, 3 Aug 2023 18:33:52 +0000 (11:33 -0700)]
powerpc/inst: add PPC_TLBILX_LPID

Clang didn't recognize the instruction tlbilxlpid. This was fixed in
clang-18 [0] then backported to clang-17 [1].  To support clang-16 and
older, rather than using that instruction bare in inline asm, add it to
ppc-opcode.h and use that macro as is done elsewhere for other
instructions.

Link: https://github.com/ClangBuiltLinux/linux/issues/1891
Link: https://github.com/llvm/llvm-project/issues/64080
Link: https://github.com/llvm/llvm-project/commit/53648ac1d0c953ae6d008864dd2eddb437a92468
Link: https://github.com/llvm/llvm-project-release-prs/commit/0af7e5e54a8c7ac665773ac1ada328713e8338f5
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/llvm/202307211945.TSPcyOhh-lkp@intel.com/
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230803-ppc_tlbilxlpid-v3-1-ca84739bfd73@google.com
9 months ago cxl: Use pci_find_vsec_capability() to simplify the code
Xiongfeng Wang [Fri, 4 Aug 2023 07:56:30 +0000 (15:56 +0800)]
cxl: Use pci_find_vsec_capability() to simplify the code

The PCI core added pci_find_vsec_capability() to query VSEC. We can use that
core API to simplify the code.

The only logical change is that pci_find_vsec_capability() checks the
Vendor ID before finding the VSEC.

PCI spec rev 5.0 says in 7.9.5.2 Vendor-Specific Header:
  VSEC ID - This field is a vendor-defined ID number that indicates the
  nature and format of the VSEC structure
  Software must qualify the Vendor ID before interpreting this field.

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230804075630.186054-1-wangxiongfeng2@huawei.com
9 months ago powerpc/reg: Remove #ifdef around mtspr macro
Christophe Leroy [Wed, 21 Jun 2023 10:40:50 +0000 (12:40 +0200)]
powerpc/reg: Remove #ifdef around mtspr macro

That ifdef was introduced by commit 1458dd951f7c ("powerpc/8xx:
Handle CPU6 ERRATA directly in mtspr() macro") and left over by
commit 2a45addd21de ("powerpc/8xx: Remove CPU6 ERRATA Workaround")

Remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/cf652e47ea9e453e89813611b6f76d0939a12063.1687344017.git.christophe.leroy@csgroup.eu
9 months ago powerpc/step: Mark __copy_mem_out() and __emulate_dcbz() __always_inline
Christophe Leroy [Wed, 21 Jun 2023 10:38:10 +0000 (12:38 +0200)]
powerpc/step: Mark __copy_mem_out() and __emulate_dcbz() __always_inline

objtool reports the following two warnings:
  arch/powerpc/lib/sstep.o: warning: objtool: copy_mem_out+0x3c
    (.text+0x30c): call to __copy_mem_out() with UACCESS enabled
  arch/powerpc/lib/sstep.o: warning: objtool: emulate_dcbz+0x70
    (.text+0x4dc): call to __emulate_dcbz() with UACCESS enabled

Mark __copy_mem_out() and __emulate_dcbz() __always_inline

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/f1d4a15da70190f8c2fcddb377bbc1e09827242c.1687343857.git.christophe.leroy@csgroup.eu
9 months ago powerpc/cpm2: Remove cpm2_map() and cpm2_unmap()
Christophe Leroy [Tue, 8 Aug 2023 06:04:43 +0000 (08:04 +0200)]
powerpc/cpm2: Remove cpm2_map() and cpm2_unmap()

Since commit 449012daa92a ("[POWERPC] cpm2: Infrastructure code
cleanup.") cpm2_map() just returns the cpm2_immr pointer and
cpm2_unmap() does nothing.

We already have parts of code that use cpm2_immr directly so get rid
of cpm2_map() and cpm2_unmap() by using cpm2_immr directly. And avoid
going through local pointers that hide the pointed structure.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/9fe6ff7284e9f968b12abe7de7c08d7ea40e29d6.1691474658.git.christophe.leroy@csgroup.eu
9 months ago powerpc/8xx: Remove immr_map() and immr_unmap()
Christophe Leroy [Tue, 8 Aug 2023 06:04:42 +0000 (08:04 +0200)]
powerpc/8xx: Remove immr_map() and immr_unmap()

Since commit fb533d0c5a97 ("[POWERPC] 8xx: Infrastructure code cleanup.")
immr_map() just returns the mpc8xx_immr pointer and immr_unmap()
does nothing.

We already have parts of code that use mpc8xx_immr directly so get rid
of immr_map() and immr_unmap() by using mpc8xx_immr directly. And avoid
going through local pointers that hide the pointed structure.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/633ed46f6015ff44d5599258647ea517f75d6a1d.1691474658.git.christophe.leroy@csgroup.eu
9 months ago powerpc: Remove CONFIG_PCI_8260
Christophe Leroy [Tue, 8 Aug 2023 06:04:41 +0000 (08:04 +0200)]
powerpc: Remove CONFIG_PCI_8260

CONFIG_PCI_8260 is not used anymore, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/19a4c07466ce8b80f287a06eadcc80c4ab1d2c9e.1691474658.git.christophe.leroy@csgroup.eu
9 months agopowerpc/include: Remove mpc8260.h and m82xx_pci.h
Christophe Leroy [Tue, 8 Aug 2023 06:04:40 +0000 (08:04 +0200)]
powerpc/include: Remove mpc8260.h and m82xx_pci.h

SIU_INT_IRQ1 is not used anywhere and __IO_BASE is defined in
asm/io.h

Remove m82xx_pci.h

Then the only thing remaining in mpc8260.h is MPC82XX_BCR_PLDP

Move MPC82XX_BCR_PLDP into asm/cpm2.h then remove mpc8260.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/afe23bf3624c389ff17e9789884c78c124b7b202.1691474658.git.christophe.leroy@csgroup.eu
9 months agopowerpc/include: Declare mpc8xx_immr in 8xx_immap.h
Christophe Leroy [Tue, 8 Aug 2023 06:04:39 +0000 (08:04 +0200)]
powerpc/include: Declare mpc8xx_immr in 8xx_immap.h

Do the same as for cpm2_immr: declare it in the same place
as its type immap_t, that is in 8xx_immap.h instead of fs_pd.h

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/62d490b65899c2f2667ca7045c5f0fad9cbda458.1691474658.git.christophe.leroy@csgroup.eu
9 months agopowerpc/include: Remove unneeded #include <asm/fs_pd.h>
Christophe Leroy [Tue, 8 Aug 2023 06:04:38 +0000 (08:04 +0200)]
powerpc/include: Remove unneeded #include <asm/fs_pd.h>

tqm8xx_setup.c and fs_enet.h don't use any items provided by fs_pd.h

Remove unneeded #include <asm/fs_pd.h>

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/b056c4e986a4a7707fc1994304c34f7bd15d6871.1691474658.git.christophe.leroy@csgroup.eu
9 months agoocxl: Use pci_dev_id() to simplify the code
Zheng Zengkai [Fri, 11 Aug 2023 10:20:39 +0000 (18:20 +0800)]
ocxl: Use pci_dev_id() to simplify the code

PCI core API pci_dev_id() can be used to get the BDF number for a pci
device. We don't need to compose it manually. Use pci_dev_id() to
simplify the code a little bit.
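
As an illustration only (not taken from this patch), the open-coded
composition that pci_dev_id() replaces can be sketched in userspace as
below. The sketch_* names are hypothetical stand-ins; the bit layout
(bus number in the high byte, 5-bit device and 3-bit function in the
low byte) is the usual PCI BDF encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's PCI_DEVFN() macro. */
static uint8_t sketch_devfn(unsigned int slot, unsigned int func)
{
	assert(slot < 32 && func < 8);	/* 5-bit device, 3-bit function */
	return (uint8_t)((slot << 3) | func);
}

/* Open-coded BDF: bus number in the high byte, devfn in the low byte. */
static uint16_t sketch_bdf(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}
```

In the kernel, a single pci_dev_id(dev) call yields the same 16-bit
value without the caller having to repeat the shifting.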

Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230811102039.17257-1-zhengzengkai@huawei.com
9 months agomacintosh/ams: mark ams_init() static
Arnd Bergmann [Thu, 10 Aug 2023 14:19:24 +0000 (16:19 +0200)]
macintosh/ams: mark ams_init() static

This is the module init function, which by definition is used only
locally, so mark it static to avoid a warning:

drivers/macintosh/ams/ams-core.c:179:12: error: no previous prototype for 'ams_init' [-Werror=missing-prototypes]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230810141947.1236730-7-arnd@kernel.org
9 months agopowerpc/pseries: PLPKS: undo kernel-doc comment notation
Randy Dunlap [Thu, 10 Aug 2023 00:07:40 +0000 (17:07 -0700)]
powerpc/pseries: PLPKS: undo kernel-doc comment notation

Don't use kernel-doc "/**" comment format for non-kernel-doc comments.
This prevents a kernel-doc warning:

  arch/powerpc/platforms/pseries/plpks.c:186: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
  * Label is combination of label attributes + name.

Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Link: lore.kernel.org/r/202308040430.GxmPAnwZ-lkp@intel.com
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230810000740.23756-1-rdunlap@infradead.org
9 months agopowerpc/radix: Move some functions into #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
Christophe Leroy [Wed, 9 Aug 2023 08:01:43 +0000 (10:01 +0200)]
powerpc/radix: Move some functions into #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE

With skiboot_defconfig, Clang reports:

  CC      arch/powerpc/mm/book3s64/radix_tlb.o
arch/powerpc/mm/book3s64/radix_tlb.c:419:20: error: unused function '_tlbie_pid_lpid' [-Werror,-Wunused-function]
static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
                   ^
arch/powerpc/mm/book3s64/radix_tlb.c:663:20: error: unused function '_tlbie_va_range_lpid' [-Werror,-Wunused-function]
static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
                   ^

This is because those functions are only called from functions
enclosed in a #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE

Move the functions below inside that #ifdef:
* __tlbie_pid_lpid(unsigned long pid,
* __tlbie_va_lpid(unsigned long va, unsigned long pid,
* fixup_tlbie_pid_lpid(unsigned long pid, unsigned long lpid)
* _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
* fixup_tlbie_va_range_lpid(unsigned long va,
* __tlbie_va_range_lpid(unsigned long start, unsigned long end,
* _tlbie_va_range_lpid(unsigned long start, unsigned long end,

Fixes: f0c6fbbb9050 ("KVM: PPC: Book3S HV: Add support for H_RPT_INVALIDATE")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307260802.Mjr99P5O-lkp@intel.com/
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/3d72efd39f986ee939d068af69fdce28bd600766.1691568093.git.christophe.leroy@csgroup.eu
9 months agopowerpc: xmon: remove unused variables
Arnd Bergmann [Wed, 9 Aug 2023 13:10:09 +0000 (15:10 +0200)]
powerpc: xmon: remove unused variables

Randconfig testing with W=1 showed up these warnings that I'd like to enable
by default:

arch/powerpc/xmon/xmon.c: In function 'dump_tlb_book3e':
arch/powerpc/xmon/xmon.c:3833:42: error: variable 'lrat' set but not used [-Werror=unused-but-set-variable]
 3833 |  int i, tlb, ntlbs, pidsz, lpidsz, rasz, lrat = 0;
      |                                          ^~~~
arch/powerpc/xmon/xmon.c:3831:23: error: variable 'lpidmask' set but not used [-Werror=unused-but-set-variable]
 3831 |  u32 mmucfg, pidmask, lpidmask;
      |                       ^~~~~~~~
arch/powerpc/xmon/xmon.c:3831:14: error: variable 'pidmask' set but not used [-Werror=unused-but-set-variable]
 3831 |  u32 mmucfg, pidmask, lpidmask;
      |              ^~~~~~~

Just remove these as they have been unused since the code was added in 2010.

Fixes: 03247157f7391 ("powerpc/book3e: Add TLB dump in xmon for Book3E")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230809131024.2039647-2-arnd@kernel.org
9 months agopowerpc: mark more local variables as volatile
Arnd Bergmann [Wed, 9 Aug 2023 13:10:08 +0000 (15:10 +0200)]
powerpc: mark more local variables as volatile

A while ago I created a2305e3de8193 ("powerpc: mark local variables
around longjmp as volatile") in order to allow building powerpc with
-Wextra enabled on gcc-11.

I tried this again with gcc-13 and found two more of the same issues,
presumably based on slightly different optimization paths being taken
here:

arch/powerpc/xmon/xmon.c:3306:27: error: variable 'mm' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]
arch/powerpc/kexec/crash.c:353:22: error: variable 'i' might be clobbered by 'longjmp' or 'vfork' [-Werror=clobbered]

I checked a bunch of randconfigs and found only these two, so just
address them the same way as the others.
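
For readers unfamiliar with the warning, here is a minimal userspace
sketch of the pattern, assuming nothing about the actual xmon or crash
code. All sketch_* names are hypothetical. A local that is modified
after setjmp() and read after longjmp() has an indeterminate value
unless it is volatile.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf sketch_env;

/* Hypothetical stand-in for a fault handler that unwinds via longjmp(). */
static void sketch_bail_out(void)
{
	longjmp(sketch_env, 1);
}

/* Counts towards 10 but bails out through longjmp() when i == fault_at. */
static int sketch_count_until_fault(int fault_at)
{
	/*
	 * i is modified after setjmp() and read after longjmp(). Without
	 * volatile its value would be indeterminate at that point, which
	 * is what -Wclobbered warns about.
	 */
	volatile int i = 0;

	assert(fault_at >= 0 && fault_at < 10);

	if (setjmp(sketch_env) == 0) {
		for (i = 0; i < 10; i++) {
			if (i == fault_at)
				sketch_bail_out();
		}
	}
	return i;
}
```

With volatile, the value read after the jump is the one last stored
before sketch_bail_out() was called.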

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230809131024.2039647-1-arnd@kernel.org
9 months agopowerpc/pmac32: enable serial options by default in defconfig
Yuan Tan [Wed, 2 Aug 2023 13:41:30 +0000 (21:41 +0800)]
powerpc/pmac32: enable serial options by default in defconfig

Serial is a critical feature for logging and debugging, and the other
architectures enable serial by default.

Let's enable CONFIG_SERIAL_PMACZILOG and CONFIG_SERIAL_PMACZILOG_CONSOLE
by default.

Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/bb7b5f9958b3e3a20f6573ff7ce7c5dc566e7e32.1690982937.git.tanyuan@tinylab.org
9 months agoMerge branch 'topic/cpu-smt' into next
Michael Ellerman [Mon, 14 Aug 2023 11:46:03 +0000 (21:46 +1000)]
Merge branch 'topic/cpu-smt' into next

Merge SMT changes we are sharing with the tip tree.

10 months agopowerpc/pseries: Honour current SMT state when DLPAR onlining CPUs
Michael Ellerman [Wed, 5 Jul 2023 14:51:43 +0000 (16:51 +0200)]
powerpc/pseries: Honour current SMT state when DLPAR onlining CPUs

Integrate with the generic SMT support, so that when a CPU is DLPAR
onlined it is brought up with the correct SMT mode.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230705145143.40545-11-ldufour@linux.ibm.com
10 months agopowerpc: Add HOTPLUG_SMT support
Michael Ellerman [Wed, 5 Jul 2023 14:51:42 +0000 (16:51 +0200)]
powerpc: Add HOTPLUG_SMT support

Add support for HOTPLUG_SMT, which enables the generic sysfs SMT support
files in /sys/devices/system/cpu/smt, as well as the "nosmt" boot
parameter.

Implement the recently added hooks to allow partial SMT states, allowing
any number of threads per core.

Tie the config symbol to HOTPLUG_CPU, which enables it on the major
platforms that support SMT. If there are other platforms that want the
SMT support that can be tweaked in future.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[ldufour: remove topology_smt_supported]
[ldufour: remove topology_smt_threads_supported]
[ldufour: select CONFIG_SMT_NUM_THREADS_DYNAMIC]
[ldufour: update kernel-parameters.txt]
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Link: https://msgid.link/20230705145143.40545-10-ldufour@linux.ibm.com
10 months agopowerpc/pseries: Initialise CPU hotplug callbacks earlier
Michael Ellerman [Wed, 5 Jul 2023 14:51:41 +0000 (16:51 +0200)]
powerpc/pseries: Initialise CPU hotplug callbacks earlier

As part of the generic HOTPLUG_SMT code, there is support for disabling
secondary SMT threads at boot time, by passing "nosmt" on the kernel
command line.

The way that is implemented is the secondary threads are brought partly
online, and then taken back offline again. That is done to support x86
CPUs needing certain initialisation done on all threads. However powerpc
has similar needs, see commit d70a54e2d085 ("powerpc/powernv: Ignore
smt-enabled on Power8 and later").

For that to work the powerpc CPU hotplug callbacks need to be registered
before secondary CPUs are brought online, otherwise __cpu_disable()
fails due to smp_ops->cpu_disable being NULL.

So split the basic initialisation into pseries_cpu_hotplug_init() which
can be called early from setup_arch(). The DLPAR related initialisation
can still be done later, because it needs to do allocations.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230705145143.40545-9-ldufour@linux.ibm.com
10 months agoMerge tag 'smp-core-for-ppc-23-07-28' of https://git.kernel.org/pub/scm/linux/kernel...
Michael Ellerman [Wed, 2 Aug 2023 12:39:48 +0000 (22:39 +1000)]
Merge tag 'smp-core-for-ppc-23-07-28' of https://git./linux/kernel/git/tip/tip into topic/cpu-smt

SMP core changes for powerpc

10 months agopowerpc/kexec: fix minor typo
Laurent Dufour [Tue, 25 Jul 2023 13:27:59 +0000 (15:27 +0200)]
powerpc/kexec: fix minor typo

Function name in the descriptor was not correct.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307251721.bUGcsCeQ-lkp@intel.com/
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230725132759.53975-1-ldufour@linux.ibm.com
10 months agopowerpc/ep8248e: Mark driver as non removable
Uwe Kleine-König [Wed, 26 Jul 2023 08:14:42 +0000 (10:14 +0200)]
powerpc/ep8248e: Mark driver as non removable

Instead of resorting to BUG() ensure that the driver isn't unbound by
suppressing its bind and unbind sysfs attributes. As the driver is
built-in there is no way to remove a device once bound.

As a nice side effect this allows dropping the remove function.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230726081442.461026-1-u.kleine-koenig@pengutronix.de
10 months agopowerpc: address missing-prototypes warnings
Arnd Bergmann [Thu, 27 Jul 2023 12:26:50 +0000 (14:26 +0200)]
powerpc: address missing-prototypes warnings

There are a few warnings in powerpc64 defconfig builds after -Wmissing-prototypes
gets promoted from W=1 to the default warning set:

arch/powerpc/mm/book3s64/pgtable.c:422:6: error: no previous prototype for 'arch_report_meminfo' [-Werror=missing-prototypes]
arch/powerpc/platforms/cell/ras.c:275:5: error: no previous prototype for 'cbe_sysreset_hack' [-Werror=missing-prototypes]
arch/powerpc/platforms/cell/spu_manage.c:29:21: error: no previous prototype for 'spu_devnode' [-Werror=missing-prototypes]
arch/powerpc/platforms/pasemi/time.c:12:17: error: no previous prototype for 'pas_get_boot_time' [-Werror=missing-prototypes]
arch/powerpc/platforms/powermac/feature.c:1532:13: error: no previous prototype for 'g5_phy_disable_cpu1' [-Werror=missing-prototypes]
arch/powerpc/platforms/86xx/pic.c:28:13: error: no previous prototype for 'mpc86xx_init_irq' [-Werror=missing-prototypes]
drivers/pci/pci-sysfs.c:936:13: error: no previous prototype for 'pci_adjust_legacy_attr' [-Werror=missing-prototypes]

Address these by including the right header files or marking the
functions static. The audit.c one is a bit tricky since compat_audit.h
cannot include regular kernel headers that have conflicting types on
32-bit powerpc.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[mpe: Drop change to __vmemmap_free() which only exists in mm]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230727122720.2558065-1-arnd@kernel.org
10 months agoselftests/powerpc/ptrace: Declare test temporary variables as volatile
Benjamin Gray [Tue, 25 Jul 2023 00:58:41 +0000 (10:58 +1000)]
selftests/powerpc/ptrace: Declare test temporary variables as volatile

While the target is volatile, the temporary variables used to access the
target cast away the volatile. This is undefined behaviour, and a
compiler may optimise away/reorder these accesses, breaking the test.

This was observed with GCC 13.1.1, but it can be difficult to reproduce
because of the dependency on compiler behaviour.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230725005841.28854-5-bgray@linux.ibm.com
10 months agoselftests/powerpc/ptrace: Fix typo in pid_max search error
Benjamin Gray [Tue, 25 Jul 2023 00:58:40 +0000 (10:58 +1000)]
selftests/powerpc/ptrace: Fix typo in pid_max search error

pid_max_addr() searches for the 'pid_max' symbol in /proc/kallsyms, and
prints an error if it cannot find it. The error message has a typo,
calling it pix_max.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230725005841.28854-4-bgray@linux.ibm.com
10 months agoselftests/powerpc/ptrace: Explain why tests are skipped
Benjamin Gray [Tue, 25 Jul 2023 00:58:39 +0000 (10:58 +1000)]
selftests/powerpc/ptrace: Explain why tests are skipped

Many tests require specific hardware features/configurations that a
typical machine might not have. As a result, it's common to see a test
is skipped. But it is tedious to find out why a test is skipped
when all it gives is the file location of the skip macro.

Convert SKIP_IF() to SKIP_IF_MSG(), with appropriate descriptions of why
the test is being skipped. This gives a general idea of why a test is
skipped, which can be looked into further if it doesn't make sense.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230725005841.28854-3-bgray@linux.ibm.com
10 months agopowerpc: Explicitly include correct DT includes
Rob Herring [Mon, 24 Jul 2023 21:02:42 +0000 (15:02 -0600)]
powerpc: Explicitly include correct DT includes

The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it was merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.

Signed-off-by: Rob Herring <robh@kernel.org>
[mpe: Fixup maple/setup.c which needs platform_device]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230724210247.778034-1-robh@kernel.org
10 months agopowerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs
Nicholas Piggin [Wed, 24 May 2023 06:08:21 +0000 (16:08 +1000)]
powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs

This performs lazy tlb mm shootdown when doing the exit TLB flush when
all mm users go away and user mappings are removed, which avoids having
to do the lazy tlb mm shootdown IPIs on the final mmput when all kernel
references disappear.

powerpc/64s uses a broadcast TLBIE for the exit TLB flush if remote CPUs
need to be invalidated (unless TLBIE is disabled), so this doesn't
necessarily save IPIs but it does avoid a broadcast TLBIE which is quite
expensive.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Squash in preempt_disable/enable() fix from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230524060821.148015-5-npiggin@gmail.com
10 months agopowerpc: Add mm_cpumask warning when context switching
Nicholas Piggin [Wed, 24 May 2023 06:08:20 +0000 (16:08 +1000)]
powerpc: Add mm_cpumask warning when context switching

When context switching away from an mm, add a CONFIG_DEBUG_VM warning
check to ensure this CPU is still set in the mask. This could catch
bugs where the mask is improperly trimmed while the CPU is still using
the mm.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230524060821.148015-4-npiggin@gmail.com
10 months agopowerpc/64s: Use dec_mm_active_cpus helper
Nicholas Piggin [Wed, 24 May 2023 06:08:19 +0000 (16:08 +1000)]
powerpc/64s: Use dec_mm_active_cpus helper

Avoid open-coded atomic_dec on mm->context.active_cpus and use the
function made for it. Add CONFIG_DEBUG_VM underflow checking on the
counter.
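
A rough userspace sketch of the idea, assuming C11 atomics in place of
the kernel's atomic_t and assert() in place of a CONFIG_DEBUG_VM-only
warning. The structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical, cut-down stand-in for the mm context's active_cpus counter. */
struct sketch_mm_context {
	atomic_int active_cpus;
};

/* One helper for every decrement, so the sanity check lives in one place. */
static void sketch_dec_active_cpus(struct sketch_mm_context *ctx)
{
	/* atomic_fetch_sub() returns the value held before the decrement */
	int old = atomic_fetch_sub(&ctx->active_cpus, 1);

	/* stands in for a debug-only underflow warning */
	assert(old > 0);
}
```

Routing every decrement through the helper means an underflow is
caught at the call that caused it rather than at some later reader.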

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230524060821.148015-3-npiggin@gmail.com
10 months agopowerpc: Account mm_cpumask and active_cpus in init_mm
Nicholas Piggin [Wed, 24 May 2023 06:08:18 +0000 (16:08 +1000)]
powerpc: Account mm_cpumask and active_cpus in init_mm

init_mm mm_cpumask and context.active_cpus are not maintained at boot
and hotplug. This seems to be harmless because init_mm does not have a
userspace and so never gets user TLBs flushed, but it looks odd and it
prevents some sanity checks being added.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230524060821.148015-2-npiggin@gmail.com
10 months agopowerpc/64: Enable accelerated crypto algorithms in defconfig
Michael Ellerman [Mon, 17 Jul 2023 11:52:23 +0000 (21:52 +1000)]
powerpc/64: Enable accelerated crypto algorithms in defconfig

Enable all the accelerated crypto algorithms as modules in the 64-bit
defconfig, to get more test coverage.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230717115223.286158-1-mpe@ellerman.id.au
10 months agopowerpc/crypto: don't build aes-gcm-p10 by default
Omar Sandoval [Mon, 10 Jul 2023 16:46:47 +0000 (09:46 -0700)]
powerpc/crypto: don't build aes-gcm-p10 by default

None of the other accelerated crypto modules are built by default.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/40d9c7ebe82c9a9d4ace542ac433753d2f22c6a0.1689007370.git.osandov@osandov.com
10 months agopowerpc/crypto: fix missing skcipher dependency for aes-gcm-p10
Omar Sandoval [Mon, 10 Jul 2023 16:46:46 +0000 (09:46 -0700)]
powerpc/crypto: fix missing skcipher dependency for aes-gcm-p10

My stripped down configuration fails to build with:

  ERROR: modpost: "skcipher_walk_aead_encrypt" [arch/powerpc/crypto/aes-gcm-p10-crypto.ko] undefined!
  ERROR: modpost: "skcipher_walk_done" [arch/powerpc/crypto/aes-gcm-p10-crypto.ko] undefined!
  ERROR: modpost: "skcipher_walk_aead_decrypt" [arch/powerpc/crypto/aes-gcm-p10-crypto.ko] undefined!

Fix it by selecting CRYPTO_SKCIPHER.

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/c55ad70799e027a3d2756b85ccadc0af52ae8915.1689007370.git.osandov@osandov.com
10 months agopowerpc/kuap: Use ASM feature fixups instead of static branches
Christophe Leroy [Tue, 11 Jul 2023 15:59:21 +0000 (17:59 +0200)]
powerpc/kuap: Use ASM feature fixups instead of static branches

To avoid a useless nop on top of every uaccess enable/disable and
make life easier for objtool, replace static branches by ASM feature
fixups that will nop KUAP enabling instructions out in the unlikely
case KUAP is disabled at boottime.

Leave it as is on book3s/64 for now, it will be handled later when
objtool is activated on PPC64.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/671948788024fd890ec4ed175bc332dab8664ea5.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/kuap: KUAP enabling/disabling functions must be __always_inline
Christophe Leroy [Tue, 11 Jul 2023 15:59:20 +0000 (17:59 +0200)]
powerpc/kuap: KUAP enabling/disabling functions must be __always_inline

Objtool reports the following warnings:

  arch/powerpc/kernel/signal_32.o: warning: objtool:
    __prevent_user_access.constprop.0+0x4 (.text+0x4):
    redundant UACCESS disable

  arch/powerpc/kernel/signal_32.o: warning: objtool: user_access_begin+0x2c
    (.text+0x4c): return with UACCESS enabled

  arch/powerpc/kernel/signal_32.o: warning: objtool: handle_rt_signal32+0x188
    (.text+0x360): call to __prevent_user_access.constprop.0() with UACCESS enabled

  arch/powerpc/kernel/signal_32.o: warning: objtool: handle_signal32+0x150
    (.text+0x4d4): call to __prevent_user_access.constprop.0() with UACCESS enabled

This is due to some KUAP enabling/disabling functions being emitted out
of line although they are marked inline. Use __always_inline instead.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/ca5e50ddbec3867db5146ebddbc9a1dc0e443bc8.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/kuap: Simplify KUAP lock/unlock on BOOK3S/32
Christophe Leroy [Tue, 11 Jul 2023 15:59:19 +0000 (17:59 +0200)]
powerpc/kuap: Simplify KUAP lock/unlock on BOOK3S/32

On book3s/32 KUAP is performed at segment level. At the moment,
when enabling userspace access, only current segment is modified.
Then if a write is performed on another user segment, a fault is
taken and all other user segments get enabled for userspace
access. This then requires special attention when disabling
userspace access.

Having a userspace write access crossing a segment boundary is
unlikely. Having a userspace write access crossing a segment boundary
back and forth is even more unlikely. So, instead of enabling
userspace access on all segments when a write fault occurs, just
change which segment has userspace access enabled in order to
eliminate the case when more than one segment has userspace access
enabled. That simplifies userspace access deactivation.

There is however a corner case which is even more unlikely but has
to be handled anyway: an unaligned access which is crossing a
segment boundary. That would definitely require at least having
userspace access enabled on the two segments. To avoid complicating
the likely case for such an unlikely event, handle that situation
like an alignment exception and emulate the store.
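
To make the corner case concrete, here is a hedged userspace sketch of
the boundary test, assuming the 256 MiB segments of 32-bit Book3S (the
top four address bits select the segment). The sketch_* names are
hypothetical, the code is not taken from the patch, and address
wrap-around at the top of the address space is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 16 segments of 256 MiB: the top 4 bits of the address pick the segment. */
#define SKETCH_SEGMENT_SHIFT 28

static unsigned int sketch_segment(uint32_t addr)
{
	return addr >> SKETCH_SEGMENT_SHIFT;
}

/* True if the access [addr, addr + size) touches two different segments. */
static bool sketch_crosses_segment(uint32_t addr, uint32_t size)
{
	assert(size > 0);
	return sketch_segment(addr) != sketch_segment(addr + size - 1);
}
```

Only an access for which this test is true would need two segments
unlocked at once, which is the case handed over to emulation.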

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/8de8580513c1a6e880bad1ba9a69d3efad3d4fa5.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/kuap: Use MMU_FTR_KUAP on all and refactor disabling kuap
Christophe Leroy [Tue, 11 Jul 2023 15:59:18 +0000 (17:59 +0200)]
powerpc/kuap: Use MMU_FTR_KUAP on all and refactor disabling kuap

All but book3s/64 use a static branch key for disabling kuap.
book3s/64 uses an mmu feature.

Refactor all targets to use MMU_FTR_KUAP like book3s/64.

For PPC32 that implies updating mmu features fixups once KUAP
has been initialised.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/6b3d7c977bad73378ea368bc6818e9c94ea95ab0.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/kuap: MMU_FTR_BOOK3S_KUAP becomes MMU_FTR_KUAP
Christophe Leroy [Tue, 11 Jul 2023 15:59:17 +0000 (17:59 +0200)]
powerpc/kuap: MMU_FTR_BOOK3S_KUAP becomes MMU_FTR_KUAP

In order to reuse MMU_FTR_BOOK3S_KUAP for other targets than BOOK3S,
rename it MMU_FTR_KUAP.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/c8b6f7b8cd0eeaace96879ed0e0a157faa619451.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/features: Add capability to update mmu features later
Christophe Leroy [Tue, 11 Jul 2023 15:59:16 +0000 (17:59 +0200)]
powerpc/features: Add capability to update mmu features later

On powerpc32, features fixup is performed very early and that's too
early to read the cmdline and take into account 'nosmap' parameter.

On the other hand, no userspace access is performed that early and
KUAP feature fixup can be performed later.

Add a function to update mmu features. The function is passed a
mask with the features that can be updated.
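
The masking idea can be sketched as follows. This is a simplified
userspace illustration with hypothetical names and feature bits. The
real function deals with patching feature fixup sections, which is not
modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define SKETCH_FTR_A	0x1u
#define SKETCH_FTR_KUAP	0x2u

/*
 * Recompute the feature word late in boot: only bits in 'updatable' take
 * their value from 'wanted'; all other bits keep their early-boot value.
 */
static uint32_t sketch_update_features(uint32_t current, uint32_t wanted,
				       uint32_t updatable)
{
	uint32_t updated = (current & ~updatable) | (wanted & updatable);

	/* sanity check: nothing outside the mask changed */
	assert(((updated ^ current) & ~updatable) == 0);
	return updated;
}
```

Restricting the update to a mask keeps the features that were already
acted upon at early boot from changing behind the kernel's back.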

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/31b27ee2c9d338f4f82cd8cd69d6bff979495290.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/kuap: Fold kuep_is_disabled() into its only user
Christophe Leroy [Tue, 11 Jul 2023 15:59:15 +0000 (17:59 +0200)]
powerpc/kuap: Fold kuep_is_disabled() into its only user

kuep_is_disabled() was introduced by commit 91bb30822a2e ("powerpc/32s:
Refactor update of user segment registers") but then all users but one
were removed by commit 526d4a4c77ae ("powerpc/32s: Do kuep_lock() and
kuep_unlock() in assembly").

Fold kuep_is_disabled() into init_new_context() which is its only user.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/b2247147c0a8c830ac82966451647850df4a64da.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/kuap: Avoid useless jump_label on empty function
Christophe Leroy [Tue, 11 Jul 2023 15:59:14 +0000 (17:59 +0200)]
powerpc/kuap: Avoid useless jump_label on empty function

Disassembly of interrupt_enter_prepare() shows a pointless nop
before the mftb

  c000abf0 <interrupt_enter_prepare>:
  c000abf0:       81 23 00 84     lwz     r9,132(r3)
  c000abf4:       71 29 40 00     andi.   r9,r9,16384
  c000abf8:       41 82 00 28     beq-    c000ac20 <interrupt_enter_prepare+0x30>
  c000abfc: ===>  60 00 00 00     nop <====
  c000ac00:       7d 0c 42 e6     mftb    r8
  c000ac04:       80 e2 00 08     lwz     r7,8(r2)
  c000ac08:       81 22 00 28     lwz     r9,40(r2)
  c000ac0c:       91 02 00 24     stw     r8,36(r2)
  c000ac10:       7d 29 38 50     subf    r9,r9,r7
  c000ac14:       7d 29 42 14     add     r9,r9,r8
  c000ac18:       91 22 00 08     stw     r9,8(r2)
  c000ac1c:       4e 80 00 20     blr
  c000ac20:       60 00 00 00     nop
  c000ac24:       7d 5a c2 a6     mfmd_ap r10
  c000ac28:       3d 20 de 00     lis     r9,-8704
  c000ac2c:       91 43 00 b0     stw     r10,176(r3)
  c000ac30:       7d 3a c3 a6     mtspr   794,r9
  c000ac34:       4e 80 00 20     blr

That comes from the call to kuap_lock(), although __kuap_lock() is an
empty function on the 8xx.

To avoid that, only perform kuap_is_disabled() check when there is
something to do with __kuap_lock().

Do the same with __kuap_save_and_lock() and
__kuap_get_and_assert_locked().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/a854d25bea375d4ba6ca9c2617f9edbba397100a.1689091022.git.christophe.leroy@csgroup.eu
10 months agopowerpc/kuap: Avoid unnecessary reads of MD_AP
Christophe Leroy [Tue, 11 Jul 2023 15:59:13 +0000 (17:59 +0200)]
powerpc/kuap: Avoid unnecessary reads of MD_AP

A disassembly of interrupt_exit_kernel_prepare() shows a useless read
of MD_AP register. This is shown by r9 being re-used immediately without
doing anything with the value read.

  c000e0e0:       60 00 00 00     nop
  c000e0e4: ===>  7d 3a c2 a6     mfmd_ap r9 <====
  c000e0e8:       7d 20 00 a6     mfmsr   r9
  c000e0ec:       7c 51 13 a6     mtspr   81,r2
  c000e0f0:       81 3f 00 84     lwz     r9,132(r31)
  c000e0f4:       71 29 80 00     andi.   r9,r9,32768

kuap_get_and_assert_locked() is paired with kuap_kernel_restore()
and are only used in interrupt_exit_kernel_prepare(). The value
returned by kuap_get_and_assert_locked() is only used by
kuap_kernel_restore().

On 8xx, kuap_kernel_restore() doesn't use the value read by
kuap_get_and_assert_locked() so modify kuap_get_and_assert_locked()
to not perform the read of MD_AP and return 0 instead.

The same applies on BOOKE.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/bcbc84c2dd90bb1021da792b1968cdc22112dad8.1689091022.git.christophe.leroy@csgroup.eu
10 months agocpu/SMT: Allow enabling partial SMT states via sysfs
Michael Ellerman [Wed, 5 Jul 2023 14:51:40 +0000 (16:51 +0200)]
cpu/SMT: Allow enabling partial SMT states via sysfs

Add support to the /sys/devices/system/cpu/smt/control interface for
enabling a specified number of SMT threads per core, including partial
SMT states where not all threads are brought online.

The current interface accepts "on" and "off", to enable either 1 or all
SMT threads per core.

This commit allows writing an integer, between 1 and the number of SMT
threads supported by the machine. Writing 1 is a synonym for "off", 2 or
more enables SMT with the specified number of threads.

When reading the file, if all threads are online "on" is returned, to
avoid changing behaviour for existing users. If some other number of
threads is online then the integer value is returned.
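
The accepted values can be summarised with a small userspace sketch.
The function name is hypothetical, only the values described above are
handled, and the input is assumed to have had any trailing newline
stripped. It is not the kernel's parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Map a string written to the control file to a thread count per core.
 * Returns -1 for input this sketch does not accept.
 */
static int sketch_parse_smt_control(const char *buf, int max_threads)
{
	char *end;
	long val;

	assert(buf != NULL && max_threads >= 1);

	if (strcmp(buf, "on") == 0)
		return max_threads;	/* all threads */
	if (strcmp(buf, "off") == 0)
		return 1;		/* primary thread only */

	/* otherwise expect an integer between 1 and max_threads */
	val = strtol(buf, &end, 10);
	if (end == buf || *end != '\0' || val < 1 || val > max_threads)
		return -1;
	return (int)val;
}
```

With eight threads per core, "on" and "8" request the same state, and
"off" and "1" request the same state.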

Architectures like x86, which only support 1 thread or all threads, should
not define CONFIG_SMT_NUM_THREADS_DYNAMIC. Architectures supporting partial
SMT states, like PowerPC, should define it.
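
The accepted values and the read-back rule described above can be modelled in a few lines of userspace C. This is only a sketch, not the kernel implementation: the function names are hypothetical, trailing newlines are not handled, and states such as "forceoff" or "notsupported" are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical model of a write to smt/control.
 * Returns the requested number of threads per core, or -EINVAL.
 */
int smt_control_parse(const char *buf, int max_threads)
{
	char *end;
	long val;

	assert(max_threads >= 1);

	if (strcmp(buf, "on") == 0)
		return max_threads;	/* "on" means all threads */
	if (strcmp(buf, "off") == 0)
		return 1;		/* "off" means primary thread only */

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf || *end != '\0')
		return -EINVAL;		/* not a plain integer */
	if (val < 1 || val > max_threads)
		return -EINVAL;		/* outside 1..max_threads */
	return (int)val;
}

/*
 * Hypothetical model of a read: "on" when every thread is online,
 * "off" for a single thread, otherwise the integer.
 * Assumes an SMT-capable machine (max_threads >= 2).
 */
void smt_control_show(int num_threads, int max_threads, char *out, size_t len)
{
	assert(max_threads >= 2);

	if (num_threads == max_threads)
		snprintf(out, len, "on");
	else if (num_threads == 1)
		snprintf(out, len, "off");
	else
		snprintf(out, len, "%d", num_threads);
}
```

Under these assumptions, writing "4" on an 8-thread machine selects a partial state, and reading it back yields "4" rather than "on", so existing users that only ever see "on" or "off" are unaffected.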

[ ldufour: Slightly reword the commit's description ]
[ ldufour: Remove switch() in __store_smt_control() ]
[ ldufour: Fix build issue in control_show() ]

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230705145143.40545-8-ldufour@linux.ibm.com
10 months agocpu/SMT: Create topology_smt_thread_allowed()
Michael Ellerman [Wed, 5 Jul 2023 14:51:39 +0000 (16:51 +0200)]
cpu/SMT: Create topology_smt_thread_allowed()

Some architectures allow partial SMT states, i.e. when not all SMT threads
are brought online.

To support that, add an architecture helper which checks whether a given
CPU is allowed to be brought online depending on how many SMT threads are
currently enabled. Since this is only applicable to architectures supporting
partial SMT, only these architectures should select the new configuration
variable CONFIG_SMT_NUM_THREADS_DYNAMIC. For the other architectures, not
supporting the partial SMT states, there is no need to define
topology_smt_thread_allowed(); the generic code assumes that either all the
threads are allowed or only the primary ones.

Call the helper from cpu_smt_enable(), and cpu_smt_allowed() when SMT is
enabled, to check if the particular thread should be onlined. Notably,
also call it from cpu_smt_disable() if CPU_SMT_ENABLED, to allow
offlining some threads to move from a higher to lower number of threads
online.
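
As a rough illustration of what such a helper might decide, the sketch below allows a CPU only if its index within its core is below the number of enabled threads. The CPU numbering (threads of one core are consecutive) and both function names are assumptions made for this example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: CPUs are numbered core by core, threads consecutive. */
static unsigned int thread_in_core(unsigned int cpu,
				   unsigned int threads_per_core)
{
	assert(threads_per_core >= 1);
	return cpu % threads_per_core;
}

/*
 * Hypothetical helper: may this CPU be brought online when
 * num_threads SMT threads per core are enabled?
 */
bool smt_thread_allowed(unsigned int cpu, unsigned int threads_per_core,
			unsigned int num_threads)
{
	return thread_in_core(cpu, threads_per_core) < num_threads;
}
```

With 8 threads per core and 4 enabled, CPUs 0-3 of each core pass the check and CPUs 4-7 do not; the same check, evaluated after lowering the number of enabled threads, identifies the threads to take offline.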

[ ldufour: Slightly reword the commit's description ]
[ ldufour: Introduce CONFIG_SMT_NUM_THREADS_DYNAMIC ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230705145143.40545-7-ldufour@linux.ibm.com
10 months agocpu/SMT: Remove topology_smt_supported()
Laurent Dufour [Wed, 5 Jul 2023 14:51:38 +0000 (16:51 +0200)]
cpu/SMT: Remove topology_smt_supported()

Since the maximum number of threads is now passed to cpu_smt_set_num_threads(),
checking that value is enough to know whether SMT is supported.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230705145143.40545-6-ldufour@linux.ibm.com
10 months agocpu/SMT: Store the current/max number of threads
Michael Ellerman [Wed, 5 Jul 2023 14:51:37 +0000 (16:51 +0200)]
cpu/SMT: Store the current/max number of threads

Some architectures allow partial SMT states at boot time, i.e. when not all
SMT threads are brought online.

To support that, the SMT code needs to know the maximum number of SMT
threads, and also the currently configured number.

The architecture code knows the max number of threads, so have the
architecture code pass that value to cpu_smt_set_num_threads(). Note that
although topology_max_smt_threads() exists, it is not configured early
enough to be used here. As some architectures, like PowerPC, allow the number
of threads to be set through the kernel command line, also pass that value.
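
A minimal sketch of the bookkeeping this implies is shown below. The struct, the function names and the clamping policy are assumptions for illustration; the kernel's cpu_smt_set_num_threads() may validate its arguments differently.

```c
#include <assert.h>

/* Hypothetical container for the two values the SMT code needs. */
struct smt_threads {
	unsigned int max_threads;	/* limit reported by the architecture */
	unsigned int num_threads;	/* currently configured threads per core */
};

/* Record both values, clamping the request into 1..max_threads. */
void smt_set_num_threads(struct smt_threads *s, unsigned int num_threads,
			 unsigned int max_threads)
{
	assert(max_threads >= 1);

	if (num_threads < 1)
		num_threads = 1;
	if (num_threads > max_threads)
		num_threads = max_threads;

	s->max_threads = max_threads;
	s->num_threads = num_threads;
}

/* Once the maximum is stored, "is SMT supported?" is a comparison. */
int smt_supported(const struct smt_threads *s)
{
	return s->max_threads > 1;
}
```

Storing the maximum is also what lets a separate "is SMT supported" hook be dropped: a maximum of 1 already says that SMT is unavailable.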

[ ldufour: Slightly reword the commit message ]
[ ldufour: Rename cpu_smt_check_topology and add a num_threads argument ]

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230705145143.40545-5-ldufour@linux.ibm.com
10 months agocpu/SMT: Move smt/control simple exit cases earlier
Michael Ellerman [Wed, 5 Jul 2023 14:51:36 +0000 (16:51 +0200)]
cpu/SMT: Move smt/control simple exit cases earlier

Move the simple exit cases, i.e. those which don't depend on the value
written, earlier in the function. That makes it clearer that regardless of
the input those states cannot be transitioned out of.

That does have a user-visible effect, in that the error returned will
now always be EPERM/ENODEV for those states, regardless of the value
written. Previously writing an invalid value would return EINVAL even
when in those states.
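
The reordering can be pictured with the following sketch, in which the state checks run before the written value is looked at. The enum and function are hypothetical stand-ins, and the set of accepted strings is an assumption for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical mirror of the SMT control states. */
enum smt_state {
	SMT_ENABLED,
	SMT_DISABLED,
	SMT_FORCE_DISABLED,
	SMT_NOT_SUPPORTED,
};

int smt_control_store(enum smt_state state, const char *buf)
{
	assert(buf != NULL);

	/* Simple exits first: these states can never be left. */
	if (state == SMT_FORCE_DISABLED)
		return -EPERM;
	if (state == SMT_NOT_SUPPORTED)
		return -ENODEV;

	/* Only now is the written value validated. */
	if (strcmp(buf, "on") != 0 && strcmp(buf, "off") != 0 &&
	    strcmp(buf, "forceoff") != 0)
		return -EINVAL;

	return 0;
}
```

In this ordering an invalid string written while force-disabled yields EPERM rather than EINVAL, which is the user-visible change the message describes.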

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230705145143.40545-4-ldufour@linux.ibm.com
10 months agocpu/SMT: Move SMT prototypes into cpu_smt.h
Michael Ellerman [Wed, 5 Jul 2023 14:51:35 +0000 (16:51 +0200)]
cpu/SMT: Move SMT prototypes into cpu_smt.h

In order to export the cpuhp_smt_control enum as part of the interface
between generic and architecture code, the architecture code needs to
include asm/topology.h.

But that leads to circular header dependencies. So split the enum and
related declarations into a separate header.

[ ldufour: Reworded the commit's description ]

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230705145143.40545-3-ldufour@linux.ibm.com
10 months agocpu/hotplug: Remove dependancy against cpu_primary_thread_mask
Laurent Dufour [Wed, 5 Jul 2023 14:51:34 +0000 (16:51 +0200)]
cpu/hotplug: Remove dependancy against cpu_primary_thread_mask

The commit 18415f33e2ac ("cpu/hotplug: Allow "parallel" bringup up to
CPUHP_BP_KICK_AP_STATE") introduced a dependency against a global variable
cpu_primary_thread_mask exported by the X86 code. This variable is only
used when CONFIG_HOTPLUG_PARALLEL is set.

Since cpuhp_get_primary_thread_mask() and cpuhp_smt_aware() are only used
when CONFIG_HOTPLUG_PARALLEL is set, don't define them when it is not set.

No functional change.

Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230705145143.40545-2-ldufour@linux.ibm.com
10 months agoLinux 6.5-rc3
Linus Torvalds [Sun, 23 Jul 2023 22:24:10 +0000 (15:24 -0700)]
Linux 6.5-rc3

10 months agoMerge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Sun, 23 Jul 2023 22:19:14 +0000 (15:19 -0700)]
Merge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Swapping the ring buffer for snapshotting (for things like irqsoff)
   can crash if the ring buffer is being resized. Disable swapping when
   this happens. The missed swap will be reported to the tracer

 - Report error if the histogram fails to be created due to an error in
   adding a histogram variable, in event_hist_trigger_parse()

 - Remove unused declaration of tracing_map_set_field_descr()

* tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/histograms: Return an error if we fail to add histogram to hist_vars list
  ring-buffer: Do not swap cpu_buffer during resize process
  tracing: Remove unused extern declaration tracing_map_set_field_descr()

10 months agoMerge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahi...
Linus Torvalds [Sun, 23 Jul 2023 21:55:41 +0000 (14:55 -0700)]
Merge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Fix stale help text in gconfig

 - Support *.S files in compile_commands.json

 - Flatten KBUILD_CFLAGS

 - Fix external module builds with Rust so that temporary files are
   created in the modules directories instead of the kernel tree

* tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: rust: avoid creating temporary files
  kbuild: flatten KBUILD_CFLAGS
  gen_compile_commands: add assembly files to compilation database
  kconfig: gconfig: correct program name in help text
  kconfig: gconfig: drop the Show Debug Info help text

10 months agokbuild: rust: avoid creating temporary files
Miguel Ojeda [Sun, 23 Jul 2023 14:21:28 +0000 (16:21 +0200)]
kbuild: rust: avoid creating temporary files

`rustc` outputs by default the temporary files (i.e. the ones saved
by `-Csave-temps`, such as `*.rcgu*` files) in the current working
directory when `-o` and `--out-dir` are not given (even if
`--emit=x=path` is given, i.e. it does not use those for temporaries).

Since out-of-tree modules are compiled from the `linux` tree,
`rustc` then tries to create them there, which may not be accessible.

Thus pass `--out-dir` explicitly, even if it is just for the temporary
files.

Similarly, do so for Rust host programs too.

Reported-by: Raphael Nestler <raphael.nestler@gmail.com>
Closes: https://github.com/Rust-for-Linux/linux/issues/1015
Reported-by: Andrea Righi <andrea.righi@canonical.com>
Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs
Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs
Fixes: 295d8398c67e ("kbuild: specify output names separately for each emission type from rustc")
Cc: stable@vger.kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
10 months agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 23 Jul 2023 17:44:38 +0000 (10:44 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Avoid pKVM finalization if KVM initialization fails

   - Add missing BTI instructions in the hypervisor, fixing an early
     boot failure on BTI systems

   - Handle MMU notifiers correctly for non hugepage-aligned memslots

   - Work around a bug in the architecture where hypervisor timer
     controls have UNKNOWN behavior under nested virt

   - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel
     BUG in cpu hotplug resulting from per-CPU accessor sanity checking

   - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
     consistently requesting a doorbell interrupt on vcpu_put()

   - Uphold RES0 sysreg behavior when emulating older PMU versions

   - Avoid macro expansion when initializing PMU register names,
     ensuring the tracepoints pretty-print the sysreg

  s390:

   - Two fixes for asynchronous destroy

  x86 fixes will come early next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: s390: pv: fix index value of replaced ASCE
  KVM: s390: pv: simplify shutdown and fix race
  KVM: arm64: Fix the name of sys_reg_desc related to PMU
  KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount
  KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
  KVM: arm64: Add missing BTI instructions
  KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
  KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
  KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm
  KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits

10 months agoMerge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 23 Jul 2023 17:21:49 +0000 (10:21 -0700)]
Merge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's
  checkpoint code"

* tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
  ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
  ext4: correct inline offset when handling xattrs in inode body
  jbd2: remove __journal_try_to_free_buffer()
  jbd2: fix a race when checking checkpoint buffer busy
  jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
  jbd2: remove journal_clean_one_cp_list()
  jbd2: remove t_checkpoint_io_list
  jbd2: recheck chechpointing non-dirty buffer

10 months agoMerge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 23 Jul 2023 17:16:44 +0000 (10:16 -0700)]
Merge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fix from Steve French:
 "Add minor debugging improvement.

  The change improves ability to read a network trace to debug problems
  on encrypted connections which are very common (e.g. using wireshark
  or tcpdump).

  That works today with tools like 'smbinfo keys /mnt/file' but requires
  passing in a filename on the mount (see e.g. [1]), but it often makes
  more sense to just pass in the mount point path (ie a directory not a
  filename).

  So this fix was needed to debug some types of problems (an obvious
  example is on an encrypted connection failing operations on an empty
  share or with no files in the root of the directory) - so you can
  simply pass in the 'smbinfo keys <mntpoint>' and get the information
  that wireshark needs"

Link: https://wiki.samba.org/index.php/Wireshark_Decryption [1]
* tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update internal module version number for cifs.ko
  cifs: allow dumping keys for directories too

10 months agoMerge tag 'kvm-s390-master-6.5-1' of https://git.kernel.org/pub/scm/linux/kernel...
Paolo Bonzini [Sun, 23 Jul 2023 16:50:30 +0000 (12:50 -0400)]
Merge tag 'kvm-s390-master-6.5-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

Two fixes for asynchronous destroy

10 months agoMerge tag 'kvmarm-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmar...
Paolo Bonzini [Sun, 23 Jul 2023 16:50:14 +0000 (12:50 -0400)]
Merge tag 'kvmarm-fixes-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.5, part #1

 - Avoid pKVM finalization if KVM initialization fails

 - Add missing BTI instructions in the hypervisor, fixing an early boot
   failure on BTI systems

 - Handle MMU notifiers correctly for non hugepage-aligned memslots

 - Work around a bug in the architecture where hypervisor timer controls
   have UNKNOWN behavior under nested virt.

 - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel BUG
   in cpu hotplug resulting from per-CPU accessor sanity checking.

 - Make WFI emulation on GICv4 systems robust w.r.t. preemption,
   consistently requesting a doorbell interrupt on vcpu_put()

 - Uphold RES0 sysreg behavior when emulating older PMU versions

 - Avoid macro expansion when initializing PMU register names, ensuring
   the tracepoints pretty-print the sysreg.

10 months agotracing/histograms: Return an error if we fail to add histogram to hist_vars list
Mohamed Khalfella [Fri, 14 Jul 2023 20:33:41 +0000 (20:33 +0000)]
tracing/histograms: Return an error if we fail to add histogram to hist_vars list

Commit 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if
they have referenced variables") added a check to fail histogram creation
if save_hist_vars() failed to add the histogram to the hist_vars list. But
that commit did not set ret to an error code before jumping to the code that
unregisters the histogram, so the failure was not reported. Fix it.
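
The bug pattern is a common one with goto-based cleanup, and can be sketched as below. The functions are hypothetical stand-ins that only reproduce the control flow; they are not the tracing code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a step that can fail. */
static int save_vars(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/* Buggy shape: the failure branch jumps to cleanup with ret still 0. */
int create_hist_buggy(int should_fail)
{
	int ret = 0;	/* earlier steps succeeded */

	if (save_vars(should_fail))
		goto out_unreg;
	return 0;

out_unreg:
	/* cleanup would run here */
	return ret;	/* reports success although creation failed */
}

/* Fixed shape: capture the error code before jumping to cleanup. */
int create_hist_fixed(int should_fail)
{
	int ret;

	ret = save_vars(should_fail);
	if (ret)
		goto out_unreg;
	return 0;

out_unreg:
	/* cleanup would run here */
	assert(ret < 0);
	return ret;
}
```

In the buggy shape the caller sees 0 and believes the histogram exists even though it was torn down; the fixed shape propagates the error.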

Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella@purestorage.com
Cc: stable@vger.kernel.org
Fixes: 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables")
Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agoring-buffer: Do not swap cpu_buffer during resize process
Chen Lin [Wed, 19 Jul 2023 07:58:47 +0000 (15:58 +0800)]
ring-buffer: Do not swap cpu_buffer during resize process

When ring_buffer_swap_cpu was called during resize process,
the cpu buffer was swapped in the middle, resulting in incorrect state.
Continuing to run in the wrong state will result in oops.

This issue can be easily reproduced using the following two scripts:
/tmp # cat test1.sh
#!/bin/sh
for i in `seq 0 100000`
do
         echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
         sleep 0.5
         echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
         sleep 0.5
done
/tmp # cat test2.sh
#!/bin/sh
for i in `seq 0 100000`
do
        echo irqsoff > /sys/kernel/debug/tracing/current_tracer
        sleep 1
        echo nop > /sys/kernel/debug/tracing/current_tracer
        sleep 1
done
/tmp # ./test1.sh &
/tmp # ./test2.sh &

A typical oops log is as follows, sometimes with other different oops logs.

[  231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8
[  231.713375] Modules linked in:
[  231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
[  231.716750] Hardware name: linux,dummy-virt (DT)
[  231.718152] Workqueue: events update_pages_handler
[  231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  231.721171] pc : rb_update_pages+0x378/0x3f8
[  231.722212] lr : rb_update_pages+0x25c/0x3f8
[  231.723248] sp : ffff800082b9bd50
[  231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[  231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0
[  231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a
[  231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000
[  231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510
[  231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002
[  231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558
[  231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001
[  231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000
[  231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208
[  231.744196] Call trace:
[  231.744892]  rb_update_pages+0x378/0x3f8
[  231.745893]  update_pages_handler+0x1c/0x38
[  231.746893]  process_one_work+0x1f0/0x468
[  231.747852]  worker_thread+0x54/0x410
[  231.748737]  kthread+0x124/0x138
[  231.749549]  ret_from_fork+0x10/0x20
[  231.750434] ---[ end trace 0000000000000000 ]---
[  233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  233.721696] Mem abort info:
[  233.721935]   ESR = 0x0000000096000004
[  233.722283]   EC = 0x25: DABT (current EL), IL = 32 bits
[  233.722596]   SET = 0, FnV = 0
[  233.722805]   EA = 0, S1PTW = 0
[  233.723026]   FSC = 0x04: level 0 translation fault
[  233.723458] Data abort info:
[  233.723734]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  233.724176]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  233.724589]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000
[  233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[  233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[  233.726720] Modules linked in:
[  233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.5.0-rc1-00276-g20edcec23f92 #15
[  233.727777] Hardware name: linux,dummy-virt (DT)
[  233.728225] Workqueue: events update_pages_handler
[  233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  233.729054] pc : rb_update_pages+0x1a8/0x3f8
[  233.729334] lr : rb_update_pages+0x154/0x3f8
[  233.729592] sp : ffff800082b9bd50
[  233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000
[  233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418
[  233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003
[  233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58
[  233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001
[  233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000
[  233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c
[  233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0
[  233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000
[  233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000
[  233.734418] Call trace:
[  233.734593]  rb_update_pages+0x1a8/0x3f8
[  233.734853]  update_pages_handler+0x1c/0x38
[  233.735148]  process_one_work+0x1f0/0x468
[  233.735525]  worker_thread+0x54/0x410
[  233.735852]  kthread+0x124/0x138
[  233.736064]  ret_from_fork+0x10/0x20
[  233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060)
[  233.736959] ---[ end trace 0000000000000000 ]---

After analysis, the sequence leading to the error is as follows [1-5]:

int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
                       int cpu_id)
{
        for_each_buffer_cpu(buffer, cpu) {
                cpu_buffer = buffer->buffers[cpu];
                // 1. Get cpu_buffer, aka cpu_buffer(A).
                ...
                ...
                schedule_work_on(cpu,
                                 &cpu_buffer->update_pages_work);
                // 2. 'update_pages_work' is queued on 'cpu'. cpu_buffer(A) is
                //    passed to update_pages_handler, which performs the update
                //    and then calls complete(&cpu_buffer->update_done) to set
                //    'update_done' and wake up the resize process.
                //---->
                // 3. Just at this moment, ring_buffer_swap_cpu is triggered and
                //    cpu_buffer(A) is swapped with cpu_buffer(B), the max_buffer.
                //    ring_buffer_swap_cpu is called as in the 'Call trace' below.

Call trace:
 dump_backtrace+0x0/0x2f8
 show_stack+0x18/0x28
 dump_stack+0x12c/0x188
 ring_buffer_swap_cpu+0x2f8/0x328
 update_max_tr_single+0x180/0x210
 check_critical_timing+0x2b4/0x2c8
 tracer_hardirqs_on+0x1c0/0x200
 trace_hardirqs_on+0xec/0x378
 el0_svc_common+0x64/0x260
 do_el0_svc+0x90/0xf8
 el0_svc+0x20/0x30
 el0_sync_handler+0xb0/0xb8
 el0_sync+0x180/0x1c0
//<----

        /* wait for all the updates to complete */
        for_each_buffer_cpu(buffer, cpu) {
                cpu_buffer = buffer->buffers[cpu];
                // 4. Get cpu_buffer again. cpu_buffer(B) is now used in the
                //    following steps, so the state of cpu_buffer(A) and
                //    cpu_buffer(B) is inconsistent. For example,
                //    cpu_buffer(A)->update_done is left set to 1, so the next
                //    resize round will not block in wait_for_completion().
                if (!cpu_buffer->nr_pages_to_update)
                        continue;

                if (cpu_online(cpu))
                        wait_for_completion(&cpu_buffer->update_done);
                cpu_buffer->nr_pages_to_update = 0;
        }
        ...
}
// 5. With cpu_buffer(A) and cpu_buffer(B) in an inconsistent state,
//    continuing to run eventually results in an oops.
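
One way to rule out this interleaving is to mark a buffer as "resizing" for the whole duration of the resize and have the swap path refuse to run while the mark is set. The sketch below shows that idea with C11 atomics; the struct, the function names and the -EBUSY return value are assumptions for illustration, and it is exercised single-threaded only. A real implementation must also make sure the check and the swap cannot race with the start of a resize.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical, much simplified buffer. */
struct buf {
	atomic_int resizing;	/* non-zero while a resize is in flight */
	int pages;
};

void buf_resize_begin(struct buf *b)
{
	atomic_fetch_add(&b->resizing, 1);
}

void buf_resize_end(struct buf *b)
{
	int prev = atomic_fetch_sub(&b->resizing, 1);

	assert(prev > 0);	/* must pair with buf_resize_begin() */
	(void)prev;
}

/* Swap the contents of two buffers unless either is being resized. */
int buf_swap(struct buf *a, struct buf *b)
{
	int tmp;

	if (atomic_load(&a->resizing) || atomic_load(&b->resizing))
		return -EBUSY;	/* the caller reports a missed swap */

	tmp = a->pages;
	a->pages = b->pages;
	b->pages = tmp;
	return 0;
}
```

A skipped swap costs one snapshot, which the tracer can report, whereas a swap in the middle of a resize leaves both buffers in a state the resize code does not expect.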

Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn
Signed-off-by: Chen Lin <chen.lin5@zte.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agotracing: Remove unused extern declaration tracing_map_set_field_descr()
YueHaibing [Sat, 22 Jul 2023 03:21:23 +0000 (11:21 +0800)]
tracing: Remove unused extern declaration tracing_map_set_field_descr()

Since commit 08d43a5fa063 ("tracing: Add lock-free tracing_map"),
this is never used, so can be removed.

Link: https://lore.kernel.org/linux-trace-kernel/20230722032123.24664-1-yuehaibing@huawei.com
Cc: <mhiramat@kernel.org>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agokbuild: flatten KBUILD_CFLAGS
Alexey Dobriyan [Thu, 13 Jul 2023 18:52:28 +0000 (21:52 +0300)]
kbuild: flatten KBUILD_CFLAGS

Make it slightly easier to see which compiler options are added and
removed (and not worry about column limit too!).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
10 months agogen_compile_commands: add assembly files to compilation database
Benjamin Gray [Wed, 19 Jul 2023 03:19:12 +0000 (13:19 +1000)]
gen_compile_commands: add assembly files to compilation database

Like C source files, tooling can find it useful to have the assembly
source file compilation recorded.

The .S extension appears to be used across all architectures.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
10 months agoext4: fix rbtree traversal bug in ext4_mb_use_preallocated
Ojaswin Mujoo [Sat, 22 Jul 2023 17:15:24 +0000 (22:45 +0530)]
ext4: fix rbtree traversal bug in ext4_mb_use_preallocated

During allocations, while looking for preallocations (PAs) in the per
inode rbtree, we can't do a direct traversal of the tree because
ext4_mb_discard_group_preallocation() can concurrently mark the PA deleted
and that can cause direct traversal to skip some entries. This was
leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy
our request and ultimately tried to create a new PA that would overlap
with the missed one.

To make sure we handle that case while still keeping the performance of
the rbtree, we make use of the fact that the only pa that could possibly
overlap the original goal start is the one that satisfies the below
conditions:

  1. It must have its logical start immediately to the left of
  (i.e. less than) original logical start.

  2. It must not be deleted

To find this pa we use the following traversal method:

1. Descend into the rbtree normally to find the immediate neighboring
PA. Here we keep descending irrespective of if the PA is deleted or if
it overlaps with our request etc. The goal is to find an immediately
adjacent PA.

2. If the found PA is on right of original goal, use rb_prev() to find
the left adjacent PA.

3. Check if this PA is deleted and keep moving left with rb_prev() until
a non deleted PA is found.

4. This is the PA we are looking for. Now we can check if it can satisfy
the original request and proceed accordingly.

This approach also takes care of having deleted PAs in the tree.
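
The traversal steps above can be sketched on a sorted array instead of an rbtree, which keeps the example short while preserving the logic: locate the nearest neighbour while ignoring the deleted flag, then walk left past deleted entries. The struct and function names are hypothetical, and a PA that starts exactly at the goal is treated as a left neighbour here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical preallocation record, kept sorted by start. */
struct pa {
	unsigned long start;
	unsigned long len;
	bool deleted;
};

/*
 * Return the index of the closest non-deleted PA whose start is at or
 * to the left of goal, or -1 if there is none.
 */
long find_left_pa(const struct pa *pas, size_t n, unsigned long goal)
{
	size_t lo = 0, hi = n;
	long i;

	assert(pas != NULL || n == 0);

	/* Steps 1-2: find the adjacent entry, ignoring the deleted flag. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pas[mid].start <= goal)
			lo = mid + 1;
		else
			hi = mid;
	}
	/* lo is now the number of entries with start <= goal. */

	/* Step 3: keep moving left until a non-deleted entry is found. */
	i = (long)lo - 1;
	while (i >= 0 && pas[i].deleted)
		i--;
	return i;
}

/* Step 4: can this PA satisfy a request at goal? */
bool pa_covers(const struct pa *pa, unsigned long goal)
{
	/* Compare offsets so that start + len is never computed. */
	return goal >= pa->start && goal - pa->start < pa->len;
}
```

Because the search never branches on the deleted flag, a concurrent deletion cannot cause a live neighbour to be skipped; deleted entries are only stepped over afterwards.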

(While we are at it, also fix a possible overflow bug in calculating the
end of a PA)

[1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/

Cc: stable@kernel.org # 6.4
Fixes: 3872778664e3 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list")
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Tested-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
10 months agoext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
Ojaswin Mujoo [Fri, 9 Jun 2023 10:34:03 +0000 (16:04 +0530)]
ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()

In ext4_mb_choose_next_group_best_avail(), we want the start order to be
1 less than goal length and the min_order to be, at max, 1 more than the
original length. This commit fixes an off by one issue that arose due to
the fact that 1 << fls(n) > (n).

After all the processing:

order = 1 order below goal len
min_order = maximum of the three:-
             - order - trim_order
             - 1 order below B2C(s_stripe)
             - 1 order above original len
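
The inequality behind the off by one can be checked with a small stand-alone model of fls(), which returns the 1-based position of the most significant set bit and 0 for an input of 0. The helper names below are hypothetical, and the sketch does not reproduce the ext4 calculation itself.

```c
#include <assert.h>

/* Model of fls(): 1-based index of the highest set bit, 0 if x == 0. */
int fls_sketch(unsigned int x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* Largest order such that (1 << order) <= n. Requires n >= 1. */
int order_at_or_below(unsigned int n)
{
	assert(n >= 1);
	return fls_sketch(n) - 1;
}
```

For n = 8, fls() gives 4 and 1 << 4 is 16, which is strictly greater than n; using fls(n) directly as an order therefore overshoots by one even when n is a power of two.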

Cc: stable@kernel.org
Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)")
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
10 months agoext4: correct inline offset when handling xattrs in inode body
Eric Whitney [Mon, 22 May 2023 18:15:20 +0000 (14:15 -0400)]
ext4: correct inline offset when handling xattrs in inode body

When run on a file system where the inline_data feature has been
enabled, xfstests generic/269, generic/270, and generic/476 cause ext4
to emit error messages indicating that inline directory entries are
corrupted.  This occurs because the inline offset used to locate
inline directory entries in the inode body is not updated when an
xattr in that shared region is deleted and the region is shifted in
memory to recover the space it occupied.  If the deleted xattr precedes
the system.data attribute, which points to the inline directory entries,
that attribute will be moved further up in the region.  The inline
offset continues to point to whatever is located in system.data's former
location, with unfortunate effects when used to access directory entries
or (presumably) inline data in the inode body.
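
The failure mode can be reproduced with a flat byte region and a cached offset into it. In the sketch below, removing a span shifts everything after it, and a cached offset that pointed past the removed span is adjusted by the same amount. The function is hypothetical, and the actual fix may recompute the offset by looking the attribute up again rather than adjusting it.

```c
#include <assert.h>
#include <string.h>

/*
 * Remove len bytes at off from a region holding used bytes, shifting
 * the tail into the gap. *cached is an offset into the same region and
 * is corrected if it pointed past the removed span.
 * Returns the new number of used bytes.
 */
size_t region_remove(unsigned char *region, size_t used, size_t off,
		     size_t len, size_t *cached)
{
	assert(off <= used && len <= used - off);

	memmove(region + off, region + off + len, used - off - len);

	if (*cached >= off + len)
		*cached -= len;	/* without this, the offset goes stale */

	return used - len;
}
```

Dropping the adjustment leaves the cached offset pointing at whatever bytes now occupy the old location, which matches the symptom described above.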

Cc: stable@kernel.org
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
10 months agoMerge tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 23 Jul 2023 02:32:00 +0000 (19:32 -0700)]
Merge tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Reinstate support for little endian ELFv1 binaries, which it turns
   out still exist in the wild.

 - Revert a change which used asm goto for WARN_ON/__WARN_FLAGS, as it
   led to dead code generation and seemed to trigger compiler bugs in
   some edge cases.

 - Fix a deadlock in the pseries VAS code, between live migration and
   the driver's mmap handler.

 - Disable KCOV instrumentation in the powerpc KASAN code.

Thanks to Andrew Donnellan, Benjamin Gray, Christophe Leroy, Haren
Myneni, Russell Currey, and Uwe Kleine-König.

* tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"
  powerpc/kasan: Disable KCOV in KASAN code
  powerpc/512x: lpbfifo: Convert to platform remove callback returning void
  powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files
  Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto"
  powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close