For more details refer Documentation/admin-guide/iostats.rst
+What: /sys/block/<disk>/diskseq
+Date: February 2021
+Contact: Matteo Croce <mcroce@microsoft.com>
+Description:
+ The /sys/block/<disk>/diskseq files reports the disk
+ sequence number, which is a monotonically increasing
+ number assigned to every drive.
+ Some devices, like the loop device, refresh such number
+ every time the backing file is changed.
+ The value type is 64 bit unsigned.
+
+
What: /sys/block/<disk>/<part>/stat
Date: February 2008
Contact: Jerome Marchand <jmarchan@redhat.com>
KernelVersion: v4.10
Contact: linux-ide@vger.kernel.org
Description:
- (RW) Write to the file to turn on or off the SATA ncq (native
- command queueing) support. By default this feature is turned
- off.
+ (RW) Write to the file to turn on or off the SATA NCQ (native
+ command queueing) priority support. By default this feature is
+ turned off. If the device does not support the SATA NCQ
+ priority feature, writing "1" to this file results in an error
+ (see ncq_prio_supported).
+
+
+What: /sys/block/*/device/sas_ncq_prio_enable
+Date: Oct, 2016
+KernelVersion: v4.10
+Contact: linux-ide@vger.kernel.org
+Description:
+ (RW) This is the equivalent of the ncq_prio_enable attribute
+ file for SATA devices connected to a SAS host-bus-adapter
+ (HBA) implementing support for the SATA NCQ priority feature.
+ This file does not exist if the HBA driver does not implement
+ support for the SATA NCQ priority feature, regardless of the
+ device support for this feature (see sas_ncq_prio_supported).
+
+
+What: /sys/block/*/device/ncq_prio_supported
+Date: Aug, 2021
+KernelVersion: v5.15
+Contact: linux-ide@vger.kernel.org
+Description:
+ (RO) Indicates if the device supports the SATA NCQ (native
+ command queueing) priority feature.
+
+
+What: /sys/block/*/device/sas_ncq_prio_supported
+Date: Aug, 2021
+KernelVersion: v5.15
+Contact: linux-ide@vger.kernel.org
+Description:
+ (RO) This is the equivalent of the ncq_prio_supported attribute
+ file for SATA devices connected to a SAS host-bus-adapter
+ (HBA) implementing support for the SATA NCQ priority feature.
+ This file does not exist if the HBA driver does not implement
+ support for the SATA NCQ priority feature, regardless of the
+ device support for this feature.
--- /dev/null
+What: /sys/bus/event_source/devices/uncore_*/alias
+Date: June 2021
+KernelVersion: 5.15
+Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description: Read-only. An attribute to describe the alias name of
+ the uncore PMU if an alias exists on some platforms.
+ The 'perf(1)' tool should treat both names the same.
+ They both can be used to access the uncore PMU.
+
+ Example:
+
+ $ cat /sys/devices/uncore_cha_2/alias
+ uncore_type_0_2
value comes from an ACPI _PXM method or a similar firmware
source. Initial users for this file would be devices like
arm smmu which are populated by arm64 acpi_iort.
+
+What: /sys/bus/platform/devices/.../msi_irqs/
+Date: August 2021
+Contact: Barry Song <song.bao.hua@hisilicon.com>
+Description:
+ The /sys/devices/.../msi_irqs directory contains a variable set
+ of files, with each file being named after a corresponding msi
+ irq vector allocated to that device.
+
+What: /sys/bus/platform/devices/.../msi_irqs/<N>
+Date: August 2021
+Contact: Barry Song <song.bao.hua@hisilicon.com>
+Description:
+ This attribute will show "msi" if <N> is a valid msi irq
The ``smp_mb__after_unlock_lock()`` invocations prevent this
``WARN_ON()`` from triggering.
++-----------------------------------------------------------------------+
+| **Quick Quiz**: |
++-----------------------------------------------------------------------+
+| But the chain of rcu_node-structure lock acquisitions guarantees |
+| that new readers will see all of the updater's pre-grace-period |
+| accesses and also guarantees that the updater's post-grace-period |
+| accesses will see all of the old reader's accesses. So why do we |
+| need all of those calls to smp_mb__after_unlock_lock()? |
++-----------------------------------------------------------------------+
+| **Answer**: |
++-----------------------------------------------------------------------+
+| Because we must provide ordering for RCU's polling grace-period |
+| primitives, for example, get_state_synchronize_rcu() and |
+| poll_state_synchronize_rcu(). Consider this code:: |
+| |
+| CPU 0 CPU 1 |
+| ---- ---- |
+| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
+| g = get_state_synchronize_rcu() smp_mb() |
+| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
+| continue; |
+| r0 = READ_ONCE(Y) |
+| |
+| RCU guarantees that the outcome r0 == 0 && r1 == 0 will not |
+| happen, even if CPU 1 is in an RCU extended quiescent state |
+| (idle or offline) and thus won't interact directly with the RCU |
+| core processing at all. |
++-----------------------------------------------------------------------+
+
This approach must be extended to include idle CPUs, which need
RCU's grace-period memory ordering guarantee to extend to any
RCU read-side critical sections preceding and following the current
12 }
The rcu_dereference() uses volatile casts and (for DEC Alpha) memory
-barriers in the Linux kernel. Should a `high-quality implementation of
-C11 ``memory_order_consume``
-[PDF] <http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf>`__
+barriers in the Linux kernel. Should a |high-quality implementation of
+C11 memory_order_consume [PDF]|_
ever appear, then rcu_dereference() could be implemented as a
``memory_order_consume`` load. Regardless of the exact implementation, a
pointer fetched by rcu_dereference() may not be used outside of the
mechanism, most commonly locking or `reference
counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__.
+.. |high-quality implementation of C11 memory_order_consume [PDF]| replace:: high-quality implementation of C11 ``memory_order_consume`` [PDF]
+.. _high-quality implementation of C11 memory_order_consume [PDF]: http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf
+
In short, updaters use rcu_assign_pointer() and readers use
rcu_dereference(), and these two RCU API elements work together to
ensure that readers have a consistent view of newly added data elements.
1. Does the update code have proper mutual exclusion?
- RCU does allow -readers- to run (almost) naked, but -writers- must
+ RCU does allow *readers* to run (almost) naked, but *writers* must
still use some sort of mutual exclusion, such as:
a. locking,
critical section is every bit as bad as letting them leak out
from under a lock. Unless, of course, you have arranged some
other means of protection, such as a lock or a reference count
- -before- letting them out of the RCU read-side critical section.
+ *before* letting them out of the RCU read-side critical section.
3. Does the update code tolerate concurrent accesses?
c. Make updates appear atomic to readers. For example,
pointer updates to properly aligned fields will
appear atomic, as will individual atomic primitives.
- Sequences of operations performed under a lock will -not-
+ Sequences of operations performed under a lock will *not*
appear to be atomic to RCU readers, nor will sequences
of multiple atomic primitives.
for example) may be omitted.
10. Conversely, if you are in an RCU read-side critical section,
- and you don't hold the appropriate update-side lock, you -must-
+ and you don't hold the appropriate update-side lock, you *must*
use the "_rcu()" variants of the list macros. Failing to do so
will break Alpha, cause aggressive compilers to generate bad code,
and confuse people trying to read your code.
callback pending, then that RCU callback will execute on some
surviving CPU. (If this was not the case, a self-spawning RCU
callback would prevent the victim CPU from ever going offline.)
- Furthermore, CPUs designated by rcu_nocbs= might well -always-
+ Furthermore, CPUs designated by rcu_nocbs= might well *always*
have their RCU callbacks executed on some other CPUs, in fact,
for some real-time workloads, this is the whole point of using
the rcu_nocbs= kernel boot parameter.
-13. Unlike other forms of RCU, it -is- permissible to block in an
+13. Unlike other forms of RCU, it *is* permissible to block in an
SRCU read-side critical section (demarked by srcu_read_lock()
and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
Please note that if you don't need to sleep in read-side critical
14. The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
carrying out some otherwise-destructive operation. It is
- therefore critically important to -first- remove any path
+ therefore critically important to *first* remove any path
that readers can follow that could be affected by the
- destructive operation, and -only- -then- invoke call_rcu(),
+ destructive operation, and *only then* invoke call_rcu(),
synchronize_rcu(), or friends.
Because these primitives only wait for pre-existing readers, it
is the caller's responsibility to guarantee that any subsequent
readers will execute safely.
-15. The various RCU read-side primitives do -not- necessarily contain
+15. The various RCU read-side primitives do *not* necessarily contain
memory barriers. You should therefore plan for the CPU
and the compiler to freely reorder code into and out of RCU
read-side critical sections. It is the responsibility of the
pass in a function defined within a loadable module, then it in
necessary to wait for all pending callbacks to be invoked after
the last invocation and before unloading that module. Note that
- it is absolutely -not- sufficient to wait for a grace period!
- The current (say) synchronize_rcu() implementation is -not-
+ it is absolutely *not* sufficient to wait for a grace period!
+ The current (say) synchronize_rcu() implementation is *not*
guaranteed to wait for callbacks registered on other CPUs.
Or even on the current CPU if that CPU recently went offline
and came back online.
- call_rcu() -> rcu_barrier()
- call_srcu() -> srcu_barrier()
- However, these barrier functions are absolutely -not- guaranteed
+ However, these barrier functions are absolutely *not* guaranteed
to wait for a grace period. In fact, if there are no call_rcu()
callbacks waiting anywhere in the system, rcu_barrier() is within
its rights to return immediately.
- Set bits and clear bits down in the must-be-zero low-order
bits of that pointer. This clearly means that the pointer
must have alignment constraints, for example, this does
- -not- work in general for char* pointers.
+ *not* work in general for char* pointers.
- XOR bits to translate pointers, as is done in some
classic buddy-allocator algorithms.
Please see the "CONTROL DEPENDENCIES" section of
Documentation/memory-barriers.txt for more details.
- - The pointers are not equal -and- the compiler does
+ - The pointers are not equal *and* the compiler does
not have enough information to deduce the value of the
pointer. Note that the volatile cast in rcu_dereference()
will normally prevent the compiler from knowing too much.
return values. This can result in "p->b" returning pre-initialization
garbage values.
-In short, rcu_dereference() is -not- optional when you are going to
+In short, rcu_dereference() is *not* optional when you are going to
dereference the resulting pointer.
- Booting Linux using a console connection that is too slow to
keep up with the boot-time console-message rate. For example,
- a 115Kbaud serial console can be -way- too slow to keep up
+ a 115Kbaud serial console can be *way* too slow to keep up
with boot-time message rates, and will frequently result in
RCU CPU stall warning messages. Especially if you have added
debug printk()s.
leading the realization that the CPU had failed.
The RCU, RCU-sched, and RCU-tasks implementations have CPU stall warning.
-Note that SRCU does -not- have CPU stall warnings. Please note that
+Note that SRCU does *not* have CPU stall warnings. Please note that
RCU only detects CPU stalls when there is a grace period in progress.
No grace period, no CPU stall warnings.
this parameter is checked only at the beginning of a cycle.
So if you are 10 seconds into a 40-second stall, setting this
sysfs parameter to (say) five will shorten the timeout for the
- -next- stall, or the following warning for the current stall
+ *next* stall, or the following warning for the current stall
(assuming the stall lasts long enough). It will not affect the
timing of the next warning for the current stall.
Interpreting RCU's CPU Stall-Detector "Splats"
==============================================
-For non-RCU-tasks flavors of RCU, when a CPU detects that it is stalling,
-it will print a message similar to the following::
+For non-RCU-tasks flavors of RCU, when a CPU detects that some other
+CPU is stalling, it will print a message similar to the following::
INFO: rcu_sched detected stalls on CPUs/tasks:
2-...: (3 GPs behind) idle=06c/0/0 softirq=1453/1455 fqs=0
will normally be followed by stack dumps for each CPU. Please note that
PREEMPT_RCU builds can be stalled by tasks as well as by CPUs, and that
the tasks will be indicated by PID, for example, "P3421". It is even
-possible for an rcu_state stall to be caused by both CPUs -and- tasks,
+possible for an rcu_state stall to be caused by both CPUs *and* tasks,
in which case the offending CPUs and tasks will all be called out in the list.
+In some cases, CPUs will detect themselves stalling, which will result
+in a self-detected stall.
CPU 2's "(3 GPs behind)" indicates that this CPU has not interacted with
the RCU core for the past three grace periods. In contrast, CPU 16's "(0
last noted the beginning of a grace period, which might be the current
(stalled) grace period, or it might be some earlier grace period (for
example, if the CPU might have been in dyntick-idle mode for an extended
-time period. The number after the "/" is the number that have executed
+time period). The number after the "/" is the number that have executed
since boot until the current time. If this latter number stays constant
across repeated stall-warning messages, it is possible that RCU's softirq
handlers are no longer able to execute on this CPU. This can happen if
the stall warning, as was the case in the "All QSes seen" line above,
the following additional line is printed::
- kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
+ rcu_sched kthread starved for 23807 jiffies! g7075 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x1 ->cpu=5
+ Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
Starving the grace-period kthreads of CPU time can of course result
in RCU CPU stall warnings even when all CPUs and tasks have passed
change on successive RCU CPU stall warnings, there is further reason to
suspect a timer problem.
+These messages are usually followed by stack dumps of the CPUs and tasks
+involved in the stall. These stack traces can help you locate the cause
+of the stall, keeping in mind that the CPU detecting the stall will have
+an interrupt frame that is mainly devoted to detecting the stall.
+
Multiple Warnings From One Stall
================================
-If a stall lasts long enough, multiple stall-warning messages will be
-printed for it. The second and subsequent messages are printed at
+If a stall lasts long enough, multiple stall-warning messages will
+be printed for it. The second and subsequent messages are printed at
longer intervals, so that the time between (say) the first and second
message will be about three times the interval between the beginning
-of the stall and the first message.
+of the stall and the first message. It can be helpful to compare the
+stack dumps for the different messages for the same stalled grace period.
Stall Warnings for Expedited Grace Periods
multihit.rst
special-register-buffer-data-sampling.rst
core-scheduling.rst
+ l1d_flush.rst
--- /dev/null
+L1D Flushing
+============
+
+With an increasing number of vulnerabilities being reported around data
+leaks from the Level 1 Data cache (L1D) the kernel provides an opt-in
+mechanism to flush the L1D cache on context switch.
+
+This mechanism can be used to address e.g. CVE-2020-0550. For applications
+the mechanism keeps them safe from vulnerabilities, related to leaks
+(snooping of) from the L1D cache.
+
+
+Related CVEs
+------------
+The following CVEs can be addressed by this
+mechanism
+
+ ============= ======================== ==================
+ CVE-2020-0550 Improper Data Forwarding OS related aspects
+ ============= ======================== ==================
+
+Usage Guidelines
+----------------
+
+Please see document: :ref:`Documentation/userspace-api/spec_ctrl.rst
+<set_spec_ctrl>` for details.
+
+**NOTE**: The feature is disabled by default, applications need to
+specifically opt into the feature to enable it.
+
+Mitigation
+----------
+
+When PR_SET_L1D_FLUSH is enabled for a task a flush of the L1D cache is
+performed when the task is scheduled out and the incoming task belongs to a
+different process and therefore to a different address space.
+
+If the underlying CPU supports L1D flushing in hardware, the hardware
+mechanism is used, software fallback for the mitigation, is not supported.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows to control the L1D flush mitigations at boot
+time with the option "l1d_flush=". The valid arguments for this option are:
+
+ ============ =============================================================
+ on Enables the prctl interface, applications trying to use
+ the prctl() will fail with an error if l1d_flush is not
+ enabled
+ ============ =============================================================
+
+By default the mechanism is disabled.
+
+Limitations
+-----------
+
+The mechanism does not mitigate L1D data leaks between tasks belonging to
+different processes which are concurrently executing on sibling threads of
+a physical CPU core when SMT is enabled on the system.
+
+This can be addressed by controlled placement of processes on physical CPU
+cores or by disabling SMT. See the relevant chapter in the L1TF mitigation
+document: :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.
+
+**NOTE** : The opt-in of a task for L1D flushing works only when the task's
+affinity is limited to cores running in non-SMT mode. If a task which
+requested L1D flushing is scheduled on a SMT-enabled core the kernel sends
+a SIGBUS to the task.
feature (tagged TLBs) on capable Intel chips.
Default is 1 (enabled)
+ l1d_flush= [X86,INTEL]
+ Control mitigation for L1D based snooping vulnerability.
+
+ Certain CPUs are vulnerable to an exploit against CPU
+ internal buffers which can forward information to a
+ disclosure gadget under certain conditions.
+
+ In vulnerable processors, the speculatively
+ forwarded data can be used in a cache side channel
+ attack, to access data to which the attacker does
+ not have direct access.
+
+ This parameter controls the mitigation. The
+ options are:
+
+ on - enable the interface for the mitigation
+
l1tf= [X86] Control mitigation of the L1TF vulnerability on
affected CPUs
reboot= [KNL]
Format (x86 or x86_64):
- [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \
+ [w[arm] | c[old] | h[ard] | s[oft] | g[pio]] | d[efault] \
[[,]s[mp]#### \
[[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \
[[,]f[orce]
SC *y, t;
is allowed.
+
+
+CMPXCHG vs TRY_CMPXCHG
+----------------------
+
+ int atomic_cmpxchg(atomic_t *ptr, int old, int new);
+ bool atomic_try_cmpxchg(atomic_t *ptr, int *oldp, int new);
+
+Both provide the same functionality, but try_cmpxchg() can lead to more
+compact code. The functions relate like:
+
+ bool atomic_try_cmpxchg(atomic_t *ptr, int *oldp, int new)
+ {
+ int ret, old = *oldp;
+ ret = atomic_cmpxchg(ptr, old, new);
+ if (ret != old)
+ *oldp = ret;
+ return ret == old;
+ }
+
+and:
+
+ int atomic_cmpxchg(atomic_t *ptr, int old, int new)
+ {
+ (void)atomic_try_cmpxchg(ptr, &old, new);
+ return old;
+ }
+
+Usage:
+
+ old = atomic_read(&v); old = atomic_read(&v);
+ for (;;) { do {
+ new = func(old); new = func(old);
+ tmp = atomic_cmpxchg(&v, old, new); } while (!atomic_try_cmpxchg(&v, &old, new));
+ if (tmp == old)
+ break;
+ old = tmp;
+ }
+
+NB. try_cmpxchg() also generates better code on some platforms (notably x86)
+where the function more closely matches the hardware instruction.
+
+
+FORWARD PROGRESS
+----------------
+
+In general strong forward progress is expected of all unconditional atomic
+operations -- those in the Arithmetic and Bitwise classes and xchg(). However
+a fair amount of code also requires forward progress from the conditional
+atomic operations.
+
+Specifically 'simple' cmpxchg() loops are expected to not starve one another
+indefinitely. However, this is not evident on LL/SC architectures, because
+while an LL/SC architecure 'can/should/must' provide forward progress
+guarantees between competing LL/SC sections, such a guarantee does not
+transfer to cmpxchg() implemented using LL/SC. Consider:
+
+ old = atomic_read(&v);
+ do {
+ new = func(old);
+ } while (!atomic_try_cmpxchg(&v, &old, new));
+
+which on LL/SC becomes something like:
+
+ old = atomic_read(&v);
+ do {
+ new = func(old);
+ } while (!({
+ volatile asm ("1: LL %[oldval], %[v]\n"
+ " CMP %[oldval], %[old]\n"
+ " BNE 2f\n"
+ " SC %[new], %[v]\n"
+ " BNE 1b\n"
+ "2:\n"
+ : [oldval] "=&r" (oldval), [v] "m" (v)
+ : [old] "r" (old), [new] "r" (new)
+ : "memory");
+ success = (oldval == old);
+ if (!success)
+ old = oldval;
+ success; }));
+
+However, even the forward branch from the failed compare can cause the LL/SC
+to fail on some architectures, let alone whatever the compiler makes of the C
+loop body. As a result there is no guarantee what so ever the cacheline
+containing @v will stay on the local CPU and progress is made.
+
+Even native CAS architectures can fail to provide forward progress for their
+primitive (See Sparc64 for an example).
+
+Such implementations are strongly encouraged to add exponential backoff loops
+to a failed CAS in order to ensure some progress. Affected architectures are
+also strongly encouraged to inspect/audit the atomic fallbacks, refcount_t and
+their locking primitives.
each registration and removal function is also available with a ``_nocalls``
suffix which does not invoke the provided callbacks if the invocation of the
callbacks is not desired. During the manual setup (or teardown) the functions
-``get_online_cpus()`` and ``put_online_cpus()`` should be used to inhibit CPU
+``cpus_read_lock()`` and ``cpus_read_unlock()`` should be used to inhibit CPU
hotplug operations.
the hwirq, and call the .map() callback so the driver can perform any
required hardware setup.
-When an interrupt is received, irq_find_mapping() function should
-be used to find the Linux IRQ number from the hwirq number.
+Once a mapping has been established, it can be retrieved or used via a
+variety of methods:
+
+- irq_resolve_mapping() returns a pointer to the irq_desc structure
+ for a given domain and hwirq number, and NULL if there was no
+ mapping.
+- irq_find_mapping() returns a Linux IRQ number for a given domain and
+ hwirq number, and 0 if there was no mapping
+- irq_linear_revmap() is now identical to irq_find_mapping(), and is
+ deprecated
+- generic_handle_domain_irq() handles an interrupt described by a
+ domain and a hwirq number
+- handle_domain_irq() does the same thing for root interrupt
+ controllers and deals with the set_irq_reg()/irq_enter() sequences
+ that most architecture requires
+
+Note that irq domain lookups must happen in contexts that are
+compatible with a RCU read-side critical section.
The irq_create_mapping() function must be called *atleast once*
before any call to irq_find_mapping(), lest the descriptor will not
IRQ number and call the .map() callback so that driver can program the
Linux IRQ number into the hardware.
-Most drivers cannot use this mapping.
+Most drivers cannot use this mapping, and it is now gated on the
+CONFIG_IRQ_DOMAIN_NOMAP option. Please refrain from introducing new
+users of this API.
Legacy
------
case the Linux IRQ numbers cannot be dynamically assigned and the legacy
mapping should be used.
+As the name implies, the *_legacy() functions are deprecated and only
+exist to ease the support of ancient platforms. No new users should be
+added.
+
The legacy map assumes a contiguous range of IRQ numbers has already
been allocated for the controller and that the IRQ number can be
calculated by adding a fixed offset to the hwirq number, and
compatible:
enum:
- ibm,fsi2spi
- - ibm,fsi2spi-restricted
reg:
items:
maxItems: 1
clocks:
- maxItems: 1
+ minItems: 1
+ items:
+ - description: APB interface clock source
+ - description: GPIO debounce reference clock source
gpio-controller: true
compatible:
const: simple-battery
+ device-chemistry:
+ description: This describes the chemical technology of the battery.
+ oneOf:
+ - const: nickel-cadmium
+ - const: nickel-metal-hydride
+ - const: lithium-ion
+ description: This is a blanket type for all lithium-ion batteries,
+ including those below. If possible, a precise compatible string
+ from below should be used, but sometimes it is unknown which specific
+ lithium ion battery is employed and this wide compatible can be used.
+ - const: lithium-ion-polymer
+ - const: lithium-ion-iron-phosphate
+ - const: lithium-ion-manganese-oxide
+
over-voltage-threshold-microvolt:
description: battery over-voltage limit
- maxim,max17047
- maxim,max17050
- maxim,max17055
+ - maxim,max77849-battery
reg:
maxItems: 1
interrupts:
maxItems: 1
+ description: |
+ The ALRT pin, an open-drain interrupt.
maxim,rsns-microohm:
$ref: /schemas/types.yaml#/definitions/uint32
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/power/supply/mt6360_charger.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Battery charger driver for MT6360 PMIC from MediaTek Integrated.
+
+maintainers:
+ - Gene Chen <gene_chen@richtek.com>
+
+description: |
+ This module is part of the MT6360 MFD device.
+ Provides Battery Charger, Boost for OTG devices and BC1.2 detection.
+
+properties:
+ compatible:
+ const: mediatek,mt6360-chg
+
+ richtek,vinovp-microvolt:
+ description: Maximum CHGIN regulation voltage in uV.
+ enum: [ 5500000, 6500000, 11000000, 14500000 ]
+
+
+ usb-otg-vbus-regulator:
+ type: object
+ description: OTG boost regulator.
+ $ref: /schemas/regulator/regulator.yaml#
+
+required:
+ - compatible
+
+additionalProperties: false
+
+examples:
+ - |
+ mt6360_charger: charger {
+ compatible = "mediatek,mt6360-chg";
+ richtek,vinovp-microvolt = <14500000>;
+
+ otg_vbus_regulator: usb-otg-vbus-regulator {
+ regulator-compatible = "usb-otg-vbus";
+ regulator-name = "usb-otg-vbus";
+ regulator-min-microvolt = <4425000>;
+ regulator-max-microvolt = <5825000>;
+ };
+ };
+...
- 1 # SMB3XX_SOFT_TEMP_COMPENSATE_CURRENT Current compensation
- 2 # SMB3XX_SOFT_TEMP_COMPENSATE_VOLTAGE Voltage compensation
+ summit,inok-polarity:
+ description: |
+ Polarity of INOK signal indicating presence of external power supply.
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum:
+ - 0 # SMB3XX_SYSOK_INOK_ACTIVE_LOW
+ - 1 # SMB3XX_SYSOK_INOK_ACTIVE_HIGH
+
+ usb-vbus:
+ $ref: "../../regulator/regulator.yaml#"
+ type: object
+
+ properties:
+ summit,needs-inok-toggle:
+ type: boolean
+ description: INOK signal is fixed and polarity needs to be toggled
+ in order to enable/disable output mode.
+
+ unevaluatedProperties: false
+
allOf:
- if:
properties:
reg = <0x7f>;
summit,enable-charge-control = <SMB3XX_CHG_ENABLE_PIN_ACTIVE_HIGH>;
+ summit,inok-polarity = <SMB3XX_SYSOK_INOK_ACTIVE_LOW>;
summit,chip-temperature-threshold-celsius = <110>;
summit,mains-current-limit-microamp = <2000000>;
summit,usb-current-limit-microamp = <500000>;
summit,enable-mains-charging;
monitored-battery = <&battery>;
+
+ usb-vbus {
+ regulator-name = "usb_vbus";
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <5000000>;
+ regulator-min-microamp = <750000>;
+ regulator-max-microamp = <750000>;
+ summit,needs-inok-toggle;
+ };
};
};
properties:
compatible:
- enum:
- - x-powers,axp202-ac-power-supply
- - x-powers,axp221-ac-power-supply
- - x-powers,axp813-ac-power-supply
+ oneOf:
+ - const: x-powers,axp202-ac-power-supply
+ - const: x-powers,axp221-ac-power-supply
+ - items:
+ - const: x-powers,axp803-ac-power-supply
+ - const: x-powers,axp813-ac-power-supply
+ - const: x-powers,axp813-ac-power-supply
required:
- compatible
properties:
compatible:
- enum:
- - x-powers,axp209-battery-power-supply
- - x-powers,axp221-battery-power-supply
- - x-powers,axp813-battery-power-supply
+ oneOf:
+ - const: x-powers,axp202-battery-power-supply
+ - const: x-powers,axp209-battery-power-supply
+ - const: x-powers,axp221-battery-power-supply
+ - items:
+ - const: x-powers,axp803-battery-power-supply
+ - const: x-powers,axp813-battery-power-supply
+ - const: x-powers,axp813-battery-power-supply
required:
- compatible
properties:
compatible:
- enum:
- - x-powers,axp202-usb-power-supply
- - x-powers,axp221-usb-power-supply
- - x-powers,axp223-usb-power-supply
- - x-powers,axp813-usb-power-supply
+ oneOf:
+ - enum:
+ - x-powers,axp202-usb-power-supply
+ - x-powers,axp221-usb-power-supply
+ - x-powers,axp223-usb-power-supply
+ - x-powers,axp813-usb-power-supply
+ - items:
+ - const: x-powers,axp803-usb-power-supply
+ - const: x-powers,axp813-usb-power-supply
required:
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/richtek,rtq2134-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Richtek RTQ2134 SubPMIC Regulator
+
+maintainers:
+ - ChiYuan Huang <cy_huang@richtek.com>
+
+description: |
+ The RTQ2134 is a multi-phase, programmable power management IC that
+ integrates with four high efficient, synchronous step-down converter cores.
+
+ Datasheet is available at
+ https://www.richtek.com/assets/product_file/RTQ2134-QA/DSQ2134-QA-01.pdf
+
+properties:
+ compatible:
+ enum:
+ - richtek,rtq2134
+
+ reg:
+ maxItems: 1
+
+ regulators:
+ type: object
+
+ patternProperties:
+ "^buck[1-3]$":
+ type: object
+ $ref: regulator.yaml#
+ description: |
+ regulator description for buck[1-3].
+
+ properties:
+ richtek,use-vsel-dvs:
+ type: boolean
+ description: |
+ If specified, buck will listen to 'vsel' pin for dvs config.
+ Else, use dvs0 voltage by default.
+
+ richtek,uv-shutdown:
+ type: boolean
+ description: |
+ If specified, use shutdown as UV action. Else, hiccup by default.
+
+ unevaluatedProperties: false
+
+ additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - regulators
+
+additionalProperties: false
+
+examples:
+ - |
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ rtq2134@18 {
+ compatible = "richtek,rtq2134";
+ reg = <0x18>;
+
+ regulators {
+ buck1 {
+ regulator-name = "rtq2134-buck1";
+ regulator-min-microvolt = <300000>;
+ regulator-max-microvolt = <1850000>;
+ regulator-always-on;
+ richtek,use-vsel-dvs;
+ regulator-state-mem {
+ regulator-suspend-min-microvolt = <550000>;
+ regulator-suspend-max-microvolt = <550000>;
+ };
+ };
+ buck2 {
+ regulator-name = "rtq2134-buck2";
+ regulator-min-microvolt = <1120000>;
+ regulator-max-microvolt = <1120000>;
+ regulator-always-on;
+ richtek,use-vsel-dvs;
+ regulator-state-mem {
+ regulator-suspend-min-microvolt = <1120000>;
+ regulator-suspend-max-microvolt = <1120000>;
+ };
+ };
+ buck3 {
+ regulator-name = "rtq2134-buck3";
+ regulator-min-microvolt = <600000>;
+ regulator-max-microvolt = <600000>;
+ regulator-always-on;
+ richtek,use-vsel-dvs;
+ regulator-state-mem {
+ regulator-suspend-min-microvolt = <600000>;
+ regulator-suspend-max-microvolt = <600000>;
+ };
+ };
+ };
+ };
+ };
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/richtek,rtq6752-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Richtek RTQ6752 TFT LCD Voltage Regulator
+
+maintainers:
+ - ChiYuan Huang <cy_huang@richtek.com>
+
+description: |
+ The RTQ6752 is an I2C interface pgorammable power management IC. It includes
+ two synchronous boost converter for PAVDD, and one synchronous NAVDD
+ buck-boost. The device is suitable for automotive TFT-LCD panel.
+
+properties:
+ compatible:
+ enum:
+ - richtek,rtq6752
+
+ reg:
+ maxItems: 1
+
+ enable-gpios:
+ description: |
+ A connection of the chip 'enable' gpio line. If not provided, treat it as
+ external pull up.
+ maxItems: 1
+
+ regulators:
+ type: object
+
+ patternProperties:
+ "^(p|n)avdd$":
+ type: object
+ $ref: regulator.yaml#
+ description: |
+ regulator description for pavdd and navdd.
+
+ additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - regulators
+
+additionalProperties: false
+
+examples:
+ - |
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ rtq6752@6b {
+ compatible = "richtek,rtq6752";
+ reg = <0x6b>;
+ enable-gpios = <&gpio26 2 0>;
+
+ regulators {
+ pavdd {
+ regulator-name = "rtq6752-pavdd";
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <7300000>;
+ regulator-boot-on;
+ };
+ navdd {
+ regulator-name = "rtq6752-navdd";
+ regulator-min-microvolt = <5000000>;
+ regulator-max-microvolt = <7300000>;
+ regulator-boot-on;
+ };
+ };
+ };
+ };
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/socionext,uniphier-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Socionext UniPhier regulator controller
+
+description: |
+ This regulator controls VBUS and belongs to USB3 glue layer. Before using
+ the regulator, it is necessary to control the clocks and resets to enable
+ this layer. These clocks and resets should be described in each property.
+
+maintainers:
+ - Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
+
+allOf:
+ - $ref: "regulator.yaml#"
+
+# USB3 Controller
+
+properties:
+ compatible:
+ enum:
+ - socionext,uniphier-pro4-usb3-regulator
+ - socionext,uniphier-pro5-usb3-regulator
+ - socionext,uniphier-pxs2-usb3-regulator
+ - socionext,uniphier-ld20-usb3-regulator
+ - socionext,uniphier-pxs3-usb3-regulator
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ minItems: 1
+ maxItems: 2
+
+ clock-names:
+ oneOf:
+ - items: # for Pro4, Pro5
+ - const: gio
+ - const: link
+ - items: # for others
+ - const: link
+
+ resets:
+ minItems: 1
+ maxItems: 2
+
+ reset-names:
+ oneOf:
+ - items: # for Pro4, Pro5
+ - const: gio
+ - const: link
+ - items:
+ - const: link
+
+additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - clocks
+ - clock-names
+ - resets
+ - reset-names
+
+examples:
+ - |
+ usb-glue@65b00000 {
+ compatible = "simple-mfd";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0 0x65b00000 0x400>;
+
+ usb_vbus0: regulators@100 {
+ compatible = "socionext,uniphier-ld20-usb3-regulator";
+ reg = <0x100 0x10>;
+ clock-names = "link";
+ clocks = <&sys_clk 14>;
+ reset-names = "link";
+ resets = <&sys_rst 14>;
+ };
+ };
+
+++ /dev/null
-Socionext UniPhier Regulator Controller
-
-This describes the devicetree bindings for regulator controller implemented
-on Socionext UniPhier SoCs.
-
-USB3 Controller
----------------
-
-This regulator controls VBUS and belongs to USB3 glue layer. Before using
-the regulator, it is necessary to control the clocks and resets to enable
-this layer. These clocks and resets should be described in each property.
-
-Required properties:
-- compatible: Should be
- "socionext,uniphier-pro4-usb3-regulator" - for Pro4 SoC
- "socionext,uniphier-pro5-usb3-regulator" - for Pro5 SoC
- "socionext,uniphier-pxs2-usb3-regulator" - for PXs2 SoC
- "socionext,uniphier-ld20-usb3-regulator" - for LD20 SoC
- "socionext,uniphier-pxs3-usb3-regulator" - for PXs3 SoC
-- reg: Specifies offset and length of the register set for the device.
-- clocks: A list of phandles to the clock gate for USB3 glue layer.
- According to the clock-names, appropriate clocks are required.
-- clock-names: Should contain
- "gio", "link" - for Pro4 and Pro5 SoCs
- "link" - for others
-- resets: A list of phandles to the reset control for USB3 glue layer.
- According to the reset-names, appropriate resets are required.
-- reset-names: Should contain
- "gio", "link" - for Pro4 and Pro5 SoCs
- "link" - for others
-
-See Documentation/devicetree/bindings/regulator/regulator.txt
-for more details about the regulator properties.
-
-Example:
-
- usb-glue@65b00000 {
- compatible = "socionext,uniphier-ld20-dwc3-glue",
- "simple-mfd";
- #address-cells = <1>;
- #size-cells = <1>;
- ranges = <0 0x65b00000 0x400>;
-
- usb_vbus0: regulators@100 {
- compatible = "socionext,uniphier-ld20-usb3-regulator";
- reg = <0x100 0x10>;
- clock-names = "link";
- clocks = <&sys_clk 14>;
- reset-names = "link";
- resets = <&sys_rst 14>;
- };
-
- phy {
- ...
- phy-supply = <&usb_vbus0>;
- };
- ...
- };
+++ /dev/null
-OMAP2+ McSPI device
-
-Required properties:
-- compatible :
- - "ti,am654-mcspi" for AM654.
- - "ti,omap2-mcspi" for OMAP2 & OMAP3.
- - "ti,omap4-mcspi" for OMAP4+.
-- ti,spi-num-cs : Number of chipselect supported by the instance.
-- ti,hwmods: Name of the hwmod associated to the McSPI
-- ti,pindir-d0-out-d1-in: Select the D0 pin as output and D1 as
- input. The default is D0 as input and
- D1 as output.
-
-Optional properties:
-- dmas: List of DMA specifiers with the controller specific format
- as described in the generic DMA client binding. A tx and rx
- specifier is required for each chip select.
-- dma-names: List of DMA request names. These strings correspond
- 1:1 with the DMA specifiers listed in dmas. The string naming
- is to be "rxN" and "txN" for RX and TX requests,
- respectively, where N equals the chip select number.
-
-Examples:
-
-[hwmod populated DMA resources]
-
-mcspi1: mcspi@1 {
- #address-cells = <1>;
- #size-cells = <0>;
- compatible = "ti,omap4-mcspi";
- ti,hwmods = "mcspi1";
- ti,spi-num-cs = <4>;
-};
-
-[generic DMA request binding]
-
-mcspi1: mcspi@1 {
- #address-cells = <1>;
- #size-cells = <0>;
- compatible = "ti,omap4-mcspi";
- ti,hwmods = "mcspi1";
- ti,spi-num-cs = <2>;
- dmas = <&edma 42
- &edma 43
- &edma 44
- &edma 45>;
- dma-names = "tx0", "rx0", "tx1", "rx1";
-};
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/omap-spi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: SPI controller bindings for OMAP and K3 SoCs
+
+maintainers:
+ - Aswath Govindraju <a-govindraju@ti.com>
+
+allOf:
+ - $ref: spi-controller.yaml#
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - ti,am654-mcspi
+ - ti,am4372-mcspi
+ - const: ti,omap4-mcspi
+ - items:
+ - enum:
+ - ti,omap2-mcspi
+ - ti,omap4-mcspi
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ maxItems: 1
+
+ power-domains:
+ maxItems: 1
+
+ ti,spi-num-cs:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: Number of chipselect supported by the instance.
+ minimum: 1
+ maximum: 4
+
+ ti,hwmods:
+ $ref: /schemas/types.yaml#/definitions/string
+ description:
+ Must be "mcspi<n>", n being the instance number (1-based).
+ This property is applicable only on legacy platforms mainly omap2/3
+ and ti81xx and should not be used on other platforms.
+ deprecated: true
+
+ ti,pindir-d0-out-d1-in:
+ description:
+ Select the D0 pin as output and D1 as input. The default is D0
+ as input and D1 as output.
+ type: boolean
+
+ dmas:
+ description:
+ List of DMA specifiers with the controller specific format as
+ described in the generic DMA client binding. A tx and rx
+ specifier is required for each chip select.
+ minItems: 1
+ maxItems: 8
+
+ dma-names:
+ description:
+ List of DMA request names. These strings correspond 1:1 with
+ the DMA sepecifiers listed in dmas. The string names is to be
+ "rxN" and "txN" for RX and TX requests, respectively. Where N
+ is the chip select number.
+ minItems: 1
+ maxItems: 8
+
+required:
+ - compatible
+ - reg
+ - interrupts
+
+unevaluatedProperties: false
+
+if:
+ properties:
+ compatible:
+ oneOf:
+ - const: ti,omap2-mcspi
+ - const: ti,omap4-mcspi
+
+then:
+ properties:
+ ti,hwmods:
+ items:
+ - pattern: "^mcspi([1-9])$"
+
+else:
+ properties:
+ ti,hwmods: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/soc/ti,sci_pm_domain.h>
+
+ spi@2100000 {
+ compatible = "ti,am654-mcspi","ti,omap4-mcspi";
+ reg = <0x2100000 0x400>;
+ interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&k3_clks 137 1>;
+ power-domains = <&k3_pds 137 TI_SCI_PD_EXCLUSIVE>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ dmas = <&main_udmap 0xc500>, <&main_udmap 0x4500>;
+ dma-names = "tx0", "rx0";
+ };
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/rockchip-sfc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Rockchip Serial Flash Controller (SFC)
+
+maintainers:
+ - Heiko Stuebner <heiko@sntech.de>
+ - Chris Morgan <macromorgan@hotmail.com>
+
+allOf:
+ - $ref: spi-controller.yaml#
+
+properties:
+ compatible:
+ const: rockchip,sfc
+ description:
+ The rockchip sfc controller is a standalone IP with version register,
+ and the driver can handle all the feature difference inside the IP
+ depending on the version register.
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: Bus Clock
+ - description: Module Clock
+
+ clock-names:
+ items:
+ - const: clk_sfc
+ - const: hclk_sfc
+
+ power-domains:
+ maxItems: 1
+
+ rockchip,sfc-no-dma:
+ description: Disable DMA and utilize FIFO mode only
+ type: boolean
+
+patternProperties:
+ "^flash@[0-3]$":
+ type: object
+ properties:
+ reg:
+ minimum: 0
+ maximum: 3
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - clock-names
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/px30-cru.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/power/px30-power.h>
+
+ sfc: spi@ff3a0000 {
+ compatible = "rockchip,sfc";
+ reg = <0xff3a0000 0x4000>;
+ interrupts = <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&cru SCLK_SFC>, <&cru HCLK_SFC>;
+ clock-names = "clk_sfc", "hclk_sfc";
+ pinctrl-0 = <&sfc_clk &sfc_cs &sfc_bus2>;
+ pinctrl-names = "default";
+ power-domains = <&power PX30_PD_MMC_NAND>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ flash@0 {
+ compatible = "jedec,spi-nor";
+ reg = <0>;
+ spi-max-frequency = <108000000>;
+ spi-rx-bus-width = <2>;
+ spi-tx-bus-width = <2>;
+ };
+ };
+
+...
- mediatek,mt8135-spi: for mt8135 platforms
- mediatek,mt8173-spi: for mt8173 platforms
- mediatek,mt8183-spi: for mt8183 platforms
+ - mediatek,mt6893-spi: for mt6893 platforms
- "mediatek,mt8192-spi", "mediatek,mt6765-spi": for mt8192 platforms
- "mediatek,mt8195-spi", "mediatek,mt6765-spi": for mt8195 platforms
- "mediatek,mt8516-spi", "mediatek,mt2712-spi": for mt8516 platforms
+++ /dev/null
-Spreadtrum ADI controller
-
-ADI is the abbreviation of Anolog-Digital interface, which is used to access
-analog chip (such as PMIC) from digital chip. ADI controller follows the SPI
-framework for its hardware implementation is alike to SPI bus and its timing
-is compatile to SPI timing.
-
-ADI controller has 50 channels including 2 software read/write channels and
-48 hardware channels to access analog chip. For 2 software read/write channels,
-users should set ADI registers to access analog chip. For hardware channels,
-we can configure them to allow other hardware components to use it independently,
-which means we can just link one analog chip address to one hardware channel,
-then users can access the mapped analog chip address by this hardware channel
-triggered by hardware components instead of ADI software channels.
-
-Thus we introduce one property named "sprd,hw-channels" to configure hardware
-channels, the first value specifies the hardware channel id which is used to
-transfer data triggered by hardware automatically, and the second value specifies
-the analog chip address where user want to access by hardware components.
-
-Since we have multi-subsystems will use unique ADI to access analog chip, when
-one system is reading/writing data by ADI software channels, that should be under
-one hardware spinlock protection to prevent other systems from reading/writing
-data by ADI software channels at the same time, or two parallel routine of setting
-ADI registers will make ADI controller registers chaos to lead incorrect results.
-Then we need one hardware spinlock to synchronize between the multiple subsystems.
-
-The new version ADI controller supplies multiple master channels for different
-subsystem accessing, that means no need to add hardware spinlock to synchronize,
-thus change the hardware spinlock support to be optional to keep backward
-compatibility.
-
-Required properties:
-- compatible: Should be "sprd,sc9860-adi".
-- reg: Offset and length of ADI-SPI controller register space.
-- #address-cells: Number of cells required to define a chip select address
- on the ADI-SPI bus. Should be set to 1.
-- #size-cells: Size of cells required to define a chip select address size
- on the ADI-SPI bus. Should be set to 0.
-
-Optional properties:
-- hwlocks: Reference to a phandle of a hwlock provider node.
-- hwlock-names: Reference to hwlock name strings defined in the same order
- as the hwlocks, should be "adi".
-- sprd,hw-channels: This is an array of channel values up to 49 channels.
- The first value specifies the hardware channel id which is used to
- transfer data triggered by hardware automatically, and the second
- value specifies the analog chip address where user want to access
- by hardware components.
-
-SPI slave nodes must be children of the SPI controller node and can contain
-properties described in Documentation/devicetree/bindings/spi/spi-bus.txt.
-
-Example:
- adi_bus: spi@40030000 {
- compatible = "sprd,sc9860-adi";
- reg = <0 0x40030000 0 0x10000>;
- hwlocks = <&hwlock1 0>;
- hwlock-names = "adi";
- #address-cells = <1>;
- #size-cells = <0>;
- sprd,hw-channels = <30 0x8c20>;
- };
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/spi/sprd,spi-adi.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: Spreadtrum ADI controller
+
+maintainers:
+ - Orson Zhai <orsonzhai@gmail.com>
+ - Baolin Wang <baolin.wang7@gmail.com>
+ - Chunyan Zhang <zhang.lyra@gmail.com>
+
+description: |
+ ADI is the abbreviation of Anolog-Digital interface, which is used to access
+ analog chip (such as PMIC) from digital chip. ADI controller follows the SPI
+ framework for its hardware implementation is alike to SPI bus and its timing
+ is compatile to SPI timing.
+
+ ADI controller has 50 channels including 2 software read/write channels and
+ 48 hardware channels to access analog chip. For 2 software read/write channels,
+ users should set ADI registers to access analog chip. For hardware channels,
+ we can configure them to allow other hardware components to use it independently,
+ which means we can just link one analog chip address to one hardware channel,
+ then users can access the mapped analog chip address by this hardware channel
+ triggered by hardware components instead of ADI software channels.
+
+ Thus we introduce one property named "sprd,hw-channels" to configure hardware
+ channels, the first value specifies the hardware channel id which is used to
+ transfer data triggered by hardware automatically, and the second value specifies
+ the analog chip address where user want to access by hardware components.
+
+ Since we have multi-subsystems will use unique ADI to access analog chip, when
+ one system is reading/writing data by ADI software channels, that should be under
+ one hardware spinlock protection to prevent other systems from reading/writing
+ data by ADI software channels at the same time, or two parallel routine of setting
+ ADI registers will make ADI controller registers chaos to lead incorrect results.
+ Then we need one hardware spinlock to synchronize between the multiple subsystems.
+
+ The new version ADI controller supplies multiple master channels for different
+ subsystem accessing, that means no need to add hardware spinlock to synchronize,
+ thus change the hardware spinlock support to be optional to keep backward
+ compatibility.
+
+allOf:
+ - $ref: /spi/spi-controller.yaml#
+
+properties:
+ compatible:
+ enum:
+ - sprd,sc9860-adi
+ - sprd,sc9863-adi
+ - sprd,ums512-adi
+
+ reg:
+ maxItems: 1
+
+ hwlocks:
+ maxItems: 1
+
+ hwlock-names:
+ const: adi
+
+ sprd,hw-channels:
+ $ref: /schemas/types.yaml#/definitions/uint32-matrix
+ description: A list of hardware channels
+ minItems: 1
+ maxItems: 48
+ items:
+ items:
+ - description: The hardware channel id which is used to transfer data
+ triggered by hardware automatically, channel id 0-1 are for software
+ use, 2-49 are hardware channels.
+ minimum: 2
+ maximum: 49
+ - description: The analog chip address where user want to access by
+ hardware components.
+
+required:
+ - compatible
+ - reg
+ - '#address-cells'
+ - '#size-cells'
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ aon {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ adi_bus: spi@40030000 {
+ compatible = "sprd,sc9860-adi";
+ reg = <0 0x40030000 0 0x10000>;
+ hwlocks = <&hwlock1 0>;
+ hwlock-names = "adi";
+ #address-cells = <1>;
+ #size-cells = <0>;
+ sprd,hw-channels = <30 0x8c20>;
+ };
+ };
+...
+++ /dev/null
-Rockchip rk timer
-
-Required properties:
-- compatible: should be:
- "rockchip,rv1108-timer", "rockchip,rk3288-timer": for Rockchip RV1108
- "rockchip,rk3036-timer", "rockchip,rk3288-timer": for Rockchip RK3036
- "rockchip,rk3066-timer", "rockchip,rk3288-timer": for Rockchip RK3066
- "rockchip,rk3188-timer", "rockchip,rk3288-timer": for Rockchip RK3188
- "rockchip,rk3228-timer", "rockchip,rk3288-timer": for Rockchip RK3228
- "rockchip,rk3229-timer", "rockchip,rk3288-timer": for Rockchip RK3229
- "rockchip,rk3288-timer": for Rockchip RK3288
- "rockchip,rk3368-timer", "rockchip,rk3288-timer": for Rockchip RK3368
- "rockchip,rk3399-timer": for Rockchip RK3399
-- reg: base address of the timer register starting with TIMERS CONTROL register
-- interrupts: should contain the interrupts for Timer0
-- clocks : must contain an entry for each entry in clock-names
-- clock-names : must include the following entries:
- "timer", "pclk"
-
-Example:
- timer: timer@ff810000 {
- compatible = "rockchip,rk3288-timer";
- reg = <0xff810000 0x20>;
- interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&xin24m>, <&cru PCLK_TIMER>;
- clock-names = "timer", "pclk";
- };
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/timer/rockchip,rk-timer.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Rockchip Timer Device Tree Bindings
+
+maintainers:
+ - Daniel Lezcano <daniel.lezcano@linaro.org>
+
+properties:
+ compatible:
+ oneOf:
+ - const: rockchip,rk3288-timer
+ - const: rockchip,rk3399-timer
+ - items:
+ - enum:
+ - rockchip,rv1108-timer
+ - rockchip,rk3036-timer
+ - rockchip,rk3066-timer
+ - rockchip,rk3188-timer
+ - rockchip,rk3228-timer
+ - rockchip,rk3229-timer
+ - rockchip,rk3288-timer
+ - rockchip,rk3368-timer
+ - rockchip,px30-timer
+ - const: rockchip,rk3288-timer
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ minItems: 2
+ maxItems: 2
+
+ clock-names:
+ items:
+ - const: pclk
+ - const: timer
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - clock-names
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/clock/rk3288-cru.h>
+
+ timer: timer@ff810000 {
+ compatible = "rockchip,rk3288-timer";
+ reg = <0xff810000 0x20>;
+ interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&cru PCLK_TIMER>, <&xin24m>;
+ clock-names = "pclk", "timer";
+ };
io-mapping
io_ordering
generic-counter
- lightnvm-pblk
memory-devices/index
men-chameleon-bus
ntb
+++ /dev/null
-pblk: Physical Block Device Target
-==================================
-
-pblk implements a fully associative, host-based FTL that exposes a traditional
-block I/O interface. Its primary responsibilities are:
-
- - Map logical addresses onto physical addresses (4KB granularity) in a
- logical-to-physical (L2P) table.
- - Maintain the integrity and consistency of the L2P table as well as its
- recovery from normal tear down and power outage.
- - Deal with controller- and media-specific constrains.
- - Handle I/O errors.
- - Implement garbage collection.
- - Maintain consistency across the I/O stack during synchronization points.
-
-For more information please refer to:
-
- http://lightnvm.io
-
-which maintains updated FAQs, manual pages, technical documentation, tools,
-contacts, etc.
locking rules:
All except set_page_dirty and freepage may block
-====================== ======================== =========
-ops PageLocked(page) i_rwsem
-====================== ======================== =========
+====================== ======================== ========= ===============
+ops PageLocked(page) i_rwsem invalidate_lock
+====================== ======================== ========= ===============
writepage: yes, unlocks (see below)
-readpage: yes, unlocks
+readpage: yes, unlocks shared
writepages:
set_page_dirty no
-readahead: yes, unlocks
-readpages: no
+readahead: yes, unlocks shared
+readpages: no shared
write_begin: locks the page exclusive
write_end: yes, unlocks exclusive
bmap:
-invalidatepage: yes
+invalidatepage: yes exclusive
releasepage: yes
freepage: yes
direct_IO:
error_remove_page: yes
swap_activate: no
swap_deactivate: no
-====================== ======================== =========
+====================== ======================== ========= ===============
->write_begin(), ->write_end() and ->readpage() may be called from
the request handler (/dev/loop).
->invalidatepage() is called when the filesystem must attempt to drop
some or all of the buffers from the page when it is being truncated. It
returns zero on success. If ->invalidatepage is zero, the kernel uses
-block_invalidatepage() instead.
+block_invalidatepage() instead. The filesystem must exclusively acquire
+invalidate_lock before invalidating page cache in truncate / hole punch path
+(and thus calling into ->invalidatepage) to block races between page cache
+invalidation and page cache filling functions (fault, read, ...).
->releasepage() is called when the kernel is about to try to drop the
buffers from the page in preparation for freeing it. It returns zero to
ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
+ int (*iopoll) (struct kiocb *kiocb, bool spin);
int (*iterate) (struct file *, struct dir_context *);
int (*iterate_shared) (struct file *, struct dir_context *);
__poll_t (*poll) (struct file *, struct poll_table_struct *);
int (*fsync) (struct file *, loff_t start, loff_t end, int datasync);
int (*fasync) (int, struct file *, int);
int (*lock) (struct file *, int, struct file_lock *);
- ssize_t (*readv) (struct file *, const struct iovec *, unsigned long,
- loff_t *);
- ssize_t (*writev) (struct file *, const struct iovec *, unsigned long,
- loff_t *);
- ssize_t (*sendfile) (struct file *, loff_t *, size_t, read_actor_t,
- void __user *);
ssize_t (*sendpage) (struct file *, struct page *, int, size_t,
loff_t *, int);
unsigned long (*get_unmapped_area)(struct file *, unsigned long,
size_t, unsigned int);
int (*setlease)(struct file *, long, struct file_lock **, void **);
long (*fallocate)(struct file *, int, loff_t, loff_t);
+ void (*show_fdinfo)(struct seq_file *m, struct file *f);
+ unsigned (*mmap_capabilities)(struct file *);
+ ssize_t (*copy_file_range)(struct file *, loff_t, struct file *,
+ loff_t, size_t, unsigned int);
+ loff_t (*remap_file_range)(struct file *file_in, loff_t pos_in,
+ struct file *file_out, loff_t pos_out,
+ loff_t len, unsigned int remap_flags);
+ int (*fadvise)(struct file *, loff_t, loff_t, int);
locking rules:
All may block.
the lease within the individual filesystem to record the result of the
operation
+->fallocate implementation must be really careful to maintain page cache
+consistency when punching holes or performing other operations that invalidate
+page cache contents. Usually the filesystem needs to call
+truncate_inode_pages_range() to invalidate relevant range of the page cache.
+However the filesystem usually also needs to update its internal (and on disk)
+view of file offset -> disk block mapping. Until this update is finished, the
+filesystem needs to block page faults and reads from reloading now-stale page
+cache contents from the disk. Since VFS acquires mapping->invalidate_lock in
+shared mode when loading pages from disk (filemap_fault(), filemap_read(),
+readahead paths), the fallocate implementation must take the invalidate_lock to
+prevent reloading.
+
+->copy_file_range and ->remap_file_range implementations need to serialize
+against modifications of file data while the operation is running. For
+blocking changes through write(2) and similar operations inode->i_rwsem can be
+used. To block changes to file contents via a memory mapping during the
+operation, the filesystem must take mapping->invalidate_lock to coordinate
+with ->page_mkwrite.
+
dquot_operations
================
access: yes
============= ========= ===========================
-->fault() is called when a previously not present pte is about
-to be faulted in. The filesystem must find and return the page associated
-with the passed in "pgoff" in the vm_fault structure. If it is possible that
-the page may be truncated and/or invalidated, then the filesystem must lock
-the page, then ensure it is not already truncated (the page lock will block
+->fault() is called when a previously not present pte is about to be faulted
+in. The filesystem must find and return the page associated with the passed in
+"pgoff" in the vm_fault structure. If it is possible that the page may be
+truncated and/or invalidated, then the filesystem must lock invalidate_lock,
+then ensure the page is not already truncated (invalidate_lock will block
subsequent truncate), and then return with VM_FAULT_LOCKED, and the page
locked. The VM will unlock the page.
"pte" field in vm_fault structure. Pointers to entries for other offsets
should be calculated relative to "pte".
-->page_mkwrite() is called when a previously read-only pte is
-about to become writeable. The filesystem again must ensure that there are
-no truncate/invalidate races, and then return with the page locked. If
-the page has been truncated, the filesystem should not look up a new page
-like the ->fault() handler, but simply return with VM_FAULT_NOPAGE, which
-will cause the VM to retry the fault.
+->page_mkwrite() is called when a previously read-only pte is about to become
+writeable. The filesystem again must ensure that there are no
+truncate/invalidate races or races with operations such as ->remap_file_range
+or ->copy_file_range, and then return with the page locked. Usually
+mapping->invalidate_lock is suitable for proper serialization. If the page has
+been truncated, the filesystem should not look up a new page like the ->fault()
+handler, but simply return with VM_FAULT_NOPAGE, which will cause the VM to
+retry the fault.
->pfn_mkwrite() is the same as page_mkwrite but when the pte is
VM_PFNMAP or VM_MIXEDMAP with a page-less entry. Expected return is
+++ /dev/null
-.. SPDX-License-Identifier: GPL-2.0
-
-=====================================================
-Mandatory File Locking For The Linux Operating System
-=====================================================
-
- Andy Walker <andy@lysaker.kvaerner.no>
-
- 15 April 1996
-
- (Updated September 2007)
-
-0. Why you should avoid mandatory locking
------------------------------------------
-
-The Linux implementation is prey to a number of difficult-to-fix race
-conditions which in practice make it not dependable:
-
- - The write system call checks for a mandatory lock only once
- at its start. It is therefore possible for a lock request to
- be granted after this check but before the data is modified.
- A process may then see file data change even while a mandatory
- lock was held.
- - Similarly, an exclusive lock may be granted on a file after
- the kernel has decided to proceed with a read, but before the
- read has actually completed, and the reading process may see
- the file data in a state which should not have been visible
- to it.
- - Similar races make the claimed mutual exclusion between lock
- and mmap similarly unreliable.
-
-1. What is mandatory locking?
-------------------------------
-
-Mandatory locking is kernel enforced file locking, as opposed to the more usual
-cooperative file locking used to guarantee sequential access to files among
-processes. File locks are applied using the flock() and fcntl() system calls
-(and the lockf() library routine which is a wrapper around fcntl().) It is
-normally a process' responsibility to check for locks on a file it wishes to
-update, before applying its own lock, updating the file and unlocking it again.
-The most commonly used example of this (and in the case of sendmail, the most
-troublesome) is access to a user's mailbox. The mail user agent and the mail
-transfer agent must guard against updating the mailbox at the same time, and
-prevent reading the mailbox while it is being updated.
-
-In a perfect world all processes would use and honour a cooperative, or
-"advisory" locking scheme. However, the world isn't perfect, and there's
-a lot of poorly written code out there.
-
-In trying to address this problem, the designers of System V UNIX came up
-with a "mandatory" locking scheme, whereby the operating system kernel would
-block attempts by a process to write to a file that another process holds a
-"read" -or- "shared" lock on, and block attempts to both read and write to a
-file that a process holds a "write " -or- "exclusive" lock on.
-
-The System V mandatory locking scheme was intended to have as little impact as
-possible on existing user code. The scheme is based on marking individual files
-as candidates for mandatory locking, and using the existing fcntl()/lockf()
-interface for applying locks just as if they were normal, advisory locks.
-
-.. Note::
-
- 1. In saying "file" in the paragraphs above I am actually not telling
- the whole truth. System V locking is based on fcntl(). The granularity of
- fcntl() is such that it allows the locking of byte ranges in files, in
- addition to entire files, so the mandatory locking rules also have byte
- level granularity.
-
- 2. POSIX.1 does not specify any scheme for mandatory locking, despite
- borrowing the fcntl() locking scheme from System V. The mandatory locking
- scheme is defined by the System V Interface Definition (SVID) Version 3.
-
-2. Marking a file for mandatory locking
----------------------------------------
-
-A file is marked as a candidate for mandatory locking by setting the group-id
-bit in its file mode but removing the group-execute bit. This is an otherwise
-meaningless combination, and was chosen by the System V implementors so as not
-to break existing user programs.
-
-Note that the group-id bit is usually automatically cleared by the kernel when
-a setgid file is written to. This is a security measure. The kernel has been
-modified to recognize the special case of a mandatory lock candidate and to
-refrain from clearing this bit. Similarly the kernel has been modified not
-to run mandatory lock candidates with setgid privileges.
-
-3. Available implementations
-----------------------------
-
-I have considered the implementations of mandatory locking available with
-SunOS 4.1.x, Solaris 2.x and HP-UX 9.x.
-
-Generally I have tried to make the most sense out of the behaviour exhibited
-by these three reference systems. There are many anomalies.
-
-All the reference systems reject all calls to open() for a file on which
-another process has outstanding mandatory locks. This is in direct
-contravention of SVID 3, which states that only calls to open() with the
-O_TRUNC flag set should be rejected. The Linux implementation follows the SVID
-definition, which is the "Right Thing", since only calls with O_TRUNC can
-modify the contents of the file.
-
-HP-UX even disallows open() with O_TRUNC for a file with advisory locks, not
-just mandatory locks. That would appear to contravene POSIX.1.
-
-mmap() is another interesting case. All the operating systems mentioned
-prevent mandatory locks from being applied to an mmap()'ed file, but HP-UX
-also disallows advisory locks for such a file. SVID actually specifies the
-paranoid HP-UX behaviour.
-
-In my opinion only MAP_SHARED mappings should be immune from locking, and then
-only from mandatory locks - that is what is currently implemented.
-
-SunOS is so hopeless that it doesn't even honour the O_NONBLOCK flag for
-mandatory locks, so reads and writes to locked files always block when they
-should return EAGAIN.
-
-I'm afraid that this is such an esoteric area that the semantics described
-below are just as valid as any others, so long as the main points seem to
-agree.
-
-4. Semantics
-------------
-
-1. Mandatory locks can only be applied via the fcntl()/lockf() locking
- interface - in other words the System V/POSIX interface. BSD style
- locks using flock() never result in a mandatory lock.
-
-2. If a process has locked a region of a file with a mandatory read lock, then
- other processes are permitted to read from that region. If any of these
- processes attempts to write to the region it will block until the lock is
- released, unless the process has opened the file with the O_NONBLOCK
- flag in which case the system call will return immediately with the error
- status EAGAIN.
-
-3. If a process has locked a region of a file with a mandatory write lock, all
- attempts to read or write to that region block until the lock is released,
- unless a process has opened the file with the O_NONBLOCK flag in which case
- the system call will return immediately with the error status EAGAIN.
-
-4. Calls to open() with O_TRUNC, or to creat(), on a existing file that has
- any mandatory locks owned by other processes will be rejected with the
- error status EAGAIN.
-
-5. Attempts to apply a mandatory lock to a file that is memory mapped and
- shared (via mmap() with MAP_SHARED) will be rejected with the error status
- EAGAIN.
-
-6. Attempts to create a shared memory map of a file (via mmap() with MAP_SHARED)
- that has any mandatory locks in effect will be rejected with the error status
- EAGAIN.
-
-5. Which system calls are affected?
------------------------------------
-
-Those which modify a file's contents, not just the inode. That gives read(),
-write(), readv(), writev(), open(), creat(), mmap(), truncate() and
-ftruncate(). truncate() and ftruncate() are considered to be "write" actions
-for the purposes of mandatory locking.
-
-The affected region is usually defined as stretching from the current position
-for the total number of bytes read or written. For the truncate calls it is
-defined as the bytes of a file removed or added (we must also consider bytes
-added, as a lock can specify just "the whole file", rather than a specific
-range of bytes.)
-
-Note 3: I may have overlooked some system calls that need mandatory lock
-checking in my eagerness to get this code out the door. Please let me know, or
-better still fix the system calls yourself and submit a patch to me or Linus.
-
-6. Warning!
------------
-
-Not even root can override a mandatory lock, so runaway processes can wreak
-havoc if they lock crucial files. The way around it is to change the file
-permissions (remove the setgid bit) before trying to read or write to it.
-Of course, that might be a bit tricky if the system is hung :-(
-
-7. The "mand" mount option
---------------------------
-Mandatory locking is disabled on all filesystems by default, and must be
-administratively enabled by mounting with "-o mand". That mount option
-is only allowed if the mounting task has the CAP_SYS_ADMIN capability.
-
-Since kernel v4.5, it is possible to disable mandatory locking
-altogether by setting CONFIG_MANDATORY_FILE_LOCKING to "n". A kernel
-with this disabled will reject attempts to mount filesystems with the
-"mand" mount option with the error status EPERM.
put_prev_task_idle
kmem_cache_create
pick_next_task_rt
- get_online_cpus
+ cpus_read_lock
pick_next_task_fair
mutex_lock
[...]
'K' all linux/kd.h
'L' 00-1F linux/loop.h conflict!
'L' 10-1F drivers/scsi/mpt3sas/mpt3sas_ctl.h conflict!
-'L' 20-2F linux/lightnvm.h
'L' E0-FF linux/ppdd.h encrypted disk device driver
<http://linux01.gwdg.de/~alatham/ppdd.html>
'M' all linux/soundcard.h conflict!
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_ENABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0);
* prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_FORCE_DISABLE, 0, 0);
+
+- PR_SPEC_L1D_FLUSH: Flush L1D Cache on context switch out of the task
+ (works only when tasks run on non SMT cores)
+
+ Invocations:
+ * prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, 0, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, PR_SPEC_ENABLE, 0, 0);
+ * prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_L1D_FLUSH, PR_SPEC_DISABLE, 0, 0);
Rebooting
=========
- reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] [, [w]arm | [c]old]
+ reboot=b[ios] | t[riple] | k[bd] | a[cpi] | e[fi] | p[ci] [, [w]arm | [c]old]
bios
Use the CPU reboot vector for warm reset
warm
Use efi reset_system runtime service. If EFI is not configured or
the EFI reset does not work, the reboot path attempts the reset using
the keyboard controller.
+ pci
+ Use a write to the PCI config space register 0xcf9 to trigger reboot.
Using warm reset will be much faster especially on big memory
systems because the BIOS will not go through the memory check.
Don't stop other CPUs on reboot. This can make reboot more reliable
in some cases.
+ reboot=default
+ There are some built-in platform specific "quirks" - you may see:
+ "reboot: <name> series board detected. Selecting <type> for reboots."
+ In the case where you think the quirk is in error (e.g. you have
+ newer BIOS, or newer board) using this option will ignore the built-in
+ quirk table, and use the generic default reboot actions.
+
Non Executable Mappings
=======================
F: include/uapi/linux/mii.h
EXFAT FILE SYSTEM
-M: Namjae Jeon <namjae.jeon@samsung.com>
+M: Namjae Jeon <linkinjeon@kernel.org>
M: Sungjong Seo <sj1557.seo@samsung.com>
L: linux-fsdevel@vger.kernel.org
S: Maintained
F: scripts/spdxcheck-test.sh
F: scripts/spdxcheck.py
-LIGHTNVM PLATFORM SUPPORT
-M: Matias Bjorling <mb@lightnvm.io>
-L: linux-block@vger.kernel.org
-S: Maintained
-W: http://github/OpenChannelSSD
-F: drivers/lightnvm/
-F: include/linux/lightnvm.h
-F: include/uapi/linux/lightnvm.h
-
LINEAR RANGES HELPERS
M: Mark Brown <broonie@kernel.org>
R: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
VERSION = 5
PATCHLEVEL = 14
SUBLEVEL = 0
-EXTRAVERSION = -rc7
+EXTRAVERSION =
NAME = Opossums on Parade
# *DOCUMENTATION*
config ARCH_HAS_ELFCORE_COMPAT
bool
+config ARCH_HAS_PARANOID_L1D_FLUSH
+ bool
+
source "kernel/gcov/Kconfig"
source "scripts/gcc-plugins/Kconfig"
irq_hw_number_t idu_hwirq = core_hwirq - FIRST_EXT_IRQ;
chained_irq_enter(core_chip, desc);
- generic_handle_irq(irq_find_mapping(idu_domain, idu_hwirq));
+ generic_handle_domain_irq(idu_domain, idu_hwirq);
chained_irq_exit(core_chip, desc);
}
return irq_create_mapping(sachip->irqdomain, hwirq);
}
-static void sa1111_handle_irqdomain(struct irq_domain *irqdomain, int irq)
-{
- struct irq_desc *d = irq_to_desc(irq_linear_revmap(irqdomain, irq));
-
- if (d)
- generic_handle_irq_desc(d);
-}
-
/*
* SA1111 interrupt support. Since clearing an IRQ while there are
* active IRQs causes the interrupt output to pulse, the upper levels
for (i = 0; stat0; i++, stat0 >>= 1)
if (stat0 & 1)
- sa1111_handle_irqdomain(irqdomain, i);
+ generic_handle_domain_irq(irqdomain, i);
for (i = 32; stat1; i++, stat1 >>= 1)
if (stat1 & 1)
- sa1111_handle_irqdomain(irqdomain, i);
+ generic_handle_domain_irq(irqdomain, i);
/* For level-based interrupts */
desc->irq_data.chip->irq_unmask(&desc->irq_data);
.max_size = curve25519_max_size,
};
-static int __init mod_init(void)
+static int __init arm_curve25519_init(void)
{
if (elf_hwcap & HWCAP_NEON) {
static_branch_enable(&have_neon);
return 0;
}
-static void __exit mod_exit(void)
+static void __exit arm_curve25519_exit(void)
{
if (IS_REACHABLE(CONFIG_CRYPTO_KPP) && elf_hwcap & HWCAP_NEON)
crypto_unregister_kpp(&curve25519_alg);
}
-module_init(mod_init);
-module_exit(mod_exit);
+module_init(arm_curve25519_init);
+module_exit(arm_curve25519_exit);
MODULE_ALIAS_CRYPTO("curve25519");
MODULE_ALIAS_CRYPTO("curve25519-neon");
/*
* Physical start and end address of the kernel sections. These addresses are
- * 2MB-aligned to match the section mappings placed over the kernel.
+ * 2MB-aligned to match the section mappings placed over the kernel. We use
+ * u64 so that LPAE mappings beyond the 32bit limit will work out as well.
*/
-extern u32 kernel_sec_start;
-extern u32 kernel_sec_end;
+extern u64 kernel_sec_start;
+extern u64 kernel_sec_end;
/*
* Physical vs virtual RAM address space conversion. These are
/*
* This needs to be assigned at runtime when the linker symbols are
- * resolved.
+ * resolved. These are unsigned 64bit really, but in this assembly code
+ * We store them as 32bit.
*/
.pushsection .data
.align 2
.globl kernel_sec_end
kernel_sec_start:
.long 0
+ .long 0
kernel_sec_end:
+ .long 0
.long 0
.popsection
add r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER)
ldr r6, =(_end - 1)
adr_l r5, kernel_sec_start @ _pa(kernel_sec_start)
- str r8, [r5] @ Save physical start of kernel
+#ifdef CONFIG_CPU_ENDIAN_BE8
+ str r8, [r5, #4] @ Save physical start of kernel (BE)
+#else
+ str r8, [r5] @ Save physical start of kernel (LE)
+#endif
orr r3, r8, r7 @ Add the MMU flags
add r6, r4, r6, lsr #(SECTION_SHIFT - PMD_ORDER)
1: str r3, [r0], #1 << PMD_ORDER
bls 1b
eor r3, r3, r7 @ Remove the MMU flags
adr_l r5, kernel_sec_end @ _pa(kernel_sec_end)
- str r3, [r5] @ Save physical end of kernel
+#ifdef CONFIG_CPU_ENDIAN_BE8
+ str r3, [r5, #4] @ Save physical end of kernel (BE)
+#else
+ str r3, [r5] @ Save physical end of kernel (LE)
+#endif
#ifdef CONFIG_XIP_KERNEL
/*
do {
pending = readl(fpga->base + FPGA_IRQ_SET_CLR) & fpga->irq_mask;
- for_each_set_bit(bit, &pending, CPLDS_NB_IRQ) {
- generic_handle_irq(irq_find_mapping(fpga->irqdomain,
- bit));
- }
+ for_each_set_bit(bit, &pending, CPLDS_NB_IRQ)
+ generic_handle_domain_irq(fpga->irqdomain, bit);
} while (pending);
return IRQ_HANDLED;
struct s3c_irq_data *irq_data = irq_desc_get_chip_data(desc);
struct s3c_irq_intc *intc = irq_data->intc;
struct s3c_irq_intc *sub_intc = irq_data->sub_intc;
- unsigned int n, offset, irq;
+ unsigned int n, offset;
unsigned long src, msk;
/* we're using individual domains for the non-dt case
while (src) {
n = __ffs(src);
src &= ~(1 << n);
- irq = irq_find_mapping(sub_intc->domain, offset + n);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(sub_intc->domain, offset + n);
}
chained_irq_exit(chip, desc);
if (offset == 0)
return;
+ /*
+ * Offset the kernel section physical offsets so that the kernel
+ * mapping will work out later on.
+ */
+ kernel_sec_start += offset;
+ kernel_sec_end += offset;
+
/*
* Get the address of the remap function in the 1:1 identity
* mapping setup by the early page table assembly code. We
{
void *zero_page;
- pr_debug("physical kernel sections: 0x%08x-0x%08x\n",
+ pr_debug("physical kernel sections: 0x%08llx-0x%08llx\n",
kernel_sec_start, kernel_sec_end);
prepare_page_table();
ldr r6, =(_end - 1)
add r7, r2, #0x1000
add r6, r7, r6, lsr #SECTION_SHIFT - L2_ORDER
- add r7, r7, #PAGE_OFFSET >> (SECTION_SHIFT - L2_ORDER)
+ add r7, r7, #KERNEL_OFFSET >> (SECTION_SHIFT - L2_ORDER)
1: ldrd r4, r5, [r7]
adds r4, r4, r0
adc r5, r5, r1
select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
+ select HAVE_ARCH_PFN_VALID
select HAVE_ARCH_PREL32_RELOCATIONS
select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER
tristate "SM4 symmetric cipher (ARMv8.2 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_ALGAPI
- select CRYPTO_SM4
+ select CRYPTO_LIB_SM4
config CRYPTO_GHASH_ARM64_CE
tristate "GHASH/AES-GCM using ARMv8 Crypto Extensions"
asmlinkage void sm4_ce_do_crypt(const u32 *rk, void *out, const void *in);
+static int sm4_ce_setkey(struct crypto_tfm *tfm, const u8 *key,
+ unsigned int key_len)
+{
+ struct sm4_ctx *ctx = crypto_tfm_ctx(tfm);
+
+ return sm4_expandkey(ctx, key, key_len);
+}
+
static void sm4_ce_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
- const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm);
+ const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm);
if (!crypto_simd_usable()) {
- crypto_sm4_encrypt(tfm, out, in);
+ sm4_crypt_block(ctx->rkey_enc, out, in);
} else {
kernel_neon_begin();
sm4_ce_do_crypt(ctx->rkey_enc, out, in);
static void sm4_ce_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
- const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm);
+ const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm);
if (!crypto_simd_usable()) {
- crypto_sm4_decrypt(tfm, out, in);
+ sm4_crypt_block(ctx->rkey_dec, out, in);
} else {
kernel_neon_begin();
sm4_ce_do_crypt(ctx->rkey_dec, out, in);
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = SM4_BLOCK_SIZE,
- .cra_ctxsize = sizeof(struct crypto_sm4_ctx),
+ .cra_ctxsize = sizeof(struct sm4_ctx),
.cra_module = THIS_MODULE,
.cra_u.cipher = {
.cia_min_keysize = SM4_KEY_SIZE,
.cia_max_keysize = SM4_KEY_SIZE,
- .cia_setkey = crypto_sm4_set_key,
+ .cia_setkey = sm4_ce_setkey,
.cia_encrypt = sm4_ce_encrypt,
.cia_decrypt = sm4_ce_decrypt
}
typedef struct page *pgtable_t;
+int pfn_valid(unsigned long pfn);
int pfn_is_map_memory(unsigned long pfn);
#include <asm/memory.h>
free_area_init(max_zone_pfns);
}
+int pfn_valid(unsigned long pfn)
+{
+ phys_addr_t addr = PFN_PHYS(pfn);
+ struct mem_section *ms;
+
+ /*
+ * Ensure the upper PAGE_SHIFT bits are clear in the
+ * pfn. Else it might lead to false positives when
+ * some of the upper bits are set, but the lower bits
+ * match a valid pfn.
+ */
+ if (PHYS_PFN(addr) != pfn)
+ return 0;
+
+ if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+ return 0;
+
+ ms = __pfn_to_section(pfn);
+ if (!valid_section(ms))
+ return 0;
+
+ /*
+ * ZONE_DEVICE memory does not have the memblock entries.
+ * memblock_is_map_memory() check for ZONE_DEVICE based
+ * addresses will always fail. Even the normal hotplugged
+ * memory will never have MEMBLOCK_NOMAP flag set in their
+ * memblock entries. Skip memblock search for all non early
+ * memory sections covering all of hotplug memory including
+ * both normal and ZONE_DEVICE based.
+ */
+ if (!early_section(ms))
+ return pfn_section_valid(ms, pfn);
+
+ return memblock_is_memory(addr);
+}
+EXPORT_SYMBOL(pfn_valid);
+
int pfn_is_map_memory(unsigned long pfn)
{
phys_addr_t addr = PFN_PHYS(pfn);
bool "Coldfire CPU family support"
select ARCH_HAVE_CUSTOM_GPIO_H
select CPU_HAS_NO_BITFIELDS
+ select CPU_HAS_NO_CAS
select CPU_HAS_NO_MULDIV64
select GENERIC_CSUM
select GPIOLIB
bool
depends on !MMU
select CPU_HAS_NO_BITFIELDS
+ select CPU_HAS_NO_CAS
select CPU_HAS_NO_MULDIV64
select CPU_HAS_NO_UNALIGNED
select GENERIC_CSUM
config MCPU32
bool
select CPU_HAS_NO_BITFIELDS
+ select CPU_HAS_NO_CAS
select CPU_HAS_NO_UNALIGNED
select CPU_NO_EFFICIENT_FFS
help
config RMW_INSNS
bool "Use read-modify-write instructions"
- depends on ADVANCED
+ depends on ADVANCED && !CPU_HAS_NO_CAS
help
This allows to use certain instructions that work with indivisible
read-modify-write bus cycles. While this is faster than the
config CPU_HAS_NO_BITFIELDS
bool
+config CPU_HAS_NO_CAS
+ bool
+
config CPU_HAS_NO_MULDIV64
bool
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_DUMMY_IRQ=m
CONFIG_RAID_ATTRS=m
-CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_CRC32_SELFTEST=m
CONFIG_CRC64=m
CONFIG_XZ_DEC_TEST=m
+CONFIG_GLOB_SELFTEST=m
CONFIG_STRING_SELFTEST=m
# CONFIG_SECTION_MISMATCH_WARN_ONLY is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_DUMMY_IRQ=m
CONFIG_RAID_ATTRS=m
-CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_CRC32_SELFTEST=m
CONFIG_CRC64=m
CONFIG_XZ_DEC_TEST=m
+CONFIG_GLOB_SELFTEST=m
CONFIG_STRING_SELFTEST=m
# CONFIG_SECTION_MISMATCH_WARN_ONLY is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_DUMMY_IRQ=m
CONFIG_RAID_ATTRS=m
-CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_CRC32_SELFTEST=m
CONFIG_CRC64=m
CONFIG_XZ_DEC_TEST=m
+CONFIG_GLOB_SELFTEST=m
CONFIG_STRING_SELFTEST=m
# CONFIG_SECTION_MISMATCH_WARN_ONLY is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_DUMMY_IRQ=m
CONFIG_RAID_ATTRS=m
-CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_CRC32_SELFTEST=m
CONFIG_CRC64=m
CONFIG_XZ_DEC_TEST=m
+CONFIG_GLOB_SELFTEST=m
CONFIG_STRING_SELFTEST=m
# CONFIG_SECTION_MISMATCH_WARN_ONLY is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_ATA_OVER_ETH=m
CONFIG_DUMMY_IRQ=m
CONFIG_RAID_ATTRS=m
-CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_ST=m
CONFIG_BLK_DEV_SR=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_CRC32_SELFTEST=m
CONFIG_CRC64=m
CONFIG_XZ_DEC_TEST=m
+CONFIG_GLOB_SELFTEST=m
CONFIG_STRING_SELFTEST=m
# CONFIG_SECTION_MISMATCH_WARN_ONLY is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_VECTORBASE=0x40000000
CONFIG_KERNELBASE=0x40001000
# CONFIG_BLK_DEV_BSG is not set
-CONFIG_BLK_CMDLINE_PARSER=y
CONFIG_BINFMT_FLAT=y
CONFIG_BINFMT_ZFLAT=y
CONFIG_BINFMT_MISC=y
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_WW_MUTEX_SELFTEST=m
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
CONFIG_IPV6_VTI=m
CONFIG_IPV6_GRE=m
CONFIG_NETFILTER=y
+CONFIG_NETFILTER_NETLINK_HOOK=m
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_ZONES=y
# CONFIG_NF_CONNTRACK_PROCFS is not set
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
CONFIG_CIFS=m
+# CONFIG_CIFS_STATS2 is not set
# CONFIG_CIFS_DEBUG is not set
CONFIG_CODA_FS=m
CONFIG_NLS_CODEPAGE_437=y
CONFIG_EARLY_PRINTK=y
CONFIG_KUNIT=m
CONFIG_KUNIT_ALL_TESTS=m
-CONFIG_TEST_LIST_SORT=m
CONFIG_TEST_MIN_HEAP=m
CONFIG_TEST_SORT=m
CONFIG_TEST_DIV64=m
CONFIG_TEST_STRSCPY=m
CONFIG_TEST_KSTRTOX=m
CONFIG_TEST_PRINTF=m
+CONFIG_TEST_SCANF=m
CONFIG_TEST_BITMAP=m
CONFIG_TEST_UUID=m
CONFIG_TEST_XARRAY=m
for (i = 0; i < MAX_UNIT; i++) {
if (nfeth_dev[i]) {
- unregister_netdev(nfeth_dev[0]);
- free_netdev(nfeth_dev[0]);
+ unregister_netdev(nfeth_dev[i]);
+ free_netdev(nfeth_dev[i]);
}
}
free_irq(nfEtherIRQ, nfeth_interrupt);
" casl %2,%1,%0\n" \
" jne 1b" \
: "+m" (*v), "=&d" (t), "=&d" (tmp) \
- : "g" (i), "2" (arch_atomic_read(v))); \
+ : "di" (i), "2" (arch_atomic_read(v))); \
return t; \
}
" casl %2,%1,%0\n" \
" jne 1b" \
: "+m" (*v), "=&d" (t), "=&d" (tmp) \
- : "g" (i), "2" (arch_atomic_read(v))); \
+ : "di" (i), "2" (arch_atomic_read(v))); \
return tmp; \
}
{
u32 pending = ar2315_rst_reg_read(AR2315_ISR) &
ar2315_rst_reg_read(AR2315_IMR);
- unsigned nr, misc_irq = 0;
+ unsigned nr;
+ int ret = 0;
if (pending) {
struct irq_domain *domain = irq_desc_get_handler_data(desc);
nr = __ffs(pending);
- misc_irq = irq_find_mapping(domain, nr);
- }
- if (misc_irq) {
if (nr == AR2315_MISC_IRQ_GPIO)
ar2315_rst_reg_write(AR2315_ISR, AR2315_ISR_GPIO);
else if (nr == AR2315_MISC_IRQ_WATCHDOG)
ar2315_rst_reg_write(AR2315_ISR, AR2315_ISR_WD);
- generic_handle_irq(misc_irq);
- } else {
- spurious_interrupt();
+
+ ret = generic_handle_domain_irq(domain, nr);
}
+
+ if (!pending || ret)
+ spurious_interrupt();
}
static void ar2315_misc_irq_unmask(struct irq_data *d)
{
u32 pending = ar5312_rst_reg_read(AR5312_ISR) &
ar5312_rst_reg_read(AR5312_IMR);
- unsigned nr, misc_irq = 0;
+ unsigned nr;
+ int ret = 0;
if (pending) {
struct irq_domain *domain = irq_desc_get_handler_data(desc);
nr = __ffs(pending);
- misc_irq = irq_find_mapping(domain, nr);
- }
- if (misc_irq) {
- generic_handle_irq(misc_irq);
+ ret = generic_handle_domain_irq(domain, nr);
if (nr == AR5312_MISC_IRQ_TIMER)
ar5312_rst_reg_read(AR5312_TIMER);
- } else {
- spurious_interrupt();
}
+
+ if (!pending || ret)
+ spurious_interrupt();
}
/* Enable the specified AR5312_MISC_IRQ interrupt */
#ifndef __ASM_RC32434_RB_H
#define __ASM_RC32434_RB_H
-#include <linux/genhd.h>
-
#define REGBASE 0x18000000
#define IDT434_REG_BASE ((volatile void *) KSEG1ADDR(REGBASE))
#define UART0BASE 0x58000
*/
irq = __fls(irq);
hwirq = irq + MIPS_CPU_IRQ_CASCADE + (INT_NUM_IM_OFFSET * module);
- generic_handle_irq(irq_linear_revmap(ltq_domain, hwirq));
+ generic_handle_domain_irq(ltq_domain, hwirq);
/* if this is a EBU irq, we need to ack it or get a deadlock */
if (irq == LTQ_ICU_EBU_IRQ && !module && LTQ_EBU_PCC_ISTAT != 0)
struct ar2315_pci_ctrl *apc = irq_desc_get_handler_data(desc);
u32 pending = ar2315_pci_reg_read(apc, AR2315_PCI_ISR) &
ar2315_pci_reg_read(apc, AR2315_PCI_IMR);
- unsigned pci_irq = 0;
+ int ret = 0;
if (pending)
- pci_irq = irq_find_mapping(apc->domain, __ffs(pending));
+ ret = generic_handle_domain_irq(apc->domain, __ffs(pending));
- if (pci_irq)
- generic_handle_irq(pci_irq);
- else
+ if (!pending || ret)
spurious_interrupt();
}
}
while (pending) {
- unsigned irq, bit = __ffs(pending);
+ unsigned bit = __ffs(pending);
- irq = irq_find_mapping(rpc->irq_domain, bit);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(rpc->irq_domain, bit);
pending &= ~BIT(bit);
}
if (pending) {
struct irq_domain *domain = irq_desc_get_handler_data(desc);
- generic_handle_irq(irq_find_mapping(domain, __ffs(pending)));
+ generic_handle_domain_irq(domain, __ffs(pending));
} else {
spurious_interrupt();
}
unsigned long *mask = per_cpu(irq_enable_mask, cpu);
struct irq_domain *domain;
u64 pend0;
- int irq;
+ int ret;
/* copied from Irix intpend0() */
pend0 = LOCAL_HUB_L(PI_INT_PEND0);
#endif
{
domain = irq_desc_get_handler_data(desc);
- irq = irq_linear_revmap(domain, __ffs(pend0));
- if (irq)
- generic_handle_irq(irq);
- else
+ ret = generic_handle_domain_irq(domain, __ffs(pend0));
+ if (ret)
spurious_interrupt();
}
unsigned long *mask = per_cpu(irq_enable_mask, cpu);
struct irq_domain *domain;
u64 pend1;
- int irq;
+ int ret;
/* copied from Irix intpend0() */
pend1 = LOCAL_HUB_L(PI_INT_PEND1);
return;
domain = irq_desc_get_handler_data(desc);
- irq = irq_linear_revmap(domain, __ffs(pend1) + 64);
- if (irq)
- generic_handle_irq(irq);
- else
+ ret = generic_handle_domain_irq(domain, __ffs(pend1) + 64);
+ if (ret)
spurious_interrupt();
LOCAL_HUB_L(PI_INT_PEND1);
int cpu = smp_processor_id();
struct irq_domain *domain;
u64 pend, mask;
- int irq;
+ int ret;
pend = heart_read(&heart_regs->isr);
mask = (heart_read(&heart_regs->imr[cpu]) &
#endif
{
domain = irq_desc_get_handler_data(desc);
- irq = irq_linear_revmap(domain, __ffs(pend));
- if (irq)
- generic_handle_irq(irq);
- else
+ ret = generic_handle_domain_irq(domain, __ffs(pend));
+ if (ret)
spurious_interrupt();
}
}
asmlinkage void do_IRQ(int hwirq, struct pt_regs *regs)
{
struct pt_regs *oldregs = set_irq_regs(regs);
- int irq;
irq_enter();
- irq = irq_find_mapping(NULL, hwirq);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(NULL, hwirq);
irq_exit();
set_irq_regs(oldregs);
#define __HAVE_ARCH_MEMCPY
void * memcpy(void * dest,const void *src,size_t count);
-#define __HAVE_ARCH_STRLEN
-extern size_t strlen(const char *s);
-
-#define __HAVE_ARCH_STRCPY
-extern char *strcpy(char *dest, const char *src);
-
-#define __HAVE_ARCH_STRNCPY
-extern char *strncpy(char *dest, const char *src, size_t count);
-
-#define __HAVE_ARCH_STRCAT
-extern char *strcat(char *dest, const char *src);
-
-#define __HAVE_ARCH_MEMSET
-extern void *memset(void *, int, size_t);
-
#endif
#include <linux/string.h>
EXPORT_SYMBOL(memset);
-EXPORT_SYMBOL(strlen);
-EXPORT_SYMBOL(strcpy);
-EXPORT_SYMBOL(strncpy);
-EXPORT_SYMBOL(strcat);
#include <linux/atomic.h>
EXPORT_SYMBOL(__xchg8);
# Makefile for parisc-specific library files
#
-lib-y := lusercopy.o bitops.o checksum.o io.o memcpy.o \
- ucmpdi2.o delay.o string.o
+lib-y := lusercopy.o bitops.o checksum.o io.o memset.o memcpy.o \
+ ucmpdi2.o delay.o
obj-y := iomap.o
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#include <linux/types.h>
+#include <asm/string.h>
+
+#define OPSIZ (BITS_PER_LONG/8)
+typedef unsigned long op_t;
+
+void *
+memset (void *dstpp, int sc, size_t len)
+{
+ unsigned int c = sc;
+ long int dstp = (long int) dstpp;
+
+ if (len >= 8)
+ {
+ size_t xlen;
+ op_t cccc;
+
+ cccc = (unsigned char) c;
+ cccc |= cccc << 8;
+ cccc |= cccc << 16;
+ if (OPSIZ > 4)
+ /* Do the shift in two steps to avoid warning if long has 32 bits. */
+ cccc |= (cccc << 16) << 16;
+
+ /* There are at least some bytes to set.
+ No need to test for LEN == 0 in this alignment loop. */
+ while (dstp % OPSIZ != 0)
+ {
+ ((unsigned char *) dstp)[0] = c;
+ dstp += 1;
+ len -= 1;
+ }
+
+ /* Write 8 `op_t' per iteration until less than 8 `op_t' remain. */
+ xlen = len / (OPSIZ * 8);
+ while (xlen > 0)
+ {
+ ((op_t *) dstp)[0] = cccc;
+ ((op_t *) dstp)[1] = cccc;
+ ((op_t *) dstp)[2] = cccc;
+ ((op_t *) dstp)[3] = cccc;
+ ((op_t *) dstp)[4] = cccc;
+ ((op_t *) dstp)[5] = cccc;
+ ((op_t *) dstp)[6] = cccc;
+ ((op_t *) dstp)[7] = cccc;
+ dstp += 8 * OPSIZ;
+ xlen -= 1;
+ }
+ len %= OPSIZ * 8;
+
+ /* Write 1 `op_t' per iteration until less than OPSIZ bytes remain. */
+ xlen = len / OPSIZ;
+ while (xlen > 0)
+ {
+ ((op_t *) dstp)[0] = cccc;
+ dstp += OPSIZ;
+ xlen -= 1;
+ }
+ len %= OPSIZ;
+ }
+
+ /* Write the last few bytes. */
+ while (len > 0)
+ {
+ ((unsigned char *) dstp)[0] = c;
+ dstp += 1;
+ len -= 1;
+ }
+
+ return dstpp;
+}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * PA-RISC assembly string functions
- *
- * Copyright (C) 2019 Helge Deller <deller@gmx.de>
- */
-
-#include <asm/assembly.h>
-#include <linux/linkage.h>
-
- .section .text.hot
- .level PA_ASM_LEVEL
-
- t0 = r20
- t1 = r21
- t2 = r22
-
-ENTRY_CFI(strlen, frame=0,no_calls)
- or,COND(<>) arg0,r0,ret0
- b,l,n .Lstrlen_null_ptr,r0
- depwi 0,31,2,ret0
- cmpb,COND(<>) arg0,ret0,.Lstrlen_not_aligned
- ldw,ma 4(ret0),t0
- cmpib,tr 0,r0,.Lstrlen_loop
- uxor,nbz r0,t0,r0
-.Lstrlen_not_aligned:
- uaddcm arg0,ret0,t1
- shladd t1,3,r0,t1
- mtsar t1
- depwi -1,%sar,32,t0
- uxor,nbz r0,t0,r0
-.Lstrlen_loop:
- b,l,n .Lstrlen_end_loop,r0
- ldw,ma 4(ret0),t0
- cmpib,tr 0,r0,.Lstrlen_loop
- uxor,nbz r0,t0,r0
-.Lstrlen_end_loop:
- extrw,u,<> t0,7,8,r0
- addib,tr,n -3,ret0,.Lstrlen_out
- extrw,u,<> t0,15,8,r0
- addib,tr,n -2,ret0,.Lstrlen_out
- extrw,u,<> t0,23,8,r0
- addi -1,ret0,ret0
-.Lstrlen_out:
- bv r0(rp)
- uaddcm ret0,arg0,ret0
-.Lstrlen_null_ptr:
- bv,n r0(rp)
-ENDPROC_CFI(strlen)
-
-
-ENTRY_CFI(strcpy, frame=0,no_calls)
- ldb 0(arg1),t0
- stb t0,0(arg0)
- ldo 0(arg0),ret0
- ldo 1(arg1),t1
- cmpb,= r0,t0,2f
- ldo 1(arg0),t2
-1: ldb 0(t1),arg1
- stb arg1,0(t2)
- ldo 1(t1),t1
- cmpb,<> r0,arg1,1b
- ldo 1(t2),t2
-2: bv,n r0(rp)
-ENDPROC_CFI(strcpy)
-
-
-ENTRY_CFI(strncpy, frame=0,no_calls)
- ldb 0(arg1),t0
- stb t0,0(arg0)
- ldo 1(arg1),t1
- ldo 0(arg0),ret0
- cmpb,= r0,t0,2f
- ldo 1(arg0),arg1
-1: ldo -1(arg2),arg2
- cmpb,COND(=),n r0,arg2,2f
- ldb 0(t1),arg0
- stb arg0,0(arg1)
- ldo 1(t1),t1
- cmpb,<> r0,arg0,1b
- ldo 1(arg1),arg1
-2: bv,n r0(rp)
-ENDPROC_CFI(strncpy)
-
-
-ENTRY_CFI(strcat, frame=0,no_calls)
- ldb 0(arg0),t0
- cmpb,= t0,r0,2f
- ldo 0(arg0),ret0
- ldo 1(arg0),arg0
-1: ldb 0(arg0),t1
- cmpb,<>,n r0,t1,1b
- ldo 1(arg0),arg0
-2: ldb 0(arg1),t2
- stb t2,0(arg0)
- ldo 1(arg0),arg0
- ldb 0(arg1),t0
- cmpb,<> r0,t0,2b
- ldo 1(arg1),arg1
- bv,n r0(rp)
-ENDPROC_CFI(strcat)
-
-
-ENTRY_CFI(memset, frame=0,no_calls)
- copy arg0,ret0
- cmpb,COND(=) r0,arg0,4f
- copy arg0,t2
- cmpb,COND(=) r0,arg2,4f
- ldo -1(arg2),arg3
- subi -1,arg3,t0
- subi 0,t0,t1
- cmpiclr,COND(>=) 0,t1,arg2
- ldo -1(t1),arg2
- extru arg2,31,2,arg0
-2: stb arg1,0(t2)
- ldo 1(t2),t2
- addib,>= -1,arg0,2b
- ldo -1(arg3),arg3
- cmpiclr,COND(<=) 4,arg2,r0
- b,l,n 4f,r0
-#ifdef CONFIG_64BIT
- depd,* r0,63,2,arg2
-#else
- depw r0,31,2,arg2
-#endif
- ldo 1(t2),t2
-3: stb arg1,-1(t2)
- stb arg1,0(t2)
- stb arg1,1(t2)
- stb arg1,2(t2)
- addib,COND(>) -4,arg2,3b
- ldo 4(t2),t2
-4: bv,n r0(rp)
-ENDPROC_CFI(memset)
-
- .end
* syscall register convention is in Documentation/powerpc/syscall64-abi.rst
*/
EXC_VIRT_BEGIN(system_call_vectored, 0x3000, 0x1000)
-1:
/* SCV 0 */
mr r9,r13
GET_PACA(r13)
b system_call_vectored_sigill
#endif
.endr
-2:
EXC_VIRT_END(system_call_vectored, 0x3000, 0x1000)
-SOFT_MASK_TABLE(1b, 2b) // Treat scv vectors as soft-masked, see comment above.
+// Treat scv vectors as soft-masked, see comment above.
+// Use absolute values rather than labels here, so they don't get relocated,
+// because this code runs unrelocated.
+SOFT_MASK_TABLE(0xc000000000003000, 0xc000000000004000)
#ifdef CONFIG_RELOCATABLE
TRAMP_VIRT_BEGIN(system_call_vectored_tramp)
struct uic *uic = irq_desc_get_handler_data(desc);
u32 msr;
int src;
- int subvirq;
raw_spin_lock(&desc->lock);
if (irqd_is_level_type(idata))
src = 32 - ffs(msr);
- subvirq = irq_linear_revmap(uic->irqhost, src);
- generic_handle_irq(subvirq);
+ generic_handle_domain_irq(uic->irqhost, src);
uic_irq_ret:
raw_spin_lock(&desc->lock);
.irq_unmask = cpld_unmask_irq,
};
-static int
+static unsigned int
cpld_pic_get_irq(int offset, u8 ignore, u8 __iomem *statusp,
u8 __iomem *maskp)
{
- int cpld_irq;
u8 status = in_8(statusp);
u8 mask = in_8(maskp);
status |= (ignore | mask);
if (status == 0xff)
- return 0;
-
- cpld_irq = ffz(status) + offset;
+ return ~0;
- return irq_linear_revmap(cpld_pic_host, cpld_irq);
+ return ffz(status) + offset;
}
static void cpld_pic_cascade(struct irq_desc *desc)
{
- unsigned int irq;
+ unsigned int hwirq;
- irq = cpld_pic_get_irq(0, PCI_IGNORE, &cpld_regs->pci_status,
+ hwirq = cpld_pic_get_irq(0, PCI_IGNORE, &cpld_regs->pci_status,
&cpld_regs->pci_mask);
- if (irq) {
- generic_handle_irq(irq);
+ if (hwirq != ~0) {
+ generic_handle_domain_irq(cpld_pic_host, hwirq);
return;
}
- irq = cpld_pic_get_irq(8, MISC_IGNORE, &cpld_regs->misc_status,
+ hwirq = cpld_pic_get_irq(8, MISC_IGNORE, &cpld_regs->misc_status,
&cpld_regs->misc_mask);
- if (irq) {
- generic_handle_irq(irq);
+ if (hwirq != ~0) {
+ generic_handle_domain_irq(cpld_pic_host, hwirq);
return;
}
}
static void media5200_irq_cascade(struct irq_desc *desc)
{
struct irq_chip *chip = irq_desc_get_chip(desc);
- int sub_virq, val;
+ int val;
u32 status, enable;
/* Mask off the cascaded IRQ */
enable = in_be32(media5200_irq.regs + MEDIA5200_IRQ_STATUS);
val = ffs((status & enable) >> MEDIA5200_IRQ_SHIFT);
if (val) {
- sub_virq = irq_linear_revmap(media5200_irq.irqhost, val - 1);
- /* pr_debug("%s: virq=%i s=%.8x e=%.8x hwirq=%i subvirq=%i\n",
- * __func__, virq, status, enable, val - 1, sub_virq);
+ generic_handle_domain_irq(media5200_irq.irqhost, val - 1);
+ /* pr_debug("%s: virq=%i s=%.8x e=%.8x hwirq=%i\n",
+ * __func__, virq, status, enable, val - 1);
*/
- generic_handle_irq(sub_virq);
}
/* Processing done; can reenable the cascade now */
static void mpc52xx_gpt_irq_cascade(struct irq_desc *desc)
{
struct mpc52xx_gpt_priv *gpt = irq_desc_get_handler_data(desc);
- int sub_virq;
u32 status;
status = in_be32(&gpt->regs->status) & MPC52xx_GPT_STATUS_IRQMASK;
- if (status) {
- sub_virq = irq_linear_revmap(gpt->irqhost, 0);
- generic_handle_irq(sub_virq);
- }
+ if (status)
+ generic_handle_domain_irq(gpt->irqhost, 0);
}
static int mpc52xx_gpt_irq_map(struct irq_domain *h, unsigned int virq,
break;
for (bit = 0; pend != 0; ++bit, pend <<= 1) {
- if (pend & 0x80000000) {
- int virq = irq_linear_revmap(priv->host, bit);
- generic_handle_irq(virq);
- }
+ if (pend & 0x80000000)
+ generic_handle_domain_irq(priv->host, bit);
}
}
}
select PPC_HAVE_PMU_SUPPORT
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select ARCH_ENABLE_HUGEPAGE_MIGRATION if HUGETLB_PAGE && MIGRATION
- select ARCH_ENABLE_PMD_SPLIT_PTLOCK
+ select ARCH_ENABLE_SPLIT_PMD_PTLOCK
select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
select ARCH_SUPPORTS_HUGETLBFS
select ARCH_SUPPORTS_NUMA_BALANCING
out_be64(&node_iic->iic_is, ack);
/* handle them */
for (cascade = 63; cascade >= 0; cascade--)
- if (bits & (0x8000000000000000UL >> cascade)) {
- unsigned int cirq =
- irq_linear_revmap(iic_host,
+ if (bits & (0x8000000000000000UL >> cascade))
+ generic_handle_domain_irq(iic_host,
base | cascade);
- if (cirq)
- generic_handle_irq(cirq);
- }
/* post-ack level interrupts */
ack = bits & ~IIC_ISR_EDGE_MASK;
if (ack)
{
struct irq_chip *chip = irq_desc_get_chip(desc);
struct spider_pic *pic = irq_desc_get_handler_data(desc);
- unsigned int cs, virq;
+ unsigned int cs;
cs = in_be32(pic->regs + TIR_CS) >> 24;
- if (cs == SPIDER_IRQ_INVALID)
- virq = 0;
- else
- virq = irq_linear_revmap(pic->host, cs);
-
- if (virq)
- generic_handle_irq(virq);
+ if (cs != SPIDER_IRQ_INVALID)
+ generic_handle_domain_irq(pic->host, cs);
chip->irq_eoi(&desc->irq_data);
}
static unsigned int __hlwd_pic_get_irq(struct irq_domain *h)
{
void __iomem *io_base = h->host_data;
- int irq;
u32 irq_status;
irq_status = in_be32(io_base + HW_BROADWAY_ICR) &
if (irq_status == 0)
return 0; /* no more IRQs pending */
- irq = __ffs(irq_status);
- return irq_linear_revmap(h, irq);
+ return __ffs(irq_status);
}
static void hlwd_pic_irq_cascade(struct irq_desc *desc)
{
struct irq_chip *chip = irq_desc_get_chip(desc);
struct irq_domain *irq_domain = irq_desc_get_handler_data(desc);
- unsigned int virq;
+ unsigned int hwirq;
raw_spin_lock(&desc->lock);
chip->irq_mask(&desc->irq_data); /* IRQ_LEVEL */
raw_spin_unlock(&desc->lock);
- virq = __hlwd_pic_get_irq(irq_domain);
- if (virq)
- generic_handle_irq(virq);
+ hwirq = __hlwd_pic_get_irq(irq_domain);
+ if (hwirq)
+ generic_handle_domain_irq(irq_domain, hwirq);
else
pr_err("spurious interrupt!\n");
unsigned int hlwd_pic_get_irq(void)
{
- return __hlwd_pic_get_irq(hlwd_irq_host);
+ unsigned int hwirq = __hlwd_pic_get_irq(hlwd_irq_host);
+ return hwirq ? irq_linear_revmap(hlwd_irq_host, hwirq) : 0;
}
/*
e = READ_ONCE(last_outstanding_events) & opal_event_irqchip.mask;
again:
while (e) {
- int virq, hwirq;
+ int hwirq;
hwirq = fls64(e) - 1;
e &= ~BIT_ULL(hwirq);
local_irq_disable();
- virq = irq_find_mapping(opal_event_irqchip.domain, hwirq);
- if (virq) {
- irq_enter();
- generic_handle_irq(virq);
- irq_exit();
- }
+ irq_enter();
+ generic_handle_domain_irq(opal_event_irqchip.domain, hwirq);
+ irq_exit();
local_irq_enable();
cond_resched();
struct mpic *mpic = (struct mpic *) data;
u32 eisr, eimr;
int errint;
- unsigned int cascade_irq;
eisr = mpic_fsl_err_read(mpic->err_regs, MPIC_ERR_INT_EISR);
eimr = mpic_fsl_err_read(mpic->err_regs, MPIC_ERR_INT_EIMR);
return IRQ_NONE;
while (eisr) {
+ int ret;
errint = __builtin_clz(eisr);
- cascade_irq = irq_linear_revmap(mpic->irqhost,
- mpic->err_int_vecs[errint]);
- WARN_ON(!cascade_irq);
- if (cascade_irq) {
- generic_handle_irq(cascade_irq);
- } else {
+ ret = generic_handle_domain_irq(mpic->irqhost,
+ mpic->err_int_vecs[errint]);
+ if (WARN_ON(ret)) {
eimr |= 1 << (31 - errint);
mpic_fsl_err_write(mpic->err_regs, eimr);
}
static irqreturn_t fsl_msi_cascade(int irq, void *data)
{
- unsigned int cascade_irq;
struct fsl_msi *msi_data;
int msir_index = -1;
u32 msir_value = 0;
msir_index = cascade_data->index;
- if (msir_index >= NR_MSI_REG_MAX)
- cascade_irq = 0;
-
switch (msi_data->feature & FSL_PIC_IP_MASK) {
case FSL_PIC_IP_MPIC:
msir_value = fsl_msi_read(msi_data->msi_regs,
}
while (msir_value) {
+ int err;
intr_index = ffs(msir_value) - 1;
- cascade_irq = irq_linear_revmap(msi_data->irqhost,
+ err = generic_handle_domain_irq(msi_data->irqhost,
msi_hwirq(msi_data, msir_index,
intr_index + have_shift));
- if (cascade_irq) {
- generic_handle_irq(cascade_irq);
+ if (!err)
ret = IRQ_HANDLED;
- }
+
have_shift += intr_index + 1;
msir_value = msir_value >> (intr_index + 1);
}
model = "Microchip PolarFire-SoC Icicle Kit";
compatible = "microchip,mpfs-icicle-kit";
+ aliases {
+ ethernet0 = &emac1;
+ };
+
chosen {
stdout-path = &serial0;
};
reg = <0x0 0x20112000 0x0 0x2000>;
interrupt-parent = <&plic>;
interrupts = <70 71 72 73>;
- mac-address = [00 00 00 00 00 00];
+ local-mac-address = [00 00 00 00 00 00];
clocks = <&clkcfg 5>, <&clkcfg 2>;
status = "disabled";
clock-names = "pclk", "hclk";
CONFIG_DEBUG_SG=y
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_EQS_DEBUG=y
-CONFIG_DEBUG_BLOCK_EXT_DEVT=y
# CONFIG_FTRACE is not set
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_MEMTEST=y
CONFIG_DEBUG_SG=y
# CONFIG_RCU_TRACE is not set
CONFIG_RCU_EQS_DEBUG=y
-CONFIG_DEBUG_BLOCK_EXT_DEVT=y
# CONFIG_FTRACE is not set
# CONFIG_RUNTIME_TESTING_MENU is not set
CONFIG_MEMTEST=y
#include <asm/ptrace.h>
#include <asm/syscall.h>
#include <asm/thread_info.h>
+#include <asm/switch_to.h>
#include <linux/audit.h>
#include <linux/ptrace.h>
#include <linux/elf.h>
{
struct __riscv_d_ext_state *fstate = &target->thread.fstate;
+ if (target == current)
+ fstate_save(current, task_pt_regs(current));
+
membuf_write(&to, fstate, offsetof(struct __riscv_d_ext_state, fcsr));
membuf_store(&to, fstate->fcsr);
return membuf_zero(&to, 4); // explicitly pad
select HAVE_ARCH_JUMP_LABEL_RELATIVE
select HAVE_ARCH_KASAN
select HAVE_ARCH_KASAN_VMALLOC
+ select HAVE_ARCH_KCSAN
+ select HAVE_ARCH_KFENCE
select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_SOFT_DIRTY
KBUILD_IMAGE := $(boot)/bzImage
install:
- $(Q)$(MAKE) $(build)=$(boot) $@
+ sh -x $(srctree)/$(boot)/install.sh $(KERNELRELEASE) $(KBUILD_IMAGE) \
+ System.map "$(INSTALL_PATH)"
bzImage: vmlinux
$(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
GCOV_PROFILE := n
UBSAN_SANITIZE := n
KASAN_SANITIZE := n
+KCSAN_SANITIZE := n
KBUILD_AFLAGS := $(KBUILD_AFLAGS_DECOMPRESSOR)
KBUILD_CFLAGS := $(KBUILD_CFLAGS_DECOMPRESSOR)
obj-y := head.o als.o startup.o mem_detect.o ipl_parm.o ipl_report.o
obj-y += string.o ebcdic.o sclp_early_core.o mem.o ipl_vmparm.o cmdline.o
-obj-y += version.o pgm_check_info.o ctype.o text_dma.o
+obj-y += version.o pgm_check_info.o ctype.o
obj-$(findstring y, $(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) $(CONFIG_PGSTE)) += uv.o
obj-$(CONFIG_RELOCATABLE) += machine_kexec_reloc.o
obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
$(obj)/startup.a: $(OBJECTS) FORCE
$(call if_changed,ar)
-
-install:
- sh -x $(srctree)/$(obj)/install.sh $(KERNELRELEASE) $(obj)/bzImage \
- System.map "$(INSTALL_PATH)"
#ifndef BOOT_BOOT_H
#define BOOT_BOOT_H
+#include <asm/extable.h>
#include <linux/types.h>
-#define BOOT_STACK_OFFSET 0x8000
-
-#ifndef __ASSEMBLY__
-
-#include <linux/compiler.h>
-
void startup_kernel(void);
unsigned long detect_memory(void);
bool is_ipl_block_dump(void);
void parse_boot_command_line(void);
void verify_facilities(void);
void print_missing_facilities(void);
+void sclp_early_setup_buffer(void);
void print_pgm_check_info(void);
unsigned long get_random_base(unsigned long safe_addr);
void __printf(1, 2) decompressor_printk(const char *fmt, ...);
+/* Symbols defined by linker scripts */
extern const char kernel_version[];
extern unsigned long memory_limit;
extern unsigned long vmalloc_size;
extern int vmalloc_size_set;
extern int kaslr_enabled;
+extern char __boot_data_start[], __boot_data_end[];
+extern char __boot_data_preserved_start[], __boot_data_preserved_end[];
+extern char _decompressor_syms_start[], _decompressor_syms_end[];
+extern char _stack_start[], _stack_end[];
unsigned long read_ipl_report(unsigned long safe_offset);
-#endif /* __ASSEMBLY__ */
#endif /* BOOT_BOOT_H */
GCOV_PROFILE := n
UBSAN_SANITIZE := n
KASAN_SANITIZE := n
+KCSAN_SANITIZE := n
obj-y := $(if $(CONFIG_KERNEL_UNCOMPRESSED),,decompressor.o) info.o
obj-$(CONFIG_KERNEL_ZSTD) += clz_ctz.o
#define memmove memmove
#define memzero(s, n) memset((s), 0, (n))
-/* Symbols defined by linker scripts */
-extern char _end[];
-extern unsigned char _compressed_start[];
-extern unsigned char _compressed_end[];
-
#ifdef CONFIG_KERNEL_BZIP2
#define BOOT_HEAP_SIZE 0x400000
#elif CONFIG_KERNEL_ZSTD
unsigned long rela_dyn_end;
};
+/* Symbols defined by linker scripts */
+extern char _end[];
+extern unsigned char _compressed_start[];
+extern unsigned char _compressed_end[];
extern char _vmlinux_info[];
+
#define vmlinux (*(struct vmlinux_info *)_vmlinux_info)
#endif /* BOOT_COMPRESSED_DECOMPRESSOR_H */
/* SPDX-License-Identifier: GPL-2.0 */
#include <asm-generic/vmlinux.lds.h>
#include <asm/vmlinux.lds.h>
+#include <asm/thread_info.h>
+#include <asm/page.h>
+#include <asm/sclp.h>
OUTPUT_FORMAT("elf64-s390", "elf64-s390", "elf64-s390")
OUTPUT_ARCH(s390:64-bit)
*(.data.*)
_edata = . ;
}
- /*
- * .dma section for code, data, ex_table that need to stay below 2 GB,
- * even when the kernel is relocate: above 2 GB.
- */
- . = ALIGN(PAGE_SIZE);
- _sdma = .;
- .dma.text : {
- _stext_dma = .;
- *(.dma.text)
- . = ALIGN(PAGE_SIZE);
- _etext_dma = .;
- }
- . = ALIGN(16);
- .dma.ex_table : {
- _start_dma_ex_table = .;
- KEEP(*(.dma.ex_table))
- _stop_dma_ex_table = .;
- }
- .dma.data : { *(.dma.data) }
- . = ALIGN(PAGE_SIZE);
- _edma = .;
BOOT_DATA
BOOT_DATA_PRESERVED
*(.bss)
*(.bss.*)
*(COMMON)
+ /*
+ * Stacks for the decompressor
+ */
+ . = ALIGN(PAGE_SIZE);
+ _dump_info_stack_start = .;
+ . += PAGE_SIZE;
+ _dump_info_stack_end = .;
+ . = ALIGN(PAGE_SIZE);
+ _stack_start = .;
+ . += BOOT_STACK_SIZE;
+ _stack_end = .;
_ebss = .;
}
#include <linux/init.h>
#include <linux/linkage.h>
#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
#include <asm/page.h>
#include <asm/ptrace.h>
-#include "boot.h"
+#include <asm/sclp.h>
#define ARCH_OFFSET 4
+#define EP_OFFSET 0x10008
+#define EP_STRING "S390EP"
+
__HEAD
#define IPL_BS 0x730
.Lcpuid:.fill 8,1,0
#
-# startup-code at 0x10000, running in absolute addressing mode
+# normal startup-code, running in absolute addressing mode
# this is called either by the ipl loader or directly by PSW restart
# or linload or SALIPL
#
- .org 0x10000
+ .org STARTUP_NORMAL_OFFSET
SYM_CODE_START(startup)
j startup_normal
.org EP_OFFSET
.ascii EP_STRING
.byte 0x00,0x01
#
-# kdump startup-code at 0x10010, running in 64 bit absolute addressing mode
+# kdump startup-code, running in 64 bit absolute addressing mode
#
- .org 0x10010
+ .org STARTUP_KDUMP_OFFSET
j startup_kdump
SYM_CODE_END(startup)
SYM_CODE_START_LOCAL(startup_normal)
xc 0x300(256),0x300
xc 0xe00(256),0xe00
xc 0xf00(256),0xf00
- lctlg %c0,%c15,.Lctl-.LPG0(%r13) # load control registers
stcke __LC_BOOT_CLOCK
mvc __LC_LAST_UPDATE_CLOCK(8),__LC_BOOT_CLOCK+1
spt 6f-.LPG0(%r13)
mvc __LC_LAST_UPDATE_TIMER(8),6f-.LPG0(%r13)
- l %r15,.Lstack-.LPG0(%r13)
+ larl %r15,_stack_end-STACK_FRAME_OVERHEAD
+ brasl %r14,sclp_early_setup_buffer
brasl %r14,verify_facilities
brasl %r14,startup_kernel
SYM_CODE_END(startup_normal)
-.Lstack:
- .long BOOT_STACK_OFFSET + BOOT_STACK_SIZE - STACK_FRAME_OVERHEAD
.align 8
6: .long 0x7fffffff,0xffffffff
.Lext_new_psw:
.quad 0x0000000180000000,startup_pgm_check_handler
.Lio_new_psw:
.quad 0x0002000180000000,0x1f0 # disabled wait
-.Lctl: .quad 0x04040000 # cr0: AFP registers & secondary space
- .quad 0 # cr1: primary space segment table
- .quad .Lduct # cr2: dispatchable unit control table
- .quad 0 # cr3: instruction authorization
- .quad 0xffff # cr4: instruction authorization
- .quad .Lduct # cr5: primary-aste origin
- .quad 0 # cr6: I/O interrupts
- .quad 0 # cr7: secondary space segment table
- .quad 0x0000000000008000 # cr8: access registers translation
- .quad 0 # cr9: tracing off
- .quad 0 # cr10: tracing off
- .quad 0 # cr11: tracing off
- .quad 0 # cr12: tracing off
- .quad 0 # cr13: home space segment table
- .quad 0xc0000000 # cr14: machine check handling off
- .quad .Llinkage_stack # cr15: linkage stack operations
-
- .section .dma.data,"aw",@progbits
-.Lduct: .long 0,.Laste,.Laste,0,.Lduald,0,0,0
- .long 0,0,0,0,0,0,0,0
-.Llinkage_stack:
- .long 0,0,0x89000000,0,0,0,0x8a000000,0
- .align 64
-.Laste: .quad 0,0xffffffffffffffff,0,0,0,0,0,0
- .align 128
-.Lduald:.rept 8
- .long 0x80000000,0,0,0 # invalid access-list entries
- .endr
- .previous
#include "head_kdump.S"
oi __LC_RETURN_PSW+1,0x2 # set wait state bit
larl %r9,.Lold_psw_disabled_wait
stg %r9,__LC_PGM_NEW_PSW+8
- l %r15,.Ldump_info_stack-.Lold_psw_disabled_wait(%r9)
+ larl %r15,_dump_info_stack_end-STACK_FRAME_OVERHEAD
brasl %r14,print_pgm_check_info
.Lold_psw_disabled_wait:
la %r8,4095
lmg %r0,%r15,__LC_GPREGS_SAVE_AREA-4095(%r8)
lpswe __LC_RETURN_PSW # disabled wait
SYM_CODE_END(startup_pgm_check_handler)
-.Ldump_info_stack:
- .long 0x5000 + PAGE_SIZE - STACK_FRAME_OVERHEAD
#
# params at 10400 (setup.h)
.org PARMAREA+__PARMAREA_SIZE
SYM_DATA_END(parmarea)
- .org EARLY_SCCB_OFFSET
- .fill 4096
-
.org HEAD_END
* not overlap with any component or any certificate.
*/
repeat:
- if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && INITRD_START && INITRD_SIZE &&
- intersects(INITRD_START, INITRD_SIZE, safe_addr, size))
- safe_addr = INITRD_START + INITRD_SIZE;
+ if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && initrd_data.start && initrd_data.size &&
+ intersects(initrd_data.start, initrd_data.size, safe_addr, size))
+ safe_addr = initrd_data.start + initrd_data.size;
for_each_rb_entry(comp, comps)
if (intersects(safe_addr, size, comp->addr, comp->len)) {
safe_addr = comp->addr + comp->len;
*/
memory_limit -= kasan_estimate_memory_needs(memory_limit);
- if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && INITRD_START && INITRD_SIZE) {
- if (safe_addr < INITRD_START + INITRD_SIZE)
- safe_addr = INITRD_START + INITRD_SIZE;
+ if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && initrd_data.start && initrd_data.size) {
+ if (safe_addr < initrd_data.start + initrd_data.size)
+ safe_addr = initrd_data.start + initrd_data.size;
}
safe_addr = ALIGN(safe_addr, THREAD_SIZE);
// SPDX-License-Identifier: GPL-2.0
#include <linux/errno.h>
#include <linux/init.h>
+#include <asm/setup.h>
+#include <asm/processor.h>
#include <asm/sclp.h>
#include <asm/sections.h>
#include <asm/mem_detect.h>
{
unsigned long offset = ALIGN(mem_safe_offset(), sizeof(u64));
- if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && INITRD_START && INITRD_SIZE &&
- INITRD_START < offset + ENTRIES_EXTENDED_MAX)
- offset = ALIGN(INITRD_START + INITRD_SIZE, sizeof(u64));
+ if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && initrd_data.start && initrd_data.size &&
+ initrd_data.start < offset + ENTRIES_EXTENDED_MAX)
+ offset = ALIGN(initrd_data.start + initrd_data.size, sizeof(u64));
return (void *)offset;
}
return p + 1;
}
-extern char _decompressor_syms_start[], _decompressor_syms_end[];
static noinline char *findsym(unsigned long ip, unsigned short *off, unsigned short *len)
{
/* symbol entries are in a form "10000 c4 startup\0" */
static noinline void print_stacktrace(void)
{
- struct stack_info boot_stack = { STACK_TYPE_TASK, BOOT_STACK_OFFSET,
- BOOT_STACK_OFFSET + BOOT_STACK_SIZE };
+ struct stack_info boot_stack = { STACK_TYPE_TASK, (unsigned long)_stack_start,
+ (unsigned long)_stack_end };
unsigned long sp = S390_lowcore.gpregs_save_area[15];
bool first = true;
// SPDX-License-Identifier: GPL-2.0
+#include "boot.h"
#include "../../../drivers/s390/char/sclp_early_core.c"
+
+/* SCLP early buffer must stay page-aligned and below 2GB */
+static char __sclp_early_sccb[EXT_SCCB_READ_SCP] __aligned(PAGE_SIZE);
+
+void sclp_early_setup_buffer(void)
+{
+ sclp_early_set_buffer(&__sclp_early_sccb);
+}
#include <asm/uv.h>
#include "compressed/decompressor.h"
#include "boot.h"
+#include "uv.h"
-extern char __boot_data_start[], __boot_data_end[];
-extern char __boot_data_preserved_start[], __boot_data_preserved_end[];
unsigned long __bootdata_preserved(__kaslr_offset);
unsigned long __bootdata_preserved(VMALLOC_START);
unsigned long __bootdata_preserved(VMALLOC_END);
unsigned long __bootdata_preserved(MODULES_END);
unsigned long __bootdata(ident_map_size);
int __bootdata(is_full_image) = 1;
+struct initrd_data __bootdata(initrd_data);
u64 __bootdata_preserved(stfle_fac_list[16]);
u64 __bootdata_preserved(alt_stfle_fac_list[16]);
-
-/*
- * Some code and data needs to stay below 2 GB, even when the kernel would be
- * relocated above 2 GB, because it has to use 31 bit addresses.
- * Such code and data is part of the .dma section, and its location is passed
- * over to the decompressed / relocated kernel via the .boot.preserved.data
- * section.
- */
-extern char _sdma[], _edma[];
-extern char _stext_dma[], _etext_dma[];
-extern struct exception_table_entry _start_dma_ex_table[];
-extern struct exception_table_entry _stop_dma_ex_table[];
-unsigned long __bootdata_preserved(__sdma) = __pa(&_sdma);
-unsigned long __bootdata_preserved(__edma) = __pa(&_edma);
-unsigned long __bootdata_preserved(__stext_dma) = __pa(&_stext_dma);
-unsigned long __bootdata_preserved(__etext_dma) = __pa(&_etext_dma);
-struct exception_table_entry *
- __bootdata_preserved(__start_dma_ex_table) = _start_dma_ex_table;
-struct exception_table_entry *
- __bootdata_preserved(__stop_dma_ex_table) = _stop_dma_ex_table;
-
-int _diag210_dma(struct diag210 *addr);
-int _diag26c_dma(void *req, void *resp, enum diag26c_sc subcode);
-int _diag14_dma(unsigned long rx, unsigned long ry1, unsigned long subcode);
-void _diag0c_dma(struct hypfs_diag0c_entry *entry);
-void _diag308_reset_dma(void);
-struct diag_ops __bootdata_preserved(diag_dma_ops) = {
- .diag210 = _diag210_dma,
- .diag26c = _diag26c_dma,
- .diag14 = _diag14_dma,
- .diag0c = _diag0c_dma,
- .diag308_reset = _diag308_reset_dma
-};
-static struct diag210 _diag210_tmp_dma __section(".dma.data");
-struct diag210 *__bootdata_preserved(__diag210_tmp_dma) = &_diag210_tmp_dma;
+struct oldmem_data __bootdata_preserved(oldmem_data);
void error(char *x)
{
{
if (!IS_ENABLED(CONFIG_BLK_DEV_INITRD))
return;
- if (!INITRD_START || !INITRD_SIZE)
+ if (!initrd_data.start || !initrd_data.size)
return;
- if (addr <= INITRD_START)
+ if (addr <= initrd_data.start)
return;
- memmove((void *)addr, (void *)INITRD_START, INITRD_SIZE);
- INITRD_START = addr;
+ memmove((void *)addr, (void *)initrd_data.start, initrd_data.size);
+ initrd_data.start = addr;
}
static void copy_bootdata(void)
ident_map_size = min(ident_map_size, 1UL << MAX_PHYSMEM_BITS);
#ifdef CONFIG_CRASH_DUMP
- if (OLDMEM_BASE) {
+ if (oldmem_data.start) {
kaslr_enabled = 0;
- ident_map_size = min(ident_map_size, OLDMEM_SIZE);
+ ident_map_size = min(ident_map_size, oldmem_data.size);
} else if (ipl_block_valid && is_ipl_block_dump()) {
kaslr_enabled = 0;
if (!sclp_early_get_hsa_size(&hsa_size) && hsa_size)
vmalloc_size = max(size, vmalloc_size);
}
+static void offset_vmlinux_info(unsigned long offset)
+{
+ vmlinux.default_lma += offset;
+ *(unsigned long *)(&vmlinux.entry) += offset;
+ vmlinux.bootdata_off += offset;
+ vmlinux.bootdata_preserved_off += offset;
+ vmlinux.rela_dyn_start += offset;
+ vmlinux.rela_dyn_end += offset;
+ vmlinux.dynsym_start += offset;
+}
+
void startup_kernel(void)
{
unsigned long random_lma;
unsigned long safe_addr;
void *img;
+ initrd_data.start = parmarea.initrd_start;
+ initrd_data.size = parmarea.initrd_size;
+ oldmem_data.start = parmarea.oldmem_base;
+ oldmem_data.size = parmarea.oldmem_size;
+
setup_lpp();
store_ipl_parmblock();
safe_addr = mem_safe_offset();
sclp_early_read_info();
setup_boot_command_line();
parse_boot_command_line();
+ sanitize_prot_virt_host();
setup_ident_map_size(detect_memory());
setup_vmalloc_size();
setup_kernel_memory_layout();
- random_lma = __kaslr_offset = 0;
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_enabled) {
random_lma = get_random_base(safe_addr);
if (random_lma) {
__kaslr_offset = random_lma - vmlinux.default_lma;
img = (void *)vmlinux.default_lma;
- vmlinux.default_lma += __kaslr_offset;
- vmlinux.entry += __kaslr_offset;
- vmlinux.bootdata_off += __kaslr_offset;
- vmlinux.bootdata_preserved_off += __kaslr_offset;
- vmlinux.rela_dyn_start += __kaslr_offset;
- vmlinux.rela_dyn_end += __kaslr_offset;
- vmlinux.dynsym_start += __kaslr_offset;
+ offset_vmlinux_info(__kaslr_offset);
}
}
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Code that needs to run below 2 GB.
- *
- * Copyright IBM Corp. 2019
- */
-
-#include <linux/linkage.h>
-#include <asm/errno.h>
-#include <asm/sigp.h>
-
- .section .dma.text,"ax"
-/*
- * Simplified version of expoline thunk. The normal thunks can not be used here,
- * because they might be more than 2 GB away, and not reachable by the relative
- * branch. No comdat, exrl, etc. optimizations used here, because it only
- * affects a few functions that are not performance-relevant.
- */
- .macro BR_EX_DMA_r14
- larl %r1,0f
- ex 0,0(%r1)
- j .
-0: br %r14
- .endm
-
-/*
- * int _diag14_dma(unsigned long rx, unsigned long ry1, unsigned long subcode)
- */
-ENTRY(_diag14_dma)
- lgr %r1,%r2
- lgr %r2,%r3
- lgr %r3,%r4
- lhi %r5,-EIO
- sam31
- diag %r1,%r2,0x14
-.Ldiag14_ex:
- ipm %r5
- srl %r5,28
-.Ldiag14_fault:
- sam64
- lgfr %r2,%r5
- BR_EX_DMA_r14
- EX_TABLE_DMA(.Ldiag14_ex, .Ldiag14_fault)
-ENDPROC(_diag14_dma)
-
-/*
- * int _diag210_dma(struct diag210 *addr)
- */
-ENTRY(_diag210_dma)
- lgr %r1,%r2
- lhi %r2,-1
- sam31
- diag %r1,%r0,0x210
-.Ldiag210_ex:
- ipm %r2
- srl %r2,28
-.Ldiag210_fault:
- sam64
- lgfr %r2,%r2
- BR_EX_DMA_r14
- EX_TABLE_DMA(.Ldiag210_ex, .Ldiag210_fault)
-ENDPROC(_diag210_dma)
-
-/*
- * int _diag26c_dma(void *req, void *resp, enum diag26c_sc subcode)
- */
-ENTRY(_diag26c_dma)
- lghi %r5,-EOPNOTSUPP
- sam31
- diag %r2,%r4,0x26c
-.Ldiag26c_ex:
- sam64
- lgfr %r2,%r5
- BR_EX_DMA_r14
- EX_TABLE_DMA(.Ldiag26c_ex, .Ldiag26c_ex)
-ENDPROC(_diag26c_dma)
-
-/*
- * void _diag0c_dma(struct hypfs_diag0c_entry *entry)
- */
-ENTRY(_diag0c_dma)
- sam31
- diag %r2,%r2,0x0c
- sam64
- BR_EX_DMA_r14
-ENDPROC(_diag0c_dma)
-
-/*
- * void _diag308_reset_dma(void)
- *
- * Calls diag 308 subcode 1 and continues execution
- */
-ENTRY(_diag308_reset_dma)
- larl %r4,.Lctlregs # Save control registers
- stctg %c0,%c15,0(%r4)
- lg %r2,0(%r4) # Disable lowcore protection
- nilh %r2,0xefff
- larl %r4,.Lctlreg0
- stg %r2,0(%r4)
- lctlg %c0,%c0,0(%r4)
- larl %r4,.Lfpctl # Floating point control register
- stfpc 0(%r4)
- larl %r4,.Lprefix # Save prefix register
- stpx 0(%r4)
- larl %r4,.Lprefix_zero # Set prefix register to 0
- spx 0(%r4)
- larl %r4,.Lcontinue_psw # Save PSW flags
- epsw %r2,%r3
- stm %r2,%r3,0(%r4)
- larl %r4,restart_part2 # Setup restart PSW at absolute 0
- larl %r3,.Lrestart_diag308_psw
- og %r4,0(%r3) # Save PSW
- lghi %r3,0
- sturg %r4,%r3 # Use sturg, because of large pages
- lghi %r1,1
- lghi %r0,0
- diag %r0,%r1,0x308
-restart_part2:
- lhi %r0,0 # Load r0 with zero
- lhi %r1,2 # Use mode 2 = ESAME (dump)
- sigp %r1,%r0,SIGP_SET_ARCHITECTURE # Switch to ESAME mode
- sam64 # Switch to 64 bit addressing mode
- larl %r4,.Lctlregs # Restore control registers
- lctlg %c0,%c15,0(%r4)
- larl %r4,.Lfpctl # Restore floating point ctl register
- lfpc 0(%r4)
- larl %r4,.Lprefix # Restore prefix register
- spx 0(%r4)
- larl %r4,.Lcontinue_psw # Restore PSW flags
- lpswe 0(%r4)
-.Lcontinue:
- BR_EX_DMA_r14
-ENDPROC(_diag308_reset_dma)
-
- .section .dma.data,"aw",@progbits
-.align 8
-.Lrestart_diag308_psw:
- .long 0x00080000,0x80000000
-
-.align 8
-.Lcontinue_psw:
- .quad 0,.Lcontinue
-
-.align 8
-.Lctlreg0:
- .quad 0
-.Lctlregs:
- .rept 16
- .quad 0
- .endr
-.Lfpctl:
- .long 0
-.Lprefix:
- .long 0
-.Lprefix_zero:
- .long 0
// SPDX-License-Identifier: GPL-2.0
#include <asm/uv.h>
+#include <asm/boot_data.h>
#include <asm/facility.h>
#include <asm/sections.h>
+#include "boot.h"
+#include "uv.h"
+
/* will be used in arch/s390/kernel/uv.c */
#ifdef CONFIG_PROTECTED_VIRTUALIZATION_GUEST
int __bootdata_preserved(prot_virt_guest);
}
#if IS_ENABLED(CONFIG_KVM)
-static bool has_uv_sec_stor_limit(void)
+void adjust_to_uv_max(unsigned long *vmax)
{
- /*
- * keep these conditions in line with setup_uv()
- */
- if (!is_prot_virt_host())
- return false;
+ if (is_prot_virt_host() && uv_info.max_sec_stor_addr)
+ *vmax = min_t(unsigned long, *vmax, uv_info.max_sec_stor_addr);
+}
+static int is_prot_virt_host_capable(void)
+{
+ /* disable if no prot_virt=1 given on command-line */
+ if (!is_prot_virt_host())
+ return 0;
+ /* disable if protected guest virtualization is enabled */
if (is_prot_virt_guest())
- return false;
-
+ return 0;
+ /* disable if no hardware support */
if (!test_facility(158))
- return false;
-
- return !!uv_info.max_sec_stor_addr;
+ return 0;
+ /* disable if kdump */
+ if (oldmem_data.start)
+ return 0;
+ /* disable if stand-alone dump */
+ if (ipl_block_valid && is_ipl_block_dump())
+ return 0;
+ return 1;
}
-void adjust_to_uv_max(unsigned long *vmax)
+void sanitize_prot_virt_host(void)
{
- if (has_uv_sec_stor_limit())
- *vmax = min_t(unsigned long, *vmax, uv_info.max_sec_stor_addr);
+ prot_virt_host = is_prot_virt_host_capable();
}
#endif
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef BOOT_UV_H
+#define BOOT_UV_H
+
+#if IS_ENABLED(CONFIG_KVM)
+void adjust_to_uv_max(unsigned long *vmax);
+void sanitize_prot_virt_host(void);
+#else
+static inline void adjust_to_uv_max(unsigned long *vmax) {}
+static inline void sanitize_prot_virt_host(void) {}
+#endif
+
+#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) || IS_ENABLED(CONFIG_KVM)
+void uv_query_info(void);
+#else
+static inline void uv_query_info(void) {}
+#endif
+
+#endif /* BOOT_UV_H */
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_LSM=y
CONFIG_PREEMPT=y
-CONFIG_SCHED_CORE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_MODULE_SIG_SHA256=y
-CONFIG_BLK_DEV_INTEGRITY=y
CONFIG_BLK_DEV_THROTTLING=y
CONFIG_BLK_WBT=y
CONFIG_BLK_CGROUP_IOLATENCY=y
CONFIG_DM_VERITY=m
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
CONFIG_DM_SWITCH=m
+CONFIG_DM_INTEGRITY=m
CONFIG_NETDEVICES=y
CONFIG_BONDING=m
CONFIG_DUMMY=m
CONFIG_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_BPF_LSM=y
-CONFIG_SCHED_CORE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
static void diag0c_fn(void *data)
{
diag_stat_inc(DIAG_STAT_X00C);
- diag_dma_ops.diag0c(((void **) data)[smp_processor_id()]);
+ diag_amode31_ops.diag0c(((void **)data)[smp_processor_id()]);
}
/*
unsigned int cpu_count, cpu, i;
void **cpu_vec;
- get_online_cpus();
+ cpus_read_lock();
cpu_count = num_online_cpus();
cpu_vec = kmalloc_array(num_possible_cpus(), sizeof(*cpu_vec),
GFP_KERNEL);
if (!cpu_vec)
- goto fail_put_online_cpus;
+ goto fail_unlock_cpus;
/* Note: Diag 0c needs 8 byte alignment and real storage */
diag0c_data = kzalloc(struct_size(diag0c_data, entry, cpu_count),
GFP_KERNEL | GFP_DMA);
on_each_cpu(diag0c_fn, cpu_vec, 1);
*count = cpu_count;
kfree(cpu_vec);
- put_online_cpus();
+ cpus_read_unlock();
return diag0c_data;
fail_kfree_cpu_vec:
kfree(cpu_vec);
-fail_put_online_cpus:
- put_online_cpus();
+fail_unlock_cpus:
+ cpus_read_unlock();
return ERR_PTR(-ENOMEM);
}
#ifndef _ASM_S390_CIO_H_
#define _ASM_S390_CIO_H_
-#include <linux/spinlock.h>
#include <linux/bitops.h>
#include <linux/genalloc.h>
#include <asm/types.h>
*/
static __always_inline void __cpacf_query(unsigned int opcode, cpacf_mask_t *mask)
{
- register unsigned long r0 asm("0") = 0; /* query function */
- register unsigned long r1 asm("1") = (unsigned long) mask;
-
asm volatile(
- " spm 0\n" /* pckmo doesn't change the cc */
+ " lghi 0,0\n" /* query function */
+ " lgr 1,%[mask]\n"
+ " spm 0\n" /* pckmo doesn't change the cc */
/* Parameter regs are ignored, but must be nonzero and unique */
"0: .insn rrf,%[opc] << 16,2,4,6,0\n"
" brc 1,0b\n" /* handle partial completion */
: "=m" (*mask)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (opcode)
- : "cc");
+ : [mask] "d" ((unsigned long)mask), [opc] "i" (opcode)
+ : "cc", "0", "1");
}
static __always_inline int __cpacf_check_opcode(unsigned int opcode)
static inline int cpacf_km(unsigned long func, void *param,
u8 *dest, const u8 *src, long src_len)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) src;
- register unsigned long r3 asm("3") = (unsigned long) src_len;
- register unsigned long r4 asm("4") = (unsigned long) dest;
+ union register_pair d, s;
+ d.even = (unsigned long)dest;
+ s.even = (unsigned long)src;
+ s.odd = (unsigned long)src_len;
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rre,%[opc] << 16,%[dst],%[src]\n"
" brc 1,0b\n" /* handle partial completion */
- : [src] "+a" (r2), [len] "+d" (r3), [dst] "+a" (r4)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_KM)
- : "cc", "memory");
+ : [src] "+&d" (s.pair), [dst] "+&d" (d.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_KM)
+ : "cc", "memory", "0", "1");
- return src_len - r3;
+ return src_len - s.odd;
}
/**
static inline int cpacf_kmc(unsigned long func, void *param,
u8 *dest, const u8 *src, long src_len)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) src;
- register unsigned long r3 asm("3") = (unsigned long) src_len;
- register unsigned long r4 asm("4") = (unsigned long) dest;
+ union register_pair d, s;
+ d.even = (unsigned long)dest;
+ s.even = (unsigned long)src;
+ s.odd = (unsigned long)src_len;
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rre,%[opc] << 16,%[dst],%[src]\n"
" brc 1,0b\n" /* handle partial completion */
- : [src] "+a" (r2), [len] "+d" (r3), [dst] "+a" (r4)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_KMC)
- : "cc", "memory");
+ : [src] "+&d" (s.pair), [dst] "+&d" (d.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_KMC)
+ : "cc", "memory", "0", "1");
- return src_len - r3;
+ return src_len - s.odd;
}
/**
static inline void cpacf_kimd(unsigned long func, void *param,
const u8 *src, long src_len)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) src;
- register unsigned long r3 asm("3") = (unsigned long) src_len;
+ union register_pair s;
+ s.even = (unsigned long)src;
+ s.odd = (unsigned long)src_len;
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rre,%[opc] << 16,0,%[src]\n"
" brc 1,0b\n" /* handle partial completion */
- : [src] "+a" (r2), [len] "+d" (r3)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_KIMD)
- : "cc", "memory");
+ : [src] "+&d" (s.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)(param)),
+ [opc] "i" (CPACF_KIMD)
+ : "cc", "memory", "0", "1");
}
/**
static inline void cpacf_klmd(unsigned long func, void *param,
const u8 *src, long src_len)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) src;
- register unsigned long r3 asm("3") = (unsigned long) src_len;
+ union register_pair s;
+ s.even = (unsigned long)src;
+ s.odd = (unsigned long)src_len;
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rre,%[opc] << 16,0,%[src]\n"
" brc 1,0b\n" /* handle partial completion */
- : [src] "+a" (r2), [len] "+d" (r3)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_KLMD)
- : "cc", "memory");
+ : [src] "+&d" (s.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_KLMD)
+ : "cc", "memory", "0", "1");
}
/**
static inline int cpacf_kmac(unsigned long func, void *param,
const u8 *src, long src_len)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) src;
- register unsigned long r3 asm("3") = (unsigned long) src_len;
+ union register_pair s;
+ s.even = (unsigned long)src;
+ s.odd = (unsigned long)src_len;
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rre,%[opc] << 16,0,%[src]\n"
" brc 1,0b\n" /* handle partial completion */
- : [src] "+a" (r2), [len] "+d" (r3)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_KMAC)
- : "cc", "memory");
+ : [src] "+&d" (s.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_KMAC)
+ : "cc", "memory", "0", "1");
- return src_len - r3;
+ return src_len - s.odd;
}
/**
static inline int cpacf_kmctr(unsigned long func, void *param, u8 *dest,
const u8 *src, long src_len, u8 *counter)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) src;
- register unsigned long r3 asm("3") = (unsigned long) src_len;
- register unsigned long r4 asm("4") = (unsigned long) dest;
- register unsigned long r6 asm("6") = (unsigned long) counter;
+ union register_pair d, s, c;
+ d.even = (unsigned long)dest;
+ s.even = (unsigned long)src;
+ s.odd = (unsigned long)src_len;
+ c.even = (unsigned long)counter;
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rrf,%[opc] << 16,%[dst],%[src],%[ctr],0\n"
" brc 1,0b\n" /* handle partial completion */
- : [src] "+a" (r2), [len] "+d" (r3),
- [dst] "+a" (r4), [ctr] "+a" (r6)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_KMCTR)
- : "cc", "memory");
+ : [src] "+&d" (s.pair), [dst] "+&d" (d.pair),
+ [ctr] "+&d" (c.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_KMCTR)
+ : "cc", "memory", "0", "1");
- return src_len - r3;
+ return src_len - s.odd;
}
/**
u8 *dest, unsigned long dest_len,
const u8 *seed, unsigned long seed_len)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) dest;
- register unsigned long r3 asm("3") = (unsigned long) dest_len;
- register unsigned long r4 asm("4") = (unsigned long) seed;
- register unsigned long r5 asm("5") = (unsigned long) seed_len;
+ union register_pair d, s;
+ d.even = (unsigned long)dest;
+ d.odd = (unsigned long)dest_len;
+ s.even = (unsigned long)seed;
+ s.odd = (unsigned long)seed_len;
asm volatile (
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rre,%[opc] << 16,%[dst],%[seed]\n"
" brc 1,0b\n" /* handle partial completion */
- : [dst] "+a" (r2), [dlen] "+d" (r3)
- : [fc] "d" (r0), [pba] "a" (r1),
- [seed] "a" (r4), [slen] "d" (r5), [opc] "i" (CPACF_PRNO)
- : "cc", "memory");
+ : [dst] "+&d" (d.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [seed] "d" (s.pair), [opc] "i" (CPACF_PRNO)
+ : "cc", "memory", "0", "1");
}
/**
static inline void cpacf_trng(u8 *ucbuf, unsigned long ucbuf_len,
u8 *cbuf, unsigned long cbuf_len)
{
- register unsigned long r0 asm("0") = (unsigned long) CPACF_PRNO_TRNG;
- register unsigned long r2 asm("2") = (unsigned long) ucbuf;
- register unsigned long r3 asm("3") = (unsigned long) ucbuf_len;
- register unsigned long r4 asm("4") = (unsigned long) cbuf;
- register unsigned long r5 asm("5") = (unsigned long) cbuf_len;
+ union register_pair u, c;
+ u.even = (unsigned long)ucbuf;
+ u.odd = (unsigned long)ucbuf_len;
+ c.even = (unsigned long)cbuf;
+ c.odd = (unsigned long)cbuf_len;
asm volatile (
+ " lghi 0,%[fc]\n"
"0: .insn rre,%[opc] << 16,%[ucbuf],%[cbuf]\n"
" brc 1,0b\n" /* handle partial completion */
- : [ucbuf] "+a" (r2), [ucbuflen] "+d" (r3),
- [cbuf] "+a" (r4), [cbuflen] "+d" (r5)
- : [fc] "d" (r0), [opc] "i" (CPACF_PRNO)
- : "cc", "memory");
+ : [ucbuf] "+&d" (u.pair), [cbuf] "+&d" (c.pair)
+ : [fc] "K" (CPACF_PRNO_TRNG), [opc] "i" (CPACF_PRNO)
+ : "cc", "memory", "0");
}
/**
*/
static inline void cpacf_pcc(unsigned long func, void *param)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
-
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rre,%[opc] << 16,0,0\n" /* PCC opcode */
" brc 1,0b\n" /* handle partial completion */
:
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_PCC)
- : "cc", "memory");
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_PCC)
+ : "cc", "memory", "0", "1");
}
/**
*/
static inline void cpacf_pckmo(long func, void *param)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
-
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
" .insn rre,%[opc] << 16,0,0\n" /* PCKMO opcode */
:
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_PCKMO)
- : "cc", "memory");
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_PCKMO)
+ : "cc", "memory", "0", "1");
}
/**
const u8 *src, unsigned long src_len,
const u8 *aad, unsigned long aad_len)
{
- register unsigned long r0 asm("0") = (unsigned long) func;
- register unsigned long r1 asm("1") = (unsigned long) param;
- register unsigned long r2 asm("2") = (unsigned long) src;
- register unsigned long r3 asm("3") = (unsigned long) src_len;
- register unsigned long r4 asm("4") = (unsigned long) aad;
- register unsigned long r5 asm("5") = (unsigned long) aad_len;
- register unsigned long r6 asm("6") = (unsigned long) dest;
+ union register_pair d, s, a;
+ d.even = (unsigned long)dest;
+ s.even = (unsigned long)src;
+ s.odd = (unsigned long)src_len;
+ a.even = (unsigned long)aad;
+ a.odd = (unsigned long)aad_len;
asm volatile(
+ " lgr 0,%[fc]\n"
+ " lgr 1,%[pba]\n"
"0: .insn rrf,%[opc] << 16,%[dst],%[src],%[aad],0\n"
" brc 1,0b\n" /* handle partial completion */
- : [dst] "+a" (r6), [src] "+a" (r2), [slen] "+d" (r3),
- [aad] "+a" (r4), [alen] "+d" (r5)
- : [fc] "d" (r0), [pba] "a" (r1), [opc] "i" (CPACF_KMA)
- : "cc", "memory");
+ : [dst] "+&d" (d.pair), [src] "+&d" (s.pair),
+ [aad] "+&d" (a.pair)
+ : [fc] "d" (func), [pba] "d" ((unsigned long)param),
+ [opc] "i" (CPACF_KMA)
+ : "cc", "memory", "0", "1");
}
#endif /* _ASM_S390_CPACF_H */
#define MAX_ELF_HWCAP_FEATURES (8 * sizeof(elf_hwcap))
#define MAX_CPU_FEATURES MAX_ELF_HWCAP_FEATURES
-#define cpu_feature(feat) ilog2(HWCAP_S390_ ## feat)
+#define cpu_feature(feat) ilog2(HWCAP_ ## feat)
int cpu_have_feature(unsigned int nr);
};
};
+union ctlreg5 {
+ unsigned long val;
+ struct {
+ unsigned long : 33;
+ unsigned long pasteo: 25;
+ unsigned long : 6;
+ };
+};
+
+union ctlreg15 {
+ unsigned long val;
+ struct {
+ unsigned long lsea : 61;
+ unsigned long : 3;
+ };
+};
+
#define ctl_set_bit(cr, bit) smp_ctl_set_bit(cr, bit)
#define ctl_clear_bit(cr, bit) smp_ctl_clear_bit(cr, bit)
#include <linux/time.h>
#include <linux/refcount.h>
#include <linux/fs.h>
+#include <linux/init.h>
#define DEBUG_MAX_LEVEL 6 /* debug levels range from 0 to 6 */
#define DEBUG_OFF_LEVEL -1 /* level where debug is switched off */
int debug_unregister_view(debug_info_t *id, struct debug_view *view);
+#ifndef MODULE
+
+/*
+ * Note: Initial page and area numbers must be fixed to allow static
+ * initialization. This enables very early tracing. Changes to these values
+ * must be reflected in __DEFINE_STATIC_AREA.
+ */
+#define EARLY_PAGES 8
+#define EARLY_AREAS 1
+
+#define VNAME(var, suffix) __##var##_##suffix
+
/*
- define the debug levels:
- - 0 No debugging output to console or syslog
- - 1 Log internal errors to syslog, ignore check conditions
- - 2 Log internal errors and check conditions to syslog
- - 3 Log internal errors to console, log check conditions to syslog
- - 4 Log internal errors and check conditions to console
- - 5 panic on internal errors, log check conditions to console
- - 6 panic on both, internal errors and check conditions
+ * Define static areas for early trace data. During boot debug_register_static()
+ * will replace these with dynamically allocated areas to allow custom page and
+ * area sizes, and dynamic resizing.
*/
+#define __DEFINE_STATIC_AREA(var) \
+static char VNAME(var, data)[EARLY_PAGES][PAGE_SIZE] __initdata; \
+static debug_entry_t *VNAME(var, pages)[EARLY_PAGES] __initdata = { \
+ (debug_entry_t *)VNAME(var, data)[0], \
+ (debug_entry_t *)VNAME(var, data)[1], \
+ (debug_entry_t *)VNAME(var, data)[2], \
+ (debug_entry_t *)VNAME(var, data)[3], \
+ (debug_entry_t *)VNAME(var, data)[4], \
+ (debug_entry_t *)VNAME(var, data)[5], \
+ (debug_entry_t *)VNAME(var, data)[6], \
+ (debug_entry_t *)VNAME(var, data)[7], \
+}; \
+static debug_entry_t **VNAME(var, areas)[EARLY_AREAS] __initdata = { \
+ (debug_entry_t **)VNAME(var, pages), \
+}; \
+static int VNAME(var, active_pages)[EARLY_AREAS] __initdata; \
+static int VNAME(var, active_entries)[EARLY_AREAS] __initdata
+
+#define __DEBUG_INFO_INIT(var, _name, _buf_size) { \
+ .next = NULL, \
+ .prev = NULL, \
+ .ref_count = REFCOUNT_INIT(1), \
+ .lock = __SPIN_LOCK_UNLOCKED(var.lock), \
+ .level = DEBUG_DEFAULT_LEVEL, \
+ .nr_areas = EARLY_AREAS, \
+ .pages_per_area = EARLY_PAGES, \
+ .buf_size = (_buf_size), \
+ .entry_size = sizeof(debug_entry_t) + (_buf_size), \
+ .areas = VNAME(var, areas), \
+ .active_area = 0, \
+ .active_pages = VNAME(var, active_pages), \
+ .active_entries = VNAME(var, active_entries), \
+ .debugfs_root_entry = NULL, \
+ .debugfs_entries = { NULL }, \
+ .views = { NULL }, \
+ .name = (_name), \
+ .mode = 0600, \
+}
+
+#define __REGISTER_STATIC_DEBUG_INFO(var, name, pages, areas, view) \
+static int __init VNAME(var, reg)(void) \
+{ \
+ debug_register_static(&var, (pages), (areas)); \
+ debug_register_view(&var, (view)); \
+ return 0; \
+} \
+arch_initcall(VNAME(var, reg))
+
+/**
+ * DEFINE_STATIC_DEBUG_INFO - Define static debug_info_t
+ *
+ * @var: Name of debug_info_t variable
+ * @name: Name of debug log (e.g. used for debugfs entry)
+ * @pages_per_area: Number of pages per area
+ * @nr_areas: Number of debug areas
+ * @buf_size: Size of data area in each debug entry
+ * @view: Pointer to debug view struct
+ *
+ * Define a static debug_info_t for early tracing. The associated debugfs log
+ * is automatically registered with the specified debug view.
+ *
+ * Important: Users of this macro must not call any of the
+ * debug_register/_unregister() functions for this debug_info_t!
+ *
+ * Note: Tracing will start with a fixed number of initial pages and areas.
+ * The debug area will be changed to use the specified numbers during
+ * arch_initcall.
+ */
+#define DEFINE_STATIC_DEBUG_INFO(var, name, pages, nr_areas, buf_size, view) \
+__DEFINE_STATIC_AREA(var); \
+static debug_info_t __refdata var = \
+ __DEBUG_INFO_INIT(var, (name), (buf_size)); \
+__REGISTER_STATIC_DEBUG_INFO(var, name, pages, nr_areas, view)
+
+void debug_register_static(debug_info_t *id, int pages_per_area, int nr_areas);
-#ifndef DEBUG_LEVEL
-#define DEBUG_LEVEL 4
-#endif
-
-#define INTERNAL_ERRMSG(x,y...) "E" __FILE__ "%d: " x, __LINE__, y
-#define INTERNAL_WRNMSG(x,y...) "W" __FILE__ "%d: " x, __LINE__, y
-#define INTERNAL_INFMSG(x,y...) "I" __FILE__ "%d: " x, __LINE__, y
-#define INTERNAL_DEBMSG(x,y...) "D" __FILE__ "%d: " x, __LINE__, y
-
-#if DEBUG_LEVEL > 0
-#define PRINT_DEBUG(x...) printk(KERN_DEBUG PRINTK_HEADER x)
-#define PRINT_INFO(x...) printk(KERN_INFO PRINTK_HEADER x)
-#define PRINT_WARN(x...) printk(KERN_WARNING PRINTK_HEADER x)
-#define PRINT_ERR(x...) printk(KERN_ERR PRINTK_HEADER x)
-#define PRINT_FATAL(x...) panic(PRINTK_HEADER x)
-#else
-#define PRINT_DEBUG(x...) printk(KERN_DEBUG PRINTK_HEADER x)
-#define PRINT_INFO(x...) printk(KERN_DEBUG PRINTK_HEADER x)
-#define PRINT_WARN(x...) printk(KERN_DEBUG PRINTK_HEADER x)
-#define PRINT_ERR(x...) printk(KERN_DEBUG PRINTK_HEADER x)
-#define PRINT_FATAL(x...) printk(KERN_DEBUG PRINTK_HEADER x)
-#endif /* DASD_DEBUG */
+#endif /* MODULE */
#endif /* DEBUG_H */
struct hypfs_diag0c_entry;
+/*
+ * This structure must contain only pointers/references into
+ * the AMODE31 text section.
+ */
struct diag_ops {
int (*diag210)(struct diag210 *addr);
int (*diag26c)(void *req, void *resp, enum diag26c_sc subcode);
void (*diag308_reset)(void);
};
-extern struct diag_ops diag_dma_ops;
-extern struct diag210 *__diag210_tmp_dma;
+extern struct diag_ops diag_amode31_ops;
+extern struct diag210 *__diag210_tmp_amode31;
+
+int _diag210_amode31(struct diag210 *addr);
+int _diag26c_amode31(void *req, void *resp, enum diag26c_sc subcode);
+int _diag14_amode31(unsigned long rx, unsigned long ry1, unsigned long subcode);
+void _diag0c_amode31(struct hypfs_diag0c_entry *entry);
+void _diag308_reset_amode31(void);
+
#endif /* _ASM_S390_DIAG_H */
/* Keep this the last entry. */
#define R_390_NUM 61
-/* Bits present in AT_HWCAP. */
-#define HWCAP_S390_ESAN3 1
-#define HWCAP_S390_ZARCH 2
-#define HWCAP_S390_STFLE 4
-#define HWCAP_S390_MSA 8
-#define HWCAP_S390_LDISP 16
-#define HWCAP_S390_EIMM 32
-#define HWCAP_S390_DFP 64
-#define HWCAP_S390_HPAGE 128
-#define HWCAP_S390_ETF3EH 256
-#define HWCAP_S390_HIGH_GPRS 512
-#define HWCAP_S390_TE 1024
-#define HWCAP_S390_VXRS 2048
-#define HWCAP_S390_VXRS_BCD 4096
-#define HWCAP_S390_VXRS_EXT 8192
-#define HWCAP_S390_GS 16384
-#define HWCAP_S390_VXRS_EXT2 32768
-#define HWCAP_S390_VXRS_PDE 65536
-#define HWCAP_S390_SORT 131072
-#define HWCAP_S390_DFLT 262144
+enum {
+ HWCAP_NR_ESAN3 = 0,
+ HWCAP_NR_ZARCH = 1,
+ HWCAP_NR_STFLE = 2,
+ HWCAP_NR_MSA = 3,
+ HWCAP_NR_LDISP = 4,
+ HWCAP_NR_EIMM = 5,
+ HWCAP_NR_DFP = 6,
+ HWCAP_NR_HPAGE = 7,
+ HWCAP_NR_ETF3EH = 8,
+ HWCAP_NR_HIGH_GPRS = 9,
+ HWCAP_NR_TE = 10,
+ HWCAP_NR_VXRS = 11,
+ HWCAP_NR_VXRS_BCD = 12,
+ HWCAP_NR_VXRS_EXT = 13,
+ HWCAP_NR_GS = 14,
+ HWCAP_NR_VXRS_EXT2 = 15,
+ HWCAP_NR_VXRS_PDE = 16,
+ HWCAP_NR_SORT = 17,
+ HWCAP_NR_DFLT = 18,
+ HWCAP_NR_VXRS_PDE2 = 19,
+ HWCAP_NR_NNPA = 20,
+ HWCAP_NR_PCI_MIO = 21,
+ HWCAP_NR_SIE = 22,
+ HWCAP_NR_MAX
+};
-/* Internal bits, not exposed via elf */
-#define HWCAP_INT_SIE 1UL
+/* Bits present in AT_HWCAP. */
+#define HWCAP_ESAN3 BIT(HWCAP_NR_ESAN3)
+#define HWCAP_ZARCH BIT(HWCAP_NR_ZARCH)
+#define HWCAP_STFLE BIT(HWCAP_NR_STFLE)
+#define HWCAP_MSA BIT(HWCAP_NR_MSA)
+#define HWCAP_LDISP BIT(HWCAP_NR_LDISP)
+#define HWCAP_EIMM BIT(HWCAP_NR_EIMM)
+#define HWCAP_DFP BIT(HWCAP_NR_DFP)
+#define HWCAP_HPAGE BIT(HWCAP_NR_HPAGE)
+#define HWCAP_ETF3EH BIT(HWCAP_NR_ETF3EH)
+#define HWCAP_HIGH_GPRS BIT(HWCAP_NR_HIGH_GPRS)
+#define HWCAP_TE BIT(HWCAP_NR_TE)
+#define HWCAP_VXRS BIT(HWCAP_NR_VXRS)
+#define HWCAP_VXRS_BCD BIT(HWCAP_NR_VXRS_BCD)
+#define HWCAP_VXRS_EXT BIT(HWCAP_NR_VXRS_EXT)
+#define HWCAP_GS BIT(HWCAP_NR_GS)
+#define HWCAP_VXRS_EXT2 BIT(HWCAP_NR_VXRS_EXT2)
+#define HWCAP_VXRS_PDE BIT(HWCAP_NR_VXRS_PDE)
+#define HWCAP_SORT BIT(HWCAP_NR_SORT)
+#define HWCAP_DFLT BIT(HWCAP_NR_DFLT)
+#define HWCAP_VXRS_PDE2 BIT(HWCAP_NR_VXRS_PDE2)
+#define HWCAP_NNPA BIT(HWCAP_NR_NNPA)
+#define HWCAP_PCI_MIO BIT(HWCAP_NR_PCI_MIO)
+#define HWCAP_SIE BIT(HWCAP_NR_SIE)
/*
* These are used to set parameters in the core dumps.
extern unsigned long elf_hwcap;
#define ELF_HWCAP (elf_hwcap)
-/* Internal hardware capabilities, not exposed via elf */
-
-extern unsigned long int_hwcap;
-
/* This yields a string that ld.so will use to load implementation
specific libraries for optimization. This is more specific in
intent than poking at uname or /proc/cpuinfo.
long handler;
};
-extern struct exception_table_entry *__start_dma_ex_table;
-extern struct exception_table_entry *__stop_dma_ex_table;
+extern struct exception_table_entry *__start_amode31_ex_table;
+extern struct exception_table_entry *__stop_amode31_ex_table;
const struct exception_table_entry *s390_search_extables(unsigned long addr);
void ftrace_caller(void);
extern char ftrace_graph_caller_end;
-extern unsigned long ftrace_plt;
extern void *ftrace_func;
struct dyn_arch_ftrace { };
struct module;
struct dyn_ftrace;
-/*
- * Either -mhotpatch or -mnop-mcount is used - no explicit init is required
- */
-static inline int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) { return 0; }
+
+bool ftrace_need_init_nop(void);
+#define ftrace_need_init_nop ftrace_need_init_nop
+
+int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
#define ftrace_init_nop ftrace_init_nop
static inline unsigned long ftrace_call_adjust(unsigned long addr)
return addr;
}
-struct ftrace_insn {
- u16 opc;
- s32 disp;
-} __packed;
-
-static inline void ftrace_generate_nop_insn(struct ftrace_insn *insn)
-{
-#ifdef CONFIG_FUNCTION_TRACER
- /* brcl 0,0 */
- insn->opc = 0xc004;
- insn->disp = 0;
-#endif
-}
-
-static inline int is_ftrace_nop(struct ftrace_insn *insn)
-{
-#ifdef CONFIG_FUNCTION_TRACER
- if (insn->disp == 0)
- return 1;
-#endif
- return 0;
-}
-
-static inline void ftrace_generate_call_insn(struct ftrace_insn *insn,
- unsigned long ip)
-{
-#ifdef CONFIG_FUNCTION_TRACER
- unsigned long target;
-
- /* brasl r0,ftrace_caller */
- target = is_module_addr((void *) ip) ? ftrace_plt : FTRACE_ADDR;
- insn->opc = 0xc005;
- insn->disp = (target - ip) / 2;
-#endif
-}
-
/*
* Even though the system call numbers are identical for s390/s390x a
* different system call table is used for compat tasks. This may lead
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef DIV_ROUND_UP
+#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+#endif
+
+#define SIZEOF_MCOUNT_LOC_ENTRY 8
+#define SIZEOF_FTRACE_HOTPATCH_TRAMPOLINE 24
+#define FTRACE_HOTPATCH_TRAMPOLINES_SIZE(n) \
+ DIV_ROUND_UP(SIZEOF_FTRACE_HOTPATCH_TRAMPOLINE * (n), \
+ SIZEOF_MCOUNT_LOC_ENTRY)
+
+#ifdef CONFIG_FUNCTION_TRACER
+#define FTRACE_HOTPATCH_TRAMPOLINES_TEXT \
+ . = ALIGN(8); \
+ __ftrace_hotpatch_trampolines_start = .; \
+ . = . + FTRACE_HOTPATCH_TRAMPOLINES_SIZE(__stop_mcount_loc - \
+ __start_mcount_loc); \
+ __ftrace_hotpatch_trampolines_end = .;
+#else
+#define FTRACE_HOTPATCH_TRAMPOLINES_TEXT
+#endif
#include <asm/types.h>
#include <asm/cio.h>
#include <asm/setup.h>
+#include <asm/page.h>
#include <uapi/asm/ipl.h>
struct ipl_parameter_block {
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_S390_KFENCE_H
+#define _ASM_S390_KFENCE_H
+
+#include <linux/mm.h>
+#include <linux/kfence.h>
+#include <asm/set_memory.h>
+#include <asm/page.h>
+
+void __kernel_map_pages(struct page *page, int numpages, int enable);
+
+static __always_inline bool arch_kfence_init_pool(void)
+{
+ return true;
+}
+
+#define arch_kfence_test_address(addr) ((addr) & PAGE_MASK)
+
+/*
+ * Do not split kfence pool to 4k mapping with arch_kfence_init_pool(),
+ * but earlier where page table allocations still happen with memblock.
+ * Reason is that arch_kfence_init_pool() gets called when the system
+ * is still in a limbo state - disabling and enabling bottom halves is
+ * not yet allowed, but that is what our page_table_alloc() would do.
+ */
+static __always_inline void kfence_split_mapping(void)
+{
+#ifdef CONFIG_KFENCE
+ unsigned long pool_pages = KFENCE_POOL_SIZE >> PAGE_SHIFT;
+
+ set_memory_4k((unsigned long)__kfence_pool, pool_pages);
+#endif
+}
+
+static inline bool kfence_protect_page(unsigned long addr, bool protect)
+{
+ __kernel_map_pages(virt_to_page(addr), 1, !protect);
+ return true;
+}
+
+#endif /* _ASM_S390_KFENCE_H */
#include <uapi/asm/kvm_para.h>
#include <asm/diag.h>
-static inline long __kvm_hypercall0(unsigned long nr)
-{
- register unsigned long __nr asm("1") = nr;
- register long __rc asm("2");
-
- asm volatile ("diag 2,4,0x500\n"
- : "=d" (__rc) : "d" (__nr): "memory", "cc");
- return __rc;
-}
-
-static inline long kvm_hypercall0(unsigned long nr)
-{
- diag_stat_inc(DIAG_STAT_X500);
- return __kvm_hypercall0(nr);
-}
-
-static inline long __kvm_hypercall1(unsigned long nr, unsigned long p1)
-{
- register unsigned long __nr asm("1") = nr;
- register unsigned long __p1 asm("2") = p1;
- register long __rc asm("2");
-
- asm volatile ("diag 2,4,0x500\n"
- : "=d" (__rc) : "d" (__nr), "0" (__p1) : "memory", "cc");
- return __rc;
-}
-
-static inline long kvm_hypercall1(unsigned long nr, unsigned long p1)
-{
- diag_stat_inc(DIAG_STAT_X500);
- return __kvm_hypercall1(nr, p1);
-}
-
-static inline long __kvm_hypercall2(unsigned long nr, unsigned long p1,
- unsigned long p2)
-{
- register unsigned long __nr asm("1") = nr;
- register unsigned long __p1 asm("2") = p1;
- register unsigned long __p2 asm("3") = p2;
- register long __rc asm("2");
-
- asm volatile ("diag 2,4,0x500\n"
- : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2)
- : "memory", "cc");
- return __rc;
-}
-
-static inline long kvm_hypercall2(unsigned long nr, unsigned long p1,
- unsigned long p2)
-{
- diag_stat_inc(DIAG_STAT_X500);
- return __kvm_hypercall2(nr, p1, p2);
-}
-
-static inline long __kvm_hypercall3(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3)
-{
- register unsigned long __nr asm("1") = nr;
- register unsigned long __p1 asm("2") = p1;
- register unsigned long __p2 asm("3") = p2;
- register unsigned long __p3 asm("4") = p3;
- register long __rc asm("2");
-
- asm volatile ("diag 2,4,0x500\n"
- : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2),
- "d" (__p3) : "memory", "cc");
- return __rc;
-}
-
-static inline long kvm_hypercall3(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3)
-{
- diag_stat_inc(DIAG_STAT_X500);
- return __kvm_hypercall3(nr, p1, p2, p3);
-}
-
-static inline long __kvm_hypercall4(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3,
- unsigned long p4)
-{
- register unsigned long __nr asm("1") = nr;
- register unsigned long __p1 asm("2") = p1;
- register unsigned long __p2 asm("3") = p2;
- register unsigned long __p3 asm("4") = p3;
- register unsigned long __p4 asm("5") = p4;
- register long __rc asm("2");
-
- asm volatile ("diag 2,4,0x500\n"
- : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2),
- "d" (__p3), "d" (__p4) : "memory", "cc");
- return __rc;
-}
-
-static inline long kvm_hypercall4(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3,
- unsigned long p4)
-{
- diag_stat_inc(DIAG_STAT_X500);
- return __kvm_hypercall4(nr, p1, p2, p3, p4);
-}
-
-static inline long __kvm_hypercall5(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3,
- unsigned long p4, unsigned long p5)
-{
- register unsigned long __nr asm("1") = nr;
- register unsigned long __p1 asm("2") = p1;
- register unsigned long __p2 asm("3") = p2;
- register unsigned long __p3 asm("4") = p3;
- register unsigned long __p4 asm("5") = p4;
- register unsigned long __p5 asm("6") = p5;
- register long __rc asm("2");
-
- asm volatile ("diag 2,4,0x500\n"
- : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2),
- "d" (__p3), "d" (__p4), "d" (__p5) : "memory", "cc");
- return __rc;
-}
-
-static inline long kvm_hypercall5(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3,
- unsigned long p4, unsigned long p5)
-{
- diag_stat_inc(DIAG_STAT_X500);
- return __kvm_hypercall5(nr, p1, p2, p3, p4, p5);
-}
-
-static inline long __kvm_hypercall6(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3,
- unsigned long p4, unsigned long p5,
- unsigned long p6)
-{
- register unsigned long __nr asm("1") = nr;
- register unsigned long __p1 asm("2") = p1;
- register unsigned long __p2 asm("3") = p2;
- register unsigned long __p3 asm("4") = p3;
- register unsigned long __p4 asm("5") = p4;
- register unsigned long __p5 asm("6") = p5;
- register unsigned long __p6 asm("7") = p6;
- register long __rc asm("2");
-
- asm volatile ("diag 2,4,0x500\n"
- : "=d" (__rc) : "d" (__nr), "0" (__p1), "d" (__p2),
- "d" (__p3), "d" (__p4), "d" (__p5), "d" (__p6)
- : "memory", "cc");
- return __rc;
-}
-
-static inline long kvm_hypercall6(unsigned long nr, unsigned long p1,
- unsigned long p2, unsigned long p3,
- unsigned long p4, unsigned long p5,
- unsigned long p6)
-{
- diag_stat_inc(DIAG_STAT_X500);
- return __kvm_hypercall6(nr, p1, p2, p3, p4, p5, p6);
-}
+#define HYPERCALL_FMT_0
+#define HYPERCALL_FMT_1 , "0" (r2)
+#define HYPERCALL_FMT_2 , "d" (r3) HYPERCALL_FMT_1
+#define HYPERCALL_FMT_3 , "d" (r4) HYPERCALL_FMT_2
+#define HYPERCALL_FMT_4 , "d" (r5) HYPERCALL_FMT_3
+#define HYPERCALL_FMT_5 , "d" (r6) HYPERCALL_FMT_4
+#define HYPERCALL_FMT_6 , "d" (r7) HYPERCALL_FMT_5
+
+#define HYPERCALL_PARM_0
+#define HYPERCALL_PARM_1 , unsigned long arg1
+#define HYPERCALL_PARM_2 HYPERCALL_PARM_1, unsigned long arg2
+#define HYPERCALL_PARM_3 HYPERCALL_PARM_2, unsigned long arg3
+#define HYPERCALL_PARM_4 HYPERCALL_PARM_3, unsigned long arg4
+#define HYPERCALL_PARM_5 HYPERCALL_PARM_4, unsigned long arg5
+#define HYPERCALL_PARM_6 HYPERCALL_PARM_5, unsigned long arg6
+
+#define HYPERCALL_REGS_0
+#define HYPERCALL_REGS_1 \
+ register unsigned long r2 asm("2") = arg1
+#define HYPERCALL_REGS_2 \
+ HYPERCALL_REGS_1; \
+ register unsigned long r3 asm("3") = arg2
+#define HYPERCALL_REGS_3 \
+ HYPERCALL_REGS_2; \
+ register unsigned long r4 asm("4") = arg3
+#define HYPERCALL_REGS_4 \
+ HYPERCALL_REGS_3; \
+ register unsigned long r5 asm("5") = arg4
+#define HYPERCALL_REGS_5 \
+ HYPERCALL_REGS_4; \
+ register unsigned long r6 asm("6") = arg5
+#define HYPERCALL_REGS_6 \
+ HYPERCALL_REGS_5; \
+ register unsigned long r7 asm("7") = arg6
+
+#define HYPERCALL_ARGS_0
+#define HYPERCALL_ARGS_1 , arg1
+#define HYPERCALL_ARGS_2 HYPERCALL_ARGS_1, arg2
+#define HYPERCALL_ARGS_3 HYPERCALL_ARGS_2, arg3
+#define HYPERCALL_ARGS_4 HYPERCALL_ARGS_3, arg4
+#define HYPERCALL_ARGS_5 HYPERCALL_ARGS_4, arg5
+#define HYPERCALL_ARGS_6 HYPERCALL_ARGS_5, arg6
+
+#define GENERATE_KVM_HYPERCALL_FUNC(args) \
+static inline \
+long __kvm_hypercall##args(unsigned long nr HYPERCALL_PARM_##args) \
+{ \
+ register unsigned long __nr asm("1") = nr; \
+ register long __rc asm("2"); \
+ HYPERCALL_REGS_##args; \
+ \
+ asm volatile ( \
+ " diag 2,4,0x500\n" \
+ : "=d" (__rc) \
+ : "d" (__nr) HYPERCALL_FMT_##args \
+ : "memory", "cc"); \
+ return __rc; \
+} \
+ \
+static inline \
+long kvm_hypercall##args(unsigned long nr HYPERCALL_PARM_##args) \
+{ \
+ diag_stat_inc(DIAG_STAT_X500); \
+ return __kvm_hypercall##args(nr HYPERCALL_ARGS_##args); \
+}
+
+GENERATE_KVM_HYPERCALL_FUNC(0)
+GENERATE_KVM_HYPERCALL_FUNC(1)
+GENERATE_KVM_HYPERCALL_FUNC(2)
+GENERATE_KVM_HYPERCALL_FUNC(3)
+GENERATE_KVM_HYPERCALL_FUNC(4)
+GENERATE_KVM_HYPERCALL_FUNC(5)
+GENERATE_KVM_HYPERCALL_FUNC(6)
/* kvm on s390 is always paravirtualization enabled */
static inline int kvm_para_available(void)
#define EX_TABLE(_fault, _target) \
__EX_TABLE(__ex_table, _fault, _target)
-#define EX_TABLE_DMA(_fault, _target) \
- __EX_TABLE(.dma.ex_table, _fault, _target)
+#define EX_TABLE_AMODE31(_fault, _target) \
+ __EX_TABLE(.amode31.ex_table, _fault, _target)
#endif
/* Restart function and parameter. */
__u64 restart_fn; /* 0x0370 */
__u64 restart_data; /* 0x0378 */
- __u64 restart_source; /* 0x0380 */
+ __u32 restart_source; /* 0x0380 */
+ __u32 restart_flags; /* 0x0384 */
/* Address space pointer. */
__u64 kernel_asce; /* 0x0388 */
* This file contains the s390 architecture specific module code.
*/
-struct mod_arch_syminfo
-{
+struct mod_arch_syminfo {
unsigned long got_offset;
unsigned long plt_offset;
int got_initialized;
int plt_initialized;
};
-struct mod_arch_specific
-{
+struct mod_arch_specific {
/* Starting offset of got in the module core memory. */
unsigned long got_offset;
/* Starting offset of plt in the module core memory. */
int nsyms;
/* Additional symbol information (got and plt offsets). */
struct mod_arch_syminfo *syminfo;
+#ifdef CONFIG_FUNCTION_TRACER
+ /* Start of memory reserved for ftrace hotpatch trampolines. */
+ struct ftrace_hotpatch_trampoline *trampolines_start;
+ /* End of memory reserved for ftrace hotpatch trampolines. */
+ struct ftrace_hotpatch_trampoline *trampolines_end;
+ /* Next unused ftrace hotpatch trampoline slot. */
+ struct ftrace_hotpatch_trampoline *next_trampoline;
+#endif /* CONFIG_FUNCTION_TRACER */
};
#endif /* _ASM_S390_MODULE_H */
void arch_free_page(struct page *page, int order);
void arch_alloc_page(struct page *page, int order);
void arch_set_page_dat(struct page *page, int order);
-void arch_set_page_nodat(struct page *page, int order);
-int arch_test_page_nodat(struct page *page);
-void arch_set_page_states(int make_stable);
static inline int devmem_is_allowed(unsigned long pfn)
{
int clp_setup_writeback_mio(void);
int clp_scan_pci_devices(void);
int clp_query_pci_fn(struct zpci_dev *zdev);
-int clp_enable_fh(struct zpci_dev *, u8);
-int clp_disable_fh(struct zpci_dev *);
+int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as);
+int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
int clp_get_state(u32 fid, enum zpci_state *state);
+int clp_refresh_fh(u32 fid, u32 *fh);
/* UID */
void update_uid_checking(bool new);
/* DMA */
int zpci_dma_init(void);
void zpci_dma_exit(void);
+int zpci_dma_init_device(struct zpci_dev *zdev);
+int zpci_dma_exit_device(struct zpci_dev *zdev);
/* IRQ */
int __init zpci_irq_init(void);
}
/* Prototypes */
-int zpci_dma_init_device(struct zpci_dev *);
-void zpci_dma_exit_device(struct zpci_dev *);
void dma_free_seg_table(unsigned long);
unsigned long *dma_alloc_cpu_table(void);
void dma_cleanup_tables(unsigned long *);
/* TODO: s390 cannot support io_remap_pfn_range... */
#define pte_ERROR(e) \
- printk("%s:%d: bad pte %p.\n", __FILE__, __LINE__, (void *) pte_val(e))
+ pr_err("%s:%d: bad pte %016lx.\n", __FILE__, __LINE__, pte_val(e))
#define pmd_ERROR(e) \
- printk("%s:%d: bad pmd %p.\n", __FILE__, __LINE__, (void *) pmd_val(e))
+ pr_err("%s:%d: bad pmd %016lx.\n", __FILE__, __LINE__, pmd_val(e))
#define pud_ERROR(e) \
- printk("%s:%d: bad pud %p.\n", __FILE__, __LINE__, (void *) pud_val(e))
+ pr_err("%s:%d: bad pud %016lx.\n", __FILE__, __LINE__, pud_val(e))
#define p4d_ERROR(e) \
- printk("%s:%d: bad p4d %p.\n", __FILE__, __LINE__, (void *) p4d_val(e))
+ pr_err("%s:%d: bad p4d %016lx.\n", __FILE__, __LINE__, p4d_val(e))
#define pgd_ERROR(e) \
- printk("%s:%d: bad pgd %p.\n", __FILE__, __LINE__, (void *) pgd_val(e))
+ pr_err("%s:%d: bad pgd %016lx.\n", __FILE__, __LINE__, pgd_val(e))
/*
* The vmalloc and module area will always be on the topmost area of the
#define _CIF_MCCK_GUEST BIT(CIF_MCCK_GUEST)
#define _CIF_DEDICATED_CPU BIT(CIF_DEDICATED_CPU)
+#define RESTART_FLAG_CTLREGS _AC(1 << 0, U)
+
#ifndef __ASSEMBLY__
#include <linux/cpumask.h>
typedef void qdio_handler_t(struct ccw_device *, unsigned int, int,
int, int, unsigned long);
-/* qdio errors reported to the upper-layer program */
+/* qdio errors reported through the queue handlers: */
#define QDIO_ERROR_ACTIVATE 0x0001
#define QDIO_ERROR_GET_BUF_STATE 0x0002
#define QDIO_ERROR_SET_BUF_STATE 0x0004
+
+/* extra info for completed SBALs: */
#define QDIO_ERROR_SLSB_STATE 0x0100
#define QDIO_ERROR_SLSB_PENDING 0x0200
-#define QDIO_ERROR_FATAL 0x00ff
-#define QDIO_ERROR_TEMPORARY 0xff00
-
/* for qdio_cleanup */
#define QDIO_FLAG_CLEANUP_USING_CLEAR 0x01
#define QDIO_FLAG_CLEANUP_USING_HALT 0x02
* @qib_param_field_format: format for qib_parm_field
* @qib_param_field: pointer to 128 bytes or NULL, if no param field
* @qib_rflags: rflags to set
- * @input_slib_elements: pointer to no_input_qs * 128 words of data or NULL
- * @output_slib_elements: pointer to no_output_qs * 128 words of data or NULL
* @no_input_qs: number of input queues
* @no_output_qs: number of output queues
* @input_handler: handler to be called for input queues
unsigned int qib_param_field_format;
unsigned char *qib_param_field;
unsigned char qib_rflags;
- unsigned long *input_slib_elements;
- unsigned long *output_slib_elements;
unsigned int no_input_qs;
unsigned int no_output_qs;
qdio_handler_t *input_handler;
qdio_handler_t *output_handler;
void (*irq_poll)(struct ccw_device *cdev, unsigned long data);
- unsigned int scan_threshold;
unsigned long int_parm;
struct qdio_buffer ***input_sbal_addr_array;
struct qdio_buffer ***output_sbal_addr_array;
};
-#define QDIO_STATE_INACTIVE 0x00000002 /* after qdio_cleanup */
-#define QDIO_STATE_ESTABLISHED 0x00000004 /* after qdio_establish */
-#define QDIO_STATE_ACTIVE 0x00000008 /* after qdio_activate */
-#define QDIO_STATE_STOPPED 0x00000010 /* after queues went down */
-
#define QDIO_FLAG_SYNC_INPUT 0x01
#define QDIO_FLAG_SYNC_OUTPUT 0x02
-#define QDIO_FLAG_PCI_OUT 0x10
int qdio_alloc_buffers(struct qdio_buffer **buf, unsigned int count);
void qdio_free_buffers(struct qdio_buffer **buf, unsigned int count);
unsigned int bufnr, unsigned int count, struct qaob *aob);
extern int qdio_start_irq(struct ccw_device *cdev);
extern int qdio_stop_irq(struct ccw_device *cdev);
-extern int qdio_get_next_buffers(struct ccw_device *, int, int *, int *);
extern int qdio_inspect_queue(struct ccw_device *cdev, unsigned int nr,
bool is_input, unsigned int *bufnr,
unsigned int *error);
#define _ASM_S390_SCLP_H
#include <linux/types.h>
-#include <asm/chpid.h>
-#include <asm/cpu.h>
#define SCLP_CHP_INFO_MASK_SIZE 32
#define EARLY_SCCB_SIZE PAGE_SIZE
/* 24 + 16 * SCLP_MAX_CORES */
#define EXT_SCCB_READ_CPU (3 * PAGE_SIZE)
+#ifndef __ASSEMBLY__
+#include <asm/chpid.h>
+#include <asm/cpu.h>
+
struct sclp_chp_info {
u8 recognized[SCLP_CHP_INFO_MASK_SIZE];
u8 standby[SCLP_CHP_INFO_MASK_SIZE];
u8 data[0]; /* Subsequent Data passed verbatim to SCLP ET 24 */
} __packed;
+extern char *sclp_early_sccb;
+
+void sclp_early_set_buffer(void *sccb);
int sclp_early_read_info(void);
int sclp_early_read_storage_info(void);
int sclp_early_get_core_info(struct sclp_core_info *info);
return _sclp_get_core_info(info);
}
+#endif /* __ASSEMBLY__ */
#endif /* _ASM_S390_SCLP_H */
*/
#define __bootdata_preserved(var) __section(".boot.preserved.data." #var) var
-extern unsigned long __sdma, __edma;
-extern unsigned long __stext_dma, __etext_dma;
+extern unsigned long __samode31, __eamode31;
+extern unsigned long __stext_amode31, __etext_amode31;
#endif
#define SET_MEMORY_RW 2UL
#define SET_MEMORY_NX 4UL
#define SET_MEMORY_X 8UL
+#define SET_MEMORY_4K 16UL
int __set_memory(unsigned long addr, int numpages, unsigned long flags);
return __set_memory(addr, numpages, SET_MEMORY_X);
}
+static inline int set_memory_4k(unsigned long addr, int numpages)
+{
+ return __set_memory(addr, numpages, SET_MEMORY_4K);
+}
+
#endif
#include <uapi/asm/setup.h>
#include <linux/build_bug.h>
-#define EP_OFFSET 0x10008
-#define EP_STRING "S390EP"
#define PARMAREA 0x10400
-#define EARLY_SCCB_OFFSET 0x11000
-#define HEAD_END 0x12000
+#define HEAD_END 0x11000
/*
* Machine features detected in early.c
#define MACHINE_FLAG_NX BIT(15)
#define MACHINE_FLAG_GS BIT(16)
#define MACHINE_FLAG_SCC BIT(17)
+#define MACHINE_FLAG_PCI_MIO BIT(18)
#define LPP_MAGIC BIT(31)
#define LPP_PID_MASK _AC(0xffffffff, UL)
#define STARTUP_NORMAL_OFFSET 0x10000
#define STARTUP_KDUMP_OFFSET 0x10010
-/* Offsets to parameters in kernel/head.S */
-
-#define IPL_DEVICE_OFFSET 0x10400
-#define INITRD_START_OFFSET 0x10408
-#define INITRD_SIZE_OFFSET 0x10410
-#define OLDMEM_BASE_OFFSET 0x10418
-#define OLDMEM_SIZE_OFFSET 0x10420
-#define KERNEL_VERSION_OFFSET 0x10428
-#define COMMAND_LINE_OFFSET 0x10480
-
#ifndef __ASSEMBLY__
#include <asm/lowcore.h>
#include <asm/types.h>
-#define IPL_DEVICE (*(unsigned long *) (IPL_DEVICE_OFFSET))
-#define INITRD_START (*(unsigned long *) (INITRD_START_OFFSET))
-#define INITRD_SIZE (*(unsigned long *) (INITRD_SIZE_OFFSET))
-#define OLDMEM_BASE (*(unsigned long *) (OLDMEM_BASE_OFFSET))
-#define OLDMEM_SIZE (*(unsigned long *) (OLDMEM_SIZE_OFFSET))
-#define COMMAND_LINE ((char *) (COMMAND_LINE_OFFSET))
-
struct parmarea {
unsigned long ipl_device; /* 0x10400 */
unsigned long initrd_start; /* 0x10408 */
#define MACHINE_HAS_NX (S390_lowcore.machine_flags & MACHINE_FLAG_NX)
#define MACHINE_HAS_GS (S390_lowcore.machine_flags & MACHINE_FLAG_GS)
#define MACHINE_HAS_SCC (S390_lowcore.machine_flags & MACHINE_FLAG_SCC)
+#define MACHINE_HAS_PCI_MIO (S390_lowcore.machine_flags & MACHINE_FLAG_PCI_MIO)
/*
* Console mode. Override with conmode=
extern int is_full_image;
+struct initrd_data {
+ unsigned long start;
+ unsigned long size;
+};
+extern struct initrd_data initrd_data;
+
+struct oldmem_data {
+ unsigned long start;
+ unsigned long size;
+};
+extern struct oldmem_data oldmem_data;
+
static inline u32 gen_lpswe(unsigned long addr)
{
BUILD_BUG_ON(addr > 0xfff);
return 0xb2b20000 | addr;
}
-
-#else /* __ASSEMBLY__ */
-
-#define IPL_DEVICE (IPL_DEVICE_OFFSET)
-#define INITRD_START (INITRD_START_OFFSET)
-#define INITRD_SIZE (INITRD_SIZE_OFFSET)
-#define OLDMEM_BASE (OLDMEM_BASE_OFFSET)
-#define OLDMEM_SIZE (OLDMEM_SIZE_OFFSET)
-#define COMMAND_LINE (COMMAND_LINE_OFFSET)
-
#endif /* __ASSEMBLY__ */
#endif /* _ASM_S390_SETUP_H */
return false;
}
+#define SYSCALL_FMT_0
+#define SYSCALL_FMT_1 , "0" (r2)
+#define SYSCALL_FMT_2 , "d" (r3) SYSCALL_FMT_1
+#define SYSCALL_FMT_3 , "d" (r4) SYSCALL_FMT_2
+#define SYSCALL_FMT_4 , "d" (r5) SYSCALL_FMT_3
+#define SYSCALL_FMT_5 , "d" (r6) SYSCALL_FMT_4
+#define SYSCALL_FMT_6 , "d" (r7) SYSCALL_FMT_5
+
+#define SYSCALL_PARM_0
+#define SYSCALL_PARM_1 , long arg1
+#define SYSCALL_PARM_2 SYSCALL_PARM_1, long arg2
+#define SYSCALL_PARM_3 SYSCALL_PARM_2, long arg3
+#define SYSCALL_PARM_4 SYSCALL_PARM_3, long arg4
+#define SYSCALL_PARM_5 SYSCALL_PARM_4, long arg5
+#define SYSCALL_PARM_6 SYSCALL_PARM_5, long arg6
+
+#define SYSCALL_REGS_0
+#define SYSCALL_REGS_1 \
+ register long r2 asm("2") = arg1
+#define SYSCALL_REGS_2 \
+ SYSCALL_REGS_1; \
+ register long r3 asm("3") = arg2
+#define SYSCALL_REGS_3 \
+ SYSCALL_REGS_2; \
+ register long r4 asm("4") = arg3
+#define SYSCALL_REGS_4 \
+ SYSCALL_REGS_3; \
+ register long r5 asm("5") = arg4
+#define SYSCALL_REGS_5 \
+ SYSCALL_REGS_4; \
+ register long r6 asm("6") = arg5
+#define SYSCALL_REGS_6 \
+ SYSCALL_REGS_5; \
+ register long r7 asm("7") = arg6
+
+#define GENERATE_SYSCALL_FUNC(nr) \
+static __always_inline \
+long syscall##nr(unsigned long syscall SYSCALL_PARM_##nr) \
+{ \
+ register unsigned long r1 asm ("1") = syscall; \
+ register long rc asm ("2"); \
+ SYSCALL_REGS_##nr; \
+ \
+ asm volatile ( \
+ " svc 0\n" \
+ : "=d" (rc) \
+ : "d" (r1) SYSCALL_FMT_##nr \
+ : "memory"); \
+ return rc; \
+}
+
+GENERATE_SYSCALL_FUNC(0)
+GENERATE_SYSCALL_FUNC(1)
+GENERATE_SYSCALL_FUNC(2)
+GENERATE_SYSCALL_FUNC(3)
+GENERATE_SYSCALL_FUNC(4)
+GENERATE_SYSCALL_FUNC(5)
+GENERATE_SYSCALL_FUNC(6)
+
#endif /* _ASM_SYSCALL_H */
int gmap_convert_to_secure(struct gmap *gmap, unsigned long gaddr);
void setup_uv(void);
-void adjust_to_uv_max(unsigned long *vmax);
#else
#define is_prot_virt_host() 0
static inline void setup_uv(void) {}
-static inline void adjust_to_uv_max(unsigned long *vmax) {}
static inline int uv_destroy_page(unsigned long paddr)
{
}
#endif
-#if defined(CONFIG_PROTECTED_VIRTUALIZATION_GUEST) || IS_ENABLED(CONFIG_KVM)
-void uv_query_info(void);
-#else
-static inline void uv_query_info(void) {}
-#endif
-
#endif /* _ASM_S390_UV_H */
#define VDSO_HAS_CLOCK_GETRES 1
+#include <asm/syscall.h>
#include <asm/timex.h>
#include <asm/unistd.h>
#include <linux/compiler.h>
static __always_inline
long clock_gettime_fallback(clockid_t clkid, struct __kernel_timespec *ts)
{
- register unsigned long r1 __asm__("r1") = __NR_clock_gettime;
- register unsigned long r2 __asm__("r2") = (unsigned long)clkid;
- register void *r3 __asm__("r3") = ts;
-
- asm ("svc 0\n" : "+d" (r2) : "d" (r1), "d" (r3) : "cc", "memory");
- return r2;
+ return syscall2(__NR_clock_gettime, (long)clkid, (long)ts);
}
static __always_inline
long gettimeofday_fallback(register struct __kernel_old_timeval *tv,
register struct timezone *tz)
{
- register unsigned long r1 __asm__("r1") = __NR_gettimeofday;
- register unsigned long r2 __asm__("r2") = (unsigned long)tv;
- register void *r3 __asm__("r3") = tz;
-
- asm ("svc 0\n" : "+d" (r2) : "d" (r1), "d" (r3) : "cc", "memory");
- return r2;
+ return syscall2(__NR_gettimeofday, (long)tv, (long)tz);
}
static __always_inline
long clock_getres_fallback(clockid_t clkid, struct __kernel_timespec *ts)
{
- register unsigned long r1 __asm__("r1") = __NR_clock_getres;
- register unsigned long r2 __asm__("r2") = (unsigned long)clkid;
- register void *r3 __asm__("r3") = ts;
-
- asm ("svc 0\n" : "+d" (r2) : "d" (r1), "d" (r3) : "cc", "memory");
- return r2;
+ return syscall2(__NR_clock_getres, (long)clkid, (long)ts);
}
#ifdef CONFIG_TIME_NS
obj-y += runtime_instr.o cache.o fpu.o dumpstack.o guarded_storage.o sthyi.o
obj-y += entry.o reipl.o relocate_kernel.o kdebugfs.o alternative.o
obj-y += nospec-branch.o ipl_vmparm.o machine_kexec_reloc.o unwind_bc.o
-obj-y += smp.o
+obj-y += smp.o text_amode31.o
extra-y += head64.o vmlinux.lds
OFFSET(__LC_RESTART_FN, lowcore, restart_fn);
OFFSET(__LC_RESTART_DATA, lowcore, restart_data);
OFFSET(__LC_RESTART_SOURCE, lowcore, restart_source);
+ OFFSET(__LC_RESTART_FLAGS, lowcore, restart_flags);
OFFSET(__LC_KERNEL_ASCE, lowcore, kernel_asce);
OFFSET(__LC_USER_ASCE, lowcore, user_asce);
OFFSET(__LC_LPP, lowcore, lpp);
DEFINE(__KEXEC_SHA_REGION_SIZE, sizeof(struct kexec_sha_region));
/* sizeof kernel parameter area */
DEFINE(__PARMAREA_SIZE, sizeof(struct parmarea));
+ /* kernel parameter area offsets */
+ DEFINE(IPL_DEVICE, PARMAREA + offsetof(struct parmarea, ipl_device));
+ DEFINE(INITRD_START, PARMAREA + offsetof(struct parmarea, initrd_start));
+ DEFINE(INITRD_SIZE, PARMAREA + offsetof(struct parmarea, initrd_size));
+ DEFINE(OLDMEM_BASE, PARMAREA + offsetof(struct parmarea, oldmem_base));
+ DEFINE(OLDMEM_SIZE, PARMAREA + offsetof(struct parmarea, oldmem_size));
+ DEFINE(COMMAND_LINE, PARMAREA + offsetof(struct parmarea, command_line));
return 0;
}
while (count) {
from = __pa(src);
- if (!OLDMEM_BASE && from < sclp.hsa_size) {
+ if (!oldmem_data.start && from < sclp.hsa_size) {
/* Copy from zfcp/nvme dump HSA area */
len = min(count, sclp.hsa_size - from);
rc = memcpy_hsa_kernel(dst, from, len);
return rc;
} else {
/* Check for swapped kdump oldmem areas */
- if (OLDMEM_BASE && from - OLDMEM_BASE < OLDMEM_SIZE) {
- from -= OLDMEM_BASE;
- len = min(count, OLDMEM_SIZE - from);
- } else if (OLDMEM_BASE && from < OLDMEM_SIZE) {
- len = min(count, OLDMEM_SIZE - from);
- from += OLDMEM_BASE;
+ if (oldmem_data.start && from - oldmem_data.start < oldmem_data.size) {
+ from -= oldmem_data.start;
+ len = min(count, oldmem_data.size - from);
+ } else if (oldmem_data.start && from < oldmem_data.size) {
+ len = min(count, oldmem_data.size - from);
+ from += oldmem_data.start;
} else {
len = count;
}
while (count) {
from = __pa(src);
- if (!OLDMEM_BASE && from < sclp.hsa_size) {
+ if (!oldmem_data.start && from < sclp.hsa_size) {
/* Copy from zfcp/nvme dump HSA area */
len = min(count, sclp.hsa_size - from);
rc = memcpy_hsa_user(dst, from, len);
return rc;
} else {
/* Check for swapped kdump oldmem areas */
- if (OLDMEM_BASE && from - OLDMEM_BASE < OLDMEM_SIZE) {
- from -= OLDMEM_BASE;
- len = min(count, OLDMEM_SIZE - from);
- } else if (OLDMEM_BASE && from < OLDMEM_SIZE) {
- len = min(count, OLDMEM_SIZE - from);
- from += OLDMEM_BASE;
+ if (oldmem_data.start && from - oldmem_data.size < oldmem_data.size) {
+ from -= oldmem_data.size;
+ len = min(count, oldmem_data.size - from);
+ } else if (oldmem_data.start && from < oldmem_data.size) {
+ len = min(count, oldmem_data.size - from);
+ from += oldmem_data.start;
} else {
len = count;
}
unsigned long size_old;
int rc;
- if (pfn < OLDMEM_SIZE >> PAGE_SHIFT) {
- size_old = min(size, OLDMEM_SIZE - (pfn << PAGE_SHIFT));
+ if (pfn < oldmem_data.size >> PAGE_SHIFT) {
+ size_old = min(size, oldmem_data.size - (pfn << PAGE_SHIFT));
rc = remap_pfn_range(vma, from,
- pfn + (OLDMEM_BASE >> PAGE_SHIFT),
+ pfn + (oldmem_data.start >> PAGE_SHIFT),
size_old, prot);
if (rc || size == size_old)
return rc;
int remap_oldmem_pfn_range(struct vm_area_struct *vma, unsigned long from,
unsigned long pfn, unsigned long size, pgprot_t prot)
{
- if (OLDMEM_BASE)
+ if (oldmem_data.start)
return remap_oldmem_pfn_range_kdump(vma, from, pfn, size, prot);
else
return remap_oldmem_pfn_range_zfcpdump(vma, from, pfn, size,
u64 hdr_off;
/* If we are not in kdump or zfcp/nvme dump mode return */
- if (!OLDMEM_BASE && !is_ipl_type_dump())
+ if (!oldmem_data.start && !is_ipl_type_dump())
return 0;
/* If we cannot get HSA size for zfcp/nvme dump return error */
if (is_ipl_type_dump() && !sclp.hsa_size)
return -ENODEV;
/* For kdump, exclude previous crashkernel memory */
- if (OLDMEM_BASE) {
- oldmem_region.base = OLDMEM_BASE;
- oldmem_region.size = OLDMEM_SIZE;
- oldmem_type.total_size = OLDMEM_SIZE;
+ if (oldmem_data.start) {
+ oldmem_region.base = oldmem_data.start;
+ oldmem_region.size = oldmem_data.size;
+ oldmem_type.total_size = oldmem_data.size;
}
mem_chunk_cnt = get_mem_chunk_cnt();
#include <linux/export.h>
#include <linux/init.h>
#include <linux/fs.h>
+#include <linux/minmax.h>
#include <linux/debugfs.h>
#include <asm/debug.h>
char *out_buf, const char *in_buf);
static int debug_sprintf_format_fn(debug_info_t *id, struct debug_view *view,
char *out_buf, debug_sprintf_entry_t *curr_event);
+static void debug_areas_swap(debug_info_t *a, debug_info_t *b);
+static void debug_events_append(debug_info_t *dest, debug_info_t *src);
/* globals */
goto out;
rc->mode = mode & ~S_IFMT;
-
- /* create root directory */
- rc->debugfs_root_entry = debugfs_create_dir(rc->name,
- debug_debugfs_root_entry);
-
- /* append new element to linked list */
- if (!debug_area_first) {
- /* first element in list */
- debug_area_first = rc;
- rc->prev = NULL;
- } else {
- /* append element to end of list */
- debug_area_last->next = rc;
- rc->prev = debug_area_last;
- }
- debug_area_last = rc;
- rc->next = NULL;
-
refcount_set(&rc->ref_count, 1);
out:
return rc;
*/
static void debug_info_put(debug_info_t *db_info)
{
- int i;
-
if (!db_info)
return;
- if (refcount_dec_and_test(&db_info->ref_count)) {
- for (i = 0; i < DEBUG_MAX_VIEWS; i++) {
- if (!db_info->views[i])
- continue;
- debugfs_remove(db_info->debugfs_entries[i]);
- }
- debugfs_remove(db_info->debugfs_root_entry);
- if (db_info == debug_area_first)
- debug_area_first = db_info->next;
- if (db_info == debug_area_last)
- debug_area_last = db_info->prev;
- if (db_info->prev)
- db_info->prev->next = db_info->next;
- if (db_info->next)
- db_info->next->prev = db_info->prev;
+ if (refcount_dec_and_test(&db_info->ref_count))
debug_info_free(db_info);
- }
}
/*
return 0; /* success */
}
+/* Create debugfs entries and add to internal list. */
+static void _debug_register(debug_info_t *id)
+{
+ /* create root directory */
+ id->debugfs_root_entry = debugfs_create_dir(id->name,
+ debug_debugfs_root_entry);
+
+ /* append new element to linked list */
+ if (!debug_area_first) {
+ /* first element in list */
+ debug_area_first = id;
+ id->prev = NULL;
+ } else {
+ /* append element to end of list */
+ debug_area_last->next = id;
+ id->prev = debug_area_last;
+ }
+ debug_area_last = id;
+ id->next = NULL;
+
+ debug_register_view(id, &debug_level_view);
+ debug_register_view(id, &debug_flush_view);
+ debug_register_view(id, &debug_pages_view);
+}
+
/**
* debug_register_mode() - creates and initializes debug area.
*
if ((uid != 0) || (gid != 0))
pr_warn("Root becomes the owner of all s390dbf files in sysfs\n");
BUG_ON(!initialized);
- mutex_lock(&debug_mutex);
/* create new debug_info */
rc = debug_info_create(name, pages_per_area, nr_areas, buf_size, mode);
- if (!rc)
- goto out;
- debug_register_view(rc, &debug_level_view);
- debug_register_view(rc, &debug_flush_view);
- debug_register_view(rc, &debug_pages_view);
-out:
- if (!rc)
+ if (rc) {
+ mutex_lock(&debug_mutex);
+ _debug_register(rc);
+ mutex_unlock(&debug_mutex);
+ } else {
pr_err("Registering debug feature %s failed\n", name);
- mutex_unlock(&debug_mutex);
+ }
return rc;
}
EXPORT_SYMBOL(debug_register_mode);
}
EXPORT_SYMBOL(debug_register);
+/**
+ * debug_register_static() - registers a static debug area
+ *
+ * @id: Handle for static debug area
+ * @pages_per_area: Number of pages per area
+ * @nr_areas: Number of debug areas
+ *
+ * Register debug_info_t defined using DEFINE_STATIC_DEBUG_INFO.
+ *
+ * Note: This function is called automatically via an initcall generated by
+ * DEFINE_STATIC_DEBUG_INFO.
+ */
+void debug_register_static(debug_info_t *id, int pages_per_area, int nr_areas)
+{
+ unsigned long flags;
+ debug_info_t *copy;
+
+ if (!initialized) {
+ pr_err("Tried to register debug feature %s too early\n",
+ id->name);
+ return;
+ }
+
+ copy = debug_info_alloc("", pages_per_area, nr_areas, id->buf_size,
+ id->level, ALL_AREAS);
+ if (!copy) {
+ pr_err("Registering debug feature %s failed\n", id->name);
+
+ /* Clear pointers to prevent tracing into released initdata. */
+ spin_lock_irqsave(&id->lock, flags);
+ id->areas = NULL;
+ id->active_pages = NULL;
+ id->active_entries = NULL;
+ spin_unlock_irqrestore(&id->lock, flags);
+
+ return;
+ }
+
+ /* Replace static trace area with dynamic copy. */
+ spin_lock_irqsave(&id->lock, flags);
+ debug_events_append(copy, id);
+ debug_areas_swap(id, copy);
+ spin_unlock_irqrestore(&id->lock, flags);
+
+ /* Clear pointers to initdata and discard copy. */
+ copy->areas = NULL;
+ copy->active_pages = NULL;
+ copy->active_entries = NULL;
+ debug_info_free(copy);
+
+ mutex_lock(&debug_mutex);
+ _debug_register(id);
+ mutex_unlock(&debug_mutex);
+}
+
+/* Remove debugfs entries and remove from internal list. */
+static void _debug_unregister(debug_info_t *id)
+{
+ int i;
+
+ for (i = 0; i < DEBUG_MAX_VIEWS; i++) {
+ if (!id->views[i])
+ continue;
+ debugfs_remove(id->debugfs_entries[i]);
+ }
+ debugfs_remove(id->debugfs_root_entry);
+ if (id == debug_area_first)
+ debug_area_first = id->next;
+ if (id == debug_area_last)
+ debug_area_last = id->prev;
+ if (id->prev)
+ id->prev->next = id->next;
+ if (id->next)
+ id->next->prev = id->prev;
+}
+
/**
* debug_unregister() - give back debug area.
*
if (!id)
return;
mutex_lock(&debug_mutex);
- debug_info_put(id);
+ _debug_unregister(id);
mutex_unlock(&debug_mutex);
+
+ debug_info_put(id);
}
EXPORT_SYMBOL(debug_unregister);
*/
static int debug_set_size(debug_info_t *id, int nr_areas, int pages_per_area)
{
- debug_entry_t ***new_areas;
+ debug_info_t *new_id;
unsigned long flags;
- int rc = 0;
if (!id || (nr_areas <= 0) || (pages_per_area < 0))
return -EINVAL;
- if (pages_per_area > 0) {
- new_areas = debug_areas_alloc(pages_per_area, nr_areas);
- if (!new_areas) {
- pr_info("Allocating memory for %i pages failed\n",
- pages_per_area);
- rc = -ENOMEM;
- goto out;
- }
- } else {
- new_areas = NULL;
+
+ new_id = debug_info_alloc("", pages_per_area, nr_areas, id->buf_size,
+ id->level, ALL_AREAS);
+ if (!new_id) {
+ pr_info("Allocating memory for %i pages failed\n",
+ pages_per_area);
+ return -ENOMEM;
}
+
spin_lock_irqsave(&id->lock, flags);
- debug_areas_free(id);
- id->areas = new_areas;
- id->nr_areas = nr_areas;
- id->pages_per_area = pages_per_area;
- id->active_area = 0;
- memset(id->active_entries, 0, sizeof(int)*id->nr_areas);
- memset(id->active_pages, 0, sizeof(int)*id->nr_areas);
+ debug_events_append(new_id, id);
+ debug_areas_swap(new_id, id);
+ debug_info_free(new_id);
spin_unlock_irqrestore(&id->lock, flags);
pr_info("%s: set new size (%i pages)\n", id->name, pages_per_area);
-out:
- return rc;
+
+ return 0;
}
/**
if (!id)
return;
- spin_lock_irqsave(&id->lock, flags);
+
if (new_level == DEBUG_OFF_LEVEL) {
- id->level = DEBUG_OFF_LEVEL;
pr_info("%s: switched off\n", id->name);
} else if ((new_level > DEBUG_MAX_LEVEL) || (new_level < 0)) {
pr_info("%s: level %i is out of range (%i - %i)\n",
id->name, new_level, 0, DEBUG_MAX_LEVEL);
- } else {
- id->level = new_level;
+ return;
}
+
+ spin_lock_irqsave(&id->lock, flags);
+ id->level = new_level;
spin_unlock_irqrestore(&id->lock, flags);
}
EXPORT_SYMBOL(debug_set_level);
id->active_entries[id->active_area]);
}
+/* Swap debug areas of a and b. */
+static void debug_areas_swap(debug_info_t *a, debug_info_t *b)
+{
+ swap(a->nr_areas, b->nr_areas);
+ swap(a->pages_per_area, b->pages_per_area);
+ swap(a->areas, b->areas);
+ swap(a->active_area, b->active_area);
+ swap(a->active_pages, b->active_pages);
+ swap(a->active_entries, b->active_entries);
+}
+
+/* Append all debug events in active area from source to destination log. */
+static void debug_events_append(debug_info_t *dest, debug_info_t *src)
+{
+ debug_entry_t *from, *to, *last;
+
+ if (!src->areas || !dest->areas)
+ return;
+
+ /* Loop over all entries in src, starting with oldest. */
+ from = get_active_entry(src);
+ last = from;
+ do {
+ if (from->clock != 0LL) {
+ to = get_active_entry(dest);
+ memset(to, 0, dest->entry_size);
+ memcpy(to, from, min(src->entry_size,
+ dest->entry_size));
+ proceed_active_entry(dest);
+ }
+
+ proceed_active_entry(src);
+ from = get_active_entry(src);
+ } while (from != last);
+}
+
/*
* debug_finish_entry:
* - set timestamp, caller address, cpu number etc.
break;
}
if (i == DEBUG_MAX_VIEWS) {
- pr_err("Registering view %s/%s would exceed the maximum "
- "number of views %i\n", id->name, view->name, i);
rc = -1;
} else {
id->views[i] = view;
id->debugfs_entries[i] = pde;
}
spin_unlock_irqrestore(&id->lock, flags);
- if (rc)
+ if (rc) {
+ pr_err("Registering view %s/%s would exceed the maximum "
+ "number of views %i\n", id->name, view->name, i);
debugfs_remove(pde);
+ }
out:
return rc;
}
#include <asm/diag.h>
#include <asm/trace/diag.h>
#include <asm/sections.h>
+#include "entry.h"
struct diag_stat {
unsigned int counter[NR_DIAG_STAT];
[DIAG_STAT_X500] = { .code = 0x500, .name = "Virtio Service" },
};
-struct diag_ops __bootdata_preserved(diag_dma_ops);
-struct diag210 *__bootdata_preserved(__diag210_tmp_dma);
+struct diag_ops __amode31_ref diag_amode31_ops = {
+ .diag210 = _diag210_amode31,
+ .diag26c = _diag26c_amode31,
+ .diag14 = _diag14_amode31,
+ .diag0c = _diag0c_amode31,
+ .diag308_reset = _diag308_reset_amode31
+};
+
+static struct diag210 _diag210_tmp_amode31 __section(".amode31.data");
+struct diag210 __amode31_ref *__diag210_tmp_amode31 = &_diag210_tmp_amode31;
static int show_diag_stat(struct seq_file *m, void *v)
{
unsigned long n = (unsigned long) v - 1;
int cpu, prec, tmp;
- get_online_cpus();
+ cpus_read_lock();
if (n == 0) {
seq_puts(m, " ");
}
seq_printf(m, " %s\n", diag_map[n-1].name);
}
- put_online_cpus();
+ cpus_read_unlock();
return 0;
}
int diag14(unsigned long rx, unsigned long ry1, unsigned long subcode)
{
diag_stat_inc(DIAG_STAT_X014);
- return diag_dma_ops.diag14(rx, ry1, subcode);
+ return diag_amode31_ops.diag14(rx, ry1, subcode);
}
EXPORT_SYMBOL(diag14);
int ccode;
spin_lock_irqsave(&diag210_lock, flags);
- *__diag210_tmp_dma = *addr;
+ *__diag210_tmp_amode31 = *addr;
diag_stat_inc(DIAG_STAT_X210);
- ccode = diag_dma_ops.diag210(__diag210_tmp_dma);
+ ccode = diag_amode31_ops.diag210(__diag210_tmp_amode31);
- *addr = *__diag210_tmp_dma;
+ *addr = *__diag210_tmp_amode31;
spin_unlock_irqrestore(&diag210_lock, flags);
return ccode;
int diag26c(void *req, void *resp, enum diag26c_sc subcode)
{
diag_stat_inc(DIAG_STAT_X26C);
- return diag_dma_ops.diag26c(req, resp, subcode);
+ return diag_amode31_ops.diag26c(req, resp, subcode);
}
EXPORT_SYMBOL(diag26c);
[INSTR_VRR_VV] = { V_8, V_12, 0, 0, 0, 0 },
[INSTR_VRR_VV0U] = { V_8, V_12, U4_32, 0, 0, 0 },
[INSTR_VRR_VV0U0U] = { V_8, V_12, U4_32, U4_24, 0, 0 },
+ [INSTR_VRR_VV0U2] = { V_8, V_12, U4_24, 0, 0, 0 },
[INSTR_VRR_VV0UU2] = { V_8, V_12, U4_32, U4_28, 0, 0 },
[INSTR_VRR_VV0UUU] = { V_8, V_12, U4_32, U4_28, U4_24, 0 },
[INSTR_VRR_VVV] = { V_8, V_12, V_16, 0, 0, 0 },
[INSTR_VRR_VVV0U] = { V_8, V_12, V_16, U4_32, 0, 0 },
+ [INSTR_VRR_VVV0U0] = { V_8, V_12, V_16, U4_24, 0, 0 },
[INSTR_VRR_VVV0U0U] = { V_8, V_12, V_16, U4_32, U4_24, 0 },
[INSTR_VRR_VVV0UU] = { V_8, V_12, V_16, U4_32, U4_28, 0 },
[INSTR_VRR_VVV0UUU] = { V_8, V_12, V_16, U4_32, U4_28, U4_24 },
clock_comparator_max = -1ULL >> 1;
__ctl_set_bit(0, 53);
}
+ if (IS_ENABLED(CONFIG_PCI) && test_facility(153)) {
+ S390_lowcore.machine_flags |= MACHINE_FLAG_PCI_MIO;
+ /* the control bit is set during PCI initialization */
+ }
}
static inline void save_vector_registers(void)
4: j 4b
ENDPROC(mcck_int_handler)
-#
-# PSW restart interrupt handler
-#
ENTRY(restart_int_handler)
ALTERNATIVE "", ".insn s,0xb2800000,_LPP_OFFSET", 40
stg %r15,__LC_SAVE_AREA_RESTART
+ TSTMSK __LC_RESTART_FLAGS,RESTART_FLAG_CTLREGS,4
+ jz 0f
+ la %r15,4095
+ lctlg %c0,%c15,__LC_CREGS_SAVE_AREA-4095(%r15)
+0: larl %r15,.Lstosm_tmp
+ stosm 0(%r15),0x04 # turn dat on, keep irqs off
lg %r15,__LC_RESTART_STACK
xc STACK_FRAME_OVERHEAD(__PT_SIZE,%r15),STACK_FRAME_OVERHEAD(%r15)
stmg %r0,%r14,STACK_FRAME_OVERHEAD+__PT_R0(%r15)
xc 0(STACK_FRAME_OVERHEAD,%r15),0(%r15)
lg %r1,__LC_RESTART_FN # load fn, parm & source cpu
lg %r2,__LC_RESTART_DATA
- lg %r3,__LC_RESTART_SOURCE
+ lgf %r3,__LC_RESTART_SOURCE
ltgr %r3,%r3 # test source cpu address
jm 1f # negative -> skip source stop
0: sigp %r4,%r3,SIGP_SENSE # sigp sense to source cpu
void do_secure_storage_violation(struct pt_regs *regs);
void do_report_trap(struct pt_regs *regs, int si_signo, int si_code, char *str);
void kernel_stack_overflow(struct pt_regs * regs);
-void do_signal(struct pt_regs *regs);
void handle_signal32(struct ksignal *ksig, sigset_t *oldset,
struct pt_regs *regs);
-void do_notify_resume(struct pt_regs *regs);
void __init init_IRQ(void);
void do_io_irq(struct pt_regs *regs);
extern char kprobes_insn_page[];
+extern char _samode31[], _eamode31[];
+extern char _stext_amode31[], _etext_amode31[];
+extern struct exception_table_entry _start_amode31_ex_table[];
+extern struct exception_table_entry _stop_amode31_ex_table[];
+
+#define __amode31_data __section(".amode31.data")
+#define __amode31_ref __section(".amode31.refs")
+extern long _start_amode31_refs[], _end_amode31_refs[];
+
#endif /* _ENTRY_H */
#include <trace/syscall.h>
#include <asm/asm-offsets.h>
#include <asm/cacheflush.h>
+#include <asm/ftrace.lds.h>
+#include <asm/nospec-branch.h>
#include <asm/set_memory.h>
#include "entry.h"
+#include "ftrace.h"
/*
* To generate function prologue either gcc's hotpatch feature (since gcc 4.8)
*/
void *ftrace_func __read_mostly = ftrace_stub;
-unsigned long ftrace_plt;
+struct ftrace_insn {
+ u16 opc;
+ s32 disp;
+} __packed;
+
+asm(
+ " .align 16\n"
+ "ftrace_shared_hotpatch_trampoline_br:\n"
+ " lmg %r0,%r1,2(%r1)\n"
+ " br %r1\n"
+ "ftrace_shared_hotpatch_trampoline_br_end:\n"
+);
+
+#ifdef CONFIG_EXPOLINE
+asm(
+ " .align 16\n"
+ "ftrace_shared_hotpatch_trampoline_ex:\n"
+ " lmg %r0,%r1,2(%r1)\n"
+ " ex %r0," __stringify(__LC_BR_R1) "(%r0)\n"
+ " j .\n"
+ "ftrace_shared_hotpatch_trampoline_ex_end:\n"
+);
+
+asm(
+ " .align 16\n"
+ "ftrace_shared_hotpatch_trampoline_exrl:\n"
+ " lmg %r0,%r1,2(%r1)\n"
+ " .insn ril,0xc60000000000,%r0,0f\n" /* exrl */
+ " j .\n"
+ "0: br %r1\n"
+ "ftrace_shared_hotpatch_trampoline_exrl_end:\n"
+);
+#endif /* CONFIG_EXPOLINE */
+
+#ifdef CONFIG_MODULES
+static char *ftrace_plt;
+
+asm(
+ " .data\n"
+ "ftrace_plt_template:\n"
+ " basr %r1,%r0\n"
+ " lg %r1,0f-.(%r1)\n"
+ " br %r1\n"
+ "0: .quad ftrace_caller\n"
+ "ftrace_plt_template_end:\n"
+ " .previous\n"
+);
+#endif /* CONFIG_MODULES */
+
+static const char *ftrace_shared_hotpatch_trampoline(const char **end)
+{
+ const char *tstart, *tend;
+
+ tstart = ftrace_shared_hotpatch_trampoline_br;
+ tend = ftrace_shared_hotpatch_trampoline_br_end;
+#ifdef CONFIG_EXPOLINE
+ if (!nospec_disable) {
+ tstart = ftrace_shared_hotpatch_trampoline_ex;
+ tend = ftrace_shared_hotpatch_trampoline_ex_end;
+ if (test_facility(35)) { /* exrl */
+ tstart = ftrace_shared_hotpatch_trampoline_exrl;
+ tend = ftrace_shared_hotpatch_trampoline_exrl_end;
+ }
+ }
+#endif /* CONFIG_EXPOLINE */
+ if (end)
+ *end = tend;
+ return tstart;
+}
+
+bool ftrace_need_init_nop(void)
+{
+ return ftrace_shared_hotpatch_trampoline(NULL);
+}
+
+int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
+{
+ static struct ftrace_hotpatch_trampoline *next_vmlinux_trampoline =
+ __ftrace_hotpatch_trampolines_start;
+ static const char orig[6] = { 0xc0, 0x04, 0x00, 0x00, 0x00, 0x00 };
+ static struct ftrace_hotpatch_trampoline *trampoline;
+ struct ftrace_hotpatch_trampoline **next_trampoline;
+ struct ftrace_hotpatch_trampoline *trampolines_end;
+ struct ftrace_hotpatch_trampoline tmp;
+ struct ftrace_insn *insn;
+ const char *shared;
+ s32 disp;
+
+ BUILD_BUG_ON(sizeof(struct ftrace_hotpatch_trampoline) !=
+ SIZEOF_FTRACE_HOTPATCH_TRAMPOLINE);
+
+ next_trampoline = &next_vmlinux_trampoline;
+ trampolines_end = __ftrace_hotpatch_trampolines_end;
+ shared = ftrace_shared_hotpatch_trampoline(NULL);
+#ifdef CONFIG_MODULES
+ if (mod) {
+ next_trampoline = &mod->arch.next_trampoline;
+ trampolines_end = mod->arch.trampolines_end;
+ shared = ftrace_plt;
+ }
+#endif
+
+ if (WARN_ON_ONCE(*next_trampoline >= trampolines_end))
+ return -ENOMEM;
+ trampoline = (*next_trampoline)++;
+
+ /* Check for the compiler-generated fentry nop (brcl 0, .). */
+ if (WARN_ON_ONCE(memcmp((const void *)rec->ip, &orig, sizeof(orig))))
+ return -EINVAL;
+
+ /* Generate the trampoline. */
+ tmp.brasl_opc = 0xc015; /* brasl %r1, shared */
+ tmp.brasl_disp = (shared - (const char *)&trampoline->brasl_opc) / 2;
+ tmp.interceptor = FTRACE_ADDR;
+ tmp.rest_of_intercepted_function = rec->ip + sizeof(struct ftrace_insn);
+ s390_kernel_write(trampoline, &tmp, sizeof(tmp));
+
+ /* Generate a jump to the trampoline. */
+ disp = ((char *)trampoline - (char *)rec->ip) / 2;
+ insn = (struct ftrace_insn *)rec->ip;
+ s390_kernel_write(&insn->disp, &disp, sizeof(disp));
+
+ return 0;
+}
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
unsigned long addr)
return 0;
}
+static void ftrace_generate_nop_insn(struct ftrace_insn *insn)
+{
+ /* brcl 0,0 */
+ insn->opc = 0xc004;
+ insn->disp = 0;
+}
+
+static void ftrace_generate_call_insn(struct ftrace_insn *insn,
+ unsigned long ip)
+{
+ unsigned long target;
+
+ /* brasl r0,ftrace_caller */
+ target = FTRACE_ADDR;
+#ifdef CONFIG_MODULES
+ if (is_module_addr((void *)ip))
+ target = (unsigned long)ftrace_plt;
+#endif /* CONFIG_MODULES */
+ insn->opc = 0xc005;
+ insn->disp = (target - ip) / 2;
+}
+
+static void brcl_disable(void *brcl)
+{
+ u8 op = 0x04; /* set mask field to zero */
+
+ s390_kernel_write((char *)brcl + 1, &op, sizeof(op));
+}
+
int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
unsigned long addr)
{
struct ftrace_insn orig, new, old;
+ if (ftrace_shared_hotpatch_trampoline(NULL)) {
+ brcl_disable((void *)rec->ip);
+ return 0;
+ }
+
if (copy_from_kernel_nofault(&old, (void *) rec->ip, sizeof(old)))
return -EFAULT;
/* Replace ftrace call with a nop. */
return 0;
}
+static void brcl_enable(void *brcl)
+{
+ u8 op = 0xf4; /* set mask field to all ones */
+
+ s390_kernel_write((char *)brcl + 1, &op, sizeof(op));
+}
+
int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
{
struct ftrace_insn orig, new, old;
+ if (ftrace_shared_hotpatch_trampoline(NULL)) {
+ brcl_enable((void *)rec->ip);
+ return 0;
+ }
+
if (copy_from_kernel_nofault(&old, (void *) rec->ip, sizeof(old)))
return -EFAULT;
/* Replace nop with an ftrace call. */
return 0;
}
+void arch_ftrace_update_code(int command)
+{
+ if (ftrace_shared_hotpatch_trampoline(NULL))
+ ftrace_modify_all_code(command);
+ else
+ ftrace_run_stop_machine(command);
+}
+
+static void __ftrace_sync(void *dummy)
+{
+}
+
+int ftrace_arch_code_modify_post_process(void)
+{
+ if (ftrace_shared_hotpatch_trampoline(NULL)) {
+ /* Send SIGP to the other CPUs, so they see the new code. */
+ smp_call_function(__ftrace_sync, NULL, 1);
+ }
+ return 0;
+}
+
#ifdef CONFIG_MODULES
static int __init ftrace_plt_init(void)
{
- unsigned int *ip;
+ const char *start, *end;
- ftrace_plt = (unsigned long) module_alloc(PAGE_SIZE);
+ ftrace_plt = module_alloc(PAGE_SIZE);
if (!ftrace_plt)
panic("cannot allocate ftrace plt\n");
- ip = (unsigned int *) ftrace_plt;
- ip[0] = 0x0d10e310; /* basr 1,0; lg 1,10(1); br 1 */
- ip[1] = 0x100a0004;
- ip[2] = 0x07f10000;
- ip[3] = FTRACE_ADDR >> 32;
- ip[4] = FTRACE_ADDR & 0xffffffff;
- set_memory_ro(ftrace_plt, 1);
+
+ start = ftrace_shared_hotpatch_trampoline(&end);
+ if (!start) {
+ start = ftrace_plt_template;
+ end = ftrace_plt_template_end;
+ }
+ memcpy(ftrace_plt, start, end - start);
+ set_memory_ro((unsigned long)ftrace_plt, 1);
return 0;
}
device_initcall(ftrace_plt_init);
*/
int ftrace_enable_ftrace_graph_caller(void)
{
- u8 op = 0x04; /* set mask field to zero */
-
- s390_kernel_write(__va(ftrace_graph_caller)+1, &op, sizeof(op));
+ brcl_disable(__va(ftrace_graph_caller));
return 0;
}
int ftrace_disable_ftrace_graph_caller(void)
{
- u8 op = 0xf4; /* set mask field to all ones */
-
- s390_kernel_write(__va(ftrace_graph_caller)+1, &op, sizeof(op));
+ brcl_enable(__va(ftrace_graph_caller));
return 0;
}
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _FTRACE_H
+#define _FTRACE_H
+
+#include <asm/types.h>
+
+struct ftrace_hotpatch_trampoline {
+ u16 brasl_opc;
+ s32 brasl_disp;
+ s16: 16;
+ u64 rest_of_intercepted_function;
+ u64 interceptor;
+} __packed;
+
+extern struct ftrace_hotpatch_trampoline __ftrace_hotpatch_trampolines_start[];
+extern struct ftrace_hotpatch_trampoline __ftrace_hotpatch_trampolines_end[];
+extern const char ftrace_shared_hotpatch_trampoline_br[];
+extern const char ftrace_shared_hotpatch_trampoline_br_end[];
+extern const char ftrace_shared_hotpatch_trampoline_ex[];
+extern const char ftrace_shared_hotpatch_trampoline_ex_end[];
+extern const char ftrace_shared_hotpatch_trampoline_exrl[];
+extern const char ftrace_shared_hotpatch_trampoline_exrl_end[];
+extern const char ftrace_plt_template[];
+extern const char ftrace_plt_template_end[];
+
+#endif /* _FTRACE_H */
larl %r1,tod_clock_base
mvc 0(16,%r1),__LC_BOOT_CLOCK
larl %r13,.LPG1 # get base
+ lctlg %c0,%c15,.Lctl-.LPG1(%r13) # load control registers
#
# Setup stack
#
.align 16
.LPG1:
.Ldw: .quad 0x0002000180000000,0x0000000000000000
+.Lctl: .quad 0x04040000 # cr0: AFP registers & secondary space
+ .quad 0 # cr1: primary space segment table
+ .quad 0 # cr2: dispatchable unit control table
+ .quad 0 # cr3: instruction authorization
+ .quad 0xffff # cr4: instruction authorization
+ .quad 0 # cr5: primary-aste origin
+ .quad 0 # cr6: I/O interrupts
+ .quad 0 # cr7: secondary space segment table
+ .quad 0x0000000000008000 # cr8: access registers translation
+ .quad 0 # cr9: tracing off
+ .quad 0 # cr10: tracing off
+ .quad 0 # cr11: tracing off
+ .quad 0 # cr12: tracing off
+ .quad 0 # cr13: home space segment table
+ .quad 0xc0000000 # cr14: machine check handling off
+ .quad 0 # cr15: linkage stack operations
int diag308(unsigned long subcode, void *addr)
{
- if (IS_ENABLED(CONFIG_KASAN))
- __arch_local_irq_stosm(0x04); /* enable DAT */
diag_stat_inc(DIAG_STAT_X308);
return __diag308(subcode, addr);
}
static void __do_restart(void *ignore)
{
- __arch_local_irq_stosm(0x04); /* enable DAT */
smp_send_stop();
#ifdef CONFIG_CRASH_DUMP
crash_kexec(NULL);
/* Disable lowcore protection */
__ctl_clear_bit(0, 28);
- diag_dma_ops.diag308_reset();
+ diag_amode31_ops.diag308_reset();
}
#ifdef CONFIG_KEXEC_FILE
// SPDX-License-Identifier: GPL-2.0
+#include <linux/minmax.h>
+#include <linux/string.h>
#include <asm/ebcdic.h>
#include <asm/ipl.h>
int index = *(loff_t *) v;
int cpu, irq;
- get_online_cpus();
+ cpus_read_lock();
if (index == 0) {
seq_puts(p, " ");
for_each_online_cpu(cpu)
seq_putc(p, '\n');
}
out:
- put_online_cpus();
+ cpus_read_unlock();
return 0;
}
unsigned char *ipe = (unsigned char *)expected;
unsigned char *ipn = (unsigned char *)new;
- pr_emerg("Jump label code mismatch at %pS [%p]\n", ipc, ipc);
+ pr_emerg("Jump label code mismatch at %pS [%px]\n", ipc, ipc);
pr_emerg("Found: %6ph\n", ipc);
pr_emerg("Expected: %6ph\n", ipe);
pr_emerg("New: %6ph\n", ipn);
VMCOREINFO_SYMBOL(lowcore_ptr);
VMCOREINFO_SYMBOL(high_memory);
VMCOREINFO_LENGTH(lowcore_ptr, NR_CPUS);
- vmcoreinfo_append_str("SDMA=%lx\n", __sdma);
- vmcoreinfo_append_str("EDMA=%lx\n", __edma);
+ vmcoreinfo_append_str("SAMODE31=%lx\n", __samode31);
+ vmcoreinfo_append_str("EAMODE31=%lx\n", __eamode31);
vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset());
mem_assign_absolute(S390_lowcore.vmcore_info, paddr_vmcoreinfo_note());
}
*/
static void __machine_kexec(void *data)
{
- __arch_local_irq_stosm(0x04); /* enable DAT */
pfault_fini();
tracing_off();
debug_locks_off();
#include <linux/elf.h>
#include <linux/vmalloc.h>
#include <linux/fs.h>
+#include <linux/ftrace.h>
#include <linux/string.h>
#include <linux/kernel.h>
#include <linux/kasan.h>
#include <asm/alternative.h>
#include <asm/nospec-branch.h>
#include <asm/facility.h>
+#include <asm/ftrace.lds.h>
+#include <asm/set_memory.h>
#if 0
#define DEBUGP printk
return p;
}
+#ifdef CONFIG_FUNCTION_TRACER
+void module_arch_cleanup(struct module *mod)
+{
+ module_memfree(mod->arch.trampolines_start);
+}
+#endif
+
void module_arch_freeing_init(struct module *mod)
{
if (is_livepatch_module(mod) &&
write);
}
+#ifdef CONFIG_FUNCTION_TRACER
+static int module_alloc_ftrace_hotpatch_trampolines(struct module *me,
+ const Elf_Shdr *s)
+{
+ char *start, *end;
+ int numpages;
+ size_t size;
+
+ size = FTRACE_HOTPATCH_TRAMPOLINES_SIZE(s->sh_size);
+ numpages = DIV_ROUND_UP(size, PAGE_SIZE);
+ start = module_alloc(numpages * PAGE_SIZE);
+ if (!start)
+ return -ENOMEM;
+ set_memory_ro((unsigned long)start, numpages);
+ end = start + size;
+
+ me->arch.trampolines_start = (struct ftrace_hotpatch_trampoline *)start;
+ me->arch.trampolines_end = (struct ftrace_hotpatch_trampoline *)end;
+ me->arch.next_trampoline = me->arch.trampolines_start;
+
+ return 0;
+}
+#endif /* CONFIG_FUNCTION_TRACER */
+
int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
const Elf_Shdr *s;
char *secstrings, *secname;
void *aseg;
+#ifdef CONFIG_FUNCTION_TRACER
+ int ret;
+#endif
if (IS_ENABLED(CONFIG_EXPOLINE) &&
!nospec_disable && me->arch.plt_size) {
if (IS_ENABLED(CONFIG_EXPOLINE) &&
(str_has_prefix(secname, ".s390_return")))
nospec_revert(aseg, aseg + s->sh_size);
+
+#ifdef CONFIG_FUNCTION_TRACER
+ if (!strcmp(FTRACE_CALLSITE_SECTION, secname)) {
+ ret = module_alloc_ftrace_hotpatch_trampolines(me, s);
+ if (ret < 0)
+ return ret;
+ }
+#endif /* CONFIG_FUNCTION_TRACER */
}
jump_label_apply_nops(me);
if (os_info_init)
return;
- if (!OLDMEM_BASE)
+ if (!oldmem_data.start)
goto fail;
if (copy_oldmem_kernel(&addr, &S390_lowcore.os_info, sizeof(addr)))
goto fail;
{
int ret;
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&cfset_ctrset_mutex);
switch (cmd) {
case S390_HWCTR_START:
break;
}
mutex_unlock(&cfset_ctrset_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return ret;
}
#include <linux/cpufeature.h>
#include <linux/bitops.h>
#include <linux/kernel.h>
+#include <linux/random.h>
#include <linux/sched/mm.h>
#include <linux/init.h>
#include <linux/seq_file.h>
#include <asm/elf.h>
#include <asm/lowcore.h>
#include <asm/param.h>
+#include <asm/sclp.h>
#include <asm/smp.h>
+unsigned long __read_mostly elf_hwcap;
+char elf_platform[ELF_PLATFORM_SIZE];
+
struct cpu_info {
unsigned int cpu_mhz_dynamic;
unsigned int cpu_mhz_static;
static void show_cpu_summary(struct seq_file *m, void *v)
{
static const char *hwcap_str[] = {
- "esan3", "zarch", "stfle", "msa", "ldisp", "eimm", "dfp",
- "edat", "etf3eh", "highgprs", "te", "vx", "vxd", "vxe", "gs",
- "vxe2", "vxp", "sort", "dflt"
- };
- static const char * const int_hwcap_str[] = {
- "sie"
+ [HWCAP_NR_ESAN3] = "esan3",
+ [HWCAP_NR_ZARCH] = "zarch",
+ [HWCAP_NR_STFLE] = "stfle",
+ [HWCAP_NR_MSA] = "msa",
+ [HWCAP_NR_LDISP] = "ldisp",
+ [HWCAP_NR_EIMM] = "eimm",
+ [HWCAP_NR_DFP] = "dfp",
+ [HWCAP_NR_HPAGE] = "edat",
+ [HWCAP_NR_ETF3EH] = "etf3eh",
+ [HWCAP_NR_HIGH_GPRS] = "highgprs",
+ [HWCAP_NR_TE] = "te",
+ [HWCAP_NR_VXRS] = "vx",
+ [HWCAP_NR_VXRS_BCD] = "vxd",
+ [HWCAP_NR_VXRS_EXT] = "vxe",
+ [HWCAP_NR_GS] = "gs",
+ [HWCAP_NR_VXRS_EXT2] = "vxe2",
+ [HWCAP_NR_VXRS_PDE] = "vxp",
+ [HWCAP_NR_SORT] = "sort",
+ [HWCAP_NR_DFLT] = "dflt",
+ [HWCAP_NR_VXRS_PDE2] = "vxp2",
+ [HWCAP_NR_NNPA] = "nnpa",
+ [HWCAP_NR_PCI_MIO] = "pcimio",
+ [HWCAP_NR_SIE] = "sie",
};
int i, cpu;
+ BUILD_BUG_ON(ARRAY_SIZE(hwcap_str) != HWCAP_NR_MAX);
seq_printf(m, "vendor_id : IBM/S390\n"
"# processors : %i\n"
"bogomips per cpu: %lu.%02lu\n",
for (i = 0; i < ARRAY_SIZE(hwcap_str); i++)
if (hwcap_str[i] && (elf_hwcap & (1UL << i)))
seq_printf(m, "%s ", hwcap_str[i]);
- for (i = 0; i < ARRAY_SIZE(int_hwcap_str); i++)
- if (int_hwcap_str[i] && (int_hwcap & (1UL << i)))
- seq_printf(m, "%s ", int_hwcap_str[i]);
seq_puts(m, "\n");
show_facilities(m);
show_cacheinfo(m);
}
}
+static int __init setup_hwcaps(void)
+{
+ /* instructions named N3, "backported" to esa-mode */
+ if (test_facility(0))
+ elf_hwcap |= HWCAP_ESAN3;
+
+ /* z/Architecture mode active */
+ elf_hwcap |= HWCAP_ZARCH;
+
+ /* store-facility-list-extended */
+ if (test_facility(7))
+ elf_hwcap |= HWCAP_STFLE;
+
+ /* message-security assist */
+ if (test_facility(17))
+ elf_hwcap |= HWCAP_MSA;
+
+ /* long-displacement */
+ if (test_facility(19))
+ elf_hwcap |= HWCAP_LDISP;
+
+ /* extended-immediate */
+ if (test_facility(21))
+ elf_hwcap |= HWCAP_EIMM;
+
+ /* extended-translation facility 3 enhancement */
+ if (test_facility(22) && test_facility(30))
+ elf_hwcap |= HWCAP_ETF3EH;
+
+ /* decimal floating point & perform floating point operation */
+ if (test_facility(42) && test_facility(44))
+ elf_hwcap |= HWCAP_DFP;
+
+ /* huge page support */
+ if (MACHINE_HAS_EDAT1)
+ elf_hwcap |= HWCAP_HPAGE;
+
+ /* 64-bit register support for 31-bit processes */
+ elf_hwcap |= HWCAP_HIGH_GPRS;
+
+ /* transactional execution */
+ if (MACHINE_HAS_TE)
+ elf_hwcap |= HWCAP_TE;
+
+ /*
+ * Vector extension can be disabled with the "novx" parameter.
+ * Use MACHINE_HAS_VX instead of facility bit 129.
+ */
+ if (MACHINE_HAS_VX) {
+ elf_hwcap |= HWCAP_VXRS;
+ if (test_facility(134))
+ elf_hwcap |= HWCAP_VXRS_BCD;
+ if (test_facility(135))
+ elf_hwcap |= HWCAP_VXRS_EXT;
+ if (test_facility(148))
+ elf_hwcap |= HWCAP_VXRS_EXT2;
+ if (test_facility(152))
+ elf_hwcap |= HWCAP_VXRS_PDE;
+ if (test_facility(192))
+ elf_hwcap |= HWCAP_VXRS_PDE2;
+ }
+
+ if (test_facility(150))
+ elf_hwcap |= HWCAP_SORT;
+
+ if (test_facility(151))
+ elf_hwcap |= HWCAP_DFLT;
+
+ if (test_facility(165))
+ elf_hwcap |= HWCAP_NNPA;
+
+ /* guarded storage */
+ if (MACHINE_HAS_GS)
+ elf_hwcap |= HWCAP_GS;
+
+ if (MACHINE_HAS_PCI_MIO)
+ elf_hwcap |= HWCAP_PCI_MIO;
+
+ /* virtualization support */
+ if (sclp.has_sief2)
+ elf_hwcap |= HWCAP_SIE;
+
+ return 0;
+}
+arch_initcall(setup_hwcaps);
+
+static int __init setup_elf_platform(void)
+{
+ struct cpuid cpu_id;
+
+ get_cpu_id(&cpu_id);
+ add_device_randomness(&cpu_id, sizeof(cpu_id));
+ switch (cpu_id.machine) {
+ case 0x2064:
+ case 0x2066:
+ default: /* Use "z900" as default for 64 bit kernels. */
+ strcpy(elf_platform, "z900");
+ break;
+ case 0x2084:
+ case 0x2086:
+ strcpy(elf_platform, "z990");
+ break;
+ case 0x2094:
+ case 0x2096:
+ strcpy(elf_platform, "z9-109");
+ break;
+ case 0x2097:
+ case 0x2098:
+ strcpy(elf_platform, "z10");
+ break;
+ case 0x2817:
+ case 0x2818:
+ strcpy(elf_platform, "z196");
+ break;
+ case 0x2827:
+ case 0x2828:
+ strcpy(elf_platform, "zEC12");
+ break;
+ case 0x2964:
+ case 0x2965:
+ strcpy(elf_platform, "z13");
+ break;
+ case 0x3906:
+ case 0x3907:
+ strcpy(elf_platform, "z14");
+ break;
+ case 0x8561:
+ case 0x8562:
+ strcpy(elf_platform, "z15");
+ break;
+ }
+ return 0;
+}
+arch_initcall(setup_elf_platform);
+
static void show_cpu_topology(struct seq_file *m, unsigned long n)
{
#ifdef CONFIG_SCHED_TOPOLOGY
static void *c_start(struct seq_file *m, loff_t *pos)
{
- get_online_cpus();
+ cpus_read_lock();
return c_update(pos);
}
static void c_stop(struct seq_file *m, void *v)
{
- put_online_cpus();
+ cpus_read_unlock();
}
const struct seq_operations cpuinfo_op = {
unsigned int console_irq = -1;
EXPORT_SYMBOL(console_irq);
-unsigned long elf_hwcap __read_mostly = 0;
-char elf_platform[ELF_PLATFORM_SIZE];
+/*
+ * Some code and data needs to stay below 2 GB, even when the kernel would be
+ * relocated above 2 GB, because it has to use 31 bit addresses.
+ * Such code and data is part of the .amode31 section.
+ */
+unsigned long __amode31_ref __samode31 = __pa(&_samode31);
+unsigned long __amode31_ref __eamode31 = __pa(&_eamode31);
+unsigned long __amode31_ref __stext_amode31 = __pa(&_stext_amode31);
+unsigned long __amode31_ref __etext_amode31 = __pa(&_etext_amode31);
+struct exception_table_entry __amode31_ref *__start_amode31_ex_table = _start_amode31_ex_table;
+struct exception_table_entry __amode31_ref *__stop_amode31_ex_table = _stop_amode31_ex_table;
+
+/*
+ * Control registers CR2, CR5 and CR15 are initialized with addresses
+ * of tables that must be placed below 2G which is handled by the AMODE31
+ * sections.
+ * Because the AMODE31 sections are relocated below 2G at startup,
+ * the content of control registers CR2, CR5 and CR15 must be updated
+ * with new addresses after the relocation. The initial initialization of
+ * control registers occurs in head64.S and then gets updated again after AMODE31
+ * relocation. We must access the relevant AMODE31 tables indirectly via
+ * pointers placed in the .amode31.refs linker section. Those pointers get
+ * updated automatically during AMODE31 relocation and always contain a valid
+ * address within AMODE31 sections.
+ */
+
+static __amode31_data u32 __ctl_duct_amode31[16] __aligned(64);
+
+static __amode31_data u64 __ctl_aste_amode31[8] __aligned(64) = {
+ [1] = 0xffffffffffffffff
+};
+
+static __amode31_data u32 __ctl_duald_amode31[32] __aligned(128) = {
+ 0x80000000, 0, 0, 0,
+ 0x80000000, 0, 0, 0,
+ 0x80000000, 0, 0, 0,
+ 0x80000000, 0, 0, 0,
+ 0x80000000, 0, 0, 0,
+ 0x80000000, 0, 0, 0,
+ 0x80000000, 0, 0, 0,
+ 0x80000000, 0, 0, 0
+};
+
+static __amode31_data u32 __ctl_linkage_stack_amode31[8] __aligned(64) = {
+ 0, 0, 0x89000000, 0,
+ 0, 0, 0x8a000000, 0
+};
-unsigned long int_hwcap = 0;
+static u64 __amode31_ref *__ctl_aste = __ctl_aste_amode31;
+static u32 __amode31_ref *__ctl_duald = __ctl_duald_amode31;
+static u32 __amode31_ref *__ctl_linkage_stack = __ctl_linkage_stack_amode31;
+static u32 __amode31_ref *__ctl_duct = __ctl_duct_amode31;
int __bootdata(noexec_disabled);
unsigned long __bootdata(ident_map_size);
struct mem_detect_info __bootdata(mem_detect);
+struct initrd_data __bootdata(initrd_data);
-struct exception_table_entry *__bootdata_preserved(__start_dma_ex_table);
-struct exception_table_entry *__bootdata_preserved(__stop_dma_ex_table);
-unsigned long __bootdata_preserved(__stext_dma);
-unsigned long __bootdata_preserved(__etext_dma);
-unsigned long __bootdata_preserved(__sdma);
-unsigned long __bootdata_preserved(__edma);
unsigned long __bootdata_preserved(__kaslr_offset);
unsigned int __bootdata_preserved(zlib_dfltcc_support);
EXPORT_SYMBOL(zlib_dfltcc_support);
u64 __bootdata_preserved(stfle_fac_list[16]);
EXPORT_SYMBOL(stfle_fac_list);
u64 __bootdata_preserved(alt_stfle_fac_list[16]);
+struct oldmem_data __bootdata_preserved(oldmem_data);
unsigned long VMALLOC_START;
EXPORT_SYMBOL(VMALLOC_START);
{
if (!is_ipl_type_dump())
return;
- if (OLDMEM_BASE)
+ if (oldmem_data.start)
return;
strcat(boot_command_line, " cio_ignore=all,!ipldev,!condev");
console_loglevel = 2;
lc->restart_stack = (unsigned long) restart_stack;
lc->restart_fn = (unsigned long) do_restart;
lc->restart_data = 0;
- lc->restart_source = -1UL;
+ lc->restart_source = -1U;
mcck_stack = (unsigned long)memblock_alloc(THREAD_SIZE, THREAD_SIZE);
if (!mcck_stack)
static void __init setup_lowcore_dat_on(void)
{
+ struct lowcore *lc = lowcore_ptr[0];
+
__ctl_clear_bit(0, 28);
S390_lowcore.external_new_psw.mask |= PSW_MASK_DAT;
S390_lowcore.svc_new_psw.mask |= PSW_MASK_DAT;
S390_lowcore.program_new_psw.mask |= PSW_MASK_DAT;
S390_lowcore.io_new_psw.mask |= PSW_MASK_DAT;
+ __ctl_store(S390_lowcore.cregs_save_area, 0, 15);
__ctl_set_bit(0, 28);
+ mem_assign_absolute(S390_lowcore.restart_flags, RESTART_FLAG_CTLREGS);
+ mem_assign_absolute(S390_lowcore.program_new_psw, lc->program_new_psw);
+ memcpy_absolute(&S390_lowcore.cregs_save_area, lc->cregs_save_area,
+ sizeof(S390_lowcore.cregs_save_area));
}
static struct resource code_resource = {
return;
}
- low = crash_base ?: OLDMEM_BASE;
+ low = crash_base ?: oldmem_data.start;
high = low + crash_size;
- if (low >= OLDMEM_BASE && high <= OLDMEM_BASE + OLDMEM_SIZE) {
+ if (low >= oldmem_data.start && high <= oldmem_data.start + oldmem_data.size) {
/* The crashkernel fits into OLDMEM, reuse OLDMEM */
crash_base = low;
} else {
if (register_memory_notifier(&kdump_mem_nb))
return;
- if (!OLDMEM_BASE && MACHINE_IS_VM)
+ if (!oldmem_data.start && MACHINE_IS_VM)
diag10_range(PFN_DOWN(crash_base), PFN_DOWN(crash_size));
crashk_res.start = crash_base;
crashk_res.end = crash_base + crash_size - 1;
static void __init reserve_initrd(void)
{
#ifdef CONFIG_BLK_DEV_INITRD
- if (!INITRD_START || !INITRD_SIZE)
+ if (!initrd_data.start || !initrd_data.size)
return;
- initrd_start = INITRD_START;
- initrd_end = initrd_start + INITRD_SIZE;
- memblock_reserve(INITRD_START, INITRD_SIZE);
+ initrd_start = initrd_data.start;
+ initrd_end = initrd_start + initrd_data.size;
+ memblock_reserve(initrd_data.start, initrd_data.size);
#endif
}
static void __init check_initrd(void)
{
#ifdef CONFIG_BLK_DEV_INITRD
- if (INITRD_START && INITRD_SIZE &&
- !memblock_is_region_memory(INITRD_START, INITRD_SIZE)) {
+ if (initrd_data.start && initrd_data.size &&
+ !memblock_is_region_memory(initrd_data.start, initrd_data.size)) {
pr_err("The initial RAM disk does not fit into the memory\n");
- memblock_free(INITRD_START, INITRD_SIZE);
+ memblock_free(initrd_data.start, initrd_data.size);
initrd_start = initrd_end = 0;
}
#endif
{
unsigned long start_pfn = PFN_UP(__pa(_end));
- memblock_reserve(0, HEAD_END);
+ memblock_reserve(0, STARTUP_NORMAL_OFFSET);
+ memblock_reserve((unsigned long)sclp_early_sccb, EXT_SCCB_READ_SCP);
memblock_reserve((unsigned long)_stext, PFN_PHYS(start_pfn)
- (unsigned long)_stext);
- memblock_reserve(__sdma, __edma - __sdma);
}
static void __init setup_memory(void)
memblock_enforce_memory_limit(memblock_end_of_DRAM());
}
-/*
- * Setup hardware capabilities.
- */
-static int __init setup_hwcaps(void)
+static void __init relocate_amode31_section(void)
{
- static const int stfl_bits[6] = { 0, 2, 7, 17, 19, 21 };
- struct cpuid cpu_id;
- int i;
-
- /*
- * The store facility list bits numbers as found in the principles
- * of operation are numbered with bit 1UL<<31 as number 0 to
- * bit 1UL<<0 as number 31.
- * Bit 0: instructions named N3, "backported" to esa-mode
- * Bit 2: z/Architecture mode is active
- * Bit 7: the store-facility-list-extended facility is installed
- * Bit 17: the message-security assist is installed
- * Bit 19: the long-displacement facility is installed
- * Bit 21: the extended-immediate facility is installed
- * Bit 22: extended-translation facility 3 is installed
- * Bit 30: extended-translation facility 3 enhancement facility
- * These get translated to:
- * HWCAP_S390_ESAN3 bit 0, HWCAP_S390_ZARCH bit 1,
- * HWCAP_S390_STFLE bit 2, HWCAP_S390_MSA bit 3,
- * HWCAP_S390_LDISP bit 4, HWCAP_S390_EIMM bit 5 and
- * HWCAP_S390_ETF3EH bit 8 (22 && 30).
- */
- for (i = 0; i < 6; i++)
- if (test_facility(stfl_bits[i]))
- elf_hwcap |= 1UL << i;
-
- if (test_facility(22) && test_facility(30))
- elf_hwcap |= HWCAP_S390_ETF3EH;
-
- /*
- * Check for additional facilities with store-facility-list-extended.
- * stfle stores doublewords (8 byte) with bit 1ULL<<63 as bit 0
- * and 1ULL<<0 as bit 63. Bits 0-31 contain the same information
- * as stored by stfl, bits 32-xxx contain additional facilities.
- * How many facility words are stored depends on the number of
- * doublewords passed to the instruction. The additional facilities
- * are:
- * Bit 42: decimal floating point facility is installed
- * Bit 44: perform floating point operation facility is installed
- * translated to:
- * HWCAP_S390_DFP bit 6 (42 && 44).
- */
- if ((elf_hwcap & (1UL << 2)) && test_facility(42) && test_facility(44))
- elf_hwcap |= HWCAP_S390_DFP;
-
- /*
- * Huge page support HWCAP_S390_HPAGE is bit 7.
- */
- if (MACHINE_HAS_EDAT1)
- elf_hwcap |= HWCAP_S390_HPAGE;
-
- /*
- * 64-bit register support for 31-bit processes
- * HWCAP_S390_HIGH_GPRS is bit 9.
- */
- elf_hwcap |= HWCAP_S390_HIGH_GPRS;
-
- /*
- * Transactional execution support HWCAP_S390_TE is bit 10.
- */
- if (MACHINE_HAS_TE)
- elf_hwcap |= HWCAP_S390_TE;
-
- /*
- * Vector extension HWCAP_S390_VXRS is bit 11. The Vector extension
- * can be disabled with the "novx" parameter. Use MACHINE_HAS_VX
- * instead of facility bit 129.
- */
- if (MACHINE_HAS_VX) {
- elf_hwcap |= HWCAP_S390_VXRS;
- if (test_facility(134))
- elf_hwcap |= HWCAP_S390_VXRS_BCD;
- if (test_facility(135))
- elf_hwcap |= HWCAP_S390_VXRS_EXT;
- if (test_facility(148))
- elf_hwcap |= HWCAP_S390_VXRS_EXT2;
- if (test_facility(152))
- elf_hwcap |= HWCAP_S390_VXRS_PDE;
- }
- if (test_facility(150))
- elf_hwcap |= HWCAP_S390_SORT;
- if (test_facility(151))
- elf_hwcap |= HWCAP_S390_DFLT;
-
- /*
- * Guarded storage support HWCAP_S390_GS is bit 12.
- */
- if (MACHINE_HAS_GS)
- elf_hwcap |= HWCAP_S390_GS;
-
- get_cpu_id(&cpu_id);
- add_device_randomness(&cpu_id, sizeof(cpu_id));
- switch (cpu_id.machine) {
- case 0x2064:
- case 0x2066:
- default: /* Use "z900" as default for 64 bit kernels. */
- strcpy(elf_platform, "z900");
- break;
- case 0x2084:
- case 0x2086:
- strcpy(elf_platform, "z990");
- break;
- case 0x2094:
- case 0x2096:
- strcpy(elf_platform, "z9-109");
- break;
- case 0x2097:
- case 0x2098:
- strcpy(elf_platform, "z10");
- break;
- case 0x2817:
- case 0x2818:
- strcpy(elf_platform, "z196");
- break;
- case 0x2827:
- case 0x2828:
- strcpy(elf_platform, "zEC12");
- break;
- case 0x2964:
- case 0x2965:
- strcpy(elf_platform, "z13");
- break;
- case 0x3906:
- case 0x3907:
- strcpy(elf_platform, "z14");
- break;
- case 0x8561:
- case 0x8562:
- strcpy(elf_platform, "z15");
- break;
- }
-
- /*
- * Virtualization support HWCAP_INT_SIE is bit 0.
- */
- if (sclp.has_sief2)
- int_hwcap |= HWCAP_INT_SIE;
+ unsigned long amode31_addr, amode31_size;
+ long amode31_offset;
+ long *ptr;
+
+ /* Allocate a new AMODE31 capable memory region */
+ amode31_size = __eamode31 - __samode31;
+ pr_info("Relocating AMODE31 section of size 0x%08lx\n", amode31_size);
+ amode31_addr = (unsigned long)memblock_alloc_low(amode31_size, PAGE_SIZE);
+ if (!amode31_addr)
+ panic("Failed to allocate memory for AMODE31 section\n");
+ amode31_offset = amode31_addr - __samode31;
+
+ /* Move original AMODE31 section to the new one */
+ memmove((void *)amode31_addr, (void *)__samode31, amode31_size);
+ /* Zero out the old AMODE31 section to catch invalid accesses within it */
+ memset((void *)__samode31, 0, amode31_size);
+
+ /* Update all AMODE31 region references */
+ for (ptr = _start_amode31_refs; ptr != _end_amode31_refs; ptr++)
+ *ptr += amode31_offset;
+}
- return 0;
+/* This must be called after AMODE31 relocation */
+static void __init setup_cr(void)
+{
+ union ctlreg2 cr2;
+ union ctlreg5 cr5;
+ union ctlreg15 cr15;
+
+ __ctl_duct[1] = (unsigned long)__ctl_aste;
+ __ctl_duct[2] = (unsigned long)__ctl_aste;
+ __ctl_duct[4] = (unsigned long)__ctl_duald;
+
+ /* Update control registers CR2, CR5 and CR15 */
+ __ctl_store(cr2.val, 2, 2);
+ __ctl_store(cr5.val, 5, 5);
+ __ctl_store(cr15.val, 15, 15);
+ cr2.ducto = (unsigned long)__ctl_duct >> 6;
+ cr5.pasteo = (unsigned long)__ctl_duct >> 6;
+ cr15.lsea = (unsigned long)__ctl_linkage_stack >> 3;
+ __ctl_load(cr2.val, 2, 2);
+ __ctl_load(cr5.val, 5, 5);
+ __ctl_load(cr15.val, 15, 15);
}
-arch_initcall(setup_hwcaps);
/*
* Add system information as device randomness
free_mem_detect_info();
+ relocate_amode31_section();
+ setup_cr();
+
setup_uv();
setup_memory_end();
setup_memory();
*/
restore_saved_sigmask();
}
-
-void do_notify_resume(struct pt_regs *regs)
-{
- tracehook_notify_resume(regs);
- rseq_handle_notify_resume(NULL, regs);
-}
cpumask_set_cpu(cpu, &init_mm.context.cpu_attach_mask);
cpumask_set_cpu(cpu, mm_cpumask(&init_mm));
lc->cpu_nr = cpu;
+ lc->restart_flags = RESTART_FLAG_CTLREGS;
lc->spinlock_lockval = arch_spin_lockval(cpu);
lc->spinlock_index = 0;
lc->percpu_offset = __per_cpu_offset[cpu];
cpu = pcpu - pcpu_devices;
lc = lowcore_ptr[cpu];
- lc->restart_stack = lc->nodat_stack;
+ lc->restart_stack = lc->kernel_stack;
lc->restart_fn = (unsigned long) func;
lc->restart_data = (unsigned long) data;
- lc->restart_source = -1UL;
+ lc->restart_source = -1U;
pcpu_sigp_retry(pcpu, SIGP_RESTART, 0);
}
func(data); /* should not return */
}
-static void __no_sanitize_address pcpu_delegate(struct pcpu *pcpu,
- pcpu_delegate_fn *func,
- void *data, unsigned long stack)
+static void pcpu_delegate(struct pcpu *pcpu,
+ pcpu_delegate_fn *func,
+ void *data, unsigned long stack)
{
struct lowcore *lc = lowcore_ptr[pcpu - pcpu_devices];
- unsigned long source_cpu = stap();
+ unsigned int source_cpu = stap();
__load_psw_mask(PSW_KERNEL_BITS | PSW_MASK_DAT);
if (pcpu->address == source_cpu) {
__ctl_load(cregs, 0, 15);
}
+static DEFINE_SPINLOCK(ctl_lock);
+static unsigned long ctlreg;
+
/*
* Set a bit in a control register of all cpus
*/
{
struct ec_creg_mask_parms parms = { 1UL << bit, -1UL, cr };
+ spin_lock(&ctl_lock);
+ memcpy_absolute(&ctlreg, &S390_lowcore.cregs_save_area[cr], sizeof(ctlreg));
+ __set_bit(bit, &ctlreg);
+ memcpy_absolute(&S390_lowcore.cregs_save_area[cr], &ctlreg, sizeof(ctlreg));
+ spin_unlock(&ctl_lock);
on_each_cpu(smp_ctl_bit_callback, &parms, 1);
}
EXPORT_SYMBOL(smp_ctl_set_bit);
{
struct ec_creg_mask_parms parms = { 0, ~(1UL << bit), cr };
+ spin_lock(&ctl_lock);
+ memcpy_absolute(&ctlreg, &S390_lowcore.cregs_save_area[cr], sizeof(ctlreg));
+ __clear_bit(bit, &ctlreg);
+ memcpy_absolute(&S390_lowcore.cregs_save_area[cr], &ctlreg, sizeof(ctlreg));
+ spin_unlock(&ctl_lock);
on_each_cpu(smp_ctl_bit_callback, &parms, 1);
}
EXPORT_SYMBOL(smp_ctl_clear_bit);
unsigned long page;
bool is_boot_cpu;
- if (!(OLDMEM_BASE || is_ipl_type_dump()))
+ if (!(oldmem_data.start || is_ipl_type_dump()))
/* No previous system present, normal boot. */
return;
/* Allocate a page as dumping area for the store status sigps */
* these registers an SCLP request is required which is
* done by drivers/s390/char/zcore.c:init_cpu_info()
*/
- if (!is_boot_cpu || OLDMEM_BASE)
+ if (!is_boot_cpu || oldmem_data.start)
/* Get the CPU registers */
smp_save_cpu_regs(sa, addr, is_boot_cpu, page);
}
memblock_free(page, PAGE_SIZE);
- diag_dma_ops.diag308_reset();
+ diag_amode31_ops.diag308_reset();
pcpu_set_smt(0);
}
#endif /* CONFIG_CRASH_DUMP */
u16 core_id;
int nr, i;
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&smp_cpu_state_mutex);
nr = 0;
cpumask_xor(&avail, cpu_possible_mask, cpu_present_mask);
nr += smp_add_core(&info->core[i], &avail, configured, early);
}
mutex_unlock(&smp_cpu_state_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return nr;
}
memblock_free_early((unsigned long)info, sizeof(*info));
}
-static void smp_init_secondary(void)
+/*
+ * Activate a secondary processor.
+ */
+static void smp_start_secondary(void *cpuvoid)
{
int cpu = raw_smp_processor_id();
S390_lowcore.last_update_clock = get_tod_clock();
+ S390_lowcore.restart_stack = (unsigned long)restart_stack;
+ S390_lowcore.restart_fn = (unsigned long)do_restart;
+ S390_lowcore.restart_data = 0;
+ S390_lowcore.restart_source = -1U;
+ S390_lowcore.restart_flags = 0;
restore_access_regs(S390_lowcore.access_regs_save_area);
cpu_init();
rcu_cpu_starting(cpu);
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
}
-/*
- * Activate a secondary processor.
- */
-static void __no_sanitize_address smp_start_secondary(void *cpuvoid)
-{
- S390_lowcore.restart_stack = (unsigned long) restart_stack;
- S390_lowcore.restart_fn = (unsigned long) do_restart;
- S390_lowcore.restart_data = 0;
- S390_lowcore.restart_source = -1UL;
- __ctl_load(S390_lowcore.cregs_save_area, 0, 15);
- __load_psw_mask(PSW_KERNEL_BITS | PSW_MASK_DAT);
- call_on_stack_noreturn(smp_init_secondary, S390_lowcore.kernel_stack);
-}
-
/* Upping and downing of CPUs */
int __cpu_up(unsigned int cpu, struct task_struct *tidle)
{
return -EINVAL;
if (val != 0 && val != 1)
return -EINVAL;
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&smp_cpu_state_mutex);
rc = -EBUSY;
/* disallow configuration changes of online cpus and cpu 0 */
}
out:
mutex_unlock(&smp_cpu_state_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return rc ? rc : count;
}
static DEVICE_ATTR(configure, 0644, cpu_configure_show, cpu_configure_store);
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Code that needs to run below 2 GB.
+ *
+ * Copyright IBM Corp. 2019
+ */
+
+#include <linux/linkage.h>
+#include <asm/errno.h>
+#include <asm/sigp.h>
+
+ .section .amode31.text,"ax"
+/*
+ * Simplified version of expoline thunk. The normal thunks can not be used here,
+ * because they might be more than 2 GB away, and not reachable by the relative
+ * branch. No comdat, exrl, etc. optimizations used here, because it only
+ * affects a few functions that are not performance-relevant.
+ */
+ .macro BR_EX_AMODE31_r14
+ larl %r1,0f
+ ex 0,0(%r1)
+ j .
+0: br %r14
+ .endm
+
+/*
+ * int _diag14_amode31(unsigned long rx, unsigned long ry1, unsigned long subcode)
+ */
+ENTRY(_diag14_amode31)
+ lgr %r1,%r2
+ lgr %r2,%r3
+ lgr %r3,%r4
+ lhi %r5,-EIO
+ sam31
+ diag %r1,%r2,0x14
+.Ldiag14_ex:
+ ipm %r5
+ srl %r5,28
+.Ldiag14_fault:
+ sam64
+ lgfr %r2,%r5
+ BR_EX_AMODE31_r14
+ EX_TABLE_AMODE31(.Ldiag14_ex, .Ldiag14_fault)
+ENDPROC(_diag14_amode31)
+
+/*
+ * int _diag210_amode31(struct diag210 *addr)
+ */
+ENTRY(_diag210_amode31)
+ lgr %r1,%r2
+ lhi %r2,-1
+ sam31
+ diag %r1,%r0,0x210
+.Ldiag210_ex:
+ ipm %r2
+ srl %r2,28
+.Ldiag210_fault:
+ sam64
+ lgfr %r2,%r2
+ BR_EX_AMODE31_r14
+ EX_TABLE_AMODE31(.Ldiag210_ex, .Ldiag210_fault)
+ENDPROC(_diag210_amode31)
+
+/*
+ * int _diag26c_amode31(void *req, void *resp, enum diag26c_sc subcode)
+ */
+ENTRY(_diag26c_amode31)
+ lghi %r5,-EOPNOTSUPP
+ sam31
+ diag %r2,%r4,0x26c
+.Ldiag26c_ex:
+ sam64
+ lgfr %r2,%r5
+ BR_EX_AMODE31_r14
+ EX_TABLE_AMODE31(.Ldiag26c_ex, .Ldiag26c_ex)
+ENDPROC(_diag26c_amode31)
+
+/*
+ * void _diag0c_amode31(struct hypfs_diag0c_entry *entry)
+ */
+ENTRY(_diag0c_amode31)
+ sam31
+ diag %r2,%r2,0x0c
+ sam64
+ BR_EX_AMODE31_r14
+ENDPROC(_diag0c_amode31)
+
+/*
+ * void _diag308_reset_amode31(void)
+ *
+ * Calls diag 308 subcode 1 and continues execution
+ */
+ENTRY(_diag308_reset_amode31)
+ larl %r4,.Lctlregs # Save control registers
+ stctg %c0,%c15,0(%r4)
+ lg %r2,0(%r4) # Disable lowcore protection
+ nilh %r2,0xefff
+ larl %r4,.Lctlreg0
+ stg %r2,0(%r4)
+ lctlg %c0,%c0,0(%r4)
+ larl %r4,.Lfpctl # Floating point control register
+ stfpc 0(%r4)
+ larl %r4,.Lprefix # Save prefix register
+ stpx 0(%r4)
+ larl %r4,.Lprefix_zero # Set prefix register to 0
+ spx 0(%r4)
+ larl %r4,.Lcontinue_psw # Save PSW flags
+ epsw %r2,%r3
+ stm %r2,%r3,0(%r4)
+ larl %r4,.Lrestart_part2 # Setup restart PSW at absolute 0
+ larl %r3,.Lrestart_diag308_psw
+ og %r4,0(%r3) # Save PSW
+ lghi %r3,0
+ sturg %r4,%r3 # Use sturg, because of large pages
+ lghi %r1,1
+ lghi %r0,0
+ diag %r0,%r1,0x308
+.Lrestart_part2:
+ lhi %r0,0 # Load r0 with zero
+ lhi %r1,2 # Use mode 2 = ESAME (dump)
+ sigp %r1,%r0,SIGP_SET_ARCHITECTURE # Switch to ESAME mode
+ sam64 # Switch to 64 bit addressing mode
+ larl %r4,.Lctlregs # Restore control registers
+ lctlg %c0,%c15,0(%r4)
+ larl %r4,.Lfpctl # Restore floating point ctl register
+ lfpc 0(%r4)
+ larl %r4,.Lprefix # Restore prefix register
+ spx 0(%r4)
+ larl %r4,.Lcontinue_psw # Restore PSW flags
+ larl %r2,.Lcontinue
+ stg %r2,8(%r4)
+ lpswe 0(%r4)
+.Lcontinue:
+ BR_EX_AMODE31_r14
+ENDPROC(_diag308_reset_amode31)
+
+ .section .amode31.data,"aw",@progbits
+.align 8
+.Lrestart_diag308_psw:
+ .long 0x00080000,0x80000000
+
+.align 8
+.Lcontinue_psw:
+ .quad 0,0
+
+.align 8
+.Lctlreg0:
+ .quad 0
+.Lctlregs:
+ .rept 16
+ .quad 0
+ .endr
+.Lfpctl:
+ .long 0
+.Lprefix:
+ .long 0
+.Lprefix_zero:
+ .long 0
if (val != 0 && val != 1)
return -EINVAL;
rc = 0;
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&smp_cpu_state_mutex);
if (cpu_management == val)
goto out;
topology_expect_change();
out:
mutex_unlock(&smp_cpu_state_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return rc ? rc : count;
}
static DEVICE_ATTR_RW(dispatching);
void __init trap_init(void)
{
- sort_extable(__start_dma_ex_table, __stop_dma_ex_table);
+ sort_extable(__start_amode31_ex_table, __stop_amode31_ex_table);
local_mcck_enable();
test_monitor_call();
}
{
unsigned long uv_stor_base;
- /*
- * keep these conditions in line with has_uv_sec_stor_limit()
- */
if (!is_prot_virt_host())
return;
- if (is_prot_virt_guest()) {
- prot_virt_host = 0;
- pr_warn("Protected virtualization not available in protected guests.");
- return;
- }
-
- if (!test_facility(158)) {
- prot_virt_host = 0;
- pr_warn("Protected virtualization not supported by the hardware.");
- return;
- }
-
uv_stor_base = (unsigned long)memblock_alloc_try_nid(
uv_info.uv_base_stor_len, SZ_1M, SZ_2G,
MEMBLOCK_ALLOC_ACCESSIBLE, NUMA_NO_NODE);
GCOV_PROFILE := n
UBSAN_SANITIZE := n
KASAN_SANITIZE := n
+KCSAN_SANITIZE := n
# Force dependency (incbin is bad)
$(obj)/vdso32_wrapper.o : $(obj)/vdso32.so
GCOV_PROFILE := n
UBSAN_SANITIZE := n
KASAN_SANITIZE := n
+KCSAN_SANITIZE := n
# Force dependency (incbin is bad)
$(obj)/vdso64_wrapper.o : $(obj)/vdso64.so
#include <asm/thread_info.h>
#include <asm/page.h>
+#include <asm/ftrace.lds.h>
/*
* Put .bss..swapper_pg_dir as the first thing in .bss. This will
KPROBES_TEXT
IRQENTRY_TEXT
SOFTIRQENTRY_TEXT
+ FTRACE_HOTPATCH_TRAMPOLINES_TEXT
*(.text.*_indirect_*)
*(.fixup)
*(.gnu.warning)
RW_DATA(0x100, PAGE_SIZE, THREAD_SIZE)
BOOT_DATA_PRESERVED
+ . = ALIGN(8);
+ .amode31.refs : {
+ _start_amode31_refs = .;
+ *(.amode31.refs)
+ _end_amode31_refs = .;
+ }
+
_edata = .; /* End of data section */
/* will be freed after init */
BOOT_DATA
+ /*
+ * .amode31 section for code, data, ex_table that need to stay
+ * below 2 GB, even when the kernel is relocated above 2 GB.
+ */
+ . = ALIGN(PAGE_SIZE);
+ _samode31 = .;
+ .amode31.text : {
+ _stext_amode31 = .;
+ *(.amode31.text)
+ *(.amode31.text.*_indirect_*)
+ . = ALIGN(PAGE_SIZE);
+ _etext_amode31 = .;
+ }
+ . = ALIGN(16);
+ .amode31.ex_table : {
+ _start_amode31_ex_table = .;
+ KEEP(*(.amode31.ex_table))
+ _stop_amode31_ex_table = .;
+ }
+ . = ALIGN(PAGE_SIZE);
+ .amode31.data : {
+ *(.amode31.data)
+ }
+ . = ALIGN(PAGE_SIZE);
+ _eamode31 = .;
+
/* early.c uses stsi, which requires page aligned data. */
. = ALIGN(PAGE_SIZE);
INIT_DATA_SECTION(0x100)
* Heiko Carstens <heiko.carstens@de.ibm.com>,
*/
-#include <linux/sched.h>
+#include <linux/processor.h>
#include <linux/delay.h>
-#include <linux/timex.h>
-#include <linux/export.h>
-#include <linux/irqflags.h>
-#include <linux/interrupt.h>
-#include <linux/jump_label.h>
-#include <linux/irq.h>
-#include <asm/vtimer.h>
#include <asm/div64.h>
-#include <asm/idle.h>
+#include <asm/timex.h>
void __delay(unsigned long loops)
{
#include <linux/seq_file.h>
#include <linux/debugfs.h>
#include <linux/mm.h>
+#include <linux/kfence.h>
#include <linux/kasan.h>
#include <asm/ptdump.h>
#include <asm/kasan.h>
IDENTITY_BEFORE_END_NR,
KERNEL_START_NR,
KERNEL_END_NR,
+#ifdef CONFIG_KFENCE
+ KFENCE_START_NR,
+ KFENCE_END_NR,
+#endif
IDENTITY_AFTER_NR,
IDENTITY_AFTER_END_NR,
#ifdef CONFIG_KASAN
[IDENTITY_BEFORE_END_NR] = {(unsigned long)_stext, "Identity Mapping End"},
[KERNEL_START_NR] = {(unsigned long)_stext, "Kernel Image Start"},
[KERNEL_END_NR] = {(unsigned long)_end, "Kernel Image End"},
+#ifdef CONFIG_KFENCE
+ [KFENCE_START_NR] = {0, "KFence Pool Start"},
+ [KFENCE_END_NR] = {0, "KFence Pool End"},
+#endif
[IDENTITY_AFTER_NR] = {(unsigned long)_end, "Identity Mapping Start"},
[IDENTITY_AFTER_END_NR] = {0, "Identity Mapping End"},
#ifdef CONFIG_KASAN
static int pt_dump_init(void)
{
+#ifdef CONFIG_KFENCE
+ unsigned long kfence_start = (unsigned long)__kfence_pool;
+#endif
/*
* Figure out the maximum virtual address being accessible with the
* kernel ASCE. We need this to keep the page table walker functions
address_markers[VMEMMAP_END_NR].start_address = (unsigned long)vmemmap + vmemmap_size;
address_markers[VMALLOC_NR].start_address = VMALLOC_START;
address_markers[VMALLOC_END_NR].start_address = VMALLOC_END;
+#ifdef CONFIG_KFENCE
+ address_markers[KFENCE_START_NR].start_address = kfence_start;
+ address_markers[KFENCE_END_NR].start_address = kfence_start + KFENCE_POOL_SIZE;
+#endif
sort_address_markers();
#ifdef CONFIG_PTDUMP_DEBUGFS
debugfs_create_file("kernel_page_tables", 0400, NULL, NULL, &ptdump_fops);
#include <linux/kprobes.h>
#include <linux/uaccess.h>
#include <linux/hugetlb.h>
+#include <linux/kfence.h>
#include <asm/asm-offsets.h>
#include <asm/diag.h>
#include <asm/gmap.h>
{
const struct exception_table_entry *fixup;
- fixup = search_extable(__start_dma_ex_table,
- __stop_dma_ex_table - __start_dma_ex_table,
+ fixup = search_extable(__start_amode31_ex_table,
+ __stop_amode31_ex_table - __start_amode31_ex_table,
addr);
if (!fixup)
fixup = search_exception_tables(addr);
unsigned long address;
unsigned int flags;
vm_fault_t fault;
+ bool is_write;
tsk = current;
/*
mm = tsk->mm;
trans_exc_code = regs->int_parm_long;
+ address = trans_exc_code & __FAIL_ADDR_MASK;
+ is_write = (trans_exc_code & store_indication) == 0x400;
/*
* Verify that the fault happened in user space, that
type = get_fault_type(regs);
switch (type) {
case KERNEL_FAULT:
+ if (kfence_handle_page_fault(address, is_write, regs))
+ return 0;
goto out;
case USER_FAULT:
case GMAP_FAULT:
break;
}
- address = trans_exc_code & __FAIL_ADDR_MASK;
perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
flags = FAULT_FLAG_DEFAULT;
if (user_mode(regs))
flags |= FAULT_FLAG_USER;
- if (access == VM_WRITE || (trans_exc_code & store_indication) == 0x400)
+ if (access == VM_WRITE || is_write)
flags |= FAULT_FLAG_WRITE;
mmap_read_lock(mm);
#include <asm/processor.h>
#include <linux/uaccess.h>
#include <asm/pgalloc.h>
+#include <asm/kfence.h>
#include <asm/ptdump.h>
#include <asm/dma.h>
#include <asm/lowcore.h>
high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
pv_init();
-
+ kfence_split_mapping();
/* Setup guest page hinting */
cmma_init();
sgt_prot &= ~_SEGMENT_ENTRY_NOEXEC;
}
+ /*
+ * The first 1MB of 1:1 mapping is mapped with 4KB pages
+ */
while (address < end) {
pg_dir = pgd_offset_k(address);
if (pgd_none(*pg_dir)) {
pm_dir = pmd_offset(pu_dir, address);
if (pmd_none(*pm_dir)) {
- if (mode == POPULATE_ZERO_SHADOW &&
- IS_ALIGNED(address, PMD_SIZE) &&
+ if (IS_ALIGNED(address, PMD_SIZE) &&
end - address >= PMD_SIZE) {
- pmd_populate(&init_mm, pm_dir,
- kasan_early_shadow_pte);
- address = (address + PMD_SIZE) & PMD_MASK;
- continue;
- }
- /* the first megabyte of 1:1 is mapped with 4k pages */
- if (has_edat && address && end - address >= PMD_SIZE &&
- mode != POPULATE_ZERO_SHADOW) {
- void *page;
-
- if (mode == POPULATE_ONE2ONE) {
- page = (void *)address;
- } else {
- page = kasan_early_alloc_segment();
- memset(page, 0, _SEGMENT_SIZE);
+ if (mode == POPULATE_ZERO_SHADOW) {
+ pmd_populate(&init_mm, pm_dir, kasan_early_shadow_pte);
+ address = (address + PMD_SIZE) & PMD_MASK;
+ continue;
+ } else if (has_edat && address) {
+ void *page;
+
+ if (mode == POPULATE_ONE2ONE) {
+ page = (void *)address;
+ } else {
+ page = kasan_early_alloc_segment();
+ memset(page, 0, _SEGMENT_SIZE);
+ }
+ pmd_val(*pm_dir) = __pa(page) | sgt_prot;
+ address = (address + PMD_SIZE) & PMD_MASK;
+ continue;
}
- pmd_val(*pm_dir) = __pa(page) | sgt_prot;
- address = (address + PMD_SIZE) & PMD_MASK;
- continue;
}
-
pt_dir = kasan_early_pte_alloc();
pmd_populate(&init_mm, pm_dir, pt_dir);
} else if (pmd_large(*pm_dir)) {
pgalloc_low = round_up((unsigned long)_end, _SEGMENT_SIZE);
if (IS_ENABLED(CONFIG_BLK_DEV_INITRD)) {
initrd_end =
- round_up(INITRD_START + INITRD_SIZE, _SEGMENT_SIZE);
+ round_up(initrd_data.start + initrd_data.size, _SEGMENT_SIZE);
pgalloc_low = max(pgalloc_low, initrd_end);
}
void *bounce = (void *) addr;
unsigned long size;
- get_online_cpus();
+ cpus_read_lock();
preempt_disable();
if (is_swapped(addr)) {
size = PAGE_SIZE - (addr & ~PAGE_MASK);
memcpy_absolute(bounce, (void *) addr, size);
}
preempt_enable();
- put_online_cpus();
+ cpus_read_unlock();
return bounce;
}
return;
set_page_stable_dat(page, order);
}
-
-void arch_set_page_nodat(struct page *page, int order)
-{
- if (cmma_flag < 2)
- return;
- set_page_stable_nodat(page, order);
-}
-
-int arch_test_page_nodat(struct page *page)
-{
- unsigned char state;
-
- if (cmma_flag < 2)
- return 0;
- state = get_page_state(page);
- return !!(state & 0x20);
-}
-
-void arch_set_page_states(int make_stable)
-{
- unsigned long flags, order, t;
- struct list_head *l;
- struct page *page;
- struct zone *zone;
-
- if (!cmma_flag)
- return;
- if (make_stable)
- drain_local_pages(NULL);
- for_each_populated_zone(zone) {
- spin_lock_irqsave(&zone->lock, flags);
- for_each_migratetype_order(order, t) {
- list_for_each(l, &zone->free_area[order].free_list[t]) {
- page = list_entry(l, struct page, lru);
- if (make_stable)
- set_page_stable_dat(page, order);
- else
- set_page_unused(page, order);
- }
- }
- spin_unlock_irqrestore(&zone->lock, flags);
- }
-}
#include <asm/cacheflush.h>
#include <asm/facility.h>
#include <asm/pgalloc.h>
+#include <asm/kfence.h>
#include <asm/page.h>
#include <asm/set_memory.h>
{
pte_t *ptep, new;
+ if (flags == SET_MEMORY_4K)
+ return 0;
ptep = pte_offset_kernel(pmdp, addr);
do {
new = *ptep;
unsigned long flags)
{
unsigned long next;
+ int need_split;
pmd_t *pmdp;
int rc = 0;
return -EINVAL;
next = pmd_addr_end(addr, end);
if (pmd_large(*pmdp)) {
- if (addr & ~PMD_MASK || addr + PMD_SIZE > next) {
+ need_split = !!(flags & SET_MEMORY_4K);
+ need_split |= !!(addr & ~PMD_MASK);
+ need_split |= !!(addr + PMD_SIZE > next);
+ if (need_split) {
rc = split_pmd_page(pmdp, addr);
if (rc)
return rc;
unsigned long flags)
{
unsigned long next;
+ int need_split;
pud_t *pudp;
int rc = 0;
return -EINVAL;
next = pud_addr_end(addr, end);
if (pud_large(*pudp)) {
- if (addr & ~PUD_MASK || addr + PUD_SIZE > next) {
+ need_split = !!(flags & SET_MEMORY_4K);
+ need_split |= !!(addr & ~PUD_MASK);
+ need_split |= !!(addr + PUD_SIZE > next);
+ if (need_split) {
rc = split_pud_page(pudp, addr);
if (rc)
break;
return change_page_attr(addr, addr + numpages * PAGE_SIZE, flags);
}
-#ifdef CONFIG_DEBUG_PAGEALLOC
+#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KFENCE)
static void ipte_range(pte_t *pte, unsigned long address, int nr)
{
pte_t *pte;
for (i = 0; i < numpages;) {
- address = page_to_phys(page + i);
+ address = (unsigned long)page_to_virt(page + i);
pte = virt_to_kpte(address);
nr = (unsigned long)pte >> ilog2(sizeof(long));
nr = PTRS_PER_PTE - (nr & (PTRS_PER_PTE - 1));
__set_memory((unsigned long)_sinittext,
(unsigned long)(_einittext - _sinittext) >> PAGE_SHIFT,
SET_MEMORY_RO | SET_MEMORY_X);
- __set_memory(__stext_dma, (__etext_dma - __stext_dma) >> PAGE_SHIFT,
+ __set_memory(__stext_amode31, (__etext_amode31 - __stext_amode31) >> PAGE_SHIFT,
SET_MEMORY_RO | SET_MEMORY_X);
/* we need lowcore executable for our LPSWE instructions */
{
u64 req = ZPCI_CREATE_REQ(zdev->fh, dmaas, ZPCI_MOD_FC_REG_IOAT);
struct zpci_fib fib = {0};
- u8 status;
+ u8 cc, status;
WARN_ON_ONCE(iota & 0x3fff);
fib.pba = base;
fib.pal = limit;
fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
- return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
+ cc = zpci_mod_fc(req, &fib, &status);
+ if (cc)
+ zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
+ return cc;
}
/* Modify PCI: Unregister I/O address translation parameters */
u8 cc, status;
cc = zpci_mod_fc(req, &fib, &status);
- if (cc == 3) /* Function already gone. */
- cc = 0;
- return cc ? -EIO : 0;
+ if (cc)
+ zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
+ return cc;
}
/* Modify PCI: Set PCI function measurement parameters */
int zpci_enable_device(struct zpci_dev *zdev)
{
- int rc;
-
- rc = clp_enable_fh(zdev, ZPCI_NR_DMA_SPACES);
- if (rc)
- goto out;
-
- rc = zpci_dma_init_device(zdev);
- if (rc)
- goto out_dma;
-
- return 0;
+ u32 fh = zdev->fh;
+ int rc = 0;
-out_dma:
- clp_disable_fh(zdev);
-out:
+ if (clp_enable_fh(zdev, &fh, ZPCI_NR_DMA_SPACES))
+ rc = -EIO;
+ else
+ zdev->fh = fh;
return rc;
}
int zpci_disable_device(struct zpci_dev *zdev)
{
- zpci_dma_exit_device(zdev);
- /*
- * The zPCI function may already be disabled by the platform, this is
- * detected in clp_disable_fh() which becomes a no-op.
- */
- return clp_disable_fh(zdev);
+ u32 fh = zdev->fh;
+ int cc, rc = 0;
+
+ cc = clp_disable_fh(zdev, &fh);
+ if (!cc) {
+ zdev->fh = fh;
+ } else if (cc == CLP_RC_SETPCIFN_ALRDY) {
+ pr_info("Disabling PCI function %08x had no effect as it was already disabled\n",
+ zdev->fid);
+ /* Function is already disabled - update handle */
+ rc = clp_refresh_fh(zdev->fid, &fh);
+ if (!rc) {
+ zdev->fh = fh;
+ rc = -EINVAL;
+ }
+ } else {
+ rc = -EIO;
+ }
+ return rc;
}
/**
if (zdev->zbus->bus)
zpci_bus_remove_device(zdev, false);
+ if (zdev->dma_table) {
+ rc = zpci_dma_exit_device(zdev);
+ if (rc)
+ return rc;
+ }
if (zdev_enabled(zdev)) {
rc = zpci_disable_device(zdev);
if (rc)
if (zdev->zbus->bus)
zpci_bus_remove_device(zdev, false);
+ if (zdev->dma_table)
+ zpci_dma_exit_device(zdev);
if (zdev_enabled(zdev))
zpci_disable_device(zdev);
case ZPCI_FN_STATE_STANDBY:
if (zdev->has_hp_slot)
zpci_exit_slot(zdev);
- zpci_cleanup_bus_resources(zdev);
+ if (zdev->has_resources)
+ zpci_cleanup_bus_resources(zdev);
zpci_bus_device_unregister(zdev);
zpci_destroy_iommu(zdev);
fallthrough;
}
static unsigned int s390_pci_probe __initdata = 1;
-static unsigned int s390_pci_no_mio __initdata;
unsigned int s390_pci_force_floating __initdata;
static unsigned int s390_pci_initialized;
return NULL;
}
if (!strcmp(str, "nomio")) {
- s390_pci_no_mio = 1;
+ S390_lowcore.machine_flags &= ~MACHINE_FLAG_PCI_MIO;
return NULL;
}
if (!strcmp(str, "force_floating")) {
return 0;
}
- if (test_facility(153) && !s390_pci_no_mio) {
+ if (MACHINE_HAS_PCI_MIO) {
static_branch_enable(&have_mio);
ctl_set_bit(2, 5);
}
rc = zpci_enable_device(zdev);
if (rc)
return rc;
+ rc = zpci_dma_init_device(zdev);
+ if (rc) {
+ zpci_disable_device(zdev);
+ return rc;
+ }
}
if (!zdev->has_resources) {
{
int rc = -EINVAL;
- zdev->zbus = zbus;
if (zbus->function[zdev->devfn]) {
pr_err("devfn %04x is already assigned\n", zdev->devfn);
return rc;
}
+ zdev->zbus = zbus;
zbus->function[zdev->devfn] = zdev;
zpci_nb_devices++;
error:
zbus->function[zdev->devfn] = NULL;
+ zdev->zbus = NULL;
zpci_nb_devices--;
return rc;
}
return rc;
}
-static int clp_refresh_fh(u32 fid);
-/*
- * Enable/Disable a given PCI function and update its function handle if
- * necessary
+/**
+ * clp_set_pci_fn() - Execute a command on a PCI function
+ * @zdev: Function that will be affected
+ * @fh: Out parameter for updated function handle
+ * @nr_dma_as: DMA address space number
+ * @command: The command code to execute
+ *
+ * Returns: 0 on success, < 0 for Linux errors (e.g. -ENOMEM), and
+ * > 0 for non-success platform responses
*/
-static int clp_set_pci_fn(struct zpci_dev *zdev, u8 nr_dma_as, u8 command)
+static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 command)
{
struct clp_req_rsp_set_pci *rrb;
int rc, retries = 100;
- u32 fid = zdev->fid;
+ *fh = 0;
rrb = clp_alloc_block(GFP_KERNEL);
if (!rrb)
return -ENOMEM;
}
} while (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY);
- if (rc || rrb->response.hdr.rsp != CLP_RC_OK) {
+ if (!rc && rrb->response.hdr.rsp == CLP_RC_OK) {
+ *fh = rrb->response.fh;
+ } else {
zpci_err("Set PCI FN:\n");
zpci_err_clp(rrb->response.hdr.rsp, rc);
- }
-
- if (!rc && rrb->response.hdr.rsp == CLP_RC_OK) {
- zdev->fh = rrb->response.fh;
- } else if (!rc && rrb->response.hdr.rsp == CLP_RC_SETPCIFN_ALRDY &&
- rrb->response.fh == 0) {
- /* Function is already in desired state - update handle */
- rc = clp_refresh_fh(fid);
+ if (!rc)
+ rc = rrb->response.hdr.rsp;
}
clp_free_block(rrb);
return rc;
return rc;
}
-int clp_enable_fh(struct zpci_dev *zdev, u8 nr_dma_as)
+int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as)
{
int rc;
- rc = clp_set_pci_fn(zdev, nr_dma_as, CLP_SET_ENABLE_PCI_FN);
- zpci_dbg(3, "ena fid:%x, fh:%x, rc:%d\n", zdev->fid, zdev->fh, rc);
- if (rc)
- goto out;
-
- if (zpci_use_mio(zdev)) {
- rc = clp_set_pci_fn(zdev, nr_dma_as, CLP_SET_ENABLE_MIO);
+ rc = clp_set_pci_fn(zdev, fh, nr_dma_as, CLP_SET_ENABLE_PCI_FN);
+ zpci_dbg(3, "ena fid:%x, fh:%x, rc:%d\n", zdev->fid, *fh, rc);
+ if (!rc && zpci_use_mio(zdev)) {
+ rc = clp_set_pci_fn(zdev, fh, nr_dma_as, CLP_SET_ENABLE_MIO);
zpci_dbg(3, "ena mio fid:%x, fh:%x, rc:%d\n",
- zdev->fid, zdev->fh, rc);
+ zdev->fid, *fh, rc);
if (rc)
- clp_disable_fh(zdev);
+ clp_disable_fh(zdev, fh);
}
-out:
return rc;
}
-int clp_disable_fh(struct zpci_dev *zdev)
+int clp_disable_fh(struct zpci_dev *zdev, u32 *fh)
{
int rc;
if (!zdev_enabled(zdev))
return 0;
- rc = clp_set_pci_fn(zdev, 0, CLP_SET_DISABLE_PCI_FN);
- zpci_dbg(3, "dis fid:%x, fh:%x, rc:%d\n", zdev->fid, zdev->fh, rc);
+ rc = clp_set_pci_fn(zdev, fh, 0, CLP_SET_DISABLE_PCI_FN);
+ zpci_dbg(3, "dis fid:%x, fh:%x, rc:%d\n", zdev->fid, *fh, rc);
+ return rc;
+}
+
+static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
+ u64 *resume_token, int *nentries)
+{
+ int rc;
+
+ memset(rrb, 0, sizeof(*rrb));
+ rrb->request.hdr.len = sizeof(rrb->request);
+ rrb->request.hdr.cmd = CLP_LIST_PCI;
+ /* store as many entries as possible */
+ rrb->response.hdr.len = CLP_BLK_SIZE - LIST_PCI_HDR_LEN;
+ rrb->request.resume_token = *resume_token;
+
+ /* Get PCI function handle list */
+ rc = clp_req(rrb, CLP_LPS_PCI);
+ if (rc || rrb->response.hdr.rsp != CLP_RC_OK) {
+ zpci_err("List PCI FN:\n");
+ zpci_err_clp(rrb->response.hdr.rsp, rc);
+ return -EIO;
+ }
+
+ update_uid_checking(rrb->response.uid_checking);
+ WARN_ON_ONCE(rrb->response.entry_size !=
+ sizeof(struct clp_fh_list_entry));
+
+ *nentries = (rrb->response.hdr.len - LIST_PCI_HDR_LEN) /
+ rrb->response.entry_size;
+ *resume_token = rrb->response.resume_token;
+
return rc;
}
void (*cb)(struct clp_fh_list_entry *, void *))
{
u64 resume_token = 0;
- int entries, i, rc;
+ int nentries, i, rc;
do {
- memset(rrb, 0, sizeof(*rrb));
- rrb->request.hdr.len = sizeof(rrb->request);
- rrb->request.hdr.cmd = CLP_LIST_PCI;
- /* store as many entries as possible */
- rrb->response.hdr.len = CLP_BLK_SIZE - LIST_PCI_HDR_LEN;
- rrb->request.resume_token = resume_token;
-
- /* Get PCI function handle list */
- rc = clp_req(rrb, CLP_LPS_PCI);
- if (rc || rrb->response.hdr.rsp != CLP_RC_OK) {
- zpci_err("List PCI FN:\n");
- zpci_err_clp(rrb->response.hdr.rsp, rc);
- rc = -EIO;
- goto out;
- }
+ rc = clp_list_pci_req(rrb, &resume_token, &nentries);
+ if (rc)
+ return rc;
+ for (i = 0; i < nentries; i++)
+ cb(&rrb->response.fh_list[i], data);
+ } while (resume_token);
- update_uid_checking(rrb->response.uid_checking);
- WARN_ON_ONCE(rrb->response.entry_size !=
- sizeof(struct clp_fh_list_entry));
+ return rc;
+}
- entries = (rrb->response.hdr.len - LIST_PCI_HDR_LEN) /
- rrb->response.entry_size;
+static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
+ struct clp_fh_list_entry *entry)
+{
+ struct clp_fh_list_entry *fh_list;
+ u64 resume_token = 0;
+ int nentries, i, rc;
- resume_token = rrb->response.resume_token;
- for (i = 0; i < entries; i++)
- cb(&rrb->response.fh_list[i], data);
+ do {
+ rc = clp_list_pci_req(rrb, &resume_token, &nentries);
+ if (rc)
+ return rc;
+ for (i = 0; i < nentries; i++) {
+ fh_list = rrb->response.fh_list;
+ if (fh_list[i].fid == fid) {
+ *entry = fh_list[i];
+ return 0;
+ }
+ }
} while (resume_token);
-out:
- return rc;
+
+ return -ENODEV;
}
static void __clp_add(struct clp_fh_list_entry *entry, void *data)
return rc;
}
-static void __clp_refresh_fh(struct clp_fh_list_entry *entry, void *data)
-{
- struct zpci_dev *zdev;
- u32 fid = *((u32 *)data);
-
- if (!entry->vendor_id || fid != entry->fid)
- return;
-
- zdev = get_zdev_by_fid(fid);
- if (!zdev)
- return;
-
- zdev->fh = entry->fh;
-}
-
/*
- * Refresh the function handle of the function matching @fid
+ * Get the current function handle of the function matching @fid
*/
-static int clp_refresh_fh(u32 fid)
+int clp_refresh_fh(u32 fid, u32 *fh)
{
struct clp_req_rsp_list_pci *rrb;
+ struct clp_fh_list_entry entry;
int rc;
rrb = clp_alloc_block(GFP_NOWAIT);
if (!rrb)
return -ENOMEM;
- rc = clp_list_pci(rrb, &fid, __clp_refresh_fh);
+ rc = clp_find_pci(rrb, fid, &entry);
+ if (!rc)
+ *fh = entry.fh;
clp_free_block(rrb);
return rc;
}
-struct clp_state_data {
- u32 fid;
- enum zpci_state state;
-};
-
-static void __clp_get_state(struct clp_fh_list_entry *entry, void *data)
-{
- struct clp_state_data *sd = data;
-
- if (entry->fid != sd->fid)
- return;
-
- sd->state = entry->config_state;
-}
-
int clp_get_state(u32 fid, enum zpci_state *state)
{
struct clp_req_rsp_list_pci *rrb;
- struct clp_state_data sd = {fid, ZPCI_FN_STATE_RESERVED};
+ struct clp_fh_list_entry entry;
int rc;
+ *state = ZPCI_FN_STATE_RESERVED;
rrb = clp_alloc_block(GFP_ATOMIC);
if (!rrb)
return -ENOMEM;
- rc = clp_list_pci(rrb, &sd, __clp_get_state);
+ rc = clp_find_pci(rrb, fid, &entry);
if (!rc)
- *state = sd.state;
+ *state = entry.config_state;
clp_free_block(rrb);
return rc;
}
}
- rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
- (u64) zdev->dma_table);
- if (rc)
+ if (zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+ (u64)zdev->dma_table)) {
+ rc = -EIO;
goto free_bitmap;
+ }
return 0;
free_bitmap:
return rc;
}
-void zpci_dma_exit_device(struct zpci_dev *zdev)
+int zpci_dma_exit_device(struct zpci_dev *zdev)
{
+ int cc = 0;
+
/*
* At this point, if the device is part of an IOMMU domain, this would
* be a strong hint towards a bug in the IOMMU API (common) code and/or
* simultaneous access via IOMMU and DMA API. So let's issue a warning.
*/
WARN_ON(zdev->s390_domain);
-
- if (zpci_unregister_ioat(zdev, 0))
- return;
+ if (zdev_enabled(zdev))
+ cc = zpci_unregister_ioat(zdev, 0);
+ /*
+ * cc == 3 indicates the function is gone already. This can happen
+ * if the function was deconfigured/disabled suddenly and we have not
+ * received a new handle yet.
+ */
+ if (cc && cc != 3)
+ return -EIO;
dma_cleanup_tables(zdev->dma_table);
zdev->dma_table = NULL;
zdev->iommu_bitmap = NULL;
vfree(zdev->lazy_bitmap);
zdev->lazy_bitmap = NULL;
-
zdev->next_bit = 0;
+ return 0;
}
static int __init dma_alloc_cpu_table_caches(void)
/* Even though the device is already gone we still
* need to free zPCI resources as part of the disable.
*/
- zpci_disable_device(zdev);
+ if (zdev->dma_table)
+ zpci_dma_exit_device(zdev);
+ if (zdev_enabled(zdev))
+ zpci_disable_device(zdev);
zdev->state = ZPCI_FN_STATE_STANDBY;
}
for_each_pci_msi_entry(msi, pdev) {
if (!msi->irq)
continue;
- if (msi->msi_attrib.is_msix)
- __pci_msix_desc_mask_irq(msi, 1);
- else
- __pci_msi_desc_mask_irq(msi, 1, 1);
irq_set_msi_desc(msi->irq, NULL);
irq_free_desc(msi->irq);
msi->msg.address_lo = 0;
pci_lock_rescan_remove();
if (pci_dev_is_added(pdev)) {
pci_stop_and_remove_bus_device(pdev);
- ret = zpci_disable_device(zdev);
- if (ret)
- goto out;
+ if (zdev->dma_table) {
+ ret = zpci_dma_exit_device(zdev);
+ if (ret)
+ goto out;
+ }
+
+ if (zdev_enabled(zdev)) {
+ ret = zpci_disable_device(zdev);
+ if (ret)
+ goto out;
+ }
ret = zpci_enable_device(zdev);
if (ret)
goto out;
+ ret = zpci_dma_init_device(zdev);
+ if (ret) {
+ zpci_disable_device(zdev);
+ goto out;
+ }
pci_rescan_bus(zdev->zbus->bus);
}
out:
GCOV_PROFILE := n
UBSAN_SANITIZE := n
KASAN_SANITIZE := n
+KCSAN_SANITIZE := n
KBUILD_CFLAGS := -fno-strict-aliasing -Wall -Wstrict-prototypes
KBUILD_CFLAGS += -Wno-pointer-sign -Wno-sign-compare
ae sigp RS_RRRD
af mc SI_URD
b1 lra RX_RRRD
+b200 lbear S_RD
+b201 stbear S_RD
b202 stidp S_RD
b204 sck S_RD
b205 stck S_RD
b938 sortl RRE_RR
b939 dfltcc RRF_R0RR2
b93a kdsa RRE_RR
+b93b nnpa RRE_00
b93c ppno RRE_RR
b93e kimd RRE_RR
b93f klmd RRE_RR
b988 alcgr RRE_RR
b989 slbgr RRE_RR
b98a cspg RRE_RR
+b98b rdp RRF_RURR2
b98d epsw RRE_RR
b98e idte RRF_RURR2
b98f crdte RRF_RURR2
e63f vstrlr VRS_RRDV
e649 vlip VRI_V0UU2
e650 vcvb VRR_RV0UU
+e651 vclzdp VRR_VV0U2
e652 vcvbg VRR_RV0UU
+e654 vupkzh VRR_VV0U2
+e655 vcnf VRR_VV0UU2
+e656 vclfnh VRR_VV0UU2
e658 vcvd VRI_VR0UU
e659 vsrp VRI_VVUUU2
e65a vcvdg VRI_VR0UU
e65b vpsop VRI_VVUUU2
+e65c vupkzl VRR_VV0U2
+e65d vcfn VRR_VV0UU2
+e65e vclfnl VRR_VV0UU2
e65f vtp VRR_0V
+e670 vpkzr VRI_VVV0UU2
e671 vap VRI_VVV0UU2
+e672 vsrpr VRI_VVV0UU2
e673 vsp VRI_VVV0UU2
+e674 vschp VRR_VVV0U0U
+e675 vcrnf VRR_VVV0UU
e677 vcp VRR_0VV0U
e678 vmp VRI_VVV0UU2
e679 vmsp VRI_VVV0UU2
e67a vdp VRI_VVV0UU2
e67b vrp VRI_VVV0UU2
+e67c vscshp VRR_VVV
+e67d vcsph VRR_VVV0U0
e67e vsdp VRI_VVV0UU2
e700 vleb VRX_VRRDU
e701 vleh VRX_VRRDU
eb62 mric RSY_RDRU
eb6a asi SIY_IRD
eb6e alsi SIY_IRD
+eb71 lpswey SIY_URD
eb7a agsi SIY_IRD
eb7e algsi SIY_IRD
eb80 icmh RSY_RURD
mask = ioread16(se7343_irq_regs + PA_CPLD_ST_REG);
for_each_set_bit(bit, &mask, SE7343_FPGA_IRQ_NR)
- generic_handle_irq(irq_linear_revmap(se7343_irq_domain, bit));
+ generic_handle_domain_irq(se7343_irq_domain, bit);
chip->irq_unmask(data);
}
mask = ioread16(se7722_irq_regs + IRQ01_STS_REG);
for_each_set_bit(bit, &mask, SE7722_FPGA_IRQ_NR)
- generic_handle_irq(irq_linear_revmap(se7722_irq_domain, bit));
+ generic_handle_domain_irq(se7722_irq_domain, bit);
chip->irq_unmask(data);
}
mask = __raw_readw(KEYDETR);
for_each_set_bit(pin, &mask, NR_BASEBOARD_GPIOS)
- generic_handle_irq(irq_linear_revmap(x3proto_irq_domain, pin));
+ generic_handle_domain_irq(x3proto_irq_domain, pin);
chip->irq_unmask(data);
}
rq_for_each_segment(bvec, req, iter) {
BUG_ON(i >= io_req->desc_cnt);
- io_req->io_desc[i].buffer =
- page_address(bvec.bv_page) + bvec.bv_offset;
+ io_req->io_desc[i].buffer = bvec_virt(&bvec);
io_req->io_desc[i].length = bvec.bv_len;
i++;
}
select ARCH_WANT_HUGE_PMD_SHARE
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANTS_THP_SWAP if X86_64
+ select ARCH_HAS_PARANOID_L1D_FLUSH
select BUILDTIME_TABLE_SORT
select CLKEVT_I8253
select CLOCKSOURCE_VALIDATE_LAST_CYCLE
REALMODE_CFLAGS += -ffreestanding
REALMODE_CFLAGS += -fno-stack-protector
-REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS), -Wno-address-of-packed-member)
-REALMODE_CFLAGS += $(call __cc-option, $(CC), $(REALMODE_CFLAGS), $(cc_stack_align4))
+REALMODE_CFLAGS += -Wno-address-of-packed-member
+REALMODE_CFLAGS += $(cc_stack_align4)
REALMODE_CFLAGS += $(CLANG_FLAGS)
export REALMODE_CFLAGS
#
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
#
-KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
-KBUILD_CFLAGS += $(call cc-option,-mno-avx,)
+KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
# Intel CET isn't enabled in the kernel
KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
UTS_MACHINE := i386
CHECKFLAGS += -D__i386__
- biarch := $(call cc-option,-m32)
- KBUILD_AFLAGS += $(biarch)
- KBUILD_CFLAGS += $(biarch)
+ KBUILD_AFLAGS += -m32
+ KBUILD_CFLAGS += -m32
KBUILD_CFLAGS += -msoft-float -mregparm=3 -freg-struct-return
# Align the stack to the register width instead of using the default
# alignment of 16 bytes. This reduces stack usage and the number of
# alignment instructions.
- KBUILD_CFLAGS += $(call cc-option,$(cc_stack_align4))
+ KBUILD_CFLAGS += $(cc_stack_align4)
# CPU-specific tuning. Anything which can be shared with UML should go here.
include arch/x86/Makefile_32.cpu
UTS_MACHINE := x86_64
CHECKFLAGS += -D__x86_64__
- biarch := -m64
KBUILD_AFLAGS += -m64
KBUILD_CFLAGS += -m64
KBUILD_CFLAGS += $(call cc-option,-falign-loops=1)
# Don't autogenerate traditional x87 instructions
- KBUILD_CFLAGS += $(call cc-option,-mno-80387)
+ KBUILD_CFLAGS += -mno-80387
KBUILD_CFLAGS += $(call cc-option,-mno-fp-ret-in-387)
# By default gcc and clang use a stack alignment of 16 bytes for x86.
# default alignment which keep the stack *mis*aligned.
# Furthermore an alignment to the register width reduces stack usage
# and the number of alignment instructions.
- KBUILD_CFLAGS += $(call cc-option,$(cc_stack_align8))
+ KBUILD_CFLAGS += $(cc_stack_align8)
# Use -mskip-rax-setup if supported.
KBUILD_CFLAGS += $(call cc-option,-mskip-rax-setup)
# FIXME - should be integrated in Makefile.cpu (Makefile_32.cpu)
- cflags-$(CONFIG_MK8) += $(call cc-option,-march=k8)
- cflags-$(CONFIG_MPSC) += $(call cc-option,-march=nocona)
-
- cflags-$(CONFIG_MCORE2) += \
- $(call cc-option,-march=core2,$(call cc-option,-mtune=generic))
- cflags-$(CONFIG_MATOM) += $(call cc-option,-march=atom) \
- $(call cc-option,-mtune=atom,$(call cc-option,-mtune=generic))
- cflags-$(CONFIG_GENERIC_CPU) += $(call cc-option,-mtune=generic)
+ cflags-$(CONFIG_MK8) += -march=k8
+ cflags-$(CONFIG_MPSC) += -march=nocona
+ cflags-$(CONFIG_MCORE2) += -march=core2
+ cflags-$(CONFIG_MATOM) += -march=atom
+ cflags-$(CONFIG_GENERIC_CPU) += -mtune=generic
KBUILD_CFLAGS += $(cflags-y)
KBUILD_CFLAGS += -mno-red-zone
ifdef CONFIG_FUNCTION_GRAPH_TRACER
ifndef CONFIG_HAVE_FENTRY
ACCUMULATE_OUTGOING_ARGS := 1
- else
- ifeq ($(call cc-option-yn, -mfentry), n)
- ACCUMULATE_OUTGOING_ARGS := 1
-
- # GCC ignores '-maccumulate-outgoing-args' when used with '-Os'.
- # If '-Os' is enabled, disable it and print a warning.
- ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
- undefine CONFIG_CC_OPTIMIZE_FOR_SIZE
- $(warning Disabling CONFIG_CC_OPTIMIZE_FOR_SIZE. Your compiler does not have -mfentry so you cannot optimize for size with CONFIG_FUNCTION_GRAPH_TRACER.)
- endif
-
- endif
endif
endif
# only been fixed starting from gcc stable version 8.4.0 and
# onwards, but not for older ones. See gcc bug #86952.
ifndef CONFIG_CC_IS_CLANG
- KBUILD_CFLAGS += $(call cc-option,-fno-jump-tables)
+ KBUILD_CFLAGS += -fno-jump-tables
endif
endif
$(BOOT_TARGETS): vmlinux
$(Q)$(MAKE) $(build)=$(boot) $@
-PHONY += install bzlilo
-install bzlilo:
- $(Q)$(MAKE) $(build)=$(boot) $@
+PHONY += install
+install:
+ $(CONFIG_SHELL) $(srctree)/$(boot)/install.sh $(KERNELRELEASE) \
+ $(KBUILD_IMAGE) System.map "$(INSTALL_PATH)"
PHONY += vdso_install
vdso_install:
cmd_genimage = $(BASH) $(srctree)/$(src)/genimage.sh $2 $3 $(obj)/bzImage \
$(obj)/mtools.conf '$(FDARGS)' $(FDINITRD)
-PHONY += bzdisk fdimage fdimage144 fdimage288 hdimage isoimage install
+PHONY += bzdisk fdimage fdimage144 fdimage288 hdimage isoimage
# This requires write access to /dev/fd0
# All images require syslinux to be installed; hdimage also requires
isoimage: $(imgdeps)
$(call cmd,genimage,isoimage,$(obj)/image.iso)
@$(kecho) 'Kernel: $(obj)/image.iso is ready'
-
-install:
- $(CONFIG_SHELL) $(srctree)/$(src)/install.sh \
- $(KERNELRELEASE) $(obj)/bzImage \
- System.map "$(INSTALL_PATH)"
* Early support for invoking 32-bit EFI services from a 64-bit kernel.
*
* Because this thunking occurs before ExitBootServices() we have to
- * restore the firmware's 32-bit GDT before we make EFI service calls,
- * since the firmware's 32-bit IDT is still currently installed and it
- * needs to be able to service interrupts.
+ * restore the firmware's 32-bit GDT and IDT before we make EFI service
+ * calls.
*
* On the plus side, we don't have to worry about mangling 64-bit
* addresses into 32-bits because we're executing with an identity
/*
* Convert x86-64 ABI params to i386 ABI
*/
- subq $32, %rsp
+ subq $64, %rsp
movl %esi, 0x0(%rsp)
movl %edx, 0x4(%rsp)
movl %ecx, 0x8(%rsp)
leaq 0x14(%rsp), %rbx
sgdt (%rbx)
+ addq $16, %rbx
+ sidt (%rbx)
+
/*
- * Switch to gdt with 32-bit segments. This is the firmware GDT
- * that was installed when the kernel started executing. This
- * pointer was saved at the EFI stub entry point in head_64.S.
+ * Switch to IDT and GDT with 32-bit segments. This is the firmware GDT
+ * and IDT that was installed when the kernel started executing. The
+ * pointers were saved at the EFI stub entry point in head_64.S.
*
* Pass the saved DS selector to the 32-bit code, and use far return to
* restore the saved CS selector.
*/
+ leaq efi32_boot_idt(%rip), %rax
+ lidt (%rax)
leaq efi32_boot_gdt(%rip), %rax
lgdt (%rax)
pushq %rax
lretq
-1: addq $32, %rsp
+1: addq $64, %rsp
movq %rdi, %rax
pop %rbx
/*
* Some firmware will return with interrupts enabled. Be sure to
- * disable them before we switch GDTs.
+ * disable them before we switch GDTs and IDTs.
*/
cli
+ lidtl (%ebx)
+ subl $16, %ebx
+
lgdtl (%ebx)
movl %cr4, %eax
.quad 0
SYM_DATA_END(efi32_boot_gdt)
+SYM_DATA_START(efi32_boot_idt)
+ .word 0
+ .quad 0
+SYM_DATA_END(efi32_boot_idt)
+
SYM_DATA_START(efi32_boot_cs)
.word 0
SYM_DATA_END(efi32_boot_cs)
movw %cs, rva(efi32_boot_cs)(%ebp)
movw %ds, rva(efi32_boot_ds)(%ebp)
+ /* Store firmware IDT descriptor */
+ sidtl rva(efi32_boot_idt)(%ebp)
+
/* Disable paging */
movl %cr0, %eax
btrl $X86_CR0_PG_BIT, %eax
if (slot_area_index == MAX_SLOT_AREA) {
debug_putstr("Aborted e820/efi memmap scan when walking immovable regions(slot_areas full)!\n");
- return 1;
+ return true;
}
}
#endif
obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o
+obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) += sm4-aesni-avx-x86_64.o
+sm4-aesni-avx-x86_64-y := sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o
+
+obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX2_X86_64) += sm4-aesni-avx2-x86_64.o
+sm4-aesni-avx2-x86_64-y := sm4-aesni-avx2-asm_64.o sm4_aesni_avx2_glue.o
+
quiet_cmd_perlasm = PERLASM $@
cmd_perlasm = $(PERL) $< > $@
$(obj)/%.S: $(src)/%.pl FORCE
return -EINVAL;
err = skcipher_walk_virt(&walk, req, false);
+ if (!walk.nbytes)
+ return err;
if (unlikely(tail > 0 && walk.nbytes < walk.total)) {
int blocks = DIV_ROUND_UP(req->cryptlen, AES_BLOCK_SIZE) - 2;
skcipher_request_set_crypt(&subreq, req->src, req->dst,
blocks * AES_BLOCK_SIZE, req->iv);
req = &subreq;
+
err = skcipher_walk_virt(&walk, req, false);
+ if (err)
+ return err;
} else {
tail = 0;
}
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * SM4 Cipher Algorithm, AES-NI/AVX optimized.
+ * as specified in
+ * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html
+ *
+ * Copyright (C) 2018 Markku-Juhani O. Saarinen <mjos@iki.fi>
+ * Copyright (C) 2020 Jussi Kivilinna <jussi.kivilinna@iki.fi>
+ * Copyright (c) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+/* Based on SM4 AES-NI work by libgcrypt and Markku-Juhani O. Saarinen at:
+ * https://github.com/mjosaarinen/sm4ni
+ */
+
+#include <linux/linkage.h>
+#include <asm/frame.h>
+
+#define rRIP (%rip)
+
+#define RX0 %xmm0
+#define RX1 %xmm1
+#define MASK_4BIT %xmm2
+#define RTMP0 %xmm3
+#define RTMP1 %xmm4
+#define RTMP2 %xmm5
+#define RTMP3 %xmm6
+#define RTMP4 %xmm7
+
+#define RA0 %xmm8
+#define RA1 %xmm9
+#define RA2 %xmm10
+#define RA3 %xmm11
+
+#define RB0 %xmm12
+#define RB1 %xmm13
+#define RB2 %xmm14
+#define RB3 %xmm15
+
+#define RNOT %xmm0
+#define RBSWAP %xmm1
+
+
+/* Transpose four 32-bit words between 128-bit vectors. */
+#define transpose_4x4(x0, x1, x2, x3, t1, t2) \
+ vpunpckhdq x1, x0, t2; \
+ vpunpckldq x1, x0, x0; \
+ \
+ vpunpckldq x3, x2, t1; \
+ vpunpckhdq x3, x2, x2; \
+ \
+ vpunpckhqdq t1, x0, x1; \
+ vpunpcklqdq t1, x0, x0; \
+ \
+ vpunpckhqdq x2, t2, x3; \
+ vpunpcklqdq x2, t2, x2;
+
+/* pre-SubByte transform. */
+#define transform_pre(x, lo_t, hi_t, mask4bit, tmp0) \
+ vpand x, mask4bit, tmp0; \
+ vpandn x, mask4bit, x; \
+ vpsrld $4, x, x; \
+ \
+ vpshufb tmp0, lo_t, tmp0; \
+ vpshufb x, hi_t, x; \
+ vpxor tmp0, x, x;
+
+/* post-SubByte transform. Note: x has been XOR'ed with mask4bit by
+ * 'vaeslastenc' instruction.
+ */
+#define transform_post(x, lo_t, hi_t, mask4bit, tmp0) \
+ vpandn mask4bit, x, tmp0; \
+ vpsrld $4, x, x; \
+ vpand x, mask4bit, x; \
+ \
+ vpshufb tmp0, lo_t, tmp0; \
+ vpshufb x, hi_t, x; \
+ vpxor tmp0, x, x;
+
+
+.section .rodata.cst164, "aM", @progbits, 164
+.align 16
+
+/*
+ * Following four affine transform look-up tables are from work by
+ * Markku-Juhani O. Saarinen, at https://github.com/mjosaarinen/sm4ni
+ *
+ * These allow exposing SM4 S-Box from AES SubByte.
+ */
+
+/* pre-SubByte affine transform, from SM4 field to AES field. */
+.Lpre_tf_lo_s:
+ .quad 0x9197E2E474720701, 0xC7C1B4B222245157
+.Lpre_tf_hi_s:
+ .quad 0xE240AB09EB49A200, 0xF052B91BF95BB012
+
+/* post-SubByte affine transform, from AES field to SM4 field. */
+.Lpost_tf_lo_s:
+ .quad 0x5B67F2CEA19D0834, 0xEDD14478172BBE82
+.Lpost_tf_hi_s:
+ .quad 0xAE7201DD73AFDC00, 0x11CDBE62CC1063BF
+
+/* For isolating SubBytes from AESENCLAST, inverse shift row */
+.Linv_shift_row:
+ .byte 0x00, 0x0d, 0x0a, 0x07, 0x04, 0x01, 0x0e, 0x0b
+ .byte 0x08, 0x05, 0x02, 0x0f, 0x0c, 0x09, 0x06, 0x03
+
+/* Inverse shift row + Rotate left by 8 bits on 32-bit words with vpshufb */
+.Linv_shift_row_rol_8:
+ .byte 0x07, 0x00, 0x0d, 0x0a, 0x0b, 0x04, 0x01, 0x0e
+ .byte 0x0f, 0x08, 0x05, 0x02, 0x03, 0x0c, 0x09, 0x06
+
+/* Inverse shift row + Rotate left by 16 bits on 32-bit words with vpshufb */
+.Linv_shift_row_rol_16:
+ .byte 0x0a, 0x07, 0x00, 0x0d, 0x0e, 0x0b, 0x04, 0x01
+ .byte 0x02, 0x0f, 0x08, 0x05, 0x06, 0x03, 0x0c, 0x09
+
+/* Inverse shift row + Rotate left by 24 bits on 32-bit words with vpshufb */
+.Linv_shift_row_rol_24:
+ .byte 0x0d, 0x0a, 0x07, 0x00, 0x01, 0x0e, 0x0b, 0x04
+ .byte 0x05, 0x02, 0x0f, 0x08, 0x09, 0x06, 0x03, 0x0c
+
+/* For CTR-mode IV byteswap */
+.Lbswap128_mask:
+ .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
+
+/* For input word byte-swap */
+.Lbswap32_mask:
+ .byte 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
+
+.align 4
+/* 4-bit mask */
+.L0f0f0f0f:
+ .long 0x0f0f0f0f
+
+
+.text
+.align 16
+
+/*
+ * void sm4_aesni_avx_crypt4(const u32 *rk, u8 *dst,
+ * const u8 *src, int nblocks)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx_crypt4)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (1..4 blocks)
+ * %rdx: src (1..4 blocks)
+ * %rcx: num blocks (1..4)
+ */
+ FRAME_BEGIN
+
+ vmovdqu 0*16(%rdx), RA0;
+ vmovdqa RA0, RA1;
+ vmovdqa RA0, RA2;
+ vmovdqa RA0, RA3;
+ cmpq $2, %rcx;
+ jb .Lblk4_load_input_done;
+ vmovdqu 1*16(%rdx), RA1;
+ je .Lblk4_load_input_done;
+ vmovdqu 2*16(%rdx), RA2;
+ cmpq $3, %rcx;
+ je .Lblk4_load_input_done;
+ vmovdqu 3*16(%rdx), RA3;
+
+.Lblk4_load_input_done:
+
+ vmovdqa .Lbswap32_mask rRIP, RTMP2;
+ vpshufb RTMP2, RA0, RA0;
+ vpshufb RTMP2, RA1, RA1;
+ vpshufb RTMP2, RA2, RA2;
+ vpshufb RTMP2, RA3, RA3;
+
+ vbroadcastss .L0f0f0f0f rRIP, MASK_4BIT;
+ vmovdqa .Lpre_tf_lo_s rRIP, RTMP4;
+ vmovdqa .Lpre_tf_hi_s rRIP, RB0;
+ vmovdqa .Lpost_tf_lo_s rRIP, RB1;
+ vmovdqa .Lpost_tf_hi_s rRIP, RB2;
+ vmovdqa .Linv_shift_row rRIP, RB3;
+ vmovdqa .Linv_shift_row_rol_8 rRIP, RTMP2;
+ vmovdqa .Linv_shift_row_rol_16 rRIP, RTMP3;
+ transpose_4x4(RA0, RA1, RA2, RA3, RTMP0, RTMP1);
+
+#define ROUND(round, s0, s1, s2, s3) \
+ vbroadcastss (4*(round))(%rdi), RX0; \
+ vpxor s1, RX0, RX0; \
+ vpxor s2, RX0, RX0; \
+ vpxor s3, RX0, RX0; /* s1 ^ s2 ^ s3 ^ rk */ \
+ \
+ /* sbox, non-linear part */ \
+ transform_pre(RX0, RTMP4, RB0, MASK_4BIT, RTMP0); \
+ vaesenclast MASK_4BIT, RX0, RX0; \
+ transform_post(RX0, RB1, RB2, MASK_4BIT, RTMP0); \
+ \
+ /* linear part */ \
+ vpshufb RB3, RX0, RTMP0; \
+ vpxor RTMP0, s0, s0; /* s0 ^ x */ \
+ vpshufb RTMP2, RX0, RTMP1; \
+ vpxor RTMP1, RTMP0, RTMP0; /* x ^ rol(x,8) */ \
+ vpshufb RTMP3, RX0, RTMP1; \
+ vpxor RTMP1, RTMP0, RTMP0; /* x ^ rol(x,8) ^ rol(x,16) */ \
+ vpshufb .Linv_shift_row_rol_24 rRIP, RX0, RTMP1; \
+ vpxor RTMP1, s0, s0; /* s0 ^ x ^ rol(x,24) */ \
+ vpslld $2, RTMP0, RTMP1; \
+ vpsrld $30, RTMP0, RTMP0; \
+ vpxor RTMP0, s0, s0; \
+ /* s0 ^ x ^ rol(x,2) ^ rol(x,10) ^ rol(x,18) ^ rol(x,24) */ \
+ vpxor RTMP1, s0, s0;
+
+ leaq (32*4)(%rdi), %rax;
+.align 16
+.Lroundloop_blk4:
+ ROUND(0, RA0, RA1, RA2, RA3);
+ ROUND(1, RA1, RA2, RA3, RA0);
+ ROUND(2, RA2, RA3, RA0, RA1);
+ ROUND(3, RA3, RA0, RA1, RA2);
+ leaq (4*4)(%rdi), %rdi;
+ cmpq %rax, %rdi;
+ jne .Lroundloop_blk4;
+
+#undef ROUND
+
+ vmovdqa .Lbswap128_mask rRIP, RTMP2;
+
+ transpose_4x4(RA0, RA1, RA2, RA3, RTMP0, RTMP1);
+ vpshufb RTMP2, RA0, RA0;
+ vpshufb RTMP2, RA1, RA1;
+ vpshufb RTMP2, RA2, RA2;
+ vpshufb RTMP2, RA3, RA3;
+
+ vmovdqu RA0, 0*16(%rsi);
+ cmpq $2, %rcx;
+ jb .Lblk4_store_output_done;
+ vmovdqu RA1, 1*16(%rsi);
+ je .Lblk4_store_output_done;
+ vmovdqu RA2, 2*16(%rsi);
+ cmpq $3, %rcx;
+ je .Lblk4_store_output_done;
+ vmovdqu RA3, 3*16(%rsi);
+
+.Lblk4_store_output_done:
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx_crypt4)
+
+.align 8
+SYM_FUNC_START_LOCAL(__sm4_crypt_blk8)
+ /* input:
+ * %rdi: round key array, CTX
+ * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel
+ * plaintext blocks
+ * output:
+ * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: eight parallel
+ * ciphertext blocks
+ */
+ FRAME_BEGIN
+
+ vmovdqa .Lbswap32_mask rRIP, RTMP2;
+ vpshufb RTMP2, RA0, RA0;
+ vpshufb RTMP2, RA1, RA1;
+ vpshufb RTMP2, RA2, RA2;
+ vpshufb RTMP2, RA3, RA3;
+ vpshufb RTMP2, RB0, RB0;
+ vpshufb RTMP2, RB1, RB1;
+ vpshufb RTMP2, RB2, RB2;
+ vpshufb RTMP2, RB3, RB3;
+
+ vbroadcastss .L0f0f0f0f rRIP, MASK_4BIT;
+ transpose_4x4(RA0, RA1, RA2, RA3, RTMP0, RTMP1);
+ transpose_4x4(RB0, RB1, RB2, RB3, RTMP0, RTMP1);
+
+#define ROUND(round, s0, s1, s2, s3, r0, r1, r2, r3) \
+ vbroadcastss (4*(round))(%rdi), RX0; \
+ vmovdqa .Lpre_tf_lo_s rRIP, RTMP4; \
+ vmovdqa .Lpre_tf_hi_s rRIP, RTMP1; \
+ vmovdqa RX0, RX1; \
+ vpxor s1, RX0, RX0; \
+ vpxor s2, RX0, RX0; \
+ vpxor s3, RX0, RX0; /* s1 ^ s2 ^ s3 ^ rk */ \
+ vmovdqa .Lpost_tf_lo_s rRIP, RTMP2; \
+ vmovdqa .Lpost_tf_hi_s rRIP, RTMP3; \
+ vpxor r1, RX1, RX1; \
+ vpxor r2, RX1, RX1; \
+ vpxor r3, RX1, RX1; /* r1 ^ r2 ^ r3 ^ rk */ \
+ \
+ /* sbox, non-linear part */ \
+ transform_pre(RX0, RTMP4, RTMP1, MASK_4BIT, RTMP0); \
+ transform_pre(RX1, RTMP4, RTMP1, MASK_4BIT, RTMP0); \
+ vmovdqa .Linv_shift_row rRIP, RTMP4; \
+ vaesenclast MASK_4BIT, RX0, RX0; \
+ vaesenclast MASK_4BIT, RX1, RX1; \
+ transform_post(RX0, RTMP2, RTMP3, MASK_4BIT, RTMP0); \
+ transform_post(RX1, RTMP2, RTMP3, MASK_4BIT, RTMP0); \
+ \
+ /* linear part */ \
+ vpshufb RTMP4, RX0, RTMP0; \
+ vpxor RTMP0, s0, s0; /* s0 ^ x */ \
+ vpshufb RTMP4, RX1, RTMP2; \
+ vmovdqa .Linv_shift_row_rol_8 rRIP, RTMP4; \
+ vpxor RTMP2, r0, r0; /* r0 ^ x */ \
+ vpshufb RTMP4, RX0, RTMP1; \
+ vpxor RTMP1, RTMP0, RTMP0; /* x ^ rol(x,8) */ \
+ vpshufb RTMP4, RX1, RTMP3; \
+ vmovdqa .Linv_shift_row_rol_16 rRIP, RTMP4; \
+ vpxor RTMP3, RTMP2, RTMP2; /* x ^ rol(x,8) */ \
+ vpshufb RTMP4, RX0, RTMP1; \
+ vpxor RTMP1, RTMP0, RTMP0; /* x ^ rol(x,8) ^ rol(x,16) */ \
+ vpshufb RTMP4, RX1, RTMP3; \
+ vmovdqa .Linv_shift_row_rol_24 rRIP, RTMP4; \
+ vpxor RTMP3, RTMP2, RTMP2; /* x ^ rol(x,8) ^ rol(x,16) */ \
+ vpshufb RTMP4, RX0, RTMP1; \
+ vpxor RTMP1, s0, s0; /* s0 ^ x ^ rol(x,24) */ \
+ /* s0 ^ x ^ rol(x,2) ^ rol(x,10) ^ rol(x,18) ^ rol(x,24) */ \
+ vpslld $2, RTMP0, RTMP1; \
+ vpsrld $30, RTMP0, RTMP0; \
+ vpxor RTMP0, s0, s0; \
+ vpxor RTMP1, s0, s0; \
+ vpshufb RTMP4, RX1, RTMP3; \
+ vpxor RTMP3, r0, r0; /* r0 ^ x ^ rol(x,24) */ \
+ /* r0 ^ x ^ rol(x,2) ^ rol(x,10) ^ rol(x,18) ^ rol(x,24) */ \
+ vpslld $2, RTMP2, RTMP3; \
+ vpsrld $30, RTMP2, RTMP2; \
+ vpxor RTMP2, r0, r0; \
+ vpxor RTMP3, r0, r0;
+
+ leaq (32*4)(%rdi), %rax;
+.align 16
+.Lroundloop_blk8:
+ ROUND(0, RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3);
+ ROUND(1, RA1, RA2, RA3, RA0, RB1, RB2, RB3, RB0);
+ ROUND(2, RA2, RA3, RA0, RA1, RB2, RB3, RB0, RB1);
+ ROUND(3, RA3, RA0, RA1, RA2, RB3, RB0, RB1, RB2);
+ leaq (4*4)(%rdi), %rdi;
+ cmpq %rax, %rdi;
+ jne .Lroundloop_blk8;
+
+#undef ROUND
+
+ vmovdqa .Lbswap128_mask rRIP, RTMP2;
+
+ transpose_4x4(RA0, RA1, RA2, RA3, RTMP0, RTMP1);
+ transpose_4x4(RB0, RB1, RB2, RB3, RTMP0, RTMP1);
+ vpshufb RTMP2, RA0, RA0;
+ vpshufb RTMP2, RA1, RA1;
+ vpshufb RTMP2, RA2, RA2;
+ vpshufb RTMP2, RA3, RA3;
+ vpshufb RTMP2, RB0, RB0;
+ vpshufb RTMP2, RB1, RB1;
+ vpshufb RTMP2, RB2, RB2;
+ vpshufb RTMP2, RB3, RB3;
+
+ FRAME_END
+ ret;
+SYM_FUNC_END(__sm4_crypt_blk8)
+
+/*
+ * void sm4_aesni_avx_crypt8(const u32 *rk, u8 *dst,
+ * const u8 *src, int nblocks)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx_crypt8)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (1..8 blocks)
+ * %rdx: src (1..8 blocks)
+ * %rcx: num blocks (1..8)
+ */
+ FRAME_BEGIN
+
+ cmpq $5, %rcx;
+ jb sm4_aesni_avx_crypt4;
+ vmovdqu (0 * 16)(%rdx), RA0;
+ vmovdqu (1 * 16)(%rdx), RA1;
+ vmovdqu (2 * 16)(%rdx), RA2;
+ vmovdqu (3 * 16)(%rdx), RA3;
+ vmovdqu (4 * 16)(%rdx), RB0;
+ vmovdqa RB0, RB1;
+ vmovdqa RB0, RB2;
+ vmovdqa RB0, RB3;
+ je .Lblk8_load_input_done;
+ vmovdqu (5 * 16)(%rdx), RB1;
+ cmpq $7, %rcx;
+ jb .Lblk8_load_input_done;
+ vmovdqu (6 * 16)(%rdx), RB2;
+ je .Lblk8_load_input_done;
+ vmovdqu (7 * 16)(%rdx), RB3;
+
+.Lblk8_load_input_done:
+ call __sm4_crypt_blk8;
+
+ cmpq $6, %rcx;
+ vmovdqu RA0, (0 * 16)(%rsi);
+ vmovdqu RA1, (1 * 16)(%rsi);
+ vmovdqu RA2, (2 * 16)(%rsi);
+ vmovdqu RA3, (3 * 16)(%rsi);
+ vmovdqu RB0, (4 * 16)(%rsi);
+ jb .Lblk8_store_output_done;
+ vmovdqu RB1, (5 * 16)(%rsi);
+ je .Lblk8_store_output_done;
+ vmovdqu RB2, (6 * 16)(%rsi);
+ cmpq $7, %rcx;
+ je .Lblk8_store_output_done;
+ vmovdqu RB3, (7 * 16)(%rsi);
+
+.Lblk8_store_output_done:
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx_crypt8)
+
+/*
+ * void sm4_aesni_avx_ctr_enc_blk8(const u32 *rk, u8 *dst,
+ * const u8 *src, u8 *iv)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx_ctr_enc_blk8)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (8 blocks)
+ * %rdx: src (8 blocks)
+ * %rcx: iv (big endian, 128bit)
+ */
+ FRAME_BEGIN
+
+ /* load IV and byteswap */
+ vmovdqu (%rcx), RA0;
+
+ vmovdqa .Lbswap128_mask rRIP, RBSWAP;
+ vpshufb RBSWAP, RA0, RTMP0; /* be => le */
+
+ vpcmpeqd RNOT, RNOT, RNOT;
+ vpsrldq $8, RNOT, RNOT; /* low: -1, high: 0 */
+
+#define inc_le128(x, minus_one, tmp) \
+ vpcmpeqq minus_one, x, tmp; \
+ vpsubq minus_one, x, x; \
+ vpslldq $8, tmp, tmp; \
+ vpsubq tmp, x, x;
+
+ /* construct IVs */
+ inc_le128(RTMP0, RNOT, RTMP2); /* +1 */
+ vpshufb RBSWAP, RTMP0, RA1;
+ inc_le128(RTMP0, RNOT, RTMP2); /* +2 */
+ vpshufb RBSWAP, RTMP0, RA2;
+ inc_le128(RTMP0, RNOT, RTMP2); /* +3 */
+ vpshufb RBSWAP, RTMP0, RA3;
+ inc_le128(RTMP0, RNOT, RTMP2); /* +4 */
+ vpshufb RBSWAP, RTMP0, RB0;
+ inc_le128(RTMP0, RNOT, RTMP2); /* +5 */
+ vpshufb RBSWAP, RTMP0, RB1;
+ inc_le128(RTMP0, RNOT, RTMP2); /* +6 */
+ vpshufb RBSWAP, RTMP0, RB2;
+ inc_le128(RTMP0, RNOT, RTMP2); /* +7 */
+ vpshufb RBSWAP, RTMP0, RB3;
+ inc_le128(RTMP0, RNOT, RTMP2); /* +8 */
+ vpshufb RBSWAP, RTMP0, RTMP1;
+
+ /* store new IV */
+ vmovdqu RTMP1, (%rcx);
+
+ call __sm4_crypt_blk8;
+
+ vpxor (0 * 16)(%rdx), RA0, RA0;
+ vpxor (1 * 16)(%rdx), RA1, RA1;
+ vpxor (2 * 16)(%rdx), RA2, RA2;
+ vpxor (3 * 16)(%rdx), RA3, RA3;
+ vpxor (4 * 16)(%rdx), RB0, RB0;
+ vpxor (5 * 16)(%rdx), RB1, RB1;
+ vpxor (6 * 16)(%rdx), RB2, RB2;
+ vpxor (7 * 16)(%rdx), RB3, RB3;
+
+ vmovdqu RA0, (0 * 16)(%rsi);
+ vmovdqu RA1, (1 * 16)(%rsi);
+ vmovdqu RA2, (2 * 16)(%rsi);
+ vmovdqu RA3, (3 * 16)(%rsi);
+ vmovdqu RB0, (4 * 16)(%rsi);
+ vmovdqu RB1, (5 * 16)(%rsi);
+ vmovdqu RB2, (6 * 16)(%rsi);
+ vmovdqu RB3, (7 * 16)(%rsi);
+
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx_ctr_enc_blk8)
+
+/*
+ * void sm4_aesni_avx_cbc_dec_blk8(const u32 *rk, u8 *dst,
+ * const u8 *src, u8 *iv)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx_cbc_dec_blk8)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (8 blocks)
+ * %rdx: src (8 blocks)
+ * %rcx: iv
+ */
+ FRAME_BEGIN
+
+ vmovdqu (0 * 16)(%rdx), RA0;
+ vmovdqu (1 * 16)(%rdx), RA1;
+ vmovdqu (2 * 16)(%rdx), RA2;
+ vmovdqu (3 * 16)(%rdx), RA3;
+ vmovdqu (4 * 16)(%rdx), RB0;
+ vmovdqu (5 * 16)(%rdx), RB1;
+ vmovdqu (6 * 16)(%rdx), RB2;
+ vmovdqu (7 * 16)(%rdx), RB3;
+
+ call __sm4_crypt_blk8;
+
+ vmovdqu (7 * 16)(%rdx), RNOT;
+ vpxor (%rcx), RA0, RA0;
+ vpxor (0 * 16)(%rdx), RA1, RA1;
+ vpxor (1 * 16)(%rdx), RA2, RA2;
+ vpxor (2 * 16)(%rdx), RA3, RA3;
+ vpxor (3 * 16)(%rdx), RB0, RB0;
+ vpxor (4 * 16)(%rdx), RB1, RB1;
+ vpxor (5 * 16)(%rdx), RB2, RB2;
+ vpxor (6 * 16)(%rdx), RB3, RB3;
+ vmovdqu RNOT, (%rcx); /* store new IV */
+
+ vmovdqu RA0, (0 * 16)(%rsi);
+ vmovdqu RA1, (1 * 16)(%rsi);
+ vmovdqu RA2, (2 * 16)(%rsi);
+ vmovdqu RA3, (3 * 16)(%rsi);
+ vmovdqu RB0, (4 * 16)(%rsi);
+ vmovdqu RB1, (5 * 16)(%rsi);
+ vmovdqu RB2, (6 * 16)(%rsi);
+ vmovdqu RB3, (7 * 16)(%rsi);
+
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx_cbc_dec_blk8)
+
+/*
+ * void sm4_aesni_avx_cfb_dec_blk8(const u32 *rk, u8 *dst,
+ * const u8 *src, u8 *iv)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx_cfb_dec_blk8)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (8 blocks)
+ * %rdx: src (8 blocks)
+ * %rcx: iv
+ */
+ FRAME_BEGIN
+
+ /* Load input */
+ vmovdqu (%rcx), RA0;
+ vmovdqu 0 * 16(%rdx), RA1;
+ vmovdqu 1 * 16(%rdx), RA2;
+ vmovdqu 2 * 16(%rdx), RA3;
+ vmovdqu 3 * 16(%rdx), RB0;
+ vmovdqu 4 * 16(%rdx), RB1;
+ vmovdqu 5 * 16(%rdx), RB2;
+ vmovdqu 6 * 16(%rdx), RB3;
+
+ /* Update IV */
+ vmovdqu 7 * 16(%rdx), RNOT;
+ vmovdqu RNOT, (%rcx);
+
+ call __sm4_crypt_blk8;
+
+ vpxor (0 * 16)(%rdx), RA0, RA0;
+ vpxor (1 * 16)(%rdx), RA1, RA1;
+ vpxor (2 * 16)(%rdx), RA2, RA2;
+ vpxor (3 * 16)(%rdx), RA3, RA3;
+ vpxor (4 * 16)(%rdx), RB0, RB0;
+ vpxor (5 * 16)(%rdx), RB1, RB1;
+ vpxor (6 * 16)(%rdx), RB2, RB2;
+ vpxor (7 * 16)(%rdx), RB3, RB3;
+
+ vmovdqu RA0, (0 * 16)(%rsi);
+ vmovdqu RA1, (1 * 16)(%rsi);
+ vmovdqu RA2, (2 * 16)(%rsi);
+ vmovdqu RA3, (3 * 16)(%rsi);
+ vmovdqu RB0, (4 * 16)(%rsi);
+ vmovdqu RB1, (5 * 16)(%rsi);
+ vmovdqu RB2, (6 * 16)(%rsi);
+ vmovdqu RB3, (7 * 16)(%rsi);
+
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx_cfb_dec_blk8)
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * SM4 Cipher Algorithm, AES-NI/AVX2 optimized.
+ * as specified in
+ * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html
+ *
+ * Copyright (C) 2018 Markku-Juhani O. Saarinen <mjos@iki.fi>
+ * Copyright (C) 2020 Jussi Kivilinna <jussi.kivilinna@iki.fi>
+ * Copyright (c) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+/* Based on SM4 AES-NI work by libgcrypt and Markku-Juhani O. Saarinen at:
+ * https://github.com/mjosaarinen/sm4ni
+ */
+
+#include <linux/linkage.h>
+#include <asm/frame.h>
+
+#define rRIP (%rip)
+
+/* vector registers */
+#define RX0 %ymm0
+#define RX1 %ymm1
+#define MASK_4BIT %ymm2
+#define RTMP0 %ymm3
+#define RTMP1 %ymm4
+#define RTMP2 %ymm5
+#define RTMP3 %ymm6
+#define RTMP4 %ymm7
+
+#define RA0 %ymm8
+#define RA1 %ymm9
+#define RA2 %ymm10
+#define RA3 %ymm11
+
+#define RB0 %ymm12
+#define RB1 %ymm13
+#define RB2 %ymm14
+#define RB3 %ymm15
+
+#define RNOT %ymm0
+#define RBSWAP %ymm1
+
+#define RX0x %xmm0
+#define RX1x %xmm1
+#define MASK_4BITx %xmm2
+
+#define RNOTx %xmm0
+#define RBSWAPx %xmm1
+
+#define RTMP0x %xmm3
+#define RTMP1x %xmm4
+#define RTMP2x %xmm5
+#define RTMP3x %xmm6
+#define RTMP4x %xmm7
+
+
+/* helper macros */
+
+/* Transpose four 32-bit words between 128-bit vector lanes. */
+#define transpose_4x4(x0, x1, x2, x3, t1, t2) \
+ vpunpckhdq x1, x0, t2; \
+ vpunpckldq x1, x0, x0; \
+ \
+ vpunpckldq x3, x2, t1; \
+ vpunpckhdq x3, x2, x2; \
+ \
+ vpunpckhqdq t1, x0, x1; \
+ vpunpcklqdq t1, x0, x0; \
+ \
+ vpunpckhqdq x2, t2, x3; \
+ vpunpcklqdq x2, t2, x2;
+
+/* post-SubByte transform. */
+#define transform_pre(x, lo_t, hi_t, mask4bit, tmp0) \
+ vpand x, mask4bit, tmp0; \
+ vpandn x, mask4bit, x; \
+ vpsrld $4, x, x; \
+ \
+ vpshufb tmp0, lo_t, tmp0; \
+ vpshufb x, hi_t, x; \
+ vpxor tmp0, x, x;
+
+/* post-SubByte transform. Note: x has been XOR'ed with mask4bit by
+ * 'vaeslastenc' instruction. */
+#define transform_post(x, lo_t, hi_t, mask4bit, tmp0) \
+ vpandn mask4bit, x, tmp0; \
+ vpsrld $4, x, x; \
+ vpand x, mask4bit, x; \
+ \
+ vpshufb tmp0, lo_t, tmp0; \
+ vpshufb x, hi_t, x; \
+ vpxor tmp0, x, x;
+
+
+.section .rodata.cst164, "aM", @progbits, 164
+.align 16
+
+/*
+ * Following four affine transform look-up tables are from work by
+ * Markku-Juhani O. Saarinen, at https://github.com/mjosaarinen/sm4ni
+ *
+ * These allow exposing SM4 S-Box from AES SubByte.
+ */
+
+/* pre-SubByte affine transform, from SM4 field to AES field. */
+.Lpre_tf_lo_s:
+ .quad 0x9197E2E474720701, 0xC7C1B4B222245157
+.Lpre_tf_hi_s:
+ .quad 0xE240AB09EB49A200, 0xF052B91BF95BB012
+
+/* post-SubByte affine transform, from AES field to SM4 field. */
+.Lpost_tf_lo_s:
+ .quad 0x5B67F2CEA19D0834, 0xEDD14478172BBE82
+.Lpost_tf_hi_s:
+ .quad 0xAE7201DD73AFDC00, 0x11CDBE62CC1063BF
+
+/* For isolating SubBytes from AESENCLAST, inverse shift row */
+.Linv_shift_row:
+ .byte 0x00, 0x0d, 0x0a, 0x07, 0x04, 0x01, 0x0e, 0x0b
+ .byte 0x08, 0x05, 0x02, 0x0f, 0x0c, 0x09, 0x06, 0x03
+
+/* Inverse shift row + Rotate left by 8 bits on 32-bit words with vpshufb */
+.Linv_shift_row_rol_8:
+ .byte 0x07, 0x00, 0x0d, 0x0a, 0x0b, 0x04, 0x01, 0x0e
+ .byte 0x0f, 0x08, 0x05, 0x02, 0x03, 0x0c, 0x09, 0x06
+
+/* Inverse shift row + Rotate left by 16 bits on 32-bit words with vpshufb */
+.Linv_shift_row_rol_16:
+ .byte 0x0a, 0x07, 0x00, 0x0d, 0x0e, 0x0b, 0x04, 0x01
+ .byte 0x02, 0x0f, 0x08, 0x05, 0x06, 0x03, 0x0c, 0x09
+
+/* Inverse shift row + Rotate left by 24 bits on 32-bit words with vpshufb */
+.Linv_shift_row_rol_24:
+ .byte 0x0d, 0x0a, 0x07, 0x00, 0x01, 0x0e, 0x0b, 0x04
+ .byte 0x05, 0x02, 0x0f, 0x08, 0x09, 0x06, 0x03, 0x0c
+
+/* For CTR-mode IV byteswap */
+.Lbswap128_mask:
+ .byte 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0
+
+/* For input word byte-swap */
+.Lbswap32_mask:
+ .byte 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
+
+.align 4
+/* 4-bit mask */
+.L0f0f0f0f:
+ .long 0x0f0f0f0f
+
+.text
+.align 16
+
+.align 8
+SYM_FUNC_START_LOCAL(__sm4_crypt_blk16)
+ /* input:
+ * %rdi: round key array, CTX
+ * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: sixteen parallel
+ * plaintext blocks
+ * output:
+ * RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3: sixteen parallel
+ * ciphertext blocks
+ */
+ FRAME_BEGIN
+
+ vbroadcasti128 .Lbswap32_mask rRIP, RTMP2;
+ vpshufb RTMP2, RA0, RA0;
+ vpshufb RTMP2, RA1, RA1;
+ vpshufb RTMP2, RA2, RA2;
+ vpshufb RTMP2, RA3, RA3;
+ vpshufb RTMP2, RB0, RB0;
+ vpshufb RTMP2, RB1, RB1;
+ vpshufb RTMP2, RB2, RB2;
+ vpshufb RTMP2, RB3, RB3;
+
+ vpbroadcastd .L0f0f0f0f rRIP, MASK_4BIT;
+ transpose_4x4(RA0, RA1, RA2, RA3, RTMP0, RTMP1);
+ transpose_4x4(RB0, RB1, RB2, RB3, RTMP0, RTMP1);
+
+#define ROUND(round, s0, s1, s2, s3, r0, r1, r2, r3) \
+ vpbroadcastd (4*(round))(%rdi), RX0; \
+ vbroadcasti128 .Lpre_tf_lo_s rRIP, RTMP4; \
+ vbroadcasti128 .Lpre_tf_hi_s rRIP, RTMP1; \
+ vmovdqa RX0, RX1; \
+ vpxor s1, RX0, RX0; \
+ vpxor s2, RX0, RX0; \
+ vpxor s3, RX0, RX0; /* s1 ^ s2 ^ s3 ^ rk */ \
+ vbroadcasti128 .Lpost_tf_lo_s rRIP, RTMP2; \
+ vbroadcasti128 .Lpost_tf_hi_s rRIP, RTMP3; \
+ vpxor r1, RX1, RX1; \
+ vpxor r2, RX1, RX1; \
+ vpxor r3, RX1, RX1; /* r1 ^ r2 ^ r3 ^ rk */ \
+ \
+ /* sbox, non-linear part */ \
+ transform_pre(RX0, RTMP4, RTMP1, MASK_4BIT, RTMP0); \
+ transform_pre(RX1, RTMP4, RTMP1, MASK_4BIT, RTMP0); \
+ vextracti128 $1, RX0, RTMP4x; \
+ vextracti128 $1, RX1, RTMP0x; \
+ vaesenclast MASK_4BITx, RX0x, RX0x; \
+ vaesenclast MASK_4BITx, RTMP4x, RTMP4x; \
+ vaesenclast MASK_4BITx, RX1x, RX1x; \
+ vaesenclast MASK_4BITx, RTMP0x, RTMP0x; \
+ vinserti128 $1, RTMP4x, RX0, RX0; \
+ vbroadcasti128 .Linv_shift_row rRIP, RTMP4; \
+ vinserti128 $1, RTMP0x, RX1, RX1; \
+ transform_post(RX0, RTMP2, RTMP3, MASK_4BIT, RTMP0); \
+ transform_post(RX1, RTMP2, RTMP3, MASK_4BIT, RTMP0); \
+ \
+ /* linear part */ \
+ vpshufb RTMP4, RX0, RTMP0; \
+ vpxor RTMP0, s0, s0; /* s0 ^ x */ \
+ vpshufb RTMP4, RX1, RTMP2; \
+ vbroadcasti128 .Linv_shift_row_rol_8 rRIP, RTMP4; \
+ vpxor RTMP2, r0, r0; /* r0 ^ x */ \
+ vpshufb RTMP4, RX0, RTMP1; \
+ vpxor RTMP1, RTMP0, RTMP0; /* x ^ rol(x,8) */ \
+ vpshufb RTMP4, RX1, RTMP3; \
+ vbroadcasti128 .Linv_shift_row_rol_16 rRIP, RTMP4; \
+ vpxor RTMP3, RTMP2, RTMP2; /* x ^ rol(x,8) */ \
+ vpshufb RTMP4, RX0, RTMP1; \
+ vpxor RTMP1, RTMP0, RTMP0; /* x ^ rol(x,8) ^ rol(x,16) */ \
+ vpshufb RTMP4, RX1, RTMP3; \
+ vbroadcasti128 .Linv_shift_row_rol_24 rRIP, RTMP4; \
+ vpxor RTMP3, RTMP2, RTMP2; /* x ^ rol(x,8) ^ rol(x,16) */ \
+ vpshufb RTMP4, RX0, RTMP1; \
+ vpxor RTMP1, s0, s0; /* s0 ^ x ^ rol(x,24) */ \
+ vpslld $2, RTMP0, RTMP1; \
+ vpsrld $30, RTMP0, RTMP0; \
+ vpxor RTMP0, s0, s0; \
+ /* s0 ^ x ^ rol(x,2) ^ rol(x,10) ^ rol(x,18) ^ rol(x,24) */ \
+ vpxor RTMP1, s0, s0; \
+ vpshufb RTMP4, RX1, RTMP3; \
+ vpxor RTMP3, r0, r0; /* r0 ^ x ^ rol(x,24) */ \
+ vpslld $2, RTMP2, RTMP3; \
+ vpsrld $30, RTMP2, RTMP2; \
+ vpxor RTMP2, r0, r0; \
+ /* r0 ^ x ^ rol(x,2) ^ rol(x,10) ^ rol(x,18) ^ rol(x,24) */ \
+ vpxor RTMP3, r0, r0;
+
+ leaq (32*4)(%rdi), %rax;
+.align 16
+.Lroundloop_blk8:
+ ROUND(0, RA0, RA1, RA2, RA3, RB0, RB1, RB2, RB3);
+ ROUND(1, RA1, RA2, RA3, RA0, RB1, RB2, RB3, RB0);
+ ROUND(2, RA2, RA3, RA0, RA1, RB2, RB3, RB0, RB1);
+ ROUND(3, RA3, RA0, RA1, RA2, RB3, RB0, RB1, RB2);
+ leaq (4*4)(%rdi), %rdi;
+ cmpq %rax, %rdi;
+ jne .Lroundloop_blk8;
+
+#undef ROUND
+
+ vbroadcasti128 .Lbswap128_mask rRIP, RTMP2;
+
+ transpose_4x4(RA0, RA1, RA2, RA3, RTMP0, RTMP1);
+ transpose_4x4(RB0, RB1, RB2, RB3, RTMP0, RTMP1);
+ vpshufb RTMP2, RA0, RA0;
+ vpshufb RTMP2, RA1, RA1;
+ vpshufb RTMP2, RA2, RA2;
+ vpshufb RTMP2, RA3, RA3;
+ vpshufb RTMP2, RB0, RB0;
+ vpshufb RTMP2, RB1, RB1;
+ vpshufb RTMP2, RB2, RB2;
+ vpshufb RTMP2, RB3, RB3;
+
+ FRAME_END
+ ret;
+SYM_FUNC_END(__sm4_crypt_blk16)
+
+#define inc_le128(x, minus_one, tmp) \
+ vpcmpeqq minus_one, x, tmp; \
+ vpsubq minus_one, x, x; \
+ vpslldq $8, tmp, tmp; \
+ vpsubq tmp, x, x;
+
+/*
+ * void sm4_aesni_avx2_ctr_enc_blk16(const u32 *rk, u8 *dst,
+ * const u8 *src, u8 *iv)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx2_ctr_enc_blk16)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (16 blocks)
+ * %rdx: src (16 blocks)
+ * %rcx: iv (big endian, 128bit)
+ */
+ FRAME_BEGIN
+
+ movq 8(%rcx), %rax;
+ bswapq %rax;
+
+ vzeroupper;
+
+ vbroadcasti128 .Lbswap128_mask rRIP, RTMP3;
+ vpcmpeqd RNOT, RNOT, RNOT;
+ vpsrldq $8, RNOT, RNOT; /* ab: -1:0 ; cd: -1:0 */
+ vpaddq RNOT, RNOT, RTMP2; /* ab: -2:0 ; cd: -2:0 */
+
+ /* load IV and byteswap */
+ vmovdqu (%rcx), RTMP4x;
+ vpshufb RTMP3x, RTMP4x, RTMP4x;
+ vmovdqa RTMP4x, RTMP0x;
+ inc_le128(RTMP4x, RNOTx, RTMP1x);
+ vinserti128 $1, RTMP4x, RTMP0, RTMP0;
+ vpshufb RTMP3, RTMP0, RA0; /* +1 ; +0 */
+
+ /* check need for handling 64-bit overflow and carry */
+ cmpq $(0xffffffffffffffff - 16), %rax;
+ ja .Lhandle_ctr_carry;
+
+ /* construct IVs */
+ vpsubq RTMP2, RTMP0, RTMP0; /* +3 ; +2 */
+ vpshufb RTMP3, RTMP0, RA1;
+ vpsubq RTMP2, RTMP0, RTMP0; /* +5 ; +4 */
+ vpshufb RTMP3, RTMP0, RA2;
+ vpsubq RTMP2, RTMP0, RTMP0; /* +7 ; +6 */
+ vpshufb RTMP3, RTMP0, RA3;
+ vpsubq RTMP2, RTMP0, RTMP0; /* +9 ; +8 */
+ vpshufb RTMP3, RTMP0, RB0;
+ vpsubq RTMP2, RTMP0, RTMP0; /* +11 ; +10 */
+ vpshufb RTMP3, RTMP0, RB1;
+ vpsubq RTMP2, RTMP0, RTMP0; /* +13 ; +12 */
+ vpshufb RTMP3, RTMP0, RB2;
+ vpsubq RTMP2, RTMP0, RTMP0; /* +15 ; +14 */
+ vpshufb RTMP3, RTMP0, RB3;
+ vpsubq RTMP2, RTMP0, RTMP0; /* +16 */
+ vpshufb RTMP3x, RTMP0x, RTMP0x;
+
+ jmp .Lctr_carry_done;
+
+.Lhandle_ctr_carry:
+ /* construct IVs */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vpshufb RTMP3, RTMP0, RA1; /* +3 ; +2 */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vpshufb RTMP3, RTMP0, RA2; /* +5 ; +4 */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vpshufb RTMP3, RTMP0, RA3; /* +7 ; +6 */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vpshufb RTMP3, RTMP0, RB0; /* +9 ; +8 */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vpshufb RTMP3, RTMP0, RB1; /* +11 ; +10 */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vpshufb RTMP3, RTMP0, RB2; /* +13 ; +12 */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vpshufb RTMP3, RTMP0, RB3; /* +15 ; +14 */
+ inc_le128(RTMP0, RNOT, RTMP1);
+ vextracti128 $1, RTMP0, RTMP0x;
+ vpshufb RTMP3x, RTMP0x, RTMP0x; /* +16 */
+
+.align 4
+.Lctr_carry_done:
+ /* store new IV */
+ vmovdqu RTMP0x, (%rcx);
+
+ call __sm4_crypt_blk16;
+
+ vpxor (0 * 32)(%rdx), RA0, RA0;
+ vpxor (1 * 32)(%rdx), RA1, RA1;
+ vpxor (2 * 32)(%rdx), RA2, RA2;
+ vpxor (3 * 32)(%rdx), RA3, RA3;
+ vpxor (4 * 32)(%rdx), RB0, RB0;
+ vpxor (5 * 32)(%rdx), RB1, RB1;
+ vpxor (6 * 32)(%rdx), RB2, RB2;
+ vpxor (7 * 32)(%rdx), RB3, RB3;
+
+ vmovdqu RA0, (0 * 32)(%rsi);
+ vmovdqu RA1, (1 * 32)(%rsi);
+ vmovdqu RA2, (2 * 32)(%rsi);
+ vmovdqu RA3, (3 * 32)(%rsi);
+ vmovdqu RB0, (4 * 32)(%rsi);
+ vmovdqu RB1, (5 * 32)(%rsi);
+ vmovdqu RB2, (6 * 32)(%rsi);
+ vmovdqu RB3, (7 * 32)(%rsi);
+
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx2_ctr_enc_blk16)
+
+/*
+ * void sm4_aesni_avx2_cbc_dec_blk16(const u32 *rk, u8 *dst,
+ * const u8 *src, u8 *iv)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx2_cbc_dec_blk16)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (16 blocks)
+ * %rdx: src (16 blocks)
+ * %rcx: iv
+ */
+ FRAME_BEGIN
+
+ vzeroupper;
+
+ vmovdqu (0 * 32)(%rdx), RA0;
+ vmovdqu (1 * 32)(%rdx), RA1;
+ vmovdqu (2 * 32)(%rdx), RA2;
+ vmovdqu (3 * 32)(%rdx), RA3;
+ vmovdqu (4 * 32)(%rdx), RB0;
+ vmovdqu (5 * 32)(%rdx), RB1;
+ vmovdqu (6 * 32)(%rdx), RB2;
+ vmovdqu (7 * 32)(%rdx), RB3;
+
+ call __sm4_crypt_blk16;
+
+ vmovdqu (%rcx), RNOTx;
+ vinserti128 $1, (%rdx), RNOT, RNOT;
+ vpxor RNOT, RA0, RA0;
+ vpxor (0 * 32 + 16)(%rdx), RA1, RA1;
+ vpxor (1 * 32 + 16)(%rdx), RA2, RA2;
+ vpxor (2 * 32 + 16)(%rdx), RA3, RA3;
+ vpxor (3 * 32 + 16)(%rdx), RB0, RB0;
+ vpxor (4 * 32 + 16)(%rdx), RB1, RB1;
+ vpxor (5 * 32 + 16)(%rdx), RB2, RB2;
+ vpxor (6 * 32 + 16)(%rdx), RB3, RB3;
+ vmovdqu (7 * 32 + 16)(%rdx), RNOTx;
+ vmovdqu RNOTx, (%rcx); /* store new IV */
+
+ vmovdqu RA0, (0 * 32)(%rsi);
+ vmovdqu RA1, (1 * 32)(%rsi);
+ vmovdqu RA2, (2 * 32)(%rsi);
+ vmovdqu RA3, (3 * 32)(%rsi);
+ vmovdqu RB0, (4 * 32)(%rsi);
+ vmovdqu RB1, (5 * 32)(%rsi);
+ vmovdqu RB2, (6 * 32)(%rsi);
+ vmovdqu RB3, (7 * 32)(%rsi);
+
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx2_cbc_dec_blk16)
+
+/*
+ * void sm4_aesni_avx2_cfb_dec_blk16(const u32 *rk, u8 *dst,
+ * const u8 *src, u8 *iv)
+ */
+.align 8
+SYM_FUNC_START(sm4_aesni_avx2_cfb_dec_blk16)
+ /* input:
+ * %rdi: round key array, CTX
+ * %rsi: dst (16 blocks)
+ * %rdx: src (16 blocks)
+ * %rcx: iv
+ */
+ FRAME_BEGIN
+
+ vzeroupper;
+
+ /* Load input */
+ vmovdqu (%rcx), RNOTx;
+ vinserti128 $1, (%rdx), RNOT, RA0;
+ vmovdqu (0 * 32 + 16)(%rdx), RA1;
+ vmovdqu (1 * 32 + 16)(%rdx), RA2;
+ vmovdqu (2 * 32 + 16)(%rdx), RA3;
+ vmovdqu (3 * 32 + 16)(%rdx), RB0;
+ vmovdqu (4 * 32 + 16)(%rdx), RB1;
+ vmovdqu (5 * 32 + 16)(%rdx), RB2;
+ vmovdqu (6 * 32 + 16)(%rdx), RB3;
+
+ /* Update IV */
+ vmovdqu (7 * 32 + 16)(%rdx), RNOTx;
+ vmovdqu RNOTx, (%rcx);
+
+ call __sm4_crypt_blk16;
+
+ vpxor (0 * 32)(%rdx), RA0, RA0;
+ vpxor (1 * 32)(%rdx), RA1, RA1;
+ vpxor (2 * 32)(%rdx), RA2, RA2;
+ vpxor (3 * 32)(%rdx), RA3, RA3;
+ vpxor (4 * 32)(%rdx), RB0, RB0;
+ vpxor (5 * 32)(%rdx), RB1, RB1;
+ vpxor (6 * 32)(%rdx), RB2, RB2;
+ vpxor (7 * 32)(%rdx), RB3, RB3;
+
+ vmovdqu RA0, (0 * 32)(%rsi);
+ vmovdqu RA1, (1 * 32)(%rsi);
+ vmovdqu RA2, (2 * 32)(%rsi);
+ vmovdqu RA3, (3 * 32)(%rsi);
+ vmovdqu RB0, (4 * 32)(%rsi);
+ vmovdqu RB1, (5 * 32)(%rsi);
+ vmovdqu RB2, (6 * 32)(%rsi);
+ vmovdqu RB3, (7 * 32)(%rsi);
+
+ vzeroall;
+ FRAME_END
+ ret;
+SYM_FUNC_END(sm4_aesni_avx2_cfb_dec_blk16)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef ASM_X86_SM4_AVX_H
+#define ASM_X86_SM4_AVX_H
+
+#include <linux/types.h>
+#include <crypto/sm4.h>
+
+typedef void (*sm4_crypt_func)(const u32 *rk, u8 *dst, const u8 *src, u8 *iv);
+
+int sm4_avx_ecb_encrypt(struct skcipher_request *req);
+int sm4_avx_ecb_decrypt(struct skcipher_request *req);
+
+int sm4_cbc_encrypt(struct skcipher_request *req);
+int sm4_avx_cbc_decrypt(struct skcipher_request *req,
+ unsigned int bsize, sm4_crypt_func func);
+
+int sm4_cfb_encrypt(struct skcipher_request *req);
+int sm4_avx_cfb_decrypt(struct skcipher_request *req,
+ unsigned int bsize, sm4_crypt_func func);
+
+int sm4_avx_ctr_crypt(struct skcipher_request *req,
+ unsigned int bsize, sm4_crypt_func func);
+
+#endif
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * SM4 Cipher Algorithm, AES-NI/AVX2 optimized.
+ * as specified in
+ * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html
+ *
+ * Copyright (c) 2021, Alibaba Group.
+ * Copyright (c) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+#include <linux/module.h>
+#include <linux/crypto.h>
+#include <linux/kernel.h>
+#include <asm/simd.h>
+#include <crypto/internal/simd.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/sm4.h>
+#include "sm4-avx.h"
+
+#define SM4_CRYPT16_BLOCK_SIZE (SM4_BLOCK_SIZE * 16)
+
+asmlinkage void sm4_aesni_avx2_ctr_enc_blk16(const u32 *rk, u8 *dst,
+ const u8 *src, u8 *iv);
+asmlinkage void sm4_aesni_avx2_cbc_dec_blk16(const u32 *rk, u8 *dst,
+ const u8 *src, u8 *iv);
+asmlinkage void sm4_aesni_avx2_cfb_dec_blk16(const u32 *rk, u8 *dst,
+ const u8 *src, u8 *iv);
+
+static int sm4_skcipher_setkey(struct crypto_skcipher *tfm, const u8 *key,
+ unsigned int key_len)
+{
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ return sm4_expandkey(ctx, key, key_len);
+}
+
+static int cbc_decrypt(struct skcipher_request *req)
+{
+ return sm4_avx_cbc_decrypt(req, SM4_CRYPT16_BLOCK_SIZE,
+ sm4_aesni_avx2_cbc_dec_blk16);
+}
+
+
+static int cfb_decrypt(struct skcipher_request *req)
+{
+ return sm4_avx_cfb_decrypt(req, SM4_CRYPT16_BLOCK_SIZE,
+ sm4_aesni_avx2_cfb_dec_blk16);
+}
+
+static int ctr_crypt(struct skcipher_request *req)
+{
+ return sm4_avx_ctr_crypt(req, SM4_CRYPT16_BLOCK_SIZE,
+ sm4_aesni_avx2_ctr_enc_blk16);
+}
+
+static struct skcipher_alg sm4_aesni_avx2_skciphers[] = {
+ {
+ .base = {
+ .cra_name = "__ecb(sm4)",
+ .cra_driver_name = "__ecb-sm4-aesni-avx2",
+ .cra_priority = 500,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = SM4_BLOCK_SIZE,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .walksize = 16 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = sm4_avx_ecb_encrypt,
+ .decrypt = sm4_avx_ecb_decrypt,
+ }, {
+ .base = {
+ .cra_name = "__cbc(sm4)",
+ .cra_driver_name = "__cbc-sm4-aesni-avx2",
+ .cra_priority = 500,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = SM4_BLOCK_SIZE,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .ivsize = SM4_BLOCK_SIZE,
+ .walksize = 16 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = sm4_cbc_encrypt,
+ .decrypt = cbc_decrypt,
+ }, {
+ .base = {
+ .cra_name = "__cfb(sm4)",
+ .cra_driver_name = "__cfb-sm4-aesni-avx2",
+ .cra_priority = 500,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = 1,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .ivsize = SM4_BLOCK_SIZE,
+ .chunksize = SM4_BLOCK_SIZE,
+ .walksize = 16 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = sm4_cfb_encrypt,
+ .decrypt = cfb_decrypt,
+ }, {
+ .base = {
+ .cra_name = "__ctr(sm4)",
+ .cra_driver_name = "__ctr-sm4-aesni-avx2",
+ .cra_priority = 500,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = 1,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .ivsize = SM4_BLOCK_SIZE,
+ .chunksize = SM4_BLOCK_SIZE,
+ .walksize = 16 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = ctr_crypt,
+ .decrypt = ctr_crypt,
+ }
+};
+
+static struct simd_skcipher_alg *
+simd_sm4_aesni_avx2_skciphers[ARRAY_SIZE(sm4_aesni_avx2_skciphers)];
+
+static int __init sm4_init(void)
+{
+ const char *feature_name;
+
+ if (!boot_cpu_has(X86_FEATURE_AVX) ||
+ !boot_cpu_has(X86_FEATURE_AVX2) ||
+ !boot_cpu_has(X86_FEATURE_AES) ||
+ !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
+ pr_info("AVX2 or AES-NI instructions are not detected.\n");
+ return -ENODEV;
+ }
+
+ if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
+ &feature_name)) {
+ pr_info("CPU feature '%s' is not supported.\n", feature_name);
+ return -ENODEV;
+ }
+
+ return simd_register_skciphers_compat(sm4_aesni_avx2_skciphers,
+ ARRAY_SIZE(sm4_aesni_avx2_skciphers),
+ simd_sm4_aesni_avx2_skciphers);
+}
+
+static void __exit sm4_exit(void)
+{
+ simd_unregister_skciphers(sm4_aesni_avx2_skciphers,
+ ARRAY_SIZE(sm4_aesni_avx2_skciphers),
+ simd_sm4_aesni_avx2_skciphers);
+}
+
+module_init(sm4_init);
+module_exit(sm4_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
+MODULE_DESCRIPTION("SM4 Cipher Algorithm, AES-NI/AVX2 optimized");
+MODULE_ALIAS_CRYPTO("sm4");
+MODULE_ALIAS_CRYPTO("sm4-aesni-avx2");
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * SM4 Cipher Algorithm, AES-NI/AVX optimized.
+ * as specified in
+ * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html
+ *
+ * Copyright (c) 2021, Alibaba Group.
+ * Copyright (c) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+#include <linux/module.h>
+#include <linux/crypto.h>
+#include <linux/kernel.h>
+#include <asm/simd.h>
+#include <crypto/internal/simd.h>
+#include <crypto/internal/skcipher.h>
+#include <crypto/sm4.h>
+#include "sm4-avx.h"
+
+#define SM4_CRYPT8_BLOCK_SIZE (SM4_BLOCK_SIZE * 8)
+
+asmlinkage void sm4_aesni_avx_crypt4(const u32 *rk, u8 *dst,
+ const u8 *src, int nblocks);
+asmlinkage void sm4_aesni_avx_crypt8(const u32 *rk, u8 *dst,
+ const u8 *src, int nblocks);
+asmlinkage void sm4_aesni_avx_ctr_enc_blk8(const u32 *rk, u8 *dst,
+ const u8 *src, u8 *iv);
+asmlinkage void sm4_aesni_avx_cbc_dec_blk8(const u32 *rk, u8 *dst,
+ const u8 *src, u8 *iv);
+asmlinkage void sm4_aesni_avx_cfb_dec_blk8(const u32 *rk, u8 *dst,
+ const u8 *src, u8 *iv);
+
+static int sm4_skcipher_setkey(struct crypto_skcipher *tfm, const u8 *key,
+ unsigned int key_len)
+{
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ return sm4_expandkey(ctx, key, key_len);
+}
+
+static int ecb_do_crypt(struct skcipher_request *req, const u32 *rkey)
+{
+ struct skcipher_walk walk;
+ unsigned int nbytes;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, false);
+
+ while ((nbytes = walk.nbytes) > 0) {
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+
+ kernel_fpu_begin();
+ while (nbytes >= SM4_CRYPT8_BLOCK_SIZE) {
+ sm4_aesni_avx_crypt8(rkey, dst, src, 8);
+ dst += SM4_CRYPT8_BLOCK_SIZE;
+ src += SM4_CRYPT8_BLOCK_SIZE;
+ nbytes -= SM4_CRYPT8_BLOCK_SIZE;
+ }
+ while (nbytes >= SM4_BLOCK_SIZE) {
+ unsigned int nblocks = min(nbytes >> 4, 4u);
+ sm4_aesni_avx_crypt4(rkey, dst, src, nblocks);
+ dst += nblocks * SM4_BLOCK_SIZE;
+ src += nblocks * SM4_BLOCK_SIZE;
+ nbytes -= nblocks * SM4_BLOCK_SIZE;
+ }
+ kernel_fpu_end();
+
+ err = skcipher_walk_done(&walk, nbytes);
+ }
+
+ return err;
+}
+
+int sm4_avx_ecb_encrypt(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ return ecb_do_crypt(req, ctx->rkey_enc);
+}
+EXPORT_SYMBOL_GPL(sm4_avx_ecb_encrypt);
+
+int sm4_avx_ecb_decrypt(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ return ecb_do_crypt(req, ctx->rkey_dec);
+}
+EXPORT_SYMBOL_GPL(sm4_avx_ecb_decrypt);
+
+int sm4_cbc_encrypt(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, false);
+
+ while ((nbytes = walk.nbytes) > 0) {
+ const u8 *iv = walk.iv;
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+
+ while (nbytes >= SM4_BLOCK_SIZE) {
+ crypto_xor_cpy(dst, src, iv, SM4_BLOCK_SIZE);
+ sm4_crypt_block(ctx->rkey_enc, dst, dst);
+ iv = dst;
+ src += SM4_BLOCK_SIZE;
+ dst += SM4_BLOCK_SIZE;
+ nbytes -= SM4_BLOCK_SIZE;
+ }
+ if (iv != walk.iv)
+ memcpy(walk.iv, iv, SM4_BLOCK_SIZE);
+
+ err = skcipher_walk_done(&walk, nbytes);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(sm4_cbc_encrypt);
+
+int sm4_avx_cbc_decrypt(struct skcipher_request *req,
+ unsigned int bsize, sm4_crypt_func func)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, false);
+
+ while ((nbytes = walk.nbytes) > 0) {
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+
+ kernel_fpu_begin();
+
+ while (nbytes >= bsize) {
+ func(ctx->rkey_dec, dst, src, walk.iv);
+ dst += bsize;
+ src += bsize;
+ nbytes -= bsize;
+ }
+
+ while (nbytes >= SM4_BLOCK_SIZE) {
+ u8 keystream[SM4_BLOCK_SIZE * 8];
+ u8 iv[SM4_BLOCK_SIZE];
+ unsigned int nblocks = min(nbytes >> 4, 8u);
+ int i;
+
+ sm4_aesni_avx_crypt8(ctx->rkey_dec, keystream,
+ src, nblocks);
+
+ src += ((int)nblocks - 2) * SM4_BLOCK_SIZE;
+ dst += (nblocks - 1) * SM4_BLOCK_SIZE;
+ memcpy(iv, src + SM4_BLOCK_SIZE, SM4_BLOCK_SIZE);
+
+ for (i = nblocks - 1; i > 0; i--) {
+ crypto_xor_cpy(dst, src,
+ &keystream[i * SM4_BLOCK_SIZE],
+ SM4_BLOCK_SIZE);
+ src -= SM4_BLOCK_SIZE;
+ dst -= SM4_BLOCK_SIZE;
+ }
+ crypto_xor_cpy(dst, walk.iv, keystream, SM4_BLOCK_SIZE);
+ memcpy(walk.iv, iv, SM4_BLOCK_SIZE);
+ dst += nblocks * SM4_BLOCK_SIZE;
+ src += (nblocks + 1) * SM4_BLOCK_SIZE;
+ nbytes -= nblocks * SM4_BLOCK_SIZE;
+ }
+
+ kernel_fpu_end();
+ err = skcipher_walk_done(&walk, nbytes);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(sm4_avx_cbc_decrypt);
+
+static int cbc_decrypt(struct skcipher_request *req)
+{
+ return sm4_avx_cbc_decrypt(req, SM4_CRYPT8_BLOCK_SIZE,
+ sm4_aesni_avx_cbc_dec_blk8);
+}
+
+int sm4_cfb_encrypt(struct skcipher_request *req)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, false);
+
+ while ((nbytes = walk.nbytes) > 0) {
+ u8 keystream[SM4_BLOCK_SIZE];
+ const u8 *iv = walk.iv;
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+
+ while (nbytes >= SM4_BLOCK_SIZE) {
+ sm4_crypt_block(ctx->rkey_enc, keystream, iv);
+ crypto_xor_cpy(dst, src, keystream, SM4_BLOCK_SIZE);
+ iv = dst;
+ src += SM4_BLOCK_SIZE;
+ dst += SM4_BLOCK_SIZE;
+ nbytes -= SM4_BLOCK_SIZE;
+ }
+ if (iv != walk.iv)
+ memcpy(walk.iv, iv, SM4_BLOCK_SIZE);
+
+ /* tail */
+ if (walk.nbytes == walk.total && nbytes > 0) {
+ sm4_crypt_block(ctx->rkey_enc, keystream, walk.iv);
+ crypto_xor_cpy(dst, src, keystream, nbytes);
+ nbytes = 0;
+ }
+
+ err = skcipher_walk_done(&walk, nbytes);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(sm4_cfb_encrypt);
+
+int sm4_avx_cfb_decrypt(struct skcipher_request *req,
+ unsigned int bsize, sm4_crypt_func func)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, false);
+
+ while ((nbytes = walk.nbytes) > 0) {
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+
+ kernel_fpu_begin();
+
+ while (nbytes >= bsize) {
+ func(ctx->rkey_enc, dst, src, walk.iv);
+ dst += bsize;
+ src += bsize;
+ nbytes -= bsize;
+ }
+
+ while (nbytes >= SM4_BLOCK_SIZE) {
+ u8 keystream[SM4_BLOCK_SIZE * 8];
+ unsigned int nblocks = min(nbytes >> 4, 8u);
+
+ memcpy(keystream, walk.iv, SM4_BLOCK_SIZE);
+ if (nblocks > 1)
+ memcpy(&keystream[SM4_BLOCK_SIZE], src,
+ (nblocks - 1) * SM4_BLOCK_SIZE);
+ memcpy(walk.iv, src + (nblocks - 1) * SM4_BLOCK_SIZE,
+ SM4_BLOCK_SIZE);
+
+ sm4_aesni_avx_crypt8(ctx->rkey_enc, keystream,
+ keystream, nblocks);
+
+ crypto_xor_cpy(dst, src, keystream,
+ nblocks * SM4_BLOCK_SIZE);
+ dst += nblocks * SM4_BLOCK_SIZE;
+ src += nblocks * SM4_BLOCK_SIZE;
+ nbytes -= nblocks * SM4_BLOCK_SIZE;
+ }
+
+ kernel_fpu_end();
+
+ /* tail */
+ if (walk.nbytes == walk.total && nbytes > 0) {
+ u8 keystream[SM4_BLOCK_SIZE];
+
+ sm4_crypt_block(ctx->rkey_enc, keystream, walk.iv);
+ crypto_xor_cpy(dst, src, keystream, nbytes);
+ nbytes = 0;
+ }
+
+ err = skcipher_walk_done(&walk, nbytes);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(sm4_avx_cfb_decrypt);
+
+static int cfb_decrypt(struct skcipher_request *req)
+{
+ return sm4_avx_cfb_decrypt(req, SM4_CRYPT8_BLOCK_SIZE,
+ sm4_aesni_avx_cfb_dec_blk8);
+}
+
+int sm4_avx_ctr_crypt(struct skcipher_request *req,
+ unsigned int bsize, sm4_crypt_func func)
+{
+ struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+ struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct skcipher_walk walk;
+ unsigned int nbytes;
+ int err;
+
+ err = skcipher_walk_virt(&walk, req, false);
+
+ while ((nbytes = walk.nbytes) > 0) {
+ const u8 *src = walk.src.virt.addr;
+ u8 *dst = walk.dst.virt.addr;
+
+ kernel_fpu_begin();
+
+ while (nbytes >= bsize) {
+ func(ctx->rkey_enc, dst, src, walk.iv);
+ dst += bsize;
+ src += bsize;
+ nbytes -= bsize;
+ }
+
+ while (nbytes >= SM4_BLOCK_SIZE) {
+ u8 keystream[SM4_BLOCK_SIZE * 8];
+ unsigned int nblocks = min(nbytes >> 4, 8u);
+ int i;
+
+ for (i = 0; i < nblocks; i++) {
+ memcpy(&keystream[i * SM4_BLOCK_SIZE],
+ walk.iv, SM4_BLOCK_SIZE);
+ crypto_inc(walk.iv, SM4_BLOCK_SIZE);
+ }
+ sm4_aesni_avx_crypt8(ctx->rkey_enc, keystream,
+ keystream, nblocks);
+
+ crypto_xor_cpy(dst, src, keystream,
+ nblocks * SM4_BLOCK_SIZE);
+ dst += nblocks * SM4_BLOCK_SIZE;
+ src += nblocks * SM4_BLOCK_SIZE;
+ nbytes -= nblocks * SM4_BLOCK_SIZE;
+ }
+
+ kernel_fpu_end();
+
+ /* tail */
+ if (walk.nbytes == walk.total && nbytes > 0) {
+ u8 keystream[SM4_BLOCK_SIZE];
+
+ memcpy(keystream, walk.iv, SM4_BLOCK_SIZE);
+ crypto_inc(walk.iv, SM4_BLOCK_SIZE);
+
+ sm4_crypt_block(ctx->rkey_enc, keystream, keystream);
+
+ crypto_xor_cpy(dst, src, keystream, nbytes);
+ dst += nbytes;
+ src += nbytes;
+ nbytes = 0;
+ }
+
+ err = skcipher_walk_done(&walk, nbytes);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(sm4_avx_ctr_crypt);
+
+static int ctr_crypt(struct skcipher_request *req)
+{
+ return sm4_avx_ctr_crypt(req, SM4_CRYPT8_BLOCK_SIZE,
+ sm4_aesni_avx_ctr_enc_blk8);
+}
+
+static struct skcipher_alg sm4_aesni_avx_skciphers[] = {
+ {
+ .base = {
+ .cra_name = "__ecb(sm4)",
+ .cra_driver_name = "__ecb-sm4-aesni-avx",
+ .cra_priority = 400,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = SM4_BLOCK_SIZE,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .walksize = 8 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = sm4_avx_ecb_encrypt,
+ .decrypt = sm4_avx_ecb_decrypt,
+ }, {
+ .base = {
+ .cra_name = "__cbc(sm4)",
+ .cra_driver_name = "__cbc-sm4-aesni-avx",
+ .cra_priority = 400,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = SM4_BLOCK_SIZE,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .ivsize = SM4_BLOCK_SIZE,
+ .walksize = 8 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = sm4_cbc_encrypt,
+ .decrypt = cbc_decrypt,
+ }, {
+ .base = {
+ .cra_name = "__cfb(sm4)",
+ .cra_driver_name = "__cfb-sm4-aesni-avx",
+ .cra_priority = 400,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = 1,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .ivsize = SM4_BLOCK_SIZE,
+ .chunksize = SM4_BLOCK_SIZE,
+ .walksize = 8 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = sm4_cfb_encrypt,
+ .decrypt = cfb_decrypt,
+ }, {
+ .base = {
+ .cra_name = "__ctr(sm4)",
+ .cra_driver_name = "__ctr-sm4-aesni-avx",
+ .cra_priority = 400,
+ .cra_flags = CRYPTO_ALG_INTERNAL,
+ .cra_blocksize = 1,
+ .cra_ctxsize = sizeof(struct sm4_ctx),
+ .cra_module = THIS_MODULE,
+ },
+ .min_keysize = SM4_KEY_SIZE,
+ .max_keysize = SM4_KEY_SIZE,
+ .ivsize = SM4_BLOCK_SIZE,
+ .chunksize = SM4_BLOCK_SIZE,
+ .walksize = 8 * SM4_BLOCK_SIZE,
+ .setkey = sm4_skcipher_setkey,
+ .encrypt = ctr_crypt,
+ .decrypt = ctr_crypt,
+ }
+};
+
+static struct simd_skcipher_alg *
+simd_sm4_aesni_avx_skciphers[ARRAY_SIZE(sm4_aesni_avx_skciphers)];
+
+static int __init sm4_init(void)
+{
+ const char *feature_name;
+
+ if (!boot_cpu_has(X86_FEATURE_AVX) ||
+ !boot_cpu_has(X86_FEATURE_AES) ||
+ !boot_cpu_has(X86_FEATURE_OSXSAVE)) {
+ pr_info("AVX or AES-NI instructions are not detected.\n");
+ return -ENODEV;
+ }
+
+ if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
+ &feature_name)) {
+ pr_info("CPU feature '%s' is not supported.\n", feature_name);
+ return -ENODEV;
+ }
+
+ return simd_register_skciphers_compat(sm4_aesni_avx_skciphers,
+ ARRAY_SIZE(sm4_aesni_avx_skciphers),
+ simd_sm4_aesni_avx_skciphers);
+}
+
+static void __exit sm4_exit(void)
+{
+ simd_unregister_skciphers(sm4_aesni_avx_skciphers,
+ ARRAY_SIZE(sm4_aesni_avx_skciphers),
+ simd_sm4_aesni_avx_skciphers);
+}
+
+module_init(sm4_init);
+module_exit(sm4_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
+MODULE_DESCRIPTION("SM4 Cipher Algorithm, AES-NI/AVX optimized");
+MODULE_ALIAS_CRYPTO("sm4");
+MODULE_ALIAS_CRYPTO("sm4-aesni-avx");
(CPUID Fn8000_0007_EDX[12]) interface to calculate the
average power consumption on Family 15h processors.
+config PERF_EVENTS_AMD_UNCORE
+ tristate "AMD Uncore performance events"
+ depends on PERF_EVENTS && CPU_SUP_AMD
+ default y
+ help
+ Include support for AMD uncore performance events for use with
+ e.g., perf stat -e amd_l3/.../,amd_df/.../.
+
+ To compile this driver as a module, choose M here: the
+ module will be called 'amd-uncore'.
endmenu
# SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_CPU_SUP_AMD) += core.o uncore.o
+obj-$(CONFIG_CPU_SUP_AMD) += core.o
obj-$(CONFIG_PERF_EVENTS_AMD_POWER) += power.o
obj-$(CONFIG_X86_LOCAL_APIC) += ibs.o
+obj-$(CONFIG_PERF_EVENTS_AMD_UNCORE) += amd-uncore.o
+amd-uncore-objs := uncore.o
ifdef CONFIG_AMD_IOMMU
obj-$(CONFIG_CPU_SUP_AMD) += iommu.o
endif
-
#include <linux/hardirq.h>
#include <asm/nmi.h>
+#include <asm/amd-ibs.h>
#define IBS_FETCH_CONFIG_MASK (IBS_FETCH_RAND_EN | IBS_FETCH_MAX_CNT)
#define IBS_OP_CONFIG_MASK IBS_OP_MAX_CNT
unsigned long offset_mask[1];
int offset_max;
unsigned int fetch_count_reset_broken : 1;
+ unsigned int fetch_ignore_if_zero_rip : 1;
struct cpu_perf_ibs __percpu *pcpu;
struct attribute **format_attrs;
u64 (*get_count)(u64 config);
};
-struct perf_ibs_data {
- u32 size;
- union {
- u32 data[0]; /* data buffer starts here */
- u32 caps;
- };
- u64 regs[MSR_AMD64_IBS_REG_COUNT_MAX];
-};
-
static int
perf_event_set_period(struct hw_perf_event *hwc, u64 min, u64 max, u64 *hw_period)
{
static u64 get_ibs_fetch_count(u64 config)
{
- return (config & IBS_FETCH_CNT) >> 12;
+ union ibs_fetch_ctl fetch_ctl = (union ibs_fetch_ctl)config;
+
+ return fetch_ctl.fetch_cnt << 4;
}
static u64 get_ibs_op_count(u64 config)
{
+ union ibs_op_ctl op_ctl = (union ibs_op_ctl)config;
u64 count = 0;
/*
* and the lower 7 bits of CurCnt are randomized.
* Otherwise CurCnt has the full 27-bit current counter value.
*/
- if (config & IBS_OP_VAL) {
- count = (config & IBS_OP_MAX_CNT) << 4;
+ if (op_ctl.op_val) {
+ count = op_ctl.opmaxcnt << 4;
if (ibs_caps & IBS_CAPS_OPCNTEXT)
- count += config & IBS_OP_MAX_CNT_EXT_MASK;
+ count += op_ctl.opmaxcnt_ext << 20;
} else if (ibs_caps & IBS_CAPS_RDWROPCNT) {
- count = (config & IBS_OP_CUR_CNT) >> 32;
+ count = op_ctl.opcurcnt;
}
return count;
.start = perf_ibs_start,
.stop = perf_ibs_stop,
.read = perf_ibs_read,
+ .capabilities = PERF_PMU_CAP_NO_EXCLUDE,
},
.msr = MSR_AMD64_IBSOPCTL,
.config_mask = IBS_OP_CONFIG_MASK,
if (check_rip && (ibs_data.regs[2] & IBS_RIP_INVALID)) {
regs.flags &= ~PERF_EFLAGS_EXACT;
} else {
+ /* Workaround for erratum #1197 */
+ if (perf_ibs->fetch_ignore_if_zero_rip && !(ibs_data.regs[1]))
+ goto out;
+
set_linear_ip(®s, ibs_data.regs[1]);
regs.flags |= PERF_EFLAGS_EXACT;
}
if (boot_cpu_data.x86 >= 0x16 && boot_cpu_data.x86 <= 0x18)
perf_ibs_fetch.fetch_count_reset_broken = 1;
+ if (boot_cpu_data.x86 == 0x19 && boot_cpu_data.x86_model < 0x10)
+ perf_ibs_fetch.fetch_ignore_if_zero_rip = 1;
+
perf_ibs_pmu_init(&perf_ibs_fetch, "ibs_fetch");
if (ibs_caps & IBS_CAPS_OPCNT) {
.stop = pmu_event_stop,
.read = pmu_event_read,
.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
+ .module = THIS_MODULE,
};
static int power_cpu_exit(unsigned int cpu)
#include <linux/init.h>
#include <linux/cpu.h>
#include <linux/cpumask.h>
+#include <linux/cpufeature.h>
+#include <linux/smp.h>
-#include <asm/cpufeature.h>
#include <asm/perf_event.h>
#include <asm/msr.h>
-#include <asm/smp.h>
#define NUM_COUNTERS_NB 4
#define NUM_COUNTERS_L2 4
.stop = amd_uncore_stop,
.read = amd_uncore_read,
.capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+ .module = THIS_MODULE,
};
static struct pmu amd_llc_pmu = {
.stop = amd_uncore_stop,
.read = amd_uncore_read,
.capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+ .module = THIS_MODULE,
};
static struct amd_uncore *amd_uncore_alloc(unsigned int cpu)
if (amd_uncore_llc) {
uncore = *per_cpu_ptr(amd_uncore_llc, cpu);
- uncore->id = per_cpu(cpu_llc_id, cpu);
+ uncore->id = get_llc_id(cpu);
uncore = amd_uncore_find_online_sibling(uncore, amd_uncore_llc);
*per_cpu_ptr(amd_uncore_llc, cpu) = uncore;
fail_llc:
if (boot_cpu_has(X86_FEATURE_PERFCTR_NB))
perf_pmu_unregister(&amd_nb_pmu);
- if (amd_uncore_llc)
- free_percpu(amd_uncore_llc);
+ free_percpu(amd_uncore_llc);
fail_nb:
- if (amd_uncore_nb)
- free_percpu(amd_uncore_nb);
+ free_percpu(amd_uncore_nb);
return ret;
}
-device_initcall(amd_uncore_init);
+
+static void __exit amd_uncore_exit(void)
+{
+ cpuhp_remove_state(CPUHP_AP_PERF_X86_AMD_UNCORE_ONLINE);
+ cpuhp_remove_state(CPUHP_AP_PERF_X86_AMD_UNCORE_STARTING);
+ cpuhp_remove_state(CPUHP_PERF_X86_AMD_UNCORE_PREP);
+
+ if (boot_cpu_has(X86_FEATURE_PERFCTR_LLC)) {
+ perf_pmu_unregister(&amd_llc_pmu);
+ free_percpu(amd_uncore_llc);
+ amd_uncore_llc = NULL;
+ }
+
+ if (boot_cpu_has(X86_FEATURE_PERFCTR_NB)) {
+ perf_pmu_unregister(&amd_nb_pmu);
+ free_percpu(amd_uncore_nb);
+ amd_uncore_nb = NULL;
+ }
+}
+
+module_init(amd_uncore_init);
+module_exit(amd_uncore_exit);
+
+MODULE_DESCRIPTION("AMD Uncore Driver");
+MODULE_LICENSE("GPL v2");
* validate an event group (assign == NULL)
*/
if (!unsched && assign) {
- for (i = 0; i < n; i++) {
- e = cpuc->event_list[i];
+ for (i = 0; i < n; i++)
static_call_cond(x86_pmu_commit_scheduling)(cpuc, i, assign[i]);
- }
} else {
for (i = n0; i < n; i++) {
e = cpuc->event_list[i];
x86_pmu.attr_freeze_on_smi = val;
- get_online_cpus();
+ cpus_read_lock();
on_each_cpu(flip_smm_bit, &val, 1);
- put_online_cpus();
+ cpus_read_unlock();
done:
mutex_unlock(&freeze_on_smi_mutex);
allow_tsx_force_abort = val;
- get_online_cpus();
+ cpus_read_lock();
on_each_cpu(update_tfa_sched, NULL, 1);
- put_online_cpus();
+ cpus_read_unlock();
return count;
}
PT_CAP(single_range_output, 0, CPUID_ECX, BIT(2)),
PT_CAP(output_subsys, 0, CPUID_ECX, BIT(3)),
PT_CAP(payloads_lip, 0, CPUID_ECX, BIT(31)),
- PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x3),
+ PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x7),
PT_CAP(mtc_periods, 1, CPUID_EAX, 0xffff0000),
PT_CAP(cycle_thresholds, 1, CPUID_EBX, 0xffff),
PT_CAP(psb_periods, 1, CPUID_EBX, 0xffff0000),
if (!boot_cpu_has(X86_FEATURE_INTEL_PT))
return -ENODEV;
- get_online_cpus();
+ cpus_read_lock();
for_each_online_cpu(cpu) {
u64 ctl;
if (!ret && (ctl & RTIT_CTL_TRACEEN))
prior_warn++;
}
- put_online_cpus();
+ cpus_read_unlock();
if (prior_warn) {
x86_add_exclusive(x86_lbr_exclusive_pt);
.attrs = uncore_pmu_attrs,
};
+void uncore_get_alias_name(char *pmu_name, struct intel_uncore_pmu *pmu)
+{
+ struct intel_uncore_type *type = pmu->type;
+
+ if (type->num_boxes == 1)
+ sprintf(pmu_name, "uncore_type_%u", type->type_id);
+ else {
+ sprintf(pmu_name, "uncore_type_%u_%d",
+ type->type_id, type->box_ids[pmu->pmu_idx]);
+ }
+}
+
static void uncore_get_pmu_name(struct intel_uncore_pmu *pmu)
{
struct intel_uncore_type *type = pmu->type;
* Use uncore_type_&typeid_&boxid as name.
*/
if (!type->name) {
- if (type->num_boxes == 1)
- sprintf(pmu->name, "uncore_type_%u", type->type_id);
- else {
- sprintf(pmu->name, "uncore_type_%u_%d",
- type->type_id, type->box_ids[pmu->pmu_idx]);
- }
+ uncore_get_alias_name(pmu->name, pmu);
return;
}
sprintf(pmu->name, "uncore_%s", type->name);
else
sprintf(pmu->name, "uncore");
- } else
- sprintf(pmu->name, "uncore_%s_%d", type->name, pmu->pmu_idx);
-
+ } else {
+ /*
+ * Use the box ID from the discovery table if applicable.
+ */
+ sprintf(pmu->name, "uncore_%s_%d", type->name,
+ type->box_ids ? type->box_ids[pmu->pmu_idx] : pmu->pmu_idx);
+ }
}
static int uncore_pmu_register(struct intel_uncore_pmu *pmu)
void (*cpu_init)(void);
int (*pci_init)(void);
void (*mmio_init)(void);
+ bool use_discovery;
};
static const struct intel_uncore_init_fun nhm_uncore_init __initconst = {
.mmio_init = snr_uncore_mmio_init,
};
+static const struct intel_uncore_init_fun spr_uncore_init __initconst = {
+ .cpu_init = spr_uncore_cpu_init,
+ .pci_init = spr_uncore_pci_init,
+ .mmio_init = spr_uncore_mmio_init,
+ .use_discovery = true,
+};
+
static const struct intel_uncore_init_fun generic_uncore_init __initconst = {
.cpu_init = intel_uncore_generic_uncore_cpu_init,
.pci_init = intel_uncore_generic_uncore_pci_init,
X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &rkl_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &adl_uncore_init),
+ X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &spr_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D, &snr_uncore_init),
{},
};
uncore_init = (struct intel_uncore_init_fun *)&generic_uncore_init;
else
return -ENODEV;
- } else
+ } else {
uncore_init = (struct intel_uncore_init_fun *)id->driver_data;
+ if (uncore_no_discover && uncore_init->use_discovery)
+ return -ENODEV;
+ if (uncore_init->use_discovery && !intel_uncore_has_discovery_tables())
+ return -ENODEV;
+ }
if (uncore_init->pci_init) {
pret = uncore_init->pci_init();
uncore_get_constraint(struct intel_uncore_box *box, struct perf_event *event);
void uncore_put_constraint(struct intel_uncore_box *box, struct perf_event *event);
u64 uncore_shared_reg_config(struct intel_uncore_box *box, int idx);
+void uncore_get_alias_name(char *pmu_name, struct intel_uncore_pmu *pmu);
extern struct intel_uncore_type *empty_uncore[];
extern struct intel_uncore_type **uncore_msr_uncores;
int icx_uncore_pci_init(void);
void icx_uncore_cpu_init(void);
void icx_uncore_mmio_init(void);
+int spr_uncore_pci_init(void);
+void spr_uncore_cpu_init(void);
+void spr_uncore_mmio_init(void);
/* uncore_nhmex.c */
void nhmex_uncore_cpu_init(void);
.attrs = generic_uncore_formats_attr,
};
-static void intel_generic_uncore_msr_init_box(struct intel_uncore_box *box)
+void intel_generic_uncore_msr_init_box(struct intel_uncore_box *box)
{
wrmsrl(uncore_msr_box_ctl(box), GENERIC_PMON_BOX_CTL_INT);
}
-static void intel_generic_uncore_msr_disable_box(struct intel_uncore_box *box)
+void intel_generic_uncore_msr_disable_box(struct intel_uncore_box *box)
{
wrmsrl(uncore_msr_box_ctl(box), GENERIC_PMON_BOX_CTL_FRZ);
}
-static void intel_generic_uncore_msr_enable_box(struct intel_uncore_box *box)
+void intel_generic_uncore_msr_enable_box(struct intel_uncore_box *box)
{
wrmsrl(uncore_msr_box_ctl(box), 0);
}
.read_counter = uncore_msr_read_counter,
};
-static void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
+void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
int box_ctl = uncore_pci_box_ctl(box);
pci_write_config_dword(pdev, box_ctl, GENERIC_PMON_BOX_CTL_INT);
}
-static void intel_generic_uncore_pci_disable_box(struct intel_uncore_box *box)
+void intel_generic_uncore_pci_disable_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
int box_ctl = uncore_pci_box_ctl(box);
pci_write_config_dword(pdev, box_ctl, GENERIC_PMON_BOX_CTL_FRZ);
}
-static void intel_generic_uncore_pci_enable_box(struct intel_uncore_box *box)
+void intel_generic_uncore_pci_enable_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
int box_ctl = uncore_pci_box_ctl(box);
pci_write_config_dword(pdev, hwc->config_base, hwc->config);
}
-static void intel_generic_uncore_pci_disable_event(struct intel_uncore_box *box,
- struct perf_event *event)
+void intel_generic_uncore_pci_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
{
struct pci_dev *pdev = box->pci_dev;
struct hw_perf_event *hwc = &event->hw;
pci_write_config_dword(pdev, hwc->config_base, 0);
}
-static u64 intel_generic_uncore_pci_read_counter(struct intel_uncore_box *box,
- struct perf_event *event)
+u64 intel_generic_uncore_pci_read_counter(struct intel_uncore_box *box,
+ struct perf_event *event)
{
struct pci_dev *pdev = box->pci_dev;
struct hw_perf_event *hwc = &event->hw;
return type->box_ctls[box->dieid] + type->mmio_offsets[box->pmu->pmu_idx];
}
-static void intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box)
+void intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box)
{
unsigned int box_ctl = generic_uncore_mmio_box_ctl(box);
struct intel_uncore_type *type = box->pmu->type;
writel(GENERIC_PMON_BOX_CTL_INT, box->io_addr);
}
-static void intel_generic_uncore_mmio_disable_box(struct intel_uncore_box *box)
+void intel_generic_uncore_mmio_disable_box(struct intel_uncore_box *box)
{
if (!box->io_addr)
return;
writel(GENERIC_PMON_BOX_CTL_FRZ, box->io_addr);
}
-static void intel_generic_uncore_mmio_enable_box(struct intel_uncore_box *box)
+void intel_generic_uncore_mmio_enable_box(struct intel_uncore_box *box)
{
if (!box->io_addr)
return;
writel(hwc->config, box->io_addr + hwc->config_base);
}
-static void intel_generic_uncore_mmio_disable_event(struct intel_uncore_box *box,
- struct perf_event *event)
+void intel_generic_uncore_mmio_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
return true;
}
-static struct intel_uncore_type **
-intel_uncore_generic_init_uncores(enum uncore_access_type type_id)
+struct intel_uncore_type **
+intel_uncore_generic_init_uncores(enum uncore_access_type type_id, int num_extra)
{
struct intel_uncore_discovery_type *type;
struct intel_uncore_type **uncores;
struct rb_node *node;
int i = 0;
- uncores = kcalloc(num_discovered_types[type_id] + 1,
+ uncores = kcalloc(num_discovered_types[type_id] + num_extra + 1,
sizeof(struct intel_uncore_type *), GFP_KERNEL);
if (!uncores)
return empty_uncore;
void intel_uncore_generic_uncore_cpu_init(void)
{
- uncore_msr_uncores = intel_uncore_generic_init_uncores(UNCORE_ACCESS_MSR);
+ uncore_msr_uncores = intel_uncore_generic_init_uncores(UNCORE_ACCESS_MSR, 0);
}
int intel_uncore_generic_uncore_pci_init(void)
{
- uncore_pci_uncores = intel_uncore_generic_init_uncores(UNCORE_ACCESS_PCI);
+ uncore_pci_uncores = intel_uncore_generic_init_uncores(UNCORE_ACCESS_PCI, 0);
return 0;
}
void intel_uncore_generic_uncore_mmio_init(void)
{
- uncore_mmio_uncores = intel_uncore_generic_init_uncores(UNCORE_ACCESS_MMIO);
+ uncore_mmio_uncores = intel_uncore_generic_init_uncores(UNCORE_ACCESS_MMIO, 0);
}
void intel_uncore_generic_uncore_cpu_init(void);
int intel_uncore_generic_uncore_pci_init(void);
void intel_uncore_generic_uncore_mmio_init(void);
+
+void intel_generic_uncore_msr_init_box(struct intel_uncore_box *box);
+void intel_generic_uncore_msr_disable_box(struct intel_uncore_box *box);
+void intel_generic_uncore_msr_enable_box(struct intel_uncore_box *box);
+
+void intel_generic_uncore_mmio_init_box(struct intel_uncore_box *box);
+void intel_generic_uncore_mmio_disable_box(struct intel_uncore_box *box);
+void intel_generic_uncore_mmio_enable_box(struct intel_uncore_box *box);
+void intel_generic_uncore_mmio_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event);
+
+void intel_generic_uncore_pci_init_box(struct intel_uncore_box *box);
+void intel_generic_uncore_pci_disable_box(struct intel_uncore_box *box);
+void intel_generic_uncore_pci_enable_box(struct intel_uncore_box *box);
+void intel_generic_uncore_pci_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event);
+u64 intel_generic_uncore_pci_read_counter(struct intel_uncore_box *box,
+ struct perf_event *event);
+
+struct intel_uncore_type **
+intel_uncore_generic_init_uncores(enum uncore_access_type type_id, int num_extra);
// SPDX-License-Identifier: GPL-2.0
/* SandyBridge-EP/IvyTown uncore support */
#include "uncore.h"
+#include "uncore_discovery.h"
/* SNB-EP pci bus to socket mapping */
#define SNBEP_CPUNODEID 0x40
#define ICX_NUMBER_IMC_CHN 2
#define ICX_IMC_MEM_STRIDE 0x4
+/* SPR */
+#define SPR_RAW_EVENT_MASK_EXT 0xffffff
+
+/* SPR CHA */
+#define SPR_CHA_PMON_CTL_TID_EN (1 << 16)
+#define SPR_CHA_PMON_EVENT_MASK (SNBEP_PMON_RAW_EVENT_MASK | \
+ SPR_CHA_PMON_CTL_TID_EN)
+#define SPR_CHA_PMON_BOX_FILTER_TID 0x3ff
+
+#define SPR_C0_MSR_PMON_BOX_FILTER0 0x200e
+
DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
DEFINE_UNCORE_FORMAT_ATTR(event2, event, "config:0-6");
DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21");
DEFINE_UNCORE_FORMAT_ATTR(qor, qor, "config:16");
DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
DEFINE_UNCORE_FORMAT_ATTR(tid_en, tid_en, "config:19");
+DEFINE_UNCORE_FORMAT_ATTR(tid_en2, tid_en, "config:16");
DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
DEFINE_UNCORE_FORMAT_ATTR(thresh9, thresh, "config:24-35");
DEFINE_UNCORE_FORMAT_ATTR(thresh8, thresh, "config:24-31");
return ret;
}
-static int skx_iio_set_mapping(struct intel_uncore_type *type)
-{
- return pmu_iio_set_mapping(type, &skx_iio_mapping_group);
-}
-
-static void skx_iio_cleanup_mapping(struct intel_uncore_type *type)
+static void
+pmu_iio_cleanup_mapping(struct intel_uncore_type *type, struct attribute_group *ag)
{
- struct attribute **attr = skx_iio_mapping_group.attrs;
+ struct attribute **attr = ag->attrs;
if (!attr)
return;
for (; *attr; attr++)
kfree((*attr)->name);
- kfree(attr_to_ext_attr(*skx_iio_mapping_group.attrs));
- kfree(skx_iio_mapping_group.attrs);
- skx_iio_mapping_group.attrs = NULL;
+ kfree(attr_to_ext_attr(*ag->attrs));
+ kfree(ag->attrs);
+ ag->attrs = NULL;
kfree(type->topology);
}
+static int skx_iio_set_mapping(struct intel_uncore_type *type)
+{
+ return pmu_iio_set_mapping(type, &skx_iio_mapping_group);
+}
+
+static void skx_iio_cleanup_mapping(struct intel_uncore_type *type)
+{
+ pmu_iio_cleanup_mapping(type, &skx_iio_mapping_group);
+}
+
static struct intel_uncore_type skx_uncore_iio = {
.name = "iio",
.num_counters = 4,
return pmu_iio_set_mapping(type, &snr_iio_mapping_group);
}
+static void snr_iio_cleanup_mapping(struct intel_uncore_type *type)
+{
+ pmu_iio_cleanup_mapping(type, &snr_iio_mapping_group);
+}
+
static struct intel_uncore_type snr_uncore_iio = {
.name = "iio",
.num_counters = 4,
.attr_update = snr_iio_attr_update,
.get_topology = snr_iio_get_topology,
.set_mapping = snr_iio_set_mapping,
- .cleanup_mapping = skx_iio_cleanup_mapping,
+ .cleanup_mapping = snr_iio_cleanup_mapping,
};
static struct intel_uncore_type snr_uncore_irp = {
return 0;
}
-static struct pci_dev *snr_uncore_get_mc_dev(int id)
+#define SNR_MC_DEVICE_ID 0x3451
+
+static struct pci_dev *snr_uncore_get_mc_dev(unsigned int device, int id)
{
struct pci_dev *mc_dev = NULL;
int pkg;
while (1) {
- mc_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x3451, mc_dev);
+ mc_dev = pci_get_device(PCI_VENDOR_ID_INTEL, device, mc_dev);
if (!mc_dev)
break;
pkg = uncore_pcibus_to_dieid(mc_dev->bus);
return mc_dev;
}
-static void __snr_uncore_mmio_init_box(struct intel_uncore_box *box,
- unsigned int box_ctl, int mem_offset)
+static int snr_uncore_mmio_map(struct intel_uncore_box *box,
+ unsigned int box_ctl, int mem_offset,
+ unsigned int device)
{
- struct pci_dev *pdev = snr_uncore_get_mc_dev(box->dieid);
+ struct pci_dev *pdev = snr_uncore_get_mc_dev(device, box->dieid);
struct intel_uncore_type *type = box->pmu->type;
resource_size_t addr;
u32 pci_dword;
if (!pdev)
- return;
+ return -ENODEV;
pci_read_config_dword(pdev, SNR_IMC_MMIO_BASE_OFFSET, &pci_dword);
- addr = (pci_dword & SNR_IMC_MMIO_BASE_MASK) << 23;
+ addr = ((resource_size_t)pci_dword & SNR_IMC_MMIO_BASE_MASK) << 23;
pci_read_config_dword(pdev, mem_offset, &pci_dword);
addr |= (pci_dword & SNR_IMC_MMIO_MEM0_MASK) << 12;
box->io_addr = ioremap(addr, type->mmio_map_size);
if (!box->io_addr) {
pr_warn("perf uncore: Failed to ioremap for %s.\n", type->name);
- return;
+ return -EINVAL;
}
- writel(IVBEP_PMON_BOX_CTL_INT, box->io_addr);
+ return 0;
+}
+
+static void __snr_uncore_mmio_init_box(struct intel_uncore_box *box,
+ unsigned int box_ctl, int mem_offset,
+ unsigned int device)
+{
+ if (!snr_uncore_mmio_map(box, box_ctl, mem_offset, device))
+ writel(IVBEP_PMON_BOX_CTL_INT, box->io_addr);
}
static void snr_uncore_mmio_init_box(struct intel_uncore_box *box)
{
__snr_uncore_mmio_init_box(box, uncore_mmio_box_ctl(box),
- SNR_IMC_MMIO_MEM0_OFFSET);
+ SNR_IMC_MMIO_MEM0_OFFSET,
+ SNR_MC_DEVICE_ID);
}
static void snr_uncore_mmio_disable_box(struct intel_uncore_box *box)
return pmu_iio_set_mapping(type, &icx_iio_mapping_group);
}
+static void icx_iio_cleanup_mapping(struct intel_uncore_type *type)
+{
+ pmu_iio_cleanup_mapping(type, &icx_iio_mapping_group);
+}
+
static struct intel_uncore_type icx_uncore_iio = {
.name = "iio",
.num_counters = 4,
.attr_update = icx_iio_attr_update,
.get_topology = icx_iio_get_topology,
.set_mapping = icx_iio_set_mapping,
- .cleanup_mapping = skx_iio_cleanup_mapping,
+ .cleanup_mapping = icx_iio_cleanup_mapping,
};
static struct intel_uncore_type icx_uncore_irp = {
int mem_offset = (box->pmu->pmu_idx / ICX_NUMBER_IMC_CHN) * ICX_IMC_MEM_STRIDE +
SNR_IMC_MMIO_MEM0_OFFSET;
- __snr_uncore_mmio_init_box(box, box_ctl, mem_offset);
+ __snr_uncore_mmio_init_box(box, box_ctl, mem_offset,
+ SNR_MC_DEVICE_ID);
}
static struct intel_uncore_ops icx_uncore_mmio_ops = {
int mem_offset = box->pmu->pmu_idx * ICX_IMC_MEM_STRIDE +
SNR_IMC_MMIO_MEM0_OFFSET;
- __snr_uncore_mmio_init_box(box, uncore_mmio_box_ctl(box), mem_offset);
+ snr_uncore_mmio_map(box, uncore_mmio_box_ctl(box),
+ mem_offset, SNR_MC_DEVICE_ID);
}
static struct intel_uncore_ops icx_uncore_imc_freerunning_ops = {
}
/* end of ICX uncore support */
+
+/* SPR uncore support */
+
+static void spr_uncore_msr_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct hw_perf_event_extra *reg1 = &hwc->extra_reg;
+
+ if (reg1->idx != EXTRA_REG_NONE)
+ wrmsrl(reg1->reg, reg1->config);
+
+ wrmsrl(hwc->config_base, hwc->config);
+}
+
+static void spr_uncore_msr_disable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+ struct hw_perf_event_extra *reg1 = &hwc->extra_reg;
+
+ if (reg1->idx != EXTRA_REG_NONE)
+ wrmsrl(reg1->reg, 0);
+
+ wrmsrl(hwc->config_base, 0);
+}
+
+static int spr_cha_hw_config(struct intel_uncore_box *box, struct perf_event *event)
+{
+ struct hw_perf_event_extra *reg1 = &event->hw.extra_reg;
+ bool tie_en = !!(event->hw.config & SPR_CHA_PMON_CTL_TID_EN);
+ struct intel_uncore_type *type = box->pmu->type;
+
+ if (tie_en) {
+ reg1->reg = SPR_C0_MSR_PMON_BOX_FILTER0 +
+ HSWEP_CBO_MSR_OFFSET * type->box_ids[box->pmu->pmu_idx];
+ reg1->config = event->attr.config1 & SPR_CHA_PMON_BOX_FILTER_TID;
+ reg1->idx = 0;
+ }
+
+ return 0;
+}
+
+static struct intel_uncore_ops spr_uncore_chabox_ops = {
+ .init_box = intel_generic_uncore_msr_init_box,
+ .disable_box = intel_generic_uncore_msr_disable_box,
+ .enable_box = intel_generic_uncore_msr_enable_box,
+ .disable_event = spr_uncore_msr_disable_event,
+ .enable_event = spr_uncore_msr_enable_event,
+ .read_counter = uncore_msr_read_counter,
+ .hw_config = spr_cha_hw_config,
+ .get_constraint = uncore_get_constraint,
+ .put_constraint = uncore_put_constraint,
+};
+
+static struct attribute *spr_uncore_cha_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask_ext4.attr,
+ &format_attr_tid_en2.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_thresh8.attr,
+ &format_attr_filter_tid5.attr,
+ NULL,
+};
+static const struct attribute_group spr_uncore_chabox_format_group = {
+ .name = "format",
+ .attrs = spr_uncore_cha_formats_attr,
+};
+
+static ssize_t alias_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct intel_uncore_pmu *pmu = dev_to_uncore_pmu(dev);
+ char pmu_name[UNCORE_PMU_NAME_LEN];
+
+ uncore_get_alias_name(pmu_name, pmu);
+ return sysfs_emit(buf, "%s\n", pmu_name);
+}
+
+static DEVICE_ATTR_RO(alias);
+
+static struct attribute *uncore_alias_attrs[] = {
+ &dev_attr_alias.attr,
+ NULL
+};
+
+ATTRIBUTE_GROUPS(uncore_alias);
+
+static struct intel_uncore_type spr_uncore_chabox = {
+ .name = "cha",
+ .event_mask = SPR_CHA_PMON_EVENT_MASK,
+ .event_mask_ext = SPR_RAW_EVENT_MASK_EXT,
+ .num_shared_regs = 1,
+ .ops = &spr_uncore_chabox_ops,
+ .format_group = &spr_uncore_chabox_format_group,
+ .attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type spr_uncore_iio = {
+ .name = "iio",
+ .event_mask = SNBEP_PMON_RAW_EVENT_MASK,
+ .event_mask_ext = SNR_IIO_PMON_RAW_EVENT_MASK_EXT,
+ .format_group = &snr_uncore_iio_format_group,
+ .attr_update = uncore_alias_groups,
+};
+
+static struct attribute *spr_uncore_raw_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask_ext4.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_thresh8.attr,
+ NULL,
+};
+
+static const struct attribute_group spr_uncore_raw_format_group = {
+ .name = "format",
+ .attrs = spr_uncore_raw_formats_attr,
+};
+
+#define SPR_UNCORE_COMMON_FORMAT() \
+ .event_mask = SNBEP_PMON_RAW_EVENT_MASK, \
+ .event_mask_ext = SPR_RAW_EVENT_MASK_EXT, \
+ .format_group = &spr_uncore_raw_format_group, \
+ .attr_update = uncore_alias_groups
+
+static struct intel_uncore_type spr_uncore_irp = {
+ SPR_UNCORE_COMMON_FORMAT(),
+ .name = "irp",
+
+};
+
+static struct intel_uncore_type spr_uncore_m2pcie = {
+ SPR_UNCORE_COMMON_FORMAT(),
+ .name = "m2pcie",
+};
+
+static struct intel_uncore_type spr_uncore_pcu = {
+ .name = "pcu",
+ .attr_update = uncore_alias_groups,
+};
+
+static void spr_uncore_mmio_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (!box->io_addr)
+ return;
+
+ if (uncore_pmc_fixed(hwc->idx))
+ writel(SNBEP_PMON_CTL_EN, box->io_addr + hwc->config_base);
+ else
+ writel(hwc->config, box->io_addr + hwc->config_base);
+}
+
+static struct intel_uncore_ops spr_uncore_mmio_ops = {
+ .init_box = intel_generic_uncore_mmio_init_box,
+ .exit_box = uncore_mmio_exit_box,
+ .disable_box = intel_generic_uncore_mmio_disable_box,
+ .enable_box = intel_generic_uncore_mmio_enable_box,
+ .disable_event = intel_generic_uncore_mmio_disable_event,
+ .enable_event = spr_uncore_mmio_enable_event,
+ .read_counter = uncore_mmio_read_counter,
+};
+
+static struct intel_uncore_type spr_uncore_imc = {
+ SPR_UNCORE_COMMON_FORMAT(),
+ .name = "imc",
+ .fixed_ctr_bits = 48,
+ .fixed_ctr = SNR_IMC_MMIO_PMON_FIXED_CTR,
+ .fixed_ctl = SNR_IMC_MMIO_PMON_FIXED_CTL,
+ .ops = &spr_uncore_mmio_ops,
+};
+
+static void spr_uncore_pci_enable_event(struct intel_uncore_box *box,
+ struct perf_event *event)
+{
+ struct pci_dev *pdev = box->pci_dev;
+ struct hw_perf_event *hwc = &event->hw;
+
+ pci_write_config_dword(pdev, hwc->config_base + 4, (u32)(hwc->config >> 32));
+ pci_write_config_dword(pdev, hwc->config_base, (u32)hwc->config);
+}
+
+static struct intel_uncore_ops spr_uncore_pci_ops = {
+ .init_box = intel_generic_uncore_pci_init_box,
+ .disable_box = intel_generic_uncore_pci_disable_box,
+ .enable_box = intel_generic_uncore_pci_enable_box,
+ .disable_event = intel_generic_uncore_pci_disable_event,
+ .enable_event = spr_uncore_pci_enable_event,
+ .read_counter = intel_generic_uncore_pci_read_counter,
+};
+
+#define SPR_UNCORE_PCI_COMMON_FORMAT() \
+ SPR_UNCORE_COMMON_FORMAT(), \
+ .ops = &spr_uncore_pci_ops
+
+static struct intel_uncore_type spr_uncore_m2m = {
+ SPR_UNCORE_PCI_COMMON_FORMAT(),
+ .name = "m2m",
+};
+
+static struct intel_uncore_type spr_uncore_upi = {
+ SPR_UNCORE_PCI_COMMON_FORMAT(),
+ .name = "upi",
+};
+
+static struct intel_uncore_type spr_uncore_m3upi = {
+ SPR_UNCORE_PCI_COMMON_FORMAT(),
+ .name = "m3upi",
+};
+
+static struct intel_uncore_type spr_uncore_mdf = {
+ SPR_UNCORE_COMMON_FORMAT(),
+ .name = "mdf",
+};
+
+#define UNCORE_SPR_NUM_UNCORE_TYPES 12
+#define UNCORE_SPR_IIO 1
+#define UNCORE_SPR_IMC 6
+
+static struct intel_uncore_type *spr_uncores[UNCORE_SPR_NUM_UNCORE_TYPES] = {
+ &spr_uncore_chabox,
+ &spr_uncore_iio,
+ &spr_uncore_irp,
+ &spr_uncore_m2pcie,
+ &spr_uncore_pcu,
+ NULL,
+ &spr_uncore_imc,
+ &spr_uncore_m2m,
+ &spr_uncore_upi,
+ &spr_uncore_m3upi,
+ NULL,
+ &spr_uncore_mdf,
+};
+
+enum perf_uncore_spr_iio_freerunning_type_id {
+ SPR_IIO_MSR_IOCLK,
+ SPR_IIO_MSR_BW_IN,
+ SPR_IIO_MSR_BW_OUT,
+
+ SPR_IIO_FREERUNNING_TYPE_MAX,
+};
+
+static struct freerunning_counters spr_iio_freerunning[] = {
+ [SPR_IIO_MSR_IOCLK] = { 0x340e, 0x1, 0x10, 1, 48 },
+ [SPR_IIO_MSR_BW_IN] = { 0x3800, 0x1, 0x10, 8, 48 },
+ [SPR_IIO_MSR_BW_OUT] = { 0x3808, 0x1, 0x10, 8, 48 },
+};
+
+static struct uncore_event_desc spr_uncore_iio_freerunning_events[] = {
+ /* Free-Running IIO CLOCKS Counter */
+ INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"),
+ /* Free-Running IIO BANDWIDTH IN Counters */
+ INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port4, "event=0xff,umask=0x24"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port4.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port4.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port5, "event=0xff,umask=0x25"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port5.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port5.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port6, "event=0xff,umask=0x26"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port6.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port6.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port7, "event=0xff,umask=0x27"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port7.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_in_port7.unit, "MiB"),
+ /* Free-Running IIO BANDWIDTH OUT Counters */
+ INTEL_UNCORE_EVENT_DESC(bw_out_port0, "event=0xff,umask=0x30"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port0.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port0.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port1, "event=0xff,umask=0x31"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port1.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port1.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port2, "event=0xff,umask=0x32"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port2.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port2.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port3, "event=0xff,umask=0x33"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port3.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port3.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port4, "event=0xff,umask=0x34"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port4.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port4.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port5, "event=0xff,umask=0x35"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port5.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port5.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port6, "event=0xff,umask=0x36"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port6.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port6.unit, "MiB"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port7, "event=0xff,umask=0x37"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port7.scale, "3.814697266e-6"),
+ INTEL_UNCORE_EVENT_DESC(bw_out_port7.unit, "MiB"),
+ { /* end: all zeroes */ },
+};
+
+static struct intel_uncore_type spr_uncore_iio_free_running = {
+ .name = "iio_free_running",
+ .num_counters = 17,
+ .num_freerunning_types = SPR_IIO_FREERUNNING_TYPE_MAX,
+ .freerunning = spr_iio_freerunning,
+ .ops = &skx_uncore_iio_freerunning_ops,
+ .event_descs = spr_uncore_iio_freerunning_events,
+ .format_group = &skx_uncore_iio_freerunning_format_group,
+};
+
+enum perf_uncore_spr_imc_freerunning_type_id {
+ SPR_IMC_DCLK,
+ SPR_IMC_PQ_CYCLES,
+
+ SPR_IMC_FREERUNNING_TYPE_MAX,
+};
+
+static struct freerunning_counters spr_imc_freerunning[] = {
+ [SPR_IMC_DCLK] = { 0x22b0, 0x0, 0, 1, 48 },
+ [SPR_IMC_PQ_CYCLES] = { 0x2318, 0x8, 0, 2, 48 },
+};
+
+static struct uncore_event_desc spr_uncore_imc_freerunning_events[] = {
+ INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"),
+
+ INTEL_UNCORE_EVENT_DESC(rpq_cycles, "event=0xff,umask=0x20"),
+ INTEL_UNCORE_EVENT_DESC(wpq_cycles, "event=0xff,umask=0x21"),
+ { /* end: all zeroes */ },
+};
+
+#define SPR_MC_DEVICE_ID 0x3251
+
+static void spr_uncore_imc_freerunning_init_box(struct intel_uncore_box *box)
+{
+ int mem_offset = box->pmu->pmu_idx * ICX_IMC_MEM_STRIDE + SNR_IMC_MMIO_MEM0_OFFSET;
+
+ snr_uncore_mmio_map(box, uncore_mmio_box_ctl(box),
+ mem_offset, SPR_MC_DEVICE_ID);
+}
+
+static struct intel_uncore_ops spr_uncore_imc_freerunning_ops = {
+ .init_box = spr_uncore_imc_freerunning_init_box,
+ .exit_box = uncore_mmio_exit_box,
+ .read_counter = uncore_mmio_read_counter,
+ .hw_config = uncore_freerunning_hw_config,
+};
+
+static struct intel_uncore_type spr_uncore_imc_free_running = {
+ .name = "imc_free_running",
+ .num_counters = 3,
+ .mmio_map_size = SNR_IMC_MMIO_SIZE,
+ .num_freerunning_types = SPR_IMC_FREERUNNING_TYPE_MAX,
+ .freerunning = spr_imc_freerunning,
+ .ops = &spr_uncore_imc_freerunning_ops,
+ .event_descs = spr_uncore_imc_freerunning_events,
+ .format_group = &skx_uncore_iio_freerunning_format_group,
+};
+
+#define UNCORE_SPR_MSR_EXTRA_UNCORES 1
+#define UNCORE_SPR_MMIO_EXTRA_UNCORES 1
+
+static struct intel_uncore_type *spr_msr_uncores[UNCORE_SPR_MSR_EXTRA_UNCORES] = {
+ &spr_uncore_iio_free_running,
+};
+
+static struct intel_uncore_type *spr_mmio_uncores[UNCORE_SPR_MMIO_EXTRA_UNCORES] = {
+ &spr_uncore_imc_free_running,
+};
+
+static void uncore_type_customized_copy(struct intel_uncore_type *to_type,
+ struct intel_uncore_type *from_type)
+{
+ if (!to_type || !from_type)
+ return;
+
+ if (from_type->name)
+ to_type->name = from_type->name;
+ if (from_type->fixed_ctr_bits)
+ to_type->fixed_ctr_bits = from_type->fixed_ctr_bits;
+ if (from_type->event_mask)
+ to_type->event_mask = from_type->event_mask;
+ if (from_type->event_mask_ext)
+ to_type->event_mask_ext = from_type->event_mask_ext;
+ if (from_type->fixed_ctr)
+ to_type->fixed_ctr = from_type->fixed_ctr;
+ if (from_type->fixed_ctl)
+ to_type->fixed_ctl = from_type->fixed_ctl;
+ if (from_type->fixed_ctr_bits)
+ to_type->fixed_ctr_bits = from_type->fixed_ctr_bits;
+ if (from_type->num_shared_regs)
+ to_type->num_shared_regs = from_type->num_shared_regs;
+ if (from_type->constraints)
+ to_type->constraints = from_type->constraints;
+ if (from_type->ops)
+ to_type->ops = from_type->ops;
+ if (from_type->event_descs)
+ to_type->event_descs = from_type->event_descs;
+ if (from_type->format_group)
+ to_type->format_group = from_type->format_group;
+ if (from_type->attr_update)
+ to_type->attr_update = from_type->attr_update;
+}
+
+static struct intel_uncore_type **
+uncore_get_uncores(enum uncore_access_type type_id, int num_extra,
+ struct intel_uncore_type **extra)
+{
+ struct intel_uncore_type **types, **start_types;
+ int i;
+
+ start_types = types = intel_uncore_generic_init_uncores(type_id, num_extra);
+
+ /* Only copy the customized features */
+ for (; *types; types++) {
+ if ((*types)->type_id >= UNCORE_SPR_NUM_UNCORE_TYPES)
+ continue;
+ uncore_type_customized_copy(*types, spr_uncores[(*types)->type_id]);
+ }
+
+ for (i = 0; i < num_extra; i++, types++)
+ *types = extra[i];
+
+ return start_types;
+}
+
+static struct intel_uncore_type *
+uncore_find_type_by_id(struct intel_uncore_type **types, int type_id)
+{
+ for (; *types; types++) {
+ if (type_id == (*types)->type_id)
+ return *types;
+ }
+
+ return NULL;
+}
+
+static int uncore_type_max_boxes(struct intel_uncore_type **types,
+ int type_id)
+{
+ struct intel_uncore_type *type;
+ int i, max = 0;
+
+ type = uncore_find_type_by_id(types, type_id);
+ if (!type)
+ return 0;
+
+ for (i = 0; i < type->num_boxes; i++) {
+ if (type->box_ids[i] > max)
+ max = type->box_ids[i];
+ }
+
+ return max + 1;
+}
+
+void spr_uncore_cpu_init(void)
+{
+ uncore_msr_uncores = uncore_get_uncores(UNCORE_ACCESS_MSR,
+ UNCORE_SPR_MSR_EXTRA_UNCORES,
+ spr_msr_uncores);
+
+ spr_uncore_iio_free_running.num_boxes = uncore_type_max_boxes(uncore_msr_uncores, UNCORE_SPR_IIO);
+}
+
+int spr_uncore_pci_init(void)
+{
+ uncore_pci_uncores = uncore_get_uncores(UNCORE_ACCESS_PCI, 0, NULL);
+ return 0;
+}
+
+void spr_uncore_mmio_init(void)
+{
+ int ret = snbep_pci2phy_map_init(0x3250, SKX_CPUNODEID, SKX_GIDNIDMAP, true);
+
+ if (ret)
+ uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 0, NULL);
+ else {
+ uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
+ UNCORE_SPR_MMIO_EXTRA_UNCORES,
+ spr_mmio_uncores);
+
+ spr_uncore_imc_free_running.num_boxes = uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_SPR_IMC) / 2;
+ }
+}
+
+/* end of SPR uncore support */
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * From PPR Vol 1 for AMD Family 19h Model 01h B1
+ * 55898 Rev 0.35 - Feb 5, 2021
+ */
+
+#include <asm/msr-index.h>
+
+/*
+ * IBS Hardware MSRs
+ */
+
+/* MSR 0xc0011030: IBS Fetch Control */
+union ibs_fetch_ctl {
+ __u64 val;
+ struct {
+ __u64 fetch_maxcnt:16,/* 0-15: instruction fetch max. count */
+ fetch_cnt:16, /* 16-31: instruction fetch count */
+ fetch_lat:16, /* 32-47: instruction fetch latency */
+ fetch_en:1, /* 48: instruction fetch enable */
+ fetch_val:1, /* 49: instruction fetch valid */
+ fetch_comp:1, /* 50: instruction fetch complete */
+ ic_miss:1, /* 51: i-cache miss */
+ phy_addr_valid:1,/* 52: physical address valid */
+ l1tlb_pgsz:2, /* 53-54: i-cache L1TLB page size
+ * (needs IbsPhyAddrValid) */
+ l1tlb_miss:1, /* 55: i-cache fetch missed in L1TLB */
+ l2tlb_miss:1, /* 56: i-cache fetch missed in L2TLB */
+ rand_en:1, /* 57: random tagging enable */
+ fetch_l2_miss:1,/* 58: L2 miss for sampled fetch
+ * (needs IbsFetchComp) */
+ reserved:5; /* 59-63: reserved */
+ };
+};
+
+/* MSR 0xc0011033: IBS Execution Control */
+union ibs_op_ctl {
+ __u64 val;
+ struct {
+ __u64 opmaxcnt:16, /* 0-15: periodic op max. count */
+ reserved0:1, /* 16: reserved */
+ op_en:1, /* 17: op sampling enable */
+ op_val:1, /* 18: op sample valid */
+ cnt_ctl:1, /* 19: periodic op counter control */
+ opmaxcnt_ext:7, /* 20-26: upper 7 bits of periodic op maximum count */
+ reserved1:5, /* 27-31: reserved */
+ opcurcnt:27, /* 32-58: periodic op counter current count */
+ reserved2:5; /* 59-63: reserved */
+ };
+};
+
+/* MSR 0xc0011035: IBS Op Data 2 */
+union ibs_op_data {
+ __u64 val;
+ struct {
+ __u64 comp_to_ret_ctr:16, /* 0-15: op completion to retire count */
+ tag_to_ret_ctr:16, /* 15-31: op tag to retire count */
+ reserved1:2, /* 32-33: reserved */
+ op_return:1, /* 34: return op */
+ op_brn_taken:1, /* 35: taken branch op */
+ op_brn_misp:1, /* 36: mispredicted branch op */
+ op_brn_ret:1, /* 37: branch op retired */
+ op_rip_invalid:1, /* 38: RIP is invalid */
+ op_brn_fuse:1, /* 39: fused branch op */
+ op_microcode:1, /* 40: microcode op */
+ reserved2:23; /* 41-63: reserved */
+ };
+};
+
+/* MSR 0xc0011036: IBS Op Data 2 */
+union ibs_op_data2 {
+ __u64 val;
+ struct {
+ __u64 data_src:3, /* 0-2: data source */
+ reserved0:1, /* 3: reserved */
+ rmt_node:1, /* 4: destination node */
+ cache_hit_st:1, /* 5: cache hit state */
+ reserved1:57; /* 5-63: reserved */
+ };
+};
+
+/* MSR 0xc0011037: IBS Op Data 3 */
+union ibs_op_data3 {
+ __u64 val;
+ struct {
+ __u64 ld_op:1, /* 0: load op */
+ st_op:1, /* 1: store op */
+ dc_l1tlb_miss:1, /* 2: data cache L1TLB miss */
+ dc_l2tlb_miss:1, /* 3: data cache L2TLB hit in 2M page */
+ dc_l1tlb_hit_2m:1, /* 4: data cache L1TLB hit in 2M page */
+ dc_l1tlb_hit_1g:1, /* 5: data cache L1TLB hit in 1G page */
+ dc_l2tlb_hit_2m:1, /* 6: data cache L2TLB hit in 2M page */
+ dc_miss:1, /* 7: data cache miss */
+ dc_mis_acc:1, /* 8: misaligned access */
+ reserved:4, /* 9-12: reserved */
+ dc_wc_mem_acc:1, /* 13: write combining memory access */
+ dc_uc_mem_acc:1, /* 14: uncacheable memory access */
+ dc_locked_op:1, /* 15: locked operation */
+ dc_miss_no_mab_alloc:1, /* 16: DC miss with no MAB allocated */
+ dc_lin_addr_valid:1, /* 17: data cache linear address valid */
+ dc_phy_addr_valid:1, /* 18: data cache physical address valid */
+ dc_l2_tlb_hit_1g:1, /* 19: data cache L2 hit in 1GB page */
+ l2_miss:1, /* 20: L2 cache miss */
+ sw_pf:1, /* 21: software prefetch */
+ op_mem_width:4, /* 22-25: load/store size in bytes */
+ op_dc_miss_open_mem_reqs:6, /* 26-31: outstanding mem reqs on DC fill */
+ dc_miss_lat:16, /* 32-47: data cache miss latency */
+ tlb_refill_lat:16; /* 48-63: L1 TLB refill latency */
+ };
+};
+
+/* MSR 0xc001103c: IBS Fetch Control Extended */
+union ic_ibs_extd_ctl {
+ __u64 val;
+ struct {
+ __u64 itlb_refill_lat:16, /* 0-15: ITLB Refill latency for sampled fetch */
+ reserved:48; /* 16-63: reserved */
+ };
+};
+
+/*
+ * IBS driver related
+ */
+
+struct perf_ibs_data {
+ u32 size;
+ union {
+ u32 data[0]; /* data buffer starts here */
+ u32 caps;
+ };
+ u64 regs[MSR_AMD64_IBS_REG_COUNT_MAX];
+};
#define PIC_MASTER_OCW3 PIC_MASTER_ISR
#define PIC_SLAVE_CMD 0xa0
#define PIC_SLAVE_IMR 0xa1
+#define PIC_ELCR1 0x4d0
+#define PIC_ELCR2 0x4d1
/* i8259A PIC related value */
#define PIC_CASCADE_IR 2
#ifndef _ASM_X86_KFENCE_H
#define _ASM_X86_KFENCE_H
+#ifndef MODULE
+
#include <linux/bug.h>
#include <linux/kfence.h>
return true;
}
+#endif /* !MODULE */
+
#endif /* _ASM_X86_KFENCE_H */
MCP_TIMESTAMP = BIT(0), /* log time stamp */
MCP_UC = BIT(1), /* log uncorrected errors */
MCP_DONTLOG = BIT(2), /* only clear, don't log */
+ MCP_QUEUE_LOG = BIT(3), /* only queue to genpool */
};
bool machine_check_poll(enum mcp_flags flags, mce_banks_t *b);
DECLARE_STATIC_KEY_FALSE(mds_user_clear);
DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
+DECLARE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
+
#include <asm/segment.h>
/**
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Support for the configuration register space at port I/O locations
+ * 0x22 and 0x23 variously used by PC architectures, e.g. the MP Spec,
+ * Cyrix CPUs, numerous chipsets.
+ */
+#ifndef _ASM_X86_PC_CONF_REG_H
+#define _ASM_X86_PC_CONF_REG_H
+
+#include <linux/io.h>
+#include <linux/spinlock.h>
+#include <linux/types.h>
+
+#define PC_CONF_INDEX 0x22
+#define PC_CONF_DATA 0x23
+
+#define PC_CONF_MPS_IMCR 0x70
+
+extern raw_spinlock_t pc_conf_lock;
+
+static inline u8 pc_conf_get(u8 reg)
+{
+ outb(reg, PC_CONF_INDEX);
+ return inb(PC_CONF_DATA);
+}
+
+static inline void pc_conf_set(u8 reg, u8 data)
+{
+ outb(reg, PC_CONF_INDEX);
+ outb(data, PC_CONF_DATA);
+}
+
+#endif /* _ASM_X86_PC_CONF_REG_H */
* Access order is always 0x22 (=offset), 0x23 (=value)
*/
+#include <asm/pc-conf-reg.h>
+
static inline u8 getCx86(u8 reg)
{
- outb(reg, 0x22);
- return inb(0x23);
+ return pc_conf_get(reg);
}
static inline void setCx86(u8 reg, u8 data)
{
- outb(reg, 0x22);
- outb(data, 0x23);
+ pc_conf_set(reg, data);
}
u16 logical_die_id;
/* Index into per_cpu list: */
u16 cpu_index;
+ /* Is SMT active on this core? */
+ bool smt_active;
u32 microcode;
/* Address space bits used by the cache internally */
u8 x86_cache_bits;
DECLARE_PER_CPU(u64, msr_misc_features_shadow);
+extern u16 get_llc_id(unsigned int cpu);
+
#ifdef CONFIG_CPU_SUP_AMD
extern u32 amd_get_nodes_per_socket(void);
extern u32 amd_get_highest_perf(void);
#define TIF_SINGLESTEP 4 /* reenable singlestep on user return*/
#define TIF_SSBD 5 /* Speculative store bypass disable */
#define TIF_SPEC_IB 9 /* Indirect branch speculation mitigation */
-#define TIF_SPEC_FORCE_UPDATE 10 /* Force speculation MSR update in context switch */
+#define TIF_SPEC_L1D_FLUSH 10 /* Flush L1D on mm switches (processes) */
#define TIF_USER_RETURN_NOTIFY 11 /* notify kernel of userspace return */
#define TIF_UPROBE 12 /* breakpointed or singlestepping */
#define TIF_PATCH_PENDING 13 /* pending live patching update */
#define TIF_MEMDIE 20 /* is terminating due to OOM killer */
#define TIF_POLLING_NRFLAG 21 /* idle is polling for TIF_NEED_RESCHED */
#define TIF_IO_BITMAP 22 /* uses I/O bitmap */
+#define TIF_SPEC_FORCE_UPDATE 23 /* Force speculation MSR update in context switch */
#define TIF_FORCED_TF 24 /* true if TF in eflags artificially */
#define TIF_BLOCKSTEP 25 /* set when we want DEBUGCTLMSR_BTF */
#define TIF_LAZY_MMU_UPDATES 27 /* task is updating the mmu lazily */
#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_SSBD (1 << TIF_SSBD)
#define _TIF_SPEC_IB (1 << TIF_SPEC_IB)
-#define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE)
+#define _TIF_SPEC_L1D_FLUSH (1 << TIF_SPEC_L1D_FLUSH)
#define _TIF_USER_RETURN_NOTIFY (1 << TIF_USER_RETURN_NOTIFY)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_PATCH_PENDING (1 << TIF_PATCH_PENDING)
#define _TIF_SLD (1 << TIF_SLD)
#define _TIF_POLLING_NRFLAG (1 << TIF_POLLING_NRFLAG)
#define _TIF_IO_BITMAP (1 << TIF_IO_BITMAP)
+#define _TIF_SPEC_FORCE_UPDATE (1 << TIF_SPEC_FORCE_UPDATE)
#define _TIF_FORCED_TF (1 << TIF_FORCED_TF)
#define _TIF_BLOCKSTEP (1 << TIF_BLOCKSTEP)
#define _TIF_LAZY_MMU_UPDATES (1 << TIF_LAZY_MMU_UPDATES)
/* Last user mm for optimizing IBPB */
union {
struct mm_struct *last_user_mm;
- unsigned long last_user_mm_ibpb;
+ unsigned long last_user_mm_spec;
};
u16 loaded_mm_asid;
* If a PIC-mode SCI is not recognized or gives spurious IRQ7's
* it may require Edge Trigger -- use "acpi_sci=edge"
*
- * Port 0x4d0-4d1 are ECLR1 and ECLR2, the Edge/Level Control Registers
+ * Port 0x4d0-4d1 are ELCR1 and ELCR2, the Edge/Level Control Registers
* for the 8259 PIC. bit[n] = 1 means irq[n] is Level, otherwise Edge.
- * ECLR1 is IRQs 0-7 (IRQ 0, 1, 2 must be 0)
- * ECLR2 is IRQs 8-15 (IRQ 8, 13 must be 0)
+ * ELCR1 is IRQs 0-7 (IRQ 0, 1, 2 must be 0)
+ * ELCR2 is IRQs 8-15 (IRQ 8, 13 must be 0)
*/
void __init acpi_pic_sci_set_trigger(unsigned int irq, u16 trigger)
unsigned int old, new;
/* Real old ELCR mask */
- old = inb(0x4d0) | (inb(0x4d1) << 8);
+ old = inb(PIC_ELCR1) | (inb(PIC_ELCR2) << 8);
/*
* If we use ACPI to set PCI IRQs, then we should clear ELCR
return;
pr_warn("setting ELCR to %04x (from %04x)\n", new, old);
- outb(new, 0x4d0);
- outb(new >> 8, 0x4d1);
+ outb(new, PIC_ELCR1);
+ outb(new >> 8, PIC_ELCR2);
}
int acpi_gsi_to_irq(u32 gsi, unsigned int *irqp)
#include <asm/trace/irq_vectors.h>
#include <asm/irq_remapping.h>
+#include <asm/pc-conf-reg.h>
#include <asm/perf_event.h>
#include <asm/x86_init.h>
#include <linux/atomic.h>
*/
static inline void imcr_pic_to_apic(void)
{
- /* select IMCR register */
- outb(0x70, 0x22);
/* NMI and 8259 INTR go through APIC */
- outb(0x01, 0x23);
+ pc_conf_set(PC_CONF_MPS_IMCR, 0x01);
}
static inline void imcr_apic_to_pic(void)
{
- /* select IMCR register */
- outb(0x70, 0x22);
/* NMI and 8259 INTR go directly to BSP */
- outb(0x00, 0x23);
+ pc_conf_set(PC_CONF_MPS_IMCR, 0x00);
}
#endif
static bool EISA_ELCR(unsigned int irq)
{
if (irq < nr_legacy_irqs()) {
- unsigned int port = 0x4d0 + (irq >> 3);
+ unsigned int port = PIC_ELCR1 + (irq >> 3);
return (inb(port) >> (irq & 7)) & 1;
}
apic_printk(APIC_VERBOSE, KERN_INFO
pr_debug("... PIC ISR: %04x\n", v);
- v = inb(0x4d1) << 8 | inb(0x4d0);
+ v = inb(PIC_ELCR2) << 8 | inb(PIC_ELCR1);
pr_debug("... PIC ELCR: %04x\n", v);
}
node = numa_cpu_node(cpu);
if (node == NUMA_NO_NODE)
- node = per_cpu(cpu_llc_id, cpu);
+ node = get_llc_id(cpu);
/*
* On multi-fabric platform (e.g. Numascale NumaChip) a
static void __init mds_print_mitigation(void);
static void __init taa_select_mitigation(void);
static void __init srbds_select_mitigation(void);
+static void __init l1d_flush_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
u64 x86_spec_ctrl_base;
DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
EXPORT_SYMBOL_GPL(mds_idle_clear);
+/*
+ * Controls whether l1d flush based mitigations are enabled,
+ * based on hw features and admin setting via boot parameter
+ * defaults to false
+ */
+DEFINE_STATIC_KEY_FALSE(switch_mm_cond_l1d_flush);
+
void __init check_bugs(void)
{
identify_boot_cpu();
mds_select_mitigation();
taa_select_mitigation();
srbds_select_mitigation();
+ l1d_flush_select_mitigation();
/*
* As MDS and TAA mitigations are inter-related, print MDS
}
early_param("srbds", srbds_parse_cmdline);
+#undef pr_fmt
+#define pr_fmt(fmt) "L1D Flush : " fmt
+
+enum l1d_flush_mitigations {
+ L1D_FLUSH_OFF = 0,
+ L1D_FLUSH_ON,
+};
+
+static enum l1d_flush_mitigations l1d_flush_mitigation __initdata = L1D_FLUSH_OFF;
+
+static void __init l1d_flush_select_mitigation(void)
+{
+ if (!l1d_flush_mitigation || !boot_cpu_has(X86_FEATURE_FLUSH_L1D))
+ return;
+
+ static_branch_enable(&switch_mm_cond_l1d_flush);
+ pr_info("Conditional flush on switch_mm() enabled\n");
+}
+
+static int __init l1d_flush_parse_cmdline(char *str)
+{
+ if (!strcmp(str, "on"))
+ l1d_flush_mitigation = L1D_FLUSH_ON;
+
+ return 0;
+}
+early_param("l1d_flush", l1d_flush_parse_cmdline);
+
#undef pr_fmt
#define pr_fmt(fmt) "Spectre V1 : " fmt
speculation_ctrl_update_current();
}
+static int l1d_flush_prctl_set(struct task_struct *task, unsigned long ctrl)
+{
+
+ if (!static_branch_unlikely(&switch_mm_cond_l1d_flush))
+ return -EPERM;
+
+ switch (ctrl) {
+ case PR_SPEC_ENABLE:
+ set_ti_thread_flag(&task->thread_info, TIF_SPEC_L1D_FLUSH);
+ return 0;
+ case PR_SPEC_DISABLE:
+ clear_ti_thread_flag(&task->thread_info, TIF_SPEC_L1D_FLUSH);
+ return 0;
+ default:
+ return -ERANGE;
+ }
+}
+
static int ssb_prctl_set(struct task_struct *task, unsigned long ctrl)
{
if (ssb_mode != SPEC_STORE_BYPASS_PRCTL &&
return ssb_prctl_set(task, ctrl);
case PR_SPEC_INDIRECT_BRANCH:
return ib_prctl_set(task, ctrl);
+ case PR_SPEC_L1D_FLUSH:
+ return l1d_flush_prctl_set(task, ctrl);
default:
return -ENODEV;
}
}
#endif
+static int l1d_flush_prctl_get(struct task_struct *task)
+{
+ if (!static_branch_unlikely(&switch_mm_cond_l1d_flush))
+ return PR_SPEC_FORCE_DISABLE;
+
+ if (test_ti_thread_flag(&task->thread_info, TIF_SPEC_L1D_FLUSH))
+ return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
+ else
+ return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
+}
+
static int ssb_prctl_get(struct task_struct *task)
{
switch (ssb_mode) {
return ssb_prctl_get(task);
case PR_SPEC_INDIRECT_BRANCH:
return ib_prctl_get(task);
+ case PR_SPEC_L1D_FLUSH:
+ return l1d_flush_prctl_get(task);
default:
return -ENODEV;
}
/* Last level cache ID of each logical CPU */
DEFINE_PER_CPU_READ_MOSTLY(u16, cpu_llc_id) = BAD_APICID;
+u16 get_llc_id(unsigned int cpu)
+{
+ return per_cpu(cpu_llc_id, cpu);
+}
+EXPORT_SYMBOL_GPL(get_llc_id);
+
/* correctly size the local cpu masks */
void __init setup_cpu_local_masks(void)
{
if (mca_cfg.dont_log_ce && !mce_usable_address(&m))
goto clear_it;
- mce_log(&m);
+ if (flags & MCP_QUEUE_LOG)
+ mce_gen_pool_add(&m);
+ else
+ mce_log(&m);
clear_it:
/*
m_fl = MCP_DONTLOG;
/*
- * Log the machine checks left over from the previous reset.
+ * Log the machine checks left over from the previous reset. Log them
+ * only, do not start processing them. That will happen in mcheck_late_init()
+ * when all consumers have been registered on the notifier chain.
*/
bitmap_fill(all_banks, MAX_NR_BANKS);
- machine_check_poll(MCP_UC | m_fl, &all_banks);
+ machine_check_poll(MCP_UC | MCP_QUEUE_LOG | m_fl, &all_banks);
cr4_set_bits(X86_CR4_MCE);
unsigned long start;
int cpu;
- get_online_cpus();
+ cpus_read_lock();
cpumask_copy(mce_inject_cpumask, cpu_online_mask);
cpumask_clear_cpu(get_cpu(), mce_inject_cpumask);
for_each_online_cpu(cpu) {
}
raise_local();
put_cpu();
- put_online_cpus();
+ cpus_read_unlock();
} else {
preempt_disable();
raise_local();
cpu = get_nbc_for_node(topology_die_id(cpu));
}
- get_online_cpus();
+ cpus_read_lock();
if (!cpu_online(cpu))
goto err;
}
err:
- put_online_cpus();
+ cpus_read_unlock();
}
* All non cpu-hotplug-callback call sites use:
*
* - microcode_mutex to synchronize with each other;
- * - get/put_online_cpus() to synchronize with
+ * - cpus_read_lock/unlock() to synchronize with
* the cpu-hotplug-callback call sites.
*
* We guarantee that only a single cpu is being
return ret;
}
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(µcode_mutex);
if (do_microcode_update(buf, len) == 0)
perf_check_microcode();
mutex_unlock(µcode_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return ret;
}
if (val != 1)
return size;
- get_online_cpus();
+ cpus_read_lock();
ret = check_online_cpus();
if (ret)
mutex_unlock(µcode_mutex);
put:
- put_online_cpus();
+ cpus_read_unlock();
if (ret == 0)
ret = size;
if (IS_ERR(microcode_pdev))
return PTR_ERR(microcode_pdev);
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(µcode_mutex);
error = subsys_interface_register(&mc_cpu_interface);
if (!error)
perf_check_microcode();
mutex_unlock(µcode_mutex);
- put_online_cpus();
+ cpus_read_unlock();
if (error)
goto out_pdev;
&cpu_root_microcode_group);
out_driver:
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(µcode_mutex);
subsys_interface_unregister(&mc_cpu_interface);
mutex_unlock(µcode_mutex);
- put_online_cpus();
+ cpus_read_unlock();
out_pdev:
platform_device_unregister(microcode_pdev);
replace = -1;
/* No CPU hotplug when we change MTRR entries */
- get_online_cpus();
+ cpus_read_lock();
/* Search for existing MTRR */
mutex_lock(&mtrr_mutex);
error = i;
out:
mutex_unlock(&mtrr_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return error;
}
max = num_var_ranges;
/* No CPU hotplug when we change MTRR entries */
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&mtrr_mutex);
if (reg < 0) {
/* Search for existing MTRR */
error = reg;
out:
mutex_unlock(&mtrr_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return error;
}
mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
-#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].domains)
+#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.domains)
-struct rdt_resource rdt_resources_all[] = {
+struct rdt_hw_resource rdt_resources_all[] = {
[RDT_RESOURCE_L3] =
{
- .rid = RDT_RESOURCE_L3,
- .name = "L3",
- .domains = domain_init(RDT_RESOURCE_L3),
- .msr_base = MSR_IA32_L3_CBM_BASE,
- .msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- [RDT_RESOURCE_L3DATA] =
- {
- .rid = RDT_RESOURCE_L3DATA,
- .name = "L3DATA",
- .domains = domain_init(RDT_RESOURCE_L3DATA),
- .msr_base = MSR_IA32_L3_CBM_BASE,
- .msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
+ .r_resctrl = {
+ .rid = RDT_RESOURCE_L3,
+ .name = "L3",
+ .cache_level = 3,
+ .cache = {
+ .min_cbm_bits = 1,
+ },
+ .domains = domain_init(RDT_RESOURCE_L3),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
},
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- [RDT_RESOURCE_L3CODE] =
- {
- .rid = RDT_RESOURCE_L3CODE,
- .name = "L3CODE",
- .domains = domain_init(RDT_RESOURCE_L3CODE),
.msr_base = MSR_IA32_L3_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_L2] =
{
- .rid = RDT_RESOURCE_L2,
- .name = "L2",
- .domains = domain_init(RDT_RESOURCE_L2),
- .msr_base = MSR_IA32_L2_CBM_BASE,
- .msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 1,
- .cbm_idx_offset = 0,
+ .r_resctrl = {
+ .rid = RDT_RESOURCE_L2,
+ .name = "L2",
+ .cache_level = 2,
+ .cache = {
+ .min_cbm_bits = 1,
+ },
+ .domains = domain_init(RDT_RESOURCE_L2),
+ .parse_ctrlval = parse_cbm,
+ .format_str = "%d=%0*x",
+ .fflags = RFTYPE_RES_CACHE,
},
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- [RDT_RESOURCE_L2DATA] =
- {
- .rid = RDT_RESOURCE_L2DATA,
- .name = "L2DATA",
- .domains = domain_init(RDT_RESOURCE_L2DATA),
.msr_base = MSR_IA32_L2_CBM_BASE,
.msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 0,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
- },
- [RDT_RESOURCE_L2CODE] =
- {
- .rid = RDT_RESOURCE_L2CODE,
- .name = "L2CODE",
- .domains = domain_init(RDT_RESOURCE_L2CODE),
- .msr_base = MSR_IA32_L2_CBM_BASE,
- .msr_update = cat_wrmsr,
- .cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- .cbm_idx_mult = 2,
- .cbm_idx_offset = 1,
- },
- .parse_ctrlval = parse_cbm,
- .format_str = "%d=%0*x",
- .fflags = RFTYPE_RES_CACHE,
},
[RDT_RESOURCE_MBA] =
{
- .rid = RDT_RESOURCE_MBA,
- .name = "MB",
- .domains = domain_init(RDT_RESOURCE_MBA),
- .cache_level = 3,
- .parse_ctrlval = parse_bw,
- .format_str = "%d=%*u",
- .fflags = RFTYPE_RES_MB,
+ .r_resctrl = {
+ .rid = RDT_RESOURCE_MBA,
+ .name = "MB",
+ .cache_level = 3,
+ .domains = domain_init(RDT_RESOURCE_MBA),
+ .parse_ctrlval = parse_bw,
+ .format_str = "%d=%*u",
+ .fflags = RFTYPE_RES_MB,
+ },
},
};
-static unsigned int cbm_idx(struct rdt_resource *r, unsigned int closid)
-{
- return closid * r->cache.cbm_idx_mult + r->cache.cbm_idx_offset;
-}
-
/*
* cache_alloc_hsw_probe() - Have to probe for Intel haswell server CPUs
* as they do not have CPUID enumeration support for Cache allocation.
*/
static inline void cache_alloc_hsw_probe(void)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_resource *r = &hw_res->r_resctrl;
u32 l, h, max_cbm = BIT_MASK(20) - 1;
if (wrmsr_safe(MSR_IA32_L3_CBM_BASE, max_cbm, 0))
if (l != max_cbm)
return;
- r->num_closid = 4;
+ hw_res->num_closid = 4;
r->default_ctrl = max_cbm;
r->cache.cbm_len = 20;
r->cache.shareable_bits = 0xc0000;
bool is_mba_sc(struct rdt_resource *r)
{
if (!r)
- return rdt_resources_all[RDT_RESOURCE_MBA].membw.mba_sc;
+ return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.mba_sc;
return r->membw.mba_sc;
}
static bool __get_mem_config_intel(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_3_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx, max_delay;
cpuid_count(0x00000010, 3, &eax.full, &ebx, &ecx, &edx.full);
- r->num_closid = edx.split.cos_max + 1;
+ hw_res->num_closid = edx.split.cos_max + 1;
max_delay = eax.split.max_delay + 1;
r->default_ctrl = MAX_MBA_BW;
r->membw.arch_needs_linear = true;
static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_3_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx;
cpuid_count(0x80000020, 1, &eax.full, &ebx, &ecx, &edx.full);
- r->num_closid = edx.split.cos_max + 1;
+ hw_res->num_closid = edx.split.cos_max + 1;
r->default_ctrl = MAX_MBA_BW_AMD;
/* AMD does not use delay */
static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_1_eax eax;
union cpuid_0x10_x_edx edx;
u32 ebx, ecx;
cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
- r->num_closid = edx.split.cos_max + 1;
+ hw_res->num_closid = edx.split.cos_max + 1;
r->cache.cbm_len = eax.split.cbm_len + 1;
r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
r->cache.shareable_bits = ebx & r->default_ctrl;
r->alloc_enabled = true;
}
-static void rdt_get_cdp_config(int level, int type)
+static void rdt_get_cdp_config(int level)
{
- struct rdt_resource *r_l = &rdt_resources_all[level];
- struct rdt_resource *r = &rdt_resources_all[type];
-
- r->num_closid = r_l->num_closid / 2;
- r->cache.cbm_len = r_l->cache.cbm_len;
- r->default_ctrl = r_l->default_ctrl;
- r->cache.shareable_bits = r_l->cache.shareable_bits;
- r->data_width = (r->cache.cbm_len + 3) / 4;
- r->alloc_capable = true;
/*
* By default, CDP is disabled. CDP can be enabled by mount parameter
* "cdp" during resctrl file system mount time.
*/
- r->alloc_enabled = false;
+ rdt_resources_all[level].cdp_enabled = false;
+ rdt_resources_all[level].r_resctrl.cdp_capable = true;
}
static void rdt_get_cdp_l3_config(void)
{
- rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA);
- rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3CODE);
+ rdt_get_cdp_config(RDT_RESOURCE_L3);
}
static void rdt_get_cdp_l2_config(void)
{
- rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA);
- rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2CODE);
+ rdt_get_cdp_config(RDT_RESOURCE_L2);
}
static void
mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + i, d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
}
/*
struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
/* Write the delay values for mba. */
for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + i, delay_bw_map(d->ctrl_val[i], r));
+ wrmsrl(hw_res->msr_base + i, delay_bw_map(hw_dom->ctrl_val[i], r));
}
static void
cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r)
{
unsigned int i;
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
for (i = m->low; i < m->high; i++)
- wrmsrl(r->msr_base + cbm_idx(r, i), d->ctrl_val[i]);
+ wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]);
}
struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r)
return NULL;
}
+u32 resctrl_arch_get_num_closid(struct rdt_resource *r)
+{
+ return resctrl_to_arch_res(r)->num_closid;
+}
+
void rdt_ctrl_update(void *arg)
{
struct msr_param *m = arg;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(m->res);
struct rdt_resource *r = m->res;
int cpu = smp_processor_id();
struct rdt_domain *d;
d = get_domain_from_cpu(cpu, r);
if (d) {
- r->msr_update(d, m, r);
+ hw_res->msr_update(d, m, r);
return;
}
pr_warn_once("cpu %d not found in any domain for resource %s\n",
void setup_default_ctrlval(struct rdt_resource *r, u32 *dc, u32 *dm)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
int i;
/*
* For Memory Allocation: Set b/w requested to 100%
* and the bandwidth in MBps to U32_MAX
*/
- for (i = 0; i < r->num_closid; i++, dc++, dm++) {
+ for (i = 0; i < hw_res->num_closid; i++, dc++, dm++) {
*dc = r->default_ctrl;
*dm = MBA_MAX_MBPS;
}
static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain *d)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
struct msr_param m;
u32 *dc, *dm;
- dc = kmalloc_array(r->num_closid, sizeof(*d->ctrl_val), GFP_KERNEL);
+ dc = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->ctrl_val),
+ GFP_KERNEL);
if (!dc)
return -ENOMEM;
- dm = kmalloc_array(r->num_closid, sizeof(*d->mbps_val), GFP_KERNEL);
+ dm = kmalloc_array(hw_res->num_closid, sizeof(*hw_dom->mbps_val),
+ GFP_KERNEL);
if (!dm) {
kfree(dc);
return -ENOMEM;
}
- d->ctrl_val = dc;
- d->mbps_val = dm;
+ hw_dom->ctrl_val = dc;
+ hw_dom->mbps_val = dm;
setup_default_ctrlval(r, dc, dm);
m.low = 0;
- m.high = r->num_closid;
- r->msr_update(d, &m, r);
+ m.high = hw_res->num_closid;
+ hw_res->msr_update(d, &m, r);
return 0;
}
{
int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
struct list_head *add_pos = NULL;
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;
d = rdt_find_domain(r, id, &add_pos);
return;
}
- d = kzalloc_node(sizeof(*d), GFP_KERNEL, cpu_to_node(cpu));
- if (!d)
+ hw_dom = kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu));
+ if (!hw_dom)
return;
+ d = &hw_dom->d_resctrl;
d->id = id;
cpumask_set_cpu(cpu, &d->cpu_mask);
static void domain_remove_cpu(int cpu, struct rdt_resource *r)
{
int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;
d = rdt_find_domain(r, id, NULL);
pr_warn("Couldn't find cache id for CPU %d\n", cpu);
return;
}
+ hw_dom = resctrl_to_arch_dom(d);
cpumask_clear_cpu(cpu, &d->cpu_mask);
if (cpumask_empty(&d->cpu_mask)) {
if (d->plr)
d->plr->d = NULL;
- kfree(d->ctrl_val);
- kfree(d->mbps_val);
+ kfree(hw_dom->ctrl_val);
+ kfree(hw_dom->mbps_val);
bitmap_free(d->rmid_busy_llc);
kfree(d->mbm_total);
kfree(d->mbm_local);
- kfree(d);
+ kfree(hw_dom);
return;
}
- if (r == &rdt_resources_all[RDT_RESOURCE_L3]) {
+ if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
cancel_delayed_work(&d->mbm_over);
mbm_setup_overflow_handler(d, 0);
static __init void rdt_init_padding(void)
{
struct rdt_resource *r;
- int cl;
for_each_alloc_capable_rdt_resource(r) {
- cl = strlen(r->name);
- if (cl > max_name_width)
- max_name_width = cl;
-
if (r->data_width > max_data_width)
max_data_width = r->data_width;
}
static __init bool get_mem_config(void)
{
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_MBA];
+
if (!rdt_cpu_has(X86_FEATURE_MBA))
return false;
if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
- return __get_mem_config_intel(&rdt_resources_all[RDT_RESOURCE_MBA]);
+ return __get_mem_config_intel(&hw_res->r_resctrl);
else if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
- return __rdt_get_mem_config_amd(&rdt_resources_all[RDT_RESOURCE_MBA]);
+ return __rdt_get_mem_config_amd(&hw_res->r_resctrl);
return false;
}
static __init bool get_rdt_alloc_resources(void)
{
+ struct rdt_resource *r;
bool ret = false;
if (rdt_alloc_capable)
return false;
if (rdt_cpu_has(X86_FEATURE_CAT_L3)) {
- rdt_get_cache_alloc_cfg(1, &rdt_resources_all[RDT_RESOURCE_L3]);
+ r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+ rdt_get_cache_alloc_cfg(1, r);
if (rdt_cpu_has(X86_FEATURE_CDP_L3))
rdt_get_cdp_l3_config();
ret = true;
}
if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
/* CPUID 0x10.2 fields are same format at 0x10.1 */
- rdt_get_cache_alloc_cfg(2, &rdt_resources_all[RDT_RESOURCE_L2]);
+ r = &rdt_resources_all[RDT_RESOURCE_L2].r_resctrl;
+ rdt_get_cache_alloc_cfg(2, r);
if (rdt_cpu_has(X86_FEATURE_CDP_L2))
rdt_get_cdp_l2_config();
ret = true;
static __init bool get_rdt_mon_resources(void)
{
+ struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+
if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC))
rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID);
if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL))
if (!rdt_mon_features)
return false;
- return !rdt_get_mon_l3_config(&rdt_resources_all[RDT_RESOURCE_L3]);
+ return !rdt_get_mon_l3_config(r);
}
static __init void __check_quirks_intel(void)
static __init void rdt_init_res_defs_intel(void)
{
+ struct rdt_hw_resource *hw_res;
struct rdt_resource *r;
for_each_rdt_resource(r) {
+ hw_res = resctrl_to_arch_res(r);
+
if (r->rid == RDT_RESOURCE_L3 ||
- r->rid == RDT_RESOURCE_L3DATA ||
- r->rid == RDT_RESOURCE_L3CODE ||
- r->rid == RDT_RESOURCE_L2 ||
- r->rid == RDT_RESOURCE_L2DATA ||
- r->rid == RDT_RESOURCE_L2CODE) {
+ r->rid == RDT_RESOURCE_L2) {
r->cache.arch_has_sparse_bitmaps = false;
r->cache.arch_has_empty_bitmaps = false;
r->cache.arch_has_per_cpu_cfg = false;
} else if (r->rid == RDT_RESOURCE_MBA) {
- r->msr_base = MSR_IA32_MBA_THRTL_BASE;
- r->msr_update = mba_wrmsr_intel;
+ hw_res->msr_base = MSR_IA32_MBA_THRTL_BASE;
+ hw_res->msr_update = mba_wrmsr_intel;
}
}
}
static __init void rdt_init_res_defs_amd(void)
{
+ struct rdt_hw_resource *hw_res;
struct rdt_resource *r;
for_each_rdt_resource(r) {
+ hw_res = resctrl_to_arch_res(r);
+
if (r->rid == RDT_RESOURCE_L3 ||
- r->rid == RDT_RESOURCE_L3DATA ||
- r->rid == RDT_RESOURCE_L3CODE ||
- r->rid == RDT_RESOURCE_L2 ||
- r->rid == RDT_RESOURCE_L2DATA ||
- r->rid == RDT_RESOURCE_L2CODE) {
+ r->rid == RDT_RESOURCE_L2) {
r->cache.arch_has_sparse_bitmaps = true;
r->cache.arch_has_empty_bitmaps = true;
r->cache.arch_has_per_cpu_cfg = true;
} else if (r->rid == RDT_RESOURCE_MBA) {
- r->msr_base = MSR_IA32_MBA_BW_BASE;
- r->msr_update = mba_wrmsr_amd;
+ hw_res->msr_base = MSR_IA32_MBA_BW_BASE;
+ hw_res->msr_update = mba_wrmsr_amd;
}
}
}
return true;
}
-int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d)
{
+ struct resctrl_staged_config *cfg;
+ struct rdt_resource *r = s->res;
unsigned long bw_val;
- if (d->have_new_ctrl) {
+ cfg = &d->staged_config[s->conf_type];
+ if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
return -EINVAL;
}
if (!bw_validate(data->buf, &bw_val, r))
return -EINVAL;
- d->new_ctrl = bw_val;
- d->have_new_ctrl = true;
+ cfg->new_ctrl = bw_val;
+ cfg->have_new_ctrl = true;
return 0;
}
* Read one cache bit mask (hex). Check that it is valid for the current
* resource type.
*/
-int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d)
{
struct rdtgroup *rdtgrp = data->rdtgrp;
+ struct resctrl_staged_config *cfg;
+ struct rdt_resource *r = s->res;
u32 cbm_val;
- if (d->have_new_ctrl) {
+ cfg = &d->staged_config[s->conf_type];
+ if (cfg->have_new_ctrl) {
rdt_last_cmd_printf("Duplicate domain %d\n", d->id);
return -EINVAL;
}
* The CBM may not overlap with the CBM of another closid if
* either is exclusive.
*/
- if (rdtgroup_cbm_overlaps(r, d, cbm_val, rdtgrp->closid, true)) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, true)) {
rdt_last_cmd_puts("Overlaps with exclusive group\n");
return -EINVAL;
}
- if (rdtgroup_cbm_overlaps(r, d, cbm_val, rdtgrp->closid, false)) {
+ if (rdtgroup_cbm_overlaps(s, d, cbm_val, rdtgrp->closid, false)) {
if (rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
rdt_last_cmd_puts("Overlaps with other group\n");
}
}
- d->new_ctrl = cbm_val;
- d->have_new_ctrl = true;
+ cfg->new_ctrl = cbm_val;
+ cfg->have_new_ctrl = true;
return 0;
}
* separated by ";". The "id" is in decimal, and must match one of
* the "id"s for this resource.
*/
-static int parse_line(char *line, struct rdt_resource *r,
+static int parse_line(char *line, struct resctrl_schema *s,
struct rdtgroup *rdtgrp)
{
+ enum resctrl_conf_type t = s->conf_type;
+ struct resctrl_staged_config *cfg;
+ struct rdt_resource *r = s->res;
struct rdt_parse_data data;
char *dom = NULL, *id;
struct rdt_domain *d;
if (d->id == dom_id) {
data.buf = dom;
data.rdtgrp = rdtgrp;
- if (r->parse_ctrlval(&data, r, d))
+ if (r->parse_ctrlval(&data, s, d))
return -EINVAL;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
+ cfg = &d->staged_config[t];
/*
* In pseudo-locking setup mode and just
* parsed a valid CBM that should be
* the required initialization for single
* region and return.
*/
- rdtgrp->plr->r = r;
+ rdtgrp->plr->s = s;
rdtgrp->plr->d = d;
- rdtgrp->plr->cbm = d->new_ctrl;
+ rdtgrp->plr->cbm = cfg->new_ctrl;
d->plr = rdtgrp->plr;
return 0;
}
return -EINVAL;
}
-int update_domains(struct rdt_resource *r, int closid)
+static u32 get_config_index(u32 closid, enum resctrl_conf_type type)
{
+ switch (type) {
+ default:
+ case CDP_NONE:
+ return closid;
+ case CDP_CODE:
+ return closid * 2 + 1;
+ case CDP_DATA:
+ return closid * 2;
+ }
+}
+
+static bool apply_config(struct rdt_hw_domain *hw_dom,
+ struct resctrl_staged_config *cfg, u32 idx,
+ cpumask_var_t cpu_mask, bool mba_sc)
+{
+ struct rdt_domain *dom = &hw_dom->d_resctrl;
+ u32 *dc = !mba_sc ? hw_dom->ctrl_val : hw_dom->mbps_val;
+
+ if (cfg->new_ctrl != dc[idx]) {
+ cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask);
+ dc[idx] = cfg->new_ctrl;
+
+ return true;
+ }
+
+ return false;
+}
+
+int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
+{
+ struct resctrl_staged_config *cfg;
+ struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
+ enum resctrl_conf_type t;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
bool mba_sc;
- u32 *dc;
int cpu;
+ u32 idx;
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
- msr_param.low = closid;
- msr_param.high = msr_param.low + 1;
- msr_param.res = r;
-
mba_sc = is_mba_sc(r);
+ msr_param.res = NULL;
list_for_each_entry(d, &r->domains, list) {
- dc = !mba_sc ? d->ctrl_val : d->mbps_val;
- if (d->have_new_ctrl && d->new_ctrl != dc[closid]) {
- cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
- dc[closid] = d->new_ctrl;
+ hw_dom = resctrl_to_arch_dom(d);
+ for (t = 0; t < CDP_NUM_TYPES; t++) {
+ cfg = &hw_dom->d_resctrl.staged_config[t];
+ if (!cfg->have_new_ctrl)
+ continue;
+
+ idx = get_config_index(closid, t);
+ if (!apply_config(hw_dom, cfg, idx, cpu_mask, mba_sc))
+ continue;
+
+ if (!msr_param.res) {
+ msr_param.low = idx;
+ msr_param.high = msr_param.low + 1;
+ msr_param.res = r;
+ } else {
+ msr_param.low = min(msr_param.low, idx);
+ msr_param.high = max(msr_param.high, idx + 1);
+ }
}
}
static int rdtgroup_parse_resource(char *resname, char *tok,
struct rdtgroup *rdtgrp)
{
- struct rdt_resource *r;
+ struct resctrl_schema *s;
- for_each_alloc_enabled_rdt_resource(r) {
- if (!strcmp(resname, r->name) && rdtgrp->closid < r->num_closid)
- return parse_line(tok, r, rdtgrp);
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ if (!strcmp(resname, s->name) && rdtgrp->closid < s->num_closid)
+ return parse_line(tok, s, rdtgrp);
}
rdt_last_cmd_printf("Unknown or unsupported resource name '%s'\n", resname);
return -EINVAL;
ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
+ struct resctrl_schema *s;
struct rdtgroup *rdtgrp;
struct rdt_domain *dom;
struct rdt_resource *r;
goto out;
}
- for_each_alloc_enabled_rdt_resource(r) {
- list_for_each_entry(dom, &r->domains, list)
- dom->have_new_ctrl = false;
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ list_for_each_entry(dom, &s->res->domains, list)
+ memset(dom->staged_config, 0, sizeof(dom->staged_config));
}
while ((tok = strsep(&buf, "\n")) != NULL) {
goto out;
}
- for_each_alloc_enabled_rdt_resource(r) {
- ret = update_domains(r, rdtgrp->closid);
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
+ ret = resctrl_arch_update_domains(r, rdtgrp->closid);
if (ret)
goto out;
}
return ret ?: nbytes;
}
-static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
+u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
+ u32 closid, enum resctrl_conf_type type)
+{
+ struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+ u32 idx = get_config_index(closid, type);
+
+ if (!is_mba_sc(r))
+ return hw_dom->ctrl_val[idx];
+ return hw_dom->mbps_val[idx];
+}
+
+static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
{
+ struct rdt_resource *r = schema->res;
struct rdt_domain *dom;
bool sep = false;
u32 ctrl_val;
- seq_printf(s, "%*s:", max_name_width, r->name);
+ seq_printf(s, "%*s:", max_name_width, schema->name);
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_puts(s, ";");
- ctrl_val = (!is_mba_sc(r) ? dom->ctrl_val[closid] :
- dom->mbps_val[closid]);
+ ctrl_val = resctrl_arch_get_config(r, dom, closid,
+ schema->conf_type);
seq_printf(s, r->format_str, dom->id, max_data_width,
ctrl_val);
sep = true;
int rdtgroup_schemata_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
+ struct resctrl_schema *schema;
struct rdtgroup *rdtgrp;
- struct rdt_resource *r;
int ret = 0;
u32 closid;
rdtgrp = rdtgroup_kn_lock_live(of->kn);
if (rdtgrp) {
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
- for_each_alloc_enabled_rdt_resource(r)
- seq_printf(s, "%s:uninitialized\n", r->name);
+ list_for_each_entry(schema, &resctrl_schema_all, list) {
+ seq_printf(s, "%s:uninitialized\n", schema->name);
+ }
} else if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
if (!rdtgrp->plr->d) {
rdt_last_cmd_clear();
ret = -ENODEV;
} else {
seq_printf(s, "%s:%d=%x\n",
- rdtgrp->plr->r->name,
+ rdtgrp->plr->s->res->name,
rdtgrp->plr->d->id,
rdtgrp->plr->cbm);
}
} else {
closid = rdtgrp->closid;
- for_each_alloc_enabled_rdt_resource(r) {
- if (closid < r->num_closid)
- show_doms(s, r, closid);
+ list_for_each_entry(schema, &resctrl_schema_all, list) {
+ if (closid < schema->num_closid)
+ show_doms(s, schema, closid);
}
}
} else {
int rdtgroup_mondata_show(struct seq_file *m, void *arg)
{
struct kernfs_open_file *of = m->private;
+ struct rdt_hw_resource *hw_res;
u32 resid, evtid, domid;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
domid = md.u.domid;
evtid = md.u.evtid;
- r = &rdt_resources_all[resid];
+ hw_res = &rdt_resources_all[resid];
+ r = &hw_res->r_resctrl;
d = rdt_find_domain(r, domid, NULL);
if (IS_ERR_OR_NULL(d)) {
ret = -ENOENT;
else if (rr.val & RMID_VAL_UNAVAIL)
seq_puts(m, "Unavailable\n");
else
- seq_printf(m, "%llu\n", rr.val * r->mon_scale);
+ seq_printf(m, "%llu\n", rr.val * hw_res->mon_scale);
out:
rdtgroup_kn_unlock(of->kn);
#ifndef _ASM_X86_RESCTRL_INTERNAL_H
#define _ASM_X86_RESCTRL_INTERNAL_H
+#include <linux/resctrl.h>
#include <linux/sched.h>
#include <linux/kernfs.h>
#include <linux/fs_context.h>
extern bool rdt_alloc_capable;
extern bool rdt_mon_capable;
extern unsigned int rdt_mon_features;
+extern struct list_head resctrl_schema_all;
enum rdt_group_type {
RDTCTRL_GROUP = 0,
/**
* struct pseudo_lock_region - pseudo-lock region information
- * @r: RDT resource to which this pseudo-locked region
- * belongs
+ * @s: Resctrl schema for the resource to which this
+ * pseudo-locked region belongs
* @d: RDT domain to which this pseudo-locked region
* belongs
* @cbm: bitmask of the pseudo-locked region
* @pm_reqs: Power management QoS requests related to this region
*/
struct pseudo_lock_region {
- struct rdt_resource *r;
+ struct resctrl_schema *s;
struct rdt_domain *d;
u32 cbm;
wait_queue_head_t lock_thread_wq;
};
/**
- * struct rdt_domain - group of cpus sharing an RDT resource
- * @list: all instances of this resource
- * @id: unique id for this instance
- * @cpu_mask: which cpus share this resource
- * @rmid_busy_llc:
- * bitmap of which limbo RMIDs are above threshold
- * @mbm_total: saved state for MBM total bandwidth
- * @mbm_local: saved state for MBM local bandwidth
- * @mbm_over: worker to periodically read MBM h/w counters
- * @cqm_limbo: worker to periodically read CQM h/w counters
- * @mbm_work_cpu:
- * worker cpu for MBM h/w counters
- * @cqm_work_cpu:
- * worker cpu for CQM h/w counters
+ * struct rdt_hw_domain - Arch private attributes of a set of CPUs that share
+ * a resource
+ * @d_resctrl: Properties exposed to the resctrl file system
* @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID)
* @mbps_val: When mba_sc is enabled, this holds the bandwidth in MBps
- * @new_ctrl: new ctrl value to be loaded
- * @have_new_ctrl: did user provide new_ctrl for this domain
- * @plr: pseudo-locked region (if any) associated with domain
+ *
+ * Members of this structure are accessed via helpers that provide abstraction.
*/
-struct rdt_domain {
- struct list_head list;
- int id;
- struct cpumask cpu_mask;
- unsigned long *rmid_busy_llc;
- struct mbm_state *mbm_total;
- struct mbm_state *mbm_local;
- struct delayed_work mbm_over;
- struct delayed_work cqm_limbo;
- int mbm_work_cpu;
- int cqm_work_cpu;
+struct rdt_hw_domain {
+ struct rdt_domain d_resctrl;
u32 *ctrl_val;
u32 *mbps_val;
- u32 new_ctrl;
- bool have_new_ctrl;
- struct pseudo_lock_region *plr;
};
+static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain *r)
+{
+ return container_of(r, struct rdt_hw_domain, d_resctrl);
+}
+
/**
* struct msr_param - set a range of MSRs from a domain
* @res: The resource to use
*/
struct msr_param {
struct rdt_resource *res;
- int low;
- int high;
-};
-
-/**
- * struct rdt_cache - Cache allocation related data
- * @cbm_len: Length of the cache bit mask
- * @min_cbm_bits: Minimum number of consecutive bits to be set
- * @cbm_idx_mult: Multiplier of CBM index
- * @cbm_idx_offset: Offset of CBM index. CBM index is computed by:
- * closid * cbm_idx_multi + cbm_idx_offset
- * in a cache bit mask
- * @shareable_bits: Bitmask of shareable resource with other
- * executing entities
- * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid.
- * @arch_has_empty_bitmaps: True if the '0' bitmap is valid.
- * @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
- * level has CPU scope.
- */
-struct rdt_cache {
- unsigned int cbm_len;
- unsigned int min_cbm_bits;
- unsigned int cbm_idx_mult;
- unsigned int cbm_idx_offset;
- unsigned int shareable_bits;
- bool arch_has_sparse_bitmaps;
- bool arch_has_empty_bitmaps;
- bool arch_has_per_cpu_cfg;
-};
-
-/**
- * enum membw_throttle_mode - System's memory bandwidth throttling mode
- * @THREAD_THROTTLE_UNDEFINED: Not relevant to the system
- * @THREAD_THROTTLE_MAX: Memory bandwidth is throttled at the core
- * always using smallest bandwidth percentage
- * assigned to threads, aka "max throttling"
- * @THREAD_THROTTLE_PER_THREAD: Memory bandwidth is throttled at the thread
- */
-enum membw_throttle_mode {
- THREAD_THROTTLE_UNDEFINED = 0,
- THREAD_THROTTLE_MAX,
- THREAD_THROTTLE_PER_THREAD,
-};
-
-/**
- * struct rdt_membw - Memory bandwidth allocation related data
- * @min_bw: Minimum memory bandwidth percentage user can request
- * @bw_gran: Granularity at which the memory bandwidth is allocated
- * @delay_linear: True if memory B/W delay is in linear scale
- * @arch_needs_linear: True if we can't configure non-linear resources
- * @throttle_mode: Bandwidth throttling mode when threads request
- * different memory bandwidths
- * @mba_sc: True if MBA software controller(mba_sc) is enabled
- * @mb_map: Mapping of memory B/W percentage to memory B/W delay
- */
-struct rdt_membw {
- u32 min_bw;
- u32 bw_gran;
- u32 delay_linear;
- bool arch_needs_linear;
- enum membw_throttle_mode throttle_mode;
- bool mba_sc;
- u32 *mb_map;
+ u32 low;
+ u32 high;
};
static inline bool is_llc_occupancy_enabled(void)
};
/**
- * struct rdt_resource - attributes of an RDT resource
- * @rid: The index of the resource
- * @alloc_enabled: Is allocation enabled on this machine
- * @mon_enabled: Is monitoring enabled for this feature
- * @alloc_capable: Is allocation available on this machine
- * @mon_capable: Is monitor feature available on this machine
- * @name: Name to use in "schemata" file
- * @num_closid: Number of CLOSIDs available
- * @cache_level: Which cache level defines scope of this resource
- * @default_ctrl: Specifies default cache cbm or memory B/W percent.
+ * struct rdt_hw_resource - arch private attributes of a resctrl resource
+ * @r_resctrl: Attributes of the resource used directly by resctrl.
+ * @num_closid: Maximum number of closid this hardware can support,
+ * regardless of CDP. This is exposed via
+ * resctrl_arch_get_num_closid() to avoid confusion
+ * with struct resctrl_schema's property of the same name,
+ * which has been corrected for features like CDP.
* @msr_base: Base MSR address for CBMs
* @msr_update: Function pointer to update QOS MSRs
- * @data_width: Character width of data when displaying
- * @domains: All domains for this resource
- * @cache: Cache allocation related data
- * @membw: If the component has bandwidth controls, their properties.
- * @format_str: Per resource format string to show domain value
- * @parse_ctrlval: Per resource function pointer to parse control values
- * @evt_list: List of monitoring events
- * @num_rmid: Number of RMIDs available
* @mon_scale: cqm counter * mon_scale = occupancy in bytes
* @mbm_width: Monitor width, to detect and correct for overflow.
- * @fflags: flags to choose base and info files
+ * @cdp_enabled: CDP state of this resource
+ *
+ * Members of this structure are either private to the architecture
+ * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
+ * msr_update and msr_base.
*/
-struct rdt_resource {
- int rid;
- bool alloc_enabled;
- bool mon_enabled;
- bool alloc_capable;
- bool mon_capable;
- char *name;
- int num_closid;
- int cache_level;
- u32 default_ctrl;
+struct rdt_hw_resource {
+ struct rdt_resource r_resctrl;
+ u32 num_closid;
unsigned int msr_base;
void (*msr_update) (struct rdt_domain *d, struct msr_param *m,
struct rdt_resource *r);
- int data_width;
- struct list_head domains;
- struct rdt_cache cache;
- struct rdt_membw membw;
- const char *format_str;
- int (*parse_ctrlval)(struct rdt_parse_data *data,
- struct rdt_resource *r,
- struct rdt_domain *d);
- struct list_head evt_list;
- int num_rmid;
unsigned int mon_scale;
unsigned int mbm_width;
- unsigned long fflags;
+ bool cdp_enabled;
};
-int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
+static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
+{
+ return container_of(r, struct rdt_hw_resource, r_resctrl);
+}
+
+int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d);
-int parse_bw(struct rdt_parse_data *data, struct rdt_resource *r,
+int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
struct rdt_domain *d);
extern struct mutex rdtgroup_mutex;
-extern struct rdt_resource rdt_resources_all[];
+extern struct rdt_hw_resource rdt_resources_all[];
extern struct rdtgroup rdtgroup_default;
DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
extern struct dentry *debugfs_resctrl;
-enum {
+enum resctrl_res_level {
RDT_RESOURCE_L3,
- RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE,
RDT_RESOURCE_L2,
- RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE,
RDT_RESOURCE_MBA,
/* Must be the last */
RDT_NUM_RESOURCES,
};
+static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
+{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
+
+ hw_res++;
+ return &hw_res->r_resctrl;
+}
+
+static inline bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level l)
+{
+ return rdt_resources_all[l].cdp_enabled;
+}
+
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
+
+/*
+ * To return the common struct rdt_resource, which is contained in struct
+ * rdt_hw_resource, walk the resctrl member of struct rdt_hw_resource.
+ */
#define for_each_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++)
+ for (r = &rdt_resources_all[0].r_resctrl; \
+ r <= &rdt_resources_all[RDT_NUM_RESOURCES - 1].r_resctrl; \
+ r = resctrl_inc(r))
#define for_each_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->alloc_capable || r->mon_capable)
#define for_each_alloc_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->alloc_capable)
#define for_each_mon_capable_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->mon_capable)
#define for_each_alloc_enabled_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->alloc_enabled)
#define for_each_mon_enabled_rdt_resource(r) \
- for (r = rdt_resources_all; r < rdt_resources_all + RDT_NUM_RESOURCES;\
- r++) \
+ for_each_rdt_resource(r) \
if (r->mon_enabled)
/* CPUID.(EAX=10H, ECX=ResID=1).EAX */
char *buf, size_t nbytes, loff_t off);
int rdtgroup_schemata_show(struct kernfs_open_file *of,
struct seq_file *s, void *v);
-bool rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
+bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
unsigned long cbm, int closid, bool exclusive);
unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_domain *d,
unsigned long cbm);
int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
-int update_domains(struct rdt_resource *r, int closid);
int closids_supported(void);
void closid_free(int closid);
int alloc_rmid(void);
struct rdt_resource *r;
u32 crmid = 1, nrmid;
- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
/*
* Skip RMID 0 and start from RMID 1 and check all the RMIDs that
int cpu;
u64 val;
- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
entry->busy = 0;
cpu = get_cpu();
static u64 __mon_event_count(u32 rmid, struct rmid_read *rr)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(rr->r);
struct mbm_state *m;
u64 chunks, tval;
case QOS_L3_MBM_LOCAL_EVENT_ID:
m = &rr->d->mbm_local[rmid];
break;
+ default:
+ /*
+ * Code would never reach here because an invalid
+ * event id would fail the __rmid_read.
+ */
+ return RMID_VAL_ERROR;
}
if (rr->first) {
return 0;
}
- chunks = mbm_overflow_count(m->prev_msr, tval, rr->r->mbm_width);
+ chunks = mbm_overflow_count(m->prev_msr, tval, hw_res->mbm_width);
m->chunks += chunks;
m->prev_msr = tval;
*/
static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(rr->r);
struct mbm_state *m = &rr->d->mbm_local[rmid];
u64 tval, cur_bw, chunks;
if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL))
return;
- chunks = mbm_overflow_count(m->prev_bw_msr, tval, rr->r->mbm_width);
- cur_bw = (get_corrected_mbm_count(rmid, chunks) * r->mon_scale) >> 20;
+ chunks = mbm_overflow_count(m->prev_bw_msr, tval, hw_res->mbm_width);
+ cur_bw = (get_corrected_mbm_count(rmid, chunks) * hw_res->mon_scale) >> 20;
if (m->delta_comp)
m->delta_bw = abs(cur_bw - m->prev_bw);
{
u32 closid, rmid, cur_msr, cur_msr_val, new_msr_val;
struct mbm_state *pmbm_data, *cmbm_data;
+ struct rdt_hw_resource *hw_r_mba;
+ struct rdt_hw_domain *hw_dom_mba;
u32 cur_bw, delta_bw, user_bw;
struct rdt_resource *r_mba;
struct rdt_domain *dom_mba;
if (!is_mbm_local_enabled())
return;
- r_mba = &rdt_resources_all[RDT_RESOURCE_MBA];
+ hw_r_mba = &rdt_resources_all[RDT_RESOURCE_MBA];
+ r_mba = &hw_r_mba->r_resctrl;
closid = rgrp->closid;
rmid = rgrp->mon.rmid;
pmbm_data = &dom_mbm->mbm_local[rmid];
pr_warn_once("Failure to get domain for MBA update\n");
return;
}
+ hw_dom_mba = resctrl_to_arch_dom(dom_mba);
cur_bw = pmbm_data->prev_bw;
- user_bw = dom_mba->mbps_val[closid];
+ user_bw = resctrl_arch_get_config(r_mba, dom_mba, closid, CDP_NONE);
delta_bw = pmbm_data->delta_bw;
- cur_msr_val = dom_mba->ctrl_val[closid];
+ /*
+ * resctrl_arch_get_config() chooses the mbps/ctrl value to return
+ * based on is_mba_sc(). For now, reach into the hw_dom.
+ */
+ cur_msr_val = hw_dom_mba->ctrl_val[closid];
/*
* For Ctrl groups read data from child monitor groups.
return;
}
- cur_msr = r_mba->msr_base + closid;
+ cur_msr = hw_r_mba->msr_base + closid;
wrmsrl(cur_msr, delay_bw_map(new_msr_val, r_mba));
- dom_mba->ctrl_val[closid] = new_msr_val;
+ hw_dom_mba->ctrl_val[closid] = new_msr_val;
/*
* Delta values are updated dynamically package wise for each
mutex_lock(&rdtgroup_mutex);
- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
d = container_of(work, struct rdt_domain, cqm_limbo.work);
__check_limbo(d, false);
if (!static_branch_likely(&rdt_mon_enable_key))
goto out_unlock;
- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
d = container_of(work, struct rdt_domain, mbm_over.work);
list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
int rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
unsigned int cl_size = boot_cpu_data.x86_cache_size;
int ret;
- r->mon_scale = boot_cpu_data.x86_cache_occ_scale;
+ hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
- r->mbm_width = MBM_CNTR_WIDTH_BASE;
+ hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
- r->mbm_width += mbm_offset;
+ hw_res->mbm_width += mbm_offset;
else if (mbm_offset > MBM_CNTR_WIDTH_OFFSET_MAX)
pr_warn("Ignoring impossible MBM counter offset\n");
resctrl_cqm_threshold = cl_size * 1024 / r->num_rmid;
/* h/w works in units of "boot_cpu_data.x86_cache_occ_scale" */
- resctrl_cqm_threshold /= r->mon_scale;
+ resctrl_cqm_threshold /= hw_res->mon_scale;
ret = dom_data_init(r);
if (ret)
plr->line_size = 0;
kfree(plr->kmem);
plr->kmem = NULL;
- plr->r = NULL;
+ plr->s = NULL;
if (plr->d)
plr->d->plr = NULL;
plr->d = NULL;
ci = get_cpu_cacheinfo(plr->cpu);
- plr->size = rdtgroup_cbm_to_size(plr->r, plr->d, plr->cbm);
+ plr->size = rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm);
for (i = 0; i < ci->num_leaves; i++) {
- if (ci->info_list[i].level == plr->r->cache_level) {
+ if (ci->info_list[i].level == plr->s->res->cache_level) {
plr->line_size = ci->info_list[i].coherency_line_size;
return 0;
}
* resource, the portion of cache used by it should be made
* unavailable to all future allocations from both resources.
*/
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled ||
- rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled) {
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3) ||
+ resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2)) {
rdt_last_cmd_puts("CDP enabled\n");
return -EINVAL;
}
unsigned long cbm_b;
if (d->plr) {
- cbm_len = d->plr->r->cache.cbm_len;
+ cbm_len = d->plr->s->res->cache.cbm_len;
cbm_b = d->plr->cbm;
if (bitmap_intersects(&cbm, &cbm_b, cbm_len))
return true;
struct rdtgroup rdtgroup_default;
LIST_HEAD(rdt_all_groups);
+/* list of entries for the schemata file */
+LIST_HEAD(resctrl_schema_all);
+
/* Kernel fs node for "info" directory under root */
static struct kernfs_node *kn_info;
static void closid_init(void)
{
- struct rdt_resource *r;
- int rdt_min_closid = 32;
+ struct resctrl_schema *s;
+ u32 rdt_min_closid = 32;
/* Compute rdt_min_closid across all resources */
- for_each_alloc_enabled_rdt_resource(r)
- rdt_min_closid = min(rdt_min_closid, r->num_closid);
+ list_for_each_entry(s, &resctrl_schema_all, list)
+ rdt_min_closid = min(rdt_min_closid, s->num_closid);
closid_free_map = BIT_MASK(rdt_min_closid) - 1;
static int rdt_num_closids_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
- seq_printf(seq, "%d\n", r->num_closid);
+ seq_printf(seq, "%u\n", s->num_closid);
return 0;
}
static int rdt_default_ctrl_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
seq_printf(seq, "%x\n", r->default_ctrl);
return 0;
static int rdt_min_cbm_bits_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
seq_printf(seq, "%u\n", r->cache.min_cbm_bits);
return 0;
static int rdt_shareable_bits_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
seq_printf(seq, "%x\n", r->cache.shareable_bits);
return 0;
static int rdt_bit_usage_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
/*
* Use unsigned long even though only 32 bits are used to ensure
* test_bit() is used safely.
*/
unsigned long sw_shareable = 0, hw_shareable = 0;
unsigned long exclusive = 0, pseudo_locked = 0;
+ struct rdt_resource *r = s->res;
struct rdt_domain *dom;
int i, hwb, swb, excl, psl;
enum rdtgrp_mode mode;
bool sep = false;
- u32 *ctrl;
+ u32 ctrl_val;
mutex_lock(&rdtgroup_mutex);
hw_shareable = r->cache.shareable_bits;
list_for_each_entry(dom, &r->domains, list) {
if (sep)
seq_putc(seq, ';');
- ctrl = dom->ctrl_val;
sw_shareable = 0;
exclusive = 0;
seq_printf(seq, "%d=", dom->id);
- for (i = 0; i < closids_supported(); i++, ctrl++) {
+ for (i = 0; i < closids_supported(); i++) {
if (!closid_allocated(i))
continue;
+ ctrl_val = resctrl_arch_get_config(r, dom, i,
+ s->conf_type);
mode = rdtgroup_mode_by_closid(i);
switch (mode) {
case RDT_MODE_SHAREABLE:
- sw_shareable |= *ctrl;
+ sw_shareable |= ctrl_val;
break;
case RDT_MODE_EXCLUSIVE:
- exclusive |= *ctrl;
+ exclusive |= ctrl_val;
break;
case RDT_MODE_PSEUDO_LOCKSETUP:
/*
static int rdt_min_bw_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
seq_printf(seq, "%u\n", r->membw.min_bw);
return 0;
static int rdt_bw_gran_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
seq_printf(seq, "%u\n", r->membw.bw_gran);
return 0;
static int rdt_delay_linear_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
seq_printf(seq, "%u\n", r->membw.delay_linear);
return 0;
struct seq_file *seq, void *v)
{
struct rdt_resource *r = of->kn->parent->priv;
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
- seq_printf(seq, "%u\n", resctrl_cqm_threshold * r->mon_scale);
+ seq_printf(seq, "%u\n", resctrl_cqm_threshold * hw_res->mon_scale);
return 0;
}
static int rdt_thread_throttle_mode_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct resctrl_schema *s = of->kn->parent->priv;
+ struct rdt_resource *r = s->res;
if (r->membw.throttle_mode == THREAD_THROTTLE_PER_THREAD)
seq_puts(seq, "per-thread\n");
static ssize_t max_threshold_occ_write(struct kernfs_open_file *of,
char *buf, size_t nbytes, loff_t off)
{
- struct rdt_resource *r = of->kn->parent->priv;
+ struct rdt_hw_resource *hw_res;
unsigned int bytes;
int ret;
if (bytes > (boot_cpu_data.x86_cache_size * 1024))
return -EINVAL;
- resctrl_cqm_threshold = bytes / r->mon_scale;
+ hw_res = resctrl_to_arch_res(of->kn->parent->priv);
+ resctrl_cqm_threshold = bytes / hw_res->mon_scale;
return nbytes;
}
return 0;
}
-/**
- * rdt_cdp_peer_get - Retrieve CDP peer if it exists
- * @r: RDT resource to which RDT domain @d belongs
- * @d: Cache instance for which a CDP peer is requested
- * @r_cdp: RDT resource that shares hardware with @r (RDT resource peer)
- * Used to return the result.
- * @d_cdp: RDT domain that shares hardware with @d (RDT domain peer)
- * Used to return the result.
- *
- * RDT resources are managed independently and by extension the RDT domains
- * (RDT resource instances) are managed independently also. The Code and
- * Data Prioritization (CDP) RDT resources, while managed independently,
- * could refer to the same underlying hardware. For example,
- * RDT_RESOURCE_L2CODE and RDT_RESOURCE_L2DATA both refer to the L2 cache.
- *
- * When provided with an RDT resource @r and an instance of that RDT
- * resource @d rdt_cdp_peer_get() will return if there is a peer RDT
- * resource and the exact instance that shares the same hardware.
- *
- * Return: 0 if a CDP peer was found, <0 on error or if no CDP peer exists.
- * If a CDP peer was found, @r_cdp will point to the peer RDT resource
- * and @d_cdp will point to the peer RDT domain.
- */
-static int rdt_cdp_peer_get(struct rdt_resource *r, struct rdt_domain *d,
- struct rdt_resource **r_cdp,
- struct rdt_domain **d_cdp)
+static enum resctrl_conf_type resctrl_peer_type(enum resctrl_conf_type my_type)
{
- struct rdt_resource *_r_cdp = NULL;
- struct rdt_domain *_d_cdp = NULL;
- int ret = 0;
-
- switch (r->rid) {
- case RDT_RESOURCE_L3DATA:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3CODE];
- break;
- case RDT_RESOURCE_L3CODE:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L3DATA];
- break;
- case RDT_RESOURCE_L2DATA:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2CODE];
- break;
- case RDT_RESOURCE_L2CODE:
- _r_cdp = &rdt_resources_all[RDT_RESOURCE_L2DATA];
- break;
+ switch (my_type) {
+ case CDP_CODE:
+ return CDP_DATA;
+ case CDP_DATA:
+ return CDP_CODE;
default:
- ret = -ENOENT;
- goto out;
- }
-
- /*
- * When a new CPU comes online and CDP is enabled then the new
- * RDT domains (if any) associated with both CDP RDT resources
- * are added in the same CPU online routine while the
- * rdtgroup_mutex is held. It should thus not happen for one
- * RDT domain to exist and be associated with its RDT CDP
- * resource but there is no RDT domain associated with the
- * peer RDT CDP resource. Hence the WARN.
- */
- _d_cdp = rdt_find_domain(_r_cdp, d->id, NULL);
- if (WARN_ON(IS_ERR_OR_NULL(_d_cdp))) {
- _r_cdp = NULL;
- _d_cdp = NULL;
- ret = -EINVAL;
+ case CDP_NONE:
+ return CDP_NONE;
}
-
-out:
- *r_cdp = _r_cdp;
- *d_cdp = _d_cdp;
-
- return ret;
}
/**
* Return: false if CBM does not overlap, true if it does.
*/
static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
- unsigned long cbm, int closid, bool exclusive)
+ unsigned long cbm, int closid,
+ enum resctrl_conf_type type, bool exclusive)
{
enum rdtgrp_mode mode;
unsigned long ctrl_b;
- u32 *ctrl;
int i;
/* Check for any overlap with regions used by hardware directly */
}
/* Check for overlap with other resource groups */
- ctrl = d->ctrl_val;
- for (i = 0; i < closids_supported(); i++, ctrl++) {
- ctrl_b = *ctrl;
+ for (i = 0; i < closids_supported(); i++) {
+ ctrl_b = resctrl_arch_get_config(r, d, i, type);
mode = rdtgroup_mode_by_closid(i);
if (closid_allocated(i) && i != closid &&
mode != RDT_MODE_PSEUDO_LOCKSETUP) {
/**
* rdtgroup_cbm_overlaps - Does CBM overlap with other use of hardware
- * @r: Resource to which domain instance @d belongs.
+ * @s: Schema for the resource to which domain instance @d belongs.
* @d: The domain instance for which @closid is being tested.
* @cbm: Capacity bitmask being tested.
* @closid: Intended closid for @cbm.
*
* Return: true if CBM overlap detected, false if there is no overlap
*/
-bool rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_domain *d,
+bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d,
unsigned long cbm, int closid, bool exclusive)
{
- struct rdt_resource *r_cdp;
- struct rdt_domain *d_cdp;
+ enum resctrl_conf_type peer_type = resctrl_peer_type(s->conf_type);
+ struct rdt_resource *r = s->res;
- if (__rdtgroup_cbm_overlaps(r, d, cbm, closid, exclusive))
+ if (__rdtgroup_cbm_overlaps(r, d, cbm, closid, s->conf_type,
+ exclusive))
return true;
- if (rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp) < 0)
+ if (!resctrl_arch_get_cdp_enabled(r->rid))
return false;
-
- return __rdtgroup_cbm_overlaps(r_cdp, d_cdp, cbm, closid, exclusive);
+ return __rdtgroup_cbm_overlaps(r, d, cbm, closid, peer_type, exclusive);
}
/**
static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
{
int closid = rdtgrp->closid;
+ struct resctrl_schema *s;
struct rdt_resource *r;
bool has_cache = false;
struct rdt_domain *d;
+ u32 ctrl;
- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
if (r->rid == RDT_RESOURCE_MBA)
continue;
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
- if (rdtgroup_cbm_overlaps(r, d, d->ctrl_val[closid],
- rdtgrp->closid, false)) {
+ ctrl = resctrl_arch_get_config(r, d, closid,
+ s->conf_type);
+ if (rdtgroup_cbm_overlaps(s, d, ctrl, closid, false)) {
rdt_last_cmd_puts("Schemata overlaps\n");
return false;
}
static int rdtgroup_size_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
+ struct resctrl_schema *schema;
struct rdtgroup *rdtgrp;
struct rdt_resource *r;
struct rdt_domain *d;
ret = -ENODEV;
} else {
seq_printf(s, "%*s:", max_name_width,
- rdtgrp->plr->r->name);
- size = rdtgroup_cbm_to_size(rdtgrp->plr->r,
+ rdtgrp->plr->s->name);
+ size = rdtgroup_cbm_to_size(rdtgrp->plr->s->res,
rdtgrp->plr->d,
rdtgrp->plr->cbm);
seq_printf(s, "%d=%u\n", rdtgrp->plr->d->id, size);
goto out;
}
- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(schema, &resctrl_schema_all, list) {
+ r = schema->res;
sep = false;
- seq_printf(s, "%*s:", max_name_width, r->name);
+ seq_printf(s, "%*s:", max_name_width, schema->name);
list_for_each_entry(d, &r->domains, list) {
if (sep)
seq_putc(s, ';');
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP) {
size = 0;
} else {
- ctrl = (!is_mba_sc(r) ?
- d->ctrl_val[rdtgrp->closid] :
- d->mbps_val[rdtgrp->closid]);
+ ctrl = resctrl_arch_get_config(r, d,
+ rdtgrp->closid,
+ schema->conf_type);
if (r->rid == RDT_RESOURCE_MBA)
size = ctrl;
else
return ret;
}
-static int rdtgroup_mkdir_info_resdir(struct rdt_resource *r, char *name,
+static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
unsigned long fflags)
{
struct kernfs_node *kn_subdir;
int ret;
kn_subdir = kernfs_create_dir(kn_info, name,
- kn_info->mode, r);
+ kn_info->mode, priv);
if (IS_ERR(kn_subdir))
return PTR_ERR(kn_subdir);
static int rdtgroup_create_info_dir(struct kernfs_node *parent_kn)
{
+ struct resctrl_schema *s;
struct rdt_resource *r;
unsigned long fflags;
char name[32];
if (ret)
goto out_destroy;
- for_each_alloc_enabled_rdt_resource(r) {
+ /* loop over enabled controls, these are all alloc_enabled */
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
fflags = r->fflags | RF_CTRL_INFO;
- ret = rdtgroup_mkdir_info_resdir(r, r->name, fflags);
+ ret = rdtgroup_mkdir_info_resdir(s, s->name, fflags);
if (ret)
goto out_destroy;
}
static inline bool is_mba_linear(void)
{
- return rdt_resources_all[RDT_RESOURCE_MBA].membw.delay_linear;
+ return rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl.membw.delay_linear;
}
static int set_cache_qos_cfg(int level, bool enable)
if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
- r_l = &rdt_resources_all[level];
+ r_l = &rdt_resources_all[level].r_resctrl;
list_for_each_entry(d, &r_l->domains, list) {
if (r_l->cache.arch_has_per_cpu_cfg)
/* Pick all the CPUs in the domain instance */
/* Restore the qos cfg state when a domain comes online */
void rdt_domain_reconfigure_cdp(struct rdt_resource *r)
{
- if (!r->alloc_capable)
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+ if (!r->cdp_capable)
return;
- if (r == &rdt_resources_all[RDT_RESOURCE_L2DATA])
- l2_qos_cfg_update(&r->alloc_enabled);
+ if (r->rid == RDT_RESOURCE_L2)
+ l2_qos_cfg_update(&hw_res->cdp_enabled);
- if (r == &rdt_resources_all[RDT_RESOURCE_L3DATA])
- l3_qos_cfg_update(&r->alloc_enabled);
+ if (r->rid == RDT_RESOURCE_L3)
+ l3_qos_cfg_update(&hw_res->cdp_enabled);
}
/*
*/
static int set_mba_sc(bool mba_sc)
{
- struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA];
+ struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+ struct rdt_hw_domain *hw_dom;
struct rdt_domain *d;
if (!is_mbm_enabled() || !is_mba_linear() ||
return -EINVAL;
r->membw.mba_sc = mba_sc;
- list_for_each_entry(d, &r->domains, list)
- setup_default_ctrlval(r, d->ctrl_val, d->mbps_val);
+ list_for_each_entry(d, &r->domains, list) {
+ hw_dom = resctrl_to_arch_dom(d);
+ setup_default_ctrlval(r, hw_dom->ctrl_val, hw_dom->mbps_val);
+ }
return 0;
}
-static int cdp_enable(int level, int data_type, int code_type)
+static int cdp_enable(int level)
{
- struct rdt_resource *r_ldata = &rdt_resources_all[data_type];
- struct rdt_resource *r_lcode = &rdt_resources_all[code_type];
- struct rdt_resource *r_l = &rdt_resources_all[level];
+ struct rdt_resource *r_l = &rdt_resources_all[level].r_resctrl;
int ret;
- if (!r_l->alloc_capable || !r_ldata->alloc_capable ||
- !r_lcode->alloc_capable)
+ if (!r_l->alloc_capable)
return -EINVAL;
ret = set_cache_qos_cfg(level, true);
- if (!ret) {
- r_l->alloc_enabled = false;
- r_ldata->alloc_enabled = true;
- r_lcode->alloc_enabled = true;
- }
+ if (!ret)
+ rdt_resources_all[level].cdp_enabled = true;
+
return ret;
}
-static int cdpl3_enable(void)
+static void cdp_disable(int level)
{
- return cdp_enable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
- RDT_RESOURCE_L3CODE);
-}
+ struct rdt_hw_resource *r_hw = &rdt_resources_all[level];
-static int cdpl2_enable(void)
-{
- return cdp_enable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
- RDT_RESOURCE_L2CODE);
+ if (r_hw->cdp_enabled) {
+ set_cache_qos_cfg(level, false);
+ r_hw->cdp_enabled = false;
+ }
}
-static void cdp_disable(int level, int data_type, int code_type)
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable)
{
- struct rdt_resource *r = &rdt_resources_all[level];
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[l];
- r->alloc_enabled = r->alloc_capable;
+ if (!hw_res->r_resctrl.cdp_capable)
+ return -EINVAL;
- if (rdt_resources_all[data_type].alloc_enabled) {
- rdt_resources_all[data_type].alloc_enabled = false;
- rdt_resources_all[code_type].alloc_enabled = false;
- set_cache_qos_cfg(level, false);
- }
-}
+ if (enable)
+ return cdp_enable(l);
-static void cdpl3_disable(void)
-{
- cdp_disable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA, RDT_RESOURCE_L3CODE);
-}
+ cdp_disable(l);
-static void cdpl2_disable(void)
-{
- cdp_disable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA, RDT_RESOURCE_L2CODE);
+ return 0;
}
static void cdp_disable_all(void)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled)
- cdpl3_disable();
- if (rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled)
- cdpl2_disable();
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3))
+ resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, false);
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
+ resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, false);
}
/*
int ret = 0;
if (ctx->enable_cdpl2)
- ret = cdpl2_enable();
+ ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L2, true);
if (!ret && ctx->enable_cdpl3)
- ret = cdpl3_enable();
+ ret = resctrl_arch_set_cdp_enabled(RDT_RESOURCE_L3, true);
if (!ret && ctx->enable_mba_mbps)
ret = set_mba_sc(true);
return ret;
}
+static int schemata_list_add(struct rdt_resource *r, enum resctrl_conf_type type)
+{
+ struct resctrl_schema *s;
+ const char *suffix = "";
+ int ret, cl;
+
+ s = kzalloc(sizeof(*s), GFP_KERNEL);
+ if (!s)
+ return -ENOMEM;
+
+ s->res = r;
+ s->num_closid = resctrl_arch_get_num_closid(r);
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ s->num_closid /= 2;
+
+ s->conf_type = type;
+ switch (type) {
+ case CDP_CODE:
+ suffix = "CODE";
+ break;
+ case CDP_DATA:
+ suffix = "DATA";
+ break;
+ case CDP_NONE:
+ suffix = "";
+ break;
+ }
+
+ ret = snprintf(s->name, sizeof(s->name), "%s%s", r->name, suffix);
+ if (ret >= sizeof(s->name)) {
+ kfree(s);
+ return -EINVAL;
+ }
+
+ cl = strlen(s->name);
+
+ /*
+ * If CDP is supported by this resource, but not enabled,
+ * include the suffix. This ensures the tabular format of the
+ * schemata file does not change between mounts of the filesystem.
+ */
+ if (r->cdp_capable && !resctrl_arch_get_cdp_enabled(r->rid))
+ cl += 4;
+
+ if (cl > max_name_width)
+ max_name_width = cl;
+
+ INIT_LIST_HEAD(&s->list);
+ list_add(&s->list, &resctrl_schema_all);
+
+ return 0;
+}
+
+static int schemata_list_create(void)
+{
+ struct rdt_resource *r;
+ int ret = 0;
+
+ for_each_alloc_enabled_rdt_resource(r) {
+ if (resctrl_arch_get_cdp_enabled(r->rid)) {
+ ret = schemata_list_add(r, CDP_CODE);
+ if (ret)
+ break;
+
+ ret = schemata_list_add(r, CDP_DATA);
+ } else {
+ ret = schemata_list_add(r, CDP_NONE);
+ }
+
+ if (ret)
+ break;
+ }
+
+ return ret;
+}
+
+static void schemata_list_destroy(void)
+{
+ struct resctrl_schema *s, *tmp;
+
+ list_for_each_entry_safe(s, tmp, &resctrl_schema_all, list) {
+ list_del(&s->list);
+ kfree(s);
+ }
+}
+
static int rdt_get_tree(struct fs_context *fc)
{
struct rdt_fs_context *ctx = rdt_fc2context(fc);
if (ret < 0)
goto out_cdp;
+ ret = schemata_list_create();
+ if (ret) {
+ schemata_list_destroy();
+ goto out_mba;
+ }
+
closid_init();
ret = rdtgroup_create_info_dir(rdtgroup_default.kn);
if (ret < 0)
- goto out_mba;
+ goto out_schemata_free;
if (rdt_mon_capable) {
ret = mongroup_create_dir(rdtgroup_default.kn,
static_branch_enable_cpuslocked(&rdt_enable_key);
if (is_mbm_enabled()) {
- r = &rdt_resources_all[RDT_RESOURCE_L3];
+ r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
list_for_each_entry(dom, &r->domains, list)
mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL);
}
kernfs_remove(kn_mongrp);
out_info:
kernfs_remove(kn_info);
+out_schemata_free:
+ schemata_list_destroy();
out_mba:
if (ctx->enable_mba_mbps)
set_mba_sc(false);
static int reset_all_ctrls(struct rdt_resource *r)
{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ struct rdt_hw_domain *hw_dom;
struct msr_param msr_param;
cpumask_var_t cpu_mask;
struct rdt_domain *d;
msr_param.res = r;
msr_param.low = 0;
- msr_param.high = r->num_closid;
+ msr_param.high = hw_res->num_closid;
/*
* Disable resource control for this resource by setting all
* from each domain to update the MSRs below.
*/
list_for_each_entry(d, &r->domains, list) {
+ hw_dom = resctrl_to_arch_dom(d);
cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
- for (i = 0; i < r->num_closid; i++)
- d->ctrl_val[i] = r->default_ctrl;
+ for (i = 0; i < hw_res->num_closid; i++)
+ hw_dom->ctrl_val[i] = r->default_ctrl;
}
cpu = get_cpu();
/* Update CBM on this cpu if it's in cpu_mask. */
rmdir_all_sub();
rdt_pseudo_lock_release();
rdtgroup_default.mode = RDT_MODE_SHAREABLE;
+ schemata_list_destroy();
static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
static_branch_disable_cpuslocked(&rdt_mon_enable_key);
static_branch_disable_cpuslocked(&rdt_enable_key);
* Set the RDT domain up to start off with all usable allocations. That is,
* all shareable and unused bits. All-zero CBM is invalid.
*/
-static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
+static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_schema *s,
u32 closid)
{
- struct rdt_resource *r_cdp = NULL;
- struct rdt_domain *d_cdp = NULL;
+ enum resctrl_conf_type peer_type = resctrl_peer_type(s->conf_type);
+ enum resctrl_conf_type t = s->conf_type;
+ struct resctrl_staged_config *cfg;
+ struct rdt_resource *r = s->res;
u32 used_b = 0, unused_b = 0;
unsigned long tmp_cbm;
enum rdtgrp_mode mode;
- u32 peer_ctl, *ctrl;
+ u32 peer_ctl, ctrl_val;
int i;
- rdt_cdp_peer_get(r, d, &r_cdp, &d_cdp);
- d->have_new_ctrl = false;
- d->new_ctrl = r->cache.shareable_bits;
+ cfg = &d->staged_config[t];
+ cfg->have_new_ctrl = false;
+ cfg->new_ctrl = r->cache.shareable_bits;
used_b = r->cache.shareable_bits;
- ctrl = d->ctrl_val;
- for (i = 0; i < closids_supported(); i++, ctrl++) {
+ for (i = 0; i < closids_supported(); i++) {
if (closid_allocated(i) && i != closid) {
mode = rdtgroup_mode_by_closid(i);
if (mode == RDT_MODE_PSEUDO_LOCKSETUP)
* usage to ensure there is no overlap
* with an exclusive group.
*/
- if (d_cdp)
- peer_ctl = d_cdp->ctrl_val[i];
+ if (resctrl_arch_get_cdp_enabled(r->rid))
+ peer_ctl = resctrl_arch_get_config(r, d, i,
+ peer_type);
else
peer_ctl = 0;
- used_b |= *ctrl | peer_ctl;
+ ctrl_val = resctrl_arch_get_config(r, d, i,
+ s->conf_type);
+ used_b |= ctrl_val | peer_ctl;
if (mode == RDT_MODE_SHAREABLE)
- d->new_ctrl |= *ctrl | peer_ctl;
+ cfg->new_ctrl |= ctrl_val | peer_ctl;
}
}
if (d->plr && d->plr->cbm > 0)
used_b |= d->plr->cbm;
unused_b = used_b ^ (BIT_MASK(r->cache.cbm_len) - 1);
unused_b &= BIT_MASK(r->cache.cbm_len) - 1;
- d->new_ctrl |= unused_b;
+ cfg->new_ctrl |= unused_b;
/*
* Force the initial CBM to be valid, user can
* modify the CBM based on system availability.
*/
- d->new_ctrl = cbm_ensure_valid(d->new_ctrl, r);
+ cfg->new_ctrl = cbm_ensure_valid(cfg->new_ctrl, r);
/*
* Assign the u32 CBM to an unsigned long to ensure that
* bitmap_weight() does not access out-of-bound memory.
*/
- tmp_cbm = d->new_ctrl;
+ tmp_cbm = cfg->new_ctrl;
if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < r->cache.min_cbm_bits) {
- rdt_last_cmd_printf("No space on %s:%d\n", r->name, d->id);
+ rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->id);
return -ENOSPC;
}
- d->have_new_ctrl = true;
+ cfg->have_new_ctrl = true;
return 0;
}
* If there are no more shareable bits available on any domain then
* the entire allocation will fail.
*/
-static int rdtgroup_init_cat(struct rdt_resource *r, u32 closid)
+static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid)
{
struct rdt_domain *d;
int ret;
- list_for_each_entry(d, &r->domains, list) {
- ret = __init_one_rdt_domain(d, r, closid);
+ list_for_each_entry(d, &s->res->domains, list) {
+ ret = __init_one_rdt_domain(d, s, closid);
if (ret < 0)
return ret;
}
/* Initialize MBA resource with default values. */
static void rdtgroup_init_mba(struct rdt_resource *r)
{
+ struct resctrl_staged_config *cfg;
struct rdt_domain *d;
list_for_each_entry(d, &r->domains, list) {
- d->new_ctrl = is_mba_sc(r) ? MBA_MAX_MBPS : r->default_ctrl;
- d->have_new_ctrl = true;
+ cfg = &d->staged_config[CDP_NONE];
+ cfg->new_ctrl = is_mba_sc(r) ? MBA_MAX_MBPS : r->default_ctrl;
+ cfg->have_new_ctrl = true;
}
}
/* Initialize the RDT group's allocations. */
static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
{
+ struct resctrl_schema *s;
struct rdt_resource *r;
int ret;
- for_each_alloc_enabled_rdt_resource(r) {
+ list_for_each_entry(s, &resctrl_schema_all, list) {
+ r = s->res;
if (r->rid == RDT_RESOURCE_MBA) {
rdtgroup_init_mba(r);
} else {
- ret = rdtgroup_init_cat(r, rdtgrp->closid);
+ ret = rdtgroup_init_cat(s, rdtgrp->closid);
if (ret < 0)
return ret;
}
- ret = update_domains(r, rdtgrp->closid);
+ ret = resctrl_arch_update_domains(r, rdtgrp->closid);
if (ret < 0) {
rdt_last_cmd_puts("Failed to initialize allocations\n");
return ret;
static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
{
- if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled)
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3))
seq_puts(seq, ",cdp");
- if (rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled)
+ if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
seq_puts(seq, ",cdpl2");
- if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA]))
+ if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
seq_puts(seq, ",mba_MBps");
return 0;
*/
static void restore_ELCR(char *trigger)
{
- outb(trigger[0], 0x4d0);
- outb(trigger[1], 0x4d1);
+ outb(trigger[0], PIC_ELCR1);
+ outb(trigger[1], PIC_ELCR2);
}
static void save_ELCR(char *trigger)
{
/* IRQ 0,1,2,8,13 are marked as reserved */
- trigger[0] = inb(0x4d0) & 0xF8;
- trigger[1] = inb(0x4d1) & 0xDE;
+ trigger[0] = inb(PIC_ELCR1) & 0xF8;
+ trigger[1] = inb(PIC_ELCR2) & 0xDE;
}
static void i8259A_resume(void)
#include <linux/smp.h>
#include <linux/pci.h>
+#include <asm/i8259.h>
#include <asm/io_apic.h>
#include <asm/acpi.h>
#include <asm/irqdomain.h>
{
unsigned int port;
- port = 0x4d0 + (irq >> 3);
+ port = PIC_ELCR1 + (irq >> 3);
return (inb(port) >> (irq & 7)) & 1;
}
},
{ /* Handle problems with rebooting on the OptiPlex 990. */
.callback = set_pci_reboot,
- .ident = "Dell OptiPlex 990",
+ .ident = "Dell OptiPlex 990 BIOS A0x",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
DMI_MATCH(DMI_PRODUCT_NAME, "OptiPlex 990"),
+ DMI_MATCH(DMI_BIOS_VERSION, "A0"),
},
},
{ /* Handle problems with rebooting on Dell 300's */
if (threads > __max_smt_threads)
__max_smt_threads = threads;
+ for_each_cpu(i, topology_sibling_cpumask(cpu))
+ cpu_data(i).smt_active = threads > 1;
+
/*
* This needs a separate iteration over the cpus because we rely on all
* topology_sibling_cpumask links to be set-up.
for_each_cpu(sibling, topology_die_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_die_cpumask(sibling));
- for_each_cpu(sibling, topology_sibling_cpumask(cpu))
+
+ for_each_cpu(sibling, topology_sibling_cpumask(cpu)) {
cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
+ if (cpumask_weight(topology_sibling_cpumask(sibling)) == 1)
+ cpu_data(sibling).smt_active = false;
+ }
+
for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
cpumask_clear(cpu_llc_shared_mask(cpu));
addr, len, val);
}
-static int picdev_eclr_write(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
+static int picdev_elcr_write(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
gpa_t addr, int len, const void *val)
{
- return picdev_write(container_of(dev, struct kvm_pic, dev_eclr),
+ return picdev_write(container_of(dev, struct kvm_pic, dev_elcr),
addr, len, val);
}
-static int picdev_eclr_read(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
+static int picdev_elcr_read(struct kvm_vcpu *vcpu, struct kvm_io_device *dev,
gpa_t addr, int len, void *val)
{
- return picdev_read(container_of(dev, struct kvm_pic, dev_eclr),
+ return picdev_read(container_of(dev, struct kvm_pic, dev_elcr),
addr, len, val);
}
.write = picdev_slave_write,
};
-static const struct kvm_io_device_ops picdev_eclr_ops = {
- .read = picdev_eclr_read,
- .write = picdev_eclr_write,
+static const struct kvm_io_device_ops picdev_elcr_ops = {
+ .read = picdev_elcr_read,
+ .write = picdev_elcr_write,
};
int kvm_pic_init(struct kvm *kvm)
*/
kvm_iodevice_init(&s->dev_master, &picdev_master_ops);
kvm_iodevice_init(&s->dev_slave, &picdev_slave_ops);
- kvm_iodevice_init(&s->dev_eclr, &picdev_eclr_ops);
+ kvm_iodevice_init(&s->dev_elcr, &picdev_elcr_ops);
mutex_lock(&kvm->slots_lock);
ret = kvm_io_bus_register_dev(kvm, KVM_PIO_BUS, 0x20, 2,
&s->dev_master);
if (ret < 0)
goto fail_unreg_2;
- ret = kvm_io_bus_register_dev(kvm, KVM_PIO_BUS, 0x4d0, 2, &s->dev_eclr);
+ ret = kvm_io_bus_register_dev(kvm, KVM_PIO_BUS, 0x4d0, 2, &s->dev_elcr);
if (ret < 0)
goto fail_unreg_1;
mutex_lock(&kvm->slots_lock);
kvm_io_bus_unregister_dev(vpic->kvm, KVM_PIO_BUS, &vpic->dev_master);
kvm_io_bus_unregister_dev(vpic->kvm, KVM_PIO_BUS, &vpic->dev_slave);
- kvm_io_bus_unregister_dev(vpic->kvm, KVM_PIO_BUS, &vpic->dev_eclr);
+ kvm_io_bus_unregister_dev(vpic->kvm, KVM_PIO_BUS, &vpic->dev_elcr);
mutex_unlock(&kvm->slots_lock);
kvm->arch.vpic = NULL;
int output; /* intr from master PIC */
struct kvm_io_device dev_master;
struct kvm_io_device dev_slave;
- struct kvm_io_device dev_eclr;
+ struct kvm_io_device dev_elcr;
void (*ack_notifier)(void *opaque, int irq);
unsigned long irq_states[PIC_NUM_PINS];
};
lib-y := delay.o misc.o cmdline.o cpu.o
lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
lib-y += memcpy_$(BITS).o
+lib-y += pc-conf-reg.o
lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_mc.o copy_mc_64.o
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Support for the configuration register space at port I/O locations
+ * 0x22 and 0x23 variously used by PC architectures, e.g. the MP Spec,
+ * Cyrix CPUs, numerous chipsets. As the space is indirectly addressed
+ * it may have to be protected with a spinlock, depending on the context.
+ */
+
+#include <linux/spinlock.h>
+
+#include <asm/pc-conf-reg.h>
+
+DEFINE_RAW_SPINLOCK(pc_conf_lock);
goto out;
}
- get_online_cpus();
+ cpus_read_lock();
cpumask_copy(downed_cpus, cpu_online_mask);
cpumask_clear_cpu(cpumask_first(cpu_online_mask), downed_cpus);
if (num_online_cpus() > 1)
pr_notice("Disabling non-boot CPUs...\n");
- put_online_cpus();
+ cpus_read_unlock();
for_each_cpu(cpu, downed_cpus) {
err = remove_cpu(cpu);
#include <linux/export.h>
#include <linux/cpu.h>
#include <linux/debugfs.h>
+#include <linux/sched/smt.h>
#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
#include <asm/nospec-branch.h>
#include <asm/cache.h>
+#include <asm/cacheflush.h>
#include <asm/apic.h>
#include <asm/perf_event.h>
*/
/*
- * Use bit 0 to mangle the TIF_SPEC_IB state into the mm pointer which is
- * stored in cpu_tlb_state.last_user_mm_ibpb.
+ * Bits to mangle the TIF_SPEC_* state into the mm pointer which is
+ * stored in cpu_tlb_state.last_user_mm_spec.
*/
#define LAST_USER_MM_IBPB 0x1UL
+#define LAST_USER_MM_L1D_FLUSH 0x2UL
+#define LAST_USER_MM_SPEC_MASK (LAST_USER_MM_IBPB | LAST_USER_MM_L1D_FLUSH)
+
+/* Bits to set when tlbstate and flush is (re)initialized */
+#define LAST_USER_MM_INIT LAST_USER_MM_IBPB
/*
* The x86 feature is called PCID (Process Context IDentifier). It is similar
local_irq_restore(flags);
}
-static unsigned long mm_mangle_tif_spec_ib(struct task_struct *next)
+/*
+ * Invoked from return to user/guest by a task that opted-in to L1D
+ * flushing but ended up running on an SMT enabled core due to wrong
+ * affinity settings or CPU hotplug. This is part of the paranoid L1D flush
+ * contract which this task requested.
+ */
+static void l1d_flush_force_sigbus(struct callback_head *ch)
+{
+ force_sig(SIGBUS);
+}
+
+static void l1d_flush_evaluate(unsigned long prev_mm, unsigned long next_mm,
+ struct task_struct *next)
+{
+ /* Flush L1D if the outgoing task requests it */
+ if (prev_mm & LAST_USER_MM_L1D_FLUSH)
+ wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
+
+ /* Check whether the incoming task opted in for L1D flush */
+ if (likely(!(next_mm & LAST_USER_MM_L1D_FLUSH)))
+ return;
+
+ /*
+ * Validate that it is not running on an SMT sibling as this would
+ * make the excercise pointless because the siblings share L1D. If
+ * it runs on a SMT sibling, notify it with SIGBUS on return to
+ * user/guest
+ */
+ if (this_cpu_read(cpu_info.smt_active)) {
+ clear_ti_thread_flag(&next->thread_info, TIF_SPEC_L1D_FLUSH);
+ next->l1d_flush_kill.func = l1d_flush_force_sigbus;
+ task_work_add(next, &next->l1d_flush_kill, TWA_RESUME);
+ }
+}
+
+static unsigned long mm_mangle_tif_spec_bits(struct task_struct *next)
{
unsigned long next_tif = task_thread_info(next)->flags;
- unsigned long ibpb = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_IBPB;
+ unsigned long spec_bits = (next_tif >> TIF_SPEC_IB) & LAST_USER_MM_SPEC_MASK;
- return (unsigned long)next->mm | ibpb;
+ /*
+ * Ensure that the bit shift above works as expected and the two flags
+ * end up in bit 0 and 1.
+ */
+ BUILD_BUG_ON(TIF_SPEC_L1D_FLUSH != TIF_SPEC_IB + 1);
+
+ return (unsigned long)next->mm | spec_bits;
}
-static void cond_ibpb(struct task_struct *next)
+static void cond_mitigation(struct task_struct *next)
{
+ unsigned long prev_mm, next_mm;
+
if (!next || !next->mm)
return;
+ next_mm = mm_mangle_tif_spec_bits(next);
+ prev_mm = this_cpu_read(cpu_tlbstate.last_user_mm_spec);
+
/*
+ * Avoid user/user BTB poisoning by flushing the branch predictor
+ * when switching between processes. This stops one process from
+ * doing Spectre-v2 attacks on another.
+ *
* Both, the conditional and the always IBPB mode use the mm
* pointer to avoid the IBPB when switching between tasks of the
* same process. Using the mm pointer instead of mm->context.ctx_id
* exposed data is not really interesting.
*/
if (static_branch_likely(&switch_mm_cond_ibpb)) {
- unsigned long prev_mm, next_mm;
-
/*
* This is a bit more complex than the always mode because
* it has to handle two cases:
* Optimize this with reasonably small overhead for the
* above cases. Mangle the TIF_SPEC_IB bit into the mm
* pointer of the incoming task which is stored in
- * cpu_tlbstate.last_user_mm_ibpb for comparison.
- */
- next_mm = mm_mangle_tif_spec_ib(next);
- prev_mm = this_cpu_read(cpu_tlbstate.last_user_mm_ibpb);
-
- /*
+ * cpu_tlbstate.last_user_mm_spec for comparison.
+ *
* Issue IBPB only if the mm's are different and one or
* both have the IBPB bit set.
*/
if (next_mm != prev_mm &&
(next_mm | prev_mm) & LAST_USER_MM_IBPB)
indirect_branch_prediction_barrier();
-
- this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, next_mm);
}
if (static_branch_unlikely(&switch_mm_always_ibpb)) {
* different context than the user space task which ran
* last on this CPU.
*/
- if (this_cpu_read(cpu_tlbstate.last_user_mm) != next->mm) {
+ if ((prev_mm & ~LAST_USER_MM_SPEC_MASK) !=
+ (unsigned long)next->mm)
indirect_branch_prediction_barrier();
- this_cpu_write(cpu_tlbstate.last_user_mm, next->mm);
- }
}
+
+ if (static_branch_unlikely(&switch_mm_cond_l1d_flush)) {
+ /*
+ * Flush L1D when the outgoing task requested it and/or
+ * check whether the incoming task requested L1D flushing
+ * and ended up on an SMT sibling.
+ */
+ if (unlikely((prev_mm | next_mm) & LAST_USER_MM_L1D_FLUSH))
+ l1d_flush_evaluate(prev_mm, next_mm, next);
+ }
+
+ this_cpu_write(cpu_tlbstate.last_user_mm_spec, next_mm);
}
#ifdef CONFIG_PERF_EVENTS
need_flush = true;
} else {
/*
- * Avoid user/user BTB poisoning by flushing the branch
- * predictor when switching between processes. This stops
- * one process from doing Spectre-v2 attacks on another.
+ * Apply process to process speculation vulnerability
+ * mitigations if applicable.
*/
- cond_ibpb(tsk);
+ cond_mitigation(tsk);
/*
* Stop remote flushes for the previous mm.
write_cr3(build_cr3(mm->pgd, 0));
/* Reinitialize tlbstate. */
- this_cpu_write(cpu_tlbstate.last_user_mm_ibpb, LAST_USER_MM_IBPB);
+ this_cpu_write(cpu_tlbstate.last_user_mm_spec, LAST_USER_MM_INIT);
this_cpu_write(cpu_tlbstate.loaded_mm_asid, 0);
this_cpu_write(cpu_tlbstate.next_asid, 1);
this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id);
#include <linux/dmi.h>
#include <linux/io.h>
#include <linux/smp.h>
+#include <linux/spinlock.h>
#include <asm/io_apic.h>
#include <linux/irq.h>
#include <linux/acpi.h>
+
+#include <asm/i8259.h>
+#include <asm/pc-conf-reg.h>
#include <asm/pci_x86.h>
#define PIRQ_SIGNATURE (('$' << 0) + ('P' << 8) + ('I' << 16) + ('R' << 24))
int (*get)(struct pci_dev *router, struct pci_dev *dev, int pirq);
int (*set)(struct pci_dev *router, struct pci_dev *dev, int pirq,
int new);
+ int (*lvl)(struct pci_dev *router, struct pci_dev *dev, int pirq,
+ int irq);
};
struct irq_router_handler {
void elcr_set_level_irq(unsigned int irq)
{
unsigned char mask = 1 << (irq & 7);
- unsigned int port = 0x4d0 + (irq >> 3);
+ unsigned int port = PIC_ELCR1 + (irq >> 3);
unsigned char val;
static u16 elcr_irq_mask;
}
}
+/*
+ * PIRQ routing for the M1487 ISA Bus Controller (IBC) ASIC used
+ * with the ALi FinALi 486 chipset. The IBC is not decoded in the
+ * PCI configuration space, so we identify it by the accompanying
+ * M1489 Cache-Memory PCI Controller (CMP) ASIC.
+ *
+ * There are four 4-bit mappings provided, spread across two PCI
+ * INTx Routing Table Mapping Registers, available in the port I/O
+ * space accessible indirectly via the index/data register pair at
+ * 0x22/0x23, located at indices 0x42 and 0x43 for the INT1/INT2
+ * and INT3/INT4 lines respectively. The INT1/INT3 and INT2/INT4
+ * lines are mapped in the low and the high 4-bit nibble of the
+ * corresponding register as follows:
+ *
+ * 0000 : Disabled
+ * 0001 : IRQ9
+ * 0010 : IRQ3
+ * 0011 : IRQ10
+ * 0100 : IRQ4
+ * 0101 : IRQ5
+ * 0110 : IRQ7
+ * 0111 : IRQ6
+ * 1000 : Reserved
+ * 1001 : IRQ11
+ * 1010 : Reserved
+ * 1011 : IRQ12
+ * 1100 : Reserved
+ * 1101 : IRQ14
+ * 1110 : Reserved
+ * 1111 : IRQ15
+ *
+ * In addition to the usual ELCR register pair there is a separate
+ * PCI INTx Sensitivity Register at index 0x44 in the same port I/O
+ * space, whose bits 3:0 select the trigger mode for INT[4:1] lines
+ * respectively. Any bit set to 1 causes interrupts coming on the
+ * corresponding line to be passed to ISA as edge-triggered and
+ * otherwise they are passed as level-triggered. Manufacturer's
+ * documentation says this register has to be set consistently with
+ * the relevant ELCR register.
+ *
+ * Accesses to the port I/O space concerned here need to be unlocked
+ * by writing the value of 0xc5 to the Lock Register at index 0x03
+ * beforehand. Any other value written to said register prevents
+ * further accesses from reaching the register file, except for the
+ * Lock Register being written with 0xc5 again.
+ *
+ * References:
+ *
+ * "M1489/M1487: 486 PCI Chip Set", Version 1.2, Acer Laboratories
+ * Inc., July 1997
+ */
+
+#define PC_CONF_FINALI_LOCK 0x03u
+#define PC_CONF_FINALI_PCI_INTX_RT1 0x42u
+#define PC_CONF_FINALI_PCI_INTX_RT2 0x43u
+#define PC_CONF_FINALI_PCI_INTX_SENS 0x44u
+
+#define PC_CONF_FINALI_LOCK_KEY 0xc5u
+
+static u8 read_pc_conf_nybble(u8 base, u8 index)
+{
+ u8 reg = base + (index >> 1);
+ u8 x;
+
+ x = pc_conf_get(reg);
+ return index & 1 ? x >> 4 : x & 0xf;
+}
+
+static void write_pc_conf_nybble(u8 base, u8 index, u8 val)
+{
+ u8 reg = base + (index >> 1);
+ u8 x;
+
+ x = pc_conf_get(reg);
+ x = index & 1 ? (x & 0x0f) | (val << 4) : (x & 0xf0) | val;
+ pc_conf_set(reg, x);
+}
+
+static int pirq_finali_get(struct pci_dev *router, struct pci_dev *dev,
+ int pirq)
+{
+ static const u8 irqmap[16] = {
+ 0, 9, 3, 10, 4, 5, 7, 6, 0, 11, 0, 12, 0, 14, 0, 15
+ };
+ unsigned long flags;
+ u8 x;
+
+ raw_spin_lock_irqsave(&pc_conf_lock, flags);
+ pc_conf_set(PC_CONF_FINALI_LOCK, PC_CONF_FINALI_LOCK_KEY);
+ x = irqmap[read_pc_conf_nybble(PC_CONF_FINALI_PCI_INTX_RT1, pirq - 1)];
+ pc_conf_set(PC_CONF_FINALI_LOCK, 0);
+ raw_spin_unlock_irqrestore(&pc_conf_lock, flags);
+ return x;
+}
+
+static int pirq_finali_set(struct pci_dev *router, struct pci_dev *dev,
+ int pirq, int irq)
+{
+ static const u8 irqmap[16] = {
+ 0, 0, 0, 2, 4, 5, 7, 6, 0, 1, 3, 9, 11, 0, 13, 15
+ };
+ u8 val = irqmap[irq];
+ unsigned long flags;
+
+ if (!val)
+ return 0;
+
+ raw_spin_lock_irqsave(&pc_conf_lock, flags);
+ pc_conf_set(PC_CONF_FINALI_LOCK, PC_CONF_FINALI_LOCK_KEY);
+ write_pc_conf_nybble(PC_CONF_FINALI_PCI_INTX_RT1, pirq - 1, val);
+ pc_conf_set(PC_CONF_FINALI_LOCK, 0);
+ raw_spin_unlock_irqrestore(&pc_conf_lock, flags);
+ return 1;
+}
+
+static int pirq_finali_lvl(struct pci_dev *router, struct pci_dev *dev,
+ int pirq, int irq)
+{
+ u8 mask = ~(1u << (pirq - 1));
+ unsigned long flags;
+ u8 trig;
+
+ elcr_set_level_irq(irq);
+ raw_spin_lock_irqsave(&pc_conf_lock, flags);
+ pc_conf_set(PC_CONF_FINALI_LOCK, PC_CONF_FINALI_LOCK_KEY);
+ trig = pc_conf_get(PC_CONF_FINALI_PCI_INTX_SENS);
+ trig &= mask;
+ pc_conf_set(PC_CONF_FINALI_PCI_INTX_SENS, trig);
+ pc_conf_set(PC_CONF_FINALI_LOCK, 0);
+ raw_spin_unlock_irqrestore(&pc_conf_lock, flags);
+ return 1;
+}
+
/*
* Common IRQ routing practice: nibbles in config space,
* offset by some magic constant.
return 0;
}
+/*
+ * PIRQ routing for the 82374EB/82374SB EISA System Component (ESC)
+ * ASIC used with the Intel 82420 and 82430 PCIsets. The ESC is not
+ * decoded in the PCI configuration space, so we identify it by the
+ * accompanying 82375EB/82375SB PCI-EISA Bridge (PCEB) ASIC.
+ *
+ * There are four PIRQ Route Control registers, available in the
+ * port I/O space accessible indirectly via the index/data register
+ * pair at 0x22/0x23, located at indices 0x60/0x61/0x62/0x63 for the
+ * PIRQ0/1/2/3# lines respectively. The semantics is the same as
+ * with the PIIX router.
+ *
+ * Accesses to the port I/O space concerned here need to be unlocked
+ * by writing the value of 0x0f to the ESC ID Register at index 0x02
+ * beforehand. Any other value written to said register prevents
+ * further accesses from reaching the register file, except for the
+ * ESC ID Register being written with 0x0f again.
+ *
+ * References:
+ *
+ * "82374EB/82374SB EISA System Component (ESC)", Intel Corporation,
+ * Order Number: 290476-004, March 1996
+ *
+ * "82375EB/82375SB PCI-EISA Bridge (PCEB)", Intel Corporation, Order
+ * Number: 290477-004, March 1996
+ */
+
+#define PC_CONF_I82374_ESC_ID 0x02u
+#define PC_CONF_I82374_PIRQ_ROUTE_CONTROL 0x60u
+
+#define PC_CONF_I82374_ESC_ID_KEY 0x0fu
+
+static int pirq_esc_get(struct pci_dev *router, struct pci_dev *dev, int pirq)
+{
+ unsigned long flags;
+ int reg;
+ u8 x;
+
+ reg = pirq;
+ if (reg >= 1 && reg <= 4)
+ reg += PC_CONF_I82374_PIRQ_ROUTE_CONTROL - 1;
+
+ raw_spin_lock_irqsave(&pc_conf_lock, flags);
+ pc_conf_set(PC_CONF_I82374_ESC_ID, PC_CONF_I82374_ESC_ID_KEY);
+ x = pc_conf_get(reg);
+ pc_conf_set(PC_CONF_I82374_ESC_ID, 0);
+ raw_spin_unlock_irqrestore(&pc_conf_lock, flags);
+ return (x < 16) ? x : 0;
+}
+
+static int pirq_esc_set(struct pci_dev *router, struct pci_dev *dev, int pirq,
+ int irq)
+{
+ unsigned long flags;
+ int reg;
+
+ reg = pirq;
+ if (reg >= 1 && reg <= 4)
+ reg += PC_CONF_I82374_PIRQ_ROUTE_CONTROL - 1;
+
+ raw_spin_lock_irqsave(&pc_conf_lock, flags);
+ pc_conf_set(PC_CONF_I82374_ESC_ID, PC_CONF_I82374_ESC_ID_KEY);
+ pc_conf_set(reg, irq);
+ pc_conf_set(PC_CONF_I82374_ESC_ID, 0);
+ raw_spin_unlock_irqrestore(&pc_conf_lock, flags);
+ return 1;
+}
+
/*
* The Intel PIIX4 pirq rules are fairly simple: "pirq" is
* just a pointer to the config space.
return 1;
}
+/*
+ * PIRQ routing for the 82426EX ISA Bridge (IB) ASIC used with the
+ * Intel 82420EX PCIset.
+ *
+ * There are only two PIRQ Route Control registers, available in the
+ * combined 82425EX/82426EX PCI configuration space, at 0x66 and 0x67
+ * for the PIRQ0# and PIRQ1# lines respectively. The semantics is
+ * the same as with the PIIX router.
+ *
+ * References:
+ *
+ * "82420EX PCIset Data Sheet, 82425EX PCI System Controller (PSC)
+ * and 82426EX ISA Bridge (IB)", Intel Corporation, Order Number:
+ * 290488-004, December 1995
+ */
+
+#define PCI_I82426EX_PIRQ_ROUTE_CONTROL 0x66u
+
+static int pirq_ib_get(struct pci_dev *router, struct pci_dev *dev, int pirq)
+{
+ int reg;
+ u8 x;
+
+ reg = pirq;
+ if (reg >= 1 && reg <= 2)
+ reg += PCI_I82426EX_PIRQ_ROUTE_CONTROL - 1;
+
+ pci_read_config_byte(router, reg, &x);
+ return (x < 16) ? x : 0;
+}
+
+static int pirq_ib_set(struct pci_dev *router, struct pci_dev *dev, int pirq,
+ int irq)
+{
+ int reg;
+
+ reg = pirq;
+ if (reg >= 1 && reg <= 2)
+ reg += PCI_I82426EX_PIRQ_ROUTE_CONTROL - 1;
+
+ pci_write_config_byte(router, reg, irq);
+ return 1;
+}
+
/*
* The VIA pirq rules are nibble-based, like ALI,
* but without the ugly irq number munging.
return 0;
switch (device) {
+ case PCI_DEVICE_ID_INTEL_82375:
+ r->name = "PCEB/ESC";
+ r->get = pirq_esc_get;
+ r->set = pirq_esc_set;
+ return 1;
case PCI_DEVICE_ID_INTEL_82371FB_0:
case PCI_DEVICE_ID_INTEL_82371SB_0:
case PCI_DEVICE_ID_INTEL_82371AB_0:
r->get = pirq_piix_get;
r->set = pirq_piix_set;
return 1;
+ case PCI_DEVICE_ID_INTEL_82425:
+ r->name = "PSC/IB";
+ r->get = pirq_ib_get;
+ r->set = pirq_ib_set;
+ return 1;
}
if ((device >= PCI_DEVICE_ID_INTEL_5_3400_SERIES_LPC_MIN &&
static __init int ali_router_probe(struct irq_router *r, struct pci_dev *router, u16 device)
{
switch (device) {
+ case PCI_DEVICE_ID_AL_M1489:
+ r->name = "FinALi";
+ r->get = pirq_finali_get;
+ r->set = pirq_finali_set;
+ r->lvl = pirq_finali_lvl;
+ return 1;
case PCI_DEVICE_ID_AL_M1533:
case PCI_DEVICE_ID_AL_M1563:
r->name = "ALI";
} else if (r->get && (irq = r->get(pirq_router_dev, dev, pirq)) && \
((!(pci_probe & PCI_USE_PIRQ_MASK)) || ((1 << irq) & mask))) {
msg = "found";
- elcr_set_level_irq(irq);
+ if (r->lvl)
+ r->lvl(pirq_router_dev, dev, pirq, irq);
+ else
+ elcr_set_level_irq(irq);
} else if (newirq && r->set &&
(dev->class >> 8) != PCI_CLASS_DISPLAY_VGA) {
if (r->set(pirq_router_dev, dev, pirq, newirq)) {
- elcr_set_level_irq(newirq);
+ if (r->lvl)
+ r->lvl(pirq_router_dev, dev, pirq, newirq);
+ else
+ elcr_set_level_irq(newirq);
msg = "assigned";
irq = newirq;
}
}
/**
- * __save_processor_state - save CPU registers before creating a
- * hibernation image and before restoring the memory state from it
- * @ctxt - structure to store the registers contents in
+ * __save_processor_state() - Save CPU registers before creating a
+ * hibernation image and before restoring
+ * the memory state from it
+ * @ctxt: Structure to store the registers contents in.
*
- * NOTE: If there is a CPU register the modification of which by the
- * boot kernel (ie. the kernel used for loading the hibernation image)
- * might affect the operations of the restored target kernel (ie. the one
- * saved in the hibernation image), then its contents must be saved by this
- * function. In other words, if kernel A is hibernated and different
- * kernel B is used for loading the hibernation image into memory, the
- * kernel A's __save_processor_state() function must save all registers
- * needed by kernel A, so that it can operate correctly after the resume
- * regardless of what kernel B does in the meantime.
+ * NOTE: If there is a CPU register the modification of which by the
+ * boot kernel (ie. the kernel used for loading the hibernation image)
+ * might affect the operations of the restored target kernel (ie. the one
+ * saved in the hibernation image), then its contents must be saved by this
+ * function. In other words, if kernel A is hibernated and different
+ * kernel B is used for loading the hibernation image into memory, the
+ * kernel A's __save_processor_state() function must save all registers
+ * needed by kernel A, so that it can operate correctly after the resume
+ * regardless of what kernel B does in the meantime.
*/
static void __save_processor_state(struct saved_context *ctxt)
{
}
/**
- * __restore_processor_state - restore the contents of CPU registers saved
- * by __save_processor_state()
- * @ctxt - structure to load the registers contents from
+ * __restore_processor_state() - Restore the contents of CPU registers saved
+ * by __save_processor_state()
+ * @ctxt: Structure to load the registers contents from.
*
* The asm code that gets us here will have restored a usable GDT, although
* it will be pointing to the wrong alias.
#if ELF_BITS == 64
static struct relocs relocs32neg;
static struct relocs relocs64;
+#define FMT PRIu64
+#else
+#define FMT PRIu32
#endif
struct section {
Elf_Shdr shdr;
if (fseek(fp, ehdr.e_shoff, SEEK_SET) < 0)
- die("Seek to %d failed: %s\n", ehdr.e_shoff, strerror(errno));
+ die("Seek to %" FMT " failed: %s\n", ehdr.e_shoff, strerror(errno));
if (fread(&shdr, sizeof(shdr), 1, fp) != 1)
die("Cannot read initial ELF section header: %s\n", strerror(errno));
secs = calloc(shnum, sizeof(struct section));
if (!secs) {
- die("Unable to allocate %d section headers\n",
+ die("Unable to allocate %ld section headers\n",
shnum);
}
if (fseek(fp, ehdr.e_shoff, SEEK_SET) < 0) {
- die("Seek to %d failed: %s\n",
- ehdr.e_shoff, strerror(errno));
+ die("Seek to %" FMT " failed: %s\n",
+ ehdr.e_shoff, strerror(errno));
}
for (i = 0; i < shnum; i++) {
struct section *sec = &secs[i];
if (fread(&shdr, sizeof(shdr), 1, fp) != 1)
- die("Cannot read ELF section headers %d/%d: %s\n",
+ die("Cannot read ELF section headers %d/%ld: %s\n",
i, shnum, strerror(errno));
sec->shdr.sh_name = elf_word_to_cpu(shdr.sh_name);
sec->shdr.sh_type = elf_word_to_cpu(shdr.sh_type);
}
sec->strtab = malloc(sec->shdr.sh_size);
if (!sec->strtab) {
- die("malloc of %d bytes for strtab failed\n",
- sec->shdr.sh_size);
+ die("malloc of %" FMT " bytes for strtab failed\n",
+ sec->shdr.sh_size);
}
if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) {
- die("Seek to %d failed: %s\n",
- sec->shdr.sh_offset, strerror(errno));
+ die("Seek to %" FMT " failed: %s\n",
+ sec->shdr.sh_offset, strerror(errno));
}
if (fread(sec->strtab, 1, sec->shdr.sh_size, fp)
!= sec->shdr.sh_size) {
}
sec->symtab = malloc(sec->shdr.sh_size);
if (!sec->symtab) {
- die("malloc of %d bytes for symtab failed\n",
- sec->shdr.sh_size);
+ die("malloc of %" FMT " bytes for symtab failed\n",
+ sec->shdr.sh_size);
}
if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) {
- die("Seek to %d failed: %s\n",
- sec->shdr.sh_offset, strerror(errno));
+ die("Seek to %" FMT " failed: %s\n",
+ sec->shdr.sh_offset, strerror(errno));
}
if (fread(sec->symtab, 1, sec->shdr.sh_size, fp)
!= sec->shdr.sh_size) {
}
sec->reltab = malloc(sec->shdr.sh_size);
if (!sec->reltab) {
- die("malloc of %d bytes for relocs failed\n",
- sec->shdr.sh_size);
+ die("malloc of %" FMT " bytes for relocs failed\n",
+ sec->shdr.sh_size);
}
if (fseek(fp, sec->shdr.sh_offset, SEEK_SET) < 0) {
- die("Seek to %d failed: %s\n",
- sec->shdr.sh_offset, strerror(errno));
+ die("Seek to %" FMT " failed: %s\n",
+ sec->shdr.sh_offset, strerror(errno));
}
if (fread(sec->reltab, 1, sec->shdr.sh_size, fp)
!= sec->shdr.sh_size) {
#include <regex.h>
#include <tools/le_byteshift.h>
+__attribute__((__format__(printf, 1, 2)))
void die(char *fmt, ...) __attribute__((noreturn));
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
asmlinkage void do_IRQ(int hwirq, struct pt_regs *regs)
{
- int irq = irq_find_mapping(NULL, hwirq);
-
#ifdef CONFIG_DEBUG_STACKOVERFLOW
/* Debugging check for stack overflow: is there less than 1KB free? */
{
sp - sizeof(struct thread_info));
}
#endif
- generic_handle_irq(irq);
+ generic_handle_domain_irq(NULL, hwirq);
}
int arch_show_interrupts(struct seq_file *p, int prec)
Note, this is an experimental interface and could be changed someday.
-config BLK_CMDLINE_PARSER
- bool "Block device command line partition parser"
- help
- Enabling this option allows you to specify the partition layout from
- the kernel boot args. This is typically of use for embedded devices
- which don't otherwise have any standardized method for listing the
- partitions on a block device.
-
- See Documentation/block/cmdline-partition.rst for more information.
-
config BLK_WBT
bool "Enable support for block device writeback throttling"
help
config BLK_PM
def_bool BLOCK && PM
+# do not use in new code
+config BLOCK_HOLDER_DEPRECATED
+ bool
+
source "block/Kconfig.iosched"
bfq-y := bfq-iosched.o bfq-wf2q.o bfq-cgroup.o
obj-$(CONFIG_IOSCHED_BFQ) += bfq.o
-obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o
obj-$(CONFIG_BLK_DEV_INTEGRITY_T10) += t10-pi.o
obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o
obj-$(CONFIG_BLK_PM) += blk-pm.o
obj-$(CONFIG_BLK_INLINE_ENCRYPTION) += keyslot-manager.o blk-crypto.o
obj-$(CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK) += blk-crypto-fallback.o
+obj-$(CONFIG_BLOCK_HOLDER_DEPRECATED) += holder.o
__rq = bfq_find_rq_fmerge(bfqd, bio, q);
if (__rq && elv_bio_merge_ok(__rq, bio)) {
*req = __rq;
+
+ if (blk_discard_mergable(__rq))
+ return ELEVATOR_DISCARD_MERGE;
return ELEVATOR_FRONT_MERGE;
}
int i, j;
for (i = 0; i < 2; i++)
- for (j = 0; j < IOPRIO_BE_NR; j++)
+ for (j = 0; j < IOPRIO_NR_LEVELS; j++)
if (bfqg->async_bfqq[i][j])
bfq_bfqq_end_wr(bfqg->async_bfqq[i][j]);
if (bfqg->async_idle_bfqq)
switch (ioprio_class) {
default:
pr_err("bdi %s: bfq: bad prio class %d\n",
- bdi_dev_name(bfqq->bfqd->queue->backing_dev_info),
- ioprio_class);
+ bdi_dev_name(bfqq->bfqd->queue->disk->bdi),
+ ioprio_class);
fallthrough;
case IOPRIO_CLASS_NONE:
/*
break;
}
- if (bfqq->new_ioprio >= IOPRIO_BE_NR) {
+ if (bfqq->new_ioprio >= IOPRIO_NR_LEVELS) {
pr_crit("bfq_set_next_ioprio_data: new_ioprio %d\n",
bfqq->new_ioprio);
- bfqq->new_ioprio = IOPRIO_BE_NR;
+ bfqq->new_ioprio = IOPRIO_NR_LEVELS - 1;
}
bfqq->entity.new_weight = bfq_ioprio_to_weight(bfqq->new_ioprio);
case IOPRIO_CLASS_RT:
return &bfqg->async_bfqq[0][ioprio];
case IOPRIO_CLASS_NONE:
- ioprio = IOPRIO_NORM;
+ ioprio = IOPRIO_BE_NORM;
fallthrough;
case IOPRIO_CLASS_BE:
return &bfqg->async_bfqq[1][ioprio];
int i, j;
for (i = 0; i < 2; i++)
- for (j = 0; j < IOPRIO_BE_NR; j++)
+ for (j = 0; j < IOPRIO_NR_LEVELS; j++)
__bfq_put_async_bfqq(bfqd, &bfqg->async_bfqq[i][j]);
__bfq_put_async_bfqq(bfqd, &bfqg->async_idle_bfqq);
void *bfqd;
- struct bfq_queue *async_bfqq[2][IOPRIO_BE_NR];
+ struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS];
struct bfq_queue *async_idle_bfqq;
struct bfq_entity *my_entity;
struct bfq_entity entity;
struct bfq_sched_data sched_data;
- struct bfq_queue *async_bfqq[2][IOPRIO_BE_NR];
+ struct bfq_queue *async_bfqq[2][IOPRIO_NR_LEVELS];
struct bfq_queue *async_idle_bfqq;
struct rb_root rq_pos_tree;
};
#endif
-struct bfq_queue *bfq_entity_to_bfqq(struct bfq_entity *entity);
-
/* --------------- main algorithm interface ----------------- */
#define BFQ_SERVICE_TREE_INIT ((struct bfq_service_tree) \
*/
unsigned short bfq_ioprio_to_weight(int ioprio)
{
- return (IOPRIO_BE_NR - ioprio) * BFQ_WEIGHT_CONVERSION_COEFF;
+ return (IOPRIO_NR_LEVELS - ioprio) * BFQ_WEIGHT_CONVERSION_COEFF;
}
/**
*
* To preserve as much as possible the old only-ioprio user interface,
* 0 is used as an escape ioprio value for weights (numerically) equal or
- * larger than IOPRIO_BE_NR * BFQ_WEIGHT_CONVERSION_COEFF.
+ * larger than IOPRIO_NR_LEVELS * BFQ_WEIGHT_CONVERSION_COEFF.
*/
static unsigned short bfq_weight_to_ioprio(int weight)
{
return max_t(int, 0,
- IOPRIO_BE_NR * BFQ_WEIGHT_CONVERSION_COEFF - weight);
+ IOPRIO_NR_LEVELS * BFQ_WEIGHT_CONVERSION_COEFF - weight);
}
static void bfq_get_entity(struct bfq_entity *entity)
struct bio_set *bs = bio->bi_pool;
if (bip->bip_flags & BIP_BLOCK_INTEGRITY)
- kfree(page_address(bip->bip_vec->bv_page) +
- bip->bip_vec->bv_offset);
+ kfree(bvec_virt(bip->bip_vec));
__bio_integrity_free(bs, bip);
bio->bi_integrity = NULL;
struct bio_vec bv;
struct bio_integrity_payload *bip = bio_integrity(bio);
blk_status_t ret = BLK_STS_OK;
- void *prot_buf = page_address(bip->bip_vec->bv_page) +
- bip->bip_vec->bv_offset;
iter.disk_name = bio->bi_bdev->bd_disk->disk_name;
iter.interval = 1 << bi->interval_exp;
iter.seed = proc_iter->bi_sector;
- iter.prot_buf = prot_buf;
+ iter.prot_buf = bvec_virt(bip->bip_vec);
__bio_for_each_segment(bv, bio, bviter, *proc_iter) {
- void *kaddr = kmap_atomic(bv.bv_page);
+ void *kaddr = bvec_kmap_local(&bv);
- iter.data_buf = kaddr + bv.bv_offset;
+ iter.data_buf = kaddr;
iter.data_size = bv.bv_len;
-
ret = proc_fn(&iter);
- if (ret) {
- kunmap_atomic(kaddr);
- return ret;
- }
+ kunmap_local(kaddr);
+
+ if (ret)
+ break;
- kunmap_atomic(kaddr);
}
return ret;
}
#include "blk.h"
#include "blk-rq-qos.h"
+struct bio_alloc_cache {
+ struct bio_list free_list;
+ unsigned int nr;
+};
+
static struct biovec_slab {
int nr_vecs;
char *name;
void bio_init(struct bio *bio, struct bio_vec *table,
unsigned short max_vecs)
{
- memset(bio, 0, sizeof(*bio));
+ bio->bi_next = NULL;
+ bio->bi_bdev = NULL;
+ bio->bi_opf = 0;
+ bio->bi_flags = 0;
+ bio->bi_ioprio = 0;
+ bio->bi_write_hint = 0;
+ bio->bi_status = 0;
+ bio->bi_iter.bi_sector = 0;
+ bio->bi_iter.bi_size = 0;
+ bio->bi_iter.bi_idx = 0;
+ bio->bi_iter.bi_bvec_done = 0;
+ bio->bi_end_io = NULL;
+ bio->bi_private = NULL;
+#ifdef CONFIG_BLK_CGROUP
+ bio->bi_blkg = NULL;
+ bio->bi_issue.value = 0;
+#ifdef CONFIG_BLK_CGROUP_IOCOST
+ bio->bi_iocost_cost = 0;
+#endif
+#endif
+#ifdef CONFIG_BLK_INLINE_ENCRYPTION
+ bio->bi_crypt_context = NULL;
+#endif
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+ bio->bi_integrity = NULL;
+#endif
+ bio->bi_vcnt = 0;
+
atomic_set(&bio->__bi_remaining, 1);
atomic_set(&bio->__bi_cnt, 1);
- bio->bi_io_vec = table;
bio->bi_max_vecs = max_vecs;
+ bio->bi_io_vec = table;
+ bio->bi_pool = NULL;
}
EXPORT_SYMBOL(bio_init);
void zero_fill_bio(struct bio *bio)
{
- unsigned long flags;
struct bio_vec bv;
struct bvec_iter iter;
- bio_for_each_segment(bv, bio, iter) {
- char *data = bvec_kmap_irq(&bv, &flags);
- memset(data, 0, bv.bv_len);
- flush_dcache_page(bv.bv_page);
- bvec_kunmap_irq(data, &flags);
- }
+ bio_for_each_segment(bv, bio, iter)
+ memzero_bvec(&bv);
}
EXPORT_SYMBOL(zero_fill_bio);
bio_truncate(bio, maxsector << 9);
}
+#define ALLOC_CACHE_MAX 512
+#define ALLOC_CACHE_SLACK 64
+
+static void bio_alloc_cache_prune(struct bio_alloc_cache *cache,
+ unsigned int nr)
+{
+ unsigned int i = 0;
+ struct bio *bio;
+
+ while ((bio = bio_list_pop(&cache->free_list)) != NULL) {
+ cache->nr--;
+ bio_free(bio);
+ if (++i == nr)
+ break;
+ }
+}
+
+static int bio_cpu_dead(unsigned int cpu, struct hlist_node *node)
+{
+ struct bio_set *bs;
+
+ bs = hlist_entry_safe(node, struct bio_set, cpuhp_dead);
+ if (bs->cache) {
+ struct bio_alloc_cache *cache = per_cpu_ptr(bs->cache, cpu);
+
+ bio_alloc_cache_prune(cache, -1U);
+ }
+ return 0;
+}
+
+static void bio_alloc_cache_destroy(struct bio_set *bs)
+{
+ int cpu;
+
+ if (!bs->cache)
+ return;
+
+ cpuhp_state_remove_instance_nocalls(CPUHP_BIO_DEAD, &bs->cpuhp_dead);
+ for_each_possible_cpu(cpu) {
+ struct bio_alloc_cache *cache;
+
+ cache = per_cpu_ptr(bs->cache, cpu);
+ bio_alloc_cache_prune(cache, -1U);
+ }
+ free_percpu(bs->cache);
+}
+
/**
* bio_put - release a reference to a bio
* @bio: bio to release reference to
**/
void bio_put(struct bio *bio)
{
- if (!bio_flagged(bio, BIO_REFFED))
- bio_free(bio);
- else {
+ if (unlikely(bio_flagged(bio, BIO_REFFED))) {
BIO_BUG_ON(!atomic_read(&bio->__bi_cnt));
+ if (!atomic_dec_and_test(&bio->__bi_cnt))
+ return;
+ }
- /*
- * last put frees it
- */
- if (atomic_dec_and_test(&bio->__bi_cnt))
- bio_free(bio);
+ if (bio_flagged(bio, BIO_PERCPU_CACHE)) {
+ struct bio_alloc_cache *cache;
+
+ bio_uninit(bio);
+ cache = per_cpu_ptr(bio->bi_pool->cache, get_cpu());
+ bio_list_add_head(&cache->free_list, bio);
+ if (++cache->nr > ALLOC_CACHE_MAX + ALLOC_CACHE_SLACK)
+ bio_alloc_cache_prune(cache, ALLOC_CACHE_SLACK);
+ put_cpu();
+ } else {
+ bio_free(bio);
}
}
EXPORT_SYMBOL(bio_put);
return 0;
}
+static void bio_put_pages(struct page **pages, size_t size, size_t off)
+{
+ size_t i, nr = DIV_ROUND_UP(size + (off & ~PAGE_MASK), PAGE_SIZE);
+
+ for (i = 0; i < nr; i++)
+ put_page(pages[i]);
+}
+
#define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *))
/**
if (same_page)
put_page(page);
} else {
- if (WARN_ON_ONCE(bio_full(bio, len)))
- return -EINVAL;
+ if (WARN_ON_ONCE(bio_full(bio, len))) {
+ bio_put_pages(pages + i, left, offset);
+ return -EINVAL;
+ }
__bio_add_page(bio, page, len, offset);
}
offset = 0;
len = min_t(size_t, PAGE_SIZE - offset, left);
if (bio_add_hw_page(q, bio, page, len, offset,
max_append_sectors, &same_page) != len) {
+ bio_put_pages(pages + i, left, offset);
ret = -EINVAL;
break;
}
void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter,
struct bio *src, struct bvec_iter *src_iter)
{
- struct bio_vec src_bv, dst_bv;
- void *src_p, *dst_p;
- unsigned bytes;
-
while (src_iter->bi_size && dst_iter->bi_size) {
- src_bv = bio_iter_iovec(src, *src_iter);
- dst_bv = bio_iter_iovec(dst, *dst_iter);
-
- bytes = min(src_bv.bv_len, dst_bv.bv_len);
+ struct bio_vec src_bv = bio_iter_iovec(src, *src_iter);
+ struct bio_vec dst_bv = bio_iter_iovec(dst, *dst_iter);
+ unsigned int bytes = min(src_bv.bv_len, dst_bv.bv_len);
+ void *src_buf;
- src_p = kmap_atomic(src_bv.bv_page);
- dst_p = kmap_atomic(dst_bv.bv_page);
-
- memcpy(dst_p + dst_bv.bv_offset,
- src_p + src_bv.bv_offset,
- bytes);
-
- kunmap_atomic(dst_p);
- kunmap_atomic(src_p);
-
- flush_dcache_page(dst_bv.bv_page);
+ src_buf = bvec_kmap_local(&src_bv);
+ memcpy_to_bvec(&dst_bv, src_buf);
+ kunmap_local(src_buf);
bio_advance_iter_single(src, src_iter, bytes);
bio_advance_iter_single(dst, dst_iter, bytes);
*/
void bioset_exit(struct bio_set *bs)
{
+ bio_alloc_cache_destroy(bs);
if (bs->rescue_workqueue)
destroy_workqueue(bs->rescue_workqueue);
bs->rescue_workqueue = NULL;
biovec_init_pool(&bs->bvec_pool, pool_size))
goto bad;
- if (!(flags & BIOSET_NEED_RESCUER))
- return 0;
-
- bs->rescue_workqueue = alloc_workqueue("bioset", WQ_MEM_RECLAIM, 0);
- if (!bs->rescue_workqueue)
- goto bad;
+ if (flags & BIOSET_NEED_RESCUER) {
+ bs->rescue_workqueue = alloc_workqueue("bioset",
+ WQ_MEM_RECLAIM, 0);
+ if (!bs->rescue_workqueue)
+ goto bad;
+ }
+ if (flags & BIOSET_PERCPU_CACHE) {
+ bs->cache = alloc_percpu(struct bio_alloc_cache);
+ if (!bs->cache)
+ goto bad;
+ cpuhp_state_add_instance_nocalls(CPUHP_BIO_DEAD, &bs->cpuhp_dead);
+ }
return 0;
bad:
}
EXPORT_SYMBOL(bioset_init_from_src);
+/**
+ * bio_alloc_kiocb - Allocate a bio from bio_set based on kiocb
+ * @kiocb: kiocb describing the IO
+ * @nr_iovecs: number of iovecs to pre-allocate
+ * @bs: bio_set to allocate from
+ *
+ * Description:
+ * Like @bio_alloc_bioset, but pass in the kiocb. The kiocb is only
+ * used to check if we should dip into the per-cpu bio_set allocation
+ * cache. The allocation uses GFP_KERNEL internally. On return, the
+ * bio is marked BIO_PERCPU_CACHEABLE, and the final put of the bio
+ * MUST be done from process context, not hard/soft IRQ.
+ *
+ */
+struct bio *bio_alloc_kiocb(struct kiocb *kiocb, unsigned short nr_vecs,
+ struct bio_set *bs)
+{
+ struct bio_alloc_cache *cache;
+ struct bio *bio;
+
+ if (!(kiocb->ki_flags & IOCB_ALLOC_CACHE) || nr_vecs > BIO_INLINE_VECS)
+ return bio_alloc_bioset(GFP_KERNEL, nr_vecs, bs);
+
+ cache = per_cpu_ptr(bs->cache, get_cpu());
+ bio = bio_list_pop(&cache->free_list);
+ if (bio) {
+ cache->nr--;
+ put_cpu();
+ bio_init(bio, nr_vecs ? bio->bi_inline_vecs : NULL, nr_vecs);
+ bio->bi_pool = bs;
+ bio_set_flag(bio, BIO_PERCPU_CACHE);
+ return bio;
+ }
+ put_cpu();
+ bio = bio_alloc_bioset(GFP_KERNEL, nr_vecs, bs);
+ bio_set_flag(bio, BIO_PERCPU_CACHE);
+ return bio;
+}
+EXPORT_SYMBOL_GPL(bio_alloc_kiocb);
+
static int __init init_bio(void)
{
int i;
SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
}
+ cpuhp_setup_state_multi(CPUHP_BIO_DEAD, "block/bio:dead", NULL,
+ bio_cpu_dead);
+
if (bioset_init(&fs_bio_set, BIO_POOL_SIZE, 0, BIOSET_NEED_BVECS))
panic("bio: can't allocate bios\n");
const char *blkg_dev_name(struct blkcg_gq *blkg)
{
- /* some drivers (floppy) instantiate a queue w/o disk registered */
- if (blkg->q->backing_dev_info->dev)
- return bdi_dev_name(blkg->q->backing_dev_info);
- return NULL;
+ if (!blkg->q->disk || !blkg->q->disk->bdi->dev)
+ return NULL;
+ return bdi_dev_name(blkg->q->disk->bdi);
}
/**
}
}
-static int blkcg_print_stat(struct seq_file *sf, void *v)
+static void blkcg_print_one_stat(struct blkcg_gq *blkg, struct seq_file *s)
{
- struct blkcg *blkcg = css_to_blkcg(seq_css(sf));
- struct blkcg_gq *blkg;
-
- if (!seq_css(sf)->parent)
- blkcg_fill_root_iostats();
- else
- cgroup_rstat_flush(blkcg->css.cgroup);
-
- rcu_read_lock();
-
- hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
- struct blkg_iostat_set *bis = &blkg->iostat;
- const char *dname;
- char *buf;
- u64 rbytes, wbytes, rios, wios, dbytes, dios;
- size_t size = seq_get_buf(sf, &buf), off = 0;
- int i;
- bool has_stats = false;
- unsigned seq;
+ struct blkg_iostat_set *bis = &blkg->iostat;
+ u64 rbytes, wbytes, rios, wios, dbytes, dios;
+ bool has_stats = false;
+ const char *dname;
+ unsigned seq;
+ int i;
- spin_lock_irq(&blkg->q->queue_lock);
+ if (!blkg->online)
+ return;
- if (!blkg->online)
- goto skip;
+ dname = blkg_dev_name(blkg);
+ if (!dname)
+ return;
- dname = blkg_dev_name(blkg);
- if (!dname)
- goto skip;
+ seq_printf(s, "%s ", dname);
- /*
- * Hooray string manipulation, count is the size written NOT
- * INCLUDING THE \0, so size is now count+1 less than what we
- * had before, but we want to start writing the next bit from
- * the \0 so we only add count to buf.
- */
- off += scnprintf(buf+off, size-off, "%s ", dname);
+ do {
+ seq = u64_stats_fetch_begin(&bis->sync);
+
+ rbytes = bis->cur.bytes[BLKG_IOSTAT_READ];
+ wbytes = bis->cur.bytes[BLKG_IOSTAT_WRITE];
+ dbytes = bis->cur.bytes[BLKG_IOSTAT_DISCARD];
+ rios = bis->cur.ios[BLKG_IOSTAT_READ];
+ wios = bis->cur.ios[BLKG_IOSTAT_WRITE];
+ dios = bis->cur.ios[BLKG_IOSTAT_DISCARD];
+ } while (u64_stats_fetch_retry(&bis->sync, seq));
+
+ if (rbytes || wbytes || rios || wios) {
+ has_stats = true;
+ seq_printf(s, "rbytes=%llu wbytes=%llu rios=%llu wios=%llu dbytes=%llu dios=%llu",
+ rbytes, wbytes, rios, wios,
+ dbytes, dios);
+ }
- do {
- seq = u64_stats_fetch_begin(&bis->sync);
+ if (blkcg_debug_stats && atomic_read(&blkg->use_delay)) {
+ has_stats = true;
+ seq_printf(s, " use_delay=%d delay_nsec=%llu",
+ atomic_read(&blkg->use_delay),
+ atomic64_read(&blkg->delay_nsec));
+ }
- rbytes = bis->cur.bytes[BLKG_IOSTAT_READ];
- wbytes = bis->cur.bytes[BLKG_IOSTAT_WRITE];
- dbytes = bis->cur.bytes[BLKG_IOSTAT_DISCARD];
- rios = bis->cur.ios[BLKG_IOSTAT_READ];
- wios = bis->cur.ios[BLKG_IOSTAT_WRITE];
- dios = bis->cur.ios[BLKG_IOSTAT_DISCARD];
- } while (u64_stats_fetch_retry(&bis->sync, seq));
+ for (i = 0; i < BLKCG_MAX_POLS; i++) {
+ struct blkcg_policy *pol = blkcg_policy[i];
- if (rbytes || wbytes || rios || wios) {
- has_stats = true;
- off += scnprintf(buf+off, size-off,
- "rbytes=%llu wbytes=%llu rios=%llu wios=%llu dbytes=%llu dios=%llu",
- rbytes, wbytes, rios, wios,
- dbytes, dios);
- }
+ if (!blkg->pd[i] || !pol->pd_stat_fn)
+ continue;
- if (blkcg_debug_stats && atomic_read(&blkg->use_delay)) {
+ if (pol->pd_stat_fn(blkg->pd[i], s))
has_stats = true;
- off += scnprintf(buf+off, size-off,
- " use_delay=%d delay_nsec=%llu",
- atomic_read(&blkg->use_delay),
- (unsigned long long)atomic64_read(&blkg->delay_nsec));
- }
+ }
- for (i = 0; i < BLKCG_MAX_POLS; i++) {
- struct blkcg_policy *pol = blkcg_policy[i];
- size_t written;
+ if (has_stats)
+ seq_printf(s, "\n");
+}
- if (!blkg->pd[i] || !pol->pd_stat_fn)
- continue;
+static int blkcg_print_stat(struct seq_file *sf, void *v)
+{
+ struct blkcg *blkcg = css_to_blkcg(seq_css(sf));
+ struct blkcg_gq *blkg;
- written = pol->pd_stat_fn(blkg->pd[i], buf+off, size-off);
- if (written)
- has_stats = true;
- off += written;
- }
+ if (!seq_css(sf)->parent)
+ blkcg_fill_root_iostats();
+ else
+ cgroup_rstat_flush(blkcg->css.cgroup);
- if (has_stats) {
- if (off < size - 1) {
- off += scnprintf(buf+off, size-off, "\n");
- seq_commit(sf, off);
- } else {
- seq_commit(sf, -1);
- }
- }
- skip:
+ rcu_read_lock();
+ hlist_for_each_entry_rcu(blkg, &blkcg->blkg_list, blkcg_node) {
+ spin_lock_irq(&blkg->q->queue_lock);
+ blkcg_print_one_stat(blkg, sf);
spin_unlock_irq(&blkg->q->queue_lock);
}
-
rcu_read_unlock();
return 0;
}
*/
#include <linux/kernel.h>
#include <linux/module.h>
-#include <linux/backing-dev.h>
#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/blk-mq.h>
/* for synchronous bio-based driver finish in-flight integrity i/o */
blk_flush_integrity();
- /* @q won't process any more request, flush async actions */
- del_timer_sync(&q->backing_dev_info->laptop_mode_wb_timer);
blk_sync_queue(q);
-
if (queue_is_mq(q))
blk_mq_exit_queue(q);
if (ret)
goto fail_id;
- q->backing_dev_info = bdi_alloc(node_id);
- if (!q->backing_dev_info)
- goto fail_split;
-
q->stats = blk_alloc_queue_stats();
if (!q->stats)
- goto fail_stats;
+ goto fail_split;
q->node = node_id;
atomic_set(&q->nr_active_requests_shared_sbitmap, 0);
- timer_setup(&q->backing_dev_info->laptop_mode_wb_timer,
- laptop_mode_timer_fn, 0);
timer_setup(&q->timeout, blk_rq_timed_out_timer, 0);
INIT_WORK(&q->timeout_work, blk_timeout_work);
INIT_LIST_HEAD(&q->icq_list);
if (percpu_ref_init(&q->q_usage_counter,
blk_queue_usage_counter_release,
PERCPU_REF_INIT_ATOMIC, GFP_KERNEL))
- goto fail_bdi;
+ goto fail_stats;
if (blkcg_init_queue(q))
goto fail_ref;
fail_ref:
percpu_ref_exit(&q->q_usage_counter);
-fail_bdi:
- blk_free_queue_stats(q->stats);
fail_stats:
- bdi_put(q->backing_dev_info);
+ blk_free_queue_stats(q->stats);
fail_split:
bioset_exit(&q->bio_split);
fail_id:
}
if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
- bio->bi_opf &= ~REQ_HIPRI;
+ bio_clear_hipri(bio);
switch (bio_op(bio)) {
case REQ_OP_DISCARD:
if (mode->keysize == 0)
return -EINVAL;
- if (dun_bytes == 0 || dun_bytes > BLK_CRYPTO_MAX_IV_SIZE)
+ if (dun_bytes == 0 || dun_bytes > mode->ivsize)
return -EINVAL;
if (!is_power_of_2(data_unit_size))
}
EXPORT_SYMBOL(blk_integrity_unregister);
-void blk_integrity_add(struct gendisk *disk)
+int blk_integrity_add(struct gendisk *disk)
{
- if (kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
- &disk_to_dev(disk)->kobj, "%s", "integrity"))
- return;
+ int ret;
- kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+ ret = kobject_init_and_add(&disk->integrity_kobj, &integrity_ktype,
+ &disk_to_dev(disk)->kobj, "%s", "integrity");
+ if (!ret)
+ kobject_uevent(&disk->integrity_kobj, KOBJ_ADD);
+ return ret;
}
void blk_integrity_del(struct gendisk *disk)
kfree(iocg);
}
-static size_t ioc_pd_stat(struct blkg_policy_data *pd, char *buf, size_t size)
+static bool ioc_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
{
struct ioc_gq *iocg = pd_to_iocg(pd);
struct ioc *ioc = iocg->ioc;
- size_t pos = 0;
if (!ioc->enabled)
- return 0;
+ return false;
if (iocg->level == 0) {
unsigned vp10k = DIV64_U64_ROUND_CLOSEST(
ioc->vtime_base_rate * 10000,
VTIME_PER_USEC);
- pos += scnprintf(buf + pos, size - pos, " cost.vrate=%u.%02u",
- vp10k / 100, vp10k % 100);
+ seq_printf(s, " cost.vrate=%u.%02u", vp10k / 100, vp10k % 100);
}
- pos += scnprintf(buf + pos, size - pos, " cost.usage=%llu",
- iocg->last_stat.usage_us);
+ seq_printf(s, " cost.usage=%llu", iocg->last_stat.usage_us);
if (blkcg_debug_stats)
- pos += scnprintf(buf + pos, size - pos,
- " cost.wait=%llu cost.indebt=%llu cost.indelay=%llu",
- iocg->last_stat.wait_us,
- iocg->last_stat.indebt_us,
- iocg->last_stat.indelay_us);
-
- return pos;
+ seq_printf(s, " cost.wait=%llu cost.indebt=%llu cost.indelay=%llu",
+ iocg->last_stat.wait_us,
+ iocg->last_stat.indebt_us,
+ iocg->last_stat.indelay_us);
+ return true;
}
static u64 ioc_weight_prfill(struct seq_file *sf, struct blkg_policy_data *pd,
return 0;
}
-static size_t iolatency_ssd_stat(struct iolatency_grp *iolat, char *buf,
- size_t size)
+static bool iolatency_ssd_stat(struct iolatency_grp *iolat, struct seq_file *s)
{
struct latency_stat stat;
int cpu;
preempt_enable();
if (iolat->rq_depth.max_depth == UINT_MAX)
- return scnprintf(buf, size, " missed=%llu total=%llu depth=max",
- (unsigned long long)stat.ps.missed,
- (unsigned long long)stat.ps.total);
- return scnprintf(buf, size, " missed=%llu total=%llu depth=%u",
- (unsigned long long)stat.ps.missed,
- (unsigned long long)stat.ps.total,
- iolat->rq_depth.max_depth);
+ seq_printf(s, " missed=%llu total=%llu depth=max",
+ (unsigned long long)stat.ps.missed,
+ (unsigned long long)stat.ps.total);
+ else
+ seq_printf(s, " missed=%llu total=%llu depth=%u",
+ (unsigned long long)stat.ps.missed,
+ (unsigned long long)stat.ps.total,
+ iolat->rq_depth.max_depth);
+ return true;
}
-static size_t iolatency_pd_stat(struct blkg_policy_data *pd, char *buf,
- size_t size)
+static bool iolatency_pd_stat(struct blkg_policy_data *pd, struct seq_file *s)
{
struct iolatency_grp *iolat = pd_to_lat(pd);
unsigned long long avg_lat;
unsigned long long cur_win;
if (!blkcg_debug_stats)
- return 0;
+ return false;
if (iolat->ssd)
- return iolatency_ssd_stat(iolat, buf, size);
+ return iolatency_ssd_stat(iolat, s);
avg_lat = div64_u64(iolat->lat_avg, NSEC_PER_USEC);
cur_win = div64_u64(iolat->cur_win_nsec, NSEC_PER_MSEC);
if (iolat->rq_depth.max_depth == UINT_MAX)
- return scnprintf(buf, size, " depth=max avg_lat=%llu win=%llu",
- avg_lat, cur_win);
-
- return scnprintf(buf, size, " depth=%u avg_lat=%llu win=%llu",
- iolat->rq_depth.max_depth, avg_lat, cur_win);
+ seq_printf(s, " depth=max avg_lat=%llu win=%llu",
+ avg_lat, cur_win);
+ else
+ seq_printf(s, " depth=%u avg_lat=%llu win=%llu",
+ iolat->rq_depth.max_depth, avg_lat, cur_win);
+ return true;
}
-
static struct blkg_policy_data *iolatency_pd_alloc(gfp_t gfp,
struct request_queue *q,
struct blkcg *blkcg)
struct bvec_iter_all iter_all;
bio_for_each_segment_all(bvec, bio, iter_all) {
- memcpy(p, page_address(bvec->bv_page), bvec->bv_len);
+ memcpy_from_bvec(p, bvec);
p += bvec->bv_len;
}
* iopoll in direct IO routine. Given performance gain of iopoll for
* big IO can be trival, disable iopoll when split needed.
*/
- bio->bi_opf &= ~REQ_HIPRI;
+ bio_clear_hipri(bio);
return bio_split(bio, sectors, GFP_NOIO, bs);
}
trace_block_split(split, (*bio)->bi_iter.bi_sector);
submit_bio_noacct(*bio);
*bio = split;
+
+ blk_throtl_charge_bio_split(*bio);
}
}
}
}
-/*
- * Two cases of handling DISCARD merge:
- * If max_discard_segments > 1, the driver takes every bio
- * as a range and send them to controller together. The ranges
- * needn't to be contiguous.
- * Otherwise, the bios/requests will be handled as same as
- * others which should be contiguous.
- */
-static inline bool blk_discard_mergable(struct request *req)
-{
- if (req_op(req) == REQ_OP_DISCARD &&
- queue_max_discard_segments(req->q) > 1)
- return true;
- return false;
-}
-
static enum elv_merge blk_try_req_merge(struct request *req,
struct request *next)
{
kfree(hctx);
}
-struct blk_mq_ctx_sysfs_entry {
- struct attribute attr;
- ssize_t (*show)(struct blk_mq_ctx *, char *);
- ssize_t (*store)(struct blk_mq_ctx *, const char *, size_t);
-};
-
struct blk_mq_hw_ctx_sysfs_entry {
struct attribute attr;
ssize_t (*show)(struct blk_mq_hw_ctx *, char *);
ssize_t (*store)(struct blk_mq_hw_ctx *, const char *, size_t);
};
-static ssize_t blk_mq_sysfs_show(struct kobject *kobj, struct attribute *attr,
- char *page)
-{
- struct blk_mq_ctx_sysfs_entry *entry;
- struct blk_mq_ctx *ctx;
- struct request_queue *q;
- ssize_t res;
-
- entry = container_of(attr, struct blk_mq_ctx_sysfs_entry, attr);
- ctx = container_of(kobj, struct blk_mq_ctx, kobj);
- q = ctx->queue;
-
- if (!entry->show)
- return -EIO;
-
- mutex_lock(&q->sysfs_lock);
- res = entry->show(ctx, page);
- mutex_unlock(&q->sysfs_lock);
- return res;
-}
-
-static ssize_t blk_mq_sysfs_store(struct kobject *kobj, struct attribute *attr,
- const char *page, size_t length)
-{
- struct blk_mq_ctx_sysfs_entry *entry;
- struct blk_mq_ctx *ctx;
- struct request_queue *q;
- ssize_t res;
-
- entry = container_of(attr, struct blk_mq_ctx_sysfs_entry, attr);
- ctx = container_of(kobj, struct blk_mq_ctx, kobj);
- q = ctx->queue;
-
- if (!entry->store)
- return -EIO;
-
- mutex_lock(&q->sysfs_lock);
- res = entry->store(ctx, page, length);
- mutex_unlock(&q->sysfs_lock);
- return res;
-}
-
static ssize_t blk_mq_hw_sysfs_show(struct kobject *kobj,
struct attribute *attr, char *page)
{
};
ATTRIBUTE_GROUPS(default_hw_ctx);
-static const struct sysfs_ops blk_mq_sysfs_ops = {
- .show = blk_mq_sysfs_show,
- .store = blk_mq_sysfs_store,
-};
-
static const struct sysfs_ops blk_mq_hw_sysfs_ops = {
.show = blk_mq_hw_sysfs_show,
.store = blk_mq_hw_sysfs_store,
};
static struct kobj_type blk_mq_ktype = {
- .sysfs_ops = &blk_mq_sysfs_ops,
.release = blk_mq_sysfs_release,
};
static struct kobj_type blk_mq_ctx_ktype = {
- .sysfs_ops = &blk_mq_sysfs_ops,
.release = blk_mq_ctx_sysfs_release,
};
__blk_mq_dec_active_requests(hctx);
if (unlikely(laptop_mode && !blk_rq_is_passthrough(rq)))
- laptop_io_completion(q->backing_dev_info);
+ laptop_io_completion(q->disk->bdi);
rq_qos_done(q, rq);
* This is probably worse than completing the request on a different
* cache domain.
*/
- if (force_irqthreads)
+ if (force_irqthreads())
return false;
/* same CPU or cache domain? Complete locally */
}
EXPORT_SYMBOL(blk_mq_init_queue);
-struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set, void *queuedata)
+struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set, void *queuedata,
+ struct lock_class_key *lkclass)
{
struct request_queue *q;
struct gendisk *disk;
if (IS_ERR(q))
return ERR_CAST(q);
- disk = __alloc_disk_node(0, set->numa_node);
+ disk = __alloc_disk_node(q, set->numa_node, lkclass);
if (!disk) {
blk_cleanup_queue(q);
return ERR_PTR(-ENOMEM);
}
- disk->queue = q;
return disk;
}
EXPORT_SYMBOL(__blk_mq_alloc_disk);
#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/pagemap.h>
+#include <linux/backing-dev-defs.h>
#include <linux/gcd.h>
#include <linux/lcm.h>
#include <linux/jiffies.h>
limits->logical_block_size >> SECTOR_SHIFT);
limits->max_sectors = max_sectors;
- q->backing_dev_info->io_pages = max_sectors >> (PAGE_SHIFT - 9);
+ if (!q->disk)
+ return;
+ q->disk->bdi->io_pages = max_sectors >> (PAGE_SHIFT - 9);
}
EXPORT_SYMBOL(blk_queue_max_hw_sectors);
}
EXPORT_SYMBOL(blk_queue_alignment_offset);
-void blk_queue_update_readahead(struct request_queue *q)
+void disk_update_readahead(struct gendisk *disk)
{
+ struct request_queue *q = disk->queue;
+
/*
* For read-ahead of large files to be effective, we need to read ahead
* at least twice the optimal I/O size.
*/
- q->backing_dev_info->ra_pages =
+ disk->bdi->ra_pages =
max(queue_io_opt(q) * 2 / PAGE_SIZE, VM_READAHEAD_PAGES);
- q->backing_dev_info->io_pages =
- queue_max_sectors(q) >> (PAGE_SHIFT - 9);
+ disk->bdi->io_pages = queue_max_sectors(q) >> (PAGE_SHIFT - 9);
}
-EXPORT_SYMBOL_GPL(blk_queue_update_readahead);
+EXPORT_SYMBOL_GPL(disk_update_readahead);
/**
* blk_limits_io_min - set minimum request size for a device
void blk_queue_io_opt(struct request_queue *q, unsigned int opt)
{
blk_limits_io_opt(&q->limits, opt);
- q->backing_dev_info->ra_pages =
+ if (!q->disk)
+ return;
+ q->disk->bdi->ra_pages =
max(queue_io_opt(q) * 2 / PAGE_SIZE, VM_READAHEAD_PAGES);
}
EXPORT_SYMBOL(blk_queue_io_opt);
struct request_queue *t = disk->queue;
if (blk_stack_limits(&t->limits, &bdev_get_queue(bdev)->limits,
- get_start_sect(bdev) + (offset >> 9)) < 0) {
- char top[BDEVNAME_SIZE], bottom[BDEVNAME_SIZE];
-
- disk_name(disk, 0, top);
- bdevname(bdev, bottom);
-
- printk(KERN_NOTICE "%s: Warning: Device %s is misaligned\n",
- top, bottom);
- }
+ get_start_sect(bdev) + (offset >> 9)) < 0)
+ pr_notice("%s: Warning: Device %pg is misaligned\n",
+ disk->disk_name, bdev);
- blk_queue_update_readahead(disk->queue);
+ disk_update_readahead(disk);
}
EXPORT_SYMBOL(disk_stack_limits);
static ssize_t queue_ra_show(struct request_queue *q, char *page)
{
- unsigned long ra_kb = q->backing_dev_info->ra_pages <<
- (PAGE_SHIFT - 10);
+ unsigned long ra_kb;
+ if (!q->disk)
+ return -EINVAL;
+ ra_kb = q->disk->bdi->ra_pages << (PAGE_SHIFT - 10);
return queue_var_show(ra_kb, page);
}
queue_ra_store(struct request_queue *q, const char *page, size_t count)
{
unsigned long ra_kb;
- ssize_t ret = queue_var_store(&ra_kb, page, count);
+ ssize_t ret;
+ if (!q->disk)
+ return -EINVAL;
+ ret = queue_var_store(&ra_kb, page, count);
if (ret < 0)
return ret;
-
- q->backing_dev_info->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
-
+ q->disk->bdi->ra_pages = ra_kb >> (PAGE_SHIFT - 10);
return ret;
}
spin_lock_irq(&q->queue_lock);
q->limits.max_sectors = max_sectors_kb << 1;
- q->backing_dev_info->io_pages = max_sectors_kb >> (PAGE_SHIFT - 10);
+ if (q->disk)
+ q->disk->bdi->io_pages = max_sectors_kb >> (PAGE_SHIFT - 10);
spin_unlock_irq(&q->queue_lock);
return ret;
* e.g. blkcg_print_blkgs() to crash.
*/
blkcg_exit_queue(q);
-
- /*
- * Since the cgroup code may dereference the @q->backing_dev_info
- * pointer, only decrease its reference count after having removed the
- * association with the block cgroup controller.
- */
- bdi_put(q->backing_dev_info);
}
/**
struct device *dev = disk_to_dev(disk);
struct request_queue *q = disk->queue;
- if (WARN_ON(!q))
- return -ENXIO;
-
- WARN_ONCE(blk_queue_registered(q),
- "%s is registering an already registered queue\n",
- kobject_name(&dev->kobj));
-
- blk_queue_update_readahead(q);
-
ret = blk_trace_init_sysfs(dev);
if (ret)
return ret;
return ret;
}
-EXPORT_SYMBOL_GPL(blk_register_queue);
/**
* blk_unregister_queue - counterpart of blk_register_queue()
unsigned int bad_bio_cnt; /* bios exceeding latency threshold */
unsigned long bio_cnt_reset_time;
+ atomic_t io_split_cnt[2];
+ atomic_t last_io_split_cnt[2];
+
struct blkg_rwstat stat_bytes;
struct blkg_rwstat stat_ios;
};
tg->bytes_disp[rw] = 0;
tg->io_disp[rw] = 0;
+ atomic_set(&tg->io_split_cnt[rw], 0);
+
/*
* Previous slice has expired. We must have trimmed it after last
* bio dispatch. That means since start of last slice, we never used
tg->io_disp[rw] = 0;
tg->slice_start[rw] = jiffies;
tg->slice_end[rw] = jiffies + tg->td->throtl_slice;
+
+ atomic_set(&tg->io_split_cnt[rw], 0);
+
throtl_log(&tg->service_queue,
"[%c] new slice start=%lu end=%lu jiffies=%lu",
rw == READ ? 'R' : 'W', tg->slice_start[rw],
jiffies + tg->td->throtl_slice);
}
+ if (iops_limit != UINT_MAX)
+ tg->io_disp[rw] += atomic_xchg(&tg->io_split_cnt[rw], 0);
+
if (tg_with_in_bps_limit(tg, bio, bps_limit, &bps_wait) &&
tg_with_in_iops_limit(tg, bio, iops_limit, &iops_wait)) {
if (wait)
}
if (tg->iops[READ][LIMIT_LOW]) {
+ tg->last_io_disp[READ] += atomic_xchg(&tg->last_io_split_cnt[READ], 0);
iops = tg->last_io_disp[READ] * HZ / elapsed_time;
if (iops >= tg->iops[READ][LIMIT_LOW])
tg->last_low_overflow_time[READ] = now;
}
if (tg->iops[WRITE][LIMIT_LOW]) {
+ tg->last_io_disp[WRITE] += atomic_xchg(&tg->last_io_split_cnt[WRITE], 0);
iops = tg->last_io_disp[WRITE] * HZ / elapsed_time;
if (iops >= tg->iops[WRITE][LIMIT_LOW])
tg->last_low_overflow_time[WRITE] = now;
}
#endif
+void blk_throtl_charge_bio_split(struct bio *bio)
+{
+ struct blkcg_gq *blkg = bio->bi_blkg;
+ struct throtl_grp *parent = blkg_to_tg(blkg);
+ struct throtl_service_queue *parent_sq;
+ bool rw = bio_data_dir(bio);
+
+ do {
+ if (!parent->has_rules[rw])
+ break;
+
+ atomic_inc(&parent->io_split_cnt[rw]);
+ atomic_inc(&parent->last_io_split_cnt[rw]);
+
+ parent_sq = parent->service_queue.parent_sq;
+ parent = sq_to_tg(parent_sq);
+ } while (parent);
+}
+
bool blk_throtl_bio(struct bio *bio)
{
struct request_queue *q = bio->bi_bdev->bd_disk->queue;
*/
static bool wb_recent_wait(struct rq_wb *rwb)
{
- struct bdi_writeback *wb = &rwb->rqos.q->backing_dev_info->wb;
+ struct bdi_writeback *wb = &rwb->rqos.q->disk->bdi->wb;
return time_before(jiffies, wb->dirty_sleep + HZ);
}
static int latency_exceeded(struct rq_wb *rwb, struct blk_rq_stat *stat)
{
- struct backing_dev_info *bdi = rwb->rqos.q->backing_dev_info;
+ struct backing_dev_info *bdi = rwb->rqos.q->disk->bdi;
struct rq_depth *rqd = &rwb->rq_depth;
u64 thislat;
static void rwb_trace_step(struct rq_wb *rwb, const char *msg)
{
- struct backing_dev_info *bdi = rwb->rqos.q->backing_dev_info;
+ struct backing_dev_info *bdi = rwb->rqos.q->disk->bdi;
struct rq_depth *rqd = &rwb->rq_depth;
trace_wbt_step(bdi, msg, rqd->scale_step, rwb->cur_win_nsec,
status = latency_exceeded(rwb, cb->stat);
- trace_wbt_timer(rwb->rqos.q->backing_dev_info, status, rqd->scale_step,
+ trace_wbt_timer(rwb->rqos.q->disk->bdi, status, rqd->scale_step,
inflight);
/*
if (!blk_queue_is_zoned(q))
return -ENOTTY;
- if (!capable(CAP_SYS_ADMIN))
- return -EACCES;
-
if (copy_from_user(&rep, argp, sizeof(struct blk_zone_report)))
return -EFAULT;
if (!blk_queue_is_zoned(q))
return -ENOTTY;
- if (!capable(CAP_SYS_ADMIN))
- return -EACCES;
-
if (!(mode & FMODE_WRITE))
return -EBADF;
bip_next->bip_vec[0].bv_offset);
}
-void blk_integrity_add(struct gendisk *);
+int blk_integrity_add(struct gendisk *disk);
void blk_integrity_del(struct gendisk *);
#else /* CONFIG_BLK_DEV_INTEGRITY */
static inline bool blk_integrity_merge_rq(struct request_queue *rq,
static inline void bio_integrity_free(struct bio *bio)
{
}
-static inline void blk_integrity_add(struct gendisk *disk)
+static inline int blk_integrity_add(struct gendisk *disk)
{
+ return 0;
}
static inline void blk_integrity_del(struct gendisk *disk)
{
extern int blk_throtl_init(struct request_queue *q);
extern void blk_throtl_exit(struct request_queue *q);
extern void blk_throtl_register_queue(struct request_queue *q);
+extern void blk_throtl_charge_bio_split(struct bio *bio);
bool blk_throtl_bio(struct bio *bio);
#else /* CONFIG_BLK_DEV_THROTTLING */
static inline int blk_throtl_init(struct request_queue *q) { return 0; }
static inline void blk_throtl_exit(struct request_queue *q) { }
static inline void blk_throtl_register_queue(struct request_queue *q) { }
+static inline void blk_throtl_charge_bio_split(struct bio *bio) { }
static inline bool blk_throtl_bio(struct bio *bio) { return false; }
#endif /* CONFIG_BLK_DEV_THROTTLING */
#ifdef CONFIG_BLK_DEV_THROTTLING_LOW
int blk_alloc_ext_minor(void);
void blk_free_ext_minor(unsigned int minor);
-char *disk_name(struct gendisk *hd, int partno, char *buf);
#define ADDPART_FLAG_NONE 0
#define ADDPART_FLAG_RAID 1
#define ADDPART_FLAG_WHOLEDISK 2
-int bdev_add_partition(struct block_device *bdev, int partno,
- sector_t start, sector_t length);
-int bdev_del_partition(struct block_device *bdev, int partno);
-int bdev_resize_partition(struct block_device *bdev, int partno,
- sector_t start, sector_t length);
+int bdev_add_partition(struct gendisk *disk, int partno, sector_t start,
+ sector_t length);
+int bdev_del_partition(struct gendisk *disk, int partno);
+int bdev_resize_partition(struct gendisk *disk, int partno, sector_t start,
+ sector_t length);
int bio_add_hw_page(struct request_queue *q, struct bio *bio,
struct page *page, unsigned int len, unsigned int offset,
struct request_queue *blk_alloc_queue(int node_id);
-void disk_alloc_events(struct gendisk *disk);
+int disk_alloc_events(struct gendisk *disk);
void disk_add_events(struct gendisk *disk);
void disk_del_events(struct gendisk *disk);
void disk_release_events(struct gendisk *disk);
extern struct device_attribute dev_attr_events_async;
extern struct device_attribute dev_attr_events_poll_msecs;
+static inline void bio_clear_hipri(struct bio *bio)
+{
+ /* can't support alloc cache if we turn off polling */
+ bio_clear_flag(bio, BIO_PERCPU_CACHE);
+ bio->bi_opf &= ~REQ_HIPRI;
+}
+
#endif /* BLK_INTERNAL_H */
__initcall(init_emergency_pool);
-/*
- * highmem version, map in to vec
- */
-static void bounce_copy_vec(struct bio_vec *to, unsigned char *vfrom)
-{
- unsigned char *vto;
-
- vto = kmap_atomic(to->bv_page);
- memcpy(vto + to->bv_offset, vfrom, to->bv_len);
- kunmap_atomic(vto);
-}
-
/*
* Simple bounce buffer support for highmem pages. Depending on the
* queue gfp mask set, *to may or may not be a highmem page. kmap it
*/
static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
{
- unsigned char *vfrom;
struct bio_vec tovec, fromvec;
struct bvec_iter iter;
/*
* been modified by the block layer, so use the original
* copy, bounce_copy_vec already uses tovec->bv_len
*/
- vfrom = page_address(fromvec.bv_page) +
- tovec.bv_offset;
-
- bounce_copy_vec(&tovec, vfrom);
- flush_dcache_page(tovec.bv_page);
+ memcpy_to_bvec(&tovec, page_address(fromvec.bv_page) +
+ tovec.bv_offset);
}
bio_advance_iter(from, &from_iter, tovec.bv_len);
}
* because the 'bio' is single-page bvec.
*/
for (i = 0, to = bio->bi_io_vec; i < bio->bi_vcnt; to++, i++) {
- struct page *page = to->bv_page;
+ struct page *bounce_page;
- if (!PageHighMem(page))
+ if (!PageHighMem(to->bv_page))
continue;
- to->bv_page = mempool_alloc(&page_pool, GFP_NOIO);
- inc_zone_page_state(to->bv_page, NR_BOUNCE);
+ bounce_page = mempool_alloc(&page_pool, GFP_NOIO);
+ inc_zone_page_state(bounce_page, NR_BOUNCE);
if (rw == WRITE) {
- char *vto, *vfrom;
-
- flush_dcache_page(page);
-
- vto = page_address(to->bv_page) + to->bv_offset;
- vfrom = kmap_atomic(page) + to->bv_offset;
- memcpy(vto, vfrom, to->bv_len);
- kunmap_atomic(vfrom);
+ flush_dcache_page(to->bv_page);
+ memcpy_from_bvec(page_address(bounce_page), to);
}
+ to->bv_page = bounce_page;
}
trace_block_bio_bounce(*bio_orig);
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Parse command line, get partition information
- *
- * Written by Cai Zhiyong <caizhiyong@huawei.com>
- *
- */
-#include <linux/export.h>
-#include <linux/cmdline-parser.h>
-
-static int parse_subpart(struct cmdline_subpart **subpart, char *partdef)
-{
- int ret = 0;
- struct cmdline_subpart *new_subpart;
-
- *subpart = NULL;
-
- new_subpart = kzalloc(sizeof(struct cmdline_subpart), GFP_KERNEL);
- if (!new_subpart)
- return -ENOMEM;
-
- if (*partdef == '-') {
- new_subpart->size = (sector_t)(~0ULL);
- partdef++;
- } else {
- new_subpart->size = (sector_t)memparse(partdef, &partdef);
- if (new_subpart->size < (sector_t)PAGE_SIZE) {
- pr_warn("cmdline partition size is invalid.");
- ret = -EINVAL;
- goto fail;
- }
- }
-
- if (*partdef == '@') {
- partdef++;
- new_subpart->from = (sector_t)memparse(partdef, &partdef);
- } else {
- new_subpart->from = (sector_t)(~0ULL);
- }
-
- if (*partdef == '(') {
- int length;
- char *next = strchr(++partdef, ')');
-
- if (!next) {
- pr_warn("cmdline partition format is invalid.");
- ret = -EINVAL;
- goto fail;
- }
-
- length = min_t(int, next - partdef,
- sizeof(new_subpart->name) - 1);
- strncpy(new_subpart->name, partdef, length);
- new_subpart->name[length] = '\0';
-
- partdef = ++next;
- } else
- new_subpart->name[0] = '\0';
-
- new_subpart->flags = 0;
-
- if (!strncmp(partdef, "ro", 2)) {
- new_subpart->flags |= PF_RDONLY;
- partdef += 2;
- }
-
- if (!strncmp(partdef, "lk", 2)) {
- new_subpart->flags |= PF_POWERUP_LOCK;
- partdef += 2;
- }
-
- *subpart = new_subpart;
- return 0;
-fail:
- kfree(new_subpart);
- return ret;
-}
-
-static void free_subpart(struct cmdline_parts *parts)
-{
- struct cmdline_subpart *subpart;
-
- while (parts->subpart) {
- subpart = parts->subpart;
- parts->subpart = subpart->next_subpart;
- kfree(subpart);
- }
-}
-
-static int parse_parts(struct cmdline_parts **parts, const char *bdevdef)
-{
- int ret = -EINVAL;
- char *next;
- int length;
- struct cmdline_subpart **next_subpart;
- struct cmdline_parts *newparts;
- char buf[BDEVNAME_SIZE + 32 + 4];
-
- *parts = NULL;
-
- newparts = kzalloc(sizeof(struct cmdline_parts), GFP_KERNEL);
- if (!newparts)
- return -ENOMEM;
-
- next = strchr(bdevdef, ':');
- if (!next) {
- pr_warn("cmdline partition has no block device.");
- goto fail;
- }
-
- length = min_t(int, next - bdevdef, sizeof(newparts->name) - 1);
- strncpy(newparts->name, bdevdef, length);
- newparts->name[length] = '\0';
- newparts->nr_subparts = 0;
-
- next_subpart = &newparts->subpart;
-
- while (next && *(++next)) {
- bdevdef = next;
- next = strchr(bdevdef, ',');
-
- length = (!next) ? (sizeof(buf) - 1) :
- min_t(int, next - bdevdef, sizeof(buf) - 1);
-
- strncpy(buf, bdevdef, length);
- buf[length] = '\0';
-
- ret = parse_subpart(next_subpart, buf);
- if (ret)
- goto fail;
-
- newparts->nr_subparts++;
- next_subpart = &(*next_subpart)->next_subpart;
- }
-
- if (!newparts->subpart) {
- pr_warn("cmdline partition has no valid partition.");
- ret = -EINVAL;
- goto fail;
- }
-
- *parts = newparts;
-
- return 0;
-fail:
- free_subpart(newparts);
- kfree(newparts);
- return ret;
-}
-
-void cmdline_parts_free(struct cmdline_parts **parts)
-{
- struct cmdline_parts *next_parts;
-
- while (*parts) {
- next_parts = (*parts)->next_parts;
- free_subpart(*parts);
- kfree(*parts);
- *parts = next_parts;
- }
-}
-EXPORT_SYMBOL(cmdline_parts_free);
-
-int cmdline_parts_parse(struct cmdline_parts **parts, const char *cmdline)
-{
- int ret;
- char *buf;
- char *pbuf;
- char *next;
- struct cmdline_parts **next_parts;
-
- *parts = NULL;
-
- next = pbuf = buf = kstrdup(cmdline, GFP_KERNEL);
- if (!buf)
- return -ENOMEM;
-
- next_parts = parts;
-
- while (next && *pbuf) {
- next = strchr(pbuf, ';');
- if (next)
- *next = '\0';
-
- ret = parse_parts(next_parts, pbuf);
- if (ret)
- goto fail;
-
- if (next)
- pbuf = ++next;
-
- next_parts = &(*next_parts)->next_parts;
- }
-
- if (!*parts) {
- pr_warn("cmdline partition has no valid partition.");
- ret = -EINVAL;
- goto fail;
- }
-
- ret = 0;
-done:
- kfree(buf);
- return ret;
-
-fail:
- cmdline_parts_free(parts);
- goto done;
-}
-EXPORT_SYMBOL(cmdline_parts_parse);
-
-struct cmdline_parts *cmdline_parts_find(struct cmdline_parts *parts,
- const char *bdev)
-{
- while (parts && strncmp(bdev, parts->name, sizeof(parts->name)))
- parts = parts->next_parts;
- return parts;
-}
-EXPORT_SYMBOL(cmdline_parts_find);
-
-/*
- * add_part()
- * 0 success.
- * 1 can not add so many partitions.
- */
-int cmdline_parts_set(struct cmdline_parts *parts, sector_t disk_size,
- int slot,
- int (*add_part)(int, struct cmdline_subpart *, void *),
- void *param)
-{
- sector_t from = 0;
- struct cmdline_subpart *subpart;
-
- for (subpart = parts->subpart; subpart;
- subpart = subpart->next_subpart, slot++) {
- if (subpart->from == (sector_t)(~0ULL))
- subpart->from = from;
- else
- from = subpart->from;
-
- if (from >= disk_size)
- break;
-
- if (subpart->size > (disk_size - from))
- subpart->size = disk_size - from;
-
- from += subpart->size;
-
- if (add_part(slot, subpart, param))
- break;
- }
-
- return slot;
-}
-EXPORT_SYMBOL(cmdline_parts_set);
spin_unlock_irq(&ev->lock);
}
+/*
+ * Tell userland about new events. Only the events listed in @disk->events are
+ * reported, and only if DISK_EVENT_FLAG_UEVENT is set. Otherwise, events are
+ * processed internally but never get reported to userland.
+ */
+static void disk_event_uevent(struct gendisk *disk, unsigned int events)
+{
+ char *envp[ARRAY_SIZE(disk_uevents) + 1] = { };
+ int nr_events = 0, i;
+
+ for (i = 0; i < ARRAY_SIZE(disk_uevents); i++)
+ if (events & disk->events & (1 << i))
+ envp[nr_events++] = disk_uevents[i];
+
+ if (nr_events)
+ kobject_uevent_env(&disk_to_dev(disk)->kobj, KOBJ_CHANGE, envp);
+}
+
static void disk_check_events(struct disk_events *ev,
unsigned int *clearing_ptr)
{
struct gendisk *disk = ev->disk;
- char *envp[ARRAY_SIZE(disk_uevents) + 1] = { };
unsigned int clearing = *clearing_ptr;
unsigned int events;
unsigned long intv;
- int nr_events = 0, i;
/* check events */
events = disk->fops->check_events(disk, clearing);
spin_unlock_irq(&ev->lock);
- /*
- * Tell userland about new events. Only the events listed in
- * @disk->events are reported, and only if DISK_EVENT_FLAG_UEVENT
- * is set. Otherwise, events are processed internally but never
- * get reported to userland.
- */
- for (i = 0; i < ARRAY_SIZE(disk_uevents); i++)
- if ((events & disk->events & (1 << i)) &&
- (disk->event_flags & DISK_EVENT_FLAG_UEVENT))
- envp[nr_events++] = disk_uevents[i];
+ if (events & DISK_EVENT_MEDIA_CHANGE)
+ inc_diskseq(disk);
- if (nr_events)
- kobject_uevent_env(&disk_to_dev(disk)->kobj, KOBJ_CHANGE, envp);
+ if (disk->event_flags & DISK_EVENT_FLAG_UEVENT)
+ disk_event_uevent(disk, events);
}
/**
}
EXPORT_SYMBOL(bdev_check_media_change);
+/**
+ * disk_force_media_change - force a media change event
+ * @disk: the disk which will raise the event
+ * @events: the events to raise
+ *
+ * Generate uevents for the disk. If DISK_EVENT_MEDIA_CHANGE is present,
+ * attempt to free all dentries and inodes and invalidates all block
+ * device page cache entries in that case.
+ *
+ * Returns %true if DISK_EVENT_MEDIA_CHANGE was raised, or %false if not.
+ */
+bool disk_force_media_change(struct gendisk *disk, unsigned int events)
+{
+ disk_event_uevent(disk, events);
+
+ if (!(events & DISK_EVENT_MEDIA_CHANGE))
+ return false;
+
+ if (__invalidate_device(disk->part0, true))
+ pr_warn("VFS: busy inodes on changed media %s\n",
+ disk->disk_name);
+ set_bit(GD_NEED_PART_SCAN, &disk->state);
+ return true;
+}
+EXPORT_SYMBOL_GPL(disk_force_media_change);
+
/*
* Separate this part out so that a different pointer for clearing_ptr can be
* passed in for disk_clear_events.
/*
* disk_{alloc|add|del|release}_events - initialize and destroy disk_events.
*/
-void disk_alloc_events(struct gendisk *disk)
+int disk_alloc_events(struct gendisk *disk)
{
struct disk_events *ev;
if (!disk->fops->check_events || !disk->events)
- return;
+ return 0;
ev = kzalloc(sizeof(*ev), GFP_KERNEL);
if (!ev) {
pr_warn("%s: failed to initialize events\n", disk->disk_name);
- return;
+ return -ENOMEM;
}
INIT_LIST_HEAD(&ev->node);
INIT_DELAYED_WORK(&ev->dwork, disk_events_workfn);
disk->ev = ev;
+ return 0;
}
void disk_add_events(struct gendisk *disk)
__rq = elv_rqhash_find(q, bio->bi_iter.bi_sector);
if (__rq && elv_bio_merge_ok(__rq, bio)) {
*req = __rq;
+
+ if (blk_discard_mergable(__rq))
+ return ELEVATOR_DISCARD_MERGE;
return ELEVATOR_BACK_MERGE;
}
*/
static struct elevator_type *elevator_get_default(struct request_queue *q)
{
+ if (q->tag_set && q->tag_set->flags & BLK_MQ_F_NO_SCHED_BY_DEFAULT)
+ return NULL;
+
if (q->nr_hw_queues != 1 &&
!blk_mq_is_sbitmap_shared(q->tag_set->flags))
return NULL;
elevator_put(e);
}
}
-EXPORT_SYMBOL_GPL(elevator_init_mq); /* only for dm-rq */
/*
* switch to new_e io scheduler. be careful not to introduce deadlocks -
static struct kobject *block_depr;
+/*
+ * Unique, monotonically increasing sequential number associated with block
+ * devices instances (i.e. incremented each time a device is attached).
+ * Associating uevents with block devices in userspace is difficult and racy:
+ * the uevent netlink socket is lossy, and on slow and overloaded systems has
+ * a very high latency.
+ * Block devices do not have exclusive owners in userspace, any process can set
+ * one up (e.g. loop devices). Moreover, device names can be reused (e.g. loop0
+ * can be reused again and again).
+ * A userspace process setting up a block device and watching for its events
+ * cannot thus reliably tell whether an event relates to the device it just set
+ * up or another earlier instance with the same name.
+ * This sequential number allows userspace processes to solve this problem, and
+ * uniquely associate an uevent to the lifetime to a device.
+ */
+static atomic64_t diskseq;
+
/* for extended dynamic devt allocation, currently only one major is used */
#define NR_EXT_DEVT (1 << MINORBITS)
static DEFINE_IDA(ext_devt_ida);
* initial capacity during probing.
*/
if (size == capacity ||
- (disk->flags & (GENHD_FL_UP | GENHD_FL_HIDDEN)) != GENHD_FL_UP)
+ !disk_live(disk) ||
+ (disk->flags & GENHD_FL_HIDDEN))
return false;
pr_info("%s: detected capacity change from %lld to %lld\n",
EXPORT_SYMBOL_GPL(set_capacity_and_notify);
/*
- * Format the device name of the indicated disk into the supplied buffer and
- * return a pointer to that same buffer for convenience.
+ * Format the device name of the indicated block device into the supplied buffer
+ * and return a pointer to that same buffer for convenience.
+ *
+ * Note: do not use this in new code, use the %pg specifier to sprintf and
+ * printk insted.
*/
-char *disk_name(struct gendisk *hd, int partno, char *buf)
+const char *bdevname(struct block_device *bdev, char *buf)
{
+ struct gendisk *hd = bdev->bd_disk;
+ int partno = bdev->bd_partno;
+
if (!partno)
snprintf(buf, BDEVNAME_SIZE, "%s", hd->disk_name);
else if (isdigit(hd->disk_name[strlen(hd->disk_name)-1]))
return buf;
}
-
-const char *bdevname(struct block_device *bdev, char *buf)
-{
- return disk_name(bdev->bd_disk, bdev->bd_partno, buf);
-}
EXPORT_SYMBOL(bdevname);
static void part_stat_read_all(struct block_device *part,
EXPORT_SYMBOL(unregister_blkdev);
-/**
- * blk_mangle_minor - scatter minor numbers apart
- * @minor: minor number to mangle
- *
- * Scatter consecutively allocated @minor number apart if MANGLE_DEVT
- * is enabled. Mangling twice gives the original value.
- *
- * RETURNS:
- * Mangled value.
- *
- * CONTEXT:
- * Don't care.
- */
-static int blk_mangle_minor(int minor)
-{
-#ifdef CONFIG_DEBUG_BLOCK_EXT_DEVT
- int i;
-
- for (i = 0; i < MINORBITS / 2; i++) {
- int low = minor & (1 << i);
- int high = minor & (1 << (MINORBITS - 1 - i));
- int distance = MINORBITS - 1 - 2 * i;
-
- minor ^= low | high; /* clear both bits */
- low <<= distance; /* swap the positions */
- high >>= distance;
- minor |= low | high; /* and set */
- }
-#endif
- return minor;
-}
-
int blk_alloc_ext_minor(void)
{
int idx;
idx = ida_alloc_range(&ext_devt_ida, 0, NR_EXT_DEVT, GFP_KERNEL);
- if (idx < 0) {
- if (idx == -ENOSPC)
- return -EBUSY;
- return idx;
- }
- return blk_mangle_minor(idx);
+ if (idx == -ENOSPC)
+ return -EBUSY;
+ return idx;
}
void blk_free_ext_minor(unsigned int minor)
{
- ida_free(&ext_devt_ida, blk_mangle_minor(minor));
+ ida_free(&ext_devt_ida, minor);
}
static char *bdevt_str(dev_t devt, char *buf)
blkdev_put(bdev, FMODE_READ);
}
-static void register_disk(struct device *parent, struct gendisk *disk,
- const struct attribute_group **groups)
-{
- struct device *ddev = disk_to_dev(disk);
- int err;
-
- ddev->parent = parent;
-
- dev_set_name(ddev, "%s", disk->disk_name);
-
- /* delay uevents, until we scanned partition table */
- dev_set_uevent_suppress(ddev, 1);
-
- if (groups) {
- WARN_ON(ddev->groups);
- ddev->groups = groups;
- }
- if (device_add(ddev))
- return;
- if (!sysfs_deprecated) {
- err = sysfs_create_link(block_depr, &ddev->kobj,
- kobject_name(&ddev->kobj));
- if (err) {
- device_del(ddev);
- return;
- }
- }
-
- /*
- * avoid probable deadlock caused by allocating memory with
- * GFP_KERNEL in runtime_resume callback of its all ancestor
- * devices
- */
- pm_runtime_set_memalloc_noio(ddev, true);
-
- disk->part0->bd_holder_dir =
- kobject_create_and_add("holders", &ddev->kobj);
- disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
-
- if (disk->flags & GENHD_FL_HIDDEN)
- return;
-
- disk_scan_partitions(disk);
-
- /* announce the disk and partitions after all partitions are created */
- dev_set_uevent_suppress(ddev, 0);
- disk_uevent(disk, KOBJ_ADD);
-
- if (disk->queue->backing_dev_info->dev) {
- err = sysfs_create_link(&ddev->kobj,
- &disk->queue->backing_dev_info->dev->kobj,
- "bdi");
- WARN_ON(err);
- }
-}
-
/**
- * __device_add_disk - add disk information to kernel list
+ * device_add_disk - add disk information to kernel list
* @parent: parent device for the disk
* @disk: per-device partitioning information
* @groups: Additional per-device sysfs groups
- * @register_queue: register the queue if set to true
*
* This function registers the partitioning information in @disk
* with the kernel.
- *
- * FIXME: error handling
*/
-static void __device_add_disk(struct device *parent, struct gendisk *disk,
- const struct attribute_group **groups,
- bool register_queue)
+int device_add_disk(struct device *parent, struct gendisk *disk,
+ const struct attribute_group **groups)
+
{
+ struct device *ddev = disk_to_dev(disk);
int ret;
/*
* elevator if one is needed, that is, for devices requesting queue
* registration.
*/
- if (register_queue)
- elevator_init_mq(disk->queue);
+ elevator_init_mq(disk->queue);
/*
* If the driver provides an explicit major number it also must provide
* and all partitions from the extended dev_t space.
*/
if (disk->major) {
- WARN_ON(!disk->minors);
+ if (WARN_ON(!disk->minors))
+ return -EINVAL;
if (disk->minors > DISK_MAX_PARTS) {
pr_err("block: can't allocate more than %d partitions\n",
disk->minors = DISK_MAX_PARTS;
}
} else {
- WARN_ON(disk->minors);
+ if (WARN_ON(disk->minors))
+ return -EINVAL;
ret = blk_alloc_ext_minor();
- if (ret < 0) {
- WARN_ON(1);
- return;
- }
+ if (ret < 0)
+ return ret;
disk->major = BLOCK_EXT_MAJOR;
- disk->first_minor = MINOR(ret);
+ disk->first_minor = ret;
disk->flags |= GENHD_FL_EXT_DEVT;
}
- disk->flags |= GENHD_FL_UP;
+ ret = disk_alloc_events(disk);
+ if (ret)
+ goto out_free_ext_minor;
- disk_alloc_events(disk);
+ /* delay uevents, until we scanned partition table */
+ dev_set_uevent_suppress(ddev, 1);
+
+ ddev->parent = parent;
+ ddev->groups = groups;
+ dev_set_name(ddev, "%s", disk->disk_name);
+ if (!(disk->flags & GENHD_FL_HIDDEN))
+ ddev->devt = MKDEV(disk->major, disk->first_minor);
+ ret = device_add(ddev);
+ if (ret)
+ goto out_disk_release_events;
+ if (!sysfs_deprecated) {
+ ret = sysfs_create_link(block_depr, &ddev->kobj,
+ kobject_name(&ddev->kobj));
+ if (ret)
+ goto out_device_del;
+ }
+
+ /*
+ * avoid probable deadlock caused by allocating memory with
+ * GFP_KERNEL in runtime_resume callback of its all ancestor
+ * devices
+ */
+ pm_runtime_set_memalloc_noio(ddev, true);
+
+ ret = blk_integrity_add(disk);
+ if (ret)
+ goto out_del_block_link;
+
+ disk->part0->bd_holder_dir =
+ kobject_create_and_add("holders", &ddev->kobj);
+ if (!disk->part0->bd_holder_dir)
+ goto out_del_integrity;
+ disk->slave_dir = kobject_create_and_add("slaves", &ddev->kobj);
+ if (!disk->slave_dir)
+ goto out_put_holder_dir;
+
+ ret = bd_register_pending_holders(disk);
+ if (ret < 0)
+ goto out_put_slave_dir;
+
+ ret = blk_register_queue(disk);
+ if (ret)
+ goto out_put_slave_dir;
if (disk->flags & GENHD_FL_HIDDEN) {
/*
disk->flags |= GENHD_FL_SUPPRESS_PARTITION_INFO;
disk->flags |= GENHD_FL_NO_PART_SCAN;
} else {
- struct backing_dev_info *bdi = disk->queue->backing_dev_info;
- struct device *dev = disk_to_dev(disk);
-
- /* Register BDI before referencing it from bdev */
- dev->devt = MKDEV(disk->major, disk->first_minor);
- ret = bdi_register(bdi, "%u:%u",
+ ret = bdi_register(disk->bdi, "%u:%u",
disk->major, disk->first_minor);
- WARN_ON(ret);
- bdi_set_owner(bdi, dev);
- bdev_add(disk->part0, dev->devt);
- }
- register_disk(parent, disk, groups);
- if (register_queue)
- blk_register_queue(disk);
+ if (ret)
+ goto out_unregister_queue;
+ bdi_set_owner(disk->bdi, ddev);
+ ret = sysfs_create_link(&ddev->kobj,
+ &disk->bdi->dev->kobj, "bdi");
+ if (ret)
+ goto out_unregister_bdi;
- /*
- * Take an extra ref on queue which will be put on disk_release()
- * so that it sticks around as long as @disk is there.
- */
- if (blk_get_queue(disk->queue))
- set_bit(GD_QUEUE_REF, &disk->state);
- else
- WARN_ON_ONCE(1);
+ bdev_add(disk->part0, ddev->devt);
+ disk_scan_partitions(disk);
- disk_add_events(disk);
- blk_integrity_add(disk);
-}
+ /*
+ * Announce the disk and partitions after all partitions are
+ * created. (for hidden disks uevents remain suppressed forever)
+ */
+ dev_set_uevent_suppress(ddev, 0);
+ disk_uevent(disk, KOBJ_ADD);
+ }
-void device_add_disk(struct device *parent, struct gendisk *disk,
- const struct attribute_group **groups)
+ disk_update_readahead(disk);
+ disk_add_events(disk);
+ return 0;
-{
- __device_add_disk(parent, disk, groups, true);
+out_unregister_bdi:
+ if (!(disk->flags & GENHD_FL_HIDDEN))
+ bdi_unregister(disk->bdi);
+out_unregister_queue:
+ blk_unregister_queue(disk);
+out_put_slave_dir:
+ kobject_put(disk->slave_dir);
+out_put_holder_dir:
+ kobject_put(disk->part0->bd_holder_dir);
+out_del_integrity:
+ blk_integrity_del(disk);
+out_del_block_link:
+ if (!sysfs_deprecated)
+ sysfs_remove_link(block_depr, dev_name(ddev));
+out_device_del:
+ device_del(ddev);
+out_disk_release_events:
+ disk_release_events(disk);
+out_free_ext_minor:
+ if (disk->major == BLOCK_EXT_MAJOR)
+ blk_free_ext_minor(disk->first_minor);
+ return WARN_ON_ONCE(ret); /* keep until all callers handle errors */
}
EXPORT_SYMBOL(device_add_disk);
-void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
-{
- __device_add_disk(parent, disk, NULL, false);
-}
-EXPORT_SYMBOL(device_add_disk_no_queue_reg);
-
/**
* del_gendisk - remove the gendisk
* @disk: the struct gendisk to remove
{
might_sleep();
- if (WARN_ON_ONCE(!disk->queue))
+ if (WARN_ON_ONCE(!disk_live(disk) && !(disk->flags & GENHD_FL_HIDDEN)))
return;
blk_integrity_del(disk);
disk_del_events(disk);
mutex_lock(&disk->open_mutex);
- disk->flags &= ~GENHD_FL_UP;
+ remove_inode_hash(disk->part0->bd_inode);
blk_drop_partitions(disk);
mutex_unlock(&disk->open_mutex);
fsync_bdev(disk->part0);
__invalidate_device(disk->part0, true);
- /*
- * Unhash the bdev inode for this device so that it can't be looked
- * up any more even if openers still hold references to it.
- */
- remove_inode_hash(disk->part0->bd_inode);
-
set_capacity(disk, 0);
if (!(disk->flags & GENHD_FL_HIDDEN)) {
* Unregister bdi before releasing device numbers (as they can
* get reused and we'd get clashes in sysfs).
*/
- bdi_unregister(disk->queue->backing_dev_info);
+ bdi_unregister(disk->bdi);
}
blk_unregister_queue(disk);
while ((dev = class_dev_iter_next(&iter))) {
struct gendisk *disk = dev_to_disk(dev);
struct block_device *part;
- char name_buf[BDEVNAME_SIZE];
char devt_buf[BDEVT_SIZE];
unsigned long idx;
xa_for_each(&disk->part_tbl, idx, part) {
if (!bdev_nr_sectors(part))
continue;
- printk("%s%s %10llu %s %s",
+ printk("%s%s %10llu %pg %s",
bdev_is_partition(part) ? " " : "",
bdevt_str(part->bd_dev, devt_buf),
- bdev_nr_sectors(part) >> 1,
- disk_name(disk, part->bd_partno, name_buf),
+ bdev_nr_sectors(part) >> 1, part,
part->bd_meta_info ?
part->bd_meta_info->uuid : "");
if (bdev_is_partition(part))
struct gendisk *sgp = v;
struct block_device *part;
unsigned long idx;
- char buf[BDEVNAME_SIZE];
/* Don't show non-partitionable removeable devices or empty devices */
if (!get_capacity(sgp) || (!disk_max_parts(sgp) &&
xa_for_each(&sgp->part_tbl, idx, part) {
if (!bdev_nr_sectors(part))
continue;
- seq_printf(seqf, "%4d %7d %10llu %s\n",
+ seq_printf(seqf, "%4d %7d %10llu %pg\n",
MAJOR(part->bd_dev), MINOR(part->bd_dev),
- bdev_nr_sectors(part) >> 1,
- disk_name(sgp, part->bd_partno, buf));
+ bdev_nr_sectors(part) >> 1, part);
}
rcu_read_unlock();
return 0;
return sprintf(buf, "%d\n", queue_discard_alignment(disk->queue));
}
+static ssize_t diskseq_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct gendisk *disk = dev_to_disk(dev);
+
+ return sprintf(buf, "%llu\n", disk->diskseq);
+}
+
static DEVICE_ATTR(range, 0444, disk_range_show, NULL);
static DEVICE_ATTR(ext_range, 0444, disk_ext_range_show, NULL);
static DEVICE_ATTR(removable, 0444, disk_removable_show, NULL);
static DEVICE_ATTR(stat, 0444, part_stat_show, NULL);
static DEVICE_ATTR(inflight, 0444, part_inflight_show, NULL);
static DEVICE_ATTR(badblocks, 0644, disk_badblocks_show, disk_badblocks_store);
+static DEVICE_ATTR(diskseq, 0444, diskseq_show, NULL);
#ifdef CONFIG_FAIL_MAKE_REQUEST
ssize_t part_fail_show(struct device *dev,
&dev_attr_events.attr,
&dev_attr_events_async.attr,
&dev_attr_events_poll_msecs.attr,
+ &dev_attr_diskseq.attr,
#ifdef CONFIG_FAIL_MAKE_REQUEST
&dev_attr_fail.attr,
#endif
might_sleep();
- if (MAJOR(dev->devt) == BLOCK_EXT_MAJOR)
- blk_free_ext_minor(MINOR(dev->devt));
disk_release_events(disk);
kfree(disk->random);
xa_destroy(&disk->part_tbl);
- if (test_bit(GD_QUEUE_REF, &disk->state) && disk->queue)
- blk_put_queue(disk->queue);
- bdput(disk->part0); /* frees the disk */
+ disk->queue->disk = NULL;
+ blk_put_queue(disk->queue);
+ iput(disk->part0->bd_inode); /* frees the disk */
+}
+
+static int block_uevent(struct device *dev, struct kobj_uevent_env *env)
+{
+ struct gendisk *disk = dev_to_disk(dev);
+
+ return add_uevent_var(env, "DISKSEQ=%llu", disk->diskseq);
}
+
struct class block_class = {
.name = "block",
+ .dev_uevent = block_uevent,
};
static char *block_devnode(struct device *dev, umode_t *mode,
{
struct gendisk *gp = v;
struct block_device *hd;
- char buf[BDEVNAME_SIZE];
unsigned int inflight;
struct disk_stats stat;
unsigned long idx;
else
inflight = part_in_flight(hd);
- seq_printf(seqf, "%4d %7d %s "
+ seq_printf(seqf, "%4d %7d %pg "
"%lu %lu %lu %u "
"%lu %lu %lu %u "
"%u %u %u "
"%lu %lu %lu %u "
"%lu %u"
"\n",
- MAJOR(hd->bd_dev), MINOR(hd->bd_dev),
- disk_name(gp, hd->bd_partno, buf),
+ MAJOR(hd->bd_dev), MINOR(hd->bd_dev), hd,
stat.ios[STAT_READ],
stat.merges[STAT_READ],
stat.sectors[STAT_READ],
return devt;
}
-struct gendisk *__alloc_disk_node(int minors, int node_id)
+struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
+ struct lock_class_key *lkclass)
{
struct gendisk *disk;
+ if (!blk_get_queue(q))
+ return NULL;
+
disk = kzalloc_node(sizeof(struct gendisk), GFP_KERNEL, node_id);
if (!disk)
- return NULL;
+ goto out_put_queue;
+
+ disk->bdi = bdi_alloc(node_id);
+ if (!disk->bdi)
+ goto out_free_disk;
disk->part0 = bdev_alloc(disk, 0);
if (!disk->part0)
- goto out_free_disk;
+ goto out_free_bdi;
disk->node_id = node_id;
mutex_init(&disk->open_mutex);
if (xa_insert(&disk->part_tbl, 0, disk->part0, GFP_KERNEL))
goto out_destroy_part_tbl;
- disk->minors = minors;
rand_initialize_disk(disk);
disk_to_dev(disk)->class = &block_class;
disk_to_dev(disk)->type = &disk_type;
device_initialize(disk_to_dev(disk));
+ inc_diskseq(disk);
+ disk->queue = q;
+ q->disk = disk;
+ lockdep_init_map(&disk->lockdep_map, "(bio completion)", lkclass, 0);
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
+ INIT_LIST_HEAD(&disk->slave_bdevs);
+#endif
return disk;
out_destroy_part_tbl:
xa_destroy(&disk->part_tbl);
- bdput(disk->part0);
+ iput(disk->part0->bd_inode);
+out_free_bdi:
+ bdi_put(disk->bdi);
out_free_disk:
kfree(disk);
+out_put_queue:
+ blk_put_queue(q);
return NULL;
}
EXPORT_SYMBOL(__alloc_disk_node);
-struct gendisk *__blk_alloc_disk(int node)
+struct gendisk *__blk_alloc_disk(int node, struct lock_class_key *lkclass)
{
struct request_queue *q;
struct gendisk *disk;
if (!q)
return NULL;
- disk = __alloc_disk_node(0, node);
+ disk = __alloc_disk_node(q, node, lkclass);
if (!disk) {
blk_cleanup_queue(q);
return NULL;
}
- disk->queue = q;
return disk;
}
EXPORT_SYMBOL(__blk_alloc_disk);
return bdev->bd_read_only || get_disk_ro(bdev->bd_disk);
}
EXPORT_SYMBOL(bdev_read_only);
+
+void inc_diskseq(struct gendisk *disk)
+{
+ disk->diskseq = atomic64_inc_return(&diskseq);
+}
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/genhd.h>
+
+struct bd_holder_disk {
+ struct list_head list;
+ struct block_device *bdev;
+ int refcnt;
+};
+
+static struct bd_holder_disk *bd_find_holder_disk(struct block_device *bdev,
+ struct gendisk *disk)
+{
+ struct bd_holder_disk *holder;
+
+ list_for_each_entry(holder, &disk->slave_bdevs, list)
+ if (holder->bdev == bdev)
+ return holder;
+ return NULL;
+}
+
+static int add_symlink(struct kobject *from, struct kobject *to)
+{
+ return sysfs_create_link(from, to, kobject_name(to));
+}
+
+static void del_symlink(struct kobject *from, struct kobject *to)
+{
+ sysfs_remove_link(from, kobject_name(to));
+}
+
+static int __link_disk_holder(struct block_device *bdev, struct gendisk *disk)
+{
+ int ret;
+
+ ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
+ if (ret)
+ return ret;
+ ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
+ if (ret)
+ del_symlink(disk->slave_dir, bdev_kobj(bdev));
+ return ret;
+}
+
+/**
+ * bd_link_disk_holder - create symlinks between holding disk and slave bdev
+ * @bdev: the claimed slave bdev
+ * @disk: the holding disk
+ *
+ * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
+ *
+ * This functions creates the following sysfs symlinks.
+ *
+ * - from "slaves" directory of the holder @disk to the claimed @bdev
+ * - from "holders" directory of the @bdev to the holder @disk
+ *
+ * For example, if /dev/dm-0 maps to /dev/sda and disk for dm-0 is
+ * passed to bd_link_disk_holder(), then:
+ *
+ * /sys/block/dm-0/slaves/sda --> /sys/block/sda
+ * /sys/block/sda/holders/dm-0 --> /sys/block/dm-0
+ *
+ * The caller must have claimed @bdev before calling this function and
+ * ensure that both @bdev and @disk are valid during the creation and
+ * lifetime of these symlinks.
+ *
+ * CONTEXT:
+ * Might sleep.
+ *
+ * RETURNS:
+ * 0 on success, -errno on failure.
+ */
+int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
+{
+ struct bd_holder_disk *holder;
+ int ret = 0;
+
+ mutex_lock(&disk->open_mutex);
+
+ WARN_ON_ONCE(!bdev->bd_holder);
+
+ /* FIXME: remove the following once add_disk() handles errors */
+ if (WARN_ON(!bdev->bd_holder_dir))
+ goto out_unlock;
+
+ holder = bd_find_holder_disk(bdev, disk);
+ if (holder) {
+ holder->refcnt++;
+ goto out_unlock;
+ }
+
+ holder = kzalloc(sizeof(*holder), GFP_KERNEL);
+ if (!holder) {
+ ret = -ENOMEM;
+ goto out_unlock;
+ }
+
+ INIT_LIST_HEAD(&holder->list);
+ holder->bdev = bdev;
+ holder->refcnt = 1;
+ if (disk->slave_dir) {
+ ret = __link_disk_holder(bdev, disk);
+ if (ret) {
+ kfree(holder);
+ goto out_unlock;
+ }
+ }
+
+ list_add(&holder->list, &disk->slave_bdevs);
+ /*
+ * del_gendisk drops the initial reference to bd_holder_dir, so we need
+ * to keep our own here to allow for cleanup past that point.
+ */
+ kobject_get(bdev->bd_holder_dir);
+
+out_unlock:
+ mutex_unlock(&disk->open_mutex);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(bd_link_disk_holder);
+
+static void __unlink_disk_holder(struct block_device *bdev,
+ struct gendisk *disk)
+{
+ del_symlink(disk->slave_dir, bdev_kobj(bdev));
+ del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
+}
+
+/**
+ * bd_unlink_disk_holder - destroy symlinks created by bd_link_disk_holder()
+ * @bdev: the calimed slave bdev
+ * @disk: the holding disk
+ *
+ * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
+ *
+ * CONTEXT:
+ * Might sleep.
+ */
+void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
+{
+ struct bd_holder_disk *holder;
+
+ mutex_lock(&disk->open_mutex);
+ holder = bd_find_holder_disk(bdev, disk);
+ if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
+ if (disk->slave_dir)
+ __unlink_disk_holder(bdev, disk);
+ kobject_put(bdev->bd_holder_dir);
+ list_del_init(&holder->list);
+ kfree(holder);
+ }
+ mutex_unlock(&disk->open_mutex);
+}
+EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
+
+int bd_register_pending_holders(struct gendisk *disk)
+{
+ struct bd_holder_disk *holder;
+ int ret;
+
+ mutex_lock(&disk->open_mutex);
+ list_for_each_entry(holder, &disk->slave_bdevs, list) {
+ ret = __link_disk_holder(holder->bdev, disk);
+ if (ret)
+ goto out_undo;
+ }
+ mutex_unlock(&disk->open_mutex);
+ return 0;
+
+out_undo:
+ list_for_each_entry_continue_reverse(holder, &disk->slave_bdevs, list)
+ __unlink_disk_holder(holder->bdev, disk);
+ mutex_unlock(&disk->open_mutex);
+ return ret;
+}
static int blkpg_do_ioctl(struct block_device *bdev,
struct blkpg_partition __user *upart, int op)
{
+ struct gendisk *disk = bdev->bd_disk;
struct blkpg_partition p;
long long start, length;
return -EINVAL;
if (op == BLKPG_DEL_PARTITION)
- return bdev_del_partition(bdev, p.pno);
+ return bdev_del_partition(disk, p.pno);
start = p.start >> SECTOR_SHIFT;
length = p.length >> SECTOR_SHIFT;
/* check if partition is aligned to blocksize */
if (p.start & (bdev_logical_block_size(bdev) - 1))
return -EINVAL;
- return bdev_add_partition(bdev, p.pno, start, length);
+ return bdev_add_partition(disk, p.pno, start, length);
case BLKPG_RESIZE_PARTITION:
- return bdev_resize_partition(bdev, p.pno, start, length);
+ return bdev_resize_partition(disk, p.pno, start, length);
default:
return -EINVAL;
}
BLKDEV_DISCARD_SECURE);
case BLKZEROOUT:
return blk_ioctl_zeroout(bdev, mode, arg);
+ case BLKGETDISKSEQ:
+ return put_u64(argp, bdev->bd_disk->diskseq);
case BLKREPORTZONE:
return blkdev_report_zones_ioctl(bdev, mode, cmd, arg);
case BLKRESETZONE:
case BLKFRASET:
if(!capable(CAP_SYS_ADMIN))
return -EACCES;
- bdev->bd_bdi->ra_pages = (arg * 512) / PAGE_SIZE;
+ bdev->bd_disk->bdi->ra_pages = (arg * 512) / PAGE_SIZE;
return 0;
case BLKRRPART:
return blkdev_reread_part(bdev, mode);
case BLKFRAGET:
if (!argp)
return -EINVAL;
- return put_long(argp, (bdev->bd_bdi->ra_pages*PAGE_SIZE) / 512);
+ return put_long(argp,
+ (bdev->bd_disk->bdi->ra_pages * PAGE_SIZE) / 512);
case BLKGETSIZE:
size = i_size_read(bdev->bd_inode);
if ((size >> 9) > ~0UL)
if (!argp)
return -EINVAL;
return compat_put_long(argp,
- (bdev->bd_bdi->ra_pages * PAGE_SIZE) / 512);
+ (bdev->bd_disk->bdi->ra_pages * PAGE_SIZE) / 512);
case BLKGETSIZE:
size = i_size_read(bdev->bd_inode);
if ((size >> 9) > ~0UL)
fallthrough;
/* rt has prio field too */
case IOPRIO_CLASS_BE:
- if (data >= IOPRIO_BE_NR || data < 0)
+ if (data >= IOPRIO_NR_LEVELS || data < 0)
return -EINVAL;
-
break;
case IOPRIO_CLASS_IDLE:
break;
ret = security_task_getioprio(p);
if (ret)
goto out;
- ret = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, IOPRIO_NORM);
+ ret = IOPRIO_DEFAULT;
task_lock(p);
if (p->io_context)
ret = p->io_context->ioprio;
int ioprio_best(unsigned short aprio, unsigned short bprio)
{
if (!ioprio_valid(aprio))
- aprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
+ aprio = IOPRIO_DEFAULT;
if (!ioprio_valid(bprio))
- bprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
+ bprio = IOPRIO_DEFAULT;
return min(aprio, bprio);
}
*/
static const int read_expire = HZ / 2; /* max time before a read is submitted. */
static const int write_expire = 5 * HZ; /* ditto for writes, these limits are SOFT! */
-/*
- * Time after which to dispatch lower priority requests even if higher
- * priority requests are pending.
- */
-static const int aging_expire = 10 * HZ;
static const int writes_starved = 2; /* max times reads can starve a write */
static const int fifo_batch = 16; /* # of sequential requests treated as one
by the above parameters. For throughput. */
int writes_starved;
int front_merges;
u32 async_depth;
- int aging_expire;
spinlock_t lock;
spinlock_t zone_lock;
/*
* deadline_dispatch_requests selects the best request according to
- * read/write expire, fifo_batch, etc and with a start time <= @latest.
+ * read/write expire, fifo_batch, etc
*/
static struct request *__dd_dispatch_request(struct deadline_data *dd,
- struct dd_per_prio *per_prio,
- u64 latest_start_ns)
+ struct dd_per_prio *per_prio)
{
struct request *rq, *next_rq;
enum dd_data_dir data_dir;
if (!list_empty(&per_prio->dispatch)) {
rq = list_first_entry(&per_prio->dispatch, struct request,
queuelist);
- if (rq->start_time_ns > latest_start_ns)
- return NULL;
list_del_init(&rq->queuelist);
goto done;
}
dd->batching = 0;
dispatch_request:
- if (rq->start_time_ns > latest_start_ns)
- return NULL;
/*
* rq is the selected appropriate request.
*/
static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
{
struct deadline_data *dd = hctx->queue->elevator->elevator_data;
- const u64 now_ns = ktime_get_ns();
- struct request *rq = NULL;
+ struct request *rq;
enum dd_prio prio;
spin_lock(&dd->lock);
- /*
- * Start with dispatching requests whose deadline expired more than
- * aging_expire jiffies ago.
- */
- for (prio = DD_BE_PRIO; prio <= DD_PRIO_MAX; prio++) {
- rq = __dd_dispatch_request(dd, &dd->per_prio[prio], now_ns -
- jiffies_to_nsecs(dd->aging_expire));
- if (rq)
- goto unlock;
- }
- /*
- * Next, dispatch requests in priority order. Ignore lower priority
- * requests if any higher priority requests are pending.
- */
for (prio = 0; prio <= DD_PRIO_MAX; prio++) {
- rq = __dd_dispatch_request(dd, &dd->per_prio[prio], now_ns);
- if (rq || dd_queued(dd, prio))
+ rq = __dd_dispatch_request(dd, &dd->per_prio[prio]);
+ if (rq)
break;
}
-
-unlock:
spin_unlock(&dd->lock);
return rq;
dd->front_merges = 1;
dd->last_dir = DD_WRITE;
dd->fifo_batch = fifo_batch;
- dd->aging_expire = aging_expire;
spin_lock_init(&dd->lock);
spin_lock_init(&dd->zone_lock);
if (elv_bio_merge_ok(__rq, bio)) {
*rq = __rq;
+ if (blk_discard_mergable(__rq))
+ return ELEVATOR_DISCARD_MERGE;
return ELEVATOR_FRONT_MERGE;
}
}
prio = ioprio_class_to_prio[ioprio_class];
dd_count(dd, inserted, prio);
+ rq->elv.priv[0] = (void *)(uintptr_t)1;
if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
blk_mq_free_requests(&free);
spin_unlock(&dd->lock);
}
-/*
- * Nothing to do here. This is defined only to ensure that .finish_request
- * method is called upon request completion.
- */
+/* Callback from inside blk_mq_rq_ctx_init(). */
static void dd_prepare_request(struct request *rq)
{
+ rq->elv.priv[0] = NULL;
}
/*
const enum dd_prio prio = ioprio_class_to_prio[ioprio_class];
struct dd_per_prio *per_prio = &dd->per_prio[prio];
- dd_count(dd, completed, prio);
+ /*
+ * The block layer core may call dd_finish_request() without having
+ * called dd_insert_requests(). Hence only update statistics for
+ * requests for which dd_insert_requests() has been called. See also
+ * blk_mq_request_bypass_insert().
+ */
+ if (rq->elv.priv[0])
+ dd_count(dd, completed, prio);
if (blk_queue_is_zoned(q)) {
unsigned long flags;
#define SHOW_JIFFIES(__FUNC, __VAR) SHOW_INT(__FUNC, jiffies_to_msecs(__VAR))
SHOW_JIFFIES(deadline_read_expire_show, dd->fifo_expire[DD_READ]);
SHOW_JIFFIES(deadline_write_expire_show, dd->fifo_expire[DD_WRITE]);
-SHOW_JIFFIES(deadline_aging_expire_show, dd->aging_expire);
SHOW_INT(deadline_writes_starved_show, dd->writes_starved);
SHOW_INT(deadline_front_merges_show, dd->front_merges);
SHOW_INT(deadline_async_depth_show, dd->front_merges);
STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, msecs_to_jiffies)
STORE_JIFFIES(deadline_read_expire_store, &dd->fifo_expire[DD_READ], 0, INT_MAX);
STORE_JIFFIES(deadline_write_expire_store, &dd->fifo_expire[DD_WRITE], 0, INT_MAX);
-STORE_JIFFIES(deadline_aging_expire_store, &dd->aging_expire, 0, INT_MAX);
STORE_INT(deadline_writes_starved_store, &dd->writes_starved, INT_MIN, INT_MAX);
STORE_INT(deadline_front_merges_store, &dd->front_merges, 0, 1);
STORE_INT(deadline_async_depth_store, &dd->front_merges, 1, INT_MAX);
DD_ATTR(front_merges),
DD_ATTR(async_depth),
DD_ATTR(fifo_batch),
- DD_ATTR(aging_expire),
__ATTR_NULL
};
config CMDLINE_PARTITION
bool "Command line partition support" if PARTITION_ADVANCED
- select BLK_CMDLINE_PARSER
help
Say Y here if you want to read the partition table from bootargs.
The format for the command line is just like mtdparts.
/*
* Work out start of non-adfs partition.
*/
- nr_sects = (state->bdev->bd_inode->i_size >> 9) - start_sect;
+ nr_sects = get_capacity(state->disk) - start_sect;
if (start_sect) {
switch (id) {
if (i != 0) {
sector_t size;
- size = get_capacity(state->bdev->bd_disk);
+ size = get_capacity(state->disk);
put_partition(state, slot++, start, size - start);
strlcat(state->pp_buf, "\n", PAGE_SIZE);
}
#define LVM_MAXLVS 256
-/**
- * last_lba(): return number of last logical block of device
- * @bdev: block device
- *
- * Description: Returns last LBA value on success, 0 on error.
- * This is stored (by sd and ide-geometry) in
- * the part[0] entry for this disk, and is the number of
- * physical sectors available on the disk.
- */
-static u64 last_lba(struct block_device *bdev)
-{
- if (!bdev || !bdev->bd_inode)
- return 0;
- return (bdev->bd_inode->i_size >> 9) - 1ULL;
-}
-
/**
* read_lba(): Read bytes from disk, starting at given LBA
* @state
* @buffer
* @count
*
- * Description: Reads @count bytes from @state->bdev into @buffer.
+ * Description: Reads @count bytes from @state->disk into @buffer.
* Returns number of bytes read on success, 0 on error.
*/
static size_t read_lba(struct parsed_partitions *state, u64 lba, u8 *buffer,
{
size_t totalreadcount = 0;
- if (!buffer || lba + count / 512 > last_lba(state->bdev))
+ if (!buffer || lba + count / 512 > get_capacity(state->disk) - 1ULL)
return 0;
while (count) {
int start_sect, nr_sects, blk, part, res = 0;
int blksize = 1; /* Multiplier for disk block size */
int slot = 1;
- char b[BDEVNAME_SIZE];
for (blk = 0; ; blk++, put_dev_sector(sect)) {
if (blk == RDB_ALLOCATION_LIMIT)
data = read_part_sector(state, blk, §);
if (!data) {
pr_err("Dev %s: unable to read RDB block %d\n",
- bdevname(state->bdev, b), blk);
+ state->disk->disk_name, blk);
res = -1;
goto rdb_done;
}
}
pr_err("Dev %s: RDB in block %d has bad checksum\n",
- bdevname(state->bdev, b), blk);
+ state->disk->disk_name, blk);
}
/* blksize is blocks per 512 byte standard block */
data = read_part_sector(state, blk, §);
if (!data) {
pr_err("Dev %s: unable to read partition block %d\n",
- bdevname(state->bdev, b), blk);
+ state->disk->disk_name, blk);
res = -1;
goto rdb_done;
}
* ATARI partition scheme supports 512 lba only. If this is not
* the case, bail early to avoid miscalculating hd_size.
*/
- if (bdev_logical_block_size(state->bdev) != 512)
+ if (queue_logical_block_size(state->disk->queue) != 512)
return 0;
rs = read_part_sector(state, 0, §);
return -1;
/* Verify this is an Atari rootsector: */
- hd_size = state->bdev->bd_inode->i_size >> 9;
+ hd_size = get_capacity(state->disk);
if (!VALID_PARTITION(&rs->part[0], hd_size) &&
!VALID_PARTITION(&rs->part[1], hd_size) &&
!VALID_PARTITION(&rs->part[2], hd_size) &&
* description.
*/
struct parsed_partitions {
- struct block_device *bdev;
+ struct gendisk *disk;
char name[BDEVNAME_SIZE];
struct {
sector_t from;
* For further information, see "Documentation/block/cmdline-partition.rst"
*
*/
+#include <linux/blkdev.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include "check.h"
-#include <linux/cmdline-parser.h>
-#include "check.h"
+/* partition flags */
+#define PF_RDONLY 0x01 /* Device is read only */
+#define PF_POWERUP_LOCK 0x02 /* Always locked after reset */
+
+struct cmdline_subpart {
+ char name[BDEVNAME_SIZE]; /* partition name, such as 'rootfs' */
+ sector_t from;
+ sector_t size;
+ int flags;
+ struct cmdline_subpart *next_subpart;
+};
+
+struct cmdline_parts {
+ char name[BDEVNAME_SIZE]; /* block device, such as 'mmcblk0' */
+ unsigned int nr_subparts;
+ struct cmdline_subpart *subpart;
+ struct cmdline_parts *next_parts;
+};
+
+static int parse_subpart(struct cmdline_subpart **subpart, char *partdef)
+{
+ int ret = 0;
+ struct cmdline_subpart *new_subpart;
+
+ *subpart = NULL;
+
+ new_subpart = kzalloc(sizeof(struct cmdline_subpart), GFP_KERNEL);
+ if (!new_subpart)
+ return -ENOMEM;
+
+ if (*partdef == '-') {
+ new_subpart->size = (sector_t)(~0ULL);
+ partdef++;
+ } else {
+ new_subpart->size = (sector_t)memparse(partdef, &partdef);
+ if (new_subpart->size < (sector_t)PAGE_SIZE) {
+ pr_warn("cmdline partition size is invalid.");
+ ret = -EINVAL;
+ goto fail;
+ }
+ }
+
+ if (*partdef == '@') {
+ partdef++;
+ new_subpart->from = (sector_t)memparse(partdef, &partdef);
+ } else {
+ new_subpart->from = (sector_t)(~0ULL);
+ }
+
+ if (*partdef == '(') {
+ int length;
+ char *next = strchr(++partdef, ')');
+
+ if (!next) {
+ pr_warn("cmdline partition format is invalid.");
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ length = min_t(int, next - partdef,
+ sizeof(new_subpart->name) - 1);
+ strncpy(new_subpart->name, partdef, length);
+ new_subpart->name[length] = '\0';
+
+ partdef = ++next;
+ } else
+ new_subpart->name[0] = '\0';
+
+ new_subpart->flags = 0;
+
+ if (!strncmp(partdef, "ro", 2)) {
+ new_subpart->flags |= PF_RDONLY;
+ partdef += 2;
+ }
+
+ if (!strncmp(partdef, "lk", 2)) {
+ new_subpart->flags |= PF_POWERUP_LOCK;
+ partdef += 2;
+ }
+
+ *subpart = new_subpart;
+ return 0;
+fail:
+ kfree(new_subpart);
+ return ret;
+}
+
+static void free_subpart(struct cmdline_parts *parts)
+{
+ struct cmdline_subpart *subpart;
+
+ while (parts->subpart) {
+ subpart = parts->subpart;
+ parts->subpart = subpart->next_subpart;
+ kfree(subpart);
+ }
+}
+
+static int parse_parts(struct cmdline_parts **parts, const char *bdevdef)
+{
+ int ret = -EINVAL;
+ char *next;
+ int length;
+ struct cmdline_subpart **next_subpart;
+ struct cmdline_parts *newparts;
+ char buf[BDEVNAME_SIZE + 32 + 4];
+
+ *parts = NULL;
+
+ newparts = kzalloc(sizeof(struct cmdline_parts), GFP_KERNEL);
+ if (!newparts)
+ return -ENOMEM;
+
+ next = strchr(bdevdef, ':');
+ if (!next) {
+ pr_warn("cmdline partition has no block device.");
+ goto fail;
+ }
+
+ length = min_t(int, next - bdevdef, sizeof(newparts->name) - 1);
+ strncpy(newparts->name, bdevdef, length);
+ newparts->name[length] = '\0';
+ newparts->nr_subparts = 0;
+
+ next_subpart = &newparts->subpart;
+
+ while (next && *(++next)) {
+ bdevdef = next;
+ next = strchr(bdevdef, ',');
+
+ length = (!next) ? (sizeof(buf) - 1) :
+ min_t(int, next - bdevdef, sizeof(buf) - 1);
+
+ strncpy(buf, bdevdef, length);
+ buf[length] = '\0';
+
+ ret = parse_subpart(next_subpart, buf);
+ if (ret)
+ goto fail;
+
+ newparts->nr_subparts++;
+ next_subpart = &(*next_subpart)->next_subpart;
+ }
+
+ if (!newparts->subpart) {
+ pr_warn("cmdline partition has no valid partition.");
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ *parts = newparts;
+
+ return 0;
+fail:
+ free_subpart(newparts);
+ kfree(newparts);
+ return ret;
+}
+
+static void cmdline_parts_free(struct cmdline_parts **parts)
+{
+ struct cmdline_parts *next_parts;
+
+ while (*parts) {
+ next_parts = (*parts)->next_parts;
+ free_subpart(*parts);
+ kfree(*parts);
+ *parts = next_parts;
+ }
+}
+
+static int cmdline_parts_parse(struct cmdline_parts **parts,
+ const char *cmdline)
+{
+ int ret;
+ char *buf;
+ char *pbuf;
+ char *next;
+ struct cmdline_parts **next_parts;
+
+ *parts = NULL;
+
+ next = pbuf = buf = kstrdup(cmdline, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ next_parts = parts;
+
+ while (next && *pbuf) {
+ next = strchr(pbuf, ';');
+ if (next)
+ *next = '\0';
+
+ ret = parse_parts(next_parts, pbuf);
+ if (ret)
+ goto fail;
+
+ if (next)
+ pbuf = ++next;
+
+ next_parts = &(*next_parts)->next_parts;
+ }
+
+ if (!*parts) {
+ pr_warn("cmdline partition has no valid partition.");
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ ret = 0;
+done:
+ kfree(buf);
+ return ret;
+
+fail:
+ cmdline_parts_free(parts);
+ goto done;
+}
+
+static struct cmdline_parts *cmdline_parts_find(struct cmdline_parts *parts,
+ const char *bdev)
+{
+ while (parts && strncmp(bdev, parts->name, sizeof(parts->name)))
+ parts = parts->next_parts;
+ return parts;
+}
static char *cmdline;
static struct cmdline_parts *bdev_parts;
-static int add_part(int slot, struct cmdline_subpart *subpart, void *param)
+static int add_part(int slot, struct cmdline_subpart *subpart,
+ struct parsed_partitions *state)
{
int label_min;
struct partition_meta_info *info;
char tmp[sizeof(info->volname) + 4];
- struct parsed_partitions *state = (struct parsed_partitions *)param;
if (slot >= state->limit)
return 1;
return 0;
}
+static int cmdline_parts_set(struct cmdline_parts *parts, sector_t disk_size,
+ struct parsed_partitions *state)
+{
+ sector_t from = 0;
+ struct cmdline_subpart *subpart;
+ int slot = 1;
+
+ for (subpart = parts->subpart; subpart;
+ subpart = subpart->next_subpart, slot++) {
+ if (subpart->from == (sector_t)(~0ULL))
+ subpart->from = from;
+ else
+ from = subpart->from;
+
+ if (from >= disk_size)
+ break;
+
+ if (subpart->size > (disk_size - from))
+ subpart->size = disk_size - from;
+
+ from += subpart->size;
+
+ if (add_part(slot, subpart, state))
+ break;
+ }
+
+ return slot;
+}
+
static int __init cmdline_parts_setup(char *s)
{
cmdline = s;
int cmdline_partition(struct parsed_partitions *state)
{
sector_t disk_size;
- char bdev[BDEVNAME_SIZE];
struct cmdline_parts *parts;
if (cmdline) {
if (!bdev_parts)
return 0;
- bdevname(state->bdev, bdev);
- parts = cmdline_parts_find(bdev_parts, bdev);
+ parts = cmdline_parts_find(bdev_parts, state->disk->disk_name);
if (!parts)
return 0;
- disk_size = get_capacity(state->bdev->bd_disk) << 9;
+ disk_size = get_capacity(state->disk) << 9;
- cmdline_parts_set(parts, disk_size, 1, add_part, (void *)state);
+ cmdline_parts_set(parts, disk_size, state);
cmdline_parts_verifier(1, state);
strlcat(state->pp_buf, "\n", PAGE_SIZE);
}
state->pp_buf[0] = '\0';
- state->bdev = hd->part0;
- disk_name(hd, 0, state->name);
+ state->disk = hd;
+ snprintf(state->name, BDEVNAME_SIZE, "%s", hd->disk_name);
snprintf(state->pp_buf, PAGE_SIZE, " %s:", state->name);
if (isdigit(state->name[strlen(state->name)-1]))
sprintf(state->name, "p");
static void part_release(struct device *dev)
{
- if (MAJOR(dev->devt) == BLOCK_EXT_MAJOR)
- blk_free_ext_minor(MINOR(dev->devt));
- bdput(dev_to_bdev(dev));
+ put_disk(dev_to_bdev(dev)->bd_disk);
+ iput(dev_to_bdev(dev)->bd_inode);
}
static int part_uevent(struct device *dev, struct kobj_uevent_env *env)
.uevent = part_uevent,
};
-/*
- * Must be called either with open_mutex held, before a disk can be opened or
- * after all disk users are gone.
- */
static void delete_partition(struct block_device *part)
{
+ lockdep_assert_held(&part->bd_disk->open_mutex);
+
fsync_bdev(part);
__invalidate_device(part, true);
if (xa_load(&disk->part_tbl, partno))
return ERR_PTR(-EBUSY);
+ /* ensure we always have a reference to the whole disk */
+ get_device(disk_to_dev(disk));
+
+ err = -ENOMEM;
bdev = bdev_alloc(disk, partno);
if (!bdev)
- return ERR_PTR(-ENOMEM);
+ goto out_put_disk;
bdev->bd_start_sect = start;
bdev_set_nr_sectors(bdev, len);
- if (info) {
- err = -ENOMEM;
- bdev->bd_meta_info = kmemdup(info, sizeof(*info), GFP_KERNEL);
- if (!bdev->bd_meta_info)
- goto out_bdput;
- }
-
pdev = &bdev->bd_device;
dname = dev_name(ddev);
if (isdigit(dname[strlen(dname) - 1]))
}
pdev->devt = devt;
+ if (info) {
+ err = -ENOMEM;
+ bdev->bd_meta_info = kmemdup(info, sizeof(*info), GFP_KERNEL);
+ if (!bdev->bd_meta_info)
+ goto out_put;
+ }
+
/* delay uevent until 'holders' subdir is created */
dev_set_uevent_suppress(pdev, 1);
err = device_add(pdev);
kobject_uevent(&pdev->kobj, KOBJ_ADD);
return bdev;
-out_bdput:
- bdput(bdev);
- return ERR_PTR(err);
out_del:
kobject_put(bdev->bd_holder_dir);
device_del(pdev);
out_put:
put_device(pdev);
+out_put_disk:
+ put_disk(disk);
return ERR_PTR(err);
}
return overlap;
}
-int bdev_add_partition(struct block_device *bdev, int partno,
- sector_t start, sector_t length)
+int bdev_add_partition(struct gendisk *disk, int partno, sector_t start,
+ sector_t length)
{
struct block_device *part;
- struct gendisk *disk = bdev->bd_disk;
int ret;
mutex_lock(&disk->open_mutex);
- if (!(disk->flags & GENHD_FL_UP)) {
+ if (!disk_live(disk)) {
ret = -ENXIO;
goto out;
}
return ret;
}
-int bdev_del_partition(struct block_device *bdev, int partno)
+int bdev_del_partition(struct gendisk *disk, int partno)
{
struct block_device *part = NULL;
int ret = -ENXIO;
- mutex_lock(&bdev->bd_disk->open_mutex);
- part = xa_load(&bdev->bd_disk->part_tbl, partno);
+ mutex_lock(&disk->open_mutex);
+ part = xa_load(&disk->part_tbl, partno);
if (!part)
goto out_unlock;
delete_partition(part);
ret = 0;
out_unlock:
- mutex_unlock(&bdev->bd_disk->open_mutex);
+ mutex_unlock(&disk->open_mutex);
return ret;
}
-int bdev_resize_partition(struct block_device *bdev, int partno,
- sector_t start, sector_t length)
+int bdev_resize_partition(struct gendisk *disk, int partno, sector_t start,
+ sector_t length)
{
struct block_device *part = NULL;
int ret = -ENXIO;
- mutex_lock(&bdev->bd_disk->open_mutex);
- part = xa_load(&bdev->bd_disk->part_tbl, partno);
+ mutex_lock(&disk->open_mutex);
+ part = xa_load(&disk->part_tbl, partno);
if (!part)
goto out_unlock;
goto out_unlock;
ret = -EBUSY;
- if (partition_overlaps(bdev->bd_disk, start, length, partno))
+ if (partition_overlaps(disk, start, length, partno))
goto out_unlock;
bdev_set_nr_sectors(part, length);
ret = 0;
out_unlock:
- mutex_unlock(&bdev->bd_disk->open_mutex);
+ mutex_unlock(&disk->open_mutex);
return ret;
}
lockdep_assert_held(&disk->open_mutex);
- if (!(disk->flags & GENHD_FL_UP))
+ if (!disk_live(disk))
return -ENXIO;
rescan:
void *read_part_sector(struct parsed_partitions *state, sector_t n, Sector *p)
{
- struct address_space *mapping = state->bdev->bd_inode->i_mapping;
+ struct address_space *mapping = state->disk->part0->bd_inode->i_mapping;
struct page *page;
- if (n >= get_capacity(state->bdev->bd_disk)) {
+ if (n >= get_capacity(state->disk)) {
state->access_beyond_eod = true;
return NULL;
}
/**
* last_lba(): return number of last logical block of device
- * @bdev: block device
+ * @disk: block device
*
* Description: Returns last LBA value on success, 0 on error.
* This is stored (by sd and ide-geometry) in
* the part[0] entry for this disk, and is the number of
* physical sectors available on the disk.
*/
-static u64 last_lba(struct block_device *bdev)
+static u64 last_lba(struct gendisk *disk)
{
- if (!bdev || !bdev->bd_inode)
- return 0;
- return div_u64(bdev->bd_inode->i_size,
- bdev_logical_block_size(bdev)) - 1ULL;
+ return div_u64(disk->part0->bd_inode->i_size,
+ queue_logical_block_size(disk->queue)) - 1ULL;
}
static inline int pmbr_part_valid(gpt_mbr_record *part)
* @buffer: destination buffer
* @count: bytes to read
*
- * Description: Reads @count bytes from @state->bdev into @buffer.
+ * Description: Reads @count bytes from @state->disk into @buffer.
* Returns number of bytes read on success, 0 on error.
*/
static size_t read_lba(struct parsed_partitions *state,
u64 lba, u8 *buffer, size_t count)
{
size_t totalreadcount = 0;
- struct block_device *bdev = state->bdev;
- sector_t n = lba * (bdev_logical_block_size(bdev) / 512);
+ sector_t n = lba *
+ (queue_logical_block_size(state->disk->queue) / 512);
- if (!buffer || lba > last_lba(bdev))
+ if (!buffer || lba > last_lba(state->disk))
return 0;
while (count) {
* @lba: the Logical Block Address of the partition table
*
* Description: returns GPT header on success, NULL on error. Allocates
- * and fills a GPT header starting at @ from @state->bdev.
+ * and fills a GPT header starting at @ from @state->disk.
* Note: remember to free gpt when finished with it.
*/
static gpt_header *alloc_read_gpt_header(struct parsed_partitions *state,
u64 lba)
{
gpt_header *gpt;
- unsigned ssz = bdev_logical_block_size(state->bdev);
+ unsigned ssz = queue_logical_block_size(state->disk->queue);
gpt = kmalloc(ssz, GFP_KERNEL);
if (!gpt)
/* Check the GUID Partition Table header size is too big */
if (le32_to_cpu((*gpt)->header_size) >
- bdev_logical_block_size(state->bdev)) {
+ queue_logical_block_size(state->disk->queue)) {
pr_debug("GUID Partition Table Header size is too large: %u > %u\n",
le32_to_cpu((*gpt)->header_size),
- bdev_logical_block_size(state->bdev));
+ queue_logical_block_size(state->disk->queue));
goto fail;
}
/* Check the first_usable_lba and last_usable_lba are
* within the disk.
*/
- lastlba = last_lba(state->bdev);
+ lastlba = last_lba(state->disk);
if (le64_to_cpu((*gpt)->first_usable_lba) > lastlba) {
pr_debug("GPT: first_usable_lba incorrect: %lld > %lld\n",
(unsigned long long)le64_to_cpu((*gpt)->first_usable_lba),
gpt_header *pgpt = NULL, *agpt = NULL;
gpt_entry *pptes = NULL, *aptes = NULL;
legacy_mbr *legacymbr;
- sector_t total_sectors = i_size_read(state->bdev->bd_inode) >> 9;
+ struct gendisk *disk = state->disk;
+ const struct block_device_operations *fops = disk->fops;
+ sector_t total_sectors = get_capacity(state->disk);
u64 lastlba;
if (!ptes)
return 0;
- lastlba = last_lba(state->bdev);
+ lastlba = last_lba(state->disk);
if (!force_gpt) {
/* This will be added to the EFI Spec. per Intel after v1.02. */
legacymbr = kzalloc(sizeof(*legacymbr), GFP_KERNEL);
if (!good_agpt && force_gpt)
good_agpt = is_gpt_valid(state, lastlba, &agpt, &aptes);
+ if (!good_agpt && force_gpt && fops->alternative_gpt_sector) {
+ sector_t agpt_sector;
+ int err;
+
+ err = fops->alternative_gpt_sector(disk, &agpt_sector);
+ if (!err)
+ good_agpt = is_gpt_valid(state, agpt_sector,
+ &agpt, &aptes);
+ }
+
/* The obviously unsuccessful case */
if (!good_pgpt && !good_agpt)
goto fail;
gpt_header *gpt = NULL;
gpt_entry *ptes = NULL;
u32 i;
- unsigned ssz = bdev_logical_block_size(state->bdev) / 512;
+ unsigned ssz = queue_logical_block_size(state->disk->queue) / 512;
if (!find_valid_gpt(state, &gpt, &ptes) || !gpt || !ptes) {
kfree(gpt);
u64 size = le64_to_cpu(ptes[i].ending_lba) -
le64_to_cpu(ptes[i].starting_lba) + 1ULL;
- if (!is_pte_valid(&ptes[i], last_lba(state->bdev)))
+ if (!is_pte_valid(&ptes[i], last_lba(state->disk)))
continue;
put_partition(state, i+1, start * ssz, size * ssz);
int ibm_partition(struct parsed_partitions *state)
{
int (*fn)(struct gendisk *disk, dasd_information2_t *info);
- struct block_device *bdev = state->bdev;
- struct gendisk *disk = bdev->bd_disk;
+ struct gendisk *disk = state->disk;
+ struct block_device *bdev = disk->part0;
int blocksize, res;
loff_t i_size, offset, size;
dasd_information2_t *info;
}
}
- num_sects = state->bdev->bd_inode->i_size >> 9;
+ num_sects = get_capacity(state->disk);
if ((ph[0]->config_start > num_sects) ||
((ph[0]->config_start + ph[0]->config_size) > num_sects)) {
/**
* ldm_validate_tocblocks - Validate the table of contents and its backups
* @state: Partition check state including device holding the LDM Database
- * @base: Offset, into @state->bdev, of the database
+ * @base: Offset, into @state->disk, of the database
* @ldb: Cache of the database structures
*
* Find and compare the four tables of contents of the LDM Database stored on
- * @state->bdev and return the parsed information into @toc1.
+ * @state->disk and return the parsed information into @toc1.
*
* The offsets and sizes of the configs are range-checked against a privhead.
*
* only likely to happen if the underlying device is strange. If that IS
* the case we should return zero to let someone else try.
*
- * Return: 'true' @state->bdev is a dynamic disk
- * 'false' @state->bdev is not a dynamic disk, or an error occurred
+ * Return: 'true' @state->disk is a dynamic disk
+ * 'false' @state->disk is not a dynamic disk, or an error occurred
*/
static bool ldm_validate_partition_table(struct parsed_partitions *state)
{
/**
* ldm_get_vblks - Read the on-disk database of VBLKs into memory
* @state: Partition check state including device holding the LDM Database
- * @base: Offset, into @state->bdev, of the database
+ * @base: Offset, into @state->disk, of the database
* @ldb: Cache of the database structures
*
* To use the information from the VBLKs, they need to be read from the disk,
* example, if the device is hda, we would have: hda1: LDM database, hda2, hda3,
* and so on: the actual data containing partitions.
*
- * Return: 1 Success, @state->bdev is a dynamic disk and we handled it
- * 0 Success, @state->bdev is not a dynamic disk
+ * Return: 1 Success, @state->disk is a dynamic disk and we handled it
+ * 0 Success, @state->disk is not a dynamic disk
* -1 An error occurred before enough information had been read
- * Or @state->bdev is a dynamic disk, but it may be corrupted
+ * Or @state->disk is a dynamic disk, but it may be corrupted
*/
int ldm_partition(struct parsed_partitions *state)
{
}
#ifdef CONFIG_PPC_PMAC
if (found_root_goodness)
- note_bootable_part(state->bdev->bd_dev, found_root,
+ note_bootable_part(state->disk->part0->bd_dev, found_root,
found_root_goodness);
#endif
Sector sect;
unsigned char *data;
sector_t this_sector, this_size;
- sector_t sector_size = bdev_logical_block_size(state->bdev) / 512;
+ sector_t sector_size;
int loopct = 0; /* number of links followed
without finding a data partition */
int i;
+ sector_size = queue_logical_block_size(state->disk->queue) / 512;
this_sector = first_sector;
this_size = first_size;
int msdos_partition(struct parsed_partitions *state)
{
- sector_t sector_size = bdev_logical_block_size(state->bdev) / 512;
+ sector_t sector_size;
Sector sect;
unsigned char *data;
struct msdos_partition *p;
int slot;
u32 disksig;
+ sector_size = queue_logical_block_size(state->disk->queue) / 512;
data = read_part_sector(state, 0, §);
if (!data)
return -1;
Sector sect;
struct sgi_disklabel *label;
struct sgi_partition *p;
- char b[BDEVNAME_SIZE];
label = read_part_sector(state, 0, §);
if (!label)
magic = label->magic_mushroom;
if(be32_to_cpu(magic) != SGI_LABEL_MAGIC) {
/*printk("Dev %s SGI disklabel: bad magic %08x\n",
- bdevname(bdev, b), be32_to_cpu(magic));*/
+ state->disk->disk_name, be32_to_cpu(magic));*/
put_dev_sector(sect);
return 0;
}
}
if(csum) {
printk(KERN_WARNING "Dev %s SGI disklabel: csum bad, label corrupted\n",
- bdevname(state->bdev, b));
+ state->disk->disk_name);
put_dev_sector(sect);
return 0;
}
} * label;
struct sun_partition *p;
unsigned long spc;
- char b[BDEVNAME_SIZE];
int use_vtoc;
int nparts;
p = label->partitions;
if (be16_to_cpu(label->magic) != SUN_LABEL_MAGIC) {
/* printk(KERN_INFO "Dev %s Sun disklabel: bad magic %04x\n",
- bdevname(bdev, b), be16_to_cpu(label->magic)); */
+ state->disk->disk_name, be16_to_cpu(label->magic)); */
put_dev_sector(sect);
return 0;
}
csum ^= *ush--;
if (csum) {
printk("Dev %s Sun disklabel: Csum bad, label corrupted\n",
- bdevname(state->bdev, b));
+ state->disk->disk_name);
put_dev_sector(sect);
return 0;
}
break;
bip_for_each_vec(iv, bip, iter) {
- void *p, *pmap;
unsigned int j;
+ void *p;
- pmap = kmap_atomic(iv.bv_page);
- p = pmap + iv.bv_offset;
+ p = bvec_kmap_local(&iv);
for (j = 0; j < iv.bv_len; j += tuple_sz) {
struct t10_pi_tuple *pi = p;
ref_tag++;
p += tuple_sz;
}
-
- kunmap_atomic(pmap);
+ kunmap_local(p);
}
bip->bip_flags |= BIP_MAPPED_INTEGRITY;
struct bvec_iter iter;
bip_for_each_vec(iv, bip, iter) {
- void *p, *pmap;
unsigned int j;
+ void *p;
- pmap = kmap_atomic(iv.bv_page);
- p = pmap + iv.bv_offset;
+ p = bvec_kmap_local(&iv);
for (j = 0; j < iv.bv_len && intervals; j += tuple_sz) {
struct t10_pi_tuple *pi = p;
intervals--;
p += tuple_sz;
}
-
- kunmap_atomic(pmap);
+ kunmap_local(p);
}
}
}
then the kernel will automatically generate the private key and
certificate as described in Documentation/admin-guide/module-signing.rst
+choice
+ prompt "Type of module signing key to be generated"
+ default MODULE_SIG_KEY_TYPE_RSA
+ help
+ The type of module signing key type to generate. This option
+ does not apply if a #PKCS11 URI is used.
+
+config MODULE_SIG_KEY_TYPE_RSA
+ bool "RSA"
+ depends on MODULE_SIG || (IMA_APPRAISE_MODSIG && MODULES)
+ help
+ Use an RSA key for module signing.
+
+config MODULE_SIG_KEY_TYPE_ECDSA
+ bool "ECDSA"
+ select CRYPTO_ECDSA
+ depends on MODULE_SIG || (IMA_APPRAISE_MODSIG && MODULES)
+ help
+ Use an elliptic curve key (NIST P384) for module signing. Consider
+ using a strong hash like sha256 or sha384 for hashing modules.
+
+ Note: Remove all ECDSA signing keys, e.g. certs/signing_key.pem,
+ when falling back to building Linux 5.14 and older kernels.
+
+endchoice
+
config SYSTEM_TRUSTED_KEYRING
bool "Provide system-wide ring of trusted keys"
depends on KEYS
redirect_openssl = 2>&1
quiet_redirect_openssl = 2>&1
silent_redirect_openssl = 2>/dev/null
+openssl_available = $(shell openssl help 2>/dev/null && echo yes)
# We do it this way rather than having a boolean option for enabling an
# external private key, because 'make randconfig' might enable such a
# boolean option and we unfortunately can't make it depend on !RANDCONFIG.
ifeq ($(CONFIG_MODULE_SIG_KEY),"certs/signing_key.pem")
+
+ifeq ($(openssl_available),yes)
+X509TEXT=$(shell openssl x509 -in "certs/signing_key.pem" -text 2>/dev/null)
+endif
+
+# Support user changing key type
+ifdef CONFIG_MODULE_SIG_KEY_TYPE_ECDSA
+keytype_openssl = -newkey ec -pkeyopt ec_paramgen_curve:secp384r1
+ifeq ($(openssl_available),yes)
+$(if $(findstring id-ecPublicKey,$(X509TEXT)),,$(shell rm -f "certs/signing_key.pem"))
+endif
+endif # CONFIG_MODULE_SIG_KEY_TYPE_ECDSA
+
+ifdef CONFIG_MODULE_SIG_KEY_TYPE_RSA
+ifeq ($(openssl_available),yes)
+$(if $(findstring rsaEncryption,$(X509TEXT)),,$(shell rm -f "certs/signing_key.pem"))
+endif
+endif # CONFIG_MODULE_SIG_KEY_TYPE_RSA
+
$(obj)/signing_key.pem: $(obj)/x509.genkey
@$(kecho) "###"
@$(kecho) "### Now generating an X.509 key pair to be used for signing modules."
-batch -x509 -config $(obj)/x509.genkey \
-outform PEM -out $(obj)/signing_key.pem \
-keyout $(obj)/signing_key.pem \
+ $(keytype_openssl) \
$($(quiet)redirect_openssl)
@$(kecho) "###"
@$(kecho) "### Key pair generated."
config CRYPTO_SM4
tristate "SM4 cipher algorithm"
select CRYPTO_ALGAPI
+ select CRYPTO_LIB_SM4
help
SM4 cipher algorithms (OSCCA GB/T 32907-2016).
If unsure, say N.
+config CRYPTO_SM4_AESNI_AVX_X86_64
+ tristate "SM4 cipher algorithm (x86_64/AES-NI/AVX)"
+ depends on X86 && 64BIT
+ select CRYPTO_SKCIPHER
+ select CRYPTO_SIMD
+ select CRYPTO_ALGAPI
+ select CRYPTO_LIB_SM4
+ help
+ SM4 cipher algorithms (OSCCA GB/T 32907-2016) (x86_64/AES-NI/AVX).
+
+ SM4 (GBT.32907-2016) is a cryptographic standard issued by the
+ Organization of State Commercial Administration of China (OSCCA)
+ as an authorized cryptographic algorithms for the use within China.
+
+ This is SM4 optimized implementation using AES-NI/AVX/x86_64
+ instruction set for block cipher. Through two affine transforms,
+ we can use the AES S-Box to simulate the SM4 S-Box to achieve the
+ effect of instruction acceleration.
+
+ If unsure, say N.
+
+config CRYPTO_SM4_AESNI_AVX2_X86_64
+ tristate "SM4 cipher algorithm (x86_64/AES-NI/AVX2)"
+ depends on X86 && 64BIT
+ select CRYPTO_SKCIPHER
+ select CRYPTO_SIMD
+ select CRYPTO_ALGAPI
+ select CRYPTO_LIB_SM4
+ select CRYPTO_SM4_AESNI_AVX_X86_64
+ help
+ SM4 cipher algorithms (OSCCA GB/T 32907-2016) (x86_64/AES-NI/AVX2).
+
+ SM4 (GBT.32907-2016) is a cryptographic standard issued by the
+ Organization of State Commercial Administration of China (OSCCA)
+ as an authorized cryptographic algorithms for the use within China.
+
+ This is SM4 optimized implementation using AES-NI/AVX2/x86_64
+ instruction set for block cipher. Through two affine transforms,
+ we can use the AES S-Box to simulate the SM4 S-Box to achieve the
+ effect of instruction acceleration.
+
+ If unsure, say N.
+
config CRYPTO_TEA
tristate "TEA, XTEA and XETA cipher algorithms"
depends on CRYPTO_USER_API_ENABLE_OBSOLETE
obj-$(CONFIG_CRYPTO_MD4) += md4.o
obj-$(CONFIG_CRYPTO_MD5) += md5.o
obj-$(CONFIG_CRYPTO_RMD160) += rmd160.o
-obj-$(CONFIG_CRYPTO_RMD320) += rmd320.o
obj-$(CONFIG_CRYPTO_SHA1) += sha1_generic.o
obj-$(CONFIG_CRYPTO_SHA256) += sha256_generic.o
obj-$(CONFIG_CRYPTO_SHA512) += sha512_generic.o
ctx->sinfo->sig->pkey_algo = "rsa";
ctx->sinfo->sig->encoding = "pkcs1";
break;
+ case OID_id_ecdsa_with_sha1:
+ case OID_id_ecdsa_with_sha224:
+ case OID_id_ecdsa_with_sha256:
+ case OID_id_ecdsa_with_sha384:
+ case OID_id_ecdsa_with_sha512:
+ ctx->sinfo->sig->pkey_algo = "ecdsa";
+ ctx->sinfo->sig->encoding = "x962";
+ break;
default:
printk("Unsupported pkey algo: %u\n", ctx->last_oid);
return -ENOPKG;
#define _CRYPTO_ECC_H
#include <crypto/ecc_curve.h>
+#include <asm/unaligned.h>
/* One digit is u64 qword. */
#define ECC_CURVE_NIST_P192_DIGITS 3
* @out: Output array
* @ndigits: Number of digits to copy
*/
-static inline void ecc_swap_digits(const u64 *in, u64 *out, unsigned int ndigits)
+static inline void ecc_swap_digits(const void *in, u64 *out, unsigned int ndigits)
{
const __be64 *src = (__force __be64 *)in;
int i;
for (i = 0; i < ndigits; i++)
- out[i] = be64_to_cpu(src[ndigits - 1 - i]);
+ out[i] = get_unaligned_be64(&src[ndigits - 1 - i]);
}
/**
state[0] += a; state[1] += b; state[2] += c; state[3] += d;
state[4] += e; state[5] += f; state[6] += g; state[7] += h;
-
- /* erase our data */
- a = b = c = d = e = f = g = h = t1 = t2 = 0;
}
static void sha512_generic_block_fn(struct sha512_state *sst, u8 const *src,
static int skcipher_walk_first(struct skcipher_walk *walk)
{
- if (WARN_ON_ONCE(in_irq()))
+ if (WARN_ON_ONCE(in_hardirq()))
return -EDEADLK;
walk->buffer = NULL;
#include <asm/byteorder.h>
#include <asm/unaligned.h>
-static const u32 fk[4] = {
- 0xa3b1bac6, 0x56aa3350, 0x677d9197, 0xb27022dc
-};
-
-static const u8 sbox[256] = {
- 0xd6, 0x90, 0xe9, 0xfe, 0xcc, 0xe1, 0x3d, 0xb7,
- 0x16, 0xb6, 0x14, 0xc2, 0x28, 0xfb, 0x2c, 0x05,
- 0x2b, 0x67, 0x9a, 0x76, 0x2a, 0xbe, 0x04, 0xc3,
- 0xaa, 0x44, 0x13, 0x26, 0x49, 0x86, 0x06, 0x99,
- 0x9c, 0x42, 0x50, 0xf4, 0x91, 0xef, 0x98, 0x7a,
- 0x33, 0x54, 0x0b, 0x43, 0xed, 0xcf, 0xac, 0x62,
- 0xe4, 0xb3, 0x1c, 0xa9, 0xc9, 0x08, 0xe8, 0x95,
- 0x80, 0xdf, 0x94, 0xfa, 0x75, 0x8f, 0x3f, 0xa6,
- 0x47, 0x07, 0xa7, 0xfc, 0xf3, 0x73, 0x17, 0xba,
- 0x83, 0x59, 0x3c, 0x19, 0xe6, 0x85, 0x4f, 0xa8,
- 0x68, 0x6b, 0x81, 0xb2, 0x71, 0x64, 0xda, 0x8b,
- 0xf8, 0xeb, 0x0f, 0x4b, 0x70, 0x56, 0x9d, 0x35,
- 0x1e, 0x24, 0x0e, 0x5e, 0x63, 0x58, 0xd1, 0xa2,
- 0x25, 0x22, 0x7c, 0x3b, 0x01, 0x21, 0x78, 0x87,
- 0xd4, 0x00, 0x46, 0x57, 0x9f, 0xd3, 0x27, 0x52,
- 0x4c, 0x36, 0x02, 0xe7, 0xa0, 0xc4, 0xc8, 0x9e,
- 0xea, 0xbf, 0x8a, 0xd2, 0x40, 0xc7, 0x38, 0xb5,
- 0xa3, 0xf7, 0xf2, 0xce, 0xf9, 0x61, 0x15, 0xa1,
- 0xe0, 0xae, 0x5d, 0xa4, 0x9b, 0x34, 0x1a, 0x55,
- 0xad, 0x93, 0x32, 0x30, 0xf5, 0x8c, 0xb1, 0xe3,
- 0x1d, 0xf6, 0xe2, 0x2e, 0x82, 0x66, 0xca, 0x60,
- 0xc0, 0x29, 0x23, 0xab, 0x0d, 0x53, 0x4e, 0x6f,
- 0xd5, 0xdb, 0x37, 0x45, 0xde, 0xfd, 0x8e, 0x2f,
- 0x03, 0xff, 0x6a, 0x72, 0x6d, 0x6c, 0x5b, 0x51,
- 0x8d, 0x1b, 0xaf, 0x92, 0xbb, 0xdd, 0xbc, 0x7f,
- 0x11, 0xd9, 0x5c, 0x41, 0x1f, 0x10, 0x5a, 0xd8,
- 0x0a, 0xc1, 0x31, 0x88, 0xa5, 0xcd, 0x7b, 0xbd,
- 0x2d, 0x74, 0xd0, 0x12, 0xb8, 0xe5, 0xb4, 0xb0,
- 0x89, 0x69, 0x97, 0x4a, 0x0c, 0x96, 0x77, 0x7e,
- 0x65, 0xb9, 0xf1, 0x09, 0xc5, 0x6e, 0xc6, 0x84,
- 0x18, 0xf0, 0x7d, 0xec, 0x3a, 0xdc, 0x4d, 0x20,
- 0x79, 0xee, 0x5f, 0x3e, 0xd7, 0xcb, 0x39, 0x48
-};
-
-static const u32 ck[] = {
- 0x00070e15, 0x1c232a31, 0x383f464d, 0x545b6269,
- 0x70777e85, 0x8c939aa1, 0xa8afb6bd, 0xc4cbd2d9,
- 0xe0e7eef5, 0xfc030a11, 0x181f262d, 0x343b4249,
- 0x50575e65, 0x6c737a81, 0x888f969d, 0xa4abb2b9,
- 0xc0c7ced5, 0xdce3eaf1, 0xf8ff060d, 0x141b2229,
- 0x30373e45, 0x4c535a61, 0x686f767d, 0x848b9299,
- 0xa0a7aeb5, 0xbcc3cad1, 0xd8dfe6ed, 0xf4fb0209,
- 0x10171e25, 0x2c333a41, 0x484f565d, 0x646b7279
-};
-
-static u32 sm4_t_non_lin_sub(u32 x)
-{
- int i;
- u8 *b = (u8 *)&x;
-
- for (i = 0; i < 4; ++i)
- b[i] = sbox[b[i]];
-
- return x;
-}
-
-static u32 sm4_key_lin_sub(u32 x)
-{
- return x ^ rol32(x, 13) ^ rol32(x, 23);
-
-}
-
-static u32 sm4_enc_lin_sub(u32 x)
-{
- return x ^ rol32(x, 2) ^ rol32(x, 10) ^ rol32(x, 18) ^ rol32(x, 24);
-}
-
-static u32 sm4_key_sub(u32 x)
-{
- return sm4_key_lin_sub(sm4_t_non_lin_sub(x));
-}
-
-static u32 sm4_enc_sub(u32 x)
-{
- return sm4_enc_lin_sub(sm4_t_non_lin_sub(x));
-}
-
-static u32 sm4_round(const u32 *x, const u32 rk)
-{
- return x[0] ^ sm4_enc_sub(x[1] ^ x[2] ^ x[3] ^ rk);
-}
-
-
/**
- * crypto_sm4_expand_key - Expands the SM4 key as described in GB/T 32907-2016
- * @ctx: The location where the computed key will be stored.
- * @in_key: The supplied key.
- * @key_len: The length of the supplied key.
- *
- * Returns 0 on success. The function fails only if an invalid key size (or
- * pointer) is supplied.
- */
-int crypto_sm4_expand_key(struct crypto_sm4_ctx *ctx, const u8 *in_key,
- unsigned int key_len)
-{
- u32 rk[4], t;
- const u32 *key = (u32 *)in_key;
- int i;
-
- if (key_len != SM4_KEY_SIZE)
- return -EINVAL;
-
- for (i = 0; i < 4; ++i)
- rk[i] = get_unaligned_be32(&key[i]) ^ fk[i];
-
- for (i = 0; i < 32; ++i) {
- t = rk[0] ^ sm4_key_sub(rk[1] ^ rk[2] ^ rk[3] ^ ck[i]);
- ctx->rkey_enc[i] = t;
- rk[0] = rk[1];
- rk[1] = rk[2];
- rk[2] = rk[3];
- rk[3] = t;
- }
-
- for (i = 0; i < 32; ++i)
- ctx->rkey_dec[i] = ctx->rkey_enc[31 - i];
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(crypto_sm4_expand_key);
-
-/**
- * crypto_sm4_set_key - Set the SM4 key.
+ * sm4_setkey - Set the SM4 key.
* @tfm: The %crypto_tfm that is used in the context.
* @in_key: The input key.
* @key_len: The size of the key.
*
- * This function uses crypto_sm4_expand_key() to expand the key.
- * &crypto_sm4_ctx _must_ be the private data embedded in @tfm which is
+ * This function uses sm4_expandkey() to expand the key.
+ * &sm4_ctx _must_ be the private data embedded in @tfm which is
* retrieved with crypto_tfm_ctx().
*
* Return: 0 on success; -EINVAL on failure (only happens for bad key lengths)
*/
-int crypto_sm4_set_key(struct crypto_tfm *tfm, const u8 *in_key,
+static int sm4_setkey(struct crypto_tfm *tfm, const u8 *in_key,
unsigned int key_len)
{
- struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm);
-
- return crypto_sm4_expand_key(ctx, in_key, key_len);
-}
-EXPORT_SYMBOL_GPL(crypto_sm4_set_key);
-
-static void sm4_do_crypt(const u32 *rk, u32 *out, const u32 *in)
-{
- u32 x[4], i, t;
-
- for (i = 0; i < 4; ++i)
- x[i] = get_unaligned_be32(&in[i]);
-
- for (i = 0; i < 32; ++i) {
- t = sm4_round(x, rk[i]);
- x[0] = x[1];
- x[1] = x[2];
- x[2] = x[3];
- x[3] = t;
- }
+ struct sm4_ctx *ctx = crypto_tfm_ctx(tfm);
- for (i = 0; i < 4; ++i)
- put_unaligned_be32(x[3 - i], &out[i]);
+ return sm4_expandkey(ctx, in_key, key_len);
}
/* encrypt a block of text */
-void crypto_sm4_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
+static void sm4_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
- const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm);
+ const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm);
- sm4_do_crypt(ctx->rkey_enc, (u32 *)out, (u32 *)in);
+ sm4_crypt_block(ctx->rkey_enc, out, in);
}
-EXPORT_SYMBOL_GPL(crypto_sm4_encrypt);
/* decrypt a block of text */
-void crypto_sm4_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
+static void sm4_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
{
- const struct crypto_sm4_ctx *ctx = crypto_tfm_ctx(tfm);
+ const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm);
- sm4_do_crypt(ctx->rkey_dec, (u32 *)out, (u32 *)in);
+ sm4_crypt_block(ctx->rkey_dec, out, in);
}
-EXPORT_SYMBOL_GPL(crypto_sm4_decrypt);
static struct crypto_alg sm4_alg = {
.cra_name = "sm4",
.cra_priority = 100,
.cra_flags = CRYPTO_ALG_TYPE_CIPHER,
.cra_blocksize = SM4_BLOCK_SIZE,
- .cra_ctxsize = sizeof(struct crypto_sm4_ctx),
+ .cra_ctxsize = sizeof(struct sm4_ctx),
.cra_module = THIS_MODULE,
.cra_u = {
.cipher = {
.cia_min_keysize = SM4_KEY_SIZE,
.cia_max_keysize = SM4_KEY_SIZE,
- .cia_setkey = crypto_sm4_set_key,
- .cia_encrypt = crypto_sm4_encrypt,
- .cia_decrypt = crypto_sm4_decrypt
+ .cia_setkey = sm4_setkey,
+ .cia_encrypt = sm4_encrypt,
+ .cia_decrypt = sm4_decrypt
}
}
};
NULL
};
-static const int block_sizes[] = { 16, 64, 256, 1024, 1420, 4096, 0 };
+static const int block_sizes[] = { 16, 64, 128, 256, 1024, 1420, 4096, 0 };
static const int aead_sizes[] = { 16, 64, 256, 512, 1024, 1420, 4096, 8192, 0 };
#define XBUFSIZE 8
}
ret = crypto_aead_setauthsize(tfm, authsize);
+ if (ret) {
+ pr_err("alg: aead: Failed to setauthsize for %s: %d\n", algo,
+ ret);
+ goto out_free_tfm;
+ }
for (i = 0; i < num_mb; ++i)
if (testmgr_alloc_buf(data[i].xbuf)) {
for (i = 0; i < num_mb; ++i) {
data[i].req = aead_request_alloc(tfm, GFP_KERNEL);
if (!data[i].req) {
- pr_err("alg: skcipher: Failed to allocate request for %s\n",
+ pr_err("alg: aead: Failed to allocate request for %s\n",
algo);
while (i--)
aead_request_free(data[i].req);
sgout = &sg[9];
tfm = crypto_alloc_aead(algo, 0, 0);
-
if (IS_ERR(tfm)) {
pr_err("alg: aead: Failed to load transform for %s: %ld\n", algo,
PTR_ERR(tfm));
goto out_notfm;
}
+ ret = crypto_aead_setauthsize(tfm, authsize);
+ if (ret) {
+ pr_err("alg: aead: Failed to setauthsize for %s: %d\n", algo,
+ ret);
+ goto out_noreq;
+ }
+
crypto_init_wait(&wait);
printk(KERN_INFO "\ntesting speed of %s (%s) %s\n", algo,
get_driver_name(crypto_aead, tfm), e);
break;
}
}
+
ret = crypto_aead_setkey(tfm, key, *keysize);
- ret = crypto_aead_setauthsize(tfm, authsize);
+ if (ret) {
+ pr_err("setkey() failed flags=%x: %d\n",
+ crypto_aead_get_flags(tfm), ret);
+ goto out;
+ }
iv_len = crypto_aead_ivsize(tfm);
if (iv_len)
printk(KERN_INFO "test %u (%d bit key, %d byte blocks): ",
i, *keysize * 8, bs);
-
memset(tvmem[0], 0xff, PAGE_SIZE);
- if (ret) {
- pr_err("setkey() failed flags=%x\n",
- crypto_aead_get_flags(tfm));
- goto out;
- }
-
sg_init_aead(sg, xbuf, bs + (enc ? 0 : authsize),
assoc, aad_size);
ret += tcrypt_test("streebog512");
break;
+ case 55:
+ ret += tcrypt_test("gcm(sm4)");
+ break;
+
+ case 56:
+ ret += tcrypt_test("ccm(sm4)");
+ break;
+
case 100:
ret += tcrypt_test("hmac(md5)");
break;
case 157:
ret += tcrypt_test("authenc(hmac(sha1),ecb(cipher_null))");
break;
+
+ case 158:
+ ret += tcrypt_test("cbcmac(sm4)");
+ break;
+
+ case 159:
+ ret += tcrypt_test("cmac(sm4)");
+ break;
+
case 181:
ret += tcrypt_test("authenc(hmac(sha1),cbc(des))");
break;
case 191:
ret += tcrypt_test("ecb(sm4)");
ret += tcrypt_test("cbc(sm4)");
+ ret += tcrypt_test("cfb(sm4)");
ret += tcrypt_test("ctr(sm4)");
break;
case 200:
speed_template_16);
test_cipher_speed("cbc(sm4)", DECRYPT, sec, NULL, 0,
speed_template_16);
+ test_cipher_speed("cfb(sm4)", ENCRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_cipher_speed("cfb(sm4)", DECRYPT, sec, NULL, 0,
+ speed_template_16);
test_cipher_speed("ctr(sm4)", ENCRYPT, sec, NULL, 0,
speed_template_16);
test_cipher_speed("ctr(sm4)", DECRYPT, sec, NULL, 0,
NULL, 0, 16, 8, speed_template_16);
break;
+ case 222:
+ test_aead_speed("gcm(sm4)", ENCRYPT, sec,
+ NULL, 0, 16, 8, speed_template_16);
+ test_aead_speed("gcm(sm4)", DECRYPT, sec,
+ NULL, 0, 16, 8, speed_template_16);
+ break;
+
+ case 223:
+ test_aead_speed("rfc4309(ccm(sm4))", ENCRYPT, sec,
+ NULL, 0, 16, 16, aead_speed_template_19);
+ test_aead_speed("rfc4309(ccm(sm4))", DECRYPT, sec,
+ NULL, 0, 16, 16, aead_speed_template_19);
+ break;
+
+ case 224:
+ test_mb_aead_speed("gcm(sm4)", ENCRYPT, sec, NULL, 0, 16, 8,
+ speed_template_16, num_mb);
+ test_mb_aead_speed("gcm(sm4)", DECRYPT, sec, NULL, 0, 16, 8,
+ speed_template_16, num_mb);
+ break;
+
+ case 225:
+ test_mb_aead_speed("rfc4309(ccm(sm4))", ENCRYPT, sec, NULL, 0,
+ 16, 16, aead_speed_template_19, num_mb);
+ test_mb_aead_speed("rfc4309(ccm(sm4))", DECRYPT, sec, NULL, 0,
+ 16, 16, aead_speed_template_19, num_mb);
+ break;
+
case 300:
if (alg) {
test_hash_speed(alg, sec, generic_hash_speed_template);
speed_template_8_32);
break;
+ case 518:
+ test_acipher_speed("ecb(sm4)", ENCRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_acipher_speed("ecb(sm4)", DECRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_acipher_speed("cbc(sm4)", ENCRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_acipher_speed("cbc(sm4)", DECRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_acipher_speed("cfb(sm4)", ENCRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_acipher_speed("cfb(sm4)", DECRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_acipher_speed("ctr(sm4)", ENCRYPT, sec, NULL, 0,
+ speed_template_16);
+ test_acipher_speed("ctr(sm4)", DECRYPT, sec, NULL, 0,
+ speed_template_16);
+ break;
+
case 600:
test_mb_skcipher_speed("ecb(aes)", ENCRYPT, sec, NULL, 0,
speed_template_16_24_32, num_mb);
.suite = {
.hash = __VECS(aes_cbcmac_tv_template)
}
+ }, {
+ .alg = "cbcmac(sm4)",
+ .test = alg_test_hash,
+ .suite = {
+ .hash = __VECS(sm4_cbcmac_tv_template)
+ }
}, {
.alg = "ccm(aes)",
.generic_driver = "ccm_base(ctr(aes-generic),cbcmac(aes-generic))",
.einval_allowed = 1,
}
}
+ }, {
+ .alg = "ccm(sm4)",
+ .generic_driver = "ccm_base(ctr(sm4-generic),cbcmac(sm4-generic))",
+ .test = alg_test_aead,
+ .suite = {
+ .aead = {
+ ____VECS(sm4_ccm_tv_template),
+ .einval_allowed = 1,
+ }
+ }
}, {
.alg = "cfb(aes)",
.test = alg_test_skcipher,
.suite = {
.hash = __VECS(des3_ede_cmac64_tv_template)
}
+ }, {
+ .alg = "cmac(sm4)",
+ .test = alg_test_hash,
+ .suite = {
+ .hash = __VECS(sm4_cmac128_tv_template)
+ }
}, {
.alg = "compress_null",
.test = alg_test_null,
.suite = {
.aead = __VECS(aes_gcm_tv_template)
}
+ }, {
+ .alg = "gcm(sm4)",
+ .generic_driver = "gcm_base(ctr(sm4-generic),ghash-generic)",
+ .test = alg_test_aead,
+ .suite = {
+ .aead = __VECS(sm4_gcm_tv_template)
+ }
}, {
.alg = "ghash",
.test = alg_test_hash,
}
};
+static const struct aead_testvec sm4_gcm_tv_template[] = {
+ { /* From https://datatracker.ietf.org/doc/html/rfc8998#appendix-A.1 */
+ .key = "\x01\x23\x45\x67\x89\xAB\xCD\xEF"
+ "\xFE\xDC\xBA\x98\x76\x54\x32\x10",
+ .klen = 16,
+ .iv = "\x00\x00\x12\x34\x56\x78\x00\x00"
+ "\x00\x00\xAB\xCD",
+ .ptext = "\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xAA"
+ "\xBB\xBB\xBB\xBB\xBB\xBB\xBB\xBB"
+ "\xCC\xCC\xCC\xCC\xCC\xCC\xCC\xCC"
+ "\xDD\xDD\xDD\xDD\xDD\xDD\xDD\xDD"
+ "\xEE\xEE\xEE\xEE\xEE\xEE\xEE\xEE"
+ "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF"
+ "\xEE\xEE\xEE\xEE\xEE\xEE\xEE\xEE"
+ "\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xAA",
+ .plen = 64,
+ .assoc = "\xFE\xED\xFA\xCE\xDE\xAD\xBE\xEF"
+ "\xFE\xED\xFA\xCE\xDE\xAD\xBE\xEF"
+ "\xAB\xAD\xDA\xD2",
+ .alen = 20,
+ .ctext = "\x17\xF3\x99\xF0\x8C\x67\xD5\xEE"
+ "\x19\xD0\xDC\x99\x69\xC4\xBB\x7D"
+ "\x5F\xD4\x6F\xD3\x75\x64\x89\x06"
+ "\x91\x57\xB2\x82\xBB\x20\x07\x35"
+ "\xD8\x27\x10\xCA\x5C\x22\xF0\xCC"
+ "\xFA\x7C\xBF\x93\xD4\x96\xAC\x15"
+ "\xA5\x68\x34\xCB\xCF\x98\xC3\x97"
+ "\xB4\x02\x4A\x26\x91\x23\x3B\x8D"
+ "\x83\xDE\x35\x41\xE4\xC2\xB5\x81"
+ "\x77\xE0\x65\xA9\xBF\x7B\x62\xEC",
+ .clen = 80,
+ }
+};
+
+static const struct aead_testvec sm4_ccm_tv_template[] = {
+ { /* From https://datatracker.ietf.org/doc/html/rfc8998#appendix-A.2 */
+ .key = "\x01\x23\x45\x67\x89\xAB\xCD\xEF"
+ "\xFE\xDC\xBA\x98\x76\x54\x32\x10",
+ .klen = 16,
+ .iv = "\x02\x00\x00\x12\x34\x56\x78\x00"
+ "\x00\x00\x00\xAB\xCD\x00\x00\x00",
+ .ptext = "\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xAA"
+ "\xBB\xBB\xBB\xBB\xBB\xBB\xBB\xBB"
+ "\xCC\xCC\xCC\xCC\xCC\xCC\xCC\xCC"
+ "\xDD\xDD\xDD\xDD\xDD\xDD\xDD\xDD"
+ "\xEE\xEE\xEE\xEE\xEE\xEE\xEE\xEE"
+ "\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF"
+ "\xEE\xEE\xEE\xEE\xEE\xEE\xEE\xEE"
+ "\xAA\xAA\xAA\xAA\xAA\xAA\xAA\xAA",
+ .plen = 64,
+ .assoc = "\xFE\xED\xFA\xCE\xDE\xAD\xBE\xEF"
+ "\xFE\xED\xFA\xCE\xDE\xAD\xBE\xEF"
+ "\xAB\xAD\xDA\xD2",
+ .alen = 20,
+ .ctext = "\x48\xAF\x93\x50\x1F\xA6\x2A\xDB"
+ "\xCD\x41\x4C\xCE\x60\x34\xD8\x95"
+ "\xDD\xA1\xBF\x8F\x13\x2F\x04\x20"
+ "\x98\x66\x15\x72\xE7\x48\x30\x94"
+ "\xFD\x12\xE5\x18\xCE\x06\x2C\x98"
+ "\xAC\xEE\x28\xD9\x5D\xF4\x41\x6B"
+ "\xED\x31\xA2\xF0\x44\x76\xC1\x8B"
+ "\xB4\x0C\x84\xA7\x4B\x97\xDC\x5B"
+ "\x16\x84\x2D\x4F\xA1\x86\xF5\x6A"
+ "\xB3\x32\x56\x97\x1F\xA1\x10\xF4",
+ .clen = 80,
+ }
+};
+
+static const struct hash_testvec sm4_cbcmac_tv_template[] = {
+ {
+ .key = "\xff\xee\xdd\xcc\xbb\xaa\x99\x88"
+ "\x77\x66\x55\x44\x33\x22\x11\x00",
+ .plaintext = "\x01\x23\x45\x67\x89\xab\xcd\xef"
+ "\xfe\xdc\xba\x98\x76\x54\x32\x10",
+ .digest = "\x97\xb4\x75\x8f\x84\x92\x3d\x3f"
+ "\x86\x81\x0e\x0e\xea\x14\x6d\x73",
+ .psize = 16,
+ .ksize = 16,
+ }, {
+ .key = "\x01\x23\x45\x67\x89\xab\xcd\xef"
+ "\xfe\xdc\xBA\x98\x76\x54\x32\x10",
+ .plaintext = "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa"
+ "\xbb\xbb\xbb\xbb\xbb\xbb\xbb\xbb"
+ "\xcc\xcc\xcc\xcc\xcc\xcc\xcc\xcc"
+ "\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd"
+ "\xee",
+ .digest = "\xc7\xdb\x17\x71\xa1\x5c\x0d\x22"
+ "\xa3\x39\x3a\x31\x88\x91\x49\xa1",
+ .psize = 33,
+ .ksize = 16,
+ }, {
+ .key = "\x01\x23\x45\x67\x89\xab\xcd\xef"
+ "\xfe\xdc\xBA\x98\x76\x54\x32\x10",
+ .plaintext = "\xfb\xd1\xbe\x92\x7e\x50\x3f\x16"
+ "\xf9\xdd\xbe\x91\x73\x53\x37\x1a"
+ "\xfe\xdd\xba\x97\x7e\x53\x3c\x1c"
+ "\xfe\xd7\xbf\x9c\x75\x5f\x3e\x11"
+ "\xf0\xd8\xbc\x96\x73\x5c\x34\x11"
+ "\xf5\xdb\xb1\x99\x7a\x5a\x32\x1f"
+ "\xf6\xdf\xb4\x95\x7f\x5f\x3b\x17"
+ "\xfd\xdb\xb1\x9b\x76\x5c\x37",
+ .digest = "\x9b\x07\x88\x7f\xd5\x95\x23\x12"
+ "\x64\x0a\x66\x7f\x4e\x25\xca\xd0",
+ .psize = 63,
+ .ksize = 16,
+ }
+};
+
+static const struct hash_testvec sm4_cmac128_tv_template[] = {
+ {
+ .key = "\xff\xee\xdd\xcc\xbb\xaa\x99\x88"
+ "\x77\x66\x55\x44\x33\x22\x11\x00",
+ .plaintext = "\x01\x23\x45\x67\x89\xab\xcd\xef"
+ "\xfe\xdc\xba\x98\x76\x54\x32\x10",
+ .digest = "\x00\xd4\x63\xb4\x9a\xf3\x52\xe2"
+ "\x74\xa9\x00\x55\x13\x54\x2a\xd1",
+ .psize = 16,
+ .ksize = 16,
+ }, {
+ .key = "\x01\x23\x45\x67\x89\xab\xcd\xef"
+ "\xfe\xdc\xBA\x98\x76\x54\x32\x10",
+ .plaintext = "\xaa\xaa\xaa\xaa\xaa\xaa\xaa\xaa"
+ "\xbb\xbb\xbb\xbb\xbb\xbb\xbb\xbb"
+ "\xcc\xcc\xcc\xcc\xcc\xcc\xcc\xcc"
+ "\xdd\xdd\xdd\xdd\xdd\xdd\xdd\xdd"
+ "\xee",
+ .digest = "\x8a\x8a\xe9\xc0\xc8\x97\x0e\x85"
+ "\x21\x57\x02\x10\x1a\xbf\x9c\xc6",
+ .psize = 33,
+ .ksize = 16,
+ }, {
+ .key = "\x01\x23\x45\x67\x89\xab\xcd\xef"
+ "\xfe\xdc\xBA\x98\x76\x54\x32\x10",
+ .plaintext = "\xfb\xd1\xbe\x92\x7e\x50\x3f\x16"
+ "\xf9\xdd\xbe\x91\x73\x53\x37\x1a"
+ "\xfe\xdd\xba\x97\x7e\x53\x3c\x1c"
+ "\xfe\xd7\xbf\x9c\x75\x5f\x3e\x11"
+ "\xf0\xd8\xbc\x96\x73\x5c\x34\x11"
+ "\xf5\xdb\xb1\x99\x7a\x5a\x32\x1f"
+ "\xf6\xdf\xb4\x95\x7f\x5f\x3b\x17"
+ "\xfd\xdb\xb1\x9b\x76\x5c\x37",
+ .digest = "\x5f\x14\xc9\xa9\x20\xb2\xb4\xf0"
+ "\x76\xe0\xd8\xd6\xdc\x4f\xe1\xbc",
+ .psize = 63,
+ .ksize = 16,
+ }
+};
+
/* Cast6 test vectors from RFC 2612 */
static const struct cipher_testvec cast6_tv_template[] = {
{
0xca2dbf07ad5a8333ULL,
};
-/**
+/*
* The core Whirlpool transform.
*/
source "drivers/isdn/Kconfig"
-source "drivers/lightnvm/Kconfig"
-
# input before char - char/joystick depends on it. As does USB.
source "drivers/input/Kconfig"
obj-$(CONFIG_FB_INTEL) += video/fbdev/intelfb/
obj-$(CONFIG_PARPORT) += parport/
-obj-$(CONFIG_NVM) += lightnvm/
obj-y += base/ block/ misc/ mfd/ nfc/
obj-$(CONFIG_LIBNVDIMM) += nvdimm/
obj-$(CONFIG_DAX) += dax/
struct device_attribute *ahci_sdev_attrs[] = {
&dev_attr_sw_activity,
&dev_attr_unload_heads,
+ &dev_attr_ncq_prio_supported,
&dev_attr_ncq_prio_enable,
NULL
};
MODULE_LICENSE("GPL");
MODULE_VERSION(DRV_VERSION);
+static inline bool ata_dev_print_info(struct ata_device *dev)
+{
+ struct ata_eh_context *ehc = &dev->link->eh_context;
+
+ return ehc->i.flags & ATA_EHI_PRINTINFO;
+}
static bool ata_sstatus_online(u32 sstatus)
{
if (tf->flags & ATA_TFLAG_FUA)
tf->device |= 1 << 7;
- if (dev->flags & ATA_DFLAG_NCQ_PRIO) {
- if (class == IOPRIO_CLASS_RT)
- tf->hob_nsect |= ATA_PRIO_HIGH <<
- ATA_SHIFT_PRIO;
- }
+ if (dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE &&
+ class == IOPRIO_CLASS_RT)
+ tf->hob_nsect |= ATA_PRIO_HIGH << ATA_SHIFT_PRIO;
} else if (dev->flags & ATA_DFLAG_LBA) {
tf->flags |= ATA_TFLAG_LBA;
*/
static int ata_hpa_resize(struct ata_device *dev)
{
- struct ata_eh_context *ehc = &dev->link->eh_context;
- int print_info = ehc->i.flags & ATA_EHI_PRINTINFO;
+ bool print_info = ata_dev_print_info(dev);
bool unlock_hpa = ata_ignore_hpa || dev->flags & ATA_DFLAG_UNLOCK_HPA;
u64 sectors = ata_id_n_sectors(dev->id);
u64 native_sectors;
err_mask = ata_exec_internal(dev, &tf, NULL, DMA_FROM_DEVICE,
buf, sectors * ATA_SECT_SIZE, 0);
- if (err_mask && dma) {
- dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
- ata_dev_warn(dev, "READ LOG DMA EXT failed, trying PIO\n");
- goto retry;
+ if (err_mask) {
+ if (dma) {
+ dev->horkage |= ATA_HORKAGE_NO_DMA_LOG;
+ goto retry;
+ }
+ ata_dev_err(dev, "Read log page 0x%02x failed, Emask 0x%x\n",
+ (unsigned int)page, err_mask);
}
- DPRINTK("EXIT, err_mask=%x\n", err_mask);
return err_mask;
}
*/
err = ata_read_log_page(dev, ATA_LOG_IDENTIFY_DEVICE, 0, ap->sector_buf,
1);
- if (err) {
- ata_dev_info(dev,
- "failed to get Device Identify Log Emask 0x%x\n",
- err);
+ if (err)
return false;
- }
for (i = 0; i < ap->sector_buf[8]; i++) {
if (ap->sector_buf[9 + i] == page)
}
err_mask = ata_read_log_page(dev, ATA_LOG_NCQ_SEND_RECV,
0, ap->sector_buf, 1);
- if (err_mask) {
- ata_dev_dbg(dev,
- "failed to get NCQ Send/Recv Log Emask 0x%x\n",
- err_mask);
- } else {
+ if (!err_mask) {
u8 *cmds = dev->ncq_send_recv_cmds;
dev->flags |= ATA_DFLAG_NCQ_SEND_RECV;
}
err_mask = ata_read_log_page(dev, ATA_LOG_NCQ_NON_DATA,
0, ap->sector_buf, 1);
- if (err_mask) {
- ata_dev_dbg(dev,
- "failed to get NCQ Non-Data Log Emask 0x%x\n",
- err_mask);
- } else {
+ if (!err_mask) {
u8 *cmds = dev->ncq_non_data_cmds;
memcpy(cmds, ap->sector_buf, ATA_LOG_NCQ_NON_DATA_SIZE);
struct ata_port *ap = dev->link->ap;
unsigned int err_mask;
- if (!(dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE)) {
- dev->flags &= ~ATA_DFLAG_NCQ_PRIO;
- return;
- }
-
err_mask = ata_read_log_page(dev,
ATA_LOG_IDENTIFY_DEVICE,
ATA_LOG_SATA_SETTINGS,
ap->sector_buf,
1);
- if (err_mask) {
- ata_dev_dbg(dev,
- "failed to get Identify Device data, Emask 0x%x\n",
- err_mask);
- return;
- }
+ if (err_mask)
+ goto not_supported;
- if (ap->sector_buf[ATA_LOG_NCQ_PRIO_OFFSET] & BIT(3)) {
- dev->flags |= ATA_DFLAG_NCQ_PRIO;
- } else {
- dev->flags &= ~ATA_DFLAG_NCQ_PRIO;
- ata_dev_dbg(dev, "SATA page does not support priority\n");
- }
+ if (!(ap->sector_buf[ATA_LOG_NCQ_PRIO_OFFSET] & BIT(3)))
+ goto not_supported;
+
+ dev->flags |= ATA_DFLAG_NCQ_PRIO;
+
+ return;
+not_supported:
+ dev->flags &= ~ATA_DFLAG_NCQ_PRIO_ENABLE;
+ dev->flags &= ~ATA_DFLAG_NCQ_PRIO;
}
static int ata_dev_config_ncq(struct ata_device *dev,
err = ata_read_log_page(dev, ATA_LOG_IDENTIFY_DEVICE, ATA_LOG_SECURITY,
ap->sector_buf, 1);
- if (err) {
- ata_dev_dbg(dev,
- "failed to read Security Log, Emask 0x%x\n", err);
+ if (err)
return;
- }
trusted_cap = get_unaligned_le64(&ap->sector_buf[40]);
if (!(trusted_cap & (1ULL << 63))) {
dev->flags |= ATA_DFLAG_TRUSTED;
}
+static int ata_dev_config_lba(struct ata_device *dev)
+{
+ struct ata_port *ap = dev->link->ap;
+ const u16 *id = dev->id;
+ const char *lba_desc;
+ char ncq_desc[24];
+ int ret;
+
+ dev->flags |= ATA_DFLAG_LBA;
+
+ if (ata_id_has_lba48(id)) {
+ lba_desc = "LBA48";
+ dev->flags |= ATA_DFLAG_LBA48;
+ if (dev->n_sectors >= (1UL << 28) &&
+ ata_id_has_flush_ext(id))
+ dev->flags |= ATA_DFLAG_FLUSH_EXT;
+ } else {
+ lba_desc = "LBA";
+ }
+
+ /* config NCQ */
+ ret = ata_dev_config_ncq(dev, ncq_desc, sizeof(ncq_desc));
+
+ /* print device info to dmesg */
+ if (ata_msg_drv(ap) && ata_dev_print_info(dev))
+ ata_dev_info(dev,
+ "%llu sectors, multi %u: %s %s\n",
+ (unsigned long long)dev->n_sectors,
+ dev->multi_count, lba_desc, ncq_desc);
+
+ return ret;
+}
+
+static void ata_dev_config_chs(struct ata_device *dev)
+{
+ struct ata_port *ap = dev->link->ap;
+ const u16 *id = dev->id;
+
+ if (ata_id_current_chs_valid(id)) {
+ /* Current CHS translation is valid. */
+ dev->cylinders = id[54];
+ dev->heads = id[55];
+ dev->sectors = id[56];
+ } else {
+ /* Default translation */
+ dev->cylinders = id[1];
+ dev->heads = id[3];
+ dev->sectors = id[6];
+ }
+
+ /* print device info to dmesg */
+ if (ata_msg_drv(ap) && ata_dev_print_info(dev))
+ ata_dev_info(dev,
+ "%llu sectors, multi %u, CHS %u/%u/%u\n",
+ (unsigned long long)dev->n_sectors,
+ dev->multi_count, dev->cylinders,
+ dev->heads, dev->sectors);
+}
+
+static void ata_dev_config_devslp(struct ata_device *dev)
+{
+ u8 *sata_setting = dev->link->ap->sector_buf;
+ unsigned int err_mask;
+ int i, j;
+
+ /*
+ * Check device sleep capability. Get DevSlp timing variables
+ * from SATA Settings page of Identify Device Data Log.
+ */
+ if (!ata_id_has_devslp(dev->id))
+ return;
+
+ err_mask = ata_read_log_page(dev,
+ ATA_LOG_IDENTIFY_DEVICE,
+ ATA_LOG_SATA_SETTINGS,
+ sata_setting, 1);
+ if (err_mask)
+ return;
+
+ dev->flags |= ATA_DFLAG_DEVSLP;
+ for (i = 0; i < ATA_LOG_DEVSLP_SIZE; i++) {
+ j = ATA_LOG_DEVSLP_OFFSET + i;
+ dev->devslp_timing[i] = sata_setting[j];
+ }
+}
+
+static void ata_dev_print_features(struct ata_device *dev)
+{
+ if (!(dev->flags & ATA_DFLAG_FEATURES_MASK))
+ return;
+
+ ata_dev_info(dev,
+ "Features:%s%s%s%s%s\n",
+ dev->flags & ATA_DFLAG_TRUSTED ? " Trust" : "",
+ dev->flags & ATA_DFLAG_DA ? " Dev-Attention" : "",
+ dev->flags & ATA_DFLAG_DEVSLP ? " Dev-Sleep" : "",
+ dev->flags & ATA_DFLAG_NCQ_SEND_RECV ? " NCQ-sndrcv" : "",
+ dev->flags & ATA_DFLAG_NCQ_PRIO ? " NCQ-prio" : "");
+}
+
/**
* ata_dev_configure - Configure the specified ATA/ATAPI device
* @dev: Target device to configure
int ata_dev_configure(struct ata_device *dev)
{
struct ata_port *ap = dev->link->ap;
- struct ata_eh_context *ehc = &dev->link->eh_context;
- int print_info = ehc->i.flags & ATA_EHI_PRINTINFO;
+ bool print_info = ata_dev_print_info(dev);
const u16 *id = dev->id;
unsigned long xfer_mask;
unsigned int err_mask;
dev->multi_count = cnt;
}
- if (ata_id_has_lba(id)) {
- const char *lba_desc;
- char ncq_desc[24];
-
- lba_desc = "LBA";
- dev->flags |= ATA_DFLAG_LBA;
- if (ata_id_has_lba48(id)) {
- dev->flags |= ATA_DFLAG_LBA48;
- lba_desc = "LBA48";
-
- if (dev->n_sectors >= (1UL << 28) &&
- ata_id_has_flush_ext(id))
- dev->flags |= ATA_DFLAG_FLUSH_EXT;
- }
+ /* print device info to dmesg */
+ if (ata_msg_drv(ap) && print_info)
+ ata_dev_info(dev, "%s: %s, %s, max %s\n",
+ revbuf, modelbuf, fwrevbuf,
+ ata_mode_string(xfer_mask));
- /* config NCQ */
- rc = ata_dev_config_ncq(dev, ncq_desc, sizeof(ncq_desc));
+ if (ata_id_has_lba(id)) {
+ rc = ata_dev_config_lba(dev);
if (rc)
return rc;
-
- /* print device info to dmesg */
- if (ata_msg_drv(ap) && print_info) {
- ata_dev_info(dev, "%s: %s, %s, max %s\n",
- revbuf, modelbuf, fwrevbuf,
- ata_mode_string(xfer_mask));
- ata_dev_info(dev,
- "%llu sectors, multi %u: %s %s\n",
- (unsigned long long)dev->n_sectors,
- dev->multi_count, lba_desc, ncq_desc);
- }
} else {
- /* CHS */
-
- /* Default translation */
- dev->cylinders = id[1];
- dev->heads = id[3];
- dev->sectors = id[6];
-
- if (ata_id_current_chs_valid(id)) {
- /* Current CHS translation is valid. */
- dev->cylinders = id[54];
- dev->heads = id[55];
- dev->sectors = id[56];
- }
-
- /* print device info to dmesg */
- if (ata_msg_drv(ap) && print_info) {
- ata_dev_info(dev, "%s: %s, %s, max %s\n",
- revbuf, modelbuf, fwrevbuf,
- ata_mode_string(xfer_mask));
- ata_dev_info(dev,
- "%llu sectors, multi %u, CHS %u/%u/%u\n",
- (unsigned long long)dev->n_sectors,
- dev->multi_count, dev->cylinders,
- dev->heads, dev->sectors);
- }
+ ata_dev_config_chs(dev);
}
- /* Check and mark DevSlp capability. Get DevSlp timing variables
- * from SATA Settings page of Identify Device Data Log.
- */
- if (ata_id_has_devslp(dev->id)) {
- u8 *sata_setting = ap->sector_buf;
- int i, j;
-
- dev->flags |= ATA_DFLAG_DEVSLP;
- err_mask = ata_read_log_page(dev,
- ATA_LOG_IDENTIFY_DEVICE,
- ATA_LOG_SATA_SETTINGS,
- sata_setting,
- 1);
- if (err_mask)
- ata_dev_dbg(dev,
- "failed to get Identify Device Data, Emask 0x%x\n",
- err_mask);
- else
- for (i = 0; i < ATA_LOG_DEVSLP_SIZE; i++) {
- j = ATA_LOG_DEVSLP_OFFSET + i;
- dev->devslp_timing[i] = sata_setting[j];
- }
- }
+ ata_dev_config_devslp(dev);
ata_dev_config_sense_reporting(dev);
ata_dev_config_zac(dev);
ata_dev_config_trusted(dev);
dev->cdb_len = 32;
+
+ if (ata_msg_drv(ap) && print_info)
+ ata_dev_print_features(dev);
}
/* ATAPI-specific feature tests */
have_stop = 1;
}
- if (host->ops->host_stop)
+ if (host->ops && host->ops->host_stop)
have_stop = 1;
if (have_stop) {
ata_scsi_lpm_show, ata_scsi_lpm_store);
EXPORT_SYMBOL_GPL(dev_attr_link_power_management_policy);
+static ssize_t ata_ncq_prio_supported_show(struct device *device,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct scsi_device *sdev = to_scsi_device(device);
+ struct ata_port *ap = ata_shost_to_port(sdev->host);
+ struct ata_device *dev;
+ bool ncq_prio_supported;
+ int rc = 0;
+
+ spin_lock_irq(ap->lock);
+ dev = ata_scsi_find_dev(ap, sdev);
+ if (!dev)
+ rc = -ENODEV;
+ else
+ ncq_prio_supported = dev->flags & ATA_DFLAG_NCQ_PRIO;
+ spin_unlock_irq(ap->lock);
+
+ return rc ? rc : sysfs_emit(buf, "%u\n", ncq_prio_supported);
+}
+
+DEVICE_ATTR(ncq_prio_supported, S_IRUGO, ata_ncq_prio_supported_show, NULL);
+EXPORT_SYMBOL_GPL(dev_attr_ncq_prio_supported);
+
static ssize_t ata_ncq_prio_enable_show(struct device *device,
struct device_attribute *attr,
char *buf)
{
struct scsi_device *sdev = to_scsi_device(device);
- struct ata_port *ap;
+ struct ata_port *ap = ata_shost_to_port(sdev->host);
struct ata_device *dev;
bool ncq_prio_enable;
int rc = 0;
- ap = ata_shost_to_port(sdev->host);
-
spin_lock_irq(ap->lock);
dev = ata_scsi_find_dev(ap, sdev);
- if (!dev) {
+ if (!dev)
rc = -ENODEV;
- goto unlock;
- }
-
- ncq_prio_enable = dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE;
-
-unlock:
+ else
+ ncq_prio_enable = dev->flags & ATA_DFLAG_NCQ_PRIO_ENABLE;
spin_unlock_irq(ap->lock);
return rc ? rc : snprintf(buf, 20, "%u\n", ncq_prio_enable);
struct ata_port *ap;
struct ata_device *dev;
long int input;
- int rc;
+ int rc = 0;
rc = kstrtol(buf, 10, &input);
if (rc)
return -ENODEV;
spin_lock_irq(ap->lock);
+
+ if (!(dev->flags & ATA_DFLAG_NCQ_PRIO)) {
+ rc = -EINVAL;
+ goto unlock;
+ }
+
if (input)
dev->flags |= ATA_DFLAG_NCQ_PRIO_ENABLE;
else
dev->flags &= ~ATA_DFLAG_NCQ_PRIO_ENABLE;
- dev->link->eh_info.action |= ATA_EH_REVALIDATE;
- dev->link->eh_info.flags |= ATA_EHI_QUIET;
- ata_port_schedule_eh(ap);
+unlock:
spin_unlock_irq(ap->lock);
- ata_port_wait_eh(ap);
-
- if (input) {
- spin_lock_irq(ap->lock);
- if (!(dev->flags & ATA_DFLAG_NCQ_PRIO)) {
- dev->flags &= ~ATA_DFLAG_NCQ_PRIO_ENABLE;
- rc = -EIO;
- }
- spin_unlock_irq(ap->lock);
- }
-
return rc ? rc : len;
}
struct device_attribute *ata_ncq_sdev_attrs[] = {
&dev_attr_unload_heads,
&dev_attr_ncq_prio_enable,
+ &dev_attr_ncq_prio_supported,
NULL
};
EXPORT_SYMBOL_GPL(ata_ncq_sdev_attrs);
struct scsi_cmnd *cmd;
};
-/**
- * ata_scsi_rbuf_get - Map response buffer.
- * @cmd: SCSI command containing buffer to be mapped.
- * @flags: unsigned long variable to store irq enable status
- * @copy_in: copy in from user buffer
- *
- * Prepare buffer for simulated SCSI commands.
- *
- * LOCKING:
- * spin_lock_irqsave(ata_scsi_rbuf_lock) on success
- *
- * RETURNS:
- * Pointer to response buffer.
- */
-static void *ata_scsi_rbuf_get(struct scsi_cmnd *cmd, bool copy_in,
- unsigned long *flags)
-{
- spin_lock_irqsave(&ata_scsi_rbuf_lock, *flags);
-
- memset(ata_scsi_rbuf, 0, ATA_SCSI_RBUF_SIZE);
- if (copy_in)
- sg_copy_to_buffer(scsi_sglist(cmd), scsi_sg_count(cmd),
- ata_scsi_rbuf, ATA_SCSI_RBUF_SIZE);
- return ata_scsi_rbuf;
-}
-
-/**
- * ata_scsi_rbuf_put - Unmap response buffer.
- * @cmd: SCSI command containing buffer to be unmapped.
- * @copy_out: copy out result
- * @flags: @flags passed to ata_scsi_rbuf_get()
- *
- * Returns rbuf buffer. The result is copied to @cmd's buffer if
- * @copy_back is true.
- *
- * LOCKING:
- * Unlocks ata_scsi_rbuf_lock.
- */
-static inline void ata_scsi_rbuf_put(struct scsi_cmnd *cmd, bool copy_out,
- unsigned long *flags)
-{
- if (copy_out)
- sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd),
- ata_scsi_rbuf, ATA_SCSI_RBUF_SIZE);
- spin_unlock_irqrestore(&ata_scsi_rbuf_lock, *flags);
-}
-
/**
* ata_scsi_rbuf_fill - wrapper for SCSI command simulators
* @args: device IDENTIFY data / SCSI command of interest.
static void ata_scsi_rbuf_fill(struct ata_scsi_args *args,
unsigned int (*actor)(struct ata_scsi_args *args, u8 *rbuf))
{
- u8 *rbuf;
unsigned int rc;
struct scsi_cmnd *cmd = args->cmd;
unsigned long flags;
- rbuf = ata_scsi_rbuf_get(cmd, false, &flags);
- rc = actor(args, rbuf);
- ata_scsi_rbuf_put(cmd, rc == 0, &flags);
+ spin_lock_irqsave(&ata_scsi_rbuf_lock, flags);
+
+ memset(ata_scsi_rbuf, 0, ATA_SCSI_RBUF_SIZE);
+ rc = actor(args, ata_scsi_rbuf);
+ if (rc == 0)
+ sg_copy_from_buffer(scsi_sglist(cmd), scsi_sg_count(cmd),
+ ata_scsi_rbuf, ATA_SCSI_RBUF_SIZE);
+
+ spin_unlock_irqrestore(&ata_scsi_rbuf_lock, flags);
if (rc == 0)
cmd->result = SAM_STAT_GOOD;
irq = irq_of_parse_and_map(np, 0);
if (irq == NO_IRQ) {
dev_err(&ofdev->dev, "no SATA DMA irq\n");
- err = -ENODEV;
- goto error_out;
+ return -ENODEV;
}
#ifdef CONFIG_SATA_DWC_OLD_DMA
if (!of_find_property(np, "dmas", NULL)) {
err = sata_dwc_dma_init_old(ofdev, hsdev);
if (err)
- goto error_out;
+ return err;
}
#endif
hsdev->phy = devm_phy_optional_get(hsdev->dev, "sata-phy");
- if (IS_ERR(hsdev->phy)) {
- err = PTR_ERR(hsdev->phy);
- hsdev->phy = NULL;
- goto error_out;
- }
+ if (IS_ERR(hsdev->phy))
+ return PTR_ERR(hsdev->phy);
err = phy_init(hsdev->phy);
if (err)
* and the callback to write the MSI message.
*/
struct platform_msi_priv_data {
- struct device *dev;
- void *host_data;
- msi_alloc_info_t arg;
- irq_write_msi_msg_t write_msg;
- int devid;
+ struct device *dev;
+ void *host_data;
+ const struct attribute_group **msi_irq_groups;
+ msi_alloc_info_t arg;
+ irq_write_msi_msg_t write_msg;
+ int devid;
};
/* The devid allocator */
if (err)
goto out_free_desc;
+ priv_data->msi_irq_groups = msi_populate_sysfs(dev);
+ if (IS_ERR(priv_data->msi_irq_groups)) {
+ err = PTR_ERR(priv_data->msi_irq_groups);
+ goto out_free_irqs;
+ }
+
return 0;
+out_free_irqs:
+ msi_domain_free_irqs(dev->msi_domain, dev);
out_free_desc:
platform_msi_free_descs(dev, 0, nvec);
out_free_priv_data:
struct msi_desc *desc;
desc = first_msi_entry(dev);
+ msi_destroy_sysfs(dev, desc->platform.msi_priv_data->msi_irq_groups);
platform_msi_free_priv_data(desc->platform.msi_priv_data);
}
int dev_pm_genpd_set_performance_state(struct device *dev, unsigned int state)
{
struct generic_pm_domain *genpd;
- int ret;
+ int ret = 0;
genpd = dev_to_genpd_safe(dev);
if (!genpd)
return -EINVAL;
genpd_lock(genpd);
- ret = genpd_set_performance_state(dev, state);
+ if (pm_runtime_suspended(dev)) {
+ dev_gpd_data(dev)->rpm_pstate = state;
+ } else {
+ ret = genpd_set_performance_state(dev, state);
+ if (!ret)
+ dev_gpd_data(dev)->rpm_pstate = 0;
+ }
genpd_unlock(genpd);
return ret;
spinlock_t spinlock;
unsigned long spinlock_flags;
};
+ struct {
+ raw_spinlock_t raw_spinlock;
+ unsigned long raw_spinlock_flags;
+ };
};
regmap_lock lock;
regmap_unlock unlock;
char *buf;
char *entry;
int ret;
- unsigned entry_len;
+ unsigned int entry_len;
if (*ppos < 0 || !count)
return -EINVAL;
struct regmap_mmio_context {
void __iomem *regs;
- unsigned val_bytes;
+ unsigned int val_bytes;
bool relaxed_mmio;
bool attached_clk;
spin_unlock_irqrestore(&map->spinlock, map->spinlock_flags);
}
+static void regmap_lock_raw_spinlock(void *__map)
+__acquires(&map->raw_spinlock)
+{
+ struct regmap *map = __map;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&map->raw_spinlock, flags);
+ map->raw_spinlock_flags = flags;
+}
+
+static void regmap_unlock_raw_spinlock(void *__map)
+__releases(&map->raw_spinlock)
+{
+ struct regmap *map = __map;
+ raw_spin_unlock_irqrestore(&map->raw_spinlock, map->raw_spinlock_flags);
+}
+
static void dev_get_regmap_release(struct device *dev, void *res)
{
/*
} else {
if ((bus && bus->fast_io) ||
config->fast_io) {
- spin_lock_init(&map->spinlock);
- map->lock = regmap_lock_spinlock;
- map->unlock = regmap_unlock_spinlock;
- lockdep_set_class_and_name(&map->spinlock,
- lock_key, lock_name);
+ if (config->use_raw_spinlock) {
+ raw_spin_lock_init(&map->raw_spinlock);
+ map->lock = regmap_lock_raw_spinlock;
+ map->unlock = regmap_unlock_raw_spinlock;
+ lockdep_set_class_and_name(&map->raw_spinlock,
+ lock_key, lock_name);
+ } else {
+ spin_lock_init(&map->spinlock);
+ map->lock = regmap_lock_spinlock;
+ map->unlock = regmap_unlock_spinlock;
+ lockdep_set_class_and_name(&map->spinlock,
+ lock_key, lock_name);
+ }
} else {
mutex_init(&map->mutex);
map->lock = regmap_lock_mutex;
/* Make sure, that this register range has no selector
or data window within its boundary */
for (j = 0; j < config->num_ranges; j++) {
- unsigned sel_reg = config->ranges[j].selector_reg;
- unsigned win_min = config->ranges[j].window_start;
- unsigned win_max = win_min +
- config->ranges[j].window_len - 1;
+ unsigned int sel_reg = config->ranges[j].selector_reg;
+ unsigned int win_min = config->ranges[j].window_start;
+ unsigned int win_max = win_min +
+ config->ranges[j].window_len - 1;
/* Allow data window inside its own virtual range */
if (j == i)
*/
int regmap_field_bulk_alloc(struct regmap *regmap,
struct regmap_field **rm_field,
- struct reg_field *reg_field,
+ const struct reg_field *reg_field,
int num_fields)
{
struct regmap_field *rf;
int devm_regmap_field_bulk_alloc(struct device *dev,
struct regmap *regmap,
struct regmap_field **rm_field,
- struct reg_field *reg_field,
+ const struct reg_field *reg_field,
int num_fields)
{
struct regmap_field *rf;
if (ret) {
dev_err(map->dev,
"Error in caching of register: %x ret: %d\n",
- reg + i, ret);
+ reg + regmap_get_offset(map, i), ret);
return ret;
}
}
dynamically allocated with the /dev/loop-control interface.
config BLK_DEV_CRYPTOLOOP
- tristate "Cryptoloop Support"
+ tristate "Cryptoloop Support (DEPRECATED)"
select CRYPTO
select CRYPTO_CBC
depends on BLK_DEV_LOOP
WARNING: This device is not safe for journaled file systems like
ext3 or Reiserfs. Please use the Device Mapper crypto module
instead, which can be configured to be on-disk compatible with the
- cryptoloop device.
+ cryptoloop device. cryptoloop support will be removed in Linux 5.16.
source "drivers/block/drbd/Kconfig"
#include <linux/uaccess.h>
-#define PAGE_SECTORS_SHIFT (PAGE_SHIFT - SECTOR_SHIFT)
-#define PAGE_SECTORS (1 << PAGE_SECTORS_SHIFT)
-
/*
* Each block ramdisk device has a radix_tree brd_pages of pages that stores
* the pages containing the block device's contents. A brd page's ->index is
if (rc)
printk(KERN_ERR "cryptoloop: loop_register_transfer failed\n");
+ else
+ pr_warn("the cryptoloop driver has been deprecated and will be removed in in Linux 5.16\n");
return rc;
}
if (b) {
blk_stack_limits(&q->limits, &b->limits, 0);
- blk_queue_update_readahead(q);
+ disk_update_readahead(device->vdisk);
}
fixup_discard_if_not_supported(q);
fixup_write_zeroes(device, q);
static bool remote_due_to_read_balancing(struct drbd_device *device, sector_t sector,
enum drbd_read_balancing rbm)
{
- struct backing_dev_info *bdi;
int stripe_shift;
switch (rbm) {
case RB_CONGESTED_REMOTE:
- bdi = device->ldev->backing_bdev->bd_disk->queue->backing_dev_info;
- return bdi_read_congested(bdi);
+ return bdi_read_congested(
+ device->ldev->backing_bdev->bd_disk->bdi);
case RB_LEAST_PENDING:
return atomic_read(&device->local_cnt) >
atomic_read(&device->ap_pending_cnt) + atomic_read(&device->rs_pending_cnt);
if (fdc_state[FDC(drive)].rawcmd == 1)
fdc_state[FDC(drive)].rawcmd = 2;
- if (mode & (FMODE_READ|FMODE_WRITE)) {
- drive_state[drive].last_checked = 0;
- clear_bit(FD_OPEN_SHOULD_FAIL_BIT, &drive_state[drive].flags);
- if (bdev_check_media_change(bdev))
- floppy_revalidate(bdev->bd_disk);
- if (test_bit(FD_DISK_CHANGED_BIT, &drive_state[drive].flags))
- goto out;
- if (test_bit(FD_OPEN_SHOULD_FAIL_BIT, &drive_state[drive].flags))
+ if (!(mode & FMODE_NDELAY)) {
+ if (mode & (FMODE_READ|FMODE_WRITE)) {
+ drive_state[drive].last_checked = 0;
+ clear_bit(FD_OPEN_SHOULD_FAIL_BIT,
+ &drive_state[drive].flags);
+ if (bdev_check_media_change(bdev))
+ floppy_revalidate(bdev->bd_disk);
+ if (test_bit(FD_DISK_CHANGED_BIT, &drive_state[drive].flags))
+ goto out;
+ if (test_bit(FD_OPEN_SHOULD_FAIL_BIT, &drive_state[drive].flags))
+ goto out;
+ }
+ res = -EROFS;
+ if ((mode & FMODE_WRITE) &&
+ !test_bit(FD_DISK_WRITABLE_BIT, &drive_state[drive].flags))
goto out;
}
-
- res = -EROFS;
-
- if ((mode & FMODE_WRITE) &&
- !test_bit(FD_DISK_WRITABLE_BIT, &drive_state[drive].flags))
- goto out;
-
mutex_unlock(&open_lock);
mutex_unlock(&floppy_mutex);
return 0;
goto out_err;
/* and ... switch */
+ disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE);
blk_mq_freeze_queue(lo->lo_queue);
mapping_set_gfp_mask(old_file->f_mapping, lo->old_gfp_mask);
lo->lo_backing_file = file;
goto out_unlock;
}
+ disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE);
set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
INIT_WORK(&lo->rootcg_work, loop_rootcg_workfn);
if (partscan)
lo->lo_disk->flags &= ~GENHD_FL_NO_PART_SCAN;
- /* Grab the block_device to prevent its destruction after we
- * put /dev/loopXX inode. Later in __loop_clr_fd() we bdput(bdev).
- */
- bdgrab(bdev);
loop_global_unlock(lo, is_loop);
if (partscan)
loop_reread_partitions(lo);
blk_queue_physical_block_size(lo->lo_queue, 512);
blk_queue_io_min(lo->lo_queue, 512);
if (bdev) {
- bdput(bdev);
invalidate_bdev(bdev);
bdev->bd_inode->i_mapping->wb_err = 0;
}
partscan = lo->lo_flags & LO_FLAGS_PARTSCAN && bdev;
lo_number = lo->lo_number;
+ disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE);
out_unlock:
mutex_unlock(&lo->lo_mutex);
if (partscan) {
lo->tag_set.queue_depth = 128;
lo->tag_set.numa_node = NUMA_NO_NODE;
lo->tag_set.cmd_size = sizeof(struct loop_cmd);
- lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING;
+ lo->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_STACKING |
+ BLK_MQ_F_NO_SCHED_BY_DEFAULT;
lo->tag_set.driver_data = lo;
err = blk_mq_alloc_tag_set(&lo->tag_set);
disk->fops = &lo_fops;
disk->private_data = lo;
disk->queue = lo->lo_queue;
+ disk->events = DISK_EVENT_MEDIA_CHANGE;
+ disk->event_flags = DISK_EVENT_FLAG_UEVENT;
sprintf(disk->disk_name, "loop%d", i);
add_disk(disk);
mutex_unlock(&loop_ctl_mutex);
static DEFINE_IDR(nbd_index_idr);
static DEFINE_MUTEX(nbd_index_mutex);
+static struct workqueue_struct *nbd_del_wq;
static int nbd_total_devices = 0;
struct nbd_sock {
struct mutex config_lock;
struct gendisk *disk;
struct workqueue_struct *recv_workq;
+ struct work_struct remove_work;
struct list_head list;
struct task_struct *task_recv;
struct task_struct *task_setup;
- struct completion *destroy_complete;
unsigned long flags;
char *backend;
{
struct gendisk *disk = nbd->disk;
- if (disk) {
- del_gendisk(disk);
- blk_cleanup_disk(disk);
- blk_mq_free_tag_set(&nbd->tag_set);
- }
+ del_gendisk(disk);
+ blk_cleanup_disk(disk);
+ blk_mq_free_tag_set(&nbd->tag_set);
/*
- * Place this in the last just before the nbd is freed to
- * make sure that the disk and the related kobject are also
- * totally removed to avoid duplicate creation of the same
- * one.
+ * Remove from idr after del_gendisk() completes, so if the same ID is
+ * reused, the following add_disk() will succeed.
*/
- if (test_bit(NBD_DESTROY_ON_DISCONNECT, &nbd->flags) && nbd->destroy_complete)
- complete(nbd->destroy_complete);
+ mutex_lock(&nbd_index_mutex);
+ idr_remove(&nbd_index_idr, nbd->index);
+ mutex_unlock(&nbd_index_mutex);
kfree(nbd);
}
+static void nbd_dev_remove_work(struct work_struct *work)
+{
+ nbd_dev_remove(container_of(work, struct nbd_device, remove_work));
+}
+
static void nbd_put(struct nbd_device *nbd)
{
- if (refcount_dec_and_mutex_lock(&nbd->refs,
- &nbd_index_mutex)) {
- idr_remove(&nbd_index_idr, nbd->index);
+ if (!refcount_dec_and_test(&nbd->refs))
+ return;
+
+ /* Call del_gendisk() asynchrounously to prevent deadlock */
+ if (test_bit(NBD_DESTROY_ON_DISCONNECT, &nbd->flags))
+ queue_work(nbd_del_wq, &nbd->remove_work);
+ else
nbd_dev_remove(nbd);
- mutex_unlock(&nbd_index_mutex);
- }
}
static int nbd_disconnected(struct nbd_config *config)
unsigned int cmd, unsigned long arg)
{
struct nbd_config *config = nbd->config;
+ loff_t bytesize;
switch (cmd) {
case NBD_DISCONNECT:
case NBD_SET_SIZE:
return nbd_set_size(nbd, arg, config->blksize);
case NBD_SET_SIZE_BLOCKS:
- return nbd_set_size(nbd, arg * config->blksize,
- config->blksize);
+ if (check_mul_overflow((loff_t)arg, config->blksize, &bytesize))
+ return -EINVAL;
+ return nbd_set_size(nbd, bytesize, config->blksize);
case NBD_SET_TIMEOUT:
nbd_set_cmd_timeout(nbd, arg);
return 0;
.timeout = nbd_xmit_timeout,
};
-static int nbd_dev_add(int index)
+static struct nbd_device *nbd_dev_add(int index, unsigned int refs)
{
struct nbd_device *nbd;
struct gendisk *disk;
nbd->tag_set.flags = BLK_MQ_F_SHOULD_MERGE |
BLK_MQ_F_BLOCKING;
nbd->tag_set.driver_data = nbd;
- nbd->destroy_complete = NULL;
+ INIT_WORK(&nbd->remove_work, nbd_dev_remove_work);
nbd->backend = NULL;
err = blk_mq_alloc_tag_set(&nbd->tag_set);
if (err)
goto out_free_nbd;
+ mutex_lock(&nbd_index_mutex);
if (index >= 0) {
err = idr_alloc(&nbd_index_idr, nbd, index, index + 1,
GFP_KERNEL);
if (err >= 0)
index = err;
}
+ nbd->index = index;
+ mutex_unlock(&nbd_index_mutex);
if (err < 0)
goto out_free_tags;
- nbd->index = index;
disk = blk_mq_alloc_disk(&nbd->tag_set, NULL);
if (IS_ERR(disk)) {
mutex_init(&nbd->config_lock);
refcount_set(&nbd->config_refs, 0);
- refcount_set(&nbd->refs, 1);
+ /*
+ * Start out with a zero references to keep other threads from using
+ * this device until it is fully initialized.
+ */
+ refcount_set(&nbd->refs, 0);
INIT_LIST_HEAD(&nbd->list);
disk->major = NBD_MAJOR;
+
+ /* Too big first_minor can cause duplicate creation of
+ * sysfs files/links, since first_minor will be truncated to
+ * byte in __device_add_disk().
+ */
disk->first_minor = index << part_shift;
+ if (disk->first_minor > 0xff) {
+ err = -EINVAL;
+ goto out_free_idr;
+ }
+
disk->minors = 1 << part_shift;
disk->fops = &nbd_fops;
disk->private_data = nbd;
sprintf(disk->disk_name, "nbd%d", index);
add_disk(disk);
+
+ /*
+ * Now publish the device.
+ */
+ refcount_set(&nbd->refs, refs);
nbd_total_devices++;
- return index;
+ return nbd;
out_free_idr:
+ mutex_lock(&nbd_index_mutex);
idr_remove(&nbd_index_idr, index);
+ mutex_unlock(&nbd_index_mutex);
out_free_tags:
blk_mq_free_tag_set(&nbd->tag_set);
out_free_nbd:
kfree(nbd);
out:
- return err;
+ return ERR_PTR(err);
}
-static int find_free_cb(int id, void *ptr, void *data)
+static struct nbd_device *nbd_find_get_unused(void)
{
- struct nbd_device *nbd = ptr;
- struct nbd_device **found = data;
+ struct nbd_device *nbd;
+ int id;
- if (!refcount_read(&nbd->config_refs)) {
- *found = nbd;
- return 1;
+ lockdep_assert_held(&nbd_index_mutex);
+
+ idr_for_each_entry(&nbd_index_idr, nbd, id) {
+ if (refcount_read(&nbd->config_refs) ||
+ test_bit(NBD_DESTROY_ON_DISCONNECT, &nbd->flags))
+ continue;
+ if (refcount_inc_not_zero(&nbd->refs))
+ return nbd;
}
- return 0;
+
+ return NULL;
}
/* Netlink interface. */
static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
{
- DECLARE_COMPLETION_ONSTACK(destroy_complete);
- struct nbd_device *nbd = NULL;
+ struct nbd_device *nbd;
struct nbd_config *config;
int index = -1;
int ret;
again:
mutex_lock(&nbd_index_mutex);
if (index == -1) {
- ret = idr_for_each(&nbd_index_idr, &find_free_cb, &nbd);
- if (ret == 0) {
- int new_index;
- new_index = nbd_dev_add(-1);
- if (new_index < 0) {
- mutex_unlock(&nbd_index_mutex);
- printk(KERN_ERR "nbd: failed to add new device\n");
- return new_index;
- }
- nbd = idr_find(&nbd_index_idr, new_index);
- }
+ nbd = nbd_find_get_unused();
} else {
nbd = idr_find(&nbd_index_idr, index);
- if (!nbd) {
- ret = nbd_dev_add(index);
- if (ret < 0) {
+ if (nbd) {
+ if ((test_bit(NBD_DESTROY_ON_DISCONNECT, &nbd->flags) &&
+ test_bit(NBD_DISCONNECT_REQUESTED, &nbd->flags)) ||
+ !refcount_inc_not_zero(&nbd->refs)) {
mutex_unlock(&nbd_index_mutex);
- printk(KERN_ERR "nbd: failed to add new device\n");
- return ret;
+ pr_err("nbd: device at index %d is going down\n",
+ index);
+ return -EINVAL;
}
- nbd = idr_find(&nbd_index_idr, index);
}
}
- if (!nbd) {
- printk(KERN_ERR "nbd: couldn't find device at index %d\n",
- index);
- mutex_unlock(&nbd_index_mutex);
- return -EINVAL;
- }
-
- if (test_bit(NBD_DESTROY_ON_DISCONNECT, &nbd->flags) &&
- test_bit(NBD_DISCONNECT_REQUESTED, &nbd->flags)) {
- nbd->destroy_complete = &destroy_complete;
- mutex_unlock(&nbd_index_mutex);
-
- /* Wait untill the the nbd stuff is totally destroyed */
- wait_for_completion(&destroy_complete);
- goto again;
- }
+ mutex_unlock(&nbd_index_mutex);
- if (!refcount_inc_not_zero(&nbd->refs)) {
- mutex_unlock(&nbd_index_mutex);
- if (index == -1)
- goto again;
- printk(KERN_ERR "nbd: device at index %d is going down\n",
- index);
- return -EINVAL;
+ if (!nbd) {
+ nbd = nbd_dev_add(index, 2);
+ if (IS_ERR(nbd)) {
+ pr_err("nbd: failed to add new device\n");
+ return PTR_ERR(nbd);
+ }
}
- mutex_unlock(&nbd_index_mutex);
mutex_lock(&nbd->config_lock);
if (refcount_read(&nbd->config_refs)) {
if (register_blkdev(NBD_MAJOR, "nbd"))
return -EIO;
+ nbd_del_wq = alloc_workqueue("nbd-del", WQ_UNBOUND, 0);
+ if (!nbd_del_wq) {
+ unregister_blkdev(NBD_MAJOR, "nbd");
+ return -ENOMEM;
+ }
+
if (genl_register_family(&nbd_genl_family)) {
+ destroy_workqueue(nbd_del_wq);
unregister_blkdev(NBD_MAJOR, "nbd");
return -EINVAL;
}
nbd_dbg_init();
- mutex_lock(&nbd_index_mutex);
for (i = 0; i < nbds_max; i++)
- nbd_dev_add(i);
- mutex_unlock(&nbd_index_mutex);
+ nbd_dev_add(i, 1);
return 0;
}
struct list_head *list = (struct list_head *)data;
struct nbd_device *nbd = ptr;
- list_add_tail(&nbd->list, list);
+ /* Skip nbd that is being removed asynchronously */
+ if (refcount_read(&nbd->refs))
+ list_add_tail(&nbd->list, list);
+
return 0;
}
nbd_put(nbd);
}
+ /* Also wait for nbd_dev_remove_work() completes */
+ destroy_workqueue(nbd_del_wq);
+
idr_destroy(&nbd_index_idr);
genl_unregister_family(&nbd_genl_family);
unregister_blkdev(NBD_MAJOR, "nbd");
#include <linux/init.h>
#include "null_blk.h"
-#define PAGE_SECTORS_SHIFT (PAGE_SHIFT - SECTOR_SHIFT)
-#define PAGE_SECTORS (1 << PAGE_SECTORS_SHIFT)
-#define SECTOR_MASK (PAGE_SECTORS - 1)
-
#define FREE_BATCH 16
#define TICKS_PER_SEC 50ULL
return ret;
}
- add_disk(disk);
- return 0;
+ return add_disk(disk);
}
static int null_init_tag_set(struct nullb *nullb, struct blk_mq_tag_set *set)
return;
p = blk_mq_alloc_disk(&disk->tag_set, disk);
- if (!p) {
+ if (IS_ERR(p)) {
blk_mq_free_tag_set(&disk->tag_set);
return;
}
wakeup = (pd->write_congestion_on > 0
&& pd->bio_queue_size <= pd->write_congestion_off);
spin_unlock(&pd->lock);
- if (wakeup) {
- clear_bdi_congested(pd->disk->queue->backing_dev_info,
- BLK_RW_ASYNC);
- }
+ if (wakeup)
+ clear_bdi_congested(pd->disk->bdi, BLK_RW_ASYNC);
pkt->sleep_time = max(PACKET_WAIT_TIME, 1);
pkt_set_state(pkt, PACKET_WAITING_STATE);
spin_lock(&pd->lock);
if (pd->write_congestion_on > 0
&& pd->bio_queue_size >= pd->write_congestion_on) {
- set_bdi_congested(q->backing_dev_info, BLK_RW_ASYNC);
+ set_bdi_congested(bio->bi_bdev->bd_disk->bdi, BLK_RW_ASYNC);
do {
spin_unlock(&pd->lock);
congestion_wait(BLK_RW_ASYNC, HZ);
unsigned int offset = 0;
struct req_iterator iter;
struct bio_vec bvec;
- unsigned int i = 0;
- size_t size;
- void *buf;
rq_for_each_segment(bvec, req, iter) {
- unsigned long flags;
- dev_dbg(&dev->sbd.core, "%s:%u: bio %u: %u sectors from %llu\n",
- __func__, __LINE__, i, bio_sectors(iter.bio),
- iter.bio->bi_iter.bi_sector);
-
- size = bvec.bv_len;
- buf = bvec_kmap_irq(&bvec, &flags);
if (gather)
- memcpy(dev->bounce_buf+offset, buf, size);
+ memcpy_from_bvec(dev->bounce_buf + offset, &bvec);
else
- memcpy(buf, dev->bounce_buf+offset, size);
- offset += size;
- flush_kernel_dcache_page(bvec.bv_page);
- bvec_kunmap_irq(buf, &flags);
- i++;
+ memcpy_to_bvec(&bvec, dev->bounce_buf + offset);
}
}
bio_for_each_segment(bvec, bio, iter) {
/* PS3 is ppc64, so we don't handle highmem */
- char *ptr = page_address(bvec.bv_page) + bvec.bv_offset;
+ char *ptr = bvec_virt(&bvec);
size_t len = bvec.bv_len, retlen;
dev_dbg(&dev->core, " %s %zu bytes at offset %llu\n", op,
rbd_dev->mapping.size = 0;
}
-static void zero_bvec(struct bio_vec *bv)
-{
- void *buf;
- unsigned long flags;
-
- buf = bvec_kmap_irq(bv, &flags);
- memset(buf, 0, bv->bv_len);
- flush_dcache_page(bv->bv_page);
- bvec_kunmap_irq(buf, &flags);
-}
-
static void zero_bios(struct ceph_bio_iter *bio_pos, u32 off, u32 bytes)
{
struct ceph_bio_iter it = *bio_pos;
ceph_bio_iter_advance(&it, off);
ceph_bio_iter_advance_step(&it, bytes, ({
- zero_bvec(&bv);
+ memzero_bvec(&bv);
}));
}
ceph_bvec_iter_advance(&it, off);
ceph_bvec_iter_advance_step(&it, bytes, ({
- zero_bvec(&bv);
+ memzero_bvec(&bv);
}));
}
};
ceph_bvec_iter_advance_step(&it, bytes, ({
- if (memchr_inv(page_address(bv.bv_page) + bv.bv_offset, 0,
- bv.bv_len))
+ if (memchr_inv(bvec_virt(&bv), 0, bv.bv_len))
return false;
}));
return true;
switch (dev->dev_state) {
case DEV_STATE_INIT:
- return snprintf(page, PAGE_SIZE, "init\n");
+ return sysfs_emit(page, "init\n");
case DEV_STATE_MAPPED:
/* TODO fix cli tool before changing to proper state */
- return snprintf(page, PAGE_SIZE, "open\n");
+ return sysfs_emit(page, "open\n");
case DEV_STATE_MAPPED_DISCONNECTED:
/* TODO fix cli tool before changing to proper state */
- return snprintf(page, PAGE_SIZE, "closed\n");
+ return sysfs_emit(page, "closed\n");
case DEV_STATE_UNMAPPED:
- return snprintf(page, PAGE_SIZE, "unmapped\n");
+ return sysfs_emit(page, "unmapped\n");
default:
- return snprintf(page, PAGE_SIZE, "unknown\n");
+ return sysfs_emit(page, "unknown\n");
}
}
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
- return scnprintf(page, PAGE_SIZE, "%s\n", dev->pathname);
+ return sysfs_emit(page, "%s\n", dev->pathname);
}
static struct kobj_attribute rnbd_clt_mapping_path_attr =
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
- return snprintf(page, PAGE_SIZE, "%s\n",
- rnbd_access_mode_str(dev->access_mode));
+ return sysfs_emit(page, "%s\n", rnbd_access_mode_str(dev->access_mode));
}
static struct kobj_attribute rnbd_clt_access_mode =
static ssize_t rnbd_clt_unmap_dev_show(struct kobject *kobj,
struct kobj_attribute *attr, char *page)
{
- return scnprintf(page, PAGE_SIZE, "Usage: echo <normal|force> > %s\n",
- attr->attr.name);
+ return sysfs_emit(page, "Usage: echo <normal|force> > %s\n",
+ attr->attr.name);
}
static ssize_t rnbd_clt_unmap_dev_store(struct kobject *kobj,
struct kobj_attribute *attr,
char *page)
{
- return scnprintf(page, PAGE_SIZE,
- "Usage: echo <new size in sectors> > %s\n",
- attr->attr.name);
+ return sysfs_emit(page, "Usage: echo <new size in sectors> > %s\n",
+ attr->attr.name);
}
static ssize_t rnbd_clt_resize_dev_store(struct kobject *kobj,
static ssize_t rnbd_clt_remap_dev_show(struct kobject *kobj,
struct kobj_attribute *attr, char *page)
{
- return scnprintf(page, PAGE_SIZE, "Usage: echo <1> > %s\n",
- attr->attr.name);
+ return sysfs_emit(page, "Usage: echo <1> > %s\n", attr->attr.name);
}
static ssize_t rnbd_clt_remap_dev_store(struct kobject *kobj,
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
- return scnprintf(page, PAGE_SIZE, "%s\n", dev->sess->sessname);
+ return sysfs_emit(page, "%s\n", dev->sess->sessname);
}
static struct kobj_attribute rnbd_clt_session_attr =
struct kobj_attribute *attr,
char *page)
{
- return scnprintf(page, PAGE_SIZE,
- "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>] [nr_poll_queues=<number of queues>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
+ return sysfs_emit(page,
+ "Usage: echo \"[dest_port=server port number] sessname=<name of the rtrs session> path=<[srcaddr@]dstaddr> [path=<[srcaddr@]dstaddr>] device_path=<full path on remote side> [access_mode=<ro|rw|migration>] [nr_poll_queues=<number of queues>]\" > %s\n\naddr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]\n",
attr->attr.name);
}
*/
if (cpu_q)
*cpup = cpu_q->cpu;
- put_cpu_var(sess->cpu_rr);
+ put_cpu_ptr(sess->cpu_rr);
if (q)
rnbd_clt_dev_requeue(q);
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
- return scnprintf(page, PAGE_SIZE, "%d\n",
- !(sess_dev->open_flags & FMODE_WRITE));
+ return sysfs_emit(page, "%d\n",
+ !(sess_dev->open_flags & FMODE_WRITE));
}
static struct kobj_attribute rnbd_srv_dev_session_ro_attr =
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
- return scnprintf(page, PAGE_SIZE, "%s\n",
- rnbd_access_mode_str(sess_dev->access_mode));
+ return sysfs_emit(page, "%s\n",
+ rnbd_access_mode_str(sess_dev->access_mode));
}
static struct kobj_attribute rnbd_srv_dev_session_access_mode_attr =
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
- return scnprintf(page, PAGE_SIZE, "%s\n", sess_dev->pathname);
+ return sysfs_emit(page, "%s\n", sess_dev->pathname);
}
static struct kobj_attribute rnbd_srv_dev_session_mapping_path_attr =
static ssize_t rnbd_srv_dev_session_force_close_show(struct kobject *kobj,
struct kobj_attribute *attr, char *page)
{
- return scnprintf(page, PAGE_SIZE, "Usage: echo 1 > %s\n",
- attr->attr.name);
+ return sysfs_emit(page, "Usage: echo 1 > %s\n",
+ attr->attr.name);
}
static ssize_t rnbd_srv_dev_session_force_close_store(struct kobject *kobj,
if (!disk)
return;
- if (disk->flags & GENHD_FL_UP)
+ if (host->state > HST_DEV_ACTIVATE)
del_gendisk(disk);
blk_cleanup_disk(disk);
}
{
struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
- if (req->rq_flags & RQF_SPECIAL_PAYLOAD) {
- kfree(page_address(req->special_vec.bv_page) +
- req->special_vec.bv_offset);
- }
-
+ if (req->rq_flags & RQF_SPECIAL_PAYLOAD)
+ kfree(bvec_virt(&req->special_vec));
blk_mq_end_request(req, virtblk_result(vbr));
}
"block size is changed unexpectedly, now is %u\n",
blk_size);
err = -EINVAL;
- goto err_cleanup_disk;
+ goto out_cleanup_disk;
}
/* Use topology information if available */
virtblk_update_capacity(vblk, false);
virtio_device_ready(vdev);
- device_add_disk(&vdev->dev, vblk->disk, virtblk_attr_groups);
+ err = device_add_disk(&vdev->dev, vblk->disk, virtblk_attr_groups);
+ if (err)
+ goto out_cleanup_disk;
+
return 0;
-err_cleanup_disk:
+out_cleanup_disk:
blk_cleanup_disk(vblk->disk);
out_free_tags:
blk_mq_free_tag_set(&vblk->tag_set);
err = xlbd_reserve_minors(minor, nr_minors);
if (err)
return err;
- err = -ENODEV;
memset(&info->tag_set, 0, sizeof(info->tag_set));
info->tag_set.ops = &blkfront_mq_ops;
struct image_info *img_info);
void mhi_fw_load_handler(struct mhi_controller *mhi_cntrl);
int mhi_prepare_channel(struct mhi_controller *mhi_cntrl,
- struct mhi_chan *mhi_chan, unsigned int flags);
+ struct mhi_chan *mhi_chan);
int mhi_init_chan_ctxt(struct mhi_controller *mhi_cntrl,
struct mhi_chan *mhi_chan);
void mhi_deinit_chan_ctxt(struct mhi_controller *mhi_cntrl,
}
int mhi_prepare_channel(struct mhi_controller *mhi_cntrl,
- struct mhi_chan *mhi_chan, unsigned int flags)
+ struct mhi_chan *mhi_chan)
{
int ret = 0;
struct device *dev = &mhi_chan->mhi_dev->dev;
if (ret)
goto error_pm_state;
- if (mhi_chan->dir == DMA_FROM_DEVICE)
- mhi_chan->pre_alloc = !!(flags & MHI_CH_INBOUND_ALLOC_BUFS);
-
/* Pre-allocate buffer for xfer ring */
if (mhi_chan->pre_alloc) {
int nr_el = get_nr_avail_ring_elements(mhi_cntrl,
}
/* Move channel to start state */
-int mhi_prepare_for_transfer(struct mhi_device *mhi_dev, unsigned int flags)
+int mhi_prepare_for_transfer(struct mhi_device *mhi_dev)
{
int ret, dir;
struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl;
if (!mhi_chan)
continue;
- ret = mhi_prepare_channel(mhi_cntrl, mhi_chan, flags);
+ ret = mhi_prepare_channel(mhi_cntrl, mhi_chan);
if (ret)
goto error_open_chan;
}
To compile this driver as a module, choose M here: the
module will be called xiphera-trng.
+config HW_RANDOM_ARM_SMCCC_TRNG
+ tristate "Arm SMCCC TRNG firmware interface support"
+ depends on HAVE_ARM_SMCCC_DISCOVERY
+ default HW_RANDOM
+ help
+ Say 'Y' to enable the True Random Number Generator driver using
+ the Arm SMCCC TRNG firmware interface. This reads entropy from
+ higher exception levels (firmware, hypervisor). Uses SMCCC for
+ communicating with the firmware:
+ https://developer.arm.com/documentation/den0098/latest/
+
+ To compile this driver as a module, choose M here: the
+ module will be called arm_smccc_trng.
+
endif # HW_RANDOM
config UML_RANDOM
obj-$(CONFIG_HW_RANDOM_NPCM) += npcm-rng.o
obj-$(CONFIG_HW_RANDOM_CCTRNG) += cctrng.o
obj-$(CONFIG_HW_RANDOM_XIPHERA) += xiphera-trng.o
+obj-$(CONFIG_HW_RANDOM_ARM_SMCCC_TRNG) += arm_smccc_trng.o
.read = amd_rng_read,
};
-static int __init mod_init(void)
+static int __init amd_rng_mod_init(void)
{
int err;
struct pci_dev *pdev = NULL;
return err;
}
-static void __exit mod_exit(void)
+static void __exit amd_rng_mod_exit(void)
{
struct amd768_priv *priv;
kfree(priv);
}
-module_init(mod_init);
-module_exit(mod_exit);
+module_init(amd_rng_mod_init);
+module_exit(amd_rng_mod_exit);
MODULE_AUTHOR("The Linux Kernel team");
MODULE_DESCRIPTION("H/W RNG driver for AMD chipsets");
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Randomness driver for the ARM SMCCC TRNG Firmware Interface
+ * https://developer.arm.com/documentation/den0098/latest/
+ *
+ * Copyright (C) 2020 Arm Ltd.
+ *
+ * The ARM TRNG firmware interface specifies a protocol to read entropy
+ * from a higher exception level, to abstract from any machine specific
+ * implemenations and allow easier use in hypervisors.
+ *
+ * The firmware interface is realised using the SMCCC specification.
+ */
+
+#include <linux/bits.h>
+#include <linux/device.h>
+#include <linux/hw_random.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/arm-smccc.h>
+
+#ifdef CONFIG_ARM64
+#define ARM_SMCCC_TRNG_RND ARM_SMCCC_TRNG_RND64
+#define MAX_BITS_PER_CALL (3 * 64UL)
+#else
+#define ARM_SMCCC_TRNG_RND ARM_SMCCC_TRNG_RND32
+#define MAX_BITS_PER_CALL (3 * 32UL)
+#endif
+
+/* We don't want to allow the firmware to stall us forever. */
+#define SMCCC_TRNG_MAX_TRIES 20
+
+#define SMCCC_RET_TRNG_INVALID_PARAMETER -2
+#define SMCCC_RET_TRNG_NO_ENTROPY -3
+
+static int copy_from_registers(char *buf, struct arm_smccc_res *res,
+ size_t bytes)
+{
+ unsigned int chunk, copied;
+
+ if (bytes == 0)
+ return 0;
+
+ chunk = min(bytes, sizeof(long));
+ memcpy(buf, &res->a3, chunk);
+ copied = chunk;
+ if (copied >= bytes)
+ return copied;
+
+ chunk = min((bytes - copied), sizeof(long));
+ memcpy(&buf[copied], &res->a2, chunk);
+ copied += chunk;
+ if (copied >= bytes)
+ return copied;
+
+ chunk = min((bytes - copied), sizeof(long));
+ memcpy(&buf[copied], &res->a1, chunk);
+
+ return copied + chunk;
+}
+
+static int smccc_trng_read(struct hwrng *rng, void *data, size_t max, bool wait)
+{
+ struct arm_smccc_res res;
+ u8 *buf = data;
+ unsigned int copied = 0;
+ int tries = 0;
+
+ while (copied < max) {
+ size_t bits = min_t(size_t, (max - copied) * BITS_PER_BYTE,
+ MAX_BITS_PER_CALL);
+
+ arm_smccc_1_1_invoke(ARM_SMCCC_TRNG_RND, bits, &res);
+ if ((int)res.a0 < 0)
+ return (int)res.a0;
+
+ switch ((int)res.a0) {
+ case SMCCC_RET_SUCCESS:
+ copied += copy_from_registers(buf + copied, &res,
+ bits / BITS_PER_BYTE);
+ tries = 0;
+ break;
+ case SMCCC_RET_TRNG_NO_ENTROPY:
+ if (!wait)
+ return copied;
+ tries++;
+ if (tries >= SMCCC_TRNG_MAX_TRIES)
+ return copied;
+ cond_resched();
+ break;
+ }
+ }
+
+ return copied;
+}
+
+static int smccc_trng_probe(struct platform_device *pdev)
+{
+ struct hwrng *trng;
+
+ trng = devm_kzalloc(&pdev->dev, sizeof(*trng), GFP_KERNEL);
+ if (!trng)
+ return -ENOMEM;
+
+ trng->name = "smccc_trng";
+ trng->read = smccc_trng_read;
+
+ platform_set_drvdata(pdev, trng);
+
+ return devm_hwrng_register(&pdev->dev, trng);
+}
+
+static struct platform_driver smccc_trng_driver = {
+ .driver = {
+ .name = "smccc_trng",
+ },
+ .probe = smccc_trng_probe,
+};
+module_platform_driver(smccc_trng_driver);
+
+MODULE_ALIAS("platform:smccc_trng");
+MODULE_AUTHOR("Andre Przywara");
+MODULE_LICENSE("GPL");
};
-static int __init mod_init(void)
+static int __init geode_rng_init(void)
{
int err = -ENODEV;
struct pci_dev *pdev = NULL;
goto out;
}
-static void __exit mod_exit(void)
+static void __exit geode_rng_exit(void)
{
void __iomem *mem = (void __iomem *)geode_rng.priv;
iounmap(mem);
}
-module_init(mod_init);
-module_exit(mod_exit);
+module_init(geode_rng_init);
+module_exit(geode_rng_exit);
MODULE_DESCRIPTION("H/W RNG driver for AMD Geode LX CPUs");
MODULE_LICENSE("GPL");
}
-static int __init mod_init(void)
+static int __init intel_rng_mod_init(void)
{
int err = -ENODEV;
int i;
}
-static void __exit mod_exit(void)
+static void __exit intel_rng_mod_exit(void)
{
void __iomem *mem = (void __iomem *)intel_rng.priv;
iounmap(mem);
}
-module_init(mod_init);
-module_exit(mod_exit);
+module_init(intel_rng_mod_init);
+module_exit(intel_rng_mod_exit);
MODULE_DESCRIPTION("H/W RNG driver for Intel chipsets");
MODULE_LICENSE("GPL");
};
-static int __init mod_init(void)
+static int __init via_rng_mod_init(void)
{
int err;
out:
return err;
}
-module_init(mod_init);
+module_init(via_rng_mod_init);
-static void __exit mod_exit(void)
+static void __exit via_rng_mod_exit(void)
{
hwrng_unregister(&via_rng);
}
-module_exit(mod_exit);
+module_exit(via_rng_mod_exit);
static struct x86_cpu_id __maybe_unused via_rng_cpu_id[] = {
X86_MATCH_FEATURE(X86_FEATURE_XSTORE, NULL),
config TCG_TIS_I2C_CR50
tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
depends on I2C
- select TCG_CR50
help
This is a driver for the Google cr50 I2C TPM interface which is a
custom microcontroller and requires a custom i2c protocol interface
{
struct ibmvtpm_dev *ibmvtpm = dev_get_drvdata(&chip->dev);
u16 len;
- int sig;
if (!ibmvtpm->rtce_buf) {
dev_err(ibmvtpm->dev, "ibmvtpm device is not ready\n");
return 0;
}
- sig = wait_event_interruptible(ibmvtpm->wq, !ibmvtpm->tpm_processing_cmd);
- if (sig)
- return -EINTR;
-
len = ibmvtpm->res_len;
if (count < len) {
* set the processing flag before the Hcall, since we may get the
* result (interrupt) before even being able to check rc.
*/
- ibmvtpm->tpm_processing_cmd = true;
+ ibmvtpm->tpm_processing_cmd = 1;
again:
rc = ibmvtpm_send_crq(ibmvtpm->vdev,
goto again;
}
dev_err(ibmvtpm->dev, "tpm_ibmvtpm_send failed rc=%d\n", rc);
- ibmvtpm->tpm_processing_cmd = false;
+ ibmvtpm->tpm_processing_cmd = 0;
}
spin_unlock(&ibmvtpm->rtce_lock);
static u8 tpm_ibmvtpm_status(struct tpm_chip *chip)
{
- return 0;
+ struct ibmvtpm_dev *ibmvtpm = dev_get_drvdata(&chip->dev);
+
+ return ibmvtpm->tpm_processing_cmd;
}
/**
.send = tpm_ibmvtpm_send,
.cancel = tpm_ibmvtpm_cancel,
.status = tpm_ibmvtpm_status,
- .req_complete_mask = 0,
+ .req_complete_mask = 1,
.req_complete_val = 0,
.req_canceled = tpm_ibmvtpm_req_canceled,
};
case VTPM_TPM_COMMAND_RES:
/* len of the data in rtce buffer */
ibmvtpm->res_len = be16_to_cpu(crq->len);
- ibmvtpm->tpm_processing_cmd = false;
+ ibmvtpm->tpm_processing_cmd = 0;
wake_up_interruptible(&ibmvtpm->wq);
return;
default:
goto init_irq_cleanup;
}
- if (!strcmp(id->compat, "IBM,vtpm20")) {
+
+ if (!strcmp(id->compat, "IBM,vtpm20"))
chip->flags |= TPM_CHIP_FLAG_TPM2;
+
+ rc = tpm_get_timeouts(chip);
+ if (rc)
+ goto init_irq_cleanup;
+
+ if (chip->flags & TPM_CHIP_FLAG_TPM2) {
rc = tpm2_get_cc_attrs_tbl(chip);
if (rc)
goto init_irq_cleanup;
wait_queue_head_t wq;
u16 res_len;
u32 vtpm_version;
- bool tpm_processing_cmd;
+ u8 tpm_processing_cmd;
};
#define CRQ_RES_BUF_SIZE PAGE_SIZE
.req_canceled = &tpm_cr50_i2c_req_canceled,
};
-static const struct i2c_device_id cr50_i2c_table[] = {
- {"cr50_i2c", 0},
- {}
-};
-MODULE_DEVICE_TABLE(i2c, cr50_i2c_table);
-
#ifdef CONFIG_ACPI
static const struct acpi_device_id cr50_i2c_acpi_id[] = {
{ "GOOG0005", 0 },
* - 0: Success.
* - -errno: A POSIX error code.
*/
-static int tpm_cr50_i2c_probe(struct i2c_client *client,
- const struct i2c_device_id *id)
+static int tpm_cr50_i2c_probe(struct i2c_client *client)
{
struct tpm_i2c_cr50_priv_data *priv;
struct device *dev = &client->dev;
static SIMPLE_DEV_PM_OPS(cr50_i2c_pm, tpm_pm_suspend, tpm_pm_resume);
static struct i2c_driver cr50_i2c_driver = {
- .id_table = cr50_i2c_table,
- .probe = tpm_cr50_i2c_probe,
+ .probe_new = tpm_cr50_i2c_probe,
.remove = tpm_cr50_i2c_remove,
.driver = {
.name = "cr50_i2c",
init.ops = &usb2_clock_sel_clock_ops;
priv->hw.init = &init;
- ret = devm_clk_hw_register(NULL, &priv->hw);
+ ret = devm_clk_hw_register(dev, &priv->hw);
if (ret)
goto pm_put;
#define TICK_BASE_CNT 1
+#ifdef CONFIG_ARM
+/* Use values higher than ARM arch timer. See 6282edb72bed. */
+#define MCT_CLKSOURCE_RATING 450
+#define MCT_CLKEVENTS_RATING 500
+#else
+#define MCT_CLKSOURCE_RATING 350
+#define MCT_CLKEVENTS_RATING 350
+#endif
+
enum {
MCT_INT_SPI,
MCT_INT_PPI
static struct clocksource mct_frc = {
.name = "mct-frc",
- .rating = 450, /* use value higher than ARM arch timer */
+ .rating = MCT_CLKSOURCE_RATING,
.read = exynos4_frc_read,
.mask = CLOCKSOURCE_MASK(32),
.flags = CLOCK_SOURCE_IS_CONTINUOUS,
evt->set_state_oneshot = set_state_shutdown;
evt->set_state_oneshot_stopped = set_state_shutdown;
evt->tick_resume = set_state_shutdown;
- evt->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
- evt->rating = 500; /* use value higher than ARM arch timer */
+ evt->features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT |
+ CLOCK_EVT_FEAT_PERCPU;
+ evt->rating = MCT_CLKEVENTS_RATING,
exynos4_mct_write(TICK_BASE_CNT, mevt->base + MCT_L_TCNTB_OFFSET);
* Copyright (c) 2020 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
*/
+#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/clk.h>
#include <linux/clk-provider.h>
/* bits within the OSTCCR register */
#define OSTCCR_PRESCALE1_MASK 0x3
#define OSTCCR_PRESCALE2_MASK 0xc
-#define OSTCCR_PRESCALE1_LSB 0
-#define OSTCCR_PRESCALE2_LSB 2
/* bits within the OSTCR register */
#define OSTCR_OST1CLR BIT(0)
prescale = readl(ost_clk->ost->base + info->ostccr_reg);
- prescale = (prescale & OSTCCR_PRESCALE1_MASK) >> OSTCCR_PRESCALE1_LSB;
+ prescale = FIELD_GET(OSTCCR_PRESCALE1_MASK, prescale);
return parent_rate >> (prescale * 2);
}
prescale = readl(ost_clk->ost->base + info->ostccr_reg);
- prescale = (prescale & OSTCCR_PRESCALE2_MASK) >> OSTCCR_PRESCALE2_LSB;
+ prescale = FIELD_GET(OSTCCR_PRESCALE2_MASK, prescale);
return parent_rate >> (prescale * 2);
}
int val;
val = readl(ost_clk->ost->base + info->ostccr_reg);
- val = (val & ~OSTCCR_PRESCALE1_MASK) | (prescale << OSTCCR_PRESCALE1_LSB);
+ val &= ~OSTCCR_PRESCALE1_MASK;
+ val |= FIELD_PREP(OSTCCR_PRESCALE1_MASK, prescale);
writel(val, ost_clk->ost->base + info->ostccr_reg);
return 0;
int val;
val = readl(ost_clk->ost->base + info->ostccr_reg);
- val = (val & ~OSTCCR_PRESCALE2_MASK) | (prescale << OSTCCR_PRESCALE2_LSB);
+ val &= ~OSTCCR_PRESCALE2_MASK;
+ val |= FIELD_PREP(OSTCCR_PRESCALE2_MASK, prescale);
writel(val, ost_clk->ost->base + info->ostccr_reg);
return 0;
ch->flags |= flag;
/* setup timeout if no clockevent */
- if ((flag == FLAG_CLOCKSOURCE) && (!(ch->flags & FLAG_CLOCKEVENT)))
+ if (ch->cmt->num_channels == 1 &&
+ flag == FLAG_CLOCKSOURCE && (!(ch->flags & FLAG_CLOCKEVENT)))
__sh_cmt_set_next(ch, ch->max_match_value);
out:
raw_spin_unlock_irqrestore(&ch->lock, flags);
static u64 sh_cmt_clocksource_read(struct clocksource *cs)
{
struct sh_cmt_channel *ch = cs_to_sh_cmt(cs);
- unsigned long flags;
u32 has_wrapped;
- u64 value;
- u32 raw;
- raw_spin_lock_irqsave(&ch->lock, flags);
- value = ch->total_cycles;
- raw = sh_cmt_get_counter(ch, &has_wrapped);
+ if (ch->cmt->num_channels == 1) {
+ unsigned long flags;
+ u64 value;
+ u32 raw;
- if (unlikely(has_wrapped))
- raw += ch->match_value + 1;
- raw_spin_unlock_irqrestore(&ch->lock, flags);
+ raw_spin_lock_irqsave(&ch->lock, flags);
+ value = ch->total_cycles;
+ raw = sh_cmt_get_counter(ch, &has_wrapped);
+
+ if (unlikely(has_wrapped))
+ raw += ch->match_value + 1;
+ raw_spin_unlock_irqrestore(&ch->lock, flags);
+
+ return value + raw;
+ }
- return value + raw;
+ return sh_cmt_get_counter(ch, &has_wrapped);
}
static int sh_cmt_clocksource_enable(struct clocksource *cs)
cs->disable = sh_cmt_clocksource_disable;
cs->suspend = sh_cmt_clocksource_suspend;
cs->resume = sh_cmt_clocksource_resume;
- cs->mask = CLOCKSOURCE_MASK(sizeof(u64) * 8);
+ cs->mask = CLOCKSOURCE_MASK(ch->cmt->info->width);
cs->flags = CLOCK_SOURCE_IS_CONTINUOUS;
dev_info(&ch->cmt->pdev->dev, "ch%u: used as clock source\n",
}
static int __init fttmr010_common_init(struct device_node *np,
- bool is_aspeed,
- int (*timer_shutdown)(struct clock_event_device *),
- irq_handler_t irq_handler)
+ bool is_aspeed, bool is_ast2600)
{
struct fttmr010 *fttmr010;
int irq;
fttmr010->tick_rate);
}
- fttmr010->timer_shutdown = timer_shutdown;
-
/*
* Setup clockevent timer (interrupt-driven) on timer 1.
*/
writel(0, fttmr010->base + TIMER1_LOAD);
writel(0, fttmr010->base + TIMER1_MATCH1);
writel(0, fttmr010->base + TIMER1_MATCH2);
- ret = request_irq(irq, irq_handler, IRQF_TIMER,
- "FTTMR010-TIMER1", &fttmr010->clkevt);
+
+ if (is_ast2600) {
+ fttmr010->timer_shutdown = ast2600_timer_shutdown;
+ ret = request_irq(irq, ast2600_timer_interrupt,
+ IRQF_TIMER, "FTTMR010-TIMER1",
+ &fttmr010->clkevt);
+ } else {
+ fttmr010->timer_shutdown = fttmr010_timer_shutdown;
+ ret = request_irq(irq, fttmr010_timer_interrupt,
+ IRQF_TIMER, "FTTMR010-TIMER1",
+ &fttmr010->clkevt);
+ }
if (ret) {
pr_err("FTTMR010-TIMER1 no IRQ\n");
goto out_unmap;
static __init int ast2600_timer_init(struct device_node *np)
{
- return fttmr010_common_init(np, true,
- ast2600_timer_shutdown,
- ast2600_timer_interrupt);
+ return fttmr010_common_init(np, true, true);
}
static __init int aspeed_timer_init(struct device_node *np)
{
- return fttmr010_common_init(np, true,
- fttmr010_timer_shutdown,
- fttmr010_timer_interrupt);
+ return fttmr010_common_init(np, true, false);
}
static __init int fttmr010_timer_init(struct device_node *np)
{
- return fttmr010_common_init(np, false,
- fttmr010_timer_shutdown,
- fttmr010_timer_interrupt);
+ return fttmr010_common_init(np, false, false);
}
TIMER_OF_DECLARE(fttmr010, "faraday,fttmr010", fttmr010_timer_init);
* SYST_CON_EN: Clock enable. Shall be set to
* - Start timer countdown.
* - Allow timeout ticks being updated.
- * - Allow changing interrupt functions.
+ * - Allow changing interrupt status,like clear irq pending.
*
- * SYST_CON_IRQ_EN: Set to allow interrupt.
+ * SYST_CON_IRQ_EN: Set to enable interrupt.
*
* SYST_CON_IRQ_CLR: Set to clear interrupt.
*/
static void mtk_syst_ack_irq(struct timer_of *to)
{
/* Clear and disable interrupt */
+ writel(SYST_CON_EN, SYST_CON_REG(to));
writel(SYST_CON_IRQ_CLR | SYST_CON_EN, SYST_CON_REG(to));
}
static int mtk_syst_clkevt_shutdown(struct clock_event_device *clkevt)
{
+ /* Clear any irq */
+ mtk_syst_ack_irq(to_timer_of(clkevt));
+
/* Disable timer */
writel(0, SYST_CON_REG(to_timer_of(clkevt)));
{
struct sun8i_ce_rng_tfm_ctx *ctx = crypto_tfm_ctx(tfm);
- memzero_explicit(ctx->seed, ctx->slen);
- kfree(ctx->seed);
+ kfree_sensitive(ctx->seed);
ctx->seed = NULL;
ctx->slen = 0;
}
struct sun8i_ce_rng_tfm_ctx *ctx = crypto_rng_ctx(tfm);
if (ctx->seed && ctx->slen != slen) {
- memzero_explicit(ctx->seed, ctx->slen);
- kfree(ctx->seed);
+ kfree_sensitive(ctx->seed);
ctx->slen = 0;
ctx->seed = NULL;
}
memcpy(dst, d, dlen);
memcpy(ctx->seed, d + dlen, ctx->slen);
}
- memzero_explicit(d, todo);
err_iv:
- kfree(d);
+ kfree_sensitive(d);
err_mem:
return err;
}
memcpy(data, d, max);
err = max;
}
- memzero_explicit(d, todo);
err_dst:
- kfree(d);
+ kfree_sensitive(d);
return err;
}
struct sun8i_ss_rng_tfm_ctx *ctx = crypto_rng_ctx(tfm);
if (ctx->seed && ctx->slen != slen) {
- memzero_explicit(ctx->seed, ctx->slen);
- kfree(ctx->seed);
+ kfree_sensitive(ctx->seed);
ctx->slen = 0;
ctx->seed = NULL;
}
{
struct sun8i_ss_rng_tfm_ctx *ctx = crypto_tfm_ctx(tfm);
- memzero_explicit(ctx->seed, ctx->slen);
- kfree(ctx->seed);
+ kfree_sensitive(ctx->seed);
ctx->seed = NULL;
ctx->slen = 0;
}
/* Update seed */
memcpy(ctx->seed, d + dlen, ctx->slen);
}
- memzero_explicit(d, todo);
err_free:
- kfree(d);
+ kfree_sensitive(d);
return err;
}
struct atmel_aes_base_ctx base;
u32 key2[AES_KEYSIZE_256 / sizeof(u32)];
+ struct crypto_skcipher *fallback_tfm;
};
#if IS_ENABLED(CONFIG_CRYPTO_DEV_ATMEL_AUTHENC)
struct atmel_aes_reqctx {
unsigned long mode;
u8 lastc[AES_BLOCK_SIZE];
+ struct skcipher_request fallback_req;
};
#if IS_ENABLED(CONFIG_CRYPTO_DEV_ATMEL_AUTHENC)
return len ? block_size - len : 0;
}
-static struct atmel_aes_dev *atmel_aes_find_dev(struct atmel_aes_base_ctx *ctx)
+static struct atmel_aes_dev *atmel_aes_dev_alloc(struct atmel_aes_base_ctx *ctx)
{
- struct atmel_aes_dev *aes_dd = NULL;
- struct atmel_aes_dev *tmp;
+ struct atmel_aes_dev *aes_dd;
spin_lock_bh(&atmel_aes.lock);
- if (!ctx->dd) {
- list_for_each_entry(tmp, &atmel_aes.dev_list, list) {
- aes_dd = tmp;
- break;
- }
- ctx->dd = aes_dd;
- } else {
- aes_dd = ctx->dd;
- }
-
+ /* One AES IP per SoC. */
+ aes_dd = list_first_entry_or_null(&atmel_aes.dev_list,
+ struct atmel_aes_dev, list);
spin_unlock_bh(&atmel_aes.lock);
-
return aes_dd;
}
ctx = crypto_tfm_ctx(areq->tfm);
dd->areq = areq;
- dd->ctx = ctx;
start_async = (areq != new_areq);
dd->is_async = start_async;
return atmel_aes_ctr_transfer(dd);
}
+static int atmel_aes_xts_fallback(struct skcipher_request *req, bool enc)
+{
+ struct atmel_aes_reqctx *rctx = skcipher_request_ctx(req);
+ struct atmel_aes_xts_ctx *ctx = crypto_skcipher_ctx(
+ crypto_skcipher_reqtfm(req));
+
+ skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback_tfm);
+ skcipher_request_set_callback(&rctx->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+ skcipher_request_set_crypt(&rctx->fallback_req, req->src, req->dst,
+ req->cryptlen, req->iv);
+
+ return enc ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+ crypto_skcipher_decrypt(&rctx->fallback_req);
+}
+
static int atmel_aes_crypt(struct skcipher_request *req, unsigned long mode)
{
struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
struct atmel_aes_base_ctx *ctx = crypto_skcipher_ctx(skcipher);
struct atmel_aes_reqctx *rctx;
- struct atmel_aes_dev *dd;
+ u32 opmode = mode & AES_FLAGS_OPMODE_MASK;
+
+ if (opmode == AES_FLAGS_XTS) {
+ if (req->cryptlen < XTS_BLOCK_SIZE)
+ return -EINVAL;
+
+ if (!IS_ALIGNED(req->cryptlen, XTS_BLOCK_SIZE))
+ return atmel_aes_xts_fallback(req,
+ mode & AES_FLAGS_ENCRYPT);
+ }
+
+ /*
+ * ECB, CBC, CFB, OFB or CTR mode require the plaintext and ciphertext
+ * to have a positve integer length.
+ */
+ if (!req->cryptlen && opmode != AES_FLAGS_XTS)
+ return 0;
+
+ if ((opmode == AES_FLAGS_ECB || opmode == AES_FLAGS_CBC) &&
+ !IS_ALIGNED(req->cryptlen, crypto_skcipher_blocksize(skcipher)))
+ return -EINVAL;
switch (mode & AES_FLAGS_OPMODE_MASK) {
case AES_FLAGS_CFB8:
}
ctx->is_aead = false;
- dd = atmel_aes_find_dev(ctx);
- if (!dd)
- return -ENODEV;
-
rctx = skcipher_request_ctx(req);
rctx->mode = mode;
- if ((mode & AES_FLAGS_OPMODE_MASK) != AES_FLAGS_ECB &&
+ if (opmode != AES_FLAGS_ECB &&
!(mode & AES_FLAGS_ENCRYPT) && req->src == req->dst) {
unsigned int ivsize = crypto_skcipher_ivsize(skcipher);
ivsize, 0);
}
- return atmel_aes_handle_queue(dd, &req->base);
+ return atmel_aes_handle_queue(ctx->dd, &req->base);
}
static int atmel_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
static int atmel_aes_init_tfm(struct crypto_skcipher *tfm)
{
struct atmel_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct atmel_aes_dev *dd;
+
+ dd = atmel_aes_dev_alloc(&ctx->base);
+ if (!dd)
+ return -ENODEV;
crypto_skcipher_set_reqsize(tfm, sizeof(struct atmel_aes_reqctx));
+ ctx->base.dd = dd;
+ ctx->base.dd->ctx = &ctx->base;
ctx->base.start = atmel_aes_start;
return 0;
static int atmel_aes_ctr_init_tfm(struct crypto_skcipher *tfm)
{
struct atmel_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct atmel_aes_dev *dd;
+
+ dd = atmel_aes_dev_alloc(&ctx->base);
+ if (!dd)
+ return -ENODEV;
crypto_skcipher_set_reqsize(tfm, sizeof(struct atmel_aes_reqctx));
+ ctx->base.dd = dd;
+ ctx->base.dd->ctx = &ctx->base;
ctx->base.start = atmel_aes_ctr_start;
return 0;
{
.base.cra_name = "ofb(aes)",
.base.cra_driver_name = "atmel-ofb-aes",
- .base.cra_blocksize = AES_BLOCK_SIZE,
+ .base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct atmel_aes_ctx),
.init = atmel_aes_init_tfm,
{
struct atmel_aes_base_ctx *ctx;
struct atmel_aes_reqctx *rctx;
- struct atmel_aes_dev *dd;
ctx = crypto_aead_ctx(crypto_aead_reqtfm(req));
ctx->block_size = AES_BLOCK_SIZE;
ctx->is_aead = true;
- dd = atmel_aes_find_dev(ctx);
- if (!dd)
- return -ENODEV;
-
rctx = aead_request_ctx(req);
rctx->mode = AES_FLAGS_GCM | mode;
- return atmel_aes_handle_queue(dd, &req->base);
+ return atmel_aes_handle_queue(ctx->dd, &req->base);
}
static int atmel_aes_gcm_setkey(struct crypto_aead *tfm, const u8 *key,
static int atmel_aes_gcm_init(struct crypto_aead *tfm)
{
struct atmel_aes_gcm_ctx *ctx = crypto_aead_ctx(tfm);
+ struct atmel_aes_dev *dd;
+
+ dd = atmel_aes_dev_alloc(&ctx->base);
+ if (!dd)
+ return -ENODEV;
crypto_aead_set_reqsize(tfm, sizeof(struct atmel_aes_reqctx));
+ ctx->base.dd = dd;
+ ctx->base.dd->ctx = &ctx->base;
ctx->base.start = atmel_aes_gcm_start;
return 0;
* the order of the ciphered tweak bytes need to be reversed before
* writing them into the ODATARx registers.
*/
- for (i = 0; i < AES_BLOCK_SIZE/2; ++i) {
- u8 tmp = tweak_bytes[AES_BLOCK_SIZE - 1 - i];
-
- tweak_bytes[AES_BLOCK_SIZE - 1 - i] = tweak_bytes[i];
- tweak_bytes[i] = tmp;
- }
+ for (i = 0; i < AES_BLOCK_SIZE/2; ++i)
+ swap(tweak_bytes[i], tweak_bytes[AES_BLOCK_SIZE - 1 - i]);
/* Process the data. */
atmel_aes_write_ctrl(dd, use_dma, NULL);
if (err)
return err;
+ crypto_skcipher_clear_flags(ctx->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+ crypto_skcipher_set_flags(ctx->fallback_tfm, tfm->base.crt_flags &
+ CRYPTO_TFM_REQ_MASK);
+ err = crypto_skcipher_setkey(ctx->fallback_tfm, key, keylen);
+ if (err)
+ return err;
+
memcpy(ctx->base.key, key, keylen/2);
memcpy(ctx->key2, key + keylen/2, keylen/2);
ctx->base.keylen = keylen/2;
static int atmel_aes_xts_init_tfm(struct crypto_skcipher *tfm)
{
struct atmel_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+ struct atmel_aes_dev *dd;
+ const char *tfm_name = crypto_tfm_alg_name(&tfm->base);
- crypto_skcipher_set_reqsize(tfm, sizeof(struct atmel_aes_reqctx));
+ dd = atmel_aes_dev_alloc(&ctx->base);
+ if (!dd)
+ return -ENODEV;
+
+ ctx->fallback_tfm = crypto_alloc_skcipher(tfm_name, 0,
+ CRYPTO_ALG_NEED_FALLBACK);
+ if (IS_ERR(ctx->fallback_tfm))
+ return PTR_ERR(ctx->fallback_tfm);
+
+ crypto_skcipher_set_reqsize(tfm, sizeof(struct atmel_aes_reqctx) +
+ crypto_skcipher_reqsize(ctx->fallback_tfm));
+ ctx->base.dd = dd;
+ ctx->base.dd->ctx = &ctx->base;
ctx->base.start = atmel_aes_xts_start;
return 0;
}
+static void atmel_aes_xts_exit_tfm(struct crypto_skcipher *tfm)
+{
+ struct atmel_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm);
+
+ crypto_free_skcipher(ctx->fallback_tfm);
+}
+
static struct skcipher_alg aes_xts_alg = {
.base.cra_name = "xts(aes)",
.base.cra_driver_name = "atmel-xts-aes",
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct atmel_aes_xts_ctx),
+ .base.cra_flags = CRYPTO_ALG_NEED_FALLBACK,
.min_keysize = 2 * AES_MIN_KEY_SIZE,
.max_keysize = 2 * AES_MAX_KEY_SIZE,
.encrypt = atmel_aes_xts_encrypt,
.decrypt = atmel_aes_xts_decrypt,
.init = atmel_aes_xts_init_tfm,
+ .exit = atmel_aes_xts_exit_tfm,
};
#if IS_ENABLED(CONFIG_CRYPTO_DEV_ATMEL_AUTHENC)
{
struct atmel_aes_authenc_ctx *ctx = crypto_aead_ctx(tfm);
unsigned int auth_reqsize = atmel_sha_authenc_get_reqsize();
+ struct atmel_aes_dev *dd;
+
+ dd = atmel_aes_dev_alloc(&ctx->base);
+ if (!dd)
+ return -ENODEV;
ctx->auth = atmel_sha_authenc_spawn(auth_mode);
if (IS_ERR(ctx->auth))
crypto_aead_set_reqsize(tfm, (sizeof(struct atmel_aes_authenc_reqctx) +
auth_reqsize));
+ ctx->base.dd = dd;
+ ctx->base.dd->ctx = &ctx->base;
ctx->base.start = atmel_aes_authenc_start;
return 0;
struct atmel_aes_base_ctx *ctx = crypto_aead_ctx(tfm);
u32 authsize = crypto_aead_authsize(tfm);
bool enc = (mode & AES_FLAGS_ENCRYPT);
- struct atmel_aes_dev *dd;
/* Compute text length. */
if (!enc && req->cryptlen < authsize)
ctx->block_size = AES_BLOCK_SIZE;
ctx->is_aead = true;
- dd = atmel_aes_find_dev(ctx);
- if (!dd)
- return -ENODEV;
-
- return atmel_aes_handle_queue(dd, &req->base);
+ return atmel_aes_handle_queue(ctx->dd, &req->base);
}
static int atmel_aes_authenc_cbc_aes_encrypt(struct aead_request *req)
static void atmel_aes_crypto_alg_init(struct crypto_alg *alg)
{
- alg->cra_flags = CRYPTO_ALG_ASYNC;
+ alg->cra_flags |= CRYPTO_ALG_ASYNC;
alg->cra_alignmask = 0xf;
alg->cra_priority = ATMEL_AES_PRIORITY;
alg->cra_module = THIS_MODULE;
atmel_tdes_write(dd, offset, *value);
}
-static struct atmel_tdes_dev *atmel_tdes_find_dev(struct atmel_tdes_ctx *ctx)
+static struct atmel_tdes_dev *atmel_tdes_dev_alloc(void)
{
- struct atmel_tdes_dev *tdes_dd = NULL;
- struct atmel_tdes_dev *tmp;
+ struct atmel_tdes_dev *tdes_dd;
spin_lock_bh(&atmel_tdes.lock);
- if (!ctx->dd) {
- list_for_each_entry(tmp, &atmel_tdes.dev_list, list) {
- tdes_dd = tmp;
- break;
- }
- ctx->dd = tdes_dd;
- } else {
- tdes_dd = ctx->dd;
- }
+ /* One TDES IP per SoC. */
+ tdes_dd = list_first_entry_or_null(&atmel_tdes.dev_list,
+ struct atmel_tdes_dev, list);
spin_unlock_bh(&atmel_tdes.lock);
-
return tdes_dd;
}
dd->buf_out, dd->buflen, dd->dma_size, 1);
if (count != dd->dma_size) {
err = -EINVAL;
- pr_err("not all data converted: %zu\n", count);
+ dev_dbg(dd->dev, "not all data converted: %zu\n", count);
}
}
dd->buflen &= ~(DES_BLOCK_SIZE - 1);
if (!dd->buf_in || !dd->buf_out) {
- dev_err(dd->dev, "unable to alloc pages.\n");
+ dev_dbg(dd->dev, "unable to alloc pages.\n");
goto err_alloc;
}
/* MAP here */
dd->dma_addr_in = dma_map_single(dd->dev, dd->buf_in,
dd->buflen, DMA_TO_DEVICE);
- if (dma_mapping_error(dd->dev, dd->dma_addr_in)) {
- dev_err(dd->dev, "dma %zd bytes error\n", dd->buflen);
- err = -EINVAL;
+ err = dma_mapping_error(dd->dev, dd->dma_addr_in);
+ if (err) {
+ dev_dbg(dd->dev, "dma %zd bytes error\n", dd->buflen);
goto err_map_in;
}
dd->dma_addr_out = dma_map_single(dd->dev, dd->buf_out,
dd->buflen, DMA_FROM_DEVICE);
- if (dma_mapping_error(dd->dev, dd->dma_addr_out)) {
- dev_err(dd->dev, "dma %zd bytes error\n", dd->buflen);
- err = -EINVAL;
+ err = dma_mapping_error(dd->dev, dd->dma_addr_out);
+ if (err) {
+ dev_dbg(dd->dev, "dma %zd bytes error\n", dd->buflen);
goto err_map_out;
}
err_alloc:
free_page((unsigned long)dd->buf_out);
free_page((unsigned long)dd->buf_in);
- if (err)
- pr_err("error: %d\n", err);
return err;
}
err = dma_map_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE);
if (!err) {
- dev_err(dd->dev, "dma_map_sg() error\n");
+ dev_dbg(dd->dev, "dma_map_sg() error\n");
return -EINVAL;
}
err = dma_map_sg(dd->dev, dd->out_sg, 1,
DMA_FROM_DEVICE);
if (!err) {
- dev_err(dd->dev, "dma_map_sg() error\n");
+ dev_dbg(dd->dev, "dma_map_sg() error\n");
dma_unmap_sg(dd->dev, dd->in_sg, 1,
DMA_TO_DEVICE);
return -EINVAL;
rctx->mode &= TDES_FLAGS_MODE_MASK;
dd->flags = (dd->flags & ~TDES_FLAGS_MODE_MASK) | rctx->mode;
dd->ctx = ctx;
- ctx->dd = dd;
err = atmel_tdes_write_ctrl(dd);
if (!err)
dd->buf_out, dd->buflen, dd->dma_size, 1);
if (count != dd->dma_size) {
err = -EINVAL;
- pr_err("not all data converted: %zu\n", count);
+ dev_dbg(dd->dev, "not all data converted: %zu\n", count);
}
}
}
struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
struct atmel_tdes_ctx *ctx = crypto_skcipher_ctx(skcipher);
struct atmel_tdes_reqctx *rctx = skcipher_request_ctx(req);
+ struct device *dev = ctx->dd->dev;
+
+ if (!req->cryptlen)
+ return 0;
switch (mode & TDES_FLAGS_OPMODE_MASK) {
case TDES_FLAGS_CFB8:
if (!IS_ALIGNED(req->cryptlen, CFB8_BLOCK_SIZE)) {
- pr_err("request size is not exact amount of CFB8 blocks\n");
+ dev_dbg(dev, "request size is not exact amount of CFB8 blocks\n");
return -EINVAL;
}
ctx->block_size = CFB8_BLOCK_SIZE;
case TDES_FLAGS_CFB16:
if (!IS_ALIGNED(req->cryptlen, CFB16_BLOCK_SIZE)) {
- pr_err("request size is not exact amount of CFB16 blocks\n");
+ dev_dbg(dev, "request size is not exact amount of CFB16 blocks\n");
return -EINVAL;
}
ctx->block_size = CFB16_BLOCK_SIZE;
case TDES_FLAGS_CFB32:
if (!IS_ALIGNED(req->cryptlen, CFB32_BLOCK_SIZE)) {
- pr_err("request size is not exact amount of CFB32 blocks\n");
+ dev_dbg(dev, "request size is not exact amount of CFB32 blocks\n");
return -EINVAL;
}
ctx->block_size = CFB32_BLOCK_SIZE;
default:
if (!IS_ALIGNED(req->cryptlen, DES_BLOCK_SIZE)) {
- pr_err("request size is not exact amount of DES blocks\n");
+ dev_dbg(dev, "request size is not exact amount of DES blocks\n");
return -EINVAL;
}
ctx->block_size = DES_BLOCK_SIZE;
static int atmel_tdes_init_tfm(struct crypto_skcipher *tfm)
{
struct atmel_tdes_ctx *ctx = crypto_skcipher_ctx(tfm);
- struct atmel_tdes_dev *dd;
-
- crypto_skcipher_set_reqsize(tfm, sizeof(struct atmel_tdes_reqctx));
- dd = atmel_tdes_find_dev(ctx);
- if (!dd)
+ ctx->dd = atmel_tdes_dev_alloc();
+ if (!ctx->dd)
return -ENODEV;
+ crypto_skcipher_set_reqsize(tfm, sizeof(struct atmel_tdes_reqctx));
+
return 0;
}
{
.base.cra_name = "ofb(des)",
.base.cra_driver_name = "atmel-ofb-des",
- .base.cra_blocksize = DES_BLOCK_SIZE,
+ .base.cra_blocksize = 1,
.base.cra_alignmask = 0x7,
.min_keysize = DES_KEY_SIZE,
struct sev_device *sev = psp_master->sev_data;
int ret;
+ if (sev->state == SEV_STATE_UNINIT)
+ return 0;
+
ret = __sev_do_cmd_locked(SEV_CMD_SHUTDOWN, NULL, error);
if (ret)
return ret;
return ret;
}
+static void sev_firmware_shutdown(struct sev_device *sev)
+{
+ sev_platform_shutdown(NULL);
+
+ if (sev_es_tmr) {
+ /* The TMR area was encrypted, flush it from the cache */
+ wbinvd_on_all_cpus();
+
+ free_pages((unsigned long)sev_es_tmr,
+ get_order(SEV_ES_TMR_SIZE));
+ sev_es_tmr = NULL;
+ }
+}
+
void sev_dev_destroy(struct psp_device *psp)
{
struct sev_device *sev = psp->sev_data;
if (!sev)
return;
+ sev_firmware_shutdown(sev);
+
if (sev->misc)
kref_put(&misc_dev->refcount, sev_exit);
if (sev_get_api_version())
goto err;
- /*
- * If platform is not in UNINIT state then firmware upgrade and/or
- * platform INIT command will fail. These command require UNINIT state.
- *
- * In a normal boot we should never run into case where the firmware
- * is not in UNINIT state on boot. But in case of kexec boot, a reboot
- * may not go through a typical shutdown sequence and may leave the
- * firmware in INIT or WORKING state.
- */
-
- if (sev->state != SEV_STATE_UNINIT) {
- sev_platform_shutdown(NULL);
- sev->state = SEV_STATE_UNINIT;
- }
-
if (sev_version_greater_or_equal(0, 15) &&
sev_update_firmware(sev->dev) == 0)
sev_get_api_version();
void sev_pci_exit(void)
{
- if (!psp_master->sev_data)
- return;
-
- sev_platform_shutdown(NULL);
+ struct sev_device *sev = psp_master->sev_data;
- if (sev_es_tmr) {
- /* The TMR area was encrypted, flush it from the cache */
- wbinvd_on_all_cpus();
+ if (!sev)
+ return;
- free_pages((unsigned long)sev_es_tmr,
- get_order(SEV_ES_TMR_SIZE));
- sev_es_tmr = NULL;
- }
+ sev_firmware_shutdown(sev);
}
return ret;
}
+static void sp_pci_shutdown(struct pci_dev *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct sp_device *sp = dev_get_drvdata(dev);
+
+ if (!sp)
+ return;
+
+ sp_destroy(sp);
+}
+
static void sp_pci_remove(struct pci_dev *pdev)
{
struct device *dev = &pdev->dev;
#endif
#ifdef CONFIG_CRYPTO_DEV_SP_PSP
.psp_vdata = &pspv3,
+#endif
+ },
+ { /* 5 */
+ .bar = 2,
+#ifdef CONFIG_CRYPTO_DEV_SP_PSP
+ .psp_vdata = &pspv2,
#endif
},
};
{ PCI_VDEVICE(AMD, 0x1486), (kernel_ulong_t)&dev_vdata[3] },
{ PCI_VDEVICE(AMD, 0x15DF), (kernel_ulong_t)&dev_vdata[4] },
{ PCI_VDEVICE(AMD, 0x1649), (kernel_ulong_t)&dev_vdata[4] },
+ { PCI_VDEVICE(AMD, 0x14CA), (kernel_ulong_t)&dev_vdata[5] },
/* Last entry must be zero */
{ 0, }
};
.id_table = sp_pci_table,
.probe = sp_pci_probe,
.remove = sp_pci_remove,
+ .shutdown = sp_pci_shutdown,
.driver.pm = &sp_pci_pm_ops,
};
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/pci.h>
+#include <linux/pm_runtime.h>
#include <linux/topology.h>
#include <linux/uacce.h>
#include "hpre.h"
#define HPRE_PREFETCH_DISABLE BIT(30)
#define HPRE_SVA_DISABLE_READY (BIT(4) | BIT(8))
+/* clock gate */
+#define HPRE_CLKGATE_CTL 0x301a10
+#define HPRE_PEH_CFG_AUTO_GATE 0x301a2c
+#define HPRE_CLUSTER_DYN_CTL 0x302010
+#define HPRE_CORE_SHB_CFG 0x302088
+#define HPRE_CLKGATE_CTL_EN BIT(0)
+#define HPRE_PEH_CFG_AUTO_GATE_EN BIT(0)
+#define HPRE_CLUSTER_DYN_CTL_EN BIT(0)
+#define HPRE_CORE_GATE_EN (BIT(30) | BIT(31))
+
#define HPRE_AM_OOO_SHUTDOWN_ENB 0x301044
#define HPRE_AM_OOO_SHUTDOWN_ENABLE BIT(0)
#define HPRE_WR_MSI_PORT BIT(2)
pci_err(qm->pdev, "failed to close sva prefetch\n");
}
+static void hpre_enable_clock_gate(struct hisi_qm *qm)
+{
+ u32 val;
+
+ if (qm->ver < QM_HW_V3)
+ return;
+
+ val = readl(qm->io_base + HPRE_CLKGATE_CTL);
+ val |= HPRE_CLKGATE_CTL_EN;
+ writel(val, qm->io_base + HPRE_CLKGATE_CTL);
+
+ val = readl(qm->io_base + HPRE_PEH_CFG_AUTO_GATE);
+ val |= HPRE_PEH_CFG_AUTO_GATE_EN;
+ writel(val, qm->io_base + HPRE_PEH_CFG_AUTO_GATE);
+
+ val = readl(qm->io_base + HPRE_CLUSTER_DYN_CTL);
+ val |= HPRE_CLUSTER_DYN_CTL_EN;
+ writel(val, qm->io_base + HPRE_CLUSTER_DYN_CTL);
+
+ val = readl_relaxed(qm->io_base + HPRE_CORE_SHB_CFG);
+ val |= HPRE_CORE_GATE_EN;
+ writel(val, qm->io_base + HPRE_CORE_SHB_CFG);
+}
+
+static void hpre_disable_clock_gate(struct hisi_qm *qm)
+{
+ u32 val;
+
+ if (qm->ver < QM_HW_V3)
+ return;
+
+ val = readl(qm->io_base + HPRE_CLKGATE_CTL);
+ val &= ~HPRE_CLKGATE_CTL_EN;
+ writel(val, qm->io_base + HPRE_CLKGATE_CTL);
+
+ val = readl(qm->io_base + HPRE_PEH_CFG_AUTO_GATE);
+ val &= ~HPRE_PEH_CFG_AUTO_GATE_EN;
+ writel(val, qm->io_base + HPRE_PEH_CFG_AUTO_GATE);
+
+ val = readl(qm->io_base + HPRE_CLUSTER_DYN_CTL);
+ val &= ~HPRE_CLUSTER_DYN_CTL_EN;
+ writel(val, qm->io_base + HPRE_CLUSTER_DYN_CTL);
+
+ val = readl_relaxed(qm->io_base + HPRE_CORE_SHB_CFG);
+ val &= ~HPRE_CORE_GATE_EN;
+ writel(val, qm->io_base + HPRE_CORE_SHB_CFG);
+}
+
static int hpre_set_user_domain_and_cache(struct hisi_qm *qm)
{
struct device *dev = &qm->pdev->dev;
u32 val;
int ret;
+ /* disabel dynamic clock gate before sram init */
+ hpre_disable_clock_gate(qm);
+
writel(HPRE_QM_USR_CFG_MASK, qm->io_base + QM_ARUSER_M_CFG_ENABLE);
writel(HPRE_QM_USR_CFG_MASK, qm->io_base + QM_AWUSER_M_CFG_ENABLE);
writel_relaxed(HPRE_QM_AXI_CFG_MASK, qm->io_base + QM_AXI_M_CFG);
/* Config data buffer pasid needed by Kunpeng 920 */
hpre_config_pasid(qm);
+ hpre_enable_clock_gate(qm);
+
return ret;
}
size_t count, loff_t *pos)
{
struct hpre_debugfs_file *file = filp->private_data;
+ struct hisi_qm *qm = hpre_file_to_qm(file);
char tbuf[HPRE_DBGFS_VAL_MAX_LEN];
u32 val;
int ret;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
spin_lock_irq(&file->lock);
switch (file->type) {
case HPRE_CLEAR_ENABLE:
val = hpre_cluster_inqry_read(file);
break;
default:
- spin_unlock_irq(&file->lock);
- return -EINVAL;
+ goto err_input;
}
spin_unlock_irq(&file->lock);
+
+ hisi_qm_put_dfx_access(qm);
ret = snprintf(tbuf, HPRE_DBGFS_VAL_MAX_LEN, "%u\n", val);
return simple_read_from_buffer(buf, count, pos, tbuf, ret);
+
+err_input:
+ spin_unlock_irq(&file->lock);
+ hisi_qm_put_dfx_access(qm);
+ return -EINVAL;
}
static ssize_t hpre_ctrl_debug_write(struct file *filp, const char __user *buf,
size_t count, loff_t *pos)
{
struct hpre_debugfs_file *file = filp->private_data;
+ struct hisi_qm *qm = hpre_file_to_qm(file);
char tbuf[HPRE_DBGFS_VAL_MAX_LEN];
unsigned long val;
int len, ret;
if (kstrtoul(tbuf, 0, &val))
return -EFAULT;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
spin_lock_irq(&file->lock);
switch (file->type) {
case HPRE_CLEAR_ENABLE:
ret = -EINVAL;
goto err_input;
}
- spin_unlock_irq(&file->lock);
- return count;
+ ret = count;
err_input:
spin_unlock_irq(&file->lock);
+ hisi_qm_put_dfx_access(qm);
return ret;
}
DEFINE_DEBUGFS_ATTRIBUTE(hpre_atomic64_ops, hpre_debugfs_atomic64_get,
hpre_debugfs_atomic64_set, "%llu\n");
+static int hpre_com_regs_show(struct seq_file *s, void *unused)
+{
+ hisi_qm_regs_dump(s, s->private);
+
+ return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(hpre_com_regs);
+
+static int hpre_cluster_regs_show(struct seq_file *s, void *unused)
+{
+ hisi_qm_regs_dump(s, s->private);
+
+ return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(hpre_cluster_regs);
+
static int hpre_create_debugfs_file(struct hisi_qm *qm, struct dentry *dir,
enum hpre_ctrl_dbgfs_file type, int indx)
{
regset->regs = hpre_com_dfx_regs;
regset->nregs = ARRAY_SIZE(hpre_com_dfx_regs);
regset->base = qm->io_base;
+ regset->dev = dev;
+
+ debugfs_create_file("regs", 0444, qm->debug.debug_root,
+ regset, &hpre_com_regs_fops);
- debugfs_create_regset32("regs", 0444, qm->debug.debug_root, regset);
return 0;
}
regset->regs = hpre_cluster_dfx_regs;
regset->nregs = ARRAY_SIZE(hpre_cluster_dfx_regs);
regset->base = qm->io_base + hpre_cluster_offsets[i];
+ regset->dev = dev;
- debugfs_create_regset32("regs", 0444, tmp_d, regset);
+ debugfs_create_file("regs", 0444, tmp_d, regset,
+ &hpre_cluster_regs_fops);
ret = hpre_create_debugfs_file(qm, tmp_d, HPRE_CLUSTER_CTRL,
i + HPRE_CLUSTER_CTRL);
if (ret)
goto err_with_alg_register;
}
+ hisi_qm_pm_init(qm);
+
return 0;
err_with_alg_register:
struct hisi_qm *qm = pci_get_drvdata(pdev);
int ret;
+ hisi_qm_pm_uninit(qm);
hisi_qm_wait_task_finish(qm, &hpre_devices);
hisi_qm_alg_unregister(qm, &hpre_devices);
if (qm->fun_type == QM_HW_PF && qm->vfs_num) {
hisi_qm_uninit(qm);
}
+static const struct dev_pm_ops hpre_pm_ops = {
+ SET_RUNTIME_PM_OPS(hisi_qm_suspend, hisi_qm_resume, NULL)
+};
+
static const struct pci_error_handlers hpre_err_handler = {
.error_detected = hisi_qm_dev_err_detected,
.slot_reset = hisi_qm_dev_slot_reset,
hisi_qm_sriov_configure : NULL,
.err_handler = &hpre_err_handler,
.shutdown = hisi_qm_dev_shutdown,
+ .driver.pm = &hpre_pm_ops,
};
static void hpre_register_debugfs(void)
#include <linux/acpi.h>
#include <linux/aer.h>
#include <linux/bitmap.h>
-#include <linux/debugfs.h>
#include <linux/dma-mapping.h>
#include <linux/idr.h>
#include <linux/io.h>
#include <linux/irqreturn.h>
#include <linux/log2.h>
+#include <linux/pm_runtime.h>
#include <linux/seq_file.h>
#include <linux/slab.h>
#include <linux/uacce.h>
#define QM_QOS_MAX_CIR_S 11
#define QM_QOS_VAL_MAX_LEN 32
+#define QM_AUTOSUSPEND_DELAY 3000
+
#define QM_MK_CQC_DW3_V1(hop_num, pg_sz, buf_sz, cqe_sz) \
(((hop_num) << QM_CQ_HOP_NUM_SHIFT) | \
((pg_sz) << QM_CQ_PAGE_SIZE_SHIFT) | \
return QM_IRQ_NUM_VF_V3;
}
+static int qm_pm_get_sync(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+ int ret;
+
+ if (qm->fun_type == QM_HW_VF || qm->ver < QM_HW_V3)
+ return 0;
+
+ ret = pm_runtime_resume_and_get(dev);
+ if (ret < 0) {
+ dev_err(dev, "failed to get_sync(%d).\n", ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static void qm_pm_put_sync(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+
+ if (qm->fun_type == QM_HW_VF || qm->ver < QM_HW_V3)
+ return;
+
+ pm_runtime_mark_last_busy(dev);
+ pm_runtime_put_autosuspend(dev);
+}
+
static struct hisi_qp *qm_to_hisi_qp(struct hisi_qm *qm, struct qm_eqe *eqe)
{
u16 cqn = le32_to_cpu(eqe->dw0) & QM_EQE_CQN_MASK;
return container_of(debug, struct hisi_qm, debug);
}
-static u32 current_q_read(struct debugfs_file *file)
+static u32 current_q_read(struct hisi_qm *qm)
{
- struct hisi_qm *qm = file_to_qm(file);
-
return readl(qm->io_base + QM_DFX_SQE_CNT_VF_SQN) >> QM_DFX_QN_SHIFT;
}
-static int current_q_write(struct debugfs_file *file, u32 val)
+static int current_q_write(struct hisi_qm *qm, u32 val)
{
- struct hisi_qm *qm = file_to_qm(file);
u32 tmp;
if (val >= qm->debug.curr_qm_qp_num)
return 0;
}
-static u32 clear_enable_read(struct debugfs_file *file)
+static u32 clear_enable_read(struct hisi_qm *qm)
{
- struct hisi_qm *qm = file_to_qm(file);
-
return readl(qm->io_base + QM_DFX_CNT_CLR_CE);
}
/* rd_clr_ctrl 1 enable read clear, otherwise 0 disable it */
-static int clear_enable_write(struct debugfs_file *file, u32 rd_clr_ctrl)
+static int clear_enable_write(struct hisi_qm *qm, u32 rd_clr_ctrl)
{
- struct hisi_qm *qm = file_to_qm(file);
-
if (rd_clr_ctrl > 1)
return -EINVAL;
return 0;
}
-static u32 current_qm_read(struct debugfs_file *file)
+static u32 current_qm_read(struct hisi_qm *qm)
{
- struct hisi_qm *qm = file_to_qm(file);
-
return readl(qm->io_base + QM_DFX_MB_CNT_VF);
}
-static int current_qm_write(struct debugfs_file *file, u32 val)
+static int current_qm_write(struct hisi_qm *qm, u32 val)
{
- struct hisi_qm *qm = file_to_qm(file);
u32 tmp;
if (val > qm->vfs_num)
{
struct debugfs_file *file = filp->private_data;
enum qm_debug_file index = file->index;
+ struct hisi_qm *qm = file_to_qm(file);
char tbuf[QM_DBG_TMP_BUF_LEN];
u32 val;
int ret;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
mutex_lock(&file->lock);
switch (index) {
case CURRENT_QM:
- val = current_qm_read(file);
+ val = current_qm_read(qm);
break;
case CURRENT_Q:
- val = current_q_read(file);
+ val = current_q_read(qm);
break;
case CLEAR_ENABLE:
- val = clear_enable_read(file);
+ val = clear_enable_read(qm);
break;
default:
- mutex_unlock(&file->lock);
- return -EINVAL;
+ goto err_input;
}
mutex_unlock(&file->lock);
+ hisi_qm_put_dfx_access(qm);
ret = scnprintf(tbuf, QM_DBG_TMP_BUF_LEN, "%u\n", val);
return simple_read_from_buffer(buf, count, pos, tbuf, ret);
+
+err_input:
+ mutex_unlock(&file->lock);
+ hisi_qm_put_dfx_access(qm);
+ return -EINVAL;
}
static ssize_t qm_debug_write(struct file *filp, const char __user *buf,
{
struct debugfs_file *file = filp->private_data;
enum qm_debug_file index = file->index;
+ struct hisi_qm *qm = file_to_qm(file);
unsigned long val;
char tbuf[QM_DBG_TMP_BUF_LEN];
int len, ret;
if (kstrtoul(tbuf, 0, &val))
return -EFAULT;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
mutex_lock(&file->lock);
switch (index) {
case CURRENT_QM:
- ret = current_qm_write(file, val);
+ ret = current_qm_write(qm, val);
break;
case CURRENT_Q:
- ret = current_q_write(file, val);
+ ret = current_q_write(qm, val);
break;
case CLEAR_ENABLE:
- ret = clear_enable_write(file, val);
+ ret = clear_enable_write(qm, val);
break;
default:
ret = -EINVAL;
}
mutex_unlock(&file->lock);
+ hisi_qm_put_dfx_access(qm);
+
if (ret)
return ret;
.write = qm_debug_write,
};
-struct qm_dfx_registers {
- char *reg_name;
- u64 reg_offset;
-};
-
#define CNT_CYC_REGS_NUM 10
-static struct qm_dfx_registers qm_dfx_regs[] = {
+static const struct debugfs_reg32 qm_dfx_regs[] = {
/* XXX_CNT are reading clear register */
{"QM_ECC_1BIT_CNT ", 0x104000ull},
{"QM_ECC_MBIT_CNT ", 0x104008ull},
{"QM_DFX_FF_ST5 ", 0x1040dcull},
{"QM_DFX_FF_ST6 ", 0x1040e0ull},
{"QM_IN_IDLE_ST ", 0x1040e4ull},
- { NULL, 0}
};
-static struct qm_dfx_registers qm_vf_dfx_regs[] = {
+static const struct debugfs_reg32 qm_vf_dfx_regs[] = {
{"QM_DFX_FUNS_ACTIVE_ST ", 0x200ull},
- { NULL, 0}
};
-static int qm_regs_show(struct seq_file *s, void *unused)
+/**
+ * hisi_qm_regs_dump() - Dump registers's value.
+ * @s: debugfs file handle.
+ * @regset: accelerator registers information.
+ *
+ * Dump accelerator registers.
+ */
+void hisi_qm_regs_dump(struct seq_file *s, struct debugfs_regset32 *regset)
{
- struct hisi_qm *qm = s->private;
- struct qm_dfx_registers *regs;
+ struct pci_dev *pdev = to_pci_dev(regset->dev);
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ const struct debugfs_reg32 *regs = regset->regs;
+ int regs_len = regset->nregs;
+ int i, ret;
u32 val;
- if (qm->fun_type == QM_HW_PF)
- regs = qm_dfx_regs;
- else
- regs = qm_vf_dfx_regs;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return;
- while (regs->reg_name) {
- val = readl(qm->io_base + regs->reg_offset);
- seq_printf(s, "%s= 0x%08x\n", regs->reg_name, val);
- regs++;
+ for (i = 0; i < regs_len; i++) {
+ val = readl(regset->base + regs[i].offset);
+ seq_printf(s, "%s= 0x%08x\n", regs[i].name, val);
}
+ hisi_qm_put_dfx_access(qm);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_regs_dump);
+
+static int qm_regs_show(struct seq_file *s, void *unused)
+{
+ struct hisi_qm *qm = s->private;
+ struct debugfs_regset32 regset;
+
+ if (qm->fun_type == QM_HW_PF) {
+ regset.regs = qm_dfx_regs;
+ regset.nregs = ARRAY_SIZE(qm_dfx_regs);
+ } else {
+ regset.regs = qm_vf_dfx_regs;
+ regset.nregs = ARRAY_SIZE(qm_vf_dfx_regs);
+ }
+
+ regset.base = qm->io_base;
+ regset.dev = &qm->pdev->dev;
+
+ hisi_qm_regs_dump(s, ®set);
+
return 0;
}
if (*pos)
return 0;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
/* Judge if the instance is being reset. */
if (unlikely(atomic_read(&qm->status.flags) == QM_STOP))
return 0;
- if (count > QM_DBG_WRITE_LEN)
- return -ENOSPC;
+ if (count > QM_DBG_WRITE_LEN) {
+ ret = -ENOSPC;
+ goto put_dfx_access;
+ }
cmd_buf = memdup_user_nul(buffer, count);
- if (IS_ERR(cmd_buf))
- return PTR_ERR(cmd_buf);
+ if (IS_ERR(cmd_buf)) {
+ ret = PTR_ERR(cmd_buf);
+ goto put_dfx_access;
+ }
cmd_buf_tmp = strchr(cmd_buf, '\n');
if (cmd_buf_tmp) {
ret = qm_cmd_write_dump(qm, cmd_buf);
if (ret) {
kfree(cmd_buf);
- return ret;
+ goto put_dfx_access;
}
kfree(cmd_buf);
- return count;
+ ret = count;
+
+put_dfx_access:
+ hisi_qm_put_dfx_access(qm);
+ return ret;
}
static const struct file_operations qm_cmd_fops = {
struct hisi_qp *hisi_qm_create_qp(struct hisi_qm *qm, u8 alg_type)
{
struct hisi_qp *qp;
+ int ret;
+
+ ret = qm_pm_get_sync(qm);
+ if (ret)
+ return ERR_PTR(ret);
down_write(&qm->qps_lock);
qp = qm_create_qp_nolock(qm, alg_type);
up_write(&qm->qps_lock);
+ if (IS_ERR(qp))
+ qm_pm_put_sync(qm);
+
return qp;
}
EXPORT_SYMBOL_GPL(hisi_qm_create_qp);
idr_remove(&qm->qp_idr, qp->qp_id);
up_write(&qm->qps_lock);
+
+ qm_pm_put_sync(qm);
}
EXPORT_SYMBOL_GPL(hisi_qm_release_qp);
init_rwsem(&qm->qps_lock);
qm->qp_in_used = 0;
qm->misc_ctl = false;
+ if (qm->fun_type == QM_HW_PF && qm->ver > QM_HW_V2) {
+ if (!acpi_device_power_manageable(ACPI_COMPANION(&pdev->dev)))
+ dev_info(&pdev->dev, "_PS0 and _PR0 are not defined");
+ }
}
static void qm_cmd_uninit(struct hisi_qm *qm)
u32 qos_val, ir;
int ret;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
/* Mailbox and reset cannot be operated at the same time */
if (test_and_set_bit(QM_RESETTING, &qm->misc_ctl)) {
pci_err(qm->pdev, "dev resetting, read alg qos failed!\n");
- return -EAGAIN;
+ ret = -EAGAIN;
+ goto err_put_dfx_access;
}
if (qm->fun_type == QM_HW_PF) {
err_get_status:
clear_bit(QM_RESETTING, &qm->misc_ctl);
+err_put_dfx_access:
+ hisi_qm_put_dfx_access(qm);
return ret;
}
fun_index = device * 8 + function;
+ ret = qm_pm_get_sync(qm);
+ if (ret) {
+ ret = -EINVAL;
+ goto err_get_status;
+ }
+
ret = qm_func_shaper_enable(qm, fun_index, val);
if (ret) {
pci_err(qm->pdev, "failed to enable function shaper!\n");
ret = -EINVAL;
- goto err_get_status;
+ goto err_put_sync;
}
- ret = count;
+ ret = count;
+err_put_sync:
+ qm_pm_put_sync(qm);
err_get_status:
clear_bit(QM_RESETTING, &qm->misc_ctl);
return ret;
*/
void hisi_qm_debug_regs_clear(struct hisi_qm *qm)
{
- struct qm_dfx_registers *regs;
+ const struct debugfs_reg32 *regs;
int i;
/* clear current_qm */
regs = qm_dfx_regs;
for (i = 0; i < CNT_CYC_REGS_NUM; i++) {
- readl(qm->io_base + regs->reg_offset);
+ readl(qm->io_base + regs->offset);
regs++;
}
struct hisi_qm *qm = pci_get_drvdata(pdev);
int pre_existing_vfs, num_vfs, total_vfs, ret;
+ ret = qm_pm_get_sync(qm);
+ if (ret)
+ return ret;
+
total_vfs = pci_sriov_get_totalvfs(pdev);
pre_existing_vfs = pci_num_vf(pdev);
if (pre_existing_vfs) {
pci_err(pdev, "%d VFs already enabled. Please disable pre-enabled VFs!\n",
pre_existing_vfs);
- return 0;
+ goto err_put_sync;
}
num_vfs = min_t(int, max_vfs, total_vfs);
ret = qm_vf_q_assign(qm, num_vfs);
if (ret) {
pci_err(pdev, "Can't assign queues for VF!\n");
- return ret;
+ goto err_put_sync;
}
qm->vfs_num = num_vfs;
if (ret) {
pci_err(pdev, "Can't enable VF!\n");
qm_clear_vft_config(qm);
- return ret;
+ goto err_put_sync;
}
pci_info(pdev, "VF enabled, vfs_num(=%d)!\n", num_vfs);
return num_vfs;
+
+err_put_sync:
+ qm_pm_put_sync(qm);
+ return ret;
}
EXPORT_SYMBOL_GPL(hisi_qm_sriov_enable);
{
struct hisi_qm *qm = pci_get_drvdata(pdev);
int total_vfs = pci_sriov_get_totalvfs(qm->pdev);
+ int ret;
if (pci_vfs_assigned(pdev)) {
pci_err(pdev, "Failed to disable VFs as VFs are assigned!\n");
pci_disable_sriov(pdev);
/* clear vf function shaper configure array */
memset(qm->factor + 1, 0, sizeof(struct qm_shaper_factor) * total_vfs);
+ ret = qm_clear_vft_config(qm);
+ if (ret)
+ return ret;
- return qm_clear_vft_config(qm);
+ qm_pm_put_sync(qm);
+
+ return 0;
}
EXPORT_SYMBOL_GPL(hisi_qm_sriov_disable);
struct hisi_qm *qm = container_of(rst_work, struct hisi_qm, rst_work);
int ret;
+ ret = qm_pm_get_sync(qm);
+ if (ret) {
+ clear_bit(QM_RST_SCHED, &qm->misc_ctl);
+ return;
+ }
+
/* reset pcie device controller */
ret = qm_controller_reset(qm);
if (ret)
dev_err(&qm->pdev->dev, "controller reset failed (%d)\n", ret);
+ qm_pm_put_sync(qm);
}
static void qm_pf_reset_vf_prepare(struct hisi_qm *qm,
}
EXPORT_SYMBOL_GPL(hisi_qm_init);
+/**
+ * hisi_qm_get_dfx_access() - Try to get dfx access.
+ * @qm: pointer to accelerator device.
+ *
+ * Try to get dfx access, then user can get message.
+ *
+ * If device is in suspended, return failure, otherwise
+ * bump up the runtime PM usage counter.
+ */
+int hisi_qm_get_dfx_access(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+
+ if (pm_runtime_suspended(dev)) {
+ dev_info(dev, "can not read/write - device in suspended.\n");
+ return -EAGAIN;
+ }
+
+ return qm_pm_get_sync(qm);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_get_dfx_access);
+
+/**
+ * hisi_qm_put_dfx_access() - Put dfx access.
+ * @qm: pointer to accelerator device.
+ *
+ * Put dfx access, drop runtime PM usage counter.
+ */
+void hisi_qm_put_dfx_access(struct hisi_qm *qm)
+{
+ qm_pm_put_sync(qm);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_put_dfx_access);
+
+/**
+ * hisi_qm_pm_init() - Initialize qm runtime PM.
+ * @qm: pointer to accelerator device.
+ *
+ * Function that initialize qm runtime PM.
+ */
+void hisi_qm_pm_init(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+
+ if (qm->fun_type == QM_HW_VF || qm->ver < QM_HW_V3)
+ return;
+
+ pm_runtime_set_autosuspend_delay(dev, QM_AUTOSUSPEND_DELAY);
+ pm_runtime_use_autosuspend(dev);
+ pm_runtime_put_noidle(dev);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_pm_init);
+
+/**
+ * hisi_qm_pm_uninit() - Uninitialize qm runtime PM.
+ * @qm: pointer to accelerator device.
+ *
+ * Function that uninitialize qm runtime PM.
+ */
+void hisi_qm_pm_uninit(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+
+ if (qm->fun_type == QM_HW_VF || qm->ver < QM_HW_V3)
+ return;
+
+ pm_runtime_get_noresume(dev);
+ pm_runtime_dont_use_autosuspend(dev);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_pm_uninit);
+
+static int qm_prepare_for_suspend(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
+ u32 val;
+
+ ret = qm->ops->set_msi(qm, false);
+ if (ret) {
+ pci_err(pdev, "failed to disable MSI before suspending!\n");
+ return ret;
+ }
+
+ /* shutdown OOO register */
+ writel(ACC_MASTER_GLOBAL_CTRL_SHUTDOWN,
+ qm->io_base + ACC_MASTER_GLOBAL_CTRL);
+
+ ret = readl_relaxed_poll_timeout(qm->io_base + ACC_MASTER_TRANS_RETURN,
+ val,
+ (val == ACC_MASTER_TRANS_RETURN_RW),
+ POLL_PERIOD, POLL_TIMEOUT);
+ if (ret) {
+ pci_emerg(pdev, "Bus lock! Please reset system.\n");
+ return ret;
+ }
+
+ ret = qm_set_pf_mse(qm, false);
+ if (ret)
+ pci_err(pdev, "failed to disable MSE before suspending!\n");
+
+ return ret;
+}
+
+static int qm_rebuild_for_resume(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
+
+ ret = qm_set_pf_mse(qm, true);
+ if (ret) {
+ pci_err(pdev, "failed to enable MSE after resuming!\n");
+ return ret;
+ }
+
+ ret = qm->ops->set_msi(qm, true);
+ if (ret) {
+ pci_err(pdev, "failed to enable MSI after resuming!\n");
+ return ret;
+ }
+
+ ret = qm_dev_hw_init(qm);
+ if (ret) {
+ pci_err(pdev, "failed to init device after resuming\n");
+ return ret;
+ }
+
+ qm_cmd_init(qm);
+ hisi_qm_dev_err_init(qm);
+
+ return 0;
+}
+
+/**
+ * hisi_qm_suspend() - Runtime suspend of given device.
+ * @dev: device to suspend.
+ *
+ * Function that suspend the device.
+ */
+int hisi_qm_suspend(struct device *dev)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ int ret;
+
+ pci_info(pdev, "entering suspended state\n");
+
+ ret = hisi_qm_stop(qm, QM_NORMAL);
+ if (ret) {
+ pci_err(pdev, "failed to stop qm(%d)\n", ret);
+ return ret;
+ }
+
+ ret = qm_prepare_for_suspend(qm);
+ if (ret)
+ pci_err(pdev, "failed to prepare suspended(%d)\n", ret);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_suspend);
+
+/**
+ * hisi_qm_resume() - Runtime resume of given device.
+ * @dev: device to resume.
+ *
+ * Function that resume the device.
+ */
+int hisi_qm_resume(struct device *dev)
+{
+ struct pci_dev *pdev = to_pci_dev(dev);
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ int ret;
+
+ pci_info(pdev, "resuming from suspend state\n");
+
+ ret = qm_rebuild_for_resume(qm);
+ if (ret) {
+ pci_err(pdev, "failed to rebuild resume(%d)\n", ret);
+ return ret;
+ }
+
+ ret = hisi_qm_start(qm);
+ if (ret)
+ pci_err(pdev, "failed to start qm(%d)\n", ret);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_resume);
+
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Zhou Wang <wangzhou1@hisilicon.com>");
MODULE_DESCRIPTION("HiSilicon Accelerator queue manager driver");
#define HISI_ACC_QM_H
#include <linux/bitfield.h>
+#include <linux/debugfs.h>
#include <linux/iopoll.h>
#include <linux/module.h>
#include <linux/pci.h>
void hisi_qm_wait_task_finish(struct hisi_qm *qm, struct hisi_qm_list *qm_list);
int hisi_qm_alg_register(struct hisi_qm *qm, struct hisi_qm_list *qm_list);
void hisi_qm_alg_unregister(struct hisi_qm *qm, struct hisi_qm_list *qm_list);
+int hisi_qm_resume(struct device *dev);
+int hisi_qm_suspend(struct device *dev);
+void hisi_qm_pm_uninit(struct hisi_qm *qm);
+void hisi_qm_pm_init(struct hisi_qm *qm);
+int hisi_qm_get_dfx_access(struct hisi_qm *qm);
+void hisi_qm_put_dfx_access(struct hisi_qm *qm);
+void hisi_qm_regs_dump(struct seq_file *s, struct debugfs_regset32 *regset);
#endif
struct device *dev;
};
-enum sec_endian {
- SEC_LE = 0,
- SEC_32BE,
- SEC_64BE
-};
enum sec_debug_file_index {
SEC_CLEAR_ENABLE,
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/pci.h>
+#include <linux/pm_runtime.h>
#include <linux/seq_file.h>
#include <linux/topology.h>
#include <linux/uacce.h>
#define SEC_MEM_START_INIT_REG 0x301100
#define SEC_MEM_INIT_DONE_REG 0x301104
+/* clock gating */
#define SEC_CONTROL_REG 0x301200
-#define SEC_TRNG_EN_SHIFT 8
+#define SEC_DYNAMIC_GATE_REG 0x30121c
+#define SEC_CORE_AUTO_GATE 0x30212c
+#define SEC_DYNAMIC_GATE_EN 0x7bff
+#define SEC_CORE_AUTO_GATE_EN GENMASK(3, 0)
#define SEC_CLK_GATE_ENABLE BIT(3)
#define SEC_CLK_GATE_DISABLE (~BIT(3))
+
+#define SEC_TRNG_EN_SHIFT 8
#define SEC_AXI_SHUTDOWN_ENABLE BIT(12)
#define SEC_AXI_SHUTDOWN_DISABLE 0xFFFFEFFF
};
MODULE_DEVICE_TABLE(pci, sec_dev_ids);
-static u8 sec_get_endian(struct hisi_qm *qm)
+static void sec_set_endian(struct hisi_qm *qm)
{
u32 reg;
- /*
- * As for VF, it is a wrong way to get endian setting by
- * reading a register of the engine
- */
- if (qm->pdev->is_virtfn) {
- dev_err_ratelimited(&qm->pdev->dev,
- "cannot access a register in VF!\n");
- return SEC_LE;
- }
reg = readl_relaxed(qm->io_base + SEC_CONTROL_REG);
- /* BD little endian mode */
- if (!(reg & BIT(0)))
- return SEC_LE;
+ reg &= ~(BIT(1) | BIT(0));
+ if (!IS_ENABLED(CONFIG_64BIT))
+ reg |= BIT(1);
- /* BD 32-bits big endian mode */
- else if (!(reg & BIT(1)))
- return SEC_32BE;
- /* BD 64-bits big endian mode */
- else
- return SEC_64BE;
+ if (!IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN))
+ reg |= BIT(0);
+
+ writel_relaxed(reg, qm->io_base + SEC_CONTROL_REG);
}
static void sec_open_sva_prefetch(struct hisi_qm *qm)
pci_err(qm->pdev, "failed to close sva prefetch\n");
}
+static void sec_enable_clock_gate(struct hisi_qm *qm)
+{
+ u32 val;
+
+ if (qm->ver < QM_HW_V3)
+ return;
+
+ val = readl_relaxed(qm->io_base + SEC_CONTROL_REG);
+ val |= SEC_CLK_GATE_ENABLE;
+ writel_relaxed(val, qm->io_base + SEC_CONTROL_REG);
+
+ val = readl(qm->io_base + SEC_DYNAMIC_GATE_REG);
+ val |= SEC_DYNAMIC_GATE_EN;
+ writel(val, qm->io_base + SEC_DYNAMIC_GATE_REG);
+
+ val = readl(qm->io_base + SEC_CORE_AUTO_GATE);
+ val |= SEC_CORE_AUTO_GATE_EN;
+ writel(val, qm->io_base + SEC_CORE_AUTO_GATE);
+}
+
+static void sec_disable_clock_gate(struct hisi_qm *qm)
+{
+ u32 val;
+
+ /* Kunpeng920 needs to close clock gating */
+ val = readl_relaxed(qm->io_base + SEC_CONTROL_REG);
+ val &= SEC_CLK_GATE_DISABLE;
+ writel_relaxed(val, qm->io_base + SEC_CONTROL_REG);
+}
+
static int sec_engine_init(struct hisi_qm *qm)
{
int ret;
u32 reg;
- /* disable clock gate control */
- reg = readl_relaxed(qm->io_base + SEC_CONTROL_REG);
- reg &= SEC_CLK_GATE_DISABLE;
- writel_relaxed(reg, qm->io_base + SEC_CONTROL_REG);
+ /* disable clock gate control before mem init */
+ sec_disable_clock_gate(qm);
writel_relaxed(0x1, qm->io_base + SEC_MEM_START_INIT_REG);
qm->io_base + SEC_BD_ERR_CHK_EN_REG3);
/* config endian */
- reg = readl_relaxed(qm->io_base + SEC_CONTROL_REG);
- reg |= sec_get_endian(qm);
- writel_relaxed(reg, qm->io_base + SEC_CONTROL_REG);
+ sec_set_endian(qm);
+
+ sec_enable_clock_gate(qm);
return 0;
}
writel(SEC_RAS_DISABLE, qm->io_base + SEC_RAS_NFE_REG);
}
-static u32 sec_clear_enable_read(struct sec_debug_file *file)
+static u32 sec_clear_enable_read(struct hisi_qm *qm)
{
- struct hisi_qm *qm = file->qm;
-
return readl(qm->io_base + SEC_CTRL_CNT_CLR_CE) &
SEC_CTRL_CNT_CLR_CE_BIT;
}
-static int sec_clear_enable_write(struct sec_debug_file *file, u32 val)
+static int sec_clear_enable_write(struct hisi_qm *qm, u32 val)
{
- struct hisi_qm *qm = file->qm;
u32 tmp;
if (val != 1 && val)
{
struct sec_debug_file *file = filp->private_data;
char tbuf[SEC_DBGFS_VAL_MAX_LEN];
+ struct hisi_qm *qm = file->qm;
u32 val;
int ret;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
spin_lock_irq(&file->lock);
switch (file->index) {
case SEC_CLEAR_ENABLE:
- val = sec_clear_enable_read(file);
+ val = sec_clear_enable_read(qm);
break;
default:
- spin_unlock_irq(&file->lock);
- return -EINVAL;
+ goto err_input;
}
spin_unlock_irq(&file->lock);
- ret = snprintf(tbuf, SEC_DBGFS_VAL_MAX_LEN, "%u\n", val);
+ hisi_qm_put_dfx_access(qm);
+ ret = snprintf(tbuf, SEC_DBGFS_VAL_MAX_LEN, "%u\n", val);
return simple_read_from_buffer(buf, count, pos, tbuf, ret);
+
+err_input:
+ spin_unlock_irq(&file->lock);
+ hisi_qm_put_dfx_access(qm);
+ return -EINVAL;
}
static ssize_t sec_debug_write(struct file *filp, const char __user *buf,
{
struct sec_debug_file *file = filp->private_data;
char tbuf[SEC_DBGFS_VAL_MAX_LEN];
+ struct hisi_qm *qm = file->qm;
unsigned long val;
int len, ret;
if (kstrtoul(tbuf, 0, &val))
return -EFAULT;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
spin_lock_irq(&file->lock);
switch (file->index) {
case SEC_CLEAR_ENABLE:
- ret = sec_clear_enable_write(file, val);
+ ret = sec_clear_enable_write(qm, val);
if (ret)
goto err_input;
break;
goto err_input;
}
- spin_unlock_irq(&file->lock);
-
- return count;
+ ret = count;
err_input:
spin_unlock_irq(&file->lock);
+ hisi_qm_put_dfx_access(qm);
return ret;
}
DEFINE_DEBUGFS_ATTRIBUTE(sec_atomic64_ops, sec_debugfs_atomic64_get,
sec_debugfs_atomic64_set, "%lld\n");
+static int sec_regs_show(struct seq_file *s, void *unused)
+{
+ hisi_qm_regs_dump(s, s->private);
+
+ return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(sec_regs);
+
static int sec_core_debug_init(struct hisi_qm *qm)
{
struct sec_dev *sec = container_of(qm, struct sec_dev, qm);
regset->regs = sec_dfx_regs;
regset->nregs = ARRAY_SIZE(sec_dfx_regs);
regset->base = qm->io_base;
+ regset->dev = dev;
if (qm->pdev->device == SEC_PF_PCI_DEVICE_ID)
- debugfs_create_regset32("regs", 0444, tmp_d, regset);
+ debugfs_create_file("regs", 0444, tmp_d, regset, &sec_regs_fops);
for (i = 0; i < ARRAY_SIZE(sec_dfx_labels); i++) {
atomic64_t *data = (atomic64_t *)((uintptr_t)dfx +
goto err_alg_unregister;
}
+ hisi_qm_pm_init(qm);
+
return 0;
err_alg_unregister:
- hisi_qm_alg_unregister(qm, &sec_devices);
+ if (qm->qp_num >= ctx_q_num)
+ hisi_qm_alg_unregister(qm, &sec_devices);
err_qm_stop:
sec_debugfs_exit(qm);
hisi_qm_stop(qm, QM_NORMAL);
{
struct hisi_qm *qm = pci_get_drvdata(pdev);
+ hisi_qm_pm_uninit(qm);
hisi_qm_wait_task_finish(qm, &sec_devices);
if (qm->qp_num >= ctx_q_num)
hisi_qm_alg_unregister(qm, &sec_devices);
sec_qm_uninit(qm);
}
+static const struct dev_pm_ops sec_pm_ops = {
+ SET_RUNTIME_PM_OPS(hisi_qm_suspend, hisi_qm_resume, NULL)
+};
+
static const struct pci_error_handlers sec_err_handler = {
.error_detected = hisi_qm_dev_err_detected,
.slot_reset = hisi_qm_dev_slot_reset,
.err_handler = &sec_err_handler,
.sriov_configure = hisi_qm_sriov_configure,
.shutdown = hisi_qm_dev_shutdown,
+ .driver.pm = &sec_pm_ops,
};
static void sec_register_debugfs(void)
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/pci.h>
+#include <linux/pm_runtime.h>
#include <linux/seq_file.h>
#include <linux/topology.h>
#include <linux/uacce.h>
#define HZIP_DELAY_1_US 1
#define HZIP_POLL_TIMEOUT_US 1000
+/* clock gating */
+#define HZIP_PEH_CFG_AUTO_GATE 0x3011A8
+#define HZIP_PEH_CFG_AUTO_GATE_EN BIT(0)
+#define HZIP_CORE_GATED_EN GENMASK(15, 8)
+#define HZIP_CORE_GATED_OOO_EN BIT(29)
+#define HZIP_CLOCK_GATED_EN (HZIP_CORE_GATED_EN | \
+ HZIP_CORE_GATED_OOO_EN)
+
static const char hisi_zip_name[] = "hisi_zip";
static struct dentry *hzip_debugfs_root;
pci_err(qm->pdev, "failed to close sva prefetch\n");
}
+static void hisi_zip_enable_clock_gate(struct hisi_qm *qm)
+{
+ u32 val;
+
+ if (qm->ver < QM_HW_V3)
+ return;
+
+ val = readl(qm->io_base + HZIP_CLOCK_GATE_CTRL);
+ val |= HZIP_CLOCK_GATED_EN;
+ writel(val, qm->io_base + HZIP_CLOCK_GATE_CTRL);
+
+ val = readl(qm->io_base + HZIP_PEH_CFG_AUTO_GATE);
+ val |= HZIP_PEH_CFG_AUTO_GATE_EN;
+ writel(val, qm->io_base + HZIP_PEH_CFG_AUTO_GATE);
+}
+
static int hisi_zip_set_user_domain_and_cache(struct hisi_qm *qm)
{
void __iomem *base = qm->io_base;
CQC_CACHE_WB_ENABLE | FIELD_PREP(SQC_CACHE_WB_THRD, 1) |
FIELD_PREP(CQC_CACHE_WB_THRD, 1), base + QM_CACHE_CTL);
+ hisi_zip_enable_clock_gate(qm);
+
return 0;
}
return &hisi_zip->qm;
}
-static u32 clear_enable_read(struct ctrl_debug_file *file)
+static u32 clear_enable_read(struct hisi_qm *qm)
{
- struct hisi_qm *qm = file_to_qm(file);
-
return readl(qm->io_base + HZIP_SOFT_CTRL_CNT_CLR_CE) &
HZIP_SOFT_CTRL_CNT_CLR_CE_BIT;
}
-static int clear_enable_write(struct ctrl_debug_file *file, u32 val)
+static int clear_enable_write(struct hisi_qm *qm, u32 val)
{
- struct hisi_qm *qm = file_to_qm(file);
u32 tmp;
if (val != 1 && val != 0)
size_t count, loff_t *pos)
{
struct ctrl_debug_file *file = filp->private_data;
+ struct hisi_qm *qm = file_to_qm(file);
char tbuf[HZIP_BUF_SIZE];
u32 val;
int ret;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
spin_lock_irq(&file->lock);
switch (file->index) {
case HZIP_CLEAR_ENABLE:
- val = clear_enable_read(file);
+ val = clear_enable_read(qm);
break;
default:
- spin_unlock_irq(&file->lock);
- return -EINVAL;
+ goto err_input;
}
spin_unlock_irq(&file->lock);
+
+ hisi_qm_put_dfx_access(qm);
ret = scnprintf(tbuf, sizeof(tbuf), "%u\n", val);
return simple_read_from_buffer(buf, count, pos, tbuf, ret);
+
+err_input:
+ spin_unlock_irq(&file->lock);
+ hisi_qm_put_dfx_access(qm);
+ return -EINVAL;
}
static ssize_t hisi_zip_ctrl_debug_write(struct file *filp,
size_t count, loff_t *pos)
{
struct ctrl_debug_file *file = filp->private_data;
+ struct hisi_qm *qm = file_to_qm(file);
char tbuf[HZIP_BUF_SIZE];
unsigned long val;
int len, ret;
if (kstrtoul(tbuf, 0, &val))
return -EFAULT;
+ ret = hisi_qm_get_dfx_access(qm);
+ if (ret)
+ return ret;
+
spin_lock_irq(&file->lock);
switch (file->index) {
case HZIP_CLEAR_ENABLE:
- ret = clear_enable_write(file, val);
+ ret = clear_enable_write(qm, val);
if (ret)
goto err_input;
break;
ret = -EINVAL;
goto err_input;
}
- spin_unlock_irq(&file->lock);
- return count;
+ ret = count;
err_input:
spin_unlock_irq(&file->lock);
+ hisi_qm_put_dfx_access(qm);
return ret;
}
DEFINE_DEBUGFS_ATTRIBUTE(zip_atomic64_ops, zip_debugfs_atomic64_get,
zip_debugfs_atomic64_set, "%llu\n");
+static int hisi_zip_regs_show(struct seq_file *s, void *unused)
+{
+ hisi_qm_regs_dump(s, s->private);
+
+ return 0;
+}
+
+DEFINE_SHOW_ATTRIBUTE(hisi_zip_regs);
+
static int hisi_zip_core_debug_init(struct hisi_qm *qm)
{
struct device *dev = &qm->pdev->dev;
regset->regs = hzip_dfx_regs;
regset->nregs = ARRAY_SIZE(hzip_dfx_regs);
regset->base = qm->io_base + core_offsets[i];
+ regset->dev = dev;
tmp_d = debugfs_create_dir(buf, qm->debug.debug_root);
- debugfs_create_regset32("regs", 0444, tmp_d, regset);
+ debugfs_create_file("regs", 0444, tmp_d, regset,
+ &hisi_zip_regs_fops);
}
return 0;
goto err_qm_alg_unregister;
}
+ hisi_qm_pm_init(qm);
+
return 0;
err_qm_alg_unregister:
{
struct hisi_qm *qm = pci_get_drvdata(pdev);
+ hisi_qm_pm_uninit(qm);
hisi_qm_wait_task_finish(qm, &zip_devices);
hisi_qm_alg_unregister(qm, &zip_devices);
hisi_zip_qm_uninit(qm);
}
+static const struct dev_pm_ops hisi_zip_pm_ops = {
+ SET_RUNTIME_PM_OPS(hisi_qm_suspend, hisi_qm_resume, NULL)
+};
+
static const struct pci_error_handlers hisi_zip_err_handler = {
.error_detected = hisi_qm_dev_err_detected,
.slot_reset = hisi_qm_dev_slot_reset,
hisi_qm_sriov_configure : NULL,
.err_handler = &hisi_zip_err_handler,
.shutdown = hisi_qm_dev_shutdown,
+ .driver.pm = &hisi_zip_pm_ops,
};
static void hisi_zip_register_debugfs(void)
static int mxs_dcp_start_dma(struct dcp_async_ctx *actx)
{
+ int dma_err;
struct dcp *sdcp = global_sdcp;
const int chan = actx->chan;
uint32_t stat;
unsigned long ret;
struct dcp_dma_desc *desc = &sdcp->coh->desc[actx->chan];
-
dma_addr_t desc_phys = dma_map_single(sdcp->dev, desc, sizeof(*desc),
DMA_TO_DEVICE);
+ dma_err = dma_mapping_error(sdcp->dev, desc_phys);
+ if (dma_err)
+ return dma_err;
+
reinit_completion(&sdcp->completion[chan]);
/* Clear status register. */
static int mxs_dcp_run_aes(struct dcp_async_ctx *actx,
struct skcipher_request *req, int init)
{
+ dma_addr_t key_phys, src_phys, dst_phys;
struct dcp *sdcp = global_sdcp;
struct dcp_dma_desc *desc = &sdcp->coh->desc[actx->chan];
struct dcp_aes_req_ctx *rctx = skcipher_request_ctx(req);
int ret;
- dma_addr_t key_phys = dma_map_single(sdcp->dev, sdcp->coh->aes_key,
- 2 * AES_KEYSIZE_128,
- DMA_TO_DEVICE);
- dma_addr_t src_phys = dma_map_single(sdcp->dev, sdcp->coh->aes_in_buf,
- DCP_BUF_SZ, DMA_TO_DEVICE);
- dma_addr_t dst_phys = dma_map_single(sdcp->dev, sdcp->coh->aes_out_buf,
- DCP_BUF_SZ, DMA_FROM_DEVICE);
+ key_phys = dma_map_single(sdcp->dev, sdcp->coh->aes_key,
+ 2 * AES_KEYSIZE_128, DMA_TO_DEVICE);
+ ret = dma_mapping_error(sdcp->dev, key_phys);
+ if (ret)
+ return ret;
+
+ src_phys = dma_map_single(sdcp->dev, sdcp->coh->aes_in_buf,
+ DCP_BUF_SZ, DMA_TO_DEVICE);
+ ret = dma_mapping_error(sdcp->dev, src_phys);
+ if (ret)
+ goto err_src;
+
+ dst_phys = dma_map_single(sdcp->dev, sdcp->coh->aes_out_buf,
+ DCP_BUF_SZ, DMA_FROM_DEVICE);
+ ret = dma_mapping_error(sdcp->dev, dst_phys);
+ if (ret)
+ goto err_dst;
if (actx->fill % AES_BLOCK_SIZE) {
dev_err(sdcp->dev, "Invalid block size!\n");
ret = mxs_dcp_start_dma(actx);
aes_done_run:
+ dma_unmap_single(sdcp->dev, dst_phys, DCP_BUF_SZ, DMA_FROM_DEVICE);
+err_dst:
+ dma_unmap_single(sdcp->dev, src_phys, DCP_BUF_SZ, DMA_TO_DEVICE);
+err_src:
dma_unmap_single(sdcp->dev, key_phys, 2 * AES_KEYSIZE_128,
DMA_TO_DEVICE);
- dma_unmap_single(sdcp->dev, src_phys, DCP_BUF_SZ, DMA_TO_DEVICE);
- dma_unmap_single(sdcp->dev, dst_phys, DCP_BUF_SZ, DMA_FROM_DEVICE);
return ret;
}
struct scatterlist *dst = req->dst;
struct scatterlist *src = req->src;
- const int nents = sg_nents(req->src);
+ int dst_nents = sg_nents(dst);
const int out_off = DCP_BUF_SZ;
uint8_t *in_buf = sdcp->coh->aes_in_buf;
uint8_t *out_buf = sdcp->coh->aes_out_buf;
- uint8_t *out_tmp, *src_buf, *dst_buf = NULL;
uint32_t dst_off = 0;
+ uint8_t *src_buf = NULL;
uint32_t last_out_len = 0;
uint8_t *key = sdcp->coh->aes_key;
int ret = 0;
- int split = 0;
- unsigned int i, len, clen, rem = 0, tlen = 0;
+ unsigned int i, len, clen, tlen = 0;
int init = 0;
bool limit_hit = false;
memset(key + AES_KEYSIZE_128, 0, AES_KEYSIZE_128);
}
- for_each_sg(req->src, src, nents, i) {
+ for_each_sg(req->src, src, sg_nents(src), i) {
src_buf = sg_virt(src);
len = sg_dma_len(src);
tlen += len;
* submit the buffer.
*/
if (actx->fill == out_off || sg_is_last(src) ||
- limit_hit) {
+ limit_hit) {
ret = mxs_dcp_run_aes(actx, req, init);
if (ret)
return ret;
init = 0;
- out_tmp = out_buf;
+ sg_pcopy_from_buffer(dst, dst_nents, out_buf,
+ actx->fill, dst_off);
+ dst_off += actx->fill;
last_out_len = actx->fill;
- while (dst && actx->fill) {
- if (!split) {
- dst_buf = sg_virt(dst);
- dst_off = 0;
- }
- rem = min(sg_dma_len(dst) - dst_off,
- actx->fill);
-
- memcpy(dst_buf + dst_off, out_tmp, rem);
- out_tmp += rem;
- dst_off += rem;
- actx->fill -= rem;
-
- if (dst_off == sg_dma_len(dst)) {
- dst = sg_next(dst);
- split = 0;
- } else {
- split = 1;
- }
- }
+ actx->fill = 0;
}
} while (len);
dma_addr_t buf_phys = dma_map_single(sdcp->dev, sdcp->coh->sha_in_buf,
DCP_BUF_SZ, DMA_TO_DEVICE);
+ ret = dma_mapping_error(sdcp->dev, buf_phys);
+ if (ret)
+ return ret;
+
/* Fill in the DMA descriptor. */
desc->control0 = MXS_DCP_CONTROL0_DECR_SEMAPHORE |
MXS_DCP_CONTROL0_INTERRUPT |
if (rctx->fini) {
digest_phys = dma_map_single(sdcp->dev, sdcp->coh->sha_out_buf,
DCP_SHA_PAY_SZ, DMA_FROM_DEVICE);
+ ret = dma_mapping_error(sdcp->dev, digest_phys);
+ if (ret)
+ goto done_run;
+
desc->control0 |= MXS_DCP_CONTROL0_HASH_TERM;
desc->payload = digest_phys;
}
spin_lock_init(&dd->lock);
INIT_LIST_HEAD(&dd->list);
- spin_lock(&list_lock);
+ spin_lock_bh(&list_lock);
list_add_tail(&dd->list, &dev_list);
- spin_unlock(&list_lock);
+ spin_unlock_bh(&list_lock);
/* Initialize crypto engine */
dd->engine = crypto_engine_alloc_init(dev, 1);
if (!dd)
return -ENODEV;
- spin_lock(&list_lock);
+ spin_lock_bh(&list_lock);
list_del(&dd->list);
- spin_unlock(&list_lock);
+ spin_unlock_bh(&list_lock);
for (i = dd->pdata->algs_info_size - 1; i >= 0; i--)
for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--) {
buf = sg_virt(sg);
pages = get_order(len);
- if (orig && (flags & OMAP_CRYPTO_COPY_MASK))
+ if (orig && (flags & OMAP_CRYPTO_DATA_COPIED))
omap_crypto_copy_data(sg, orig, offset, len);
if (flags & OMAP_CRYPTO_DATA_COPIED)
INIT_LIST_HEAD(&dd->list);
- spin_lock(&list_lock);
+ spin_lock_bh(&list_lock);
list_add_tail(&dd->list, &dev_list);
- spin_unlock(&list_lock);
+ spin_unlock_bh(&list_lock);
/* Initialize des crypto engine */
dd->engine = crypto_engine_alloc_init(dev, 1);
if (!dd)
return -ENODEV;
- spin_lock(&list_lock);
+ spin_lock_bh(&list_lock);
list_del(&dd->list);
- spin_unlock(&list_lock);
+ spin_unlock_bh(&list_lock);
for (i = dd->pdata->algs_info_size - 1; i >= 0; i--)
for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--)
#define FLAGS_FINAL 1
#define FLAGS_DMA_ACTIVE 2
#define FLAGS_OUTPUT_READY 3
-#define FLAGS_INIT 4
#define FLAGS_CPU 5
#define FLAGS_DMA_READY 6
#define FLAGS_AUTO_XOR 7
hash[i] = le32_to_cpup((__le32 *)in + i);
}
-static int omap_sham_hw_init(struct omap_sham_dev *dd)
-{
- int err;
-
- err = pm_runtime_resume_and_get(dd->dev);
- if (err < 0) {
- dev_err(dd->dev, "failed to get sync: %d\n", err);
- return err;
- }
-
- if (!test_bit(FLAGS_INIT, &dd->flags)) {
- set_bit(FLAGS_INIT, &dd->flags);
- dd->err = 0;
- }
-
- return 0;
-}
-
static void omap_sham_write_ctrl_omap2(struct omap_sham_dev *dd, size_t length,
int final, int dma)
{
dev_dbg(dd->dev, "hash-one: op: %u, total: %u, digcnt: %zd, final: %d",
ctx->op, ctx->total, ctx->digcnt, final);
- dd->req = req;
-
- err = omap_sham_hw_init(dd);
- if (err)
+ err = pm_runtime_resume_and_get(dd->dev);
+ if (err < 0) {
+ dev_err(dd->dev, "failed to get sync: %d\n", err);
return err;
+ }
+
+ dd->err = 0;
+ dd->req = req;
if (ctx->digcnt)
dd->pdata->copy_hash(req, 0);
if (test_and_clear_bit(FLAGS_OUTPUT_READY, &dd->flags))
goto finish;
} else if (test_bit(FLAGS_DMA_READY, &dd->flags)) {
- if (test_and_clear_bit(FLAGS_DMA_ACTIVE, &dd->flags)) {
+ if (test_bit(FLAGS_DMA_ACTIVE, &dd->flags)) {
omap_sham_update_dma_stop(dd);
if (dd->err) {
err = dd->err;
dd->fallback_sz = OMAP_SHA_DMA_THRESHOLD;
pm_runtime_enable(dev);
- pm_runtime_irq_safe(dev);
err = pm_runtime_get_sync(dev);
if (err < 0) {
(rev & dd->pdata->major_mask) >> dd->pdata->major_shift,
(rev & dd->pdata->minor_mask) >> dd->pdata->minor_shift);
- spin_lock(&sham.lock);
+ spin_lock_bh(&sham.lock);
list_add_tail(&dd->list, &sham.dev_list);
- spin_unlock(&sham.lock);
+ spin_unlock_bh(&sham.lock);
dd->engine = crypto_engine_alloc_init(dev, 1);
if (!dd->engine) {
err_engine_start:
crypto_engine_exit(dd->engine);
err_engine:
- spin_lock(&sham.lock);
+ spin_lock_bh(&sham.lock);
list_del(&dd->list);
- spin_unlock(&sham.lock);
+ spin_unlock_bh(&sham.lock);
err_pm:
+ pm_runtime_dont_use_autosuspend(dev);
pm_runtime_disable(dev);
if (!dd->polling_mode)
dma_release_channel(dd->dma_lch);
dd = platform_get_drvdata(pdev);
if (!dd)
return -ENODEV;
- spin_lock(&sham.lock);
+ spin_lock_bh(&sham.lock);
list_del(&dd->list);
- spin_unlock(&sham.lock);
+ spin_unlock_bh(&sham.lock);
for (i = dd->pdata->algs_info_size - 1; i >= 0; i--)
for (j = dd->pdata->algs_info[i].registered - 1; j >= 0; j--) {
crypto_unregister_ahash(
dd->pdata->algs_info[i].registered--;
}
tasklet_kill(&dd->done_task);
+ pm_runtime_dont_use_autosuspend(&pdev->dev);
pm_runtime_disable(&pdev->dev);
if (!dd->polling_mode)
return 0;
}
-#ifdef CONFIG_PM_SLEEP
-static int omap_sham_suspend(struct device *dev)
-{
- pm_runtime_put_sync(dev);
- return 0;
-}
-
-static int omap_sham_resume(struct device *dev)
-{
- int err = pm_runtime_resume_and_get(dev);
- if (err < 0) {
- dev_err(dev, "failed to get sync: %d\n", err);
- return err;
- }
- return 0;
-}
-#endif
-
-static SIMPLE_DEV_PM_OPS(omap_sham_pm_ops, omap_sham_suspend, omap_sham_resume);
-
static struct platform_driver omap_sham_driver = {
.probe = omap_sham_probe,
.remove = omap_sham_remove,
.driver = {
.name = "omap-sham",
- .pm = &omap_sham_pm_ops,
.of_match_table = omap_sham_of_match,
},
};
ADF_CSR_WR(addr, ADF_4XXX_SMIAPF_MASK_OFFSET, 0);
}
-static int adf_pf_enable_vf2pf_comms(struct adf_accel_dev *accel_dev)
+static int adf_enable_pf2vf_comms(struct adf_accel_dev *accel_dev)
{
return 0;
}
hw_data->fw_mmp_name = ADF_4XXX_MMP;
hw_data->init_admin_comms = adf_init_admin_comms;
hw_data->exit_admin_comms = adf_exit_admin_comms;
- hw_data->disable_iov = adf_disable_sriov;
hw_data->send_admin_init = adf_send_admin_init;
hw_data->init_arb = adf_init_arb;
hw_data->exit_arb = adf_exit_arb;
hw_data->get_arb_mapping = adf_get_arbiter_mapping;
hw_data->enable_ints = adf_enable_ints;
- hw_data->enable_vf2pf_comms = adf_pf_enable_vf2pf_comms;
hw_data->reset_device = adf_reset_flr;
- hw_data->min_iov_compat_ver = ADF_PFVF_COMPATIBILITY_VERSION;
hw_data->admin_ae_mask = ADF_4XXX_ADMIN_AE_MASK;
hw_data->uof_get_num_objs = uof_get_num_objs;
hw_data->uof_get_name = uof_get_name;
hw_data->uof_get_ae_mask = uof_get_ae_mask;
hw_data->set_msix_rttable = set_msix_default_rttable;
hw_data->set_ssm_wdtimer = adf_gen4_set_ssm_wdtimer;
+ hw_data->enable_pfvf_comms = adf_enable_pf2vf_comms;
+ hw_data->disable_iov = adf_disable_sriov;
+ hw_data->min_iov_compat_ver = ADF_PFVF_COMPAT_THIS_VERSION;
adf_gen4_init_hw_csr_ops(&hw_data->csr_ops);
}
}
/* Set DMA identifier */
- if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
- if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
- dev_err(&pdev->dev, "No usable DMA configuration.\n");
- ret = -EFAULT;
- goto out_err;
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
- }
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
+ if (ret) {
+ dev_err(&pdev->dev, "No usable DMA configuration.\n");
+ goto out_err;
}
/* Get accelerator capabilities mask */
return ADF_C3XXX_PF2VF_OFFSET(i);
}
-static u32 get_vintmsk_offset(u32 i)
-{
- return ADF_C3XXX_VINTMSK_OFFSET(i);
-}
-
static void adf_enable_error_correction(struct adf_accel_dev *accel_dev)
{
struct adf_hw_device_data *hw_device = accel_dev->hw_device;
ADF_C3XXX_SMIA1_MASK);
}
-static int adf_pf_enable_vf2pf_comms(struct adf_accel_dev *accel_dev)
+static int adf_enable_pf2vf_comms(struct adf_accel_dev *accel_dev)
{
+ spin_lock_init(&accel_dev->pf.vf2pf_ints_lock);
+
return 0;
}
hw_data->get_sram_bar_id = get_sram_bar_id;
hw_data->get_etr_bar_id = get_etr_bar_id;
hw_data->get_misc_bar_id = get_misc_bar_id;
- hw_data->get_pf2vf_offset = get_pf2vf_offset;
- hw_data->get_vintmsk_offset = get_vintmsk_offset;
hw_data->get_admin_info = adf_gen2_get_admin_info;
hw_data->get_arb_info = adf_gen2_get_arb_info;
hw_data->get_sku = get_sku;
hw_data->init_admin_comms = adf_init_admin_comms;
hw_data->exit_admin_comms = adf_exit_admin_comms;
hw_data->configure_iov_threads = configure_iov_threads;
- hw_data->disable_iov = adf_disable_sriov;
hw_data->send_admin_init = adf_send_admin_init;
hw_data->init_arb = adf_init_arb;
hw_data->exit_arb = adf_exit_arb;
hw_data->get_arb_mapping = adf_get_arbiter_mapping;
hw_data->enable_ints = adf_enable_ints;
- hw_data->enable_vf2pf_comms = adf_pf_enable_vf2pf_comms;
hw_data->reset_device = adf_reset_flr;
- hw_data->min_iov_compat_ver = ADF_PFVF_COMPATIBILITY_VERSION;
hw_data->set_ssm_wdtimer = adf_gen2_set_ssm_wdtimer;
+ hw_data->get_pf2vf_offset = get_pf2vf_offset;
+ hw_data->enable_pfvf_comms = adf_enable_pf2vf_comms;
+ hw_data->disable_iov = adf_disable_sriov;
+ hw_data->min_iov_compat_ver = ADF_PFVF_COMPAT_THIS_VERSION;
+
adf_gen2_init_hw_csr_ops(&hw_data->csr_ops);
}
#define ADF_C3XXX_ERRSSMSH_EN BIT(3)
#define ADF_C3XXX_PF2VF_OFFSET(i) (0x3A000 + 0x280 + ((i) * 0x04))
-#define ADF_C3XXX_VINTMSK_OFFSET(i) (0x3A000 + 0x200 + ((i) * 0x04))
/* AE to function mapping */
#define ADF_C3XXX_AE2FUNC_MAP_GRP_A_NUM_REGS 48
}
/* set dma identifier */
- if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
- if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
- dev_err(&pdev->dev, "No usable DMA configuration\n");
- ret = -EFAULT;
- goto out_err_disable;
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
- }
-
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
+ if (ret) {
+ dev_err(&pdev->dev, "No usable DMA configuration\n");
+ goto out_err_disable;
}
if (pci_request_regions(pdev, ADF_C3XXX_DEVICE_NAME)) {
if (pci_save_state(pdev)) {
dev_err(&pdev->dev, "Failed to save pci state\n");
ret = -ENOMEM;
- goto out_err_free_reg;
+ goto out_err_disable_aer;
}
ret = qat_crypto_dev_config(accel_dev);
if (ret)
- goto out_err_free_reg;
+ goto out_err_disable_aer;
ret = adf_dev_init(accel_dev);
if (ret)
adf_dev_stop(accel_dev);
out_err_dev_shutdown:
adf_dev_shutdown(accel_dev);
+out_err_disable_aer:
+ adf_disable_aer(accel_dev);
out_err_free_reg:
pci_release_regions(accel_pci_dev->pci_dev);
out_err_disable:
return ADF_C3XXXIOV_PF2VF_OFFSET;
}
-static u32 get_vintmsk_offset(u32 i)
-{
- return ADF_C3XXXIOV_VINTMSK_OFFSET;
-}
-
static int adf_vf_int_noop(struct adf_accel_dev *accel_dev)
{
return 0;
hw_data->enable_error_correction = adf_vf_void_noop;
hw_data->init_admin_comms = adf_vf_int_noop;
hw_data->exit_admin_comms = adf_vf_void_noop;
- hw_data->send_admin_init = adf_vf2pf_init;
+ hw_data->send_admin_init = adf_vf2pf_notify_init;
hw_data->init_arb = adf_vf_int_noop;
hw_data->exit_arb = adf_vf_void_noop;
- hw_data->disable_iov = adf_vf2pf_shutdown;
+ hw_data->disable_iov = adf_vf2pf_notify_shutdown;
hw_data->get_accel_mask = get_accel_mask;
hw_data->get_ae_mask = get_ae_mask;
hw_data->get_num_accels = get_num_accels;
hw_data->get_etr_bar_id = get_etr_bar_id;
hw_data->get_misc_bar_id = get_misc_bar_id;
hw_data->get_pf2vf_offset = get_pf2vf_offset;
- hw_data->get_vintmsk_offset = get_vintmsk_offset;
hw_data->get_sku = get_sku;
hw_data->enable_ints = adf_vf_void_noop;
- hw_data->enable_vf2pf_comms = adf_enable_vf2pf_comms;
- hw_data->min_iov_compat_ver = ADF_PFVF_COMPATIBILITY_VERSION;
+ hw_data->enable_pfvf_comms = adf_enable_vf2pf_comms;
+ hw_data->min_iov_compat_ver = ADF_PFVF_COMPAT_THIS_VERSION;
hw_data->dev_class->instances++;
adf_devmgr_update_class_index(hw_data);
adf_gen2_init_hw_csr_ops(&hw_data->csr_ops);
#define ADF_C3XXXIOV_ETR_BAR 0
#define ADF_C3XXXIOV_ETR_MAX_BANKS 1
#define ADF_C3XXXIOV_PF2VF_OFFSET 0x200
-#define ADF_C3XXXIOV_VINTMSK_OFFSET 0x208
void adf_init_hw_data_c3xxxiov(struct adf_hw_device_data *hw_data);
void adf_clean_hw_data_c3xxxiov(struct adf_hw_device_data *hw_data);
}
/* set dma identifier */
- if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
- if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
- dev_err(&pdev->dev, "No usable DMA configuration\n");
- ret = -EFAULT;
- goto out_err_disable;
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
- }
-
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
+ if (ret) {
+ dev_err(&pdev->dev, "No usable DMA configuration\n");
+ goto out_err_disable;
}
if (pci_request_regions(pdev, ADF_C3XXXVF_DEVICE_NAME)) {
pr_err("QAT: Driver removal failed\n");
return;
}
+ adf_flush_vf_wq(accel_dev);
adf_dev_stop(accel_dev);
adf_dev_shutdown(accel_dev);
adf_cleanup_accel(accel_dev);
return ADF_C62X_PF2VF_OFFSET(i);
}
-static u32 get_vintmsk_offset(u32 i)
-{
- return ADF_C62X_VINTMSK_OFFSET(i);
-}
-
static void adf_enable_error_correction(struct adf_accel_dev *accel_dev)
{
struct adf_hw_device_data *hw_device = accel_dev->hw_device;
ADF_C62X_SMIA1_MASK);
}
-static int adf_pf_enable_vf2pf_comms(struct adf_accel_dev *accel_dev)
+static int adf_enable_pf2vf_comms(struct adf_accel_dev *accel_dev)
{
+ spin_lock_init(&accel_dev->pf.vf2pf_ints_lock);
+
return 0;
}
hw_data->get_sram_bar_id = get_sram_bar_id;
hw_data->get_etr_bar_id = get_etr_bar_id;
hw_data->get_misc_bar_id = get_misc_bar_id;
- hw_data->get_pf2vf_offset = get_pf2vf_offset;
- hw_data->get_vintmsk_offset = get_vintmsk_offset;
hw_data->get_admin_info = adf_gen2_get_admin_info;
hw_data->get_arb_info = adf_gen2_get_arb_info;
hw_data->get_sku = get_sku;
hw_data->init_admin_comms = adf_init_admin_comms;
hw_data->exit_admin_comms = adf_exit_admin_comms;
hw_data->configure_iov_threads = configure_iov_threads;
- hw_data->disable_iov = adf_disable_sriov;
hw_data->send_admin_init = adf_send_admin_init;
hw_data->init_arb = adf_init_arb;
hw_data->exit_arb = adf_exit_arb;
hw_data->get_arb_mapping = adf_get_arbiter_mapping;
hw_data->enable_ints = adf_enable_ints;
- hw_data->enable_vf2pf_comms = adf_pf_enable_vf2pf_comms;
hw_data->reset_device = adf_reset_flr;
- hw_data->min_iov_compat_ver = ADF_PFVF_COMPATIBILITY_VERSION;
hw_data->set_ssm_wdtimer = adf_gen2_set_ssm_wdtimer;
+ hw_data->get_pf2vf_offset = get_pf2vf_offset;
+ hw_data->enable_pfvf_comms = adf_enable_pf2vf_comms;
+ hw_data->disable_iov = adf_disable_sriov;
+ hw_data->min_iov_compat_ver = ADF_PFVF_COMPAT_THIS_VERSION;
+
adf_gen2_init_hw_csr_ops(&hw_data->csr_ops);
}
#define ADF_C62X_ERRSSMSH_EN BIT(3)
#define ADF_C62X_PF2VF_OFFSET(i) (0x3A000 + 0x280 + ((i) * 0x04))
-#define ADF_C62X_VINTMSK_OFFSET(i) (0x3A000 + 0x200 + ((i) * 0x04))
/* AE to function mapping */
#define ADF_C62X_AE2FUNC_MAP_GRP_A_NUM_REGS 80
}
/* set dma identifier */
- if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
- if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
- dev_err(&pdev->dev, "No usable DMA configuration\n");
- ret = -EFAULT;
- goto out_err_disable;
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
- }
-
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
+ if (ret) {
+ dev_err(&pdev->dev, "No usable DMA configuration\n");
+ goto out_err_disable;
}
if (pci_request_regions(pdev, ADF_C62X_DEVICE_NAME)) {
if (pci_save_state(pdev)) {
dev_err(&pdev->dev, "Failed to save pci state\n");
ret = -ENOMEM;
- goto out_err_free_reg;
+ goto out_err_disable_aer;
}
ret = qat_crypto_dev_config(accel_dev);
if (ret)
- goto out_err_free_reg;
+ goto out_err_disable_aer;
ret = adf_dev_init(accel_dev);
if (ret)
adf_dev_stop(accel_dev);
out_err_dev_shutdown:
adf_dev_shutdown(accel_dev);
+out_err_disable_aer:
+ adf_disable_aer(accel_dev);
out_err_free_reg:
pci_release_regions(accel_pci_dev->pci_dev);
out_err_disable:
return ADF_C62XIOV_PF2VF_OFFSET;
}
-static u32 get_vintmsk_offset(u32 i)
-{
- return ADF_C62XIOV_VINTMSK_OFFSET;
-}
-
static int adf_vf_int_noop(struct adf_accel_dev *accel_dev)
{
return 0;
hw_data->enable_error_correction = adf_vf_void_noop;
hw_data->init_admin_comms = adf_vf_int_noop;
hw_data->exit_admin_comms = adf_vf_void_noop;
- hw_data->send_admin_init = adf_vf2pf_init;
+ hw_data->send_admin_init = adf_vf2pf_notify_init;
hw_data->init_arb = adf_vf_int_noop;
hw_data->exit_arb = adf_vf_void_noop;
- hw_data->disable_iov = adf_vf2pf_shutdown;
+ hw_data->disable_iov = adf_vf2pf_notify_shutdown;
hw_data->get_accel_mask = get_accel_mask;
hw_data->get_ae_mask = get_ae_mask;
hw_data->get_num_accels = get_num_accels;
hw_data->get_etr_bar_id = get_etr_bar_id;
hw_data->get_misc_bar_id = get_misc_bar_id;
hw_data->get_pf2vf_offset = get_pf2vf_offset;
- hw_data->get_vintmsk_offset = get_vintmsk_offset;
hw_data->get_sku = get_sku;
hw_data->enable_ints = adf_vf_void_noop;
- hw_data->enable_vf2pf_comms = adf_enable_vf2pf_comms;
- hw_data->min_iov_compat_ver = ADF_PFVF_COMPATIBILITY_VERSION;
+ hw_data->enable_pfvf_comms = adf_enable_vf2pf_comms;
+ hw_data->min_iov_compat_ver = ADF_PFVF_COMPAT_THIS_VERSION;
hw_data->dev_class->instances++;
adf_devmgr_update_class_index(hw_data);
adf_gen2_init_hw_csr_ops(&hw_data->csr_ops);
#define ADF_C62XIOV_ETR_BAR 0
#define ADF_C62XIOV_ETR_MAX_BANKS 1
#define ADF_C62XIOV_PF2VF_OFFSET 0x200
-#define ADF_C62XIOV_VINTMSK_OFFSET 0x208
void adf_init_hw_data_c62xiov(struct adf_hw_device_data *hw_data);
void adf_clean_hw_data_c62xiov(struct adf_hw_device_data *hw_data);
}
/* set dma identifier */
- if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
- if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
- dev_err(&pdev->dev, "No usable DMA configuration\n");
- ret = -EFAULT;
- goto out_err_disable;
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
- }
-
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
+ if (ret) {
+ dev_err(&pdev->dev, "No usable DMA configuration\n");
+ goto out_err_disable;
}
if (pci_request_regions(pdev, ADF_C62XVF_DEVICE_NAME)) {
pr_err("QAT: Driver removal failed\n");
return;
}
+ adf_flush_vf_wq(accel_dev);
adf_dev_stop(accel_dev);
adf_dev_shutdown(accel_dev);
adf_cleanup_accel(accel_dev);
#define ADF_4XXX_DEVICE_NAME "4xxx"
#define ADF_4XXX_PCI_DEVICE_ID 0x4940
#define ADF_4XXXIOV_PCI_DEVICE_ID 0x4941
-#define ADF_ERRSOU3 (0x3A000 + 0x0C)
-#define ADF_ERRSOU5 (0x3A000 + 0xD8)
#define ADF_DEVICE_FUSECTL_OFFSET 0x40
#define ADF_DEVICE_LEGFUSE_OFFSET 0x4C
#define ADF_DEVICE_FUSECTL_MASK 0x80000000
u32 (*get_num_aes)(struct adf_hw_device_data *self);
u32 (*get_num_accels)(struct adf_hw_device_data *self);
u32 (*get_pf2vf_offset)(u32 i);
- u32 (*get_vintmsk_offset)(u32 i);
void (*get_arb_info)(struct arb_info *arb_csrs_info);
void (*get_admin_info)(struct admin_info *admin_csrs_info);
enum dev_sku_info (*get_sku)(struct adf_hw_device_data *self);
bool enable);
void (*enable_ints)(struct adf_accel_dev *accel_dev);
void (*set_ssm_wdtimer)(struct adf_accel_dev *accel_dev);
- int (*enable_vf2pf_comms)(struct adf_accel_dev *accel_dev);
+ int (*enable_pfvf_comms)(struct adf_accel_dev *accel_dev);
void (*reset_device)(struct adf_accel_dev *accel_dev);
void (*set_msix_rttable)(struct adf_accel_dev *accel_dev);
char *(*uof_get_name)(u32 obj_num);
struct adf_accel_vf_info {
struct adf_accel_dev *accel_dev;
- struct tasklet_struct vf2pf_bh_tasklet;
struct mutex pf2vf_lock; /* protect CSR access for PF2VF messages */
struct ratelimit_state vf2pf_ratelimit;
u32 vf_nr;
struct adf_accel_pci accel_pci_dev;
union {
struct {
+ /* protects VF2PF interrupts access */
+ spinlock_t vf2pf_ints_lock;
/* vf_info is non-zero when SR-IOV is init'ed */
struct adf_accel_vf_info *vf_info;
} pf;
EXPORT_SYMBOL_GPL(adf_enable_aer);
/**
- * adf_disable_aer() - Enable Advance Error Reporting for acceleration device
+ * adf_disable_aer() - Disable Advance Error Reporting for acceleration device
* @accel_dev: Pointer to acceleration device.
*
* Function disables PCI Advance Error Reporting for the
void adf_disable_sriov(struct adf_accel_dev *accel_dev);
void adf_disable_vf2pf_interrupts(struct adf_accel_dev *accel_dev,
u32 vf_mask);
+void adf_disable_vf2pf_interrupts_irq(struct adf_accel_dev *accel_dev,
+ u32 vf_mask);
void adf_enable_vf2pf_interrupts(struct adf_accel_dev *accel_dev,
u32 vf_mask);
void adf_enable_pf2vf_interrupts(struct adf_accel_dev *accel_dev);
void adf_disable_pf2vf_interrupts(struct adf_accel_dev *accel_dev);
+void adf_schedule_vf2pf_handler(struct adf_accel_vf_info *vf_info);
-int adf_vf2pf_init(struct adf_accel_dev *accel_dev);
-void adf_vf2pf_shutdown(struct adf_accel_dev *accel_dev);
+int adf_vf2pf_notify_init(struct adf_accel_dev *accel_dev);
+void adf_vf2pf_notify_shutdown(struct adf_accel_dev *accel_dev);
int adf_init_pf_wq(void);
void adf_exit_pf_wq(void);
int adf_init_vf_wq(void);
void adf_exit_vf_wq(void);
+void adf_flush_vf_wq(struct adf_accel_dev *accel_dev);
#else
-static inline int adf_sriov_configure(struct pci_dev *pdev, int numvfs)
-{
- return 0;
-}
+#define adf_sriov_configure NULL
static inline void adf_disable_sriov(struct adf_accel_dev *accel_dev)
{
{
}
-static inline int adf_vf2pf_init(struct adf_accel_dev *accel_dev)
+static inline int adf_vf2pf_notify_init(struct adf_accel_dev *accel_dev)
{
return 0;
}
-static inline void adf_vf2pf_shutdown(struct adf_accel_dev *accel_dev)
+static inline void adf_vf2pf_notify_shutdown(struct adf_accel_dev *accel_dev)
{
}
{
}
+static inline void adf_flush_vf_wq(struct adf_accel_dev *accel_dev)
+{
+}
+
#endif
#endif
struct service_hndl *service;
struct list_head *list_itr;
struct adf_hw_device_data *hw_data = accel_dev->hw_device;
+ int ret;
if (!hw_data) {
dev_err(&GET_DEV(accel_dev),
return -EFAULT;
}
- hw_data->enable_ints(accel_dev);
-
if (adf_ae_init(accel_dev)) {
dev_err(&GET_DEV(accel_dev),
"Failed to initialise Acceleration Engine\n");
}
set_bit(ADF_STATUS_IRQ_ALLOCATED, &accel_dev->status);
+ hw_data->enable_ints(accel_dev);
+ hw_data->enable_error_correction(accel_dev);
+
+ ret = hw_data->enable_pfvf_comms(accel_dev);
+ if (ret)
+ return ret;
+
/*
* Subservice initialisation is divided into two stages: init and start.
* This is to facilitate any ordering dependencies between services
set_bit(accel_dev->accel_id, service->init_status);
}
- hw_data->enable_error_correction(accel_dev);
- hw_data->enable_vf2pf_comms(accel_dev);
-
return 0;
}
EXPORT_SYMBOL_GPL(adf_dev_init);
#include "adf_transport_access_macros.h"
#include "adf_transport_internal.h"
+#define ADF_MAX_NUM_VFS 32
+#define ADF_ERRSOU3 (0x3A000 + 0x0C)
+#define ADF_ERRSOU5 (0x3A000 + 0xD8)
+#define ADF_ERRMSK3 (0x3A000 + 0x1C)
+#define ADF_ERRMSK5 (0x3A000 + 0xDC)
+#define ADF_ERR_REG_VF2PF_L(vf_src) (((vf_src) & 0x01FFFE00) >> 9)
+#define ADF_ERR_REG_VF2PF_U(vf_src) (((vf_src) & 0x0000FFFF) << 16)
+
static int adf_enable_msix(struct adf_accel_dev *accel_dev)
{
struct adf_accel_pci *pci_dev_info = &accel_dev->accel_pci_dev;
struct adf_hw_device_data *hw_data = accel_dev->hw_device;
struct adf_bar *pmisc =
&GET_BARS(accel_dev)[hw_data->get_misc_bar_id(hw_data)];
- void __iomem *pmisc_bar_addr = pmisc->virt_addr;
- u32 vf_mask;
+ void __iomem *pmisc_addr = pmisc->virt_addr;
+ u32 errsou3, errsou5, errmsk3, errmsk5;
+ unsigned long vf_mask;
/* Get the interrupt sources triggered by VFs */
- vf_mask = ((ADF_CSR_RD(pmisc_bar_addr, ADF_ERRSOU5) &
- 0x0000FFFF) << 16) |
- ((ADF_CSR_RD(pmisc_bar_addr, ADF_ERRSOU3) &
- 0x01FFFE00) >> 9);
+ errsou3 = ADF_CSR_RD(pmisc_addr, ADF_ERRSOU3);
+ errsou5 = ADF_CSR_RD(pmisc_addr, ADF_ERRSOU5);
+ vf_mask = ADF_ERR_REG_VF2PF_L(errsou3);
+ vf_mask |= ADF_ERR_REG_VF2PF_U(errsou5);
+
+ /* To avoid adding duplicate entries to work queue, clear
+ * vf_int_mask_sets bits that are already masked in ERRMSK register.
+ */
+ errmsk3 = ADF_CSR_RD(pmisc_addr, ADF_ERRMSK3);
+ errmsk5 = ADF_CSR_RD(pmisc_addr, ADF_ERRMSK5);
+ vf_mask &= ~ADF_ERR_REG_VF2PF_L(errmsk3);
+ vf_mask &= ~ADF_ERR_REG_VF2PF_U(errmsk5);
if (vf_mask) {
struct adf_accel_vf_info *vf_info;
int i;
/* Disable VF2PF interrupts for VFs with pending ints */
- adf_disable_vf2pf_interrupts(accel_dev, vf_mask);
+ adf_disable_vf2pf_interrupts_irq(accel_dev, vf_mask);
/*
- * Schedule tasklets to handle VF2PF interrupt BHs
- * unless the VF is malicious and is attempting to
- * flood the host OS with VF2PF interrupts.
+ * Handle VF2PF interrupt unless the VF is malicious and
+ * is attempting to flood the host OS with VF2PF interrupts.
*/
- for_each_set_bit(i, (const unsigned long *)&vf_mask,
- (sizeof(vf_mask) * BITS_PER_BYTE)) {
+ for_each_set_bit(i, &vf_mask, ADF_MAX_NUM_VFS) {
vf_info = accel_dev->pf.vf_info + i;
if (!__ratelimit(&vf_info->vf2pf_ratelimit)) {
continue;
}
- /* Tasklet will re-enable ints from this VF */
- tasklet_hi_schedule(&vf_info->vf2pf_bh_tasklet);
+ adf_schedule_vf2pf_handler(vf_info);
irq_handled = true;
}
#define ADF_DH895XCC_ERRMSK5 (ADF_DH895XCC_EP_OFFSET + 0xDC)
#define ADF_DH895XCC_ERRMSK5_VF2PF_U_MASK(vf_mask) (vf_mask >> 16)
-void adf_enable_pf2vf_interrupts(struct adf_accel_dev *accel_dev)
-{
- struct adf_accel_pci *pci_info = &accel_dev->accel_pci_dev;
- struct adf_hw_device_data *hw_data = accel_dev->hw_device;
- void __iomem *pmisc_bar_addr =
- pci_info->pci_bars[hw_data->get_misc_bar_id(hw_data)].virt_addr;
-
- ADF_CSR_WR(pmisc_bar_addr, hw_data->get_vintmsk_offset(0), 0x0);
-}
-
-void adf_disable_pf2vf_interrupts(struct adf_accel_dev *accel_dev)
-{
- struct adf_accel_pci *pci_info = &accel_dev->accel_pci_dev;
- struct adf_hw_device_data *hw_data = accel_dev->hw_device;
- void __iomem *pmisc_bar_addr =
- pci_info->pci_bars[hw_data->get_misc_bar_id(hw_data)].virt_addr;
-
- ADF_CSR_WR(pmisc_bar_addr, hw_data->get_vintmsk_offset(0), 0x2);
-}
-
-void adf_enable_vf2pf_interrupts(struct adf_accel_dev *accel_dev,
- u32 vf_mask)
+static void __adf_enable_vf2pf_interrupts(struct adf_accel_dev *accel_dev,
+ u32 vf_mask)
{
struct adf_hw_device_data *hw_data = accel_dev->hw_device;
struct adf_bar *pmisc =
}
}
-void adf_disable_vf2pf_interrupts(struct adf_accel_dev *accel_dev, u32 vf_mask)
+void adf_enable_vf2pf_interrupts(struct adf_accel_dev *accel_dev, u32 vf_mask)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&accel_dev->pf.vf2pf_ints_lock, flags);
+ __adf_enable_vf2pf_interrupts(accel_dev, vf_mask);
+ spin_unlock_irqrestore(&accel_dev->pf.vf2pf_ints_lock, flags);
+}
+
+static void __adf_disable_vf2pf_interrupts(struct adf_accel_dev *accel_dev,
+ u32 vf_mask)
{
struct adf_hw_device_data *hw_data = accel_dev->hw_device;
struct adf_bar *pmisc =
}
}
+void adf_disable_vf2pf_interrupts(struct adf_accel_dev *accel_dev, u32 vf_mask)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&accel_dev->pf.vf2pf_ints_lock, flags);
+ __adf_disable_vf2pf_interrupts(accel_dev, vf_mask);
+ spin_unlock_irqrestore(&accel_dev->pf.vf2pf_ints_lock, flags);
+}
+
+void adf_disable_vf2pf_interrupts_irq(struct adf_accel_dev *accel_dev, u32 vf_mask)
+{
+ spin_lock(&accel_dev->pf.vf2pf_ints_lock);
+ __adf_disable_vf2pf_interrupts(accel_dev, vf_mask);
+ spin_unlock(&accel_dev->pf.vf2pf_ints_lock);
+}
+
static int __adf_iov_putmsg(struct adf_accel_dev *accel_dev, u32 msg, u8 vf_nr)
{
struct adf_accel_pci *pci_info = &accel_dev->accel_pci_dev;
return ret;
}
-EXPORT_SYMBOL_GPL(adf_iov_putmsg);
void adf_vf2pf_req_hndl(struct adf_accel_vf_info *vf_info)
{
resp = (ADF_PF2VF_MSGORIGIN_SYSTEM |
(ADF_PF2VF_MSGTYPE_VERSION_RESP <<
ADF_PF2VF_MSGTYPE_SHIFT) |
- (ADF_PFVF_COMPATIBILITY_VERSION <<
+ (ADF_PFVF_COMPAT_THIS_VERSION <<
ADF_PF2VF_VERSION_RESP_VERS_SHIFT));
dev_dbg(&GET_DEV(accel_dev),
if (vf_compat_ver < hw_data->min_iov_compat_ver) {
dev_err(&GET_DEV(accel_dev),
"VF (vers %d) incompatible with PF (vers %d)\n",
- vf_compat_ver, ADF_PFVF_COMPATIBILITY_VERSION);
+ vf_compat_ver, ADF_PFVF_COMPAT_THIS_VERSION);
resp |= ADF_PF2VF_VF_INCOMPATIBLE <<
ADF_PF2VF_VERSION_RESP_RESULT_SHIFT;
- } else if (vf_compat_ver > ADF_PFVF_COMPATIBILITY_VERSION) {
+ } else if (vf_compat_ver > ADF_PFVF_COMPAT_THIS_VERSION) {
dev_err(&GET_DEV(accel_dev),
"VF (vers %d) compat with PF (vers %d) unkn.\n",
- vf_compat_ver, ADF_PFVF_COMPATIBILITY_VERSION);
+ vf_compat_ver, ADF_PFVF_COMPAT_THIS_VERSION);
resp |= ADF_PF2VF_VF_COMPAT_UNKNOWN <<
ADF_PF2VF_VERSION_RESP_RESULT_SHIFT;
} else {
dev_dbg(&GET_DEV(accel_dev),
"VF (vers %d) compatible with PF (vers %d)\n",
- vf_compat_ver, ADF_PFVF_COMPATIBILITY_VERSION);
+ vf_compat_ver, ADF_PFVF_COMPAT_THIS_VERSION);
resp |= ADF_PF2VF_VF_COMPATIBLE <<
ADF_PF2VF_VERSION_RESP_RESULT_SHIFT;
}
resp = (ADF_PF2VF_MSGORIGIN_SYSTEM |
(ADF_PF2VF_MSGTYPE_VERSION_RESP <<
ADF_PF2VF_MSGTYPE_SHIFT) |
- (ADF_PFVF_COMPATIBILITY_VERSION <<
+ (ADF_PFVF_COMPAT_THIS_VERSION <<
ADF_PF2VF_VERSION_RESP_VERS_SHIFT));
resp |= ADF_PF2VF_VF_COMPATIBLE <<
ADF_PF2VF_VERSION_RESP_RESULT_SHIFT;
/* re-enable interrupt on PF from this VF */
adf_enable_vf2pf_interrupts(accel_dev, (1 << vf_nr));
+
return;
err:
dev_dbg(&GET_DEV(accel_dev), "Unknown message from VF%d (0x%x);\n",
msg = ADF_VF2PF_MSGORIGIN_SYSTEM;
msg |= ADF_VF2PF_MSGTYPE_COMPAT_VER_REQ << ADF_VF2PF_MSGTYPE_SHIFT;
- msg |= ADF_PFVF_COMPATIBILITY_VERSION << ADF_VF2PF_COMPAT_VER_REQ_SHIFT;
- BUILD_BUG_ON(ADF_PFVF_COMPATIBILITY_VERSION > 255);
+ msg |= ADF_PFVF_COMPAT_THIS_VERSION << ADF_VF2PF_COMPAT_VER_REQ_SHIFT;
+ BUILD_BUG_ON(ADF_PFVF_COMPAT_THIS_VERSION > 255);
+
+ reinit_completion(&accel_dev->vf.iov_msg_completion);
/* Send request from VF to PF */
ret = adf_iov_putmsg(accel_dev, msg, 0);
break;
case ADF_PF2VF_VF_COMPAT_UNKNOWN:
/* VF is newer than PF and decides whether it is compatible */
- if (accel_dev->vf.pf_version >= hw_data->min_iov_compat_ver)
+ if (accel_dev->vf.pf_version >= hw_data->min_iov_compat_ver) {
+ accel_dev->vf.compatible = ADF_PF2VF_VF_COMPATIBLE;
break;
+ }
fallthrough;
case ADF_PF2VF_VF_INCOMPATIBLE:
dev_err(&GET_DEV(accel_dev),
"PF (vers %d) and VF (vers %d) are not compatible\n",
accel_dev->vf.pf_version,
- ADF_PFVF_COMPATIBILITY_VERSION);
+ ADF_PFVF_COMPAT_THIS_VERSION);
return -EINVAL;
default:
dev_err(&GET_DEV(accel_dev),
* IN_USE_BY pattern as part of a collision control scheme (see adf_iov_putmsg).
*/
-#define ADF_PFVF_COMPATIBILITY_VERSION 0x1 /* PF<->VF compat */
+#define ADF_PFVF_COMPAT_THIS_VERSION 0x1 /* PF<->VF compat */
/* PF->VF messages */
#define ADF_PF2VF_INT BIT(0)
kfree(pf2vf_resp);
}
-static void adf_vf2pf_bh_handler(void *data)
+void adf_schedule_vf2pf_handler(struct adf_accel_vf_info *vf_info)
{
- struct adf_accel_vf_info *vf_info = (struct adf_accel_vf_info *)data;
struct adf_pf2vf_resp *pf2vf_resp;
pf2vf_resp = kzalloc(sizeof(*pf2vf_resp), GFP_ATOMIC);
vf_info->accel_dev = accel_dev;
vf_info->vf_nr = i;
- tasklet_init(&vf_info->vf2pf_bh_tasklet,
- (void *)adf_vf2pf_bh_handler,
- (unsigned long)vf_info);
mutex_init(&vf_info->pf2vf_lock);
ratelimit_state_init(&vf_info->vf2pf_ratelimit,
DEFAULT_RATELIMIT_INTERVAL,
hw_data->configure_iov_threads(accel_dev, false);
for (i = 0, vf = accel_dev->pf.vf_info; i < totalvfs; i++, vf++) {
- tasklet_disable(&vf->vf2pf_bh_tasklet);
- tasklet_kill(&vf->vf2pf_bh_tasklet);
mutex_destroy(&vf->pf2vf_lock);
}
#include "adf_pf2vf_msg.h"
/**
- * adf_vf2pf_init() - send init msg to PF
+ * adf_vf2pf_notify_init() - send init msg to PF
* @accel_dev: Pointer to acceleration VF device.
*
* Function sends an init message from the VF to a PF
*
* Return: 0 on success, error code otherwise.
*/
-int adf_vf2pf_init(struct adf_accel_dev *accel_dev)
+int adf_vf2pf_notify_init(struct adf_accel_dev *accel_dev)
{
u32 msg = (ADF_VF2PF_MSGORIGIN_SYSTEM |
(ADF_VF2PF_MSGTYPE_INIT << ADF_VF2PF_MSGTYPE_SHIFT));
set_bit(ADF_STATUS_PF_RUNNING, &accel_dev->status);
return 0;
}
-EXPORT_SYMBOL_GPL(adf_vf2pf_init);
+EXPORT_SYMBOL_GPL(adf_vf2pf_notify_init);
/**
- * adf_vf2pf_shutdown() - send shutdown msg to PF
+ * adf_vf2pf_notify_shutdown() - send shutdown msg to PF
* @accel_dev: Pointer to acceleration VF device.
*
* Function sends a shutdown message from the VF to a PF
*
* Return: void
*/
-void adf_vf2pf_shutdown(struct adf_accel_dev *accel_dev)
+void adf_vf2pf_notify_shutdown(struct adf_accel_dev *accel_dev)
{
u32 msg = (ADF_VF2PF_MSGORIGIN_SYSTEM |
(ADF_VF2PF_MSGTYPE_SHUTDOWN << ADF_VF2PF_MSGTYPE_SHIFT));
dev_err(&GET_DEV(accel_dev),
"Failed to send Shutdown event to PF\n");
}
-EXPORT_SYMBOL_GPL(adf_vf2pf_shutdown);
+EXPORT_SYMBOL_GPL(adf_vf2pf_notify_shutdown);
#include "adf_pf2vf_msg.h"
#define ADF_VINTSOU_OFFSET 0x204
+#define ADF_VINTMSK_OFFSET 0x208
#define ADF_VINTSOU_BUN BIT(0)
#define ADF_VINTSOU_PF2VF BIT(1)
struct work_struct work;
};
+void adf_enable_pf2vf_interrupts(struct adf_accel_dev *accel_dev)
+{
+ struct adf_accel_pci *pci_info = &accel_dev->accel_pci_dev;
+ struct adf_hw_device_data *hw_data = accel_dev->hw_device;
+ void __iomem *pmisc_bar_addr =
+ pci_info->pci_bars[hw_data->get_misc_bar_id(hw_data)].virt_addr;
+
+ ADF_CSR_WR(pmisc_bar_addr, ADF_VINTMSK_OFFSET, 0x0);
+}
+
+void adf_disable_pf2vf_interrupts(struct adf_accel_dev *accel_dev)
+{
+ struct adf_accel_pci *pci_info = &accel_dev->accel_pci_dev;
+ struct adf_hw_device_data *hw_data = accel_dev->hw_device;
+ void __iomem *pmisc_bar_addr =
+ pci_info->pci_bars[hw_data->get_misc_bar_id(hw_data)].virt_addr;
+
+ ADF_CSR_WR(pmisc_bar_addr, ADF_VINTMSK_OFFSET, 0x2);
+}
+EXPORT_SYMBOL_GPL(adf_disable_pf2vf_interrupts);
+
static int adf_enable_msi(struct adf_accel_dev *accel_dev)
{
struct adf_accel_pci *pci_dev_info = &accel_dev->accel_pci_dev;
struct adf_bar *pmisc =
&GET_BARS(accel_dev)[hw_data->get_misc_bar_id(hw_data)];
void __iomem *pmisc_bar_addr = pmisc->virt_addr;
- u32 v_int;
+ bool handled = false;
+ u32 v_int, v_mask;
/* Read VF INT source CSR to determine the source of VF interrupt */
v_int = ADF_CSR_RD(pmisc_bar_addr, ADF_VINTSOU_OFFSET);
+ /* Read VF INT mask CSR to determine which sources are masked */
+ v_mask = ADF_CSR_RD(pmisc_bar_addr, ADF_VINTMSK_OFFSET);
+
+ /*
+ * Recompute v_int ignoring sources that are masked. This is to
+ * avoid rescheduling the tasklet for interrupts already handled
+ */
+ v_int &= ~v_mask;
+
/* Check for PF2VF interrupt */
if (v_int & ADF_VINTSOU_PF2VF) {
/* Disable PF to VF interrupt */
/* Schedule tasklet to handle interrupt BH */
tasklet_hi_schedule(&accel_dev->vf.pf2vf_bh_tasklet);
- return IRQ_HANDLED;
+ handled = true;
}
/* Check bundle interrupt */
csr_ops->write_csr_int_flag_and_col(bank->csr_addr,
bank->bank_number, 0);
tasklet_hi_schedule(&bank->resp_handler);
- return IRQ_HANDLED;
+ handled = true;
}
- return IRQ_NONE;
+ return handled ? IRQ_HANDLED : IRQ_NONE;
}
static int adf_request_msi_irq(struct adf_accel_dev *accel_dev)
}
EXPORT_SYMBOL_GPL(adf_vf_isr_resource_alloc);
+/**
+ * adf_flush_vf_wq() - Flush workqueue for VF
+ * @accel_dev: Pointer to acceleration device.
+ *
+ * Function disables the PF/VF interrupts on the VF so that no new messages
+ * are received and flushes the workqueue 'adf_vf_stop_wq'.
+ *
+ * Return: void.
+ */
+void adf_flush_vf_wq(struct adf_accel_dev *accel_dev)
+{
+ adf_disable_pf2vf_interrupts(accel_dev);
+
+ flush_workqueue(adf_vf_stop_wq);
+}
+EXPORT_SYMBOL_GPL(adf_flush_vf_wq);
+
+/**
+ * adf_init_vf_wq() - Init workqueue for VF
+ *
+ * Function init workqueue 'adf_vf_stop_wq' for VF.
+ *
+ * Return: 0 on success, error code otherwise.
+ */
int __init adf_init_vf_wq(void)
{
adf_vf_stop_wq = alloc_workqueue("adf_vf_stop_wq", WQ_MEM_RECLAIM, 0);
return ADF_DH895XCC_PF2VF_OFFSET(i);
}
-static u32 get_vintmsk_offset(u32 i)
-{
- return ADF_DH895XCC_VINTMSK_OFFSET(i);
-}
-
static void adf_enable_error_correction(struct adf_accel_dev *accel_dev)
{
struct adf_hw_device_data *hw_device = accel_dev->hw_device;
ADF_DH895XCC_SMIA1_MASK);
}
-static int adf_pf_enable_vf2pf_comms(struct adf_accel_dev *accel_dev)
+static int adf_enable_pf2vf_comms(struct adf_accel_dev *accel_dev)
{
+ spin_lock_init(&accel_dev->pf.vf2pf_ints_lock);
+
return 0;
}
hw_data->get_num_aes = get_num_aes;
hw_data->get_etr_bar_id = get_etr_bar_id;
hw_data->get_misc_bar_id = get_misc_bar_id;
- hw_data->get_pf2vf_offset = get_pf2vf_offset;
- hw_data->get_vintmsk_offset = get_vintmsk_offset;
hw_data->get_admin_info = adf_gen2_get_admin_info;
hw_data->get_arb_info = adf_gen2_get_arb_info;
hw_data->get_sram_bar_id = get_sram_bar_id;
hw_data->init_admin_comms = adf_init_admin_comms;
hw_data->exit_admin_comms = adf_exit_admin_comms;
hw_data->configure_iov_threads = configure_iov_threads;
- hw_data->disable_iov = adf_disable_sriov;
hw_data->send_admin_init = adf_send_admin_init;
hw_data->init_arb = adf_init_arb;
hw_data->exit_arb = adf_exit_arb;
hw_data->get_arb_mapping = adf_get_arbiter_mapping;
hw_data->enable_ints = adf_enable_ints;
- hw_data->enable_vf2pf_comms = adf_pf_enable_vf2pf_comms;
hw_data->reset_device = adf_reset_sbr;
- hw_data->min_iov_compat_ver = ADF_PFVF_COMPATIBILITY_VERSION;
+ hw_data->get_pf2vf_offset = get_pf2vf_offset;
+ hw_data->enable_pfvf_comms = adf_enable_pf2vf_comms;
+ hw_data->disable_iov = adf_disable_sriov;
+ hw_data->min_iov_compat_ver = ADF_PFVF_COMPAT_THIS_VERSION;
+
adf_gen2_init_hw_csr_ops(&hw_data->csr_ops);
}
#define ADF_DH895XCC_ERRSSMSH_EN BIT(3)
#define ADF_DH895XCC_PF2VF_OFFSET(i) (0x3A000 + 0x280 + ((i) * 0x04))
-#define ADF_DH895XCC_VINTMSK_OFFSET(i) (0x3A000 + 0x200 + ((i) * 0x04))
/* AE to function mapping */
#define ADF_DH895XCC_AE2FUNC_MAP_GRP_A_NUM_REGS 96
}
/* set dma identifier */
- if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
- if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
- dev_err(&pdev->dev, "No usable DMA configuration\n");
- ret = -EFAULT;
- goto out_err_disable;
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
- }
-
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
+ if (ret) {
+ dev_err(&pdev->dev, "No usable DMA configuration\n");
+ goto out_err_disable;
}
if (pci_request_regions(pdev, ADF_DH895XCC_DEVICE_NAME)) {
if (pci_save_state(pdev)) {
dev_err(&pdev->dev, "Failed to save pci state\n");
ret = -ENOMEM;
- goto out_err_free_reg;
+ goto out_err_disable_aer;
}
ret = qat_crypto_dev_config(accel_dev);
if (ret)
- goto out_err_free_reg;
+ goto out_err_disable_aer;
ret = adf_dev_init(accel_dev);
if (ret)
adf_dev_stop(accel_dev);
out_err_dev_shutdown:
adf_dev_shutdown(accel_dev);
+out_err_disable_aer:
+ adf_disable_aer(accel_dev);
out_err_free_reg:
pci_release_regions(accel_pci_dev->pci_dev);
out_err_disable:
return ADF_DH895XCCIOV_PF2VF_OFFSET;
}
-static u32 get_vintmsk_offset(u32 i)
-{
- return ADF_DH895XCCIOV_VINTMSK_OFFSET;
-}
-
static int adf_vf_int_noop(struct adf_accel_dev *accel_dev)
{
return 0;
hw_data->enable_error_correction = adf_vf_void_noop;
hw_data->init_admin_comms = adf_vf_int_noop;
hw_data->exit_admin_comms = adf_vf_void_noop;
- hw_data->send_admin_init = adf_vf2pf_init;
+ hw_data->send_admin_init = adf_vf2pf_notify_init;
hw_data->init_arb = adf_vf_int_noop;
hw_data->exit_arb = adf_vf_void_noop;
- hw_data->disable_iov = adf_vf2pf_shutdown;
+ hw_data->disable_iov = adf_vf2pf_notify_shutdown;
hw_data->get_accel_mask = get_accel_mask;
hw_data->get_ae_mask = get_ae_mask;
hw_data->get_num_accels = get_num_accels;
hw_data->get_etr_bar_id = get_etr_bar_id;
hw_data->get_misc_bar_id = get_misc_bar_id;
hw_data->get_pf2vf_offset = get_pf2vf_offset;
- hw_data->get_vintmsk_offset = get_vintmsk_offset;
hw_data->get_sku = get_sku;
hw_data->enable_ints = adf_vf_void_noop;
- hw_data->enable_vf2pf_comms = adf_enable_vf2pf_comms;
- hw_data->min_iov_compat_ver = ADF_PFVF_COMPATIBILITY_VERSION;
+ hw_data->enable_pfvf_comms = adf_enable_vf2pf_comms;
+ hw_data->min_iov_compat_ver = ADF_PFVF_COMPAT_THIS_VERSION;
hw_data->dev_class->instances++;
adf_devmgr_update_class_index(hw_data);
adf_gen2_init_hw_csr_ops(&hw_data->csr_ops);
#define ADF_DH895XCCIOV_ETR_BAR 0
#define ADF_DH895XCCIOV_ETR_MAX_BANKS 1
#define ADF_DH895XCCIOV_PF2VF_OFFSET 0x200
-#define ADF_DH895XCCIOV_VINTMSK_OFFSET 0x208
void adf_init_hw_data_dh895xcciov(struct adf_hw_device_data *hw_data);
void adf_clean_hw_data_dh895xcciov(struct adf_hw_device_data *hw_data);
}
/* set dma identifier */
- if (pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
- if ((pci_set_dma_mask(pdev, DMA_BIT_MASK(32)))) {
- dev_err(&pdev->dev, "No usable DMA configuration\n");
- ret = -EFAULT;
- goto out_err_disable;
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
- }
-
- } else {
- pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(48));
+ if (ret) {
+ dev_err(&pdev->dev, "No usable DMA configuration\n");
+ goto out_err_disable;
}
if (pci_request_regions(pdev, ADF_DH895XCCVF_DEVICE_NAME)) {
pr_err("QAT: Driver removal failed\n");
return;
}
+ adf_flush_vf_wq(accel_dev);
adf_dev_stop(accel_dev);
adf_dev_shutdown(accel_dev);
adf_cleanup_accel(accel_dev);
if (ret)
goto err_free;
- get_online_cpus();
+ cpus_read_lock();
virtcrypto_set_affinity(vi);
- put_online_cpus();
+ cpus_read_unlock();
return 0;
* trigger testing are different for each memory.
*/
+#ifdef CONFIG_EDAC_ALTERA_OCRAM
static const struct edac_device_prv_data ocramecc_data;
+#endif
+#ifdef CONFIG_EDAC_ALTERA_L2C
static const struct edac_device_prv_data l2ecc_data;
+#endif
+#ifdef CONFIG_EDAC_ALTERA_OCRAM
static const struct edac_device_prv_data a10_ocramecc_data;
+#endif
+#ifdef CONFIG_EDAC_ALTERA_L2C
static const struct edac_device_prv_data a10_l2ecc_data;
+#endif
static irqreturn_t altr_edac_device_handler(int irq, void *dev_id)
{
return ret_value;
}
-static ssize_t altr_edac_device_trig(struct file *file,
- const char __user *user_buf,
- size_t count, loff_t *ppos)
+static ssize_t __maybe_unused
+altr_edac_device_trig(struct file *file, const char __user *user_buf,
+ size_t count, loff_t *ppos)
{
u32 *ptemp, i, error_mask;
return count;
}
-static const struct file_operations altr_edac_device_inject_fops = {
+static const struct file_operations altr_edac_device_inject_fops __maybe_unused = {
.open = simple_open,
.write = altr_edac_device_trig,
.llseek = generic_file_llseek,
};
-static ssize_t altr_edac_a10_device_trig(struct file *file,
- const char __user *user_buf,
- size_t count, loff_t *ppos);
+static ssize_t __maybe_unused
+altr_edac_a10_device_trig(struct file *file, const char __user *user_buf,
+ size_t count, loff_t *ppos);
-static const struct file_operations altr_edac_a10_device_inject_fops = {
+static const struct file_operations altr_edac_a10_device_inject_fops __maybe_unused = {
.open = simple_open,
.write = altr_edac_a10_device_trig,
.llseek = generic_file_llseek,
};
-static ssize_t altr_edac_a10_device_trig2(struct file *file,
- const char __user *user_buf,
- size_t count, loff_t *ppos);
+static ssize_t __maybe_unused
+altr_edac_a10_device_trig2(struct file *file, const char __user *user_buf,
+ size_t count, loff_t *ppos);
-static const struct file_operations altr_edac_a10_device_inject2_fops = {
+static const struct file_operations altr_edac_a10_device_inject2_fops __maybe_unused = {
.open = simple_open,
.write = altr_edac_a10_device_trig2,
.llseek = generic_file_llseek,
* Based on xgene_edac.c peripheral code.
*/
-static ssize_t altr_edac_a10_device_trig(struct file *file,
- const char __user *user_buf,
- size_t count, loff_t *ppos)
+static ssize_t __maybe_unused
+altr_edac_a10_device_trig(struct file *file, const char __user *user_buf,
+ size_t count, loff_t *ppos)
{
struct edac_device_ctl_info *edac_dci = file->private_data;
struct altr_edac_device_dev *drvdata = edac_dci->pvt_info;
* slightly. A few Arria10 peripherals can use this injection function.
* Inject the error into the memory and then readback to trigger the IRQ.
*/
-static ssize_t altr_edac_a10_device_trig2(struct file *file,
- const char __user *user_buf,
- size_t count, loff_t *ppos)
+static ssize_t __maybe_unused
+altr_edac_a10_device_trig2(struct file *file, const char __user *user_buf,
+ size_t count, loff_t *ppos)
{
struct edac_device_ctl_info *edac_dci = file->private_data;
struct altr_edac_device_dev *drvdata = edac_dci->pvt_info;
regmap_read(edac->ecc_mgr_map, sm_offset, &irq_status);
bits = irq_status;
- for_each_set_bit(bit, &bits, 32) {
- irq = irq_linear_revmap(edac->domain, dberr * 32 + bit);
- if (irq)
- generic_handle_irq(irq);
- }
+ for_each_set_bit(bit, &bits, 32)
+ generic_handle_domain_irq(edac->domain, dberr * 32 + bit);
chained_irq_exit(chip, desc);
}
EDAC_DCT_ATTR_SHOW(top_mem);
EDAC_DCT_ATTR_SHOW(top_mem2);
-static ssize_t hole_show(struct device *dev, struct device_attribute *mattr,
- char *data)
+static ssize_t dram_hole_show(struct device *dev, struct device_attribute *mattr,
+ char *data)
{
struct mem_ctl_info *mci = to_mci(dev);
static DEVICE_ATTR(dbam, S_IRUGO, dbam0_show, NULL);
static DEVICE_ATTR(topmem, S_IRUGO, top_mem_show, NULL);
static DEVICE_ATTR(topmem2, S_IRUGO, top_mem2_show, NULL);
-static DEVICE_ATTR(dram_hole, S_IRUGO, hole_show, NULL);
+static DEVICE_ATTR_RO(dram_hole);
static struct attribute *dbg_attrs[] = {
&dev_attr_dhar.attr,
* update NUM_INJ_ATTRS in case you add new members
*/
-static DEVICE_ATTR(inject_section, S_IRUGO | S_IWUSR,
- inject_section_show, inject_section_store);
-static DEVICE_ATTR(inject_word, S_IRUGO | S_IWUSR,
- inject_word_show, inject_word_store);
-static DEVICE_ATTR(inject_ecc_vector, S_IRUGO | S_IWUSR,
- inject_ecc_vector_show, inject_ecc_vector_store);
-static DEVICE_ATTR(inject_write, S_IWUSR,
- NULL, inject_write_store);
-static DEVICE_ATTR(inject_read, S_IWUSR,
- NULL, inject_read_store);
+static DEVICE_ATTR_RW(inject_section);
+static DEVICE_ATTR_RW(inject_word);
+static DEVICE_ATTR_RW(inject_ecc_vector);
+static DEVICE_ATTR_WO(inject_write);
+static DEVICE_ATTR_WO(inject_read);
static struct attribute *inj_attrs[] = {
&dev_attr_inject_section.attr,
[MEM_DDR5] = "Unbuffered-DDR5",
[MEM_NVDIMM] = "Non-volatile-RAM",
[MEM_WIO2] = "Wide-IO-2",
+ [MEM_HBM2] = "High-bandwidth-memory-Gen2",
};
EXPORT_SYMBOL_GPL(edac_mem_types);
#define I10NM_GET_DIMMMTR(m, i, j) \
readl((m)->mbase + ((m)->hbm_mc ? 0x80c : 0x2080c) + \
(i) * (m)->chan_mmio_sz + (j) * 4)
-#define I10NM_GET_MCDDRTCFG(m, i, j) \
+#define I10NM_GET_MCDDRTCFG(m, i) \
readl((m)->mbase + ((m)->hbm_mc ? 0x970 : 0x20970) + \
- (i) * (m)->chan_mmio_sz + (j) * 4)
+ (i) * (m)->chan_mmio_sz)
#define I10NM_GET_MCMTR(m, i) \
readl((m)->mbase + ((m)->hbm_mc ? 0xef8 : 0x20ef8) + \
(i) * (m)->chan_mmio_sz)
#define I10NM_GET_AMAP(m, i) \
readl((m)->mbase + ((m)->hbm_mc ? 0x814 : 0x20814) + \
(i) * (m)->chan_mmio_sz)
+#define I10NM_GET_REG32(m, i, offset) \
+ readl((m)->mbase + (i) * (m)->chan_mmio_sz + (offset))
+#define I10NM_GET_REG64(m, i, offset) \
+ readq((m)->mbase + (i) * (m)->chan_mmio_sz + (offset))
+#define I10NM_SET_REG32(m, i, offset, v) \
+ writel(v, (m)->mbase + (i) * (m)->chan_mmio_sz + (offset))
#define I10NM_GET_SCK_MMIO_BASE(reg) (GET_BITFIELD(reg, 0, 28) << 23)
#define I10NM_GET_IMC_MMIO_OFFSET(reg) (GET_BITFIELD(reg, 0, 10) << 12)
#define I10NM_SAD_ENABLE(reg) GET_BITFIELD(reg, 0, 0)
#define I10NM_SAD_NM_CACHEABLE(reg) GET_BITFIELD(reg, 5, 5)
+#define RETRY_RD_ERR_LOG_UC BIT(1)
+#define RETRY_RD_ERR_LOG_NOOVER BIT(14)
+#define RETRY_RD_ERR_LOG_EN BIT(15)
+#define RETRY_RD_ERR_LOG_NOOVER_UC (BIT(14) | BIT(1))
+#define RETRY_RD_ERR_LOG_OVER_UC_V (BIT(2) | BIT(1) | BIT(0))
+
static struct list_head *i10nm_edac_list;
+static struct res_config *res_cfg;
+static int retry_rd_err_log;
+
+static u32 offsets_scrub_icx[] = {0x22c60, 0x22c54, 0x22c5c, 0x22c58, 0x22c28, 0x20ed8};
+static u32 offsets_scrub_spr[] = {0x22c60, 0x22c54, 0x22f08, 0x22c58, 0x22c28, 0x20ed8};
+static u32 offsets_demand_icx[] = {0x22e54, 0x22e60, 0x22e64, 0x22e58, 0x22e5c, 0x20ee0};
+static u32 offsets_demand_spr[] = {0x22e54, 0x22e60, 0x22f10, 0x22e58, 0x22e5c, 0x20ee0};
+
+static void __enable_retry_rd_err_log(struct skx_imc *imc, int chan, bool enable)
+{
+ u32 s, d;
+
+ if (!imc->mbase)
+ return;
+
+ s = I10NM_GET_REG32(imc, chan, res_cfg->offsets_scrub[0]);
+ d = I10NM_GET_REG32(imc, chan, res_cfg->offsets_demand[0]);
+
+ if (enable) {
+ /* Save default configurations */
+ imc->chan[chan].retry_rd_err_log_s = s;
+ imc->chan[chan].retry_rd_err_log_d = d;
+
+ s &= ~RETRY_RD_ERR_LOG_NOOVER_UC;
+ s |= RETRY_RD_ERR_LOG_EN;
+ d &= ~RETRY_RD_ERR_LOG_NOOVER_UC;
+ d |= RETRY_RD_ERR_LOG_EN;
+ } else {
+ /* Restore default configurations */
+ if (imc->chan[chan].retry_rd_err_log_s & RETRY_RD_ERR_LOG_UC)
+ s |= RETRY_RD_ERR_LOG_UC;
+ if (imc->chan[chan].retry_rd_err_log_s & RETRY_RD_ERR_LOG_NOOVER)
+ s |= RETRY_RD_ERR_LOG_NOOVER;
+ if (!(imc->chan[chan].retry_rd_err_log_s & RETRY_RD_ERR_LOG_EN))
+ s &= ~RETRY_RD_ERR_LOG_EN;
+ if (imc->chan[chan].retry_rd_err_log_d & RETRY_RD_ERR_LOG_UC)
+ d |= RETRY_RD_ERR_LOG_UC;
+ if (imc->chan[chan].retry_rd_err_log_d & RETRY_RD_ERR_LOG_NOOVER)
+ d |= RETRY_RD_ERR_LOG_NOOVER;
+ if (!(imc->chan[chan].retry_rd_err_log_d & RETRY_RD_ERR_LOG_EN))
+ d &= ~RETRY_RD_ERR_LOG_EN;
+ }
+
+ I10NM_SET_REG32(imc, chan, res_cfg->offsets_scrub[0], s);
+ I10NM_SET_REG32(imc, chan, res_cfg->offsets_demand[0], d);
+}
+
+static void enable_retry_rd_err_log(bool enable)
+{
+ struct skx_dev *d;
+ int i, j;
+
+ edac_dbg(2, "\n");
+
+ list_for_each_entry(d, i10nm_edac_list, list)
+ for (i = 0; i < I10NM_NUM_IMC; i++)
+ for (j = 0; j < I10NM_NUM_CHANNELS; j++)
+ __enable_retry_rd_err_log(&d->imc[i], j, enable);
+}
+
+static void show_retry_rd_err_log(struct decoded_addr *res, char *msg,
+ int len, bool scrub_err)
+{
+ struct skx_imc *imc = &res->dev->imc[res->imc];
+ u32 log0, log1, log2, log3, log4;
+ u32 corr0, corr1, corr2, corr3;
+ u64 log2a, log5;
+ u32 *offsets;
+ int n;
+
+ if (!imc->mbase)
+ return;
+
+ offsets = scrub_err ? res_cfg->offsets_scrub : res_cfg->offsets_demand;
+
+ log0 = I10NM_GET_REG32(imc, res->channel, offsets[0]);
+ log1 = I10NM_GET_REG32(imc, res->channel, offsets[1]);
+ log3 = I10NM_GET_REG32(imc, res->channel, offsets[3]);
+ log4 = I10NM_GET_REG32(imc, res->channel, offsets[4]);
+ log5 = I10NM_GET_REG64(imc, res->channel, offsets[5]);
+
+ if (res_cfg->type == SPR) {
+ log2a = I10NM_GET_REG64(imc, res->channel, offsets[2]);
+ n = snprintf(msg, len, " retry_rd_err_log[%.8x %.8x %.16llx %.8x %.8x %.16llx]",
+ log0, log1, log2a, log3, log4, log5);
+ } else {
+ log2 = I10NM_GET_REG32(imc, res->channel, offsets[2]);
+ n = snprintf(msg, len, " retry_rd_err_log[%.8x %.8x %.8x %.8x %.8x %.16llx]",
+ log0, log1, log2, log3, log4, log5);
+ }
+
+ corr0 = I10NM_GET_REG32(imc, res->channel, 0x22c18);
+ corr1 = I10NM_GET_REG32(imc, res->channel, 0x22c1c);
+ corr2 = I10NM_GET_REG32(imc, res->channel, 0x22c20);
+ corr3 = I10NM_GET_REG32(imc, res->channel, 0x22c24);
+
+ if (len - n > 0)
+ snprintf(msg + n, len - n,
+ " correrrcnt[%.4x %.4x %.4x %.4x %.4x %.4x %.4x %.4x]",
+ corr0 & 0xffff, corr0 >> 16,
+ corr1 & 0xffff, corr1 >> 16,
+ corr2 & 0xffff, corr2 >> 16,
+ corr3 & 0xffff, corr3 >> 16);
+
+ /* Clear status bits */
+ if (retry_rd_err_log == 2 && (log0 & RETRY_RD_ERR_LOG_OVER_UC_V)) {
+ log0 &= ~RETRY_RD_ERR_LOG_OVER_UC_V;
+ I10NM_SET_REG32(imc, res->channel, offsets[0], log0);
+ }
+}
+
static struct pci_dev *pci_get_dev_wrapper(int dom, unsigned int bus,
unsigned int dev, unsigned int fun)
{
.ddr_chan_mmio_sz = 0x4000,
.sad_all_devfn = PCI_DEVFN(29, 0),
.sad_all_offset = 0x108,
+ .offsets_scrub = offsets_scrub_icx,
+ .offsets_demand = offsets_demand_icx,
};
static struct res_config i10nm_cfg1 = {
.ddr_chan_mmio_sz = 0x4000,
.sad_all_devfn = PCI_DEVFN(29, 0),
.sad_all_offset = 0x108,
+ .offsets_scrub = offsets_scrub_icx,
+ .offsets_demand = offsets_demand_icx,
};
static struct res_config spr_cfg = {
.support_ddr5 = true,
.sad_all_devfn = PCI_DEVFN(10, 0),
.sad_all_offset = 0x300,
+ .offsets_scrub = offsets_scrub_spr,
+ .offsets_demand = offsets_demand_spr,
};
static const struct x86_cpu_id i10nm_cpuids[] = {
ndimms = 0;
amap = I10NM_GET_AMAP(imc, i);
+ mcddrtcfg = I10NM_GET_MCDDRTCFG(imc, i);
for (j = 0; j < imc->num_dimms; j++) {
dimm = edac_get_dimm(mci, i, j, 0);
mtr = I10NM_GET_DIMMMTR(imc, i, j);
- mcddrtcfg = I10NM_GET_MCDDRTCFG(imc, i, j);
edac_dbg(1, "dimmmtr 0x%x mcddrtcfg 0x%x (mc%d ch%d dimm%d)\n",
mtr, mcddrtcfg, imc->mc, i, j);
return -ENODEV;
cfg = (struct res_config *)id->driver_data;
+ res_cfg = cfg;
rc = skx_get_hi_lo(0x09a2, off, &tolm, &tohm);
if (rc)
mce_register_decode_chain(&i10nm_mce_dec);
setup_i10nm_debug();
+ if (retry_rd_err_log && res_cfg->offsets_scrub && res_cfg->offsets_demand) {
+ skx_set_decode(NULL, show_retry_rd_err_log);
+ if (retry_rd_err_log == 2)
+ enable_retry_rd_err_log(true);
+ }
+
i10nm_printk(KERN_INFO, "%s\n", I10NM_REVISION);
return 0;
static void __exit i10nm_exit(void)
{
edac_dbg(2, "\n");
+
+ if (retry_rd_err_log && res_cfg->offsets_scrub && res_cfg->offsets_demand) {
+ skx_set_decode(NULL, NULL);
+ if (retry_rd_err_log == 2)
+ enable_retry_rd_err_log(false);
+ }
+
teardown_i10nm_debug();
mce_unregister_decode_chain(&i10nm_mce_dec);
skx_adxl_put();
module_init(i10nm_init);
module_exit(i10nm_exit);
+module_param(retry_rd_err_log, int, 0444);
+MODULE_PARM_DESC(retry_rd_err_log, "retry_rd_err_log: 0=off(default), 1=bios(Linux doesn't reset any control bits, but just reports values.), 2=linux(Linux tries to take control and resets mode bits, clear valid/UC bits after reading.)");
+
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("MC Driver for Intel 10nm server processors");
c->x86_vendor != X86_VENDOR_HYGON)
return -ENODEV;
+ if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
+ return -ENODEV;
+
if (boot_cpu_has(X86_FEATURE_SMCA)) {
xec_mask = 0x3f;
goto out;
#define SKX_ILV_TARGET(tgt) ((tgt) & 7)
static void skx_show_retry_rd_err_log(struct decoded_addr *res,
- char *msg, int len)
+ char *msg, int len,
+ bool scrub_err)
{
u32 log0, log1, log2, log3, log4;
u32 corr0, corr1, corr2, corr3;
rows = numrow(mtr);
cols = imc->hbm_mc ? 6 : numcol(mtr);
- if (cfg->support_ddr5 && ((amap & 0x8) || imc->hbm_mc)) {
+ if (imc->hbm_mc) {
+ banks = 32;
+ mtype = MEM_HBM2;
+ } else if (cfg->support_ddr5 && (amap & 0x8)) {
banks = 32;
mtype = MEM_DDR5;
} else {
bool ripv = GET_BITFIELD(m->mcgstatus, 0, 0);
bool overflow = GET_BITFIELD(m->status, 62, 62);
bool uncorrected_error = GET_BITFIELD(m->status, 61, 61);
+ bool scrub_err = false;
bool recoverable;
int len;
u32 core_err_cnt = GET_BITFIELD(m->status, 38, 52);
break;
case 4:
optype = "memory scrubbing error";
+ scrub_err = true;
break;
default:
optype = "reserved";
}
if (skx_show_retry_rd_err_log)
- skx_show_retry_rd_err_log(res, skx_msg + len, MSG_SIZE - len);
+ skx_show_retry_rd_err_log(res, skx_msg + len, MSG_SIZE - len, scrub_err);
edac_dbg(0, "%s\n", skx_msg);
struct skx_channel {
struct pci_dev *cdev;
struct pci_dev *edev;
+ u32 retry_rd_err_log_s;
+ u32 retry_rd_err_log_d;
struct skx_dimm {
u8 close_pg;
u8 bank_xor_enable;
/* SAD device number and function number */
unsigned int sad_all_devfn;
int sad_all_offset;
+ /* Offsets of retry_rd_err_log registers */
+ u32 *offsets_scrub;
+ u32 *offsets_demand;
};
typedef int (*get_dimm_config_f)(struct mem_ctl_info *mci,
struct res_config *cfg);
typedef bool (*skx_decode_f)(struct decoded_addr *res);
-typedef void (*skx_show_retry_log_f)(struct decoded_addr *res, char *msg, int len);
+typedef void (*skx_show_retry_log_f)(struct decoded_addr *res, char *msg, int len, bool scrub_err);
int __init skx_adxl_get(void);
void __exit skx_adxl_put(void);
return 0;
n = 0;
- len = CPER_REC_LEN - 1;
+ len = CPER_REC_LEN;
if (mem->validation_bits & CPER_MEM_VALID_NODE)
n += scnprintf(msg + n, len - n, "node: %d ", mem->node);
if (mem->validation_bits & CPER_MEM_VALID_CARD)
n += scnprintf(msg + n, len - n, "responder_id: 0x%016llx ",
mem->responder_id);
if (mem->validation_bits & CPER_MEM_VALID_TARGET_ID)
- scnprintf(msg + n, len - n, "target_id: 0x%016llx ",
- mem->target_id);
+ n += scnprintf(msg + n, len - n, "target_id: 0x%016llx ",
+ mem->target_id);
if (mem->validation_bits & CPER_MEM_VALID_CHIP_ID)
- scnprintf(msg + n, len - n, "chip_id: %d ",
- mem->extended >> CPER_MEM_CHIP_ID_SHIFT);
+ n += scnprintf(msg + n, len - n, "chip_id: %d ",
+ mem->extended >> CPER_MEM_CHIP_ID_SHIFT);
- msg[n] = '\0';
return n;
}
data_len = estatus->data_length;
apei_estatus_for_each_section(estatus, gdata) {
- if (sizeof(struct acpi_hest_generic_data) > data_len)
+ if (acpi_hest_get_size(gdata) > data_len)
return -EINVAL;
record_size = acpi_hest_get_record_size(gdata);
#include <linux/init.h>
#include <linux/arm-smccc.h>
#include <linux/kernel.h>
+#include <linux/platform_device.h>
#include <asm/archrandom.h>
static u32 smccc_version = ARM_SMCCC_VERSION_1_0;
return smccc_version;
}
EXPORT_SYMBOL_GPL(arm_smccc_get_version);
+
+static int __init smccc_devices_init(void)
+{
+ struct platform_device *pdev;
+
+ if (smccc_trng_available) {
+ pdev = platform_device_register_simple("smccc_trng", -1,
+ NULL, 0);
+ if (IS_ERR(pdev))
+ pr_err("smccc_trng: could not register device: %ld\n",
+ PTR_ERR(pdev));
+ }
+
+ return 0;
+}
+device_initcall(smccc_devices_init);
A 32-bit single register GPIO fixed in/out implementation. This
can be used to represent any register as a set of GPIO signals.
+config GPIO_ROCKCHIP
+ tristate "Rockchip GPIO support"
+ depends on ARCH_ROCKCHIP || COMPILE_TEST
+ select GPIOLIB_IRQCHIP
+ default ARCH_ROCKCHIP
+ help
+ Say yes here to support GPIO on Rockchip SoCs.
+
config GPIO_SAMA5D2_PIOBU
tristate "SAMA5D2 PIOBU GPIO support"
depends on MFD_SYSCON
obj-$(CONFIG_GPIO_RDC321X) += gpio-rdc321x.o
obj-$(CONFIG_GPIO_REALTEK_OTTO) += gpio-realtek-otto.o
obj-$(CONFIG_GPIO_REG) += gpio-reg.o
+obj-$(CONFIG_GPIO_ROCKCHIP) += gpio-rockchip.o
obj-$(CONFIG_ARCH_SA1100) += gpio-sa1100.o
obj-$(CONFIG_GPIO_SAMA5D2_PIOBU) += gpio-sama5d2-piobu.o
obj-$(CONFIG_GPIO_SCH311X) += gpio-sch311x.o
unsigned long gpio;
for_each_set_bit(gpio, &irq_mask, 2)
- generic_handle_irq(irq_find_mapping(chip->irq.domain,
- 19 + gpio*24));
+ generic_handle_domain_irq(chip->irq.domain,
+ 19 + gpio*24);
raw_spin_lock(&dio48egpio->lock);
for_each_set_bit(bit_num, &irq_mask, 8) {
gpio = bit_num + boundary * 8;
- generic_handle_irq(irq_find_mapping(chip->irq.domain,
- gpio));
+ generic_handle_domain_irq(chip->irq.domain,
+ gpio);
}
}
int gpio;
for_each_set_bit(gpio, &idio16gpio->irq_mask, chip->ngpio)
- generic_handle_irq(irq_find_mapping(chip->irq.domain, gpio));
+ generic_handle_domain_irq(chip->irq.domain, gpio);
raw_spin_lock(&idio16gpio->lock);
(readl(mm_gc->regs + ALTERA_GPIO_EDGE_CAP) &
readl(mm_gc->regs + ALTERA_GPIO_IRQ_MASK)))) {
writel(status, mm_gc->regs + ALTERA_GPIO_EDGE_CAP);
- for_each_set_bit(i, &status, mm_gc->gc.ngpio) {
- generic_handle_irq(irq_find_mapping(irqdomain, i));
- }
+ for_each_set_bit(i, &status, mm_gc->gc.ngpio)
+ generic_handle_domain_irq(irqdomain, i);
}
chained_irq_exit(chip, desc);
status = readl(mm_gc->regs + ALTERA_GPIO_DATA);
status &= readl(mm_gc->regs + ALTERA_GPIO_IRQ_MASK);
- for_each_set_bit(i, &status, mm_gc->gc.ngpio) {
- generic_handle_irq(irq_find_mapping(irqdomain, i));
- }
+ for_each_set_bit(i, &status, mm_gc->gc.ngpio)
+ generic_handle_domain_irq(irqdomain, i);
+
chained_irq_exit(chip, desc);
}
struct gpio_chip *gc = irq_desc_get_handler_data(desc);
struct irq_chip *ic = irq_desc_get_chip(desc);
struct aspeed_sgpio *data = gpiochip_get_data(gc);
- unsigned int i, p, girq;
+ unsigned int i, p;
unsigned long reg;
chained_irq_enter(ic, desc);
reg = ioread32(bank_reg(data, bank, reg_irq_status));
- for_each_set_bit(p, ®, 32) {
- girq = irq_find_mapping(gc->irq.domain, i * 32 + p);
- generic_handle_irq(girq);
- }
-
+ for_each_set_bit(p, ®, 32)
+ generic_handle_domain_irq(gc->irq.domain, i * 32 + p);
}
chained_irq_exit(ic, desc);
struct gpio_chip *gc = irq_desc_get_handler_data(desc);
struct irq_chip *ic = irq_desc_get_chip(desc);
struct aspeed_gpio *data = gpiochip_get_data(gc);
- unsigned int i, p, girq, banks;
+ unsigned int i, p, banks;
unsigned long reg;
struct aspeed_gpio *gpio = gpiochip_get_data(gc);
reg = ioread32(bank_reg(data, bank, reg_irq_status));
- for_each_set_bit(p, ®, 32) {
- girq = irq_find_mapping(gc->irq.domain, i * 32 + p);
- generic_handle_irq(girq);
- }
-
+ for_each_set_bit(p, ®, 32)
+ generic_handle_domain_irq(gc->irq.domain, i * 32 + p);
}
chained_irq_exit(ic, desc);
raw_spin_unlock_irqrestore(&ctrl->lock, flags);
- if (pending) {
- for_each_set_bit(irq, &pending, gc->ngpio)
- generic_handle_irq(
- irq_linear_revmap(gc->irq.domain, irq));
- }
+ for_each_set_bit(irq, &pending, gc->ngpio)
+ generic_handle_domain_irq(gc->irq.domain, irq);
chained_irq_exit(irqchip, desc);
}
(~(readl(reg_base + GPIO_INT_MASK(bank_id)))))) {
for_each_set_bit(bit, &sta, 32) {
int hwirq = GPIO_PER_BANK * bank_id + bit;
- int child_irq =
- irq_find_mapping(bank->kona_gpio->irq_domain,
- hwirq);
/*
* Clear interrupt before handler is called so we don't
* miss any interrupt occurred during executing them.
writel(readl(reg_base + GPIO_INT_STATUS(bank_id)) |
BIT(bit), reg_base + GPIO_INT_STATUS(bank_id));
/* Invoke interrupt handler */
- generic_handle_irq(child_irq);
+ generic_handle_domain_irq(bank->kona_gpio->irq_domain,
+ hwirq);
}
}
unsigned long status;
while ((status = brcmstb_gpio_get_active_irqs(bank))) {
- unsigned int irq, offset;
+ unsigned int offset;
for_each_set_bit(offset, &status, 32) {
if (offset >= bank->width)
dev_warn(&priv->pdev->dev,
"IRQ for invalid GPIO (bank=%d, offset=%d)\n",
bank->id, offset);
- irq = irq_linear_revmap(domain, hwbase + offset);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(domain, hwbase + offset);
}
}
}
~ioread32(cgpio->regs + CDNS_GPIO_IRQ_MASK);
for_each_set_bit(hwirq, &status, chip->ngpio)
- generic_handle_irq(irq_find_mapping(chip->irq.domain, hwirq));
+ generic_handle_domain_irq(chip->irq.domain, hwirq);
chained_irq_exit(irqchip, desc);
}
*/
hw_irq = (bank_num / 2) * 32 + bit;
- generic_handle_irq(
- irq_find_mapping(d->irq_domain, hw_irq));
+ generic_handle_domain_irq(d->irq_domain, hw_irq);
}
}
chained_irq_exit(irq_desc_get_chip(desc), desc);
static void dln2_gpio_event(struct platform_device *pdev, u16 echo,
const void *data, int len)
{
- int pin, irq;
+ int pin, ret;
const struct {
__le16 count;
return;
}
- irq = irq_find_mapping(dln2->gpio.irq.domain, pin);
- if (!irq) {
- dev_err(dln2->gpio.parent, "pin %d not mapped to IRQ\n", pin);
- return;
- }
-
switch (dln2->irq_type[pin]) {
case DLN2_GPIO_EVENT_CHANGE_RISING:
- if (event->value)
- generic_handle_irq(irq);
+ if (!event->value)
+ return;
break;
case DLN2_GPIO_EVENT_CHANGE_FALLING:
- if (!event->value)
- generic_handle_irq(irq);
+ if (event->value)
+ return;
break;
- default:
- generic_handle_irq(irq);
}
+
+ ret = generic_handle_domain_irq(dln2->gpio.irq.domain, pin);
+ if (unlikely(ret))
+ dev_err(dln2->gpio.parent, "pin %d not mapped to IRQ\n", pin);
}
static int dln2_gpio_probe(struct platform_device *pdev)
while ((pending = em_gio_read(p, GIO_MST))) {
offset = __ffs(pending);
em_gio_write(p, GIO_IIR, BIT(offset));
- generic_handle_irq(irq_find_mapping(p->irq_domain, offset));
+ generic_handle_domain_irq(p->irq_domain, offset);
irqs_handled++;
}
*/
stat = readb(epg->base + EP93XX_GPIO_A_INT_STATUS);
for_each_set_bit(offset, &stat, 8)
- generic_handle_irq(irq_find_mapping(epg->gc[0].gc.irq.domain,
- offset));
+ generic_handle_domain_irq(epg->gc[0].gc.irq.domain,
+ offset);
stat = readb(epg->base + EP93XX_GPIO_B_INT_STATUS);
for_each_set_bit(offset, &stat, 8)
- generic_handle_irq(irq_find_mapping(epg->gc[1].gc.irq.domain,
- offset));
+ generic_handle_domain_irq(epg->gc[1].gc.irq.domain,
+ offset);
chained_irq_exit(irqchip, desc);
}
stat = readl(g->base + GPIO_INT_STAT_RAW);
if (stat)
for_each_set_bit(offset, &stat, gc->ngpio)
- generic_handle_irq(irq_find_mapping(gc->irq.domain,
- offset));
+ generic_handle_domain_irq(gc->irq.domain, offset);
chained_irq_exit(irqchip, desc);
}
chained_irq_enter(irq_c, desc);
for_each_set_bit(hwirq, &irq_msk, HISI_GPIO_LINE_NUM_MAX)
- generic_handle_irq(irq_find_mapping(hisi_gpio->chip.irq.domain,
- hwirq));
+ generic_handle_domain_irq(hisi_gpio->chip.irq.domain,
+ hwirq);
chained_irq_exit(irq_c, desc);
}
chained_irq_enter(chip, desc);
- for_each_set_bit(hwirq, &pending, 32) {
- int irq = irq_find_mapping(hlwd->gpioc.irq.domain, hwirq);
-
- generic_handle_irq(irq);
- }
+ for_each_set_bit(hwirq, &pending, 32)
+ generic_handle_domain_irq(hlwd->gpioc.irq.domain, hwirq);
chained_irq_exit(chip, desc);
}
/* Only interrupts that are enabled */
pending &= enabled;
- for_each_set_bit(gpio, &pending, 32) {
- unsigned int irq;
-
- irq = irq_find_mapping(gc->irq.domain, base + gpio);
- generic_handle_irq(irq);
- }
+ for_each_set_bit(gpio, &pending, 32)
+ generic_handle_domain_irq(gc->irq.domain, base + gpio);
}
chained_irq_exit(irqchip, desc);
mask = gc->read_reg(mpc8xxx_gc->regs + GPIO_IER)
& gc->read_reg(mpc8xxx_gc->regs + GPIO_IMR);
for_each_set_bit(i, &mask, 32)
- generic_handle_irq(irq_linear_revmap(mpc8xxx_gc->irq, 31 - i));
+ generic_handle_domain_irq(mpc8xxx_gc->irq, 31 - i);
return IRQ_HANDLED;
}
pending = mtk_gpio_r32(rg, GPIO_REG_STAT);
for_each_set_bit(bit, &pending, MTK_BANK_WIDTH) {
- u32 map = irq_find_mapping(gc->irq.domain, bit);
-
- generic_handle_irq(map);
+ generic_handle_domain_irq(gc->irq.domain, bit);
mtk_gpio_w32(rg, GPIO_REG_STAT, BIT(bit));
ret |= IRQ_HANDLED;
}
if (port->both_edges & (1 << irqoffset))
mxc_flip_edge(port, irqoffset);
- generic_handle_irq(irq_find_mapping(port->domain, irqoffset));
+ generic_handle_domain_irq(port->domain, irqoffset);
irq_stat &= ~(1 << irqoffset);
}
if (port->both_edges & (1 << irqoffset))
mxs_flip_edge(port, irqoffset);
- generic_handle_irq(irq_find_mapping(port->domain, irqoffset));
+ generic_handle_domain_irq(port->domain, irqoffset);
irq_stat &= ~(1 << irqoffset);
}
}
raw_spin_lock_irqsave(&bank->wa_lock, wa_lock_flags);
- generic_handle_irq(irq_find_mapping(bank->chip.irq.domain,
- bit));
+ generic_handle_domain_irq(bank->chip.irq.domain, bit);
raw_spin_unlock_irqrestore(&bank->wa_lock,
wa_lock_flags);
return IRQ_NONE;
for_each_set_bit(gpio, &idio16gpio->irq_mask, chip->ngpio)
- generic_handle_irq(irq_find_mapping(chip->irq.domain, gpio));
+ generic_handle_domain_irq(chip->irq.domain, gpio);
raw_spin_lock(&idio16gpio->lock);
irq_mask = idio24gpio->irq_mask & irq_status;
for_each_set_bit(gpio, &irq_mask, chip->ngpio - 24)
- generic_handle_irq(irq_find_mapping(chip->irq.domain,
- gpio + 24));
+ generic_handle_domain_irq(chip->irq.domain, gpio + 24);
raw_spin_lock(&idio24gpio->lock);
pending = readb(pl061->base + GPIOMIS);
if (pending) {
for_each_set_bit(offset, &pending, PL061_GPIO_NR)
- generic_handle_irq(irq_find_mapping(gc->irq.domain,
- offset));
+ generic_handle_domain_irq(gc->irq.domain,
+ offset);
}
chained_irq_exit(irqchip, desc);
for_each_set_bit(n, &gedr, BITS_PER_LONG) {
loop = 1;
- generic_handle_irq(
- irq_find_mapping(pchip->irqdomain,
- gpio + n));
+ generic_handle_domain_irq(pchip->irqdomain,
+ gpio + n);
}
}
handled += loop;
struct pxa_gpio_chip *pchip = d;
if (in_irq == pchip->irq0) {
- generic_handle_irq(irq_find_mapping(pchip->irqdomain, 0));
+ generic_handle_domain_irq(pchip->irqdomain, 0);
} else if (in_irq == pchip->irq1) {
- generic_handle_irq(irq_find_mapping(pchip->irqdomain, 1));
+ generic_handle_domain_irq(pchip->irqdomain, 1);
} else {
pr_err("%s() unknown irq %d\n", __func__, in_irq);
return IRQ_NONE;
gpio_rcar_read(p, INTMSK))) {
offset = __ffs(pending);
gpio_rcar_write(p, INTCLR, BIT(offset));
- generic_handle_irq(irq_find_mapping(p->gpio_chip.irq.domain,
- offset));
+ generic_handle_domain_irq(p->gpio_chip.irq.domain,
+ offset);
irqs_handled++;
}
struct irq_chip *ic = irq_desc_get_chip(desc);
struct rda_gpio *rda_gpio = gpiochip_get_data(chip);
unsigned long status;
- u32 n, girq;
+ u32 n;
chained_irq_enter(ic, desc);
/* Only lower 8 bits are capable of generating interrupts */
status &= RDA_GPIO_IRQ_MASK;
- for_each_set_bit(n, &status, RDA_GPIO_BANK_NR) {
- girq = irq_find_mapping(chip->irq.domain, n);
- generic_handle_irq(girq);
- }
+ for_each_set_bit(n, &status, RDA_GPIO_BANK_NR)
+ generic_handle_domain_irq(chip->irq.domain, n);
chained_irq_exit(ic, desc);
}
struct irq_chip *irq_chip = irq_desc_get_chip(desc);
unsigned int lines_done;
unsigned int port_pin_count;
- unsigned int irq;
unsigned long status;
int offset;
for (lines_done = 0; lines_done < gc->ngpio; lines_done += 8) {
status = realtek_gpio_read_isr(ctrl, lines_done / 8);
port_pin_count = min(gc->ngpio - lines_done, 8U);
- for_each_set_bit(offset, &status, port_pin_count) {
- irq = irq_find_mapping(gc->irq.domain, offset);
- generic_handle_irq(irq);
- }
+ for_each_set_bit(offset, &status, port_pin_count)
+ generic_handle_domain_irq(gc->irq.domain, offset);
}
chained_irq_exit(irq_chip, desc);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2013 MundoReader S.L.
+ * Author: Heiko Stuebner <heiko@sntech.de>
+ *
+ * Copyright (c) 2021 Rockchip Electronics Co. Ltd.
+ */
+
+#include <linux/bitops.h>
+#include <linux/clk.h>
+#include <linux/device.h>
+#include <linux/err.h>
+#include <linux/gpio/driver.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_irq.h>
+#include <linux/regmap.h>
+
+#include "../pinctrl/core.h"
+#include "../pinctrl/pinctrl-rockchip.h"
+
+#define GPIO_TYPE_V1 (0) /* GPIO Version ID reserved */
+#define GPIO_TYPE_V2 (0x01000C2B) /* GPIO Version ID 0x01000C2B */
+
+static const struct rockchip_gpio_regs gpio_regs_v1 = {
+ .port_dr = 0x00,
+ .port_ddr = 0x04,
+ .int_en = 0x30,
+ .int_mask = 0x34,
+ .int_type = 0x38,
+ .int_polarity = 0x3c,
+ .int_status = 0x40,
+ .int_rawstatus = 0x44,
+ .debounce = 0x48,
+ .port_eoi = 0x4c,
+ .ext_port = 0x50,
+};
+
+static const struct rockchip_gpio_regs gpio_regs_v2 = {
+ .port_dr = 0x00,
+ .port_ddr = 0x08,
+ .int_en = 0x10,
+ .int_mask = 0x18,
+ .int_type = 0x20,
+ .int_polarity = 0x28,
+ .int_bothedge = 0x30,
+ .int_status = 0x50,
+ .int_rawstatus = 0x58,
+ .debounce = 0x38,
+ .dbclk_div_en = 0x40,
+ .dbclk_div_con = 0x48,
+ .port_eoi = 0x60,
+ .ext_port = 0x70,
+ .version_id = 0x78,
+};
+
+static inline void gpio_writel_v2(u32 val, void __iomem *reg)
+{
+ writel((val & 0xffff) | 0xffff0000, reg);
+ writel((val >> 16) | 0xffff0000, reg + 0x4);
+}
+
+static inline u32 gpio_readl_v2(void __iomem *reg)
+{
+ return readl(reg + 0x4) << 16 | readl(reg);
+}
+
+static inline void rockchip_gpio_writel(struct rockchip_pin_bank *bank,
+ u32 value, unsigned int offset)
+{
+ void __iomem *reg = bank->reg_base + offset;
+
+ if (bank->gpio_type == GPIO_TYPE_V2)
+ gpio_writel_v2(value, reg);
+ else
+ writel(value, reg);
+}
+
+static inline u32 rockchip_gpio_readl(struct rockchip_pin_bank *bank,
+ unsigned int offset)
+{
+ void __iomem *reg = bank->reg_base + offset;
+ u32 value;
+
+ if (bank->gpio_type == GPIO_TYPE_V2)
+ value = gpio_readl_v2(reg);
+ else
+ value = readl(reg);
+
+ return value;
+}
+
+static inline void rockchip_gpio_writel_bit(struct rockchip_pin_bank *bank,
+ u32 bit, u32 value,
+ unsigned int offset)
+{
+ void __iomem *reg = bank->reg_base + offset;
+ u32 data;
+
+ if (bank->gpio_type == GPIO_TYPE_V2) {
+ if (value)
+ data = BIT(bit % 16) | BIT(bit % 16 + 16);
+ else
+ data = BIT(bit % 16 + 16);
+ writel(data, bit >= 16 ? reg + 0x4 : reg);
+ } else {
+ data = readl(reg);
+ data &= ~BIT(bit);
+ if (value)
+ data |= BIT(bit);
+ writel(data, reg);
+ }
+}
+
+static inline u32 rockchip_gpio_readl_bit(struct rockchip_pin_bank *bank,
+ u32 bit, unsigned int offset)
+{
+ void __iomem *reg = bank->reg_base + offset;
+ u32 data;
+
+ if (bank->gpio_type == GPIO_TYPE_V2) {
+ data = readl(bit >= 16 ? reg + 0x4 : reg);
+ data >>= bit % 16;
+ } else {
+ data = readl(reg);
+ data >>= bit;
+ }
+
+ return data & (0x1);
+}
+
+static int rockchip_gpio_get_direction(struct gpio_chip *chip,
+ unsigned int offset)
+{
+ struct rockchip_pin_bank *bank = gpiochip_get_data(chip);
+ u32 data;
+
+ data = rockchip_gpio_readl_bit(bank, offset, bank->gpio_regs->port_ddr);
+ if (data & BIT(offset))
+ return GPIO_LINE_DIRECTION_OUT;
+
+ return GPIO_LINE_DIRECTION_IN;
+}
+
+static int rockchip_gpio_set_direction(struct gpio_chip *chip,
+ unsigned int offset, bool input)
+{
+ struct rockchip_pin_bank *bank = gpiochip_get_data(chip);
+ unsigned long flags;
+ u32 data = input ? 0 : 1;
+
+ raw_spin_lock_irqsave(&bank->slock, flags);
+ rockchip_gpio_writel_bit(bank, offset, data, bank->gpio_regs->port_ddr);
+ raw_spin_unlock_irqrestore(&bank->slock, flags);
+
+ return 0;
+}
+
+static void rockchip_gpio_set(struct gpio_chip *gc, unsigned int offset,
+ int value)
+{
+ struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&bank->slock, flags);
+ rockchip_gpio_writel_bit(bank, offset, value, bank->gpio_regs->port_dr);
+ raw_spin_unlock_irqrestore(&bank->slock, flags);
+}
+
+static int rockchip_gpio_get(struct gpio_chip *gc, unsigned int offset)
+{
+ struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
+ u32 data;
+
+ data = readl(bank->reg_base + bank->gpio_regs->ext_port);
+ data >>= offset;
+ data &= 1;
+
+ return data;
+}
+
+static int rockchip_gpio_set_debounce(struct gpio_chip *gc,
+ unsigned int offset,
+ unsigned int debounce)
+{
+ struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
+ const struct rockchip_gpio_regs *reg = bank->gpio_regs;
+ unsigned long flags, div_reg, freq, max_debounce;
+ bool div_debounce_support;
+ unsigned int cur_div_reg;
+ u64 div;
+
+ if (!IS_ERR(bank->db_clk)) {
+ div_debounce_support = true;
+ freq = clk_get_rate(bank->db_clk);
+ max_debounce = (GENMASK(23, 0) + 1) * 2 * 1000000 / freq;
+ if (debounce > max_debounce)
+ return -EINVAL;
+
+ div = debounce * freq;
+ div_reg = DIV_ROUND_CLOSEST_ULL(div, 2 * USEC_PER_SEC) - 1;
+ } else {
+ div_debounce_support = false;
+ }
+
+ raw_spin_lock_irqsave(&bank->slock, flags);
+
+ /* Only the v1 needs to configure div_en and div_con for dbclk */
+ if (debounce) {
+ if (div_debounce_support) {
+ /* Configure the max debounce from consumers */
+ cur_div_reg = readl(bank->reg_base +
+ reg->dbclk_div_con);
+ if (cur_div_reg < div_reg)
+ writel(div_reg, bank->reg_base +
+ reg->dbclk_div_con);
+ rockchip_gpio_writel_bit(bank, offset, 1,
+ reg->dbclk_div_en);
+ }
+
+ rockchip_gpio_writel_bit(bank, offset, 1, reg->debounce);
+ } else {
+ if (div_debounce_support)
+ rockchip_gpio_writel_bit(bank, offset, 0,
+ reg->dbclk_div_en);
+
+ rockchip_gpio_writel_bit(bank, offset, 0, reg->debounce);
+ }
+
+ raw_spin_unlock_irqrestore(&bank->slock, flags);
+
+ /* Enable or disable dbclk at last */
+ if (div_debounce_support) {
+ if (debounce)
+ clk_prepare_enable(bank->db_clk);
+ else
+ clk_disable_unprepare(bank->db_clk);
+ }
+
+ return 0;
+}
+
+static int rockchip_gpio_direction_input(struct gpio_chip *gc,
+ unsigned int offset)
+{
+ return rockchip_gpio_set_direction(gc, offset, true);
+}
+
+static int rockchip_gpio_direction_output(struct gpio_chip *gc,
+ unsigned int offset, int value)
+{
+ rockchip_gpio_set(gc, offset, value);
+
+ return rockchip_gpio_set_direction(gc, offset, false);
+}
+
+/*
+ * gpiolib set_config callback function. The setting of the pin
+ * mux function as 'gpio output' will be handled by the pinctrl subsystem
+ * interface.
+ */
+static int rockchip_gpio_set_config(struct gpio_chip *gc, unsigned int offset,
+ unsigned long config)
+{
+ enum pin_config_param param = pinconf_to_config_param(config);
+
+ switch (param) {
+ case PIN_CONFIG_INPUT_DEBOUNCE:
+ rockchip_gpio_set_debounce(gc, offset, true);
+ /*
+ * Rockchip's gpio could only support up to one period
+ * of the debounce clock(pclk), which is far away from
+ * satisftying the requirement, as pclk is usually near
+ * 100MHz shared by all peripherals. So the fact is it
+ * has crippled debounce capability could only be useful
+ * to prevent any spurious glitches from waking up the system
+ * if the gpio is conguired as wakeup interrupt source. Let's
+ * still return -ENOTSUPP as before, to make sure the caller
+ * of gpiod_set_debounce won't change its behaviour.
+ */
+ return -ENOTSUPP;
+ default:
+ return -ENOTSUPP;
+ }
+}
+
+/*
+ * gpiolib gpio_to_irq callback function. Creates a mapping between a GPIO pin
+ * and a virtual IRQ, if not already present.
+ */
+static int rockchip_gpio_to_irq(struct gpio_chip *gc, unsigned int offset)
+{
+ struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
+ unsigned int virq;
+
+ if (!bank->domain)
+ return -ENXIO;
+
+ virq = irq_create_mapping(bank->domain, offset);
+
+ return (virq) ? : -ENXIO;
+}
+
+static const struct gpio_chip rockchip_gpiolib_chip = {
+ .request = gpiochip_generic_request,
+ .free = gpiochip_generic_free,
+ .set = rockchip_gpio_set,
+ .get = rockchip_gpio_get,
+ .get_direction = rockchip_gpio_get_direction,
+ .direction_input = rockchip_gpio_direction_input,
+ .direction_output = rockchip_gpio_direction_output,
+ .set_config = rockchip_gpio_set_config,
+ .to_irq = rockchip_gpio_to_irq,
+ .owner = THIS_MODULE,
+};
+
+static void rockchip_irq_demux(struct irq_desc *desc)
+{
+ struct irq_chip *chip = irq_desc_get_chip(desc);
+ struct rockchip_pin_bank *bank = irq_desc_get_handler_data(desc);
+ u32 pend;
+
+ dev_dbg(bank->dev, "got irq for bank %s\n", bank->name);
+
+ chained_irq_enter(chip, desc);
+
+ pend = readl_relaxed(bank->reg_base + bank->gpio_regs->int_status);
+
+ while (pend) {
+ unsigned int irq, virq;
+
+ irq = __ffs(pend);
+ pend &= ~BIT(irq);
+ virq = irq_find_mapping(bank->domain, irq);
+
+ if (!virq) {
+ dev_err(bank->dev, "unmapped irq %d\n", irq);
+ continue;
+ }
+
+ dev_dbg(bank->dev, "handling irq %d\n", irq);
+
+ /*
+ * Triggering IRQ on both rising and falling edge
+ * needs manual intervention.
+ */
+ if (bank->toggle_edge_mode & BIT(irq)) {
+ u32 data, data_old, polarity;
+ unsigned long flags;
+
+ data = readl_relaxed(bank->reg_base +
+ bank->gpio_regs->ext_port);
+ do {
+ raw_spin_lock_irqsave(&bank->slock, flags);
+
+ polarity = readl_relaxed(bank->reg_base +
+ bank->gpio_regs->int_polarity);
+ if (data & BIT(irq))
+ polarity &= ~BIT(irq);
+ else
+ polarity |= BIT(irq);
+ writel(polarity,
+ bank->reg_base +
+ bank->gpio_regs->int_polarity);
+
+ raw_spin_unlock_irqrestore(&bank->slock, flags);
+
+ data_old = data;
+ data = readl_relaxed(bank->reg_base +
+ bank->gpio_regs->ext_port);
+ } while ((data & BIT(irq)) != (data_old & BIT(irq)));
+ }
+
+ generic_handle_irq(virq);
+ }
+
+ chained_irq_exit(chip, desc);
+}
+
+static int rockchip_irq_set_type(struct irq_data *d, unsigned int type)
+{
+ struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
+ struct rockchip_pin_bank *bank = gc->private;
+ u32 mask = BIT(d->hwirq);
+ u32 polarity;
+ u32 level;
+ u32 data;
+ unsigned long flags;
+ int ret = 0;
+
+ raw_spin_lock_irqsave(&bank->slock, flags);
+
+ rockchip_gpio_writel_bit(bank, d->hwirq, 0,
+ bank->gpio_regs->port_ddr);
+
+ raw_spin_unlock_irqrestore(&bank->slock, flags);
+
+ if (type & IRQ_TYPE_EDGE_BOTH)
+ irq_set_handler_locked(d, handle_edge_irq);
+ else
+ irq_set_handler_locked(d, handle_level_irq);
+
+ raw_spin_lock_irqsave(&bank->slock, flags);
+
+ level = rockchip_gpio_readl(bank, bank->gpio_regs->int_type);
+ polarity = rockchip_gpio_readl(bank, bank->gpio_regs->int_polarity);
+
+ switch (type) {
+ case IRQ_TYPE_EDGE_BOTH:
+ if (bank->gpio_type == GPIO_TYPE_V2) {
+ bank->toggle_edge_mode &= ~mask;
+ rockchip_gpio_writel_bit(bank, d->hwirq, 1,
+ bank->gpio_regs->int_bothedge);
+ goto out;
+ } else {
+ bank->toggle_edge_mode |= mask;
+ level |= mask;
+
+ /*
+ * Determine gpio state. If 1 next interrupt should be
+ * falling otherwise rising.
+ */
+ data = readl(bank->reg_base + bank->gpio_regs->ext_port);
+ if (data & mask)
+ polarity &= ~mask;
+ else
+ polarity |= mask;
+ }
+ break;
+ case IRQ_TYPE_EDGE_RISING:
+ bank->toggle_edge_mode &= ~mask;
+ level |= mask;
+ polarity |= mask;
+ break;
+ case IRQ_TYPE_EDGE_FALLING:
+ bank->toggle_edge_mode &= ~mask;
+ level |= mask;
+ polarity &= ~mask;
+ break;
+ case IRQ_TYPE_LEVEL_HIGH:
+ bank->toggle_edge_mode &= ~mask;
+ level &= ~mask;
+ polarity |= mask;
+ break;
+ case IRQ_TYPE_LEVEL_LOW:
+ bank->toggle_edge_mode &= ~mask;
+ level &= ~mask;
+ polarity &= ~mask;
+ break;
+ default:
+ ret = -EINVAL;
+ goto out;
+ }
+
+ rockchip_gpio_writel(bank, level, bank->gpio_regs->int_type);
+ rockchip_gpio_writel(bank, polarity, bank->gpio_regs->int_polarity);
+out:
+ raw_spin_unlock_irqrestore(&bank->slock, flags);
+
+ return ret;
+}
+
+static void rockchip_irq_suspend(struct irq_data *d)
+{
+ struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
+ struct rockchip_pin_bank *bank = gc->private;
+
+ bank->saved_masks = irq_reg_readl(gc, bank->gpio_regs->int_mask);
+ irq_reg_writel(gc, ~gc->wake_active, bank->gpio_regs->int_mask);
+}
+
+static void rockchip_irq_resume(struct irq_data *d)
+{
+ struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
+ struct rockchip_pin_bank *bank = gc->private;
+
+ irq_reg_writel(gc, bank->saved_masks, bank->gpio_regs->int_mask);
+}
+
+static void rockchip_irq_enable(struct irq_data *d)
+{
+ irq_gc_mask_clr_bit(d);
+}
+
+static void rockchip_irq_disable(struct irq_data *d)
+{
+ irq_gc_mask_set_bit(d);
+}
+
+static int rockchip_interrupts_register(struct rockchip_pin_bank *bank)
+{
+ unsigned int clr = IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_NOAUTOEN;
+ struct irq_chip_generic *gc;
+ int ret;
+
+ bank->domain = irq_domain_add_linear(bank->of_node, 32,
+ &irq_generic_chip_ops, NULL);
+ if (!bank->domain) {
+ dev_warn(bank->dev, "could not init irq domain for bank %s\n",
+ bank->name);
+ return -EINVAL;
+ }
+
+ ret = irq_alloc_domain_generic_chips(bank->domain, 32, 1,
+ "rockchip_gpio_irq",
+ handle_level_irq,
+ clr, 0, 0);
+ if (ret) {
+ dev_err(bank->dev, "could not alloc generic chips for bank %s\n",
+ bank->name);
+ irq_domain_remove(bank->domain);
+ return -EINVAL;
+ }
+
+ gc = irq_get_domain_generic_chip(bank->domain, 0);
+ if (bank->gpio_type == GPIO_TYPE_V2) {
+ gc->reg_writel = gpio_writel_v2;
+ gc->reg_readl = gpio_readl_v2;
+ }
+
+ gc->reg_base = bank->reg_base;
+ gc->private = bank;
+ gc->chip_types[0].regs.mask = bank->gpio_regs->int_mask;
+ gc->chip_types[0].regs.ack = bank->gpio_regs->port_eoi;
+ gc->chip_types[0].chip.irq_ack = irq_gc_ack_set_bit;
+ gc->chip_types[0].chip.irq_mask = irq_gc_mask_set_bit;
+ gc->chip_types[0].chip.irq_unmask = irq_gc_mask_clr_bit;
+ gc->chip_types[0].chip.irq_enable = rockchip_irq_enable;
+ gc->chip_types[0].chip.irq_disable = rockchip_irq_disable;
+ gc->chip_types[0].chip.irq_set_wake = irq_gc_set_wake;
+ gc->chip_types[0].chip.irq_suspend = rockchip_irq_suspend;
+ gc->chip_types[0].chip.irq_resume = rockchip_irq_resume;
+ gc->chip_types[0].chip.irq_set_type = rockchip_irq_set_type;
+ gc->wake_enabled = IRQ_MSK(bank->nr_pins);
+
+ /*
+ * Linux assumes that all interrupts start out disabled/masked.
+ * Our driver only uses the concept of masked and always keeps
+ * things enabled, so for us that's all masked and all enabled.
+ */
+ rockchip_gpio_writel(bank, 0xffffffff, bank->gpio_regs->int_mask);
+ rockchip_gpio_writel(bank, 0xffffffff, bank->gpio_regs->port_eoi);
+ rockchip_gpio_writel(bank, 0xffffffff, bank->gpio_regs->int_en);
+ gc->mask_cache = 0xffffffff;
+
+ irq_set_chained_handler_and_data(bank->irq,
+ rockchip_irq_demux, bank);
+
+ return 0;
+}
+
+static int rockchip_gpiolib_register(struct rockchip_pin_bank *bank)
+{
+ struct gpio_chip *gc;
+ int ret;
+
+ bank->gpio_chip = rockchip_gpiolib_chip;
+
+ gc = &bank->gpio_chip;
+ gc->base = bank->pin_base;
+ gc->ngpio = bank->nr_pins;
+ gc->label = bank->name;
+ gc->parent = bank->dev;
+#ifdef CONFIG_OF_GPIO
+ gc->of_node = of_node_get(bank->of_node);
+#endif
+
+ ret = gpiochip_add_data(gc, bank);
+ if (ret) {
+ dev_err(bank->dev, "failed to add gpiochip %s, %d\n",
+ gc->label, ret);
+ return ret;
+ }
+
+ /*
+ * For DeviceTree-supported systems, the gpio core checks the
+ * pinctrl's device node for the "gpio-ranges" property.
+ * If it is present, it takes care of adding the pin ranges
+ * for the driver. In this case the driver can skip ahead.
+ *
+ * In order to remain compatible with older, existing DeviceTree
+ * files which don't set the "gpio-ranges" property or systems that
+ * utilize ACPI the driver has to call gpiochip_add_pin_range().
+ */
+ if (!of_property_read_bool(bank->of_node, "gpio-ranges")) {
+ struct device_node *pctlnp = of_get_parent(bank->of_node);
+ struct pinctrl_dev *pctldev = NULL;
+
+ if (!pctlnp)
+ return -ENODATA;
+
+ pctldev = of_pinctrl_get(pctlnp);
+ if (!pctldev)
+ return -ENODEV;
+
+ ret = gpiochip_add_pin_range(gc, dev_name(pctldev->dev), 0,
+ gc->base, gc->ngpio);
+ if (ret) {
+ dev_err(bank->dev, "Failed to add pin range\n");
+ goto fail;
+ }
+ }
+
+ ret = rockchip_interrupts_register(bank);
+ if (ret) {
+ dev_err(bank->dev, "failed to register interrupt, %d\n", ret);
+ goto fail;
+ }
+
+ return 0;
+
+fail:
+ gpiochip_remove(&bank->gpio_chip);
+
+ return ret;
+}
+
+static int rockchip_get_bank_data(struct rockchip_pin_bank *bank)
+{
+ struct resource res;
+ int id = 0;
+
+ if (of_address_to_resource(bank->of_node, 0, &res)) {
+ dev_err(bank->dev, "cannot find IO resource for bank\n");
+ return -ENOENT;
+ }
+
+ bank->reg_base = devm_ioremap_resource(bank->dev, &res);
+ if (IS_ERR(bank->reg_base))
+ return PTR_ERR(bank->reg_base);
+
+ bank->irq = irq_of_parse_and_map(bank->of_node, 0);
+ if (!bank->irq)
+ return -EINVAL;
+
+ bank->clk = of_clk_get(bank->of_node, 0);
+ if (IS_ERR(bank->clk))
+ return PTR_ERR(bank->clk);
+
+ clk_prepare_enable(bank->clk);
+ id = readl(bank->reg_base + gpio_regs_v2.version_id);
+
+ /* If not gpio v2, that is default to v1. */
+ if (id == GPIO_TYPE_V2) {
+ bank->gpio_regs = &gpio_regs_v2;
+ bank->gpio_type = GPIO_TYPE_V2;
+ bank->db_clk = of_clk_get(bank->of_node, 1);
+ if (IS_ERR(bank->db_clk)) {
+ dev_err(bank->dev, "cannot find debounce clk\n");
+ clk_disable_unprepare(bank->clk);
+ return -EINVAL;
+ }
+ } else {
+ bank->gpio_regs = &gpio_regs_v1;
+ bank->gpio_type = GPIO_TYPE_V1;
+ }
+
+ return 0;
+}
+
+static struct rockchip_pin_bank *
+rockchip_gpio_find_bank(struct pinctrl_dev *pctldev, int id)
+{
+ struct rockchip_pinctrl *info;
+ struct rockchip_pin_bank *bank;
+ int i, found = 0;
+
+ info = pinctrl_dev_get_drvdata(pctldev);
+ bank = info->ctrl->pin_banks;
+ for (i = 0; i < info->ctrl->nr_banks; i++, bank++) {
+ if (bank->bank_num == id) {
+ found = 1;
+ break;
+ }
+ }
+
+ return found ? bank : NULL;
+}
+
+static int rockchip_gpio_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct device_node *np = dev->of_node;
+ struct device_node *pctlnp = of_get_parent(np);
+ struct pinctrl_dev *pctldev = NULL;
+ struct rockchip_pin_bank *bank = NULL;
+ static int gpio;
+ int id, ret;
+
+ if (!np || !pctlnp)
+ return -ENODEV;
+
+ pctldev = of_pinctrl_get(pctlnp);
+ if (!pctldev)
+ return -EPROBE_DEFER;
+
+ id = of_alias_get_id(np, "gpio");
+ if (id < 0)
+ id = gpio++;
+
+ bank = rockchip_gpio_find_bank(pctldev, id);
+ if (!bank)
+ return -EINVAL;
+
+ bank->dev = dev;
+ bank->of_node = np;
+
+ raw_spin_lock_init(&bank->slock);
+
+ ret = rockchip_get_bank_data(bank);
+ if (ret)
+ return ret;
+
+ ret = rockchip_gpiolib_register(bank);
+ if (ret) {
+ clk_disable_unprepare(bank->clk);
+ return ret;
+ }
+
+ platform_set_drvdata(pdev, bank);
+ dev_info(dev, "probed %pOF\n", np);
+
+ return 0;
+}
+
+static int rockchip_gpio_remove(struct platform_device *pdev)
+{
+ struct rockchip_pin_bank *bank = platform_get_drvdata(pdev);
+
+ clk_disable_unprepare(bank->clk);
+ gpiochip_remove(&bank->gpio_chip);
+
+ return 0;
+}
+
+static const struct of_device_id rockchip_gpio_match[] = {
+ { .compatible = "rockchip,gpio-bank", },
+ { .compatible = "rockchip,rk3188-gpio-bank0" },
+ { },
+};
+
+static struct platform_driver rockchip_gpio_driver = {
+ .probe = rockchip_gpio_probe,
+ .remove = rockchip_gpio_remove,
+ .driver = {
+ .name = "rockchip-gpio",
+ .of_match_table = rockchip_gpio_match,
+ },
+};
+
+static int __init rockchip_gpio_init(void)
+{
+ return platform_driver_register(&rockchip_gpio_driver);
+}
+postcore_initcall(rockchip_gpio_init);
+
+static void __exit rockchip_gpio_exit(void)
+{
+ platform_driver_unregister(&rockchip_gpio_driver);
+}
+module_exit(rockchip_gpio_exit);
+
+MODULE_DESCRIPTION("Rockchip gpio driver");
+MODULE_ALIAS("platform:rockchip-gpio");
+MODULE_LICENSE("GPL v2");
+MODULE_DEVICE_TABLE(of, rockchip_gpio_match);
pending = (resume_status << sch->resume_base) | core_status;
for_each_set_bit(offset, &pending, sch->chip.ngpio)
- generic_handle_irq(irq_find_mapping(gc->irq.domain, offset));
+ generic_handle_domain_irq(gc->irq.domain, offset);
/* Set returning value depending on whether we handled an interrupt */
ret = pending ? ACPI_INTERRUPT_HANDLED : ACPI_INTERRUPT_NOT_HANDLED;
return IRQ_NONE;
for_each_set_bit(irq_bit, &irq_stat, 32)
- generic_handle_irq(irq_find_mapping(sd->id, irq_bit));
+ generic_handle_domain_irq(sd->id, irq_bit);
return IRQ_HANDLED;
}
struct gpio_chip *chip = irq_desc_get_handler_data(desc);
struct irq_chip *ic = irq_desc_get_chip(desc);
struct sprd_gpio *sprd_gpio = gpiochip_get_data(chip);
- u32 bank, n, girq;
+ u32 bank, n;
chained_irq_enter(ic, desc);
unsigned long reg = readl_relaxed(base + SPRD_GPIO_MIS) &
SPRD_GPIO_BANK_MASK;
- for_each_set_bit(n, ®, SPRD_GPIO_BANK_NR) {
- girq = irq_find_mapping(chip->irq.domain,
- bank * SPRD_GPIO_BANK_NR + n);
-
- generic_handle_irq(girq);
- }
-
+ for_each_set_bit(n, ®, SPRD_GPIO_BANK_NR)
+ generic_handle_domain_irq(chip->irq.domain,
+ bank * SPRD_GPIO_BANK_NR + n);
}
chained_irq_exit(ic, desc);
}
int i;
for_each_set_bit(i, &bits, 32)
- generic_handle_irq(irq_find_mapping(tb10x_gpio->domain, i));
+ generic_handle_domain_irq(tb10x_gpio->domain, i);
return IRQ_HANDLED;
}
lvl = tegra_gpio_readl(tgi, GPIO_INT_LVL(tgi, gpio));
for_each_set_bit(pin, &sta, 8) {
+ int ret;
+
tegra_gpio_writel(tgi, 1 << pin,
GPIO_INT_CLR(tgi, gpio));
chained_irq_exit(chip, desc);
}
- irq = irq_find_mapping(domain, gpio + pin);
- if (WARN_ON(irq == 0))
- continue;
-
- generic_handle_irq(irq);
+ ret = generic_handle_domain_irq(domain, gpio + pin);
+ WARN_RATELIMIT(ret, "hwirq = %d", gpio + pin);
}
}
for (i = 0; i < gpio->soc->num_ports; i++) {
const struct tegra_gpio_port *port = &gpio->soc->ports[i];
- unsigned int pin, irq;
+ unsigned int pin;
unsigned long value;
void __iomem *base;
value = readl(base + TEGRA186_GPIO_INTERRUPT_STATUS(1));
for_each_set_bit(pin, &value, port->pins) {
- irq = irq_find_mapping(domain, offset + pin);
- if (WARN_ON(irq == 0))
- continue;
-
- generic_handle_irq(irq);
+ int ret = generic_handle_domain_irq(domain, offset + pin);
+ WARN_RATELIMIT(ret, "hwirq = %d", offset + pin);
}
skip:
struct tqmx86_gpio_data *gpio = gpiochip_get_data(chip);
struct irq_chip *irq_chip = irq_desc_get_chip(desc);
unsigned long irq_bits;
- int i = 0, child_irq;
+ int i = 0;
u8 irq_status;
chained_irq_enter(irq_chip, desc);
tqmx86_gpio_write(gpio, irq_status, TQMX86_GPIIS);
irq_bits = irq_status;
- for_each_set_bit(i, &irq_bits, TQMX86_NGPI) {
- child_irq = irq_find_mapping(gpio->chip.irq.domain,
- i + TQMX86_NGPO);
- generic_handle_irq(child_irq);
- }
+ for_each_set_bit(i, &irq_bits, TQMX86_NGPI)
+ generic_handle_domain_irq(gpio->chip.irq.domain,
+ i + TQMX86_NGPO);
chained_irq_exit(irq_chip, desc);
}
for_each_set_bit(pin, &irq_isfr, VF610_GPIO_PER_PORT) {
vf610_gpio_writel(BIT(pin), port->base + PORT_ISFR);
- generic_handle_irq(irq_find_mapping(port->gc.irq.domain, pin));
+ generic_handle_domain_irq(port->gc.irq.domain, pin);
}
chained_irq_exit(chip, desc);
for_each_set_bit(port, &int_pending, 3) {
int_id = inb(ws16c48gpio->base + 8 + port);
for_each_set_bit(gpio, &int_id, 8)
- generic_handle_irq(irq_find_mapping(
- chip->irq.domain, gpio + 8*port));
+ generic_handle_domain_irq(chip->irq.domain,
+ gpio + 8*port);
}
int_pending = inb(ws16c48gpio->base + 6) & 0x7;
int_bits = level | event;
for_each_set_bit(bit, &int_bits, gc->ngpio)
- generic_handle_irq(irq_linear_revmap(gc->irq.domain, bit));
+ generic_handle_domain_irq(gc->irq.domain, bit);
}
return int_bits ? IRQ_HANDLED : IRQ_NONE;
for_each_set_bit(bit, all, 64) {
irq_offset = xgpio_from_bit(chip, bit);
- generic_handle_irq(irq_find_mapping(gc->irq.domain, irq_offset));
+ generic_handle_domain_irq(gc->irq.domain, irq_offset);
}
chained_irq_exit(irqchip, desc);
}
if (gpio_stat & BIT(gpio % XLP_GPIO_REGSZ))
- generic_handle_irq(irq_find_mapping(
- priv->chip.irq.domain, gpio));
+ generic_handle_domain_irq(priv->chip.irq.domain, gpio);
}
chained_irq_exit(irqchip, desc);
}
if (!pending)
return;
- for_each_set_bit(offset, &pending, 32) {
- unsigned int gpio_irq;
-
- gpio_irq = irq_find_mapping(irqdomain, offset + bank_offset);
- generic_handle_irq(gpio_irq);
- }
+ for_each_set_bit(offset, &pending, 32)
+ generic_handle_domain_irq(irqdomain, offset + bank_offset);
}
/**
*/
bool amdgpu_acpi_is_s0ix_supported(struct amdgpu_device *adev)
{
-#if IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_PM_SLEEP)
+#if IS_ENABLED(CONFIG_AMD_PMC) && IS_ENABLED(CONFIG_SUSPEND)
if (acpi_gbl_FADT.flags & ACPI_FADT_LOW_POWER_S0) {
if (adev->flags & AMD_IS_APU)
return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
struct amdgpu_device *adev =
container_of(work, struct amdgpu_device, gfx.gfx_off_delay_work.work);
- mutex_lock(&adev->gfx.gfx_off_mutex);
- if (!adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
- if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
- adev->gfx.gfx_off_state = true;
- }
- mutex_unlock(&adev->gfx.gfx_off_mutex);
+ WARN_ON_ONCE(adev->gfx.gfx_off_state);
+ WARN_ON_ONCE(adev->gfx.gfx_off_req_count);
+
+ if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, true))
+ adev->gfx.gfx_off_state = true;
}
/**
mutex_lock(&adev->gfx.gfx_off_mutex);
- if (!enable)
- adev->gfx.gfx_off_req_count++;
- else if (adev->gfx.gfx_off_req_count > 0)
+ if (enable) {
+ /* If the count is already 0, it means there's an imbalance bug somewhere.
+ * Note that the bug may be in a different caller than the one which triggers the
+ * WARN_ON_ONCE.
+ */
+ if (WARN_ON_ONCE(adev->gfx.gfx_off_req_count == 0))
+ goto unlock;
+
adev->gfx.gfx_off_req_count--;
- if (enable && !adev->gfx.gfx_off_state && !adev->gfx.gfx_off_req_count) {
- schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
- } else if (!enable && adev->gfx.gfx_off_state) {
- if (!amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
- adev->gfx.gfx_off_state = false;
+ if (adev->gfx.gfx_off_req_count == 0 && !adev->gfx.gfx_off_state)
+ schedule_delayed_work(&adev->gfx.gfx_off_delay_work, GFX_OFF_DELAY_ENABLE);
+ } else {
+ if (adev->gfx.gfx_off_req_count == 0) {
+ cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
+
+ if (adev->gfx.gfx_off_state &&
+ !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) {
+ adev->gfx.gfx_off_state = false;
- if (adev->gfx.funcs->init_spm_golden) {
- dev_dbg(adev->dev, "GFXOFF is disabled, re-init SPM golden settings\n");
- amdgpu_gfx_init_spm_golden(adev);
+ if (adev->gfx.funcs->init_spm_golden) {
+ dev_dbg(adev->dev,
+ "GFXOFF is disabled, re-init SPM golden settings\n");
+ amdgpu_gfx_init_spm_golden(adev);
+ }
}
}
+
+ adev->gfx.gfx_off_req_count++;
}
+unlock:
mutex_unlock(&adev->gfx.gfx_off_mutex);
}
} else if ((client_id == AMDGPU_IRQ_CLIENTID_LEGACY) &&
adev->irq.virq[src_id]) {
- generic_handle_irq(irq_find_mapping(adev->irq.domain, src_id));
+ generic_handle_domain_irq(adev->irq.domain, src_id);
} else if (!adev->irq.client[client_id].sources) {
DRM_DEBUG("Unregistered interrupt client_id: %d src_id: %d\n",
return -EINVAL;
}
- /* This assumes only APU display buffers are pinned with (VRAM|GTT).
- * See function amdgpu_display_supported_domains()
- */
- domain = amdgpu_bo_get_preferred_pin_domain(adev, domain);
-
if (bo->tbo.pin_count) {
uint32_t mem_type = bo->tbo.resource->mem_type;
uint32_t mem_flags = bo->tbo.resource->placement;
return 0;
}
+ /* This assumes only APU display buffers are pinned with (VRAM|GTT).
+ * See function amdgpu_display_supported_domains()
+ */
+ domain = amdgpu_bo_get_preferred_pin_domain(adev, domain);
+
if (bo->tbo.base.import_attach)
dma_buf_pin(bo->tbo.base.import_attach);
static void intel_dp_check_link_service_irq(struct intel_dp *intel_dp)
{
- struct drm_i915_private *i915 = dp_to_i915(intel_dp);
u8 val;
if (intel_dp->dpcd[DP_DPCD_REV] < 0x11)
return;
if (drm_dp_dpcd_readb(&intel_dp->aux,
- DP_LINK_SERVICE_IRQ_VECTOR_ESI0, &val) != 1 || !val) {
- drm_dbg_kms(&i915->drm, "Error in reading link service irq vector\n");
+ DP_LINK_SERVICE_IRQ_VECTOR_ESI0, &val) != 1 || !val)
return;
- }
if (drm_dp_dpcd_writeb(&intel_dp->aux,
- DP_LINK_SERVICE_IRQ_VECTOR_ESI0, val) != 1) {
- drm_dbg_kms(&i915->drm, "Error in writing link service irq vector\n");
+ DP_LINK_SERVICE_IRQ_VECTOR_ESI0, val) != 1)
return;
- }
if (val & HDMI_LINK_STATUS_CHANGED)
intel_dp_handle_hdmi_link_status_change(intel_dp);
i915_vma_put(timeline->hwsp_ggtt);
i915_active_fini(&timeline->active);
+
+ /*
+ * A small race exists between intel_gt_retire_requests_timeout and
+ * intel_timeline_exit which could result in the syncmap not getting
+ * free'd. Rather than work to hard to seal this race, simply cleanup
+ * the syncmap on fini.
+ */
+ i915_syncmap_free(&timeline->sync);
+
kfree(timeline);
}
break;
}
- ipu_dmfc_config_wait4eot(ipu_plane->dmfc, drm_rect_width(dst));
+ ipu_dmfc_config_wait4eot(ipu_plane->dmfc, ALIGN(drm_rect_width(dst), 8));
width = ipu_src_rect_width(new_state);
height = drm_rect_height(&new_state->src) >> 16;
while (interrupts) {
irq_hw_number_t hwirq = fls(interrupts) - 1;
- unsigned int mapping;
int rc;
- mapping = irq_find_mapping(dpu_mdss->irq_controller.domain,
- hwirq);
- if (mapping == 0) {
- DRM_ERROR("couldn't find irq mapping for %lu\n", hwirq);
- break;
- }
-
- rc = generic_handle_irq(mapping);
+ rc = generic_handle_domain_irq(dpu_mdss->irq_controller.domain,
+ hwirq);
if (rc < 0) {
- DRM_ERROR("handle irq fail: irq=%lu mapping=%u rc=%d\n",
- hwirq, mapping, rc);
+ DRM_ERROR("handle irq fail: irq=%lu rc=%d\n",
+ hwirq, rc);
break;
}
while (intr) {
irq_hw_number_t hwirq = fls(intr) - 1;
- generic_handle_irq(irq_find_mapping(
- mdp5_mdss->irqcontroller.domain, hwirq));
+ generic_handle_domain_irq(mdp5_mdss->irqcontroller.domain, hwirq);
intr &= ~(1 << hwirq);
}
static void ipu_irq_handle(struct ipu_soc *ipu, const int *regs, int num_regs)
{
unsigned long status;
- int i, bit, irq;
+ int i, bit;
for (i = 0; i < num_regs; i++) {
status = ipu_cm_read(ipu, IPU_INT_STAT(regs[i]));
status &= ipu_cm_read(ipu, IPU_INT_CTRL(regs[i]));
- for_each_set_bit(bit, &status, 32) {
- irq = irq_linear_revmap(ipu->domain,
- regs[i] * 32 + bit);
- if (irq)
- generic_handle_irq(irq);
- }
+ for_each_set_bit(bit, &status, 32)
+ generic_handle_domain_irq(ipu->domain,
+ regs[i] * 32 + bit);
}
}
.bits_per_pixel = 16,
};
-#define Y_OFFSET(pix, x, y) ((x) + pix->width * (y))
-#define U_OFFSET(pix, x, y) ((pix->width * pix->height) + \
- (pix->width * ((y) / 2) / 2) + (x) / 2)
-#define V_OFFSET(pix, x, y) ((pix->width * pix->height) + \
- (pix->width * pix->height / 4) + \
- (pix->width * ((y) / 2) / 2) + (x) / 2)
-#define U2_OFFSET(pix, x, y) ((pix->width * pix->height) + \
- (pix->width * (y) / 2) + (x) / 2)
-#define V2_OFFSET(pix, x, y) ((pix->width * pix->height) + \
- (pix->width * pix->height / 2) + \
- (pix->width * (y) / 2) + (x) / 2)
-#define UV_OFFSET(pix, x, y) ((pix->width * pix->height) + \
- (pix->width * ((y) / 2)) + (x))
-#define UV2_OFFSET(pix, x, y) ((pix->width * pix->height) + \
- (pix->width * y) + (x))
+#define Y_OFFSET(pix, x, y) ((x) + pix->bytesperline * (y))
+#define U_OFFSET(pix, x, y) ((pix->bytesperline * pix->height) + \
+ (pix->bytesperline * ((y) / 2) / 2) + (x) / 2)
+#define V_OFFSET(pix, x, y) ((pix->bytesperline * pix->height) + \
+ (pix->bytesperline * pix->height / 4) + \
+ (pix->bytesperline * ((y) / 2) / 2) + (x) / 2)
+#define U2_OFFSET(pix, x, y) ((pix->bytesperline * pix->height) + \
+ (pix->bytesperline * (y) / 2) + (x) / 2)
+#define V2_OFFSET(pix, x, y) ((pix->bytesperline * pix->height) + \
+ (pix->bytesperline * pix->height / 2) + \
+ (pix->bytesperline * (y) / 2) + (x) / 2)
+#define UV_OFFSET(pix, x, y) ((pix->bytesperline * pix->height) + \
+ (pix->bytesperline * ((y) / 2)) + (x))
+#define UV2_OFFSET(pix, x, y) ((pix->bytesperline * pix->height) + \
+ (pix->bytesperline * y) + (x))
#define NUM_ALPHA_CHANNELS 7
#include <linux/completion.h>
#include <linux/regmap.h>
#include <linux/iio/iio.h>
+#include <linux/iio/driver.h>
+#include <linux/iio/machine.h>
#include <linux/slab.h>
#define RN5T618_ADC_CONVERSION_TIMEOUT (msecs_to_jiffies(500))
RN5T618_ADC_CHANNEL(AIN0, IIO_VOLTAGE, "AIN0")
};
+static struct iio_map rn5t618_maps[] = {
+ IIO_MAP("VADP", "rn5t618-power", "vadp"),
+ IIO_MAP("VUSB", "rn5t618-power", "vusb"),
+ { /* sentinel */ }
+};
+
+static void unregister_map(void *data)
+{
+ struct iio_dev *iio_dev = (struct iio_dev *) data;
+
+ iio_map_array_unregister(iio_dev);
+}
+
static int rn5t618_adc_probe(struct platform_device *pdev)
{
int ret;
return ret;
}
+ ret = iio_map_array_register(iio_dev, rn5t618_maps);
+ if (ret < 0)
+ return ret;
+
+ ret = devm_add_action_or_reset(adc->dev, unregister_map, iio_dev);
+ if (ret < 0)
+ return ret;
+
return devm_iio_device_register(adc->dev, iio_dev);
}
mr->uobject = uobj;
atomic_inc(&pd->usecnt);
+ rdma_restrack_new(&mr->res, RDMA_RESTRACK_MR);
+ rdma_restrack_set_name(&mr->res, NULL);
+ rdma_restrack_add(&mr->res);
uobj->object = mr;
uverbs_finalize_uobj_create(attrs, UVERBS_ATTR_REG_DMABUF_MR_HANDLE);
if (nq)
nq->budget++;
atomic_inc(&rdev->srq_count);
+ spin_lock_init(&srq->lock);
return 0;
memset(&rattr, 0, sizeof(rattr));
rc = bnxt_re_register_netdev(rdev);
if (rc) {
- rtnl_unlock();
ibdev_err(&rdev->ibdev,
"Failed to register with netedev: %#x\n", rc);
return -EINVAL;
}
if (irq_num != msix_vecs) {
+ efa_disable_msix(dev);
dev_err(&dev->pdev->dev,
"Allocated %d MSI-X (out of %d requested)\n",
irq_num, msix_vecs);
static int _extend_sdma_tx_descs(struct hfi1_devdata *dd, struct sdma_txreq *tx)
{
int i;
+ struct sdma_desc *descp;
/* Handle last descriptor */
if (unlikely((tx->num_desc == (MAX_DESC - 1)))) {
if (unlikely(tx->num_desc == MAX_DESC))
goto enomem;
- tx->descp = kmalloc_array(
- MAX_DESC,
- sizeof(struct sdma_desc),
- GFP_ATOMIC);
- if (!tx->descp)
+ descp = kmalloc_array(MAX_DESC, sizeof(struct sdma_desc), GFP_ATOMIC);
+ if (!descp)
goto enomem;
+ tx->descp = descp;
/* reserve last descriptor for coalescing */
tx->desc_limit = MAX_DESC - 1;
depends on PCI
depends on ICE && I40E
select GENERIC_ALLOCATOR
- select CONFIG_AUXILIARY_BUS
+ select AUXILIARY_BUS
help
This is an Intel(R) Ethernet Protocol Driver for RDMA driver
that support E810 (iWARP/RoCE) and X722 (iWARP) network devices.
mutex_lock(&mlx5_ib_multiport_mutex);
if (mpi->ibdev)
mlx5_ib_unbind_slave_port(mpi->ibdev, mpi);
- list_del(&mpi->list);
+ else
+ list_del(&mpi->list);
mutex_unlock(&mlx5_ib_multiport_mutex);
kfree(mpi);
}
goto out;
}
- elem = rxe_alloc(&rxe->mc_elem_pool);
+ elem = rxe_alloc_locked(&rxe->mc_elem_pool);
if (!elem) {
err = -ENOMEM;
goto out;
if (*num_elem < 0)
goto err1;
- q = kmalloc(sizeof(*q), GFP_KERNEL);
+ q = kzalloc(sizeof(*q), GFP_KERNEL);
if (!q)
goto err1;
struct zpci_dev *zdev = to_zpci_dev(dev);
struct s390_domain_device *domain_device;
unsigned long flags;
- int rc;
+ int cc, rc;
if (!zdev)
return -ENODEV;
if (!domain_device)
return -ENOMEM;
- if (zdev->dma_table)
- zpci_dma_exit_device(zdev);
+ if (zdev->dma_table) {
+ cc = zpci_dma_exit_device(zdev);
+ if (cc) {
+ rc = -EIO;
+ goto out_free;
+ }
+ }
zdev->dma_table = s390_domain->dma_table;
- rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+ cc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
(u64) zdev->dma_table);
- if (rc)
+ if (cc) {
+ rc = -EIO;
goto out_restore;
+ }
spin_lock_irqsave(&s390_domain->list_lock, flags);
/* First device defines the DMA range limits */
out_restore:
zpci_dma_init_device(zdev);
+out_free:
kfree(domain_device);
return rc;
goto err_priv;
}
- priv->msi_map = kcalloc(BITS_TO_LONGS(priv->num_spis),
- sizeof(*priv->msi_map),
- GFP_KERNEL);
+ priv->msi_map = bitmap_zalloc(priv->num_spis, GFP_KERNEL);
if (!priv->msi_map) {
ret = -ENOMEM;
goto err_priv;
return 0;
err_map:
- kfree(priv->msi_map);
+ bitmap_free(priv->msi_map);
err_priv:
kfree(priv);
return ret;
* Reading the interrupt reason automatically acknowledges and masks
* the IRQ, so we just unmask it here if needed.
*/
- if (!irqd_irq_disabled(d) && !irqd_irq_masked(d))
+ if (!irqd_irq_masked(d))
aic_irq_unmask(d);
}
list_for_each_entry_safe(v2m, tmp, &v2m_nodes, entry) {
list_del(&v2m->entry);
- kfree(v2m->bm);
+ bitmap_free(v2m->bm);
iounmap(v2m->base);
of_node_put(to_of_node(v2m->fwnode));
if (is_fwnode_irqchip(v2m->fwnode))
break;
}
}
- v2m->bm = kcalloc(BITS_TO_LONGS(v2m->nr_spis), sizeof(long),
- GFP_KERNEL);
+ v2m->bm = bitmap_zalloc(v2m->nr_spis, GFP_KERNEL);
if (!v2m->bm) {
ret = -ENOMEM;
goto err_iounmap;
if (err)
goto out;
- bitmap = kcalloc(BITS_TO_LONGS(nr_irqs), sizeof (long), GFP_ATOMIC);
+ bitmap = bitmap_zalloc(nr_irqs, GFP_ATOMIC);
if (!bitmap)
goto out;
static void its_lpi_free(unsigned long *bitmap, u32 base, u32 nr_ids)
{
WARN_ON(free_lpi_range(base, nr_ids));
- kfree(bitmap);
+ bitmap_free(bitmap);
}
static void gic_reset_prop_table(void *va)
if (!dev || !itt || !col_map || (!lpi_map && alloc_lpis)) {
kfree(dev);
kfree(itt);
- kfree(lpi_map);
+ bitmap_free(lpi_map);
kfree(col_map);
return NULL;
}
if (ret)
goto err_free_mbi;
- mbi_ranges[n].bm = kcalloc(BITS_TO_LONGS(mbi_ranges[n].nr_spis),
- sizeof(long), GFP_KERNEL);
+ mbi_ranges[n].bm = bitmap_zalloc(mbi_ranges[n].nr_spis, GFP_KERNEL);
if (!mbi_ranges[n].bm) {
ret = -ENOMEM;
goto err_free_mbi;
err_free_mbi:
if (mbi_ranges) {
for (n = 0; n < mbi_range_nr; n++)
- kfree(mbi_ranges[n].bm);
+ bitmap_free(mbi_ranges[n].bm);
kfree(mbi_ranges);
}
DEFINE_STATIC_KEY_FALSE(gic_nonsecure_priorities);
EXPORT_SYMBOL(gic_nonsecure_priorities);
+/*
+ * When the Non-secure world has access to group 0 interrupts (as a
+ * consequence of SCR_EL3.FIQ == 0), reading the ICC_RPR_EL1 register will
+ * return the Distributor's view of the interrupt priority.
+ *
+ * When GIC security is enabled (GICD_CTLR.DS == 0), the interrupt priority
+ * written by software is moved to the Non-secure range by the Distributor.
+ *
+ * If both are true (which is when gic_nonsecure_priorities gets enabled),
+ * we need to shift down the priority programmed by software to match it
+ * against the value returned by ICC_RPR_EL1.
+ */
+#define GICD_INT_RPR_PRI(priority) \
+ ({ \
+ u32 __priority = (priority); \
+ if (static_branch_unlikely(&gic_nonsecure_priorities)) \
+ __priority = 0x80 | (__priority >> 1); \
+ \
+ __priority; \
+ })
+
/* ppi_nmi_refs[n] == number of cpus having ppi[n + 16] set as NMI */
static refcount_t *ppi_nmi_refs;
writeb_relaxed(prio, base + offset + index);
}
-static u32 gic_get_ppi_index(struct irq_data *d)
+static u32 __gic_get_ppi_index(irq_hw_number_t hwirq)
{
- switch (get_intid_range(d)) {
+ switch (__get_intid_range(hwirq)) {
case PPI_RANGE:
- return d->hwirq - 16;
+ return hwirq - 16;
case EPPI_RANGE:
- return d->hwirq - EPPI_BASE_INTID + 16;
+ return hwirq - EPPI_BASE_INTID + 16;
default:
unreachable();
}
}
+static u32 gic_get_ppi_index(struct irq_data *d)
+{
+ return __gic_get_ppi_index(d->hwirq);
+}
+
static int gic_irq_nmi_setup(struct irq_data *d)
{
struct irq_desc *desc = irq_to_desc(d->irq);
return;
if (gic_supports_nmi() &&
- unlikely(gic_read_rpr() == GICD_INT_NMI_PRI)) {
+ unlikely(gic_read_rpr() == GICD_INT_RPR_PRI(GICD_INT_NMI_PRI))) {
gic_handle_nmi(irqnr, regs);
return;
}
}
}
+static bool fwspec_is_partitioned_ppi(struct irq_fwspec *fwspec,
+ irq_hw_number_t hwirq)
+{
+ enum gic_intid_range range;
+
+ if (!gic_data.ppi_descs)
+ return false;
+
+ if (!is_of_node(fwspec->fwnode))
+ return false;
+
+ if (fwspec->param_count < 4 || !fwspec->param[3])
+ return false;
+
+ range = __get_intid_range(hwirq);
+ if (range != PPI_RANGE && range != EPPI_RANGE)
+ return false;
+
+ return true;
+}
+
static int gic_irq_domain_select(struct irq_domain *d,
struct irq_fwspec *fwspec,
enum irq_domain_bus_token bus_token)
{
+ unsigned int type, ret, ppi_idx;
+ irq_hw_number_t hwirq;
+
/* Not for us */
if (fwspec->fwnode != d->fwnode)
return 0;
if (!is_of_node(fwspec->fwnode))
return 1;
+ ret = gic_irq_domain_translate(d, fwspec, &hwirq, &type);
+ if (WARN_ON_ONCE(ret))
+ return 0;
+
+ if (!fwspec_is_partitioned_ppi(fwspec, hwirq))
+ return d == gic_data.domain;
+
/*
* If this is a PPI and we have a 4th (non-null) parameter,
* then we need to match the partition domain.
*/
- if (fwspec->param_count >= 4 &&
- fwspec->param[0] == 1 && fwspec->param[3] != 0 &&
- gic_data.ppi_descs)
- return d == partition_get_domain(gic_data.ppi_descs[fwspec->param[1]]);
-
- return d == gic_data.domain;
+ ppi_idx = __gic_get_ppi_index(hwirq);
+ return d == partition_get_domain(gic_data.ppi_descs[ppi_idx]);
}
static const struct irq_domain_ops gic_irq_domain_ops = {
unsigned long *hwirq,
unsigned int *type)
{
+ unsigned long ppi_intid;
struct device_node *np;
+ unsigned int ppi_idx;
int ret;
if (!gic_data.ppi_descs)
if (WARN_ON(!np))
return -EINVAL;
- ret = partition_translate_id(gic_data.ppi_descs[fwspec->param[1]],
+ ret = gic_irq_domain_translate(d, fwspec, &ppi_intid, type);
+ if (WARN_ON_ONCE(ret))
+ return 0;
+
+ ppi_idx = __gic_get_ppi_index(ppi_intid);
+ ret = partition_translate_id(gic_data.ppi_descs[ppi_idx],
of_node_to_fwnode(np));
if (ret < 0)
return ret;
case IRQ_TYPE_EDGE_RISING:
pch_pic_bitset(priv, PCH_PIC_EDGE, d->hwirq);
pch_pic_bitclr(priv, PCH_PIC_POL, d->hwirq);
+ irq_set_handler_locked(d, handle_edge_irq);
break;
case IRQ_TYPE_EDGE_FALLING:
pch_pic_bitset(priv, PCH_PIC_EDGE, d->hwirq);
pch_pic_bitset(priv, PCH_PIC_POL, d->hwirq);
+ irq_set_handler_locked(d, handle_edge_irq);
break;
case IRQ_TYPE_LEVEL_HIGH:
pch_pic_bitclr(priv, PCH_PIC_EDGE, d->hwirq);
pch_pic_bitclr(priv, PCH_PIC_POL, d->hwirq);
+ irq_set_handler_locked(d, handle_level_irq);
break;
case IRQ_TYPE_LEVEL_LOW:
pch_pic_bitclr(priv, PCH_PIC_EDGE, d->hwirq);
pch_pic_bitset(priv, PCH_PIC_POL, d->hwirq);
+ irq_set_handler_locked(d, handle_level_irq);
break;
default:
ret = -EINVAL;
return ret;
}
+static void pch_pic_ack_irq(struct irq_data *d)
+{
+ unsigned int reg;
+ struct pch_pic *priv = irq_data_get_irq_chip_data(d);
+
+ reg = readl(priv->base + PCH_PIC_EDGE + PIC_REG_IDX(d->hwirq) * 4);
+ if (reg & BIT(PIC_REG_BIT(d->hwirq))) {
+ writel(BIT(PIC_REG_BIT(d->hwirq)),
+ priv->base + PCH_PIC_CLR + PIC_REG_IDX(d->hwirq) * 4);
+ }
+ irq_chip_ack_parent(d);
+}
+
static struct irq_chip pch_pic_irq_chip = {
.name = "PCH PIC",
.irq_mask = pch_pic_mask_irq,
.irq_unmask = pch_pic_unmask_irq,
- .irq_ack = irq_chip_ack_parent,
+ .irq_ack = pch_pic_ack_irq,
.irq_set_affinity = irq_chip_set_affinity_parent,
.irq_set_type = pch_pic_set_type,
};
msi_data->irqs_num = MSI_IRQS_PER_MSIR *
(1 << msi_data->cfg->ibs_shift);
- msi_data->used = devm_kcalloc(&pdev->dev,
- BITS_TO_LONGS(msi_data->irqs_num),
- sizeof(*msi_data->used),
- GFP_KERNEL);
+ msi_data->used = devm_bitmap_zalloc(&pdev->dev, msi_data->irqs_num, GFP_KERNEL);
if (!msi_data->used)
return -ENOMEM;
/*
.irq_set_type = mtk_sysirq_set_type,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_set_affinity = irq_chip_set_affinity_parent,
+ .flags = IRQCHIP_SKIP_SET_WAKE,
};
static int mtk_sysirq_domain_translate(struct irq_domain *d,
gicp->spi_cnt += gicp->spi_ranges[i].count;
}
- gicp->spi_bitmap = devm_kcalloc(&pdev->dev,
- BITS_TO_LONGS(gicp->spi_cnt), sizeof(long),
- GFP_KERNEL);
+ gicp->spi_bitmap = devm_bitmap_zalloc(&pdev->dev, gicp->spi_cnt, GFP_KERNEL);
if (!gicp->spi_bitmap)
return -ENOMEM;
if (!odmis)
return -ENOMEM;
- odmis_bm = kcalloc(BITS_TO_LONGS(odmis_count * NODMIS_PER_FRAME),
- sizeof(long), GFP_KERNEL);
+ odmis_bm = bitmap_zalloc(odmis_count * NODMIS_PER_FRAME, GFP_KERNEL);
if (!odmis_bm) {
ret = -ENOMEM;
goto err_alloc;
if (odmi->base && !IS_ERR(odmi->base))
iounmap(odmis[i].base);
}
- kfree(odmis_bm);
+ bitmap_free(odmis_bm);
err_alloc:
kfree(odmis);
return ret;
goto out;
desc->domain = d;
- desc->bitmap = kcalloc(BITS_TO_LONGS(nr_parts), sizeof(long),
- GFP_KERNEL);
+ desc->bitmap = bitmap_zalloc(nr_parts, GFP_KERNEL);
if (WARN_ON(!desc->bitmap))
goto out;
return readl_relaxed(pdc_base + reg + i * sizeof(u32));
}
-static int qcom_pdc_gic_get_irqchip_state(struct irq_data *d,
- enum irqchip_irq_state which,
- bool *state)
-{
- if (d->hwirq == GPIO_NO_WAKE_IRQ)
- return 0;
-
- return irq_chip_get_parent_state(d, which, state);
-}
-
-static int qcom_pdc_gic_set_irqchip_state(struct irq_data *d,
- enum irqchip_irq_state which,
- bool value)
-{
- if (d->hwirq == GPIO_NO_WAKE_IRQ)
- return 0;
-
- return irq_chip_set_parent_state(d, which, value);
-}
-
static void pdc_enable_intr(struct irq_data *d, bool on)
{
int pin_out = d->hwirq;
static void qcom_pdc_gic_disable(struct irq_data *d)
{
- if (d->hwirq == GPIO_NO_WAKE_IRQ)
- return;
-
pdc_enable_intr(d, false);
irq_chip_disable_parent(d);
}
static void qcom_pdc_gic_enable(struct irq_data *d)
{
- if (d->hwirq == GPIO_NO_WAKE_IRQ)
- return;
-
pdc_enable_intr(d, true);
irq_chip_enable_parent(d);
}
-static void qcom_pdc_gic_mask(struct irq_data *d)
-{
- if (d->hwirq == GPIO_NO_WAKE_IRQ)
- return;
-
- irq_chip_mask_parent(d);
-}
-
-static void qcom_pdc_gic_unmask(struct irq_data *d)
-{
- if (d->hwirq == GPIO_NO_WAKE_IRQ)
- return;
-
- irq_chip_unmask_parent(d);
-}
-
/*
* GIC does not handle falling edge or active low. To allow falling edge and
* active low interrupts to be handled at GIC, PDC has an inverter that inverts
*/
static int qcom_pdc_gic_set_type(struct irq_data *d, unsigned int type)
{
- int pin_out = d->hwirq;
enum pdc_irq_config_bits pdc_type;
enum pdc_irq_config_bits old_pdc_type;
int ret;
- if (pin_out == GPIO_NO_WAKE_IRQ)
- return 0;
-
switch (type) {
case IRQ_TYPE_EDGE_RISING:
pdc_type = PDC_EDGE_RISING;
return -EINVAL;
}
- old_pdc_type = pdc_reg_read(IRQ_i_CFG, pin_out);
- pdc_reg_write(IRQ_i_CFG, pin_out, pdc_type);
+ old_pdc_type = pdc_reg_read(IRQ_i_CFG, d->hwirq);
+ pdc_reg_write(IRQ_i_CFG, d->hwirq, pdc_type);
ret = irq_chip_set_type_parent(d, type);
if (ret)
static struct irq_chip qcom_pdc_gic_chip = {
.name = "PDC",
.irq_eoi = irq_chip_eoi_parent,
- .irq_mask = qcom_pdc_gic_mask,
- .irq_unmask = qcom_pdc_gic_unmask,
+ .irq_mask = irq_chip_mask_parent,
+ .irq_unmask = irq_chip_unmask_parent,
.irq_disable = qcom_pdc_gic_disable,
.irq_enable = qcom_pdc_gic_enable,
- .irq_get_irqchip_state = qcom_pdc_gic_get_irqchip_state,
- .irq_set_irqchip_state = qcom_pdc_gic_set_irqchip_state,
+ .irq_get_irqchip_state = irq_chip_get_parent_state,
+ .irq_set_irqchip_state = irq_chip_set_parent_state,
.irq_retrigger = irq_chip_retrigger_hierarchy,
.irq_set_type = qcom_pdc_gic_set_type,
.flags = IRQCHIP_MASK_ON_SUSPEND |
parent_hwirq = get_parent_hwirq(hwirq);
if (parent_hwirq == PDC_NO_PARENT_IRQ)
- return 0;
+ return irq_domain_disconnect_hierarchy(domain->parent, virq);
if (type & IRQ_TYPE_EDGE_BOTH)
type = IRQ_TYPE_EDGE_RISING;
if (ret)
return ret;
+ if (hwirq == GPIO_NO_WAKE_IRQ)
+ return irq_domain_disconnect_hierarchy(domain, virq);
+
ret = irq_domain_set_hwirq_and_chip(domain, virq, hwirq,
&qcom_pdc_gic_chip, NULL);
if (ret)
return ret;
- if (hwirq == GPIO_NO_WAKE_IRQ)
- return 0;
-
parent_hwirq = get_parent_hwirq(hwirq);
if (parent_hwirq == PDC_NO_PARENT_IRQ)
- return 0;
+ return irq_domain_disconnect_hierarchy(domain->parent, virq);
if (type & IRQ_TYPE_EDGE_BOTH)
type = IRQ_TYPE_EDGE_RISING;
+++ /dev/null
-# SPDX-License-Identifier: GPL-2.0-only
-#
-# Open-Channel SSD NVM configuration
-#
-
-menuconfig NVM
- bool "Open-Channel SSD target support (DEPRECATED)"
- depends on BLOCK
- help
- Say Y here to get to enable Open-channel SSDs.
-
- Open-Channel SSDs implement a set of extension to SSDs, that
- exposes direct access to the underlying non-volatile memory.
-
- If you say N, all options in this submenu will be skipped and disabled
- only do this if you know what you are doing.
-
- This code is deprecated and will be removed in Linux 5.15.
-
-if NVM
-
-config NVM_PBLK
- tristate "Physical Block Device Open-Channel SSD target"
- select CRC32
- help
- Allows an open-channel SSD to be exposed as a block device to the
- host. The target assumes the device exposes raw flash and must be
- explicitly managed by the host.
-
- Please note the disk format is considered EXPERIMENTAL for now.
-
-if NVM_PBLK
-
-config NVM_PBLK_DEBUG
- bool "PBlk Debug Support"
- default n
- help
- Enables debug support for pblk. This includes extra checks, more
- vocal error messages, and extra tracking fields in the pblk sysfs
- entries.
-
-endif # NVM_PBLK_DEBUG
-
-endif # NVM
+++ /dev/null
-# SPDX-License-Identifier: GPL-2.0
-#
-# Makefile for Open-Channel SSDs.
-#
-
-obj-$(CONFIG_NVM) := core.o
-obj-$(CONFIG_NVM_PBLK) += pblk.o
-pblk-y := pblk-init.o pblk-core.o pblk-rb.o \
- pblk-write.o pblk-cache.o pblk-read.o \
- pblk-gc.o pblk-recovery.o pblk-map.o \
- pblk-rl.o pblk-sysfs.o
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Copyright (C) 2015 IT University of Copenhagen. All rights reserved.
- * Initial release: Matias Bjorling <m@bjorling.me>
- */
-
-#define pr_fmt(fmt) "nvm: " fmt
-
-#include <linux/list.h>
-#include <linux/types.h>
-#include <linux/sem.h>
-#include <linux/bitmap.h>
-#include <linux/module.h>
-#include <linux/moduleparam.h>
-#include <linux/miscdevice.h>
-#include <linux/lightnvm.h>
-#include <linux/sched/sysctl.h>
-
-static LIST_HEAD(nvm_tgt_types);
-static DECLARE_RWSEM(nvm_tgtt_lock);
-static LIST_HEAD(nvm_devices);
-static DECLARE_RWSEM(nvm_lock);
-
-/* Map between virtual and physical channel and lun */
-struct nvm_ch_map {
- int ch_off;
- int num_lun;
- int *lun_offs;
-};
-
-struct nvm_dev_map {
- struct nvm_ch_map *chnls;
- int num_ch;
-};
-
-static void nvm_free(struct kref *ref);
-
-static struct nvm_target *nvm_find_target(struct nvm_dev *dev, const char *name)
-{
- struct nvm_target *tgt;
-
- list_for_each_entry(tgt, &dev->targets, list)
- if (!strcmp(name, tgt->disk->disk_name))
- return tgt;
-
- return NULL;
-}
-
-static bool nvm_target_exists(const char *name)
-{
- struct nvm_dev *dev;
- struct nvm_target *tgt;
- bool ret = false;
-
- down_write(&nvm_lock);
- list_for_each_entry(dev, &nvm_devices, devices) {
- mutex_lock(&dev->mlock);
- list_for_each_entry(tgt, &dev->targets, list) {
- if (!strcmp(name, tgt->disk->disk_name)) {
- ret = true;
- mutex_unlock(&dev->mlock);
- goto out;
- }
- }
- mutex_unlock(&dev->mlock);
- }
-
-out:
- up_write(&nvm_lock);
- return ret;
-}
-
-static int nvm_reserve_luns(struct nvm_dev *dev, int lun_begin, int lun_end)
-{
- int i;
-
- for (i = lun_begin; i <= lun_end; i++) {
- if (test_and_set_bit(i, dev->lun_map)) {
- pr_err("lun %d already allocated\n", i);
- goto err;
- }
- }
-
- return 0;
-err:
- while (--i >= lun_begin)
- clear_bit(i, dev->lun_map);
-
- return -EBUSY;
-}
-
-static void nvm_release_luns_err(struct nvm_dev *dev, int lun_begin,
- int lun_end)
-{
- int i;
-
- for (i = lun_begin; i <= lun_end; i++)
- WARN_ON(!test_and_clear_bit(i, dev->lun_map));
-}
-
-static void nvm_remove_tgt_dev(struct nvm_tgt_dev *tgt_dev, int clear)
-{
- struct nvm_dev *dev = tgt_dev->parent;
- struct nvm_dev_map *dev_map = tgt_dev->map;
- int i, j;
-
- for (i = 0; i < dev_map->num_ch; i++) {
- struct nvm_ch_map *ch_map = &dev_map->chnls[i];
- int *lun_offs = ch_map->lun_offs;
- int ch = i + ch_map->ch_off;
-
- if (clear) {
- for (j = 0; j < ch_map->num_lun; j++) {
- int lun = j + lun_offs[j];
- int lunid = (ch * dev->geo.num_lun) + lun;
-
- WARN_ON(!test_and_clear_bit(lunid,
- dev->lun_map));
- }
- }
-
- kfree(ch_map->lun_offs);
- }
-
- kfree(dev_map->chnls);
- kfree(dev_map);
-
- kfree(tgt_dev->luns);
- kfree(tgt_dev);
-}
-
-static struct nvm_tgt_dev *nvm_create_tgt_dev(struct nvm_dev *dev,
- u16 lun_begin, u16 lun_end,
- u16 op)
-{
- struct nvm_tgt_dev *tgt_dev = NULL;
- struct nvm_dev_map *dev_rmap = dev->rmap;
- struct nvm_dev_map *dev_map;
- struct ppa_addr *luns;
- int num_lun = lun_end - lun_begin + 1;
- int luns_left = num_lun;
- int num_ch = num_lun / dev->geo.num_lun;
- int num_ch_mod = num_lun % dev->geo.num_lun;
- int bch = lun_begin / dev->geo.num_lun;
- int blun = lun_begin % dev->geo.num_lun;
- int lunid = 0;
- int lun_balanced = 1;
- int sec_per_lun, prev_num_lun;
- int i, j;
-
- num_ch = (num_ch_mod == 0) ? num_ch : num_ch + 1;
-
- dev_map = kmalloc(sizeof(struct nvm_dev_map), GFP_KERNEL);
- if (!dev_map)
- goto err_dev;
-
- dev_map->chnls = kcalloc(num_ch, sizeof(struct nvm_ch_map), GFP_KERNEL);
- if (!dev_map->chnls)
- goto err_chnls;
-
- luns = kcalloc(num_lun, sizeof(struct ppa_addr), GFP_KERNEL);
- if (!luns)
- goto err_luns;
-
- prev_num_lun = (luns_left > dev->geo.num_lun) ?
- dev->geo.num_lun : luns_left;
- for (i = 0; i < num_ch; i++) {
- struct nvm_ch_map *ch_rmap = &dev_rmap->chnls[i + bch];
- int *lun_roffs = ch_rmap->lun_offs;
- struct nvm_ch_map *ch_map = &dev_map->chnls[i];
- int *lun_offs;
- int luns_in_chnl = (luns_left > dev->geo.num_lun) ?
- dev->geo.num_lun : luns_left;
-
- if (lun_balanced && prev_num_lun != luns_in_chnl)
- lun_balanced = 0;
-
- ch_map->ch_off = ch_rmap->ch_off = bch;
- ch_map->num_lun = luns_in_chnl;
-
- lun_offs = kcalloc(luns_in_chnl, sizeof(int), GFP_KERNEL);
- if (!lun_offs)
- goto err_ch;
-
- for (j = 0; j < luns_in_chnl; j++) {
- luns[lunid].ppa = 0;
- luns[lunid].a.ch = i;
- luns[lunid++].a.lun = j;
-
- lun_offs[j] = blun;
- lun_roffs[j + blun] = blun;
- }
-
- ch_map->lun_offs = lun_offs;
-
- /* when starting a new channel, lun offset is reset */
- blun = 0;
- luns_left -= luns_in_chnl;
- }
-
- dev_map->num_ch = num_ch;
-
- tgt_dev = kmalloc(sizeof(struct nvm_tgt_dev), GFP_KERNEL);
- if (!tgt_dev)
- goto err_ch;
-
- /* Inherit device geometry from parent */
- memcpy(&tgt_dev->geo, &dev->geo, sizeof(struct nvm_geo));
-
- /* Target device only owns a portion of the physical device */
- tgt_dev->geo.num_ch = num_ch;
- tgt_dev->geo.num_lun = (lun_balanced) ? prev_num_lun : -1;
- tgt_dev->geo.all_luns = num_lun;
- tgt_dev->geo.all_chunks = num_lun * dev->geo.num_chk;
-
- tgt_dev->geo.op = op;
-
- sec_per_lun = dev->geo.clba * dev->geo.num_chk;
- tgt_dev->geo.total_secs = num_lun * sec_per_lun;
-
- tgt_dev->q = dev->q;
- tgt_dev->map = dev_map;
- tgt_dev->luns = luns;
- tgt_dev->parent = dev;
-
- return tgt_dev;
-err_ch:
- while (--i >= 0)
- kfree(dev_map->chnls[i].lun_offs);
- kfree(luns);
-err_luns:
- kfree(dev_map->chnls);
-err_chnls:
- kfree(dev_map);
-err_dev:
- return tgt_dev;
-}
-
-static struct nvm_tgt_type *__nvm_find_target_type(const char *name)
-{
- struct nvm_tgt_type *tt;
-
- list_for_each_entry(tt, &nvm_tgt_types, list)
- if (!strcmp(name, tt->name))
- return tt;
-
- return NULL;
-}
-
-static struct nvm_tgt_type *nvm_find_target_type(const char *name)
-{
- struct nvm_tgt_type *tt;
-
- down_write(&nvm_tgtt_lock);
- tt = __nvm_find_target_type(name);
- up_write(&nvm_tgtt_lock);
-
- return tt;
-}
-
-static int nvm_config_check_luns(struct nvm_geo *geo, int lun_begin,
- int lun_end)
-{
- if (lun_begin > lun_end || lun_end >= geo->all_luns) {
- pr_err("lun out of bound (%u:%u > %u)\n",
- lun_begin, lun_end, geo->all_luns - 1);
- return -EINVAL;
- }
-
- return 0;
-}
-
-static int __nvm_config_simple(struct nvm_dev *dev,
- struct nvm_ioctl_create_simple *s)
-{
- struct nvm_geo *geo = &dev->geo;
-
- if (s->lun_begin == -1 && s->lun_end == -1) {
- s->lun_begin = 0;
- s->lun_end = geo->all_luns - 1;
- }
-
- return nvm_config_check_luns(geo, s->lun_begin, s->lun_end);
-}
-
-static int __nvm_config_extended(struct nvm_dev *dev,
- struct nvm_ioctl_create_extended *e)
-{
- if (e->lun_begin == 0xFFFF && e->lun_end == 0xFFFF) {
- e->lun_begin = 0;
- e->lun_end = dev->geo.all_luns - 1;
- }
-
- /* op not set falls into target's default */
- if (e->op == 0xFFFF) {
- e->op = NVM_TARGET_DEFAULT_OP;
- } else if (e->op < NVM_TARGET_MIN_OP || e->op > NVM_TARGET_MAX_OP) {
- pr_err("invalid over provisioning value\n");
- return -EINVAL;
- }
-
- return nvm_config_check_luns(&dev->geo, e->lun_begin, e->lun_end);
-}
-
-static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)
-{
- struct nvm_ioctl_create_extended e;
- struct gendisk *tdisk;
- struct nvm_tgt_type *tt;
- struct nvm_target *t;
- struct nvm_tgt_dev *tgt_dev;
- void *targetdata;
- unsigned int mdts;
- int ret;
-
- switch (create->conf.type) {
- case NVM_CONFIG_TYPE_SIMPLE:
- ret = __nvm_config_simple(dev, &create->conf.s);
- if (ret)
- return ret;
-
- e.lun_begin = create->conf.s.lun_begin;
- e.lun_end = create->conf.s.lun_end;
- e.op = NVM_TARGET_DEFAULT_OP;
- break;
- case NVM_CONFIG_TYPE_EXTENDED:
- ret = __nvm_config_extended(dev, &create->conf.e);
- if (ret)
- return ret;
-
- e = create->conf.e;
- break;
- default:
- pr_err("config type not valid\n");
- return -EINVAL;
- }
-
- tt = nvm_find_target_type(create->tgttype);
- if (!tt) {
- pr_err("target type %s not found\n", create->tgttype);
- return -EINVAL;
- }
-
- if ((tt->flags & NVM_TGT_F_HOST_L2P) != (dev->geo.dom & NVM_RSP_L2P)) {
- pr_err("device is incompatible with target L2P type.\n");
- return -EINVAL;
- }
-
- if (nvm_target_exists(create->tgtname)) {
- pr_err("target name already exists (%s)\n",
- create->tgtname);
- return -EINVAL;
- }
-
- ret = nvm_reserve_luns(dev, e.lun_begin, e.lun_end);
- if (ret)
- return ret;
-
- t = kmalloc(sizeof(struct nvm_target), GFP_KERNEL);
- if (!t) {
- ret = -ENOMEM;
- goto err_reserve;
- }
-
- tgt_dev = nvm_create_tgt_dev(dev, e.lun_begin, e.lun_end, e.op);
- if (!tgt_dev) {
- pr_err("could not create target device\n");
- ret = -ENOMEM;
- goto err_t;
- }
-
- tdisk = blk_alloc_disk(dev->q->node);
- if (!tdisk) {
- ret = -ENOMEM;
- goto err_dev;
- }
-
- strlcpy(tdisk->disk_name, create->tgtname, sizeof(tdisk->disk_name));
- tdisk->major = 0;
- tdisk->first_minor = 0;
- tdisk->fops = tt->bops;
-
- targetdata = tt->init(tgt_dev, tdisk, create->flags);
- if (IS_ERR(targetdata)) {
- ret = PTR_ERR(targetdata);
- goto err_init;
- }
-
- tdisk->private_data = targetdata;
- tdisk->queue->queuedata = targetdata;
-
- mdts = (dev->geo.csecs >> 9) * NVM_MAX_VLBA;
- if (dev->geo.mdts) {
- mdts = min_t(u32, dev->geo.mdts,
- (dev->geo.csecs >> 9) * NVM_MAX_VLBA);
- }
- blk_queue_max_hw_sectors(tdisk->queue, mdts);
-
- set_capacity(tdisk, tt->capacity(targetdata));
- add_disk(tdisk);
-
- if (tt->sysfs_init && tt->sysfs_init(tdisk)) {
- ret = -ENOMEM;
- goto err_sysfs;
- }
-
- t->type = tt;
- t->disk = tdisk;
- t->dev = tgt_dev;
-
- mutex_lock(&dev->mlock);
- list_add_tail(&t->list, &dev->targets);
- mutex_unlock(&dev->mlock);
-
- __module_get(tt->owner);
-
- return 0;
-err_sysfs:
- if (tt->exit)
- tt->exit(targetdata, true);
-err_init:
- blk_cleanup_disk(tdisk);
-err_dev:
- nvm_remove_tgt_dev(tgt_dev, 0);
-err_t:
- kfree(t);
-err_reserve:
- nvm_release_luns_err(dev, e.lun_begin, e.lun_end);
- return ret;
-}
-
-static void __nvm_remove_target(struct nvm_target *t, bool graceful)
-{
- struct nvm_tgt_type *tt = t->type;
- struct gendisk *tdisk = t->disk;
-
- del_gendisk(tdisk);
-
- if (tt->sysfs_exit)
- tt->sysfs_exit(tdisk);
-
- if (tt->exit)
- tt->exit(tdisk->private_data, graceful);
-
- nvm_remove_tgt_dev(t->dev, 1);
- blk_cleanup_disk(tdisk);
- module_put(t->type->owner);
-
- list_del(&t->list);
- kfree(t);
-}
-
-/**
- * nvm_remove_tgt - Removes a target from the media manager
- * @remove: ioctl structure with target name to remove.
- *
- * Returns:
- * 0: on success
- * 1: on not found
- * <0: on error
- */
-static int nvm_remove_tgt(struct nvm_ioctl_remove *remove)
-{
- struct nvm_target *t = NULL;
- struct nvm_dev *dev;
-
- down_read(&nvm_lock);
- list_for_each_entry(dev, &nvm_devices, devices) {
- mutex_lock(&dev->mlock);
- t = nvm_find_target(dev, remove->tgtname);
- if (t) {
- mutex_unlock(&dev->mlock);
- break;
- }
- mutex_unlock(&dev->mlock);
- }
- up_read(&nvm_lock);
-
- if (!t) {
- pr_err("failed to remove target %s\n",
- remove->tgtname);
- return 1;
- }
-
- __nvm_remove_target(t, true);
- kref_put(&dev->ref, nvm_free);
-
- return 0;
-}
-
-static int nvm_register_map(struct nvm_dev *dev)
-{
- struct nvm_dev_map *rmap;
- int i, j;
-
- rmap = kmalloc(sizeof(struct nvm_dev_map), GFP_KERNEL);
- if (!rmap)
- goto err_rmap;
-
- rmap->chnls = kcalloc(dev->geo.num_ch, sizeof(struct nvm_ch_map),
- GFP_KERNEL);
- if (!rmap->chnls)
- goto err_chnls;
-
- for (i = 0; i < dev->geo.num_ch; i++) {
- struct nvm_ch_map *ch_rmap;
- int *lun_roffs;
- int luns_in_chnl = dev->geo.num_lun;
-
- ch_rmap = &rmap->chnls[i];
-
- ch_rmap->ch_off = -1;
- ch_rmap->num_lun = luns_in_chnl;
-
- lun_roffs = kcalloc(luns_in_chnl, sizeof(int), GFP_KERNEL);
- if (!lun_roffs)
- goto err_ch;
-
- for (j = 0; j < luns_in_chnl; j++)
- lun_roffs[j] = -1;
-
- ch_rmap->lun_offs = lun_roffs;
- }
-
- dev->rmap = rmap;
-
- return 0;
-err_ch:
- while (--i >= 0)
- kfree(rmap->chnls[i].lun_offs);
-err_chnls:
- kfree(rmap);
-err_rmap:
- return -ENOMEM;
-}
-
-static void nvm_unregister_map(struct nvm_dev *dev)
-{
- struct nvm_dev_map *rmap = dev->rmap;
- int i;
-
- for (i = 0; i < dev->geo.num_ch; i++)
- kfree(rmap->chnls[i].lun_offs);
-
- kfree(rmap->chnls);
- kfree(rmap);
-}
-
-static void nvm_map_to_dev(struct nvm_tgt_dev *tgt_dev, struct ppa_addr *p)
-{
- struct nvm_dev_map *dev_map = tgt_dev->map;
- struct nvm_ch_map *ch_map = &dev_map->chnls[p->a.ch];
- int lun_off = ch_map->lun_offs[p->a.lun];
-
- p->a.ch += ch_map->ch_off;
- p->a.lun += lun_off;
-}
-
-static void nvm_map_to_tgt(struct nvm_tgt_dev *tgt_dev, struct ppa_addr *p)
-{
- struct nvm_dev *dev = tgt_dev->parent;
- struct nvm_dev_map *dev_rmap = dev->rmap;
- struct nvm_ch_map *ch_rmap = &dev_rmap->chnls[p->a.ch];
- int lun_roff = ch_rmap->lun_offs[p->a.lun];
-
- p->a.ch -= ch_rmap->ch_off;
- p->a.lun -= lun_roff;
-}
-
-static void nvm_ppa_tgt_to_dev(struct nvm_tgt_dev *tgt_dev,
- struct ppa_addr *ppa_list, int nr_ppas)
-{
- int i;
-
- for (i = 0; i < nr_ppas; i++) {
- nvm_map_to_dev(tgt_dev, &ppa_list[i]);
- ppa_list[i] = generic_to_dev_addr(tgt_dev->parent, ppa_list[i]);
- }
-}
-
-static void nvm_ppa_dev_to_tgt(struct nvm_tgt_dev *tgt_dev,
- struct ppa_addr *ppa_list, int nr_ppas)
-{
- int i;
-
- for (i = 0; i < nr_ppas; i++) {
- ppa_list[i] = dev_to_generic_addr(tgt_dev->parent, ppa_list[i]);
- nvm_map_to_tgt(tgt_dev, &ppa_list[i]);
- }
-}
-
-static void nvm_rq_tgt_to_dev(struct nvm_tgt_dev *tgt_dev, struct nvm_rq *rqd)
-{
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
-
- nvm_ppa_tgt_to_dev(tgt_dev, ppa_list, rqd->nr_ppas);
-}
-
-static void nvm_rq_dev_to_tgt(struct nvm_tgt_dev *tgt_dev, struct nvm_rq *rqd)
-{
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
-
- nvm_ppa_dev_to_tgt(tgt_dev, ppa_list, rqd->nr_ppas);
-}
-
-int nvm_register_tgt_type(struct nvm_tgt_type *tt)
-{
- int ret = 0;
-
- down_write(&nvm_tgtt_lock);
- if (__nvm_find_target_type(tt->name))
- ret = -EEXIST;
- else
- list_add(&tt->list, &nvm_tgt_types);
- up_write(&nvm_tgtt_lock);
-
- return ret;
-}
-EXPORT_SYMBOL(nvm_register_tgt_type);
-
-void nvm_unregister_tgt_type(struct nvm_tgt_type *tt)
-{
- if (!tt)
- return;
-
- down_write(&nvm_tgtt_lock);
- list_del(&tt->list);
- up_write(&nvm_tgtt_lock);
-}
-EXPORT_SYMBOL(nvm_unregister_tgt_type);
-
-void *nvm_dev_dma_alloc(struct nvm_dev *dev, gfp_t mem_flags,
- dma_addr_t *dma_handler)
-{
- return dev->ops->dev_dma_alloc(dev, dev->dma_pool, mem_flags,
- dma_handler);
-}
-EXPORT_SYMBOL(nvm_dev_dma_alloc);
-
-void nvm_dev_dma_free(struct nvm_dev *dev, void *addr, dma_addr_t dma_handler)
-{
- dev->ops->dev_dma_free(dev->dma_pool, addr, dma_handler);
-}
-EXPORT_SYMBOL(nvm_dev_dma_free);
-
-static struct nvm_dev *nvm_find_nvm_dev(const char *name)
-{
- struct nvm_dev *dev;
-
- list_for_each_entry(dev, &nvm_devices, devices)
- if (!strcmp(name, dev->name))
- return dev;
-
- return NULL;
-}
-
-static int nvm_set_rqd_ppalist(struct nvm_tgt_dev *tgt_dev, struct nvm_rq *rqd,
- const struct ppa_addr *ppas, int nr_ppas)
-{
- struct nvm_dev *dev = tgt_dev->parent;
- struct nvm_geo *geo = &tgt_dev->geo;
- int i, plane_cnt, pl_idx;
- struct ppa_addr ppa;
-
- if (geo->pln_mode == NVM_PLANE_SINGLE && nr_ppas == 1) {
- rqd->nr_ppas = nr_ppas;
- rqd->ppa_addr = ppas[0];
-
- return 0;
- }
-
- rqd->nr_ppas = nr_ppas;
- rqd->ppa_list = nvm_dev_dma_alloc(dev, GFP_KERNEL, &rqd->dma_ppa_list);
- if (!rqd->ppa_list) {
- pr_err("failed to allocate dma memory\n");
- return -ENOMEM;
- }
-
- plane_cnt = geo->pln_mode;
- rqd->nr_ppas *= plane_cnt;
-
- for (i = 0; i < nr_ppas; i++) {
- for (pl_idx = 0; pl_idx < plane_cnt; pl_idx++) {
- ppa = ppas[i];
- ppa.g.pl = pl_idx;
- rqd->ppa_list[(pl_idx * nr_ppas) + i] = ppa;
- }
- }
-
- return 0;
-}
-
-static void nvm_free_rqd_ppalist(struct nvm_tgt_dev *tgt_dev,
- struct nvm_rq *rqd)
-{
- if (!rqd->ppa_list)
- return;
-
- nvm_dev_dma_free(tgt_dev->parent, rqd->ppa_list, rqd->dma_ppa_list);
-}
-
-static int nvm_set_flags(struct nvm_geo *geo, struct nvm_rq *rqd)
-{
- int flags = 0;
-
- if (geo->version == NVM_OCSSD_SPEC_20)
- return 0;
-
- if (rqd->is_seq)
- flags |= geo->pln_mode >> 1;
-
- if (rqd->opcode == NVM_OP_PREAD)
- flags |= (NVM_IO_SCRAMBLE_ENABLE | NVM_IO_SUSPEND);
- else if (rqd->opcode == NVM_OP_PWRITE)
- flags |= NVM_IO_SCRAMBLE_ENABLE;
-
- return flags;
-}
-
-int nvm_submit_io(struct nvm_tgt_dev *tgt_dev, struct nvm_rq *rqd, void *buf)
-{
- struct nvm_dev *dev = tgt_dev->parent;
- int ret;
-
- if (!dev->ops->submit_io)
- return -ENODEV;
-
- nvm_rq_tgt_to_dev(tgt_dev, rqd);
-
- rqd->dev = tgt_dev;
- rqd->flags = nvm_set_flags(&tgt_dev->geo, rqd);
-
- /* In case of error, fail with right address format */
- ret = dev->ops->submit_io(dev, rqd, buf);
- if (ret)
- nvm_rq_dev_to_tgt(tgt_dev, rqd);
- return ret;
-}
-EXPORT_SYMBOL(nvm_submit_io);
-
-static void nvm_sync_end_io(struct nvm_rq *rqd)
-{
- struct completion *waiting = rqd->private;
-
- complete(waiting);
-}
-
-static int nvm_submit_io_wait(struct nvm_dev *dev, struct nvm_rq *rqd,
- void *buf)
-{
- DECLARE_COMPLETION_ONSTACK(wait);
- int ret = 0;
-
- rqd->end_io = nvm_sync_end_io;
- rqd->private = &wait;
-
- ret = dev->ops->submit_io(dev, rqd, buf);
- if (ret)
- return ret;
-
- wait_for_completion_io(&wait);
-
- return 0;
-}
-
-int nvm_submit_io_sync(struct nvm_tgt_dev *tgt_dev, struct nvm_rq *rqd,
- void *buf)
-{
- struct nvm_dev *dev = tgt_dev->parent;
- int ret;
-
- if (!dev->ops->submit_io)
- return -ENODEV;
-
- nvm_rq_tgt_to_dev(tgt_dev, rqd);
-
- rqd->dev = tgt_dev;
- rqd->flags = nvm_set_flags(&tgt_dev->geo, rqd);
-
- ret = nvm_submit_io_wait(dev, rqd, buf);
-
- return ret;
-}
-EXPORT_SYMBOL(nvm_submit_io_sync);
-
-void nvm_end_io(struct nvm_rq *rqd)
-{
- struct nvm_tgt_dev *tgt_dev = rqd->dev;
-
- /* Convert address space */
- if (tgt_dev)
- nvm_rq_dev_to_tgt(tgt_dev, rqd);
-
- if (rqd->end_io)
- rqd->end_io(rqd);
-}
-EXPORT_SYMBOL(nvm_end_io);
-
-static int nvm_submit_io_sync_raw(struct nvm_dev *dev, struct nvm_rq *rqd)
-{
- if (!dev->ops->submit_io)
- return -ENODEV;
-
- rqd->dev = NULL;
- rqd->flags = nvm_set_flags(&dev->geo, rqd);
-
- return nvm_submit_io_wait(dev, rqd, NULL);
-}
-
-static int nvm_bb_chunk_sense(struct nvm_dev *dev, struct ppa_addr ppa)
-{
- struct nvm_rq rqd = { NULL };
- struct bio bio;
- struct bio_vec bio_vec;
- struct page *page;
- int ret;
-
- page = alloc_page(GFP_KERNEL);
- if (!page)
- return -ENOMEM;
-
- bio_init(&bio, &bio_vec, 1);
- bio_add_page(&bio, page, PAGE_SIZE, 0);
- bio_set_op_attrs(&bio, REQ_OP_READ, 0);
-
- rqd.bio = &bio;
- rqd.opcode = NVM_OP_PREAD;
- rqd.is_seq = 1;
- rqd.nr_ppas = 1;
- rqd.ppa_addr = generic_to_dev_addr(dev, ppa);
-
- ret = nvm_submit_io_sync_raw(dev, &rqd);
- __free_page(page);
- if (ret)
- return ret;
-
- return rqd.error;
-}
-
-/*
- * Scans a 1.2 chunk first and last page to determine if its state.
- * If the chunk is found to be open, also scan it to update the write
- * pointer.
- */
-static int nvm_bb_chunk_scan(struct nvm_dev *dev, struct ppa_addr ppa,
- struct nvm_chk_meta *meta)
-{
- struct nvm_geo *geo = &dev->geo;
- int ret, pg, pl;
-
- /* sense first page */
- ret = nvm_bb_chunk_sense(dev, ppa);
- if (ret < 0) /* io error */
- return ret;
- else if (ret == 0) /* valid data */
- meta->state = NVM_CHK_ST_OPEN;
- else if (ret > 0) {
- /*
- * If empty page, the chunk is free, else it is an
- * actual io error. In that case, mark it offline.
- */
- switch (ret) {
- case NVM_RSP_ERR_EMPTYPAGE:
- meta->state = NVM_CHK_ST_FREE;
- return 0;
- case NVM_RSP_ERR_FAILCRC:
- case NVM_RSP_ERR_FAILECC:
- case NVM_RSP_WARN_HIGHECC:
- meta->state = NVM_CHK_ST_OPEN;
- goto scan;
- default:
- return -ret; /* other io error */
- }
- }
-
- /* sense last page */
- ppa.g.pg = geo->num_pg - 1;
- ppa.g.pl = geo->num_pln - 1;
-
- ret = nvm_bb_chunk_sense(dev, ppa);
- if (ret < 0) /* io error */
- return ret;
- else if (ret == 0) { /* Chunk fully written */
- meta->state = NVM_CHK_ST_CLOSED;
- meta->wp = geo->clba;
- return 0;
- } else if (ret > 0) {
- switch (ret) {
- case NVM_RSP_ERR_EMPTYPAGE:
- case NVM_RSP_ERR_FAILCRC:
- case NVM_RSP_ERR_FAILECC:
- case NVM_RSP_WARN_HIGHECC:
- meta->state = NVM_CHK_ST_OPEN;
- break;
- default:
- return -ret; /* other io error */
- }
- }
-
-scan:
- /*
- * chunk is open, we scan sequentially to update the write pointer.
- * We make the assumption that targets write data across all planes
- * before moving to the next page.
- */
- for (pg = 0; pg < geo->num_pg; pg++) {
- for (pl = 0; pl < geo->num_pln; pl++) {
- ppa.g.pg = pg;
- ppa.g.pl = pl;
-
- ret = nvm_bb_chunk_sense(dev, ppa);
- if (ret < 0) /* io error */
- return ret;
- else if (ret == 0) {
- meta->wp += geo->ws_min;
- } else if (ret > 0) {
- switch (ret) {
- case NVM_RSP_ERR_EMPTYPAGE:
- return 0;
- case NVM_RSP_ERR_FAILCRC:
- case NVM_RSP_ERR_FAILECC:
- case NVM_RSP_WARN_HIGHECC:
- meta->wp += geo->ws_min;
- break;
- default:
- return -ret; /* other io error */
- }
- }
- }
- }
-
- return 0;
-}
-
-/*
- * folds a bad block list from its plane representation to its
- * chunk representation.
- *
- * If any of the planes status are bad or grown bad, the chunk is marked
- * offline. If not bad, the first plane state acts as the chunk state.
- */
-static int nvm_bb_to_chunk(struct nvm_dev *dev, struct ppa_addr ppa,
- u8 *blks, int nr_blks, struct nvm_chk_meta *meta)
-{
- struct nvm_geo *geo = &dev->geo;
- int ret, blk, pl, offset, blktype;
-
- for (blk = 0; blk < geo->num_chk; blk++) {
- offset = blk * geo->pln_mode;
- blktype = blks[offset];
-
- for (pl = 0; pl < geo->pln_mode; pl++) {
- if (blks[offset + pl] &
- (NVM_BLK_T_BAD|NVM_BLK_T_GRWN_BAD)) {
- blktype = blks[offset + pl];
- break;
- }
- }
-
- ppa.g.blk = blk;
-
- meta->wp = 0;
- meta->type = NVM_CHK_TP_W_SEQ;
- meta->wi = 0;
- meta->slba = generic_to_dev_addr(dev, ppa).ppa;
- meta->cnlb = dev->geo.clba;
-
- if (blktype == NVM_BLK_T_FREE) {
- ret = nvm_bb_chunk_scan(dev, ppa, meta);
- if (ret)
- return ret;
- } else {
- meta->state = NVM_CHK_ST_OFFLINE;
- }
-
- meta++;
- }
-
- return 0;
-}
-
-static int nvm_get_bb_meta(struct nvm_dev *dev, sector_t slba,
- int nchks, struct nvm_chk_meta *meta)
-{
- struct nvm_geo *geo = &dev->geo;
- struct ppa_addr ppa;
- u8 *blks;
- int ch, lun, nr_blks;
- int ret = 0;
-
- ppa.ppa = slba;
- ppa = dev_to_generic_addr(dev, ppa);
-
- if (ppa.g.blk != 0)
- return -EINVAL;
-
- if ((nchks % geo->num_chk) != 0)
- return -EINVAL;
-
- nr_blks = geo->num_chk * geo->pln_mode;
-
- blks = kmalloc(nr_blks, GFP_KERNEL);
- if (!blks)
- return -ENOMEM;
-
- for (ch = ppa.g.ch; ch < geo->num_ch; ch++) {
- for (lun = ppa.g.lun; lun < geo->num_lun; lun++) {
- struct ppa_addr ppa_gen, ppa_dev;
-
- if (!nchks)
- goto done;
-
- ppa_gen.ppa = 0;
- ppa_gen.g.ch = ch;
- ppa_gen.g.lun = lun;
- ppa_dev = generic_to_dev_addr(dev, ppa_gen);
-
- ret = dev->ops->get_bb_tbl(dev, ppa_dev, blks);
- if (ret)
- goto done;
-
- ret = nvm_bb_to_chunk(dev, ppa_gen, blks, nr_blks,
- meta);
- if (ret)
- goto done;
-
- meta += geo->num_chk;
- nchks -= geo->num_chk;
- }
- }
-done:
- kfree(blks);
- return ret;
-}
-
-int nvm_get_chunk_meta(struct nvm_tgt_dev *tgt_dev, struct ppa_addr ppa,
- int nchks, struct nvm_chk_meta *meta)
-{
- struct nvm_dev *dev = tgt_dev->parent;
-
- nvm_ppa_tgt_to_dev(tgt_dev, &ppa, 1);
-
- if (dev->geo.version == NVM_OCSSD_SPEC_12)
- return nvm_get_bb_meta(dev, (sector_t)ppa.ppa, nchks, meta);
-
- return dev->ops->get_chk_meta(dev, (sector_t)ppa.ppa, nchks, meta);
-}
-EXPORT_SYMBOL_GPL(nvm_get_chunk_meta);
-
-int nvm_set_chunk_meta(struct nvm_tgt_dev *tgt_dev, struct ppa_addr *ppas,
- int nr_ppas, int type)
-{
- struct nvm_dev *dev = tgt_dev->parent;
- struct nvm_rq rqd;
- int ret;
-
- if (dev->geo.version == NVM_OCSSD_SPEC_20)
- return 0;
-
- if (nr_ppas > NVM_MAX_VLBA) {
- pr_err("unable to update all blocks atomically\n");
- return -EINVAL;
- }
-
- memset(&rqd, 0, sizeof(struct nvm_rq));
-
- nvm_set_rqd_ppalist(tgt_dev, &rqd, ppas, nr_ppas);
- nvm_rq_tgt_to_dev(tgt_dev, &rqd);
-
- ret = dev->ops->set_bb_tbl(dev, &rqd.ppa_addr, rqd.nr_ppas, type);
- nvm_free_rqd_ppalist(tgt_dev, &rqd);
- if (ret)
- return -EINVAL;
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(nvm_set_chunk_meta);
-
-static int nvm_core_init(struct nvm_dev *dev)
-{
- struct nvm_geo *geo = &dev->geo;
- int ret;
-
- dev->lun_map = kcalloc(BITS_TO_LONGS(geo->all_luns),
- sizeof(unsigned long), GFP_KERNEL);
- if (!dev->lun_map)
- return -ENOMEM;
-
- INIT_LIST_HEAD(&dev->area_list);
- INIT_LIST_HEAD(&dev->targets);
- mutex_init(&dev->mlock);
- spin_lock_init(&dev->lock);
-
- ret = nvm_register_map(dev);
- if (ret)
- goto err_fmtype;
-
- return 0;
-err_fmtype:
- kfree(dev->lun_map);
- return ret;
-}
-
-static void nvm_free(struct kref *ref)
-{
- struct nvm_dev *dev = container_of(ref, struct nvm_dev, ref);
-
- if (dev->dma_pool)
- dev->ops->destroy_dma_pool(dev->dma_pool);
-
- if (dev->rmap)
- nvm_unregister_map(dev);
-
- kfree(dev->lun_map);
- kfree(dev);
-}
-
-static int nvm_init(struct nvm_dev *dev)
-{
- struct nvm_geo *geo = &dev->geo;
- int ret = -EINVAL;
-
- if (dev->ops->identity(dev)) {
- pr_err("device could not be identified\n");
- goto err;
- }
-
- pr_debug("ver:%u.%u nvm_vendor:%x\n", geo->major_ver_id,
- geo->minor_ver_id, geo->vmnt);
-
- ret = nvm_core_init(dev);
- if (ret) {
- pr_err("could not initialize core structures.\n");
- goto err;
- }
-
- pr_info("registered %s [%u/%u/%u/%u/%u]\n",
- dev->name, dev->geo.ws_min, dev->geo.ws_opt,
- dev->geo.num_chk, dev->geo.all_luns,
- dev->geo.num_ch);
- return 0;
-err:
- pr_err("failed to initialize nvm\n");
- return ret;
-}
-
-struct nvm_dev *nvm_alloc_dev(int node)
-{
- struct nvm_dev *dev;
-
- dev = kzalloc_node(sizeof(struct nvm_dev), GFP_KERNEL, node);
- if (dev)
- kref_init(&dev->ref);
-
- return dev;
-}
-EXPORT_SYMBOL(nvm_alloc_dev);
-
-int nvm_register(struct nvm_dev *dev)
-{
- int ret, exp_pool_size;
-
- pr_warn_once("lightnvm support is deprecated and will be removed in Linux 5.15.\n");
-
- if (!dev->q || !dev->ops) {
- kref_put(&dev->ref, nvm_free);
- return -EINVAL;
- }
-
- ret = nvm_init(dev);
- if (ret) {
- kref_put(&dev->ref, nvm_free);
- return ret;
- }
-
- exp_pool_size = max_t(int, PAGE_SIZE,
- (NVM_MAX_VLBA * (sizeof(u64) + dev->geo.sos)));
- exp_pool_size = round_up(exp_pool_size, PAGE_SIZE);
-
- dev->dma_pool = dev->ops->create_dma_pool(dev, "ppalist",
- exp_pool_size);
- if (!dev->dma_pool) {
- pr_err("could not create dma pool\n");
- kref_put(&dev->ref, nvm_free);
- return -ENOMEM;
- }
-
- /* register device with a supported media manager */
- down_write(&nvm_lock);
- list_add(&dev->devices, &nvm_devices);
- up_write(&nvm_lock);
-
- return 0;
-}
-EXPORT_SYMBOL(nvm_register);
-
-void nvm_unregister(struct nvm_dev *dev)
-{
- struct nvm_target *t, *tmp;
-
- mutex_lock(&dev->mlock);
- list_for_each_entry_safe(t, tmp, &dev->targets, list) {
- if (t->dev->parent != dev)
- continue;
- __nvm_remove_target(t, false);
- kref_put(&dev->ref, nvm_free);
- }
- mutex_unlock(&dev->mlock);
-
- down_write(&nvm_lock);
- list_del(&dev->devices);
- up_write(&nvm_lock);
-
- kref_put(&dev->ref, nvm_free);
-}
-EXPORT_SYMBOL(nvm_unregister);
-
-static int __nvm_configure_create(struct nvm_ioctl_create *create)
-{
- struct nvm_dev *dev;
- int ret;
-
- down_write(&nvm_lock);
- dev = nvm_find_nvm_dev(create->dev);
- up_write(&nvm_lock);
-
- if (!dev) {
- pr_err("device not found\n");
- return -EINVAL;
- }
-
- kref_get(&dev->ref);
- ret = nvm_create_tgt(dev, create);
- if (ret)
- kref_put(&dev->ref, nvm_free);
-
- return ret;
-}
-
-static long nvm_ioctl_info(struct file *file, void __user *arg)
-{
- struct nvm_ioctl_info *info;
- struct nvm_tgt_type *tt;
- int tgt_iter = 0;
-
- info = memdup_user(arg, sizeof(struct nvm_ioctl_info));
- if (IS_ERR(info))
- return PTR_ERR(info);
-
- info->version[0] = NVM_VERSION_MAJOR;
- info->version[1] = NVM_VERSION_MINOR;
- info->version[2] = NVM_VERSION_PATCH;
-
- down_write(&nvm_tgtt_lock);
- list_for_each_entry(tt, &nvm_tgt_types, list) {
- struct nvm_ioctl_info_tgt *tgt = &info->tgts[tgt_iter];
-
- tgt->version[0] = tt->version[0];
- tgt->version[1] = tt->version[1];
- tgt->version[2] = tt->version[2];
- strncpy(tgt->tgtname, tt->name, NVM_TTYPE_NAME_MAX);
-
- tgt_iter++;
- }
-
- info->tgtsize = tgt_iter;
- up_write(&nvm_tgtt_lock);
-
- if (copy_to_user(arg, info, sizeof(struct nvm_ioctl_info))) {
- kfree(info);
- return -EFAULT;
- }
-
- kfree(info);
- return 0;
-}
-
-static long nvm_ioctl_get_devices(struct file *file, void __user *arg)
-{
- struct nvm_ioctl_get_devices *devices;
- struct nvm_dev *dev;
- int i = 0;
-
- devices = kzalloc(sizeof(struct nvm_ioctl_get_devices), GFP_KERNEL);
- if (!devices)
- return -ENOMEM;
-
- down_write(&nvm_lock);
- list_for_each_entry(dev, &nvm_devices, devices) {
- struct nvm_ioctl_device_info *info = &devices->info[i];
-
- strlcpy(info->devname, dev->name, sizeof(info->devname));
-
- /* kept for compatibility */
- info->bmversion[0] = 1;
- info->bmversion[1] = 0;
- info->bmversion[2] = 0;
- strlcpy(info->bmname, "gennvm", sizeof(info->bmname));
- i++;
-
- if (i >= ARRAY_SIZE(devices->info)) {
- pr_err("max %zd devices can be reported.\n",
- ARRAY_SIZE(devices->info));
- break;
- }
- }
- up_write(&nvm_lock);
-
- devices->nr_devices = i;
-
- if (copy_to_user(arg, devices,
- sizeof(struct nvm_ioctl_get_devices))) {
- kfree(devices);
- return -EFAULT;
- }
-
- kfree(devices);
- return 0;
-}
-
-static long nvm_ioctl_dev_create(struct file *file, void __user *arg)
-{
- struct nvm_ioctl_create create;
-
- if (copy_from_user(&create, arg, sizeof(struct nvm_ioctl_create)))
- return -EFAULT;
-
- if (create.conf.type == NVM_CONFIG_TYPE_EXTENDED &&
- create.conf.e.rsv != 0) {
- pr_err("reserved config field in use\n");
- return -EINVAL;
- }
-
- create.dev[DISK_NAME_LEN - 1] = '\0';
- create.tgttype[NVM_TTYPE_NAME_MAX - 1] = '\0';
- create.tgtname[DISK_NAME_LEN - 1] = '\0';
-
- if (create.flags != 0) {
- __u32 flags = create.flags;
-
- /* Check for valid flags */
- if (flags & NVM_TARGET_FACTORY)
- flags &= ~NVM_TARGET_FACTORY;
-
- if (flags) {
- pr_err("flag not supported\n");
- return -EINVAL;
- }
- }
-
- return __nvm_configure_create(&create);
-}
-
-static long nvm_ioctl_dev_remove(struct file *file, void __user *arg)
-{
- struct nvm_ioctl_remove remove;
-
- if (copy_from_user(&remove, arg, sizeof(struct nvm_ioctl_remove)))
- return -EFAULT;
-
- remove.tgtname[DISK_NAME_LEN - 1] = '\0';
-
- if (remove.flags != 0) {
- pr_err("no flags supported\n");
- return -EINVAL;
- }
-
- return nvm_remove_tgt(&remove);
-}
-
-/* kept for compatibility reasons */
-static long nvm_ioctl_dev_init(struct file *file, void __user *arg)
-{
- struct nvm_ioctl_dev_init init;
-
- if (copy_from_user(&init, arg, sizeof(struct nvm_ioctl_dev_init)))
- return -EFAULT;
-
- if (init.flags != 0) {
- pr_err("no flags supported\n");
- return -EINVAL;
- }
-
- return 0;
-}
-
-/* Kept for compatibility reasons */
-static long nvm_ioctl_dev_factory(struct file *file, void __user *arg)
-{
- struct nvm_ioctl_dev_factory fact;
-
- if (copy_from_user(&fact, arg, sizeof(struct nvm_ioctl_dev_factory)))
- return -EFAULT;
-
- fact.dev[DISK_NAME_LEN - 1] = '\0';
-
- if (fact.flags & ~(NVM_FACTORY_NR_BITS - 1))
- return -EINVAL;
-
- return 0;
-}
-
-static long nvm_ctl_ioctl(struct file *file, uint cmd, unsigned long arg)
-{
- void __user *argp = (void __user *)arg;
-
- if (!capable(CAP_SYS_ADMIN))
- return -EPERM;
-
- switch (cmd) {
- case NVM_INFO:
- return nvm_ioctl_info(file, argp);
- case NVM_GET_DEVICES:
- return nvm_ioctl_get_devices(file, argp);
- case NVM_DEV_CREATE:
- return nvm_ioctl_dev_create(file, argp);
- case NVM_DEV_REMOVE:
- return nvm_ioctl_dev_remove(file, argp);
- case NVM_DEV_INIT:
- return nvm_ioctl_dev_init(file, argp);
- case NVM_DEV_FACTORY:
- return nvm_ioctl_dev_factory(file, argp);
- }
- return 0;
-}
-
-static const struct file_operations _ctl_fops = {
- .open = nonseekable_open,
- .unlocked_ioctl = nvm_ctl_ioctl,
- .owner = THIS_MODULE,
- .llseek = noop_llseek,
-};
-
-static struct miscdevice _nvm_misc = {
- .minor = MISC_DYNAMIC_MINOR,
- .name = "lightnvm",
- .nodename = "lightnvm/control",
- .fops = &_ctl_fops,
-};
-builtin_misc_device(_nvm_misc);
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-cache.c - pblk's write cache
- */
-
-#include "pblk.h"
-
-void pblk_write_to_cache(struct pblk *pblk, struct bio *bio,
- unsigned long flags)
-{
- struct pblk_w_ctx w_ctx;
- sector_t lba = pblk_get_lba(bio);
- unsigned long start_time;
- unsigned int bpos, pos;
- int nr_entries = pblk_get_secs(bio);
- int i, ret;
-
- start_time = bio_start_io_acct(bio);
-
- /* Update the write buffer head (mem) with the entries that we can
- * write. The write in itself cannot fail, so there is no need to
- * rollback from here on.
- */
-retry:
- ret = pblk_rb_may_write_user(&pblk->rwb, bio, nr_entries, &bpos);
- switch (ret) {
- case NVM_IO_REQUEUE:
- io_schedule();
- goto retry;
- case NVM_IO_ERR:
- pblk_pipeline_stop(pblk);
- bio_io_error(bio);
- goto out;
- }
-
- pblk_ppa_set_empty(&w_ctx.ppa);
- w_ctx.flags = flags;
- if (bio->bi_opf & REQ_PREFLUSH) {
- w_ctx.flags |= PBLK_FLUSH_ENTRY;
- pblk_write_kick(pblk);
- }
-
- if (unlikely(!bio_has_data(bio)))
- goto out;
-
- for (i = 0; i < nr_entries; i++) {
- void *data = bio_data(bio);
-
- w_ctx.lba = lba + i;
-
- pos = pblk_rb_wrap_pos(&pblk->rwb, bpos + i);
- pblk_rb_write_entry_user(&pblk->rwb, data, w_ctx, pos);
-
- bio_advance(bio, PBLK_EXPOSED_PAGE_SIZE);
- }
-
- atomic64_add(nr_entries, &pblk->user_wa);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(nr_entries, &pblk->inflight_writes);
- atomic_long_add(nr_entries, &pblk->req_writes);
-#endif
-
- pblk_rl_inserted(&pblk->rl, nr_entries);
-
-out:
- bio_end_io_acct(bio, start_time);
- pblk_write_should_kick(pblk);
-
- if (ret == NVM_IO_DONE)
- bio_endio(bio);
-}
-
-/*
- * On GC the incoming lbas are not necessarily sequential. Also, some of the
- * lbas might not be valid entries, which are marked as empty by the GC thread
- */
-int pblk_write_gc_to_cache(struct pblk *pblk, struct pblk_gc_rq *gc_rq)
-{
- struct pblk_w_ctx w_ctx;
- unsigned int bpos, pos;
- void *data = gc_rq->data;
- int i, valid_entries;
-
- /* Update the write buffer head (mem) with the entries that we can
- * write. The write in itself cannot fail, so there is no need to
- * rollback from here on.
- */
-retry:
- if (!pblk_rb_may_write_gc(&pblk->rwb, gc_rq->secs_to_gc, &bpos)) {
- io_schedule();
- goto retry;
- }
-
- w_ctx.flags = PBLK_IOTYPE_GC;
- pblk_ppa_set_empty(&w_ctx.ppa);
-
- for (i = 0, valid_entries = 0; i < gc_rq->nr_secs; i++) {
- if (gc_rq->lba_list[i] == ADDR_EMPTY)
- continue;
-
- w_ctx.lba = gc_rq->lba_list[i];
-
- pos = pblk_rb_wrap_pos(&pblk->rwb, bpos + valid_entries);
- pblk_rb_write_entry_gc(&pblk->rwb, data, w_ctx, gc_rq->line,
- gc_rq->paddr_list[i], pos);
-
- data += PBLK_EXPOSED_PAGE_SIZE;
- valid_entries++;
- }
-
- WARN_ONCE(gc_rq->secs_to_gc != valid_entries,
- "pblk: inconsistent GC write\n");
-
- atomic64_add(valid_entries, &pblk->gc_wa);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(valid_entries, &pblk->inflight_writes);
- atomic_long_add(valid_entries, &pblk->recov_gc_writes);
-#endif
-
- pblk_write_should_kick(pblk);
- return NVM_IO_OK;
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-core.c - pblk's core functionality
- *
- */
-
-#define CREATE_TRACE_POINTS
-
-#include "pblk.h"
-#include "pblk-trace.h"
-
-static void pblk_line_mark_bb(struct work_struct *work)
-{
- struct pblk_line_ws *line_ws = container_of(work, struct pblk_line_ws,
- ws);
- struct pblk *pblk = line_ws->pblk;
- struct nvm_tgt_dev *dev = pblk->dev;
- struct ppa_addr *ppa = line_ws->priv;
- int ret;
-
- ret = nvm_set_chunk_meta(dev, ppa, 1, NVM_BLK_T_GRWN_BAD);
- if (ret) {
- struct pblk_line *line;
- int pos;
-
- line = pblk_ppa_to_line(pblk, *ppa);
- pos = pblk_ppa_to_pos(&dev->geo, *ppa);
-
- pblk_err(pblk, "failed to mark bb, line:%d, pos:%d\n",
- line->id, pos);
- }
-
- kfree(ppa);
- mempool_free(line_ws, &pblk->gen_ws_pool);
-}
-
-static void pblk_mark_bb(struct pblk *pblk, struct pblk_line *line,
- struct ppa_addr ppa_addr)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct ppa_addr *ppa;
- int pos = pblk_ppa_to_pos(geo, ppa_addr);
-
- pblk_debug(pblk, "erase failed: line:%d, pos:%d\n", line->id, pos);
- atomic_long_inc(&pblk->erase_failed);
-
- atomic_dec(&line->blk_in_line);
- if (test_and_set_bit(pos, line->blk_bitmap))
- pblk_err(pblk, "attempted to erase bb: line:%d, pos:%d\n",
- line->id, pos);
-
- /* Not necessary to mark bad blocks on 2.0 spec. */
- if (geo->version == NVM_OCSSD_SPEC_20)
- return;
-
- ppa = kmalloc(sizeof(struct ppa_addr), GFP_ATOMIC);
- if (!ppa)
- return;
-
- *ppa = ppa_addr;
- pblk_gen_run_ws(pblk, NULL, ppa, pblk_line_mark_bb,
- GFP_ATOMIC, pblk->bb_wq);
-}
-
-static void __pblk_end_io_erase(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct nvm_chk_meta *chunk;
- struct pblk_line *line;
- int pos;
-
- line = pblk_ppa_to_line(pblk, rqd->ppa_addr);
- pos = pblk_ppa_to_pos(geo, rqd->ppa_addr);
- chunk = &line->chks[pos];
-
- atomic_dec(&line->left_seblks);
-
- if (rqd->error) {
- trace_pblk_chunk_reset(pblk_disk_name(pblk),
- &rqd->ppa_addr, PBLK_CHUNK_RESET_FAILED);
-
- chunk->state = NVM_CHK_ST_OFFLINE;
- pblk_mark_bb(pblk, line, rqd->ppa_addr);
- } else {
- trace_pblk_chunk_reset(pblk_disk_name(pblk),
- &rqd->ppa_addr, PBLK_CHUNK_RESET_DONE);
-
- chunk->state = NVM_CHK_ST_FREE;
- }
-
- trace_pblk_chunk_state(pblk_disk_name(pblk), &rqd->ppa_addr,
- chunk->state);
-
- atomic_dec(&pblk->inflight_io);
-}
-
-/* Erase completion assumes that only one block is erased at the time */
-static void pblk_end_io_erase(struct nvm_rq *rqd)
-{
- struct pblk *pblk = rqd->private;
-
- __pblk_end_io_erase(pblk, rqd);
- mempool_free(rqd, &pblk->e_rq_pool);
-}
-
-/*
- * Get information for all chunks from the device.
- *
- * The caller is responsible for freeing (vmalloc) the returned structure
- */
-struct nvm_chk_meta *pblk_get_chunk_meta(struct pblk *pblk)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct nvm_chk_meta *meta;
- struct ppa_addr ppa;
- unsigned long len;
- int ret;
-
- ppa.ppa = 0;
-
- len = geo->all_chunks * sizeof(*meta);
- meta = vzalloc(len);
- if (!meta)
- return ERR_PTR(-ENOMEM);
-
- ret = nvm_get_chunk_meta(dev, ppa, geo->all_chunks, meta);
- if (ret) {
- vfree(meta);
- return ERR_PTR(-EIO);
- }
-
- return meta;
-}
-
-struct nvm_chk_meta *pblk_chunk_get_off(struct pblk *pblk,
- struct nvm_chk_meta *meta,
- struct ppa_addr ppa)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int ch_off = ppa.m.grp * geo->num_chk * geo->num_lun;
- int lun_off = ppa.m.pu * geo->num_chk;
- int chk_off = ppa.m.chk;
-
- return meta + ch_off + lun_off + chk_off;
-}
-
-void __pblk_map_invalidate(struct pblk *pblk, struct pblk_line *line,
- u64 paddr)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct list_head *move_list = NULL;
-
- /* Lines being reclaimed (GC'ed) cannot be invalidated. Before the L2P
- * table is modified with reclaimed sectors, a check is done to endure
- * that newer updates are not overwritten.
- */
- spin_lock(&line->lock);
- WARN_ON(line->state == PBLK_LINESTATE_FREE);
-
- if (test_and_set_bit(paddr, line->invalid_bitmap)) {
- WARN_ONCE(1, "pblk: double invalidate\n");
- spin_unlock(&line->lock);
- return;
- }
- le32_add_cpu(line->vsc, -1);
-
- if (line->state == PBLK_LINESTATE_CLOSED)
- move_list = pblk_line_gc_list(pblk, line);
- spin_unlock(&line->lock);
-
- if (move_list) {
- spin_lock(&l_mg->gc_lock);
- spin_lock(&line->lock);
- /* Prevent moving a line that has just been chosen for GC */
- if (line->state == PBLK_LINESTATE_GC) {
- spin_unlock(&line->lock);
- spin_unlock(&l_mg->gc_lock);
- return;
- }
- spin_unlock(&line->lock);
-
- list_move_tail(&line->list, move_list);
- spin_unlock(&l_mg->gc_lock);
- }
-}
-
-void pblk_map_invalidate(struct pblk *pblk, struct ppa_addr ppa)
-{
- struct pblk_line *line;
- u64 paddr;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Callers must ensure that the ppa points to a device address */
- BUG_ON(pblk_addr_in_cache(ppa));
- BUG_ON(pblk_ppa_empty(ppa));
-#endif
-
- line = pblk_ppa_to_line(pblk, ppa);
- paddr = pblk_dev_ppa_to_line_addr(pblk, ppa);
-
- __pblk_map_invalidate(pblk, line, paddr);
-}
-
-static void pblk_invalidate_range(struct pblk *pblk, sector_t slba,
- unsigned int nr_secs)
-{
- sector_t lba;
-
- spin_lock(&pblk->trans_lock);
- for (lba = slba; lba < slba + nr_secs; lba++) {
- struct ppa_addr ppa;
-
- ppa = pblk_trans_map_get(pblk, lba);
-
- if (!pblk_addr_in_cache(ppa) && !pblk_ppa_empty(ppa))
- pblk_map_invalidate(pblk, ppa);
-
- pblk_ppa_set_empty(&ppa);
- pblk_trans_map_set(pblk, lba, ppa);
- }
- spin_unlock(&pblk->trans_lock);
-}
-
-int pblk_alloc_rqd_meta(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
-
- rqd->meta_list = nvm_dev_dma_alloc(dev->parent, GFP_KERNEL,
- &rqd->dma_meta_list);
- if (!rqd->meta_list)
- return -ENOMEM;
-
- if (rqd->nr_ppas == 1)
- return 0;
-
- rqd->ppa_list = rqd->meta_list + pblk_dma_meta_size(pblk);
- rqd->dma_ppa_list = rqd->dma_meta_list + pblk_dma_meta_size(pblk);
-
- return 0;
-}
-
-void pblk_free_rqd_meta(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
-
- if (rqd->meta_list)
- nvm_dev_dma_free(dev->parent, rqd->meta_list,
- rqd->dma_meta_list);
-}
-
-/* Caller must guarantee that the request is a valid type */
-struct nvm_rq *pblk_alloc_rqd(struct pblk *pblk, int type)
-{
- mempool_t *pool;
- struct nvm_rq *rqd;
- int rq_size;
-
- switch (type) {
- case PBLK_WRITE:
- case PBLK_WRITE_INT:
- pool = &pblk->w_rq_pool;
- rq_size = pblk_w_rq_size;
- break;
- case PBLK_READ:
- pool = &pblk->r_rq_pool;
- rq_size = pblk_g_rq_size;
- break;
- default:
- pool = &pblk->e_rq_pool;
- rq_size = pblk_g_rq_size;
- }
-
- rqd = mempool_alloc(pool, GFP_KERNEL);
- memset(rqd, 0, rq_size);
-
- return rqd;
-}
-
-/* Typically used on completion path. Cannot guarantee request consistency */
-void pblk_free_rqd(struct pblk *pblk, struct nvm_rq *rqd, int type)
-{
- mempool_t *pool;
-
- switch (type) {
- case PBLK_WRITE:
- kfree(((struct pblk_c_ctx *)nvm_rq_to_pdu(rqd))->lun_bitmap);
- fallthrough;
- case PBLK_WRITE_INT:
- pool = &pblk->w_rq_pool;
- break;
- case PBLK_READ:
- pool = &pblk->r_rq_pool;
- break;
- case PBLK_ERASE:
- pool = &pblk->e_rq_pool;
- break;
- default:
- pblk_err(pblk, "trying to free unknown rqd type\n");
- return;
- }
-
- pblk_free_rqd_meta(pblk, rqd);
- mempool_free(rqd, pool);
-}
-
-void pblk_bio_free_pages(struct pblk *pblk, struct bio *bio, int off,
- int nr_pages)
-{
- struct bio_vec *bv;
- struct page *page;
- int i, e, nbv = 0;
-
- for (i = 0; i < bio->bi_vcnt; i++) {
- bv = &bio->bi_io_vec[i];
- page = bv->bv_page;
- for (e = 0; e < bv->bv_len; e += PBLK_EXPOSED_PAGE_SIZE, nbv++)
- if (nbv >= off)
- mempool_free(page++, &pblk->page_bio_pool);
- }
-}
-
-int pblk_bio_add_pages(struct pblk *pblk, struct bio *bio, gfp_t flags,
- int nr_pages)
-{
- struct request_queue *q = pblk->dev->q;
- struct page *page;
- int i, ret;
-
- for (i = 0; i < nr_pages; i++) {
- page = mempool_alloc(&pblk->page_bio_pool, flags);
-
- ret = bio_add_pc_page(q, bio, page, PBLK_EXPOSED_PAGE_SIZE, 0);
- if (ret != PBLK_EXPOSED_PAGE_SIZE) {
- pblk_err(pblk, "could not add page to bio\n");
- mempool_free(page, &pblk->page_bio_pool);
- goto err;
- }
- }
-
- return 0;
-err:
- pblk_bio_free_pages(pblk, bio, (bio->bi_vcnt - i), i);
- return -1;
-}
-
-void pblk_write_kick(struct pblk *pblk)
-{
- wake_up_process(pblk->writer_ts);
- mod_timer(&pblk->wtimer, jiffies + msecs_to_jiffies(1000));
-}
-
-void pblk_write_timer_fn(struct timer_list *t)
-{
- struct pblk *pblk = from_timer(pblk, t, wtimer);
-
- /* kick the write thread every tick to flush outstanding data */
- pblk_write_kick(pblk);
-}
-
-void pblk_write_should_kick(struct pblk *pblk)
-{
- unsigned int secs_avail = pblk_rb_read_count(&pblk->rwb);
-
- if (secs_avail >= pblk->min_write_pgs_data)
- pblk_write_kick(pblk);
-}
-
-static void pblk_wait_for_meta(struct pblk *pblk)
-{
- do {
- if (!atomic_read(&pblk->inflight_io))
- break;
-
- schedule();
- } while (1);
-}
-
-static void pblk_flush_writer(struct pblk *pblk)
-{
- pblk_rb_flush(&pblk->rwb);
- do {
- if (!pblk_rb_sync_count(&pblk->rwb))
- break;
-
- pblk_write_kick(pblk);
- schedule();
- } while (1);
-}
-
-struct list_head *pblk_line_gc_list(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct list_head *move_list = NULL;
- int packed_meta = (le32_to_cpu(*line->vsc) / pblk->min_write_pgs_data)
- * (pblk->min_write_pgs - pblk->min_write_pgs_data);
- int vsc = le32_to_cpu(*line->vsc) + packed_meta;
-
- lockdep_assert_held(&line->lock);
-
- if (line->w_err_gc->has_write_err) {
- if (line->gc_group != PBLK_LINEGC_WERR) {
- line->gc_group = PBLK_LINEGC_WERR;
- move_list = &l_mg->gc_werr_list;
- pblk_rl_werr_line_in(&pblk->rl);
- }
- } else if (!vsc) {
- if (line->gc_group != PBLK_LINEGC_FULL) {
- line->gc_group = PBLK_LINEGC_FULL;
- move_list = &l_mg->gc_full_list;
- }
- } else if (vsc < lm->high_thrs) {
- if (line->gc_group != PBLK_LINEGC_HIGH) {
- line->gc_group = PBLK_LINEGC_HIGH;
- move_list = &l_mg->gc_high_list;
- }
- } else if (vsc < lm->mid_thrs) {
- if (line->gc_group != PBLK_LINEGC_MID) {
- line->gc_group = PBLK_LINEGC_MID;
- move_list = &l_mg->gc_mid_list;
- }
- } else if (vsc < line->sec_in_line) {
- if (line->gc_group != PBLK_LINEGC_LOW) {
- line->gc_group = PBLK_LINEGC_LOW;
- move_list = &l_mg->gc_low_list;
- }
- } else if (vsc == line->sec_in_line) {
- if (line->gc_group != PBLK_LINEGC_EMPTY) {
- line->gc_group = PBLK_LINEGC_EMPTY;
- move_list = &l_mg->gc_empty_list;
- }
- } else {
- line->state = PBLK_LINESTATE_CORRUPT;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
-
- line->gc_group = PBLK_LINEGC_NONE;
- move_list = &l_mg->corrupt_list;
- pblk_err(pblk, "corrupted vsc for line %d, vsc:%d (%d/%d/%d)\n",
- line->id, vsc,
- line->sec_in_line,
- lm->high_thrs, lm->mid_thrs);
- }
-
- return move_list;
-}
-
-void pblk_discard(struct pblk *pblk, struct bio *bio)
-{
- sector_t slba = pblk_get_lba(bio);
- sector_t nr_secs = pblk_get_secs(bio);
-
- pblk_invalidate_range(pblk, slba, nr_secs);
-}
-
-void pblk_log_write_err(struct pblk *pblk, struct nvm_rq *rqd)
-{
- atomic_long_inc(&pblk->write_failed);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- pblk_print_failed_rqd(pblk, rqd, rqd->error);
-#endif
-}
-
-void pblk_log_read_err(struct pblk *pblk, struct nvm_rq *rqd)
-{
- /* Empty page read is not necessarily an error (e.g., L2P recovery) */
- if (rqd->error == NVM_RSP_ERR_EMPTYPAGE) {
- atomic_long_inc(&pblk->read_empty);
- return;
- }
-
- switch (rqd->error) {
- case NVM_RSP_WARN_HIGHECC:
- atomic_long_inc(&pblk->read_high_ecc);
- break;
- case NVM_RSP_ERR_FAILECC:
- case NVM_RSP_ERR_FAILCRC:
- atomic_long_inc(&pblk->read_failed);
- break;
- default:
- pblk_err(pblk, "unknown read error:%d\n", rqd->error);
- }
-#ifdef CONFIG_NVM_PBLK_DEBUG
- pblk_print_failed_rqd(pblk, rqd, rqd->error);
-#endif
-}
-
-void pblk_set_sec_per_write(struct pblk *pblk, int sec_per_write)
-{
- pblk->sec_per_write = sec_per_write;
-}
-
-int pblk_submit_io(struct pblk *pblk, struct nvm_rq *rqd, void *buf)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
-
- atomic_inc(&pblk->inflight_io);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- if (pblk_check_io(pblk, rqd))
- return NVM_IO_ERR;
-#endif
-
- return nvm_submit_io(dev, rqd, buf);
-}
-
-void pblk_check_chunk_state_update(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
-
- int i;
-
- for (i = 0; i < rqd->nr_ppas; i++) {
- struct ppa_addr *ppa = &ppa_list[i];
- struct nvm_chk_meta *chunk = pblk_dev_ppa_to_chunk(pblk, *ppa);
- u64 caddr = pblk_dev_ppa_to_chunk_addr(pblk, *ppa);
-
- if (caddr == 0)
- trace_pblk_chunk_state(pblk_disk_name(pblk),
- ppa, NVM_CHK_ST_OPEN);
- else if (caddr == (chunk->cnlb - 1))
- trace_pblk_chunk_state(pblk_disk_name(pblk),
- ppa, NVM_CHK_ST_CLOSED);
- }
-}
-
-int pblk_submit_io_sync(struct pblk *pblk, struct nvm_rq *rqd, void *buf)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- int ret;
-
- atomic_inc(&pblk->inflight_io);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- if (pblk_check_io(pblk, rqd))
- return NVM_IO_ERR;
-#endif
-
- ret = nvm_submit_io_sync(dev, rqd, buf);
-
- if (trace_pblk_chunk_state_enabled() && !ret &&
- rqd->opcode == NVM_OP_PWRITE)
- pblk_check_chunk_state_update(pblk, rqd);
-
- return ret;
-}
-
-static int pblk_submit_io_sync_sem(struct pblk *pblk, struct nvm_rq *rqd,
- void *buf)
-{
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
- int ret;
-
- pblk_down_chunk(pblk, ppa_list[0]);
- ret = pblk_submit_io_sync(pblk, rqd, buf);
- pblk_up_chunk(pblk, ppa_list[0]);
-
- return ret;
-}
-
-int pblk_calc_secs(struct pblk *pblk, unsigned long secs_avail,
- unsigned long secs_to_flush, bool skip_meta)
-{
- int max = pblk->sec_per_write;
- int min = pblk->min_write_pgs;
- int secs_to_sync = 0;
-
- if (skip_meta && pblk->min_write_pgs_data != pblk->min_write_pgs)
- min = max = pblk->min_write_pgs_data;
-
- if (secs_avail >= max)
- secs_to_sync = max;
- else if (secs_avail >= min)
- secs_to_sync = min * (secs_avail / min);
- else if (secs_to_flush)
- secs_to_sync = min;
-
- return secs_to_sync;
-}
-
-void pblk_dealloc_page(struct pblk *pblk, struct pblk_line *line, int nr_secs)
-{
- u64 addr;
- int i;
-
- spin_lock(&line->lock);
- addr = find_next_zero_bit(line->map_bitmap,
- pblk->lm.sec_per_line, line->cur_sec);
- line->cur_sec = addr - nr_secs;
-
- for (i = 0; i < nr_secs; i++, line->cur_sec--)
- WARN_ON(!test_and_clear_bit(line->cur_sec, line->map_bitmap));
- spin_unlock(&line->lock);
-}
-
-u64 __pblk_alloc_page(struct pblk *pblk, struct pblk_line *line, int nr_secs)
-{
- u64 addr;
- int i;
-
- lockdep_assert_held(&line->lock);
-
- /* logic error: ppa out-of-bounds. Prevent generating bad address */
- if (line->cur_sec + nr_secs > pblk->lm.sec_per_line) {
- WARN(1, "pblk: page allocation out of bounds\n");
- nr_secs = pblk->lm.sec_per_line - line->cur_sec;
- }
-
- line->cur_sec = addr = find_next_zero_bit(line->map_bitmap,
- pblk->lm.sec_per_line, line->cur_sec);
- for (i = 0; i < nr_secs; i++, line->cur_sec++)
- WARN_ON(test_and_set_bit(line->cur_sec, line->map_bitmap));
-
- return addr;
-}
-
-u64 pblk_alloc_page(struct pblk *pblk, struct pblk_line *line, int nr_secs)
-{
- u64 addr;
-
- /* Lock needed in case a write fails and a recovery needs to remap
- * failed write buffer entries
- */
- spin_lock(&line->lock);
- addr = __pblk_alloc_page(pblk, line, nr_secs);
- line->left_msecs -= nr_secs;
- WARN(line->left_msecs < 0, "pblk: page allocation out of bounds\n");
- spin_unlock(&line->lock);
-
- return addr;
-}
-
-u64 pblk_lookup_page(struct pblk *pblk, struct pblk_line *line)
-{
- u64 paddr;
-
- spin_lock(&line->lock);
- paddr = find_next_zero_bit(line->map_bitmap,
- pblk->lm.sec_per_line, line->cur_sec);
- spin_unlock(&line->lock);
-
- return paddr;
-}
-
-u64 pblk_line_smeta_start(struct pblk *pblk, struct pblk_line *line)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- int bit;
-
- /* This usually only happens on bad lines */
- bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
- if (bit >= lm->blk_per_line)
- return -1;
-
- return bit * geo->ws_opt;
-}
-
-int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct ppa_addr *ppa_list;
- struct nvm_rq rqd;
- u64 paddr = pblk_line_smeta_start(pblk, line);
- int i, ret;
-
- memset(&rqd, 0, sizeof(struct nvm_rq));
-
- ret = pblk_alloc_rqd_meta(pblk, &rqd);
- if (ret)
- return ret;
-
- rqd.opcode = NVM_OP_PREAD;
- rqd.nr_ppas = lm->smeta_sec;
- rqd.is_seq = 1;
- ppa_list = nvm_rq_to_ppa_list(&rqd);
-
- for (i = 0; i < lm->smeta_sec; i++, paddr++)
- ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line->id);
-
- ret = pblk_submit_io_sync(pblk, &rqd, line->smeta);
- if (ret) {
- pblk_err(pblk, "smeta I/O submission failed: %d\n", ret);
- goto clear_rqd;
- }
-
- atomic_dec(&pblk->inflight_io);
-
- if (rqd.error && rqd.error != NVM_RSP_WARN_HIGHECC) {
- pblk_log_read_err(pblk, &rqd);
- ret = -EIO;
- }
-
-clear_rqd:
- pblk_free_rqd_meta(pblk, &rqd);
- return ret;
-}
-
-static int pblk_line_smeta_write(struct pblk *pblk, struct pblk_line *line,
- u64 paddr)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct ppa_addr *ppa_list;
- struct nvm_rq rqd;
- __le64 *lba_list = emeta_to_lbas(pblk, line->emeta->buf);
- __le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
- int i, ret;
-
- memset(&rqd, 0, sizeof(struct nvm_rq));
-
- ret = pblk_alloc_rqd_meta(pblk, &rqd);
- if (ret)
- return ret;
-
- rqd.opcode = NVM_OP_PWRITE;
- rqd.nr_ppas = lm->smeta_sec;
- rqd.is_seq = 1;
- ppa_list = nvm_rq_to_ppa_list(&rqd);
-
- for (i = 0; i < lm->smeta_sec; i++, paddr++) {
- struct pblk_sec_meta *meta = pblk_get_meta(pblk,
- rqd.meta_list, i);
-
- ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line->id);
- meta->lba = lba_list[paddr] = addr_empty;
- }
-
- ret = pblk_submit_io_sync_sem(pblk, &rqd, line->smeta);
- if (ret) {
- pblk_err(pblk, "smeta I/O submission failed: %d\n", ret);
- goto clear_rqd;
- }
-
- atomic_dec(&pblk->inflight_io);
-
- if (rqd.error) {
- pblk_log_write_err(pblk, &rqd);
- ret = -EIO;
- }
-
-clear_rqd:
- pblk_free_rqd_meta(pblk, &rqd);
- return ret;
-}
-
-int pblk_line_emeta_read(struct pblk *pblk, struct pblk_line *line,
- void *emeta_buf)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- void *ppa_list_buf, *meta_list;
- struct ppa_addr *ppa_list;
- struct nvm_rq rqd;
- u64 paddr = line->emeta_ssec;
- dma_addr_t dma_ppa_list, dma_meta_list;
- int min = pblk->min_write_pgs;
- int left_ppas = lm->emeta_sec[0];
- int line_id = line->id;
- int rq_ppas, rq_len;
- int i, j;
- int ret;
-
- meta_list = nvm_dev_dma_alloc(dev->parent, GFP_KERNEL,
- &dma_meta_list);
- if (!meta_list)
- return -ENOMEM;
-
- ppa_list_buf = meta_list + pblk_dma_meta_size(pblk);
- dma_ppa_list = dma_meta_list + pblk_dma_meta_size(pblk);
-
-next_rq:
- memset(&rqd, 0, sizeof(struct nvm_rq));
-
- rq_ppas = pblk_calc_secs(pblk, left_ppas, 0, false);
- rq_len = rq_ppas * geo->csecs;
-
- rqd.meta_list = meta_list;
- rqd.ppa_list = ppa_list_buf;
- rqd.dma_meta_list = dma_meta_list;
- rqd.dma_ppa_list = dma_ppa_list;
- rqd.opcode = NVM_OP_PREAD;
- rqd.nr_ppas = rq_ppas;
- ppa_list = nvm_rq_to_ppa_list(&rqd);
-
- for (i = 0; i < rqd.nr_ppas; ) {
- struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, line_id);
- int pos = pblk_ppa_to_pos(geo, ppa);
-
- if (pblk_io_aligned(pblk, rq_ppas))
- rqd.is_seq = 1;
-
- while (test_bit(pos, line->blk_bitmap)) {
- paddr += min;
- if (pblk_boundary_paddr_checks(pblk, paddr)) {
- ret = -EINTR;
- goto free_rqd_dma;
- }
-
- ppa = addr_to_gen_ppa(pblk, paddr, line_id);
- pos = pblk_ppa_to_pos(geo, ppa);
- }
-
- if (pblk_boundary_paddr_checks(pblk, paddr + min)) {
- ret = -EINTR;
- goto free_rqd_dma;
- }
-
- for (j = 0; j < min; j++, i++, paddr++)
- ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line_id);
- }
-
- ret = pblk_submit_io_sync(pblk, &rqd, emeta_buf);
- if (ret) {
- pblk_err(pblk, "emeta I/O submission failed: %d\n", ret);
- goto free_rqd_dma;
- }
-
- atomic_dec(&pblk->inflight_io);
-
- if (rqd.error && rqd.error != NVM_RSP_WARN_HIGHECC) {
- pblk_log_read_err(pblk, &rqd);
- ret = -EIO;
- goto free_rqd_dma;
- }
-
- emeta_buf += rq_len;
- left_ppas -= rq_ppas;
- if (left_ppas)
- goto next_rq;
-
-free_rqd_dma:
- nvm_dev_dma_free(dev->parent, rqd.meta_list, rqd.dma_meta_list);
- return ret;
-}
-
-static void pblk_setup_e_rq(struct pblk *pblk, struct nvm_rq *rqd,
- struct ppa_addr ppa)
-{
- rqd->opcode = NVM_OP_ERASE;
- rqd->ppa_addr = ppa;
- rqd->nr_ppas = 1;
- rqd->is_seq = 1;
- rqd->bio = NULL;
-}
-
-static int pblk_blk_erase_sync(struct pblk *pblk, struct ppa_addr ppa)
-{
- struct nvm_rq rqd = {NULL};
- int ret;
-
- trace_pblk_chunk_reset(pblk_disk_name(pblk), &ppa,
- PBLK_CHUNK_RESET_START);
-
- pblk_setup_e_rq(pblk, &rqd, ppa);
-
- /* The write thread schedules erases so that it minimizes disturbances
- * with writes. Thus, there is no need to take the LUN semaphore.
- */
- ret = pblk_submit_io_sync(pblk, &rqd, NULL);
- rqd.private = pblk;
- __pblk_end_io_erase(pblk, &rqd);
-
- return ret;
-}
-
-int pblk_line_erase(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct ppa_addr ppa;
- int ret, bit = -1;
-
- /* Erase only good blocks, one at a time */
- do {
- spin_lock(&line->lock);
- bit = find_next_zero_bit(line->erase_bitmap, lm->blk_per_line,
- bit + 1);
- if (bit >= lm->blk_per_line) {
- spin_unlock(&line->lock);
- break;
- }
-
- ppa = pblk->luns[bit].bppa; /* set ch and lun */
- ppa.a.blk = line->id;
-
- atomic_dec(&line->left_eblks);
- WARN_ON(test_and_set_bit(bit, line->erase_bitmap));
- spin_unlock(&line->lock);
-
- ret = pblk_blk_erase_sync(pblk, ppa);
- if (ret) {
- pblk_err(pblk, "failed to erase line %d\n", line->id);
- return ret;
- }
- } while (1);
-
- return 0;
-}
-
-static void pblk_line_setup_metadata(struct pblk_line *line,
- struct pblk_line_mgmt *l_mg,
- struct pblk_line_meta *lm)
-{
- int meta_line;
-
- lockdep_assert_held(&l_mg->free_lock);
-
-retry_meta:
- meta_line = find_first_zero_bit(&l_mg->meta_bitmap, PBLK_DATA_LINES);
- if (meta_line == PBLK_DATA_LINES) {
- spin_unlock(&l_mg->free_lock);
- io_schedule();
- spin_lock(&l_mg->free_lock);
- goto retry_meta;
- }
-
- set_bit(meta_line, &l_mg->meta_bitmap);
- line->meta_line = meta_line;
-
- line->smeta = l_mg->sline_meta[meta_line];
- line->emeta = l_mg->eline_meta[meta_line];
-
- memset(line->smeta, 0, lm->smeta_len);
- memset(line->emeta->buf, 0, lm->emeta_len[0]);
-
- line->emeta->mem = 0;
- atomic_set(&line->emeta->sync, 0);
-}
-
-/* For now lines are always assumed full lines. Thus, smeta former and current
- * lun bitmaps are omitted.
- */
-static int pblk_line_init_metadata(struct pblk *pblk, struct pblk_line *line,
- struct pblk_line *cur)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_emeta *emeta = line->emeta;
- struct line_emeta *emeta_buf = emeta->buf;
- struct line_smeta *smeta_buf = (struct line_smeta *)line->smeta;
- int nr_blk_line;
-
- /* After erasing the line, new bad blocks might appear and we risk
- * having an invalid line
- */
- nr_blk_line = lm->blk_per_line -
- bitmap_weight(line->blk_bitmap, lm->blk_per_line);
- if (nr_blk_line < lm->min_blk_line) {
- spin_lock(&l_mg->free_lock);
- spin_lock(&line->lock);
- line->state = PBLK_LINESTATE_BAD;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- spin_unlock(&line->lock);
-
- list_add_tail(&line->list, &l_mg->bad_list);
- spin_unlock(&l_mg->free_lock);
-
- pblk_debug(pblk, "line %d is bad\n", line->id);
-
- return 0;
- }
-
- /* Run-time metadata */
- line->lun_bitmap = ((void *)(smeta_buf)) + sizeof(struct line_smeta);
-
- /* Mark LUNs allocated in this line (all for now) */
- bitmap_set(line->lun_bitmap, 0, lm->lun_bitmap_len);
-
- smeta_buf->header.identifier = cpu_to_le32(PBLK_MAGIC);
- export_guid(smeta_buf->header.uuid, &pblk->instance_uuid);
- smeta_buf->header.id = cpu_to_le32(line->id);
- smeta_buf->header.type = cpu_to_le16(line->type);
- smeta_buf->header.version_major = SMETA_VERSION_MAJOR;
- smeta_buf->header.version_minor = SMETA_VERSION_MINOR;
-
- /* Start metadata */
- smeta_buf->seq_nr = cpu_to_le64(line->seq_nr);
- smeta_buf->window_wr_lun = cpu_to_le32(geo->all_luns);
-
- /* Fill metadata among lines */
- if (cur) {
- memcpy(line->lun_bitmap, cur->lun_bitmap, lm->lun_bitmap_len);
- smeta_buf->prev_id = cpu_to_le32(cur->id);
- cur->emeta->buf->next_id = cpu_to_le32(line->id);
- } else {
- smeta_buf->prev_id = cpu_to_le32(PBLK_LINE_EMPTY);
- }
-
- /* All smeta must be set at this point */
- smeta_buf->header.crc = cpu_to_le32(
- pblk_calc_meta_header_crc(pblk, &smeta_buf->header));
- smeta_buf->crc = cpu_to_le32(pblk_calc_smeta_crc(pblk, smeta_buf));
-
- /* End metadata */
- memcpy(&emeta_buf->header, &smeta_buf->header,
- sizeof(struct line_header));
-
- emeta_buf->header.version_major = EMETA_VERSION_MAJOR;
- emeta_buf->header.version_minor = EMETA_VERSION_MINOR;
- emeta_buf->header.crc = cpu_to_le32(
- pblk_calc_meta_header_crc(pblk, &emeta_buf->header));
-
- emeta_buf->seq_nr = cpu_to_le64(line->seq_nr);
- emeta_buf->nr_lbas = cpu_to_le64(line->sec_in_line);
- emeta_buf->nr_valid_lbas = cpu_to_le64(0);
- emeta_buf->next_id = cpu_to_le32(PBLK_LINE_EMPTY);
- emeta_buf->crc = cpu_to_le32(0);
- emeta_buf->prev_id = smeta_buf->prev_id;
-
- return 1;
-}
-
-static int pblk_line_alloc_bitmaps(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
-
- line->map_bitmap = mempool_alloc(l_mg->bitmap_pool, GFP_KERNEL);
- if (!line->map_bitmap)
- return -ENOMEM;
-
- memset(line->map_bitmap, 0, lm->sec_bitmap_len);
-
- /* will be initialized using bb info from map_bitmap */
- line->invalid_bitmap = mempool_alloc(l_mg->bitmap_pool, GFP_KERNEL);
- if (!line->invalid_bitmap) {
- mempool_free(line->map_bitmap, l_mg->bitmap_pool);
- line->map_bitmap = NULL;
- return -ENOMEM;
- }
-
- return 0;
-}
-
-/* For now lines are always assumed full lines. Thus, smeta former and current
- * lun bitmaps are omitted.
- */
-static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
- int init)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- u64 off;
- int bit = -1;
- int emeta_secs;
-
- line->sec_in_line = lm->sec_per_line;
-
- /* Capture bad block information on line mapping bitmaps */
- while ((bit = find_next_bit(line->blk_bitmap, lm->blk_per_line,
- bit + 1)) < lm->blk_per_line) {
- off = bit * geo->ws_opt;
- bitmap_shift_left(l_mg->bb_aux, l_mg->bb_template, off,
- lm->sec_per_line);
- bitmap_or(line->map_bitmap, line->map_bitmap, l_mg->bb_aux,
- lm->sec_per_line);
- line->sec_in_line -= geo->clba;
- }
-
- /* Mark smeta metadata sectors as bad sectors */
- bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
- off = bit * geo->ws_opt;
- bitmap_set(line->map_bitmap, off, lm->smeta_sec);
- line->sec_in_line -= lm->smeta_sec;
- line->cur_sec = off + lm->smeta_sec;
-
- if (init && pblk_line_smeta_write(pblk, line, off)) {
- pblk_debug(pblk, "line smeta I/O failed. Retry\n");
- return 0;
- }
-
- bitmap_copy(line->invalid_bitmap, line->map_bitmap, lm->sec_per_line);
-
- /* Mark emeta metadata sectors as bad sectors. We need to consider bad
- * blocks to make sure that there are enough sectors to store emeta
- */
- emeta_secs = lm->emeta_sec[0];
- off = lm->sec_per_line;
- while (emeta_secs) {
- off -= geo->ws_opt;
- if (!test_bit(off, line->invalid_bitmap)) {
- bitmap_set(line->invalid_bitmap, off, geo->ws_opt);
- emeta_secs -= geo->ws_opt;
- }
- }
-
- line->emeta_ssec = off;
- line->sec_in_line -= lm->emeta_sec[0];
- line->nr_valid_lbas = 0;
- line->left_msecs = line->sec_in_line;
- *line->vsc = cpu_to_le32(line->sec_in_line);
-
- if (lm->sec_per_line - line->sec_in_line !=
- bitmap_weight(line->invalid_bitmap, lm->sec_per_line)) {
- spin_lock(&line->lock);
- line->state = PBLK_LINESTATE_BAD;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- spin_unlock(&line->lock);
-
- list_add_tail(&line->list, &l_mg->bad_list);
- pblk_err(pblk, "unexpected line %d is bad\n", line->id);
-
- return 0;
- }
-
- return 1;
-}
-
-static int pblk_prepare_new_line(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int blk_to_erase = atomic_read(&line->blk_in_line);
- int i;
-
- for (i = 0; i < lm->blk_per_line; i++) {
- struct pblk_lun *rlun = &pblk->luns[i];
- int pos = pblk_ppa_to_pos(geo, rlun->bppa);
- int state = line->chks[pos].state;
-
- /* Free chunks should not be erased */
- if (state & NVM_CHK_ST_FREE) {
- set_bit(pblk_ppa_to_pos(geo, rlun->bppa),
- line->erase_bitmap);
- blk_to_erase--;
- }
- }
-
- return blk_to_erase;
-}
-
-static int pblk_line_prepare(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- int blk_in_line = atomic_read(&line->blk_in_line);
- int blk_to_erase;
-
- /* Bad blocks do not need to be erased */
- bitmap_copy(line->erase_bitmap, line->blk_bitmap, lm->blk_per_line);
-
- spin_lock(&line->lock);
-
- /* If we have not written to this line, we need to mark up free chunks
- * as already erased
- */
- if (line->state == PBLK_LINESTATE_NEW) {
- blk_to_erase = pblk_prepare_new_line(pblk, line);
- line->state = PBLK_LINESTATE_FREE;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- } else {
- blk_to_erase = blk_in_line;
- }
-
- if (blk_in_line < lm->min_blk_line) {
- spin_unlock(&line->lock);
- return -EAGAIN;
- }
-
- if (line->state != PBLK_LINESTATE_FREE) {
- WARN(1, "pblk: corrupted line %d, state %d\n",
- line->id, line->state);
- spin_unlock(&line->lock);
- return -EINTR;
- }
-
- line->state = PBLK_LINESTATE_OPEN;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
-
- atomic_set(&line->left_eblks, blk_to_erase);
- atomic_set(&line->left_seblks, blk_to_erase);
-
- line->meta_distance = lm->meta_distance;
- spin_unlock(&line->lock);
-
- kref_init(&line->ref);
- atomic_set(&line->sec_to_update, 0);
-
- return 0;
-}
-
-/* Line allocations in the recovery path are always single threaded */
-int pblk_line_recov_alloc(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- int ret;
-
- spin_lock(&l_mg->free_lock);
- l_mg->data_line = line;
- list_del(&line->list);
-
- ret = pblk_line_prepare(pblk, line);
- if (ret) {
- list_add(&line->list, &l_mg->free_list);
- spin_unlock(&l_mg->free_lock);
- return ret;
- }
- spin_unlock(&l_mg->free_lock);
-
- ret = pblk_line_alloc_bitmaps(pblk, line);
- if (ret)
- goto fail;
-
- if (!pblk_line_init_bb(pblk, line, 0)) {
- ret = -EINTR;
- goto fail;
- }
-
- pblk_rl_free_lines_dec(&pblk->rl, line, true);
- return 0;
-
-fail:
- spin_lock(&l_mg->free_lock);
- list_add(&line->list, &l_mg->free_list);
- spin_unlock(&l_mg->free_lock);
-
- return ret;
-}
-
-void pblk_line_recov_close(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
-
- mempool_free(line->map_bitmap, l_mg->bitmap_pool);
- line->map_bitmap = NULL;
- line->smeta = NULL;
- line->emeta = NULL;
-}
-
-static void pblk_line_reinit(struct pblk_line *line)
-{
- *line->vsc = cpu_to_le32(EMPTY_ENTRY);
-
- line->map_bitmap = NULL;
- line->invalid_bitmap = NULL;
- line->smeta = NULL;
- line->emeta = NULL;
-}
-
-void pblk_line_free(struct pblk_line *line)
-{
- struct pblk *pblk = line->pblk;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
-
- mempool_free(line->map_bitmap, l_mg->bitmap_pool);
- mempool_free(line->invalid_bitmap, l_mg->bitmap_pool);
-
- pblk_line_reinit(line);
-}
-
-struct pblk_line *pblk_line_get(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line *line;
- int ret, bit;
-
- lockdep_assert_held(&l_mg->free_lock);
-
-retry:
- if (list_empty(&l_mg->free_list)) {
- pblk_err(pblk, "no free lines\n");
- return NULL;
- }
-
- line = list_first_entry(&l_mg->free_list, struct pblk_line, list);
- list_del(&line->list);
- l_mg->nr_free_lines--;
-
- bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
- if (unlikely(bit >= lm->blk_per_line)) {
- spin_lock(&line->lock);
- line->state = PBLK_LINESTATE_BAD;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- spin_unlock(&line->lock);
-
- list_add_tail(&line->list, &l_mg->bad_list);
-
- pblk_debug(pblk, "line %d is bad\n", line->id);
- goto retry;
- }
-
- ret = pblk_line_prepare(pblk, line);
- if (ret) {
- switch (ret) {
- case -EAGAIN:
- list_add(&line->list, &l_mg->bad_list);
- goto retry;
- case -EINTR:
- list_add(&line->list, &l_mg->corrupt_list);
- goto retry;
- default:
- pblk_err(pblk, "failed to prepare line %d\n", line->id);
- list_add(&line->list, &l_mg->free_list);
- l_mg->nr_free_lines++;
- return NULL;
- }
- }
-
- return line;
-}
-
-static struct pblk_line *pblk_line_retry(struct pblk *pblk,
- struct pblk_line *line)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *retry_line;
-
-retry:
- spin_lock(&l_mg->free_lock);
- retry_line = pblk_line_get(pblk);
- if (!retry_line) {
- l_mg->data_line = NULL;
- spin_unlock(&l_mg->free_lock);
- return NULL;
- }
-
- retry_line->map_bitmap = line->map_bitmap;
- retry_line->invalid_bitmap = line->invalid_bitmap;
- retry_line->smeta = line->smeta;
- retry_line->emeta = line->emeta;
- retry_line->meta_line = line->meta_line;
-
- pblk_line_reinit(line);
-
- l_mg->data_line = retry_line;
- spin_unlock(&l_mg->free_lock);
-
- pblk_rl_free_lines_dec(&pblk->rl, line, false);
-
- if (pblk_line_erase(pblk, retry_line))
- goto retry;
-
- return retry_line;
-}
-
-static void pblk_set_space_limit(struct pblk *pblk)
-{
- struct pblk_rl *rl = &pblk->rl;
-
- atomic_set(&rl->rb_space, 0);
-}
-
-struct pblk_line *pblk_line_get_first_data(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *line;
-
- spin_lock(&l_mg->free_lock);
- line = pblk_line_get(pblk);
- if (!line) {
- spin_unlock(&l_mg->free_lock);
- return NULL;
- }
-
- line->seq_nr = l_mg->d_seq_nr++;
- line->type = PBLK_LINETYPE_DATA;
- l_mg->data_line = line;
-
- pblk_line_setup_metadata(line, l_mg, &pblk->lm);
-
- /* Allocate next line for preparation */
- l_mg->data_next = pblk_line_get(pblk);
- if (!l_mg->data_next) {
- /* If we cannot get a new line, we need to stop the pipeline.
- * Only allow as many writes in as we can store safely and then
- * fail gracefully
- */
- pblk_set_space_limit(pblk);
-
- l_mg->data_next = NULL;
- } else {
- l_mg->data_next->seq_nr = l_mg->d_seq_nr++;
- l_mg->data_next->type = PBLK_LINETYPE_DATA;
- }
- spin_unlock(&l_mg->free_lock);
-
- if (pblk_line_alloc_bitmaps(pblk, line))
- return NULL;
-
- if (pblk_line_erase(pblk, line)) {
- line = pblk_line_retry(pblk, line);
- if (!line)
- return NULL;
- }
-
-retry_setup:
- if (!pblk_line_init_metadata(pblk, line, NULL)) {
- line = pblk_line_retry(pblk, line);
- if (!line)
- return NULL;
-
- goto retry_setup;
- }
-
- if (!pblk_line_init_bb(pblk, line, 1)) {
- line = pblk_line_retry(pblk, line);
- if (!line)
- return NULL;
-
- goto retry_setup;
- }
-
- pblk_rl_free_lines_dec(&pblk->rl, line, true);
-
- return line;
-}
-
-void pblk_ppa_to_line_put(struct pblk *pblk, struct ppa_addr ppa)
-{
- struct pblk_line *line;
-
- line = pblk_ppa_to_line(pblk, ppa);
- kref_put(&line->ref, pblk_line_put_wq);
-}
-
-void pblk_rq_to_line_put(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
- int i;
-
- for (i = 0; i < rqd->nr_ppas; i++)
- pblk_ppa_to_line_put(pblk, ppa_list[i]);
-}
-
-static void pblk_stop_writes(struct pblk *pblk, struct pblk_line *line)
-{
- lockdep_assert_held(&pblk->l_mg.free_lock);
-
- pblk_set_space_limit(pblk);
- pblk->state = PBLK_STATE_STOPPING;
- trace_pblk_state(pblk_disk_name(pblk), pblk->state);
-}
-
-static void pblk_line_close_meta_sync(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line *line, *tline;
- LIST_HEAD(list);
-
- spin_lock(&l_mg->close_lock);
- if (list_empty(&l_mg->emeta_list)) {
- spin_unlock(&l_mg->close_lock);
- return;
- }
-
- list_cut_position(&list, &l_mg->emeta_list, l_mg->emeta_list.prev);
- spin_unlock(&l_mg->close_lock);
-
- list_for_each_entry_safe(line, tline, &list, list) {
- struct pblk_emeta *emeta = line->emeta;
-
- while (emeta->mem < lm->emeta_len[0]) {
- int ret;
-
- ret = pblk_submit_meta_io(pblk, line);
- if (ret) {
- pblk_err(pblk, "sync meta line %d failed (%d)\n",
- line->id, ret);
- return;
- }
- }
- }
-
- pblk_wait_for_meta(pblk);
- flush_workqueue(pblk->close_wq);
-}
-
-void __pblk_pipeline_flush(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- int ret;
-
- spin_lock(&l_mg->free_lock);
- if (pblk->state == PBLK_STATE_RECOVERING ||
- pblk->state == PBLK_STATE_STOPPED) {
- spin_unlock(&l_mg->free_lock);
- return;
- }
- pblk->state = PBLK_STATE_RECOVERING;
- trace_pblk_state(pblk_disk_name(pblk), pblk->state);
- spin_unlock(&l_mg->free_lock);
-
- pblk_flush_writer(pblk);
- pblk_wait_for_meta(pblk);
-
- ret = pblk_recov_pad(pblk);
- if (ret) {
- pblk_err(pblk, "could not close data on teardown(%d)\n", ret);
- return;
- }
-
- flush_workqueue(pblk->bb_wq);
- pblk_line_close_meta_sync(pblk);
-}
-
-void __pblk_pipeline_stop(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
-
- spin_lock(&l_mg->free_lock);
- pblk->state = PBLK_STATE_STOPPED;
- trace_pblk_state(pblk_disk_name(pblk), pblk->state);
- l_mg->data_line = NULL;
- l_mg->data_next = NULL;
- spin_unlock(&l_mg->free_lock);
-}
-
-void pblk_pipeline_stop(struct pblk *pblk)
-{
- __pblk_pipeline_flush(pblk);
- __pblk_pipeline_stop(pblk);
-}
-
-struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *cur, *new = NULL;
- unsigned int left_seblks;
-
- new = l_mg->data_next;
- if (!new)
- goto out;
-
- spin_lock(&l_mg->free_lock);
- cur = l_mg->data_line;
- l_mg->data_line = new;
-
- pblk_line_setup_metadata(new, l_mg, &pblk->lm);
- spin_unlock(&l_mg->free_lock);
-
-retry_erase:
- left_seblks = atomic_read(&new->left_seblks);
- if (left_seblks) {
- /* If line is not fully erased, erase it */
- if (atomic_read(&new->left_eblks)) {
- if (pblk_line_erase(pblk, new))
- goto out;
- } else {
- io_schedule();
- }
- goto retry_erase;
- }
-
- if (pblk_line_alloc_bitmaps(pblk, new))
- return NULL;
-
-retry_setup:
- if (!pblk_line_init_metadata(pblk, new, cur)) {
- new = pblk_line_retry(pblk, new);
- if (!new)
- goto out;
-
- goto retry_setup;
- }
-
- if (!pblk_line_init_bb(pblk, new, 1)) {
- new = pblk_line_retry(pblk, new);
- if (!new)
- goto out;
-
- goto retry_setup;
- }
-
- pblk_rl_free_lines_dec(&pblk->rl, new, true);
-
- /* Allocate next line for preparation */
- spin_lock(&l_mg->free_lock);
- l_mg->data_next = pblk_line_get(pblk);
- if (!l_mg->data_next) {
- /* If we cannot get a new line, we need to stop the pipeline.
- * Only allow as many writes in as we can store safely and then
- * fail gracefully
- */
- pblk_stop_writes(pblk, new);
- l_mg->data_next = NULL;
- } else {
- l_mg->data_next->seq_nr = l_mg->d_seq_nr++;
- l_mg->data_next->type = PBLK_LINETYPE_DATA;
- }
- spin_unlock(&l_mg->free_lock);
-
-out:
- return new;
-}
-
-static void __pblk_line_put(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_gc *gc = &pblk->gc;
-
- spin_lock(&line->lock);
- WARN_ON(line->state != PBLK_LINESTATE_GC);
- if (line->w_err_gc->has_gc_err) {
- spin_unlock(&line->lock);
- pblk_err(pblk, "line %d had errors during GC\n", line->id);
- pblk_put_line_back(pblk, line);
- line->w_err_gc->has_gc_err = 0;
- return;
- }
-
- line->state = PBLK_LINESTATE_FREE;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- line->gc_group = PBLK_LINEGC_NONE;
- pblk_line_free(line);
-
- if (line->w_err_gc->has_write_err) {
- pblk_rl_werr_line_out(&pblk->rl);
- line->w_err_gc->has_write_err = 0;
- }
-
- spin_unlock(&line->lock);
- atomic_dec(&gc->pipeline_gc);
-
- spin_lock(&l_mg->free_lock);
- list_add_tail(&line->list, &l_mg->free_list);
- l_mg->nr_free_lines++;
- spin_unlock(&l_mg->free_lock);
-
- pblk_rl_free_lines_inc(&pblk->rl, line);
-}
-
-static void pblk_line_put_ws(struct work_struct *work)
-{
- struct pblk_line_ws *line_put_ws = container_of(work,
- struct pblk_line_ws, ws);
- struct pblk *pblk = line_put_ws->pblk;
- struct pblk_line *line = line_put_ws->line;
-
- __pblk_line_put(pblk, line);
- mempool_free(line_put_ws, &pblk->gen_ws_pool);
-}
-
-void pblk_line_put(struct kref *ref)
-{
- struct pblk_line *line = container_of(ref, struct pblk_line, ref);
- struct pblk *pblk = line->pblk;
-
- __pblk_line_put(pblk, line);
-}
-
-void pblk_line_put_wq(struct kref *ref)
-{
- struct pblk_line *line = container_of(ref, struct pblk_line, ref);
- struct pblk *pblk = line->pblk;
- struct pblk_line_ws *line_put_ws;
-
- line_put_ws = mempool_alloc(&pblk->gen_ws_pool, GFP_ATOMIC);
- if (!line_put_ws)
- return;
-
- line_put_ws->pblk = pblk;
- line_put_ws->line = line;
- line_put_ws->priv = NULL;
-
- INIT_WORK(&line_put_ws->ws, pblk_line_put_ws);
- queue_work(pblk->r_end_wq, &line_put_ws->ws);
-}
-
-int pblk_blk_erase_async(struct pblk *pblk, struct ppa_addr ppa)
-{
- struct nvm_rq *rqd;
- int err;
-
- rqd = pblk_alloc_rqd(pblk, PBLK_ERASE);
-
- pblk_setup_e_rq(pblk, rqd, ppa);
-
- rqd->end_io = pblk_end_io_erase;
- rqd->private = pblk;
-
- trace_pblk_chunk_reset(pblk_disk_name(pblk),
- &ppa, PBLK_CHUNK_RESET_START);
-
- /* The write thread schedules erases so that it minimizes disturbances
- * with writes. Thus, there is no need to take the LUN semaphore.
- */
- err = pblk_submit_io(pblk, rqd, NULL);
- if (err) {
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
-
- pblk_err(pblk, "could not async erase line:%d,blk:%d\n",
- pblk_ppa_to_line_id(ppa),
- pblk_ppa_to_pos(geo, ppa));
- }
-
- return err;
-}
-
-struct pblk_line *pblk_line_get_data(struct pblk *pblk)
-{
- return pblk->l_mg.data_line;
-}
-
-/* For now, always erase next line */
-struct pblk_line *pblk_line_get_erase(struct pblk *pblk)
-{
- return pblk->l_mg.data_next;
-}
-
-int pblk_line_is_full(struct pblk_line *line)
-{
- return (line->left_msecs == 0);
-}
-
-static void pblk_line_should_sync_meta(struct pblk *pblk)
-{
- if (pblk_rl_is_limit(&pblk->rl))
- pblk_line_close_meta_sync(pblk);
-}
-
-void pblk_line_close(struct pblk *pblk, struct pblk_line *line)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct list_head *move_list;
- int i;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- WARN(!bitmap_full(line->map_bitmap, lm->sec_per_line),
- "pblk: corrupt closed line %d\n", line->id);
-#endif
-
- spin_lock(&l_mg->free_lock);
- WARN_ON(!test_and_clear_bit(line->meta_line, &l_mg->meta_bitmap));
- spin_unlock(&l_mg->free_lock);
-
- spin_lock(&l_mg->gc_lock);
- spin_lock(&line->lock);
- WARN_ON(line->state != PBLK_LINESTATE_OPEN);
- line->state = PBLK_LINESTATE_CLOSED;
- move_list = pblk_line_gc_list(pblk, line);
- list_add_tail(&line->list, move_list);
-
- mempool_free(line->map_bitmap, l_mg->bitmap_pool);
- line->map_bitmap = NULL;
- line->smeta = NULL;
- line->emeta = NULL;
-
- for (i = 0; i < lm->blk_per_line; i++) {
- struct pblk_lun *rlun = &pblk->luns[i];
- int pos = pblk_ppa_to_pos(geo, rlun->bppa);
- int state = line->chks[pos].state;
-
- if (!(state & NVM_CHK_ST_OFFLINE))
- state = NVM_CHK_ST_CLOSED;
- }
-
- spin_unlock(&line->lock);
- spin_unlock(&l_mg->gc_lock);
-
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
-}
-
-void pblk_line_close_meta(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_emeta *emeta = line->emeta;
- struct line_emeta *emeta_buf = emeta->buf;
- struct wa_counters *wa = emeta_to_wa(lm, emeta_buf);
-
- /* No need for exact vsc value; avoid a big line lock and take aprox. */
- memcpy(emeta_to_vsc(pblk, emeta_buf), l_mg->vsc_list, lm->vsc_list_len);
- memcpy(emeta_to_bb(emeta_buf), line->blk_bitmap, lm->blk_bitmap_len);
-
- wa->user = cpu_to_le64(atomic64_read(&pblk->user_wa));
- wa->pad = cpu_to_le64(atomic64_read(&pblk->pad_wa));
- wa->gc = cpu_to_le64(atomic64_read(&pblk->gc_wa));
-
- if (le32_to_cpu(emeta_buf->header.identifier) != PBLK_MAGIC) {
- emeta_buf->header.identifier = cpu_to_le32(PBLK_MAGIC);
- export_guid(emeta_buf->header.uuid, &pblk->instance_uuid);
- emeta_buf->header.id = cpu_to_le32(line->id);
- emeta_buf->header.type = cpu_to_le16(line->type);
- emeta_buf->header.version_major = EMETA_VERSION_MAJOR;
- emeta_buf->header.version_minor = EMETA_VERSION_MINOR;
- emeta_buf->header.crc = cpu_to_le32(
- pblk_calc_meta_header_crc(pblk, &emeta_buf->header));
- }
-
- emeta_buf->nr_valid_lbas = cpu_to_le64(line->nr_valid_lbas);
- emeta_buf->crc = cpu_to_le32(pblk_calc_emeta_crc(pblk, emeta_buf));
-
- spin_lock(&l_mg->close_lock);
- spin_lock(&line->lock);
-
- /* Update the in-memory start address for emeta, in case it has
- * shifted due to write errors
- */
- if (line->emeta_ssec != line->cur_sec)
- line->emeta_ssec = line->cur_sec;
-
- list_add_tail(&line->list, &l_mg->emeta_list);
- spin_unlock(&line->lock);
- spin_unlock(&l_mg->close_lock);
-
- pblk_line_should_sync_meta(pblk);
-}
-
-static void pblk_save_lba_list(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- unsigned int lba_list_size = lm->emeta_len[2];
- struct pblk_w_err_gc *w_err_gc = line->w_err_gc;
- struct pblk_emeta *emeta = line->emeta;
-
- w_err_gc->lba_list = kvmalloc(lba_list_size, GFP_KERNEL);
- memcpy(w_err_gc->lba_list, emeta_to_lbas(pblk, emeta->buf),
- lba_list_size);
-}
-
-void pblk_line_close_ws(struct work_struct *work)
-{
- struct pblk_line_ws *line_ws = container_of(work, struct pblk_line_ws,
- ws);
- struct pblk *pblk = line_ws->pblk;
- struct pblk_line *line = line_ws->line;
- struct pblk_w_err_gc *w_err_gc = line->w_err_gc;
-
- /* Write errors makes the emeta start address stored in smeta invalid,
- * so keep a copy of the lba list until we've gc'd the line
- */
- if (w_err_gc->has_write_err)
- pblk_save_lba_list(pblk, line);
-
- pblk_line_close(pblk, line);
- mempool_free(line_ws, &pblk->gen_ws_pool);
-}
-
-void pblk_gen_run_ws(struct pblk *pblk, struct pblk_line *line, void *priv,
- void (*work)(struct work_struct *), gfp_t gfp_mask,
- struct workqueue_struct *wq)
-{
- struct pblk_line_ws *line_ws;
-
- line_ws = mempool_alloc(&pblk->gen_ws_pool, gfp_mask);
- if (!line_ws) {
- pblk_err(pblk, "pblk: could not allocate memory\n");
- return;
- }
-
- line_ws->pblk = pblk;
- line_ws->line = line;
- line_ws->priv = priv;
-
- INIT_WORK(&line_ws->ws, work);
- queue_work(wq, &line_ws->ws);
-}
-
-static void __pblk_down_chunk(struct pblk *pblk, int pos)
-{
- struct pblk_lun *rlun = &pblk->luns[pos];
- int ret;
-
- /*
- * Only send one inflight I/O per LUN. Since we map at a page
- * granurality, all ppas in the I/O will map to the same LUN
- */
-
- ret = down_timeout(&rlun->wr_sem, msecs_to_jiffies(30000));
- if (ret == -ETIME || ret == -EINTR)
- pblk_err(pblk, "taking lun semaphore timed out: err %d\n",
- -ret);
-}
-
-void pblk_down_chunk(struct pblk *pblk, struct ppa_addr ppa)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int pos = pblk_ppa_to_pos(geo, ppa);
-
- __pblk_down_chunk(pblk, pos);
-}
-
-void pblk_down_rq(struct pblk *pblk, struct ppa_addr ppa,
- unsigned long *lun_bitmap)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int pos = pblk_ppa_to_pos(geo, ppa);
-
- /* If the LUN has been locked for this same request, do no attempt to
- * lock it again
- */
- if (test_and_set_bit(pos, lun_bitmap))
- return;
-
- __pblk_down_chunk(pblk, pos);
-}
-
-void pblk_up_chunk(struct pblk *pblk, struct ppa_addr ppa)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_lun *rlun;
- int pos = pblk_ppa_to_pos(geo, ppa);
-
- rlun = &pblk->luns[pos];
- up(&rlun->wr_sem);
-}
-
-void pblk_up_rq(struct pblk *pblk, unsigned long *lun_bitmap)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_lun *rlun;
- int num_lun = geo->all_luns;
- int bit = -1;
-
- while ((bit = find_next_bit(lun_bitmap, num_lun, bit + 1)) < num_lun) {
- rlun = &pblk->luns[bit];
- up(&rlun->wr_sem);
- }
-}
-
-void pblk_update_map(struct pblk *pblk, sector_t lba, struct ppa_addr ppa)
-{
- struct ppa_addr ppa_l2p;
-
- /* logic error: lba out-of-bounds. Ignore update */
- if (!(lba < pblk->capacity)) {
- WARN(1, "pblk: corrupted L2P map request\n");
- return;
- }
-
- spin_lock(&pblk->trans_lock);
- ppa_l2p = pblk_trans_map_get(pblk, lba);
-
- if (!pblk_addr_in_cache(ppa_l2p) && !pblk_ppa_empty(ppa_l2p))
- pblk_map_invalidate(pblk, ppa_l2p);
-
- pblk_trans_map_set(pblk, lba, ppa);
- spin_unlock(&pblk->trans_lock);
-}
-
-void pblk_update_map_cache(struct pblk *pblk, sector_t lba, struct ppa_addr ppa)
-{
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Callers must ensure that the ppa points to a cache address */
- BUG_ON(!pblk_addr_in_cache(ppa));
- BUG_ON(pblk_rb_pos_oob(&pblk->rwb, pblk_addr_to_cacheline(ppa)));
-#endif
-
- pblk_update_map(pblk, lba, ppa);
-}
-
-int pblk_update_map_gc(struct pblk *pblk, sector_t lba, struct ppa_addr ppa_new,
- struct pblk_line *gc_line, u64 paddr_gc)
-{
- struct ppa_addr ppa_l2p, ppa_gc;
- int ret = 1;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Callers must ensure that the ppa points to a cache address */
- BUG_ON(!pblk_addr_in_cache(ppa_new));
- BUG_ON(pblk_rb_pos_oob(&pblk->rwb, pblk_addr_to_cacheline(ppa_new)));
-#endif
-
- /* logic error: lba out-of-bounds. Ignore update */
- if (!(lba < pblk->capacity)) {
- WARN(1, "pblk: corrupted L2P map request\n");
- return 0;
- }
-
- spin_lock(&pblk->trans_lock);
- ppa_l2p = pblk_trans_map_get(pblk, lba);
- ppa_gc = addr_to_gen_ppa(pblk, paddr_gc, gc_line->id);
-
- if (!pblk_ppa_comp(ppa_l2p, ppa_gc)) {
- spin_lock(&gc_line->lock);
- WARN(!test_bit(paddr_gc, gc_line->invalid_bitmap),
- "pblk: corrupted GC update");
- spin_unlock(&gc_line->lock);
-
- ret = 0;
- goto out;
- }
-
- pblk_trans_map_set(pblk, lba, ppa_new);
-out:
- spin_unlock(&pblk->trans_lock);
- return ret;
-}
-
-void pblk_update_map_dev(struct pblk *pblk, sector_t lba,
- struct ppa_addr ppa_mapped, struct ppa_addr ppa_cache)
-{
- struct ppa_addr ppa_l2p;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Callers must ensure that the ppa points to a device address */
- BUG_ON(pblk_addr_in_cache(ppa_mapped));
-#endif
- /* Invalidate and discard padded entries */
- if (lba == ADDR_EMPTY) {
- atomic64_inc(&pblk->pad_wa);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_inc(&pblk->padded_wb);
-#endif
- if (!pblk_ppa_empty(ppa_mapped))
- pblk_map_invalidate(pblk, ppa_mapped);
- return;
- }
-
- /* logic error: lba out-of-bounds. Ignore update */
- if (!(lba < pblk->capacity)) {
- WARN(1, "pblk: corrupted L2P map request\n");
- return;
- }
-
- spin_lock(&pblk->trans_lock);
- ppa_l2p = pblk_trans_map_get(pblk, lba);
-
- /* Do not update L2P if the cacheline has been updated. In this case,
- * the mapped ppa must be invalidated
- */
- if (!pblk_ppa_comp(ppa_l2p, ppa_cache)) {
- if (!pblk_ppa_empty(ppa_mapped))
- pblk_map_invalidate(pblk, ppa_mapped);
- goto out;
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- WARN_ON(!pblk_addr_in_cache(ppa_l2p) && !pblk_ppa_empty(ppa_l2p));
-#endif
-
- pblk_trans_map_set(pblk, lba, ppa_mapped);
-out:
- spin_unlock(&pblk->trans_lock);
-}
-
-int pblk_lookup_l2p_seq(struct pblk *pblk, struct ppa_addr *ppas,
- sector_t blba, int nr_secs, bool *from_cache)
-{
- int i;
-
- spin_lock(&pblk->trans_lock);
- for (i = 0; i < nr_secs; i++) {
- struct ppa_addr ppa;
-
- ppa = ppas[i] = pblk_trans_map_get(pblk, blba + i);
-
- /* If the L2P entry maps to a line, the reference is valid */
- if (!pblk_ppa_empty(ppa) && !pblk_addr_in_cache(ppa)) {
- struct pblk_line *line = pblk_ppa_to_line(pblk, ppa);
-
- if (i > 0 && *from_cache)
- break;
- *from_cache = false;
-
- kref_get(&line->ref);
- } else {
- if (i > 0 && !*from_cache)
- break;
- *from_cache = true;
- }
- }
- spin_unlock(&pblk->trans_lock);
- return i;
-}
-
-void pblk_lookup_l2p_rand(struct pblk *pblk, struct ppa_addr *ppas,
- u64 *lba_list, int nr_secs)
-{
- u64 lba;
- int i;
-
- spin_lock(&pblk->trans_lock);
- for (i = 0; i < nr_secs; i++) {
- lba = lba_list[i];
- if (lba != ADDR_EMPTY) {
- /* logic error: lba out-of-bounds. Ignore update */
- if (!(lba < pblk->capacity)) {
- WARN(1, "pblk: corrupted L2P map request\n");
- continue;
- }
- ppas[i] = pblk_trans_map_get(pblk, lba);
- }
- }
- spin_unlock(&pblk->trans_lock);
-}
-
-void *pblk_get_meta_for_writes(struct pblk *pblk, struct nvm_rq *rqd)
-{
- void *buffer;
-
- if (pblk_is_oob_meta_supported(pblk)) {
- /* Just use OOB metadata buffer as always */
- buffer = rqd->meta_list;
- } else {
- /* We need to reuse last page of request (packed metadata)
- * in similar way as traditional oob metadata
- */
- buffer = page_to_virt(
- rqd->bio->bi_io_vec[rqd->bio->bi_vcnt - 1].bv_page);
- }
-
- return buffer;
-}
-
-void pblk_get_packed_meta(struct pblk *pblk, struct nvm_rq *rqd)
-{
- void *meta_list = rqd->meta_list;
- void *page;
- int i = 0;
-
- if (pblk_is_oob_meta_supported(pblk))
- return;
-
- page = page_to_virt(rqd->bio->bi_io_vec[rqd->bio->bi_vcnt - 1].bv_page);
- /* We need to fill oob meta buffer with data from packed metadata */
- for (; i < rqd->nr_ppas; i++)
- memcpy(pblk_get_meta(pblk, meta_list, i),
- page + (i * sizeof(struct pblk_sec_meta)),
- sizeof(struct pblk_sec_meta));
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-gc.c - pblk's garbage collector
- */
-
-#include "pblk.h"
-#include "pblk-trace.h"
-#include <linux/delay.h>
-
-
-static void pblk_gc_free_gc_rq(struct pblk_gc_rq *gc_rq)
-{
- vfree(gc_rq->data);
- kfree(gc_rq);
-}
-
-static int pblk_gc_write(struct pblk *pblk)
-{
- struct pblk_gc *gc = &pblk->gc;
- struct pblk_gc_rq *gc_rq, *tgc_rq;
- LIST_HEAD(w_list);
-
- spin_lock(&gc->w_lock);
- if (list_empty(&gc->w_list)) {
- spin_unlock(&gc->w_lock);
- return 1;
- }
-
- list_cut_position(&w_list, &gc->w_list, gc->w_list.prev);
- gc->w_entries = 0;
- spin_unlock(&gc->w_lock);
-
- list_for_each_entry_safe(gc_rq, tgc_rq, &w_list, list) {
- pblk_write_gc_to_cache(pblk, gc_rq);
- list_del(&gc_rq->list);
- kref_put(&gc_rq->line->ref, pblk_line_put);
- pblk_gc_free_gc_rq(gc_rq);
- }
-
- return 0;
-}
-
-static void pblk_gc_writer_kick(struct pblk_gc *gc)
-{
- wake_up_process(gc->gc_writer_ts);
-}
-
-void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct list_head *move_list;
-
- spin_lock(&l_mg->gc_lock);
- spin_lock(&line->lock);
- WARN_ON(line->state != PBLK_LINESTATE_GC);
- line->state = PBLK_LINESTATE_CLOSED;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
-
- /* We need to reset gc_group in order to ensure that
- * pblk_line_gc_list will return proper move_list
- * since right now current line is not on any of the
- * gc lists.
- */
- line->gc_group = PBLK_LINEGC_NONE;
- move_list = pblk_line_gc_list(pblk, line);
- spin_unlock(&line->lock);
- list_add_tail(&line->list, move_list);
- spin_unlock(&l_mg->gc_lock);
-}
-
-static void pblk_gc_line_ws(struct work_struct *work)
-{
- struct pblk_line_ws *gc_rq_ws = container_of(work,
- struct pblk_line_ws, ws);
- struct pblk *pblk = gc_rq_ws->pblk;
- struct pblk_gc *gc = &pblk->gc;
- struct pblk_line *line = gc_rq_ws->line;
- struct pblk_gc_rq *gc_rq = gc_rq_ws->priv;
- int ret;
-
- up(&gc->gc_sem);
-
- /* Read from GC victim block */
- ret = pblk_submit_read_gc(pblk, gc_rq);
- if (ret) {
- line->w_err_gc->has_gc_err = 1;
- goto out;
- }
-
- if (!gc_rq->secs_to_gc)
- goto out;
-
-retry:
- spin_lock(&gc->w_lock);
- if (gc->w_entries >= PBLK_GC_RQ_QD) {
- spin_unlock(&gc->w_lock);
- pblk_gc_writer_kick(&pblk->gc);
- usleep_range(128, 256);
- goto retry;
- }
- gc->w_entries++;
- list_add_tail(&gc_rq->list, &gc->w_list);
- spin_unlock(&gc->w_lock);
-
- pblk_gc_writer_kick(&pblk->gc);
-
- kfree(gc_rq_ws);
- return;
-
-out:
- pblk_gc_free_gc_rq(gc_rq);
- kref_put(&line->ref, pblk_line_put);
- kfree(gc_rq_ws);
-}
-
-static __le64 *get_lba_list_from_emeta(struct pblk *pblk,
- struct pblk_line *line)
-{
- struct line_emeta *emeta_buf;
- struct pblk_line_meta *lm = &pblk->lm;
- unsigned int lba_list_size = lm->emeta_len[2];
- __le64 *lba_list;
- int ret;
-
- emeta_buf = kvmalloc(lm->emeta_len[0], GFP_KERNEL);
- if (!emeta_buf)
- return NULL;
-
- ret = pblk_line_emeta_read(pblk, line, emeta_buf);
- if (ret) {
- pblk_err(pblk, "line %d read emeta failed (%d)\n",
- line->id, ret);
- kvfree(emeta_buf);
- return NULL;
- }
-
- /* If this read fails, it means that emeta is corrupted.
- * For now, leave the line untouched.
- * TODO: Implement a recovery routine that scans and moves
- * all sectors on the line.
- */
-
- ret = pblk_recov_check_emeta(pblk, emeta_buf);
- if (ret) {
- pblk_err(pblk, "inconsistent emeta (line %d)\n",
- line->id);
- kvfree(emeta_buf);
- return NULL;
- }
-
- lba_list = kvmalloc(lba_list_size, GFP_KERNEL);
-
- if (lba_list)
- memcpy(lba_list, emeta_to_lbas(pblk, emeta_buf), lba_list_size);
-
- kvfree(emeta_buf);
-
- return lba_list;
-}
-
-static void pblk_gc_line_prepare_ws(struct work_struct *work)
-{
- struct pblk_line_ws *line_ws = container_of(work, struct pblk_line_ws,
- ws);
- struct pblk *pblk = line_ws->pblk;
- struct pblk_line *line = line_ws->line;
- struct pblk_line_meta *lm = &pblk->lm;
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_gc *gc = &pblk->gc;
- struct pblk_line_ws *gc_rq_ws;
- struct pblk_gc_rq *gc_rq;
- __le64 *lba_list;
- unsigned long *invalid_bitmap;
- int sec_left, nr_secs, bit;
-
- invalid_bitmap = kmalloc(lm->sec_bitmap_len, GFP_KERNEL);
- if (!invalid_bitmap)
- goto fail_free_ws;
-
- if (line->w_err_gc->has_write_err) {
- lba_list = line->w_err_gc->lba_list;
- line->w_err_gc->lba_list = NULL;
- } else {
- lba_list = get_lba_list_from_emeta(pblk, line);
- if (!lba_list) {
- pblk_err(pblk, "could not interpret emeta (line %d)\n",
- line->id);
- goto fail_free_invalid_bitmap;
- }
- }
-
- spin_lock(&line->lock);
- bitmap_copy(invalid_bitmap, line->invalid_bitmap, lm->sec_per_line);
- sec_left = pblk_line_vsc(line);
- spin_unlock(&line->lock);
-
- if (sec_left < 0) {
- pblk_err(pblk, "corrupted GC line (%d)\n", line->id);
- goto fail_free_lba_list;
- }
-
- bit = -1;
-next_rq:
- gc_rq = kmalloc(sizeof(struct pblk_gc_rq), GFP_KERNEL);
- if (!gc_rq)
- goto fail_free_lba_list;
-
- nr_secs = 0;
- do {
- bit = find_next_zero_bit(invalid_bitmap, lm->sec_per_line,
- bit + 1);
- if (bit > line->emeta_ssec)
- break;
-
- gc_rq->paddr_list[nr_secs] = bit;
- gc_rq->lba_list[nr_secs++] = le64_to_cpu(lba_list[bit]);
- } while (nr_secs < pblk->max_write_pgs);
-
- if (unlikely(!nr_secs)) {
- kfree(gc_rq);
- goto out;
- }
-
- gc_rq->nr_secs = nr_secs;
- gc_rq->line = line;
-
- gc_rq->data = vmalloc(array_size(gc_rq->nr_secs, geo->csecs));
- if (!gc_rq->data)
- goto fail_free_gc_rq;
-
- gc_rq_ws = kmalloc(sizeof(struct pblk_line_ws), GFP_KERNEL);
- if (!gc_rq_ws)
- goto fail_free_gc_data;
-
- gc_rq_ws->pblk = pblk;
- gc_rq_ws->line = line;
- gc_rq_ws->priv = gc_rq;
-
- /* The write GC path can be much slower than the read GC one due to
- * the budget imposed by the rate-limiter. Balance in case that we get
- * back pressure from the write GC path.
- */
- while (down_timeout(&gc->gc_sem, msecs_to_jiffies(30000)))
- io_schedule();
-
- kref_get(&line->ref);
-
- INIT_WORK(&gc_rq_ws->ws, pblk_gc_line_ws);
- queue_work(gc->gc_line_reader_wq, &gc_rq_ws->ws);
-
- sec_left -= nr_secs;
- if (sec_left > 0)
- goto next_rq;
-
-out:
- kvfree(lba_list);
- kfree(line_ws);
- kfree(invalid_bitmap);
-
- kref_put(&line->ref, pblk_line_put);
- atomic_dec(&gc->read_inflight_gc);
-
- return;
-
-fail_free_gc_data:
- vfree(gc_rq->data);
-fail_free_gc_rq:
- kfree(gc_rq);
-fail_free_lba_list:
- kvfree(lba_list);
-fail_free_invalid_bitmap:
- kfree(invalid_bitmap);
-fail_free_ws:
- kfree(line_ws);
-
- /* Line goes back to closed state, so we cannot release additional
- * reference for line, since we do that only when we want to do
- * gc to free line state transition.
- */
- pblk_put_line_back(pblk, line);
- atomic_dec(&gc->read_inflight_gc);
-
- pblk_err(pblk, "failed to GC line %d\n", line->id);
-}
-
-static int pblk_gc_line(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_gc *gc = &pblk->gc;
- struct pblk_line_ws *line_ws;
-
- pblk_debug(pblk, "line '%d' being reclaimed for GC\n", line->id);
-
- line_ws = kmalloc(sizeof(struct pblk_line_ws), GFP_KERNEL);
- if (!line_ws)
- return -ENOMEM;
-
- line_ws->pblk = pblk;
- line_ws->line = line;
-
- atomic_inc(&gc->pipeline_gc);
- INIT_WORK(&line_ws->ws, pblk_gc_line_prepare_ws);
- queue_work(gc->gc_reader_wq, &line_ws->ws);
-
- return 0;
-}
-
-static void pblk_gc_reader_kick(struct pblk_gc *gc)
-{
- wake_up_process(gc->gc_reader_ts);
-}
-
-static void pblk_gc_kick(struct pblk *pblk)
-{
- struct pblk_gc *gc = &pblk->gc;
-
- pblk_gc_writer_kick(gc);
- pblk_gc_reader_kick(gc);
-
- /* If we're shutting down GC, let's not start it up again */
- if (gc->gc_enabled) {
- wake_up_process(gc->gc_ts);
- mod_timer(&gc->gc_timer,
- jiffies + msecs_to_jiffies(GC_TIME_MSECS));
- }
-}
-
-static int pblk_gc_read(struct pblk *pblk)
-{
- struct pblk_gc *gc = &pblk->gc;
- struct pblk_line *line;
-
- spin_lock(&gc->r_lock);
- if (list_empty(&gc->r_list)) {
- spin_unlock(&gc->r_lock);
- return 1;
- }
-
- line = list_first_entry(&gc->r_list, struct pblk_line, list);
- list_del(&line->list);
- spin_unlock(&gc->r_lock);
-
- pblk_gc_kick(pblk);
-
- if (pblk_gc_line(pblk, line)) {
- pblk_err(pblk, "failed to GC line %d\n", line->id);
- /* rollback */
- spin_lock(&gc->r_lock);
- list_add_tail(&line->list, &gc->r_list);
- spin_unlock(&gc->r_lock);
- }
-
- return 0;
-}
-
-static struct pblk_line *pblk_gc_get_victim_line(struct pblk *pblk,
- struct list_head *group_list)
-{
- struct pblk_line *line, *victim;
- unsigned int line_vsc = ~0x0L, victim_vsc = ~0x0L;
-
- victim = list_first_entry(group_list, struct pblk_line, list);
-
- list_for_each_entry(line, group_list, list) {
- if (!atomic_read(&line->sec_to_update))
- line_vsc = le32_to_cpu(*line->vsc);
- if (line_vsc < victim_vsc) {
- victim = line;
- victim_vsc = le32_to_cpu(*victim->vsc);
- }
- }
-
- if (victim_vsc == ~0x0)
- return NULL;
-
- return victim;
-}
-
-static bool pblk_gc_should_run(struct pblk_gc *gc, struct pblk_rl *rl)
-{
- unsigned int nr_blocks_free, nr_blocks_need;
- unsigned int werr_lines = atomic_read(&rl->werr_lines);
-
- nr_blocks_need = pblk_rl_high_thrs(rl);
- nr_blocks_free = pblk_rl_nr_free_blks(rl);
-
- /* This is not critical, no need to take lock here */
- return ((werr_lines > 0) ||
- ((gc->gc_active) && (nr_blocks_need > nr_blocks_free)));
-}
-
-void pblk_gc_free_full_lines(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_gc *gc = &pblk->gc;
- struct pblk_line *line;
-
- do {
- spin_lock(&l_mg->gc_lock);
- if (list_empty(&l_mg->gc_full_list)) {
- spin_unlock(&l_mg->gc_lock);
- return;
- }
-
- line = list_first_entry(&l_mg->gc_full_list,
- struct pblk_line, list);
-
- spin_lock(&line->lock);
- WARN_ON(line->state != PBLK_LINESTATE_CLOSED);
- line->state = PBLK_LINESTATE_GC;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- spin_unlock(&line->lock);
-
- list_del(&line->list);
- spin_unlock(&l_mg->gc_lock);
-
- atomic_inc(&gc->pipeline_gc);
- kref_put(&line->ref, pblk_line_put);
- } while (1);
-}
-
-/*
- * Lines with no valid sectors will be returned to the free list immediately. If
- * GC is activated - either because the free block count is under the determined
- * threshold, or because it is being forced from user space - only lines with a
- * high count of invalid sectors will be recycled.
- */
-static void pblk_gc_run(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_gc *gc = &pblk->gc;
- struct pblk_line *line;
- struct list_head *group_list;
- bool run_gc;
- int read_inflight_gc, gc_group = 0, prev_group = 0;
-
- pblk_gc_free_full_lines(pblk);
-
- run_gc = pblk_gc_should_run(&pblk->gc, &pblk->rl);
- if (!run_gc || (atomic_read(&gc->read_inflight_gc) >= PBLK_GC_L_QD))
- return;
-
-next_gc_group:
- group_list = l_mg->gc_lists[gc_group++];
-
- do {
- spin_lock(&l_mg->gc_lock);
-
- line = pblk_gc_get_victim_line(pblk, group_list);
- if (!line) {
- spin_unlock(&l_mg->gc_lock);
- break;
- }
-
- spin_lock(&line->lock);
- WARN_ON(line->state != PBLK_LINESTATE_CLOSED);
- line->state = PBLK_LINESTATE_GC;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- spin_unlock(&line->lock);
-
- list_del(&line->list);
- spin_unlock(&l_mg->gc_lock);
-
- spin_lock(&gc->r_lock);
- list_add_tail(&line->list, &gc->r_list);
- spin_unlock(&gc->r_lock);
-
- read_inflight_gc = atomic_inc_return(&gc->read_inflight_gc);
- pblk_gc_reader_kick(gc);
-
- prev_group = 1;
-
- /* No need to queue up more GC lines than we can handle */
- run_gc = pblk_gc_should_run(&pblk->gc, &pblk->rl);
- if (!run_gc || read_inflight_gc >= PBLK_GC_L_QD)
- break;
- } while (1);
-
- if (!prev_group && pblk->rl.rb_state > gc_group &&
- gc_group < PBLK_GC_NR_LISTS)
- goto next_gc_group;
-}
-
-static void pblk_gc_timer(struct timer_list *t)
-{
- struct pblk *pblk = from_timer(pblk, t, gc.gc_timer);
-
- pblk_gc_kick(pblk);
-}
-
-static int pblk_gc_ts(void *data)
-{
- struct pblk *pblk = data;
-
- while (!kthread_should_stop()) {
- pblk_gc_run(pblk);
- set_current_state(TASK_INTERRUPTIBLE);
- io_schedule();
- }
-
- return 0;
-}
-
-static int pblk_gc_writer_ts(void *data)
-{
- struct pblk *pblk = data;
-
- while (!kthread_should_stop()) {
- if (!pblk_gc_write(pblk))
- continue;
- set_current_state(TASK_INTERRUPTIBLE);
- io_schedule();
- }
-
- return 0;
-}
-
-static int pblk_gc_reader_ts(void *data)
-{
- struct pblk *pblk = data;
- struct pblk_gc *gc = &pblk->gc;
-
- while (!kthread_should_stop()) {
- if (!pblk_gc_read(pblk))
- continue;
- set_current_state(TASK_INTERRUPTIBLE);
- io_schedule();
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- pblk_info(pblk, "flushing gc pipeline, %d lines left\n",
- atomic_read(&gc->pipeline_gc));
-#endif
-
- do {
- if (!atomic_read(&gc->pipeline_gc))
- break;
-
- schedule();
- } while (1);
-
- return 0;
-}
-
-static void pblk_gc_start(struct pblk *pblk)
-{
- pblk->gc.gc_active = 1;
- pblk_debug(pblk, "gc start\n");
-}
-
-void pblk_gc_should_start(struct pblk *pblk)
-{
- struct pblk_gc *gc = &pblk->gc;
-
- if (gc->gc_enabled && !gc->gc_active) {
- pblk_gc_start(pblk);
- pblk_gc_kick(pblk);
- }
-}
-
-void pblk_gc_should_stop(struct pblk *pblk)
-{
- struct pblk_gc *gc = &pblk->gc;
-
- if (gc->gc_active && !gc->gc_forced)
- gc->gc_active = 0;
-}
-
-void pblk_gc_should_kick(struct pblk *pblk)
-{
- pblk_rl_update_rates(&pblk->rl);
-}
-
-void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled,
- int *gc_active)
-{
- struct pblk_gc *gc = &pblk->gc;
-
- spin_lock(&gc->lock);
- *gc_enabled = gc->gc_enabled;
- *gc_active = gc->gc_active;
- spin_unlock(&gc->lock);
-}
-
-int pblk_gc_sysfs_force(struct pblk *pblk, int force)
-{
- struct pblk_gc *gc = &pblk->gc;
-
- if (force < 0 || force > 1)
- return -EINVAL;
-
- spin_lock(&gc->lock);
- gc->gc_forced = force;
-
- if (force)
- gc->gc_enabled = 1;
- else
- gc->gc_enabled = 0;
- spin_unlock(&gc->lock);
-
- pblk_gc_should_start(pblk);
-
- return 0;
-}
-
-int pblk_gc_init(struct pblk *pblk)
-{
- struct pblk_gc *gc = &pblk->gc;
- int ret;
-
- gc->gc_ts = kthread_create(pblk_gc_ts, pblk, "pblk-gc-ts");
- if (IS_ERR(gc->gc_ts)) {
- pblk_err(pblk, "could not allocate GC main kthread\n");
- return PTR_ERR(gc->gc_ts);
- }
-
- gc->gc_writer_ts = kthread_create(pblk_gc_writer_ts, pblk,
- "pblk-gc-writer-ts");
- if (IS_ERR(gc->gc_writer_ts)) {
- pblk_err(pblk, "could not allocate GC writer kthread\n");
- ret = PTR_ERR(gc->gc_writer_ts);
- goto fail_free_main_kthread;
- }
-
- gc->gc_reader_ts = kthread_create(pblk_gc_reader_ts, pblk,
- "pblk-gc-reader-ts");
- if (IS_ERR(gc->gc_reader_ts)) {
- pblk_err(pblk, "could not allocate GC reader kthread\n");
- ret = PTR_ERR(gc->gc_reader_ts);
- goto fail_free_writer_kthread;
- }
-
- timer_setup(&gc->gc_timer, pblk_gc_timer, 0);
- mod_timer(&gc->gc_timer, jiffies + msecs_to_jiffies(GC_TIME_MSECS));
-
- gc->gc_active = 0;
- gc->gc_forced = 0;
- gc->gc_enabled = 1;
- gc->w_entries = 0;
- atomic_set(&gc->read_inflight_gc, 0);
- atomic_set(&gc->pipeline_gc, 0);
-
- /* Workqueue that reads valid sectors from a line and submit them to the
- * GC writer to be recycled.
- */
- gc->gc_line_reader_wq = alloc_workqueue("pblk-gc-line-reader-wq",
- WQ_MEM_RECLAIM | WQ_UNBOUND, PBLK_GC_MAX_READERS);
- if (!gc->gc_line_reader_wq) {
- pblk_err(pblk, "could not allocate GC line reader workqueue\n");
- ret = -ENOMEM;
- goto fail_free_reader_kthread;
- }
-
- /* Workqueue that prepare lines for GC */
- gc->gc_reader_wq = alloc_workqueue("pblk-gc-line_wq",
- WQ_MEM_RECLAIM | WQ_UNBOUND, 1);
- if (!gc->gc_reader_wq) {
- pblk_err(pblk, "could not allocate GC reader workqueue\n");
- ret = -ENOMEM;
- goto fail_free_reader_line_wq;
- }
-
- spin_lock_init(&gc->lock);
- spin_lock_init(&gc->w_lock);
- spin_lock_init(&gc->r_lock);
-
- sema_init(&gc->gc_sem, PBLK_GC_RQ_QD);
-
- INIT_LIST_HEAD(&gc->w_list);
- INIT_LIST_HEAD(&gc->r_list);
-
- return 0;
-
-fail_free_reader_line_wq:
- destroy_workqueue(gc->gc_line_reader_wq);
-fail_free_reader_kthread:
- kthread_stop(gc->gc_reader_ts);
-fail_free_writer_kthread:
- kthread_stop(gc->gc_writer_ts);
-fail_free_main_kthread:
- kthread_stop(gc->gc_ts);
-
- return ret;
-}
-
-void pblk_gc_exit(struct pblk *pblk, bool graceful)
-{
- struct pblk_gc *gc = &pblk->gc;
-
- gc->gc_enabled = 0;
- del_timer_sync(&gc->gc_timer);
- gc->gc_active = 0;
-
- if (gc->gc_ts)
- kthread_stop(gc->gc_ts);
-
- if (gc->gc_reader_ts)
- kthread_stop(gc->gc_reader_ts);
-
- if (graceful) {
- flush_workqueue(gc->gc_reader_wq);
- flush_workqueue(gc->gc_line_reader_wq);
- }
-
- destroy_workqueue(gc->gc_reader_wq);
- destroy_workqueue(gc->gc_line_reader_wq);
-
- if (gc->gc_writer_ts)
- kthread_stop(gc->gc_writer_ts);
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2015 IT University of Copenhagen (rrpc.c)
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * Implementation of a physical block-device target for Open-channel SSDs.
- *
- * pblk-init.c - pblk's initialization.
- */
-
-#include "pblk.h"
-#include "pblk-trace.h"
-
-static unsigned int write_buffer_size;
-
-module_param(write_buffer_size, uint, 0644);
-MODULE_PARM_DESC(write_buffer_size, "number of entries in a write buffer");
-
-struct pblk_global_caches {
- struct kmem_cache *ws;
- struct kmem_cache *rec;
- struct kmem_cache *g_rq;
- struct kmem_cache *w_rq;
-
- struct kref kref;
-
- struct mutex mutex; /* Ensures consistency between
- * caches and kref
- */
-};
-
-static struct pblk_global_caches pblk_caches = {
- .mutex = __MUTEX_INITIALIZER(pblk_caches.mutex),
- .kref = KREF_INIT(0),
-};
-
-struct bio_set pblk_bio_set;
-
-static blk_qc_t pblk_submit_bio(struct bio *bio)
-{
- struct pblk *pblk = bio->bi_bdev->bd_disk->queue->queuedata;
-
- if (bio_op(bio) == REQ_OP_DISCARD) {
- pblk_discard(pblk, bio);
- if (!(bio->bi_opf & REQ_PREFLUSH)) {
- bio_endio(bio);
- return BLK_QC_T_NONE;
- }
- }
-
- /* Read requests must be <= 256kb due to NVMe's 64 bit completion bitmap
- * constraint. Writes can be of arbitrary size.
- */
- if (bio_data_dir(bio) == READ) {
- blk_queue_split(&bio);
- pblk_submit_read(pblk, bio);
- } else {
- /* Prevent deadlock in the case of a modest LUN configuration
- * and large user I/Os. Unless stalled, the rate limiter
- * leaves at least 256KB available for user I/O.
- */
- if (pblk_get_secs(bio) > pblk_rl_max_io(&pblk->rl))
- blk_queue_split(&bio);
-
- pblk_write_to_cache(pblk, bio, PBLK_IOTYPE_USER);
- }
-
- return BLK_QC_T_NONE;
-}
-
-static const struct block_device_operations pblk_bops = {
- .owner = THIS_MODULE,
- .submit_bio = pblk_submit_bio,
-};
-
-
-static size_t pblk_trans_map_size(struct pblk *pblk)
-{
- int entry_size = 8;
-
- if (pblk->addrf_len < 32)
- entry_size = 4;
-
- return entry_size * pblk->capacity;
-}
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
-static u32 pblk_l2p_crc(struct pblk *pblk)
-{
- size_t map_size;
- u32 crc = ~(u32)0;
-
- map_size = pblk_trans_map_size(pblk);
- crc = crc32_le(crc, pblk->trans_map, map_size);
- return crc;
-}
-#endif
-
-static void pblk_l2p_free(struct pblk *pblk)
-{
- vfree(pblk->trans_map);
-}
-
-static int pblk_l2p_recover(struct pblk *pblk, bool factory_init)
-{
- struct pblk_line *line = NULL;
-
- if (factory_init) {
- guid_gen(&pblk->instance_uuid);
- } else {
- line = pblk_recov_l2p(pblk);
- if (IS_ERR(line)) {
- pblk_err(pblk, "could not recover l2p table\n");
- return -EFAULT;
- }
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- pblk_info(pblk, "init: L2P CRC: %x\n", pblk_l2p_crc(pblk));
-#endif
-
- /* Free full lines directly as GC has not been started yet */
- pblk_gc_free_full_lines(pblk);
-
- if (!line) {
- /* Configure next line for user data */
- line = pblk_line_get_first_data(pblk);
- if (!line)
- return -EFAULT;
- }
-
- return 0;
-}
-
-static int pblk_l2p_init(struct pblk *pblk, bool factory_init)
-{
- sector_t i;
- struct ppa_addr ppa;
- size_t map_size;
- int ret = 0;
-
- map_size = pblk_trans_map_size(pblk);
- pblk->trans_map = __vmalloc(map_size, GFP_KERNEL | __GFP_NOWARN |
- __GFP_RETRY_MAYFAIL | __GFP_HIGHMEM);
- if (!pblk->trans_map) {
- pblk_err(pblk, "failed to allocate L2P (need %zu of memory)\n",
- map_size);
- return -ENOMEM;
- }
-
- pblk_ppa_set_empty(&ppa);
-
- for (i = 0; i < pblk->capacity; i++)
- pblk_trans_map_set(pblk, i, ppa);
-
- ret = pblk_l2p_recover(pblk, factory_init);
- if (ret)
- vfree(pblk->trans_map);
-
- return ret;
-}
-
-static void pblk_rwb_free(struct pblk *pblk)
-{
- if (pblk_rb_tear_down_check(&pblk->rwb))
- pblk_err(pblk, "write buffer error on tear down\n");
-
- pblk_rb_free(&pblk->rwb);
-}
-
-static int pblk_rwb_init(struct pblk *pblk)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- unsigned long buffer_size;
- int pgs_in_buffer, threshold;
-
- threshold = geo->mw_cunits * geo->all_luns;
- pgs_in_buffer = (max(geo->mw_cunits, geo->ws_opt) + geo->ws_opt)
- * geo->all_luns;
-
- if (write_buffer_size && (write_buffer_size > pgs_in_buffer))
- buffer_size = write_buffer_size;
- else
- buffer_size = pgs_in_buffer;
-
- return pblk_rb_init(&pblk->rwb, buffer_size, threshold, geo->csecs);
-}
-
-static int pblk_set_addrf_12(struct pblk *pblk, struct nvm_geo *geo,
- struct nvm_addrf_12 *dst)
-{
- struct nvm_addrf_12 *src = (struct nvm_addrf_12 *)&geo->addrf;
- int power_len;
-
- /* Re-calculate channel and lun format to adapt to configuration */
- power_len = get_count_order(geo->num_ch);
- if (1 << power_len != geo->num_ch) {
- pblk_err(pblk, "supports only power-of-two channel config.\n");
- return -EINVAL;
- }
- dst->ch_len = power_len;
-
- power_len = get_count_order(geo->num_lun);
- if (1 << power_len != geo->num_lun) {
- pblk_err(pblk, "supports only power-of-two LUN config.\n");
- return -EINVAL;
- }
- dst->lun_len = power_len;
-
- dst->blk_len = src->blk_len;
- dst->pg_len = src->pg_len;
- dst->pln_len = src->pln_len;
- dst->sec_len = src->sec_len;
-
- dst->sec_offset = 0;
- dst->pln_offset = dst->sec_len;
- dst->ch_offset = dst->pln_offset + dst->pln_len;
- dst->lun_offset = dst->ch_offset + dst->ch_len;
- dst->pg_offset = dst->lun_offset + dst->lun_len;
- dst->blk_offset = dst->pg_offset + dst->pg_len;
-
- dst->sec_mask = ((1ULL << dst->sec_len) - 1) << dst->sec_offset;
- dst->pln_mask = ((1ULL << dst->pln_len) - 1) << dst->pln_offset;
- dst->ch_mask = ((1ULL << dst->ch_len) - 1) << dst->ch_offset;
- dst->lun_mask = ((1ULL << dst->lun_len) - 1) << dst->lun_offset;
- dst->pg_mask = ((1ULL << dst->pg_len) - 1) << dst->pg_offset;
- dst->blk_mask = ((1ULL << dst->blk_len) - 1) << dst->blk_offset;
-
- return dst->blk_offset + src->blk_len;
-}
-
-static int pblk_set_addrf_20(struct nvm_geo *geo, struct nvm_addrf *adst,
- struct pblk_addrf *udst)
-{
- struct nvm_addrf *src = &geo->addrf;
-
- adst->ch_len = get_count_order(geo->num_ch);
- adst->lun_len = get_count_order(geo->num_lun);
- adst->chk_len = src->chk_len;
- adst->sec_len = src->sec_len;
-
- adst->sec_offset = 0;
- adst->ch_offset = adst->sec_len;
- adst->lun_offset = adst->ch_offset + adst->ch_len;
- adst->chk_offset = adst->lun_offset + adst->lun_len;
-
- adst->sec_mask = ((1ULL << adst->sec_len) - 1) << adst->sec_offset;
- adst->chk_mask = ((1ULL << adst->chk_len) - 1) << adst->chk_offset;
- adst->lun_mask = ((1ULL << adst->lun_len) - 1) << adst->lun_offset;
- adst->ch_mask = ((1ULL << adst->ch_len) - 1) << adst->ch_offset;
-
- udst->sec_stripe = geo->ws_opt;
- udst->ch_stripe = geo->num_ch;
- udst->lun_stripe = geo->num_lun;
-
- udst->sec_lun_stripe = udst->sec_stripe * udst->ch_stripe;
- udst->sec_ws_stripe = udst->sec_lun_stripe * udst->lun_stripe;
-
- return adst->chk_offset + adst->chk_len;
-}
-
-static int pblk_set_addrf(struct pblk *pblk)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int mod;
-
- switch (geo->version) {
- case NVM_OCSSD_SPEC_12:
- div_u64_rem(geo->clba, pblk->min_write_pgs, &mod);
- if (mod) {
- pblk_err(pblk, "bad configuration of sectors/pages\n");
- return -EINVAL;
- }
-
- pblk->addrf_len = pblk_set_addrf_12(pblk, geo,
- (void *)&pblk->addrf);
- break;
- case NVM_OCSSD_SPEC_20:
- pblk->addrf_len = pblk_set_addrf_20(geo, (void *)&pblk->addrf,
- &pblk->uaddrf);
- break;
- default:
- pblk_err(pblk, "OCSSD revision not supported (%d)\n",
- geo->version);
- return -EINVAL;
- }
-
- return 0;
-}
-
-static int pblk_create_global_caches(void)
-{
-
- pblk_caches.ws = kmem_cache_create("pblk_blk_ws",
- sizeof(struct pblk_line_ws), 0, 0, NULL);
- if (!pblk_caches.ws)
- return -ENOMEM;
-
- pblk_caches.rec = kmem_cache_create("pblk_rec",
- sizeof(struct pblk_rec_ctx), 0, 0, NULL);
- if (!pblk_caches.rec)
- goto fail_destroy_ws;
-
- pblk_caches.g_rq = kmem_cache_create("pblk_g_rq", pblk_g_rq_size,
- 0, 0, NULL);
- if (!pblk_caches.g_rq)
- goto fail_destroy_rec;
-
- pblk_caches.w_rq = kmem_cache_create("pblk_w_rq", pblk_w_rq_size,
- 0, 0, NULL);
- if (!pblk_caches.w_rq)
- goto fail_destroy_g_rq;
-
- return 0;
-
-fail_destroy_g_rq:
- kmem_cache_destroy(pblk_caches.g_rq);
-fail_destroy_rec:
- kmem_cache_destroy(pblk_caches.rec);
-fail_destroy_ws:
- kmem_cache_destroy(pblk_caches.ws);
-
- return -ENOMEM;
-}
-
-static int pblk_get_global_caches(void)
-{
- int ret = 0;
-
- mutex_lock(&pblk_caches.mutex);
-
- if (kref_get_unless_zero(&pblk_caches.kref))
- goto out;
-
- ret = pblk_create_global_caches();
- if (!ret)
- kref_init(&pblk_caches.kref);
-
-out:
- mutex_unlock(&pblk_caches.mutex);
- return ret;
-}
-
-static void pblk_destroy_global_caches(struct kref *ref)
-{
- struct pblk_global_caches *c;
-
- c = container_of(ref, struct pblk_global_caches, kref);
-
- kmem_cache_destroy(c->ws);
- kmem_cache_destroy(c->rec);
- kmem_cache_destroy(c->g_rq);
- kmem_cache_destroy(c->w_rq);
-}
-
-static void pblk_put_global_caches(void)
-{
- mutex_lock(&pblk_caches.mutex);
- kref_put(&pblk_caches.kref, pblk_destroy_global_caches);
- mutex_unlock(&pblk_caches.mutex);
-}
-
-static int pblk_core_init(struct pblk *pblk)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int ret, max_write_ppas;
-
- atomic64_set(&pblk->user_wa, 0);
- atomic64_set(&pblk->pad_wa, 0);
- atomic64_set(&pblk->gc_wa, 0);
- pblk->user_rst_wa = 0;
- pblk->pad_rst_wa = 0;
- pblk->gc_rst_wa = 0;
-
- atomic64_set(&pblk->nr_flush, 0);
- pblk->nr_flush_rst = 0;
-
- pblk->min_write_pgs = geo->ws_opt;
- pblk->min_write_pgs_data = pblk->min_write_pgs;
- max_write_ppas = pblk->min_write_pgs * geo->all_luns;
- pblk->max_write_pgs = min_t(int, max_write_ppas, NVM_MAX_VLBA);
- pblk->max_write_pgs = min_t(int, pblk->max_write_pgs,
- queue_max_hw_sectors(dev->q) / (geo->csecs >> SECTOR_SHIFT));
- pblk_set_sec_per_write(pblk, pblk->min_write_pgs);
-
- pblk->oob_meta_size = geo->sos;
- if (!pblk_is_oob_meta_supported(pblk)) {
- /* For drives which does not have OOB metadata feature
- * in order to support recovery feature we need to use
- * so called packed metadata. Packed metada will store
- * the same information as OOB metadata (l2p table mapping,
- * but in the form of the single page at the end of
- * every write request.
- */
- if (pblk->min_write_pgs
- * sizeof(struct pblk_sec_meta) > PAGE_SIZE) {
- /* We want to keep all the packed metadata on single
- * page per write requests. So we need to ensure that
- * it will fit.
- *
- * This is more like sanity check, since there is
- * no device with such a big minimal write size
- * (above 1 metabytes).
- */
- pblk_err(pblk, "Not supported min write size\n");
- return -EINVAL;
- }
- /* For packed meta approach we do some simplification.
- * On read path we always issue requests which size
- * equal to max_write_pgs, with all pages filled with
- * user payload except of last one page which will be
- * filled with packed metadata.
- */
- pblk->max_write_pgs = pblk->min_write_pgs;
- pblk->min_write_pgs_data = pblk->min_write_pgs - 1;
- }
-
- pblk->pad_dist = kcalloc(pblk->min_write_pgs - 1, sizeof(atomic64_t),
- GFP_KERNEL);
- if (!pblk->pad_dist)
- return -ENOMEM;
-
- if (pblk_get_global_caches())
- goto fail_free_pad_dist;
-
- /* Internal bios can be at most the sectors signaled by the device. */
- ret = mempool_init_page_pool(&pblk->page_bio_pool, NVM_MAX_VLBA, 0);
- if (ret)
- goto free_global_caches;
-
- ret = mempool_init_slab_pool(&pblk->gen_ws_pool, PBLK_GEN_WS_POOL_SIZE,
- pblk_caches.ws);
- if (ret)
- goto free_page_bio_pool;
-
- ret = mempool_init_slab_pool(&pblk->rec_pool, geo->all_luns,
- pblk_caches.rec);
- if (ret)
- goto free_gen_ws_pool;
-
- ret = mempool_init_slab_pool(&pblk->r_rq_pool, geo->all_luns,
- pblk_caches.g_rq);
- if (ret)
- goto free_rec_pool;
-
- ret = mempool_init_slab_pool(&pblk->e_rq_pool, geo->all_luns,
- pblk_caches.g_rq);
- if (ret)
- goto free_r_rq_pool;
-
- ret = mempool_init_slab_pool(&pblk->w_rq_pool, geo->all_luns,
- pblk_caches.w_rq);
- if (ret)
- goto free_e_rq_pool;
-
- pblk->close_wq = alloc_workqueue("pblk-close-wq",
- WQ_MEM_RECLAIM | WQ_UNBOUND, PBLK_NR_CLOSE_JOBS);
- if (!pblk->close_wq)
- goto free_w_rq_pool;
-
- pblk->bb_wq = alloc_workqueue("pblk-bb-wq",
- WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
- if (!pblk->bb_wq)
- goto free_close_wq;
-
- pblk->r_end_wq = alloc_workqueue("pblk-read-end-wq",
- WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
- if (!pblk->r_end_wq)
- goto free_bb_wq;
-
- if (pblk_set_addrf(pblk))
- goto free_r_end_wq;
-
- INIT_LIST_HEAD(&pblk->compl_list);
- INIT_LIST_HEAD(&pblk->resubmit_list);
-
- return 0;
-
-free_r_end_wq:
- destroy_workqueue(pblk->r_end_wq);
-free_bb_wq:
- destroy_workqueue(pblk->bb_wq);
-free_close_wq:
- destroy_workqueue(pblk->close_wq);
-free_w_rq_pool:
- mempool_exit(&pblk->w_rq_pool);
-free_e_rq_pool:
- mempool_exit(&pblk->e_rq_pool);
-free_r_rq_pool:
- mempool_exit(&pblk->r_rq_pool);
-free_rec_pool:
- mempool_exit(&pblk->rec_pool);
-free_gen_ws_pool:
- mempool_exit(&pblk->gen_ws_pool);
-free_page_bio_pool:
- mempool_exit(&pblk->page_bio_pool);
-free_global_caches:
- pblk_put_global_caches();
-fail_free_pad_dist:
- kfree(pblk->pad_dist);
- return -ENOMEM;
-}
-
-static void pblk_core_free(struct pblk *pblk)
-{
- if (pblk->close_wq)
- destroy_workqueue(pblk->close_wq);
-
- if (pblk->r_end_wq)
- destroy_workqueue(pblk->r_end_wq);
-
- if (pblk->bb_wq)
- destroy_workqueue(pblk->bb_wq);
-
- mempool_exit(&pblk->page_bio_pool);
- mempool_exit(&pblk->gen_ws_pool);
- mempool_exit(&pblk->rec_pool);
- mempool_exit(&pblk->r_rq_pool);
- mempool_exit(&pblk->e_rq_pool);
- mempool_exit(&pblk->w_rq_pool);
-
- pblk_put_global_caches();
- kfree(pblk->pad_dist);
-}
-
-static void pblk_line_mg_free(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- int i;
-
- kfree(l_mg->bb_template);
- kfree(l_mg->bb_aux);
- kfree(l_mg->vsc_list);
-
- for (i = 0; i < PBLK_DATA_LINES; i++) {
- kfree(l_mg->sline_meta[i]);
- kvfree(l_mg->eline_meta[i]->buf);
- kfree(l_mg->eline_meta[i]);
- }
-
- mempool_destroy(l_mg->bitmap_pool);
- kmem_cache_destroy(l_mg->bitmap_cache);
-}
-
-static void pblk_line_meta_free(struct pblk_line_mgmt *l_mg,
- struct pblk_line *line)
-{
- struct pblk_w_err_gc *w_err_gc = line->w_err_gc;
-
- kfree(line->blk_bitmap);
- kfree(line->erase_bitmap);
- kfree(line->chks);
-
- kvfree(w_err_gc->lba_list);
- kfree(w_err_gc);
-}
-
-static void pblk_lines_free(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *line;
- int i;
-
- for (i = 0; i < l_mg->nr_lines; i++) {
- line = &pblk->lines[i];
-
- pblk_line_free(line);
- pblk_line_meta_free(l_mg, line);
- }
-
- pblk_line_mg_free(pblk);
-
- kfree(pblk->luns);
- kfree(pblk->lines);
-}
-
-static int pblk_luns_init(struct pblk *pblk)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_lun *rlun;
- int i;
-
- /* TODO: Implement unbalanced LUN support */
- if (geo->num_lun < 0) {
- pblk_err(pblk, "unbalanced LUN config.\n");
- return -EINVAL;
- }
-
- pblk->luns = kcalloc(geo->all_luns, sizeof(struct pblk_lun),
- GFP_KERNEL);
- if (!pblk->luns)
- return -ENOMEM;
-
- for (i = 0; i < geo->all_luns; i++) {
- /* Stripe across channels */
- int ch = i % geo->num_ch;
- int lun_raw = i / geo->num_ch;
- int lunid = lun_raw + ch * geo->num_lun;
-
- rlun = &pblk->luns[i];
- rlun->bppa = dev->luns[lunid];
-
- sema_init(&rlun->wr_sem, 1);
- }
-
- return 0;
-}
-
-/* See comment over struct line_emeta definition */
-static unsigned int calc_emeta_len(struct pblk *pblk)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
-
- /* Round to sector size so that lba_list starts on its own sector */
- lm->emeta_sec[1] = DIV_ROUND_UP(
- sizeof(struct line_emeta) + lm->blk_bitmap_len +
- sizeof(struct wa_counters), geo->csecs);
- lm->emeta_len[1] = lm->emeta_sec[1] * geo->csecs;
-
- /* Round to sector size so that vsc_list starts on its own sector */
- lm->dsec_per_line = lm->sec_per_line - lm->emeta_sec[0];
- lm->emeta_sec[2] = DIV_ROUND_UP(lm->dsec_per_line * sizeof(u64),
- geo->csecs);
- lm->emeta_len[2] = lm->emeta_sec[2] * geo->csecs;
-
- lm->emeta_sec[3] = DIV_ROUND_UP(l_mg->nr_lines * sizeof(u32),
- geo->csecs);
- lm->emeta_len[3] = lm->emeta_sec[3] * geo->csecs;
-
- lm->vsc_list_len = l_mg->nr_lines * sizeof(u32);
-
- return (lm->emeta_len[1] + lm->emeta_len[2] + lm->emeta_len[3]);
-}
-
-static int pblk_set_provision(struct pblk *pblk, int nr_free_chks)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- struct nvm_geo *geo = &dev->geo;
- sector_t provisioned;
- int sec_meta, blk_meta, clba;
- int minimum;
-
- if (geo->op == NVM_TARGET_DEFAULT_OP)
- pblk->op = PBLK_DEFAULT_OP;
- else
- pblk->op = geo->op;
-
- minimum = pblk_get_min_chks(pblk);
- provisioned = nr_free_chks;
- provisioned *= (100 - pblk->op);
- sector_div(provisioned, 100);
-
- if ((nr_free_chks - provisioned) < minimum) {
- if (geo->op != NVM_TARGET_DEFAULT_OP) {
- pblk_err(pblk, "OP too small to create a sane instance\n");
- return -EINTR;
- }
-
- /* If the user did not specify an OP value, and PBLK_DEFAULT_OP
- * is not enough, calculate and set sane value
- */
-
- provisioned = nr_free_chks - minimum;
- pblk->op = (100 * minimum) / nr_free_chks;
- pblk_info(pblk, "Default OP insufficient, adjusting OP to %d\n",
- pblk->op);
- }
-
- pblk->op_blks = nr_free_chks - provisioned;
-
- /* Internally pblk manages all free blocks, but all calculations based
- * on user capacity consider only provisioned blocks
- */
- pblk->rl.total_blocks = nr_free_chks;
-
- /* Consider sectors used for metadata */
- sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
- blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
-
- clba = (geo->clba / pblk->min_write_pgs) * pblk->min_write_pgs_data;
- pblk->capacity = (provisioned - blk_meta) * clba;
-
- atomic_set(&pblk->rl.free_blocks, nr_free_chks);
- atomic_set(&pblk->rl.free_user_blocks, nr_free_chks);
-
- return 0;
-}
-
-static int pblk_setup_line_meta_chk(struct pblk *pblk, struct pblk_line *line,
- struct nvm_chk_meta *meta)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- int i, nr_bad_chks = 0;
-
- for (i = 0; i < lm->blk_per_line; i++) {
- struct pblk_lun *rlun = &pblk->luns[i];
- struct nvm_chk_meta *chunk;
- struct nvm_chk_meta *chunk_meta;
- struct ppa_addr ppa;
- int pos;
-
- ppa = rlun->bppa;
- pos = pblk_ppa_to_pos(geo, ppa);
- chunk = &line->chks[pos];
-
- ppa.m.chk = line->id;
- chunk_meta = pblk_chunk_get_off(pblk, meta, ppa);
-
- chunk->state = chunk_meta->state;
- chunk->type = chunk_meta->type;
- chunk->wi = chunk_meta->wi;
- chunk->slba = chunk_meta->slba;
- chunk->cnlb = chunk_meta->cnlb;
- chunk->wp = chunk_meta->wp;
-
- trace_pblk_chunk_state(pblk_disk_name(pblk), &ppa,
- chunk->state);
-
- if (chunk->type & NVM_CHK_TP_SZ_SPEC) {
- WARN_ONCE(1, "pblk: custom-sized chunks unsupported\n");
- continue;
- }
-
- if (!(chunk->state & NVM_CHK_ST_OFFLINE))
- continue;
-
- set_bit(pos, line->blk_bitmap);
- nr_bad_chks++;
- }
-
- return nr_bad_chks;
-}
-
-static long pblk_setup_line_meta(struct pblk *pblk, struct pblk_line *line,
- void *chunk_meta, int line_id)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- long nr_bad_chks, chk_in_line;
-
- line->pblk = pblk;
- line->id = line_id;
- line->type = PBLK_LINETYPE_FREE;
- line->state = PBLK_LINESTATE_NEW;
- line->gc_group = PBLK_LINEGC_NONE;
- line->vsc = &l_mg->vsc_list[line_id];
- spin_lock_init(&line->lock);
-
- nr_bad_chks = pblk_setup_line_meta_chk(pblk, line, chunk_meta);
-
- chk_in_line = lm->blk_per_line - nr_bad_chks;
- if (nr_bad_chks < 0 || nr_bad_chks > lm->blk_per_line ||
- chk_in_line < lm->min_blk_line) {
- line->state = PBLK_LINESTATE_BAD;
- list_add_tail(&line->list, &l_mg->bad_list);
- return 0;
- }
-
- atomic_set(&line->blk_in_line, chk_in_line);
- list_add_tail(&line->list, &l_mg->free_list);
- l_mg->nr_free_lines++;
-
- return chk_in_line;
-}
-
-static int pblk_alloc_line_meta(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
-
- line->blk_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
- if (!line->blk_bitmap)
- return -ENOMEM;
-
- line->erase_bitmap = kzalloc(lm->blk_bitmap_len, GFP_KERNEL);
- if (!line->erase_bitmap)
- goto free_blk_bitmap;
-
-
- line->chks = kmalloc_array(lm->blk_per_line,
- sizeof(struct nvm_chk_meta), GFP_KERNEL);
- if (!line->chks)
- goto free_erase_bitmap;
-
- line->w_err_gc = kzalloc(sizeof(struct pblk_w_err_gc), GFP_KERNEL);
- if (!line->w_err_gc)
- goto free_chks;
-
- return 0;
-
-free_chks:
- kfree(line->chks);
-free_erase_bitmap:
- kfree(line->erase_bitmap);
-free_blk_bitmap:
- kfree(line->blk_bitmap);
- return -ENOMEM;
-}
-
-static int pblk_line_mg_init(struct pblk *pblk)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- int i, bb_distance;
-
- l_mg->nr_lines = geo->num_chk;
- l_mg->log_line = l_mg->data_line = NULL;
- l_mg->l_seq_nr = l_mg->d_seq_nr = 0;
- l_mg->nr_free_lines = 0;
- bitmap_zero(&l_mg->meta_bitmap, PBLK_DATA_LINES);
-
- INIT_LIST_HEAD(&l_mg->free_list);
- INIT_LIST_HEAD(&l_mg->corrupt_list);
- INIT_LIST_HEAD(&l_mg->bad_list);
- INIT_LIST_HEAD(&l_mg->gc_full_list);
- INIT_LIST_HEAD(&l_mg->gc_high_list);
- INIT_LIST_HEAD(&l_mg->gc_mid_list);
- INIT_LIST_HEAD(&l_mg->gc_low_list);
- INIT_LIST_HEAD(&l_mg->gc_empty_list);
- INIT_LIST_HEAD(&l_mg->gc_werr_list);
-
- INIT_LIST_HEAD(&l_mg->emeta_list);
-
- l_mg->gc_lists[0] = &l_mg->gc_werr_list;
- l_mg->gc_lists[1] = &l_mg->gc_high_list;
- l_mg->gc_lists[2] = &l_mg->gc_mid_list;
- l_mg->gc_lists[3] = &l_mg->gc_low_list;
-
- spin_lock_init(&l_mg->free_lock);
- spin_lock_init(&l_mg->close_lock);
- spin_lock_init(&l_mg->gc_lock);
-
- l_mg->vsc_list = kcalloc(l_mg->nr_lines, sizeof(__le32), GFP_KERNEL);
- if (!l_mg->vsc_list)
- goto fail;
-
- l_mg->bb_template = kzalloc(lm->sec_bitmap_len, GFP_KERNEL);
- if (!l_mg->bb_template)
- goto fail_free_vsc_list;
-
- l_mg->bb_aux = kzalloc(lm->sec_bitmap_len, GFP_KERNEL);
- if (!l_mg->bb_aux)
- goto fail_free_bb_template;
-
- /* smeta is always small enough to fit on a kmalloc memory allocation,
- * emeta depends on the number of LUNs allocated to the pblk instance
- */
- for (i = 0; i < PBLK_DATA_LINES; i++) {
- l_mg->sline_meta[i] = kmalloc(lm->smeta_len, GFP_KERNEL);
- if (!l_mg->sline_meta[i])
- goto fail_free_smeta;
- }
-
- l_mg->bitmap_cache = kmem_cache_create("pblk_lm_bitmap",
- lm->sec_bitmap_len, 0, 0, NULL);
- if (!l_mg->bitmap_cache)
- goto fail_free_smeta;
-
- /* the bitmap pool is used for both valid and map bitmaps */
- l_mg->bitmap_pool = mempool_create_slab_pool(PBLK_DATA_LINES * 2,
- l_mg->bitmap_cache);
- if (!l_mg->bitmap_pool)
- goto fail_destroy_bitmap_cache;
-
- /* emeta allocates three different buffers for managing metadata with
- * in-memory and in-media layouts
- */
- for (i = 0; i < PBLK_DATA_LINES; i++) {
- struct pblk_emeta *emeta;
-
- emeta = kmalloc(sizeof(struct pblk_emeta), GFP_KERNEL);
- if (!emeta)
- goto fail_free_emeta;
-
- emeta->buf = kvmalloc(lm->emeta_len[0], GFP_KERNEL);
- if (!emeta->buf) {
- kfree(emeta);
- goto fail_free_emeta;
- }
-
- emeta->nr_entries = lm->emeta_sec[0];
- l_mg->eline_meta[i] = emeta;
- }
-
- for (i = 0; i < l_mg->nr_lines; i++)
- l_mg->vsc_list[i] = cpu_to_le32(EMPTY_ENTRY);
-
- bb_distance = (geo->all_luns) * geo->ws_opt;
- for (i = 0; i < lm->sec_per_line; i += bb_distance)
- bitmap_set(l_mg->bb_template, i, geo->ws_opt);
-
- return 0;
-
-fail_free_emeta:
- while (--i >= 0) {
- kvfree(l_mg->eline_meta[i]->buf);
- kfree(l_mg->eline_meta[i]);
- }
-
- mempool_destroy(l_mg->bitmap_pool);
-fail_destroy_bitmap_cache:
- kmem_cache_destroy(l_mg->bitmap_cache);
-fail_free_smeta:
- for (i = 0; i < PBLK_DATA_LINES; i++)
- kfree(l_mg->sline_meta[i]);
- kfree(l_mg->bb_aux);
-fail_free_bb_template:
- kfree(l_mg->bb_template);
-fail_free_vsc_list:
- kfree(l_mg->vsc_list);
-fail:
- return -ENOMEM;
-}
-
-static int pblk_line_meta_init(struct pblk *pblk)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- unsigned int smeta_len, emeta_len;
- int i;
-
- lm->sec_per_line = geo->clba * geo->all_luns;
- lm->blk_per_line = geo->all_luns;
- lm->blk_bitmap_len = BITS_TO_LONGS(geo->all_luns) * sizeof(long);
- lm->sec_bitmap_len = BITS_TO_LONGS(lm->sec_per_line) * sizeof(long);
- lm->lun_bitmap_len = BITS_TO_LONGS(geo->all_luns) * sizeof(long);
- lm->mid_thrs = lm->sec_per_line / 2;
- lm->high_thrs = lm->sec_per_line / 4;
- lm->meta_distance = (geo->all_luns / 2) * pblk->min_write_pgs;
-
- /* Calculate necessary pages for smeta. See comment over struct
- * line_smeta definition
- */
- i = 1;
-add_smeta_page:
- lm->smeta_sec = i * geo->ws_opt;
- lm->smeta_len = lm->smeta_sec * geo->csecs;
-
- smeta_len = sizeof(struct line_smeta) + lm->lun_bitmap_len;
- if (smeta_len > lm->smeta_len) {
- i++;
- goto add_smeta_page;
- }
-
- /* Calculate necessary pages for emeta. See comment over struct
- * line_emeta definition
- */
- i = 1;
-add_emeta_page:
- lm->emeta_sec[0] = i * geo->ws_opt;
- lm->emeta_len[0] = lm->emeta_sec[0] * geo->csecs;
-
- emeta_len = calc_emeta_len(pblk);
- if (emeta_len > lm->emeta_len[0]) {
- i++;
- goto add_emeta_page;
- }
-
- lm->emeta_bb = geo->all_luns > i ? geo->all_luns - i : 0;
-
- lm->min_blk_line = 1;
- if (geo->all_luns > 1)
- lm->min_blk_line += DIV_ROUND_UP(lm->smeta_sec +
- lm->emeta_sec[0], geo->clba);
-
- if (lm->min_blk_line > lm->blk_per_line) {
- pblk_err(pblk, "config. not supported. Min. LUN in line:%d\n",
- lm->blk_per_line);
- return -EINVAL;
- }
-
- return 0;
-}
-
-static int pblk_lines_init(struct pblk *pblk)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *line;
- void *chunk_meta;
- int nr_free_chks = 0;
- int i, ret;
-
- ret = pblk_line_meta_init(pblk);
- if (ret)
- return ret;
-
- ret = pblk_line_mg_init(pblk);
- if (ret)
- return ret;
-
- ret = pblk_luns_init(pblk);
- if (ret)
- goto fail_free_meta;
-
- chunk_meta = pblk_get_chunk_meta(pblk);
- if (IS_ERR(chunk_meta)) {
- ret = PTR_ERR(chunk_meta);
- goto fail_free_luns;
- }
-
- pblk->lines = kcalloc(l_mg->nr_lines, sizeof(struct pblk_line),
- GFP_KERNEL);
- if (!pblk->lines) {
- ret = -ENOMEM;
- goto fail_free_chunk_meta;
- }
-
- for (i = 0; i < l_mg->nr_lines; i++) {
- line = &pblk->lines[i];
-
- ret = pblk_alloc_line_meta(pblk, line);
- if (ret)
- goto fail_free_lines;
-
- nr_free_chks += pblk_setup_line_meta(pblk, line, chunk_meta, i);
-
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- }
-
- if (!nr_free_chks) {
- pblk_err(pblk, "too many bad blocks prevent for sane instance\n");
- ret = -EINTR;
- goto fail_free_lines;
- }
-
- ret = pblk_set_provision(pblk, nr_free_chks);
- if (ret)
- goto fail_free_lines;
-
- vfree(chunk_meta);
- return 0;
-
-fail_free_lines:
- while (--i >= 0)
- pblk_line_meta_free(l_mg, &pblk->lines[i]);
- kfree(pblk->lines);
-fail_free_chunk_meta:
- vfree(chunk_meta);
-fail_free_luns:
- kfree(pblk->luns);
-fail_free_meta:
- pblk_line_mg_free(pblk);
-
- return ret;
-}
-
-static int pblk_writer_init(struct pblk *pblk)
-{
- pblk->writer_ts = kthread_create(pblk_write_ts, pblk, "pblk-writer-t");
- if (IS_ERR(pblk->writer_ts)) {
- int err = PTR_ERR(pblk->writer_ts);
-
- if (err != -EINTR)
- pblk_err(pblk, "could not allocate writer kthread (%d)\n",
- err);
- return err;
- }
-
- timer_setup(&pblk->wtimer, pblk_write_timer_fn, 0);
- mod_timer(&pblk->wtimer, jiffies + msecs_to_jiffies(100));
-
- return 0;
-}
-
-static void pblk_writer_stop(struct pblk *pblk)
-{
- /* The pipeline must be stopped and the write buffer emptied before the
- * write thread is stopped
- */
- WARN(pblk_rb_read_count(&pblk->rwb),
- "Stopping not fully persisted write buffer\n");
-
- WARN(pblk_rb_sync_count(&pblk->rwb),
- "Stopping not fully synced write buffer\n");
-
- del_timer_sync(&pblk->wtimer);
- if (pblk->writer_ts)
- kthread_stop(pblk->writer_ts);
-}
-
-static void pblk_free(struct pblk *pblk)
-{
- pblk_lines_free(pblk);
- pblk_l2p_free(pblk);
- pblk_rwb_free(pblk);
- pblk_core_free(pblk);
-
- kfree(pblk);
-}
-
-static void pblk_tear_down(struct pblk *pblk, bool graceful)
-{
- if (graceful)
- __pblk_pipeline_flush(pblk);
- __pblk_pipeline_stop(pblk);
- pblk_writer_stop(pblk);
- pblk_rb_sync_l2p(&pblk->rwb);
- pblk_rl_free(&pblk->rl);
-
- pblk_debug(pblk, "consistent tear down (graceful:%d)\n", graceful);
-}
-
-static void pblk_exit(void *private, bool graceful)
-{
- struct pblk *pblk = private;
-
- pblk_gc_exit(pblk, graceful);
- pblk_tear_down(pblk, graceful);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- pblk_info(pblk, "exit: L2P CRC: %x\n", pblk_l2p_crc(pblk));
-#endif
-
- pblk_free(pblk);
-}
-
-static sector_t pblk_capacity(void *private)
-{
- struct pblk *pblk = private;
-
- return pblk->capacity * NR_PHY_IN_LOG;
-}
-
-static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
- int flags)
-{
- struct nvm_geo *geo = &dev->geo;
- struct request_queue *bqueue = dev->q;
- struct request_queue *tqueue = tdisk->queue;
- struct pblk *pblk;
- int ret;
-
- pblk = kzalloc(sizeof(struct pblk), GFP_KERNEL);
- if (!pblk)
- return ERR_PTR(-ENOMEM);
-
- pblk->dev = dev;
- pblk->disk = tdisk;
- pblk->state = PBLK_STATE_RUNNING;
- trace_pblk_state(pblk_disk_name(pblk), pblk->state);
- pblk->gc.gc_enabled = 0;
-
- if (!(geo->version == NVM_OCSSD_SPEC_12 ||
- geo->version == NVM_OCSSD_SPEC_20)) {
- pblk_err(pblk, "OCSSD version not supported (%u)\n",
- geo->version);
- kfree(pblk);
- return ERR_PTR(-EINVAL);
- }
-
- if (geo->ext) {
- pblk_err(pblk, "extended metadata not supported\n");
- kfree(pblk);
- return ERR_PTR(-EINVAL);
- }
-
- spin_lock_init(&pblk->resubmit_lock);
- spin_lock_init(&pblk->trans_lock);
- spin_lock_init(&pblk->lock);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_set(&pblk->inflight_writes, 0);
- atomic_long_set(&pblk->padded_writes, 0);
- atomic_long_set(&pblk->padded_wb, 0);
- atomic_long_set(&pblk->req_writes, 0);
- atomic_long_set(&pblk->sub_writes, 0);
- atomic_long_set(&pblk->sync_writes, 0);
- atomic_long_set(&pblk->inflight_reads, 0);
- atomic_long_set(&pblk->cache_reads, 0);
- atomic_long_set(&pblk->sync_reads, 0);
- atomic_long_set(&pblk->recov_writes, 0);
- atomic_long_set(&pblk->recov_writes, 0);
- atomic_long_set(&pblk->recov_gc_writes, 0);
- atomic_long_set(&pblk->recov_gc_reads, 0);
-#endif
-
- atomic_long_set(&pblk->read_failed, 0);
- atomic_long_set(&pblk->read_empty, 0);
- atomic_long_set(&pblk->read_high_ecc, 0);
- atomic_long_set(&pblk->read_failed_gc, 0);
- atomic_long_set(&pblk->write_failed, 0);
- atomic_long_set(&pblk->erase_failed, 0);
-
- ret = pblk_core_init(pblk);
- if (ret) {
- pblk_err(pblk, "could not initialize core\n");
- goto fail;
- }
-
- ret = pblk_lines_init(pblk);
- if (ret) {
- pblk_err(pblk, "could not initialize lines\n");
- goto fail_free_core;
- }
-
- ret = pblk_rwb_init(pblk);
- if (ret) {
- pblk_err(pblk, "could not initialize write buffer\n");
- goto fail_free_lines;
- }
-
- ret = pblk_l2p_init(pblk, flags & NVM_TARGET_FACTORY);
- if (ret) {
- pblk_err(pblk, "could not initialize maps\n");
- goto fail_free_rwb;
- }
-
- ret = pblk_writer_init(pblk);
- if (ret) {
- if (ret != -EINTR)
- pblk_err(pblk, "could not initialize write thread\n");
- goto fail_free_l2p;
- }
-
- ret = pblk_gc_init(pblk);
- if (ret) {
- pblk_err(pblk, "could not initialize gc\n");
- goto fail_stop_writer;
- }
-
- /* inherit the size from the underlying device */
- blk_queue_logical_block_size(tqueue, queue_physical_block_size(bqueue));
- blk_queue_max_hw_sectors(tqueue, queue_max_hw_sectors(bqueue));
-
- blk_queue_write_cache(tqueue, true, false);
-
- tqueue->limits.discard_granularity = geo->clba * geo->csecs;
- tqueue->limits.discard_alignment = 0;
- blk_queue_max_discard_sectors(tqueue, UINT_MAX >> 9);
- blk_queue_flag_set(QUEUE_FLAG_DISCARD, tqueue);
-
- pblk_info(pblk, "luns:%u, lines:%d, secs:%llu, buf entries:%u\n",
- geo->all_luns, pblk->l_mg.nr_lines,
- (unsigned long long)pblk->capacity,
- pblk->rwb.nr_entries);
-
- wake_up_process(pblk->writer_ts);
-
- /* Check if we need to start GC */
- pblk_gc_should_kick(pblk);
-
- return pblk;
-
-fail_stop_writer:
- pblk_writer_stop(pblk);
-fail_free_l2p:
- pblk_l2p_free(pblk);
-fail_free_rwb:
- pblk_rwb_free(pblk);
-fail_free_lines:
- pblk_lines_free(pblk);
-fail_free_core:
- pblk_core_free(pblk);
-fail:
- kfree(pblk);
- return ERR_PTR(ret);
-}
-
-/* physical block device target */
-static struct nvm_tgt_type tt_pblk = {
- .name = "pblk",
- .version = {1, 0, 0},
-
- .bops = &pblk_bops,
- .capacity = pblk_capacity,
-
- .init = pblk_init,
- .exit = pblk_exit,
-
- .sysfs_init = pblk_sysfs_init,
- .sysfs_exit = pblk_sysfs_exit,
- .owner = THIS_MODULE,
-};
-
-static int __init pblk_module_init(void)
-{
- int ret;
-
- ret = bioset_init(&pblk_bio_set, BIO_POOL_SIZE, 0, 0);
- if (ret)
- return ret;
- ret = nvm_register_tgt_type(&tt_pblk);
- if (ret)
- bioset_exit(&pblk_bio_set);
- return ret;
-}
-
-static void pblk_module_exit(void)
-{
- bioset_exit(&pblk_bio_set);
- nvm_unregister_tgt_type(&tt_pblk);
-}
-
-module_init(pblk_module_init);
-module_exit(pblk_module_exit);
-MODULE_AUTHOR("Javier Gonzalez <javier@cnexlabs.com>");
-MODULE_AUTHOR("Matias Bjorling <matias@cnexlabs.com>");
-MODULE_LICENSE("GPL v2");
-MODULE_DESCRIPTION("Physical Block-Device for Open-Channel SSDs");
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-map.c - pblk's lba-ppa mapping strategy
- *
- */
-
-#include "pblk.h"
-
-static int pblk_map_page_data(struct pblk *pblk, unsigned int sentry,
- struct ppa_addr *ppa_list,
- unsigned long *lun_bitmap,
- void *meta_list,
- unsigned int valid_secs)
-{
- struct pblk_line *line = pblk_line_get_data(pblk);
- struct pblk_emeta *emeta;
- struct pblk_w_ctx *w_ctx;
- __le64 *lba_list;
- u64 paddr;
- int nr_secs = pblk->min_write_pgs;
- int i;
-
- if (!line)
- return -ENOSPC;
-
- if (pblk_line_is_full(line)) {
- struct pblk_line *prev_line = line;
-
- /* If we cannot allocate a new line, make sure to store metadata
- * on current line and then fail
- */
- line = pblk_line_replace_data(pblk);
- pblk_line_close_meta(pblk, prev_line);
-
- if (!line) {
- pblk_pipeline_stop(pblk);
- return -ENOSPC;
- }
-
- }
-
- emeta = line->emeta;
- lba_list = emeta_to_lbas(pblk, emeta->buf);
-
- paddr = pblk_alloc_page(pblk, line, nr_secs);
-
- for (i = 0; i < nr_secs; i++, paddr++) {
- struct pblk_sec_meta *meta = pblk_get_meta(pblk, meta_list, i);
- __le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
-
- /* ppa to be sent to the device */
- ppa_list[i] = addr_to_gen_ppa(pblk, paddr, line->id);
-
- /* Write context for target bio completion on write buffer. Note
- * that the write buffer is protected by the sync backpointer,
- * and a single writer thread have access to each specific entry
- * at a time. Thus, it is safe to modify the context for the
- * entry we are setting up for submission without taking any
- * lock or memory barrier.
- */
- if (i < valid_secs) {
- kref_get(&line->ref);
- atomic_inc(&line->sec_to_update);
- w_ctx = pblk_rb_w_ctx(&pblk->rwb, sentry + i);
- w_ctx->ppa = ppa_list[i];
- meta->lba = cpu_to_le64(w_ctx->lba);
- lba_list[paddr] = cpu_to_le64(w_ctx->lba);
- if (lba_list[paddr] != addr_empty)
- line->nr_valid_lbas++;
- else
- atomic64_inc(&pblk->pad_wa);
- } else {
- lba_list[paddr] = addr_empty;
- meta->lba = addr_empty;
- __pblk_map_invalidate(pblk, line, paddr);
- }
- }
-
- pblk_down_rq(pblk, ppa_list[0], lun_bitmap);
- return 0;
-}
-
-int pblk_map_rq(struct pblk *pblk, struct nvm_rq *rqd, unsigned int sentry,
- unsigned long *lun_bitmap, unsigned int valid_secs,
- unsigned int off)
-{
- void *meta_list = pblk_get_meta_for_writes(pblk, rqd);
- void *meta_buffer;
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
- unsigned int map_secs;
- int min = pblk->min_write_pgs;
- int i;
- int ret;
-
- for (i = off; i < rqd->nr_ppas; i += min) {
- map_secs = (i + min > valid_secs) ? (valid_secs % min) : min;
- meta_buffer = pblk_get_meta(pblk, meta_list, i);
-
- ret = pblk_map_page_data(pblk, sentry + i, &ppa_list[i],
- lun_bitmap, meta_buffer, map_secs);
- if (ret)
- return ret;
- }
-
- return 0;
-}
-
-/* only if erase_ppa is set, acquire erase semaphore */
-int pblk_map_erase_rq(struct pblk *pblk, struct nvm_rq *rqd,
- unsigned int sentry, unsigned long *lun_bitmap,
- unsigned int valid_secs, struct ppa_addr *erase_ppa)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- void *meta_list = pblk_get_meta_for_writes(pblk, rqd);
- void *meta_buffer;
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
- struct pblk_line *e_line, *d_line;
- unsigned int map_secs;
- int min = pblk->min_write_pgs;
- int i, erase_lun;
- int ret;
-
-
- for (i = 0; i < rqd->nr_ppas; i += min) {
- map_secs = (i + min > valid_secs) ? (valid_secs % min) : min;
- meta_buffer = pblk_get_meta(pblk, meta_list, i);
-
- ret = pblk_map_page_data(pblk, sentry + i, &ppa_list[i],
- lun_bitmap, meta_buffer, map_secs);
- if (ret)
- return ret;
-
- erase_lun = pblk_ppa_to_pos(geo, ppa_list[i]);
-
- /* line can change after page map. We might also be writing the
- * last line.
- */
- e_line = pblk_line_get_erase(pblk);
- if (!e_line)
- return pblk_map_rq(pblk, rqd, sentry, lun_bitmap,
- valid_secs, i + min);
-
- spin_lock(&e_line->lock);
- if (!test_bit(erase_lun, e_line->erase_bitmap)) {
- set_bit(erase_lun, e_line->erase_bitmap);
- atomic_dec(&e_line->left_eblks);
-
- *erase_ppa = ppa_list[i];
- erase_ppa->a.blk = e_line->id;
- erase_ppa->a.reserved = 0;
-
- spin_unlock(&e_line->lock);
-
- /* Avoid evaluating e_line->left_eblks */
- return pblk_map_rq(pblk, rqd, sentry, lun_bitmap,
- valid_secs, i + min);
- }
- spin_unlock(&e_line->lock);
- }
-
- d_line = pblk_line_get_data(pblk);
-
- /* line can change after page map. We might also be writing the
- * last line.
- */
- e_line = pblk_line_get_erase(pblk);
- if (!e_line)
- return -ENOSPC;
-
- /* Erase blocks that are bad in this line but might not be in next */
- if (unlikely(pblk_ppa_empty(*erase_ppa)) &&
- bitmap_weight(d_line->blk_bitmap, lm->blk_per_line)) {
- int bit = -1;
-
-retry:
- bit = find_next_bit(d_line->blk_bitmap,
- lm->blk_per_line, bit + 1);
- if (bit >= lm->blk_per_line)
- return 0;
-
- spin_lock(&e_line->lock);
- if (test_bit(bit, e_line->erase_bitmap)) {
- spin_unlock(&e_line->lock);
- goto retry;
- }
- spin_unlock(&e_line->lock);
-
- set_bit(bit, e_line->erase_bitmap);
- atomic_dec(&e_line->left_eblks);
- *erase_ppa = pblk->luns[bit].bppa; /* set ch and lun */
- erase_ppa->a.blk = e_line->id;
- }
-
- return 0;
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- *
- * Based upon the circular ringbuffer.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-rb.c - pblk's write buffer
- */
-
-#include <linux/circ_buf.h>
-
-#include "pblk.h"
-
-static DECLARE_RWSEM(pblk_rb_lock);
-
-static void pblk_rb_data_free(struct pblk_rb *rb)
-{
- struct pblk_rb_pages *p, *t;
-
- down_write(&pblk_rb_lock);
- list_for_each_entry_safe(p, t, &rb->pages, list) {
- free_pages((unsigned long)page_address(p->pages), p->order);
- list_del(&p->list);
- kfree(p);
- }
- up_write(&pblk_rb_lock);
-}
-
-void pblk_rb_free(struct pblk_rb *rb)
-{
- pblk_rb_data_free(rb);
- vfree(rb->entries);
-}
-
-/*
- * pblk_rb_calculate_size -- calculate the size of the write buffer
- */
-static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
- unsigned int threshold)
-{
- unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
- unsigned int max_sz = max(thr_sz, nr_entries);
- unsigned int max_io;
-
- /* Alloc a write buffer that can (i) fit at least two split bios
- * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
- * threshold will be respected
- */
- max_io = (1 << max((int)(get_count_order(max_sz)),
- (int)(get_count_order(NVM_MAX_VLBA << 1))));
- if ((threshold + NVM_MAX_VLBA) >= max_io)
- max_io <<= 1;
-
- return max_io;
-}
-
-/*
- * Initialize ring buffer. The data and metadata buffers must be previously
- * allocated and their size must be a power of two
- * (Documentation/core-api/circular-buffers.rst)
- */
-int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
- unsigned int seg_size)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- struct pblk_rb_entry *entries;
- unsigned int init_entry = 0;
- unsigned int max_order = MAX_ORDER - 1;
- unsigned int power_size, power_seg_sz;
- unsigned int alloc_order, order, iter;
- unsigned int nr_entries;
-
- nr_entries = pblk_rb_calculate_size(size, threshold);
- entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
- if (!entries)
- return -ENOMEM;
-
- power_size = get_count_order(nr_entries);
- power_seg_sz = get_count_order(seg_size);
-
- down_write(&pblk_rb_lock);
- rb->entries = entries;
- rb->seg_size = (1 << power_seg_sz);
- rb->nr_entries = (1 << power_size);
- rb->mem = rb->subm = rb->sync = rb->l2p_update = 0;
- rb->back_thres = threshold;
- rb->flush_point = EMPTY_ENTRY;
-
- spin_lock_init(&rb->w_lock);
- spin_lock_init(&rb->s_lock);
-
- INIT_LIST_HEAD(&rb->pages);
-
- alloc_order = power_size;
- if (alloc_order >= max_order) {
- order = max_order;
- iter = (1 << (alloc_order - max_order));
- } else {
- order = alloc_order;
- iter = 1;
- }
-
- do {
- struct pblk_rb_entry *entry;
- struct pblk_rb_pages *page_set;
- void *kaddr;
- unsigned long set_size;
- int i;
-
- page_set = kmalloc(sizeof(struct pblk_rb_pages), GFP_KERNEL);
- if (!page_set) {
- up_write(&pblk_rb_lock);
- vfree(entries);
- return -ENOMEM;
- }
-
- page_set->order = order;
- page_set->pages = alloc_pages(GFP_KERNEL, order);
- if (!page_set->pages) {
- kfree(page_set);
- pblk_rb_data_free(rb);
- up_write(&pblk_rb_lock);
- vfree(entries);
- return -ENOMEM;
- }
- kaddr = page_address(page_set->pages);
-
- entry = &rb->entries[init_entry];
- entry->data = kaddr;
- entry->cacheline = pblk_cacheline_to_addr(init_entry++);
- entry->w_ctx.flags = PBLK_WRITABLE_ENTRY;
-
- set_size = (1 << order);
- for (i = 1; i < set_size; i++) {
- entry = &rb->entries[init_entry];
- entry->cacheline = pblk_cacheline_to_addr(init_entry++);
- entry->data = kaddr + (i * rb->seg_size);
- entry->w_ctx.flags = PBLK_WRITABLE_ENTRY;
- bio_list_init(&entry->w_ctx.bios);
- }
-
- list_add_tail(&page_set->list, &rb->pages);
- iter--;
- } while (iter > 0);
- up_write(&pblk_rb_lock);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_set(&rb->inflight_flush_point, 0);
-#endif
-
- /*
- * Initialize rate-limiter, which controls access to the write buffer
- * by user and GC I/O
- */
- pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);
-
- return 0;
-}
-
-static void clean_wctx(struct pblk_w_ctx *w_ctx)
-{
- int flags;
-
- flags = READ_ONCE(w_ctx->flags);
- WARN_ONCE(!(flags & PBLK_SUBMITTED_ENTRY),
- "pblk: overwriting unsubmitted data\n");
-
- /* Release flags on context. Protect from writes and reads */
- smp_store_release(&w_ctx->flags, PBLK_WRITABLE_ENTRY);
- pblk_ppa_set_empty(&w_ctx->ppa);
- w_ctx->lba = ADDR_EMPTY;
-}
-
-#define pblk_rb_ring_count(head, tail, size) CIRC_CNT(head, tail, size)
-#define pblk_rb_ring_space(rb, head, tail, size) \
- (CIRC_SPACE(head, tail, size))
-
-/*
- * Buffer space is calculated with respect to the back pointer signaling
- * synchronized entries to the media.
- */
-static unsigned int pblk_rb_space(struct pblk_rb *rb)
-{
- unsigned int mem = READ_ONCE(rb->mem);
- unsigned int sync = READ_ONCE(rb->sync);
-
- return pblk_rb_ring_space(rb, mem, sync, rb->nr_entries);
-}
-
-unsigned int pblk_rb_ptr_wrap(struct pblk_rb *rb, unsigned int p,
- unsigned int nr_entries)
-{
- return (p + nr_entries) & (rb->nr_entries - 1);
-}
-
-/*
- * Buffer count is calculated with respect to the submission entry signaling the
- * entries that are available to send to the media
- */
-unsigned int pblk_rb_read_count(struct pblk_rb *rb)
-{
- unsigned int mem = READ_ONCE(rb->mem);
- unsigned int subm = READ_ONCE(rb->subm);
-
- return pblk_rb_ring_count(mem, subm, rb->nr_entries);
-}
-
-unsigned int pblk_rb_sync_count(struct pblk_rb *rb)
-{
- unsigned int mem = READ_ONCE(rb->mem);
- unsigned int sync = READ_ONCE(rb->sync);
-
- return pblk_rb_ring_count(mem, sync, rb->nr_entries);
-}
-
-unsigned int pblk_rb_read_commit(struct pblk_rb *rb, unsigned int nr_entries)
-{
- unsigned int subm;
-
- subm = READ_ONCE(rb->subm);
- /* Commit read means updating submission pointer */
- smp_store_release(&rb->subm, pblk_rb_ptr_wrap(rb, subm, nr_entries));
-
- return subm;
-}
-
-static int __pblk_rb_update_l2p(struct pblk_rb *rb, unsigned int to_update)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- struct pblk_line *line;
- struct pblk_rb_entry *entry;
- struct pblk_w_ctx *w_ctx;
- unsigned int user_io = 0, gc_io = 0;
- unsigned int i;
- int flags;
-
- for (i = 0; i < to_update; i++) {
- entry = &rb->entries[rb->l2p_update];
- w_ctx = &entry->w_ctx;
-
- flags = READ_ONCE(entry->w_ctx.flags);
- if (flags & PBLK_IOTYPE_USER)
- user_io++;
- else if (flags & PBLK_IOTYPE_GC)
- gc_io++;
- else
- WARN(1, "pblk: unknown IO type\n");
-
- pblk_update_map_dev(pblk, w_ctx->lba, w_ctx->ppa,
- entry->cacheline);
-
- line = pblk_ppa_to_line(pblk, w_ctx->ppa);
- atomic_dec(&line->sec_to_update);
- kref_put(&line->ref, pblk_line_put);
- clean_wctx(w_ctx);
- rb->l2p_update = pblk_rb_ptr_wrap(rb, rb->l2p_update, 1);
- }
-
- pblk_rl_out(&pblk->rl, user_io, gc_io);
-
- return 0;
-}
-
-/*
- * When we move the l2p_update pointer, we update the l2p table - lookups will
- * point to the physical address instead of to the cacheline in the write buffer
- * from this moment on.
- */
-static int pblk_rb_update_l2p(struct pblk_rb *rb, unsigned int nr_entries,
- unsigned int mem, unsigned int sync)
-{
- unsigned int space, count;
- int ret = 0;
-
- lockdep_assert_held(&rb->w_lock);
-
- /* Update l2p only as buffer entries are being overwritten */
- space = pblk_rb_ring_space(rb, mem, rb->l2p_update, rb->nr_entries);
- if (space > nr_entries)
- goto out;
-
- count = nr_entries - space;
- /* l2p_update used exclusively under rb->w_lock */
- ret = __pblk_rb_update_l2p(rb, count);
-
-out:
- return ret;
-}
-
-/*
- * Update the l2p entry for all sectors stored on the write buffer. This means
- * that all future lookups to the l2p table will point to a device address, not
- * to the cacheline in the write buffer.
- */
-void pblk_rb_sync_l2p(struct pblk_rb *rb)
-{
- unsigned int sync;
- unsigned int to_update;
-
- spin_lock(&rb->w_lock);
-
- /* Protect from reads and writes */
- sync = smp_load_acquire(&rb->sync);
-
- to_update = pblk_rb_ring_count(sync, rb->l2p_update, rb->nr_entries);
- __pblk_rb_update_l2p(rb, to_update);
-
- spin_unlock(&rb->w_lock);
-}
-
-/*
- * Write @nr_entries to ring buffer from @data buffer if there is enough space.
- * Typically, 4KB data chunks coming from a bio will be copied to the ring
- * buffer, thus the write will fail if not all incoming data can be copied.
- *
- */
-static void __pblk_rb_write_entry(struct pblk_rb *rb, void *data,
- struct pblk_w_ctx w_ctx,
- struct pblk_rb_entry *entry)
-{
- memcpy(entry->data, data, rb->seg_size);
-
- entry->w_ctx.lba = w_ctx.lba;
- entry->w_ctx.ppa = w_ctx.ppa;
-}
-
-void pblk_rb_write_entry_user(struct pblk_rb *rb, void *data,
- struct pblk_w_ctx w_ctx, unsigned int ring_pos)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- struct pblk_rb_entry *entry;
- int flags;
-
- entry = &rb->entries[ring_pos];
- flags = READ_ONCE(entry->w_ctx.flags);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Caller must guarantee that the entry is free */
- BUG_ON(!(flags & PBLK_WRITABLE_ENTRY));
-#endif
-
- __pblk_rb_write_entry(rb, data, w_ctx, entry);
-
- pblk_update_map_cache(pblk, w_ctx.lba, entry->cacheline);
- flags = w_ctx.flags | PBLK_WRITTEN_DATA;
-
- /* Release flags on write context. Protect from writes */
- smp_store_release(&entry->w_ctx.flags, flags);
-}
-
-void pblk_rb_write_entry_gc(struct pblk_rb *rb, void *data,
- struct pblk_w_ctx w_ctx, struct pblk_line *line,
- u64 paddr, unsigned int ring_pos)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- struct pblk_rb_entry *entry;
- int flags;
-
- entry = &rb->entries[ring_pos];
- flags = READ_ONCE(entry->w_ctx.flags);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Caller must guarantee that the entry is free */
- BUG_ON(!(flags & PBLK_WRITABLE_ENTRY));
-#endif
-
- __pblk_rb_write_entry(rb, data, w_ctx, entry);
-
- if (!pblk_update_map_gc(pblk, w_ctx.lba, entry->cacheline, line, paddr))
- entry->w_ctx.lba = ADDR_EMPTY;
-
- flags = w_ctx.flags | PBLK_WRITTEN_DATA;
-
- /* Release flags on write context. Protect from writes */
- smp_store_release(&entry->w_ctx.flags, flags);
-}
-
-static int pblk_rb_flush_point_set(struct pblk_rb *rb, struct bio *bio,
- unsigned int pos)
-{
- struct pblk_rb_entry *entry;
- unsigned int sync, flush_point;
-
- pblk_rb_sync_init(rb, NULL);
- sync = READ_ONCE(rb->sync);
-
- if (pos == sync) {
- pblk_rb_sync_end(rb, NULL);
- return 0;
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_inc(&rb->inflight_flush_point);
-#endif
-
- flush_point = (pos == 0) ? (rb->nr_entries - 1) : (pos - 1);
- entry = &rb->entries[flush_point];
-
- /* Protect flush points */
- smp_store_release(&rb->flush_point, flush_point);
-
- if (bio)
- bio_list_add(&entry->w_ctx.bios, bio);
-
- pblk_rb_sync_end(rb, NULL);
-
- return bio ? 1 : 0;
-}
-
-static int __pblk_rb_may_write(struct pblk_rb *rb, unsigned int nr_entries,
- unsigned int *pos)
-{
- unsigned int mem;
- unsigned int sync;
- unsigned int threshold;
-
- sync = READ_ONCE(rb->sync);
- mem = READ_ONCE(rb->mem);
-
- threshold = nr_entries + rb->back_thres;
-
- if (pblk_rb_ring_space(rb, mem, sync, rb->nr_entries) < threshold)
- return 0;
-
- if (pblk_rb_update_l2p(rb, nr_entries, mem, sync))
- return 0;
-
- *pos = mem;
-
- return 1;
-}
-
-static int pblk_rb_may_write(struct pblk_rb *rb, unsigned int nr_entries,
- unsigned int *pos)
-{
- if (!__pblk_rb_may_write(rb, nr_entries, pos))
- return 0;
-
- /* Protect from read count */
- smp_store_release(&rb->mem, pblk_rb_ptr_wrap(rb, *pos, nr_entries));
- return 1;
-}
-
-void pblk_rb_flush(struct pblk_rb *rb)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- unsigned int mem = READ_ONCE(rb->mem);
-
- if (pblk_rb_flush_point_set(rb, NULL, mem))
- return;
-
- pblk_write_kick(pblk);
-}
-
-static int pblk_rb_may_write_flush(struct pblk_rb *rb, unsigned int nr_entries,
- unsigned int *pos, struct bio *bio,
- int *io_ret)
-{
- unsigned int mem;
-
- if (!__pblk_rb_may_write(rb, nr_entries, pos))
- return 0;
-
- mem = pblk_rb_ptr_wrap(rb, *pos, nr_entries);
- *io_ret = NVM_IO_DONE;
-
- if (bio->bi_opf & REQ_PREFLUSH) {
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
-
- atomic64_inc(&pblk->nr_flush);
- if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
- *io_ret = NVM_IO_OK;
- }
-
- /* Protect from read count */
- smp_store_release(&rb->mem, mem);
-
- return 1;
-}
-
-/*
- * Atomically check that (i) there is space on the write buffer for the
- * incoming I/O, and (ii) the current I/O type has enough budget in the write
- * buffer (rate-limiter).
- */
-int pblk_rb_may_write_user(struct pblk_rb *rb, struct bio *bio,
- unsigned int nr_entries, unsigned int *pos)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- int io_ret;
-
- spin_lock(&rb->w_lock);
- io_ret = pblk_rl_user_may_insert(&pblk->rl, nr_entries);
- if (io_ret) {
- spin_unlock(&rb->w_lock);
- return io_ret;
- }
-
- if (!pblk_rb_may_write_flush(rb, nr_entries, pos, bio, &io_ret)) {
- spin_unlock(&rb->w_lock);
- return NVM_IO_REQUEUE;
- }
-
- pblk_rl_user_in(&pblk->rl, nr_entries);
- spin_unlock(&rb->w_lock);
-
- return io_ret;
-}
-
-/*
- * Look at pblk_rb_may_write_user comment
- */
-int pblk_rb_may_write_gc(struct pblk_rb *rb, unsigned int nr_entries,
- unsigned int *pos)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
-
- spin_lock(&rb->w_lock);
- if (!pblk_rl_gc_may_insert(&pblk->rl, nr_entries)) {
- spin_unlock(&rb->w_lock);
- return 0;
- }
-
- if (!pblk_rb_may_write(rb, nr_entries, pos)) {
- spin_unlock(&rb->w_lock);
- return 0;
- }
-
- pblk_rl_gc_in(&pblk->rl, nr_entries);
- spin_unlock(&rb->w_lock);
-
- return 1;
-}
-
-/*
- * Read available entries on rb and add them to the given bio. To avoid a memory
- * copy, a page reference to the write buffer is used to be added to the bio.
- *
- * This function is used by the write thread to form the write bio that will
- * persist data on the write buffer to the media.
- */
-unsigned int pblk_rb_read_to_bio(struct pblk_rb *rb, struct nvm_rq *rqd,
- unsigned int pos, unsigned int nr_entries,
- unsigned int count)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- struct request_queue *q = pblk->dev->q;
- struct pblk_c_ctx *c_ctx = nvm_rq_to_pdu(rqd);
- struct bio *bio = rqd->bio;
- struct pblk_rb_entry *entry;
- struct page *page;
- unsigned int pad = 0, to_read = nr_entries;
- unsigned int i;
- int flags;
-
- if (count < nr_entries) {
- pad = nr_entries - count;
- to_read = count;
- }
-
- /* Add space for packed metadata if in use*/
- pad += (pblk->min_write_pgs - pblk->min_write_pgs_data);
-
- c_ctx->sentry = pos;
- c_ctx->nr_valid = to_read;
- c_ctx->nr_padded = pad;
-
- for (i = 0; i < to_read; i++) {
- entry = &rb->entries[pos];
-
- /* A write has been allowed into the buffer, but data is still
- * being copied to it. It is ok to busy wait.
- */
-try:
- flags = READ_ONCE(entry->w_ctx.flags);
- if (!(flags & PBLK_WRITTEN_DATA)) {
- io_schedule();
- goto try;
- }
-
- page = virt_to_page(entry->data);
- if (!page) {
- pblk_err(pblk, "could not allocate write bio page\n");
- flags &= ~PBLK_WRITTEN_DATA;
- flags |= PBLK_SUBMITTED_ENTRY;
- /* Release flags on context. Protect from writes */
- smp_store_release(&entry->w_ctx.flags, flags);
- return NVM_IO_ERR;
- }
-
- if (bio_add_pc_page(q, bio, page, rb->seg_size, 0) !=
- rb->seg_size) {
- pblk_err(pblk, "could not add page to write bio\n");
- flags &= ~PBLK_WRITTEN_DATA;
- flags |= PBLK_SUBMITTED_ENTRY;
- /* Release flags on context. Protect from writes */
- smp_store_release(&entry->w_ctx.flags, flags);
- return NVM_IO_ERR;
- }
-
- flags &= ~PBLK_WRITTEN_DATA;
- flags |= PBLK_SUBMITTED_ENTRY;
-
- /* Release flags on context. Protect from writes */
- smp_store_release(&entry->w_ctx.flags, flags);
-
- pos = pblk_rb_ptr_wrap(rb, pos, 1);
- }
-
- if (pad) {
- if (pblk_bio_add_pages(pblk, bio, GFP_KERNEL, pad)) {
- pblk_err(pblk, "could not pad page in write bio\n");
- return NVM_IO_ERR;
- }
-
- if (pad < pblk->min_write_pgs)
- atomic64_inc(&pblk->pad_dist[pad - 1]);
- else
- pblk_warn(pblk, "padding more than min. sectors\n");
-
- atomic64_add(pad, &pblk->pad_wa);
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(pad, &pblk->padded_writes);
-#endif
-
- return NVM_IO_OK;
-}
-
-/*
- * Copy to bio only if the lba matches the one on the given cache entry.
- * Otherwise, it means that the entry has been overwritten, and the bio should
- * be directed to disk.
- */
-int pblk_rb_copy_to_bio(struct pblk_rb *rb, struct bio *bio, sector_t lba,
- struct ppa_addr ppa)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- struct pblk_rb_entry *entry;
- struct pblk_w_ctx *w_ctx;
- struct ppa_addr l2p_ppa;
- u64 pos = pblk_addr_to_cacheline(ppa);
- void *data;
- int flags;
- int ret = 1;
-
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Caller must ensure that the access will not cause an overflow */
- BUG_ON(pos >= rb->nr_entries);
-#endif
- entry = &rb->entries[pos];
- w_ctx = &entry->w_ctx;
- flags = READ_ONCE(w_ctx->flags);
-
- spin_lock(&rb->w_lock);
- spin_lock(&pblk->trans_lock);
- l2p_ppa = pblk_trans_map_get(pblk, lba);
- spin_unlock(&pblk->trans_lock);
-
- /* Check if the entry has been overwritten or is scheduled to be */
- if (!pblk_ppa_comp(l2p_ppa, ppa) || w_ctx->lba != lba ||
- flags & PBLK_WRITABLE_ENTRY) {
- ret = 0;
- goto out;
- }
- data = bio_data(bio);
- memcpy(data, entry->data, rb->seg_size);
-
-out:
- spin_unlock(&rb->w_lock);
- return ret;
-}
-
-struct pblk_w_ctx *pblk_rb_w_ctx(struct pblk_rb *rb, unsigned int pos)
-{
- unsigned int entry = pblk_rb_ptr_wrap(rb, pos, 0);
-
- return &rb->entries[entry].w_ctx;
-}
-
-unsigned int pblk_rb_sync_init(struct pblk_rb *rb, unsigned long *flags)
- __acquires(&rb->s_lock)
-{
- if (flags)
- spin_lock_irqsave(&rb->s_lock, *flags);
- else
- spin_lock_irq(&rb->s_lock);
-
- return rb->sync;
-}
-
-void pblk_rb_sync_end(struct pblk_rb *rb, unsigned long *flags)
- __releases(&rb->s_lock)
-{
- lockdep_assert_held(&rb->s_lock);
-
- if (flags)
- spin_unlock_irqrestore(&rb->s_lock, *flags);
- else
- spin_unlock_irq(&rb->s_lock);
-}
-
-unsigned int pblk_rb_sync_advance(struct pblk_rb *rb, unsigned int nr_entries)
-{
- unsigned int sync, flush_point;
- lockdep_assert_held(&rb->s_lock);
-
- sync = READ_ONCE(rb->sync);
- flush_point = READ_ONCE(rb->flush_point);
-
- if (flush_point != EMPTY_ENTRY) {
- unsigned int secs_to_flush;
-
- secs_to_flush = pblk_rb_ring_count(flush_point, sync,
- rb->nr_entries);
- if (secs_to_flush < nr_entries) {
- /* Protect flush points */
- smp_store_release(&rb->flush_point, EMPTY_ENTRY);
- }
- }
-
- sync = pblk_rb_ptr_wrap(rb, sync, nr_entries);
-
- /* Protect from counts */
- smp_store_release(&rb->sync, sync);
-
- return sync;
-}
-
-/* Calculate how many sectors to submit up to the current flush point. */
-unsigned int pblk_rb_flush_point_count(struct pblk_rb *rb)
-{
- unsigned int subm, sync, flush_point;
- unsigned int submitted, to_flush;
-
- /* Protect flush points */
- flush_point = smp_load_acquire(&rb->flush_point);
- if (flush_point == EMPTY_ENTRY)
- return 0;
-
- /* Protect syncs */
- sync = smp_load_acquire(&rb->sync);
-
- subm = READ_ONCE(rb->subm);
- submitted = pblk_rb_ring_count(subm, sync, rb->nr_entries);
-
- /* The sync point itself counts as a sector to sync */
- to_flush = pblk_rb_ring_count(flush_point, sync, rb->nr_entries) + 1;
-
- return (submitted < to_flush) ? (to_flush - submitted) : 0;
-}
-
-int pblk_rb_tear_down_check(struct pblk_rb *rb)
-{
- struct pblk_rb_entry *entry;
- int i;
- int ret = 0;
-
- spin_lock(&rb->w_lock);
- spin_lock_irq(&rb->s_lock);
-
- if ((rb->mem == rb->subm) && (rb->subm == rb->sync) &&
- (rb->sync == rb->l2p_update) &&
- (rb->flush_point == EMPTY_ENTRY)) {
- goto out;
- }
-
- if (!rb->entries) {
- ret = 1;
- goto out;
- }
-
- for (i = 0; i < rb->nr_entries; i++) {
- entry = &rb->entries[i];
-
- if (!entry->data) {
- ret = 1;
- goto out;
- }
- }
-
-out:
- spin_unlock_irq(&rb->s_lock);
- spin_unlock(&rb->w_lock);
-
- return ret;
-}
-
-unsigned int pblk_rb_wrap_pos(struct pblk_rb *rb, unsigned int pos)
-{
- return (pos & (rb->nr_entries - 1));
-}
-
-int pblk_rb_pos_oob(struct pblk_rb *rb, u64 pos)
-{
- return (pos >= rb->nr_entries);
-}
-
-ssize_t pblk_rb_sysfs(struct pblk_rb *rb, char *buf)
-{
- struct pblk *pblk = container_of(rb, struct pblk, rwb);
- struct pblk_c_ctx *c;
- ssize_t offset;
- int queued_entries = 0;
-
- spin_lock_irq(&rb->s_lock);
- list_for_each_entry(c, &pblk->compl_list, list)
- queued_entries++;
- spin_unlock_irq(&rb->s_lock);
-
- if (rb->flush_point != EMPTY_ENTRY)
- offset = scnprintf(buf, PAGE_SIZE,
- "%u\t%u\t%u\t%u\t%u\t%u\t%u - %u/%u/%u - %d\n",
- rb->nr_entries,
- rb->mem,
- rb->subm,
- rb->sync,
- rb->l2p_update,
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_read(&rb->inflight_flush_point),
-#else
- 0,
-#endif
- rb->flush_point,
- pblk_rb_read_count(rb),
- pblk_rb_space(rb),
- pblk_rb_flush_point_count(rb),
- queued_entries);
- else
- offset = scnprintf(buf, PAGE_SIZE,
- "%u\t%u\t%u\t%u\t%u\t%u\tNULL - %u/%u/%u - %d\n",
- rb->nr_entries,
- rb->mem,
- rb->subm,
- rb->sync,
- rb->l2p_update,
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_read(&rb->inflight_flush_point),
-#else
- 0,
-#endif
- pblk_rb_read_count(rb),
- pblk_rb_space(rb),
- pblk_rb_flush_point_count(rb),
- queued_entries);
-
- return offset;
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-read.c - pblk's read path
- */
-
-#include "pblk.h"
-
-/*
- * There is no guarantee that the value read from cache has not been updated and
- * resides at another location in the cache. We guarantee though that if the
- * value is read from the cache, it belongs to the mapped lba. In order to
- * guarantee and order between writes and reads are ordered, a flush must be
- * issued.
- */
-static int pblk_read_from_cache(struct pblk *pblk, struct bio *bio,
- sector_t lba, struct ppa_addr ppa)
-{
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Callers must ensure that the ppa points to a cache address */
- BUG_ON(pblk_ppa_empty(ppa));
- BUG_ON(!pblk_addr_in_cache(ppa));
-#endif
-
- return pblk_rb_copy_to_bio(&pblk->rwb, bio, lba, ppa);
-}
-
-static int pblk_read_ppalist_rq(struct pblk *pblk, struct nvm_rq *rqd,
- struct bio *bio, sector_t blba,
- bool *from_cache)
-{
- void *meta_list = rqd->meta_list;
- int nr_secs, i;
-
-retry:
- nr_secs = pblk_lookup_l2p_seq(pblk, rqd->ppa_list, blba, rqd->nr_ppas,
- from_cache);
-
- if (!*from_cache)
- goto end;
-
- for (i = 0; i < nr_secs; i++) {
- struct pblk_sec_meta *meta = pblk_get_meta(pblk, meta_list, i);
- sector_t lba = blba + i;
-
- if (pblk_ppa_empty(rqd->ppa_list[i])) {
- __le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
-
- meta->lba = addr_empty;
- } else if (pblk_addr_in_cache(rqd->ppa_list[i])) {
- /*
- * Try to read from write buffer. The address is later
- * checked on the write buffer to prevent retrieving
- * overwritten data.
- */
- if (!pblk_read_from_cache(pblk, bio, lba,
- rqd->ppa_list[i])) {
- if (i == 0) {
- /*
- * We didn't call with bio_advance()
- * yet, so we can just retry.
- */
- goto retry;
- } else {
- /*
- * We already call bio_advance()
- * so we cannot retry and we need
- * to quit that function in order
- * to allow caller to handle the bio
- * splitting in the current sector
- * position.
- */
- nr_secs = i;
- goto end;
- }
- }
- meta->lba = cpu_to_le64(lba);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_inc(&pblk->cache_reads);
-#endif
- }
- bio_advance(bio, PBLK_EXPOSED_PAGE_SIZE);
- }
-
-end:
- if (pblk_io_aligned(pblk, nr_secs))
- rqd->is_seq = 1;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(nr_secs, &pblk->inflight_reads);
-#endif
-
- return nr_secs;
-}
-
-
-static void pblk_read_check_seq(struct pblk *pblk, struct nvm_rq *rqd,
- sector_t blba)
-{
- void *meta_list = rqd->meta_list;
- int nr_lbas = rqd->nr_ppas;
- int i;
-
- if (!pblk_is_oob_meta_supported(pblk))
- return;
-
- for (i = 0; i < nr_lbas; i++) {
- struct pblk_sec_meta *meta = pblk_get_meta(pblk, meta_list, i);
- u64 lba = le64_to_cpu(meta->lba);
-
- if (lba == ADDR_EMPTY)
- continue;
-
- if (lba != blba + i) {
-#ifdef CONFIG_NVM_PBLK_DEBUG
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
-
- print_ppa(pblk, &ppa_list[i], "seq", i);
-#endif
- pblk_err(pblk, "corrupted read LBA (%llu/%llu)\n",
- lba, (u64)blba + i);
- WARN_ON(1);
- }
- }
-}
-
-/*
- * There can be holes in the lba list.
- */
-static void pblk_read_check_rand(struct pblk *pblk, struct nvm_rq *rqd,
- u64 *lba_list, int nr_lbas)
-{
- void *meta_lba_list = rqd->meta_list;
- int i, j;
-
- if (!pblk_is_oob_meta_supported(pblk))
- return;
-
- for (i = 0, j = 0; i < nr_lbas; i++) {
- struct pblk_sec_meta *meta = pblk_get_meta(pblk,
- meta_lba_list, j);
- u64 lba = lba_list[i];
- u64 meta_lba;
-
- if (lba == ADDR_EMPTY)
- continue;
-
- meta_lba = le64_to_cpu(meta->lba);
-
- if (lba != meta_lba) {
-#ifdef CONFIG_NVM_PBLK_DEBUG
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
-
- print_ppa(pblk, &ppa_list[j], "rnd", j);
-#endif
- pblk_err(pblk, "corrupted read LBA (%llu/%llu)\n",
- meta_lba, lba);
- WARN_ON(1);
- }
-
- j++;
- }
-
- WARN_ONCE(j != rqd->nr_ppas, "pblk: corrupted random request\n");
-}
-
-static void pblk_end_user_read(struct bio *bio, int error)
-{
- if (error && error != NVM_RSP_WARN_HIGHECC)
- bio_io_error(bio);
- else
- bio_endio(bio);
-}
-
-static void __pblk_end_io_read(struct pblk *pblk, struct nvm_rq *rqd,
- bool put_line)
-{
- struct pblk_g_ctx *r_ctx = nvm_rq_to_pdu(rqd);
- struct bio *int_bio = rqd->bio;
- unsigned long start_time = r_ctx->start_time;
-
- bio_end_io_acct(int_bio, start_time);
-
- if (rqd->error)
- pblk_log_read_err(pblk, rqd);
-
- pblk_read_check_seq(pblk, rqd, r_ctx->lba);
- bio_put(int_bio);
-
- if (put_line)
- pblk_rq_to_line_put(pblk, rqd);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(rqd->nr_ppas, &pblk->sync_reads);
- atomic_long_sub(rqd->nr_ppas, &pblk->inflight_reads);
-#endif
-
- pblk_free_rqd(pblk, rqd, PBLK_READ);
- atomic_dec(&pblk->inflight_io);
-}
-
-static void pblk_end_io_read(struct nvm_rq *rqd)
-{
- struct pblk *pblk = rqd->private;
- struct pblk_g_ctx *r_ctx = nvm_rq_to_pdu(rqd);
- struct bio *bio = (struct bio *)r_ctx->private;
-
- pblk_end_user_read(bio, rqd->error);
- __pblk_end_io_read(pblk, rqd, true);
-}
-
-static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd, struct bio *bio,
- sector_t lba, bool *from_cache)
-{
- struct pblk_sec_meta *meta = pblk_get_meta(pblk, rqd->meta_list, 0);
- struct ppa_addr ppa;
-
- pblk_lookup_l2p_seq(pblk, &ppa, lba, 1, from_cache);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_inc(&pblk->inflight_reads);
-#endif
-
-retry:
- if (pblk_ppa_empty(ppa)) {
- __le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
-
- meta->lba = addr_empty;
- return;
- }
-
- /* Try to read from write buffer. The address is later checked on the
- * write buffer to prevent retrieving overwritten data.
- */
- if (pblk_addr_in_cache(ppa)) {
- if (!pblk_read_from_cache(pblk, bio, lba, ppa)) {
- pblk_lookup_l2p_seq(pblk, &ppa, lba, 1, from_cache);
- goto retry;
- }
-
- meta->lba = cpu_to_le64(lba);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_inc(&pblk->cache_reads);
-#endif
- } else {
- rqd->ppa_addr = ppa;
- }
-}
-
-void pblk_submit_read(struct pblk *pblk, struct bio *bio)
-{
- sector_t blba = pblk_get_lba(bio);
- unsigned int nr_secs = pblk_get_secs(bio);
- bool from_cache;
- struct pblk_g_ctx *r_ctx;
- struct nvm_rq *rqd;
- struct bio *int_bio, *split_bio;
- unsigned long start_time;
-
- start_time = bio_start_io_acct(bio);
-
- rqd = pblk_alloc_rqd(pblk, PBLK_READ);
-
- rqd->opcode = NVM_OP_PREAD;
- rqd->nr_ppas = nr_secs;
- rqd->private = pblk;
- rqd->end_io = pblk_end_io_read;
-
- r_ctx = nvm_rq_to_pdu(rqd);
- r_ctx->start_time = start_time;
- r_ctx->lba = blba;
-
- if (pblk_alloc_rqd_meta(pblk, rqd)) {
- bio_io_error(bio);
- pblk_free_rqd(pblk, rqd, PBLK_READ);
- return;
- }
-
- /* Clone read bio to deal internally with:
- * -read errors when reading from drive
- * -bio_advance() calls during cache reads
- */
- int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
-
- if (nr_secs > 1)
- nr_secs = pblk_read_ppalist_rq(pblk, rqd, int_bio, blba,
- &from_cache);
- else
- pblk_read_rq(pblk, rqd, int_bio, blba, &from_cache);
-
-split_retry:
- r_ctx->private = bio; /* original bio */
- rqd->bio = int_bio; /* internal bio */
-
- if (from_cache && nr_secs == rqd->nr_ppas) {
- /* All data was read from cache, we can complete the IO. */
- pblk_end_user_read(bio, 0);
- atomic_inc(&pblk->inflight_io);
- __pblk_end_io_read(pblk, rqd, false);
- } else if (nr_secs != rqd->nr_ppas) {
- /* The read bio request could be partially filled by the write
- * buffer, but there are some holes that need to be read from
- * the drive. In order to handle this, we will use block layer
- * mechanism to split this request in to smaller ones and make
- * a chain of it.
- */
- split_bio = bio_split(bio, nr_secs * NR_PHY_IN_LOG, GFP_KERNEL,
- &pblk_bio_set);
- bio_chain(split_bio, bio);
- submit_bio_noacct(bio);
-
- /* New bio contains first N sectors of the previous one, so
- * we can continue to use existing rqd, but we need to shrink
- * the number of PPAs in it. New bio is also guaranteed that
- * it contains only either data from cache or from drive, newer
- * mix of them.
- */
- bio = split_bio;
- rqd->nr_ppas = nr_secs;
- if (rqd->nr_ppas == 1)
- rqd->ppa_addr = rqd->ppa_list[0];
-
- /* Recreate int_bio - existing might have some needed internal
- * fields modified already.
- */
- bio_put(int_bio);
- int_bio = bio_clone_fast(bio, GFP_KERNEL, &pblk_bio_set);
- goto split_retry;
- } else if (pblk_submit_io(pblk, rqd, NULL)) {
- /* Submitting IO to drive failed, let's report an error */
- rqd->error = -ENODEV;
- pblk_end_io_read(rqd);
- }
-}
-
-static int read_ppalist_rq_gc(struct pblk *pblk, struct nvm_rq *rqd,
- struct pblk_line *line, u64 *lba_list,
- u64 *paddr_list_gc, unsigned int nr_secs)
-{
- struct ppa_addr ppa_list_l2p[NVM_MAX_VLBA];
- struct ppa_addr ppa_gc;
- int valid_secs = 0;
- int i;
-
- pblk_lookup_l2p_rand(pblk, ppa_list_l2p, lba_list, nr_secs);
-
- for (i = 0; i < nr_secs; i++) {
- if (lba_list[i] == ADDR_EMPTY)
- continue;
-
- ppa_gc = addr_to_gen_ppa(pblk, paddr_list_gc[i], line->id);
- if (!pblk_ppa_comp(ppa_list_l2p[i], ppa_gc)) {
- paddr_list_gc[i] = lba_list[i] = ADDR_EMPTY;
- continue;
- }
-
- rqd->ppa_list[valid_secs++] = ppa_list_l2p[i];
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(valid_secs, &pblk->inflight_reads);
-#endif
-
- return valid_secs;
-}
-
-static int read_rq_gc(struct pblk *pblk, struct nvm_rq *rqd,
- struct pblk_line *line, sector_t lba,
- u64 paddr_gc)
-{
- struct ppa_addr ppa_l2p, ppa_gc;
- int valid_secs = 0;
-
- if (lba == ADDR_EMPTY)
- goto out;
-
- /* logic error: lba out-of-bounds */
- if (lba >= pblk->capacity) {
- WARN(1, "pblk: read lba out of bounds\n");
- goto out;
- }
-
- spin_lock(&pblk->trans_lock);
- ppa_l2p = pblk_trans_map_get(pblk, lba);
- spin_unlock(&pblk->trans_lock);
-
- ppa_gc = addr_to_gen_ppa(pblk, paddr_gc, line->id);
- if (!pblk_ppa_comp(ppa_l2p, ppa_gc))
- goto out;
-
- rqd->ppa_addr = ppa_l2p;
- valid_secs = 1;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_inc(&pblk->inflight_reads);
-#endif
-
-out:
- return valid_secs;
-}
-
-int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq)
-{
- struct nvm_rq rqd;
- int ret = NVM_IO_OK;
-
- memset(&rqd, 0, sizeof(struct nvm_rq));
-
- ret = pblk_alloc_rqd_meta(pblk, &rqd);
- if (ret)
- return ret;
-
- if (gc_rq->nr_secs > 1) {
- gc_rq->secs_to_gc = read_ppalist_rq_gc(pblk, &rqd, gc_rq->line,
- gc_rq->lba_list,
- gc_rq->paddr_list,
- gc_rq->nr_secs);
- if (gc_rq->secs_to_gc == 1)
- rqd.ppa_addr = rqd.ppa_list[0];
- } else {
- gc_rq->secs_to_gc = read_rq_gc(pblk, &rqd, gc_rq->line,
- gc_rq->lba_list[0],
- gc_rq->paddr_list[0]);
- }
-
- if (!(gc_rq->secs_to_gc))
- goto out;
-
- rqd.opcode = NVM_OP_PREAD;
- rqd.nr_ppas = gc_rq->secs_to_gc;
-
- if (pblk_submit_io_sync(pblk, &rqd, gc_rq->data)) {
- ret = -EIO;
- goto err_free_dma;
- }
-
- pblk_read_check_rand(pblk, &rqd, gc_rq->lba_list, gc_rq->nr_secs);
-
- atomic_dec(&pblk->inflight_io);
-
- if (rqd.error) {
- atomic_long_inc(&pblk->read_failed_gc);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- pblk_print_failed_rqd(pblk, &rqd, rqd.error);
-#endif
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(gc_rq->secs_to_gc, &pblk->sync_reads);
- atomic_long_add(gc_rq->secs_to_gc, &pblk->recov_gc_reads);
- atomic_long_sub(gc_rq->secs_to_gc, &pblk->inflight_reads);
-#endif
-
-out:
- pblk_free_rqd_meta(pblk, &rqd);
- return ret;
-
-err_free_dma:
- pblk_free_rqd_meta(pblk, &rqd);
- return ret;
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial: Javier Gonzalez <javier@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-recovery.c - pblk's recovery path
- *
- * The L2P recovery path is single threaded as the L2P table is updated in order
- * following the line sequence ID.
- */
-
-#include "pblk.h"
-#include "pblk-trace.h"
-
-int pblk_recov_check_emeta(struct pblk *pblk, struct line_emeta *emeta_buf)
-{
- u32 crc;
-
- crc = pblk_calc_emeta_crc(pblk, emeta_buf);
- if (le32_to_cpu(emeta_buf->crc) != crc)
- return 1;
-
- if (le32_to_cpu(emeta_buf->header.identifier) != PBLK_MAGIC)
- return 1;
-
- return 0;
-}
-
-static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_emeta *emeta = line->emeta;
- struct line_emeta *emeta_buf = emeta->buf;
- __le64 *lba_list;
- u64 data_start, data_end;
- u64 nr_valid_lbas, nr_lbas = 0;
- u64 i;
-
- lba_list = emeta_to_lbas(pblk, emeta_buf);
- if (!lba_list)
- return 1;
-
- data_start = pblk_line_smeta_start(pblk, line) + lm->smeta_sec;
- data_end = line->emeta_ssec;
- nr_valid_lbas = le64_to_cpu(emeta_buf->nr_valid_lbas);
-
- for (i = data_start; i < data_end; i++) {
- struct ppa_addr ppa;
- int pos;
-
- ppa = addr_to_gen_ppa(pblk, i, line->id);
- pos = pblk_ppa_to_pos(geo, ppa);
-
- /* Do not update bad blocks */
- if (test_bit(pos, line->blk_bitmap))
- continue;
-
- if (le64_to_cpu(lba_list[i]) == ADDR_EMPTY) {
- spin_lock(&line->lock);
- if (test_and_set_bit(i, line->invalid_bitmap))
- WARN_ONCE(1, "pblk: rec. double invalidate:\n");
- else
- le32_add_cpu(line->vsc, -1);
- spin_unlock(&line->lock);
-
- continue;
- }
-
- pblk_update_map(pblk, le64_to_cpu(lba_list[i]), ppa);
- nr_lbas++;
- }
-
- if (nr_valid_lbas != nr_lbas)
- pblk_err(pblk, "line %d - inconsistent lba list(%llu/%llu)\n",
- line->id, nr_valid_lbas, nr_lbas);
-
- line->left_msecs = 0;
-
- return 0;
-}
-
-static void pblk_update_line_wp(struct pblk *pblk, struct pblk_line *line,
- u64 written_secs)
-{
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- int i;
-
- for (i = 0; i < written_secs; i += pblk->min_write_pgs)
- __pblk_alloc_page(pblk, line, pblk->min_write_pgs);
-
- spin_lock(&l_mg->free_lock);
- if (written_secs > line->left_msecs) {
- /*
- * We have all data sectors written
- * and some emeta sectors written too.
- */
- line->left_msecs = 0;
- } else {
- /* We have only some data sectors written. */
- line->left_msecs -= written_secs;
- }
- spin_unlock(&l_mg->free_lock);
-}
-
-static u64 pblk_sec_in_open_line(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- int nr_bb = bitmap_weight(line->blk_bitmap, lm->blk_per_line);
- u64 written_secs = 0;
- int valid_chunks = 0;
- int i;
-
- for (i = 0; i < lm->blk_per_line; i++) {
- struct nvm_chk_meta *chunk = &line->chks[i];
-
- if (chunk->state & NVM_CHK_ST_OFFLINE)
- continue;
-
- written_secs += chunk->wp;
- valid_chunks++;
- }
-
- if (lm->blk_per_line - nr_bb != valid_chunks)
- pblk_err(pblk, "recovery line %d is bad\n", line->id);
-
- pblk_update_line_wp(pblk, line, written_secs - lm->smeta_sec);
-
- return written_secs;
-}
-
-struct pblk_recov_alloc {
- struct ppa_addr *ppa_list;
- void *meta_list;
- struct nvm_rq *rqd;
- void *data;
- dma_addr_t dma_ppa_list;
- dma_addr_t dma_meta_list;
-};
-
-static void pblk_recov_complete(struct kref *ref)
-{
- struct pblk_pad_rq *pad_rq = container_of(ref, struct pblk_pad_rq, ref);
-
- complete(&pad_rq->wait);
-}
-
-static void pblk_end_io_recov(struct nvm_rq *rqd)
-{
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
- struct pblk_pad_rq *pad_rq = rqd->private;
- struct pblk *pblk = pad_rq->pblk;
-
- pblk_up_chunk(pblk, ppa_list[0]);
-
- pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
-
- atomic_dec(&pblk->inflight_io);
- kref_put(&pad_rq->ref, pblk_recov_complete);
-}
-
-/* pad line using line bitmap. */
-static int pblk_recov_pad_line(struct pblk *pblk, struct pblk_line *line,
- int left_ppas)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- void *meta_list;
- struct pblk_pad_rq *pad_rq;
- struct nvm_rq *rqd;
- struct ppa_addr *ppa_list;
- void *data;
- __le64 *lba_list = emeta_to_lbas(pblk, line->emeta->buf);
- u64 w_ptr = line->cur_sec;
- int left_line_ppas, rq_ppas;
- int i, j;
- int ret = 0;
-
- spin_lock(&line->lock);
- left_line_ppas = line->left_msecs;
- spin_unlock(&line->lock);
-
- pad_rq = kmalloc(sizeof(struct pblk_pad_rq), GFP_KERNEL);
- if (!pad_rq)
- return -ENOMEM;
-
- data = vzalloc(array_size(pblk->max_write_pgs, geo->csecs));
- if (!data) {
- ret = -ENOMEM;
- goto free_rq;
- }
-
- pad_rq->pblk = pblk;
- init_completion(&pad_rq->wait);
- kref_init(&pad_rq->ref);
-
-next_pad_rq:
- rq_ppas = pblk_calc_secs(pblk, left_ppas, 0, false);
- if (rq_ppas < pblk->min_write_pgs) {
- pblk_err(pblk, "corrupted pad line %d\n", line->id);
- goto fail_complete;
- }
-
- rqd = pblk_alloc_rqd(pblk, PBLK_WRITE_INT);
-
- ret = pblk_alloc_rqd_meta(pblk, rqd);
- if (ret) {
- pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
- goto fail_complete;
- }
-
- rqd->bio = NULL;
- rqd->opcode = NVM_OP_PWRITE;
- rqd->is_seq = 1;
- rqd->nr_ppas = rq_ppas;
- rqd->end_io = pblk_end_io_recov;
- rqd->private = pad_rq;
-
- ppa_list = nvm_rq_to_ppa_list(rqd);
- meta_list = rqd->meta_list;
-
- for (i = 0; i < rqd->nr_ppas; ) {
- struct ppa_addr ppa;
- int pos;
-
- w_ptr = pblk_alloc_page(pblk, line, pblk->min_write_pgs);
- ppa = addr_to_gen_ppa(pblk, w_ptr, line->id);
- pos = pblk_ppa_to_pos(geo, ppa);
-
- while (test_bit(pos, line->blk_bitmap)) {
- w_ptr += pblk->min_write_pgs;
- ppa = addr_to_gen_ppa(pblk, w_ptr, line->id);
- pos = pblk_ppa_to_pos(geo, ppa);
- }
-
- for (j = 0; j < pblk->min_write_pgs; j++, i++, w_ptr++) {
- struct ppa_addr dev_ppa;
- struct pblk_sec_meta *meta;
- __le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
-
- dev_ppa = addr_to_gen_ppa(pblk, w_ptr, line->id);
-
- pblk_map_invalidate(pblk, dev_ppa);
- lba_list[w_ptr] = addr_empty;
- meta = pblk_get_meta(pblk, meta_list, i);
- meta->lba = addr_empty;
- ppa_list[i] = dev_ppa;
- }
- }
-
- kref_get(&pad_rq->ref);
- pblk_down_chunk(pblk, ppa_list[0]);
-
- ret = pblk_submit_io(pblk, rqd, data);
- if (ret) {
- pblk_err(pblk, "I/O submission failed: %d\n", ret);
- pblk_up_chunk(pblk, ppa_list[0]);
- kref_put(&pad_rq->ref, pblk_recov_complete);
- pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
- goto fail_complete;
- }
-
- left_line_ppas -= rq_ppas;
- left_ppas -= rq_ppas;
- if (left_ppas && left_line_ppas)
- goto next_pad_rq;
-
-fail_complete:
- kref_put(&pad_rq->ref, pblk_recov_complete);
- wait_for_completion(&pad_rq->wait);
-
- if (!pblk_line_is_full(line))
- pblk_err(pblk, "corrupted padded line: %d\n", line->id);
-
- vfree(data);
-free_rq:
- kfree(pad_rq);
- return ret;
-}
-
-static int pblk_pad_distance(struct pblk *pblk, struct pblk_line *line)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int distance = geo->mw_cunits * geo->all_luns * geo->ws_opt;
-
- return (distance > line->left_msecs) ? line->left_msecs : distance;
-}
-
-/* Return a chunk belonging to a line by stripe(write order) index */
-static struct nvm_chk_meta *pblk_get_stripe_chunk(struct pblk *pblk,
- struct pblk_line *line,
- int index)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_lun *rlun;
- struct ppa_addr ppa;
- int pos;
-
- rlun = &pblk->luns[index];
- ppa = rlun->bppa;
- pos = pblk_ppa_to_pos(geo, ppa);
-
- return &line->chks[pos];
-}
-
-static int pblk_line_wps_are_unbalanced(struct pblk *pblk,
- struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- int blk_in_line = lm->blk_per_line;
- struct nvm_chk_meta *chunk;
- u64 max_wp, min_wp;
- int i;
-
- i = find_first_zero_bit(line->blk_bitmap, blk_in_line);
-
- /* If there is one or zero good chunks in the line,
- * the write pointers can't be unbalanced.
- */
- if (i >= (blk_in_line - 1))
- return 0;
-
- chunk = pblk_get_stripe_chunk(pblk, line, i);
- max_wp = chunk->wp;
- if (max_wp > pblk->max_write_pgs)
- min_wp = max_wp - pblk->max_write_pgs;
- else
- min_wp = 0;
-
- i = find_next_zero_bit(line->blk_bitmap, blk_in_line, i + 1);
- while (i < blk_in_line) {
- chunk = pblk_get_stripe_chunk(pblk, line, i);
- if (chunk->wp > max_wp || chunk->wp < min_wp)
- return 1;
-
- i = find_next_zero_bit(line->blk_bitmap, blk_in_line, i + 1);
- }
-
- return 0;
-}
-
-static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
- struct pblk_recov_alloc p)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct pblk_line_meta *lm = &pblk->lm;
- struct nvm_geo *geo = &dev->geo;
- struct ppa_addr *ppa_list;
- void *meta_list;
- struct nvm_rq *rqd;
- void *data;
- dma_addr_t dma_ppa_list, dma_meta_list;
- __le64 *lba_list;
- u64 paddr = pblk_line_smeta_start(pblk, line) + lm->smeta_sec;
- bool padded = false;
- int rq_ppas;
- int i, j;
- int ret;
- u64 left_ppas = pblk_sec_in_open_line(pblk, line) - lm->smeta_sec;
-
- if (pblk_line_wps_are_unbalanced(pblk, line))
- pblk_warn(pblk, "recovering unbalanced line (%d)\n", line->id);
-
- ppa_list = p.ppa_list;
- meta_list = p.meta_list;
- rqd = p.rqd;
- data = p.data;
- dma_ppa_list = p.dma_ppa_list;
- dma_meta_list = p.dma_meta_list;
-
- lba_list = emeta_to_lbas(pblk, line->emeta->buf);
-
-next_rq:
- memset(rqd, 0, pblk_g_rq_size);
-
- rq_ppas = pblk_calc_secs(pblk, left_ppas, 0, false);
- if (!rq_ppas)
- rq_ppas = pblk->min_write_pgs;
-
-retry_rq:
- rqd->bio = NULL;
- rqd->opcode = NVM_OP_PREAD;
- rqd->meta_list = meta_list;
- rqd->nr_ppas = rq_ppas;
- rqd->ppa_list = ppa_list;
- rqd->dma_ppa_list = dma_ppa_list;
- rqd->dma_meta_list = dma_meta_list;
- ppa_list = nvm_rq_to_ppa_list(rqd);
-
- if (pblk_io_aligned(pblk, rq_ppas))
- rqd->is_seq = 1;
-
- for (i = 0; i < rqd->nr_ppas; ) {
- struct ppa_addr ppa;
- int pos;
-
- ppa = addr_to_gen_ppa(pblk, paddr, line->id);
- pos = pblk_ppa_to_pos(geo, ppa);
-
- while (test_bit(pos, line->blk_bitmap)) {
- paddr += pblk->min_write_pgs;
- ppa = addr_to_gen_ppa(pblk, paddr, line->id);
- pos = pblk_ppa_to_pos(geo, ppa);
- }
-
- for (j = 0; j < pblk->min_write_pgs; j++, i++)
- ppa_list[i] =
- addr_to_gen_ppa(pblk, paddr + j, line->id);
- }
-
- ret = pblk_submit_io_sync(pblk, rqd, data);
- if (ret) {
- pblk_err(pblk, "I/O submission failed: %d\n", ret);
- return ret;
- }
-
- atomic_dec(&pblk->inflight_io);
-
- /* If a read fails, do a best effort by padding the line and retrying */
- if (rqd->error && rqd->error != NVM_RSP_WARN_HIGHECC) {
- int pad_distance, ret;
-
- if (padded) {
- pblk_log_read_err(pblk, rqd);
- return -EINTR;
- }
-
- pad_distance = pblk_pad_distance(pblk, line);
- ret = pblk_recov_pad_line(pblk, line, pad_distance);
- if (ret) {
- return ret;
- }
-
- padded = true;
- goto retry_rq;
- }
-
- pblk_get_packed_meta(pblk, rqd);
-
- for (i = 0; i < rqd->nr_ppas; i++) {
- struct pblk_sec_meta *meta = pblk_get_meta(pblk, meta_list, i);
- u64 lba = le64_to_cpu(meta->lba);
-
- lba_list[paddr++] = cpu_to_le64(lba);
-
- if (lba == ADDR_EMPTY || lba >= pblk->capacity)
- continue;
-
- line->nr_valid_lbas++;
- pblk_update_map(pblk, lba, ppa_list[i]);
- }
-
- left_ppas -= rq_ppas;
- if (left_ppas > 0)
- goto next_rq;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- WARN_ON(padded && !pblk_line_is_full(line));
-#endif
-
- return 0;
-}
-
-/* Scan line for lbas on out of bound area */
-static int pblk_recov_l2p_from_oob(struct pblk *pblk, struct pblk_line *line)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct nvm_rq *rqd;
- struct ppa_addr *ppa_list;
- void *meta_list;
- struct pblk_recov_alloc p;
- void *data;
- dma_addr_t dma_ppa_list, dma_meta_list;
- int ret = 0;
-
- meta_list = nvm_dev_dma_alloc(dev->parent, GFP_KERNEL, &dma_meta_list);
- if (!meta_list)
- return -ENOMEM;
-
- ppa_list = (void *)(meta_list) + pblk_dma_meta_size(pblk);
- dma_ppa_list = dma_meta_list + pblk_dma_meta_size(pblk);
-
- data = kcalloc(pblk->max_write_pgs, geo->csecs, GFP_KERNEL);
- if (!data) {
- ret = -ENOMEM;
- goto free_meta_list;
- }
-
- rqd = mempool_alloc(&pblk->r_rq_pool, GFP_KERNEL);
- memset(rqd, 0, pblk_g_rq_size);
-
- p.ppa_list = ppa_list;
- p.meta_list = meta_list;
- p.rqd = rqd;
- p.data = data;
- p.dma_ppa_list = dma_ppa_list;
- p.dma_meta_list = dma_meta_list;
-
- ret = pblk_recov_scan_oob(pblk, line, p);
- if (ret) {
- pblk_err(pblk, "could not recover L2P form OOB\n");
- goto out;
- }
-
- if (pblk_line_is_full(line))
- pblk_line_recov_close(pblk, line);
-
-out:
- mempool_free(rqd, &pblk->r_rq_pool);
- kfree(data);
-free_meta_list:
- nvm_dev_dma_free(dev->parent, meta_list, dma_meta_list);
-
- return ret;
-}
-
-/* Insert lines ordered by sequence number (seq_num) on list */
-static void pblk_recov_line_add_ordered(struct list_head *head,
- struct pblk_line *line)
-{
- struct pblk_line *t = NULL;
-
- list_for_each_entry(t, head, list)
- if (t->seq_nr > line->seq_nr)
- break;
-
- __list_add(&line->list, t->list.prev, &t->list);
-}
-
-static u64 pblk_line_emeta_start(struct pblk *pblk, struct pblk_line *line)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- unsigned int emeta_secs;
- u64 emeta_start;
- struct ppa_addr ppa;
- int pos;
-
- emeta_secs = lm->emeta_sec[0];
- emeta_start = lm->sec_per_line;
-
- while (emeta_secs) {
- emeta_start--;
- ppa = addr_to_gen_ppa(pblk, emeta_start, line->id);
- pos = pblk_ppa_to_pos(geo, ppa);
- if (!test_bit(pos, line->blk_bitmap))
- emeta_secs--;
- }
-
- return emeta_start;
-}
-
-static int pblk_recov_check_line_version(struct pblk *pblk,
- struct line_emeta *emeta)
-{
- struct line_header *header = &emeta->header;
-
- if (header->version_major != EMETA_VERSION_MAJOR) {
- pblk_err(pblk, "line major version mismatch: %d, expected: %d\n",
- header->version_major, EMETA_VERSION_MAJOR);
- return 1;
- }
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- if (header->version_minor > EMETA_VERSION_MINOR)
- pblk_info(pblk, "newer line minor version found: %d\n",
- header->version_minor);
-#endif
-
- return 0;
-}
-
-static void pblk_recov_wa_counters(struct pblk *pblk,
- struct line_emeta *emeta)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct line_header *header = &emeta->header;
- struct wa_counters *wa = emeta_to_wa(lm, emeta);
-
- /* WA counters were introduced in emeta version 0.2 */
- if (header->version_major > 0 || header->version_minor >= 2) {
- u64 user = le64_to_cpu(wa->user);
- u64 pad = le64_to_cpu(wa->pad);
- u64 gc = le64_to_cpu(wa->gc);
-
- atomic64_set(&pblk->user_wa, user);
- atomic64_set(&pblk->pad_wa, pad);
- atomic64_set(&pblk->gc_wa, gc);
-
- pblk->user_rst_wa = user;
- pblk->pad_rst_wa = pad;
- pblk->gc_rst_wa = gc;
- }
-}
-
-static int pblk_line_was_written(struct pblk_line *line,
- struct pblk *pblk)
-{
-
- struct pblk_line_meta *lm = &pblk->lm;
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct nvm_chk_meta *chunk;
- struct ppa_addr bppa;
- int smeta_blk;
-
- if (line->state == PBLK_LINESTATE_BAD)
- return 0;
-
- smeta_blk = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line);
- if (smeta_blk >= lm->blk_per_line)
- return 0;
-
- bppa = pblk->luns[smeta_blk].bppa;
- chunk = &line->chks[pblk_ppa_to_pos(geo, bppa)];
-
- if (chunk->state & NVM_CHK_ST_CLOSED ||
- (chunk->state & NVM_CHK_ST_OPEN
- && chunk->wp >= lm->smeta_sec))
- return 1;
-
- return 0;
-}
-
-static bool pblk_line_is_open(struct pblk *pblk, struct pblk_line *line)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- int i;
-
- for (i = 0; i < lm->blk_per_line; i++)
- if (line->chks[i].state & NVM_CHK_ST_OPEN)
- return true;
-
- return false;
-}
-
-struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *line, *tline, *data_line = NULL;
- struct pblk_smeta *smeta;
- struct pblk_emeta *emeta;
- struct line_smeta *smeta_buf;
- int found_lines = 0, recovered_lines = 0, open_lines = 0;
- int is_next = 0;
- int meta_line;
- int i, valid_uuid = 0;
- LIST_HEAD(recov_list);
-
- /* TODO: Implement FTL snapshot */
-
- /* Scan recovery - takes place when FTL snapshot fails */
- spin_lock(&l_mg->free_lock);
- meta_line = find_first_zero_bit(&l_mg->meta_bitmap, PBLK_DATA_LINES);
- set_bit(meta_line, &l_mg->meta_bitmap);
- smeta = l_mg->sline_meta[meta_line];
- emeta = l_mg->eline_meta[meta_line];
- smeta_buf = (struct line_smeta *)smeta;
- spin_unlock(&l_mg->free_lock);
-
- /* Order data lines using their sequence number */
- for (i = 0; i < l_mg->nr_lines; i++) {
- u32 crc;
-
- line = &pblk->lines[i];
-
- memset(smeta, 0, lm->smeta_len);
- line->smeta = smeta;
- line->lun_bitmap = ((void *)(smeta_buf)) +
- sizeof(struct line_smeta);
-
- if (!pblk_line_was_written(line, pblk))
- continue;
-
- /* Lines that cannot be read are assumed as not written here */
- if (pblk_line_smeta_read(pblk, line))
- continue;
-
- crc = pblk_calc_smeta_crc(pblk, smeta_buf);
- if (le32_to_cpu(smeta_buf->crc) != crc)
- continue;
-
- if (le32_to_cpu(smeta_buf->header.identifier) != PBLK_MAGIC)
- continue;
-
- if (smeta_buf->header.version_major != SMETA_VERSION_MAJOR) {
- pblk_err(pblk, "found incompatible line version %u\n",
- smeta_buf->header.version_major);
- return ERR_PTR(-EINVAL);
- }
-
- /* The first valid instance uuid is used for initialization */
- if (!valid_uuid) {
- import_guid(&pblk->instance_uuid, smeta_buf->header.uuid);
- valid_uuid = 1;
- }
-
- if (!guid_equal(&pblk->instance_uuid,
- (guid_t *)&smeta_buf->header.uuid)) {
- pblk_debug(pblk, "ignore line %u due to uuid mismatch\n",
- i);
- continue;
- }
-
- /* Update line metadata */
- spin_lock(&line->lock);
- line->id = le32_to_cpu(smeta_buf->header.id);
- line->type = le16_to_cpu(smeta_buf->header.type);
- line->seq_nr = le64_to_cpu(smeta_buf->seq_nr);
- spin_unlock(&line->lock);
-
- /* Update general metadata */
- spin_lock(&l_mg->free_lock);
- if (line->seq_nr >= l_mg->d_seq_nr)
- l_mg->d_seq_nr = line->seq_nr + 1;
- l_mg->nr_free_lines--;
- spin_unlock(&l_mg->free_lock);
-
- if (pblk_line_recov_alloc(pblk, line))
- goto out;
-
- pblk_recov_line_add_ordered(&recov_list, line);
- found_lines++;
- pblk_debug(pblk, "recovering data line %d, seq:%llu\n",
- line->id, smeta_buf->seq_nr);
- }
-
- if (!found_lines) {
- guid_gen(&pblk->instance_uuid);
-
- spin_lock(&l_mg->free_lock);
- WARN_ON_ONCE(!test_and_clear_bit(meta_line,
- &l_mg->meta_bitmap));
- spin_unlock(&l_mg->free_lock);
-
- goto out;
- }
-
- /* Verify closed blocks and recover this portion of L2P table*/
- list_for_each_entry_safe(line, tline, &recov_list, list) {
- recovered_lines++;
-
- line->emeta_ssec = pblk_line_emeta_start(pblk, line);
- line->emeta = emeta;
- memset(line->emeta->buf, 0, lm->emeta_len[0]);
-
- if (pblk_line_is_open(pblk, line)) {
- pblk_recov_l2p_from_oob(pblk, line);
- goto next;
- }
-
- if (pblk_line_emeta_read(pblk, line, line->emeta->buf)) {
- pblk_recov_l2p_from_oob(pblk, line);
- goto next;
- }
-
- if (pblk_recov_check_emeta(pblk, line->emeta->buf)) {
- pblk_recov_l2p_from_oob(pblk, line);
- goto next;
- }
-
- if (pblk_recov_check_line_version(pblk, line->emeta->buf))
- return ERR_PTR(-EINVAL);
-
- pblk_recov_wa_counters(pblk, line->emeta->buf);
-
- if (pblk_recov_l2p_from_emeta(pblk, line))
- pblk_recov_l2p_from_oob(pblk, line);
-
-next:
- if (pblk_line_is_full(line)) {
- struct list_head *move_list;
-
- spin_lock(&line->lock);
- line->state = PBLK_LINESTATE_CLOSED;
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
- move_list = pblk_line_gc_list(pblk, line);
- spin_unlock(&line->lock);
-
- spin_lock(&l_mg->gc_lock);
- list_move_tail(&line->list, move_list);
- spin_unlock(&l_mg->gc_lock);
-
- mempool_free(line->map_bitmap, l_mg->bitmap_pool);
- line->map_bitmap = NULL;
- line->smeta = NULL;
- line->emeta = NULL;
- } else {
- spin_lock(&line->lock);
- line->state = PBLK_LINESTATE_OPEN;
- spin_unlock(&line->lock);
-
- line->emeta->mem = 0;
- atomic_set(&line->emeta->sync, 0);
-
- trace_pblk_line_state(pblk_disk_name(pblk), line->id,
- line->state);
-
- data_line = line;
- line->meta_line = meta_line;
-
- open_lines++;
- }
- }
-
- if (!open_lines) {
- spin_lock(&l_mg->free_lock);
- WARN_ON_ONCE(!test_and_clear_bit(meta_line,
- &l_mg->meta_bitmap));
- spin_unlock(&l_mg->free_lock);
- } else {
- spin_lock(&l_mg->free_lock);
- l_mg->data_line = data_line;
- /* Allocate next line for preparation */
- l_mg->data_next = pblk_line_get(pblk);
- if (l_mg->data_next) {
- l_mg->data_next->seq_nr = l_mg->d_seq_nr++;
- l_mg->data_next->type = PBLK_LINETYPE_DATA;
- is_next = 1;
- }
- spin_unlock(&l_mg->free_lock);
- }
-
- if (is_next)
- pblk_line_erase(pblk, l_mg->data_next);
-
-out:
- if (found_lines != recovered_lines)
- pblk_err(pblk, "failed to recover all found lines %d/%d\n",
- found_lines, recovered_lines);
-
- return data_line;
-}
-
-/*
- * Pad current line
- */
-int pblk_recov_pad(struct pblk *pblk)
-{
- struct pblk_line *line;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- int left_msecs;
- int ret = 0;
-
- spin_lock(&l_mg->free_lock);
- line = l_mg->data_line;
- left_msecs = line->left_msecs;
- spin_unlock(&l_mg->free_lock);
-
- ret = pblk_recov_pad_line(pblk, line, left_msecs);
- if (ret) {
- pblk_err(pblk, "tear down padding failed (%d)\n", ret);
- return ret;
- }
-
- pblk_line_close_meta(pblk, line);
- return ret;
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-rl.c - pblk's rate limiter for user I/O
- *
- */
-
-#include "pblk.h"
-
-static void pblk_rl_kick_u_timer(struct pblk_rl *rl)
-{
- mod_timer(&rl->u_timer, jiffies + msecs_to_jiffies(5000));
-}
-
-int pblk_rl_is_limit(struct pblk_rl *rl)
-{
- int rb_space;
-
- rb_space = atomic_read(&rl->rb_space);
-
- return (rb_space == 0);
-}
-
-int pblk_rl_user_may_insert(struct pblk_rl *rl, int nr_entries)
-{
- int rb_user_cnt = atomic_read(&rl->rb_user_cnt);
- int rb_space = atomic_read(&rl->rb_space);
-
- if (unlikely(rb_space >= 0) && (rb_space - nr_entries < 0))
- return NVM_IO_ERR;
-
- if (rb_user_cnt >= rl->rb_user_max)
- return NVM_IO_REQUEUE;
-
- return NVM_IO_OK;
-}
-
-void pblk_rl_inserted(struct pblk_rl *rl, int nr_entries)
-{
- int rb_space = atomic_read(&rl->rb_space);
-
- if (unlikely(rb_space >= 0))
- atomic_sub(nr_entries, &rl->rb_space);
-}
-
-int pblk_rl_gc_may_insert(struct pblk_rl *rl, int nr_entries)
-{
- int rb_gc_cnt = atomic_read(&rl->rb_gc_cnt);
- int rb_user_active;
-
- /* If there is no user I/O let GC take over space on the write buffer */
- rb_user_active = READ_ONCE(rl->rb_user_active);
- return (!(rb_gc_cnt >= rl->rb_gc_max && rb_user_active));
-}
-
-void pblk_rl_user_in(struct pblk_rl *rl, int nr_entries)
-{
- atomic_add(nr_entries, &rl->rb_user_cnt);
-
- /* Release user I/O state. Protect from GC */
- smp_store_release(&rl->rb_user_active, 1);
- pblk_rl_kick_u_timer(rl);
-}
-
-void pblk_rl_werr_line_in(struct pblk_rl *rl)
-{
- atomic_inc(&rl->werr_lines);
-}
-
-void pblk_rl_werr_line_out(struct pblk_rl *rl)
-{
- atomic_dec(&rl->werr_lines);
-}
-
-void pblk_rl_gc_in(struct pblk_rl *rl, int nr_entries)
-{
- atomic_add(nr_entries, &rl->rb_gc_cnt);
-}
-
-void pblk_rl_out(struct pblk_rl *rl, int nr_user, int nr_gc)
-{
- atomic_sub(nr_user, &rl->rb_user_cnt);
- atomic_sub(nr_gc, &rl->rb_gc_cnt);
-}
-
-unsigned long pblk_rl_nr_free_blks(struct pblk_rl *rl)
-{
- return atomic_read(&rl->free_blocks);
-}
-
-unsigned long pblk_rl_nr_user_free_blks(struct pblk_rl *rl)
-{
- return atomic_read(&rl->free_user_blocks);
-}
-
-static void __pblk_rl_update_rates(struct pblk_rl *rl,
- unsigned long free_blocks)
-{
- struct pblk *pblk = container_of(rl, struct pblk, rl);
- int max = rl->rb_budget;
- int werr_gc_needed = atomic_read(&rl->werr_lines);
-
- if (free_blocks >= rl->high) {
- if (werr_gc_needed) {
- /* Allocate a small budget for recovering
- * lines with write errors
- */
- rl->rb_gc_max = 1 << rl->rb_windows_pw;
- rl->rb_user_max = max - rl->rb_gc_max;
- rl->rb_state = PBLK_RL_WERR;
- } else {
- rl->rb_user_max = max;
- rl->rb_gc_max = 0;
- rl->rb_state = PBLK_RL_OFF;
- }
- } else if (free_blocks < rl->high) {
- int shift = rl->high_pw - rl->rb_windows_pw;
- int user_windows = free_blocks >> shift;
- int user_max = user_windows << ilog2(NVM_MAX_VLBA);
-
- rl->rb_user_max = user_max;
- rl->rb_gc_max = max - user_max;
-
- if (free_blocks <= rl->rsv_blocks) {
- rl->rb_user_max = 0;
- rl->rb_gc_max = max;
- }
-
- /* In the worst case, we will need to GC lines in the low list
- * (high valid sector count). If there are lines to GC on high
- * or mid lists, these will be prioritized
- */
- rl->rb_state = PBLK_RL_LOW;
- }
-
- if (rl->rb_state != PBLK_RL_OFF)
- pblk_gc_should_start(pblk);
- else
- pblk_gc_should_stop(pblk);
-}
-
-void pblk_rl_update_rates(struct pblk_rl *rl)
-{
- __pblk_rl_update_rates(rl, pblk_rl_nr_user_free_blks(rl));
-}
-
-void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line)
-{
- int blk_in_line = atomic_read(&line->blk_in_line);
- int free_blocks;
-
- atomic_add(blk_in_line, &rl->free_blocks);
- free_blocks = atomic_add_return(blk_in_line, &rl->free_user_blocks);
-
- __pblk_rl_update_rates(rl, free_blocks);
-}
-
-void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line,
- bool used)
-{
- int blk_in_line = atomic_read(&line->blk_in_line);
- int free_blocks;
-
- atomic_sub(blk_in_line, &rl->free_blocks);
-
- if (used)
- free_blocks = atomic_sub_return(blk_in_line,
- &rl->free_user_blocks);
- else
- free_blocks = atomic_read(&rl->free_user_blocks);
-
- __pblk_rl_update_rates(rl, free_blocks);
-}
-
-int pblk_rl_high_thrs(struct pblk_rl *rl)
-{
- return rl->high;
-}
-
-int pblk_rl_max_io(struct pblk_rl *rl)
-{
- return rl->rb_max_io;
-}
-
-static void pblk_rl_u_timer(struct timer_list *t)
-{
- struct pblk_rl *rl = from_timer(rl, t, u_timer);
-
- /* Release user I/O state. Protect from GC */
- smp_store_release(&rl->rb_user_active, 0);
-}
-
-void pblk_rl_free(struct pblk_rl *rl)
-{
- del_timer(&rl->u_timer);
-}
-
-void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
-{
- struct pblk *pblk = container_of(rl, struct pblk, rl);
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- int sec_meta, blk_meta;
- unsigned int rb_windows;
-
- /* Consider sectors used for metadata */
- sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
- blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
-
- rl->high = pblk->op_blks - blk_meta - lm->blk_per_line;
- rl->high_pw = get_count_order(rl->high);
-
- rl->rsv_blocks = pblk_get_min_chks(pblk);
-
- /* This will always be a power-of-2 */
- rb_windows = budget / NVM_MAX_VLBA;
- rl->rb_windows_pw = get_count_order(rb_windows);
-
- /* To start with, all buffer is available to user I/O writers */
- rl->rb_budget = budget;
- rl->rb_user_max = budget;
- rl->rb_gc_max = 0;
- rl->rb_state = PBLK_RL_HIGH;
-
- /* Maximize I/O size and ansure that back threshold is respected */
- if (threshold)
- rl->rb_max_io = budget - pblk->min_write_pgs_data - threshold;
- else
- rl->rb_max_io = budget - pblk->min_write_pgs_data - 1;
-
- atomic_set(&rl->rb_user_cnt, 0);
- atomic_set(&rl->rb_gc_cnt, 0);
- atomic_set(&rl->rb_space, -1);
- atomic_set(&rl->werr_lines, 0);
-
- timer_setup(&rl->u_timer, pblk_rl_u_timer, 0);
-
- rl->rb_user_active = 0;
- rl->rb_gc_active = 0;
-}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * Implementation of a physical block-device target for Open-channel SSDs.
- *
- * pblk-sysfs.c - pblk's sysfs
- *
- */
-
-#include "pblk.h"
-
-static ssize_t pblk_sysfs_luns_show(struct pblk *pblk, char *page)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_lun *rlun;
- ssize_t sz = 0;
- int i;
-
- for (i = 0; i < geo->all_luns; i++) {
- int active = 1;
-
- rlun = &pblk->luns[i];
- if (!down_trylock(&rlun->wr_sem)) {
- active = 0;
- up(&rlun->wr_sem);
- }
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "pblk: pos:%d, ch:%d, lun:%d - %d\n",
- i,
- rlun->bppa.a.ch,
- rlun->bppa.a.lun,
- active);
- }
-
- return sz;
-}
-
-static ssize_t pblk_sysfs_rate_limiter(struct pblk *pblk, char *page)
-{
- int free_blocks, free_user_blocks, total_blocks;
- int rb_user_max, rb_user_cnt;
- int rb_gc_max, rb_gc_cnt, rb_budget, rb_state;
-
- free_blocks = pblk_rl_nr_free_blks(&pblk->rl);
- free_user_blocks = pblk_rl_nr_user_free_blks(&pblk->rl);
- rb_user_max = pblk->rl.rb_user_max;
- rb_user_cnt = atomic_read(&pblk->rl.rb_user_cnt);
- rb_gc_max = pblk->rl.rb_gc_max;
- rb_gc_cnt = atomic_read(&pblk->rl.rb_gc_cnt);
- rb_budget = pblk->rl.rb_budget;
- rb_state = pblk->rl.rb_state;
-
- total_blocks = pblk->rl.total_blocks;
-
- return snprintf(page, PAGE_SIZE,
- "u:%u/%u,gc:%u/%u(%u)(stop:<%u,full:>%u,free:%d/%d/%d)-%d\n",
- rb_user_cnt,
- rb_user_max,
- rb_gc_cnt,
- rb_gc_max,
- rb_state,
- rb_budget,
- pblk->rl.high,
- free_blocks,
- free_user_blocks,
- total_blocks,
- READ_ONCE(pblk->rl.rb_user_active));
-}
-
-static ssize_t pblk_sysfs_gc_state_show(struct pblk *pblk, char *page)
-{
- int gc_enabled, gc_active;
-
- pblk_gc_sysfs_state_show(pblk, &gc_enabled, &gc_active);
- return snprintf(page, PAGE_SIZE, "gc_enabled=%d, gc_active=%d\n",
- gc_enabled, gc_active);
-}
-
-static ssize_t pblk_sysfs_stats(struct pblk *pblk, char *page)
-{
- ssize_t sz;
-
- sz = snprintf(page, PAGE_SIZE,
- "read_failed=%lu, read_high_ecc=%lu, read_empty=%lu, read_failed_gc=%lu, write_failed=%lu, erase_failed=%lu\n",
- atomic_long_read(&pblk->read_failed),
- atomic_long_read(&pblk->read_high_ecc),
- atomic_long_read(&pblk->read_empty),
- atomic_long_read(&pblk->read_failed_gc),
- atomic_long_read(&pblk->write_failed),
- atomic_long_read(&pblk->erase_failed));
-
- return sz;
-}
-
-static ssize_t pblk_sysfs_write_buffer(struct pblk *pblk, char *page)
-{
- return pblk_rb_sysfs(&pblk->rwb, page);
-}
-
-static ssize_t pblk_sysfs_ppaf(struct pblk *pblk, char *page)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- ssize_t sz = 0;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = (struct nvm_addrf_12 *)&pblk->addrf;
- struct nvm_addrf_12 *gppaf = (struct nvm_addrf_12 *)&geo->addrf;
-
- sz = scnprintf(page, PAGE_SIZE,
- "g:(b:%d)blk:%d/%d,pg:%d/%d,lun:%d/%d,ch:%d/%d,pl:%d/%d,sec:%d/%d\n",
- pblk->addrf_len,
- ppaf->blk_offset, ppaf->blk_len,
- ppaf->pg_offset, ppaf->pg_len,
- ppaf->lun_offset, ppaf->lun_len,
- ppaf->ch_offset, ppaf->ch_len,
- ppaf->pln_offset, ppaf->pln_len,
- ppaf->sec_offset, ppaf->sec_len);
-
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "d:blk:%d/%d,pg:%d/%d,lun:%d/%d,ch:%d/%d,pl:%d/%d,sec:%d/%d\n",
- gppaf->blk_offset, gppaf->blk_len,
- gppaf->pg_offset, gppaf->pg_len,
- gppaf->lun_offset, gppaf->lun_len,
- gppaf->ch_offset, gppaf->ch_len,
- gppaf->pln_offset, gppaf->pln_len,
- gppaf->sec_offset, gppaf->sec_len);
- } else {
- struct nvm_addrf *ppaf = &pblk->addrf;
- struct nvm_addrf *gppaf = &geo->addrf;
-
- sz = scnprintf(page, PAGE_SIZE,
- "pblk:(s:%d)ch:%d/%d,lun:%d/%d,chk:%d/%d/sec:%d/%d\n",
- pblk->addrf_len,
- ppaf->ch_offset, ppaf->ch_len,
- ppaf->lun_offset, ppaf->lun_len,
- ppaf->chk_offset, ppaf->chk_len,
- ppaf->sec_offset, ppaf->sec_len);
-
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "device:ch:%d/%d,lun:%d/%d,chk:%d/%d,sec:%d/%d\n",
- gppaf->ch_offset, gppaf->ch_len,
- gppaf->lun_offset, gppaf->lun_len,
- gppaf->chk_offset, gppaf->chk_len,
- gppaf->sec_offset, gppaf->sec_len);
- }
-
- return sz;
-}
-
-static ssize_t pblk_sysfs_lines(struct pblk *pblk, char *page)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *line;
- ssize_t sz = 0;
- int nr_free_lines;
- int cur_data, cur_log;
- int free_line_cnt = 0, closed_line_cnt = 0, emeta_line_cnt = 0;
- int d_line_cnt = 0, l_line_cnt = 0;
- int gc_full = 0, gc_high = 0, gc_mid = 0, gc_low = 0, gc_empty = 0;
- int gc_werr = 0;
-
- int bad = 0, cor = 0;
- int msecs = 0, cur_sec = 0, vsc = 0, sec_in_line = 0;
- int map_weight = 0, meta_weight = 0;
-
- spin_lock(&l_mg->free_lock);
- cur_data = (l_mg->data_line) ? l_mg->data_line->id : -1;
- cur_log = (l_mg->log_line) ? l_mg->log_line->id : -1;
- nr_free_lines = l_mg->nr_free_lines;
-
- list_for_each_entry(line, &l_mg->free_list, list)
- free_line_cnt++;
- spin_unlock(&l_mg->free_lock);
-
- spin_lock(&l_mg->close_lock);
- list_for_each_entry(line, &l_mg->emeta_list, list)
- emeta_line_cnt++;
- spin_unlock(&l_mg->close_lock);
-
- spin_lock(&l_mg->gc_lock);
- list_for_each_entry(line, &l_mg->gc_full_list, list) {
- if (line->type == PBLK_LINETYPE_DATA)
- d_line_cnt++;
- else if (line->type == PBLK_LINETYPE_LOG)
- l_line_cnt++;
- closed_line_cnt++;
- gc_full++;
- }
-
- list_for_each_entry(line, &l_mg->gc_high_list, list) {
- if (line->type == PBLK_LINETYPE_DATA)
- d_line_cnt++;
- else if (line->type == PBLK_LINETYPE_LOG)
- l_line_cnt++;
- closed_line_cnt++;
- gc_high++;
- }
-
- list_for_each_entry(line, &l_mg->gc_mid_list, list) {
- if (line->type == PBLK_LINETYPE_DATA)
- d_line_cnt++;
- else if (line->type == PBLK_LINETYPE_LOG)
- l_line_cnt++;
- closed_line_cnt++;
- gc_mid++;
- }
-
- list_for_each_entry(line, &l_mg->gc_low_list, list) {
- if (line->type == PBLK_LINETYPE_DATA)
- d_line_cnt++;
- else if (line->type == PBLK_LINETYPE_LOG)
- l_line_cnt++;
- closed_line_cnt++;
- gc_low++;
- }
-
- list_for_each_entry(line, &l_mg->gc_empty_list, list) {
- if (line->type == PBLK_LINETYPE_DATA)
- d_line_cnt++;
- else if (line->type == PBLK_LINETYPE_LOG)
- l_line_cnt++;
- closed_line_cnt++;
- gc_empty++;
- }
-
- list_for_each_entry(line, &l_mg->gc_werr_list, list) {
- if (line->type == PBLK_LINETYPE_DATA)
- d_line_cnt++;
- else if (line->type == PBLK_LINETYPE_LOG)
- l_line_cnt++;
- closed_line_cnt++;
- gc_werr++;
- }
-
- list_for_each_entry(line, &l_mg->bad_list, list)
- bad++;
- list_for_each_entry(line, &l_mg->corrupt_list, list)
- cor++;
- spin_unlock(&l_mg->gc_lock);
-
- spin_lock(&l_mg->free_lock);
- if (l_mg->data_line) {
- cur_sec = l_mg->data_line->cur_sec;
- msecs = l_mg->data_line->left_msecs;
- vsc = le32_to_cpu(*l_mg->data_line->vsc);
- sec_in_line = l_mg->data_line->sec_in_line;
- meta_weight = bitmap_weight(&l_mg->meta_bitmap,
- PBLK_DATA_LINES);
-
- spin_lock(&l_mg->data_line->lock);
- if (l_mg->data_line->map_bitmap)
- map_weight = bitmap_weight(l_mg->data_line->map_bitmap,
- lm->sec_per_line);
- else
- map_weight = 0;
- spin_unlock(&l_mg->data_line->lock);
- }
- spin_unlock(&l_mg->free_lock);
-
- if (nr_free_lines != free_line_cnt)
- pblk_err(pblk, "corrupted free line list:%d/%d\n",
- nr_free_lines, free_line_cnt);
-
- sz = scnprintf(page, PAGE_SIZE - sz,
- "line: nluns:%d, nblks:%d, nsecs:%d\n",
- geo->all_luns, lm->blk_per_line, lm->sec_per_line);
-
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "lines:d:%d,l:%d-f:%d,m:%d/%d,c:%d,b:%d,co:%d(d:%d,l:%d)t:%d\n",
- cur_data, cur_log,
- nr_free_lines,
- emeta_line_cnt, meta_weight,
- closed_line_cnt,
- bad, cor,
- d_line_cnt, l_line_cnt,
- l_mg->nr_lines);
-
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "GC: full:%d, high:%d, mid:%d, low:%d, empty:%d, werr: %d, queue:%d\n",
- gc_full, gc_high, gc_mid, gc_low, gc_empty, gc_werr,
- atomic_read(&pblk->gc.read_inflight_gc));
-
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "data (%d) cur:%d, left:%d, vsc:%d, s:%d, map:%d/%d (%d)\n",
- cur_data, cur_sec, msecs, vsc, sec_in_line,
- map_weight, lm->sec_per_line,
- atomic_read(&pblk->inflight_io));
-
- return sz;
-}
-
-static ssize_t pblk_sysfs_lines_info(struct pblk *pblk, char *page)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_meta *lm = &pblk->lm;
- ssize_t sz = 0;
-
- sz = scnprintf(page, PAGE_SIZE - sz,
- "smeta - len:%d, secs:%d\n",
- lm->smeta_len, lm->smeta_sec);
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "emeta - len:%d, sec:%d, bb_start:%d\n",
- lm->emeta_len[0], lm->emeta_sec[0],
- lm->emeta_bb);
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "bitmap lengths: sec:%d, blk:%d, lun:%d\n",
- lm->sec_bitmap_len,
- lm->blk_bitmap_len,
- lm->lun_bitmap_len);
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "blk_line:%d, sec_line:%d, sec_blk:%d\n",
- lm->blk_per_line,
- lm->sec_per_line,
- geo->clba);
-
- return sz;
-}
-
-static ssize_t pblk_sysfs_get_sec_per_write(struct pblk *pblk, char *page)
-{
- return snprintf(page, PAGE_SIZE, "%d\n", pblk->sec_per_write);
-}
-
-static ssize_t pblk_get_write_amp(u64 user, u64 gc, u64 pad,
- char *page)
-{
- int sz;
-
- sz = scnprintf(page, PAGE_SIZE,
- "user:%lld gc:%lld pad:%lld WA:",
- user, gc, pad);
-
- if (!user) {
- sz += scnprintf(page + sz, PAGE_SIZE - sz, "NaN\n");
- } else {
- u64 wa_int;
- u32 wa_frac;
-
- wa_int = (user + gc + pad) * 100000;
- wa_int = div64_u64(wa_int, user);
- wa_int = div_u64_rem(wa_int, 100000, &wa_frac);
-
- sz += scnprintf(page + sz, PAGE_SIZE - sz, "%llu.%05u\n",
- wa_int, wa_frac);
- }
-
- return sz;
-}
-
-static ssize_t pblk_sysfs_get_write_amp_mileage(struct pblk *pblk, char *page)
-{
- return pblk_get_write_amp(atomic64_read(&pblk->user_wa),
- atomic64_read(&pblk->gc_wa), atomic64_read(&pblk->pad_wa),
- page);
-}
-
-static ssize_t pblk_sysfs_get_write_amp_trip(struct pblk *pblk, char *page)
-{
- return pblk_get_write_amp(
- atomic64_read(&pblk->user_wa) - pblk->user_rst_wa,
- atomic64_read(&pblk->gc_wa) - pblk->gc_rst_wa,
- atomic64_read(&pblk->pad_wa) - pblk->pad_rst_wa, page);
-}
-
-static long long bucket_percentage(unsigned long long bucket,
- unsigned long long total)
-{
- int p = bucket * 100;
-
- p = div_u64(p, total);
-
- return p;
-}
-
-static ssize_t pblk_sysfs_get_padding_dist(struct pblk *pblk, char *page)
-{
- int sz = 0;
- unsigned long long total;
- unsigned long long total_buckets = 0;
- int buckets = pblk->min_write_pgs - 1;
- int i;
-
- total = atomic64_read(&pblk->nr_flush) - pblk->nr_flush_rst;
- if (!total) {
- for (i = 0; i < (buckets + 1); i++)
- sz += scnprintf(page + sz, PAGE_SIZE - sz,
- "%d:0 ", i);
- sz += scnprintf(page + sz, PAGE_SIZE - sz, "\n");
-
- return sz;
- }
-
- for (i = 0; i < buckets; i++)
- total_buckets += atomic64_read(&pblk->pad_dist[i]);
-
- sz += scnprintf(page + sz, PAGE_SIZE - sz, "0:%lld%% ",
- bucket_percentage(total - total_buckets, total));
-
- for (i = 0; i < buckets; i++) {
- unsigned long long p;
-
- p = bucket_percentage(atomic64_read(&pblk->pad_dist[i]),
- total);
- sz += scnprintf(page + sz, PAGE_SIZE - sz, "%d:%lld%% ",
- i + 1, p);
- }
- sz += scnprintf(page + sz, PAGE_SIZE - sz, "\n");
-
- return sz;
-}
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
-static ssize_t pblk_sysfs_stats_debug(struct pblk *pblk, char *page)
-{
- return snprintf(page, PAGE_SIZE,
- "%lu\t%lu\t%ld\t%llu\t%ld\t%lu\t%lu\t%lu\t%lu\t%lu\t%lu\t%lu\t%lu\n",
- atomic_long_read(&pblk->inflight_writes),
- atomic_long_read(&pblk->inflight_reads),
- atomic_long_read(&pblk->req_writes),
- (u64)atomic64_read(&pblk->nr_flush),
- atomic_long_read(&pblk->padded_writes),
- atomic_long_read(&pblk->padded_wb),
- atomic_long_read(&pblk->sub_writes),
- atomic_long_read(&pblk->sync_writes),
- atomic_long_read(&pblk->recov_writes),
- atomic_long_read(&pblk->recov_gc_writes),
- atomic_long_read(&pblk->recov_gc_reads),
- atomic_long_read(&pblk->cache_reads),
- atomic_long_read(&pblk->sync_reads));
-}
-#endif
-
-static ssize_t pblk_sysfs_gc_force(struct pblk *pblk, const char *page,
- size_t len)
-{
- size_t c_len;
- int force;
-
- c_len = strcspn(page, "\n");
- if (c_len >= len)
- return -EINVAL;
-
- if (kstrtouint(page, 0, &force))
- return -EINVAL;
-
- pblk_gc_sysfs_force(pblk, force);
-
- return len;
-}
-
-static ssize_t pblk_sysfs_set_sec_per_write(struct pblk *pblk,
- const char *page, size_t len)
-{
- size_t c_len;
- int sec_per_write;
-
- c_len = strcspn(page, "\n");
- if (c_len >= len)
- return -EINVAL;
-
- if (kstrtouint(page, 0, &sec_per_write))
- return -EINVAL;
-
- if (!pblk_is_oob_meta_supported(pblk)) {
- /* For packed metadata case it is
- * not allowed to change sec_per_write.
- */
- return -EINVAL;
- }
-
- if (sec_per_write < pblk->min_write_pgs
- || sec_per_write > pblk->max_write_pgs
- || sec_per_write % pblk->min_write_pgs != 0)
- return -EINVAL;
-
- pblk_set_sec_per_write(pblk, sec_per_write);
-
- return len;
-}
-
-static ssize_t pblk_sysfs_set_write_amp_trip(struct pblk *pblk,
- const char *page, size_t len)
-{
- size_t c_len;
- int reset_value;
-
- c_len = strcspn(page, "\n");
- if (c_len >= len)
- return -EINVAL;
-
- if (kstrtouint(page, 0, &reset_value))
- return -EINVAL;
-
- if (reset_value != 0)
- return -EINVAL;
-
- pblk->user_rst_wa = atomic64_read(&pblk->user_wa);
- pblk->pad_rst_wa = atomic64_read(&pblk->pad_wa);
- pblk->gc_rst_wa = atomic64_read(&pblk->gc_wa);
-
- return len;
-}
-
-
-static ssize_t pblk_sysfs_set_padding_dist(struct pblk *pblk,
- const char *page, size_t len)
-{
- size_t c_len;
- int reset_value;
- int buckets = pblk->min_write_pgs - 1;
- int i;
-
- c_len = strcspn(page, "\n");
- if (c_len >= len)
- return -EINVAL;
-
- if (kstrtouint(page, 0, &reset_value))
- return -EINVAL;
-
- if (reset_value != 0)
- return -EINVAL;
-
- for (i = 0; i < buckets; i++)
- atomic64_set(&pblk->pad_dist[i], 0);
-
- pblk->nr_flush_rst = atomic64_read(&pblk->nr_flush);
-
- return len;
-}
-
-static struct attribute sys_write_luns = {
- .name = "write_luns",
- .mode = 0444,
-};
-
-static struct attribute sys_rate_limiter_attr = {
- .name = "rate_limiter",
- .mode = 0444,
-};
-
-static struct attribute sys_gc_state = {
- .name = "gc_state",
- .mode = 0444,
-};
-
-static struct attribute sys_errors_attr = {
- .name = "errors",
- .mode = 0444,
-};
-
-static struct attribute sys_rb_attr = {
- .name = "write_buffer",
- .mode = 0444,
-};
-
-static struct attribute sys_stats_ppaf_attr = {
- .name = "ppa_format",
- .mode = 0444,
-};
-
-static struct attribute sys_lines_attr = {
- .name = "lines",
- .mode = 0444,
-};
-
-static struct attribute sys_lines_info_attr = {
- .name = "lines_info",
- .mode = 0444,
-};
-
-static struct attribute sys_gc_force = {
- .name = "gc_force",
- .mode = 0200,
-};
-
-static struct attribute sys_max_sec_per_write = {
- .name = "max_sec_per_write",
- .mode = 0644,
-};
-
-static struct attribute sys_write_amp_mileage = {
- .name = "write_amp_mileage",
- .mode = 0444,
-};
-
-static struct attribute sys_write_amp_trip = {
- .name = "write_amp_trip",
- .mode = 0644,
-};
-
-static struct attribute sys_padding_dist = {
- .name = "padding_dist",
- .mode = 0644,
-};
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
-static struct attribute sys_stats_debug_attr = {
- .name = "stats",
- .mode = 0444,
-};
-#endif
-
-static struct attribute *pblk_attrs[] = {
- &sys_write_luns,
- &sys_rate_limiter_attr,
- &sys_errors_attr,
- &sys_gc_state,
- &sys_gc_force,
- &sys_max_sec_per_write,
- &sys_rb_attr,
- &sys_stats_ppaf_attr,
- &sys_lines_attr,
- &sys_lines_info_attr,
- &sys_write_amp_mileage,
- &sys_write_amp_trip,
- &sys_padding_dist,
-#ifdef CONFIG_NVM_PBLK_DEBUG
- &sys_stats_debug_attr,
-#endif
- NULL,
-};
-
-static ssize_t pblk_sysfs_show(struct kobject *kobj, struct attribute *attr,
- char *buf)
-{
- struct pblk *pblk = container_of(kobj, struct pblk, kobj);
-
- if (strcmp(attr->name, "rate_limiter") == 0)
- return pblk_sysfs_rate_limiter(pblk, buf);
- else if (strcmp(attr->name, "write_luns") == 0)
- return pblk_sysfs_luns_show(pblk, buf);
- else if (strcmp(attr->name, "gc_state") == 0)
- return pblk_sysfs_gc_state_show(pblk, buf);
- else if (strcmp(attr->name, "errors") == 0)
- return pblk_sysfs_stats(pblk, buf);
- else if (strcmp(attr->name, "write_buffer") == 0)
- return pblk_sysfs_write_buffer(pblk, buf);
- else if (strcmp(attr->name, "ppa_format") == 0)
- return pblk_sysfs_ppaf(pblk, buf);
- else if (strcmp(attr->name, "lines") == 0)
- return pblk_sysfs_lines(pblk, buf);
- else if (strcmp(attr->name, "lines_info") == 0)
- return pblk_sysfs_lines_info(pblk, buf);
- else if (strcmp(attr->name, "max_sec_per_write") == 0)
- return pblk_sysfs_get_sec_per_write(pblk, buf);
- else if (strcmp(attr->name, "write_amp_mileage") == 0)
- return pblk_sysfs_get_write_amp_mileage(pblk, buf);
- else if (strcmp(attr->name, "write_amp_trip") == 0)
- return pblk_sysfs_get_write_amp_trip(pblk, buf);
- else if (strcmp(attr->name, "padding_dist") == 0)
- return pblk_sysfs_get_padding_dist(pblk, buf);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- else if (strcmp(attr->name, "stats") == 0)
- return pblk_sysfs_stats_debug(pblk, buf);
-#endif
- return 0;
-}
-
-static ssize_t pblk_sysfs_store(struct kobject *kobj, struct attribute *attr,
- const char *buf, size_t len)
-{
- struct pblk *pblk = container_of(kobj, struct pblk, kobj);
-
- if (strcmp(attr->name, "gc_force") == 0)
- return pblk_sysfs_gc_force(pblk, buf, len);
- else if (strcmp(attr->name, "max_sec_per_write") == 0)
- return pblk_sysfs_set_sec_per_write(pblk, buf, len);
- else if (strcmp(attr->name, "write_amp_trip") == 0)
- return pblk_sysfs_set_write_amp_trip(pblk, buf, len);
- else if (strcmp(attr->name, "padding_dist") == 0)
- return pblk_sysfs_set_padding_dist(pblk, buf, len);
- return 0;
-}
-
-static const struct sysfs_ops pblk_sysfs_ops = {
- .show = pblk_sysfs_show,
- .store = pblk_sysfs_store,
-};
-
-static struct kobj_type pblk_ktype = {
- .sysfs_ops = &pblk_sysfs_ops,
- .default_attrs = pblk_attrs,
-};
-
-int pblk_sysfs_init(struct gendisk *tdisk)
-{
- struct pblk *pblk = tdisk->private_data;
- struct device *parent_dev = disk_to_dev(pblk->disk);
- int ret;
-
- ret = kobject_init_and_add(&pblk->kobj, &pblk_ktype,
- kobject_get(&parent_dev->kobj),
- "%s", "pblk");
- if (ret) {
- pblk_err(pblk, "could not register\n");
- return ret;
- }
-
- kobject_uevent(&pblk->kobj, KOBJ_ADD);
- return 0;
-}
-
-void pblk_sysfs_exit(struct gendisk *tdisk)
-{
- struct pblk *pblk = tdisk->private_data;
-
- kobject_uevent(&pblk->kobj, KOBJ_REMOVE);
- kobject_del(&pblk->kobj);
- kobject_put(&pblk->kobj);
-}
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-#undef TRACE_SYSTEM
-#define TRACE_SYSTEM pblk
-
-#if !defined(_TRACE_PBLK_H) || defined(TRACE_HEADER_MULTI_READ)
-#define _TRACE_PBLK_H
-
-#include <linux/tracepoint.h>
-
-struct ppa_addr;
-
-#define show_chunk_flags(state) __print_flags(state, "", \
- { NVM_CHK_ST_FREE, "FREE", }, \
- { NVM_CHK_ST_CLOSED, "CLOSED", }, \
- { NVM_CHK_ST_OPEN, "OPEN", }, \
- { NVM_CHK_ST_OFFLINE, "OFFLINE", })
-
-#define show_line_state(state) __print_symbolic(state, \
- { PBLK_LINESTATE_NEW, "NEW", }, \
- { PBLK_LINESTATE_FREE, "FREE", }, \
- { PBLK_LINESTATE_OPEN, "OPEN", }, \
- { PBLK_LINESTATE_CLOSED, "CLOSED", }, \
- { PBLK_LINESTATE_GC, "GC", }, \
- { PBLK_LINESTATE_BAD, "BAD", }, \
- { PBLK_LINESTATE_CORRUPT, "CORRUPT" })
-
-
-#define show_pblk_state(state) __print_symbolic(state, \
- { PBLK_STATE_RUNNING, "RUNNING", }, \
- { PBLK_STATE_STOPPING, "STOPPING", }, \
- { PBLK_STATE_RECOVERING, "RECOVERING", }, \
- { PBLK_STATE_STOPPED, "STOPPED" })
-
-#define show_chunk_erase_state(state) __print_symbolic(state, \
- { PBLK_CHUNK_RESET_START, "START", }, \
- { PBLK_CHUNK_RESET_DONE, "OK", }, \
- { PBLK_CHUNK_RESET_FAILED, "FAILED" })
-
-
-TRACE_EVENT(pblk_chunk_reset,
-
- TP_PROTO(const char *name, struct ppa_addr *ppa, int state),
-
- TP_ARGS(name, ppa, state),
-
- TP_STRUCT__entry(
- __string(name, name)
- __field(u64, ppa)
- __field(int, state)
- ),
-
- TP_fast_assign(
- __assign_str(name, name);
- __entry->ppa = ppa->ppa;
- __entry->state = state;
- ),
-
- TP_printk("dev=%s grp=%llu pu=%llu chk=%llu state=%s", __get_str(name),
- (u64)(((struct ppa_addr *)(&__entry->ppa))->m.grp),
- (u64)(((struct ppa_addr *)(&__entry->ppa))->m.pu),
- (u64)(((struct ppa_addr *)(&__entry->ppa))->m.chk),
- show_chunk_erase_state((int)__entry->state))
-
-);
-
-TRACE_EVENT(pblk_chunk_state,
-
- TP_PROTO(const char *name, struct ppa_addr *ppa, int state),
-
- TP_ARGS(name, ppa, state),
-
- TP_STRUCT__entry(
- __string(name, name)
- __field(u64, ppa)
- __field(int, state)
- ),
-
- TP_fast_assign(
- __assign_str(name, name);
- __entry->ppa = ppa->ppa;
- __entry->state = state;
- ),
-
- TP_printk("dev=%s grp=%llu pu=%llu chk=%llu state=%s", __get_str(name),
- (u64)(((struct ppa_addr *)(&__entry->ppa))->m.grp),
- (u64)(((struct ppa_addr *)(&__entry->ppa))->m.pu),
- (u64)(((struct ppa_addr *)(&__entry->ppa))->m.chk),
- show_chunk_flags((int)__entry->state))
-
-);
-
-TRACE_EVENT(pblk_line_state,
-
- TP_PROTO(const char *name, int line, int state),
-
- TP_ARGS(name, line, state),
-
- TP_STRUCT__entry(
- __string(name, name)
- __field(int, line)
- __field(int, state)
- ),
-
- TP_fast_assign(
- __assign_str(name, name);
- __entry->line = line;
- __entry->state = state;
- ),
-
- TP_printk("dev=%s line=%d state=%s", __get_str(name),
- (int)__entry->line,
- show_line_state((int)__entry->state))
-
-);
-
-TRACE_EVENT(pblk_state,
-
- TP_PROTO(const char *name, int state),
-
- TP_ARGS(name, state),
-
- TP_STRUCT__entry(
- __string(name, name)
- __field(int, state)
- ),
-
- TP_fast_assign(
- __assign_str(name, name);
- __entry->state = state;
- ),
-
- TP_printk("dev=%s state=%s", __get_str(name),
- show_pblk_state((int)__entry->state))
-
-);
-
-#endif /* !defined(_TRACE_PBLK_H) || defined(TRACE_HEADER_MULTI_READ) */
-
-/* This part must be outside protection */
-
-#undef TRACE_INCLUDE_PATH
-#define TRACE_INCLUDE_PATH ../../drivers/lightnvm
-#undef TRACE_INCLUDE_FILE
-#define TRACE_INCLUDE_FILE pblk-trace
-#include <trace/define_trace.h>
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Javier Gonzalez <javier@cnexlabs.com>
- * Matias Bjorling <matias@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * pblk-write.c - pblk's write path from write buffer to media
- */
-
-#include "pblk.h"
-#include "pblk-trace.h"
-
-static unsigned long pblk_end_w_bio(struct pblk *pblk, struct nvm_rq *rqd,
- struct pblk_c_ctx *c_ctx)
-{
- struct bio *original_bio;
- struct pblk_rb *rwb = &pblk->rwb;
- unsigned long ret;
- int i;
-
- for (i = 0; i < c_ctx->nr_valid; i++) {
- struct pblk_w_ctx *w_ctx;
- int pos = c_ctx->sentry + i;
- int flags;
-
- w_ctx = pblk_rb_w_ctx(rwb, pos);
- flags = READ_ONCE(w_ctx->flags);
-
- if (flags & PBLK_FLUSH_ENTRY) {
- flags &= ~PBLK_FLUSH_ENTRY;
- /* Release flags on context. Protect from writes */
- smp_store_release(&w_ctx->flags, flags);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_dec(&rwb->inflight_flush_point);
-#endif
- }
-
- while ((original_bio = bio_list_pop(&w_ctx->bios)))
- bio_endio(original_bio);
- }
-
- if (c_ctx->nr_padded)
- pblk_bio_free_pages(pblk, rqd->bio, c_ctx->nr_valid,
- c_ctx->nr_padded);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(rqd->nr_ppas, &pblk->sync_writes);
-#endif
-
- ret = pblk_rb_sync_advance(&pblk->rwb, c_ctx->nr_valid);
-
- bio_put(rqd->bio);
- pblk_free_rqd(pblk, rqd, PBLK_WRITE);
-
- return ret;
-}
-
-static unsigned long pblk_end_queued_w_bio(struct pblk *pblk,
- struct nvm_rq *rqd,
- struct pblk_c_ctx *c_ctx)
-{
- list_del(&c_ctx->list);
- return pblk_end_w_bio(pblk, rqd, c_ctx);
-}
-
-static void pblk_complete_write(struct pblk *pblk, struct nvm_rq *rqd,
- struct pblk_c_ctx *c_ctx)
-{
- struct pblk_c_ctx *c, *r;
- unsigned long flags;
- unsigned long pos;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_sub(c_ctx->nr_valid, &pblk->inflight_writes);
-#endif
- pblk_up_rq(pblk, c_ctx->lun_bitmap);
-
- pos = pblk_rb_sync_init(&pblk->rwb, &flags);
- if (pos == c_ctx->sentry) {
- pos = pblk_end_w_bio(pblk, rqd, c_ctx);
-
-retry:
- list_for_each_entry_safe(c, r, &pblk->compl_list, list) {
- rqd = nvm_rq_from_c_ctx(c);
- if (c->sentry == pos) {
- pos = pblk_end_queued_w_bio(pblk, rqd, c);
- goto retry;
- }
- }
- } else {
- WARN_ON(nvm_rq_from_c_ctx(c_ctx) != rqd);
- list_add_tail(&c_ctx->list, &pblk->compl_list);
- }
- pblk_rb_sync_end(&pblk->rwb, &flags);
-}
-
-/* Map remaining sectors in chunk, starting from ppa */
-static void pblk_map_remaining(struct pblk *pblk, struct ppa_addr *ppa,
- int rqd_ppas)
-{
- struct pblk_line *line;
- struct ppa_addr map_ppa = *ppa;
- __le64 addr_empty = cpu_to_le64(ADDR_EMPTY);
- __le64 *lba_list;
- u64 paddr;
- int done = 0;
- int n = 0;
-
- line = pblk_ppa_to_line(pblk, *ppa);
- lba_list = emeta_to_lbas(pblk, line->emeta->buf);
-
- spin_lock(&line->lock);
-
- while (!done) {
- paddr = pblk_dev_ppa_to_line_addr(pblk, map_ppa);
-
- if (!test_and_set_bit(paddr, line->map_bitmap))
- line->left_msecs--;
-
- if (n < rqd_ppas && lba_list[paddr] != addr_empty)
- line->nr_valid_lbas--;
-
- lba_list[paddr] = addr_empty;
-
- if (!test_and_set_bit(paddr, line->invalid_bitmap))
- le32_add_cpu(line->vsc, -1);
-
- done = nvm_next_ppa_in_chk(pblk->dev, &map_ppa);
-
- n++;
- }
-
- line->w_err_gc->has_write_err = 1;
- spin_unlock(&line->lock);
-}
-
-static void pblk_prepare_resubmit(struct pblk *pblk, unsigned int sentry,
- unsigned int nr_entries)
-{
- struct pblk_rb *rb = &pblk->rwb;
- struct pblk_rb_entry *entry;
- struct pblk_line *line;
- struct pblk_w_ctx *w_ctx;
- struct ppa_addr ppa_l2p;
- int flags;
- unsigned int i;
-
- spin_lock(&pblk->trans_lock);
- for (i = 0; i < nr_entries; i++) {
- entry = &rb->entries[pblk_rb_ptr_wrap(rb, sentry, i)];
- w_ctx = &entry->w_ctx;
-
- /* Check if the lba has been overwritten */
- if (w_ctx->lba != ADDR_EMPTY) {
- ppa_l2p = pblk_trans_map_get(pblk, w_ctx->lba);
- if (!pblk_ppa_comp(ppa_l2p, entry->cacheline))
- w_ctx->lba = ADDR_EMPTY;
- }
-
- /* Mark up the entry as submittable again */
- flags = READ_ONCE(w_ctx->flags);
- flags |= PBLK_WRITTEN_DATA;
- /* Release flags on write context. Protect from writes */
- smp_store_release(&w_ctx->flags, flags);
-
- /* Decrease the reference count to the line as we will
- * re-map these entries
- */
- line = pblk_ppa_to_line(pblk, w_ctx->ppa);
- atomic_dec(&line->sec_to_update);
- kref_put(&line->ref, pblk_line_put);
- }
- spin_unlock(&pblk->trans_lock);
-}
-
-static void pblk_queue_resubmit(struct pblk *pblk, struct pblk_c_ctx *c_ctx)
-{
- struct pblk_c_ctx *r_ctx;
-
- r_ctx = kzalloc(sizeof(struct pblk_c_ctx), GFP_KERNEL);
- if (!r_ctx)
- return;
-
- r_ctx->lun_bitmap = NULL;
- r_ctx->sentry = c_ctx->sentry;
- r_ctx->nr_valid = c_ctx->nr_valid;
- r_ctx->nr_padded = c_ctx->nr_padded;
-
- spin_lock(&pblk->resubmit_lock);
- list_add_tail(&r_ctx->list, &pblk->resubmit_list);
- spin_unlock(&pblk->resubmit_lock);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(c_ctx->nr_valid, &pblk->recov_writes);
-#endif
-}
-
-static void pblk_submit_rec(struct work_struct *work)
-{
- struct pblk_rec_ctx *recovery =
- container_of(work, struct pblk_rec_ctx, ws_rec);
- struct pblk *pblk = recovery->pblk;
- struct nvm_rq *rqd = recovery->rqd;
- struct pblk_c_ctx *c_ctx = nvm_rq_to_pdu(rqd);
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
-
- pblk_log_write_err(pblk, rqd);
-
- pblk_map_remaining(pblk, ppa_list, rqd->nr_ppas);
- pblk_queue_resubmit(pblk, c_ctx);
-
- pblk_up_rq(pblk, c_ctx->lun_bitmap);
- if (c_ctx->nr_padded)
- pblk_bio_free_pages(pblk, rqd->bio, c_ctx->nr_valid,
- c_ctx->nr_padded);
- bio_put(rqd->bio);
- pblk_free_rqd(pblk, rqd, PBLK_WRITE);
- mempool_free(recovery, &pblk->rec_pool);
-
- atomic_dec(&pblk->inflight_io);
- pblk_write_kick(pblk);
-}
-
-
-static void pblk_end_w_fail(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct pblk_rec_ctx *recovery;
-
- recovery = mempool_alloc(&pblk->rec_pool, GFP_ATOMIC);
- if (!recovery) {
- pblk_err(pblk, "could not allocate recovery work\n");
- return;
- }
-
- recovery->pblk = pblk;
- recovery->rqd = rqd;
-
- INIT_WORK(&recovery->ws_rec, pblk_submit_rec);
- queue_work(pblk->close_wq, &recovery->ws_rec);
-}
-
-static void pblk_end_io_write(struct nvm_rq *rqd)
-{
- struct pblk *pblk = rqd->private;
- struct pblk_c_ctx *c_ctx = nvm_rq_to_pdu(rqd);
-
- if (rqd->error) {
- pblk_end_w_fail(pblk, rqd);
- return;
- } else {
- if (trace_pblk_chunk_state_enabled())
- pblk_check_chunk_state_update(pblk, rqd);
-#ifdef CONFIG_NVM_PBLK_DEBUG
- WARN_ONCE(rqd->bio->bi_status, "pblk: corrupted write error\n");
-#endif
- }
-
- pblk_complete_write(pblk, rqd, c_ctx);
- atomic_dec(&pblk->inflight_io);
-}
-
-static void pblk_end_io_write_meta(struct nvm_rq *rqd)
-{
- struct pblk *pblk = rqd->private;
- struct pblk_g_ctx *m_ctx = nvm_rq_to_pdu(rqd);
- struct pblk_line *line = m_ctx->private;
- struct pblk_emeta *emeta = line->emeta;
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
- int sync;
-
- pblk_up_chunk(pblk, ppa_list[0]);
-
- if (rqd->error) {
- pblk_log_write_err(pblk, rqd);
- pblk_err(pblk, "metadata I/O failed. Line %d\n", line->id);
- line->w_err_gc->has_write_err = 1;
- } else {
- if (trace_pblk_chunk_state_enabled())
- pblk_check_chunk_state_update(pblk, rqd);
- }
-
- sync = atomic_add_return(rqd->nr_ppas, &emeta->sync);
- if (sync == emeta->nr_entries)
- pblk_gen_run_ws(pblk, line, NULL, pblk_line_close_ws,
- GFP_ATOMIC, pblk->close_wq);
-
- pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
-
- atomic_dec(&pblk->inflight_io);
-}
-
-static int pblk_alloc_w_rq(struct pblk *pblk, struct nvm_rq *rqd,
- unsigned int nr_secs, nvm_end_io_fn(*end_io))
-{
- /* Setup write request */
- rqd->opcode = NVM_OP_PWRITE;
- rqd->nr_ppas = nr_secs;
- rqd->is_seq = 1;
- rqd->private = pblk;
- rqd->end_io = end_io;
-
- return pblk_alloc_rqd_meta(pblk, rqd);
-}
-
-static int pblk_setup_w_rq(struct pblk *pblk, struct nvm_rq *rqd,
- struct ppa_addr *erase_ppa)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line *e_line = pblk_line_get_erase(pblk);
- struct pblk_c_ctx *c_ctx = nvm_rq_to_pdu(rqd);
- unsigned int valid = c_ctx->nr_valid;
- unsigned int padded = c_ctx->nr_padded;
- unsigned int nr_secs = valid + padded;
- unsigned long *lun_bitmap;
- int ret;
-
- lun_bitmap = kzalloc(lm->lun_bitmap_len, GFP_KERNEL);
- if (!lun_bitmap)
- return -ENOMEM;
- c_ctx->lun_bitmap = lun_bitmap;
-
- ret = pblk_alloc_w_rq(pblk, rqd, nr_secs, pblk_end_io_write);
- if (ret) {
- kfree(lun_bitmap);
- return ret;
- }
-
- if (likely(!e_line || !atomic_read(&e_line->left_eblks)))
- ret = pblk_map_rq(pblk, rqd, c_ctx->sentry, lun_bitmap,
- valid, 0);
- else
- ret = pblk_map_erase_rq(pblk, rqd, c_ctx->sentry, lun_bitmap,
- valid, erase_ppa);
-
- return ret;
-}
-
-static int pblk_calc_secs_to_sync(struct pblk *pblk, unsigned int secs_avail,
- unsigned int secs_to_flush)
-{
- int secs_to_sync;
-
- secs_to_sync = pblk_calc_secs(pblk, secs_avail, secs_to_flush, true);
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- if ((!secs_to_sync && secs_to_flush)
- || (secs_to_sync < 0)
- || (secs_to_sync > secs_avail && !secs_to_flush)) {
- pblk_err(pblk, "bad sector calculation (a:%d,s:%d,f:%d)\n",
- secs_avail, secs_to_sync, secs_to_flush);
- }
-#endif
-
- return secs_to_sync;
-}
-
-int pblk_submit_meta_io(struct pblk *pblk, struct pblk_line *meta_line)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_emeta *emeta = meta_line->emeta;
- struct ppa_addr *ppa_list;
- struct pblk_g_ctx *m_ctx;
- struct nvm_rq *rqd;
- void *data;
- u64 paddr;
- int rq_ppas = pblk->min_write_pgs;
- int id = meta_line->id;
- int rq_len;
- int i, j;
- int ret;
-
- rqd = pblk_alloc_rqd(pblk, PBLK_WRITE_INT);
-
- m_ctx = nvm_rq_to_pdu(rqd);
- m_ctx->private = meta_line;
-
- rq_len = rq_ppas * geo->csecs;
- data = ((void *)emeta->buf) + emeta->mem;
-
- ret = pblk_alloc_w_rq(pblk, rqd, rq_ppas, pblk_end_io_write_meta);
- if (ret)
- goto fail_free_rqd;
-
- ppa_list = nvm_rq_to_ppa_list(rqd);
- for (i = 0; i < rqd->nr_ppas; ) {
- spin_lock(&meta_line->lock);
- paddr = __pblk_alloc_page(pblk, meta_line, rq_ppas);
- spin_unlock(&meta_line->lock);
- for (j = 0; j < rq_ppas; j++, i++, paddr++)
- ppa_list[i] = addr_to_gen_ppa(pblk, paddr, id);
- }
-
- spin_lock(&l_mg->close_lock);
- emeta->mem += rq_len;
- if (emeta->mem >= lm->emeta_len[0])
- list_del(&meta_line->list);
- spin_unlock(&l_mg->close_lock);
-
- pblk_down_chunk(pblk, ppa_list[0]);
-
- ret = pblk_submit_io(pblk, rqd, data);
- if (ret) {
- pblk_err(pblk, "emeta I/O submission failed: %d\n", ret);
- goto fail_rollback;
- }
-
- return NVM_IO_OK;
-
-fail_rollback:
- pblk_up_chunk(pblk, ppa_list[0]);
- spin_lock(&l_mg->close_lock);
- pblk_dealloc_page(pblk, meta_line, rq_ppas);
- list_add(&meta_line->list, &meta_line->list);
- spin_unlock(&l_mg->close_lock);
-fail_free_rqd:
- pblk_free_rqd(pblk, rqd, PBLK_WRITE_INT);
- return ret;
-}
-
-static inline bool pblk_valid_meta_ppa(struct pblk *pblk,
- struct pblk_line *meta_line,
- struct nvm_rq *data_rqd)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_c_ctx *data_c_ctx = nvm_rq_to_pdu(data_rqd);
- struct pblk_line *data_line = pblk_line_get_data(pblk);
- struct ppa_addr ppa, ppa_opt;
- u64 paddr;
- int pos_opt;
-
- /* Schedule a metadata I/O that is half the distance from the data I/O
- * with regards to the number of LUNs forming the pblk instance. This
- * balances LUN conflicts across every I/O.
- *
- * When the LUN configuration changes (e.g., due to GC), this distance
- * can align, which would result on metadata and data I/Os colliding. In
- * this case, modify the distance to not be optimal, but move the
- * optimal in the right direction.
- */
- paddr = pblk_lookup_page(pblk, meta_line);
- ppa = addr_to_gen_ppa(pblk, paddr, 0);
- ppa_opt = addr_to_gen_ppa(pblk, paddr + data_line->meta_distance, 0);
- pos_opt = pblk_ppa_to_pos(geo, ppa_opt);
-
- if (test_bit(pos_opt, data_c_ctx->lun_bitmap) ||
- test_bit(pos_opt, data_line->blk_bitmap))
- return true;
-
- if (unlikely(pblk_ppa_comp(ppa_opt, ppa)))
- data_line->meta_distance--;
-
- return false;
-}
-
-static struct pblk_line *pblk_should_submit_meta_io(struct pblk *pblk,
- struct nvm_rq *data_rqd)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- struct pblk_line_mgmt *l_mg = &pblk->l_mg;
- struct pblk_line *meta_line;
-
- spin_lock(&l_mg->close_lock);
- if (list_empty(&l_mg->emeta_list)) {
- spin_unlock(&l_mg->close_lock);
- return NULL;
- }
- meta_line = list_first_entry(&l_mg->emeta_list, struct pblk_line, list);
- if (meta_line->emeta->mem >= lm->emeta_len[0]) {
- spin_unlock(&l_mg->close_lock);
- return NULL;
- }
- spin_unlock(&l_mg->close_lock);
-
- if (!pblk_valid_meta_ppa(pblk, meta_line, data_rqd))
- return NULL;
-
- return meta_line;
-}
-
-static int pblk_submit_io_set(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct ppa_addr erase_ppa;
- struct pblk_line *meta_line;
- int err;
-
- pblk_ppa_set_empty(&erase_ppa);
-
- /* Assign lbas to ppas and populate request structure */
- err = pblk_setup_w_rq(pblk, rqd, &erase_ppa);
- if (err) {
- pblk_err(pblk, "could not setup write request: %d\n", err);
- return NVM_IO_ERR;
- }
-
- meta_line = pblk_should_submit_meta_io(pblk, rqd);
-
- /* Submit data write for current data line */
- err = pblk_submit_io(pblk, rqd, NULL);
- if (err) {
- pblk_err(pblk, "data I/O submission failed: %d\n", err);
- return NVM_IO_ERR;
- }
-
- if (!pblk_ppa_empty(erase_ppa)) {
- /* Submit erase for next data line */
- if (pblk_blk_erase_async(pblk, erase_ppa)) {
- struct pblk_line *e_line = pblk_line_get_erase(pblk);
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- int bit;
-
- atomic_inc(&e_line->left_eblks);
- bit = pblk_ppa_to_pos(geo, erase_ppa);
- WARN_ON(!test_and_clear_bit(bit, e_line->erase_bitmap));
- }
- }
-
- if (meta_line) {
- /* Submit metadata write for previous data line */
- err = pblk_submit_meta_io(pblk, meta_line);
- if (err) {
- pblk_err(pblk, "metadata I/O submission failed: %d",
- err);
- return NVM_IO_ERR;
- }
- }
-
- return NVM_IO_OK;
-}
-
-static void pblk_free_write_rqd(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct pblk_c_ctx *c_ctx = nvm_rq_to_pdu(rqd);
- struct bio *bio = rqd->bio;
-
- if (c_ctx->nr_padded)
- pblk_bio_free_pages(pblk, bio, c_ctx->nr_valid,
- c_ctx->nr_padded);
-}
-
-static int pblk_submit_write(struct pblk *pblk, int *secs_left)
-{
- struct bio *bio;
- struct nvm_rq *rqd;
- unsigned int secs_avail, secs_to_sync, secs_to_com;
- unsigned int secs_to_flush, packed_meta_pgs;
- unsigned long pos;
- unsigned int resubmit;
-
- *secs_left = 0;
-
- spin_lock(&pblk->resubmit_lock);
- resubmit = !list_empty(&pblk->resubmit_list);
- spin_unlock(&pblk->resubmit_lock);
-
- /* Resubmit failed writes first */
- if (resubmit) {
- struct pblk_c_ctx *r_ctx;
-
- spin_lock(&pblk->resubmit_lock);
- r_ctx = list_first_entry(&pblk->resubmit_list,
- struct pblk_c_ctx, list);
- list_del(&r_ctx->list);
- spin_unlock(&pblk->resubmit_lock);
-
- secs_avail = r_ctx->nr_valid;
- pos = r_ctx->sentry;
-
- pblk_prepare_resubmit(pblk, pos, secs_avail);
- secs_to_sync = pblk_calc_secs_to_sync(pblk, secs_avail,
- secs_avail);
-
- kfree(r_ctx);
- } else {
- /* If there are no sectors in the cache,
- * flushes (bios without data) will be cleared on
- * the cache threads
- */
- secs_avail = pblk_rb_read_count(&pblk->rwb);
- if (!secs_avail)
- return 0;
-
- secs_to_flush = pblk_rb_flush_point_count(&pblk->rwb);
- if (!secs_to_flush && secs_avail < pblk->min_write_pgs_data)
- return 0;
-
- secs_to_sync = pblk_calc_secs_to_sync(pblk, secs_avail,
- secs_to_flush);
- if (secs_to_sync > pblk->max_write_pgs) {
- pblk_err(pblk, "bad buffer sync calculation\n");
- return 0;
- }
-
- secs_to_com = (secs_to_sync > secs_avail) ?
- secs_avail : secs_to_sync;
- pos = pblk_rb_read_commit(&pblk->rwb, secs_to_com);
- }
-
- packed_meta_pgs = (pblk->min_write_pgs - pblk->min_write_pgs_data);
- bio = bio_alloc(GFP_KERNEL, secs_to_sync + packed_meta_pgs);
-
- bio->bi_iter.bi_sector = 0; /* internal bio */
- bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
-
- rqd = pblk_alloc_rqd(pblk, PBLK_WRITE);
- rqd->bio = bio;
-
- if (pblk_rb_read_to_bio(&pblk->rwb, rqd, pos, secs_to_sync,
- secs_avail)) {
- pblk_err(pblk, "corrupted write bio\n");
- goto fail_put_bio;
- }
-
- if (pblk_submit_io_set(pblk, rqd))
- goto fail_free_bio;
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_long_add(secs_to_sync, &pblk->sub_writes);
-#endif
-
- *secs_left = 1;
- return 0;
-
-fail_free_bio:
- pblk_free_write_rqd(pblk, rqd);
-fail_put_bio:
- bio_put(bio);
- pblk_free_rqd(pblk, rqd, PBLK_WRITE);
-
- return -EINTR;
-}
-
-int pblk_write_ts(void *data)
-{
- struct pblk *pblk = data;
- int secs_left;
- int write_failure = 0;
-
- while (!kthread_should_stop()) {
- if (!write_failure) {
- write_failure = pblk_submit_write(pblk, &secs_left);
-
- if (secs_left)
- continue;
- }
- set_current_state(TASK_INTERRUPTIBLE);
- io_schedule();
- }
-
- return 0;
-}
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2015 IT University of Copenhagen (rrpc.h)
- * Copyright (C) 2016 CNEX Labs
- * Initial release: Matias Bjorling <matias@cnexlabs.com>
- * Write buffering: Javier Gonzalez <javier@cnexlabs.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * Implementation of a Physical Block-device target for Open-channel SSDs.
- *
- */
-
-#ifndef PBLK_H_
-#define PBLK_H_
-
-#include <linux/blkdev.h>
-#include <linux/blk-mq.h>
-#include <linux/bio.h>
-#include <linux/module.h>
-#include <linux/kthread.h>
-#include <linux/vmalloc.h>
-#include <linux/crc32.h>
-#include <linux/uuid.h>
-
-#include <linux/lightnvm.h>
-
-/* Run only GC if less than 1/X blocks are free */
-#define GC_LIMIT_INVERSE 5
-#define GC_TIME_MSECS 1000
-
-#define PBLK_SECTOR (512)
-#define PBLK_EXPOSED_PAGE_SIZE (4096)
-
-#define PBLK_NR_CLOSE_JOBS (4)
-
-#define PBLK_CACHE_NAME_LEN (DISK_NAME_LEN + 16)
-
-/* Max 512 LUNs per device */
-#define PBLK_MAX_LUNS_BITMAP (4)
-
-#define NR_PHY_IN_LOG (PBLK_EXPOSED_PAGE_SIZE / PBLK_SECTOR)
-
-/* Static pool sizes */
-#define PBLK_GEN_WS_POOL_SIZE (2)
-
-#define PBLK_DEFAULT_OP (11)
-
-enum {
- PBLK_READ = READ,
- PBLK_WRITE = WRITE,/* Write from write buffer */
- PBLK_WRITE_INT, /* Internal write - no write buffer */
- PBLK_READ_RECOV, /* Recovery read - errors allowed */
- PBLK_ERASE,
-};
-
-enum {
- /* IO Types */
- PBLK_IOTYPE_USER = 1 << 0,
- PBLK_IOTYPE_GC = 1 << 1,
-
- /* Write buffer flags */
- PBLK_FLUSH_ENTRY = 1 << 2,
- PBLK_WRITTEN_DATA = 1 << 3,
- PBLK_SUBMITTED_ENTRY = 1 << 4,
- PBLK_WRITABLE_ENTRY = 1 << 5,
-};
-
-enum {
- PBLK_BLK_ST_OPEN = 0x1,
- PBLK_BLK_ST_CLOSED = 0x2,
-};
-
-enum {
- PBLK_CHUNK_RESET_START,
- PBLK_CHUNK_RESET_DONE,
- PBLK_CHUNK_RESET_FAILED,
-};
-
-struct pblk_sec_meta {
- u64 reserved;
- __le64 lba;
-};
-
-/* The number of GC lists and the rate-limiter states go together. This way the
- * rate-limiter can dictate how much GC is needed based on resource utilization.
- */
-#define PBLK_GC_NR_LISTS 4
-
-enum {
- PBLK_RL_OFF = 0,
- PBLK_RL_WERR = 1,
- PBLK_RL_HIGH = 2,
- PBLK_RL_MID = 3,
- PBLK_RL_LOW = 4
-};
-
-#define pblk_dma_ppa_size (sizeof(u64) * NVM_MAX_VLBA)
-
-/* write buffer completion context */
-struct pblk_c_ctx {
- struct list_head list; /* Head for out-of-order completion */
-
- unsigned long *lun_bitmap; /* Luns used on current request */
- unsigned int sentry;
- unsigned int nr_valid;
- unsigned int nr_padded;
-};
-
-/* read context */
-struct pblk_g_ctx {
- void *private;
- unsigned long start_time;
- u64 lba;
-};
-
-/* Pad context */
-struct pblk_pad_rq {
- struct pblk *pblk;
- struct completion wait;
- struct kref ref;
-};
-
-/* Recovery context */
-struct pblk_rec_ctx {
- struct pblk *pblk;
- struct nvm_rq *rqd;
- struct work_struct ws_rec;
-};
-
-/* Write context */
-struct pblk_w_ctx {
- struct bio_list bios; /* Original bios - used for completion
- * in REQ_FUA, REQ_FLUSH case
- */
- u64 lba; /* Logic addr. associated with entry */
- struct ppa_addr ppa; /* Physic addr. associated with entry */
- int flags; /* Write context flags */
-};
-
-struct pblk_rb_entry {
- struct ppa_addr cacheline; /* Cacheline for this entry */
- void *data; /* Pointer to data on this entry */
- struct pblk_w_ctx w_ctx; /* Context for this entry */
- struct list_head index; /* List head to enable indexes */
-};
-
-#define EMPTY_ENTRY (~0U)
-
-struct pblk_rb_pages {
- struct page *pages;
- int order;
- struct list_head list;
-};
-
-struct pblk_rb {
- struct pblk_rb_entry *entries; /* Ring buffer entries */
- unsigned int mem; /* Write offset - points to next
- * writable entry in memory
- */
- unsigned int subm; /* Read offset - points to last entry
- * that has been submitted to the media
- * to be persisted
- */
- unsigned int sync; /* Synced - backpointer that signals
- * the last submitted entry that has
- * been successfully persisted to media
- */
- unsigned int flush_point; /* Sync point - last entry that must be
- * flushed to the media. Used with
- * REQ_FLUSH and REQ_FUA
- */
- unsigned int l2p_update; /* l2p update point - next entry for
- * which l2p mapping will be updated to
- * contain a device ppa address (instead
- * of a cacheline
- */
- unsigned int nr_entries; /* Number of entries in write buffer -
- * must be a power of two
- */
- unsigned int seg_size; /* Size of the data segments being
- * stored on each entry. Typically this
- * will be 4KB
- */
-
- unsigned int back_thres; /* Threshold that shall be maintained by
- * the backpointer in order to respect
- * geo->mw_cunits on a per chunk basis
- */
-
- struct list_head pages; /* List of data pages */
-
- spinlock_t w_lock; /* Write lock */
- spinlock_t s_lock; /* Sync lock */
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- atomic_t inflight_flush_point; /* Not served REQ_FLUSH | REQ_FUA */
-#endif
-};
-
-#define PBLK_RECOVERY_SECTORS 16
-
-struct pblk_lun {
- struct ppa_addr bppa;
- struct semaphore wr_sem;
-};
-
-struct pblk_gc_rq {
- struct pblk_line *line;
- void *data;
- u64 paddr_list[NVM_MAX_VLBA];
- u64 lba_list[NVM_MAX_VLBA];
- int nr_secs;
- int secs_to_gc;
- struct list_head list;
-};
-
-struct pblk_gc {
- /* These states are not protected by a lock since (i) they are in the
- * fast path, and (ii) they are not critical.
- */
- int gc_active;
- int gc_enabled;
- int gc_forced;
-
- struct task_struct *gc_ts;
- struct task_struct *gc_writer_ts;
- struct task_struct *gc_reader_ts;
-
- struct workqueue_struct *gc_line_reader_wq;
- struct workqueue_struct *gc_reader_wq;
-
- struct timer_list gc_timer;
-
- struct semaphore gc_sem;
- atomic_t read_inflight_gc; /* Number of lines with inflight GC reads */
- atomic_t pipeline_gc; /* Number of lines in the GC pipeline -
- * started reads to finished writes
- */
- int w_entries;
-
- struct list_head w_list;
- struct list_head r_list;
-
- spinlock_t lock;
- spinlock_t w_lock;
- spinlock_t r_lock;
-};
-
-struct pblk_rl {
- unsigned int high; /* Upper threshold for rate limiter (free run -
- * user I/O rate limiter
- */
- unsigned int high_pw; /* High rounded up as a power of 2 */
-
-#define PBLK_USER_HIGH_THRS 8 /* Begin write limit at 12% available blks */
-#define PBLK_USER_LOW_THRS 10 /* Aggressive GC at 10% available blocks */
-
- int rb_windows_pw; /* Number of rate windows in the write buffer
- * given as a power-of-2. This guarantees that
- * when user I/O is being rate limited, there
- * will be reserved enough space for the GC to
- * place its payload. A window is of
- * pblk->max_write_pgs size, which in NVMe is
- * 64, i.e., 256kb.
- */
- int rb_budget; /* Total number of entries available for I/O */
- int rb_user_max; /* Max buffer entries available for user I/O */
- int rb_gc_max; /* Max buffer entries available for GC I/O */
- int rb_gc_rsv; /* Reserved buffer entries for GC I/O */
- int rb_state; /* Rate-limiter current state */
- int rb_max_io; /* Maximum size for an I/O giving the config */
-
- atomic_t rb_user_cnt; /* User I/O buffer counter */
- atomic_t rb_gc_cnt; /* GC I/O buffer counter */
- atomic_t rb_space; /* Space limit in case of reaching capacity */
-
- int rsv_blocks; /* Reserved blocks for GC */
-
- int rb_user_active;
- int rb_gc_active;
-
- atomic_t werr_lines; /* Number of write error lines that needs gc */
-
- struct timer_list u_timer;
-
- unsigned long total_blocks;
-
- atomic_t free_blocks; /* Total number of free blocks (+ OP) */
- atomic_t free_user_blocks; /* Number of user free blocks (no OP) */
-};
-
-#define PBLK_LINE_EMPTY (~0U)
-
-enum {
- /* Line Types */
- PBLK_LINETYPE_FREE = 0,
- PBLK_LINETYPE_LOG = 1,
- PBLK_LINETYPE_DATA = 2,
-
- /* Line state */
- PBLK_LINESTATE_NEW = 9,
- PBLK_LINESTATE_FREE = 10,
- PBLK_LINESTATE_OPEN = 11,
- PBLK_LINESTATE_CLOSED = 12,
- PBLK_LINESTATE_GC = 13,
- PBLK_LINESTATE_BAD = 14,
- PBLK_LINESTATE_CORRUPT = 15,
-
- /* GC group */
- PBLK_LINEGC_NONE = 20,
- PBLK_LINEGC_EMPTY = 21,
- PBLK_LINEGC_LOW = 22,
- PBLK_LINEGC_MID = 23,
- PBLK_LINEGC_HIGH = 24,
- PBLK_LINEGC_FULL = 25,
- PBLK_LINEGC_WERR = 26
-};
-
-#define PBLK_MAGIC 0x70626c6b /*pblk*/
-
-/* emeta/smeta persistent storage format versions:
- * Changes in major version requires offline migration.
- * Changes in minor version are handled automatically during
- * recovery.
- */
-
-#define SMETA_VERSION_MAJOR (0)
-#define SMETA_VERSION_MINOR (1)
-
-#define EMETA_VERSION_MAJOR (0)
-#define EMETA_VERSION_MINOR (2)
-
-struct line_header {
- __le32 crc;
- __le32 identifier; /* pblk identifier */
- __u8 uuid[16]; /* instance uuid */
- __le16 type; /* line type */
- __u8 version_major; /* version major */
- __u8 version_minor; /* version minor */
- __le32 id; /* line id for current line */
-};
-
-struct line_smeta {
- struct line_header header;
-
- __le32 crc; /* Full structure including struct crc */
- /* Previous line metadata */
- __le32 prev_id; /* Line id for previous line */
-
- /* Current line metadata */
- __le64 seq_nr; /* Sequence number for current line */
-
- /* Active writers */
- __le32 window_wr_lun; /* Number of parallel LUNs to write */
-
- __le32 rsvd[2];
-
- __le64 lun_bitmap[];
-};
-
-
-/*
- * Metadata layout in media:
- * First sector:
- * 1. struct line_emeta
- * 2. bad block bitmap (u64 * window_wr_lun)
- * 3. write amplification counters
- * Mid sectors (start at lbas_sector):
- * 3. nr_lbas (u64) forming lba list
- * Last sectors (start at vsc_sector):
- * 4. u32 valid sector count (vsc) for all lines (~0U: free line)
- */
-struct line_emeta {
- struct line_header header;
-
- __le32 crc; /* Full structure including struct crc */
-
- /* Previous line metadata */
- __le32 prev_id; /* Line id for prev line */
-
- /* Current line metadata */
- __le64 seq_nr; /* Sequence number for current line */
-
- /* Active writers */
- __le32 window_wr_lun; /* Number of parallel LUNs to write */
-
- /* Bookkeeping for recovery */
- __le32 next_id; /* Line id for next line */
- __le64 nr_lbas; /* Number of lbas mapped in line */
- __le64 nr_valid_lbas; /* Number of valid lbas mapped in line */
- __le64 bb_bitmap[]; /* Updated bad block bitmap for line */
-};
-
-
-/* Write amplification counters stored on media */
-struct wa_counters {
- __le64 user; /* Number of user written sectors */
- __le64 gc; /* Number of sectors written by GC*/
- __le64 pad; /* Number of padded sectors */
-};
-
-struct pblk_emeta {
- struct line_emeta *buf; /* emeta buffer in media format */
- int mem; /* Write offset - points to next
- * writable entry in memory
- */
- atomic_t sync; /* Synced - backpointer that signals the
- * last entry that has been successfully
- * persisted to media
- */
- unsigned int nr_entries; /* Number of emeta entries */
-};
-
-struct pblk_smeta {
- struct line_smeta *buf; /* smeta buffer in persistent format */
-};
-
-struct pblk_w_err_gc {
- int has_write_err;
- int has_gc_err;
- __le64 *lba_list;
-};
-
-struct pblk_line {
- struct pblk *pblk;
- unsigned int id; /* Line number corresponds to the
- * block line
- */
- unsigned int seq_nr; /* Unique line sequence number */
-
- int state; /* PBLK_LINESTATE_X */
- int type; /* PBLK_LINETYPE_X */
- int gc_group; /* PBLK_LINEGC_X */
- struct list_head list; /* Free, GC lists */
-
- unsigned long *lun_bitmap; /* Bitmap for LUNs mapped in line */
-
- struct nvm_chk_meta *chks; /* Chunks forming line */
-
- struct pblk_smeta *smeta; /* Start metadata */
- struct pblk_emeta *emeta; /* End medatada */
-
- int meta_line; /* Metadata line id */
- int meta_distance; /* Distance between data and metadata */
-
- u64 emeta_ssec; /* Sector where emeta starts */
-
- unsigned int sec_in_line; /* Number of usable secs in line */
-
- atomic_t blk_in_line; /* Number of good blocks in line */
- unsigned long *blk_bitmap; /* Bitmap for valid/invalid blocks */
- unsigned long *erase_bitmap; /* Bitmap for erased blocks */
-
- unsigned long *map_bitmap; /* Bitmap for mapped sectors in line */
- unsigned long *invalid_bitmap; /* Bitmap for invalid sectors in line */
-
- atomic_t left_eblks; /* Blocks left for erasing */
- atomic_t left_seblks; /* Blocks left for sync erasing */
-
- int left_msecs; /* Sectors left for mapping */
- unsigned int cur_sec; /* Sector map pointer */
- unsigned int nr_valid_lbas; /* Number of valid lbas in line */
-
- __le32 *vsc; /* Valid sector count in line */
-
- struct kref ref; /* Write buffer L2P references */
- atomic_t sec_to_update; /* Outstanding L2P updates to ppa */
-
- struct pblk_w_err_gc *w_err_gc; /* Write error gc recovery metadata */
-
- spinlock_t lock; /* Necessary for invalid_bitmap only */
-};
-
-#define PBLK_DATA_LINES 4
-
-enum {
- PBLK_EMETA_TYPE_HEADER = 1, /* struct line_emeta first sector */
- PBLK_EMETA_TYPE_LLBA = 2, /* lba list - type: __le64 */
- PBLK_EMETA_TYPE_VSC = 3, /* vsc list - type: __le32 */
-};
-
-struct pblk_line_mgmt {
- int nr_lines; /* Total number of full lines */
- int nr_free_lines; /* Number of full lines in free list */
-
- /* Free lists - use free_lock */
- struct list_head free_list; /* Full lines ready to use */
- struct list_head corrupt_list; /* Full lines corrupted */
- struct list_head bad_list; /* Full lines bad */
-
- /* GC lists - use gc_lock */
- struct list_head *gc_lists[PBLK_GC_NR_LISTS];
- struct list_head gc_high_list; /* Full lines ready to GC, high isc */
- struct list_head gc_mid_list; /* Full lines ready to GC, mid isc */
- struct list_head gc_low_list; /* Full lines ready to GC, low isc */
-
- struct list_head gc_werr_list; /* Write err recovery list */
-
- struct list_head gc_full_list; /* Full lines ready to GC, no valid */
- struct list_head gc_empty_list; /* Full lines close, all valid */
-
- struct pblk_line *log_line; /* Current FTL log line */
- struct pblk_line *data_line; /* Current data line */
- struct pblk_line *log_next; /* Next FTL log line */
- struct pblk_line *data_next; /* Next data line */
-
- struct list_head emeta_list; /* Lines queued to schedule emeta */
-
- __le32 *vsc_list; /* Valid sector counts for all lines */
-
- /* Pre-allocated metadata for data lines */
- struct pblk_smeta *sline_meta[PBLK_DATA_LINES];
- struct pblk_emeta *eline_meta[PBLK_DATA_LINES];
- unsigned long meta_bitmap;
-
- /* Cache and mempool for map/invalid bitmaps */
- struct kmem_cache *bitmap_cache;
- mempool_t *bitmap_pool;
-
- /* Helpers for fast bitmap calculations */
- unsigned long *bb_template;
- unsigned long *bb_aux;
-
- unsigned long d_seq_nr; /* Data line unique sequence number */
- unsigned long l_seq_nr; /* Log line unique sequence number */
-
- spinlock_t free_lock;
- spinlock_t close_lock;
- spinlock_t gc_lock;
-};
-
-struct pblk_line_meta {
- unsigned int smeta_len; /* Total length for smeta */
- unsigned int smeta_sec; /* Sectors needed for smeta */
-
- unsigned int emeta_len[4]; /* Lengths for emeta:
- * [0]: Total
- * [1]: struct line_emeta +
- * bb_bitmap + struct wa_counters
- * [2]: L2P portion
- * [3]: vsc
- */
- unsigned int emeta_sec[4]; /* Sectors needed for emeta. Same layout
- * as emeta_len
- */
-
- unsigned int emeta_bb; /* Boundary for bb that affects emeta */
-
- unsigned int vsc_list_len; /* Length for vsc list */
- unsigned int sec_bitmap_len; /* Length for sector bitmap in line */
- unsigned int blk_bitmap_len; /* Length for block bitmap in line */
- unsigned int lun_bitmap_len; /* Length for lun bitmap in line */
-
- unsigned int blk_per_line; /* Number of blocks in a full line */
- unsigned int sec_per_line; /* Number of sectors in a line */
- unsigned int dsec_per_line; /* Number of data sectors in a line */
- unsigned int min_blk_line; /* Min. number of good blocks in line */
-
- unsigned int mid_thrs; /* Threshold for GC mid list */
- unsigned int high_thrs; /* Threshold for GC high list */
-
- unsigned int meta_distance; /* Distance between data and metadata */
-};
-
-enum {
- PBLK_STATE_RUNNING = 0,
- PBLK_STATE_STOPPING = 1,
- PBLK_STATE_RECOVERING = 2,
- PBLK_STATE_STOPPED = 3,
-};
-
-/* Internal format to support not power-of-2 device formats */
-struct pblk_addrf {
- /* gen to dev */
- int sec_stripe;
- int ch_stripe;
- int lun_stripe;
-
- /* dev to gen */
- int sec_lun_stripe;
- int sec_ws_stripe;
-};
-
-struct pblk {
- struct nvm_tgt_dev *dev;
- struct gendisk *disk;
-
- struct kobject kobj;
-
- struct pblk_lun *luns;
-
- struct pblk_line *lines; /* Line array */
- struct pblk_line_mgmt l_mg; /* Line management */
- struct pblk_line_meta lm; /* Line metadata */
-
- struct nvm_addrf addrf; /* Aligned address format */
- struct pblk_addrf uaddrf; /* Unaligned address format */
- int addrf_len;
-
- struct pblk_rb rwb;
-
- int state; /* pblk line state */
-
- int min_write_pgs; /* Minimum amount of pages required by controller */
- int min_write_pgs_data; /* Minimum amount of payload pages */
- int max_write_pgs; /* Maximum amount of pages supported by controller */
- int oob_meta_size; /* Size of OOB sector metadata */
-
- sector_t capacity; /* Device capacity when bad blocks are subtracted */
-
- int op; /* Percentage of device used for over-provisioning */
- int op_blks; /* Number of blocks used for over-provisioning */
-
- /* pblk provisioning values. Used by rate limiter */
- struct pblk_rl rl;
-
- int sec_per_write;
-
- guid_t instance_uuid;
-
- /* Persistent write amplification counters, 4kb sector I/Os */
- atomic64_t user_wa; /* Sectors written by user */
- atomic64_t gc_wa; /* Sectors written by GC */
- atomic64_t pad_wa; /* Padded sectors written */
-
- /* Reset values for delta write amplification measurements */
- u64 user_rst_wa;
- u64 gc_rst_wa;
- u64 pad_rst_wa;
-
- /* Counters used for calculating padding distribution */
- atomic64_t *pad_dist; /* Padding distribution buckets */
- u64 nr_flush_rst; /* Flushes reset value for pad dist.*/
- atomic64_t nr_flush; /* Number of flush/fua I/O */
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
- /* Non-persistent debug counters, 4kb sector I/Os */
- atomic_long_t inflight_writes; /* Inflight writes (user and gc) */
- atomic_long_t padded_writes; /* Sectors padded due to flush/fua */
- atomic_long_t padded_wb; /* Sectors padded in write buffer */
- atomic_long_t req_writes; /* Sectors stored on write buffer */
- atomic_long_t sub_writes; /* Sectors submitted from buffer */
- atomic_long_t sync_writes; /* Sectors synced to media */
- atomic_long_t inflight_reads; /* Inflight sector read requests */
- atomic_long_t cache_reads; /* Read requests that hit the cache */
- atomic_long_t sync_reads; /* Completed sector read requests */
- atomic_long_t recov_writes; /* Sectors submitted from recovery */
- atomic_long_t recov_gc_writes; /* Sectors submitted from write GC */
- atomic_long_t recov_gc_reads; /* Sectors submitted from read GC */
-#endif
-
- spinlock_t lock;
-
- atomic_long_t read_failed;
- atomic_long_t read_empty;
- atomic_long_t read_high_ecc;
- atomic_long_t read_failed_gc;
- atomic_long_t write_failed;
- atomic_long_t erase_failed;
-
- atomic_t inflight_io; /* General inflight I/O counter */
-
- struct task_struct *writer_ts;
-
- /* Simple translation map of logical addresses to physical addresses.
- * The logical addresses is known by the host system, while the physical
- * addresses are used when writing to the disk block device.
- */
- unsigned char *trans_map;
- spinlock_t trans_lock;
-
- struct list_head compl_list;
-
- spinlock_t resubmit_lock; /* Resubmit list lock */
- struct list_head resubmit_list; /* Resubmit list for failed writes*/
-
- mempool_t page_bio_pool;
- mempool_t gen_ws_pool;
- mempool_t rec_pool;
- mempool_t r_rq_pool;
- mempool_t w_rq_pool;
- mempool_t e_rq_pool;
-
- struct workqueue_struct *close_wq;
- struct workqueue_struct *bb_wq;
- struct workqueue_struct *r_end_wq;
-
- struct timer_list wtimer;
-
- struct pblk_gc gc;
-};
-
-struct pblk_line_ws {
- struct pblk *pblk;
- struct pblk_line *line;
- void *priv;
- struct work_struct ws;
-};
-
-#define pblk_g_rq_size (sizeof(struct nvm_rq) + sizeof(struct pblk_g_ctx))
-#define pblk_w_rq_size (sizeof(struct nvm_rq) + sizeof(struct pblk_c_ctx))
-
-#define pblk_err(pblk, fmt, ...) \
- pr_err("pblk %s: " fmt, pblk->disk->disk_name, ##__VA_ARGS__)
-#define pblk_info(pblk, fmt, ...) \
- pr_info("pblk %s: " fmt, pblk->disk->disk_name, ##__VA_ARGS__)
-#define pblk_warn(pblk, fmt, ...) \
- pr_warn("pblk %s: " fmt, pblk->disk->disk_name, ##__VA_ARGS__)
-#define pblk_debug(pblk, fmt, ...) \
- pr_debug("pblk %s: " fmt, pblk->disk->disk_name, ##__VA_ARGS__)
-
-/*
- * pblk ring buffer operations
- */
-int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
- unsigned int seg_sz);
-int pblk_rb_may_write_user(struct pblk_rb *rb, struct bio *bio,
- unsigned int nr_entries, unsigned int *pos);
-int pblk_rb_may_write_gc(struct pblk_rb *rb, unsigned int nr_entries,
- unsigned int *pos);
-void pblk_rb_write_entry_user(struct pblk_rb *rb, void *data,
- struct pblk_w_ctx w_ctx, unsigned int pos);
-void pblk_rb_write_entry_gc(struct pblk_rb *rb, void *data,
- struct pblk_w_ctx w_ctx, struct pblk_line *line,
- u64 paddr, unsigned int pos);
-struct pblk_w_ctx *pblk_rb_w_ctx(struct pblk_rb *rb, unsigned int pos);
-void pblk_rb_flush(struct pblk_rb *rb);
-
-void pblk_rb_sync_l2p(struct pblk_rb *rb);
-unsigned int pblk_rb_read_to_bio(struct pblk_rb *rb, struct nvm_rq *rqd,
- unsigned int pos, unsigned int nr_entries,
- unsigned int count);
-int pblk_rb_copy_to_bio(struct pblk_rb *rb, struct bio *bio, sector_t lba,
- struct ppa_addr ppa);
-unsigned int pblk_rb_read_commit(struct pblk_rb *rb, unsigned int entries);
-
-unsigned int pblk_rb_sync_init(struct pblk_rb *rb, unsigned long *flags);
-unsigned int pblk_rb_sync_advance(struct pblk_rb *rb, unsigned int nr_entries);
-unsigned int pblk_rb_ptr_wrap(struct pblk_rb *rb, unsigned int p,
- unsigned int nr_entries);
-void pblk_rb_sync_end(struct pblk_rb *rb, unsigned long *flags);
-unsigned int pblk_rb_flush_point_count(struct pblk_rb *rb);
-
-unsigned int pblk_rb_read_count(struct pblk_rb *rb);
-unsigned int pblk_rb_sync_count(struct pblk_rb *rb);
-unsigned int pblk_rb_wrap_pos(struct pblk_rb *rb, unsigned int pos);
-
-int pblk_rb_tear_down_check(struct pblk_rb *rb);
-int pblk_rb_pos_oob(struct pblk_rb *rb, u64 pos);
-void pblk_rb_free(struct pblk_rb *rb);
-ssize_t pblk_rb_sysfs(struct pblk_rb *rb, char *buf);
-
-/*
- * pblk core
- */
-struct nvm_rq *pblk_alloc_rqd(struct pblk *pblk, int type);
-void pblk_free_rqd(struct pblk *pblk, struct nvm_rq *rqd, int type);
-int pblk_alloc_rqd_meta(struct pblk *pblk, struct nvm_rq *rqd);
-void pblk_free_rqd_meta(struct pblk *pblk, struct nvm_rq *rqd);
-void pblk_set_sec_per_write(struct pblk *pblk, int sec_per_write);
-int pblk_setup_w_rec_rq(struct pblk *pblk, struct nvm_rq *rqd,
- struct pblk_c_ctx *c_ctx);
-void pblk_discard(struct pblk *pblk, struct bio *bio);
-struct nvm_chk_meta *pblk_get_chunk_meta(struct pblk *pblk);
-struct nvm_chk_meta *pblk_chunk_get_off(struct pblk *pblk,
- struct nvm_chk_meta *lp,
- struct ppa_addr ppa);
-void pblk_log_write_err(struct pblk *pblk, struct nvm_rq *rqd);
-void pblk_log_read_err(struct pblk *pblk, struct nvm_rq *rqd);
-int pblk_submit_io(struct pblk *pblk, struct nvm_rq *rqd, void *buf);
-int pblk_submit_io_sync(struct pblk *pblk, struct nvm_rq *rqd, void *buf);
-int pblk_submit_meta_io(struct pblk *pblk, struct pblk_line *meta_line);
-void pblk_check_chunk_state_update(struct pblk *pblk, struct nvm_rq *rqd);
-struct pblk_line *pblk_line_get(struct pblk *pblk);
-struct pblk_line *pblk_line_get_first_data(struct pblk *pblk);
-struct pblk_line *pblk_line_replace_data(struct pblk *pblk);
-void pblk_ppa_to_line_put(struct pblk *pblk, struct ppa_addr ppa);
-void pblk_rq_to_line_put(struct pblk *pblk, struct nvm_rq *rqd);
-int pblk_line_recov_alloc(struct pblk *pblk, struct pblk_line *line);
-void pblk_line_recov_close(struct pblk *pblk, struct pblk_line *line);
-struct pblk_line *pblk_line_get_data(struct pblk *pblk);
-struct pblk_line *pblk_line_get_erase(struct pblk *pblk);
-int pblk_line_erase(struct pblk *pblk, struct pblk_line *line);
-int pblk_line_is_full(struct pblk_line *line);
-void pblk_line_free(struct pblk_line *line);
-void pblk_line_close_meta(struct pblk *pblk, struct pblk_line *line);
-void pblk_line_close(struct pblk *pblk, struct pblk_line *line);
-void pblk_line_close_ws(struct work_struct *work);
-void pblk_pipeline_stop(struct pblk *pblk);
-void __pblk_pipeline_stop(struct pblk *pblk);
-void __pblk_pipeline_flush(struct pblk *pblk);
-void pblk_gen_run_ws(struct pblk *pblk, struct pblk_line *line, void *priv,
- void (*work)(struct work_struct *), gfp_t gfp_mask,
- struct workqueue_struct *wq);
-u64 pblk_line_smeta_start(struct pblk *pblk, struct pblk_line *line);
-int pblk_line_smeta_read(struct pblk *pblk, struct pblk_line *line);
-int pblk_line_emeta_read(struct pblk *pblk, struct pblk_line *line,
- void *emeta_buf);
-int pblk_blk_erase_async(struct pblk *pblk, struct ppa_addr erase_ppa);
-void pblk_line_put(struct kref *ref);
-void pblk_line_put_wq(struct kref *ref);
-struct list_head *pblk_line_gc_list(struct pblk *pblk, struct pblk_line *line);
-u64 pblk_lookup_page(struct pblk *pblk, struct pblk_line *line);
-void pblk_dealloc_page(struct pblk *pblk, struct pblk_line *line, int nr_secs);
-u64 pblk_alloc_page(struct pblk *pblk, struct pblk_line *line, int nr_secs);
-u64 __pblk_alloc_page(struct pblk *pblk, struct pblk_line *line, int nr_secs);
-int pblk_calc_secs(struct pblk *pblk, unsigned long secs_avail,
- unsigned long secs_to_flush, bool skip_meta);
-void pblk_down_rq(struct pblk *pblk, struct ppa_addr ppa,
- unsigned long *lun_bitmap);
-void pblk_down_chunk(struct pblk *pblk, struct ppa_addr ppa);
-void pblk_up_chunk(struct pblk *pblk, struct ppa_addr ppa);
-void pblk_up_rq(struct pblk *pblk, unsigned long *lun_bitmap);
-int pblk_bio_add_pages(struct pblk *pblk, struct bio *bio, gfp_t flags,
- int nr_pages);
-void pblk_bio_free_pages(struct pblk *pblk, struct bio *bio, int off,
- int nr_pages);
-void pblk_map_invalidate(struct pblk *pblk, struct ppa_addr ppa);
-void __pblk_map_invalidate(struct pblk *pblk, struct pblk_line *line,
- u64 paddr);
-void pblk_update_map(struct pblk *pblk, sector_t lba, struct ppa_addr ppa);
-void pblk_update_map_cache(struct pblk *pblk, sector_t lba,
- struct ppa_addr ppa);
-void pblk_update_map_dev(struct pblk *pblk, sector_t lba,
- struct ppa_addr ppa, struct ppa_addr entry_line);
-int pblk_update_map_gc(struct pblk *pblk, sector_t lba, struct ppa_addr ppa,
- struct pblk_line *gc_line, u64 paddr);
-void pblk_lookup_l2p_rand(struct pblk *pblk, struct ppa_addr *ppas,
- u64 *lba_list, int nr_secs);
-int pblk_lookup_l2p_seq(struct pblk *pblk, struct ppa_addr *ppas,
- sector_t blba, int nr_secs, bool *from_cache);
-void *pblk_get_meta_for_writes(struct pblk *pblk, struct nvm_rq *rqd);
-void pblk_get_packed_meta(struct pblk *pblk, struct nvm_rq *rqd);
-
-/*
- * pblk user I/O write path
- */
-void pblk_write_to_cache(struct pblk *pblk, struct bio *bio,
- unsigned long flags);
-int pblk_write_gc_to_cache(struct pblk *pblk, struct pblk_gc_rq *gc_rq);
-
-/*
- * pblk map
- */
-int pblk_map_erase_rq(struct pblk *pblk, struct nvm_rq *rqd,
- unsigned int sentry, unsigned long *lun_bitmap,
- unsigned int valid_secs, struct ppa_addr *erase_ppa);
-int pblk_map_rq(struct pblk *pblk, struct nvm_rq *rqd, unsigned int sentry,
- unsigned long *lun_bitmap, unsigned int valid_secs,
- unsigned int off);
-
-/*
- * pblk write thread
- */
-int pblk_write_ts(void *data);
-void pblk_write_timer_fn(struct timer_list *t);
-void pblk_write_should_kick(struct pblk *pblk);
-void pblk_write_kick(struct pblk *pblk);
-
-/*
- * pblk read path
- */
-extern struct bio_set pblk_bio_set;
-void pblk_submit_read(struct pblk *pblk, struct bio *bio);
-int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq);
-/*
- * pblk recovery
- */
-struct pblk_line *pblk_recov_l2p(struct pblk *pblk);
-int pblk_recov_pad(struct pblk *pblk);
-int pblk_recov_check_emeta(struct pblk *pblk, struct line_emeta *emeta);
-
-/*
- * pblk gc
- */
-#define PBLK_GC_MAX_READERS 8 /* Max number of outstanding GC reader jobs */
-#define PBLK_GC_RQ_QD 128 /* Queue depth for inflight GC requests */
-#define PBLK_GC_L_QD 4 /* Queue depth for inflight GC lines */
-
-int pblk_gc_init(struct pblk *pblk);
-void pblk_gc_exit(struct pblk *pblk, bool graceful);
-void pblk_gc_should_start(struct pblk *pblk);
-void pblk_gc_should_stop(struct pblk *pblk);
-void pblk_gc_should_kick(struct pblk *pblk);
-void pblk_gc_free_full_lines(struct pblk *pblk);
-void pblk_gc_sysfs_state_show(struct pblk *pblk, int *gc_enabled,
- int *gc_active);
-int pblk_gc_sysfs_force(struct pblk *pblk, int force);
-void pblk_put_line_back(struct pblk *pblk, struct pblk_line *line);
-
-/*
- * pblk rate limiter
- */
-void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
-void pblk_rl_free(struct pblk_rl *rl);
-void pblk_rl_update_rates(struct pblk_rl *rl);
-int pblk_rl_high_thrs(struct pblk_rl *rl);
-unsigned long pblk_rl_nr_free_blks(struct pblk_rl *rl);
-unsigned long pblk_rl_nr_user_free_blks(struct pblk_rl *rl);
-int pblk_rl_user_may_insert(struct pblk_rl *rl, int nr_entries);
-void pblk_rl_inserted(struct pblk_rl *rl, int nr_entries);
-void pblk_rl_user_in(struct pblk_rl *rl, int nr_entries);
-int pblk_rl_gc_may_insert(struct pblk_rl *rl, int nr_entries);
-void pblk_rl_gc_in(struct pblk_rl *rl, int nr_entries);
-void pblk_rl_out(struct pblk_rl *rl, int nr_user, int nr_gc);
-int pblk_rl_max_io(struct pblk_rl *rl);
-void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line);
-void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line,
- bool used);
-int pblk_rl_is_limit(struct pblk_rl *rl);
-
-void pblk_rl_werr_line_in(struct pblk_rl *rl);
-void pblk_rl_werr_line_out(struct pblk_rl *rl);
-
-/*
- * pblk sysfs
- */
-int pblk_sysfs_init(struct gendisk *tdisk);
-void pblk_sysfs_exit(struct gendisk *tdisk);
-
-static inline struct nvm_rq *nvm_rq_from_c_ctx(void *c_ctx)
-{
- return c_ctx - sizeof(struct nvm_rq);
-}
-
-static inline void *emeta_to_bb(struct line_emeta *emeta)
-{
- return emeta->bb_bitmap;
-}
-
-static inline void *emeta_to_wa(struct pblk_line_meta *lm,
- struct line_emeta *emeta)
-{
- return emeta->bb_bitmap + lm->blk_bitmap_len;
-}
-
-static inline void *emeta_to_lbas(struct pblk *pblk, struct line_emeta *emeta)
-{
- return ((void *)emeta + pblk->lm.emeta_len[1]);
-}
-
-static inline void *emeta_to_vsc(struct pblk *pblk, struct line_emeta *emeta)
-{
- return (emeta_to_lbas(pblk, emeta) + pblk->lm.emeta_len[2]);
-}
-
-static inline int pblk_line_vsc(struct pblk_line *line)
-{
- return le32_to_cpu(*line->vsc);
-}
-
-static inline int pblk_ppa_to_line_id(struct ppa_addr p)
-{
- return p.a.blk;
-}
-
-static inline struct pblk_line *pblk_ppa_to_line(struct pblk *pblk,
- struct ppa_addr p)
-{
- return &pblk->lines[pblk_ppa_to_line_id(p)];
-}
-
-static inline int pblk_ppa_to_pos(struct nvm_geo *geo, struct ppa_addr p)
-{
- return p.a.lun * geo->num_ch + p.a.ch;
-}
-
-static inline struct ppa_addr addr_to_gen_ppa(struct pblk *pblk, u64 paddr,
- u64 line_id)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct ppa_addr ppa;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = (struct nvm_addrf_12 *)&pblk->addrf;
-
- ppa.ppa = 0;
- ppa.g.blk = line_id;
- ppa.g.pg = (paddr & ppaf->pg_mask) >> ppaf->pg_offset;
- ppa.g.lun = (paddr & ppaf->lun_mask) >> ppaf->lun_offset;
- ppa.g.ch = (paddr & ppaf->ch_mask) >> ppaf->ch_offset;
- ppa.g.pl = (paddr & ppaf->pln_mask) >> ppaf->pln_offset;
- ppa.g.sec = (paddr & ppaf->sec_mask) >> ppaf->sec_offset;
- } else {
- struct pblk_addrf *uaddrf = &pblk->uaddrf;
- int secs, chnls, luns;
-
- ppa.ppa = 0;
-
- ppa.m.chk = line_id;
-
- paddr = div_u64_rem(paddr, uaddrf->sec_stripe, &secs);
- ppa.m.sec = secs;
-
- paddr = div_u64_rem(paddr, uaddrf->ch_stripe, &chnls);
- ppa.m.grp = chnls;
-
- paddr = div_u64_rem(paddr, uaddrf->lun_stripe, &luns);
- ppa.m.pu = luns;
-
- ppa.m.sec += uaddrf->sec_stripe * paddr;
- }
-
- return ppa;
-}
-
-static inline struct nvm_chk_meta *pblk_dev_ppa_to_chunk(struct pblk *pblk,
- struct ppa_addr p)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- struct pblk_line *line = pblk_ppa_to_line(pblk, p);
- int pos = pblk_ppa_to_pos(geo, p);
-
- return &line->chks[pos];
-}
-
-static inline u64 pblk_dev_ppa_to_chunk_addr(struct pblk *pblk,
- struct ppa_addr p)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
-
- return dev_to_chunk_addr(dev->parent, &pblk->addrf, p);
-}
-
-static inline u64 pblk_dev_ppa_to_line_addr(struct pblk *pblk,
- struct ppa_addr p)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct nvm_geo *geo = &dev->geo;
- u64 paddr;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = (struct nvm_addrf_12 *)&pblk->addrf;
-
- paddr = (u64)p.g.ch << ppaf->ch_offset;
- paddr |= (u64)p.g.lun << ppaf->lun_offset;
- paddr |= (u64)p.g.pg << ppaf->pg_offset;
- paddr |= (u64)p.g.pl << ppaf->pln_offset;
- paddr |= (u64)p.g.sec << ppaf->sec_offset;
- } else {
- struct pblk_addrf *uaddrf = &pblk->uaddrf;
- u64 secs = p.m.sec;
- int sec_stripe;
-
- paddr = (u64)p.m.grp * uaddrf->sec_stripe;
- paddr += (u64)p.m.pu * uaddrf->sec_lun_stripe;
-
- secs = div_u64_rem(secs, uaddrf->sec_stripe, &sec_stripe);
- paddr += secs * uaddrf->sec_ws_stripe;
- paddr += sec_stripe;
- }
-
- return paddr;
-}
-
-static inline struct ppa_addr pblk_ppa32_to_ppa64(struct pblk *pblk, u32 ppa32)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
-
- return nvm_ppa32_to_ppa64(dev->parent, &pblk->addrf, ppa32);
-}
-
-static inline u32 pblk_ppa64_to_ppa32(struct pblk *pblk, struct ppa_addr ppa64)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
-
- return nvm_ppa64_to_ppa32(dev->parent, &pblk->addrf, ppa64);
-}
-
-static inline struct ppa_addr pblk_trans_map_get(struct pblk *pblk,
- sector_t lba)
-{
- struct ppa_addr ppa;
-
- if (pblk->addrf_len < 32) {
- u32 *map = (u32 *)pblk->trans_map;
-
- ppa = pblk_ppa32_to_ppa64(pblk, map[lba]);
- } else {
- struct ppa_addr *map = (struct ppa_addr *)pblk->trans_map;
-
- ppa = map[lba];
- }
-
- return ppa;
-}
-
-static inline void pblk_trans_map_set(struct pblk *pblk, sector_t lba,
- struct ppa_addr ppa)
-{
- if (pblk->addrf_len < 32) {
- u32 *map = (u32 *)pblk->trans_map;
-
- map[lba] = pblk_ppa64_to_ppa32(pblk, ppa);
- } else {
- u64 *map = (u64 *)pblk->trans_map;
-
- map[lba] = ppa.ppa;
- }
-}
-
-static inline int pblk_ppa_empty(struct ppa_addr ppa_addr)
-{
- return (ppa_addr.ppa == ADDR_EMPTY);
-}
-
-static inline void pblk_ppa_set_empty(struct ppa_addr *ppa_addr)
-{
- ppa_addr->ppa = ADDR_EMPTY;
-}
-
-static inline bool pblk_ppa_comp(struct ppa_addr lppa, struct ppa_addr rppa)
-{
- return (lppa.ppa == rppa.ppa);
-}
-
-static inline int pblk_addr_in_cache(struct ppa_addr ppa)
-{
- return (ppa.ppa != ADDR_EMPTY && ppa.c.is_cached);
-}
-
-static inline int pblk_addr_to_cacheline(struct ppa_addr ppa)
-{
- return ppa.c.line;
-}
-
-static inline struct ppa_addr pblk_cacheline_to_addr(int addr)
-{
- struct ppa_addr p;
-
- p.c.line = addr;
- p.c.is_cached = 1;
-
- return p;
-}
-
-static inline u32 pblk_calc_meta_header_crc(struct pblk *pblk,
- struct line_header *header)
-{
- u32 crc = ~(u32)0;
-
- crc = crc32_le(crc, (unsigned char *)header + sizeof(crc),
- sizeof(struct line_header) - sizeof(crc));
-
- return crc;
-}
-
-static inline u32 pblk_calc_smeta_crc(struct pblk *pblk,
- struct line_smeta *smeta)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- u32 crc = ~(u32)0;
-
- crc = crc32_le(crc, (unsigned char *)smeta +
- sizeof(struct line_header) + sizeof(crc),
- lm->smeta_len -
- sizeof(struct line_header) - sizeof(crc));
-
- return crc;
-}
-
-static inline u32 pblk_calc_emeta_crc(struct pblk *pblk,
- struct line_emeta *emeta)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- u32 crc = ~(u32)0;
-
- crc = crc32_le(crc, (unsigned char *)emeta +
- sizeof(struct line_header) + sizeof(crc),
- lm->emeta_len[0] -
- sizeof(struct line_header) - sizeof(crc));
-
- return crc;
-}
-
-static inline int pblk_io_aligned(struct pblk *pblk, int nr_secs)
-{
- return !(nr_secs % pblk->min_write_pgs);
-}
-
-#ifdef CONFIG_NVM_PBLK_DEBUG
-static inline void print_ppa(struct pblk *pblk, struct ppa_addr *p,
- char *msg, int error)
-{
- struct nvm_geo *geo = &pblk->dev->geo;
-
- if (p->c.is_cached) {
- pblk_err(pblk, "ppa: (%s: %x) cache line: %llu\n",
- msg, error, (u64)p->c.line);
- } else if (geo->version == NVM_OCSSD_SPEC_12) {
- pblk_err(pblk, "ppa: (%s: %x):ch:%d,lun:%d,blk:%d,pg:%d,pl:%d,sec:%d\n",
- msg, error,
- p->g.ch, p->g.lun, p->g.blk,
- p->g.pg, p->g.pl, p->g.sec);
- } else {
- pblk_err(pblk, "ppa: (%s: %x):ch:%d,lun:%d,chk:%d,sec:%d\n",
- msg, error,
- p->m.grp, p->m.pu, p->m.chk, p->m.sec);
- }
-}
-
-static inline void pblk_print_failed_rqd(struct pblk *pblk, struct nvm_rq *rqd,
- int error)
-{
- int bit = -1;
-
- if (rqd->nr_ppas == 1) {
- print_ppa(pblk, &rqd->ppa_addr, "rqd", error);
- return;
- }
-
- while ((bit = find_next_bit((void *)&rqd->ppa_status, rqd->nr_ppas,
- bit + 1)) < rqd->nr_ppas) {
- print_ppa(pblk, &rqd->ppa_list[bit], "rqd", error);
- }
-
- pblk_err(pblk, "error:%d, ppa_status:%llx\n", error, rqd->ppa_status);
-}
-
-static inline int pblk_boundary_ppa_checks(struct nvm_tgt_dev *tgt_dev,
- struct ppa_addr *ppas, int nr_ppas)
-{
- struct nvm_geo *geo = &tgt_dev->geo;
- struct ppa_addr *ppa;
- int i;
-
- for (i = 0; i < nr_ppas; i++) {
- ppa = &ppas[i];
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- if (!ppa->c.is_cached &&
- ppa->g.ch < geo->num_ch &&
- ppa->g.lun < geo->num_lun &&
- ppa->g.pl < geo->num_pln &&
- ppa->g.blk < geo->num_chk &&
- ppa->g.pg < geo->num_pg &&
- ppa->g.sec < geo->ws_min)
- continue;
- } else {
- if (!ppa->c.is_cached &&
- ppa->m.grp < geo->num_ch &&
- ppa->m.pu < geo->num_lun &&
- ppa->m.chk < geo->num_chk &&
- ppa->m.sec < geo->clba)
- continue;
- }
-
- print_ppa(tgt_dev->q->queuedata, ppa, "boundary", i);
-
- return 1;
- }
- return 0;
-}
-
-static inline int pblk_check_io(struct pblk *pblk, struct nvm_rq *rqd)
-{
- struct nvm_tgt_dev *dev = pblk->dev;
- struct ppa_addr *ppa_list = nvm_rq_to_ppa_list(rqd);
-
- if (pblk_boundary_ppa_checks(dev, ppa_list, rqd->nr_ppas)) {
- WARN_ON(1);
- return -EINVAL;
- }
-
- if (rqd->opcode == NVM_OP_PWRITE) {
- struct pblk_line *line;
- int i;
-
- for (i = 0; i < rqd->nr_ppas; i++) {
- line = pblk_ppa_to_line(pblk, ppa_list[i]);
-
- spin_lock(&line->lock);
- if (line->state != PBLK_LINESTATE_OPEN) {
- pblk_err(pblk, "bad ppa: line:%d,state:%d\n",
- line->id, line->state);
- WARN_ON(1);
- spin_unlock(&line->lock);
- return -EINVAL;
- }
- spin_unlock(&line->lock);
- }
- }
-
- return 0;
-}
-#endif
-
-static inline int pblk_boundary_paddr_checks(struct pblk *pblk, u64 paddr)
-{
- struct pblk_line_meta *lm = &pblk->lm;
-
- if (paddr > lm->sec_per_line)
- return 1;
-
- return 0;
-}
-
-static inline unsigned int pblk_get_bi_idx(struct bio *bio)
-{
- return bio->bi_iter.bi_idx;
-}
-
-static inline sector_t pblk_get_lba(struct bio *bio)
-{
- return bio->bi_iter.bi_sector / NR_PHY_IN_LOG;
-}
-
-static inline unsigned int pblk_get_secs(struct bio *bio)
-{
- return bio->bi_iter.bi_size / PBLK_EXPOSED_PAGE_SIZE;
-}
-
-static inline char *pblk_disk_name(struct pblk *pblk)
-{
- struct gendisk *disk = pblk->disk;
-
- return disk->disk_name;
-}
-
-static inline unsigned int pblk_get_min_chks(struct pblk *pblk)
-{
- struct pblk_line_meta *lm = &pblk->lm;
- /* In a worst-case scenario every line will have OP invalid sectors.
- * We will then need a minimum of 1/OP lines to free up a single line
- */
-
- return DIV_ROUND_UP(100, pblk->op) * lm->blk_per_line;
-}
-
-static inline struct pblk_sec_meta *pblk_get_meta(struct pblk *pblk,
- void *meta, int index)
-{
- return meta +
- max_t(int, sizeof(struct pblk_sec_meta), pblk->oob_meta_size)
- * index;
-}
-
-static inline int pblk_dma_meta_size(struct pblk *pblk)
-{
- return max_t(int, sizeof(struct pblk_sec_meta), pblk->oob_meta_size)
- * NVM_MAX_VLBA;
-}
-
-static inline int pblk_is_oob_meta_supported(struct pblk *pblk)
-{
- return pblk->oob_meta_size >= sizeof(struct pblk_sec_meta);
-}
-#endif /* PBLK_H_ */
config BLK_DEV_MD
tristate "RAID support"
+ select BLOCK_HOLDER_DEPRECATED if SYSFS
help
This driver lets you combine several hard disk partitions into one
logical block device. This can be used to simply append one
config BLK_DEV_DM
tristate "Device mapper support"
+ select BLOCK_HOLDER_DEPRECATED if SYSFS
select BLK_DEV_DM_BUILTIN
depends on DAX || DAX=n
help
config DM_EBS
tristate "Emulated block size target (EXPERIMENTAL)"
- depends on BLK_DEV_DM
+ depends on BLK_DEV_DM && !HIGHMEM
select DM_BUFIO
help
dm-ebs emulates smaller logical block size on backing devices
config BCACHE
tristate "Block device as cache"
+ select BLOCK_HOLDER_DEPRECATED if SYSFS
select CRC64
help
Allows a block device to be used as cache for other devices; uses
struct bvec_iter_all iter_all;
bio_for_each_segment_all(bv, b->bio, iter_all) {
- memcpy(page_address(bv->bv_page), addr, PAGE_SIZE);
+ memcpy(bvec_virt(bv), addr, PAGE_SIZE);
addr += PAGE_SIZE;
}
bcache_device_detach(d);
if (disk) {
- bool disk_added = (disk->flags & GENHD_FL_UP) != 0;
-
- if (disk_added)
- del_gendisk(disk);
-
blk_cleanup_disk(disk);
ida_simple_remove(&bcache_device_idx,
first_minor_to_idx(disk->first_minor));
n = BITS_TO_LONGS(d->nr_stripes) * sizeof(unsigned long);
d->full_dirty_stripes = kvzalloc(n, GFP_KERNEL);
if (!d->full_dirty_stripes)
- return -ENOMEM;
+ goto out_free_stripe_sectors_dirty;
idx = ida_simple_get(&bcache_device_idx, 0,
BCACHE_DEVICE_IDX_MAX, GFP_KERNEL);
if (idx < 0)
- return idx;
+ goto out_free_full_dirty_stripes;
if (bioset_init(&d->bio_split, 4, offsetof(struct bbio, bio),
BIOSET_NEED_BVECS|BIOSET_NEED_RESCUER))
- goto err;
+ goto out_ida_remove;
d->disk = blk_alloc_disk(NUMA_NO_NODE);
if (!d->disk)
- goto err;
+ goto out_bioset_exit;
set_capacity(d->disk, sectors);
snprintf(d->disk->disk_name, DISK_NAME_LEN, "bcache%i", idx);
return 0;
-err:
+out_bioset_exit:
+ bioset_exit(&d->bio_split);
+out_ida_remove:
ida_simple_remove(&bcache_device_idx, idx);
+out_free_full_dirty_stripes:
+ kvfree(d->full_dirty_stripes);
+out_free_stripe_sectors_dirty:
+ kvfree(d->stripe_sectors_dirty);
return -ENOMEM;
}
mutex_lock(&bch_register_lock);
- if (atomic_read(&dc->running))
+ if (atomic_read(&dc->running)) {
bd_unlink_disk_holder(dc->bdev, dc->disk.disk);
+ del_gendisk(dc->disk.disk);
+ }
bcache_device_free(&dc->disk);
list_del(&dc->list);
mutex_lock(&bch_register_lock);
atomic_long_sub(bcache_dev_sectors_dirty(d),
&d->c->flash_dev_dirty_sectors);
+ del_gendisk(d->disk);
bcache_device_free(d);
mutex_unlock(&bch_register_lock);
kobject_put(&d->kobj);
#include "closure.h"
-#define PAGE_SECTORS (PAGE_SIZE / 512)
-
struct closure;
#ifdef CONFIG_BCACHE_DEBUG
if (unlikely(!bv->bv_page || !bv_len))
return -EIO;
- pa = page_address(bv->bv_page) + bv->bv_offset;
+ pa = bvec_virt(bv);
/* Handle overlapping page <-> blocks */
while (bv_len) {
unsigned this_len;
BUG_ON(PageHighMem(biv.bv_page));
- tag = lowmem_page_address(biv.bv_page) + biv.bv_offset;
+ tag = bvec_virt(&biv);
this_len = min(biv.bv_len, data_to_process);
r = dm_integrity_rw_tag(ic, tag, &dio->metadata_block, &dio->metadata_offset,
this_len, dio->op == REQ_OP_READ ? TAG_READ : TAG_WRITE);
unsigned tag_now = min(biv.bv_len, tag_todo);
char *tag_addr;
BUG_ON(PageHighMem(biv.bv_page));
- tag_addr = lowmem_page_address(biv.bv_page) + biv.bv_offset;
+ tag_addr = bvec_virt(&biv);
if (likely(dio->op == REQ_OP_WRITE))
memcpy(tag_ptr, tag_addr, tag_now);
else
}
if (dm_get_md_type(md) == DM_TYPE_NONE) {
- /* Initial table load: acquire type of table. */
- dm_set_md_type(md, dm_table_get_type(t));
-
/* setup md->queue to reflect md's type (may block) */
r = dm_setup_md_queue(md, t);
if (r) {
if (r)
goto err_destroy_table;
- md->type = dm_table_get_type(t);
/* setup md->queue to reflect md's type (may block) */
r = dm_setup_md_queue(md, t);
if (r) {
err = blk_mq_init_allocated_queue(md->tag_set, md->queue);
if (err)
goto out_tag_set;
- elevator_init_mq(md->queue);
return 0;
out_tag_set:
}
dm_update_keyslot_manager(q, t);
- blk_queue_update_readahead(q);
+ disk_update_readahead(t->md->disk);
return 0;
}
static void bio_copy_block(struct dm_writecache *wc, struct bio *bio, void *data)
{
void *buf;
- unsigned long flags;
unsigned size;
int rw = bio_data_dir(bio);
unsigned remaining_size = wc->block_size;
do {
struct bio_vec bv = bio_iter_iovec(bio, bio->bi_iter);
- buf = bvec_kmap_irq(&bv, &flags);
+ buf = bvec_kmap_local(&bv);
size = bv.bv_len;
if (unlikely(size > remaining_size))
size = remaining_size;
memcpy_flushcache_optimized(data, buf, size);
}
- bvec_kunmap_irq(buf, &flags);
+ kunmap_local(buf);
data = (char *)data + size;
remaining_size -= size;
spin_lock(&_minor_lock);
md->disk->private_data = NULL;
spin_unlock(&_minor_lock);
- del_gendisk(md->disk);
- }
-
- if (md->queue)
+ if (dm_get_md_type(md) != DM_TYPE_NONE) {
+ dm_sysfs_exit(md);
+ del_gendisk(md->disk);
+ }
dm_queue_destroy_keyslot_manager(md->queue);
-
- if (md->disk)
blk_cleanup_disk(md->disk);
+ }
cleanup_srcu_struct(&md->io_barrier);
goto bad;
}
- add_disk_no_queue_reg(md->disk);
format_dev_t(md->name, MKDEV(_major, minor));
md->wq = alloc_workqueue("kdmflush", WQ_MEM_RECLAIM, 0);
*/
int dm_create(int minor, struct mapped_device **result)
{
- int r;
struct mapped_device *md;
md = alloc_dev(minor);
if (!md)
return -ENXIO;
- r = dm_sysfs_init(md);
- if (r) {
- free_dev(md);
- return r;
- }
-
*result = md;
return 0;
}
*/
int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
{
- int r;
+ enum dm_queue_mode type = dm_table_get_type(t);
struct queue_limits limits;
- enum dm_queue_mode type = dm_get_md_type(md);
+ int r;
switch (type) {
case DM_TYPE_REQUEST_BASED:
if (r)
return r;
- blk_register_queue(md->disk);
+ add_disk(md->disk);
+ r = dm_sysfs_init(md);
+ if (r) {
+ del_gendisk(md->disk);
+ return r;
+ }
+ md->type = type;
return 0;
}
DMWARN("%s: Forcibly removing mapped_device still in use! (%d users)",
dm_device_name(md), atomic_read(&md->holders));
- dm_sysfs_exit(md);
dm_table_destroy(__unbind(md));
free_dev(md);
}
static inline bool is_mddev_broken(struct md_rdev *rdev, const char *md_type)
{
- int flags = rdev->bdev->bd_disk->flags;
-
- if (!(flags & GENHD_FL_UP)) {
+ if (!disk_live(rdev->bdev->bd_disk)) {
if (!test_and_set_bit(MD_BROKEN, &rdev->mddev->flags))
pr_warn("md: %s: %s array has a missing/failed member\n",
mdname(rdev->mddev), md_type);
struct raid1_plug_cb *plug = NULL;
int first_clone;
int max_sectors;
+ bool write_behind = false;
if (mddev_is_clustered(mddev) &&
md_cluster_ops->area_resyncing(mddev, WRITE,
max_sectors = r1_bio->sectors;
for (i = 0; i < disks; i++) {
struct md_rdev *rdev = rcu_dereference(conf->mirrors[i].rdev);
+
+ /*
+ * The write-behind io is only attempted on drives marked as
+ * write-mostly, which means we could allocate write behind
+ * bio later.
+ */
+ if (rdev && test_bit(WriteMostly, &rdev->flags))
+ write_behind = true;
+
if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
atomic_inc(&rdev->nr_pending);
blocked_rdev = rdev;
goto retry_write;
}
+ /*
+ * When using a bitmap, we may call alloc_behind_master_bio below.
+ * alloc_behind_master_bio allocates a copy of the data payload a page
+ * at a time and thus needs a new bio that can fit the whole payload
+ * this bio in page sized chunks.
+ */
+ if (write_behind && bitmap)
+ max_sectors = min_t(int, max_sectors,
+ BIO_MAX_VECS * (PAGE_SIZE >> 9));
if (max_sectors < bio_sectors(bio)) {
struct bio *split = bio_split(bio, max_sectors,
GFP_NOIO, &conf->bio_split);
} else
r10_bio->master_bio = (struct bio *)first_r10bio;
+ /*
+ * first select target devices under rcu_lock and
+ * inc refcount on their rdev. Record them by setting
+ * bios[x] to bio
+ */
rcu_read_lock();
for (disk = 0; disk < geo->raid_disks; disk++) {
struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
for (disk = 0; disk < geo->raid_disks; disk++) {
sector_t dev_start, dev_end;
struct bio *mbio, *rbio = NULL;
- struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
- struct md_rdev *rrdev = rcu_dereference(
- conf->mirrors[disk].replacement);
/*
* Now start to calculate the start and end address for each disk.
/*
* It only handles discard bio which size is >= stripe size, so
- * dev_end > dev_start all the time
+ * dev_end > dev_start all the time.
+ * It doesn't need to use rcu lock to get rdev here. We already
+ * add rdev->nr_pending in the first loop.
*/
if (r10_bio->devs[disk].bio) {
+ struct md_rdev *rdev = conf->mirrors[disk].rdev;
mbio = bio_clone_fast(bio, GFP_NOIO, &mddev->bio_set);
mbio->bi_end_io = raid10_end_discard_request;
mbio->bi_private = r10_bio;
bio_endio(mbio);
}
if (r10_bio->devs[disk].repl_bio) {
+ struct md_rdev *rrdev = conf->mirrors[disk].replacement;
rbio = bio_clone_fast(bio, GFP_NOIO, &mddev->bio_set);
rbio->bi_end_io = raid10_end_discard_request;
rbio->bi_private = r10_bio;
conf->scribble_sectors >= new_sectors)
return 0;
mddev_suspend(conf->mddev);
- get_online_cpus();
+ cpus_read_lock();
for_each_present_cpu(cpu) {
struct raid5_percpu *percpu;
break;
}
- put_online_cpus();
+ cpus_read_unlock();
mddev_resume(conf->mddev);
if (!err) {
conf->scribble_disks = new_disks;
err_free_swnodes:
software_node_unregister_nodes(sensor->swnodes);
err_put_adev:
- acpi_dev_put(sensor->adev);
+ acpi_dev_put(adev);
return ret;
}
for (n = 0; n < NUM_PRCMU_WAKEUPS; n++) {
if (ev & prcmu_irq_bit[n])
- generic_handle_irq(irq_find_mapping(db8500_irq_domain, n));
+ generic_handle_domain_irq(db8500_irq_domain, n);
}
r = true;
break;
regmap_read(tsadc->regs, MX25_TSC_TGSR, &status);
if (status & MX25_TGSR_GCQ_INT)
- generic_handle_irq(irq_find_mapping(tsadc->domain, 1));
+ generic_handle_domain_irq(tsadc->domain, 1);
if (status & MX25_TGSR_TCQ_INT)
- generic_handle_irq(irq_find_mapping(tsadc->domain, 0));
+ generic_handle_domain_irq(tsadc->domain, 0);
chained_irq_exit(chip, desc);
}
struct ioc3_priv_data *ipd = domain->host_data;
struct ioc3 __iomem *regs = ipd->regs;
u32 pending, mask;
- unsigned int irq;
pending = readl(®s->sio_ir);
mask = readl(®s->sio_ies);
pending &= mask; /* Mask off not enabled interrupts */
- if (pending) {
- irq = irq_find_mapping(domain, __ffs(pending));
- if (irq)
- generic_handle_irq(irq);
- } else {
+ if (pending)
+ generic_handle_domain_irq(domain, __ffs(pending));
+ else
spurious_interrupt();
- }
}
/*
static int pm8xxx_irq_block_handler(struct pm_irq_chip *chip, int block)
{
- int pmirq, irq, i, ret = 0;
+ int pmirq, i, ret = 0;
unsigned int bits;
ret = pm8xxx_read_block_irq(chip, block, &bits);
for (i = 0; i < 8; i++) {
if (bits & (1 << i)) {
pmirq = block * 8 + i;
- irq = irq_find_mapping(chip->irqdomain, pmirq);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(chip->irqdomain, pmirq);
}
}
return 0;
static void pm8821_irq_block_handler(struct pm_irq_chip *chip,
int master, int block)
{
- int pmirq, irq, i, ret;
+ int pmirq, i, ret;
unsigned int bits;
ret = regmap_read(chip->regmap,
for (i = 0; i < 8; i++) {
if (bits & BIT(i)) {
pmirq = block * 8 + i;
- irq = irq_find_mapping(chip->irqdomain, pmirq);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(chip->irqdomain, pmirq);
}
}
}
* track of the current selected device partition.
*/
unsigned int part_curr;
- struct device_attribute force_ro;
- struct device_attribute power_ro_lock;
int area_type;
/* debugfs files (only in main mmc_blk_data) */
return count;
}
+static DEVICE_ATTR(ro_lock_until_next_power_on, 0,
+ power_ro_lock_show, power_ro_lock_store);
+
static ssize_t force_ro_show(struct device *dev, struct device_attribute *attr,
char *buf)
{
return ret;
}
+static DEVICE_ATTR(force_ro, 0644, force_ro_show, force_ro_store);
+
+static struct attribute *mmc_disk_attrs[] = {
+ &dev_attr_force_ro.attr,
+ &dev_attr_ro_lock_until_next_power_on.attr,
+ NULL,
+};
+
+static umode_t mmc_disk_attrs_is_visible(struct kobject *kobj,
+ struct attribute *a, int n)
+{
+ struct device *dev = container_of(kobj, struct device, kobj);
+ struct mmc_blk_data *md = mmc_blk_get(dev_to_disk(dev));
+ umode_t mode = a->mode;
+
+ if (a == &dev_attr_ro_lock_until_next_power_on.attr &&
+ (md->area_type & MMC_BLK_DATA_AREA_BOOT) &&
+ md->queue.card->ext_csd.boot_ro_lockable) {
+ mode = S_IRUGO;
+ if (!(md->queue.card->ext_csd.boot_ro_lock &
+ EXT_CSD_BOOT_WP_B_PWR_WP_DIS))
+ mode |= S_IWUSR;
+ }
+
+ mmc_blk_put(md);
+ return mode;
+}
+
+static const struct attribute_group mmc_disk_attr_group = {
+ .is_visible = mmc_disk_attrs_is_visible,
+ .attrs = mmc_disk_attrs,
+};
+
+static const struct attribute_group *mmc_disk_attr_groups[] = {
+ &mmc_disk_attr_group,
+ NULL,
+};
+
static int mmc_blk_open(struct block_device *bdev, fmode_t mode)
{
struct mmc_blk_data *md = mmc_blk_get(bdev->bd_disk);
}
#endif
+static int mmc_blk_alternative_gpt_sector(struct gendisk *disk,
+ sector_t *sector)
+{
+ struct mmc_blk_data *md;
+ int ret;
+
+ md = mmc_blk_get(disk);
+ if (!md)
+ return -EINVAL;
+
+ if (md->queue.card)
+ ret = mmc_card_alternative_gpt_sector(md->queue.card, sector);
+ else
+ ret = -ENODEV;
+
+ mmc_blk_put(md);
+
+ return ret;
+}
+
static const struct block_device_operations mmc_bdops = {
.open = mmc_blk_open,
.release = mmc_blk_release,
#ifdef CONFIG_COMPAT
.compat_ioctl = mmc_blk_compat_ioctl,
#endif
+ .alternative_gpt_sector = mmc_blk_alternative_gpt_sector,
};
static int mmc_blk_part_switch_pre(struct mmc_card *card,
sector_t size,
bool default_ro,
const char *subname,
- int area_type)
+ int area_type,
+ unsigned int part_type)
{
struct mmc_blk_data *md;
int devidx, ret;
kref_init(&md->kref);
md->queue.blkdata = md;
+ md->part_type = part_type;
md->disk->major = MMC_BLOCK_MAJOR;
md->disk->minors = perdev_minors;
md->disk->disk_name, mmc_card_id(card), mmc_card_name(card),
cap_str, md->read_only ? "(ro)" : "");
+ /* used in ->open, must be set before add_disk: */
+ if (area_type == MMC_BLK_DATA_AREA_MAIN)
+ dev_set_drvdata(&card->dev, md);
+ device_add_disk(md->parent, md->disk, mmc_disk_attr_groups);
return md;
err_kfree:
}
return mmc_blk_alloc_req(card, &card->dev, size, false, NULL,
- MMC_BLK_DATA_AREA_MAIN);
+ MMC_BLK_DATA_AREA_MAIN, 0);
}
static int mmc_blk_alloc_part(struct mmc_card *card,
struct mmc_blk_data *part_md;
part_md = mmc_blk_alloc_req(card, disk_to_dev(md->disk), size, default_ro,
- subname, area_type);
+ subname, area_type, part_type);
if (IS_ERR(part_md))
return PTR_ERR(part_md);
- part_md->part_type = part_type;
list_add(&part_md->part, &md->part);
return 0;
static void mmc_blk_remove_req(struct mmc_blk_data *md)
{
- struct mmc_card *card;
-
- if (md) {
- /*
- * Flush remaining requests and free queues. It
- * is freeing the queue that stops new requests
- * from being accepted.
- */
- card = md->queue.card;
- if (md->disk->flags & GENHD_FL_UP) {
- device_remove_file(disk_to_dev(md->disk), &md->force_ro);
- if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) &&
- card->ext_csd.boot_ro_lockable)
- device_remove_file(disk_to_dev(md->disk),
- &md->power_ro_lock);
-
- del_gendisk(md->disk);
- }
- mmc_cleanup_queue(&md->queue);
- mmc_blk_put(md);
- }
+ /*
+ * Flush remaining requests and free queues. It is freeing the queue
+ * that stops new requests from being accepted.
+ */
+ del_gendisk(md->disk);
+ mmc_cleanup_queue(&md->queue);
+ mmc_blk_put(md);
}
static void mmc_blk_remove_parts(struct mmc_card *card,
}
}
-static int mmc_add_disk(struct mmc_blk_data *md)
-{
- int ret;
- struct mmc_card *card = md->queue.card;
-
- device_add_disk(md->parent, md->disk, NULL);
- md->force_ro.show = force_ro_show;
- md->force_ro.store = force_ro_store;
- sysfs_attr_init(&md->force_ro.attr);
- md->force_ro.attr.name = "force_ro";
- md->force_ro.attr.mode = S_IRUGO | S_IWUSR;
- ret = device_create_file(disk_to_dev(md->disk), &md->force_ro);
- if (ret)
- goto force_ro_fail;
-
- if ((md->area_type & MMC_BLK_DATA_AREA_BOOT) &&
- card->ext_csd.boot_ro_lockable) {
- umode_t mode;
-
- if (card->ext_csd.boot_ro_lock & EXT_CSD_BOOT_WP_B_PWR_WP_DIS)
- mode = S_IRUGO;
- else
- mode = S_IRUGO | S_IWUSR;
-
- md->power_ro_lock.show = power_ro_lock_show;
- md->power_ro_lock.store = power_ro_lock_store;
- sysfs_attr_init(&md->power_ro_lock.attr);
- md->power_ro_lock.attr.mode = mode;
- md->power_ro_lock.attr.name =
- "ro_lock_until_next_power_on";
- ret = device_create_file(disk_to_dev(md->disk),
- &md->power_ro_lock);
- if (ret)
- goto power_ro_lock_fail;
- }
- return ret;
-
-power_ro_lock_fail:
- device_remove_file(disk_to_dev(md->disk), &md->force_ro);
-force_ro_fail:
- del_gendisk(md->disk);
-
- return ret;
-}
-
#ifdef CONFIG_DEBUG_FS
static int mmc_dbg_card_status_get(void *data, u64 *val)
static int mmc_blk_probe(struct mmc_card *card)
{
- struct mmc_blk_data *md, *part_md;
+ struct mmc_blk_data *md;
int ret = 0;
/*
if (ret)
goto out;
- dev_set_drvdata(&card->dev, md);
-
- ret = mmc_add_disk(md);
- if (ret)
- goto out;
-
- list_for_each_entry(part_md, &md->part, part) {
- ret = mmc_add_disk(part_md);
- if (ret)
- goto out;
- }
-
/* Add two debugfs entries */
mmc_blk_add_debugfs(card, md);
}
EXPORT_SYMBOL(mmc_detect_card_removed);
+int mmc_card_alternative_gpt_sector(struct mmc_card *card, sector_t *gpt_sector)
+{
+ unsigned int boot_sectors_num;
+
+ if ((!(card->host->caps2 & MMC_CAP2_ALT_GPT_TEGRA)))
+ return -EOPNOTSUPP;
+
+ /* filter out unrelated cards */
+ if (card->ext_csd.rev < 3 ||
+ !mmc_card_mmc(card) ||
+ !mmc_card_is_blockaddr(card) ||
+ mmc_card_is_removable(card->host))
+ return -ENOENT;
+
+ /*
+ * eMMC storage has two special boot partitions in addition to the
+ * main one. NVIDIA's bootloader linearizes eMMC boot0->boot1->main
+ * accesses, this means that the partition table addresses are shifted
+ * by the size of boot partitions. In accordance with the eMMC
+ * specification, the boot partition size is calculated as follows:
+ *
+ * boot partition size = 128K byte x BOOT_SIZE_MULT
+ *
+ * Calculate number of sectors occupied by the both boot partitions.
+ */
+ boot_sectors_num = card->ext_csd.raw_boot_mult * SZ_128K /
+ SZ_512 * MMC_NUM_BOOT_PARTITION;
+
+ /* Defined by NVIDIA and used by Android devices. */
+ *gpt_sector = card->ext_csd.sectors - boot_sectors_num - 1;
+
+ return 0;
+}
+EXPORT_SYMBOL(mmc_card_alternative_gpt_sector);
+
void mmc_rescan(struct work_struct *work)
{
struct mmc_host *host =
void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx);
void mmc_put_card(struct mmc_card *card, struct mmc_ctx *ctx);
+int mmc_card_alternative_gpt_sector(struct mmc_card *card, sector_t *sector);
+
/**
* mmc_claim_host - exclusively claim a host
* @host: mmc host to claim
ext_csd[EXT_CSD_ERASE_TIMEOUT_MULT];
card->ext_csd.raw_hc_erase_grp_size =
ext_csd[EXT_CSD_HC_ERASE_GRP_SIZE];
+ card->ext_csd.raw_boot_mult =
+ ext_csd[EXT_CSD_BOOT_MULT];
if (card->ext_csd.rev >= 3) {
u8 sa_shift = ext_csd[EXT_CSD_S_A_TIMEOUT];
card->ext_csd.part_config = ext_csd[EXT_CSD_PART_CONFIG];
};
static const struct sdhci_pltfm_data sdhci_bcm2711_pltfm_data = {
- .quirks = SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 |
- SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
+ .quirks = SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12,
.ops = &sdhci_iproc_bcm2711_ops,
};
*/
#define NVQUIRK_HAS_TMCLK BIT(10)
+#define NVQUIRK_HAS_ANDROID_GPT_SECTOR BIT(11)
+
/* SDMMC CQE Base Address for Tegra Host Ver 4.1 and Higher */
#define SDHCI_TEGRA_CQE_BASE_ADDR 0xF000
.pdata = &sdhci_tegra20_pdata,
.dma_mask = DMA_BIT_MASK(32),
.nvquirks = NVQUIRK_FORCE_SDHCI_SPEC_200 |
+ NVQUIRK_HAS_ANDROID_GPT_SECTOR |
NVQUIRK_ENABLE_BLOCK_GAP_DET,
};
.nvquirks = NVQUIRK_ENABLE_SDHCI_SPEC_300 |
NVQUIRK_ENABLE_SDR50 |
NVQUIRK_ENABLE_SDR104 |
+ NVQUIRK_HAS_ANDROID_GPT_SECTOR |
NVQUIRK_HAS_PADCALIB,
};
static const struct sdhci_tegra_soc_data soc_data_tegra114 = {
.pdata = &sdhci_tegra114_pdata,
.dma_mask = DMA_BIT_MASK(32),
+ .nvquirks = NVQUIRK_HAS_ANDROID_GPT_SECTOR,
};
static const struct sdhci_pltfm_data sdhci_tegra124_pdata = {
static const struct sdhci_tegra_soc_data soc_data_tegra124 = {
.pdata = &sdhci_tegra124_pdata,
.dma_mask = DMA_BIT_MASK(34),
+ .nvquirks = NVQUIRK_HAS_ANDROID_GPT_SECTOR,
};
static const struct sdhci_ops tegra210_sdhci_ops = {
tegra_host->pad_control_available = false;
tegra_host->soc_data = soc_data;
+ if (soc_data->nvquirks & NVQUIRK_HAS_ANDROID_GPT_SECTOR)
+ host->mmc->caps2 |= MMC_CAP2_ALT_GPT_TEGRA;
+
if (soc_data->nvquirks & NVQUIRK_NEEDS_PAD_CONTROL) {
rc = tegra_sdhci_init_pinctrl_info(&pdev->dev, tegra_host);
if (rc == 0)
if (id == ESD_EV_CAN_ERROR_EXT) {
u8 state = msg->msg.rx.data[0];
u8 ecc = msg->msg.rx.data[1];
- u8 txerr = msg->msg.rx.data[2];
- u8 rxerr = msg->msg.rx.data[3];
+ u8 rxerr = msg->msg.rx.data[2];
+ u8 txerr = msg->msg.rx.data[3];
skb = alloc_can_err_skb(priv->netdev, &cf);
if (skb == NULL) {
u16 data;
u8 gates;
- cur++;
- next++;
-
if (i == schedule->num_entries)
gates = initial->gate_mask ^
cur->gate_mask;
(initial->gate_mask <<
TR_GCLCMD_INIT_GATE_STATES_SHIFT);
hellcreek_write(hellcreek, data, TR_GCLCMD);
+
+ cur++;
+ next++;
}
}
/* Calculate difference to admin base time */
base_time_ns = ktime_to_ns(hellcreek_port->current_schedule->base_time);
- return base_time_ns - current_ns < (s64)8 * NSEC_PER_SEC;
+ return base_time_ns - current_ns < (s64)4 * NSEC_PER_SEC;
}
static void hellcreek_start_schedule(struct hellcreek *hellcreek, int port)
int err;
/* mv88e6393x family errata 4.6:
- * Cannot clear PwrDn bit on SERDES on port 0 if device is configured
- * CPU_MGD mode or P0_mode is configured for [x]MII.
- * Workaround: Set Port0 SERDES register 4.F002 bit 5=0 and bit 15=1.
+ * Cannot clear PwrDn bit on SERDES if device is configured CPU_MGD
+ * mode or P0_mode is configured for [x]MII.
+ * Workaround: Set SERDES register 4.F002 bit 5=0 and bit 15=1.
*
* It seems that after this workaround the SERDES is automatically
* powered up (the bit is cleared), so power it down.
*/
- if (lane == MV88E6393X_PORT0_LANE) {
- err = mv88e6390_serdes_read(chip, MV88E6393X_PORT0_LANE,
+ if (lane == MV88E6393X_PORT0_LANE || lane == MV88E6393X_PORT9_LANE ||
+ lane == MV88E6393X_PORT10_LANE) {
+ err = mv88e6390_serdes_read(chip, lane,
MDIO_MMD_PHYXS,
MV88E6393X_SERDES_POC, ®);
if (err)
ret = register_netdev(ndev);
if (ret) {
netdev_err(ndev, "Failed to register netdev\n");
- goto err;
+ goto err_mdio_remove;
}
return 0;
+err_mdio_remove:
+ xge_mdio_remove(ndev);
err:
free_netdev(ndev);
if (GEM_BFEXT(DMA_RXVALID, desc->addr)) {
desc_ptp = macb_ptp_desc(bp, desc);
+ /* Unlikely but check */
+ if (!desc_ptp) {
+ dev_warn_ratelimited(&bp->pdev->dev,
+ "Timestamp not supported in BD\n");
+ return;
+ }
gem_hw_timestamp(bp, desc_ptp->ts_1, desc_ptp->ts_2, &ts);
memset(shhwtstamps, 0, sizeof(struct skb_shared_hwtstamps));
shhwtstamps->hwtstamp = ktime_set(ts.tv_sec, ts.tv_nsec);
if (CIRC_SPACE(head, tail, PTP_TS_BUFFER_SIZE) == 0)
return -ENOMEM;
- skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
desc_ptp = macb_ptp_desc(queue->bp, desc);
+ /* Unlikely but check */
+ if (!desc_ptp)
+ return -EINVAL;
+ skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
tx_timestamp = &queue->tx_timestamps[head];
tx_timestamp->skb = skb;
/* ensure ts_1/ts_2 is loaded after ctrl (TX_USED check) */
ret = -ENOMEM;
goto bye;
}
+ bitmap_zero(adap->sge.blocked_fl, adap->sge.egr_sz);
#endif
params[0] = FW_PARAM_PFVF(CLIP_START);
setup_memwin(adapter);
err = adap_init0(adapter, 0);
-#ifdef CONFIG_DEBUG_FS
- bitmap_zero(adapter->sge.blocked_fl, adapter->sge.egr_sz);
-#endif
- setup_memwin_rdma(adapter);
if (err)
goto out_unmap_bar;
+ setup_memwin_rdma(adapter);
+
/* configure SGE_STAT_CFG_A to read WC stats */
if (!is_t4(adapter->params.chip))
t4_write_reg(adapter, SGE_STAT_CFG_A, STATSOURCE_T5_V(7) |
return 0;
}
-static int hns3_dbg_get_cmd_index(struct hnae3_handle *handle,
- const unsigned char *name, u32 *index)
+static int hns3_dbg_get_cmd_index(struct hns3_dbg_data *dbg_data, u32 *index)
{
u32 i;
for (i = 0; i < ARRAY_SIZE(hns3_dbg_cmd); i++) {
- if (!strncmp(name, hns3_dbg_cmd[i].name,
- strlen(hns3_dbg_cmd[i].name))) {
+ if (hns3_dbg_cmd[i].cmd == dbg_data->cmd) {
*index = i;
return 0;
}
}
- dev_err(&handle->pdev->dev, "unknown command(%s)\n", name);
+ dev_err(&dbg_data->handle->pdev->dev, "unknown command(%d)\n",
+ dbg_data->cmd);
return -EINVAL;
}
u32 index;
int ret;
- ret = hns3_dbg_get_cmd_index(handle, filp->f_path.dentry->d_iname,
- &index);
+ ret = hns3_dbg_get_cmd_index(dbg_data, &index);
if (ret)
return ret;
char name[HNS3_DBG_FILE_NAME_LEN];
data[i].handle = handle;
+ data[i].cmd = hns3_dbg_cmd[cmd].cmd;
data[i].qid = i;
sprintf(name, "%s%u", hns3_dbg_cmd[cmd].name, i);
debugfs_create_file(name, 0400, entry_dir, &data[i],
return -ENOMEM;
data->handle = handle;
+ data->cmd = hns3_dbg_cmd[cmd].cmd;
entry_dir = hns3_dbg_dentry[hns3_dbg_cmd[cmd].dentry].dentry;
debugfs_create_file(hns3_dbg_cmd[cmd].name, 0400, entry_dir,
data, &hns3_dbg_fops);
struct hns3_dbg_data {
struct hnae3_handle *handle;
+ enum hnae3_dbg_cmd cmd;
u16 qid;
};
void hclge_cmd_uninit(struct hclge_dev *hdev)
{
+ set_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state);
+ /* wait to ensure that the firmware completes the possible left
+ * over commands.
+ */
+ msleep(HCLGE_CMDQ_CLEAR_WAIT_TIME);
spin_lock_bh(&hdev->hw.cmq.csq.lock);
spin_lock(&hdev->hw.cmq.crq.lock);
- set_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state);
hclge_cmd_uninit_regs(&hdev->hw);
spin_unlock(&hdev->hw.cmq.crq.lock);
spin_unlock_bh(&hdev->hw.cmq.csq.lock);
#include "hnae3.h"
#define HCLGE_CMDQ_TX_TIMEOUT 30000
+#define HCLGE_CMDQ_CLEAR_WAIT_TIME 200
#define HCLGE_DESC_DATA_LEN 6
struct hclge_dev;
/* Led command */
HCLGE_OPC_LED_STATUS_CFG = 0xB000,
+ /* clear hardware resource command */
+ HCLGE_OPC_CLEAR_HW_RESOURCE = 0x700B,
+
/* NCL config command */
HCLGE_OPC_QUERY_NCL_CONFIG = 0x7011,
u64 requests[HNAE3_MAX_TC], indications[HNAE3_MAX_TC];
struct hclge_vport *vport = hclge_get_vport(h);
struct hclge_dev *hdev = vport->back;
- u8 i, j, pfc_map, *prio_tc;
int ret;
+ u8 i;
memset(pfc, 0, sizeof(*pfc));
pfc->pfc_cap = hdev->pfc_max;
- prio_tc = hdev->tm_info.prio_tc;
- pfc_map = hdev->tm_info.hw_pfc_map;
-
- /* Pfc setting is based on TC */
- for (i = 0; i < hdev->tm_info.num_tc; i++) {
- for (j = 0; j < HNAE3_MAX_USER_PRIO; j++) {
- if ((prio_tc[j] == i) && (pfc_map & BIT(i)))
- pfc->pfc_en |= BIT(j);
- }
- }
+ pfc->pfc_en = hdev->tm_info.pfc_en;
ret = hclge_pfc_tx_stats_get(hdev, requests);
if (ret)
hdev->tm_info.hw_pfc_map = 0;
hdev->wanted_umv_size = cfg.umv_space;
hdev->tx_spare_buf_size = cfg.tx_spare_buf_size;
+ hdev->gro_en = true;
if (cfg.vlan_fliter_cap == HCLGE_VLAN_FLTR_CAN_MDF)
set_bit(HNAE3_DEV_SUPPORT_VLAN_FLTR_MDF_B, ae_dev->caps);
return hclge_cmd_send(&hdev->hw, &desc, 1);
}
-static int hclge_config_gro(struct hclge_dev *hdev, bool en)
+static int hclge_config_gro(struct hclge_dev *hdev)
{
struct hclge_cfg_gro_status_cmd *req;
struct hclge_desc desc;
hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_GRO_GENERIC_CONFIG, false);
req = (struct hclge_cfg_gro_status_cmd *)desc.data;
- req->gro_en = en ? 1 : 0;
+ req->gro_en = hdev->gro_en ? 1 : 0;
ret = hclge_cmd_send(&hdev->hw, &desc, 1);
if (ret)
}
if (state != hdev->hw.mac.link) {
+ hdev->hw.mac.link = state;
client->ops->link_status_change(handle, state);
hclge_config_mac_tnl_int(hdev, state);
if (rclient && rclient->ops->link_status_change)
rclient->ops->link_status_change(rhandle, state);
- hdev->hw.mac.link = state;
hclge_push_link_status(hdev);
}
static void hclge_add_vport_vlan_table(struct hclge_vport *vport, u16 vlan_id,
bool writen_to_tbl)
{
- struct hclge_vport_vlan_cfg *vlan;
+ struct hclge_vport_vlan_cfg *vlan, *tmp;
+
+ list_for_each_entry_safe(vlan, tmp, &vport->vlan_list, node)
+ if (vlan->vlan_id == vlan_id)
+ return;
vlan = kzalloc(sizeof(*vlan), GFP_KERNEL);
if (!vlan)
}
}
+static int hclge_clear_hw_resource(struct hclge_dev *hdev)
+{
+ struct hclge_desc desc;
+ int ret;
+
+ hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_CLEAR_HW_RESOURCE, false);
+
+ ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+ /* This new command is only supported by new firmware, it will
+ * fail with older firmware. Error value -EOPNOSUPP can only be
+ * returned by older firmware running this command, to keep code
+ * backward compatible we will override this value and return
+ * success.
+ */
+ if (ret && ret != -EOPNOTSUPP) {
+ dev_err(&hdev->pdev->dev,
+ "failed to clear hw resource, ret = %d\n", ret);
+ return ret;
+ }
+ return 0;
+}
+
static void hclge_init_rxd_adv_layout(struct hclge_dev *hdev)
{
if (hnae3_ae_dev_rxd_adv_layout_supported(hdev->ae_dev))
if (ret)
goto err_cmd_uninit;
+ ret = hclge_clear_hw_resource(hdev);
+ if (ret)
+ goto err_cmd_uninit;
+
ret = hclge_get_cap(hdev);
if (ret)
goto err_cmd_uninit;
goto err_mdiobus_unreg;
}
- ret = hclge_config_gro(hdev, true);
+ ret = hclge_config_gro(hdev);
if (ret)
goto err_mdiobus_unreg;
return ret;
}
- ret = hclge_config_gro(hdev, true);
+ ret = hclge_config_gro(hdev);
if (ret)
return ret;
{
struct hclge_vport *vport = hclge_get_vport(handle);
struct hclge_dev *hdev = vport->back;
+ bool gro_en_old = hdev->gro_en;
+ int ret;
- return hclge_config_gro(hdev, enable);
+ hdev->gro_en = enable;
+ ret = hclge_config_gro(hdev);
+ if (ret)
+ hdev->gro_en = gro_en_old;
+
+ return ret;
}
static void hclge_sync_promisc_mode(struct hclge_dev *hdev)
unsigned long fd_bmap[BITS_TO_LONGS(MAX_FD_FILTER_NUM)];
enum HCLGE_FD_ACTIVE_RULE_TYPE fd_active_type;
u8 fd_en;
+ bool gro_en;
u16 wanted_umv_size;
/* max available unicast mac vlan space */
void hclgevf_cmd_uninit(struct hclgevf_dev *hdev)
{
+ set_bit(HCLGEVF_STATE_CMD_DISABLE, &hdev->state);
+ /* wait to ensure that the firmware completes the possible left
+ * over commands.
+ */
+ msleep(HCLGEVF_CMDQ_CLEAR_WAIT_TIME);
spin_lock_bh(&hdev->hw.cmq.csq.lock);
spin_lock(&hdev->hw.cmq.crq.lock);
- set_bit(HCLGEVF_STATE_CMD_DISABLE, &hdev->state);
hclgevf_cmd_uninit_regs(&hdev->hw);
spin_unlock(&hdev->hw.cmq.crq.lock);
spin_unlock_bh(&hdev->hw.cmq.csq.lock);
+
hclgevf_free_cmd_desc(&hdev->hw.cmq.csq);
hclgevf_free_cmd_desc(&hdev->hw.cmq.crq);
}
#include "hnae3.h"
#define HCLGEVF_CMDQ_TX_TIMEOUT 30000
+#define HCLGEVF_CMDQ_CLEAR_WAIT_TIME 200
#define HCLGEVF_CMDQ_RX_INVLD_B 0
#define HCLGEVF_CMDQ_RX_OUTVLD_B 1
link_state =
test_bit(HCLGEVF_STATE_DOWN, &hdev->state) ? 0 : link_state;
if (link_state != hdev->hw.mac.link) {
+ hdev->hw.mac.link = link_state;
client->ops->link_status_change(handle, !!link_state);
if (rclient && rclient->ops->link_status_change)
rclient->ops->link_status_change(rhandle, !!link_state);
- hdev->hw.mac.link = link_state;
}
clear_bit(HCLGEVF_STATE_LINK_UPDATING, &hdev->state);
{
int ret;
+ hdev->gro_en = true;
+
ret = hclgevf_get_basic_info(hdev);
if (ret)
return ret;
return 0;
}
-static int hclgevf_config_gro(struct hclgevf_dev *hdev, bool en)
+static int hclgevf_config_gro(struct hclgevf_dev *hdev)
{
struct hclgevf_cfg_gro_status_cmd *req;
struct hclgevf_desc desc;
false);
req = (struct hclgevf_cfg_gro_status_cmd *)desc.data;
- req->gro_en = en ? 1 : 0;
+ req->gro_en = hdev->gro_en ? 1 : 0;
ret = hclgevf_cmd_send(&hdev->hw, &desc, 1);
if (ret)
return ret;
}
- ret = hclgevf_config_gro(hdev, true);
+ ret = hclgevf_config_gro(hdev);
if (ret)
return ret;
if (ret)
goto err_config;
- ret = hclgevf_config_gro(hdev, true);
+ ret = hclgevf_config_gro(hdev);
if (ret)
goto err_config;
static int hclgevf_gro_en(struct hnae3_handle *handle, bool enable)
{
struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle);
+ bool gro_en_old = hdev->gro_en;
+ int ret;
- return hclgevf_config_gro(hdev, enable);
+ hdev->gro_en = enable;
+ ret = hclgevf_config_gro(hdev);
+ if (ret)
+ hdev->gro_en = gro_en_old;
+
+ return ret;
}
static void hclgevf_get_media_type(struct hnae3_handle *handle, u8 *media_type,
u16 *vector_status;
int *vector_irq;
+ bool gro_en;
+
unsigned long vlan_del_fail_bmap[BITS_TO_LONGS(VLAN_N_VID)];
struct hclgevf_mac_table_cfg mac_table;
flag = (u8)msg_q[5];
/* update upper layer with new link link status */
- hclgevf_update_link_status(hdev, link_status);
hclgevf_update_speed_duplex(hdev, speed, duplex);
+ hclgevf_update_link_status(hdev, link_status);
if (flag & HCLGE_MBX_PUSH_LINK_STATUS_EN)
set_bit(HCLGEVF_STATE_PF_PUSH_LINK_STATUS,
{
u32 reg = link << (E1000_LTRV_REQ_SHIFT + E1000_LTRV_NOSNOOP_SHIFT) |
link << E1000_LTRV_REQ_SHIFT | E1000_LTRV_SEND;
+ u16 max_ltr_enc_d = 0; /* maximum LTR decoded by platform */
+ u16 lat_enc_d = 0; /* latency decoded */
u16 lat_enc = 0; /* latency encoded */
if (link) {
E1000_PCI_LTR_CAP_LPT + 2, &max_nosnoop);
max_ltr_enc = max_t(u16, max_snoop, max_nosnoop);
- if (lat_enc > max_ltr_enc)
+ lat_enc_d = (lat_enc & E1000_LTRV_VALUE_MASK) *
+ (1U << (E1000_LTRV_SCALE_FACTOR *
+ ((lat_enc & E1000_LTRV_SCALE_MASK)
+ >> E1000_LTRV_SCALE_SHIFT)));
+
+ max_ltr_enc_d = (max_ltr_enc & E1000_LTRV_VALUE_MASK) *
+ (1U << (E1000_LTRV_SCALE_FACTOR *
+ ((max_ltr_enc & E1000_LTRV_SCALE_MASK)
+ >> E1000_LTRV_SCALE_SHIFT)));
+
+ if (lat_enc_d > max_ltr_enc_d)
lat_enc = max_ltr_enc;
}
return ret_val;
if (!(data & valid_csum_mask)) {
- data |= valid_csum_mask;
- ret_val = e1000_write_nvm(hw, word, 1, &data);
- if (ret_val)
- return ret_val;
- ret_val = e1000e_update_nvm_checksum(hw);
- if (ret_val)
- return ret_val;
+ e_dbg("NVM Checksum Invalid\n");
+
+ if (hw->mac.type < e1000_pch_cnp) {
+ data |= valid_csum_mask;
+ ret_val = e1000_write_nvm(hw, word, 1, &data);
+ if (ret_val)
+ return ret_val;
+ ret_val = e1000e_update_nvm_checksum(hw);
+ if (ret_val)
+ return ret_val;
+ }
}
return e1000e_validate_nvm_checksum_generic(hw);
/* Latency Tolerance Reporting */
#define E1000_LTRV 0x000F8
+#define E1000_LTRV_VALUE_MASK 0x000003FF
#define E1000_LTRV_SCALE_MAX 5
#define E1000_LTRV_SCALE_FACTOR 5
+#define E1000_LTRV_SCALE_SHIFT 10
+#define E1000_LTRV_SCALE_MASK 0x00001C00
#define E1000_LTRV_REQ_SHIFT 15
#define E1000_LTRV_NOSNOOP_SHIFT 16
#define E1000_LTRV_SEND (1 << 30)
status = ice_read_pba_string(hw, (u8 *)ctx->buf, sizeof(ctx->buf));
if (status)
- return -EIO;
+ /* We failed to locate the PBA, so just skip this entry */
+ dev_dbg(ice_pf_to_dev(pf), "Failed to read Product Board Assembly string, status %s\n",
+ ice_stat_str(status));
return 0;
}
struct igc_hw *hw = &adapter->hw;
u32 ctrl_ext;
+ if (!pci_device_is_present(adapter->pdev))
+ return;
+
/* Let firmware take over control of h/w */
ctrl_ext = rd32(IGC_CTRL_EXT);
wr32(IGC_CTRL_EXT,
igc_ptp_suspend(adapter);
- /* disable receives in the hardware */
- rctl = rd32(IGC_RCTL);
- wr32(IGC_RCTL, rctl & ~IGC_RCTL_EN);
- /* flush and sleep below */
-
+ if (pci_device_is_present(adapter->pdev)) {
+ /* disable receives in the hardware */
+ rctl = rd32(IGC_RCTL);
+ wr32(IGC_RCTL, rctl & ~IGC_RCTL_EN);
+ /* flush and sleep below */
+ }
/* set trans_start so we don't get spurious watchdogs during reset */
netif_trans_update(netdev);
netif_carrier_off(netdev);
netif_tx_stop_all_queues(netdev);
- /* disable transmits in the hardware */
- tctl = rd32(IGC_TCTL);
- tctl &= ~IGC_TCTL_EN;
- wr32(IGC_TCTL, tctl);
- /* flush both disables and wait for them to finish */
- wrfl();
- usleep_range(10000, 20000);
+ if (pci_device_is_present(adapter->pdev)) {
+ /* disable transmits in the hardware */
+ tctl = rd32(IGC_TCTL);
+ tctl &= ~IGC_TCTL_EN;
+ wr32(IGC_TCTL, tctl);
+ /* flush both disables and wait for them to finish */
+ wrfl();
+ usleep_range(10000, 20000);
- igc_irq_disable(adapter);
+ igc_irq_disable(adapter);
+ }
adapter->flags &= ~IGC_FLAG_NEED_LINK_UPDATE;
if (e->command != TC_TAPRIO_CMD_SET_GATES)
return false;
- for (i = 0; i < IGC_MAX_TX_QUEUES; i++) {
+ for (i = 0; i < adapter->num_tx_queues; i++) {
if (e->gate_mask & BIT(i))
queue_uses[i]++;
end_time += e->interval;
- for (i = 0; i < IGC_MAX_TX_QUEUES; i++) {
+ for (i = 0; i < adapter->num_tx_queues; i++) {
struct igc_ring *ring = adapter->tx_ring[i];
if (!(e->gate_mask & BIT(i)))
adapter->ptp_tx_skb = NULL;
clear_bit_unlock(__IGC_PTP_TX_IN_PROGRESS, &adapter->state);
- igc_ptp_time_save(adapter);
+ if (pci_device_is_present(adapter->pdev))
+ igc_ptp_time_save(adapter);
}
/**
#define MVNETA_VLAN_PRIO_TO_RXQ 0x2440
#define MVNETA_VLAN_PRIO_RXQ_MAP(prio, rxq) ((rxq) << ((prio) * 3))
#define MVNETA_PORT_STATUS 0x2444
-#define MVNETA_TX_IN_PRGRS BIT(1)
+#define MVNETA_TX_IN_PRGRS BIT(0)
#define MVNETA_TX_FIFO_EMPTY BIT(8)
#define MVNETA_RX_MIN_FRAME_SIZE 0x247c
/* Only exists on Armada XP and Armada 370 */
rc = cnt;
}
- if (rc > 0) {
+ /* For VFs, we should return with an error in case we didn't get the
+ * exact number of msix vectors as we requested.
+ * Not doing that will lead to a crash when starting queues for
+ * this VF.
+ */
+ if ((IS_PF(cdev) && rc > 0) || (IS_VF(cdev) && rc == cnt)) {
/* MSI-x configuration was achieved */
int_params->out.int_mode = QED_INT_MODE_MSIX;
int_params->out.num_vectors = rc;
}
edev->int_info.used_cnt = 0;
+ edev->int_info.msix_cnt = 0;
}
static int qede_req_msix_irqs(struct qede_dev *edev)
goto out;
err4:
qede_sync_free_irqs(edev);
- memset(&edev->int_info.msix_cnt, 0, sizeof(struct qed_int_info));
err3:
qede_napi_disable_remove(edev);
err2:
#include <linux/delay.h>
#include <linux/mfd/syscon.h>
#include <linux/regmap.h>
-#include <linux/pm_runtime.h>
#include "stmmac_platform.h"
return ret;
}
- pm_runtime_enable(dev);
- pm_runtime_get_sync(dev);
-
if (bsp_priv->integrated_phy)
rk_gmac_integrated_phy_powerup(bsp_priv);
static void rk_gmac_powerdown(struct rk_priv_data *gmac)
{
- struct device *dev = &gmac->pdev->dev;
-
if (gmac->integrated_phy)
rk_gmac_integrated_phy_powerdown(gmac);
- pm_runtime_put_sync(dev);
- pm_runtime_disable(dev);
-
phy_power_on(gmac, false);
gmac_clk_enable(gmac, false);
}
static inline unsigned int stmmac_rx_offset(struct stmmac_priv *priv)
{
if (stmmac_xdp_is_enabled(priv))
- return XDP_PACKET_HEADROOM + NET_IP_ALIGN;
+ return XDP_PACKET_HEADROOM;
- return NET_SKB_PAD + NET_IP_ALIGN;
+ return 0;
}
void stmmac_disable_rx_queue(struct stmmac_priv *priv, u32 queue);
prefetch(np);
+ /* Ensure a valid XSK buffer before proceed */
+ if (!buf->xdp)
+ break;
+
if (priv->extend_desc)
stmmac_rx_extended_status(priv, &priv->dev->stats,
&priv->xstats,
continue;
}
- /* Ensure a valid XSK buffer before proceed */
- if (!buf->xdp)
- break;
-
/* XSK pool expects RX frame 1:1 mapped to XSK buffer */
if (likely(status & rx_not_ls)) {
xsk_buff_free(buf->xdp);
return 0;
disable:
- mutex_lock(&priv->plat->est->lock);
- priv->plat->est->enable = false;
- stmmac_est_configure(priv, priv->ioaddr, priv->plat->est,
- priv->plat->clk_ptp_rate);
- mutex_unlock(&priv->plat->est->lock);
+ if (priv->plat->est) {
+ mutex_lock(&priv->plat->est->lock);
+ priv->plat->est->enable = false;
+ stmmac_est_configure(priv, priv->ioaddr, priv->plat->est,
+ priv->plat->clk_ptp_rate);
+ mutex_unlock(&priv->plat->est->lock);
+ }
priv->plat->fpe_cfg->enable = false;
stmmac_fpe_configure(priv, priv->ioaddr,
need_update = netif_running(priv->dev) && stmmac_xdp_is_enabled(priv);
if (need_update) {
- stmmac_disable_rx_queue(priv, queue);
- stmmac_disable_tx_queue(priv, queue);
napi_disable(&ch->rx_napi);
napi_disable(&ch->tx_napi);
+ stmmac_disable_rx_queue(priv, queue);
+ stmmac_disable_tx_queue(priv, queue);
}
set_bit(queue, priv->af_xdp_zc_qps);
if (need_update) {
- napi_enable(&ch->rxtx_napi);
stmmac_enable_rx_queue(priv, queue);
stmmac_enable_tx_queue(priv, queue);
+ napi_enable(&ch->rxtx_napi);
err = stmmac_xsk_wakeup(priv->dev, queue, XDP_WAKEUP_RX);
if (err)
need_update = netif_running(priv->dev) && stmmac_xdp_is_enabled(priv);
if (need_update) {
+ napi_disable(&ch->rxtx_napi);
stmmac_disable_rx_queue(priv, queue);
stmmac_disable_tx_queue(priv, queue);
synchronize_rcu();
- napi_disable(&ch->rxtx_napi);
}
xsk_pool_dma_unmap(pool, STMMAC_RX_DMA_ATTR);
clear_bit(queue, priv->af_xdp_zc_qps);
if (need_update) {
- napi_enable(&ch->rx_napi);
- napi_enable(&ch->tx_napi);
stmmac_enable_rx_queue(priv, queue);
stmmac_enable_tx_queue(priv, queue);
+ napi_enable(&ch->rx_napi);
+ napi_enable(&ch->tx_napi);
}
return 0;
u64_stats_init(&mhi_netdev->stats.tx_syncp);
/* Start MHI channels */
- err = mhi_prepare_for_transfer(mhi_dev, 0);
+ err = mhi_prepare_for_transfer(mhi_dev);
if (err)
goto out_err;
*/
.config_intr = genphy_no_config_intr,
.handle_interrupt = genphy_handle_interrupt_no_ack,
+ .suspend = genphy_suspend,
+ .resume = genphy_resume,
.read_page = mtk_gephy_read_page,
.write_page = mtk_gephy_write_page,
},
*/
.config_intr = genphy_no_config_intr,
.handle_interrupt = genphy_handle_interrupt_no_ack,
+ .suspend = genphy_suspend,
+ .resume = genphy_resume,
.read_page = mtk_gephy_read_page,
.write_page = mtk_gephy_write_page,
},
struct phy_device *phydev;
u16 phy_addr;
char phy_name[20];
+ bool embd_phy;
};
extern const struct driver_info ax88172a_info;
static int ax88772_hw_reset(struct usbnet *dev, int in_pm)
{
struct asix_data *data = (struct asix_data *)&dev->data;
- int ret, embd_phy;
+ struct asix_common_private *priv = dev->driver_priv;
u16 rx_ctl;
+ int ret;
ret = asix_write_gpio(dev, AX_GPIO_RSE | AX_GPIO_GPO_2 |
AX_GPIO_GPO2EN, 5, in_pm);
if (ret < 0)
goto out;
- embd_phy = ((dev->mii.phy_id & 0x1f) == 0x10 ? 1 : 0);
-
- ret = asix_write_cmd(dev, AX_CMD_SW_PHY_SELECT, embd_phy,
+ ret = asix_write_cmd(dev, AX_CMD_SW_PHY_SELECT, priv->embd_phy,
0, 0, NULL, in_pm);
if (ret < 0) {
netdev_dbg(dev->net, "Select PHY #1 failed: %d\n", ret);
goto out;
}
- if (embd_phy) {
+ if (priv->embd_phy) {
ret = asix_sw_reset(dev, AX_SWRESET_IPPD, in_pm);
if (ret < 0)
goto out;
static int ax88772a_hw_reset(struct usbnet *dev, int in_pm)
{
struct asix_data *data = (struct asix_data *)&dev->data;
- int ret, embd_phy;
+ struct asix_common_private *priv = dev->driver_priv;
u16 rx_ctl, phy14h, phy15h, phy16h;
u8 chipcode = 0;
+ int ret;
ret = asix_write_gpio(dev, AX_GPIO_RSE, 5, in_pm);
if (ret < 0)
goto out;
- embd_phy = ((dev->mii.phy_id & 0x1f) == 0x10 ? 1 : 0);
-
- ret = asix_write_cmd(dev, AX_CMD_SW_PHY_SELECT, embd_phy |
+ ret = asix_write_cmd(dev, AX_CMD_SW_PHY_SELECT, priv->embd_phy |
AX_PHYSEL_SSEN, 0, 0, NULL, in_pm);
if (ret < 0) {
netdev_dbg(dev->net, "Select PHY #1 failed: %d\n", ret);
struct asix_common_private *priv = dev->driver_priv;
int ret;
- ret = asix_read_phy_addr(dev, true);
- if (ret < 0)
- return ret;
-
- priv->phy_addr = ret;
-
snprintf(priv->phy_name, sizeof(priv->phy_name), PHY_ID_FMT,
priv->mdio->id, priv->phy_addr);
int ret, i;
u32 phyid;
+ priv = devm_kzalloc(&dev->udev->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ dev->driver_priv = priv;
+
usbnet_get_endpoints(dev, intf);
/* Maybe the boot loader passed the MAC address via device tree */
dev->net->needed_headroom = 4; /* cf asix_tx_fixup() */
dev->net->needed_tailroom = 4; /* cf asix_tx_fixup() */
+ ret = asix_read_phy_addr(dev, true);
+ if (ret < 0)
+ return ret;
+
+ priv->phy_addr = ret;
+ priv->embd_phy = ((priv->phy_addr & 0x1f) == 0x10);
+
asix_read_cmd(dev, AX_CMD_STATMNGSTS_REG, 0, 0, 1, &chipcode, 0);
chipcode &= AX_CHIPCODE_MASK;
dev->rx_urb_size = 2048;
}
- priv = devm_kzalloc(&dev->udev->dev, sizeof(*priv), GFP_KERNEL);
- if (!priv)
- return -ENOMEM;
-
- dev->driver_priv = priv;
-
priv->presvd_phy_bmcr = 0;
priv->presvd_phy_advertise = 0;
if (chipcode == AX_AX88772_CHIPCODE) {
asix_rx_fixup_common_free(dev->driver_priv);
}
+static void ax88178_unbind(struct usbnet *dev, struct usb_interface *intf)
+{
+ asix_rx_fixup_common_free(dev->driver_priv);
+ kfree(dev->driver_priv);
+}
+
static const struct ethtool_ops ax88178_ethtool_ops = {
.get_drvinfo = asix_get_drvinfo,
.get_link = asix_get_link,
static const struct driver_info ax88178_info = {
.description = "ASIX AX88178 USB 2.0 Ethernet",
.bind = ax88178_bind,
- .unbind = ax88772_unbind,
+ .unbind = ax88178_unbind,
.status = asix_status,
.link_reset = ax88178_link_reset,
.reset = ax88178_reset,
write_mii_word(pegasus, 0, 0x1b, &auxmode);
}
- return 0;
+ return ret;
fail:
netif_dbg(pegasus, drv, pegasus->net, "%s failed\n", __func__);
return ret;
if (!pegasus->rx_skb)
goto exit;
- res = set_registers(pegasus, EthID, 6, net->dev_addr);
+ set_registers(pegasus, EthID, 6, net->dev_addr);
usb_fill_bulk_urb(pegasus->rx_urb, pegasus->usb,
usb_rcvbulkpipe(pegasus->usb, 1),
int ret;
/* Start mhi device's channel(s) */
- ret = mhi_prepare_for_transfer(mhiwwan->mhi_dev, 0);
+ ret = mhi_prepare_for_transfer(mhiwwan->mhi_dev);
if (ret)
return ret;
in the system.
config NVME_FABRICS
+ select NVME_CORE
tristate
config NVME_RDMA
tristate "NVM Express over Fabrics RDMA host driver"
depends on INFINIBAND && INFINIBAND_ADDR_TRANS && BLOCK
- select NVME_CORE
select NVME_FABRICS
select SG_POOL
help
tristate "NVM Express over Fabrics FC host driver"
depends on BLOCK
depends on HAS_DMA
- select NVME_CORE
select NVME_FABRICS
select SG_POOL
help
tristate "NVM Express over Fabrics TCP host driver"
depends on INET
depends on BLOCK
- select NVME_CORE
select NVME_FABRICS
select CRYPTO
select CRYPTO_CRC32C
nvme-core-y := core.o ioctl.o
nvme-core-$(CONFIG_TRACING) += trace.o
nvme-core-$(CONFIG_NVME_MULTIPATH) += multipath.o
-nvme-core-$(CONFIG_NVM) += lightnvm.o
nvme-core-$(CONFIG_BLK_DEV_ZONED) += zns.o
nvme-core-$(CONFIG_FAULT_INJECTION_DEBUG_FS) += fault_inject.o
nvme-core-$(CONFIG_NVME_HWMON) += hwmon.o
{
struct nvme_ns *ns = container_of(kref, struct nvme_ns, kref);
- if (ns->ndev)
- nvme_nvm_unregister(ns);
-
put_disk(ns->disk);
nvme_put_ns_head(ns->head);
nvme_put_ctrl(ns->ctrl);
{
if (req->rq_flags & RQF_SPECIAL_PAYLOAD) {
struct nvme_ctrl *ctrl = nvme_req(req)->ctrl;
- struct page *page = req->special_vec.bv_page;
- if (page == ctrl->discard_page)
+ if (req->special_vec.bv_page == ctrl->discard_page)
clear_bit_unlock(0, &ctrl->discard_page_busy);
else
- kfree(page_address(page) + req->special_vec.bv_offset);
+ kfree(bvec_virt(&req->special_vec));
}
}
EXPORT_SYMBOL_GPL(nvme_cleanup_cmd);
return BLK_STS_IOERR;
}
- cmd->common.command_id = req->tag;
+ nvme_req(req)->genctr++;
+ cmd->common.command_id = nvme_cid(req);
trace_nvme_setup_cmd(req, cmd);
return ret;
}
static inline bool nvme_first_scan(struct gendisk *disk)
{
/* nvme_alloc_ns() scans the disk prior to adding it */
- return !(disk->flags & GENHD_FL_UP);
+ return !disk_live(disk);
}
static void nvme_set_chunk_sectors(struct nvme_ns *ns, struct nvme_id_ns *id)
nvme_update_disk_info(ns->head->disk, ns, id);
blk_stack_limits(&ns->head->disk->queue->limits,
&ns->queue->limits, 0);
- blk_queue_update_readahead(ns->head->disk->queue);
+ disk_update_readahead(ns->head->disk);
blk_mq_unfreeze_queue(ns->head->disk->queue);
}
return 0;
const struct attribute_group *nvme_ns_id_attr_groups[] = {
&nvme_ns_id_attr_group,
-#ifdef CONFIG_NVM
- &nvme_nvm_attr_group,
-#endif
NULL,
};
if (!ns)
goto out_free_id;
- ns->queue = blk_mq_init_queue(ctrl->tagset);
- if (IS_ERR(ns->queue))
+ disk = blk_mq_alloc_disk(ctrl->tagset, ns);
+ if (IS_ERR(disk))
goto out_free_ns;
+ disk->fops = &nvme_bdev_ops;
+ disk->private_data = ns;
+
+ ns->disk = disk;
+ ns->queue = disk->queue;
if (ctrl->opts && ctrl->opts->data_digest)
blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, ns->queue);
if (ctrl->ops->flags & NVME_F_PCI_P2PDMA)
blk_queue_flag_set(QUEUE_FLAG_PCI_P2PDMA, ns->queue);
- ns->queue->queuedata = ns;
ns->ctrl = ctrl;
kref_init(&ns->kref);
if (nvme_init_ns_head(ns, nsid, ids, id->nmic & NVME_NS_NMIC_SHARED))
- goto out_free_queue;
+ goto out_cleanup_disk;
- disk = alloc_disk_node(0, node);
- if (!disk)
- goto out_unlink_ns;
-
- disk->fops = &nvme_bdev_ops;
- disk->private_data = ns;
- disk->queue = ns->queue;
/*
* Without the multipath code enabled, multiple controller per
* subsystems are visible as devices and thus we cannot use the
if (!nvme_mpath_set_disk_name(ns, disk->disk_name, &disk->flags))
sprintf(disk->disk_name, "nvme%dn%d", ctrl->instance,
ns->head->instance);
- ns->disk = disk;
if (nvme_update_ns_info(ns, id))
- goto out_put_disk;
-
- if ((ctrl->quirks & NVME_QUIRK_LIGHTNVM) && id->vs[0] == 0x1) {
- if (nvme_nvm_register(ns, disk->disk_name, node)) {
- dev_warn(ctrl->device, "LightNVM init failure\n");
- goto out_put_disk;
- }
- }
+ goto out_unlink_ns;
down_write(&ctrl->namespaces_rwsem);
list_add_tail(&ns->list, &ctrl->namespaces);
kfree(id);
return;
- out_put_disk:
- /* prevent double queue cleanup */
- ns->disk->queue = NULL;
- put_disk(ns->disk);
+
out_unlink_ns:
mutex_lock(&ctrl->subsys->lock);
list_del_rcu(&ns->siblings);
list_del_init(&ns->head->entry);
mutex_unlock(&ctrl->subsys->lock);
nvme_put_ns_head(ns->head);
- out_free_queue:
- blk_cleanup_queue(ns->queue);
+ out_cleanup_disk:
+ blk_cleanup_disk(disk);
out_free_ns:
kfree(ns);
out_free_id:
nvme_mpath_clear_current_path(ns);
synchronize_srcu(&ns->head->srcu); /* wait for concurrent submissions */
- if (ns->disk->flags & GENHD_FL_UP) {
- if (!nvme_ns_head_multipath(ns->head))
- nvme_cdev_del(&ns->cdev, &ns->cdev_device);
- del_gendisk(ns->disk);
- blk_cleanup_queue(ns->queue);
- if (blk_get_integrity(ns->disk))
- blk_integrity_unregister(ns->disk);
- }
+ if (!nvme_ns_head_multipath(ns->head))
+ nvme_cdev_del(&ns->cdev, &ns->cdev_device);
+ del_gendisk(ns->disk);
+ blk_cleanup_queue(ns->queue);
+ if (blk_get_integrity(ns->disk))
+ blk_integrity_unregister(ns->disk);
down_write(&ns->ctrl->namespaces_rwsem);
list_del_init(&ns->list);
ret = -EINVAL;
goto out;
}
- nvmf_host_put(opts->host);
opts->host = nvmf_host_add(p);
kfree(p);
if (!opts->host) {
case NVME_IOCTL_IO64_CMD:
return nvme_user_cmd64(ns->ctrl, ns, argp);
default:
- if (!ns->ndev)
- return -ENOTTY;
- return nvme_nvm_ioctl(ns, cmd, argp);
+ return -ENOTTY;
}
}
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-/*
- * nvme-lightnvm.c - LightNVM NVMe device
- *
- * Copyright (C) 2014-2015 IT University of Copenhagen
- * Initial release: Matias Bjorling <mb@lightnvm.io>
- */
-
-#include "nvme.h"
-
-#include <linux/nvme.h>
-#include <linux/bitops.h>
-#include <linux/lightnvm.h>
-#include <linux/vmalloc.h>
-#include <linux/sched/sysctl.h>
-#include <uapi/linux/lightnvm.h>
-
-enum nvme_nvm_admin_opcode {
- nvme_nvm_admin_identity = 0xe2,
- nvme_nvm_admin_get_bb_tbl = 0xf2,
- nvme_nvm_admin_set_bb_tbl = 0xf1,
-};
-
-enum nvme_nvm_log_page {
- NVME_NVM_LOG_REPORT_CHUNK = 0xca,
-};
-
-struct nvme_nvm_ph_rw {
- __u8 opcode;
- __u8 flags;
- __u16 command_id;
- __le32 nsid;
- __u64 rsvd2;
- __le64 metadata;
- __le64 prp1;
- __le64 prp2;
- __le64 spba;
- __le16 length;
- __le16 control;
- __le32 dsmgmt;
- __le64 resv;
-};
-
-struct nvme_nvm_erase_blk {
- __u8 opcode;
- __u8 flags;
- __u16 command_id;
- __le32 nsid;
- __u64 rsvd[2];
- __le64 prp1;
- __le64 prp2;
- __le64 spba;
- __le16 length;
- __le16 control;
- __le32 dsmgmt;
- __le64 resv;
-};
-
-struct nvme_nvm_identity {
- __u8 opcode;
- __u8 flags;
- __u16 command_id;
- __le32 nsid;
- __u64 rsvd[2];
- __le64 prp1;
- __le64 prp2;
- __u32 rsvd11[6];
-};
-
-struct nvme_nvm_getbbtbl {
- __u8 opcode;
- __u8 flags;
- __u16 command_id;
- __le32 nsid;
- __u64 rsvd[2];
- __le64 prp1;
- __le64 prp2;
- __le64 spba;
- __u32 rsvd4[4];
-};
-
-struct nvme_nvm_setbbtbl {
- __u8 opcode;
- __u8 flags;
- __u16 command_id;
- __le32 nsid;
- __le64 rsvd[2];
- __le64 prp1;
- __le64 prp2;
- __le64 spba;
- __le16 nlb;
- __u8 value;
- __u8 rsvd3;
- __u32 rsvd4[3];
-};
-
-struct nvme_nvm_command {
- union {
- struct nvme_common_command common;
- struct nvme_nvm_ph_rw ph_rw;
- struct nvme_nvm_erase_blk erase;
- struct nvme_nvm_identity identity;
- struct nvme_nvm_getbbtbl get_bb;
- struct nvme_nvm_setbbtbl set_bb;
- };
-};
-
-struct nvme_nvm_id12_grp {
- __u8 mtype;
- __u8 fmtype;
- __le16 res16;
- __u8 num_ch;
- __u8 num_lun;
- __u8 num_pln;
- __u8 rsvd1;
- __le16 num_chk;
- __le16 num_pg;
- __le16 fpg_sz;
- __le16 csecs;
- __le16 sos;
- __le16 rsvd2;
- __le32 trdt;
- __le32 trdm;
- __le32 tprt;
- __le32 tprm;
- __le32 tbet;
- __le32 tbem;
- __le32 mpos;
- __le32 mccap;
- __le16 cpar;
- __u8 reserved[906];
-} __packed;
-
-struct nvme_nvm_id12_addrf {
- __u8 ch_offset;
- __u8 ch_len;
- __u8 lun_offset;
- __u8 lun_len;
- __u8 pln_offset;
- __u8 pln_len;
- __u8 blk_offset;
- __u8 blk_len;
- __u8 pg_offset;
- __u8 pg_len;
- __u8 sec_offset;
- __u8 sec_len;
- __u8 res[4];
-} __packed;
-
-struct nvme_nvm_id12 {
- __u8 ver_id;
- __u8 vmnt;
- __u8 cgrps;
- __u8 res;
- __le32 cap;
- __le32 dom;
- struct nvme_nvm_id12_addrf ppaf;
- __u8 resv[228];
- struct nvme_nvm_id12_grp grp;
- __u8 resv2[2880];
-} __packed;
-
-struct nvme_nvm_bb_tbl {
- __u8 tblid[4];
- __le16 verid;
- __le16 revid;
- __le32 rvsd1;
- __le32 tblks;
- __le32 tfact;
- __le32 tgrown;
- __le32 tdresv;
- __le32 thresv;
- __le32 rsvd2[8];
- __u8 blk[];
-};
-
-struct nvme_nvm_id20_addrf {
- __u8 grp_len;
- __u8 pu_len;
- __u8 chk_len;
- __u8 lba_len;
- __u8 resv[4];
-};
-
-struct nvme_nvm_id20 {
- __u8 mjr;
- __u8 mnr;
- __u8 resv[6];
-
- struct nvme_nvm_id20_addrf lbaf;
-
- __le32 mccap;
- __u8 resv2[12];
-
- __u8 wit;
- __u8 resv3[31];
-
- /* Geometry */
- __le16 num_grp;
- __le16 num_pu;
- __le32 num_chk;
- __le32 clba;
- __u8 resv4[52];
-
- /* Write data requirements */
- __le32 ws_min;
- __le32 ws_opt;
- __le32 mw_cunits;
- __le32 maxoc;
- __le32 maxocpu;
- __u8 resv5[44];
-
- /* Performance related metrics */
- __le32 trdt;
- __le32 trdm;
- __le32 twrt;
- __le32 twrm;
- __le32 tcrst;
- __le32 tcrsm;
- __u8 resv6[40];
-
- /* Reserved area */
- __u8 resv7[2816];
-
- /* Vendor specific */
- __u8 vs[1024];
-};
-
-struct nvme_nvm_chk_meta {
- __u8 state;
- __u8 type;
- __u8 wi;
- __u8 rsvd[5];
- __le64 slba;
- __le64 cnlb;
- __le64 wp;
-};
-
-/*
- * Check we didn't inadvertently grow the command struct
- */
-static inline void _nvme_nvm_check_size(void)
-{
- BUILD_BUG_ON(sizeof(struct nvme_nvm_identity) != 64);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_ph_rw) != 64);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_erase_blk) != 64);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_getbbtbl) != 64);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_setbbtbl) != 64);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_id12_grp) != 960);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_id12_addrf) != 16);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_id12) != NVME_IDENTIFY_DATA_SIZE);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_bb_tbl) != 64);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_id20_addrf) != 8);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_id20) != NVME_IDENTIFY_DATA_SIZE);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_chk_meta) != 32);
- BUILD_BUG_ON(sizeof(struct nvme_nvm_chk_meta) !=
- sizeof(struct nvm_chk_meta));
-}
-
-static void nvme_nvm_set_addr_12(struct nvm_addrf_12 *dst,
- struct nvme_nvm_id12_addrf *src)
-{
- dst->ch_len = src->ch_len;
- dst->lun_len = src->lun_len;
- dst->blk_len = src->blk_len;
- dst->pg_len = src->pg_len;
- dst->pln_len = src->pln_len;
- dst->sec_len = src->sec_len;
-
- dst->ch_offset = src->ch_offset;
- dst->lun_offset = src->lun_offset;
- dst->blk_offset = src->blk_offset;
- dst->pg_offset = src->pg_offset;
- dst->pln_offset = src->pln_offset;
- dst->sec_offset = src->sec_offset;
-
- dst->ch_mask = ((1ULL << dst->ch_len) - 1) << dst->ch_offset;
- dst->lun_mask = ((1ULL << dst->lun_len) - 1) << dst->lun_offset;
- dst->blk_mask = ((1ULL << dst->blk_len) - 1) << dst->blk_offset;
- dst->pg_mask = ((1ULL << dst->pg_len) - 1) << dst->pg_offset;
- dst->pln_mask = ((1ULL << dst->pln_len) - 1) << dst->pln_offset;
- dst->sec_mask = ((1ULL << dst->sec_len) - 1) << dst->sec_offset;
-}
-
-static int nvme_nvm_setup_12(struct nvme_nvm_id12 *id,
- struct nvm_geo *geo)
-{
- struct nvme_nvm_id12_grp *src;
- int sec_per_pg, sec_per_pl, pg_per_blk;
-
- if (id->cgrps != 1)
- return -EINVAL;
-
- src = &id->grp;
-
- if (src->mtype != 0) {
- pr_err("nvm: memory type not supported\n");
- return -EINVAL;
- }
-
- /* 1.2 spec. only reports a single version id - unfold */
- geo->major_ver_id = id->ver_id;
- geo->minor_ver_id = 2;
-
- /* Set compacted version for upper layers */
- geo->version = NVM_OCSSD_SPEC_12;
-
- geo->num_ch = src->num_ch;
- geo->num_lun = src->num_lun;
- geo->all_luns = geo->num_ch * geo->num_lun;
-
- geo->num_chk = le16_to_cpu(src->num_chk);
-
- geo->csecs = le16_to_cpu(src->csecs);
- geo->sos = le16_to_cpu(src->sos);
-
- pg_per_blk = le16_to_cpu(src->num_pg);
- sec_per_pg = le16_to_cpu(src->fpg_sz) / geo->csecs;
- sec_per_pl = sec_per_pg * src->num_pln;
- geo->clba = sec_per_pl * pg_per_blk;
-
- geo->all_chunks = geo->all_luns * geo->num_chk;
- geo->total_secs = geo->clba * geo->all_chunks;
-
- geo->ws_min = sec_per_pg;
- geo->ws_opt = sec_per_pg;
- geo->mw_cunits = geo->ws_opt << 3; /* default to MLC safe values */
-
- /* Do not impose values for maximum number of open blocks as it is
- * unspecified in 1.2. Users of 1.2 must be aware of this and eventually
- * specify these values through a quirk if restrictions apply.
- */
- geo->maxoc = geo->all_luns * geo->num_chk;
- geo->maxocpu = geo->num_chk;
-
- geo->mccap = le32_to_cpu(src->mccap);
-
- geo->trdt = le32_to_cpu(src->trdt);
- geo->trdm = le32_to_cpu(src->trdm);
- geo->tprt = le32_to_cpu(src->tprt);
- geo->tprm = le32_to_cpu(src->tprm);
- geo->tbet = le32_to_cpu(src->tbet);
- geo->tbem = le32_to_cpu(src->tbem);
-
- /* 1.2 compatibility */
- geo->vmnt = id->vmnt;
- geo->cap = le32_to_cpu(id->cap);
- geo->dom = le32_to_cpu(id->dom);
-
- geo->mtype = src->mtype;
- geo->fmtype = src->fmtype;
-
- geo->cpar = le16_to_cpu(src->cpar);
- geo->mpos = le32_to_cpu(src->mpos);
-
- geo->pln_mode = NVM_PLANE_SINGLE;
-
- if (geo->mpos & 0x020202) {
- geo->pln_mode = NVM_PLANE_DOUBLE;
- geo->ws_opt <<= 1;
- } else if (geo->mpos & 0x040404) {
- geo->pln_mode = NVM_PLANE_QUAD;
- geo->ws_opt <<= 2;
- }
-
- geo->num_pln = src->num_pln;
- geo->num_pg = le16_to_cpu(src->num_pg);
- geo->fpg_sz = le16_to_cpu(src->fpg_sz);
-
- nvme_nvm_set_addr_12((struct nvm_addrf_12 *)&geo->addrf, &id->ppaf);
-
- return 0;
-}
-
-static void nvme_nvm_set_addr_20(struct nvm_addrf *dst,
- struct nvme_nvm_id20_addrf *src)
-{
- dst->ch_len = src->grp_len;
- dst->lun_len = src->pu_len;
- dst->chk_len = src->chk_len;
- dst->sec_len = src->lba_len;
-
- dst->sec_offset = 0;
- dst->chk_offset = dst->sec_len;
- dst->lun_offset = dst->chk_offset + dst->chk_len;
- dst->ch_offset = dst->lun_offset + dst->lun_len;
-
- dst->ch_mask = ((1ULL << dst->ch_len) - 1) << dst->ch_offset;
- dst->lun_mask = ((1ULL << dst->lun_len) - 1) << dst->lun_offset;
- dst->chk_mask = ((1ULL << dst->chk_len) - 1) << dst->chk_offset;
- dst->sec_mask = ((1ULL << dst->sec_len) - 1) << dst->sec_offset;
-}
-
-static int nvme_nvm_setup_20(struct nvme_nvm_id20 *id,
- struct nvm_geo *geo)
-{
- geo->major_ver_id = id->mjr;
- geo->minor_ver_id = id->mnr;
-
- /* Set compacted version for upper layers */
- geo->version = NVM_OCSSD_SPEC_20;
-
- geo->num_ch = le16_to_cpu(id->num_grp);
- geo->num_lun = le16_to_cpu(id->num_pu);
- geo->all_luns = geo->num_ch * geo->num_lun;
-
- geo->num_chk = le32_to_cpu(id->num_chk);
- geo->clba = le32_to_cpu(id->clba);
-
- geo->all_chunks = geo->all_luns * geo->num_chk;
- geo->total_secs = geo->clba * geo->all_chunks;
-
- geo->ws_min = le32_to_cpu(id->ws_min);
- geo->ws_opt = le32_to_cpu(id->ws_opt);
- geo->mw_cunits = le32_to_cpu(id->mw_cunits);
- geo->maxoc = le32_to_cpu(id->maxoc);
- geo->maxocpu = le32_to_cpu(id->maxocpu);
-
- geo->trdt = le32_to_cpu(id->trdt);
- geo->trdm = le32_to_cpu(id->trdm);
- geo->tprt = le32_to_cpu(id->twrt);
- geo->tprm = le32_to_cpu(id->twrm);
- geo->tbet = le32_to_cpu(id->tcrst);
- geo->tbem = le32_to_cpu(id->tcrsm);
-
- nvme_nvm_set_addr_20(&geo->addrf, &id->lbaf);
-
- return 0;
-}
-
-static int nvme_nvm_identity(struct nvm_dev *nvmdev)
-{
- struct nvme_ns *ns = nvmdev->q->queuedata;
- struct nvme_nvm_id12 *id;
- struct nvme_nvm_command c = {};
- int ret;
-
- c.identity.opcode = nvme_nvm_admin_identity;
- c.identity.nsid = cpu_to_le32(ns->head->ns_id);
-
- id = kmalloc(sizeof(struct nvme_nvm_id12), GFP_KERNEL);
- if (!id)
- return -ENOMEM;
-
- ret = nvme_submit_sync_cmd(ns->ctrl->admin_q, (struct nvme_command *)&c,
- id, sizeof(struct nvme_nvm_id12));
- if (ret) {
- ret = -EIO;
- goto out;
- }
-
- /*
- * The 1.2 and 2.0 specifications share the first byte in their geometry
- * command to make it possible to know what version a device implements.
- */
- switch (id->ver_id) {
- case 1:
- ret = nvme_nvm_setup_12(id, &nvmdev->geo);
- break;
- case 2:
- ret = nvme_nvm_setup_20((struct nvme_nvm_id20 *)id,
- &nvmdev->geo);
- break;
- default:
- dev_err(ns->ctrl->device, "OCSSD revision not supported (%d)\n",
- id->ver_id);
- ret = -EINVAL;
- }
-
-out:
- kfree(id);
- return ret;
-}
-
-static int nvme_nvm_get_bb_tbl(struct nvm_dev *nvmdev, struct ppa_addr ppa,
- u8 *blks)
-{
- struct request_queue *q = nvmdev->q;
- struct nvm_geo *geo = &nvmdev->geo;
- struct nvme_ns *ns = q->queuedata;
- struct nvme_ctrl *ctrl = ns->ctrl;
- struct nvme_nvm_command c = {};
- struct nvme_nvm_bb_tbl *bb_tbl;
- int nr_blks = geo->num_chk * geo->num_pln;
- int tblsz = sizeof(struct nvme_nvm_bb_tbl) + nr_blks;
- int ret = 0;
-
- c.get_bb.opcode = nvme_nvm_admin_get_bb_tbl;
- c.get_bb.nsid = cpu_to_le32(ns->head->ns_id);
- c.get_bb.spba = cpu_to_le64(ppa.ppa);
-
- bb_tbl = kzalloc(tblsz, GFP_KERNEL);
- if (!bb_tbl)
- return -ENOMEM;
-
- ret = nvme_submit_sync_cmd(ctrl->admin_q, (struct nvme_command *)&c,
- bb_tbl, tblsz);
- if (ret) {
- dev_err(ctrl->device, "get bad block table failed (%d)\n", ret);
- ret = -EIO;
- goto out;
- }
-
- if (bb_tbl->tblid[0] != 'B' || bb_tbl->tblid[1] != 'B' ||
- bb_tbl->tblid[2] != 'L' || bb_tbl->tblid[3] != 'T') {
- dev_err(ctrl->device, "bbt format mismatch\n");
- ret = -EINVAL;
- goto out;
- }
-
- if (le16_to_cpu(bb_tbl->verid) != 1) {
- ret = -EINVAL;
- dev_err(ctrl->device, "bbt version not supported\n");
- goto out;
- }
-
- if (le32_to_cpu(bb_tbl->tblks) != nr_blks) {
- ret = -EINVAL;
- dev_err(ctrl->device,
- "bbt unsuspected blocks returned (%u!=%u)",
- le32_to_cpu(bb_tbl->tblks), nr_blks);
- goto out;
- }
-
- memcpy(blks, bb_tbl->blk, geo->num_chk * geo->num_pln);
-out:
- kfree(bb_tbl);
- return ret;
-}
-
-static int nvme_nvm_set_bb_tbl(struct nvm_dev *nvmdev, struct ppa_addr *ppas,
- int nr_ppas, int type)
-{
- struct nvme_ns *ns = nvmdev->q->queuedata;
- struct nvme_nvm_command c = {};
- int ret = 0;
-
- c.set_bb.opcode = nvme_nvm_admin_set_bb_tbl;
- c.set_bb.nsid = cpu_to_le32(ns->head->ns_id);
- c.set_bb.spba = cpu_to_le64(ppas->ppa);
- c.set_bb.nlb = cpu_to_le16(nr_ppas - 1);
- c.set_bb.value = type;
-
- ret = nvme_submit_sync_cmd(ns->ctrl->admin_q, (struct nvme_command *)&c,
- NULL, 0);
- if (ret)
- dev_err(ns->ctrl->device, "set bad block table failed (%d)\n",
- ret);
- return ret;
-}
-
-/*
- * Expect the lba in device format
- */
-static int nvme_nvm_get_chk_meta(struct nvm_dev *ndev,
- sector_t slba, int nchks,
- struct nvm_chk_meta *meta)
-{
- struct nvm_geo *geo = &ndev->geo;
- struct nvme_ns *ns = ndev->q->queuedata;
- struct nvme_ctrl *ctrl = ns->ctrl;
- struct nvme_nvm_chk_meta *dev_meta, *dev_meta_off;
- struct ppa_addr ppa;
- size_t left = nchks * sizeof(struct nvme_nvm_chk_meta);
- size_t log_pos, offset, len;
- int i, max_len;
- int ret = 0;
-
- /*
- * limit requests to maximum 256K to avoid issuing arbitrary large
- * requests when the device does not specific a maximum transfer size.
- */
- max_len = min_t(unsigned int, ctrl->max_hw_sectors << 9, 256 * 1024);
-
- dev_meta = kmalloc(max_len, GFP_KERNEL);
- if (!dev_meta)
- return -ENOMEM;
-
- /* Normalize lba address space to obtain log offset */
- ppa.ppa = slba;
- ppa = dev_to_generic_addr(ndev, ppa);
-
- log_pos = ppa.m.chk;
- log_pos += ppa.m.pu * geo->num_chk;
- log_pos += ppa.m.grp * geo->num_lun * geo->num_chk;
-
- offset = log_pos * sizeof(struct nvme_nvm_chk_meta);
-
- while (left) {
- len = min_t(unsigned int, left, max_len);
-
- memset(dev_meta, 0, max_len);
- dev_meta_off = dev_meta;
-
- ret = nvme_get_log(ctrl, ns->head->ns_id,
- NVME_NVM_LOG_REPORT_CHUNK, 0, NVME_CSI_NVM,
- dev_meta, len, offset);
- if (ret) {
- dev_err(ctrl->device, "Get REPORT CHUNK log error\n");
- break;
- }
-
- for (i = 0; i < len; i += sizeof(struct nvme_nvm_chk_meta)) {
- meta->state = dev_meta_off->state;
- meta->type = dev_meta_off->type;
- meta->wi = dev_meta_off->wi;
- meta->slba = le64_to_cpu(dev_meta_off->slba);
- meta->cnlb = le64_to_cpu(dev_meta_off->cnlb);
- meta->wp = le64_to_cpu(dev_meta_off->wp);
-
- meta++;
- dev_meta_off++;
- }
-
- offset += len;
- left -= len;
- }
-
- kfree(dev_meta);
-
- return ret;
-}
-
-static inline void nvme_nvm_rqtocmd(struct nvm_rq *rqd, struct nvme_ns *ns,
- struct nvme_nvm_command *c)
-{
- c->ph_rw.opcode = rqd->opcode;
- c->ph_rw.nsid = cpu_to_le32(ns->head->ns_id);
- c->ph_rw.spba = cpu_to_le64(rqd->ppa_addr.ppa);
- c->ph_rw.metadata = cpu_to_le64(rqd->dma_meta_list);
- c->ph_rw.control = cpu_to_le16(rqd->flags);
- c->ph_rw.length = cpu_to_le16(rqd->nr_ppas - 1);
-}
-
-static void nvme_nvm_end_io(struct request *rq, blk_status_t status)
-{
- struct nvm_rq *rqd = rq->end_io_data;
-
- rqd->ppa_status = le64_to_cpu(nvme_req(rq)->result.u64);
- rqd->error = nvme_req(rq)->status;
- nvm_end_io(rqd);
-
- kfree(nvme_req(rq)->cmd);
- blk_mq_free_request(rq);
-}
-
-static struct request *nvme_nvm_alloc_request(struct request_queue *q,
- struct nvm_rq *rqd,
- struct nvme_nvm_command *cmd)
-{
- struct nvme_ns *ns = q->queuedata;
- struct request *rq;
-
- nvme_nvm_rqtocmd(rqd, ns, cmd);
-
- rq = nvme_alloc_request(q, (struct nvme_command *)cmd, 0);
- if (IS_ERR(rq))
- return rq;
-
- rq->cmd_flags &= ~REQ_FAILFAST_DRIVER;
-
- if (rqd->bio)
- blk_rq_append_bio(rq, rqd->bio);
- else
- rq->ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
-
- return rq;
-}
-
-static int nvme_nvm_submit_io(struct nvm_dev *dev, struct nvm_rq *rqd,
- void *buf)
-{
- struct nvm_geo *geo = &dev->geo;
- struct request_queue *q = dev->q;
- struct nvme_nvm_command *cmd;
- struct request *rq;
- int ret;
-
- cmd = kzalloc(sizeof(struct nvme_nvm_command), GFP_KERNEL);
- if (!cmd)
- return -ENOMEM;
-
- rq = nvme_nvm_alloc_request(q, rqd, cmd);
- if (IS_ERR(rq)) {
- ret = PTR_ERR(rq);
- goto err_free_cmd;
- }
-
- if (buf) {
- ret = blk_rq_map_kern(q, rq, buf, geo->csecs * rqd->nr_ppas,
- GFP_KERNEL);
- if (ret)
- goto err_free_cmd;
- }
-
- rq->end_io_data = rqd;
-
- blk_execute_rq_nowait(NULL, rq, 0, nvme_nvm_end_io);
-
- return 0;
-
-err_free_cmd:
- kfree(cmd);
- return ret;
-}
-
-static void *nvme_nvm_create_dma_pool(struct nvm_dev *nvmdev, char *name,
- int size)
-{
- struct nvme_ns *ns = nvmdev->q->queuedata;
-
- return dma_pool_create(name, ns->ctrl->dev, size, PAGE_SIZE, 0);
-}
-
-static void nvme_nvm_destroy_dma_pool(void *pool)
-{
- struct dma_pool *dma_pool = pool;
-
- dma_pool_destroy(dma_pool);
-}
-
-static void *nvme_nvm_dev_dma_alloc(struct nvm_dev *dev, void *pool,
- gfp_t mem_flags, dma_addr_t *dma_handler)
-{
- return dma_pool_alloc(pool, mem_flags, dma_handler);
-}
-
-static void nvme_nvm_dev_dma_free(void *pool, void *addr,
- dma_addr_t dma_handler)
-{
- dma_pool_free(pool, addr, dma_handler);
-}
-
-static struct nvm_dev_ops nvme_nvm_dev_ops = {
- .identity = nvme_nvm_identity,
-
- .get_bb_tbl = nvme_nvm_get_bb_tbl,
- .set_bb_tbl = nvme_nvm_set_bb_tbl,
-
- .get_chk_meta = nvme_nvm_get_chk_meta,
-
- .submit_io = nvme_nvm_submit_io,
-
- .create_dma_pool = nvme_nvm_create_dma_pool,
- .destroy_dma_pool = nvme_nvm_destroy_dma_pool,
- .dev_dma_alloc = nvme_nvm_dev_dma_alloc,
- .dev_dma_free = nvme_nvm_dev_dma_free,
-};
-
-static int nvme_nvm_submit_user_cmd(struct request_queue *q,
- struct nvme_ns *ns,
- struct nvme_nvm_command *vcmd,
- void __user *ubuf, unsigned int bufflen,
- void __user *meta_buf, unsigned int meta_len,
- void __user *ppa_buf, unsigned int ppa_len,
- u32 *result, u64 *status, unsigned int timeout)
-{
- bool write = nvme_is_write((struct nvme_command *)vcmd);
- struct nvm_dev *dev = ns->ndev;
- struct request *rq;
- struct bio *bio = NULL;
- __le64 *ppa_list = NULL;
- dma_addr_t ppa_dma;
- __le64 *metadata = NULL;
- dma_addr_t metadata_dma;
- DECLARE_COMPLETION_ONSTACK(wait);
- int ret = 0;
-
- rq = nvme_alloc_request(q, (struct nvme_command *)vcmd, 0);
- if (IS_ERR(rq)) {
- ret = -ENOMEM;
- goto err_cmd;
- }
-
- if (timeout)
- rq->timeout = timeout;
-
- if (ppa_buf && ppa_len) {
- ppa_list = dma_pool_alloc(dev->dma_pool, GFP_KERNEL, &ppa_dma);
- if (!ppa_list) {
- ret = -ENOMEM;
- goto err_rq;
- }
- if (copy_from_user(ppa_list, (void __user *)ppa_buf,
- sizeof(u64) * (ppa_len + 1))) {
- ret = -EFAULT;
- goto err_ppa;
- }
- vcmd->ph_rw.spba = cpu_to_le64(ppa_dma);
- } else {
- vcmd->ph_rw.spba = cpu_to_le64((uintptr_t)ppa_buf);
- }
-
- if (ubuf && bufflen) {
- ret = blk_rq_map_user(q, rq, NULL, ubuf, bufflen, GFP_KERNEL);
- if (ret)
- goto err_ppa;
- bio = rq->bio;
-
- if (meta_buf && meta_len) {
- metadata = dma_pool_alloc(dev->dma_pool, GFP_KERNEL,
- &metadata_dma);
- if (!metadata) {
- ret = -ENOMEM;
- goto err_map;
- }
-
- if (write) {
- if (copy_from_user(metadata,
- (void __user *)meta_buf,
- meta_len)) {
- ret = -EFAULT;
- goto err_meta;
- }
- }
- vcmd->ph_rw.metadata = cpu_to_le64(metadata_dma);
- }
-
- bio_set_dev(bio, ns->disk->part0);
- }
-
- blk_execute_rq(NULL, rq, 0);
-
- if (nvme_req(rq)->flags & NVME_REQ_CANCELLED)
- ret = -EINTR;
- else if (nvme_req(rq)->status & 0x7ff)
- ret = -EIO;
- if (result)
- *result = nvme_req(rq)->status & 0x7ff;
- if (status)
- *status = le64_to_cpu(nvme_req(rq)->result.u64);
-
- if (metadata && !ret && !write) {
- if (copy_to_user(meta_buf, (void *)metadata, meta_len))
- ret = -EFAULT;
- }
-err_meta:
- if (meta_buf && meta_len)
- dma_pool_free(dev->dma_pool, metadata, metadata_dma);
-err_map:
- if (bio)
- blk_rq_unmap_user(bio);
-err_ppa:
- if (ppa_buf && ppa_len)
- dma_pool_free(dev->dma_pool, ppa_list, ppa_dma);
-err_rq:
- blk_mq_free_request(rq);
-err_cmd:
- return ret;
-}
-
-static int nvme_nvm_submit_vio(struct nvme_ns *ns,
- struct nvm_user_vio __user *uvio)
-{
- struct nvm_user_vio vio;
- struct nvme_nvm_command c;
- unsigned int length;
- int ret;
-
- if (copy_from_user(&vio, uvio, sizeof(vio)))
- return -EFAULT;
- if (vio.flags)
- return -EINVAL;
-
- memset(&c, 0, sizeof(c));
- c.ph_rw.opcode = vio.opcode;
- c.ph_rw.nsid = cpu_to_le32(ns->head->ns_id);
- c.ph_rw.control = cpu_to_le16(vio.control);
- c.ph_rw.length = cpu_to_le16(vio.nppas);
-
- length = (vio.nppas + 1) << ns->lba_shift;
-
- ret = nvme_nvm_submit_user_cmd(ns->queue, ns, &c,
- (void __user *)(uintptr_t)vio.addr, length,
- (void __user *)(uintptr_t)vio.metadata,
- vio.metadata_len,
- (void __user *)(uintptr_t)vio.ppa_list, vio.nppas,
- &vio.result, &vio.status, 0);
-
- if (ret && copy_to_user(uvio, &vio, sizeof(vio)))
- return -EFAULT;
-
- return ret;
-}
-
-static int nvme_nvm_user_vcmd(struct nvme_ns *ns, int admin,
- struct nvm_passthru_vio __user *uvcmd)
-{
- struct nvm_passthru_vio vcmd;
- struct nvme_nvm_command c;
- struct request_queue *q;
- unsigned int timeout = 0;
- int ret;
-
- if (copy_from_user(&vcmd, uvcmd, sizeof(vcmd)))
- return -EFAULT;
- if ((vcmd.opcode != 0xF2) && (!capable(CAP_SYS_ADMIN)))
- return -EACCES;
- if (vcmd.flags)
- return -EINVAL;
-
- memset(&c, 0, sizeof(c));
- c.common.opcode = vcmd.opcode;
- c.common.nsid = cpu_to_le32(ns->head->ns_id);
- c.common.cdw2[0] = cpu_to_le32(vcmd.cdw2);
- c.common.cdw2[1] = cpu_to_le32(vcmd.cdw3);
- /* cdw11-12 */
- c.ph_rw.length = cpu_to_le16(vcmd.nppas);
- c.ph_rw.control = cpu_to_le16(vcmd.control);
- c.common.cdw13 = cpu_to_le32(vcmd.cdw13);
- c.common.cdw14 = cpu_to_le32(vcmd.cdw14);
- c.common.cdw15 = cpu_to_le32(vcmd.cdw15);
-
- if (vcmd.timeout_ms)
- timeout = msecs_to_jiffies(vcmd.timeout_ms);
-
- q = admin ? ns->ctrl->admin_q : ns->queue;
-
- ret = nvme_nvm_submit_user_cmd(q, ns,
- (struct nvme_nvm_command *)&c,
- (void __user *)(uintptr_t)vcmd.addr, vcmd.data_len,
- (void __user *)(uintptr_t)vcmd.metadata,
- vcmd.metadata_len,
- (void __user *)(uintptr_t)vcmd.ppa_list, vcmd.nppas,
- &vcmd.result, &vcmd.status, timeout);
-
- if (ret && copy_to_user(uvcmd, &vcmd, sizeof(vcmd)))
- return -EFAULT;
-
- return ret;
-}
-
-int nvme_nvm_ioctl(struct nvme_ns *ns, unsigned int cmd, void __user *argp)
-{
- switch (cmd) {
- case NVME_NVM_IOCTL_ADMIN_VIO:
- return nvme_nvm_user_vcmd(ns, 1, argp);
- case NVME_NVM_IOCTL_IO_VIO:
- return nvme_nvm_user_vcmd(ns, 0, argp);
- case NVME_NVM_IOCTL_SUBMIT_VIO:
- return nvme_nvm_submit_vio(ns, argp);
- default:
- return -ENOTTY;
- }
-}
-
-int nvme_nvm_register(struct nvme_ns *ns, char *disk_name, int node)
-{
- struct request_queue *q = ns->queue;
- struct nvm_dev *dev;
- struct nvm_geo *geo;
-
- _nvme_nvm_check_size();
-
- dev = nvm_alloc_dev(node);
- if (!dev)
- return -ENOMEM;
-
- /* Note that csecs and sos will be overridden if it is a 1.2 drive. */
- geo = &dev->geo;
- geo->csecs = 1 << ns->lba_shift;
- geo->sos = ns->ms;
- if (ns->features & NVME_NS_EXT_LBAS)
- geo->ext = true;
- else
- geo->ext = false;
- geo->mdts = ns->ctrl->max_hw_sectors;
-
- dev->q = q;
- memcpy(dev->name, disk_name, DISK_NAME_LEN);
- dev->ops = &nvme_nvm_dev_ops;
- dev->private_data = ns;
- ns->ndev = dev;
-
- return nvm_register(dev);
-}
-
-void nvme_nvm_unregister(struct nvme_ns *ns)
-{
- nvm_unregister(ns->ndev);
-}
-
-static ssize_t nvm_dev_attr_show(struct device *dev,
- struct device_attribute *dattr, char *page)
-{
- struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
- struct nvm_dev *ndev = ns->ndev;
- struct nvm_geo *geo = &ndev->geo;
- struct attribute *attr;
-
- if (!ndev)
- return 0;
-
- attr = &dattr->attr;
-
- if (strcmp(attr->name, "version") == 0) {
- if (geo->major_ver_id == 1)
- return scnprintf(page, PAGE_SIZE, "%u\n",
- geo->major_ver_id);
- else
- return scnprintf(page, PAGE_SIZE, "%u.%u\n",
- geo->major_ver_id,
- geo->minor_ver_id);
- } else if (strcmp(attr->name, "capabilities") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->cap);
- } else if (strcmp(attr->name, "read_typ") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->trdt);
- } else if (strcmp(attr->name, "read_max") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->trdm);
- } else {
- return scnprintf(page,
- PAGE_SIZE,
- "Unhandled attr(%s) in `%s`\n",
- attr->name, __func__);
- }
-}
-
-static ssize_t nvm_dev_attr_show_ppaf(struct nvm_addrf_12 *ppaf, char *page)
-{
- return scnprintf(page, PAGE_SIZE,
- "0x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x\n",
- ppaf->ch_offset, ppaf->ch_len,
- ppaf->lun_offset, ppaf->lun_len,
- ppaf->pln_offset, ppaf->pln_len,
- ppaf->blk_offset, ppaf->blk_len,
- ppaf->pg_offset, ppaf->pg_len,
- ppaf->sec_offset, ppaf->sec_len);
-}
-
-static ssize_t nvm_dev_attr_show_12(struct device *dev,
- struct device_attribute *dattr, char *page)
-{
- struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
- struct nvm_dev *ndev = ns->ndev;
- struct nvm_geo *geo = &ndev->geo;
- struct attribute *attr;
-
- if (!ndev)
- return 0;
-
- attr = &dattr->attr;
-
- if (strcmp(attr->name, "vendor_opcode") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->vmnt);
- } else if (strcmp(attr->name, "device_mode") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->dom);
- /* kept for compatibility */
- } else if (strcmp(attr->name, "media_manager") == 0) {
- return scnprintf(page, PAGE_SIZE, "%s\n", "gennvm");
- } else if (strcmp(attr->name, "ppa_format") == 0) {
- return nvm_dev_attr_show_ppaf((void *)&geo->addrf, page);
- } else if (strcmp(attr->name, "media_type") == 0) { /* u8 */
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->mtype);
- } else if (strcmp(attr->name, "flash_media_type") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->fmtype);
- } else if (strcmp(attr->name, "num_channels") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_ch);
- } else if (strcmp(attr->name, "num_luns") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_lun);
- } else if (strcmp(attr->name, "num_planes") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_pln);
- } else if (strcmp(attr->name, "num_blocks") == 0) { /* u16 */
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_chk);
- } else if (strcmp(attr->name, "num_pages") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_pg);
- } else if (strcmp(attr->name, "page_size") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->fpg_sz);
- } else if (strcmp(attr->name, "hw_sector_size") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->csecs);
- } else if (strcmp(attr->name, "oob_sector_size") == 0) {/* u32 */
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->sos);
- } else if (strcmp(attr->name, "prog_typ") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tprt);
- } else if (strcmp(attr->name, "prog_max") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tprm);
- } else if (strcmp(attr->name, "erase_typ") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tbet);
- } else if (strcmp(attr->name, "erase_max") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tbem);
- } else if (strcmp(attr->name, "multiplane_modes") == 0) {
- return scnprintf(page, PAGE_SIZE, "0x%08x\n", geo->mpos);
- } else if (strcmp(attr->name, "media_capabilities") == 0) {
- return scnprintf(page, PAGE_SIZE, "0x%08x\n", geo->mccap);
- } else if (strcmp(attr->name, "max_phys_secs") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", NVM_MAX_VLBA);
- } else {
- return scnprintf(page, PAGE_SIZE,
- "Unhandled attr(%s) in `%s`\n",
- attr->name, __func__);
- }
-}
-
-static ssize_t nvm_dev_attr_show_20(struct device *dev,
- struct device_attribute *dattr, char *page)
-{
- struct nvme_ns *ns = nvme_get_ns_from_dev(dev);
- struct nvm_dev *ndev = ns->ndev;
- struct nvm_geo *geo = &ndev->geo;
- struct attribute *attr;
-
- if (!ndev)
- return 0;
-
- attr = &dattr->attr;
-
- if (strcmp(attr->name, "groups") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_ch);
- } else if (strcmp(attr->name, "punits") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_lun);
- } else if (strcmp(attr->name, "chunks") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->num_chk);
- } else if (strcmp(attr->name, "clba") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->clba);
- } else if (strcmp(attr->name, "ws_min") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->ws_min);
- } else if (strcmp(attr->name, "ws_opt") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->ws_opt);
- } else if (strcmp(attr->name, "maxoc") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->maxoc);
- } else if (strcmp(attr->name, "maxocpu") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->maxocpu);
- } else if (strcmp(attr->name, "mw_cunits") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->mw_cunits);
- } else if (strcmp(attr->name, "write_typ") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tprt);
- } else if (strcmp(attr->name, "write_max") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tprm);
- } else if (strcmp(attr->name, "reset_typ") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tbet);
- } else if (strcmp(attr->name, "reset_max") == 0) {
- return scnprintf(page, PAGE_SIZE, "%u\n", geo->tbem);
- } else {
- return scnprintf(page, PAGE_SIZE,
- "Unhandled attr(%s) in `%s`\n",
- attr->name, __func__);
- }
-}
-
-#define NVM_DEV_ATTR_RO(_name) \
- DEVICE_ATTR(_name, S_IRUGO, nvm_dev_attr_show, NULL)
-#define NVM_DEV_ATTR_12_RO(_name) \
- DEVICE_ATTR(_name, S_IRUGO, nvm_dev_attr_show_12, NULL)
-#define NVM_DEV_ATTR_20_RO(_name) \
- DEVICE_ATTR(_name, S_IRUGO, nvm_dev_attr_show_20, NULL)
-
-/* general attributes */
-static NVM_DEV_ATTR_RO(version);
-static NVM_DEV_ATTR_RO(capabilities);
-
-static NVM_DEV_ATTR_RO(read_typ);
-static NVM_DEV_ATTR_RO(read_max);
-
-/* 1.2 values */
-static NVM_DEV_ATTR_12_RO(vendor_opcode);
-static NVM_DEV_ATTR_12_RO(device_mode);
-static NVM_DEV_ATTR_12_RO(ppa_format);
-static NVM_DEV_ATTR_12_RO(media_manager);
-static NVM_DEV_ATTR_12_RO(media_type);
-static NVM_DEV_ATTR_12_RO(flash_media_type);
-static NVM_DEV_ATTR_12_RO(num_channels);
-static NVM_DEV_ATTR_12_RO(num_luns);
-static NVM_DEV_ATTR_12_RO(num_planes);
-static NVM_DEV_ATTR_12_RO(num_blocks);
-static NVM_DEV_ATTR_12_RO(num_pages);
-static NVM_DEV_ATTR_12_RO(page_size);
-static NVM_DEV_ATTR_12_RO(hw_sector_size);
-static NVM_DEV_ATTR_12_RO(oob_sector_size);
-static NVM_DEV_ATTR_12_RO(prog_typ);
-static NVM_DEV_ATTR_12_RO(prog_max);
-static NVM_DEV_ATTR_12_RO(erase_typ);
-static NVM_DEV_ATTR_12_RO(erase_max);
-static NVM_DEV_ATTR_12_RO(multiplane_modes);
-static NVM_DEV_ATTR_12_RO(media_capabilities);
-static NVM_DEV_ATTR_12_RO(max_phys_secs);
-
-/* 2.0 values */
-static NVM_DEV_ATTR_20_RO(groups);
-static NVM_DEV_ATTR_20_RO(punits);
-static NVM_DEV_ATTR_20_RO(chunks);
-static NVM_DEV_ATTR_20_RO(clba);
-static NVM_DEV_ATTR_20_RO(ws_min);
-static NVM_DEV_ATTR_20_RO(ws_opt);
-static NVM_DEV_ATTR_20_RO(maxoc);
-static NVM_DEV_ATTR_20_RO(maxocpu);
-static NVM_DEV_ATTR_20_RO(mw_cunits);
-static NVM_DEV_ATTR_20_RO(write_typ);
-static NVM_DEV_ATTR_20_RO(write_max);
-static NVM_DEV_ATTR_20_RO(reset_typ);
-static NVM_DEV_ATTR_20_RO(reset_max);
-
-static struct attribute *nvm_dev_attrs[] = {
- /* version agnostic attrs */
- &dev_attr_version.attr,
- &dev_attr_capabilities.attr,
- &dev_attr_read_typ.attr,
- &dev_attr_read_max.attr,
-
- /* 1.2 attrs */
- &dev_attr_vendor_opcode.attr,
- &dev_attr_device_mode.attr,
- &dev_attr_media_manager.attr,
- &dev_attr_ppa_format.attr,
- &dev_attr_media_type.attr,
- &dev_attr_flash_media_type.attr,
- &dev_attr_num_channels.attr,
- &dev_attr_num_luns.attr,
- &dev_attr_num_planes.attr,
- &dev_attr_num_blocks.attr,
- &dev_attr_num_pages.attr,
- &dev_attr_page_size.attr,
- &dev_attr_hw_sector_size.attr,
- &dev_attr_oob_sector_size.attr,
- &dev_attr_prog_typ.attr,
- &dev_attr_prog_max.attr,
- &dev_attr_erase_typ.attr,
- &dev_attr_erase_max.attr,
- &dev_attr_multiplane_modes.attr,
- &dev_attr_media_capabilities.attr,
- &dev_attr_max_phys_secs.attr,
-
- /* 2.0 attrs */
- &dev_attr_groups.attr,
- &dev_attr_punits.attr,
- &dev_attr_chunks.attr,
- &dev_attr_clba.attr,
- &dev_attr_ws_min.attr,
- &dev_attr_ws_opt.attr,
- &dev_attr_maxoc.attr,
- &dev_attr_maxocpu.attr,
- &dev_attr_mw_cunits.attr,
-
- &dev_attr_write_typ.attr,
- &dev_attr_write_max.attr,
- &dev_attr_reset_typ.attr,
- &dev_attr_reset_max.attr,
-
- NULL,
-};
-
-static umode_t nvm_dev_attrs_visible(struct kobject *kobj,
- struct attribute *attr, int index)
-{
- struct device *dev = kobj_to_dev(kobj);
- struct gendisk *disk = dev_to_disk(dev);
- struct nvme_ns *ns = disk->private_data;
- struct nvm_dev *ndev = ns->ndev;
- struct device_attribute *dev_attr =
- container_of(attr, typeof(*dev_attr), attr);
-
- if (!ndev)
- return 0;
-
- if (dev_attr->show == nvm_dev_attr_show)
- return attr->mode;
-
- switch (ndev->geo.major_ver_id) {
- case 1:
- if (dev_attr->show == nvm_dev_attr_show_12)
- return attr->mode;
- break;
- case 2:
- if (dev_attr->show == nvm_dev_attr_show_20)
- return attr->mode;
- break;
- }
-
- return 0;
-}
-
-const struct attribute_group nvme_nvm_attr_group = {
- .name = "lightnvm",
- .attrs = nvm_dev_attrs,
- .is_visible = nvm_dev_attrs_visible,
-};
if (!head->disk)
return;
kblockd_schedule_work(&head->requeue_work);
- if (head->disk->flags & GENHD_FL_UP) {
+ if (test_bit(NVME_NSHEAD_DISK_LIVE, &head->flags)) {
nvme_cdev_del(&head->cdev, &head->cdev_device);
del_gendisk(head->disk);
}
#include <linux/pci.h>
#include <linux/kref.h>
#include <linux/blk-mq.h>
-#include <linux/lightnvm.h>
#include <linux/sed-opal.h>
#include <linux/fault-inject.h>
#include <linux/rcupdate.h>
extern struct workqueue_struct *nvme_reset_wq;
extern struct workqueue_struct *nvme_delete_wq;
-enum {
- NVME_NS_LBA = 0,
- NVME_NS_LIGHTNVM = 1,
-};
-
/*
* List of workarounds for devices that required behavior not specified in
* the standard.
*/
NVME_QUIRK_NO_DEEPEST_PS = (1 << 5),
- /*
- * Supports the LighNVM command set if indicated in vs[1].
- */
- NVME_QUIRK_LIGHTNVM = (1 << 6),
-
/*
* Set MEDIUM priority on SQ creation
*/
struct nvme_request {
struct nvme_command *cmd;
union nvme_result result;
+ u8 genctr;
u8 retries;
u8 flags;
u16 status;
u32 ana_grpid;
#endif
struct list_head siblings;
- struct nvm_dev *ndev;
struct kref kref;
struct nvme_ns_head *head;
int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size);
};
+/*
+ * nvme command_id is constructed as such:
+ * | xxxx | xxxxxxxxxxxx |
+ * gen request tag
+ */
+#define nvme_genctr_mask(gen) (gen & 0xf)
+#define nvme_cid_install_genctr(gen) (nvme_genctr_mask(gen) << 12)
+#define nvme_genctr_from_cid(cid) ((cid & 0xf000) >> 12)
+#define nvme_tag_from_cid(cid) (cid & 0xfff)
+
+static inline u16 nvme_cid(struct request *rq)
+{
+ return nvme_cid_install_genctr(nvme_req(rq)->genctr) | rq->tag;
+}
+
+static inline struct request *nvme_find_rq(struct blk_mq_tags *tags,
+ u16 command_id)
+{
+ u8 genctr = nvme_genctr_from_cid(command_id);
+ u16 tag = nvme_tag_from_cid(command_id);
+ struct request *rq;
+
+ rq = blk_mq_tag_to_rq(tags, tag);
+ if (unlikely(!rq)) {
+ pr_err("could not locate request for tag %#x\n",
+ tag);
+ return NULL;
+ }
+ if (unlikely(nvme_genctr_mask(nvme_req(rq)->genctr) != genctr)) {
+ dev_err(nvme_req(rq)->ctrl->device,
+ "request %#x genctr mismatch (got %#x expected %#x)\n",
+ tag, genctr, nvme_genctr_mask(nvme_req(rq)->genctr));
+ return NULL;
+ }
+ return rq;
+}
+
+static inline struct request *nvme_cid_to_rq(struct blk_mq_tags *tags,
+ u16 command_id)
+{
+ return blk_mq_tag_to_rq(tags, nvme_tag_from_cid(command_id));
+}
+
#ifdef CONFIG_FAULT_INJECTION_DEBUG_FS
void nvme_fault_inject_init(struct nvme_fault_inject *fault_inj,
const char *dev_name);
static inline bool nvme_is_aen_req(u16 qid, __u16 command_id)
{
- return !qid && command_id >= NVME_AQ_BLK_MQ_DEPTH;
+ return !qid &&
+ nvme_tag_from_cid(command_id) >= NVME_AQ_BLK_MQ_DEPTH;
}
void nvme_complete_rq(struct request *req);
}
#endif
-#ifdef CONFIG_NVM
-int nvme_nvm_register(struct nvme_ns *ns, char *disk_name, int node);
-void nvme_nvm_unregister(struct nvme_ns *ns);
-extern const struct attribute_group nvme_nvm_attr_group;
-int nvme_nvm_ioctl(struct nvme_ns *ns, unsigned int cmd, void __user *argp);
-#else
-static inline int nvme_nvm_register(struct nvme_ns *ns, char *disk_name,
- int node)
-{
- return 0;
-}
-
-static inline void nvme_nvm_unregister(struct nvme_ns *ns) {};
-static inline int nvme_nvm_ioctl(struct nvme_ns *ns, unsigned int cmd,
- void __user *argp)
-{
- return -ENOTTY;
-}
-#endif /* CONFIG_NVM */
-
static inline struct nvme_ns *nvme_get_ns_from_dev(struct device *dev)
{
return dev_to_disk(dev)->private_data;
"Use SGLs when average request segment size is larger or equal to "
"this size. Use 0 to disable SGLs.");
+#define NVME_PCI_MIN_QUEUE_SIZE 2
+#define NVME_PCI_MAX_QUEUE_SIZE 4095
static int io_queue_depth_set(const char *val, const struct kernel_param *kp);
static const struct kernel_param_ops io_queue_depth_ops = {
.set = io_queue_depth_set,
static unsigned int io_queue_depth = 1024;
module_param_cb(io_queue_depth, &io_queue_depth_ops, &io_queue_depth, 0644);
-MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2");
+MODULE_PARM_DESC(io_queue_depth, "set io queue depth, should >= 2 and < 4096");
static int io_queue_count_set(const char *val, const struct kernel_param *kp)
{
u32 cmbloc;
struct nvme_ctrl ctrl;
u32 last_ps;
+ bool hmb;
mempool_t *iod_mempool;
unsigned int nr_allocated_queues;
unsigned int nr_write_queues;
unsigned int nr_poll_queues;
+
+ bool attrs_added;
};
static int io_queue_depth_set(const char *val, const struct kernel_param *kp)
{
- int ret;
- u32 n;
-
- ret = kstrtou32(val, 10, &n);
- if (ret != 0 || n < 2)
- return -EINVAL;
-
- return param_set_uint(val, kp);
+ return param_set_uint_minmax(val, kp, NVME_PCI_MIN_QUEUE_SIZE,
+ NVME_PCI_MAX_QUEUE_SIZE);
}
static inline unsigned int sq_idx(unsigned int qid, u32 stride)
return;
}
- req = blk_mq_tag_to_rq(nvme_queue_tagset(nvmeq), command_id);
+ req = nvme_find_rq(nvme_queue_tagset(nvmeq), command_id);
if (unlikely(!req)) {
dev_warn(nvmeq->dev->ctrl.device,
"invalid id %d completed on queue %d\n",
return ret >= 0 ? 0 : ret;
}
-static ssize_t nvme_cmb_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct nvme_dev *ndev = to_nvme_dev(dev_get_drvdata(dev));
-
- return scnprintf(buf, PAGE_SIZE, "cmbloc : x%08x\ncmbsz : x%08x\n",
- ndev->cmbloc, ndev->cmbsz);
-}
-static DEVICE_ATTR(cmb, S_IRUGO, nvme_cmb_show, NULL);
-
static u64 nvme_cmb_size_unit(struct nvme_dev *dev)
{
u8 szu = (dev->cmbsz >> NVME_CMBSZ_SZU_SHIFT) & NVME_CMBSZ_SZU_MASK;
if ((dev->cmbsz & (NVME_CMBSZ_WDS | NVME_CMBSZ_RDS)) ==
(NVME_CMBSZ_WDS | NVME_CMBSZ_RDS))
pci_p2pmem_publish(pdev, true);
-
- if (sysfs_add_file_to_group(&dev->ctrl.device->kobj,
- &dev_attr_cmb.attr, NULL))
- dev_warn(dev->ctrl.device,
- "failed to add sysfs attribute for CMB\n");
-}
-
-static inline void nvme_release_cmb(struct nvme_dev *dev)
-{
- if (dev->cmb_size) {
- sysfs_remove_file_from_group(&dev->ctrl.device->kobj,
- &dev_attr_cmb.attr, NULL);
- dev->cmb_size = 0;
- }
}
static int nvme_set_host_mem(struct nvme_dev *dev, u32 bits)
dev_warn(dev->ctrl.device,
"failed to set host mem (err %d, flags %#x).\n",
ret, bits);
- }
+ } else
+ dev->hmb = bits & NVME_HOST_MEM_ENABLE;
+
return ret;
}
return ret;
}
+static ssize_t cmb_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct nvme_dev *ndev = to_nvme_dev(dev_get_drvdata(dev));
+
+ return sysfs_emit(buf, "cmbloc : x%08x\ncmbsz : x%08x\n",
+ ndev->cmbloc, ndev->cmbsz);
+}
+static DEVICE_ATTR_RO(cmb);
+
+static ssize_t cmbloc_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct nvme_dev *ndev = to_nvme_dev(dev_get_drvdata(dev));
+
+ return sysfs_emit(buf, "%u\n", ndev->cmbloc);
+}
+static DEVICE_ATTR_RO(cmbloc);
+
+static ssize_t cmbsz_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct nvme_dev *ndev = to_nvme_dev(dev_get_drvdata(dev));
+
+ return sysfs_emit(buf, "%u\n", ndev->cmbsz);
+}
+static DEVICE_ATTR_RO(cmbsz);
+
+static ssize_t hmb_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct nvme_dev *ndev = to_nvme_dev(dev_get_drvdata(dev));
+
+ return sysfs_emit(buf, "%d\n", ndev->hmb);
+}
+
+static ssize_t hmb_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct nvme_dev *ndev = to_nvme_dev(dev_get_drvdata(dev));
+ bool new;
+ int ret;
+
+ if (strtobool(buf, &new) < 0)
+ return -EINVAL;
+
+ if (new == ndev->hmb)
+ return count;
+
+ if (new) {
+ ret = nvme_setup_host_mem(ndev);
+ } else {
+ ret = nvme_set_host_mem(ndev, 0);
+ if (!ret)
+ nvme_free_host_mem(ndev);
+ }
+
+ if (ret < 0)
+ return ret;
+
+ return count;
+}
+static DEVICE_ATTR_RW(hmb);
+
+static umode_t nvme_pci_attrs_are_visible(struct kobject *kobj,
+ struct attribute *a, int n)
+{
+ struct nvme_ctrl *ctrl =
+ dev_get_drvdata(container_of(kobj, struct device, kobj));
+ struct nvme_dev *dev = to_nvme_dev(ctrl);
+
+ if (a == &dev_attr_cmb.attr ||
+ a == &dev_attr_cmbloc.attr ||
+ a == &dev_attr_cmbsz.attr) {
+ if (!dev->cmbsz)
+ return 0;
+ }
+ if (a == &dev_attr_hmb.attr && !ctrl->hmpre)
+ return 0;
+
+ return a->mode;
+}
+
+static struct attribute *nvme_pci_attrs[] = {
+ &dev_attr_cmb.attr,
+ &dev_attr_cmbloc.attr,
+ &dev_attr_cmbsz.attr,
+ &dev_attr_hmb.attr,
+ NULL,
+};
+
+static const struct attribute_group nvme_pci_attr_group = {
+ .attrs = nvme_pci_attrs,
+ .is_visible = nvme_pci_attrs_are_visible,
+};
+
/*
* nirqs is the number of interrupts available for write and read
* queues. The core already reserved an interrupt for the admin queue.
goto out;
}
+ if (!dev->attrs_added && !sysfs_create_group(&dev->ctrl.device->kobj,
+ &nvme_pci_attr_group))
+ dev->attrs_added = true;
+
nvme_start_ctrl(&dev->ctrl);
return;
nvme_disable_prepare_reset(dev, true);
}
+static void nvme_remove_attrs(struct nvme_dev *dev)
+{
+ if (dev->attrs_added)
+ sysfs_remove_group(&dev->ctrl.device->kobj,
+ &nvme_pci_attr_group);
+}
+
/*
* The driver's remove may be called on a device in a partially initialized
* state. This function must not have any dependencies on the device state in
nvme_stop_ctrl(&dev->ctrl);
nvme_remove_namespaces(&dev->ctrl);
nvme_dev_disable(dev, true);
- nvme_release_cmb(dev);
+ nvme_remove_attrs(dev);
nvme_free_host_mem(dev);
nvme_dev_remove_admin(dev);
nvme_free_queues(dev, 0);
if (ndev->last_ps == U32_MAX ||
nvme_set_power_state(ctrl, ndev->last_ps) != 0)
- return nvme_try_sched_reset(&ndev->ctrl);
+ goto reset;
+ if (ctrl->hmpre && nvme_setup_host_mem(ndev))
+ goto reset;
+
return 0;
+reset:
+ return nvme_try_sched_reset(ctrl);
}
static int nvme_suspend(struct device *dev)
* the PCI bus layer to put it into D3 in order to take the PCIe link
* down, so as to allow the platform to achieve its minimum low-power
* state (which may not be possible if the link is up).
- *
- * If a host memory buffer is enabled, shut down the device as the NVMe
- * specification allows the device to access the host memory buffer in
- * host DRAM from all power states, but hosts will fail access to DRAM
- * during S3.
*/
if (pm_suspend_via_firmware() || !ctrl->npss ||
!pcie_aspm_enabled(pdev) ||
- ndev->nr_host_mem_descs ||
(ndev->ctrl.quirks & NVME_QUIRK_SIMPLE_SUSPEND))
return nvme_disable_prepare_reset(ndev, true);
if (ctrl->state != NVME_CTRL_LIVE)
goto unfreeze;
+ /*
+ * Host memory access may not be successful in a system suspend state,
+ * but the specification allows the controller to access memory in a
+ * non-operational power state.
+ */
+ if (ndev->hmb) {
+ ret = nvme_set_host_mem(ndev, 0);
+ if (ret < 0)
+ goto unfreeze;
+ }
+
ret = nvme_get_power_state(ctrl, &ndev->last_ps);
if (ret < 0)
goto unfreeze;
{ PCI_DEVICE(0x1b4b, 0x1092), /* Lexar 256 GB SSD */
.driver_data = NVME_QUIRK_NO_NS_DESC_LIST |
NVME_QUIRK_IGNORE_DEV_SUBNQN, },
- { PCI_DEVICE(0x1d1d, 0x1f1f), /* LighNVM qemu device */
- .driver_data = NVME_QUIRK_LIGHTNVM, },
- { PCI_DEVICE(0x1d1d, 0x2807), /* CNEX WL */
- .driver_data = NVME_QUIRK_LIGHTNVM, },
- { PCI_DEVICE(0x1d1d, 0x2601), /* CNEX Granby */
- .driver_data = NVME_QUIRK_LIGHTNVM, },
{ PCI_DEVICE(0x10ec, 0x5762), /* ADATA SX6000LNP */
.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
{ PCI_DEVICE(0x1cc1, 0x8201), /* ADATA SX8200PNP 512GB */
if (ret)
return ret;
- ctrl->ctrl.queue_count = nr_io_queues + 1;
- if (ctrl->ctrl.queue_count < 2) {
+ if (nr_io_queues == 0) {
dev_err(ctrl->ctrl.device,
"unable to set any I/O queues\n");
return -ENOMEM;
}
+ ctrl->ctrl.queue_count = nr_io_queues + 1;
dev_info(ctrl->ctrl.device,
"creating %d I/O queues.\n", nr_io_queues);
struct request *rq;
struct nvme_rdma_request *req;
- rq = blk_mq_tag_to_rq(nvme_rdma_tagset(queue), cqe->command_id);
+ rq = nvme_find_rq(nvme_rdma_tagset(queue), cqe->command_id);
if (!rq) {
dev_err(queue->ctrl->ctrl.device,
- "tag 0x%x on QP %#x not found\n",
+ "got bad command_id %#x on QP %#x\n",
cqe->command_id, queue->qp->qp_num);
nvme_rdma_error_recovery(queue->ctrl);
return;
{
struct request *rq;
- rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), cqe->command_id);
+ rq = nvme_find_rq(nvme_tcp_tagset(queue), cqe->command_id);
if (!rq) {
dev_err(queue->ctrl->ctrl.device,
- "queue %d tag 0x%x not found\n",
- nvme_tcp_queue_id(queue), cqe->command_id);
+ "got bad cqe.command_id %#x on queue %d\n",
+ cqe->command_id, nvme_tcp_queue_id(queue));
nvme_tcp_error_recovery(&queue->ctrl->ctrl);
return -EINVAL;
}
{
struct request *rq;
- rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+ rq = nvme_find_rq(nvme_tcp_tagset(queue), pdu->command_id);
if (!rq) {
dev_err(queue->ctrl->ctrl.device,
- "queue %d tag %#x not found\n",
- nvme_tcp_queue_id(queue), pdu->command_id);
+ "got bad c2hdata.command_id %#x on queue %d\n",
+ pdu->command_id, nvme_tcp_queue_id(queue));
return -ENOENT;
}
data->hdr.plen =
cpu_to_le32(data->hdr.hlen + hdgst + req->pdu_len + ddgst);
data->ttag = pdu->ttag;
- data->command_id = rq->tag;
+ data->command_id = nvme_cid(rq);
data->data_offset = cpu_to_le32(req->data_sent);
data->data_length = cpu_to_le32(req->pdu_len);
return 0;
struct request *rq;
int ret;
- rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+ rq = nvme_find_rq(nvme_tcp_tagset(queue), pdu->command_id);
if (!rq) {
dev_err(queue->ctrl->ctrl.device,
- "queue %d tag %#x not found\n",
- nvme_tcp_queue_id(queue), pdu->command_id);
+ "got bad r2t.command_id %#x on queue %d\n",
+ pdu->command_id, nvme_tcp_queue_id(queue));
return -ENOENT;
}
req = blk_mq_rq_to_pdu(rq);
unsigned int *offset, size_t *len)
{
struct nvme_tcp_data_pdu *pdu = (void *)queue->pdu;
- struct nvme_tcp_request *req;
- struct request *rq;
-
- rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
- if (!rq) {
- dev_err(queue->ctrl->ctrl.device,
- "queue %d tag %#x not found\n",
- nvme_tcp_queue_id(queue), pdu->command_id);
- return -ENOENT;
- }
- req = blk_mq_rq_to_pdu(rq);
+ struct request *rq =
+ nvme_cid_to_rq(nvme_tcp_tagset(queue), pdu->command_id);
+ struct nvme_tcp_request *req = blk_mq_rq_to_pdu(rq);
while (true) {
int recv_len, ret;
}
if (pdu->hdr.flags & NVME_TCP_F_DATA_SUCCESS) {
- struct request *rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue),
- pdu->command_id);
+ struct request *rq = nvme_cid_to_rq(nvme_tcp_tagset(queue),
+ pdu->command_id);
nvme_tcp_end_request(rq, NVME_SC_SUCCESS);
queue->nr_cqe++;
sock_release(queue->sock);
kfree(queue->pdu);
+ mutex_destroy(&queue->send_mutex);
mutex_destroy(&queue->queue_lock);
}
sock_release(queue->sock);
queue->sock = NULL;
err_destroy_mutex:
+ mutex_destroy(&queue->send_mutex);
mutex_destroy(&queue->queue_lock);
return ret;
}
if (ret)
return ret;
- ctrl->queue_count = nr_io_queues + 1;
- if (ctrl->queue_count < 2) {
+ if (nr_io_queues == 0) {
dev_err(ctrl->device,
"unable to set any I/O queues\n");
return -ENOMEM;
}
+ ctrl->queue_count = nr_io_queues + 1;
dev_info(ctrl->device,
"creating %d I/O queues.\n", nr_io_queues);
return ret;
}
+static const char *nvme_trace_admin_set_features(struct trace_seq *p,
+ u8 *cdw10)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+ u8 fid = cdw10[0];
+ u8 sv = cdw10[3] & 0x8;
+ u32 cdw11 = get_unaligned_le32(cdw10 + 4);
+
+ trace_seq_printf(p, "fid=0x%x, sv=0x%x, cdw11=0x%x", fid, sv, cdw11);
+ trace_seq_putc(p, 0);
+
+ return ret;
+}
+
static const char *nvme_trace_admin_get_features(struct trace_seq *p,
u8 *cdw10)
{
u8 sel = cdw10[1] & 0x7;
u32 cdw11 = get_unaligned_le32(cdw10 + 4);
- trace_seq_printf(p, "fid=0x%x sel=0x%x cdw11=0x%x", fid, sel, cdw11);
+ trace_seq_printf(p, "fid=0x%x, sel=0x%x, cdw11=0x%x", fid, sel, cdw11);
trace_seq_putc(p, 0);
return ret;
return nvme_trace_create_cq(p, cdw10);
case nvme_admin_identify:
return nvme_trace_admin_identify(p, cdw10);
+ case nvme_admin_set_features:
+ return nvme_trace_admin_set_features(p, cdw10);
case nvme_admin_get_features:
return nvme_trace_admin_get_features(p, cdw10);
case nvme_admin_get_lba_status:
config NVME_TARGET_LOOP
tristate "NVMe loopback device support"
depends on NVME_TARGET
- select NVME_CORE
select NVME_FABRICS
select SG_POOL
help
config NVME_TARGET_FCLOOP
tristate "NVMe over Fabrics FC Transport Loopback Test driver"
depends on NVME_TARGET
- select NVME_CORE
select NVME_FABRICS
select SG_POOL
depends on NVME_FC
* controller teardown as a result of a keep-alive expiration.
*/
ctrl->reset_tbkas = true;
+ sq->ctrl->sqs[sq->qid] = NULL;
nvmet_ctrl_put(ctrl);
sq->ctrl = NULL; /* allows reusing the queue later */
}
u16 qid = le16_to_cpu(c->qid);
u16 sqsize = le16_to_cpu(c->sqsize);
struct nvmet_ctrl *old;
+ u16 mqes = NVME_CAP_MQES(ctrl->cap);
u16 ret;
- old = cmpxchg(&req->sq->ctrl, NULL, ctrl);
- if (old) {
- pr_warn("queue already connected!\n");
- req->error_loc = offsetof(struct nvmf_connect_command, opcode);
- return NVME_SC_CONNECT_CTRL_BUSY | NVME_SC_DNR;
- }
if (!sqsize) {
pr_warn("queue size zero!\n");
req->error_loc = offsetof(struct nvmf_connect_command, sqsize);
+ req->cqe->result.u32 = IPO_IATTR_CONNECT_SQE(sqsize);
ret = NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR;
goto err;
}
+ if (ctrl->sqs[qid] != NULL) {
+ pr_warn("qid %u has already been created\n", qid);
+ req->error_loc = offsetof(struct nvmf_connect_command, qid);
+ return NVME_SC_CMD_SEQ_ERROR | NVME_SC_DNR;
+ }
+
+ if (sqsize > mqes) {
+ pr_warn("sqsize %u is larger than MQES supported %u cntlid %d\n",
+ sqsize, mqes, ctrl->cntlid);
+ req->error_loc = offsetof(struct nvmf_connect_command, sqsize);
+ req->cqe->result.u32 = IPO_IATTR_CONNECT_SQE(sqsize);
+ return NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR;
+ }
+
+ old = cmpxchg(&req->sq->ctrl, NULL, ctrl);
+ if (old) {
+ pr_warn("queue already connected!\n");
+ req->error_loc = offsetof(struct nvmf_connect_command, opcode);
+ return NVME_SC_CONNECT_CTRL_BUSY | NVME_SC_DNR;
+ }
+
/* note: convert queue size from 0's-based value to 1's-based value */
nvmet_cq_setup(ctrl, req->cq, qid, sqsize + 1);
nvmet_sq_setup(ctrl, req->sq, qid, sqsize + 1);
if (ret) {
pr_err("failed to install queue %d cntlid %d ret %x\n",
qid, ctrl->cntlid, ret);
+ ctrl->sqs[qid] = NULL;
goto err;
}
}
}
status = nvmet_install_queue(ctrl, req);
- if (status) {
- /* pass back cntlid that had the issue of installing queue */
- req->cqe->result.u16 = cpu_to_le16(ctrl->cntlid);
+ if (status)
goto out_ctrl_put;
- }
+
+ /* pass back cntlid for successful completion */
+ req->cqe->result.u16 = cpu_to_le16(ctrl->cntlid);
pr_debug("adding queue %d to ctrl %d.\n", qid, ctrl->cntlid);
} else {
struct request *rq;
- rq = blk_mq_tag_to_rq(nvme_loop_tagset(queue), cqe->command_id);
+ rq = nvme_find_rq(nvme_loop_tagset(queue), cqe->command_id);
if (!rq) {
dev_err(queue->ctrl->ctrl.device,
- "tag 0x%x on queue %d not found\n",
+ "got bad command_id %#x on queue %d\n",
cqe->command_id, nvme_loop_queue_idx(queue));
return;
}
u8 sel = cdw10[1] & 0x7;
u32 cdw11 = get_unaligned_le32(cdw10 + 4);
- trace_seq_printf(p, "fid=0x%x sel=0x%x cdw11=0x%x", fid, sel, cdw11);
+ trace_seq_printf(p, "fid=0x%x, sel=0x%x, cdw11=0x%x", fid, sel, cdw11);
trace_seq_putc(p, 0);
return ret;
return ret;
}
+static const char *nvmet_trace_admin_set_features(struct trace_seq *p,
+ u8 *cdw10)
+{
+ const char *ret = trace_seq_buffer_ptr(p);
+ u8 fid = cdw10[0];
+ u8 sv = cdw10[3] & 0x8;
+ u32 cdw11 = get_unaligned_le32(cdw10 + 4);
+
+ trace_seq_printf(p, "fid=0x%x, sv=0x%x, cdw11=0x%x", fid, sv, cdw11);
+ trace_seq_putc(p, 0);
+
+ return ret;
+}
+
static const char *nvmet_trace_read_write(struct trace_seq *p, u8 *cdw10)
{
const char *ret = trace_seq_buffer_ptr(p);
switch (opcode) {
case nvme_admin_identify:
return nvmet_trace_admin_identify(p, cdw10);
+ case nvme_admin_set_features:
+ return nvmet_trace_admin_set_features(p, cdw10);
case nvme_admin_get_features:
return nvmet_trace_admin_get_features(p, cdw10);
case nvme_admin_get_lba_status:
}
status = nvmet_req_find_ns(req);
- if (status) {
- status = NVME_SC_INTERNAL;
+ if (status)
goto done;
- }
if (!bdev_is_zoned(req->ns->bdev)) {
req->error_loc = offsetof(struct nvme_identify, nsid);
- status = NVME_SC_INVALID_NS | NVME_SC_DNR;
goto done;
}
if (!required_opp_tables)
return 0;
+ /* required-opps not fully initialized yet */
+ if (lazy_linking_pending(opp_table))
+ return -EBUSY;
+
/*
* We only support genpd's OPPs in the "required-opps" for now, as we
* don't know much about other use cases. Error out if the required OPP
return -ENOENT;
}
- /* required-opps not fully initialized yet */
- if (lazy_linking_pending(opp_table))
- return -EBUSY;
-
/* Single genpd case */
if (!genpd_virt_devs)
return _set_required_opp(dev, dev, opp, 0);
return default_restore_msi_irqs(dev);
}
-static inline __attribute_const__ u32 msi_mask(unsigned x)
-{
- /* Don't shift by >= width of type */
- if (x >= 5)
- return 0xffffffff;
- return (1 << (1 << x)) - 1;
-}
-
/*
* PCI 2.3 does not specify mask bits for each MSI interrupt. Attempting to
* mask all MSI interrupts by clearing the MSI enable bit does not work
* reliably as devices without an INTx disable bit will then generate a
* level IRQ which will never be cleared.
*/
-void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+static inline __attribute_const__ u32 msi_multi_mask(struct msi_desc *desc)
+{
+ /* Don't shift by >= width of type */
+ if (desc->msi_attrib.multi_cap >= 5)
+ return 0xffffffff;
+ return (1 << (1 << desc->msi_attrib.multi_cap)) - 1;
+}
+
+static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, u32 set)
{
raw_spinlock_t *lock = &desc->dev->msi_lock;
unsigned long flags;
- if (pci_msi_ignore_mask || !desc->msi_attrib.maskbit)
- return;
-
raw_spin_lock_irqsave(lock, flags);
- desc->masked &= ~mask;
- desc->masked |= flag;
+ desc->msi_mask &= ~clear;
+ desc->msi_mask |= set;
pci_write_config_dword(msi_desc_to_pci_dev(desc), desc->mask_pos,
- desc->masked);
+ desc->msi_mask);
raw_spin_unlock_irqrestore(lock, flags);
}
-static void msi_mask_irq(struct msi_desc *desc, u32 mask, u32 flag)
+static inline void pci_msi_mask(struct msi_desc *desc, u32 mask)
{
- __pci_msi_desc_mask_irq(desc, mask, flag);
+ pci_msi_update_mask(desc, 0, mask);
}
-static void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
+static inline void pci_msi_unmask(struct msi_desc *desc, u32 mask)
{
- if (desc->msi_attrib.is_virtual)
- return NULL;
+ pci_msi_update_mask(desc, mask, 0);
+}
- return desc->mask_base +
- desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
+static inline void __iomem *pci_msix_desc_addr(struct msi_desc *desc)
+{
+ return desc->mask_base + desc->msi_attrib.entry_nr * PCI_MSIX_ENTRY_SIZE;
}
/*
- * This internal function does not flush PCI writes to the device.
- * All users must ensure that they read from the device before either
- * assuming that the device state is up to date, or returning out of this
- * file. This saves a few milliseconds when initialising devices with lots
- * of MSI-X interrupts.
+ * This internal function does not flush PCI writes to the device. All
+ * users must ensure that they read from the device before either assuming
+ * that the device state is up to date, or returning out of this file.
+ * It does not affect the msi_desc::msix_ctrl cache either. Use with care!
*/
-u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag)
+static void pci_msix_write_vector_ctrl(struct msi_desc *desc, u32 ctrl)
{
- u32 mask_bits = desc->masked;
- void __iomem *desc_addr;
-
- if (pci_msi_ignore_mask)
- return 0;
-
- desc_addr = pci_msix_desc_addr(desc);
- if (!desc_addr)
- return 0;
+ void __iomem *desc_addr = pci_msix_desc_addr(desc);
- mask_bits &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
- if (flag & PCI_MSIX_ENTRY_CTRL_MASKBIT)
- mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+ writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+}
- writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+static inline void pci_msix_mask(struct msi_desc *desc)
+{
+ desc->msix_ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
+ pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
+ /* Flush write to device */
+ readl(desc->mask_base);
+}
- return mask_bits;
+static inline void pci_msix_unmask(struct msi_desc *desc)
+{
+ desc->msix_ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
+ pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
}
-static void msix_mask_irq(struct msi_desc *desc, u32 flag)
+static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
{
- desc->masked = __pci_msix_desc_mask_irq(desc, flag);
+ if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+ return;
+
+ if (desc->msi_attrib.is_msix)
+ pci_msix_mask(desc);
+ else if (desc->msi_attrib.maskbit)
+ pci_msi_mask(desc, mask);
}
-static void msi_set_mask_bit(struct irq_data *data, u32 flag)
+static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
{
- struct msi_desc *desc = irq_data_get_msi_desc(data);
+ if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+ return;
- if (desc->msi_attrib.is_msix) {
- msix_mask_irq(desc, flag);
- readl(desc->mask_base); /* Flush write to device */
- } else {
- unsigned offset = data->irq - desc->irq;
- msi_mask_irq(desc, 1 << offset, flag << offset);
- }
+ if (desc->msi_attrib.is_msix)
+ pci_msix_unmask(desc);
+ else if (desc->msi_attrib.maskbit)
+ pci_msi_unmask(desc, mask);
}
/**
*/
void pci_msi_mask_irq(struct irq_data *data)
{
- msi_set_mask_bit(data, 1);
+ struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+ __pci_msi_mask_desc(desc, BIT(data->irq - desc->irq));
}
EXPORT_SYMBOL_GPL(pci_msi_mask_irq);
*/
void pci_msi_unmask_irq(struct irq_data *data)
{
- msi_set_mask_bit(data, 0);
+ struct msi_desc *desc = irq_data_get_msi_desc(data);
+
+ __pci_msi_unmask_desc(desc, BIT(data->irq - desc->irq));
}
EXPORT_SYMBOL_GPL(pci_msi_unmask_irq);
if (entry->msi_attrib.is_msix) {
void __iomem *base = pci_msix_desc_addr(entry);
- if (!base) {
- WARN_ON(1);
+ if (WARN_ON_ONCE(entry->msi_attrib.is_virtual))
return;
- }
msg->address_lo = readl(base + PCI_MSIX_ENTRY_LOWER_ADDR);
msg->address_hi = readl(base + PCI_MSIX_ENTRY_UPPER_ADDR);
/* Don't touch the hardware now */
} else if (entry->msi_attrib.is_msix) {
void __iomem *base = pci_msix_desc_addr(entry);
- bool unmasked = !(entry->masked & PCI_MSIX_ENTRY_CTRL_MASKBIT);
+ u32 ctrl = entry->msix_ctrl;
+ bool unmasked = !(ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT);
- if (!base)
+ if (entry->msi_attrib.is_virtual)
goto skip;
/*
* undefined."
*/
if (unmasked)
- __pci_msix_desc_mask_irq(entry, PCI_MSIX_ENTRY_CTRL_MASKBIT);
+ pci_msix_write_vector_ctrl(entry, ctrl | PCI_MSIX_ENTRY_CTRL_MASKBIT);
writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
writel(msg->address_hi, base + PCI_MSIX_ENTRY_UPPER_ADDR);
writel(msg->data, base + PCI_MSIX_ENTRY_DATA);
if (unmasked)
- __pci_msix_desc_mask_irq(entry, 0);
+ pci_msix_write_vector_ctrl(entry, ctrl);
/* Ensure that the writes are visible in the device */
readl(base + PCI_MSIX_ENTRY_DATA);
{
struct list_head *msi_list = dev_to_msi_list(&dev->dev);
struct msi_desc *entry, *tmp;
- struct attribute **msi_attrs;
- struct device_attribute *dev_attr;
- int i, count = 0;
+ int i;
for_each_pci_msi_entry(entry, dev)
if (entry->irq)
}
if (dev->msi_irq_groups) {
- sysfs_remove_groups(&dev->dev.kobj, dev->msi_irq_groups);
- msi_attrs = dev->msi_irq_groups[0]->attrs;
- while (msi_attrs[count]) {
- dev_attr = container_of(msi_attrs[count],
- struct device_attribute, attr);
- kfree(dev_attr->attr.name);
- kfree(dev_attr);
- ++count;
- }
- kfree(msi_attrs);
- kfree(dev->msi_irq_groups[0]);
- kfree(dev->msi_irq_groups);
+ msi_destroy_sysfs(&dev->dev, dev->msi_irq_groups);
dev->msi_irq_groups = NULL;
}
}
arch_restore_msi_irqs(dev);
pci_read_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, &control);
- msi_mask_irq(entry, msi_mask(entry->msi_attrib.multi_cap),
- entry->masked);
+ pci_msi_update_mask(entry, 0, 0);
control &= ~PCI_MSI_FLAGS_QSIZE;
control |= (entry->msi_attrib.multiple << 4) | PCI_MSI_FLAGS_ENABLE;
pci_write_config_word(dev, dev->msi_cap + PCI_MSI_FLAGS, control);
arch_restore_msi_irqs(dev);
for_each_pci_msi_entry(entry, dev)
- msix_mask_irq(entry, entry->masked);
+ pci_msix_write_vector_ctrl(entry, entry->msix_ctrl);
pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
}
}
EXPORT_SYMBOL_GPL(pci_restore_msi_state);
-static ssize_t msi_mode_show(struct device *dev, struct device_attribute *attr,
- char *buf)
-{
- struct msi_desc *entry;
- unsigned long irq;
- int retval;
-
- retval = kstrtoul(attr->attr.name, 10, &irq);
- if (retval)
- return retval;
-
- entry = irq_get_msi_desc(irq);
- if (!entry)
- return -ENODEV;
-
- return sysfs_emit(buf, "%s\n",
- entry->msi_attrib.is_msix ? "msix" : "msi");
-}
-
-static int populate_msi_sysfs(struct pci_dev *pdev)
-{
- struct attribute **msi_attrs;
- struct attribute *msi_attr;
- struct device_attribute *msi_dev_attr;
- struct attribute_group *msi_irq_group;
- const struct attribute_group **msi_irq_groups;
- struct msi_desc *entry;
- int ret = -ENOMEM;
- int num_msi = 0;
- int count = 0;
- int i;
-
- /* Determine how many msi entries we have */
- for_each_pci_msi_entry(entry, pdev)
- num_msi += entry->nvec_used;
- if (!num_msi)
- return 0;
-
- /* Dynamically create the MSI attributes for the PCI device */
- msi_attrs = kcalloc(num_msi + 1, sizeof(void *), GFP_KERNEL);
- if (!msi_attrs)
- return -ENOMEM;
- for_each_pci_msi_entry(entry, pdev) {
- for (i = 0; i < entry->nvec_used; i++) {
- msi_dev_attr = kzalloc(sizeof(*msi_dev_attr), GFP_KERNEL);
- if (!msi_dev_attr)
- goto error_attrs;
- msi_attrs[count] = &msi_dev_attr->attr;
-
- sysfs_attr_init(&msi_dev_attr->attr);
- msi_dev_attr->attr.name = kasprintf(GFP_KERNEL, "%d",
- entry->irq + i);
- if (!msi_dev_attr->attr.name)
- goto error_attrs;
- msi_dev_attr->attr.mode = S_IRUGO;
- msi_dev_attr->show = msi_mode_show;
- ++count;
- }
- }
-
- msi_irq_group = kzalloc(sizeof(*msi_irq_group), GFP_KERNEL);
- if (!msi_irq_group)
- goto error_attrs;
- msi_irq_group->name = "msi_irqs";
- msi_irq_group->attrs = msi_attrs;
-
- msi_irq_groups = kcalloc(2, sizeof(void *), GFP_KERNEL);
- if (!msi_irq_groups)
- goto error_irq_group;
- msi_irq_groups[0] = msi_irq_group;
-
- ret = sysfs_create_groups(&pdev->dev.kobj, msi_irq_groups);
- if (ret)
- goto error_irq_groups;
- pdev->msi_irq_groups = msi_irq_groups;
-
- return 0;
-
-error_irq_groups:
- kfree(msi_irq_groups);
-error_irq_group:
- kfree(msi_irq_group);
-error_attrs:
- count = 0;
- msi_attr = msi_attrs[count];
- while (msi_attr) {
- msi_dev_attr = container_of(msi_attr, struct device_attribute, attr);
- kfree(msi_attr->name);
- kfree(msi_dev_attr);
- ++count;
- msi_attr = msi_attrs[count];
- }
- kfree(msi_attrs);
- return ret;
-}
-
static struct msi_desc *
msi_setup_entry(struct pci_dev *dev, int nvec, struct irq_affinity *affd)
{
/* Save the initial mask status */
if (entry->msi_attrib.maskbit)
- pci_read_config_dword(dev, entry->mask_pos, &entry->masked);
+ pci_read_config_dword(dev, entry->mask_pos, &entry->msi_mask);
out:
kfree(masks);
{
struct msi_desc *entry;
+ if (!dev->no_64bit_msi)
+ return 0;
+
for_each_pci_msi_entry(entry, dev) {
- if (entry->msg.address_hi && dev->no_64bit_msi) {
+ if (entry->msg.address_hi) {
pci_err(dev, "arch assigned 64-bit MSI address %#x%08x but device only supports 32 bits\n",
entry->msg.address_hi, entry->msg.address_lo);
return -EIO;
{
struct msi_desc *entry;
int ret;
- unsigned mask;
pci_msi_set_enable(dev, 0); /* Disable MSI during set up */
return -ENOMEM;
/* All MSIs are unmasked by default; mask them all */
- mask = msi_mask(entry->msi_attrib.multi_cap);
- msi_mask_irq(entry, mask, mask);
+ pci_msi_mask(entry, msi_multi_mask(entry));
list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
/* Configure MSI capability structure */
ret = pci_msi_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSI);
- if (ret) {
- msi_mask_irq(entry, mask, 0);
- free_msi_irqs(dev);
- return ret;
- }
+ if (ret)
+ goto err;
ret = msi_verify_entries(dev);
- if (ret) {
- msi_mask_irq(entry, mask, 0);
- free_msi_irqs(dev);
- return ret;
- }
+ if (ret)
+ goto err;
- ret = populate_msi_sysfs(dev);
- if (ret) {
- msi_mask_irq(entry, mask, 0);
- free_msi_irqs(dev);
- return ret;
+ dev->msi_irq_groups = msi_populate_sysfs(&dev->dev);
+ if (IS_ERR(dev->msi_irq_groups)) {
+ ret = PTR_ERR(dev->msi_irq_groups);
+ goto err;
}
/* Set MSI enabled bits */
pcibios_free_irq(dev);
dev->irq = entry->irq;
return 0;
+
+err:
+ pci_msi_unmask(entry, msi_multi_mask(entry));
+ free_msi_irqs(dev);
+ return ret;
}
static void __iomem *msix_map_region(struct pci_dev *dev, unsigned nr_entries)
entry->msi_attrib.default_irq = dev->irq;
entry->mask_base = base;
- addr = pci_msix_desc_addr(entry);
- if (addr)
- entry->masked = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+ if (!entry->msi_attrib.is_virtual) {
+ addr = pci_msix_desc_addr(entry);
+ entry->msix_ctrl = readl(addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
+ }
list_add_tail(&entry->list, dev_to_msi_list(&dev->dev));
if (masks)
u32 ctrl = PCI_MSIX_ENTRY_CTRL_MASKBIT;
int i;
+ if (pci_msi_ignore_mask)
+ return;
+
for (i = 0; i < tsize; i++, base += PCI_MSIX_ENTRY_SIZE)
writel(ctrl, base + PCI_MSIX_ENTRY_VECTOR_CTRL);
}
msix_update_entries(dev, entries);
- ret = populate_msi_sysfs(dev);
- if (ret)
+ dev->msi_irq_groups = msi_populate_sysfs(&dev->dev);
+ if (IS_ERR(dev->msi_irq_groups)) {
+ ret = PTR_ERR(dev->msi_irq_groups);
goto out_free;
+ }
/* Set MSI-X enabled bits and unmask the function */
pci_intx_for_msi(dev, 0);
static void pci_msi_shutdown(struct pci_dev *dev)
{
struct msi_desc *desc;
- u32 mask;
if (!pci_msi_enable || !dev || !dev->msi_enabled)
return;
dev->msi_enabled = 0;
/* Return the device with MSI unmasked as initial states */
- mask = msi_mask(desc->msi_attrib.multi_cap);
- msi_mask_irq(desc, mask, 0);
+ pci_msi_unmask(desc, msi_multi_mask(desc));
/* Restore dev->irq to its default pin-assertion IRQ */
dev->irq = desc->msi_attrib.default_irq;
/* Return the device with MSI-X masked as initial states */
for_each_pci_msi_entry(entry, dev)
- __pci_msix_desc_mask_irq(entry, 1);
+ pci_msix_mask(entry);
pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_ENABLE, 0);
pci_intx_for_msi(dev, 1);
unsigned int parent = irq_desc_get_irq(desc);
const struct owl_gpio_port *port;
void __iomem *base;
- unsigned int pin, irq, offset = 0, i;
+ unsigned int pin, offset = 0, i;
unsigned long pending_irq;
chained_irq_enter(chip, desc);
pending_irq = readl_relaxed(base + port->intc_pd);
for_each_set_bit(pin, &pending_irq, port->pins) {
- irq = irq_find_mapping(domain, offset + pin);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(domain, offset + pin);
/* clear pending interrupt */
owl_gpio_update_reg(base + port->intc_pd, pin, true);
events &= pc->enabled_irq_map[bank];
for_each_set_bit(offset, &events, 32) {
gpio = (32 * bank) + offset;
- generic_handle_irq(irq_linear_revmap(pc->gpio_chip.irq.domain,
- gpio));
+ generic_handle_domain_irq(pc->gpio_chip.irq.domain,
+ gpio);
}
}
for_each_set_bit(bit, &val, NGPIOS_PER_BANK) {
unsigned pin = NGPIOS_PER_BANK * i + bit;
- int child_irq = irq_find_mapping(gc->irq.domain, pin);
/*
* Clear the interrupt before invoking the
writel(BIT(bit), chip->base + (i * GPIO_BANK_SIZE) +
IPROC_GPIO_INT_CLR_OFFSET);
- generic_handle_irq(child_irq);
+ generic_handle_domain_irq(gc->irq.domain, pin);
}
}
int_bits = level | event;
for_each_set_bit(bit, &int_bits, gc->ngpio)
- generic_handle_irq(
- irq_linear_revmap(gc->irq.domain, bit));
+ generic_handle_domain_irq(gc->irq.domain, bit);
}
return int_bits ? IRQ_HANDLED : IRQ_NONE;
u32 base, pin;
void __iomem *reg;
unsigned long pending;
- unsigned int virq;
/* check from GPIO controller which pin triggered the interrupt */
for (base = 0; base < vg->chip.ngpio; base += 32) {
raw_spin_lock(&byt_lock);
pending = readl(reg);
raw_spin_unlock(&byt_lock);
- for_each_set_bit(pin, &pending, 32) {
- virq = irq_find_mapping(vg->chip.irq.domain, base + pin);
- generic_handle_irq(virq);
- }
+ for_each_set_bit(pin, &pending, 32)
+ generic_handle_domain_irq(vg->chip.irq.domain, base + pin);
}
chip->irq_eoi(data);
}
raw_spin_unlock_irqrestore(&chv_lock, flags);
for_each_set_bit(intr_line, &pending, community->nirqs) {
- unsigned int irq, offset;
+ unsigned int offset;
offset = cctx->intr_lines[intr_line];
- irq = irq_find_mapping(gc->irq.domain, offset);
- generic_handle_irq(irq);
+ generic_handle_domain_irq(gc->irq.domain, offset);
}
chained_irq_exit(chip, desc);
/* Only interrupts that are enabled */
pending = ioread32(reg) & ioread32(ena);
- for_each_set_bit(pin, &pending, 32) {
- unsigned int irq;
-
- irq = irq_find_mapping(lg->chip.irq.domain, base + pin);
- generic_handle_irq(irq);
- }
+ for_each_set_bit(pin, &pending, 32)
+ generic_handle_domain_irq(lg->chip.irq.domain, base + pin);
}
chip->irq_eoi(data);
}
struct irq_chip *chip = irq_desc_get_chip(desc);
struct mtk_eint *eint = irq_desc_get_handler_data(desc);
unsigned int status, eint_num;
- int offset, mask_offset, index, virq;
+ int offset, mask_offset, index;
void __iomem *reg = mtk_eint_get_offset(eint, 0, eint->regs->stat);
int dual_edge, start_level, curr_level;
offset = __ffs(status);
mask_offset = eint_num >> 5;
index = eint_num + offset;
- virq = irq_find_mapping(eint->domain, index);
status &= ~BIT(offset);
/*
index);
}
- generic_handle_irq(virq);
+ generic_handle_domain_irq(eint->domain, index);
if (dual_edge) {
curr_level = mtk_eint_flip_edge(eint, index);
while (status) {
int bit = __ffs(status);
- generic_handle_irq(irq_find_mapping(chip->irq.domain, bit));
+ generic_handle_domain_irq(chip->irq.domain, bit);
status &= ~BIT(bit);
}
sts &= en;
for_each_set_bit(bit, (const void *)&sts, NPCM7XX_GPIO_PER_BANK)
- generic_handle_irq(irq_linear_revmap(gc->irq.domain, bit));
+ generic_handle_domain_irq(gc->irq.domain, bit);
chained_irq_exit(chip, desc);
}
if (!(regval & PIN_IRQ_PENDING) ||
!(regval & BIT(INTERRUPT_MASK_OFF)))
continue;
- irq = irq_find_mapping(gc->irq.domain, irqnr + i);
- if (irq != 0)
- generic_handle_irq(irq);
+ generic_handle_domain_irq(gc->irq.domain, irqnr + i);
/* Clear interrupt.
* We must read the pin register again, in case the
* value was changed while executing
- * generic_handle_irq() above.
+ * generic_handle_domain_irq() above.
* If we didn't find a mapping for the interrupt,
* disable it in order to avoid a system hang caused
* by an interrupt storm.
continue;
}
- for_each_set_bit(n, &isr, BITS_PER_LONG) {
- generic_handle_irq(irq_find_mapping(
- gpio_chip->irq.domain, n));
- }
+ for_each_set_bit(n, &isr, BITS_PER_LONG)
+ generic_handle_domain_irq(gpio_chip->irq.domain, n);
}
chained_irq_exit(chip, desc);
/* now it may re-trigger */
pins = readl(gctrl->membase + GPIO_IRNCR);
for_each_set_bit(offset, &pins, gc->ngpio)
- generic_handle_irq(irq_find_mapping(gc->irq.domain, offset));
+ generic_handle_domain_irq(gc->irq.domain, offset);
chained_irq_exit(ic, desc);
}
flag = ingenic_gpio_read_reg(jzgc, JZ4730_GPIO_GPFR);
for_each_set_bit(i, &flag, 32)
- generic_handle_irq(irq_linear_revmap(gc->irq.domain, i));
+ generic_handle_domain_irq(gc->irq.domain, i);
chained_irq_exit(irq_chip, desc);
}
for_each_set_bit(port, &val, SGPIO_BITS_PER_WORD) {
gpio = sgpio_addr_to_pin(priv, port, bit);
- generic_handle_irq(irq_linear_revmap(chip->irq.domain, gpio));
+ generic_handle_domain_irq(chip->irq.domain, gpio);
}
chained_irq_exit(parent_chip, desc);
for_each_set_bit(irq, &irqs,
min(32U, info->desc->npins - 32 * i))
- generic_handle_irq(irq_linear_revmap(chip->irq.domain,
- irq + 32 * i));
+ generic_handle_domain_irq(chip->irq.domain, irq + 32 * i);
chained_irq_exit(parent_chip, desc);
}
stat = readl(bank->reg_base + IRQ_PENDING);
for_each_set_bit(pin, &stat, BITS_PER_LONG)
- generic_handle_irq(irq_linear_revmap(gc->irq.domain, pin));
+ generic_handle_domain_irq(gc->irq.domain, pin);
chained_irq_exit(chip, desc);
}
pending = pic32_gpio_get_pending(gc, stat);
for_each_set_bit(pin, &pending, BITS_PER_LONG)
- generic_handle_irq(irq_linear_revmap(gc->irq.domain, pin));
+ generic_handle_domain_irq(gc->irq.domain, pin);
chained_irq_exit(chip, desc);
}
pending = gpio_readl(bank, GPIO_INTERRUPT_STATUS) &
gpio_readl(bank, GPIO_INTERRUPT_EN);
for_each_set_bit(pin, &pending, 16)
- generic_handle_irq(irq_linear_revmap(gc->irq.domain, pin));
+ generic_handle_domain_irq(gc->irq.domain, pin);
chained_irq_exit(chip, desc);
}
#include <linux/io.h>
#include <linux/bitops.h>
#include <linux/gpio/driver.h>
-#include <linux/of_device.h>
#include <linux/of_address.h>
+#include <linux/of_device.h>
#include <linux/of_irq.h>
#include <linux/pinctrl/machine.h>
#include <linux/pinctrl/pinconf.h>
#include "core.h"
#include "pinconf.h"
-
-/* GPIO control registers */
-#define GPIO_SWPORT_DR 0x00
-#define GPIO_SWPORT_DDR 0x04
-#define GPIO_INTEN 0x30
-#define GPIO_INTMASK 0x34
-#define GPIO_INTTYPE_LEVEL 0x38
-#define GPIO_INT_POLARITY 0x3c
-#define GPIO_INT_STATUS 0x40
-#define GPIO_INT_RAWSTATUS 0x44
-#define GPIO_DEBOUNCE 0x48
-#define GPIO_PORTS_EOI 0x4c
-#define GPIO_EXT_PORT 0x50
-#define GPIO_LS_SYNC 0x60
-
-enum rockchip_pinctrl_type {
- PX30,
- RV1108,
- RK2928,
- RK3066B,
- RK3128,
- RK3188,
- RK3288,
- RK3308,
- RK3368,
- RK3399,
- RK3568,
-};
-
+#include "pinctrl-rockchip.h"
/**
* Generate a bitmask for setting a value (v) with a write mask bit in hiword
#define IOMUX_WIDTH_3BIT BIT(4)
#define IOMUX_WIDTH_2BIT BIT(5)
-/**
- * struct rockchip_iomux
- * @type: iomux variant using IOMUX_* constants
- * @offset: if initialized to -1 it will be autocalculated, by specifying
- * an initial offset value the relevant source offset can be reset
- * to a new value for autocalculating the following iomux registers.
- */
-struct rockchip_iomux {
- int type;
- int offset;
-};
-
-/*
- * enum type index corresponding to rockchip_perpin_drv_list arrays index.
- */
-enum rockchip_pin_drv_type {
- DRV_TYPE_IO_DEFAULT = 0,
- DRV_TYPE_IO_1V8_OR_3V0,
- DRV_TYPE_IO_1V8_ONLY,
- DRV_TYPE_IO_1V8_3V0_AUTO,
- DRV_TYPE_IO_3V3_ONLY,
- DRV_TYPE_MAX
-};
-
-/*
- * enum type index corresponding to rockchip_pull_list arrays index.
- */
-enum rockchip_pin_pull_type {
- PULL_TYPE_IO_DEFAULT = 0,
- PULL_TYPE_IO_1V8_ONLY,
- PULL_TYPE_MAX
-};
-
-/**
- * struct rockchip_drv
- * @drv_type: drive strength variant using rockchip_perpin_drv_type
- * @offset: if initialized to -1 it will be autocalculated, by specifying
- * an initial offset value the relevant source offset can be reset
- * to a new value for autocalculating the following drive strength
- * registers. if used chips own cal_drv func instead to calculate
- * registers offset, the variant could be ignored.
- */
-struct rockchip_drv {
- enum rockchip_pin_drv_type drv_type;
- int offset;
-};
-
-/**
- * struct rockchip_pin_bank
- * @reg_base: register base of the gpio bank
- * @regmap_pull: optional separate register for additional pull settings
- * @clk: clock of the gpio bank
- * @irq: interrupt of the gpio bank
- * @saved_masks: Saved content of GPIO_INTEN at suspend time.
- * @pin_base: first pin number
- * @nr_pins: number of pins in this bank
- * @name: name of the bank
- * @bank_num: number of the bank, to account for holes
- * @iomux: array describing the 4 iomux sources of the bank
- * @drv: array describing the 4 drive strength sources of the bank
- * @pull_type: array describing the 4 pull type sources of the bank
- * @valid: is all necessary information present
- * @of_node: dt node of this bank
- * @drvdata: common pinctrl basedata
- * @domain: irqdomain of the gpio bank
- * @gpio_chip: gpiolib chip
- * @grange: gpio range
- * @slock: spinlock for the gpio bank
- * @toggle_edge_mode: bit mask to toggle (falling/rising) edge mode
- * @recalced_mask: bit mask to indicate a need to recalulate the mask
- * @route_mask: bits describing the routing pins of per bank
- */
-struct rockchip_pin_bank {
- void __iomem *reg_base;
- struct regmap *regmap_pull;
- struct clk *clk;
- int irq;
- u32 saved_masks;
- u32 pin_base;
- u8 nr_pins;
- char *name;
- u8 bank_num;
- struct rockchip_iomux iomux[4];
- struct rockchip_drv drv[4];
- enum rockchip_pin_pull_type pull_type[4];
- bool valid;
- struct device_node *of_node;
- struct rockchip_pinctrl *drvdata;
- struct irq_domain *domain;
- struct gpio_chip gpio_chip;
- struct pinctrl_gpio_range grange;
- raw_spinlock_t slock;
- u32 toggle_edge_mode;
- u32 recalced_mask;
- u32 route_mask;
-};
-
#define PIN_BANK(id, pins, label) \
{ \
.bank_num = id, \
#define RK_MUXROUTE_PMU(ID, PIN, FUNC, REG, VAL) \
PIN_BANK_MUX_ROUTE_FLAGS(ID, PIN, FUNC, REG, VAL, ROCKCHIP_ROUTE_PMU)
-/**
- * struct rockchip_mux_recalced_data: represent a pin iomux data.
- * @num: bank number.
- * @pin: pin number.
- * @bit: index at register.
- * @reg: register offset.
- * @mask: mask bit
- */
-struct rockchip_mux_recalced_data {
- u8 num;
- u8 pin;
- u32 reg;
- u8 bit;
- u8 mask;
-};
-
-enum rockchip_mux_route_location {
- ROCKCHIP_ROUTE_SAME = 0,
- ROCKCHIP_ROUTE_PMU,
- ROCKCHIP_ROUTE_GRF,
-};
-
-/**
- * struct rockchip_mux_recalced_data: represent a pin iomux data.
- * @bank_num: bank number.
- * @pin: index at register or used to calc index.
- * @func: the min pin.
- * @route_location: the mux route location (same, pmu, grf).
- * @route_offset: the max pin.
- * @route_val: the register offset.
- */
-struct rockchip_mux_route_data {
- u8 bank_num;
- u8 pin;
- u8 func;
- enum rockchip_mux_route_location route_location;
- u32 route_offset;
- u32 route_val;
-};
-
-struct rockchip_pin_ctrl {
- struct rockchip_pin_bank *pin_banks;
- u32 nr_banks;
- u32 nr_pins;
- char *label;
- enum rockchip_pinctrl_type type;
- int grf_mux_offset;
- int pmu_mux_offset;
- int grf_drv_offset;
- int pmu_drv_offset;
- struct rockchip_mux_recalced_data *iomux_recalced;
- u32 niomux_recalced;
- struct rockchip_mux_route_data *iomux_routes;
- u32 niomux_routes;
-
- void (*pull_calc_reg)(struct rockchip_pin_bank *bank,
- int pin_num, struct regmap **regmap,
- int *reg, u8 *bit);
- void (*drv_calc_reg)(struct rockchip_pin_bank *bank,
- int pin_num, struct regmap **regmap,
- int *reg, u8 *bit);
- int (*schmitt_calc_reg)(struct rockchip_pin_bank *bank,
- int pin_num, struct regmap **regmap,
- int *reg, u8 *bit);
-};
-
-struct rockchip_pin_config {
- unsigned int func;
- unsigned long *configs;
- unsigned int nconfigs;
-};
-
-/**
- * struct rockchip_pin_group: represent group of pins of a pinmux function.
- * @name: name of the pin group, used to lookup the group.
- * @pins: the pins included in this group.
- * @npins: number of pins included in this group.
- * @data: local pin configuration
- */
-struct rockchip_pin_group {
- const char *name;
- unsigned int npins;
- unsigned int *pins;
- struct rockchip_pin_config *data;
-};
-
-/**
- * struct rockchip_pmx_func: represent a pin function.
- * @name: name of the pin function, used to lookup the function.
- * @groups: one or more names of pin groups that provide this function.
- * @ngroups: number of groups included in @groups.
- */
-struct rockchip_pmx_func {
- const char *name;
- const char **groups;
- u8 ngroups;
-};
-
-struct rockchip_pinctrl {
- struct regmap *regmap_base;
- int reg_size;
- struct regmap *regmap_pull;
- struct regmap *regmap_pmu;
- struct device *dev;
- struct rockchip_pin_ctrl *ctrl;
- struct pinctrl_desc pctl;
- struct pinctrl_dev *pctl_dev;
- struct rockchip_pin_group *groups;
- unsigned int ngroups;
- struct rockchip_pmx_func *functions;
- unsigned int nfunctions;
-};
-
static struct regmap_config rockchip_regmap_config = {
.reg_bits = 32,
.val_bits = 32,
return 0;
}
-static int rockchip_gpio_get_direction(struct gpio_chip *chip, unsigned offset)
-{
- struct rockchip_pin_bank *bank = gpiochip_get_data(chip);
- u32 data;
- int ret;
-
- ret = clk_enable(bank->clk);
- if (ret < 0) {
- dev_err(bank->drvdata->dev,
- "failed to enable clock for bank %s\n", bank->name);
- return ret;
- }
- data = readl_relaxed(bank->reg_base + GPIO_SWPORT_DDR);
- clk_disable(bank->clk);
-
- if (data & BIT(offset))
- return GPIO_LINE_DIRECTION_OUT;
-
- return GPIO_LINE_DIRECTION_IN;
-}
-
-/*
- * The calls to gpio_direction_output() and gpio_direction_input()
- * leads to this function call (via the pinctrl_gpio_direction_{input|output}()
- * function called from the gpiolib interface).
- */
-static int _rockchip_pmx_gpio_set_direction(struct gpio_chip *chip,
- int pin, bool input)
-{
- struct rockchip_pin_bank *bank;
- int ret;
- unsigned long flags;
- u32 data;
-
- bank = gpiochip_get_data(chip);
-
- ret = rockchip_set_mux(bank, pin, RK_FUNC_GPIO);
- if (ret < 0)
- return ret;
-
- clk_enable(bank->clk);
- raw_spin_lock_irqsave(&bank->slock, flags);
-
- data = readl_relaxed(bank->reg_base + GPIO_SWPORT_DDR);
- /* set bit to 1 for output, 0 for input */
- if (!input)
- data |= BIT(pin);
- else
- data &= ~BIT(pin);
- writel_relaxed(data, bank->reg_base + GPIO_SWPORT_DDR);
-
- raw_spin_unlock_irqrestore(&bank->slock, flags);
- clk_disable(bank->clk);
-
- return 0;
-}
-
-static int rockchip_pmx_gpio_set_direction(struct pinctrl_dev *pctldev,
- struct pinctrl_gpio_range *range,
- unsigned offset, bool input)
-{
- struct rockchip_pinctrl *info = pinctrl_dev_get_drvdata(pctldev);
- struct gpio_chip *chip;
- int pin;
-
- chip = range->gc;
- pin = offset - chip->base;
- dev_dbg(info->dev, "gpio_direction for pin %u as %s-%d to %s\n",
- offset, range->name, pin, input ? "input" : "output");
-
- return _rockchip_pmx_gpio_set_direction(chip, offset - chip->base,
- input);
-}
-
static const struct pinmux_ops rockchip_pmx_ops = {
.get_functions_count = rockchip_pmx_get_funcs_count,
.get_function_name = rockchip_pmx_get_func_name,
.get_function_groups = rockchip_pmx_get_groups,
.set_mux = rockchip_pmx_set,
- .gpio_set_direction = rockchip_pmx_gpio_set_direction,
};
/*
return false;
}
-static void rockchip_gpio_set(struct gpio_chip *gc, unsigned offset, int value);
-static int rockchip_gpio_get(struct gpio_chip *gc, unsigned offset);
-
/* set the pin config settings for a specified pin */
static int rockchip_pinconf_set(struct pinctrl_dev *pctldev, unsigned int pin,
unsigned long *configs, unsigned num_configs)
{
struct rockchip_pinctrl *info = pinctrl_dev_get_drvdata(pctldev);
struct rockchip_pin_bank *bank = pin_to_bank(info, pin);
+ struct gpio_chip *gpio = &bank->gpio_chip;
enum pin_config_param param;
u32 arg;
int i;
return rc;
break;
case PIN_CONFIG_OUTPUT:
- rockchip_gpio_set(&bank->gpio_chip,
- pin - bank->pin_base, arg);
- rc = _rockchip_pmx_gpio_set_direction(&bank->gpio_chip,
- pin - bank->pin_base, false);
+ rc = rockchip_set_mux(bank, pin - bank->pin_base,
+ RK_FUNC_GPIO);
+ if (rc != RK_FUNC_GPIO)
+ return -EINVAL;
+
+ rc = gpio->direction_output(gpio, pin - bank->pin_base,
+ arg);
if (rc)
return rc;
break;
{
struct rockchip_pinctrl *info = pinctrl_dev_get_drvdata(pctldev);
struct rockchip_pin_bank *bank = pin_to_bank(info, pin);
+ struct gpio_chip *gpio = &bank->gpio_chip;
enum pin_config_param param = pinconf_to_config_param(*config);
u16 arg;
int rc;
if (rc != RK_FUNC_GPIO)
return -EINVAL;
- rc = rockchip_gpio_get(&bank->gpio_chip, pin - bank->pin_base);
+ rc = gpio->get(gpio, pin - bank->pin_base);
if (rc < 0)
return rc;
ctrldesc->npins = info->ctrl->nr_pins;
pdesc = pindesc;
- for (bank = 0 , k = 0; bank < info->ctrl->nr_banks; bank++) {
+ for (bank = 0, k = 0; bank < info->ctrl->nr_banks; bank++) {
pin_bank = &info->ctrl->pin_banks[bank];
for (pin = 0; pin < pin_bank->nr_pins; pin++, k++) {
pdesc->number = k;
return PTR_ERR(info->pctl_dev);
}
- for (bank = 0; bank < info->ctrl->nr_banks; ++bank) {
- pin_bank = &info->ctrl->pin_banks[bank];
- pin_bank->grange.name = pin_bank->name;
- pin_bank->grange.id = bank;
- pin_bank->grange.pin_base = pin_bank->pin_base;
- pin_bank->grange.base = pin_bank->gpio_chip.base;
- pin_bank->grange.npins = pin_bank->gpio_chip.ngpio;
- pin_bank->grange.gc = &pin_bank->gpio_chip;
- pinctrl_add_gpio_range(info->pctl_dev, &pin_bank->grange);
- }
-
return 0;
}
-/*
- * GPIO handling
- */
-
-static void rockchip_gpio_set(struct gpio_chip *gc, unsigned offset, int value)
-{
- struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
- void __iomem *reg = bank->reg_base + GPIO_SWPORT_DR;
- unsigned long flags;
- u32 data;
-
- clk_enable(bank->clk);
- raw_spin_lock_irqsave(&bank->slock, flags);
-
- data = readl(reg);
- data &= ~BIT(offset);
- if (value)
- data |= BIT(offset);
- writel(data, reg);
-
- raw_spin_unlock_irqrestore(&bank->slock, flags);
- clk_disable(bank->clk);
-}
-
-/*
- * Returns the level of the pin for input direction and setting of the DR
- * register for output gpios.
- */
-static int rockchip_gpio_get(struct gpio_chip *gc, unsigned offset)
-{
- struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
- u32 data;
-
- clk_enable(bank->clk);
- data = readl(bank->reg_base + GPIO_EXT_PORT);
- clk_disable(bank->clk);
- data >>= offset;
- data &= 1;
- return data;
-}
-
-/*
- * gpiolib gpio_direction_input callback function. The setting of the pin
- * mux function as 'gpio input' will be handled by the pinctrl subsystem
- * interface.
- */
-static int rockchip_gpio_direction_input(struct gpio_chip *gc, unsigned offset)
-{
- return pinctrl_gpio_direction_input(gc->base + offset);
-}
-
-/*
- * gpiolib gpio_direction_output callback function. The setting of the pin
- * mux function as 'gpio output' will be handled by the pinctrl subsystem
- * interface.
- */
-static int rockchip_gpio_direction_output(struct gpio_chip *gc,
- unsigned offset, int value)
-{
- rockchip_gpio_set(gc, offset, value);
- return pinctrl_gpio_direction_output(gc->base + offset);
-}
-
-static void rockchip_gpio_set_debounce(struct gpio_chip *gc,
- unsigned int offset, bool enable)
-{
- struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
- void __iomem *reg = bank->reg_base + GPIO_DEBOUNCE;
- unsigned long flags;
- u32 data;
-
- clk_enable(bank->clk);
- raw_spin_lock_irqsave(&bank->slock, flags);
-
- data = readl(reg);
- if (enable)
- data |= BIT(offset);
- else
- data &= ~BIT(offset);
- writel(data, reg);
-
- raw_spin_unlock_irqrestore(&bank->slock, flags);
- clk_disable(bank->clk);
-}
-
-/*
- * gpiolib set_config callback function. The setting of the pin
- * mux function as 'gpio output' will be handled by the pinctrl subsystem
- * interface.
- */
-static int rockchip_gpio_set_config(struct gpio_chip *gc, unsigned int offset,
- unsigned long config)
-{
- enum pin_config_param param = pinconf_to_config_param(config);
-
- switch (param) {
- case PIN_CONFIG_INPUT_DEBOUNCE:
- rockchip_gpio_set_debounce(gc, offset, true);
- /*
- * Rockchip's gpio could only support up to one period
- * of the debounce clock(pclk), which is far away from
- * satisftying the requirement, as pclk is usually near
- * 100MHz shared by all peripherals. So the fact is it
- * has crippled debounce capability could only be useful
- * to prevent any spurious glitches from waking up the system
- * if the gpio is conguired as wakeup interrupt source. Let's
- * still return -ENOTSUPP as before, to make sure the caller
- * of gpiod_set_debounce won't change its behaviour.
- */
- return -ENOTSUPP;
- default:
- return -ENOTSUPP;
- }
-}
-
-/*
- * gpiolib gpio_to_irq callback function. Creates a mapping between a GPIO pin
- * and a virtual IRQ, if not already present.
- */
-static int rockchip_gpio_to_irq(struct gpio_chip *gc, unsigned offset)
-{
- struct rockchip_pin_bank *bank = gpiochip_get_data(gc);
- unsigned int virq;
-
- if (!bank->domain)
- return -ENXIO;
-
- clk_enable(bank->clk);
- virq = irq_create_mapping(bank->domain, offset);
- clk_disable(bank->clk);
-
- return (virq) ? : -ENXIO;
-}
-
-static const struct gpio_chip rockchip_gpiolib_chip = {
- .request = gpiochip_generic_request,
- .free = gpiochip_generic_free,
- .set = rockchip_gpio_set,
- .get = rockchip_gpio_get,
- .get_direction = rockchip_gpio_get_direction,
- .direction_input = rockchip_gpio_direction_input,
- .direction_output = rockchip_gpio_direction_output,
- .set_config = rockchip_gpio_set_config,
- .to_irq = rockchip_gpio_to_irq,
- .owner = THIS_MODULE,
-};
-
-/*
- * Interrupt handling
- */
-
-static void rockchip_irq_demux(struct irq_desc *desc)
-{
- struct irq_chip *chip = irq_desc_get_chip(desc);
- struct rockchip_pin_bank *bank = irq_desc_get_handler_data(desc);
- u32 pend;
-
- dev_dbg(bank->drvdata->dev, "got irq for bank %s\n", bank->name);
-
- chained_irq_enter(chip, desc);
-
- pend = readl_relaxed(bank->reg_base + GPIO_INT_STATUS);
-
- while (pend) {
- unsigned int irq, virq;
-
- irq = __ffs(pend);
- pend &= ~BIT(irq);
- virq = irq_find_mapping(bank->domain, irq);
-
- if (!virq) {
- dev_err(bank->drvdata->dev, "unmapped irq %d\n", irq);
- continue;
- }
-
- dev_dbg(bank->drvdata->dev, "handling irq %d\n", irq);
-
- /*
- * Triggering IRQ on both rising and falling edge
- * needs manual intervention.
- */
- if (bank->toggle_edge_mode & BIT(irq)) {
- u32 data, data_old, polarity;
- unsigned long flags;
-
- data = readl_relaxed(bank->reg_base + GPIO_EXT_PORT);
- do {
- raw_spin_lock_irqsave(&bank->slock, flags);
-
- polarity = readl_relaxed(bank->reg_base +
- GPIO_INT_POLARITY);
- if (data & BIT(irq))
- polarity &= ~BIT(irq);
- else
- polarity |= BIT(irq);
- writel(polarity,
- bank->reg_base + GPIO_INT_POLARITY);
-
- raw_spin_unlock_irqrestore(&bank->slock, flags);
-
- data_old = data;
- data = readl_relaxed(bank->reg_base +
- GPIO_EXT_PORT);
- } while ((data & BIT(irq)) != (data_old & BIT(irq)));
- }
-
- generic_handle_irq(virq);
- }
-
- chained_irq_exit(chip, desc);
-}
-
-static int rockchip_irq_set_type(struct irq_data *d, unsigned int type)
-{
- struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
- struct rockchip_pin_bank *bank = gc->private;
- u32 mask = BIT(d->hwirq);
- u32 polarity;
- u32 level;
- u32 data;
- unsigned long flags;
- int ret;
-
- /* make sure the pin is configured as gpio input */
- ret = rockchip_set_mux(bank, d->hwirq, RK_FUNC_GPIO);
- if (ret < 0)
- return ret;
-
- clk_enable(bank->clk);
- raw_spin_lock_irqsave(&bank->slock, flags);
-
- data = readl_relaxed(bank->reg_base + GPIO_SWPORT_DDR);
- data &= ~mask;
- writel_relaxed(data, bank->reg_base + GPIO_SWPORT_DDR);
-
- raw_spin_unlock_irqrestore(&bank->slock, flags);
-
- if (type & IRQ_TYPE_EDGE_BOTH)
- irq_set_handler_locked(d, handle_edge_irq);
- else
- irq_set_handler_locked(d, handle_level_irq);
-
- raw_spin_lock_irqsave(&bank->slock, flags);
- irq_gc_lock(gc);
-
- level = readl_relaxed(gc->reg_base + GPIO_INTTYPE_LEVEL);
- polarity = readl_relaxed(gc->reg_base + GPIO_INT_POLARITY);
-
- switch (type) {
- case IRQ_TYPE_EDGE_BOTH:
- bank->toggle_edge_mode |= mask;
- level |= mask;
-
- /*
- * Determine gpio state. If 1 next interrupt should be falling
- * otherwise rising.
- */
- data = readl(bank->reg_base + GPIO_EXT_PORT);
- if (data & mask)
- polarity &= ~mask;
- else
- polarity |= mask;
- break;
- case IRQ_TYPE_EDGE_RISING:
- bank->toggle_edge_mode &= ~mask;
- level |= mask;
- polarity |= mask;
- break;
- case IRQ_TYPE_EDGE_FALLING:
- bank->toggle_edge_mode &= ~mask;
- level |= mask;
- polarity &= ~mask;
- break;
- case IRQ_TYPE_LEVEL_HIGH:
- bank->toggle_edge_mode &= ~mask;
- level &= ~mask;
- polarity |= mask;
- break;
- case IRQ_TYPE_LEVEL_LOW:
- bank->toggle_edge_mode &= ~mask;
- level &= ~mask;
- polarity &= ~mask;
- break;
- default:
- irq_gc_unlock(gc);
- raw_spin_unlock_irqrestore(&bank->slock, flags);
- clk_disable(bank->clk);
- return -EINVAL;
- }
-
- writel_relaxed(level, gc->reg_base + GPIO_INTTYPE_LEVEL);
- writel_relaxed(polarity, gc->reg_base + GPIO_INT_POLARITY);
-
- irq_gc_unlock(gc);
- raw_spin_unlock_irqrestore(&bank->slock, flags);
- clk_disable(bank->clk);
-
- return 0;
-}
-
-static void rockchip_irq_suspend(struct irq_data *d)
-{
- struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
- struct rockchip_pin_bank *bank = gc->private;
-
- clk_enable(bank->clk);
- bank->saved_masks = irq_reg_readl(gc, GPIO_INTMASK);
- irq_reg_writel(gc, ~gc->wake_active, GPIO_INTMASK);
- clk_disable(bank->clk);
-}
-
-static void rockchip_irq_resume(struct irq_data *d)
-{
- struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
- struct rockchip_pin_bank *bank = gc->private;
-
- clk_enable(bank->clk);
- irq_reg_writel(gc, bank->saved_masks, GPIO_INTMASK);
- clk_disable(bank->clk);
-}
-
-static void rockchip_irq_enable(struct irq_data *d)
-{
- struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
- struct rockchip_pin_bank *bank = gc->private;
-
- clk_enable(bank->clk);
- irq_gc_mask_clr_bit(d);
-}
-
-static void rockchip_irq_disable(struct irq_data *d)
-{
- struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d);
- struct rockchip_pin_bank *bank = gc->private;
-
- irq_gc_mask_set_bit(d);
- clk_disable(bank->clk);
-}
-
-static int rockchip_interrupts_register(struct platform_device *pdev,
- struct rockchip_pinctrl *info)
-{
- struct rockchip_pin_ctrl *ctrl = info->ctrl;
- struct rockchip_pin_bank *bank = ctrl->pin_banks;
- unsigned int clr = IRQ_NOREQUEST | IRQ_NOPROBE | IRQ_NOAUTOEN;
- struct irq_chip_generic *gc;
- int ret;
- int i;
-
- for (i = 0; i < ctrl->nr_banks; ++i, ++bank) {
- if (!bank->valid) {
- dev_warn(&pdev->dev, "bank %s is not valid\n",
- bank->name);
- continue;
- }
-
- ret = clk_enable(bank->clk);
- if (ret) {
- dev_err(&pdev->dev, "failed to enable clock for bank %s\n",
- bank->name);
- continue;
- }
-
- bank->domain = irq_domain_add_linear(bank->of_node, 32,
- &irq_generic_chip_ops, NULL);
- if (!bank->domain) {
- dev_warn(&pdev->dev, "could not initialize irq domain for bank %s\n",
- bank->name);
- clk_disable(bank->clk);
- continue;
- }
-
- ret = irq_alloc_domain_generic_chips(bank->domain, 32, 1,
- "rockchip_gpio_irq", handle_level_irq,
- clr, 0, 0);
- if (ret) {
- dev_err(&pdev->dev, "could not alloc generic chips for bank %s\n",
- bank->name);
- irq_domain_remove(bank->domain);
- clk_disable(bank->clk);
- continue;
- }
-
- gc = irq_get_domain_generic_chip(bank->domain, 0);
- gc->reg_base = bank->reg_base;
- gc->private = bank;
- gc->chip_types[0].regs.mask = GPIO_INTMASK;
- gc->chip_types[0].regs.ack = GPIO_PORTS_EOI;
- gc->chip_types[0].chip.irq_ack = irq_gc_ack_set_bit;
- gc->chip_types[0].chip.irq_mask = irq_gc_mask_set_bit;
- gc->chip_types[0].chip.irq_unmask = irq_gc_mask_clr_bit;
- gc->chip_types[0].chip.irq_enable = rockchip_irq_enable;
- gc->chip_types[0].chip.irq_disable = rockchip_irq_disable;
- gc->chip_types[0].chip.irq_set_wake = irq_gc_set_wake;
- gc->chip_types[0].chip.irq_suspend = rockchip_irq_suspend;
- gc->chip_types[0].chip.irq_resume = rockchip_irq_resume;
- gc->chip_types[0].chip.irq_set_type = rockchip_irq_set_type;
- gc->wake_enabled = IRQ_MSK(bank->nr_pins);
-
- /*
- * Linux assumes that all interrupts start out disabled/masked.
- * Our driver only uses the concept of masked and always keeps
- * things enabled, so for us that's all masked and all enabled.
- */
- writel_relaxed(0xffffffff, bank->reg_base + GPIO_INTMASK);
- writel_relaxed(0xffffffff, bank->reg_base + GPIO_PORTS_EOI);
- writel_relaxed(0xffffffff, bank->reg_base + GPIO_INTEN);
- gc->mask_cache = 0xffffffff;
-
- irq_set_chained_handler_and_data(bank->irq,
- rockchip_irq_demux, bank);
- clk_disable(bank->clk);
- }
-
- return 0;
-}
-
-static int rockchip_gpiolib_register(struct platform_device *pdev,
- struct rockchip_pinctrl *info)
-{
- struct rockchip_pin_ctrl *ctrl = info->ctrl;
- struct rockchip_pin_bank *bank = ctrl->pin_banks;
- struct gpio_chip *gc;
- int ret;
- int i;
-
- for (i = 0; i < ctrl->nr_banks; ++i, ++bank) {
- if (!bank->valid) {
- dev_warn(&pdev->dev, "bank %s is not valid\n",
- bank->name);
- continue;
- }
-
- bank->gpio_chip = rockchip_gpiolib_chip;
-
- gc = &bank->gpio_chip;
- gc->base = bank->pin_base;
- gc->ngpio = bank->nr_pins;
- gc->parent = &pdev->dev;
- gc->of_node = bank->of_node;
- gc->label = bank->name;
-
- ret = gpiochip_add_data(gc, bank);
- if (ret) {
- dev_err(&pdev->dev, "failed to register gpio_chip %s, error code: %d\n",
- gc->label, ret);
- goto fail;
- }
- }
-
- rockchip_interrupts_register(pdev, info);
-
- return 0;
-
-fail:
- for (--i, --bank; i >= 0; --i, --bank) {
- if (!bank->valid)
- continue;
- gpiochip_remove(&bank->gpio_chip);
- }
- return ret;
-}
-
-static int rockchip_gpiolib_unregister(struct platform_device *pdev,
- struct rockchip_pinctrl *info)
-{
- struct rockchip_pin_ctrl *ctrl = info->ctrl;
- struct rockchip_pin_bank *bank = ctrl->pin_banks;
- int i;
-
- for (i = 0; i < ctrl->nr_banks; ++i, ++bank) {
- if (!bank->valid)
- continue;
- gpiochip_remove(&bank->gpio_chip);
- }
-
- return 0;
-}
-
-static int rockchip_get_bank_data(struct rockchip_pin_bank *bank,
- struct rockchip_pinctrl *info)
-{
- struct resource res;
- void __iomem *base;
-
- if (of_address_to_resource(bank->of_node, 0, &res)) {
- dev_err(info->dev, "cannot find IO resource for bank\n");
- return -ENOENT;
- }
-
- bank->reg_base = devm_ioremap_resource(info->dev, &res);
- if (IS_ERR(bank->reg_base))
- return PTR_ERR(bank->reg_base);
-
- /*
- * special case, where parts of the pull setting-registers are
- * part of the PMU register space
- */
- if (of_device_is_compatible(bank->of_node,
- "rockchip,rk3188-gpio-bank0")) {
- struct device_node *node;
-
- node = of_parse_phandle(bank->of_node->parent,
- "rockchip,pmu", 0);
- if (!node) {
- if (of_address_to_resource(bank->of_node, 1, &res)) {
- dev_err(info->dev, "cannot find IO resource for bank\n");
- return -ENOENT;
- }
-
- base = devm_ioremap_resource(info->dev, &res);
- if (IS_ERR(base))
- return PTR_ERR(base);
- rockchip_regmap_config.max_register =
- resource_size(&res) - 4;
- rockchip_regmap_config.name =
- "rockchip,rk3188-gpio-bank0-pull";
- bank->regmap_pull = devm_regmap_init_mmio(info->dev,
- base,
- &rockchip_regmap_config);
- }
- of_node_put(node);
- }
-
- bank->irq = irq_of_parse_and_map(bank->of_node, 0);
-
- bank->clk = of_clk_get(bank->of_node, 0);
- if (IS_ERR(bank->clk))
- return PTR_ERR(bank->clk);
-
- return clk_prepare(bank->clk);
-}
-
static const struct of_device_id rockchip_pinctrl_dt_match[];
/* retrieve the soc specific data */
{
const struct of_device_id *match;
struct device_node *node = pdev->dev.of_node;
- struct device_node *np;
struct rockchip_pin_ctrl *ctrl;
struct rockchip_pin_bank *bank;
int grf_offs, pmu_offs, drv_grf_offs, drv_pmu_offs, i, j;
match = of_match_node(rockchip_pinctrl_dt_match, node);
ctrl = (struct rockchip_pin_ctrl *)match->data;
- for_each_child_of_node(node, np) {
- if (!of_find_property(np, "gpio-controller", NULL))
- continue;
-
- bank = ctrl->pin_banks;
- for (i = 0; i < ctrl->nr_banks; ++i, ++bank) {
- if (!strcmp(bank->name, np->name)) {
- bank->of_node = np;
-
- if (!rockchip_get_bank_data(bank, d))
- bank->valid = true;
-
- break;
- }
- }
- }
-
grf_offs = ctrl->grf_mux_offset;
pmu_offs = ctrl->pmu_mux_offset;
drv_pmu_offs = ctrl->pmu_drv_offset;
return PTR_ERR(info->regmap_pmu);
}
- ret = rockchip_gpiolib_register(pdev, info);
+ ret = rockchip_pinctrl_register(pdev, info);
if (ret)
return ret;
- ret = rockchip_pinctrl_register(pdev, info);
+ platform_set_drvdata(pdev, info);
+
+ ret = of_platform_populate(np, rockchip_bank_match, NULL, NULL);
if (ret) {
- rockchip_gpiolib_unregister(pdev, info);
+ dev_err(&pdev->dev, "failed to register gpio device\n");
return ret;
}
- platform_set_drvdata(pdev, info);
-
return 0;
}
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2020-2021 Rockchip Electronics Co. Ltd.
+ *
+ * Copyright (c) 2013 MundoReader S.L.
+ * Author: Heiko Stuebner <heiko@sntech.de>
+ *
+ * With some ideas taken from pinctrl-samsung:
+ * Copyright (c) 2012 Samsung Electronics Co., Ltd.
+ * http://www.samsung.com
+ * Copyright (c) 2012 Linaro Ltd
+ * https://www.linaro.org
+ *
+ * and pinctrl-at91:
+ * Copyright (C) 2011-2012 Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
+ */
+
+#ifndef _PINCTRL_ROCKCHIP_H
+#define _PINCTRL_ROCKCHIP_H
+
+enum rockchip_pinctrl_type {
+ PX30,
+ RV1108,
+ RK2928,
+ RK3066B,
+ RK3128,
+ RK3188,
+ RK3288,
+ RK3308,
+ RK3368,
+ RK3399,
+ RK3568,
+};
+
+/**
+ * struct rockchip_gpio_regs
+ * @port_dr: data register
+ * @port_ddr: data direction register
+ * @int_en: interrupt enable
+ * @int_mask: interrupt mask
+ * @int_type: interrupt trigger type, such as high, low, edge trriger type.
+ * @int_polarity: interrupt polarity enable register
+ * @int_bothedge: interrupt bothedge enable register
+ * @int_status: interrupt status register
+ * @int_rawstatus: int_status = int_rawstatus & int_mask
+ * @debounce: enable debounce for interrupt signal
+ * @dbclk_div_en: enable divider for debounce clock
+ * @dbclk_div_con: setting for divider of debounce clock
+ * @port_eoi: end of interrupt of the port
+ * @ext_port: port data from external
+ * @version_id: controller version register
+ */
+struct rockchip_gpio_regs {
+ u32 port_dr;
+ u32 port_ddr;
+ u32 int_en;
+ u32 int_mask;
+ u32 int_type;
+ u32 int_polarity;
+ u32 int_bothedge;
+ u32 int_status;
+ u32 int_rawstatus;
+ u32 debounce;
+ u32 dbclk_div_en;
+ u32 dbclk_div_con;
+ u32 port_eoi;
+ u32 ext_port;
+ u32 version_id;
+};
+
+/**
+ * struct rockchip_iomux
+ * @type: iomux variant using IOMUX_* constants
+ * @offset: if initialized to -1 it will be autocalculated, by specifying
+ * an initial offset value the relevant source offset can be reset
+ * to a new value for autocalculating the following iomux registers.
+ */
+struct rockchip_iomux {
+ int type;
+ int offset;
+};
+
+/*
+ * enum type index corresponding to rockchip_perpin_drv_list arrays index.
+ */
+enum rockchip_pin_drv_type {
+ DRV_TYPE_IO_DEFAULT = 0,
+ DRV_TYPE_IO_1V8_OR_3V0,
+ DRV_TYPE_IO_1V8_ONLY,
+ DRV_TYPE_IO_1V8_3V0_AUTO,
+ DRV_TYPE_IO_3V3_ONLY,
+ DRV_TYPE_MAX
+};
+
+/*
+ * enum type index corresponding to rockchip_pull_list arrays index.
+ */
+enum rockchip_pin_pull_type {
+ PULL_TYPE_IO_DEFAULT = 0,
+ PULL_TYPE_IO_1V8_ONLY,
+ PULL_TYPE_MAX
+};
+
+/**
+ * struct rockchip_drv
+ * @drv_type: drive strength variant using rockchip_perpin_drv_type
+ * @offset: if initialized to -1 it will be autocalculated, by specifying
+ * an initial offset value the relevant source offset can be reset
+ * to a new value for autocalculating the following drive strength
+ * registers. if used chips own cal_drv func instead to calculate
+ * registers offset, the variant could be ignored.
+ */
+struct rockchip_drv {
+ enum rockchip_pin_drv_type drv_type;
+ int offset;
+};
+
+/**
+ * struct rockchip_pin_bank
+ * @dev: the pinctrl device bind to the bank
+ * @reg_base: register base of the gpio bank
+ * @regmap_pull: optional separate register for additional pull settings
+ * @clk: clock of the gpio bank
+ * @db_clk: clock of the gpio debounce
+ * @irq: interrupt of the gpio bank
+ * @saved_masks: Saved content of GPIO_INTEN at suspend time.
+ * @pin_base: first pin number
+ * @nr_pins: number of pins in this bank
+ * @name: name of the bank
+ * @bank_num: number of the bank, to account for holes
+ * @iomux: array describing the 4 iomux sources of the bank
+ * @drv: array describing the 4 drive strength sources of the bank
+ * @pull_type: array describing the 4 pull type sources of the bank
+ * @valid: is all necessary information present
+ * @of_node: dt node of this bank
+ * @drvdata: common pinctrl basedata
+ * @domain: irqdomain of the gpio bank
+ * @gpio_chip: gpiolib chip
+ * @grange: gpio range
+ * @slock: spinlock for the gpio bank
+ * @toggle_edge_mode: bit mask to toggle (falling/rising) edge mode
+ * @recalced_mask: bit mask to indicate a need to recalulate the mask
+ * @route_mask: bits describing the routing pins of per bank
+ */
+struct rockchip_pin_bank {
+ struct device *dev;
+ void __iomem *reg_base;
+ struct regmap *regmap_pull;
+ struct clk *clk;
+ struct clk *db_clk;
+ int irq;
+ u32 saved_masks;
+ u32 pin_base;
+ u8 nr_pins;
+ char *name;
+ u8 bank_num;
+ struct rockchip_iomux iomux[4];
+ struct rockchip_drv drv[4];
+ enum rockchip_pin_pull_type pull_type[4];
+ bool valid;
+ struct device_node *of_node;
+ struct rockchip_pinctrl *drvdata;
+ struct irq_domain *domain;
+ struct gpio_chip gpio_chip;
+ struct pinctrl_gpio_range grange;
+ raw_spinlock_t slock;
+ const struct rockchip_gpio_regs *gpio_regs;
+ u32 gpio_type;
+ u32 toggle_edge_mode;
+ u32 recalced_mask;
+ u32 route_mask;
+};
+
+/**
+ * struct rockchip_mux_recalced_data: represent a pin iomux data.
+ * @num: bank number.
+ * @pin: pin number.
+ * @bit: index at register.
+ * @reg: register offset.
+ * @mask: mask bit
+ */
+struct rockchip_mux_recalced_data {
+ u8 num;
+ u8 pin;
+ u32 reg;
+ u8 bit;
+ u8 mask;
+};
+
+enum rockchip_mux_route_location {
+ ROCKCHIP_ROUTE_SAME = 0,
+ ROCKCHIP_ROUTE_PMU,
+ ROCKCHIP_ROUTE_GRF,
+};
+
+/**
+ * struct rockchip_mux_recalced_data: represent a pin iomux data.
+ * @bank_num: bank number.
+ * @pin: index at register or used to calc index.
+ * @func: the min pin.
+ * @route_location: the mux route location (same, pmu, grf).
+ * @route_offset: the max pin.
+ * @route_val: the register offset.
+ */
+struct rockchip_mux_route_data {
+ u8 bank_num;
+ u8 pin;
+ u8 func;
+ enum rockchip_mux_route_location route_location;
+ u32 route_offset;
+ u32 route_val;
+};
+
+struct rockchip_pin_ctrl {
+ struct rockchip_pin_bank *pin_banks;
+ u32 nr_banks;
+ u32 nr_pins;
+ char *label;
+ enum rockchip_pinctrl_type type;
+ int grf_mux_offset;
+ int pmu_mux_offset;
+ int grf_drv_offset;
+ int pmu_drv_offset;
+ struct rockchip_mux_recalced_data *iomux_recalced;
+ u32 niomux_recalced;
+ struct rockchip_mux_route_data *iomux_routes;
+ u32 niomux_routes;
+
+ void (*pull_calc_reg)(struct rockchip_pin_bank *bank,
+ int pin_num, struct regmap **regmap,
+ int *reg, u8 *bit);
+ void (*drv_calc_reg)(struct rockchip_pin_bank *bank,
+ int pin_num, struct regmap **regmap,
+ int *reg, u8 *bit);
+ int (*schmitt_calc_reg)(struct rockchip_pin_bank *bank,
+ int pin_num, struct regmap **regmap,
+ int *reg, u8 *bit);
+};
+
+struct rockchip_pin_config {
+ unsigned int func;
+ unsigned long *configs;
+ unsigned int nconfigs;
+};
+
+/**
+ * struct rockchip_pin_group: represent group of pins of a pinmux function.
+ * @name: name of the pin group, used to lookup the group.
+ * @pins: the pins included in this group.
+ * @npins: number of pins included in this group.
+ * @data: local pin configuration
+ */
+struct rockchip_pin_group {
+ const char *name;
+ unsigned int npins;
+ unsigned int *pins;
+ struct rockchip_pin_config *data;
+};
+
+/**
+ * struct rockchip_pmx_func: represent a pin function.
+ * @name: name of the pin function, used to lookup the function.
+ * @groups: one or more names of pin groups that provide this function.
+ * @ngroups: number of groups included in @groups.
+ */
+struct rockchip_pmx_func {
+ const char *name;
+ const char **groups;
+ u8 ngroups;
+};
+
+struct rockchip_pinctrl {
+ struct regmap *regmap_base;
+ int reg_size;
+ struct regmap *regmap_pull;
+ struct regmap *regmap_pmu;
+ struct device *dev;
+ struct rockchip_pin_ctrl *ctrl;
+ struct pinctrl_desc pctl;
+ struct pinctrl_dev *pctl_dev;
+ struct rockchip_pin_group *groups;
+ unsigned int ngroups;
+ struct rockchip_pmx_func *functions;
+ unsigned int nfunctions;
+};
+
+#endif
mask = pcs->read(pcswi->reg);
raw_spin_unlock(&pcs->lock);
if (mask & pcs_soc->irq_status_mask) {
- generic_handle_irq(irq_find_mapping(pcs->domain,
- pcswi->hwirq));
+ generic_handle_domain_irq(pcs->domain,
+ pcswi->hwirq);
count++;
}
}
continue;
}
- generic_handle_irq(irq_find_mapping(bank->gpio_chip.irq.domain, n));
+ generic_handle_domain_irq(bank->gpio_chip.irq.domain, n);
}
}
}
const struct msm_pingroup *g;
struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
struct irq_chip *chip = irq_desc_get_chip(desc);
- int irq_pin;
int handled = 0;
u32 val;
int i;
g = &pctrl->soc->groups[i];
val = msm_readl_intr_status(pctrl, g);
if (val & BIT(g->intr_status_bit)) {
- irq_pin = irq_find_mapping(gc->irq.domain, i);
- generic_handle_irq(irq_pin);
+ generic_handle_domain_irq(gc->irq.domain, i);
handled++;
}
}
{
struct samsung_pinctrl_drv_data *d = data;
struct samsung_pin_bank *bank = d->pin_banks;
- unsigned int svc, group, pin, virq;
+ unsigned int svc, group, pin;
+ int ret;
svc = readl(bank->eint_base + EXYNOS_SVC_OFFSET);
group = EXYNOS_SVC_GROUP(svc);
return IRQ_HANDLED;
bank += (group - 1);
- virq = irq_linear_revmap(bank->irq_domain, pin);
- if (!virq)
+ ret = generic_handle_domain_irq(bank->irq_domain, pin);
+ if (ret)
return IRQ_NONE;
- generic_handle_irq(virq);
+
return IRQ_HANDLED;
}
struct exynos_weint_data *eintd = irq_desc_get_handler_data(desc);
struct samsung_pin_bank *bank = eintd->bank;
struct irq_chip *chip = irq_desc_get_chip(desc);
- int eint_irq;
chained_irq_enter(chip, desc);
- eint_irq = irq_linear_revmap(bank->irq_domain, eintd->irq);
- generic_handle_irq(eint_irq);
+ generic_handle_domain_irq(bank->irq_domain, eintd->irq);
chained_irq_exit(chip, desc);
}
while (pend) {
irq = fls(pend) - 1;
- generic_handle_irq(irq_find_mapping(domain, irq));
+ generic_handle_domain_irq(domain, irq);
pend &= ~(1 << irq);
}
}
{
struct irq_data *data = irq_desc_get_irq_data(desc);
struct s3c24xx_eint_data *eint_data = irq_desc_get_handler_data(desc);
- unsigned int virq;
+ int ret;
/* the first 4 eints have a simple 1 to 1 mapping */
- virq = irq_linear_revmap(eint_data->domains[data->hwirq], data->hwirq);
+ ret = generic_handle_domain_irq(eint_data->domains[data->hwirq], data->hwirq);
/* Something must be really wrong if an unmapped EINT is unmasked */
- BUG_ON(!virq);
-
- generic_handle_irq(virq);
+ BUG_ON(ret);
}
/* Handling of EINTs 0-3 on S3C2412 and S3C2413 */
struct s3c24xx_eint_data *eint_data = irq_desc_get_handler_data(desc);
struct irq_data *data = irq_desc_get_irq_data(desc);
struct irq_chip *chip = irq_data_get_irq_chip(data);
- unsigned int virq;
+ int ret;
chained_irq_enter(chip, desc);
/* the first 4 eints have a simple 1 to 1 mapping */
- virq = irq_linear_revmap(eint_data->domains[data->hwirq], data->hwirq);
+ ret = generic_handle_domain_irq(eint_data->domains[data->hwirq], data->hwirq);
/* Something must be really wrong if an unmapped EINT is unmasked */
- BUG_ON(!virq);
-
- generic_handle_irq(virq);
+ BUG_ON(ret);
chained_irq_exit(chip, desc);
}
pend &= range;
while (pend) {
- unsigned int virq, irq;
+ unsigned int irq;
+ int ret;
irq = __ffs(pend);
pend &= ~(1 << irq);
- virq = irq_linear_revmap(data->domains[irq], irq - offset);
+ ret = generic_handle_domain_irq(data->domains[irq], irq - offset);
/* Something is really wrong if an unmapped EINT is unmasked */
- BUG_ON(!virq);
-
- generic_handle_irq(virq);
+ BUG_ON(ret);
}
chained_irq_exit(chip, desc);
unsigned int svc;
unsigned int group;
unsigned int pin;
- unsigned int virq;
+ int ret;
svc = readl(drvdata->virt_base + SERVICE_REG);
group = SVC_GROUP(svc);
pin -= 8;
}
- virq = irq_linear_revmap(data->domains[group], pin);
+ ret = generic_handle_domain_irq(data->domains[group], pin);
/*
* Something must be really wrong if an unmapped EINT
* was unmasked...
*/
- BUG_ON(!virq);
-
- generic_handle_irq(virq);
+ BUG_ON(ret);
} while (1);
chained_irq_exit(chip, desc);
pend &= range;
while (pend) {
- unsigned int virq, irq;
+ unsigned int irq;
+ int ret;
irq = fls(pend) - 1;
pend &= ~(1 << irq);
- virq = irq_linear_revmap(data->domains[irq], data->pins[irq]);
+ ret = generic_handle_domain_irq(data->domains[irq], data->pins[irq]);
/*
* Something must be really wrong if an unmapped EINT
* was unmasked...
*/
- BUG_ON(!virq);
-
- generic_handle_irq(virq);
+ BUG_ON(ret);
}
chained_irq_exit(chip, desc);
/* get correct irq line number */
pin = i * MAX_GPIO_PER_REG + pin;
- generic_handle_irq(
- irq_find_mapping(gc->irq.domain, pin));
+ generic_handle_domain_irq(gc->irq.domain, pin);
}
}
chained_irq_exit(irqchip, desc);
if (val) {
int irqoffset;
- for_each_set_bit(irqoffset, &val, IRQ_PER_BANK) {
- int pin_irq = irq_find_mapping(pctl->domain,
- bank * IRQ_PER_BANK + irqoffset);
- generic_handle_irq(pin_irq);
- }
+ for_each_set_bit(irqoffset, &val, IRQ_PER_BANK)
+ generic_handle_domain_irq(pctl->domain,
+ bank * IRQ_PER_BANK + irqoffset);
}
chained_irq_exit(chip, desc);
help
Reset support for STMicroelectronics boards.
+config POWER_RESET_TPS65086
+ bool "TPS65086 restart driver"
+ depends on MFD_TPS65086
+ help
+ This driver adds support for resetting the TPS65086 PMIC on restart.
+
config POWER_RESET_VERSATILE
bool "ARM Versatile family reboot driver"
depends on ARM
obj-$(CONFIG_POWER_RESET_REGULATOR) += regulator-poweroff.o
obj-$(CONFIG_POWER_RESET_RESTART) += restart-poweroff.o
obj-$(CONFIG_POWER_RESET_ST) += st-poweroff.o
+obj-$(CONFIG_POWER_RESET_TPS65086) += tps65086-restart.o
obj-$(CONFIG_POWER_RESET_VERSATILE) += arm-versatile-reboot.o
obj-$(CONFIG_POWER_RESET_VEXPRESS) += vexpress-poweroff.o
obj-$(CONFIG_POWER_RESET_XGENE) += xgene-reboot.o
#define MII_MARVELL_PHY_PAGE 22
#define MII_PHY_LED_CTRL 16
+#define MII_PHY_LED_POL_CTRL 17
#define MII_88E1318S_PHY_LED_TCR 18
#define MII_88E1318S_PHY_WOL_CTRL 16
#define MII_M1011_IEVENT 19
#define LED2_FORCE_ON (0x8 << 8)
#define LEDMASK GENMASK(11,8)
+#define MII_88E1318S_PHY_LED_POL_LED2 BIT(4)
+
+struct power_off_cfg {
+ char *mdio_node_name;
+ void (*phy_set_reg)(bool restart);
+};
+
static struct phy_device *phydev;
+static const struct power_off_cfg *cfg;
-static void mvphy_reg_intn(u16 data)
+static void linkstation_mvphy_reg_intn(bool restart)
{
int rc = 0, saved_page;
+ u16 data = 0;
+
+ if (restart)
+ data = MII_88E1318S_PHY_LED_TCR_FORCE_INT;
saved_page = phy_select_page(phydev, MII_MARVELL_LED_PAGE);
if (saved_page < 0)
dev_err(&phydev->mdio.dev, "Write register failed, %d\n", rc);
}
+static void readynas_mvphy_set_reg(bool restart)
+{
+ int rc = 0, saved_page;
+ u16 data = 0;
+
+ if (restart)
+ data = MII_88E1318S_PHY_LED_POL_LED2;
+
+ saved_page = phy_select_page(phydev, MII_MARVELL_LED_PAGE);
+ if (saved_page < 0)
+ goto err;
+
+ /* Set the LED[2].0 Polarity bit to the required state */
+ __phy_modify(phydev, MII_PHY_LED_POL_CTRL,
+ MII_88E1318S_PHY_LED_POL_LED2, data);
+
+ if (!data) {
+ /* If WOL was enabled and a magic packet was received before powering
+ * off, we won't be able to wake up by sending another magic packet.
+ * Clear WOL status.
+ */
+ __phy_write(phydev, MII_MARVELL_PHY_PAGE, MII_MARVELL_WOL_PAGE);
+ __phy_set_bits(phydev, MII_88E1318S_PHY_WOL_CTRL,
+ MII_88E1318S_PHY_WOL_CTRL_CLEAR_WOL_STATUS);
+ }
+err:
+ rc = phy_restore_page(phydev, saved_page, rc);
+ if (rc < 0)
+ dev_err(&phydev->mdio.dev, "Write register failed, %d\n", rc);
+}
+
+static const struct power_off_cfg linkstation_power_off_cfg = {
+ .mdio_node_name = "mdio",
+ .phy_set_reg = linkstation_mvphy_reg_intn,
+};
+
+static const struct power_off_cfg readynas_power_off_cfg = {
+ .mdio_node_name = "mdio-bus",
+ .phy_set_reg = readynas_mvphy_set_reg,
+};
+
static int linkstation_reboot_notifier(struct notifier_block *nb,
unsigned long action, void *unused)
{
if (action == SYS_RESTART)
- mvphy_reg_intn(MII_88E1318S_PHY_LED_TCR_FORCE_INT);
+ cfg->phy_set_reg(true);
return NOTIFY_DONE;
}
static void linkstation_poweroff(void)
{
unregister_reboot_notifier(&linkstation_reboot_nb);
- mvphy_reg_intn(0);
+ cfg->phy_set_reg(false);
kernel_restart("Power off");
}
static const struct of_device_id ls_poweroff_of_match[] = {
- { .compatible = "buffalo,ls421d" },
- { .compatible = "buffalo,ls421de" },
+ { .compatible = "buffalo,ls421d",
+ .data = &linkstation_power_off_cfg,
+ },
+ { .compatible = "buffalo,ls421de",
+ .data = &linkstation_power_off_cfg,
+ },
+ { .compatible = "netgear,readynas-duo-v2",
+ .data = &readynas_power_off_cfg,
+ },
{ },
};
{
struct mii_bus *bus;
struct device_node *dn;
+ const struct of_device_id *match;
dn = of_find_matching_node(NULL, ls_poweroff_of_match);
if (!dn)
return -ENODEV;
of_node_put(dn);
- dn = of_find_node_by_name(NULL, "mdio");
+ match = of_match_node(ls_poweroff_of_match, dn);
+ cfg = match->data;
+
+ dn = of_find_node_by_name(NULL, cfg->mdio_node_name);
if (!dn)
return -ENODEV;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2021 Emil Renner Berthing
+ */
+
+#include <linux/mfd/tps65086.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/reboot.h>
+
+struct tps65086_restart {
+ struct notifier_block handler;
+ struct device *dev;
+};
+
+static int tps65086_restart_notify(struct notifier_block *this,
+ unsigned long mode, void *cmd)
+{
+ struct tps65086_restart *tps65086_restart =
+ container_of(this, struct tps65086_restart, handler);
+ struct tps65086 *tps65086 = dev_get_drvdata(tps65086_restart->dev->parent);
+ int ret;
+
+ ret = regmap_write(tps65086->regmap, TPS65086_FORCESHUTDN, 1);
+ if (ret) {
+ dev_err(tps65086_restart->dev, "%s: error writing to tps65086 pmic: %d\n",
+ __func__, ret);
+ return NOTIFY_DONE;
+ }
+
+ /* give it a little time */
+ mdelay(200);
+
+ WARN_ON(1);
+
+ return NOTIFY_DONE;
+}
+
+static int tps65086_restart_probe(struct platform_device *pdev)
+{
+ struct tps65086_restart *tps65086_restart;
+ int ret;
+
+ tps65086_restart = devm_kzalloc(&pdev->dev, sizeof(*tps65086_restart), GFP_KERNEL);
+ if (!tps65086_restart)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, tps65086_restart);
+
+ tps65086_restart->handler.notifier_call = tps65086_restart_notify;
+ tps65086_restart->handler.priority = 192;
+ tps65086_restart->dev = &pdev->dev;
+
+ ret = register_restart_handler(&tps65086_restart->handler);
+ if (ret) {
+ dev_err(&pdev->dev, "%s: cannot register restart handler: %d\n",
+ __func__, ret);
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static int tps65086_restart_remove(struct platform_device *pdev)
+{
+ struct tps65086_restart *tps65086_restart = platform_get_drvdata(pdev);
+ int ret;
+
+ ret = unregister_restart_handler(&tps65086_restart->handler);
+ if (ret) {
+ dev_err(&pdev->dev, "%s: cannot unregister restart handler: %d\n",
+ __func__, ret);
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static const struct platform_device_id tps65086_restart_id_table[] = {
+ { "tps65086-reset", },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(platform, tps65086_restart_id_table);
+
+static struct platform_driver tps65086_restart_driver = {
+ .driver = {
+ .name = "tps65086-restart",
+ },
+ .probe = tps65086_restart_probe,
+ .remove = tps65086_restart_remove,
+ .id_table = tps65086_restart_id_table,
+};
+module_platform_driver(tps65086_restart_driver);
+
+MODULE_AUTHOR("Emil Renner Berthing <kernel@esmil.dk>");
+MODULE_DESCRIPTION("TPS65086 restart driver");
+MODULE_LICENSE("GPL v2");
config AXP288_FUEL_GAUGE
tristate "X-Powers AXP288 Fuel Gauge"
- depends on MFD_AXP20X && IIO
+ depends on MFD_AXP20X && IIO && IOSF_MBI
help
Say yes here to have support for X-Power power management IC (PMIC)
Fuel Gauge. The device provides battery statistics and status
Battery charger. This driver provides Battery charger power management
functions on the systems.
+config CHARGER_MT6360
+ tristate "Mediatek MT6360 Charger Driver"
+ depends on MFD_MT6360
+ depends on REGULATOR
+ select LINEAR_RANGES
+ help
+ Say Y here to enable MT6360 Charger Part.
+ The device supports High-Accuracy Voltage/Current Regulation,
+ Average Input Current Regulation, Battery Temperature Sensing,
+ Over-Temperature Protection, DPDM Detection for BC1.2.
+
config CHARGER_QCOM_SMBB
tristate "Qualcomm Switch-Mode Battery Charger and Boost"
depends on MFD_SPMI_PMIC || COMPILE_TEST
config CHARGER_SMB347
tristate "Summit Microelectronics SMB3XX Battery Charger"
depends on I2C
+ depends on REGULATOR
select REGMAP_I2C
help
Say Y to include support for Summit Microelectronics SMB345,
what is connected to USB PD ports from the EC and converts
that into power_supply properties.
+config CHARGER_CROS_PCHG
+ tristate "ChromeOS EC based peripheral charger"
+ depends on MFD_CROS_EC_DEV
+ default MFD_CROS_EC_DEV
+ help
+ Say Y here to enable ChromeOS EC based peripheral charge driver.
+ This driver gets various information about the devices connected to
+ the peripheral charge ports from the EC and converts that into
+ power_supply properties.
+
config CHARGER_SC2731
tristate "Spreadtrum SC2731 charger driver"
depends on MFD_SC27XX_PMIC || COMPILE_TEST
config RN5T618_POWER
tristate "RN5T618 charger/fuel gauge support"
depends on MFD_RN5T618
+ depends on RN5T618_ADC
+ depends on IIO
help
Say Y here to have support for RN5T618 PMIC family fuel gauge and charger.
This driver can also be built as a module. If so, the module will be
obj-$(CONFIG_CHARGER_88PM860X) += 88pm860x_charger.o
obj-$(CONFIG_CHARGER_PCF50633) += pcf50633-charger.o
obj-$(CONFIG_BATTERY_RX51) += rx51_battery.o
-obj-$(CONFIG_AB8500_BM) += ab8500_bmdata.o ab8500_charger.o ab8500_fg.o ab8500_btemp.o abx500_chargalg.o
+obj-$(CONFIG_AB8500_BM) += ab8500_bmdata.o ab8500_charger.o ab8500_fg.o ab8500_btemp.o ab8500_chargalg.o
obj-$(CONFIG_CHARGER_CPCAP) += cpcap-charger.o
obj-$(CONFIG_CHARGER_ISP1704) += isp1704_charger.o
obj-$(CONFIG_CHARGER_MAX8903) += max8903_charger.o
obj-$(CONFIG_CHARGER_MAX8997) += max8997_charger.o
obj-$(CONFIG_CHARGER_MAX8998) += max8998_charger.o
obj-$(CONFIG_CHARGER_MP2629) += mp2629_charger.o
+obj-$(CONFIG_CHARGER_MT6360) += mt6360_charger.o
obj-$(CONFIG_CHARGER_QCOM_SMBB) += qcom_smbb.o
obj-$(CONFIG_CHARGER_BQ2415X) += bq2415x_charger.o
obj-$(CONFIG_CHARGER_BQ24190) += bq24190_charger.o
obj-$(CONFIG_AXP288_FUEL_GAUGE) += axp288_fuel_gauge.o
obj-$(CONFIG_AXP288_CHARGER) += axp288_charger.o
obj-$(CONFIG_CHARGER_CROS_USBPD) += cros_usbpd-charger.o
+obj-$(CONFIG_CHARGER_CROS_PCHG) += cros_peripheral_charger.o
obj-$(CONFIG_CHARGER_SC2731) += sc2731_charger.o
obj-$(CONFIG_FUEL_GAUGE_SC27XX) += sc27xx_fuel_gauge.o
obj-$(CONFIG_CHARGER_UCS1002) += ucs1002_power.o
/*
* ADC for the battery thermistor.
- * When using the ABx500_ADC_THERM_BATCTRL the battery ID resistor is combined
+ * When using the AB8500_ADC_THERM_BATCTRL the battery ID resistor is combined
* with a NTC resistor to both identify the battery and to measure its
* temperature. Different phone manufactures uses different techniques to both
* identify the battery and to read its temperature.
*/
-enum abx500_adc_therm {
- ABx500_ADC_THERM_BATCTRL,
- ABx500_ADC_THERM_BATTEMP,
+enum ab8500_adc_therm {
+ AB8500_ADC_THERM_BATCTRL,
+ AB8500_ADC_THERM_BATTEMP,
};
/**
- * struct abx500_res_to_temp - defines one point in a temp to res curve. To
+ * struct ab8500_res_to_temp - defines one point in a temp to res curve. To
* be used in battery packs that combines the identification resistor with a
* NTC resistor.
* @temp: battery pack temperature in Celsius
* @resist: NTC resistor net total resistance
*/
-struct abx500_res_to_temp {
+struct ab8500_res_to_temp {
int temp;
int resist;
};
/**
- * struct abx500_v_to_cap - Table for translating voltage to capacity
+ * struct ab8500_v_to_cap - Table for translating voltage to capacity
* @voltage: Voltage in mV
* @capacity: Capacity in percent
*/
-struct abx500_v_to_cap {
+struct ab8500_v_to_cap {
int voltage;
int capacity;
};
/* Forward declaration */
-struct abx500_fg;
+struct ab8500_fg;
/**
- * struct abx500_fg_parameters - Fuel gauge algorithm parameters, in seconds
+ * struct ab8500_fg_parameters - Fuel gauge algorithm parameters, in seconds
* if not specified
* @recovery_sleep_timer: Time between measurements while recovering
* @recovery_total_time: Total recovery time
* @pcut_max_restart: Max number of restarts
* @pcut_debounce_time: Sets battery debounce time
*/
-struct abx500_fg_parameters {
+struct ab8500_fg_parameters {
int recovery_sleep_timer;
int recovery_total_time;
int init_timer;
};
/**
- * struct abx500_charger_maximization - struct used by the board config.
+ * struct ab8500_charger_maximization - struct used by the board config.
* @use_maxi: Enable maximization for this battery type
* @maxi_chg_curr: Maximum charger current allowed
* @maxi_wait_cycles: cycles to wait before setting charger current
* @charger_curr_step delta between two charger current settings (mA)
*/
-struct abx500_maxim_parameters {
+struct ab8500_maxim_parameters {
bool ena_maxi;
int chg_curr;
int wait_cycles;
};
/**
- * struct abx500_battery_type - different batteries supported
+ * struct ab8500_battery_type - different batteries supported
* @name: battery technology
* @resis_high: battery upper resistance limit
* @resis_low: battery lower resistance limit
* @n_batres_tbl_elements number of elements in the batres_tbl
* @batres_tbl battery internal resistance vs temperature table
*/
-struct abx500_battery_type {
+struct ab8500_battery_type {
int name;
int resis_high;
int resis_low;
int low_high_vol_lvl;
int battery_resistance;
int n_temp_tbl_elements;
- const struct abx500_res_to_temp *r_to_t_tbl;
+ const struct ab8500_res_to_temp *r_to_t_tbl;
int n_v_cap_tbl_elements;
- const struct abx500_v_to_cap *v_to_cap_tbl;
+ const struct ab8500_v_to_cap *v_to_cap_tbl;
int n_batres_tbl_elements;
const struct batres_vs_temp *batres_tbl;
};
/**
- * struct abx500_bm_capacity_levels - abx500 capacity level data
+ * struct ab8500_bm_capacity_levels - ab8500 capacity level data
* @critical: critical capacity level in percent
* @low: low capacity level in percent
* @normal: normal capacity level in percent
* @high: high capacity level in percent
* @full: full capacity level in percent
*/
-struct abx500_bm_capacity_levels {
+struct ab8500_bm_capacity_levels {
int critical;
int low;
int normal;
};
/**
- * struct abx500_bm_charger_parameters - Charger specific parameters
+ * struct ab8500_bm_charger_parameters - Charger specific parameters
* @usb_volt_max: maximum allowed USB charger voltage in mV
* @usb_curr_max: maximum allowed USB charger current in mA
* @ac_volt_max: maximum allowed AC charger voltage in mV
* @ac_curr_max: maximum allowed AC charger current in mA
*/
-struct abx500_bm_charger_parameters {
+struct ab8500_bm_charger_parameters {
int usb_volt_max;
int usb_curr_max;
int ac_volt_max;
};
/**
- * struct abx500_bm_data - abx500 battery management data
+ * struct ab8500_bm_data - ab8500 battery management data
* @temp_under under this temp, charging is stopped
* @temp_low between this temp and temp_under charging is reduced
* @temp_high between this temp and temp_over charging is reduced
* @bkup_bat_i current which we charge the backup battery with
* @no_maintenance indicates that maintenance charging is disabled
* @capacity_scaling indicates whether capacity scaling is to be used
- * @abx500_adc_therm placement of thermistor, batctrl or battemp adc
+ * @ab8500_adc_therm placement of thermistor, batctrl or battemp adc
* @chg_unknown_bat flag to enable charging of unknown batteries
* @enable_overshoot flag to enable VBAT overshoot control
* @auto_trig flag to enable auto adc trigger
* @chg_params charger parameters
* @fg_params fuel gauge parameters
*/
-struct abx500_bm_data {
+struct ab8500_bm_data {
int temp_under;
int temp_low;
int temp_high;
bool chg_unknown_bat;
bool enable_overshoot;
bool auto_trig;
- enum abx500_adc_therm adc_therm;
+ enum ab8500_adc_therm adc_therm;
int fg_res;
int n_btypes;
int batt_id;
int n_chg_in_curr;
int *chg_output_curr;
int *chg_input_curr;
- const struct abx500_maxim_parameters *maxi;
- const struct abx500_bm_capacity_levels *cap_levels;
- struct abx500_battery_type *bat_type;
- const struct abx500_bm_charger_parameters *chg_params;
- const struct abx500_fg_parameters *fg_params;
+ const struct ab8500_maxim_parameters *maxi;
+ const struct ab8500_bm_capacity_levels *cap_levels;
+ struct ab8500_battery_type *bat_type;
+ const struct ab8500_bm_charger_parameters *chg_params;
+ const struct ab8500_fg_parameters *fg_params;
};
enum {
/* Forward declaration */
struct ab8500_fg;
-/**
- * struct ab8500_fg_parameters - Fuel gauge algorithm parameters, in seconds
- * if not specified
- * @recovery_sleep_timer: Time between measurements while recovering
- * @recovery_total_time: Total recovery time
- * @init_timer: Measurement interval during startup
- * @init_discard_time: Time we discard voltage measurement at startup
- * @init_total_time: Total init time during startup
- * @high_curr_time: Time current has to be high to go to recovery
- * @accu_charging: FG accumulation time while charging
- * @accu_high_curr: FG accumulation time in high current mode
- * @high_curr_threshold: High current threshold, in mA
- * @lowbat_threshold: Low battery threshold, in mV
- * @battok_falling_th_sel0 Threshold in mV for battOk signal sel0
- * Resolution in 50 mV step.
- * @battok_raising_th_sel1 Threshold in mV for battOk signal sel1
- * Resolution in 50 mV step.
- * @user_cap_limit Capacity reported from user must be within this
- * limit to be considered as sane, in percentage
- * points.
- * @maint_thres This is the threshold where we stop reporting
- * battery full while in maintenance, in per cent
- * @pcut_enable: Enable power cut feature in ab8505
- * @pcut_max_time: Max time threshold
- * @pcut_flag_time: Flagtime threshold
- * @pcut_max_restart: Max number of restarts
- * @pcut_debunce_time: Sets battery debounce time
- */
-struct ab8500_fg_parameters {
- int recovery_sleep_timer;
- int recovery_total_time;
- int init_timer;
- int init_discard_time;
- int init_total_time;
- int high_curr_time;
- int accu_charging;
- int accu_high_curr;
- int high_curr_threshold;
- int lowbat_threshold;
- int battok_falling_th_sel0;
- int battok_raising_th_sel1;
- int user_cap_limit;
- int maint_thres;
- bool pcut_enable;
- u8 pcut_max_time;
- u8 pcut_flag_time;
- u8 pcut_max_restart;
- u8 pcut_debunce_time;
-};
-
-/**
- * struct ab8500_charger_maximization - struct used by the board config.
- * @use_maxi: Enable maximization for this battery type
- * @maxi_chg_curr: Maximum charger current allowed
- * @maxi_wait_cycles: cycles to wait before setting charger current
- * @charger_curr_step delta between two charger current settings (mA)
- */
-struct ab8500_maxim_parameters {
- bool ena_maxi;
- int chg_curr;
- int wait_cycles;
- int charger_curr_step;
-};
-
-/**
- * struct ab8500_bm_capacity_levels - ab8500 capacity level data
- * @critical: critical capacity level in percent
- * @low: low capacity level in percent
- * @normal: normal capacity level in percent
- * @high: high capacity level in percent
- * @full: full capacity level in percent
- */
-struct ab8500_bm_capacity_levels {
- int critical;
- int low;
- int normal;
- int high;
- int full;
-};
-
-/**
- * struct ab8500_bm_charger_parameters - Charger specific parameters
- * @usb_volt_max: maximum allowed USB charger voltage in mV
- * @usb_curr_max: maximum allowed USB charger current in mA
- * @ac_volt_max: maximum allowed AC charger voltage in mV
- * @ac_curr_max: maximum allowed AC charger current in mA
- */
-struct ab8500_bm_charger_parameters {
- int usb_volt_max;
- int usb_curr_max;
- int ac_volt_max;
- int ac_curr_max;
-};
-
-/**
- * struct ab8500_bm_data - ab8500 battery management data
- * @temp_under under this temp, charging is stopped
- * @temp_low between this temp and temp_under charging is reduced
- * @temp_high between this temp and temp_over charging is reduced
- * @temp_over over this temp, charging is stopped
- * @temp_interval_chg temperature measurement interval in s when charging
- * @temp_interval_nochg temperature measurement interval in s when not charging
- * @main_safety_tmr_h safety timer for main charger
- * @usb_safety_tmr_h safety timer for usb charger
- * @bkup_bat_v voltage which we charge the backup battery with
- * @bkup_bat_i current which we charge the backup battery with
- * @no_maintenance indicates that maintenance charging is disabled
- * @capacity_scaling indicates whether capacity scaling is to be used
- * @adc_therm placement of thermistor, batctrl or battemp adc
- * @chg_unknown_bat flag to enable charging of unknown batteries
- * @enable_overshoot flag to enable VBAT overshoot control
- * @fg_res resistance of FG resistor in 0.1mOhm
- * @n_btypes number of elements in array bat_type
- * @batt_id index of the identified battery in array bat_type
- * @interval_charging charge alg cycle period time when charging (sec)
- * @interval_not_charging charge alg cycle period time when not charging (sec)
- * @temp_hysteresis temperature hysteresis
- * @gnd_lift_resistance Battery ground to phone ground resistance (mOhm)
- * @maxi: maximization parameters
- * @cap_levels capacity in percent for the different capacity levels
- * @bat_type table of supported battery types
- * @chg_params charger parameters
- * @fg_params fuel gauge parameters
- */
-struct ab8500_bm_data {
- int temp_under;
- int temp_low;
- int temp_high;
- int temp_over;
- int temp_interval_chg;
- int temp_interval_nochg;
- int main_safety_tmr_h;
- int usb_safety_tmr_h;
- int bkup_bat_v;
- int bkup_bat_i;
- bool no_maintenance;
- bool capacity_scaling;
- bool chg_unknown_bat;
- bool enable_overshoot;
- enum abx500_adc_therm adc_therm;
- int fg_res;
- int n_btypes;
- int batt_id;
- int interval_charging;
- int interval_not_charging;
- int temp_hysteresis;
- int gnd_lift_resistance;
- const struct ab8500_maxim_parameters *maxi;
- const struct ab8500_bm_capacity_levels *cap_levels;
- const struct ab8500_bm_charger_parameters *chg_params;
- const struct ab8500_fg_parameters *fg_params;
-};
-
-extern struct abx500_bm_data ab8500_bm_data;
+extern struct ab8500_bm_data ab8500_bm_data;
void ab8500_charger_usb_state_changed(u8 bm_usb_state, u16 mA);
struct ab8500_fg *ab8500_fg_get(void);
int ab8500_fg_inst_curr_done(struct ab8500_fg *di);
int ab8500_bm_of_probe(struct device *dev,
struct device_node *np,
- struct abx500_bm_data *bm);
+ struct ab8500_bm_data *bm);
extern struct platform_driver ab8500_fg_driver;
extern struct platform_driver ab8500_btemp_driver;
-extern struct platform_driver abx500_chargalg_driver;
+extern struct platform_driver ab8500_chargalg_driver;
#endif /* _AB8500_CHARGER_H_ */
#include <linux/export.h>
#include <linux/power_supply.h>
#include <linux/of.h>
-#include <linux/mfd/abx500.h>
-#include <linux/mfd/abx500/ab8500.h>
#include "ab8500-bm.h"
* Note that the res_to_temp table must be strictly sorted by falling resistance
* values to work.
*/
-const struct abx500_res_to_temp ab8500_temp_tbl_a_thermistor[] = {
+const struct ab8500_res_to_temp ab8500_temp_tbl_a_thermistor[] = {
{-5, 53407},
{ 0, 48594},
{ 5, 43804},
const int ab8500_temp_tbl_a_size = ARRAY_SIZE(ab8500_temp_tbl_a_thermistor);
EXPORT_SYMBOL(ab8500_temp_tbl_a_size);
-const struct abx500_res_to_temp ab8500_temp_tbl_b_thermistor[] = {
+const struct ab8500_res_to_temp ab8500_temp_tbl_b_thermistor[] = {
{-5, 200000},
{ 0, 159024},
{ 5, 151921},
const int ab8500_temp_tbl_b_size = ARRAY_SIZE(ab8500_temp_tbl_b_thermistor);
EXPORT_SYMBOL(ab8500_temp_tbl_b_size);
-static const struct abx500_v_to_cap cap_tbl_a_thermistor[] = {
+static const struct ab8500_v_to_cap cap_tbl_a_thermistor[] = {
{4171, 100},
{4114, 95},
{4009, 83},
{3247, 0},
};
-static const struct abx500_v_to_cap cap_tbl_b_thermistor[] = {
+static const struct ab8500_v_to_cap cap_tbl_b_thermistor[] = {
{4161, 100},
{4124, 98},
{4044, 90},
{3250, 0},
};
-static const struct abx500_v_to_cap cap_tbl[] = {
+static const struct ab8500_v_to_cap cap_tbl[] = {
{4186, 100},
{4163, 99},
{4114, 95},
* Note that the res_to_temp table must be strictly sorted by falling
* resistance values to work.
*/
-static const struct abx500_res_to_temp temp_tbl[] = {
+static const struct ab8500_res_to_temp temp_tbl[] = {
{-5, 214834},
{ 0, 162943},
{ 5, 124820},
{-20, 180},
};
-static struct abx500_battery_type bat_type_thermistor[] = {
+static struct ab8500_battery_type bat_type_thermistor[] = {
[BATTERY_UNKNOWN] = {
/* First element always represent the UNKNOWN battery */
.name = POWER_SUPPLY_TECHNOLOGY_UNKNOWN,
},
};
-static struct abx500_battery_type bat_type_ext_thermistor[] = {
+static struct ab8500_battery_type bat_type_ext_thermistor[] = {
[BATTERY_UNKNOWN] = {
/* First element always represent the UNKNOWN battery */
.name = POWER_SUPPLY_TECHNOLOGY_UNKNOWN,
},
};
-static const struct abx500_bm_capacity_levels cap_levels = {
+static const struct ab8500_bm_capacity_levels cap_levels = {
.critical = 2,
.low = 10,
.normal = 70,
.full = 100,
};
-static const struct abx500_fg_parameters fg = {
+static const struct ab8500_fg_parameters fg = {
.recovery_sleep_timer = 10,
.recovery_total_time = 100,
.init_timer = 1,
.pcut_debounce_time = 2,
};
-static const struct abx500_maxim_parameters ab8500_maxi_params = {
+static const struct ab8500_maxim_parameters ab8500_maxi_params = {
.ena_maxi = true,
.chg_curr = 910,
.wait_cycles = 10,
.charger_curr_step = 100,
};
-static const struct abx500_bm_charger_parameters chg = {
+static const struct ab8500_bm_charger_parameters chg = {
.usb_volt_max = 5500,
.usb_curr_max = 1500,
.ac_volt_max = 7500,
700, 800, 900, 1000, 1100, 1300, 1400, 1500,
};
-struct abx500_bm_data ab8500_bm_data = {
+struct ab8500_bm_data ab8500_bm_data = {
.temp_under = 3,
.temp_low = 8,
.temp_high = 43,
.bkup_bat_i = BUP_ICH_SEL_150UA,
.no_maintenance = false,
.capacity_scaling = false,
- .adc_therm = ABx500_ADC_THERM_BATCTRL,
+ .adc_therm = AB8500_ADC_THERM_BATCTRL,
.chg_unknown_bat = false,
.enable_overshoot = false,
.fg_res = 100,
int ab8500_bm_of_probe(struct device *dev,
struct device_node *np,
- struct abx500_bm_data *bm)
+ struct ab8500_bm_data *bm)
{
const struct batres_vs_temp *tmp_batres_tbl;
struct device_node *battery_node;
} else {
bm->n_btypes = 4;
bm->bat_type = bat_type_ext_thermistor;
- bm->adc_therm = ABx500_ADC_THERM_BATTEMP;
+ bm->adc_therm = AB8500_ADC_THERM_BATTEMP;
tmp_batres_tbl = temp_to_batres_tbl_ext_thermistor;
}
#include <linux/mfd/abx500.h>
#include <linux/mfd/abx500/ab8500.h>
#include <linux/iio/consumer.h>
+#include <linux/fixp-arith.h>
#include "ab8500-bm.h"
struct iio_channel *btemp_ball;
struct iio_channel *bat_ctrl;
struct ab8500_fg *fg;
- struct abx500_bm_data *bm;
+ struct ab8500_bm_data *bm;
struct power_supply *btemp_psy;
struct ab8500_btemp_events events;
struct ab8500_btemp_ranges btemp_ranges;
return (450000 * (v_batctrl)) / (1800 - v_batctrl);
}
- if (di->bm->adc_therm == ABx500_ADC_THERM_BATCTRL) {
+ if (di->bm->adc_therm == AB8500_ADC_THERM_BATCTRL) {
/*
* If the battery has internal NTC, we use the current
* source to calculate the resistance.
return 0;
/* Only do this for batteries with internal NTC */
- if (di->bm->adc_therm == ABx500_ADC_THERM_BATCTRL && enable) {
+ if (di->bm->adc_therm == AB8500_ADC_THERM_BATCTRL && enable) {
if (di->curr_source == BTEMP_BATCTRL_CURR_SRC_7UA)
curr = BAT_CTRL_7U_ENA;
__func__);
goto disable_curr_source;
}
- } else if (di->bm->adc_therm == ABx500_ADC_THERM_BATCTRL && !enable) {
+ } else if (di->bm->adc_therm == AB8500_ADC_THERM_BATCTRL && !enable) {
dev_dbg(di->dev, "Disable BATCTRL curr source\n");
/* Write 0 to the curr bits */
* based on the NTC resistance.
*/
static int ab8500_btemp_res_to_temp(struct ab8500_btemp *di,
- const struct abx500_res_to_temp *tbl, int tbl_size, int res)
+ const struct ab8500_res_to_temp *tbl, int tbl_size, int res)
{
int i;
/*
i++;
}
- return tbl[i].temp + ((tbl[i + 1].temp - tbl[i].temp) *
- (res - tbl[i].resist)) / (tbl[i + 1].resist - tbl[i].resist);
+ return fixp_linear_interpolate(tbl[i].resist, tbl[i].temp,
+ tbl[i + 1].resist, tbl[i + 1].temp,
+ res);
}
/**
id = di->bm->batt_id;
- if (di->bm->adc_therm == ABx500_ADC_THERM_BATCTRL &&
+ if (di->bm->adc_therm == AB8500_ADC_THERM_BATCTRL &&
id != BATTERY_UNKNOWN) {
rbat = ab8500_btemp_get_batctrl_res(di);
dev_dbg(di->dev, "Battery detected on %s"
" low %d < res %d < high: %d"
" index: %d\n",
- di->bm->adc_therm == ABx500_ADC_THERM_BATCTRL ?
+ di->bm->adc_therm == AB8500_ADC_THERM_BATCTRL ?
"BATCTRL" : "BATTEMP",
di->bm->bat_type[i].resis_low, res,
di->bm->bat_type[i].resis_high, i);
* We only have to change current source if the
* detected type is Type 1.
*/
- if (di->bm->adc_therm == ABx500_ADC_THERM_BATCTRL &&
+ if (di->bm->adc_therm == AB8500_ADC_THERM_BATCTRL &&
di->bm->batt_id == 1) {
dev_dbg(di->dev, "Set BATCTRL current source to 20uA\n");
di->curr_source = BTEMP_BATCTRL_CURR_SRC_20UA;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) ST-Ericsson SA 2012
+ * Copyright (c) 2012 Sony Mobile Communications AB
+ *
+ * Charging algorithm driver for AB8500
+ *
+ * Authors:
+ * Johan Palsson <johan.palsson@stericsson.com>
+ * Karl Komierowski <karl.komierowski@stericsson.com>
+ * Arun R Murthy <arun.murthy@stericsson.com>
+ * Author: Imre Sunyi <imre.sunyi@sonymobile.com>
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/device.h>
+#include <linux/component.h>
+#include <linux/hrtimer.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/slab.h>
+#include <linux/platform_device.h>
+#include <linux/power_supply.h>
+#include <linux/completion.h>
+#include <linux/workqueue.h>
+#include <linux/kobject.h>
+#include <linux/of.h>
+#include <linux/mfd/core.h>
+#include <linux/mfd/abx500.h>
+#include <linux/mfd/abx500/ab8500.h>
+#include <linux/notifier.h>
+
+#include "ab8500-bm.h"
+#include "ab8500-chargalg.h"
+
+/* Watchdog kick interval */
+#define CHG_WD_INTERVAL (6 * HZ)
+
+/* End-of-charge criteria counter */
+#define EOC_COND_CNT 10
+
+/* One hour expressed in seconds */
+#define ONE_HOUR_IN_SECONDS 3600
+
+/* Five minutes expressed in seconds */
+#define FIVE_MINUTES_IN_SECONDS 300
+
+#define CHARGALG_CURR_STEP_LOW 0
+#define CHARGALG_CURR_STEP_HIGH 100
+
+enum ab8500_chargers {
+ NO_CHG,
+ AC_CHG,
+ USB_CHG,
+};
+
+struct ab8500_chargalg_charger_info {
+ enum ab8500_chargers conn_chg;
+ enum ab8500_chargers prev_conn_chg;
+ enum ab8500_chargers online_chg;
+ enum ab8500_chargers prev_online_chg;
+ enum ab8500_chargers charger_type;
+ bool usb_chg_ok;
+ bool ac_chg_ok;
+ int usb_volt;
+ int usb_curr;
+ int ac_volt;
+ int ac_curr;
+ int usb_vset;
+ int usb_iset;
+ int ac_vset;
+ int ac_iset;
+};
+
+struct ab8500_chargalg_suspension_status {
+ bool suspended_change;
+ bool ac_suspended;
+ bool usb_suspended;
+};
+
+struct ab8500_chargalg_current_step_status {
+ bool curr_step_change;
+ int curr_step;
+};
+
+struct ab8500_chargalg_battery_data {
+ int temp;
+ int volt;
+ int avg_curr;
+ int inst_curr;
+ int percent;
+};
+
+enum ab8500_chargalg_states {
+ STATE_HANDHELD_INIT,
+ STATE_HANDHELD,
+ STATE_CHG_NOT_OK_INIT,
+ STATE_CHG_NOT_OK,
+ STATE_HW_TEMP_PROTECT_INIT,
+ STATE_HW_TEMP_PROTECT,
+ STATE_NORMAL_INIT,
+ STATE_NORMAL,
+ STATE_WAIT_FOR_RECHARGE_INIT,
+ STATE_WAIT_FOR_RECHARGE,
+ STATE_MAINTENANCE_A_INIT,
+ STATE_MAINTENANCE_A,
+ STATE_MAINTENANCE_B_INIT,
+ STATE_MAINTENANCE_B,
+ STATE_TEMP_UNDEROVER_INIT,
+ STATE_TEMP_UNDEROVER,
+ STATE_TEMP_LOWHIGH_INIT,
+ STATE_TEMP_LOWHIGH,
+ STATE_SUSPENDED_INIT,
+ STATE_SUSPENDED,
+ STATE_OVV_PROTECT_INIT,
+ STATE_OVV_PROTECT,
+ STATE_SAFETY_TIMER_EXPIRED_INIT,
+ STATE_SAFETY_TIMER_EXPIRED,
+ STATE_BATT_REMOVED_INIT,
+ STATE_BATT_REMOVED,
+ STATE_WD_EXPIRED_INIT,
+ STATE_WD_EXPIRED,
+};
+
+static const char * const states[] = {
+ "HANDHELD_INIT",
+ "HANDHELD",
+ "CHG_NOT_OK_INIT",
+ "CHG_NOT_OK",
+ "HW_TEMP_PROTECT_INIT",
+ "HW_TEMP_PROTECT",
+ "NORMAL_INIT",
+ "NORMAL",
+ "WAIT_FOR_RECHARGE_INIT",
+ "WAIT_FOR_RECHARGE",
+ "MAINTENANCE_A_INIT",
+ "MAINTENANCE_A",
+ "MAINTENANCE_B_INIT",
+ "MAINTENANCE_B",
+ "TEMP_UNDEROVER_INIT",
+ "TEMP_UNDEROVER",
+ "TEMP_LOWHIGH_INIT",
+ "TEMP_LOWHIGH",
+ "SUSPENDED_INIT",
+ "SUSPENDED",
+ "OVV_PROTECT_INIT",
+ "OVV_PROTECT",
+ "SAFETY_TIMER_EXPIRED_INIT",
+ "SAFETY_TIMER_EXPIRED",
+ "BATT_REMOVED_INIT",
+ "BATT_REMOVED",
+ "WD_EXPIRED_INIT",
+ "WD_EXPIRED",
+};
+
+struct ab8500_chargalg_events {
+ bool batt_unknown;
+ bool mainextchnotok;
+ bool batt_ovv;
+ bool batt_rem;
+ bool btemp_underover;
+ bool btemp_lowhigh;
+ bool main_thermal_prot;
+ bool usb_thermal_prot;
+ bool main_ovv;
+ bool vbus_ovv;
+ bool usbchargernotok;
+ bool safety_timer_expired;
+ bool maintenance_timer_expired;
+ bool ac_wd_expired;
+ bool usb_wd_expired;
+ bool ac_cv_active;
+ bool usb_cv_active;
+ bool vbus_collapsed;
+};
+
+/**
+ * struct ab8500_charge_curr_maximization - Charger maximization parameters
+ * @original_iset: the non optimized/maximised charger current
+ * @current_iset: the charging current used at this moment
+ * @test_delta_i: the delta between the current we want to charge and the
+ current that is really going into the battery
+ * @condition_cnt: number of iterations needed before a new charger current
+ is set
+ * @max_current: maximum charger current
+ * @wait_cnt: to avoid too fast current step down in case of charger
+ * voltage collapse, we insert this delay between step
+ * down
+ * @level: tells in how many steps the charging current has been
+ increased
+ */
+struct ab8500_charge_curr_maximization {
+ int original_iset;
+ int current_iset;
+ int test_delta_i;
+ int condition_cnt;
+ int max_current;
+ int wait_cnt;
+ u8 level;
+};
+
+enum maxim_ret {
+ MAXIM_RET_NOACTION,
+ MAXIM_RET_CHANGE,
+ MAXIM_RET_IBAT_TOO_HIGH,
+};
+
+/**
+ * struct ab8500_chargalg - ab8500 Charging algorithm device information
+ * @dev: pointer to the structure device
+ * @charge_status: battery operating status
+ * @eoc_cnt: counter used to determine end-of_charge
+ * @maintenance_chg: indicate if maintenance charge is active
+ * @t_hyst_norm temperature hysteresis when the temperature has been
+ * over or under normal limits
+ * @t_hyst_lowhigh temperature hysteresis when the temperature has been
+ * over or under the high or low limits
+ * @charge_state: current state of the charging algorithm
+ * @ccm charging current maximization parameters
+ * @chg_info: information about connected charger types
+ * @batt_data: data of the battery
+ * @susp_status: current charger suspension status
+ * @bm: Platform specific battery management information
+ * @curr_status: Current step status for over-current protection
+ * @parent: pointer to the struct ab8500
+ * @chargalg_psy: structure that holds the battery properties exposed by
+ * the charging algorithm
+ * @events: structure for information about events triggered
+ * @chargalg_wq: work queue for running the charging algorithm
+ * @chargalg_periodic_work: work to run the charging algorithm periodically
+ * @chargalg_wd_work: work to kick the charger watchdog periodically
+ * @chargalg_work: work to run the charging algorithm instantly
+ * @safety_timer: charging safety timer
+ * @maintenance_timer: maintenance charging timer
+ * @chargalg_kobject: structure of type kobject
+ */
+struct ab8500_chargalg {
+ struct device *dev;
+ int charge_status;
+ int eoc_cnt;
+ bool maintenance_chg;
+ int t_hyst_norm;
+ int t_hyst_lowhigh;
+ enum ab8500_chargalg_states charge_state;
+ struct ab8500_charge_curr_maximization ccm;
+ struct ab8500_chargalg_charger_info chg_info;
+ struct ab8500_chargalg_battery_data batt_data;
+ struct ab8500_chargalg_suspension_status susp_status;
+ struct ab8500 *parent;
+ struct ab8500_chargalg_current_step_status curr_status;
+ struct ab8500_bm_data *bm;
+ struct power_supply *chargalg_psy;
+ struct ux500_charger *ac_chg;
+ struct ux500_charger *usb_chg;
+ struct ab8500_chargalg_events events;
+ struct workqueue_struct *chargalg_wq;
+ struct delayed_work chargalg_periodic_work;
+ struct delayed_work chargalg_wd_work;
+ struct work_struct chargalg_work;
+ struct hrtimer safety_timer;
+ struct hrtimer maintenance_timer;
+ struct kobject chargalg_kobject;
+};
+
+/*External charger prepare notifier*/
+BLOCKING_NOTIFIER_HEAD(charger_notifier_list);
+
+/* Main battery properties */
+static enum power_supply_property ab8500_chargalg_props[] = {
+ POWER_SUPPLY_PROP_STATUS,
+ POWER_SUPPLY_PROP_HEALTH,
+};
+
+struct ab8500_chargalg_sysfs_entry {
+ struct attribute attr;
+ ssize_t (*show)(struct ab8500_chargalg *di, char *buf);
+ ssize_t (*store)(struct ab8500_chargalg *di, const char *buf, size_t length);
+};
+
+/**
+ * ab8500_chargalg_safety_timer_expired() - Expiration of the safety timer
+ * @timer: pointer to the hrtimer structure
+ *
+ * This function gets called when the safety timer for the charger
+ * expires
+ */
+static enum hrtimer_restart
+ab8500_chargalg_safety_timer_expired(struct hrtimer *timer)
+{
+ struct ab8500_chargalg *di = container_of(timer, struct ab8500_chargalg,
+ safety_timer);
+ dev_err(di->dev, "Safety timer expired\n");
+ di->events.safety_timer_expired = true;
+
+ /* Trigger execution of the algorithm instantly */
+ queue_work(di->chargalg_wq, &di->chargalg_work);
+
+ return HRTIMER_NORESTART;
+}
+
+/**
+ * ab8500_chargalg_maintenance_timer_expired() - Expiration of
+ * the maintenance timer
+ * @timer: pointer to the timer structure
+ *
+ * This function gets called when the maintenence timer
+ * expires
+ */
+static enum hrtimer_restart
+ab8500_chargalg_maintenance_timer_expired(struct hrtimer *timer)
+{
+
+ struct ab8500_chargalg *di = container_of(timer, struct ab8500_chargalg,
+ maintenance_timer);
+
+ dev_dbg(di->dev, "Maintenance timer expired\n");
+ di->events.maintenance_timer_expired = true;
+
+ /* Trigger execution of the algorithm instantly */
+ queue_work(di->chargalg_wq, &di->chargalg_work);
+
+ return HRTIMER_NORESTART;
+}
+
+/**
+ * ab8500_chargalg_state_to() - Change charge state
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * This function gets called when a charge state change should occur
+ */
+static void ab8500_chargalg_state_to(struct ab8500_chargalg *di,
+ enum ab8500_chargalg_states state)
+{
+ dev_dbg(di->dev,
+ "State changed: %s (From state: [%d] %s =to=> [%d] %s )\n",
+ di->charge_state == state ? "NO" : "YES",
+ di->charge_state,
+ states[di->charge_state],
+ state,
+ states[state]);
+
+ di->charge_state = state;
+}
+
+static int ab8500_chargalg_check_charger_enable(struct ab8500_chargalg *di)
+{
+ switch (di->charge_state) {
+ case STATE_NORMAL:
+ case STATE_MAINTENANCE_A:
+ case STATE_MAINTENANCE_B:
+ break;
+ default:
+ return 0;
+ }
+
+ if (di->chg_info.charger_type & USB_CHG) {
+ return di->usb_chg->ops.check_enable(di->usb_chg,
+ di->bm->bat_type[di->bm->batt_id].normal_vol_lvl,
+ di->bm->bat_type[di->bm->batt_id].normal_cur_lvl);
+ } else if ((di->chg_info.charger_type & AC_CHG) &&
+ !(di->ac_chg->external)) {
+ return di->ac_chg->ops.check_enable(di->ac_chg,
+ di->bm->bat_type[di->bm->batt_id].normal_vol_lvl,
+ di->bm->bat_type[di->bm->batt_id].normal_cur_lvl);
+ }
+ return 0;
+}
+
+/**
+ * ab8500_chargalg_check_charger_connection() - Check charger connection change
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * This function will check if there is a change in the charger connection
+ * and change charge state accordingly. AC has precedence over USB.
+ */
+static int ab8500_chargalg_check_charger_connection(struct ab8500_chargalg *di)
+{
+ if (di->chg_info.conn_chg != di->chg_info.prev_conn_chg ||
+ di->susp_status.suspended_change) {
+ /*
+ * Charger state changed or suspension
+ * has changed since last update
+ */
+ if ((di->chg_info.conn_chg & AC_CHG) &&
+ !di->susp_status.ac_suspended) {
+ dev_dbg(di->dev, "Charging source is AC\n");
+ if (di->chg_info.charger_type != AC_CHG) {
+ di->chg_info.charger_type = AC_CHG;
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ }
+ } else if ((di->chg_info.conn_chg & USB_CHG) &&
+ !di->susp_status.usb_suspended) {
+ dev_dbg(di->dev, "Charging source is USB\n");
+ di->chg_info.charger_type = USB_CHG;
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ } else if (di->chg_info.conn_chg &&
+ (di->susp_status.ac_suspended ||
+ di->susp_status.usb_suspended)) {
+ dev_dbg(di->dev, "Charging is suspended\n");
+ di->chg_info.charger_type = NO_CHG;
+ ab8500_chargalg_state_to(di, STATE_SUSPENDED_INIT);
+ } else {
+ dev_dbg(di->dev, "Charging source is OFF\n");
+ di->chg_info.charger_type = NO_CHG;
+ ab8500_chargalg_state_to(di, STATE_HANDHELD_INIT);
+ }
+ di->chg_info.prev_conn_chg = di->chg_info.conn_chg;
+ di->susp_status.suspended_change = false;
+ }
+ return di->chg_info.conn_chg;
+}
+
+/**
+ * ab8500_chargalg_check_current_step_status() - Check charging current
+ * step status.
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * This function will check if there is a change in the charging current step
+ * and change charge state accordingly.
+ */
+static void ab8500_chargalg_check_current_step_status
+ (struct ab8500_chargalg *di)
+{
+ if (di->curr_status.curr_step_change)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ di->curr_status.curr_step_change = false;
+}
+
+/**
+ * ab8500_chargalg_start_safety_timer() - Start charging safety timer
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * The safety timer is used to avoid overcharging of old or bad batteries.
+ * There are different timers for AC and USB
+ */
+static void ab8500_chargalg_start_safety_timer(struct ab8500_chargalg *di)
+{
+ /* Charger-dependent expiration time in hours*/
+ int timer_expiration = 0;
+
+ switch (di->chg_info.charger_type) {
+ case AC_CHG:
+ timer_expiration = di->bm->main_safety_tmr_h;
+ break;
+
+ case USB_CHG:
+ timer_expiration = di->bm->usb_safety_tmr_h;
+ break;
+
+ default:
+ dev_err(di->dev, "Unknown charger to charge from\n");
+ break;
+ }
+
+ di->events.safety_timer_expired = false;
+ hrtimer_set_expires_range(&di->safety_timer,
+ ktime_set(timer_expiration * ONE_HOUR_IN_SECONDS, 0),
+ ktime_set(FIVE_MINUTES_IN_SECONDS, 0));
+ hrtimer_start_expires(&di->safety_timer, HRTIMER_MODE_REL);
+}
+
+/**
+ * ab8500_chargalg_stop_safety_timer() - Stop charging safety timer
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * The safety timer is stopped whenever the NORMAL state is exited
+ */
+static void ab8500_chargalg_stop_safety_timer(struct ab8500_chargalg *di)
+{
+ if (hrtimer_try_to_cancel(&di->safety_timer) >= 0)
+ di->events.safety_timer_expired = false;
+}
+
+/**
+ * ab8500_chargalg_start_maintenance_timer() - Start charging maintenance timer
+ * @di: pointer to the ab8500_chargalg structure
+ * @duration: duration of ther maintenance timer in hours
+ *
+ * The maintenance timer is used to maintain the charge in the battery once
+ * the battery is considered full. These timers are chosen to match the
+ * discharge curve of the battery
+ */
+static void ab8500_chargalg_start_maintenance_timer(struct ab8500_chargalg *di,
+ int duration)
+{
+ hrtimer_set_expires_range(&di->maintenance_timer,
+ ktime_set(duration * ONE_HOUR_IN_SECONDS, 0),
+ ktime_set(FIVE_MINUTES_IN_SECONDS, 0));
+ di->events.maintenance_timer_expired = false;
+ hrtimer_start_expires(&di->maintenance_timer, HRTIMER_MODE_REL);
+}
+
+/**
+ * ab8500_chargalg_stop_maintenance_timer() - Stop maintenance timer
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * The maintenance timer is stopped whenever maintenance ends or when another
+ * state is entered
+ */
+static void ab8500_chargalg_stop_maintenance_timer(struct ab8500_chargalg *di)
+{
+ if (hrtimer_try_to_cancel(&di->maintenance_timer) >= 0)
+ di->events.maintenance_timer_expired = false;
+}
+
+/**
+ * ab8500_chargalg_kick_watchdog() - Kick charger watchdog
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * The charger watchdog have to be kicked periodically whenever the charger is
+ * on, else the ABB will reset the system
+ */
+static int ab8500_chargalg_kick_watchdog(struct ab8500_chargalg *di)
+{
+ /* Check if charger exists and kick watchdog if charging */
+ if (di->ac_chg && di->ac_chg->ops.kick_wd &&
+ di->chg_info.online_chg & AC_CHG) {
+ /*
+ * If AB charger watchdog expired, pm2xxx charging
+ * gets disabled. To be safe, kick both AB charger watchdog
+ * and pm2xxx watchdog.
+ */
+ if (di->ac_chg->external &&
+ di->usb_chg && di->usb_chg->ops.kick_wd)
+ di->usb_chg->ops.kick_wd(di->usb_chg);
+
+ return di->ac_chg->ops.kick_wd(di->ac_chg);
+ } else if (di->usb_chg && di->usb_chg->ops.kick_wd &&
+ di->chg_info.online_chg & USB_CHG)
+ return di->usb_chg->ops.kick_wd(di->usb_chg);
+
+ return -ENXIO;
+}
+
+/**
+ * ab8500_chargalg_ac_en() - Turn on/off the AC charger
+ * @di: pointer to the ab8500_chargalg structure
+ * @enable: charger on/off
+ * @vset: requested charger output voltage
+ * @iset: requested charger output current
+ *
+ * The AC charger will be turned on/off with the requested charge voltage and
+ * current
+ */
+static int ab8500_chargalg_ac_en(struct ab8500_chargalg *di, int enable,
+ int vset, int iset)
+{
+ static int ab8500_chargalg_ex_ac_enable_toggle;
+
+ if (!di->ac_chg || !di->ac_chg->ops.enable)
+ return -ENXIO;
+
+ /* Select maximum of what both the charger and the battery supports */
+ if (di->ac_chg->max_out_volt)
+ vset = min(vset, di->ac_chg->max_out_volt);
+ if (di->ac_chg->max_out_curr)
+ iset = min(iset, di->ac_chg->max_out_curr);
+
+ di->chg_info.ac_iset = iset;
+ di->chg_info.ac_vset = vset;
+
+ /* Enable external charger */
+ if (enable && di->ac_chg->external &&
+ !ab8500_chargalg_ex_ac_enable_toggle) {
+ blocking_notifier_call_chain(&charger_notifier_list,
+ 0, di->dev);
+ ab8500_chargalg_ex_ac_enable_toggle++;
+ }
+
+ return di->ac_chg->ops.enable(di->ac_chg, enable, vset, iset);
+}
+
+/**
+ * ab8500_chargalg_usb_en() - Turn on/off the USB charger
+ * @di: pointer to the ab8500_chargalg structure
+ * @enable: charger on/off
+ * @vset: requested charger output voltage
+ * @iset: requested charger output current
+ *
+ * The USB charger will be turned on/off with the requested charge voltage and
+ * current
+ */
+static int ab8500_chargalg_usb_en(struct ab8500_chargalg *di, int enable,
+ int vset, int iset)
+{
+ if (!di->usb_chg || !di->usb_chg->ops.enable)
+ return -ENXIO;
+
+ /* Select maximum of what both the charger and the battery supports */
+ if (di->usb_chg->max_out_volt)
+ vset = min(vset, di->usb_chg->max_out_volt);
+ if (di->usb_chg->max_out_curr)
+ iset = min(iset, di->usb_chg->max_out_curr);
+
+ di->chg_info.usb_iset = iset;
+ di->chg_info.usb_vset = vset;
+
+ return di->usb_chg->ops.enable(di->usb_chg, enable, vset, iset);
+}
+
+/**
+ * ab8500_chargalg_update_chg_curr() - Update charger current
+ * @di: pointer to the ab8500_chargalg structure
+ * @iset: requested charger output current
+ *
+ * The charger output current will be updated for the charger
+ * that is currently in use
+ */
+static int ab8500_chargalg_update_chg_curr(struct ab8500_chargalg *di,
+ int iset)
+{
+ /* Check if charger exists and update current if charging */
+ if (di->ac_chg && di->ac_chg->ops.update_curr &&
+ di->chg_info.charger_type & AC_CHG) {
+ /*
+ * Select maximum of what both the charger
+ * and the battery supports
+ */
+ if (di->ac_chg->max_out_curr)
+ iset = min(iset, di->ac_chg->max_out_curr);
+
+ di->chg_info.ac_iset = iset;
+
+ return di->ac_chg->ops.update_curr(di->ac_chg, iset);
+ } else if (di->usb_chg && di->usb_chg->ops.update_curr &&
+ di->chg_info.charger_type & USB_CHG) {
+ /*
+ * Select maximum of what both the charger
+ * and the battery supports
+ */
+ if (di->usb_chg->max_out_curr)
+ iset = min(iset, di->usb_chg->max_out_curr);
+
+ di->chg_info.usb_iset = iset;
+
+ return di->usb_chg->ops.update_curr(di->usb_chg, iset);
+ }
+
+ return -ENXIO;
+}
+
+/**
+ * ab8500_chargalg_stop_charging() - Stop charging
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * This function is called from any state where charging should be stopped.
+ * All charging is disabled and all status parameters and timers are changed
+ * accordingly
+ */
+static void ab8500_chargalg_stop_charging(struct ab8500_chargalg *di)
+{
+ ab8500_chargalg_ac_en(di, false, 0, 0);
+ ab8500_chargalg_usb_en(di, false, 0, 0);
+ ab8500_chargalg_stop_safety_timer(di);
+ ab8500_chargalg_stop_maintenance_timer(di);
+ di->charge_status = POWER_SUPPLY_STATUS_NOT_CHARGING;
+ di->maintenance_chg = false;
+ cancel_delayed_work(&di->chargalg_wd_work);
+ power_supply_changed(di->chargalg_psy);
+}
+
+/**
+ * ab8500_chargalg_hold_charging() - Pauses charging
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * This function is called in the case where maintenance charging has been
+ * disabled and instead a battery voltage mode is entered to check when the
+ * battery voltage has reached a certain recharge voltage
+ */
+static void ab8500_chargalg_hold_charging(struct ab8500_chargalg *di)
+{
+ ab8500_chargalg_ac_en(di, false, 0, 0);
+ ab8500_chargalg_usb_en(di, false, 0, 0);
+ ab8500_chargalg_stop_safety_timer(di);
+ ab8500_chargalg_stop_maintenance_timer(di);
+ di->charge_status = POWER_SUPPLY_STATUS_CHARGING;
+ di->maintenance_chg = false;
+ cancel_delayed_work(&di->chargalg_wd_work);
+ power_supply_changed(di->chargalg_psy);
+}
+
+/**
+ * ab8500_chargalg_start_charging() - Start the charger
+ * @di: pointer to the ab8500_chargalg structure
+ * @vset: requested charger output voltage
+ * @iset: requested charger output current
+ *
+ * A charger will be enabled depending on the requested charger type that was
+ * detected previously.
+ */
+static void ab8500_chargalg_start_charging(struct ab8500_chargalg *di,
+ int vset, int iset)
+{
+ switch (di->chg_info.charger_type) {
+ case AC_CHG:
+ dev_dbg(di->dev,
+ "AC parameters: Vset %d, Ich %d\n", vset, iset);
+ ab8500_chargalg_usb_en(di, false, 0, 0);
+ ab8500_chargalg_ac_en(di, true, vset, iset);
+ break;
+
+ case USB_CHG:
+ dev_dbg(di->dev,
+ "USB parameters: Vset %d, Ich %d\n", vset, iset);
+ ab8500_chargalg_ac_en(di, false, 0, 0);
+ ab8500_chargalg_usb_en(di, true, vset, iset);
+ break;
+
+ default:
+ dev_err(di->dev, "Unknown charger to charge from\n");
+ break;
+ }
+}
+
+/**
+ * ab8500_chargalg_check_temp() - Check battery temperature ranges
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * The battery temperature is checked against the predefined limits and the
+ * charge state is changed accordingly
+ */
+static void ab8500_chargalg_check_temp(struct ab8500_chargalg *di)
+{
+ if (di->batt_data.temp > (di->bm->temp_low + di->t_hyst_norm) &&
+ di->batt_data.temp < (di->bm->temp_high - di->t_hyst_norm)) {
+ /* Temp OK! */
+ di->events.btemp_underover = false;
+ di->events.btemp_lowhigh = false;
+ di->t_hyst_norm = 0;
+ di->t_hyst_lowhigh = 0;
+ } else {
+ if (((di->batt_data.temp >= di->bm->temp_high) &&
+ (di->batt_data.temp <
+ (di->bm->temp_over - di->t_hyst_lowhigh))) ||
+ ((di->batt_data.temp >
+ (di->bm->temp_under + di->t_hyst_lowhigh)) &&
+ (di->batt_data.temp <= di->bm->temp_low))) {
+ /* TEMP minor!!!!! */
+ di->events.btemp_underover = false;
+ di->events.btemp_lowhigh = true;
+ di->t_hyst_norm = di->bm->temp_hysteresis;
+ di->t_hyst_lowhigh = 0;
+ } else if (di->batt_data.temp <= di->bm->temp_under ||
+ di->batt_data.temp >= di->bm->temp_over) {
+ /* TEMP major!!!!! */
+ di->events.btemp_underover = true;
+ di->events.btemp_lowhigh = false;
+ di->t_hyst_norm = 0;
+ di->t_hyst_lowhigh = di->bm->temp_hysteresis;
+ } else {
+ /* Within hysteresis */
+ dev_dbg(di->dev, "Within hysteresis limit temp: %d "
+ "hyst_lowhigh %d, hyst normal %d\n",
+ di->batt_data.temp, di->t_hyst_lowhigh,
+ di->t_hyst_norm);
+ }
+ }
+}
+
+/**
+ * ab8500_chargalg_check_charger_voltage() - Check charger voltage
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * Charger voltage is checked against maximum limit
+ */
+static void ab8500_chargalg_check_charger_voltage(struct ab8500_chargalg *di)
+{
+ if (di->chg_info.usb_volt > di->bm->chg_params->usb_volt_max)
+ di->chg_info.usb_chg_ok = false;
+ else
+ di->chg_info.usb_chg_ok = true;
+
+ if (di->chg_info.ac_volt > di->bm->chg_params->ac_volt_max)
+ di->chg_info.ac_chg_ok = false;
+ else
+ di->chg_info.ac_chg_ok = true;
+
+}
+
+/**
+ * ab8500_chargalg_end_of_charge() - Check if end-of-charge criteria is fulfilled
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * End-of-charge criteria is fulfilled when the battery voltage is above a
+ * certain limit and the battery current is below a certain limit for a
+ * predefined number of consecutive seconds. If true, the battery is full
+ */
+static void ab8500_chargalg_end_of_charge(struct ab8500_chargalg *di)
+{
+ if (di->charge_status == POWER_SUPPLY_STATUS_CHARGING &&
+ di->charge_state == STATE_NORMAL &&
+ !di->maintenance_chg && (di->batt_data.volt >=
+ di->bm->bat_type[di->bm->batt_id].termination_vol ||
+ di->events.usb_cv_active || di->events.ac_cv_active) &&
+ di->batt_data.avg_curr <
+ di->bm->bat_type[di->bm->batt_id].termination_curr &&
+ di->batt_data.avg_curr > 0) {
+ if (++di->eoc_cnt >= EOC_COND_CNT) {
+ di->eoc_cnt = 0;
+ di->charge_status = POWER_SUPPLY_STATUS_FULL;
+ di->maintenance_chg = true;
+ dev_dbg(di->dev, "EOC reached!\n");
+ power_supply_changed(di->chargalg_psy);
+ } else {
+ dev_dbg(di->dev,
+ " EOC limit reached for the %d"
+ " time, out of %d before EOC\n",
+ di->eoc_cnt,
+ EOC_COND_CNT);
+ }
+ } else {
+ di->eoc_cnt = 0;
+ }
+}
+
+static void init_maxim_chg_curr(struct ab8500_chargalg *di)
+{
+ di->ccm.original_iset =
+ di->bm->bat_type[di->bm->batt_id].normal_cur_lvl;
+ di->ccm.current_iset =
+ di->bm->bat_type[di->bm->batt_id].normal_cur_lvl;
+ di->ccm.test_delta_i = di->bm->maxi->charger_curr_step;
+ di->ccm.max_current = di->bm->maxi->chg_curr;
+ di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
+ di->ccm.level = 0;
+}
+
+/**
+ * ab8500_chargalg_chg_curr_maxim - increases the charger current to
+ * compensate for the system load
+ * @di pointer to the ab8500_chargalg structure
+ *
+ * This maximization function is used to raise the charger current to get the
+ * battery current as close to the optimal value as possible. The battery
+ * current during charging is affected by the system load
+ */
+static enum maxim_ret ab8500_chargalg_chg_curr_maxim(struct ab8500_chargalg *di)
+{
+ int delta_i;
+
+ if (!di->bm->maxi->ena_maxi)
+ return MAXIM_RET_NOACTION;
+
+ delta_i = di->ccm.original_iset - di->batt_data.inst_curr;
+
+ if (di->events.vbus_collapsed) {
+ dev_dbg(di->dev, "Charger voltage has collapsed %d\n",
+ di->ccm.wait_cnt);
+ if (di->ccm.wait_cnt == 0) {
+ dev_dbg(di->dev, "lowering current\n");
+ di->ccm.wait_cnt++;
+ di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
+ di->ccm.max_current =
+ di->ccm.current_iset - di->ccm.test_delta_i;
+ di->ccm.current_iset = di->ccm.max_current;
+ di->ccm.level--;
+ return MAXIM_RET_CHANGE;
+ } else {
+ dev_dbg(di->dev, "waiting\n");
+ /* Let's go in here twice before lowering curr again */
+ di->ccm.wait_cnt = (di->ccm.wait_cnt + 1) % 3;
+ return MAXIM_RET_NOACTION;
+ }
+ }
+
+ di->ccm.wait_cnt = 0;
+
+ if (di->batt_data.inst_curr > di->ccm.original_iset) {
+ dev_dbg(di->dev, " Maximization Ibat (%dmA) too high"
+ " (limit %dmA) (current iset: %dmA)!\n",
+ di->batt_data.inst_curr, di->ccm.original_iset,
+ di->ccm.current_iset);
+
+ if (di->ccm.current_iset == di->ccm.original_iset)
+ return MAXIM_RET_NOACTION;
+
+ di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
+ di->ccm.current_iset = di->ccm.original_iset;
+ di->ccm.level = 0;
+
+ return MAXIM_RET_IBAT_TOO_HIGH;
+ }
+
+ if (delta_i > di->ccm.test_delta_i &&
+ (di->ccm.current_iset + di->ccm.test_delta_i) <
+ di->ccm.max_current) {
+ if (di->ccm.condition_cnt-- == 0) {
+ /* Increse the iset with cco.test_delta_i */
+ di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
+ di->ccm.current_iset += di->ccm.test_delta_i;
+ di->ccm.level++;
+ dev_dbg(di->dev, " Maximization needed, increase"
+ " with %d mA to %dmA (Optimal ibat: %d)"
+ " Level %d\n",
+ di->ccm.test_delta_i,
+ di->ccm.current_iset,
+ di->ccm.original_iset,
+ di->ccm.level);
+ return MAXIM_RET_CHANGE;
+ } else {
+ return MAXIM_RET_NOACTION;
+ }
+ } else {
+ di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
+ return MAXIM_RET_NOACTION;
+ }
+}
+
+static void handle_maxim_chg_curr(struct ab8500_chargalg *di)
+{
+ enum maxim_ret ret;
+ int result;
+
+ ret = ab8500_chargalg_chg_curr_maxim(di);
+ switch (ret) {
+ case MAXIM_RET_CHANGE:
+ result = ab8500_chargalg_update_chg_curr(di,
+ di->ccm.current_iset);
+ if (result)
+ dev_err(di->dev, "failed to set chg curr\n");
+ break;
+ case MAXIM_RET_IBAT_TOO_HIGH:
+ result = ab8500_chargalg_update_chg_curr(di,
+ di->bm->bat_type[di->bm->batt_id].normal_cur_lvl);
+ if (result)
+ dev_err(di->dev, "failed to set chg curr\n");
+ break;
+
+ case MAXIM_RET_NOACTION:
+ default:
+ /* Do nothing..*/
+ break;
+ }
+}
+
+static int ab8500_chargalg_get_ext_psy_data(struct device *dev, void *data)
+{
+ struct power_supply *psy;
+ struct power_supply *ext = dev_get_drvdata(dev);
+ const char **supplicants = (const char **)ext->supplied_to;
+ struct ab8500_chargalg *di;
+ union power_supply_propval ret;
+ int j;
+ bool capacity_updated = false;
+
+ psy = (struct power_supply *)data;
+ di = power_supply_get_drvdata(psy);
+ /* For all psy where the driver name appears in any supplied_to */
+ j = match_string(supplicants, ext->num_supplicants, psy->desc->name);
+ if (j < 0)
+ return 0;
+
+ /*
+ * If external is not registering 'POWER_SUPPLY_PROP_CAPACITY' to its
+ * property because of handling that sysfs entry on its own, this is
+ * the place to get the battery capacity.
+ */
+ if (!power_supply_get_property(ext, POWER_SUPPLY_PROP_CAPACITY, &ret)) {
+ di->batt_data.percent = ret.intval;
+ capacity_updated = true;
+ }
+
+ /* Go through all properties for the psy */
+ for (j = 0; j < ext->desc->num_properties; j++) {
+ enum power_supply_property prop;
+ prop = ext->desc->properties[j];
+
+ /*
+ * Initialize chargers if not already done.
+ * The ab8500_charger*/
+ if (!di->ac_chg &&
+ ext->desc->type == POWER_SUPPLY_TYPE_MAINS)
+ di->ac_chg = psy_to_ux500_charger(ext);
+ else if (!di->usb_chg &&
+ ext->desc->type == POWER_SUPPLY_TYPE_USB)
+ di->usb_chg = psy_to_ux500_charger(ext);
+
+ if (power_supply_get_property(ext, prop, &ret))
+ continue;
+ switch (prop) {
+ case POWER_SUPPLY_PROP_PRESENT:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_BATTERY:
+ /* Battery present */
+ if (ret.intval)
+ di->events.batt_rem = false;
+ /* Battery removed */
+ else
+ di->events.batt_rem = true;
+ break;
+ case POWER_SUPPLY_TYPE_MAINS:
+ /* AC disconnected */
+ if (!ret.intval &&
+ (di->chg_info.conn_chg & AC_CHG)) {
+ di->chg_info.prev_conn_chg =
+ di->chg_info.conn_chg;
+ di->chg_info.conn_chg &= ~AC_CHG;
+ }
+ /* AC connected */
+ else if (ret.intval &&
+ !(di->chg_info.conn_chg & AC_CHG)) {
+ di->chg_info.prev_conn_chg =
+ di->chg_info.conn_chg;
+ di->chg_info.conn_chg |= AC_CHG;
+ }
+ break;
+ case POWER_SUPPLY_TYPE_USB:
+ /* USB disconnected */
+ if (!ret.intval &&
+ (di->chg_info.conn_chg & USB_CHG)) {
+ di->chg_info.prev_conn_chg =
+ di->chg_info.conn_chg;
+ di->chg_info.conn_chg &= ~USB_CHG;
+ }
+ /* USB connected */
+ else if (ret.intval &&
+ !(di->chg_info.conn_chg & USB_CHG)) {
+ di->chg_info.prev_conn_chg =
+ di->chg_info.conn_chg;
+ di->chg_info.conn_chg |= USB_CHG;
+ }
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_PROP_ONLINE:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_BATTERY:
+ break;
+ case POWER_SUPPLY_TYPE_MAINS:
+ /* AC offline */
+ if (!ret.intval &&
+ (di->chg_info.online_chg & AC_CHG)) {
+ di->chg_info.prev_online_chg =
+ di->chg_info.online_chg;
+ di->chg_info.online_chg &= ~AC_CHG;
+ }
+ /* AC online */
+ else if (ret.intval &&
+ !(di->chg_info.online_chg & AC_CHG)) {
+ di->chg_info.prev_online_chg =
+ di->chg_info.online_chg;
+ di->chg_info.online_chg |= AC_CHG;
+ queue_delayed_work(di->chargalg_wq,
+ &di->chargalg_wd_work, 0);
+ }
+ break;
+ case POWER_SUPPLY_TYPE_USB:
+ /* USB offline */
+ if (!ret.intval &&
+ (di->chg_info.online_chg & USB_CHG)) {
+ di->chg_info.prev_online_chg =
+ di->chg_info.online_chg;
+ di->chg_info.online_chg &= ~USB_CHG;
+ }
+ /* USB online */
+ else if (ret.intval &&
+ !(di->chg_info.online_chg & USB_CHG)) {
+ di->chg_info.prev_online_chg =
+ di->chg_info.online_chg;
+ di->chg_info.online_chg |= USB_CHG;
+ queue_delayed_work(di->chargalg_wq,
+ &di->chargalg_wd_work, 0);
+ }
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_PROP_HEALTH:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_BATTERY:
+ break;
+ case POWER_SUPPLY_TYPE_MAINS:
+ switch (ret.intval) {
+ case POWER_SUPPLY_HEALTH_UNSPEC_FAILURE:
+ di->events.mainextchnotok = true;
+ di->events.main_thermal_prot = false;
+ di->events.main_ovv = false;
+ di->events.ac_wd_expired = false;
+ break;
+ case POWER_SUPPLY_HEALTH_DEAD:
+ di->events.ac_wd_expired = true;
+ di->events.mainextchnotok = false;
+ di->events.main_ovv = false;
+ di->events.main_thermal_prot = false;
+ break;
+ case POWER_SUPPLY_HEALTH_COLD:
+ case POWER_SUPPLY_HEALTH_OVERHEAT:
+ di->events.main_thermal_prot = true;
+ di->events.mainextchnotok = false;
+ di->events.main_ovv = false;
+ di->events.ac_wd_expired = false;
+ break;
+ case POWER_SUPPLY_HEALTH_OVERVOLTAGE:
+ di->events.main_ovv = true;
+ di->events.mainextchnotok = false;
+ di->events.main_thermal_prot = false;
+ di->events.ac_wd_expired = false;
+ break;
+ case POWER_SUPPLY_HEALTH_GOOD:
+ di->events.main_thermal_prot = false;
+ di->events.mainextchnotok = false;
+ di->events.main_ovv = false;
+ di->events.ac_wd_expired = false;
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_TYPE_USB:
+ switch (ret.intval) {
+ case POWER_SUPPLY_HEALTH_UNSPEC_FAILURE:
+ di->events.usbchargernotok = true;
+ di->events.usb_thermal_prot = false;
+ di->events.vbus_ovv = false;
+ di->events.usb_wd_expired = false;
+ break;
+ case POWER_SUPPLY_HEALTH_DEAD:
+ di->events.usb_wd_expired = true;
+ di->events.usbchargernotok = false;
+ di->events.usb_thermal_prot = false;
+ di->events.vbus_ovv = false;
+ break;
+ case POWER_SUPPLY_HEALTH_COLD:
+ case POWER_SUPPLY_HEALTH_OVERHEAT:
+ di->events.usb_thermal_prot = true;
+ di->events.usbchargernotok = false;
+ di->events.vbus_ovv = false;
+ di->events.usb_wd_expired = false;
+ break;
+ case POWER_SUPPLY_HEALTH_OVERVOLTAGE:
+ di->events.vbus_ovv = true;
+ di->events.usbchargernotok = false;
+ di->events.usb_thermal_prot = false;
+ di->events.usb_wd_expired = false;
+ break;
+ case POWER_SUPPLY_HEALTH_GOOD:
+ di->events.usbchargernotok = false;
+ di->events.usb_thermal_prot = false;
+ di->events.vbus_ovv = false;
+ di->events.usb_wd_expired = false;
+ break;
+ default:
+ break;
+ }
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_PROP_VOLTAGE_NOW:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_BATTERY:
+ di->batt_data.volt = ret.intval / 1000;
+ break;
+ case POWER_SUPPLY_TYPE_MAINS:
+ di->chg_info.ac_volt = ret.intval / 1000;
+ break;
+ case POWER_SUPPLY_TYPE_USB:
+ di->chg_info.usb_volt = ret.intval / 1000;
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_PROP_VOLTAGE_AVG:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_MAINS:
+ /* AVG is used to indicate when we are
+ * in CV mode */
+ if (ret.intval)
+ di->events.ac_cv_active = true;
+ else
+ di->events.ac_cv_active = false;
+
+ break;
+ case POWER_SUPPLY_TYPE_USB:
+ /* AVG is used to indicate when we are
+ * in CV mode */
+ if (ret.intval)
+ di->events.usb_cv_active = true;
+ else
+ di->events.usb_cv_active = false;
+
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_PROP_TECHNOLOGY:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_BATTERY:
+ if (ret.intval)
+ di->events.batt_unknown = false;
+ else
+ di->events.batt_unknown = true;
+
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_PROP_TEMP:
+ di->batt_data.temp = ret.intval / 10;
+ break;
+
+ case POWER_SUPPLY_PROP_CURRENT_NOW:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_MAINS:
+ di->chg_info.ac_curr =
+ ret.intval / 1000;
+ break;
+ case POWER_SUPPLY_TYPE_USB:
+ di->chg_info.usb_curr =
+ ret.intval / 1000;
+ break;
+ case POWER_SUPPLY_TYPE_BATTERY:
+ di->batt_data.inst_curr = ret.intval / 1000;
+ break;
+ default:
+ break;
+ }
+ break;
+
+ case POWER_SUPPLY_PROP_CURRENT_AVG:
+ switch (ext->desc->type) {
+ case POWER_SUPPLY_TYPE_BATTERY:
+ di->batt_data.avg_curr = ret.intval / 1000;
+ break;
+ case POWER_SUPPLY_TYPE_USB:
+ if (ret.intval)
+ di->events.vbus_collapsed = true;
+ else
+ di->events.vbus_collapsed = false;
+ break;
+ default:
+ break;
+ }
+ break;
+ case POWER_SUPPLY_PROP_CAPACITY:
+ if (!capacity_updated)
+ di->batt_data.percent = ret.intval;
+ break;
+ default:
+ break;
+ }
+ }
+ return 0;
+}
+
+/**
+ * ab8500_chargalg_external_power_changed() - callback for power supply changes
+ * @psy: pointer to the structure power_supply
+ *
+ * This function is the entry point of the pointer external_power_changed
+ * of the structure power_supply.
+ * This function gets executed when there is a change in any external power
+ * supply that this driver needs to be notified of.
+ */
+static void ab8500_chargalg_external_power_changed(struct power_supply *psy)
+{
+ struct ab8500_chargalg *di = power_supply_get_drvdata(psy);
+
+ /*
+ * Trigger execution of the algorithm instantly and read
+ * all power_supply properties there instead
+ */
+ if (di->chargalg_wq)
+ queue_work(di->chargalg_wq, &di->chargalg_work);
+}
+
+/**
+ * ab8500_chargalg_algorithm() - Main function for the algorithm
+ * @di: pointer to the ab8500_chargalg structure
+ *
+ * This is the main control function for the charging algorithm.
+ * It is called periodically or when something happens that will
+ * trigger a state change
+ */
+static void ab8500_chargalg_algorithm(struct ab8500_chargalg *di)
+{
+ int charger_status;
+ int ret;
+ int curr_step_lvl;
+
+ /* Collect data from all power_supply class devices */
+ class_for_each_device(power_supply_class, NULL,
+ di->chargalg_psy, ab8500_chargalg_get_ext_psy_data);
+
+ ab8500_chargalg_end_of_charge(di);
+ ab8500_chargalg_check_temp(di);
+ ab8500_chargalg_check_charger_voltage(di);
+
+ charger_status = ab8500_chargalg_check_charger_connection(di);
+ ab8500_chargalg_check_current_step_status(di);
+
+ if (is_ab8500(di->parent)) {
+ ret = ab8500_chargalg_check_charger_enable(di);
+ if (ret < 0)
+ dev_err(di->dev, "Checking charger is enabled error"
+ ": Returned Value %d\n", ret);
+ }
+
+ /*
+ * First check if we have a charger connected.
+ * Also we don't allow charging of unknown batteries if configured
+ * this way
+ */
+ if (!charger_status ||
+ (di->events.batt_unknown && !di->bm->chg_unknown_bat)) {
+ if (di->charge_state != STATE_HANDHELD) {
+ di->events.safety_timer_expired = false;
+ ab8500_chargalg_state_to(di, STATE_HANDHELD_INIT);
+ }
+ }
+
+ /* If suspended, we should not continue checking the flags */
+ else if (di->charge_state == STATE_SUSPENDED_INIT ||
+ di->charge_state == STATE_SUSPENDED) {
+ /* We don't do anything here, just don,t continue */
+ }
+
+ /* Safety timer expiration */
+ else if (di->events.safety_timer_expired) {
+ if (di->charge_state != STATE_SAFETY_TIMER_EXPIRED)
+ ab8500_chargalg_state_to(di,
+ STATE_SAFETY_TIMER_EXPIRED_INIT);
+ }
+ /*
+ * Check if any interrupts has occured
+ * that will prevent us from charging
+ */
+
+ /* Battery removed */
+ else if (di->events.batt_rem) {
+ if (di->charge_state != STATE_BATT_REMOVED)
+ ab8500_chargalg_state_to(di, STATE_BATT_REMOVED_INIT);
+ }
+ /* Main or USB charger not ok. */
+ else if (di->events.mainextchnotok || di->events.usbchargernotok) {
+ /*
+ * If vbus_collapsed is set, we have to lower the charger
+ * current, which is done in the normal state below
+ */
+ if (di->charge_state != STATE_CHG_NOT_OK &&
+ !di->events.vbus_collapsed)
+ ab8500_chargalg_state_to(di, STATE_CHG_NOT_OK_INIT);
+ }
+ /* VBUS, Main or VBAT OVV. */
+ else if (di->events.vbus_ovv ||
+ di->events.main_ovv ||
+ di->events.batt_ovv ||
+ !di->chg_info.usb_chg_ok ||
+ !di->chg_info.ac_chg_ok) {
+ if (di->charge_state != STATE_OVV_PROTECT)
+ ab8500_chargalg_state_to(di, STATE_OVV_PROTECT_INIT);
+ }
+ /* USB Thermal, stop charging */
+ else if (di->events.main_thermal_prot ||
+ di->events.usb_thermal_prot) {
+ if (di->charge_state != STATE_HW_TEMP_PROTECT)
+ ab8500_chargalg_state_to(di,
+ STATE_HW_TEMP_PROTECT_INIT);
+ }
+ /* Battery temp over/under */
+ else if (di->events.btemp_underover) {
+ if (di->charge_state != STATE_TEMP_UNDEROVER)
+ ab8500_chargalg_state_to(di,
+ STATE_TEMP_UNDEROVER_INIT);
+ }
+ /* Watchdog expired */
+ else if (di->events.ac_wd_expired ||
+ di->events.usb_wd_expired) {
+ if (di->charge_state != STATE_WD_EXPIRED)
+ ab8500_chargalg_state_to(di, STATE_WD_EXPIRED_INIT);
+ }
+ /* Battery temp high/low */
+ else if (di->events.btemp_lowhigh) {
+ if (di->charge_state != STATE_TEMP_LOWHIGH)
+ ab8500_chargalg_state_to(di, STATE_TEMP_LOWHIGH_INIT);
+ }
+
+ dev_dbg(di->dev,
+ "[CHARGALG] Vb %d Ib_avg %d Ib_inst %d Tb %d Cap %d Maint %d "
+ "State %s Active_chg %d Chg_status %d AC %d USB %d "
+ "AC_online %d USB_online %d AC_CV %d USB_CV %d AC_I %d "
+ "USB_I %d AC_Vset %d AC_Iset %d USB_Vset %d USB_Iset %d\n",
+ di->batt_data.volt,
+ di->batt_data.avg_curr,
+ di->batt_data.inst_curr,
+ di->batt_data.temp,
+ di->batt_data.percent,
+ di->maintenance_chg,
+ states[di->charge_state],
+ di->chg_info.charger_type,
+ di->charge_status,
+ di->chg_info.conn_chg & AC_CHG,
+ di->chg_info.conn_chg & USB_CHG,
+ di->chg_info.online_chg & AC_CHG,
+ di->chg_info.online_chg & USB_CHG,
+ di->events.ac_cv_active,
+ di->events.usb_cv_active,
+ di->chg_info.ac_curr,
+ di->chg_info.usb_curr,
+ di->chg_info.ac_vset,
+ di->chg_info.ac_iset,
+ di->chg_info.usb_vset,
+ di->chg_info.usb_iset);
+
+ switch (di->charge_state) {
+ case STATE_HANDHELD_INIT:
+ ab8500_chargalg_stop_charging(di);
+ di->charge_status = POWER_SUPPLY_STATUS_DISCHARGING;
+ ab8500_chargalg_state_to(di, STATE_HANDHELD);
+ fallthrough;
+
+ case STATE_HANDHELD:
+ break;
+
+ case STATE_SUSPENDED_INIT:
+ if (di->susp_status.ac_suspended)
+ ab8500_chargalg_ac_en(di, false, 0, 0);
+ if (di->susp_status.usb_suspended)
+ ab8500_chargalg_usb_en(di, false, 0, 0);
+ ab8500_chargalg_stop_safety_timer(di);
+ ab8500_chargalg_stop_maintenance_timer(di);
+ di->charge_status = POWER_SUPPLY_STATUS_NOT_CHARGING;
+ di->maintenance_chg = false;
+ ab8500_chargalg_state_to(di, STATE_SUSPENDED);
+ power_supply_changed(di->chargalg_psy);
+ fallthrough;
+
+ case STATE_SUSPENDED:
+ /* CHARGING is suspended */
+ break;
+
+ case STATE_BATT_REMOVED_INIT:
+ ab8500_chargalg_stop_charging(di);
+ ab8500_chargalg_state_to(di, STATE_BATT_REMOVED);
+ fallthrough;
+
+ case STATE_BATT_REMOVED:
+ if (!di->events.batt_rem)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+
+ case STATE_HW_TEMP_PROTECT_INIT:
+ ab8500_chargalg_stop_charging(di);
+ ab8500_chargalg_state_to(di, STATE_HW_TEMP_PROTECT);
+ fallthrough;
+
+ case STATE_HW_TEMP_PROTECT:
+ if (!di->events.main_thermal_prot &&
+ !di->events.usb_thermal_prot)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+
+ case STATE_OVV_PROTECT_INIT:
+ ab8500_chargalg_stop_charging(di);
+ ab8500_chargalg_state_to(di, STATE_OVV_PROTECT);
+ fallthrough;
+
+ case STATE_OVV_PROTECT:
+ if (!di->events.vbus_ovv &&
+ !di->events.main_ovv &&
+ !di->events.batt_ovv &&
+ di->chg_info.usb_chg_ok &&
+ di->chg_info.ac_chg_ok)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+
+ case STATE_CHG_NOT_OK_INIT:
+ ab8500_chargalg_stop_charging(di);
+ ab8500_chargalg_state_to(di, STATE_CHG_NOT_OK);
+ fallthrough;
+
+ case STATE_CHG_NOT_OK:
+ if (!di->events.mainextchnotok &&
+ !di->events.usbchargernotok)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+
+ case STATE_SAFETY_TIMER_EXPIRED_INIT:
+ ab8500_chargalg_stop_charging(di);
+ ab8500_chargalg_state_to(di, STATE_SAFETY_TIMER_EXPIRED);
+ fallthrough;
+
+ case STATE_SAFETY_TIMER_EXPIRED:
+ /* We exit this state when charger is removed */
+ break;
+
+ case STATE_NORMAL_INIT:
+ if (di->curr_status.curr_step == CHARGALG_CURR_STEP_LOW)
+ ab8500_chargalg_stop_charging(di);
+ else {
+ curr_step_lvl = di->bm->bat_type[
+ di->bm->batt_id].normal_cur_lvl
+ * di->curr_status.curr_step
+ / CHARGALG_CURR_STEP_HIGH;
+ ab8500_chargalg_start_charging(di,
+ di->bm->bat_type[di->bm->batt_id]
+ .normal_vol_lvl, curr_step_lvl);
+ }
+
+ ab8500_chargalg_state_to(di, STATE_NORMAL);
+ ab8500_chargalg_start_safety_timer(di);
+ ab8500_chargalg_stop_maintenance_timer(di);
+ init_maxim_chg_curr(di);
+ di->charge_status = POWER_SUPPLY_STATUS_CHARGING;
+ di->eoc_cnt = 0;
+ di->maintenance_chg = false;
+ power_supply_changed(di->chargalg_psy);
+
+ break;
+
+ case STATE_NORMAL:
+ handle_maxim_chg_curr(di);
+ if (di->charge_status == POWER_SUPPLY_STATUS_FULL &&
+ di->maintenance_chg) {
+ if (di->bm->no_maintenance)
+ ab8500_chargalg_state_to(di,
+ STATE_WAIT_FOR_RECHARGE_INIT);
+ else
+ ab8500_chargalg_state_to(di,
+ STATE_MAINTENANCE_A_INIT);
+ }
+ break;
+
+ /* This state will be used when the maintenance state is disabled */
+ case STATE_WAIT_FOR_RECHARGE_INIT:
+ ab8500_chargalg_hold_charging(di);
+ ab8500_chargalg_state_to(di, STATE_WAIT_FOR_RECHARGE);
+ fallthrough;
+
+ case STATE_WAIT_FOR_RECHARGE:
+ if (di->batt_data.percent <=
+ di->bm->bat_type[di->bm->batt_id].recharge_cap)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+
+ case STATE_MAINTENANCE_A_INIT:
+ ab8500_chargalg_stop_safety_timer(di);
+ ab8500_chargalg_start_maintenance_timer(di,
+ di->bm->bat_type[
+ di->bm->batt_id].maint_a_chg_timer_h);
+ ab8500_chargalg_start_charging(di,
+ di->bm->bat_type[
+ di->bm->batt_id].maint_a_vol_lvl,
+ di->bm->bat_type[
+ di->bm->batt_id].maint_a_cur_lvl);
+ ab8500_chargalg_state_to(di, STATE_MAINTENANCE_A);
+ power_supply_changed(di->chargalg_psy);
+ fallthrough;
+
+ case STATE_MAINTENANCE_A:
+ if (di->events.maintenance_timer_expired) {
+ ab8500_chargalg_stop_maintenance_timer(di);
+ ab8500_chargalg_state_to(di, STATE_MAINTENANCE_B_INIT);
+ }
+ break;
+
+ case STATE_MAINTENANCE_B_INIT:
+ ab8500_chargalg_start_maintenance_timer(di,
+ di->bm->bat_type[
+ di->bm->batt_id].maint_b_chg_timer_h);
+ ab8500_chargalg_start_charging(di,
+ di->bm->bat_type[
+ di->bm->batt_id].maint_b_vol_lvl,
+ di->bm->bat_type[
+ di->bm->batt_id].maint_b_cur_lvl);
+ ab8500_chargalg_state_to(di, STATE_MAINTENANCE_B);
+ power_supply_changed(di->chargalg_psy);
+ fallthrough;
+
+ case STATE_MAINTENANCE_B:
+ if (di->events.maintenance_timer_expired) {
+ ab8500_chargalg_stop_maintenance_timer(di);
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ }
+ break;
+
+ case STATE_TEMP_LOWHIGH_INIT:
+ ab8500_chargalg_start_charging(di,
+ di->bm->bat_type[
+ di->bm->batt_id].low_high_vol_lvl,
+ di->bm->bat_type[
+ di->bm->batt_id].low_high_cur_lvl);
+ ab8500_chargalg_stop_maintenance_timer(di);
+ di->charge_status = POWER_SUPPLY_STATUS_CHARGING;
+ ab8500_chargalg_state_to(di, STATE_TEMP_LOWHIGH);
+ power_supply_changed(di->chargalg_psy);
+ fallthrough;
+
+ case STATE_TEMP_LOWHIGH:
+ if (!di->events.btemp_lowhigh)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+
+ case STATE_WD_EXPIRED_INIT:
+ ab8500_chargalg_stop_charging(di);
+ ab8500_chargalg_state_to(di, STATE_WD_EXPIRED);
+ fallthrough;
+
+ case STATE_WD_EXPIRED:
+ if (!di->events.ac_wd_expired &&
+ !di->events.usb_wd_expired)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+
+ case STATE_TEMP_UNDEROVER_INIT:
+ ab8500_chargalg_stop_charging(di);
+ ab8500_chargalg_state_to(di, STATE_TEMP_UNDEROVER);
+ fallthrough;
+
+ case STATE_TEMP_UNDEROVER:
+ if (!di->events.btemp_underover)
+ ab8500_chargalg_state_to(di, STATE_NORMAL_INIT);
+ break;
+ }
+
+ /* Start charging directly if the new state is a charge state */
+ if (di->charge_state == STATE_NORMAL_INIT ||
+ di->charge_state == STATE_MAINTENANCE_A_INIT ||
+ di->charge_state == STATE_MAINTENANCE_B_INIT)
+ queue_work(di->chargalg_wq, &di->chargalg_work);
+}
+
+/**
+ * ab8500_chargalg_periodic_work() - Periodic work for the algorithm
+ * @work: pointer to the work_struct structure
+ *
+ * Work queue function for the charging algorithm
+ */
+static void ab8500_chargalg_periodic_work(struct work_struct *work)
+{
+ struct ab8500_chargalg *di = container_of(work,
+ struct ab8500_chargalg, chargalg_periodic_work.work);
+
+ ab8500_chargalg_algorithm(di);
+
+ /*
+ * If a charger is connected then the battery has to be monitored
+ * frequently, else the work can be delayed.
+ */
+ if (di->chg_info.conn_chg)
+ queue_delayed_work(di->chargalg_wq,
+ &di->chargalg_periodic_work,
+ di->bm->interval_charging * HZ);
+ else
+ queue_delayed_work(di->chargalg_wq,
+ &di->chargalg_periodic_work,
+ di->bm->interval_not_charging * HZ);
+}
+
+/**
+ * ab8500_chargalg_wd_work() - periodic work to kick the charger watchdog
+ * @work: pointer to the work_struct structure
+ *
+ * Work queue function for kicking the charger watchdog
+ */
+static void ab8500_chargalg_wd_work(struct work_struct *work)
+{
+ int ret;
+ struct ab8500_chargalg *di = container_of(work,
+ struct ab8500_chargalg, chargalg_wd_work.work);
+
+ ret = ab8500_chargalg_kick_watchdog(di);
+ if (ret < 0)
+ dev_err(di->dev, "failed to kick watchdog\n");
+
+ queue_delayed_work(di->chargalg_wq,
+ &di->chargalg_wd_work, CHG_WD_INTERVAL);
+}
+
+/**
+ * ab8500_chargalg_work() - Work to run the charging algorithm instantly
+ * @work: pointer to the work_struct structure
+ *
+ * Work queue function for calling the charging algorithm
+ */
+static void ab8500_chargalg_work(struct work_struct *work)
+{
+ struct ab8500_chargalg *di = container_of(work,
+ struct ab8500_chargalg, chargalg_work);
+
+ ab8500_chargalg_algorithm(di);
+}
+
+/**
+ * ab8500_chargalg_get_property() - get the chargalg properties
+ * @psy: pointer to the power_supply structure
+ * @psp: pointer to the power_supply_property structure
+ * @val: pointer to the power_supply_propval union
+ *
+ * This function gets called when an application tries to get the
+ * chargalg properties by reading the sysfs files.
+ * status: charging/discharging/full/unknown
+ * health: health of the battery
+ * Returns error code in case of failure else 0 on success
+ */
+static int ab8500_chargalg_get_property(struct power_supply *psy,
+ enum power_supply_property psp,
+ union power_supply_propval *val)
+{
+ struct ab8500_chargalg *di = power_supply_get_drvdata(psy);
+
+ switch (psp) {
+ case POWER_SUPPLY_PROP_STATUS:
+ val->intval = di->charge_status;
+ break;
+ case POWER_SUPPLY_PROP_HEALTH:
+ if (di->events.batt_ovv) {
+ val->intval = POWER_SUPPLY_HEALTH_OVERVOLTAGE;
+ } else if (di->events.btemp_underover) {
+ if (di->batt_data.temp <= di->bm->temp_under)
+ val->intval = POWER_SUPPLY_HEALTH_COLD;
+ else
+ val->intval = POWER_SUPPLY_HEALTH_OVERHEAT;
+ } else if (di->charge_state == STATE_SAFETY_TIMER_EXPIRED ||
+ di->charge_state == STATE_SAFETY_TIMER_EXPIRED_INIT) {
+ val->intval = POWER_SUPPLY_HEALTH_UNSPEC_FAILURE;
+ } else {
+ val->intval = POWER_SUPPLY_HEALTH_GOOD;
+ }
+ break;
+ default:
+ return -EINVAL;
+ }
+ return 0;
+}
+
+/* Exposure to the sysfs interface */
+
+static ssize_t ab8500_chargalg_curr_step_show(struct ab8500_chargalg *di,
+ char *buf)
+{
+ return sprintf(buf, "%d\n", di->curr_status.curr_step);
+}
+
+static ssize_t ab8500_chargalg_curr_step_store(struct ab8500_chargalg *di,
+ const char *buf, size_t length)
+{
+ long param;
+ int ret;
+
+ ret = kstrtol(buf, 10, ¶m);
+ if (ret < 0)
+ return ret;
+
+ di->curr_status.curr_step = param;
+ if (di->curr_status.curr_step >= CHARGALG_CURR_STEP_LOW &&
+ di->curr_status.curr_step <= CHARGALG_CURR_STEP_HIGH) {
+ di->curr_status.curr_step_change = true;
+ queue_work(di->chargalg_wq, &di->chargalg_work);
+ } else
+ dev_info(di->dev, "Wrong current step\n"
+ "Enter 0. Disable AC/USB Charging\n"
+ "1--100. Set AC/USB charging current step\n"
+ "100. Enable AC/USB Charging\n");
+
+ return strlen(buf);
+}
+
+
+static ssize_t ab8500_chargalg_en_show(struct ab8500_chargalg *di,
+ char *buf)
+{
+ return sprintf(buf, "%d\n",
+ di->susp_status.ac_suspended &&
+ di->susp_status.usb_suspended);
+}
+
+static ssize_t ab8500_chargalg_en_store(struct ab8500_chargalg *di,
+ const char *buf, size_t length)
+{
+ long param;
+ int ac_usb;
+ int ret;
+
+ ret = kstrtol(buf, 10, ¶m);
+ if (ret < 0)
+ return ret;
+
+ ac_usb = param;
+ switch (ac_usb) {
+ case 0:
+ /* Disable charging */
+ di->susp_status.ac_suspended = true;
+ di->susp_status.usb_suspended = true;
+ di->susp_status.suspended_change = true;
+ /* Trigger a state change */
+ queue_work(di->chargalg_wq,
+ &di->chargalg_work);
+ break;
+ case 1:
+ /* Enable AC Charging */
+ di->susp_status.ac_suspended = false;
+ di->susp_status.suspended_change = true;
+ /* Trigger a state change */
+ queue_work(di->chargalg_wq,
+ &di->chargalg_work);
+ break;
+ case 2:
+ /* Enable USB charging */
+ di->susp_status.usb_suspended = false;
+ di->susp_status.suspended_change = true;
+ /* Trigger a state change */
+ queue_work(di->chargalg_wq,
+ &di->chargalg_work);
+ break;
+ default:
+ dev_info(di->dev, "Wrong input\n"
+ "Enter 0. Disable AC/USB Charging\n"
+ "1. Enable AC charging\n"
+ "2. Enable USB Charging\n");
+ }
+ return strlen(buf);
+}
+
+static struct ab8500_chargalg_sysfs_entry ab8500_chargalg_en_charger =
+ __ATTR(chargalg, 0644, ab8500_chargalg_en_show,
+ ab8500_chargalg_en_store);
+
+static struct ab8500_chargalg_sysfs_entry ab8500_chargalg_curr_step =
+ __ATTR(chargalg_curr_step, 0644, ab8500_chargalg_curr_step_show,
+ ab8500_chargalg_curr_step_store);
+
+static ssize_t ab8500_chargalg_sysfs_show(struct kobject *kobj,
+ struct attribute *attr, char *buf)
+{
+ struct ab8500_chargalg_sysfs_entry *entry = container_of(attr,
+ struct ab8500_chargalg_sysfs_entry, attr);
+
+ struct ab8500_chargalg *di = container_of(kobj,
+ struct ab8500_chargalg, chargalg_kobject);
+
+ if (!entry->show)
+ return -EIO;
+
+ return entry->show(di, buf);
+}
+
+static ssize_t ab8500_chargalg_sysfs_charger(struct kobject *kobj,
+ struct attribute *attr, const char *buf, size_t length)
+{
+ struct ab8500_chargalg_sysfs_entry *entry = container_of(attr,
+ struct ab8500_chargalg_sysfs_entry, attr);
+
+ struct ab8500_chargalg *di = container_of(kobj,
+ struct ab8500_chargalg, chargalg_kobject);
+
+ if (!entry->store)
+ return -EIO;
+
+ return entry->store(di, buf, length);
+}
+
+static struct attribute *ab8500_chargalg_chg[] = {
+ &ab8500_chargalg_en_charger.attr,
+ &ab8500_chargalg_curr_step.attr,
+ NULL,
+};
+
+static const struct sysfs_ops ab8500_chargalg_sysfs_ops = {
+ .show = ab8500_chargalg_sysfs_show,
+ .store = ab8500_chargalg_sysfs_charger,
+};
+
+static struct kobj_type ab8500_chargalg_ktype = {
+ .sysfs_ops = &ab8500_chargalg_sysfs_ops,
+ .default_attrs = ab8500_chargalg_chg,
+};
+
+/**
+ * ab8500_chargalg_sysfs_exit() - de-init of sysfs entry
+ * @di: pointer to the struct ab8500_chargalg
+ *
+ * This function removes the entry in sysfs.
+ */
+static void ab8500_chargalg_sysfs_exit(struct ab8500_chargalg *di)
+{
+ kobject_del(&di->chargalg_kobject);
+}
+
+/**
+ * ab8500_chargalg_sysfs_init() - init of sysfs entry
+ * @di: pointer to the struct ab8500_chargalg
+ *
+ * This function adds an entry in sysfs.
+ * Returns error code in case of failure else 0(on success)
+ */
+static int ab8500_chargalg_sysfs_init(struct ab8500_chargalg *di)
+{
+ int ret = 0;
+
+ ret = kobject_init_and_add(&di->chargalg_kobject,
+ &ab8500_chargalg_ktype,
+ NULL, "ab8500_chargalg");
+ if (ret < 0)
+ dev_err(di->dev, "failed to create sysfs entry\n");
+
+ return ret;
+}
+/* Exposure to the sysfs interface <<END>> */
+
+static int __maybe_unused ab8500_chargalg_resume(struct device *dev)
+{
+ struct ab8500_chargalg *di = dev_get_drvdata(dev);
+
+ /* Kick charger watchdog if charging (any charger online) */
+ if (di->chg_info.online_chg)
+ queue_delayed_work(di->chargalg_wq, &di->chargalg_wd_work, 0);
+
+ /*
+ * Run the charging algorithm directly to be sure we don't
+ * do it too seldom
+ */
+ queue_delayed_work(di->chargalg_wq, &di->chargalg_periodic_work, 0);
+
+ return 0;
+}
+
+static int __maybe_unused ab8500_chargalg_suspend(struct device *dev)
+{
+ struct ab8500_chargalg *di = dev_get_drvdata(dev);
+
+ if (di->chg_info.online_chg)
+ cancel_delayed_work_sync(&di->chargalg_wd_work);
+
+ cancel_delayed_work_sync(&di->chargalg_periodic_work);
+
+ return 0;
+}
+
+static char *supply_interface[] = {
+ "ab8500_fg",
+};
+
+static const struct power_supply_desc ab8500_chargalg_desc = {
+ .name = "ab8500_chargalg",
+ .type = POWER_SUPPLY_TYPE_BATTERY,
+ .properties = ab8500_chargalg_props,
+ .num_properties = ARRAY_SIZE(ab8500_chargalg_props),
+ .get_property = ab8500_chargalg_get_property,
+ .external_power_changed = ab8500_chargalg_external_power_changed,
+};
+
+static int ab8500_chargalg_bind(struct device *dev, struct device *master,
+ void *data)
+{
+ struct ab8500_chargalg *di = dev_get_drvdata(dev);
+
+ /* Create a work queue for the chargalg */
+ di->chargalg_wq = alloc_ordered_workqueue("ab8500_chargalg_wq",
+ WQ_MEM_RECLAIM);
+ if (di->chargalg_wq == NULL) {
+ dev_err(di->dev, "failed to create work queue\n");
+ return -ENOMEM;
+ }
+
+ /* Run the charging algorithm */
+ queue_delayed_work(di->chargalg_wq, &di->chargalg_periodic_work, 0);
+
+ return 0;
+}
+
+static void ab8500_chargalg_unbind(struct device *dev, struct device *master,
+ void *data)
+{
+ struct ab8500_chargalg *di = dev_get_drvdata(dev);
+
+ /* Stop all timers and work */
+ hrtimer_cancel(&di->safety_timer);
+ hrtimer_cancel(&di->maintenance_timer);
+
+ cancel_delayed_work_sync(&di->chargalg_periodic_work);
+ cancel_delayed_work_sync(&di->chargalg_wd_work);
+ cancel_work_sync(&di->chargalg_work);
+
+ /* Delete the work queue */
+ destroy_workqueue(di->chargalg_wq);
+ flush_scheduled_work();
+}
+
+static const struct component_ops ab8500_chargalg_component_ops = {
+ .bind = ab8500_chargalg_bind,
+ .unbind = ab8500_chargalg_unbind,
+};
+
+static int ab8500_chargalg_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct power_supply_config psy_cfg = {};
+ struct ab8500_chargalg *di;
+ int ret = 0;
+
+ di = devm_kzalloc(dev, sizeof(*di), GFP_KERNEL);
+ if (!di)
+ return -ENOMEM;
+
+ di->bm = &ab8500_bm_data;
+
+ /* get device struct and parent */
+ di->dev = dev;
+ di->parent = dev_get_drvdata(pdev->dev.parent);
+
+ psy_cfg.supplied_to = supply_interface;
+ psy_cfg.num_supplicants = ARRAY_SIZE(supply_interface);
+ psy_cfg.drv_data = di;
+
+ /* Initilialize safety timer */
+ hrtimer_init(&di->safety_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+ di->safety_timer.function = ab8500_chargalg_safety_timer_expired;
+
+ /* Initilialize maintenance timer */
+ hrtimer_init(&di->maintenance_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
+ di->maintenance_timer.function =
+ ab8500_chargalg_maintenance_timer_expired;
+
+ /* Init work for chargalg */
+ INIT_DEFERRABLE_WORK(&di->chargalg_periodic_work,
+ ab8500_chargalg_periodic_work);
+ INIT_DEFERRABLE_WORK(&di->chargalg_wd_work,
+ ab8500_chargalg_wd_work);
+
+ /* Init work for chargalg */
+ INIT_WORK(&di->chargalg_work, ab8500_chargalg_work);
+
+ /* To detect charger at startup */
+ di->chg_info.prev_conn_chg = -1;
+
+ /* Register chargalg power supply class */
+ di->chargalg_psy = devm_power_supply_register(di->dev,
+ &ab8500_chargalg_desc,
+ &psy_cfg);
+ if (IS_ERR(di->chargalg_psy)) {
+ dev_err(di->dev, "failed to register chargalg psy\n");
+ return PTR_ERR(di->chargalg_psy);
+ }
+
+ platform_set_drvdata(pdev, di);
+
+ /* sysfs interface to enable/disable charging from user space */
+ ret = ab8500_chargalg_sysfs_init(di);
+ if (ret) {
+ dev_err(di->dev, "failed to create sysfs entry\n");
+ return ret;
+ }
+ di->curr_status.curr_step = CHARGALG_CURR_STEP_HIGH;
+
+ dev_info(di->dev, "probe success\n");
+ return component_add(dev, &ab8500_chargalg_component_ops);
+}
+
+static int ab8500_chargalg_remove(struct platform_device *pdev)
+{
+ struct ab8500_chargalg *di = platform_get_drvdata(pdev);
+
+ component_del(&pdev->dev, &ab8500_chargalg_component_ops);
+
+ /* sysfs interface to enable/disable charging from user space */
+ ab8500_chargalg_sysfs_exit(di);
+
+ return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(ab8500_chargalg_pm_ops, ab8500_chargalg_suspend, ab8500_chargalg_resume);
+
+static const struct of_device_id ab8500_chargalg_match[] = {
+ { .compatible = "stericsson,ab8500-chargalg", },
+ { },
+};
+
+struct platform_driver ab8500_chargalg_driver = {
+ .probe = ab8500_chargalg_probe,
+ .remove = ab8500_chargalg_remove,
+ .driver = {
+ .name = "ab8500_chargalg",
+ .of_match_table = ab8500_chargalg_match,
+ .pm = &ab8500_chargalg_pm_ops,
+ },
+};
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Johan Palsson, Karl Komierowski");
+MODULE_ALIAS("platform:ab8500-chargalg");
+MODULE_DESCRIPTION("ab8500 battery charging algorithm");
struct iio_channel *adc_main_charger_c;
struct iio_channel *adc_vbus_v;
struct iio_channel *adc_usb_charger_c;
- struct abx500_bm_data *bm;
+ struct ab8500_bm_data *bm;
struct ab8500_charger_event_flags flags;
struct ab8500_charger_usb_state usb_state;
struct ab8500_charger_max_usb_in_curr max_usb_in_curr;
static struct platform_driver *const ab8500_charger_component_drivers[] = {
&ab8500_fg_driver,
&ab8500_btemp_driver,
- &abx500_chargalg_driver,
+ &ab8500_chargalg_driver,
};
static int ab8500_charger_compare_dev(struct device *dev, void *data)
#include <linux/mfd/abx500/ab8500.h>
#include <linux/iio/consumer.h>
#include <linux/kernel.h>
+#include <linux/fixp-arith.h>
#include "ab8500-bm.h"
/* FG constants */
#define BATT_OVV 0x01
-#define interpolate(x, x1, y1, x2, y2) \
- ((y1) + ((((y2) - (y1)) * ((x) - (x1))) / ((x2) - (x1))));
-
/**
* struct ab8500_fg_interrupts - ab8500 fg interrupts
* @name: name of the interrupt
struct ab8500_fg_avg_cap avg_cap;
struct ab8500 *parent;
struct iio_channel *main_bat_v;
- struct abx500_bm_data *bm;
+ struct ab8500_bm_data *bm;
struct power_supply *fg_psy;
struct workqueue_struct *fg_wq;
struct delayed_work fg_periodic_work;
static int ab8500_fg_volt_to_capacity(struct ab8500_fg *di, int voltage)
{
int i, tbl_size;
- const struct abx500_v_to_cap *tbl;
+ const struct ab8500_v_to_cap *tbl;
int cap = 0;
tbl = di->bm->bat_type[di->bm->batt_id].v_to_cap_tbl;
}
if ((i > 0) && (i < tbl_size)) {
- cap = interpolate(voltage,
+ cap = fixp_linear_interpolate(
tbl[i].voltage,
tbl[i].capacity * 10,
tbl[i-1].voltage,
- tbl[i-1].capacity * 10);
+ tbl[i-1].capacity * 10,
+ voltage);
} else if (i == 0) {
cap = 1000;
} else {
}
if ((i > 0) && (i < tbl_size)) {
- resist = interpolate(di->bat_temp / 10,
+ resist = fixp_linear_interpolate(
tbl[i].temp,
tbl[i].resist,
tbl[i-1].temp,
- tbl[i-1].resist);
+ tbl[i-1].resist,
+ di->bat_temp / 10);
} else if (i == 0) {
resist = tbl[0].resist;
} else {
case POWER_SUPPLY_TYPE_BATTERY:
if (!di->flags.batt_id_received &&
di->bm->batt_id != BATTERY_UNKNOWN) {
- const struct abx500_battery_type *b;
+ const struct ab8500_battery_type *b;
b = &(di->bm->bat_type[di->bm->batt_id]);
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Copyright (C) ST-Ericsson SA 2012
- * Copyright (c) 2012 Sony Mobile Communications AB
- *
- * Charging algorithm driver for abx500 variants
- *
- * Authors:
- * Johan Palsson <johan.palsson@stericsson.com>
- * Karl Komierowski <karl.komierowski@stericsson.com>
- * Arun R Murthy <arun.murthy@stericsson.com>
- * Author: Imre Sunyi <imre.sunyi@sonymobile.com>
- */
-
-#include <linux/init.h>
-#include <linux/module.h>
-#include <linux/device.h>
-#include <linux/component.h>
-#include <linux/hrtimer.h>
-#include <linux/interrupt.h>
-#include <linux/delay.h>
-#include <linux/slab.h>
-#include <linux/platform_device.h>
-#include <linux/power_supply.h>
-#include <linux/completion.h>
-#include <linux/workqueue.h>
-#include <linux/kobject.h>
-#include <linux/of.h>
-#include <linux/mfd/core.h>
-#include <linux/mfd/abx500.h>
-#include <linux/mfd/abx500/ab8500.h>
-#include <linux/notifier.h>
-
-#include "ab8500-bm.h"
-#include "ab8500-chargalg.h"
-
-/* Watchdog kick interval */
-#define CHG_WD_INTERVAL (6 * HZ)
-
-/* End-of-charge criteria counter */
-#define EOC_COND_CNT 10
-
-/* One hour expressed in seconds */
-#define ONE_HOUR_IN_SECONDS 3600
-
-/* Five minutes expressed in seconds */
-#define FIVE_MINUTES_IN_SECONDS 300
-
-#define CHARGALG_CURR_STEP_LOW 0
-#define CHARGALG_CURR_STEP_HIGH 100
-
-enum abx500_chargers {
- NO_CHG,
- AC_CHG,
- USB_CHG,
-};
-
-struct abx500_chargalg_charger_info {
- enum abx500_chargers conn_chg;
- enum abx500_chargers prev_conn_chg;
- enum abx500_chargers online_chg;
- enum abx500_chargers prev_online_chg;
- enum abx500_chargers charger_type;
- bool usb_chg_ok;
- bool ac_chg_ok;
- int usb_volt;
- int usb_curr;
- int ac_volt;
- int ac_curr;
- int usb_vset;
- int usb_iset;
- int ac_vset;
- int ac_iset;
-};
-
-struct abx500_chargalg_suspension_status {
- bool suspended_change;
- bool ac_suspended;
- bool usb_suspended;
-};
-
-struct abx500_chargalg_current_step_status {
- bool curr_step_change;
- int curr_step;
-};
-
-struct abx500_chargalg_battery_data {
- int temp;
- int volt;
- int avg_curr;
- int inst_curr;
- int percent;
-};
-
-enum abx500_chargalg_states {
- STATE_HANDHELD_INIT,
- STATE_HANDHELD,
- STATE_CHG_NOT_OK_INIT,
- STATE_CHG_NOT_OK,
- STATE_HW_TEMP_PROTECT_INIT,
- STATE_HW_TEMP_PROTECT,
- STATE_NORMAL_INIT,
- STATE_NORMAL,
- STATE_WAIT_FOR_RECHARGE_INIT,
- STATE_WAIT_FOR_RECHARGE,
- STATE_MAINTENANCE_A_INIT,
- STATE_MAINTENANCE_A,
- STATE_MAINTENANCE_B_INIT,
- STATE_MAINTENANCE_B,
- STATE_TEMP_UNDEROVER_INIT,
- STATE_TEMP_UNDEROVER,
- STATE_TEMP_LOWHIGH_INIT,
- STATE_TEMP_LOWHIGH,
- STATE_SUSPENDED_INIT,
- STATE_SUSPENDED,
- STATE_OVV_PROTECT_INIT,
- STATE_OVV_PROTECT,
- STATE_SAFETY_TIMER_EXPIRED_INIT,
- STATE_SAFETY_TIMER_EXPIRED,
- STATE_BATT_REMOVED_INIT,
- STATE_BATT_REMOVED,
- STATE_WD_EXPIRED_INIT,
- STATE_WD_EXPIRED,
-};
-
-static const char *states[] = {
- "HANDHELD_INIT",
- "HANDHELD",
- "CHG_NOT_OK_INIT",
- "CHG_NOT_OK",
- "HW_TEMP_PROTECT_INIT",
- "HW_TEMP_PROTECT",
- "NORMAL_INIT",
- "NORMAL",
- "WAIT_FOR_RECHARGE_INIT",
- "WAIT_FOR_RECHARGE",
- "MAINTENANCE_A_INIT",
- "MAINTENANCE_A",
- "MAINTENANCE_B_INIT",
- "MAINTENANCE_B",
- "TEMP_UNDEROVER_INIT",
- "TEMP_UNDEROVER",
- "TEMP_LOWHIGH_INIT",
- "TEMP_LOWHIGH",
- "SUSPENDED_INIT",
- "SUSPENDED",
- "OVV_PROTECT_INIT",
- "OVV_PROTECT",
- "SAFETY_TIMER_EXPIRED_INIT",
- "SAFETY_TIMER_EXPIRED",
- "BATT_REMOVED_INIT",
- "BATT_REMOVED",
- "WD_EXPIRED_INIT",
- "WD_EXPIRED",
-};
-
-struct abx500_chargalg_events {
- bool batt_unknown;
- bool mainextchnotok;
- bool batt_ovv;
- bool batt_rem;
- bool btemp_underover;
- bool btemp_lowhigh;
- bool main_thermal_prot;
- bool usb_thermal_prot;
- bool main_ovv;
- bool vbus_ovv;
- bool usbchargernotok;
- bool safety_timer_expired;
- bool maintenance_timer_expired;
- bool ac_wd_expired;
- bool usb_wd_expired;
- bool ac_cv_active;
- bool usb_cv_active;
- bool vbus_collapsed;
-};
-
-/**
- * struct abx500_charge_curr_maximization - Charger maximization parameters
- * @original_iset: the non optimized/maximised charger current
- * @current_iset: the charging current used at this moment
- * @test_delta_i: the delta between the current we want to charge and the
- current that is really going into the battery
- * @condition_cnt: number of iterations needed before a new charger current
- is set
- * @max_current: maximum charger current
- * @wait_cnt: to avoid too fast current step down in case of charger
- * voltage collapse, we insert this delay between step
- * down
- * @level: tells in how many steps the charging current has been
- increased
- */
-struct abx500_charge_curr_maximization {
- int original_iset;
- int current_iset;
- int test_delta_i;
- int condition_cnt;
- int max_current;
- int wait_cnt;
- u8 level;
-};
-
-enum maxim_ret {
- MAXIM_RET_NOACTION,
- MAXIM_RET_CHANGE,
- MAXIM_RET_IBAT_TOO_HIGH,
-};
-
-/**
- * struct abx500_chargalg - abx500 Charging algorithm device information
- * @dev: pointer to the structure device
- * @charge_status: battery operating status
- * @eoc_cnt: counter used to determine end-of_charge
- * @maintenance_chg: indicate if maintenance charge is active
- * @t_hyst_norm temperature hysteresis when the temperature has been
- * over or under normal limits
- * @t_hyst_lowhigh temperature hysteresis when the temperature has been
- * over or under the high or low limits
- * @charge_state: current state of the charging algorithm
- * @ccm charging current maximization parameters
- * @chg_info: information about connected charger types
- * @batt_data: data of the battery
- * @susp_status: current charger suspension status
- * @bm: Platform specific battery management information
- * @curr_status: Current step status for over-current protection
- * @parent: pointer to the struct abx500
- * @chargalg_psy: structure that holds the battery properties exposed by
- * the charging algorithm
- * @events: structure for information about events triggered
- * @chargalg_wq: work queue for running the charging algorithm
- * @chargalg_periodic_work: work to run the charging algorithm periodically
- * @chargalg_wd_work: work to kick the charger watchdog periodically
- * @chargalg_work: work to run the charging algorithm instantly
- * @safety_timer: charging safety timer
- * @maintenance_timer: maintenance charging timer
- * @chargalg_kobject: structure of type kobject
- */
-struct abx500_chargalg {
- struct device *dev;
- int charge_status;
- int eoc_cnt;
- bool maintenance_chg;
- int t_hyst_norm;
- int t_hyst_lowhigh;
- enum abx500_chargalg_states charge_state;
- struct abx500_charge_curr_maximization ccm;
- struct abx500_chargalg_charger_info chg_info;
- struct abx500_chargalg_battery_data batt_data;
- struct abx500_chargalg_suspension_status susp_status;
- struct ab8500 *parent;
- struct abx500_chargalg_current_step_status curr_status;
- struct abx500_bm_data *bm;
- struct power_supply *chargalg_psy;
- struct ux500_charger *ac_chg;
- struct ux500_charger *usb_chg;
- struct abx500_chargalg_events events;
- struct workqueue_struct *chargalg_wq;
- struct delayed_work chargalg_periodic_work;
- struct delayed_work chargalg_wd_work;
- struct work_struct chargalg_work;
- struct hrtimer safety_timer;
- struct hrtimer maintenance_timer;
- struct kobject chargalg_kobject;
-};
-
-/*External charger prepare notifier*/
-BLOCKING_NOTIFIER_HEAD(charger_notifier_list);
-
-/* Main battery properties */
-static enum power_supply_property abx500_chargalg_props[] = {
- POWER_SUPPLY_PROP_STATUS,
- POWER_SUPPLY_PROP_HEALTH,
-};
-
-struct abx500_chargalg_sysfs_entry {
- struct attribute attr;
- ssize_t (*show)(struct abx500_chargalg *, char *);
- ssize_t (*store)(struct abx500_chargalg *, const char *, size_t);
-};
-
-/**
- * abx500_chargalg_safety_timer_expired() - Expiration of the safety timer
- * @timer: pointer to the hrtimer structure
- *
- * This function gets called when the safety timer for the charger
- * expires
- */
-static enum hrtimer_restart
-abx500_chargalg_safety_timer_expired(struct hrtimer *timer)
-{
- struct abx500_chargalg *di = container_of(timer, struct abx500_chargalg,
- safety_timer);
- dev_err(di->dev, "Safety timer expired\n");
- di->events.safety_timer_expired = true;
-
- /* Trigger execution of the algorithm instantly */
- queue_work(di->chargalg_wq, &di->chargalg_work);
-
- return HRTIMER_NORESTART;
-}
-
-/**
- * abx500_chargalg_maintenance_timer_expired() - Expiration of
- * the maintenance timer
- * @timer: pointer to the timer structure
- *
- * This function gets called when the maintenence timer
- * expires
- */
-static enum hrtimer_restart
-abx500_chargalg_maintenance_timer_expired(struct hrtimer *timer)
-{
-
- struct abx500_chargalg *di = container_of(timer, struct abx500_chargalg,
- maintenance_timer);
-
- dev_dbg(di->dev, "Maintenance timer expired\n");
- di->events.maintenance_timer_expired = true;
-
- /* Trigger execution of the algorithm instantly */
- queue_work(di->chargalg_wq, &di->chargalg_work);
-
- return HRTIMER_NORESTART;
-}
-
-/**
- * abx500_chargalg_state_to() - Change charge state
- * @di: pointer to the abx500_chargalg structure
- *
- * This function gets called when a charge state change should occur
- */
-static void abx500_chargalg_state_to(struct abx500_chargalg *di,
- enum abx500_chargalg_states state)
-{
- dev_dbg(di->dev,
- "State changed: %s (From state: [%d] %s =to=> [%d] %s )\n",
- di->charge_state == state ? "NO" : "YES",
- di->charge_state,
- states[di->charge_state],
- state,
- states[state]);
-
- di->charge_state = state;
-}
-
-static int abx500_chargalg_check_charger_enable(struct abx500_chargalg *di)
-{
- switch (di->charge_state) {
- case STATE_NORMAL:
- case STATE_MAINTENANCE_A:
- case STATE_MAINTENANCE_B:
- break;
- default:
- return 0;
- }
-
- if (di->chg_info.charger_type & USB_CHG) {
- return di->usb_chg->ops.check_enable(di->usb_chg,
- di->bm->bat_type[di->bm->batt_id].normal_vol_lvl,
- di->bm->bat_type[di->bm->batt_id].normal_cur_lvl);
- } else if ((di->chg_info.charger_type & AC_CHG) &&
- !(di->ac_chg->external)) {
- return di->ac_chg->ops.check_enable(di->ac_chg,
- di->bm->bat_type[di->bm->batt_id].normal_vol_lvl,
- di->bm->bat_type[di->bm->batt_id].normal_cur_lvl);
- }
- return 0;
-}
-
-/**
- * abx500_chargalg_check_charger_connection() - Check charger connection change
- * @di: pointer to the abx500_chargalg structure
- *
- * This function will check if there is a change in the charger connection
- * and change charge state accordingly. AC has precedence over USB.
- */
-static int abx500_chargalg_check_charger_connection(struct abx500_chargalg *di)
-{
- if (di->chg_info.conn_chg != di->chg_info.prev_conn_chg ||
- di->susp_status.suspended_change) {
- /*
- * Charger state changed or suspension
- * has changed since last update
- */
- if ((di->chg_info.conn_chg & AC_CHG) &&
- !di->susp_status.ac_suspended) {
- dev_dbg(di->dev, "Charging source is AC\n");
- if (di->chg_info.charger_type != AC_CHG) {
- di->chg_info.charger_type = AC_CHG;
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- }
- } else if ((di->chg_info.conn_chg & USB_CHG) &&
- !di->susp_status.usb_suspended) {
- dev_dbg(di->dev, "Charging source is USB\n");
- di->chg_info.charger_type = USB_CHG;
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- } else if (di->chg_info.conn_chg &&
- (di->susp_status.ac_suspended ||
- di->susp_status.usb_suspended)) {
- dev_dbg(di->dev, "Charging is suspended\n");
- di->chg_info.charger_type = NO_CHG;
- abx500_chargalg_state_to(di, STATE_SUSPENDED_INIT);
- } else {
- dev_dbg(di->dev, "Charging source is OFF\n");
- di->chg_info.charger_type = NO_CHG;
- abx500_chargalg_state_to(di, STATE_HANDHELD_INIT);
- }
- di->chg_info.prev_conn_chg = di->chg_info.conn_chg;
- di->susp_status.suspended_change = false;
- }
- return di->chg_info.conn_chg;
-}
-
-/**
- * abx500_chargalg_check_current_step_status() - Check charging current
- * step status.
- * @di: pointer to the abx500_chargalg structure
- *
- * This function will check if there is a change in the charging current step
- * and change charge state accordingly.
- */
-static void abx500_chargalg_check_current_step_status
- (struct abx500_chargalg *di)
-{
- if (di->curr_status.curr_step_change)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- di->curr_status.curr_step_change = false;
-}
-
-/**
- * abx500_chargalg_start_safety_timer() - Start charging safety timer
- * @di: pointer to the abx500_chargalg structure
- *
- * The safety timer is used to avoid overcharging of old or bad batteries.
- * There are different timers for AC and USB
- */
-static void abx500_chargalg_start_safety_timer(struct abx500_chargalg *di)
-{
- /* Charger-dependent expiration time in hours*/
- int timer_expiration = 0;
-
- switch (di->chg_info.charger_type) {
- case AC_CHG:
- timer_expiration = di->bm->main_safety_tmr_h;
- break;
-
- case USB_CHG:
- timer_expiration = di->bm->usb_safety_tmr_h;
- break;
-
- default:
- dev_err(di->dev, "Unknown charger to charge from\n");
- break;
- }
-
- di->events.safety_timer_expired = false;
- hrtimer_set_expires_range(&di->safety_timer,
- ktime_set(timer_expiration * ONE_HOUR_IN_SECONDS, 0),
- ktime_set(FIVE_MINUTES_IN_SECONDS, 0));
- hrtimer_start_expires(&di->safety_timer, HRTIMER_MODE_REL);
-}
-
-/**
- * abx500_chargalg_stop_safety_timer() - Stop charging safety timer
- * @di: pointer to the abx500_chargalg structure
- *
- * The safety timer is stopped whenever the NORMAL state is exited
- */
-static void abx500_chargalg_stop_safety_timer(struct abx500_chargalg *di)
-{
- if (hrtimer_try_to_cancel(&di->safety_timer) >= 0)
- di->events.safety_timer_expired = false;
-}
-
-/**
- * abx500_chargalg_start_maintenance_timer() - Start charging maintenance timer
- * @di: pointer to the abx500_chargalg structure
- * @duration: duration of ther maintenance timer in hours
- *
- * The maintenance timer is used to maintain the charge in the battery once
- * the battery is considered full. These timers are chosen to match the
- * discharge curve of the battery
- */
-static void abx500_chargalg_start_maintenance_timer(struct abx500_chargalg *di,
- int duration)
-{
- hrtimer_set_expires_range(&di->maintenance_timer,
- ktime_set(duration * ONE_HOUR_IN_SECONDS, 0),
- ktime_set(FIVE_MINUTES_IN_SECONDS, 0));
- di->events.maintenance_timer_expired = false;
- hrtimer_start_expires(&di->maintenance_timer, HRTIMER_MODE_REL);
-}
-
-/**
- * abx500_chargalg_stop_maintenance_timer() - Stop maintenance timer
- * @di: pointer to the abx500_chargalg structure
- *
- * The maintenance timer is stopped whenever maintenance ends or when another
- * state is entered
- */
-static void abx500_chargalg_stop_maintenance_timer(struct abx500_chargalg *di)
-{
- if (hrtimer_try_to_cancel(&di->maintenance_timer) >= 0)
- di->events.maintenance_timer_expired = false;
-}
-
-/**
- * abx500_chargalg_kick_watchdog() - Kick charger watchdog
- * @di: pointer to the abx500_chargalg structure
- *
- * The charger watchdog have to be kicked periodically whenever the charger is
- * on, else the ABB will reset the system
- */
-static int abx500_chargalg_kick_watchdog(struct abx500_chargalg *di)
-{
- /* Check if charger exists and kick watchdog if charging */
- if (di->ac_chg && di->ac_chg->ops.kick_wd &&
- di->chg_info.online_chg & AC_CHG) {
- /*
- * If AB charger watchdog expired, pm2xxx charging
- * gets disabled. To be safe, kick both AB charger watchdog
- * and pm2xxx watchdog.
- */
- if (di->ac_chg->external &&
- di->usb_chg && di->usb_chg->ops.kick_wd)
- di->usb_chg->ops.kick_wd(di->usb_chg);
-
- return di->ac_chg->ops.kick_wd(di->ac_chg);
- }
- else if (di->usb_chg && di->usb_chg->ops.kick_wd &&
- di->chg_info.online_chg & USB_CHG)
- return di->usb_chg->ops.kick_wd(di->usb_chg);
-
- return -ENXIO;
-}
-
-/**
- * abx500_chargalg_ac_en() - Turn on/off the AC charger
- * @di: pointer to the abx500_chargalg structure
- * @enable: charger on/off
- * @vset: requested charger output voltage
- * @iset: requested charger output current
- *
- * The AC charger will be turned on/off with the requested charge voltage and
- * current
- */
-static int abx500_chargalg_ac_en(struct abx500_chargalg *di, int enable,
- int vset, int iset)
-{
- static int abx500_chargalg_ex_ac_enable_toggle;
-
- if (!di->ac_chg || !di->ac_chg->ops.enable)
- return -ENXIO;
-
- /* Select maximum of what both the charger and the battery supports */
- if (di->ac_chg->max_out_volt)
- vset = min(vset, di->ac_chg->max_out_volt);
- if (di->ac_chg->max_out_curr)
- iset = min(iset, di->ac_chg->max_out_curr);
-
- di->chg_info.ac_iset = iset;
- di->chg_info.ac_vset = vset;
-
- /* Enable external charger */
- if (enable && di->ac_chg->external &&
- !abx500_chargalg_ex_ac_enable_toggle) {
- blocking_notifier_call_chain(&charger_notifier_list,
- 0, di->dev);
- abx500_chargalg_ex_ac_enable_toggle++;
- }
-
- return di->ac_chg->ops.enable(di->ac_chg, enable, vset, iset);
-}
-
-/**
- * abx500_chargalg_usb_en() - Turn on/off the USB charger
- * @di: pointer to the abx500_chargalg structure
- * @enable: charger on/off
- * @vset: requested charger output voltage
- * @iset: requested charger output current
- *
- * The USB charger will be turned on/off with the requested charge voltage and
- * current
- */
-static int abx500_chargalg_usb_en(struct abx500_chargalg *di, int enable,
- int vset, int iset)
-{
- if (!di->usb_chg || !di->usb_chg->ops.enable)
- return -ENXIO;
-
- /* Select maximum of what both the charger and the battery supports */
- if (di->usb_chg->max_out_volt)
- vset = min(vset, di->usb_chg->max_out_volt);
- if (di->usb_chg->max_out_curr)
- iset = min(iset, di->usb_chg->max_out_curr);
-
- di->chg_info.usb_iset = iset;
- di->chg_info.usb_vset = vset;
-
- return di->usb_chg->ops.enable(di->usb_chg, enable, vset, iset);
-}
-
-/**
- * abx500_chargalg_update_chg_curr() - Update charger current
- * @di: pointer to the abx500_chargalg structure
- * @iset: requested charger output current
- *
- * The charger output current will be updated for the charger
- * that is currently in use
- */
-static int abx500_chargalg_update_chg_curr(struct abx500_chargalg *di,
- int iset)
-{
- /* Check if charger exists and update current if charging */
- if (di->ac_chg && di->ac_chg->ops.update_curr &&
- di->chg_info.charger_type & AC_CHG) {
- /*
- * Select maximum of what both the charger
- * and the battery supports
- */
- if (di->ac_chg->max_out_curr)
- iset = min(iset, di->ac_chg->max_out_curr);
-
- di->chg_info.ac_iset = iset;
-
- return di->ac_chg->ops.update_curr(di->ac_chg, iset);
- } else if (di->usb_chg && di->usb_chg->ops.update_curr &&
- di->chg_info.charger_type & USB_CHG) {
- /*
- * Select maximum of what both the charger
- * and the battery supports
- */
- if (di->usb_chg->max_out_curr)
- iset = min(iset, di->usb_chg->max_out_curr);
-
- di->chg_info.usb_iset = iset;
-
- return di->usb_chg->ops.update_curr(di->usb_chg, iset);
- }
-
- return -ENXIO;
-}
-
-/**
- * abx500_chargalg_stop_charging() - Stop charging
- * @di: pointer to the abx500_chargalg structure
- *
- * This function is called from any state where charging should be stopped.
- * All charging is disabled and all status parameters and timers are changed
- * accordingly
- */
-static void abx500_chargalg_stop_charging(struct abx500_chargalg *di)
-{
- abx500_chargalg_ac_en(di, false, 0, 0);
- abx500_chargalg_usb_en(di, false, 0, 0);
- abx500_chargalg_stop_safety_timer(di);
- abx500_chargalg_stop_maintenance_timer(di);
- di->charge_status = POWER_SUPPLY_STATUS_NOT_CHARGING;
- di->maintenance_chg = false;
- cancel_delayed_work(&di->chargalg_wd_work);
- power_supply_changed(di->chargalg_psy);
-}
-
-/**
- * abx500_chargalg_hold_charging() - Pauses charging
- * @di: pointer to the abx500_chargalg structure
- *
- * This function is called in the case where maintenance charging has been
- * disabled and instead a battery voltage mode is entered to check when the
- * battery voltage has reached a certain recharge voltage
- */
-static void abx500_chargalg_hold_charging(struct abx500_chargalg *di)
-{
- abx500_chargalg_ac_en(di, false, 0, 0);
- abx500_chargalg_usb_en(di, false, 0, 0);
- abx500_chargalg_stop_safety_timer(di);
- abx500_chargalg_stop_maintenance_timer(di);
- di->charge_status = POWER_SUPPLY_STATUS_CHARGING;
- di->maintenance_chg = false;
- cancel_delayed_work(&di->chargalg_wd_work);
- power_supply_changed(di->chargalg_psy);
-}
-
-/**
- * abx500_chargalg_start_charging() - Start the charger
- * @di: pointer to the abx500_chargalg structure
- * @vset: requested charger output voltage
- * @iset: requested charger output current
- *
- * A charger will be enabled depending on the requested charger type that was
- * detected previously.
- */
-static void abx500_chargalg_start_charging(struct abx500_chargalg *di,
- int vset, int iset)
-{
- switch (di->chg_info.charger_type) {
- case AC_CHG:
- dev_dbg(di->dev,
- "AC parameters: Vset %d, Ich %d\n", vset, iset);
- abx500_chargalg_usb_en(di, false, 0, 0);
- abx500_chargalg_ac_en(di, true, vset, iset);
- break;
-
- case USB_CHG:
- dev_dbg(di->dev,
- "USB parameters: Vset %d, Ich %d\n", vset, iset);
- abx500_chargalg_ac_en(di, false, 0, 0);
- abx500_chargalg_usb_en(di, true, vset, iset);
- break;
-
- default:
- dev_err(di->dev, "Unknown charger to charge from\n");
- break;
- }
-}
-
-/**
- * abx500_chargalg_check_temp() - Check battery temperature ranges
- * @di: pointer to the abx500_chargalg structure
- *
- * The battery temperature is checked against the predefined limits and the
- * charge state is changed accordingly
- */
-static void abx500_chargalg_check_temp(struct abx500_chargalg *di)
-{
- if (di->batt_data.temp > (di->bm->temp_low + di->t_hyst_norm) &&
- di->batt_data.temp < (di->bm->temp_high - di->t_hyst_norm)) {
- /* Temp OK! */
- di->events.btemp_underover = false;
- di->events.btemp_lowhigh = false;
- di->t_hyst_norm = 0;
- di->t_hyst_lowhigh = 0;
- } else {
- if (((di->batt_data.temp >= di->bm->temp_high) &&
- (di->batt_data.temp <
- (di->bm->temp_over - di->t_hyst_lowhigh))) ||
- ((di->batt_data.temp >
- (di->bm->temp_under + di->t_hyst_lowhigh)) &&
- (di->batt_data.temp <= di->bm->temp_low))) {
- /* TEMP minor!!!!! */
- di->events.btemp_underover = false;
- di->events.btemp_lowhigh = true;
- di->t_hyst_norm = di->bm->temp_hysteresis;
- di->t_hyst_lowhigh = 0;
- } else if (di->batt_data.temp <= di->bm->temp_under ||
- di->batt_data.temp >= di->bm->temp_over) {
- /* TEMP major!!!!! */
- di->events.btemp_underover = true;
- di->events.btemp_lowhigh = false;
- di->t_hyst_norm = 0;
- di->t_hyst_lowhigh = di->bm->temp_hysteresis;
- } else {
- /* Within hysteresis */
- dev_dbg(di->dev, "Within hysteresis limit temp: %d "
- "hyst_lowhigh %d, hyst normal %d\n",
- di->batt_data.temp, di->t_hyst_lowhigh,
- di->t_hyst_norm);
- }
- }
-}
-
-/**
- * abx500_chargalg_check_charger_voltage() - Check charger voltage
- * @di: pointer to the abx500_chargalg structure
- *
- * Charger voltage is checked against maximum limit
- */
-static void abx500_chargalg_check_charger_voltage(struct abx500_chargalg *di)
-{
- if (di->chg_info.usb_volt > di->bm->chg_params->usb_volt_max)
- di->chg_info.usb_chg_ok = false;
- else
- di->chg_info.usb_chg_ok = true;
-
- if (di->chg_info.ac_volt > di->bm->chg_params->ac_volt_max)
- di->chg_info.ac_chg_ok = false;
- else
- di->chg_info.ac_chg_ok = true;
-
-}
-
-/**
- * abx500_chargalg_end_of_charge() - Check if end-of-charge criteria is fulfilled
- * @di: pointer to the abx500_chargalg structure
- *
- * End-of-charge criteria is fulfilled when the battery voltage is above a
- * certain limit and the battery current is below a certain limit for a
- * predefined number of consecutive seconds. If true, the battery is full
- */
-static void abx500_chargalg_end_of_charge(struct abx500_chargalg *di)
-{
- if (di->charge_status == POWER_SUPPLY_STATUS_CHARGING &&
- di->charge_state == STATE_NORMAL &&
- !di->maintenance_chg && (di->batt_data.volt >=
- di->bm->bat_type[di->bm->batt_id].termination_vol ||
- di->events.usb_cv_active || di->events.ac_cv_active) &&
- di->batt_data.avg_curr <
- di->bm->bat_type[di->bm->batt_id].termination_curr &&
- di->batt_data.avg_curr > 0) {
- if (++di->eoc_cnt >= EOC_COND_CNT) {
- di->eoc_cnt = 0;
- di->charge_status = POWER_SUPPLY_STATUS_FULL;
- di->maintenance_chg = true;
- dev_dbg(di->dev, "EOC reached!\n");
- power_supply_changed(di->chargalg_psy);
- } else {
- dev_dbg(di->dev,
- " EOC limit reached for the %d"
- " time, out of %d before EOC\n",
- di->eoc_cnt,
- EOC_COND_CNT);
- }
- } else {
- di->eoc_cnt = 0;
- }
-}
-
-static void init_maxim_chg_curr(struct abx500_chargalg *di)
-{
- di->ccm.original_iset =
- di->bm->bat_type[di->bm->batt_id].normal_cur_lvl;
- di->ccm.current_iset =
- di->bm->bat_type[di->bm->batt_id].normal_cur_lvl;
- di->ccm.test_delta_i = di->bm->maxi->charger_curr_step;
- di->ccm.max_current = di->bm->maxi->chg_curr;
- di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
- di->ccm.level = 0;
-}
-
-/**
- * abx500_chargalg_chg_curr_maxim - increases the charger current to
- * compensate for the system load
- * @di pointer to the abx500_chargalg structure
- *
- * This maximization function is used to raise the charger current to get the
- * battery current as close to the optimal value as possible. The battery
- * current during charging is affected by the system load
- */
-static enum maxim_ret abx500_chargalg_chg_curr_maxim(struct abx500_chargalg *di)
-{
- int delta_i;
-
- if (!di->bm->maxi->ena_maxi)
- return MAXIM_RET_NOACTION;
-
- delta_i = di->ccm.original_iset - di->batt_data.inst_curr;
-
- if (di->events.vbus_collapsed) {
- dev_dbg(di->dev, "Charger voltage has collapsed %d\n",
- di->ccm.wait_cnt);
- if (di->ccm.wait_cnt == 0) {
- dev_dbg(di->dev, "lowering current\n");
- di->ccm.wait_cnt++;
- di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
- di->ccm.max_current =
- di->ccm.current_iset - di->ccm.test_delta_i;
- di->ccm.current_iset = di->ccm.max_current;
- di->ccm.level--;
- return MAXIM_RET_CHANGE;
- } else {
- dev_dbg(di->dev, "waiting\n");
- /* Let's go in here twice before lowering curr again */
- di->ccm.wait_cnt = (di->ccm.wait_cnt + 1) % 3;
- return MAXIM_RET_NOACTION;
- }
- }
-
- di->ccm.wait_cnt = 0;
-
- if ((di->batt_data.inst_curr > di->ccm.original_iset)) {
- dev_dbg(di->dev, " Maximization Ibat (%dmA) too high"
- " (limit %dmA) (current iset: %dmA)!\n",
- di->batt_data.inst_curr, di->ccm.original_iset,
- di->ccm.current_iset);
-
- if (di->ccm.current_iset == di->ccm.original_iset)
- return MAXIM_RET_NOACTION;
-
- di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
- di->ccm.current_iset = di->ccm.original_iset;
- di->ccm.level = 0;
-
- return MAXIM_RET_IBAT_TOO_HIGH;
- }
-
- if (delta_i > di->ccm.test_delta_i &&
- (di->ccm.current_iset + di->ccm.test_delta_i) <
- di->ccm.max_current) {
- if (di->ccm.condition_cnt-- == 0) {
- /* Increse the iset with cco.test_delta_i */
- di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
- di->ccm.current_iset += di->ccm.test_delta_i;
- di->ccm.level++;
- dev_dbg(di->dev, " Maximization needed, increase"
- " with %d mA to %dmA (Optimal ibat: %d)"
- " Level %d\n",
- di->ccm.test_delta_i,
- di->ccm.current_iset,
- di->ccm.original_iset,
- di->ccm.level);
- return MAXIM_RET_CHANGE;
- } else {
- return MAXIM_RET_NOACTION;
- }
- } else {
- di->ccm.condition_cnt = di->bm->maxi->wait_cycles;
- return MAXIM_RET_NOACTION;
- }
-}
-
-static void handle_maxim_chg_curr(struct abx500_chargalg *di)
-{
- enum maxim_ret ret;
- int result;
-
- ret = abx500_chargalg_chg_curr_maxim(di);
- switch (ret) {
- case MAXIM_RET_CHANGE:
- result = abx500_chargalg_update_chg_curr(di,
- di->ccm.current_iset);
- if (result)
- dev_err(di->dev, "failed to set chg curr\n");
- break;
- case MAXIM_RET_IBAT_TOO_HIGH:
- result = abx500_chargalg_update_chg_curr(di,
- di->bm->bat_type[di->bm->batt_id].normal_cur_lvl);
- if (result)
- dev_err(di->dev, "failed to set chg curr\n");
- break;
-
- case MAXIM_RET_NOACTION:
- default:
- /* Do nothing..*/
- break;
- }
-}
-
-static int abx500_chargalg_get_ext_psy_data(struct device *dev, void *data)
-{
- struct power_supply *psy;
- struct power_supply *ext = dev_get_drvdata(dev);
- const char **supplicants = (const char **)ext->supplied_to;
- struct abx500_chargalg *di;
- union power_supply_propval ret;
- int j;
- bool capacity_updated = false;
-
- psy = (struct power_supply *)data;
- di = power_supply_get_drvdata(psy);
- /* For all psy where the driver name appears in any supplied_to */
- j = match_string(supplicants, ext->num_supplicants, psy->desc->name);
- if (j < 0)
- return 0;
-
- /*
- * If external is not registering 'POWER_SUPPLY_PROP_CAPACITY' to its
- * property because of handling that sysfs entry on its own, this is
- * the place to get the battery capacity.
- */
- if (!power_supply_get_property(ext, POWER_SUPPLY_PROP_CAPACITY, &ret)) {
- di->batt_data.percent = ret.intval;
- capacity_updated = true;
- }
-
- /* Go through all properties for the psy */
- for (j = 0; j < ext->desc->num_properties; j++) {
- enum power_supply_property prop;
- prop = ext->desc->properties[j];
-
- /*
- * Initialize chargers if not already done.
- * The ab8500_charger*/
- if (!di->ac_chg &&
- ext->desc->type == POWER_SUPPLY_TYPE_MAINS)
- di->ac_chg = psy_to_ux500_charger(ext);
- else if (!di->usb_chg &&
- ext->desc->type == POWER_SUPPLY_TYPE_USB)
- di->usb_chg = psy_to_ux500_charger(ext);
-
- if (power_supply_get_property(ext, prop, &ret))
- continue;
- switch (prop) {
- case POWER_SUPPLY_PROP_PRESENT:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_BATTERY:
- /* Battery present */
- if (ret.intval)
- di->events.batt_rem = false;
- /* Battery removed */
- else
- di->events.batt_rem = true;
- break;
- case POWER_SUPPLY_TYPE_MAINS:
- /* AC disconnected */
- if (!ret.intval &&
- (di->chg_info.conn_chg & AC_CHG)) {
- di->chg_info.prev_conn_chg =
- di->chg_info.conn_chg;
- di->chg_info.conn_chg &= ~AC_CHG;
- }
- /* AC connected */
- else if (ret.intval &&
- !(di->chg_info.conn_chg & AC_CHG)) {
- di->chg_info.prev_conn_chg =
- di->chg_info.conn_chg;
- di->chg_info.conn_chg |= AC_CHG;
- }
- break;
- case POWER_SUPPLY_TYPE_USB:
- /* USB disconnected */
- if (!ret.intval &&
- (di->chg_info.conn_chg & USB_CHG)) {
- di->chg_info.prev_conn_chg =
- di->chg_info.conn_chg;
- di->chg_info.conn_chg &= ~USB_CHG;
- }
- /* USB connected */
- else if (ret.intval &&
- !(di->chg_info.conn_chg & USB_CHG)) {
- di->chg_info.prev_conn_chg =
- di->chg_info.conn_chg;
- di->chg_info.conn_chg |= USB_CHG;
- }
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_PROP_ONLINE:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_BATTERY:
- break;
- case POWER_SUPPLY_TYPE_MAINS:
- /* AC offline */
- if (!ret.intval &&
- (di->chg_info.online_chg & AC_CHG)) {
- di->chg_info.prev_online_chg =
- di->chg_info.online_chg;
- di->chg_info.online_chg &= ~AC_CHG;
- }
- /* AC online */
- else if (ret.intval &&
- !(di->chg_info.online_chg & AC_CHG)) {
- di->chg_info.prev_online_chg =
- di->chg_info.online_chg;
- di->chg_info.online_chg |= AC_CHG;
- queue_delayed_work(di->chargalg_wq,
- &di->chargalg_wd_work, 0);
- }
- break;
- case POWER_SUPPLY_TYPE_USB:
- /* USB offline */
- if (!ret.intval &&
- (di->chg_info.online_chg & USB_CHG)) {
- di->chg_info.prev_online_chg =
- di->chg_info.online_chg;
- di->chg_info.online_chg &= ~USB_CHG;
- }
- /* USB online */
- else if (ret.intval &&
- !(di->chg_info.online_chg & USB_CHG)) {
- di->chg_info.prev_online_chg =
- di->chg_info.online_chg;
- di->chg_info.online_chg |= USB_CHG;
- queue_delayed_work(di->chargalg_wq,
- &di->chargalg_wd_work, 0);
- }
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_PROP_HEALTH:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_BATTERY:
- break;
- case POWER_SUPPLY_TYPE_MAINS:
- switch (ret.intval) {
- case POWER_SUPPLY_HEALTH_UNSPEC_FAILURE:
- di->events.mainextchnotok = true;
- di->events.main_thermal_prot = false;
- di->events.main_ovv = false;
- di->events.ac_wd_expired = false;
- break;
- case POWER_SUPPLY_HEALTH_DEAD:
- di->events.ac_wd_expired = true;
- di->events.mainextchnotok = false;
- di->events.main_ovv = false;
- di->events.main_thermal_prot = false;
- break;
- case POWER_SUPPLY_HEALTH_COLD:
- case POWER_SUPPLY_HEALTH_OVERHEAT:
- di->events.main_thermal_prot = true;
- di->events.mainextchnotok = false;
- di->events.main_ovv = false;
- di->events.ac_wd_expired = false;
- break;
- case POWER_SUPPLY_HEALTH_OVERVOLTAGE:
- di->events.main_ovv = true;
- di->events.mainextchnotok = false;
- di->events.main_thermal_prot = false;
- di->events.ac_wd_expired = false;
- break;
- case POWER_SUPPLY_HEALTH_GOOD:
- di->events.main_thermal_prot = false;
- di->events.mainextchnotok = false;
- di->events.main_ovv = false;
- di->events.ac_wd_expired = false;
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_TYPE_USB:
- switch (ret.intval) {
- case POWER_SUPPLY_HEALTH_UNSPEC_FAILURE:
- di->events.usbchargernotok = true;
- di->events.usb_thermal_prot = false;
- di->events.vbus_ovv = false;
- di->events.usb_wd_expired = false;
- break;
- case POWER_SUPPLY_HEALTH_DEAD:
- di->events.usb_wd_expired = true;
- di->events.usbchargernotok = false;
- di->events.usb_thermal_prot = false;
- di->events.vbus_ovv = false;
- break;
- case POWER_SUPPLY_HEALTH_COLD:
- case POWER_SUPPLY_HEALTH_OVERHEAT:
- di->events.usb_thermal_prot = true;
- di->events.usbchargernotok = false;
- di->events.vbus_ovv = false;
- di->events.usb_wd_expired = false;
- break;
- case POWER_SUPPLY_HEALTH_OVERVOLTAGE:
- di->events.vbus_ovv = true;
- di->events.usbchargernotok = false;
- di->events.usb_thermal_prot = false;
- di->events.usb_wd_expired = false;
- break;
- case POWER_SUPPLY_HEALTH_GOOD:
- di->events.usbchargernotok = false;
- di->events.usb_thermal_prot = false;
- di->events.vbus_ovv = false;
- di->events.usb_wd_expired = false;
- break;
- default:
- break;
- }
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_PROP_VOLTAGE_NOW:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_BATTERY:
- di->batt_data.volt = ret.intval / 1000;
- break;
- case POWER_SUPPLY_TYPE_MAINS:
- di->chg_info.ac_volt = ret.intval / 1000;
- break;
- case POWER_SUPPLY_TYPE_USB:
- di->chg_info.usb_volt = ret.intval / 1000;
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_PROP_VOLTAGE_AVG:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_MAINS:
- /* AVG is used to indicate when we are
- * in CV mode */
- if (ret.intval)
- di->events.ac_cv_active = true;
- else
- di->events.ac_cv_active = false;
-
- break;
- case POWER_SUPPLY_TYPE_USB:
- /* AVG is used to indicate when we are
- * in CV mode */
- if (ret.intval)
- di->events.usb_cv_active = true;
- else
- di->events.usb_cv_active = false;
-
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_PROP_TECHNOLOGY:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_BATTERY:
- if (ret.intval)
- di->events.batt_unknown = false;
- else
- di->events.batt_unknown = true;
-
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_PROP_TEMP:
- di->batt_data.temp = ret.intval / 10;
- break;
-
- case POWER_SUPPLY_PROP_CURRENT_NOW:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_MAINS:
- di->chg_info.ac_curr =
- ret.intval / 1000;
- break;
- case POWER_SUPPLY_TYPE_USB:
- di->chg_info.usb_curr =
- ret.intval / 1000;
- break;
- case POWER_SUPPLY_TYPE_BATTERY:
- di->batt_data.inst_curr = ret.intval / 1000;
- break;
- default:
- break;
- }
- break;
-
- case POWER_SUPPLY_PROP_CURRENT_AVG:
- switch (ext->desc->type) {
- case POWER_SUPPLY_TYPE_BATTERY:
- di->batt_data.avg_curr = ret.intval / 1000;
- break;
- case POWER_SUPPLY_TYPE_USB:
- if (ret.intval)
- di->events.vbus_collapsed = true;
- else
- di->events.vbus_collapsed = false;
- break;
- default:
- break;
- }
- break;
- case POWER_SUPPLY_PROP_CAPACITY:
- if (!capacity_updated)
- di->batt_data.percent = ret.intval;
- break;
- default:
- break;
- }
- }
- return 0;
-}
-
-/**
- * abx500_chargalg_external_power_changed() - callback for power supply changes
- * @psy: pointer to the structure power_supply
- *
- * This function is the entry point of the pointer external_power_changed
- * of the structure power_supply.
- * This function gets executed when there is a change in any external power
- * supply that this driver needs to be notified of.
- */
-static void abx500_chargalg_external_power_changed(struct power_supply *psy)
-{
- struct abx500_chargalg *di = power_supply_get_drvdata(psy);
-
- /*
- * Trigger execution of the algorithm instantly and read
- * all power_supply properties there instead
- */
- queue_work(di->chargalg_wq, &di->chargalg_work);
-}
-
-/**
- * abx500_chargalg_algorithm() - Main function for the algorithm
- * @di: pointer to the abx500_chargalg structure
- *
- * This is the main control function for the charging algorithm.
- * It is called periodically or when something happens that will
- * trigger a state change
- */
-static void abx500_chargalg_algorithm(struct abx500_chargalg *di)
-{
- int charger_status;
- int ret;
- int curr_step_lvl;
-
- /* Collect data from all power_supply class devices */
- class_for_each_device(power_supply_class, NULL,
- di->chargalg_psy, abx500_chargalg_get_ext_psy_data);
-
- abx500_chargalg_end_of_charge(di);
- abx500_chargalg_check_temp(di);
- abx500_chargalg_check_charger_voltage(di);
-
- charger_status = abx500_chargalg_check_charger_connection(di);
- abx500_chargalg_check_current_step_status(di);
-
- if (is_ab8500(di->parent)) {
- ret = abx500_chargalg_check_charger_enable(di);
- if (ret < 0)
- dev_err(di->dev, "Checking charger is enabled error"
- ": Returned Value %d\n", ret);
- }
-
- /*
- * First check if we have a charger connected.
- * Also we don't allow charging of unknown batteries if configured
- * this way
- */
- if (!charger_status ||
- (di->events.batt_unknown && !di->bm->chg_unknown_bat)) {
- if (di->charge_state != STATE_HANDHELD) {
- di->events.safety_timer_expired = false;
- abx500_chargalg_state_to(di, STATE_HANDHELD_INIT);
- }
- }
-
- /* If suspended, we should not continue checking the flags */
- else if (di->charge_state == STATE_SUSPENDED_INIT ||
- di->charge_state == STATE_SUSPENDED) {
- /* We don't do anything here, just don,t continue */
- }
-
- /* Safety timer expiration */
- else if (di->events.safety_timer_expired) {
- if (di->charge_state != STATE_SAFETY_TIMER_EXPIRED)
- abx500_chargalg_state_to(di,
- STATE_SAFETY_TIMER_EXPIRED_INIT);
- }
- /*
- * Check if any interrupts has occured
- * that will prevent us from charging
- */
-
- /* Battery removed */
- else if (di->events.batt_rem) {
- if (di->charge_state != STATE_BATT_REMOVED)
- abx500_chargalg_state_to(di, STATE_BATT_REMOVED_INIT);
- }
- /* Main or USB charger not ok. */
- else if (di->events.mainextchnotok || di->events.usbchargernotok) {
- /*
- * If vbus_collapsed is set, we have to lower the charger
- * current, which is done in the normal state below
- */
- if (di->charge_state != STATE_CHG_NOT_OK &&
- !di->events.vbus_collapsed)
- abx500_chargalg_state_to(di, STATE_CHG_NOT_OK_INIT);
- }
- /* VBUS, Main or VBAT OVV. */
- else if (di->events.vbus_ovv ||
- di->events.main_ovv ||
- di->events.batt_ovv ||
- !di->chg_info.usb_chg_ok ||
- !di->chg_info.ac_chg_ok) {
- if (di->charge_state != STATE_OVV_PROTECT)
- abx500_chargalg_state_to(di, STATE_OVV_PROTECT_INIT);
- }
- /* USB Thermal, stop charging */
- else if (di->events.main_thermal_prot ||
- di->events.usb_thermal_prot) {
- if (di->charge_state != STATE_HW_TEMP_PROTECT)
- abx500_chargalg_state_to(di,
- STATE_HW_TEMP_PROTECT_INIT);
- }
- /* Battery temp over/under */
- else if (di->events.btemp_underover) {
- if (di->charge_state != STATE_TEMP_UNDEROVER)
- abx500_chargalg_state_to(di,
- STATE_TEMP_UNDEROVER_INIT);
- }
- /* Watchdog expired */
- else if (di->events.ac_wd_expired ||
- di->events.usb_wd_expired) {
- if (di->charge_state != STATE_WD_EXPIRED)
- abx500_chargalg_state_to(di, STATE_WD_EXPIRED_INIT);
- }
- /* Battery temp high/low */
- else if (di->events.btemp_lowhigh) {
- if (di->charge_state != STATE_TEMP_LOWHIGH)
- abx500_chargalg_state_to(di, STATE_TEMP_LOWHIGH_INIT);
- }
-
- dev_dbg(di->dev,
- "[CHARGALG] Vb %d Ib_avg %d Ib_inst %d Tb %d Cap %d Maint %d "
- "State %s Active_chg %d Chg_status %d AC %d USB %d "
- "AC_online %d USB_online %d AC_CV %d USB_CV %d AC_I %d "
- "USB_I %d AC_Vset %d AC_Iset %d USB_Vset %d USB_Iset %d\n",
- di->batt_data.volt,
- di->batt_data.avg_curr,
- di->batt_data.inst_curr,
- di->batt_data.temp,
- di->batt_data.percent,
- di->maintenance_chg,
- states[di->charge_state],
- di->chg_info.charger_type,
- di->charge_status,
- di->chg_info.conn_chg & AC_CHG,
- di->chg_info.conn_chg & USB_CHG,
- di->chg_info.online_chg & AC_CHG,
- di->chg_info.online_chg & USB_CHG,
- di->events.ac_cv_active,
- di->events.usb_cv_active,
- di->chg_info.ac_curr,
- di->chg_info.usb_curr,
- di->chg_info.ac_vset,
- di->chg_info.ac_iset,
- di->chg_info.usb_vset,
- di->chg_info.usb_iset);
-
- switch (di->charge_state) {
- case STATE_HANDHELD_INIT:
- abx500_chargalg_stop_charging(di);
- di->charge_status = POWER_SUPPLY_STATUS_DISCHARGING;
- abx500_chargalg_state_to(di, STATE_HANDHELD);
- fallthrough;
-
- case STATE_HANDHELD:
- break;
-
- case STATE_SUSPENDED_INIT:
- if (di->susp_status.ac_suspended)
- abx500_chargalg_ac_en(di, false, 0, 0);
- if (di->susp_status.usb_suspended)
- abx500_chargalg_usb_en(di, false, 0, 0);
- abx500_chargalg_stop_safety_timer(di);
- abx500_chargalg_stop_maintenance_timer(di);
- di->charge_status = POWER_SUPPLY_STATUS_NOT_CHARGING;
- di->maintenance_chg = false;
- abx500_chargalg_state_to(di, STATE_SUSPENDED);
- power_supply_changed(di->chargalg_psy);
- fallthrough;
-
- case STATE_SUSPENDED:
- /* CHARGING is suspended */
- break;
-
- case STATE_BATT_REMOVED_INIT:
- abx500_chargalg_stop_charging(di);
- abx500_chargalg_state_to(di, STATE_BATT_REMOVED);
- fallthrough;
-
- case STATE_BATT_REMOVED:
- if (!di->events.batt_rem)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
-
- case STATE_HW_TEMP_PROTECT_INIT:
- abx500_chargalg_stop_charging(di);
- abx500_chargalg_state_to(di, STATE_HW_TEMP_PROTECT);
- fallthrough;
-
- case STATE_HW_TEMP_PROTECT:
- if (!di->events.main_thermal_prot &&
- !di->events.usb_thermal_prot)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
-
- case STATE_OVV_PROTECT_INIT:
- abx500_chargalg_stop_charging(di);
- abx500_chargalg_state_to(di, STATE_OVV_PROTECT);
- fallthrough;
-
- case STATE_OVV_PROTECT:
- if (!di->events.vbus_ovv &&
- !di->events.main_ovv &&
- !di->events.batt_ovv &&
- di->chg_info.usb_chg_ok &&
- di->chg_info.ac_chg_ok)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
-
- case STATE_CHG_NOT_OK_INIT:
- abx500_chargalg_stop_charging(di);
- abx500_chargalg_state_to(di, STATE_CHG_NOT_OK);
- fallthrough;
-
- case STATE_CHG_NOT_OK:
- if (!di->events.mainextchnotok &&
- !di->events.usbchargernotok)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
-
- case STATE_SAFETY_TIMER_EXPIRED_INIT:
- abx500_chargalg_stop_charging(di);
- abx500_chargalg_state_to(di, STATE_SAFETY_TIMER_EXPIRED);
- fallthrough;
-
- case STATE_SAFETY_TIMER_EXPIRED:
- /* We exit this state when charger is removed */
- break;
-
- case STATE_NORMAL_INIT:
- if (di->curr_status.curr_step == CHARGALG_CURR_STEP_LOW)
- abx500_chargalg_stop_charging(di);
- else {
- curr_step_lvl = di->bm->bat_type[
- di->bm->batt_id].normal_cur_lvl
- * di->curr_status.curr_step
- / CHARGALG_CURR_STEP_HIGH;
- abx500_chargalg_start_charging(di,
- di->bm->bat_type[di->bm->batt_id]
- .normal_vol_lvl, curr_step_lvl);
- }
-
- abx500_chargalg_state_to(di, STATE_NORMAL);
- abx500_chargalg_start_safety_timer(di);
- abx500_chargalg_stop_maintenance_timer(di);
- init_maxim_chg_curr(di);
- di->charge_status = POWER_SUPPLY_STATUS_CHARGING;
- di->eoc_cnt = 0;
- di->maintenance_chg = false;
- power_supply_changed(di->chargalg_psy);
-
- break;
-
- case STATE_NORMAL:
- handle_maxim_chg_curr(di);
- if (di->charge_status == POWER_SUPPLY_STATUS_FULL &&
- di->maintenance_chg) {
- if (di->bm->no_maintenance)
- abx500_chargalg_state_to(di,
- STATE_WAIT_FOR_RECHARGE_INIT);
- else
- abx500_chargalg_state_to(di,
- STATE_MAINTENANCE_A_INIT);
- }
- break;
-
- /* This state will be used when the maintenance state is disabled */
- case STATE_WAIT_FOR_RECHARGE_INIT:
- abx500_chargalg_hold_charging(di);
- abx500_chargalg_state_to(di, STATE_WAIT_FOR_RECHARGE);
- fallthrough;
-
- case STATE_WAIT_FOR_RECHARGE:
- if (di->batt_data.percent <=
- di->bm->bat_type[di->bm->batt_id].
- recharge_cap)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
-
- case STATE_MAINTENANCE_A_INIT:
- abx500_chargalg_stop_safety_timer(di);
- abx500_chargalg_start_maintenance_timer(di,
- di->bm->bat_type[
- di->bm->batt_id].maint_a_chg_timer_h);
- abx500_chargalg_start_charging(di,
- di->bm->bat_type[
- di->bm->batt_id].maint_a_vol_lvl,
- di->bm->bat_type[
- di->bm->batt_id].maint_a_cur_lvl);
- abx500_chargalg_state_to(di, STATE_MAINTENANCE_A);
- power_supply_changed(di->chargalg_psy);
- fallthrough;
-
- case STATE_MAINTENANCE_A:
- if (di->events.maintenance_timer_expired) {
- abx500_chargalg_stop_maintenance_timer(di);
- abx500_chargalg_state_to(di, STATE_MAINTENANCE_B_INIT);
- }
- break;
-
- case STATE_MAINTENANCE_B_INIT:
- abx500_chargalg_start_maintenance_timer(di,
- di->bm->bat_type[
- di->bm->batt_id].maint_b_chg_timer_h);
- abx500_chargalg_start_charging(di,
- di->bm->bat_type[
- di->bm->batt_id].maint_b_vol_lvl,
- di->bm->bat_type[
- di->bm->batt_id].maint_b_cur_lvl);
- abx500_chargalg_state_to(di, STATE_MAINTENANCE_B);
- power_supply_changed(di->chargalg_psy);
- fallthrough;
-
- case STATE_MAINTENANCE_B:
- if (di->events.maintenance_timer_expired) {
- abx500_chargalg_stop_maintenance_timer(di);
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- }
- break;
-
- case STATE_TEMP_LOWHIGH_INIT:
- abx500_chargalg_start_charging(di,
- di->bm->bat_type[
- di->bm->batt_id].low_high_vol_lvl,
- di->bm->bat_type[
- di->bm->batt_id].low_high_cur_lvl);
- abx500_chargalg_stop_maintenance_timer(di);
- di->charge_status = POWER_SUPPLY_STATUS_CHARGING;
- abx500_chargalg_state_to(di, STATE_TEMP_LOWHIGH);
- power_supply_changed(di->chargalg_psy);
- fallthrough;
-
- case STATE_TEMP_LOWHIGH:
- if (!di->events.btemp_lowhigh)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
-
- case STATE_WD_EXPIRED_INIT:
- abx500_chargalg_stop_charging(di);
- abx500_chargalg_state_to(di, STATE_WD_EXPIRED);
- fallthrough;
-
- case STATE_WD_EXPIRED:
- if (!di->events.ac_wd_expired &&
- !di->events.usb_wd_expired)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
-
- case STATE_TEMP_UNDEROVER_INIT:
- abx500_chargalg_stop_charging(di);
- abx500_chargalg_state_to(di, STATE_TEMP_UNDEROVER);
- fallthrough;
-
- case STATE_TEMP_UNDEROVER:
- if (!di->events.btemp_underover)
- abx500_chargalg_state_to(di, STATE_NORMAL_INIT);
- break;
- }
-
- /* Start charging directly if the new state is a charge state */
- if (di->charge_state == STATE_NORMAL_INIT ||
- di->charge_state == STATE_MAINTENANCE_A_INIT ||
- di->charge_state == STATE_MAINTENANCE_B_INIT)
- queue_work(di->chargalg_wq, &di->chargalg_work);
-}
-
-/**
- * abx500_chargalg_periodic_work() - Periodic work for the algorithm
- * @work: pointer to the work_struct structure
- *
- * Work queue function for the charging algorithm
- */
-static void abx500_chargalg_periodic_work(struct work_struct *work)
-{
- struct abx500_chargalg *di = container_of(work,
- struct abx500_chargalg, chargalg_periodic_work.work);
-
- abx500_chargalg_algorithm(di);
-
- /*
- * If a charger is connected then the battery has to be monitored
- * frequently, else the work can be delayed.
- */
- if (di->chg_info.conn_chg)
- queue_delayed_work(di->chargalg_wq,
- &di->chargalg_periodic_work,
- di->bm->interval_charging * HZ);
- else
- queue_delayed_work(di->chargalg_wq,
- &di->chargalg_periodic_work,
- di->bm->interval_not_charging * HZ);
-}
-
-/**
- * abx500_chargalg_wd_work() - periodic work to kick the charger watchdog
- * @work: pointer to the work_struct structure
- *
- * Work queue function for kicking the charger watchdog
- */
-static void abx500_chargalg_wd_work(struct work_struct *work)
-{
- int ret;
- struct abx500_chargalg *di = container_of(work,
- struct abx500_chargalg, chargalg_wd_work.work);
-
- dev_dbg(di->dev, "abx500_chargalg_wd_work\n");
-
- ret = abx500_chargalg_kick_watchdog(di);
- if (ret < 0)
- dev_err(di->dev, "failed to kick watchdog\n");
-
- queue_delayed_work(di->chargalg_wq,
- &di->chargalg_wd_work, CHG_WD_INTERVAL);
-}
-
-/**
- * abx500_chargalg_work() - Work to run the charging algorithm instantly
- * @work: pointer to the work_struct structure
- *
- * Work queue function for calling the charging algorithm
- */
-static void abx500_chargalg_work(struct work_struct *work)
-{
- struct abx500_chargalg *di = container_of(work,
- struct abx500_chargalg, chargalg_work);
-
- abx500_chargalg_algorithm(di);
-}
-
-/**
- * abx500_chargalg_get_property() - get the chargalg properties
- * @psy: pointer to the power_supply structure
- * @psp: pointer to the power_supply_property structure
- * @val: pointer to the power_supply_propval union
- *
- * This function gets called when an application tries to get the
- * chargalg properties by reading the sysfs files.
- * status: charging/discharging/full/unknown
- * health: health of the battery
- * Returns error code in case of failure else 0 on success
- */
-static int abx500_chargalg_get_property(struct power_supply *psy,
- enum power_supply_property psp,
- union power_supply_propval *val)
-{
- struct abx500_chargalg *di = power_supply_get_drvdata(psy);
-
- switch (psp) {
- case POWER_SUPPLY_PROP_STATUS:
- val->intval = di->charge_status;
- break;
- case POWER_SUPPLY_PROP_HEALTH:
- if (di->events.batt_ovv) {
- val->intval = POWER_SUPPLY_HEALTH_OVERVOLTAGE;
- } else if (di->events.btemp_underover) {
- if (di->batt_data.temp <= di->bm->temp_under)
- val->intval = POWER_SUPPLY_HEALTH_COLD;
- else
- val->intval = POWER_SUPPLY_HEALTH_OVERHEAT;
- } else if (di->charge_state == STATE_SAFETY_TIMER_EXPIRED ||
- di->charge_state == STATE_SAFETY_TIMER_EXPIRED_INIT) {
- val->intval = POWER_SUPPLY_HEALTH_UNSPEC_FAILURE;
- } else {
- val->intval = POWER_SUPPLY_HEALTH_GOOD;
- }
- break;
- default:
- return -EINVAL;
- }
- return 0;
-}
-
-/* Exposure to the sysfs interface */
-
-static ssize_t abx500_chargalg_curr_step_show(struct abx500_chargalg *di,
- char *buf)
-{
- return sprintf(buf, "%d\n", di->curr_status.curr_step);
-}
-
-static ssize_t abx500_chargalg_curr_step_store(struct abx500_chargalg *di,
- const char *buf, size_t length)
-{
- long int param;
- int ret;
-
- ret = kstrtol(buf, 10, ¶m);
- if (ret < 0)
- return ret;
-
- di->curr_status.curr_step = param;
- if (di->curr_status.curr_step >= CHARGALG_CURR_STEP_LOW &&
- di->curr_status.curr_step <= CHARGALG_CURR_STEP_HIGH) {
- di->curr_status.curr_step_change = true;
- queue_work(di->chargalg_wq, &di->chargalg_work);
- } else
- dev_info(di->dev, "Wrong current step\n"
- "Enter 0. Disable AC/USB Charging\n"
- "1--100. Set AC/USB charging current step\n"
- "100. Enable AC/USB Charging\n");
-
- return strlen(buf);
-}
-
-
-static ssize_t abx500_chargalg_en_show(struct abx500_chargalg *di,
- char *buf)
-{
- return sprintf(buf, "%d\n",
- di->susp_status.ac_suspended &&
- di->susp_status.usb_suspended);
-}
-
-static ssize_t abx500_chargalg_en_store(struct abx500_chargalg *di,
- const char *buf, size_t length)
-{
- long int param;
- int ac_usb;
- int ret;
-
- ret = kstrtol(buf, 10, ¶m);
- if (ret < 0)
- return ret;
-
- ac_usb = param;
- switch (ac_usb) {
- case 0:
- /* Disable charging */
- di->susp_status.ac_suspended = true;
- di->susp_status.usb_suspended = true;
- di->susp_status.suspended_change = true;
- /* Trigger a state change */
- queue_work(di->chargalg_wq,
- &di->chargalg_work);
- break;
- case 1:
- /* Enable AC Charging */
- di->susp_status.ac_suspended = false;
- di->susp_status.suspended_change = true;
- /* Trigger a state change */
- queue_work(di->chargalg_wq,
- &di->chargalg_work);
- break;
- case 2:
- /* Enable USB charging */
- di->susp_status.usb_suspended = false;
- di->susp_status.suspended_change = true;
- /* Trigger a state change */
- queue_work(di->chargalg_wq,
- &di->chargalg_work);
- break;
- default:
- dev_info(di->dev, "Wrong input\n"
- "Enter 0. Disable AC/USB Charging\n"
- "1. Enable AC charging\n"
- "2. Enable USB Charging\n");
- }
- return strlen(buf);
-}
-
-static struct abx500_chargalg_sysfs_entry abx500_chargalg_en_charger =
- __ATTR(chargalg, 0644, abx500_chargalg_en_show,
- abx500_chargalg_en_store);
-
-static struct abx500_chargalg_sysfs_entry abx500_chargalg_curr_step =
- __ATTR(chargalg_curr_step, 0644, abx500_chargalg_curr_step_show,
- abx500_chargalg_curr_step_store);
-
-static ssize_t abx500_chargalg_sysfs_show(struct kobject *kobj,
- struct attribute *attr, char *buf)
-{
- struct abx500_chargalg_sysfs_entry *entry = container_of(attr,
- struct abx500_chargalg_sysfs_entry, attr);
-
- struct abx500_chargalg *di = container_of(kobj,
- struct abx500_chargalg, chargalg_kobject);
-
- if (!entry->show)
- return -EIO;
-
- return entry->show(di, buf);
-}
-
-static ssize_t abx500_chargalg_sysfs_charger(struct kobject *kobj,
- struct attribute *attr, const char *buf, size_t length)
-{
- struct abx500_chargalg_sysfs_entry *entry = container_of(attr,
- struct abx500_chargalg_sysfs_entry, attr);
-
- struct abx500_chargalg *di = container_of(kobj,
- struct abx500_chargalg, chargalg_kobject);
-
- if (!entry->store)
- return -EIO;
-
- return entry->store(di, buf, length);
-}
-
-static struct attribute *abx500_chargalg_chg[] = {
- &abx500_chargalg_en_charger.attr,
- &abx500_chargalg_curr_step.attr,
- NULL,
-};
-
-static const struct sysfs_ops abx500_chargalg_sysfs_ops = {
- .show = abx500_chargalg_sysfs_show,
- .store = abx500_chargalg_sysfs_charger,
-};
-
-static struct kobj_type abx500_chargalg_ktype = {
- .sysfs_ops = &abx500_chargalg_sysfs_ops,
- .default_attrs = abx500_chargalg_chg,
-};
-
-/**
- * abx500_chargalg_sysfs_exit() - de-init of sysfs entry
- * @di: pointer to the struct abx500_chargalg
- *
- * This function removes the entry in sysfs.
- */
-static void abx500_chargalg_sysfs_exit(struct abx500_chargalg *di)
-{
- kobject_del(&di->chargalg_kobject);
-}
-
-/**
- * abx500_chargalg_sysfs_init() - init of sysfs entry
- * @di: pointer to the struct abx500_chargalg
- *
- * This function adds an entry in sysfs.
- * Returns error code in case of failure else 0(on success)
- */
-static int abx500_chargalg_sysfs_init(struct abx500_chargalg *di)
-{
- int ret = 0;
-
- ret = kobject_init_and_add(&di->chargalg_kobject,
- &abx500_chargalg_ktype,
- NULL, "abx500_chargalg");
- if (ret < 0)
- dev_err(di->dev, "failed to create sysfs entry\n");
-
- return ret;
-}
-/* Exposure to the sysfs interface <<END>> */
-
-static int __maybe_unused abx500_chargalg_resume(struct device *dev)
-{
- struct abx500_chargalg *di = dev_get_drvdata(dev);
-
- /* Kick charger watchdog if charging (any charger online) */
- if (di->chg_info.online_chg)
- queue_delayed_work(di->chargalg_wq, &di->chargalg_wd_work, 0);
-
- /*
- * Run the charging algorithm directly to be sure we don't
- * do it too seldom
- */
- queue_delayed_work(di->chargalg_wq, &di->chargalg_periodic_work, 0);
-
- return 0;
-}
-
-static int __maybe_unused abx500_chargalg_suspend(struct device *dev)
-{
- struct abx500_chargalg *di = dev_get_drvdata(dev);
-
- if (di->chg_info.online_chg)
- cancel_delayed_work_sync(&di->chargalg_wd_work);
-
- cancel_delayed_work_sync(&di->chargalg_periodic_work);
-
- return 0;
-}
-
-static char *supply_interface[] = {
- "ab8500_fg",
-};
-
-static const struct power_supply_desc abx500_chargalg_desc = {
- .name = "abx500_chargalg",
- .type = POWER_SUPPLY_TYPE_BATTERY,
- .properties = abx500_chargalg_props,
- .num_properties = ARRAY_SIZE(abx500_chargalg_props),
- .get_property = abx500_chargalg_get_property,
- .external_power_changed = abx500_chargalg_external_power_changed,
-};
-
-static int abx500_chargalg_bind(struct device *dev, struct device *master,
- void *data)
-{
- struct abx500_chargalg *di = dev_get_drvdata(dev);
-
- /* Create a work queue for the chargalg */
- di->chargalg_wq = alloc_ordered_workqueue("abx500_chargalg_wq",
- WQ_MEM_RECLAIM);
- if (di->chargalg_wq == NULL) {
- dev_err(di->dev, "failed to create work queue\n");
- return -ENOMEM;
- }
-
- /* Run the charging algorithm */
- queue_delayed_work(di->chargalg_wq, &di->chargalg_periodic_work, 0);
-
- return 0;
-}
-
-static void abx500_chargalg_unbind(struct device *dev, struct device *master,
- void *data)
-{
- struct abx500_chargalg *di = dev_get_drvdata(dev);
-
- /* Stop all timers and work */
- hrtimer_cancel(&di->safety_timer);
- hrtimer_cancel(&di->maintenance_timer);
-
- cancel_delayed_work_sync(&di->chargalg_periodic_work);
- cancel_delayed_work_sync(&di->chargalg_wd_work);
- cancel_work_sync(&di->chargalg_work);
-
- /* Delete the work queue */
- destroy_workqueue(di->chargalg_wq);
- flush_scheduled_work();
-}
-
-static const struct component_ops abx500_chargalg_component_ops = {
- .bind = abx500_chargalg_bind,
- .unbind = abx500_chargalg_unbind,
-};
-
-static int abx500_chargalg_probe(struct platform_device *pdev)
-{
- struct device *dev = &pdev->dev;
- struct power_supply_config psy_cfg = {};
- struct abx500_chargalg *di;
- int ret = 0;
-
- di = devm_kzalloc(dev, sizeof(*di), GFP_KERNEL);
- if (!di)
- return -ENOMEM;
-
- di->bm = &ab8500_bm_data;
-
- /* get device struct and parent */
- di->dev = dev;
- di->parent = dev_get_drvdata(pdev->dev.parent);
-
- psy_cfg.supplied_to = supply_interface;
- psy_cfg.num_supplicants = ARRAY_SIZE(supply_interface);
- psy_cfg.drv_data = di;
-
- /* Initilialize safety timer */
- hrtimer_init(&di->safety_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
- di->safety_timer.function = abx500_chargalg_safety_timer_expired;
-
- /* Initilialize maintenance timer */
- hrtimer_init(&di->maintenance_timer, CLOCK_REALTIME, HRTIMER_MODE_ABS);
- di->maintenance_timer.function =
- abx500_chargalg_maintenance_timer_expired;
-
- /* Init work for chargalg */
- INIT_DEFERRABLE_WORK(&di->chargalg_periodic_work,
- abx500_chargalg_periodic_work);
- INIT_DEFERRABLE_WORK(&di->chargalg_wd_work,
- abx500_chargalg_wd_work);
-
- /* Init work for chargalg */
- INIT_WORK(&di->chargalg_work, abx500_chargalg_work);
-
- /* To detect charger at startup */
- di->chg_info.prev_conn_chg = -1;
-
- /* Register chargalg power supply class */
- di->chargalg_psy = devm_power_supply_register(di->dev,
- &abx500_chargalg_desc,
- &psy_cfg);
- if (IS_ERR(di->chargalg_psy)) {
- dev_err(di->dev, "failed to register chargalg psy\n");
- return PTR_ERR(di->chargalg_psy);
- }
-
- platform_set_drvdata(pdev, di);
-
- /* sysfs interface to enable/disable charging from user space */
- ret = abx500_chargalg_sysfs_init(di);
- if (ret) {
- dev_err(di->dev, "failed to create sysfs entry\n");
- return ret;
- }
- di->curr_status.curr_step = CHARGALG_CURR_STEP_HIGH;
-
- dev_info(di->dev, "probe success\n");
- return component_add(dev, &abx500_chargalg_component_ops);
-}
-
-static int abx500_chargalg_remove(struct platform_device *pdev)
-{
- struct abx500_chargalg *di = platform_get_drvdata(pdev);
-
- component_del(&pdev->dev, &abx500_chargalg_component_ops);
-
- /* sysfs interface to enable/disable charging from user space */
- abx500_chargalg_sysfs_exit(di);
-
- return 0;
-}
-
-static SIMPLE_DEV_PM_OPS(abx500_chargalg_pm_ops, abx500_chargalg_suspend, abx500_chargalg_resume);
-
-static const struct of_device_id ab8500_chargalg_match[] = {
- { .compatible = "stericsson,ab8500-chargalg", },
- { },
-};
-
-struct platform_driver abx500_chargalg_driver = {
- .probe = abx500_chargalg_probe,
- .remove = abx500_chargalg_remove,
- .driver = {
- .name = "ab8500-chargalg",
- .of_match_table = ab8500_chargalg_match,
- .pm = &abx500_chargalg_pm_ops,
- },
-};
-MODULE_LICENSE("GPL v2");
-MODULE_AUTHOR("Johan Palsson, Karl Komierowski");
-MODULE_ALIAS("platform:abx500-chargalg");
-MODULE_DESCRIPTION("abx500 battery charging algorithm");
if (val == 0)
return -ENODEV;
- info = devm_kzalloc(&pdev->dev, sizeof(*info), GFP_KERNEL);
+ info = devm_kzalloc(dev, sizeof(*info), GFP_KERNEL);
if (!info)
return -ENOMEM;
info->cable.edev = extcon_get_extcon_dev(AXP288_EXTCON_DEV_NAME);
if (info->cable.edev == NULL) {
- dev_dbg(&pdev->dev, "%s is not ready, probe deferred\n",
+ dev_dbg(dev, "%s is not ready, probe deferred\n",
AXP288_EXTCON_DEV_NAME);
return -EPROBE_DEFER;
}
dev_dbg(dev, "EXTCON_USB_HOST is not ready, probe deferred\n");
return -EPROBE_DEFER;
}
- dev_info(&pdev->dev,
- "Using " USB_HOST_EXTCON_HID " extcon for usb-id\n");
+ dev_info(dev, "Using " USB_HOST_EXTCON_HID " extcon for usb-id\n");
}
platform_set_drvdata(pdev, info);
INIT_WORK(&info->otg.work, axp288_charger_otg_evt_worker);
info->otg.id_nb.notifier_call = axp288_charger_handle_otg_evt;
if (info->otg.cable) {
- ret = devm_extcon_register_notifier(&pdev->dev, info->otg.cable,
+ ret = devm_extcon_register_notifier(dev, info->otg.cable,
EXTCON_USB_HOST, &info->otg.id_nb);
if (ret) {
dev_err(dev, "failed to register EXTCON_USB_HOST notifier\n");
NULL, axp288_charger_irq_thread_handler,
IRQF_ONESHOT, info->pdev->name, info);
if (ret) {
- dev_err(&pdev->dev, "failed to request interrupt=%d\n",
+ dev_err(dev, "failed to request interrupt=%d\n",
info->irq[i]);
return ret;
}
/*
* axp288_fuel_gauge.c - Xpower AXP288 PMIC Fuel Gauge Driver
*
- * Copyright (C) 2016-2017 Hans de Goede <hdegoede@redhat.com>
+ * Copyright (C) 2020-2021 Andrejus Basovas <xxx@yyy.tld>
+ * Copyright (C) 2016-2021 Hans de Goede <hdegoede@redhat.com>
* Copyright (C) 2014 Intel Corporation
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <linux/platform_device.h>
#include <linux/power_supply.h>
#include <linux/iio/consumer.h>
-#include <linux/debugfs.h>
-#include <linux/seq_file.h>
#include <asm/unaligned.h>
+#include <asm/iosf_mbi.h>
-#define PS_STAT_VBUS_TRIGGER (1 << 0)
-#define PS_STAT_BAT_CHRG_DIR (1 << 2)
-#define PS_STAT_VBAT_ABOVE_VHOLD (1 << 3)
-#define PS_STAT_VBUS_VALID (1 << 4)
-#define PS_STAT_VBUS_PRESENT (1 << 5)
+#define PS_STAT_VBUS_TRIGGER (1 << 0)
+#define PS_STAT_BAT_CHRG_DIR (1 << 2)
+#define PS_STAT_VBAT_ABOVE_VHOLD (1 << 3)
+#define PS_STAT_VBUS_VALID (1 << 4)
+#define PS_STAT_VBUS_PRESENT (1 << 5)
-#define CHRG_STAT_BAT_SAFE_MODE (1 << 3)
+#define CHRG_STAT_BAT_SAFE_MODE (1 << 3)
#define CHRG_STAT_BAT_VALID (1 << 4)
-#define CHRG_STAT_BAT_PRESENT (1 << 5)
+#define CHRG_STAT_BAT_PRESENT (1 << 5)
#define CHRG_STAT_CHARGING (1 << 6)
#define CHRG_STAT_PMIC_OTP (1 << 7)
#define CHRG_CCCV_CC_MASK 0xf /* 4 bits */
-#define CHRG_CCCV_CC_BIT_POS 0
+#define CHRG_CCCV_CC_BIT_POS 0
#define CHRG_CCCV_CC_OFFSET 200 /* 200mA */
-#define CHRG_CCCV_CC_LSB_RES 200 /* 200mA */
+#define CHRG_CCCV_CC_LSB_RES 200 /* 200mA */
#define CHRG_CCCV_ITERM_20P (1 << 4) /* 20% of CC */
#define CHRG_CCCV_CV_MASK 0x60 /* 2 bits */
-#define CHRG_CCCV_CV_BIT_POS 5
+#define CHRG_CCCV_CV_BIT_POS 5
#define CHRG_CCCV_CV_4100MV 0x0 /* 4.10V */
#define CHRG_CCCV_CV_4150MV 0x1 /* 4.15V */
#define CHRG_CCCV_CV_4200MV 0x2 /* 4.20V */
#define CHRG_CCCV_CV_4350MV 0x3 /* 4.35V */
#define CHRG_CCCV_CHG_EN (1 << 7)
-#define FG_CNTL_OCV_ADJ_STAT (1 << 2)
+#define FG_CNTL_OCV_ADJ_STAT (1 << 2)
#define FG_CNTL_OCV_ADJ_EN (1 << 3)
-#define FG_CNTL_CAP_ADJ_STAT (1 << 4)
+#define FG_CNTL_CAP_ADJ_STAT (1 << 4)
#define FG_CNTL_CAP_ADJ_EN (1 << 5)
#define FG_CNTL_CC_EN (1 << 6)
#define FG_CNTL_GAUGE_EN (1 << 7)
#define FG_CC_CAP_VALID (1 << 7)
#define FG_CC_CAP_VAL_MASK 0x7F
-#define FG_LOW_CAP_THR1_MASK 0xf0 /* 5% tp 20% */
+#define FG_LOW_CAP_THR1_MASK 0xf0 /* 5% tp 20% */
#define FG_LOW_CAP_THR1_VAL 0xa0 /* 15 perc */
-#define FG_LOW_CAP_THR2_MASK 0x0f /* 0% to 15% */
+#define FG_LOW_CAP_THR2_MASK 0x0f /* 0% to 15% */
#define FG_LOW_CAP_WARN_THR 14 /* 14 perc */
#define FG_LOW_CAP_CRIT_THR 4 /* 4 perc */
#define FG_LOW_CAP_SHDN_THR 0 /* 0 perc */
-#define NR_RETRY_CNT 3
-#define DEV_NAME "axp288_fuel_gauge"
+#define DEV_NAME "axp288_fuel_gauge"
/* 1.1mV per LSB expressed in uV */
#define VOLTAGE_FROM_ADC(a) ((a * 11) / 10)
/* properties converted to uV, uA */
-#define PROP_VOLT(a) ((a) * 1000)
-#define PROP_CURR(a) ((a) * 1000)
+#define PROP_VOLT(a) ((a) * 1000)
+#define PROP_CURR(a) ((a) * 1000)
-#define AXP288_FG_INTR_NUM 6
+#define AXP288_REG_UPDATE_INTERVAL (60 * HZ)
+#define AXP288_FG_INTR_NUM 6
enum {
QWBTU_IRQ = 0,
WBTU_IRQ,
};
enum {
- BAT_TEMP = 0,
- PMIC_TEMP,
- SYSTEM_TEMP,
BAT_CHRG_CURR,
BAT_D_CURR,
BAT_VOLT,
};
struct axp288_fg_info {
- struct platform_device *pdev;
+ struct device *dev;
struct regmap *regmap;
struct regmap_irq_chip_data *regmap_irqc;
int irq[AXP288_FG_INTR_NUM];
struct mutex lock;
int status;
int max_volt;
+ int pwr_op;
+ int low_cap;
struct dentry *debug_file;
+
+ char valid; /* zero until following fields are valid */
+ unsigned long last_updated; /* in jiffies */
+
+ int pwr_stat;
+ int fg_res;
+ int bat_volt;
+ int d_curr;
+ int c_curr;
+ int ocv;
+ int fg_cc_mtr1;
+ int fg_des_cap1;
};
static enum power_supply_property fuel_gauge_props[] = {
static int fuel_gauge_reg_readb(struct axp288_fg_info *info, int reg)
{
- int ret, i;
unsigned int val;
+ int ret;
- for (i = 0; i < NR_RETRY_CNT; i++) {
- ret = regmap_read(info->regmap, reg, &val);
- if (ret != -EBUSY)
- break;
- }
-
+ ret = regmap_read(info->regmap, reg, &val);
if (ret < 0) {
- dev_err(&info->pdev->dev, "axp288 reg read err:%d\n", ret);
+ dev_err(info->dev, "Error reading reg 0x%02x err: %d\n", reg, ret);
return ret;
}
ret = regmap_write(info->regmap, reg, (unsigned int)val);
if (ret < 0)
- dev_err(&info->pdev->dev, "axp288 reg write err:%d\n", ret);
+ dev_err(info->dev, "Error writing reg 0x%02x err: %d\n", reg, ret);
return ret;
}
ret = regmap_bulk_read(info->regmap, reg, buf, 2);
if (ret < 0) {
- dev_err(&info->pdev->dev, "Error reading reg 0x%02x err: %d\n",
- reg, ret);
+ dev_err(info->dev, "Error reading reg 0x%02x err: %d\n", reg, ret);
return ret;
}
ret = get_unaligned_be16(buf);
if (!(ret & FG_15BIT_WORD_VALID)) {
- dev_err(&info->pdev->dev, "Error reg 0x%02x contents not valid\n",
- reg);
+ dev_err(info->dev, "Error reg 0x%02x contents not valid\n", reg);
return -ENXIO;
}
ret = regmap_bulk_read(info->regmap, reg, buf, 2);
if (ret < 0) {
- dev_err(&info->pdev->dev, "Error reading reg 0x%02x err: %d\n",
- reg, ret);
+ dev_err(info->dev, "Error reading reg 0x%02x err: %d\n", reg, ret);
return ret;
}
return (buf[0] << 4) | ((buf[1] >> 4) & 0x0f);
}
-#ifdef CONFIG_DEBUG_FS
-static int fuel_gauge_debug_show(struct seq_file *s, void *data)
+static int fuel_gauge_update_registers(struct axp288_fg_info *info)
{
- struct axp288_fg_info *info = s->private;
- int raw_val, ret;
-
- seq_printf(s, " PWR_STATUS[%02x] : %02x\n",
- AXP20X_PWR_INPUT_STATUS,
- fuel_gauge_reg_readb(info, AXP20X_PWR_INPUT_STATUS));
- seq_printf(s, "PWR_OP_MODE[%02x] : %02x\n",
- AXP20X_PWR_OP_MODE,
- fuel_gauge_reg_readb(info, AXP20X_PWR_OP_MODE));
- seq_printf(s, " CHRG_CTRL1[%02x] : %02x\n",
- AXP20X_CHRG_CTRL1,
- fuel_gauge_reg_readb(info, AXP20X_CHRG_CTRL1));
- seq_printf(s, " VLTF[%02x] : %02x\n",
- AXP20X_V_LTF_DISCHRG,
- fuel_gauge_reg_readb(info, AXP20X_V_LTF_DISCHRG));
- seq_printf(s, " VHTF[%02x] : %02x\n",
- AXP20X_V_HTF_DISCHRG,
- fuel_gauge_reg_readb(info, AXP20X_V_HTF_DISCHRG));
- seq_printf(s, " CC_CTRL[%02x] : %02x\n",
- AXP20X_CC_CTRL,
- fuel_gauge_reg_readb(info, AXP20X_CC_CTRL));
- seq_printf(s, "BATTERY CAP[%02x] : %02x\n",
- AXP20X_FG_RES,
- fuel_gauge_reg_readb(info, AXP20X_FG_RES));
- seq_printf(s, " FG_RDC1[%02x] : %02x\n",
- AXP288_FG_RDC1_REG,
- fuel_gauge_reg_readb(info, AXP288_FG_RDC1_REG));
- seq_printf(s, " FG_RDC0[%02x] : %02x\n",
- AXP288_FG_RDC0_REG,
- fuel_gauge_reg_readb(info, AXP288_FG_RDC0_REG));
- seq_printf(s, " FG_OCV[%02x] : %04x\n",
- AXP288_FG_OCVH_REG,
- fuel_gauge_read_12bit_word(info, AXP288_FG_OCVH_REG));
- seq_printf(s, " FG_DES_CAP[%02x] : %04x\n",
- AXP288_FG_DES_CAP1_REG,
- fuel_gauge_read_15bit_word(info, AXP288_FG_DES_CAP1_REG));
- seq_printf(s, " FG_CC_MTR[%02x] : %04x\n",
- AXP288_FG_CC_MTR1_REG,
- fuel_gauge_read_15bit_word(info, AXP288_FG_CC_MTR1_REG));
- seq_printf(s, " FG_OCV_CAP[%02x] : %02x\n",
- AXP288_FG_OCV_CAP_REG,
- fuel_gauge_reg_readb(info, AXP288_FG_OCV_CAP_REG));
- seq_printf(s, " FG_CC_CAP[%02x] : %02x\n",
- AXP288_FG_CC_CAP_REG,
- fuel_gauge_reg_readb(info, AXP288_FG_CC_CAP_REG));
- seq_printf(s, " FG_LOW_CAP[%02x] : %02x\n",
- AXP288_FG_LOW_CAP_REG,
- fuel_gauge_reg_readb(info, AXP288_FG_LOW_CAP_REG));
- seq_printf(s, "TUNING_CTL0[%02x] : %02x\n",
- AXP288_FG_TUNE0,
- fuel_gauge_reg_readb(info, AXP288_FG_TUNE0));
- seq_printf(s, "TUNING_CTL1[%02x] : %02x\n",
- AXP288_FG_TUNE1,
- fuel_gauge_reg_readb(info, AXP288_FG_TUNE1));
- seq_printf(s, "TUNING_CTL2[%02x] : %02x\n",
- AXP288_FG_TUNE2,
- fuel_gauge_reg_readb(info, AXP288_FG_TUNE2));
- seq_printf(s, "TUNING_CTL3[%02x] : %02x\n",
- AXP288_FG_TUNE3,
- fuel_gauge_reg_readb(info, AXP288_FG_TUNE3));
- seq_printf(s, "TUNING_CTL4[%02x] : %02x\n",
- AXP288_FG_TUNE4,
- fuel_gauge_reg_readb(info, AXP288_FG_TUNE4));
- seq_printf(s, "TUNING_CTL5[%02x] : %02x\n",
- AXP288_FG_TUNE5,
- fuel_gauge_reg_readb(info, AXP288_FG_TUNE5));
-
- ret = iio_read_channel_raw(info->iio_channel[BAT_TEMP], &raw_val);
- if (ret >= 0)
- seq_printf(s, "axp288-batttemp : %d\n", raw_val);
- ret = iio_read_channel_raw(info->iio_channel[PMIC_TEMP], &raw_val);
- if (ret >= 0)
- seq_printf(s, "axp288-pmictemp : %d\n", raw_val);
- ret = iio_read_channel_raw(info->iio_channel[SYSTEM_TEMP], &raw_val);
- if (ret >= 0)
- seq_printf(s, "axp288-systtemp : %d\n", raw_val);
- ret = iio_read_channel_raw(info->iio_channel[BAT_CHRG_CURR], &raw_val);
- if (ret >= 0)
- seq_printf(s, "axp288-chrgcurr : %d\n", raw_val);
- ret = iio_read_channel_raw(info->iio_channel[BAT_D_CURR], &raw_val);
- if (ret >= 0)
- seq_printf(s, "axp288-dchrgcur : %d\n", raw_val);
- ret = iio_read_channel_raw(info->iio_channel[BAT_VOLT], &raw_val);
- if (ret >= 0)
- seq_printf(s, "axp288-battvolt : %d\n", raw_val);
+ int ret;
- return 0;
-}
+ if (info->valid && time_before(jiffies, info->last_updated + AXP288_REG_UPDATE_INTERVAL))
+ return 0;
-DEFINE_SHOW_ATTRIBUTE(fuel_gauge_debug);
+ dev_dbg(info->dev, "Fuel Gauge updating register values...\n");
-static void fuel_gauge_create_debugfs(struct axp288_fg_info *info)
-{
- info->debug_file = debugfs_create_file("fuelgauge", 0666, NULL,
- info, &fuel_gauge_debug_fops);
-}
+ ret = iosf_mbi_block_punit_i2c_access();
+ if (ret < 0)
+ return ret;
-static void fuel_gauge_remove_debugfs(struct axp288_fg_info *info)
-{
- debugfs_remove(info->debug_file);
-}
-#else
-static inline void fuel_gauge_create_debugfs(struct axp288_fg_info *info)
-{
-}
-static inline void fuel_gauge_remove_debugfs(struct axp288_fg_info *info)
-{
+ ret = fuel_gauge_reg_readb(info, AXP20X_PWR_INPUT_STATUS);
+ if (ret < 0)
+ goto out;
+ info->pwr_stat = ret;
+
+ ret = fuel_gauge_reg_readb(info, AXP20X_FG_RES);
+ if (ret < 0)
+ goto out;
+ info->fg_res = ret;
+
+ ret = iio_read_channel_raw(info->iio_channel[BAT_VOLT], &info->bat_volt);
+ if (ret < 0)
+ goto out;
+
+ if (info->pwr_stat & PS_STAT_BAT_CHRG_DIR) {
+ info->d_curr = 0;
+ ret = iio_read_channel_raw(info->iio_channel[BAT_CHRG_CURR], &info->c_curr);
+ if (ret < 0)
+ goto out;
+ } else {
+ info->c_curr = 0;
+ ret = iio_read_channel_raw(info->iio_channel[BAT_D_CURR], &info->d_curr);
+ if (ret < 0)
+ goto out;
+ }
+
+ ret = fuel_gauge_read_12bit_word(info, AXP288_FG_OCVH_REG);
+ if (ret < 0)
+ goto out;
+ info->ocv = ret;
+
+ ret = fuel_gauge_read_15bit_word(info, AXP288_FG_CC_MTR1_REG);
+ if (ret < 0)
+ goto out;
+ info->fg_cc_mtr1 = ret;
+
+ ret = fuel_gauge_read_15bit_word(info, AXP288_FG_DES_CAP1_REG);
+ if (ret < 0)
+ goto out;
+ info->fg_des_cap1 = ret;
+
+ info->last_updated = jiffies;
+ info->valid = 1;
+ ret = 0;
+out:
+ iosf_mbi_unblock_punit_i2c_access();
+ return ret;
}
-#endif
static void fuel_gauge_get_status(struct axp288_fg_info *info)
{
- int pwr_stat, fg_res, curr, ret;
-
- pwr_stat = fuel_gauge_reg_readb(info, AXP20X_PWR_INPUT_STATUS);
- if (pwr_stat < 0) {
- dev_err(&info->pdev->dev,
- "PWR STAT read failed:%d\n", pwr_stat);
- return;
- }
+ int pwr_stat = info->pwr_stat;
+ int fg_res = info->fg_res;
+ int curr = info->d_curr;
/* Report full if Vbus is valid and the reported capacity is 100% */
if (!(pwr_stat & PS_STAT_VBUS_VALID))
goto not_full;
- fg_res = fuel_gauge_reg_readb(info, AXP20X_FG_RES);
- if (fg_res < 0) {
- dev_err(&info->pdev->dev, "FG RES read failed: %d\n", fg_res);
- return;
- }
if (!(fg_res & FG_REP_CAP_VALID))
goto not_full;
if (fg_res < 90 || (pwr_stat & PS_STAT_BAT_CHRG_DIR))
goto not_full;
- ret = iio_read_channel_raw(info->iio_channel[BAT_D_CURR], &curr);
- if (ret < 0) {
- dev_err(&info->pdev->dev, "FG get current failed: %d\n", ret);
- return;
- }
if (curr == 0) {
info->status = POWER_SUPPLY_STATUS_FULL;
return;
info->status = POWER_SUPPLY_STATUS_DISCHARGING;
}
-static int fuel_gauge_get_vbatt(struct axp288_fg_info *info, int *vbatt)
-{
- int ret = 0, raw_val;
-
- ret = iio_read_channel_raw(info->iio_channel[BAT_VOLT], &raw_val);
- if (ret < 0)
- goto vbatt_read_fail;
-
- *vbatt = VOLTAGE_FROM_ADC(raw_val);
-vbatt_read_fail:
- return ret;
-}
-
-static int fuel_gauge_get_current(struct axp288_fg_info *info, int *cur)
-{
- int ret, discharge;
-
- /* First check discharge current, so that we do only 1 read on bat. */
- ret = iio_read_channel_raw(info->iio_channel[BAT_D_CURR], &discharge);
- if (ret < 0)
- return ret;
-
- if (discharge > 0) {
- *cur = -1 * discharge;
- return 0;
- }
-
- return iio_read_channel_raw(info->iio_channel[BAT_CHRG_CURR], cur);
-}
-
-static int fuel_gauge_get_vocv(struct axp288_fg_info *info, int *vocv)
-{
- int ret;
-
- ret = fuel_gauge_read_12bit_word(info, AXP288_FG_OCVH_REG);
- if (ret >= 0)
- *vocv = VOLTAGE_FROM_ADC(ret);
-
- return ret;
-}
-
static int fuel_gauge_battery_health(struct axp288_fg_info *info)
{
- int ret, vocv, health = POWER_SUPPLY_HEALTH_UNKNOWN;
-
- ret = fuel_gauge_get_vocv(info, &vocv);
- if (ret < 0)
- goto health_read_fail;
+ int vocv = VOLTAGE_FROM_ADC(info->ocv);
+ int health = POWER_SUPPLY_HEALTH_UNKNOWN;
if (vocv > info->max_volt)
health = POWER_SUPPLY_HEALTH_OVERVOLTAGE;
else
health = POWER_SUPPLY_HEALTH_GOOD;
-health_read_fail:
return health;
}
union power_supply_propval *val)
{
struct axp288_fg_info *info = power_supply_get_drvdata(ps);
- int ret = 0, value;
+ int ret, value;
mutex_lock(&info->lock);
+
+ ret = fuel_gauge_update_registers(info);
+ if (ret < 0)
+ goto out;
+
switch (prop) {
case POWER_SUPPLY_PROP_STATUS:
fuel_gauge_get_status(info);
val->intval = fuel_gauge_battery_health(info);
break;
case POWER_SUPPLY_PROP_VOLTAGE_NOW:
- ret = fuel_gauge_get_vbatt(info, &value);
- if (ret < 0)
- goto fuel_gauge_read_err;
+ value = VOLTAGE_FROM_ADC(info->bat_volt);
val->intval = PROP_VOLT(value);
break;
case POWER_SUPPLY_PROP_VOLTAGE_OCV:
- ret = fuel_gauge_get_vocv(info, &value);
- if (ret < 0)
- goto fuel_gauge_read_err;
+ value = VOLTAGE_FROM_ADC(info->ocv);
val->intval = PROP_VOLT(value);
break;
case POWER_SUPPLY_PROP_CURRENT_NOW:
- ret = fuel_gauge_get_current(info, &value);
- if (ret < 0)
- goto fuel_gauge_read_err;
+ if (info->d_curr > 0)
+ value = -1 * info->d_curr;
+ else
+ value = info->c_curr;
+
val->intval = PROP_CURR(value);
break;
case POWER_SUPPLY_PROP_PRESENT:
- ret = fuel_gauge_reg_readb(info, AXP20X_PWR_OP_MODE);
- if (ret < 0)
- goto fuel_gauge_read_err;
-
- if (ret & CHRG_STAT_BAT_PRESENT)
+ if (info->pwr_op & CHRG_STAT_BAT_PRESENT)
val->intval = 1;
else
val->intval = 0;
break;
case POWER_SUPPLY_PROP_CAPACITY:
- ret = fuel_gauge_reg_readb(info, AXP20X_FG_RES);
- if (ret < 0)
- goto fuel_gauge_read_err;
-
- if (!(ret & FG_REP_CAP_VALID))
- dev_err(&info->pdev->dev,
- "capacity measurement not valid\n");
- val->intval = (ret & FG_REP_CAP_VAL_MASK);
+ if (!(info->fg_res & FG_REP_CAP_VALID))
+ dev_err(info->dev, "capacity measurement not valid\n");
+ val->intval = (info->fg_res & FG_REP_CAP_VAL_MASK);
break;
case POWER_SUPPLY_PROP_CAPACITY_ALERT_MIN:
- ret = fuel_gauge_reg_readb(info, AXP288_FG_LOW_CAP_REG);
- if (ret < 0)
- goto fuel_gauge_read_err;
- val->intval = (ret & 0x0f);
+ val->intval = (info->low_cap & 0x0f);
break;
case POWER_SUPPLY_PROP_TECHNOLOGY:
val->intval = POWER_SUPPLY_TECHNOLOGY_LION;
break;
case POWER_SUPPLY_PROP_CHARGE_NOW:
- ret = fuel_gauge_read_15bit_word(info, AXP288_FG_CC_MTR1_REG);
- if (ret < 0)
- goto fuel_gauge_read_err;
-
- val->intval = ret * FG_DES_CAP_RES_LSB;
+ val->intval = info->fg_cc_mtr1 * FG_DES_CAP_RES_LSB;
break;
case POWER_SUPPLY_PROP_CHARGE_FULL:
- ret = fuel_gauge_read_15bit_word(info, AXP288_FG_DES_CAP1_REG);
- if (ret < 0)
- goto fuel_gauge_read_err;
-
- val->intval = ret * FG_DES_CAP_RES_LSB;
+ val->intval = info->fg_des_cap1 * FG_DES_CAP_RES_LSB;
break;
case POWER_SUPPLY_PROP_VOLTAGE_MAX_DESIGN:
val->intval = PROP_VOLT(info->max_volt);
break;
default:
- mutex_unlock(&info->lock);
- return -EINVAL;
+ ret = -EINVAL;
}
- mutex_unlock(&info->lock);
- return 0;
-
-fuel_gauge_read_err:
+out:
mutex_unlock(&info->lock);
return ret;
}
const union power_supply_propval *val)
{
struct axp288_fg_info *info = power_supply_get_drvdata(ps);
- int ret = 0;
+ int new_low_cap, ret = 0;
mutex_lock(&info->lock);
switch (prop) {
ret = -EINVAL;
break;
}
- ret = fuel_gauge_reg_readb(info, AXP288_FG_LOW_CAP_REG);
- if (ret < 0)
- break;
- ret &= 0xf0;
- ret |= (val->intval & 0xf);
- ret = fuel_gauge_reg_writeb(info, AXP288_FG_LOW_CAP_REG, ret);
+ new_low_cap = info->low_cap;
+ new_low_cap &= 0xf0;
+ new_low_cap |= (val->intval & 0xf);
+ ret = fuel_gauge_reg_writeb(info, AXP288_FG_LOW_CAP_REG, new_low_cap);
+ if (ret == 0)
+ info->low_cap = new_low_cap;
break;
default:
ret = -EINVAL;
}
if (i >= AXP288_FG_INTR_NUM) {
- dev_warn(&info->pdev->dev, "spurious interrupt!!\n");
+ dev_warn(info->dev, "spurious interrupt!!\n");
return IRQ_NONE;
}
switch (i) {
case QWBTU_IRQ:
- dev_info(&info->pdev->dev,
- "Quit Battery under temperature in work mode IRQ (QWBTU)\n");
+ dev_info(info->dev, "Quit Battery under temperature in work mode IRQ (QWBTU)\n");
break;
case WBTU_IRQ:
- dev_info(&info->pdev->dev,
- "Battery under temperature in work mode IRQ (WBTU)\n");
+ dev_info(info->dev, "Battery under temperature in work mode IRQ (WBTU)\n");
break;
case QWBTO_IRQ:
- dev_info(&info->pdev->dev,
- "Quit Battery over temperature in work mode IRQ (QWBTO)\n");
+ dev_info(info->dev, "Quit Battery over temperature in work mode IRQ (QWBTO)\n");
break;
case WBTO_IRQ:
- dev_info(&info->pdev->dev,
- "Battery over temperature in work mode IRQ (WBTO)\n");
+ dev_info(info->dev, "Battery over temperature in work mode IRQ (WBTO)\n");
break;
case WL2_IRQ:
- dev_info(&info->pdev->dev, "Low Batt Warning(2) INTR\n");
+ dev_info(info->dev, "Low Batt Warning(2) INTR\n");
break;
case WL1_IRQ:
- dev_info(&info->pdev->dev, "Low Batt Warning(1) INTR\n");
+ dev_info(info->dev, "Low Batt Warning(1) INTR\n");
break;
default:
- dev_warn(&info->pdev->dev, "Spurious Interrupt!!!\n");
+ dev_warn(info->dev, "Spurious Interrupt!!!\n");
}
+ info->valid = 0; /* Force updating of the cached registers */
+
power_supply_changed(info->bat);
return IRQ_HANDLED;
}
{
struct axp288_fg_info *info = power_supply_get_drvdata(psy);
+ info->valid = 0; /* Force updating of the cached registers */
power_supply_changed(info->bat);
}
.external_power_changed = fuel_gauge_external_power_changed,
};
-static void fuel_gauge_init_irq(struct axp288_fg_info *info)
+static void fuel_gauge_init_irq(struct axp288_fg_info *info, struct platform_device *pdev)
{
int ret, i, pirq;
for (i = 0; i < AXP288_FG_INTR_NUM; i++) {
- pirq = platform_get_irq(info->pdev, i);
+ pirq = platform_get_irq(pdev, i);
info->irq[i] = regmap_irq_get_virq(info->regmap_irqc, pirq);
if (info->irq[i] < 0) {
- dev_warn(&info->pdev->dev,
- "regmap_irq get virq failed for IRQ %d: %d\n",
+ dev_warn(info->dev, "regmap_irq get virq failed for IRQ %d: %d\n",
pirq, info->irq[i]);
info->irq[i] = -1;
goto intr_failed;
NULL, fuel_gauge_thread_handler,
IRQF_ONESHOT, DEV_NAME, info);
if (ret) {
- dev_warn(&info->pdev->dev,
- "request irq failed for IRQ %d: %d\n",
+ dev_warn(info->dev, "request irq failed for IRQ %d: %d\n",
pirq, info->irq[i]);
info->irq[i] = -1;
goto intr_failed;
- } else {
- dev_info(&info->pdev->dev, "HW IRQ %d -> VIRQ %d\n",
- pirq, info->irq[i]);
}
}
return;
struct axp20x_dev *axp20x = dev_get_drvdata(pdev->dev.parent);
struct power_supply_config psy_cfg = {};
static const char * const iio_chan_name[] = {
- [BAT_TEMP] = "axp288-batt-temp",
- [PMIC_TEMP] = "axp288-pmic-temp",
- [SYSTEM_TEMP] = "axp288-system-temp",
[BAT_CHRG_CURR] = "axp288-chrg-curr",
[BAT_D_CURR] = "axp288-chrg-d-curr",
[BAT_VOLT] = "axp288-batt-volt",
if (dmi_check_system(axp288_no_battery_list))
return -ENODEV;
- /*
- * On some devices the fuelgauge and charger parts of the axp288 are
- * not used, check that the fuelgauge is enabled (CC_CTRL != 0).
- */
- ret = regmap_read(axp20x->regmap, AXP20X_CC_CTRL, &val);
- if (ret < 0)
- return ret;
- if (val == 0)
- return -ENODEV;
-
info = devm_kzalloc(&pdev->dev, sizeof(*info), GFP_KERNEL);
if (!info)
return -ENOMEM;
- info->pdev = pdev;
+ info->dev = &pdev->dev;
info->regmap = axp20x->regmap;
info->regmap_irqc = axp20x->regmap_irqc;
info->status = POWER_SUPPLY_STATUS_UNKNOWN;
+ info->valid = 0;
platform_set_drvdata(pdev, info);
}
}
- ret = fuel_gauge_reg_readb(info, AXP288_FG_DES_CAP1_REG);
+ ret = iosf_mbi_block_punit_i2c_access();
if (ret < 0)
goto out_free_iio_chan;
+ /*
+ * On some devices the fuelgauge and charger parts of the axp288 are
+ * not used, check that the fuelgauge is enabled (CC_CTRL != 0).
+ */
+ ret = regmap_read(axp20x->regmap, AXP20X_CC_CTRL, &val);
+ if (ret < 0)
+ goto unblock_punit_i2c_access;
+ if (val == 0) {
+ ret = -ENODEV;
+ goto unblock_punit_i2c_access;
+ }
+
+ ret = fuel_gauge_reg_readb(info, AXP288_FG_DES_CAP1_REG);
+ if (ret < 0)
+ goto unblock_punit_i2c_access;
+
if (!(ret & FG_DES_CAP1_VALID)) {
dev_err(&pdev->dev, "axp288 not configured by firmware\n");
ret = -ENODEV;
- goto out_free_iio_chan;
+ goto unblock_punit_i2c_access;
}
ret = fuel_gauge_reg_readb(info, AXP20X_CHRG_CTRL1);
if (ret < 0)
- goto out_free_iio_chan;
+ goto unblock_punit_i2c_access;
switch ((ret & CHRG_CCCV_CV_MASK) >> CHRG_CCCV_CV_BIT_POS) {
case CHRG_CCCV_CV_4100MV:
info->max_volt = 4100;
break;
}
+ ret = fuel_gauge_reg_readb(info, AXP20X_PWR_OP_MODE);
+ if (ret < 0)
+ goto unblock_punit_i2c_access;
+ info->pwr_op = ret;
+
+ ret = fuel_gauge_reg_readb(info, AXP288_FG_LOW_CAP_REG);
+ if (ret < 0)
+ goto unblock_punit_i2c_access;
+ info->low_cap = ret;
+
+unblock_punit_i2c_access:
+ iosf_mbi_unblock_punit_i2c_access();
+ /* In case we arrive here by goto because of a register access error */
+ if (ret < 0)
+ goto out_free_iio_chan;
+
psy_cfg.drv_data = info;
info->bat = power_supply_register(&pdev->dev, &fuel_gauge_desc, &psy_cfg);
if (IS_ERR(info->bat)) {
goto out_free_iio_chan;
}
- fuel_gauge_create_debugfs(info);
- fuel_gauge_init_irq(info);
+ fuel_gauge_init_irq(info, pdev);
return 0;
int i;
power_supply_unregister(info->bat);
- fuel_gauge_remove_debugfs(info);
for (i = 0; i < AXP288_FG_INTR_NUM; i++)
if (info->irq[i] >= 0)
#include <linux/power/bq24735-charger.h>
-#define BQ24735_CHG_OPT 0x12
-#define BQ24735_CHG_OPT_CHARGE_DISABLE (1 << 0)
-#define BQ24735_CHG_OPT_AC_PRESENT (1 << 4)
+/* BQ24735 available commands and their respective masks */
+#define BQ24735_CHARGE_OPT 0x12
#define BQ24735_CHARGE_CURRENT 0x14
#define BQ24735_CHARGE_CURRENT_MASK 0x1fc0
#define BQ24735_CHARGE_VOLTAGE 0x15
#define BQ24735_MANUFACTURER_ID 0xfe
#define BQ24735_DEVICE_ID 0xff
+/* ChargeOptions bits of interest */
+#define BQ24735_CHARGE_OPT_CHG_DISABLE (1 << 0)
+#define BQ24735_CHARGE_OPT_AC_PRESENT (1 << 4)
+
struct bq24735 {
struct power_supply *charger;
struct power_supply_desc charger_desc;
if (ret)
return ret;
- return bq24735_update_word(charger->client, BQ24735_CHG_OPT,
- BQ24735_CHG_OPT_CHARGE_DISABLE, 0);
+ return bq24735_update_word(charger->client, BQ24735_CHARGE_OPT,
+ BQ24735_CHARGE_OPT_CHG_DISABLE, 0);
}
static inline int bq24735_disable_charging(struct bq24735 *charger)
if (charger->pdata->ext_control)
return 0;
- return bq24735_update_word(charger->client, BQ24735_CHG_OPT,
- BQ24735_CHG_OPT_CHARGE_DISABLE,
- BQ24735_CHG_OPT_CHARGE_DISABLE);
+ return bq24735_update_word(charger->client, BQ24735_CHARGE_OPT,
+ BQ24735_CHARGE_OPT_CHG_DISABLE,
+ BQ24735_CHARGE_OPT_CHG_DISABLE);
}
static bool bq24735_charger_is_present(struct bq24735 *charger)
} else {
int ac = 0;
- ac = bq24735_read_word(charger->client, BQ24735_CHG_OPT);
+ ac = bq24735_read_word(charger->client, BQ24735_CHARGE_OPT);
if (ac < 0) {
dev_dbg(&charger->client->dev,
"Failed to read charger options : %d\n",
ac);
return false;
}
- return (ac & BQ24735_CHG_OPT_AC_PRESENT) ? true : false;
+ return (ac & BQ24735_CHARGE_OPT_AC_PRESENT) ? true : false;
}
return false;
if (!bq24735_charger_is_present(charger))
return 0;
- ret = bq24735_read_word(charger->client, BQ24735_CHG_OPT);
+ ret = bq24735_read_word(charger->client, BQ24735_CHARGE_OPT);
if (ret < 0)
return ret;
- return !(ret & BQ24735_CHG_OPT_CHARGE_DISABLE);
+ return !(ret & BQ24735_CHARGE_OPT_CHG_DISABLE);
}
static void bq24735_update(struct bq24735 *charger)
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Power supply driver for ChromeOS EC based Peripheral Device Charger.
+ *
+ * Copyright 2020 Google LLC.
+ */
+
+#include <linux/module.h>
+#include <linux/notifier.h>
+#include <linux/platform_data/cros_ec_commands.h>
+#include <linux/platform_data/cros_ec_proto.h>
+#include <linux/platform_device.h>
+#include <linux/power_supply.h>
+#include <linux/slab.h>
+#include <linux/stringify.h>
+#include <linux/types.h>
+
+#define DRV_NAME "cros-ec-pchg"
+#define PCHG_DIR_PREFIX "peripheral"
+#define PCHG_DIR_NAME PCHG_DIR_PREFIX "%d"
+#define PCHG_DIR_NAME_LENGTH \
+ sizeof(PCHG_DIR_PREFIX __stringify(EC_PCHG_MAX_PORTS))
+#define PCHG_CACHE_UPDATE_DELAY msecs_to_jiffies(500)
+
+struct port_data {
+ int port_number;
+ char name[PCHG_DIR_NAME_LENGTH];
+ struct power_supply *psy;
+ struct power_supply_desc psy_desc;
+ int psy_status;
+ int battery_percentage;
+ int charge_type;
+ struct charger_data *charger;
+ unsigned long last_update;
+};
+
+struct charger_data {
+ struct device *dev;
+ struct cros_ec_dev *ec_dev;
+ struct cros_ec_device *ec_device;
+ int num_registered_psy;
+ struct port_data *ports[EC_PCHG_MAX_PORTS];
+ struct notifier_block notifier;
+};
+
+static enum power_supply_property cros_pchg_props[] = {
+ POWER_SUPPLY_PROP_STATUS,
+ POWER_SUPPLY_PROP_CHARGE_TYPE,
+ POWER_SUPPLY_PROP_CAPACITY,
+ POWER_SUPPLY_PROP_SCOPE,
+};
+
+static int cros_pchg_ec_command(const struct charger_data *charger,
+ unsigned int version,
+ unsigned int command,
+ const void *outdata,
+ unsigned int outsize,
+ void *indata,
+ unsigned int insize)
+{
+ struct cros_ec_dev *ec_dev = charger->ec_dev;
+ struct cros_ec_command *msg;
+ int ret;
+
+ msg = kzalloc(sizeof(*msg) + max(outsize, insize), GFP_KERNEL);
+ if (!msg)
+ return -ENOMEM;
+
+ msg->version = version;
+ msg->command = ec_dev->cmd_offset + command;
+ msg->outsize = outsize;
+ msg->insize = insize;
+
+ if (outsize)
+ memcpy(msg->data, outdata, outsize);
+
+ ret = cros_ec_cmd_xfer_status(charger->ec_device, msg);
+ if (ret >= 0 && insize)
+ memcpy(indata, msg->data, insize);
+
+ kfree(msg);
+ return ret;
+}
+
+static const unsigned int pchg_cmd_version = 1;
+
+static bool cros_pchg_cmd_ver_check(const struct charger_data *charger)
+{
+ struct ec_params_get_cmd_versions_v1 req;
+ struct ec_response_get_cmd_versions rsp;
+ int ret;
+
+ req.cmd = EC_CMD_PCHG;
+ ret = cros_pchg_ec_command(charger, 1, EC_CMD_GET_CMD_VERSIONS,
+ &req, sizeof(req), &rsp, sizeof(rsp));
+ if (ret < 0) {
+ dev_warn(charger->dev,
+ "Unable to get versions of EC_CMD_PCHG (err:%d)\n",
+ ret);
+ return false;
+ }
+
+ return !!(rsp.version_mask & BIT(pchg_cmd_version));
+}
+
+static int cros_pchg_port_count(const struct charger_data *charger)
+{
+ struct ec_response_pchg_count rsp;
+ int ret;
+
+ ret = cros_pchg_ec_command(charger, 0, EC_CMD_PCHG_COUNT,
+ NULL, 0, &rsp, sizeof(rsp));
+ if (ret < 0) {
+ dev_warn(charger->dev,
+ "Unable to get number or ports (err:%d)\n", ret);
+ return ret;
+ }
+
+ return rsp.port_count;
+}
+
+static int cros_pchg_get_status(struct port_data *port)
+{
+ struct charger_data *charger = port->charger;
+ struct ec_params_pchg req;
+ struct ec_response_pchg rsp;
+ struct device *dev = charger->dev;
+ int old_status = port->psy_status;
+ int old_percentage = port->battery_percentage;
+ int ret;
+
+ req.port = port->port_number;
+ ret = cros_pchg_ec_command(charger, pchg_cmd_version, EC_CMD_PCHG,
+ &req, sizeof(req), &rsp, sizeof(rsp));
+ if (ret < 0) {
+ dev_err(dev, "Unable to get port.%d status (err:%d)\n",
+ port->port_number, ret);
+ return ret;
+ }
+
+ switch (rsp.state) {
+ case PCHG_STATE_RESET:
+ case PCHG_STATE_INITIALIZED:
+ case PCHG_STATE_ENABLED:
+ default:
+ port->psy_status = POWER_SUPPLY_STATUS_UNKNOWN;
+ port->charge_type = POWER_SUPPLY_CHARGE_TYPE_NONE;
+ break;
+ case PCHG_STATE_DETECTED:
+ port->psy_status = POWER_SUPPLY_STATUS_CHARGING;
+ port->charge_type = POWER_SUPPLY_CHARGE_TYPE_TRICKLE;
+ break;
+ case PCHG_STATE_CHARGING:
+ port->psy_status = POWER_SUPPLY_STATUS_CHARGING;
+ port->charge_type = POWER_SUPPLY_CHARGE_TYPE_STANDARD;
+ break;
+ case PCHG_STATE_FULL:
+ port->psy_status = POWER_SUPPLY_STATUS_FULL;
+ port->charge_type = POWER_SUPPLY_CHARGE_TYPE_NONE;
+ break;
+ }
+
+ port->battery_percentage = rsp.battery_percentage;
+
+ if (port->psy_status != old_status ||
+ port->battery_percentage != old_percentage)
+ power_supply_changed(port->psy);
+
+ dev_dbg(dev,
+ "Port %d: state=%d battery=%d%%\n",
+ port->port_number, rsp.state, rsp.battery_percentage);
+
+ return 0;
+}
+
+static int cros_pchg_get_port_status(struct port_data *port, bool ratelimit)
+{
+ int ret;
+
+ if (ratelimit &&
+ time_is_after_jiffies(port->last_update + PCHG_CACHE_UPDATE_DELAY))
+ return 0;
+
+ ret = cros_pchg_get_status(port);
+ if (ret < 0)
+ return ret;
+
+ port->last_update = jiffies;
+
+ return ret;
+}
+
+static int cros_pchg_get_prop(struct power_supply *psy,
+ enum power_supply_property psp,
+ union power_supply_propval *val)
+{
+ struct port_data *port = power_supply_get_drvdata(psy);
+
+ switch (psp) {
+ case POWER_SUPPLY_PROP_STATUS:
+ case POWER_SUPPLY_PROP_CAPACITY:
+ case POWER_SUPPLY_PROP_CHARGE_TYPE:
+ cros_pchg_get_port_status(port, true);
+ break;
+ default:
+ break;
+ }
+
+ switch (psp) {
+ case POWER_SUPPLY_PROP_STATUS:
+ val->intval = port->psy_status;
+ break;
+ case POWER_SUPPLY_PROP_CAPACITY:
+ val->intval = port->battery_percentage;
+ break;
+ case POWER_SUPPLY_PROP_CHARGE_TYPE:
+ val->intval = port->charge_type;
+ break;
+ case POWER_SUPPLY_PROP_SCOPE:
+ val->intval = POWER_SUPPLY_SCOPE_DEVICE;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int cros_pchg_event(const struct charger_data *charger,
+ unsigned long host_event)
+{
+ int i;
+
+ for (i = 0; i < charger->num_registered_psy; i++)
+ cros_pchg_get_port_status(charger->ports[i], false);
+
+ return NOTIFY_OK;
+}
+
+static u32 cros_get_device_event(const struct charger_data *charger)
+{
+ struct ec_params_device_event req;
+ struct ec_response_device_event rsp;
+ struct device *dev = charger->dev;
+ int ret;
+
+ req.param = EC_DEVICE_EVENT_PARAM_GET_CURRENT_EVENTS;
+ ret = cros_pchg_ec_command(charger, 0, EC_CMD_DEVICE_EVENT,
+ &req, sizeof(req), &rsp, sizeof(rsp));
+ if (ret < 0) {
+ dev_warn(dev, "Unable to get device events (err:%d)\n", ret);
+ return 0;
+ }
+
+ return rsp.event_mask;
+}
+
+static int cros_ec_notify(struct notifier_block *nb,
+ unsigned long queued_during_suspend,
+ void *data)
+{
+ struct cros_ec_device *ec_dev = (struct cros_ec_device *)data;
+ u32 host_event = cros_ec_get_host_event(ec_dev);
+ struct charger_data *charger =
+ container_of(nb, struct charger_data, notifier);
+ u32 device_event_mask;
+
+ if (!host_event)
+ return NOTIFY_DONE;
+
+ if (!(host_event & EC_HOST_EVENT_MASK(EC_HOST_EVENT_DEVICE)))
+ return NOTIFY_DONE;
+
+ /*
+ * todo: Retrieve device event mask in common place
+ * (e.g. cros_ec_proto.c).
+ */
+ device_event_mask = cros_get_device_event(charger);
+ if (!(device_event_mask & EC_DEVICE_EVENT_MASK(EC_DEVICE_EVENT_WLC)))
+ return NOTIFY_DONE;
+
+ return cros_pchg_event(charger, host_event);
+}
+
+static int cros_pchg_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct cros_ec_dev *ec_dev = dev_get_drvdata(dev->parent);
+ struct cros_ec_device *ec_device = ec_dev->ec_dev;
+ struct power_supply_desc *psy_desc;
+ struct charger_data *charger;
+ struct power_supply *psy;
+ struct port_data *port;
+ struct notifier_block *nb;
+ int num_ports;
+ int ret;
+ int i;
+
+ charger = devm_kzalloc(dev, sizeof(*charger), GFP_KERNEL);
+ if (!charger)
+ return -ENOMEM;
+
+ charger->dev = dev;
+ charger->ec_dev = ec_dev;
+ charger->ec_device = ec_device;
+
+ ret = cros_pchg_port_count(charger);
+ if (ret <= 0) {
+ /*
+ * This feature is enabled by the EC and the kernel driver is
+ * included by default for CrOS devices. Don't need to be loud
+ * since this error can be normal.
+ */
+ dev_info(dev, "No peripheral charge ports (err:%d)\n", ret);
+ return -ENODEV;
+ }
+
+ if (!cros_pchg_cmd_ver_check(charger)) {
+ dev_err(dev, "EC_CMD_PCHG version %d isn't available.\n",
+ pchg_cmd_version);
+ return -EOPNOTSUPP;
+ }
+
+ num_ports = ret;
+ if (num_ports > EC_PCHG_MAX_PORTS) {
+ dev_err(dev, "Too many peripheral charge ports (%d)\n",
+ num_ports);
+ return -ENOBUFS;
+ }
+
+ dev_info(dev, "%d peripheral charge ports found\n", num_ports);
+
+ for (i = 0; i < num_ports; i++) {
+ struct power_supply_config psy_cfg = {};
+
+ port = devm_kzalloc(dev, sizeof(*port), GFP_KERNEL);
+ if (!port)
+ return -ENOMEM;
+
+ port->charger = charger;
+ port->port_number = i;
+ snprintf(port->name, sizeof(port->name), PCHG_DIR_NAME, i);
+
+ psy_desc = &port->psy_desc;
+ psy_desc->name = port->name;
+ psy_desc->type = POWER_SUPPLY_TYPE_BATTERY;
+ psy_desc->get_property = cros_pchg_get_prop;
+ psy_desc->external_power_changed = NULL;
+ psy_desc->properties = cros_pchg_props;
+ psy_desc->num_properties = ARRAY_SIZE(cros_pchg_props);
+ psy_cfg.drv_data = port;
+
+ psy = devm_power_supply_register(dev, psy_desc, &psy_cfg);
+ if (IS_ERR(psy))
+ return dev_err_probe(dev, PTR_ERR(psy),
+ "Failed to register power supply\n");
+ port->psy = psy;
+
+ charger->ports[charger->num_registered_psy++] = port;
+ }
+
+ if (!charger->num_registered_psy)
+ return -ENODEV;
+
+ nb = &charger->notifier;
+ nb->notifier_call = cros_ec_notify;
+ ret = blocking_notifier_chain_register(&ec_dev->ec_dev->event_notifier,
+ nb);
+ if (ret < 0)
+ dev_err(dev, "Failed to register notifier (err:%d)\n", ret);
+
+ return 0;
+}
+
+static struct platform_driver cros_pchg_driver = {
+ .driver = {
+ .name = DRV_NAME,
+ },
+ .probe = cros_pchg_probe
+};
+
+module_platform_driver(cros_pchg_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("ChromeOS EC peripheral device charger");
+MODULE_ALIAS("platform:" DRV_NAME);
&cw2015_bat_desc,
&psy_cfg);
if (IS_ERR(cw_bat->rk_bat)) {
- dev_err(cw_bat->dev, "Failed to register power supply\n");
+ /* try again if this happens */
+ dev_err_probe(&client->dev, PTR_ERR(cw_bat->rk_bat),
+ "Failed to register power supply\n");
return PTR_ERR(cw_bat->rk_bat);
}
/* Interrupt mask bits */
#define CONFIG_ALRT_BIT_ENBL (1 << 2)
-#define STATUS_INTR_SOCMIN_BIT (1 << 10)
-#define STATUS_INTR_SOCMAX_BIT (1 << 14)
#define VFSOC0_LOCK 0x0000
#define VFSOC0_UNLOCK 0x0080
case POWER_SUPPLY_PROP_VOLTAGE_MIN_DESIGN:
if (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17042)
ret = regmap_read(map, MAX17042_V_empty, &data);
- else if (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17055)
- ret = regmap_read(map, MAX17055_V_empty, &data);
else
ret = regmap_read(map, MAX17047_V_empty, &data);
if (ret < 0)
struct max17042_config_data *config = chip->pdata->config_data;
max17042_override_por(map, MAX17042_TGAIN, config->tgain);
- max17042_override_por(map, MAx17042_TOFF, config->toff);
+ max17042_override_por(map, MAX17042_TOFF, config->toff);
max17042_override_por(map, MAX17042_CGAIN, config->cgain);
max17042_override_por(map, MAX17042_COFF, config->coff);
max17042_override_por(map, MAX17042_FilterCFG, config->filter_cfg);
max17042_override_por(map, MAX17042_RelaxCFG, config->relax_cfg);
max17042_override_por(map, MAX17042_MiscCFG, config->misc_cfg);
- max17042_override_por(map, MAX17042_MaskSOC, config->masksoc);
max17042_override_por(map, MAX17042_FullCAP, config->fullcap);
max17042_override_por(map, MAX17042_FullCAPNom, config->fullcapnom);
- if (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17042)
- max17042_override_por(map, MAX17042_SOC_empty,
- config->socempty);
- max17042_override_por(map, MAX17042_LAvg_empty, config->lavg_empty);
max17042_override_por(map, MAX17042_dQacc, config->dqacc);
max17042_override_por(map, MAX17042_dPacc, config->dpacc);
- if (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17042)
- max17042_override_por(map, MAX17042_V_empty, config->vempty);
- if (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17055)
- max17042_override_por(map, MAX17055_V_empty, config->vempty);
- else
- max17042_override_por(map, MAX17047_V_empty, config->vempty);
- max17042_override_por(map, MAX17042_TempNom, config->temp_nom);
- max17042_override_por(map, MAX17042_TempLim, config->temp_lim);
- max17042_override_por(map, MAX17042_FCTC, config->fctc);
max17042_override_por(map, MAX17042_RCOMP0, config->rcomp0);
max17042_override_por(map, MAX17042_TempCo, config->tcompc0);
- if (chip->chip_type &&
- ((chip->chip_type == MAXIM_DEVICE_TYPE_MAX17042) ||
+
+ if (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17042) {
+ max17042_override_por(map, MAX17042_MaskSOC, config->masksoc);
+ max17042_override_por(map, MAX17042_SOC_empty, config->socempty);
+ max17042_override_por(map, MAX17042_V_empty, config->vempty);
+ max17042_override_por(map, MAX17042_EmptyTempCo, config->empty_tempco);
+ max17042_override_por(map, MAX17042_K_empty0, config->kempty0);
+ }
+
+ if ((chip->chip_type == MAXIM_DEVICE_TYPE_MAX17042) ||
(chip->chip_type == MAXIM_DEVICE_TYPE_MAX17047) ||
- (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17050))) {
- max17042_override_por(map, MAX17042_EmptyTempCo,
- config->empty_tempco);
- max17042_override_por(map, MAX17042_K_empty0,
- config->kempty0);
+ (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17050)) {
+ max17042_override_por(map, MAX17042_LAvg_empty, config->lavg_empty);
+ max17042_override_por(map, MAX17042_TempNom, config->temp_nom);
+ max17042_override_por(map, MAX17042_TempLim, config->temp_lim);
+ max17042_override_por(map, MAX17042_FCTC, config->fctc);
+ }
+
+ if ((chip->chip_type == MAXIM_DEVICE_TYPE_MAX17047) ||
+ (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17050) ||
+ (chip->chip_type == MAXIM_DEVICE_TYPE_MAX17055)) {
+ max17042_override_por(map, MAX17047_V_empty, config->vempty);
}
}
{
struct max17042_chip *chip = dev;
u32 val;
+ int ret;
- regmap_read(chip->regmap, MAX17042_STATUS, &val);
- if ((val & STATUS_INTR_SOCMIN_BIT) ||
- (val & STATUS_INTR_SOCMAX_BIT)) {
- dev_info(&chip->client->dev, "SOC threshold INTR\n");
+ ret = regmap_read(chip->regmap, MAX17042_STATUS, &val);
+ if (ret)
+ return IRQ_HANDLED;
+
+ if ((val & STATUS_SMN_BIT) || (val & STATUS_SMX_BIT)) {
+ dev_dbg(&chip->client->dev, "SOC threshold INTR\n");
max17042_set_soc_threshold(chip, 1);
}
{ .compatible = "maxim,max17047" },
{ .compatible = "maxim,max17050" },
{ .compatible = "maxim,max17055" },
+ { .compatible = "maxim,max77849-battery" },
{ },
};
MODULE_DEVICE_TABLE(of, max17042_dt_match);
{ "max17047", MAXIM_DEVICE_TYPE_MAX17047 },
{ "max17050", MAXIM_DEVICE_TYPE_MAX17050 },
{ "max17055", MAXIM_DEVICE_TYPE_MAX17055 },
+ { "max77849-battery", MAXIM_DEVICE_TYPE_MAX17047 },
{ }
};
MODULE_DEVICE_TABLE(i2c, max17042_id);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 MediaTek Inc.
+ */
+
+#include <linux/devm-helpers.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/linear_range.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/power_supply.h>
+#include <linux/property.h>
+#include <linux/regmap.h>
+#include <linux/regulator/driver.h>
+
+#define MT6360_PMU_CHG_CTRL1 0x311
+#define MT6360_PMU_CHG_CTRL2 0x312
+#define MT6360_PMU_CHG_CTRL3 0x313
+#define MT6360_PMU_CHG_CTRL4 0x314
+#define MT6360_PMU_CHG_CTRL5 0x315
+#define MT6360_PMU_CHG_CTRL6 0x316
+#define MT6360_PMU_CHG_CTRL7 0x317
+#define MT6360_PMU_CHG_CTRL8 0x318
+#define MT6360_PMU_CHG_CTRL9 0x319
+#define MT6360_PMU_CHG_CTRL10 0x31A
+#define MT6360_PMU_DEVICE_TYPE 0x322
+#define MT6360_PMU_USB_STATUS1 0x327
+#define MT6360_PMU_CHG_STAT 0x34A
+#define MT6360_PMU_CHG_CTRL19 0x361
+#define MT6360_PMU_FOD_STAT 0x3E7
+
+/* MT6360_PMU_CHG_CTRL1 */
+#define MT6360_FSLP_SHFT (3)
+#define MT6360_FSLP_MASK BIT(MT6360_FSLP_SHFT)
+#define MT6360_OPA_MODE_SHFT (0)
+#define MT6360_OPA_MODE_MASK BIT(MT6360_OPA_MODE_SHFT)
+/* MT6360_PMU_CHG_CTRL2 */
+#define MT6360_IINLMTSEL_SHFT (2)
+#define MT6360_IINLMTSEL_MASK GENMASK(3, 2)
+/* MT6360_PMU_CHG_CTRL3 */
+#define MT6360_IAICR_SHFT (2)
+#define MT6360_IAICR_MASK GENMASK(7, 2)
+#define MT6360_ILIM_EN_MASK BIT(0)
+/* MT6360_PMU_CHG_CTRL4 */
+#define MT6360_VOREG_SHFT (1)
+#define MT6360_VOREG_MASK GENMASK(7, 1)
+/* MT6360_PMU_CHG_CTRL5 */
+#define MT6360_VOBST_MASK GENMASK(7, 2)
+/* MT6360_PMU_CHG_CTRL6 */
+#define MT6360_VMIVR_SHFT (1)
+#define MT6360_VMIVR_MASK GENMASK(7, 1)
+/* MT6360_PMU_CHG_CTRL7 */
+#define MT6360_ICHG_SHFT (2)
+#define MT6360_ICHG_MASK GENMASK(7, 2)
+/* MT6360_PMU_CHG_CTRL8 */
+#define MT6360_IPREC_SHFT (0)
+#define MT6360_IPREC_MASK GENMASK(3, 0)
+/* MT6360_PMU_CHG_CTRL9 */
+#define MT6360_IEOC_SHFT (4)
+#define MT6360_IEOC_MASK GENMASK(7, 4)
+/* MT6360_PMU_CHG_CTRL10 */
+#define MT6360_OTG_OC_MASK GENMASK(3, 0)
+/* MT6360_PMU_DEVICE_TYPE */
+#define MT6360_USBCHGEN_MASK BIT(7)
+/* MT6360_PMU_USB_STATUS1 */
+#define MT6360_USB_STATUS_SHFT (4)
+#define MT6360_USB_STATUS_MASK GENMASK(6, 4)
+/* MT6360_PMU_CHG_STAT */
+#define MT6360_CHG_STAT_SHFT (6)
+#define MT6360_CHG_STAT_MASK GENMASK(7, 6)
+#define MT6360_VBAT_LVL_MASK BIT(5)
+/* MT6360_PMU_CHG_CTRL19 */
+#define MT6360_VINOVP_SHFT (5)
+#define MT6360_VINOVP_MASK GENMASK(6, 5)
+/* MT6360_PMU_FOD_STAT */
+#define MT6360_CHRDET_EXT_MASK BIT(4)
+
+/* uV */
+#define MT6360_VMIVR_MIN 3900000
+#define MT6360_VMIVR_MAX 13400000
+#define MT6360_VMIVR_STEP 100000
+/* uA */
+#define MT6360_ICHG_MIN 100000
+#define MT6360_ICHG_MAX 5000000
+#define MT6360_ICHG_STEP 100000
+/* uV */
+#define MT6360_VOREG_MIN 3900000
+#define MT6360_VOREG_MAX 4710000
+#define MT6360_VOREG_STEP 10000
+/* uA */
+#define MT6360_AICR_MIN 100000
+#define MT6360_AICR_MAX 3250000
+#define MT6360_AICR_STEP 50000
+/* uA */
+#define MT6360_IPREC_MIN 100000
+#define MT6360_IPREC_MAX 850000
+#define MT6360_IPREC_STEP 50000
+/* uA */
+#define MT6360_IEOC_MIN 100000
+#define MT6360_IEOC_MAX 850000
+#define MT6360_IEOC_STEP 50000
+
+enum {
+ MT6360_RANGE_VMIVR,
+ MT6360_RANGE_ICHG,
+ MT6360_RANGE_VOREG,
+ MT6360_RANGE_AICR,
+ MT6360_RANGE_IPREC,
+ MT6360_RANGE_IEOC,
+ MT6360_RANGE_MAX,
+};
+
+#define MT6360_LINEAR_RANGE(idx, _min, _min_sel, _max_sel, _step) \
+ [idx] = REGULATOR_LINEAR_RANGE(_min, _min_sel, _max_sel, _step)
+
+static const struct linear_range mt6360_chg_range[MT6360_RANGE_MAX] = {
+ MT6360_LINEAR_RANGE(MT6360_RANGE_VMIVR, 3900000, 0, 0x5F, 100000),
+ MT6360_LINEAR_RANGE(MT6360_RANGE_ICHG, 100000, 0, 0x31, 100000),
+ MT6360_LINEAR_RANGE(MT6360_RANGE_VOREG, 3900000, 0, 0x51, 10000),
+ MT6360_LINEAR_RANGE(MT6360_RANGE_AICR, 100000, 0, 0x3F, 50000),
+ MT6360_LINEAR_RANGE(MT6360_RANGE_IPREC, 100000, 0, 0x0F, 50000),
+ MT6360_LINEAR_RANGE(MT6360_RANGE_IEOC, 100000, 0, 0x0F, 50000),
+};
+
+struct mt6360_chg_info {
+ struct device *dev;
+ struct regmap *regmap;
+ struct power_supply_desc psy_desc;
+ struct power_supply *psy;
+ struct regulator_dev *otg_rdev;
+ struct mutex chgdet_lock;
+ u32 vinovp;
+ bool pwr_rdy;
+ bool bc12_en;
+ int psy_usb_type;
+ struct work_struct chrdet_work;
+};
+
+enum mt6360_iinlmtsel {
+ MT6360_IINLMTSEL_AICR_3250 = 0,
+ MT6360_IINLMTSEL_CHG_TYPE,
+ MT6360_IINLMTSEL_AICR,
+ MT6360_IINLMTSEL_LOWER_LEVEL,
+};
+
+enum mt6360_pmu_chg_type {
+ MT6360_CHG_TYPE_NOVBUS = 0,
+ MT6360_CHG_TYPE_UNDER_GOING,
+ MT6360_CHG_TYPE_SDP,
+ MT6360_CHG_TYPE_SDPNSTD,
+ MT6360_CHG_TYPE_DCP,
+ MT6360_CHG_TYPE_CDP,
+ MT6360_CHG_TYPE_DISABLE_BC12,
+ MT6360_CHG_TYPE_MAX,
+};
+
+static enum power_supply_usb_type mt6360_charger_usb_types[] = {
+ POWER_SUPPLY_USB_TYPE_UNKNOWN,
+ POWER_SUPPLY_USB_TYPE_SDP,
+ POWER_SUPPLY_USB_TYPE_DCP,
+ POWER_SUPPLY_USB_TYPE_CDP,
+};
+
+static int mt6360_get_chrdet_ext_stat(struct mt6360_chg_info *mci,
+ bool *pwr_rdy)
+{
+ int ret;
+ unsigned int regval;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_FOD_STAT, ®val);
+ if (ret < 0)
+ return ret;
+ *pwr_rdy = (regval & MT6360_CHRDET_EXT_MASK) ? true : false;
+ return 0;
+}
+
+static int mt6360_charger_get_online(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int ret;
+ bool pwr_rdy;
+
+ ret = mt6360_get_chrdet_ext_stat(mci, &pwr_rdy);
+ if (ret < 0)
+ return ret;
+ val->intval = pwr_rdy ? true : false;
+ return 0;
+}
+
+static int mt6360_charger_get_status(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int status, ret;
+ unsigned int regval;
+ bool pwr_rdy;
+
+ ret = mt6360_get_chrdet_ext_stat(mci, &pwr_rdy);
+ if (ret < 0)
+ return ret;
+ if (!pwr_rdy) {
+ status = POWER_SUPPLY_STATUS_DISCHARGING;
+ goto out;
+ }
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_STAT, ®val);
+ if (ret < 0)
+ return ret;
+ regval &= MT6360_CHG_STAT_MASK;
+ regval >>= MT6360_CHG_STAT_SHFT;
+ switch (regval) {
+ case 0x0:
+ status = POWER_SUPPLY_STATUS_NOT_CHARGING;
+ break;
+ case 0x1:
+ status = POWER_SUPPLY_STATUS_CHARGING;
+ break;
+ case 0x2:
+ status = POWER_SUPPLY_STATUS_FULL;
+ break;
+ default:
+ ret = -EIO;
+ }
+out:
+ if (!ret)
+ val->intval = status;
+ return ret;
+}
+
+static int mt6360_charger_get_charge_type(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int type, ret;
+ unsigned int regval;
+ u8 chg_stat;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_STAT, ®val);
+ if (ret < 0)
+ return ret;
+
+ chg_stat = (regval & MT6360_CHG_STAT_MASK) >> MT6360_CHG_STAT_SHFT;
+ switch (chg_stat) {
+ case 0x01: /* Charge in Progress */
+ if (regval & MT6360_VBAT_LVL_MASK)
+ type = POWER_SUPPLY_CHARGE_TYPE_FAST;
+ else
+ type = POWER_SUPPLY_CHARGE_TYPE_TRICKLE;
+ break;
+ case 0x00: /* Not Charging */
+ case 0x02: /* Charge Done */
+ case 0x03: /* Charge Fault */
+ default:
+ type = POWER_SUPPLY_CHARGE_TYPE_NONE;
+ break;
+ }
+
+ val->intval = type;
+ return 0;
+}
+
+static int mt6360_charger_get_ichg(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int ret;
+ u32 sel, value;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_CTRL7, &sel);
+ if (ret < 0)
+ return ret;
+ sel = (sel & MT6360_ICHG_MASK) >> MT6360_ICHG_SHFT;
+ ret = linear_range_get_value(&mt6360_chg_range[MT6360_RANGE_ICHG], sel, &value);
+ if (!ret)
+ val->intval = value;
+ return ret;
+}
+
+static int mt6360_charger_get_max_ichg(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ val->intval = MT6360_ICHG_MAX;
+ return 0;
+}
+
+static int mt6360_charger_get_cv(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int ret;
+ u32 sel, value;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_CTRL4, &sel);
+ if (ret < 0)
+ return ret;
+ sel = (sel & MT6360_VOREG_MASK) >> MT6360_VOREG_SHFT;
+ ret = linear_range_get_value(&mt6360_chg_range[MT6360_RANGE_VOREG], sel, &value);
+ if (!ret)
+ val->intval = value;
+ return ret;
+}
+
+static int mt6360_charger_get_max_cv(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ val->intval = MT6360_VOREG_MAX;
+ return 0;
+}
+
+static int mt6360_charger_get_aicr(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int ret;
+ u32 sel, value;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_CTRL3, &sel);
+ if (ret < 0)
+ return ret;
+ sel = (sel & MT6360_IAICR_MASK) >> MT6360_IAICR_SHFT;
+ ret = linear_range_get_value(&mt6360_chg_range[MT6360_RANGE_AICR], sel, &value);
+ if (!ret)
+ val->intval = value;
+ return ret;
+}
+
+static int mt6360_charger_get_mivr(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int ret;
+ u32 sel, value;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_CTRL6, &sel);
+ if (ret < 0)
+ return ret;
+ sel = (sel & MT6360_VMIVR_MASK) >> MT6360_VMIVR_SHFT;
+ ret = linear_range_get_value(&mt6360_chg_range[MT6360_RANGE_VMIVR], sel, &value);
+ if (!ret)
+ val->intval = value;
+ return ret;
+}
+
+static int mt6360_charger_get_iprechg(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int ret;
+ u32 sel, value;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_CTRL8, &sel);
+ if (ret < 0)
+ return ret;
+ sel = (sel & MT6360_IPREC_MASK) >> MT6360_IPREC_SHFT;
+ ret = linear_range_get_value(&mt6360_chg_range[MT6360_RANGE_IPREC], sel, &value);
+ if (!ret)
+ val->intval = value;
+ return ret;
+}
+
+static int mt6360_charger_get_ieoc(struct mt6360_chg_info *mci,
+ union power_supply_propval *val)
+{
+ int ret;
+ u32 sel, value;
+
+ ret = regmap_read(mci->regmap, MT6360_PMU_CHG_CTRL9, &sel);
+ if (ret < 0)
+ return ret;
+ sel = (sel & MT6360_IEOC_MASK) >> MT6360_IEOC_SHFT;
+ ret = linear_range_get_value(&mt6360_chg_range[MT6360_RANGE_IEOC], sel, &value);
+ if (!ret)
+ val->intval = value;
+ return ret;
+}
+
+static int mt6360_charger_set_online(struct mt6360_chg_info *mci,
+ const union power_supply_propval *val)
+{
+ u8 force_sleep = val->intval ? 0 : 1;
+
+ return regmap_update_bits(mci->regmap,
+ MT6360_PMU_CHG_CTRL1,
+ MT6360_FSLP_MASK,
+ force_sleep << MT6360_FSLP_SHFT);
+}
+
+static int mt6360_charger_set_ichg(struct mt6360_chg_info *mci,
+ const union power_supply_propval *val)
+{
+ u32 sel;
+
+ linear_range_get_selector_within(&mt6360_chg_range[MT6360_RANGE_ICHG], val->intval, &sel);
+ return regmap_update_bits(mci->regmap,
+ MT6360_PMU_CHG_CTRL7,
+ MT6360_ICHG_MASK,
+ sel << MT6360_ICHG_SHFT);
+}
+
+static int mt6360_charger_set_cv(struct mt6360_chg_info *mci,
+ const union power_supply_propval *val)
+{
+ u32 sel;
+
+ linear_range_get_selector_within(&mt6360_chg_range[MT6360_RANGE_VOREG], val->intval, &sel);
+ return regmap_update_bits(mci->regmap,
+ MT6360_PMU_CHG_CTRL4,
+ MT6360_VOREG_MASK,
+ sel << MT6360_VOREG_SHFT);
+}
+
+static int mt6360_charger_set_aicr(struct mt6360_chg_info *mci,
+ const union power_supply_propval *val)
+{
+ u32 sel;
+
+ linear_range_get_selector_within(&mt6360_chg_range[MT6360_RANGE_AICR], val->intval, &sel);
+ return regmap_update_bits(mci->regmap,
+ MT6360_PMU_CHG_CTRL3,
+ MT6360_IAICR_MASK,
+ sel << MT6360_IAICR_SHFT);
+}
+
+static int mt6360_charger_set_mivr(struct mt6360_chg_info *mci,
+ const union power_supply_propval *val)
+{
+ u32 sel;
+
+ linear_range_get_selector_within(&mt6360_chg_range[MT6360_RANGE_VMIVR], val->intval, &sel);
+ return regmap_update_bits(mci->regmap,
+ MT6360_PMU_CHG_CTRL3,
+ MT6360_VMIVR_MASK,
+ sel << MT6360_VMIVR_SHFT);
+}
+
+static int mt6360_charger_set_iprechg(struct mt6360_chg_info *mci,
+ const union power_supply_propval *val)
+{
+ u32 sel;
+
+ linear_range_get_selector_within(&mt6360_chg_range[MT6360_RANGE_IPREC], val->intval, &sel);
+ return regmap_update_bits(mci->regmap,
+ MT6360_PMU_CHG_CTRL8,
+ MT6360_IPREC_MASK,
+ sel << MT6360_IPREC_SHFT);
+}
+
+static int mt6360_charger_set_ieoc(struct mt6360_chg_info *mci,
+ const union power_supply_propval *val)
+{
+ u32 sel;
+
+ linear_range_get_selector_within(&mt6360_chg_range[MT6360_RANGE_IEOC], val->intval, &sel);
+ return regmap_update_bits(mci->regmap,
+ MT6360_PMU_CHG_CTRL9,
+ MT6360_IEOC_MASK,
+ sel << MT6360_IEOC_SHFT);
+}
+
+static int mt6360_charger_get_property(struct power_supply *psy,
+ enum power_supply_property psp,
+ union power_supply_propval *val)
+{
+ struct mt6360_chg_info *mci = power_supply_get_drvdata(psy);
+ int ret = 0;
+
+ switch (psp) {
+ case POWER_SUPPLY_PROP_ONLINE:
+ ret = mt6360_charger_get_online(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_STATUS:
+ ret = mt6360_charger_get_status(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CHARGE_TYPE:
+ ret = mt6360_charger_get_charge_type(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
+ ret = mt6360_charger_get_ichg(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT_MAX:
+ ret = mt6360_charger_get_max_ichg(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE:
+ ret = mt6360_charger_get_cv(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE_MAX:
+ ret = mt6360_charger_get_max_cv(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT:
+ ret = mt6360_charger_get_aicr(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_INPUT_VOLTAGE_LIMIT:
+ ret = mt6360_charger_get_mivr(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_PRECHARGE_CURRENT:
+ ret = mt6360_charger_get_iprechg(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CHARGE_TERM_CURRENT:
+ ret = mt6360_charger_get_ieoc(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_USB_TYPE:
+ val->intval = mci->psy_usb_type;
+ break;
+ default:
+ ret = -ENODATA;
+ }
+ return ret;
+}
+
+static int mt6360_charger_set_property(struct power_supply *psy,
+ enum power_supply_property psp,
+ const union power_supply_propval *val)
+{
+ struct mt6360_chg_info *mci = power_supply_get_drvdata(psy);
+ int ret;
+
+ switch (psp) {
+ case POWER_SUPPLY_PROP_ONLINE:
+ ret = mt6360_charger_set_online(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
+ ret = mt6360_charger_set_ichg(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE:
+ ret = mt6360_charger_set_cv(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT:
+ ret = mt6360_charger_set_aicr(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_INPUT_VOLTAGE_LIMIT:
+ ret = mt6360_charger_set_mivr(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_PRECHARGE_CURRENT:
+ ret = mt6360_charger_set_iprechg(mci, val);
+ break;
+ case POWER_SUPPLY_PROP_CHARGE_TERM_CURRENT:
+ ret = mt6360_charger_set_ieoc(mci, val);
+ break;
+ default:
+ ret = -EINVAL;
+ }
+ return ret;
+}
+
+static int mt6360_charger_property_is_writeable(struct power_supply *psy,
+ enum power_supply_property psp)
+{
+ switch (psp) {
+ case POWER_SUPPLY_PROP_ONLINE:
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT:
+ case POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE:
+ case POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT:
+ case POWER_SUPPLY_PROP_INPUT_VOLTAGE_LIMIT:
+ case POWER_SUPPLY_PROP_PRECHARGE_CURRENT:
+ case POWER_SUPPLY_PROP_CHARGE_TERM_CURRENT:
+ return 1;
+ default:
+ return 0;
+ }
+}
+
+static enum power_supply_property mt6360_charger_properties[] = {
+ POWER_SUPPLY_PROP_ONLINE,
+ POWER_SUPPLY_PROP_STATUS,
+ POWER_SUPPLY_PROP_CHARGE_TYPE,
+ POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT,
+ POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT_MAX,
+ POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE,
+ POWER_SUPPLY_PROP_CONSTANT_CHARGE_VOLTAGE_MAX,
+ POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT,
+ POWER_SUPPLY_PROP_INPUT_VOLTAGE_LIMIT,
+ POWER_SUPPLY_PROP_PRECHARGE_CURRENT,
+ POWER_SUPPLY_PROP_CHARGE_TERM_CURRENT,
+ POWER_SUPPLY_PROP_USB_TYPE,
+};
+
+static const struct power_supply_desc mt6360_charger_desc = {
+ .type = POWER_SUPPLY_TYPE_USB,
+ .properties = mt6360_charger_properties,
+ .num_properties = ARRAY_SIZE(mt6360_charger_properties),
+ .get_property = mt6360_charger_get_property,
+ .set_property = mt6360_charger_set_property,
+ .property_is_writeable = mt6360_charger_property_is_writeable,
+ .usb_types = mt6360_charger_usb_types,
+ .num_usb_types = ARRAY_SIZE(mt6360_charger_usb_types),
+};
+
+static const struct regulator_ops mt6360_chg_otg_ops = {
+ .list_voltage = regulator_list_voltage_linear,
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+};
+
+static const struct regulator_desc mt6360_otg_rdesc = {
+ .of_match = "usb-otg-vbus",
+ .name = "usb-otg-vbus",
+ .ops = &mt6360_chg_otg_ops,
+ .owner = THIS_MODULE,
+ .type = REGULATOR_VOLTAGE,
+ .min_uV = 4425000,
+ .uV_step = 25000,
+ .n_voltages = 57,
+ .vsel_reg = MT6360_PMU_CHG_CTRL5,
+ .vsel_mask = MT6360_VOBST_MASK,
+ .enable_reg = MT6360_PMU_CHG_CTRL1,
+ .enable_mask = MT6360_OPA_MODE_MASK,
+};
+
+static irqreturn_t mt6360_pmu_attach_i_handler(int irq, void *data)
+{
+ struct mt6360_chg_info *mci = data;
+ int ret;
+ unsigned int usb_status;
+ int last_usb_type;
+
+ mutex_lock(&mci->chgdet_lock);
+ if (!mci->bc12_en) {
+ dev_warn(mci->dev, "Received attach interrupt, bc12 disabled, ignore irq\n");
+ goto out;
+ }
+ last_usb_type = mci->psy_usb_type;
+ /* Plug in */
+ ret = regmap_read(mci->regmap, MT6360_PMU_USB_STATUS1, &usb_status);
+ if (ret < 0)
+ goto out;
+ usb_status &= MT6360_USB_STATUS_MASK;
+ usb_status >>= MT6360_USB_STATUS_SHFT;
+ switch (usb_status) {
+ case MT6360_CHG_TYPE_NOVBUS:
+ dev_dbg(mci->dev, "Received attach interrupt, no vbus\n");
+ goto out;
+ case MT6360_CHG_TYPE_UNDER_GOING:
+ dev_dbg(mci->dev, "Received attach interrupt, under going...\n");
+ goto out;
+ case MT6360_CHG_TYPE_SDP:
+ mci->psy_usb_type = POWER_SUPPLY_USB_TYPE_SDP;
+ break;
+ case MT6360_CHG_TYPE_SDPNSTD:
+ mci->psy_usb_type = POWER_SUPPLY_USB_TYPE_SDP;
+ break;
+ case MT6360_CHG_TYPE_CDP:
+ mci->psy_usb_type = POWER_SUPPLY_USB_TYPE_CDP;
+ break;
+ case MT6360_CHG_TYPE_DCP:
+ mci->psy_usb_type = POWER_SUPPLY_USB_TYPE_DCP;
+ break;
+ case MT6360_CHG_TYPE_DISABLE_BC12:
+ dev_dbg(mci->dev, "Received attach interrupt, bc12 detect not enable\n");
+ goto out;
+ default:
+ mci->psy_usb_type = POWER_SUPPLY_USB_TYPE_UNKNOWN;
+ dev_dbg(mci->dev, "Received attach interrupt, reserved address\n");
+ goto out;
+ }
+
+ dev_dbg(mci->dev, "Received attach interrupt, chg_type = %d\n", mci->psy_usb_type);
+ if (last_usb_type != mci->psy_usb_type)
+ power_supply_changed(mci->psy);
+out:
+ mutex_unlock(&mci->chgdet_lock);
+ return IRQ_HANDLED;
+}
+
+static void mt6360_handle_chrdet_ext_evt(struct mt6360_chg_info *mci)
+{
+ int ret;
+ bool pwr_rdy;
+
+ mutex_lock(&mci->chgdet_lock);
+ ret = mt6360_get_chrdet_ext_stat(mci, &pwr_rdy);
+ if (ret < 0)
+ goto out;
+ if (mci->pwr_rdy == pwr_rdy) {
+ dev_dbg(mci->dev, "Received vbus interrupt, pwr_rdy is same(%d)\n", pwr_rdy);
+ goto out;
+ }
+ mci->pwr_rdy = pwr_rdy;
+ dev_dbg(mci->dev, "Received vbus interrupt, pwr_rdy = %d\n", pwr_rdy);
+ if (!pwr_rdy) {
+ mci->psy_usb_type = POWER_SUPPLY_USB_TYPE_UNKNOWN;
+ power_supply_changed(mci->psy);
+
+ }
+ ret = regmap_update_bits(mci->regmap,
+ MT6360_PMU_DEVICE_TYPE,
+ MT6360_USBCHGEN_MASK,
+ pwr_rdy ? MT6360_USBCHGEN_MASK : 0);
+ if (ret < 0)
+ goto out;
+ mci->bc12_en = pwr_rdy;
+out:
+ mutex_unlock(&mci->chgdet_lock);
+}
+
+static void mt6360_chrdet_work(struct work_struct *work)
+{
+ struct mt6360_chg_info *mci = (struct mt6360_chg_info *)container_of(
+ work, struct mt6360_chg_info, chrdet_work);
+
+ mt6360_handle_chrdet_ext_evt(mci);
+}
+
+static irqreturn_t mt6360_pmu_chrdet_ext_evt_handler(int irq, void *data)
+{
+ struct mt6360_chg_info *mci = data;
+
+ mt6360_handle_chrdet_ext_evt(mci);
+ return IRQ_HANDLED;
+}
+
+static int mt6360_chg_irq_register(struct platform_device *pdev)
+{
+ const struct {
+ const char *name;
+ irq_handler_t handler;
+ } irq_descs[] = {
+ { "attach_i", mt6360_pmu_attach_i_handler },
+ { "chrdet_ext_evt", mt6360_pmu_chrdet_ext_evt_handler }
+ };
+ int i, ret;
+
+ for (i = 0; i < ARRAY_SIZE(irq_descs); i++) {
+ ret = platform_get_irq_byname(pdev, irq_descs[i].name);
+ if (ret < 0)
+ return ret;
+
+ ret = devm_request_threaded_irq(&pdev->dev, ret, NULL,
+ irq_descs[i].handler,
+ IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
+ irq_descs[i].name,
+ platform_get_drvdata(pdev));
+ if (ret < 0)
+ return dev_err_probe(&pdev->dev, ret, "Failed to request %s irq\n",
+ irq_descs[i].name);
+ }
+
+ return 0;
+}
+
+static u32 mt6360_vinovp_trans_to_sel(u32 val)
+{
+ u32 vinovp_tbl[] = { 5500000, 6500000, 11000000, 14500000 };
+ int i;
+
+ /* Select the smaller and equal supported value */
+ for (i = 0; i < ARRAY_SIZE(vinovp_tbl)-1; i++) {
+ if (val < vinovp_tbl[i+1])
+ break;
+ }
+ return i;
+}
+
+static int mt6360_chg_init_setting(struct mt6360_chg_info *mci)
+{
+ int ret;
+ u32 sel;
+
+ sel = mt6360_vinovp_trans_to_sel(mci->vinovp);
+ ret = regmap_update_bits(mci->regmap, MT6360_PMU_CHG_CTRL19,
+ MT6360_VINOVP_MASK, sel << MT6360_VINOVP_SHFT);
+ if (ret)
+ return dev_err_probe(mci->dev, ret, "%s: Failed to apply vinovp\n", __func__);
+ ret = regmap_update_bits(mci->regmap, MT6360_PMU_DEVICE_TYPE,
+ MT6360_USBCHGEN_MASK, 0);
+ if (ret)
+ return dev_err_probe(mci->dev, ret, "%s: Failed to disable bc12\n", __func__);
+ ret = regmap_update_bits(mci->regmap, MT6360_PMU_CHG_CTRL2,
+ MT6360_IINLMTSEL_MASK,
+ MT6360_IINLMTSEL_AICR <<
+ MT6360_IINLMTSEL_SHFT);
+ if (ret)
+ return dev_err_probe(mci->dev, ret,
+ "%s: Failed to switch iinlmtsel to aicr\n", __func__);
+ usleep_range(5000, 6000);
+ ret = regmap_update_bits(mci->regmap, MT6360_PMU_CHG_CTRL3,
+ MT6360_ILIM_EN_MASK, 0);
+ if (ret)
+ return dev_err_probe(mci->dev, ret,
+ "%s: Failed to disable ilim\n", __func__);
+ ret = regmap_update_bits(mci->regmap, MT6360_PMU_CHG_CTRL10,
+ MT6360_OTG_OC_MASK, MT6360_OTG_OC_MASK);
+ if (ret)
+ return dev_err_probe(mci->dev, ret,
+ "%s: Failed to config otg oc to 3A\n", __func__);
+ return 0;
+}
+
+static int mt6360_charger_probe(struct platform_device *pdev)
+{
+ struct mt6360_chg_info *mci;
+ struct power_supply_config charger_cfg = {};
+ struct regulator_config config = { };
+ int ret;
+
+ mci = devm_kzalloc(&pdev->dev, sizeof(*mci), GFP_KERNEL);
+ if (!mci)
+ return -ENOMEM;
+
+ mci->dev = &pdev->dev;
+ mci->vinovp = 6500000;
+ mutex_init(&mci->chgdet_lock);
+ platform_set_drvdata(pdev, mci);
+ devm_work_autocancel(&pdev->dev, &mci->chrdet_work, mt6360_chrdet_work);
+
+ ret = device_property_read_u32(&pdev->dev, "richtek,vinovp-microvolt", &mci->vinovp);
+ if (ret)
+ dev_warn(&pdev->dev, "Failed to parse vinovp in DT, keep default 6.5v\n");
+
+ mci->regmap = dev_get_regmap(pdev->dev.parent, NULL);
+ if (!mci->regmap)
+ return dev_err_probe(&pdev->dev, -ENODEV, "Failed to get parent regmap\n");
+
+ ret = mt6360_chg_init_setting(mci);
+ if (ret)
+ return dev_err_probe(&pdev->dev, ret, "Failed to initial setting\n");
+
+ memcpy(&mci->psy_desc, &mt6360_charger_desc, sizeof(mci->psy_desc));
+ mci->psy_desc.name = dev_name(&pdev->dev);
+ charger_cfg.drv_data = mci;
+ charger_cfg.of_node = pdev->dev.of_node;
+ mci->psy = devm_power_supply_register(&pdev->dev,
+ &mci->psy_desc, &charger_cfg);
+ if (IS_ERR(mci->psy))
+ return dev_err_probe(&pdev->dev, PTR_ERR(mci->psy),
+ "Failed to register power supply dev\n");
+
+
+ ret = mt6360_chg_irq_register(pdev);
+ if (ret)
+ return dev_err_probe(&pdev->dev, ret, "Failed to register irqs\n");
+
+ config.dev = &pdev->dev;
+ config.regmap = mci->regmap;
+ mci->otg_rdev = devm_regulator_register(&pdev->dev, &mt6360_otg_rdesc,
+ &config);
+ if (IS_ERR(mci->otg_rdev))
+ return PTR_ERR(mci->otg_rdev);
+
+ schedule_work(&mci->chrdet_work);
+
+ return 0;
+}
+
+static const struct of_device_id __maybe_unused mt6360_charger_of_id[] = {
+ { .compatible = "mediatek,mt6360-chg", },
+ {},
+};
+MODULE_DEVICE_TABLE(of, mt6360_charger_of_id);
+
+static const struct platform_device_id mt6360_charger_id[] = {
+ { "mt6360-chg", 0 },
+ {},
+};
+MODULE_DEVICE_TABLE(platform, mt6360_charger_id);
+
+static struct platform_driver mt6360_charger_driver = {
+ .driver = {
+ .name = "mt6360-chg",
+ .of_match_table = of_match_ptr(mt6360_charger_of_id),
+ },
+ .probe = mt6360_charger_probe,
+ .id_table = mt6360_charger_id,
+};
+module_platform_driver(mt6360_charger_driver);
+
+MODULE_AUTHOR("Gene Chen <gene_chen@richtek.com>");
+MODULE_DESCRIPTION("MT6360 Charger Driver");
+MODULE_LICENSE("GPL");
int err, len, index;
const __be32 *list;
+ info->technology = POWER_SUPPLY_TECHNOLOGY_UNKNOWN;
info->energy_full_design_uwh = -EINVAL;
info->charge_full_design_uah = -EINVAL;
info->voltage_min_design_uv = -EINVAL;
* Documentation/power/power_supply_class.rst.
*/
+ if (!of_property_read_string(battery_np, "device-chemistry", &value)) {
+ if (!strcmp("nickel-cadmium", value))
+ info->technology = POWER_SUPPLY_TECHNOLOGY_NiCd;
+ else if (!strcmp("nickel-metal-hydride", value))
+ info->technology = POWER_SUPPLY_TECHNOLOGY_NiMH;
+ else if (!strcmp("lithium-ion", value))
+ /* Imprecise lithium-ion type */
+ info->technology = POWER_SUPPLY_TECHNOLOGY_LION;
+ else if (!strcmp("lithium-ion-polymer", value))
+ info->technology = POWER_SUPPLY_TECHNOLOGY_LIPO;
+ else if (!strcmp("lithium-ion-iron-phosphate", value))
+ info->technology = POWER_SUPPLY_TECHNOLOGY_LiFe;
+ else if (!strcmp("lithium-ion-manganese-oxide", value))
+ info->technology = POWER_SUPPLY_TECHNOLOGY_LiMn;
+ else
+ dev_warn(&psy->dev, "%s unknown battery type\n", value);
+ }
+
of_property_read_u32(battery_np, "energy-full-design-microwatt-hours",
&info->energy_full_design_uwh);
of_property_read_u32(battery_np, "charge-full-design-microamp-hours",
int irq;
irq = platform_get_irq_byname(pdev, smbb_charger_irqs[i].name);
- if (irq < 0) {
- dev_err(&pdev->dev, "failed to get irq '%s'\n",
- smbb_charger_irqs[i].name);
+ if (irq < 0)
return irq;
- }
smbb_charger_irqs[i].handler(irq, chg);
#include <linux/device.h>
#include <linux/bitops.h>
#include <linux/errno.h>
+#include <linux/iio/consumer.h>
#include <linux/init.h>
#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/mfd/rn5t618.h>
+#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/power_supply.h>
#include <linux/regmap.h>
struct power_supply *battery;
struct power_supply *usb;
struct power_supply *adp;
+ struct iio_channel *channel_vusb;
+ struct iio_channel *channel_vadp;
int irq;
};
static enum power_supply_property rn5t618_usb_props[] = {
/* input current limit is not very accurate */
POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT,
+ POWER_SUPPLY_PROP_VOLTAGE_NOW,
POWER_SUPPLY_PROP_STATUS,
POWER_SUPPLY_PROP_USB_TYPE,
POWER_SUPPLY_PROP_ONLINE,
static enum power_supply_property rn5t618_adp_props[] = {
/* input current limit is not very accurate */
POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT,
+ POWER_SUPPLY_PROP_VOLTAGE_NOW,
POWER_SUPPLY_PROP_STATUS,
POWER_SUPPLY_PROP_ONLINE,
};
return ret;
val->intval = FROM_CUR_REG(regval);
+ break;
+ case POWER_SUPPLY_PROP_VOLTAGE_NOW:
+ if (!info->channel_vadp)
+ return -ENODATA;
+
+ ret = iio_read_channel_processed_scale(info->channel_vadp, &val->intval, 1000);
+ if (ret < 0)
+ return ret;
+
break;
default:
return -EINVAL;
val->intval = FROM_CUR_REG(regval);
}
+ break;
+ case POWER_SUPPLY_PROP_VOLTAGE_NOW:
+ if (!info->channel_vusb)
+ return -ENODATA;
+
+ ret = iio_read_channel_processed_scale(info->channel_vusb, &val->intval, 1000);
+ if (ret < 0)
+ return ret;
+
break;
default:
return -EINVAL;
platform_set_drvdata(pdev, info);
+ info->channel_vusb = devm_iio_channel_get(&pdev->dev, "vusb");
+ if (IS_ERR(info->channel_vusb)) {
+ if (PTR_ERR(info->channel_vusb) == -ENODEV)
+ return -EPROBE_DEFER;
+ return PTR_ERR(info->channel_vusb);
+ }
+
+ info->channel_vadp = devm_iio_channel_get(&pdev->dev, "vadp");
+ if (IS_ERR(info->channel_vadp)) {
+ if (PTR_ERR(info->channel_vadp) == -ENODEV)
+ return -EPROBE_DEFER;
+ return PTR_ERR(info->channel_vadp);
+ }
+
ret = regmap_read(info->rn5t618->regmap, RN5T618_CONTROL, &v);
if (ret)
return ret;
REG_CURRENT_AVG,
REG_MAX_ERR,
REG_CAPACITY,
- REG_TIME_TO_EMPTY,
- REG_TIME_TO_FULL,
+ REG_TIME_TO_EMPTY_NOW,
+ REG_TIME_TO_EMPTY_AVG,
+ REG_TIME_TO_FULL_AVG,
REG_STATUS,
REG_CAPACITY_LEVEL,
REG_CYCLE_COUNT,
[REG_TEMPERATURE] =
SBS_DATA(POWER_SUPPLY_PROP_TEMP, 0x08, 0, 65535),
[REG_VOLTAGE] =
- SBS_DATA(POWER_SUPPLY_PROP_VOLTAGE_NOW, 0x09, 0, 20000),
+ SBS_DATA(POWER_SUPPLY_PROP_VOLTAGE_NOW, 0x09, 0, 65535),
[REG_CURRENT_NOW] =
SBS_DATA(POWER_SUPPLY_PROP_CURRENT_NOW, 0x0A, -32768, 32767),
[REG_CURRENT_AVG] =
SBS_DATA(POWER_SUPPLY_PROP_ENERGY_FULL, 0x10, 0, 65535),
[REG_FULL_CHARGE_CAPACITY_CHARGE] =
SBS_DATA(POWER_SUPPLY_PROP_CHARGE_FULL, 0x10, 0, 65535),
- [REG_TIME_TO_EMPTY] =
+ [REG_TIME_TO_EMPTY_NOW] =
+ SBS_DATA(POWER_SUPPLY_PROP_TIME_TO_EMPTY_NOW, 0x11, 0, 65535),
+ [REG_TIME_TO_EMPTY_AVG] =
SBS_DATA(POWER_SUPPLY_PROP_TIME_TO_EMPTY_AVG, 0x12, 0, 65535),
- [REG_TIME_TO_FULL] =
+ [REG_TIME_TO_FULL_AVG] =
SBS_DATA(POWER_SUPPLY_PROP_TIME_TO_FULL_AVG, 0x13, 0, 65535),
[REG_CHARGE_CURRENT] =
SBS_DATA(POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT_MAX, 0x14, 0, 65535),
POWER_SUPPLY_PROP_CAPACITY,
POWER_SUPPLY_PROP_CAPACITY_ERROR_MARGIN,
POWER_SUPPLY_PROP_TEMP,
+ POWER_SUPPLY_PROP_TIME_TO_EMPTY_NOW,
POWER_SUPPLY_PROP_TIME_TO_EMPTY_AVG,
POWER_SUPPLY_PROP_TIME_TO_FULL_AVG,
POWER_SUPPLY_PROP_SERIAL_NUMBER,
val->intval -= TEMP_KELVIN_TO_CELSIUS;
break;
+ case POWER_SUPPLY_PROP_TIME_TO_EMPTY_NOW:
case POWER_SUPPLY_PROP_TIME_TO_EMPTY_AVG:
case POWER_SUPPLY_PROP_TIME_TO_FULL_AVG:
/* sbs provides time to empty and time to full in minutes.
case POWER_SUPPLY_PROP_CURRENT_NOW:
case POWER_SUPPLY_PROP_CURRENT_AVG:
case POWER_SUPPLY_PROP_TEMP:
+ case POWER_SUPPLY_PROP_TIME_TO_EMPTY_NOW:
case POWER_SUPPLY_PROP_TIME_TO_EMPTY_AVG:
case POWER_SUPPLY_PROP_TIME_TO_FULL_AVG:
case POWER_SUPPLY_PROP_VOLTAGE_MIN_DESIGN:
}
irq = platform_get_irq(pdev, 0);
- if (irq < 0) {
- dev_err(dev, "no irq resource specified\n");
+ if (irq < 0)
return irq;
- }
ret = devm_request_threaded_irq(data->dev, irq, NULL,
sc27xx_fgu_interrupt,
#include <linux/power_supply.h>
#include <linux/property.h>
#include <linux/regmap.h>
+#include <linux/regulator/driver.h>
#include <dt-bindings/power/summit,smb347-charger.h>
#define CFG_PIN_EN_CTRL_ACTIVE_LOW 0x60
#define CFG_PIN_EN_APSD_IRQ BIT(1)
#define CFG_PIN_EN_CHARGER_ERROR BIT(2)
+#define CFG_PIN_EN_CTRL BIT(4)
#define CFG_THERM 0x07
#define CFG_THERM_SOFT_HOT_COMPENSATION_MASK 0x03
#define CFG_THERM_SOFT_HOT_COMPENSATION_SHIFT 0
#define CFG_THERM_SOFT_COLD_COMPENSATION_SHIFT 2
#define CFG_THERM_MONITOR_DISABLED BIT(4)
#define CFG_SYSOK 0x08
+#define CFG_SYSOK_INOK_ACTIVE_HIGH BIT(0)
#define CFG_SYSOK_SUSPEND_HARD_LIMIT_DISABLED BIT(2)
#define CFG_OTHER 0x09
#define CFG_OTHER_RID_MASK 0xc0
#define CFG_OTHER_RID_ENABLED_AUTO_OTG 0xc0
#define CFG_OTG 0x0a
#define CFG_OTG_TEMP_THRESHOLD_MASK 0x30
+#define CFG_OTG_CURRENT_LIMIT_250mA BIT(2)
+#define CFG_OTG_CURRENT_LIMIT_750mA BIT(3)
#define CFG_OTG_TEMP_THRESHOLD_SHIFT 4
#define CFG_OTG_CC_COMPENSATION_MASK 0xc0
#define CFG_OTG_CC_COMPENSATION_SHIFT 6
#define CMD_A 0x30
#define CMD_A_CHG_ENABLED BIT(1)
#define CMD_A_SUSPEND_ENABLED BIT(2)
+#define CMD_A_OTG_ENABLED BIT(4)
#define CMD_A_ALLOW_WRITE BIT(7)
#define CMD_B 0x31
#define CMD_C 0x33
* @regmap: pointer to driver regmap
* @mains: power_supply instance for AC/DC power
* @usb: power_supply instance for USB power
+ * @usb_rdev: USB VBUS regulator device
* @id: SMB charger ID
* @mains_online: is AC/DC input connected
* @usb_online: is USB input connected
- * @charging_enabled: is charging enabled
* @irq_unsupported: is interrupt unsupported by SMB hardware
+ * @usb_vbus_enabled: is USB VBUS powered by SMB charger
* @max_charge_current: maximum current (in uA) the battery can be charged
* @max_charge_voltage: maximum voltage (in uV) the battery can be charged
* @pre_charge_current: current (in uA) to use in pre-charging phase
* @use_usb_otg: USB OTG output can be used (not implemented yet)
* @enable_control: how charging enable/disable is controlled
* (driver/pin controls)
+ * @inok_polarity: polarity of INOK signal which denotes presence of external
+ * power supply
*
* @use_main, @use_usb, and @use_usb_otg are means to enable/disable
* hardware support for these. This is useful when we want to have for
struct regmap *regmap;
struct power_supply *mains;
struct power_supply *usb;
+ struct regulator_dev *usb_rdev;
unsigned int id;
bool mains_online;
bool usb_online;
- bool charging_enabled;
bool irq_unsupported;
+ bool usb_vbus_enabled;
unsigned int max_charge_current;
unsigned int max_charge_voltage;
bool use_usb;
bool use_usb_otg;
unsigned int enable_control;
+ unsigned int inok_polarity;
};
enum smb_charger_chipid {
static int smb347_charging_set(struct smb347_charger *smb, bool enable)
{
- int ret = 0;
-
if (smb->enable_control != SMB3XX_CHG_ENABLE_SW) {
dev_dbg(smb->dev, "charging enable/disable in SW disabled\n");
return 0;
}
- if (smb->charging_enabled != enable) {
- ret = regmap_update_bits(smb->regmap, CMD_A, CMD_A_CHG_ENABLED,
- enable ? CMD_A_CHG_ENABLED : 0);
- if (!ret)
- smb->charging_enabled = enable;
+ if (enable && smb->usb_vbus_enabled) {
+ dev_dbg(smb->dev, "charging not enabled because USB is in host mode\n");
+ return 0;
}
- return ret;
+ return regmap_update_bits(smb->regmap, CMD_A, CMD_A_CHG_ENABLED,
+ enable ? CMD_A_CHG_ENABLED : 0);
}
static inline int smb347_charging_enable(struct smb347_charger *smb)
*
* Returns %0 on success and negative errno in case of failure.
*/
-static int smb347_set_writable(struct smb347_charger *smb, bool writable)
+static int smb347_set_writable(struct smb347_charger *smb, bool writable,
+ bool irq_toggle)
{
- return regmap_update_bits(smb->regmap, CMD_A, CMD_A_ALLOW_WRITE,
- writable ? CMD_A_ALLOW_WRITE : 0);
+ struct i2c_client *client = to_i2c_client(smb->dev);
+ int ret;
+
+ if (writable && irq_toggle && !smb->irq_unsupported)
+ disable_irq(client->irq);
+
+ ret = regmap_update_bits(smb->regmap, CMD_A, CMD_A_ALLOW_WRITE,
+ writable ? CMD_A_ALLOW_WRITE : 0);
+
+ if ((!writable || ret) && irq_toggle && !smb->irq_unsupported)
+ enable_irq(client->irq);
+
+ return ret;
}
static int smb347_hw_init(struct smb347_charger *smb)
unsigned int val;
int ret;
- ret = smb347_set_writable(smb, true);
+ ret = smb347_set_writable(smb, true, false);
if (ret < 0)
return ret;
if (ret < 0)
goto fail;
+ /* Activate pin control, making it writable. */
+ switch (smb->enable_control) {
+ case SMB3XX_CHG_ENABLE_PIN_ACTIVE_LOW:
+ case SMB3XX_CHG_ENABLE_PIN_ACTIVE_HIGH:
+ ret = regmap_set_bits(smb->regmap, CFG_PIN, CFG_PIN_EN_CTRL);
+ if (ret < 0)
+ goto fail;
+ }
+
/*
* Make the charging functionality controllable by a write to the
* command register unless pin control is specified in the platform
ret = smb347_start_stop_charging(smb);
fail:
- smb347_set_writable(smb, false);
+ smb347_set_writable(smb, false, false);
return ret;
}
if (smb->irq_unsupported)
return 0;
- ret = smb347_set_writable(smb, true);
+ ret = smb347_set_writable(smb, true, true);
if (ret < 0)
return ret;
ret = regmap_update_bits(smb->regmap, CFG_PIN, CFG_PIN_EN_CHARGER_ERROR,
enable ? CFG_PIN_EN_CHARGER_ERROR : 0);
fail:
- smb347_set_writable(smb, false);
+ smb347_set_writable(smb, false, true);
return ret;
}
if (!client->irq)
return 0;
- ret = smb347_set_writable(smb, true);
+ ret = smb347_set_writable(smb, true, false);
if (ret < 0)
return ret;
CFG_STAT_ACTIVE_HIGH | CFG_STAT_DISABLED,
CFG_STAT_DISABLED);
- smb347_set_writable(smb, false);
+ smb347_set_writable(smb, false, false);
if (ret < 0) {
dev_warn(smb->dev, "failed to initialize IRQ: %d\n", ret);
/* Select charging control */
device_property_read_u32(dev, "summit,enable-charge-control",
&smb->enable_control);
+
+ /*
+ * Polarity of INOK signal indicating presence of external power
+ * supply connected to the charger.
+ */
+ device_property_read_u32(dev, "summit,inok-polarity",
+ &smb->inok_polarity);
}
static int smb347_get_battery_info(struct smb347_charger *smb)
return 0;
}
+static int smb347_usb_vbus_get_current_limit(struct regulator_dev *rdev)
+{
+ struct smb347_charger *smb = rdev_get_drvdata(rdev);
+ unsigned int val;
+ int ret;
+
+ ret = regmap_read(smb->regmap, CFG_OTG, &val);
+ if (ret < 0)
+ return ret;
+
+ /*
+ * It's unknown what happens if this bit is unset due to lack of
+ * access to the datasheet, assume it's limit-enable.
+ */
+ if (!(val & CFG_OTG_CURRENT_LIMIT_250mA))
+ return 0;
+
+ return val & CFG_OTG_CURRENT_LIMIT_750mA ? 750000 : 250000;
+}
+
+static int smb347_usb_vbus_set_new_current_limit(struct smb347_charger *smb,
+ int max_uA)
+{
+ const unsigned int mask = CFG_OTG_CURRENT_LIMIT_750mA |
+ CFG_OTG_CURRENT_LIMIT_250mA;
+ unsigned int val = CFG_OTG_CURRENT_LIMIT_250mA;
+ int ret;
+
+ if (max_uA >= 750000)
+ val |= CFG_OTG_CURRENT_LIMIT_750mA;
+
+ ret = regmap_update_bits(smb->regmap, CFG_OTG, mask, val);
+ if (ret < 0)
+ dev_err(smb->dev, "failed to change USB current limit\n");
+
+ return ret;
+}
+
+static int smb347_usb_vbus_set_current_limit(struct regulator_dev *rdev,
+ int min_uA, int max_uA)
+{
+ struct smb347_charger *smb = rdev_get_drvdata(rdev);
+ int ret;
+
+ ret = smb347_set_writable(smb, true, true);
+ if (ret < 0)
+ return ret;
+
+ ret = smb347_usb_vbus_set_new_current_limit(smb, max_uA);
+ smb347_set_writable(smb, false, true);
+
+ return ret;
+}
+
+static int smb347_usb_vbus_regulator_enable(struct regulator_dev *rdev)
+{
+ struct smb347_charger *smb = rdev_get_drvdata(rdev);
+ int ret, max_uA;
+
+ ret = smb347_set_writable(smb, true, true);
+ if (ret < 0)
+ return ret;
+
+ smb347_charging_disable(smb);
+
+ if (device_property_read_bool(&rdev->dev, "summit,needs-inok-toggle")) {
+ unsigned int sysok = 0;
+
+ if (smb->inok_polarity == SMB3XX_SYSOK_INOK_ACTIVE_LOW)
+ sysok = CFG_SYSOK_INOK_ACTIVE_HIGH;
+
+ /*
+ * VBUS won't be powered if INOK is active, so we need to
+ * manually disable INOK on some platforms.
+ */
+ ret = regmap_update_bits(smb->regmap, CFG_SYSOK,
+ CFG_SYSOK_INOK_ACTIVE_HIGH, sysok);
+ if (ret < 0) {
+ dev_err(smb->dev, "failed to disable INOK\n");
+ goto done;
+ }
+ }
+
+ ret = smb347_usb_vbus_get_current_limit(rdev);
+ if (ret < 0) {
+ dev_err(smb->dev, "failed to get USB VBUS current limit\n");
+ goto done;
+ }
+
+ max_uA = ret;
+
+ ret = smb347_usb_vbus_set_new_current_limit(smb, 250000);
+ if (ret < 0) {
+ dev_err(smb->dev, "failed to preset USB VBUS current limit\n");
+ goto done;
+ }
+
+ ret = regmap_set_bits(smb->regmap, CMD_A, CMD_A_OTG_ENABLED);
+ if (ret < 0) {
+ dev_err(smb->dev, "failed to enable USB VBUS\n");
+ goto done;
+ }
+
+ smb->usb_vbus_enabled = true;
+
+ ret = smb347_usb_vbus_set_new_current_limit(smb, max_uA);
+ if (ret < 0) {
+ dev_err(smb->dev, "failed to restore USB VBUS current limit\n");
+ goto done;
+ }
+done:
+ smb347_set_writable(smb, false, true);
+
+ return ret;
+}
+
+static int smb347_usb_vbus_regulator_disable(struct regulator_dev *rdev)
+{
+ struct smb347_charger *smb = rdev_get_drvdata(rdev);
+ int ret;
+
+ ret = smb347_set_writable(smb, true, true);
+ if (ret < 0)
+ return ret;
+
+ ret = regmap_clear_bits(smb->regmap, CMD_A, CMD_A_OTG_ENABLED);
+ if (ret < 0) {
+ dev_err(smb->dev, "failed to disable USB VBUS\n");
+ goto done;
+ }
+
+ smb->usb_vbus_enabled = false;
+
+ if (device_property_read_bool(&rdev->dev, "summit,needs-inok-toggle")) {
+ unsigned int sysok = 0;
+
+ if (smb->inok_polarity == SMB3XX_SYSOK_INOK_ACTIVE_HIGH)
+ sysok = CFG_SYSOK_INOK_ACTIVE_HIGH;
+
+ ret = regmap_update_bits(smb->regmap, CFG_SYSOK,
+ CFG_SYSOK_INOK_ACTIVE_HIGH, sysok);
+ if (ret < 0) {
+ dev_err(smb->dev, "failed to enable INOK\n");
+ goto done;
+ }
+ }
+
+ smb347_start_stop_charging(smb);
+done:
+ smb347_set_writable(smb, false, true);
+
+ return ret;
+}
+
static const struct regmap_config smb347_regmap = {
.reg_bits = 8,
.val_bits = 8,
.max_register = SMB347_MAX_REGISTER,
.volatile_reg = smb347_volatile_reg,
.readable_reg = smb347_readable_reg,
+ .cache_type = REGCACHE_FLAT,
+ .num_reg_defaults_raw = SMB347_MAX_REGISTER,
+};
+
+static const struct regulator_ops smb347_usb_vbus_regulator_ops = {
+ .is_enabled = regulator_is_enabled_regmap,
+ .enable = smb347_usb_vbus_regulator_enable,
+ .disable = smb347_usb_vbus_regulator_disable,
+ .get_current_limit = smb347_usb_vbus_get_current_limit,
+ .set_current_limit = smb347_usb_vbus_set_current_limit,
};
static const struct power_supply_desc smb347_mains_desc = {
.num_properties = ARRAY_SIZE(smb347_properties),
};
+static const struct regulator_desc smb347_usb_vbus_regulator_desc = {
+ .name = "smb347-usb-vbus",
+ .of_match = of_match_ptr("usb-vbus"),
+ .ops = &smb347_usb_vbus_regulator_ops,
+ .type = REGULATOR_VOLTAGE,
+ .owner = THIS_MODULE,
+ .enable_reg = CMD_A,
+ .enable_mask = CMD_A_OTG_ENABLED,
+ .enable_val = CMD_A_OTG_ENABLED,
+ .fixed_uV = 5000000,
+ .n_voltages = 1,
+};
+
static int smb347_probe(struct i2c_client *client,
const struct i2c_device_id *id)
{
struct power_supply_config mains_usb_cfg = {};
+ struct regulator_config usb_rdev_cfg = {};
struct device *dev = &client->dev;
struct smb347_charger *smb;
int ret;
if (ret)
return ret;
+ usb_rdev_cfg.dev = dev;
+ usb_rdev_cfg.driver_data = smb;
+ usb_rdev_cfg.regmap = smb->regmap;
+
+ smb->usb_rdev = devm_regulator_register(dev,
+ &smb347_usb_vbus_regulator_desc,
+ &usb_rdev_cfg);
+ if (IS_ERR(smb->usb_rdev)) {
+ smb347_irq_disable(smb);
+ return PTR_ERR(smb->usb_rdev);
+ }
+
return 0;
}
{
struct smb347_charger *smb = i2c_get_clientdata(client);
+ smb347_usb_vbus_regulator_disable(smb->usb_rdev);
smb347_irq_disable(smb);
return 0;
}
+static void smb347_shutdown(struct i2c_client *client)
+{
+ smb347_remove(client);
+}
+
static const struct i2c_device_id smb347_id[] = {
{ "smb345", SMB345 },
{ "smb347", SMB347 },
},
.probe = smb347_probe,
.remove = smb347_remove,
+ .shutdown = smb347_shutdown,
.id_table = smb347_id,
};
module_i2c_driver(smb347_driver);
help
This adds support for voltage regulator in Richtek RT6160.
This device automatically change voltage output mode from
- Buck or Boost. The mode transistion depend on the input source voltage.
+ Buck or Boost. The mode transition depend on the input source voltage.
The wide output range is from 2025mV to 5200mV and can be used on most
common application scenario.
depends on I2C
select REGMAP_I2C
help
- This adds supprot for Richtek RT6245 voltage regulator.
+ This adds support for Richtek RT6245 voltage regulator.
It can support up to 14A output current and adjustable output voltage
from 0.4375V to 1.3875V, per step 12.5mV.
+config REGULATOR_RTQ2134
+ tristate "Richtek RTQ2134 SubPMIC Regulator"
+ depends on I2C
+ select REGMAP_I2C
+ help
+ This driver adds support for RTQ2134 SubPMIC regulators.
+ The RTQ2134 is a multi-phase, programmable power management IC that
+ integrate with four high efficient, synchronous step-down converter
+ cores. It features wide output voltage range and the capability to
+ configure the corresponding power stages.
+
config REGULATOR_RTMV20
tristate "Richtek RTMV20 Laser Diode Regulator"
depends on I2C
the Richtek RTMV20. It can support the load current up to 6A and
integrate strobe/vsync/fsin signal to synchronize the IR camera.
+config REGULATOR_RTQ6752
+ tristate "Richtek RTQ6752 TFT LCD voltage regulator"
+ depends on I2C
+ select REGMAP_I2C
+ help
+ This driver adds support for Richtek RTQ6752. RTQ6752 includes two
+ synchronous boost converters for PAVDD, and one synchronous NAVDD
+ buck-boost. This device is suitable for automotive TFT-LCD panel.
+
config REGULATOR_S2MPA01
tristate "Samsung S2MPA01 voltage regulator"
depends on MFD_SEC_CORE || COMPILE_TEST
obj-$(CONFIG_REGULATOR_RT6160) += rt6160-regulator.o
obj-$(CONFIG_REGULATOR_RT6245) += rt6245-regulator.o
obj-$(CONFIG_REGULATOR_RTMV20) += rtmv20-regulator.o
+obj-$(CONFIG_REGULATOR_RTQ2134) += rtq2134-regulator.o
+obj-$(CONFIG_REGULATOR_RTQ6752) += rtq6752-regulator.o
obj-$(CONFIG_REGULATOR_S2MPA01) += s2mpa01.o
obj-$(CONFIG_REGULATOR_S2MPS11) += s2mps11.o
obj-$(CONFIG_REGULATOR_S5M8767) += s5m8767.o
#define BD718XX_HWOPNAME(swopname) swopname##_hwcontrol
#define BD718XX_OPS(name, _list_voltage, _map_voltage, _set_voltage_sel, \
- _get_voltage_sel, _set_voltage_time_sel, _set_ramp_delay) \
+ _get_voltage_sel, _set_voltage_time_sel, _set_ramp_delay, \
+ _set_uvp, _set_ovp) \
static const struct regulator_ops name = { \
.enable = regulator_enable_regmap, \
.disable = regulator_disable_regmap, \
.get_voltage_sel = (_get_voltage_sel), \
.set_voltage_time_sel = (_set_voltage_time_sel), \
.set_ramp_delay = (_set_ramp_delay), \
+ .set_under_voltage_protection = (_set_uvp), \
+ .set_over_voltage_protection = (_set_ovp), \
}; \
\
static const struct regulator_ops BD718XX_HWOPNAME(name) = { \
.get_voltage_sel = (_get_voltage_sel), \
.set_voltage_time_sel = (_set_voltage_time_sel), \
.set_ramp_delay = (_set_ramp_delay), \
+ .set_under_voltage_protection = (_set_uvp), \
+ .set_over_voltage_protection = (_set_ovp), \
} \
/*
* exceed it due to the scheduling.
*/
msleep(1);
- /*
- * Note for next hacker. The PWRGOOD should not be masked on
- * BD71847 so we will just unconditionally enable detection
- * when voltage is set.
- * If someone want's to disable PWRGOOD he must implement
- * caching and restoring the old value here. I am not
- * aware of such use-cases so for the sake of the simplicity
- * we just always enable PWRGOOD here.
- */
- ret = regmap_update_bits(rdev->regmap, BD718XX_REG_MVRFLTMASK2,
- *mask, 0);
+
+ ret = regmap_clear_bits(rdev->regmap, BD718XX_REG_MVRFLTMASK2,
+ *mask);
if (ret)
dev_err(&rdev->dev,
"Failed to re-enable voltage monitoring (%d)\n",
* time configurable.
*/
if (new > now) {
+ int tmp;
+ int prot_bit;
int ldo_offset = rdev->desc->id - BD718XX_LDO1;
- *mask = BD718XX_LDO1_VRMON80 << ldo_offset;
- ret = regmap_update_bits(rdev->regmap,
- BD718XX_REG_MVRFLTMASK2,
- *mask, *mask);
+ prot_bit = BD718XX_LDO1_VRMON80 << ldo_offset;
+ ret = regmap_read(rdev->regmap, BD718XX_REG_MVRFLTMASK2,
+ &tmp);
+ if (ret) {
+ dev_err(&rdev->dev,
+ "Failed to read voltage monitoring state\n");
+ return ret;
+ }
+
+ if (!(tmp & prot_bit)) {
+ /* We disable protection if it was enabled... */
+ ret = regmap_set_bits(rdev->regmap,
+ BD718XX_REG_MVRFLTMASK2,
+ prot_bit);
+ /* ...and we also want to re-enable it */
+ *mask = prot_bit;
+ }
if (ret) {
dev_err(&rdev->dev,
"Failed to stop voltage monitoring\n");
return regulator_set_voltage_sel_pickable_regmap(rdev, sel);
}
-/*
- * OPS common for BD71847 and BD71850
- */
-BD718XX_OPS(bd718xx_pickable_range_ldo_ops,
- regulator_list_voltage_pickable_linear_range, NULL,
- bd718xx_set_voltage_sel_pickable_restricted,
- regulator_get_voltage_sel_pickable_regmap, NULL, NULL);
-
-/* BD71847 and BD71850 LDO 5 is by default OFF at RUN state */
-static const struct regulator_ops bd718xx_ldo5_ops_hwstate = {
- .is_enabled = never_enabled_by_hwstate,
- .list_voltage = regulator_list_voltage_pickable_linear_range,
- .set_voltage_sel = bd718xx_set_voltage_sel_pickable_restricted,
- .get_voltage_sel = regulator_get_voltage_sel_pickable_regmap,
-};
-
-BD718XX_OPS(bd718xx_pickable_range_buck_ops,
- regulator_list_voltage_pickable_linear_range, NULL,
- regulator_set_voltage_sel_pickable_regmap,
- regulator_get_voltage_sel_pickable_regmap,
- regulator_set_voltage_time_sel, NULL);
-
-BD718XX_OPS(bd718xx_ldo_regulator_ops, regulator_list_voltage_linear_range,
- NULL, bd718xx_set_voltage_sel_restricted,
- regulator_get_voltage_sel_regmap, NULL, NULL);
-
-BD718XX_OPS(bd718xx_ldo_regulator_nolinear_ops, regulator_list_voltage_table,
- NULL, bd718xx_set_voltage_sel_restricted,
- regulator_get_voltage_sel_regmap, NULL, NULL);
-
-BD718XX_OPS(bd718xx_buck_regulator_ops, regulator_list_voltage_linear_range,
- NULL, regulator_set_voltage_sel_regmap,
- regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
- NULL);
-
-BD718XX_OPS(bd718xx_buck_regulator_nolinear_ops, regulator_list_voltage_table,
- regulator_map_voltage_ascend, regulator_set_voltage_sel_regmap,
- regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
- NULL);
-
-/*
- * OPS for BD71837
- */
-BD718XX_OPS(bd71837_pickable_range_ldo_ops,
- regulator_list_voltage_pickable_linear_range, NULL,
- bd71837_set_voltage_sel_pickable_restricted,
- regulator_get_voltage_sel_pickable_regmap, NULL, NULL);
-
-BD718XX_OPS(bd71837_pickable_range_buck_ops,
- regulator_list_voltage_pickable_linear_range, NULL,
- bd71837_set_voltage_sel_pickable_restricted,
- regulator_get_voltage_sel_pickable_regmap,
- regulator_set_voltage_time_sel, NULL);
-
-BD718XX_OPS(bd71837_ldo_regulator_ops, regulator_list_voltage_linear_range,
- NULL, bd71837_set_voltage_sel_restricted,
- regulator_get_voltage_sel_regmap, NULL, NULL);
-
-BD718XX_OPS(bd71837_ldo_regulator_nolinear_ops, regulator_list_voltage_table,
- NULL, bd71837_set_voltage_sel_restricted,
- regulator_get_voltage_sel_regmap, NULL, NULL);
-
-BD718XX_OPS(bd71837_buck_regulator_ops, regulator_list_voltage_linear_range,
- NULL, bd71837_set_voltage_sel_restricted,
- regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
- NULL);
-
-BD718XX_OPS(bd71837_buck_regulator_nolinear_ops, regulator_list_voltage_table,
- regulator_map_voltage_ascend, bd71837_set_voltage_sel_restricted,
- regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
- NULL);
-/*
- * BD71837 bucks 3 and 4 support defining their enable/disable state also
- * when buck enable state is under HW state machine control. In that case the
- * bit [2] in CTRL register is used to indicate if regulator should be ON.
- */
-static const struct regulator_ops bd71837_buck34_ops_hwctrl = {
- .is_enabled = bd71837_get_buck34_enable_hwctrl,
- .list_voltage = regulator_list_voltage_linear_range,
- .set_voltage_sel = regulator_set_voltage_sel_regmap,
- .get_voltage_sel = regulator_get_voltage_sel_regmap,
- .set_voltage_time_sel = regulator_set_voltage_time_sel,
- .set_ramp_delay = regulator_set_ramp_delay_regmap,
-};
-
-/*
- * OPS for all of the ICs - BD718(37/47/50)
- */
-BD718XX_OPS(bd718xx_dvs_buck_regulator_ops, regulator_list_voltage_linear_range,
- NULL, regulator_set_voltage_sel_regmap,
- regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
- /* bd718xx_buck1234_set_ramp_delay */ regulator_set_ramp_delay_regmap);
-
/*
* BD71837 BUCK1/2/3/4
* BD71847 BUCK1/2
int additional_init_amnt;
};
+static int bd718x7_xvp_sanity_check(struct regulator_dev *rdev, int lim_uV,
+ int severity)
+{
+ /*
+ * BD71837/47/50 ... (ICs supported by this driver) do not provide
+ * warnings, only protection
+ */
+ if (severity != REGULATOR_SEVERITY_PROT) {
+ dev_err(&rdev->dev,
+ "Unsupported Under Voltage protection level\n");
+ return -EINVAL;
+ }
+
+ /*
+ * And protection limit is not changeable. It can only be enabled
+ * or disabled
+ */
+ if (lim_uV)
+ return -EINVAL;
+
+ return 0;
+}
+
+static int bd718x7_set_ldo_uvp(struct regulator_dev *rdev, int lim_uV,
+ int severity, bool enable)
+{
+ int ldo_offset = rdev->desc->id - BD718XX_LDO1;
+ int prot_bit, ret;
+
+ ret = bd718x7_xvp_sanity_check(rdev, lim_uV, severity);
+ if (ret)
+ return ret;
+
+ prot_bit = BD718XX_LDO1_VRMON80 << ldo_offset;
+
+ if (enable)
+ return regmap_clear_bits(rdev->regmap, BD718XX_REG_MVRFLTMASK2,
+ prot_bit);
+
+ return regmap_set_bits(rdev->regmap, BD718XX_REG_MVRFLTMASK2,
+ prot_bit);
+}
+
+static int bd718x7_get_buck_prot_reg(int id, int *reg)
+{
+
+ if (id > BD718XX_BUCK8) {
+ WARN_ON(id > BD718XX_BUCK8);
+ return -EINVAL;
+ }
+
+ if (id > BD718XX_BUCK4)
+ *reg = BD718XX_REG_MVRFLTMASK0;
+ else
+ *reg = BD718XX_REG_MVRFLTMASK1;
+
+ return 0;
+}
+
+static int bd718x7_get_buck_ovp_info(int id, int *reg, int *bit)
+{
+ int ret;
+
+ ret = bd718x7_get_buck_prot_reg(id, reg);
+ if (ret)
+ return ret;
+
+ *bit = BIT((id % 4) * 2 + 1);
+
+ return 0;
+}
+
+static int bd718x7_get_buck_uvp_info(int id, int *reg, int *bit)
+{
+ int ret;
+
+ ret = bd718x7_get_buck_prot_reg(id, reg);
+ if (ret)
+ return ret;
+
+ *bit = BIT((id % 4) * 2);
+
+ return 0;
+}
+
+static int bd718x7_set_buck_uvp(struct regulator_dev *rdev, int lim_uV,
+ int severity, bool enable)
+{
+ int bit, reg, ret;
+
+ ret = bd718x7_xvp_sanity_check(rdev, lim_uV, severity);
+ if (ret)
+ return ret;
+
+ ret = bd718x7_get_buck_uvp_info(rdev->desc->id, ®, &bit);
+ if (ret)
+ return ret;
+
+ if (enable)
+ return regmap_clear_bits(rdev->regmap, reg, bit);
+
+ return regmap_set_bits(rdev->regmap, reg, bit);
+
+}
+
+static int bd718x7_set_buck_ovp(struct regulator_dev *rdev, int lim_uV,
+ int severity,
+ bool enable)
+{
+ int bit, reg, ret;
+
+ ret = bd718x7_xvp_sanity_check(rdev, lim_uV, severity);
+ if (ret)
+ return ret;
+
+ ret = bd718x7_get_buck_ovp_info(rdev->desc->id, ®, &bit);
+ if (ret)
+ return ret;
+
+ if (enable)
+ return regmap_clear_bits(rdev->regmap, reg, bit);
+
+ return regmap_set_bits(rdev->regmap, reg, bit);
+}
+
+/*
+ * OPS common for BD71847 and BD71850
+ */
+BD718XX_OPS(bd718xx_pickable_range_ldo_ops,
+ regulator_list_voltage_pickable_linear_range, NULL,
+ bd718xx_set_voltage_sel_pickable_restricted,
+ regulator_get_voltage_sel_pickable_regmap, NULL, NULL,
+ bd718x7_set_ldo_uvp, NULL);
+
+/* BD71847 and BD71850 LDO 5 is by default OFF at RUN state */
+static const struct regulator_ops bd718xx_ldo5_ops_hwstate = {
+ .is_enabled = never_enabled_by_hwstate,
+ .list_voltage = regulator_list_voltage_pickable_linear_range,
+ .set_voltage_sel = bd718xx_set_voltage_sel_pickable_restricted,
+ .get_voltage_sel = regulator_get_voltage_sel_pickable_regmap,
+ .set_under_voltage_protection = bd718x7_set_ldo_uvp,
+};
+
+BD718XX_OPS(bd718xx_pickable_range_buck_ops,
+ regulator_list_voltage_pickable_linear_range, NULL,
+ regulator_set_voltage_sel_pickable_regmap,
+ regulator_get_voltage_sel_pickable_regmap,
+ regulator_set_voltage_time_sel, NULL, bd718x7_set_buck_uvp,
+ bd718x7_set_buck_ovp);
+
+BD718XX_OPS(bd718xx_ldo_regulator_ops, regulator_list_voltage_linear_range,
+ NULL, bd718xx_set_voltage_sel_restricted,
+ regulator_get_voltage_sel_regmap, NULL, NULL, bd718x7_set_ldo_uvp,
+ NULL);
+
+BD718XX_OPS(bd718xx_ldo_regulator_nolinear_ops, regulator_list_voltage_table,
+ NULL, bd718xx_set_voltage_sel_restricted,
+ regulator_get_voltage_sel_regmap, NULL, NULL, bd718x7_set_ldo_uvp,
+ NULL);
+
+BD718XX_OPS(bd718xx_buck_regulator_ops, regulator_list_voltage_linear_range,
+ NULL, regulator_set_voltage_sel_regmap,
+ regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
+ NULL, bd718x7_set_buck_uvp, bd718x7_set_buck_ovp);
+
+BD718XX_OPS(bd718xx_buck_regulator_nolinear_ops, regulator_list_voltage_table,
+ regulator_map_voltage_ascend, regulator_set_voltage_sel_regmap,
+ regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
+ NULL, bd718x7_set_buck_uvp, bd718x7_set_buck_ovp);
+
+/*
+ * OPS for BD71837
+ */
+BD718XX_OPS(bd71837_pickable_range_ldo_ops,
+ regulator_list_voltage_pickable_linear_range, NULL,
+ bd71837_set_voltage_sel_pickable_restricted,
+ regulator_get_voltage_sel_pickable_regmap, NULL, NULL,
+ bd718x7_set_ldo_uvp, NULL);
+
+BD718XX_OPS(bd71837_pickable_range_buck_ops,
+ regulator_list_voltage_pickable_linear_range, NULL,
+ bd71837_set_voltage_sel_pickable_restricted,
+ regulator_get_voltage_sel_pickable_regmap,
+ regulator_set_voltage_time_sel, NULL, bd718x7_set_buck_uvp,
+ bd718x7_set_buck_ovp);
+
+BD718XX_OPS(bd71837_ldo_regulator_ops, regulator_list_voltage_linear_range,
+ NULL, bd71837_set_voltage_sel_restricted,
+ regulator_get_voltage_sel_regmap, NULL, NULL, bd718x7_set_ldo_uvp,
+ NULL);
+
+BD718XX_OPS(bd71837_ldo_regulator_nolinear_ops, regulator_list_voltage_table,
+ NULL, bd71837_set_voltage_sel_restricted,
+ regulator_get_voltage_sel_regmap, NULL, NULL, bd718x7_set_ldo_uvp,
+ NULL);
+
+BD718XX_OPS(bd71837_buck_regulator_ops, regulator_list_voltage_linear_range,
+ NULL, bd71837_set_voltage_sel_restricted,
+ regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
+ NULL, bd718x7_set_buck_uvp, bd718x7_set_buck_ovp);
+
+BD718XX_OPS(bd71837_buck_regulator_nolinear_ops, regulator_list_voltage_table,
+ regulator_map_voltage_ascend, bd71837_set_voltage_sel_restricted,
+ regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
+ NULL, bd718x7_set_buck_uvp, bd718x7_set_buck_ovp);
+/*
+ * BD71837 bucks 3 and 4 support defining their enable/disable state also
+ * when buck enable state is under HW state machine control. In that case the
+ * bit [2] in CTRL register is used to indicate if regulator should be ON.
+ */
+static const struct regulator_ops bd71837_buck34_ops_hwctrl = {
+ .is_enabled = bd71837_get_buck34_enable_hwctrl,
+ .list_voltage = regulator_list_voltage_linear_range,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+ .set_voltage_time_sel = regulator_set_voltage_time_sel,
+ .set_ramp_delay = regulator_set_ramp_delay_regmap,
+ .set_under_voltage_protection = bd718x7_set_buck_uvp,
+ .set_over_voltage_protection = bd718x7_set_buck_ovp,
+};
+
+/*
+ * OPS for all of the ICs - BD718(37/47/50)
+ */
+BD718XX_OPS(bd718xx_dvs_buck_regulator_ops, regulator_list_voltage_linear_range,
+ NULL, regulator_set_voltage_sel_regmap,
+ regulator_get_voltage_sel_regmap, regulator_set_voltage_time_sel,
+ regulator_set_ramp_delay_regmap, bd718x7_set_buck_uvp,
+ bd718x7_set_buck_ovp);
+
+
+
/*
* There is a HW quirk in BD71837. The shutdown sequence timings for
* bucks/LDOs which are controlled via register interface are changed.
return regmap_field_write(regl->suspend_sleep, val);
}
+static unsigned int da9063_get_overdrive_mask(const struct regulator_desc *desc)
+{
+ switch (desc->id) {
+ case DA9063_ID_BCORES_MERGED:
+ case DA9063_ID_BCORE1:
+ return DA9063_BCORE1_OD;
+ case DA9063_ID_BCORE2:
+ return DA9063_BCORE2_OD;
+ case DA9063_ID_BPRO:
+ return DA9063_BPRO_OD;
+ default:
+ return 0;
+ }
+}
+
+static int da9063_buck_set_limit_set_overdrive(struct regulator_dev *rdev,
+ int min_uA, int max_uA,
+ unsigned int overdrive_mask)
+{
+ /*
+ * When enabling overdrive, do it before changing the current limit to
+ * ensure sufficient supply throughout the switch.
+ */
+ struct da9063_regulator *regl = rdev_get_drvdata(rdev);
+ int ret;
+ unsigned int orig_overdrive;
+
+ ret = regmap_read(regl->hw->regmap, DA9063_REG_CONFIG_H,
+ &orig_overdrive);
+ if (ret < 0)
+ return ret;
+ orig_overdrive &= overdrive_mask;
+
+ if (orig_overdrive == 0) {
+ ret = regmap_set_bits(regl->hw->regmap, DA9063_REG_CONFIG_H,
+ overdrive_mask);
+ if (ret < 0)
+ return ret;
+ }
+
+ ret = regulator_set_current_limit_regmap(rdev, min_uA / 2, max_uA / 2);
+ if (ret < 0 && orig_overdrive == 0)
+ /*
+ * regulator_set_current_limit_regmap may have rejected the
+ * change because of unusable min_uA and/or max_uA inputs.
+ * Attempt to restore original overdrive state, ignore failure-
+ * on-failure.
+ */
+ regmap_clear_bits(regl->hw->regmap, DA9063_REG_CONFIG_H,
+ overdrive_mask);
+
+ return ret;
+}
+
+static int da9063_buck_set_limit_clear_overdrive(struct regulator_dev *rdev,
+ int min_uA, int max_uA,
+ unsigned int overdrive_mask)
+{
+ /*
+ * When disabling overdrive, do it after changing the current limit to
+ * ensure sufficient supply throughout the switch.
+ */
+ struct da9063_regulator *regl = rdev_get_drvdata(rdev);
+ int ret, orig_limit;
+
+ ret = regmap_read(rdev->regmap, rdev->desc->csel_reg, &orig_limit);
+ if (ret < 0)
+ return ret;
+
+ ret = regulator_set_current_limit_regmap(rdev, min_uA, max_uA);
+ if (ret < 0)
+ return ret;
+
+ ret = regmap_clear_bits(regl->hw->regmap, DA9063_REG_CONFIG_H,
+ overdrive_mask);
+ if (ret < 0)
+ /*
+ * Attempt to restore original current limit, ignore failure-
+ * on-failure.
+ */
+ regmap_write(rdev->regmap, rdev->desc->csel_reg, orig_limit);
+
+ return ret;
+}
+
+static int da9063_buck_set_current_limit(struct regulator_dev *rdev,
+ int min_uA, int max_uA)
+{
+ unsigned int overdrive_mask, n_currents;
+
+ overdrive_mask = da9063_get_overdrive_mask(rdev->desc);
+ if (overdrive_mask) {
+ n_currents = rdev->desc->n_current_limits;
+ if (n_currents == 0)
+ return -EINVAL;
+
+ if (max_uA > rdev->desc->curr_table[n_currents - 1])
+ return da9063_buck_set_limit_set_overdrive(rdev, min_uA,
+ max_uA,
+ overdrive_mask);
+
+ return da9063_buck_set_limit_clear_overdrive(rdev, min_uA,
+ max_uA,
+ overdrive_mask);
+ }
+ return regulator_set_current_limit_regmap(rdev, min_uA, max_uA);
+}
+
+static int da9063_buck_get_current_limit(struct regulator_dev *rdev)
+{
+ struct da9063_regulator *regl = rdev_get_drvdata(rdev);
+ int val, ret, limit;
+ unsigned int mask;
+
+ limit = regulator_get_current_limit_regmap(rdev);
+ if (limit < 0)
+ return limit;
+ mask = da9063_get_overdrive_mask(rdev->desc);
+ if (mask) {
+ ret = regmap_read(regl->hw->regmap, DA9063_REG_CONFIG_H, &val);
+ if (ret < 0)
+ return ret;
+ if (val & mask)
+ limit *= 2;
+ }
+ return limit;
+}
+
static const struct regulator_ops da9063_buck_ops = {
.enable = regulator_enable_regmap,
.disable = regulator_disable_regmap,
.get_voltage_sel = regulator_get_voltage_sel_regmap,
.set_voltage_sel = regulator_set_voltage_sel_regmap,
.list_voltage = regulator_list_voltage_linear,
- .set_current_limit = regulator_set_current_limit_regmap,
- .get_current_limit = regulator_get_current_limit_regmap,
+ .set_current_limit = da9063_buck_set_current_limit,
+ .get_current_limit = da9063_buck_get_current_limit,
.set_mode = da9063_buck_set_mode,
.get_mode = da9063_buck_get_mode,
.get_status = da9063_buck_get_status,
rdebug.dir = debugfs_create_dir("ux500-regulator", NULL);
/* create "status" file */
- debugfs_create_file("status", S_IRUGO, rdebug.dir, &pdev->dev,
+ debugfs_create_file("status", 0444, rdebug.dir, &pdev->dev,
&ux500_regulator_status_fops);
/* create "power-state-count" file */
- debugfs_create_file("power-state-count", S_IRUGO, rdebug.dir,
+ debugfs_create_file("power-state-count", 0444, rdebug.dir,
&pdev->dev, &ux500_regulator_power_state_cnt_fops);
rdebug.regulator_array = regulator_info;
}
EXPORT_SYMBOL_GPL(devm_regulator_register);
-static int devm_rdev_match(struct device *dev, void *res, void *data)
-{
- struct regulator_dev **r = res;
- if (!r || !*r) {
- WARN_ON(!r || !*r);
- return 0;
- }
- return *r == data;
-}
-
-/**
- * devm_regulator_unregister - Resource managed regulator_unregister()
- * @dev: device to supply
- * @rdev: regulator to free
- *
- * Unregister a regulator registered with devm_regulator_register().
- * Normally this function will not need to be called and the resource
- * management code will ensure that the resource is freed.
- */
-void devm_regulator_unregister(struct device *dev, struct regulator_dev *rdev)
-{
- int rc;
-
- rc = devres_release(dev, devm_rdev_release, devm_rdev_match, rdev);
- if (rc != 0)
- WARN_ON(rc);
-}
-EXPORT_SYMBOL_GPL(devm_regulator_unregister);
-
struct regulator_supply_alias_match {
struct device *dev;
const char *id;
}
EXPORT_SYMBOL_GPL(devm_regulator_register_supply_alias);
-/**
- * devm_regulator_unregister_supply_alias - Resource managed
- * regulator_unregister_supply_alias()
- *
- * @dev: device to supply
- * @id: supply name or regulator ID
- *
- * Unregister an alias registered with
- * devm_regulator_register_supply_alias(). Normally this function
- * will not need to be called and the resource management code
- * will ensure that the resource is freed.
- */
-void devm_regulator_unregister_supply_alias(struct device *dev, const char *id)
+static void devm_regulator_unregister_supply_alias(struct device *dev,
+ const char *id)
{
struct regulator_supply_alias_match match;
int rc;
if (rc != 0)
WARN_ON(rc);
}
-EXPORT_SYMBOL_GPL(devm_regulator_unregister_supply_alias);
/**
* devm_regulator_bulk_register_supply_alias - Managed register
}
EXPORT_SYMBOL_GPL(devm_regulator_bulk_register_supply_alias);
-/**
- * devm_regulator_bulk_unregister_supply_alias - Managed unregister
- * multiple aliases
- *
- * @dev: device to supply
- * @id: list of supply names or regulator IDs
- * @num_id: number of aliases to unregister
- *
- * Unregister aliases registered with
- * devm_regulator_bulk_register_supply_alias(). Normally this function
- * will not need to be called and the resource management code
- * will ensure that the resource is freed.
- */
-void devm_regulator_bulk_unregister_supply_alias(struct device *dev,
- const char *const *id,
- int num_id)
-{
- int i;
-
- for (i = 0; i < num_id; ++i)
- devm_regulator_unregister_supply_alias(dev, id[i]);
-}
-EXPORT_SYMBOL_GPL(devm_regulator_bulk_unregister_supply_alias);
-
struct regulator_notifier_match {
struct regulator *regulator;
struct notifier_block *nb;
drvdata->dev = devm_regulator_register(&pdev->dev, &drvdata->desc,
&cfg);
if (IS_ERR(drvdata->dev)) {
- ret = PTR_ERR(drvdata->dev);
- dev_err(&pdev->dev, "Failed to register regulator: %d\n", ret);
+ ret = dev_err_probe(&pdev->dev, PTR_ERR(drvdata->dev),
+ "Failed to register regulator: %ld\n",
+ PTR_ERR(drvdata->dev));
return ret;
}
//
// Copyright (c) 2013 Linaro Ltd.
// Copyright (c) 2011 HiSilicon Ltd.
-// Copyright (c) 2020-2021 Huawei Technologies Co., Ltd
+// Copyright (c) 2020-2021 Huawei Technologies Co., Ltd.
//
// Guodong Xu <guodong.xu@linaro.org>
u32 eco_uA;
};
-static const unsigned int ldo3_voltages[] = {
+static const unsigned int range_1v5_to_2v0[] = {
1500000, 1550000, 1600000, 1650000,
1700000, 1725000, 1750000, 1775000,
1800000, 1825000, 1850000, 1875000,
1900000, 1925000, 1950000, 2000000
};
-static const unsigned int ldo4_voltages[] = {
+static const unsigned int range_1v725_to_1v9[] = {
1725000, 1750000, 1775000, 1800000,
1825000, 1850000, 1875000, 1900000
};
-static const unsigned int ldo9_voltages[] = {
+static const unsigned int range_1v75_to_3v3[] = {
1750000, 1800000, 1825000, 2800000,
2850000, 2950000, 3000000, 3300000
};
-static const unsigned int ldo15_voltages[] = {
+static const unsigned int range_1v8_to_3v0[] = {
1800000, 1850000, 2400000, 2600000,
2700000, 2850000, 2950000, 3000000
};
-static const unsigned int ldo17_voltages[] = {
+static const unsigned int range_2v5_to_3v3[] = {
2500000, 2600000, 2700000, 2800000,
3000000, 3100000, 3200000, 3300000
};
-static const unsigned int ldo34_voltages[] = {
+static const unsigned int range_2v6_to_3v3[] = {
2600000, 2700000, 2800000, 2900000,
3000000, 3100000, 3200000, 3300000
};
*/
#define HI6421V600_LDO(_id, vtable, ereg, emask, vreg, \
odelay, etime, ecomask, ecoamp) \
- [HI6421V600_##_id] = { \
+ [hi6421v600_##_id] = { \
.desc = { \
.name = #_id, \
.of_match = of_match_ptr(#_id), \
.regulators_node = of_match_ptr("regulators"), \
.ops = &hi6421_spmi_ldo_rops, \
.type = REGULATOR_VOLTAGE, \
- .id = HI6421V600_##_id, \
+ .id = hi6421v600_##_id, \
.owner = THIS_MODULE, \
.volt_table = vtable, \
.n_voltages = ARRAY_SIZE(vtable), \
/* HI6421v600 regulators with known registers */
enum hi6421_spmi_regulator_id {
- HI6421V600_LDO3,
- HI6421V600_LDO4,
- HI6421V600_LDO9,
- HI6421V600_LDO15,
- HI6421V600_LDO16,
- HI6421V600_LDO17,
- HI6421V600_LDO33,
- HI6421V600_LDO34,
+ hi6421v600_ldo3,
+ hi6421v600_ldo4,
+ hi6421v600_ldo9,
+ hi6421v600_ldo15,
+ hi6421v600_ldo16,
+ hi6421v600_ldo17,
+ hi6421v600_ldo33,
+ hi6421v600_ldo34,
};
static struct hi6421_spmi_reg_info regulator_info[] = {
- HI6421V600_LDO(LDO3, ldo3_voltages,
+ HI6421V600_LDO(ldo3, range_1v5_to_2v0,
0x16, 0x01, 0x51,
20000, 120,
0, 0),
- HI6421V600_LDO(LDO4, ldo4_voltages,
+ HI6421V600_LDO(ldo4, range_1v725_to_1v9,
0x17, 0x01, 0x52,
20000, 120,
0x10, 10000),
- HI6421V600_LDO(LDO9, ldo9_voltages,
+ HI6421V600_LDO(ldo9, range_1v75_to_3v3,
0x1c, 0x01, 0x57,
20000, 360,
0x10, 10000),
- HI6421V600_LDO(LDO15, ldo15_voltages,
+ HI6421V600_LDO(ldo15, range_1v8_to_3v0,
0x21, 0x01, 0x5c,
20000, 360,
0x10, 10000),
- HI6421V600_LDO(LDO16, ldo15_voltages,
+ HI6421V600_LDO(ldo16, range_1v8_to_3v0,
0x22, 0x01, 0x5d,
20000, 360,
0x10, 10000),
- HI6421V600_LDO(LDO17, ldo17_voltages,
+ HI6421V600_LDO(ldo17, range_2v5_to_3v3,
0x23, 0x01, 0x5e,
20000, 120,
0x10, 10000),
- HI6421V600_LDO(LDO33, ldo17_voltages,
+ HI6421V600_LDO(ldo33, range_2v5_to_3v3,
0x32, 0x01, 0x6d,
20000, 120,
0, 0),
- HI6421V600_LDO(LDO34, ldo34_voltages,
+ HI6421V600_LDO(ldo34, range_2v6_to_3v3,
0x33, 0x01, 0x6e,
20000, 120,
0, 0),
* If retry_count exceeds the given safety limit we call IC specific die
* handler which can try disabling regulator(s).
*
- * If no die handler is given we will just bug() as a last resort.
+ * If no die handler is given we will just power-off as a last resort.
*
* We could try disabling all associated rdevs - but we might shoot
* ourselves in the head and leave the problematic regulator enabled. So
u32 qi;
const u32 *index_table;
unsigned int n_table;
- u32 vsel_shift;
u32 da_vsel_reg;
u32 da_vsel_mask;
- u32 da_vsel_shift;
u32 modeset_reg;
u32 modeset_mask;
- u32 modeset_shift;
};
#define MT6358_BUCK(match, vreg, min, max, step, \
volt_ranges, vosel_mask, _da_vsel_reg, _da_vsel_mask, \
- _da_vsel_shift, _modeset_reg, _modeset_shift) \
+ _modeset_reg, _modeset_shift) \
[MT6358_ID_##vreg] = { \
.desc = { \
.name = #vreg, \
.qi = BIT(0), \
.da_vsel_reg = _da_vsel_reg, \
.da_vsel_mask = _da_vsel_mask, \
- .da_vsel_shift = _da_vsel_shift, \
.modeset_reg = _modeset_reg, \
.modeset_mask = BIT(_modeset_shift), \
- .modeset_shift = _modeset_shift \
}
#define MT6358_LDO(match, vreg, ldo_volt_table, \
ldo_index_table, enreg, enbit, vosel, \
- vosel_mask, vosel_shift) \
+ vosel_mask) \
[MT6358_ID_##vreg] = { \
.desc = { \
.name = #vreg, \
.qi = BIT(15), \
.index_table = ldo_index_table, \
.n_table = ARRAY_SIZE(ldo_index_table), \
- .vsel_shift = vosel_shift, \
}
#define MT6358_LDO1(match, vreg, min, max, step, \
volt_ranges, _da_vsel_reg, _da_vsel_mask, \
- _da_vsel_shift, vosel, vosel_mask) \
+ vosel, vosel_mask) \
[MT6358_ID_##vreg] = { \
.desc = { \
.name = #vreg, \
}, \
.da_vsel_reg = _da_vsel_reg, \
.da_vsel_mask = _da_vsel_mask, \
- .da_vsel_shift = _da_vsel_shift, \
.status_reg = MT6358_LDO_##vreg##_DBG1, \
.qi = BIT(0), \
}
pvol = info->index_table;
idx = pvol[selector];
+ idx <<= ffs(info->desc.vsel_mask) - 1;
ret = regmap_update_bits(rdev->regmap, info->desc.vsel_reg,
- info->desc.vsel_mask,
- idx << info->vsel_shift);
+ info->desc.vsel_mask, idx);
return ret;
}
return ret;
}
- selector = (selector & info->desc.vsel_mask) >> info->vsel_shift;
+ selector = (selector & info->desc.vsel_mask) >>
+ (ffs(info->desc.vsel_mask) - 1);
pvol = info->index_table;
for (idx = 0; idx < info->desc.n_voltages; idx++) {
if (pvol[idx] == selector)
return ret;
}
- ret = (regval >> info->da_vsel_shift) & info->da_vsel_mask;
+ ret = (regval & info->da_vsel_mask) >> (ffs(info->da_vsel_mask) - 1);
return ret;
}
return -EINVAL;
}
- dev_dbg(&rdev->dev, "mt6358 buck set_mode %#x, %#x, %#x, %#x\n",
- info->modeset_reg, info->modeset_mask,
- info->modeset_shift, val);
+ dev_dbg(&rdev->dev, "mt6358 buck set_mode %#x, %#x, %#x\n",
+ info->modeset_reg, info->modeset_mask, val);
- val <<= info->modeset_shift;
+ val <<= ffs(info->modeset_mask) - 1;
return regmap_update_bits(rdev->regmap, info->modeset_reg,
info->modeset_mask, val);
return ret;
}
- switch ((regval & info->modeset_mask) >> info->modeset_shift) {
+ switch ((regval & info->modeset_mask) >> (ffs(info->modeset_mask) - 1)) {
case MT6358_BUCK_MODE_AUTO:
return REGULATOR_MODE_NORMAL;
case MT6358_BUCK_MODE_FORCE_PWM:
static struct mt6358_regulator_info mt6358_regulators[] = {
MT6358_BUCK("buck_vdram1", VDRAM1, 500000, 2087500, 12500,
buck_volt_range2, 0x7f, MT6358_BUCK_VDRAM1_DBG0, 0x7f,
- 0, MT6358_VDRAM1_ANA_CON0, 8),
+ MT6358_VDRAM1_ANA_CON0, 8),
MT6358_BUCK("buck_vcore", VCORE, 500000, 1293750, 6250,
buck_volt_range1, 0x7f, MT6358_BUCK_VCORE_DBG0, 0x7f,
- 0, MT6358_VCORE_VGPU_ANA_CON0, 1),
+ MT6358_VCORE_VGPU_ANA_CON0, 1),
MT6358_BUCK("buck_vpa", VPA, 500000, 3650000, 50000,
- buck_volt_range3, 0x3f, MT6358_BUCK_VPA_DBG0, 0x3f, 0,
+ buck_volt_range3, 0x3f, MT6358_BUCK_VPA_DBG0, 0x3f,
MT6358_VPA_ANA_CON0, 3),
MT6358_BUCK("buck_vproc11", VPROC11, 500000, 1293750, 6250,
buck_volt_range1, 0x7f, MT6358_BUCK_VPROC11_DBG0, 0x7f,
- 0, MT6358_VPROC_ANA_CON0, 1),
+ MT6358_VPROC_ANA_CON0, 1),
MT6358_BUCK("buck_vproc12", VPROC12, 500000, 1293750, 6250,
buck_volt_range1, 0x7f, MT6358_BUCK_VPROC12_DBG0, 0x7f,
- 0, MT6358_VPROC_ANA_CON0, 2),
+ MT6358_VPROC_ANA_CON0, 2),
MT6358_BUCK("buck_vgpu", VGPU, 500000, 1293750, 6250,
- buck_volt_range1, 0x7f, MT6358_BUCK_VGPU_ELR0, 0x7f, 0,
+ buck_volt_range1, 0x7f, MT6358_BUCK_VGPU_ELR0, 0x7f,
MT6358_VCORE_VGPU_ANA_CON0, 2),
MT6358_BUCK("buck_vs2", VS2, 500000, 2087500, 12500,
- buck_volt_range2, 0x7f, MT6358_BUCK_VS2_DBG0, 0x7f, 0,
+ buck_volt_range2, 0x7f, MT6358_BUCK_VS2_DBG0, 0x7f,
MT6358_VS2_ANA_CON0, 8),
MT6358_BUCK("buck_vmodem", VMODEM, 500000, 1293750, 6250,
buck_volt_range1, 0x7f, MT6358_BUCK_VMODEM_DBG0, 0x7f,
- 0, MT6358_VMODEM_ANA_CON0, 8),
+ MT6358_VMODEM_ANA_CON0, 8),
MT6358_BUCK("buck_vs1", VS1, 1000000, 2587500, 12500,
- buck_volt_range4, 0x7f, MT6358_BUCK_VS1_DBG0, 0x7f, 0,
+ buck_volt_range4, 0x7f, MT6358_BUCK_VS1_DBG0, 0x7f,
MT6358_VS1_ANA_CON0, 8),
MT6358_REG_FIXED("ldo_vrf12", VRF12,
MT6358_LDO_VRF12_CON0, 0, 1200000),
MT6358_REG_FIXED("ldo_vaud28", VAUD28,
MT6358_LDO_VAUD28_CON0, 0, 2800000),
MT6358_LDO("ldo_vdram2", VDRAM2, vdram2_voltages, vdram2_idx,
- MT6358_LDO_VDRAM2_CON0, 0, MT6358_LDO_VDRAM2_ELR0, 0xf, 0),
+ MT6358_LDO_VDRAM2_CON0, 0, MT6358_LDO_VDRAM2_ELR0, 0xf),
MT6358_LDO("ldo_vsim1", VSIM1, vsim_voltages, vsim_idx,
- MT6358_LDO_VSIM1_CON0, 0, MT6358_VSIM1_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VSIM1_CON0, 0, MT6358_VSIM1_ANA_CON0, 0xf00),
MT6358_LDO("ldo_vibr", VIBR, vibr_voltages, vibr_idx,
- MT6358_LDO_VIBR_CON0, 0, MT6358_VIBR_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VIBR_CON0, 0, MT6358_VIBR_ANA_CON0, 0xf00),
MT6358_LDO("ldo_vusb", VUSB, vusb_voltages, vusb_idx,
- MT6358_LDO_VUSB_CON0_0, 0, MT6358_VUSB_ANA_CON0, 0x700, 8),
+ MT6358_LDO_VUSB_CON0_0, 0, MT6358_VUSB_ANA_CON0, 0x700),
MT6358_LDO("ldo_vcamd", VCAMD, vcamd_voltages, vcamd_idx,
- MT6358_LDO_VCAMD_CON0, 0, MT6358_VCAMD_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VCAMD_CON0, 0, MT6358_VCAMD_ANA_CON0, 0xf00),
MT6358_LDO("ldo_vefuse", VEFUSE, vefuse_voltages, vefuse_idx,
- MT6358_LDO_VEFUSE_CON0, 0, MT6358_VEFUSE_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VEFUSE_CON0, 0, MT6358_VEFUSE_ANA_CON0, 0xf00),
MT6358_LDO("ldo_vmch", VMCH, vmch_vemc_voltages, vmch_vemc_idx,
- MT6358_LDO_VMCH_CON0, 0, MT6358_VMCH_ANA_CON0, 0x700, 8),
+ MT6358_LDO_VMCH_CON0, 0, MT6358_VMCH_ANA_CON0, 0x700),
MT6358_LDO("ldo_vcama1", VCAMA1, vcama_voltages, vcama_idx,
- MT6358_LDO_VCAMA1_CON0, 0, MT6358_VCAMA1_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VCAMA1_CON0, 0, MT6358_VCAMA1_ANA_CON0, 0xf00),
MT6358_LDO("ldo_vemc", VEMC, vmch_vemc_voltages, vmch_vemc_idx,
- MT6358_LDO_VEMC_CON0, 0, MT6358_VEMC_ANA_CON0, 0x700, 8),
+ MT6358_LDO_VEMC_CON0, 0, MT6358_VEMC_ANA_CON0, 0x700),
MT6358_LDO("ldo_vcn33_bt", VCN33_BT, vcn33_bt_wifi_voltages,
vcn33_bt_wifi_idx, MT6358_LDO_VCN33_CON0_0,
- 0, MT6358_VCN33_ANA_CON0, 0x300, 8),
+ 0, MT6358_VCN33_ANA_CON0, 0x300),
MT6358_LDO("ldo_vcn33_wifi", VCN33_WIFI, vcn33_bt_wifi_voltages,
vcn33_bt_wifi_idx, MT6358_LDO_VCN33_CON0_1,
- 0, MT6358_VCN33_ANA_CON0, 0x300, 8),
+ 0, MT6358_VCN33_ANA_CON0, 0x300),
MT6358_LDO("ldo_vcama2", VCAMA2, vcama_voltages, vcama_idx,
- MT6358_LDO_VCAMA2_CON0, 0, MT6358_VCAMA2_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VCAMA2_CON0, 0, MT6358_VCAMA2_ANA_CON0, 0xf00),
MT6358_LDO("ldo_vmc", VMC, vmc_voltages, vmc_idx,
- MT6358_LDO_VMC_CON0, 0, MT6358_VMC_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VMC_CON0, 0, MT6358_VMC_ANA_CON0, 0xf00),
MT6358_LDO("ldo_vldo28", VLDO28, vldo28_voltages, vldo28_idx,
MT6358_LDO_VLDO28_CON0_0, 0,
- MT6358_VLDO28_ANA_CON0, 0x300, 8),
+ MT6358_VLDO28_ANA_CON0, 0x300),
MT6358_LDO("ldo_vsim2", VSIM2, vsim_voltages, vsim_idx,
- MT6358_LDO_VSIM2_CON0, 0, MT6358_VSIM2_ANA_CON0, 0xf00, 8),
+ MT6358_LDO_VSIM2_CON0, 0, MT6358_VSIM2_ANA_CON0, 0xf00),
MT6358_LDO1("ldo_vsram_proc11", VSRAM_PROC11, 500000, 1293750, 6250,
- buck_volt_range1, MT6358_LDO_VSRAM_PROC11_DBG0, 0x7f, 8,
+ buck_volt_range1, MT6358_LDO_VSRAM_PROC11_DBG0, 0x7f00,
MT6358_LDO_VSRAM_CON0, 0x7f),
MT6358_LDO1("ldo_vsram_others", VSRAM_OTHERS, 500000, 1293750, 6250,
- buck_volt_range1, MT6358_LDO_VSRAM_OTHERS_DBG0, 0x7f, 8,
+ buck_volt_range1, MT6358_LDO_VSRAM_OTHERS_DBG0, 0x7f00,
MT6358_LDO_VSRAM_CON2, 0x7f),
MT6358_LDO1("ldo_vsram_gpu", VSRAM_GPU, 500000, 1293750, 6250,
- buck_volt_range1, MT6358_LDO_VSRAM_GPU_DBG0, 0x7f, 8,
+ buck_volt_range1, MT6358_LDO_VSRAM_GPU_DBG0, 0x7f00,
MT6358_LDO_VSRAM_CON3, 0x7f),
MT6358_LDO1("ldo_vsram_proc12", VSRAM_PROC12, 500000, 1293750, 6250,
- buck_volt_range1, MT6358_LDO_VSRAM_PROC12_DBG0, 0x7f, 8,
+ buck_volt_range1, MT6358_LDO_VSRAM_PROC12_DBG0, 0x7f00,
MT6358_LDO_VSRAM_CON1, 0x7f),
};
* @qi: Mask for query enable signal status of regulators.
* @modeset_reg: for operating AUTO/PWM mode register.
* @modeset_mask: MASK for operating modeset register.
- * @modeset_shift: SHIFT for operating modeset register.
*/
struct mt6359_regulator_info {
struct regulator_desc desc;
u32 qi;
u32 modeset_reg;
u32 modeset_mask;
- u32 modeset_shift;
u32 lp_mode_reg;
u32 lp_mode_mask;
- u32 lp_mode_shift;
};
#define MT6359_BUCK(match, _name, min, max, step, \
.qi = BIT(0), \
.lp_mode_reg = _lp_mode_reg, \
.lp_mode_mask = BIT(_lp_mode_shift), \
- .lp_mode_shift = _lp_mode_shift, \
.modeset_reg = _modeset_reg, \
.modeset_mask = BIT(_modeset_shift), \
- .modeset_shift = _modeset_shift \
}
#define MT6359_LDO_LINEAR(match, _name, min, max, step, \
return ret;
}
- if ((regval & info->modeset_mask) >> info->modeset_shift ==
- MT6359_BUCK_MODE_FORCE_PWM)
+ regval &= info->modeset_mask;
+ regval >>= ffs(info->modeset_mask) - 1;
+
+ if (regval == MT6359_BUCK_MODE_FORCE_PWM)
return REGULATOR_MODE_FAST;
ret = regmap_read(rdev->regmap, info->lp_mode_reg, ®val);
switch (mode) {
case REGULATOR_MODE_FAST:
val = MT6359_BUCK_MODE_FORCE_PWM;
- val <<= info->modeset_shift;
+ val <<= ffs(info->modeset_mask) - 1;
ret = regmap_update_bits(rdev->regmap,
info->modeset_reg,
info->modeset_mask,
case REGULATOR_MODE_NORMAL:
if (curr_mode == REGULATOR_MODE_FAST) {
val = MT6359_BUCK_MODE_AUTO;
- val <<= info->modeset_shift;
+ val <<= ffs(info->modeset_mask) - 1;
ret = regmap_update_bits(rdev->regmap,
info->modeset_reg,
info->modeset_mask,
val);
} else if (curr_mode == REGULATOR_MODE_IDLE) {
val = MT6359_BUCK_MODE_NORMAL;
- val <<= info->lp_mode_shift;
+ val <<= ffs(info->lp_mode_mask) - 1;
ret = regmap_update_bits(rdev->regmap,
info->lp_mode_reg,
info->lp_mode_mask,
break;
case REGULATOR_MODE_IDLE:
val = MT6359_BUCK_MODE_LP >> 1;
- val <<= info->lp_mode_shift;
+ val <<= ffs(info->lp_mode_mask) - 1;
ret = regmap_update_bits(rdev->regmap,
info->lp_mode_reg,
info->lp_mode_mask,
u32 vselctrl_mask;
u32 modeset_reg;
u32 modeset_mask;
- u32 modeset_shift;
};
#define MT6397_BUCK(match, vreg, min, max, step, volt_ranges, enreg, \
.vselctrl_mask = BIT(1), \
.modeset_reg = _modeset_reg, \
.modeset_mask = BIT(_modeset_shift), \
- .modeset_shift = _modeset_shift \
}
#define MT6397_LDO(match, vreg, ldo_volt_table, enreg, enbit, vosel, \
goto err_mode;
}
- dev_dbg(&rdev->dev, "mt6397 buck set_mode %#x, %#x, %#x, %#x\n",
- info->modeset_reg, info->modeset_mask,
- info->modeset_shift, val);
+ dev_dbg(&rdev->dev, "mt6397 buck set_mode %#x, %#x, %#x\n",
+ info->modeset_reg, info->modeset_mask, val);
+
+ val <<= ffs(info->modeset_mask) - 1;
- val <<= info->modeset_shift;
ret = regmap_update_bits(rdev->regmap, info->modeset_reg,
info->modeset_mask, val);
err_mode:
return ret;
}
- switch ((regval & info->modeset_mask) >> info->modeset_shift) {
+ regval &= info->modeset_mask;
+ regval >>= ffs(info->modeset_mask) - 1;
+
+ switch (regval) {
case MT6397_BUCK_MODE_AUTO:
return REGULATOR_MODE_NORMAL;
case MT6397_BUCK_MODE_FORCE_PWM:
#include <linux/mfd/rt5033-private.h>
#include <linux/regulator/of_regulator.h>
+static const struct linear_range rt5033_buck_ranges[] = {
+ REGULATOR_LINEAR_RANGE(1000000, 0, 20, 100000),
+ REGULATOR_LINEAR_RANGE(3000000, 21, 31, 0),
+};
+
+static const struct linear_range rt5033_ldo_ranges[] = {
+ REGULATOR_LINEAR_RANGE(1200000, 0, 18, 100000),
+ REGULATOR_LINEAR_RANGE(3000000, 19, 31, 0),
+};
+
static const struct regulator_ops rt5033_safe_ldo_ops = {
.is_enabled = regulator_is_enabled_regmap,
.enable = regulator_enable_regmap,
.is_enabled = regulator_is_enabled_regmap,
.enable = regulator_enable_regmap,
.disable = regulator_disable_regmap,
- .list_voltage = regulator_list_voltage_linear,
- .map_voltage = regulator_map_voltage_linear,
+ .list_voltage = regulator_list_voltage_linear_range,
.get_voltage_sel = regulator_get_voltage_sel_regmap,
.set_voltage_sel = regulator_set_voltage_sel_regmap,
};
.type = REGULATOR_VOLTAGE,
.owner = THIS_MODULE,
.n_voltages = RT5033_REGULATOR_BUCK_VOLTAGE_STEP_NUM,
- .min_uV = RT5033_REGULATOR_BUCK_VOLTAGE_MIN,
- .uV_step = RT5033_REGULATOR_BUCK_VOLTAGE_STEP,
+ .linear_ranges = rt5033_buck_ranges,
+ .n_linear_ranges = ARRAY_SIZE(rt5033_buck_ranges),
.enable_reg = RT5033_REG_CTRL,
.enable_mask = RT5033_CTRL_EN_BUCK_MASK,
.vsel_reg = RT5033_REG_BUCK_CTRL,
.type = REGULATOR_VOLTAGE,
.owner = THIS_MODULE,
.n_voltages = RT5033_REGULATOR_LDO_VOLTAGE_STEP_NUM,
- .min_uV = RT5033_REGULATOR_LDO_VOLTAGE_MIN,
- .uV_step = RT5033_REGULATOR_LDO_VOLTAGE_STEP,
+ .linear_ranges = rt5033_ldo_ranges,
+ .n_linear_ranges = ARRAY_SIZE(rt5033_ldo_ranges),
.enable_reg = RT5033_REG_CTRL,
.enable_mask = RT5033_CTRL_EN_LDO_MASK,
.vsel_reg = RT5033_REG_LDO_CTRL,
static int rt6245_reg_write(void *context, unsigned int reg, unsigned int val)
{
struct i2c_client *i2c = context;
- const u8 func_base[] = { 0x6F, 0x73, 0x78, 0x61, 0x7C, 0 };
+ static const u8 func_base[] = { 0x6F, 0x73, 0x78, 0x61, 0x7C, 0 };
unsigned int code, bit_count;
code = func_base[reg];
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0+
+
+#include <linux/bitops.h>
+#include <linux/i2c.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/regmap.h>
+#include <linux/regulator/driver.h>
+
+enum {
+ RTQ2134_IDX_BUCK1 = 0,
+ RTQ2134_IDX_BUCK2,
+ RTQ2134_IDX_BUCK3,
+ RTQ2134_IDX_MAX
+};
+
+#define RTQ2134_AUTO_MODE 0
+#define RTQ2134_FCCM_MODE 1
+
+#define RTQ2134_BUCK_DVS0_CTRL 0
+#define RTQ2134_BUCK_VSEL_CTRL 2
+
+#define RTQ2134_REG_IO_CHIPNAME 0x01
+#define RTQ2134_REG_FLT_RECORDTEMP 0x13
+#define RTQ2134_REG_FLT_RECORDBUCK(_id) (0x14 + (_id))
+#define RTQ2134_REG_FLT_BUCKCTRL(_id) (0x37 + (_id))
+#define RTQ2134_REG_BUCK1_CFG0 0x42
+#define RTQ2134_REG_BUCK1_DVS0CFG1 0x48
+#define RTQ2134_REG_BUCK1_DVS0CFG0 0x49
+#define RTQ2134_REG_BUCK1_DVS1CFG1 0x4A
+#define RTQ2134_REG_BUCK1_DVS1CFG0 0x4B
+#define RTQ2134_REG_BUCK1_DVSCFG 0x52
+#define RTQ2134_REG_BUCK1_RSPCFG 0x54
+#define RTQ2134_REG_BUCK2_CFG0 0x5F
+#define RTQ2134_REG_BUCK2_DVS0CFG1 0x62
+#define RTQ2134_REG_BUCK2_DVS0CFG0 0x63
+#define RTQ2134_REG_BUCK2_DVS1CFG1 0x64
+#define RTQ2134_REG_BUCK2_DVS1CFG0 0x65
+#define RTQ2134_REG_BUCK2_DVSCFG 0x6C
+#define RTQ2134_REG_BUCK2_RSPCFG 0x6E
+#define RTQ2134_REG_BUCK3_CFG0 0x79
+#define RTQ2134_REG_BUCK3_DVS0CFG1 0x7C
+#define RTQ2134_REG_BUCK3_DVS0CFG0 0x7D
+#define RTQ2134_REG_BUCK3_DVS1CFG1 0x7E
+#define RTQ2134_REG_BUCK3_DVS1CFG0 0x7F
+#define RTQ2134_REG_BUCK3_DVSCFG 0x86
+#define RTQ2134_REG_BUCK3_RSPCFG 0x88
+#define RTQ2134_REG_BUCK3_SLEWCTRL 0x89
+
+#define RTQ2134_VOUT_MAXNUM 256
+#define RTQ2134_VOUT_MASK 0xFF
+#define RTQ2134_VOUTEN_MASK BIT(0)
+#define RTQ2134_ACTDISCHG_MASK BIT(0)
+#define RTQ2134_RSPUP_MASK GENMASK(6, 4)
+#define RTQ2134_FCCM_MASK BIT(5)
+#define RTQ2134_UVHICCUP_MASK BIT(3)
+#define RTQ2134_BUCKDVS_CTRL_MASK GENMASK(1, 0)
+#define RTQ2134_CHIPOT_MASK BIT(2)
+#define RTQ2134_BUCKOV_MASK BIT(5)
+#define RTQ2134_BUCKUV_MASK BIT(4)
+
+struct rtq2134_regulator_desc {
+ struct regulator_desc desc;
+ /* Extension for proprietary register and mask */
+ unsigned int mode_reg;
+ unsigned int mode_mask;
+ unsigned int suspend_enable_reg;
+ unsigned int suspend_enable_mask;
+ unsigned int suspend_vsel_reg;
+ unsigned int suspend_vsel_mask;
+ unsigned int suspend_mode_reg;
+ unsigned int suspend_mode_mask;
+ unsigned int dvs_ctrl_reg;
+};
+
+static int rtq2134_buck_set_mode(struct regulator_dev *rdev, unsigned int mode)
+{
+ struct rtq2134_regulator_desc *desc =
+ (struct rtq2134_regulator_desc *)rdev->desc;
+ unsigned int val;
+
+ if (mode == REGULATOR_MODE_NORMAL)
+ val = RTQ2134_AUTO_MODE;
+ else if (mode == REGULATOR_MODE_FAST)
+ val = RTQ2134_FCCM_MODE;
+ else
+ return -EINVAL;
+
+ val <<= ffs(desc->mode_mask) - 1;
+ return regmap_update_bits(rdev->regmap, desc->mode_reg, desc->mode_mask,
+ val);
+}
+
+static unsigned int rtq2134_buck_get_mode(struct regulator_dev *rdev)
+{
+ struct rtq2134_regulator_desc *desc =
+ (struct rtq2134_regulator_desc *)rdev->desc;
+ unsigned int mode;
+ int ret;
+
+ ret = regmap_read(rdev->regmap, desc->mode_reg, &mode);
+ if (ret)
+ return ret;
+
+ if (mode & desc->mode_mask)
+ return REGULATOR_MODE_FAST;
+ return REGULATOR_MODE_NORMAL;
+}
+
+static int rtq2134_buck_set_suspend_voltage(struct regulator_dev *rdev, int uV)
+{
+ struct rtq2134_regulator_desc *desc =
+ (struct rtq2134_regulator_desc *)rdev->desc;
+ int sel;
+
+ sel = regulator_map_voltage_linear_range(rdev, uV, uV);
+ if (sel < 0)
+ return sel;
+
+ sel <<= ffs(desc->suspend_vsel_mask) - 1;
+
+ return regmap_update_bits(rdev->regmap, desc->suspend_vsel_reg,
+ desc->suspend_vsel_mask, sel);
+}
+
+static int rtq2134_buck_set_suspend_enable(struct regulator_dev *rdev)
+{
+ struct rtq2134_regulator_desc *desc =
+ (struct rtq2134_regulator_desc *)rdev->desc;
+ unsigned int val = desc->suspend_enable_mask;
+
+ return regmap_update_bits(rdev->regmap, desc->suspend_enable_reg,
+ desc->suspend_enable_mask, val);
+}
+
+static int rtq2134_buck_set_suspend_disable(struct regulator_dev *rdev)
+{
+ struct rtq2134_regulator_desc *desc =
+ (struct rtq2134_regulator_desc *)rdev->desc;
+
+ return regmap_update_bits(rdev->regmap, desc->suspend_enable_reg,
+ desc->suspend_enable_mask, 0);
+}
+
+static int rtq2134_buck_set_suspend_mode(struct regulator_dev *rdev,
+ unsigned int mode)
+{
+ struct rtq2134_regulator_desc *desc =
+ (struct rtq2134_regulator_desc *)rdev->desc;
+ unsigned int val;
+
+ if (mode == REGULATOR_MODE_NORMAL)
+ val = RTQ2134_AUTO_MODE;
+ else if (mode == REGULATOR_MODE_FAST)
+ val = RTQ2134_FCCM_MODE;
+ else
+ return -EINVAL;
+
+ val <<= ffs(desc->suspend_mode_mask) - 1;
+ return regmap_update_bits(rdev->regmap, desc->suspend_mode_reg,
+ desc->suspend_mode_mask, val);
+}
+
+static int rtq2134_buck_get_error_flags(struct regulator_dev *rdev,
+ unsigned int *flags)
+{
+ int rid = rdev_get_id(rdev);
+ unsigned int chip_error, buck_error, events = 0;
+ int ret;
+
+ ret = regmap_read(rdev->regmap, RTQ2134_REG_FLT_RECORDTEMP,
+ &chip_error);
+ if (ret) {
+ dev_err(&rdev->dev, "Failed to get chip error flag\n");
+ return ret;
+ }
+
+ ret = regmap_read(rdev->regmap, RTQ2134_REG_FLT_RECORDBUCK(rid),
+ &buck_error);
+ if (ret) {
+ dev_err(&rdev->dev, "Failed to get buck error flag\n");
+ return ret;
+ }
+
+ if (chip_error & RTQ2134_CHIPOT_MASK)
+ events |= REGULATOR_ERROR_OVER_TEMP;
+
+ if (buck_error & RTQ2134_BUCKUV_MASK)
+ events |= REGULATOR_ERROR_UNDER_VOLTAGE;
+
+ if (buck_error & RTQ2134_BUCKOV_MASK)
+ events |= REGULATOR_ERROR_REGULATION_OUT;
+
+ *flags = events;
+ return 0;
+}
+
+static const struct regulator_ops rtq2134_buck_ops = {
+ .list_voltage = regulator_list_voltage_linear_range,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .set_active_discharge = regulator_set_active_discharge_regmap,
+ .set_ramp_delay = regulator_set_ramp_delay_regmap,
+ .set_mode = rtq2134_buck_set_mode,
+ .get_mode = rtq2134_buck_get_mode,
+ .set_suspend_voltage = rtq2134_buck_set_suspend_voltage,
+ .set_suspend_enable = rtq2134_buck_set_suspend_enable,
+ .set_suspend_disable = rtq2134_buck_set_suspend_disable,
+ .set_suspend_mode = rtq2134_buck_set_suspend_mode,
+ .get_error_flags = rtq2134_buck_get_error_flags,
+};
+
+static const struct linear_range rtq2134_buck_vout_ranges[] = {
+ REGULATOR_LINEAR_RANGE(300000, 0, 200, 5000),
+ REGULATOR_LINEAR_RANGE(1310000, 201, 255, 10000)
+};
+
+static unsigned int rtq2134_buck_of_map_mode(unsigned int mode)
+{
+ switch (mode) {
+ case RTQ2134_AUTO_MODE:
+ return REGULATOR_MODE_NORMAL;
+ case RTQ2134_FCCM_MODE:
+ return REGULATOR_MODE_FAST;
+ }
+
+ return REGULATOR_MODE_INVALID;
+}
+
+static int rtq2134_buck_of_parse_cb(struct device_node *np,
+ const struct regulator_desc *desc,
+ struct regulator_config *cfg)
+{
+ struct rtq2134_regulator_desc *rdesc =
+ (struct rtq2134_regulator_desc *)desc;
+ int rid = desc->id;
+ bool uv_shutdown, vsel_dvs;
+ unsigned int val;
+ int ret;
+
+ vsel_dvs = of_property_read_bool(np, "richtek,use-vsel-dvs");
+ if (vsel_dvs)
+ val = RTQ2134_BUCK_VSEL_CTRL;
+ else
+ val = RTQ2134_BUCK_DVS0_CTRL;
+
+ ret = regmap_update_bits(cfg->regmap, rdesc->dvs_ctrl_reg,
+ RTQ2134_BUCKDVS_CTRL_MASK, val);
+ if (ret)
+ return ret;
+
+ uv_shutdown = of_property_read_bool(np, "richtek,uv-shutdown");
+ if (uv_shutdown)
+ val = 0;
+ else
+ val = RTQ2134_UVHICCUP_MASK;
+
+ return regmap_update_bits(cfg->regmap, RTQ2134_REG_FLT_BUCKCTRL(rid),
+ RTQ2134_UVHICCUP_MASK, val);
+}
+
+static const unsigned int rtq2134_buck_ramp_delay_table[] = {
+ 0, 16000, 0, 8000, 4000, 2000, 1000, 500
+};
+
+#define RTQ2134_BUCK_DESC(_id) { \
+ .desc = { \
+ .name = "rtq2134_buck" #_id, \
+ .of_match = of_match_ptr("buck" #_id), \
+ .regulators_node = of_match_ptr("regulators"), \
+ .id = RTQ2134_IDX_BUCK##_id, \
+ .type = REGULATOR_VOLTAGE, \
+ .owner = THIS_MODULE, \
+ .ops = &rtq2134_buck_ops, \
+ .n_voltages = RTQ2134_VOUT_MAXNUM, \
+ .linear_ranges = rtq2134_buck_vout_ranges, \
+ .n_linear_ranges = ARRAY_SIZE(rtq2134_buck_vout_ranges), \
+ .vsel_reg = RTQ2134_REG_BUCK##_id##_DVS0CFG1, \
+ .vsel_mask = RTQ2134_VOUT_MASK, \
+ .enable_reg = RTQ2134_REG_BUCK##_id##_DVS0CFG0, \
+ .enable_mask = RTQ2134_VOUTEN_MASK, \
+ .active_discharge_reg = RTQ2134_REG_BUCK##_id##_CFG0, \
+ .active_discharge_mask = RTQ2134_ACTDISCHG_MASK, \
+ .ramp_reg = RTQ2134_REG_BUCK##_id##_RSPCFG, \
+ .ramp_mask = RTQ2134_RSPUP_MASK, \
+ .ramp_delay_table = rtq2134_buck_ramp_delay_table, \
+ .n_ramp_values = ARRAY_SIZE(rtq2134_buck_ramp_delay_table), \
+ .of_map_mode = rtq2134_buck_of_map_mode, \
+ .of_parse_cb = rtq2134_buck_of_parse_cb, \
+ }, \
+ .mode_reg = RTQ2134_REG_BUCK##_id##_DVS0CFG0, \
+ .mode_mask = RTQ2134_FCCM_MASK, \
+ .suspend_mode_reg = RTQ2134_REG_BUCK##_id##_DVS1CFG0, \
+ .suspend_mode_mask = RTQ2134_FCCM_MASK, \
+ .suspend_enable_reg = RTQ2134_REG_BUCK##_id##_DVS1CFG0, \
+ .suspend_enable_mask = RTQ2134_VOUTEN_MASK, \
+ .suspend_vsel_reg = RTQ2134_REG_BUCK##_id##_DVS1CFG1, \
+ .suspend_vsel_mask = RTQ2134_VOUT_MASK, \
+ .dvs_ctrl_reg = RTQ2134_REG_BUCK##_id##_DVSCFG, \
+}
+
+static const struct rtq2134_regulator_desc rtq2134_regulator_descs[] = {
+ RTQ2134_BUCK_DESC(1),
+ RTQ2134_BUCK_DESC(2),
+ RTQ2134_BUCK_DESC(3)
+};
+
+static bool rtq2134_is_accissible_reg(struct device *dev, unsigned int reg)
+{
+ if (reg >= RTQ2134_REG_IO_CHIPNAME && reg <= RTQ2134_REG_BUCK3_SLEWCTRL)
+ return true;
+ return false;
+}
+
+static const struct regmap_config rtq2134_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .max_register = RTQ2134_REG_BUCK3_SLEWCTRL,
+
+ .readable_reg = rtq2134_is_accissible_reg,
+ .writeable_reg = rtq2134_is_accissible_reg,
+};
+
+static int rtq2134_probe(struct i2c_client *i2c)
+{
+ struct regmap *regmap;
+ struct regulator_dev *rdev;
+ struct regulator_config regulator_cfg = {};
+ int i;
+
+ regmap = devm_regmap_init_i2c(i2c, &rtq2134_regmap_config);
+ if (IS_ERR(regmap)) {
+ dev_err(&i2c->dev, "Failed to allocate regmap\n");
+ return PTR_ERR(regmap);
+ }
+
+ regulator_cfg.dev = &i2c->dev;
+ regulator_cfg.regmap = regmap;
+ for (i = 0; i < ARRAY_SIZE(rtq2134_regulator_descs); i++) {
+ rdev = devm_regulator_register(&i2c->dev,
+ &rtq2134_regulator_descs[i].desc,
+ ®ulator_cfg);
+ if (IS_ERR(rdev)) {
+ dev_err(&i2c->dev, "Failed to init %d regulator\n", i);
+ return PTR_ERR(rdev);
+ }
+ }
+
+ return 0;
+}
+
+static const struct of_device_id __maybe_unused rtq2134_device_tables[] = {
+ { .compatible = "richtek,rtq2134", },
+ {}
+};
+MODULE_DEVICE_TABLE(of, rtq2134_device_tables);
+
+static struct i2c_driver rtq2134_driver = {
+ .driver = {
+ .name = "rtq2134",
+ .of_match_table = rtq2134_device_tables,
+ },
+ .probe_new = rtq2134_probe,
+};
+module_i2c_driver(rtq2134_driver);
+
+MODULE_AUTHOR("ChiYuan Huang <cy_huang@richtek.com>");
+MODULE_DESCRIPTION("Richtek RTQ2134 Regulator Driver");
+MODULE_LICENSE("GPL v2");
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0+
+
+#include <linux/bitops.h>
+#include <linux/delay.h>
+#include <linux/gpio/consumer.h>
+#include <linux/i2c.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/regmap.h>
+#include <linux/regulator/driver.h>
+
+enum {
+ RTQ6752_IDX_PAVDD = 0,
+ RTQ6752_IDX_NAVDD = 1,
+ RTQ6752_IDX_MAX
+};
+
+#define RTQ6752_REG_PAVDD 0x00
+#define RTQ6752_REG_NAVDD 0x01
+#define RTQ6752_REG_PAVDDONDLY 0x07
+#define RTQ6752_REG_PAVDDSSTIME 0x08
+#define RTQ6752_REG_NAVDDONDLY 0x0D
+#define RTQ6752_REG_NAVDDSSTIME 0x0E
+#define RTQ6752_REG_OPTION1 0x12
+#define RTQ6752_REG_CHSWITCH 0x16
+#define RTQ6752_REG_FAULT 0x1D
+
+#define RTQ6752_VOUT_MASK GENMASK(5, 0)
+#define RTQ6752_NAVDDEN_MASK BIT(3)
+#define RTQ6752_PAVDDEN_MASK BIT(0)
+#define RTQ6752_PAVDDAD_MASK BIT(4)
+#define RTQ6752_NAVDDAD_MASK BIT(3)
+#define RTQ6752_PAVDDF_MASK BIT(3)
+#define RTQ6752_NAVDDF_MASK BIT(0)
+#define RTQ6752_ENABLE_MASK (BIT(RTQ6752_IDX_MAX) - 1)
+
+#define RTQ6752_VOUT_MINUV 5000000
+#define RTQ6752_VOUT_STEPUV 50000
+#define RTQ6752_VOUT_NUM 47
+#define RTQ6752_I2CRDY_TIMEUS 1000
+#define RTQ6752_MINSS_TIMEUS 5000
+
+struct rtq6752_priv {
+ struct regmap *regmap;
+ struct gpio_desc *enable_gpio;
+ struct mutex lock;
+ unsigned char enable_flag;
+};
+
+static int rtq6752_set_vdd_enable(struct regulator_dev *rdev)
+{
+ struct rtq6752_priv *priv = rdev_get_drvdata(rdev);
+ int rid = rdev_get_id(rdev), ret;
+
+ mutex_lock(&priv->lock);
+ if (priv->enable_gpio) {
+ gpiod_set_value(priv->enable_gpio, 1);
+
+ usleep_range(RTQ6752_I2CRDY_TIMEUS,
+ RTQ6752_I2CRDY_TIMEUS + 100);
+ }
+
+ if (!priv->enable_flag) {
+ regcache_cache_only(priv->regmap, false);
+ ret = regcache_sync(priv->regmap);
+ if (ret) {
+ mutex_unlock(&priv->lock);
+ return ret;
+ }
+ }
+
+ priv->enable_flag |= BIT(rid);
+ mutex_unlock(&priv->lock);
+
+ return regulator_enable_regmap(rdev);
+}
+
+static int rtq6752_set_vdd_disable(struct regulator_dev *rdev)
+{
+ struct rtq6752_priv *priv = rdev_get_drvdata(rdev);
+ int rid = rdev_get_id(rdev), ret;
+
+ ret = regulator_disable_regmap(rdev);
+ if (ret)
+ return ret;
+
+ mutex_lock(&priv->lock);
+ priv->enable_flag &= ~BIT(rid);
+
+ if (!priv->enable_flag) {
+ regcache_cache_only(priv->regmap, true);
+ regcache_mark_dirty(priv->regmap);
+ }
+
+ if (priv->enable_gpio)
+ gpiod_set_value(priv->enable_gpio, 0);
+
+ mutex_unlock(&priv->lock);
+
+ return 0;
+}
+
+static int rtq6752_get_error_flags(struct regulator_dev *rdev,
+ unsigned int *flags)
+{
+ unsigned int val, events = 0;
+ const unsigned int fault_mask[] = {
+ RTQ6752_PAVDDF_MASK, RTQ6752_NAVDDF_MASK };
+ int rid = rdev_get_id(rdev), ret;
+
+ ret = regmap_read(rdev->regmap, RTQ6752_REG_FAULT, &val);
+ if (ret)
+ return ret;
+
+ if (val & fault_mask[rid])
+ events = REGULATOR_ERROR_REGULATION_OUT;
+
+ *flags = events;
+ return 0;
+}
+
+static const struct regulator_ops rtq6752_regulator_ops = {
+ .list_voltage = regulator_list_voltage_linear,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+ .enable = rtq6752_set_vdd_enable,
+ .disable = rtq6752_set_vdd_disable,
+ .is_enabled = regulator_is_enabled_regmap,
+ .set_active_discharge = regulator_set_active_discharge_regmap,
+ .get_error_flags = rtq6752_get_error_flags,
+};
+
+static const struct regulator_desc rtq6752_regulator_descs[] = {
+ {
+ .name = "rtq6752-pavdd",
+ .of_match = of_match_ptr("pavdd"),
+ .regulators_node = of_match_ptr("regulators"),
+ .id = RTQ6752_IDX_PAVDD,
+ .n_voltages = RTQ6752_VOUT_NUM,
+ .ops = &rtq6752_regulator_ops,
+ .owner = THIS_MODULE,
+ .min_uV = RTQ6752_VOUT_MINUV,
+ .uV_step = RTQ6752_VOUT_STEPUV,
+ .enable_time = RTQ6752_MINSS_TIMEUS,
+ .vsel_reg = RTQ6752_REG_PAVDD,
+ .vsel_mask = RTQ6752_VOUT_MASK,
+ .enable_reg = RTQ6752_REG_CHSWITCH,
+ .enable_mask = RTQ6752_PAVDDEN_MASK,
+ .active_discharge_reg = RTQ6752_REG_OPTION1,
+ .active_discharge_mask = RTQ6752_PAVDDAD_MASK,
+ .active_discharge_off = RTQ6752_PAVDDAD_MASK,
+ },
+ {
+ .name = "rtq6752-navdd",
+ .of_match = of_match_ptr("navdd"),
+ .regulators_node = of_match_ptr("regulators"),
+ .id = RTQ6752_IDX_NAVDD,
+ .n_voltages = RTQ6752_VOUT_NUM,
+ .ops = &rtq6752_regulator_ops,
+ .owner = THIS_MODULE,
+ .min_uV = RTQ6752_VOUT_MINUV,
+ .uV_step = RTQ6752_VOUT_STEPUV,
+ .enable_time = RTQ6752_MINSS_TIMEUS,
+ .vsel_reg = RTQ6752_REG_NAVDD,
+ .vsel_mask = RTQ6752_VOUT_MASK,
+ .enable_reg = RTQ6752_REG_CHSWITCH,
+ .enable_mask = RTQ6752_NAVDDEN_MASK,
+ .active_discharge_reg = RTQ6752_REG_OPTION1,
+ .active_discharge_mask = RTQ6752_NAVDDAD_MASK,
+ .active_discharge_off = RTQ6752_NAVDDAD_MASK,
+ }
+};
+
+static int rtq6752_init_device_properties(struct rtq6752_priv *priv)
+{
+ u8 raw_vals[] = { 0, 0 };
+ int ret;
+
+ /* Configure PAVDD on and softstart delay time to the minimum */
+ ret = regmap_raw_write(priv->regmap, RTQ6752_REG_PAVDDONDLY, raw_vals,
+ ARRAY_SIZE(raw_vals));
+ if (ret)
+ return ret;
+
+ /* Configure NAVDD on and softstart delay time to the minimum */
+ return regmap_raw_write(priv->regmap, RTQ6752_REG_NAVDDONDLY, raw_vals,
+ ARRAY_SIZE(raw_vals));
+}
+
+static bool rtq6752_is_volatile_reg(struct device *dev, unsigned int reg)
+{
+ if (reg == RTQ6752_REG_FAULT)
+ return true;
+ return false;
+}
+
+static const struct reg_default rtq6752_reg_defaults[] = {
+ { RTQ6752_REG_PAVDD, 0x14 },
+ { RTQ6752_REG_NAVDD, 0x14 },
+ { RTQ6752_REG_PAVDDONDLY, 0x01 },
+ { RTQ6752_REG_PAVDDSSTIME, 0x01 },
+ { RTQ6752_REG_NAVDDONDLY, 0x01 },
+ { RTQ6752_REG_NAVDDSSTIME, 0x01 },
+ { RTQ6752_REG_OPTION1, 0x07 },
+ { RTQ6752_REG_CHSWITCH, 0x29 },
+};
+
+static const struct regmap_config rtq6752_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .cache_type = REGCACHE_RBTREE,
+ .max_register = RTQ6752_REG_FAULT,
+ .reg_defaults = rtq6752_reg_defaults,
+ .num_reg_defaults = ARRAY_SIZE(rtq6752_reg_defaults),
+ .volatile_reg = rtq6752_is_volatile_reg,
+};
+
+static int rtq6752_probe(struct i2c_client *i2c)
+{
+ struct rtq6752_priv *priv;
+ struct regulator_config reg_cfg = {};
+ struct regulator_dev *rdev;
+ int i, ret;
+
+ priv = devm_kzalloc(&i2c->dev, sizeof(*priv), GFP_KERNEL);
+ if (!priv)
+ return -ENOMEM;
+
+ mutex_init(&priv->lock);
+
+ priv->enable_gpio = devm_gpiod_get_optional(&i2c->dev, "enable",
+ GPIOD_OUT_HIGH);
+ if (IS_ERR(priv->enable_gpio)) {
+ dev_err(&i2c->dev, "Failed to get 'enable' gpio\n");
+ return PTR_ERR(priv->enable_gpio);
+ }
+
+ usleep_range(RTQ6752_I2CRDY_TIMEUS, RTQ6752_I2CRDY_TIMEUS + 100);
+ /* Default EN pin to high, PAVDD and NAVDD will be on */
+ priv->enable_flag = RTQ6752_ENABLE_MASK;
+
+ priv->regmap = devm_regmap_init_i2c(i2c, &rtq6752_regmap_config);
+ if (IS_ERR(priv->regmap)) {
+ dev_err(&i2c->dev, "Failed to init regmap\n");
+ return PTR_ERR(priv->regmap);
+ }
+
+ ret = rtq6752_init_device_properties(priv);
+ if (ret) {
+ dev_err(&i2c->dev, "Failed to init device properties\n");
+ return ret;
+ }
+
+ reg_cfg.dev = &i2c->dev;
+ reg_cfg.regmap = priv->regmap;
+ reg_cfg.driver_data = priv;
+
+ for (i = 0; i < ARRAY_SIZE(rtq6752_regulator_descs); i++) {
+ rdev = devm_regulator_register(&i2c->dev,
+ rtq6752_regulator_descs + i,
+ ®_cfg);
+ if (IS_ERR(rdev)) {
+ dev_err(&i2c->dev, "Failed to init %d regulator\n", i);
+ return PTR_ERR(rdev);
+ }
+ }
+
+ return 0;
+}
+
+static const struct of_device_id __maybe_unused rtq6752_device_table[] = {
+ { .compatible = "richtek,rtq6752", },
+ {}
+};
+MODULE_DEVICE_TABLE(of, rtq6752_device_table);
+
+static struct i2c_driver rtq6752_driver = {
+ .driver = {
+ .name = "rtq6752",
+ .of_match_table = rtq6752_device_table,
+ },
+ .probe_new = rtq6752_probe,
+};
+module_i2c_driver(rtq6752_driver);
+
+MODULE_AUTHOR("ChiYuan Huang <cy_huang@richtek.com>");
+MODULE_DESCRIPTION("Richtek RTQ6752 Regulator Driver");
+MODULE_LICENSE("GPL v2");
#include <linux/gpio/consumer.h>
#include <linux/mfd/sy7636a.h>
-#define SY7636A_POLL_ENABLED_TIME 500
+struct sy7636a_data {
+ struct regmap *regmap;
+ struct gpio_desc *pgood_gpio;
+};
static int sy7636a_get_vcom_voltage_op(struct regulator_dev *rdev)
{
static int sy7636a_get_status(struct regulator_dev *rdev)
{
- struct sy7636a *sy7636a = rdev_get_drvdata(rdev);
+ struct sy7636a_data *data = dev_get_drvdata(rdev->dev.parent);
int ret = 0;
- ret = gpiod_get_value_cansleep(sy7636a->pgood_gpio);
+ ret = gpiod_get_value_cansleep(data->pgood_gpio);
if (ret < 0)
dev_err(&rdev->dev, "Failed to read pgood gpio: %d\n", ret);
.owner = THIS_MODULE,
.enable_reg = SY7636A_REG_OPERATION_MODE_CRL,
.enable_mask = SY7636A_OPERATION_MODE_CRL_ONOFF,
- .poll_enabled_time = SY7636A_POLL_ENABLED_TIME,
.regulators_node = of_match_ptr("regulators"),
.of_match = of_match_ptr("vcom"),
};
static int sy7636a_regulator_probe(struct platform_device *pdev)
{
- struct sy7636a *sy7636a = dev_get_drvdata(pdev->dev.parent);
+ struct regmap *regmap = dev_get_drvdata(pdev->dev.parent);
struct regulator_config config = { };
struct regulator_dev *rdev;
struct gpio_desc *gdp;
+ struct sy7636a_data *data;
int ret;
- if (!sy7636a)
+ if (!regmap)
return -EPROBE_DEFER;
- platform_set_drvdata(pdev, sy7636a);
-
- gdp = devm_gpiod_get(sy7636a->dev, "epd-pwr-good", GPIOD_IN);
+ gdp = devm_gpiod_get(pdev->dev.parent, "epd-pwr-good", GPIOD_IN);
if (IS_ERR(gdp)) {
- dev_err(sy7636a->dev, "Power good GPIO fault %ld\n", PTR_ERR(gdp));
+ dev_err(pdev->dev.parent, "Power good GPIO fault %ld\n", PTR_ERR(gdp));
return PTR_ERR(gdp);
}
- sy7636a->pgood_gpio = gdp;
+ data = devm_kzalloc(&pdev->dev, sizeof(struct sy7636a_data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->regmap = regmap;
+ data->pgood_gpio = gdp;
+
+ platform_set_drvdata(pdev, data);
- ret = regmap_write(sy7636a->regmap, SY7636A_REG_POWER_ON_DELAY_TIME, 0x0);
+ ret = regmap_write(regmap, SY7636A_REG_POWER_ON_DELAY_TIME, 0x0);
if (ret) {
- dev_err(sy7636a->dev, "Failed to initialize regulator: %d\n", ret);
+ dev_err(pdev->dev.parent, "Failed to initialize regulator: %d\n", ret);
return ret;
}
config.dev = &pdev->dev;
- config.dev->of_node = sy7636a->dev->of_node;
- config.driver_data = sy7636a;
- config.regmap = sy7636a->regmap;
+ config.dev->of_node = pdev->dev.parent->of_node;
+ config.regmap = regmap;
rdev = devm_regulator_register(&pdev->dev, &desc, &config);
if (IS_ERR(rdev)) {
- dev_err(sy7636a->dev, "Failed to register %s regulator\n",
+ dev_err(pdev->dev.parent, "Failed to register %s regulator\n",
pdev->name);
return PTR_ERR(rdev);
}
unsigned int vsel_min;
unsigned int vsel_step;
unsigned int vsel_count;
+ const struct regmap_config *config;
};
struct sy8824_device_info {
static const struct regmap_config sy8824_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
+ .num_reg_defaults_raw = 1,
+ .cache_type = REGCACHE_FLAT,
+};
+
+static const struct regmap_config sy20276_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .num_reg_defaults_raw = 2,
+ .cache_type = REGCACHE_FLAT,
};
static int sy8824_i2c_probe(struct i2c_client *client)
di->dev = dev;
di->cfg = of_device_get_match_data(dev);
- regmap = devm_regmap_init_i2c(client, &sy8824_regmap_config);
+ regmap = devm_regmap_init_i2c(client, di->cfg->config);
if (IS_ERR(regmap)) {
dev_err(dev, "Failed to allocate regmap!\n");
return PTR_ERR(regmap);
.vsel_min = 762500,
.vsel_step = 12500,
.vsel_count = 64,
+ .config = &sy8824_regmap_config,
};
static const struct sy8824_config sy8824e_cfg = {
.vsel_min = 700000,
.vsel_step = 12500,
.vsel_count = 64,
+ .config = &sy8824_regmap_config,
};
static const struct sy8824_config sy20276_cfg = {
.vsel_min = 600000,
.vsel_step = 10000,
.vsel_count = 128,
+ .config = &sy20276_regmap_config,
};
static const struct sy8824_config sy20278_cfg = {
.vsel_min = 762500,
.vsel_step = 12500,
.vsel_count = 64,
+ .config = &sy20276_regmap_config,
};
static const struct of_device_id sy8824_dt_ids[] = {
#define SY8827N_MODE (1 << 6)
#define SY8827N_VSEL1 1
#define SY8827N_CTRL 2
+#define SY8827N_ID1 3
+#define SY8827N_ID2 4
+#define SY8827N_PGOOD 5
+#define SY8827N_MAX (SY8827N_PGOOD + 1)
#define SY8827N_NVOLTAGES 64
#define SY8827N_VSELMIN 600000
return PTR_ERR_OR_ZERO(rdev);
}
+static bool sy8827n_volatile_reg(struct device *dev, unsigned int reg)
+{
+ if (reg == SY8827N_PGOOD)
+ return true;
+ return false;
+}
+
static const struct regmap_config sy8827n_regmap_config = {
.reg_bits = 8,
.val_bits = 8,
+ .volatile_reg = sy8827n_volatile_reg,
+ .num_reg_defaults_raw = SY8827N_MAX,
+ .cache_type = REGCACHE_FLAT,
};
static int sy8827n_i2c_probe(struct i2c_client *client)
rdev = devm_regulator_register(&pdev->dev, &pmic->desc[i],
&config);
- if (IS_ERR(rdev)) {
- dev_err(tps65910->dev,
- "failed to register %s regulator\n",
- pdev->name);
- return PTR_ERR(rdev);
- }
+ if (IS_ERR(rdev))
+ return dev_err_probe(tps65910->dev, PTR_ERR(rdev),
+ "failed to register %s regulator\n",
+ pdev->name);
/* Save regulator for cleanup */
pmic->rdev[i] = rdev;
struct vctrl_data {
struct regulator_dev *rdev;
struct regulator_desc desc;
- struct regulator *ctrl_reg;
bool enabled;
unsigned int min_slew_down_rate;
unsigned int ovp_threshold;
static int vctrl_get_voltage(struct regulator_dev *rdev)
{
struct vctrl_data *vctrl = rdev_get_drvdata(rdev);
- int ctrl_uV = regulator_get_voltage_rdev(vctrl->ctrl_reg->rdev);
+ int ctrl_uV;
+
+ if (!rdev->supply)
+ return -EPROBE_DEFER;
+
+ ctrl_uV = regulator_get_voltage_rdev(rdev->supply->rdev);
return vctrl_calc_output_voltage(vctrl, ctrl_uV);
}
unsigned int *selector)
{
struct vctrl_data *vctrl = rdev_get_drvdata(rdev);
- struct regulator *ctrl_reg = vctrl->ctrl_reg;
- int orig_ctrl_uV = regulator_get_voltage_rdev(ctrl_reg->rdev);
- int uV = vctrl_calc_output_voltage(vctrl, orig_ctrl_uV);
+ int orig_ctrl_uV;
+ int uV;
int ret;
+ if (!rdev->supply)
+ return -EPROBE_DEFER;
+
+ orig_ctrl_uV = regulator_get_voltage_rdev(rdev->supply->rdev);
+ uV = vctrl_calc_output_voltage(vctrl, orig_ctrl_uV);
+
if (req_min_uV >= uV || !vctrl->ovp_threshold)
/* voltage rising or no OVP */
- return regulator_set_voltage_rdev(ctrl_reg->rdev,
+ return regulator_set_voltage_rdev(rdev->supply->rdev,
vctrl_calc_ctrl_voltage(vctrl, req_min_uV),
vctrl_calc_ctrl_voltage(vctrl, req_max_uV),
PM_SUSPEND_ON);
next_uV = max_t(int, req_min_uV, uV - max_drop_uV);
next_ctrl_uV = vctrl_calc_ctrl_voltage(vctrl, next_uV);
- ret = regulator_set_voltage_rdev(ctrl_reg->rdev,
+ ret = regulator_set_voltage_rdev(rdev->supply->rdev,
next_ctrl_uV,
next_ctrl_uV,
PM_SUSPEND_ON);
err:
/* Try to go back to original voltage */
- regulator_set_voltage_rdev(ctrl_reg->rdev, orig_ctrl_uV, orig_ctrl_uV,
+ regulator_set_voltage_rdev(rdev->supply->rdev, orig_ctrl_uV, orig_ctrl_uV,
PM_SUSPEND_ON);
return ret;
unsigned int selector)
{
struct vctrl_data *vctrl = rdev_get_drvdata(rdev);
- struct regulator *ctrl_reg = vctrl->ctrl_reg;
unsigned int orig_sel = vctrl->sel;
int ret;
+ if (!rdev->supply)
+ return -EPROBE_DEFER;
+
if (selector >= rdev->desc->n_voltages)
return -EINVAL;
if (selector >= vctrl->sel || !vctrl->ovp_threshold) {
/* voltage rising or no OVP */
- ret = regulator_set_voltage_rdev(ctrl_reg->rdev,
+ ret = regulator_set_voltage_rdev(rdev->supply->rdev,
vctrl->vtable[selector].ctrl,
vctrl->vtable[selector].ctrl,
PM_SUSPEND_ON);
else
next_sel = vctrl->vtable[vctrl->sel].ovp_min_sel;
- ret = regulator_set_voltage_rdev(ctrl_reg->rdev,
+ ret = regulator_set_voltage_rdev(rdev->supply->rdev,
vctrl->vtable[next_sel].ctrl,
vctrl->vtable[next_sel].ctrl,
PM_SUSPEND_ON);
err:
if (vctrl->sel != orig_sel) {
/* Try to go back to original voltage */
- if (!regulator_set_voltage_rdev(ctrl_reg->rdev,
+ if (!regulator_set_voltage_rdev(rdev->supply->rdev,
vctrl->vtable[orig_sel].ctrl,
vctrl->vtable[orig_sel].ctrl,
PM_SUSPEND_ON))
u32 pval;
u32 vrange_ctrl[2];
- vctrl->ctrl_reg = devm_regulator_get(&pdev->dev, "ctrl");
- if (IS_ERR(vctrl->ctrl_reg))
- return PTR_ERR(vctrl->ctrl_reg);
-
ret = of_property_read_u32(np, "ovp-threshold-percent", &pval);
if (!ret) {
vctrl->ovp_threshold = pval;
return at->ctrl - bt->ctrl;
}
-static int vctrl_init_vtable(struct platform_device *pdev)
+static int vctrl_init_vtable(struct platform_device *pdev,
+ struct regulator *ctrl_reg)
{
struct vctrl_data *vctrl = platform_get_drvdata(pdev);
struct regulator_desc *rdesc = &vctrl->desc;
- struct regulator *ctrl_reg = vctrl->ctrl_reg;
struct vctrl_voltage_range *vrange_ctrl = &vctrl->vrange.ctrl;
int n_voltages;
int ctrl_uV;
static int vctrl_enable(struct regulator_dev *rdev)
{
struct vctrl_data *vctrl = rdev_get_drvdata(rdev);
- int ret = regulator_enable(vctrl->ctrl_reg);
- if (!ret)
- vctrl->enabled = true;
+ vctrl->enabled = true;
- return ret;
+ return 0;
}
static int vctrl_disable(struct regulator_dev *rdev)
{
struct vctrl_data *vctrl = rdev_get_drvdata(rdev);
- int ret = regulator_disable(vctrl->ctrl_reg);
- if (!ret)
- vctrl->enabled = false;
+ vctrl->enabled = false;
- return ret;
+ return 0;
}
static int vctrl_is_enabled(struct regulator_dev *rdev)
struct regulator_desc *rdesc;
struct regulator_config cfg = { };
struct vctrl_voltage_range *vrange_ctrl;
+ struct regulator *ctrl_reg;
int ctrl_uV;
int ret;
if (ret)
return ret;
+ ctrl_reg = devm_regulator_get(&pdev->dev, "ctrl");
+ if (IS_ERR(ctrl_reg))
+ return PTR_ERR(ctrl_reg);
+
vrange_ctrl = &vctrl->vrange.ctrl;
rdesc = &vctrl->desc;
rdesc->name = "vctrl";
rdesc->type = REGULATOR_VOLTAGE;
rdesc->owner = THIS_MODULE;
+ rdesc->supply_name = "ctrl";
- if ((regulator_get_linear_step(vctrl->ctrl_reg) == 1) ||
- (regulator_count_voltages(vctrl->ctrl_reg) == -EINVAL)) {
+ if ((regulator_get_linear_step(ctrl_reg) == 1) ||
+ (regulator_count_voltages(ctrl_reg) == -EINVAL)) {
rdesc->continuous_voltage_range = true;
rdesc->ops = &vctrl_ops_cont;
} else {
cfg.init_data = init_data;
if (!rdesc->continuous_voltage_range) {
- ret = vctrl_init_vtable(pdev);
+ ret = vctrl_init_vtable(pdev, ctrl_reg);
if (ret)
return ret;
- ctrl_uV = regulator_get_voltage_rdev(vctrl->ctrl_reg->rdev);
+ /* Use locked consumer API when not in regulator framework */
+ ctrl_uV = regulator_get_voltage(ctrl_reg);
if (ctrl_uV < 0) {
dev_err(&pdev->dev, "failed to get control voltage\n");
return ctrl_uV;
}
}
+ /* Drop ctrl-supply here in favor of regulator core managed supply */
+ devm_regulator_put(ctrl_reg);
+
vctrl->rdev = devm_regulator_register(&pdev->dev, rdesc, &cfg);
if (IS_ERR(vctrl->rdev)) {
ret = PTR_ERR(vctrl->rdev);
config RESET_MCHP_SPARX5
bool "Microchip Sparx5 reset driver"
- depends on HAS_IOMEM || COMPILE_TEST
+ depends on ARCH_SPARX5 || COMPILE_TEST
default y if SPARX5_SWITCH
select MFD_SYSCON
help
unsigned long id)
{
struct zynqmp_reset_data *priv = to_zynqmp_reset_data(rcdev);
- int val, err;
+ int err;
+ u32 val;
err = zynqmp_pm_reset_get_status(priv->data->reset_id + id, &val);
if (err)
dbio = dreq->bio;
recid = first_rec;
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
for (off = 0; off < bv.bv_len; off += blksize) {
memset(dbio, 0, sizeof (struct dasd_diag_bio));
dbio->type = rw_cmd;
end_blk = (curr_trk + 1) * recs_per_trk;
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
for (off = 0; off < bv.bv_len; off += blksize) {
if (first_blk + blk_count >= end_blk) {
cqr->proc_bytes = blk_count * blksize;
last_rec - recid + 1, cmd, basedev, blksize);
}
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
if (dasd_page_cache) {
char *copy = kmem_cache_alloc(dasd_page_cache,
GFP_DMA | __GFP_NOWARN);
idaw_dst = NULL;
idaw_len = 0;
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
seg_len = bv.bv_len;
while (seg_len) {
if (new_track) {
new_track = 1;
recid = first_rec;
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
seg_len = bv.bv_len;
while (seg_len) {
if (new_track) {
}
} else {
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
last_tidaw = itcw_add_tidaw(itcw, 0x00,
dst, bv.bv_len);
if (IS_ERR(last_tidaw)) {
idaws = idal_create_words(idaws, rawpadpage, PAGE_SIZE);
}
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
seg_len = bv.bv_len;
if (cmd == DASD_ECKD_CCW_READ_TRACK)
memset(dst, 0, seg_len);
if (private->uses_cdl == 0 || recid > 2*blk_per_trk)
ccw++;
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
for (off = 0; off < bv.bv_len; off += blksize) {
/* Skip locate record. */
if (private->uses_cdl && recid <= 2*blk_per_trk)
}
recid = first_rec;
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
if (dasd_page_cache) {
char *copy = kmem_cache_alloc(dasd_page_cache,
GFP_DMA | __GFP_NOWARN);
if (private->rdc_data.mode.bits.data_chain != 0)
ccw++;
rq_for_each_segment(bv, req, iter) {
- dst = page_address(bv.bv_page) + bv.bv_offset;
+ dst = bvec_virt(&bv);
for (off = 0; off < bv.bv_len; off += blksize) {
/* Skip locate record. */
if (private->rdc_data.mode.bits.data_chain == 0)
#include "dasd_int.h"
+static struct lock_class_key dasd_bio_compl_lkclass;
+
/*
* Allocate and register gendisk structure for device.
*/
if (base->devindex >= DASD_PER_MAJOR)
return -EBUSY;
- gdp = alloc_disk(1 << DASD_PARTN_BITS);
+ gdp = __alloc_disk_node(block->request_queue, NUMA_NO_NODE,
+ &dasd_bio_compl_lkclass);
if (!gdp)
return -ENOMEM;
/* Initialize gendisk structure. */
gdp->major = DASD_MAJOR;
gdp->first_minor = base->devindex << DASD_PARTN_BITS;
+ gdp->minors = 1 << DASD_PARTN_BITS;
gdp->fops = &dasd_device_operations;
/*
test_bit(DASD_FLAG_DEVICE_RO, &base->flags))
set_disk_ro(gdp, 1);
dasd_add_link_to_gendisk(gdp, base);
- gdp->queue = block->request_queue;
block->gdp = gdp;
set_capacity(block->gdp, 0);
device_add_disk(&base->cdev->dev, block->gdp, NULL);
else
argp = (void __user *)arg;
- if ((_IOC_DIR(cmd) != _IOC_NONE) && !arg) {
- PRINT_DEBUG("empty data ptr");
+ if ((_IOC_DIR(cmd) != _IOC_NONE) && !arg)
return -EINVAL;
- }
base = dasd_device_from_gendisk(bdev->bd_disk);
if (!base)
index = (bio->bi_iter.bi_sector >> 3);
bio_for_each_segment(bvec, bio, iter) {
- page_addr = (unsigned long)
- page_address(bvec.bv_page) + bvec.bv_offset;
+ page_addr = (unsigned long)bvec_virt(&bvec);
source_addr = dev_info->start + (index<<12) + bytes_done;
if (unlikely((page_addr & 4095) != 0) || (bvec.bv_len & 4095) != 0)
// More paranoia.
#include <linux/platform_device.h>
#include <asm/types.h>
#include <asm/irq.h>
+#include <asm/debug.h>
#include "sclp.h"
#define SCLP_HEADER "sclp: "
+struct sclp_trace_entry {
+ char id[4];
+ u32 a;
+ u64 b;
+};
+
+#define SCLP_TRACE_ENTRY_SIZE sizeof(struct sclp_trace_entry)
+#define SCLP_TRACE_MAX_SIZE 128
+#define SCLP_TRACE_EVENT_MAX_SIZE 64
+
+/* Debug trace area intended for all entries in abbreviated form. */
+DEFINE_STATIC_DEBUG_INFO(sclp_debug, "sclp", 8, 1, SCLP_TRACE_ENTRY_SIZE,
+ &debug_hex_ascii_view);
+
+/* Error trace area intended for full entries relating to failed requests. */
+DEFINE_STATIC_DEBUG_INFO(sclp_debug_err, "sclp_err", 4, 1,
+ SCLP_TRACE_ENTRY_SIZE, &debug_hex_ascii_view);
+
/* Lock to protect internal data consistency. */
static DEFINE_SPINLOCK(sclp_lock);
/* Number of times the console dropped buffer pages */
unsigned long sclp_console_full;
+/* The currently active SCLP command word. */
+static sclp_cmdw_t active_cmd;
+
+static inline void sclp_trace(int prio, char *id, u32 a, u64 b, bool err)
+{
+ struct sclp_trace_entry e;
+
+ memset(&e, 0, sizeof(e));
+ strncpy(e.id, id, sizeof(e.id));
+ e.a = a;
+ e.b = b;
+ debug_event(&sclp_debug, prio, &e, sizeof(e));
+ if (err)
+ debug_event(&sclp_debug_err, 0, &e, sizeof(e));
+}
+
+static inline int no_zeroes_len(void *data, int len)
+{
+ char *d = data;
+
+ /* Minimize trace area usage by not tracing trailing zeroes. */
+ while (len > SCLP_TRACE_ENTRY_SIZE && d[len - 1] == 0)
+ len--;
+
+ return len;
+}
+
+static inline void sclp_trace_bin(int prio, void *d, int len, int errlen)
+{
+ debug_event(&sclp_debug, prio, d, no_zeroes_len(d, len));
+ if (errlen)
+ debug_event(&sclp_debug_err, 0, d, no_zeroes_len(d, errlen));
+}
+
+static inline int abbrev_len(sclp_cmdw_t cmd, struct sccb_header *sccb)
+{
+ struct evbuf_header *evbuf = (struct evbuf_header *)(sccb + 1);
+ int len = sccb->length, limit = SCLP_TRACE_MAX_SIZE;
+
+ /* Full SCCB tracing if debug level is set to max. */
+ if (sclp_debug.level == DEBUG_MAX_LEVEL)
+ return len;
+
+ /* Minimal tracing for console writes. */
+ if (cmd == SCLP_CMDW_WRITE_EVENT_DATA &&
+ (evbuf->type == EVTYP_MSG || evbuf->type == EVTYP_VT220MSG))
+ limit = SCLP_TRACE_ENTRY_SIZE;
+
+ return min(len, limit);
+}
+
+static inline void sclp_trace_sccb(int prio, char *id, u32 a, u64 b,
+ sclp_cmdw_t cmd, struct sccb_header *sccb,
+ bool err)
+{
+ sclp_trace(prio, id, a, b, err);
+ if (sccb) {
+ sclp_trace_bin(prio + 1, sccb, abbrev_len(cmd, sccb),
+ err ? sccb->length : 0);
+ }
+}
+
+static inline void sclp_trace_evbuf(int prio, char *id, u32 a, u64 b,
+ struct evbuf_header *evbuf, bool err)
+{
+ sclp_trace(prio, id, a, b, err);
+ sclp_trace_bin(prio + 1, evbuf,
+ min((int)evbuf->length, (int)SCLP_TRACE_EVENT_MAX_SIZE),
+ err ? evbuf->length : 0);
+}
+
+static inline void sclp_trace_req(int prio, char *id, struct sclp_req *req,
+ bool err)
+{
+ struct sccb_header *sccb = req->sccb;
+ union {
+ struct {
+ u16 status;
+ u16 response;
+ u16 timeout;
+ u16 start_count;
+ };
+ u64 b;
+ } summary;
+
+ summary.status = req->status;
+ summary.response = sccb ? sccb->response_code : 0;
+ summary.timeout = (u16)req->queue_timeout;
+ summary.start_count = (u16)req->start_count;
+
+ sclp_trace(prio, id, (u32)(addr_t)sccb, summary.b, err);
+}
+
+static inline void sclp_trace_register(int prio, char *id, u32 a, u64 b,
+ struct sclp_register *reg)
+{
+ struct {
+ u64 receive;
+ u64 send;
+ } d;
+
+ d.receive = reg->receive_mask;
+ d.send = reg->send_mask;
+
+ sclp_trace(prio, id, a, b, false);
+ sclp_trace_bin(prio, &d, sizeof(d), 0);
+}
+
static int __init sclp_setup_console_pages(char *str)
{
int pages, rc;
{
unsigned long flags;
+ /* TMO: A timeout occurred (a=force_restart) */
+ sclp_trace(2, "TMO", force_restart, 0, true);
+
spin_lock_irqsave(&sclp_lock, flags);
if (force_restart) {
if (sclp_running_state == sclp_running_state_running) {
do {
req = __sclp_req_queue_remove_expired_req();
+
+ if (req) {
+ /* RQTM: Request timed out (a=sccb, b=summary) */
+ sclp_trace_req(2, "RQTM", req, true);
+ }
+
if (req && req->callback)
req->callback(req, req->callback_data);
} while (req);
spin_unlock_irqrestore(&sclp_lock, flags);
}
+static int sclp_service_call_trace(sclp_cmdw_t command, void *sccb)
+{
+ static u64 srvc_count;
+ int rc;
+
+ /* SRV1: Service call about to be issued (a=command, b=sccb address) */
+ sclp_trace_sccb(0, "SRV1", command, (u64)sccb, command, sccb, false);
+
+ rc = sclp_service_call(command, sccb);
+
+ /* SRV2: Service call was issued (a=rc, b=SRVC sequence number) */
+ sclp_trace(0, "SRV2", -rc, ++srvc_count, rc != 0);
+
+ if (rc == 0)
+ active_cmd = command;
+
+ return rc;
+}
+
/* Try to start a request. Return zero if the request was successfully
* started or if it will be started at a later time. Return non-zero otherwise.
* Called while sclp_lock is locked. */
if (sclp_running_state != sclp_running_state_idle)
return 0;
del_timer(&sclp_request_timer);
- rc = sclp_service_call(req->command, req->sccb);
+ rc = sclp_service_call_trace(req->command, req->sccb);
req->start_count++;
if (rc == 0) {
}
/* Post-processing for aborted request */
list_del(&req->list);
+
+ /* RQAB: Request aborted (a=sccb, b=summary) */
+ sclp_trace_req(2, "RQAB", req, true);
+
if (req->callback) {
spin_unlock_irqrestore(&sclp_lock, flags);
req->callback(req, req->callback_data);
spin_unlock_irqrestore(&sclp_lock, flags);
return -EIO;
}
+
+ /* RQAD: Request was added (a=sccb, b=caller) */
+ sclp_trace(2, "RQAD", (u32)(addr_t)req->sccb, _RET_IP_, false);
+
req->status = SCLP_REQ_QUEUED;
req->start_count = 0;
list_add_tail(&req->list, &sclp_req_queue);
else
reg = NULL;
}
+
+ /* EVNT: Event callback (b=receiver) */
+ sclp_trace_evbuf(2, "EVNT", 0, reg ? (u64)reg->receiver_fn : 0,
+ evbuf, !reg);
+
if (reg && reg->receiver_fn) {
spin_unlock_irqrestore(&sclp_lock, flags);
reg->receiver_fn(evbuf);
return NULL;
}
+static bool ok_response(u32 sccb_int, sclp_cmdw_t cmd)
+{
+ struct sccb_header *sccb = (struct sccb_header *)(addr_t)sccb_int;
+ struct evbuf_header *evbuf;
+ u16 response;
+
+ if (!sccb)
+ return true;
+
+ /* Check SCCB response. */
+ response = sccb->response_code & 0xff;
+ if (response != 0x10 && response != 0x20)
+ return false;
+
+ /* Check event-processed flag on outgoing events. */
+ if (cmd == SCLP_CMDW_WRITE_EVENT_DATA) {
+ evbuf = (struct evbuf_header *)(sccb + 1);
+ if (!(evbuf->flags & 0x80))
+ return false;
+ }
+
+ return true;
+}
+
/* Handler for external interruption. Perform request post-processing.
* Prepare read event data request if necessary. Start processing of next
* request on queue. */
spin_lock(&sclp_lock);
finished_sccb = param32 & 0xfffffff8;
evbuf_pending = param32 & 0x3;
+
+ /* INT: Interrupt received (a=intparm, b=cmd) */
+ sclp_trace_sccb(0, "INT", param32, active_cmd, active_cmd,
+ (struct sccb_header *)(addr_t)finished_sccb,
+ !ok_response(finished_sccb, active_cmd));
+
if (finished_sccb) {
del_timer(&sclp_request_timer);
sclp_running_state = sclp_running_state_reset_pending;
/* Request post-processing */
list_del(&req->list);
req->status = SCLP_REQ_DONE;
+
+ /* RQOK: Request success (a=sccb, b=summary) */
+ sclp_trace_req(2, "RQOK", req, false);
+
if (req->callback) {
spin_unlock(&sclp_lock);
req->callback(req, req->callback_data);
spin_lock(&sclp_lock);
}
+ } else {
+ /* UNEX: Unexpected SCCB completion (a=sccb address) */
+ sclp_trace(0, "UNEX", finished_sccb, 0, true);
}
sclp_running_state = sclp_running_state_idle;
+ active_cmd = 0;
}
if (evbuf_pending &&
sclp_activation_state == sclp_activation_state_active)
unsigned long long old_tick;
unsigned long flags;
unsigned long cr0, cr0_sync;
+ static u64 sync_count;
u64 timeout;
int irq_context;
+ /* SYN1: Synchronous wait start (a=runstate, b=sync count) */
+ sclp_trace(4, "SYN1", sclp_running_state, ++sync_count, false);
+
/* We'll be disabling timer interrupts, so we need a custom timeout
* mechanism */
timeout = 0;
_local_bh_enable();
local_tick_enable(old_tick);
local_irq_restore(flags);
+
+ /* SYN2: Synchronous wait end (a=runstate, b=sync_count) */
+ sclp_trace(4, "SYN2", sclp_running_state, sync_count, false);
}
EXPORT_SYMBOL(sclp_sync_wait);
reg = NULL;
}
spin_unlock_irqrestore(&sclp_lock, flags);
- if (reg && reg->state_change_fn)
+ if (reg && reg->state_change_fn) {
+ /* STCG: State-change callback (b=callback) */
+ sclp_trace(2, "STCG", 0, (u64)reg->state_change_fn,
+ false);
+
reg->state_change_fn(reg);
+ }
} while (reg);
}
sccb_mask_t send_mask;
int rc;
+ /* REG: Event listener registered (b=caller) */
+ sclp_trace_register(2, "REG", 0, _RET_IP_, reg);
+
rc = sclp_init();
if (rc)
return rc;
{
unsigned long flags;
+ /* UREG: Event listener unregistered (b=caller) */
+ sclp_trace_register(2, "UREG", 0, _RET_IP_, reg);
+
spin_lock_irqsave(&sclp_lock, flags);
list_del(®->list);
spin_unlock_irqrestore(&sclp_lock, flags);
for (retry = 0; retry <= SCLP_INIT_RETRY; retry++) {
__sclp_make_init_req(0, 0);
sccb = (struct init_sccb *) sclp_init_req.sccb;
- rc = sclp_service_call(sclp_init_req.command, sccb);
+ rc = sclp_service_call_trace(sclp_init_req.command, sccb);
if (rc == -EIO)
break;
sclp_init_req.status = SCLP_REQ_RUNNING;
extern unsigned long sclp_console_full;
extern bool sclp_mask_compat_mode;
-extern char *sclp_early_sccb;
-
void sclp_early_wait_irq(void);
int sclp_early_cmd(sclp_cmdw_t cmd, void *sccb);
unsigned int sclp_early_con_check_linemode(struct init_sccb *sccb);
struct read_storage_sccb *sccb;
int i, id, assigned, rc;
- if (OLDMEM_BASE) /* No standby memory in kdump mode */
+ if (oldmem_data.start) /* No standby memory in kdump mode */
return 0;
if ((sclp.facilities & 0xe00000000000ULL) != 0xe00000000000ULL)
return 0;
s390_update_cpu_mhz();
pr_info("CPU capability may have changed\n");
- get_online_cpus();
+ cpus_read_lock();
for_each_online_cpu(cpu) {
dev = get_cpu_device(cpu);
kobject_uevent(&dev->kobj, KOBJ_CHANGE);
}
- put_online_cpus();
+ cpus_read_unlock();
}
static void __ref sclp_cpu_change_notify(struct work_struct *work)
static struct read_info_sccb __bootdata(sclp_info_sccb);
static int __bootdata(sclp_info_sccb_valid);
-char *sclp_early_sccb = (char *) EARLY_SCCB_OFFSET;
+char *__bootdata(sclp_early_sccb);
int sclp_init_state = sclp_init_state_uninitialized;
/*
* Used to keep track of the size of the event masks. Qemu until version 2.11
return rc;
}
+void sclp_early_set_buffer(void *sccb)
+{
+ sclp_early_sccb = sccb;
+}
+
/*
* Output one or more lines of text on the SCLP console (VT220 and /
* or line-mode).
__sclp_early_printk(str, strlen(str));
}
+/*
+ * We can't pass sclp_info_sccb to sclp_early_cmd() here directly,
+ * because it might not fulfil the requiremets for a SCLP communication buffer:
+ * - lie below 2G in memory
+ * - be page-aligned
+ * Therefore, we use the buffer sclp_early_sccb (which fulfils all those
+ * requirements) temporarily for communication and copy a received response
+ * back into the buffer sclp_info_sccb upon successful completion.
+ */
int __init sclp_early_read_info(void)
{
int i;
int length = test_facility(140) ? EXT_SCCB_READ_SCP : PAGE_SIZE;
- struct read_info_sccb *sccb = &sclp_info_sccb;
+ struct read_info_sccb *sccb = (struct read_info_sccb *)sclp_early_sccb;
sclp_cmdw_t commands[] = {SCLP_CMDW_READ_SCP_INFO_FORCED,
SCLP_CMDW_READ_SCP_INFO};
if (sclp_early_cmd(commands[i], sccb))
break;
if (sccb->header.response_code == 0x10) {
+ memcpy(&sclp_info_sccb, sccb, length);
sclp_info_sccb_valid = 1;
return 0;
}
if (!is_ipl_type_dump())
return -ENODATA;
- if (OLDMEM_BASE)
+ if (oldmem_data.start)
return -ENODATA;
zcore_dbf = debug_register("zcore", 4, 1, 4 * sizeof(long));
}
static DEVICE_ATTR_RO(pimpampom);
+static ssize_t dev_busid_show(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct subchannel *sch = to_subchannel(dev);
+ struct pmcw *pmcw = &sch->schib.pmcw;
+
+ if ((pmcw->st == SUBCHANNEL_TYPE_IO ||
+ pmcw->st == SUBCHANNEL_TYPE_MSG) && pmcw->dnv)
+ return sysfs_emit(buf, "0.%x.%04x\n", sch->schid.ssid,
+ pmcw->dev);
+ else
+ return sysfs_emit(buf, "none\n");
+}
+static DEVICE_ATTR_RO(dev_busid);
+
static struct attribute *io_subchannel_type_attrs[] = {
&dev_attr_chpids.attr,
&dev_attr_pimpampom.attr,
+ &dev_attr_dev_busid.attr,
NULL,
};
ATTRIBUTE_GROUPS(io_subchannel_type);
}
static DEVICE_ATTR_RO(real_cssid);
+static ssize_t rescan_store(struct device *dev, struct device_attribute *a,
+ const char *buf, size_t count)
+{
+ CIO_TRACE_EVENT(4, "usr-rescan");
+
+ css_schedule_eval_all();
+ css_complete_work();
+
+ return count;
+}
+static DEVICE_ATTR_WO(rescan);
+
static ssize_t cm_enable_show(struct device *dev, struct device_attribute *a,
char *buf)
{
static struct attribute *cssdev_attrs[] = {
&dev_attr_real_cssid.attr,
+ &dev_attr_rescan.attr,
NULL,
};
struct qdio_irq;
-struct siga_flag {
- u8 input:1;
- u8 output:1;
- u8 sync:1;
- u8 sync_after_ai:1;
- u8 sync_out_after_pci:1;
- u8:3;
-} __attribute__ ((packed));
-
struct qdio_dev_perf_stat {
unsigned int adapter_int;
unsigned int qdio_int;
- unsigned int pci_request_int;
-
- unsigned int tasklet_outbound;
unsigned int siga_read;
unsigned int siga_write;
unsigned int stop_polling;
unsigned int inbound_queue_full;
unsigned int outbound_call;
- unsigned int outbound_handler;
unsigned int outbound_queue_full;
unsigned int fast_requeue;
unsigned int target_full;
};
struct qdio_output_q {
- /* PCIs are enabled for the queue */
- int pci_out_enabled;
- /* timer to check for more outbound work */
- struct timer_list timer;
- /* tasklet to check for completions */
- struct tasklet_struct tasklet;
};
/*
unsigned long sch_token; /* QEBSM facility */
enum qdio_irq_states state;
-
- struct siga_flag siga_flag; /* siga sync information from qdioac */
+ u8 qdioac1;
int nr_input_qs;
int nr_output_qs;
struct qdio_ssqd_desc ssqd_desc;
void (*orig_handler) (struct ccw_device *, unsigned long, struct irb *);
- unsigned int scan_threshold; /* used SBALs before tasklet schedule */
int perf_stat_enabled;
struct qdr *qdr;
#define pci_out_supported(irq) ((irq)->qib.ac & QIB_AC_OUTBOUND_PCI_SUPPORTED)
#define is_qebsm(q) (q->irq_ptr->sch_token != 0)
-#define need_siga_in(q) (q->irq_ptr->siga_flag.input)
-#define need_siga_out(q) (q->irq_ptr->siga_flag.output)
-#define need_siga_sync(q) (unlikely(q->irq_ptr->siga_flag.sync))
-#define need_siga_sync_after_ai(q) \
- (unlikely(q->irq_ptr->siga_flag.sync_after_ai))
-#define need_siga_sync_out_after_pci(q) \
- (unlikely(q->irq_ptr->siga_flag.sync_out_after_pci))
+#define qdio_need_siga_in(irq) ((irq)->qdioac1 & AC1_SIGA_INPUT_NEEDED)
+#define qdio_need_siga_out(irq) ((irq)->qdioac1 & AC1_SIGA_OUTPUT_NEEDED)
+#define qdio_need_siga_sync(irq) (unlikely((irq)->qdioac1 & AC1_SIGA_SYNC_NEEDED))
#define for_each_input_queue(irq_ptr, q, i) \
for (i = 0; i < irq_ptr->nr_input_qs && \
#define sub_buf(bufnr, dec) QDIO_BUFNR((bufnr) - (dec))
#define prev_buf(bufnr) sub_buf(bufnr, 1)
-#define queue_irqs_enabled(q) \
- (test_bit(QDIO_QUEUE_IRQS_DISABLED, &q->u.in.queue_irq_state) == 0)
-#define queue_irqs_disabled(q) \
- (test_bit(QDIO_QUEUE_IRQS_DISABLED, &q->u.in.queue_irq_state) != 0)
-
extern u64 last_ai_time;
/* prototypes for thin interrupt */
int test_nonshared_ind(struct qdio_irq *);
/* prototypes for setup */
-void qdio_outbound_tasklet(struct tasklet_struct *t);
-void qdio_outbound_timer(struct timer_list *t);
void qdio_int_handler(struct ccw_device *cdev, unsigned long intparm,
struct irb *irb);
int qdio_allocate_qs(struct qdio_irq *irq_ptr, int nr_input_qs,
static char *qperf_names[] = {
"Assumed adapter interrupts",
"QDIO interrupts",
- "Requested PCIs",
- "Outbound tasklet runs",
"SIGA read",
"SIGA write",
"SIGA sync",
"Inbound stop_polling",
"Inbound queue full",
"Outbound calls",
- "Outbound handler",
"Outbound queue full",
"Outbound fast_requeue",
"Outbound target_full",
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
-#include <linux/timer.h>
#include <linux/delay.h>
#include <linux/gfp.h>
#include <linux/io.h>
return (cc) ? -EIO : 0;
}
+static inline int qdio_sync_input_queue(struct qdio_q *q)
+{
+ return qdio_siga_sync(q, 0, q->mask);
+}
+
+static inline int qdio_sync_output_queue(struct qdio_q *q)
+{
+ return qdio_siga_sync(q, q->mask, 0);
+}
+
static inline int qdio_siga_sync_q(struct qdio_q *q)
{
if (q->is_input_q)
- return qdio_siga_sync(q, 0, q->mask);
+ return qdio_sync_input_queue(q);
else
- return qdio_siga_sync(q, q->mask, 0);
+ return qdio_sync_output_queue(q);
}
static int qdio_siga_output(struct qdio_q *q, unsigned int count,
return (cc) ? -EIO : 0;
}
-#define qdio_siga_sync_out(q) qdio_siga_sync(q, ~0U, 0)
-#define qdio_siga_sync_all(q) qdio_siga_sync(q, ~0U, ~0U)
-
-static inline void qdio_sync_queues(struct qdio_q *q)
-{
- /* PCI capable outbound queues will also be scanned so sync them too */
- if (pci_out_supported(q->irq_ptr))
- qdio_siga_sync_all(q);
- else
- qdio_siga_sync_q(q);
-}
-
int debug_get_buf_state(struct qdio_q *q, unsigned int bufnr,
unsigned char *state)
{
- if (need_siga_sync(q))
+ if (qdio_need_siga_sync(q->irq_ptr))
qdio_siga_sync_q(q);
return get_buf_state(q, bufnr, state, 0);
}
if (!count)
return 0;
- /*
- * No siga sync here, as a PCI or we after a thin interrupt
- * already sync'ed the queues.
- */
+ if (qdio_need_siga_sync(q->irq_ptr))
+ qdio_sync_input_queue(q);
+
count = get_buf_states(q, start, &state, count, 1);
if (!count)
return 0;
if (!atomic_read(&q->nr_buf_used))
return 1;
- if (need_siga_sync(q))
- qdio_siga_sync_q(q);
+ if (qdio_need_siga_sync(q->irq_ptr))
+ qdio_sync_input_queue(q);
get_buf_state(q, start, &state, 0);
if (state == SLSB_P_INPUT_PRIMED || state == SLSB_P_INPUT_ERROR)
return 1;
}
-static inline int qdio_tasklet_schedule(struct qdio_q *q)
-{
- if (likely(q->irq_ptr->state == QDIO_IRQ_STATE_ACTIVE)) {
- tasklet_schedule(&q->u.out.tasklet);
- return 0;
- }
- return -EPERM;
-}
-
static int get_outbound_buffer_frontier(struct qdio_q *q, unsigned int start,
unsigned int *error)
{
q->timestamp = get_tod_clock_fast();
- if (need_siga_sync(q))
- if (((queue_type(q) != QDIO_IQDIO_QFMT) &&
- !pci_out_supported(q->irq_ptr)) ||
- (queue_type(q) == QDIO_IQDIO_QFMT &&
- multicast_outbound(q)))
- qdio_siga_sync_q(q);
-
count = atomic_read(&q->nr_buf_used);
if (!count)
return 0;
+ if (qdio_need_siga_sync(q->irq_ptr))
+ qdio_sync_output_queue(q);
+
count = get_buf_states(q, start, &state, count, 0);
if (!count)
return 0;
}
}
-/* all buffers processed? */
-static inline int qdio_outbound_q_done(struct qdio_q *q)
-{
- return atomic_read(&q->nr_buf_used) == 0;
-}
-
static int qdio_kick_outbound_q(struct qdio_q *q, unsigned int count,
unsigned long aob)
{
int retries = 0, cc;
unsigned int busy_bit;
- if (!need_siga_out(q))
+ if (!qdio_need_siga_out(q->irq_ptr))
return 0;
DBF_DEV_EVENT(DBF_INFO, q->irq_ptr, "siga-w:%1d", q->nr);
return cc;
}
-void qdio_outbound_tasklet(struct tasklet_struct *t)
-{
- struct qdio_output_q *out_q = from_tasklet(out_q, t, tasklet);
- struct qdio_q *q = container_of(out_q, struct qdio_q, u.out);
- unsigned int start = q->first_to_check;
- unsigned int error = 0;
- int count;
-
- qperf_inc(q, tasklet_outbound);
- WARN_ON_ONCE(atomic_read(&q->nr_buf_used) < 0);
-
- count = get_outbound_buffer_frontier(q, start, &error);
- if (count) {
- q->first_to_check = add_buf(start, count);
-
- if (q->irq_ptr->state == QDIO_IRQ_STATE_ACTIVE) {
- qperf_inc(q, outbound_handler);
- DBF_DEV_EVENT(DBF_INFO, q->irq_ptr, "koh: s:%02x c:%02x",
- start, count);
-
- q->handler(q->irq_ptr->cdev, error, q->nr, start,
- count, q->irq_ptr->int_parm);
- }
- }
-
- if (queue_type(q) == QDIO_ZFCP_QFMT && !pci_out_supported(q->irq_ptr) &&
- !qdio_outbound_q_done(q))
- goto sched;
-
- if (q->u.out.pci_out_enabled)
- return;
-
- /*
- * Now we know that queue type is either qeth without pci enabled
- * or HiperSockets. Make sure buffer switch from PRIMED to EMPTY
- * is noticed and outbound_handler is called after some time.
- */
- if (qdio_outbound_q_done(q))
- del_timer_sync(&q->u.out.timer);
- else
- if (!timer_pending(&q->u.out.timer) &&
- likely(q->irq_ptr->state == QDIO_IRQ_STATE_ACTIVE))
- mod_timer(&q->u.out.timer, jiffies + 10 * HZ);
- return;
-
-sched:
- qdio_tasklet_schedule(q);
-}
-
-void qdio_outbound_timer(struct timer_list *t)
-{
- struct qdio_q *q = from_timer(q, t, u.out.timer);
-
- qdio_tasklet_schedule(q);
-}
-
-static inline void qdio_check_outbound_pci_queues(struct qdio_irq *irq)
-{
- struct qdio_q *out;
- int i;
-
- if (!pci_out_supported(irq) || !irq->scan_threshold)
- return;
-
- for_each_output_queue(irq, out, i)
- if (!qdio_outbound_q_done(out))
- qdio_tasklet_schedule(out);
-}
-
static inline void qdio_set_state(struct qdio_irq *irq_ptr,
enum qdio_irq_states state)
{
/* PCI interrupt handler */
static void qdio_int_handler_pci(struct qdio_irq *irq_ptr)
{
- int i;
- struct qdio_q *q;
-
if (unlikely(irq_ptr->state != QDIO_IRQ_STATE_ACTIVE))
return;
qdio_deliver_irq(irq_ptr);
irq_ptr->last_data_irq_time = S390_lowcore.int_clock;
-
- if (!pci_out_supported(irq_ptr) || !irq_ptr->scan_threshold)
- return;
-
- for_each_output_queue(irq_ptr, q, i) {
- if (qdio_outbound_q_done(q))
- continue;
- if (need_siga_sync(q) && need_siga_sync_out_after_pci(q))
- qdio_siga_sync_q(q);
- qdio_tasklet_schedule(q);
- }
}
static void qdio_handle_activate_check(struct qdio_irq *irq_ptr,
}
EXPORT_SYMBOL_GPL(qdio_get_ssqd_desc);
-static void qdio_shutdown_queues(struct qdio_irq *irq_ptr)
+static int qdio_cancel_ccw(struct qdio_irq *irq, int how)
{
- struct qdio_q *q;
- int i;
+ struct ccw_device *cdev = irq->cdev;
+ long timeout;
+ int rc;
- for_each_output_queue(irq_ptr, q, i) {
- del_timer_sync(&q->u.out.timer);
- tasklet_kill(&q->u.out.tasklet);
+ spin_lock_irq(get_ccwdev_lock(cdev));
+ qdio_set_state(irq, QDIO_IRQ_STATE_CLEANUP);
+ if (how & QDIO_FLAG_CLEANUP_USING_CLEAR)
+ rc = ccw_device_clear(cdev, QDIO_DOING_CLEANUP);
+ else
+ /* default behaviour is halt */
+ rc = ccw_device_halt(cdev, QDIO_DOING_CLEANUP);
+ spin_unlock_irq(get_ccwdev_lock(cdev));
+ if (rc) {
+ DBF_ERROR("%4x SHUTD ERR", irq->schid.sch_no);
+ DBF_ERROR("rc:%4d", rc);
+ return rc;
}
+
+ timeout = wait_event_interruptible_timeout(cdev->private->wait_q,
+ irq->state == QDIO_IRQ_STATE_INACTIVE ||
+ irq->state == QDIO_IRQ_STATE_ERR,
+ 10 * HZ);
+ if (timeout <= 0)
+ rc = (timeout == -ERESTARTSYS) ? -EINTR : -ETIME;
+
+ return rc;
}
/**
}
/*
- * Indicate that the device is going down. Scheduling the queue
- * tasklets is forbidden from here on.
+ * Indicate that the device is going down.
*/
qdio_set_state(irq_ptr, QDIO_IRQ_STATE_STOPPED);
- qdio_shutdown_queues(irq_ptr);
qdio_shutdown_debug_entries(irq_ptr);
- /* cleanup subchannel */
- spin_lock_irq(get_ccwdev_lock(cdev));
- qdio_set_state(irq_ptr, QDIO_IRQ_STATE_CLEANUP);
- if (how & QDIO_FLAG_CLEANUP_USING_CLEAR)
- rc = ccw_device_clear(cdev, QDIO_DOING_CLEANUP);
- else
- /* default behaviour is halt */
- rc = ccw_device_halt(cdev, QDIO_DOING_CLEANUP);
- spin_unlock_irq(get_ccwdev_lock(cdev));
- if (rc) {
- DBF_ERROR("%4x SHUTD ERR", irq_ptr->schid.sch_no);
- DBF_ERROR("rc:%4d", rc);
- goto no_cleanup;
- }
-
- wait_event_interruptible_timeout(cdev->private->wait_q,
- irq_ptr->state == QDIO_IRQ_STATE_INACTIVE ||
- irq_ptr->state == QDIO_IRQ_STATE_ERR,
- 10 * HZ);
-
-no_cleanup:
+ rc = qdio_cancel_ccw(irq_ptr, how);
qdio_shutdown_thinint(irq_ptr);
qdio_shutdown_irq(irq_ptr);
DBF_DEV_EVENT(DBF_ERR, irq, "qfmt:%1u", data->q_format);
DBF_DEV_EVENT(DBF_ERR, irq, "qpff%4x", data->qib_param_field_format);
DBF_DEV_HEX(irq, &data->qib_param_field, sizeof(void *), DBF_ERR);
- DBF_DEV_HEX(irq, &data->input_slib_elements, sizeof(void *), DBF_ERR);
- DBF_DEV_HEX(irq, &data->output_slib_elements, sizeof(void *), DBF_ERR);
DBF_DEV_EVENT(DBF_ERR, irq, "niq:%1u noq:%1u", data->no_input_qs,
data->no_output_qs);
DBF_DEV_HEX(irq, &data->input_handler, sizeof(void *), DBF_ERR);
{
struct qdio_irq *irq_ptr = cdev->private->qdio_data;
struct subchannel_id schid;
+ long timeout;
int rc;
ccw_device_get_schid(cdev, &schid);
qdio_setup_irq(irq_ptr, init_data);
rc = qdio_establish_thinint(irq_ptr);
- if (rc) {
- qdio_shutdown_irq(irq_ptr);
- mutex_unlock(&irq_ptr->setup_mutex);
- return rc;
- }
+ if (rc)
+ goto err_thinint;
/* establish q */
irq_ptr->ccw.cmd_code = irq_ptr->equeue.cmd;
irq_ptr->ccw.flags = CCW_FLAG_SLI;
irq_ptr->ccw.count = irq_ptr->equeue.count;
- irq_ptr->ccw.cda = (u32)((addr_t)irq_ptr->qdr);
+ irq_ptr->ccw.cda = (u32) virt_to_phys(irq_ptr->qdr);
spin_lock_irq(get_ccwdev_lock(cdev));
ccw_device_set_options_mask(cdev, 0);
if (rc) {
DBF_ERROR("%4x est IO ERR", irq_ptr->schid.sch_no);
DBF_ERROR("rc:%4x", rc);
- qdio_shutdown_thinint(irq_ptr);
- qdio_shutdown_irq(irq_ptr);
- mutex_unlock(&irq_ptr->setup_mutex);
- return rc;
+ goto err_ccw_start;
}
- wait_event_interruptible_timeout(cdev->private->wait_q,
- irq_ptr->state == QDIO_IRQ_STATE_ESTABLISHED ||
- irq_ptr->state == QDIO_IRQ_STATE_ERR, HZ);
+ timeout = wait_event_interruptible_timeout(cdev->private->wait_q,
+ irq_ptr->state == QDIO_IRQ_STATE_ESTABLISHED ||
+ irq_ptr->state == QDIO_IRQ_STATE_ERR, HZ);
+ if (timeout <= 0) {
+ rc = (timeout == -ERESTARTSYS) ? -EINTR : -ETIME;
+ goto err_ccw_timeout;
+ }
if (irq_ptr->state != QDIO_IRQ_STATE_ESTABLISHED) {
- mutex_unlock(&irq_ptr->setup_mutex);
- qdio_shutdown(cdev, QDIO_FLAG_CLEANUP_USING_CLEAR);
- return -EIO;
+ rc = -EIO;
+ goto err_ccw_error;
}
qdio_setup_ssqd_info(irq_ptr);
qdio_print_subchannel_info(irq_ptr);
qdio_setup_debug_entries(irq_ptr);
return 0;
+
+err_ccw_timeout:
+ qdio_cancel_ccw(irq_ptr, QDIO_FLAG_CLEANUP_USING_CLEAR);
+err_ccw_error:
+err_ccw_start:
+ qdio_shutdown_thinint(irq_ptr);
+err_thinint:
+ qdio_shutdown_irq(irq_ptr);
+ qdio_set_state(irq_ptr, QDIO_IRQ_STATE_INACTIVE);
+ mutex_unlock(&irq_ptr->setup_mutex);
+ return rc;
}
EXPORT_SYMBOL_GPL(qdio_establish);
/**
* handle_inbound - reset processed input buffers
* @q: queue containing the buffers
- * @callflags: flags
* @bufnr: first buffer to process
* @count: how many buffers are emptied
*/
-static int handle_inbound(struct qdio_q *q, unsigned int callflags,
- int bufnr, int count)
+static int handle_inbound(struct qdio_q *q, int bufnr, int count)
{
int overlap;
count = set_buf_states(q, bufnr, SLSB_CU_INPUT_EMPTY, count);
atomic_add(count, &q->nr_buf_used);
- if (need_siga_in(q))
+ if (qdio_need_siga_in(q->irq_ptr))
return qdio_siga_input(q);
return 0;
/**
* handle_outbound - process filled outbound buffers
* @q: queue containing the buffers
- * @callflags: flags
* @bufnr: first buffer to process
* @count: how many buffers are filled
* @aob: asynchronous operation block
*/
-static int handle_outbound(struct qdio_q *q, unsigned int callflags,
- unsigned int bufnr, unsigned int count,
+static int handle_outbound(struct qdio_q *q, unsigned int bufnr, unsigned int count,
struct qaob *aob)
{
- const unsigned int scan_threshold = q->irq_ptr->scan_threshold;
unsigned char state = 0;
int used, rc = 0;
if (used == QDIO_MAX_BUFFERS_PER_Q)
qperf_inc(q, outbound_queue_full);
- if (callflags & QDIO_FLAG_PCI_OUT) {
- q->u.out.pci_out_enabled = 1;
- qperf_inc(q, pci_request_int);
- } else
- q->u.out.pci_out_enabled = 0;
-
if (queue_type(q) == QDIO_IQDIO_QFMT) {
unsigned long phys_aob = aob ? virt_to_phys(aob) : 0;
WARN_ON_ONCE(!IS_ALIGNED(phys_aob, 256));
rc = qdio_kick_outbound_q(q, count, phys_aob);
- } else if (need_siga_sync(q)) {
- rc = qdio_siga_sync_q(q);
+ } else if (qdio_need_siga_sync(q->irq_ptr)) {
+ rc = qdio_sync_output_queue(q);
} else if (count < QDIO_MAX_BUFFERS_PER_Q &&
get_buf_state(q, prev_buf(bufnr), &state, 0) > 0 &&
state == SLSB_CU_OUTPUT_PRIMED) {
rc = qdio_kick_outbound_q(q, count, 0);
}
- /* Let drivers implement their own completion scanning: */
- if (!scan_threshold)
- return rc;
-
- /* in case of SIGA errors we must process the error immediately */
- if (used >= scan_threshold || rc)
- qdio_tasklet_schedule(q);
- else
- /* free the SBALs in case of no further traffic */
- if (!timer_pending(&q->u.out.timer) &&
- likely(q->irq_ptr->state == QDIO_IRQ_STATE_ACTIVE))
- mod_timer(&q->u.out.timer, jiffies + HZ);
return rc;
}
if (!count)
return 0;
if (callflags & QDIO_FLAG_SYNC_INPUT)
- return handle_inbound(irq_ptr->input_qs[q_nr],
- callflags, bufnr, count);
+ return handle_inbound(irq_ptr->input_qs[q_nr], bufnr, count);
else if (callflags & QDIO_FLAG_SYNC_OUTPUT)
- return handle_outbound(irq_ptr->output_qs[q_nr],
- callflags, bufnr, count, aob);
+ return handle_outbound(irq_ptr->output_qs[q_nr], bufnr, count, aob);
return -EINVAL;
}
EXPORT_SYMBOL_GPL(do_QDIO);
return -ENODEV;
q = is_input ? irq_ptr->input_qs[nr] : irq_ptr->output_qs[nr];
- if (need_siga_sync(q))
- qdio_siga_sync_q(q);
-
return __qdio_inspect_queue(q, bufnr, error);
}
EXPORT_SYMBOL_GPL(qdio_inspect_queue);
-/**
- * qdio_get_next_buffers - process input buffers
- * @cdev: associated ccw_device for the qdio subchannel
- * @nr: input queue number
- * @bufnr: first filled buffer number
- * @error: buffers are in error state
- *
- * Return codes
- * < 0 - error
- * = 0 - no new buffers found
- * > 0 - number of processed buffers
- */
-int qdio_get_next_buffers(struct ccw_device *cdev, int nr, int *bufnr,
- int *error)
-{
- struct qdio_q *q;
- struct qdio_irq *irq_ptr = cdev->private->qdio_data;
-
- if (!irq_ptr)
- return -ENODEV;
- q = irq_ptr->input_qs[nr];
-
- /*
- * Cannot rely on automatic sync after interrupt since queues may
- * also be examined without interrupt.
- */
- if (need_siga_sync(q))
- qdio_sync_queues(q);
-
- qdio_check_outbound_pci_queues(irq_ptr);
-
- /* Note: upper-layer MUST stop processing immediately here ... */
- if (unlikely(q->irq_ptr->state != QDIO_IRQ_STATE_ACTIVE))
- return -EIO;
-
- return __qdio_inspect_queue(q, bufnr, error);
-}
-EXPORT_SYMBOL(qdio_get_next_buffers);
-
/**
* qdio_stop_irq - disable interrupt processing for the device
* @cdev: associated ccw_device for the qdio subchannel
}
EXPORT_SYMBOL_GPL(qdio_reset_buffers);
-/*
- * qebsm is only available under 64bit but the adapter sets the feature
- * flag anyway, so we manually override it.
- */
-static inline int qebsm_possible(void)
-{
- return css_general_characteristics.qebsm;
-}
-
-/*
- * qib_param_field: pointer to 128 bytes or NULL, if no param field
- * nr_input_qs: pointer to nr_queues*128 words of data or NULL
- */
-static void set_impl_params(struct qdio_irq *irq_ptr,
- unsigned int qib_param_field_format,
- unsigned char *qib_param_field,
- unsigned long *input_slib_elements,
- unsigned long *output_slib_elements)
-{
- struct qdio_q *q;
- int i, j;
-
- if (!irq_ptr)
- return;
-
- irq_ptr->qib.pfmt = qib_param_field_format;
- if (qib_param_field)
- memcpy(irq_ptr->qib.parm, qib_param_field,
- sizeof(irq_ptr->qib.parm));
-
- if (!input_slib_elements)
- goto output;
-
- for_each_input_queue(irq_ptr, q, i) {
- for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; j++)
- q->slib->slibe[j].parms =
- input_slib_elements[i * QDIO_MAX_BUFFERS_PER_Q + j];
- }
-output:
- if (!output_slib_elements)
- return;
-
- for_each_output_queue(irq_ptr, q, i) {
- for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; j++)
- q->slib->slibe[j].parms =
- output_slib_elements[i * QDIO_MAX_BUFFERS_PER_Q + j];
- }
-}
-
static void __qdio_free_queues(struct qdio_q **queues, unsigned int count)
{
struct qdio_q *q;
q->is_input_q = 0;
setup_storage_lists(q, irq_ptr,
qdio_init->output_sbal_addr_array[i], i);
-
- tasklet_setup(&q->u.out.tasklet, qdio_outbound_tasklet);
- timer_setup(&q->u.out.timer, qdio_outbound_timer, 0);
}
}
-static void process_ac_flags(struct qdio_irq *irq_ptr, unsigned char qdioac)
-{
- if (qdioac & AC1_SIGA_INPUT_NEEDED)
- irq_ptr->siga_flag.input = 1;
- if (qdioac & AC1_SIGA_OUTPUT_NEEDED)
- irq_ptr->siga_flag.output = 1;
- if (qdioac & AC1_SIGA_SYNC_NEEDED)
- irq_ptr->siga_flag.sync = 1;
- if (!(qdioac & AC1_AUTOMATIC_SYNC_ON_THININT))
- irq_ptr->siga_flag.sync_after_ai = 1;
- if (!(qdioac & AC1_AUTOMATIC_SYNC_ON_OUT_PCI))
- irq_ptr->siga_flag.sync_out_after_pci = 1;
-}
-
static void check_and_setup_qebsm(struct qdio_irq *irq_ptr,
unsigned char qdioac, unsigned long token)
{
qdioac = irq_ptr->ssqd_desc.qdioac1;
check_and_setup_qebsm(irq_ptr, qdioac, irq_ptr->ssqd_desc.sch_token);
- process_ac_flags(irq_ptr, qdioac);
+ irq_ptr->qdioac1 = qdioac;
DBF_EVENT("ac 1:%2x 2:%4x", qdioac, irq_ptr->ssqd_desc.qdioac2);
DBF_EVENT("3:%4x qib:%4x", irq_ptr->ssqd_desc.qdioac3, irq_ptr->qib.ac);
}
struct qdesfmt0 *desc = &irq_ptr->qdr->qdf0[0];
int i;
+ memset(irq_ptr->qdr, 0, sizeof(struct qdr));
+
irq_ptr->qdr->qfmt = qdio_init->q_format;
irq_ptr->qdr->ac = qdio_init->qdr_ac;
irq_ptr->qdr->iqdcnt = qdio_init->no_input_qs;
static void setup_qib(struct qdio_irq *irq_ptr,
struct qdio_initialize *init_data)
{
- if (qebsm_possible())
- irq_ptr->qib.rflags |= QIB_RFLAGS_ENABLE_QEBSM;
-
- irq_ptr->qib.rflags |= init_data->qib_rflags;
+ memset(&irq_ptr->qib, 0, sizeof(irq_ptr->qib));
irq_ptr->qib.qfmt = init_data->q_format;
+ irq_ptr->qib.pfmt = init_data->qib_param_field_format;
+
+ irq_ptr->qib.rflags = init_data->qib_rflags;
+ if (css_general_characteristics.qebsm)
+ irq_ptr->qib.rflags |= QIB_RFLAGS_ENABLE_QEBSM;
+
if (init_data->no_input_qs)
irq_ptr->qib.isliba =
(unsigned long)(irq_ptr->input_qs[0]->slib);
(unsigned long)(irq_ptr->output_qs[0]->slib);
memcpy(irq_ptr->qib.ebcnam, dev_name(&irq_ptr->cdev->dev), 8);
ASCEBC(irq_ptr->qib.ebcnam, 8);
+
+ if (init_data->qib_param_field)
+ memcpy(irq_ptr->qib.parm, init_data->qib_param_field,
+ sizeof(irq_ptr->qib.parm));
}
int qdio_setup_irq(struct qdio_irq *irq_ptr, struct qdio_initialize *init_data)
struct ccw_device *cdev = irq_ptr->cdev;
struct ciw *ciw;
- memset(&irq_ptr->qib, 0, sizeof(irq_ptr->qib));
- memset(&irq_ptr->siga_flag, 0, sizeof(irq_ptr->siga_flag));
+ irq_ptr->qdioac1 = 0;
memset(&irq_ptr->ccw, 0, sizeof(irq_ptr->ccw));
memset(&irq_ptr->ssqd_desc, 0, sizeof(irq_ptr->ssqd_desc));
memset(&irq_ptr->perf_stat, 0, sizeof(irq_ptr->perf_stat));
irq_ptr->sch_token = irq_ptr->perf_stat_enabled = 0;
irq_ptr->state = QDIO_IRQ_STATE_INACTIVE;
- /* wipes qib.ac, required by ar7063 */
- memset(irq_ptr->qdr, 0, sizeof(struct qdr));
-
irq_ptr->int_parm = init_data->int_parm;
irq_ptr->nr_input_qs = init_data->no_input_qs;
irq_ptr->nr_output_qs = init_data->no_output_qs;
- irq_ptr->scan_threshold = init_data->scan_threshold;
ccw_device_get_schid(cdev, &irq_ptr->schid);
setup_queues(irq_ptr, init_data);
set_bit(QDIO_IRQ_DISABLED, &irq_ptr->poll_state);
setup_qib(irq_ptr, init_data);
- set_impl_params(irq_ptr, init_data->qib_param_field_format,
- init_data->qib_param_field,
- init_data->input_slib_elements,
- init_data->output_slib_elements);
/* fill input and output descriptors */
setup_qdr(irq_ptr, init_data);
void qdio_print_subchannel_info(struct qdio_irq *irq_ptr)
{
- char s[80];
-
- snprintf(s, 80, "qdio: %s %s on SC %x using "
- "AI:%d QEBSM:%d PRI:%d TDD:%d SIGA:%s%s%s%s%s\n",
- dev_name(&irq_ptr->cdev->dev),
+ dev_info(&irq_ptr->cdev->dev,
+ "qdio: %s on SC %x using AI:%d QEBSM:%d PRI:%d TDD:%d SIGA:%s%s%s\n",
(irq_ptr->qib.qfmt == QDIO_QETH_QFMT) ? "OSA" :
((irq_ptr->qib.qfmt == QDIO_ZFCP_QFMT) ? "ZFCP" : "HS"),
irq_ptr->schid.sch_no,
(irq_ptr->sch_token) ? 1 : 0,
pci_out_supported(irq_ptr) ? 1 : 0,
css_general_characteristics.aif_tdd,
- (irq_ptr->siga_flag.input) ? "R" : " ",
- (irq_ptr->siga_flag.output) ? "W" : " ",
- (irq_ptr->siga_flag.sync) ? "S" : " ",
- (irq_ptr->siga_flag.sync_after_ai) ? "A" : " ",
- (irq_ptr->siga_flag.sync_out_after_pci) ? "P" : " ");
- printk(KERN_INFO "%s", s);
+ qdio_need_siga_in(irq_ptr) ? "R" : " ",
+ qdio_need_siga_out(irq_ptr) ? "W" : " ",
+ qdio_need_siga_sync(irq_ptr) ? "S" : " ");
}
int __init qdio_setup_init(void)
(css_general_characteristics.aif_osa) ? 1 : 0);
/* Check for QEBSM support in general (bit 58). */
- DBF_EVENT("cssQEBSM:%1d", (qebsm_possible()) ? 1 : 0);
+ DBF_EVENT("cssQEBSM:%1d", css_general_characteristics.qebsm);
rc = 0;
out:
return rc;
/* Adapter interrupt definitions */
static void ap_interrupt_handler(struct airq_struct *airq, bool floating);
-static int ap_airq_flag;
+static bool ap_irq_flag;
static struct airq_struct ap_airq = {
.handler = ap_interrupt_handler,
.isc = AP_ISC,
};
-/**
- * ap_using_interrupts() - Returns non-zero if interrupt support is
- * available.
- */
-static inline int ap_using_interrupts(void)
-{
- return ap_airq_flag;
-}
-
/**
* ap_airq_ptr() - Get the address of the adapter interrupt indicator
*
*/
void *ap_airq_ptr(void)
{
- if (ap_using_interrupts())
+ if (ap_irq_flag)
return ap_airq.lsi_ptr;
return NULL;
}
switch (wait) {
case AP_SM_WAIT_AGAIN:
case AP_SM_WAIT_INTERRUPT:
- if (ap_using_interrupts())
+ if (ap_irq_flag)
break;
if (ap_poll_kthread) {
wake_up(&ap_poll_wait);
* be received. Doing it in the beginning of the tasklet is therefor
* important that no requests on any AP get lost.
*/
- if (ap_using_interrupts())
+ if (ap_irq_flag)
xchg(ap_airq.lsi_ptr, 0);
spin_lock_bh(&ap_queues_lock);
{
int rc;
- if (ap_using_interrupts() || ap_poll_kthread)
+ if (ap_irq_flag || ap_poll_kthread)
return 0;
mutex_lock(&ap_poll_thread_mutex);
ap_poll_kthread = kthread_run(ap_poll_thread, NULL, "appoll");
if (is_queue_dev(dev)) {
pctrs->apqns++;
- if ((to_ap_dev(dev))->drv)
+ if (dev->driver)
pctrs->bound++;
}
to_ap_queue(dev)->qid);
spin_unlock_bh(&ap_queues_lock);
- ap_dev->drv = ap_drv;
rc = ap_drv->probe ? ap_drv->probe(ap_dev) : -ENODEV;
if (rc) {
if (is_queue_dev(dev))
hash_del(&to_ap_queue(dev)->hnode);
spin_unlock_bh(&ap_queues_lock);
- ap_dev->drv = NULL;
} else
ap_check_bindings_complete();
static int ap_device_remove(struct device *dev)
{
struct ap_device *ap_dev = to_ap_dev(dev);
- struct ap_driver *ap_drv = ap_dev->drv;
+ struct ap_driver *ap_drv = to_ap_drv(dev->driver);
/* prepare ap queue device removal */
if (is_queue_dev(dev))
if (is_queue_dev(dev))
hash_del(&to_ap_queue(dev)->hnode);
spin_unlock_bh(&ap_queues_lock);
- ap_dev->drv = NULL;
put_device(dev);
static ssize_t ap_interrupts_show(struct bus_type *bus, char *buf)
{
return scnprintf(buf, PAGE_SIZE, "%d\n",
- ap_using_interrupts() ? 1 : 0);
+ ap_irq_flag ? 1 : 0);
}
static BUS_ATTR_RO(ap_interrupts);
/* enable interrupts if available */
if (ap_interrupts_available()) {
rc = register_adapter_interrupt(&ap_airq);
- ap_airq_flag = (rc == 0);
+ ap_irq_flag = (rc == 0);
}
/* Create /sys/bus/ap. */
out_bus:
bus_unregister(&ap_bus_type);
out:
- if (ap_using_interrupts())
+ if (ap_irq_flag)
unregister_adapter_interrupt(&ap_airq);
kfree(ap_qci_info);
return rc;
#define AP_FUNC_EP11 5
#define AP_FUNC_APXA 6
-/*
- * AP interrupt states
- */
-#define AP_INTR_DISABLED 0 /* AP interrupt disabled */
-#define AP_INTR_ENABLED 1 /* AP interrupt enabled */
-
/*
* AP queue state machine states
*/
* AP queue state wait behaviour
*/
enum ap_sm_wait {
- AP_SM_WAIT_AGAIN, /* retry immediately */
+ AP_SM_WAIT_AGAIN = 0, /* retry immediately */
AP_SM_WAIT_TIMEOUT, /* wait for timeout */
AP_SM_WAIT_INTERRUPT, /* wait for thin interrupt (if available) */
AP_SM_WAIT_NONE, /* no wait */
struct ap_device {
struct device device;
- struct ap_driver *drv; /* Pointer to AP device driver. */
int device_type; /* AP device type. */
};
struct ap_card {
struct ap_device ap_dev;
- void *private; /* ap driver private pointer. */
int raw_hwtype; /* AP raw hardware type. */
unsigned int functions; /* AP device function bitfield. */
int queue_depth; /* AP queue depth.*/
struct hlist_node hnode; /* Node for the ap_queues hashtable */
struct ap_card *card; /* Ptr to assoc. AP card. */
spinlock_t lock; /* Per device lock. */
- void *private; /* ap driver private pointer. */
enum ap_dev_state dev_state; /* queue device state */
bool config; /* configured state */
ap_qid_t qid; /* AP queue id. */
- int interrupt; /* indicate if interrupts are enabled */
+ bool interrupt; /* indicate if interrupts are enabled */
int queue_count; /* # messages currently on AP queue. */
int pendingq_count; /* # requests on pendingq list. */
int requestq_count; /* # requests on requestq list. */
static void __ap_flush_queue(struct ap_queue *aq);
/**
- * ap_queue_enable_interruption(): Enable interruption on an AP queue.
+ * ap_queue_enable_irq(): Enable interrupt support on this AP queue.
* @qid: The AP queue number
* @ind: the notification indicator byte
*
* value it waits a while and tests the AP queue if interrupts
* have been switched on using ap_test_queue().
*/
-static int ap_queue_enable_interruption(struct ap_queue *aq, void *ind)
+static int ap_queue_enable_irq(struct ap_queue *aq, void *ind)
{
struct ap_queue_status status;
struct ap_qirq_ctrl qirqctrl = { 0 };
return AP_SM_WAIT_NONE;
case AP_RESPONSE_NO_PENDING_REPLY:
if (aq->queue_count > 0)
- return AP_SM_WAIT_INTERRUPT;
+ return aq->interrupt ?
+ AP_SM_WAIT_INTERRUPT : AP_SM_WAIT_TIMEOUT;
aq->sm_state = AP_SM_STATE_IDLE;
return AP_SM_WAIT_NONE;
default:
fallthrough;
case AP_RESPONSE_Q_FULL:
aq->sm_state = AP_SM_STATE_QUEUE_FULL;
- return AP_SM_WAIT_INTERRUPT;
+ return aq->interrupt ?
+ AP_SM_WAIT_INTERRUPT : AP_SM_WAIT_TIMEOUT;
case AP_RESPONSE_RESET_IN_PROGRESS:
aq->sm_state = AP_SM_STATE_RESET_WAIT;
return AP_SM_WAIT_TIMEOUT;
case AP_RESPONSE_NORMAL:
case AP_RESPONSE_RESET_IN_PROGRESS:
aq->sm_state = AP_SM_STATE_RESET_WAIT;
- aq->interrupt = AP_INTR_DISABLED;
+ aq->interrupt = false;
return AP_SM_WAIT_TIMEOUT;
default:
aq->dev_state = AP_DEV_STATE_ERROR;
switch (status.response_code) {
case AP_RESPONSE_NORMAL:
lsi_ptr = ap_airq_ptr();
- if (lsi_ptr && ap_queue_enable_interruption(aq, lsi_ptr) == 0)
+ if (lsi_ptr && ap_queue_enable_irq(aq, lsi_ptr) == 0)
aq->sm_state = AP_SM_STATE_SETIRQ_WAIT;
else
aq->sm_state = (aq->queue_count > 0) ?
if (status.irq_enabled == 1) {
/* Irqs are now enabled */
- aq->interrupt = AP_INTR_ENABLED;
+ aq->interrupt = true;
aq->sm_state = (aq->queue_count > 0) ?
AP_SM_STATE_WORKING : AP_SM_STATE_IDLE;
}
spin_lock_bh(&aq->lock);
if (aq->sm_state == AP_SM_STATE_SETIRQ_WAIT)
rc = scnprintf(buf, PAGE_SIZE, "Enable Interrupt pending.\n");
- else if (aq->interrupt == AP_INTR_ENABLED)
+ else if (aq->interrupt)
rc = scnprintf(buf, PAGE_SIZE, "Interrupts enabled.\n");
else
rc = scnprintf(buf, PAGE_SIZE, "Interrupts disabled.\n");
aq->ap_dev.device.type = &ap_queue_type;
aq->ap_dev.device_type = device_type;
aq->qid = qid;
- aq->interrupt = AP_INTR_DISABLED;
+ aq->interrupt = false;
spin_lock_init(&aq->lock);
INIT_LIST_HEAD(&aq->pendingq);
INIT_LIST_HEAD(&aq->requestq);
}
/**
- * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
+ * vfio_ap_get_queue - retrieve a queue with a specific APQN from a list
* @matrix_mdev: the associated mediated matrix
* @apqn: The queue APQN
*
* devices of the vfio_ap_drv.
* Verify that the APID and the APQI are set in the matrix.
*
- * Returns the pointer to the associated vfio_ap_queue
+ * Return: the pointer to the associated vfio_ap_queue
*/
static struct vfio_ap_queue *vfio_ap_get_queue(
struct ap_matrix_mdev *matrix_mdev,
}
/**
- * vfio_ap_wait_for_irqclear
+ * vfio_ap_wait_for_irqclear - clears the IR bit or gives up after 5 tries
* @apqn: The AP Queue number
*
* Checks the IRQ bit for the status of this APQN using ap_tapq.
* Returns if ap_tapq function failed with invalid, deconfigured or
* checkstopped AP.
* Otherwise retries up to 5 times after waiting 20ms.
- *
*/
static void vfio_ap_wait_for_irqclear(int apqn)
{
}
/**
- * vfio_ap_free_aqic_resources
+ * vfio_ap_free_aqic_resources - free vfio_ap_queue resources
* @q: The vfio_ap_queue
*
* Unregisters the ISC in the GIB when the saved ISC not invalid.
- * Unpin the guest's page holding the NIB when it exist.
- * Reset the saved_pfn and saved_isc to invalid values.
- *
+ * Unpins the guest's page holding the NIB when it exists.
+ * Resets the saved_pfn and saved_isc to invalid values.
*/
static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q)
{
}
/**
- * vfio_ap_irq_disable
+ * vfio_ap_irq_disable - disables and clears an ap_queue interrupt
* @q: The vfio_ap_queue
*
* Uses ap_aqic to disable the interruption and in case of success, reset
*
* Returns if ap_aqic function failed with invalid, deconfigured or
* checkstopped AP.
+ *
+ * Return: &struct ap_queue_status
*/
static struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
{
}
/**
- * vfio_ap_setirq: Enable Interruption for a APQN
+ * vfio_ap_irq_enable - Enable Interruption for a APQN
*
- * @dev: the device associated with the ap_queue
* @q: the vfio_ap_queue holding AQIC parameters
*
* Pin the NIB saved in *q
*
* Otherwise return the ap_queue_status returned by the ap_aqic(),
* all retry handling will be done by the guest.
+ *
+ * Return: &struct ap_queue_status
*/
static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
int isc,
}
/**
- * handle_pqap: PQAP instruction callback
+ * handle_pqap - PQAP instruction callback
*
* @vcpu: The vcpu on which we received the PQAP instruction
*
* We take the matrix_dev lock to ensure serialization on queues and
* mediated device access.
*
- * Return 0 if we could handle the request inside KVM.
- * otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
+ * Return: 0 if we could handle the request inside KVM.
+ * Otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
*/
static int handle_pqap(struct kvm_vcpu *vcpu)
{
};
/**
- * vfio_ap_has_queue
+ * vfio_ap_has_queue - determines if the AP queue containing the target in @data
*
* @dev: an AP queue device
* @data: a struct vfio_ap_queue_reserved reference
* - If @data contains only an apqi value, @data will be flagged as
* reserved if the APQI field in the AP queue device matches
*
- * Returns 0 to indicate the input to function succeeded. Returns -EINVAL if
+ * Return: 0 to indicate the input to function succeeded. Returns -EINVAL if
* @data does not contain either an apid or apqi.
*/
static int vfio_ap_has_queue(struct device *dev, void *data)
}
/**
- * vfio_ap_verify_queue_reserved
+ * vfio_ap_verify_queue_reserved - verifies that the AP queue containing
+ * @apid or @aqpi is reserved
*
- * @matrix_dev: a mediated matrix device
* @apid: an AP adapter ID
* @apqi: an AP queue index
*
* - If only @apqi is not NULL, then there must be an AP queue device bound
* to the vfio_ap driver with an APQN containing @apqi
*
- * Returns 0 if the AP queue is reserved; otherwise, returns -EADDRNOTAVAIL.
+ * Return: 0 if the AP queue is reserved; otherwise, returns -EADDRNOTAVAIL.
*/
static int vfio_ap_verify_queue_reserved(unsigned long *apid,
unsigned long *apqi)
}
/**
- * vfio_ap_mdev_verify_no_sharing
+ * vfio_ap_mdev_verify_no_sharing - verifies that the AP matrix is not configured
+ *
+ * @matrix_mdev: the mediated matrix device
*
* Verifies that the APQNs derived from the cross product of the AP adapter IDs
* and AP queue indexes comprising the AP matrix are not configured for another
* mediated device. AP queue sharing is not allowed.
*
- * @matrix_mdev: the mediated matrix device
- *
- * Returns 0 if the APQNs are not shared, otherwise; returns -EADDRINUSE.
+ * Return: 0 if the APQNs are not shared; otherwise returns -EADDRINUSE.
*/
static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
{
}
/**
- * assign_adapter_store
+ * assign_adapter_store - parses the APID from @buf and sets the
+ * corresponding bit in the mediated matrix device's APM
*
* @dev: the matrix device
* @attr: the mediated matrix device's assign_adapter attribute
* be assigned
* @count: the number of bytes in @buf
*
- * Parses the APID from @buf and sets the corresponding bit in the mediated
- * matrix device's APM.
- *
- * Returns the number of bytes processed if the APID is valid; otherwise,
+ * Return: the number of bytes processed if the APID is valid; otherwise,
* returns one of the following errors:
*
* 1. -EINVAL
static DEVICE_ATTR_WO(assign_adapter);
/**
- * unassign_adapter_store
+ * unassign_adapter_store - parses the APID from @buf and clears the
+ * corresponding bit in the mediated matrix device's APM
*
* @dev: the matrix device
* @attr: the mediated matrix device's unassign_adapter attribute
* @buf: a buffer containing the adapter number (APID) to be unassigned
* @count: the number of bytes in @buf
*
- * Parses the APID from @buf and clears the corresponding bit in the mediated
- * matrix device's APM.
- *
- * Returns the number of bytes processed if the APID is valid; otherwise,
+ * Return: the number of bytes processed if the APID is valid; otherwise,
* returns one of the following errors:
* -EINVAL if the APID is not a number
* -ENODEV if the APID it exceeds the maximum value configured for the
}
/**
- * assign_domain_store
+ * assign_domain_store - parses the APQI from @buf and sets the
+ * corresponding bit in the mediated matrix device's AQM
+ *
*
* @dev: the matrix device
* @attr: the mediated matrix device's assign_domain attribute
* be assigned
* @count: the number of bytes in @buf
*
- * Parses the APQI from @buf and sets the corresponding bit in the mediated
- * matrix device's AQM.
- *
- * Returns the number of bytes processed if the APQI is valid; otherwise returns
+ * Return: the number of bytes processed if the APQI is valid; otherwise returns
* one of the following errors:
*
* 1. -EINVAL
/**
- * unassign_domain_store
+ * unassign_domain_store - parses the APQI from @buf and clears the
+ * corresponding bit in the mediated matrix device's AQM
*
* @dev: the matrix device
* @attr: the mediated matrix device's unassign_domain attribute
* be unassigned
* @count: the number of bytes in @buf
*
- * Parses the APQI from @buf and clears the corresponding bit in the
- * mediated matrix device's AQM.
- *
- * Returns the number of bytes processed if the APQI is valid; otherwise,
+ * Return: the number of bytes processed if the APQI is valid; otherwise,
* returns one of the following errors:
* -EINVAL if the APQI is not a number
* -ENODEV if the APQI exceeds the maximum value configured for the system
static DEVICE_ATTR_WO(unassign_domain);
/**
- * assign_control_domain_store
+ * assign_control_domain_store - parses the domain ID from @buf and sets
+ * the corresponding bit in the mediated matrix device's ADM
+ *
*
* @dev: the matrix device
* @attr: the mediated matrix device's assign_control_domain attribute
* @buf: a buffer containing the domain ID to be assigned
* @count: the number of bytes in @buf
*
- * Parses the domain ID from @buf and sets the corresponding bit in the mediated
- * matrix device's ADM.
- *
- * Returns the number of bytes processed if the domain ID is valid; otherwise,
+ * Return: the number of bytes processed if the domain ID is valid; otherwise,
* returns one of the following errors:
* -EINVAL if the ID is not a number
* -ENODEV if the ID exceeds the maximum value configured for the system
static DEVICE_ATTR_WO(assign_control_domain);
/**
- * unassign_control_domain_store
+ * unassign_control_domain_store - parses the domain ID from @buf and
+ * clears the corresponding bit in the mediated matrix device's ADM
*
* @dev: the matrix device
* @attr: the mediated matrix device's unassign_control_domain attribute
* @buf: a buffer containing the domain ID to be unassigned
* @count: the number of bytes in @buf
*
- * Parses the domain ID from @buf and clears the corresponding bit in the
- * mediated matrix device's ADM.
- *
- * Returns the number of bytes processed if the domain ID is valid; otherwise,
+ * Return: the number of bytes processed if the domain ID is valid; otherwise,
* returns one of the following errors:
* -EINVAL if the ID is not a number
* -ENODEV if the ID exceeds the maximum value configured for the system
};
/**
- * vfio_ap_mdev_set_kvm
+ * vfio_ap_mdev_set_kvm - sets all data for @matrix_mdev that are needed
+ * to manage AP resources for the guest whose state is represented by @kvm
*
* @matrix_mdev: a mediated matrix device
* @kvm: reference to KVM instance
*
- * Sets all data for @matrix_mdev that are needed to manage AP resources
- * for the guest whose state is represented by @kvm.
- *
* Note: The matrix_dev->lock must be taken prior to calling
* this function; however, the lock will be temporarily released while the
* guest's AP configuration is set to avoid a potential lockdep splat.
* certain circumstances, will result in a circular lock dependency if this is
* done under the @matrix_mdev->lock.
*
- * Return 0 if no other mediated matrix device has a reference to @kvm;
+ * Return: 0 if no other mediated matrix device has a reference to @kvm;
* otherwise, returns an -EPERM.
*/
static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
return 0;
}
-/*
- * vfio_ap_mdev_iommu_notifier: IOMMU notifier callback
+/**
+ * vfio_ap_mdev_iommu_notifier - IOMMU notifier callback
*
* @nb: The notifier block
* @action: Action to be taken
* For an UNMAP request, unpin the guest IOVA (the NIB guest address we
* pinned before). Other requests are ignored.
*
+ * Return: for an UNMAP request, NOFITY_OK; otherwise NOTIFY_DONE.
*/
static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
unsigned long action, void *data)
}
/**
- * vfio_ap_mdev_unset_kvm
+ * vfio_ap_mdev_unset_kvm - performs clean-up of resources no longer needed
+ * by @matrix_mdev.
*
* @matrix_mdev: a matrix mediated device
*
- * Performs clean-up of resources no longer needed by @matrix_mdev.
- *
* Note: The matrix_dev->lock must be taken prior to calling
* this function; however, the lock will be temporarily released while the
* guest's AP configuration is cleared to avoid a potential lockdep splat.
* The kvm->lock is taken to clear the guest's AP configuration which, under
* certain circumstances, will result in a circular lock dependency if this is
* done under the @matrix_mdev->lock.
- *
*/
static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
{
struct module **pmod,
unsigned int weight)
{
- if (!zq || !try_module_get(zq->queue->ap_dev.drv->driver.owner))
+ if (!zq || !try_module_get(zq->queue->ap_dev.device.driver->owner))
return NULL;
zcrypt_queue_get(zq);
get_device(&zq->queue->ap_dev.device);
atomic_add(weight, &zc->load);
atomic_add(weight, &zq->load);
zq->request_count++;
- *pmod = zq->queue->ap_dev.drv->driver.owner;
+ *pmod = zq->queue->ap_dev.device.driver->owner;
return zq;
}
static ssize_t type_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
- struct zcrypt_card *zc = to_ap_card(dev)->private;
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
return scnprintf(buf, PAGE_SIZE, "%s\n", zc->type_string);
}
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
int online = ac->config && zc->online ? 1 : 0;
return scnprintf(buf, PAGE_SIZE, "%d\n", online);
struct device_attribute *attr,
const char *buf, size_t count)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
struct zcrypt_queue *zq;
int online, id, i = 0, maxzqs = 0;
struct zcrypt_queue **zq_uelist = NULL;
struct device_attribute *attr,
char *buf)
{
- struct zcrypt_card *zc = to_ap_card(dev)->private;
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
return scnprintf(buf, PAGE_SIZE, "%d\n", atomic_read(&zc->load));
}
rlen = vlen = PAGE_SIZE/2;
rc = cca_query_crypto_facility(cardnr, domain, "STATICSB",
rarray, &rlen, varray, &vlen);
- if (rc == 0 && rlen >= 10*8 && vlen >= 240) {
- ci->new_apka_mk_state = (char) rarray[7*8];
- ci->cur_apka_mk_state = (char) rarray[8*8];
- ci->old_apka_mk_state = (char) rarray[9*8];
+ if (rc == 0 && rlen >= 13*8 && vlen >= 240) {
+ ci->new_apka_mk_state = (char) rarray[10*8];
+ ci->cur_apka_mk_state = (char) rarray[11*8];
+ ci->old_apka_mk_state = (char) rarray[12*8];
if (ci->old_apka_mk_state == '2')
memcpy(&ci->old_apka_mkvp, varray + 208, 8);
if (ci->cur_apka_mk_state == '2')
if (!zc)
return -ENOMEM;
zc->card = ac;
- ac->private = zc;
+ dev_set_drvdata(&ap_dev->device, zc);
if (ac->ap_dev.device_type == AP_DEVICE_TYPE_CEX2A) {
zc->min_mod_size = CEX2A_MIN_MOD_SIZE;
rc = zcrypt_card_register(zc);
if (rc) {
- ac->private = NULL;
zcrypt_card_free(zc);
}
*/
static void zcrypt_cex2a_card_remove(struct ap_device *ap_dev)
{
- struct zcrypt_card *zc = to_ap_card(&ap_dev->device)->private;
+ struct zcrypt_card *zc = dev_get_drvdata(&ap_dev->device);
- if (zc)
- zcrypt_card_unregister(zc);
+ zcrypt_card_unregister(zc);
}
static struct ap_driver zcrypt_cex2a_card_driver = {
ap_queue_init_state(aq);
ap_queue_init_reply(aq, &zq->reply);
aq->request_timeout = CEX2A_CLEANUP_TIME;
- aq->private = zq;
+ dev_set_drvdata(&ap_dev->device, zq);
rc = zcrypt_queue_register(zq);
if (rc) {
- aq->private = NULL;
zcrypt_queue_free(zq);
}
*/
static void zcrypt_cex2a_queue_remove(struct ap_device *ap_dev)
{
- struct ap_queue *aq = to_ap_queue(&ap_dev->device);
- struct zcrypt_queue *zq = aq->private;
+ struct zcrypt_queue *zq = dev_get_drvdata(&ap_dev->device);
- if (zq)
- zcrypt_queue_unregister(zq);
+ zcrypt_queue_unregister(zq);
}
static struct ap_driver zcrypt_cex2a_queue_driver = {
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
struct cca_info ci;
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
memset(&ci, 0, sizeof(ci));
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(dev);
int n = 0;
struct cca_info ci;
- struct zcrypt_queue *zq = to_ap_queue(dev)->private;
static const char * const cao_state[] = { "invalid", "valid" };
static const char * const new_state[] = { "empty", "partial", "full" };
if (!zc)
return -ENOMEM;
zc->card = ac;
- ac->private = zc;
+ dev_set_drvdata(&ap_dev->device, zc);
switch (ac->ap_dev.device_type) {
case AP_DEVICE_TYPE_CEX2C:
zc->user_space_type = ZCRYPT_CEX2C;
rc = zcrypt_card_register(zc);
if (rc) {
- ac->private = NULL;
zcrypt_card_free(zc);
return rc;
}
&cca_card_attr_grp);
if (rc) {
zcrypt_card_unregister(zc);
- ac->private = NULL;
zcrypt_card_free(zc);
}
}
*/
static void zcrypt_cex2c_card_remove(struct ap_device *ap_dev)
{
+ struct zcrypt_card *zc = dev_get_drvdata(&ap_dev->device);
struct ap_card *ac = to_ap_card(&ap_dev->device);
- struct zcrypt_card *zc = to_ap_card(&ap_dev->device)->private;
if (ap_test_bit(&ac->functions, AP_FUNC_COPRO))
sysfs_remove_group(&ap_dev->device.kobj, &cca_card_attr_grp);
- if (zc)
- zcrypt_card_unregister(zc);
+
+ zcrypt_card_unregister(zc);
}
static struct ap_driver zcrypt_cex2c_card_driver = {
ap_queue_init_state(aq);
ap_queue_init_reply(aq, &zq->reply);
aq->request_timeout = CEX2C_CLEANUP_TIME;
- aq->private = zq;
+ dev_set_drvdata(&ap_dev->device, zq);
rc = zcrypt_queue_register(zq);
if (rc) {
- aq->private = NULL;
zcrypt_queue_free(zq);
return rc;
}
&cca_queue_attr_grp);
if (rc) {
zcrypt_queue_unregister(zq);
- aq->private = NULL;
zcrypt_queue_free(zq);
}
}
*/
static void zcrypt_cex2c_queue_remove(struct ap_device *ap_dev)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(&ap_dev->device);
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
- struct zcrypt_queue *zq = aq->private;
if (ap_test_bit(&aq->card->functions, AP_FUNC_COPRO))
sysfs_remove_group(&ap_dev->device.kobj, &cca_queue_attr_grp);
- if (zq)
- zcrypt_queue_unregister(zq);
+
+ zcrypt_queue_unregister(zq);
}
static struct ap_driver zcrypt_cex2c_queue_driver = {
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
struct cca_info ci;
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
memset(&ci, 0, sizeof(ci));
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(dev);
int n = 0;
struct cca_info ci;
- struct zcrypt_queue *zq = to_ap_queue(dev)->private;
static const char * const cao_state[] = { "invalid", "valid" };
static const char * const new_state[] = { "empty", "partial", "full" };
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
struct ep11_card_info ci;
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
memset(&ci, 0, sizeof(ci));
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
struct ep11_card_info ci;
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
memset(&ci, 0, sizeof(ci));
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
struct ep11_card_info ci;
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
memset(&ci, 0, sizeof(ci));
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_card *zc = dev_get_drvdata(dev);
int i, n = 0;
struct ep11_card_info ci;
struct ap_card *ac = to_ap_card(dev);
- struct zcrypt_card *zc = ac->private;
memset(&ci, 0, sizeof(ci));
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(dev);
int n = 0;
struct ep11_domain_info di;
- struct zcrypt_queue *zq = to_ap_queue(dev)->private;
static const char * const cwk_state[] = { "invalid", "valid" };
static const char * const nwk_state[] = { "empty", "uncommitted",
"committed" };
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(dev);
int i, n = 0;
struct ep11_domain_info di;
- struct zcrypt_queue *zq = to_ap_queue(dev)->private;
memset(&di, 0, sizeof(di));
if (!zc)
return -ENOMEM;
zc->card = ac;
- ac->private = zc;
+ dev_set_drvdata(&ap_dev->device, zc);
if (ap_test_bit(&ac->functions, AP_FUNC_ACCEL)) {
if (ac->ap_dev.device_type == AP_DEVICE_TYPE_CEX4) {
zc->type_string = "CEX4A";
rc = zcrypt_card_register(zc);
if (rc) {
- ac->private = NULL;
zcrypt_card_free(zc);
return rc;
}
&cca_card_attr_grp);
if (rc) {
zcrypt_card_unregister(zc);
- ac->private = NULL;
zcrypt_card_free(zc);
}
} else if (ap_test_bit(&ac->functions, AP_FUNC_EP11)) {
&ep11_card_attr_grp);
if (rc) {
zcrypt_card_unregister(zc);
- ac->private = NULL;
zcrypt_card_free(zc);
}
}
*/
static void zcrypt_cex4_card_remove(struct ap_device *ap_dev)
{
+ struct zcrypt_card *zc = dev_get_drvdata(&ap_dev->device);
struct ap_card *ac = to_ap_card(&ap_dev->device);
- struct zcrypt_card *zc = ac->private;
if (ap_test_bit(&ac->functions, AP_FUNC_COPRO))
sysfs_remove_group(&ap_dev->device.kobj, &cca_card_attr_grp);
else if (ap_test_bit(&ac->functions, AP_FUNC_EP11))
sysfs_remove_group(&ap_dev->device.kobj, &ep11_card_attr_grp);
- if (zc)
- zcrypt_card_unregister(zc);
+
+ zcrypt_card_unregister(zc);
}
static struct ap_driver zcrypt_cex4_card_driver = {
ap_queue_init_state(aq);
ap_queue_init_reply(aq, &zq->reply);
aq->request_timeout = CEX4_CLEANUP_TIME;
- aq->private = zq;
+ dev_set_drvdata(&ap_dev->device, zq);
rc = zcrypt_queue_register(zq);
if (rc) {
- aq->private = NULL;
zcrypt_queue_free(zq);
return rc;
}
&cca_queue_attr_grp);
if (rc) {
zcrypt_queue_unregister(zq);
- aq->private = NULL;
zcrypt_queue_free(zq);
}
} else if (ap_test_bit(&aq->card->functions, AP_FUNC_EP11)) {
&ep11_queue_attr_grp);
if (rc) {
zcrypt_queue_unregister(zq);
- aq->private = NULL;
zcrypt_queue_free(zq);
}
}
*/
static void zcrypt_cex4_queue_remove(struct ap_device *ap_dev)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(&ap_dev->device);
struct ap_queue *aq = to_ap_queue(&ap_dev->device);
- struct zcrypt_queue *zq = aq->private;
if (ap_test_bit(&aq->card->functions, AP_FUNC_COPRO))
sysfs_remove_group(&ap_dev->device.kobj, &cca_queue_attr_grp);
else if (ap_test_bit(&aq->card->functions, AP_FUNC_EP11))
sysfs_remove_group(&ap_dev->device.kobj, &ep11_queue_attr_grp);
- if (zq)
- zcrypt_queue_unregister(zq);
+
+ zcrypt_queue_unregister(zq);
}
static struct ap_driver zcrypt_cex4_queue_driver = {
struct device_attribute *attr,
char *buf)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(dev);
struct ap_queue *aq = to_ap_queue(dev);
- struct zcrypt_queue *zq = aq->private;
int online = aq->config && zq->online ? 1 : 0;
return scnprintf(buf, PAGE_SIZE, "%d\n", online);
struct device_attribute *attr,
const char *buf, size_t count)
{
+ struct zcrypt_queue *zq = dev_get_drvdata(dev);
struct ap_queue *aq = to_ap_queue(dev);
- struct zcrypt_queue *zq = aq->private;
struct zcrypt_card *zc = zq->zcard;
int online;
struct device_attribute *attr,
char *buf)
{
- struct zcrypt_queue *zq = to_ap_queue(dev)->private;
+ struct zcrypt_queue *zq = dev_get_drvdata(dev);
return scnprintf(buf, PAGE_SIZE, "%d\n", atomic_read(&zq->load));
}
int rc;
spin_lock(&zcrypt_list_lock);
- zc = zq->queue->card->private;
+ zc = dev_get_drvdata(&zq->queue->card->ap_dev.device);
zcrypt_card_get(zc);
zq->zcard = zc;
zq->online = 1; /* New devices are online by default. */
unsigned long card_ptr)
{
struct qeth_card *card = (struct qeth_card *) card_ptr;
- struct net_device *dev = card->dev;
- QETH_CARD_TEXT(card, 6, "qdouhdl");
- if (qdio_error & QDIO_ERROR_FATAL) {
- QETH_CARD_TEXT(card, 2, "achkcond");
- netif_tx_stop_all_queues(dev);
- qeth_schedule_recovery(card);
- }
+ QETH_CARD_TEXT(card, 2, "achkcond");
+ netif_tx_stop_all_queues(card->dev);
+ qeth_schedule_recovery(card);
}
/**
{
struct zfcp_qdio *qdio = (struct zfcp_qdio *) parm;
- if (unlikely(qdio_err)) {
- zfcp_qdio_handler_error(qdio, "qdireq1", qdio_err);
- return;
- }
+ zfcp_qdio_handler_error(qdio, "qdireq1", qdio_err);
}
static void zfcp_qdio_request_tasklet(struct tasklet_struct *tasklet)
ret = scsi_device_set_state(sdev, state);
/*
* If the device state changes to SDEV_RUNNING, we need to
- * rescan the device to revalidate it, and run the queue to
- * avoid I/O hang.
+ * run the queue to avoid I/O hang, and rescan the device
+ * to revalidate it. Running the queue first is necessary
+ * because another thread may be waiting inside
+ * blk_mq_freeze_queue_wait() and because that call may be
+ * waiting for pending I/O to finish.
*/
if (ret == 0 && state == SDEV_RUNNING) {
- scsi_rescan_device(dev);
blk_mq_run_hw_queues(sdev->request_queue, true);
+ scsi_rescan_device(dev);
}
mutex_unlock(&sdev->state_mutex);
static struct kmem_cache *sd_cdb_cache;
static mempool_t *sd_cdb_pool;
static mempool_t *sd_page_pool;
+static struct lock_class_key sd_bio_compl_lkclass;
static const char *sd_cache_types[] = {
"write through", "none", "write back",
cmd->cmnd[0] = UNMAP;
cmd->cmnd[8] = 24;
- buf = page_address(rq->special_vec.bv_page);
+ buf = bvec_virt(&rq->special_vec);
put_unaligned_be16(6 + 16, &buf[0]);
put_unaligned_be16(16, &buf[2]);
put_unaligned_be64(lba, &buf[8]);
if (!sdkp)
goto out;
- gd = alloc_disk(SD_MINORS);
+ gd = __alloc_disk_node(sdp->request_queue, NUMA_NO_NODE,
+ &sd_bio_compl_lkclass);
if (!gd)
goto out_free;
gd->major = sd_major((index & 0xf0) >> 4);
gd->first_minor = ((index & 0xf) << 4) | (index & 0xfff00);
+ gd->minors = SD_MINORS;
gd->fops = &sd_fops;
gd->private_data = &sdkp->driver;
- gd->queue = sdkp->device->request_queue;
/* defaults, until the device tells us otherwise */
sdp->sector_size = 512;
bool exclude; /* 1->open(O_EXCL) succeeded and is active */
int open_cnt; /* count of opens (perhaps < num(sfds) ) */
char sgdebug; /* 0->off, 1->sense, 9->dump dev, 10-> all devs */
- struct gendisk *disk;
+ char name[DISK_NAME_LEN];
struct cdev * cdev; /* char_dev [sysfs: /sys/cdev/major/sg<n>] */
struct kref d_ref;
} Sg_device;
#define SZ_SG_REQ_INFO sizeof(sg_req_info_t)
#define sg_printk(prefix, sdp, fmt, a...) \
- sdev_prefix_printk(prefix, (sdp)->device, \
- (sdp)->disk->disk_name, fmt, ##a)
+ sdev_prefix_printk(prefix, (sdp)->device, (sdp)->name, fmt, ##a)
/*
* The SCSI interfaces that use read() and write() as an asynchronous variant of
srp->rq->timeout = timeout;
kref_get(&sfp->f_ref); /* sg_rq_end_io() does kref_put(). */
- blk_execute_rq_nowait(sdp->disk, srp->rq, at_head, sg_rq_end_io);
+ blk_execute_rq_nowait(NULL, srp->rq, at_head, sg_rq_end_io);
return 0;
}
return put_user(max_sectors_bytes(sdp->device->request_queue),
ip);
case BLKTRACESETUP:
- return blk_trace_setup(sdp->device->request_queue,
- sdp->disk->disk_name,
+ return blk_trace_setup(sdp->device->request_queue, sdp->name,
MKDEV(SCSI_GENERIC_MAJOR, sdp->index),
NULL, p);
case BLKTRACESTART:
static int sg_sysfs_valid = 0;
static Sg_device *
-sg_alloc(struct gendisk *disk, struct scsi_device *scsidp)
+sg_alloc(struct scsi_device *scsidp)
{
struct request_queue *q = scsidp->request_queue;
Sg_device *sdp;
SCSI_LOG_TIMEOUT(3, sdev_printk(KERN_INFO, scsidp,
"sg_alloc: dev=%d \n", k));
- sprintf(disk->disk_name, "sg%d", k);
- disk->first_minor = k;
- sdp->disk = disk;
+ sprintf(sdp->name, "sg%d", k);
sdp->device = scsidp;
mutex_init(&sdp->open_rel_lock);
INIT_LIST_HEAD(&sdp->sfds);
sg_add_device(struct device *cl_dev, struct class_interface *cl_intf)
{
struct scsi_device *scsidp = to_scsi_device(cl_dev->parent);
- struct gendisk *disk;
Sg_device *sdp = NULL;
struct cdev * cdev = NULL;
int error;
unsigned long iflags;
- disk = alloc_disk(1);
- if (!disk) {
- pr_warn("%s: alloc_disk failed\n", __func__);
- return -ENOMEM;
- }
- disk->major = SCSI_GENERIC_MAJOR;
-
error = -ENOMEM;
cdev = cdev_alloc();
if (!cdev) {
cdev->owner = THIS_MODULE;
cdev->ops = &sg_fops;
- sdp = sg_alloc(disk, scsidp);
+ sdp = sg_alloc(scsidp);
if (IS_ERR(sdp)) {
pr_warn("%s: sg_alloc failed\n", __func__);
error = PTR_ERR(sdp);
sg_class_member = device_create(sg_sysfs_class, cl_dev->parent,
MKDEV(SCSI_GENERIC_MAJOR,
sdp->index),
- sdp, "%s", disk->disk_name);
+ sdp, "%s", sdp->name);
if (IS_ERR(sg_class_member)) {
pr_err("%s: device_create failed\n", __func__);
error = PTR_ERR(sg_class_member);
kfree(sdp);
out:
- put_disk(disk);
if (cdev)
cdev_del(cdev);
return error;
SCSI_LOG_TIMEOUT(3,
sg_printk(KERN_INFO, sdp, "sg_device_destroy\n"));
- put_disk(sdp->disk);
kfree(sdp);
}
goto skip;
read_lock(&sdp->sfd_lock);
if (!list_empty(&sdp->sfds)) {
- seq_printf(s, " >>> device=%s ", sdp->disk->disk_name);
+ seq_printf(s, " >>> device=%s ", sdp->name);
if (atomic_read(&sdp->detaching))
seq_puts(s, "detaching pending close ");
else if (sdp->device) {
static unsigned long sr_index_bits[SR_DISKS / BITS_PER_LONG];
static DEFINE_SPINLOCK(sr_index_lock);
+static struct lock_class_key sr_bio_compl_lkclass;
+
/* This semaphore is used to mediate the 0->1 reference get in the
* face of object destruction (i.e. we can't allow a get on an
* object after last put) */
kref_init(&cd->kref);
- disk = alloc_disk(1);
+ disk = __alloc_disk_node(sdev->request_queue, NUMA_NO_NODE,
+ &sr_bio_compl_lkclass);
if (!disk)
goto fail_free;
mutex_init(&cd->lock);
disk->major = SCSI_CDROM_MAJOR;
disk->first_minor = minor;
+ disk->minors = 1;
sprintf(disk->disk_name, "sr%d", minor);
disk->fops = &sr_bdops;
disk->flags = GENHD_FL_CD | GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE;
set_capacity(disk, cd->capacity);
disk->private_data = &cd->driver;
- disk->queue = sdev->request_queue;
if (register_cdrom(disk, &cd->cdi))
goto fail_minor;
}
\f
-static inline char *tape_name(struct scsi_tape *tape)
-{
- return tape->disk->disk_name;
-}
-
#define st_printk(prefix, t, fmt, a...) \
- sdev_prefix_printk(prefix, (t)->device, tape_name(t), fmt, ##a)
+ sdev_prefix_printk(prefix, (t)->device, (t)->name, fmt, ##a)
#ifdef DEBUG
#define DEBC_printk(t, fmt, a...) \
if (debugging) { st_printk(ST_DEB_MSG, t, fmt, ##a ); }
int result = SRpnt->result;
u8 scode;
DEB(const char *stp;)
- char *name = tape_name(STp);
+ char *name = STp->name;
struct st_cmdstatus *cmdstatp;
if (!result)
!capable(CAP_SYS_RAWIO))
i = -EPERM;
else
- i = scsi_cmd_ioctl(STp->disk->queue, STp->disk,
- file->f_mode, cmd_in, p);
+ i = scsi_cmd_ioctl(STp->device->request_queue,
+ NULL, file->f_mode, cmd_in,
+ p);
if (i != -ENOTTY)
return i;
break;
i = mode << (4 - ST_NBR_MODE_BITS);
snprintf(name, 10, "%s%s%s", rew ? "n" : "",
- tape->disk->disk_name, st_formats[i]);
+ tape->name, st_formats[i]);
dev = device_create(&st_sysfs_class, &tape->device->sdev_gendev,
cdev_devno, &tape->modes[mode], "%s", name);
static int st_probe(struct device *dev)
{
struct scsi_device *SDp = to_scsi_device(dev);
- struct gendisk *disk = NULL;
struct scsi_tape *tpnt = NULL;
struct st_modedef *STm;
struct st_partstat *STps;
goto out;
}
- disk = alloc_disk(1);
- if (!disk) {
- sdev_printk(KERN_ERR, SDp,
- "st: out of memory. Device not attached.\n");
- goto out_buffer_free;
- }
-
tpnt = kzalloc(sizeof(struct scsi_tape), GFP_KERNEL);
if (tpnt == NULL) {
sdev_printk(KERN_ERR, SDp,
"st: Can't allocate device descriptor.\n");
- goto out_put_disk;
+ goto out_buffer_free;
}
kref_init(&tpnt->kref);
- tpnt->disk = disk;
- disk->private_data = &tpnt->driver;
- /* SCSI tape doesn't register this gendisk via add_disk(). Manually
- * take queue reference that release_disk() expects. */
- if (!blk_get_queue(SDp->request_queue))
- goto out_put_disk;
- disk->queue = SDp->request_queue;
tpnt->driver = &st_template;
tpnt->device = SDp;
idr_preload_end();
if (error < 0) {
pr_warn("st: idr allocation failed: %d\n", error);
- goto out_put_queue;
+ goto out_free_tape;
}
tpnt->index = error;
- sprintf(disk->disk_name, "st%d", tpnt->index);
+ sprintf(tpnt->name, "st%d", tpnt->index);
tpnt->stats = kzalloc(sizeof(struct scsi_tape_stats), GFP_KERNEL);
if (tpnt->stats == NULL) {
sdev_printk(KERN_ERR, SDp,
scsi_autopm_put_device(SDp);
sdev_printk(KERN_NOTICE, SDp,
- "Attached scsi tape %s\n", tape_name(tpnt));
+ "Attached scsi tape %s\n", tpnt->name);
sdev_printk(KERN_INFO, SDp, "%s: try direct i/o: %s (alignment %d B)\n",
- tape_name(tpnt), tpnt->try_dio ? "yes" : "no",
+ tpnt->name, tpnt->try_dio ? "yes" : "no",
queue_dma_alignment(SDp->request_queue) + 1);
return 0;
spin_lock(&st_index_lock);
idr_remove(&st_index_idr, tpnt->index);
spin_unlock(&st_index_lock);
-out_put_queue:
- blk_put_queue(disk->queue);
-out_put_disk:
- put_disk(disk);
+out_free_tape:
kfree(tpnt);
out_buffer_free:
kfree(buffer);
static void scsi_tape_release(struct kref *kref)
{
struct scsi_tape *tpnt = to_scsi_tape(kref);
- struct gendisk *disk = tpnt->disk;
tpnt->device = NULL;
kfree(tpnt->buffer);
}
- disk->private_data = NULL;
- put_disk(disk);
kfree(tpnt->stats);
kfree(tpnt);
return;
unsigned char last_cmnd[6];
unsigned char last_sense[16];
#endif
- struct gendisk *disk;
+ char name[DISK_NAME_LEN];
struct kref kref;
struct scsi_tape_stats *stats;
};
The main usecase of this controller is to use spi flash as boot
device.
+config SPI_ROCKCHIP_SFC
+ tristate "Rockchip Serial Flash Controller (SFC)"
+ depends on ARCH_ROCKCHIP || COMPILE_TEST
+ depends on HAS_IOMEM && HAS_DMA
+ help
+ This enables support for Rockchip serial flash controller. This
+ is a specialized controller used to access SPI flash on some
+ Rockchip SOCs.
+
+ ROCKCHIP SFC supports DMA and PIO modes. When DMA is not available,
+ the driver automatically falls back to PIO mode.
+
config SPI_RB4XX
tristate "Mikrotik RB4XX SPI master"
depends on SPI_MASTER && ATH79
obj-$(CONFIG_SPI_QCOM_QSPI) += spi-qcom-qspi.o
obj-$(CONFIG_SPI_QUP) += spi-qup.o
obj-$(CONFIG_SPI_ROCKCHIP) += spi-rockchip.o
+obj-$(CONFIG_SPI_ROCKCHIP_SFC) += spi-rockchip-sfc.o
obj-$(CONFIG_SPI_RB4XX) += spi-rb4xx.o
obj-$(CONFIG_MACH_REALTEK_RTL) += spi-realtek-rtl.o
obj-$(CONFIG_SPI_RPCIF) += spi-rpc-if.o
}
#endif /* CONFIG_DEBUG_FS */
-static inline u32 bcm2835aux_rd(struct bcm2835aux_spi *bs, unsigned reg)
+static inline u32 bcm2835aux_rd(struct bcm2835aux_spi *bs, unsigned int reg)
{
return readl(bs->regs + reg);
}
-static inline void bcm2835aux_wr(struct bcm2835aux_spi *bs, unsigned reg,
+static inline void bcm2835aux_wr(struct bcm2835aux_spi *bs, unsigned int reg,
u32 val)
{
writel(val, bs->regs + reg);
mcfqspi_wr_qmr(mcfqspi, MCFQSPI_QMR_MSTR);
mcfqspi_cs_teardown(mcfqspi);
- clk_disable(mcfqspi->clk);
+ clk_disable_unprepare(mcfqspi->clk);
return 0;
}
* line for the controller
*/
if (spi->cs_gpiod) {
- /*
- * FIXME: is this code ever executed? This host does not
- * set SPI_MASTER_GPIO_SS so this chipselect callback should
- * not get called from the SPI core when we are using
- * GPIOs for chip select.
- */
if (value == BITBANG_CS_ACTIVE)
gpiod_set_value(spi->cs_gpiod, 1);
else
master->bus_num = pdev->id;
master->num_chipselect = pdata->num_chipselect;
master->bits_per_word_mask = SPI_BPW_RANGE_MASK(2, 16);
- master->flags = SPI_MASTER_MUST_RX;
+ master->flags = SPI_MASTER_MUST_RX | SPI_MASTER_GPIO_SS;
master->setup = davinci_spi_setup;
master->cleanup = davinci_spi_cleanup;
master->can_dma = davinci_spi_can_dma;
u32 val;
int ret;
- ret = clk_enable(espi->clk);
+ ret = clk_prepare_enable(espi->clk);
if (ret)
return ret;
val &= ~SSPCR1_SSE;
writel(val, espi->mmio + SSPCR1);
- clk_disable(espi->clk);
+ clk_disable_unprepare(espi->clk);
return 0;
}
#define SPI_FSI_BASE 0x70000
#define SPI_FSI_INIT_TIMEOUT_MS 1000
-#define SPI_FSI_MAX_XFR_SIZE 2048
-#define SPI_FSI_MAX_XFR_SIZE_RESTRICTED 8
+#define SPI_FSI_MAX_RX_SIZE 8
+#define SPI_FSI_MAX_TX_SIZE 40
#define SPI_FSI_ERROR 0x0
#define SPI_FSI_COUNTER_CFG 0x1
-#define SPI_FSI_COUNTER_CFG_LOOPS(x) (((u64)(x) & 0xffULL) << 32)
-#define SPI_FSI_COUNTER_CFG_N2_RX BIT_ULL(8)
-#define SPI_FSI_COUNTER_CFG_N2_TX BIT_ULL(9)
-#define SPI_FSI_COUNTER_CFG_N2_IMPLICIT BIT_ULL(10)
-#define SPI_FSI_COUNTER_CFG_N2_RELOAD BIT_ULL(11)
#define SPI_FSI_CFG1 0x2
#define SPI_FSI_CLOCK_CFG 0x3
#define SPI_FSI_CLOCK_CFG_MM_ENABLE BIT_ULL(32)
struct device *dev; /* SPI controller device */
struct fsi_device *fsi; /* FSI2SPI CFAM engine device */
u32 base;
- size_t max_xfr_size;
- bool restricted;
};
struct fsi_spi_sequence {
return fsi_spi_write_reg(ctx, SPI_FSI_STATUS, 0ULL);
}
-static int fsi_spi_sequence_add(struct fsi_spi_sequence *seq, u8 val)
+static void fsi_spi_sequence_add(struct fsi_spi_sequence *seq, u8 val)
{
/*
* Add the next byte of instruction to the 8-byte sequence register.
*/
seq->data |= (u64)val << seq->bit;
seq->bit -= 8;
-
- return ((64 - seq->bit) / 8) - 2;
}
static void fsi_spi_sequence_init(struct fsi_spi_sequence *seq)
seq->data = 0ULL;
}
-static int fsi_spi_sequence_transfer(struct fsi_spi *ctx,
- struct fsi_spi_sequence *seq,
- struct spi_transfer *transfer)
-{
- int loops;
- int idx;
- int rc;
- u8 val = 0;
- u8 len = min(transfer->len, 8U);
- u8 rem = transfer->len % len;
-
- loops = transfer->len / len;
-
- if (transfer->tx_buf) {
- val = SPI_FSI_SEQUENCE_SHIFT_OUT(len);
- idx = fsi_spi_sequence_add(seq, val);
-
- if (rem)
- rem = SPI_FSI_SEQUENCE_SHIFT_OUT(rem);
- } else if (transfer->rx_buf) {
- val = SPI_FSI_SEQUENCE_SHIFT_IN(len);
- idx = fsi_spi_sequence_add(seq, val);
-
- if (rem)
- rem = SPI_FSI_SEQUENCE_SHIFT_IN(rem);
- } else {
- return -EINVAL;
- }
-
- if (ctx->restricted && loops > 1) {
- dev_warn(ctx->dev,
- "Transfer too large; no branches permitted.\n");
- return -EINVAL;
- }
-
- if (loops > 1) {
- u64 cfg = SPI_FSI_COUNTER_CFG_LOOPS(loops - 1);
-
- fsi_spi_sequence_add(seq, SPI_FSI_SEQUENCE_BRANCH(idx));
-
- if (transfer->rx_buf)
- cfg |= SPI_FSI_COUNTER_CFG_N2_RX |
- SPI_FSI_COUNTER_CFG_N2_TX |
- SPI_FSI_COUNTER_CFG_N2_IMPLICIT |
- SPI_FSI_COUNTER_CFG_N2_RELOAD;
-
- rc = fsi_spi_write_reg(ctx, SPI_FSI_COUNTER_CFG, cfg);
- if (rc)
- return rc;
- } else {
- fsi_spi_write_reg(ctx, SPI_FSI_COUNTER_CFG, 0ULL);
- }
-
- if (rem)
- fsi_spi_sequence_add(seq, rem);
-
- return 0;
-}
-
static int fsi_spi_transfer_data(struct fsi_spi *ctx,
struct spi_transfer *transfer)
{
int rc = 0;
u64 status = 0ULL;
- u64 cfg = 0ULL;
if (transfer->tx_buf) {
int nb;
u64 in = 0ULL;
u8 *rx = transfer->rx_buf;
- rc = fsi_spi_read_reg(ctx, SPI_FSI_COUNTER_CFG, &cfg);
- if (rc)
- return rc;
-
- if (cfg & SPI_FSI_COUNTER_CFG_N2_IMPLICIT) {
- rc = fsi_spi_write_reg(ctx, SPI_FSI_DATA_TX, 0);
- if (rc)
- return rc;
- }
-
while (transfer->len > recv) {
do {
rc = fsi_spi_read_reg(ctx, SPI_FSI_STATUS,
}
} while (seq_state && (seq_state != SPI_FSI_STATUS_SEQ_STATE_IDLE));
+ rc = fsi_spi_write_reg(ctx, SPI_FSI_COUNTER_CFG, 0ULL);
+ if (rc)
+ return rc;
+
rc = fsi_spi_read_reg(ctx, SPI_FSI_CLOCK_CFG, &clock_cfg);
if (rc)
return rc;
{
int rc;
u8 seq_slave = SPI_FSI_SEQUENCE_SEL_SLAVE(mesg->spi->chip_select + 1);
+ unsigned int len;
struct spi_transfer *transfer;
struct fsi_spi *ctx = spi_controller_get_devdata(ctlr);
struct spi_transfer *next = NULL;
/* Sequencer must do shift out (tx) first. */
- if (!transfer->tx_buf ||
- transfer->len > (ctx->max_xfr_size + 8)) {
+ if (!transfer->tx_buf || transfer->len > SPI_FSI_MAX_TX_SIZE) {
rc = -EINVAL;
goto error;
}
fsi_spi_sequence_init(&seq);
fsi_spi_sequence_add(&seq, seq_slave);
- rc = fsi_spi_sequence_transfer(ctx, &seq, transfer);
- if (rc)
- goto error;
+ len = transfer->len;
+ while (len > 8) {
+ fsi_spi_sequence_add(&seq,
+ SPI_FSI_SEQUENCE_SHIFT_OUT(8));
+ len -= 8;
+ }
+ fsi_spi_sequence_add(&seq, SPI_FSI_SEQUENCE_SHIFT_OUT(len));
if (!list_is_last(&transfer->transfer_list,
&mesg->transfers)) {
/* Sequencer can only do shift in (rx) after tx. */
if (next->rx_buf) {
- if (next->len > ctx->max_xfr_size) {
+ u8 shift;
+
+ if (next->len > SPI_FSI_MAX_RX_SIZE) {
rc = -EINVAL;
goto error;
}
dev_dbg(ctx->dev, "Sequence rx of %d bytes.\n",
next->len);
- rc = fsi_spi_sequence_transfer(ctx, &seq,
- next);
- if (rc)
- goto error;
+ shift = SPI_FSI_SEQUENCE_SHIFT_IN(next->len);
+ fsi_spi_sequence_add(&seq, shift);
} else {
next = NULL;
}
static size_t fsi_spi_max_transfer_size(struct spi_device *spi)
{
- struct fsi_spi *ctx = spi_controller_get_devdata(spi->controller);
-
- return ctx->max_xfr_size;
+ return SPI_FSI_MAX_RX_SIZE;
}
static int fsi_spi_probe(struct device *dev)
ctx->fsi = fsi;
ctx->base = base + SPI_FSI_BASE;
- if (of_device_is_compatible(np, "ibm,fsi2spi-restricted")) {
- ctx->restricted = true;
- ctx->max_xfr_size = SPI_FSI_MAX_XFR_SIZE_RESTRICTED;
- } else {
- ctx->restricted = false;
- ctx->max_xfr_size = SPI_FSI_MAX_XFR_SIZE;
- }
-
rc = devm_spi_register_controller(dev, ctlr);
if (rc)
spi_controller_put(ctlr);
goto err_rx_dma_buf;
}
+ memset(&cfg, 0, sizeof(cfg));
cfg.src_addr = phy_addr + SPI_POPR;
cfg.dst_addr = phy_addr + SPI_PUSHR;
cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
*/
spin_lock_irq(&mas->lock);
geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
-
- /*
- * TX_WATERMARK_REG should be set after SPI configuration and
- * setting up GENI SE engine, as driver starts data transfer
- * for the watermark interrupt.
- */
if (m_cmd & SPI_TX_ONLY) {
if (geni_spi_handle_tx(mas))
writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
static void spi_imx_push(struct spi_imx_data *spi_imx)
{
- unsigned int burst_len, fifo_words;
+ unsigned int burst_len;
- if (spi_imx->dynamic_burst)
- fifo_words = 4;
- else
- fifo_words = spi_imx_bytes_per_word(spi_imx->bits_per_word);
/*
* Reload the FIFO when the remaining bytes to be transferred in the
* current burst is 0. This only applies when bits_per_word is a
spi_imx->remainder = burst_len;
} else {
- spi_imx->remainder = fifo_words;
+ spi_imx->remainder = spi_imx_bytes_per_word(spi_imx->bits_per_word);
}
}
if (!spi_imx->count)
break;
if (spi_imx->dynamic_burst &&
- spi_imx->txfifo >= DIV_ROUND_UP(spi_imx->remainder,
- fifo_words))
+ spi_imx->txfifo >= DIV_ROUND_UP(spi_imx->remainder, 4))
break;
spi_imx->tx(spi_imx);
spi_imx->txfifo++;
* dynamic_burst in that case.
*/
if (spi_imx->devtype_data->dynamic_burst && !spi_imx->slave_mode &&
+ !(spi->mode & SPI_CS_WORD) &&
(spi_imx->bits_per_word == 8 ||
spi_imx->bits_per_word == 16 ||
spi_imx->bits_per_word == 32)) {
is_imx53_ecspi(spi_imx))
spi_imx->bitbang.master->mode_bits |= SPI_LOOP | SPI_READY;
+ if (is_imx51_ecspi(spi_imx) &&
+ device_property_read_u32(&pdev->dev, "cs-gpios", NULL))
+ /*
+ * When using HW-CS implementing SPI_CS_WORD can be done by just
+ * setting the burst length to the word size. This is
+ * considerably faster than manually controlling the CS.
+ */
+ spi_imx->bitbang.master->mode_bits |= SPI_CS_WORD;
+
spi_imx->spi_drctl = spi_drctl;
init_completion(&spi_imx->xfer_done);
#define SPI_CFG1_CS_IDLE_OFFSET 0
#define SPI_CFG1_PACKET_LOOP_OFFSET 8
#define SPI_CFG1_PACKET_LENGTH_OFFSET 16
-#define SPI_CFG1_GET_TICK_DLY_OFFSET 30
+#define SPI_CFG1_GET_TICK_DLY_OFFSET 29
+#define SPI_CFG1_GET_TICK_DLY_MASK 0xe0000000
#define SPI_CFG1_CS_IDLE_MASK 0xff
#define SPI_CFG1_PACKET_LOOP_MASK 0xff00
#define SPI_CFG1_PACKET_LENGTH_MASK 0x3ff0000
bool enhance_timing;
/* some IC support DMA addr extension */
bool dma_ext;
+ /* some IC no need unprepare SPI clk */
+ bool no_need_unprepare;
};
struct mtk_spi {
struct scatterlist *tx_sgl, *rx_sgl;
u32 tx_sgl_len, rx_sgl_len;
const struct mtk_spi_compatible *dev_comp;
+ u32 spi_clk_hz;
};
static const struct mtk_spi_compatible mtk_common_compat;
.enhance_timing = true,
};
+static const struct mtk_spi_compatible mt6893_compat = {
+ .need_pad_sel = true,
+ .must_tx = true,
+ .enhance_timing = true,
+ .dma_ext = true,
+ .no_need_unprepare = true,
+};
+
/*
* A piece of default chip info unless the platform
* supplies it.
*/
static const struct mtk_chip_config mtk_default_chip_info = {
.sample_sel = 0,
+ .tick_delay = 0,
};
static const struct of_device_id mtk_spi_of_match[] = {
{ .compatible = "mediatek,mt8192-spi",
.data = (void *)&mt6765_compat,
},
+ { .compatible = "mediatek,mt6893-spi",
+ .data = (void *)&mt6893_compat,
+ },
{}
};
MODULE_DEVICE_TABLE(of, mtk_spi_of_match);
writel(reg_val, mdata->base + SPI_CMD_REG);
}
+static int mtk_spi_set_hw_cs_timing(struct spi_device *spi)
+{
+ struct mtk_spi *mdata = spi_master_get_devdata(spi->master);
+ struct spi_delay *cs_setup = &spi->cs_setup;
+ struct spi_delay *cs_hold = &spi->cs_hold;
+ struct spi_delay *cs_inactive = &spi->cs_inactive;
+ u32 setup, hold, inactive;
+ u32 reg_val;
+ int delay;
+
+ delay = spi_delay_to_ns(cs_setup, NULL);
+ if (delay < 0)
+ return delay;
+ setup = (delay * DIV_ROUND_UP(mdata->spi_clk_hz, 1000000)) / 1000;
+
+ delay = spi_delay_to_ns(cs_hold, NULL);
+ if (delay < 0)
+ return delay;
+ hold = (delay * DIV_ROUND_UP(mdata->spi_clk_hz, 1000000)) / 1000;
+
+ delay = spi_delay_to_ns(cs_inactive, NULL);
+ if (delay < 0)
+ return delay;
+ inactive = (delay * DIV_ROUND_UP(mdata->spi_clk_hz, 1000000)) / 1000;
+
+ setup = setup ? setup : 1;
+ hold = hold ? hold : 1;
+ inactive = inactive ? inactive : 1;
+
+ reg_val = readl(mdata->base + SPI_CFG0_REG);
+ if (mdata->dev_comp->enhance_timing) {
+ hold = min_t(u32, hold, 0x10000);
+ setup = min_t(u32, setup, 0x10000);
+ reg_val &= ~(0xffff << SPI_ADJUST_CFG0_CS_HOLD_OFFSET);
+ reg_val |= (((hold - 1) & 0xffff)
+ << SPI_ADJUST_CFG0_CS_HOLD_OFFSET);
+ reg_val &= ~(0xffff << SPI_ADJUST_CFG0_CS_SETUP_OFFSET);
+ reg_val |= (((setup - 1) & 0xffff)
+ << SPI_ADJUST_CFG0_CS_SETUP_OFFSET);
+ } else {
+ hold = min_t(u32, hold, 0x100);
+ setup = min_t(u32, setup, 0x100);
+ reg_val &= ~(0xff << SPI_CFG0_CS_HOLD_OFFSET);
+ reg_val |= (((hold - 1) & 0xff) << SPI_CFG0_CS_HOLD_OFFSET);
+ reg_val &= ~(0xff << SPI_CFG0_CS_SETUP_OFFSET);
+ reg_val |= (((setup - 1) & 0xff)
+ << SPI_CFG0_CS_SETUP_OFFSET);
+ }
+ writel(reg_val, mdata->base + SPI_CFG0_REG);
+
+ inactive = min_t(u32, inactive, 0x100);
+ reg_val = readl(mdata->base + SPI_CFG1_REG);
+ reg_val &= ~SPI_CFG1_CS_IDLE_MASK;
+ reg_val |= (((inactive - 1) & 0xff) << SPI_CFG1_CS_IDLE_OFFSET);
+ writel(reg_val, mdata->base + SPI_CFG1_REG);
+
+ return 0;
+}
+
static int mtk_spi_prepare_message(struct spi_master *master,
struct spi_message *msg)
{
writel(mdata->pad_sel[spi->chip_select],
mdata->base + SPI_PAD_SEL_REG);
+ /* tick delay */
+ reg_val = readl(mdata->base + SPI_CFG1_REG);
+ reg_val &= ~SPI_CFG1_GET_TICK_DLY_MASK;
+ reg_val |= ((chip_config->tick_delay & 0x7)
+ << SPI_CFG1_GET_TICK_DLY_OFFSET);
+ writel(reg_val, mdata->base + SPI_CFG1_REG);
+
+ /* set hw cs timing */
+ mtk_spi_set_hw_cs_timing(spi);
return 0;
}
static void mtk_spi_prepare_transfer(struct spi_master *master,
struct spi_transfer *xfer)
{
- u32 spi_clk_hz, div, sck_time, reg_val;
+ u32 div, sck_time, reg_val;
struct mtk_spi *mdata = spi_master_get_devdata(master);
- spi_clk_hz = clk_get_rate(mdata->spi_clk);
- if (xfer->speed_hz < spi_clk_hz / 2)
- div = DIV_ROUND_UP(spi_clk_hz, xfer->speed_hz);
+ if (xfer->speed_hz < mdata->spi_clk_hz / 2)
+ div = DIV_ROUND_UP(mdata->spi_clk_hz, xfer->speed_hz);
else
div = 1;
(unsigned long)xfer->rx_buf % 4 == 0);
}
-static int mtk_spi_set_hw_cs_timing(struct spi_device *spi,
- struct spi_delay *setup,
- struct spi_delay *hold,
- struct spi_delay *inactive)
-{
- struct mtk_spi *mdata = spi_master_get_devdata(spi->master);
- u16 setup_dly, hold_dly, inactive_dly;
- u32 reg_val;
-
- if ((setup && setup->unit != SPI_DELAY_UNIT_SCK) ||
- (hold && hold->unit != SPI_DELAY_UNIT_SCK) ||
- (inactive && inactive->unit != SPI_DELAY_UNIT_SCK)) {
- dev_err(&spi->dev,
- "Invalid delay unit, should be SPI_DELAY_UNIT_SCK\n");
- return -EINVAL;
- }
-
- setup_dly = setup ? setup->value : 1;
- hold_dly = hold ? hold->value : 1;
- inactive_dly = inactive ? inactive->value : 1;
-
- reg_val = readl(mdata->base + SPI_CFG0_REG);
- if (mdata->dev_comp->enhance_timing) {
- reg_val &= ~(0xffff << SPI_ADJUST_CFG0_CS_HOLD_OFFSET);
- reg_val |= (((hold_dly - 1) & 0xffff)
- << SPI_ADJUST_CFG0_CS_HOLD_OFFSET);
- reg_val &= ~(0xffff << SPI_ADJUST_CFG0_CS_SETUP_OFFSET);
- reg_val |= (((setup_dly - 1) & 0xffff)
- << SPI_ADJUST_CFG0_CS_SETUP_OFFSET);
- } else {
- reg_val &= ~(0xff << SPI_CFG0_CS_HOLD_OFFSET);
- reg_val |= (((hold_dly - 1) & 0xff) << SPI_CFG0_CS_HOLD_OFFSET);
- reg_val &= ~(0xff << SPI_CFG0_CS_SETUP_OFFSET);
- reg_val |= (((setup_dly - 1) & 0xff)
- << SPI_CFG0_CS_SETUP_OFFSET);
- }
- writel(reg_val, mdata->base + SPI_CFG0_REG);
-
- reg_val = readl(mdata->base + SPI_CFG1_REG);
- reg_val &= ~SPI_CFG1_CS_IDLE_MASK;
- reg_val |= (((inactive_dly - 1) & 0xff) << SPI_CFG1_CS_IDLE_OFFSET);
- writel(reg_val, mdata->base + SPI_CFG1_REG);
-
- return 0;
-}
-
static int mtk_spi_setup(struct spi_device *spi)
{
struct mtk_spi *mdata = spi_master_get_devdata(spi->master);
goto err_put_master;
}
- clk_disable_unprepare(mdata->spi_clk);
+ mdata->spi_clk_hz = clk_get_rate(mdata->spi_clk);
+
+ if (mdata->dev_comp->no_need_unprepare)
+ clk_disable(mdata->spi_clk);
+ else
+ clk_disable_unprepare(mdata->spi_clk);
pm_runtime_enable(&pdev->dev);
mtk_spi_reset(mdata);
+ if (mdata->dev_comp->no_need_unprepare)
+ clk_unprepare(mdata->spi_clk);
+
return 0;
}
struct spi_master *master = dev_get_drvdata(dev);
struct mtk_spi *mdata = spi_master_get_devdata(master);
- clk_disable_unprepare(mdata->spi_clk);
+ if (mdata->dev_comp->no_need_unprepare)
+ clk_disable(mdata->spi_clk);
+ else
+ clk_disable_unprepare(mdata->spi_clk);
return 0;
}
struct mtk_spi *mdata = spi_master_get_devdata(master);
int ret;
- ret = clk_prepare_enable(mdata->spi_clk);
+ if (mdata->dev_comp->no_need_unprepare)
+ ret = clk_enable(mdata->spi_clk);
+ else
+ ret = clk_prepare_enable(mdata->spi_clk);
if (ret < 0) {
dev_err(dev, "failed to enable spi_clk (%d)\n", ret);
return ret;
static bool mxic_spi_mem_supports_op(struct spi_mem *mem,
const struct spi_mem_op *op)
{
- if (op->data.buswidth > 4 || op->addr.buswidth > 4 ||
- op->dummy.buswidth > 4 || op->cmd.buswidth > 4)
+ bool all_false;
+
+ if (op->data.buswidth > 8 || op->addr.buswidth > 8 ||
+ op->dummy.buswidth > 8 || op->cmd.buswidth > 8)
return false;
if (op->data.nbytes && op->dummy.nbytes &&
if (op->addr.nbytes > 7)
return false;
- return spi_mem_default_supports_op(mem, op);
+ all_false = !op->cmd.dtr && !op->addr.dtr && !op->dummy.dtr &&
+ !op->data.dtr;
+
+ if (all_false)
+ return spi_mem_default_supports_op(mem, op);
+ else
+ return spi_mem_dtr_supports_op(mem, op);
}
static int mxic_spi_mem_exec_op(struct spi_mem *mem,
struct mxic_spi *mxic = spi_master_get_devdata(mem->spi->master);
int nio = 1, i, ret;
u32 ss_ctrl;
- u8 addr[8];
- u8 opcode = op->cmd.opcode;
+ u8 addr[8], cmd[2];
ret = mxic_spi_set_freq(mxic, mem->spi->max_speed_hz);
if (ret)
return ret;
- if (mem->spi->mode & (SPI_TX_QUAD | SPI_RX_QUAD))
+ if (mem->spi->mode & (SPI_TX_OCTAL | SPI_RX_OCTAL))
+ nio = 8;
+ else if (mem->spi->mode & (SPI_TX_QUAD | SPI_RX_QUAD))
nio = 4;
else if (mem->spi->mode & (SPI_TX_DUAL | SPI_RX_DUAL))
nio = 2;
mxic->regs + HC_CFG);
writel(HC_EN_BIT, mxic->regs + HC_EN);
- ss_ctrl = OP_CMD_BYTES(1) | OP_CMD_BUSW(fls(op->cmd.buswidth) - 1);
+ ss_ctrl = OP_CMD_BYTES(op->cmd.nbytes) |
+ OP_CMD_BUSW(fls(op->cmd.buswidth) - 1) |
+ (op->cmd.dtr ? OP_CMD_DDR : 0);
if (op->addr.nbytes)
ss_ctrl |= OP_ADDR_BYTES(op->addr.nbytes) |
- OP_ADDR_BUSW(fls(op->addr.buswidth) - 1);
+ OP_ADDR_BUSW(fls(op->addr.buswidth) - 1) |
+ (op->addr.dtr ? OP_ADDR_DDR : 0);
if (op->dummy.nbytes)
ss_ctrl |= OP_DUMMY_CYC(op->dummy.nbytes);
if (op->data.nbytes) {
- ss_ctrl |= OP_DATA_BUSW(fls(op->data.buswidth) - 1);
- if (op->data.dir == SPI_MEM_DATA_IN)
+ ss_ctrl |= OP_DATA_BUSW(fls(op->data.buswidth) - 1) |
+ (op->data.dtr ? OP_DATA_DDR : 0);
+ if (op->data.dir == SPI_MEM_DATA_IN) {
ss_ctrl |= OP_READ;
+ if (op->data.dtr)
+ ss_ctrl |= OP_DQS_EN;
+ }
}
writel(ss_ctrl, mxic->regs + SS_CTRL(mem->spi->chip_select));
writel(readl(mxic->regs + HC_CFG) | HC_CFG_MAN_CS_ASSERT,
mxic->regs + HC_CFG);
- ret = mxic_spi_data_xfer(mxic, &opcode, NULL, 1);
+ for (i = 0; i < op->cmd.nbytes; i++)
+ cmd[i] = op->cmd.opcode >> (8 * (op->cmd.nbytes - i - 1));
+
+ ret = mxic_spi_data_xfer(mxic, cmd, NULL, op->cmd.nbytes);
if (ret)
goto out;
master->bits_per_word_mask = SPI_BPW_MASK(8);
master->mode_bits = SPI_CPOL | SPI_CPHA |
SPI_RX_DUAL | SPI_TX_DUAL |
- SPI_RX_QUAD | SPI_TX_QUAD;
+ SPI_RX_QUAD | SPI_TX_QUAD |
+ SPI_RX_OCTAL | SPI_TX_OCTAL;
mxic_spi_hw_init(mxic);
static void orion_spi_set_cs(struct spi_device *spi, bool enable)
{
struct orion_spi *orion_spi;
+ void __iomem *ctrl_reg;
+ u32 val;
orion_spi = spi_master_get_devdata(spi->master);
+ ctrl_reg = spi_reg(orion_spi, ORION_SPI_IF_CTRL_REG);
+
+ val = readl(ctrl_reg);
+
+ /* Clear existing chip-select and assertion state */
+ val &= ~(ORION_SPI_CS_MASK | 0x1);
/*
* If this line is using a GPIO to control chip select, this internal
* as it is handled by a GPIO, but that doesn't matter. What we need
* is to deassert the old chip select and assert some other chip select.
*/
- orion_spi_clrbits(orion_spi, ORION_SPI_IF_CTRL_REG, ORION_SPI_CS_MASK);
- orion_spi_setbits(orion_spi, ORION_SPI_IF_CTRL_REG,
- ORION_SPI_CS(spi->chip_select));
+ val |= ORION_SPI_CS(spi->chip_select);
/*
* Chip select logic is inverted from spi_set_cs(). For lines using a
* doesn't matter.
*/
if (!enable)
- orion_spi_setbits(orion_spi, ORION_SPI_IF_CTRL_REG, 0x1);
- else
- orion_spi_clrbits(orion_spi, ORION_SPI_IF_CTRL_REG, 0x1);
+ val |= 0x1;
+
+ /*
+ * To avoid toggling unwanted chip selects update the register
+ * with a single write.
+ */
+ writel(val, ctrl_reg);
}
static inline int orion_spi_wait_till_ready(struct orion_spi *orion_spi)
struct dma_slave_config cfg;
int ret;
+ memset(&cfg, 0, sizeof(cfg));
cfg.device_fc = true;
cfg.src_addr = pic32s->dma_base + buf_offset;
cfg.dst_addr = pic32s->dma_base + buf_offset;
static void reset_sccr1(struct driver_data *drv_data)
{
- struct chip_data *chip =
- spi_get_ctldata(drv_data->controller->cur_msg->spi);
- u32 sccr1_reg;
+ u32 mask = drv_data->int_cr1 | drv_data->dma_cr1, threshold;
+ struct chip_data *chip;
+
+ if (drv_data->controller->cur_msg) {
+ chip = spi_get_ctldata(drv_data->controller->cur_msg->spi);
+ threshold = chip->threshold;
+ } else {
+ threshold = 0;
+ }
- sccr1_reg = pxa2xx_spi_read(drv_data, SSCR1) & ~drv_data->int_cr1;
switch (drv_data->ssp_type) {
case QUARK_X1000_SSP:
- sccr1_reg &= ~QUARK_X1000_SSCR1_RFT;
+ mask |= QUARK_X1000_SSCR1_RFT;
break;
case CE4100_SSP:
- sccr1_reg &= ~CE4100_SSCR1_RFT;
+ mask |= CE4100_SSCR1_RFT;
break;
default:
- sccr1_reg &= ~SSCR1_RFT;
+ mask |= SSCR1_RFT;
break;
}
- sccr1_reg |= chip->threshold;
- pxa2xx_spi_write(drv_data, SSCR1, sccr1_reg);
+
+ pxa2xx_spi_update(drv_data, SSCR1, mask, threshold);
}
static void int_stop_and_reset(struct driver_data *drv_data)
static void handle_bad_msg(struct driver_data *drv_data)
{
+ int_stop_and_reset(drv_data);
pxa2xx_spi_off(drv_data);
- clear_SSCR1_bits(drv_data, drv_data->int_cr1);
- if (!pxa25x_ssp_comp(drv_data))
- pxa2xx_spi_write(drv_data, SSTO, 0);
- write_SSSR_CS(drv_data, drv_data->clear_sr);
dev_err(drv_data->ssp->dev, "bad message state in interrupt handler\n");
}
{
struct driver_data *drv_data = spi_controller_get_devdata(controller);
+ int_stop_and_reset(drv_data);
+
/* Disable the SSP */
pxa2xx_spi_off(drv_data);
- /* Clear and disable interrupts and service requests */
- write_SSSR_CS(drv_data, drv_data->clear_sr);
- clear_SSCR1_bits(drv_data, drv_data->int_cr1 | drv_data->dma_cr1);
- if (!pxa25x_ssp_comp(drv_data))
- pxa2xx_spi_write(drv_data, SSTO, 0);
/*
* Stop the DMA if running. Note DMA callback handler may have unset
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Rockchip Serial Flash Controller Driver
+ *
+ * Copyright (c) 2017-2021, Rockchip Inc.
+ * Author: Shawn Lin <shawn.lin@rock-chips.com>
+ * Chris Morgan <macroalpha82@gmail.com>
+ * Jon Lin <Jon.lin@rock-chips.com>
+ */
+
+#include <linux/bitops.h>
+#include <linux/clk.h>
+#include <linux/completion.h>
+#include <linux/dma-mapping.h>
+#include <linux/iopoll.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/interrupt.h>
+#include <linux/spi/spi-mem.h>
+
+/* System control */
+#define SFC_CTRL 0x0
+#define SFC_CTRL_PHASE_SEL_NEGETIVE BIT(1)
+#define SFC_CTRL_CMD_BITS_SHIFT 8
+#define SFC_CTRL_ADDR_BITS_SHIFT 10
+#define SFC_CTRL_DATA_BITS_SHIFT 12
+
+/* Interrupt mask */
+#define SFC_IMR 0x4
+#define SFC_IMR_RX_FULL BIT(0)
+#define SFC_IMR_RX_UFLOW BIT(1)
+#define SFC_IMR_TX_OFLOW BIT(2)
+#define SFC_IMR_TX_EMPTY BIT(3)
+#define SFC_IMR_TRAN_FINISH BIT(4)
+#define SFC_IMR_BUS_ERR BIT(5)
+#define SFC_IMR_NSPI_ERR BIT(6)
+#define SFC_IMR_DMA BIT(7)
+
+/* Interrupt clear */
+#define SFC_ICLR 0x8
+#define SFC_ICLR_RX_FULL BIT(0)
+#define SFC_ICLR_RX_UFLOW BIT(1)
+#define SFC_ICLR_TX_OFLOW BIT(2)
+#define SFC_ICLR_TX_EMPTY BIT(3)
+#define SFC_ICLR_TRAN_FINISH BIT(4)
+#define SFC_ICLR_BUS_ERR BIT(5)
+#define SFC_ICLR_NSPI_ERR BIT(6)
+#define SFC_ICLR_DMA BIT(7)
+
+/* FIFO threshold level */
+#define SFC_FTLR 0xc
+#define SFC_FTLR_TX_SHIFT 0
+#define SFC_FTLR_TX_MASK 0x1f
+#define SFC_FTLR_RX_SHIFT 8
+#define SFC_FTLR_RX_MASK 0x1f
+
+/* Reset FSM and FIFO */
+#define SFC_RCVR 0x10
+#define SFC_RCVR_RESET BIT(0)
+
+/* Enhanced mode */
+#define SFC_AX 0x14
+
+/* Address Bit number */
+#define SFC_ABIT 0x18
+
+/* Interrupt status */
+#define SFC_ISR 0x1c
+#define SFC_ISR_RX_FULL_SHIFT BIT(0)
+#define SFC_ISR_RX_UFLOW_SHIFT BIT(1)
+#define SFC_ISR_TX_OFLOW_SHIFT BIT(2)
+#define SFC_ISR_TX_EMPTY_SHIFT BIT(3)
+#define SFC_ISR_TX_FINISH_SHIFT BIT(4)
+#define SFC_ISR_BUS_ERR_SHIFT BIT(5)
+#define SFC_ISR_NSPI_ERR_SHIFT BIT(6)
+#define SFC_ISR_DMA_SHIFT BIT(7)
+
+/* FIFO status */
+#define SFC_FSR 0x20
+#define SFC_FSR_TX_IS_FULL BIT(0)
+#define SFC_FSR_TX_IS_EMPTY BIT(1)
+#define SFC_FSR_RX_IS_EMPTY BIT(2)
+#define SFC_FSR_RX_IS_FULL BIT(3)
+#define SFC_FSR_TXLV_MASK GENMASK(12, 8)
+#define SFC_FSR_TXLV_SHIFT 8
+#define SFC_FSR_RXLV_MASK GENMASK(20, 16)
+#define SFC_FSR_RXLV_SHIFT 16
+
+/* FSM status */
+#define SFC_SR 0x24
+#define SFC_SR_IS_IDLE 0x0
+#define SFC_SR_IS_BUSY 0x1
+
+/* Raw interrupt status */
+#define SFC_RISR 0x28
+#define SFC_RISR_RX_FULL BIT(0)
+#define SFC_RISR_RX_UNDERFLOW BIT(1)
+#define SFC_RISR_TX_OVERFLOW BIT(2)
+#define SFC_RISR_TX_EMPTY BIT(3)
+#define SFC_RISR_TRAN_FINISH BIT(4)
+#define SFC_RISR_BUS_ERR BIT(5)
+#define SFC_RISR_NSPI_ERR BIT(6)
+#define SFC_RISR_DMA BIT(7)
+
+/* Version */
+#define SFC_VER 0x2C
+#define SFC_VER_3 0x3
+#define SFC_VER_4 0x4
+#define SFC_VER_5 0x5
+
+/* Delay line controller resiter */
+#define SFC_DLL_CTRL0 0x3C
+#define SFC_DLL_CTRL0_SCLK_SMP_DLL BIT(15)
+#define SFC_DLL_CTRL0_DLL_MAX_VER4 0xFFU
+#define SFC_DLL_CTRL0_DLL_MAX_VER5 0x1FFU
+
+/* Master trigger */
+#define SFC_DMA_TRIGGER 0x80
+#define SFC_DMA_TRIGGER_START 1
+
+/* Src or Dst addr for master */
+#define SFC_DMA_ADDR 0x84
+
+/* Length control register extension 32GB */
+#define SFC_LEN_CTRL 0x88
+#define SFC_LEN_CTRL_TRB_SEL 1
+#define SFC_LEN_EXT 0x8C
+
+/* Command */
+#define SFC_CMD 0x100
+#define SFC_CMD_IDX_SHIFT 0
+#define SFC_CMD_DUMMY_SHIFT 8
+#define SFC_CMD_DIR_SHIFT 12
+#define SFC_CMD_DIR_RD 0
+#define SFC_CMD_DIR_WR 1
+#define SFC_CMD_ADDR_SHIFT 14
+#define SFC_CMD_ADDR_0BITS 0
+#define SFC_CMD_ADDR_24BITS 1
+#define SFC_CMD_ADDR_32BITS 2
+#define SFC_CMD_ADDR_XBITS 3
+#define SFC_CMD_TRAN_BYTES_SHIFT 16
+#define SFC_CMD_CS_SHIFT 30
+
+/* Address */
+#define SFC_ADDR 0x104
+
+/* Data */
+#define SFC_DATA 0x108
+
+/* The controller and documentation reports that it supports up to 4 CS
+ * devices (0-3), however I have only been able to test a single CS (CS 0)
+ * due to the configuration of my device.
+ */
+#define SFC_MAX_CHIPSELECT_NUM 4
+
+/* The SFC can transfer max 16KB - 1 at one time
+ * we set it to 15.5KB here for alignment.
+ */
+#define SFC_MAX_IOSIZE_VER3 (512 * 31)
+
+/* DMA is only enabled for large data transmission */
+#define SFC_DMA_TRANS_THRETHOLD (0x40)
+
+/* Maximum clock values from datasheet suggest keeping clock value under
+ * 150MHz. No minimum or average value is suggested.
+ */
+#define SFC_MAX_SPEED (150 * 1000 * 1000)
+
+struct rockchip_sfc {
+ struct device *dev;
+ void __iomem *regbase;
+ struct clk *hclk;
+ struct clk *clk;
+ u32 frequency;
+ /* virtual mapped addr for dma_buffer */
+ void *buffer;
+ dma_addr_t dma_buffer;
+ struct completion cp;
+ bool use_dma;
+ u32 max_iosize;
+ u16 version;
+};
+
+static int rockchip_sfc_reset(struct rockchip_sfc *sfc)
+{
+ int err;
+ u32 status;
+
+ writel_relaxed(SFC_RCVR_RESET, sfc->regbase + SFC_RCVR);
+
+ err = readl_poll_timeout(sfc->regbase + SFC_RCVR, status,
+ !(status & SFC_RCVR_RESET), 20,
+ jiffies_to_usecs(HZ));
+ if (err)
+ dev_err(sfc->dev, "SFC reset never finished\n");
+
+ /* Still need to clear the masked interrupt from RISR */
+ writel_relaxed(0xFFFFFFFF, sfc->regbase + SFC_ICLR);
+
+ dev_dbg(sfc->dev, "reset\n");
+
+ return err;
+}
+
+static u16 rockchip_sfc_get_version(struct rockchip_sfc *sfc)
+{
+ return (u16)(readl(sfc->regbase + SFC_VER) & 0xffff);
+}
+
+static u32 rockchip_sfc_get_max_iosize(struct rockchip_sfc *sfc)
+{
+ return SFC_MAX_IOSIZE_VER3;
+}
+
+static void rockchip_sfc_irq_unmask(struct rockchip_sfc *sfc, u32 mask)
+{
+ u32 reg;
+
+ /* Enable transfer complete interrupt */
+ reg = readl(sfc->regbase + SFC_IMR);
+ reg &= ~mask;
+ writel(reg, sfc->regbase + SFC_IMR);
+}
+
+static void rockchip_sfc_irq_mask(struct rockchip_sfc *sfc, u32 mask)
+{
+ u32 reg;
+
+ /* Disable transfer finish interrupt */
+ reg = readl(sfc->regbase + SFC_IMR);
+ reg |= mask;
+ writel(reg, sfc->regbase + SFC_IMR);
+}
+
+static int rockchip_sfc_init(struct rockchip_sfc *sfc)
+{
+ writel(0, sfc->regbase + SFC_CTRL);
+ writel(0xFFFFFFFF, sfc->regbase + SFC_ICLR);
+ rockchip_sfc_irq_mask(sfc, 0xFFFFFFFF);
+ if (rockchip_sfc_get_version(sfc) >= SFC_VER_4)
+ writel(SFC_LEN_CTRL_TRB_SEL, sfc->regbase + SFC_LEN_CTRL);
+
+ return 0;
+}
+
+static int rockchip_sfc_wait_txfifo_ready(struct rockchip_sfc *sfc, u32 timeout_us)
+{
+ int ret = 0;
+ u32 status;
+
+ ret = readl_poll_timeout(sfc->regbase + SFC_FSR, status,
+ status & SFC_FSR_TXLV_MASK, 0,
+ timeout_us);
+ if (ret) {
+ dev_dbg(sfc->dev, "sfc wait tx fifo timeout\n");
+
+ return -ETIMEDOUT;
+ }
+
+ return (status & SFC_FSR_TXLV_MASK) >> SFC_FSR_TXLV_SHIFT;
+}
+
+static int rockchip_sfc_wait_rxfifo_ready(struct rockchip_sfc *sfc, u32 timeout_us)
+{
+ int ret = 0;
+ u32 status;
+
+ ret = readl_poll_timeout(sfc->regbase + SFC_FSR, status,
+ status & SFC_FSR_RXLV_MASK, 0,
+ timeout_us);
+ if (ret) {
+ dev_dbg(sfc->dev, "sfc wait rx fifo timeout\n");
+
+ return -ETIMEDOUT;
+ }
+
+ return (status & SFC_FSR_RXLV_MASK) >> SFC_FSR_RXLV_SHIFT;
+}
+
+static void rockchip_sfc_adjust_op_work(struct spi_mem_op *op)
+{
+ if (unlikely(op->dummy.nbytes && !op->addr.nbytes)) {
+ /*
+ * SFC not support output DUMMY cycles right after CMD cycles, so
+ * treat it as ADDR cycles.
+ */
+ op->addr.nbytes = op->dummy.nbytes;
+ op->addr.buswidth = op->dummy.buswidth;
+ op->addr.val = 0xFFFFFFFFF;
+
+ op->dummy.nbytes = 0;
+ }
+}
+
+static int rockchip_sfc_xfer_setup(struct rockchip_sfc *sfc,
+ struct spi_mem *mem,
+ const struct spi_mem_op *op,
+ u32 len)
+{
+ u32 ctrl = 0, cmd = 0;
+
+ /* set CMD */
+ cmd = op->cmd.opcode;
+ ctrl |= ((op->cmd.buswidth >> 1) << SFC_CTRL_CMD_BITS_SHIFT);
+
+ /* set ADDR */
+ if (op->addr.nbytes) {
+ if (op->addr.nbytes == 4) {
+ cmd |= SFC_CMD_ADDR_32BITS << SFC_CMD_ADDR_SHIFT;
+ } else if (op->addr.nbytes == 3) {
+ cmd |= SFC_CMD_ADDR_24BITS << SFC_CMD_ADDR_SHIFT;
+ } else {
+ cmd |= SFC_CMD_ADDR_XBITS << SFC_CMD_ADDR_SHIFT;
+ writel(op->addr.nbytes * 8 - 1, sfc->regbase + SFC_ABIT);
+ }
+
+ ctrl |= ((op->addr.buswidth >> 1) << SFC_CTRL_ADDR_BITS_SHIFT);
+ }
+
+ /* set DUMMY */
+ if (op->dummy.nbytes) {
+ if (op->dummy.buswidth == 4)
+ cmd |= op->dummy.nbytes * 2 << SFC_CMD_DUMMY_SHIFT;
+ else if (op->dummy.buswidth == 2)
+ cmd |= op->dummy.nbytes * 4 << SFC_CMD_DUMMY_SHIFT;
+ else
+ cmd |= op->dummy.nbytes * 8 << SFC_CMD_DUMMY_SHIFT;
+ }
+
+ /* set DATA */
+ if (sfc->version >= SFC_VER_4) /* Clear it if no data to transfer */
+ writel(len, sfc->regbase + SFC_LEN_EXT);
+ else
+ cmd |= len << SFC_CMD_TRAN_BYTES_SHIFT;
+ if (len) {
+ if (op->data.dir == SPI_MEM_DATA_OUT)
+ cmd |= SFC_CMD_DIR_WR << SFC_CMD_DIR_SHIFT;
+
+ ctrl |= ((op->data.buswidth >> 1) << SFC_CTRL_DATA_BITS_SHIFT);
+ }
+ if (!len && op->addr.nbytes)
+ cmd |= SFC_CMD_DIR_WR << SFC_CMD_DIR_SHIFT;
+
+ /* set the Controller */
+ ctrl |= SFC_CTRL_PHASE_SEL_NEGETIVE;
+ cmd |= mem->spi->chip_select << SFC_CMD_CS_SHIFT;
+
+ dev_dbg(sfc->dev, "sfc addr.nbytes=%x(x%d) dummy.nbytes=%x(x%d)\n",
+ op->addr.nbytes, op->addr.buswidth,
+ op->dummy.nbytes, op->dummy.buswidth);
+ dev_dbg(sfc->dev, "sfc ctrl=%x cmd=%x addr=%llx len=%x\n",
+ ctrl, cmd, op->addr.val, len);
+
+ writel(ctrl, sfc->regbase + SFC_CTRL);
+ writel(cmd, sfc->regbase + SFC_CMD);
+ if (op->addr.nbytes)
+ writel(op->addr.val, sfc->regbase + SFC_ADDR);
+
+ return 0;
+}
+
+static int rockchip_sfc_write_fifo(struct rockchip_sfc *sfc, const u8 *buf, int len)
+{
+ u8 bytes = len & 0x3;
+ u32 dwords;
+ int tx_level;
+ u32 write_words;
+ u32 tmp = 0;
+
+ dwords = len >> 2;
+ while (dwords) {
+ tx_level = rockchip_sfc_wait_txfifo_ready(sfc, 1000);
+ if (tx_level < 0)
+ return tx_level;
+ write_words = min_t(u32, tx_level, dwords);
+ iowrite32_rep(sfc->regbase + SFC_DATA, buf, write_words);
+ buf += write_words << 2;
+ dwords -= write_words;
+ }
+
+ /* write the rest non word aligned bytes */
+ if (bytes) {
+ tx_level = rockchip_sfc_wait_txfifo_ready(sfc, 1000);
+ if (tx_level < 0)
+ return tx_level;
+ memcpy(&tmp, buf, bytes);
+ writel(tmp, sfc->regbase + SFC_DATA);
+ }
+
+ return len;
+}
+
+static int rockchip_sfc_read_fifo(struct rockchip_sfc *sfc, u8 *buf, int len)
+{
+ u8 bytes = len & 0x3;
+ u32 dwords;
+ u8 read_words;
+ int rx_level;
+ int tmp;
+
+ /* word aligned access only */
+ dwords = len >> 2;
+ while (dwords) {
+ rx_level = rockchip_sfc_wait_rxfifo_ready(sfc, 1000);
+ if (rx_level < 0)
+ return rx_level;
+ read_words = min_t(u32, rx_level, dwords);
+ ioread32_rep(sfc->regbase + SFC_DATA, buf, read_words);
+ buf += read_words << 2;
+ dwords -= read_words;
+ }
+
+ /* read the rest non word aligned bytes */
+ if (bytes) {
+ rx_level = rockchip_sfc_wait_rxfifo_ready(sfc, 1000);
+ if (rx_level < 0)
+ return rx_level;
+ tmp = readl(sfc->regbase + SFC_DATA);
+ memcpy(buf, &tmp, bytes);
+ }
+
+ return len;
+}
+
+static int rockchip_sfc_fifo_transfer_dma(struct rockchip_sfc *sfc, dma_addr_t dma_buf, size_t len)
+{
+ writel(0xFFFFFFFF, sfc->regbase + SFC_ICLR);
+ writel((u32)dma_buf, sfc->regbase + SFC_DMA_ADDR);
+ writel(SFC_DMA_TRIGGER_START, sfc->regbase + SFC_DMA_TRIGGER);
+
+ return len;
+}
+
+static int rockchip_sfc_xfer_data_poll(struct rockchip_sfc *sfc,
+ const struct spi_mem_op *op, u32 len)
+{
+ dev_dbg(sfc->dev, "sfc xfer_poll len=%x\n", len);
+
+ if (op->data.dir == SPI_MEM_DATA_OUT)
+ return rockchip_sfc_write_fifo(sfc, op->data.buf.out, len);
+ else
+ return rockchip_sfc_read_fifo(sfc, op->data.buf.in, len);
+}
+
+static int rockchip_sfc_xfer_data_dma(struct rockchip_sfc *sfc,
+ const struct spi_mem_op *op, u32 len)
+{
+ int ret;
+
+ dev_dbg(sfc->dev, "sfc xfer_dma len=%x\n", len);
+
+ if (op->data.dir == SPI_MEM_DATA_OUT)
+ memcpy(sfc->buffer, op->data.buf.out, len);
+
+ ret = rockchip_sfc_fifo_transfer_dma(sfc, sfc->dma_buffer, len);
+ if (!wait_for_completion_timeout(&sfc->cp, msecs_to_jiffies(2000))) {
+ dev_err(sfc->dev, "DMA wait for transfer finish timeout\n");
+ ret = -ETIMEDOUT;
+ }
+ rockchip_sfc_irq_mask(sfc, SFC_IMR_DMA);
+ if (op->data.dir == SPI_MEM_DATA_IN)
+ memcpy(op->data.buf.in, sfc->buffer, len);
+
+ return ret;
+}
+
+static int rockchip_sfc_xfer_done(struct rockchip_sfc *sfc, u32 timeout_us)
+{
+ int ret = 0;
+ u32 status;
+
+ ret = readl_poll_timeout(sfc->regbase + SFC_SR, status,
+ !(status & SFC_SR_IS_BUSY),
+ 20, timeout_us);
+ if (ret) {
+ dev_err(sfc->dev, "wait sfc idle timeout\n");
+ rockchip_sfc_reset(sfc);
+
+ ret = -EIO;
+ }
+
+ return ret;
+}
+
+static int rockchip_sfc_exec_mem_op(struct spi_mem *mem, const struct spi_mem_op *op)
+{
+ struct rockchip_sfc *sfc = spi_master_get_devdata(mem->spi->master);
+ u32 len = op->data.nbytes;
+ int ret;
+
+ if (unlikely(mem->spi->max_speed_hz != sfc->frequency)) {
+ ret = clk_set_rate(sfc->clk, mem->spi->max_speed_hz);
+ if (ret)
+ return ret;
+ sfc->frequency = mem->spi->max_speed_hz;
+ dev_dbg(sfc->dev, "set_freq=%dHz real_freq=%ldHz\n",
+ sfc->frequency, clk_get_rate(sfc->clk));
+ }
+
+ rockchip_sfc_adjust_op_work((struct spi_mem_op *)op);
+ rockchip_sfc_xfer_setup(sfc, mem, op, len);
+ if (len) {
+ if (likely(sfc->use_dma) && len >= SFC_DMA_TRANS_THRETHOLD) {
+ init_completion(&sfc->cp);
+ rockchip_sfc_irq_unmask(sfc, SFC_IMR_DMA);
+ ret = rockchip_sfc_xfer_data_dma(sfc, op, len);
+ } else {
+ ret = rockchip_sfc_xfer_data_poll(sfc, op, len);
+ }
+
+ if (ret != len) {
+ dev_err(sfc->dev, "xfer data failed ret %d dir %d\n", ret, op->data.dir);
+
+ return -EIO;
+ }
+ }
+
+ return rockchip_sfc_xfer_done(sfc, 100000);
+}
+
+static int rockchip_sfc_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op)
+{
+ struct rockchip_sfc *sfc = spi_master_get_devdata(mem->spi->master);
+
+ op->data.nbytes = min(op->data.nbytes, sfc->max_iosize);
+
+ return 0;
+}
+
+static const struct spi_controller_mem_ops rockchip_sfc_mem_ops = {
+ .exec_op = rockchip_sfc_exec_mem_op,
+ .adjust_op_size = rockchip_sfc_adjust_op_size,
+};
+
+static irqreturn_t rockchip_sfc_irq_handler(int irq, void *dev_id)
+{
+ struct rockchip_sfc *sfc = dev_id;
+ u32 reg;
+
+ reg = readl(sfc->regbase + SFC_RISR);
+
+ /* Clear interrupt */
+ writel_relaxed(reg, sfc->regbase + SFC_ICLR);
+
+ if (reg & SFC_RISR_DMA) {
+ complete(&sfc->cp);
+
+ return IRQ_HANDLED;
+ }
+
+ return IRQ_NONE;
+}
+
+static int rockchip_sfc_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct spi_master *master;
+ struct resource *res;
+ struct rockchip_sfc *sfc;
+ int ret;
+
+ master = devm_spi_alloc_master(&pdev->dev, sizeof(*sfc));
+ if (!master)
+ return -ENOMEM;
+
+ master->flags = SPI_MASTER_HALF_DUPLEX;
+ master->mem_ops = &rockchip_sfc_mem_ops;
+ master->dev.of_node = pdev->dev.of_node;
+ master->mode_bits = SPI_TX_QUAD | SPI_TX_DUAL | SPI_RX_QUAD | SPI_RX_DUAL;
+ master->max_speed_hz = SFC_MAX_SPEED;
+ master->num_chipselect = SFC_MAX_CHIPSELECT_NUM;
+
+ sfc = spi_master_get_devdata(master);
+ sfc->dev = dev;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ sfc->regbase = devm_ioremap_resource(dev, res);
+ if (IS_ERR(sfc->regbase))
+ return PTR_ERR(sfc->regbase);
+
+ sfc->clk = devm_clk_get(&pdev->dev, "clk_sfc");
+ if (IS_ERR(sfc->clk)) {
+ dev_err(&pdev->dev, "Failed to get sfc interface clk\n");
+ return PTR_ERR(sfc->clk);
+ }
+
+ sfc->hclk = devm_clk_get(&pdev->dev, "hclk_sfc");
+ if (IS_ERR(sfc->hclk)) {
+ dev_err(&pdev->dev, "Failed to get sfc ahb clk\n");
+ return PTR_ERR(sfc->hclk);
+ }
+
+ sfc->use_dma = !of_property_read_bool(sfc->dev->of_node,
+ "rockchip,sfc-no-dma");
+
+ if (sfc->use_dma) {
+ ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+ if (ret) {
+ dev_warn(dev, "Unable to set dma mask\n");
+ return ret;
+ }
+
+ sfc->buffer = dmam_alloc_coherent(dev, SFC_MAX_IOSIZE_VER3,
+ &sfc->dma_buffer,
+ GFP_KERNEL);
+ if (!sfc->buffer)
+ return -ENOMEM;
+ }
+
+ ret = clk_prepare_enable(sfc->hclk);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to enable ahb clk\n");
+ goto err_hclk;
+ }
+
+ ret = clk_prepare_enable(sfc->clk);
+ if (ret) {
+ dev_err(&pdev->dev, "Failed to enable interface clk\n");
+ goto err_clk;
+ }
+
+ /* Find the irq */
+ ret = platform_get_irq(pdev, 0);
+ if (ret < 0) {
+ dev_err(dev, "Failed to get the irq\n");
+ goto err_irq;
+ }
+
+ ret = devm_request_irq(dev, ret, rockchip_sfc_irq_handler,
+ 0, pdev->name, sfc);
+ if (ret) {
+ dev_err(dev, "Failed to request irq\n");
+
+ return ret;
+ }
+
+ ret = rockchip_sfc_init(sfc);
+ if (ret)
+ goto err_irq;
+
+ sfc->max_iosize = rockchip_sfc_get_max_iosize(sfc);
+ sfc->version = rockchip_sfc_get_version(sfc);
+
+ ret = spi_register_master(master);
+ if (ret)
+ goto err_irq;
+
+ return 0;
+
+err_irq:
+ clk_disable_unprepare(sfc->clk);
+err_clk:
+ clk_disable_unprepare(sfc->hclk);
+err_hclk:
+ return ret;
+}
+
+static int rockchip_sfc_remove(struct platform_device *pdev)
+{
+ struct spi_master *master = platform_get_drvdata(pdev);
+ struct rockchip_sfc *sfc = platform_get_drvdata(pdev);
+
+ spi_unregister_master(master);
+
+ clk_disable_unprepare(sfc->clk);
+ clk_disable_unprepare(sfc->hclk);
+
+ return 0;
+}
+
+static const struct of_device_id rockchip_sfc_dt_ids[] = {
+ { .compatible = "rockchip,sfc"},
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, rockchip_sfc_dt_ids);
+
+static struct platform_driver rockchip_sfc_driver = {
+ .driver = {
+ .name = "rockchip-sfc",
+ .of_match_table = rockchip_sfc_dt_ids,
+ },
+ .probe = rockchip_sfc_probe,
+ .remove = rockchip_sfc_remove,
+};
+module_platform_driver(rockchip_sfc_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Rockchip Serial Flash Controller Driver");
+MODULE_AUTHOR("Shawn Lin <shawn.lin@rock-chips.com>");
+MODULE_AUTHOR("Chris Morgan <macromorgan@hotmail.com>");
+MODULE_AUTHOR("Jon Lin <Jon.lin@rock-chips.com>");
/*
* ADI slave devices include RTC, ADC, regulator, charger, thermal and so on.
- * The slave devices address offset is always 0x8000 and size is 4K.
+ * ADI supports 12/14bit address for r2p0, and additional 17bit for r3p0 or
+ * later versions. Since bit[1:0] are zero, so the spec describe them as
+ * 10/12/15bit address mode.
+ * The 10bit mode supports sigle slave, 12/15bit mode supports 3 slave, the
+ * high two bits is slave_id.
+ * The slave devices address offset is 0x8000 for 10/12bit address mode,
+ * and 0x20000 for 15bit mode.
*/
-#define ADI_SLAVE_ADDR_SIZE SZ_4K
-#define ADI_SLAVE_OFFSET 0x8000
+#define ADI_10BIT_SLAVE_ADDR_SIZE SZ_4K
+#define ADI_10BIT_SLAVE_OFFSET 0x8000
+#define ADI_12BIT_SLAVE_ADDR_SIZE SZ_16K
+#define ADI_12BIT_SLAVE_OFFSET 0x8000
+#define ADI_15BIT_SLAVE_ADDR_SIZE SZ_128K
+#define ADI_15BIT_SLAVE_OFFSET 0x20000
/* Timeout (ms) for the trylock of hardware spinlocks */
#define ADI_HWSPINLOCK_TIMEOUT 5000
#define ADI_FIFO_DRAIN_TIMEOUT 1000
#define ADI_READ_TIMEOUT 2000
-#define REG_ADDR_LOW_MASK GENMASK(11, 0)
+
+/*
+ * Read back address from REG_ADI_RD_DATA bit[30:16] which maps to:
+ * REG_ADI_RD_CMD bit[14:0] for r2p0
+ * REG_ADI_RD_CMD bit[16:2] for r3p0
+ */
+#define RDBACK_ADDR_MASK_R2 GENMASK(14, 0)
+#define RDBACK_ADDR_MASK_R3 GENMASK(16, 2)
+#define RDBACK_ADDR_SHIFT_R3 2
/* Registers definitions for PMIC watchdog controller */
-#define REG_WDG_LOAD_LOW 0x80
-#define REG_WDG_LOAD_HIGH 0x84
-#define REG_WDG_CTRL 0x88
-#define REG_WDG_LOCK 0xa0
+#define REG_WDG_LOAD_LOW 0x0
+#define REG_WDG_LOAD_HIGH 0x4
+#define REG_WDG_CTRL 0x8
+#define REG_WDG_LOCK 0x20
/* Bits definitions for register REG_WDG_CTRL */
#define BIT_WDG_RUN BIT(1)
#define BIT_WDG_NEW BIT(2)
#define BIT_WDG_RST BIT(3)
+/* Bits definitions for register REG_MODULE_EN */
+#define BIT_WDG_EN BIT(2)
+
/* Registers definitions for PMIC */
#define PMIC_RST_STATUS 0xee8
#define PMIC_MODULE_EN 0xc08
#define PMIC_CLK_EN 0xc18
-#define BIT_WDG_EN BIT(2)
+#define PMIC_WDG_BASE 0x80
/* Definition of PMIC reset status register */
#define HWRST_STATUS_SECURITY 0x02
#define HWRST_STATUS_WATCHDOG 0xf0
/* Use default timeout 50 ms that converts to watchdog values */
-#define WDG_LOAD_VAL ((50 * 1000) / 32768)
+#define WDG_LOAD_VAL ((50 * 32768) / 1000)
#define WDG_LOAD_MASK GENMASK(15, 0)
#define WDG_UNLOCK_KEY 0xe551
+struct sprd_adi_wdg {
+ u32 base;
+ u32 rst_sts;
+ u32 wdg_en;
+ u32 wdg_clk;
+};
+
+struct sprd_adi_data {
+ u32 slave_offset;
+ u32 slave_addr_size;
+ int (*read_check)(u32 val, u32 reg);
+ int (*restart)(struct notifier_block *this,
+ unsigned long mode, void *cmd);
+ void (*wdg_rst)(void *p);
+};
+
struct sprd_adi {
struct spi_controller *ctlr;
struct device *dev;
unsigned long slave_vbase;
unsigned long slave_pbase;
struct notifier_block restart_handler;
+ const struct sprd_adi_data *data;
};
-static int sprd_adi_check_paddr(struct sprd_adi *sadi, u32 paddr)
+static int sprd_adi_check_addr(struct sprd_adi *sadi, u32 reg)
{
- if (paddr < sadi->slave_pbase || paddr >
- (sadi->slave_pbase + ADI_SLAVE_ADDR_SIZE)) {
+ if (reg >= sadi->data->slave_addr_size) {
dev_err(sadi->dev,
- "slave physical address is incorrect, addr = 0x%x\n",
- paddr);
+ "slave address offset is incorrect, reg = 0x%x\n",
+ reg);
return -EINVAL;
}
return 0;
}
-static unsigned long sprd_adi_to_vaddr(struct sprd_adi *sadi, u32 paddr)
-{
- return (paddr - sadi->slave_pbase + sadi->slave_vbase);
-}
-
static int sprd_adi_drain_fifo(struct sprd_adi *sadi)
{
u32 timeout = ADI_FIFO_DRAIN_TIMEOUT;
return readl_relaxed(sadi->base + REG_ADI_ARM_FIFO_STS) & BIT_FIFO_FULL;
}
-static int sprd_adi_read(struct sprd_adi *sadi, u32 reg_paddr, u32 *read_val)
+static int sprd_adi_read_check(u32 val, u32 addr)
+{
+ u32 rd_addr;
+
+ rd_addr = (val & RD_ADDR_MASK) >> RD_ADDR_SHIFT;
+
+ if (rd_addr != addr) {
+ pr_err("ADI read error, addr = 0x%x, val = 0x%x\n", addr, val);
+ return -EIO;
+ }
+
+ return 0;
+}
+
+static int sprd_adi_read_check_r2(u32 val, u32 reg)
+{
+ return sprd_adi_read_check(val, reg & RDBACK_ADDR_MASK_R2);
+}
+
+static int sprd_adi_read_check_r3(u32 val, u32 reg)
+{
+ return sprd_adi_read_check(val, (reg & RDBACK_ADDR_MASK_R3) >> RDBACK_ADDR_SHIFT_R3);
+}
+
+static int sprd_adi_read(struct sprd_adi *sadi, u32 reg, u32 *read_val)
{
int read_timeout = ADI_READ_TIMEOUT;
unsigned long flags;
- u32 val, rd_addr;
+ u32 val;
int ret = 0;
if (sadi->hwlock) {
}
}
+ ret = sprd_adi_check_addr(sadi, reg);
+ if (ret)
+ goto out;
+
/*
- * Set the physical register address need to read into RD_CMD register,
+ * Set the slave address offset need to read into RD_CMD register,
* then ADI controller will start to transfer automatically.
*/
- writel_relaxed(reg_paddr, sadi->base + REG_ADI_RD_CMD);
+ writel_relaxed(reg, sadi->base + REG_ADI_RD_CMD);
/*
* Wait read operation complete, the BIT_RD_CMD_BUSY will be set
}
/*
- * The return value includes data and read register address, from bit 0
- * to bit 15 are data, and from bit 16 to bit 30 are read register
- * address. Then we can check the returned register address to validate
- * data.
+ * The return value before adi r5p0 includes data and read register
+ * address, from bit 0to bit 15 are data, and from bit 16 to bit 30
+ * are read register address. Then we can check the returned register
+ * address to validate data.
*/
- rd_addr = (val & RD_ADDR_MASK) >> RD_ADDR_SHIFT;
-
- if (rd_addr != (reg_paddr & REG_ADDR_LOW_MASK)) {
- dev_err(sadi->dev, "read error, reg addr = 0x%x, val = 0x%x\n",
- reg_paddr, val);
- ret = -EIO;
- goto out;
+ if (sadi->data->read_check) {
+ ret = sadi->data->read_check(val, reg);
+ if (ret < 0)
+ goto out;
}
*read_val = val & RD_VALUE_MASK;
return ret;
}
-static int sprd_adi_write(struct sprd_adi *sadi, u32 reg_paddr, u32 val)
+static int sprd_adi_write(struct sprd_adi *sadi, u32 reg, u32 val)
{
- unsigned long reg = sprd_adi_to_vaddr(sadi, reg_paddr);
u32 timeout = ADI_FIFO_DRAIN_TIMEOUT;
unsigned long flags;
int ret;
}
}
+ ret = sprd_adi_check_addr(sadi, reg);
+ if (ret)
+ goto out;
+
ret = sprd_adi_drain_fifo(sadi);
if (ret < 0)
goto out;
*/
do {
if (!sprd_adi_fifo_is_full(sadi)) {
- writel_relaxed(val, (void __iomem *)reg);
+ /* we need virtual register address to write. */
+ writel_relaxed(val, (void __iomem *)(sadi->slave_vbase + reg));
break;
}
struct spi_transfer *t)
{
struct sprd_adi *sadi = spi_controller_get_devdata(ctlr);
- u32 phy_reg, val;
+ u32 reg, val;
int ret;
if (t->rx_buf) {
- phy_reg = *(u32 *)t->rx_buf + sadi->slave_pbase;
-
- ret = sprd_adi_check_paddr(sadi, phy_reg);
- if (ret)
- return ret;
-
- ret = sprd_adi_read(sadi, phy_reg, &val);
- if (ret)
- return ret;
-
+ reg = *(u32 *)t->rx_buf;
+ ret = sprd_adi_read(sadi, reg, &val);
*(u32 *)t->rx_buf = val;
} else if (t->tx_buf) {
u32 *p = (u32 *)t->tx_buf;
-
- /*
- * Get the physical register address need to write and convert
- * the physical address to virtual address. Since we need
- * virtual register address to write.
- */
- phy_reg = *p++ + sadi->slave_pbase;
- ret = sprd_adi_check_paddr(sadi, phy_reg);
- if (ret)
- return ret;
-
+ reg = *p++;
val = *p;
- ret = sprd_adi_write(sadi, phy_reg, val);
- if (ret)
- return ret;
+ ret = sprd_adi_write(sadi, reg, val);
} else {
dev_err(sadi->dev, "no buffer for transfer\n");
- return -EINVAL;
+ ret = -EINVAL;
}
- return 0;
+ return ret;
}
-static void sprd_adi_set_wdt_rst_mode(struct sprd_adi *sadi)
+static void sprd_adi_set_wdt_rst_mode(void *p)
{
#if IS_ENABLED(CONFIG_SPRD_WATCHDOG)
u32 val;
+ struct sprd_adi *sadi = (struct sprd_adi *)p;
- /* Set default watchdog reboot mode */
- sprd_adi_read(sadi, sadi->slave_pbase + PMIC_RST_STATUS, &val);
+ /* Init watchdog reset mode */
+ sprd_adi_read(sadi, PMIC_RST_STATUS, &val);
val |= HWRST_STATUS_WATCHDOG;
- sprd_adi_write(sadi, sadi->slave_pbase + PMIC_RST_STATUS, val);
+ sprd_adi_write(sadi, PMIC_RST_STATUS, val);
#endif
}
-static int sprd_adi_restart_handler(struct notifier_block *this,
- unsigned long mode, void *cmd)
+static int sprd_adi_restart(struct notifier_block *this, unsigned long mode,
+ void *cmd, struct sprd_adi_wdg *wdg)
{
struct sprd_adi *sadi = container_of(this, struct sprd_adi,
restart_handler);
reboot_mode = HWRST_STATUS_NORMAL;
/* Record the reboot mode */
- sprd_adi_read(sadi, sadi->slave_pbase + PMIC_RST_STATUS, &val);
+ sprd_adi_read(sadi, wdg->rst_sts, &val);
val &= ~HWRST_STATUS_WATCHDOG;
val |= reboot_mode;
- sprd_adi_write(sadi, sadi->slave_pbase + PMIC_RST_STATUS, val);
+ sprd_adi_write(sadi, wdg->rst_sts, val);
/* Enable the interface clock of the watchdog */
- sprd_adi_read(sadi, sadi->slave_pbase + PMIC_MODULE_EN, &val);
+ sprd_adi_read(sadi, wdg->wdg_en, &val);
val |= BIT_WDG_EN;
- sprd_adi_write(sadi, sadi->slave_pbase + PMIC_MODULE_EN, val);
+ sprd_adi_write(sadi, wdg->wdg_en, val);
/* Enable the work clock of the watchdog */
- sprd_adi_read(sadi, sadi->slave_pbase + PMIC_CLK_EN, &val);
+ sprd_adi_read(sadi, wdg->wdg_clk, &val);
val |= BIT_WDG_EN;
- sprd_adi_write(sadi, sadi->slave_pbase + PMIC_CLK_EN, val);
+ sprd_adi_write(sadi, wdg->wdg_clk, val);
/* Unlock the watchdog */
- sprd_adi_write(sadi, sadi->slave_pbase + REG_WDG_LOCK, WDG_UNLOCK_KEY);
+ sprd_adi_write(sadi, wdg->base + REG_WDG_LOCK, WDG_UNLOCK_KEY);
- sprd_adi_read(sadi, sadi->slave_pbase + REG_WDG_CTRL, &val);
+ sprd_adi_read(sadi, wdg->base + REG_WDG_CTRL, &val);
val |= BIT_WDG_NEW;
- sprd_adi_write(sadi, sadi->slave_pbase + REG_WDG_CTRL, val);
+ sprd_adi_write(sadi, wdg->base + REG_WDG_CTRL, val);
/* Load the watchdog timeout value, 50ms is always enough. */
- sprd_adi_write(sadi, sadi->slave_pbase + REG_WDG_LOAD_HIGH, 0);
- sprd_adi_write(sadi, sadi->slave_pbase + REG_WDG_LOAD_LOW,
+ sprd_adi_write(sadi, wdg->base + REG_WDG_LOAD_HIGH, 0);
+ sprd_adi_write(sadi, wdg->base + REG_WDG_LOAD_LOW,
WDG_LOAD_VAL & WDG_LOAD_MASK);
/* Start the watchdog to reset system */
- sprd_adi_read(sadi, sadi->slave_pbase + REG_WDG_CTRL, &val);
+ sprd_adi_read(sadi, wdg->base + REG_WDG_CTRL, &val);
val |= BIT_WDG_RUN | BIT_WDG_RST;
- sprd_adi_write(sadi, sadi->slave_pbase + REG_WDG_CTRL, val);
+ sprd_adi_write(sadi, wdg->base + REG_WDG_CTRL, val);
/* Lock the watchdog */
- sprd_adi_write(sadi, sadi->slave_pbase + REG_WDG_LOCK, ~WDG_UNLOCK_KEY);
+ sprd_adi_write(sadi, wdg->base + REG_WDG_LOCK, ~WDG_UNLOCK_KEY);
mdelay(1000);
return NOTIFY_DONE;
}
+static int sprd_adi_restart_sc9860(struct notifier_block *this,
+ unsigned long mode, void *cmd)
+{
+ struct sprd_adi_wdg wdg = {
+ .base = PMIC_WDG_BASE,
+ .rst_sts = PMIC_RST_STATUS,
+ .wdg_en = PMIC_MODULE_EN,
+ .wdg_clk = PMIC_CLK_EN,
+ };
+
+ return sprd_adi_restart(this, mode, cmd, &wdg);
+}
+
static void sprd_adi_hw_init(struct sprd_adi *sadi)
{
struct device_node *np = sadi->dev->of_node;
static int sprd_adi_probe(struct platform_device *pdev)
{
struct device_node *np = pdev->dev.of_node;
+ const struct sprd_adi_data *data;
struct spi_controller *ctlr;
struct sprd_adi *sadi;
struct resource *res;
- u32 num_chipselect;
+ u16 num_chipselect;
int ret;
if (!np) {
return -ENODEV;
}
+ data = of_device_get_match_data(&pdev->dev);
+ if (!data) {
+ dev_err(&pdev->dev, "no matching driver data found\n");
+ return -EINVAL;
+ }
+
pdev->id = of_alias_get_id(np, "spi");
num_chipselect = of_get_child_count(np);
goto put_ctlr;
}
- sadi->slave_vbase = (unsigned long)sadi->base + ADI_SLAVE_OFFSET;
- sadi->slave_pbase = res->start + ADI_SLAVE_OFFSET;
+ sadi->slave_vbase = (unsigned long)sadi->base +
+ data->slave_offset;
+ sadi->slave_pbase = res->start + data->slave_offset;
sadi->ctlr = ctlr;
sadi->dev = &pdev->dev;
+ sadi->data = data;
ret = of_hwspin_lock_get_id(np, 0);
if (ret > 0 || (IS_ENABLED(CONFIG_HWSPINLOCK) && ret == 0)) {
sadi->hwlock =
}
sprd_adi_hw_init(sadi);
- sprd_adi_set_wdt_rst_mode(sadi);
+
+ if (sadi->data->wdg_rst)
+ sadi->data->wdg_rst(sadi);
ctlr->dev.of_node = pdev->dev.of_node;
ctlr->bus_num = pdev->id;
goto put_ctlr;
}
- sadi->restart_handler.notifier_call = sprd_adi_restart_handler;
- sadi->restart_handler.priority = 128;
- ret = register_restart_handler(&sadi->restart_handler);
- if (ret) {
- dev_err(&pdev->dev, "can not register restart handler\n");
- goto put_ctlr;
+ if (sadi->data->restart) {
+ sadi->restart_handler.notifier_call = sadi->data->restart;
+ sadi->restart_handler.priority = 128;
+ ret = register_restart_handler(&sadi->restart_handler);
+ if (ret) {
+ dev_err(&pdev->dev, "can not register restart handler\n");
+ goto put_ctlr;
+ }
}
return 0;
return 0;
}
+static struct sprd_adi_data sc9860_data = {
+ .slave_offset = ADI_10BIT_SLAVE_OFFSET,
+ .slave_addr_size = ADI_10BIT_SLAVE_ADDR_SIZE,
+ .read_check = sprd_adi_read_check_r2,
+ .restart = sprd_adi_restart_sc9860,
+ .wdg_rst = sprd_adi_set_wdt_rst_mode,
+};
+
+static struct sprd_adi_data sc9863_data = {
+ .slave_offset = ADI_12BIT_SLAVE_OFFSET,
+ .slave_addr_size = ADI_12BIT_SLAVE_ADDR_SIZE,
+ .read_check = sprd_adi_read_check_r3,
+};
+
+static struct sprd_adi_data ums512_data = {
+ .slave_offset = ADI_15BIT_SLAVE_OFFSET,
+ .slave_addr_size = ADI_15BIT_SLAVE_ADDR_SIZE,
+ .read_check = sprd_adi_read_check_r3,
+};
+
static const struct of_device_id sprd_adi_of_match[] = {
{
.compatible = "sprd,sc9860-adi",
+ .data = &sc9860_data,
+ },
+ {
+ .compatible = "sprd,sc9863-adi",
+ .data = &sc9863_data,
+ },
+ {
+ .compatible = "sprd,ums512-adi",
+ .data = &ums512_data,
},
{ },
};
#define SPI_3WIRE_TX 3
#define SPI_3WIRE_RX 4
+#define STM32_SPI_AUTOSUSPEND_DELAY 1 /* 1 ms */
+
/*
* use PIO for small transfers, avoiding DMA setup/teardown overhead for drivers
* without fifo buffers.
/**
* stm32h7_spi_read_rxfifo - Read bytes in Receive Data Register
* @spi: pointer to the spi controller data structure
- * @flush: boolean indicating that FIFO should be flushed
*
* Write in rx_buf depends on remaining bytes to avoid to write beyond
* rx_buf end.
*/
-static void stm32h7_spi_read_rxfifo(struct stm32_spi *spi, bool flush)
+static void stm32h7_spi_read_rxfifo(struct stm32_spi *spi)
{
u32 sr = readl_relaxed(spi->base + STM32H7_SPI_SR);
u32 rxplvl = FIELD_GET(STM32H7_SPI_SR_RXPLVL, sr);
while ((spi->rx_len > 0) &&
((sr & STM32H7_SPI_SR_RXP) ||
- (flush && ((sr & STM32H7_SPI_SR_RXWNE) || (rxplvl > 0))))) {
+ ((sr & STM32H7_SPI_SR_EOT) &&
+ ((sr & STM32H7_SPI_SR_RXWNE) || (rxplvl > 0))))) {
u32 offs = spi->cur_xferlen - spi->rx_len;
if ((spi->rx_len >= sizeof(u32)) ||
- (flush && (sr & STM32H7_SPI_SR_RXWNE))) {
+ (sr & STM32H7_SPI_SR_RXWNE)) {
u32 *rx_buf32 = (u32 *)(spi->rx_buf + offs);
*rx_buf32 = readl_relaxed(spi->base + STM32H7_SPI_RXDR);
spi->rx_len -= sizeof(u32);
} else if ((spi->rx_len >= sizeof(u16)) ||
- (flush && (rxplvl >= 2 || spi->cur_bpw > 8))) {
+ (!(sr & STM32H7_SPI_SR_RXWNE) &&
+ (rxplvl >= 2 || spi->cur_bpw > 8))) {
u16 *rx_buf16 = (u16 *)(spi->rx_buf + offs);
*rx_buf16 = readw_relaxed(spi->base + STM32H7_SPI_RXDR);
rxplvl = FIELD_GET(STM32H7_SPI_SR_RXPLVL, sr);
}
- dev_dbg(spi->dev, "%s%s: %d bytes left\n", __func__,
- flush ? "(flush)" : "", spi->rx_len);
+ dev_dbg(spi->dev, "%s: %d bytes left (sr=%08x)\n",
+ __func__, spi->rx_len, sr);
}
/**
* stm32h7_spi_disable - Disable SPI controller
* @spi: pointer to the spi controller data structure
*
- * RX-Fifo is flushed when SPI controller is disabled. To prevent any data
- * loss, use stm32h7_spi_read_rxfifo(flush) to read the remaining bytes in
- * RX-Fifo.
- * Normally, if TSIZE has been configured, we should relax the hardware at the
- * reception of the EOT interrupt. But in case of error, EOT will not be
- * raised. So the subsystem unprepare_message call allows us to properly
- * complete the transfer from an hardware point of view.
+ * RX-Fifo is flushed when SPI controller is disabled.
*/
static void stm32h7_spi_disable(struct stm32_spi *spi)
{
unsigned long flags;
- u32 cr1, sr;
+ u32 cr1;
dev_dbg(spi->dev, "disable controller\n");
return;
}
- /* Wait on EOT or suspend the flow */
- if (readl_relaxed_poll_timeout_atomic(spi->base + STM32H7_SPI_SR,
- sr, !(sr & STM32H7_SPI_SR_EOT),
- 10, 100000) < 0) {
- if (cr1 & STM32H7_SPI_CR1_CSTART) {
- writel_relaxed(cr1 | STM32H7_SPI_CR1_CSUSP,
- spi->base + STM32H7_SPI_CR1);
- if (readl_relaxed_poll_timeout_atomic(
- spi->base + STM32H7_SPI_SR,
- sr, !(sr & STM32H7_SPI_SR_SUSP),
- 10, 100000) < 0)
- dev_warn(spi->dev,
- "Suspend request timeout\n");
- }
- }
-
- if (!spi->cur_usedma && spi->rx_buf && (spi->rx_len > 0))
- stm32h7_spi_read_rxfifo(spi, true);
-
if (spi->cur_usedma && spi->dma_tx)
dmaengine_terminate_all(spi->dma_tx);
if (spi->cur_usedma && spi->dma_rx)
if (__ratelimit(&rs))
dev_dbg_ratelimited(spi->dev, "Communication suspended\n");
if (!spi->cur_usedma && (spi->rx_buf && (spi->rx_len > 0)))
- stm32h7_spi_read_rxfifo(spi, false);
+ stm32h7_spi_read_rxfifo(spi);
/*
* If communication is suspended while using DMA, it means
* that something went wrong, so stop the current transfer
if (sr & STM32H7_SPI_SR_EOT) {
if (!spi->cur_usedma && (spi->rx_buf && (spi->rx_len > 0)))
- stm32h7_spi_read_rxfifo(spi, true);
- end = true;
+ stm32h7_spi_read_rxfifo(spi);
+ if (!spi->cur_usedma ||
+ (spi->cur_comm == SPI_SIMPLEX_TX || spi->cur_comm == SPI_3WIRE_TX))
+ end = true;
}
if (sr & STM32H7_SPI_SR_TXP)
if (sr & STM32H7_SPI_SR_RXP)
if (!spi->cur_usedma && (spi->rx_buf && (spi->rx_len > 0)))
- stm32h7_spi_read_rxfifo(spi, false);
+ stm32h7_spi_read_rxfifo(spi);
writel_relaxed(sr & mask, spi->base + STM32H7_SPI_IFCR);
}
/**
- * stm32f4_spi_dma_rx_cb - dma callback
+ * stm32_spi_dma_rx_cb - dma callback
* @data: pointer to the spi controller data structure
*
* DMA callback is called when the transfer is complete for DMA RX channel.
*/
-static void stm32f4_spi_dma_rx_cb(void *data)
+static void stm32_spi_dma_rx_cb(void *data)
{
struct stm32_spi *spi = data;
spi_finalize_current_transfer(spi->master);
- stm32f4_spi_disable(spi);
-}
-
-/**
- * stm32h7_spi_dma_cb - dma callback
- * @data: pointer to the spi controller data structure
- *
- * DMA callback is called when the transfer is complete or when an error
- * occurs. If the transfer is complete, EOT flag is raised.
- */
-static void stm32h7_spi_dma_cb(void *data)
-{
- struct stm32_spi *spi = data;
- unsigned long flags;
- u32 sr;
-
- spin_lock_irqsave(&spi->lock, flags);
-
- sr = readl_relaxed(spi->base + STM32H7_SPI_SR);
-
- spin_unlock_irqrestore(&spi->lock, flags);
-
- if (!(sr & STM32H7_SPI_SR_EOT))
- dev_warn(spi->dev, "DMA error (sr=0x%08x)\n", sr);
-
- /* Now wait for EOT, or SUSP or OVR in case of error */
+ spi->cfg->disable(spi);
}
/**
*/
static void stm32h7_spi_transfer_one_dma_start(struct stm32_spi *spi)
{
- /* Enable the interrupts relative to the end of transfer */
- stm32_spi_set_bits(spi, STM32H7_SPI_IER, STM32H7_SPI_IER_EOTIE |
- STM32H7_SPI_IER_TXTFIE |
- STM32H7_SPI_IER_OVRIE |
- STM32H7_SPI_IER_MODFIE);
+ uint32_t ier = STM32H7_SPI_IER_OVRIE | STM32H7_SPI_IER_MODFIE;
+
+ /* Enable the interrupts */
+ if (spi->cur_comm == SPI_SIMPLEX_TX || spi->cur_comm == SPI_3WIRE_TX)
+ ier |= STM32H7_SPI_IER_EOTIE | STM32H7_SPI_IER_TXTFIE;
+
+ stm32_spi_set_bits(spi, STM32H7_SPI_IER, ier);
stm32_spi_enable(spi);
struct stm32_spi *spi = spi_master_get_devdata(master);
int ret;
- /* Don't do anything on 0 bytes transfers */
- if (transfer->len == 0)
- return 0;
-
spi->tx_buf = transfer->tx_buf;
spi->rx_buf = transfer->rx_buf;
spi->tx_len = spi->tx_buf ? transfer->len : 0;
.set_mode = stm32f4_spi_set_mode,
.transfer_one_dma_start = stm32f4_spi_transfer_one_dma_start,
.dma_tx_cb = stm32f4_spi_dma_tx_cb,
- .dma_rx_cb = stm32f4_spi_dma_rx_cb,
+ .dma_rx_cb = stm32_spi_dma_rx_cb,
.transfer_one_irq = stm32f4_spi_transfer_one_irq,
.irq_handler_event = stm32f4_spi_irq_event,
.irq_handler_thread = stm32f4_spi_irq_thread,
.set_data_idleness = stm32h7_spi_data_idleness,
.set_number_of_data = stm32h7_spi_number_of_data,
.transfer_one_dma_start = stm32h7_spi_transfer_one_dma_start,
- .dma_rx_cb = stm32h7_spi_dma_cb,
- .dma_tx_cb = stm32h7_spi_dma_cb,
+ .dma_rx_cb = stm32_spi_dma_rx_cb,
+ /*
+ * dma_tx_cb is not necessary since in case of TX, dma is followed by
+ * SPI access hence handling is performed within the SPI interrupt
+ */
.transfer_one_irq = stm32h7_spi_transfer_one_irq,
.irq_handler_thread = stm32h7_spi_irq_thread,
.baud_rate_div_min = STM32H7_SPI_MBR_DIV_MIN,
if (spi->dma_tx || spi->dma_rx)
master->can_dma = stm32_spi_can_dma;
+ pm_runtime_set_autosuspend_delay(&pdev->dev,
+ STM32_SPI_AUTOSUSPEND_DELAY);
+ pm_runtime_use_autosuspend(&pdev->dev);
pm_runtime_set_active(&pdev->dev);
pm_runtime_get_noresume(&pdev->dev);
pm_runtime_enable(&pdev->dev);
goto err_pm_disable;
}
+ pm_runtime_mark_last_busy(&pdev->dev);
+ pm_runtime_put_autosuspend(&pdev->dev);
+
dev_info(&pdev->dev, "driver initialized\n");
return 0;
pm_runtime_disable(&pdev->dev);
pm_runtime_put_noidle(&pdev->dev);
pm_runtime_set_suspended(&pdev->dev);
+ pm_runtime_dont_use_autosuspend(&pdev->dev);
err_dma_release:
if (spi->dma_tx)
dma_release_channel(spi->dma_tx);
pm_runtime_disable(&pdev->dev);
pm_runtime_put_noidle(&pdev->dev);
pm_runtime_set_suspended(&pdev->dev);
+ pm_runtime_dont_use_autosuspend(&pdev->dev);
+
if (master->dma_tx)
dma_release_channel(master->dma_tx);
if (master->dma_rx)
dma_release_channel(dma_chan);
}
-static int tegra_spi_set_hw_cs_timing(struct spi_device *spi,
- struct spi_delay *setup,
- struct spi_delay *hold,
- struct spi_delay *inactive)
+static int tegra_spi_set_hw_cs_timing(struct spi_device *spi)
{
struct tegra_spi_data *tspi = spi_master_get_devdata(spi->master);
+ struct spi_delay *setup = &spi->cs_setup;
+ struct spi_delay *hold = &spi->cs_hold;
+ struct spi_delay *inactive = &spi->cs_inactive;
u8 setup_dly, hold_dly, inactive_dly;
u32 setup_hold;
u32 spi_cs_timing;
dev_err(&pdev->dev, "Can not get clock %d\n", ret);
goto exit_free_master;
}
- ret = clk_prepare(tspi->clk);
- if (ret < 0) {
- dev_err(&pdev->dev, "Clock prepare failed %d\n", ret);
- goto exit_free_master;
- }
- ret = clk_enable(tspi->clk);
- if (ret < 0) {
- dev_err(&pdev->dev, "Clock enable failed %d\n", ret);
- goto exit_clk_unprepare;
- }
-
- spi_irq = platform_get_irq(pdev, 0);
- tspi->irq = spi_irq;
- ret = request_threaded_irq(tspi->irq, tegra_slink_isr,
- tegra_slink_isr_thread, IRQF_ONESHOT,
- dev_name(&pdev->dev), tspi);
- if (ret < 0) {
- dev_err(&pdev->dev, "Failed to register ISR for IRQ %d\n",
- tspi->irq);
- goto exit_clk_disable;
- }
tspi->rst = devm_reset_control_get_exclusive(&pdev->dev, "spi");
if (IS_ERR(tspi->rst)) {
dev_err(&pdev->dev, "can not get reset\n");
ret = PTR_ERR(tspi->rst);
- goto exit_free_irq;
+ goto exit_free_master;
}
tspi->max_buf_size = SLINK_FIFO_DEPTH << 2;
ret = tegra_slink_init_dma_param(tspi, true);
if (ret < 0)
- goto exit_free_irq;
+ goto exit_free_master;
ret = tegra_slink_init_dma_param(tspi, false);
if (ret < 0)
goto exit_rx_dma_free;
init_completion(&tspi->xfer_completion);
pm_runtime_enable(&pdev->dev);
- if (!pm_runtime_enabled(&pdev->dev)) {
- ret = tegra_slink_runtime_resume(&pdev->dev);
- if (ret)
- goto exit_pm_disable;
- }
-
- ret = pm_runtime_get_sync(&pdev->dev);
- if (ret < 0) {
+ ret = pm_runtime_resume_and_get(&pdev->dev);
+ if (ret) {
dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret);
- pm_runtime_put_noidle(&pdev->dev);
goto exit_pm_disable;
}
udelay(2);
reset_control_deassert(tspi->rst);
+ spi_irq = platform_get_irq(pdev, 0);
+ tspi->irq = spi_irq;
+ ret = request_threaded_irq(tspi->irq, tegra_slink_isr,
+ tegra_slink_isr_thread, IRQF_ONESHOT,
+ dev_name(&pdev->dev), tspi);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "Failed to register ISR for IRQ %d\n",
+ tspi->irq);
+ goto exit_pm_put;
+ }
+
tspi->def_command_reg = SLINK_M_S;
tspi->def_command2_reg = SLINK_CS_ACTIVE_BETWEEN;
tegra_slink_writel(tspi, tspi->def_command_reg, SLINK_COMMAND);
tegra_slink_writel(tspi, tspi->def_command2_reg, SLINK_COMMAND2);
- pm_runtime_put(&pdev->dev);
master->dev.of_node = pdev->dev.of_node;
- ret = devm_spi_register_master(&pdev->dev, master);
+ ret = spi_register_master(master);
if (ret < 0) {
dev_err(&pdev->dev, "can not register to master err %d\n", ret);
- goto exit_pm_disable;
+ goto exit_free_irq;
}
+
+ pm_runtime_put(&pdev->dev);
+
return ret;
+exit_free_irq:
+ free_irq(spi_irq, tspi);
+exit_pm_put:
+ pm_runtime_put(&pdev->dev);
exit_pm_disable:
pm_runtime_disable(&pdev->dev);
- if (!pm_runtime_status_suspended(&pdev->dev))
- tegra_slink_runtime_suspend(&pdev->dev);
+
tegra_slink_deinit_dma_param(tspi, false);
exit_rx_dma_free:
tegra_slink_deinit_dma_param(tspi, true);
-exit_free_irq:
- free_irq(spi_irq, tspi);
-exit_clk_disable:
- clk_disable(tspi->clk);
-exit_clk_unprepare:
- clk_unprepare(tspi->clk);
exit_free_master:
spi_master_put(master);
return ret;
struct spi_master *master = platform_get_drvdata(pdev);
struct tegra_slink_data *tspi = spi_master_get_devdata(master);
+ spi_unregister_master(master);
+
free_irq(tspi->irq, tspi);
- clk_disable(tspi->clk);
- clk_unprepare(tspi->clk);
+ pm_runtime_disable(&pdev->dev);
if (tspi->tx_dma_chan)
tegra_slink_deinit_dma_param(tspi, false);
if (tspi->rx_dma_chan)
tegra_slink_deinit_dma_param(tspi, true);
- pm_runtime_disable(&pdev->dev);
- if (!pm_runtime_status_suspended(&pdev->dev))
- tegra_slink_runtime_suspend(&pdev->dev);
-
return 0;
}
zynq_qspi_write_op(xqspi, ZYNQ_QSPI_FIFO_DEPTH, true);
zynq_qspi_write(xqspi, ZYNQ_QSPI_IEN_OFFSET,
ZYNQ_QSPI_IXR_RXTX_MASK);
- if (!wait_for_completion_interruptible_timeout(&xqspi->data_completion,
+ if (!wait_for_completion_timeout(&xqspi->data_completion,
msecs_to_jiffies(1000)))
err = -ETIMEDOUT;
}
zynq_qspi_write_op(xqspi, ZYNQ_QSPI_FIFO_DEPTH, true);
zynq_qspi_write(xqspi, ZYNQ_QSPI_IEN_OFFSET,
ZYNQ_QSPI_IXR_RXTX_MASK);
- if (!wait_for_completion_interruptible_timeout(&xqspi->data_completion,
+ if (!wait_for_completion_timeout(&xqspi->data_completion,
msecs_to_jiffies(1000)))
err = -ETIMEDOUT;
}
zynq_qspi_write_op(xqspi, ZYNQ_QSPI_FIFO_DEPTH, true);
zynq_qspi_write(xqspi, ZYNQ_QSPI_IEN_OFFSET,
ZYNQ_QSPI_IXR_RXTX_MASK);
- if (!wait_for_completion_interruptible_timeout(&xqspi->data_completion,
+ if (!wait_for_completion_timeout(&xqspi->data_completion,
msecs_to_jiffies(1000)))
err = -ETIMEDOUT;
zynq_qspi_write_op(xqspi, ZYNQ_QSPI_FIFO_DEPTH, true);
zynq_qspi_write(xqspi, ZYNQ_QSPI_IEN_OFFSET,
ZYNQ_QSPI_IXR_RXTX_MASK);
- if (!wait_for_completion_interruptible_timeout(&xqspi->data_completion,
+ if (!wait_for_completion_timeout(&xqspi->data_completion,
msecs_to_jiffies(1000)))
err = -ETIMEDOUT;
}
if (spi->cs_gpiod || gpio_is_valid(spi->cs_gpio) ||
!spi->controller->set_cs_timing) {
if (activate)
- spi_delay_exec(&spi->controller->cs_setup, NULL);
+ spi_delay_exec(&spi->cs_setup, NULL);
else
- spi_delay_exec(&spi->controller->cs_hold, NULL);
+ spi_delay_exec(&spi->cs_hold, NULL);
}
if (spi->mode & SPI_CS_HIGH)
if (spi->cs_gpiod || gpio_is_valid(spi->cs_gpio) ||
!spi->controller->set_cs_timing) {
if (!activate)
- spi_delay_exec(&spi->controller->cs_inactive, NULL);
+ spi_delay_exec(&spi->cs_inactive, NULL);
}
}
dev_dbg(isp->dev, "Stop stream on pad %d for asd%d\n",
atomisp_subdev_source_pad(vdev), asd->index);
- BUG_ON(!rt_mutex_is_locked(&isp->mutex));
- BUG_ON(!mutex_is_locked(&isp->streamoff_mutex));
+ lockdep_assert_held(&isp->mutex);
+ lockdep_assert_held(&isp->streamoff_mutex);
if (type != V4L2_BUF_TYPE_VIDEO_CAPTURE) {
dev_dbg(isp->dev, "unsupported v4l2 buf type\n");
+++ /dev/null
-/* SPDX-License-Identifier: LGPL-2.1+ WITH Linux-syscall-note */
-/*
- * audio.h - DEPRECATED MPEG-TS audio decoder API
- *
- * NOTE: should not be used on future drivers
- *
- * Copyright (C) 2000 Ralph Metzler <ralph@convergence.de>
- * & Marcus Metzler <marcus@convergence.de>
- * for convergence integrated media GmbH
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Lesser Public License
- * as published by the Free Software Foundation; either version 2.1
- * of the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
- *
- */
-
-#ifndef _DVBAUDIO_H_
-#define _DVBAUDIO_H_
-
-#include <linux/types.h>
-
-typedef enum {
- AUDIO_SOURCE_DEMUX, /* Select the demux as the main source */
- AUDIO_SOURCE_MEMORY /* Select internal memory as the main source */
-} audio_stream_source_t;
-
-
-typedef enum {
- AUDIO_STOPPED, /* Device is stopped */
- AUDIO_PLAYING, /* Device is currently playing */
- AUDIO_PAUSED /* Device is paused */
-} audio_play_state_t;
-
-
-typedef enum {
- AUDIO_STEREO,
- AUDIO_MONO_LEFT,
- AUDIO_MONO_RIGHT,
- AUDIO_MONO,
- AUDIO_STEREO_SWAPPED
-} audio_channel_select_t;
-
-
-typedef struct audio_mixer {
- unsigned int volume_left;
- unsigned int volume_right;
- /* what else do we need? bass, pass-through, ... */
-} audio_mixer_t;
-
-
-typedef struct audio_status {
- int AV_sync_state; /* sync audio and video? */
- int mute_state; /* audio is muted */
- audio_play_state_t play_state; /* current playback state */
- audio_stream_source_t stream_source; /* current stream source */
- audio_channel_select_t channel_select; /* currently selected channel */
- int bypass_mode; /* pass on audio data to */
- audio_mixer_t mixer_state; /* current mixer state */
-} audio_status_t; /* separate decoder hardware */
-
-
-/* for GET_CAPABILITIES and SET_FORMAT, the latter should only set one bit */
-#define AUDIO_CAP_DTS 1
-#define AUDIO_CAP_LPCM 2
-#define AUDIO_CAP_MP1 4
-#define AUDIO_CAP_MP2 8
-#define AUDIO_CAP_MP3 16
-#define AUDIO_CAP_AAC 32
-#define AUDIO_CAP_OGG 64
-#define AUDIO_CAP_SDDS 128
-#define AUDIO_CAP_AC3 256
-
-#define AUDIO_STOP _IO('o', 1)
-#define AUDIO_PLAY _IO('o', 2)
-#define AUDIO_PAUSE _IO('o', 3)
-#define AUDIO_CONTINUE _IO('o', 4)
-#define AUDIO_SELECT_SOURCE _IO('o', 5)
-#define AUDIO_SET_MUTE _IO('o', 6)
-#define AUDIO_SET_AV_SYNC _IO('o', 7)
-#define AUDIO_SET_BYPASS_MODE _IO('o', 8)
-#define AUDIO_CHANNEL_SELECT _IO('o', 9)
-#define AUDIO_GET_STATUS _IOR('o', 10, audio_status_t)
-
-#define AUDIO_GET_CAPABILITIES _IOR('o', 11, unsigned int)
-#define AUDIO_CLEAR_BUFFER _IO('o', 12)
-#define AUDIO_SET_ID _IO('o', 13)
-#define AUDIO_SET_MIXER _IOW('o', 14, audio_mixer_t)
-#define AUDIO_SET_STREAMTYPE _IO('o', 15)
-#define AUDIO_BILINGUAL_CHANNEL_SELECT _IO('o', 20)
-
-#endif /* _DVBAUDIO_H_ */
#include <linux/input.h>
#include <linux/time.h>
-#include "video.h"
-#include "audio.h"
-#include "osd.h"
-
+#include <linux/dvb/video.h>
+#include <linux/dvb/audio.h>
#include <linux/dvb/dmx.h>
#include <linux/dvb/ca.h>
+#include <linux/dvb/osd.h>
#include <linux/dvb/net.h>
#include <linux/mutex.h>
+++ /dev/null
-/* SPDX-License-Identifier: LGPL-2.1+ WITH Linux-syscall-note */
-/*
- * osd.h - DEPRECATED On Screen Display API
- *
- * NOTE: should not be used on future drivers
- *
- * Copyright (C) 2001 Ralph Metzler <ralph@convergence.de>
- * & Marcus Metzler <marcus@convergence.de>
- * for convergence integrated media GmbH
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Lesser Public License
- * as published by the Free Software Foundation; either version 2.1
- * of the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
- *
- */
-
-#ifndef _DVBOSD_H_
-#define _DVBOSD_H_
-
-#include <linux/compiler.h>
-
-typedef enum {
- /* All functions return -2 on "not open" */
- OSD_Close = 1, /* () */
- /*
- * Disables OSD and releases the buffers
- * returns 0 on success
- */
- OSD_Open, /* (x0,y0,x1,y1,BitPerPixel[2/4/8](color&0x0F),mix[0..15](color&0xF0)) */
- /*
- * Opens OSD with this size and bit depth
- * returns 0 on success, -1 on DRAM allocation error, -2 on "already open"
- */
- OSD_Show, /* () */
- /*
- * enables OSD mode
- * returns 0 on success
- */
- OSD_Hide, /* () */
- /*
- * disables OSD mode
- * returns 0 on success
- */
- OSD_Clear, /* () */
- /*
- * Sets all pixel to color 0
- * returns 0 on success
- */
- OSD_Fill, /* (color) */
- /*
- * Sets all pixel to color <col>
- * returns 0 on success
- */
- OSD_SetColor, /* (color,R{x0},G{y0},B{x1},opacity{y1}) */
- /*
- * set palette entry <num> to <r,g,b>, <mix> and <trans> apply
- * R,G,B: 0..255
- * R=Red, G=Green, B=Blue
- * opacity=0: pixel opacity 0% (only video pixel shows)
- * opacity=1..254: pixel opacity as specified in header
- * opacity=255: pixel opacity 100% (only OSD pixel shows)
- * returns 0 on success, -1 on error
- */
- OSD_SetPalette, /* (firstcolor{color},lastcolor{x0},data) */
- /*
- * Set a number of entries in the palette
- * sets the entries "firstcolor" through "lastcolor" from the array "data"
- * data has 4 byte for each color:
- * R,G,B, and a opacity value: 0->transparent, 1..254->mix, 255->pixel
- */
- OSD_SetTrans, /* (transparency{color}) */
- /*
- * Sets transparency of mixed pixel (0..15)
- * returns 0 on success
- */
- OSD_SetPixel, /* (x0,y0,color) */
- /*
- * sets pixel <x>,<y> to color number <col>
- * returns 0 on success, -1 on error
- */
- OSD_GetPixel, /* (x0,y0) */
- /* returns color number of pixel <x>,<y>, or -1 */
- OSD_SetRow, /* (x0,y0,x1,data) */
- /*
- * fills pixels x0,y through x1,y with the content of data[]
- * returns 0 on success, -1 on clipping all pixel (no pixel drawn)
- */
- OSD_SetBlock, /* (x0,y0,x1,y1,increment{color},data) */
- /*
- * fills pixels x0,y0 through x1,y1 with the content of data[]
- * inc contains the width of one line in the data block,
- * inc<=0 uses blockwidth as linewidth
- * returns 0 on success, -1 on clipping all pixel
- */
- OSD_FillRow, /* (x0,y0,x1,color) */
- /*
- * fills pixels x0,y through x1,y with the color <col>
- * returns 0 on success, -1 on clipping all pixel
- */
- OSD_FillBlock, /* (x0,y0,x1,y1,color) */
- /*
- * fills pixels x0,y0 through x1,y1 with the color <col>
- * returns 0 on success, -1 on clipping all pixel
- */
- OSD_Line, /* (x0,y0,x1,y1,color) */
- /*
- * draw a line from x0,y0 to x1,y1 with the color <col>
- * returns 0 on success
- */
- OSD_Query, /* (x0,y0,x1,y1,xasp{color}}), yasp=11 */
- /*
- * fills parameters with the picture dimensions and the pixel aspect ratio
- * returns 0 on success
- */
- OSD_Test, /* () */
- /*
- * draws a test picture. for debugging purposes only
- * returns 0 on success
- * TODO: remove "test" in final version
- */
- OSD_Text, /* (x0,y0,size,color,text) */
- OSD_SetWindow, /* (x0) set window with number 0<x0<8 as current */
- OSD_MoveWindow, /* move current window to (x0, y0) */
- OSD_OpenRaw, /* Open other types of OSD windows */
-} OSD_Command;
-
-typedef struct osd_cmd_s {
- OSD_Command cmd;
- int x0;
- int y0;
- int x1;
- int y1;
- int color;
- void __user *data;
-} osd_cmd_t;
-
-/* OSD_OpenRaw: set 'color' to desired window type */
-typedef enum {
- OSD_BITMAP1, /* 1 bit bitmap */
- OSD_BITMAP2, /* 2 bit bitmap */
- OSD_BITMAP4, /* 4 bit bitmap */
- OSD_BITMAP8, /* 8 bit bitmap */
- OSD_BITMAP1HR, /* 1 Bit bitmap half resolution */
- OSD_BITMAP2HR, /* 2 bit bitmap half resolution */
- OSD_BITMAP4HR, /* 4 bit bitmap half resolution */
- OSD_BITMAP8HR, /* 8 bit bitmap half resolution */
- OSD_YCRCB422, /* 4:2:2 YCRCB Graphic Display */
- OSD_YCRCB444, /* 4:4:4 YCRCB Graphic Display */
- OSD_YCRCB444HR, /* 4:4:4 YCRCB graphic half resolution */
- OSD_VIDEOTSIZE, /* True Size Normal MPEG Video Display */
- OSD_VIDEOHSIZE, /* MPEG Video Display Half Resolution */
- OSD_VIDEOQSIZE, /* MPEG Video Display Quarter Resolution */
- OSD_VIDEODSIZE, /* MPEG Video Display Double Resolution */
- OSD_VIDEOTHSIZE, /* True Size MPEG Video Display Half Resolution */
- OSD_VIDEOTQSIZE, /* True Size MPEG Video Display Quarter Resolution*/
- OSD_VIDEOTDSIZE, /* True Size MPEG Video Display Double Resolution */
- OSD_VIDEONSIZE, /* Full Size MPEG Video Display */
- OSD_CURSOR /* Cursor */
-} osd_raw_window_t;
-
-typedef struct osd_cap_s {
- int cmd;
-#define OSD_CAP_MEMSIZE 1 /* memory size */
- long val;
-} osd_cap_t;
-
-
-#define OSD_SEND_CMD _IOW('o', 160, osd_cmd_t)
-#define OSD_GET_CAPABILITY _IOR('o', 161, osd_cap_t)
-
-#endif
+++ /dev/null
-/* SPDX-License-Identifier: LGPL-2.1+ WITH Linux-syscall-note */
-/*
- * video.h - DEPRECATED MPEG-TS video decoder API
- *
- * NOTE: should not be used on future drivers
- *
- * Copyright (C) 2000 Marcus Metzler <marcus@convergence.de>
- * & Ralph Metzler <ralph@convergence.de>
- * for convergence integrated media GmbH
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU Lesser General Public License
- * as published by the Free Software Foundation; either version 2.1
- * of the License, or (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU Lesser General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
- *
- */
-
-#ifndef _UAPI_DVBVIDEO_H_
-#define _UAPI_DVBVIDEO_H_
-
-#include <linux/types.h>
-#ifndef __KERNEL__
-#include <time.h>
-#endif
-
-typedef enum {
- VIDEO_FORMAT_4_3, /* Select 4:3 format */
- VIDEO_FORMAT_16_9, /* Select 16:9 format. */
- VIDEO_FORMAT_221_1 /* 2.21:1 */
-} video_format_t;
-
-
-typedef enum {
- VIDEO_PAN_SCAN, /* use pan and scan format */
- VIDEO_LETTER_BOX, /* use letterbox format */
- VIDEO_CENTER_CUT_OUT /* use center cut out format */
-} video_displayformat_t;
-
-typedef struct {
- int w;
- int h;
- video_format_t aspect_ratio;
-} video_size_t;
-
-typedef enum {
- VIDEO_SOURCE_DEMUX, /* Select the demux as the main source */
- VIDEO_SOURCE_MEMORY /* If this source is selected, the stream
- comes from the user through the write
- system call */
-} video_stream_source_t;
-
-
-typedef enum {
- VIDEO_STOPPED, /* Video is stopped */
- VIDEO_PLAYING, /* Video is currently playing */
- VIDEO_FREEZED /* Video is freezed */
-} video_play_state_t;
-
-
-/* Decoder commands */
-#define VIDEO_CMD_PLAY (0)
-#define VIDEO_CMD_STOP (1)
-#define VIDEO_CMD_FREEZE (2)
-#define VIDEO_CMD_CONTINUE (3)
-
-/* Flags for VIDEO_CMD_FREEZE */
-#define VIDEO_CMD_FREEZE_TO_BLACK (1 << 0)
-
-/* Flags for VIDEO_CMD_STOP */
-#define VIDEO_CMD_STOP_TO_BLACK (1 << 0)
-#define VIDEO_CMD_STOP_IMMEDIATELY (1 << 1)
-
-/* Play input formats: */
-/* The decoder has no special format requirements */
-#define VIDEO_PLAY_FMT_NONE (0)
-/* The decoder requires full GOPs */
-#define VIDEO_PLAY_FMT_GOP (1)
-
-/* The structure must be zeroed before use by the application
- This ensures it can be extended safely in the future. */
-struct video_command {
- __u32 cmd;
- __u32 flags;
- union {
- struct {
- __u64 pts;
- } stop;
-
- struct {
- /* 0 or 1000 specifies normal speed,
- 1 specifies forward single stepping,
- -1 specifies backward single stepping,
- >1: playback at speed/1000 of the normal speed,
- <-1: reverse playback at (-speed/1000) of the normal speed. */
- __s32 speed;
- __u32 format;
- } play;
-
- struct {
- __u32 data[16];
- } raw;
- };
-};
-
-/* FIELD_UNKNOWN can be used if the hardware does not know whether
- the Vsync is for an odd, even or progressive (i.e. non-interlaced)
- field. */
-#define VIDEO_VSYNC_FIELD_UNKNOWN (0)
-#define VIDEO_VSYNC_FIELD_ODD (1)
-#define VIDEO_VSYNC_FIELD_EVEN (2)
-#define VIDEO_VSYNC_FIELD_PROGRESSIVE (3)
-
-struct video_event {
- __s32 type;
-#define VIDEO_EVENT_SIZE_CHANGED 1
-#define VIDEO_EVENT_FRAME_RATE_CHANGED 2
-#define VIDEO_EVENT_DECODER_STOPPED 3
-#define VIDEO_EVENT_VSYNC 4
- /* unused, make sure to use atomic time for y2038 if it ever gets used */
- long timestamp;
- union {
- video_size_t size;
- unsigned int frame_rate; /* in frames per 1000sec */
- unsigned char vsync_field; /* unknown/odd/even/progressive */
- } u;
-};
-
-
-struct video_status {
- int video_blank; /* blank video on freeze? */
- video_play_state_t play_state; /* current state of playback */
- video_stream_source_t stream_source; /* current source (demux/memory) */
- video_format_t video_format; /* current aspect ratio of stream*/
- video_displayformat_t display_format;/* selected cropping mode */
-};
-
-
-struct video_still_picture {
- char __user *iFrame; /* pointer to a single iframe in memory */
- __s32 size;
-};
-
-
-typedef __u16 video_attributes_t;
-/* bits: descr. */
-/* 15-14 Video compression mode (0=MPEG-1, 1=MPEG-2) */
-/* 13-12 TV system (0=525/60, 1=625/50) */
-/* 11-10 Aspect ratio (0=4:3, 3=16:9) */
-/* 9- 8 permitted display mode on 4:3 monitor (0=both, 1=only pan-sca */
-/* 7 line 21-1 data present in GOP (1=yes, 0=no) */
-/* 6 line 21-2 data present in GOP (1=yes, 0=no) */
-/* 5- 3 source resolution (0=720x480/576, 1=704x480/576, 2=352x480/57 */
-/* 2 source letterboxed (1=yes, 0=no) */
-/* 0 film/camera mode (0=
- *camera, 1=film (625/50 only)) */
-
-
-/* bit definitions for capabilities: */
-/* can the hardware decode MPEG1 and/or MPEG2? */
-#define VIDEO_CAP_MPEG1 1
-#define VIDEO_CAP_MPEG2 2
-/* can you send a system and/or program stream to video device?
- (you still have to open the video and the audio device but only
- send the stream to the video device) */
-#define VIDEO_CAP_SYS 4
-#define VIDEO_CAP_PROG 8
-/* can the driver also handle SPU, NAVI and CSS encoded data?
- (CSS API is not present yet) */
-#define VIDEO_CAP_SPU 16
-#define VIDEO_CAP_NAVI 32
-#define VIDEO_CAP_CSS 64
-
-
-#define VIDEO_STOP _IO('o', 21)
-#define VIDEO_PLAY _IO('o', 22)
-#define VIDEO_FREEZE _IO('o', 23)
-#define VIDEO_CONTINUE _IO('o', 24)
-#define VIDEO_SELECT_SOURCE _IO('o', 25)
-#define VIDEO_SET_BLANK _IO('o', 26)
-#define VIDEO_GET_STATUS _IOR('o', 27, struct video_status)
-#define VIDEO_GET_EVENT _IOR('o', 28, struct video_event)
-#define VIDEO_SET_DISPLAY_FORMAT _IO('o', 29)
-#define VIDEO_STILLPICTURE _IOW('o', 30, struct video_still_picture)
-#define VIDEO_FAST_FORWARD _IO('o', 31)
-#define VIDEO_SLOWMOTION _IO('o', 32)
-#define VIDEO_GET_CAPABILITIES _IOR('o', 33, unsigned int)
-#define VIDEO_CLEAR_BUFFER _IO('o', 34)
-#define VIDEO_SET_STREAMTYPE _IO('o', 36)
-#define VIDEO_SET_FORMAT _IO('o', 37)
-#define VIDEO_GET_SIZE _IOR('o', 55, video_size_t)
-
-/**
- * VIDEO_GET_PTS
- *
- * Read the 33 bit presentation time stamp as defined
- * in ITU T-REC-H.222.0 / ISO/IEC 13818-1.
- *
- * The PTS should belong to the currently played
- * frame if possible, but may also be a value close to it
- * like the PTS of the last decoded frame or the last PTS
- * extracted by the PES parser.
- */
-#define VIDEO_GET_PTS _IOR('o', 57, __u64)
-
-/* Read the number of displayed frames since the decoder was started */
-#define VIDEO_GET_FRAME_COUNT _IOR('o', 58, __u64)
-
-#define VIDEO_COMMAND _IOWR('o', 59, struct video_command)
-#define VIDEO_TRY_COMMAND _IOWR('o', 60, struct video_command)
-
-#endif /* _UAPI_DVBVIDEO_H_ */
enum { ESnormal, ESesc, ESsquare, ESgetpars, ESfunckey,
EShash, ESsetG0, ESsetG1, ESpercent, EScsiignore, ESnonstd,
- ESpalette, ESosc };
+ ESpalette, ESosc, ESapc, ESpm, ESdcs };
/* console_lock is held (except via vc_init()) */
static void reset_terminal(struct vc_data *vc, int do_clear)
vc->vc_translate = set_translate(*charset, vc);
}
+/* is this state an ANSI control string? */
+static bool ansi_control_string(unsigned int state)
+{
+ if (state == ESosc || state == ESapc || state == ESpm || state == ESdcs)
+ return true;
+ return false;
+}
+
/* console_lock is held */
static void do_con_trol(struct tty_struct *tty, struct vc_data *vc, int c)
{
/*
* Control characters can be used in the _middle_
- * of an escape sequence.
+ * of an escape sequence, aside from ANSI control strings.
*/
- if (vc->vc_state == ESosc && c>=8 && c<=13) /* ... except for OSC */
+ if (ansi_control_string(vc->vc_state) && c >= 8 && c <= 13)
return;
switch (c) {
case 0:
return;
case 7:
- if (vc->vc_state == ESosc)
+ if (ansi_control_string(vc->vc_state))
vc->vc_state = ESnormal;
else if (vc->vc_bell_duration)
kd_mksound(vc->vc_bell_pitch, vc->vc_bell_duration);
case ']':
vc->vc_state = ESnonstd;
return;
+ case '_':
+ vc->vc_state = ESapc;
+ return;
+ case '^':
+ vc->vc_state = ESpm;
+ return;
case '%':
vc->vc_state = ESpercent;
return;
if (vc->state.x < VC_TABSTOPS_COUNT)
set_bit(vc->state.x, vc->vc_tab_stop);
return;
+ case 'P':
+ vc->vc_state = ESdcs;
+ return;
case 'Z':
respond_ID(tty);
return;
vc_setGx(vc, 1, c);
vc->vc_state = ESnormal;
return;
+ case ESapc:
+ return;
case ESosc:
return;
+ case ESpm:
+ return;
+ case ESdcs:
+ return;
default:
vc->vc_state = ESnormal;
}
*
* XXX It should at least call into the driver, fbdev's definitely need to
* restore their engine state. --BenH
+ *
+ * Called with the console lock held.
*/
static int vt_kdsetmode(struct vc_data *vc, unsigned long mode)
{
return -EINVAL;
}
- /* FIXME: this needs the console lock extending */
if (vc->vc_mode == mode)
return 0;
return 0;
/* explicitly blank/unblank the screen if switching modes */
- console_lock();
if (mode == KD_TEXT)
do_unblank_screen(1);
else
do_blank_screen(1);
- console_unlock();
return 0;
}
if (!perm)
return -EPERM;
- return vt_kdsetmode(vc, arg);
+ console_lock();
+ ret = vt_kdsetmode(vc, arg);
+ console_unlock();
+ return ret;
case KDGETMODE:
return put_user(vc->vc_mode, (int __user *)arg);
static u32 dwc3_calc_trbs_left(struct dwc3_ep *dep)
{
- struct dwc3_trb *tmp;
u8 trbs_left;
/*
- * If enqueue & dequeue are equal than it is either full or empty.
- *
- * One way to know for sure is if the TRB right before us has HWO bit
- * set or not. If it has, then we're definitely full and can't fit any
- * more transfers in our ring.
+ * If the enqueue & dequeue are equal then the TRB ring is either full
+ * or empty. It's considered full when there are DWC3_TRB_NUM-1 of TRBs
+ * pending to be processed by the driver.
*/
if (dep->trb_enqueue == dep->trb_dequeue) {
- tmp = dwc3_ep_prev_trb(dep, dep->trb_enqueue);
- if (tmp->ctrl & DWC3_TRB_CTRL_HWO)
+ /*
+ * If there is any request remained in the started_list at
+ * this point, that means there is no TRB available.
+ */
+ if (!list_empty(&dep->started_list))
return 0;
return DWC3_TRB_NUM - 1;
ret = wait_for_completion_timeout(&dwc->ep0_in_setup,
msecs_to_jiffies(DWC3_PULL_UP_TIMEOUT));
- if (ret == 0) {
- dev_err(dwc->dev, "timed out waiting for SETUP phase\n");
- return -ETIMEDOUT;
- }
+ if (ret == 0)
+ dev_warn(dwc->dev, "timed out waiting for SETUP phase\n");
}
/*
/* begin to receive SETUP packets */
dwc->ep0state = EP0_SETUP_PHASE;
dwc->link_state = DWC3_LINK_STATE_SS_DIS;
+ dwc->delayed_status = false;
dwc3_ep0_out_start(dwc);
dwc3_gadget_enable_irq(dwc);
int status = req->status;
/* i/f shutting down */
- if (!prm->fb_ep_enabled || req->status == -ESHUTDOWN)
+ if (!prm->fb_ep_enabled) {
+ kfree(req->buf);
+ usb_ep_free_request(ep, req);
+ return;
+ }
+
+ if (req->status == -ESHUTDOWN)
return;
/*
if (!prm->ep_enabled)
return;
- prm->ep_enabled = false;
-
audio_dev = uac->audio_dev;
params = &audio_dev->params;
}
}
+ prm->ep_enabled = false;
+
if (usb_ep_disable(ep))
dev_err(uac->card->dev, "%s:%d Error!\n", __func__, __LINE__);
}
if (!prm->fb_ep_enabled)
return;
- prm->fb_ep_enabled = false;
-
if (prm->req_fback) {
- usb_ep_dequeue(ep, prm->req_fback);
- kfree(prm->req_fback->buf);
- usb_ep_free_request(ep, prm->req_fback);
+ if (usb_ep_dequeue(ep, prm->req_fback)) {
+ kfree(prm->req_fback->buf);
+ usb_ep_free_request(ep, prm->req_fback);
+ }
prm->req_fback = NULL;
}
+ prm->fb_ep_enabled = false;
+
if (usb_ep_disable(ep))
dev_err(uac->card->dev, "%s:%d Error!\n", __func__, __LINE__);
}
return 0;
case RENESAS_ROM_STATUS_NO_RESULT: /* No result yet */
- return 0;
+ dev_dbg(&pdev->dev, "Unknown ROM status ...\n");
+ return -ENOENT;
case RENESAS_ROM_STATUS_ERROR: /* Error State */
default: /* All other states are marked as "Reserved states" */
u8 fw_state;
int err;
- /* Check if device has ROM and loaded, if so skip everything */
- err = renesas_check_rom(pdev);
- if (err) { /* we have rom */
- err = renesas_check_rom_state(pdev);
- if (!err)
- return err;
- }
-
/*
* Test if the device is actually needing the firmware. As most
* BIOSes will initialize the device for us. If the device is
(struct xhci_driver_data *)id->driver_data;
const char *fw_name = driver_data->firmware;
const struct firmware *fw;
+ bool has_rom;
int err;
+ /* Check if device has ROM and loaded, if so skip everything */
+ has_rom = renesas_check_rom(pdev);
+ if (has_rom) {
+ err = renesas_check_rom_state(pdev);
+ if (!err)
+ return 0;
+ else if (err != -ENOENT)
+ has_rom = false;
+ }
+
err = renesas_fw_check_running(pdev);
/* Continue ahead, if the firmware is already running. */
if (err == 0)
return 0;
+ /* no firmware interface available */
if (err != 1)
- return err;
+ return has_rom ? 0 : err;
pci_dev_get(pdev);
- err = request_firmware(&fw, fw_name, &pdev->dev);
+ err = firmware_request_nowarn(&fw, fw_name, &pdev->dev);
pci_dev_put(pdev);
if (err) {
- dev_err(&pdev->dev, "request_firmware failed: %d\n", err);
+ if (has_rom) {
+ dev_info(&pdev->dev, "failed to load firmware %s, fallback to ROM\n",
+ fw_name);
+ return 0;
+ }
+ dev_err(&pdev->dev, "failed to load firmware %s: %d\n",
+ fw_name, err);
return err;
}
.owner = THIS_MODULE,
.name = "ch341-uart",
},
- .bulk_in_size = 512,
.id_table = id_table,
.num_ports = 1,
.open = ch341_open,
.driver_info = RSVD(4) | RSVD(5) },
{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x0105, 0xff), /* Fibocom NL678 series */
.driver_info = RSVD(6) },
+ { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x010b, 0xff, 0xff, 0x30) }, /* Fibocom FG150 Diag */
+ { USB_DEVICE_AND_INTERFACE_INFO(0x2cb7, 0x010b, 0xff, 0, 0) }, /* Fibocom FG150 AT */
{ USB_DEVICE_INTERFACE_CLASS(0x2cb7, 0x01a0, 0xff) }, /* Fibocom NL668-AM/NL652-EU (laptop MBIM) */
{ USB_DEVICE_INTERFACE_CLASS(0x2df3, 0x9d03, 0xff) }, /* LongSung M5710 */
{ USB_DEVICE_INTERFACE_CLASS(0x305a, 0x1404, 0xff) }, /* GosunCn GM500 RNDIS */
bool vbus_source;
bool vbus_charge;
+ /* Set to true when Discover_Identity Command is expected to be sent in Ready states. */
bool send_discover;
bool op_vsafe5v;
struct hrtimer send_discover_timer;
struct kthread_work send_discover_work;
bool state_machine_running;
+ /* Set to true when VDM State Machine has following actions. */
bool vdm_sm_running;
struct completion tx_complete;
/* Set ready, vdm state machine will actually send */
port->vdm_retries = 0;
port->vdm_state = VDM_STATE_READY;
+ port->vdm_sm_running = true;
mod_vdm_delayed_work(port, 0);
}
rlen = 1;
} else {
tcpm_register_partner_altmodes(port);
- port->vdm_sm_running = false;
}
break;
case CMD_ENTER_MODE:
(VDO_SVDM_VERS(svdm_version));
break;
}
- port->vdm_sm_running = false;
break;
default:
response[0] = p[0] | VDO_CMDT(CMDT_RSP_NAK);
rlen = 1;
response[0] = (response[0] & ~VDO_SVDM_VERS_MASK) |
(VDO_SVDM_VERS(svdm_version));
- port->vdm_sm_running = false;
break;
}
}
if (PD_VDO_SVDM(p[0]) && (adev || tcpm_vdm_ams(port) || port->nr_snk_vdo)) {
+ /*
+ * Here a SVDM is received (INIT or RSP or unknown). Set the vdm_sm_running in
+ * advance because we are dropping the lock but may send VDMs soon.
+ * For the cases of INIT received:
+ * - If no response to send, it will be cleared later in this function.
+ * - If there are responses to send, it will be cleared in the state machine.
+ * For the cases of RSP received:
+ * - If no further INIT to send, it will be cleared later in this function.
+ * - Otherwise, it will be cleared in the state machine if timeout or it will go
+ * back here until no further INIT to send.
+ * For the cases of unknown type received:
+ * - We will send NAK and the flag will be cleared in the state machine.
+ */
+ port->vdm_sm_running = true;
rlen = tcpm_pd_svdm(port, adev, p, cnt, response, &adev_action);
} else {
if (port->negotiated_rev >= PD_REV30)
if (rlen > 0)
tcpm_queue_vdm(port, response[0], &response[1], rlen - 1);
+ else
+ port->vdm_sm_running = false;
}
static void tcpm_send_vdm(struct tcpm_port *port, u32 vid, int cmd,
* if there's traffic or we're not in PDO ready state don't send
* a VDM.
*/
- if (port->state != SRC_READY && port->state != SNK_READY)
+ if (port->state != SRC_READY && port->state != SNK_READY) {
+ port->vdm_sm_running = false;
break;
+ }
/* TODO: AMS operation for Unstructured VDM */
if (PD_VDO_SVDM(vdo_hdr) && PD_VDO_CMDT(vdo_hdr) == CMDT_INIT) {
TYPEC_PWR_MODE_PD,
port->pps_data.active,
port->supply_voltage);
- /* Set VDM running flag ASAP */
- if (port->data_role == TYPEC_HOST &&
- port->send_discover)
- port->vdm_sm_running = true;
tcpm_set_state(port, SNK_READY, 0);
} else {
/*
switch (port->state) {
case SNK_NEGOTIATE_CAPABILITIES:
/* USB PD specification, Figure 8-43 */
- if (port->explicit_contract) {
+ if (port->explicit_contract)
next_state = SNK_READY;
- if (port->data_role == TYPEC_HOST &&
- port->send_discover)
- port->vdm_sm_running = true;
- } else {
+ else
next_state = SNK_WAIT_CAPABILITIES;
- }
/* Threshold was relaxed before sending Request. Restore it back. */
tcpm_set_auto_vbus_discharge_threshold(port, TYPEC_PWR_MODE_PD,
port->pps_status = (type == PD_CTRL_WAIT ?
-EAGAIN : -EOPNOTSUPP);
- if (port->data_role == TYPEC_HOST &&
- port->send_discover)
- port->vdm_sm_running = true;
-
/* Threshold was relaxed before sending Request. Restore it back. */
tcpm_set_auto_vbus_discharge_threshold(port, TYPEC_PWR_MODE_PD,
port->pps_data.active,
}
break;
case DR_SWAP_SEND:
- if (port->data_role == TYPEC_DEVICE &&
- port->send_discover)
- port->vdm_sm_running = true;
-
tcpm_set_state(port, DR_SWAP_CHANGE_DR, 0);
break;
case PR_SWAP_SEND:
PD_MSG_CTRL_NOT_SUPP,
NONE_AMS);
} else {
- if (port->vdm_sm_running) {
+ if (port->send_discover) {
tcpm_queue_message(port, PD_MSG_CTRL_WAIT);
break;
}
PD_MSG_CTRL_NOT_SUPP,
NONE_AMS);
} else {
- if (port->vdm_sm_running) {
+ if (port->send_discover) {
tcpm_queue_message(port, PD_MSG_CTRL_WAIT);
break;
}
}
break;
case PD_CTRL_VCONN_SWAP:
- if (port->vdm_sm_running) {
+ if (port->send_discover) {
tcpm_queue_message(port, PD_MSG_CTRL_WAIT);
break;
}
/* DR_Swap states */
case DR_SWAP_SEND:
tcpm_pd_send_control(port, PD_CTRL_DR_SWAP);
+ if (port->data_role == TYPEC_DEVICE || port->negotiated_rev > PD_REV20)
+ port->send_discover = true;
tcpm_set_state_cond(port, DR_SWAP_SEND_TIMEOUT,
PD_T_SENDER_RESPONSE);
break;
case DR_SWAP_ACCEPT:
tcpm_pd_send_control(port, PD_CTRL_ACCEPT);
- /* Set VDM state machine running flag ASAP */
- if (port->data_role == TYPEC_DEVICE && port->send_discover)
- port->vdm_sm_running = true;
+ if (port->data_role == TYPEC_DEVICE || port->negotiated_rev > PD_REV20)
+ port->send_discover = true;
tcpm_set_state_cond(port, DR_SWAP_CHANGE_DR, 0);
break;
case DR_SWAP_SEND_TIMEOUT:
tcpm_swap_complete(port, -ETIMEDOUT);
+ port->send_discover = false;
tcpm_ams_finish(port);
tcpm_set_state(port, ready_state(port), 0);
break;
} else {
tcpm_set_roles(port, true, port->pwr_role,
TYPEC_HOST);
- port->send_discover = true;
}
tcpm_ams_finish(port);
tcpm_set_state(port, ready_state(port), 0);
break;
case VCONN_SWAP_SEND_TIMEOUT:
tcpm_swap_complete(port, -ETIMEDOUT);
- if (port->data_role == TYPEC_HOST && port->send_discover)
- port->vdm_sm_running = true;
tcpm_set_state(port, ready_state(port), 0);
break;
case VCONN_SWAP_START:
case VCONN_SWAP_TURN_ON_VCONN:
tcpm_set_vconn(port, true);
tcpm_pd_send_control(port, PD_CTRL_PS_RDY);
- if (port->data_role == TYPEC_HOST && port->send_discover)
- port->vdm_sm_running = true;
tcpm_set_state(port, ready_state(port), 0);
break;
case VCONN_SWAP_TURN_OFF_VCONN:
tcpm_set_vconn(port, false);
- if (port->data_role == TYPEC_HOST && port->send_discover)
- port->vdm_sm_running = true;
tcpm_set_state(port, ready_state(port), 0);
break;
case PR_SWAP_CANCEL:
case VCONN_SWAP_CANCEL:
tcpm_swap_complete(port, port->swap_status);
- if (port->data_role == TYPEC_HOST && port->send_discover)
- port->vdm_sm_running = true;
if (port->pwr_role == TYPEC_SOURCE)
tcpm_set_state(port, SRC_READY, 0);
else
switch (port->state) {
case SNK_TRANSITION_SINK_VBUS:
port->explicit_contract = true;
- /* Set the VDM flag ASAP */
- if (port->data_role == TYPEC_HOST && port->send_discover)
- port->vdm_sm_running = true;
tcpm_set_state(port, SNK_READY, 0);
break;
case SNK_DISCOVERY:
if (!port->send_discover)
goto unlock;
+ if (port->data_role == TYPEC_DEVICE && port->negotiated_rev < PD_REV30) {
+ port->send_discover = false;
+ goto unlock;
+ }
+
/* Retry if the port is not idle */
if ((port->state != SRC_READY && port->state != SNK_READY) || port->vdm_sm_running) {
mod_send_discover_delayed_work(port, SEND_DISCOVER_RETRY_MS);
goto unlock;
}
- /* Only send the Message if the port is host for PD rev2.0 */
- if (port->data_role == TYPEC_HOST || port->negotiated_rev > PD_REV20)
- tcpm_send_vdm(port, USB_SID_PD, CMD_DISCOVER_IDENT, NULL, 0);
+ tcpm_send_vdm(port, USB_SID_PD, CMD_DISCOVER_IDENT, NULL, 0);
unlock:
mutex_unlock(&port->lock);
do_online = virtio_mem_bbm_get_bb_state(vm, id) !=
VIRTIO_MEM_BBM_BB_FAKE_OFFLINE;
}
+
+ /*
+ * virtio_mem_set_fake_offline() might sleep, we don't need
+ * the device anymore. See virtio_mem_remove() how races
+ * between memory onlining and device removal are handled.
+ */
+ rcu_read_unlock();
+
if (do_online)
generic_online_page(page, order);
else
virtio_mem_set_fake_offline(PFN_DOWN(addr), 1 << order,
false);
- rcu_read_unlock();
return;
}
rcu_read_unlock();
p9_debug(P9_DEBUG_VFS, "filp: %p lock: %p\n", filp, fl);
- /* No mandatory locks */
- if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
- return -ENOLCK;
-
if ((IS_SETLK(cmd) || IS_SETLKW(cmd)) && fl->fl_type != F_UNLCK) {
filemap_write_and_wait(inode->i_mapping);
invalidate_mapping_pages(&inode->i_data, 0, -1);
p9_debug(P9_DEBUG_VFS, "filp: %p cmd:%d lock: %p name: %pD\n",
filp, cmd, fl, filp);
- /* No mandatory locks */
- if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
- goto out_err;
-
if ((IS_SETLK(cmd) || IS_SETLKW(cmd)) && fl->fl_type != F_UNLCK) {
filemap_write_and_wait(inode->i_mapping);
invalidate_mapping_pages(&inode->i_data, 0, -1);
ret = v9fs_file_getlock(filp, fl);
else
ret = -EINVAL;
-out_err:
return ret;
}
p9_debug(P9_DEBUG_VFS, "filp: %p cmd:%d lock: %p name: %pD\n",
filp, cmd, fl, filp);
- /* No mandatory locks */
- if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
- goto out_err;
-
if (!(fl->fl_flags & FL_FLOCK))
goto out_err;
for filesystems like NFS and for the flock() system
call. Disabling this option saves about 11k.
-config MANDATORY_FILE_LOCKING
- bool "Enable Mandatory file locking"
- depends on FILE_LOCKING
- default y
- help
- This option enables files appropriately marked files on appropriely
- mounted filesystems to support mandatory locking.
-
- To the best of my knowledge this is dead code that no one cares about.
-
source "fs/crypto/Kconfig"
source "fs/verity/Kconfig"
fl->fl_type, fl->fl_flags,
(long long) fl->fl_start, (long long) fl->fl_end);
- /* AFS doesn't support mandatory locks */
- if (__mandatory_lock(&vnode->vfs_inode) && fl->fl_type != F_UNLCK)
- return -ENOLCK;
-
if (IS_GETLK(cmd))
return afs_do_getlk(file, fl);
list_del(&iocb->ki_list);
iocb->ki_res.res = mangle_poll(mask);
req->done = true;
- if (iocb->ki_eventfd && eventfd_signal_count()) {
+ if (iocb->ki_eventfd && eventfd_signal_allowed()) {
iocb = NULL;
INIT_WORK(&req->work, aio_poll_put_work);
schedule_work(&req->work);
#include <linux/uaccess.h>
#include <linux/suspend.h>
#include "internal.h"
+#include "../block/blk.h"
struct bdev_inode {
struct block_device bdev;
(bdev_logical_block_size(bdev) - 1))
return -EINVAL;
- bio = bio_alloc_bioset(GFP_KERNEL, nr_pages, &blkdev_dio_pool);
+ bio = bio_alloc_kiocb(iocb, nr_pages, &blkdev_dio_pool);
dio = container_of(bio, struct blkdev_dio, bio);
dio->is_sync = is_sync = is_sync_kiocb(iocb);
static __init int blkdev_init(void)
{
- return bioset_init(&blkdev_dio_pool, 4, offsetof(struct blkdev_dio, bio), BIOSET_NEED_BVECS);
+ return bioset_init(&blkdev_dio_pool, 4,
+ offsetof(struct blkdev_dio, bio),
+ BIOSET_NEED_BVECS|BIOSET_PERCPU_CACHE);
}
module_init(blkdev_init);
return retval;
}
-int blkdev_fsync(struct file *filp, loff_t start, loff_t end, int datasync)
+static int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
+ int datasync)
{
struct inode *bd_inode = bdev_file_inode(filp);
struct block_device *bdev = I_BDEV(bd_inode);
return error;
}
-EXPORT_SYMBOL(blkdev_fsync);
/**
* bdev_read_page() - Start reading a page from a block device
if (!ei)
return NULL;
memset(&ei->bdev, 0, sizeof(ei->bdev));
- ei->bdev.bd_bdi = &noop_backing_dev_info;
return &ei->vfs_inode;
}
free_percpu(bdev->bd_stats);
kfree(bdev->bd_meta_info);
- if (!bdev_is_partition(bdev))
+ if (!bdev_is_partition(bdev)) {
+ if (bdev->bd_disk && bdev->bd_disk->bdi)
+ bdi_put(bdev->bd_disk->bdi);
kfree(bdev->bd_disk);
+ }
+
+ if (MAJOR(bdev->bd_dev) == BLOCK_EXT_MAJOR)
+ blk_free_ext_minor(MINOR(bdev->bd_dev));
+
kmem_cache_free(bdev_cachep, BDEV_I(inode));
}
static void bdev_evict_inode(struct inode *inode)
{
- struct block_device *bdev = &BDEV_I(inode)->bdev;
truncate_inode_pages_final(&inode->i_data);
invalidate_inode_buffers(inode); /* is it needed here? */
clear_inode(inode);
- /* Detach inode from wb early as bdi_put() may free bdi->wb */
- inode_detach_wb(inode);
- if (bdev->bd_bdi != &noop_backing_dev_info) {
- bdi_put(bdev->bd_bdi);
- bdev->bd_bdi = &noop_backing_dev_info;
- }
}
static const struct super_operations bdev_sops = {
bdev->bd_disk = disk;
bdev->bd_partno = partno;
bdev->bd_inode = inode;
-#ifdef CONFIG_SYSFS
- INIT_LIST_HEAD(&bdev->bd_holder_disks);
-#endif
bdev->bd_stats = alloc_percpu(struct disk_stats);
if (!bdev->bd_stats) {
iput(inode);
insert_inode_hash(bdev->bd_inode);
}
-static struct block_device *bdget(dev_t dev)
-{
- struct inode *inode;
-
- inode = ilookup(blockdev_superblock, dev);
- if (!inode)
- return NULL;
- return &BDEV_I(inode)->bdev;
-}
-
-/**
- * bdgrab -- Grab a reference to an already referenced block device
- * @bdev: Block device to grab a reference to.
- *
- * Returns the block_device with an additional reference when successful,
- * or NULL if the inode is already beeing freed.
- */
-struct block_device *bdgrab(struct block_device *bdev)
-{
- if (!igrab(bdev->bd_inode))
- return NULL;
- return bdev;
-}
-EXPORT_SYMBOL(bdgrab);
-
long nr_blockdev_pages(void)
{
struct inode *inode;
return ret;
}
-void bdput(struct block_device *bdev)
-{
- iput(bdev->bd_inode);
-}
-EXPORT_SYMBOL(bdput);
-
/**
* bd_may_claim - test whether a block device can be claimed
* @bdev: block device of interest
}
EXPORT_SYMBOL(bd_abort_claiming);
-#ifdef CONFIG_SYSFS
-struct bd_holder_disk {
- struct list_head list;
- struct gendisk *disk;
- int refcnt;
-};
-
-static struct bd_holder_disk *bd_find_holder_disk(struct block_device *bdev,
- struct gendisk *disk)
-{
- struct bd_holder_disk *holder;
-
- list_for_each_entry(holder, &bdev->bd_holder_disks, list)
- if (holder->disk == disk)
- return holder;
- return NULL;
-}
-
-static int add_symlink(struct kobject *from, struct kobject *to)
-{
- return sysfs_create_link(from, to, kobject_name(to));
-}
-
-static void del_symlink(struct kobject *from, struct kobject *to)
-{
- sysfs_remove_link(from, kobject_name(to));
-}
-
-/**
- * bd_link_disk_holder - create symlinks between holding disk and slave bdev
- * @bdev: the claimed slave bdev
- * @disk: the holding disk
- *
- * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
- *
- * This functions creates the following sysfs symlinks.
- *
- * - from "slaves" directory of the holder @disk to the claimed @bdev
- * - from "holders" directory of the @bdev to the holder @disk
- *
- * For example, if /dev/dm-0 maps to /dev/sda and disk for dm-0 is
- * passed to bd_link_disk_holder(), then:
- *
- * /sys/block/dm-0/slaves/sda --> /sys/block/sda
- * /sys/block/sda/holders/dm-0 --> /sys/block/dm-0
- *
- * The caller must have claimed @bdev before calling this function and
- * ensure that both @bdev and @disk are valid during the creation and
- * lifetime of these symlinks.
- *
- * CONTEXT:
- * Might sleep.
- *
- * RETURNS:
- * 0 on success, -errno on failure.
- */
-int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk)
-{
- struct bd_holder_disk *holder;
- int ret = 0;
-
- mutex_lock(&bdev->bd_disk->open_mutex);
-
- WARN_ON_ONCE(!bdev->bd_holder);
-
- /* FIXME: remove the following once add_disk() handles errors */
- if (WARN_ON(!disk->slave_dir || !bdev->bd_holder_dir))
- goto out_unlock;
-
- holder = bd_find_holder_disk(bdev, disk);
- if (holder) {
- holder->refcnt++;
- goto out_unlock;
- }
-
- holder = kzalloc(sizeof(*holder), GFP_KERNEL);
- if (!holder) {
- ret = -ENOMEM;
- goto out_unlock;
- }
-
- INIT_LIST_HEAD(&holder->list);
- holder->disk = disk;
- holder->refcnt = 1;
-
- ret = add_symlink(disk->slave_dir, bdev_kobj(bdev));
- if (ret)
- goto out_free;
-
- ret = add_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
- if (ret)
- goto out_del;
- /*
- * bdev could be deleted beneath us which would implicitly destroy
- * the holder directory. Hold on to it.
- */
- kobject_get(bdev->bd_holder_dir);
-
- list_add(&holder->list, &bdev->bd_holder_disks);
- goto out_unlock;
-
-out_del:
- del_symlink(disk->slave_dir, bdev_kobj(bdev));
-out_free:
- kfree(holder);
-out_unlock:
- mutex_unlock(&bdev->bd_disk->open_mutex);
- return ret;
-}
-EXPORT_SYMBOL_GPL(bd_link_disk_holder);
-
-/**
- * bd_unlink_disk_holder - destroy symlinks created by bd_link_disk_holder()
- * @bdev: the calimed slave bdev
- * @disk: the holding disk
- *
- * DON'T USE THIS UNLESS YOU'RE ALREADY USING IT.
- *
- * CONTEXT:
- * Might sleep.
- */
-void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk)
-{
- struct bd_holder_disk *holder;
-
- mutex_lock(&bdev->bd_disk->open_mutex);
-
- holder = bd_find_holder_disk(bdev, disk);
-
- if (!WARN_ON_ONCE(holder == NULL) && !--holder->refcnt) {
- del_symlink(disk->slave_dir, bdev_kobj(bdev));
- del_symlink(bdev->bd_holder_dir, &disk_to_dev(disk)->kobj);
- kobject_put(bdev->bd_holder_dir);
- list_del_init(&holder->list);
- kfree(holder);
- }
-
- mutex_unlock(&bdev->bd_disk->open_mutex);
-}
-EXPORT_SYMBOL_GPL(bd_unlink_disk_holder);
-#endif
-
static void blkdev_flush_mapping(struct block_device *bdev)
{
WARN_ON_ONCE(bdev->bd_holders);
}
}
- if (!bdev->bd_openers) {
+ if (!bdev->bd_openers)
set_init_blocksize(bdev);
- if (bdev->bd_bdi == &noop_backing_dev_info)
- bdev->bd_bdi = bdi_get(disk->queue->backing_dev_info);
- }
if (test_bit(GD_NEED_PART_SCAN, &disk->state))
bdev_disk_changed(disk, false);
bdev->bd_openers++;
static int blkdev_get_part(struct block_device *part, fmode_t mode)
{
struct gendisk *disk = part->bd_disk;
- struct block_device *whole;
int ret;
if (part->bd_openers)
goto done;
- whole = bdgrab(disk->part0);
- ret = blkdev_get_whole(whole, mode);
+ ret = blkdev_get_whole(bdev_whole(part), mode);
if (ret)
- goto out_put_whole;
+ return ret;
ret = -ENXIO;
if (!bdev_nr_sectors(part))
disk->open_partitions++;
set_init_blocksize(part);
- if (part->bd_bdi == &noop_backing_dev_info)
- part->bd_bdi = bdi_get(disk->queue->backing_dev_info);
done:
part->bd_openers++;
return 0;
out_blkdev_put:
- blkdev_put_whole(whole, mode);
-out_put_whole:
- bdput(whole);
+ blkdev_put_whole(bdev_whole(part), mode);
return ret;
}
blkdev_flush_mapping(part);
whole->bd_disk->open_partitions--;
blkdev_put_whole(whole, mode);
- bdput(whole);
}
struct block_device *blkdev_get_no_open(dev_t dev)
{
struct block_device *bdev;
- struct gendisk *disk;
+ struct inode *inode;
- bdev = bdget(dev);
- if (!bdev) {
+ inode = ilookup(blockdev_superblock, dev);
+ if (!inode) {
blk_request_module(dev);
- bdev = bdget(dev);
- if (!bdev)
+ inode = ilookup(blockdev_superblock, dev);
+ if (!inode)
return NULL;
}
- disk = bdev->bd_disk;
- if (!kobject_get_unless_zero(&disk_to_dev(disk)->kobj))
- goto bdput;
- if ((disk->flags & (GENHD_FL_UP | GENHD_FL_HIDDEN)) != GENHD_FL_UP)
- goto put_disk;
- if (!try_module_get(bdev->bd_disk->fops->owner))
- goto put_disk;
+ /* switch from the inode reference to a device mode one: */
+ bdev = &BDEV_I(inode)->bdev;
+ if (!kobject_get_unless_zero(&bdev->bd_device.kobj))
+ bdev = NULL;
+ iput(inode);
+
+ if (!bdev)
+ return NULL;
+ if ((bdev->bd_disk->flags & GENHD_FL_HIDDEN) ||
+ !try_module_get(bdev->bd_disk->fops->owner)) {
+ put_device(&bdev->bd_device);
+ return NULL;
+ }
+
return bdev;
-put_disk:
- put_disk(disk);
-bdput:
- bdput(bdev);
- return NULL;
}
void blkdev_put_no_open(struct block_device *bdev)
{
module_put(bdev->bd_disk->fops->owner);
- put_disk(bdev->bd_disk);
- bdput(bdev);
+ put_device(&bdev->bd_device);
}
/**
mutex_lock(&disk->open_mutex);
ret = -ENXIO;
- if (!(disk->flags & GENHD_FL_UP))
+ if (!disk_live(disk))
goto abort_claiming;
if (bdev_is_partition(bdev))
ret = blkdev_get_part(bdev, mode);
* inode has not been flagged as nocompress. This flag can
* change at any time if we discover bad compression ratios.
*/
- if (nr_pages > 1 && inode_need_compress(BTRFS_I(inode), start, end)) {
+ if (inode_need_compress(BTRFS_I(inode), start, end)) {
WARN_ON(pages);
pages = kcalloc(nr_pages, sizeof(struct page *), GFP_NOFS);
if (!pages) {
ret = VM_FAULT_SIGBUS;
} else {
struct address_space *mapping = inode->i_mapping;
- struct page *page = find_or_create_page(mapping, 0,
- mapping_gfp_constraint(mapping,
- ~__GFP_FS));
+ struct page *page;
+
+ filemap_invalidate_lock_shared(mapping);
+ page = find_or_create_page(mapping, 0,
+ mapping_gfp_constraint(mapping, ~__GFP_FS));
if (!page) {
ret = VM_FAULT_OOM;
goto out_inline;
vmf->page = page;
ret = VM_FAULT_MAJOR | VM_FAULT_LOCKED;
out_inline:
+ filemap_invalidate_unlock_shared(mapping);
dout("filemap_fault %p %llu read inline data ret %x\n",
inode, off, ret);
}
struct ceph_cap_flush *ceph_alloc_cap_flush(void)
{
- return kmem_cache_alloc(ceph_cap_flush_cachep, GFP_KERNEL);
+ struct ceph_cap_flush *cf;
+
+ cf = kmem_cache_alloc(ceph_cap_flush_cachep, GFP_KERNEL);
+ cf->is_capsnap = false;
+ return cf;
}
void ceph_free_cap_flush(struct ceph_cap_flush *cf)
prev->wake = true;
wake = false;
}
- list_del(&cf->g_list);
+ list_del_init(&cf->g_list);
return wake;
}
prev->wake = true;
wake = false;
}
- list_del(&cf->i_list);
+ list_del_init(&cf->i_list);
return wake;
}
ci->i_ceph_flags &= ~CEPH_I_KICK_FLUSH;
list_for_each_entry_reverse(cf, &ci->i_cap_flush_list, i_list) {
- if (!cf->caps) {
+ if (cf->is_capsnap) {
last_snap_flush = cf->tid;
break;
}
first_tid = cf->tid + 1;
- if (cf->caps) {
+ if (!cf->is_capsnap) {
struct cap_msg_args arg;
dout("kick_flushing_caps %p cap %p tid %llu %s\n",
cleaned = cf->caps;
/* Is this a capsnap? */
- if (cf->caps == 0)
+ if (cf->is_capsnap)
continue;
if (cf->tid <= flush_tid) {
while (!list_empty(&to_remove)) {
cf = list_first_entry(&to_remove,
struct ceph_cap_flush, i_list);
- list_del(&cf->i_list);
- ceph_free_cap_flush(cf);
+ list_del_init(&cf->i_list);
+ if (!cf->is_capsnap)
+ ceph_free_cap_flush(cf);
}
if (wake_ci)
if (ret < 0)
goto unlock;
+ filemap_invalidate_lock(inode->i_mapping);
ceph_zero_pagecache_range(inode, offset, length);
ret = ceph_zero_objects(inode, offset, length);
if (dirty)
__mark_inode_dirty(inode, dirty);
}
+ filemap_invalidate_unlock(inode->i_mapping);
ceph_put_cap_refs(ci, got);
unlock:
if (!(fl->fl_flags & FL_POSIX))
return -ENOLCK;
- /* No mandatory locks */
- if (__mandatory_lock(file->f_mapping->host) && fl->fl_type != F_UNLCK)
- return -ENOLCK;
dout("ceph_lock, fl_owner: %p\n", fl->fl_owner);
spin_lock(&mdsc->cap_dirty_lock);
list_for_each_entry(cf, &to_remove, i_list)
- list_del(&cf->g_list);
+ list_del_init(&cf->g_list);
if (!list_empty(&ci->i_dirty_item)) {
pr_warn_ratelimited(
struct ceph_cap_flush *cf;
cf = list_first_entry(&to_remove,
struct ceph_cap_flush, i_list);
- list_del(&cf->i_list);
- ceph_free_cap_flush(cf);
+ list_del_init(&cf->i_list);
+ if (!cf->is_capsnap)
+ ceph_free_cap_flush(cf);
}
wake_up_all(&ci->i_cap_wq);
{
int i;
- for (i = 0; i < m->possible_max_rank; i++)
- kfree(m->m_info[i].export_targets);
- kfree(m->m_info);
+ if (m->m_info) {
+ for (i = 0; i < m->possible_max_rank; i++)
+ kfree(m->m_info[i].export_targets);
+ kfree(m->m_info);
+ }
kfree(m->m_data_pg_pools);
kfree(m);
}
pr_err("ENOMEM allocating ceph_cap_snap on %p\n", inode);
return;
}
+ capsnap->cap_flush.is_capsnap = true;
+ INIT_LIST_HEAD(&capsnap->cap_flush.i_list);
+ INIT_LIST_HEAD(&capsnap->cap_flush.g_list);
spin_lock(&ci->i_ceph_lock);
used = __ceph_caps_used(ci);
struct ceph_cap_flush {
u64 tid;
- int caps; /* 0 means capsnap */
+ int caps;
bool wake; /* wake up flush waiters when finish ? */
+ bool is_capsnap; /* true means capsnap */
struct list_head g_list; // global
struct list_head i_list; // per inode
};
return rc;
}
+ filemap_invalidate_lock(inode->i_mapping);
/*
* We implement the punch hole through ioctl, so we need remove the page
* caches first, otherwise the data may be inconsistent with the server.
sizeof(struct file_zero_data_information),
CIFSMaxBufSize, NULL, NULL);
free_xid(xid);
+ filemap_invalidate_unlock(inode->i_mapping);
return rc;
}
#include <linux/idr.h>
#include <linux/uio.h>
-DEFINE_PER_CPU(int, eventfd_wake_count);
-
static DEFINE_IDA(eventfd_ida);
struct eventfd_ctx {
* Deadlock or stack overflow issues can happen if we recurse here
* through waitqueue wakeup handlers. If the caller users potentially
* nested waitqueues with custom wakeup handlers, then it should
- * check eventfd_signal_count() before calling this function. If
- * it returns true, the eventfd_signal() call should be deferred to a
+ * check eventfd_signal_allowed() before calling this function. If
+ * it returns false, the eventfd_signal() call should be deferred to a
* safe context.
*/
- if (WARN_ON_ONCE(this_cpu_read(eventfd_wake_count)))
+ if (WARN_ON_ONCE(current->in_eventfd_signal))
return 0;
spin_lock_irqsave(&ctx->wqh.lock, flags);
- this_cpu_inc(eventfd_wake_count);
+ current->in_eventfd_signal = 1;
if (ULLONG_MAX - ctx->count < n)
n = ULLONG_MAX - ctx->count;
ctx->count += n;
if (waitqueue_active(&ctx->wqh))
wake_up_locked_poll(&ctx->wqh, EPOLLIN);
- this_cpu_dec(eventfd_wake_count);
+ current->in_eventfd_signal = 0;
spin_unlock_irqrestore(&ctx->wqh.lock, flags);
return n;
# SPDX-License-Identifier: GPL-2.0-only
config EXT2_FS
tristate "Second extended fs support"
+ select FS_IOMAP
help
Ext2 is a standard Linux file system for hard disks.
struct rw_semaphore xattr_sem;
#endif
rwlock_t i_meta_lock;
-#ifdef CONFIG_FS_DAX
- struct rw_semaphore dax_sem;
-#endif
/*
* truncate_mutex is for serialising ext2_truncate() against
#endif
};
-#ifdef CONFIG_FS_DAX
-#define dax_sem_down_write(ext2_inode) down_write(&(ext2_inode)->dax_sem)
-#define dax_sem_up_write(ext2_inode) up_write(&(ext2_inode)->dax_sem)
-#else
-#define dax_sem_down_write(ext2_inode)
-#define dax_sem_up_write(ext2_inode)
-#endif
-
/*
* Inode dynamic state flags
*/
*
* mmap_lock (MM)
* sb_start_pagefault (vfs, freeze)
- * ext2_inode_info->dax_sem
+ * address_space->invalidate_lock
* address_space->i_mmap_rwsem or page_lock (mutually exclusive in DAX)
* ext2_inode_info->truncate_mutex
*
static vm_fault_t ext2_dax_fault(struct vm_fault *vmf)
{
struct inode *inode = file_inode(vmf->vma->vm_file);
- struct ext2_inode_info *ei = EXT2_I(inode);
vm_fault_t ret;
bool write = (vmf->flags & FAULT_FLAG_WRITE) &&
(vmf->vma->vm_flags & VM_SHARED);
sb_start_pagefault(inode->i_sb);
file_update_time(vmf->vma->vm_file);
}
- down_read(&ei->dax_sem);
+ filemap_invalidate_lock_shared(inode->i_mapping);
ret = dax_iomap_fault(vmf, PE_SIZE_PTE, NULL, NULL, &ext2_iomap_ops);
- up_read(&ei->dax_sem);
+ filemap_invalidate_unlock_shared(inode->i_mapping);
if (write)
sb_end_pagefault(inode->i_sb);
return ret;
}
-#ifdef CONFIG_FS_DAX
static int ext2_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
unsigned flags, struct iomap *iomap, struct iomap *srcmap)
{
.iomap_begin = ext2_iomap_begin,
.iomap_end = ext2_iomap_end,
};
-#else
-/* Define empty ops for !CONFIG_FS_DAX case to avoid ugly ifdefs */
-const struct iomap_ops ext2_iomap_ops;
-#endif /* CONFIG_FS_DAX */
int ext2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
u64 start, u64 len)
{
- return generic_block_fiemap(inode, fieinfo, start, len,
- ext2_get_block);
+ int ret;
+
+ inode_lock(inode);
+ len = min_t(u64, len, i_size_read(inode));
+ ret = iomap_fiemap(inode, fieinfo, start, len, &ext2_iomap_ops);
+ inode_unlock(inode);
+
+ return ret;
}
static int ext2_writepage(struct page *page, struct writeback_control *wbc)
ext2_free_data(inode, p, q);
}
-/* dax_sem must be held when calling this function */
+/* mapping->invalidate_lock must be held when calling this function */
static void __ext2_truncate_blocks(struct inode *inode, loff_t offset)
{
__le32 *i_data = EXT2_I(inode)->i_data;
iblock = (offset + blocksize-1) >> EXT2_BLOCK_SIZE_BITS(inode->i_sb);
#ifdef CONFIG_FS_DAX
- WARN_ON(!rwsem_is_locked(&ei->dax_sem));
+ WARN_ON(!rwsem_is_locked(&inode->i_mapping->invalidate_lock));
#endif
n = ext2_block_to_path(inode, iblock, offsets, NULL);
if (ext2_inode_is_fast_symlink(inode))
return;
- dax_sem_down_write(EXT2_I(inode));
+ filemap_invalidate_lock(inode->i_mapping);
__ext2_truncate_blocks(inode, offset);
- dax_sem_up_write(EXT2_I(inode));
+ filemap_invalidate_unlock(inode->i_mapping);
}
static int ext2_setsize(struct inode *inode, loff_t newsize)
if (error)
return error;
- dax_sem_down_write(EXT2_I(inode));
+ filemap_invalidate_lock(inode->i_mapping);
truncate_setsize(inode, newsize);
__ext2_truncate_blocks(inode, newsize);
- dax_sem_up_write(EXT2_I(inode));
+ filemap_invalidate_unlock(inode->i_mapping);
inode->i_mtime = inode->i_ctime = current_time(inode);
if (inode_needs_sync(inode)) {
init_rwsem(&ei->xattr_sem);
#endif
mutex_init(&ei->truncate_mutex);
-#ifdef CONFIG_FS_DAX
- init_rwsem(&ei->dax_sem);
-#endif
inode_init_once(&ei->vfs_inode);
}
* by other means, so we have i_data_sem.
*/
struct rw_semaphore i_data_sem;
- /*
- * i_mmap_sem is for serializing page faults with truncate / punch hole
- * operations. We have to make sure that new page cannot be faulted in
- * a section of the inode that is being punched. We cannot easily use
- * i_data_sem for this since we need protection for the whole punch
- * operation and i_data_sem ranks below transaction start so we have
- * to occasionally drop it.
- */
- struct rw_semaphore i_mmap_sem;
struct inode vfs_inode;
struct jbd2_inode *jinode;
extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
loff_t lstart, loff_t lend);
extern vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf);
-extern vm_fault_t ext4_filemap_fault(struct vm_fault *vmf);
extern qsize_t *ext4_get_reserved_space(struct inode *inode);
extern int ext4_get_projid(struct inode *inode, kprojid_t *projid);
extern void ext4_da_release_space(struct inode *inode, int to_free);
loff_t len, int mode)
{
struct inode *inode = file_inode(file);
+ struct address_space *mapping = file->f_mapping;
handle_t *handle = NULL;
unsigned int max_blocks;
loff_t new_size = 0;
* Prevent page faults from reinstantiating pages we have
* released from page cache.
*/
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
ret = ext4_break_layouts(inode);
if (ret) {
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
goto out_mutex;
}
ret = ext4_update_disksize_before_punch(inode, offset, len);
if (ret) {
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
goto out_mutex;
}
/* Now release the pages and zero block aligned part of pages */
ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size,
flags);
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
if (ret)
goto out_mutex;
}
static int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
{
struct super_block *sb = inode->i_sb;
+ struct address_space *mapping = inode->i_mapping;
ext4_lblk_t punch_start, punch_stop;
handle_t *handle;
unsigned int credits;
* Prevent page faults from reinstantiating pages we have released from
* page cache.
*/
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
ret = ext4_break_layouts(inode);
if (ret)
* Write tail of the last page before removed range since it will get
* removed from the page cache below.
*/
- ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, offset);
+ ret = filemap_write_and_wait_range(mapping, ioffset, offset);
if (ret)
goto out_mmap;
/*
* Write data that will be shifted to preserve them when discarding
* page cache below. We are also protected from pages becoming dirty
- * by i_mmap_sem.
+ * by i_rwsem and invalidate_lock.
*/
- ret = filemap_write_and_wait_range(inode->i_mapping, offset + len,
+ ret = filemap_write_and_wait_range(mapping, offset + len,
LLONG_MAX);
if (ret)
goto out_mmap;
ext4_journal_stop(handle);
ext4_fc_stop_ineligible(sb);
out_mmap:
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
out_mutex:
inode_unlock(inode);
return ret;
static int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)
{
struct super_block *sb = inode->i_sb;
+ struct address_space *mapping = inode->i_mapping;
handle_t *handle;
struct ext4_ext_path *path;
struct ext4_extent *extent;
* Prevent page faults from reinstantiating pages we have released from
* page cache.
*/
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
ret = ext4_break_layouts(inode);
if (ret)
ext4_journal_stop(handle);
ext4_fc_stop_ineligible(sb);
out_mmap:
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
out_mutex:
inode_unlock(inode);
return ret;
*/
bool write = (vmf->flags & FAULT_FLAG_WRITE) &&
(vmf->vma->vm_flags & VM_SHARED);
+ struct address_space *mapping = vmf->vma->vm_file->f_mapping;
pfn_t pfn;
if (write) {
sb_start_pagefault(sb);
file_update_time(vmf->vma->vm_file);
- down_read(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock_shared(mapping);
retry:
handle = ext4_journal_start_sb(sb, EXT4_HT_WRITE_PAGE,
EXT4_DATA_TRANS_BLOCKS(sb));
if (IS_ERR(handle)) {
- up_read(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock_shared(mapping);
sb_end_pagefault(sb);
return VM_FAULT_SIGBUS;
}
} else {
- down_read(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock_shared(mapping);
}
result = dax_iomap_fault(vmf, pe_size, &pfn, &error, &ext4_iomap_ops);
if (write) {
/* Handling synchronous page fault? */
if (result & VM_FAULT_NEEDDSYNC)
result = dax_finish_sync_fault(vmf, pe_size, pfn);
- up_read(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock_shared(mapping);
sb_end_pagefault(sb);
} else {
- up_read(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock_shared(mapping);
}
return result;
#endif
static const struct vm_operations_struct ext4_file_vm_ops = {
- .fault = ext4_filemap_fault,
+ .fault = filemap_fault,
.map_pages = filemap_map_pages,
.page_mkwrite = ext4_page_mkwrite,
};
return ret;
}
-static void ext4_wait_dax_page(struct ext4_inode_info *ei)
+static void ext4_wait_dax_page(struct inode *inode)
{
- up_write(&ei->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
schedule();
- down_write(&ei->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
}
int ext4_break_layouts(struct inode *inode)
{
- struct ext4_inode_info *ei = EXT4_I(inode);
struct page *page;
int error;
- if (WARN_ON_ONCE(!rwsem_is_locked(&ei->i_mmap_sem)))
+ if (WARN_ON_ONCE(!rwsem_is_locked(&inode->i_mapping->invalidate_lock)))
return -EINVAL;
do {
error = ___wait_var_event(&page->_refcount,
atomic_read(&page->_refcount) == 1,
TASK_INTERRUPTIBLE, 0, 0,
- ext4_wait_dax_page(ei));
+ ext4_wait_dax_page(inode));
} while (error == 0);
return error;
ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
if (ext4_has_inline_data(inode)) {
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
ret = ext4_convert_inline_data(inode);
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
if (ret)
return ret;
}
* Prevent page faults from reinstantiating pages we have released from
* page cache.
*/
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
ret = ext4_break_layouts(inode);
if (ret)
out_stop:
ext4_journal_stop(handle);
out_dio:
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
out_mutex:
inode_unlock(inode);
return ret;
inode_dio_wait(inode);
}
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
rc = ext4_break_layouts(inode);
if (rc) {
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
goto err_out;
}
error = rc;
}
out_mmap_sem:
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
}
if (!error) {
* data (and journalled aops don't know how to handle these cases).
*/
if (val) {
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
err = filemap_write_and_wait(inode->i_mapping);
if (err < 0) {
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
return err;
}
}
percpu_up_write(&sbi->s_writepages_rwsem);
if (val)
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
/* Finally we can mark the inode as dirty. */
sb_start_pagefault(inode->i_sb);
file_update_time(vma->vm_file);
- down_read(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock_shared(mapping);
err = ext4_convert_inline_data(inode);
if (err)
out_ret:
ret = block_page_mkwrite_return(err);
out:
- up_read(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock_shared(mapping);
sb_end_pagefault(inode->i_sb);
return ret;
out_error:
ext4_journal_stop(handle);
goto out;
}
-
-vm_fault_t ext4_filemap_fault(struct vm_fault *vmf)
-{
- struct inode *inode = file_inode(vmf->vma->vm_file);
- vm_fault_t ret;
-
- down_read(&EXT4_I(inode)->i_mmap_sem);
- ret = filemap_fault(vmf);
- up_read(&EXT4_I(inode)->i_mmap_sem);
-
- return ret;
-}
goto journal_err_out;
}
- down_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
err = filemap_write_and_wait(inode->i_mapping);
if (err)
goto err_out;
ext4_double_up_write_data_sem(inode, inode_bl);
err_out:
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
journal_err_out:
unlock_two_nondirectories(inode, inode_bl);
iput(inode_bl);
/*
* Lock ordering
*
- * Note the difference between i_mmap_sem (EXT4_I(inode)->i_mmap_sem) and
- * i_mmap_rwsem (inode->i_mmap_rwsem)!
- *
* page fault path:
- * mmap_lock -> sb_start_pagefault -> i_mmap_sem (r) -> transaction start ->
- * page lock -> i_data_sem (rw)
+ * mmap_lock -> sb_start_pagefault -> invalidate_lock (r) -> transaction start
+ * -> page lock -> i_data_sem (rw)
*
* buffered write path:
* sb_start_write -> i_mutex -> mmap_lock
* i_data_sem (rw)
*
* truncate:
- * sb_start_write -> i_mutex -> i_mmap_sem (w) -> i_mmap_rwsem (w) -> page lock
- * sb_start_write -> i_mutex -> i_mmap_sem (w) -> transaction start ->
+ * sb_start_write -> i_mutex -> invalidate_lock (w) -> i_mmap_rwsem (w) ->
+ * page lock
+ * sb_start_write -> i_mutex -> invalidate_lock (w) -> transaction start ->
* i_data_sem (rw)
*
* direct IO:
INIT_LIST_HEAD(&ei->i_orphan);
init_rwsem(&ei->xattr_sem);
init_rwsem(&ei->i_data_sem);
- init_rwsem(&ei->i_mmap_sem);
inode_init_once(&ei->vfs_inode);
ext4_fc_init_inode(&ei->vfs_inode);
}
*/
static inline void ext4_truncate_failed_write(struct inode *inode)
{
+ struct address_space *mapping = inode->i_mapping;
+
/*
* We don't need to call ext4_break_layouts() because the blocks we
* are truncating were never visible to userspace.
*/
- down_write(&EXT4_I(inode)->i_mmap_sem);
- truncate_inode_pages(inode->i_mapping, inode->i_size);
+ filemap_invalidate_lock(mapping);
+ truncate_inode_pages(mapping, inode->i_size);
ext4_truncate(inode);
- up_write(&EXT4_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
}
/*
/* In the fs-verity case, f2fs_end_enable_verity() does the truncate */
if (to > i_size && !f2fs_verity_in_progress(inode)) {
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
truncate_pagecache(inode, i_size);
f2fs_truncate_blocks(inode, i_size, true);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
}
}
int ret = 0;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
set_inode_flag(inode, FI_ALIGNED_WRITE);
clear_inode_flag(inode, FI_DO_DEFRAG);
clear_inode_flag(inode, FI_ALIGNED_WRITE);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
return ret;
/* avoid racing between foreground op and gc */
struct rw_semaphore i_gc_rwsem[2];
- struct rw_semaphore i_mmap_sem;
struct rw_semaphore i_xattr_sem; /* avoid racing between reading and changing EAs */
int i_extra_isize; /* size of extra space located in i_addr */
struct inode *inode = file_inode(vmf->vma->vm_file);
vm_fault_t ret;
- down_read(&F2FS_I(inode)->i_mmap_sem);
ret = filemap_fault(vmf);
- up_read(&F2FS_I(inode)->i_mmap_sem);
-
if (!ret)
f2fs_update_iostat(F2FS_I_SB(inode), APP_MAPPED_READ_IO,
F2FS_BLKSIZE);
f2fs_bug_on(sbi, f2fs_has_inline_data(inode));
file_update_time(vmf->vma->vm_file);
- down_read(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock_shared(inode->i_mapping);
lock_page(page);
if (unlikely(page->mapping != inode->i_mapping ||
page_offset(page) > i_size_read(inode) ||
trace_f2fs_vm_page_mkwrite(page, DATA);
out_sem:
- up_read(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock_shared(inode->i_mapping);
sb_end_pagefault(inode->i_sb);
err:
}
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
truncate_setsize(inode, attr->ia_size);
* do not trim all blocks after i_size if target size is
* larger than i_size.
*/
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
if (err)
return err;
blk_end = (loff_t)pg_end << PAGE_SHIFT;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
truncate_inode_pages_range(mapping, blk_start,
blk_end - 1);
ret = f2fs_truncate_hole(inode, pg_start, pg_end);
f2fs_unlock_op(sbi);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
}
}
/* avoid gc operation during block exchange */
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
f2fs_lock_op(sbi);
f2fs_drop_extent_tree(inode);
ret = __exchange_data_block(inode, inode, end, start, nrpages - end, true);
f2fs_unlock_op(sbi);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
return ret;
}
return ret;
/* write out all moved pages, if possible */
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
filemap_write_and_wait_range(inode->i_mapping, offset, LLONG_MAX);
truncate_pagecache(inode, offset);
new_size = i_size_read(inode) - len;
ret = f2fs_truncate_blocks(inode, new_size, true);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
if (!ret)
f2fs_i_size_write(inode, new_size);
return ret;
pgoff_t end;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
truncate_pagecache_range(inode,
(loff_t)index << PAGE_SHIFT,
ret = f2fs_get_dnode_of_data(&dn, index, ALLOC_NODE);
if (ret) {
f2fs_unlock_op(sbi);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
goto out;
}
f2fs_put_dnode(&dn);
f2fs_unlock_op(sbi);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
f2fs_balance_fs(sbi, dn.node_changed);
static int f2fs_insert_range(struct inode *inode, loff_t offset, loff_t len)
{
struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
+ struct address_space *mapping = inode->i_mapping;
pgoff_t nr, pg_start, pg_end, delta, idx;
loff_t new_size;
int ret = 0;
f2fs_balance_fs(sbi, true);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
ret = f2fs_truncate_blocks(inode, i_size_read(inode), true);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
if (ret)
return ret;
/* write out all dirty pages from offset */
- ret = filemap_write_and_wait_range(inode->i_mapping, offset, LLONG_MAX);
+ ret = filemap_write_and_wait_range(mapping, offset, LLONG_MAX);
if (ret)
return ret;
/* avoid gc operation during block exchange */
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
truncate_pagecache(inode, offset);
while (!ret && idx > pg_start) {
idx + delta, nr, false);
f2fs_unlock_op(sbi);
}
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
/* write out all moved pages, if possible */
- down_write(&F2FS_I(inode)->i_mmap_sem);
- filemap_write_and_wait_range(inode->i_mapping, offset, LLONG_MAX);
+ filemap_invalidate_lock(mapping);
+ filemap_write_and_wait_range(mapping, offset, LLONG_MAX);
truncate_pagecache(inode, offset);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
if (!ret)
f2fs_i_size_write(inode, new_size);
goto out;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
}
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
out:
inode_unlock(inode);
}
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
last_idx = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
}
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
if (ret >= 0) {
clear_inode_flag(inode, FI_COMPRESS_RELEASED);
goto err;
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
ret = filemap_write_and_wait_range(mapping, range.start,
to_end ? LLONG_MAX : end_addr - 1);
ret = f2fs_secure_erase(prev_bdev, inode, prev_index,
prev_block, len, range.flags);
out:
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
err:
inode_unlock(inode);
/* if we couldn't write data, we should deallocate blocks. */
if (preallocated && i_size_read(inode) < target_size) {
down_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
- down_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
f2fs_truncate(inode);
- up_write(&F2FS_I(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
up_write(&F2FS_I(inode)->i_gc_rwsem[WRITE]);
}
mutex_init(&fi->inmem_lock);
init_rwsem(&fi->i_gc_rwsem[READ]);
init_rwsem(&fi->i_gc_rwsem[WRITE]);
- init_rwsem(&fi->i_mmap_sem);
init_rwsem(&fi->i_xattr_sem);
/* Will be used by directory only */
ret = kstrtol(name, 10, &data);
if (ret)
return ret;
- if (data >= IOPRIO_BE_NR || data < 0)
+ if (data >= IOPRIO_NR_LEVELS || data < 0)
return -EINVAL;
cprc->ckpt_thread_ioprio = IOPRIO_PRIO_VALUE(class, data);
#include <linux/blkdev.h>
#include <linux/sched/signal.h>
+#include <linux/backing-dev-defs.h>
#include "fat.h"
struct fatent_operations {
pid_t f_getown(struct file *filp)
{
pid_t pid = 0;
- read_lock(&filp->f_owner.lock);
+
+ read_lock_irq(&filp->f_owner.lock);
rcu_read_lock();
if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type)) {
pid = pid_vnr(filp->f_owner.pid);
pid = -pid;
}
rcu_read_unlock();
- read_unlock(&filp->f_owner.lock);
+ read_unlock_irq(&filp->f_owner.lock);
return pid;
}
struct f_owner_ex owner = {};
int ret = 0;
- read_lock(&filp->f_owner.lock);
+ read_lock_irq(&filp->f_owner.lock);
rcu_read_lock();
if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type))
owner.pid = pid_vnr(filp->f_owner.pid);
ret = -EINVAL;
break;
}
- read_unlock(&filp->f_owner.lock);
+ read_unlock_irq(&filp->f_owner.lock);
if (!ret) {
ret = copy_to_user(owner_p, &owner, sizeof(owner));
uid_t src[2];
int err;
- read_lock(&filp->f_owner.lock);
+ read_lock_irq(&filp->f_owner.lock);
src[0] = from_kuid(user_ns, filp->f_owner.uid);
src[1] = from_kuid(user_ns, filp->f_owner.euid);
- read_unlock(&filp->f_owner.lock);
+ read_unlock_irq(&filp->f_owner.lock);
err = put_user(src[0], &dst[0]);
err |= put_user(src[1], &dst[1]);
{
while (fa) {
struct fown_struct *fown;
+ unsigned long flags;
if (fa->magic != FASYNC_MAGIC) {
printk(KERN_ERR "kill_fasync: bad magic number in "
"fasync_struct!\n");
return;
}
- read_lock(&fa->fa_lock);
+ read_lock_irqsave(&fa->fa_lock, flags);
if (fa->fa_file) {
fown = &fa->fa_file->f_owner;
/* Don't send SIGURG to processes which have not set a
if (!(sig == SIGURG && fown->signum == 0))
send_sigio(fown, fa->fa_fd, band);
}
- read_unlock(&fa->fa_lock);
+ read_unlock_irqrestore(&fa->fa_lock, flags);
fa = rcu_dereference(fa->fa_next);
}
}
/*
* Can't do inline reclaim in fault path. We call
* dax_layout_busy_page() before we free a range. And
- * fuse_wait_dax_page() drops fi->i_mmap_sem lock and requires it.
- * In fault path we enter with fi->i_mmap_sem held and can't drop
- * it. Also in fault path we hold fi->i_mmap_sem shared and not
- * exclusive, so that creates further issues with fuse_wait_dax_page().
- * Hence return -EAGAIN and fuse_dax_fault() will wait for a memory
- * range to become free and retry.
+ * fuse_wait_dax_page() drops mapping->invalidate_lock and requires it.
+ * In fault path we enter with mapping->invalidate_lock held and can't
+ * drop it. Also in fault path we hold mapping->invalidate_lock shared
+ * and not exclusive, so that creates further issues with
+ * fuse_wait_dax_page(). Hence return -EAGAIN and fuse_dax_fault()
+ * will wait for a memory range to become free and retry.
*/
if (flags & IOMAP_FAULT) {
alloc_dmap = alloc_dax_mapping(fcd);
down_write(&fi->dax->sem);
node = interval_tree_iter_first(&fi->dax->tree, idx, idx);
- /* We are holding either inode lock or i_mmap_sem, and that should
+ /* We are holding either inode lock or invalidate_lock, and that should
* ensure that dmap can't be truncated. We are holding a reference
* on dmap and that should make sure it can't be reclaimed. So dmap
* should still be there in tree despite the fact we dropped and
static void fuse_wait_dax_page(struct inode *inode)
{
- struct fuse_inode *fi = get_fuse_inode(inode);
-
- up_write(&fi->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
schedule();
- down_write(&fi->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
}
-/* Should be called with fi->i_mmap_sem lock held exclusively */
+/* Should be called with mapping->invalidate_lock held exclusively */
static int __fuse_dax_break_layouts(struct inode *inode, bool *retry,
loff_t start, loff_t end)
{
* we do not want any read/write/mmap to make progress and try
* to populate page cache or access memory we are trying to free.
*/
- down_read(&get_fuse_inode(inode)->i_mmap_sem);
+ filemap_invalidate_lock_shared(inode->i_mapping);
ret = dax_iomap_fault(vmf, pe_size, &pfn, &error, &fuse_iomap_ops);
if ((ret & VM_FAULT_ERROR) && error == -EAGAIN) {
error = 0;
retry = true;
- up_read(&get_fuse_inode(inode)->i_mmap_sem);
+ filemap_invalidate_unlock_shared(inode->i_mapping);
goto retry;
}
if (ret & VM_FAULT_NEEDDSYNC)
ret = dax_finish_sync_fault(vmf, pe_size, pfn);
- up_read(&get_fuse_inode(inode)->i_mmap_sem);
+ filemap_invalidate_unlock_shared(inode->i_mapping);
if (write)
sb_end_pagefault(sb);
int ret;
struct interval_tree_node *node;
- down_write(&fi->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
/* Lookup a dmap and corresponding file offset to reclaim. */
down_read(&fi->dax->sem);
out_write_dmap_sem:
up_write(&fi->dax->sem);
out_mmap_sem:
- up_write(&fi->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
return dmap;
}
* had a reference or some other temporary failure,
* Try again. We want to give up inline reclaim only
* if there is no range assigned to this node. Otherwise
- * if a deadlock is possible if we sleep with fi->i_mmap_sem
- * held and worker to free memory can't make progress due
- * to unavailability of fi->i_mmap_sem lock. So sleep
- * only if fi->dax->nr=0
+ * if a deadlock is possible if we sleep with
+ * mapping->invalidate_lock held and worker to free memory
+ * can't make progress due to unavailability of
+ * mapping->invalidate_lock. So sleep only if fi->dax->nr=0
*/
if (retry)
continue;
* There are no mappings which can be reclaimed. Wait for one.
* We are not holding fi->dax->sem. So it is possible
* that range gets added now. But as we are not holding
- * fi->i_mmap_sem, worker should still be able to free up
- * a range and wake us up.
+ * mapping->invalidate_lock, worker should still be able to
+ * free up a range and wake us up.
*/
if (!fi->dax->nr && !(fcd->nr_free_ranges > 0)) {
if (wait_event_killable_exclusive(fcd->range_waitq,
/*
* Free a range of memory.
* Locking:
- * 1. Take fi->i_mmap_sem to block dax faults.
+ * 1. Take mapping->invalidate_lock to block dax faults.
* 2. Take fi->dax->sem to protect interval tree and also to make sure
* read/write can not reuse a dmap which we might be freeing.
*/
loff_t dmap_start = start_idx << FUSE_DAX_SHIFT;
loff_t dmap_end = (dmap_start + FUSE_DAX_SZ) - 1;
- down_write(&fi->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
ret = fuse_dax_break_layouts(inode, dmap_start, dmap_end);
if (ret) {
pr_debug("virtio_fs: fuse_dax_break_layouts() failed. err=%d\n",
ret = lookup_and_reclaim_dmap_locked(fcd, inode, start_idx);
up_write(&fi->dax->sem);
out_mmap_sem:
- up_write(&fi->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
return ret;
}
struct fuse_mount *fm = get_fuse_mount(inode);
struct fuse_conn *fc = fm->fc;
struct fuse_inode *fi = get_fuse_inode(inode);
+ struct address_space *mapping = inode->i_mapping;
FUSE_ARGS(args);
struct fuse_setattr_in inarg;
struct fuse_attr_out outarg;
}
if (FUSE_IS_DAX(inode) && is_truncate) {
- down_write(&fi->i_mmap_sem);
+ filemap_invalidate_lock(mapping);
fault_blocked = true;
err = fuse_dax_break_layouts(inode, 0, 0);
if (err) {
- up_write(&fi->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
return err;
}
}
if ((is_truncate || !is_wb) &&
S_ISREG(inode->i_mode) && oldsize != outarg.attr.size) {
truncate_pagecache(inode, outarg.attr.size);
- invalidate_inode_pages2(inode->i_mapping);
+ invalidate_inode_pages2(mapping);
}
clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
out:
if (fault_blocked)
- up_write(&fi->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
return 0;
clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
if (fault_blocked)
- up_write(&fi->i_mmap_sem);
+ filemap_invalidate_unlock(mapping);
return err;
}
}
if (dax_truncate) {
- down_write(&get_fuse_inode(inode)->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
err = fuse_dax_break_layouts(inode, 0, 0);
if (err)
goto out;
out:
if (dax_truncate)
- up_write(&get_fuse_inode(inode)->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
if (is_wb_truncate | dax_truncate) {
fuse_release_nowrite(inode);
if (lock_inode) {
inode_lock(inode);
if (block_faults) {
- down_write(&fi->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
err = fuse_dax_break_layouts(inode, 0, 0);
if (err)
goto out;
clear_bit(FUSE_I_SIZE_UNSTABLE, &fi->state);
if (block_faults)
- up_write(&fi->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
if (lock_inode)
inode_unlock(inode);
* modifications. Yet this does give less guarantees than if the
* copying was performed with write(2).
*
- * To fix this a i_mmap_sem style lock could be used to prevent new
+ * To fix this a mapping->invalidate_lock could be used to prevent new
* faults while the copy is ongoing.
*/
err = fuse_writeback_range(inode_out, pos_out, pos_out + len - 1);
/** Lock to protect write related fields */
spinlock_t lock;
- /**
- * Can't take inode lock in fault path (leads to circular dependency).
- * Introduce another semaphore which can be taken in fault path and
- * then other filesystem paths can take this to block faults.
- */
- struct rw_semaphore i_mmap_sem;
-
#ifdef CONFIG_FUSE_DAX
/*
* Dax specific inode data
fi->orig_ino = 0;
fi->state = 0;
mutex_init(&fi->mutex);
- init_rwsem(&fi->i_mmap_sem);
spin_lock_init(&fi->lock);
fi->forget = fuse_alloc_forget();
if (!fi->forget)
if (!(fl->fl_flags & FL_POSIX))
return -ENOLCK;
- if (__mandatory_lock(&ip->i_inode) && fl->fl_type != F_UNLCK)
- return -ENOLCK;
-
if (cmd == F_CANCELLK) {
/* Hack: */
cmd = F_SETLK;
config HPFS_FS
tristate "OS/2 HPFS file system support"
depends on BLOCK
+ select FS_IOMAP
help
OS/2 is IBM's operating system for PC's, the same as Warp, and HPFS
is the file system used for organizing files on OS/2 hard disk
#include "hpfs_fn.h"
#include <linux/mpage.h>
+#include <linux/iomap.h>
#include <linux/fiemap.h>
#define BLOCKS(size) (((size) + 511) >> 9)
return r;
}
+static int hpfs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
+ unsigned flags, struct iomap *iomap, struct iomap *srcmap)
+{
+ struct super_block *sb = inode->i_sb;
+ unsigned int blkbits = inode->i_blkbits;
+ unsigned int n_secs;
+ secno s;
+
+ if (WARN_ON_ONCE(flags & (IOMAP_WRITE | IOMAP_ZERO)))
+ return -EINVAL;
+
+ iomap->bdev = inode->i_sb->s_bdev;
+ iomap->offset = offset;
+
+ hpfs_lock(sb);
+ s = hpfs_bmap(inode, offset >> blkbits, &n_secs);
+ if (s) {
+ n_secs = hpfs_search_hotfix_map_for_range(sb, s,
+ min_t(loff_t, n_secs, length));
+ if (unlikely(!n_secs)) {
+ s = hpfs_search_hotfix_map(sb, s);
+ n_secs = 1;
+ }
+ iomap->type = IOMAP_MAPPED;
+ iomap->flags = IOMAP_F_MERGED;
+ iomap->addr = (u64)s << blkbits;
+ iomap->length = (u64)n_secs << blkbits;
+ } else {
+ iomap->type = IOMAP_HOLE;
+ iomap->addr = IOMAP_NULL_ADDR;
+ iomap->length = 1 << blkbits;
+ }
+
+ hpfs_unlock(sb);
+ return 0;
+}
+
+static const struct iomap_ops hpfs_iomap_ops = {
+ .iomap_begin = hpfs_iomap_begin,
+};
+
static int hpfs_readpage(struct file *file, struct page *page)
{
return mpage_readpage(page, hpfs_get_block);
static int hpfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 len)
{
- return generic_block_fiemap(inode, fieinfo, start, len, hpfs_get_block);
+ int ret;
+
+ inode_lock(inode);
+ len = min_t(u64, len, i_size_read(inode));
+ ret = iomap_fiemap(inode, fieinfo, start, len, &hpfs_iomap_ops);
+ inode_unlock(inode);
+
+ return ret;
}
const struct address_space_operations hpfs_aops = {
mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE);
mapping->private_data = NULL;
mapping->writeback_index = 0;
+ __init_rwsem(&mapping->invalidate_lock, "mapping.invalidate_lock",
+ &sb->s_type->invalidate_lock_key);
inode->i_private = NULL;
inode->i_mapping = mapping;
INIT_HLIST_HEAD(&inode->i_dentry); /* buggered by rcu freeing */
complete(&worker->ref_done);
wait_for_completion(&worker->ref_done);
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
if (worker->flags & IO_WORKER_F_FREE)
hlist_nulls_del_rcu(&worker->nulls_node);
list_del_rcu(&worker->all_list);
worker->flags = 0;
current->flags &= ~PF_IO_WORKER;
preempt_enable();
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
kfree_rcu(worker, rcu);
io_worker_ref_put(wqe->wq);
if (!ret) {
bool do_create = false, first = false;
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
if (acct->nr_workers < acct->max_workers) {
if (!acct->nr_workers)
first = true;
acct->nr_workers++;
do_create = true;
}
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
if (do_create) {
atomic_inc(&acct->nr_running);
atomic_inc(&wqe->wq->worker_refs);
wqe = worker->wqe;
wq = wqe->wq;
acct = &wqe->acct[worker->create_index];
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
if (acct->nr_workers < acct->max_workers) {
if (!acct->nr_workers)
first = true;
acct->nr_workers++;
do_create = true;
}
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
if (do_create) {
create_io_worker(wq, wqe, worker->create_index, first);
} else {
spin_unlock(&wq->hash->wait.lock);
}
-static struct io_wq_work *io_get_next_work(struct io_wqe *wqe)
+/*
+ * We can always run the work if the worker is currently the same type as
+ * the work (eg both are bound, or both are unbound). If they are not the
+ * same, only allow it if incrementing the worker count would be allowed.
+ */
+static bool io_worker_can_run_work(struct io_worker *worker,
+ struct io_wq_work *work)
+{
+ struct io_wqe_acct *acct;
+
+ if (!(worker->flags & IO_WORKER_F_BOUND) !=
+ !(work->flags & IO_WQ_WORK_UNBOUND))
+ return true;
+
+ /* not the same type, check if we'd go over the limit */
+ acct = io_work_get_acct(worker->wqe, work);
+ return acct->nr_workers < acct->max_workers;
+}
+
+static struct io_wq_work *io_get_next_work(struct io_wqe *wqe,
+ struct io_worker *worker,
+ bool *stalled)
__must_hold(wqe->lock)
{
struct io_wq_work_node *node, *prev;
work = container_of(node, struct io_wq_work, list);
+ if (!io_worker_can_run_work(worker, work))
+ break;
+
/* not hashed, can run anytime */
if (!io_wq_is_hashed(work)) {
wq_list_del(&wqe->work_list, node, prev);
raw_spin_unlock(&wqe->lock);
io_wait_on_hash(wqe, stall_hash);
raw_spin_lock(&wqe->lock);
+ *stalled = true;
}
return NULL;
cond_resched();
}
- spin_lock_irq(&worker->lock);
+ spin_lock(&worker->lock);
worker->cur_work = work;
- spin_unlock_irq(&worker->lock);
+ spin_unlock(&worker->lock);
}
static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
do {
struct io_wq_work *work;
+ bool stalled;
get_next:
/*
* If we got some work, mark us as busy. If we didn't, but
* can't make progress, any work completion or insertion will
* clear the stalled flag.
*/
- work = io_get_next_work(wqe);
+ stalled = false;
+ work = io_get_next_work(wqe, worker, &stalled);
if (work)
__io_worker_busy(wqe, worker, work);
- else if (!wq_list_empty(&wqe->work_list))
+ else if (stalled)
wqe->flags |= IO_WQE_FLAG_STALLED;
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
if (!work)
break;
io_assign_current_work(worker, work);
clear_bit(hash, &wq->hash->map);
if (wq_has_sleeper(&wq->hash->wait))
wake_up(&wq->hash->wait);
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
wqe->flags &= ~IO_WQE_FLAG_STALLED;
/* skip unnecessary unlock-lock wqe->lock */
if (!work)
goto get_next;
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
}
} while (work);
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
} while (1);
}
set_current_state(TASK_INTERRUPTIBLE);
loop:
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
if (io_wqe_run_queue(wqe)) {
io_worker_handle_work(worker);
goto loop;
}
__io_worker_idle(wqe, worker);
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
if (io_flush_signals())
continue;
ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
}
if (test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
io_worker_handle_work(worker);
}
worker->flags &= ~IO_WORKER_F_RUNNING;
- raw_spin_lock_irq(&worker->wqe->lock);
+ raw_spin_lock(&worker->wqe->lock);
io_wqe_dec_running(worker);
- raw_spin_unlock_irq(&worker->wqe->lock);
+ raw_spin_unlock(&worker->wqe->lock);
}
static void create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index, bool first)
kfree(worker);
fail:
atomic_dec(&acct->nr_running);
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
acct->nr_workers--;
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
io_worker_ref_put(wq);
return;
}
set_cpus_allowed_ptr(tsk, wqe->cpu_mask);
tsk->flags |= PF_NO_SETAFFINITY;
- raw_spin_lock_irq(&wqe->lock);
+ raw_spin_lock(&wqe->lock);
hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
list_add_tail_rcu(&worker->all_list, &wqe->all_list);
worker->flags |= IO_WORKER_F_FREE;
worker->flags |= IO_WORKER_F_BOUND;
if (first && (worker->flags & IO_WORKER_F_BOUND))
worker->flags |= IO_WORKER_F_FIXED;
- raw_spin_unlock_irq(&wqe->lock);
+ raw_spin_unlock(&wqe->lock);
wake_up_new_task(tsk);
}
static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
{
struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
- int work_flags;
- unsigned long flags;
+ bool do_wake;
/*
* If io-wq is exiting for this task, or if the request has explicitly
return;
}
- work_flags = work->flags;
- raw_spin_lock_irqsave(&wqe->lock, flags);
+ raw_spin_lock(&wqe->lock);
io_wqe_insert_work(wqe, work);
wqe->flags &= ~IO_WQE_FLAG_STALLED;
- raw_spin_unlock_irqrestore(&wqe->lock, flags);
+ do_wake = (work->flags & IO_WQ_WORK_CONCURRENT) ||
+ !atomic_read(&acct->nr_running);
+ raw_spin_unlock(&wqe->lock);
- if ((work_flags & IO_WQ_WORK_CONCURRENT) ||
- !atomic_read(&acct->nr_running))
+ if (do_wake)
io_wqe_wake_worker(wqe, acct);
}
static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
{
struct io_cb_cancel_data *match = data;
- unsigned long flags;
/*
* Hold the lock to avoid ->cur_work going out of scope, caller
* may dereference the passed in work.
*/
- spin_lock_irqsave(&worker->lock, flags);
+ spin_lock(&worker->lock);
if (worker->cur_work &&
match->fn(worker->cur_work, match->data)) {
set_notify_signal(worker->task);
match->nr_running++;
}
- spin_unlock_irqrestore(&worker->lock, flags);
+ spin_unlock(&worker->lock);
return match->nr_running && !match->cancel_all;
}
{
struct io_wq_work_node *node, *prev;
struct io_wq_work *work;
- unsigned long flags;
retry:
- raw_spin_lock_irqsave(&wqe->lock, flags);
+ raw_spin_lock(&wqe->lock);
wq_list_for_each(node, prev, &wqe->work_list) {
work = container_of(node, struct io_wq_work, list);
if (!match->fn(work, match->data))
continue;
io_wqe_remove_pending(wqe, work, prev);
- raw_spin_unlock_irqrestore(&wqe->lock, flags);
+ raw_spin_unlock(&wqe->lock);
io_run_cancel(work, wqe);
match->nr_pending++;
if (!match->cancel_all)
/* not safe to continue after unlock */
goto retry;
}
- raw_spin_unlock_irqrestore(&wqe->lock, flags);
+ raw_spin_unlock(&wqe->lock);
}
static void io_wqe_cancel_running_work(struct io_wqe *wqe,
return 0;
}
+/*
+ * Set max number of unbounded workers, returns old value. If new_count is 0,
+ * then just return the old value.
+ */
+int io_wq_max_workers(struct io_wq *wq, int *new_count)
+{
+ int i, node, prev = 0;
+
+ for (i = 0; i < 2; i++) {
+ if (new_count[i] > task_rlimit(current, RLIMIT_NPROC))
+ new_count[i] = task_rlimit(current, RLIMIT_NPROC);
+ }
+
+ rcu_read_lock();
+ for_each_node(node) {
+ struct io_wqe_acct *acct;
+
+ for (i = 0; i < 2; i++) {
+ acct = &wq->wqes[node]->acct[i];
+ prev = max_t(int, acct->max_workers, prev);
+ if (new_count[i])
+ acct->max_workers = new_count[i];
+ new_count[i] = prev;
+ }
+ }
+ rcu_read_unlock();
+ return 0;
+}
+
static __init int io_wq_init(void)
{
int ret;
void io_wq_hash_work(struct io_wq_work *work, void *val);
int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
+int io_wq_max_workers(struct io_wq *wq, int *new_count);
static inline bool io_wq_is_hashed(struct io_wq_work *work)
{
struct io_submit_state submit_state;
struct list_head timeout_list;
+ struct list_head ltimeout_list;
struct list_head cq_overflow_list;
struct xarray io_buffers;
struct xarray personalities;
struct hrtimer timer;
struct timespec64 ts;
enum hrtimer_mode mode;
+ u32 flags;
};
struct io_accept {
struct sockaddr __user *addr;
int __user *addr_len;
int flags;
+ u32 file_slot;
unsigned long nofile;
};
/* timeout update */
struct timespec64 ts;
u32 flags;
+ bool ltimeout;
};
struct io_rw {
struct io_open {
struct file *file;
int dfd;
+ u32 file_slot;
struct filename *filename;
struct open_how how;
unsigned long nofile;
struct io_poll_iocb *double_poll;
};
-typedef void (*io_req_tw_func_t)(struct io_kiocb *req);
+typedef void (*io_req_tw_func_t)(struct io_kiocb *req, bool *locked);
struct io_task_work {
union {
static void io_submit_flush_completions(struct io_ring_ctx *ctx);
static int io_req_prep_async(struct io_kiocb *req);
+static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
+ unsigned int issue_flags, u32 slot_index);
+static enum hrtimer_restart io_link_timeout_fn(struct hrtimer *timer);
+
static struct kmem_cache *req_cachep;
static const struct file_operations io_uring_fops;
}
EXPORT_SYMBOL(io_uring_get_socket);
+static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
+{
+ if (!*locked) {
+ mutex_lock(&ctx->uring_lock);
+ *locked = true;
+ }
+}
+
#define io_for_each_link(pos, head) \
for (pos = (head); pos; pos = pos->link)
req->flags |= REQ_F_FAIL;
}
+static inline void req_fail_link_node(struct io_kiocb *req, int res)
+{
+ req_set_fail(req);
+ req->result = res;
+}
+
static void io_ring_ctx_ref_free(struct percpu_ref *ref)
{
struct io_ring_ctx *ctx = container_of(ref, struct io_ring_ctx, refs);
fallback_work.work);
struct llist_node *node = llist_del_all(&ctx->fallback_llist);
struct io_kiocb *req, *tmp;
+ bool locked = false;
percpu_ref_get(&ctx->refs);
llist_for_each_entry_safe(req, tmp, node, io_task_work.fallback_node)
- req->io_task_work.func(req);
+ req->io_task_work.func(req, &locked);
+
+ if (locked) {
+ if (ctx->submit_state.compl_nr)
+ io_submit_flush_completions(ctx);
+ mutex_unlock(&ctx->uring_lock);
+ }
percpu_ref_put(&ctx->refs);
+
}
static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
INIT_LIST_HEAD(&ctx->iopoll_list);
INIT_LIST_HEAD(&ctx->defer_list);
INIT_LIST_HEAD(&ctx->timeout_list);
+ INIT_LIST_HEAD(&ctx->ltimeout_list);
spin_lock_init(&ctx->rsrc_ref_lock);
INIT_LIST_HEAD(&ctx->rsrc_ref_list);
INIT_DELAYED_WORK(&ctx->rsrc_put_work, io_rsrc_put_work);
}
}
-static void io_queue_async_work(struct io_kiocb *req)
+static void io_queue_async_work(struct io_kiocb *req, bool *locked)
{
struct io_ring_ctx *ctx = req->ctx;
struct io_kiocb *link = io_prep_linked_timeout(req);
struct io_uring_task *tctx = req->task->io_uring;
+ /* must not take the lock, NULL it as a precaution */
+ locked = NULL;
+
BUG_ON(!tctx);
BUG_ON(!tctx->io_wq);
}
}
+static void io_task_refs_refill(struct io_uring_task *tctx)
+{
+ unsigned int refill = -tctx->cached_refs + IO_TCTX_REFS_CACHE_NR;
+
+ percpu_counter_add(&tctx->inflight, refill);
+ refcount_add(refill, ¤t->usage);
+ tctx->cached_refs += refill;
+}
+
+static inline void io_get_task_refs(int nr)
+{
+ struct io_uring_task *tctx = current->io_uring;
+
+ tctx->cached_refs -= nr;
+ if (unlikely(tctx->cached_refs < 0))
+ io_task_refs_refill(tctx);
+}
+
static bool io_cqring_event_overflow(struct io_ring_ctx *ctx, u64 user_data,
long res, unsigned int cflags)
{
io_remove_next_linked(req);
link->timeout.head = NULL;
if (hrtimer_try_to_cancel(&io->timer) != -1) {
+ list_del(&link->timeout.list);
io_cqring_fill_event(link->ctx, link->user_data,
-ECANCELED, 0);
io_put_req_deferred(link);
req->link = NULL;
while (link) {
+ long res = -ECANCELED;
+
+ if (link->flags & REQ_F_FAIL)
+ res = link->result;
+
nxt = link->link;
link->link = NULL;
trace_io_uring_fail_link(req, link);
- io_cqring_fill_event(link->ctx, link->user_data, -ECANCELED, 0);
+ io_cqring_fill_event(link->ctx, link->user_data, res, 0);
io_put_req_deferred(link);
link = nxt;
}
return __io_req_find_next(req);
}
-static void ctx_flush_and_put(struct io_ring_ctx *ctx)
+static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
{
if (!ctx)
return;
- if (ctx->submit_state.compl_nr) {
- mutex_lock(&ctx->uring_lock);
+ if (*locked) {
if (ctx->submit_state.compl_nr)
io_submit_flush_completions(ctx);
mutex_unlock(&ctx->uring_lock);
+ *locked = false;
}
percpu_ref_put(&ctx->refs);
}
static void tctx_task_work(struct callback_head *cb)
{
+ bool locked = false;
struct io_ring_ctx *ctx = NULL;
struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
task_work);
io_task_work.node);
if (req->ctx != ctx) {
- ctx_flush_and_put(ctx);
+ ctx_flush_and_put(ctx, &locked);
ctx = req->ctx;
+ /* if not contended, grab and improve batching */
+ locked = mutex_trylock(&ctx->uring_lock);
percpu_ref_get(&ctx->refs);
}
- req->io_task_work.func(req);
+ req->io_task_work.func(req, &locked);
node = next;
} while (node);
cond_resched();
}
- ctx_flush_and_put(ctx);
+ ctx_flush_and_put(ctx, &locked);
}
static void io_req_task_work_add(struct io_kiocb *req)
}
}
-static void io_req_task_cancel(struct io_kiocb *req)
+static void io_req_task_cancel(struct io_kiocb *req, bool *locked)
{
struct io_ring_ctx *ctx = req->ctx;
- /* ctx is guaranteed to stay alive while we hold uring_lock */
- mutex_lock(&ctx->uring_lock);
+ /* not needed for normal modes, but SQPOLL depends on it */
+ io_tw_lock(ctx, locked);
io_req_complete_failed(req, req->result);
- mutex_unlock(&ctx->uring_lock);
}
-static void io_req_task_submit(struct io_kiocb *req)
+static void io_req_task_submit(struct io_kiocb *req, bool *locked)
{
struct io_ring_ctx *ctx = req->ctx;
- /* ctx stays valid until unlock, even if we drop all ours ctx->refs */
- mutex_lock(&ctx->uring_lock);
+ io_tw_lock(ctx, locked);
/* req->task == current here, checking PF_EXITING is safe */
if (likely(!(req->task->flags & PF_EXITING)))
__io_queue_sqe(req);
else
io_req_complete_failed(req, -EFAULT);
- mutex_unlock(&ctx->uring_lock);
}
static void io_req_task_queue_fail(struct io_kiocb *req, int ret)
__io_free_req(req);
}
+static void io_free_req_work(struct io_kiocb *req, bool *locked)
+{
+ io_free_req(req);
+}
+
struct req_batch {
struct task_struct *task;
int task_refs;
static inline void io_put_req_deferred(struct io_kiocb *req)
{
if (req_ref_put_and_test(req)) {
- req->io_task_work.func = io_free_req;
+ req->io_task_work.func = io_free_req_work;
io_req_task_work_add(req);
}
}
return false;
}
-static void io_req_task_complete(struct io_kiocb *req)
+static void io_req_task_complete(struct io_kiocb *req, bool *locked)
{
- __io_req_complete(req, 0, req->result, io_put_rw_kbuf(req));
+ unsigned int cflags = io_put_rw_kbuf(req);
+ long res = req->result;
+
+ if (*locked) {
+ struct io_ring_ctx *ctx = req->ctx;
+ struct io_submit_state *state = &ctx->submit_state;
+
+ io_req_complete_state(req, res, cflags);
+ state->compl_reqs[state->compl_nr++] = req;
+ if (state->compl_nr == ARRAY_SIZE(state->compl_reqs))
+ io_submit_flush_completions(ctx);
+ } else {
+ io_req_complete_post(req, res, cflags);
+ }
}
static void __io_complete_rw(struct io_kiocb *req, long res, long res2,
{
if (__io_complete_rw_common(req, res))
return;
- io_req_task_complete(req);
+ __io_req_complete(req, 0, req->result, io_put_rw_kbuf(req));
}
static void io_complete_rw(struct kiocb *kiocb, long res, long res2)
!kiocb->ki_filp->f_op->iopoll)
return -EOPNOTSUPP;
- kiocb->ki_flags |= IOCB_HIPRI;
+ kiocb->ki_flags |= IOCB_HIPRI | IOCB_ALLOC_CACHE;
kiocb->ki_complete = io_complete_rw_iopoll;
req->iopoll_completed = 0;
} else {
if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
return -EINVAL;
- if (unlikely(sqe->ioprio || sqe->buf_index || sqe->splice_fd_in))
+ if (unlikely(sqe->ioprio || sqe->buf_index))
return -EINVAL;
if (unlikely(req->flags & REQ_F_FIXED_FILE))
return -EBADF;
req->open.filename = NULL;
return ret;
}
+
+ req->open.file_slot = READ_ONCE(sqe->file_index);
+ if (req->open.file_slot && (req->open.how.flags & O_CLOEXEC))
+ return -EINVAL;
+
req->open.nofile = rlimit(RLIMIT_NOFILE);
req->flags |= REQ_F_NEED_CLEANUP;
return 0;
{
struct open_flags op;
struct file *file;
- bool nonblock_set;
- bool resolve_nonblock;
+ bool resolve_nonblock, nonblock_set;
+ bool fixed = !!req->open.file_slot;
int ret;
ret = build_open_flags(&req->open.how, &op);
op.open_flag |= O_NONBLOCK;
}
- ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
- if (ret < 0)
- goto err;
+ if (!fixed) {
+ ret = __get_unused_fd_flags(req->open.how.flags, req->open.nofile);
+ if (ret < 0)
+ goto err;
+ }
file = do_filp_open(req->open.dfd, req->open.filename, &op);
if (IS_ERR(file)) {
* marginal gain for something that is now known to be a slower
* path. So just put it, and we'll get a new one when we retry.
*/
- put_unused_fd(ret);
+ if (!fixed)
+ put_unused_fd(ret);
ret = PTR_ERR(file);
/* only retry if RESOLVE_CACHED wasn't already set by application */
if ((issue_flags & IO_URING_F_NONBLOCK) && !nonblock_set)
file->f_flags &= ~O_NONBLOCK;
fsnotify_open(file);
- fd_install(ret, file);
+
+ if (!fixed)
+ fd_install(ret, file);
+ else
+ ret = io_install_fixed_file(req, file, issue_flags,
+ req->open.file_slot - 1);
err:
putname(req->open.filename);
req->flags &= ~REQ_F_NEED_CLEANUP;
if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
return -EINVAL;
- if (sqe->ioprio || sqe->len || sqe->buf_index || sqe->splice_fd_in)
+ if (sqe->ioprio || sqe->len || sqe->buf_index)
return -EINVAL;
accept->addr = u64_to_user_ptr(READ_ONCE(sqe->addr));
accept->addr_len = u64_to_user_ptr(READ_ONCE(sqe->addr2));
accept->flags = READ_ONCE(sqe->accept_flags);
accept->nofile = rlimit(RLIMIT_NOFILE);
+
+ accept->file_slot = READ_ONCE(sqe->file_index);
+ if (accept->file_slot && ((req->open.how.flags & O_CLOEXEC) ||
+ (accept->flags & SOCK_CLOEXEC)))
+ return -EINVAL;
+ if (accept->flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
+ return -EINVAL;
+ if (SOCK_NONBLOCK != O_NONBLOCK && (accept->flags & SOCK_NONBLOCK))
+ accept->flags = (accept->flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
return 0;
}
struct io_accept *accept = &req->accept;
bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
unsigned int file_flags = force_nonblock ? O_NONBLOCK : 0;
- int ret;
+ bool fixed = !!accept->file_slot;
+ struct file *file;
+ int ret, fd;
if (req->file->f_flags & O_NONBLOCK)
req->flags |= REQ_F_NOWAIT;
- ret = __sys_accept4_file(req->file, file_flags, accept->addr,
- accept->addr_len, accept->flags,
- accept->nofile);
- if (ret == -EAGAIN && force_nonblock)
- return -EAGAIN;
- if (ret < 0) {
+ if (!fixed) {
+ fd = __get_unused_fd_flags(accept->flags, accept->nofile);
+ if (unlikely(fd < 0))
+ return fd;
+ }
+ file = do_accept(req->file, file_flags, accept->addr, accept->addr_len,
+ accept->flags);
+ if (IS_ERR(file)) {
+ if (!fixed)
+ put_unused_fd(fd);
+ ret = PTR_ERR(file);
+ if (ret == -EAGAIN && force_nonblock)
+ return -EAGAIN;
if (ret == -ERESTARTSYS)
ret = -EINTR;
req_set_fail(req);
+ } else if (!fixed) {
+ fd_install(fd, file);
+ ret = fd;
+ } else {
+ ret = io_install_fixed_file(req, file, issue_flags,
+ accept->file_slot - 1);
}
__io_req_complete(req, issue_flags, ret, 0);
return 0;
return !(flags & IORING_CQE_F_MORE);
}
-static void io_poll_task_func(struct io_kiocb *req)
+static void io_poll_task_func(struct io_kiocb *req, bool *locked)
{
struct io_ring_ctx *ctx = req->ctx;
struct io_kiocb *nxt;
if (done) {
nxt = io_put_req_find_next(req);
if (nxt)
- io_req_task_submit(nxt);
+ io_req_task_submit(nxt, locked);
}
}
}
__io_queue_proc(&apoll->poll, pt, head, &apoll->double_poll);
}
-static void io_async_task_func(struct io_kiocb *req)
+static void io_async_task_func(struct io_kiocb *req, bool *locked)
{
struct async_poll *apoll = req->apoll;
struct io_ring_ctx *ctx = req->ctx;
spin_unlock(&ctx->completion_lock);
if (!READ_ONCE(apoll->poll.canceled))
- io_req_task_submit(req);
+ io_req_task_submit(req, locked);
else
io_req_complete_failed(req, -ECANCELED);
}
return 0;
}
-static void io_req_task_timeout(struct io_kiocb *req)
+static void io_req_task_timeout(struct io_kiocb *req, bool *locked)
{
req_set_fail(req);
io_req_complete_post(req, -ETIME, 0);
return 0;
}
+static clockid_t io_timeout_get_clock(struct io_timeout_data *data)
+{
+ switch (data->flags & IORING_TIMEOUT_CLOCK_MASK) {
+ case IORING_TIMEOUT_BOOTTIME:
+ return CLOCK_BOOTTIME;
+ case IORING_TIMEOUT_REALTIME:
+ return CLOCK_REALTIME;
+ default:
+ /* can't happen, vetted at prep time */
+ WARN_ON_ONCE(1);
+ fallthrough;
+ case 0:
+ return CLOCK_MONOTONIC;
+ }
+}
+
+static int io_linked_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
+ struct timespec64 *ts, enum hrtimer_mode mode)
+ __must_hold(&ctx->timeout_lock)
+{
+ struct io_timeout_data *io;
+ struct io_kiocb *req;
+ bool found = false;
+
+ list_for_each_entry(req, &ctx->ltimeout_list, timeout.list) {
+ found = user_data == req->user_data;
+ if (found)
+ break;
+ }
+ if (!found)
+ return -ENOENT;
+
+ io = req->async_data;
+ if (hrtimer_try_to_cancel(&io->timer) == -1)
+ return -EALREADY;
+ hrtimer_init(&io->timer, io_timeout_get_clock(io), mode);
+ io->timer.function = io_link_timeout_fn;
+ hrtimer_start(&io->timer, timespec64_to_ktime(*ts), mode);
+ return 0;
+}
+
static int io_timeout_update(struct io_ring_ctx *ctx, __u64 user_data,
struct timespec64 *ts, enum hrtimer_mode mode)
__must_hold(&ctx->timeout_lock)
req->timeout.off = 0; /* noseq */
data = req->async_data;
list_add_tail(&req->timeout.list, &ctx->timeout_list);
- hrtimer_init(&data->timer, CLOCK_MONOTONIC, mode);
+ hrtimer_init(&data->timer, io_timeout_get_clock(data), mode);
data->timer.function = io_timeout_fn;
hrtimer_start(&data->timer, timespec64_to_ktime(*ts), mode);
return 0;
if (sqe->ioprio || sqe->buf_index || sqe->len || sqe->splice_fd_in)
return -EINVAL;
+ tr->ltimeout = false;
tr->addr = READ_ONCE(sqe->addr);
tr->flags = READ_ONCE(sqe->timeout_flags);
- if (tr->flags & IORING_TIMEOUT_UPDATE) {
- if (tr->flags & ~(IORING_TIMEOUT_UPDATE|IORING_TIMEOUT_ABS))
+ if (tr->flags & IORING_TIMEOUT_UPDATE_MASK) {
+ if (hweight32(tr->flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
+ return -EINVAL;
+ if (tr->flags & IORING_LINK_TIMEOUT_UPDATE)
+ tr->ltimeout = true;
+ if (tr->flags & ~(IORING_TIMEOUT_UPDATE_MASK|IORING_TIMEOUT_ABS))
return -EINVAL;
if (get_timespec64(&tr->ts, u64_to_user_ptr(sqe->addr2)))
return -EFAULT;
spin_unlock_irq(&ctx->timeout_lock);
spin_unlock(&ctx->completion_lock);
} else {
+ enum hrtimer_mode mode = io_translate_timeout_mode(tr->flags);
+
spin_lock_irq(&ctx->timeout_lock);
- ret = io_timeout_update(ctx, tr->addr, &tr->ts,
- io_translate_timeout_mode(tr->flags));
+ if (tr->ltimeout)
+ ret = io_linked_timeout_update(ctx, tr->addr, &tr->ts, mode);
+ else
+ ret = io_timeout_update(ctx, tr->addr, &tr->ts, mode);
spin_unlock_irq(&ctx->timeout_lock);
}
if (off && is_timeout_link)
return -EINVAL;
flags = READ_ONCE(sqe->timeout_flags);
- if (flags & ~IORING_TIMEOUT_ABS)
+ if (flags & ~(IORING_TIMEOUT_ABS | IORING_TIMEOUT_CLOCK_MASK))
+ return -EINVAL;
+ /* more than one clock specified is invalid, obviously */
+ if (hweight32(flags & IORING_TIMEOUT_CLOCK_MASK) > 1)
return -EINVAL;
+ INIT_LIST_HEAD(&req->timeout.list);
req->timeout.off = off;
if (unlikely(off && !req->ctx->off_timeout_used))
req->ctx->off_timeout_used = true;
data = req->async_data;
data->req = req;
+ data->flags = flags;
if (get_timespec64(&data->ts, u64_to_user_ptr(sqe->addr)))
return -EFAULT;
data->mode = io_translate_timeout_mode(flags);
- hrtimer_init(&data->timer, CLOCK_MONOTONIC, data->mode);
+ hrtimer_init(&data->timer, io_timeout_get_clock(data), data->mode);
if (is_timeout_link) {
struct io_submit_link *link = &req->ctx->submit_state.link;
struct io_ring_ctx *ctx = req->ctx;
int ret;
- WARN_ON_ONCE(req->task != current);
+ WARN_ON_ONCE(!io_wq_current_is_worker() && req->task != current);
ret = io_async_cancel_one(req->task->io_uring, sqe_addr, ctx);
if (ret != -ENOENT)
if (!req_need_defer(req, seq) && list_empty(&ctx->defer_list)) {
spin_unlock(&ctx->completion_lock);
kfree(de);
- io_queue_async_work(req);
+ io_queue_async_work(req, NULL);
return true;
}
if (timeout)
io_queue_linked_timeout(timeout);
+ /* either cancelled or io-wq is dying, so don't touch tctx->iowq */
if (work->flags & IO_WQ_WORK_CANCEL)
ret = -ECANCELED;
return io_file_get_normal(ctx, req, fd);
}
-static void io_req_task_link_timeout(struct io_kiocb *req)
+static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
{
struct io_kiocb *prev = req->timeout.prev;
int ret;
if (!req_ref_inc_not_zero(prev))
prev = NULL;
}
+ list_del(&req->timeout.list);
req->timeout.prev = prev;
spin_unlock_irqrestore(&ctx->timeout_lock, flags);
data->timer.function = io_link_timeout_fn;
hrtimer_start(&data->timer, timespec64_to_ktime(data->ts),
data->mode);
+ list_add_tail(&req->timeout.list, &ctx->ltimeout_list);
}
spin_unlock_irq(&ctx->timeout_lock);
/* drop submission reference */
* Queued up for async execution, worker will release
* submit reference when the iocb is actually submitted.
*/
- io_queue_async_work(req);
+ io_queue_async_work(req, NULL);
break;
}
if (unlikely(req->ctx->drain_active) && io_drain_req(req))
return;
- if (likely(!(req->flags & REQ_F_FORCE_ASYNC))) {
+ if (likely(!(req->flags & (REQ_F_FORCE_ASYNC | REQ_F_FAIL)))) {
__io_queue_sqe(req);
+ } else if (req->flags & REQ_F_FAIL) {
+ io_req_complete_failed(req, req->result);
} else {
int ret = io_req_prep_async(req);
if (unlikely(ret))
io_req_complete_failed(req, ret);
else
- io_queue_async_work(req);
+ io_queue_async_work(req, NULL);
}
}
ret = io_init_req(ctx, req, sqe);
if (unlikely(ret)) {
fail_req:
+ /* fail even hard links since we don't submit */
if (link->head) {
- /* fail even hard links since we don't submit */
- req_set_fail(link->head);
- io_req_complete_failed(link->head, -ECANCELED);
- link->head = NULL;
+ /*
+ * we can judge a link req is failed or cancelled by if
+ * REQ_F_FAIL is set, but the head is an exception since
+ * it may be set REQ_F_FAIL because of other req's failure
+ * so let's leverage req->result to distinguish if a head
+ * is set REQ_F_FAIL because of its failure or other req's
+ * failure so that we can set the correct ret code for it.
+ * init result here to avoid affecting the normal path.
+ */
+ if (!(link->head->flags & REQ_F_FAIL))
+ req_fail_link_node(link->head, -ECANCELED);
+ } else if (!(req->flags & (REQ_F_LINK | REQ_F_HARDLINK))) {
+ /*
+ * the current req is a normal req, we should return
+ * error and thus break the submittion loop.
+ */
+ io_req_complete_failed(req, ret);
+ return ret;
}
- io_req_complete_failed(req, ret);
- return ret;
+ req_fail_link_node(req, ret);
+ } else {
+ ret = io_req_prep(req, sqe);
+ if (unlikely(ret))
+ goto fail_req;
}
- ret = io_req_prep(req, sqe);
- if (unlikely(ret))
- goto fail_req;
-
/* don't need @sqe from now on */
trace_io_uring_submit_sqe(ctx, req, req->opcode, req->user_data,
req->flags, true,
if (link->head) {
struct io_kiocb *head = link->head;
- ret = io_req_prep_async(req);
- if (unlikely(ret))
- goto fail_req;
+ if (!(req->flags & REQ_F_FAIL)) {
+ ret = io_req_prep_async(req);
+ if (unlikely(ret)) {
+ req_fail_link_node(req, ret);
+ if (!(head->flags & REQ_F_FAIL))
+ req_fail_link_node(head, -ECANCELED);
+ }
+ }
trace_io_uring_link(ctx, req, head);
link->last->link = req;
link->last = req;
static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
__must_hold(&ctx->uring_lock)
{
- struct io_uring_task *tctx;
int submitted = 0;
/* make sure SQ entry isn't read before tail */
nr = min3(nr, ctx->sq_entries, io_sqring_entries(ctx));
if (!percpu_ref_tryget_many(&ctx->refs, nr))
return -EAGAIN;
+ io_get_task_refs(nr);
- tctx = current->io_uring;
- tctx->cached_refs -= nr;
- if (unlikely(tctx->cached_refs < 0)) {
- unsigned int refill = -tctx->cached_refs + IO_TCTX_REFS_CACHE_NR;
-
- percpu_counter_add(&tctx->inflight, refill);
- refcount_add(refill, ¤t->usage);
- tctx->cached_refs += refill;
- }
io_submit_state_start(&ctx->submit_state, nr);
-
while (submitted < nr) {
const struct io_uring_sqe *sqe;
struct io_kiocb *req;
}
sqe = io_get_sqe(ctx);
if (unlikely(!sqe)) {
- kmem_cache_free(req_cachep, req);
+ list_add(&req->inflight_entry, &ctx->submit_state.free_list);
break;
}
/* will complete beyond this point, count as submitted */
#endif
}
+static int io_install_fixed_file(struct io_kiocb *req, struct file *file,
+ unsigned int issue_flags, u32 slot_index)
+{
+ struct io_ring_ctx *ctx = req->ctx;
+ bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+ struct io_fixed_file *file_slot;
+ int ret = -EBADF;
+
+ io_ring_submit_lock(ctx, !force_nonblock);
+ if (file->f_op == &io_uring_fops)
+ goto err;
+ ret = -ENXIO;
+ if (!ctx->file_data)
+ goto err;
+ ret = -EINVAL;
+ if (slot_index >= ctx->nr_user_files)
+ goto err;
+
+ slot_index = array_index_nospec(slot_index, ctx->nr_user_files);
+ file_slot = io_fixed_file_slot(&ctx->file_table, slot_index);
+ ret = -EBADF;
+ if (file_slot->file_ptr)
+ goto err;
+
+ *io_get_tag_slot(ctx->file_data, slot_index) = 0;
+ io_fixed_file_set(file_slot, file);
+ ret = io_sqe_file_register(ctx, file, slot_index);
+ if (ret) {
+ file_slot->file_ptr = 0;
+ goto err;
+ }
+
+ ret = 0;
+err:
+ io_ring_submit_unlock(ctx, !force_nonblock);
+ if (ret)
+ fput(file);
+ return ret;
+}
+
static int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx,
struct io_rsrc_node *node, void *rsrc)
{
sock_release(ctx->ring_sock);
}
#endif
+ WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
io_mem_free(ctx->rings);
io_mem_free(ctx->sq_sqes);
* Must be after io_uring_del_task_file() (removes nodes under
* uring_lock) to avoid race with io_uring_try_cancel_iowq().
*/
- tctx->io_wq = NULL;
io_wq_put_and_exit(wq);
+ tctx->io_wq = NULL;
}
}
return io_wq_cpu_affinity(tctx->io_wq, NULL);
}
+static int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
+ void __user *arg)
+{
+ struct io_uring_task *tctx = current->io_uring;
+ __u32 new_count[2];
+ int i, ret;
+
+ if (!tctx || !tctx->io_wq)
+ return -EINVAL;
+ if (copy_from_user(new_count, arg, sizeof(new_count)))
+ return -EFAULT;
+ for (i = 0; i < ARRAY_SIZE(new_count); i++)
+ if (new_count[i] > INT_MAX)
+ return -EINVAL;
+
+ ret = io_wq_max_workers(tctx->io_wq, new_count);
+ if (ret)
+ return ret;
+
+ if (copy_to_user(arg, new_count, sizeof(new_count)))
+ return -EFAULT;
+
+ return 0;
+}
+
static bool io_register_op_must_quiesce(int op)
{
switch (op) {
case IORING_REGISTER_BUFFERS_UPDATE:
case IORING_REGISTER_IOWQ_AFF:
case IORING_UNREGISTER_IOWQ_AFF:
+ case IORING_REGISTER_IOWQ_MAX_WORKERS:
return false;
default:
return true;
break;
ret = io_unregister_iowq_aff(ctx);
break;
+ case IORING_REGISTER_IOWQ_MAX_WORKERS:
+ ret = -EINVAL;
+ if (!arg || nr_args != 2)
+ break;
+ ret = io_register_iowq_max_workers(ctx, arg);
+ break;
default:
ret = -EINVAL;
break;
BUILD_BUG_SQE_ELEM(40, __u16, buf_group);
BUILD_BUG_SQE_ELEM(42, __u16, personality);
BUILD_BUG_SQE_ELEM(44, __s32, splice_fd_in);
+ BUILD_BUG_SQE_ELEM(44, __u32, file_index);
BUILD_BUG_ON(sizeof(struct io_uring_files_update) !=
sizeof(struct io_uring_rsrc_update));
BUILD_BUG_ON(sizeof(struct io_uring_rsrc_update) >
sizeof(struct io_uring_rsrc_update2));
+
+ /* ->buf_index is u16 */
+ BUILD_BUG_ON(IORING_MAX_REG_BUFFERS >= (1u << 16));
+
/* should fit into one byte */
BUILD_BUG_ON(SQE_VALID_FLAGS >= (1 << 8));
args.src_length, args.dest_offset);
}
-#ifdef CONFIG_BLOCK
-
-static inline sector_t logical_to_blk(struct inode *inode, loff_t offset)
-{
- return (offset >> inode->i_blkbits);
-}
-
-static inline loff_t blk_to_logical(struct inode *inode, sector_t blk)
-{
- return (blk << inode->i_blkbits);
-}
-
-/**
- * __generic_block_fiemap - FIEMAP for block based inodes (no locking)
- * @inode: the inode to map
- * @fieinfo: the fiemap info struct that will be passed back to userspace
- * @start: where to start mapping in the inode
- * @len: how much space to map
- * @get_block: the fs's get_block function
- *
- * This does FIEMAP for block based inodes. Basically it will just loop
- * through get_block until we hit the number of extents we want to map, or we
- * go past the end of the file and hit a hole.
- *
- * If it is possible to have data blocks beyond a hole past @inode->i_size, then
- * please do not use this function, it will stop at the first unmapped block
- * beyond i_size.
- *
- * If you use this function directly, you need to do your own locking. Use
- * generic_block_fiemap if you want the locking done for you.
- */
-static int __generic_block_fiemap(struct inode *inode,
- struct fiemap_extent_info *fieinfo, loff_t start,
- loff_t len, get_block_t *get_block)
-{
- struct buffer_head map_bh;
- sector_t start_blk, last_blk;
- loff_t isize = i_size_read(inode);
- u64 logical = 0, phys = 0, size = 0;
- u32 flags = FIEMAP_EXTENT_MERGED;
- bool past_eof = false, whole_file = false;
- int ret = 0;
-
- ret = fiemap_prep(inode, fieinfo, start, &len, FIEMAP_FLAG_SYNC);
- if (ret)
- return ret;
-
- /*
- * Either the i_mutex or other appropriate locking needs to be held
- * since we expect isize to not change at all through the duration of
- * this call.
- */
- if (len >= isize) {
- whole_file = true;
- len = isize;
- }
-
- /*
- * Some filesystems can't deal with being asked to map less than
- * blocksize, so make sure our len is at least block length.
- */
- if (logical_to_blk(inode, len) == 0)
- len = blk_to_logical(inode, 1);
-
- start_blk = logical_to_blk(inode, start);
- last_blk = logical_to_blk(inode, start + len - 1);
-
- do {
- /*
- * we set b_size to the total size we want so it will map as
- * many contiguous blocks as possible at once
- */
- memset(&map_bh, 0, sizeof(struct buffer_head));
- map_bh.b_size = len;
-
- ret = get_block(inode, start_blk, &map_bh, 0);
- if (ret)
- break;
-
- /* HOLE */
- if (!buffer_mapped(&map_bh)) {
- start_blk++;
-
- /*
- * We want to handle the case where there is an
- * allocated block at the front of the file, and then
- * nothing but holes up to the end of the file properly,
- * to make sure that extent at the front gets properly
- * marked with FIEMAP_EXTENT_LAST
- */
- if (!past_eof &&
- blk_to_logical(inode, start_blk) >= isize)
- past_eof = 1;
-
- /*
- * First hole after going past the EOF, this is our
- * last extent
- */
- if (past_eof && size) {
- flags = FIEMAP_EXTENT_MERGED|FIEMAP_EXTENT_LAST;
- ret = fiemap_fill_next_extent(fieinfo, logical,
- phys, size,
- flags);
- } else if (size) {
- ret = fiemap_fill_next_extent(fieinfo, logical,
- phys, size, flags);
- size = 0;
- }
-
- /* if we have holes up to/past EOF then we're done */
- if (start_blk > last_blk || past_eof || ret)
- break;
- } else {
- /*
- * We have gone over the length of what we wanted to
- * map, and it wasn't the entire file, so add the extent
- * we got last time and exit.
- *
- * This is for the case where say we want to map all the
- * way up to the second to the last block in a file, but
- * the last block is a hole, making the second to last
- * block FIEMAP_EXTENT_LAST. In this case we want to
- * see if there is a hole after the second to last block
- * so we can mark it properly. If we found data after
- * we exceeded the length we were requesting, then we
- * are good to go, just add the extent to the fieinfo
- * and break
- */
- if (start_blk > last_blk && !whole_file) {
- ret = fiemap_fill_next_extent(fieinfo, logical,
- phys, size,
- flags);
- break;
- }
-
- /*
- * if size != 0 then we know we already have an extent
- * to add, so add it.
- */
- if (size) {
- ret = fiemap_fill_next_extent(fieinfo, logical,
- phys, size,
- flags);
- if (ret)
- break;
- }
-
- logical = blk_to_logical(inode, start_blk);
- phys = blk_to_logical(inode, map_bh.b_blocknr);
- size = map_bh.b_size;
- flags = FIEMAP_EXTENT_MERGED;
-
- start_blk += logical_to_blk(inode, size);
-
- /*
- * If we are past the EOF, then we need to make sure as
- * soon as we find a hole that the last extent we found
- * is marked with FIEMAP_EXTENT_LAST
- */
- if (!past_eof && logical + size >= isize)
- past_eof = true;
- }
- cond_resched();
- if (fatal_signal_pending(current)) {
- ret = -EINTR;
- break;
- }
-
- } while (1);
-
- /* If ret is 1 then we just hit the end of the extent array */
- if (ret == 1)
- ret = 0;
-
- return ret;
-}
-
-/**
- * generic_block_fiemap - FIEMAP for block based inodes
- * @inode: The inode to map
- * @fieinfo: The mapping information
- * @start: The initial block to map
- * @len: The length of the extect to attempt to map
- * @get_block: The block mapping function for the fs
- *
- * Calls __generic_block_fiemap to map the inode, after taking
- * the inode's mutex lock.
- */
-
-int generic_block_fiemap(struct inode *inode,
- struct fiemap_extent_info *fieinfo, u64 start,
- u64 len, get_block_t *get_block)
-{
- int ret;
- inode_lock(inode);
- ret = __generic_block_fiemap(inode, fieinfo, start, len, get_block);
- inode_unlock(inode);
- return ret;
-}
-EXPORT_SYMBOL(generic_block_fiemap);
-
-#endif /* CONFIG_BLOCK */
-
/*
* This provides compatibility with legacy XFS pre-allocation ioctls
* which predate the fallocate syscall.
unsigned int overriderockperm:1;
unsigned int uid_set:1;
unsigned int gid_set:1;
- unsigned int utf8:1;
unsigned char map;
unsigned char check;
unsigned int blocksize;
popt->gid = GLOBAL_ROOT_GID;
popt->uid = GLOBAL_ROOT_UID;
popt->iocharset = NULL;
- popt->utf8 = 0;
popt->overriderockperm = 0;
popt->session=-1;
popt->sbsector=-1;
case Opt_cruft:
popt->cruft = 1;
break;
+#ifdef CONFIG_JOLIET
case Opt_utf8:
- popt->utf8 = 1;
+ kfree(popt->iocharset);
+ popt->iocharset = kstrdup("utf8", GFP_KERNEL);
+ if (!popt->iocharset)
+ return 0;
break;
-#ifdef CONFIG_JOLIET
case Opt_iocharset:
kfree(popt->iocharset);
popt->iocharset = match_strdup(&args[0]);
if (sbi->s_nocompress) seq_puts(m, ",nocompress");
if (sbi->s_overriderockperm) seq_puts(m, ",overriderockperm");
if (sbi->s_showassoc) seq_puts(m, ",showassoc");
- if (sbi->s_utf8) seq_puts(m, ",utf8");
if (sbi->s_check) seq_printf(m, ",check=%c", sbi->s_check);
if (sbi->s_mapping) seq_printf(m, ",map=%c", sbi->s_mapping);
seq_printf(m, ",fmode=%o", sbi->s_fmode);
#ifdef CONFIG_JOLIET
- if (sbi->s_nls_iocharset &&
- strcmp(sbi->s_nls_iocharset->charset, CONFIG_NLS_DEFAULT) != 0)
+ if (sbi->s_nls_iocharset)
seq_printf(m, ",iocharset=%s", sbi->s_nls_iocharset->charset);
+ else
+ seq_puts(m, ",iocharset=utf8");
#endif
return 0;
}
sbi->s_nls_iocharset = NULL;
#ifdef CONFIG_JOLIET
- if (joliet_level && opt.utf8 == 0) {
+ if (joliet_level) {
char *p = opt.iocharset ? opt.iocharset : CONFIG_NLS_DEFAULT;
- sbi->s_nls_iocharset = load_nls(p);
- if (! sbi->s_nls_iocharset) {
- /* Fail only if explicit charset specified */
- if (opt.iocharset)
+ if (strcmp(p, "utf8") != 0) {
+ sbi->s_nls_iocharset = opt.iocharset ?
+ load_nls(opt.iocharset) : load_nls_default();
+ if (!sbi->s_nls_iocharset)
goto out_freesbi;
- sbi->s_nls_iocharset = load_nls_default();
}
}
#endif
sbi->s_gid = opt.gid;
sbi->s_uid_set = opt.uid_set;
sbi->s_gid_set = opt.gid_set;
- sbi->s_utf8 = opt.utf8;
sbi->s_nocompress = opt.nocompress;
sbi->s_overriderockperm = opt.overriderockperm;
/*
unsigned char s_session;
unsigned int s_high_sierra:1;
unsigned int s_rock:2;
- unsigned int s_utf8:1;
unsigned int s_cruft:1; /* Broken disks with high byte of length
* containing junk */
unsigned int s_nocompress:1;
int
get_joliet_filename(struct iso_directory_record * de, unsigned char *outname, struct inode * inode)
{
- unsigned char utf8;
struct nls_table *nls;
unsigned char len = 0;
- utf8 = ISOFS_SB(inode->i_sb)->s_utf8;
nls = ISOFS_SB(inode->i_sb)->s_nls_iocharset;
- if (utf8) {
+ if (!nls) {
len = utf16s_to_utf8s((const wchar_t *) de->name,
de->name_len[0] >> 1, UTF16_BIG_ENDIAN,
outname, PAGE_SIZE);
return error;
}
-#ifdef CONFIG_MANDATORY_FILE_LOCKING
-/**
- * locks_mandatory_locked - Check for an active lock
- * @file: the file to check
- *
- * Searches the inode's list of locks to find any POSIX locks which conflict.
- * This function is called from locks_verify_locked() only.
- */
-int locks_mandatory_locked(struct file *file)
-{
- int ret;
- struct inode *inode = locks_inode(file);
- struct file_lock_context *ctx;
- struct file_lock *fl;
-
- ctx = smp_load_acquire(&inode->i_flctx);
- if (!ctx || list_empty_careful(&ctx->flc_posix))
- return 0;
-
- /*
- * Search the lock list for this inode for any POSIX locks.
- */
- spin_lock(&ctx->flc_lock);
- ret = 0;
- list_for_each_entry(fl, &ctx->flc_posix, fl_list) {
- if (fl->fl_owner != current->files &&
- fl->fl_owner != file) {
- ret = -EAGAIN;
- break;
- }
- }
- spin_unlock(&ctx->flc_lock);
- return ret;
-}
-
-/**
- * locks_mandatory_area - Check for a conflicting lock
- * @inode: the file to check
- * @filp: how the file was opened (if it was)
- * @start: first byte in the file to check
- * @end: lastbyte in the file to check
- * @type: %F_WRLCK for a write lock, else %F_RDLCK
- *
- * Searches the inode's list of locks to find any POSIX locks which conflict.
- */
-int locks_mandatory_area(struct inode *inode, struct file *filp, loff_t start,
- loff_t end, unsigned char type)
-{
- struct file_lock fl;
- int error;
- bool sleep = false;
-
- locks_init_lock(&fl);
- fl.fl_pid = current->tgid;
- fl.fl_file = filp;
- fl.fl_flags = FL_POSIX | FL_ACCESS;
- if (filp && !(filp->f_flags & O_NONBLOCK))
- sleep = true;
- fl.fl_type = type;
- fl.fl_start = start;
- fl.fl_end = end;
-
- for (;;) {
- if (filp) {
- fl.fl_owner = filp;
- fl.fl_flags &= ~FL_SLEEP;
- error = posix_lock_inode(inode, &fl, NULL);
- if (!error)
- break;
- }
-
- if (sleep)
- fl.fl_flags |= FL_SLEEP;
- fl.fl_owner = current->files;
- error = posix_lock_inode(inode, &fl, NULL);
- if (error != FILE_LOCK_DEFERRED)
- break;
- error = wait_event_interruptible(fl.fl_wait,
- list_empty(&fl.fl_blocked_member));
- if (!error) {
- /*
- * If we've been sleeping someone might have
- * changed the permissions behind our back.
- */
- if (__mandatory_lock(inode))
- continue;
- }
-
- break;
- }
- locks_delete_block(&fl);
-
- return error;
-}
-EXPORT_SYMBOL(locks_mandatory_area);
-#endif /* CONFIG_MANDATORY_FILE_LOCKING */
-
static void lease_clear_pending(struct file_lock *fl, int arg)
{
switch (arg) {
if (file_lock == NULL)
return -ENOLCK;
- /* Don't allow mandatory locks on files that may be memory mapped
- * and shared.
- */
- if (mandatory_lock(inode) && mapping_writably_mapped(filp->f_mapping)) {
- error = -EAGAIN;
- goto out;
- }
-
error = flock_to_posix_lock(filp, file_lock, flock);
if (error)
goto out;
struct flock64 *flock)
{
struct file_lock *file_lock = locks_alloc_lock();
- struct inode *inode = locks_inode(filp);
struct file *f;
int error;
if (file_lock == NULL)
return -ENOLCK;
- /* Don't allow mandatory locks on files that may be memory mapped
- * and shared.
- */
- if (mandatory_lock(inode) && mapping_writably_mapped(filp->f_mapping)) {
- error = -EAGAIN;
- goto out;
- }
-
error = flock64_to_posix_lock(filp, file_lock, flock);
if (error)
goto out;
seq_puts(f, "POSIX ");
seq_printf(f, " %s ",
- (inode == NULL) ? "*NOINODE*" :
- mandatory_lock(inode) ? "MANDATORY" : "ADVISORY ");
+ (inode == NULL) ? "*NOINODE*" : "ADVISORY ");
} else if (IS_FLOCK(fl)) {
if (fl->fl_type & LOCK_MAND) {
seq_puts(f, "FLOCK MSNFS ");
/*
* Refuse to truncate files with mandatory locks held on them.
*/
- error = locks_verify_locked(filp);
- if (!error)
- error = security_path_truncate(path);
+ error = security_path_truncate(path);
if (!error) {
error = do_truncate(mnt_userns, path->dentry, 0,
ATTR_MTIME|ATTR_CTIME|ATTR_OPEN,
return ns_capable(current->nsproxy->mnt_ns->user_ns, CAP_SYS_ADMIN);
}
-#ifdef CONFIG_MANDATORY_FILE_LOCKING
-static bool may_mandlock(void)
+static void warn_mandlock(void)
{
- pr_warn_once("======================================================\n"
- "WARNING: the mand mount option is being deprecated and\n"
- " will be removed in v5.15!\n"
- "======================================================\n");
- return capable(CAP_SYS_ADMIN);
+ pr_warn_once("=======================================================\n"
+ "WARNING: The mand mount option has been deprecated and\n"
+ " and is ignored by this kernel. Remove the mand\n"
+ " option from the mount to silence this warning.\n"
+ "=======================================================\n");
}
-#else
-static inline bool may_mandlock(void)
-{
- pr_warn("VFS: \"mand\" mount option not supported");
- return false;
-}
-#endif
static int can_umount(const struct path *path, int flags)
{
return ret;
if (!may_mount())
return -EPERM;
- if ((flags & SB_MANDLOCK) && !may_mandlock())
- return -EPERM;
+ if (flags & SB_MANDLOCK)
+ warn_mandlock();
/* Default to relatime unless overriden */
if (!(flags & MS_NOATIME))
if (fc->phase != FS_CONTEXT_AWAITING_MOUNT)
goto err_unlock;
- ret = -EPERM;
- if ((fc->sb_flags & SB_MANDLOCK) && !may_mandlock())
- goto err_unlock;
+ if (fc->sb_flags & SB_MANDLOCK)
+ warn_mandlock();
newmount.mnt = vfs_create_mount(fc);
if (IS_ERR(newmount.mnt)) {
nfs_inc_stats(inode, NFSIOS_VFSLOCK);
- /* No mandatory locks over NFS */
- if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
- goto out_err;
-
if (NFS_SERVER(inode)->flags & NFS_MOUNT_LOCAL_FCNTL)
is_local = 1;
NFS4_SHARE_DENY_READ);
}
-/*
- * Allow READ/WRITE during grace period on recovered state only for files
- * that are not able to provide mandatory locking.
- */
-static inline int
-grace_disallows_io(struct net *net, struct inode *inode)
-{
- return opens_in_grace(net) && mandatory_lock(inode);
-}
-
static __be32 check_stateid_generation(stateid_t *in, stateid_t *ref, bool has_session)
{
/*
stateid_t *stateid, int flags, struct nfsd_file **nfp,
struct nfs4_stid **cstid)
{
- struct inode *ino = d_inode(fhp->fh_dentry);
struct net *net = SVC_NET(rqstp);
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
struct nfs4_stid *s = NULL;
if (nfp)
*nfp = NULL;
- if (grace_disallows_io(net, ino))
- return nfserr_grace;
-
if (ZERO_STATEID(stateid) || ONE_STATEID(stateid)) {
status = check_special_stateids(net, fhp, stateid, flags);
goto done;
struct iattr *iap)
{
struct inode *inode = d_inode(fhp->fh_dentry);
- int host_err;
if (iap->ia_size < inode->i_size) {
__be32 err;
if (err)
return err;
}
-
- host_err = get_write_access(inode);
- if (host_err)
- goto out_nfserrno;
-
- host_err = locks_verify_truncate(inode, NULL, iap->ia_size);
- if (host_err)
- goto out_put_write_access;
- return 0;
-
-out_put_write_access:
- put_write_access(inode);
-out_nfserrno:
- return nfserrno(host_err);
+ return nfserrno(get_write_access(inode));
}
/*
err = nfserr_perm;
if (IS_APPEND(inode) && (may_flags & NFSD_MAY_WRITE))
goto out;
- /*
- * We must ignore files (but only files) which might have mandatory
- * locks on them because there is no way to know if the accesser has
- * the lock.
- */
- if (S_ISREG((inode)->i_mode) && mandatory_lock(inode))
- goto out;
if (!inode->i_fop)
goto out;
sb->s_time_gran = 1;
sb->s_max_links = NILFS_LINK_MAX;
- sb->s_bdi = bdi_get(sb->s_bdev->bd_bdi);
+ sb->s_bdi = bdi_get(sb->s_bdev->bd_disk->bdi);
err = load_nilfs(nilfs, sb);
if (err)
// SPDX-License-Identifier: GPL-2.0
#include <linux/fanotify.h>
#include <linux/fcntl.h>
+#include <linux/fdtable.h>
#include <linux/file.h>
#include <linux/fs.h>
#include <linux/anon_inodes.h>
struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
#define FANOTIFY_EVENT_ALIGN 4
-#define FANOTIFY_INFO_HDR_LEN \
+#define FANOTIFY_FID_INFO_HDR_LEN \
(sizeof(struct fanotify_event_info_fid) + sizeof(struct file_handle))
+#define FANOTIFY_PIDFD_INFO_HDR_LEN \
+ sizeof(struct fanotify_event_info_pidfd)
static int fanotify_fid_info_len(int fh_len, int name_len)
{
if (name_len)
info_len += name_len + 1;
- return roundup(FANOTIFY_INFO_HDR_LEN + info_len, FANOTIFY_EVENT_ALIGN);
+ return roundup(FANOTIFY_FID_INFO_HDR_LEN + info_len,
+ FANOTIFY_EVENT_ALIGN);
}
-static int fanotify_event_info_len(unsigned int fid_mode,
+static int fanotify_event_info_len(unsigned int info_mode,
struct fanotify_event *event)
{
struct fanotify_info *info = fanotify_event_info(event);
if (dir_fh_len) {
info_len += fanotify_fid_info_len(dir_fh_len, info->name_len);
- } else if ((fid_mode & FAN_REPORT_NAME) && (event->mask & FAN_ONDIR)) {
+ } else if ((info_mode & FAN_REPORT_NAME) &&
+ (event->mask & FAN_ONDIR)) {
/*
* With group flag FAN_REPORT_NAME, if name was not recorded in
* event on a directory, we will report the name ".".
dot_len = 1;
}
+ if (info_mode & FAN_REPORT_PIDFD)
+ info_len += FANOTIFY_PIDFD_INFO_HDR_LEN;
+
if (fh_len)
info_len += fanotify_fid_info_len(fh_len, dot_len);
size_t event_size = FAN_EVENT_METADATA_LEN;
struct fanotify_event *event = NULL;
struct fsnotify_event *fsn_event;
- unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+ unsigned int info_mode = FAN_GROUP_FLAG(group, FANOTIFY_INFO_MODES);
pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
goto out;
event = FANOTIFY_E(fsn_event);
- if (fid_mode)
- event_size += fanotify_event_info_len(fid_mode, event);
+ if (info_mode)
+ event_size += fanotify_event_info_len(info_mode, event);
if (event_size > count) {
event = ERR_PTR(-EINVAL);
return -ENOENT;
}
-static int copy_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
- int info_type, const char *name, size_t name_len,
- char __user *buf, size_t count)
+static int copy_fid_info_to_user(__kernel_fsid_t *fsid, struct fanotify_fh *fh,
+ int info_type, const char *name,
+ size_t name_len,
+ char __user *buf, size_t count)
{
struct fanotify_event_info_fid info = { };
struct file_handle handle = { };
return info_len;
}
+static int copy_pidfd_info_to_user(int pidfd,
+ char __user *buf,
+ size_t count)
+{
+ struct fanotify_event_info_pidfd info = { };
+ size_t info_len = FANOTIFY_PIDFD_INFO_HDR_LEN;
+
+ if (WARN_ON_ONCE(info_len > count))
+ return -EFAULT;
+
+ info.hdr.info_type = FAN_EVENT_INFO_TYPE_PIDFD;
+ info.hdr.len = info_len;
+ info.pidfd = pidfd;
+
+ if (copy_to_user(buf, &info, info_len))
+ return -EFAULT;
+
+ return info_len;
+}
+
+static int copy_info_records_to_user(struct fanotify_event *event,
+ struct fanotify_info *info,
+ unsigned int info_mode, int pidfd,
+ char __user *buf, size_t count)
+{
+ int ret, total_bytes = 0, info_type = 0;
+ unsigned int fid_mode = info_mode & FANOTIFY_FID_BITS;
+ unsigned int pidfd_mode = info_mode & FAN_REPORT_PIDFD;
+
+ /*
+ * Event info records order is as follows: dir fid + name, child fid.
+ */
+ if (fanotify_event_dir_fh_len(event)) {
+ info_type = info->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
+ FAN_EVENT_INFO_TYPE_DFID;
+ ret = copy_fid_info_to_user(fanotify_event_fsid(event),
+ fanotify_info_dir_fh(info),
+ info_type,
+ fanotify_info_name(info),
+ info->name_len, buf, count);
+ if (ret < 0)
+ return ret;
+
+ buf += ret;
+ count -= ret;
+ total_bytes += ret;
+ }
+
+ if (fanotify_event_object_fh_len(event)) {
+ const char *dot = NULL;
+ int dot_len = 0;
+
+ if (fid_mode == FAN_REPORT_FID || info_type) {
+ /*
+ * With only group flag FAN_REPORT_FID only type FID is
+ * reported. Second info record type is always FID.
+ */
+ info_type = FAN_EVENT_INFO_TYPE_FID;
+ } else if ((fid_mode & FAN_REPORT_NAME) &&
+ (event->mask & FAN_ONDIR)) {
+ /*
+ * With group flag FAN_REPORT_NAME, if name was not
+ * recorded in an event on a directory, report the name
+ * "." with info type DFID_NAME.
+ */
+ info_type = FAN_EVENT_INFO_TYPE_DFID_NAME;
+ dot = ".";
+ dot_len = 1;
+ } else if ((event->mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
+ (event->mask & FAN_ONDIR)) {
+ /*
+ * With group flag FAN_REPORT_DIR_FID, a single info
+ * record has type DFID for directory entry modification
+ * event and for event on a directory.
+ */
+ info_type = FAN_EVENT_INFO_TYPE_DFID;
+ } else {
+ /*
+ * With group flags FAN_REPORT_DIR_FID|FAN_REPORT_FID,
+ * a single info record has type FID for event on a
+ * non-directory, when there is no directory to report.
+ * For example, on FAN_DELETE_SELF event.
+ */
+ info_type = FAN_EVENT_INFO_TYPE_FID;
+ }
+
+ ret = copy_fid_info_to_user(fanotify_event_fsid(event),
+ fanotify_event_object_fh(event),
+ info_type, dot, dot_len,
+ buf, count);
+ if (ret < 0)
+ return ret;
+
+ buf += ret;
+ count -= ret;
+ total_bytes += ret;
+ }
+
+ if (pidfd_mode) {
+ ret = copy_pidfd_info_to_user(pidfd, buf, count);
+ if (ret < 0)
+ return ret;
+
+ buf += ret;
+ count -= ret;
+ total_bytes += ret;
+ }
+
+ return total_bytes;
+}
+
static ssize_t copy_event_to_user(struct fsnotify_group *group,
struct fanotify_event *event,
char __user *buf, size_t count)
struct fanotify_event_metadata metadata;
struct path *path = fanotify_event_path(event);
struct fanotify_info *info = fanotify_event_info(event);
- unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+ unsigned int info_mode = FAN_GROUP_FLAG(group, FANOTIFY_INFO_MODES);
+ unsigned int pidfd_mode = info_mode & FAN_REPORT_PIDFD;
struct file *f = NULL;
- int ret, fd = FAN_NOFD;
- int info_type = 0;
+ int ret, pidfd = FAN_NOPIDFD, fd = FAN_NOFD;
pr_debug("%s: group=%p event=%p\n", __func__, group, event);
metadata.event_len = FAN_EVENT_METADATA_LEN +
- fanotify_event_info_len(fid_mode, event);
+ fanotify_event_info_len(info_mode, event);
metadata.metadata_len = FAN_EVENT_METADATA_LEN;
metadata.vers = FANOTIFY_METADATA_VERSION;
metadata.reserved = 0;
}
metadata.fd = fd;
+ if (pidfd_mode) {
+ /*
+ * Complain if the FAN_REPORT_PIDFD and FAN_REPORT_TID mutual
+ * exclusion is ever lifted. At the time of incoporating pidfd
+ * support within fanotify, the pidfd API only supported the
+ * creation of pidfds for thread-group leaders.
+ */
+ WARN_ON_ONCE(FAN_GROUP_FLAG(group, FAN_REPORT_TID));
+
+ /*
+ * The PIDTYPE_TGID check for an event->pid is performed
+ * preemptively in an attempt to catch out cases where the event
+ * listener reads events after the event generating process has
+ * already terminated. Report FAN_NOPIDFD to the event listener
+ * in those cases, with all other pidfd creation errors being
+ * reported as FAN_EPIDFD.
+ */
+ if (metadata.pid == 0 ||
+ !pid_has_task(event->pid, PIDTYPE_TGID)) {
+ pidfd = FAN_NOPIDFD;
+ } else {
+ pidfd = pidfd_create(event->pid, 0);
+ if (pidfd < 0)
+ pidfd = FAN_EPIDFD;
+ }
+ }
+
ret = -EFAULT;
/*
* Sanity check copy size in case get_one_event() and
if (f)
fd_install(fd, f);
- /* Event info records order is: dir fid + name, child fid */
- if (fanotify_event_dir_fh_len(event)) {
- info_type = info->name_len ? FAN_EVENT_INFO_TYPE_DFID_NAME :
- FAN_EVENT_INFO_TYPE_DFID;
- ret = copy_info_to_user(fanotify_event_fsid(event),
- fanotify_info_dir_fh(info),
- info_type, fanotify_info_name(info),
- info->name_len, buf, count);
+ if (info_mode) {
+ ret = copy_info_records_to_user(event, info, info_mode, pidfd,
+ buf, count);
if (ret < 0)
goto out_close_fd;
-
- buf += ret;
- count -= ret;
- }
-
- if (fanotify_event_object_fh_len(event)) {
- const char *dot = NULL;
- int dot_len = 0;
-
- if (fid_mode == FAN_REPORT_FID || info_type) {
- /*
- * With only group flag FAN_REPORT_FID only type FID is
- * reported. Second info record type is always FID.
- */
- info_type = FAN_EVENT_INFO_TYPE_FID;
- } else if ((fid_mode & FAN_REPORT_NAME) &&
- (event->mask & FAN_ONDIR)) {
- /*
- * With group flag FAN_REPORT_NAME, if name was not
- * recorded in an event on a directory, report the
- * name "." with info type DFID_NAME.
- */
- info_type = FAN_EVENT_INFO_TYPE_DFID_NAME;
- dot = ".";
- dot_len = 1;
- } else if ((event->mask & ALL_FSNOTIFY_DIRENT_EVENTS) ||
- (event->mask & FAN_ONDIR)) {
- /*
- * With group flag FAN_REPORT_DIR_FID, a single info
- * record has type DFID for directory entry modification
- * event and for event on a directory.
- */
- info_type = FAN_EVENT_INFO_TYPE_DFID;
- } else {
- /*
- * With group flags FAN_REPORT_DIR_FID|FAN_REPORT_FID,
- * a single info record has type FID for event on a
- * non-directory, when there is no directory to report.
- * For example, on FAN_DELETE_SELF event.
- */
- info_type = FAN_EVENT_INFO_TYPE_FID;
- }
-
- ret = copy_info_to_user(fanotify_event_fsid(event),
- fanotify_event_object_fh(event),
- info_type, dot, dot_len, buf, count);
- if (ret < 0)
- goto out_close_fd;
-
- buf += ret;
- count -= ret;
}
return metadata.event_len;
put_unused_fd(fd);
fput(f);
}
+
+ if (pidfd >= 0)
+ close_fd(pidfd);
+
return ret;
}
#endif
return -EINVAL;
+ /*
+ * A pidfd can only be returned for a thread-group leader; thus
+ * FAN_REPORT_PIDFD and FAN_REPORT_TID need to remain mutually
+ * exclusive.
+ */
+ if ((flags & FAN_REPORT_PIDFD) && (flags & FAN_REPORT_TID))
+ return -EINVAL;
+
if (event_f_flags & ~FANOTIFY_INIT_ALL_EVENT_F_BITS)
return -EINVAL;
FANOTIFY_DEFAULT_MAX_USER_MARKS);
BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_GROUP_FLAGS);
- BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10);
+ BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 11);
BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
fanotify_mark_cache = KMEM_CACHE(fsnotify_mark,
if (iput_inode)
iput(iput_inode);
- /* Wait for outstanding inode references from connectors */
- wait_var_event(&sb->s_fsnotify_inode_refs,
- !atomic_long_read(&sb->s_fsnotify_inode_refs));
}
void fsnotify_sb_delete(struct super_block *sb)
{
fsnotify_unmount_inodes(sb);
fsnotify_clear_marks_by_sb(sb);
+ /* Wait for outstanding object references from connectors */
+ wait_var_event(&sb->s_fsnotify_connectors,
+ !atomic_long_read(&sb->s_fsnotify_connectors));
}
/*
return container_of(conn->obj, struct super_block, s_fsnotify_marks);
}
+static inline struct super_block *fsnotify_connector_sb(
+ struct fsnotify_mark_connector *conn)
+{
+ switch (conn->type) {
+ case FSNOTIFY_OBJ_TYPE_INODE:
+ return fsnotify_conn_inode(conn)->i_sb;
+ case FSNOTIFY_OBJ_TYPE_VFSMOUNT:
+ return fsnotify_conn_mount(conn)->mnt.mnt_sb;
+ case FSNOTIFY_OBJ_TYPE_SB:
+ return fsnotify_conn_sb(conn);
+ default:
+ return NULL;
+ }
+}
+
/* destroy all events sitting in this groups notification queue */
extern void fsnotify_flush_notify(struct fsnotify_group *group);
}
}
+static void fsnotify_get_inode_ref(struct inode *inode)
+{
+ ihold(inode);
+ atomic_long_inc(&inode->i_sb->s_fsnotify_connectors);
+}
+
+static void fsnotify_put_inode_ref(struct inode *inode)
+{
+ struct super_block *sb = inode->i_sb;
+
+ iput(inode);
+ if (atomic_long_dec_and_test(&sb->s_fsnotify_connectors))
+ wake_up_var(&sb->s_fsnotify_connectors);
+}
+
+static void fsnotify_get_sb_connectors(struct fsnotify_mark_connector *conn)
+{
+ struct super_block *sb = fsnotify_connector_sb(conn);
+
+ if (sb)
+ atomic_long_inc(&sb->s_fsnotify_connectors);
+}
+
+static void fsnotify_put_sb_connectors(struct fsnotify_mark_connector *conn)
+{
+ struct super_block *sb = fsnotify_connector_sb(conn);
+
+ if (sb && atomic_long_dec_and_test(&sb->s_fsnotify_connectors))
+ wake_up_var(&sb->s_fsnotify_connectors);
+}
+
static void *fsnotify_detach_connector_from_object(
struct fsnotify_mark_connector *conn,
unsigned int *type)
if (conn->type == FSNOTIFY_OBJ_TYPE_INODE) {
inode = fsnotify_conn_inode(conn);
inode->i_fsnotify_mask = 0;
- atomic_long_inc(&inode->i_sb->s_fsnotify_inode_refs);
} else if (conn->type == FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
fsnotify_conn_mount(conn)->mnt_fsnotify_mask = 0;
} else if (conn->type == FSNOTIFY_OBJ_TYPE_SB) {
fsnotify_conn_sb(conn)->s_fsnotify_mask = 0;
}
+ fsnotify_put_sb_connectors(conn);
rcu_assign_pointer(*(conn->obj), NULL);
conn->obj = NULL;
conn->type = FSNOTIFY_OBJ_TYPE_DETACHED;
/* Drop object reference originally held by a connector */
static void fsnotify_drop_object(unsigned int type, void *objp)
{
- struct inode *inode;
- struct super_block *sb;
-
if (!objp)
return;
/* Currently only inode references are passed to be dropped */
if (WARN_ON_ONCE(type != FSNOTIFY_OBJ_TYPE_INODE))
return;
- inode = objp;
- sb = inode->i_sb;
- iput(inode);
- if (atomic_long_dec_and_test(&sb->s_fsnotify_inode_refs))
- wake_up_var(&sb->s_fsnotify_inode_refs);
+ fsnotify_put_inode_ref(objp);
}
void fsnotify_put_mark(struct fsnotify_mark *mark)
conn->fsid.val[0] = conn->fsid.val[1] = 0;
conn->flags = 0;
}
- if (conn->type == FSNOTIFY_OBJ_TYPE_INODE)
- inode = igrab(fsnotify_conn_inode(conn));
+ if (conn->type == FSNOTIFY_OBJ_TYPE_INODE) {
+ inode = fsnotify_conn_inode(conn);
+ fsnotify_get_inode_ref(inode);
+ }
+ fsnotify_get_sb_connectors(conn);
+
/*
* cmpxchg() provides the barrier so that readers of *connp can see
* only initialized structure
if (cmpxchg(connp, NULL, conn)) {
/* Someone else created list structure for us */
if (inode)
- iput(inode);
+ fsnotify_put_inode_ref(inode);
kmem_cache_free(fsnotify_mark_connector_cachep, conn);
}
if (!(fl->fl_flags & FL_FLOCK))
return -ENOLCK;
- if (__mandatory_lock(inode))
- return -ENOLCK;
if ((osb->s_mount_opt & OCFS2_MOUNT_LOCALFLOCKS) ||
ocfs2_mount_local(osb))
if (!(fl->fl_flags & FL_POSIX))
return -ENOLCK;
- if (__mandatory_lock(inode) && fl->fl_type != F_UNLCK)
- return -ENOLCK;
return ocfs2_plock(osb->cconn, OCFS2_I(inode)->ip_blkno, file, cmd, fl);
}
if (error)
goto put_write_and_out;
- error = locks_verify_truncate(inode, NULL, length);
- if (!error)
- error = security_path_truncate(path);
+ error = security_path_truncate(path);
if (!error)
error = do_truncate(mnt_userns, path->dentry, length, 0, NULL);
if (IS_APPEND(file_inode(f.file)))
goto out_putf;
sb_start_write(inode->i_sb);
- error = locks_verify_truncate(inode, f.file, length);
- if (!error)
- error = security_path_truncate(&f.file->f_path);
+ error = security_path_truncate(&f.file->f_path);
if (!error)
error = do_truncate(file_mnt_user_ns(f.file), dentry, length,
ATTR_MTIME | ATTR_CTIME, f.file);
* _very_ unlikely case that the pipe was full, but we got
* no data.
*/
- if (unlikely(was_full)) {
+ if (unlikely(was_full))
wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
- kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
- }
+ kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
/*
* But because we didn't read anything, at this point we can
wake_next_reader = false;
__pipe_unlock(pipe);
- if (was_full) {
+ if (was_full)
wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
- kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
- }
if (wake_next_reader)
wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
+ kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
if (ret > 0)
file_accessed(filp);
return ret;
* become empty while we dropped the lock.
*/
__pipe_unlock(pipe);
- if (was_empty) {
+ if (was_empty)
wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
- kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
- }
+ kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
wait_event_interruptible_exclusive(pipe->wr_wait, pipe_writable(pipe));
__pipe_lock(pipe);
was_empty = pipe_empty(pipe->head, pipe->tail);
* Epoll nonsensically wants a wakeup whether the pipe
* was already empty or not.
*/
- if (was_empty || pipe->poll_usage) {
+ if (was_empty || pipe->poll_usage)
wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM);
- kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
- }
+ kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
if (wake_next_writer)
wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM);
if (ret > 0 && sb_start_write_trylock(file_inode(filp)->i_sb)) {
int rw_verify_area(int read_write, struct file *file, const loff_t *ppos, size_t count)
{
- struct inode *inode;
- int retval = -EINVAL;
-
- inode = file_inode(file);
if (unlikely((ssize_t) count < 0))
- return retval;
+ return -EINVAL;
/*
* ranged mandatory locking does not apply to streams - it makes sense
if (unlikely(pos < 0)) {
if (!unsigned_offsets(file))
- return retval;
+ return -EINVAL;
if (count >= -pos) /* both values are in 0..LLONG_MAX */
return -EOVERFLOW;
} else if (unlikely((loff_t) (pos + count) < 0)) {
if (!unsigned_offsets(file))
- return retval;
- }
-
- if (unlikely(inode->i_flctx && mandatory_lock(inode))) {
- retval = locks_mandatory_area(inode, file, pos, pos + count - 1,
- read_write == READ ? F_RDLCK : F_WRLCK);
- if (retval < 0)
- return retval;
+ return -EINVAL;
}
}
static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
bool write)
{
- struct inode *inode = file_inode(file);
-
if (unlikely(pos < 0 || len < 0))
return -EINVAL;
if (unlikely((loff_t) (pos + len) < 0))
return -EINVAL;
- if (unlikely(inode->i_flctx && mandatory_lock(inode))) {
- loff_t end = len ? pos + len - 1 : OFFSET_MAX;
- int retval;
-
- retval = locks_mandatory_area(inode, file, pos, end,
- write ? F_WRLCK : F_RDLCK);
- if (retval < 0)
- return retval;
- }
-
return security_file_permission(file, write ? MAY_WRITE : MAY_READ);
}
bytes_to_copy = min_t(int, bytes_to_copy,
req_length - copied_bytes);
- memcpy(actor_addr + actor_offset,
- page_address(bvec->bv_page) + bvec->bv_offset + offset,
+ memcpy(actor_addr + actor_offset, bvec_virt(bvec) + offset,
bytes_to_copy);
actor_offset += bytes_to_copy;
goto out_free_bio;
}
/* Extract the length of the metadata block */
- data = page_address(bvec->bv_page) + bvec->bv_offset;
+ data = bvec_virt(bvec);
length = data[offset];
if (offset < bvec->bv_len - 1) {
length |= data[offset + 1] << 8;
res = -EIO;
goto out_free_bio;
}
- data = page_address(bvec->bv_page) + bvec->bv_offset;
+ data = bvec_virt(bvec);
length |= data[0] << 8;
}
bio_free_pages(bio);
while (bio_next_segment(bio, &iter_all)) {
int avail = min(bytes, ((int)bvec->bv_len) - offset);
- data = page_address(bvec->bv_page) + bvec->bv_offset;
+ data = bvec_virt(bvec);
memcpy(buff, data + offset, avail);
buff += avail;
bytes -= avail;
while (bio_next_segment(bio, &iter_all)) {
int avail = min(bytes, ((int)bvec->bv_len) - offset);
- data = page_address(bvec->bv_page) + bvec->bv_offset;
+ data = bvec_virt(bvec);
memcpy(buff, data + offset, avail);
buff += avail;
bytes -= avail;
}
avail = min(length, ((int)bvec->bv_len) - offset);
- data = page_address(bvec->bv_page) + bvec->bv_offset;
+ data = bvec_virt(bvec);
length -= avail;
stream->buf.in = data + offset;
stream->buf.in_size = avail;
}
avail = min(length, ((int)bvec->bv_len) - offset);
- data = page_address(bvec->bv_page) + bvec->bv_offset;
+ data = bvec_virt(bvec);
length -= avail;
stream->next_in = data + offset;
stream->avail_in = avail;
}
avail = min(length, ((int)bvec->bv_len) - offset);
- data = page_address(bvec->bv_page) + bvec->bv_offset;
+ data = bvec_virt(bvec);
length -= avail;
in_buf.src = data + offset;
in_buf.size = avail;
{
s->s_bdev = data;
s->s_dev = s->s_bdev->bd_dev;
- s->s_bdi = bdi_get(s->s_bdev->bd_bdi);
+ s->s_bdi = bdi_get(s->s_bdev->bd_disk->bdi);
if (blk_queue_stable_writes(s->s_bdev->bd_disk->queue))
s->s_iflags |= SB_I_STABLE_WRITES;
rcu_read_unlock();
}
+static void timerfd_resume_work(struct work_struct *work)
+{
+ timerfd_clock_was_set();
+}
+
+static DECLARE_WORK(timerfd_work, timerfd_resume_work);
+
+/*
+ * Invoked from timekeeping_resume(). Defer the actual update to work so
+ * timerfd_clock_was_set() runs in task context.
+ */
+void timerfd_resume(void)
+{
+ schedule_work(&timerfd_work);
+}
+
static void __timerfd_remove_cancel(struct timerfd_ctx *ctx)
{
if (ctx->might_cancel) {
#include "udf_i.h"
#include "udf_sb.h"
-
static int udf_readdir(struct file *file, struct dir_context *ctx)
{
struct inode *dir = file_inode(file);
lfi = cfi.lengthFileIdent;
if (fibh.sbh == fibh.ebh) {
- nameptr = fi->fileIdent + liu;
+ nameptr = udf_get_fi_ident(fi);
} else {
int poffset; /* Unpaded ending offset */
}
}
nameptr = copy_name;
- memcpy(nameptr, fi->fileIdent + liu,
+ memcpy(nameptr, udf_get_fi_ident(fi),
lfi - poffset);
memcpy(nameptr + lfi - poffset,
fibh.ebh->b_data, poffset);
struct regid impIdent;
uint8_t impUse[128];
struct extent_ad integritySeqExt;
- uint8_t partitionMaps[0];
+ uint8_t partitionMaps[];
} __packed;
/* Generic Partition Map (ECMA 167r3 3/10.7.1) */
struct genericPartitionMap {
uint8_t partitionMapType;
uint8_t partitionMapLength;
- uint8_t partitionMapping[0];
+ uint8_t partitionMapping[];
} __packed;
/* Partition Map Type (ECMA 167r3 3/10.7.1.1) */
struct tag descTag;
__le32 volDescSeqNum;
__le32 numAllocDescs;
- struct extent_ad allocDescs[0];
+ struct extent_ad allocDescs[];
} __packed;
/* Terminating Descriptor (ECMA 167r3 3/10.9) */
uint8_t logicalVolContentsUse[32];
__le32 numOfPartitions;
__le32 lengthOfImpUse;
- __le32 freeSpaceTable[0];
- __le32 sizeTable[0];
- uint8_t impUse[0];
+ __le32 freeSpaceTable[];
+ /* __le32 sizeTable[]; */
+ /* uint8_t impUse[]; */
} __packed;
/* Integrity Type (ECMA 167r3 3/10.10.3) */
uint8_t lengthFileIdent;
struct long_ad icb;
__le16 lengthOfImpUse;
- uint8_t impUse[0];
- uint8_t fileIdent[0];
- uint8_t padding[0];
+ uint8_t impUse[];
+ /* uint8_t fileIdent[]; */
+ /* uint8_t padding[]; */
} __packed;
/* File Characteristics (ECMA 167r3 4/14.4.3) */
__le64 uniqueID;
__le32 lengthExtendedAttr;
__le32 lengthAllocDescs;
- uint8_t extendedAttr[0];
- uint8_t allocDescs[0];
+ uint8_t extendedAttr[];
+ /* uint8_t allocDescs[]; */
} __packed;
/* Permissions (ECMA 167r3 4/14.9.5) */
uint8_t attrSubtype;
uint8_t reserved[3];
__le32 attrLength;
- uint8_t attrData[0];
+ uint8_t attrData[];
} __packed;
/* Character Set Information (ECMA 167r3 4/14.10.3) */
__le32 attrLength;
__le32 escapeSeqLength;
uint8_t charSetType;
- uint8_t escapeSeq[0];
+ uint8_t escapeSeq[];
} __packed;
/* Alternate Permissions (ECMA 167r3 4/14.10.4) */
__le32 attrLength;
__le32 dataLength;
__le32 infoTimeExistence;
- uint8_t infoTimes[0];
+ uint8_t infoTimes[];
} __packed;
/* Device Specification (ECMA 167r3 4/14.10.7) */
__le32 impUseLength;
__le32 majorDeviceIdent;
__le32 minorDeviceIdent;
- uint8_t impUse[0];
+ uint8_t impUse[];
} __packed;
/* Implementation Use Extended Attr (ECMA 167r3 4/14.10.8) */
__le32 attrLength;
__le32 impUseLength;
struct regid impIdent;
- uint8_t impUse[0];
+ uint8_t impUse[];
} __packed;
/* Application Use Extended Attribute (ECMA 167r3 4/14.10.9) */
__le32 attrLength;
__le32 appUseLength;
struct regid appIdent;
- uint8_t appUse[0];
+ uint8_t appUse[];
} __packed;
#define EXTATTR_CHAR_SET 1
struct tag descTag;
struct icbtag icbTag;
__le32 lengthAllocDescs;
- uint8_t allocDescs[0];
+ uint8_t allocDescs[];
} __packed;
/* Space Bitmap Descriptor (ECMA 167r3 4/14.12) */
struct tag descTag;
__le32 numOfBits;
__le32 numOfBytes;
- uint8_t bitmap[0];
+ uint8_t bitmap[];
} __packed;
/* Partition Integrity Entry (ECMA 167r3 4/14.13) */
uint8_t componentType;
uint8_t lengthComponentIdent;
__le16 componentFileVersionNum;
- dchars componentIdent[0];
+ dchars componentIdent[];
} __packed;
/* File Entry (ECMA 167r3 4/14.17) */
__le64 uniqueID;
__le32 lengthExtendedAttr;
__le32 lengthAllocDescs;
- uint8_t extendedAttr[0];
- uint8_t allocDescs[0];
+ uint8_t extendedAttr[];
+ /* uint8_t allocDescs[]; */
} __packed;
#endif /* _ECMA_167_H */
dfibh.eoffset += (sfibh.eoffset - sfibh.soffset);
dfi = (struct fileIdentDesc *)(dbh->b_data + dfibh.soffset);
if (udf_write_fi(inode, sfi, dfi, &dfibh, sfi->impUse,
- sfi->fileIdent +
- le16_to_cpu(sfi->lengthOfImpUse))) {
+ udf_get_fi_ident(sfi))) {
iinfo->i_alloc_type = ICBTAG_FLAG_AD_IN_ICB;
brelse(dbh);
return NULL;
else
offset = le32_to_cpu(eahd->appAttrLocation);
- while (offset < iinfo->i_lenEAttr) {
+ while (offset + sizeof(*gaf) < iinfo->i_lenEAttr) {
+ uint32_t attrLength;
+
gaf = (struct genericFormat *)&ea[offset];
+ attrLength = le32_to_cpu(gaf->attrLength);
+
+ /* Detect undersized elements and buffer overflows */
+ if ((attrLength < sizeof(*gaf)) ||
+ (attrLength > (iinfo->i_lenEAttr - offset)))
+ break;
+
if (le32_to_cpu(gaf->attrType) == type &&
gaf->attrSubtype == subtype)
return gaf;
else
- offset += le32_to_cpu(gaf->attrLength);
+ offset += attrLength;
}
}
if (fileident) {
if (adinicb || (offset + lfi < 0)) {
- memcpy((uint8_t *)sfi->fileIdent + liu, fileident, lfi);
+ memcpy(udf_get_fi_ident(sfi), fileident, lfi);
} else if (offset >= 0) {
memcpy(fibh->ebh->b_data + offset, fileident, lfi);
} else {
- memcpy((uint8_t *)sfi->fileIdent + liu, fileident,
- -offset);
+ memcpy(udf_get_fi_ident(sfi), fileident, -offset);
memcpy(fibh->ebh->b_data, fileident - offset,
lfi + offset);
}
offset += lfi;
if (adinicb || (offset + padlen < 0)) {
- memset((uint8_t *)sfi->padding + liu + lfi, 0x00, padlen);
+ memset(udf_get_fi_ident(sfi) + lfi, 0x00, padlen);
} else if (offset >= 0) {
memset(fibh->ebh->b_data + offset, 0x00, padlen);
} else {
- memset((uint8_t *)sfi->padding + liu + lfi, 0x00, -offset);
+ memset(udf_get_fi_ident(sfi) + lfi, 0x00, -offset);
memset(fibh->ebh->b_data, 0x00, padlen + offset);
}
lfi = cfi->lengthFileIdent;
if (fibh->sbh == fibh->ebh) {
- nameptr = fi->fileIdent + liu;
+ nameptr = udf_get_fi_ident(fi);
} else {
int poffset; /* Unpaded ending offset */
}
}
nameptr = copy_name;
- memcpy(nameptr, fi->fileIdent + liu,
+ memcpy(nameptr, udf_get_fi_ident(fi),
lfi - poffset);
memcpy(nameptr + lfi - poffset,
fibh->ebh->b_data, poffset);
__le16 minUDFReadRev;
__le16 minUDFWriteRev;
__le16 maxUDFWriteRev;
- uint8_t impUse[0];
+ uint8_t impUse[];
} __packed;
/* Implementation Use Volume Descriptor (UDF 2.60 2.2.7) */
uint8_t reserved2[5];
} __packed;
-/* Virtual Allocation Table (UDF 1.5 2.2.10) */
-struct virtualAllocationTable15 {
- __le32 vatEntry[0];
- struct regid vatIdent;
- __le32 previousVATICBLoc;
-} __packed;
-
-#define ICBTAG_FILE_TYPE_VAT15 0x00U
-
/* Virtual Allocation Table (UDF 2.60 2.2.11) */
struct virtualAllocationTable20 {
__le16 lengthHeader;
__le16 minUDFWriteRev;
__le16 maxUDFWriteRev;
__le16 reserved;
- uint8_t impUse[0];
- __le32 vatEntry[0];
+ uint8_t impUse[];
+ /* __le32 vatEntry[]; */
} __packed;
#define ICBTAG_FILE_TYPE_VAT20 0xF8U
__le16 reallocationTableLen;
__le16 reserved;
__le32 sequenceNum;
- struct sparingEntry
- mapEntry[0];
+ struct sparingEntry mapEntry[];
} __packed;
/* Metadata File (and Metadata Mirror File) (UDF 2.60 2.2.13.1) */
/* FreeEASpace (UDF 2.60 3.3.4.5.1.1) */
struct freeEaSpace {
__le16 headerChecksum;
- uint8_t freeEASpace[0];
+ uint8_t freeEASpace[];
} __packed;
/* DVD Copyright Management Information (UDF 2.60 3.3.4.5.1.2) */
/* FreeAppEASpace (UDF 2.60 3.3.4.6.1) */
struct freeAppEASpace {
__le16 headerChecksum;
- uint8_t freeEASpace[0];
+ uint8_t freeEASpace[];
} __packed;
/* UDF Defined System Stream (UDF 2.60 3.3.7) */
return NULL;
lvid = (struct logicalVolIntegrityDesc *)UDF_SB(sb)->s_lvid_bh->b_data;
partnum = le32_to_cpu(lvid->numOfPartitions);
- if ((sb->s_blocksize - sizeof(struct logicalVolIntegrityDescImpUse) -
- offsetof(struct logicalVolIntegrityDesc, impUse)) /
- (2 * sizeof(uint32_t)) < partnum) {
- udf_err(sb, "Logical volume integrity descriptor corrupted "
- "(numOfPartitions = %u)!\n", partnum);
- return NULL;
- }
/* The offset is to skip freeSpaceTable and sizeTable arrays */
offset = partnum * 2 * sizeof(uint32_t);
- return (struct logicalVolIntegrityDescImpUse *)&(lvid->impUse[offset]);
+ return (struct logicalVolIntegrityDescImpUse *)
+ (((uint8_t *)(lvid + 1)) + offset);
}
/* UDF filesystem type */
seq_printf(seq, ",lastblock=%u", sbi->s_last_block);
if (sbi->s_anchor != 0)
seq_printf(seq, ",anchor=%u", sbi->s_anchor);
- if (UDF_QUERY_FLAG(sb, UDF_FLAG_UTF8))
- seq_puts(seq, ",utf8");
- if (UDF_QUERY_FLAG(sb, UDF_FLAG_NLS_MAP) && sbi->s_nls_map)
+ if (sbi->s_nls_map)
seq_printf(seq, ",iocharset=%s", sbi->s_nls_map->charset);
+ else
+ seq_puts(seq, ",iocharset=utf8");
return 0;
}
/* Ignored (never implemented properly) */
break;
case Opt_utf8:
- uopt->flags |= (1 << UDF_FLAG_UTF8);
+ if (!remount) {
+ unload_nls(uopt->nls_map);
+ uopt->nls_map = NULL;
+ }
break;
case Opt_iocharset:
if (!remount) {
- if (uopt->nls_map)
- unload_nls(uopt->nls_map);
- /*
- * load_nls() failure is handled later in
- * udf_fill_super() after all options are
- * parsed.
- */
+ unload_nls(uopt->nls_map);
+ uopt->nls_map = NULL;
+ }
+ /* When nls_map is not loaded then UTF-8 is used */
+ if (!remount && strcmp(args[0].from, "utf8") != 0) {
uopt->nls_map = load_nls(args[0].from);
- uopt->flags |= (1 << UDF_FLAG_NLS_MAP);
+ if (!uopt->nls_map) {
+ pr_err("iocharset %s not found\n",
+ args[0].from);
+ return 0;
+ }
}
break;
case Opt_uforget:
struct udf_sb_info *sbi = UDF_SB(sb);
struct logicalVolIntegrityDesc *lvid;
int indirections = 0;
+ u32 parts, impuselen;
while (++indirections <= UDF_MAX_LVID_NESTING) {
final_bh = NULL;
lvid = (struct logicalVolIntegrityDesc *)final_bh->b_data;
if (lvid->nextIntegrityExt.extLength == 0)
- return;
+ goto check;
loc = leea_to_cpu(lvid->nextIntegrityExt);
}
udf_warn(sb, "Too many LVID indirections (max %u), ignoring.\n",
UDF_MAX_LVID_NESTING);
+out_err:
brelse(sbi->s_lvid_bh);
sbi->s_lvid_bh = NULL;
+ return;
+check:
+ parts = le32_to_cpu(lvid->numOfPartitions);
+ impuselen = le32_to_cpu(lvid->lengthOfImpUse);
+ if (parts >= sb->s_blocksize || impuselen >= sb->s_blocksize ||
+ sizeof(struct logicalVolIntegrityDesc) + impuselen +
+ 2 * parts * sizeof(u32) > sb->s_blocksize) {
+ udf_warn(sb, "Corrupted LVID (parts=%u, impuselen=%u), "
+ "ignoring.\n", parts, impuselen);
+ goto out_err;
+ }
}
/*
if (!udf_parse_options((char *)options, &uopt, false))
goto parse_options_failure;
- if (uopt.flags & (1 << UDF_FLAG_UTF8) &&
- uopt.flags & (1 << UDF_FLAG_NLS_MAP)) {
- udf_err(sb, "utf8 cannot be combined with iocharset\n");
- goto parse_options_failure;
- }
- if ((uopt.flags & (1 << UDF_FLAG_NLS_MAP)) && !uopt.nls_map) {
- uopt.nls_map = load_nls_default();
- if (!uopt.nls_map)
- uopt.flags &= ~(1 << UDF_FLAG_NLS_MAP);
- else
- udf_debug("Using default NLS map\n");
- }
- if (!(uopt.flags & (1 << UDF_FLAG_NLS_MAP)))
- uopt.flags |= (1 << UDF_FLAG_UTF8);
-
fileset.logicalBlockNum = 0xFFFFFFFF;
fileset.partitionReferenceNum = 0xFFFF;
error_out:
iput(sbi->s_vat_inode);
parse_options_failure:
- if (uopt.nls_map)
- unload_nls(uopt.nls_map);
+ unload_nls(uopt.nls_map);
if (lvid_open)
udf_close_lvid(sb);
brelse(sbi->s_lvid_bh);
sbi = UDF_SB(sb);
iput(sbi->s_vat_inode);
- if (UDF_QUERY_FLAG(sb, UDF_FLAG_NLS_MAP))
- unload_nls(sbi->s_nls_map);
+ unload_nls(sbi->s_nls_map);
if (!sb_rdonly(sb))
udf_close_lvid(sb);
brelse(sbi->s_lvid_bh);
#define UDF_FLAG_UNDELETE 6
#define UDF_FLAG_UNHIDE 7
#define UDF_FLAG_VARCONV 8
-#define UDF_FLAG_NLS_MAP 9
-#define UDF_FLAG_UTF8 10
#define UDF_FLAG_UID_FORGET 11 /* save -1 for uid to disk */
#define UDF_FLAG_GID_FORGET 12
#define UDF_FLAG_UID_SET 13
le16_to_cpu(cfi->lengthOfImpUse) + cfi->lengthFileIdent,
UDF_NAME_PAD);
}
+static inline uint8_t *udf_get_fi_ident(struct fileIdentDesc *fi)
+{
+ return ((uint8_t *)(fi + 1)) + le16_to_cpu(fi->lengthOfImpUse);
+}
/* file.c */
extern long udf_ioctl(struct file *, unsigned int, unsigned long);
return 0;
}
- if (UDF_QUERY_FLAG(sb, UDF_FLAG_NLS_MAP))
+ if (UDF_SB(sb)->s_nls_map)
conv_f = UDF_SB(sb)->s_nls_map->uni2char;
else
conv_f = NULL;
if (ocu_max_len <= 0)
return 0;
- if (UDF_QUERY_FLAG(sb, UDF_FLAG_NLS_MAP))
+ if (UDF_SB(sb)->s_nls_map)
conv_f = UDF_SB(sb)->s_nls_map->char2uni;
else
conv_f = NULL;
struct xfs_bstat *sbp = &sxp->sx_stat;
int src_log_flags, target_log_flags;
int error = 0;
- int lock_flags;
uint64_t f;
int resblks = 0;
unsigned int flags = 0;
* do the rest of the checks.
*/
lock_two_nondirectories(VFS_I(ip), VFS_I(tip));
- lock_flags = XFS_MMAPLOCK_EXCL;
- xfs_lock_two_inodes(ip, XFS_MMAPLOCK_EXCL, tip, XFS_MMAPLOCK_EXCL);
+ filemap_invalidate_lock_two(VFS_I(ip)->i_mapping,
+ VFS_I(tip)->i_mapping);
/* Verify that both files have the same format */
if ((VFS_I(ip)->i_mode & S_IFMT) != (VFS_I(tip)->i_mode & S_IFMT)) {
* or cancel will unlock the inodes from this point onwards.
*/
xfs_lock_two_inodes(ip, XFS_ILOCK_EXCL, tip, XFS_ILOCK_EXCL);
- lock_flags |= XFS_ILOCK_EXCL;
xfs_trans_ijoin(tp, ip, 0);
xfs_trans_ijoin(tp, tip, 0);
trace_xfs_swap_extent_after(ip, 0);
trace_xfs_swap_extent_after(tip, 1);
+out_unlock_ilock:
+ xfs_iunlock(ip, XFS_ILOCK_EXCL);
+ xfs_iunlock(tip, XFS_ILOCK_EXCL);
out_unlock:
- xfs_iunlock(ip, lock_flags);
- xfs_iunlock(tip, lock_flags);
+ filemap_invalidate_unlock_two(VFS_I(ip)->i_mapping,
+ VFS_I(tip)->i_mapping);
unlock_two_nondirectories(VFS_I(ip), VFS_I(tip));
return error;
out_trans_cancel:
xfs_trans_cancel(tp);
- goto out_unlock;
+ goto out_unlock_ilock;
}
{
struct xfs_buf *bp;
- if (bdi_read_congested(target->bt_bdev->bd_bdi))
+ if (bdi_read_congested(target->bt_bdev->bd_disk->bdi))
return;
xfs_buf_read_map(target, map, nmaps,
*
* mmap_lock (MM)
* sb_start_pagefault(vfs, freeze)
- * i_mmaplock (XFS - truncate serialisation)
+ * invalidate_lock (vfs/XFS_MMAPLOCK - truncate serialisation)
* page_lock (MM)
* i_lock (XFS - extent map serialisation)
*/
file_update_time(vmf->vma->vm_file);
}
- xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
if (IS_DAX(inode)) {
pfn_t pfn;
+ xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
ret = dax_iomap_fault(vmf, pe_size, &pfn, NULL,
(write_fault && !vmf->cow_page) ?
&xfs_direct_write_iomap_ops :
&xfs_read_iomap_ops);
if (ret & VM_FAULT_NEEDDSYNC)
ret = dax_finish_sync_fault(vmf, pe_size, pfn);
+ xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
} else {
- if (write_fault)
+ if (write_fault) {
+ xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
ret = iomap_page_mkwrite(vmf,
&xfs_buffered_write_iomap_ops);
- else
+ xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
+ } else {
ret = filemap_fault(vmf);
+ }
}
- xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
if (write_fault)
sb_end_pagefault(inode->i_sb);
/*
* In addition to i_rwsem in the VFS inode, the xfs inode contains 2
- * multi-reader locks: i_mmap_lock and the i_lock. This routine allows
+ * multi-reader locks: invalidate_lock and the i_lock. This routine allows
* various combinations of the locks to be obtained.
*
* The 3 locks should always be ordered so that the IO lock is obtained first,
*
* Basic locking order:
*
- * i_rwsem -> i_mmap_lock -> page_lock -> i_ilock
+ * i_rwsem -> invalidate_lock -> page_lock -> i_ilock
*
* mmap_lock locking order:
*
* i_rwsem -> page lock -> mmap_lock
- * mmap_lock -> i_mmap_lock -> page_lock
+ * mmap_lock -> invalidate_lock -> page_lock
*
* The difference in mmap_lock locking order mean that we cannot hold the
- * i_mmap_lock over syscall based read(2)/write(2) based IO. These IO paths can
- * fault in pages during copy in/out (for buffered IO) or require the mmap_lock
- * in get_user_pages() to map the user pages into the kernel address space for
- * direct IO. Similarly the i_rwsem cannot be taken inside a page fault because
- * page faults already hold the mmap_lock.
+ * invalidate_lock over syscall based read(2)/write(2) based IO. These IO paths
+ * can fault in pages during copy in/out (for buffered IO) or require the
+ * mmap_lock in get_user_pages() to map the user pages into the kernel address
+ * space for direct IO. Similarly the i_rwsem cannot be taken inside a page
+ * fault because page faults already hold the mmap_lock.
*
* Hence to serialise fully against both syscall and mmap based IO, we need to
- * take both the i_rwsem and the i_mmap_lock. These locks should *only* be both
- * taken in places where we need to invalidate the page cache in a race
+ * take both the i_rwsem and the invalidate_lock. These locks should *only* be
+ * both taken in places where we need to invalidate the page cache in a race
* free manner (e.g. truncate, hole punch and other extent manipulation
* functions).
*/
XFS_IOLOCK_DEP(lock_flags));
}
- if (lock_flags & XFS_MMAPLOCK_EXCL)
- mrupdate_nested(&ip->i_mmaplock, XFS_MMAPLOCK_DEP(lock_flags));
- else if (lock_flags & XFS_MMAPLOCK_SHARED)
- mraccess_nested(&ip->i_mmaplock, XFS_MMAPLOCK_DEP(lock_flags));
+ if (lock_flags & XFS_MMAPLOCK_EXCL) {
+ down_write_nested(&VFS_I(ip)->i_mapping->invalidate_lock,
+ XFS_MMAPLOCK_DEP(lock_flags));
+ } else if (lock_flags & XFS_MMAPLOCK_SHARED) {
+ down_read_nested(&VFS_I(ip)->i_mapping->invalidate_lock,
+ XFS_MMAPLOCK_DEP(lock_flags));
+ }
if (lock_flags & XFS_ILOCK_EXCL)
mrupdate_nested(&ip->i_lock, XFS_ILOCK_DEP(lock_flags));
}
if (lock_flags & XFS_MMAPLOCK_EXCL) {
- if (!mrtryupdate(&ip->i_mmaplock))
+ if (!down_write_trylock(&VFS_I(ip)->i_mapping->invalidate_lock))
goto out_undo_iolock;
} else if (lock_flags & XFS_MMAPLOCK_SHARED) {
- if (!mrtryaccess(&ip->i_mmaplock))
+ if (!down_read_trylock(&VFS_I(ip)->i_mapping->invalidate_lock))
goto out_undo_iolock;
}
out_undo_mmaplock:
if (lock_flags & XFS_MMAPLOCK_EXCL)
- mrunlock_excl(&ip->i_mmaplock);
+ up_write(&VFS_I(ip)->i_mapping->invalidate_lock);
else if (lock_flags & XFS_MMAPLOCK_SHARED)
- mrunlock_shared(&ip->i_mmaplock);
+ up_read(&VFS_I(ip)->i_mapping->invalidate_lock);
out_undo_iolock:
if (lock_flags & XFS_IOLOCK_EXCL)
up_write(&VFS_I(ip)->i_rwsem);
up_read(&VFS_I(ip)->i_rwsem);
if (lock_flags & XFS_MMAPLOCK_EXCL)
- mrunlock_excl(&ip->i_mmaplock);
+ up_write(&VFS_I(ip)->i_mapping->invalidate_lock);
else if (lock_flags & XFS_MMAPLOCK_SHARED)
- mrunlock_shared(&ip->i_mmaplock);
+ up_read(&VFS_I(ip)->i_mapping->invalidate_lock);
if (lock_flags & XFS_ILOCK_EXCL)
mrunlock_excl(&ip->i_lock);
if (lock_flags & XFS_ILOCK_EXCL)
mrdemote(&ip->i_lock);
if (lock_flags & XFS_MMAPLOCK_EXCL)
- mrdemote(&ip->i_mmaplock);
+ downgrade_write(&VFS_I(ip)->i_mapping->invalidate_lock);
if (lock_flags & XFS_IOLOCK_EXCL)
downgrade_write(&VFS_I(ip)->i_rwsem);
}
#if defined(DEBUG) || defined(XFS_WARN)
-int
+static inline bool
+__xfs_rwsem_islocked(
+ struct rw_semaphore *rwsem,
+ bool shared)
+{
+ if (!debug_locks)
+ return rwsem_is_locked(rwsem);
+
+ if (!shared)
+ return lockdep_is_held_type(rwsem, 0);
+
+ /*
+ * We are checking that the lock is held at least in shared
+ * mode but don't care that it might be held exclusively
+ * (i.e. shared | excl). Hence we check if the lock is held
+ * in any mode rather than an explicit shared mode.
+ */
+ return lockdep_is_held_type(rwsem, -1);
+}
+
+bool
xfs_isilocked(
- xfs_inode_t *ip,
+ struct xfs_inode *ip,
uint lock_flags)
{
if (lock_flags & (XFS_ILOCK_EXCL|XFS_ILOCK_SHARED)) {
}
if (lock_flags & (XFS_MMAPLOCK_EXCL|XFS_MMAPLOCK_SHARED)) {
- if (!(lock_flags & XFS_MMAPLOCK_SHARED))
- return !!ip->i_mmaplock.mr_writer;
- return rwsem_is_locked(&ip->i_mmaplock.mr_lock);
+ return __xfs_rwsem_islocked(&VFS_I(ip)->i_rwsem,
+ (lock_flags & XFS_IOLOCK_SHARED));
}
- if (lock_flags & (XFS_IOLOCK_EXCL|XFS_IOLOCK_SHARED)) {
- if (!(lock_flags & XFS_IOLOCK_SHARED))
- return !debug_locks ||
- lockdep_is_held_type(&VFS_I(ip)->i_rwsem, 0);
- return rwsem_is_locked(&VFS_I(ip)->i_rwsem);
+ if (lock_flags & (XFS_IOLOCK_EXCL | XFS_IOLOCK_SHARED)) {
+ return __xfs_rwsem_islocked(&VFS_I(ip)->i_rwsem,
+ (lock_flags & XFS_IOLOCK_SHARED));
}
ASSERT(0);
- return 0;
+ return false;
}
#endif
}
/*
- * xfs_lock_two_inodes() can only be used to lock one type of lock at a time -
- * the mmaplock or the ilock, but not more than one type at a time. If we lock
- * more than one at a time, lockdep will report false positives saying we have
- * violated locking orders. The iolock must be double-locked separately since
- * we use i_rwsem for that. We now support taking one lock EXCL and the other
- * SHARED.
+ * xfs_lock_two_inodes() can only be used to lock ilock. The iolock and
+ * mmaplock must be double-locked separately since we use i_rwsem and
+ * invalidate_lock for that. We now support taking one lock EXCL and the
+ * other SHARED.
*/
void
xfs_lock_two_inodes(
ASSERT(hweight32(ip1_mode) == 1);
ASSERT(!(ip0_mode & (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)));
ASSERT(!(ip1_mode & (XFS_IOLOCK_SHARED|XFS_IOLOCK_EXCL)));
- ASSERT(!(ip0_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)) ||
- !(ip0_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)));
- ASSERT(!(ip1_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)) ||
- !(ip1_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)));
- ASSERT(!(ip1_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)) ||
- !(ip0_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)));
- ASSERT(!(ip0_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)) ||
- !(ip1_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)));
-
+ ASSERT(!(ip0_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)));
+ ASSERT(!(ip1_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)));
ASSERT(ip0->i_ino != ip1->i_ino);
if (ip0->i_ino > ip1->i_ino) {
ret = xfs_iolock_two_inodes_and_break_layout(VFS_I(ip1), VFS_I(ip2));
if (ret)
return ret;
- if (ip1 == ip2)
- xfs_ilock(ip1, XFS_MMAPLOCK_EXCL);
- else
- xfs_lock_two_inodes(ip1, XFS_MMAPLOCK_EXCL,
- ip2, XFS_MMAPLOCK_EXCL);
+ filemap_invalidate_lock_two(VFS_I(ip1)->i_mapping,
+ VFS_I(ip2)->i_mapping);
return 0;
}
struct xfs_inode *ip1,
struct xfs_inode *ip2)
{
- bool same_inode = (ip1 == ip2);
-
- xfs_iunlock(ip2, XFS_MMAPLOCK_EXCL);
- if (!same_inode)
- xfs_iunlock(ip1, XFS_MMAPLOCK_EXCL);
+ filemap_invalidate_unlock_two(VFS_I(ip1)->i_mapping,
+ VFS_I(ip2)->i_mapping);
inode_unlock(VFS_I(ip2));
- if (!same_inode)
+ if (ip1 != ip2)
inode_unlock(VFS_I(ip1));
}
/* Transaction and locking information. */
struct xfs_inode_log_item *i_itemp; /* logging information */
mrlock_t i_lock; /* inode lock */
- mrlock_t i_mmaplock; /* inode mmap IO lock */
atomic_t i_pincount; /* inode pin count */
/*
int xfs_ilock_nowait(xfs_inode_t *, uint);
void xfs_iunlock(xfs_inode_t *, uint);
void xfs_ilock_demote(xfs_inode_t *, uint);
-int xfs_isilocked(xfs_inode_t *, uint);
+bool xfs_isilocked(struct xfs_inode *, uint);
uint xfs_ilock_data_map_shared(struct xfs_inode *);
uint xfs_ilock_attr_map_shared(struct xfs_inode *);
atomic_set(&ip->i_pincount, 0);
spin_lock_init(&ip->i_flags_lock);
- mrlock_init(&ip->i_mmaplock, MRLOCK_ALLOW_EQUAL_PRI|MRLOCK_BARRIER,
- "xfsino", ip->i_ino);
mrlock_init(&ip->i_lock, MRLOCK_ALLOW_EQUAL_PRI|MRLOCK_BARRIER,
"xfsino", ip->i_ino);
}
inode_dio_wait(inode);
/* Serialize against page faults */
- down_write(&zi->i_mmap_sem);
+ filemap_invalidate_lock(inode->i_mapping);
/* Serialize against zonefs_iomap_begin() */
mutex_lock(&zi->i_truncate_mutex);
unlock:
mutex_unlock(&zi->i_truncate_mutex);
- up_write(&zi->i_mmap_sem);
+ filemap_invalidate_unlock(inode->i_mapping);
return ret;
}
return ret;
}
-static vm_fault_t zonefs_filemap_fault(struct vm_fault *vmf)
-{
- struct zonefs_inode_info *zi = ZONEFS_I(file_inode(vmf->vma->vm_file));
- vm_fault_t ret;
-
- down_read(&zi->i_mmap_sem);
- ret = filemap_fault(vmf);
- up_read(&zi->i_mmap_sem);
-
- return ret;
-}
-
static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf)
{
struct inode *inode = file_inode(vmf->vma->vm_file);
file_update_time(vmf->vma->vm_file);
/* Serialize against truncates */
- down_read(&zi->i_mmap_sem);
+ filemap_invalidate_lock_shared(inode->i_mapping);
ret = iomap_page_mkwrite(vmf, &zonefs_iomap_ops);
- up_read(&zi->i_mmap_sem);
+ filemap_invalidate_unlock_shared(inode->i_mapping);
sb_end_pagefault(inode->i_sb);
return ret;
}
static const struct vm_operations_struct zonefs_file_vm_ops = {
- .fault = zonefs_filemap_fault,
+ .fault = filemap_fault,
.map_pages = filemap_map_pages,
.page_mkwrite = zonefs_filemap_page_mkwrite,
};
inode_init_once(&zi->i_vnode);
mutex_init(&zi->i_truncate_mutex);
- init_rwsem(&zi->i_mmap_sem);
zi->i_wr_refcnt = 0;
return &zi->i_vnode;
* and changes to the inode private data, and in particular changes to
* a sequential file size on completion of direct IO writes.
* Serialization of mmap read IOs with truncate and syscall IO
- * operations is done with i_mmap_sem in addition to i_truncate_mutex.
- * Only zonefs_seq_file_truncate() takes both lock (i_mmap_sem first,
- * i_truncate_mutex second).
+ * operations is done with invalidate_lock in addition to
+ * i_truncate_mutex. Only zonefs_seq_file_truncate() takes both lock
+ * (invalidate_lock first, i_truncate_mutex second).
*/
struct mutex i_truncate_mutex;
- struct rw_semaphore i_mmap_sem;
/* guarded by i_truncate_mutex */
unsigned int i_wr_refcnt;
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-
-// Generated by scripts/atomic/gen-atomic-instrumented.sh
-// DO NOT MODIFY THIS FILE DIRECTLY
-
-/*
- * This file provides wrappers with KASAN instrumentation for atomic operations.
- * To use this functionality an arch's atomic.h file needs to define all
- * atomic operations with arch_ prefix (e.g. arch_atomic_read()) and include
- * this file at the end. This file provides atomic_read() that forwards to
- * arch_atomic_read() for actual atomic operation.
- * Note: if an arch atomic operation is implemented by means of other atomic
- * operations (e.g. atomic_read()/atomic_cmpxchg() loop), then it needs to use
- * arch_ variants (i.e. arch_atomic_read()/arch_atomic_cmpxchg()) to avoid
- * double instrumentation.
- */
-#ifndef _ASM_GENERIC_ATOMIC_INSTRUMENTED_H
-#define _ASM_GENERIC_ATOMIC_INSTRUMENTED_H
-
-#include <linux/build_bug.h>
-#include <linux/compiler.h>
-#include <linux/instrumented.h>
-
-static __always_inline int
-atomic_read(const atomic_t *v)
-{
- instrument_atomic_read(v, sizeof(*v));
- return arch_atomic_read(v);
-}
-
-static __always_inline int
-atomic_read_acquire(const atomic_t *v)
-{
- instrument_atomic_read(v, sizeof(*v));
- return arch_atomic_read_acquire(v);
-}
-
-static __always_inline void
-atomic_set(atomic_t *v, int i)
-{
- instrument_atomic_write(v, sizeof(*v));
- arch_atomic_set(v, i);
-}
-
-static __always_inline void
-atomic_set_release(atomic_t *v, int i)
-{
- instrument_atomic_write(v, sizeof(*v));
- arch_atomic_set_release(v, i);
-}
-
-static __always_inline void
-atomic_add(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_add(i, v);
-}
-
-static __always_inline int
-atomic_add_return(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_add_return(i, v);
-}
-
-static __always_inline int
-atomic_add_return_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_add_return_acquire(i, v);
-}
-
-static __always_inline int
-atomic_add_return_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_add_return_release(i, v);
-}
-
-static __always_inline int
-atomic_add_return_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_add_return_relaxed(i, v);
-}
-
-static __always_inline int
-atomic_fetch_add(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_add(i, v);
-}
-
-static __always_inline int
-atomic_fetch_add_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_add_acquire(i, v);
-}
-
-static __always_inline int
-atomic_fetch_add_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_add_release(i, v);
-}
-
-static __always_inline int
-atomic_fetch_add_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_add_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_sub(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_sub(i, v);
-}
-
-static __always_inline int
-atomic_sub_return(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_sub_return(i, v);
-}
-
-static __always_inline int
-atomic_sub_return_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_sub_return_acquire(i, v);
-}
-
-static __always_inline int
-atomic_sub_return_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_sub_return_release(i, v);
-}
-
-static __always_inline int
-atomic_sub_return_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_sub_return_relaxed(i, v);
-}
-
-static __always_inline int
-atomic_fetch_sub(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_sub(i, v);
-}
-
-static __always_inline int
-atomic_fetch_sub_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_sub_acquire(i, v);
-}
-
-static __always_inline int
-atomic_fetch_sub_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_sub_release(i, v);
-}
-
-static __always_inline int
-atomic_fetch_sub_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_sub_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_inc(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_inc(v);
-}
-
-static __always_inline int
-atomic_inc_return(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_inc_return(v);
-}
-
-static __always_inline int
-atomic_inc_return_acquire(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_inc_return_acquire(v);
-}
-
-static __always_inline int
-atomic_inc_return_release(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_inc_return_release(v);
-}
-
-static __always_inline int
-atomic_inc_return_relaxed(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_inc_return_relaxed(v);
-}
-
-static __always_inline int
-atomic_fetch_inc(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_inc(v);
-}
-
-static __always_inline int
-atomic_fetch_inc_acquire(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_inc_acquire(v);
-}
-
-static __always_inline int
-atomic_fetch_inc_release(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_inc_release(v);
-}
-
-static __always_inline int
-atomic_fetch_inc_relaxed(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_inc_relaxed(v);
-}
-
-static __always_inline void
-atomic_dec(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_dec(v);
-}
-
-static __always_inline int
-atomic_dec_return(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_dec_return(v);
-}
-
-static __always_inline int
-atomic_dec_return_acquire(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_dec_return_acquire(v);
-}
-
-static __always_inline int
-atomic_dec_return_release(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_dec_return_release(v);
-}
-
-static __always_inline int
-atomic_dec_return_relaxed(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_dec_return_relaxed(v);
-}
-
-static __always_inline int
-atomic_fetch_dec(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_dec(v);
-}
-
-static __always_inline int
-atomic_fetch_dec_acquire(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_dec_acquire(v);
-}
-
-static __always_inline int
-atomic_fetch_dec_release(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_dec_release(v);
-}
-
-static __always_inline int
-atomic_fetch_dec_relaxed(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_dec_relaxed(v);
-}
-
-static __always_inline void
-atomic_and(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_and(i, v);
-}
-
-static __always_inline int
-atomic_fetch_and(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_and(i, v);
-}
-
-static __always_inline int
-atomic_fetch_and_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_and_acquire(i, v);
-}
-
-static __always_inline int
-atomic_fetch_and_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_and_release(i, v);
-}
-
-static __always_inline int
-atomic_fetch_and_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_and_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_andnot(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_andnot(i, v);
-}
-
-static __always_inline int
-atomic_fetch_andnot(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_andnot(i, v);
-}
-
-static __always_inline int
-atomic_fetch_andnot_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_andnot_acquire(i, v);
-}
-
-static __always_inline int
-atomic_fetch_andnot_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_andnot_release(i, v);
-}
-
-static __always_inline int
-atomic_fetch_andnot_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_andnot_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_or(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_or(i, v);
-}
-
-static __always_inline int
-atomic_fetch_or(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_or(i, v);
-}
-
-static __always_inline int
-atomic_fetch_or_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_or_acquire(i, v);
-}
-
-static __always_inline int
-atomic_fetch_or_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_or_release(i, v);
-}
-
-static __always_inline int
-atomic_fetch_or_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_or_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_xor(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic_xor(i, v);
-}
-
-static __always_inline int
-atomic_fetch_xor(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_xor(i, v);
-}
-
-static __always_inline int
-atomic_fetch_xor_acquire(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_xor_acquire(i, v);
-}
-
-static __always_inline int
-atomic_fetch_xor_release(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_xor_release(i, v);
-}
-
-static __always_inline int
-atomic_fetch_xor_relaxed(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_xor_relaxed(i, v);
-}
-
-static __always_inline int
-atomic_xchg(atomic_t *v, int i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_xchg(v, i);
-}
-
-static __always_inline int
-atomic_xchg_acquire(atomic_t *v, int i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_xchg_acquire(v, i);
-}
-
-static __always_inline int
-atomic_xchg_release(atomic_t *v, int i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_xchg_release(v, i);
-}
-
-static __always_inline int
-atomic_xchg_relaxed(atomic_t *v, int i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_xchg_relaxed(v, i);
-}
-
-static __always_inline int
-atomic_cmpxchg(atomic_t *v, int old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_cmpxchg(v, old, new);
-}
-
-static __always_inline int
-atomic_cmpxchg_acquire(atomic_t *v, int old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_cmpxchg_acquire(v, old, new);
-}
-
-static __always_inline int
-atomic_cmpxchg_release(atomic_t *v, int old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_cmpxchg_release(v, old, new);
-}
-
-static __always_inline int
-atomic_cmpxchg_relaxed(atomic_t *v, int old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_cmpxchg_relaxed(v, old, new);
-}
-
-static __always_inline bool
-atomic_try_cmpxchg(atomic_t *v, int *old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic_try_cmpxchg(v, old, new);
-}
-
-static __always_inline bool
-atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic_try_cmpxchg_acquire(v, old, new);
-}
-
-static __always_inline bool
-atomic_try_cmpxchg_release(atomic_t *v, int *old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic_try_cmpxchg_release(v, old, new);
-}
-
-static __always_inline bool
-atomic_try_cmpxchg_relaxed(atomic_t *v, int *old, int new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic_try_cmpxchg_relaxed(v, old, new);
-}
-
-static __always_inline bool
-atomic_sub_and_test(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_sub_and_test(i, v);
-}
-
-static __always_inline bool
-atomic_dec_and_test(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_dec_and_test(v);
-}
-
-static __always_inline bool
-atomic_inc_and_test(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_inc_and_test(v);
-}
-
-static __always_inline bool
-atomic_add_negative(int i, atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_add_negative(i, v);
-}
-
-static __always_inline int
-atomic_fetch_add_unless(atomic_t *v, int a, int u)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_fetch_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic_add_unless(atomic_t *v, int a, int u)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic_inc_not_zero(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_inc_not_zero(v);
-}
-
-static __always_inline bool
-atomic_inc_unless_negative(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_inc_unless_negative(v);
-}
-
-static __always_inline bool
-atomic_dec_unless_positive(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_dec_unless_positive(v);
-}
-
-static __always_inline int
-atomic_dec_if_positive(atomic_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic_dec_if_positive(v);
-}
-
-static __always_inline s64
-atomic64_read(const atomic64_t *v)
-{
- instrument_atomic_read(v, sizeof(*v));
- return arch_atomic64_read(v);
-}
-
-static __always_inline s64
-atomic64_read_acquire(const atomic64_t *v)
-{
- instrument_atomic_read(v, sizeof(*v));
- return arch_atomic64_read_acquire(v);
-}
-
-static __always_inline void
-atomic64_set(atomic64_t *v, s64 i)
-{
- instrument_atomic_write(v, sizeof(*v));
- arch_atomic64_set(v, i);
-}
-
-static __always_inline void
-atomic64_set_release(atomic64_t *v, s64 i)
-{
- instrument_atomic_write(v, sizeof(*v));
- arch_atomic64_set_release(v, i);
-}
-
-static __always_inline void
-atomic64_add(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_add(i, v);
-}
-
-static __always_inline s64
-atomic64_add_return(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_add_return(i, v);
-}
-
-static __always_inline s64
-atomic64_add_return_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_add_return_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_add_return_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_add_return_release(i, v);
-}
-
-static __always_inline s64
-atomic64_add_return_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_add_return_relaxed(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_add(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_add(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_add_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_add_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_add_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_add_release(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_add_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_add_relaxed(i, v);
-}
-
-static __always_inline void
-atomic64_sub(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_sub(i, v);
-}
-
-static __always_inline s64
-atomic64_sub_return(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_sub_return(i, v);
-}
-
-static __always_inline s64
-atomic64_sub_return_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_sub_return_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_sub_return_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_sub_return_release(i, v);
-}
-
-static __always_inline s64
-atomic64_sub_return_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_sub_return_relaxed(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_sub(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_sub(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_sub_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_sub_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_sub_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_sub_release(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_sub_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_sub_relaxed(i, v);
-}
-
-static __always_inline void
-atomic64_inc(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_inc(v);
-}
-
-static __always_inline s64
-atomic64_inc_return(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_inc_return(v);
-}
-
-static __always_inline s64
-atomic64_inc_return_acquire(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_inc_return_acquire(v);
-}
-
-static __always_inline s64
-atomic64_inc_return_release(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_inc_return_release(v);
-}
-
-static __always_inline s64
-atomic64_inc_return_relaxed(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_inc_return_relaxed(v);
-}
-
-static __always_inline s64
-atomic64_fetch_inc(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_inc(v);
-}
-
-static __always_inline s64
-atomic64_fetch_inc_acquire(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_inc_acquire(v);
-}
-
-static __always_inline s64
-atomic64_fetch_inc_release(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_inc_release(v);
-}
-
-static __always_inline s64
-atomic64_fetch_inc_relaxed(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_inc_relaxed(v);
-}
-
-static __always_inline void
-atomic64_dec(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_dec(v);
-}
-
-static __always_inline s64
-atomic64_dec_return(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_dec_return(v);
-}
-
-static __always_inline s64
-atomic64_dec_return_acquire(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_dec_return_acquire(v);
-}
-
-static __always_inline s64
-atomic64_dec_return_release(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_dec_return_release(v);
-}
-
-static __always_inline s64
-atomic64_dec_return_relaxed(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_dec_return_relaxed(v);
-}
-
-static __always_inline s64
-atomic64_fetch_dec(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_dec(v);
-}
-
-static __always_inline s64
-atomic64_fetch_dec_acquire(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_dec_acquire(v);
-}
-
-static __always_inline s64
-atomic64_fetch_dec_release(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_dec_release(v);
-}
-
-static __always_inline s64
-atomic64_fetch_dec_relaxed(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_dec_relaxed(v);
-}
-
-static __always_inline void
-atomic64_and(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_and(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_and(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_and(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_and_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_and_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_and_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_and_release(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_and_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_and_relaxed(i, v);
-}
-
-static __always_inline void
-atomic64_andnot(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_andnot(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_andnot(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_andnot(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_andnot_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_andnot_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_andnot_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_andnot_release(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_andnot_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_andnot_relaxed(i, v);
-}
-
-static __always_inline void
-atomic64_or(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_or(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_or(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_or(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_or_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_or_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_or_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_or_release(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_or_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_or_relaxed(i, v);
-}
-
-static __always_inline void
-atomic64_xor(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- arch_atomic64_xor(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_xor(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_xor(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_xor_acquire(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_xor_acquire(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_xor_release(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_xor_release(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_xor_relaxed(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_xor_relaxed(i, v);
-}
-
-static __always_inline s64
-atomic64_xchg(atomic64_t *v, s64 i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_xchg(v, i);
-}
-
-static __always_inline s64
-atomic64_xchg_acquire(atomic64_t *v, s64 i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_xchg_acquire(v, i);
-}
-
-static __always_inline s64
-atomic64_xchg_release(atomic64_t *v, s64 i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_xchg_release(v, i);
-}
-
-static __always_inline s64
-atomic64_xchg_relaxed(atomic64_t *v, s64 i)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_xchg_relaxed(v, i);
-}
-
-static __always_inline s64
-atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_cmpxchg(v, old, new);
-}
-
-static __always_inline s64
-atomic64_cmpxchg_acquire(atomic64_t *v, s64 old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_cmpxchg_acquire(v, old, new);
-}
-
-static __always_inline s64
-atomic64_cmpxchg_release(atomic64_t *v, s64 old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_cmpxchg_release(v, old, new);
-}
-
-static __always_inline s64
-atomic64_cmpxchg_relaxed(atomic64_t *v, s64 old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_cmpxchg_relaxed(v, old, new);
-}
-
-static __always_inline bool
-atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic64_try_cmpxchg(v, old, new);
-}
-
-static __always_inline bool
-atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic64_try_cmpxchg_acquire(v, old, new);
-}
-
-static __always_inline bool
-atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic64_try_cmpxchg_release(v, old, new);
-}
-
-static __always_inline bool
-atomic64_try_cmpxchg_relaxed(atomic64_t *v, s64 *old, s64 new)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- instrument_atomic_read_write(old, sizeof(*old));
- return arch_atomic64_try_cmpxchg_relaxed(v, old, new);
-}
-
-static __always_inline bool
-atomic64_sub_and_test(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_sub_and_test(i, v);
-}
-
-static __always_inline bool
-atomic64_dec_and_test(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_dec_and_test(v);
-}
-
-static __always_inline bool
-atomic64_inc_and_test(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_inc_and_test(v);
-}
-
-static __always_inline bool
-atomic64_add_negative(s64 i, atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_add_negative(i, v);
-}
-
-static __always_inline s64
-atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_fetch_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic64_add_unless(atomic64_t *v, s64 a, s64 u)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic64_inc_not_zero(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_inc_not_zero(v);
-}
-
-static __always_inline bool
-atomic64_inc_unless_negative(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_inc_unless_negative(v);
-}
-
-static __always_inline bool
-atomic64_dec_unless_positive(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_dec_unless_positive(v);
-}
-
-static __always_inline s64
-atomic64_dec_if_positive(atomic64_t *v)
-{
- instrument_atomic_read_write(v, sizeof(*v));
- return arch_atomic64_dec_if_positive(v);
-}
-
-#define xchg(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg(__ai_ptr, __VA_ARGS__); \
-})
-
-#define xchg_acquire(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg_acquire(__ai_ptr, __VA_ARGS__); \
-})
-
-#define xchg_release(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg_release(__ai_ptr, __VA_ARGS__); \
-})
-
-#define xchg_relaxed(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_xchg_relaxed(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg_acquire(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_acquire(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg_release(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_release(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg_relaxed(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_relaxed(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg64(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg64_acquire(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_acquire(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg64_release(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_release(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg64_relaxed(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \
-})
-
-#define try_cmpxchg(ptr, oldp, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- typeof(oldp) __ai_oldp = (oldp); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
- arch_try_cmpxchg(__ai_ptr, __ai_oldp, __VA_ARGS__); \
-})
-
-#define try_cmpxchg_acquire(ptr, oldp, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- typeof(oldp) __ai_oldp = (oldp); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
- arch_try_cmpxchg_acquire(__ai_ptr, __ai_oldp, __VA_ARGS__); \
-})
-
-#define try_cmpxchg_release(ptr, oldp, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- typeof(oldp) __ai_oldp = (oldp); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
- arch_try_cmpxchg_release(__ai_ptr, __ai_oldp, __VA_ARGS__); \
-})
-
-#define try_cmpxchg_relaxed(ptr, oldp, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- typeof(oldp) __ai_oldp = (oldp); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
- arch_try_cmpxchg_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \
-})
-
-#define cmpxchg_local(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg_local(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg64_local(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \
-})
-
-#define sync_cmpxchg(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
- arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
-})
-
-#define cmpxchg_double(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
- arch_cmpxchg_double(__ai_ptr, __VA_ARGS__); \
-})
-
-
-#define cmpxchg_double_local(ptr, ...) \
-({ \
- typeof(ptr) __ai_ptr = (ptr); \
- instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
- arch_cmpxchg_double_local(__ai_ptr, __VA_ARGS__); \
-})
-
-#endif /* _ASM_GENERIC_ATOMIC_INSTRUMENTED_H */
-// 1d7c3a25aca5c7fb031c307be4c3d24c7b48fcd5
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-
-// Generated by scripts/atomic/gen-atomic-long.sh
-// DO NOT MODIFY THIS FILE DIRECTLY
-
-#ifndef _ASM_GENERIC_ATOMIC_LONG_H
-#define _ASM_GENERIC_ATOMIC_LONG_H
-
-#include <linux/compiler.h>
-#include <asm/types.h>
-
-#ifdef CONFIG_64BIT
-typedef atomic64_t atomic_long_t;
-#define ATOMIC_LONG_INIT(i) ATOMIC64_INIT(i)
-#define atomic_long_cond_read_acquire atomic64_cond_read_acquire
-#define atomic_long_cond_read_relaxed atomic64_cond_read_relaxed
-#else
-typedef atomic_t atomic_long_t;
-#define ATOMIC_LONG_INIT(i) ATOMIC_INIT(i)
-#define atomic_long_cond_read_acquire atomic_cond_read_acquire
-#define atomic_long_cond_read_relaxed atomic_cond_read_relaxed
-#endif
-
-#ifdef CONFIG_64BIT
-
-static __always_inline long
-atomic_long_read(const atomic_long_t *v)
-{
- return atomic64_read(v);
-}
-
-static __always_inline long
-atomic_long_read_acquire(const atomic_long_t *v)
-{
- return atomic64_read_acquire(v);
-}
-
-static __always_inline void
-atomic_long_set(atomic_long_t *v, long i)
-{
- atomic64_set(v, i);
-}
-
-static __always_inline void
-atomic_long_set_release(atomic_long_t *v, long i)
-{
- atomic64_set_release(v, i);
-}
-
-static __always_inline void
-atomic_long_add(long i, atomic_long_t *v)
-{
- atomic64_add(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return(long i, atomic_long_t *v)
-{
- return atomic64_add_return(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return_acquire(long i, atomic_long_t *v)
-{
- return atomic64_add_return_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return_release(long i, atomic_long_t *v)
-{
- return atomic64_add_return_release(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_add_return_relaxed(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add(long i, atomic_long_t *v)
-{
- return atomic64_fetch_add(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_acquire(long i, atomic_long_t *v)
-{
- return atomic64_fetch_add_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_release(long i, atomic_long_t *v)
-{
- return atomic64_fetch_add_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_fetch_add_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_sub(long i, atomic_long_t *v)
-{
- atomic64_sub(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return(long i, atomic_long_t *v)
-{
- return atomic64_sub_return(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return_acquire(long i, atomic_long_t *v)
-{
- return atomic64_sub_return_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return_release(long i, atomic_long_t *v)
-{
- return atomic64_sub_return_release(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_sub_return_relaxed(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub(long i, atomic_long_t *v)
-{
- return atomic64_fetch_sub(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub_acquire(long i, atomic_long_t *v)
-{
- return atomic64_fetch_sub_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub_release(long i, atomic_long_t *v)
-{
- return atomic64_fetch_sub_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_fetch_sub_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_inc(atomic_long_t *v)
-{
- atomic64_inc(v);
-}
-
-static __always_inline long
-atomic_long_inc_return(atomic_long_t *v)
-{
- return atomic64_inc_return(v);
-}
-
-static __always_inline long
-atomic_long_inc_return_acquire(atomic_long_t *v)
-{
- return atomic64_inc_return_acquire(v);
-}
-
-static __always_inline long
-atomic_long_inc_return_release(atomic_long_t *v)
-{
- return atomic64_inc_return_release(v);
-}
-
-static __always_inline long
-atomic_long_inc_return_relaxed(atomic_long_t *v)
-{
- return atomic64_inc_return_relaxed(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc(atomic_long_t *v)
-{
- return atomic64_fetch_inc(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc_acquire(atomic_long_t *v)
-{
- return atomic64_fetch_inc_acquire(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc_release(atomic_long_t *v)
-{
- return atomic64_fetch_inc_release(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc_relaxed(atomic_long_t *v)
-{
- return atomic64_fetch_inc_relaxed(v);
-}
-
-static __always_inline void
-atomic_long_dec(atomic_long_t *v)
-{
- atomic64_dec(v);
-}
-
-static __always_inline long
-atomic_long_dec_return(atomic_long_t *v)
-{
- return atomic64_dec_return(v);
-}
-
-static __always_inline long
-atomic_long_dec_return_acquire(atomic_long_t *v)
-{
- return atomic64_dec_return_acquire(v);
-}
-
-static __always_inline long
-atomic_long_dec_return_release(atomic_long_t *v)
-{
- return atomic64_dec_return_release(v);
-}
-
-static __always_inline long
-atomic_long_dec_return_relaxed(atomic_long_t *v)
-{
- return atomic64_dec_return_relaxed(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec(atomic_long_t *v)
-{
- return atomic64_fetch_dec(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec_acquire(atomic_long_t *v)
-{
- return atomic64_fetch_dec_acquire(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec_release(atomic_long_t *v)
-{
- return atomic64_fetch_dec_release(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec_relaxed(atomic_long_t *v)
-{
- return atomic64_fetch_dec_relaxed(v);
-}
-
-static __always_inline void
-atomic_long_and(long i, atomic_long_t *v)
-{
- atomic64_and(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and(long i, atomic_long_t *v)
-{
- return atomic64_fetch_and(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and_acquire(long i, atomic_long_t *v)
-{
- return atomic64_fetch_and_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and_release(long i, atomic_long_t *v)
-{
- return atomic64_fetch_and_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_fetch_and_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_andnot(long i, atomic_long_t *v)
-{
- atomic64_andnot(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot(long i, atomic_long_t *v)
-{
- return atomic64_fetch_andnot(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot_acquire(long i, atomic_long_t *v)
-{
- return atomic64_fetch_andnot_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot_release(long i, atomic_long_t *v)
-{
- return atomic64_fetch_andnot_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_fetch_andnot_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_or(long i, atomic_long_t *v)
-{
- atomic64_or(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or(long i, atomic_long_t *v)
-{
- return atomic64_fetch_or(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or_acquire(long i, atomic_long_t *v)
-{
- return atomic64_fetch_or_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or_release(long i, atomic_long_t *v)
-{
- return atomic64_fetch_or_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_fetch_or_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_xor(long i, atomic_long_t *v)
-{
- atomic64_xor(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor(long i, atomic_long_t *v)
-{
- return atomic64_fetch_xor(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor_acquire(long i, atomic_long_t *v)
-{
- return atomic64_fetch_xor_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor_release(long i, atomic_long_t *v)
-{
- return atomic64_fetch_xor_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor_relaxed(long i, atomic_long_t *v)
-{
- return atomic64_fetch_xor_relaxed(i, v);
-}
-
-static __always_inline long
-atomic_long_xchg(atomic_long_t *v, long i)
-{
- return atomic64_xchg(v, i);
-}
-
-static __always_inline long
-atomic_long_xchg_acquire(atomic_long_t *v, long i)
-{
- return atomic64_xchg_acquire(v, i);
-}
-
-static __always_inline long
-atomic_long_xchg_release(atomic_long_t *v, long i)
-{
- return atomic64_xchg_release(v, i);
-}
-
-static __always_inline long
-atomic_long_xchg_relaxed(atomic_long_t *v, long i)
-{
- return atomic64_xchg_relaxed(v, i);
-}
-
-static __always_inline long
-atomic_long_cmpxchg(atomic_long_t *v, long old, long new)
-{
- return atomic64_cmpxchg(v, old, new);
-}
-
-static __always_inline long
-atomic_long_cmpxchg_acquire(atomic_long_t *v, long old, long new)
-{
- return atomic64_cmpxchg_acquire(v, old, new);
-}
-
-static __always_inline long
-atomic_long_cmpxchg_release(atomic_long_t *v, long old, long new)
-{
- return atomic64_cmpxchg_release(v, old, new);
-}
-
-static __always_inline long
-atomic_long_cmpxchg_relaxed(atomic_long_t *v, long old, long new)
-{
- return atomic64_cmpxchg_relaxed(v, old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new)
-{
- return atomic64_try_cmpxchg(v, (s64 *)old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new)
-{
- return atomic64_try_cmpxchg_acquire(v, (s64 *)old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new)
-{
- return atomic64_try_cmpxchg_release(v, (s64 *)old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new)
-{
- return atomic64_try_cmpxchg_relaxed(v, (s64 *)old, new);
-}
-
-static __always_inline bool
-atomic_long_sub_and_test(long i, atomic_long_t *v)
-{
- return atomic64_sub_and_test(i, v);
-}
-
-static __always_inline bool
-atomic_long_dec_and_test(atomic_long_t *v)
-{
- return atomic64_dec_and_test(v);
-}
-
-static __always_inline bool
-atomic_long_inc_and_test(atomic_long_t *v)
-{
- return atomic64_inc_and_test(v);
-}
-
-static __always_inline bool
-atomic_long_add_negative(long i, atomic_long_t *v)
-{
- return atomic64_add_negative(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_unless(atomic_long_t *v, long a, long u)
-{
- return atomic64_fetch_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic_long_add_unless(atomic_long_t *v, long a, long u)
-{
- return atomic64_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic_long_inc_not_zero(atomic_long_t *v)
-{
- return atomic64_inc_not_zero(v);
-}
-
-static __always_inline bool
-atomic_long_inc_unless_negative(atomic_long_t *v)
-{
- return atomic64_inc_unless_negative(v);
-}
-
-static __always_inline bool
-atomic_long_dec_unless_positive(atomic_long_t *v)
-{
- return atomic64_dec_unless_positive(v);
-}
-
-static __always_inline long
-atomic_long_dec_if_positive(atomic_long_t *v)
-{
- return atomic64_dec_if_positive(v);
-}
-
-#else /* CONFIG_64BIT */
-
-static __always_inline long
-atomic_long_read(const atomic_long_t *v)
-{
- return atomic_read(v);
-}
-
-static __always_inline long
-atomic_long_read_acquire(const atomic_long_t *v)
-{
- return atomic_read_acquire(v);
-}
-
-static __always_inline void
-atomic_long_set(atomic_long_t *v, long i)
-{
- atomic_set(v, i);
-}
-
-static __always_inline void
-atomic_long_set_release(atomic_long_t *v, long i)
-{
- atomic_set_release(v, i);
-}
-
-static __always_inline void
-atomic_long_add(long i, atomic_long_t *v)
-{
- atomic_add(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return(long i, atomic_long_t *v)
-{
- return atomic_add_return(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return_acquire(long i, atomic_long_t *v)
-{
- return atomic_add_return_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return_release(long i, atomic_long_t *v)
-{
- return atomic_add_return_release(i, v);
-}
-
-static __always_inline long
-atomic_long_add_return_relaxed(long i, atomic_long_t *v)
-{
- return atomic_add_return_relaxed(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add(long i, atomic_long_t *v)
-{
- return atomic_fetch_add(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_acquire(long i, atomic_long_t *v)
-{
- return atomic_fetch_add_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_release(long i, atomic_long_t *v)
-{
- return atomic_fetch_add_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_relaxed(long i, atomic_long_t *v)
-{
- return atomic_fetch_add_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_sub(long i, atomic_long_t *v)
-{
- atomic_sub(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return(long i, atomic_long_t *v)
-{
- return atomic_sub_return(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return_acquire(long i, atomic_long_t *v)
-{
- return atomic_sub_return_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return_release(long i, atomic_long_t *v)
-{
- return atomic_sub_return_release(i, v);
-}
-
-static __always_inline long
-atomic_long_sub_return_relaxed(long i, atomic_long_t *v)
-{
- return atomic_sub_return_relaxed(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub(long i, atomic_long_t *v)
-{
- return atomic_fetch_sub(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub_acquire(long i, atomic_long_t *v)
-{
- return atomic_fetch_sub_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub_release(long i, atomic_long_t *v)
-{
- return atomic_fetch_sub_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_sub_relaxed(long i, atomic_long_t *v)
-{
- return atomic_fetch_sub_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_inc(atomic_long_t *v)
-{
- atomic_inc(v);
-}
-
-static __always_inline long
-atomic_long_inc_return(atomic_long_t *v)
-{
- return atomic_inc_return(v);
-}
-
-static __always_inline long
-atomic_long_inc_return_acquire(atomic_long_t *v)
-{
- return atomic_inc_return_acquire(v);
-}
-
-static __always_inline long
-atomic_long_inc_return_release(atomic_long_t *v)
-{
- return atomic_inc_return_release(v);
-}
-
-static __always_inline long
-atomic_long_inc_return_relaxed(atomic_long_t *v)
-{
- return atomic_inc_return_relaxed(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc(atomic_long_t *v)
-{
- return atomic_fetch_inc(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc_acquire(atomic_long_t *v)
-{
- return atomic_fetch_inc_acquire(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc_release(atomic_long_t *v)
-{
- return atomic_fetch_inc_release(v);
-}
-
-static __always_inline long
-atomic_long_fetch_inc_relaxed(atomic_long_t *v)
-{
- return atomic_fetch_inc_relaxed(v);
-}
-
-static __always_inline void
-atomic_long_dec(atomic_long_t *v)
-{
- atomic_dec(v);
-}
-
-static __always_inline long
-atomic_long_dec_return(atomic_long_t *v)
-{
- return atomic_dec_return(v);
-}
-
-static __always_inline long
-atomic_long_dec_return_acquire(atomic_long_t *v)
-{
- return atomic_dec_return_acquire(v);
-}
-
-static __always_inline long
-atomic_long_dec_return_release(atomic_long_t *v)
-{
- return atomic_dec_return_release(v);
-}
-
-static __always_inline long
-atomic_long_dec_return_relaxed(atomic_long_t *v)
-{
- return atomic_dec_return_relaxed(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec(atomic_long_t *v)
-{
- return atomic_fetch_dec(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec_acquire(atomic_long_t *v)
-{
- return atomic_fetch_dec_acquire(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec_release(atomic_long_t *v)
-{
- return atomic_fetch_dec_release(v);
-}
-
-static __always_inline long
-atomic_long_fetch_dec_relaxed(atomic_long_t *v)
-{
- return atomic_fetch_dec_relaxed(v);
-}
-
-static __always_inline void
-atomic_long_and(long i, atomic_long_t *v)
-{
- atomic_and(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and(long i, atomic_long_t *v)
-{
- return atomic_fetch_and(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and_acquire(long i, atomic_long_t *v)
-{
- return atomic_fetch_and_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and_release(long i, atomic_long_t *v)
-{
- return atomic_fetch_and_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_and_relaxed(long i, atomic_long_t *v)
-{
- return atomic_fetch_and_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_andnot(long i, atomic_long_t *v)
-{
- atomic_andnot(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot(long i, atomic_long_t *v)
-{
- return atomic_fetch_andnot(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot_acquire(long i, atomic_long_t *v)
-{
- return atomic_fetch_andnot_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot_release(long i, atomic_long_t *v)
-{
- return atomic_fetch_andnot_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_andnot_relaxed(long i, atomic_long_t *v)
-{
- return atomic_fetch_andnot_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_or(long i, atomic_long_t *v)
-{
- atomic_or(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or(long i, atomic_long_t *v)
-{
- return atomic_fetch_or(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or_acquire(long i, atomic_long_t *v)
-{
- return atomic_fetch_or_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or_release(long i, atomic_long_t *v)
-{
- return atomic_fetch_or_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_or_relaxed(long i, atomic_long_t *v)
-{
- return atomic_fetch_or_relaxed(i, v);
-}
-
-static __always_inline void
-atomic_long_xor(long i, atomic_long_t *v)
-{
- atomic_xor(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor(long i, atomic_long_t *v)
-{
- return atomic_fetch_xor(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor_acquire(long i, atomic_long_t *v)
-{
- return atomic_fetch_xor_acquire(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor_release(long i, atomic_long_t *v)
-{
- return atomic_fetch_xor_release(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_xor_relaxed(long i, atomic_long_t *v)
-{
- return atomic_fetch_xor_relaxed(i, v);
-}
-
-static __always_inline long
-atomic_long_xchg(atomic_long_t *v, long i)
-{
- return atomic_xchg(v, i);
-}
-
-static __always_inline long
-atomic_long_xchg_acquire(atomic_long_t *v, long i)
-{
- return atomic_xchg_acquire(v, i);
-}
-
-static __always_inline long
-atomic_long_xchg_release(atomic_long_t *v, long i)
-{
- return atomic_xchg_release(v, i);
-}
-
-static __always_inline long
-atomic_long_xchg_relaxed(atomic_long_t *v, long i)
-{
- return atomic_xchg_relaxed(v, i);
-}
-
-static __always_inline long
-atomic_long_cmpxchg(atomic_long_t *v, long old, long new)
-{
- return atomic_cmpxchg(v, old, new);
-}
-
-static __always_inline long
-atomic_long_cmpxchg_acquire(atomic_long_t *v, long old, long new)
-{
- return atomic_cmpxchg_acquire(v, old, new);
-}
-
-static __always_inline long
-atomic_long_cmpxchg_release(atomic_long_t *v, long old, long new)
-{
- return atomic_cmpxchg_release(v, old, new);
-}
-
-static __always_inline long
-atomic_long_cmpxchg_relaxed(atomic_long_t *v, long old, long new)
-{
- return atomic_cmpxchg_relaxed(v, old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new)
-{
- return atomic_try_cmpxchg(v, (int *)old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new)
-{
- return atomic_try_cmpxchg_acquire(v, (int *)old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new)
-{
- return atomic_try_cmpxchg_release(v, (int *)old, new);
-}
-
-static __always_inline bool
-atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new)
-{
- return atomic_try_cmpxchg_relaxed(v, (int *)old, new);
-}
-
-static __always_inline bool
-atomic_long_sub_and_test(long i, atomic_long_t *v)
-{
- return atomic_sub_and_test(i, v);
-}
-
-static __always_inline bool
-atomic_long_dec_and_test(atomic_long_t *v)
-{
- return atomic_dec_and_test(v);
-}
-
-static __always_inline bool
-atomic_long_inc_and_test(atomic_long_t *v)
-{
- return atomic_inc_and_test(v);
-}
-
-static __always_inline bool
-atomic_long_add_negative(long i, atomic_long_t *v)
-{
- return atomic_add_negative(i, v);
-}
-
-static __always_inline long
-atomic_long_fetch_add_unless(atomic_long_t *v, long a, long u)
-{
- return atomic_fetch_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic_long_add_unless(atomic_long_t *v, long a, long u)
-{
- return atomic_add_unless(v, a, u);
-}
-
-static __always_inline bool
-atomic_long_inc_not_zero(atomic_long_t *v)
-{
- return atomic_inc_not_zero(v);
-}
-
-static __always_inline bool
-atomic_long_inc_unless_negative(atomic_long_t *v)
-{
- return atomic_inc_unless_negative(v);
-}
-
-static __always_inline bool
-atomic_long_dec_unless_positive(atomic_long_t *v)
-{
- return atomic_dec_unless_positive(v);
-}
-
-static __always_inline long
-atomic_long_dec_if_positive(atomic_long_t *v)
-{
- return atomic_dec_if_positive(v);
-}
-
-#endif /* CONFIG_64BIT */
-#endif /* _ASM_GENERIC_ATOMIC_LONG_H */
-// a624200981f552b2c6be4f32fe44da8289f30d87
* See Documentation/atomic_bitops.txt for details.
*/
-static __always_inline void set_bit(unsigned int nr, volatile unsigned long *p)
+static __always_inline void
+arch_set_bit(unsigned int nr, volatile unsigned long *p)
{
p += BIT_WORD(nr);
- atomic_long_or(BIT_MASK(nr), (atomic_long_t *)p);
+ arch_atomic_long_or(BIT_MASK(nr), (atomic_long_t *)p);
}
-static __always_inline void clear_bit(unsigned int nr, volatile unsigned long *p)
+static __always_inline void
+arch_clear_bit(unsigned int nr, volatile unsigned long *p)
{
p += BIT_WORD(nr);
- atomic_long_andnot(BIT_MASK(nr), (atomic_long_t *)p);
+ arch_atomic_long_andnot(BIT_MASK(nr), (atomic_long_t *)p);
}
-static __always_inline void change_bit(unsigned int nr, volatile unsigned long *p)
+static __always_inline void
+arch_change_bit(unsigned int nr, volatile unsigned long *p)
{
p += BIT_WORD(nr);
- atomic_long_xor(BIT_MASK(nr), (atomic_long_t *)p);
+ arch_atomic_long_xor(BIT_MASK(nr), (atomic_long_t *)p);
}
-static inline int test_and_set_bit(unsigned int nr, volatile unsigned long *p)
+static __always_inline int
+arch_test_and_set_bit(unsigned int nr, volatile unsigned long *p)
{
long old;
unsigned long mask = BIT_MASK(nr);
if (READ_ONCE(*p) & mask)
return 1;
- old = atomic_long_fetch_or(mask, (atomic_long_t *)p);
+ old = arch_atomic_long_fetch_or(mask, (atomic_long_t *)p);
return !!(old & mask);
}
-static inline int test_and_clear_bit(unsigned int nr, volatile unsigned long *p)
+static __always_inline int
+arch_test_and_clear_bit(unsigned int nr, volatile unsigned long *p)
{
long old;
unsigned long mask = BIT_MASK(nr);
if (!(READ_ONCE(*p) & mask))
return 0;
- old = atomic_long_fetch_andnot(mask, (atomic_long_t *)p);
+ old = arch_atomic_long_fetch_andnot(mask, (atomic_long_t *)p);
return !!(old & mask);
}
-static inline int test_and_change_bit(unsigned int nr, volatile unsigned long *p)
+static __always_inline int
+arch_test_and_change_bit(unsigned int nr, volatile unsigned long *p)
{
long old;
unsigned long mask = BIT_MASK(nr);
p += BIT_WORD(nr);
- old = atomic_long_fetch_xor(mask, (atomic_long_t *)p);
+ old = arch_atomic_long_fetch_xor(mask, (atomic_long_t *)p);
return !!(old & mask);
}
+#include <asm-generic/bitops/instrumented-atomic.h>
+
#endif /* _ASM_GENERIC_BITOPS_ATOMIC_H */
#include <asm/barrier.h>
/**
- * test_and_set_bit_lock - Set a bit and return its old value, for lock
+ * arch_test_and_set_bit_lock - Set a bit and return its old value, for lock
* @nr: Bit to set
* @addr: Address to count from
*
* the returned value is 0.
* It can be used to implement bit locks.
*/
-static inline int test_and_set_bit_lock(unsigned int nr,
- volatile unsigned long *p)
+static __always_inline int
+arch_test_and_set_bit_lock(unsigned int nr, volatile unsigned long *p)
{
long old;
unsigned long mask = BIT_MASK(nr);
if (READ_ONCE(*p) & mask)
return 1;
- old = atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p);
+ old = arch_atomic_long_fetch_or_acquire(mask, (atomic_long_t *)p);
return !!(old & mask);
}
/**
- * clear_bit_unlock - Clear a bit in memory, for unlock
+ * arch_clear_bit_unlock - Clear a bit in memory, for unlock
* @nr: the bit to set
* @addr: the address to start counting from
*
* This operation is atomic and provides release barrier semantics.
*/
-static inline void clear_bit_unlock(unsigned int nr, volatile unsigned long *p)
+static __always_inline void
+arch_clear_bit_unlock(unsigned int nr, volatile unsigned long *p)
{
p += BIT_WORD(nr);
- atomic_long_fetch_andnot_release(BIT_MASK(nr), (atomic_long_t *)p);
+ arch_atomic_long_fetch_andnot_release(BIT_MASK(nr), (atomic_long_t *)p);
}
/**
- * __clear_bit_unlock - Clear a bit in memory, for unlock
+ * arch___clear_bit_unlock - Clear a bit in memory, for unlock
* @nr: the bit to set
* @addr: the address to start counting from
*
*
* See for example x86's implementation.
*/
-static inline void __clear_bit_unlock(unsigned int nr,
- volatile unsigned long *p)
+static inline void
+arch___clear_bit_unlock(unsigned int nr, volatile unsigned long *p)
{
unsigned long old;
p += BIT_WORD(nr);
old = READ_ONCE(*p);
old &= ~BIT_MASK(nr);
- atomic_long_set_release((atomic_long_t *)p, old);
+ arch_atomic_long_set_release((atomic_long_t *)p, old);
}
/**
- * clear_bit_unlock_is_negative_byte - Clear a bit in memory and test if bottom
- * byte is negative, for unlock.
+ * arch_clear_bit_unlock_is_negative_byte - Clear a bit in memory and test if bottom
+ * byte is negative, for unlock.
* @nr: the bit to clear
* @addr: the address to start counting from
*
* This is a bit of a one-trick-pony for the filemap code, which clears
* PG_locked and tests PG_waiters,
*/
-#ifndef clear_bit_unlock_is_negative_byte
-static inline bool clear_bit_unlock_is_negative_byte(unsigned int nr,
- volatile unsigned long *p)
+#ifndef arch_clear_bit_unlock_is_negative_byte
+static inline bool arch_clear_bit_unlock_is_negative_byte(unsigned int nr,
+ volatile unsigned long *p)
{
long old;
unsigned long mask = BIT_MASK(nr);
p += BIT_WORD(nr);
- old = atomic_long_fetch_andnot_release(mask, (atomic_long_t *)p);
+ old = arch_atomic_long_fetch_andnot_release(mask, (atomic_long_t *)p);
return !!(old & BIT(7));
}
-#define clear_bit_unlock_is_negative_byte clear_bit_unlock_is_negative_byte
+#define arch_clear_bit_unlock_is_negative_byte arch_clear_bit_unlock_is_negative_byte
#endif
+#include <asm-generic/bitops/instrumented-lock.h>
+
#endif /* _ASM_GENERIC_BITOPS_LOCK_H_ */
#include <asm/types.h>
/**
- * __set_bit - Set a bit in memory
+ * arch___set_bit - Set a bit in memory
* @nr: the bit to set
* @addr: the address to start counting from
*
* If it's called on the same region of memory simultaneously, the effect
* may be that only one operation succeeds.
*/
-static inline void __set_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___set_bit(int nr, volatile unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
*p |= mask;
}
+#define __set_bit arch___set_bit
-static inline void __clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___clear_bit(int nr, volatile unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
*p &= ~mask;
}
+#define __clear_bit arch___clear_bit
/**
- * __change_bit - Toggle a bit in memory
+ * arch___change_bit - Toggle a bit in memory
* @nr: the bit to change
* @addr: the address to start counting from
*
* If it's called on the same region of memory simultaneously, the effect
* may be that only one operation succeeds.
*/
-static inline void __change_bit(int nr, volatile unsigned long *addr)
+static __always_inline
+void arch___change_bit(int nr, volatile unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
*p ^= mask;
}
+#define __change_bit arch___change_bit
/**
- * __test_and_set_bit - Set a bit and return its old value
+ * arch___test_and_set_bit - Set a bit and return its old value
* @nr: Bit to set
* @addr: Address to count from
*
* If two examples of this operation race, one can appear to succeed
* but actually fail. You must protect multiple accesses with a lock.
*/
-static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
+static __always_inline int
+arch___test_and_set_bit(int nr, volatile unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
*p = old | mask;
return (old & mask) != 0;
}
+#define __test_and_set_bit arch___test_and_set_bit
/**
- * __test_and_clear_bit - Clear a bit and return its old value
+ * arch___test_and_clear_bit - Clear a bit and return its old value
* @nr: Bit to clear
* @addr: Address to count from
*
* If two examples of this operation race, one can appear to succeed
* but actually fail. You must protect multiple accesses with a lock.
*/
-static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline int
+arch___test_and_clear_bit(int nr, volatile unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
*p = old & ~mask;
return (old & mask) != 0;
}
+#define __test_and_clear_bit arch___test_and_clear_bit
/* WARNING: non atomic and it can be reordered! */
-static inline int __test_and_change_bit(int nr,
- volatile unsigned long *addr)
+static __always_inline int
+arch___test_and_change_bit(int nr, volatile unsigned long *addr)
{
unsigned long mask = BIT_MASK(nr);
unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
*p = old ^ mask;
return (old & mask) != 0;
}
+#define __test_and_change_bit arch___test_and_change_bit
/**
- * test_bit - Determine whether a bit is set
+ * arch_test_bit - Determine whether a bit is set
* @nr: bit number to test
* @addr: Address to start counting from
*/
-static inline int test_bit(int nr, const volatile unsigned long *addr)
+static __always_inline int
+arch_test_bit(int nr, const volatile unsigned long *addr)
{
return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
}
+#define test_bit arch_test_bit
#endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
struct public_key_signature {
struct asymmetric_key_id *auth_ids[2];
u8 *s; /* Signature */
- u32 s_size; /* Number of bytes in signature */
u8 *digest;
- u8 digest_size; /* Number of bytes in digest */
+ u32 s_size; /* Number of bytes in signature */
+ u32 digest_size; /* Number of bytes in digest */
const char *pkey_algo;
const char *hash_algo;
const char *encoding;
/*
* Common values for the SM4 algorithm
* Copyright (C) 2018 ARM Limited or its affiliates.
+ * Copyright (c) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#ifndef _CRYPTO_SM4_H
#define SM4_BLOCK_SIZE 16
#define SM4_RKEY_WORDS 32
-struct crypto_sm4_ctx {
+struct sm4_ctx {
u32 rkey_enc[SM4_RKEY_WORDS];
u32 rkey_dec[SM4_RKEY_WORDS];
};
-int crypto_sm4_set_key(struct crypto_tfm *tfm, const u8 *in_key,
- unsigned int key_len);
-int crypto_sm4_expand_key(struct crypto_sm4_ctx *ctx, const u8 *in_key,
+/**
+ * sm4_expandkey - Expands the SM4 key as described in GB/T 32907-2016
+ * @ctx: The location where the computed key will be stored.
+ * @in_key: The supplied key.
+ * @key_len: The length of the supplied key.
+ *
+ * Returns 0 on success. The function fails only if an invalid key size (or
+ * pointer) is supplied.
+ */
+int sm4_expandkey(struct sm4_ctx *ctx, const u8 *in_key,
unsigned int key_len);
-void crypto_sm4_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in);
-void crypto_sm4_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in);
+/**
+ * sm4_crypt_block - Encrypt or decrypt a single SM4 block
+ * @rk: The rkey_enc for encrypt or rkey_dec for decrypt
+ * @out: Buffer to store output data
+ * @in: Buffer containing the input data
+ */
+void sm4_crypt_block(const u32 *rk, u8 *out, const u8 *in);
#endif
#define OST_CLK_PERCPU_TIMER2 3
#define OST_CLK_PERCPU_TIMER3 4
+#define OST_CLK_EVENT_TIMER 1
+
+#define OST_CLK_EVENT_TIMER0 0
+#define OST_CLK_EVENT_TIMER1 1
+#define OST_CLK_EVENT_TIMER2 2
+#define OST_CLK_EVENT_TIMER3 3
+#define OST_CLK_EVENT_TIMER4 4
+#define OST_CLK_EVENT_TIMER5 5
+#define OST_CLK_EVENT_TIMER6 6
+#define OST_CLK_EVENT_TIMER7 7
+#define OST_CLK_EVENT_TIMER8 8
+#define OST_CLK_EVENT_TIMER9 9
+#define OST_CLK_EVENT_TIMER10 10
+#define OST_CLK_EVENT_TIMER11 11
+#define OST_CLK_EVENT_TIMER12 12
+#define OST_CLK_EVENT_TIMER13 13
+#define OST_CLK_EVENT_TIMER14 14
+#define OST_CLK_EVENT_TIMER15 15
+
#endif /* __DT_BINDINGS_CLOCK_INGENIC_OST_H__ */
#define SMB3XX_CHG_ENABLE_PIN_ACTIVE_LOW 1
#define SMB3XX_CHG_ENABLE_PIN_ACTIVE_HIGH 2
+/* Polarity of INOK signal */
+#define SMB3XX_SYSOK_INOK_ACTIVE_LOW 0
+#define SMB3XX_SYSOK_INOK_ACTIVE_HIGH 1
+
#endif
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-
-// Generated by scripts/atomic/gen-atomic-fallback.sh
-// DO NOT MODIFY THIS FILE DIRECTLY
-
-#ifndef _LINUX_ATOMIC_FALLBACK_H
-#define _LINUX_ATOMIC_FALLBACK_H
-
-#include <linux/compiler.h>
-
-#ifndef arch_xchg_relaxed
-#define arch_xchg_acquire arch_xchg
-#define arch_xchg_release arch_xchg
-#define arch_xchg_relaxed arch_xchg
-#else /* arch_xchg_relaxed */
-
-#ifndef arch_xchg_acquire
-#define arch_xchg_acquire(...) \
- __atomic_op_acquire(arch_xchg, __VA_ARGS__)
-#endif
-
-#ifndef arch_xchg_release
-#define arch_xchg_release(...) \
- __atomic_op_release(arch_xchg, __VA_ARGS__)
-#endif
-
-#ifndef arch_xchg
-#define arch_xchg(...) \
- __atomic_op_fence(arch_xchg, __VA_ARGS__)
-#endif
-
-#endif /* arch_xchg_relaxed */
-
-#ifndef arch_cmpxchg_relaxed
-#define arch_cmpxchg_acquire arch_cmpxchg
-#define arch_cmpxchg_release arch_cmpxchg
-#define arch_cmpxchg_relaxed arch_cmpxchg
-#else /* arch_cmpxchg_relaxed */
-
-#ifndef arch_cmpxchg_acquire
-#define arch_cmpxchg_acquire(...) \
- __atomic_op_acquire(arch_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef arch_cmpxchg_release
-#define arch_cmpxchg_release(...) \
- __atomic_op_release(arch_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef arch_cmpxchg
-#define arch_cmpxchg(...) \
- __atomic_op_fence(arch_cmpxchg, __VA_ARGS__)
-#endif
-
-#endif /* arch_cmpxchg_relaxed */
-
-#ifndef arch_cmpxchg64_relaxed
-#define arch_cmpxchg64_acquire arch_cmpxchg64
-#define arch_cmpxchg64_release arch_cmpxchg64
-#define arch_cmpxchg64_relaxed arch_cmpxchg64
-#else /* arch_cmpxchg64_relaxed */
-
-#ifndef arch_cmpxchg64_acquire
-#define arch_cmpxchg64_acquire(...) \
- __atomic_op_acquire(arch_cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef arch_cmpxchg64_release
-#define arch_cmpxchg64_release(...) \
- __atomic_op_release(arch_cmpxchg64, __VA_ARGS__)
-#endif
-
-#ifndef arch_cmpxchg64
-#define arch_cmpxchg64(...) \
- __atomic_op_fence(arch_cmpxchg64, __VA_ARGS__)
-#endif
-
-#endif /* arch_cmpxchg64_relaxed */
-
-#ifndef arch_try_cmpxchg_relaxed
-#ifdef arch_try_cmpxchg
-#define arch_try_cmpxchg_acquire arch_try_cmpxchg
-#define arch_try_cmpxchg_release arch_try_cmpxchg
-#define arch_try_cmpxchg_relaxed arch_try_cmpxchg
-#endif /* arch_try_cmpxchg */
-
-#ifndef arch_try_cmpxchg
-#define arch_try_cmpxchg(_ptr, _oldp, _new) \
-({ \
- typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
- ___r = arch_cmpxchg((_ptr), ___o, (_new)); \
- if (unlikely(___r != ___o)) \
- *___op = ___r; \
- likely(___r == ___o); \
-})
-#endif /* arch_try_cmpxchg */
-
-#ifndef arch_try_cmpxchg_acquire
-#define arch_try_cmpxchg_acquire(_ptr, _oldp, _new) \
-({ \
- typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
- ___r = arch_cmpxchg_acquire((_ptr), ___o, (_new)); \
- if (unlikely(___r != ___o)) \
- *___op = ___r; \
- likely(___r == ___o); \
-})
-#endif /* arch_try_cmpxchg_acquire */
-
-#ifndef arch_try_cmpxchg_release
-#define arch_try_cmpxchg_release(_ptr, _oldp, _new) \
-({ \
- typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
- ___r = arch_cmpxchg_release((_ptr), ___o, (_new)); \
- if (unlikely(___r != ___o)) \
- *___op = ___r; \
- likely(___r == ___o); \
-})
-#endif /* arch_try_cmpxchg_release */
-
-#ifndef arch_try_cmpxchg_relaxed
-#define arch_try_cmpxchg_relaxed(_ptr, _oldp, _new) \
-({ \
- typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
- ___r = arch_cmpxchg_relaxed((_ptr), ___o, (_new)); \
- if (unlikely(___r != ___o)) \
- *___op = ___r; \
- likely(___r == ___o); \
-})
-#endif /* arch_try_cmpxchg_relaxed */
-
-#else /* arch_try_cmpxchg_relaxed */
-
-#ifndef arch_try_cmpxchg_acquire
-#define arch_try_cmpxchg_acquire(...) \
- __atomic_op_acquire(arch_try_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef arch_try_cmpxchg_release
-#define arch_try_cmpxchg_release(...) \
- __atomic_op_release(arch_try_cmpxchg, __VA_ARGS__)
-#endif
-
-#ifndef arch_try_cmpxchg
-#define arch_try_cmpxchg(...) \
- __atomic_op_fence(arch_try_cmpxchg, __VA_ARGS__)
-#endif
-
-#endif /* arch_try_cmpxchg_relaxed */
-
-#ifndef arch_atomic_read_acquire
-static __always_inline int
-arch_atomic_read_acquire(const atomic_t *v)
-{
- return smp_load_acquire(&(v)->counter);
-}
-#define arch_atomic_read_acquire arch_atomic_read_acquire
-#endif
-
-#ifndef arch_atomic_set_release
-static __always_inline void
-arch_atomic_set_release(atomic_t *v, int i)
-{
- smp_store_release(&(v)->counter, i);
-}
-#define arch_atomic_set_release arch_atomic_set_release
-#endif
-
-#ifndef arch_atomic_add_return_relaxed
-#define arch_atomic_add_return_acquire arch_atomic_add_return
-#define arch_atomic_add_return_release arch_atomic_add_return
-#define arch_atomic_add_return_relaxed arch_atomic_add_return
-#else /* arch_atomic_add_return_relaxed */
-
-#ifndef arch_atomic_add_return_acquire
-static __always_inline int
-arch_atomic_add_return_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_add_return_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_add_return_acquire arch_atomic_add_return_acquire
-#endif
-
-#ifndef arch_atomic_add_return_release
-static __always_inline int
-arch_atomic_add_return_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_add_return_relaxed(i, v);
-}
-#define arch_atomic_add_return_release arch_atomic_add_return_release
-#endif
-
-#ifndef arch_atomic_add_return
-static __always_inline int
-arch_atomic_add_return(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_add_return_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_add_return arch_atomic_add_return
-#endif
-
-#endif /* arch_atomic_add_return_relaxed */
-
-#ifndef arch_atomic_fetch_add_relaxed
-#define arch_atomic_fetch_add_acquire arch_atomic_fetch_add
-#define arch_atomic_fetch_add_release arch_atomic_fetch_add
-#define arch_atomic_fetch_add_relaxed arch_atomic_fetch_add
-#else /* arch_atomic_fetch_add_relaxed */
-
-#ifndef arch_atomic_fetch_add_acquire
-static __always_inline int
-arch_atomic_fetch_add_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_fetch_add_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_add_acquire arch_atomic_fetch_add_acquire
-#endif
-
-#ifndef arch_atomic_fetch_add_release
-static __always_inline int
-arch_atomic_fetch_add_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_add_relaxed(i, v);
-}
-#define arch_atomic_fetch_add_release arch_atomic_fetch_add_release
-#endif
-
-#ifndef arch_atomic_fetch_add
-static __always_inline int
-arch_atomic_fetch_add(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_add_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_add arch_atomic_fetch_add
-#endif
-
-#endif /* arch_atomic_fetch_add_relaxed */
-
-#ifndef arch_atomic_sub_return_relaxed
-#define arch_atomic_sub_return_acquire arch_atomic_sub_return
-#define arch_atomic_sub_return_release arch_atomic_sub_return
-#define arch_atomic_sub_return_relaxed arch_atomic_sub_return
-#else /* arch_atomic_sub_return_relaxed */
-
-#ifndef arch_atomic_sub_return_acquire
-static __always_inline int
-arch_atomic_sub_return_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_sub_return_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_sub_return_acquire arch_atomic_sub_return_acquire
-#endif
-
-#ifndef arch_atomic_sub_return_release
-static __always_inline int
-arch_atomic_sub_return_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_sub_return_relaxed(i, v);
-}
-#define arch_atomic_sub_return_release arch_atomic_sub_return_release
-#endif
-
-#ifndef arch_atomic_sub_return
-static __always_inline int
-arch_atomic_sub_return(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_sub_return_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_sub_return arch_atomic_sub_return
-#endif
-
-#endif /* arch_atomic_sub_return_relaxed */
-
-#ifndef arch_atomic_fetch_sub_relaxed
-#define arch_atomic_fetch_sub_acquire arch_atomic_fetch_sub
-#define arch_atomic_fetch_sub_release arch_atomic_fetch_sub
-#define arch_atomic_fetch_sub_relaxed arch_atomic_fetch_sub
-#else /* arch_atomic_fetch_sub_relaxed */
-
-#ifndef arch_atomic_fetch_sub_acquire
-static __always_inline int
-arch_atomic_fetch_sub_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_fetch_sub_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_sub_acquire arch_atomic_fetch_sub_acquire
-#endif
-
-#ifndef arch_atomic_fetch_sub_release
-static __always_inline int
-arch_atomic_fetch_sub_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_sub_relaxed(i, v);
-}
-#define arch_atomic_fetch_sub_release arch_atomic_fetch_sub_release
-#endif
-
-#ifndef arch_atomic_fetch_sub
-static __always_inline int
-arch_atomic_fetch_sub(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_sub_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_sub arch_atomic_fetch_sub
-#endif
-
-#endif /* arch_atomic_fetch_sub_relaxed */
-
-#ifndef arch_atomic_inc
-static __always_inline void
-arch_atomic_inc(atomic_t *v)
-{
- arch_atomic_add(1, v);
-}
-#define arch_atomic_inc arch_atomic_inc
-#endif
-
-#ifndef arch_atomic_inc_return_relaxed
-#ifdef arch_atomic_inc_return
-#define arch_atomic_inc_return_acquire arch_atomic_inc_return
-#define arch_atomic_inc_return_release arch_atomic_inc_return
-#define arch_atomic_inc_return_relaxed arch_atomic_inc_return
-#endif /* arch_atomic_inc_return */
-
-#ifndef arch_atomic_inc_return
-static __always_inline int
-arch_atomic_inc_return(atomic_t *v)
-{
- return arch_atomic_add_return(1, v);
-}
-#define arch_atomic_inc_return arch_atomic_inc_return
-#endif
-
-#ifndef arch_atomic_inc_return_acquire
-static __always_inline int
-arch_atomic_inc_return_acquire(atomic_t *v)
-{
- return arch_atomic_add_return_acquire(1, v);
-}
-#define arch_atomic_inc_return_acquire arch_atomic_inc_return_acquire
-#endif
-
-#ifndef arch_atomic_inc_return_release
-static __always_inline int
-arch_atomic_inc_return_release(atomic_t *v)
-{
- return arch_atomic_add_return_release(1, v);
-}
-#define arch_atomic_inc_return_release arch_atomic_inc_return_release
-#endif
-
-#ifndef arch_atomic_inc_return_relaxed
-static __always_inline int
-arch_atomic_inc_return_relaxed(atomic_t *v)
-{
- return arch_atomic_add_return_relaxed(1, v);
-}
-#define arch_atomic_inc_return_relaxed arch_atomic_inc_return_relaxed
-#endif
-
-#else /* arch_atomic_inc_return_relaxed */
-
-#ifndef arch_atomic_inc_return_acquire
-static __always_inline int
-arch_atomic_inc_return_acquire(atomic_t *v)
-{
- int ret = arch_atomic_inc_return_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_inc_return_acquire arch_atomic_inc_return_acquire
-#endif
-
-#ifndef arch_atomic_inc_return_release
-static __always_inline int
-arch_atomic_inc_return_release(atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_inc_return_relaxed(v);
-}
-#define arch_atomic_inc_return_release arch_atomic_inc_return_release
-#endif
-
-#ifndef arch_atomic_inc_return
-static __always_inline int
-arch_atomic_inc_return(atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_inc_return_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_inc_return arch_atomic_inc_return
-#endif
-
-#endif /* arch_atomic_inc_return_relaxed */
-
-#ifndef arch_atomic_fetch_inc_relaxed
-#ifdef arch_atomic_fetch_inc
-#define arch_atomic_fetch_inc_acquire arch_atomic_fetch_inc
-#define arch_atomic_fetch_inc_release arch_atomic_fetch_inc
-#define arch_atomic_fetch_inc_relaxed arch_atomic_fetch_inc
-#endif /* arch_atomic_fetch_inc */
-
-#ifndef arch_atomic_fetch_inc
-static __always_inline int
-arch_atomic_fetch_inc(atomic_t *v)
-{
- return arch_atomic_fetch_add(1, v);
-}
-#define arch_atomic_fetch_inc arch_atomic_fetch_inc
-#endif
-
-#ifndef arch_atomic_fetch_inc_acquire
-static __always_inline int
-arch_atomic_fetch_inc_acquire(atomic_t *v)
-{
- return arch_atomic_fetch_add_acquire(1, v);
-}
-#define arch_atomic_fetch_inc_acquire arch_atomic_fetch_inc_acquire
-#endif
-
-#ifndef arch_atomic_fetch_inc_release
-static __always_inline int
-arch_atomic_fetch_inc_release(atomic_t *v)
-{
- return arch_atomic_fetch_add_release(1, v);
-}
-#define arch_atomic_fetch_inc_release arch_atomic_fetch_inc_release
-#endif
-
-#ifndef arch_atomic_fetch_inc_relaxed
-static __always_inline int
-arch_atomic_fetch_inc_relaxed(atomic_t *v)
-{
- return arch_atomic_fetch_add_relaxed(1, v);
-}
-#define arch_atomic_fetch_inc_relaxed arch_atomic_fetch_inc_relaxed
-#endif
-
-#else /* arch_atomic_fetch_inc_relaxed */
-
-#ifndef arch_atomic_fetch_inc_acquire
-static __always_inline int
-arch_atomic_fetch_inc_acquire(atomic_t *v)
-{
- int ret = arch_atomic_fetch_inc_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_inc_acquire arch_atomic_fetch_inc_acquire
-#endif
-
-#ifndef arch_atomic_fetch_inc_release
-static __always_inline int
-arch_atomic_fetch_inc_release(atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_inc_relaxed(v);
-}
-#define arch_atomic_fetch_inc_release arch_atomic_fetch_inc_release
-#endif
-
-#ifndef arch_atomic_fetch_inc
-static __always_inline int
-arch_atomic_fetch_inc(atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_inc_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_inc arch_atomic_fetch_inc
-#endif
-
-#endif /* arch_atomic_fetch_inc_relaxed */
-
-#ifndef arch_atomic_dec
-static __always_inline void
-arch_atomic_dec(atomic_t *v)
-{
- arch_atomic_sub(1, v);
-}
-#define arch_atomic_dec arch_atomic_dec
-#endif
-
-#ifndef arch_atomic_dec_return_relaxed
-#ifdef arch_atomic_dec_return
-#define arch_atomic_dec_return_acquire arch_atomic_dec_return
-#define arch_atomic_dec_return_release arch_atomic_dec_return
-#define arch_atomic_dec_return_relaxed arch_atomic_dec_return
-#endif /* arch_atomic_dec_return */
-
-#ifndef arch_atomic_dec_return
-static __always_inline int
-arch_atomic_dec_return(atomic_t *v)
-{
- return arch_atomic_sub_return(1, v);
-}
-#define arch_atomic_dec_return arch_atomic_dec_return
-#endif
-
-#ifndef arch_atomic_dec_return_acquire
-static __always_inline int
-arch_atomic_dec_return_acquire(atomic_t *v)
-{
- return arch_atomic_sub_return_acquire(1, v);
-}
-#define arch_atomic_dec_return_acquire arch_atomic_dec_return_acquire
-#endif
-
-#ifndef arch_atomic_dec_return_release
-static __always_inline int
-arch_atomic_dec_return_release(atomic_t *v)
-{
- return arch_atomic_sub_return_release(1, v);
-}
-#define arch_atomic_dec_return_release arch_atomic_dec_return_release
-#endif
-
-#ifndef arch_atomic_dec_return_relaxed
-static __always_inline int
-arch_atomic_dec_return_relaxed(atomic_t *v)
-{
- return arch_atomic_sub_return_relaxed(1, v);
-}
-#define arch_atomic_dec_return_relaxed arch_atomic_dec_return_relaxed
-#endif
-
-#else /* arch_atomic_dec_return_relaxed */
-
-#ifndef arch_atomic_dec_return_acquire
-static __always_inline int
-arch_atomic_dec_return_acquire(atomic_t *v)
-{
- int ret = arch_atomic_dec_return_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_dec_return_acquire arch_atomic_dec_return_acquire
-#endif
-
-#ifndef arch_atomic_dec_return_release
-static __always_inline int
-arch_atomic_dec_return_release(atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_dec_return_relaxed(v);
-}
-#define arch_atomic_dec_return_release arch_atomic_dec_return_release
-#endif
-
-#ifndef arch_atomic_dec_return
-static __always_inline int
-arch_atomic_dec_return(atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_dec_return_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_dec_return arch_atomic_dec_return
-#endif
-
-#endif /* arch_atomic_dec_return_relaxed */
-
-#ifndef arch_atomic_fetch_dec_relaxed
-#ifdef arch_atomic_fetch_dec
-#define arch_atomic_fetch_dec_acquire arch_atomic_fetch_dec
-#define arch_atomic_fetch_dec_release arch_atomic_fetch_dec
-#define arch_atomic_fetch_dec_relaxed arch_atomic_fetch_dec
-#endif /* arch_atomic_fetch_dec */
-
-#ifndef arch_atomic_fetch_dec
-static __always_inline int
-arch_atomic_fetch_dec(atomic_t *v)
-{
- return arch_atomic_fetch_sub(1, v);
-}
-#define arch_atomic_fetch_dec arch_atomic_fetch_dec
-#endif
-
-#ifndef arch_atomic_fetch_dec_acquire
-static __always_inline int
-arch_atomic_fetch_dec_acquire(atomic_t *v)
-{
- return arch_atomic_fetch_sub_acquire(1, v);
-}
-#define arch_atomic_fetch_dec_acquire arch_atomic_fetch_dec_acquire
-#endif
-
-#ifndef arch_atomic_fetch_dec_release
-static __always_inline int
-arch_atomic_fetch_dec_release(atomic_t *v)
-{
- return arch_atomic_fetch_sub_release(1, v);
-}
-#define arch_atomic_fetch_dec_release arch_atomic_fetch_dec_release
-#endif
-
-#ifndef arch_atomic_fetch_dec_relaxed
-static __always_inline int
-arch_atomic_fetch_dec_relaxed(atomic_t *v)
-{
- return arch_atomic_fetch_sub_relaxed(1, v);
-}
-#define arch_atomic_fetch_dec_relaxed arch_atomic_fetch_dec_relaxed
-#endif
-
-#else /* arch_atomic_fetch_dec_relaxed */
-
-#ifndef arch_atomic_fetch_dec_acquire
-static __always_inline int
-arch_atomic_fetch_dec_acquire(atomic_t *v)
-{
- int ret = arch_atomic_fetch_dec_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_dec_acquire arch_atomic_fetch_dec_acquire
-#endif
-
-#ifndef arch_atomic_fetch_dec_release
-static __always_inline int
-arch_atomic_fetch_dec_release(atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_dec_relaxed(v);
-}
-#define arch_atomic_fetch_dec_release arch_atomic_fetch_dec_release
-#endif
-
-#ifndef arch_atomic_fetch_dec
-static __always_inline int
-arch_atomic_fetch_dec(atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_dec_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_dec arch_atomic_fetch_dec
-#endif
-
-#endif /* arch_atomic_fetch_dec_relaxed */
-
-#ifndef arch_atomic_fetch_and_relaxed
-#define arch_atomic_fetch_and_acquire arch_atomic_fetch_and
-#define arch_atomic_fetch_and_release arch_atomic_fetch_and
-#define arch_atomic_fetch_and_relaxed arch_atomic_fetch_and
-#else /* arch_atomic_fetch_and_relaxed */
-
-#ifndef arch_atomic_fetch_and_acquire
-static __always_inline int
-arch_atomic_fetch_and_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_fetch_and_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_and_acquire arch_atomic_fetch_and_acquire
-#endif
-
-#ifndef arch_atomic_fetch_and_release
-static __always_inline int
-arch_atomic_fetch_and_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_and_relaxed(i, v);
-}
-#define arch_atomic_fetch_and_release arch_atomic_fetch_and_release
-#endif
-
-#ifndef arch_atomic_fetch_and
-static __always_inline int
-arch_atomic_fetch_and(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_and_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_and arch_atomic_fetch_and
-#endif
-
-#endif /* arch_atomic_fetch_and_relaxed */
-
-#ifndef arch_atomic_andnot
-static __always_inline void
-arch_atomic_andnot(int i, atomic_t *v)
-{
- arch_atomic_and(~i, v);
-}
-#define arch_atomic_andnot arch_atomic_andnot
-#endif
-
-#ifndef arch_atomic_fetch_andnot_relaxed
-#ifdef arch_atomic_fetch_andnot
-#define arch_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot
-#define arch_atomic_fetch_andnot_release arch_atomic_fetch_andnot
-#define arch_atomic_fetch_andnot_relaxed arch_atomic_fetch_andnot
-#endif /* arch_atomic_fetch_andnot */
-
-#ifndef arch_atomic_fetch_andnot
-static __always_inline int
-arch_atomic_fetch_andnot(int i, atomic_t *v)
-{
- return arch_atomic_fetch_and(~i, v);
-}
-#define arch_atomic_fetch_andnot arch_atomic_fetch_andnot
-#endif
-
-#ifndef arch_atomic_fetch_andnot_acquire
-static __always_inline int
-arch_atomic_fetch_andnot_acquire(int i, atomic_t *v)
-{
- return arch_atomic_fetch_and_acquire(~i, v);
-}
-#define arch_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire
-#endif
-
-#ifndef arch_atomic_fetch_andnot_release
-static __always_inline int
-arch_atomic_fetch_andnot_release(int i, atomic_t *v)
-{
- return arch_atomic_fetch_and_release(~i, v);
-}
-#define arch_atomic_fetch_andnot_release arch_atomic_fetch_andnot_release
-#endif
-
-#ifndef arch_atomic_fetch_andnot_relaxed
-static __always_inline int
-arch_atomic_fetch_andnot_relaxed(int i, atomic_t *v)
-{
- return arch_atomic_fetch_and_relaxed(~i, v);
-}
-#define arch_atomic_fetch_andnot_relaxed arch_atomic_fetch_andnot_relaxed
-#endif
-
-#else /* arch_atomic_fetch_andnot_relaxed */
-
-#ifndef arch_atomic_fetch_andnot_acquire
-static __always_inline int
-arch_atomic_fetch_andnot_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_fetch_andnot_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire
-#endif
-
-#ifndef arch_atomic_fetch_andnot_release
-static __always_inline int
-arch_atomic_fetch_andnot_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_andnot_relaxed(i, v);
-}
-#define arch_atomic_fetch_andnot_release arch_atomic_fetch_andnot_release
-#endif
-
-#ifndef arch_atomic_fetch_andnot
-static __always_inline int
-arch_atomic_fetch_andnot(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_andnot_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_andnot arch_atomic_fetch_andnot
-#endif
-
-#endif /* arch_atomic_fetch_andnot_relaxed */
-
-#ifndef arch_atomic_fetch_or_relaxed
-#define arch_atomic_fetch_or_acquire arch_atomic_fetch_or
-#define arch_atomic_fetch_or_release arch_atomic_fetch_or
-#define arch_atomic_fetch_or_relaxed arch_atomic_fetch_or
-#else /* arch_atomic_fetch_or_relaxed */
-
-#ifndef arch_atomic_fetch_or_acquire
-static __always_inline int
-arch_atomic_fetch_or_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_fetch_or_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_or_acquire arch_atomic_fetch_or_acquire
-#endif
-
-#ifndef arch_atomic_fetch_or_release
-static __always_inline int
-arch_atomic_fetch_or_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_or_relaxed(i, v);
-}
-#define arch_atomic_fetch_or_release arch_atomic_fetch_or_release
-#endif
-
-#ifndef arch_atomic_fetch_or
-static __always_inline int
-arch_atomic_fetch_or(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_or_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_or arch_atomic_fetch_or
-#endif
-
-#endif /* arch_atomic_fetch_or_relaxed */
-
-#ifndef arch_atomic_fetch_xor_relaxed
-#define arch_atomic_fetch_xor_acquire arch_atomic_fetch_xor
-#define arch_atomic_fetch_xor_release arch_atomic_fetch_xor
-#define arch_atomic_fetch_xor_relaxed arch_atomic_fetch_xor
-#else /* arch_atomic_fetch_xor_relaxed */
-
-#ifndef arch_atomic_fetch_xor_acquire
-static __always_inline int
-arch_atomic_fetch_xor_acquire(int i, atomic_t *v)
-{
- int ret = arch_atomic_fetch_xor_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_fetch_xor_acquire arch_atomic_fetch_xor_acquire
-#endif
-
-#ifndef arch_atomic_fetch_xor_release
-static __always_inline int
-arch_atomic_fetch_xor_release(int i, atomic_t *v)
-{
- __atomic_release_fence();
- return arch_atomic_fetch_xor_relaxed(i, v);
-}
-#define arch_atomic_fetch_xor_release arch_atomic_fetch_xor_release
-#endif
-
-#ifndef arch_atomic_fetch_xor
-static __always_inline int
-arch_atomic_fetch_xor(int i, atomic_t *v)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_fetch_xor_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_fetch_xor arch_atomic_fetch_xor
-#endif
-
-#endif /* arch_atomic_fetch_xor_relaxed */
-
-#ifndef arch_atomic_xchg_relaxed
-#define arch_atomic_xchg_acquire arch_atomic_xchg
-#define arch_atomic_xchg_release arch_atomic_xchg
-#define arch_atomic_xchg_relaxed arch_atomic_xchg
-#else /* arch_atomic_xchg_relaxed */
-
-#ifndef arch_atomic_xchg_acquire
-static __always_inline int
-arch_atomic_xchg_acquire(atomic_t *v, int i)
-{
- int ret = arch_atomic_xchg_relaxed(v, i);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_xchg_acquire arch_atomic_xchg_acquire
-#endif
-
-#ifndef arch_atomic_xchg_release
-static __always_inline int
-arch_atomic_xchg_release(atomic_t *v, int i)
-{
- __atomic_release_fence();
- return arch_atomic_xchg_relaxed(v, i);
-}
-#define arch_atomic_xchg_release arch_atomic_xchg_release
-#endif
-
-#ifndef arch_atomic_xchg
-static __always_inline int
-arch_atomic_xchg(atomic_t *v, int i)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_xchg_relaxed(v, i);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_xchg arch_atomic_xchg
-#endif
-
-#endif /* arch_atomic_xchg_relaxed */
-
-#ifndef arch_atomic_cmpxchg_relaxed
-#define arch_atomic_cmpxchg_acquire arch_atomic_cmpxchg
-#define arch_atomic_cmpxchg_release arch_atomic_cmpxchg
-#define arch_atomic_cmpxchg_relaxed arch_atomic_cmpxchg
-#else /* arch_atomic_cmpxchg_relaxed */
-
-#ifndef arch_atomic_cmpxchg_acquire
-static __always_inline int
-arch_atomic_cmpxchg_acquire(atomic_t *v, int old, int new)
-{
- int ret = arch_atomic_cmpxchg_relaxed(v, old, new);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_cmpxchg_acquire arch_atomic_cmpxchg_acquire
-#endif
-
-#ifndef arch_atomic_cmpxchg_release
-static __always_inline int
-arch_atomic_cmpxchg_release(atomic_t *v, int old, int new)
-{
- __atomic_release_fence();
- return arch_atomic_cmpxchg_relaxed(v, old, new);
-}
-#define arch_atomic_cmpxchg_release arch_atomic_cmpxchg_release
-#endif
-
-#ifndef arch_atomic_cmpxchg
-static __always_inline int
-arch_atomic_cmpxchg(atomic_t *v, int old, int new)
-{
- int ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_cmpxchg_relaxed(v, old, new);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_cmpxchg arch_atomic_cmpxchg
-#endif
-
-#endif /* arch_atomic_cmpxchg_relaxed */
-
-#ifndef arch_atomic_try_cmpxchg_relaxed
-#ifdef arch_atomic_try_cmpxchg
-#define arch_atomic_try_cmpxchg_acquire arch_atomic_try_cmpxchg
-#define arch_atomic_try_cmpxchg_release arch_atomic_try_cmpxchg
-#define arch_atomic_try_cmpxchg_relaxed arch_atomic_try_cmpxchg
-#endif /* arch_atomic_try_cmpxchg */
-
-#ifndef arch_atomic_try_cmpxchg
-static __always_inline bool
-arch_atomic_try_cmpxchg(atomic_t *v, int *old, int new)
-{
- int r, o = *old;
- r = arch_atomic_cmpxchg(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic_try_cmpxchg arch_atomic_try_cmpxchg
-#endif
-
-#ifndef arch_atomic_try_cmpxchg_acquire
-static __always_inline bool
-arch_atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new)
-{
- int r, o = *old;
- r = arch_atomic_cmpxchg_acquire(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic_try_cmpxchg_acquire arch_atomic_try_cmpxchg_acquire
-#endif
-
-#ifndef arch_atomic_try_cmpxchg_release
-static __always_inline bool
-arch_atomic_try_cmpxchg_release(atomic_t *v, int *old, int new)
-{
- int r, o = *old;
- r = arch_atomic_cmpxchg_release(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic_try_cmpxchg_release arch_atomic_try_cmpxchg_release
-#endif
-
-#ifndef arch_atomic_try_cmpxchg_relaxed
-static __always_inline bool
-arch_atomic_try_cmpxchg_relaxed(atomic_t *v, int *old, int new)
-{
- int r, o = *old;
- r = arch_atomic_cmpxchg_relaxed(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic_try_cmpxchg_relaxed arch_atomic_try_cmpxchg_relaxed
-#endif
-
-#else /* arch_atomic_try_cmpxchg_relaxed */
-
-#ifndef arch_atomic_try_cmpxchg_acquire
-static __always_inline bool
-arch_atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new)
-{
- bool ret = arch_atomic_try_cmpxchg_relaxed(v, old, new);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic_try_cmpxchg_acquire arch_atomic_try_cmpxchg_acquire
-#endif
-
-#ifndef arch_atomic_try_cmpxchg_release
-static __always_inline bool
-arch_atomic_try_cmpxchg_release(atomic_t *v, int *old, int new)
-{
- __atomic_release_fence();
- return arch_atomic_try_cmpxchg_relaxed(v, old, new);
-}
-#define arch_atomic_try_cmpxchg_release arch_atomic_try_cmpxchg_release
-#endif
-
-#ifndef arch_atomic_try_cmpxchg
-static __always_inline bool
-arch_atomic_try_cmpxchg(atomic_t *v, int *old, int new)
-{
- bool ret;
- __atomic_pre_full_fence();
- ret = arch_atomic_try_cmpxchg_relaxed(v, old, new);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic_try_cmpxchg arch_atomic_try_cmpxchg
-#endif
-
-#endif /* arch_atomic_try_cmpxchg_relaxed */
-
-#ifndef arch_atomic_sub_and_test
-/**
- * arch_atomic_sub_and_test - subtract value from variable and test result
- * @i: integer value to subtract
- * @v: pointer of type atomic_t
- *
- * Atomically subtracts @i from @v and returns
- * true if the result is zero, or false for all
- * other cases.
- */
-static __always_inline bool
-arch_atomic_sub_and_test(int i, atomic_t *v)
-{
- return arch_atomic_sub_return(i, v) == 0;
-}
-#define arch_atomic_sub_and_test arch_atomic_sub_and_test
-#endif
-
-#ifndef arch_atomic_dec_and_test
-/**
- * arch_atomic_dec_and_test - decrement and test
- * @v: pointer of type atomic_t
- *
- * Atomically decrements @v by 1 and
- * returns true if the result is 0, or false for all other
- * cases.
- */
-static __always_inline bool
-arch_atomic_dec_and_test(atomic_t *v)
-{
- return arch_atomic_dec_return(v) == 0;
-}
-#define arch_atomic_dec_and_test arch_atomic_dec_and_test
-#endif
-
-#ifndef arch_atomic_inc_and_test
-/**
- * arch_atomic_inc_and_test - increment and test
- * @v: pointer of type atomic_t
- *
- * Atomically increments @v by 1
- * and returns true if the result is zero, or false for all
- * other cases.
- */
-static __always_inline bool
-arch_atomic_inc_and_test(atomic_t *v)
-{
- return arch_atomic_inc_return(v) == 0;
-}
-#define arch_atomic_inc_and_test arch_atomic_inc_and_test
-#endif
-
-#ifndef arch_atomic_add_negative
-/**
- * arch_atomic_add_negative - add and test if negative
- * @i: integer value to add
- * @v: pointer of type atomic_t
- *
- * Atomically adds @i to @v and returns true
- * if the result is negative, or false when
- * result is greater than or equal to zero.
- */
-static __always_inline bool
-arch_atomic_add_negative(int i, atomic_t *v)
-{
- return arch_atomic_add_return(i, v) < 0;
-}
-#define arch_atomic_add_negative arch_atomic_add_negative
-#endif
-
-#ifndef arch_atomic_fetch_add_unless
-/**
- * arch_atomic_fetch_add_unless - add unless the number is already a given value
- * @v: pointer of type atomic_t
- * @a: the amount to add to v...
- * @u: ...unless v is equal to u.
- *
- * Atomically adds @a to @v, so long as @v was not already @u.
- * Returns original value of @v
- */
-static __always_inline int
-arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
-{
- int c = arch_atomic_read(v);
-
- do {
- if (unlikely(c == u))
- break;
- } while (!arch_atomic_try_cmpxchg(v, &c, c + a));
-
- return c;
-}
-#define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
-#endif
-
-#ifndef arch_atomic_add_unless
-/**
- * arch_atomic_add_unless - add unless the number is already a given value
- * @v: pointer of type atomic_t
- * @a: the amount to add to v...
- * @u: ...unless v is equal to u.
- *
- * Atomically adds @a to @v, if @v was not already @u.
- * Returns true if the addition was done.
- */
-static __always_inline bool
-arch_atomic_add_unless(atomic_t *v, int a, int u)
-{
- return arch_atomic_fetch_add_unless(v, a, u) != u;
-}
-#define arch_atomic_add_unless arch_atomic_add_unless
-#endif
-
-#ifndef arch_atomic_inc_not_zero
-/**
- * arch_atomic_inc_not_zero - increment unless the number is zero
- * @v: pointer of type atomic_t
- *
- * Atomically increments @v by 1, if @v is non-zero.
- * Returns true if the increment was done.
- */
-static __always_inline bool
-arch_atomic_inc_not_zero(atomic_t *v)
-{
- return arch_atomic_add_unless(v, 1, 0);
-}
-#define arch_atomic_inc_not_zero arch_atomic_inc_not_zero
-#endif
-
-#ifndef arch_atomic_inc_unless_negative
-static __always_inline bool
-arch_atomic_inc_unless_negative(atomic_t *v)
-{
- int c = arch_atomic_read(v);
-
- do {
- if (unlikely(c < 0))
- return false;
- } while (!arch_atomic_try_cmpxchg(v, &c, c + 1));
-
- return true;
-}
-#define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
-#endif
-
-#ifndef arch_atomic_dec_unless_positive
-static __always_inline bool
-arch_atomic_dec_unless_positive(atomic_t *v)
-{
- int c = arch_atomic_read(v);
-
- do {
- if (unlikely(c > 0))
- return false;
- } while (!arch_atomic_try_cmpxchg(v, &c, c - 1));
-
- return true;
-}
-#define arch_atomic_dec_unless_positive arch_atomic_dec_unless_positive
-#endif
-
-#ifndef arch_atomic_dec_if_positive
-static __always_inline int
-arch_atomic_dec_if_positive(atomic_t *v)
-{
- int dec, c = arch_atomic_read(v);
-
- do {
- dec = c - 1;
- if (unlikely(dec < 0))
- break;
- } while (!arch_atomic_try_cmpxchg(v, &c, dec));
-
- return dec;
-}
-#define arch_atomic_dec_if_positive arch_atomic_dec_if_positive
-#endif
-
-#ifdef CONFIG_GENERIC_ATOMIC64
-#include <asm-generic/atomic64.h>
-#endif
-
-#ifndef arch_atomic64_read_acquire
-static __always_inline s64
-arch_atomic64_read_acquire(const atomic64_t *v)
-{
- return smp_load_acquire(&(v)->counter);
-}
-#define arch_atomic64_read_acquire arch_atomic64_read_acquire
-#endif
-
-#ifndef arch_atomic64_set_release
-static __always_inline void
-arch_atomic64_set_release(atomic64_t *v, s64 i)
-{
- smp_store_release(&(v)->counter, i);
-}
-#define arch_atomic64_set_release arch_atomic64_set_release
-#endif
-
-#ifndef arch_atomic64_add_return_relaxed
-#define arch_atomic64_add_return_acquire arch_atomic64_add_return
-#define arch_atomic64_add_return_release arch_atomic64_add_return
-#define arch_atomic64_add_return_relaxed arch_atomic64_add_return
-#else /* arch_atomic64_add_return_relaxed */
-
-#ifndef arch_atomic64_add_return_acquire
-static __always_inline s64
-arch_atomic64_add_return_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_add_return_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_add_return_acquire arch_atomic64_add_return_acquire
-#endif
-
-#ifndef arch_atomic64_add_return_release
-static __always_inline s64
-arch_atomic64_add_return_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_add_return_relaxed(i, v);
-}
-#define arch_atomic64_add_return_release arch_atomic64_add_return_release
-#endif
-
-#ifndef arch_atomic64_add_return
-static __always_inline s64
-arch_atomic64_add_return(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_add_return_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_add_return arch_atomic64_add_return
-#endif
-
-#endif /* arch_atomic64_add_return_relaxed */
-
-#ifndef arch_atomic64_fetch_add_relaxed
-#define arch_atomic64_fetch_add_acquire arch_atomic64_fetch_add
-#define arch_atomic64_fetch_add_release arch_atomic64_fetch_add
-#define arch_atomic64_fetch_add_relaxed arch_atomic64_fetch_add
-#else /* arch_atomic64_fetch_add_relaxed */
-
-#ifndef arch_atomic64_fetch_add_acquire
-static __always_inline s64
-arch_atomic64_fetch_add_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_add_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_add_acquire arch_atomic64_fetch_add_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_add_release
-static __always_inline s64
-arch_atomic64_fetch_add_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_add_relaxed(i, v);
-}
-#define arch_atomic64_fetch_add_release arch_atomic64_fetch_add_release
-#endif
-
-#ifndef arch_atomic64_fetch_add
-static __always_inline s64
-arch_atomic64_fetch_add(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_add_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_add arch_atomic64_fetch_add
-#endif
-
-#endif /* arch_atomic64_fetch_add_relaxed */
-
-#ifndef arch_atomic64_sub_return_relaxed
-#define arch_atomic64_sub_return_acquire arch_atomic64_sub_return
-#define arch_atomic64_sub_return_release arch_atomic64_sub_return
-#define arch_atomic64_sub_return_relaxed arch_atomic64_sub_return
-#else /* arch_atomic64_sub_return_relaxed */
-
-#ifndef arch_atomic64_sub_return_acquire
-static __always_inline s64
-arch_atomic64_sub_return_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_sub_return_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_sub_return_acquire arch_atomic64_sub_return_acquire
-#endif
-
-#ifndef arch_atomic64_sub_return_release
-static __always_inline s64
-arch_atomic64_sub_return_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_sub_return_relaxed(i, v);
-}
-#define arch_atomic64_sub_return_release arch_atomic64_sub_return_release
-#endif
-
-#ifndef arch_atomic64_sub_return
-static __always_inline s64
-arch_atomic64_sub_return(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_sub_return_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_sub_return arch_atomic64_sub_return
-#endif
-
-#endif /* arch_atomic64_sub_return_relaxed */
-
-#ifndef arch_atomic64_fetch_sub_relaxed
-#define arch_atomic64_fetch_sub_acquire arch_atomic64_fetch_sub
-#define arch_atomic64_fetch_sub_release arch_atomic64_fetch_sub
-#define arch_atomic64_fetch_sub_relaxed arch_atomic64_fetch_sub
-#else /* arch_atomic64_fetch_sub_relaxed */
-
-#ifndef arch_atomic64_fetch_sub_acquire
-static __always_inline s64
-arch_atomic64_fetch_sub_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_sub_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_sub_acquire arch_atomic64_fetch_sub_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_sub_release
-static __always_inline s64
-arch_atomic64_fetch_sub_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_sub_relaxed(i, v);
-}
-#define arch_atomic64_fetch_sub_release arch_atomic64_fetch_sub_release
-#endif
-
-#ifndef arch_atomic64_fetch_sub
-static __always_inline s64
-arch_atomic64_fetch_sub(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_sub_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_sub arch_atomic64_fetch_sub
-#endif
-
-#endif /* arch_atomic64_fetch_sub_relaxed */
-
-#ifndef arch_atomic64_inc
-static __always_inline void
-arch_atomic64_inc(atomic64_t *v)
-{
- arch_atomic64_add(1, v);
-}
-#define arch_atomic64_inc arch_atomic64_inc
-#endif
-
-#ifndef arch_atomic64_inc_return_relaxed
-#ifdef arch_atomic64_inc_return
-#define arch_atomic64_inc_return_acquire arch_atomic64_inc_return
-#define arch_atomic64_inc_return_release arch_atomic64_inc_return
-#define arch_atomic64_inc_return_relaxed arch_atomic64_inc_return
-#endif /* arch_atomic64_inc_return */
-
-#ifndef arch_atomic64_inc_return
-static __always_inline s64
-arch_atomic64_inc_return(atomic64_t *v)
-{
- return arch_atomic64_add_return(1, v);
-}
-#define arch_atomic64_inc_return arch_atomic64_inc_return
-#endif
-
-#ifndef arch_atomic64_inc_return_acquire
-static __always_inline s64
-arch_atomic64_inc_return_acquire(atomic64_t *v)
-{
- return arch_atomic64_add_return_acquire(1, v);
-}
-#define arch_atomic64_inc_return_acquire arch_atomic64_inc_return_acquire
-#endif
-
-#ifndef arch_atomic64_inc_return_release
-static __always_inline s64
-arch_atomic64_inc_return_release(atomic64_t *v)
-{
- return arch_atomic64_add_return_release(1, v);
-}
-#define arch_atomic64_inc_return_release arch_atomic64_inc_return_release
-#endif
-
-#ifndef arch_atomic64_inc_return_relaxed
-static __always_inline s64
-arch_atomic64_inc_return_relaxed(atomic64_t *v)
-{
- return arch_atomic64_add_return_relaxed(1, v);
-}
-#define arch_atomic64_inc_return_relaxed arch_atomic64_inc_return_relaxed
-#endif
-
-#else /* arch_atomic64_inc_return_relaxed */
-
-#ifndef arch_atomic64_inc_return_acquire
-static __always_inline s64
-arch_atomic64_inc_return_acquire(atomic64_t *v)
-{
- s64 ret = arch_atomic64_inc_return_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_inc_return_acquire arch_atomic64_inc_return_acquire
-#endif
-
-#ifndef arch_atomic64_inc_return_release
-static __always_inline s64
-arch_atomic64_inc_return_release(atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_inc_return_relaxed(v);
-}
-#define arch_atomic64_inc_return_release arch_atomic64_inc_return_release
-#endif
-
-#ifndef arch_atomic64_inc_return
-static __always_inline s64
-arch_atomic64_inc_return(atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_inc_return_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_inc_return arch_atomic64_inc_return
-#endif
-
-#endif /* arch_atomic64_inc_return_relaxed */
-
-#ifndef arch_atomic64_fetch_inc_relaxed
-#ifdef arch_atomic64_fetch_inc
-#define arch_atomic64_fetch_inc_acquire arch_atomic64_fetch_inc
-#define arch_atomic64_fetch_inc_release arch_atomic64_fetch_inc
-#define arch_atomic64_fetch_inc_relaxed arch_atomic64_fetch_inc
-#endif /* arch_atomic64_fetch_inc */
-
-#ifndef arch_atomic64_fetch_inc
-static __always_inline s64
-arch_atomic64_fetch_inc(atomic64_t *v)
-{
- return arch_atomic64_fetch_add(1, v);
-}
-#define arch_atomic64_fetch_inc arch_atomic64_fetch_inc
-#endif
-
-#ifndef arch_atomic64_fetch_inc_acquire
-static __always_inline s64
-arch_atomic64_fetch_inc_acquire(atomic64_t *v)
-{
- return arch_atomic64_fetch_add_acquire(1, v);
-}
-#define arch_atomic64_fetch_inc_acquire arch_atomic64_fetch_inc_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_inc_release
-static __always_inline s64
-arch_atomic64_fetch_inc_release(atomic64_t *v)
-{
- return arch_atomic64_fetch_add_release(1, v);
-}
-#define arch_atomic64_fetch_inc_release arch_atomic64_fetch_inc_release
-#endif
-
-#ifndef arch_atomic64_fetch_inc_relaxed
-static __always_inline s64
-arch_atomic64_fetch_inc_relaxed(atomic64_t *v)
-{
- return arch_atomic64_fetch_add_relaxed(1, v);
-}
-#define arch_atomic64_fetch_inc_relaxed arch_atomic64_fetch_inc_relaxed
-#endif
-
-#else /* arch_atomic64_fetch_inc_relaxed */
-
-#ifndef arch_atomic64_fetch_inc_acquire
-static __always_inline s64
-arch_atomic64_fetch_inc_acquire(atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_inc_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_inc_acquire arch_atomic64_fetch_inc_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_inc_release
-static __always_inline s64
-arch_atomic64_fetch_inc_release(atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_inc_relaxed(v);
-}
-#define arch_atomic64_fetch_inc_release arch_atomic64_fetch_inc_release
-#endif
-
-#ifndef arch_atomic64_fetch_inc
-static __always_inline s64
-arch_atomic64_fetch_inc(atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_inc_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_inc arch_atomic64_fetch_inc
-#endif
-
-#endif /* arch_atomic64_fetch_inc_relaxed */
-
-#ifndef arch_atomic64_dec
-static __always_inline void
-arch_atomic64_dec(atomic64_t *v)
-{
- arch_atomic64_sub(1, v);
-}
-#define arch_atomic64_dec arch_atomic64_dec
-#endif
-
-#ifndef arch_atomic64_dec_return_relaxed
-#ifdef arch_atomic64_dec_return
-#define arch_atomic64_dec_return_acquire arch_atomic64_dec_return
-#define arch_atomic64_dec_return_release arch_atomic64_dec_return
-#define arch_atomic64_dec_return_relaxed arch_atomic64_dec_return
-#endif /* arch_atomic64_dec_return */
-
-#ifndef arch_atomic64_dec_return
-static __always_inline s64
-arch_atomic64_dec_return(atomic64_t *v)
-{
- return arch_atomic64_sub_return(1, v);
-}
-#define arch_atomic64_dec_return arch_atomic64_dec_return
-#endif
-
-#ifndef arch_atomic64_dec_return_acquire
-static __always_inline s64
-arch_atomic64_dec_return_acquire(atomic64_t *v)
-{
- return arch_atomic64_sub_return_acquire(1, v);
-}
-#define arch_atomic64_dec_return_acquire arch_atomic64_dec_return_acquire
-#endif
-
-#ifndef arch_atomic64_dec_return_release
-static __always_inline s64
-arch_atomic64_dec_return_release(atomic64_t *v)
-{
- return arch_atomic64_sub_return_release(1, v);
-}
-#define arch_atomic64_dec_return_release arch_atomic64_dec_return_release
-#endif
-
-#ifndef arch_atomic64_dec_return_relaxed
-static __always_inline s64
-arch_atomic64_dec_return_relaxed(atomic64_t *v)
-{
- return arch_atomic64_sub_return_relaxed(1, v);
-}
-#define arch_atomic64_dec_return_relaxed arch_atomic64_dec_return_relaxed
-#endif
-
-#else /* arch_atomic64_dec_return_relaxed */
-
-#ifndef arch_atomic64_dec_return_acquire
-static __always_inline s64
-arch_atomic64_dec_return_acquire(atomic64_t *v)
-{
- s64 ret = arch_atomic64_dec_return_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_dec_return_acquire arch_atomic64_dec_return_acquire
-#endif
-
-#ifndef arch_atomic64_dec_return_release
-static __always_inline s64
-arch_atomic64_dec_return_release(atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_dec_return_relaxed(v);
-}
-#define arch_atomic64_dec_return_release arch_atomic64_dec_return_release
-#endif
-
-#ifndef arch_atomic64_dec_return
-static __always_inline s64
-arch_atomic64_dec_return(atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_dec_return_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_dec_return arch_atomic64_dec_return
-#endif
-
-#endif /* arch_atomic64_dec_return_relaxed */
-
-#ifndef arch_atomic64_fetch_dec_relaxed
-#ifdef arch_atomic64_fetch_dec
-#define arch_atomic64_fetch_dec_acquire arch_atomic64_fetch_dec
-#define arch_atomic64_fetch_dec_release arch_atomic64_fetch_dec
-#define arch_atomic64_fetch_dec_relaxed arch_atomic64_fetch_dec
-#endif /* arch_atomic64_fetch_dec */
-
-#ifndef arch_atomic64_fetch_dec
-static __always_inline s64
-arch_atomic64_fetch_dec(atomic64_t *v)
-{
- return arch_atomic64_fetch_sub(1, v);
-}
-#define arch_atomic64_fetch_dec arch_atomic64_fetch_dec
-#endif
-
-#ifndef arch_atomic64_fetch_dec_acquire
-static __always_inline s64
-arch_atomic64_fetch_dec_acquire(atomic64_t *v)
-{
- return arch_atomic64_fetch_sub_acquire(1, v);
-}
-#define arch_atomic64_fetch_dec_acquire arch_atomic64_fetch_dec_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_dec_release
-static __always_inline s64
-arch_atomic64_fetch_dec_release(atomic64_t *v)
-{
- return arch_atomic64_fetch_sub_release(1, v);
-}
-#define arch_atomic64_fetch_dec_release arch_atomic64_fetch_dec_release
-#endif
-
-#ifndef arch_atomic64_fetch_dec_relaxed
-static __always_inline s64
-arch_atomic64_fetch_dec_relaxed(atomic64_t *v)
-{
- return arch_atomic64_fetch_sub_relaxed(1, v);
-}
-#define arch_atomic64_fetch_dec_relaxed arch_atomic64_fetch_dec_relaxed
-#endif
-
-#else /* arch_atomic64_fetch_dec_relaxed */
-
-#ifndef arch_atomic64_fetch_dec_acquire
-static __always_inline s64
-arch_atomic64_fetch_dec_acquire(atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_dec_relaxed(v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_dec_acquire arch_atomic64_fetch_dec_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_dec_release
-static __always_inline s64
-arch_atomic64_fetch_dec_release(atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_dec_relaxed(v);
-}
-#define arch_atomic64_fetch_dec_release arch_atomic64_fetch_dec_release
-#endif
-
-#ifndef arch_atomic64_fetch_dec
-static __always_inline s64
-arch_atomic64_fetch_dec(atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_dec_relaxed(v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_dec arch_atomic64_fetch_dec
-#endif
-
-#endif /* arch_atomic64_fetch_dec_relaxed */
-
-#ifndef arch_atomic64_fetch_and_relaxed
-#define arch_atomic64_fetch_and_acquire arch_atomic64_fetch_and
-#define arch_atomic64_fetch_and_release arch_atomic64_fetch_and
-#define arch_atomic64_fetch_and_relaxed arch_atomic64_fetch_and
-#else /* arch_atomic64_fetch_and_relaxed */
-
-#ifndef arch_atomic64_fetch_and_acquire
-static __always_inline s64
-arch_atomic64_fetch_and_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_and_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_and_acquire arch_atomic64_fetch_and_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_and_release
-static __always_inline s64
-arch_atomic64_fetch_and_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_and_relaxed(i, v);
-}
-#define arch_atomic64_fetch_and_release arch_atomic64_fetch_and_release
-#endif
-
-#ifndef arch_atomic64_fetch_and
-static __always_inline s64
-arch_atomic64_fetch_and(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_and_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_and arch_atomic64_fetch_and
-#endif
-
-#endif /* arch_atomic64_fetch_and_relaxed */
-
-#ifndef arch_atomic64_andnot
-static __always_inline void
-arch_atomic64_andnot(s64 i, atomic64_t *v)
-{
- arch_atomic64_and(~i, v);
-}
-#define arch_atomic64_andnot arch_atomic64_andnot
-#endif
-
-#ifndef arch_atomic64_fetch_andnot_relaxed
-#ifdef arch_atomic64_fetch_andnot
-#define arch_atomic64_fetch_andnot_acquire arch_atomic64_fetch_andnot
-#define arch_atomic64_fetch_andnot_release arch_atomic64_fetch_andnot
-#define arch_atomic64_fetch_andnot_relaxed arch_atomic64_fetch_andnot
-#endif /* arch_atomic64_fetch_andnot */
-
-#ifndef arch_atomic64_fetch_andnot
-static __always_inline s64
-arch_atomic64_fetch_andnot(s64 i, atomic64_t *v)
-{
- return arch_atomic64_fetch_and(~i, v);
-}
-#define arch_atomic64_fetch_andnot arch_atomic64_fetch_andnot
-#endif
-
-#ifndef arch_atomic64_fetch_andnot_acquire
-static __always_inline s64
-arch_atomic64_fetch_andnot_acquire(s64 i, atomic64_t *v)
-{
- return arch_atomic64_fetch_and_acquire(~i, v);
-}
-#define arch_atomic64_fetch_andnot_acquire arch_atomic64_fetch_andnot_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_andnot_release
-static __always_inline s64
-arch_atomic64_fetch_andnot_release(s64 i, atomic64_t *v)
-{
- return arch_atomic64_fetch_and_release(~i, v);
-}
-#define arch_atomic64_fetch_andnot_release arch_atomic64_fetch_andnot_release
-#endif
-
-#ifndef arch_atomic64_fetch_andnot_relaxed
-static __always_inline s64
-arch_atomic64_fetch_andnot_relaxed(s64 i, atomic64_t *v)
-{
- return arch_atomic64_fetch_and_relaxed(~i, v);
-}
-#define arch_atomic64_fetch_andnot_relaxed arch_atomic64_fetch_andnot_relaxed
-#endif
-
-#else /* arch_atomic64_fetch_andnot_relaxed */
-
-#ifndef arch_atomic64_fetch_andnot_acquire
-static __always_inline s64
-arch_atomic64_fetch_andnot_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_andnot_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_andnot_acquire arch_atomic64_fetch_andnot_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_andnot_release
-static __always_inline s64
-arch_atomic64_fetch_andnot_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_andnot_relaxed(i, v);
-}
-#define arch_atomic64_fetch_andnot_release arch_atomic64_fetch_andnot_release
-#endif
-
-#ifndef arch_atomic64_fetch_andnot
-static __always_inline s64
-arch_atomic64_fetch_andnot(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_andnot_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_andnot arch_atomic64_fetch_andnot
-#endif
-
-#endif /* arch_atomic64_fetch_andnot_relaxed */
-
-#ifndef arch_atomic64_fetch_or_relaxed
-#define arch_atomic64_fetch_or_acquire arch_atomic64_fetch_or
-#define arch_atomic64_fetch_or_release arch_atomic64_fetch_or
-#define arch_atomic64_fetch_or_relaxed arch_atomic64_fetch_or
-#else /* arch_atomic64_fetch_or_relaxed */
-
-#ifndef arch_atomic64_fetch_or_acquire
-static __always_inline s64
-arch_atomic64_fetch_or_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_or_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_or_acquire arch_atomic64_fetch_or_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_or_release
-static __always_inline s64
-arch_atomic64_fetch_or_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_or_relaxed(i, v);
-}
-#define arch_atomic64_fetch_or_release arch_atomic64_fetch_or_release
-#endif
-
-#ifndef arch_atomic64_fetch_or
-static __always_inline s64
-arch_atomic64_fetch_or(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_or_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_or arch_atomic64_fetch_or
-#endif
-
-#endif /* arch_atomic64_fetch_or_relaxed */
-
-#ifndef arch_atomic64_fetch_xor_relaxed
-#define arch_atomic64_fetch_xor_acquire arch_atomic64_fetch_xor
-#define arch_atomic64_fetch_xor_release arch_atomic64_fetch_xor
-#define arch_atomic64_fetch_xor_relaxed arch_atomic64_fetch_xor
-#else /* arch_atomic64_fetch_xor_relaxed */
-
-#ifndef arch_atomic64_fetch_xor_acquire
-static __always_inline s64
-arch_atomic64_fetch_xor_acquire(s64 i, atomic64_t *v)
-{
- s64 ret = arch_atomic64_fetch_xor_relaxed(i, v);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_fetch_xor_acquire arch_atomic64_fetch_xor_acquire
-#endif
-
-#ifndef arch_atomic64_fetch_xor_release
-static __always_inline s64
-arch_atomic64_fetch_xor_release(s64 i, atomic64_t *v)
-{
- __atomic_release_fence();
- return arch_atomic64_fetch_xor_relaxed(i, v);
-}
-#define arch_atomic64_fetch_xor_release arch_atomic64_fetch_xor_release
-#endif
-
-#ifndef arch_atomic64_fetch_xor
-static __always_inline s64
-arch_atomic64_fetch_xor(s64 i, atomic64_t *v)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_fetch_xor_relaxed(i, v);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_fetch_xor arch_atomic64_fetch_xor
-#endif
-
-#endif /* arch_atomic64_fetch_xor_relaxed */
-
-#ifndef arch_atomic64_xchg_relaxed
-#define arch_atomic64_xchg_acquire arch_atomic64_xchg
-#define arch_atomic64_xchg_release arch_atomic64_xchg
-#define arch_atomic64_xchg_relaxed arch_atomic64_xchg
-#else /* arch_atomic64_xchg_relaxed */
-
-#ifndef arch_atomic64_xchg_acquire
-static __always_inline s64
-arch_atomic64_xchg_acquire(atomic64_t *v, s64 i)
-{
- s64 ret = arch_atomic64_xchg_relaxed(v, i);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_xchg_acquire arch_atomic64_xchg_acquire
-#endif
-
-#ifndef arch_atomic64_xchg_release
-static __always_inline s64
-arch_atomic64_xchg_release(atomic64_t *v, s64 i)
-{
- __atomic_release_fence();
- return arch_atomic64_xchg_relaxed(v, i);
-}
-#define arch_atomic64_xchg_release arch_atomic64_xchg_release
-#endif
-
-#ifndef arch_atomic64_xchg
-static __always_inline s64
-arch_atomic64_xchg(atomic64_t *v, s64 i)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_xchg_relaxed(v, i);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_xchg arch_atomic64_xchg
-#endif
-
-#endif /* arch_atomic64_xchg_relaxed */
-
-#ifndef arch_atomic64_cmpxchg_relaxed
-#define arch_atomic64_cmpxchg_acquire arch_atomic64_cmpxchg
-#define arch_atomic64_cmpxchg_release arch_atomic64_cmpxchg
-#define arch_atomic64_cmpxchg_relaxed arch_atomic64_cmpxchg
-#else /* arch_atomic64_cmpxchg_relaxed */
-
-#ifndef arch_atomic64_cmpxchg_acquire
-static __always_inline s64
-arch_atomic64_cmpxchg_acquire(atomic64_t *v, s64 old, s64 new)
-{
- s64 ret = arch_atomic64_cmpxchg_relaxed(v, old, new);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_cmpxchg_acquire arch_atomic64_cmpxchg_acquire
-#endif
-
-#ifndef arch_atomic64_cmpxchg_release
-static __always_inline s64
-arch_atomic64_cmpxchg_release(atomic64_t *v, s64 old, s64 new)
-{
- __atomic_release_fence();
- return arch_atomic64_cmpxchg_relaxed(v, old, new);
-}
-#define arch_atomic64_cmpxchg_release arch_atomic64_cmpxchg_release
-#endif
-
-#ifndef arch_atomic64_cmpxchg
-static __always_inline s64
-arch_atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
-{
- s64 ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_cmpxchg_relaxed(v, old, new);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_cmpxchg arch_atomic64_cmpxchg
-#endif
-
-#endif /* arch_atomic64_cmpxchg_relaxed */
-
-#ifndef arch_atomic64_try_cmpxchg_relaxed
-#ifdef arch_atomic64_try_cmpxchg
-#define arch_atomic64_try_cmpxchg_acquire arch_atomic64_try_cmpxchg
-#define arch_atomic64_try_cmpxchg_release arch_atomic64_try_cmpxchg
-#define arch_atomic64_try_cmpxchg_relaxed arch_atomic64_try_cmpxchg
-#endif /* arch_atomic64_try_cmpxchg */
-
-#ifndef arch_atomic64_try_cmpxchg
-static __always_inline bool
-arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
-{
- s64 r, o = *old;
- r = arch_atomic64_cmpxchg(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
-#endif
-
-#ifndef arch_atomic64_try_cmpxchg_acquire
-static __always_inline bool
-arch_atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new)
-{
- s64 r, o = *old;
- r = arch_atomic64_cmpxchg_acquire(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic64_try_cmpxchg_acquire arch_atomic64_try_cmpxchg_acquire
-#endif
-
-#ifndef arch_atomic64_try_cmpxchg_release
-static __always_inline bool
-arch_atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new)
-{
- s64 r, o = *old;
- r = arch_atomic64_cmpxchg_release(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic64_try_cmpxchg_release arch_atomic64_try_cmpxchg_release
-#endif
-
-#ifndef arch_atomic64_try_cmpxchg_relaxed
-static __always_inline bool
-arch_atomic64_try_cmpxchg_relaxed(atomic64_t *v, s64 *old, s64 new)
-{
- s64 r, o = *old;
- r = arch_atomic64_cmpxchg_relaxed(v, o, new);
- if (unlikely(r != o))
- *old = r;
- return likely(r == o);
-}
-#define arch_atomic64_try_cmpxchg_relaxed arch_atomic64_try_cmpxchg_relaxed
-#endif
-
-#else /* arch_atomic64_try_cmpxchg_relaxed */
-
-#ifndef arch_atomic64_try_cmpxchg_acquire
-static __always_inline bool
-arch_atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new)
-{
- bool ret = arch_atomic64_try_cmpxchg_relaxed(v, old, new);
- __atomic_acquire_fence();
- return ret;
-}
-#define arch_atomic64_try_cmpxchg_acquire arch_atomic64_try_cmpxchg_acquire
-#endif
-
-#ifndef arch_atomic64_try_cmpxchg_release
-static __always_inline bool
-arch_atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new)
-{
- __atomic_release_fence();
- return arch_atomic64_try_cmpxchg_relaxed(v, old, new);
-}
-#define arch_atomic64_try_cmpxchg_release arch_atomic64_try_cmpxchg_release
-#endif
-
-#ifndef arch_atomic64_try_cmpxchg
-static __always_inline bool
-arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
-{
- bool ret;
- __atomic_pre_full_fence();
- ret = arch_atomic64_try_cmpxchg_relaxed(v, old, new);
- __atomic_post_full_fence();
- return ret;
-}
-#define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
-#endif
-
-#endif /* arch_atomic64_try_cmpxchg_relaxed */
-
-#ifndef arch_atomic64_sub_and_test
-/**
- * arch_atomic64_sub_and_test - subtract value from variable and test result
- * @i: integer value to subtract
- * @v: pointer of type atomic64_t
- *
- * Atomically subtracts @i from @v and returns
- * true if the result is zero, or false for all
- * other cases.
- */
-static __always_inline bool
-arch_atomic64_sub_and_test(s64 i, atomic64_t *v)
-{
- return arch_atomic64_sub_return(i, v) == 0;
-}
-#define arch_atomic64_sub_and_test arch_atomic64_sub_and_test
-#endif
-
-#ifndef arch_atomic64_dec_and_test
-/**
- * arch_atomic64_dec_and_test - decrement and test
- * @v: pointer of type atomic64_t
- *
- * Atomically decrements @v by 1 and
- * returns true if the result is 0, or false for all other
- * cases.
- */
-static __always_inline bool
-arch_atomic64_dec_and_test(atomic64_t *v)
-{
- return arch_atomic64_dec_return(v) == 0;
-}
-#define arch_atomic64_dec_and_test arch_atomic64_dec_and_test
-#endif
-
-#ifndef arch_atomic64_inc_and_test
-/**
- * arch_atomic64_inc_and_test - increment and test
- * @v: pointer of type atomic64_t
- *
- * Atomically increments @v by 1
- * and returns true if the result is zero, or false for all
- * other cases.
- */
-static __always_inline bool
-arch_atomic64_inc_and_test(atomic64_t *v)
-{
- return arch_atomic64_inc_return(v) == 0;
-}
-#define arch_atomic64_inc_and_test arch_atomic64_inc_and_test
-#endif
-
-#ifndef arch_atomic64_add_negative
-/**
- * arch_atomic64_add_negative - add and test if negative
- * @i: integer value to add
- * @v: pointer of type atomic64_t
- *
- * Atomically adds @i to @v and returns true
- * if the result is negative, or false when
- * result is greater than or equal to zero.
- */
-static __always_inline bool
-arch_atomic64_add_negative(s64 i, atomic64_t *v)
-{
- return arch_atomic64_add_return(i, v) < 0;
-}
-#define arch_atomic64_add_negative arch_atomic64_add_negative
-#endif
-
-#ifndef arch_atomic64_fetch_add_unless
-/**
- * arch_atomic64_fetch_add_unless - add unless the number is already a given value
- * @v: pointer of type atomic64_t
- * @a: the amount to add to v...
- * @u: ...unless v is equal to u.
- *
- * Atomically adds @a to @v, so long as @v was not already @u.
- * Returns original value of @v
- */
-static __always_inline s64
-arch_atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
-{
- s64 c = arch_atomic64_read(v);
-
- do {
- if (unlikely(c == u))
- break;
- } while (!arch_atomic64_try_cmpxchg(v, &c, c + a));
-
- return c;
-}
-#define arch_atomic64_fetch_add_unless arch_atomic64_fetch_add_unless
-#endif
-
-#ifndef arch_atomic64_add_unless
-/**
- * arch_atomic64_add_unless - add unless the number is already a given value
- * @v: pointer of type atomic64_t
- * @a: the amount to add to v...
- * @u: ...unless v is equal to u.
- *
- * Atomically adds @a to @v, if @v was not already @u.
- * Returns true if the addition was done.
- */
-static __always_inline bool
-arch_atomic64_add_unless(atomic64_t *v, s64 a, s64 u)
-{
- return arch_atomic64_fetch_add_unless(v, a, u) != u;
-}
-#define arch_atomic64_add_unless arch_atomic64_add_unless
-#endif
-
-#ifndef arch_atomic64_inc_not_zero
-/**
- * arch_atomic64_inc_not_zero - increment unless the number is zero
- * @v: pointer of type atomic64_t
- *
- * Atomically increments @v by 1, if @v is non-zero.
- * Returns true if the increment was done.
- */
-static __always_inline bool
-arch_atomic64_inc_not_zero(atomic64_t *v)
-{
- return arch_atomic64_add_unless(v, 1, 0);
-}
-#define arch_atomic64_inc_not_zero arch_atomic64_inc_not_zero
-#endif
-
-#ifndef arch_atomic64_inc_unless_negative
-static __always_inline bool
-arch_atomic64_inc_unless_negative(atomic64_t *v)
-{
- s64 c = arch_atomic64_read(v);
-
- do {
- if (unlikely(c < 0))
- return false;
- } while (!arch_atomic64_try_cmpxchg(v, &c, c + 1));
-
- return true;
-}
-#define arch_atomic64_inc_unless_negative arch_atomic64_inc_unless_negative
-#endif
-
-#ifndef arch_atomic64_dec_unless_positive
-static __always_inline bool
-arch_atomic64_dec_unless_positive(atomic64_t *v)
-{
- s64 c = arch_atomic64_read(v);
-
- do {
- if (unlikely(c > 0))
- return false;
- } while (!arch_atomic64_try_cmpxchg(v, &c, c - 1));
-
- return true;
-}
-#define arch_atomic64_dec_unless_positive arch_atomic64_dec_unless_positive
-#endif
-
-#ifndef arch_atomic64_dec_if_positive
-static __always_inline s64
-arch_atomic64_dec_if_positive(atomic64_t *v)
-{
- s64 dec, c = arch_atomic64_read(v);
-
- do {
- dec = c - 1;
- if (unlikely(dec < 0))
- break;
- } while (!arch_atomic64_try_cmpxchg(v, &c, dec));
-
- return dec;
-}
-#define arch_atomic64_dec_if_positive arch_atomic64_dec_if_positive
-#endif
-
-#endif /* _LINUX_ATOMIC_FALLBACK_H */
-// cca554917d7ea73d5e3e7397dd70c484cad9b2c4
__ret; \
})
-#include <linux/atomic-arch-fallback.h>
-#include <asm-generic/atomic-instrumented.h>
-
-#include <asm-generic/atomic-long.h>
+#include <linux/atomic/atomic-arch-fallback.h>
+#include <linux/atomic/atomic-long.h>
+#include <linux/atomic/atomic-instrumented.h>
#endif /* _LINUX_ATOMIC_H */
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+// Generated by scripts/atomic/gen-atomic-fallback.sh
+// DO NOT MODIFY THIS FILE DIRECTLY
+
+#ifndef _LINUX_ATOMIC_FALLBACK_H
+#define _LINUX_ATOMIC_FALLBACK_H
+
+#include <linux/compiler.h>
+
+#ifndef arch_xchg_relaxed
+#define arch_xchg_acquire arch_xchg
+#define arch_xchg_release arch_xchg
+#define arch_xchg_relaxed arch_xchg
+#else /* arch_xchg_relaxed */
+
+#ifndef arch_xchg_acquire
+#define arch_xchg_acquire(...) \
+ __atomic_op_acquire(arch_xchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_xchg_release
+#define arch_xchg_release(...) \
+ __atomic_op_release(arch_xchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_xchg
+#define arch_xchg(...) \
+ __atomic_op_fence(arch_xchg, __VA_ARGS__)
+#endif
+
+#endif /* arch_xchg_relaxed */
+
+#ifndef arch_cmpxchg_relaxed
+#define arch_cmpxchg_acquire arch_cmpxchg
+#define arch_cmpxchg_release arch_cmpxchg
+#define arch_cmpxchg_relaxed arch_cmpxchg
+#else /* arch_cmpxchg_relaxed */
+
+#ifndef arch_cmpxchg_acquire
+#define arch_cmpxchg_acquire(...) \
+ __atomic_op_acquire(arch_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_cmpxchg_release
+#define arch_cmpxchg_release(...) \
+ __atomic_op_release(arch_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_cmpxchg
+#define arch_cmpxchg(...) \
+ __atomic_op_fence(arch_cmpxchg, __VA_ARGS__)
+#endif
+
+#endif /* arch_cmpxchg_relaxed */
+
+#ifndef arch_cmpxchg64_relaxed
+#define arch_cmpxchg64_acquire arch_cmpxchg64
+#define arch_cmpxchg64_release arch_cmpxchg64
+#define arch_cmpxchg64_relaxed arch_cmpxchg64
+#else /* arch_cmpxchg64_relaxed */
+
+#ifndef arch_cmpxchg64_acquire
+#define arch_cmpxchg64_acquire(...) \
+ __atomic_op_acquire(arch_cmpxchg64, __VA_ARGS__)
+#endif
+
+#ifndef arch_cmpxchg64_release
+#define arch_cmpxchg64_release(...) \
+ __atomic_op_release(arch_cmpxchg64, __VA_ARGS__)
+#endif
+
+#ifndef arch_cmpxchg64
+#define arch_cmpxchg64(...) \
+ __atomic_op_fence(arch_cmpxchg64, __VA_ARGS__)
+#endif
+
+#endif /* arch_cmpxchg64_relaxed */
+
+#ifndef arch_try_cmpxchg_relaxed
+#ifdef arch_try_cmpxchg
+#define arch_try_cmpxchg_acquire arch_try_cmpxchg
+#define arch_try_cmpxchg_release arch_try_cmpxchg
+#define arch_try_cmpxchg_relaxed arch_try_cmpxchg
+#endif /* arch_try_cmpxchg */
+
+#ifndef arch_try_cmpxchg
+#define arch_try_cmpxchg(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg */
+
+#ifndef arch_try_cmpxchg_acquire
+#define arch_try_cmpxchg_acquire(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg_acquire((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_acquire */
+
+#ifndef arch_try_cmpxchg_release
+#define arch_try_cmpxchg_release(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg_release((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_release */
+
+#ifndef arch_try_cmpxchg_relaxed
+#define arch_try_cmpxchg_relaxed(_ptr, _oldp, _new) \
+({ \
+ typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+ ___r = arch_cmpxchg_relaxed((_ptr), ___o, (_new)); \
+ if (unlikely(___r != ___o)) \
+ *___op = ___r; \
+ likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_relaxed */
+
+#else /* arch_try_cmpxchg_relaxed */
+
+#ifndef arch_try_cmpxchg_acquire
+#define arch_try_cmpxchg_acquire(...) \
+ __atomic_op_acquire(arch_try_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_try_cmpxchg_release
+#define arch_try_cmpxchg_release(...) \
+ __atomic_op_release(arch_try_cmpxchg, __VA_ARGS__)
+#endif
+
+#ifndef arch_try_cmpxchg
+#define arch_try_cmpxchg(...) \
+ __atomic_op_fence(arch_try_cmpxchg, __VA_ARGS__)
+#endif
+
+#endif /* arch_try_cmpxchg_relaxed */
+
+#ifndef arch_atomic_read_acquire
+static __always_inline int
+arch_atomic_read_acquire(const atomic_t *v)
+{
+ return smp_load_acquire(&(v)->counter);
+}
+#define arch_atomic_read_acquire arch_atomic_read_acquire
+#endif
+
+#ifndef arch_atomic_set_release
+static __always_inline void
+arch_atomic_set_release(atomic_t *v, int i)
+{
+ smp_store_release(&(v)->counter, i);
+}
+#define arch_atomic_set_release arch_atomic_set_release
+#endif
+
+#ifndef arch_atomic_add_return_relaxed
+#define arch_atomic_add_return_acquire arch_atomic_add_return
+#define arch_atomic_add_return_release arch_atomic_add_return
+#define arch_atomic_add_return_relaxed arch_atomic_add_return
+#else /* arch_atomic_add_return_relaxed */
+
+#ifndef arch_atomic_add_return_acquire
+static __always_inline int
+arch_atomic_add_return_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_add_return_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_add_return_acquire arch_atomic_add_return_acquire
+#endif
+
+#ifndef arch_atomic_add_return_release
+static __always_inline int
+arch_atomic_add_return_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_add_return_relaxed(i, v);
+}
+#define arch_atomic_add_return_release arch_atomic_add_return_release
+#endif
+
+#ifndef arch_atomic_add_return
+static __always_inline int
+arch_atomic_add_return(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_add_return_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_add_return arch_atomic_add_return
+#endif
+
+#endif /* arch_atomic_add_return_relaxed */
+
+#ifndef arch_atomic_fetch_add_relaxed
+#define arch_atomic_fetch_add_acquire arch_atomic_fetch_add
+#define arch_atomic_fetch_add_release arch_atomic_fetch_add
+#define arch_atomic_fetch_add_relaxed arch_atomic_fetch_add
+#else /* arch_atomic_fetch_add_relaxed */
+
+#ifndef arch_atomic_fetch_add_acquire
+static __always_inline int
+arch_atomic_fetch_add_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_fetch_add_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_add_acquire arch_atomic_fetch_add_acquire
+#endif
+
+#ifndef arch_atomic_fetch_add_release
+static __always_inline int
+arch_atomic_fetch_add_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_add_relaxed(i, v);
+}
+#define arch_atomic_fetch_add_release arch_atomic_fetch_add_release
+#endif
+
+#ifndef arch_atomic_fetch_add
+static __always_inline int
+arch_atomic_fetch_add(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_add_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_add arch_atomic_fetch_add
+#endif
+
+#endif /* arch_atomic_fetch_add_relaxed */
+
+#ifndef arch_atomic_sub_return_relaxed
+#define arch_atomic_sub_return_acquire arch_atomic_sub_return
+#define arch_atomic_sub_return_release arch_atomic_sub_return
+#define arch_atomic_sub_return_relaxed arch_atomic_sub_return
+#else /* arch_atomic_sub_return_relaxed */
+
+#ifndef arch_atomic_sub_return_acquire
+static __always_inline int
+arch_atomic_sub_return_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_sub_return_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_sub_return_acquire arch_atomic_sub_return_acquire
+#endif
+
+#ifndef arch_atomic_sub_return_release
+static __always_inline int
+arch_atomic_sub_return_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_sub_return_relaxed(i, v);
+}
+#define arch_atomic_sub_return_release arch_atomic_sub_return_release
+#endif
+
+#ifndef arch_atomic_sub_return
+static __always_inline int
+arch_atomic_sub_return(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_sub_return_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_sub_return arch_atomic_sub_return
+#endif
+
+#endif /* arch_atomic_sub_return_relaxed */
+
+#ifndef arch_atomic_fetch_sub_relaxed
+#define arch_atomic_fetch_sub_acquire arch_atomic_fetch_sub
+#define arch_atomic_fetch_sub_release arch_atomic_fetch_sub
+#define arch_atomic_fetch_sub_relaxed arch_atomic_fetch_sub
+#else /* arch_atomic_fetch_sub_relaxed */
+
+#ifndef arch_atomic_fetch_sub_acquire
+static __always_inline int
+arch_atomic_fetch_sub_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_fetch_sub_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_sub_acquire arch_atomic_fetch_sub_acquire
+#endif
+
+#ifndef arch_atomic_fetch_sub_release
+static __always_inline int
+arch_atomic_fetch_sub_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_sub_relaxed(i, v);
+}
+#define arch_atomic_fetch_sub_release arch_atomic_fetch_sub_release
+#endif
+
+#ifndef arch_atomic_fetch_sub
+static __always_inline int
+arch_atomic_fetch_sub(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_sub_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_sub arch_atomic_fetch_sub
+#endif
+
+#endif /* arch_atomic_fetch_sub_relaxed */
+
+#ifndef arch_atomic_inc
+static __always_inline void
+arch_atomic_inc(atomic_t *v)
+{
+ arch_atomic_add(1, v);
+}
+#define arch_atomic_inc arch_atomic_inc
+#endif
+
+#ifndef arch_atomic_inc_return_relaxed
+#ifdef arch_atomic_inc_return
+#define arch_atomic_inc_return_acquire arch_atomic_inc_return
+#define arch_atomic_inc_return_release arch_atomic_inc_return
+#define arch_atomic_inc_return_relaxed arch_atomic_inc_return
+#endif /* arch_atomic_inc_return */
+
+#ifndef arch_atomic_inc_return
+static __always_inline int
+arch_atomic_inc_return(atomic_t *v)
+{
+ return arch_atomic_add_return(1, v);
+}
+#define arch_atomic_inc_return arch_atomic_inc_return
+#endif
+
+#ifndef arch_atomic_inc_return_acquire
+static __always_inline int
+arch_atomic_inc_return_acquire(atomic_t *v)
+{
+ return arch_atomic_add_return_acquire(1, v);
+}
+#define arch_atomic_inc_return_acquire arch_atomic_inc_return_acquire
+#endif
+
+#ifndef arch_atomic_inc_return_release
+static __always_inline int
+arch_atomic_inc_return_release(atomic_t *v)
+{
+ return arch_atomic_add_return_release(1, v);
+}
+#define arch_atomic_inc_return_release arch_atomic_inc_return_release
+#endif
+
+#ifndef arch_atomic_inc_return_relaxed
+static __always_inline int
+arch_atomic_inc_return_relaxed(atomic_t *v)
+{
+ return arch_atomic_add_return_relaxed(1, v);
+}
+#define arch_atomic_inc_return_relaxed arch_atomic_inc_return_relaxed
+#endif
+
+#else /* arch_atomic_inc_return_relaxed */
+
+#ifndef arch_atomic_inc_return_acquire
+static __always_inline int
+arch_atomic_inc_return_acquire(atomic_t *v)
+{
+ int ret = arch_atomic_inc_return_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_inc_return_acquire arch_atomic_inc_return_acquire
+#endif
+
+#ifndef arch_atomic_inc_return_release
+static __always_inline int
+arch_atomic_inc_return_release(atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_inc_return_relaxed(v);
+}
+#define arch_atomic_inc_return_release arch_atomic_inc_return_release
+#endif
+
+#ifndef arch_atomic_inc_return
+static __always_inline int
+arch_atomic_inc_return(atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_inc_return_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_inc_return arch_atomic_inc_return
+#endif
+
+#endif /* arch_atomic_inc_return_relaxed */
+
+#ifndef arch_atomic_fetch_inc_relaxed
+#ifdef arch_atomic_fetch_inc
+#define arch_atomic_fetch_inc_acquire arch_atomic_fetch_inc
+#define arch_atomic_fetch_inc_release arch_atomic_fetch_inc
+#define arch_atomic_fetch_inc_relaxed arch_atomic_fetch_inc
+#endif /* arch_atomic_fetch_inc */
+
+#ifndef arch_atomic_fetch_inc
+static __always_inline int
+arch_atomic_fetch_inc(atomic_t *v)
+{
+ return arch_atomic_fetch_add(1, v);
+}
+#define arch_atomic_fetch_inc arch_atomic_fetch_inc
+#endif
+
+#ifndef arch_atomic_fetch_inc_acquire
+static __always_inline int
+arch_atomic_fetch_inc_acquire(atomic_t *v)
+{
+ return arch_atomic_fetch_add_acquire(1, v);
+}
+#define arch_atomic_fetch_inc_acquire arch_atomic_fetch_inc_acquire
+#endif
+
+#ifndef arch_atomic_fetch_inc_release
+static __always_inline int
+arch_atomic_fetch_inc_release(atomic_t *v)
+{
+ return arch_atomic_fetch_add_release(1, v);
+}
+#define arch_atomic_fetch_inc_release arch_atomic_fetch_inc_release
+#endif
+
+#ifndef arch_atomic_fetch_inc_relaxed
+static __always_inline int
+arch_atomic_fetch_inc_relaxed(atomic_t *v)
+{
+ return arch_atomic_fetch_add_relaxed(1, v);
+}
+#define arch_atomic_fetch_inc_relaxed arch_atomic_fetch_inc_relaxed
+#endif
+
+#else /* arch_atomic_fetch_inc_relaxed */
+
+#ifndef arch_atomic_fetch_inc_acquire
+static __always_inline int
+arch_atomic_fetch_inc_acquire(atomic_t *v)
+{
+ int ret = arch_atomic_fetch_inc_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_inc_acquire arch_atomic_fetch_inc_acquire
+#endif
+
+#ifndef arch_atomic_fetch_inc_release
+static __always_inline int
+arch_atomic_fetch_inc_release(atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_inc_relaxed(v);
+}
+#define arch_atomic_fetch_inc_release arch_atomic_fetch_inc_release
+#endif
+
+#ifndef arch_atomic_fetch_inc
+static __always_inline int
+arch_atomic_fetch_inc(atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_inc_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_inc arch_atomic_fetch_inc
+#endif
+
+#endif /* arch_atomic_fetch_inc_relaxed */
+
+#ifndef arch_atomic_dec
+static __always_inline void
+arch_atomic_dec(atomic_t *v)
+{
+ arch_atomic_sub(1, v);
+}
+#define arch_atomic_dec arch_atomic_dec
+#endif
+
+#ifndef arch_atomic_dec_return_relaxed
+#ifdef arch_atomic_dec_return
+#define arch_atomic_dec_return_acquire arch_atomic_dec_return
+#define arch_atomic_dec_return_release arch_atomic_dec_return
+#define arch_atomic_dec_return_relaxed arch_atomic_dec_return
+#endif /* arch_atomic_dec_return */
+
+#ifndef arch_atomic_dec_return
+static __always_inline int
+arch_atomic_dec_return(atomic_t *v)
+{
+ return arch_atomic_sub_return(1, v);
+}
+#define arch_atomic_dec_return arch_atomic_dec_return
+#endif
+
+#ifndef arch_atomic_dec_return_acquire
+static __always_inline int
+arch_atomic_dec_return_acquire(atomic_t *v)
+{
+ return arch_atomic_sub_return_acquire(1, v);
+}
+#define arch_atomic_dec_return_acquire arch_atomic_dec_return_acquire
+#endif
+
+#ifndef arch_atomic_dec_return_release
+static __always_inline int
+arch_atomic_dec_return_release(atomic_t *v)
+{
+ return arch_atomic_sub_return_release(1, v);
+}
+#define arch_atomic_dec_return_release arch_atomic_dec_return_release
+#endif
+
+#ifndef arch_atomic_dec_return_relaxed
+static __always_inline int
+arch_atomic_dec_return_relaxed(atomic_t *v)
+{
+ return arch_atomic_sub_return_relaxed(1, v);
+}
+#define arch_atomic_dec_return_relaxed arch_atomic_dec_return_relaxed
+#endif
+
+#else /* arch_atomic_dec_return_relaxed */
+
+#ifndef arch_atomic_dec_return_acquire
+static __always_inline int
+arch_atomic_dec_return_acquire(atomic_t *v)
+{
+ int ret = arch_atomic_dec_return_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_dec_return_acquire arch_atomic_dec_return_acquire
+#endif
+
+#ifndef arch_atomic_dec_return_release
+static __always_inline int
+arch_atomic_dec_return_release(atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_dec_return_relaxed(v);
+}
+#define arch_atomic_dec_return_release arch_atomic_dec_return_release
+#endif
+
+#ifndef arch_atomic_dec_return
+static __always_inline int
+arch_atomic_dec_return(atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_dec_return_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_dec_return arch_atomic_dec_return
+#endif
+
+#endif /* arch_atomic_dec_return_relaxed */
+
+#ifndef arch_atomic_fetch_dec_relaxed
+#ifdef arch_atomic_fetch_dec
+#define arch_atomic_fetch_dec_acquire arch_atomic_fetch_dec
+#define arch_atomic_fetch_dec_release arch_atomic_fetch_dec
+#define arch_atomic_fetch_dec_relaxed arch_atomic_fetch_dec
+#endif /* arch_atomic_fetch_dec */
+
+#ifndef arch_atomic_fetch_dec
+static __always_inline int
+arch_atomic_fetch_dec(atomic_t *v)
+{
+ return arch_atomic_fetch_sub(1, v);
+}
+#define arch_atomic_fetch_dec arch_atomic_fetch_dec
+#endif
+
+#ifndef arch_atomic_fetch_dec_acquire
+static __always_inline int
+arch_atomic_fetch_dec_acquire(atomic_t *v)
+{
+ return arch_atomic_fetch_sub_acquire(1, v);
+}
+#define arch_atomic_fetch_dec_acquire arch_atomic_fetch_dec_acquire
+#endif
+
+#ifndef arch_atomic_fetch_dec_release
+static __always_inline int
+arch_atomic_fetch_dec_release(atomic_t *v)
+{
+ return arch_atomic_fetch_sub_release(1, v);
+}
+#define arch_atomic_fetch_dec_release arch_atomic_fetch_dec_release
+#endif
+
+#ifndef arch_atomic_fetch_dec_relaxed
+static __always_inline int
+arch_atomic_fetch_dec_relaxed(atomic_t *v)
+{
+ return arch_atomic_fetch_sub_relaxed(1, v);
+}
+#define arch_atomic_fetch_dec_relaxed arch_atomic_fetch_dec_relaxed
+#endif
+
+#else /* arch_atomic_fetch_dec_relaxed */
+
+#ifndef arch_atomic_fetch_dec_acquire
+static __always_inline int
+arch_atomic_fetch_dec_acquire(atomic_t *v)
+{
+ int ret = arch_atomic_fetch_dec_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_dec_acquire arch_atomic_fetch_dec_acquire
+#endif
+
+#ifndef arch_atomic_fetch_dec_release
+static __always_inline int
+arch_atomic_fetch_dec_release(atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_dec_relaxed(v);
+}
+#define arch_atomic_fetch_dec_release arch_atomic_fetch_dec_release
+#endif
+
+#ifndef arch_atomic_fetch_dec
+static __always_inline int
+arch_atomic_fetch_dec(atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_dec_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_dec arch_atomic_fetch_dec
+#endif
+
+#endif /* arch_atomic_fetch_dec_relaxed */
+
+#ifndef arch_atomic_fetch_and_relaxed
+#define arch_atomic_fetch_and_acquire arch_atomic_fetch_and
+#define arch_atomic_fetch_and_release arch_atomic_fetch_and
+#define arch_atomic_fetch_and_relaxed arch_atomic_fetch_and
+#else /* arch_atomic_fetch_and_relaxed */
+
+#ifndef arch_atomic_fetch_and_acquire
+static __always_inline int
+arch_atomic_fetch_and_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_fetch_and_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_and_acquire arch_atomic_fetch_and_acquire
+#endif
+
+#ifndef arch_atomic_fetch_and_release
+static __always_inline int
+arch_atomic_fetch_and_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_and_relaxed(i, v);
+}
+#define arch_atomic_fetch_and_release arch_atomic_fetch_and_release
+#endif
+
+#ifndef arch_atomic_fetch_and
+static __always_inline int
+arch_atomic_fetch_and(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_and_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_and arch_atomic_fetch_and
+#endif
+
+#endif /* arch_atomic_fetch_and_relaxed */
+
+#ifndef arch_atomic_andnot
+static __always_inline void
+arch_atomic_andnot(int i, atomic_t *v)
+{
+ arch_atomic_and(~i, v);
+}
+#define arch_atomic_andnot arch_atomic_andnot
+#endif
+
+#ifndef arch_atomic_fetch_andnot_relaxed
+#ifdef arch_atomic_fetch_andnot
+#define arch_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot
+#define arch_atomic_fetch_andnot_release arch_atomic_fetch_andnot
+#define arch_atomic_fetch_andnot_relaxed arch_atomic_fetch_andnot
+#endif /* arch_atomic_fetch_andnot */
+
+#ifndef arch_atomic_fetch_andnot
+static __always_inline int
+arch_atomic_fetch_andnot(int i, atomic_t *v)
+{
+ return arch_atomic_fetch_and(~i, v);
+}
+#define arch_atomic_fetch_andnot arch_atomic_fetch_andnot
+#endif
+
+#ifndef arch_atomic_fetch_andnot_acquire
+static __always_inline int
+arch_atomic_fetch_andnot_acquire(int i, atomic_t *v)
+{
+ return arch_atomic_fetch_and_acquire(~i, v);
+}
+#define arch_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire
+#endif
+
+#ifndef arch_atomic_fetch_andnot_release
+static __always_inline int
+arch_atomic_fetch_andnot_release(int i, atomic_t *v)
+{
+ return arch_atomic_fetch_and_release(~i, v);
+}
+#define arch_atomic_fetch_andnot_release arch_atomic_fetch_andnot_release
+#endif
+
+#ifndef arch_atomic_fetch_andnot_relaxed
+static __always_inline int
+arch_atomic_fetch_andnot_relaxed(int i, atomic_t *v)
+{
+ return arch_atomic_fetch_and_relaxed(~i, v);
+}
+#define arch_atomic_fetch_andnot_relaxed arch_atomic_fetch_andnot_relaxed
+#endif
+
+#else /* arch_atomic_fetch_andnot_relaxed */
+
+#ifndef arch_atomic_fetch_andnot_acquire
+static __always_inline int
+arch_atomic_fetch_andnot_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_fetch_andnot_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire
+#endif
+
+#ifndef arch_atomic_fetch_andnot_release
+static __always_inline int
+arch_atomic_fetch_andnot_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_andnot_relaxed(i, v);
+}
+#define arch_atomic_fetch_andnot_release arch_atomic_fetch_andnot_release
+#endif
+
+#ifndef arch_atomic_fetch_andnot
+static __always_inline int
+arch_atomic_fetch_andnot(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_andnot_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_andnot arch_atomic_fetch_andnot
+#endif
+
+#endif /* arch_atomic_fetch_andnot_relaxed */
+
+#ifndef arch_atomic_fetch_or_relaxed
+#define arch_atomic_fetch_or_acquire arch_atomic_fetch_or
+#define arch_atomic_fetch_or_release arch_atomic_fetch_or
+#define arch_atomic_fetch_or_relaxed arch_atomic_fetch_or
+#else /* arch_atomic_fetch_or_relaxed */
+
+#ifndef arch_atomic_fetch_or_acquire
+static __always_inline int
+arch_atomic_fetch_or_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_fetch_or_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_or_acquire arch_atomic_fetch_or_acquire
+#endif
+
+#ifndef arch_atomic_fetch_or_release
+static __always_inline int
+arch_atomic_fetch_or_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_or_relaxed(i, v);
+}
+#define arch_atomic_fetch_or_release arch_atomic_fetch_or_release
+#endif
+
+#ifndef arch_atomic_fetch_or
+static __always_inline int
+arch_atomic_fetch_or(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_or_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_or arch_atomic_fetch_or
+#endif
+
+#endif /* arch_atomic_fetch_or_relaxed */
+
+#ifndef arch_atomic_fetch_xor_relaxed
+#define arch_atomic_fetch_xor_acquire arch_atomic_fetch_xor
+#define arch_atomic_fetch_xor_release arch_atomic_fetch_xor
+#define arch_atomic_fetch_xor_relaxed arch_atomic_fetch_xor
+#else /* arch_atomic_fetch_xor_relaxed */
+
+#ifndef arch_atomic_fetch_xor_acquire
+static __always_inline int
+arch_atomic_fetch_xor_acquire(int i, atomic_t *v)
+{
+ int ret = arch_atomic_fetch_xor_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_fetch_xor_acquire arch_atomic_fetch_xor_acquire
+#endif
+
+#ifndef arch_atomic_fetch_xor_release
+static __always_inline int
+arch_atomic_fetch_xor_release(int i, atomic_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic_fetch_xor_relaxed(i, v);
+}
+#define arch_atomic_fetch_xor_release arch_atomic_fetch_xor_release
+#endif
+
+#ifndef arch_atomic_fetch_xor
+static __always_inline int
+arch_atomic_fetch_xor(int i, atomic_t *v)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_fetch_xor_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_fetch_xor arch_atomic_fetch_xor
+#endif
+
+#endif /* arch_atomic_fetch_xor_relaxed */
+
+#ifndef arch_atomic_xchg_relaxed
+#define arch_atomic_xchg_acquire arch_atomic_xchg
+#define arch_atomic_xchg_release arch_atomic_xchg
+#define arch_atomic_xchg_relaxed arch_atomic_xchg
+#else /* arch_atomic_xchg_relaxed */
+
+#ifndef arch_atomic_xchg_acquire
+static __always_inline int
+arch_atomic_xchg_acquire(atomic_t *v, int i)
+{
+ int ret = arch_atomic_xchg_relaxed(v, i);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_xchg_acquire arch_atomic_xchg_acquire
+#endif
+
+#ifndef arch_atomic_xchg_release
+static __always_inline int
+arch_atomic_xchg_release(atomic_t *v, int i)
+{
+ __atomic_release_fence();
+ return arch_atomic_xchg_relaxed(v, i);
+}
+#define arch_atomic_xchg_release arch_atomic_xchg_release
+#endif
+
+#ifndef arch_atomic_xchg
+static __always_inline int
+arch_atomic_xchg(atomic_t *v, int i)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_xchg_relaxed(v, i);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_xchg arch_atomic_xchg
+#endif
+
+#endif /* arch_atomic_xchg_relaxed */
+
+#ifndef arch_atomic_cmpxchg_relaxed
+#define arch_atomic_cmpxchg_acquire arch_atomic_cmpxchg
+#define arch_atomic_cmpxchg_release arch_atomic_cmpxchg
+#define arch_atomic_cmpxchg_relaxed arch_atomic_cmpxchg
+#else /* arch_atomic_cmpxchg_relaxed */
+
+#ifndef arch_atomic_cmpxchg_acquire
+static __always_inline int
+arch_atomic_cmpxchg_acquire(atomic_t *v, int old, int new)
+{
+ int ret = arch_atomic_cmpxchg_relaxed(v, old, new);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_cmpxchg_acquire arch_atomic_cmpxchg_acquire
+#endif
+
+#ifndef arch_atomic_cmpxchg_release
+static __always_inline int
+arch_atomic_cmpxchg_release(atomic_t *v, int old, int new)
+{
+ __atomic_release_fence();
+ return arch_atomic_cmpxchg_relaxed(v, old, new);
+}
+#define arch_atomic_cmpxchg_release arch_atomic_cmpxchg_release
+#endif
+
+#ifndef arch_atomic_cmpxchg
+static __always_inline int
+arch_atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+ int ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_cmpxchg_relaxed(v, old, new);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_cmpxchg arch_atomic_cmpxchg
+#endif
+
+#endif /* arch_atomic_cmpxchg_relaxed */
+
+#ifndef arch_atomic_try_cmpxchg_relaxed
+#ifdef arch_atomic_try_cmpxchg
+#define arch_atomic_try_cmpxchg_acquire arch_atomic_try_cmpxchg
+#define arch_atomic_try_cmpxchg_release arch_atomic_try_cmpxchg
+#define arch_atomic_try_cmpxchg_relaxed arch_atomic_try_cmpxchg
+#endif /* arch_atomic_try_cmpxchg */
+
+#ifndef arch_atomic_try_cmpxchg
+static __always_inline bool
+arch_atomic_try_cmpxchg(atomic_t *v, int *old, int new)
+{
+ int r, o = *old;
+ r = arch_atomic_cmpxchg(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic_try_cmpxchg arch_atomic_try_cmpxchg
+#endif
+
+#ifndef arch_atomic_try_cmpxchg_acquire
+static __always_inline bool
+arch_atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new)
+{
+ int r, o = *old;
+ r = arch_atomic_cmpxchg_acquire(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic_try_cmpxchg_acquire arch_atomic_try_cmpxchg_acquire
+#endif
+
+#ifndef arch_atomic_try_cmpxchg_release
+static __always_inline bool
+arch_atomic_try_cmpxchg_release(atomic_t *v, int *old, int new)
+{
+ int r, o = *old;
+ r = arch_atomic_cmpxchg_release(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic_try_cmpxchg_release arch_atomic_try_cmpxchg_release
+#endif
+
+#ifndef arch_atomic_try_cmpxchg_relaxed
+static __always_inline bool
+arch_atomic_try_cmpxchg_relaxed(atomic_t *v, int *old, int new)
+{
+ int r, o = *old;
+ r = arch_atomic_cmpxchg_relaxed(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic_try_cmpxchg_relaxed arch_atomic_try_cmpxchg_relaxed
+#endif
+
+#else /* arch_atomic_try_cmpxchg_relaxed */
+
+#ifndef arch_atomic_try_cmpxchg_acquire
+static __always_inline bool
+arch_atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new)
+{
+ bool ret = arch_atomic_try_cmpxchg_relaxed(v, old, new);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic_try_cmpxchg_acquire arch_atomic_try_cmpxchg_acquire
+#endif
+
+#ifndef arch_atomic_try_cmpxchg_release
+static __always_inline bool
+arch_atomic_try_cmpxchg_release(atomic_t *v, int *old, int new)
+{
+ __atomic_release_fence();
+ return arch_atomic_try_cmpxchg_relaxed(v, old, new);
+}
+#define arch_atomic_try_cmpxchg_release arch_atomic_try_cmpxchg_release
+#endif
+
+#ifndef arch_atomic_try_cmpxchg
+static __always_inline bool
+arch_atomic_try_cmpxchg(atomic_t *v, int *old, int new)
+{
+ bool ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic_try_cmpxchg_relaxed(v, old, new);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic_try_cmpxchg arch_atomic_try_cmpxchg
+#endif
+
+#endif /* arch_atomic_try_cmpxchg_relaxed */
+
+#ifndef arch_atomic_sub_and_test
+/**
+ * arch_atomic_sub_and_test - subtract value from variable and test result
+ * @i: integer value to subtract
+ * @v: pointer of type atomic_t
+ *
+ * Atomically subtracts @i from @v and returns
+ * true if the result is zero, or false for all
+ * other cases.
+ */
+static __always_inline bool
+arch_atomic_sub_and_test(int i, atomic_t *v)
+{
+ return arch_atomic_sub_return(i, v) == 0;
+}
+#define arch_atomic_sub_and_test arch_atomic_sub_and_test
+#endif
+
+#ifndef arch_atomic_dec_and_test
+/**
+ * arch_atomic_dec_and_test - decrement and test
+ * @v: pointer of type atomic_t
+ *
+ * Atomically decrements @v by 1 and
+ * returns true if the result is 0, or false for all other
+ * cases.
+ */
+static __always_inline bool
+arch_atomic_dec_and_test(atomic_t *v)
+{
+ return arch_atomic_dec_return(v) == 0;
+}
+#define arch_atomic_dec_and_test arch_atomic_dec_and_test
+#endif
+
+#ifndef arch_atomic_inc_and_test
+/**
+ * arch_atomic_inc_and_test - increment and test
+ * @v: pointer of type atomic_t
+ *
+ * Atomically increments @v by 1
+ * and returns true if the result is zero, or false for all
+ * other cases.
+ */
+static __always_inline bool
+arch_atomic_inc_and_test(atomic_t *v)
+{
+ return arch_atomic_inc_return(v) == 0;
+}
+#define arch_atomic_inc_and_test arch_atomic_inc_and_test
+#endif
+
+#ifndef arch_atomic_add_negative
+/**
+ * arch_atomic_add_negative - add and test if negative
+ * @i: integer value to add
+ * @v: pointer of type atomic_t
+ *
+ * Atomically adds @i to @v and returns true
+ * if the result is negative, or false when
+ * result is greater than or equal to zero.
+ */
+static __always_inline bool
+arch_atomic_add_negative(int i, atomic_t *v)
+{
+ return arch_atomic_add_return(i, v) < 0;
+}
+#define arch_atomic_add_negative arch_atomic_add_negative
+#endif
+
+#ifndef arch_atomic_fetch_add_unless
+/**
+ * arch_atomic_fetch_add_unless - add unless the number is already a given value
+ * @v: pointer of type atomic_t
+ * @a: the amount to add to v...
+ * @u: ...unless v is equal to u.
+ *
+ * Atomically adds @a to @v, so long as @v was not already @u.
+ * Returns original value of @v
+ */
+static __always_inline int
+arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
+{
+ int c = arch_atomic_read(v);
+
+ do {
+ if (unlikely(c == u))
+ break;
+ } while (!arch_atomic_try_cmpxchg(v, &c, c + a));
+
+ return c;
+}
+#define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
+#endif
+
+#ifndef arch_atomic_add_unless
+/**
+ * arch_atomic_add_unless - add unless the number is already a given value
+ * @v: pointer of type atomic_t
+ * @a: the amount to add to v...
+ * @u: ...unless v is equal to u.
+ *
+ * Atomically adds @a to @v, if @v was not already @u.
+ * Returns true if the addition was done.
+ */
+static __always_inline bool
+arch_atomic_add_unless(atomic_t *v, int a, int u)
+{
+ return arch_atomic_fetch_add_unless(v, a, u) != u;
+}
+#define arch_atomic_add_unless arch_atomic_add_unless
+#endif
+
+#ifndef arch_atomic_inc_not_zero
+/**
+ * arch_atomic_inc_not_zero - increment unless the number is zero
+ * @v: pointer of type atomic_t
+ *
+ * Atomically increments @v by 1, if @v is non-zero.
+ * Returns true if the increment was done.
+ */
+static __always_inline bool
+arch_atomic_inc_not_zero(atomic_t *v)
+{
+ return arch_atomic_add_unless(v, 1, 0);
+}
+#define arch_atomic_inc_not_zero arch_atomic_inc_not_zero
+#endif
+
+#ifndef arch_atomic_inc_unless_negative
+static __always_inline bool
+arch_atomic_inc_unless_negative(atomic_t *v)
+{
+ int c = arch_atomic_read(v);
+
+ do {
+ if (unlikely(c < 0))
+ return false;
+ } while (!arch_atomic_try_cmpxchg(v, &c, c + 1));
+
+ return true;
+}
+#define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
+#endif
+
+#ifndef arch_atomic_dec_unless_positive
+static __always_inline bool
+arch_atomic_dec_unless_positive(atomic_t *v)
+{
+ int c = arch_atomic_read(v);
+
+ do {
+ if (unlikely(c > 0))
+ return false;
+ } while (!arch_atomic_try_cmpxchg(v, &c, c - 1));
+
+ return true;
+}
+#define arch_atomic_dec_unless_positive arch_atomic_dec_unless_positive
+#endif
+
+#ifndef arch_atomic_dec_if_positive
+static __always_inline int
+arch_atomic_dec_if_positive(atomic_t *v)
+{
+ int dec, c = arch_atomic_read(v);
+
+ do {
+ dec = c - 1;
+ if (unlikely(dec < 0))
+ break;
+ } while (!arch_atomic_try_cmpxchg(v, &c, dec));
+
+ return dec;
+}
+#define arch_atomic_dec_if_positive arch_atomic_dec_if_positive
+#endif
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+#include <asm-generic/atomic64.h>
+#endif
+
+#ifndef arch_atomic64_read_acquire
+static __always_inline s64
+arch_atomic64_read_acquire(const atomic64_t *v)
+{
+ return smp_load_acquire(&(v)->counter);
+}
+#define arch_atomic64_read_acquire arch_atomic64_read_acquire
+#endif
+
+#ifndef arch_atomic64_set_release
+static __always_inline void
+arch_atomic64_set_release(atomic64_t *v, s64 i)
+{
+ smp_store_release(&(v)->counter, i);
+}
+#define arch_atomic64_set_release arch_atomic64_set_release
+#endif
+
+#ifndef arch_atomic64_add_return_relaxed
+#define arch_atomic64_add_return_acquire arch_atomic64_add_return
+#define arch_atomic64_add_return_release arch_atomic64_add_return
+#define arch_atomic64_add_return_relaxed arch_atomic64_add_return
+#else /* arch_atomic64_add_return_relaxed */
+
+#ifndef arch_atomic64_add_return_acquire
+static __always_inline s64
+arch_atomic64_add_return_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_add_return_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_add_return_acquire arch_atomic64_add_return_acquire
+#endif
+
+#ifndef arch_atomic64_add_return_release
+static __always_inline s64
+arch_atomic64_add_return_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_add_return_relaxed(i, v);
+}
+#define arch_atomic64_add_return_release arch_atomic64_add_return_release
+#endif
+
+#ifndef arch_atomic64_add_return
+static __always_inline s64
+arch_atomic64_add_return(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_add_return_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_add_return arch_atomic64_add_return
+#endif
+
+#endif /* arch_atomic64_add_return_relaxed */
+
+#ifndef arch_atomic64_fetch_add_relaxed
+#define arch_atomic64_fetch_add_acquire arch_atomic64_fetch_add
+#define arch_atomic64_fetch_add_release arch_atomic64_fetch_add
+#define arch_atomic64_fetch_add_relaxed arch_atomic64_fetch_add
+#else /* arch_atomic64_fetch_add_relaxed */
+
+#ifndef arch_atomic64_fetch_add_acquire
+static __always_inline s64
+arch_atomic64_fetch_add_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_add_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_add_acquire arch_atomic64_fetch_add_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_add_release
+static __always_inline s64
+arch_atomic64_fetch_add_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_add_relaxed(i, v);
+}
+#define arch_atomic64_fetch_add_release arch_atomic64_fetch_add_release
+#endif
+
+#ifndef arch_atomic64_fetch_add
+static __always_inline s64
+arch_atomic64_fetch_add(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_add_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_add arch_atomic64_fetch_add
+#endif
+
+#endif /* arch_atomic64_fetch_add_relaxed */
+
+#ifndef arch_atomic64_sub_return_relaxed
+#define arch_atomic64_sub_return_acquire arch_atomic64_sub_return
+#define arch_atomic64_sub_return_release arch_atomic64_sub_return
+#define arch_atomic64_sub_return_relaxed arch_atomic64_sub_return
+#else /* arch_atomic64_sub_return_relaxed */
+
+#ifndef arch_atomic64_sub_return_acquire
+static __always_inline s64
+arch_atomic64_sub_return_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_sub_return_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_sub_return_acquire arch_atomic64_sub_return_acquire
+#endif
+
+#ifndef arch_atomic64_sub_return_release
+static __always_inline s64
+arch_atomic64_sub_return_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_sub_return_relaxed(i, v);
+}
+#define arch_atomic64_sub_return_release arch_atomic64_sub_return_release
+#endif
+
+#ifndef arch_atomic64_sub_return
+static __always_inline s64
+arch_atomic64_sub_return(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_sub_return_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_sub_return arch_atomic64_sub_return
+#endif
+
+#endif /* arch_atomic64_sub_return_relaxed */
+
+#ifndef arch_atomic64_fetch_sub_relaxed
+#define arch_atomic64_fetch_sub_acquire arch_atomic64_fetch_sub
+#define arch_atomic64_fetch_sub_release arch_atomic64_fetch_sub
+#define arch_atomic64_fetch_sub_relaxed arch_atomic64_fetch_sub
+#else /* arch_atomic64_fetch_sub_relaxed */
+
+#ifndef arch_atomic64_fetch_sub_acquire
+static __always_inline s64
+arch_atomic64_fetch_sub_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_sub_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_sub_acquire arch_atomic64_fetch_sub_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_sub_release
+static __always_inline s64
+arch_atomic64_fetch_sub_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_sub_relaxed(i, v);
+}
+#define arch_atomic64_fetch_sub_release arch_atomic64_fetch_sub_release
+#endif
+
+#ifndef arch_atomic64_fetch_sub
+static __always_inline s64
+arch_atomic64_fetch_sub(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_sub_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_sub arch_atomic64_fetch_sub
+#endif
+
+#endif /* arch_atomic64_fetch_sub_relaxed */
+
+#ifndef arch_atomic64_inc
+static __always_inline void
+arch_atomic64_inc(atomic64_t *v)
+{
+ arch_atomic64_add(1, v);
+}
+#define arch_atomic64_inc arch_atomic64_inc
+#endif
+
+#ifndef arch_atomic64_inc_return_relaxed
+#ifdef arch_atomic64_inc_return
+#define arch_atomic64_inc_return_acquire arch_atomic64_inc_return
+#define arch_atomic64_inc_return_release arch_atomic64_inc_return
+#define arch_atomic64_inc_return_relaxed arch_atomic64_inc_return
+#endif /* arch_atomic64_inc_return */
+
+#ifndef arch_atomic64_inc_return
+static __always_inline s64
+arch_atomic64_inc_return(atomic64_t *v)
+{
+ return arch_atomic64_add_return(1, v);
+}
+#define arch_atomic64_inc_return arch_atomic64_inc_return
+#endif
+
+#ifndef arch_atomic64_inc_return_acquire
+static __always_inline s64
+arch_atomic64_inc_return_acquire(atomic64_t *v)
+{
+ return arch_atomic64_add_return_acquire(1, v);
+}
+#define arch_atomic64_inc_return_acquire arch_atomic64_inc_return_acquire
+#endif
+
+#ifndef arch_atomic64_inc_return_release
+static __always_inline s64
+arch_atomic64_inc_return_release(atomic64_t *v)
+{
+ return arch_atomic64_add_return_release(1, v);
+}
+#define arch_atomic64_inc_return_release arch_atomic64_inc_return_release
+#endif
+
+#ifndef arch_atomic64_inc_return_relaxed
+static __always_inline s64
+arch_atomic64_inc_return_relaxed(atomic64_t *v)
+{
+ return arch_atomic64_add_return_relaxed(1, v);
+}
+#define arch_atomic64_inc_return_relaxed arch_atomic64_inc_return_relaxed
+#endif
+
+#else /* arch_atomic64_inc_return_relaxed */
+
+#ifndef arch_atomic64_inc_return_acquire
+static __always_inline s64
+arch_atomic64_inc_return_acquire(atomic64_t *v)
+{
+ s64 ret = arch_atomic64_inc_return_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_inc_return_acquire arch_atomic64_inc_return_acquire
+#endif
+
+#ifndef arch_atomic64_inc_return_release
+static __always_inline s64
+arch_atomic64_inc_return_release(atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_inc_return_relaxed(v);
+}
+#define arch_atomic64_inc_return_release arch_atomic64_inc_return_release
+#endif
+
+#ifndef arch_atomic64_inc_return
+static __always_inline s64
+arch_atomic64_inc_return(atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_inc_return_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_inc_return arch_atomic64_inc_return
+#endif
+
+#endif /* arch_atomic64_inc_return_relaxed */
+
+#ifndef arch_atomic64_fetch_inc_relaxed
+#ifdef arch_atomic64_fetch_inc
+#define arch_atomic64_fetch_inc_acquire arch_atomic64_fetch_inc
+#define arch_atomic64_fetch_inc_release arch_atomic64_fetch_inc
+#define arch_atomic64_fetch_inc_relaxed arch_atomic64_fetch_inc
+#endif /* arch_atomic64_fetch_inc */
+
+#ifndef arch_atomic64_fetch_inc
+static __always_inline s64
+arch_atomic64_fetch_inc(atomic64_t *v)
+{
+ return arch_atomic64_fetch_add(1, v);
+}
+#define arch_atomic64_fetch_inc arch_atomic64_fetch_inc
+#endif
+
+#ifndef arch_atomic64_fetch_inc_acquire
+static __always_inline s64
+arch_atomic64_fetch_inc_acquire(atomic64_t *v)
+{
+ return arch_atomic64_fetch_add_acquire(1, v);
+}
+#define arch_atomic64_fetch_inc_acquire arch_atomic64_fetch_inc_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_inc_release
+static __always_inline s64
+arch_atomic64_fetch_inc_release(atomic64_t *v)
+{
+ return arch_atomic64_fetch_add_release(1, v);
+}
+#define arch_atomic64_fetch_inc_release arch_atomic64_fetch_inc_release
+#endif
+
+#ifndef arch_atomic64_fetch_inc_relaxed
+static __always_inline s64
+arch_atomic64_fetch_inc_relaxed(atomic64_t *v)
+{
+ return arch_atomic64_fetch_add_relaxed(1, v);
+}
+#define arch_atomic64_fetch_inc_relaxed arch_atomic64_fetch_inc_relaxed
+#endif
+
+#else /* arch_atomic64_fetch_inc_relaxed */
+
+#ifndef arch_atomic64_fetch_inc_acquire
+static __always_inline s64
+arch_atomic64_fetch_inc_acquire(atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_inc_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_inc_acquire arch_atomic64_fetch_inc_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_inc_release
+static __always_inline s64
+arch_atomic64_fetch_inc_release(atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_inc_relaxed(v);
+}
+#define arch_atomic64_fetch_inc_release arch_atomic64_fetch_inc_release
+#endif
+
+#ifndef arch_atomic64_fetch_inc
+static __always_inline s64
+arch_atomic64_fetch_inc(atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_inc_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_inc arch_atomic64_fetch_inc
+#endif
+
+#endif /* arch_atomic64_fetch_inc_relaxed */
+
+#ifndef arch_atomic64_dec
+static __always_inline void
+arch_atomic64_dec(atomic64_t *v)
+{
+ arch_atomic64_sub(1, v);
+}
+#define arch_atomic64_dec arch_atomic64_dec
+#endif
+
+#ifndef arch_atomic64_dec_return_relaxed
+#ifdef arch_atomic64_dec_return
+#define arch_atomic64_dec_return_acquire arch_atomic64_dec_return
+#define arch_atomic64_dec_return_release arch_atomic64_dec_return
+#define arch_atomic64_dec_return_relaxed arch_atomic64_dec_return
+#endif /* arch_atomic64_dec_return */
+
+#ifndef arch_atomic64_dec_return
+static __always_inline s64
+arch_atomic64_dec_return(atomic64_t *v)
+{
+ return arch_atomic64_sub_return(1, v);
+}
+#define arch_atomic64_dec_return arch_atomic64_dec_return
+#endif
+
+#ifndef arch_atomic64_dec_return_acquire
+static __always_inline s64
+arch_atomic64_dec_return_acquire(atomic64_t *v)
+{
+ return arch_atomic64_sub_return_acquire(1, v);
+}
+#define arch_atomic64_dec_return_acquire arch_atomic64_dec_return_acquire
+#endif
+
+#ifndef arch_atomic64_dec_return_release
+static __always_inline s64
+arch_atomic64_dec_return_release(atomic64_t *v)
+{
+ return arch_atomic64_sub_return_release(1, v);
+}
+#define arch_atomic64_dec_return_release arch_atomic64_dec_return_release
+#endif
+
+#ifndef arch_atomic64_dec_return_relaxed
+static __always_inline s64
+arch_atomic64_dec_return_relaxed(atomic64_t *v)
+{
+ return arch_atomic64_sub_return_relaxed(1, v);
+}
+#define arch_atomic64_dec_return_relaxed arch_atomic64_dec_return_relaxed
+#endif
+
+#else /* arch_atomic64_dec_return_relaxed */
+
+#ifndef arch_atomic64_dec_return_acquire
+static __always_inline s64
+arch_atomic64_dec_return_acquire(atomic64_t *v)
+{
+ s64 ret = arch_atomic64_dec_return_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_dec_return_acquire arch_atomic64_dec_return_acquire
+#endif
+
+#ifndef arch_atomic64_dec_return_release
+static __always_inline s64
+arch_atomic64_dec_return_release(atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_dec_return_relaxed(v);
+}
+#define arch_atomic64_dec_return_release arch_atomic64_dec_return_release
+#endif
+
+#ifndef arch_atomic64_dec_return
+static __always_inline s64
+arch_atomic64_dec_return(atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_dec_return_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_dec_return arch_atomic64_dec_return
+#endif
+
+#endif /* arch_atomic64_dec_return_relaxed */
+
+#ifndef arch_atomic64_fetch_dec_relaxed
+#ifdef arch_atomic64_fetch_dec
+#define arch_atomic64_fetch_dec_acquire arch_atomic64_fetch_dec
+#define arch_atomic64_fetch_dec_release arch_atomic64_fetch_dec
+#define arch_atomic64_fetch_dec_relaxed arch_atomic64_fetch_dec
+#endif /* arch_atomic64_fetch_dec */
+
+#ifndef arch_atomic64_fetch_dec
+static __always_inline s64
+arch_atomic64_fetch_dec(atomic64_t *v)
+{
+ return arch_atomic64_fetch_sub(1, v);
+}
+#define arch_atomic64_fetch_dec arch_atomic64_fetch_dec
+#endif
+
+#ifndef arch_atomic64_fetch_dec_acquire
+static __always_inline s64
+arch_atomic64_fetch_dec_acquire(atomic64_t *v)
+{
+ return arch_atomic64_fetch_sub_acquire(1, v);
+}
+#define arch_atomic64_fetch_dec_acquire arch_atomic64_fetch_dec_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_dec_release
+static __always_inline s64
+arch_atomic64_fetch_dec_release(atomic64_t *v)
+{
+ return arch_atomic64_fetch_sub_release(1, v);
+}
+#define arch_atomic64_fetch_dec_release arch_atomic64_fetch_dec_release
+#endif
+
+#ifndef arch_atomic64_fetch_dec_relaxed
+static __always_inline s64
+arch_atomic64_fetch_dec_relaxed(atomic64_t *v)
+{
+ return arch_atomic64_fetch_sub_relaxed(1, v);
+}
+#define arch_atomic64_fetch_dec_relaxed arch_atomic64_fetch_dec_relaxed
+#endif
+
+#else /* arch_atomic64_fetch_dec_relaxed */
+
+#ifndef arch_atomic64_fetch_dec_acquire
+static __always_inline s64
+arch_atomic64_fetch_dec_acquire(atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_dec_relaxed(v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_dec_acquire arch_atomic64_fetch_dec_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_dec_release
+static __always_inline s64
+arch_atomic64_fetch_dec_release(atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_dec_relaxed(v);
+}
+#define arch_atomic64_fetch_dec_release arch_atomic64_fetch_dec_release
+#endif
+
+#ifndef arch_atomic64_fetch_dec
+static __always_inline s64
+arch_atomic64_fetch_dec(atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_dec_relaxed(v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_dec arch_atomic64_fetch_dec
+#endif
+
+#endif /* arch_atomic64_fetch_dec_relaxed */
+
+#ifndef arch_atomic64_fetch_and_relaxed
+#define arch_atomic64_fetch_and_acquire arch_atomic64_fetch_and
+#define arch_atomic64_fetch_and_release arch_atomic64_fetch_and
+#define arch_atomic64_fetch_and_relaxed arch_atomic64_fetch_and
+#else /* arch_atomic64_fetch_and_relaxed */
+
+#ifndef arch_atomic64_fetch_and_acquire
+static __always_inline s64
+arch_atomic64_fetch_and_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_and_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_and_acquire arch_atomic64_fetch_and_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_and_release
+static __always_inline s64
+arch_atomic64_fetch_and_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_and_relaxed(i, v);
+}
+#define arch_atomic64_fetch_and_release arch_atomic64_fetch_and_release
+#endif
+
+#ifndef arch_atomic64_fetch_and
+static __always_inline s64
+arch_atomic64_fetch_and(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_and_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_and arch_atomic64_fetch_and
+#endif
+
+#endif /* arch_atomic64_fetch_and_relaxed */
+
+#ifndef arch_atomic64_andnot
+static __always_inline void
+arch_atomic64_andnot(s64 i, atomic64_t *v)
+{
+ arch_atomic64_and(~i, v);
+}
+#define arch_atomic64_andnot arch_atomic64_andnot
+#endif
+
+#ifndef arch_atomic64_fetch_andnot_relaxed
+#ifdef arch_atomic64_fetch_andnot
+#define arch_atomic64_fetch_andnot_acquire arch_atomic64_fetch_andnot
+#define arch_atomic64_fetch_andnot_release arch_atomic64_fetch_andnot
+#define arch_atomic64_fetch_andnot_relaxed arch_atomic64_fetch_andnot
+#endif /* arch_atomic64_fetch_andnot */
+
+#ifndef arch_atomic64_fetch_andnot
+static __always_inline s64
+arch_atomic64_fetch_andnot(s64 i, atomic64_t *v)
+{
+ return arch_atomic64_fetch_and(~i, v);
+}
+#define arch_atomic64_fetch_andnot arch_atomic64_fetch_andnot
+#endif
+
+#ifndef arch_atomic64_fetch_andnot_acquire
+static __always_inline s64
+arch_atomic64_fetch_andnot_acquire(s64 i, atomic64_t *v)
+{
+ return arch_atomic64_fetch_and_acquire(~i, v);
+}
+#define arch_atomic64_fetch_andnot_acquire arch_atomic64_fetch_andnot_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_andnot_release
+static __always_inline s64
+arch_atomic64_fetch_andnot_release(s64 i, atomic64_t *v)
+{
+ return arch_atomic64_fetch_and_release(~i, v);
+}
+#define arch_atomic64_fetch_andnot_release arch_atomic64_fetch_andnot_release
+#endif
+
+#ifndef arch_atomic64_fetch_andnot_relaxed
+static __always_inline s64
+arch_atomic64_fetch_andnot_relaxed(s64 i, atomic64_t *v)
+{
+ return arch_atomic64_fetch_and_relaxed(~i, v);
+}
+#define arch_atomic64_fetch_andnot_relaxed arch_atomic64_fetch_andnot_relaxed
+#endif
+
+#else /* arch_atomic64_fetch_andnot_relaxed */
+
+#ifndef arch_atomic64_fetch_andnot_acquire
+static __always_inline s64
+arch_atomic64_fetch_andnot_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_andnot_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_andnot_acquire arch_atomic64_fetch_andnot_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_andnot_release
+static __always_inline s64
+arch_atomic64_fetch_andnot_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_andnot_relaxed(i, v);
+}
+#define arch_atomic64_fetch_andnot_release arch_atomic64_fetch_andnot_release
+#endif
+
+#ifndef arch_atomic64_fetch_andnot
+static __always_inline s64
+arch_atomic64_fetch_andnot(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_andnot_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_andnot arch_atomic64_fetch_andnot
+#endif
+
+#endif /* arch_atomic64_fetch_andnot_relaxed */
+
+#ifndef arch_atomic64_fetch_or_relaxed
+#define arch_atomic64_fetch_or_acquire arch_atomic64_fetch_or
+#define arch_atomic64_fetch_or_release arch_atomic64_fetch_or
+#define arch_atomic64_fetch_or_relaxed arch_atomic64_fetch_or
+#else /* arch_atomic64_fetch_or_relaxed */
+
+#ifndef arch_atomic64_fetch_or_acquire
+static __always_inline s64
+arch_atomic64_fetch_or_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_or_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_or_acquire arch_atomic64_fetch_or_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_or_release
+static __always_inline s64
+arch_atomic64_fetch_or_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_or_relaxed(i, v);
+}
+#define arch_atomic64_fetch_or_release arch_atomic64_fetch_or_release
+#endif
+
+#ifndef arch_atomic64_fetch_or
+static __always_inline s64
+arch_atomic64_fetch_or(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_or_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_or arch_atomic64_fetch_or
+#endif
+
+#endif /* arch_atomic64_fetch_or_relaxed */
+
+#ifndef arch_atomic64_fetch_xor_relaxed
+#define arch_atomic64_fetch_xor_acquire arch_atomic64_fetch_xor
+#define arch_atomic64_fetch_xor_release arch_atomic64_fetch_xor
+#define arch_atomic64_fetch_xor_relaxed arch_atomic64_fetch_xor
+#else /* arch_atomic64_fetch_xor_relaxed */
+
+#ifndef arch_atomic64_fetch_xor_acquire
+static __always_inline s64
+arch_atomic64_fetch_xor_acquire(s64 i, atomic64_t *v)
+{
+ s64 ret = arch_atomic64_fetch_xor_relaxed(i, v);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_xor_acquire arch_atomic64_fetch_xor_acquire
+#endif
+
+#ifndef arch_atomic64_fetch_xor_release
+static __always_inline s64
+arch_atomic64_fetch_xor_release(s64 i, atomic64_t *v)
+{
+ __atomic_release_fence();
+ return arch_atomic64_fetch_xor_relaxed(i, v);
+}
+#define arch_atomic64_fetch_xor_release arch_atomic64_fetch_xor_release
+#endif
+
+#ifndef arch_atomic64_fetch_xor
+static __always_inline s64
+arch_atomic64_fetch_xor(s64 i, atomic64_t *v)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_fetch_xor_relaxed(i, v);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_fetch_xor arch_atomic64_fetch_xor
+#endif
+
+#endif /* arch_atomic64_fetch_xor_relaxed */
+
+#ifndef arch_atomic64_xchg_relaxed
+#define arch_atomic64_xchg_acquire arch_atomic64_xchg
+#define arch_atomic64_xchg_release arch_atomic64_xchg
+#define arch_atomic64_xchg_relaxed arch_atomic64_xchg
+#else /* arch_atomic64_xchg_relaxed */
+
+#ifndef arch_atomic64_xchg_acquire
+static __always_inline s64
+arch_atomic64_xchg_acquire(atomic64_t *v, s64 i)
+{
+ s64 ret = arch_atomic64_xchg_relaxed(v, i);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_xchg_acquire arch_atomic64_xchg_acquire
+#endif
+
+#ifndef arch_atomic64_xchg_release
+static __always_inline s64
+arch_atomic64_xchg_release(atomic64_t *v, s64 i)
+{
+ __atomic_release_fence();
+ return arch_atomic64_xchg_relaxed(v, i);
+}
+#define arch_atomic64_xchg_release arch_atomic64_xchg_release
+#endif
+
+#ifndef arch_atomic64_xchg
+static __always_inline s64
+arch_atomic64_xchg(atomic64_t *v, s64 i)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_xchg_relaxed(v, i);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_xchg arch_atomic64_xchg
+#endif
+
+#endif /* arch_atomic64_xchg_relaxed */
+
+#ifndef arch_atomic64_cmpxchg_relaxed
+#define arch_atomic64_cmpxchg_acquire arch_atomic64_cmpxchg
+#define arch_atomic64_cmpxchg_release arch_atomic64_cmpxchg
+#define arch_atomic64_cmpxchg_relaxed arch_atomic64_cmpxchg
+#else /* arch_atomic64_cmpxchg_relaxed */
+
+#ifndef arch_atomic64_cmpxchg_acquire
+static __always_inline s64
+arch_atomic64_cmpxchg_acquire(atomic64_t *v, s64 old, s64 new)
+{
+ s64 ret = arch_atomic64_cmpxchg_relaxed(v, old, new);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_cmpxchg_acquire arch_atomic64_cmpxchg_acquire
+#endif
+
+#ifndef arch_atomic64_cmpxchg_release
+static __always_inline s64
+arch_atomic64_cmpxchg_release(atomic64_t *v, s64 old, s64 new)
+{
+ __atomic_release_fence();
+ return arch_atomic64_cmpxchg_relaxed(v, old, new);
+}
+#define arch_atomic64_cmpxchg_release arch_atomic64_cmpxchg_release
+#endif
+
+#ifndef arch_atomic64_cmpxchg
+static __always_inline s64
+arch_atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
+{
+ s64 ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_cmpxchg_relaxed(v, old, new);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_cmpxchg arch_atomic64_cmpxchg
+#endif
+
+#endif /* arch_atomic64_cmpxchg_relaxed */
+
+#ifndef arch_atomic64_try_cmpxchg_relaxed
+#ifdef arch_atomic64_try_cmpxchg
+#define arch_atomic64_try_cmpxchg_acquire arch_atomic64_try_cmpxchg
+#define arch_atomic64_try_cmpxchg_release arch_atomic64_try_cmpxchg
+#define arch_atomic64_try_cmpxchg_relaxed arch_atomic64_try_cmpxchg
+#endif /* arch_atomic64_try_cmpxchg */
+
+#ifndef arch_atomic64_try_cmpxchg
+static __always_inline bool
+arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
+{
+ s64 r, o = *old;
+ r = arch_atomic64_cmpxchg(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
+#endif
+
+#ifndef arch_atomic64_try_cmpxchg_acquire
+static __always_inline bool
+arch_atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new)
+{
+ s64 r, o = *old;
+ r = arch_atomic64_cmpxchg_acquire(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic64_try_cmpxchg_acquire arch_atomic64_try_cmpxchg_acquire
+#endif
+
+#ifndef arch_atomic64_try_cmpxchg_release
+static __always_inline bool
+arch_atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new)
+{
+ s64 r, o = *old;
+ r = arch_atomic64_cmpxchg_release(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic64_try_cmpxchg_release arch_atomic64_try_cmpxchg_release
+#endif
+
+#ifndef arch_atomic64_try_cmpxchg_relaxed
+static __always_inline bool
+arch_atomic64_try_cmpxchg_relaxed(atomic64_t *v, s64 *old, s64 new)
+{
+ s64 r, o = *old;
+ r = arch_atomic64_cmpxchg_relaxed(v, o, new);
+ if (unlikely(r != o))
+ *old = r;
+ return likely(r == o);
+}
+#define arch_atomic64_try_cmpxchg_relaxed arch_atomic64_try_cmpxchg_relaxed
+#endif
+
+#else /* arch_atomic64_try_cmpxchg_relaxed */
+
+#ifndef arch_atomic64_try_cmpxchg_acquire
+static __always_inline bool
+arch_atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new)
+{
+ bool ret = arch_atomic64_try_cmpxchg_relaxed(v, old, new);
+ __atomic_acquire_fence();
+ return ret;
+}
+#define arch_atomic64_try_cmpxchg_acquire arch_atomic64_try_cmpxchg_acquire
+#endif
+
+#ifndef arch_atomic64_try_cmpxchg_release
+static __always_inline bool
+arch_atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new)
+{
+ __atomic_release_fence();
+ return arch_atomic64_try_cmpxchg_relaxed(v, old, new);
+}
+#define arch_atomic64_try_cmpxchg_release arch_atomic64_try_cmpxchg_release
+#endif
+
+#ifndef arch_atomic64_try_cmpxchg
+static __always_inline bool
+arch_atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
+{
+ bool ret;
+ __atomic_pre_full_fence();
+ ret = arch_atomic64_try_cmpxchg_relaxed(v, old, new);
+ __atomic_post_full_fence();
+ return ret;
+}
+#define arch_atomic64_try_cmpxchg arch_atomic64_try_cmpxchg
+#endif
+
+#endif /* arch_atomic64_try_cmpxchg_relaxed */
+
+#ifndef arch_atomic64_sub_and_test
+/**
+ * arch_atomic64_sub_and_test - subtract value from variable and test result
+ * @i: integer value to subtract
+ * @v: pointer of type atomic64_t
+ *
+ * Atomically subtracts @i from @v and returns
+ * true if the result is zero, or false for all
+ * other cases.
+ */
+static __always_inline bool
+arch_atomic64_sub_and_test(s64 i, atomic64_t *v)
+{
+ return arch_atomic64_sub_return(i, v) == 0;
+}
+#define arch_atomic64_sub_and_test arch_atomic64_sub_and_test
+#endif
+
+#ifndef arch_atomic64_dec_and_test
+/**
+ * arch_atomic64_dec_and_test - decrement and test
+ * @v: pointer of type atomic64_t
+ *
+ * Atomically decrements @v by 1 and
+ * returns true if the result is 0, or false for all other
+ * cases.
+ */
+static __always_inline bool
+arch_atomic64_dec_and_test(atomic64_t *v)
+{
+ return arch_atomic64_dec_return(v) == 0;
+}
+#define arch_atomic64_dec_and_test arch_atomic64_dec_and_test
+#endif
+
+#ifndef arch_atomic64_inc_and_test
+/**
+ * arch_atomic64_inc_and_test - increment and test
+ * @v: pointer of type atomic64_t
+ *
+ * Atomically increments @v by 1
+ * and returns true if the result is zero, or false for all
+ * other cases.
+ */
+static __always_inline bool
+arch_atomic64_inc_and_test(atomic64_t *v)
+{
+ return arch_atomic64_inc_return(v) == 0;
+}
+#define arch_atomic64_inc_and_test arch_atomic64_inc_and_test
+#endif
+
+#ifndef arch_atomic64_add_negative
+/**
+ * arch_atomic64_add_negative - add and test if negative
+ * @i: integer value to add
+ * @v: pointer of type atomic64_t
+ *
+ * Atomically adds @i to @v and returns true
+ * if the result is negative, or false when
+ * result is greater than or equal to zero.
+ */
+static __always_inline bool
+arch_atomic64_add_negative(s64 i, atomic64_t *v)
+{
+ return arch_atomic64_add_return(i, v) < 0;
+}
+#define arch_atomic64_add_negative arch_atomic64_add_negative
+#endif
+
+#ifndef arch_atomic64_fetch_add_unless
+/**
+ * arch_atomic64_fetch_add_unless - add unless the number is already a given value
+ * @v: pointer of type atomic64_t
+ * @a: the amount to add to v...
+ * @u: ...unless v is equal to u.
+ *
+ * Atomically adds @a to @v, so long as @v was not already @u.
+ * Returns original value of @v
+ */
+static __always_inline s64
+arch_atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
+{
+ s64 c = arch_atomic64_read(v);
+
+ do {
+ if (unlikely(c == u))
+ break;
+ } while (!arch_atomic64_try_cmpxchg(v, &c, c + a));
+
+ return c;
+}
+#define arch_atomic64_fetch_add_unless arch_atomic64_fetch_add_unless
+#endif
+
+#ifndef arch_atomic64_add_unless
+/**
+ * arch_atomic64_add_unless - add unless the number is already a given value
+ * @v: pointer of type atomic64_t
+ * @a: the amount to add to v...
+ * @u: ...unless v is equal to u.
+ *
+ * Atomically adds @a to @v, if @v was not already @u.
+ * Returns true if the addition was done.
+ */
+static __always_inline bool
+arch_atomic64_add_unless(atomic64_t *v, s64 a, s64 u)
+{
+ return arch_atomic64_fetch_add_unless(v, a, u) != u;
+}
+#define arch_atomic64_add_unless arch_atomic64_add_unless
+#endif
+
+#ifndef arch_atomic64_inc_not_zero
+/**
+ * arch_atomic64_inc_not_zero - increment unless the number is zero
+ * @v: pointer of type atomic64_t
+ *
+ * Atomically increments @v by 1, if @v is non-zero.
+ * Returns true if the increment was done.
+ */
+static __always_inline bool
+arch_atomic64_inc_not_zero(atomic64_t *v)
+{
+ return arch_atomic64_add_unless(v, 1, 0);
+}
+#define arch_atomic64_inc_not_zero arch_atomic64_inc_not_zero
+#endif
+
+#ifndef arch_atomic64_inc_unless_negative
+static __always_inline bool
+arch_atomic64_inc_unless_negative(atomic64_t *v)
+{
+ s64 c = arch_atomic64_read(v);
+
+ do {
+ if (unlikely(c < 0))
+ return false;
+ } while (!arch_atomic64_try_cmpxchg(v, &c, c + 1));
+
+ return true;
+}
+#define arch_atomic64_inc_unless_negative arch_atomic64_inc_unless_negative
+#endif
+
+#ifndef arch_atomic64_dec_unless_positive
+static __always_inline bool
+arch_atomic64_dec_unless_positive(atomic64_t *v)
+{
+ s64 c = arch_atomic64_read(v);
+
+ do {
+ if (unlikely(c > 0))
+ return false;
+ } while (!arch_atomic64_try_cmpxchg(v, &c, c - 1));
+
+ return true;
+}
+#define arch_atomic64_dec_unless_positive arch_atomic64_dec_unless_positive
+#endif
+
+#ifndef arch_atomic64_dec_if_positive
+static __always_inline s64
+arch_atomic64_dec_if_positive(atomic64_t *v)
+{
+ s64 dec, c = arch_atomic64_read(v);
+
+ do {
+ dec = c - 1;
+ if (unlikely(dec < 0))
+ break;
+ } while (!arch_atomic64_try_cmpxchg(v, &c, dec));
+
+ return dec;
+}
+#define arch_atomic64_dec_if_positive arch_atomic64_dec_if_positive
+#endif
+
+#endif /* _LINUX_ATOMIC_FALLBACK_H */
+// cca554917d7ea73d5e3e7397dd70c484cad9b2c4
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+// Generated by scripts/atomic/gen-atomic-instrumented.sh
+// DO NOT MODIFY THIS FILE DIRECTLY
+
+/*
+ * This file provides wrappers with KASAN instrumentation for atomic operations.
+ * To use this functionality an arch's atomic.h file needs to define all
+ * atomic operations with arch_ prefix (e.g. arch_atomic_read()) and include
+ * this file at the end. This file provides atomic_read() that forwards to
+ * arch_atomic_read() for actual atomic operation.
+ * Note: if an arch atomic operation is implemented by means of other atomic
+ * operations (e.g. atomic_read()/atomic_cmpxchg() loop), then it needs to use
+ * arch_ variants (i.e. arch_atomic_read()/arch_atomic_cmpxchg()) to avoid
+ * double instrumentation.
+ */
+#ifndef _LINUX_ATOMIC_INSTRUMENTED_H
+#define _LINUX_ATOMIC_INSTRUMENTED_H
+
+#include <linux/build_bug.h>
+#include <linux/compiler.h>
+#include <linux/instrumented.h>
+
+static __always_inline int
+atomic_read(const atomic_t *v)
+{
+ instrument_atomic_read(v, sizeof(*v));
+ return arch_atomic_read(v);
+}
+
+static __always_inline int
+atomic_read_acquire(const atomic_t *v)
+{
+ instrument_atomic_read(v, sizeof(*v));
+ return arch_atomic_read_acquire(v);
+}
+
+static __always_inline void
+atomic_set(atomic_t *v, int i)
+{
+ instrument_atomic_write(v, sizeof(*v));
+ arch_atomic_set(v, i);
+}
+
+static __always_inline void
+atomic_set_release(atomic_t *v, int i)
+{
+ instrument_atomic_write(v, sizeof(*v));
+ arch_atomic_set_release(v, i);
+}
+
+static __always_inline void
+atomic_add(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_add(i, v);
+}
+
+static __always_inline int
+atomic_add_return(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_add_return(i, v);
+}
+
+static __always_inline int
+atomic_add_return_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_add_return_acquire(i, v);
+}
+
+static __always_inline int
+atomic_add_return_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_add_return_release(i, v);
+}
+
+static __always_inline int
+atomic_add_return_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_add_return_relaxed(i, v);
+}
+
+static __always_inline int
+atomic_fetch_add(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_add(i, v);
+}
+
+static __always_inline int
+atomic_fetch_add_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_add_acquire(i, v);
+}
+
+static __always_inline int
+atomic_fetch_add_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_add_release(i, v);
+}
+
+static __always_inline int
+atomic_fetch_add_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_add_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_sub(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_sub(i, v);
+}
+
+static __always_inline int
+atomic_sub_return(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_sub_return(i, v);
+}
+
+static __always_inline int
+atomic_sub_return_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_sub_return_acquire(i, v);
+}
+
+static __always_inline int
+atomic_sub_return_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_sub_return_release(i, v);
+}
+
+static __always_inline int
+atomic_sub_return_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_sub_return_relaxed(i, v);
+}
+
+static __always_inline int
+atomic_fetch_sub(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_sub(i, v);
+}
+
+static __always_inline int
+atomic_fetch_sub_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_sub_acquire(i, v);
+}
+
+static __always_inline int
+atomic_fetch_sub_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_sub_release(i, v);
+}
+
+static __always_inline int
+atomic_fetch_sub_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_sub_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_inc(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_inc(v);
+}
+
+static __always_inline int
+atomic_inc_return(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_inc_return(v);
+}
+
+static __always_inline int
+atomic_inc_return_acquire(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_inc_return_acquire(v);
+}
+
+static __always_inline int
+atomic_inc_return_release(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_inc_return_release(v);
+}
+
+static __always_inline int
+atomic_inc_return_relaxed(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_inc_return_relaxed(v);
+}
+
+static __always_inline int
+atomic_fetch_inc(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_inc(v);
+}
+
+static __always_inline int
+atomic_fetch_inc_acquire(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_inc_acquire(v);
+}
+
+static __always_inline int
+atomic_fetch_inc_release(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_inc_release(v);
+}
+
+static __always_inline int
+atomic_fetch_inc_relaxed(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_inc_relaxed(v);
+}
+
+static __always_inline void
+atomic_dec(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_dec(v);
+}
+
+static __always_inline int
+atomic_dec_return(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_dec_return(v);
+}
+
+static __always_inline int
+atomic_dec_return_acquire(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_dec_return_acquire(v);
+}
+
+static __always_inline int
+atomic_dec_return_release(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_dec_return_release(v);
+}
+
+static __always_inline int
+atomic_dec_return_relaxed(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_dec_return_relaxed(v);
+}
+
+static __always_inline int
+atomic_fetch_dec(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_dec(v);
+}
+
+static __always_inline int
+atomic_fetch_dec_acquire(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_dec_acquire(v);
+}
+
+static __always_inline int
+atomic_fetch_dec_release(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_dec_release(v);
+}
+
+static __always_inline int
+atomic_fetch_dec_relaxed(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_dec_relaxed(v);
+}
+
+static __always_inline void
+atomic_and(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_and(i, v);
+}
+
+static __always_inline int
+atomic_fetch_and(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_and(i, v);
+}
+
+static __always_inline int
+atomic_fetch_and_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_and_acquire(i, v);
+}
+
+static __always_inline int
+atomic_fetch_and_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_and_release(i, v);
+}
+
+static __always_inline int
+atomic_fetch_and_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_and_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_andnot(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_andnot(i, v);
+}
+
+static __always_inline int
+atomic_fetch_andnot(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_andnot(i, v);
+}
+
+static __always_inline int
+atomic_fetch_andnot_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_andnot_acquire(i, v);
+}
+
+static __always_inline int
+atomic_fetch_andnot_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_andnot_release(i, v);
+}
+
+static __always_inline int
+atomic_fetch_andnot_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_andnot_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_or(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_or(i, v);
+}
+
+static __always_inline int
+atomic_fetch_or(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_or(i, v);
+}
+
+static __always_inline int
+atomic_fetch_or_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_or_acquire(i, v);
+}
+
+static __always_inline int
+atomic_fetch_or_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_or_release(i, v);
+}
+
+static __always_inline int
+atomic_fetch_or_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_or_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_xor(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_xor(i, v);
+}
+
+static __always_inline int
+atomic_fetch_xor(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_xor(i, v);
+}
+
+static __always_inline int
+atomic_fetch_xor_acquire(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_xor_acquire(i, v);
+}
+
+static __always_inline int
+atomic_fetch_xor_release(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_xor_release(i, v);
+}
+
+static __always_inline int
+atomic_fetch_xor_relaxed(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_xor_relaxed(i, v);
+}
+
+static __always_inline int
+atomic_xchg(atomic_t *v, int i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_xchg(v, i);
+}
+
+static __always_inline int
+atomic_xchg_acquire(atomic_t *v, int i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_xchg_acquire(v, i);
+}
+
+static __always_inline int
+atomic_xchg_release(atomic_t *v, int i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_xchg_release(v, i);
+}
+
+static __always_inline int
+atomic_xchg_relaxed(atomic_t *v, int i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_xchg_relaxed(v, i);
+}
+
+static __always_inline int
+atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_cmpxchg(v, old, new);
+}
+
+static __always_inline int
+atomic_cmpxchg_acquire(atomic_t *v, int old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline int
+atomic_cmpxchg_release(atomic_t *v, int old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_cmpxchg_release(v, old, new);
+}
+
+static __always_inline int
+atomic_cmpxchg_relaxed(atomic_t *v, int old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+atomic_try_cmpxchg(atomic_t *v, int *old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_try_cmpxchg(v, old, new);
+}
+
+static __always_inline bool
+atomic_try_cmpxchg_acquire(atomic_t *v, int *old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_try_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline bool
+atomic_try_cmpxchg_release(atomic_t *v, int *old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_try_cmpxchg_release(v, old, new);
+}
+
+static __always_inline bool
+atomic_try_cmpxchg_relaxed(atomic_t *v, int *old, int new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_try_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+atomic_sub_and_test(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_sub_and_test(i, v);
+}
+
+static __always_inline bool
+atomic_dec_and_test(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_dec_and_test(v);
+}
+
+static __always_inline bool
+atomic_inc_and_test(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_inc_and_test(v);
+}
+
+static __always_inline bool
+atomic_add_negative(int i, atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_add_negative(i, v);
+}
+
+static __always_inline int
+atomic_fetch_add_unless(atomic_t *v, int a, int u)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_fetch_add_unless(v, a, u);
+}
+
+static __always_inline bool
+atomic_add_unless(atomic_t *v, int a, int u)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_add_unless(v, a, u);
+}
+
+static __always_inline bool
+atomic_inc_not_zero(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_inc_not_zero(v);
+}
+
+static __always_inline bool
+atomic_inc_unless_negative(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_inc_unless_negative(v);
+}
+
+static __always_inline bool
+atomic_dec_unless_positive(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_dec_unless_positive(v);
+}
+
+static __always_inline int
+atomic_dec_if_positive(atomic_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_dec_if_positive(v);
+}
+
+static __always_inline s64
+atomic64_read(const atomic64_t *v)
+{
+ instrument_atomic_read(v, sizeof(*v));
+ return arch_atomic64_read(v);
+}
+
+static __always_inline s64
+atomic64_read_acquire(const atomic64_t *v)
+{
+ instrument_atomic_read(v, sizeof(*v));
+ return arch_atomic64_read_acquire(v);
+}
+
+static __always_inline void
+atomic64_set(atomic64_t *v, s64 i)
+{
+ instrument_atomic_write(v, sizeof(*v));
+ arch_atomic64_set(v, i);
+}
+
+static __always_inline void
+atomic64_set_release(atomic64_t *v, s64 i)
+{
+ instrument_atomic_write(v, sizeof(*v));
+ arch_atomic64_set_release(v, i);
+}
+
+static __always_inline void
+atomic64_add(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_add(i, v);
+}
+
+static __always_inline s64
+atomic64_add_return(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_add_return(i, v);
+}
+
+static __always_inline s64
+atomic64_add_return_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_add_return_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_add_return_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_add_return_release(i, v);
+}
+
+static __always_inline s64
+atomic64_add_return_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_add_return_relaxed(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_add(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_add(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_add_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_add_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_add_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_add_release(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_add_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_add_relaxed(i, v);
+}
+
+static __always_inline void
+atomic64_sub(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_sub(i, v);
+}
+
+static __always_inline s64
+atomic64_sub_return(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_sub_return(i, v);
+}
+
+static __always_inline s64
+atomic64_sub_return_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_sub_return_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_sub_return_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_sub_return_release(i, v);
+}
+
+static __always_inline s64
+atomic64_sub_return_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_sub_return_relaxed(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_sub(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_sub(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_sub_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_sub_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_sub_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_sub_release(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_sub_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_sub_relaxed(i, v);
+}
+
+static __always_inline void
+atomic64_inc(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_inc(v);
+}
+
+static __always_inline s64
+atomic64_inc_return(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_inc_return(v);
+}
+
+static __always_inline s64
+atomic64_inc_return_acquire(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_inc_return_acquire(v);
+}
+
+static __always_inline s64
+atomic64_inc_return_release(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_inc_return_release(v);
+}
+
+static __always_inline s64
+atomic64_inc_return_relaxed(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_inc_return_relaxed(v);
+}
+
+static __always_inline s64
+atomic64_fetch_inc(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_inc(v);
+}
+
+static __always_inline s64
+atomic64_fetch_inc_acquire(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_inc_acquire(v);
+}
+
+static __always_inline s64
+atomic64_fetch_inc_release(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_inc_release(v);
+}
+
+static __always_inline s64
+atomic64_fetch_inc_relaxed(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_inc_relaxed(v);
+}
+
+static __always_inline void
+atomic64_dec(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_dec(v);
+}
+
+static __always_inline s64
+atomic64_dec_return(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_dec_return(v);
+}
+
+static __always_inline s64
+atomic64_dec_return_acquire(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_dec_return_acquire(v);
+}
+
+static __always_inline s64
+atomic64_dec_return_release(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_dec_return_release(v);
+}
+
+static __always_inline s64
+atomic64_dec_return_relaxed(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_dec_return_relaxed(v);
+}
+
+static __always_inline s64
+atomic64_fetch_dec(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_dec(v);
+}
+
+static __always_inline s64
+atomic64_fetch_dec_acquire(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_dec_acquire(v);
+}
+
+static __always_inline s64
+atomic64_fetch_dec_release(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_dec_release(v);
+}
+
+static __always_inline s64
+atomic64_fetch_dec_relaxed(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_dec_relaxed(v);
+}
+
+static __always_inline void
+atomic64_and(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_and(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_and(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_and(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_and_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_and_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_and_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_and_release(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_and_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_and_relaxed(i, v);
+}
+
+static __always_inline void
+atomic64_andnot(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_andnot(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_andnot(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_andnot(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_andnot_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_andnot_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_andnot_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_andnot_release(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_andnot_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_andnot_relaxed(i, v);
+}
+
+static __always_inline void
+atomic64_or(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_or(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_or(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_or(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_or_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_or_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_or_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_or_release(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_or_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_or_relaxed(i, v);
+}
+
+static __always_inline void
+atomic64_xor(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic64_xor(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_xor(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_xor(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_xor_acquire(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_xor_acquire(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_xor_release(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_xor_release(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_xor_relaxed(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_xor_relaxed(i, v);
+}
+
+static __always_inline s64
+atomic64_xchg(atomic64_t *v, s64 i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_xchg(v, i);
+}
+
+static __always_inline s64
+atomic64_xchg_acquire(atomic64_t *v, s64 i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_xchg_acquire(v, i);
+}
+
+static __always_inline s64
+atomic64_xchg_release(atomic64_t *v, s64 i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_xchg_release(v, i);
+}
+
+static __always_inline s64
+atomic64_xchg_relaxed(atomic64_t *v, s64 i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_xchg_relaxed(v, i);
+}
+
+static __always_inline s64
+atomic64_cmpxchg(atomic64_t *v, s64 old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_cmpxchg(v, old, new);
+}
+
+static __always_inline s64
+atomic64_cmpxchg_acquire(atomic64_t *v, s64 old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline s64
+atomic64_cmpxchg_release(atomic64_t *v, s64 old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_cmpxchg_release(v, old, new);
+}
+
+static __always_inline s64
+atomic64_cmpxchg_relaxed(atomic64_t *v, s64 old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic64_try_cmpxchg(v, old, new);
+}
+
+static __always_inline bool
+atomic64_try_cmpxchg_acquire(atomic64_t *v, s64 *old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic64_try_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline bool
+atomic64_try_cmpxchg_release(atomic64_t *v, s64 *old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic64_try_cmpxchg_release(v, old, new);
+}
+
+static __always_inline bool
+atomic64_try_cmpxchg_relaxed(atomic64_t *v, s64 *old, s64 new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic64_try_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+atomic64_sub_and_test(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_sub_and_test(i, v);
+}
+
+static __always_inline bool
+atomic64_dec_and_test(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_dec_and_test(v);
+}
+
+static __always_inline bool
+atomic64_inc_and_test(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_inc_and_test(v);
+}
+
+static __always_inline bool
+atomic64_add_negative(s64 i, atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_add_negative(i, v);
+}
+
+static __always_inline s64
+atomic64_fetch_add_unless(atomic64_t *v, s64 a, s64 u)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_fetch_add_unless(v, a, u);
+}
+
+static __always_inline bool
+atomic64_add_unless(atomic64_t *v, s64 a, s64 u)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_add_unless(v, a, u);
+}
+
+static __always_inline bool
+atomic64_inc_not_zero(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_inc_not_zero(v);
+}
+
+static __always_inline bool
+atomic64_inc_unless_negative(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_inc_unless_negative(v);
+}
+
+static __always_inline bool
+atomic64_dec_unless_positive(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_dec_unless_positive(v);
+}
+
+static __always_inline s64
+atomic64_dec_if_positive(atomic64_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic64_dec_if_positive(v);
+}
+
+static __always_inline long
+atomic_long_read(const atomic_long_t *v)
+{
+ instrument_atomic_read(v, sizeof(*v));
+ return arch_atomic_long_read(v);
+}
+
+static __always_inline long
+atomic_long_read_acquire(const atomic_long_t *v)
+{
+ instrument_atomic_read(v, sizeof(*v));
+ return arch_atomic_long_read_acquire(v);
+}
+
+static __always_inline void
+atomic_long_set(atomic_long_t *v, long i)
+{
+ instrument_atomic_write(v, sizeof(*v));
+ arch_atomic_long_set(v, i);
+}
+
+static __always_inline void
+atomic_long_set_release(atomic_long_t *v, long i)
+{
+ instrument_atomic_write(v, sizeof(*v));
+ arch_atomic_long_set_release(v, i);
+}
+
+static __always_inline void
+atomic_long_add(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_add(i, v);
+}
+
+static __always_inline long
+atomic_long_add_return(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_add_return(i, v);
+}
+
+static __always_inline long
+atomic_long_add_return_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_add_return_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_add_return_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_add_return_release(i, v);
+}
+
+static __always_inline long
+atomic_long_add_return_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_add_return_relaxed(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_add(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_add(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_add_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_add_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_add_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_add_release(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_add_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_add_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_long_sub(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_sub(i, v);
+}
+
+static __always_inline long
+atomic_long_sub_return(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_sub_return(i, v);
+}
+
+static __always_inline long
+atomic_long_sub_return_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_sub_return_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_sub_return_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_sub_return_release(i, v);
+}
+
+static __always_inline long
+atomic_long_sub_return_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_sub_return_relaxed(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_sub(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_sub(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_sub_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_sub_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_sub_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_sub_release(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_sub_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_sub_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_long_inc(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_inc(v);
+}
+
+static __always_inline long
+atomic_long_inc_return(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_inc_return(v);
+}
+
+static __always_inline long
+atomic_long_inc_return_acquire(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_inc_return_acquire(v);
+}
+
+static __always_inline long
+atomic_long_inc_return_release(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_inc_return_release(v);
+}
+
+static __always_inline long
+atomic_long_inc_return_relaxed(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_inc_return_relaxed(v);
+}
+
+static __always_inline long
+atomic_long_fetch_inc(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_inc(v);
+}
+
+static __always_inline long
+atomic_long_fetch_inc_acquire(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_inc_acquire(v);
+}
+
+static __always_inline long
+atomic_long_fetch_inc_release(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_inc_release(v);
+}
+
+static __always_inline long
+atomic_long_fetch_inc_relaxed(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_inc_relaxed(v);
+}
+
+static __always_inline void
+atomic_long_dec(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_dec(v);
+}
+
+static __always_inline long
+atomic_long_dec_return(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_dec_return(v);
+}
+
+static __always_inline long
+atomic_long_dec_return_acquire(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_dec_return_acquire(v);
+}
+
+static __always_inline long
+atomic_long_dec_return_release(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_dec_return_release(v);
+}
+
+static __always_inline long
+atomic_long_dec_return_relaxed(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_dec_return_relaxed(v);
+}
+
+static __always_inline long
+atomic_long_fetch_dec(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_dec(v);
+}
+
+static __always_inline long
+atomic_long_fetch_dec_acquire(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_dec_acquire(v);
+}
+
+static __always_inline long
+atomic_long_fetch_dec_release(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_dec_release(v);
+}
+
+static __always_inline long
+atomic_long_fetch_dec_relaxed(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_dec_relaxed(v);
+}
+
+static __always_inline void
+atomic_long_and(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_and(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_and(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_and(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_and_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_and_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_and_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_and_release(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_and_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_and_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_long_andnot(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_andnot(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_andnot(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_andnot(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_andnot_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_andnot_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_andnot_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_andnot_release(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_andnot_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_andnot_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_long_or(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_or(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_or(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_or(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_or_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_or_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_or_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_or_release(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_or_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_or_relaxed(i, v);
+}
+
+static __always_inline void
+atomic_long_xor(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ arch_atomic_long_xor(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_xor(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_xor(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_xor_acquire(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_xor_acquire(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_xor_release(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_xor_release(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_xor_relaxed(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_xor_relaxed(i, v);
+}
+
+static __always_inline long
+atomic_long_xchg(atomic_long_t *v, long i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_xchg(v, i);
+}
+
+static __always_inline long
+atomic_long_xchg_acquire(atomic_long_t *v, long i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_xchg_acquire(v, i);
+}
+
+static __always_inline long
+atomic_long_xchg_release(atomic_long_t *v, long i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_xchg_release(v, i);
+}
+
+static __always_inline long
+atomic_long_xchg_relaxed(atomic_long_t *v, long i)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_xchg_relaxed(v, i);
+}
+
+static __always_inline long
+atomic_long_cmpxchg(atomic_long_t *v, long old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_cmpxchg(v, old, new);
+}
+
+static __always_inline long
+atomic_long_cmpxchg_acquire(atomic_long_t *v, long old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline long
+atomic_long_cmpxchg_release(atomic_long_t *v, long old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_cmpxchg_release(v, old, new);
+}
+
+static __always_inline long
+atomic_long_cmpxchg_relaxed(atomic_long_t *v, long old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_long_try_cmpxchg(v, old, new);
+}
+
+static __always_inline bool
+atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_long_try_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline bool
+atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_long_try_cmpxchg_release(v, old, new);
+}
+
+static __always_inline bool
+atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ instrument_atomic_read_write(old, sizeof(*old));
+ return arch_atomic_long_try_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+atomic_long_sub_and_test(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_sub_and_test(i, v);
+}
+
+static __always_inline bool
+atomic_long_dec_and_test(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_dec_and_test(v);
+}
+
+static __always_inline bool
+atomic_long_inc_and_test(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_inc_and_test(v);
+}
+
+static __always_inline bool
+atomic_long_add_negative(long i, atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_add_negative(i, v);
+}
+
+static __always_inline long
+atomic_long_fetch_add_unless(atomic_long_t *v, long a, long u)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_fetch_add_unless(v, a, u);
+}
+
+static __always_inline bool
+atomic_long_add_unless(atomic_long_t *v, long a, long u)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_add_unless(v, a, u);
+}
+
+static __always_inline bool
+atomic_long_inc_not_zero(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_inc_not_zero(v);
+}
+
+static __always_inline bool
+atomic_long_inc_unless_negative(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_inc_unless_negative(v);
+}
+
+static __always_inline bool
+atomic_long_dec_unless_positive(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_dec_unless_positive(v);
+}
+
+static __always_inline long
+atomic_long_dec_if_positive(atomic_long_t *v)
+{
+ instrument_atomic_read_write(v, sizeof(*v));
+ return arch_atomic_long_dec_if_positive(v);
+}
+
+#define xchg(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg(__ai_ptr, __VA_ARGS__); \
+})
+
+#define xchg_acquire(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg_acquire(__ai_ptr, __VA_ARGS__); \
+})
+
+#define xchg_release(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg_release(__ai_ptr, __VA_ARGS__); \
+})
+
+#define xchg_relaxed(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_xchg_relaxed(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg_acquire(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_acquire(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg_release(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_release(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg_relaxed(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_relaxed(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg64(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg64_acquire(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_acquire(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg64_release(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_release(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg64_relaxed(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \
+})
+
+#define try_cmpxchg(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg_acquire(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg_acquire(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg_release(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg_release(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg_relaxed(ptr, oldp, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ typeof(oldp) __ai_oldp = (oldp); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+ arch_try_cmpxchg_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define cmpxchg_local(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg_local(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg64_local(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \
+})
+
+#define sync_cmpxchg(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+ arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg_double(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
+ arch_cmpxchg_double(__ai_ptr, __VA_ARGS__); \
+})
+
+
+#define cmpxchg_double_local(ptr, ...) \
+({ \
+ typeof(ptr) __ai_ptr = (ptr); \
+ instrument_atomic_write(__ai_ptr, 2 * sizeof(*__ai_ptr)); \
+ arch_cmpxchg_double_local(__ai_ptr, __VA_ARGS__); \
+})
+
+#endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
+// 2a9553f0a9d5619f19151092df5cabbbf16ce835
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+// Generated by scripts/atomic/gen-atomic-long.sh
+// DO NOT MODIFY THIS FILE DIRECTLY
+
+#ifndef _LINUX_ATOMIC_LONG_H
+#define _LINUX_ATOMIC_LONG_H
+
+#include <linux/compiler.h>
+#include <asm/types.h>
+
+#ifdef CONFIG_64BIT
+typedef atomic64_t atomic_long_t;
+#define ATOMIC_LONG_INIT(i) ATOMIC64_INIT(i)
+#define atomic_long_cond_read_acquire atomic64_cond_read_acquire
+#define atomic_long_cond_read_relaxed atomic64_cond_read_relaxed
+#else
+typedef atomic_t atomic_long_t;
+#define ATOMIC_LONG_INIT(i) ATOMIC_INIT(i)
+#define atomic_long_cond_read_acquire atomic_cond_read_acquire
+#define atomic_long_cond_read_relaxed atomic_cond_read_relaxed
+#endif
+
+#ifdef CONFIG_64BIT
+
+static __always_inline long
+arch_atomic_long_read(const atomic_long_t *v)
+{
+ return arch_atomic64_read(v);
+}
+
+static __always_inline long
+arch_atomic_long_read_acquire(const atomic_long_t *v)
+{
+ return arch_atomic64_read_acquire(v);
+}
+
+static __always_inline void
+arch_atomic_long_set(atomic_long_t *v, long i)
+{
+ arch_atomic64_set(v, i);
+}
+
+static __always_inline void
+arch_atomic_long_set_release(atomic_long_t *v, long i)
+{
+ arch_atomic64_set_release(v, i);
+}
+
+static __always_inline void
+arch_atomic_long_add(long i, atomic_long_t *v)
+{
+ arch_atomic64_add(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return(long i, atomic_long_t *v)
+{
+ return arch_atomic64_add_return(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_add_return_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_add_return_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_add_return_relaxed(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_add(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_add_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_add_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_add_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_sub(long i, atomic_long_t *v)
+{
+ arch_atomic64_sub(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return(long i, atomic_long_t *v)
+{
+ return arch_atomic64_sub_return(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_sub_return_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_sub_return_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_sub_return_relaxed(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_sub(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_sub_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_sub_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_sub_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_inc(atomic_long_t *v)
+{
+ arch_atomic64_inc(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return(atomic_long_t *v)
+{
+ return arch_atomic64_inc_return(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return_acquire(atomic_long_t *v)
+{
+ return arch_atomic64_inc_return_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return_release(atomic_long_t *v)
+{
+ return arch_atomic64_inc_return_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return_relaxed(atomic_long_t *v)
+{
+ return arch_atomic64_inc_return_relaxed(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_inc(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc_acquire(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_inc_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc_release(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_inc_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc_relaxed(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_inc_relaxed(v);
+}
+
+static __always_inline void
+arch_atomic_long_dec(atomic_long_t *v)
+{
+ arch_atomic64_dec(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return(atomic_long_t *v)
+{
+ return arch_atomic64_dec_return(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return_acquire(atomic_long_t *v)
+{
+ return arch_atomic64_dec_return_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return_release(atomic_long_t *v)
+{
+ return arch_atomic64_dec_return_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return_relaxed(atomic_long_t *v)
+{
+ return arch_atomic64_dec_return_relaxed(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_dec(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec_acquire(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_dec_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec_release(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_dec_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec_relaxed(atomic_long_t *v)
+{
+ return arch_atomic64_fetch_dec_relaxed(v);
+}
+
+static __always_inline void
+arch_atomic_long_and(long i, atomic_long_t *v)
+{
+ arch_atomic64_and(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_and(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_and_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_and_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_and_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_andnot(long i, atomic_long_t *v)
+{
+ arch_atomic64_andnot(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_andnot(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_andnot_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_andnot_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_andnot_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_or(long i, atomic_long_t *v)
+{
+ arch_atomic64_or(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_or(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_or_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_or_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_or_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_xor(long i, atomic_long_t *v)
+{
+ arch_atomic64_xor(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_xor(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_xor_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor_release(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_xor_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic64_fetch_xor_relaxed(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_xchg(atomic_long_t *v, long i)
+{
+ return arch_atomic64_xchg(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_xchg_acquire(atomic_long_t *v, long i)
+{
+ return arch_atomic64_xchg_acquire(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_xchg_release(atomic_long_t *v, long i)
+{
+ return arch_atomic64_xchg_release(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_xchg_relaxed(atomic_long_t *v, long i)
+{
+ return arch_atomic64_xchg_relaxed(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic64_cmpxchg(v, old, new);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg_acquire(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic64_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg_release(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic64_cmpxchg_release(v, old, new);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg_relaxed(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic64_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic64_try_cmpxchg(v, (s64 *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic64_try_cmpxchg_acquire(v, (s64 *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic64_try_cmpxchg_release(v, (s64 *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic64_try_cmpxchg_relaxed(v, (s64 *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_sub_and_test(long i, atomic_long_t *v)
+{
+ return arch_atomic64_sub_and_test(i, v);
+}
+
+static __always_inline bool
+arch_atomic_long_dec_and_test(atomic_long_t *v)
+{
+ return arch_atomic64_dec_and_test(v);
+}
+
+static __always_inline bool
+arch_atomic_long_inc_and_test(atomic_long_t *v)
+{
+ return arch_atomic64_inc_and_test(v);
+}
+
+static __always_inline bool
+arch_atomic_long_add_negative(long i, atomic_long_t *v)
+{
+ return arch_atomic64_add_negative(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_unless(atomic_long_t *v, long a, long u)
+{
+ return arch_atomic64_fetch_add_unless(v, a, u);
+}
+
+static __always_inline bool
+arch_atomic_long_add_unless(atomic_long_t *v, long a, long u)
+{
+ return arch_atomic64_add_unless(v, a, u);
+}
+
+static __always_inline bool
+arch_atomic_long_inc_not_zero(atomic_long_t *v)
+{
+ return arch_atomic64_inc_not_zero(v);
+}
+
+static __always_inline bool
+arch_atomic_long_inc_unless_negative(atomic_long_t *v)
+{
+ return arch_atomic64_inc_unless_negative(v);
+}
+
+static __always_inline bool
+arch_atomic_long_dec_unless_positive(atomic_long_t *v)
+{
+ return arch_atomic64_dec_unless_positive(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_if_positive(atomic_long_t *v)
+{
+ return arch_atomic64_dec_if_positive(v);
+}
+
+#else /* CONFIG_64BIT */
+
+static __always_inline long
+arch_atomic_long_read(const atomic_long_t *v)
+{
+ return arch_atomic_read(v);
+}
+
+static __always_inline long
+arch_atomic_long_read_acquire(const atomic_long_t *v)
+{
+ return arch_atomic_read_acquire(v);
+}
+
+static __always_inline void
+arch_atomic_long_set(atomic_long_t *v, long i)
+{
+ arch_atomic_set(v, i);
+}
+
+static __always_inline void
+arch_atomic_long_set_release(atomic_long_t *v, long i)
+{
+ arch_atomic_set_release(v, i);
+}
+
+static __always_inline void
+arch_atomic_long_add(long i, atomic_long_t *v)
+{
+ arch_atomic_add(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return(long i, atomic_long_t *v)
+{
+ return arch_atomic_add_return(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_add_return_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_add_return_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_add_return_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_add_return_relaxed(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_add(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_add_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_add_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_add_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_sub(long i, atomic_long_t *v)
+{
+ arch_atomic_sub(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return(long i, atomic_long_t *v)
+{
+ return arch_atomic_sub_return(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_sub_return_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_sub_return_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_sub_return_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_sub_return_relaxed(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_sub(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_sub_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_sub_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_sub_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_sub_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_inc(atomic_long_t *v)
+{
+ arch_atomic_inc(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return(atomic_long_t *v)
+{
+ return arch_atomic_inc_return(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return_acquire(atomic_long_t *v)
+{
+ return arch_atomic_inc_return_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return_release(atomic_long_t *v)
+{
+ return arch_atomic_inc_return_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_inc_return_relaxed(atomic_long_t *v)
+{
+ return arch_atomic_inc_return_relaxed(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc(atomic_long_t *v)
+{
+ return arch_atomic_fetch_inc(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc_acquire(atomic_long_t *v)
+{
+ return arch_atomic_fetch_inc_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc_release(atomic_long_t *v)
+{
+ return arch_atomic_fetch_inc_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_inc_relaxed(atomic_long_t *v)
+{
+ return arch_atomic_fetch_inc_relaxed(v);
+}
+
+static __always_inline void
+arch_atomic_long_dec(atomic_long_t *v)
+{
+ arch_atomic_dec(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return(atomic_long_t *v)
+{
+ return arch_atomic_dec_return(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return_acquire(atomic_long_t *v)
+{
+ return arch_atomic_dec_return_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return_release(atomic_long_t *v)
+{
+ return arch_atomic_dec_return_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_return_relaxed(atomic_long_t *v)
+{
+ return arch_atomic_dec_return_relaxed(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec(atomic_long_t *v)
+{
+ return arch_atomic_fetch_dec(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec_acquire(atomic_long_t *v)
+{
+ return arch_atomic_fetch_dec_acquire(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec_release(atomic_long_t *v)
+{
+ return arch_atomic_fetch_dec_release(v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_dec_relaxed(atomic_long_t *v)
+{
+ return arch_atomic_fetch_dec_relaxed(v);
+}
+
+static __always_inline void
+arch_atomic_long_and(long i, atomic_long_t *v)
+{
+ arch_atomic_and(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_and(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_and_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_and_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_and_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_and_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_andnot(long i, atomic_long_t *v)
+{
+ arch_atomic_andnot(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_andnot(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_andnot_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_andnot_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_andnot_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_andnot_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_or(long i, atomic_long_t *v)
+{
+ arch_atomic_or(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_or(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_or_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_or_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_or_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_or_relaxed(i, v);
+}
+
+static __always_inline void
+arch_atomic_long_xor(long i, atomic_long_t *v)
+{
+ arch_atomic_xor(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_xor(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor_acquire(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_xor_acquire(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor_release(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_xor_release(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_xor_relaxed(long i, atomic_long_t *v)
+{
+ return arch_atomic_fetch_xor_relaxed(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_xchg(atomic_long_t *v, long i)
+{
+ return arch_atomic_xchg(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_xchg_acquire(atomic_long_t *v, long i)
+{
+ return arch_atomic_xchg_acquire(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_xchg_release(atomic_long_t *v, long i)
+{
+ return arch_atomic_xchg_release(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_xchg_relaxed(atomic_long_t *v, long i)
+{
+ return arch_atomic_xchg_relaxed(v, i);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic_cmpxchg(v, old, new);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg_acquire(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic_cmpxchg_acquire(v, old, new);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg_release(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic_cmpxchg_release(v, old, new);
+}
+
+static __always_inline long
+arch_atomic_long_cmpxchg_relaxed(atomic_long_t *v, long old, long new)
+{
+ return arch_atomic_cmpxchg_relaxed(v, old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic_try_cmpxchg(v, (int *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg_acquire(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic_try_cmpxchg_acquire(v, (int *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg_release(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic_try_cmpxchg_release(v, (int *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_try_cmpxchg_relaxed(atomic_long_t *v, long *old, long new)
+{
+ return arch_atomic_try_cmpxchg_relaxed(v, (int *)old, new);
+}
+
+static __always_inline bool
+arch_atomic_long_sub_and_test(long i, atomic_long_t *v)
+{
+ return arch_atomic_sub_and_test(i, v);
+}
+
+static __always_inline bool
+arch_atomic_long_dec_and_test(atomic_long_t *v)
+{
+ return arch_atomic_dec_and_test(v);
+}
+
+static __always_inline bool
+arch_atomic_long_inc_and_test(atomic_long_t *v)
+{
+ return arch_atomic_inc_and_test(v);
+}
+
+static __always_inline bool
+arch_atomic_long_add_negative(long i, atomic_long_t *v)
+{
+ return arch_atomic_add_negative(i, v);
+}
+
+static __always_inline long
+arch_atomic_long_fetch_add_unless(atomic_long_t *v, long a, long u)
+{
+ return arch_atomic_fetch_add_unless(v, a, u);
+}
+
+static __always_inline bool
+arch_atomic_long_add_unless(atomic_long_t *v, long a, long u)
+{
+ return arch_atomic_add_unless(v, a, u);
+}
+
+static __always_inline bool
+arch_atomic_long_inc_not_zero(atomic_long_t *v)
+{
+ return arch_atomic_inc_not_zero(v);
+}
+
+static __always_inline bool
+arch_atomic_long_inc_unless_negative(atomic_long_t *v)
+{
+ return arch_atomic_inc_unless_negative(v);
+}
+
+static __always_inline bool
+arch_atomic_long_dec_unless_positive(atomic_long_t *v)
+{
+ return arch_atomic_dec_unless_positive(v);
+}
+
+static __always_inline long
+arch_atomic_long_dec_if_positive(atomic_long_t *v)
+{
+ return arch_atomic_dec_if_positive(v);
+}
+
+#endif /* CONFIG_64BIT */
+#endif /* _LINUX_ATOMIC_LONG_H */
+// e8f0e08ff072b74d180eabe2ad001282b38c2c88
sb = inode->i_sb;
#ifdef CONFIG_BLOCK
if (sb_is_blkdev_sb(sb))
- return I_BDEV(inode)->bd_bdi;
+ return I_BDEV(inode)->bd_disk->bdi;
#endif
return sb->s_bdi;
}
#ifndef __LINUX_BIO_H
#define __LINUX_BIO_H
-#include <linux/highmem.h>
#include <linux/mempool.h>
#include <linux/ioprio.h>
/* struct bio, bio_vec and BIO_* flags are defined in blk_types.h */
enum {
BIOSET_NEED_BVECS = BIT(0),
BIOSET_NEED_RESCUER = BIT(1),
+ BIOSET_PERCPU_CACHE = BIT(2),
};
extern int bioset_init(struct bio_set *, unsigned int, unsigned int, int flags);
extern void bioset_exit(struct bio_set *);
struct bio *bio_alloc_bioset(gfp_t gfp, unsigned short nr_iovecs,
struct bio_set *bs);
+struct bio *bio_alloc_kiocb(struct kiocb *kiocb, unsigned short nr_vecs,
+ struct bio_set *bs);
struct bio *bio_kmalloc(gfp_t gfp_mask, unsigned short nr_iovecs);
extern void bio_put(struct bio *);
struct bio *src) { }
#endif /* CONFIG_BLK_CGROUP */
-#ifdef CONFIG_HIGHMEM
-/*
- * remember never ever reenable interrupts between a bvec_kmap_irq and
- * bvec_kunmap_irq!
- */
-static inline char *bvec_kmap_irq(struct bio_vec *bvec, unsigned long *flags)
-{
- unsigned long addr;
-
- /*
- * might not be a highmem page, but the preempt/irq count
- * balancing is a lot nicer this way
- */
- local_irq_save(*flags);
- addr = (unsigned long) kmap_atomic(bvec->bv_page);
-
- BUG_ON(addr & ~PAGE_MASK);
-
- return (char *) addr + bvec->bv_offset;
-}
-
-static inline void bvec_kunmap_irq(char *buffer, unsigned long *flags)
-{
- unsigned long ptr = (unsigned long) buffer & PAGE_MASK;
-
- kunmap_atomic((void *) ptr);
- local_irq_restore(*flags);
-}
-
-#else
-static inline char *bvec_kmap_irq(struct bio_vec *bvec, unsigned long *flags)
-{
- return page_address(bvec->bv_page) + bvec->bv_offset;
-}
-
-static inline void bvec_kunmap_irq(char *buffer, unsigned long *flags)
-{
- *flags = 0;
-}
-#endif
-
/*
* BIO list management for use by remapping drivers (e.g. DM or MD) and loop.
*
struct kmem_cache *bio_slab;
unsigned int front_pad;
+ /*
+ * per-cpu bio alloc cache
+ */
+ struct bio_alloc_cache __percpu *cache;
+
mempool_t bio_pool;
mempool_t bvec_pool;
#if defined(CONFIG_BLK_DEV_INTEGRITY)
struct bio_list rescue_list;
struct work_struct rescue_work;
struct workqueue_struct *rescue_workqueue;
+
+ /*
+ * Hot un-plug notifier for the per-cpu cache, if used
+ */
+ struct hlist_node cpuhp_dead;
};
static inline bool bioset_initialized(struct bio_set *bs)
typedef void (blkcg_pol_offline_pd_fn)(struct blkg_policy_data *pd);
typedef void (blkcg_pol_free_pd_fn)(struct blkg_policy_data *pd);
typedef void (blkcg_pol_reset_pd_stats_fn)(struct blkg_policy_data *pd);
-typedef size_t (blkcg_pol_stat_pd_fn)(struct blkg_policy_data *pd, char *buf,
- size_t size);
+typedef bool (blkcg_pol_stat_pd_fn)(struct blkg_policy_data *pd,
+ struct seq_file *s);
struct blkcg_policy {
int plid;
BLK_MQ_F_STACKING = 1 << 2,
BLK_MQ_F_TAG_HCTX_SHARED = 1 << 3,
BLK_MQ_F_BLOCKING = 1 << 5,
+ /* Do not allow an I/O scheduler to be configured. */
BLK_MQ_F_NO_SCHED = 1 << 6,
+ /*
+ * Select 'none' during queue registration in case of a single hwq
+ * or shared hwqs instead of 'mq-deadline'.
+ */
+ BLK_MQ_F_NO_SCHED_BY_DEFAULT = 1 << 7,
BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
BLK_MQ_F_ALLOC_POLICY_BITS = 1,
((policy & ((1 << BLK_MQ_F_ALLOC_POLICY_BITS) - 1)) \
<< BLK_MQ_F_ALLOC_POLICY_START_BIT)
+struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set, void *queuedata,
+ struct lock_class_key *lkclass);
#define blk_mq_alloc_disk(set, queuedata) \
({ \
static struct lock_class_key __key; \
- struct gendisk *__disk = __blk_mq_alloc_disk(set, queuedata); \
\
- if (!IS_ERR(__disk)) \
- lockdep_init_map(&__disk->lockdep_map, \
- "(bio completion)", &__key, 0); \
- __disk; \
+ __blk_mq_alloc_disk(set, queuedata, &__key); \
})
-struct gendisk *__blk_mq_alloc_disk(struct blk_mq_tag_set *set,
- void *queuedata);
struct request_queue *blk_mq_init_queue(struct blk_mq_tag_set *);
int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
struct request_queue *q);
void * bd_holder;
int bd_holders;
bool bd_write_holder;
-#ifdef CONFIG_SYSFS
- struct list_head bd_holder_disks;
-#endif
struct kobject *bd_holder_dir;
u8 bd_partno;
spinlock_t bd_size_lock; /* for bd_inode->i_size updates */
struct gendisk * bd_disk;
- struct backing_dev_info *bd_bdi;
/* The counter of freeze processes */
int bd_fsfreeze_count;
BIO_TRACKED, /* set if bio goes through the rq_qos path */
BIO_REMAPPED,
BIO_ZONE_WRITE_LOCKED, /* Owns a zoned device zone write lock */
+ BIO_PERCPU_CACHE, /* can participate in per-cpu alloc cache */
BIO_FLAG_LAST
};
#include <linux/minmax.h>
#include <linux/timer.h>
#include <linux/workqueue.h>
-#include <linux/backing-dev-defs.h>
#include <linux/wait.h>
#include <linux/mempool.h>
#include <linux/pfn.h>
struct blk_mq_hw_ctx **queue_hw_ctx;
unsigned int nr_hw_queues;
- struct backing_dev_info *backing_dev_info;
-
/*
* The queue owner gets to use this for whatever they like.
* ll_rw_blk doesn't touch it.
spinlock_t queue_lock;
+ struct gendisk *disk;
+
/*
* queue kobject
*/
dma_map_page_attrs(dev, (bv)->bv_page, (bv)->bv_offset, (bv)->bv_len, \
(dir), (attrs))
-#define queue_to_disk(q) (dev_to_disk(kobj_to_dev((q)->kobj.parent)))
-
static inline bool queue_is_mq(struct request_queue *q)
{
return q->mq_ops;
#define SECTOR_SIZE (1 << SECTOR_SHIFT)
#endif
+#define PAGE_SECTORS_SHIFT (PAGE_SHIFT - SECTOR_SHIFT)
+#define PAGE_SECTORS (1 << PAGE_SECTORS_SHIFT)
+#define SECTOR_MASK (PAGE_SECTORS - 1)
+
/*
* blk_rq_pos() : the current sector
* blk_rq_bytes() : bytes left in the entire request
unsigned int size);
extern void blk_queue_alignment_offset(struct request_queue *q,
unsigned int alignment);
-void blk_queue_update_readahead(struct request_queue *q);
+void disk_update_readahead(struct gendisk *disk);
extern void blk_limits_io_min(struct queue_limits *limits, unsigned int min);
extern void blk_queue_io_min(struct request_queue *q, unsigned int min);
extern void blk_limits_io_opt(struct queue_limits *limits, unsigned int opt);
return offset << SECTOR_SHIFT;
}
+/*
+ * Two cases of handling DISCARD merge:
+ * If max_discard_segments > 1, the driver takes every bio
+ * as a range and send them to controller together. The ranges
+ * needn't to be contiguous.
+ * Otherwise, the bios/requests will be handled as same as
+ * others which should be contiguous.
+ */
+static inline bool blk_discard_mergable(struct request *req)
+{
+ if (req_op(req) == REQ_OP_DISCARD &&
+ queue_max_discard_segments(req->q) > 1)
+ return true;
+ return false;
+}
+
static inline int bdev_discard_alignment(struct block_device *bdev)
{
struct request_queue *q = bdev_get_queue(bdev);
char *(*devnode)(struct gendisk *disk, umode_t *mode);
struct module *owner;
const struct pr_ops *pr_ops;
+
+ /*
+ * Special callback for probing GPT entry at a given sector.
+ * Needed by Android devices, used by GPT scanner and MMC blk
+ * driver.
+ */
+ int (*alternative_gpt_sector)(struct gendisk *disk, sector_t *sector);
};
#ifdef CONFIG_COMPAT
struct block_device *bdev_alloc(struct gendisk *disk, u8 partno);
void bdev_add(struct block_device *bdev, dev_t dev);
struct block_device *I_BDEV(struct inode *inode);
-struct block_device *bdgrab(struct block_device *bdev);
-void bdput(struct block_device *);
int truncate_bdev_range(struct block_device *bdev, fmode_t mode, loff_t lstart,
loff_t lend);
*
* Copyright (C) 2001 Ming Lei <ming.lei@canonical.com>
*/
-#ifndef __LINUX_BVEC_ITER_H
-#define __LINUX_BVEC_ITER_H
+#ifndef __LINUX_BVEC_H
+#define __LINUX_BVEC_H
+#include <linux/highmem.h>
#include <linux/bug.h>
#include <linux/errno.h>
#include <linux/limits.h>
}
}
-#endif /* __LINUX_BVEC_ITER_H */
+/**
+ * bvec_kmap_local - map a bvec into the kernel virtual address space
+ * @bvec: bvec to map
+ *
+ * Must be called on single-page bvecs only. Call kunmap_local on the returned
+ * address to unmap.
+ */
+static inline void *bvec_kmap_local(struct bio_vec *bvec)
+{
+ return kmap_local_page(bvec->bv_page) + bvec->bv_offset;
+}
+
+/**
+ * memcpy_from_bvec - copy data from a bvec
+ * @bvec: bvec to copy from
+ *
+ * Must be called on single-page bvecs only.
+ */
+static inline void memcpy_from_bvec(char *to, struct bio_vec *bvec)
+{
+ memcpy_from_page(to, bvec->bv_page, bvec->bv_offset, bvec->bv_len);
+}
+
+/**
+ * memcpy_to_bvec - copy data to a bvec
+ * @bvec: bvec to copy to
+ *
+ * Must be called on single-page bvecs only.
+ */
+static inline void memcpy_to_bvec(struct bio_vec *bvec, const char *from)
+{
+ memcpy_to_page(bvec->bv_page, bvec->bv_offset, from, bvec->bv_len);
+}
+
+/**
+ * memzero_bvec - zero all data in a bvec
+ * @bvec: bvec to zero
+ *
+ * Must be called on single-page bvecs only.
+ */
+static inline void memzero_bvec(struct bio_vec *bvec)
+{
+ memzero_page(bvec->bv_page, bvec->bv_offset, bvec->bv_len);
+}
+
+/**
+ * bvec_virt - return the virtual address for a bvec
+ * @bvec: bvec to return the virtual address for
+ *
+ * Note: the caller must ensure that @bvec->bv_page is not a highmem page.
+ */
+static inline void *bvec_virt(struct bio_vec *bvec)
+{
+ WARN_ON_ONCE(PageHighMem(bvec->bv_page));
+ return page_address(bvec->bv_page) + bvec->bv_offset;
+}
+
+#endif /* __LINUX_BVEC_H */
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Parsing command line, get the partitions information.
- *
- * Written by Cai Zhiyong <caizhiyong@huawei.com>
- *
- */
-#ifndef CMDLINEPARSEH
-#define CMDLINEPARSEH
-
-#include <linux/blkdev.h>
-#include <linux/fs.h>
-#include <linux/slab.h>
-
-/* partition flags */
-#define PF_RDONLY 0x01 /* Device is read only */
-#define PF_POWERUP_LOCK 0x02 /* Always locked after reset */
-
-struct cmdline_subpart {
- char name[BDEVNAME_SIZE]; /* partition name, such as 'rootfs' */
- sector_t from;
- sector_t size;
- int flags;
- struct cmdline_subpart *next_subpart;
-};
-
-struct cmdline_parts {
- char name[BDEVNAME_SIZE]; /* block device, such as 'mmcblk0' */
- unsigned int nr_subparts;
- struct cmdline_subpart *subpart;
- struct cmdline_parts *next_parts;
-};
-
-void cmdline_parts_free(struct cmdline_parts **parts);
-
-int cmdline_parts_parse(struct cmdline_parts **parts, const char *cmdline);
-
-struct cmdline_parts *cmdline_parts_find(struct cmdline_parts *parts,
- const char *bdev);
-
-int cmdline_parts_set(struct cmdline_parts *parts, sector_t disk_size,
- int slot,
- int (*add_part)(int, struct cmdline_subpart *, void *),
- void *param);
-
-#endif /* CMDLINEPARSEH */
CPUHP_ARM_OMAP_WAKE_DEAD,
CPUHP_IRQ_POLL_DEAD,
CPUHP_BLOCK_SOFTIRQ_DEAD,
+ CPUHP_BIO_DEAD,
CPUHP_ACPI_CPUDRV_DEAD,
CPUHP_S390_PFAULT_DEAD,
CPUHP_BLK_MQ_DEAD,
/**
* cpuhp_state_remove_instance_nocalls - Remove hotplug instance from state
- * without invoking the reatdown callback
+ * without invoking the teardown callback
* @state: The state from which the instance is removed
* @node: The node for this individual state.
*
#include <linux/cpumask.h>
#include <linux/nodemask.h>
#include <linux/mm.h>
+#include <linux/mmu_context.h>
#include <linux/jump_label.h>
#ifdef CONFIG_CPUSETS
extern void cpuset_read_lock(void);
extern void cpuset_read_unlock(void);
extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
-extern void cpuset_cpus_allowed_fallback(struct task_struct *p);
+extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
#define cpuset_current_mems_allowed (current->mems_allowed)
void cpuset_init_current_mems_allowed(void);
static inline void cpuset_cpus_allowed(struct task_struct *p,
struct cpumask *mask)
{
- cpumask_copy(mask, cpu_possible_mask);
+ cpumask_copy(mask, task_cpu_possible_mask(p));
}
-static inline void cpuset_cpus_allowed_fallback(struct task_struct *p)
+static inline bool cpuset_cpus_allowed_fallback(struct task_struct *p)
{
+ return false;
}
static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
#define __LINUX_DEBUG_LOCKING_H
#include <linux/atomic.h>
-#include <linux/bug.h>
-#include <linux/printk.h>
+#include <linux/cache.h>
struct task_struct;
void *addr, size_t bytes, struct iov_iter *i);
typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
size_t nr_pages);
-#define PAGE_SECTORS (PAGE_SIZE / 512)
void dm_error(const char *message);
* @MEM_DDR5: Unbuffered DDR5 RAM
* @MEM_NVDIMM: Non-volatile RAM
* @MEM_WIO2: Wide I/O 2.
+ * @MEM_HBM2: High bandwidth Memory Gen 2.
*/
enum mem_type {
MEM_EMPTY = 0,
MEM_DDR5,
MEM_NVDIMM,
MEM_WIO2,
+ MEM_HBM2,
};
#define MEM_FLAG_EMPTY BIT(MEM_EMPTY)
#define MEM_FLAG_DDR5 BIT(MEM_DDR5)
#define MEM_FLAG_NVDIMM BIT(MEM_NVDIMM)
#define MEM_FLAG_WIO2 BIT(MEM_WIO2)
+#define MEM_FLAG_HBM2 BIT(MEM_HBM2)
/**
* enum edac_type - Error Detection and Correction capabilities and mode
#include <linux/err.h>
#include <linux/percpu-defs.h>
#include <linux/percpu.h>
+#include <linux/sched.h>
/*
* CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining
__u64 *cnt);
void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt);
-DECLARE_PER_CPU(int, eventfd_wake_count);
-
-static inline bool eventfd_signal_count(void)
+static inline bool eventfd_signal_allowed(void)
{
- return this_cpu_read(eventfd_wake_count);
+ return !current->in_eventfd_signal;
}
#else /* CONFIG_EVENTFD */
return -ENOSYS;
}
-static inline bool eventfd_signal_count(void)
+static inline bool eventfd_signal_allowed(void)
{
- return false;
+ return true;
}
static inline void eventfd_ctx_do_read(struct eventfd_ctx *ctx, __u64 *cnt)
#define FANOTIFY_FID_BITS (FAN_REPORT_FID | FAN_REPORT_DFID_NAME)
+#define FANOTIFY_INFO_MODES (FANOTIFY_FID_BITS | FAN_REPORT_PIDFD)
+
/*
* fanotify_init() flags that require CAP_SYS_ADMIN.
* We do not allow unprivileged groups to request permission events.
*/
#define FANOTIFY_ADMIN_INIT_FLAGS (FANOTIFY_PERM_CLASSES | \
FAN_REPORT_TID | \
+ FAN_REPORT_PIDFD | \
FAN_UNLIMITED_QUEUE | \
FAN_UNLIMITED_MARKS)
int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical,
u64 phys, u64 len, u32 flags);
-int generic_block_fiemap(struct inode *inode,
- struct fiemap_extent_info *fieinfo, u64 start, u64 len,
- get_block_t *get_block);
-
#endif /* _LINUX_FIEMAP_H 1 */
/* iocb->ki_waitq is valid */
#define IOCB_WAITQ (1 << 19)
#define IOCB_NOIO (1 << 20)
+/* can use bio alloc cache */
+#define IOCB_ALLOC_CACHE (1 << 21)
struct kiocb {
struct file *ki_filp;
* struct address_space - Contents of a cacheable, mappable object.
* @host: Owner, either the inode or the block_device.
* @i_pages: Cached pages.
+ * @invalidate_lock: Guards coherency between page cache contents and
+ * file offset->disk block mappings in the filesystem during invalidates.
+ * It is also used to block modification of page cache contents through
+ * memory mappings.
* @gfp_mask: Memory allocation flags to use for allocating pages.
* @i_mmap_writable: Number of VM_SHARED mappings.
* @nr_thps: Number of THPs in the pagecache (non-shmem only).
struct address_space {
struct inode *host;
struct xarray i_pages;
+ struct rw_semaphore invalidate_lock;
gfp_t gfp_mask;
atomic_t i_mmap_writable;
#ifdef CONFIG_READ_ONLY_THP_FOR_FS
down_read_nested(&inode->i_rwsem, subclass);
}
+static inline void filemap_invalidate_lock(struct address_space *mapping)
+{
+ down_write(&mapping->invalidate_lock);
+}
+
+static inline void filemap_invalidate_unlock(struct address_space *mapping)
+{
+ up_write(&mapping->invalidate_lock);
+}
+
+static inline void filemap_invalidate_lock_shared(struct address_space *mapping)
+{
+ down_read(&mapping->invalidate_lock);
+}
+
+static inline int filemap_invalidate_trylock_shared(
+ struct address_space *mapping)
+{
+ return down_read_trylock(&mapping->invalidate_lock);
+}
+
+static inline void filemap_invalidate_unlock_shared(
+ struct address_space *mapping)
+{
+ up_read(&mapping->invalidate_lock);
+}
+
void lock_two_nondirectories(struct inode *, struct inode*);
void unlock_two_nondirectories(struct inode *, struct inode*);
+void filemap_invalidate_lock_two(struct address_space *mapping1,
+ struct address_space *mapping2);
+void filemap_invalidate_unlock_two(struct address_space *mapping1,
+ struct address_space *mapping2);
+
+
/*
* NOTE: in a 32bit arch with a preemptable kernel and
* an UP compile the i_size_read/write must be atomic
/* Number of inodes with nlink == 0 but still referenced */
atomic_long_t s_remove_count;
- /* Pending fsnotify inode refs */
- atomic_long_t s_fsnotify_inode_refs;
+ /*
+ * Number of inode/mount/sb objects that are being watched, note that
+ * inodes objects are currently double-accounted.
+ */
+ atomic_long_t s_fsnotify_connectors;
/* Being remounted read-only */
int s_readonly_remount;
struct lock_class_key i_lock_key;
struct lock_class_key i_mutex_key;
+ struct lock_class_key invalidate_lock_key;
struct lock_class_key i_mutex_dir_key;
};
#define MAX_RW_COUNT (INT_MAX & PAGE_MASK)
-#ifdef CONFIG_MANDATORY_FILE_LOCKING
-extern int locks_mandatory_locked(struct file *);
-extern int locks_mandatory_area(struct inode *, struct file *, loff_t, loff_t, unsigned char);
-
-/*
- * Candidates for mandatory locking have the setgid bit set
- * but no group execute bit - an otherwise meaningless combination.
- */
-
-static inline int __mandatory_lock(struct inode *ino)
-{
- return (ino->i_mode & (S_ISGID | S_IXGRP)) == S_ISGID;
-}
-
-/*
- * ... and these candidates should be on SB_MANDLOCK mounted fs,
- * otherwise these will be advisory locks
- */
-
-static inline int mandatory_lock(struct inode *ino)
-{
- return IS_MANDLOCK(ino) && __mandatory_lock(ino);
-}
-
-static inline int locks_verify_locked(struct file *file)
-{
- if (mandatory_lock(locks_inode(file)))
- return locks_mandatory_locked(file);
- return 0;
-}
-
-static inline int locks_verify_truncate(struct inode *inode,
- struct file *f,
- loff_t size)
-{
- if (!inode->i_flctx || !mandatory_lock(inode))
- return 0;
-
- if (size < inode->i_size) {
- return locks_mandatory_area(inode, f, size, inode->i_size - 1,
- F_WRLCK);
- } else {
- return locks_mandatory_area(inode, f, inode->i_size, size - 1,
- F_WRLCK);
- }
-}
-
-#else /* !CONFIG_MANDATORY_FILE_LOCKING */
-
-static inline int locks_mandatory_locked(struct file *file)
-{
- return 0;
-}
-
-static inline int locks_mandatory_area(struct inode *inode, struct file *filp,
- loff_t start, loff_t end, unsigned char type)
-{
- return 0;
-}
-
-static inline int __mandatory_lock(struct inode *inode)
-{
- return 0;
-}
-
-static inline int mandatory_lock(struct inode *inode)
-{
- return 0;
-}
-
-static inline int locks_verify_locked(struct file *file)
-{
- return 0;
-}
-
-static inline int locks_verify_truncate(struct inode *inode, struct file *filp,
- size_t size)
-{
- return 0;
-}
-
-#endif /* CONFIG_MANDATORY_FILE_LOCKING */
-
-
#ifdef CONFIG_FILE_LOCKING
static inline int break_lease(struct inode *inode, unsigned int mode)
{
ssize_t vfs_iocb_iter_write(struct file *file, struct kiocb *iocb,
struct iov_iter *iter);
-/* fs/block_dev.c */
-extern int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
- int datasync);
-
/* fs/splice.c */
extern ssize_t generic_file_splice_read(struct file *, loff_t *,
struct pipe_inode_info *, size_t, unsigned int);
struct inode *child,
const struct qstr *name, u32 cookie)
{
+ if (atomic_long_read(&dir->i_sb->s_fsnotify_connectors) == 0)
+ return;
+
fsnotify(mask, child, FSNOTIFY_EVENT_INODE, dir, name, NULL, cookie);
}
static inline void fsnotify_inode(struct inode *inode, __u32 mask)
{
+ if (atomic_long_read(&inode->i_sb->s_fsnotify_connectors) == 0)
+ return;
+
if (S_ISDIR(inode->i_mode))
mask |= FS_ISDIR;
{
struct inode *inode = d_inode(dentry);
+ if (atomic_long_read(&inode->i_sb->s_fsnotify_connectors) == 0)
+ return 0;
+
if (S_ISDIR(inode->i_mode)) {
mask |= FS_ISDIR;
extern int ftrace_make_nop(struct module *mod,
struct dyn_ftrace *rec, unsigned long addr);
+/**
+ * ftrace_need_init_nop - return whether nop call sites should be initialized
+ *
+ * Normally the compiler's -mnop-mcount generates suitable nops, so we don't
+ * need to call ftrace_init_nop() if the code is built with that flag.
+ * Architectures where this is not always the case may define their own
+ * condition.
+ *
+ * Return must be:
+ * 0 if ftrace_init_nop() should be called
+ * Nonzero if ftrace_init_nop() should not be called
+ */
+
+#ifndef ftrace_need_init_nop
+#define ftrace_need_init_nop() (!__is_defined(CC_USING_NOP_MCOUNT))
+#endif
/**
* ftrace_init_nop - initialize a nop call site
* device.
* Affects responses to the ``CDROM_GET_CAPABILITY`` ioctl.
*
- * ``GENHD_FL_UP`` (0x0010): indicates that the block device is "up",
- * with a similar meaning to network interfaces.
- *
* ``GENHD_FL_SUPPRESS_PARTITION_INFO`` (0x0020): don't include
* partition information in ``/proc/partitions`` or in the output of
* printk_all_partitions().
/* 2 is unused (used to be GENHD_FL_DRIVERFS) */
/* 4 is unused (used to be GENHD_FL_MEDIA_CHANGE_NOTIFY) */
#define GENHD_FL_CD 0x0008
-#define GENHD_FL_UP 0x0010
#define GENHD_FL_SUPPRESS_PARTITION_INFO 0x0020
#define GENHD_FL_EXT_DEVT 0x0040
#define GENHD_FL_NATIVE_CAPACITY 0x0080
unsigned long state;
#define GD_NEED_PART_SCAN 0
#define GD_READ_ONLY 1
-#define GD_QUEUE_REF 2
struct mutex open_mutex; /* open/close mutex */
unsigned open_partitions; /* number of open partitions */
+ struct backing_dev_info *bdi;
struct kobject *slave_dir;
-
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
+ struct list_head slave_bdevs;
+#endif
struct timer_rand_state *random;
atomic_t sync_io; /* RAID */
struct disk_events *ev;
int node_id;
struct badblocks *bb;
struct lockdep_map lockdep_map;
+ u64 diskseq;
};
+static inline bool disk_live(struct gendisk *disk)
+{
+ return !inode_unhashed(disk->part0->bd_inode);
+}
+
/*
* The gendisk is refcounted by the part0 block_device, and the bd_device
* therein is also used for device model presentation in sysfs.
void disk_uevent(struct gendisk *disk, enum kobject_action action);
/* block/genhd.c */
-extern void device_add_disk(struct device *parent, struct gendisk *disk,
- const struct attribute_group **groups);
-static inline void add_disk(struct gendisk *disk)
+int device_add_disk(struct device *parent, struct gendisk *disk,
+ const struct attribute_group **groups);
+static inline int add_disk(struct gendisk *disk)
{
- device_add_disk(NULL, disk, NULL);
+ return device_add_disk(NULL, disk, NULL);
}
-extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
-static inline void add_disk_no_queue_reg(struct gendisk *disk)
-{
- device_add_disk_no_queue_reg(NULL, disk);
-}
-
extern void del_gendisk(struct gendisk *gp);
void set_disk_ro(struct gendisk *disk, bool read_only);
extern void disk_unblock_events(struct gendisk *disk);
extern void disk_flush_events(struct gendisk *disk, unsigned int mask);
bool set_capacity_and_notify(struct gendisk *disk, sector_t size);
+bool disk_force_media_change(struct gendisk *disk, unsigned int events);
/* drivers/char/random.c */
extern void add_disk_randomness(struct gendisk *disk) __latent_entropy;
int bdev_disk_changed(struct gendisk *disk, bool invalidate);
void blk_drop_partitions(struct gendisk *disk);
-extern struct gendisk *__alloc_disk_node(int minors, int node_id);
+struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
+ struct lock_class_key *lkclass);
extern void put_disk(struct gendisk *disk);
-
-#define alloc_disk_node(minors, node_id) \
-({ \
- static struct lock_class_key __key; \
- const char *__name; \
- struct gendisk *__disk; \
- \
- __name = "(gendisk_completion)"#minors"("#node_id")"; \
- \
- __disk = __alloc_disk_node(minors, node_id); \
- \
- if (__disk) \
- lockdep_init_map(&__disk->lockdep_map, __name, &__key, 0); \
- \
- __disk; \
-})
-
-#define alloc_disk(minors) alloc_disk_node(minors, NUMA_NO_NODE)
+struct gendisk *__blk_alloc_disk(int node, struct lock_class_key *lkclass);
/**
* blk_alloc_disk - allocate a gendisk structure
*/
#define blk_alloc_disk(node_id) \
({ \
- struct gendisk *__disk = __blk_alloc_disk(node_id); \
static struct lock_class_key __key; \
\
- if (__disk) \
- lockdep_init_map(&__disk->lockdep_map, \
- "(bio completion)", &__key, 0); \
- __disk; \
+ __blk_alloc_disk(node_id, &__key); \
})
-struct gendisk *__blk_alloc_disk(int node);
void blk_cleanup_disk(struct gendisk *disk);
int __register_blkdev(unsigned int major, const char *name,
int blkdev_ioctl(struct block_device *, fmode_t, unsigned, unsigned long);
long compat_blkdev_ioctl(struct file *, unsigned, unsigned long);
-#ifdef CONFIG_SYSFS
+#ifdef CONFIG_BLOCK_HOLDER_DEPRECATED
int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
void bd_unlink_disk_holder(struct block_device *bdev, struct gendisk *disk);
+int bd_register_pending_holders(struct gendisk *disk);
#else
static inline int bd_link_disk_holder(struct block_device *bdev,
struct gendisk *disk)
struct gendisk *disk)
{
}
-#endif /* CONFIG_SYSFS */
+static inline int bd_register_pending_holders(struct gendisk *disk)
+{
+ return 0;
+}
+#endif /* CONFIG_BLOCK_HOLDER_DEPRECATED */
dev_t part_devt(struct gendisk *disk, u8 partno);
+void inc_diskseq(struct gendisk *disk);
dev_t blk_lookup_devt(const char *name, int partno);
void blk_request_module(dev_t devt);
#ifdef CONFIG_BLOCK
extern void hrtimer_interrupt(struct clock_event_device *dev);
-extern void clock_was_set_delayed(void);
-
extern unsigned int hrtimer_resolution;
#else
#define hrtimer_resolution (unsigned int)LOW_RES_NSEC
-static inline void clock_was_set_delayed(void) { }
-
#endif
static inline ktime_t
timer->base->get_time());
}
-extern void clock_was_set(void);
#ifdef CONFIG_TIMERFD
extern void timerfd_clock_was_set(void);
+extern void timerfd_resume(void);
#else
static inline void timerfd_clock_was_set(void) { }
+static inline void timerfd_resume(void) { }
#endif
-extern void hrtimers_resume(void);
DECLARE_PER_CPU(struct tick_device, tick_cpu_device);
#include <linux/hrtimer.h>
#include <linux/kref.h>
#include <linux/workqueue.h>
+#include <linux/jump_label.h>
#include <linux/atomic.h>
#include <asm/ptrace.h>
#ifdef CONFIG_IRQ_FORCED_THREADING
# ifdef CONFIG_PREEMPT_RT
-# define force_irqthreads (true)
+# define force_irqthreads() (true)
# else
-extern bool force_irqthreads;
+DECLARE_STATIC_KEY_FALSE(force_irqthreads_key);
+# define force_irqthreads() (static_branch_unlikely(&force_irqthreads_key))
# endif
#else
-#define force_irqthreads (0)
+#define force_irqthreads() (false)
#endif
#ifndef local_softirq_pending
#include <linux/sched/rt.h>
#include <linux/iocontext.h>
-/*
- * Gives us 8 prio classes with 13-bits of data for each class
- */
-#define IOPRIO_CLASS_SHIFT (13)
-#define IOPRIO_PRIO_MASK ((1UL << IOPRIO_CLASS_SHIFT) - 1)
-
-#define IOPRIO_PRIO_CLASS(mask) ((mask) >> IOPRIO_CLASS_SHIFT)
-#define IOPRIO_PRIO_DATA(mask) ((mask) & IOPRIO_PRIO_MASK)
-#define IOPRIO_PRIO_VALUE(class, data) (((class) << IOPRIO_CLASS_SHIFT) | data)
-
-#define ioprio_valid(mask) (IOPRIO_PRIO_CLASS((mask)) != IOPRIO_CLASS_NONE)
+#include <uapi/linux/ioprio.h>
/*
- * These are the io priority groups as implemented by CFQ. RT is the realtime
- * class, it always gets premium service. BE is the best-effort scheduling
- * class, the default for any process. IDLE is the idle scheduling class, it
- * is only served when no one else is using the disk.
+ * Default IO priority.
*/
-enum {
- IOPRIO_CLASS_NONE,
- IOPRIO_CLASS_RT,
- IOPRIO_CLASS_BE,
- IOPRIO_CLASS_IDLE,
-};
+#define IOPRIO_DEFAULT IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_BE_NORM)
/*
- * 8 best effort priority levels are supported
+ * Check that a priority value has a valid class.
*/
-#define IOPRIO_BE_NR (8)
-
-enum {
- IOPRIO_WHO_PROCESS = 1,
- IOPRIO_WHO_PGRP,
- IOPRIO_WHO_USER,
-};
+static inline bool ioprio_valid(unsigned short ioprio)
+{
+ unsigned short class = IOPRIO_PRIO_CLASS(ioprio);
-/*
- * Fallback BE priority
- */
-#define IOPRIO_NORM (4)
+ return class > IOPRIO_CLASS_NONE && class <= IOPRIO_CLASS_IDLE;
+}
/*
* if process has set io priority explicitly, use that. if not, convert
if (ioc)
return ioc->ioprio;
- return IOPRIO_PRIO_VALUE(IOPRIO_CLASS_NONE, 0);
+ return IOPRIO_DEFAULT;
}
/*
ATA_DFLAG_D_SENSE = (1 << 29), /* Descriptor sense requested */
ATA_DFLAG_ZAC = (1 << 30), /* ZAC device */
+ ATA_DFLAG_FEATURES_MASK = ATA_DFLAG_TRUSTED | ATA_DFLAG_DA | \
+ ATA_DFLAG_DEVSLP | ATA_DFLAG_NCQ_SEND_RECV | \
+ ATA_DFLAG_NCQ_PRIO,
+
ATA_DEV_UNKNOWN = 0, /* unknown device */
ATA_DEV_ATA = 1, /* ATA device */
ATA_DEV_ATA_UNSUP = 2, /* ATA device (unsupported) */
extern struct device_attribute dev_attr_unload_heads;
#ifdef CONFIG_SATA_HOST
extern struct device_attribute dev_attr_link_power_management_policy;
+extern struct device_attribute dev_attr_ncq_prio_supported;
extern struct device_attribute dev_attr_ncq_prio_enable;
extern struct device_attribute dev_attr_em_message_type;
extern struct device_attribute dev_attr_em_message;
static inline bool ata_is_host_link(const struct ata_link *link)
{
- return 1;
+ return true;
}
#endif /* CONFIG_SATA_PMP */
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef NVM_H
-#define NVM_H
-
-#include <linux/blkdev.h>
-#include <linux/types.h>
-#include <uapi/linux/lightnvm.h>
-
-enum {
- NVM_IO_OK = 0,
- NVM_IO_REQUEUE = 1,
- NVM_IO_DONE = 2,
- NVM_IO_ERR = 3,
-
- NVM_IOTYPE_NONE = 0,
- NVM_IOTYPE_GC = 1,
-};
-
-/* common format */
-#define NVM_GEN_CH_BITS (8)
-#define NVM_GEN_LUN_BITS (8)
-#define NVM_GEN_BLK_BITS (16)
-#define NVM_GEN_RESERVED (32)
-
-/* 1.2 format */
-#define NVM_12_PG_BITS (16)
-#define NVM_12_PL_BITS (4)
-#define NVM_12_SEC_BITS (4)
-#define NVM_12_RESERVED (8)
-
-/* 2.0 format */
-#define NVM_20_SEC_BITS (24)
-#define NVM_20_RESERVED (8)
-
-enum {
- NVM_OCSSD_SPEC_12 = 12,
- NVM_OCSSD_SPEC_20 = 20,
-};
-
-struct ppa_addr {
- /* Generic structure for all addresses */
- union {
- /* generic device format */
- struct {
- u64 ch : NVM_GEN_CH_BITS;
- u64 lun : NVM_GEN_LUN_BITS;
- u64 blk : NVM_GEN_BLK_BITS;
- u64 reserved : NVM_GEN_RESERVED;
- } a;
-
- /* 1.2 device format */
- struct {
- u64 ch : NVM_GEN_CH_BITS;
- u64 lun : NVM_GEN_LUN_BITS;
- u64 blk : NVM_GEN_BLK_BITS;
- u64 pg : NVM_12_PG_BITS;
- u64 pl : NVM_12_PL_BITS;
- u64 sec : NVM_12_SEC_BITS;
- u64 reserved : NVM_12_RESERVED;
- } g;
-
- /* 2.0 device format */
- struct {
- u64 grp : NVM_GEN_CH_BITS;
- u64 pu : NVM_GEN_LUN_BITS;
- u64 chk : NVM_GEN_BLK_BITS;
- u64 sec : NVM_20_SEC_BITS;
- u64 reserved : NVM_20_RESERVED;
- } m;
-
- struct {
- u64 line : 63;
- u64 is_cached : 1;
- } c;
-
- u64 ppa;
- };
-};
-
-struct nvm_rq;
-struct nvm_id;
-struct nvm_dev;
-struct nvm_tgt_dev;
-struct nvm_chk_meta;
-
-typedef int (nvm_id_fn)(struct nvm_dev *);
-typedef int (nvm_op_bb_tbl_fn)(struct nvm_dev *, struct ppa_addr, u8 *);
-typedef int (nvm_op_set_bb_fn)(struct nvm_dev *, struct ppa_addr *, int, int);
-typedef int (nvm_get_chk_meta_fn)(struct nvm_dev *, sector_t, int,
- struct nvm_chk_meta *);
-typedef int (nvm_submit_io_fn)(struct nvm_dev *, struct nvm_rq *, void *);
-typedef void *(nvm_create_dma_pool_fn)(struct nvm_dev *, char *, int);
-typedef void (nvm_destroy_dma_pool_fn)(void *);
-typedef void *(nvm_dev_dma_alloc_fn)(struct nvm_dev *, void *, gfp_t,
- dma_addr_t *);
-typedef void (nvm_dev_dma_free_fn)(void *, void*, dma_addr_t);
-
-struct nvm_dev_ops {
- nvm_id_fn *identity;
- nvm_op_bb_tbl_fn *get_bb_tbl;
- nvm_op_set_bb_fn *set_bb_tbl;
-
- nvm_get_chk_meta_fn *get_chk_meta;
-
- nvm_submit_io_fn *submit_io;
-
- nvm_create_dma_pool_fn *create_dma_pool;
- nvm_destroy_dma_pool_fn *destroy_dma_pool;
- nvm_dev_dma_alloc_fn *dev_dma_alloc;
- nvm_dev_dma_free_fn *dev_dma_free;
-};
-
-#ifdef CONFIG_NVM
-
-#include <linux/file.h>
-#include <linux/dmapool.h>
-
-enum {
- /* HW Responsibilities */
- NVM_RSP_L2P = 1 << 0,
- NVM_RSP_ECC = 1 << 1,
-
- /* Physical Adressing Mode */
- NVM_ADDRMODE_LINEAR = 0,
- NVM_ADDRMODE_CHANNEL = 1,
-
- /* Plane programming mode for LUN */
- NVM_PLANE_SINGLE = 1,
- NVM_PLANE_DOUBLE = 2,
- NVM_PLANE_QUAD = 4,
-
- /* Status codes */
- NVM_RSP_SUCCESS = 0x0,
- NVM_RSP_NOT_CHANGEABLE = 0x1,
- NVM_RSP_ERR_FAILWRITE = 0x40ff,
- NVM_RSP_ERR_EMPTYPAGE = 0x42ff,
- NVM_RSP_ERR_FAILECC = 0x4281,
- NVM_RSP_ERR_FAILCRC = 0x4004,
- NVM_RSP_WARN_HIGHECC = 0x4700,
-
- /* Device opcodes */
- NVM_OP_PWRITE = 0x91,
- NVM_OP_PREAD = 0x92,
- NVM_OP_ERASE = 0x90,
-
- /* PPA Command Flags */
- NVM_IO_SNGL_ACCESS = 0x0,
- NVM_IO_DUAL_ACCESS = 0x1,
- NVM_IO_QUAD_ACCESS = 0x2,
-
- /* NAND Access Modes */
- NVM_IO_SUSPEND = 0x80,
- NVM_IO_SLC_MODE = 0x100,
- NVM_IO_SCRAMBLE_ENABLE = 0x200,
-
- /* Block Types */
- NVM_BLK_T_FREE = 0x0,
- NVM_BLK_T_BAD = 0x1,
- NVM_BLK_T_GRWN_BAD = 0x2,
- NVM_BLK_T_DEV = 0x4,
- NVM_BLK_T_HOST = 0x8,
-
- /* Memory capabilities */
- NVM_ID_CAP_SLC = 0x1,
- NVM_ID_CAP_CMD_SUSPEND = 0x2,
- NVM_ID_CAP_SCRAMBLE = 0x4,
- NVM_ID_CAP_ENCRYPT = 0x8,
-
- /* Memory types */
- NVM_ID_FMTYPE_SLC = 0,
- NVM_ID_FMTYPE_MLC = 1,
-
- /* Device capabilities */
- NVM_ID_DCAP_BBLKMGMT = 0x1,
- NVM_UD_DCAP_ECC = 0x2,
-};
-
-struct nvm_id_lp_mlc {
- u16 num_pairs;
- u8 pairs[886];
-};
-
-struct nvm_id_lp_tbl {
- __u8 id[8];
- struct nvm_id_lp_mlc mlc;
-};
-
-struct nvm_addrf_12 {
- u8 ch_len;
- u8 lun_len;
- u8 blk_len;
- u8 pg_len;
- u8 pln_len;
- u8 sec_len;
-
- u8 ch_offset;
- u8 lun_offset;
- u8 blk_offset;
- u8 pg_offset;
- u8 pln_offset;
- u8 sec_offset;
-
- u64 ch_mask;
- u64 lun_mask;
- u64 blk_mask;
- u64 pg_mask;
- u64 pln_mask;
- u64 sec_mask;
-};
-
-struct nvm_addrf {
- u8 ch_len;
- u8 lun_len;
- u8 chk_len;
- u8 sec_len;
- u8 rsv_len[2];
-
- u8 ch_offset;
- u8 lun_offset;
- u8 chk_offset;
- u8 sec_offset;
- u8 rsv_off[2];
-
- u64 ch_mask;
- u64 lun_mask;
- u64 chk_mask;
- u64 sec_mask;
- u64 rsv_mask[2];
-};
-
-enum {
- /* Chunk states */
- NVM_CHK_ST_FREE = 1 << 0,
- NVM_CHK_ST_CLOSED = 1 << 1,
- NVM_CHK_ST_OPEN = 1 << 2,
- NVM_CHK_ST_OFFLINE = 1 << 3,
-
- /* Chunk types */
- NVM_CHK_TP_W_SEQ = 1 << 0,
- NVM_CHK_TP_W_RAN = 1 << 1,
- NVM_CHK_TP_SZ_SPEC = 1 << 4,
-};
-
-/*
- * Note: The structure size is linked to nvme_nvm_chk_meta such that the same
- * buffer can be used when converting from little endian to cpu addressing.
- */
-struct nvm_chk_meta {
- u8 state;
- u8 type;
- u8 wi;
- u8 rsvd[5];
- u64 slba;
- u64 cnlb;
- u64 wp;
-};
-
-struct nvm_target {
- struct list_head list;
- struct nvm_tgt_dev *dev;
- struct nvm_tgt_type *type;
- struct gendisk *disk;
-};
-
-#define ADDR_EMPTY (~0ULL)
-
-#define NVM_TARGET_DEFAULT_OP (101)
-#define NVM_TARGET_MIN_OP (3)
-#define NVM_TARGET_MAX_OP (80)
-
-#define NVM_VERSION_MAJOR 1
-#define NVM_VERSION_MINOR 0
-#define NVM_VERSION_PATCH 0
-
-#define NVM_MAX_VLBA (64) /* max logical blocks in a vector command */
-
-struct nvm_rq;
-typedef void (nvm_end_io_fn)(struct nvm_rq *);
-
-struct nvm_rq {
- struct nvm_tgt_dev *dev;
-
- struct bio *bio;
-
- union {
- struct ppa_addr ppa_addr;
- dma_addr_t dma_ppa_list;
- };
-
- struct ppa_addr *ppa_list;
-
- void *meta_list;
- dma_addr_t dma_meta_list;
-
- nvm_end_io_fn *end_io;
-
- uint8_t opcode;
- uint16_t nr_ppas;
- uint16_t flags;
-
- u64 ppa_status; /* ppa media status */
- int error;
-
- int is_seq; /* Sequential hint flag. 1.2 only */
-
- void *private;
-};
-
-static inline struct nvm_rq *nvm_rq_from_pdu(void *pdu)
-{
- return pdu - sizeof(struct nvm_rq);
-}
-
-static inline void *nvm_rq_to_pdu(struct nvm_rq *rqdata)
-{
- return rqdata + 1;
-}
-
-static inline struct ppa_addr *nvm_rq_to_ppa_list(struct nvm_rq *rqd)
-{
- return (rqd->nr_ppas > 1) ? rqd->ppa_list : &rqd->ppa_addr;
-}
-
-enum {
- NVM_BLK_ST_FREE = 0x1, /* Free block */
- NVM_BLK_ST_TGT = 0x2, /* Block in use by target */
- NVM_BLK_ST_BAD = 0x8, /* Bad block */
-};
-
-/* Instance geometry */
-struct nvm_geo {
- /* device reported version */
- u8 major_ver_id;
- u8 minor_ver_id;
-
- /* kernel short version */
- u8 version;
-
- /* instance specific geometry */
- int num_ch;
- int num_lun; /* per channel */
-
- /* calculated values */
- int all_luns; /* across channels */
- int all_chunks; /* across channels */
-
- int op; /* over-provision in instance */
-
- sector_t total_secs; /* across channels */
-
- /* chunk geometry */
- u32 num_chk; /* chunks per lun */
- u32 clba; /* sectors per chunk */
- u16 csecs; /* sector size */
- u16 sos; /* out-of-band area size */
- bool ext; /* metadata in extended data buffer */
- u32 mdts; /* Max data transfer size*/
-
- /* device write constrains */
- u32 ws_min; /* minimum write size */
- u32 ws_opt; /* optimal write size */
- u32 mw_cunits; /* distance required for successful read */
- u32 maxoc; /* maximum open chunks */
- u32 maxocpu; /* maximum open chunks per parallel unit */
-
- /* device capabilities */
- u32 mccap;
-
- /* device timings */
- u32 trdt; /* Avg. Tread (ns) */
- u32 trdm; /* Max Tread (ns) */
- u32 tprt; /* Avg. Tprog (ns) */
- u32 tprm; /* Max Tprog (ns) */
- u32 tbet; /* Avg. Terase (ns) */
- u32 tbem; /* Max Terase (ns) */
-
- /* generic address format */
- struct nvm_addrf addrf;
-
- /* 1.2 compatibility */
- u8 vmnt;
- u32 cap;
- u32 dom;
-
- u8 mtype;
- u8 fmtype;
-
- u16 cpar;
- u32 mpos;
-
- u8 num_pln;
- u8 pln_mode;
- u16 num_pg;
- u16 fpg_sz;
-};
-
-/* sub-device structure */
-struct nvm_tgt_dev {
- /* Device information */
- struct nvm_geo geo;
-
- /* Base ppas for target LUNs */
- struct ppa_addr *luns;
-
- struct request_queue *q;
-
- struct nvm_dev *parent;
- void *map;
-};
-
-struct nvm_dev {
- struct nvm_dev_ops *ops;
-
- struct list_head devices;
-
- /* Device information */
- struct nvm_geo geo;
-
- unsigned long *lun_map;
- void *dma_pool;
-
- /* Backend device */
- struct request_queue *q;
- char name[DISK_NAME_LEN];
- void *private_data;
-
- struct kref ref;
- void *rmap;
-
- struct mutex mlock;
- spinlock_t lock;
-
- /* target management */
- struct list_head area_list;
- struct list_head targets;
-};
-
-static inline struct ppa_addr generic_to_dev_addr(struct nvm_dev *dev,
- struct ppa_addr r)
-{
- struct nvm_geo *geo = &dev->geo;
- struct ppa_addr l;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = (struct nvm_addrf_12 *)&geo->addrf;
-
- l.ppa = ((u64)r.g.ch) << ppaf->ch_offset;
- l.ppa |= ((u64)r.g.lun) << ppaf->lun_offset;
- l.ppa |= ((u64)r.g.blk) << ppaf->blk_offset;
- l.ppa |= ((u64)r.g.pg) << ppaf->pg_offset;
- l.ppa |= ((u64)r.g.pl) << ppaf->pln_offset;
- l.ppa |= ((u64)r.g.sec) << ppaf->sec_offset;
- } else {
- struct nvm_addrf *lbaf = &geo->addrf;
-
- l.ppa = ((u64)r.m.grp) << lbaf->ch_offset;
- l.ppa |= ((u64)r.m.pu) << lbaf->lun_offset;
- l.ppa |= ((u64)r.m.chk) << lbaf->chk_offset;
- l.ppa |= ((u64)r.m.sec) << lbaf->sec_offset;
- }
-
- return l;
-}
-
-static inline struct ppa_addr dev_to_generic_addr(struct nvm_dev *dev,
- struct ppa_addr r)
-{
- struct nvm_geo *geo = &dev->geo;
- struct ppa_addr l;
-
- l.ppa = 0;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = (struct nvm_addrf_12 *)&geo->addrf;
-
- l.g.ch = (r.ppa & ppaf->ch_mask) >> ppaf->ch_offset;
- l.g.lun = (r.ppa & ppaf->lun_mask) >> ppaf->lun_offset;
- l.g.blk = (r.ppa & ppaf->blk_mask) >> ppaf->blk_offset;
- l.g.pg = (r.ppa & ppaf->pg_mask) >> ppaf->pg_offset;
- l.g.pl = (r.ppa & ppaf->pln_mask) >> ppaf->pln_offset;
- l.g.sec = (r.ppa & ppaf->sec_mask) >> ppaf->sec_offset;
- } else {
- struct nvm_addrf *lbaf = &geo->addrf;
-
- l.m.grp = (r.ppa & lbaf->ch_mask) >> lbaf->ch_offset;
- l.m.pu = (r.ppa & lbaf->lun_mask) >> lbaf->lun_offset;
- l.m.chk = (r.ppa & lbaf->chk_mask) >> lbaf->chk_offset;
- l.m.sec = (r.ppa & lbaf->sec_mask) >> lbaf->sec_offset;
- }
-
- return l;
-}
-
-static inline u64 dev_to_chunk_addr(struct nvm_dev *dev, void *addrf,
- struct ppa_addr p)
-{
- struct nvm_geo *geo = &dev->geo;
- u64 caddr;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = (struct nvm_addrf_12 *)addrf;
-
- caddr = (u64)p.g.pg << ppaf->pg_offset;
- caddr |= (u64)p.g.pl << ppaf->pln_offset;
- caddr |= (u64)p.g.sec << ppaf->sec_offset;
- } else {
- caddr = p.m.sec;
- }
-
- return caddr;
-}
-
-static inline struct ppa_addr nvm_ppa32_to_ppa64(struct nvm_dev *dev,
- void *addrf, u32 ppa32)
-{
- struct ppa_addr ppa64;
-
- ppa64.ppa = 0;
-
- if (ppa32 == -1) {
- ppa64.ppa = ADDR_EMPTY;
- } else if (ppa32 & (1U << 31)) {
- ppa64.c.line = ppa32 & ((~0U) >> 1);
- ppa64.c.is_cached = 1;
- } else {
- struct nvm_geo *geo = &dev->geo;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = addrf;
-
- ppa64.g.ch = (ppa32 & ppaf->ch_mask) >>
- ppaf->ch_offset;
- ppa64.g.lun = (ppa32 & ppaf->lun_mask) >>
- ppaf->lun_offset;
- ppa64.g.blk = (ppa32 & ppaf->blk_mask) >>
- ppaf->blk_offset;
- ppa64.g.pg = (ppa32 & ppaf->pg_mask) >>
- ppaf->pg_offset;
- ppa64.g.pl = (ppa32 & ppaf->pln_mask) >>
- ppaf->pln_offset;
- ppa64.g.sec = (ppa32 & ppaf->sec_mask) >>
- ppaf->sec_offset;
- } else {
- struct nvm_addrf *lbaf = addrf;
-
- ppa64.m.grp = (ppa32 & lbaf->ch_mask) >>
- lbaf->ch_offset;
- ppa64.m.pu = (ppa32 & lbaf->lun_mask) >>
- lbaf->lun_offset;
- ppa64.m.chk = (ppa32 & lbaf->chk_mask) >>
- lbaf->chk_offset;
- ppa64.m.sec = (ppa32 & lbaf->sec_mask) >>
- lbaf->sec_offset;
- }
- }
-
- return ppa64;
-}
-
-static inline u32 nvm_ppa64_to_ppa32(struct nvm_dev *dev,
- void *addrf, struct ppa_addr ppa64)
-{
- u32 ppa32 = 0;
-
- if (ppa64.ppa == ADDR_EMPTY) {
- ppa32 = ~0U;
- } else if (ppa64.c.is_cached) {
- ppa32 |= ppa64.c.line;
- ppa32 |= 1U << 31;
- } else {
- struct nvm_geo *geo = &dev->geo;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- struct nvm_addrf_12 *ppaf = addrf;
-
- ppa32 |= ppa64.g.ch << ppaf->ch_offset;
- ppa32 |= ppa64.g.lun << ppaf->lun_offset;
- ppa32 |= ppa64.g.blk << ppaf->blk_offset;
- ppa32 |= ppa64.g.pg << ppaf->pg_offset;
- ppa32 |= ppa64.g.pl << ppaf->pln_offset;
- ppa32 |= ppa64.g.sec << ppaf->sec_offset;
- } else {
- struct nvm_addrf *lbaf = addrf;
-
- ppa32 |= ppa64.m.grp << lbaf->ch_offset;
- ppa32 |= ppa64.m.pu << lbaf->lun_offset;
- ppa32 |= ppa64.m.chk << lbaf->chk_offset;
- ppa32 |= ppa64.m.sec << lbaf->sec_offset;
- }
- }
-
- return ppa32;
-}
-
-static inline int nvm_next_ppa_in_chk(struct nvm_tgt_dev *dev,
- struct ppa_addr *ppa)
-{
- struct nvm_geo *geo = &dev->geo;
- int last = 0;
-
- if (geo->version == NVM_OCSSD_SPEC_12) {
- int sec = ppa->g.sec;
-
- sec++;
- if (sec == geo->ws_min) {
- int pg = ppa->g.pg;
-
- sec = 0;
- pg++;
- if (pg == geo->num_pg) {
- int pl = ppa->g.pl;
-
- pg = 0;
- pl++;
- if (pl == geo->num_pln)
- last = 1;
-
- ppa->g.pl = pl;
- }
- ppa->g.pg = pg;
- }
- ppa->g.sec = sec;
- } else {
- ppa->m.sec++;
- if (ppa->m.sec == geo->clba)
- last = 1;
- }
-
- return last;
-}
-
-typedef sector_t (nvm_tgt_capacity_fn)(void *);
-typedef void *(nvm_tgt_init_fn)(struct nvm_tgt_dev *, struct gendisk *,
- int flags);
-typedef void (nvm_tgt_exit_fn)(void *, bool);
-typedef int (nvm_tgt_sysfs_init_fn)(struct gendisk *);
-typedef void (nvm_tgt_sysfs_exit_fn)(struct gendisk *);
-
-enum {
- NVM_TGT_F_DEV_L2P = 0,
- NVM_TGT_F_HOST_L2P = 1 << 0,
-};
-
-struct nvm_tgt_type {
- const char *name;
- unsigned int version[3];
- int flags;
-
- /* target entry points */
- const struct block_device_operations *bops;
- nvm_tgt_capacity_fn *capacity;
-
- /* module-specific init/teardown */
- nvm_tgt_init_fn *init;
- nvm_tgt_exit_fn *exit;
-
- /* sysfs */
- nvm_tgt_sysfs_init_fn *sysfs_init;
- nvm_tgt_sysfs_exit_fn *sysfs_exit;
-
- /* For internal use */
- struct list_head list;
- struct module *owner;
-};
-
-extern int nvm_register_tgt_type(struct nvm_tgt_type *);
-extern void nvm_unregister_tgt_type(struct nvm_tgt_type *);
-
-extern void *nvm_dev_dma_alloc(struct nvm_dev *, gfp_t, dma_addr_t *);
-extern void nvm_dev_dma_free(struct nvm_dev *, void *, dma_addr_t);
-
-extern struct nvm_dev *nvm_alloc_dev(int);
-extern int nvm_register(struct nvm_dev *);
-extern void nvm_unregister(struct nvm_dev *);
-
-extern int nvm_get_chunk_meta(struct nvm_tgt_dev *, struct ppa_addr,
- int, struct nvm_chk_meta *);
-extern int nvm_set_chunk_meta(struct nvm_tgt_dev *, struct ppa_addr *,
- int, int);
-extern int nvm_submit_io(struct nvm_tgt_dev *, struct nvm_rq *, void *);
-extern int nvm_submit_io_sync(struct nvm_tgt_dev *, struct nvm_rq *, void *);
-extern void nvm_end_io(struct nvm_rq *);
-
-#else /* CONFIG_NVM */
-struct nvm_dev_ops;
-
-static inline struct nvm_dev *nvm_alloc_dev(int node)
-{
- return ERR_PTR(-EINVAL);
-}
-static inline int nvm_register(struct nvm_dev *dev)
-{
- return -EINVAL;
-}
-static inline void nvm_unregister(struct nvm_dev *dev) {}
-#endif /* CONFIG_NVM */
-#endif /* LIGHTNVM.H */
int linear_range_get_selector_high(const struct linear_range *r,
unsigned int val, unsigned int *selector,
bool *found);
+void linear_range_get_selector_within(const struct linear_range *r,
+ unsigned int val, unsigned int *selector);
int linear_range_get_selector_low_array(const struct linear_range *r,
int ranges, unsigned int val,
unsigned int *selector, bool *found);
#include <linux/percpu-defs.h>
#include <linux/lockdep.h>
+#ifndef CONFIG_PREEMPT_RT
+
typedef struct {
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
} local_lock_t;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define LL_DEP_MAP_INIT(lockname) \
+# define LOCAL_LOCK_DEBUG_INIT(lockname) \
.dep_map = { \
.name = #lockname, \
.wait_type_inner = LD_WAIT_CONFIG, \
- .lock_type = LD_LOCK_PERCPU, \
- }
-#else
-# define LL_DEP_MAP_INIT(lockname)
-#endif
-
-#define INIT_LOCAL_LOCK(lockname) { LL_DEP_MAP_INIT(lockname) }
-
-#define __local_lock_init(lock) \
-do { \
- static struct lock_class_key __key; \
- \
- debug_check_no_locks_freed((void *)lock, sizeof(*lock));\
- lockdep_init_map_type(&(lock)->dep_map, #lock, &__key, 0, \
- LD_WAIT_CONFIG, LD_WAIT_INV, \
- LD_LOCK_PERCPU); \
-} while (0)
+ .lock_type = LD_LOCK_PERCPU, \
+ }, \
+ .owner = NULL,
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
static inline void local_lock_acquire(local_lock_t *l)
{
lock_map_acquire(&l->dep_map);
lock_map_release(&l->dep_map);
}
+static inline void local_lock_debug_init(local_lock_t *l)
+{
+ l->owner = NULL;
+}
#else /* CONFIG_DEBUG_LOCK_ALLOC */
+# define LOCAL_LOCK_DEBUG_INIT(lockname)
static inline void local_lock_acquire(local_lock_t *l) { }
static inline void local_lock_release(local_lock_t *l) { }
+static inline void local_lock_debug_init(local_lock_t *l) { }
#endif /* !CONFIG_DEBUG_LOCK_ALLOC */
+#define INIT_LOCAL_LOCK(lockname) { LOCAL_LOCK_DEBUG_INIT(lockname) }
+
+#define __local_lock_init(lock) \
+do { \
+ static struct lock_class_key __key; \
+ \
+ debug_check_no_locks_freed((void *)lock, sizeof(*lock));\
+ lockdep_init_map_type(&(lock)->dep_map, #lock, &__key, \
+ 0, LD_WAIT_CONFIG, LD_WAIT_INV, \
+ LD_LOCK_PERCPU); \
+ local_lock_debug_init(lock); \
+} while (0)
+
#define __local_lock(lock) \
do { \
preempt_disable(); \
local_lock_release(this_cpu_ptr(lock)); \
local_irq_restore(flags); \
} while (0)
+
+#else /* !CONFIG_PREEMPT_RT */
+
+/*
+ * On PREEMPT_RT local_lock maps to a per CPU spinlock, which protects the
+ * critical section while staying preemptible.
+ */
+typedef spinlock_t local_lock_t;
+
+#define INIT_LOCAL_LOCK(lockname) __LOCAL_SPIN_LOCK_UNLOCKED((lockname))
+
+#define __local_lock_init(l) \
+ do { \
+ local_spin_lock_init((l)); \
+ } while (0)
+
+#define __local_lock(__lock) \
+ do { \
+ migrate_disable(); \
+ spin_lock(this_cpu_ptr((__lock))); \
+ } while (0)
+
+#define __local_lock_irq(lock) __local_lock(lock)
+
+#define __local_lock_irqsave(lock, flags) \
+ do { \
+ typecheck(unsigned long, flags); \
+ flags = 0; \
+ __local_lock(lock); \
+ } while (0)
+
+#define __local_unlock(__lock) \
+ do { \
+ spin_unlock(this_cpu_ptr((__lock))); \
+ migrate_enable(); \
+ } while (0)
+
+#define __local_unlock_irq(lock) __local_unlock(lock)
+
+#define __local_unlock_irqrestore(lock, flags) __local_unlock(lock)
+
+#endif /* CONFIG_PREEMPT_RT */
#define RT5033_REGULATOR_BUCK_VOLTAGE_MIN 1000000U
#define RT5033_REGULATOR_BUCK_VOLTAGE_MAX 3000000U
#define RT5033_REGULATOR_BUCK_VOLTAGE_STEP 100000U
-#define RT5033_REGULATOR_BUCK_VOLTAGE_STEP_NUM 21
+#define RT5033_REGULATOR_BUCK_VOLTAGE_STEP_NUM 32
/* RT5033 regulator LDO output voltage uV */
#define RT5033_REGULATOR_LDO_VOLTAGE_MIN 1200000U
#define RT5033_REGULATOR_LDO_VOLTAGE_MAX 3000000U
#define RT5033_REGULATOR_LDO_VOLTAGE_STEP 100000U
-#define RT5033_REGULATOR_LDO_VOLTAGE_STEP_NUM 19
+#define RT5033_REGULATOR_LDO_VOLTAGE_STEP_NUM 32
/* RT5033 regulator SAFE LDO output voltage uV */
#define RT5033_REGULATOR_SAFE_LDO_VOLTAGE 4900000U
* host and device execution environments match and
* channels are in a DISABLED state.
* @mhi_dev: Device associated with the channels
- * @flags: MHI channel flags
*/
-int mhi_prepare_for_transfer(struct mhi_device *mhi_dev,
- unsigned int flags);
-
-/* Automatically allocate and queue inbound buffers */
-#define MHI_CH_INBOUND_ALLOC_BUFS BIT(0)
+int mhi_prepare_for_transfer(struct mhi_device *mhi_dev);
/**
* mhi_unprepare_from_transfer - Reset UL and DL channels for data transfer.
u8 raw_hc_erase_gap_size; /* 221 */
u8 raw_erase_timeout_mult; /* 223 */
u8 raw_hc_erase_grp_size; /* 224 */
+ u8 raw_boot_mult; /* 226 */
u8 raw_sec_trim_mult; /* 229 */
u8 raw_sec_erase_mult; /* 230 */
u8 raw_sec_feature_support;/* 231 */
#else
#define MMC_CAP2_CRYPTO 0
#endif
+#define MMC_CAP2_ALT_GPT_TEGRA (1 << 28) /* Host with eMMC that has GPT entry at a non-standard location */
int fixed_drv_type; /* fixed driver type for non-removable media */
static inline void leave_mm(int cpu) { }
#endif
+/*
+ * CPUs that are capable of running user task @p. Must contain at least one
+ * active CPU. It is assumed that the kernel can run on all CPUs, so calling
+ * this for a kernel thread is pointless.
+ *
+ * By default, we assume a sane, homogeneous system.
+ */
+#ifndef task_cpu_possible_mask
+# define task_cpu_possible_mask(p) cpu_possible_mask
+# define task_cpu_possible(cpu, p) true
+#else
+# define task_cpu_possible(cpu, p) cpumask_test_cpu((cpu), task_cpu_possible_mask(p))
+#endif
+
#endif
extern const struct kernel_param_ops param_ops_uint;
extern int param_set_uint(const char *val, const struct kernel_param *kp);
extern int param_get_uint(char *buffer, const struct kernel_param *kp);
+int param_set_uint_minmax(const char *val, const struct kernel_param *kp,
+ unsigned int min, unsigned int max);
#define param_check_uint(name, p) __param_check(name, p, unsigned int)
extern const struct kernel_param_ops param_ops_long;
* address or data changes
* @write_msi_msg_data: Data parameter for the callback.
*
- * @masked: [PCI MSI/X] Mask bits
+ * @msi_mask: [PCI MSI] MSI cached mask bits
+ * @msix_ctrl: [PCI MSI-X] MSI-X cached per vector control bits
* @is_msix: [PCI MSI/X] True if MSI-X
* @multiple: [PCI MSI/X] log2 num of messages allocated
* @multi_cap: [PCI MSI/X] log2 num of messages supported
union {
/* PCI MSI/X specific data */
struct {
- u32 masked;
+ union {
+ u32 msi_mask;
+ u32 msix_ctrl;
+ };
struct {
u8 is_msix : 1;
u8 multiple : 3;
void __pci_read_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
-u32 __pci_msix_desc_mask_irq(struct msi_desc *desc, u32 flag);
-void __pci_msi_desc_mask_irq(struct msi_desc *desc, u32 mask, u32 flag);
void pci_msi_mask_irq(struct irq_data *data);
void pci_msi_unmask_irq(struct irq_data *data);
+const struct attribute_group **msi_populate_sysfs(struct device *dev);
+void msi_destroy_sysfs(struct device *dev,
+ const struct attribute_group **msi_irq_groups);
+
/*
* The arch hooks to setup up msi irqs. Default functions are implemented
* as weak symbols so that they /can/ be overriden by architecture specific
#include <linux/osq_lock.h>
#include <linux/debug_locks.h>
-struct ww_class;
-struct ww_acquire_ctx;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+# define __DEP_MAP_MUTEX_INITIALIZER(lockname) \
+ , .dep_map = { \
+ .name = #lockname, \
+ .wait_type_inner = LD_WAIT_SLEEP, \
+ }
+#else
+# define __DEP_MAP_MUTEX_INITIALIZER(lockname)
+#endif
+
+#ifndef CONFIG_PREEMPT_RT
/*
* Simple, straightforward mutexes with strict semantics:
*/
struct mutex {
atomic_long_t owner;
- spinlock_t wait_lock;
+ raw_spinlock_t wait_lock;
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
struct optimistic_spin_queue osq; /* Spinner MCS lock */
#endif
#endif
};
-struct ww_mutex {
- struct mutex base;
- struct ww_acquire_ctx *ctx;
-#ifdef CONFIG_DEBUG_MUTEXES
- struct ww_class *ww_class;
-#endif
-};
-
-/*
- * This is the control structure for tasks blocked on mutex,
- * which resides on the blocked task's kernel stack:
- */
-struct mutex_waiter {
- struct list_head list;
- struct task_struct *task;
- struct ww_acquire_ctx *ww_ctx;
-#ifdef CONFIG_DEBUG_MUTEXES
- void *magic;
-#endif
-};
-
#ifdef CONFIG_DEBUG_MUTEXES
#define __DEBUG_MUTEX_INITIALIZER(lockname) \
__mutex_init((mutex), #mutex, &__key); \
} while (0)
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __DEP_MAP_MUTEX_INITIALIZER(lockname) \
- , .dep_map = { \
- .name = #lockname, \
- .wait_type_inner = LD_WAIT_SLEEP, \
- }
-#else
-# define __DEP_MAP_MUTEX_INITIALIZER(lockname)
-#endif
-
#define __MUTEX_INITIALIZER(lockname) \
{ .owner = ATOMIC_LONG_INIT(0) \
- , .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
+ , .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
__DEBUG_MUTEX_INITIALIZER(lockname) \
__DEP_MAP_MUTEX_INITIALIZER(lockname) }
*/
extern bool mutex_is_locked(struct mutex *lock);
+#else /* !CONFIG_PREEMPT_RT */
+/*
+ * Preempt-RT variant based on rtmutexes.
+ */
+#include <linux/rtmutex.h>
+
+struct mutex {
+ struct rt_mutex_base rtmutex;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
+};
+
+#define __MUTEX_INITIALIZER(mutexname) \
+{ \
+ .rtmutex = __RT_MUTEX_BASE_INITIALIZER(mutexname.rtmutex) \
+ __DEP_MAP_MUTEX_INITIALIZER(mutexname) \
+}
+
+#define DEFINE_MUTEX(mutexname) \
+ struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
+
+extern void __mutex_rt_init(struct mutex *lock, const char *name,
+ struct lock_class_key *key);
+extern int mutex_trylock(struct mutex *lock);
+
+static inline void mutex_destroy(struct mutex *lock) { }
+
+#define mutex_is_locked(l) rt_mutex_base_is_locked(&(l)->rtmutex)
+
+#define __mutex_init(mutex, name, key) \
+do { \
+ rt_mutex_base_init(&(mutex)->rtmutex); \
+ __mutex_rt_init((mutex), name, key); \
+} while (0)
+
+#define mutex_init(mutex) \
+do { \
+ static struct lock_class_key __key; \
+ \
+ __mutex_init((mutex), #mutex, &__key); \
+} while (0)
+#endif /* CONFIG_PREEMPT_RT */
+
/*
* See kernel/locking/mutex.c for detailed documentation of these APIs.
* Also see Documentation/locking/mutex-design.rst.
#ifndef PADATA_H
#define PADATA_H
+#include <linux/refcount.h>
#include <linux/compiler_types.h>
#include <linux/workqueue.h>
#include <linux/spinlock.h>
struct padata_shell *ps;
struct padata_list __percpu *reorder_list;
struct padata_serial_queue __percpu *squeue;
- atomic_t refcnt;
+ refcount_t refcnt;
unsigned int seq_nr;
unsigned int processed;
int cpu;
#define PCI_DEVICE_ID_3COM_3CR990SVR 0x990a
#define PCI_VENDOR_ID_AL 0x10b9
+#define PCI_DEVICE_ID_AL_M1489 0x1489
#define PCI_DEVICE_ID_AL_M1533 0x1533
#define PCI_DEVICE_ID_AL_M1535 0x1535
#define PCI_DEVICE_ID_AL_M1541 0x1541
#define PCI_DEVICE_ID_INTEL_82375 0x0482
#define PCI_DEVICE_ID_INTEL_82424 0x0483
#define PCI_DEVICE_ID_INTEL_82378 0x0484
+#define PCI_DEVICE_ID_INTEL_82425 0x0486
#define PCI_DEVICE_ID_INTEL_MRST_SD0 0x0807
#define PCI_DEVICE_ID_INTEL_MRST_SD1 0x0808
#define PCI_DEVICE_ID_INTEL_MFD_SD 0x0820
extern struct pid *pidfd_pid(const struct file *file);
struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags);
+int pidfd_create(struct pid *pid, unsigned int flags);
static inline struct pid *get_pid(struct pid *pid)
{
EC_DEVICE_EVENT_TRACKPAD,
EC_DEVICE_EVENT_DSP,
EC_DEVICE_EVENT_WIFI,
+ EC_DEVICE_EVENT_WLC,
};
enum ec_device_event_param {
/* Issue AP reset */
#define EC_CMD_AP_RESET 0x0125
+/**
+ * Get the number of peripheral charge ports
+ */
+#define EC_CMD_PCHG_COUNT 0x0134
+
+#define EC_PCHG_MAX_PORTS 8
+
+struct ec_response_pchg_count {
+ uint8_t port_count;
+} __ec_align1;
+
+/**
+ * Get the status of a peripheral charge port
+ */
+#define EC_CMD_PCHG 0x0135
+
+struct ec_params_pchg {
+ uint8_t port;
+} __ec_align1;
+
+struct ec_response_pchg {
+ uint32_t error; /* enum pchg_error */
+ uint8_t state; /* enum pchg_state state */
+ uint8_t battery_percentage;
+ uint8_t unused0;
+ uint8_t unused1;
+ /* Fields added in version 1 */
+ uint32_t fw_version;
+ uint32_t dropped_event_count;
+} __ec_align2;
+
+enum pchg_state {
+ /* Charger is reset and not initialized. */
+ PCHG_STATE_RESET = 0,
+ /* Charger is initialized or disabled. */
+ PCHG_STATE_INITIALIZED,
+ /* Charger is enabled and ready to detect a device. */
+ PCHG_STATE_ENABLED,
+ /* Device is in proximity. */
+ PCHG_STATE_DETECTED,
+ /* Device is being charged. */
+ PCHG_STATE_CHARGING,
+ /* Device is fully charged. It implies DETECTED (& not charging). */
+ PCHG_STATE_FULL,
+ /* In download (a.k.a. firmware update) mode */
+ PCHG_STATE_DOWNLOAD,
+ /* In download mode. Ready for receiving data. */
+ PCHG_STATE_DOWNLOADING,
+ /* Device is ready for data communication. */
+ PCHG_STATE_CONNECTED,
+ /* Put no more entry below */
+ PCHG_STATE_COUNT,
+};
+
+#define EC_PCHG_STATE_TEXT { \
+ [PCHG_STATE_RESET] = "RESET", \
+ [PCHG_STATE_INITIALIZED] = "INITIALIZED", \
+ [PCHG_STATE_ENABLED] = "ENABLED", \
+ [PCHG_STATE_DETECTED] = "DETECTED", \
+ [PCHG_STATE_CHARGING] = "CHARGING", \
+ [PCHG_STATE_FULL] = "FULL", \
+ [PCHG_STATE_DOWNLOAD] = "DOWNLOAD", \
+ [PCHG_STATE_DOWNLOADING] = "DOWNLOADING", \
+ [PCHG_STATE_CONNECTED] = "CONNECTED", \
+ }
+
/*****************************************************************************/
/* Voltage regulator controls */
/* Board specific platform_data */
struct mtk_chip_config {
u32 sample_sel;
+ u32 tick_delay;
};
#endif
return timerqueue_add(head, &ctmr->node);
}
-static inline void cpu_timer_dequeue(struct cpu_timer *ctmr)
+static inline bool cpu_timer_queued(struct cpu_timer *ctmr)
{
- if (ctmr->head) {
+ return !!ctmr->head;
+}
+
+static inline bool cpu_timer_dequeue(struct cpu_timer *ctmr)
+{
+ if (cpu_timer_queued(ctmr)) {
timerqueue_del(ctmr->head, &ctmr->node);
ctmr->head = NULL;
+ return true;
}
+ return false;
}
static inline u64 cpu_timer_getexpires(struct cpu_timer *ctmr)
MAX17042_RelaxCFG = 0x2A,
MAX17042_MiscCFG = 0x2B,
MAX17042_TGAIN = 0x2C,
- MAx17042_TOFF = 0x2D,
+ MAX17042_TOFF = 0x2D,
MAX17042_CGAIN = 0x2E,
MAX17042_COFF = 0x2F,
MAX17042_VFSOC = 0xFF,
};
+/* Registers specific to max17055 only */
enum max17055_register {
MAX17055_QRes = 0x0C,
+ MAX17055_RCell = 0x14,
MAX17055_TTF = 0x20,
- MAX17055_V_empty = 0x3A,
- MAX17055_TIMER = 0x3E,
+ MAX17055_DieTemp = 0x34,
MAX17055_USER_MEM = 0x40,
- MAX17055_RGAIN = 0x42,
+ MAX17055_RGAIN = 0x43,
MAX17055_ConvgCfg = 0x49,
MAX17055_VFRemCap = 0x4A,
MAX17055_AtAvCap = 0xDF,
};
-/* Registers specific to max17047/50 */
+/* Registers specific to max17047/50/55 */
enum max17047_register {
MAX17047_QRTbl00 = 0x12,
MAX17047_FullSOCThr = 0x13,
MAX17047_QRTbl10 = 0x22,
MAX17047_QRTbl20 = 0x32,
MAX17047_V_empty = 0x3A,
+ MAX17047_TIMER = 0x3E,
MAX17047_QRTbl30 = 0x42,
};
*/
struct power_supply_battery_info {
+ unsigned int technology; /* from the enum above */
int energy_full_design_uwh; /* microWatt-hours */
int charge_full_design_uah; /* microAmp-hours */
int voltage_min_design_uv; /* microVolts */
/*
* The preempt_count offset after spin_lock()
*/
+#if !defined(CONFIG_PREEMPT_RT)
#define PREEMPT_LOCK_OFFSET PREEMPT_DISABLE_OFFSET
+#else
+#define PREEMPT_LOCK_OFFSET 0
+#endif
/*
* The preempt_count offset needed for things like:
#ifndef _LINUX_RBTREE_H
#define _LINUX_RBTREE_H
+#include <linux/rbtree_types.h>
+
#include <linux/kernel.h>
#include <linux/stddef.h>
#include <linux/rcupdate.h>
-struct rb_node {
- unsigned long __rb_parent_color;
- struct rb_node *rb_right;
- struct rb_node *rb_left;
-} __attribute__((aligned(sizeof(long))));
- /* The alignment might seem pointless, but allegedly CRIS needs it */
-
-struct rb_root {
- struct rb_node *rb_node;
-};
-
#define rb_parent(r) ((struct rb_node *)((r)->__rb_parent_color & ~3))
-#define RB_ROOT (struct rb_root) { NULL, }
#define rb_entry(ptr, type, member) container_of(ptr, type, member)
#define RB_EMPTY_ROOT(root) (READ_ONCE((root)->rb_node) == NULL)
typeof(*pos), field); 1; }); \
pos = n)
-/*
- * Leftmost-cached rbtrees.
- *
- * We do not cache the rightmost node based on footprint
- * size vs number of potential users that could benefit
- * from O(1) rb_last(). Just not worth it, users that want
- * this feature can always implement the logic explicitly.
- * Furthermore, users that want to cache both pointers may
- * find it a bit asymmetric, but that's ok.
- */
-struct rb_root_cached {
- struct rb_root rb_root;
- struct rb_node *rb_leftmost;
-};
-
-#define RB_ROOT_CACHED (struct rb_root_cached) { {NULL, }, NULL }
-
/* Same as rb_first(), but O(1) */
#define rb_first_cached(root) (root)->rb_leftmost
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _LINUX_RBTREE_TYPES_H
+#define _LINUX_RBTREE_TYPES_H
+
+struct rb_node {
+ unsigned long __rb_parent_color;
+ struct rb_node *rb_right;
+ struct rb_node *rb_left;
+} __attribute__((aligned(sizeof(long))));
+/* The alignment might seem pointless, but allegedly CRIS needs it */
+
+struct rb_root {
+ struct rb_node *rb_node;
+};
+
+/*
+ * Leftmost-cached rbtrees.
+ *
+ * We do not cache the rightmost node based on footprint
+ * size vs number of potential users that could benefit
+ * from O(1) rb_last(). Just not worth it, users that want
+ * this feature can always implement the logic explicitly.
+ * Furthermore, users that want to cache both pointers may
+ * find it a bit asymmetric, but that's ok.
+ */
+struct rb_root_cached {
+ struct rb_root rb_root;
+ struct rb_node *rb_leftmost;
+};
+
+#define RB_ROOT (struct rb_root) { NULL, }
+#define RB_ROOT_CACHED (struct rb_root_cached) { {NULL, }, NULL }
+
+#endif
#include <linux/list.h>
#include <linux/rcupdate.h>
-/*
- * Why is there no list_empty_rcu()? Because list_empty() serves this
- * purpose. The list_empty() function fetches the RCU-protected pointer
- * and compares it to the address of the list head, but neither dereferences
- * this pointer itself nor provides this pointer to the caller. Therefore,
- * it is not necessary to use rcu_dereference(), so that list_empty() can
- * be used anywhere you would want to use a list_empty_rcu().
- */
-
/*
* INIT_LIST_HEAD_RCU - Initialize a list_head visible to RCU readers
* @list: list to be initialized
/*
* Where are list_empty_rcu() and list_first_entry_rcu()?
*
- * Implementing those functions following their counterparts list_empty() and
- * list_first_entry() is not advisable because they lead to subtle race
- * conditions as the following snippet shows:
+ * They do not exist because they would lead to subtle race conditions:
*
* if (!list_empty_rcu(mylist)) {
* struct foo *bar = list_first_entry_rcu(mylist, struct foo, list_member);
* do_something(bar);
* }
*
- * The list may not be empty when list_empty_rcu checks it, but it may be when
- * list_first_entry_rcu rereads the ->next pointer.
- *
- * Rereading the ->next pointer is not a problem for list_empty() and
- * list_first_entry() because they would be protected by a lock that blocks
- * writers.
+ * The list might be non-empty when list_empty_rcu() checks it, but it
+ * might have become empty by the time that list_first_entry_rcu() rereads
+ * the ->next pointer, which would result in a SEGV.
+ *
+ * When not using RCU, it is OK for list_first_entry() to re-read that
+ * pointer because both functions should be protected by some lock that
+ * blocks writers.
+ *
+ * When using RCU, list_empty() uses READ_ONCE() to fetch the
+ * RCU-protected ->next pointer and then compares it to the address of the
+ * list head. However, it neither dereferences this pointer nor provides
+ * this pointer to its caller. Thus, READ_ONCE() suffices (that is,
+ * rcu_dereference() is not needed), which means that list_empty() can be
+ * used anywhere you would want to use list_empty_rcu(). Just don't
+ * expect anything useful to happen if you do a subsequent lockless
+ * call to list_first_entry_rcu()!!!
*
* See list_first_or_null_rcu for an alternative.
*/
* nesting depth, but makes sense only if CONFIG_PREEMPT_RCU -- in other
* types of kernel builds, the rcu_read_lock() nesting depth is unknowable.
*/
-#define rcu_preempt_depth() (current->rcu_read_lock_nesting)
+#define rcu_preempt_depth() READ_ONCE(current->rcu_read_lock_nesting)
#else /* #ifdef CONFIG_PREEMPT_RCU */
# define synchronize_rcu_tasks synchronize_rcu
# endif
-# ifdef CONFIG_TASKS_RCU_TRACE
+# ifdef CONFIG_TASKS_TRACE_RCU
# define rcu_tasks_trace_qs(t) \
do { \
if (!likely(READ_ONCE((t)->trc_reader_checked)) && \
#include <asm/param.h> /* for HZ */
-/* Never flag non-existent other CPUs! */
-static inline bool rcu_eqs_special_set(int cpu) { return false; }
-
unsigned long get_state_synchronize_rcu(void);
unsigned long start_poll_synchronize_rcu(void);
bool poll_state_synchronize_rcu(unsigned long oldstate);
* @ranges: Array of configuration entries for virtual address ranges.
* @num_ranges: Number of range configuration entries.
* @use_hwlock: Indicate if a hardware spinlock should be used.
+ * @use_raw_spinlock: Indicate if a raw spinlock should be used.
* @hwlock_id: Specify the hardware spinlock id.
* @hwlock_mode: The hardware spinlock mode, should be HWLOCK_IRQSTATE,
* HWLOCK_IRQ or 0.
unsigned int num_ranges;
bool use_hwlock;
+ bool use_raw_spinlock;
unsigned int hwlock_id;
unsigned int hwlock_mode;
int regmap_field_bulk_alloc(struct regmap *regmap,
struct regmap_field **rm_field,
- struct reg_field *reg_field,
+ const struct reg_field *reg_field,
int num_fields);
void regmap_field_bulk_free(struct regmap_field *field);
int devm_regmap_field_bulk_alloc(struct device *dev, struct regmap *regmap,
struct regmap_field **field,
- struct reg_field *reg_field, int num_fields);
+ const struct reg_field *reg_field,
+ int num_fields);
void devm_regmap_field_bulk_free(struct device *dev,
struct regmap_field *field);
int devm_regulator_register_supply_alias(struct device *dev, const char *id,
struct device *alias_dev,
const char *alias_id);
-void devm_regulator_unregister_supply_alias(struct device *dev,
- const char *id);
int devm_regulator_bulk_register_supply_alias(struct device *dev,
const char *const *id,
struct device *alias_dev,
const char *const *alias_id,
int num_id);
-void devm_regulator_bulk_unregister_supply_alias(struct device *dev,
- const char *const *id,
- int num_id);
/* regulator output control and status */
int __must_check regulator_enable(struct regulator *regulator);
return 0;
}
-static inline void devm_regulator_unregister_supply_alias(struct device *dev,
- const char *id)
-{
-}
-
static inline int devm_regulator_bulk_register_supply_alias(struct device *dev,
const char *const *id,
struct device *alias_dev,
return 0;
}
-static inline void devm_regulator_bulk_unregister_supply_alias(
- struct device *dev, const char *const *id, int num_id)
-{
-}
-
static inline int regulator_enable(struct regulator *regulator)
{
return 0;
* @pull_down_val_on: Enabling value for control when using regmap
* set_pull_down
*
+ * @ramp_reg: Register for controlling the regulator ramp-rate.
+ * @ramp_mask: Bitmask for the ramp-rate control register.
+ * @ramp_delay_table: Table for mapping the regulator ramp-rate values. Values
+ * should be given in units of V/S (uV/uS). See the
+ * regulator_set_ramp_delay_regmap().
+ *
* @enable_time: Time taken for initial enable of regulator (in uS).
* @off_on_delay: guard time (in uS), before re-enabling a regulator
*
};
/**
- * struct regulator_irq_data - regulator error/notification status date
+ * struct regulator_irq_data - regulator error/notification status data
*
* @states: Status structs for each of the associated regulators.
* @num_states: Amount of associated regulators.
* active events as core does not clean the map data.
* REGULATOR_FAILED_RETRY can be returned to indicate that the
* status reading from IC failed. If this is repeated for
- * fatal_cnt times the core will call die() callback or BUG()
- * as a last resort to protect the HW.
+ * fatal_cnt times the core will call die() callback or power-off
+ * the system as a last resort to protect the HW.
* @renable: Optional callback to check status (if HW supports that) before
* re-enabling IRQ. If implemented this should clear the error
* flags so that errors fetched by regulator_get_error_flags()
* REGULATOR_FAILED_RETRY can be returned to
* indicate that the status reading from IC failed. If this is
* repeated for 'fatal_cnt' times the core will call die()
- * callback or BUG() as a last resort to protect the HW.
+ * callback or if die() is not populated then attempt to power-off
+ * the system as a last resort to protect the HW.
* Returning zero indicates that the problem in HW has been solved
* and IRQ will be re-enabled. Returning REGULATOR_ERROR_ON
* indicates the error condition is still active and keeps IRQ
const struct regulator_desc *regulator_desc,
const struct regulator_config *config);
void regulator_unregister(struct regulator_dev *rdev);
-void devm_regulator_unregister(struct device *dev, struct regulator_dev *rdev);
int regulator_notifier_call_chain(struct regulator_dev *rdev,
unsigned long event, void *data);
* @over_voltage_limits: Limits for acting on over voltage.
* @under_voltage_limits: Limits for acting on under voltage.
* @temp_limits: Limits for acting on over temperature.
-
+ *
* @max_spread: Max possible spread between coupled regulators
* @max_uV_step: Max possible step change in voltage
* @valid_modes_mask: Mask of modes which may be configured by consumers.
#ifndef _RESCTRL_H
#define _RESCTRL_H
+#include <linux/kernel.h>
+#include <linux/list.h>
#include <linux/pid.h>
#ifdef CONFIG_PROC_CPU_RESCTRL
#endif
+/**
+ * enum resctrl_conf_type - The type of configuration.
+ * @CDP_NONE: No prioritisation, both code and data are controlled or monitored.
+ * @CDP_CODE: Configuration applies to instruction fetches.
+ * @CDP_DATA: Configuration applies to reads and writes.
+ */
+enum resctrl_conf_type {
+ CDP_NONE,
+ CDP_CODE,
+ CDP_DATA,
+};
+
+#define CDP_NUM_TYPES (CDP_DATA + 1)
+
+/**
+ * struct resctrl_staged_config - parsed configuration to be applied
+ * @new_ctrl: new ctrl value to be loaded
+ * @have_new_ctrl: whether the user provided new_ctrl is valid
+ */
+struct resctrl_staged_config {
+ u32 new_ctrl;
+ bool have_new_ctrl;
+};
+
+/**
+ * struct rdt_domain - group of CPUs sharing a resctrl resource
+ * @list: all instances of this resource
+ * @id: unique id for this instance
+ * @cpu_mask: which CPUs share this resource
+ * @rmid_busy_llc: bitmap of which limbo RMIDs are above threshold
+ * @mbm_total: saved state for MBM total bandwidth
+ * @mbm_local: saved state for MBM local bandwidth
+ * @mbm_over: worker to periodically read MBM h/w counters
+ * @cqm_limbo: worker to periodically read CQM h/w counters
+ * @mbm_work_cpu: worker CPU for MBM h/w counters
+ * @cqm_work_cpu: worker CPU for CQM h/w counters
+ * @plr: pseudo-locked region (if any) associated with domain
+ * @staged_config: parsed configuration to be applied
+ */
+struct rdt_domain {
+ struct list_head list;
+ int id;
+ struct cpumask cpu_mask;
+ unsigned long *rmid_busy_llc;
+ struct mbm_state *mbm_total;
+ struct mbm_state *mbm_local;
+ struct delayed_work mbm_over;
+ struct delayed_work cqm_limbo;
+ int mbm_work_cpu;
+ int cqm_work_cpu;
+ struct pseudo_lock_region *plr;
+ struct resctrl_staged_config staged_config[CDP_NUM_TYPES];
+};
+
+/**
+ * struct resctrl_cache - Cache allocation related data
+ * @cbm_len: Length of the cache bit mask
+ * @min_cbm_bits: Minimum number of consecutive bits to be set
+ * @shareable_bits: Bitmask of shareable resource with other
+ * executing entities
+ * @arch_has_sparse_bitmaps: True if a bitmap like f00f is valid.
+ * @arch_has_empty_bitmaps: True if the '0' bitmap is valid.
+ * @arch_has_per_cpu_cfg: True if QOS_CFG register for this cache
+ * level has CPU scope.
+ */
+struct resctrl_cache {
+ unsigned int cbm_len;
+ unsigned int min_cbm_bits;
+ unsigned int shareable_bits;
+ bool arch_has_sparse_bitmaps;
+ bool arch_has_empty_bitmaps;
+ bool arch_has_per_cpu_cfg;
+};
+
+/**
+ * enum membw_throttle_mode - System's memory bandwidth throttling mode
+ * @THREAD_THROTTLE_UNDEFINED: Not relevant to the system
+ * @THREAD_THROTTLE_MAX: Memory bandwidth is throttled at the core
+ * always using smallest bandwidth percentage
+ * assigned to threads, aka "max throttling"
+ * @THREAD_THROTTLE_PER_THREAD: Memory bandwidth is throttled at the thread
+ */
+enum membw_throttle_mode {
+ THREAD_THROTTLE_UNDEFINED = 0,
+ THREAD_THROTTLE_MAX,
+ THREAD_THROTTLE_PER_THREAD,
+};
+
+/**
+ * struct resctrl_membw - Memory bandwidth allocation related data
+ * @min_bw: Minimum memory bandwidth percentage user can request
+ * @bw_gran: Granularity at which the memory bandwidth is allocated
+ * @delay_linear: True if memory B/W delay is in linear scale
+ * @arch_needs_linear: True if we can't configure non-linear resources
+ * @throttle_mode: Bandwidth throttling mode when threads request
+ * different memory bandwidths
+ * @mba_sc: True if MBA software controller(mba_sc) is enabled
+ * @mb_map: Mapping of memory B/W percentage to memory B/W delay
+ */
+struct resctrl_membw {
+ u32 min_bw;
+ u32 bw_gran;
+ u32 delay_linear;
+ bool arch_needs_linear;
+ enum membw_throttle_mode throttle_mode;
+ bool mba_sc;
+ u32 *mb_map;
+};
+
+struct rdt_parse_data;
+struct resctrl_schema;
+
+/**
+ * struct rdt_resource - attributes of a resctrl resource
+ * @rid: The index of the resource
+ * @alloc_enabled: Is allocation enabled on this machine
+ * @mon_enabled: Is monitoring enabled for this feature
+ * @alloc_capable: Is allocation available on this machine
+ * @mon_capable: Is monitor feature available on this machine
+ * @num_rmid: Number of RMIDs available
+ * @cache_level: Which cache level defines scope of this resource
+ * @cache: Cache allocation related data
+ * @membw: If the component has bandwidth controls, their properties.
+ * @domains: All domains for this resource
+ * @name: Name to use in "schemata" file.
+ * @data_width: Character width of data when displaying
+ * @default_ctrl: Specifies default cache cbm or memory B/W percent.
+ * @format_str: Per resource format string to show domain value
+ * @parse_ctrlval: Per resource function pointer to parse control values
+ * @evt_list: List of monitoring events
+ * @fflags: flags to choose base and info files
+ * @cdp_capable: Is the CDP feature available on this resource
+ */
+struct rdt_resource {
+ int rid;
+ bool alloc_enabled;
+ bool mon_enabled;
+ bool alloc_capable;
+ bool mon_capable;
+ int num_rmid;
+ int cache_level;
+ struct resctrl_cache cache;
+ struct resctrl_membw membw;
+ struct list_head domains;
+ char *name;
+ int data_width;
+ u32 default_ctrl;
+ const char *format_str;
+ int (*parse_ctrlval)(struct rdt_parse_data *data,
+ struct resctrl_schema *s,
+ struct rdt_domain *d);
+ struct list_head evt_list;
+ unsigned long fflags;
+ bool cdp_capable;
+};
+
+/**
+ * struct resctrl_schema - configuration abilities of a resource presented to
+ * user-space
+ * @list: Member of resctrl_schema_all.
+ * @name: The name to use in the "schemata" file.
+ * @conf_type: Whether this schema is specific to code/data.
+ * @res: The resource structure exported by the architecture to describe
+ * the hardware that is configured by this schema.
+ * @num_closid: The number of closid that can be used with this schema. When
+ * features like CDP are enabled, this will be lower than the
+ * hardware supports for the resource.
+ */
+struct resctrl_schema {
+ struct list_head list;
+ char name[8];
+ enum resctrl_conf_type conf_type;
+ struct rdt_resource *res;
+ u32 num_closid;
+};
+
+/* The number of closid supported by this resource regardless of CDP */
+u32 resctrl_arch_get_num_closid(struct rdt_resource *r);
+int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid);
+u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
+ u32 closid, enum resctrl_conf_type type);
+
#endif /* _RESCTRL_H */
#ifndef __LINUX_RT_MUTEX_H
#define __LINUX_RT_MUTEX_H
+#include <linux/compiler.h>
#include <linux/linkage.h>
-#include <linux/rbtree.h>
-#include <linux/spinlock_types.h>
+#include <linux/rbtree_types.h>
+#include <linux/spinlock_types_raw.h>
extern int max_lock_depth; /* for sysctl */
+struct rt_mutex_base {
+ raw_spinlock_t wait_lock;
+ struct rb_root_cached waiters;
+ struct task_struct *owner;
+};
+
+#define __RT_MUTEX_BASE_INITIALIZER(rtbasename) \
+{ \
+ .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(rtbasename.wait_lock), \
+ .waiters = RB_ROOT_CACHED, \
+ .owner = NULL \
+}
+
+/**
+ * rt_mutex_base_is_locked - is the rtmutex locked
+ * @lock: the mutex to be queried
+ *
+ * Returns true if the mutex is locked, false if unlocked.
+ */
+static inline bool rt_mutex_base_is_locked(struct rt_mutex_base *lock)
+{
+ return READ_ONCE(lock->owner) != NULL;
+}
+
+extern void rt_mutex_base_init(struct rt_mutex_base *rtb);
+
/**
* The rt_mutex structure
*
* @owner: the mutex owner
*/
struct rt_mutex {
- raw_spinlock_t wait_lock;
- struct rb_root_cached waiters;
- struct task_struct *owner;
+ struct rt_mutex_base rtmutex;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
#endif
} while (0)
#ifdef CONFIG_DEBUG_LOCK_ALLOC
-#define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname) \
- , .dep_map = { .name = #mutexname }
+#define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname) \
+ .dep_map = { \
+ .name = #mutexname, \
+ .wait_type_inner = LD_WAIT_SLEEP, \
+ }
#else
#define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)
#endif
-#define __RT_MUTEX_INITIALIZER(mutexname) \
- { .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(mutexname.wait_lock) \
- , .waiters = RB_ROOT_CACHED \
- , .owner = NULL \
- __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)}
+#define __RT_MUTEX_INITIALIZER(mutexname) \
+{ \
+ .rtmutex = __RT_MUTEX_BASE_INITIALIZER(mutexname.rtmutex), \
+ __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname) \
+}
#define DEFINE_RT_MUTEX(mutexname) \
struct rt_mutex mutexname = __RT_MUTEX_INITIALIZER(mutexname)
-/**
- * rt_mutex_is_locked - is the mutex locked
- * @lock: the mutex to be queried
- *
- * Returns 1 if the mutex is locked, 0 if unlocked.
- */
-static inline int rt_mutex_is_locked(struct rt_mutex *lock)
-{
- return lock->owner != NULL;
-}
-
extern void __rt_mutex_init(struct rt_mutex *lock, const char *name, struct lock_class_key *key);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+#ifndef _LINUX_RWBASE_RT_H
+#define _LINUX_RWBASE_RT_H
+
+#include <linux/rtmutex.h>
+#include <linux/atomic.h>
+
+#define READER_BIAS (1U << 31)
+#define WRITER_BIAS (1U << 30)
+
+struct rwbase_rt {
+ atomic_t readers;
+ struct rt_mutex_base rtmutex;
+};
+
+#define __RWBASE_INITIALIZER(name) \
+{ \
+ .readers = ATOMIC_INIT(READER_BIAS), \
+ .rtmutex = __RT_MUTEX_BASE_INITIALIZER(name.rtmutex), \
+}
+
+#define init_rwbase_rt(rwbase) \
+ do { \
+ rt_mutex_base_init(&(rwbase)->rtmutex); \
+ atomic_set(&(rwbase)->readers, READER_BIAS); \
+ } while (0)
+
+
+static __always_inline bool rw_base_is_locked(struct rwbase_rt *rwb)
+{
+ return atomic_read(&rwb->readers) != READER_BIAS;
+}
+
+static __always_inline bool rw_base_is_contended(struct rwbase_rt *rwb)
+{
+ return atomic_read(&rwb->readers) > 0;
+}
+
+#endif /* _LINUX_RWBASE_RT_H */
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+#ifndef __LINUX_RWLOCK_RT_H
+#define __LINUX_RWLOCK_RT_H
+
+#ifndef __LINUX_SPINLOCK_RT_H
+#error Do not #include directly. Use <linux/spinlock.h>.
+#endif
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+extern void __rt_rwlock_init(rwlock_t *rwlock, const char *name,
+ struct lock_class_key *key);
+#else
+static inline void __rt_rwlock_init(rwlock_t *rwlock, char *name,
+ struct lock_class_key *key)
+{
+}
+#endif
+
+#define rwlock_init(rwl) \
+do { \
+ static struct lock_class_key __key; \
+ \
+ init_rwbase_rt(&(rwl)->rwbase); \
+ __rt_rwlock_init(rwl, #rwl, &__key); \
+} while (0)
+
+extern void rt_read_lock(rwlock_t *rwlock);
+extern int rt_read_trylock(rwlock_t *rwlock);
+extern void rt_read_unlock(rwlock_t *rwlock);
+extern void rt_write_lock(rwlock_t *rwlock);
+extern int rt_write_trylock(rwlock_t *rwlock);
+extern void rt_write_unlock(rwlock_t *rwlock);
+
+static __always_inline void read_lock(rwlock_t *rwlock)
+{
+ rt_read_lock(rwlock);
+}
+
+static __always_inline void read_lock_bh(rwlock_t *rwlock)
+{
+ local_bh_disable();
+ rt_read_lock(rwlock);
+}
+
+static __always_inline void read_lock_irq(rwlock_t *rwlock)
+{
+ rt_read_lock(rwlock);
+}
+
+#define read_lock_irqsave(lock, flags) \
+ do { \
+ typecheck(unsigned long, flags); \
+ rt_read_lock(lock); \
+ flags = 0; \
+ } while (0)
+
+#define read_trylock(lock) __cond_lock(lock, rt_read_trylock(lock))
+
+static __always_inline void read_unlock(rwlock_t *rwlock)
+{
+ rt_read_unlock(rwlock);
+}
+
+static __always_inline void read_unlock_bh(rwlock_t *rwlock)
+{
+ rt_read_unlock(rwlock);
+ local_bh_enable();
+}
+
+static __always_inline void read_unlock_irq(rwlock_t *rwlock)
+{
+ rt_read_unlock(rwlock);
+}
+
+static __always_inline void read_unlock_irqrestore(rwlock_t *rwlock,
+ unsigned long flags)
+{
+ rt_read_unlock(rwlock);
+}
+
+static __always_inline void write_lock(rwlock_t *rwlock)
+{
+ rt_write_lock(rwlock);
+}
+
+static __always_inline void write_lock_bh(rwlock_t *rwlock)
+{
+ local_bh_disable();
+ rt_write_lock(rwlock);
+}
+
+static __always_inline void write_lock_irq(rwlock_t *rwlock)
+{
+ rt_write_lock(rwlock);
+}
+
+#define write_lock_irqsave(lock, flags) \
+ do { \
+ typecheck(unsigned long, flags); \
+ rt_write_lock(lock); \
+ flags = 0; \
+ } while (0)
+
+#define write_trylock(lock) __cond_lock(lock, rt_write_trylock(lock))
+
+#define write_trylock_irqsave(lock, flags) \
+({ \
+ int __locked; \
+ \
+ typecheck(unsigned long, flags); \
+ flags = 0; \
+ __locked = write_trylock(lock); \
+ __locked; \
+})
+
+static __always_inline void write_unlock(rwlock_t *rwlock)
+{
+ rt_write_unlock(rwlock);
+}
+
+static __always_inline void write_unlock_bh(rwlock_t *rwlock)
+{
+ rt_write_unlock(rwlock);
+ local_bh_enable();
+}
+
+static __always_inline void write_unlock_irq(rwlock_t *rwlock)
+{
+ rt_write_unlock(rwlock);
+}
+
+static __always_inline void write_unlock_irqrestore(rwlock_t *rwlock,
+ unsigned long flags)
+{
+ rt_write_unlock(rwlock);
+}
+
+#define rwlock_is_contended(lock) (((void)(lock), 0))
+
+#endif /* __LINUX_RWLOCK_RT_H */
#ifndef __LINUX_RWLOCK_TYPES_H
#define __LINUX_RWLOCK_TYPES_H
+#if !defined(__LINUX_SPINLOCK_TYPES_H)
+# error "Do not include directly, include spinlock_types.h"
+#endif
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+# define RW_DEP_MAP_INIT(lockname) \
+ .dep_map = { \
+ .name = #lockname, \
+ .wait_type_inner = LD_WAIT_CONFIG, \
+ }
+#else
+# define RW_DEP_MAP_INIT(lockname)
+#endif
+
+#ifndef CONFIG_PREEMPT_RT
/*
- * include/linux/rwlock_types.h - generic rwlock type definitions
- * and initializers
+ * generic rwlock type definitions and initializers
*
* portions Copyright 2005, Red Hat, Inc., Ingo Molnar
* Released under the General Public License (GPL).
#define RWLOCK_MAGIC 0xdeaf1eed
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define RW_DEP_MAP_INIT(lockname) \
- .dep_map = { \
- .name = #lockname, \
- .wait_type_inner = LD_WAIT_CONFIG, \
- }
-#else
-# define RW_DEP_MAP_INIT(lockname)
-#endif
-
#ifdef CONFIG_DEBUG_SPINLOCK
#define __RW_LOCK_UNLOCKED(lockname) \
(rwlock_t) { .raw_lock = __ARCH_RW_LOCK_UNLOCKED, \
#define DEFINE_RWLOCK(x) rwlock_t x = __RW_LOCK_UNLOCKED(x)
+#else /* !CONFIG_PREEMPT_RT */
+
+#include <linux/rwbase_rt.h>
+
+typedef struct {
+ struct rwbase_rt rwbase;
+ atomic_t readers;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
+} rwlock_t;
+
+#define __RWLOCK_RT_INITIALIZER(name) \
+{ \
+ .rwbase = __RWBASE_INITIALIZER(name), \
+ RW_DEP_MAP_INIT(name) \
+}
+
+#define __RW_LOCK_UNLOCKED(name) __RWLOCK_RT_INITIALIZER(name)
+
+#define DEFINE_RWLOCK(name) \
+ rwlock_t name = __RW_LOCK_UNLOCKED(name)
+
+#endif /* CONFIG_PREEMPT_RT */
+
#endif /* __LINUX_RWLOCK_TYPES_H */
#include <linux/spinlock.h>
#include <linux/atomic.h>
#include <linux/err.h>
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+# define __RWSEM_DEP_MAP_INIT(lockname) \
+ .dep_map = { \
+ .name = #lockname, \
+ .wait_type_inner = LD_WAIT_SLEEP, \
+ },
+#else
+# define __RWSEM_DEP_MAP_INIT(lockname)
+#endif
+
+#ifndef CONFIG_PREEMPT_RT
+
#ifdef CONFIG_RWSEM_SPIN_ON_OWNER
#include <linux/osq_lock.h>
#endif
/* Common initializer macros and functions */
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __RWSEM_DEP_MAP_INIT(lockname) \
- .dep_map = { \
- .name = #lockname, \
- .wait_type_inner = LD_WAIT_SLEEP, \
- },
-#else
-# define __RWSEM_DEP_MAP_INIT(lockname)
-#endif
-
#ifdef CONFIG_DEBUG_RWSEMS
# define __RWSEM_DEBUG_INIT(lockname) .magic = &lockname,
#else
return !list_empty(&sem->wait_list);
}
+#else /* !CONFIG_PREEMPT_RT */
+
+#include <linux/rwbase_rt.h>
+
+struct rw_semaphore {
+ struct rwbase_rt rwbase;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
+};
+
+#define __RWSEM_INITIALIZER(name) \
+ { \
+ .rwbase = __RWBASE_INITIALIZER(name), \
+ __RWSEM_DEP_MAP_INIT(name) \
+ }
+
+#define DECLARE_RWSEM(lockname) \
+ struct rw_semaphore lockname = __RWSEM_INITIALIZER(lockname)
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+extern void __rwsem_init(struct rw_semaphore *rwsem, const char *name,
+ struct lock_class_key *key);
+#else
+static inline void __rwsem_init(struct rw_semaphore *rwsem, const char *name,
+ struct lock_class_key *key)
+{
+}
+#endif
+
+#define init_rwsem(sem) \
+do { \
+ static struct lock_class_key __key; \
+ \
+ init_rwbase_rt(&(sem)->rwbase); \
+ __rwsem_init((sem), #sem, &__key); \
+} while (0)
+
+static __always_inline int rwsem_is_locked(struct rw_semaphore *sem)
+{
+ return rw_base_is_locked(&sem->rwbase);
+}
+
+static __always_inline int rwsem_is_contended(struct rw_semaphore *sem)
+{
+ return rw_base_is_contended(&sem->rwbase);
+}
+
+#endif /* CONFIG_PREEMPT_RT */
+
+/*
+ * The functions below are the same for all rwsem implementations including
+ * the RT specific variant.
+ */
+
/*
* lock for reading
*/
#define TASK_WAKING 0x0200
#define TASK_NOLOAD 0x0400
#define TASK_NEW 0x0800
-#define TASK_STATE_MAX 0x1000
+/* RT specific auxilliary flag to mark RT lock waiters */
+#define TASK_RTLOCK_WAIT 0x1000
+#define TASK_STATE_MAX 0x2000
/* Convenience macros for the sake of set_current_state: */
#define TASK_KILLABLE (TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
#define task_is_stopped_or_traced(task) ((READ_ONCE(task->__state) & (__TASK_STOPPED | __TASK_TRACED)) != 0)
-#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
-
/*
* Special states are those that do not use the normal wait-loop pattern. See
* the comment with set_special_state().
#define is_special_task_state(state) \
((state) & (__TASK_STOPPED | __TASK_TRACED | TASK_PARKED | TASK_DEAD))
-#define __set_current_state(state_value) \
- do { \
- WARN_ON_ONCE(is_special_task_state(state_value));\
- current->task_state_change = _THIS_IP_; \
- WRITE_ONCE(current->__state, (state_value)); \
- } while (0)
-
-#define set_current_state(state_value) \
- do { \
- WARN_ON_ONCE(is_special_task_state(state_value));\
- current->task_state_change = _THIS_IP_; \
- smp_store_mb(current->__state, (state_value)); \
+#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+# define debug_normal_state_change(state_value) \
+ do { \
+ WARN_ON_ONCE(is_special_task_state(state_value)); \
+ current->task_state_change = _THIS_IP_; \
} while (0)
-#define set_special_state(state_value) \
+# define debug_special_state_change(state_value) \
do { \
- unsigned long flags; /* may shadow */ \
WARN_ON_ONCE(!is_special_task_state(state_value)); \
- raw_spin_lock_irqsave(¤t->pi_lock, flags); \
current->task_state_change = _THIS_IP_; \
- WRITE_ONCE(current->__state, (state_value)); \
- raw_spin_unlock_irqrestore(¤t->pi_lock, flags); \
} while (0)
+
+# define debug_rtlock_wait_set_state() \
+ do { \
+ current->saved_state_change = current->task_state_change;\
+ current->task_state_change = _THIS_IP_; \
+ } while (0)
+
+# define debug_rtlock_wait_restore_state() \
+ do { \
+ current->task_state_change = current->saved_state_change;\
+ } while (0)
+
#else
+# define debug_normal_state_change(cond) do { } while (0)
+# define debug_special_state_change(cond) do { } while (0)
+# define debug_rtlock_wait_set_state() do { } while (0)
+# define debug_rtlock_wait_restore_state() do { } while (0)
+#endif
+
/*
* set_current_state() includes a barrier so that the write of current->state
* is correctly serialised wrt the caller's subsequent test of whether to
* Also see the comments of try_to_wake_up().
*/
#define __set_current_state(state_value) \
- WRITE_ONCE(current->__state, (state_value))
+ do { \
+ debug_normal_state_change((state_value)); \
+ WRITE_ONCE(current->__state, (state_value)); \
+ } while (0)
#define set_current_state(state_value) \
- smp_store_mb(current->__state, (state_value))
+ do { \
+ debug_normal_state_change((state_value)); \
+ smp_store_mb(current->__state, (state_value)); \
+ } while (0)
/*
* set_special_state() should be used for those states when the blocking task
* can not use the regular condition based wait-loop. In that case we must
- * serialize against wakeups such that any possible in-flight TASK_RUNNING stores
- * will not collide with our state change.
+ * serialize against wakeups such that any possible in-flight TASK_RUNNING
+ * stores will not collide with our state change.
*/
#define set_special_state(state_value) \
do { \
unsigned long flags; /* may shadow */ \
+ \
raw_spin_lock_irqsave(¤t->pi_lock, flags); \
+ debug_special_state_change((state_value)); \
WRITE_ONCE(current->__state, (state_value)); \
raw_spin_unlock_irqrestore(¤t->pi_lock, flags); \
} while (0)
-#endif
+/*
+ * PREEMPT_RT specific variants for "sleeping" spin/rwlocks
+ *
+ * RT's spin/rwlock substitutions are state preserving. The state of the
+ * task when blocking on the lock is saved in task_struct::saved_state and
+ * restored after the lock has been acquired. These operations are
+ * serialized by task_struct::pi_lock against try_to_wake_up(). Any non RT
+ * lock related wakeups while the task is blocked on the lock are
+ * redirected to operate on task_struct::saved_state to ensure that these
+ * are not dropped. On restore task_struct::saved_state is set to
+ * TASK_RUNNING so any wakeup attempt redirected to saved_state will fail.
+ *
+ * The lock operation looks like this:
+ *
+ * current_save_and_set_rtlock_wait_state();
+ * for (;;) {
+ * if (try_lock())
+ * break;
+ * raw_spin_unlock_irq(&lock->wait_lock);
+ * schedule_rtlock();
+ * raw_spin_lock_irq(&lock->wait_lock);
+ * set_current_state(TASK_RTLOCK_WAIT);
+ * }
+ * current_restore_rtlock_saved_state();
+ */
+#define current_save_and_set_rtlock_wait_state() \
+ do { \
+ lockdep_assert_irqs_disabled(); \
+ raw_spin_lock(¤t->pi_lock); \
+ current->saved_state = current->__state; \
+ debug_rtlock_wait_set_state(); \
+ WRITE_ONCE(current->__state, TASK_RTLOCK_WAIT); \
+ raw_spin_unlock(¤t->pi_lock); \
+ } while (0);
+
+#define current_restore_rtlock_saved_state() \
+ do { \
+ lockdep_assert_irqs_disabled(); \
+ raw_spin_lock(¤t->pi_lock); \
+ debug_rtlock_wait_restore_state(); \
+ WRITE_ONCE(current->__state, current->saved_state); \
+ current->saved_state = TASK_RUNNING; \
+ raw_spin_unlock(¤t->pi_lock); \
+ } while (0);
#define get_current_state() READ_ONCE(current->__state)
asmlinkage void schedule(void);
extern void schedule_preempt_disabled(void);
asmlinkage void preempt_schedule_irq(void);
+#ifdef CONFIG_PREEMPT_RT
+ extern void schedule_rtlock(void);
+#endif
extern int __must_check io_schedule_prepare(void);
extern void io_schedule_finish(int token);
#endif
unsigned int __state;
+#ifdef CONFIG_PREEMPT_RT
+ /* saved state for "spinlock sleepers" */
+ unsigned int saved_state;
+#endif
+
/*
* This begins the randomizable portion of task_struct. Only
* scheduling-critical items should be added above here.
unsigned int policy;
int nr_cpus_allowed;
const cpumask_t *cpus_ptr;
+ cpumask_t *user_cpus_ptr;
cpumask_t cpus_mask;
void *migration_pending;
#ifdef CONFIG_SMP
/* Used by page_owner=on to detect recursion in page tracking. */
unsigned in_page_owner:1;
#endif
+#ifdef CONFIG_EVENTFD
+ /* Recursion prevention for eventfd_signal() */
+ unsigned in_eventfd_signal:1;
+#endif
unsigned long atomic_flags; /* Flags requiring atomic access. */
struct kmap_ctrl kmap_ctrl;
#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
unsigned long task_state_change;
+# ifdef CONFIG_PREEMPT_RT
+ unsigned long saved_state_change;
+# endif
#endif
int pagefault_disabled;
#ifdef CONFIG_MMU
struct llist_head kretprobe_instances;
#endif
+#ifdef CONFIG_ARCH_HAS_PARANOID_L1D_FLUSH
+ /*
+ * If L1D flush is supported on mm context switch
+ * then we use this callback head to queue kill work
+ * to kill tasks that are not running on SMT disabled
+ * cores
+ */
+ struct callback_head l1d_flush_kill;
+#endif
+
/*
* New fields for task_struct should be added above here, so that
* they are included in the randomized portion of task_struct.
#ifdef CONFIG_SMP
extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask);
extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask);
+extern int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src, int node);
+extern void release_user_cpus_ptr(struct task_struct *p);
+extern int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask);
+extern void force_compatible_cpus_allowed_ptr(struct task_struct *p);
+extern void relax_compatible_cpus_allowed_ptr(struct task_struct *p);
#else
static inline void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
{
return -EINVAL;
return 0;
}
+static inline int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src, int node)
+{
+ if (src->user_cpus_ptr)
+ return -EINVAL;
+ return 0;
+}
+static inline void release_user_cpus_ptr(struct task_struct *p)
+{
+ WARN_ON(p->user_cpus_ptr);
+}
+
+static inline int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
+{
+ return 0;
+}
#endif
extern int yield_to(struct task_struct *p, bool preempt);
spin_unlock_irqrestore(&task->sighand->siglock, *flags);
}
+#ifdef CONFIG_LOCKDEP
+extern void lockdep_assert_task_sighand_held(struct task_struct *task);
+#else
+static inline void lockdep_assert_task_sighand_held(struct task_struct *task) { }
+#endif
+
static inline unsigned long task_rlimit(const struct task_struct *task,
unsigned int limit)
{
extern unsigned int sysctl_sched_child_runs_first;
-extern unsigned int sysctl_sched_latency;
-extern unsigned int sysctl_sched_min_granularity;
-extern unsigned int sysctl_sched_wakeup_granularity;
-
enum sched_tunable_scaling {
SCHED_TUNABLESCALING_NONE,
SCHED_TUNABLESCALING_LOG,
SCHED_TUNABLESCALING_LINEAR,
SCHED_TUNABLESCALING_END,
};
-extern unsigned int sysctl_sched_tunable_scaling;
-
-extern unsigned int sysctl_numa_balancing_scan_delay;
-extern unsigned int sysctl_numa_balancing_scan_period_min;
-extern unsigned int sysctl_numa_balancing_scan_period_max;
-extern unsigned int sysctl_numa_balancing_scan_size;
-
-#ifdef CONFIG_SCHED_DEBUG
-extern __read_mostly unsigned int sysctl_sched_migration_cost;
-extern __read_mostly unsigned int sysctl_sched_nr_migrate;
-
-extern int sysctl_resched_latency_warn_ms;
-extern int sysctl_resched_latency_warn_once;
-#endif
/*
* control realtime throttling:
#define WAKE_Q_TAIL ((struct wake_q_node *) 0x01)
-#define DEFINE_WAKE_Q(name) \
- struct wake_q_head name = { WAKE_Q_TAIL, &name.first }
+#define WAKE_Q_HEAD_INITIALIZER(name) \
+ { WAKE_Q_TAIL, &name.first }
+
+#define DEFINE_WAKE_Q(name) \
+ struct wake_q_head name = WAKE_Q_HEAD_INITIALIZER(name)
static inline void wake_q_init(struct wake_q_head *head)
{
struct sockaddr __user *upeer_sockaddr,
int __user *upeer_addrlen, int flags,
unsigned long nofile);
+extern struct file *do_accept(struct file *file, unsigned file_flags,
+ struct sockaddr __user *upeer_sockaddr,
+ int __user *upeer_addrlen, int flags);
extern int __sys_accept4(int fd, struct sockaddr __user *upeer_sockaddr,
int __user *upeer_addrlen, int flags);
extern int __sys_socket(int family, int type, int protocol);
* not using a GPIO line)
* @word_delay: delay to be inserted between consecutive
* words of a transfer
- *
+ * @cs_setup: delay to be introduced by the controller after CS is asserted
+ * @cs_hold: delay to be introduced by the controller before CS is deasserted
+ * @cs_inactive: delay to be introduced by the controller after CS is
+ * deasserted. If @cs_change_delay is used from @spi_transfer, then the
+ * two delays will be added up.
* @statistics: statistics for the spi_device
*
* A @spi_device is used to interchange data between an SPI slave
int cs_gpio; /* LEGACY: chip select gpio */
struct gpio_desc *cs_gpiod; /* chip select gpio desc */
struct spi_delay word_delay; /* inter-word delay */
+ /* CS delays */
+ struct spi_delay cs_setup;
+ struct spi_delay cs_hold;
+ struct spi_delay cs_inactive;
/* the statistics */
struct spi_statistics statistics;
* @max_speed_hz: Highest supported transfer speed
* @flags: other constraints relevant to this driver
* @slave: indicates that this is an SPI slave controller
+ * @devm_allocated: whether the allocation of this struct is devres-managed
* @max_transfer_size: function that returns the max transfer size for
* a &spi_device; may be %NULL, so the default %SIZE_MAX will be used.
* @max_message_size: function that returns the max message size for
* controller has native support for memory like operations.
* @unprepare_message: undo any work done by prepare_message().
* @slave_abort: abort the ongoing transfer request on an SPI slave controller
- * @cs_setup: delay to be introduced by the controller after CS is asserted
- * @cs_hold: delay to be introduced by the controller before CS is deasserted
- * @cs_inactive: delay to be introduced by the controller after CS is
- * deasserted. If @cs_change_delay is used from @spi_transfer, then the
- * two delays will be added up.
* @cs_gpios: LEGACY: array of GPIO descs to use as chip select lines; one per
* CS number. Any individual value may be -ENOENT for CS lines that
* are not GPIOs (driven by the SPI controller itself). Use the cs_gpiods
#define SPI_MASTER_GPIO_SS BIT(5) /* GPIO CS must select slave */
- /* flag indicating this is a non-devres managed controller */
+ /* flag indicating if the allocation of this struct is devres-managed */
bool devm_allocated;
/* flag indicating this is an SPI slave controller */
* to configure specific CS timing through spi_set_cs_timing() after
* spi_setup().
*/
- int (*set_cs_timing)(struct spi_device *spi, struct spi_delay *setup,
- struct spi_delay *hold, struct spi_delay *inactive);
+ int (*set_cs_timing)(struct spi_device *spi);
/* bidirectional bulk transfers
*
/* Optimized handlers for SPI memory-like operations. */
const struct spi_controller_mem_ops *mem_ops;
- /* CS delays */
- struct spi_delay cs_setup;
- struct spi_delay cs_hold;
- struct spi_delay cs_inactive;
-
/* gpio chip select */
int *cs_gpios;
struct gpio_desc **cs_gpiods;
* asm/spinlock_types.h: contains the arch_spinlock_t/arch_rwlock_t and the
* initializers
*
+ * linux/spinlock_types_raw:
+ * The raw types and initializers
* linux/spinlock_types.h:
* defines the generic type and initializers
*
* contains the generic, simplified UP spinlock type.
* (which is an empty structure on non-debug builds)
*
+ * linux/spinlock_types_raw:
+ * The raw RT types and initializers
* linux/spinlock_types.h:
* defines the generic type and initializers
*
1 : ({ local_irq_restore(flags); 0; }); \
})
-/* Include rwlock functions */
+#ifndef CONFIG_PREEMPT_RT
+/* Include rwlock functions for !RT */
#include <linux/rwlock.h>
+#endif
/*
* Pull the _spin_*()/_read_*()/_write_*() functions/declarations:
# include <linux/spinlock_api_up.h>
#endif
+/* Non PREEMPT_RT kernel, map to raw spinlocks: */
+#ifndef CONFIG_PREEMPT_RT
+
/*
* Map the spin_lock functions to the raw variants for PREEMPT_RT=n
*/
#define assert_spin_locked(lock) assert_raw_spin_locked(&(lock)->rlock)
+#else /* !CONFIG_PREEMPT_RT */
+# include <linux/spinlock_rt.h>
+#endif /* CONFIG_PREEMPT_RT */
+
/*
* Pull the atomic_t declaration:
* (asm-mips/atomic.h needs above definitions)
return 0;
}
+/* PREEMPT_RT has its own rwlock implementation */
+#ifndef CONFIG_PREEMPT_RT
#include <linux/rwlock_api_smp.h>
+#endif
#endif /* __LINUX_SPINLOCK_API_SMP_H */
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+#ifndef __LINUX_SPINLOCK_RT_H
+#define __LINUX_SPINLOCK_RT_H
+
+#ifndef __LINUX_SPINLOCK_H
+#error Do not include directly. Use spinlock.h
+#endif
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+extern void __rt_spin_lock_init(spinlock_t *lock, const char *name,
+ struct lock_class_key *key, bool percpu);
+#else
+static inline void __rt_spin_lock_init(spinlock_t *lock, const char *name,
+ struct lock_class_key *key, bool percpu)
+{
+}
+#endif
+
+#define spin_lock_init(slock) \
+do { \
+ static struct lock_class_key __key; \
+ \
+ rt_mutex_base_init(&(slock)->lock); \
+ __rt_spin_lock_init(slock, #slock, &__key, false); \
+} while (0)
+
+#define local_spin_lock_init(slock) \
+do { \
+ static struct lock_class_key __key; \
+ \
+ rt_mutex_base_init(&(slock)->lock); \
+ __rt_spin_lock_init(slock, #slock, &__key, true); \
+} while (0)
+
+extern void rt_spin_lock(spinlock_t *lock);
+extern void rt_spin_lock_nested(spinlock_t *lock, int subclass);
+extern void rt_spin_lock_nest_lock(spinlock_t *lock, struct lockdep_map *nest_lock);
+extern void rt_spin_unlock(spinlock_t *lock);
+extern void rt_spin_lock_unlock(spinlock_t *lock);
+extern int rt_spin_trylock_bh(spinlock_t *lock);
+extern int rt_spin_trylock(spinlock_t *lock);
+
+static __always_inline void spin_lock(spinlock_t *lock)
+{
+ rt_spin_lock(lock);
+}
+
+#ifdef CONFIG_LOCKDEP
+# define __spin_lock_nested(lock, subclass) \
+ rt_spin_lock_nested(lock, subclass)
+
+# define __spin_lock_nest_lock(lock, nest_lock) \
+ do { \
+ typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
+ rt_spin_lock_nest_lock(lock, &(nest_lock)->dep_map); \
+ } while (0)
+# define __spin_lock_irqsave_nested(lock, flags, subclass) \
+ do { \
+ typecheck(unsigned long, flags); \
+ flags = 0; \
+ __spin_lock_nested(lock, subclass); \
+ } while (0)
+
+#else
+ /*
+ * Always evaluate the 'subclass' argument to avoid that the compiler
+ * warns about set-but-not-used variables when building with
+ * CONFIG_DEBUG_LOCK_ALLOC=n and with W=1.
+ */
+# define __spin_lock_nested(lock, subclass) spin_lock(((void)(subclass), (lock)))
+# define __spin_lock_nest_lock(lock, subclass) spin_lock(((void)(subclass), (lock)))
+# define __spin_lock_irqsave_nested(lock, flags, subclass) \
+ spin_lock_irqsave(((void)(subclass), (lock)), flags)
+#endif
+
+#define spin_lock_nested(lock, subclass) \
+ __spin_lock_nested(lock, subclass)
+
+#define spin_lock_nest_lock(lock, nest_lock) \
+ __spin_lock_nest_lock(lock, nest_lock)
+
+#define spin_lock_irqsave_nested(lock, flags, subclass) \
+ __spin_lock_irqsave_nested(lock, flags, subclass)
+
+static __always_inline void spin_lock_bh(spinlock_t *lock)
+{
+ /* Investigate: Drop bh when blocking ? */
+ local_bh_disable();
+ rt_spin_lock(lock);
+}
+
+static __always_inline void spin_lock_irq(spinlock_t *lock)
+{
+ rt_spin_lock(lock);
+}
+
+#define spin_lock_irqsave(lock, flags) \
+ do { \
+ typecheck(unsigned long, flags); \
+ flags = 0; \
+ spin_lock(lock); \
+ } while (0)
+
+static __always_inline void spin_unlock(spinlock_t *lock)
+{
+ rt_spin_unlock(lock);
+}
+
+static __always_inline void spin_unlock_bh(spinlock_t *lock)
+{
+ rt_spin_unlock(lock);
+ local_bh_enable();
+}
+
+static __always_inline void spin_unlock_irq(spinlock_t *lock)
+{
+ rt_spin_unlock(lock);
+}
+
+static __always_inline void spin_unlock_irqrestore(spinlock_t *lock,
+ unsigned long flags)
+{
+ rt_spin_unlock(lock);
+}
+
+#define spin_trylock(lock) \
+ __cond_lock(lock, rt_spin_trylock(lock))
+
+#define spin_trylock_bh(lock) \
+ __cond_lock(lock, rt_spin_trylock_bh(lock))
+
+#define spin_trylock_irq(lock) \
+ __cond_lock(lock, rt_spin_trylock(lock))
+
+#define __spin_trylock_irqsave(lock, flags) \
+({ \
+ int __locked; \
+ \
+ typecheck(unsigned long, flags); \
+ flags = 0; \
+ __locked = spin_trylock(lock); \
+ __locked; \
+})
+
+#define spin_trylock_irqsave(lock, flags) \
+ __cond_lock(lock, __spin_trylock_irqsave(lock, flags))
+
+#define spin_is_contended(lock) (((void)(lock), 0))
+
+static inline int spin_is_locked(spinlock_t *lock)
+{
+ return rt_mutex_base_is_locked(&lock->lock);
+}
+
+#define assert_spin_locked(lock) BUG_ON(!spin_is_locked(lock))
+
+#include <linux/rwlock_rt.h>
+
+#endif
* Released under the General Public License (GPL).
*/
-#if defined(CONFIG_SMP)
-# include <asm/spinlock_types.h>
-#else
-# include <linux/spinlock_types_up.h>
-#endif
-
-#include <linux/lockdep_types.h>
+#include <linux/spinlock_types_raw.h>
-typedef struct raw_spinlock {
- arch_spinlock_t raw_lock;
-#ifdef CONFIG_DEBUG_SPINLOCK
- unsigned int magic, owner_cpu;
- void *owner;
-#endif
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
- struct lockdep_map dep_map;
-#endif
-} raw_spinlock_t;
-
-#define SPINLOCK_MAGIC 0xdead4ead
-
-#define SPINLOCK_OWNER_INIT ((void *)-1L)
-
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define RAW_SPIN_DEP_MAP_INIT(lockname) \
- .dep_map = { \
- .name = #lockname, \
- .wait_type_inner = LD_WAIT_SPIN, \
- }
-# define SPIN_DEP_MAP_INIT(lockname) \
- .dep_map = { \
- .name = #lockname, \
- .wait_type_inner = LD_WAIT_CONFIG, \
- }
-#else
-# define RAW_SPIN_DEP_MAP_INIT(lockname)
-# define SPIN_DEP_MAP_INIT(lockname)
-#endif
-
-#ifdef CONFIG_DEBUG_SPINLOCK
-# define SPIN_DEBUG_INIT(lockname) \
- .magic = SPINLOCK_MAGIC, \
- .owner_cpu = -1, \
- .owner = SPINLOCK_OWNER_INIT,
-#else
-# define SPIN_DEBUG_INIT(lockname)
-#endif
-
-#define __RAW_SPIN_LOCK_INITIALIZER(lockname) \
- { \
- .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
- SPIN_DEBUG_INIT(lockname) \
- RAW_SPIN_DEP_MAP_INIT(lockname) }
-
-#define __RAW_SPIN_LOCK_UNLOCKED(lockname) \
- (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
-
-#define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
+#ifndef CONFIG_PREEMPT_RT
+/* Non PREEMPT_RT kernels map spinlock to raw_spinlock */
typedef struct spinlock {
union {
struct raw_spinlock rlock;
#define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
+#else /* !CONFIG_PREEMPT_RT */
+
+/* PREEMPT_RT kernels map spinlock to rt_mutex */
+#include <linux/rtmutex.h>
+
+typedef struct spinlock {
+ struct rt_mutex_base lock;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
+} spinlock_t;
+
+#define __SPIN_LOCK_UNLOCKED(name) \
+ { \
+ .lock = __RT_MUTEX_BASE_INITIALIZER(name.lock), \
+ SPIN_DEP_MAP_INIT(name) \
+ }
+
+#define __LOCAL_SPIN_LOCK_UNLOCKED(name) \
+ { \
+ .lock = __RT_MUTEX_BASE_INITIALIZER(name.lock), \
+ LOCAL_SPIN_DEP_MAP_INIT(name) \
+ }
+
+#define DEFINE_SPINLOCK(name) \
+ spinlock_t name = __SPIN_LOCK_UNLOCKED(name)
+
+#endif /* CONFIG_PREEMPT_RT */
+
#include <linux/rwlock_types.h>
#endif /* __LINUX_SPINLOCK_TYPES_H */
--- /dev/null
+#ifndef __LINUX_SPINLOCK_TYPES_RAW_H
+#define __LINUX_SPINLOCK_TYPES_RAW_H
+
+#include <linux/types.h>
+
+#if defined(CONFIG_SMP)
+# include <asm/spinlock_types.h>
+#else
+# include <linux/spinlock_types_up.h>
+#endif
+
+#include <linux/lockdep_types.h>
+
+typedef struct raw_spinlock {
+ arch_spinlock_t raw_lock;
+#ifdef CONFIG_DEBUG_SPINLOCK
+ unsigned int magic, owner_cpu;
+ void *owner;
+#endif
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+#endif
+} raw_spinlock_t;
+
+#define SPINLOCK_MAGIC 0xdead4ead
+
+#define SPINLOCK_OWNER_INIT ((void *)-1L)
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+# define RAW_SPIN_DEP_MAP_INIT(lockname) \
+ .dep_map = { \
+ .name = #lockname, \
+ .wait_type_inner = LD_WAIT_SPIN, \
+ }
+# define SPIN_DEP_MAP_INIT(lockname) \
+ .dep_map = { \
+ .name = #lockname, \
+ .wait_type_inner = LD_WAIT_CONFIG, \
+ }
+
+# define LOCAL_SPIN_DEP_MAP_INIT(lockname) \
+ .dep_map = { \
+ .name = #lockname, \
+ .wait_type_inner = LD_WAIT_CONFIG, \
+ .lock_type = LD_LOCK_PERCPU, \
+ }
+#else
+# define RAW_SPIN_DEP_MAP_INIT(lockname)
+# define SPIN_DEP_MAP_INIT(lockname)
+# define LOCAL_SPIN_DEP_MAP_INIT(lockname)
+#endif
+
+#ifdef CONFIG_DEBUG_SPINLOCK
+# define SPIN_DEBUG_INIT(lockname) \
+ .magic = SPINLOCK_MAGIC, \
+ .owner_cpu = -1, \
+ .owner = SPINLOCK_OWNER_INIT,
+#else
+# define SPIN_DEBUG_INIT(lockname)
+#endif
+
+#define __RAW_SPIN_LOCK_INITIALIZER(lockname) \
+{ \
+ .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED, \
+ SPIN_DEBUG_INIT(lockname) \
+ RAW_SPIN_DEP_MAP_INIT(lockname) }
+
+#define __RAW_SPIN_LOCK_UNLOCKED(lockname) \
+ (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
+
+#define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
+
+#endif /* __LINUX_SPINLOCK_TYPES_RAW_H */
int idx;
idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
- WRITE_ONCE(ssp->srcu_lock_nesting[idx], ssp->srcu_lock_nesting[idx] + 1);
+ WRITE_ONCE(ssp->srcu_lock_nesting[idx], READ_ONCE(ssp->srcu_lock_nesting[idx]) + 1);
return idx;
}
{
int idx;
- idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
+ idx = ((data_race(READ_ONCE(ssp->srcu_idx)) + 1) & 0x2) >> 1;
pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%hd,%hd)\n",
tt, tf, idx,
- READ_ONCE(ssp->srcu_lock_nesting[!idx]),
- READ_ONCE(ssp->srcu_lock_nesting[idx]));
+ data_race(READ_ONCE(ssp->srcu_lock_nesting[!idx])),
+ data_race(READ_ONCE(ssp->srcu_lock_nesting[idx])));
}
#endif
* DECLARE_STATIC_CALL(name, func);
* DEFINE_STATIC_CALL(name, func);
* DEFINE_STATIC_CALL_NULL(name, typename);
+ * DEFINE_STATIC_CALL_RET0(name, typename);
+ *
+ * __static_call_return0;
+ *
* static_call(name)(args...);
* static_call_cond(name)(args...);
* static_call_update(name, func);
* static_call_query(name);
*
+ * EXPORT_STATIC_CALL{,_TRAMP}{,_GPL}()
+ *
* Usage example:
*
* # Start with the following functions (with identical prototypes):
* To query which function is currently set to be called, use:
*
* func = static_call_query(name);
+ *
+ *
+ * DEFINE_STATIC_CALL_RET0 / __static_call_return0:
+ *
+ * Just like how DEFINE_STATIC_CALL_NULL() / static_call_cond() optimize the
+ * conditional void function call, DEFINE_STATIC_CALL_RET0 /
+ * __static_call_return0 optimize the do nothing return 0 function.
+ *
+ * This feature is strictly UB per the C standard (since it casts a function
+ * pointer to a different signature) and relies on the architecture ABI to
+ * make things work. In particular it relies on Caller Stack-cleanup and the
+ * whole return register being clobbered for short return values. All normal
+ * CDECL style ABIs conform.
+ *
+ * In particular the x86_64 implementation replaces the 5 byte CALL
+ * instruction at the callsite with a 5 byte clear of the RAX register,
+ * completely eliding any function call overhead.
+ *
+ * Notably argument setup is unconditional.
+ *
+ *
+ * EXPORT_STATIC_CALL() vs EXPORT_STATIC_CALL_TRAMP():
+ *
+ * The difference is that the _TRAMP variant tries to only export the
+ * trampoline with the result that a module can use static_call{,_cond}() but
+ * not static_call_update().
+ *
*/
#include <linux/types.h>
#define __WAIT_QUEUE_HEAD_INITIALIZER(name) { \
.lock = __SPIN_LOCK_UNLOCKED(name.lock), \
- .head = { &(name).head, &(name).head } }
+ .head = LIST_HEAD_INIT(name.head) }
#define DECLARE_WAIT_QUEUE_HEAD(name) \
struct wait_queue_head name = __WAIT_QUEUE_HEAD_INITIALIZER(name)
/*
* mm/page-writeback.c
*/
-#ifdef CONFIG_BLOCK
void laptop_io_completion(struct backing_dev_info *info);
void laptop_sync_completion(void);
-void laptop_mode_sync(struct work_struct *work);
void laptop_mode_timer_fn(struct timer_list *t);
-#else
-static inline void laptop_sync_completion(void) { }
-#endif
bool node_dirty_ok(struct pglist_data *pgdat);
int wb_domain_init(struct wb_domain *dom, gfp_t gfp);
#ifdef CONFIG_CGROUP_WRITEBACK
#define __LINUX_WW_MUTEX_H
#include <linux/mutex.h>
+#include <linux/rtmutex.h>
+
+#if defined(CONFIG_DEBUG_MUTEXES) || \
+ (defined(CONFIG_PREEMPT_RT) && defined(CONFIG_DEBUG_RT_MUTEXES))
+#define DEBUG_WW_MUTEXES
+#endif
+
+#ifndef CONFIG_PREEMPT_RT
+#define WW_MUTEX_BASE mutex
+#define ww_mutex_base_init(l,n,k) __mutex_init(l,n,k)
+#define ww_mutex_base_trylock(l) mutex_trylock(l)
+#define ww_mutex_base_is_locked(b) mutex_is_locked((b))
+#else
+#define WW_MUTEX_BASE rt_mutex
+#define ww_mutex_base_init(l,n,k) __rt_mutex_init(l,n,k)
+#define ww_mutex_base_trylock(l) rt_mutex_trylock(l)
+#define ww_mutex_base_is_locked(b) rt_mutex_base_is_locked(&(b)->rtmutex)
+#endif
struct ww_class {
atomic_long_t stamp;
unsigned int is_wait_die;
};
+struct ww_mutex {
+ struct WW_MUTEX_BASE base;
+ struct ww_acquire_ctx *ctx;
+#ifdef DEBUG_WW_MUTEXES
+ struct ww_class *ww_class;
+#endif
+};
+
struct ww_acquire_ctx {
struct task_struct *task;
unsigned long stamp;
unsigned int acquired;
unsigned short wounded;
unsigned short is_wait_die;
-#ifdef CONFIG_DEBUG_MUTEXES
+#ifdef DEBUG_WW_MUTEXES
unsigned int done_acquire;
struct ww_class *ww_class;
- struct ww_mutex *contending_lock;
+ void *contending_lock;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map dep_map;
static inline void ww_mutex_init(struct ww_mutex *lock,
struct ww_class *ww_class)
{
- __mutex_init(&lock->base, ww_class->mutex_name, &ww_class->mutex_key);
+ ww_mutex_base_init(&lock->base, ww_class->mutex_name, &ww_class->mutex_key);
lock->ctx = NULL;
-#ifdef CONFIG_DEBUG_MUTEXES
+#ifdef DEBUG_WW_MUTEXES
lock->ww_class = ww_class;
#endif
}
ctx->acquired = 0;
ctx->wounded = false;
ctx->is_wait_die = ww_class->is_wait_die;
-#ifdef CONFIG_DEBUG_MUTEXES
+#ifdef DEBUG_WW_MUTEXES
ctx->ww_class = ww_class;
ctx->done_acquire = 0;
ctx->contending_lock = NULL;
*/
static inline void ww_acquire_done(struct ww_acquire_ctx *ctx)
{
-#ifdef CONFIG_DEBUG_MUTEXES
+#ifdef DEBUG_WW_MUTEXES
lockdep_assert_held(ctx);
DEBUG_LOCKS_WARN_ON(ctx->done_acquire);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
mutex_release(&ctx->dep_map, _THIS_IP_);
#endif
-#ifdef CONFIG_DEBUG_MUTEXES
+#ifdef DEBUG_WW_MUTEXES
DEBUG_LOCKS_WARN_ON(ctx->acquired);
if (!IS_ENABLED(CONFIG_PROVE_LOCKING))
/*
ww_mutex_lock_slow(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
{
int ret;
-#ifdef CONFIG_DEBUG_MUTEXES
+#ifdef DEBUG_WW_MUTEXES
DEBUG_LOCKS_WARN_ON(!ctx->contending_lock);
#endif
ret = ww_mutex_lock(lock, ctx);
ww_mutex_lock_slow_interruptible(struct ww_mutex *lock,
struct ww_acquire_ctx *ctx)
{
-#ifdef CONFIG_DEBUG_MUTEXES
+#ifdef DEBUG_WW_MUTEXES
DEBUG_LOCKS_WARN_ON(!ctx->contending_lock);
#endif
return ww_mutex_lock_interruptible(lock, ctx);
*/
static inline int __must_check ww_mutex_trylock(struct ww_mutex *lock)
{
- return mutex_trylock(&lock->base);
+ return ww_mutex_base_trylock(&lock->base);
}
/***
*/
static inline void ww_mutex_destroy(struct ww_mutex *lock)
{
+#ifndef CONFIG_PREEMPT_RT
mutex_destroy(&lock->base);
+#endif
}
/**
*/
static inline bool ww_mutex_is_locked(struct ww_mutex *lock)
{
- return mutex_is_locked(&lock->base);
+ return ww_mutex_base_is_locked(&lock->base);
}
#endif
return false;
}
-/* Function to safely get fn->sernum for passed in rt
+/* Function to safely get fn->fn_sernum for passed in rt
* and store result in passed in cookie.
* Return true if we can get cookie safely
* Return false if not
if (fn) {
*cookie = fn->fn_sernum;
- /* pairs with smp_wmb() in fib6_update_sernum_upto_root() */
+ /* pairs with smp_wmb() in __fib6_update_sernum_upto_root() */
smp_rmb();
status = true;
}
),
TP_fast_assign(
- __entry->dev = disk_devt(queue_to_disk(q));
+ __entry->dev = disk_devt(q->disk);
strlcpy(__entry->domain, domain, sizeof(__entry->domain));
strlcpy(__entry->type, type, sizeof(__entry->type));
__entry->percentile = percentile;
),
TP_fast_assign(
- __entry->dev = disk_devt(queue_to_disk(q));
+ __entry->dev = disk_devt(q->disk);
strlcpy(__entry->domain, domain, sizeof(__entry->domain));
__entry->depth = depth;
),
),
TP_fast_assign(
- __entry->dev = disk_devt(queue_to_disk(q));
+ __entry->dev = disk_devt(q->disk);
strlcpy(__entry->domain, domain, sizeof(__entry->domain));
),
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+ WITH Linux-syscall-note */
+/*
+ * audio.h - DEPRECATED MPEG-TS audio decoder API
+ *
+ * NOTE: should not be used on future drivers
+ *
+ * Copyright (C) 2000 Ralph Metzler <ralph@convergence.de>
+ * & Marcus Metzler <marcus@convergence.de>
+ * for convergence integrated media GmbH
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Lesser Public License
+ * as published by the Free Software Foundation; either version 2.1
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+
+#ifndef _DVBAUDIO_H_
+#define _DVBAUDIO_H_
+
+#include <linux/types.h>
+
+typedef enum {
+ AUDIO_SOURCE_DEMUX, /* Select the demux as the main source */
+ AUDIO_SOURCE_MEMORY /* Select internal memory as the main source */
+} audio_stream_source_t;
+
+
+typedef enum {
+ AUDIO_STOPPED, /* Device is stopped */
+ AUDIO_PLAYING, /* Device is currently playing */
+ AUDIO_PAUSED /* Device is paused */
+} audio_play_state_t;
+
+
+typedef enum {
+ AUDIO_STEREO,
+ AUDIO_MONO_LEFT,
+ AUDIO_MONO_RIGHT,
+ AUDIO_MONO,
+ AUDIO_STEREO_SWAPPED
+} audio_channel_select_t;
+
+
+typedef struct audio_mixer {
+ unsigned int volume_left;
+ unsigned int volume_right;
+ /* what else do we need? bass, pass-through, ... */
+} audio_mixer_t;
+
+
+typedef struct audio_status {
+ int AV_sync_state; /* sync audio and video? */
+ int mute_state; /* audio is muted */
+ audio_play_state_t play_state; /* current playback state */
+ audio_stream_source_t stream_source; /* current stream source */
+ audio_channel_select_t channel_select; /* currently selected channel */
+ int bypass_mode; /* pass on audio data to */
+ audio_mixer_t mixer_state; /* current mixer state */
+} audio_status_t; /* separate decoder hardware */
+
+
+/* for GET_CAPABILITIES and SET_FORMAT, the latter should only set one bit */
+#define AUDIO_CAP_DTS 1
+#define AUDIO_CAP_LPCM 2
+#define AUDIO_CAP_MP1 4
+#define AUDIO_CAP_MP2 8
+#define AUDIO_CAP_MP3 16
+#define AUDIO_CAP_AAC 32
+#define AUDIO_CAP_OGG 64
+#define AUDIO_CAP_SDDS 128
+#define AUDIO_CAP_AC3 256
+
+#define AUDIO_STOP _IO('o', 1)
+#define AUDIO_PLAY _IO('o', 2)
+#define AUDIO_PAUSE _IO('o', 3)
+#define AUDIO_CONTINUE _IO('o', 4)
+#define AUDIO_SELECT_SOURCE _IO('o', 5)
+#define AUDIO_SET_MUTE _IO('o', 6)
+#define AUDIO_SET_AV_SYNC _IO('o', 7)
+#define AUDIO_SET_BYPASS_MODE _IO('o', 8)
+#define AUDIO_CHANNEL_SELECT _IO('o', 9)
+#define AUDIO_GET_STATUS _IOR('o', 10, audio_status_t)
+
+#define AUDIO_GET_CAPABILITIES _IOR('o', 11, unsigned int)
+#define AUDIO_CLEAR_BUFFER _IO('o', 12)
+#define AUDIO_SET_ID _IO('o', 13)
+#define AUDIO_SET_MIXER _IOW('o', 14, audio_mixer_t)
+#define AUDIO_SET_STREAMTYPE _IO('o', 15)
+#define AUDIO_BILINGUAL_CHANNEL_SELECT _IO('o', 20)
+
+#endif /* _DVBAUDIO_H_ */
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+ WITH Linux-syscall-note */
+/*
+ * osd.h - DEPRECATED On Screen Display API
+ *
+ * NOTE: should not be used on future drivers
+ *
+ * Copyright (C) 2001 Ralph Metzler <ralph@convergence.de>
+ * & Marcus Metzler <marcus@convergence.de>
+ * for convergence integrated media GmbH
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Lesser Public License
+ * as published by the Free Software Foundation; either version 2.1
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+
+#ifndef _DVBOSD_H_
+#define _DVBOSD_H_
+
+#include <linux/compiler.h>
+
+typedef enum {
+ /* All functions return -2 on "not open" */
+ OSD_Close = 1, /* () */
+ /*
+ * Disables OSD and releases the buffers
+ * returns 0 on success
+ */
+ OSD_Open, /* (x0,y0,x1,y1,BitPerPixel[2/4/8](color&0x0F),mix[0..15](color&0xF0)) */
+ /*
+ * Opens OSD with this size and bit depth
+ * returns 0 on success, -1 on DRAM allocation error, -2 on "already open"
+ */
+ OSD_Show, /* () */
+ /*
+ * enables OSD mode
+ * returns 0 on success
+ */
+ OSD_Hide, /* () */
+ /*
+ * disables OSD mode
+ * returns 0 on success
+ */
+ OSD_Clear, /* () */
+ /*
+ * Sets all pixel to color 0
+ * returns 0 on success
+ */
+ OSD_Fill, /* (color) */
+ /*
+ * Sets all pixel to color <col>
+ * returns 0 on success
+ */
+ OSD_SetColor, /* (color,R{x0},G{y0},B{x1},opacity{y1}) */
+ /*
+ * set palette entry <num> to <r,g,b>, <mix> and <trans> apply
+ * R,G,B: 0..255
+ * R=Red, G=Green, B=Blue
+ * opacity=0: pixel opacity 0% (only video pixel shows)
+ * opacity=1..254: pixel opacity as specified in header
+ * opacity=255: pixel opacity 100% (only OSD pixel shows)
+ * returns 0 on success, -1 on error
+ */
+ OSD_SetPalette, /* (firstcolor{color},lastcolor{x0},data) */
+ /*
+ * Set a number of entries in the palette
+ * sets the entries "firstcolor" through "lastcolor" from the array "data"
+ * data has 4 byte for each color:
+ * R,G,B, and a opacity value: 0->transparent, 1..254->mix, 255->pixel
+ */
+ OSD_SetTrans, /* (transparency{color}) */
+ /*
+ * Sets transparency of mixed pixel (0..15)
+ * returns 0 on success
+ */
+ OSD_SetPixel, /* (x0,y0,color) */
+ /*
+ * sets pixel <x>,<y> to color number <col>
+ * returns 0 on success, -1 on error
+ */
+ OSD_GetPixel, /* (x0,y0) */
+ /* returns color number of pixel <x>,<y>, or -1 */
+ OSD_SetRow, /* (x0,y0,x1,data) */
+ /*
+ * fills pixels x0,y through x1,y with the content of data[]
+ * returns 0 on success, -1 on clipping all pixel (no pixel drawn)
+ */
+ OSD_SetBlock, /* (x0,y0,x1,y1,increment{color},data) */
+ /*
+ * fills pixels x0,y0 through x1,y1 with the content of data[]
+ * inc contains the width of one line in the data block,
+ * inc<=0 uses blockwidth as linewidth
+ * returns 0 on success, -1 on clipping all pixel
+ */
+ OSD_FillRow, /* (x0,y0,x1,color) */
+ /*
+ * fills pixels x0,y through x1,y with the color <col>
+ * returns 0 on success, -1 on clipping all pixel
+ */
+ OSD_FillBlock, /* (x0,y0,x1,y1,color) */
+ /*
+ * fills pixels x0,y0 through x1,y1 with the color <col>
+ * returns 0 on success, -1 on clipping all pixel
+ */
+ OSD_Line, /* (x0,y0,x1,y1,color) */
+ /*
+ * draw a line from x0,y0 to x1,y1 with the color <col>
+ * returns 0 on success
+ */
+ OSD_Query, /* (x0,y0,x1,y1,xasp{color}}), yasp=11 */
+ /*
+ * fills parameters with the picture dimensions and the pixel aspect ratio
+ * returns 0 on success
+ */
+ OSD_Test, /* () */
+ /*
+ * draws a test picture. for debugging purposes only
+ * returns 0 on success
+ * TODO: remove "test" in final version
+ */
+ OSD_Text, /* (x0,y0,size,color,text) */
+ OSD_SetWindow, /* (x0) set window with number 0<x0<8 as current */
+ OSD_MoveWindow, /* move current window to (x0, y0) */
+ OSD_OpenRaw, /* Open other types of OSD windows */
+} OSD_Command;
+
+typedef struct osd_cmd_s {
+ OSD_Command cmd;
+ int x0;
+ int y0;
+ int x1;
+ int y1;
+ int color;
+ void __user *data;
+} osd_cmd_t;
+
+/* OSD_OpenRaw: set 'color' to desired window type */
+typedef enum {
+ OSD_BITMAP1, /* 1 bit bitmap */
+ OSD_BITMAP2, /* 2 bit bitmap */
+ OSD_BITMAP4, /* 4 bit bitmap */
+ OSD_BITMAP8, /* 8 bit bitmap */
+ OSD_BITMAP1HR, /* 1 Bit bitmap half resolution */
+ OSD_BITMAP2HR, /* 2 bit bitmap half resolution */
+ OSD_BITMAP4HR, /* 4 bit bitmap half resolution */
+ OSD_BITMAP8HR, /* 8 bit bitmap half resolution */
+ OSD_YCRCB422, /* 4:2:2 YCRCB Graphic Display */
+ OSD_YCRCB444, /* 4:4:4 YCRCB Graphic Display */
+ OSD_YCRCB444HR, /* 4:4:4 YCRCB graphic half resolution */
+ OSD_VIDEOTSIZE, /* True Size Normal MPEG Video Display */
+ OSD_VIDEOHSIZE, /* MPEG Video Display Half Resolution */
+ OSD_VIDEOQSIZE, /* MPEG Video Display Quarter Resolution */
+ OSD_VIDEODSIZE, /* MPEG Video Display Double Resolution */
+ OSD_VIDEOTHSIZE, /* True Size MPEG Video Display Half Resolution */
+ OSD_VIDEOTQSIZE, /* True Size MPEG Video Display Quarter Resolution*/
+ OSD_VIDEOTDSIZE, /* True Size MPEG Video Display Double Resolution */
+ OSD_VIDEONSIZE, /* Full Size MPEG Video Display */
+ OSD_CURSOR /* Cursor */
+} osd_raw_window_t;
+
+typedef struct osd_cap_s {
+ int cmd;
+#define OSD_CAP_MEMSIZE 1 /* memory size */
+ long val;
+} osd_cap_t;
+
+
+#define OSD_SEND_CMD _IOW('o', 160, osd_cmd_t)
+#define OSD_GET_CAPABILITY _IOR('o', 161, osd_cap_t)
+
+#endif
--- /dev/null
+/* SPDX-License-Identifier: LGPL-2.1+ WITH Linux-syscall-note */
+/*
+ * video.h - DEPRECATED MPEG-TS video decoder API
+ *
+ * NOTE: should not be used on future drivers
+ *
+ * Copyright (C) 2000 Marcus Metzler <marcus@convergence.de>
+ * & Ralph Metzler <ralph@convergence.de>
+ * for convergence integrated media GmbH
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ */
+
+#ifndef _UAPI_DVBVIDEO_H_
+#define _UAPI_DVBVIDEO_H_
+
+#include <linux/types.h>
+#ifndef __KERNEL__
+#include <time.h>
+#endif
+
+typedef enum {
+ VIDEO_FORMAT_4_3, /* Select 4:3 format */
+ VIDEO_FORMAT_16_9, /* Select 16:9 format. */
+ VIDEO_FORMAT_221_1 /* 2.21:1 */
+} video_format_t;
+
+
+typedef enum {
+ VIDEO_PAN_SCAN, /* use pan and scan format */
+ VIDEO_LETTER_BOX, /* use letterbox format */
+ VIDEO_CENTER_CUT_OUT /* use center cut out format */
+} video_displayformat_t;
+
+typedef struct {
+ int w;
+ int h;
+ video_format_t aspect_ratio;
+} video_size_t;
+
+typedef enum {
+ VIDEO_SOURCE_DEMUX, /* Select the demux as the main source */
+ VIDEO_SOURCE_MEMORY /* If this source is selected, the stream
+ comes from the user through the write
+ system call */
+} video_stream_source_t;
+
+
+typedef enum {
+ VIDEO_STOPPED, /* Video is stopped */
+ VIDEO_PLAYING, /* Video is currently playing */
+ VIDEO_FREEZED /* Video is freezed */
+} video_play_state_t;
+
+
+/* Decoder commands */
+#define VIDEO_CMD_PLAY (0)
+#define VIDEO_CMD_STOP (1)
+#define VIDEO_CMD_FREEZE (2)
+#define VIDEO_CMD_CONTINUE (3)
+
+/* Flags for VIDEO_CMD_FREEZE */
+#define VIDEO_CMD_FREEZE_TO_BLACK (1 << 0)
+
+/* Flags for VIDEO_CMD_STOP */
+#define VIDEO_CMD_STOP_TO_BLACK (1 << 0)
+#define VIDEO_CMD_STOP_IMMEDIATELY (1 << 1)
+
+/* Play input formats: */
+/* The decoder has no special format requirements */
+#define VIDEO_PLAY_FMT_NONE (0)
+/* The decoder requires full GOPs */
+#define VIDEO_PLAY_FMT_GOP (1)
+
+/* The structure must be zeroed before use by the application
+ This ensures it can be extended safely in the future. */
+struct video_command {
+ __u32 cmd;
+ __u32 flags;
+ union {
+ struct {
+ __u64 pts;
+ } stop;
+
+ struct {
+ /* 0 or 1000 specifies normal speed,
+ 1 specifies forward single stepping,
+ -1 specifies backward single stepping,
+ >1: playback at speed/1000 of the normal speed,
+ <-1: reverse playback at (-speed/1000) of the normal speed. */
+ __s32 speed;
+ __u32 format;
+ } play;
+
+ struct {
+ __u32 data[16];
+ } raw;
+ };
+};
+
+/* FIELD_UNKNOWN can be used if the hardware does not know whether
+ the Vsync is for an odd, even or progressive (i.e. non-interlaced)
+ field. */
+#define VIDEO_VSYNC_FIELD_UNKNOWN (0)
+#define VIDEO_VSYNC_FIELD_ODD (1)
+#define VIDEO_VSYNC_FIELD_EVEN (2)
+#define VIDEO_VSYNC_FIELD_PROGRESSIVE (3)
+
+struct video_event {
+ __s32 type;
+#define VIDEO_EVENT_SIZE_CHANGED 1
+#define VIDEO_EVENT_FRAME_RATE_CHANGED 2
+#define VIDEO_EVENT_DECODER_STOPPED 3
+#define VIDEO_EVENT_VSYNC 4
+ /* unused, make sure to use atomic time for y2038 if it ever gets used */
+ long timestamp;
+ union {
+ video_size_t size;
+ unsigned int frame_rate; /* in frames per 1000sec */
+ unsigned char vsync_field; /* unknown/odd/even/progressive */
+ } u;
+};
+
+
+struct video_status {
+ int video_blank; /* blank video on freeze? */
+ video_play_state_t play_state; /* current state of playback */
+ video_stream_source_t stream_source; /* current source (demux/memory) */
+ video_format_t video_format; /* current aspect ratio of stream*/
+ video_displayformat_t display_format;/* selected cropping mode */
+};
+
+
+struct video_still_picture {
+ char __user *iFrame; /* pointer to a single iframe in memory */
+ __s32 size;
+};
+
+
+typedef __u16 video_attributes_t;
+/* bits: descr. */
+/* 15-14 Video compression mode (0=MPEG-1, 1=MPEG-2) */
+/* 13-12 TV system (0=525/60, 1=625/50) */
+/* 11-10 Aspect ratio (0=4:3, 3=16:9) */
+/* 9- 8 permitted display mode on 4:3 monitor (0=both, 1=only pan-sca */
+/* 7 line 21-1 data present in GOP (1=yes, 0=no) */
+/* 6 line 21-2 data present in GOP (1=yes, 0=no) */
+/* 5- 3 source resolution (0=720x480/576, 1=704x480/576, 2=352x480/57 */
+/* 2 source letterboxed (1=yes, 0=no) */
+/* 0 film/camera mode (0=
+ *camera, 1=film (625/50 only)) */
+
+
+/* bit definitions for capabilities: */
+/* can the hardware decode MPEG1 and/or MPEG2? */
+#define VIDEO_CAP_MPEG1 1
+#define VIDEO_CAP_MPEG2 2
+/* can you send a system and/or program stream to video device?
+ (you still have to open the video and the audio device but only
+ send the stream to the video device) */
+#define VIDEO_CAP_SYS 4
+#define VIDEO_CAP_PROG 8
+/* can the driver also handle SPU, NAVI and CSS encoded data?
+ (CSS API is not present yet) */
+#define VIDEO_CAP_SPU 16
+#define VIDEO_CAP_NAVI 32
+#define VIDEO_CAP_CSS 64
+
+
+#define VIDEO_STOP _IO('o', 21)
+#define VIDEO_PLAY _IO('o', 22)
+#define VIDEO_FREEZE _IO('o', 23)
+#define VIDEO_CONTINUE _IO('o', 24)
+#define VIDEO_SELECT_SOURCE _IO('o', 25)
+#define VIDEO_SET_BLANK _IO('o', 26)
+#define VIDEO_GET_STATUS _IOR('o', 27, struct video_status)
+#define VIDEO_GET_EVENT _IOR('o', 28, struct video_event)
+#define VIDEO_SET_DISPLAY_FORMAT _IO('o', 29)
+#define VIDEO_STILLPICTURE _IOW('o', 30, struct video_still_picture)
+#define VIDEO_FAST_FORWARD _IO('o', 31)
+#define VIDEO_SLOWMOTION _IO('o', 32)
+#define VIDEO_GET_CAPABILITIES _IOR('o', 33, unsigned int)
+#define VIDEO_CLEAR_BUFFER _IO('o', 34)
+#define VIDEO_SET_STREAMTYPE _IO('o', 36)
+#define VIDEO_SET_FORMAT _IO('o', 37)
+#define VIDEO_GET_SIZE _IOR('o', 55, video_size_t)
+
+/**
+ * VIDEO_GET_PTS
+ *
+ * Read the 33 bit presentation time stamp as defined
+ * in ITU T-REC-H.222.0 / ISO/IEC 13818-1.
+ *
+ * The PTS should belong to the currently played
+ * frame if possible, but may also be a value close to it
+ * like the PTS of the last decoded frame or the last PTS
+ * extracted by the PES parser.
+ */
+#define VIDEO_GET_PTS _IOR('o', 57, __u64)
+
+/* Read the number of displayed frames since the decoder was started */
+#define VIDEO_GET_FRAME_COUNT _IOR('o', 58, __u64)
+
+#define VIDEO_COMMAND _IOWR('o', 59, struct video_command)
+#define VIDEO_TRY_COMMAND _IOWR('o', 60, struct video_command)
+
+#endif /* _UAPI_DVBVIDEO_H_ */
#define FAN_ENABLE_AUDIT 0x00000040
/* Flags to determine fanotify event format */
+#define FAN_REPORT_PIDFD 0x00000080 /* Report pidfd for event->pid */
#define FAN_REPORT_TID 0x00000100 /* event->pid is thread id */
#define FAN_REPORT_FID 0x00000200 /* Report unique file id */
#define FAN_REPORT_DIR_FID 0x00000400 /* Report unique directory id */
#define FAN_EVENT_INFO_TYPE_FID 1
#define FAN_EVENT_INFO_TYPE_DFID_NAME 2
#define FAN_EVENT_INFO_TYPE_DFID 3
+#define FAN_EVENT_INFO_TYPE_PIDFD 4
/* Variable length info record following event metadata */
struct fanotify_event_info_header {
unsigned char handle[0];
};
+/*
+ * This structure is used for info records of type FAN_EVENT_INFO_TYPE_PIDFD.
+ * It holds a pidfd for the pid that was responsible for generating an event.
+ */
+struct fanotify_event_info_pidfd {
+ struct fanotify_event_info_header hdr;
+ __s32 pidfd;
+};
+
struct fanotify_response {
__s32 fd;
__u32 response;
/* No fd set in event */
#define FAN_NOFD -1
+#define FAN_NOPIDFD FAN_NOFD
+#define FAN_EPIDFD -2
/* Helper functions to deal with fanotify_event_metadata buffers */
#define FAN_EVENT_METADATA_LEN (sizeof(struct fanotify_event_metadata))
#define BLKSECDISCARD _IO(0x12,125)
#define BLKROTATIONAL _IO(0x12,126)
#define BLKZEROOUT _IO(0x12,127)
+#define BLKGETDISKSEQ _IOR(0x12,128,__u64)
/*
* A jump here: 130-136 are reserved for zoned block devices
* (see uapi/linux/blkzoned.h)
} __attribute__((packed));
/* personality to use, if used */
__u16 personality;
- __s32 splice_fd_in;
+ union {
+ __s32 splice_fd_in;
+ __u32 file_index;
+ };
__u64 __pad2[2];
};
/*
* sqe->timeout_flags
*/
-#define IORING_TIMEOUT_ABS (1U << 0)
-#define IORING_TIMEOUT_UPDATE (1U << 1)
-
+#define IORING_TIMEOUT_ABS (1U << 0)
+#define IORING_TIMEOUT_UPDATE (1U << 1)
+#define IORING_TIMEOUT_BOOTTIME (1U << 2)
+#define IORING_TIMEOUT_REALTIME (1U << 3)
+#define IORING_LINK_TIMEOUT_UPDATE (1U << 4)
+#define IORING_TIMEOUT_CLOCK_MASK (IORING_TIMEOUT_BOOTTIME | IORING_TIMEOUT_REALTIME)
+#define IORING_TIMEOUT_UPDATE_MASK (IORING_TIMEOUT_UPDATE | IORING_LINK_TIMEOUT_UPDATE)
/*
* sqe->splice_flags
* extends splice(2) flags
IORING_REGISTER_IOWQ_AFF = 17,
IORING_UNREGISTER_IOWQ_AFF = 18,
+ /* set/get max number of workers */
+ IORING_REGISTER_IOWQ_MAX_WORKERS = 19,
+
/* this goes last */
IORING_REGISTER_LAST
};
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_IOPRIO_H
+#define _UAPI_LINUX_IOPRIO_H
+
+/*
+ * Gives us 8 prio classes with 13-bits of data for each class
+ */
+#define IOPRIO_CLASS_SHIFT 13
+#define IOPRIO_CLASS_MASK 0x07
+#define IOPRIO_PRIO_MASK ((1UL << IOPRIO_CLASS_SHIFT) - 1)
+
+#define IOPRIO_PRIO_CLASS(ioprio) \
+ (((ioprio) >> IOPRIO_CLASS_SHIFT) & IOPRIO_CLASS_MASK)
+#define IOPRIO_PRIO_DATA(ioprio) ((ioprio) & IOPRIO_PRIO_MASK)
+#define IOPRIO_PRIO_VALUE(class, data) \
+ ((((class) & IOPRIO_CLASS_MASK) << IOPRIO_CLASS_SHIFT) | \
+ ((data) & IOPRIO_PRIO_MASK))
+
+/*
+ * These are the io priority groups as implemented by the BFQ and mq-deadline
+ * schedulers. RT is the realtime class, it always gets premium service. For
+ * ATA disks supporting NCQ IO priority, RT class IOs will be processed using
+ * high priority NCQ commands. BE is the best-effort scheduling class, the
+ * default for any process. IDLE is the idle scheduling class, it is only
+ * served when no one else is using the disk.
+ */
+enum {
+ IOPRIO_CLASS_NONE,
+ IOPRIO_CLASS_RT,
+ IOPRIO_CLASS_BE,
+ IOPRIO_CLASS_IDLE,
+};
+
+/*
+ * The RT and BE priority classes both support up to 8 priority levels.
+ */
+#define IOPRIO_NR_LEVELS 8
+#define IOPRIO_BE_NR IOPRIO_NR_LEVELS
+
+enum {
+ IOPRIO_WHO_PROCESS = 1,
+ IOPRIO_WHO_PGRP,
+ IOPRIO_WHO_USER,
+};
+
+/*
+ * Fallback BE priority level.
+ */
+#define IOPRIO_NORM 4
+#define IOPRIO_BE_NORM IOPRIO_NORM
+
+#endif /* _UAPI_LINUX_IOPRIO_H */
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
-/*
- * Copyright (C) 2015 CNEX Labs. All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; see the file COPYING. If not, write to
- * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
- * USA.
- */
-
-#ifndef _UAPI_LINUX_LIGHTNVM_H
-#define _UAPI_LINUX_LIGHTNVM_H
-
-#ifdef __KERNEL__
-#include <linux/const.h>
-#else /* __KERNEL__ */
-#include <stdio.h>
-#include <sys/ioctl.h>
-#define DISK_NAME_LEN 32
-#endif /* __KERNEL__ */
-
-#include <linux/types.h>
-#include <linux/ioctl.h>
-
-#define NVM_TTYPE_NAME_MAX 48
-#define NVM_TTYPE_MAX 63
-#define NVM_MMTYPE_LEN 8
-
-#define NVM_CTRL_FILE "/dev/lightnvm/control"
-
-struct nvm_ioctl_info_tgt {
- __u32 version[3];
- __u32 reserved;
- char tgtname[NVM_TTYPE_NAME_MAX];
-};
-
-struct nvm_ioctl_info {
- __u32 version[3]; /* in/out - major, minor, patch */
- __u16 tgtsize; /* number of targets */
- __u16 reserved16; /* pad to 4K page */
- __u32 reserved[12];
- struct nvm_ioctl_info_tgt tgts[NVM_TTYPE_MAX];
-};
-
-enum {
- NVM_DEVICE_ACTIVE = 1 << 0,
-};
-
-struct nvm_ioctl_device_info {
- char devname[DISK_NAME_LEN];
- char bmname[NVM_TTYPE_NAME_MAX];
- __u32 bmversion[3];
- __u32 flags;
- __u32 reserved[8];
-};
-
-struct nvm_ioctl_get_devices {
- __u32 nr_devices;
- __u32 reserved[31];
- struct nvm_ioctl_device_info info[31];
-};
-
-struct nvm_ioctl_create_simple {
- __u32 lun_begin;
- __u32 lun_end;
-};
-
-struct nvm_ioctl_create_extended {
- __u16 lun_begin;
- __u16 lun_end;
- __u16 op;
- __u16 rsv;
-};
-
-enum {
- NVM_CONFIG_TYPE_SIMPLE = 0,
- NVM_CONFIG_TYPE_EXTENDED = 1,
-};
-
-struct nvm_ioctl_create_conf {
- __u32 type;
- union {
- struct nvm_ioctl_create_simple s;
- struct nvm_ioctl_create_extended e;
- };
-};
-
-enum {
- NVM_TARGET_FACTORY = 1 << 0, /* Init target in factory mode */
-};
-
-struct nvm_ioctl_create {
- char dev[DISK_NAME_LEN]; /* open-channel SSD device */
- char tgttype[NVM_TTYPE_NAME_MAX]; /* target type name */
- char tgtname[DISK_NAME_LEN]; /* dev to expose target as */
-
- __u32 flags;
-
- struct nvm_ioctl_create_conf conf;
-};
-
-struct nvm_ioctl_remove {
- char tgtname[DISK_NAME_LEN];
-
- __u32 flags;
-};
-
-struct nvm_ioctl_dev_init {
- char dev[DISK_NAME_LEN]; /* open-channel SSD device */
- char mmtype[NVM_MMTYPE_LEN]; /* register to media manager */
-
- __u32 flags;
-};
-
-enum {
- NVM_FACTORY_ERASE_ONLY_USER = 1 << 0, /* erase only blocks used as
- * host blks or grown blks */
- NVM_FACTORY_RESET_HOST_BLKS = 1 << 1, /* remove host blk marks */
- NVM_FACTORY_RESET_GRWN_BBLKS = 1 << 2, /* remove grown blk marks */
- NVM_FACTORY_NR_BITS = 1 << 3, /* stops here */
-};
-
-struct nvm_ioctl_dev_factory {
- char dev[DISK_NAME_LEN];
-
- __u32 flags;
-};
-
-struct nvm_user_vio {
- __u8 opcode;
- __u8 flags;
- __u16 control;
- __u16 nppas;
- __u16 rsvd;
- __u64 metadata;
- __u64 addr;
- __u64 ppa_list;
- __u32 metadata_len;
- __u32 data_len;
- __u64 status;
- __u32 result;
- __u32 rsvd3[3];
-};
-
-struct nvm_passthru_vio {
- __u8 opcode;
- __u8 flags;
- __u8 rsvd[2];
- __u32 nsid;
- __u32 cdw2;
- __u32 cdw3;
- __u64 metadata;
- __u64 addr;
- __u32 metadata_len;
- __u32 data_len;
- __u64 ppa_list;
- __u16 nppas;
- __u16 control;
- __u32 cdw13;
- __u32 cdw14;
- __u32 cdw15;
- __u64 status;
- __u32 result;
- __u32 timeout_ms;
-};
-
-/* The ioctl type, 'L', 0x20 - 0x2F documented in ioctl-number.txt */
-enum {
- /* top level cmds */
- NVM_INFO_CMD = 0x20,
- NVM_GET_DEVICES_CMD,
-
- /* device level cmds */
- NVM_DEV_CREATE_CMD,
- NVM_DEV_REMOVE_CMD,
-
- /* Init a device to support LightNVM media managers */
- NVM_DEV_INIT_CMD,
-
- /* Factory reset device */
- NVM_DEV_FACTORY_CMD,
-
- /* Vector user I/O */
- NVM_DEV_VIO_ADMIN_CMD = 0x41,
- NVM_DEV_VIO_CMD = 0x42,
- NVM_DEV_VIO_USER_CMD = 0x43,
-};
-
-#define NVM_IOCTL 'L' /* 0x4c */
-
-#define NVM_INFO _IOWR(NVM_IOCTL, NVM_INFO_CMD, \
- struct nvm_ioctl_info)
-#define NVM_GET_DEVICES _IOR(NVM_IOCTL, NVM_GET_DEVICES_CMD, \
- struct nvm_ioctl_get_devices)
-#define NVM_DEV_CREATE _IOW(NVM_IOCTL, NVM_DEV_CREATE_CMD, \
- struct nvm_ioctl_create)
-#define NVM_DEV_REMOVE _IOW(NVM_IOCTL, NVM_DEV_REMOVE_CMD, \
- struct nvm_ioctl_remove)
-#define NVM_DEV_INIT _IOW(NVM_IOCTL, NVM_DEV_INIT_CMD, \
- struct nvm_ioctl_dev_init)
-#define NVM_DEV_FACTORY _IOW(NVM_IOCTL, NVM_DEV_FACTORY_CMD, \
- struct nvm_ioctl_dev_factory)
-
-#define NVME_NVM_IOCTL_IO_VIO _IOWR(NVM_IOCTL, NVM_DEV_VIO_USER_CMD, \
- struct nvm_passthru_vio)
-#define NVME_NVM_IOCTL_ADMIN_VIO _IOWR(NVM_IOCTL, NVM_DEV_VIO_ADMIN_CMD,\
- struct nvm_passthru_vio)
-#define NVME_NVM_IOCTL_SUBMIT_VIO _IOWR(NVM_IOCTL, NVM_DEV_VIO_CMD,\
- struct nvm_user_vio)
-
-#define NVM_VERSION_MAJOR 1
-#define NVM_VERSION_MINOR 0
-#define NVM_VERSION_PATCHLEVEL 0
-
-#endif
/* Speculation control variants */
# define PR_SPEC_STORE_BYPASS 0
# define PR_SPEC_INDIRECT_BRANCH 1
+# define PR_SPEC_L1D_FLUSH 2
/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
# define PR_SPEC_NOT_AFFECTED 0
# define PR_SPEC_PRCTL (1UL << 0)
printk("Please append a correct \"root=\" boot option; here are the available partitions:\n");
printk_all_partitions();
-#ifdef CONFIG_DEBUG_BLOCK_EXT_DEVT
- printk("DEBUG_BLOCK_EXT_DEVT is enabled, you need to specify "
- "explicit textual name for \"root=\" boot option.\n");
-#endif
panic("VFS: Unable to mount root fs on %s", b);
}
if (!(flags & SB_RDONLY)) {
.normal_prio = MAX_PRIO - 20,
.policy = SCHED_NORMAL,
.cpus_ptr = &init_task.cpus_mask,
+ .user_cpus_ptr = NULL,
.cpus_mask = CPU_MASK_ALL,
.nr_cpus_allowed= NR_CPUS,
.mm = NULL,
config QUEUED_RWLOCKS
def_bool y if ARCH_USE_QUEUED_RWLOCKS
- depends on SMP
+ depends on SMP && !PREEMPT_RT
config ARCH_HAS_MMIOWB
bool
case BPF_MAP_TYPE_RINGBUF:
if (func_id != BPF_FUNC_ringbuf_output &&
func_id != BPF_FUNC_ringbuf_reserve &&
- func_id != BPF_FUNC_ringbuf_submit &&
- func_id != BPF_FUNC_ringbuf_discard &&
func_id != BPF_FUNC_ringbuf_query)
goto error;
break;
if (map->map_type != BPF_MAP_TYPE_PERF_EVENT_ARRAY)
goto error;
break;
+ case BPF_FUNC_ringbuf_output:
+ case BPF_FUNC_ringbuf_reserve:
+ case BPF_FUNC_ringbuf_query:
+ if (map->map_type != BPF_MAP_TYPE_RINGBUF)
+ goto error;
+ break;
case BPF_FUNC_get_stackid:
if (map->map_type != BPF_MAP_TYPE_STACK_TRACE)
goto error;
}
/*
- * Return in pmask the portion of a cpusets's cpus_allowed that
- * are online. If none are online, walk up the cpuset hierarchy
- * until we find one that does have some online cpus.
+ * Return in pmask the portion of a task's cpusets's cpus_allowed that
+ * are online and are capable of running the task. If none are found,
+ * walk up the cpuset hierarchy until we find one that does have some
+ * appropriate cpus.
*
* One way or another, we guarantee to return some non-empty subset
* of cpu_online_mask.
*
* Call with callback_lock or cpuset_mutex held.
*/
-static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
+static void guarantee_online_cpus(struct task_struct *tsk,
+ struct cpumask *pmask)
{
- while (!cpumask_intersects(cs->effective_cpus, cpu_online_mask)) {
+ const struct cpumask *possible_mask = task_cpu_possible_mask(tsk);
+ struct cpuset *cs;
+
+ if (WARN_ON(!cpumask_and(pmask, possible_mask, cpu_online_mask)))
+ cpumask_copy(pmask, cpu_online_mask);
+
+ rcu_read_lock();
+ cs = task_cs(tsk);
+
+ while (!cpumask_intersects(cs->effective_cpus, pmask)) {
cs = parent_cs(cs);
if (unlikely(!cs)) {
/*
* cpuset's effective_cpus is on its way to be
* identical to cpu_online_mask.
*/
- cpumask_copy(pmask, cpu_online_mask);
- return;
+ goto out_unlock;
}
}
- cpumask_and(pmask, cs->effective_cpus, cpu_online_mask);
+ cpumask_and(pmask, pmask, cs->effective_cpus);
+
+out_unlock:
+ rcu_read_unlock();
}
/*
percpu_down_write(&cpuset_rwsem);
- /* prepare for attach */
- if (cs == &top_cpuset)
- cpumask_copy(cpus_attach, cpu_possible_mask);
- else
- guarantee_online_cpus(cs, cpus_attach);
-
guarantee_online_mems(cs, &cpuset_attach_nodemask_to);
cgroup_taskset_for_each(task, css, tset) {
+ if (cs != &top_cpuset)
+ guarantee_online_cpus(task, cpus_attach);
+ else
+ cpumask_copy(cpus_attach, task_cpu_possible_mask(task));
/*
* can_attach beforehand should guarantee that this doesn't
* fail. TODO: have a better way to handle failure here
unsigned long flags;
spin_lock_irqsave(&callback_lock, flags);
- rcu_read_lock();
- guarantee_online_cpus(task_cs(tsk), pmask);
- rcu_read_unlock();
+ guarantee_online_cpus(tsk, pmask);
spin_unlock_irqrestore(&callback_lock, flags);
}
* which will not contain a sane cpumask during cases such as cpu hotplugging.
* This is the absolute last resort for the scheduler and it is only used if
* _every_ other avenue has been traveled.
+ *
+ * Returns true if the affinity of @tsk was changed, false otherwise.
**/
-void cpuset_cpus_allowed_fallback(struct task_struct *tsk)
+bool cpuset_cpus_allowed_fallback(struct task_struct *tsk)
{
+ const struct cpumask *possible_mask = task_cpu_possible_mask(tsk);
+ const struct cpumask *cs_mask;
+ bool changed = false;
+
rcu_read_lock();
- do_set_cpus_allowed(tsk, is_in_v2_mode() ?
- task_cs(tsk)->cpus_allowed : cpu_possible_mask);
+ cs_mask = task_cs(tsk)->cpus_allowed;
+ if (is_in_v2_mode() && cpumask_subset(cs_mask, possible_mask)) {
+ do_set_cpus_allowed(tsk, cs_mask);
+ changed = true;
+ }
rcu_read_unlock();
/*
* select_fallback_rq() will fix things ups and set cpu_possible_mask
* if required.
*/
+ return changed;
}
void __init cpuset_init_current_mems_allowed(void)
#include "smpboot.h"
/**
- * cpuhp_cpu_state - Per cpu hotplug state storage
+ * struct cpuhp_cpu_state - Per cpu hotplug state storage
* @state: The current cpu state
* @target: The target state
+ * @fail: Current CPU hotplug callback state
* @thread: Pointer to the hotplug thread
* @should_run: Thread should execute
* @rollback: Perform a rollback
* @single: Single callback invocation
* @bringup: Single callback bringup or teardown selector
+ * @cpu: CPU number
+ * @node: Remote CPU node; for multi-instance, do a
+ * single entry callback for install/remove
+ * @last: For multi-instance rollback, remember how far we got
* @cb_state: The state for a single callback (install/uninstall)
* @result: Result of the operation
* @done_up: Signal completion to the issuer of the task for cpu-up
#endif
/**
- * cpuhp_step - Hotplug state machine step
+ * struct cpuhp_step - Hotplug state machine step
* @name: Name of the step
* @startup: Startup function of the step
* @teardown: Teardown function of the step
* @cant_stop: Bringup/teardown can't be stopped at this step
+ * @multi_instance: State has multiple instances which get added afterwards
*/
struct cpuhp_step {
const char *name;
int (*multi)(unsigned int cpu,
struct hlist_node *node);
} teardown;
+ /* private: */
struct hlist_head list;
+ /* public: */
bool cant_stop;
bool multi_instance;
};
}
/**
- * cpuhp_invoke_callback _ Invoke the callbacks for a given state
+ * cpuhp_invoke_callback - Invoke the callbacks for a given state
* @cpu: The cpu for which the callback should be invoked
* @state: The state to do callbacks for
* @bringup: True if the bringup callback should be invoked
* @lastp: For multi-instance rollback, remember how far we got
*
* Called from cpu hotplug and from the state register machinery.
+ *
+ * Return: %0 on success or a negative errno code
*/
static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
bool bringup, struct hlist_node *node,
ret = cpuhp_invoke_callback_range(true, cpu, st, target);
if (ret) {
+ pr_debug("CPU UP failed (%d) CPU %u state %s (%d)\n",
+ ret, cpu, cpuhp_get_step(st->state)->name,
+ st->state);
+
cpuhp_reset_state(st, prev_state);
if (can_rollback_cpu(st))
WARN_ON(cpuhp_invoke_callback_range(false, cpu, st,
ret = cpuhp_invoke_callback_range(false, cpu, st, target);
if (ret) {
+ pr_debug("CPU DOWN failed (%d) CPU %u state %s (%d)\n",
+ ret, cpu, cpuhp_get_step(st->state)->name,
+ st->state);
cpuhp_reset_state(st, prev_state);
* This function is meant to be used by device core cpu subsystem only.
*
* Other subsystems should use remove_cpu() instead.
+ *
+ * Return: %0 on success or a negative errno code
*/
int cpu_device_down(struct device *dev)
{
* This function is meant to be used by device core cpu subsystem only.
*
* Other subsystems should use add_cpu() instead.
+ *
+ * Return: %0 on success or a negative errno code
*/
int cpu_device_up(struct device *dev)
{
* On some architectures like arm64, we can hibernate on any CPU, but on
* wake up the CPU we hibernated on might be offline as a side effect of
* using maxcpus= for example.
+ *
+ * Return: %0 on success or a negative errno code
*/
int bringup_hibernate_cpu(unsigned int sleep_cpu)
{
/**
* __cpuhp_setup_state_cpuslocked - Setup the callbacks for an hotplug machine state
* @state: The state to setup
+ * @name: Name of the step
* @invoke: If true, the startup function is invoked for cpus where
* cpu state >= @state
* @startup: startup callback function
* added afterwards.
*
* The caller needs to hold cpus read locked while calling this function.
- * Returns:
+ * Return:
* On success:
- * Positive state number if @state is CPUHP_AP_ONLINE_DYN
+ * Positive state number if @state is CPUHP_AP_ONLINE_DYN;
* 0 for all other states
* On failure: proper (negative) error code
*/
#endif
#if defined(CONFIG_SYSFS) && defined(CONFIG_HOTPLUG_CPU)
-static ssize_t show_cpuhp_state(struct device *dev,
- struct device_attribute *attr, char *buf)
+static ssize_t state_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
{
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
return sprintf(buf, "%d\n", st->state);
}
-static DEVICE_ATTR(state, 0444, show_cpuhp_state, NULL);
+static DEVICE_ATTR_RO(state);
-static ssize_t write_cpuhp_target(struct device *dev,
- struct device_attribute *attr,
- const char *buf, size_t count)
+static ssize_t target_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
{
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
struct cpuhp_step *sp;
return ret ? ret : count;
}
-static ssize_t show_cpuhp_target(struct device *dev,
- struct device_attribute *attr, char *buf)
+static ssize_t target_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
{
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
return sprintf(buf, "%d\n", st->target);
}
-static DEVICE_ATTR(target, 0644, show_cpuhp_target, write_cpuhp_target);
-
+static DEVICE_ATTR_RW(target);
-static ssize_t write_cpuhp_fail(struct device *dev,
- struct device_attribute *attr,
- const char *buf, size_t count)
+static ssize_t fail_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
{
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
struct cpuhp_step *sp;
return count;
}
-static ssize_t show_cpuhp_fail(struct device *dev,
- struct device_attribute *attr, char *buf)
+static ssize_t fail_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
{
struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, dev->id);
return sprintf(buf, "%d\n", st->fail);
}
-static DEVICE_ATTR(fail, 0644, show_cpuhp_fail, write_cpuhp_fail);
+static DEVICE_ATTR_RW(fail);
static struct attribute *cpuhp_cpu_attrs[] = {
&dev_attr_state.attr,
NULL
};
-static ssize_t show_cpuhp_states(struct device *dev,
+static ssize_t states_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
ssize_t cur, res = 0;
mutex_unlock(&cpuhp_state_mutex);
return res;
}
-static DEVICE_ATTR(states, 0444, show_cpuhp_states, NULL);
+static DEVICE_ATTR_RO(states);
static struct attribute *cpuhp_cpu_root_attrs[] = {
&dev_attr_states.attr,
[CPU_SMT_NOT_IMPLEMENTED] = "notimplemented",
};
-static ssize_t
-show_smt_control(struct device *dev, struct device_attribute *attr, char *buf)
+static ssize_t control_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
{
const char *state = smt_states[cpu_smt_control];
return snprintf(buf, PAGE_SIZE - 2, "%s\n", state);
}
-static ssize_t
-store_smt_control(struct device *dev, struct device_attribute *attr,
- const char *buf, size_t count)
+static ssize_t control_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
{
return __store_smt_control(dev, attr, buf, count);
}
-static DEVICE_ATTR(control, 0644, show_smt_control, store_smt_control);
+static DEVICE_ATTR_RW(control);
-static ssize_t
-show_smt_active(struct device *dev, struct device_attribute *attr, char *buf)
+static ssize_t active_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
{
return snprintf(buf, PAGE_SIZE - 2, "%d\n", sched_smt_active());
}
-static DEVICE_ATTR(active, 0444, show_smt_active, NULL);
+static DEVICE_ATTR_RO(active);
static struct attribute *cpuhp_smt_attrs[] = {
&dev_attr_control.attr,
new->security = NULL;
#endif
- if (security_prepare_creds(new, old, GFP_KERNEL_ACCOUNT) < 0)
- goto error;
-
new->ucounts = get_ucounts(new->ucounts);
if (!new->ucounts)
goto error;
+ if (security_prepare_creds(new, old, GFP_KERNEL_ACCOUNT) < 0)
+ goto error;
+
validate_creds(new);
return new;
#ifdef CONFIG_SECURITY
new->security = NULL;
#endif
- if (security_prepare_creds(new, old, GFP_KERNEL_ACCOUNT) < 0)
- goto error;
-
new->ucounts = get_ucounts(new->ucounts);
if (!new->ucounts)
goto error;
+ if (security_prepare_creds(new, old, GFP_KERNEL_ACCOUNT) < 0)
+ goto error;
+
put_cred(old);
validate_creds(new);
return new;
if (!cpu_events)
return (void __percpu __force *)ERR_PTR(-ENOMEM);
- get_online_cpus();
+ cpus_read_lock();
for_each_online_cpu(cpu) {
bp = perf_event_create_kernel_counter(attr, cpu, NULL,
triggered, context);
per_cpu(*cpu_events, cpu) = bp;
}
- put_online_cpus();
+ cpus_read_unlock();
if (likely(!err))
return cpu_events;
void free_task(struct task_struct *tsk)
{
+ release_user_cpus_ptr(tsk);
scs_release(tsk);
#ifndef CONFIG_THREAD_INFO_IN_TASK
for (i = 0; i < MAX_PER_NAMESPACE_UCOUNTS; i++)
init_user_ns.ucount_max[i] = max_threads/2;
- set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, task_rlimit(&init_task, RLIMIT_NPROC));
- set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, task_rlimit(&init_task, RLIMIT_MSGQUEUE));
- set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, task_rlimit(&init_task, RLIMIT_SIGPENDING));
- set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, task_rlimit(&init_task, RLIMIT_MEMLOCK));
+ set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_NPROC, RLIM_INFINITY);
+ set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MSGQUEUE, RLIM_INFINITY);
+ set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_SIGPENDING, RLIM_INFINITY);
+ set_rlimit_ucount_max(&init_user_ns, UCOUNT_RLIMIT_MEMLOCK, RLIM_INFINITY);
#ifdef CONFIG_VMAP_STACK
cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "fork:vm_stack_cache",
#endif
if (orig->cpus_ptr == &orig->cpus_mask)
tsk->cpus_ptr = &tsk->cpus_mask;
+ dup_user_cpus_ptr(tsk, orig, node);
/*
* One for the user space visible state that goes away when reaped.
/*
* The PI object:
*/
- struct rt_mutex pi_mutex;
+ struct rt_mutex_base pi_mutex;
struct task_struct *owner;
refcount_t refcount;
* @rt_waiter: rt_waiter storage for use with requeue_pi
* @requeue_pi_key: the requeue_pi target futex key
* @bitset: bitset for the optional bitmasked wakeup
+ * @requeue_state: State field for futex_requeue_pi()
+ * @requeue_wait: RCU wait for futex_requeue_pi() (RT only)
*
* We use this hashed waitqueue, instead of a normal wait_queue_entry_t, so
* we can wake only the relevant ones (hashed queues may be shared).
struct rt_mutex_waiter *rt_waiter;
union futex_key *requeue_pi_key;
u32 bitset;
+ atomic_t requeue_state;
+#ifdef CONFIG_PREEMPT_RT
+ struct rcuwait requeue_wait;
+#endif
} __randomize_layout;
+/*
+ * On PREEMPT_RT, the hash bucket lock is a 'sleeping' spinlock with an
+ * underlying rtmutex. The task which is about to be requeued could have
+ * just woken up (timeout, signal). After the wake up the task has to
+ * acquire hash bucket lock, which is held by the requeue code. As a task
+ * can only be blocked on _ONE_ rtmutex at a time, the proxy lock blocking
+ * and the hash bucket lock blocking would collide and corrupt state.
+ *
+ * On !PREEMPT_RT this is not a problem and everything could be serialized
+ * on hash bucket lock, but aside of having the benefit of common code,
+ * this allows to avoid doing the requeue when the task is already on the
+ * way out and taking the hash bucket lock of the original uaddr1 when the
+ * requeue has been completed.
+ *
+ * The following state transitions are valid:
+ *
+ * On the waiter side:
+ * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IGNORE
+ * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_WAIT
+ *
+ * On the requeue side:
+ * Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_INPROGRESS
+ * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_DONE/LOCKED
+ * Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_NONE (requeue failed)
+ * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_DONE/LOCKED
+ * Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_IGNORE (requeue failed)
+ *
+ * The requeue side ignores a waiter with state Q_REQUEUE_PI_IGNORE as this
+ * signals that the waiter is already on the way out. It also means that
+ * the waiter is still on the 'wait' futex, i.e. uaddr1.
+ *
+ * The waiter side signals early wakeup to the requeue side either through
+ * setting state to Q_REQUEUE_PI_IGNORE or to Q_REQUEUE_PI_WAIT depending
+ * on the current state. In case of Q_REQUEUE_PI_IGNORE it can immediately
+ * proceed to take the hash bucket lock of uaddr1. If it set state to WAIT,
+ * which means the wakeup is interleaving with a requeue in progress it has
+ * to wait for the requeue side to change the state. Either to DONE/LOCKED
+ * or to IGNORE. DONE/LOCKED means the waiter q is now on the uaddr2 futex
+ * and either blocked (DONE) or has acquired it (LOCKED). IGNORE is set by
+ * the requeue side when the requeue attempt failed via deadlock detection
+ * and therefore the waiter q is still on the uaddr1 futex.
+ */
+enum {
+ Q_REQUEUE_PI_NONE = 0,
+ Q_REQUEUE_PI_IGNORE,
+ Q_REQUEUE_PI_IN_PROGRESS,
+ Q_REQUEUE_PI_WAIT,
+ Q_REQUEUE_PI_DONE,
+ Q_REQUEUE_PI_LOCKED,
+};
+
static const struct futex_q futex_q_init = {
/* list gets initialized in queue_me()*/
- .key = FUTEX_KEY_INIT,
- .bitset = FUTEX_BITSET_MATCH_ANY
+ .key = FUTEX_KEY_INIT,
+ .bitset = FUTEX_BITSET_MATCH_ANY,
+ .requeue_state = ATOMIC_INIT(Q_REQUEUE_PI_NONE),
};
/*
return 0;
}
-static int lookup_pi_state(u32 __user *uaddr, u32 uval,
- struct futex_hash_bucket *hb,
- union futex_key *key, struct futex_pi_state **ps,
- struct task_struct **exiting)
-{
- struct futex_q *top_waiter = futex_top_waiter(hb, key);
-
- /*
- * If there is a waiter on that futex, validate it and
- * attach to the pi_state when the validation succeeds.
- */
- if (top_waiter)
- return attach_to_pi_state(uaddr, uval, top_waiter->pi_state, ps);
-
- /*
- * We are the first waiter - try to look up the owner based on
- * @uval and attach to it.
- */
- return attach_to_pi_owner(uaddr, uval, key, ps, exiting);
-}
-
static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
{
int err;
* - 1 - acquired the lock;
* - <0 - error
*
- * The hb->lock and futex_key refs shall be held by the caller.
+ * The hb->lock must be held by the caller.
*
* @exiting is only set when the return value is -EBUSY. If so, this holds
* a refcount on the exiting task on return and the caller needs to drop it
*/
static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_pi_state *pi_state)
{
- u32 curval, newval;
struct rt_mutex_waiter *top_waiter;
struct task_struct *new_owner;
bool postunlock = false;
- DEFINE_WAKE_Q(wake_q);
+ DEFINE_RT_WAKE_Q(wqh);
+ u32 curval, newval;
int ret = 0;
top_waiter = rt_mutex_top_waiter(&pi_state->pi_mutex);
* not fail.
*/
pi_state_update_owner(pi_state, new_owner);
- postunlock = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
+ postunlock = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wqh);
}
out_unlock:
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
if (postunlock)
- rt_mutex_postunlock(&wake_q);
+ rt_mutex_postunlock(&wqh);
return ret;
}
q->key = *key2;
}
+static inline bool futex_requeue_pi_prepare(struct futex_q *q,
+ struct futex_pi_state *pi_state)
+{
+ int old, new;
+
+ /*
+ * Set state to Q_REQUEUE_PI_IN_PROGRESS unless an early wakeup has
+ * already set Q_REQUEUE_PI_IGNORE to signal that requeue should
+ * ignore the waiter.
+ */
+ old = atomic_read_acquire(&q->requeue_state);
+ do {
+ if (old == Q_REQUEUE_PI_IGNORE)
+ return false;
+
+ /*
+ * futex_proxy_trylock_atomic() might have set it to
+ * IN_PROGRESS and a interleaved early wake to WAIT.
+ *
+ * It was considered to have an extra state for that
+ * trylock, but that would just add more conditionals
+ * all over the place for a dubious value.
+ */
+ if (old != Q_REQUEUE_PI_NONE)
+ break;
+
+ new = Q_REQUEUE_PI_IN_PROGRESS;
+ } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new));
+
+ q->pi_state = pi_state;
+ return true;
+}
+
+static inline void futex_requeue_pi_complete(struct futex_q *q, int locked)
+{
+ int old, new;
+
+ old = atomic_read_acquire(&q->requeue_state);
+ do {
+ if (old == Q_REQUEUE_PI_IGNORE)
+ return;
+
+ if (locked >= 0) {
+ /* Requeue succeeded. Set DONE or LOCKED */
+ WARN_ON_ONCE(old != Q_REQUEUE_PI_IN_PROGRESS &&
+ old != Q_REQUEUE_PI_WAIT);
+ new = Q_REQUEUE_PI_DONE + locked;
+ } else if (old == Q_REQUEUE_PI_IN_PROGRESS) {
+ /* Deadlock, no early wakeup interleave */
+ new = Q_REQUEUE_PI_NONE;
+ } else {
+ /* Deadlock, early wakeup interleave. */
+ WARN_ON_ONCE(old != Q_REQUEUE_PI_WAIT);
+ new = Q_REQUEUE_PI_IGNORE;
+ }
+ } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new));
+
+#ifdef CONFIG_PREEMPT_RT
+ /* If the waiter interleaved with the requeue let it know */
+ if (unlikely(old == Q_REQUEUE_PI_WAIT))
+ rcuwait_wake_up(&q->requeue_wait);
+#endif
+}
+
+static inline int futex_requeue_pi_wakeup_sync(struct futex_q *q)
+{
+ int old, new;
+
+ old = atomic_read_acquire(&q->requeue_state);
+ do {
+ /* Is requeue done already? */
+ if (old >= Q_REQUEUE_PI_DONE)
+ return old;
+
+ /*
+ * If not done, then tell the requeue code to either ignore
+ * the waiter or to wake it up once the requeue is done.
+ */
+ new = Q_REQUEUE_PI_WAIT;
+ if (old == Q_REQUEUE_PI_NONE)
+ new = Q_REQUEUE_PI_IGNORE;
+ } while (!atomic_try_cmpxchg(&q->requeue_state, &old, new));
+
+ /* If the requeue was in progress, wait for it to complete */
+ if (old == Q_REQUEUE_PI_IN_PROGRESS) {
+#ifdef CONFIG_PREEMPT_RT
+ rcuwait_wait_event(&q->requeue_wait,
+ atomic_read(&q->requeue_state) != Q_REQUEUE_PI_WAIT,
+ TASK_UNINTERRUPTIBLE);
+#else
+ (void)atomic_cond_read_relaxed(&q->requeue_state, VAL != Q_REQUEUE_PI_WAIT);
+#endif
+ }
+
+ /*
+ * Requeue is now either prohibited or complete. Reread state
+ * because during the wait above it might have changed. Nothing
+ * will modify q->requeue_state after this point.
+ */
+ return atomic_read(&q->requeue_state);
+}
+
/**
* requeue_pi_wake_futex() - Wake a task that acquired the lock during requeue
* @q: the futex_q
q->lock_ptr = &hb->lock;
+ /* Signal locked state to the waiter */
+ futex_requeue_pi_complete(q, 1);
wake_up_state(q->task, TASK_NORMAL);
}
if (!top_waiter)
return 0;
+ /*
+ * Ensure that this is a waiter sitting in futex_wait_requeue_pi()
+ * and waiting on the 'waitqueue' futex which is always !PI.
+ */
+ if (!top_waiter->rt_waiter || top_waiter->pi_state)
+ ret = -EINVAL;
+
/* Ensure we requeue to the expected futex. */
if (!match_futex(top_waiter->requeue_pi_key, key2))
return -EINVAL;
+ /* Ensure that this does not race against an early wakeup */
+ if (!futex_requeue_pi_prepare(top_waiter, NULL))
+ return -EAGAIN;
+
/*
* Try to take the lock for top_waiter. Set the FUTEX_WAITERS bit in
* the contended case or if set_waiters is 1. The pi_state is returned
ret = futex_lock_pi_atomic(pifutex, hb2, key2, ps, top_waiter->task,
exiting, set_waiters);
if (ret == 1) {
+ /* Dequeue, wake up and update top_waiter::requeue_state */
requeue_pi_wake_futex(top_waiter, key2, hb2);
return vpid;
+ } else if (ret < 0) {
+ /* Rewind top_waiter::requeue_state */
+ futex_requeue_pi_complete(top_waiter, ret);
+ } else {
+ /*
+ * futex_lock_pi_atomic() did not acquire the user space
+ * futex, but managed to establish the proxy lock and pi
+ * state. top_waiter::requeue_state cannot be fixed up here
+ * because the waiter is not enqueued on the rtmutex
+ * yet. This is handled at the callsite depending on the
+ * result of rt_mutex_start_proxy_lock() which is
+ * guaranteed to be reached with this function returning 0.
+ */
}
return ret;
}
if (uaddr1 == uaddr2)
return -EINVAL;
+ /*
+ * futex_requeue() allows the caller to define the number
+ * of waiters to wake up via the @nr_wake argument. With
+ * REQUEUE_PI, waking up more than one waiter is creating
+ * more problems than it solves. Waking up a waiter makes
+ * only sense if the PI futex @uaddr2 is uncontended as
+ * this allows the requeue code to acquire the futex
+ * @uaddr2 before waking the waiter. The waiter can then
+ * return to user space without further action. A secondary
+ * wakeup would just make the futex_wait_requeue_pi()
+ * handling more complex, because that code would have to
+ * look up pi_state and do more or less all the handling
+ * which the requeue code has to do for the to be requeued
+ * waiters. So restrict the number of waiters to wake to
+ * one, and only wake it up when the PI futex is
+ * uncontended. Otherwise requeue it and let the unlock of
+ * the PI futex handle the wakeup.
+ *
+ * All REQUEUE_PI users, e.g. pthread_cond_signal() and
+ * pthread_cond_broadcast() must use nr_wake=1.
+ */
+ if (nr_wake != 1)
+ return -EINVAL;
+
/*
* requeue_pi requires a pi_state, try to allocate it now
* without any locks in case it fails.
*/
if (refill_pi_state_cache())
return -ENOMEM;
- /*
- * requeue_pi must wake as many tasks as it can, up to nr_wake
- * + nr_requeue, since it acquires the rt_mutex prior to
- * returning to userspace, so as to not leave the rt_mutex with
- * waiters and no owner. However, second and third wake-ups
- * cannot be predicted as they involve race conditions with the
- * first wake and a fault while looking up the pi_state. Both
- * pthread_cond_signal() and pthread_cond_broadcast() should
- * use nr_wake=1.
- */
- if (nr_wake != 1)
- return -EINVAL;
}
retry:
}
}
- if (requeue_pi && (task_count - nr_wake < nr_requeue)) {
+ if (requeue_pi) {
struct task_struct *exiting = NULL;
/*
* intend to requeue waiters, force setting the FUTEX_WAITERS
* bit. We force this here where we are able to easily handle
* faults rather in the requeue loop below.
+ *
+ * Updates topwaiter::requeue_state if a top waiter exists.
*/
ret = futex_proxy_trylock_atomic(uaddr2, hb1, hb2, &key1,
&key2, &pi_state,
* At this point the top_waiter has either taken uaddr2 or is
* waiting on it. If the former, then the pi_state will not
* exist yet, look it up one more time to ensure we have a
- * reference to it. If the lock was taken, ret contains the
- * vpid of the top waiter task.
+ * reference to it. If the lock was taken, @ret contains the
+ * VPID of the top waiter task.
* If the lock was not taken, we have pi_state and an initial
* refcount on it. In case of an error we have nothing.
+ *
+ * The top waiter's requeue_state is up to date:
+ *
+ * - If the lock was acquired atomically (ret > 0), then
+ * the state is Q_REQUEUE_PI_LOCKED.
+ *
+ * - If the trylock failed with an error (ret < 0) then
+ * the state is either Q_REQUEUE_PI_NONE, i.e. "nothing
+ * happened", or Q_REQUEUE_PI_IGNORE when there was an
+ * interleaved early wakeup.
+ *
+ * - If the trylock did not succeed (ret == 0) then the
+ * state is either Q_REQUEUE_PI_IN_PROGRESS or
+ * Q_REQUEUE_PI_WAIT if an early wakeup interleaved.
+ * This will be cleaned up in the loop below, which
+ * cannot fail because futex_proxy_trylock_atomic() did
+ * the same sanity checks for requeue_pi as the loop
+ * below does.
*/
if (ret > 0) {
WARN_ON(pi_state);
task_count++;
/*
- * If we acquired the lock, then the user space value
- * of uaddr2 should be vpid. It cannot be changed by
- * the top waiter as it is blocked on hb2 lock if it
- * tries to do so. If something fiddled with it behind
- * our back the pi state lookup might unearth it. So
- * we rather use the known value than rereading and
- * handing potential crap to lookup_pi_state.
+ * If futex_proxy_trylock_atomic() acquired the
+ * user space futex, then the user space value
+ * @uaddr2 has been set to the @hb1's top waiter
+ * task VPID. This task is guaranteed to be alive
+ * and cannot be exiting because it is either
+ * sleeping or blocked on @hb2 lock.
+ *
+ * The @uaddr2 futex cannot have waiters either as
+ * otherwise futex_proxy_trylock_atomic() would not
+ * have succeeded.
*
- * If that call succeeds then we have pi_state and an
- * initial refcount on it.
+ * In order to requeue waiters to @hb2, pi state is
+ * required. Hand in the VPID value (@ret) and
+ * allocate PI state with an initial refcount on
+ * it.
*/
- ret = lookup_pi_state(uaddr2, ret, hb2, &key2,
- &pi_state, &exiting);
+ ret = attach_to_pi_owner(uaddr2, ret, &key2, &pi_state,
+ &exiting);
+ WARN_ON(ret);
}
switch (ret) {
/* We hold a reference on the pi state. */
break;
- /* If the above failed, then pi_state is NULL */
+ /*
+ * If the above failed, then pi_state is NULL and
+ * waiter::requeue_state is correct.
+ */
case -EFAULT:
double_unlock_hb(hb1, hb2);
hb_waiters_dec(hb2);
break;
}
- /*
- * Wake nr_wake waiters. For requeue_pi, if we acquired the
- * lock, we already woke the top_waiter. If not, it will be
- * woken by futex_unlock_pi().
- */
- if (++task_count <= nr_wake && !requeue_pi) {
- mark_wake_futex(&wake_q, this);
+ /* Plain futexes just wake or requeue and are done */
+ if (!requeue_pi) {
+ if (++task_count <= nr_wake)
+ mark_wake_futex(&wake_q, this);
+ else
+ requeue_futex(this, hb1, hb2, &key2);
continue;
}
/* Ensure we requeue to the expected futex for requeue_pi. */
- if (requeue_pi && !match_futex(this->requeue_pi_key, &key2)) {
+ if (!match_futex(this->requeue_pi_key, &key2)) {
ret = -EINVAL;
break;
}
/*
* Requeue nr_requeue waiters and possibly one more in the case
* of requeue_pi if we couldn't acquire the lock atomically.
+ *
+ * Prepare the waiter to take the rt_mutex. Take a refcount
+ * on the pi_state and store the pointer in the futex_q
+ * object of the waiter.
*/
- if (requeue_pi) {
+ get_pi_state(pi_state);
+
+ /* Don't requeue when the waiter is already on the way out. */
+ if (!futex_requeue_pi_prepare(this, pi_state)) {
/*
- * Prepare the waiter to take the rt_mutex. Take a
- * refcount on the pi_state and store the pointer in
- * the futex_q object of the waiter.
+ * Early woken waiter signaled that it is on the
+ * way out. Drop the pi_state reference and try the
+ * next waiter. @this->pi_state is still NULL.
*/
- get_pi_state(pi_state);
- this->pi_state = pi_state;
- ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex,
- this->rt_waiter,
- this->task);
- if (ret == 1) {
- /*
- * We got the lock. We do neither drop the
- * refcount on pi_state nor clear
- * this->pi_state because the waiter needs the
- * pi_state for cleaning up the user space
- * value. It will drop the refcount after
- * doing so.
- */
- requeue_pi_wake_futex(this, &key2, hb2);
- continue;
- } else if (ret) {
- /*
- * rt_mutex_start_proxy_lock() detected a
- * potential deadlock when we tried to queue
- * that waiter. Drop the pi_state reference
- * which we took above and remove the pointer
- * to the state from the waiters futex_q
- * object.
- */
- this->pi_state = NULL;
- put_pi_state(pi_state);
- /*
- * We stop queueing more waiters and let user
- * space deal with the mess.
- */
- break;
- }
+ put_pi_state(pi_state);
+ continue;
+ }
+
+ ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex,
+ this->rt_waiter,
+ this->task);
+
+ if (ret == 1) {
+ /*
+ * We got the lock. We do neither drop the refcount
+ * on pi_state nor clear this->pi_state because the
+ * waiter needs the pi_state for cleaning up the
+ * user space value. It will drop the refcount
+ * after doing so. this::requeue_state is updated
+ * in the wakeup as well.
+ */
+ requeue_pi_wake_futex(this, &key2, hb2);
+ task_count++;
+ } else if (!ret) {
+ /* Waiter is queued, move it to hb2 */
+ requeue_futex(this, hb1, hb2, &key2);
+ futex_requeue_pi_complete(this, 0);
+ task_count++;
+ } else {
+ /*
+ * rt_mutex_start_proxy_lock() detected a potential
+ * deadlock when we tried to queue that waiter.
+ * Drop the pi_state reference which we took above
+ * and remove the pointer to the state from the
+ * waiters futex_q object.
+ */
+ this->pi_state = NULL;
+ put_pi_state(pi_state);
+ futex_requeue_pi_complete(this, ret);
+ /*
+ * We stop queueing more waiters and let user space
+ * deal with the mess.
+ */
+ break;
}
- requeue_futex(this, hb1, hb2, &key2);
}
/*
- * We took an extra initial reference to the pi_state either
- * in futex_proxy_trylock_atomic() or in lookup_pi_state(). We
- * need to drop it here again.
+ * We took an extra initial reference to the pi_state either in
+ * futex_proxy_trylock_atomic() or in attach_to_pi_owner(). We need
+ * to drop it here again.
*/
put_pi_state(pi_state);
* Modifying pi_state _before_ the user space value would leave the
* pi_state in an inconsistent state when we fault here, because we
* need to drop the locks to handle the fault. This might be observed
- * in the PID check in lookup_pi_state.
+ * in the PID checks when attaching to PI state .
*/
retry:
if (!argowner) {
*
* Setup the futex_q and locate the hash_bucket. Get the futex value and
* compare it with the expected value. Handle atomic faults internally.
- * Return with the hb lock held and a q.key reference on success, and unlocked
- * with no q.key reference on failure.
+ * Return with the hb lock held on success, and unlocked on failure.
*
* Return:
* - 0 - uaddr contains val and hb has been locked;
current->timer_slack_ns);
retry:
/*
- * Prepare to wait on uaddr. On success, holds hb lock and increments
- * q.key refs.
+ * Prepare to wait on uaddr. On success, it holds hb->lock and q
+ * is initialized.
*/
ret = futex_wait_setup(uaddr, val, flags, &q, &hb);
if (ret)
/* If we were woken (and unqueued), we succeeded, whatever. */
ret = 0;
- /* unqueue_me() drops q.key ref */
if (!unqueue_me(&q))
goto out;
ret = -ETIMEDOUT;
}
/**
- * handle_early_requeue_pi_wakeup() - Detect early wakeup on the initial futex
+ * handle_early_requeue_pi_wakeup() - Handle early wakeup on the initial futex
* @hb: the hash_bucket futex_q was original enqueued on
* @q: the futex_q woken while waiting to be requeued
- * @key2: the futex_key of the requeue target futex
* @timeout: the timeout associated with the wait (NULL if none)
*
- * Detect if the task was woken on the initial futex as opposed to the requeue
- * target futex. If so, determine if it was a timeout or a signal that caused
- * the wakeup and return the appropriate error code to the caller. Must be
- * called with the hb lock held.
+ * Determine the cause for the early wakeup.
*
* Return:
- * - 0 = no early wakeup detected;
- * - <0 = -ETIMEDOUT or -ERESTARTNOINTR
+ * -EWOULDBLOCK or -ETIMEDOUT or -ERESTARTNOINTR
*/
static inline
int handle_early_requeue_pi_wakeup(struct futex_hash_bucket *hb,
- struct futex_q *q, union futex_key *key2,
+ struct futex_q *q,
struct hrtimer_sleeper *timeout)
{
- int ret = 0;
+ int ret;
/*
* With the hb lock held, we avoid races while we process the wakeup.
* It can't be requeued from uaddr2 to something else since we don't
* support a PI aware source futex for requeue.
*/
- if (!match_futex(&q->key, key2)) {
- WARN_ON(q->lock_ptr && (&hb->lock != q->lock_ptr));
- /*
- * We were woken prior to requeue by a timeout or a signal.
- * Unqueue the futex_q and determine which it was.
- */
- plist_del(&q->list, &hb->chain);
- hb_waiters_dec(hb);
+ WARN_ON_ONCE(&hb->lock != q->lock_ptr);
- /* Handle spurious wakeups gracefully */
- ret = -EWOULDBLOCK;
- if (timeout && !timeout->task)
- ret = -ETIMEDOUT;
- else if (signal_pending(current))
- ret = -ERESTARTNOINTR;
- }
+ /*
+ * We were woken prior to requeue by a timeout or a signal.
+ * Unqueue the futex_q and determine which it was.
+ */
+ plist_del(&q->list, &hb->chain);
+ hb_waiters_dec(hb);
+
+ /* Handle spurious wakeups gracefully */
+ ret = -EWOULDBLOCK;
+ if (timeout && !timeout->task)
+ ret = -ETIMEDOUT;
+ else if (signal_pending(current))
+ ret = -ERESTARTNOINTR;
return ret;
}
struct futex_hash_bucket *hb;
union futex_key key2 = FUTEX_KEY_INIT;
struct futex_q q = futex_q_init;
+ struct rt_mutex_base *pi_mutex;
int res, ret;
if (!IS_ENABLED(CONFIG_FUTEX_PI))
q.requeue_pi_key = &key2;
/*
- * Prepare to wait on uaddr. On success, increments q.key (key1) ref
- * count.
+ * Prepare to wait on uaddr. On success, it holds hb->lock and q
+ * is initialized.
*/
ret = futex_wait_setup(uaddr, val, flags, &q, &hb);
if (ret)
/* Queue the futex_q, drop the hb lock, wait for wakeup. */
futex_wait_queue_me(hb, &q, to);
- spin_lock(&hb->lock);
- ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
- spin_unlock(&hb->lock);
- if (ret)
- goto out;
-
- /*
- * In order for us to be here, we know our q.key == key2, and since
- * we took the hb->lock above, we also know that futex_requeue() has
- * completed and we no longer have to concern ourselves with a wakeup
- * race with the atomic proxy lock acquisition by the requeue code. The
- * futex_requeue dropped our key1 reference and incremented our key2
- * reference count.
- */
+ switch (futex_requeue_pi_wakeup_sync(&q)) {
+ case Q_REQUEUE_PI_IGNORE:
+ /* The waiter is still on uaddr1 */
+ spin_lock(&hb->lock);
+ ret = handle_early_requeue_pi_wakeup(hb, &q, to);
+ spin_unlock(&hb->lock);
+ break;
- /*
- * Check if the requeue code acquired the second futex for us and do
- * any pertinent fixup.
- */
- if (!q.rt_waiter) {
+ case Q_REQUEUE_PI_LOCKED:
+ /* The requeue acquired the lock */
if (q.pi_state && (q.pi_state->owner != current)) {
spin_lock(q.lock_ptr);
ret = fixup_owner(uaddr2, &q, true);
/*
- * Drop the reference to the pi state which
- * the requeue_pi() code acquired for us.
+ * Drop the reference to the pi state which the
+ * requeue_pi() code acquired for us.
*/
put_pi_state(q.pi_state);
spin_unlock(q.lock_ptr);
*/
ret = ret < 0 ? ret : 0;
}
- } else {
- struct rt_mutex *pi_mutex;
+ break;
- /*
- * We have been woken up by futex_unlock_pi(), a timeout, or a
- * signal. futex_unlock_pi() will not destroy the lock_ptr nor
- * the pi_state.
- */
- WARN_ON(!q.pi_state);
+ case Q_REQUEUE_PI_DONE:
+ /* Requeue completed. Current is 'pi_blocked_on' the rtmutex */
pi_mutex = &q.pi_state->pi_mutex;
ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter);
+ /* Current is not longer pi_blocked_on */
spin_lock(q.lock_ptr);
if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter))
ret = 0;
unqueue_me_pi(&q);
spin_unlock(q.lock_ptr);
- }
- if (ret == -EINTR) {
- /*
- * We've already been requeued, but cannot restart by calling
- * futex_lock_pi() directly. We could restart this syscall, but
- * it would detect that the user space "val" changed and return
- * -EWOULDBLOCK. Save the overhead of the restart and return
- * -EWOULDBLOCK directly.
- */
- ret = -EWOULDBLOCK;
+ if (ret == -EINTR) {
+ /*
+ * We've already been requeued, but cannot restart
+ * by calling futex_lock_pi() directly. We could
+ * restart this syscall, but it would detect that
+ * the user space "val" changed and return
+ * -EWOULDBLOCK. Save the overhead of the restart
+ * and return -EWOULDBLOCK directly.
+ */
+ ret = -EWOULDBLOCK;
+ }
+ break;
+ default:
+ BUG();
}
out:
goto fail_npresmsk;
/* Stabilize the cpumasks */
- get_online_cpus();
+ cpus_read_lock();
build_node_to_cpumask(node_to_cpumask);
/* Spread on present CPUs starting from affd->pre_vectors */
nr_others = ret;
fail_build_affinity:
- put_online_cpus();
+ cpus_read_unlock();
if (ret >= 0)
WARN_ON(nr_present + nr_others < numvecs);
if (affd->calc_sets) {
set_vecs = maxvec - resv;
} else {
- get_online_cpus();
+ cpus_read_lock();
set_vecs = cpumask_weight(cpu_possible_mask);
- put_online_cpus();
+ cpus_read_unlock();
}
return resv + min(set_vecs, maxvec - resv);
raw_spin_unlock(&desc->lock);
if (affinity_broken) {
- pr_warn_ratelimited("IRQ %u: no longer affine to CPU%u\n",
+ pr_debug_ratelimited("IRQ %u: no longer affine to CPU%u\n",
irq, smp_processor_id());
}
}
void __iomem *reg_base, irq_flow_handler_t handler)
{
struct irq_chip_generic *gc;
- unsigned long sz = sizeof(*gc) + num_ct * sizeof(struct irq_chip_type);
- gc = kzalloc(sz, GFP_KERNEL);
+ gc = kzalloc(struct_size(gc, chip_types, num_ct), GFP_KERNEL);
if (gc) {
irq_init_generic_chip(gc, name, num_ct, irq_base, reg_base,
handler);
{
struct irq_domain_chip_generic *dgc;
struct irq_chip_generic *gc;
- int numchips, sz, i;
unsigned long flags;
+ int numchips, i;
+ size_t dgc_sz;
+ size_t gc_sz;
+ size_t sz;
void *tmp;
if (d->gc)
return -EINVAL;
/* Allocate a pointer, generic chip and chiptypes for each chip */
- sz = sizeof(*dgc) + numchips * sizeof(gc);
- sz += numchips * (sizeof(*gc) + num_ct * sizeof(struct irq_chip_type));
+ gc_sz = struct_size(gc, chip_types, num_ct);
+ dgc_sz = struct_size(dgc, gc, numchips);
+ sz = dgc_sz + numchips * gc_sz;
tmp = dgc = kzalloc(sz, GFP_KERNEL);
if (!dgc)
d->gc = dgc;
/* Calc pointer to the first generic chip */
- tmp += sizeof(*dgc) + numchips * sizeof(gc);
+ tmp += dgc_sz;
for (i = 0; i < numchips; i++) {
/* Store the pointer to the generic chip */
dgc->gc[i] = gc = tmp;
list_add_tail(&gc->list, &gc_list);
raw_spin_unlock_irqrestore(&gc_lock, flags);
/* Calc pointer to the next generic chip */
- tmp += sizeof(*gc) + num_ct * sizeof(struct irq_chip_type);
+ tmp += gc_sz;
}
return 0;
}
/**
* irq_reserve_ipi() - Setup an IPI to destination cpumask
* @domain: IPI domain
- * @dest: cpumask of cpus which can receive the IPI
+ * @dest: cpumask of CPUs which can receive the IPI
*
* Allocate a virq that can be used to send IPI to any CPU in dest mask.
*
- * On success it'll return linux irq number and error code on failure
+ * Return: Linux IRQ number on success or error code on failure
*/
int irq_reserve_ipi(struct irq_domain *domain,
const struct cpumask *dest)
/**
* irq_destroy_ipi() - unreserve an IPI that was previously allocated
- * @irq: linux irq number to be destroyed
- * @dest: cpumask of cpus which should have the IPI removed
+ * @irq: Linux IRQ number to be destroyed
+ * @dest: cpumask of CPUs which should have the IPI removed
*
* The IPIs allocated with irq_reserve_ipi() are returned to the system
* destroying all virqs associated with them.
*
- * Return 0 on success or error code on failure.
+ * Return: %0 on success or error code on failure.
*/
int irq_destroy_ipi(unsigned int irq, const struct cpumask *dest)
{
}
/**
- * ipi_get_hwirq - Get the hwirq associated with an IPI to a cpu
- * @irq: linux irq number
- * @cpu: the target cpu
+ * ipi_get_hwirq - Get the hwirq associated with an IPI to a CPU
+ * @irq: Linux IRQ number
+ * @cpu: the target CPU
*
* When dealing with coprocessors IPI, we need to inform the coprocessor of
* the hwirq it needs to use to receive and send IPIs.
*
- * Returns hwirq value on success and INVALID_HWIRQ on failure.
+ * Return: hwirq value on success or INVALID_HWIRQ on failure.
*/
irq_hw_number_t ipi_get_hwirq(unsigned int irq, unsigned int cpu)
{
* This function is for architecture or core code to speed up IPI sending. Not
* usable from driver code.
*
- * Returns zero on success and negative error number on failure.
+ * Return: %0 on success or negative error number on failure.
*/
int __ipi_send_single(struct irq_desc *desc, unsigned int cpu)
{
}
/**
- * ipi_send_mask - send an IPI to target Linux SMP CPU(s)
+ * __ipi_send_mask - send an IPI to target Linux SMP CPU(s)
* @desc: pointer to irq_desc of the IRQ
* @dest: dest CPU(s), must be a subset of the mask passed to
* irq_reserve_ipi()
* This function is for architecture or core code to speed up IPI sending. Not
* usable from driver code.
*
- * Returns zero on success and negative error number on failure.
+ * Return: %0 on success or negative error number on failure.
*/
int __ipi_send_mask(struct irq_desc *desc, const struct cpumask *dest)
{
/**
* ipi_send_single - Send an IPI to a single CPU
- * @virq: linux irq number from irq_reserve_ipi()
+ * @virq: Linux IRQ number from irq_reserve_ipi()
* @cpu: destination CPU, must in the destination mask passed to
* irq_reserve_ipi()
*
- * Returns zero on success and negative error number on failure.
+ * Return: %0 on success or negative error number on failure.
*/
int ipi_send_single(unsigned int virq, unsigned int cpu)
{
/**
* ipi_send_mask - Send an IPI to target CPU(s)
- * @virq: linux irq number from irq_reserve_ipi()
+ * @virq: Linux IRQ number from irq_reserve_ipi()
* @dest: dest CPU(s), must be a subset of the mask passed to
* irq_reserve_ipi()
*
- * Returns zero on success and negative error number on failure.
+ * Return: %0 on success or negative error number on failure.
*/
int ipi_send_mask(unsigned int virq, const struct cpumask *dest)
{
raw_spin_lock_irq(&desc->lock);
if (desc->irq_data.domain)
- ret = sprintf(buf, "%d\n", (int)desc->irq_data.hwirq);
+ ret = sprintf(buf, "%lu\n", desc->irq_data.hwirq);
raw_spin_unlock_irq(&desc->lock);
return ret;
irqd->chip = ERR_PTR(-ENOTCONN);
return 0;
}
+EXPORT_SYMBOL_GPL(irq_domain_disconnect_hierarchy);
static int irq_domain_trim_hierarchy(unsigned int virq)
{
#include "internals.h"
#if defined(CONFIG_IRQ_FORCED_THREADING) && !defined(CONFIG_PREEMPT_RT)
-__read_mostly bool force_irqthreads;
-EXPORT_SYMBOL_GPL(force_irqthreads);
+DEFINE_STATIC_KEY_FALSE(force_irqthreads_key);
static int __init setup_forced_irqthreads(char *arg)
{
- force_irqthreads = true;
+ static_branch_enable(&force_irqthreads_key);
return 0;
}
early_param("threadirqs", setup_forced_irqthreads);
irqreturn_t (*handler_fn)(struct irq_desc *desc,
struct irqaction *action);
- if (force_irqthreads && test_bit(IRQTF_FORCED_THREAD,
- &action->thread_flags))
+ if (force_irqthreads() && test_bit(IRQTF_FORCED_THREAD,
+ &action->thread_flags))
handler_fn = irq_forced_thread_fn;
else
handler_fn = irq_thread_fn;
static int irq_setup_forced_threading(struct irqaction *new)
{
- if (!force_irqthreads)
+ if (!force_irqthreads())
return 0;
if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))
return 0;
* request_threaded_irq - allocate an interrupt line
* @irq: Interrupt line to allocate
* @handler: Function to be called when the IRQ occurs.
- * Primary handler for threaded interrupts
- * If NULL and thread_fn != NULL the default
- * primary handler is installed
+ * Primary handler for threaded interrupts.
+ * If handler is NULL and thread_fn != NULL
+ * the default primary handler is installed.
* @thread_fn: Function called from the irq handler thread
* If NULL, no irq thread is created
* @irqflags: Interrupt type flags
*
* IRQF_SHARED Interrupt is shared
* IRQF_TRIGGER_* Specify active edge(s) or level
- *
+ * IRQF_ONESHOT Run thread_fn with interrupt line masked
*/
int request_threaded_irq(unsigned int irq, irq_handler_t handler,
irq_handler_t thread_fn, unsigned long irqflags,
/**
* irq_matrix_alloc_managed - Allocate a managed interrupt in a CPU map
* @m: Matrix pointer
- * @cpu: On which CPU the interrupt should be allocated
+ * @msk: Which CPUs to search in
+ * @mapped_cpu: Pointer to store the CPU for which the irq was allocated
*/
int irq_matrix_alloc_managed(struct irq_matrix *m, const struct cpumask *msk,
unsigned int *mapped_cpu)
#include <linux/irqdomain.h>
#include <linux/msi.h>
#include <linux/slab.h>
+#include <linux/pci.h>
#include "internals.h"
/**
- * alloc_msi_entry - Allocate an initialize msi_entry
+ * alloc_msi_entry - Allocate an initialized msi_desc
* @dev: Pointer to the device for which this is allocated
* @nvec: The number of vectors used in this entry
* @affinity: Optional pointer to an affinity mask array size of @nvec
*
- * If @affinity is not NULL then an affinity array[@nvec] is allocated
+ * If @affinity is not %NULL then an affinity array[@nvec] is allocated
* and the affinity masks and flags from @affinity are copied.
+ *
+ * Return: pointer to allocated &msi_desc on success or %NULL on failure
*/
struct msi_desc *alloc_msi_entry(struct device *dev, int nvec,
const struct irq_affinity_desc *affinity)
}
EXPORT_SYMBOL_GPL(get_cached_msi_msg);
+static ssize_t msi_mode_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct msi_desc *entry;
+ bool is_msix = false;
+ unsigned long irq;
+ int retval;
+
+ retval = kstrtoul(attr->attr.name, 10, &irq);
+ if (retval)
+ return retval;
+
+ entry = irq_get_msi_desc(irq);
+ if (!entry)
+ return -ENODEV;
+
+ if (dev_is_pci(dev))
+ is_msix = entry->msi_attrib.is_msix;
+
+ return sysfs_emit(buf, "%s\n", is_msix ? "msix" : "msi");
+}
+
+/**
+ * msi_populate_sysfs - Populate msi_irqs sysfs entries for devices
+ * @dev: The device(PCI, platform etc) who will get sysfs entries
+ *
+ * Return attribute_group ** so that specific bus MSI can save it to
+ * somewhere during initilizing msi irqs. If devices has no MSI irq,
+ * return NULL; if it fails to populate sysfs, return ERR_PTR
+ */
+const struct attribute_group **msi_populate_sysfs(struct device *dev)
+{
+ const struct attribute_group **msi_irq_groups;
+ struct attribute **msi_attrs, *msi_attr;
+ struct device_attribute *msi_dev_attr;
+ struct attribute_group *msi_irq_group;
+ struct msi_desc *entry;
+ int ret = -ENOMEM;
+ int num_msi = 0;
+ int count = 0;
+ int i;
+
+ /* Determine how many msi entries we have */
+ for_each_msi_entry(entry, dev)
+ num_msi += entry->nvec_used;
+ if (!num_msi)
+ return NULL;
+
+ /* Dynamically create the MSI attributes for the device */
+ msi_attrs = kcalloc(num_msi + 1, sizeof(void *), GFP_KERNEL);
+ if (!msi_attrs)
+ return ERR_PTR(-ENOMEM);
+
+ for_each_msi_entry(entry, dev) {
+ for (i = 0; i < entry->nvec_used; i++) {
+ msi_dev_attr = kzalloc(sizeof(*msi_dev_attr), GFP_KERNEL);
+ if (!msi_dev_attr)
+ goto error_attrs;
+ msi_attrs[count] = &msi_dev_attr->attr;
+
+ sysfs_attr_init(&msi_dev_attr->attr);
+ msi_dev_attr->attr.name = kasprintf(GFP_KERNEL, "%d",
+ entry->irq + i);
+ if (!msi_dev_attr->attr.name)
+ goto error_attrs;
+ msi_dev_attr->attr.mode = 0444;
+ msi_dev_attr->show = msi_mode_show;
+ ++count;
+ }
+ }
+
+ msi_irq_group = kzalloc(sizeof(*msi_irq_group), GFP_KERNEL);
+ if (!msi_irq_group)
+ goto error_attrs;
+ msi_irq_group->name = "msi_irqs";
+ msi_irq_group->attrs = msi_attrs;
+
+ msi_irq_groups = kcalloc(2, sizeof(void *), GFP_KERNEL);
+ if (!msi_irq_groups)
+ goto error_irq_group;
+ msi_irq_groups[0] = msi_irq_group;
+
+ ret = sysfs_create_groups(&dev->kobj, msi_irq_groups);
+ if (ret)
+ goto error_irq_groups;
+
+ return msi_irq_groups;
+
+error_irq_groups:
+ kfree(msi_irq_groups);
+error_irq_group:
+ kfree(msi_irq_group);
+error_attrs:
+ count = 0;
+ msi_attr = msi_attrs[count];
+ while (msi_attr) {
+ msi_dev_attr = container_of(msi_attr, struct device_attribute, attr);
+ kfree(msi_attr->name);
+ kfree(msi_dev_attr);
+ ++count;
+ msi_attr = msi_attrs[count];
+ }
+ kfree(msi_attrs);
+ return ERR_PTR(ret);
+}
+
+/**
+ * msi_destroy_sysfs - Destroy msi_irqs sysfs entries for devices
+ * @dev: The device(PCI, platform etc) who will remove sysfs entries
+ * @msi_irq_groups: attribute_group for device msi_irqs entries
+ */
+void msi_destroy_sysfs(struct device *dev, const struct attribute_group **msi_irq_groups)
+{
+ struct device_attribute *dev_attr;
+ struct attribute **msi_attrs;
+ int count = 0;
+
+ if (msi_irq_groups) {
+ sysfs_remove_groups(&dev->kobj, msi_irq_groups);
+ msi_attrs = msi_irq_groups[0]->attrs;
+ while (msi_attrs[count]) {
+ dev_attr = container_of(msi_attrs[count],
+ struct device_attribute, attr);
+ kfree(dev_attr->attr.name);
+ kfree(dev_attr);
+ ++count;
+ }
+ kfree(msi_attrs);
+ kfree(msi_irq_groups[0]);
+ kfree(msi_irq_groups);
+ }
+}
+
#ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
static inline void irq_chip_write_msi_msg(struct irq_data *data,
struct msi_msg *msg)
*
* Intended to be used by MSI interrupt controllers which are
* implemented with hierarchical domains.
+ *
+ * Return: IRQ_SET_MASK_* result code
*/
int msi_domain_set_affinity(struct irq_data *irq_data,
const struct cpumask *mask, bool force)
}
/**
- * msi_create_irq_domain - Create a MSI interrupt domain
+ * msi_create_irq_domain - Create an MSI interrupt domain
* @fwnode: Optional fwnode of the interrupt controller
* @info: MSI domain info
* @parent: Parent irq domain
+ *
+ * Return: pointer to the created &struct irq_domain or %NULL on failure
*/
struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode,
struct msi_domain_info *info,
* are allocated
* @nvec: The number of interrupts to allocate
*
- * Returns 0 on success or an error code.
+ * Return: %0 on success or an error code.
*/
int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
int nvec)
}
/**
- * __msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain associated tp @dev
+ * msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain associated to @dev
* @domain: The domain to managing the interrupts
* @dev: Pointer to device struct of the device for which the interrupts
* are free
* msi_get_domain_info - Get the MSI interrupt domain info for @domain
* @domain: The interrupt domain to retrieve data from
*
- * Returns the pointer to the msi_domain_info stored in
- * @domain->host_data.
+ * Return: the pointer to the msi_domain_info stored in @domain->host_data.
*/
struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain)
{
}
/**
- * irq_pm_syscore_ops - enable interrupt lines early
+ * irq_pm_syscore_resume - enable interrupt lines early
*
* Enable all interrupt lines with %IRQF_EARLY_RESUME set.
*/
seq_printf(p, " %8s", "None");
}
if (desc->irq_data.domain)
- seq_printf(p, " %*d", prec, (int) desc->irq_data.hwirq);
+ seq_printf(p, " %*lu", prec, desc->irq_data.hwirq);
else
seq_printf(p, " %*s", prec, "");
#ifdef CONFIG_GENERIC_IRQ_SHOW_LEVEL
__irq_timings_store(irq, irqs, ti->intervals[i]);
if (irqs->circ_timings[i & IRQ_TIMINGS_MASK] != index) {
+ ret = -EBADSLT;
pr_err("Failed to store in the circular buffer\n");
goto out;
}
}
if (irqs->count != ti->count) {
+ ret = -ERANGE;
pr_err("Count differs\n");
goto out;
}
{
const struct kcsan_ctx ctx_save = current->kcsan_ctx;
const bool was_enabled = READ_ONCE(kcsan_enabled);
- cycles_t cycles;
+ u64 cycles;
/* We may have been called from an atomic region; reset context. */
memset(¤t->kcsan_ctx, 0, sizeof(current->kcsan_ctx));
obj-$(CONFIG_LOCK_SPIN_ON_OWNER) += osq_lock.o
obj-$(CONFIG_PROVE_LOCKING) += spinlock.o
obj-$(CONFIG_QUEUED_SPINLOCKS) += qspinlock.o
-obj-$(CONFIG_RT_MUTEXES) += rtmutex.o
+obj-$(CONFIG_RT_MUTEXES) += rtmutex_api.o
+obj-$(CONFIG_PREEMPT_RT) += spinlock_rt.o ww_rt_mutex.o
obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock.o
obj-$(CONFIG_DEBUG_SPINLOCK) += spinlock_debug.o
obj-$(CONFIG_QUEUED_RWLOCKS) += qrwlock.o
static struct task_struct **reader_tasks;
static bool lock_is_write_held;
-static bool lock_is_read_held;
+static atomic_t lock_is_read_held;
static unsigned long last_lock_release;
struct lock_stress_stats {
if (WARN_ON_ONCE(lock_is_write_held))
lwsp->n_lock_fail++;
lock_is_write_held = true;
- if (WARN_ON_ONCE(lock_is_read_held))
+ if (WARN_ON_ONCE(atomic_read(&lock_is_read_held)))
lwsp->n_lock_fail++; /* rare, but... */
lwsp->n_lock_acquired++;
schedule_timeout_uninterruptible(1);
cxt.cur_ops->readlock(tid);
- lock_is_read_held = true;
+ atomic_inc(&lock_is_read_held);
if (WARN_ON_ONCE(lock_is_write_held))
lrsp->n_lock_fail++; /* rare, but... */
lrsp->n_lock_acquired++;
cxt.cur_ops->read_delay(&rand);
- lock_is_read_held = false;
+ atomic_dec(&lock_is_read_held);
cxt.cur_ops->readunlock(tid);
stutter_wait("lock_torture_reader");
static void __torture_print_stats(char *page,
struct lock_stress_stats *statp, bool write)
{
+ long cur;
bool fail = false;
int i, n_stress;
- long max = 0, min = statp ? statp[0].n_lock_acquired : 0;
+ long max = 0, min = statp ? data_race(statp[0].n_lock_acquired) : 0;
long long sum = 0;
n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;
for (i = 0; i < n_stress; i++) {
- if (statp[i].n_lock_fail)
+ if (data_race(statp[i].n_lock_fail))
fail = true;
- sum += statp[i].n_lock_acquired;
- if (max < statp[i].n_lock_acquired)
- max = statp[i].n_lock_acquired;
- if (min > statp[i].n_lock_acquired)
- min = statp[i].n_lock_acquired;
+ cur = data_race(statp[i].n_lock_acquired);
+ sum += cur;
+ if (max < cur)
+ max = cur;
+ if (min > cur)
+ min = cur;
}
page += sprintf(page,
"%s: Total: %lld Max/Min: %ld/%ld %s Fail: %d %s\n",
}
if (nreaders_stress) {
- lock_is_read_held = false;
cxt.lrsa = kmalloc_array(cxt.nrealreaders_stress,
sizeof(*cxt.lrsa),
GFP_KERNEL);
/*
- * kernel/mutex-debug.c
- *
* Debugging code for mutexes
*
* Started by Ingo Molnar:
#include <linux/interrupt.h>
#include <linux/debug_locks.h>
-#include "mutex-debug.h"
+#include "mutex.h"
/*
* Must be called with lock->wait_lock held.
memset(waiter, MUTEX_DEBUG_INIT, sizeof(*waiter));
waiter->magic = waiter;
INIT_LIST_HEAD(&waiter->list);
+ waiter->ww_ctx = MUTEX_POISON_WW_CTX;
}
void debug_mutex_wake_waiter(struct mutex *lock, struct mutex_waiter *waiter)
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Mutexes: blocking mutual exclusion locks
- *
- * started by Ingo Molnar:
- *
- * Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar <mingo@redhat.com>
- *
- * This file contains mutex debugging related internal declarations,
- * prototypes and inline functions, for the CONFIG_DEBUG_MUTEXES case.
- * More details are in kernel/mutex-debug.c.
- */
-
-/*
- * This must be called with lock->wait_lock held.
- */
-extern void debug_mutex_lock_common(struct mutex *lock,
- struct mutex_waiter *waiter);
-extern void debug_mutex_wake_waiter(struct mutex *lock,
- struct mutex_waiter *waiter);
-extern void debug_mutex_free_waiter(struct mutex_waiter *waiter);
-extern void debug_mutex_add_waiter(struct mutex *lock,
- struct mutex_waiter *waiter,
- struct task_struct *task);
-extern void debug_mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,
- struct task_struct *task);
-extern void debug_mutex_unlock(struct mutex *lock);
-extern void debug_mutex_init(struct mutex *lock, const char *name,
- struct lock_class_key *key);
#include <linux/debug_locks.h>
#include <linux/osq_lock.h>
+#ifndef CONFIG_PREEMPT_RT
+#include "mutex.h"
+
#ifdef CONFIG_DEBUG_MUTEXES
-# include "mutex-debug.h"
+# define MUTEX_WARN_ON(cond) DEBUG_LOCKS_WARN_ON(cond)
#else
-# include "mutex.h"
+# define MUTEX_WARN_ON(cond)
#endif
void
__mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
{
atomic_long_set(&lock->owner, 0);
- spin_lock_init(&lock->wait_lock);
+ raw_spin_lock_init(&lock->wait_lock);
INIT_LIST_HEAD(&lock->wait_list);
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
osq_lock_init(&lock->osq);
return owner & MUTEX_FLAGS;
}
-/*
- * Trylock variant that returns the owning task on failure.
- */
-static inline struct task_struct *__mutex_trylock_or_owner(struct mutex *lock)
+static inline struct task_struct *__mutex_trylock_common(struct mutex *lock, bool handoff)
{
unsigned long owner, curr = (unsigned long)current;
owner = atomic_long_read(&lock->owner);
for (;;) { /* must loop, can race against a flag */
- unsigned long old, flags = __owner_flags(owner);
+ unsigned long flags = __owner_flags(owner);
unsigned long task = owner & ~MUTEX_FLAGS;
if (task) {
- if (likely(task != curr))
- break;
-
- if (likely(!(flags & MUTEX_FLAG_PICKUP)))
+ if (flags & MUTEX_FLAG_PICKUP) {
+ if (task != curr)
+ break;
+ flags &= ~MUTEX_FLAG_PICKUP;
+ } else if (handoff) {
+ if (flags & MUTEX_FLAG_HANDOFF)
+ break;
+ flags |= MUTEX_FLAG_HANDOFF;
+ } else {
break;
-
- flags &= ~MUTEX_FLAG_PICKUP;
+ }
} else {
-#ifdef CONFIG_DEBUG_MUTEXES
- DEBUG_LOCKS_WARN_ON(flags & MUTEX_FLAG_PICKUP);
-#endif
+ MUTEX_WARN_ON(flags & (MUTEX_FLAG_HANDOFF | MUTEX_FLAG_PICKUP));
+ task = curr;
}
- /*
- * We set the HANDOFF bit, we must make sure it doesn't live
- * past the point where we acquire it. This would be possible
- * if we (accidentally) set the bit on an unlocked mutex.
- */
- flags &= ~MUTEX_FLAG_HANDOFF;
-
- old = atomic_long_cmpxchg_acquire(&lock->owner, owner, curr | flags);
- if (old == owner)
- return NULL;
-
- owner = old;
+ if (atomic_long_try_cmpxchg_acquire(&lock->owner, &owner, task | flags)) {
+ if (task == curr)
+ return NULL;
+ break;
+ }
}
return __owner_task(owner);
}
+/*
+ * Trylock or set HANDOFF
+ */
+static inline bool __mutex_trylock_or_handoff(struct mutex *lock, bool handoff)
+{
+ return !__mutex_trylock_common(lock, handoff);
+}
+
/*
* Actual trylock that will work on any unlocked state.
*/
static inline bool __mutex_trylock(struct mutex *lock)
{
- return !__mutex_trylock_or_owner(lock);
+ return !__mutex_trylock_common(lock, false);
}
#ifndef CONFIG_DEBUG_LOCK_ALLOC
{
unsigned long curr = (unsigned long)current;
- if (atomic_long_cmpxchg_release(&lock->owner, curr, 0UL) == curr)
- return true;
-
- return false;
+ return atomic_long_try_cmpxchg_release(&lock->owner, &curr, 0UL);
}
#endif
unsigned long owner = atomic_long_read(&lock->owner);
for (;;) {
- unsigned long old, new;
+ unsigned long new;
-#ifdef CONFIG_DEBUG_MUTEXES
- DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current);
- DEBUG_LOCKS_WARN_ON(owner & MUTEX_FLAG_PICKUP);
-#endif
+ MUTEX_WARN_ON(__owner_task(owner) != current);
+ MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP);
new = (owner & MUTEX_FLAG_WAITERS);
new |= (unsigned long)task;
if (task)
new |= MUTEX_FLAG_PICKUP;
- old = atomic_long_cmpxchg_release(&lock->owner, owner, new);
- if (old == owner)
+ if (atomic_long_try_cmpxchg_release(&lock->owner, &owner, new))
break;
-
- owner = old;
}
}
EXPORT_SYMBOL(mutex_lock);
#endif
-/*
- * Wait-Die:
- * The newer transactions are killed when:
- * It (the new transaction) makes a request for a lock being held
- * by an older transaction.
- *
- * Wound-Wait:
- * The newer transactions are wounded when:
- * An older transaction makes a request for a lock being held by
- * the newer transaction.
- */
-
-/*
- * Associate the ww_mutex @ww with the context @ww_ctx under which we acquired
- * it.
- */
-static __always_inline void
-ww_mutex_lock_acquired(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx)
-{
-#ifdef CONFIG_DEBUG_MUTEXES
- /*
- * If this WARN_ON triggers, you used ww_mutex_lock to acquire,
- * but released with a normal mutex_unlock in this call.
- *
- * This should never happen, always use ww_mutex_unlock.
- */
- DEBUG_LOCKS_WARN_ON(ww->ctx);
-
- /*
- * Not quite done after calling ww_acquire_done() ?
- */
- DEBUG_LOCKS_WARN_ON(ww_ctx->done_acquire);
+#include "ww_mutex.h"
- if (ww_ctx->contending_lock) {
- /*
- * After -EDEADLK you tried to
- * acquire a different ww_mutex? Bad!
- */
- DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock != ww);
-
- /*
- * You called ww_mutex_lock after receiving -EDEADLK,
- * but 'forgot' to unlock everything else first?
- */
- DEBUG_LOCKS_WARN_ON(ww_ctx->acquired > 0);
- ww_ctx->contending_lock = NULL;
- }
-
- /*
- * Naughty, using a different class will lead to undefined behavior!
- */
- DEBUG_LOCKS_WARN_ON(ww_ctx->ww_class != ww->ww_class);
-#endif
- ww_ctx->acquired++;
- ww->ctx = ww_ctx;
-}
-
-/*
- * Determine if context @a is 'after' context @b. IOW, @a is a younger
- * transaction than @b and depending on algorithm either needs to wait for
- * @b or die.
- */
-static inline bool __sched
-__ww_ctx_stamp_after(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
-{
-
- return (signed long)(a->stamp - b->stamp) > 0;
-}
-
-/*
- * Wait-Die; wake a younger waiter context (when locks held) such that it can
- * die.
- *
- * Among waiters with context, only the first one can have other locks acquired
- * already (ctx->acquired > 0), because __ww_mutex_add_waiter() and
- * __ww_mutex_check_kill() wake any but the earliest context.
- */
-static bool __sched
-__ww_mutex_die(struct mutex *lock, struct mutex_waiter *waiter,
- struct ww_acquire_ctx *ww_ctx)
-{
- if (!ww_ctx->is_wait_die)
- return false;
-
- if (waiter->ww_ctx->acquired > 0 &&
- __ww_ctx_stamp_after(waiter->ww_ctx, ww_ctx)) {
- debug_mutex_wake_waiter(lock, waiter);
- wake_up_process(waiter->task);
- }
-
- return true;
-}
-
-/*
- * Wound-Wait; wound a younger @hold_ctx if it holds the lock.
- *
- * Wound the lock holder if there are waiters with older transactions than
- * the lock holders. Even if multiple waiters may wound the lock holder,
- * it's sufficient that only one does.
- */
-static bool __ww_mutex_wound(struct mutex *lock,
- struct ww_acquire_ctx *ww_ctx,
- struct ww_acquire_ctx *hold_ctx)
-{
- struct task_struct *owner = __mutex_owner(lock);
-
- lockdep_assert_held(&lock->wait_lock);
-
- /*
- * Possible through __ww_mutex_add_waiter() when we race with
- * ww_mutex_set_context_fastpath(). In that case we'll get here again
- * through __ww_mutex_check_waiters().
- */
- if (!hold_ctx)
- return false;
-
- /*
- * Can have !owner because of __mutex_unlock_slowpath(), but if owner,
- * it cannot go away because we'll have FLAG_WAITERS set and hold
- * wait_lock.
- */
- if (!owner)
- return false;
-
- if (ww_ctx->acquired > 0 && __ww_ctx_stamp_after(hold_ctx, ww_ctx)) {
- hold_ctx->wounded = 1;
-
- /*
- * wake_up_process() paired with set_current_state()
- * inserts sufficient barriers to make sure @owner either sees
- * it's wounded in __ww_mutex_check_kill() or has a
- * wakeup pending to re-read the wounded state.
- */
- if (owner != current)
- wake_up_process(owner);
-
- return true;
- }
-
- return false;
-}
-
-/*
- * We just acquired @lock under @ww_ctx, if there are later contexts waiting
- * behind us on the wait-list, check if they need to die, or wound us.
- *
- * See __ww_mutex_add_waiter() for the list-order construction; basically the
- * list is ordered by stamp, smallest (oldest) first.
- *
- * This relies on never mixing wait-die/wound-wait on the same wait-list;
- * which is currently ensured by that being a ww_class property.
- *
- * The current task must not be on the wait list.
- */
-static void __sched
-__ww_mutex_check_waiters(struct mutex *lock, struct ww_acquire_ctx *ww_ctx)
-{
- struct mutex_waiter *cur;
-
- lockdep_assert_held(&lock->wait_lock);
-
- list_for_each_entry(cur, &lock->wait_list, list) {
- if (!cur->ww_ctx)
- continue;
-
- if (__ww_mutex_die(lock, cur, ww_ctx) ||
- __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx))
- break;
- }
-}
+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
/*
- * After acquiring lock with fastpath, where we do not hold wait_lock, set ctx
- * and wake up any waiters so they can recheck.
+ * Trylock variant that returns the owning task on failure.
*/
-static __always_inline void
-ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+static inline struct task_struct *__mutex_trylock_or_owner(struct mutex *lock)
{
- ww_mutex_lock_acquired(lock, ctx);
-
- /*
- * The lock->ctx update should be visible on all cores before
- * the WAITERS check is done, otherwise contended waiters might be
- * missed. The contended waiters will either see ww_ctx == NULL
- * and keep spinning, or it will acquire wait_lock, add itself
- * to waiter list and sleep.
- */
- smp_mb(); /* See comments above and below. */
-
- /*
- * [W] ww->ctx = ctx [W] MUTEX_FLAG_WAITERS
- * MB MB
- * [R] MUTEX_FLAG_WAITERS [R] ww->ctx
- *
- * The memory barrier above pairs with the memory barrier in
- * __ww_mutex_add_waiter() and makes sure we either observe ww->ctx
- * and/or !empty list.
- */
- if (likely(!(atomic_long_read(&lock->base.owner) & MUTEX_FLAG_WAITERS)))
- return;
-
- /*
- * Uh oh, we raced in fastpath, check if any of the waiters need to
- * die or wound us.
- */
- spin_lock(&lock->base.wait_lock);
- __ww_mutex_check_waiters(&lock->base, ctx);
- spin_unlock(&lock->base.wait_lock);
+ return __mutex_trylock_common(lock, false);
}
-#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
-
static inline
bool ww_mutex_spin_on_owner(struct mutex *lock, struct ww_acquire_ctx *ww_ctx,
struct mutex_waiter *waiter)
*/
void __sched ww_mutex_unlock(struct ww_mutex *lock)
{
- /*
- * The unlocking fastpath is the 0->1 transition from 'locked'
- * into 'unlocked' state:
- */
- if (lock->ctx) {
-#ifdef CONFIG_DEBUG_MUTEXES
- DEBUG_LOCKS_WARN_ON(!lock->ctx->acquired);
-#endif
- if (lock->ctx->acquired > 0)
- lock->ctx->acquired--;
- lock->ctx = NULL;
- }
-
+ __ww_mutex_unlock(lock);
mutex_unlock(&lock->base);
}
EXPORT_SYMBOL(ww_mutex_unlock);
-
-static __always_inline int __sched
-__ww_mutex_kill(struct mutex *lock, struct ww_acquire_ctx *ww_ctx)
-{
- if (ww_ctx->acquired > 0) {
-#ifdef CONFIG_DEBUG_MUTEXES
- struct ww_mutex *ww;
-
- ww = container_of(lock, struct ww_mutex, base);
- DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock);
- ww_ctx->contending_lock = ww;
-#endif
- return -EDEADLK;
- }
-
- return 0;
-}
-
-
-/*
- * Check the wound condition for the current lock acquire.
- *
- * Wound-Wait: If we're wounded, kill ourself.
- *
- * Wait-Die: If we're trying to acquire a lock already held by an older
- * context, kill ourselves.
- *
- * Since __ww_mutex_add_waiter() orders the wait-list on stamp, we only have to
- * look at waiters before us in the wait-list.
- */
-static inline int __sched
-__ww_mutex_check_kill(struct mutex *lock, struct mutex_waiter *waiter,
- struct ww_acquire_ctx *ctx)
-{
- struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
- struct ww_acquire_ctx *hold_ctx = READ_ONCE(ww->ctx);
- struct mutex_waiter *cur;
-
- if (ctx->acquired == 0)
- return 0;
-
- if (!ctx->is_wait_die) {
- if (ctx->wounded)
- return __ww_mutex_kill(lock, ctx);
-
- return 0;
- }
-
- if (hold_ctx && __ww_ctx_stamp_after(ctx, hold_ctx))
- return __ww_mutex_kill(lock, ctx);
-
- /*
- * If there is a waiter in front of us that has a context, then its
- * stamp is earlier than ours and we must kill ourself.
- */
- cur = waiter;
- list_for_each_entry_continue_reverse(cur, &lock->wait_list, list) {
- if (!cur->ww_ctx)
- continue;
-
- return __ww_mutex_kill(lock, ctx);
- }
-
- return 0;
-}
-
-/*
- * Add @waiter to the wait-list, keep the wait-list ordered by stamp, smallest
- * first. Such that older contexts are preferred to acquire the lock over
- * younger contexts.
- *
- * Waiters without context are interspersed in FIFO order.
- *
- * Furthermore, for Wait-Die kill ourself immediately when possible (there are
- * older contexts already waiting) to avoid unnecessary waiting and for
- * Wound-Wait ensure we wound the owning context when it is younger.
- */
-static inline int __sched
-__ww_mutex_add_waiter(struct mutex_waiter *waiter,
- struct mutex *lock,
- struct ww_acquire_ctx *ww_ctx)
-{
- struct mutex_waiter *cur;
- struct list_head *pos;
- bool is_wait_die;
-
- if (!ww_ctx) {
- __mutex_add_waiter(lock, waiter, &lock->wait_list);
- return 0;
- }
-
- is_wait_die = ww_ctx->is_wait_die;
-
- /*
- * Add the waiter before the first waiter with a higher stamp.
- * Waiters without a context are skipped to avoid starving
- * them. Wait-Die waiters may die here. Wound-Wait waiters
- * never die here, but they are sorted in stamp order and
- * may wound the lock holder.
- */
- pos = &lock->wait_list;
- list_for_each_entry_reverse(cur, &lock->wait_list, list) {
- if (!cur->ww_ctx)
- continue;
-
- if (__ww_ctx_stamp_after(ww_ctx, cur->ww_ctx)) {
- /*
- * Wait-Die: if we find an older context waiting, there
- * is no point in queueing behind it, as we'd have to
- * die the moment it would acquire the lock.
- */
- if (is_wait_die) {
- int ret = __ww_mutex_kill(lock, ww_ctx);
-
- if (ret)
- return ret;
- }
-
- break;
- }
-
- pos = &cur->list;
-
- /* Wait-Die: ensure younger waiters die. */
- __ww_mutex_die(lock, cur, ww_ctx);
- }
-
- __mutex_add_waiter(lock, waiter, pos);
-
- /*
- * Wound-Wait: if we're blocking on a mutex owned by a younger context,
- * wound that such that we might proceed.
- */
- if (!is_wait_die) {
- struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
-
- /*
- * See ww_mutex_set_context_fastpath(). Orders setting
- * MUTEX_FLAG_WAITERS vs the ww->ctx load,
- * such that either we or the fastpath will wound @ww->ctx.
- */
- smp_mb();
- __ww_mutex_wound(lock, ww_ctx, ww->ctx);
- }
-
- return 0;
-}
-
/*
* Lock a mutex (possibly interruptible), slowpath:
*/
struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
{
struct mutex_waiter waiter;
- bool first = false;
struct ww_mutex *ww;
int ret;
might_sleep();
-#ifdef CONFIG_DEBUG_MUTEXES
- DEBUG_LOCKS_WARN_ON(lock->magic != lock);
-#endif
+ MUTEX_WARN_ON(lock->magic != lock);
ww = container_of(lock, struct ww_mutex, base);
if (ww_ctx) {
*/
if (ww_ctx->acquired == 0)
ww_ctx->wounded = 0;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ nest_lock = &ww_ctx->dep_map;
+#endif
}
preempt_disable();
return 0;
}
- spin_lock(&lock->wait_lock);
+ raw_spin_lock(&lock->wait_lock);
/*
* After waiting to acquire the wait_lock, try again.
*/
}
debug_mutex_lock_common(lock, &waiter);
+ waiter.task = current;
+ if (use_ww_ctx)
+ waiter.ww_ctx = ww_ctx;
lock_contended(&lock->dep_map, ip);
if (!use_ww_ctx) {
/* add waiting tasks to the end of the waitqueue (FIFO): */
__mutex_add_waiter(lock, &waiter, &lock->wait_list);
-
-
-#ifdef CONFIG_DEBUG_MUTEXES
- waiter.ww_ctx = MUTEX_POISON_WW_CTX;
-#endif
} else {
/*
* Add in stamp order, waking up waiters that must kill
ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx);
if (ret)
goto err_early_kill;
-
- waiter.ww_ctx = ww_ctx;
}
- waiter.task = current;
-
set_current_state(state);
for (;;) {
+ bool first;
+
/*
* Once we hold wait_lock, we're serialized against
* mutex_unlock() handing the lock off to us, do a trylock
goto err;
}
- spin_unlock(&lock->wait_lock);
+ raw_spin_unlock(&lock->wait_lock);
schedule_preempt_disabled();
- /*
- * ww_mutex needs to always recheck its position since its waiter
- * list is not FIFO ordered.
- */
- if (ww_ctx || !first) {
- first = __mutex_waiter_is_first(lock, &waiter);
- if (first)
- __mutex_set_flag(lock, MUTEX_FLAG_HANDOFF);
- }
+ first = __mutex_waiter_is_first(lock, &waiter);
set_current_state(state);
/*
* state back to RUNNING and fall through the next schedule(),
* or we must see its unlock and acquire.
*/
- if (__mutex_trylock(lock) ||
+ if (__mutex_trylock_or_handoff(lock, first) ||
(first && mutex_optimistic_spin(lock, ww_ctx, &waiter)))
break;
- spin_lock(&lock->wait_lock);
+ raw_spin_lock(&lock->wait_lock);
}
- spin_lock(&lock->wait_lock);
+ raw_spin_lock(&lock->wait_lock);
acquired:
__set_current_state(TASK_RUNNING);
if (ww_ctx)
ww_mutex_lock_acquired(ww, ww_ctx);
- spin_unlock(&lock->wait_lock);
+ raw_spin_unlock(&lock->wait_lock);
preempt_enable();
return 0;
__set_current_state(TASK_RUNNING);
__mutex_remove_waiter(lock, &waiter);
err_early_kill:
- spin_unlock(&lock->wait_lock);
+ raw_spin_unlock(&lock->wait_lock);
debug_mutex_free_waiter(&waiter);
mutex_release(&lock->dep_map, ip);
preempt_enable();
static int __sched
__ww_mutex_lock(struct mutex *lock, unsigned int state, unsigned int subclass,
- struct lockdep_map *nest_lock, unsigned long ip,
- struct ww_acquire_ctx *ww_ctx)
+ unsigned long ip, struct ww_acquire_ctx *ww_ctx)
{
- return __mutex_lock_common(lock, state, subclass, nest_lock, ip, ww_ctx, true);
+ return __mutex_lock_common(lock, state, subclass, NULL, ip, ww_ctx, true);
}
#ifdef CONFIG_DEBUG_LOCK_ALLOC
might_sleep();
ret = __ww_mutex_lock(&lock->base, TASK_UNINTERRUPTIBLE,
- 0, ctx ? &ctx->dep_map : NULL, _RET_IP_,
- ctx);
+ 0, _RET_IP_, ctx);
if (!ret && ctx && ctx->acquired > 1)
return ww_mutex_deadlock_injection(lock, ctx);
might_sleep();
ret = __ww_mutex_lock(&lock->base, TASK_INTERRUPTIBLE,
- 0, ctx ? &ctx->dep_map : NULL, _RET_IP_,
- ctx);
+ 0, _RET_IP_, ctx);
if (!ret && ctx && ctx->acquired > 1)
return ww_mutex_deadlock_injection(lock, ctx);
*/
owner = atomic_long_read(&lock->owner);
for (;;) {
- unsigned long old;
-
-#ifdef CONFIG_DEBUG_MUTEXES
- DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current);
- DEBUG_LOCKS_WARN_ON(owner & MUTEX_FLAG_PICKUP);
-#endif
+ MUTEX_WARN_ON(__owner_task(owner) != current);
+ MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP);
if (owner & MUTEX_FLAG_HANDOFF)
break;
- old = atomic_long_cmpxchg_release(&lock->owner, owner,
- __owner_flags(owner));
- if (old == owner) {
+ if (atomic_long_try_cmpxchg_release(&lock->owner, &owner, __owner_flags(owner))) {
if (owner & MUTEX_FLAG_WAITERS)
break;
return;
}
-
- owner = old;
}
- spin_lock(&lock->wait_lock);
+ raw_spin_lock(&lock->wait_lock);
debug_mutex_unlock(lock);
if (!list_empty(&lock->wait_list)) {
/* get the first entry from the wait-list: */
if (owner & MUTEX_FLAG_HANDOFF)
__mutex_handoff(lock, next);
- spin_unlock(&lock->wait_lock);
+ raw_spin_unlock(&lock->wait_lock);
wake_up_q(&wake_q);
}
static noinline int __sched
__ww_mutex_lock_slowpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
{
- return __ww_mutex_lock(&lock->base, TASK_UNINTERRUPTIBLE, 0, NULL,
+ return __ww_mutex_lock(&lock->base, TASK_UNINTERRUPTIBLE, 0,
_RET_IP_, ctx);
}
__ww_mutex_lock_interruptible_slowpath(struct ww_mutex *lock,
struct ww_acquire_ctx *ctx)
{
- return __ww_mutex_lock(&lock->base, TASK_INTERRUPTIBLE, 0, NULL,
+ return __ww_mutex_lock(&lock->base, TASK_INTERRUPTIBLE, 0,
_RET_IP_, ctx);
}
{
bool locked;
-#ifdef CONFIG_DEBUG_MUTEXES
- DEBUG_LOCKS_WARN_ON(lock->magic != lock);
-#endif
+ MUTEX_WARN_ON(lock->magic != lock);
locked = __mutex_trylock(lock);
if (locked)
}
EXPORT_SYMBOL(ww_mutex_lock_interruptible);
-#endif
+#endif /* !CONFIG_DEBUG_LOCK_ALLOC */
+#endif /* !CONFIG_PREEMPT_RT */
/**
* atomic_dec_and_mutex_lock - return holding mutex if we dec to 0
* started by Ingo Molnar:
*
* Copyright (C) 2004, 2005, 2006 Red Hat, Inc., Ingo Molnar <mingo@redhat.com>
- *
- * This file contains mutex debugging related internal prototypes, for the
- * !CONFIG_DEBUG_MUTEXES case. Most of them are NOPs:
*/
-#define debug_mutex_wake_waiter(lock, waiter) do { } while (0)
-#define debug_mutex_free_waiter(waiter) do { } while (0)
-#define debug_mutex_add_waiter(lock, waiter, ti) do { } while (0)
-#define debug_mutex_remove_waiter(lock, waiter, ti) do { } while (0)
-#define debug_mutex_unlock(lock) do { } while (0)
-#define debug_mutex_init(lock, name, key) do { } while (0)
+/*
+ * This is the control structure for tasks blocked on mutex, which resides
+ * on the blocked task's kernel stack:
+ */
+struct mutex_waiter {
+ struct list_head list;
+ struct task_struct *task;
+ struct ww_acquire_ctx *ww_ctx;
+#ifdef CONFIG_DEBUG_MUTEXES
+ void *magic;
+#endif
+};
-static inline void
-debug_mutex_lock_common(struct mutex *lock, struct mutex_waiter *waiter)
-{
-}
+#ifdef CONFIG_DEBUG_MUTEXES
+extern void debug_mutex_lock_common(struct mutex *lock,
+ struct mutex_waiter *waiter);
+extern void debug_mutex_wake_waiter(struct mutex *lock,
+ struct mutex_waiter *waiter);
+extern void debug_mutex_free_waiter(struct mutex_waiter *waiter);
+extern void debug_mutex_add_waiter(struct mutex *lock,
+ struct mutex_waiter *waiter,
+ struct task_struct *task);
+extern void debug_mutex_remove_waiter(struct mutex *lock, struct mutex_waiter *waiter,
+ struct task_struct *task);
+extern void debug_mutex_unlock(struct mutex *lock);
+extern void debug_mutex_init(struct mutex *lock, const char *name,
+ struct lock_class_key *key);
+#else /* CONFIG_DEBUG_MUTEXES */
+# define debug_mutex_lock_common(lock, waiter) do { } while (0)
+# define debug_mutex_wake_waiter(lock, waiter) do { } while (0)
+# define debug_mutex_free_waiter(waiter) do { } while (0)
+# define debug_mutex_add_waiter(lock, waiter, ti) do { } while (0)
+# define debug_mutex_remove_waiter(lock, waiter, ti) do { } while (0)
+# define debug_mutex_unlock(lock) do { } while (0)
+# define debug_mutex_init(lock, name, key) do { } while (0)
+#endif /* !CONFIG_DEBUG_MUTEXES */
* Copyright (C) 2005-2006 Timesys Corp., Thomas Gleixner <tglx@timesys.com>
* Copyright (C) 2005 Kihon Technologies Inc., Steven Rostedt
* Copyright (C) 2006 Esben Nielsen
+ * Adaptive Spinlocks:
+ * Copyright (C) 2008 Novell, Inc., Gregory Haskins, Sven Dietrich,
+ * and Peter Morreale,
+ * Adaptive Spinlocks simplification:
+ * Copyright (C) 2008 Red Hat, Inc., Steven Rostedt <srostedt@redhat.com>
*
* See Documentation/locking/rt-mutex-design.rst for details.
*/
-#include <linux/spinlock.h>
-#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/sched/debug.h>
+#include <linux/sched/deadline.h>
#include <linux/sched/signal.h>
#include <linux/sched/rt.h>
-#include <linux/sched/deadline.h>
#include <linux/sched/wake_q.h>
-#include <linux/sched/debug.h>
-#include <linux/timer.h>
+#include <linux/ww_mutex.h>
#include "rtmutex_common.h"
+#ifndef WW_RT
+# define build_ww_mutex() (false)
+# define ww_container_of(rtm) NULL
+
+static inline int __ww_mutex_add_waiter(struct rt_mutex_waiter *waiter,
+ struct rt_mutex *lock,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ return 0;
+}
+
+static inline void __ww_mutex_check_waiters(struct rt_mutex *lock,
+ struct ww_acquire_ctx *ww_ctx)
+{
+}
+
+static inline void ww_mutex_lock_acquired(struct ww_mutex *lock,
+ struct ww_acquire_ctx *ww_ctx)
+{
+}
+
+static inline int __ww_mutex_check_kill(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ return 0;
+}
+
+#else
+# define build_ww_mutex() (true)
+# define ww_container_of(rtm) container_of(rtm, struct ww_mutex, base)
+# include "ww_mutex.h"
+#endif
+
/*
* lock->owner state tracking:
*
*/
static __always_inline void
-rt_mutex_set_owner(struct rt_mutex *lock, struct task_struct *owner)
+rt_mutex_set_owner(struct rt_mutex_base *lock, struct task_struct *owner)
{
unsigned long val = (unsigned long)owner;
WRITE_ONCE(lock->owner, (struct task_struct *)val);
}
-static __always_inline void clear_rt_mutex_waiters(struct rt_mutex *lock)
+static __always_inline void clear_rt_mutex_waiters(struct rt_mutex_base *lock)
{
lock->owner = (struct task_struct *)
((unsigned long)lock->owner & ~RT_MUTEX_HAS_WAITERS);
}
-static __always_inline void fixup_rt_mutex_waiters(struct rt_mutex *lock)
+static __always_inline void fixup_rt_mutex_waiters(struct rt_mutex_base *lock)
{
unsigned long owner, *p = (unsigned long *) &lock->owner;
* set up.
*/
#ifndef CONFIG_DEBUG_RT_MUTEXES
-# define rt_mutex_cmpxchg_acquire(l,c,n) (cmpxchg_acquire(&l->owner, c, n) == c)
-# define rt_mutex_cmpxchg_release(l,c,n) (cmpxchg_release(&l->owner, c, n) == c)
+static __always_inline bool rt_mutex_cmpxchg_acquire(struct rt_mutex_base *lock,
+ struct task_struct *old,
+ struct task_struct *new)
+{
+ return try_cmpxchg_acquire(&lock->owner, &old, new);
+}
+
+static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
+ struct task_struct *old,
+ struct task_struct *new)
+{
+ return try_cmpxchg_release(&lock->owner, &old, new);
+}
/*
* Callers must hold the ->wait_lock -- which is the whole purpose as we force
* all future threads that attempt to [Rmw] the lock to the slowpath. As such
* relaxed semantics suffice.
*/
-static __always_inline void mark_rt_mutex_waiters(struct rt_mutex *lock)
+static __always_inline void mark_rt_mutex_waiters(struct rt_mutex_base *lock)
{
unsigned long owner, *p = (unsigned long *) &lock->owner;
* 2) Drop lock->wait_lock
* 3) Try to unlock the lock with cmpxchg
*/
-static __always_inline bool unlock_rt_mutex_safe(struct rt_mutex *lock,
+static __always_inline bool unlock_rt_mutex_safe(struct rt_mutex_base *lock,
unsigned long flags)
__releases(lock->wait_lock)
{
}
#else
-# define rt_mutex_cmpxchg_acquire(l,c,n) (0)
-# define rt_mutex_cmpxchg_release(l,c,n) (0)
+static __always_inline bool rt_mutex_cmpxchg_acquire(struct rt_mutex_base *lock,
+ struct task_struct *old,
+ struct task_struct *new)
+{
+ return false;
+
+}
+
+static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
+ struct task_struct *old,
+ struct task_struct *new)
+{
+ return false;
+}
-static __always_inline void mark_rt_mutex_waiters(struct rt_mutex *lock)
+static __always_inline void mark_rt_mutex_waiters(struct rt_mutex_base *lock)
{
lock->owner = (struct task_struct *)
((unsigned long)lock->owner | RT_MUTEX_HAS_WAITERS);
/*
* Simple slow path only version: lock->owner is protected by lock->wait_lock.
*/
-static __always_inline bool unlock_rt_mutex_safe(struct rt_mutex *lock,
+static __always_inline bool unlock_rt_mutex_safe(struct rt_mutex_base *lock,
unsigned long flags)
__releases(lock->wait_lock)
{
}
#endif
+static __always_inline int __waiter_prio(struct task_struct *task)
+{
+ int prio = task->prio;
+
+ if (!rt_prio(prio))
+ return DEFAULT_PRIO;
+
+ return prio;
+}
+
+static __always_inline void
+waiter_update_prio(struct rt_mutex_waiter *waiter, struct task_struct *task)
+{
+ waiter->prio = __waiter_prio(task);
+ waiter->deadline = task->dl.deadline;
+}
+
/*
* Only use with rt_mutex_waiter_{less,equal}()
*/
#define task_to_waiter(p) \
- &(struct rt_mutex_waiter){ .prio = (p)->prio, .deadline = (p)->dl.deadline }
+ &(struct rt_mutex_waiter){ .prio = __waiter_prio(p), .deadline = (p)->dl.deadline }
static __always_inline int rt_mutex_waiter_less(struct rt_mutex_waiter *left,
struct rt_mutex_waiter *right)
return 1;
}
+static inline bool rt_mutex_steal(struct rt_mutex_waiter *waiter,
+ struct rt_mutex_waiter *top_waiter)
+{
+ if (rt_mutex_waiter_less(waiter, top_waiter))
+ return true;
+
+#ifdef RT_MUTEX_BUILD_SPINLOCKS
+ /*
+ * Note that RT tasks are excluded from same priority (lateral)
+ * steals to prevent the introduction of an unbounded latency.
+ */
+ if (rt_prio(waiter->prio) || dl_prio(waiter->prio))
+ return false;
+
+ return rt_mutex_waiter_equal(waiter, top_waiter);
+#else
+ return false;
+#endif
+}
+
#define __node_2_waiter(node) \
rb_entry((node), struct rt_mutex_waiter, tree_entry)
static __always_inline bool __waiter_less(struct rb_node *a, const struct rb_node *b)
{
- return rt_mutex_waiter_less(__node_2_waiter(a), __node_2_waiter(b));
+ struct rt_mutex_waiter *aw = __node_2_waiter(a);
+ struct rt_mutex_waiter *bw = __node_2_waiter(b);
+
+ if (rt_mutex_waiter_less(aw, bw))
+ return 1;
+
+ if (!build_ww_mutex())
+ return 0;
+
+ if (rt_mutex_waiter_less(bw, aw))
+ return 0;
+
+ /* NOTE: relies on waiter->ww_ctx being set before insertion */
+ if (aw->ww_ctx) {
+ if (!bw->ww_ctx)
+ return 1;
+
+ return (signed long)(aw->ww_ctx->stamp -
+ bw->ww_ctx->stamp) < 0;
+ }
+
+ return 0;
}
static __always_inline void
-rt_mutex_enqueue(struct rt_mutex *lock, struct rt_mutex_waiter *waiter)
+rt_mutex_enqueue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter)
{
rb_add_cached(&waiter->tree_entry, &lock->waiters, __waiter_less);
}
static __always_inline void
-rt_mutex_dequeue(struct rt_mutex *lock, struct rt_mutex_waiter *waiter)
+rt_mutex_dequeue(struct rt_mutex_base *lock, struct rt_mutex_waiter *waiter)
{
if (RB_EMPTY_NODE(&waiter->tree_entry))
return;
rt_mutex_setprio(p, pi_task);
}
+/* RT mutex specific wake_q wrappers */
+static __always_inline void rt_mutex_wake_q_add(struct rt_wake_q_head *wqh,
+ struct rt_mutex_waiter *w)
+{
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && w->wake_state != TASK_NORMAL) {
+ if (IS_ENABLED(CONFIG_PROVE_LOCKING))
+ WARN_ON_ONCE(wqh->rtlock_task);
+ get_task_struct(w->task);
+ wqh->rtlock_task = w->task;
+ } else {
+ wake_q_add(&wqh->head, w->task);
+ }
+}
+
+static __always_inline void rt_mutex_wake_up_q(struct rt_wake_q_head *wqh)
+{
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && wqh->rtlock_task) {
+ wake_up_state(wqh->rtlock_task, TASK_RTLOCK_WAIT);
+ put_task_struct(wqh->rtlock_task);
+ wqh->rtlock_task = NULL;
+ }
+
+ if (!wake_q_empty(&wqh->head))
+ wake_up_q(&wqh->head);
+
+ /* Pairs with preempt_disable() in mark_wakeup_next_waiter() */
+ preempt_enable();
+}
+
/*
* Deadlock detection is conditional:
*
return chwalk == RT_MUTEX_FULL_CHAINWALK;
}
-/*
- * Max number of times we'll walk the boosting chain:
- */
-int max_lock_depth = 1024;
-
-static __always_inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
+static __always_inline struct rt_mutex_base *task_blocked_on_lock(struct task_struct *p)
{
return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
}
*/
static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
enum rtmutex_chainwalk chwalk,
- struct rt_mutex *orig_lock,
- struct rt_mutex *next_lock,
+ struct rt_mutex_base *orig_lock,
+ struct rt_mutex_base *next_lock,
struct rt_mutex_waiter *orig_waiter,
struct task_struct *top_task)
{
struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter;
struct rt_mutex_waiter *prerequeue_top_waiter;
int ret = 0, depth = 0;
- struct rt_mutex *lock;
+ struct rt_mutex_base *lock;
bool detect_deadlock;
bool requeue = true;
if (next_lock != waiter->lock)
goto out_unlock_pi;
+ /*
+ * There could be 'spurious' loops in the lock graph due to ww_mutex,
+ * consider:
+ *
+ * P1: A, ww_A, ww_B
+ * P2: ww_B, ww_A
+ * P3: A
+ *
+ * P3 should not return -EDEADLK because it gets trapped in the cycle
+ * created by P1 and P2 (which will resolve -- and runs into
+ * max_lock_depth above). Therefore disable detect_deadlock such that
+ * the below termination condition can trigger once all relevant tasks
+ * are boosted.
+ *
+ * Even when we start with ww_mutex we can disable deadlock detection,
+ * since we would supress a ww_mutex induced deadlock at [6] anyway.
+ * Supressing it here however is not sufficient since we might still
+ * hit [6] due to adjustment driven iteration.
+ *
+ * NOTE: if someone were to create a deadlock between 2 ww_classes we'd
+ * utterly fail to report it; lockdep should.
+ */
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && waiter->ww_ctx && detect_deadlock)
+ detect_deadlock = false;
+
/*
* Drop out, when the task has no waiters. Note,
* top_waiter can be NULL, when we are in the deboosting
* walk, we detected a deadlock.
*/
if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
- raw_spin_unlock(&lock->wait_lock);
ret = -EDEADLK;
+
+ /*
+ * When the deadlock is due to ww_mutex; also see above. Don't
+ * report the deadlock and instead let the ww_mutex wound/die
+ * logic pick which of the contending threads gets -EDEADLK.
+ *
+ * NOTE: assumes the cycle only contains a single ww_class; any
+ * other configuration and we fail to report; also, see
+ * lockdep.
+ */
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
+ ret = 0;
+
+ raw_spin_unlock(&lock->wait_lock);
goto out_unlock_pi;
}
* serializes all pi_waiters access and rb_erase() does not care about
* the values of the node being removed.
*/
- waiter->prio = task->prio;
- waiter->deadline = task->dl.deadline;
+ waiter_update_prio(waiter, task);
rt_mutex_enqueue(lock, waiter);
* to get the lock.
*/
if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
- wake_up_process(rt_mutex_top_waiter(lock)->task);
+ wake_up_state(waiter->task, waiter->wake_state);
raw_spin_unlock_irq(&lock->wait_lock);
return 0;
}
* callsite called task_blocked_on_lock(), otherwise NULL
*/
static int __sched
-try_to_take_rt_mutex(struct rt_mutex *lock, struct task_struct *task,
+try_to_take_rt_mutex(struct rt_mutex_base *lock, struct task_struct *task,
struct rt_mutex_waiter *waiter)
{
lockdep_assert_held(&lock->wait_lock);
* trylock attempt.
*/
if (waiter) {
- /*
- * If waiter is not the highest priority waiter of
- * @lock, give up.
- */
- if (waiter != rt_mutex_top_waiter(lock))
- return 0;
+ struct rt_mutex_waiter *top_waiter = rt_mutex_top_waiter(lock);
/*
- * We can acquire the lock. Remove the waiter from the
- * lock waiters tree.
+ * If waiter is the highest priority waiter of @lock,
+ * or allowed to steal it, take it over.
*/
- rt_mutex_dequeue(lock, waiter);
-
+ if (waiter == top_waiter || rt_mutex_steal(waiter, top_waiter)) {
+ /*
+ * We can acquire the lock. Remove the waiter from the
+ * lock waiters tree.
+ */
+ rt_mutex_dequeue(lock, waiter);
+ } else {
+ return 0;
+ }
} else {
/*
* If the lock has waiters already we check whether @task is
* not need to be dequeued.
*/
if (rt_mutex_has_waiters(lock)) {
- /*
- * If @task->prio is greater than or equal to
- * the top waiter priority (kernel view),
- * @task lost.
- */
- if (!rt_mutex_waiter_less(task_to_waiter(task),
- rt_mutex_top_waiter(lock)))
+ /* Check whether the trylock can steal it. */
+ if (!rt_mutex_steal(task_to_waiter(task),
+ rt_mutex_top_waiter(lock)))
return 0;
/*
*
* This must be called with lock->wait_lock held and interrupts disabled
*/
-static int __sched task_blocks_on_rt_mutex(struct rt_mutex *lock,
+static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task,
+ struct ww_acquire_ctx *ww_ctx,
enum rtmutex_chainwalk chwalk)
{
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex_waiter *top_waiter = waiter;
- struct rt_mutex *next_lock;
+ struct rt_mutex_base *next_lock;
int chain_walk = 0, res;
lockdep_assert_held(&lock->wait_lock);
raw_spin_lock(&task->pi_lock);
waiter->task = task;
waiter->lock = lock;
- waiter->prio = task->prio;
- waiter->deadline = task->dl.deadline;
+ waiter_update_prio(waiter, task);
/* Get the top priority waiter on the lock */
if (rt_mutex_has_waiters(lock))
raw_spin_unlock(&task->pi_lock);
+ if (build_ww_mutex() && ww_ctx) {
+ struct rt_mutex *rtm;
+
+ /* Check whether the waiter should back out immediately */
+ rtm = container_of(lock, struct rt_mutex, rtmutex);
+ res = __ww_mutex_add_waiter(waiter, rtm, ww_ctx);
+ if (res) {
+ raw_spin_lock(&task->pi_lock);
+ rt_mutex_dequeue(lock, waiter);
+ task->pi_blocked_on = NULL;
+ raw_spin_unlock(&task->pi_lock);
+ return res;
+ }
+ }
+
if (!owner)
return 0;
*
* Called with lock->wait_lock held and interrupts disabled.
*/
-static void __sched mark_wakeup_next_waiter(struct wake_q_head *wake_q,
- struct rt_mutex *lock)
+static void __sched mark_wakeup_next_waiter(struct rt_wake_q_head *wqh,
+ struct rt_mutex_base *lock)
{
struct rt_mutex_waiter *waiter;
* deboost but before waking our donor task, hence the preempt_disable()
* before unlock.
*
- * Pairs with preempt_enable() in rt_mutex_postunlock();
+ * Pairs with preempt_enable() in rt_mutex_wake_up_q();
*/
preempt_disable();
- wake_q_add(wake_q, waiter->task);
+ rt_mutex_wake_q_add(wqh, waiter);
raw_spin_unlock(¤t->pi_lock);
}
+static int __sched __rt_mutex_slowtrylock(struct rt_mutex_base *lock)
+{
+ int ret = try_to_take_rt_mutex(lock, current, NULL);
+
+ /*
+ * try_to_take_rt_mutex() sets the lock waiters bit
+ * unconditionally. Clean this up.
+ */
+ fixup_rt_mutex_waiters(lock);
+
+ return ret;
+}
+
+/*
+ * Slow path try-lock function:
+ */
+static int __sched rt_mutex_slowtrylock(struct rt_mutex_base *lock)
+{
+ unsigned long flags;
+ int ret;
+
+ /*
+ * If the lock already has an owner we fail to get the lock.
+ * This can be done without taking the @lock->wait_lock as
+ * it is only being read, and this is a trylock anyway.
+ */
+ if (rt_mutex_owner(lock))
+ return 0;
+
+ /*
+ * The mutex has currently no owner. Lock the wait lock and try to
+ * acquire the lock. We use irqsave here to support early boot calls.
+ */
+ raw_spin_lock_irqsave(&lock->wait_lock, flags);
+
+ ret = __rt_mutex_slowtrylock(lock);
+
+ raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+
+ return ret;
+}
+
+static __always_inline int __rt_mutex_trylock(struct rt_mutex_base *lock)
+{
+ if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
+ return 1;
+
+ return rt_mutex_slowtrylock(lock);
+}
+
+/*
+ * Slow path to release a rt-mutex.
+ */
+static void __sched rt_mutex_slowunlock(struct rt_mutex_base *lock)
+{
+ DEFINE_RT_WAKE_Q(wqh);
+ unsigned long flags;
+
+ /* irqsave required to support early boot calls */
+ raw_spin_lock_irqsave(&lock->wait_lock, flags);
+
+ debug_rt_mutex_unlock(lock);
+
+ /*
+ * We must be careful here if the fast path is enabled. If we
+ * have no waiters queued we cannot set owner to NULL here
+ * because of:
+ *
+ * foo->lock->owner = NULL;
+ * rtmutex_lock(foo->lock); <- fast path
+ * free = atomic_dec_and_test(foo->refcnt);
+ * rtmutex_unlock(foo->lock); <- fast path
+ * if (free)
+ * kfree(foo);
+ * raw_spin_unlock(foo->lock->wait_lock);
+ *
+ * So for the fastpath enabled kernel:
+ *
+ * Nothing can set the waiters bit as long as we hold
+ * lock->wait_lock. So we do the following sequence:
+ *
+ * owner = rt_mutex_owner(lock);
+ * clear_rt_mutex_waiters(lock);
+ * raw_spin_unlock(&lock->wait_lock);
+ * if (cmpxchg(&lock->owner, owner, 0) == owner)
+ * return;
+ * goto retry;
+ *
+ * The fastpath disabled variant is simple as all access to
+ * lock->owner is serialized by lock->wait_lock:
+ *
+ * lock->owner = NULL;
+ * raw_spin_unlock(&lock->wait_lock);
+ */
+ while (!rt_mutex_has_waiters(lock)) {
+ /* Drops lock->wait_lock ! */
+ if (unlock_rt_mutex_safe(lock, flags) == true)
+ return;
+ /* Relock the rtmutex and try again */
+ raw_spin_lock_irqsave(&lock->wait_lock, flags);
+ }
+
+ /*
+ * The wakeup next waiter path does not suffer from the above
+ * race. See the comments there.
+ *
+ * Queue the next waiter for wakeup once we release the wait_lock.
+ */
+ mark_wakeup_next_waiter(&wqh, lock);
+ raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+
+ rt_mutex_wake_up_q(&wqh);
+}
+
+static __always_inline void __rt_mutex_unlock(struct rt_mutex_base *lock)
+{
+ if (likely(rt_mutex_cmpxchg_release(lock, current, NULL)))
+ return;
+
+ rt_mutex_slowunlock(lock);
+}
+
+#ifdef CONFIG_SMP
+static bool rtmutex_spin_on_owner(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *owner)
+{
+ bool res = true;
+
+ rcu_read_lock();
+ for (;;) {
+ /* If owner changed, trylock again. */
+ if (owner != rt_mutex_owner(lock))
+ break;
+ /*
+ * Ensure that @owner is dereferenced after checking that
+ * the lock owner still matches @owner. If that fails,
+ * @owner might point to freed memory. If it still matches,
+ * the rcu_read_lock() ensures the memory stays valid.
+ */
+ barrier();
+ /*
+ * Stop spinning when:
+ * - the lock owner has been scheduled out
+ * - current is not longer the top waiter
+ * - current is requested to reschedule (redundant
+ * for CONFIG_PREEMPT_RCU=y)
+ * - the VCPU on which owner runs is preempted
+ */
+ if (!owner->on_cpu || need_resched() ||
+ rt_mutex_waiter_is_top_waiter(lock, waiter) ||
+ vcpu_is_preempted(task_cpu(owner))) {
+ res = false;
+ break;
+ }
+ cpu_relax();
+ }
+ rcu_read_unlock();
+ return res;
+}
+#else
+static bool rtmutex_spin_on_owner(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *owner)
+{
+ return false;
+}
+#endif
+
+#ifdef RT_MUTEX_BUILD_MUTEX
+/*
+ * Functions required for:
+ * - rtmutex, futex on all kernels
+ * - mutex and rwsem substitutions on RT kernels
+ */
+
/*
* Remove a waiter from a lock and give up
*
- * Must be called with lock->wait_lock held and interrupts disabled. I must
+ * Must be called with lock->wait_lock held and interrupts disabled. It must
* have just failed to try_to_take_rt_mutex().
*/
-static void __sched remove_waiter(struct rt_mutex *lock,
+static void __sched remove_waiter(struct rt_mutex_base *lock,
struct rt_mutex_waiter *waiter)
{
bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
struct task_struct *owner = rt_mutex_owner(lock);
- struct rt_mutex *next_lock;
+ struct rt_mutex_base *next_lock;
lockdep_assert_held(&lock->wait_lock);
raw_spin_lock_irq(&lock->wait_lock);
}
-/*
- * Recheck the pi chain, in case we got a priority setting
+/**
+ * rt_mutex_slowlock_block() - Perform the wait-wake-try-to-take loop
+ * @lock: the rt_mutex to take
+ * @ww_ctx: WW mutex context pointer
+ * @state: the state the task should block in (TASK_INTERRUPTIBLE
+ * or TASK_UNINTERRUPTIBLE)
+ * @timeout: the pre-initialized and started timer, or NULL for none
+ * @waiter: the pre-initialized rt_mutex_waiter
*
- * Called from sched_setscheduler
+ * Must be called with lock->wait_lock held and interrupts disabled
*/
-void __sched rt_mutex_adjust_pi(struct task_struct *task)
-{
- struct rt_mutex_waiter *waiter;
- struct rt_mutex *next_lock;
- unsigned long flags;
-
- raw_spin_lock_irqsave(&task->pi_lock, flags);
-
- waiter = task->pi_blocked_on;
- if (!waiter || rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
- raw_spin_unlock_irqrestore(&task->pi_lock, flags);
- return;
- }
- next_lock = waiter->lock;
- raw_spin_unlock_irqrestore(&task->pi_lock, flags);
-
- /* gets dropped in rt_mutex_adjust_prio_chain()! */
- get_task_struct(task);
-
- rt_mutex_adjust_prio_chain(task, RT_MUTEX_MIN_CHAINWALK, NULL,
- next_lock, NULL, task);
-}
-
-void __sched rt_mutex_init_waiter(struct rt_mutex_waiter *waiter)
-{
- debug_rt_mutex_init_waiter(waiter);
- RB_CLEAR_NODE(&waiter->pi_tree_entry);
- RB_CLEAR_NODE(&waiter->tree_entry);
- waiter->task = NULL;
-}
-
-/**
- * __rt_mutex_slowlock() - Perform the wait-wake-try-to-take loop
- * @lock: the rt_mutex to take
- * @state: the state the task should block in (TASK_INTERRUPTIBLE
- * or TASK_UNINTERRUPTIBLE)
- * @timeout: the pre-initialized and started timer, or NULL for none
- * @waiter: the pre-initialized rt_mutex_waiter
- *
- * Must be called with lock->wait_lock held and interrupts disabled
- */
-static int __sched __rt_mutex_slowlock(struct rt_mutex *lock, unsigned int state,
- struct hrtimer_sleeper *timeout,
- struct rt_mutex_waiter *waiter)
+static int __sched rt_mutex_slowlock_block(struct rt_mutex_base *lock,
+ struct ww_acquire_ctx *ww_ctx,
+ unsigned int state,
+ struct hrtimer_sleeper *timeout,
+ struct rt_mutex_waiter *waiter)
{
+ struct rt_mutex *rtm = container_of(lock, struct rt_mutex, rtmutex);
+ struct task_struct *owner;
int ret = 0;
for (;;) {
break;
}
+ if (build_ww_mutex() && ww_ctx) {
+ ret = __ww_mutex_check_kill(rtm, waiter, ww_ctx);
+ if (ret)
+ break;
+ }
+
+ if (waiter == rt_mutex_top_waiter(lock))
+ owner = rt_mutex_owner(lock);
+ else
+ owner = NULL;
raw_spin_unlock_irq(&lock->wait_lock);
- schedule();
+ if (!owner || !rtmutex_spin_on_owner(lock, waiter, owner))
+ schedule();
raw_spin_lock_irq(&lock->wait_lock);
set_current_state(state);
if (res != -EDEADLOCK || detect_deadlock)
return;
+ if (build_ww_mutex() && w->ww_ctx)
+ return;
+
/*
* Yell loudly and stop the task right here.
*/
}
}
-/*
- * Slow path lock function:
+/**
+ * __rt_mutex_slowlock - Locking slowpath invoked with lock::wait_lock held
+ * @lock: The rtmutex to block lock
+ * @ww_ctx: WW mutex context pointer
+ * @state: The task state for sleeping
+ * @chwalk: Indicator whether full or partial chainwalk is requested
+ * @waiter: Initializer waiter for blocking
*/
-static int __sched rt_mutex_slowlock(struct rt_mutex *lock, unsigned int state,
- struct hrtimer_sleeper *timeout,
- enum rtmutex_chainwalk chwalk)
+static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
+ struct ww_acquire_ctx *ww_ctx,
+ unsigned int state,
+ enum rtmutex_chainwalk chwalk,
+ struct rt_mutex_waiter *waiter)
{
- struct rt_mutex_waiter waiter;
- unsigned long flags;
- int ret = 0;
-
- rt_mutex_init_waiter(&waiter);
+ struct rt_mutex *rtm = container_of(lock, struct rt_mutex, rtmutex);
+ struct ww_mutex *ww = ww_container_of(rtm);
+ int ret;
- /*
- * Technically we could use raw_spin_[un]lock_irq() here, but this can
- * be called in early boot if the cmpxchg() fast path is disabled
- * (debug, no architecture support). In this case we will acquire the
- * rtmutex with lock->wait_lock held. But we cannot unconditionally
- * enable interrupts in that early boot case. So we need to use the
- * irqsave/restore variants.
- */
- raw_spin_lock_irqsave(&lock->wait_lock, flags);
+ lockdep_assert_held(&lock->wait_lock);
/* Try to acquire the lock again: */
if (try_to_take_rt_mutex(lock, current, NULL)) {
- raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+ if (build_ww_mutex() && ww_ctx) {
+ __ww_mutex_check_waiters(rtm, ww_ctx);
+ ww_mutex_lock_acquired(ww, ww_ctx);
+ }
return 0;
}
set_current_state(state);
- /* Setup the timer, when timeout != NULL */
- if (unlikely(timeout))
- hrtimer_start_expires(&timeout->timer, HRTIMER_MODE_ABS);
-
- ret = task_blocks_on_rt_mutex(lock, &waiter, current, chwalk);
-
+ ret = task_blocks_on_rt_mutex(lock, waiter, current, ww_ctx, chwalk);
if (likely(!ret))
- /* sleep on the mutex */
- ret = __rt_mutex_slowlock(lock, state, timeout, &waiter);
-
- if (unlikely(ret)) {
+ ret = rt_mutex_slowlock_block(lock, ww_ctx, state, NULL, waiter);
+
+ if (likely(!ret)) {
+ /* acquired the lock */
+ if (build_ww_mutex() && ww_ctx) {
+ if (!ww_ctx->is_wait_die)
+ __ww_mutex_check_waiters(rtm, ww_ctx);
+ ww_mutex_lock_acquired(ww, ww_ctx);
+ }
+ } else {
__set_current_state(TASK_RUNNING);
- remove_waiter(lock, &waiter);
- rt_mutex_handle_deadlock(ret, chwalk, &waiter);
+ remove_waiter(lock, waiter);
+ rt_mutex_handle_deadlock(ret, chwalk, waiter);
}
/*
* unconditionally. We might have to fix that up.
*/
fixup_rt_mutex_waiters(lock);
-
- raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
-
- /* Remove pending timer: */
- if (unlikely(timeout))
- hrtimer_cancel(&timeout->timer);
-
- debug_rt_mutex_free_waiter(&waiter);
-
return ret;
}
-static int __sched __rt_mutex_slowtrylock(struct rt_mutex *lock)
+static inline int __rt_mutex_slowlock_locked(struct rt_mutex_base *lock,
+ struct ww_acquire_ctx *ww_ctx,
+ unsigned int state)
{
- int ret = try_to_take_rt_mutex(lock, current, NULL);
+ struct rt_mutex_waiter waiter;
+ int ret;
- /*
- * try_to_take_rt_mutex() sets the lock waiters bit
- * unconditionally. Clean this up.
- */
- fixup_rt_mutex_waiters(lock);
+ rt_mutex_init_waiter(&waiter);
+ waiter.ww_ctx = ww_ctx;
+
+ ret = __rt_mutex_slowlock(lock, ww_ctx, state, RT_MUTEX_MIN_CHAINWALK,
+ &waiter);
+ debug_rt_mutex_free_waiter(&waiter);
return ret;
}
/*
- * Slow path try-lock function:
+ * rt_mutex_slowlock - Locking slowpath invoked when fast path fails
+ * @lock: The rtmutex to block lock
+ * @ww_ctx: WW mutex context pointer
+ * @state: The task state for sleeping
*/
-static int __sched rt_mutex_slowtrylock(struct rt_mutex *lock)
+static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
+ struct ww_acquire_ctx *ww_ctx,
+ unsigned int state)
{
unsigned long flags;
int ret;
/*
- * If the lock already has an owner we fail to get the lock.
- * This can be done without taking the @lock->wait_lock as
- * it is only being read, and this is a trylock anyway.
- */
- if (rt_mutex_owner(lock))
- return 0;
-
- /*
- * The mutex has currently no owner. Lock the wait lock and try to
- * acquire the lock. We use irqsave here to support early boot calls.
+ * Technically we could use raw_spin_[un]lock_irq() here, but this can
+ * be called in early boot if the cmpxchg() fast path is disabled
+ * (debug, no architecture support). In this case we will acquire the
+ * rtmutex with lock->wait_lock held. But we cannot unconditionally
+ * enable interrupts in that early boot case. So we need to use the
+ * irqsave/restore variants.
*/
raw_spin_lock_irqsave(&lock->wait_lock, flags);
-
- ret = __rt_mutex_slowtrylock(lock);
-
+ ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state);
raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
return ret;
}
-/*
- * Performs the wakeup of the top-waiter and re-enables preemption.
- */
-void __sched rt_mutex_postunlock(struct wake_q_head *wake_q)
-{
- wake_up_q(wake_q);
-
- /* Pairs with preempt_disable() in mark_wakeup_next_waiter() */
- preempt_enable();
-}
-
-/*
- * Slow path to release a rt-mutex.
- *
- * Return whether the current task needs to call rt_mutex_postunlock().
- */
-static void __sched rt_mutex_slowunlock(struct rt_mutex *lock)
-{
- DEFINE_WAKE_Q(wake_q);
- unsigned long flags;
-
- /* irqsave required to support early boot calls */
- raw_spin_lock_irqsave(&lock->wait_lock, flags);
-
- debug_rt_mutex_unlock(lock);
-
- /*
- * We must be careful here if the fast path is enabled. If we
- * have no waiters queued we cannot set owner to NULL here
- * because of:
- *
- * foo->lock->owner = NULL;
- * rtmutex_lock(foo->lock); <- fast path
- * free = atomic_dec_and_test(foo->refcnt);
- * rtmutex_unlock(foo->lock); <- fast path
- * if (free)
- * kfree(foo);
- * raw_spin_unlock(foo->lock->wait_lock);
- *
- * So for the fastpath enabled kernel:
- *
- * Nothing can set the waiters bit as long as we hold
- * lock->wait_lock. So we do the following sequence:
- *
- * owner = rt_mutex_owner(lock);
- * clear_rt_mutex_waiters(lock);
- * raw_spin_unlock(&lock->wait_lock);
- * if (cmpxchg(&lock->owner, owner, 0) == owner)
- * return;
- * goto retry;
- *
- * The fastpath disabled variant is simple as all access to
- * lock->owner is serialized by lock->wait_lock:
- *
- * lock->owner = NULL;
- * raw_spin_unlock(&lock->wait_lock);
- */
- while (!rt_mutex_has_waiters(lock)) {
- /* Drops lock->wait_lock ! */
- if (unlock_rt_mutex_safe(lock, flags) == true)
- return;
- /* Relock the rtmutex and try again */
- raw_spin_lock_irqsave(&lock->wait_lock, flags);
- }
-
- /*
- * The wakeup next waiter path does not suffer from the above
- * race. See the comments there.
- *
- * Queue the next waiter for wakeup once we release the wait_lock.
- */
- mark_wakeup_next_waiter(&wake_q, lock);
- raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
-
- rt_mutex_postunlock(&wake_q);
-}
-
-/*
- * debug aware fast / slowpath lock,trylock,unlock
- *
- * The atomic acquire/release ops are compiled away, when either the
- * architecture does not support cmpxchg or when debugging is enabled.
- */
-static __always_inline int __rt_mutex_lock(struct rt_mutex *lock, long state,
- unsigned int subclass)
+static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
+ unsigned int state)
{
- int ret;
-
- might_sleep();
- mutex_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
-
if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
return 0;
- ret = rt_mutex_slowlock(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK);
- if (ret)
- mutex_release(&lock->dep_map, _RET_IP_);
- return ret;
+ return rt_mutex_slowlock(lock, NULL, state);
}
+#endif /* RT_MUTEX_BUILD_MUTEX */
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-/**
- * rt_mutex_lock_nested - lock a rt_mutex
- *
- * @lock: the rt_mutex to be locked
- * @subclass: the lockdep subclass
- */
-void __sched rt_mutex_lock_nested(struct rt_mutex *lock, unsigned int subclass)
-{
- __rt_mutex_lock(lock, TASK_UNINTERRUPTIBLE, subclass);
-}
-EXPORT_SYMBOL_GPL(rt_mutex_lock_nested);
-
-#else /* !CONFIG_DEBUG_LOCK_ALLOC */
-
-/**
- * rt_mutex_lock - lock a rt_mutex
- *
- * @lock: the rt_mutex to be locked
- */
-void __sched rt_mutex_lock(struct rt_mutex *lock)
-{
- __rt_mutex_lock(lock, TASK_UNINTERRUPTIBLE, 0);
-}
-EXPORT_SYMBOL_GPL(rt_mutex_lock);
-#endif
-
-/**
- * rt_mutex_lock_interruptible - lock a rt_mutex interruptible
- *
- * @lock: the rt_mutex to be locked
- *
- * Returns:
- * 0 on success
- * -EINTR when interrupted by a signal
- */
-int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock)
-{
- return __rt_mutex_lock(lock, TASK_INTERRUPTIBLE, 0);
-}
-EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible);
-
-/**
- * rt_mutex_trylock - try to lock a rt_mutex
- *
- * @lock: the rt_mutex to be locked
- *
- * This function can only be called in thread context. It's safe to call it
- * from atomic regions, but not from hard or soft interrupt context.
- *
- * Returns:
- * 1 on success
- * 0 on contention
- */
-int __sched rt_mutex_trylock(struct rt_mutex *lock)
-{
- int ret;
-
- if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
- return 0;
-
- /*
- * No lockdep annotation required because lockdep disables the fast
- * path.
- */
- if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
- return 1;
-
- ret = rt_mutex_slowtrylock(lock);
- if (ret)
- mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
-
- return ret;
-}
-EXPORT_SYMBOL_GPL(rt_mutex_trylock);
-
-/**
- * rt_mutex_unlock - unlock a rt_mutex
- *
- * @lock: the rt_mutex to be unlocked
- */
-void __sched rt_mutex_unlock(struct rt_mutex *lock)
-{
- mutex_release(&lock->dep_map, _RET_IP_);
- if (likely(rt_mutex_cmpxchg_release(lock, current, NULL)))
- return;
-
- rt_mutex_slowunlock(lock);
-}
-EXPORT_SYMBOL_GPL(rt_mutex_unlock);
-
+#ifdef RT_MUTEX_BUILD_SPINLOCKS
/*
- * Futex variants, must not use fastpath.
+ * Functions required for spin/rw_lock substitution on RT kernels
*/
-int __sched rt_mutex_futex_trylock(struct rt_mutex *lock)
-{
- return rt_mutex_slowtrylock(lock);
-}
-
-int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
-{
- return __rt_mutex_slowtrylock(lock);
-}
/**
- * __rt_mutex_futex_unlock - Futex variant, that since futex variants
- * do not use the fast-path, can be simple and will not need to retry.
- *
- * @lock: The rt_mutex to be unlocked
- * @wake_q: The wake queue head from which to get the next lock waiter
+ * rtlock_slowlock_locked - Slow path lock acquisition for RT locks
+ * @lock: The underlying RT mutex
*/
-bool __sched __rt_mutex_futex_unlock(struct rt_mutex *lock,
- struct wake_q_head *wake_q)
+static void __sched rtlock_slowlock_locked(struct rt_mutex_base *lock)
{
- lockdep_assert_held(&lock->wait_lock);
-
- debug_rt_mutex_unlock(lock);
-
- if (!rt_mutex_has_waiters(lock)) {
- lock->owner = NULL;
- return false; /* done */
- }
-
- /*
- * We've already deboosted, mark_wakeup_next_waiter() will
- * retain preempt_disabled when we drop the wait_lock, to
- * avoid inversion prior to the wakeup. preempt_disable()
- * therein pairs with rt_mutex_postunlock().
- */
- mark_wakeup_next_waiter(wake_q, lock);
-
- return true; /* call postunlock() */
-}
-
-void __sched rt_mutex_futex_unlock(struct rt_mutex *lock)
-{
- DEFINE_WAKE_Q(wake_q);
- unsigned long flags;
- bool postunlock;
-
- raw_spin_lock_irqsave(&lock->wait_lock, flags);
- postunlock = __rt_mutex_futex_unlock(lock, &wake_q);
- raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
-
- if (postunlock)
- rt_mutex_postunlock(&wake_q);
-}
+ struct rt_mutex_waiter waiter;
+ struct task_struct *owner;
-/**
- * __rt_mutex_init - initialize the rt_mutex
- *
- * @lock: The rt_mutex to be initialized
- * @name: The lock name used for debugging
- * @key: The lock class key used for debugging
- *
- * Initialize the rt_mutex to unlocked state.
- *
- * Initializing of a locked rt_mutex is not allowed
- */
-void __sched __rt_mutex_init(struct rt_mutex *lock, const char *name,
- struct lock_class_key *key)
-{
- debug_check_no_locks_freed((void *)lock, sizeof(*lock));
- lockdep_init_map(&lock->dep_map, name, key, 0);
+ lockdep_assert_held(&lock->wait_lock);
- __rt_mutex_basic_init(lock);
-}
-EXPORT_SYMBOL_GPL(__rt_mutex_init);
+ if (try_to_take_rt_mutex(lock, current, NULL))
+ return;
-/**
- * rt_mutex_init_proxy_locked - initialize and lock a rt_mutex on behalf of a
- * proxy owner
- *
- * @lock: the rt_mutex to be locked
- * @proxy_owner:the task to set as owner
- *
- * No locking. Caller has to do serializing itself
- *
- * Special API call for PI-futex support. This initializes the rtmutex and
- * assigns it to @proxy_owner. Concurrent operations on the rtmutex are not
- * possible at this point because the pi_state which contains the rtmutex
- * is not yet visible to other tasks.
- */
-void __sched rt_mutex_init_proxy_locked(struct rt_mutex *lock,
- struct task_struct *proxy_owner)
-{
- __rt_mutex_basic_init(lock);
- rt_mutex_set_owner(lock, proxy_owner);
-}
+ rt_mutex_init_rtlock_waiter(&waiter);
-/**
- * rt_mutex_proxy_unlock - release a lock on behalf of owner
- *
- * @lock: the rt_mutex to be locked
- *
- * No locking. Caller has to do serializing itself
- *
- * Special API call for PI-futex support. This merrily cleans up the rtmutex
- * (debugging) state. Concurrent operations on this rt_mutex are not
- * possible because it belongs to the pi_state which is about to be freed
- * and it is not longer visible to other tasks.
- */
-void __sched rt_mutex_proxy_unlock(struct rt_mutex *lock)
-{
- debug_rt_mutex_proxy_unlock(lock);
- rt_mutex_set_owner(lock, NULL);
-}
+ /* Save current state and set state to TASK_RTLOCK_WAIT */
+ current_save_and_set_rtlock_wait_state();
-/**
- * __rt_mutex_start_proxy_lock() - Start lock acquisition for another task
- * @lock: the rt_mutex to take
- * @waiter: the pre-initialized rt_mutex_waiter
- * @task: the task to prepare
- *
- * Starts the rt_mutex acquire; it enqueues the @waiter and does deadlock
- * detection. It does not wait, see rt_mutex_wait_proxy_lock() for that.
- *
- * NOTE: does _NOT_ remove the @waiter on failure; must either call
- * rt_mutex_wait_proxy_lock() or rt_mutex_cleanup_proxy_lock() after this.
- *
- * Returns:
- * 0 - task blocked on lock
- * 1 - acquired the lock for task, caller should wake it up
- * <0 - error
- *
- * Special API call for PI-futex support.
- */
-int __sched __rt_mutex_start_proxy_lock(struct rt_mutex *lock,
- struct rt_mutex_waiter *waiter,
- struct task_struct *task)
-{
- int ret;
+ task_blocks_on_rt_mutex(lock, &waiter, current, NULL, RT_MUTEX_MIN_CHAINWALK);
- lockdep_assert_held(&lock->wait_lock);
+ for (;;) {
+ /* Try to acquire the lock again */
+ if (try_to_take_rt_mutex(lock, current, &waiter))
+ break;
- if (try_to_take_rt_mutex(lock, task, NULL))
- return 1;
+ if (&waiter == rt_mutex_top_waiter(lock))
+ owner = rt_mutex_owner(lock);
+ else
+ owner = NULL;
+ raw_spin_unlock_irq(&lock->wait_lock);
- /* We enforce deadlock detection for futexes */
- ret = task_blocks_on_rt_mutex(lock, waiter, task,
- RT_MUTEX_FULL_CHAINWALK);
+ if (!owner || !rtmutex_spin_on_owner(lock, &waiter, owner))
+ schedule_rtlock();
- if (ret && !rt_mutex_owner(lock)) {
- /*
- * Reset the return value. We might have
- * returned with -EDEADLK and the owner
- * released the lock while we were walking the
- * pi chain. Let the waiter sort it out.
- */
- ret = 0;
+ raw_spin_lock_irq(&lock->wait_lock);
+ set_current_state(TASK_RTLOCK_WAIT);
}
- return ret;
-}
-
-/**
- * rt_mutex_start_proxy_lock() - Start lock acquisition for another task
- * @lock: the rt_mutex to take
- * @waiter: the pre-initialized rt_mutex_waiter
- * @task: the task to prepare
- *
- * Starts the rt_mutex acquire; it enqueues the @waiter and does deadlock
- * detection. It does not wait, see rt_mutex_wait_proxy_lock() for that.
- *
- * NOTE: unlike __rt_mutex_start_proxy_lock this _DOES_ remove the @waiter
- * on failure.
- *
- * Returns:
- * 0 - task blocked on lock
- * 1 - acquired the lock for task, caller should wake it up
- * <0 - error
- *
- * Special API call for PI-futex support.
- */
-int __sched rt_mutex_start_proxy_lock(struct rt_mutex *lock,
- struct rt_mutex_waiter *waiter,
- struct task_struct *task)
-{
- int ret;
-
- raw_spin_lock_irq(&lock->wait_lock);
- ret = __rt_mutex_start_proxy_lock(lock, waiter, task);
- if (unlikely(ret))
- remove_waiter(lock, waiter);
- raw_spin_unlock_irq(&lock->wait_lock);
-
- return ret;
-}
-
-/**
- * rt_mutex_wait_proxy_lock() - Wait for lock acquisition
- * @lock: the rt_mutex we were woken on
- * @to: the timeout, null if none. hrtimer should already have
- * been started.
- * @waiter: the pre-initialized rt_mutex_waiter
- *
- * Wait for the lock acquisition started on our behalf by
- * rt_mutex_start_proxy_lock(). Upon failure, the caller must call
- * rt_mutex_cleanup_proxy_lock().
- *
- * Returns:
- * 0 - success
- * <0 - error, one of -EINTR, -ETIMEDOUT
- *
- * Special API call for PI-futex support
- */
-int __sched rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
- struct hrtimer_sleeper *to,
- struct rt_mutex_waiter *waiter)
-{
- int ret;
+ /* Restore the task state */
+ current_restore_rtlock_saved_state();
- raw_spin_lock_irq(&lock->wait_lock);
- /* sleep on the mutex */
- set_current_state(TASK_INTERRUPTIBLE);
- ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
/*
- * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
- * have to fix that up.
+ * try_to_take_rt_mutex() sets the waiter bit unconditionally.
+ * We might have to fix that up:
*/
fixup_rt_mutex_waiters(lock);
- raw_spin_unlock_irq(&lock->wait_lock);
-
- return ret;
+ debug_rt_mutex_free_waiter(&waiter);
}
-/**
- * rt_mutex_cleanup_proxy_lock() - Cleanup failed lock acquisition
- * @lock: the rt_mutex we were woken on
- * @waiter: the pre-initialized rt_mutex_waiter
- *
- * Attempt to clean up after a failed __rt_mutex_start_proxy_lock() or
- * rt_mutex_wait_proxy_lock().
- *
- * Unless we acquired the lock; we're still enqueued on the wait-list and can
- * in fact still be granted ownership until we're removed. Therefore we can
- * find we are in fact the owner and must disregard the
- * rt_mutex_wait_proxy_lock() failure.
- *
- * Returns:
- * true - did the cleanup, we done.
- * false - we acquired the lock after rt_mutex_wait_proxy_lock() returned,
- * caller should disregards its return value.
- *
- * Special API call for PI-futex support
- */
-bool __sched rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
- struct rt_mutex_waiter *waiter)
+static __always_inline void __sched rtlock_slowlock(struct rt_mutex_base *lock)
{
- bool cleanup = false;
-
- raw_spin_lock_irq(&lock->wait_lock);
- /*
- * Do an unconditional try-lock, this deals with the lock stealing
- * state where __rt_mutex_futex_unlock() -> mark_wakeup_next_waiter()
- * sets a NULL owner.
- *
- * We're not interested in the return value, because the subsequent
- * test on rt_mutex_owner() will infer that. If the trylock succeeded,
- * we will own the lock and it will have removed the waiter. If we
- * failed the trylock, we're still not owner and we need to remove
- * ourselves.
- */
- try_to_take_rt_mutex(lock, current, waiter);
- /*
- * Unless we're the owner; we're still enqueued on the wait_list.
- * So check if we became owner, if not, take us off the wait_list.
- */
- if (rt_mutex_owner(lock) != current) {
- remove_waiter(lock, waiter);
- cleanup = true;
- }
- /*
- * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
- * have to fix that up.
- */
- fixup_rt_mutex_waiters(lock);
-
- raw_spin_unlock_irq(&lock->wait_lock);
+ unsigned long flags;
- return cleanup;
+ raw_spin_lock_irqsave(&lock->wait_lock, flags);
+ rtlock_slowlock_locked(lock);
+ raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
}
-#ifdef CONFIG_DEBUG_RT_MUTEXES
-void rt_mutex_debug_task_free(struct task_struct *task)
-{
- DEBUG_LOCKS_WARN_ON(!RB_EMPTY_ROOT(&task->pi_waiters.rb_root));
- DEBUG_LOCKS_WARN_ON(task->pi_blocked_on);
-}
-#endif
+#endif /* RT_MUTEX_BUILD_SPINLOCKS */
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * rtmutex API
+ */
+#include <linux/spinlock.h>
+#include <linux/export.h>
+
+#define RT_MUTEX_BUILD_MUTEX
+#include "rtmutex.c"
+
+/*
+ * Max number of times we'll walk the boosting chain:
+ */
+int max_lock_depth = 1024;
+
+/*
+ * Debug aware fast / slowpath lock,trylock,unlock
+ *
+ * The atomic acquire/release ops are compiled away, when either the
+ * architecture does not support cmpxchg or when debugging is enabled.
+ */
+static __always_inline int __rt_mutex_lock_common(struct rt_mutex *lock,
+ unsigned int state,
+ unsigned int subclass)
+{
+ int ret;
+
+ might_sleep();
+ mutex_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
+ ret = __rt_mutex_lock(&lock->rtmutex, state);
+ if (ret)
+ mutex_release(&lock->dep_map, _RET_IP_);
+ return ret;
+}
+
+void rt_mutex_base_init(struct rt_mutex_base *rtb)
+{
+ __rt_mutex_base_init(rtb);
+}
+EXPORT_SYMBOL(rt_mutex_base_init);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+/**
+ * rt_mutex_lock_nested - lock a rt_mutex
+ *
+ * @lock: the rt_mutex to be locked
+ * @subclass: the lockdep subclass
+ */
+void __sched rt_mutex_lock_nested(struct rt_mutex *lock, unsigned int subclass)
+{
+ __rt_mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, subclass);
+}
+EXPORT_SYMBOL_GPL(rt_mutex_lock_nested);
+
+#else /* !CONFIG_DEBUG_LOCK_ALLOC */
+
+/**
+ * rt_mutex_lock - lock a rt_mutex
+ *
+ * @lock: the rt_mutex to be locked
+ */
+void __sched rt_mutex_lock(struct rt_mutex *lock)
+{
+ __rt_mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0);
+}
+EXPORT_SYMBOL_GPL(rt_mutex_lock);
+#endif
+
+/**
+ * rt_mutex_lock_interruptible - lock a rt_mutex interruptible
+ *
+ * @lock: the rt_mutex to be locked
+ *
+ * Returns:
+ * 0 on success
+ * -EINTR when interrupted by a signal
+ */
+int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock)
+{
+ return __rt_mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0);
+}
+EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible);
+
+/**
+ * rt_mutex_trylock - try to lock a rt_mutex
+ *
+ * @lock: the rt_mutex to be locked
+ *
+ * This function can only be called in thread context. It's safe to call it
+ * from atomic regions, but not from hard or soft interrupt context.
+ *
+ * Returns:
+ * 1 on success
+ * 0 on contention
+ */
+int __sched rt_mutex_trylock(struct rt_mutex *lock)
+{
+ int ret;
+
+ if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
+ return 0;
+
+ ret = __rt_mutex_trylock(&lock->rtmutex);
+ if (ret)
+ mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(rt_mutex_trylock);
+
+/**
+ * rt_mutex_unlock - unlock a rt_mutex
+ *
+ * @lock: the rt_mutex to be unlocked
+ */
+void __sched rt_mutex_unlock(struct rt_mutex *lock)
+{
+ mutex_release(&lock->dep_map, _RET_IP_);
+ __rt_mutex_unlock(&lock->rtmutex);
+}
+EXPORT_SYMBOL_GPL(rt_mutex_unlock);
+
+/*
+ * Futex variants, must not use fastpath.
+ */
+int __sched rt_mutex_futex_trylock(struct rt_mutex_base *lock)
+{
+ return rt_mutex_slowtrylock(lock);
+}
+
+int __sched __rt_mutex_futex_trylock(struct rt_mutex_base *lock)
+{
+ return __rt_mutex_slowtrylock(lock);
+}
+
+/**
+ * __rt_mutex_futex_unlock - Futex variant, that since futex variants
+ * do not use the fast-path, can be simple and will not need to retry.
+ *
+ * @lock: The rt_mutex to be unlocked
+ * @wqh: The wake queue head from which to get the next lock waiter
+ */
+bool __sched __rt_mutex_futex_unlock(struct rt_mutex_base *lock,
+ struct rt_wake_q_head *wqh)
+{
+ lockdep_assert_held(&lock->wait_lock);
+
+ debug_rt_mutex_unlock(lock);
+
+ if (!rt_mutex_has_waiters(lock)) {
+ lock->owner = NULL;
+ return false; /* done */
+ }
+
+ /*
+ * We've already deboosted, mark_wakeup_next_waiter() will
+ * retain preempt_disabled when we drop the wait_lock, to
+ * avoid inversion prior to the wakeup. preempt_disable()
+ * therein pairs with rt_mutex_postunlock().
+ */
+ mark_wakeup_next_waiter(wqh, lock);
+
+ return true; /* call postunlock() */
+}
+
+void __sched rt_mutex_futex_unlock(struct rt_mutex_base *lock)
+{
+ DEFINE_RT_WAKE_Q(wqh);
+ unsigned long flags;
+ bool postunlock;
+
+ raw_spin_lock_irqsave(&lock->wait_lock, flags);
+ postunlock = __rt_mutex_futex_unlock(lock, &wqh);
+ raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+
+ if (postunlock)
+ rt_mutex_postunlock(&wqh);
+}
+
+/**
+ * __rt_mutex_init - initialize the rt_mutex
+ *
+ * @lock: The rt_mutex to be initialized
+ * @name: The lock name used for debugging
+ * @key: The lock class key used for debugging
+ *
+ * Initialize the rt_mutex to unlocked state.
+ *
+ * Initializing of a locked rt_mutex is not allowed
+ */
+void __sched __rt_mutex_init(struct rt_mutex *lock, const char *name,
+ struct lock_class_key *key)
+{
+ debug_check_no_locks_freed((void *)lock, sizeof(*lock));
+ __rt_mutex_base_init(&lock->rtmutex);
+ lockdep_init_map_wait(&lock->dep_map, name, key, 0, LD_WAIT_SLEEP);
+}
+EXPORT_SYMBOL_GPL(__rt_mutex_init);
+
+/**
+ * rt_mutex_init_proxy_locked - initialize and lock a rt_mutex on behalf of a
+ * proxy owner
+ *
+ * @lock: the rt_mutex to be locked
+ * @proxy_owner:the task to set as owner
+ *
+ * No locking. Caller has to do serializing itself
+ *
+ * Special API call for PI-futex support. This initializes the rtmutex and
+ * assigns it to @proxy_owner. Concurrent operations on the rtmutex are not
+ * possible at this point because the pi_state which contains the rtmutex
+ * is not yet visible to other tasks.
+ */
+void __sched rt_mutex_init_proxy_locked(struct rt_mutex_base *lock,
+ struct task_struct *proxy_owner)
+{
+ static struct lock_class_key pi_futex_key;
+
+ __rt_mutex_base_init(lock);
+ /*
+ * On PREEMPT_RT the futex hashbucket spinlock becomes 'sleeping'
+ * and rtmutex based. That causes a lockdep false positive, because
+ * some of the futex functions invoke spin_unlock(&hb->lock) with
+ * the wait_lock of the rtmutex associated to the pi_futex held.
+ * spin_unlock() in turn takes wait_lock of the rtmutex on which
+ * the spinlock is based, which makes lockdep notice a lock
+ * recursion. Give the futex/rtmutex wait_lock a separate key.
+ */
+ lockdep_set_class(&lock->wait_lock, &pi_futex_key);
+ rt_mutex_set_owner(lock, proxy_owner);
+}
+
+/**
+ * rt_mutex_proxy_unlock - release a lock on behalf of owner
+ *
+ * @lock: the rt_mutex to be locked
+ *
+ * No locking. Caller has to do serializing itself
+ *
+ * Special API call for PI-futex support. This just cleans up the rtmutex
+ * (debugging) state. Concurrent operations on this rt_mutex are not
+ * possible because it belongs to the pi_state which is about to be freed
+ * and it is not longer visible to other tasks.
+ */
+void __sched rt_mutex_proxy_unlock(struct rt_mutex_base *lock)
+{
+ debug_rt_mutex_proxy_unlock(lock);
+ rt_mutex_set_owner(lock, NULL);
+}
+
+/**
+ * __rt_mutex_start_proxy_lock() - Start lock acquisition for another task
+ * @lock: the rt_mutex to take
+ * @waiter: the pre-initialized rt_mutex_waiter
+ * @task: the task to prepare
+ *
+ * Starts the rt_mutex acquire; it enqueues the @waiter and does deadlock
+ * detection. It does not wait, see rt_mutex_wait_proxy_lock() for that.
+ *
+ * NOTE: does _NOT_ remove the @waiter on failure; must either call
+ * rt_mutex_wait_proxy_lock() or rt_mutex_cleanup_proxy_lock() after this.
+ *
+ * Returns:
+ * 0 - task blocked on lock
+ * 1 - acquired the lock for task, caller should wake it up
+ * <0 - error
+ *
+ * Special API call for PI-futex support.
+ */
+int __sched __rt_mutex_start_proxy_lock(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task)
+{
+ int ret;
+
+ lockdep_assert_held(&lock->wait_lock);
+
+ if (try_to_take_rt_mutex(lock, task, NULL))
+ return 1;
+
+ /* We enforce deadlock detection for futexes */
+ ret = task_blocks_on_rt_mutex(lock, waiter, task, NULL,
+ RT_MUTEX_FULL_CHAINWALK);
+
+ if (ret && !rt_mutex_owner(lock)) {
+ /*
+ * Reset the return value. We might have
+ * returned with -EDEADLK and the owner
+ * released the lock while we were walking the
+ * pi chain. Let the waiter sort it out.
+ */
+ ret = 0;
+ }
+
+ return ret;
+}
+
+/**
+ * rt_mutex_start_proxy_lock() - Start lock acquisition for another task
+ * @lock: the rt_mutex to take
+ * @waiter: the pre-initialized rt_mutex_waiter
+ * @task: the task to prepare
+ *
+ * Starts the rt_mutex acquire; it enqueues the @waiter and does deadlock
+ * detection. It does not wait, see rt_mutex_wait_proxy_lock() for that.
+ *
+ * NOTE: unlike __rt_mutex_start_proxy_lock this _DOES_ remove the @waiter
+ * on failure.
+ *
+ * Returns:
+ * 0 - task blocked on lock
+ * 1 - acquired the lock for task, caller should wake it up
+ * <0 - error
+ *
+ * Special API call for PI-futex support.
+ */
+int __sched rt_mutex_start_proxy_lock(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task)
+{
+ int ret;
+
+ raw_spin_lock_irq(&lock->wait_lock);
+ ret = __rt_mutex_start_proxy_lock(lock, waiter, task);
+ if (unlikely(ret))
+ remove_waiter(lock, waiter);
+ raw_spin_unlock_irq(&lock->wait_lock);
+
+ return ret;
+}
+
+/**
+ * rt_mutex_wait_proxy_lock() - Wait for lock acquisition
+ * @lock: the rt_mutex we were woken on
+ * @to: the timeout, null if none. hrtimer should already have
+ * been started.
+ * @waiter: the pre-initialized rt_mutex_waiter
+ *
+ * Wait for the lock acquisition started on our behalf by
+ * rt_mutex_start_proxy_lock(). Upon failure, the caller must call
+ * rt_mutex_cleanup_proxy_lock().
+ *
+ * Returns:
+ * 0 - success
+ * <0 - error, one of -EINTR, -ETIMEDOUT
+ *
+ * Special API call for PI-futex support
+ */
+int __sched rt_mutex_wait_proxy_lock(struct rt_mutex_base *lock,
+ struct hrtimer_sleeper *to,
+ struct rt_mutex_waiter *waiter)
+{
+ int ret;
+
+ raw_spin_lock_irq(&lock->wait_lock);
+ /* sleep on the mutex */
+ set_current_state(TASK_INTERRUPTIBLE);
+ ret = rt_mutex_slowlock_block(lock, NULL, TASK_INTERRUPTIBLE, to, waiter);
+ /*
+ * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
+ * have to fix that up.
+ */
+ fixup_rt_mutex_waiters(lock);
+ raw_spin_unlock_irq(&lock->wait_lock);
+
+ return ret;
+}
+
+/**
+ * rt_mutex_cleanup_proxy_lock() - Cleanup failed lock acquisition
+ * @lock: the rt_mutex we were woken on
+ * @waiter: the pre-initialized rt_mutex_waiter
+ *
+ * Attempt to clean up after a failed __rt_mutex_start_proxy_lock() or
+ * rt_mutex_wait_proxy_lock().
+ *
+ * Unless we acquired the lock; we're still enqueued on the wait-list and can
+ * in fact still be granted ownership until we're removed. Therefore we can
+ * find we are in fact the owner and must disregard the
+ * rt_mutex_wait_proxy_lock() failure.
+ *
+ * Returns:
+ * true - did the cleanup, we done.
+ * false - we acquired the lock after rt_mutex_wait_proxy_lock() returned,
+ * caller should disregards its return value.
+ *
+ * Special API call for PI-futex support
+ */
+bool __sched rt_mutex_cleanup_proxy_lock(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter)
+{
+ bool cleanup = false;
+
+ raw_spin_lock_irq(&lock->wait_lock);
+ /*
+ * Do an unconditional try-lock, this deals with the lock stealing
+ * state where __rt_mutex_futex_unlock() -> mark_wakeup_next_waiter()
+ * sets a NULL owner.
+ *
+ * We're not interested in the return value, because the subsequent
+ * test on rt_mutex_owner() will infer that. If the trylock succeeded,
+ * we will own the lock and it will have removed the waiter. If we
+ * failed the trylock, we're still not owner and we need to remove
+ * ourselves.
+ */
+ try_to_take_rt_mutex(lock, current, waiter);
+ /*
+ * Unless we're the owner; we're still enqueued on the wait_list.
+ * So check if we became owner, if not, take us off the wait_list.
+ */
+ if (rt_mutex_owner(lock) != current) {
+ remove_waiter(lock, waiter);
+ cleanup = true;
+ }
+ /*
+ * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
+ * have to fix that up.
+ */
+ fixup_rt_mutex_waiters(lock);
+
+ raw_spin_unlock_irq(&lock->wait_lock);
+
+ return cleanup;
+}
+
+/*
+ * Recheck the pi chain, in case we got a priority setting
+ *
+ * Called from sched_setscheduler
+ */
+void __sched rt_mutex_adjust_pi(struct task_struct *task)
+{
+ struct rt_mutex_waiter *waiter;
+ struct rt_mutex_base *next_lock;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&task->pi_lock, flags);
+
+ waiter = task->pi_blocked_on;
+ if (!waiter || rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
+ raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+ return;
+ }
+ next_lock = waiter->lock;
+ raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+
+ /* gets dropped in rt_mutex_adjust_prio_chain()! */
+ get_task_struct(task);
+
+ rt_mutex_adjust_prio_chain(task, RT_MUTEX_MIN_CHAINWALK, NULL,
+ next_lock, NULL, task);
+}
+
+/*
+ * Performs the wakeup of the top-waiter and re-enables preemption.
+ */
+void __sched rt_mutex_postunlock(struct rt_wake_q_head *wqh)
+{
+ rt_mutex_wake_up_q(wqh);
+}
+
+#ifdef CONFIG_DEBUG_RT_MUTEXES
+void rt_mutex_debug_task_free(struct task_struct *task)
+{
+ DEBUG_LOCKS_WARN_ON(!RB_EMPTY_ROOT(&task->pi_waiters.rb_root));
+ DEBUG_LOCKS_WARN_ON(task->pi_blocked_on);
+}
+#endif
+
+#ifdef CONFIG_PREEMPT_RT
+/* Mutexes */
+void __mutex_rt_init(struct mutex *mutex, const char *name,
+ struct lock_class_key *key)
+{
+ debug_check_no_locks_freed((void *)mutex, sizeof(*mutex));
+ lockdep_init_map_wait(&mutex->dep_map, name, key, 0, LD_WAIT_SLEEP);
+}
+EXPORT_SYMBOL(__mutex_rt_init);
+
+static __always_inline int __mutex_lock_common(struct mutex *lock,
+ unsigned int state,
+ unsigned int subclass,
+ struct lockdep_map *nest_lock,
+ unsigned long ip)
+{
+ int ret;
+
+ might_sleep();
+ mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
+ ret = __rt_mutex_lock(&lock->rtmutex, state);
+ if (ret)
+ mutex_release(&lock->dep_map, ip);
+ else
+ lock_acquired(&lock->dep_map, ip);
+ return ret;
+}
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void __sched mutex_lock_nested(struct mutex *lock, unsigned int subclass)
+{
+ __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, subclass, NULL, _RET_IP_);
+}
+EXPORT_SYMBOL_GPL(mutex_lock_nested);
+
+void __sched _mutex_lock_nest_lock(struct mutex *lock,
+ struct lockdep_map *nest_lock)
+{
+ __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, nest_lock, _RET_IP_);
+}
+EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
+
+int __sched mutex_lock_interruptible_nested(struct mutex *lock,
+ unsigned int subclass)
+{
+ return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, subclass, NULL, _RET_IP_);
+}
+EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
+
+int __sched mutex_lock_killable_nested(struct mutex *lock,
+ unsigned int subclass)
+{
+ return __mutex_lock_common(lock, TASK_KILLABLE, subclass, NULL, _RET_IP_);
+}
+EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
+
+void __sched mutex_lock_io_nested(struct mutex *lock, unsigned int subclass)
+{
+ int token;
+
+ might_sleep();
+
+ token = io_schedule_prepare();
+ __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, subclass, NULL, _RET_IP_);
+ io_schedule_finish(token);
+}
+EXPORT_SYMBOL_GPL(mutex_lock_io_nested);
+
+#else /* CONFIG_DEBUG_LOCK_ALLOC */
+
+void __sched mutex_lock(struct mutex *lock)
+{
+ __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
+}
+EXPORT_SYMBOL(mutex_lock);
+
+int __sched mutex_lock_interruptible(struct mutex *lock)
+{
+ return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0, NULL, _RET_IP_);
+}
+EXPORT_SYMBOL(mutex_lock_interruptible);
+
+int __sched mutex_lock_killable(struct mutex *lock)
+{
+ return __mutex_lock_common(lock, TASK_KILLABLE, 0, NULL, _RET_IP_);
+}
+EXPORT_SYMBOL(mutex_lock_killable);
+
+void __sched mutex_lock_io(struct mutex *lock)
+{
+ int token = io_schedule_prepare();
+
+ __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
+ io_schedule_finish(token);
+}
+EXPORT_SYMBOL(mutex_lock_io);
+#endif /* !CONFIG_DEBUG_LOCK_ALLOC */
+
+int __sched mutex_trylock(struct mutex *lock)
+{
+ int ret;
+
+ if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
+ return 0;
+
+ ret = __rt_mutex_trylock(&lock->rtmutex);
+ if (ret)
+ mutex_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+
+ return ret;
+}
+EXPORT_SYMBOL(mutex_trylock);
+
+void __sched mutex_unlock(struct mutex *lock)
+{
+ mutex_release(&lock->dep_map, _RET_IP_);
+ __rt_mutex_unlock(&lock->rtmutex);
+}
+EXPORT_SYMBOL(mutex_unlock);
+
+#endif /* CONFIG_PREEMPT_RT */
* @pi_tree_entry: pi node to enqueue into the mutex owner waiters tree
* @task: task reference to the blocked task
* @lock: Pointer to the rt_mutex on which the waiter blocks
+ * @wake_state: Wakeup state to use (TASK_NORMAL or TASK_RTLOCK_WAIT)
* @prio: Priority of the waiter
* @deadline: Deadline of the waiter if applicable
+ * @ww_ctx: WW context pointer
*/
struct rt_mutex_waiter {
struct rb_node tree_entry;
struct rb_node pi_tree_entry;
struct task_struct *task;
- struct rt_mutex *lock;
+ struct rt_mutex_base *lock;
+ unsigned int wake_state;
int prio;
u64 deadline;
+ struct ww_acquire_ctx *ww_ctx;
};
+/**
+ * rt_wake_q_head - Wrapper around regular wake_q_head to support
+ * "sleeping" spinlocks on RT
+ * @head: The regular wake_q_head for sleeping lock variants
+ * @rtlock_task: Task pointer for RT lock (spin/rwlock) wakeups
+ */
+struct rt_wake_q_head {
+ struct wake_q_head head;
+ struct task_struct *rtlock_task;
+};
+
+#define DEFINE_RT_WAKE_Q(name) \
+ struct rt_wake_q_head name = { \
+ .head = WAKE_Q_HEAD_INITIALIZER(name.head), \
+ .rtlock_task = NULL, \
+ }
+
+/*
+ * PI-futex support (proxy locking functions, etc.):
+ */
+extern void rt_mutex_init_proxy_locked(struct rt_mutex_base *lock,
+ struct task_struct *proxy_owner);
+extern void rt_mutex_proxy_unlock(struct rt_mutex_base *lock);
+extern int __rt_mutex_start_proxy_lock(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task);
+extern int rt_mutex_start_proxy_lock(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task);
+extern int rt_mutex_wait_proxy_lock(struct rt_mutex_base *lock,
+ struct hrtimer_sleeper *to,
+ struct rt_mutex_waiter *waiter);
+extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter);
+
+extern int rt_mutex_futex_trylock(struct rt_mutex_base *l);
+extern int __rt_mutex_futex_trylock(struct rt_mutex_base *l);
+
+extern void rt_mutex_futex_unlock(struct rt_mutex_base *lock);
+extern bool __rt_mutex_futex_unlock(struct rt_mutex_base *lock,
+ struct rt_wake_q_head *wqh);
+
+extern void rt_mutex_postunlock(struct rt_wake_q_head *wqh);
+
/*
* Must be guarded because this header is included from rcu/tree_plugin.h
* unconditionally.
*/
#ifdef CONFIG_RT_MUTEXES
-static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
+static inline int rt_mutex_has_waiters(struct rt_mutex_base *lock)
{
return !RB_EMPTY_ROOT(&lock->waiters.rb_root);
}
-static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex *lock)
+/*
+ * Lockless speculative check whether @waiter is still the top waiter on
+ * @lock. This is solely comparing pointers and not derefencing the
+ * leftmost entry which might be about to vanish.
+ */
+static inline bool rt_mutex_waiter_is_top_waiter(struct rt_mutex_base *lock,
+ struct rt_mutex_waiter *waiter)
+{
+ struct rb_node *leftmost = rb_first_cached(&lock->waiters);
+
+ return rb_entry(leftmost, struct rt_mutex_waiter, tree_entry) == waiter;
+}
+
+static inline struct rt_mutex_waiter *rt_mutex_top_waiter(struct rt_mutex_base *lock)
{
struct rb_node *leftmost = rb_first_cached(&lock->waiters);
struct rt_mutex_waiter *w = NULL;
#define RT_MUTEX_HAS_WAITERS 1UL
-static inline struct task_struct *rt_mutex_owner(struct rt_mutex *lock)
+static inline struct task_struct *rt_mutex_owner(struct rt_mutex_base *lock)
{
unsigned long owner = (unsigned long) READ_ONCE(lock->owner);
return (struct task_struct *) (owner & ~RT_MUTEX_HAS_WAITERS);
}
-#else /* CONFIG_RT_MUTEXES */
-/* Used in rcu/tree_plugin.h */
-static inline struct task_struct *rt_mutex_owner(struct rt_mutex *lock)
-{
- return NULL;
-}
-#endif /* !CONFIG_RT_MUTEXES */
/*
* Constants for rt mutex functions which have a selectable deadlock
RT_MUTEX_FULL_CHAINWALK,
};
-static inline void __rt_mutex_basic_init(struct rt_mutex *lock)
+static inline void __rt_mutex_base_init(struct rt_mutex_base *lock)
{
- lock->owner = NULL;
raw_spin_lock_init(&lock->wait_lock);
lock->waiters = RB_ROOT_CACHED;
+ lock->owner = NULL;
}
-/*
- * PI-futex support (proxy locking functions, etc.):
- */
-extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock,
- struct task_struct *proxy_owner);
-extern void rt_mutex_proxy_unlock(struct rt_mutex *lock);
-extern void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter);
-extern int __rt_mutex_start_proxy_lock(struct rt_mutex *lock,
- struct rt_mutex_waiter *waiter,
- struct task_struct *task);
-extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
- struct rt_mutex_waiter *waiter,
- struct task_struct *task);
-extern int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
- struct hrtimer_sleeper *to,
- struct rt_mutex_waiter *waiter);
-extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
- struct rt_mutex_waiter *waiter);
-
-extern int rt_mutex_futex_trylock(struct rt_mutex *l);
-extern int __rt_mutex_futex_trylock(struct rt_mutex *l);
-
-extern void rt_mutex_futex_unlock(struct rt_mutex *lock);
-extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock,
- struct wake_q_head *wqh);
-
-extern void rt_mutex_postunlock(struct wake_q_head *wake_q);
-
/* Debug functions */
-static inline void debug_rt_mutex_unlock(struct rt_mutex *lock)
+static inline void debug_rt_mutex_unlock(struct rt_mutex_base *lock)
{
if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES))
DEBUG_LOCKS_WARN_ON(rt_mutex_owner(lock) != current);
}
-static inline void debug_rt_mutex_proxy_unlock(struct rt_mutex *lock)
+static inline void debug_rt_mutex_proxy_unlock(struct rt_mutex_base *lock)
{
if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES))
DEBUG_LOCKS_WARN_ON(!rt_mutex_owner(lock));
memset(waiter, 0x22, sizeof(*waiter));
}
+static inline void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter)
+{
+ debug_rt_mutex_init_waiter(waiter);
+ RB_CLEAR_NODE(&waiter->pi_tree_entry);
+ RB_CLEAR_NODE(&waiter->tree_entry);
+ waiter->wake_state = TASK_NORMAL;
+ waiter->task = NULL;
+}
+
+static inline void rt_mutex_init_rtlock_waiter(struct rt_mutex_waiter *waiter)
+{
+ rt_mutex_init_waiter(waiter);
+ waiter->wake_state = TASK_RTLOCK_WAIT;
+}
+
+#else /* CONFIG_RT_MUTEXES */
+/* Used in rcu/tree_plugin.h */
+static inline struct task_struct *rt_mutex_owner(struct rt_mutex_base *lock)
+{
+ return NULL;
+}
+#endif /* !CONFIG_RT_MUTEXES */
+
#endif
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * RT-specific reader/writer semaphores and reader/writer locks
+ *
+ * down_write/write_lock()
+ * 1) Lock rtmutex
+ * 2) Remove the reader BIAS to force readers into the slow path
+ * 3) Wait until all readers have left the critical section
+ * 4) Mark it write locked
+ *
+ * up_write/write_unlock()
+ * 1) Remove the write locked marker
+ * 2) Set the reader BIAS, so readers can use the fast path again
+ * 3) Unlock rtmutex, to release blocked readers
+ *
+ * down_read/read_lock()
+ * 1) Try fast path acquisition (reader BIAS is set)
+ * 2) Take tmutex::wait_lock, which protects the writelocked flag
+ * 3) If !writelocked, acquire it for read
+ * 4) If writelocked, block on tmutex
+ * 5) unlock rtmutex, goto 1)
+ *
+ * up_read/read_unlock()
+ * 1) Try fast path release (reader count != 1)
+ * 2) Wake the writer waiting in down_write()/write_lock() #3
+ *
+ * down_read/read_lock()#3 has the consequence, that rw semaphores and rw
+ * locks on RT are not writer fair, but writers, which should be avoided in
+ * RT tasks (think mmap_sem), are subject to the rtmutex priority/DL
+ * inheritance mechanism.
+ *
+ * It's possible to make the rw primitives writer fair by keeping a list of
+ * active readers. A blocked writer would force all newly incoming readers
+ * to block on the rtmutex, but the rtmutex would have to be proxy locked
+ * for one reader after the other. We can't use multi-reader inheritance
+ * because there is no way to support that with SCHED_DEADLINE.
+ * Implementing the one by one reader boosting/handover mechanism is a
+ * major surgery for a very dubious value.
+ *
+ * The risk of writer starvation is there, but the pathological use cases
+ * which trigger it are not necessarily the typical RT workloads.
+ *
+ * Common code shared between RT rw_semaphore and rwlock
+ */
+
+static __always_inline int rwbase_read_trylock(struct rwbase_rt *rwb)
+{
+ int r;
+
+ /*
+ * Increment reader count, if sem->readers < 0, i.e. READER_BIAS is
+ * set.
+ */
+ for (r = atomic_read(&rwb->readers); r < 0;) {
+ if (likely(atomic_try_cmpxchg(&rwb->readers, &r, r + 1)))
+ return 1;
+ }
+ return 0;
+}
+
+static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
+ unsigned int state)
+{
+ struct rt_mutex_base *rtm = &rwb->rtmutex;
+ int ret;
+
+ raw_spin_lock_irq(&rtm->wait_lock);
+ /*
+ * Allow readers, as long as the writer has not completely
+ * acquired the semaphore for write.
+ */
+ if (atomic_read(&rwb->readers) != WRITER_BIAS) {
+ atomic_inc(&rwb->readers);
+ raw_spin_unlock_irq(&rtm->wait_lock);
+ return 0;
+ }
+
+ /*
+ * Call into the slow lock path with the rtmutex->wait_lock
+ * held, so this can't result in the following race:
+ *
+ * Reader1 Reader2 Writer
+ * down_read()
+ * down_write()
+ * rtmutex_lock(m)
+ * wait()
+ * down_read()
+ * unlock(m->wait_lock)
+ * up_read()
+ * wake(Writer)
+ * lock(m->wait_lock)
+ * sem->writelocked=true
+ * unlock(m->wait_lock)
+ *
+ * up_write()
+ * sem->writelocked=false
+ * rtmutex_unlock(m)
+ * down_read()
+ * down_write()
+ * rtmutex_lock(m)
+ * wait()
+ * rtmutex_lock(m)
+ *
+ * That would put Reader1 behind the writer waiting on
+ * Reader2 to call up_read(), which might be unbound.
+ */
+
+ /*
+ * For rwlocks this returns 0 unconditionally, so the below
+ * !ret conditionals are optimized out.
+ */
+ ret = rwbase_rtmutex_slowlock_locked(rtm, state);
+
+ /*
+ * On success the rtmutex is held, so there can't be a writer
+ * active. Increment the reader count and immediately drop the
+ * rtmutex again.
+ *
+ * rtmutex->wait_lock has to be unlocked in any case of course.
+ */
+ if (!ret)
+ atomic_inc(&rwb->readers);
+ raw_spin_unlock_irq(&rtm->wait_lock);
+ if (!ret)
+ rwbase_rtmutex_unlock(rtm);
+ return ret;
+}
+
+static __always_inline int rwbase_read_lock(struct rwbase_rt *rwb,
+ unsigned int state)
+{
+ if (rwbase_read_trylock(rwb))
+ return 0;
+
+ return __rwbase_read_lock(rwb, state);
+}
+
+static void __sched __rwbase_read_unlock(struct rwbase_rt *rwb,
+ unsigned int state)
+{
+ struct rt_mutex_base *rtm = &rwb->rtmutex;
+ struct task_struct *owner;
+
+ raw_spin_lock_irq(&rtm->wait_lock);
+ /*
+ * Wake the writer, i.e. the rtmutex owner. It might release the
+ * rtmutex concurrently in the fast path (due to a signal), but to
+ * clean up rwb->readers it needs to acquire rtm->wait_lock. The
+ * worst case which can happen is a spurious wakeup.
+ */
+ owner = rt_mutex_owner(rtm);
+ if (owner)
+ wake_up_state(owner, state);
+
+ raw_spin_unlock_irq(&rtm->wait_lock);
+}
+
+static __always_inline void rwbase_read_unlock(struct rwbase_rt *rwb,
+ unsigned int state)
+{
+ /*
+ * rwb->readers can only hit 0 when a writer is waiting for the
+ * active readers to leave the critical section.
+ */
+ if (unlikely(atomic_dec_and_test(&rwb->readers)))
+ __rwbase_read_unlock(rwb, state);
+}
+
+static inline void __rwbase_write_unlock(struct rwbase_rt *rwb, int bias,
+ unsigned long flags)
+{
+ struct rt_mutex_base *rtm = &rwb->rtmutex;
+
+ atomic_add(READER_BIAS - bias, &rwb->readers);
+ raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
+ rwbase_rtmutex_unlock(rtm);
+}
+
+static inline void rwbase_write_unlock(struct rwbase_rt *rwb)
+{
+ struct rt_mutex_base *rtm = &rwb->rtmutex;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&rtm->wait_lock, flags);
+ __rwbase_write_unlock(rwb, WRITER_BIAS, flags);
+}
+
+static inline void rwbase_write_downgrade(struct rwbase_rt *rwb)
+{
+ struct rt_mutex_base *rtm = &rwb->rtmutex;
+ unsigned long flags;
+
+ raw_spin_lock_irqsave(&rtm->wait_lock, flags);
+ /* Release it and account current as reader */
+ __rwbase_write_unlock(rwb, WRITER_BIAS - 1, flags);
+}
+
+static int __sched rwbase_write_lock(struct rwbase_rt *rwb,
+ unsigned int state)
+{
+ struct rt_mutex_base *rtm = &rwb->rtmutex;
+ unsigned long flags;
+
+ /* Take the rtmutex as a first step */
+ if (rwbase_rtmutex_lock_state(rtm, state))
+ return -EINTR;
+
+ /* Force readers into slow path */
+ atomic_sub(READER_BIAS, &rwb->readers);
+
+ raw_spin_lock_irqsave(&rtm->wait_lock, flags);
+ /*
+ * set_current_state() for rw_semaphore
+ * current_save_and_set_rtlock_wait_state() for rwlock
+ */
+ rwbase_set_and_save_current_state(state);
+
+ /* Block until all readers have left the critical section. */
+ for (; atomic_read(&rwb->readers);) {
+ /* Optimized out for rwlocks */
+ if (rwbase_signal_pending_state(state, current)) {
+ __set_current_state(TASK_RUNNING);
+ __rwbase_write_unlock(rwb, 0, flags);
+ return -EINTR;
+ }
+ raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
+
+ /*
+ * Schedule and wait for the readers to leave the critical
+ * section. The last reader leaving it wakes the waiter.
+ */
+ if (atomic_read(&rwb->readers) != 0)
+ rwbase_schedule();
+ set_current_state(state);
+ raw_spin_lock_irqsave(&rtm->wait_lock, flags);
+ }
+
+ atomic_set(&rwb->readers, WRITER_BIAS);
+ rwbase_restore_current_state();
+ raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
+ return 0;
+}
+
+static inline int rwbase_write_trylock(struct rwbase_rt *rwb)
+{
+ struct rt_mutex_base *rtm = &rwb->rtmutex;
+ unsigned long flags;
+
+ if (!rwbase_rtmutex_trylock(rtm))
+ return 0;
+
+ atomic_sub(READER_BIAS, &rwb->readers);
+
+ raw_spin_lock_irqsave(&rtm->wait_lock, flags);
+ if (!atomic_read(&rwb->readers)) {
+ atomic_set(&rwb->readers, WRITER_BIAS);
+ raw_spin_unlock_irqrestore(&rtm->wait_lock, flags);
+ return 1;
+ }
+ __rwbase_write_unlock(rwb, 0, flags);
+ return 0;
+}
#include <linux/rwsem.h>
#include <linux/atomic.h>
+#ifndef CONFIG_PREEMPT_RT
#include "lock_events.h"
/*
* handle waking up a waiter on the semaphore
* - up_read/up_write has decremented the active part of count if we come here
*/
-static struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem, long count)
+static struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem)
{
unsigned long flags;
DEFINE_WAKE_Q(wake_q);
if (unlikely((tmp & (RWSEM_LOCK_MASK|RWSEM_FLAG_WAITERS)) ==
RWSEM_FLAG_WAITERS)) {
clear_nonspinnable(sem);
- rwsem_wake(sem, tmp);
+ rwsem_wake(sem);
}
}
rwsem_clear_owner(sem);
tmp = atomic_long_fetch_add_release(-RWSEM_WRITER_LOCKED, &sem->count);
if (unlikely(tmp & RWSEM_FLAG_WAITERS))
- rwsem_wake(sem, tmp);
+ rwsem_wake(sem);
}
/*
rwsem_downgrade_wake(sem);
}
+#else /* !CONFIG_PREEMPT_RT */
+
+#define RT_MUTEX_BUILD_MUTEX
+#include "rtmutex.c"
+
+#define rwbase_set_and_save_current_state(state) \
+ set_current_state(state)
+
+#define rwbase_restore_current_state() \
+ __set_current_state(TASK_RUNNING)
+
+#define rwbase_rtmutex_lock_state(rtm, state) \
+ __rt_mutex_lock(rtm, state)
+
+#define rwbase_rtmutex_slowlock_locked(rtm, state) \
+ __rt_mutex_slowlock_locked(rtm, NULL, state)
+
+#define rwbase_rtmutex_unlock(rtm) \
+ __rt_mutex_unlock(rtm)
+
+#define rwbase_rtmutex_trylock(rtm) \
+ __rt_mutex_trylock(rtm)
+
+#define rwbase_signal_pending_state(state, current) \
+ signal_pending_state(state, current)
+
+#define rwbase_schedule() \
+ schedule()
+
+#include "rwbase_rt.c"
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void __rwsem_init(struct rw_semaphore *sem, const char *name,
+ struct lock_class_key *key)
+{
+ debug_check_no_locks_freed((void *)sem, sizeof(*sem));
+ lockdep_init_map_wait(&sem->dep_map, name, key, 0, LD_WAIT_SLEEP);
+}
+EXPORT_SYMBOL(__rwsem_init);
+#endif
+
+static inline void __down_read(struct rw_semaphore *sem)
+{
+ rwbase_read_lock(&sem->rwbase, TASK_UNINTERRUPTIBLE);
+}
+
+static inline int __down_read_interruptible(struct rw_semaphore *sem)
+{
+ return rwbase_read_lock(&sem->rwbase, TASK_INTERRUPTIBLE);
+}
+
+static inline int __down_read_killable(struct rw_semaphore *sem)
+{
+ return rwbase_read_lock(&sem->rwbase, TASK_KILLABLE);
+}
+
+static inline int __down_read_trylock(struct rw_semaphore *sem)
+{
+ return rwbase_read_trylock(&sem->rwbase);
+}
+
+static inline void __up_read(struct rw_semaphore *sem)
+{
+ rwbase_read_unlock(&sem->rwbase, TASK_NORMAL);
+}
+
+static inline void __sched __down_write(struct rw_semaphore *sem)
+{
+ rwbase_write_lock(&sem->rwbase, TASK_UNINTERRUPTIBLE);
+}
+
+static inline int __sched __down_write_killable(struct rw_semaphore *sem)
+{
+ return rwbase_write_lock(&sem->rwbase, TASK_KILLABLE);
+}
+
+static inline int __down_write_trylock(struct rw_semaphore *sem)
+{
+ return rwbase_write_trylock(&sem->rwbase);
+}
+
+static inline void __up_write(struct rw_semaphore *sem)
+{
+ rwbase_write_unlock(&sem->rwbase);
+}
+
+static inline void __downgrade_write(struct rw_semaphore *sem)
+{
+ rwbase_write_downgrade(&sem->rwbase);
+}
+
+/* Debug stubs for the common API */
+#define DEBUG_RWSEMS_WARN_ON(c, sem)
+
+static inline void __rwsem_set_reader_owned(struct rw_semaphore *sem,
+ struct task_struct *owner)
+{
+}
+
+static inline bool is_rwsem_reader_owned(struct rw_semaphore *sem)
+{
+ int count = atomic_read(&sem->rwbase.readers);
+
+ return count < 0 && count != READER_BIAS;
+}
+
+#endif /* CONFIG_PREEMPT_RT */
+
/*
* lock for reading
*/
{
unsigned long flags;
+ might_sleep();
raw_spin_lock_irqsave(&sem->lock, flags);
if (likely(sem->count > 0))
sem->count--;
unsigned long flags;
int result = 0;
+ might_sleep();
raw_spin_lock_irqsave(&sem->lock, flags);
if (likely(sem->count > 0))
sem->count--;
unsigned long flags;
int result = 0;
+ might_sleep();
raw_spin_lock_irqsave(&sem->lock, flags);
if (likely(sem->count > 0))
sem->count--;
unsigned long flags;
int result = 0;
+ might_sleep();
raw_spin_lock_irqsave(&sem->lock, flags);
if (likely(sem->count > 0))
sem->count--;
* __[spin|read|write]_lock_bh()
*/
BUILD_LOCK_OPS(spin, raw_spinlock);
+
+#ifndef CONFIG_PREEMPT_RT
BUILD_LOCK_OPS(read, rwlock);
BUILD_LOCK_OPS(write, rwlock);
+#endif
#endif
EXPORT_SYMBOL(_raw_spin_unlock_bh);
#endif
+#ifndef CONFIG_PREEMPT_RT
+
#ifndef CONFIG_INLINE_READ_TRYLOCK
int __lockfunc _raw_read_trylock(rwlock_t *lock)
{
EXPORT_SYMBOL(_raw_write_unlock_bh);
#endif
+#endif /* !CONFIG_PREEMPT_RT */
+
#ifdef CONFIG_DEBUG_LOCK_ALLOC
void __lockfunc _raw_spin_lock_nested(raw_spinlock_t *lock, int subclass)
EXPORT_SYMBOL(__raw_spin_lock_init);
+#ifndef CONFIG_PREEMPT_RT
void __rwlock_init(rwlock_t *lock, const char *name,
struct lock_class_key *key)
{
}
EXPORT_SYMBOL(__rwlock_init);
+#endif
static void spin_dump(raw_spinlock_t *lock, const char *msg)
{
arch_spin_unlock(&lock->raw_lock);
}
+#ifndef CONFIG_PREEMPT_RT
static void rwlock_bug(rwlock_t *lock, const char *msg)
{
if (!debug_locks_off())
debug_write_unlock(lock);
arch_write_unlock(&lock->raw_lock);
}
+
+#endif /* !CONFIG_PREEMPT_RT */
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * PREEMPT_RT substitution for spin/rw_locks
+ *
+ * spinlocks and rwlocks on RT are based on rtmutexes, with a few twists to
+ * resemble the non RT semantics:
+ *
+ * - Contrary to plain rtmutexes, spinlocks and rwlocks are state
+ * preserving. The task state is saved before blocking on the underlying
+ * rtmutex, and restored when the lock has been acquired. Regular wakeups
+ * during that time are redirected to the saved state so no wake up is
+ * missed.
+ *
+ * - Non RT spin/rwlocks disable preemption and eventually interrupts.
+ * Disabling preemption has the side effect of disabling migration and
+ * preventing RCU grace periods.
+ *
+ * The RT substitutions explicitly disable migration and take
+ * rcu_read_lock() across the lock held section.
+ */
+#include <linux/spinlock.h>
+#include <linux/export.h>
+
+#define RT_MUTEX_BUILD_SPINLOCKS
+#include "rtmutex.c"
+
+static __always_inline void rtlock_lock(struct rt_mutex_base *rtm)
+{
+ if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
+ rtlock_slowlock(rtm);
+}
+
+static __always_inline void __rt_spin_lock(spinlock_t *lock)
+{
+ ___might_sleep(__FILE__, __LINE__, 0);
+ rtlock_lock(&lock->lock);
+ rcu_read_lock();
+ migrate_disable();
+}
+
+void __sched rt_spin_lock(spinlock_t *lock)
+{
+ spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
+ __rt_spin_lock(lock);
+}
+EXPORT_SYMBOL(rt_spin_lock);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void __sched rt_spin_lock_nested(spinlock_t *lock, int subclass)
+{
+ spin_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
+ __rt_spin_lock(lock);
+}
+EXPORT_SYMBOL(rt_spin_lock_nested);
+
+void __sched rt_spin_lock_nest_lock(spinlock_t *lock,
+ struct lockdep_map *nest_lock)
+{
+ spin_acquire_nest(&lock->dep_map, 0, 0, nest_lock, _RET_IP_);
+ __rt_spin_lock(lock);
+}
+EXPORT_SYMBOL(rt_spin_lock_nest_lock);
+#endif
+
+void __sched rt_spin_unlock(spinlock_t *lock)
+{
+ spin_release(&lock->dep_map, _RET_IP_);
+ migrate_enable();
+ rcu_read_unlock();
+
+ if (unlikely(!rt_mutex_cmpxchg_release(&lock->lock, current, NULL)))
+ rt_mutex_slowunlock(&lock->lock);
+}
+EXPORT_SYMBOL(rt_spin_unlock);
+
+/*
+ * Wait for the lock to get unlocked: instead of polling for an unlock
+ * (like raw spinlocks do), lock and unlock, to force the kernel to
+ * schedule if there's contention:
+ */
+void __sched rt_spin_lock_unlock(spinlock_t *lock)
+{
+ spin_lock(lock);
+ spin_unlock(lock);
+}
+EXPORT_SYMBOL(rt_spin_lock_unlock);
+
+static __always_inline int __rt_spin_trylock(spinlock_t *lock)
+{
+ int ret = 1;
+
+ if (unlikely(!rt_mutex_cmpxchg_acquire(&lock->lock, NULL, current)))
+ ret = rt_mutex_slowtrylock(&lock->lock);
+
+ if (ret) {
+ spin_acquire(&lock->dep_map, 0, 1, _RET_IP_);
+ rcu_read_lock();
+ migrate_disable();
+ }
+ return ret;
+}
+
+int __sched rt_spin_trylock(spinlock_t *lock)
+{
+ return __rt_spin_trylock(lock);
+}
+EXPORT_SYMBOL(rt_spin_trylock);
+
+int __sched rt_spin_trylock_bh(spinlock_t *lock)
+{
+ int ret;
+
+ local_bh_disable();
+ ret = __rt_spin_trylock(lock);
+ if (!ret)
+ local_bh_enable();
+ return ret;
+}
+EXPORT_SYMBOL(rt_spin_trylock_bh);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void __rt_spin_lock_init(spinlock_t *lock, const char *name,
+ struct lock_class_key *key, bool percpu)
+{
+ u8 type = percpu ? LD_LOCK_PERCPU : LD_LOCK_NORMAL;
+
+ debug_check_no_locks_freed((void *)lock, sizeof(*lock));
+ lockdep_init_map_type(&lock->dep_map, name, key, 0, LD_WAIT_CONFIG,
+ LD_WAIT_INV, type);
+}
+EXPORT_SYMBOL(__rt_spin_lock_init);
+#endif
+
+/*
+ * RT-specific reader/writer locks
+ */
+#define rwbase_set_and_save_current_state(state) \
+ current_save_and_set_rtlock_wait_state()
+
+#define rwbase_restore_current_state() \
+ current_restore_rtlock_saved_state()
+
+static __always_inline int
+rwbase_rtmutex_lock_state(struct rt_mutex_base *rtm, unsigned int state)
+{
+ if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
+ rtlock_slowlock(rtm);
+ return 0;
+}
+
+static __always_inline int
+rwbase_rtmutex_slowlock_locked(struct rt_mutex_base *rtm, unsigned int state)
+{
+ rtlock_slowlock_locked(rtm);
+ return 0;
+}
+
+static __always_inline void rwbase_rtmutex_unlock(struct rt_mutex_base *rtm)
+{
+ if (likely(rt_mutex_cmpxchg_acquire(rtm, current, NULL)))
+ return;
+
+ rt_mutex_slowunlock(rtm);
+}
+
+static __always_inline int rwbase_rtmutex_trylock(struct rt_mutex_base *rtm)
+{
+ if (likely(rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
+ return 1;
+
+ return rt_mutex_slowtrylock(rtm);
+}
+
+#define rwbase_signal_pending_state(state, current) (0)
+
+#define rwbase_schedule() \
+ schedule_rtlock()
+
+#include "rwbase_rt.c"
+/*
+ * The common functions which get wrapped into the rwlock API.
+ */
+int __sched rt_read_trylock(rwlock_t *rwlock)
+{
+ int ret;
+
+ ret = rwbase_read_trylock(&rwlock->rwbase);
+ if (ret) {
+ rwlock_acquire_read(&rwlock->dep_map, 0, 1, _RET_IP_);
+ rcu_read_lock();
+ migrate_disable();
+ }
+ return ret;
+}
+EXPORT_SYMBOL(rt_read_trylock);
+
+int __sched rt_write_trylock(rwlock_t *rwlock)
+{
+ int ret;
+
+ ret = rwbase_write_trylock(&rwlock->rwbase);
+ if (ret) {
+ rwlock_acquire(&rwlock->dep_map, 0, 1, _RET_IP_);
+ rcu_read_lock();
+ migrate_disable();
+ }
+ return ret;
+}
+EXPORT_SYMBOL(rt_write_trylock);
+
+void __sched rt_read_lock(rwlock_t *rwlock)
+{
+ ___might_sleep(__FILE__, __LINE__, 0);
+ rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_);
+ rwbase_read_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT);
+ rcu_read_lock();
+ migrate_disable();
+}
+EXPORT_SYMBOL(rt_read_lock);
+
+void __sched rt_write_lock(rwlock_t *rwlock)
+{
+ ___might_sleep(__FILE__, __LINE__, 0);
+ rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_);
+ rwbase_write_lock(&rwlock->rwbase, TASK_RTLOCK_WAIT);
+ rcu_read_lock();
+ migrate_disable();
+}
+EXPORT_SYMBOL(rt_write_lock);
+
+void __sched rt_read_unlock(rwlock_t *rwlock)
+{
+ rwlock_release(&rwlock->dep_map, _RET_IP_);
+ migrate_enable();
+ rcu_read_unlock();
+ rwbase_read_unlock(&rwlock->rwbase, TASK_RTLOCK_WAIT);
+}
+EXPORT_SYMBOL(rt_read_unlock);
+
+void __sched rt_write_unlock(rwlock_t *rwlock)
+{
+ rwlock_release(&rwlock->dep_map, _RET_IP_);
+ rcu_read_unlock();
+ migrate_enable();
+ rwbase_write_unlock(&rwlock->rwbase);
+}
+EXPORT_SYMBOL(rt_write_unlock);
+
+int __sched rt_rwlock_is_contended(rwlock_t *rwlock)
+{
+ return rw_base_is_contended(&rwlock->rwbase);
+}
+EXPORT_SYMBOL(rt_rwlock_is_contended);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void __rt_rwlock_init(rwlock_t *rwlock, const char *name,
+ struct lock_class_key *key)
+{
+ debug_check_no_locks_freed((void *)rwlock, sizeof(*rwlock));
+ lockdep_init_map_wait(&rwlock->dep_map, name, key, 0, LD_WAIT_CONFIG);
+}
+EXPORT_SYMBOL(__rt_rwlock_init);
+#endif
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef WW_RT
+
+#define MUTEX mutex
+#define MUTEX_WAITER mutex_waiter
+
+static inline struct mutex_waiter *
+__ww_waiter_first(struct mutex *lock)
+{
+ struct mutex_waiter *w;
+
+ w = list_first_entry(&lock->wait_list, struct mutex_waiter, list);
+ if (list_entry_is_head(w, &lock->wait_list, list))
+ return NULL;
+
+ return w;
+}
+
+static inline struct mutex_waiter *
+__ww_waiter_next(struct mutex *lock, struct mutex_waiter *w)
+{
+ w = list_next_entry(w, list);
+ if (list_entry_is_head(w, &lock->wait_list, list))
+ return NULL;
+
+ return w;
+}
+
+static inline struct mutex_waiter *
+__ww_waiter_prev(struct mutex *lock, struct mutex_waiter *w)
+{
+ w = list_prev_entry(w, list);
+ if (list_entry_is_head(w, &lock->wait_list, list))
+ return NULL;
+
+ return w;
+}
+
+static inline struct mutex_waiter *
+__ww_waiter_last(struct mutex *lock)
+{
+ struct mutex_waiter *w;
+
+ w = list_last_entry(&lock->wait_list, struct mutex_waiter, list);
+ if (list_entry_is_head(w, &lock->wait_list, list))
+ return NULL;
+
+ return w;
+}
+
+static inline void
+__ww_waiter_add(struct mutex *lock, struct mutex_waiter *waiter, struct mutex_waiter *pos)
+{
+ struct list_head *p = &lock->wait_list;
+ if (pos)
+ p = &pos->list;
+ __mutex_add_waiter(lock, waiter, p);
+}
+
+static inline struct task_struct *
+__ww_mutex_owner(struct mutex *lock)
+{
+ return __mutex_owner(lock);
+}
+
+static inline bool
+__ww_mutex_has_waiters(struct mutex *lock)
+{
+ return atomic_long_read(&lock->owner) & MUTEX_FLAG_WAITERS;
+}
+
+static inline void lock_wait_lock(struct mutex *lock)
+{
+ raw_spin_lock(&lock->wait_lock);
+}
+
+static inline void unlock_wait_lock(struct mutex *lock)
+{
+ raw_spin_unlock(&lock->wait_lock);
+}
+
+static inline void lockdep_assert_wait_lock_held(struct mutex *lock)
+{
+ lockdep_assert_held(&lock->wait_lock);
+}
+
+#else /* WW_RT */
+
+#define MUTEX rt_mutex
+#define MUTEX_WAITER rt_mutex_waiter
+
+static inline struct rt_mutex_waiter *
+__ww_waiter_first(struct rt_mutex *lock)
+{
+ struct rb_node *n = rb_first(&lock->rtmutex.waiters.rb_root);
+ if (!n)
+ return NULL;
+ return rb_entry(n, struct rt_mutex_waiter, tree_entry);
+}
+
+static inline struct rt_mutex_waiter *
+__ww_waiter_next(struct rt_mutex *lock, struct rt_mutex_waiter *w)
+{
+ struct rb_node *n = rb_next(&w->tree_entry);
+ if (!n)
+ return NULL;
+ return rb_entry(n, struct rt_mutex_waiter, tree_entry);
+}
+
+static inline struct rt_mutex_waiter *
+__ww_waiter_prev(struct rt_mutex *lock, struct rt_mutex_waiter *w)
+{
+ struct rb_node *n = rb_prev(&w->tree_entry);
+ if (!n)
+ return NULL;
+ return rb_entry(n, struct rt_mutex_waiter, tree_entry);
+}
+
+static inline struct rt_mutex_waiter *
+__ww_waiter_last(struct rt_mutex *lock)
+{
+ struct rb_node *n = rb_last(&lock->rtmutex.waiters.rb_root);
+ if (!n)
+ return NULL;
+ return rb_entry(n, struct rt_mutex_waiter, tree_entry);
+}
+
+static inline void
+__ww_waiter_add(struct rt_mutex *lock, struct rt_mutex_waiter *waiter, struct rt_mutex_waiter *pos)
+{
+ /* RT unconditionally adds the waiter first and then removes it on error */
+}
+
+static inline struct task_struct *
+__ww_mutex_owner(struct rt_mutex *lock)
+{
+ return rt_mutex_owner(&lock->rtmutex);
+}
+
+static inline bool
+__ww_mutex_has_waiters(struct rt_mutex *lock)
+{
+ return rt_mutex_has_waiters(&lock->rtmutex);
+}
+
+static inline void lock_wait_lock(struct rt_mutex *lock)
+{
+ raw_spin_lock(&lock->rtmutex.wait_lock);
+}
+
+static inline void unlock_wait_lock(struct rt_mutex *lock)
+{
+ raw_spin_unlock(&lock->rtmutex.wait_lock);
+}
+
+static inline void lockdep_assert_wait_lock_held(struct rt_mutex *lock)
+{
+ lockdep_assert_held(&lock->rtmutex.wait_lock);
+}
+
+#endif /* WW_RT */
+
+/*
+ * Wait-Die:
+ * The newer transactions are killed when:
+ * It (the new transaction) makes a request for a lock being held
+ * by an older transaction.
+ *
+ * Wound-Wait:
+ * The newer transactions are wounded when:
+ * An older transaction makes a request for a lock being held by
+ * the newer transaction.
+ */
+
+/*
+ * Associate the ww_mutex @ww with the context @ww_ctx under which we acquired
+ * it.
+ */
+static __always_inline void
+ww_mutex_lock_acquired(struct ww_mutex *ww, struct ww_acquire_ctx *ww_ctx)
+{
+#ifdef DEBUG_WW_MUTEXES
+ /*
+ * If this WARN_ON triggers, you used ww_mutex_lock to acquire,
+ * but released with a normal mutex_unlock in this call.
+ *
+ * This should never happen, always use ww_mutex_unlock.
+ */
+ DEBUG_LOCKS_WARN_ON(ww->ctx);
+
+ /*
+ * Not quite done after calling ww_acquire_done() ?
+ */
+ DEBUG_LOCKS_WARN_ON(ww_ctx->done_acquire);
+
+ if (ww_ctx->contending_lock) {
+ /*
+ * After -EDEADLK you tried to
+ * acquire a different ww_mutex? Bad!
+ */
+ DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock != ww);
+
+ /*
+ * You called ww_mutex_lock after receiving -EDEADLK,
+ * but 'forgot' to unlock everything else first?
+ */
+ DEBUG_LOCKS_WARN_ON(ww_ctx->acquired > 0);
+ ww_ctx->contending_lock = NULL;
+ }
+
+ /*
+ * Naughty, using a different class will lead to undefined behavior!
+ */
+ DEBUG_LOCKS_WARN_ON(ww_ctx->ww_class != ww->ww_class);
+#endif
+ ww_ctx->acquired++;
+ ww->ctx = ww_ctx;
+}
+
+/*
+ * Determine if @a is 'less' than @b. IOW, either @a is a lower priority task
+ * or, when of equal priority, a younger transaction than @b.
+ *
+ * Depending on the algorithm, @a will either need to wait for @b, or die.
+ */
+static inline bool
+__ww_ctx_less(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
+{
+/*
+ * Can only do the RT prio for WW_RT, because task->prio isn't stable due to PI,
+ * so the wait_list ordering will go wobbly. rt_mutex re-queues the waiter and
+ * isn't affected by this.
+ */
+#ifdef WW_RT
+ /* kernel prio; less is more */
+ int a_prio = a->task->prio;
+ int b_prio = b->task->prio;
+
+ if (rt_prio(a_prio) || rt_prio(b_prio)) {
+
+ if (a_prio > b_prio)
+ return true;
+
+ if (a_prio < b_prio)
+ return false;
+
+ /* equal static prio */
+
+ if (dl_prio(a_prio)) {
+ if (dl_time_before(b->task->dl.deadline,
+ a->task->dl.deadline))
+ return true;
+
+ if (dl_time_before(a->task->dl.deadline,
+ b->task->dl.deadline))
+ return false;
+ }
+
+ /* equal prio */
+ }
+#endif
+
+ /* FIFO order tie break -- bigger is younger */
+ return (signed long)(a->stamp - b->stamp) > 0;
+}
+
+/*
+ * Wait-Die; wake a lesser waiter context (when locks held) such that it can
+ * die.
+ *
+ * Among waiters with context, only the first one can have other locks acquired
+ * already (ctx->acquired > 0), because __ww_mutex_add_waiter() and
+ * __ww_mutex_check_kill() wake any but the earliest context.
+ */
+static bool
+__ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ if (!ww_ctx->is_wait_die)
+ return false;
+
+ if (waiter->ww_ctx->acquired > 0 && __ww_ctx_less(waiter->ww_ctx, ww_ctx)) {
+#ifndef WW_RT
+ debug_mutex_wake_waiter(lock, waiter);
+#endif
+ wake_up_process(waiter->task);
+ }
+
+ return true;
+}
+
+/*
+ * Wound-Wait; wound a lesser @hold_ctx if it holds the lock.
+ *
+ * Wound the lock holder if there are waiters with more important transactions
+ * than the lock holders. Even if multiple waiters may wound the lock holder,
+ * it's sufficient that only one does.
+ */
+static bool __ww_mutex_wound(struct MUTEX *lock,
+ struct ww_acquire_ctx *ww_ctx,
+ struct ww_acquire_ctx *hold_ctx)
+{
+ struct task_struct *owner = __ww_mutex_owner(lock);
+
+ lockdep_assert_wait_lock_held(lock);
+
+ /*
+ * Possible through __ww_mutex_add_waiter() when we race with
+ * ww_mutex_set_context_fastpath(). In that case we'll get here again
+ * through __ww_mutex_check_waiters().
+ */
+ if (!hold_ctx)
+ return false;
+
+ /*
+ * Can have !owner because of __mutex_unlock_slowpath(), but if owner,
+ * it cannot go away because we'll have FLAG_WAITERS set and hold
+ * wait_lock.
+ */
+ if (!owner)
+ return false;
+
+ if (ww_ctx->acquired > 0 && __ww_ctx_less(hold_ctx, ww_ctx)) {
+ hold_ctx->wounded = 1;
+
+ /*
+ * wake_up_process() paired with set_current_state()
+ * inserts sufficient barriers to make sure @owner either sees
+ * it's wounded in __ww_mutex_check_kill() or has a
+ * wakeup pending to re-read the wounded state.
+ */
+ if (owner != current)
+ wake_up_process(owner);
+
+ return true;
+ }
+
+ return false;
+}
+
+/*
+ * We just acquired @lock under @ww_ctx, if there are more important contexts
+ * waiting behind us on the wait-list, check if they need to die, or wound us.
+ *
+ * See __ww_mutex_add_waiter() for the list-order construction; basically the
+ * list is ordered by stamp, smallest (oldest) first.
+ *
+ * This relies on never mixing wait-die/wound-wait on the same wait-list;
+ * which is currently ensured by that being a ww_class property.
+ *
+ * The current task must not be on the wait list.
+ */
+static void
+__ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
+{
+ struct MUTEX_WAITER *cur;
+
+ lockdep_assert_wait_lock_held(lock);
+
+ for (cur = __ww_waiter_first(lock); cur;
+ cur = __ww_waiter_next(lock, cur)) {
+
+ if (!cur->ww_ctx)
+ continue;
+
+ if (__ww_mutex_die(lock, cur, ww_ctx) ||
+ __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx))
+ break;
+ }
+}
+
+/*
+ * After acquiring lock with fastpath, where we do not hold wait_lock, set ctx
+ * and wake up any waiters so they can recheck.
+ */
+static __always_inline void
+ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ ww_mutex_lock_acquired(lock, ctx);
+
+ /*
+ * The lock->ctx update should be visible on all cores before
+ * the WAITERS check is done, otherwise contended waiters might be
+ * missed. The contended waiters will either see ww_ctx == NULL
+ * and keep spinning, or it will acquire wait_lock, add itself
+ * to waiter list and sleep.
+ */
+ smp_mb(); /* See comments above and below. */
+
+ /*
+ * [W] ww->ctx = ctx [W] MUTEX_FLAG_WAITERS
+ * MB MB
+ * [R] MUTEX_FLAG_WAITERS [R] ww->ctx
+ *
+ * The memory barrier above pairs with the memory barrier in
+ * __ww_mutex_add_waiter() and makes sure we either observe ww->ctx
+ * and/or !empty list.
+ */
+ if (likely(!__ww_mutex_has_waiters(&lock->base)))
+ return;
+
+ /*
+ * Uh oh, we raced in fastpath, check if any of the waiters need to
+ * die or wound us.
+ */
+ lock_wait_lock(&lock->base);
+ __ww_mutex_check_waiters(&lock->base, ctx);
+ unlock_wait_lock(&lock->base);
+}
+
+static __always_inline int
+__ww_mutex_kill(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
+{
+ if (ww_ctx->acquired > 0) {
+#ifdef DEBUG_WW_MUTEXES
+ struct ww_mutex *ww;
+
+ ww = container_of(lock, struct ww_mutex, base);
+ DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock);
+ ww_ctx->contending_lock = ww;
+#endif
+ return -EDEADLK;
+ }
+
+ return 0;
+}
+
+/*
+ * Check the wound condition for the current lock acquire.
+ *
+ * Wound-Wait: If we're wounded, kill ourself.
+ *
+ * Wait-Die: If we're trying to acquire a lock already held by an older
+ * context, kill ourselves.
+ *
+ * Since __ww_mutex_add_waiter() orders the wait-list on stamp, we only have to
+ * look at waiters before us in the wait-list.
+ */
+static inline int
+__ww_mutex_check_kill(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
+ struct ww_acquire_ctx *ctx)
+{
+ struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
+ struct ww_acquire_ctx *hold_ctx = READ_ONCE(ww->ctx);
+ struct MUTEX_WAITER *cur;
+
+ if (ctx->acquired == 0)
+ return 0;
+
+ if (!ctx->is_wait_die) {
+ if (ctx->wounded)
+ return __ww_mutex_kill(lock, ctx);
+
+ return 0;
+ }
+
+ if (hold_ctx && __ww_ctx_less(ctx, hold_ctx))
+ return __ww_mutex_kill(lock, ctx);
+
+ /*
+ * If there is a waiter in front of us that has a context, then its
+ * stamp is earlier than ours and we must kill ourself.
+ */
+ for (cur = __ww_waiter_prev(lock, waiter); cur;
+ cur = __ww_waiter_prev(lock, cur)) {
+
+ if (!cur->ww_ctx)
+ continue;
+
+ return __ww_mutex_kill(lock, ctx);
+ }
+
+ return 0;
+}
+
+/*
+ * Add @waiter to the wait-list, keep the wait-list ordered by stamp, smallest
+ * first. Such that older contexts are preferred to acquire the lock over
+ * younger contexts.
+ *
+ * Waiters without context are interspersed in FIFO order.
+ *
+ * Furthermore, for Wait-Die kill ourself immediately when possible (there are
+ * older contexts already waiting) to avoid unnecessary waiting and for
+ * Wound-Wait ensure we wound the owning context when it is younger.
+ */
+static inline int
+__ww_mutex_add_waiter(struct MUTEX_WAITER *waiter,
+ struct MUTEX *lock,
+ struct ww_acquire_ctx *ww_ctx)
+{
+ struct MUTEX_WAITER *cur, *pos = NULL;
+ bool is_wait_die;
+
+ if (!ww_ctx) {
+ __ww_waiter_add(lock, waiter, NULL);
+ return 0;
+ }
+
+ is_wait_die = ww_ctx->is_wait_die;
+
+ /*
+ * Add the waiter before the first waiter with a higher stamp.
+ * Waiters without a context are skipped to avoid starving
+ * them. Wait-Die waiters may die here. Wound-Wait waiters
+ * never die here, but they are sorted in stamp order and
+ * may wound the lock holder.
+ */
+ for (cur = __ww_waiter_last(lock); cur;
+ cur = __ww_waiter_prev(lock, cur)) {
+
+ if (!cur->ww_ctx)
+ continue;
+
+ if (__ww_ctx_less(ww_ctx, cur->ww_ctx)) {
+ /*
+ * Wait-Die: if we find an older context waiting, there
+ * is no point in queueing behind it, as we'd have to
+ * die the moment it would acquire the lock.
+ */
+ if (is_wait_die) {
+ int ret = __ww_mutex_kill(lock, ww_ctx);
+
+ if (ret)
+ return ret;
+ }
+
+ break;
+ }
+
+ pos = cur;
+
+ /* Wait-Die: ensure younger waiters die. */
+ __ww_mutex_die(lock, cur, ww_ctx);
+ }
+
+ __ww_waiter_add(lock, waiter, pos);
+
+ /*
+ * Wound-Wait: if we're blocking on a mutex owned by a younger context,
+ * wound that such that we might proceed.
+ */
+ if (!is_wait_die) {
+ struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
+
+ /*
+ * See ww_mutex_set_context_fastpath(). Orders setting
+ * MUTEX_FLAG_WAITERS vs the ww->ctx load,
+ * such that either we or the fastpath will wound @ww->ctx.
+ */
+ smp_mb();
+ __ww_mutex_wound(lock, ww_ctx, ww->ctx);
+ }
+
+ return 0;
+}
+
+static inline void __ww_mutex_unlock(struct ww_mutex *lock)
+{
+ if (lock->ctx) {
+#ifdef DEBUG_WW_MUTEXES
+ DEBUG_LOCKS_WARN_ON(!lock->ctx->acquired);
+#endif
+ if (lock->ctx->acquired > 0)
+ lock->ctx->acquired--;
+ lock->ctx = NULL;
+ }
+}
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * rtmutex API
+ */
+#include <linux/spinlock.h>
+#include <linux/export.h>
+
+#define RT_MUTEX_BUILD_MUTEX
+#define WW_RT
+#include "rtmutex.c"
+
+static int __sched
+__ww_rt_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx,
+ unsigned int state, unsigned long ip)
+{
+ struct lockdep_map __maybe_unused *nest_lock = NULL;
+ struct rt_mutex *rtm = &lock->base;
+ int ret;
+
+ might_sleep();
+
+ if (ww_ctx) {
+ if (unlikely(ww_ctx == READ_ONCE(lock->ctx)))
+ return -EALREADY;
+
+ /*
+ * Reset the wounded flag after a kill. No other process can
+ * race and wound us here, since they can't have a valid owner
+ * pointer if we don't have any locks held.
+ */
+ if (ww_ctx->acquired == 0)
+ ww_ctx->wounded = 0;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ nest_lock = &ww_ctx->dep_map;
+#endif
+ }
+ mutex_acquire_nest(&rtm->dep_map, 0, 0, nest_lock, ip);
+
+ if (likely(rt_mutex_cmpxchg_acquire(&rtm->rtmutex, NULL, current))) {
+ if (ww_ctx)
+ ww_mutex_set_context_fastpath(lock, ww_ctx);
+ return 0;
+ }
+
+ ret = rt_mutex_slowlock(&rtm->rtmutex, ww_ctx, state);
+
+ if (ret)
+ mutex_release(&rtm->dep_map, ip);
+ return ret;
+}
+
+int __sched
+ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ return __ww_rt_mutex_lock(lock, ctx, TASK_UNINTERRUPTIBLE, _RET_IP_);
+}
+EXPORT_SYMBOL(ww_mutex_lock);
+
+int __sched
+ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+ return __ww_rt_mutex_lock(lock, ctx, TASK_INTERRUPTIBLE, _RET_IP_);
+}
+EXPORT_SYMBOL(ww_mutex_lock_interruptible);
+
+void __sched ww_mutex_unlock(struct ww_mutex *lock)
+{
+ struct rt_mutex *rtm = &lock->base;
+
+ __ww_mutex_unlock(lock);
+
+ mutex_release(&rtm->dep_map, _RET_IP_);
+ __rt_mutex_unlock(&rtm->rtmutex);
+}
+EXPORT_SYMBOL(ww_mutex_unlock);
*
* Copyright (c) 2020 Oracle and/or its affiliates.
* Author: Daniel Jordan <daniel.m.jordan@oracle.com>
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms and conditions of the GNU General Public License,
- * version 2, as published by the Free Software Foundation.
- *
- * This program is distributed in the hope it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
*/
#include <linux/completion.h>
if ((pinst->flags & PADATA_RESET))
goto out;
- atomic_inc(&pd->refcnt);
+ refcount_inc(&pd->refcnt);
padata->pd = pd;
padata->cb_cpu = *cb_cpu;
}
local_bh_enable();
- if (atomic_sub_and_test(cnt, &pd->refcnt))
+ if (refcount_sub_and_test(cnt, &pd->refcnt))
padata_free_pd(pd);
}
padata_init_reorder_list(pd);
padata_init_squeues(pd);
pd->seq_nr = -1;
- atomic_set(&pd->refcnt, 1);
+ refcount_set(&pd->refcnt, 1);
spin_lock_init(&pd->lock);
pd->cpu = cpumask_first(pd->cpumask.pcpu);
INIT_WORK(&pd->reorder_work, invoke_padata_reorder);
synchronize_rcu();
list_for_each_entry_continue_reverse(ps, &pinst->pslist, list)
- if (atomic_dec_and_test(&ps->opd->refcnt))
+ if (refcount_dec_and_test(&ps->opd->refcnt))
padata_free_pd(ps->opd);
pinst->flags &= ~PADATA_RESET;
struct cpumask *serial_mask, *parallel_mask;
int err = -EINVAL;
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&pinst->lock);
switch (cpumask_type) {
out:
mutex_unlock(&pinst->lock);
- put_online_cpus();
+ cpus_read_unlock();
return err;
}
if (!pinst->parallel_wq)
goto err_free_inst;
- get_online_cpus();
+ cpus_read_lock();
pinst->serial_wq = alloc_workqueue("%s_serial", WQ_MEM_RECLAIM |
WQ_CPU_INTENSIVE, 1, name);
&pinst->cpu_dead_node);
#endif
- put_online_cpus();
+ cpus_read_unlock();
return pinst;
err_free_serial_wq:
destroy_workqueue(pinst->serial_wq);
err_put_cpus:
- put_online_cpus();
+ cpus_read_unlock();
destroy_workqueue(pinst->parallel_wq);
err_free_inst:
kfree(pinst);
ps->pinst = pinst;
- get_online_cpus();
+ cpus_read_lock();
pd = padata_alloc_pd(ps);
- put_online_cpus();
+ cpus_read_unlock();
if (!pd)
goto out_free_ps;
STANDARD_PARAM_DEF(ullong, unsigned long long, "%llu", kstrtoull);
STANDARD_PARAM_DEF(hexint, unsigned int, "%#08x", kstrtouint);
+int param_set_uint_minmax(const char *val, const struct kernel_param *kp,
+ unsigned int min, unsigned int max)
+{
+ unsigned int num;
+ int ret;
+
+ if (!val)
+ return -EINVAL;
+ ret = kstrtouint(val, 0, &num);
+ if (ret)
+ return ret;
+ if (num < min || num > max)
+ return -EINVAL;
+ *((unsigned int *)kp->arg) = num;
+ return 0;
+}
+EXPORT_SYMBOL_GPL(param_set_uint_minmax);
+
int param_set_charp(const char *val, const struct kernel_param *kp)
{
if (strlen(val) > 1024) {
* Note, that this function can only be called after the fd table has
* been unshared to avoid leaking the pidfd to the new process.
*
+ * This symbol should not be explicitly exported to loadable modules.
+ *
* Return: On success, a cloexec pidfd is returned.
* On error, a negative errno number will be returned.
*/
-static int pidfd_create(struct pid *pid, unsigned int flags)
+int pidfd_create(struct pid *pid, unsigned int flags)
{
int fd;
+ if (!pid || !pid_has_task(pid, PIDTYPE_TGID))
+ return -EINVAL;
+
+ if (flags & ~(O_NONBLOCK | O_RDWR | O_CLOEXEC))
+ return -EINVAL;
+
fd = anon_inode_getfd("[pidfd]", &pidfd_fops, get_pid(pid),
flags | O_RDWR | O_CLOEXEC);
if (fd < 0)
if (!p)
return -ESRCH;
- if (pid_has_task(p, PIDTYPE_TGID))
- fd = pidfd_create(p, flags);
- else
- fd = -EINVAL;
+ fd = pidfd_create(p, flags);
put_pid(p);
return fd;
if (gp_async) {
cur_ops->gp_barrier();
}
- writer_n_durations[me] = i_max;
+ writer_n_durations[me] = i_max + 1;
torture_kthread_stopping("rcu_scale_writer");
return 0;
}
wdpp = writer_durations[i];
if (!wdpp)
continue;
- for (j = 0; j <= writer_n_durations[i]; j++) {
+ for (j = 0; j < writer_n_durations[i]; j++) {
wdp = &wdpp[j];
pr_alert("%s%s %4d writer-duration: %5d %llu\n",
scale_type, SCALE_FLAG,
__func__, raw_smp_processor_id());
while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
stop_at))
- if (stall_cpu_block)
+ if (stall_cpu_block) {
+#ifdef CONFIG_PREEMPTION
+ preempt_schedule();
+#else
schedule_timeout_uninterruptible(HZ);
+#endif
+ }
if (stall_cpu_irqsoff)
local_irq_enable();
else if (!stall_cpu_block)
.name = "acqrel"
};
+static volatile u64 stopopts;
+
+static void ref_clock_section(const int nloops)
+{
+ u64 x = 0;
+ int i;
+
+ preempt_disable();
+ for (i = nloops; i >= 0; i--)
+ x += ktime_get_real_fast_ns();
+ preempt_enable();
+ stopopts = x;
+}
+
+static void ref_clock_delay_section(const int nloops, const int udl, const int ndl)
+{
+ u64 x = 0;
+ int i;
+
+ preempt_disable();
+ for (i = nloops; i >= 0; i--) {
+ x += ktime_get_real_fast_ns();
+ un_delay(udl, ndl);
+ }
+ preempt_enable();
+ stopopts = x;
+}
+
+static struct ref_scale_ops clock_ops = {
+ .readsection = ref_clock_section,
+ .delaysection = ref_clock_delay_section,
+ .name = "clock"
+};
+
static void rcu_scale_one_reader(void)
{
if (readdelay <= 0)
int firsterr = 0;
static struct ref_scale_ops *scale_ops[] = {
&rcu_ops, &srcu_ops, &rcu_trace_ops, &rcu_tasks_ops, &refcnt_ops, &rwlock_ops,
- &rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops,
+ &rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops,
};
if (!torture_init_begin(scale_type, verbose))
*/
void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
{
- int newval = ssp->srcu_lock_nesting[idx] - 1;
+ int newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
if (!newval && READ_ONCE(ssp->srcu_gp_waiting))
//
// "Rude" variant of Tasks RCU, inspired by Steve Rostedt's trick of
// passing an empty function to schedule_on_each_cpu(). This approach
-// provides an asynchronous call_rcu_tasks_rude() API and batching
-// of concurrent calls to the synchronous synchronize_rcu_rude() API.
+// provides an asynchronous call_rcu_tasks_rude() API and batching of
+// concurrent calls to the synchronous synchronize_rcu_tasks_rude() API.
// This invokes schedule_on_each_cpu() in order to send IPIs far and wide
// and induces otherwise unnecessary context switches on all online CPUs,
// whether idle or not.
// set that task's .need_qs flag so that task's next outermost
// rcu_read_unlock_trace() will report the quiescent state (in which
// case the count of readers is incremented). If both attempts fail,
-// the task is added to a "holdout" list.
+// the task is added to a "holdout" list. Note that IPIs are used
+// to invoke trc_read_check_handler() in the context of running tasks
+// in order to avoid ordering overhead on common-case shared-variable
+// accessses.
// rcu_tasks_trace_postscan():
// Initialize state and attempt to identify an immediate quiescent
// state as above (but only for idle tasks), unblock CPU-hotplug
/* If we are the last reader, wake up the grace-period kthread. */
void rcu_read_unlock_trace_special(struct task_struct *t, int nesting)
{
- int nq = t->trc_reader_special.b.need_qs;
+ int nq = READ_ONCE(t->trc_reader_special.b.need_qs);
if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) &&
t->trc_reader_special.b.need_mb)
// If the task is not in a read-side critical section, and
// if this is the last reader, awaken the grace-period kthread.
- if (likely(!t->trc_reader_nesting)) {
+ if (likely(!READ_ONCE(t->trc_reader_nesting))) {
if (WARN_ON_ONCE(atomic_dec_and_test(&trc_n_readers_need_end)))
wake_up(&trc_wait);
// Mark as checked after decrement to avoid false
goto reset_ipi;
}
// If we are racing with an rcu_read_unlock_trace(), try again later.
- if (unlikely(t->trc_reader_nesting < 0)) {
+ if (unlikely(READ_ONCE(t->trc_reader_nesting) < 0)) {
if (WARN_ON_ONCE(atomic_dec_and_test(&trc_n_readers_need_end)))
wake_up(&trc_wait);
goto reset_ipi;
// Get here if the task is in a read-side critical section. Set
// its state so that it will awaken the grace-period kthread upon
// exit from that critical section.
- WARN_ON_ONCE(t->trc_reader_special.b.need_qs);
+ WARN_ON_ONCE(READ_ONCE(t->trc_reader_special.b.need_qs));
WRITE_ONCE(t->trc_reader_special.b.need_qs, true);
reset_ipi:
// Allow future IPIs to be sent on CPU and for task.
// Also order this IPI handler against any later manipulations of
// the intended task.
- smp_store_release(&per_cpu(trc_ipi_to_cpu, smp_processor_id()), false); // ^^^
+ smp_store_release(per_cpu_ptr(&trc_ipi_to_cpu, smp_processor_id()), false); // ^^^
smp_store_release(&texp->trc_ipi_to_cpu, -1); // ^^^
}
n_heavy_reader_ofl_updates++;
in_qs = true;
} else {
+ // The task is not running, so C-language access is safe.
in_qs = likely(!t->trc_reader_nesting);
}
// state so that it will awaken the grace-period kthread upon exit
// from that critical section.
atomic_inc(&trc_n_readers_need_end); // One more to wait on.
- WARN_ON_ONCE(t->trc_reader_special.b.need_qs);
+ WARN_ON_ONCE(READ_ONCE(t->trc_reader_special.b.need_qs));
WRITE_ONCE(t->trc_reader_special.b.need_qs, true);
return true;
}
// The current task had better be in a quiescent state.
if (t == current) {
t->trc_reader_checked = true;
- WARN_ON_ONCE(t->trc_reader_nesting);
+ WARN_ON_ONCE(READ_ONCE(t->trc_reader_nesting));
return;
}
}
put_task_struct(t);
+ // If this task is not yet on the holdout list, then we are in
+ // an RCU read-side critical section. Otherwise, the invocation of
+ // rcu_add_holdout() that added it to the list did the necessary
+ // get_task_struct(). Either way, the task cannot be freed out
+ // from under this code.
+
// If currently running, send an IPI, either way, add to list.
trc_add_holdout(t, bhp);
if (task_curr(t) &&
".I"[READ_ONCE(t->trc_ipi_to_cpu) > 0],
".i"[is_idle_task(t)],
".N"[cpu > 0 && tick_nohz_full_cpu(cpu)],
- t->trc_reader_nesting,
- " N"[!!t->trc_reader_special.b.need_qs],
+ READ_ONCE(t->trc_reader_nesting),
+ " N"[!!READ_ONCE(t->trc_reader_special.b.need_qs)],
cpu);
sched_show_task(t);
}
static void exit_tasks_rcu_finish_trace(struct task_struct *t)
{
WRITE_ONCE(t->trc_reader_checked, true);
- WARN_ON_ONCE(t->trc_reader_nesting);
+ WARN_ON_ONCE(READ_ONCE(t->trc_reader_nesting));
WRITE_ONCE(t->trc_reader_nesting, 0);
if (WARN_ON_ONCE(READ_ONCE(t->trc_reader_special.b.need_qs)))
rcu_read_unlock_trace_special(t, 0);
/* Data structures. */
-/*
- * Steal a bit from the bottom of ->dynticks for idle entry/exit
- * control. Initially this is for TLB flushing.
- */
-#define RCU_DYNTICK_CTRL_MASK 0x1
-#define RCU_DYNTICK_CTRL_CTR (RCU_DYNTICK_CTRL_MASK + 1)
-
static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
.dynticks_nesting = 1,
.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
- .dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR),
+ .dynticks = ATOMIC_INIT(1),
#ifdef CONFIG_RCU_NOCB_CPU
.cblist.flags = SEGCBLIST_SOFTIRQ_ONLY,
#endif
rcu_tasks_qs(current, false);
}
+/*
+ * Increment the current CPU's rcu_data structure's ->dynticks field
+ * with ordering. Return the new value.
+ */
+static noinline noinstr unsigned long rcu_dynticks_inc(int incby)
+{
+ return arch_atomic_add_return(incby, this_cpu_ptr(&rcu_data.dynticks));
+}
+
/*
* Record entry into an extended quiescent state. This is only to be
* called when not already in an extended quiescent state, that is,
*/
static noinstr void rcu_dynticks_eqs_enter(void)
{
- struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
int seq;
/*
* next idle sojourn.
*/
rcu_dynticks_task_trace_enter(); // Before ->dynticks update!
- seq = arch_atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks);
+ seq = rcu_dynticks_inc(1);
// RCU is no longer watching. Better be in extended quiescent state!
- WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
- (seq & RCU_DYNTICK_CTRL_CTR));
- /* Better not have special action (TLB flush) pending! */
- WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
- (seq & RCU_DYNTICK_CTRL_MASK));
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & 0x1));
}
/*
*/
static noinstr void rcu_dynticks_eqs_exit(void)
{
- struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
int seq;
/*
* and we also must force ordering with the next RCU read-side
* critical section.
*/
- seq = arch_atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks);
+ seq = rcu_dynticks_inc(1);
// RCU is now watching. Better not be in an extended quiescent state!
rcu_dynticks_task_trace_exit(); // After ->dynticks update!
- WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
- !(seq & RCU_DYNTICK_CTRL_CTR));
- if (seq & RCU_DYNTICK_CTRL_MASK) {
- arch_atomic_andnot(RCU_DYNTICK_CTRL_MASK, &rdp->dynticks);
- smp_mb__after_atomic(); /* _exit after clearing mask. */
- }
+ WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & 0x1));
}
/*
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
- if (atomic_read(&rdp->dynticks) & RCU_DYNTICK_CTRL_CTR)
+ if (atomic_read(&rdp->dynticks) & 0x1)
return;
- atomic_add(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks);
+ rcu_dynticks_inc(1);
}
/*
*/
static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
{
- struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
-
- return !(arch_atomic_read(&rdp->dynticks) & RCU_DYNTICK_CTRL_CTR);
+ return !(atomic_read(this_cpu_ptr(&rcu_data.dynticks)) & 0x1);
}
/*
*/
static int rcu_dynticks_snap(struct rcu_data *rdp)
{
- int snap = atomic_add_return(0, &rdp->dynticks);
-
- return snap & ~RCU_DYNTICK_CTRL_MASK;
+ smp_mb(); // Fundamental RCU ordering guarantee.
+ return atomic_read_acquire(&rdp->dynticks);
}
/*
*/
static bool rcu_dynticks_in_eqs(int snap)
{
- return !(snap & RCU_DYNTICK_CTRL_CTR);
+ return !(snap & 0x1);
}
/* Return true if the specified CPU is currently idle from an RCU viewpoint. */
int snap;
// If not quiescent, force back to earlier extended quiescent state.
- snap = atomic_read(&rdp->dynticks) & ~(RCU_DYNTICK_CTRL_MASK |
- RCU_DYNTICK_CTRL_CTR);
+ snap = atomic_read(&rdp->dynticks) & ~0x1;
smp_rmb(); // Order ->dynticks and *vp reads.
if (READ_ONCE(*vp))
smp_rmb(); // Order *vp read and ->dynticks re-read.
// If still in the same extended quiescent state, we are good!
- return snap == (atomic_read(&rdp->dynticks) & ~RCU_DYNTICK_CTRL_MASK);
-}
-
-/*
- * Set the special (bottom) bit of the specified CPU so that it
- * will take special action (such as flushing its TLB) on the
- * next exit from an extended quiescent state. Returns true if
- * the bit was successfully set, or false if the CPU was not in
- * an extended quiescent state.
- */
-bool rcu_eqs_special_set(int cpu)
-{
- int old;
- int new;
- int new_old;
- struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
-
- new_old = atomic_read(&rdp->dynticks);
- do {
- old = new_old;
- if (old & RCU_DYNTICK_CTRL_CTR)
- return false;
- new = old | RCU_DYNTICK_CTRL_MASK;
- new_old = atomic_cmpxchg(&rdp->dynticks, old, new);
- } while (new_old != old);
- return true;
+ return snap == atomic_read(&rdp->dynticks);
}
/*
*/
notrace void rcu_momentary_dyntick_idle(void)
{
- int special;
+ int seq;
raw_cpu_write(rcu_data.rcu_need_heavy_qs, false);
- special = atomic_add_return(2 * RCU_DYNTICK_CTRL_CTR,
- &this_cpu_ptr(&rcu_data)->dynticks);
+ seq = rcu_dynticks_inc(2);
/* It is illegal to call this from idle state. */
- WARN_ON_ONCE(!(special & RCU_DYNTICK_CTRL_CTR));
+ WARN_ON_ONCE(!(seq & 0x1));
rcu_preempt_deferred_qs(current);
}
EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle);
*/
jtsq = READ_ONCE(jiffies_to_sched_qs);
ruqp = per_cpu_ptr(&rcu_data.rcu_urgent_qs, rdp->cpu);
- rnhqp = &per_cpu(rcu_data.rcu_need_heavy_qs, rdp->cpu);
+ rnhqp = per_cpu_ptr(&rcu_data.rcu_need_heavy_qs, rdp->cpu);
if (!READ_ONCE(*rnhqp) &&
(time_after(jiffies, rcu_state.gp_start + jtsq * 2) ||
time_after(jiffies, rcu_state.jiffies_resched) ||
/*
* Initialize a new grace period. Return false if no grace period required.
*/
-static bool rcu_gp_init(void)
+static noinline_for_stack bool rcu_gp_init(void)
{
unsigned long firstseq;
unsigned long flags;
/*
* Loop doing repeated quiescent-state forcing until the grace period ends.
*/
-static void rcu_gp_fqs_loop(void)
+static noinline_for_stack void rcu_gp_fqs_loop(void)
{
bool first_gp_fqs;
int gf = 0;
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq,
TPS("fqswait"));
WRITE_ONCE(rcu_state.gp_state, RCU_GP_WAIT_FQS);
- ret = swait_event_idle_timeout_exclusive(
- rcu_state.gp_wq, rcu_gp_fqs_check_wake(&gf), j);
+ (void)swait_event_idle_timeout_exclusive(rcu_state.gp_wq,
+ rcu_gp_fqs_check_wake(&gf), j);
rcu_gp_torture_wait();
WRITE_ONCE(rcu_state.gp_state, RCU_GP_DOING_FQS);
/* Locking provides needed memory barriers. */
WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
/* Adjust any no-longer-needed kthreads. */
rcu_boost_kthread_setaffinity(rnp, -1);
- /* Do any needed no-CB deferred wakeups from this CPU. */
- do_nocb_deferred_wakeup(per_cpu_ptr(&rcu_data, cpu));
-
// Stop-machine done, so allow nohz_full to disable tick.
tick_dep_clear(TICK_DEP_BIT_RCU);
return 0;
*/
init_completion(&rcu_state.barrier_completion);
atomic_set(&rcu_state.barrier_cpu_count, 2);
- get_online_cpus();
+ cpus_read_lock();
/*
* Force each CPU with callbacks to register a new callback.
rcu_state.barrier_sequence);
}
}
- put_online_cpus();
+ cpus_read_unlock();
/*
* Now that we have an rcu_barrier_callback() callback on each
#include "tree_stall.h"
#include "tree_exp.h"
+#include "tree_nocb.h"
#include "tree_plugin.h"
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Read-Copy Update mechanism for mutual exclusion (tree-based version)
+ * Internal non-public definitions that provide either classic
+ * or preemptible semantics.
+ *
+ * Copyright Red Hat, 2009
+ * Copyright IBM Corporation, 2009
+ * Copyright SUSE, 2021
+ *
+ * Author: Ingo Molnar <mingo@elte.hu>
+ * Paul E. McKenney <paulmck@linux.ibm.com>
+ * Frederic Weisbecker <frederic@kernel.org>
+ */
+
+#ifdef CONFIG_RCU_NOCB_CPU
+static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
+static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */
+static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
+{
+ return lockdep_is_held(&rdp->nocb_lock);
+}
+
+static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
+{
+ /* Race on early boot between thread creation and assignment */
+ if (!rdp->nocb_cb_kthread || !rdp->nocb_gp_kthread)
+ return true;
+
+ if (current == rdp->nocb_cb_kthread || current == rdp->nocb_gp_kthread)
+ if (in_task())
+ return true;
+ return false;
+}
+
+/*
+ * Offload callback processing from the boot-time-specified set of CPUs
+ * specified by rcu_nocb_mask. For the CPUs in the set, there are kthreads
+ * created that pull the callbacks from the corresponding CPU, wait for
+ * a grace period to elapse, and invoke the callbacks. These kthreads
+ * are organized into GP kthreads, which manage incoming callbacks, wait for
+ * grace periods, and awaken CB kthreads, and the CB kthreads, which only
+ * invoke callbacks. Each GP kthread invokes its own CBs. The no-CBs CPUs
+ * do a wake_up() on their GP kthread when they insert a callback into any
+ * empty list, unless the rcu_nocb_poll boot parameter has been specified,
+ * in which case each kthread actively polls its CPU. (Which isn't so great
+ * for energy efficiency, but which does reduce RCU's overhead on that CPU.)
+ *
+ * This is intended to be used in conjunction with Frederic Weisbecker's
+ * adaptive-idle work, which would seriously reduce OS jitter on CPUs
+ * running CPU-bound user-mode computations.
+ *
+ * Offloading of callbacks can also be used as an energy-efficiency
+ * measure because CPUs with no RCU callbacks queued are more aggressive
+ * about entering dyntick-idle mode.
+ */
+
+
+/*
+ * Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters.
+ * If the list is invalid, a warning is emitted and all CPUs are offloaded.
+ */
+static int __init rcu_nocb_setup(char *str)
+{
+ alloc_bootmem_cpumask_var(&rcu_nocb_mask);
+ if (cpulist_parse(str, rcu_nocb_mask)) {
+ pr_warn("rcu_nocbs= bad CPU range, all CPUs set\n");
+ cpumask_setall(rcu_nocb_mask);
+ }
+ return 1;
+}
+__setup("rcu_nocbs=", rcu_nocb_setup);
+
+static int __init parse_rcu_nocb_poll(char *arg)
+{
+ rcu_nocb_poll = true;
+ return 0;
+}
+early_param("rcu_nocb_poll", parse_rcu_nocb_poll);
+
+/*
+ * Don't bother bypassing ->cblist if the call_rcu() rate is low.
+ * After all, the main point of bypassing is to avoid lock contention
+ * on ->nocb_lock, which only can happen at high call_rcu() rates.
+ */
+static int nocb_nobypass_lim_per_jiffy = 16 * 1000 / HZ;
+module_param(nocb_nobypass_lim_per_jiffy, int, 0);
+
+/*
+ * Acquire the specified rcu_data structure's ->nocb_bypass_lock. If the
+ * lock isn't immediately available, increment ->nocb_lock_contended to
+ * flag the contention.
+ */
+static void rcu_nocb_bypass_lock(struct rcu_data *rdp)
+ __acquires(&rdp->nocb_bypass_lock)
+{
+ lockdep_assert_irqs_disabled();
+ if (raw_spin_trylock(&rdp->nocb_bypass_lock))
+ return;
+ atomic_inc(&rdp->nocb_lock_contended);
+ WARN_ON_ONCE(smp_processor_id() != rdp->cpu);
+ smp_mb__after_atomic(); /* atomic_inc() before lock. */
+ raw_spin_lock(&rdp->nocb_bypass_lock);
+ smp_mb__before_atomic(); /* atomic_dec() after lock. */
+ atomic_dec(&rdp->nocb_lock_contended);
+}
+
+/*
+ * Spinwait until the specified rcu_data structure's ->nocb_lock is
+ * not contended. Please note that this is extremely special-purpose,
+ * relying on the fact that at most two kthreads and one CPU contend for
+ * this lock, and also that the two kthreads are guaranteed to have frequent
+ * grace-period-duration time intervals between successive acquisitions
+ * of the lock. This allows us to use an extremely simple throttling
+ * mechanism, and further to apply it only to the CPU doing floods of
+ * call_rcu() invocations. Don't try this at home!
+ */
+static void rcu_nocb_wait_contended(struct rcu_data *rdp)
+{
+ WARN_ON_ONCE(smp_processor_id() != rdp->cpu);
+ while (WARN_ON_ONCE(atomic_read(&rdp->nocb_lock_contended)))
+ cpu_relax();
+}
+
+/*
+ * Conditionally acquire the specified rcu_data structure's
+ * ->nocb_bypass_lock.
+ */
+static bool rcu_nocb_bypass_trylock(struct rcu_data *rdp)
+{
+ lockdep_assert_irqs_disabled();
+ return raw_spin_trylock(&rdp->nocb_bypass_lock);
+}
+
+/*
+ * Release the specified rcu_data structure's ->nocb_bypass_lock.
+ */
+static void rcu_nocb_bypass_unlock(struct rcu_data *rdp)
+ __releases(&rdp->nocb_bypass_lock)
+{
+ lockdep_assert_irqs_disabled();
+ raw_spin_unlock(&rdp->nocb_bypass_lock);
+}
+
+/*
+ * Acquire the specified rcu_data structure's ->nocb_lock, but only
+ * if it corresponds to a no-CBs CPU.
+ */
+static void rcu_nocb_lock(struct rcu_data *rdp)
+{
+ lockdep_assert_irqs_disabled();
+ if (!rcu_rdp_is_offloaded(rdp))
+ return;
+ raw_spin_lock(&rdp->nocb_lock);
+}
+
+/*
+ * Release the specified rcu_data structure's ->nocb_lock, but only
+ * if it corresponds to a no-CBs CPU.
+ */
+static void rcu_nocb_unlock(struct rcu_data *rdp)
+{
+ if (rcu_rdp_is_offloaded(rdp)) {
+ lockdep_assert_irqs_disabled();
+ raw_spin_unlock(&rdp->nocb_lock);
+ }
+}
+
+/*
+ * Release the specified rcu_data structure's ->nocb_lock and restore
+ * interrupts, but only if it corresponds to a no-CBs CPU.
+ */
+static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp,
+ unsigned long flags)
+{
+ if (rcu_rdp_is_offloaded(rdp)) {
+ lockdep_assert_irqs_disabled();
+ raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
+ } else {
+ local_irq_restore(flags);
+ }
+}
+
+/* Lockdep check that ->cblist may be safely accessed. */
+static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
+{
+ lockdep_assert_irqs_disabled();
+ if (rcu_rdp_is_offloaded(rdp))
+ lockdep_assert_held(&rdp->nocb_lock);
+}
+
+/*
+ * Wake up any no-CBs CPUs' kthreads that were waiting on the just-ended
+ * grace period.
+ */
+static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
+{
+ swake_up_all(sq);
+}
+
+static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp)
+{
+ return &rnp->nocb_gp_wq[rcu_seq_ctr(rnp->gp_seq) & 0x1];
+}
+
+static void rcu_init_one_nocb(struct rcu_node *rnp)
+{
+ init_swait_queue_head(&rnp->nocb_gp_wq[0]);
+ init_swait_queue_head(&rnp->nocb_gp_wq[1]);
+}
+
+/* Is the specified CPU a no-CBs CPU? */
+bool rcu_is_nocb_cpu(int cpu)
+{
+ if (cpumask_available(rcu_nocb_mask))
+ return cpumask_test_cpu(cpu, rcu_nocb_mask);
+ return false;
+}
+
+static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
+ struct rcu_data *rdp,
+ bool force, unsigned long flags)
+ __releases(rdp_gp->nocb_gp_lock)
+{
+ bool needwake = false;
+
+ if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) {
+ raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("AlreadyAwake"));
+ return false;
+ }
+
+ if (rdp_gp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
+ WRITE_ONCE(rdp_gp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
+ del_timer(&rdp_gp->nocb_timer);
+ }
+
+ if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
+ WRITE_ONCE(rdp_gp->nocb_gp_sleep, false);
+ needwake = true;
+ }
+ raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+ if (needwake) {
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
+ wake_up_process(rdp_gp->nocb_gp_kthread);
+ }
+
+ return needwake;
+}
+
+/*
+ * Kick the GP kthread for this NOCB group.
+ */
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
+{
+ unsigned long flags;
+ struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+
+ raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+ return __wake_nocb_gp(rdp_gp, rdp, force, flags);
+}
+
+/*
+ * Arrange to wake the GP kthread for this NOCB group at some future
+ * time when it is safe to do so.
+ */
+static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
+ const char *reason)
+{
+ unsigned long flags;
+ struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+
+ raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+
+ /*
+ * Bypass wakeup overrides previous deferments. In case
+ * of callback storm, no need to wake up too early.
+ */
+ if (waketype == RCU_NOCB_WAKE_BYPASS) {
+ mod_timer(&rdp_gp->nocb_timer, jiffies + 2);
+ WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
+ } else {
+ if (rdp_gp->nocb_defer_wakeup < RCU_NOCB_WAKE)
+ mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
+ if (rdp_gp->nocb_defer_wakeup < waketype)
+ WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
+ }
+
+ raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, reason);
+}
+
+/*
+ * Flush the ->nocb_bypass queue into ->cblist, enqueuing rhp if non-NULL.
+ * However, if there is a callback to be enqueued and if ->nocb_bypass
+ * proves to be initially empty, just return false because the no-CB GP
+ * kthread may need to be awakened in this case.
+ *
+ * Note that this function always returns true if rhp is NULL.
+ */
+static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
+ unsigned long j)
+{
+ struct rcu_cblist rcl;
+
+ WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
+ rcu_lockdep_assert_cblist_protected(rdp);
+ lockdep_assert_held(&rdp->nocb_bypass_lock);
+ if (rhp && !rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
+ raw_spin_unlock(&rdp->nocb_bypass_lock);
+ return false;
+ }
+ /* Note: ->cblist.len already accounts for ->nocb_bypass contents. */
+ if (rhp)
+ rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
+ rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
+ rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
+ WRITE_ONCE(rdp->nocb_bypass_first, j);
+ rcu_nocb_bypass_unlock(rdp);
+ return true;
+}
+
+/*
+ * Flush the ->nocb_bypass queue into ->cblist, enqueuing rhp if non-NULL.
+ * However, if there is a callback to be enqueued and if ->nocb_bypass
+ * proves to be initially empty, just return false because the no-CB GP
+ * kthread may need to be awakened in this case.
+ *
+ * Note that this function always returns true if rhp is NULL.
+ */
+static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
+ unsigned long j)
+{
+ if (!rcu_rdp_is_offloaded(rdp))
+ return true;
+ rcu_lockdep_assert_cblist_protected(rdp);
+ rcu_nocb_bypass_lock(rdp);
+ return rcu_nocb_do_flush_bypass(rdp, rhp, j);
+}
+
+/*
+ * If the ->nocb_bypass_lock is immediately available, flush the
+ * ->nocb_bypass queue into ->cblist.
+ */
+static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
+{
+ rcu_lockdep_assert_cblist_protected(rdp);
+ if (!rcu_rdp_is_offloaded(rdp) ||
+ !rcu_nocb_bypass_trylock(rdp))
+ return;
+ WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j));
+}
+
+/*
+ * See whether it is appropriate to use the ->nocb_bypass list in order
+ * to control contention on ->nocb_lock. A limited number of direct
+ * enqueues are permitted into ->cblist per jiffy. If ->nocb_bypass
+ * is non-empty, further callbacks must be placed into ->nocb_bypass,
+ * otherwise rcu_barrier() breaks. Use rcu_nocb_flush_bypass() to switch
+ * back to direct use of ->cblist. However, ->nocb_bypass should not be
+ * used if ->cblist is empty, because otherwise callbacks can be stranded
+ * on ->nocb_bypass because we cannot count on the current CPU ever again
+ * invoking call_rcu(). The general rule is that if ->nocb_bypass is
+ * non-empty, the corresponding no-CBs grace-period kthread must not be
+ * in an indefinite sleep state.
+ *
+ * Finally, it is not permitted to use the bypass during early boot,
+ * as doing so would confuse the auto-initialization code. Besides
+ * which, there is no point in worrying about lock contention while
+ * there is only one CPU in operation.
+ */
+static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
+ bool *was_alldone, unsigned long flags)
+{
+ unsigned long c;
+ unsigned long cur_gp_seq;
+ unsigned long j = jiffies;
+ long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+
+ lockdep_assert_irqs_disabled();
+
+ // Pure softirq/rcuc based processing: no bypassing, no
+ // locking.
+ if (!rcu_rdp_is_offloaded(rdp)) {
+ *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+ return false;
+ }
+
+ // In the process of (de-)offloading: no bypassing, but
+ // locking.
+ if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
+ rcu_nocb_lock(rdp);
+ *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+ return false; /* Not offloaded, no bypassing. */
+ }
+
+ // Don't use ->nocb_bypass during early boot.
+ if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
+ rcu_nocb_lock(rdp);
+ WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+ *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+ return false;
+ }
+
+ // If we have advanced to a new jiffy, reset counts to allow
+ // moving back from ->nocb_bypass to ->cblist.
+ if (j == rdp->nocb_nobypass_last) {
+ c = rdp->nocb_nobypass_count + 1;
+ } else {
+ WRITE_ONCE(rdp->nocb_nobypass_last, j);
+ c = rdp->nocb_nobypass_count - nocb_nobypass_lim_per_jiffy;
+ if (ULONG_CMP_LT(rdp->nocb_nobypass_count,
+ nocb_nobypass_lim_per_jiffy))
+ c = 0;
+ else if (c > nocb_nobypass_lim_per_jiffy)
+ c = nocb_nobypass_lim_per_jiffy;
+ }
+ WRITE_ONCE(rdp->nocb_nobypass_count, c);
+
+ // If there hasn't yet been all that many ->cblist enqueues
+ // this jiffy, tell the caller to enqueue onto ->cblist. But flush
+ // ->nocb_bypass first.
+ if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy) {
+ rcu_nocb_lock(rdp);
+ *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+ if (*was_alldone)
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("FirstQ"));
+ WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
+ WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+ return false; // Caller must enqueue the callback.
+ }
+
+ // If ->nocb_bypass has been used too long or is too full,
+ // flush ->nocb_bypass to ->cblist.
+ if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
+ ncbs >= qhimark) {
+ rcu_nocb_lock(rdp);
+ if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
+ *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
+ if (*was_alldone)
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("FirstQ"));
+ WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+ return false; // Caller must enqueue the callback.
+ }
+ if (j != rdp->nocb_gp_adv_time &&
+ rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
+ rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
+ rcu_advance_cbs_nowake(rdp->mynode, rdp);
+ rdp->nocb_gp_adv_time = j;
+ }
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ return true; // Callback already enqueued.
+ }
+
+ // We need to use the bypass.
+ rcu_nocb_wait_contended(rdp);
+ rcu_nocb_bypass_lock(rdp);
+ ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+ rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
+ rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
+ if (!ncbs) {
+ WRITE_ONCE(rdp->nocb_bypass_first, j);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
+ }
+ rcu_nocb_bypass_unlock(rdp);
+ smp_mb(); /* Order enqueue before wake. */
+ if (ncbs) {
+ local_irq_restore(flags);
+ } else {
+ // No-CBs GP kthread might be indefinitely asleep, if so, wake.
+ rcu_nocb_lock(rdp); // Rare during call_rcu() flood.
+ if (!rcu_segcblist_pend_cbs(&rdp->cblist)) {
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("FirstBQwake"));
+ __call_rcu_nocb_wake(rdp, true, flags);
+ } else {
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("FirstBQnoWake"));
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ }
+ }
+ return true; // Callback already enqueued.
+}
+
+/*
+ * Awaken the no-CBs grace-period kthread if needed, either due to it
+ * legitimately being asleep or due to overload conditions.
+ *
+ * If warranted, also wake up the kthread servicing this CPUs queues.
+ */
+static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
+ unsigned long flags)
+ __releases(rdp->nocb_lock)
+{
+ unsigned long cur_gp_seq;
+ unsigned long j;
+ long len;
+ struct task_struct *t;
+
+ // If we are being polled or there is no kthread, just leave.
+ t = READ_ONCE(rdp->nocb_gp_kthread);
+ if (rcu_nocb_poll || !t) {
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("WakeNotPoll"));
+ return;
+ }
+ // Need to actually to a wakeup.
+ len = rcu_segcblist_n_cbs(&rdp->cblist);
+ if (was_alldone) {
+ rdp->qlen_last_fqs_check = len;
+ if (!irqs_disabled_flags(flags)) {
+ /* ... if queue was empty ... */
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ wake_nocb_gp(rdp, false);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("WakeEmpty"));
+ } else {
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE,
+ TPS("WakeEmptyIsDeferred"));
+ }
+ } else if (len > rdp->qlen_last_fqs_check + qhimark) {
+ /* ... or if many callbacks queued. */
+ rdp->qlen_last_fqs_check = len;
+ j = jiffies;
+ if (j != rdp->nocb_gp_adv_time &&
+ rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
+ rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
+ rcu_advance_cbs_nowake(rdp->mynode, rdp);
+ rdp->nocb_gp_adv_time = j;
+ }
+ smp_mb(); /* Enqueue before timer_pending(). */
+ if ((rdp->nocb_cb_sleep ||
+ !rcu_segcblist_ready_cbs(&rdp->cblist)) &&
+ !timer_pending(&rdp->nocb_timer)) {
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_FORCE,
+ TPS("WakeOvfIsDeferred"));
+ } else {
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
+ }
+ } else {
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
+ }
+ return;
+}
+
+/*
+ * Check if we ignore this rdp.
+ *
+ * We check that without holding the nocb lock but
+ * we make sure not to miss a freshly offloaded rdp
+ * with the current ordering:
+ *
+ * rdp_offload_toggle() nocb_gp_enabled_cb()
+ * ------------------------- ----------------------------
+ * WRITE flags LOCK nocb_gp_lock
+ * LOCK nocb_gp_lock READ/WRITE nocb_gp_sleep
+ * READ/WRITE nocb_gp_sleep UNLOCK nocb_gp_lock
+ * UNLOCK nocb_gp_lock READ flags
+ */
+static inline bool nocb_gp_enabled_cb(struct rcu_data *rdp)
+{
+ u8 flags = SEGCBLIST_OFFLOADED | SEGCBLIST_KTHREAD_GP;
+
+ return rcu_segcblist_test_flags(&rdp->cblist, flags);
+}
+
+static inline bool nocb_gp_update_state_deoffloading(struct rcu_data *rdp,
+ bool *needwake_state)
+{
+ struct rcu_segcblist *cblist = &rdp->cblist;
+
+ if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
+ if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) {
+ rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_GP);
+ if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
+ *needwake_state = true;
+ }
+ return false;
+ }
+
+ /*
+ * De-offloading. Clear our flag and notify the de-offload worker.
+ * We will ignore this rdp until it ever gets re-offloaded.
+ */
+ WARN_ON_ONCE(!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
+ rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP);
+ if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
+ *needwake_state = true;
+ return true;
+}
+
+
+/*
+ * No-CBs GP kthreads come here to wait for additional callbacks to show up
+ * or for grace periods to end.
+ */
+static void nocb_gp_wait(struct rcu_data *my_rdp)
+{
+ bool bypass = false;
+ long bypass_ncbs;
+ int __maybe_unused cpu = my_rdp->cpu;
+ unsigned long cur_gp_seq;
+ unsigned long flags;
+ bool gotcbs = false;
+ unsigned long j = jiffies;
+ bool needwait_gp = false; // This prevents actual uninitialized use.
+ bool needwake;
+ bool needwake_gp;
+ struct rcu_data *rdp;
+ struct rcu_node *rnp;
+ unsigned long wait_gp_seq = 0; // Suppress "use uninitialized" warning.
+ bool wasempty = false;
+
+ /*
+ * Each pass through the following loop checks for CBs and for the
+ * nearest grace period (if any) to wait for next. The CB kthreads
+ * and the global grace-period kthread are awakened if needed.
+ */
+ WARN_ON_ONCE(my_rdp->nocb_gp_rdp != my_rdp);
+ for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_cb_rdp) {
+ bool needwake_state = false;
+
+ if (!nocb_gp_enabled_cb(rdp))
+ continue;
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
+ rcu_nocb_lock_irqsave(rdp, flags);
+ if (nocb_gp_update_state_deoffloading(rdp, &needwake_state)) {
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ if (needwake_state)
+ swake_up_one(&rdp->nocb_state_wq);
+ continue;
+ }
+ bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+ if (bypass_ncbs &&
+ (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
+ bypass_ncbs > 2 * qhimark)) {
+ // Bypass full or old, so flush it.
+ (void)rcu_nocb_try_flush_bypass(rdp, j);
+ bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
+ } else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) {
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ if (needwake_state)
+ swake_up_one(&rdp->nocb_state_wq);
+ continue; /* No callbacks here, try next. */
+ }
+ if (bypass_ncbs) {
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("Bypass"));
+ bypass = true;
+ }
+ rnp = rdp->mynode;
+
+ // Advance callbacks if helpful and low contention.
+ needwake_gp = false;
+ if (!rcu_segcblist_restempty(&rdp->cblist,
+ RCU_NEXT_READY_TAIL) ||
+ (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
+ rcu_seq_done(&rnp->gp_seq, cur_gp_seq))) {
+ raw_spin_lock_rcu_node(rnp); /* irqs disabled. */
+ needwake_gp = rcu_advance_cbs(rnp, rdp);
+ wasempty = rcu_segcblist_restempty(&rdp->cblist,
+ RCU_NEXT_READY_TAIL);
+ raw_spin_unlock_rcu_node(rnp); /* irqs disabled. */
+ }
+ // Need to wait on some grace period?
+ WARN_ON_ONCE(wasempty &&
+ !rcu_segcblist_restempty(&rdp->cblist,
+ RCU_NEXT_READY_TAIL));
+ if (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq)) {
+ if (!needwait_gp ||
+ ULONG_CMP_LT(cur_gp_seq, wait_gp_seq))
+ wait_gp_seq = cur_gp_seq;
+ needwait_gp = true;
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
+ TPS("NeedWaitGP"));
+ }
+ if (rcu_segcblist_ready_cbs(&rdp->cblist)) {
+ needwake = rdp->nocb_cb_sleep;
+ WRITE_ONCE(rdp->nocb_cb_sleep, false);
+ smp_mb(); /* CB invocation -after- GP end. */
+ } else {
+ needwake = false;
+ }
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ if (needwake) {
+ swake_up_one(&rdp->nocb_cb_wq);
+ gotcbs = true;
+ }
+ if (needwake_gp)
+ rcu_gp_kthread_wake();
+ if (needwake_state)
+ swake_up_one(&rdp->nocb_state_wq);
+ }
+
+ my_rdp->nocb_gp_bypass = bypass;
+ my_rdp->nocb_gp_gp = needwait_gp;
+ my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
+
+ if (bypass && !rcu_nocb_poll) {
+ // At least one child with non-empty ->nocb_bypass, so set
+ // timer in order to avoid stranding its callbacks.
+ wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
+ TPS("WakeBypassIsDeferred"));
+ }
+ if (rcu_nocb_poll) {
+ /* Polling, so trace if first poll in the series. */
+ if (gotcbs)
+ trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Poll"));
+ schedule_timeout_idle(1);
+ } else if (!needwait_gp) {
+ /* Wait for callbacks to appear. */
+ trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Sleep"));
+ swait_event_interruptible_exclusive(my_rdp->nocb_gp_wq,
+ !READ_ONCE(my_rdp->nocb_gp_sleep));
+ trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("EndSleep"));
+ } else {
+ rnp = my_rdp->mynode;
+ trace_rcu_this_gp(rnp, my_rdp, wait_gp_seq, TPS("StartWait"));
+ swait_event_interruptible_exclusive(
+ rnp->nocb_gp_wq[rcu_seq_ctr(wait_gp_seq) & 0x1],
+ rcu_seq_done(&rnp->gp_seq, wait_gp_seq) ||
+ !READ_ONCE(my_rdp->nocb_gp_sleep));
+ trace_rcu_this_gp(rnp, my_rdp, wait_gp_seq, TPS("EndWait"));
+ }
+ if (!rcu_nocb_poll) {
+ raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
+ if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
+ WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
+ del_timer(&my_rdp->nocb_timer);
+ }
+ WRITE_ONCE(my_rdp->nocb_gp_sleep, true);
+ raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
+ }
+ my_rdp->nocb_gp_seq = -1;
+ WARN_ON(signal_pending(current));
+}
+
+/*
+ * No-CBs grace-period-wait kthread. There is one of these per group
+ * of CPUs, but only once at least one CPU in that group has come online
+ * at least once since boot. This kthread checks for newly posted
+ * callbacks from any of the CPUs it is responsible for, waits for a
+ * grace period, then awakens all of the rcu_nocb_cb_kthread() instances
+ * that then have callback-invocation work to do.
+ */
+static int rcu_nocb_gp_kthread(void *arg)
+{
+ struct rcu_data *rdp = arg;
+
+ for (;;) {
+ WRITE_ONCE(rdp->nocb_gp_loops, rdp->nocb_gp_loops + 1);
+ nocb_gp_wait(rdp);
+ cond_resched_tasks_rcu_qs();
+ }
+ return 0;
+}
+
+static inline bool nocb_cb_can_run(struct rcu_data *rdp)
+{
+ u8 flags = SEGCBLIST_OFFLOADED | SEGCBLIST_KTHREAD_CB;
+ return rcu_segcblist_test_flags(&rdp->cblist, flags);
+}
+
+static inline bool nocb_cb_wait_cond(struct rcu_data *rdp)
+{
+ return nocb_cb_can_run(rdp) && !READ_ONCE(rdp->nocb_cb_sleep);
+}
+
+/*
+ * Invoke any ready callbacks from the corresponding no-CBs CPU,
+ * then, if there are no more, wait for more to appear.
+ */
+static void nocb_cb_wait(struct rcu_data *rdp)
+{
+ struct rcu_segcblist *cblist = &rdp->cblist;
+ unsigned long cur_gp_seq;
+ unsigned long flags;
+ bool needwake_state = false;
+ bool needwake_gp = false;
+ bool can_sleep = true;
+ struct rcu_node *rnp = rdp->mynode;
+
+ local_irq_save(flags);
+ rcu_momentary_dyntick_idle();
+ local_irq_restore(flags);
+ /*
+ * Disable BH to provide the expected environment. Also, when
+ * transitioning to/from NOCB mode, a self-requeuing callback might
+ * be invoked from softirq. A short grace period could cause both
+ * instances of this callback would execute concurrently.
+ */
+ local_bh_disable();
+ rcu_do_batch(rdp);
+ local_bh_enable();
+ lockdep_assert_irqs_enabled();
+ rcu_nocb_lock_irqsave(rdp, flags);
+ if (rcu_segcblist_nextgp(cblist, &cur_gp_seq) &&
+ rcu_seq_done(&rnp->gp_seq, cur_gp_seq) &&
+ raw_spin_trylock_rcu_node(rnp)) { /* irqs already disabled. */
+ needwake_gp = rcu_advance_cbs(rdp->mynode, rdp);
+ raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
+ }
+
+ if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
+ if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB)) {
+ rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_CB);
+ if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP))
+ needwake_state = true;
+ }
+ if (rcu_segcblist_ready_cbs(cblist))
+ can_sleep = false;
+ } else {
+ /*
+ * De-offloading. Clear our flag and notify the de-offload worker.
+ * We won't touch the callbacks and keep sleeping until we ever
+ * get re-offloaded.
+ */
+ WARN_ON_ONCE(!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB));
+ rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_CB);
+ if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP))
+ needwake_state = true;
+ }
+
+ WRITE_ONCE(rdp->nocb_cb_sleep, can_sleep);
+
+ if (rdp->nocb_cb_sleep)
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep"));
+
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ if (needwake_gp)
+ rcu_gp_kthread_wake();
+
+ if (needwake_state)
+ swake_up_one(&rdp->nocb_state_wq);
+
+ do {
+ swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
+ nocb_cb_wait_cond(rdp));
+
+ // VVV Ensure CB invocation follows _sleep test.
+ if (smp_load_acquire(&rdp->nocb_cb_sleep)) { // ^^^
+ WARN_ON(signal_pending(current));
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
+ }
+ } while (!nocb_cb_can_run(rdp));
+}
+
+/*
+ * Per-rcu_data kthread, but only for no-CBs CPUs. Repeatedly invoke
+ * nocb_cb_wait() to do the dirty work.
+ */
+static int rcu_nocb_cb_kthread(void *arg)
+{
+ struct rcu_data *rdp = arg;
+
+ // Each pass through this loop does one callback batch, and,
+ // if there are no more ready callbacks, waits for them.
+ for (;;) {
+ nocb_cb_wait(rdp);
+ cond_resched_tasks_rcu_qs();
+ }
+ return 0;
+}
+
+/* Is a deferred wakeup of rcu_nocb_kthread() required? */
+static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level)
+{
+ return READ_ONCE(rdp->nocb_defer_wakeup) >= level;
+}
+
+/* Do a deferred wakeup of rcu_nocb_kthread(). */
+static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp_gp,
+ struct rcu_data *rdp, int level,
+ unsigned long flags)
+ __releases(rdp_gp->nocb_gp_lock)
+{
+ int ndw;
+ int ret;
+
+ if (!rcu_nocb_need_deferred_wakeup(rdp_gp, level)) {
+ raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+ return false;
+ }
+
+ ndw = rdp_gp->nocb_defer_wakeup;
+ ret = __wake_nocb_gp(rdp_gp, rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
+
+ return ret;
+}
+
+/* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */
+static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
+{
+ unsigned long flags;
+ struct rcu_data *rdp = from_timer(rdp, t, nocb_timer);
+
+ WARN_ON_ONCE(rdp->nocb_gp_rdp != rdp);
+ trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer"));
+
+ raw_spin_lock_irqsave(&rdp->nocb_gp_lock, flags);
+ smp_mb__after_spinlock(); /* Timer expire before wakeup. */
+ do_nocb_deferred_wakeup_common(rdp, rdp, RCU_NOCB_WAKE_BYPASS, flags);
+}
+
+/*
+ * Do a deferred wakeup of rcu_nocb_kthread() from fastpath.
+ * This means we do an inexact common-case check. Note that if
+ * we miss, ->nocb_timer will eventually clean things up.
+ */
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
+{
+ unsigned long flags;
+ struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+
+ if (!rdp_gp || !rcu_nocb_need_deferred_wakeup(rdp_gp, RCU_NOCB_WAKE))
+ return false;
+
+ raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+ return do_nocb_deferred_wakeup_common(rdp_gp, rdp, RCU_NOCB_WAKE, flags);
+}
+
+void rcu_nocb_flush_deferred_wakeup(void)
+{
+ do_nocb_deferred_wakeup(this_cpu_ptr(&rcu_data));
+}
+EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup);
+
+static int rdp_offload_toggle(struct rcu_data *rdp,
+ bool offload, unsigned long flags)
+ __releases(rdp->nocb_lock)
+{
+ struct rcu_segcblist *cblist = &rdp->cblist;
+ struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
+ bool wake_gp = false;
+
+ rcu_segcblist_offload(cblist, offload);
+
+ if (rdp->nocb_cb_sleep)
+ rdp->nocb_cb_sleep = false;
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+
+ /*
+ * Ignore former value of nocb_cb_sleep and force wake up as it could
+ * have been spuriously set to false already.
+ */
+ swake_up_one(&rdp->nocb_cb_wq);
+
+ raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+ if (rdp_gp->nocb_gp_sleep) {
+ rdp_gp->nocb_gp_sleep = false;
+ wake_gp = true;
+ }
+ raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+
+ if (wake_gp)
+ wake_up_process(rdp_gp->nocb_gp_kthread);
+
+ return 0;
+}
+
+static long rcu_nocb_rdp_deoffload(void *arg)
+{
+ struct rcu_data *rdp = arg;
+ struct rcu_segcblist *cblist = &rdp->cblist;
+ unsigned long flags;
+ int ret;
+
+ WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
+
+ pr_info("De-offloading %d\n", rdp->cpu);
+
+ rcu_nocb_lock_irqsave(rdp, flags);
+ /*
+ * Flush once and for all now. This suffices because we are
+ * running on the target CPU holding ->nocb_lock (thus having
+ * interrupts disabled), and because rdp_offload_toggle()
+ * invokes rcu_segcblist_offload(), which clears SEGCBLIST_OFFLOADED.
+ * Thus future calls to rcu_segcblist_completely_offloaded() will
+ * return false, which means that future calls to rcu_nocb_try_bypass()
+ * will refuse to put anything into the bypass.
+ */
+ WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
+ ret = rdp_offload_toggle(rdp, false, flags);
+ swait_event_exclusive(rdp->nocb_state_wq,
+ !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
+ SEGCBLIST_KTHREAD_GP));
+ /*
+ * Lock one last time to acquire latest callback updates from kthreads
+ * so we can later handle callbacks locally without locking.
+ */
+ rcu_nocb_lock_irqsave(rdp, flags);
+ /*
+ * Theoretically we could set SEGCBLIST_SOFTIRQ_ONLY after the nocb
+ * lock is released but how about being paranoid for once?
+ */
+ rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);
+ /*
+ * With SEGCBLIST_SOFTIRQ_ONLY, we can't use
+ * rcu_nocb_unlock_irqrestore() anymore.
+ */
+ raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
+
+ /* Sanity check */
+ WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
+
+
+ return ret;
+}
+
+int rcu_nocb_cpu_deoffload(int cpu)
+{
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+ int ret = 0;
+
+ mutex_lock(&rcu_state.barrier_mutex);
+ cpus_read_lock();
+ if (rcu_rdp_is_offloaded(rdp)) {
+ if (cpu_online(cpu)) {
+ ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
+ if (!ret)
+ cpumask_clear_cpu(cpu, rcu_nocb_mask);
+ } else {
+ pr_info("NOCB: Can't CB-deoffload an offline CPU\n");
+ ret = -EINVAL;
+ }
+ }
+ cpus_read_unlock();
+ mutex_unlock(&rcu_state.barrier_mutex);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
+
+static long rcu_nocb_rdp_offload(void *arg)
+{
+ struct rcu_data *rdp = arg;
+ struct rcu_segcblist *cblist = &rdp->cblist;
+ unsigned long flags;
+ int ret;
+
+ WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
+ /*
+ * For now we only support re-offload, ie: the rdp must have been
+ * offloaded on boot first.
+ */
+ if (!rdp->nocb_gp_rdp)
+ return -EINVAL;
+
+ pr_info("Offloading %d\n", rdp->cpu);
+ /*
+ * Can't use rcu_nocb_lock_irqsave() while we are in
+ * SEGCBLIST_SOFTIRQ_ONLY mode.
+ */
+ raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+
+ /*
+ * We didn't take the nocb lock while working on the
+ * rdp->cblist in SEGCBLIST_SOFTIRQ_ONLY mode.
+ * Every modifications that have been done previously on
+ * rdp->cblist must be visible remotely by the nocb kthreads
+ * upon wake up after reading the cblist flags.
+ *
+ * The layout against nocb_lock enforces that ordering:
+ *
+ * __rcu_nocb_rdp_offload() nocb_cb_wait()/nocb_gp_wait()
+ * ------------------------- ----------------------------
+ * WRITE callbacks rcu_nocb_lock()
+ * rcu_nocb_lock() READ flags
+ * WRITE flags READ callbacks
+ * rcu_nocb_unlock() rcu_nocb_unlock()
+ */
+ ret = rdp_offload_toggle(rdp, true, flags);
+ swait_event_exclusive(rdp->nocb_state_wq,
+ rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB) &&
+ rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
+
+ return ret;
+}
+
+int rcu_nocb_cpu_offload(int cpu)
+{
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+ int ret = 0;
+
+ mutex_lock(&rcu_state.barrier_mutex);
+ cpus_read_lock();
+ if (!rcu_rdp_is_offloaded(rdp)) {
+ if (cpu_online(cpu)) {
+ ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
+ if (!ret)
+ cpumask_set_cpu(cpu, rcu_nocb_mask);
+ } else {
+ pr_info("NOCB: Can't CB-offload an offline CPU\n");
+ ret = -EINVAL;
+ }
+ }
+ cpus_read_unlock();
+ mutex_unlock(&rcu_state.barrier_mutex);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
+
+void __init rcu_init_nohz(void)
+{
+ int cpu;
+ bool need_rcu_nocb_mask = false;
+ struct rcu_data *rdp;
+
+#if defined(CONFIG_NO_HZ_FULL)
+ if (tick_nohz_full_running && cpumask_weight(tick_nohz_full_mask))
+ need_rcu_nocb_mask = true;
+#endif /* #if defined(CONFIG_NO_HZ_FULL) */
+
+ if (!cpumask_available(rcu_nocb_mask) && need_rcu_nocb_mask) {
+ if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) {
+ pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n");
+ return;
+ }
+ }
+ if (!cpumask_available(rcu_nocb_mask))
+ return;
+
+#if defined(CONFIG_NO_HZ_FULL)
+ if (tick_nohz_full_running)
+ cpumask_or(rcu_nocb_mask, rcu_nocb_mask, tick_nohz_full_mask);
+#endif /* #if defined(CONFIG_NO_HZ_FULL) */
+
+ if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
+ pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
+ cpumask_and(rcu_nocb_mask, cpu_possible_mask,
+ rcu_nocb_mask);
+ }
+ if (cpumask_empty(rcu_nocb_mask))
+ pr_info("\tOffload RCU callbacks from CPUs: (none).\n");
+ else
+ pr_info("\tOffload RCU callbacks from CPUs: %*pbl.\n",
+ cpumask_pr_args(rcu_nocb_mask));
+ if (rcu_nocb_poll)
+ pr_info("\tPoll for callbacks from no-CBs CPUs.\n");
+
+ for_each_cpu(cpu, rcu_nocb_mask) {
+ rdp = per_cpu_ptr(&rcu_data, cpu);
+ if (rcu_segcblist_empty(&rdp->cblist))
+ rcu_segcblist_init(&rdp->cblist);
+ rcu_segcblist_offload(&rdp->cblist, true);
+ rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_CB);
+ rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP);
+ }
+ rcu_organize_nocb_kthreads();
+}
+
+/* Initialize per-rcu_data variables for no-CBs CPUs. */
+static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
+{
+ init_swait_queue_head(&rdp->nocb_cb_wq);
+ init_swait_queue_head(&rdp->nocb_gp_wq);
+ init_swait_queue_head(&rdp->nocb_state_wq);
+ raw_spin_lock_init(&rdp->nocb_lock);
+ raw_spin_lock_init(&rdp->nocb_bypass_lock);
+ raw_spin_lock_init(&rdp->nocb_gp_lock);
+ timer_setup(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer, 0);
+ rcu_cblist_init(&rdp->nocb_bypass);
+}
+
+/*
+ * If the specified CPU is a no-CBs CPU that does not already have its
+ * rcuo CB kthread, spawn it. Additionally, if the rcuo GP kthread
+ * for this CPU's group has not yet been created, spawn it as well.
+ */
+static void rcu_spawn_one_nocb_kthread(int cpu)
+{
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+ struct rcu_data *rdp_gp;
+ struct task_struct *t;
+
+ /*
+ * If this isn't a no-CBs CPU or if it already has an rcuo kthread,
+ * then nothing to do.
+ */
+ if (!rcu_is_nocb_cpu(cpu) || rdp->nocb_cb_kthread)
+ return;
+
+ /* If we didn't spawn the GP kthread first, reorganize! */
+ rdp_gp = rdp->nocb_gp_rdp;
+ if (!rdp_gp->nocb_gp_kthread) {
+ t = kthread_run(rcu_nocb_gp_kthread, rdp_gp,
+ "rcuog/%d", rdp_gp->cpu);
+ if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__))
+ return;
+ WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
+ }
+
+ /* Spawn the kthread for this CPU. */
+ t = kthread_run(rcu_nocb_cb_kthread, rdp,
+ "rcuo%c/%d", rcu_state.abbr, cpu);
+ if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
+ return;
+ WRITE_ONCE(rdp->nocb_cb_kthread, t);
+ WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
+}
+
+/*
+ * If the specified CPU is a no-CBs CPU that does not already have its
+ * rcuo kthread, spawn it.
+ */
+static void rcu_spawn_cpu_nocb_kthread(int cpu)
+{
+ if (rcu_scheduler_fully_active)
+ rcu_spawn_one_nocb_kthread(cpu);
+}
+
+/*
+ * Once the scheduler is running, spawn rcuo kthreads for all online
+ * no-CBs CPUs. This assumes that the early_initcall()s happen before
+ * non-boot CPUs come online -- if this changes, we will need to add
+ * some mutual exclusion.
+ */
+static void __init rcu_spawn_nocb_kthreads(void)
+{
+ int cpu;
+
+ for_each_online_cpu(cpu)
+ rcu_spawn_cpu_nocb_kthread(cpu);
+}
+
+/* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */
+static int rcu_nocb_gp_stride = -1;
+module_param(rcu_nocb_gp_stride, int, 0444);
+
+/*
+ * Initialize GP-CB relationships for all no-CBs CPU.
+ */
+static void __init rcu_organize_nocb_kthreads(void)
+{
+ int cpu;
+ bool firsttime = true;
+ bool gotnocbs = false;
+ bool gotnocbscbs = true;
+ int ls = rcu_nocb_gp_stride;
+ int nl = 0; /* Next GP kthread. */
+ struct rcu_data *rdp;
+ struct rcu_data *rdp_gp = NULL; /* Suppress misguided gcc warn. */
+ struct rcu_data *rdp_prev = NULL;
+
+ if (!cpumask_available(rcu_nocb_mask))
+ return;
+ if (ls == -1) {
+ ls = nr_cpu_ids / int_sqrt(nr_cpu_ids);
+ rcu_nocb_gp_stride = ls;
+ }
+
+ /*
+ * Each pass through this loop sets up one rcu_data structure.
+ * Should the corresponding CPU come online in the future, then
+ * we will spawn the needed set of rcu_nocb_kthread() kthreads.
+ */
+ for_each_cpu(cpu, rcu_nocb_mask) {
+ rdp = per_cpu_ptr(&rcu_data, cpu);
+ if (rdp->cpu >= nl) {
+ /* New GP kthread, set up for CBs & next GP. */
+ gotnocbs = true;
+ nl = DIV_ROUND_UP(rdp->cpu + 1, ls) * ls;
+ rdp->nocb_gp_rdp = rdp;
+ rdp_gp = rdp;
+ if (dump_tree) {
+ if (!firsttime)
+ pr_cont("%s\n", gotnocbscbs
+ ? "" : " (self only)");
+ gotnocbscbs = false;
+ firsttime = false;
+ pr_alert("%s: No-CB GP kthread CPU %d:",
+ __func__, cpu);
+ }
+ } else {
+ /* Another CB kthread, link to previous GP kthread. */
+ gotnocbscbs = true;
+ rdp->nocb_gp_rdp = rdp_gp;
+ rdp_prev->nocb_next_cb_rdp = rdp;
+ if (dump_tree)
+ pr_cont(" %d", cpu);
+ }
+ rdp_prev = rdp;
+ }
+ if (gotnocbs && dump_tree)
+ pr_cont("%s\n", gotnocbscbs ? "" : " (self only)");
+}
+
+/*
+ * Bind the current task to the offloaded CPUs. If there are no offloaded
+ * CPUs, leave the task unbound. Splat if the bind attempt fails.
+ */
+void rcu_bind_current_to_nocb(void)
+{
+ if (cpumask_available(rcu_nocb_mask) && cpumask_weight(rcu_nocb_mask))
+ WARN_ON(sched_setaffinity(current->pid, rcu_nocb_mask));
+}
+EXPORT_SYMBOL_GPL(rcu_bind_current_to_nocb);
+
+// The ->on_cpu field is available only in CONFIG_SMP=y, so...
+#ifdef CONFIG_SMP
+static char *show_rcu_should_be_on_cpu(struct task_struct *tsp)
+{
+ return tsp && task_is_running(tsp) && !tsp->on_cpu ? "!" : "";
+}
+#else // #ifdef CONFIG_SMP
+static char *show_rcu_should_be_on_cpu(struct task_struct *tsp)
+{
+ return "";
+}
+#endif // #else #ifdef CONFIG_SMP
+
+/*
+ * Dump out nocb grace-period kthread state for the specified rcu_data
+ * structure.
+ */
+static void show_rcu_nocb_gp_state(struct rcu_data *rdp)
+{
+ struct rcu_node *rnp = rdp->mynode;
+
+ pr_info("nocb GP %d %c%c%c%c%c %c[%c%c] %c%c:%ld rnp %d:%d %lu %c CPU %d%s\n",
+ rdp->cpu,
+ "kK"[!!rdp->nocb_gp_kthread],
+ "lL"[raw_spin_is_locked(&rdp->nocb_gp_lock)],
+ "dD"[!!rdp->nocb_defer_wakeup],
+ "tT"[timer_pending(&rdp->nocb_timer)],
+ "sS"[!!rdp->nocb_gp_sleep],
+ ".W"[swait_active(&rdp->nocb_gp_wq)],
+ ".W"[swait_active(&rnp->nocb_gp_wq[0])],
+ ".W"[swait_active(&rnp->nocb_gp_wq[1])],
+ ".B"[!!rdp->nocb_gp_bypass],
+ ".G"[!!rdp->nocb_gp_gp],
+ (long)rdp->nocb_gp_seq,
+ rnp->grplo, rnp->grphi, READ_ONCE(rdp->nocb_gp_loops),
+ rdp->nocb_gp_kthread ? task_state_to_char(rdp->nocb_gp_kthread) : '.',
+ rdp->nocb_cb_kthread ? (int)task_cpu(rdp->nocb_gp_kthread) : -1,
+ show_rcu_should_be_on_cpu(rdp->nocb_cb_kthread));
+}
+
+/* Dump out nocb kthread state for the specified rcu_data structure. */
+static void show_rcu_nocb_state(struct rcu_data *rdp)
+{
+ char bufw[20];
+ char bufr[20];
+ struct rcu_segcblist *rsclp = &rdp->cblist;
+ bool waslocked;
+ bool wassleep;
+
+ if (rdp->nocb_gp_rdp == rdp)
+ show_rcu_nocb_gp_state(rdp);
+
+ sprintf(bufw, "%ld", rsclp->gp_seq[RCU_WAIT_TAIL]);
+ sprintf(bufr, "%ld", rsclp->gp_seq[RCU_NEXT_READY_TAIL]);
+ pr_info(" CB %d^%d->%d %c%c%c%c%c%c F%ld L%ld C%d %c%c%s%c%s%c%c q%ld %c CPU %d%s\n",
+ rdp->cpu, rdp->nocb_gp_rdp->cpu,
+ rdp->nocb_next_cb_rdp ? rdp->nocb_next_cb_rdp->cpu : -1,
+ "kK"[!!rdp->nocb_cb_kthread],
+ "bB"[raw_spin_is_locked(&rdp->nocb_bypass_lock)],
+ "cC"[!!atomic_read(&rdp->nocb_lock_contended)],
+ "lL"[raw_spin_is_locked(&rdp->nocb_lock)],
+ "sS"[!!rdp->nocb_cb_sleep],
+ ".W"[swait_active(&rdp->nocb_cb_wq)],
+ jiffies - rdp->nocb_bypass_first,
+ jiffies - rdp->nocb_nobypass_last,
+ rdp->nocb_nobypass_count,
+ ".D"[rcu_segcblist_ready_cbs(rsclp)],
+ ".W"[!rcu_segcblist_segempty(rsclp, RCU_WAIT_TAIL)],
+ rcu_segcblist_segempty(rsclp, RCU_WAIT_TAIL) ? "" : bufw,
+ ".R"[!rcu_segcblist_segempty(rsclp, RCU_NEXT_READY_TAIL)],
+ rcu_segcblist_segempty(rsclp, RCU_NEXT_READY_TAIL) ? "" : bufr,
+ ".N"[!rcu_segcblist_segempty(rsclp, RCU_NEXT_TAIL)],
+ ".B"[!!rcu_cblist_n_cbs(&rdp->nocb_bypass)],
+ rcu_segcblist_n_cbs(&rdp->cblist),
+ rdp->nocb_cb_kthread ? task_state_to_char(rdp->nocb_cb_kthread) : '.',
+ rdp->nocb_cb_kthread ? (int)task_cpu(rdp->nocb_gp_kthread) : -1,
+ show_rcu_should_be_on_cpu(rdp->nocb_cb_kthread));
+
+ /* It is OK for GP kthreads to have GP state. */
+ if (rdp->nocb_gp_rdp == rdp)
+ return;
+
+ waslocked = raw_spin_is_locked(&rdp->nocb_gp_lock);
+ wassleep = swait_active(&rdp->nocb_gp_wq);
+ if (!rdp->nocb_gp_sleep && !waslocked && !wassleep)
+ return; /* Nothing untoward. */
+
+ pr_info(" nocb GP activity on CB-only CPU!!! %c%c%c %c\n",
+ "lL"[waslocked],
+ "dD"[!!rdp->nocb_defer_wakeup],
+ "sS"[!!rdp->nocb_gp_sleep],
+ ".W"[wassleep]);
+}
+
+#else /* #ifdef CONFIG_RCU_NOCB_CPU */
+
+static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
+{
+ return 0;
+}
+
+static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
+{
+ return false;
+}
+
+/* No ->nocb_lock to acquire. */
+static void rcu_nocb_lock(struct rcu_data *rdp)
+{
+}
+
+/* No ->nocb_lock to release. */
+static void rcu_nocb_unlock(struct rcu_data *rdp)
+{
+}
+
+/* No ->nocb_lock to release. */
+static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp,
+ unsigned long flags)
+{
+ local_irq_restore(flags);
+}
+
+/* Lockdep check that ->cblist may be safely accessed. */
+static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
+{
+ lockdep_assert_irqs_disabled();
+}
+
+static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
+{
+}
+
+static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp)
+{
+ return NULL;
+}
+
+static void rcu_init_one_nocb(struct rcu_node *rnp)
+{
+}
+
+static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
+ unsigned long j)
+{
+ return true;
+}
+
+static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
+ bool *was_alldone, unsigned long flags)
+{
+ return false;
+}
+
+static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
+ unsigned long flags)
+{
+ WARN_ON_ONCE(1); /* Should be dead code! */
+}
+
+static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
+{
+}
+
+static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level)
+{
+ return false;
+}
+
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
+{
+ return false;
+}
+
+static void rcu_spawn_cpu_nocb_kthread(int cpu)
+{
+}
+
+static void __init rcu_spawn_nocb_kthreads(void)
+{
+}
+
+static void show_rcu_nocb_state(struct rcu_data *rdp)
+{
+}
+
+#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
#include "../locking/rtmutex_common.h"
-#ifdef CONFIG_RCU_NOCB_CPU
-static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
-static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */
-static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
-{
- return lockdep_is_held(&rdp->nocb_lock);
-}
-
-static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
-{
- /* Race on early boot between thread creation and assignment */
- if (!rdp->nocb_cb_kthread || !rdp->nocb_gp_kthread)
- return true;
-
- if (current == rdp->nocb_cb_kthread || current == rdp->nocb_gp_kthread)
- if (in_task())
- return true;
- return false;
-}
-
-#else
-static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
-{
- return 0;
-}
-
-static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp)
-{
- return false;
-}
-
-#endif /* #ifdef CONFIG_RCU_NOCB_CPU */
-
static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
{
/*
trace_rcu_utilization(TPS("Start context switch"));
lockdep_assert_irqs_disabled();
- WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0);
+ WARN_ONCE(!preempt && rcu_preempt_depth() > 0, "Voluntary context switch within RCU read-side critical section!");
if (rcu_preempt_depth() > 0 &&
!t->rcu_read_unlock_special.b.blocked) {
static void rcu_preempt_read_enter(void)
{
- current->rcu_read_lock_nesting++;
+ WRITE_ONCE(current->rcu_read_lock_nesting, READ_ONCE(current->rcu_read_lock_nesting) + 1);
}
static int rcu_preempt_read_exit(void)
{
- return --current->rcu_read_lock_nesting;
+ int ret = READ_ONCE(current->rcu_read_lock_nesting) - 1;
+
+ WRITE_ONCE(current->rcu_read_lock_nesting, ret);
+ return ret;
}
static void rcu_preempt_depth_set(int val)
{
- current->rcu_read_lock_nesting = val;
+ WRITE_ONCE(current->rcu_read_lock_nesting, val);
}
/*
WRITE_ONCE(rnp->exp_tasks, np);
if (IS_ENABLED(CONFIG_RCU_BOOST)) {
/* Snapshot ->boost_mtx ownership w/rnp->lock held. */
- drop_boost_mutex = rt_mutex_owner(&rnp->boost_mtx) == t;
+ drop_boost_mutex = rt_mutex_owner(&rnp->boost_mtx.rtmutex) == t;
if (&t->rcu_node_entry == rnp->boost_tasks)
WRITE_ONCE(rnp->boost_tasks, np);
}
/* Unboost if we were boosted. */
if (IS_ENABLED(CONFIG_RCU_BOOST) && drop_boost_mutex)
- rt_mutex_futex_unlock(&rnp->boost_mtx);
+ rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex);
/*
* If this was the last task on the expedited lists,
* section.
*/
t = container_of(tb, struct task_struct, rcu_node_entry);
- rt_mutex_init_proxy_locked(&rnp->boost_mtx, t);
+ rt_mutex_init_proxy_locked(&rnp->boost_mtx.rtmutex, t);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
/* Lock only for side effect: boosts task t's priority. */
rt_mutex_lock(&rnp->boost_mtx);
#endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
-#ifdef CONFIG_RCU_NOCB_CPU
-
-/*
- * Offload callback processing from the boot-time-specified set of CPUs
- * specified by rcu_nocb_mask. For the CPUs in the set, there are kthreads
- * created that pull the callbacks from the corresponding CPU, wait for
- * a grace period to elapse, and invoke the callbacks. These kthreads
- * are organized into GP kthreads, which manage incoming callbacks, wait for
- * grace periods, and awaken CB kthreads, and the CB kthreads, which only
- * invoke callbacks. Each GP kthread invokes its own CBs. The no-CBs CPUs
- * do a wake_up() on their GP kthread when they insert a callback into any
- * empty list, unless the rcu_nocb_poll boot parameter has been specified,
- * in which case each kthread actively polls its CPU. (Which isn't so great
- * for energy efficiency, but which does reduce RCU's overhead on that CPU.)
- *
- * This is intended to be used in conjunction with Frederic Weisbecker's
- * adaptive-idle work, which would seriously reduce OS jitter on CPUs
- * running CPU-bound user-mode computations.
- *
- * Offloading of callbacks can also be used as an energy-efficiency
- * measure because CPUs with no RCU callbacks queued are more aggressive
- * about entering dyntick-idle mode.
- */
-
-
-/*
- * Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters.
- * If the list is invalid, a warning is emitted and all CPUs are offloaded.
- */
-static int __init rcu_nocb_setup(char *str)
-{
- alloc_bootmem_cpumask_var(&rcu_nocb_mask);
- if (cpulist_parse(str, rcu_nocb_mask)) {
- pr_warn("rcu_nocbs= bad CPU range, all CPUs set\n");
- cpumask_setall(rcu_nocb_mask);
- }
- return 1;
-}
-__setup("rcu_nocbs=", rcu_nocb_setup);
-
-static int __init parse_rcu_nocb_poll(char *arg)
-{
- rcu_nocb_poll = true;
- return 0;
-}
-early_param("rcu_nocb_poll", parse_rcu_nocb_poll);
-
-/*
- * Don't bother bypassing ->cblist if the call_rcu() rate is low.
- * After all, the main point of bypassing is to avoid lock contention
- * on ->nocb_lock, which only can happen at high call_rcu() rates.
- */
-static int nocb_nobypass_lim_per_jiffy = 16 * 1000 / HZ;
-module_param(nocb_nobypass_lim_per_jiffy, int, 0);
-
-/*
- * Acquire the specified rcu_data structure's ->nocb_bypass_lock. If the
- * lock isn't immediately available, increment ->nocb_lock_contended to
- * flag the contention.
- */
-static void rcu_nocb_bypass_lock(struct rcu_data *rdp)
- __acquires(&rdp->nocb_bypass_lock)
-{
- lockdep_assert_irqs_disabled();
- if (raw_spin_trylock(&rdp->nocb_bypass_lock))
- return;
- atomic_inc(&rdp->nocb_lock_contended);
- WARN_ON_ONCE(smp_processor_id() != rdp->cpu);
- smp_mb__after_atomic(); /* atomic_inc() before lock. */
- raw_spin_lock(&rdp->nocb_bypass_lock);
- smp_mb__before_atomic(); /* atomic_dec() after lock. */
- atomic_dec(&rdp->nocb_lock_contended);
-}
-
-/*
- * Spinwait until the specified rcu_data structure's ->nocb_lock is
- * not contended. Please note that this is extremely special-purpose,
- * relying on the fact that at most two kthreads and one CPU contend for
- * this lock, and also that the two kthreads are guaranteed to have frequent
- * grace-period-duration time intervals between successive acquisitions
- * of the lock. This allows us to use an extremely simple throttling
- * mechanism, and further to apply it only to the CPU doing floods of
- * call_rcu() invocations. Don't try this at home!
- */
-static void rcu_nocb_wait_contended(struct rcu_data *rdp)
-{
- WARN_ON_ONCE(smp_processor_id() != rdp->cpu);
- while (WARN_ON_ONCE(atomic_read(&rdp->nocb_lock_contended)))
- cpu_relax();
-}
-
-/*
- * Conditionally acquire the specified rcu_data structure's
- * ->nocb_bypass_lock.
- */
-static bool rcu_nocb_bypass_trylock(struct rcu_data *rdp)
-{
- lockdep_assert_irqs_disabled();
- return raw_spin_trylock(&rdp->nocb_bypass_lock);
-}
-
-/*
- * Release the specified rcu_data structure's ->nocb_bypass_lock.
- */
-static void rcu_nocb_bypass_unlock(struct rcu_data *rdp)
- __releases(&rdp->nocb_bypass_lock)
-{
- lockdep_assert_irqs_disabled();
- raw_spin_unlock(&rdp->nocb_bypass_lock);
-}
-
-/*
- * Acquire the specified rcu_data structure's ->nocb_lock, but only
- * if it corresponds to a no-CBs CPU.
- */
-static void rcu_nocb_lock(struct rcu_data *rdp)
-{
- lockdep_assert_irqs_disabled();
- if (!rcu_rdp_is_offloaded(rdp))
- return;
- raw_spin_lock(&rdp->nocb_lock);
-}
-
-/*
- * Release the specified rcu_data structure's ->nocb_lock, but only
- * if it corresponds to a no-CBs CPU.
- */
-static void rcu_nocb_unlock(struct rcu_data *rdp)
-{
- if (rcu_rdp_is_offloaded(rdp)) {
- lockdep_assert_irqs_disabled();
- raw_spin_unlock(&rdp->nocb_lock);
- }
-}
-
-/*
- * Release the specified rcu_data structure's ->nocb_lock and restore
- * interrupts, but only if it corresponds to a no-CBs CPU.
- */
-static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp,
- unsigned long flags)
-{
- if (rcu_rdp_is_offloaded(rdp)) {
- lockdep_assert_irqs_disabled();
- raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
- } else {
- local_irq_restore(flags);
- }
-}
-
-/* Lockdep check that ->cblist may be safely accessed. */
-static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
-{
- lockdep_assert_irqs_disabled();
- if (rcu_rdp_is_offloaded(rdp))
- lockdep_assert_held(&rdp->nocb_lock);
-}
-
-/*
- * Wake up any no-CBs CPUs' kthreads that were waiting on the just-ended
- * grace period.
- */
-static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
-{
- swake_up_all(sq);
-}
-
-static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp)
-{
- return &rnp->nocb_gp_wq[rcu_seq_ctr(rnp->gp_seq) & 0x1];
-}
-
-static void rcu_init_one_nocb(struct rcu_node *rnp)
-{
- init_swait_queue_head(&rnp->nocb_gp_wq[0]);
- init_swait_queue_head(&rnp->nocb_gp_wq[1]);
-}
-
-/* Is the specified CPU a no-CBs CPU? */
-bool rcu_is_nocb_cpu(int cpu)
-{
- if (cpumask_available(rcu_nocb_mask))
- return cpumask_test_cpu(cpu, rcu_nocb_mask);
- return false;
-}
-
-static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
- struct rcu_data *rdp,
- bool force, unsigned long flags)
- __releases(rdp_gp->nocb_gp_lock)
-{
- bool needwake = false;
-
- if (!READ_ONCE(rdp_gp->nocb_gp_kthread)) {
- raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("AlreadyAwake"));
- return false;
- }
-
- if (rdp_gp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
- WRITE_ONCE(rdp_gp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
- del_timer(&rdp_gp->nocb_timer);
- }
-
- if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
- WRITE_ONCE(rdp_gp->nocb_gp_sleep, false);
- needwake = true;
- }
- raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
- if (needwake) {
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake"));
- wake_up_process(rdp_gp->nocb_gp_kthread);
- }
-
- return needwake;
-}
-
-/*
- * Kick the GP kthread for this NOCB group.
- */
-static bool wake_nocb_gp(struct rcu_data *rdp, bool force)
-{
- unsigned long flags;
- struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
-
- raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
- return __wake_nocb_gp(rdp_gp, rdp, force, flags);
-}
-
-/*
- * Arrange to wake the GP kthread for this NOCB group at some future
- * time when it is safe to do so.
- */
-static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
- const char *reason)
-{
- unsigned long flags;
- struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
-
- raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
-
- /*
- * Bypass wakeup overrides previous deferments. In case
- * of callback storm, no need to wake up too early.
- */
- if (waketype == RCU_NOCB_WAKE_BYPASS) {
- mod_timer(&rdp_gp->nocb_timer, jiffies + 2);
- WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
- } else {
- if (rdp_gp->nocb_defer_wakeup < RCU_NOCB_WAKE)
- mod_timer(&rdp_gp->nocb_timer, jiffies + 1);
- if (rdp_gp->nocb_defer_wakeup < waketype)
- WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
- }
-
- raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
-
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, reason);
-}
-
-/*
- * Flush the ->nocb_bypass queue into ->cblist, enqueuing rhp if non-NULL.
- * However, if there is a callback to be enqueued and if ->nocb_bypass
- * proves to be initially empty, just return false because the no-CB GP
- * kthread may need to be awakened in this case.
- *
- * Note that this function always returns true if rhp is NULL.
- */
-static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
- unsigned long j)
-{
- struct rcu_cblist rcl;
-
- WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
- rcu_lockdep_assert_cblist_protected(rdp);
- lockdep_assert_held(&rdp->nocb_bypass_lock);
- if (rhp && !rcu_cblist_n_cbs(&rdp->nocb_bypass)) {
- raw_spin_unlock(&rdp->nocb_bypass_lock);
- return false;
- }
- /* Note: ->cblist.len already accounts for ->nocb_bypass contents. */
- if (rhp)
- rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
- rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
- rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
- WRITE_ONCE(rdp->nocb_bypass_first, j);
- rcu_nocb_bypass_unlock(rdp);
- return true;
-}
-
-/*
- * Flush the ->nocb_bypass queue into ->cblist, enqueuing rhp if non-NULL.
- * However, if there is a callback to be enqueued and if ->nocb_bypass
- * proves to be initially empty, just return false because the no-CB GP
- * kthread may need to be awakened in this case.
- *
- * Note that this function always returns true if rhp is NULL.
- */
-static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
- unsigned long j)
-{
- if (!rcu_rdp_is_offloaded(rdp))
- return true;
- rcu_lockdep_assert_cblist_protected(rdp);
- rcu_nocb_bypass_lock(rdp);
- return rcu_nocb_do_flush_bypass(rdp, rhp, j);
-}
-
-/*
- * If the ->nocb_bypass_lock is immediately available, flush the
- * ->nocb_bypass queue into ->cblist.
- */
-static void rcu_nocb_try_flush_bypass(struct rcu_data *rdp, unsigned long j)
-{
- rcu_lockdep_assert_cblist_protected(rdp);
- if (!rcu_rdp_is_offloaded(rdp) ||
- !rcu_nocb_bypass_trylock(rdp))
- return;
- WARN_ON_ONCE(!rcu_nocb_do_flush_bypass(rdp, NULL, j));
-}
-
-/*
- * See whether it is appropriate to use the ->nocb_bypass list in order
- * to control contention on ->nocb_lock. A limited number of direct
- * enqueues are permitted into ->cblist per jiffy. If ->nocb_bypass
- * is non-empty, further callbacks must be placed into ->nocb_bypass,
- * otherwise rcu_barrier() breaks. Use rcu_nocb_flush_bypass() to switch
- * back to direct use of ->cblist. However, ->nocb_bypass should not be
- * used if ->cblist is empty, because otherwise callbacks can be stranded
- * on ->nocb_bypass because we cannot count on the current CPU ever again
- * invoking call_rcu(). The general rule is that if ->nocb_bypass is
- * non-empty, the corresponding no-CBs grace-period kthread must not be
- * in an indefinite sleep state.
- *
- * Finally, it is not permitted to use the bypass during early boot,
- * as doing so would confuse the auto-initialization code. Besides
- * which, there is no point in worrying about lock contention while
- * there is only one CPU in operation.
- */
-static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
- bool *was_alldone, unsigned long flags)
-{
- unsigned long c;
- unsigned long cur_gp_seq;
- unsigned long j = jiffies;
- long ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
-
- lockdep_assert_irqs_disabled();
-
- // Pure softirq/rcuc based processing: no bypassing, no
- // locking.
- if (!rcu_rdp_is_offloaded(rdp)) {
- *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
- return false;
- }
-
- // In the process of (de-)offloading: no bypassing, but
- // locking.
- if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
- rcu_nocb_lock(rdp);
- *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
- return false; /* Not offloaded, no bypassing. */
- }
-
- // Don't use ->nocb_bypass during early boot.
- if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) {
- rcu_nocb_lock(rdp);
- WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
- *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
- return false;
- }
-
- // If we have advanced to a new jiffy, reset counts to allow
- // moving back from ->nocb_bypass to ->cblist.
- if (j == rdp->nocb_nobypass_last) {
- c = rdp->nocb_nobypass_count + 1;
- } else {
- WRITE_ONCE(rdp->nocb_nobypass_last, j);
- c = rdp->nocb_nobypass_count - nocb_nobypass_lim_per_jiffy;
- if (ULONG_CMP_LT(rdp->nocb_nobypass_count,
- nocb_nobypass_lim_per_jiffy))
- c = 0;
- else if (c > nocb_nobypass_lim_per_jiffy)
- c = nocb_nobypass_lim_per_jiffy;
- }
- WRITE_ONCE(rdp->nocb_nobypass_count, c);
-
- // If there hasn't yet been all that many ->cblist enqueues
- // this jiffy, tell the caller to enqueue onto ->cblist. But flush
- // ->nocb_bypass first.
- if (rdp->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy) {
- rcu_nocb_lock(rdp);
- *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
- if (*was_alldone)
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("FirstQ"));
- WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, j));
- WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
- return false; // Caller must enqueue the callback.
- }
-
- // If ->nocb_bypass has been used too long or is too full,
- // flush ->nocb_bypass to ->cblist.
- if ((ncbs && j != READ_ONCE(rdp->nocb_bypass_first)) ||
- ncbs >= qhimark) {
- rcu_nocb_lock(rdp);
- if (!rcu_nocb_flush_bypass(rdp, rhp, j)) {
- *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist);
- if (*was_alldone)
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("FirstQ"));
- WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
- return false; // Caller must enqueue the callback.
- }
- if (j != rdp->nocb_gp_adv_time &&
- rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
- rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
- rcu_advance_cbs_nowake(rdp->mynode, rdp);
- rdp->nocb_gp_adv_time = j;
- }
- rcu_nocb_unlock_irqrestore(rdp, flags);
- return true; // Callback already enqueued.
- }
-
- // We need to use the bypass.
- rcu_nocb_wait_contended(rdp);
- rcu_nocb_bypass_lock(rdp);
- ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
- rcu_segcblist_inc_len(&rdp->cblist); /* Must precede enqueue. */
- rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
- if (!ncbs) {
- WRITE_ONCE(rdp->nocb_bypass_first, j);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ"));
- }
- rcu_nocb_bypass_unlock(rdp);
- smp_mb(); /* Order enqueue before wake. */
- if (ncbs) {
- local_irq_restore(flags);
- } else {
- // No-CBs GP kthread might be indefinitely asleep, if so, wake.
- rcu_nocb_lock(rdp); // Rare during call_rcu() flood.
- if (!rcu_segcblist_pend_cbs(&rdp->cblist)) {
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("FirstBQwake"));
- __call_rcu_nocb_wake(rdp, true, flags);
- } else {
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("FirstBQnoWake"));
- rcu_nocb_unlock_irqrestore(rdp, flags);
- }
- }
- return true; // Callback already enqueued.
-}
-
-/*
- * Awaken the no-CBs grace-period kthread if needed, either due to it
- * legitimately being asleep or due to overload conditions.
- *
- * If warranted, also wake up the kthread servicing this CPUs queues.
- */
-static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_alldone,
- unsigned long flags)
- __releases(rdp->nocb_lock)
-{
- unsigned long cur_gp_seq;
- unsigned long j;
- long len;
- struct task_struct *t;
-
- // If we are being polled or there is no kthread, just leave.
- t = READ_ONCE(rdp->nocb_gp_kthread);
- if (rcu_nocb_poll || !t) {
- rcu_nocb_unlock_irqrestore(rdp, flags);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("WakeNotPoll"));
- return;
- }
- // Need to actually to a wakeup.
- len = rcu_segcblist_n_cbs(&rdp->cblist);
- if (was_alldone) {
- rdp->qlen_last_fqs_check = len;
- if (!irqs_disabled_flags(flags)) {
- /* ... if queue was empty ... */
- rcu_nocb_unlock_irqrestore(rdp, flags);
- wake_nocb_gp(rdp, false);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("WakeEmpty"));
- } else {
- rcu_nocb_unlock_irqrestore(rdp, flags);
- wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE,
- TPS("WakeEmptyIsDeferred"));
- }
- } else if (len > rdp->qlen_last_fqs_check + qhimark) {
- /* ... or if many callbacks queued. */
- rdp->qlen_last_fqs_check = len;
- j = jiffies;
- if (j != rdp->nocb_gp_adv_time &&
- rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
- rcu_seq_done(&rdp->mynode->gp_seq, cur_gp_seq)) {
- rcu_advance_cbs_nowake(rdp->mynode, rdp);
- rdp->nocb_gp_adv_time = j;
- }
- smp_mb(); /* Enqueue before timer_pending(). */
- if ((rdp->nocb_cb_sleep ||
- !rcu_segcblist_ready_cbs(&rdp->cblist)) &&
- !timer_pending(&rdp->nocb_timer)) {
- rcu_nocb_unlock_irqrestore(rdp, flags);
- wake_nocb_gp_defer(rdp, RCU_NOCB_WAKE_FORCE,
- TPS("WakeOvfIsDeferred"));
- } else {
- rcu_nocb_unlock_irqrestore(rdp, flags);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
- }
- } else {
- rcu_nocb_unlock_irqrestore(rdp, flags);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WakeNot"));
- }
- return;
-}
-
-/*
- * Check if we ignore this rdp.
- *
- * We check that without holding the nocb lock but
- * we make sure not to miss a freshly offloaded rdp
- * with the current ordering:
- *
- * rdp_offload_toggle() nocb_gp_enabled_cb()
- * ------------------------- ----------------------------
- * WRITE flags LOCK nocb_gp_lock
- * LOCK nocb_gp_lock READ/WRITE nocb_gp_sleep
- * READ/WRITE nocb_gp_sleep UNLOCK nocb_gp_lock
- * UNLOCK nocb_gp_lock READ flags
- */
-static inline bool nocb_gp_enabled_cb(struct rcu_data *rdp)
-{
- u8 flags = SEGCBLIST_OFFLOADED | SEGCBLIST_KTHREAD_GP;
-
- return rcu_segcblist_test_flags(&rdp->cblist, flags);
-}
-
-static inline bool nocb_gp_update_state_deoffloading(struct rcu_data *rdp,
- bool *needwake_state)
-{
- struct rcu_segcblist *cblist = &rdp->cblist;
-
- if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
- if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) {
- rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_GP);
- if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
- *needwake_state = true;
- }
- return false;
- }
-
- /*
- * De-offloading. Clear our flag and notify the de-offload worker.
- * We will ignore this rdp until it ever gets re-offloaded.
- */
- WARN_ON_ONCE(!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
- rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP);
- if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB))
- *needwake_state = true;
- return true;
-}
-
-
-/*
- * No-CBs GP kthreads come here to wait for additional callbacks to show up
- * or for grace periods to end.
- */
-static void nocb_gp_wait(struct rcu_data *my_rdp)
-{
- bool bypass = false;
- long bypass_ncbs;
- int __maybe_unused cpu = my_rdp->cpu;
- unsigned long cur_gp_seq;
- unsigned long flags;
- bool gotcbs = false;
- unsigned long j = jiffies;
- bool needwait_gp = false; // This prevents actual uninitialized use.
- bool needwake;
- bool needwake_gp;
- struct rcu_data *rdp;
- struct rcu_node *rnp;
- unsigned long wait_gp_seq = 0; // Suppress "use uninitialized" warning.
- bool wasempty = false;
-
- /*
- * Each pass through the following loop checks for CBs and for the
- * nearest grace period (if any) to wait for next. The CB kthreads
- * and the global grace-period kthread are awakened if needed.
- */
- WARN_ON_ONCE(my_rdp->nocb_gp_rdp != my_rdp);
- for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_cb_rdp) {
- bool needwake_state = false;
-
- if (!nocb_gp_enabled_cb(rdp))
- continue;
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Check"));
- rcu_nocb_lock_irqsave(rdp, flags);
- if (nocb_gp_update_state_deoffloading(rdp, &needwake_state)) {
- rcu_nocb_unlock_irqrestore(rdp, flags);
- if (needwake_state)
- swake_up_one(&rdp->nocb_state_wq);
- continue;
- }
- bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
- if (bypass_ncbs &&
- (time_after(j, READ_ONCE(rdp->nocb_bypass_first) + 1) ||
- bypass_ncbs > 2 * qhimark)) {
- // Bypass full or old, so flush it.
- (void)rcu_nocb_try_flush_bypass(rdp, j);
- bypass_ncbs = rcu_cblist_n_cbs(&rdp->nocb_bypass);
- } else if (!bypass_ncbs && rcu_segcblist_empty(&rdp->cblist)) {
- rcu_nocb_unlock_irqrestore(rdp, flags);
- if (needwake_state)
- swake_up_one(&rdp->nocb_state_wq);
- continue; /* No callbacks here, try next. */
- }
- if (bypass_ncbs) {
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("Bypass"));
- bypass = true;
- }
- rnp = rdp->mynode;
-
- // Advance callbacks if helpful and low contention.
- needwake_gp = false;
- if (!rcu_segcblist_restempty(&rdp->cblist,
- RCU_NEXT_READY_TAIL) ||
- (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
- rcu_seq_done(&rnp->gp_seq, cur_gp_seq))) {
- raw_spin_lock_rcu_node(rnp); /* irqs disabled. */
- needwake_gp = rcu_advance_cbs(rnp, rdp);
- wasempty = rcu_segcblist_restempty(&rdp->cblist,
- RCU_NEXT_READY_TAIL);
- raw_spin_unlock_rcu_node(rnp); /* irqs disabled. */
- }
- // Need to wait on some grace period?
- WARN_ON_ONCE(wasempty &&
- !rcu_segcblist_restempty(&rdp->cblist,
- RCU_NEXT_READY_TAIL));
- if (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq)) {
- if (!needwait_gp ||
- ULONG_CMP_LT(cur_gp_seq, wait_gp_seq))
- wait_gp_seq = cur_gp_seq;
- needwait_gp = true;
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu,
- TPS("NeedWaitGP"));
- }
- if (rcu_segcblist_ready_cbs(&rdp->cblist)) {
- needwake = rdp->nocb_cb_sleep;
- WRITE_ONCE(rdp->nocb_cb_sleep, false);
- smp_mb(); /* CB invocation -after- GP end. */
- } else {
- needwake = false;
- }
- rcu_nocb_unlock_irqrestore(rdp, flags);
- if (needwake) {
- swake_up_one(&rdp->nocb_cb_wq);
- gotcbs = true;
- }
- if (needwake_gp)
- rcu_gp_kthread_wake();
- if (needwake_state)
- swake_up_one(&rdp->nocb_state_wq);
- }
-
- my_rdp->nocb_gp_bypass = bypass;
- my_rdp->nocb_gp_gp = needwait_gp;
- my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
-
- if (bypass && !rcu_nocb_poll) {
- // At least one child with non-empty ->nocb_bypass, so set
- // timer in order to avoid stranding its callbacks.
- wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
- TPS("WakeBypassIsDeferred"));
- }
- if (rcu_nocb_poll) {
- /* Polling, so trace if first poll in the series. */
- if (gotcbs)
- trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Poll"));
- schedule_timeout_idle(1);
- } else if (!needwait_gp) {
- /* Wait for callbacks to appear. */
- trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Sleep"));
- swait_event_interruptible_exclusive(my_rdp->nocb_gp_wq,
- !READ_ONCE(my_rdp->nocb_gp_sleep));
- trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("EndSleep"));
- } else {
- rnp = my_rdp->mynode;
- trace_rcu_this_gp(rnp, my_rdp, wait_gp_seq, TPS("StartWait"));
- swait_event_interruptible_exclusive(
- rnp->nocb_gp_wq[rcu_seq_ctr(wait_gp_seq) & 0x1],
- rcu_seq_done(&rnp->gp_seq, wait_gp_seq) ||
- !READ_ONCE(my_rdp->nocb_gp_sleep));
- trace_rcu_this_gp(rnp, my_rdp, wait_gp_seq, TPS("EndWait"));
- }
- if (!rcu_nocb_poll) {
- raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
- if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
- WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
- del_timer(&my_rdp->nocb_timer);
- }
- WRITE_ONCE(my_rdp->nocb_gp_sleep, true);
- raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
- }
- my_rdp->nocb_gp_seq = -1;
- WARN_ON(signal_pending(current));
-}
-
-/*
- * No-CBs grace-period-wait kthread. There is one of these per group
- * of CPUs, but only once at least one CPU in that group has come online
- * at least once since boot. This kthread checks for newly posted
- * callbacks from any of the CPUs it is responsible for, waits for a
- * grace period, then awakens all of the rcu_nocb_cb_kthread() instances
- * that then have callback-invocation work to do.
- */
-static int rcu_nocb_gp_kthread(void *arg)
-{
- struct rcu_data *rdp = arg;
-
- for (;;) {
- WRITE_ONCE(rdp->nocb_gp_loops, rdp->nocb_gp_loops + 1);
- nocb_gp_wait(rdp);
- cond_resched_tasks_rcu_qs();
- }
- return 0;
-}
-
-static inline bool nocb_cb_can_run(struct rcu_data *rdp)
-{
- u8 flags = SEGCBLIST_OFFLOADED | SEGCBLIST_KTHREAD_CB;
- return rcu_segcblist_test_flags(&rdp->cblist, flags);
-}
-
-static inline bool nocb_cb_wait_cond(struct rcu_data *rdp)
-{
- return nocb_cb_can_run(rdp) && !READ_ONCE(rdp->nocb_cb_sleep);
-}
-
-/*
- * Invoke any ready callbacks from the corresponding no-CBs CPU,
- * then, if there are no more, wait for more to appear.
- */
-static void nocb_cb_wait(struct rcu_data *rdp)
-{
- struct rcu_segcblist *cblist = &rdp->cblist;
- unsigned long cur_gp_seq;
- unsigned long flags;
- bool needwake_state = false;
- bool needwake_gp = false;
- bool can_sleep = true;
- struct rcu_node *rnp = rdp->mynode;
-
- local_irq_save(flags);
- rcu_momentary_dyntick_idle();
- local_irq_restore(flags);
- /*
- * Disable BH to provide the expected environment. Also, when
- * transitioning to/from NOCB mode, a self-requeuing callback might
- * be invoked from softirq. A short grace period could cause both
- * instances of this callback would execute concurrently.
- */
- local_bh_disable();
- rcu_do_batch(rdp);
- local_bh_enable();
- lockdep_assert_irqs_enabled();
- rcu_nocb_lock_irqsave(rdp, flags);
- if (rcu_segcblist_nextgp(cblist, &cur_gp_seq) &&
- rcu_seq_done(&rnp->gp_seq, cur_gp_seq) &&
- raw_spin_trylock_rcu_node(rnp)) { /* irqs already disabled. */
- needwake_gp = rcu_advance_cbs(rdp->mynode, rdp);
- raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
- }
-
- if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) {
- if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB)) {
- rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_CB);
- if (rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP))
- needwake_state = true;
- }
- if (rcu_segcblist_ready_cbs(cblist))
- can_sleep = false;
- } else {
- /*
- * De-offloading. Clear our flag and notify the de-offload worker.
- * We won't touch the callbacks and keep sleeping until we ever
- * get re-offloaded.
- */
- WARN_ON_ONCE(!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB));
- rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_CB);
- if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP))
- needwake_state = true;
- }
-
- WRITE_ONCE(rdp->nocb_cb_sleep, can_sleep);
-
- if (rdp->nocb_cb_sleep)
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("CBSleep"));
-
- rcu_nocb_unlock_irqrestore(rdp, flags);
- if (needwake_gp)
- rcu_gp_kthread_wake();
-
- if (needwake_state)
- swake_up_one(&rdp->nocb_state_wq);
-
- do {
- swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
- nocb_cb_wait_cond(rdp));
-
- // VVV Ensure CB invocation follows _sleep test.
- if (smp_load_acquire(&rdp->nocb_cb_sleep)) { // ^^^
- WARN_ON(signal_pending(current));
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
- }
- } while (!nocb_cb_can_run(rdp));
-}
-
-/*
- * Per-rcu_data kthread, but only for no-CBs CPUs. Repeatedly invoke
- * nocb_cb_wait() to do the dirty work.
- */
-static int rcu_nocb_cb_kthread(void *arg)
-{
- struct rcu_data *rdp = arg;
-
- // Each pass through this loop does one callback batch, and,
- // if there are no more ready callbacks, waits for them.
- for (;;) {
- nocb_cb_wait(rdp);
- cond_resched_tasks_rcu_qs();
- }
- return 0;
-}
-
-/* Is a deferred wakeup of rcu_nocb_kthread() required? */
-static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level)
-{
- return READ_ONCE(rdp->nocb_defer_wakeup) >= level;
-}
-
-/* Do a deferred wakeup of rcu_nocb_kthread(). */
-static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp_gp,
- struct rcu_data *rdp, int level,
- unsigned long flags)
- __releases(rdp_gp->nocb_gp_lock)
-{
- int ndw;
- int ret;
-
- if (!rcu_nocb_need_deferred_wakeup(rdp_gp, level)) {
- raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
- return false;
- }
-
- ndw = rdp_gp->nocb_defer_wakeup;
- ret = __wake_nocb_gp(rdp_gp, rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
-
- return ret;
-}
-
-/* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */
-static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
-{
- unsigned long flags;
- struct rcu_data *rdp = from_timer(rdp, t, nocb_timer);
-
- WARN_ON_ONCE(rdp->nocb_gp_rdp != rdp);
- trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("Timer"));
-
- raw_spin_lock_irqsave(&rdp->nocb_gp_lock, flags);
- smp_mb__after_spinlock(); /* Timer expire before wakeup. */
- do_nocb_deferred_wakeup_common(rdp, rdp, RCU_NOCB_WAKE_BYPASS, flags);
-}
-
-/*
- * Do a deferred wakeup of rcu_nocb_kthread() from fastpath.
- * This means we do an inexact common-case check. Note that if
- * we miss, ->nocb_timer will eventually clean things up.
- */
-static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
-{
- unsigned long flags;
- struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
-
- if (!rdp_gp || !rcu_nocb_need_deferred_wakeup(rdp_gp, RCU_NOCB_WAKE))
- return false;
-
- raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
- return do_nocb_deferred_wakeup_common(rdp_gp, rdp, RCU_NOCB_WAKE, flags);
-}
-
-void rcu_nocb_flush_deferred_wakeup(void)
-{
- do_nocb_deferred_wakeup(this_cpu_ptr(&rcu_data));
-}
-EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup);
-
-static int rdp_offload_toggle(struct rcu_data *rdp,
- bool offload, unsigned long flags)
- __releases(rdp->nocb_lock)
-{
- struct rcu_segcblist *cblist = &rdp->cblist;
- struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
- bool wake_gp = false;
-
- rcu_segcblist_offload(cblist, offload);
-
- if (rdp->nocb_cb_sleep)
- rdp->nocb_cb_sleep = false;
- rcu_nocb_unlock_irqrestore(rdp, flags);
-
- /*
- * Ignore former value of nocb_cb_sleep and force wake up as it could
- * have been spuriously set to false already.
- */
- swake_up_one(&rdp->nocb_cb_wq);
-
- raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
- if (rdp_gp->nocb_gp_sleep) {
- rdp_gp->nocb_gp_sleep = false;
- wake_gp = true;
- }
- raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
-
- if (wake_gp)
- wake_up_process(rdp_gp->nocb_gp_kthread);
-
- return 0;
-}
-
-static long rcu_nocb_rdp_deoffload(void *arg)
-{
- struct rcu_data *rdp = arg;
- struct rcu_segcblist *cblist = &rdp->cblist;
- unsigned long flags;
- int ret;
-
- WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
-
- pr_info("De-offloading %d\n", rdp->cpu);
-
- rcu_nocb_lock_irqsave(rdp, flags);
- /*
- * Flush once and for all now. This suffices because we are
- * running on the target CPU holding ->nocb_lock (thus having
- * interrupts disabled), and because rdp_offload_toggle()
- * invokes rcu_segcblist_offload(), which clears SEGCBLIST_OFFLOADED.
- * Thus future calls to rcu_segcblist_completely_offloaded() will
- * return false, which means that future calls to rcu_nocb_try_bypass()
- * will refuse to put anything into the bypass.
- */
- WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
- ret = rdp_offload_toggle(rdp, false, flags);
- swait_event_exclusive(rdp->nocb_state_wq,
- !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB |
- SEGCBLIST_KTHREAD_GP));
- /*
- * Lock one last time to acquire latest callback updates from kthreads
- * so we can later handle callbacks locally without locking.
- */
- rcu_nocb_lock_irqsave(rdp, flags);
- /*
- * Theoretically we could set SEGCBLIST_SOFTIRQ_ONLY after the nocb
- * lock is released but how about being paranoid for once?
- */
- rcu_segcblist_set_flags(cblist, SEGCBLIST_SOFTIRQ_ONLY);
- /*
- * With SEGCBLIST_SOFTIRQ_ONLY, we can't use
- * rcu_nocb_unlock_irqrestore() anymore.
- */
- raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
-
- /* Sanity check */
- WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass));
-
-
- return ret;
-}
-
-int rcu_nocb_cpu_deoffload(int cpu)
-{
- struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
- int ret = 0;
-
- mutex_lock(&rcu_state.barrier_mutex);
- cpus_read_lock();
- if (rcu_rdp_is_offloaded(rdp)) {
- if (cpu_online(cpu)) {
- ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp);
- if (!ret)
- cpumask_clear_cpu(cpu, rcu_nocb_mask);
- } else {
- pr_info("NOCB: Can't CB-deoffload an offline CPU\n");
- ret = -EINVAL;
- }
- }
- cpus_read_unlock();
- mutex_unlock(&rcu_state.barrier_mutex);
-
- return ret;
-}
-EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload);
-
-static long rcu_nocb_rdp_offload(void *arg)
-{
- struct rcu_data *rdp = arg;
- struct rcu_segcblist *cblist = &rdp->cblist;
- unsigned long flags;
- int ret;
-
- WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id());
- /*
- * For now we only support re-offload, ie: the rdp must have been
- * offloaded on boot first.
- */
- if (!rdp->nocb_gp_rdp)
- return -EINVAL;
-
- pr_info("Offloading %d\n", rdp->cpu);
- /*
- * Can't use rcu_nocb_lock_irqsave() while we are in
- * SEGCBLIST_SOFTIRQ_ONLY mode.
- */
- raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
-
- /*
- * We didn't take the nocb lock while working on the
- * rdp->cblist in SEGCBLIST_SOFTIRQ_ONLY mode.
- * Every modifications that have been done previously on
- * rdp->cblist must be visible remotely by the nocb kthreads
- * upon wake up after reading the cblist flags.
- *
- * The layout against nocb_lock enforces that ordering:
- *
- * __rcu_nocb_rdp_offload() nocb_cb_wait()/nocb_gp_wait()
- * ------------------------- ----------------------------
- * WRITE callbacks rcu_nocb_lock()
- * rcu_nocb_lock() READ flags
- * WRITE flags READ callbacks
- * rcu_nocb_unlock() rcu_nocb_unlock()
- */
- ret = rdp_offload_toggle(rdp, true, flags);
- swait_event_exclusive(rdp->nocb_state_wq,
- rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_CB) &&
- rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP));
-
- return ret;
-}
-
-int rcu_nocb_cpu_offload(int cpu)
-{
- struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
- int ret = 0;
-
- mutex_lock(&rcu_state.barrier_mutex);
- cpus_read_lock();
- if (!rcu_rdp_is_offloaded(rdp)) {
- if (cpu_online(cpu)) {
- ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp);
- if (!ret)
- cpumask_set_cpu(cpu, rcu_nocb_mask);
- } else {
- pr_info("NOCB: Can't CB-offload an offline CPU\n");
- ret = -EINVAL;
- }
- }
- cpus_read_unlock();
- mutex_unlock(&rcu_state.barrier_mutex);
-
- return ret;
-}
-EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);
-
-void __init rcu_init_nohz(void)
-{
- int cpu;
- bool need_rcu_nocb_mask = false;
- struct rcu_data *rdp;
-
-#if defined(CONFIG_NO_HZ_FULL)
- if (tick_nohz_full_running && cpumask_weight(tick_nohz_full_mask))
- need_rcu_nocb_mask = true;
-#endif /* #if defined(CONFIG_NO_HZ_FULL) */
-
- if (!cpumask_available(rcu_nocb_mask) && need_rcu_nocb_mask) {
- if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) {
- pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n");
- return;
- }
- }
- if (!cpumask_available(rcu_nocb_mask))
- return;
-
-#if defined(CONFIG_NO_HZ_FULL)
- if (tick_nohz_full_running)
- cpumask_or(rcu_nocb_mask, rcu_nocb_mask, tick_nohz_full_mask);
-#endif /* #if defined(CONFIG_NO_HZ_FULL) */
-
- if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
- pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
- cpumask_and(rcu_nocb_mask, cpu_possible_mask,
- rcu_nocb_mask);
- }
- if (cpumask_empty(rcu_nocb_mask))
- pr_info("\tOffload RCU callbacks from CPUs: (none).\n");
- else
- pr_info("\tOffload RCU callbacks from CPUs: %*pbl.\n",
- cpumask_pr_args(rcu_nocb_mask));
- if (rcu_nocb_poll)
- pr_info("\tPoll for callbacks from no-CBs CPUs.\n");
-
- for_each_cpu(cpu, rcu_nocb_mask) {
- rdp = per_cpu_ptr(&rcu_data, cpu);
- if (rcu_segcblist_empty(&rdp->cblist))
- rcu_segcblist_init(&rdp->cblist);
- rcu_segcblist_offload(&rdp->cblist, true);
- rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_CB);
- rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP);
- }
- rcu_organize_nocb_kthreads();
-}
-
-/* Initialize per-rcu_data variables for no-CBs CPUs. */
-static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
-{
- init_swait_queue_head(&rdp->nocb_cb_wq);
- init_swait_queue_head(&rdp->nocb_gp_wq);
- init_swait_queue_head(&rdp->nocb_state_wq);
- raw_spin_lock_init(&rdp->nocb_lock);
- raw_spin_lock_init(&rdp->nocb_bypass_lock);
- raw_spin_lock_init(&rdp->nocb_gp_lock);
- timer_setup(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer, 0);
- rcu_cblist_init(&rdp->nocb_bypass);
-}
-
-/*
- * If the specified CPU is a no-CBs CPU that does not already have its
- * rcuo CB kthread, spawn it. Additionally, if the rcuo GP kthread
- * for this CPU's group has not yet been created, spawn it as well.
- */
-static void rcu_spawn_one_nocb_kthread(int cpu)
-{
- struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
- struct rcu_data *rdp_gp;
- struct task_struct *t;
-
- /*
- * If this isn't a no-CBs CPU or if it already has an rcuo kthread,
- * then nothing to do.
- */
- if (!rcu_is_nocb_cpu(cpu) || rdp->nocb_cb_kthread)
- return;
-
- /* If we didn't spawn the GP kthread first, reorganize! */
- rdp_gp = rdp->nocb_gp_rdp;
- if (!rdp_gp->nocb_gp_kthread) {
- t = kthread_run(rcu_nocb_gp_kthread, rdp_gp,
- "rcuog/%d", rdp_gp->cpu);
- if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__))
- return;
- WRITE_ONCE(rdp_gp->nocb_gp_kthread, t);
- }
-
- /* Spawn the kthread for this CPU. */
- t = kthread_run(rcu_nocb_cb_kthread, rdp,
- "rcuo%c/%d", rcu_state.abbr, cpu);
- if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
- return;
- WRITE_ONCE(rdp->nocb_cb_kthread, t);
- WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
-}
-
-/*
- * If the specified CPU is a no-CBs CPU that does not already have its
- * rcuo kthread, spawn it.
- */
-static void rcu_spawn_cpu_nocb_kthread(int cpu)
-{
- if (rcu_scheduler_fully_active)
- rcu_spawn_one_nocb_kthread(cpu);
-}
-
-/*
- * Once the scheduler is running, spawn rcuo kthreads for all online
- * no-CBs CPUs. This assumes that the early_initcall()s happen before
- * non-boot CPUs come online -- if this changes, we will need to add
- * some mutual exclusion.
- */
-static void __init rcu_spawn_nocb_kthreads(void)
-{
- int cpu;
-
- for_each_online_cpu(cpu)
- rcu_spawn_cpu_nocb_kthread(cpu);
-}
-
-/* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */
-static int rcu_nocb_gp_stride = -1;
-module_param(rcu_nocb_gp_stride, int, 0444);
-
-/*
- * Initialize GP-CB relationships for all no-CBs CPU.
- */
-static void __init rcu_organize_nocb_kthreads(void)
-{
- int cpu;
- bool firsttime = true;
- bool gotnocbs = false;
- bool gotnocbscbs = true;
- int ls = rcu_nocb_gp_stride;
- int nl = 0; /* Next GP kthread. */
- struct rcu_data *rdp;
- struct rcu_data *rdp_gp = NULL; /* Suppress misguided gcc warn. */
- struct rcu_data *rdp_prev = NULL;
-
- if (!cpumask_available(rcu_nocb_mask))
- return;
- if (ls == -1) {
- ls = nr_cpu_ids / int_sqrt(nr_cpu_ids);
- rcu_nocb_gp_stride = ls;
- }
-
- /*
- * Each pass through this loop sets up one rcu_data structure.
- * Should the corresponding CPU come online in the future, then
- * we will spawn the needed set of rcu_nocb_kthread() kthreads.
- */
- for_each_cpu(cpu, rcu_nocb_mask) {
- rdp = per_cpu_ptr(&rcu_data, cpu);
- if (rdp->cpu >= nl) {
- /* New GP kthread, set up for CBs & next GP. */
- gotnocbs = true;
- nl = DIV_ROUND_UP(rdp->cpu + 1, ls) * ls;
- rdp->nocb_gp_rdp = rdp;
- rdp_gp = rdp;
- if (dump_tree) {
- if (!firsttime)
- pr_cont("%s\n", gotnocbscbs
- ? "" : " (self only)");
- gotnocbscbs = false;
- firsttime = false;
- pr_alert("%s: No-CB GP kthread CPU %d:",
- __func__, cpu);
- }
- } else {
- /* Another CB kthread, link to previous GP kthread. */
- gotnocbscbs = true;
- rdp->nocb_gp_rdp = rdp_gp;
- rdp_prev->nocb_next_cb_rdp = rdp;
- if (dump_tree)
- pr_cont(" %d", cpu);
- }
- rdp_prev = rdp;
- }
- if (gotnocbs && dump_tree)
- pr_cont("%s\n", gotnocbscbs ? "" : " (self only)");
-}
-
-/*
- * Bind the current task to the offloaded CPUs. If there are no offloaded
- * CPUs, leave the task unbound. Splat if the bind attempt fails.
- */
-void rcu_bind_current_to_nocb(void)
-{
- if (cpumask_available(rcu_nocb_mask) && cpumask_weight(rcu_nocb_mask))
- WARN_ON(sched_setaffinity(current->pid, rcu_nocb_mask));
-}
-EXPORT_SYMBOL_GPL(rcu_bind_current_to_nocb);
-
-// The ->on_cpu field is available only in CONFIG_SMP=y, so...
-#ifdef CONFIG_SMP
-static char *show_rcu_should_be_on_cpu(struct task_struct *tsp)
-{
- return tsp && task_is_running(tsp) && !tsp->on_cpu ? "!" : "";
-}
-#else // #ifdef CONFIG_SMP
-static char *show_rcu_should_be_on_cpu(struct task_struct *tsp)
-{
- return "";
-}
-#endif // #else #ifdef CONFIG_SMP
-
-/*
- * Dump out nocb grace-period kthread state for the specified rcu_data
- * structure.
- */
-static void show_rcu_nocb_gp_state(struct rcu_data *rdp)
-{
- struct rcu_node *rnp = rdp->mynode;
-
- pr_info("nocb GP %d %c%c%c%c%c %c[%c%c] %c%c:%ld rnp %d:%d %lu %c CPU %d%s\n",
- rdp->cpu,
- "kK"[!!rdp->nocb_gp_kthread],
- "lL"[raw_spin_is_locked(&rdp->nocb_gp_lock)],
- "dD"[!!rdp->nocb_defer_wakeup],
- "tT"[timer_pending(&rdp->nocb_timer)],
- "sS"[!!rdp->nocb_gp_sleep],
- ".W"[swait_active(&rdp->nocb_gp_wq)],
- ".W"[swait_active(&rnp->nocb_gp_wq[0])],
- ".W"[swait_active(&rnp->nocb_gp_wq[1])],
- ".B"[!!rdp->nocb_gp_bypass],
- ".G"[!!rdp->nocb_gp_gp],
- (long)rdp->nocb_gp_seq,
- rnp->grplo, rnp->grphi, READ_ONCE(rdp->nocb_gp_loops),
- rdp->nocb_gp_kthread ? task_state_to_char(rdp->nocb_gp_kthread) : '.',
- rdp->nocb_cb_kthread ? (int)task_cpu(rdp->nocb_gp_kthread) : -1,
- show_rcu_should_be_on_cpu(rdp->nocb_cb_kthread));
-}
-
-/* Dump out nocb kthread state for the specified rcu_data structure. */
-static void show_rcu_nocb_state(struct rcu_data *rdp)
-{
- char bufw[20];
- char bufr[20];
- struct rcu_segcblist *rsclp = &rdp->cblist;
- bool waslocked;
- bool wassleep;
-
- if (rdp->nocb_gp_rdp == rdp)
- show_rcu_nocb_gp_state(rdp);
-
- sprintf(bufw, "%ld", rsclp->gp_seq[RCU_WAIT_TAIL]);
- sprintf(bufr, "%ld", rsclp->gp_seq[RCU_NEXT_READY_TAIL]);
- pr_info(" CB %d^%d->%d %c%c%c%c%c%c F%ld L%ld C%d %c%c%s%c%s%c%c q%ld %c CPU %d%s\n",
- rdp->cpu, rdp->nocb_gp_rdp->cpu,
- rdp->nocb_next_cb_rdp ? rdp->nocb_next_cb_rdp->cpu : -1,
- "kK"[!!rdp->nocb_cb_kthread],
- "bB"[raw_spin_is_locked(&rdp->nocb_bypass_lock)],
- "cC"[!!atomic_read(&rdp->nocb_lock_contended)],
- "lL"[raw_spin_is_locked(&rdp->nocb_lock)],
- "sS"[!!rdp->nocb_cb_sleep],
- ".W"[swait_active(&rdp->nocb_cb_wq)],
- jiffies - rdp->nocb_bypass_first,
- jiffies - rdp->nocb_nobypass_last,
- rdp->nocb_nobypass_count,
- ".D"[rcu_segcblist_ready_cbs(rsclp)],
- ".W"[!rcu_segcblist_segempty(rsclp, RCU_WAIT_TAIL)],
- rcu_segcblist_segempty(rsclp, RCU_WAIT_TAIL) ? "" : bufw,
- ".R"[!rcu_segcblist_segempty(rsclp, RCU_NEXT_READY_TAIL)],
- rcu_segcblist_segempty(rsclp, RCU_NEXT_READY_TAIL) ? "" : bufr,
- ".N"[!rcu_segcblist_segempty(rsclp, RCU_NEXT_TAIL)],
- ".B"[!!rcu_cblist_n_cbs(&rdp->nocb_bypass)],
- rcu_segcblist_n_cbs(&rdp->cblist),
- rdp->nocb_cb_kthread ? task_state_to_char(rdp->nocb_cb_kthread) : '.',
- rdp->nocb_cb_kthread ? (int)task_cpu(rdp->nocb_gp_kthread) : -1,
- show_rcu_should_be_on_cpu(rdp->nocb_cb_kthread));
-
- /* It is OK for GP kthreads to have GP state. */
- if (rdp->nocb_gp_rdp == rdp)
- return;
-
- waslocked = raw_spin_is_locked(&rdp->nocb_gp_lock);
- wassleep = swait_active(&rdp->nocb_gp_wq);
- if (!rdp->nocb_gp_sleep && !waslocked && !wassleep)
- return; /* Nothing untoward. */
-
- pr_info(" nocb GP activity on CB-only CPU!!! %c%c%c %c\n",
- "lL"[waslocked],
- "dD"[!!rdp->nocb_defer_wakeup],
- "sS"[!!rdp->nocb_gp_sleep],
- ".W"[wassleep]);
-}
-
-#else /* #ifdef CONFIG_RCU_NOCB_CPU */
-
-/* No ->nocb_lock to acquire. */
-static void rcu_nocb_lock(struct rcu_data *rdp)
-{
-}
-
-/* No ->nocb_lock to release. */
-static void rcu_nocb_unlock(struct rcu_data *rdp)
-{
-}
-
-/* No ->nocb_lock to release. */
-static void rcu_nocb_unlock_irqrestore(struct rcu_data *rdp,
- unsigned long flags)
-{
- local_irq_restore(flags);
-}
-
-/* Lockdep check that ->cblist may be safely accessed. */
-static void rcu_lockdep_assert_cblist_protected(struct rcu_data *rdp)
-{
- lockdep_assert_irqs_disabled();
-}
-
-static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq)
-{
-}
-
-static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp)
-{
- return NULL;
-}
-
-static void rcu_init_one_nocb(struct rcu_node *rnp)
-{
-}
-
-static bool rcu_nocb_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
- unsigned long j)
-{
- return true;
-}
-
-static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
- bool *was_alldone, unsigned long flags)
-{
- return false;
-}
-
-static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
- unsigned long flags)
-{
- WARN_ON_ONCE(1); /* Should be dead code! */
-}
-
-static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
-{
-}
-
-static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp, int level)
-{
- return false;
-}
-
-static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
-{
- return false;
-}
-
-static void rcu_spawn_cpu_nocb_kthread(int cpu)
-{
-}
-
-static void __init rcu_spawn_nocb_kthreads(void)
-{
-}
-
-static void show_rcu_nocb_state(struct rcu_data *rdp)
-{
-}
-
-#endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
-
/*
* Is this CPU a NO_HZ_FULL CPU that should ignore RCU so that the
* grace-period kthread will do force_quiescent_state() processing?
/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
static void rcu_dynticks_task_trace_enter(void)
{
-#ifdef CONFIG_TASKS_RCU_TRACE
+#ifdef CONFIG_TASKS_TRACE_RCU
if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
current->trc_reader_special.b.need_mb = true;
-#endif /* #ifdef CONFIG_TASKS_RCU_TRACE */
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
}
/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
static void rcu_dynticks_task_trace_exit(void)
{
-#ifdef CONFIG_TASKS_RCU_TRACE
+#ifdef CONFIG_TASKS_TRACE_RCU
if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
current->trc_reader_special.b.need_mb = false;
-#endif /* #ifdef CONFIG_TASKS_RCU_TRACE */
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
}
* Author: Paul E. McKenney <paulmck@linux.ibm.com>
*/
+#include <linux/kvm_para.h>
+
//////////////////////////////////////////////////////////////////////////////
//
// Controlling CPU stall warnings, including delay calculation.
}
/**
- * rcu_cpu_stall_reset - prevent further stall warnings in current grace period
- *
- * Set the stall-warning timeout way off into the future, thus preventing
- * any RCU CPU stall-warning messages from appearing in the current set of
- * RCU grace periods.
+ * rcu_cpu_stall_reset - restart stall-warning timeout for current grace period
*
* The caller must disable hard irqs.
*/
void rcu_cpu_stall_reset(void)
{
- WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2);
+ WRITE_ONCE(rcu_state.jiffies_stall,
+ jiffies + rcu_jiffies_till_stall_check());
}
//////////////////////////////////////////////////////////////////////////////
struct task_struct *ts[8];
lockdep_assert_irqs_disabled();
- if (!rcu_preempt_blocked_readers_cgp(rnp))
+ if (!rcu_preempt_blocked_readers_cgp(rnp)) {
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return 0;
+ }
pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
rnp->level, rnp->grplo, rnp->grphi);
t = list_entry(rnp->gp_tasks->prev,
break;
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
- for (i--; i; i--) {
- t = ts[i];
+ while (i) {
+ t = ts[--i];
if (!try_invoke_on_locked_down_task(t, check_slow_task, &rscr))
pr_cont(" P%d", t->pid);
else
static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
{
- struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
sprintf(cp, "last_accelerate: %04lx/%04lx dyntick_enabled: %d",
rdp->last_accelerate & 0xffff, jiffies & 0xffff,
pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#x ->cpu=%d\n",
rcu_state.name, j,
(long)rcu_seq_current(&rcu_state.gp_seq),
- data_race(rcu_state.gp_flags),
- gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
- gpk ? gpk->__state : ~0, cpu);
+ data_race(READ_ONCE(rcu_state.gp_flags)),
+ gp_state_getname(rcu_state.gp_state),
+ data_race(READ_ONCE(rcu_state.gp_state)),
+ gpk ? data_race(READ_ONCE(gpk->__state)) : ~0, cpu);
if (gpk) {
pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
pr_err("RCU grace-period kthread stack dump:\n");
(long)rcu_seq_current(&rcu_state.gp_seq),
data_race(rcu_state.gp_flags),
gp_state_getname(RCU_GP_WAIT_FQS), RCU_GP_WAIT_FQS,
- gpk->__state);
+ data_race(READ_ONCE(gpk->__state)));
pr_err("\tPossible timer handling issue on cpu=%d timer-softirq=%u\n",
cpu, kstat_softirqs_cpu(TIMER_SOFTIRQ, cpu));
}
pr_err("INFO: Stall ended before state dump start\n");
} else {
j = jiffies;
- gpa = data_race(rcu_state.gp_activity);
+ gpa = data_race(READ_ONCE(rcu_state.gp_activity));
pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n",
rcu_state.name, j - gpa, j, gpa,
- data_race(jiffies_till_next_fqs),
- rcu_get_root()->qsmask);
+ data_race(READ_ONCE(jiffies_till_next_fqs)),
+ data_race(READ_ONCE(rcu_get_root()->qsmask)));
}
}
/* Rewrite if needed in case of slow consoles. */
static void check_cpu_stall(struct rcu_data *rdp)
{
+ bool didstall = false;
unsigned long gs1;
unsigned long gs2;
unsigned long gps;
ULONG_CMP_GE(gps, js))
return; /* No stall or GP completed since entering function. */
rnp = rdp->mynode;
- jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
+ jn = jiffies + ULONG_MAX / 2;
if (rcu_gp_in_progress() &&
(READ_ONCE(rnp->qsmask) & rdp->grpmask) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
+ /*
+ * If a virtual machine is stopped by the host it can look to
+ * the watchdog like an RCU stall. Check to see if the host
+ * stopped the vm.
+ */
+ if (kvm_check_and_clear_guest_paused())
+ return;
+
/* We haven't checked in, so go dump stack. */
print_cpu_stall(gps);
if (READ_ONCE(rcu_cpu_stall_ftrace_dump))
rcu_ftrace_dump(DUMP_ALL);
+ didstall = true;
} else if (rcu_gp_in_progress() &&
ULONG_CMP_GE(j, js + RCU_STALL_RAT_DELAY) &&
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
+ /*
+ * If a virtual machine is stopped by the host it can look to
+ * the watchdog like an RCU stall. Check to see if the host
+ * stopped the vm.
+ */
+ if (kvm_check_and_clear_guest_paused())
+ return;
+
/* They had a few time units to dump stack, so complain. */
print_other_cpu_stall(gs2, gps);
if (READ_ONCE(rcu_cpu_stall_ftrace_dump))
rcu_ftrace_dump(DUMP_ALL);
+ didstall = true;
+ }
+ if (didstall && READ_ONCE(rcu_state.jiffies_stall) == jn) {
+ jn = jiffies + 3 * rcu_jiffies_till_stall_check() + 3;
+ WRITE_ONCE(rcu_state.jiffies_stall, jn);
}
}
rcu_for_each_leaf_node(rnp) {
if (!cpup) {
- if (READ_ONCE(rnp->qsmask)) {
+ if (data_race(READ_ONCE(rnp->qsmask))) {
return false;
} else {
if (READ_ONCE(rnp->gp_tasks))
struct task_struct *t = READ_ONCE(rcu_state.gp_kthread);
j = jiffies;
- ja = j - data_race(rcu_state.gp_activity);
- jr = j - data_race(rcu_state.gp_req_activity);
- js = j - data_race(rcu_state.gp_start);
- jw = j - data_race(rcu_state.gp_wake_time);
+ ja = j - data_race(READ_ONCE(rcu_state.gp_activity));
+ jr = j - data_race(READ_ONCE(rcu_state.gp_req_activity));
+ js = j - data_race(READ_ONCE(rcu_state.gp_start));
+ jw = j - data_race(READ_ONCE(rcu_state.gp_wake_time));
pr_info("%s: wait state: %s(%d) ->state: %#x ->rt_priority %u delta ->gp_start %lu ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_max %lu ->gp_flags %#x\n",
rcu_state.name, gp_state_getname(rcu_state.gp_state),
- rcu_state.gp_state, t ? t->__state : 0x1ffff, t ? t->rt_priority : 0xffU,
- js, ja, jr, jw, (long)data_race(rcu_state.gp_wake_seq),
- (long)data_race(rcu_state.gp_seq),
- (long)data_race(rcu_get_root()->gp_seq_needed),
- data_race(rcu_state.gp_max),
- data_race(rcu_state.gp_flags));
+ data_race(READ_ONCE(rcu_state.gp_state)),
+ t ? data_race(READ_ONCE(t->__state)) : 0x1ffff, t ? t->rt_priority : 0xffU,
+ js, ja, jr, jw, (long)data_race(READ_ONCE(rcu_state.gp_wake_seq)),
+ (long)data_race(READ_ONCE(rcu_state.gp_seq)),
+ (long)data_race(READ_ONCE(rcu_get_root()->gp_seq_needed)),
+ data_race(READ_ONCE(rcu_state.gp_max)),
+ data_race(READ_ONCE(rcu_state.gp_flags)));
rcu_for_each_node_breadth_first(rnp) {
if (ULONG_CMP_GE(READ_ONCE(rcu_state.gp_seq), READ_ONCE(rnp->gp_seq_needed)) &&
- !data_race(rnp->qsmask) && !data_race(rnp->boost_tasks) &&
- !data_race(rnp->exp_tasks) && !data_race(rnp->gp_tasks))
+ !data_race(READ_ONCE(rnp->qsmask)) && !data_race(READ_ONCE(rnp->boost_tasks)) &&
+ !data_race(READ_ONCE(rnp->exp_tasks)) && !data_race(READ_ONCE(rnp->gp_tasks)))
continue;
pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld ->qsmask %#lx %c%c%c%c ->n_boosts %ld\n",
rnp->grplo, rnp->grphi,
- (long)data_race(rnp->gp_seq), (long)data_race(rnp->gp_seq_needed),
- data_race(rnp->qsmask),
- ".b"[!!data_race(rnp->boost_kthread_task)],
- ".B"[!!data_race(rnp->boost_tasks)],
- ".E"[!!data_race(rnp->exp_tasks)],
- ".G"[!!data_race(rnp->gp_tasks)],
- data_race(rnp->n_boosts));
+ (long)data_race(READ_ONCE(rnp->gp_seq)),
+ (long)data_race(READ_ONCE(rnp->gp_seq_needed)),
+ data_race(READ_ONCE(rnp->qsmask)),
+ ".b"[!!data_race(READ_ONCE(rnp->boost_kthread_task))],
+ ".B"[!!data_race(READ_ONCE(rnp->boost_tasks))],
+ ".E"[!!data_race(READ_ONCE(rnp->exp_tasks))],
+ ".G"[!!data_race(READ_ONCE(rnp->gp_tasks))],
+ data_race(READ_ONCE(rnp->n_boosts)));
if (!rcu_is_leaf_node(rnp))
continue;
for_each_leaf_node_possible_cpu(rnp, cpu) {
READ_ONCE(rdp->gp_seq_needed)))
continue;
pr_info("\tcpu %d ->gp_seq_needed %ld\n",
- cpu, (long)data_race(rdp->gp_seq_needed));
+ cpu, (long)data_race(READ_ONCE(rdp->gp_seq_needed)));
}
}
for_each_possible_cpu(cpu) {
rdp = per_cpu_ptr(&rcu_data, cpu);
- cbs += data_race(rdp->n_cbs_invoked);
+ cbs += data_race(READ_ONCE(rdp->n_cbs_invoked));
if (rcu_segcblist_is_offloaded(&rdp->cblist))
show_rcu_nocb_state(rdp);
}
if (rcu_gp_in_progress()) {
pr_info("%s: GP age %lu jiffies\n",
- __func__, jiffies - rcu_state.gp_start);
+ __func__, jiffies - data_race(READ_ONCE(rcu_state.gp_start)));
show_rcu_gp_kthreads();
} else {
pr_info("%s: Last GP end %lu jiffies ago\n",
- __func__, jiffies - rcu_state.gp_end);
+ __func__, jiffies - data_race(READ_ONCE(rcu_state.gp_end)));
preempt_disable();
rdp = this_cpu_ptr(&rcu_data);
rcu_check_gp_start_stall(rdp->mynode, rdp, j);
torture_param(int, verbose, 0, "Enable verbose debugging printk()s");
torture_param(int, weight_resched, -1, "Testing weight for resched_cpu() operations.");
torture_param(int, weight_single, -1, "Testing weight for single-CPU no-wait operations.");
+torture_param(int, weight_single_rpc, -1, "Testing weight for single-CPU RPC operations.");
torture_param(int, weight_single_wait, -1, "Testing weight for single-CPU operations.");
torture_param(int, weight_many, -1, "Testing weight for multi-CPU no-wait operations.");
torture_param(int, weight_many_wait, -1, "Testing weight for multi-CPU operations.");
long long n_resched;
long long n_single;
long long n_single_ofl;
+ long long n_single_rpc;
+ long long n_single_rpc_ofl;
long long n_single_wait;
long long n_single_wait_ofl;
long long n_many;
// Data for random primitive selection
#define SCF_PRIM_RESCHED 0
#define SCF_PRIM_SINGLE 1
-#define SCF_PRIM_MANY 2
-#define SCF_PRIM_ALL 3
-#define SCF_NPRIMS 7 // Need wait and no-wait versions of each,
- // except for SCF_PRIM_RESCHED.
+#define SCF_PRIM_SINGLE_RPC 2
+#define SCF_PRIM_MANY 3
+#define SCF_PRIM_ALL 4
+#define SCF_NPRIMS 8 // Need wait and no-wait versions of each,
+ // except for SCF_PRIM_RESCHED and
+ // SCF_PRIM_SINGLE_RPC.
static char *scf_prim_name[] = {
"resched_cpu",
"smp_call_function_single",
+ "smp_call_function_single_rpc",
"smp_call_function_many",
"smp_call_function",
};
bool scfc_out;
int scfc_cpu; // -1 for not _single().
bool scfc_wait;
+ bool scfc_rpc;
+ struct completion scfc_completion;
};
// Use to wait for all threads to start.
scfs.n_resched += scf_stats_p[i].n_resched;
scfs.n_single += scf_stats_p[i].n_single;
scfs.n_single_ofl += scf_stats_p[i].n_single_ofl;
+ scfs.n_single_rpc += scf_stats_p[i].n_single_rpc;
scfs.n_single_wait += scf_stats_p[i].n_single_wait;
scfs.n_single_wait_ofl += scf_stats_p[i].n_single_wait_ofl;
scfs.n_many += scf_stats_p[i].n_many;
if (atomic_read(&n_errs) || atomic_read(&n_mb_in_errs) ||
atomic_read(&n_mb_out_errs) || atomic_read(&n_alloc_errs))
bangstr = "!!! ";
- pr_alert("%s %sscf_invoked_count %s: %lld resched: %lld single: %lld/%lld single_ofl: %lld/%lld many: %lld/%lld all: %lld/%lld ",
+ pr_alert("%s %sscf_invoked_count %s: %lld resched: %lld single: %lld/%lld single_ofl: %lld/%lld single_rpc: %lld single_rpc_ofl: %lld many: %lld/%lld all: %lld/%lld ",
SCFTORT_FLAG, bangstr, isdone ? "VER" : "ver", invoked_count, scfs.n_resched,
scfs.n_single, scfs.n_single_wait, scfs.n_single_ofl, scfs.n_single_wait_ofl,
+ scfs.n_single_rpc, scfs.n_single_rpc_ofl,
scfs.n_many, scfs.n_many_wait, scfs.n_all, scfs.n_all_wait);
torture_onoff_stats();
pr_cont("ste: %d stnmie: %d stnmoe: %d staf: %d\n", atomic_read(&n_errs),
out:
if (unlikely(!scfcp))
return;
- if (scfcp->scfc_wait)
+ if (scfcp->scfc_wait) {
WRITE_ONCE(scfcp->scfc_out, true);
- else
+ if (scfcp->scfc_rpc)
+ complete(&scfcp->scfc_completion);
+ } else {
kfree(scfcp);
+ }
}
// As above, but check for correct CPU.
scfcp->scfc_cpu = -1;
scfcp->scfc_wait = scfsp->scfs_wait;
scfcp->scfc_out = false;
+ scfcp->scfc_rpc = false;
}
}
switch (scfsp->scfs_prim) {
scfcp = NULL;
}
break;
+ case SCF_PRIM_SINGLE_RPC:
+ if (!scfcp)
+ break;
+ cpu = torture_random(trsp) % nr_cpu_ids;
+ scfp->n_single_rpc++;
+ scfcp->scfc_cpu = cpu;
+ scfcp->scfc_wait = true;
+ init_completion(&scfcp->scfc_completion);
+ scfcp->scfc_rpc = true;
+ barrier(); // Prevent race-reduction compiler optimizations.
+ scfcp->scfc_in = true;
+ ret = smp_call_function_single(cpu, scf_handler_1, (void *)scfcp, 0);
+ if (!ret) {
+ if (use_cpus_read_lock)
+ cpus_read_unlock();
+ else
+ preempt_enable();
+ wait_for_completion(&scfcp->scfc_completion);
+ if (use_cpus_read_lock)
+ cpus_read_lock();
+ else
+ preempt_disable();
+ } else {
+ scfp->n_single_rpc_ofl++;
+ kfree(scfcp);
+ scfcp = NULL;
+ }
+ break;
case SCF_PRIM_MANY:
if (scfsp->scfs_wait)
scfp->n_many_wait++;
}
if (scfcp && scfsp->scfs_wait) {
if (WARN_ON_ONCE((num_online_cpus() > 1 || scfsp->scfs_prim == SCF_PRIM_SINGLE) &&
- !scfcp->scfc_out))
+ !scfcp->scfc_out)) {
+ pr_warn("%s: Memory-ordering failure, scfs_prim: %d.\n", __func__, scfsp->scfs_prim);
atomic_inc(&n_mb_out_errs); // Leak rather than trash!
- else
+ } else {
kfree(scfcp);
+ }
barrier(); // Prevent race-reduction compiler optimizations.
}
if (use_cpus_read_lock)
scftorture_print_module_parms(const char *tag)
{
pr_alert(SCFTORT_FLAG
- "--- %s: verbose=%d holdoff=%d longwait=%d nthreads=%d onoff_holdoff=%d onoff_interval=%d shutdown_secs=%d stat_interval=%d stutter=%d use_cpus_read_lock=%d, weight_resched=%d, weight_single=%d, weight_single_wait=%d, weight_many=%d, weight_many_wait=%d, weight_all=%d, weight_all_wait=%d\n", tag,
- verbose, holdoff, longwait, nthreads, onoff_holdoff, onoff_interval, shutdown, stat_interval, stutter, use_cpus_read_lock, weight_resched, weight_single, weight_single_wait, weight_many, weight_many_wait, weight_all, weight_all_wait);
+ "--- %s: verbose=%d holdoff=%d longwait=%d nthreads=%d onoff_holdoff=%d onoff_interval=%d shutdown_secs=%d stat_interval=%d stutter=%d use_cpus_read_lock=%d, weight_resched=%d, weight_single=%d, weight_single_rpc=%d, weight_single_wait=%d, weight_many=%d, weight_many_wait=%d, weight_all=%d, weight_all_wait=%d\n", tag,
+ verbose, holdoff, longwait, nthreads, onoff_holdoff, onoff_interval, shutdown, stat_interval, stutter, use_cpus_read_lock, weight_resched, weight_single, weight_single_rpc, weight_single_wait, weight_many, weight_many_wait, weight_all, weight_all_wait);
}
static void scf_cleanup_handler(void *unused)
return;
WRITE_ONCE(scfdone, true);
- if (nthreads)
+ if (nthreads && scf_stats_p)
for (i = 0; i < nthreads; i++)
torture_stop_kthread("scftorture_invoker", scf_stats_p[i].task);
else
int firsterr = 0;
unsigned long weight_resched1 = weight_resched;
unsigned long weight_single1 = weight_single;
+ unsigned long weight_single_rpc1 = weight_single_rpc;
unsigned long weight_single_wait1 = weight_single_wait;
unsigned long weight_many1 = weight_many;
unsigned long weight_many_wait1 = weight_many_wait;
scftorture_print_module_parms("Start of test");
- if (weight_resched == -1 && weight_single == -1 && weight_single_wait == -1 &&
+ if (weight_resched == -1 &&
+ weight_single == -1 && weight_single_rpc == -1 && weight_single_wait == -1 &&
weight_many == -1 && weight_many_wait == -1 &&
weight_all == -1 && weight_all_wait == -1) {
weight_resched1 = 2 * nr_cpu_ids;
weight_single1 = 2 * nr_cpu_ids;
+ weight_single_rpc1 = 2 * nr_cpu_ids;
weight_single_wait1 = 2 * nr_cpu_ids;
weight_many1 = 2;
weight_many_wait1 = 2;
weight_resched1 = 0;
if (weight_single == -1)
weight_single1 = 0;
+ if (weight_single_rpc == -1)
+ weight_single_rpc1 = 0;
if (weight_single_wait == -1)
weight_single_wait1 = 0;
if (weight_many == -1)
if (weight_all_wait == -1)
weight_all_wait1 = 0;
}
- if (weight_single1 == 0 && weight_single_wait1 == 0 &&
+ if (weight_single1 == 0 && weight_single_rpc1 == 0 && weight_single_wait1 == 0 &&
weight_many1 == 0 && weight_many_wait1 == 0 &&
weight_all1 == 0 && weight_all_wait1 == 0) {
VERBOSE_SCFTORTOUT_ERRSTRING("all zero weights makes no sense");
else if (weight_resched1)
VERBOSE_SCFTORTOUT_ERRSTRING("built as module, weight_resched ignored");
scf_sel_add(weight_single1, SCF_PRIM_SINGLE, false);
+ scf_sel_add(weight_single_rpc1, SCF_PRIM_SINGLE_RPC, true);
scf_sel_add(weight_single_wait1, SCF_PRIM_SINGLE, true);
scf_sel_add(weight_many1, SCF_PRIM_MANY, false);
scf_sel_add(weight_many_wait1, SCF_PRIM_MANY, true);
static atomic_t sched_core_count;
static struct cpumask sched_core_mask;
+static void sched_core_lock(int cpu, unsigned long *flags)
+{
+ const struct cpumask *smt_mask = cpu_smt_mask(cpu);
+ int t, i = 0;
+
+ local_irq_save(*flags);
+ for_each_cpu(t, smt_mask)
+ raw_spin_lock_nested(&cpu_rq(t)->__lock, i++);
+}
+
+static void sched_core_unlock(int cpu, unsigned long *flags)
+{
+ const struct cpumask *smt_mask = cpu_smt_mask(cpu);
+ int t;
+
+ for_each_cpu(t, smt_mask)
+ raw_spin_unlock(&cpu_rq(t)->__lock);
+ local_irq_restore(*flags);
+}
+
static void __sched_core_flip(bool enabled)
{
- int cpu, t, i;
+ unsigned long flags;
+ int cpu, t;
cpus_read_lock();
for_each_cpu(cpu, &sched_core_mask) {
const struct cpumask *smt_mask = cpu_smt_mask(cpu);
- i = 0;
- local_irq_disable();
- for_each_cpu(t, smt_mask) {
- /* supports up to SMT8 */
- raw_spin_lock_nested(&cpu_rq(t)->__lock, i++);
- }
+ sched_core_lock(cpu, &flags);
for_each_cpu(t, smt_mask)
cpu_rq(t)->core_enabled = enabled;
- for_each_cpu(t, smt_mask)
- raw_spin_unlock(&cpu_rq(t)->__lock);
- local_irq_enable();
+ sched_core_unlock(cpu, &flags);
cpumask_andnot(&sched_core_mask, &sched_core_mask, smt_mask);
}
{
int i, cpu = smp_processor_id(), default_cpu = -1;
struct sched_domain *sd;
+ const struct cpumask *hk_mask;
if (housekeeping_cpu(cpu, HK_FLAG_TIMER)) {
if (!idle_cpu(cpu))
default_cpu = cpu;
}
+ hk_mask = housekeeping_cpumask(HK_FLAG_TIMER);
+
rcu_read_lock();
for_each_domain(cpu, sd) {
- for_each_cpu_and(i, sched_domain_span(sd),
- housekeeping_cpumask(HK_FLAG_TIMER)) {
+ for_each_cpu_and(i, sched_domain_span(sd), hk_mask) {
if (cpu == i)
continue;
uclamp_rq_dec_id(rq, p, clamp_id);
}
+static inline void uclamp_rq_reinc_id(struct rq *rq, struct task_struct *p,
+ enum uclamp_id clamp_id)
+{
+ if (!p->uclamp[clamp_id].active)
+ return;
+
+ uclamp_rq_dec_id(rq, p, clamp_id);
+ uclamp_rq_inc_id(rq, p, clamp_id);
+
+ /*
+ * Make sure to clear the idle flag if we've transiently reached 0
+ * active tasks on rq.
+ */
+ if (clamp_id == UCLAMP_MAX && (rq->uclamp_flags & UCLAMP_FLAG_IDLE))
+ rq->uclamp_flags &= ~UCLAMP_FLAG_IDLE;
+}
+
static inline void
uclamp_update_active(struct task_struct *p)
{
* affecting a valid clamp bucket, the next time it's enqueued,
* it will already see the updated clamp bucket value.
*/
- for_each_clamp_id(clamp_id) {
- if (p->uclamp[clamp_id].active) {
- uclamp_rq_dec_id(rq, p, clamp_id);
- uclamp_rq_inc_id(rq, p, clamp_id);
- }
- }
+ for_each_clamp_id(clamp_id)
+ uclamp_rq_reinc_id(rq, p, clamp_id);
task_rq_unlock(rq, p, &rf);
}
/* Non kernel threads are not allowed during either online or offline. */
if (!(p->flags & PF_KTHREAD))
- return cpu_active(cpu);
+ return cpu_active(cpu) && task_cpu_possible(cpu, p);
/* KTHREAD_IS_PER_CPU is always allowed. */
if (kthread_is_per_cpu(p))
__do_set_cpus_allowed(p, new_mask, 0);
}
+int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
+ int node)
+{
+ if (!src->user_cpus_ptr)
+ return 0;
+
+ dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
+ if (!dst->user_cpus_ptr)
+ return -ENOMEM;
+
+ cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
+ return 0;
+}
+
+static inline struct cpumask *clear_user_cpus_ptr(struct task_struct *p)
+{
+ struct cpumask *user_mask = NULL;
+
+ swap(p->user_cpus_ptr, user_mask);
+
+ return user_mask;
+}
+
+void release_user_cpus_ptr(struct task_struct *p)
+{
+ kfree(clear_user_cpus_ptr(p));
+}
+
/*
* This function is wildly self concurrent; here be dragons.
*
}
/*
- * Change a given task's CPU affinity. Migrate the thread to a
- * proper CPU and schedule it away if the CPU it's executing on
- * is removed from the allowed bitmask.
- *
- * NOTE: the caller must have a valid reference to the task, the
- * task must not exit() & deallocate itself prematurely. The
- * call is not atomic; no spinlocks may be held.
+ * Called with both p->pi_lock and rq->lock held; drops both before returning.
*/
-static int __set_cpus_allowed_ptr(struct task_struct *p,
- const struct cpumask *new_mask,
- u32 flags)
+static int __set_cpus_allowed_ptr_locked(struct task_struct *p,
+ const struct cpumask *new_mask,
+ u32 flags,
+ struct rq *rq,
+ struct rq_flags *rf)
+ __releases(rq->lock)
+ __releases(p->pi_lock)
{
+ const struct cpumask *cpu_allowed_mask = task_cpu_possible_mask(p);
const struct cpumask *cpu_valid_mask = cpu_active_mask;
+ bool kthread = p->flags & PF_KTHREAD;
+ struct cpumask *user_mask = NULL;
unsigned int dest_cpu;
- struct rq_flags rf;
- struct rq *rq;
int ret = 0;
- rq = task_rq_lock(p, &rf);
update_rq_clock(rq);
- if (p->flags & PF_KTHREAD || is_migration_disabled(p)) {
+ if (kthread || is_migration_disabled(p)) {
/*
* Kernel threads are allowed on online && !active CPUs,
* however, during cpu-hot-unplug, even these might get pushed
cpu_valid_mask = cpu_online_mask;
}
+ if (!kthread && !cpumask_subset(new_mask, cpu_allowed_mask)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
/*
* Must re-check here, to close a race against __kthread_bind(),
* sched_setaffinity() is not guaranteed to observe the flag.
__do_set_cpus_allowed(p, new_mask, flags);
- return affine_move_task(rq, p, &rf, dest_cpu, flags);
+ if (flags & SCA_USER)
+ user_mask = clear_user_cpus_ptr(p);
+
+ ret = affine_move_task(rq, p, rf, dest_cpu, flags);
+
+ kfree(user_mask);
+
+ return ret;
out:
- task_rq_unlock(rq, p, &rf);
+ task_rq_unlock(rq, p, rf);
return ret;
}
+/*
+ * Change a given task's CPU affinity. Migrate the thread to a
+ * proper CPU and schedule it away if the CPU it's executing on
+ * is removed from the allowed bitmask.
+ *
+ * NOTE: the caller must have a valid reference to the task, the
+ * task must not exit() & deallocate itself prematurely. The
+ * call is not atomic; no spinlocks may be held.
+ */
+static int __set_cpus_allowed_ptr(struct task_struct *p,
+ const struct cpumask *new_mask, u32 flags)
+{
+ struct rq_flags rf;
+ struct rq *rq;
+
+ rq = task_rq_lock(p, &rf);
+ return __set_cpus_allowed_ptr_locked(p, new_mask, flags, rq, &rf);
+}
+
int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumask *new_mask)
{
return __set_cpus_allowed_ptr(p, new_mask, 0);
}
EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
+/*
+ * Change a given task's CPU affinity to the intersection of its current
+ * affinity mask and @subset_mask, writing the resulting mask to @new_mask
+ * and pointing @p->user_cpus_ptr to a copy of the old mask.
+ * If the resulting mask is empty, leave the affinity unchanged and return
+ * -EINVAL.
+ */
+static int restrict_cpus_allowed_ptr(struct task_struct *p,
+ struct cpumask *new_mask,
+ const struct cpumask *subset_mask)
+{
+ struct cpumask *user_mask = NULL;
+ struct rq_flags rf;
+ struct rq *rq;
+ int err;
+
+ if (!p->user_cpus_ptr) {
+ user_mask = kmalloc(cpumask_size(), GFP_KERNEL);
+ if (!user_mask)
+ return -ENOMEM;
+ }
+
+ rq = task_rq_lock(p, &rf);
+
+ /*
+ * Forcefully restricting the affinity of a deadline task is
+ * likely to cause problems, so fail and noisily override the
+ * mask entirely.
+ */
+ if (task_has_dl_policy(p) && dl_bandwidth_enabled()) {
+ err = -EPERM;
+ goto err_unlock;
+ }
+
+ if (!cpumask_and(new_mask, &p->cpus_mask, subset_mask)) {
+ err = -EINVAL;
+ goto err_unlock;
+ }
+
+ /*
+ * We're about to butcher the task affinity, so keep track of what
+ * the user asked for in case we're able to restore it later on.
+ */
+ if (user_mask) {
+ cpumask_copy(user_mask, p->cpus_ptr);
+ p->user_cpus_ptr = user_mask;
+ }
+
+ return __set_cpus_allowed_ptr_locked(p, new_mask, 0, rq, &rf);
+
+err_unlock:
+ task_rq_unlock(rq, p, &rf);
+ kfree(user_mask);
+ return err;
+}
+
+/*
+ * Restrict the CPU affinity of task @p so that it is a subset of
+ * task_cpu_possible_mask() and point @p->user_cpu_ptr to a copy of the
+ * old affinity mask. If the resulting mask is empty, we warn and walk
+ * up the cpuset hierarchy until we find a suitable mask.
+ */
+void force_compatible_cpus_allowed_ptr(struct task_struct *p)
+{
+ cpumask_var_t new_mask;
+ const struct cpumask *override_mask = task_cpu_possible_mask(p);
+
+ alloc_cpumask_var(&new_mask, GFP_KERNEL);
+
+ /*
+ * __migrate_task() can fail silently in the face of concurrent
+ * offlining of the chosen destination CPU, so take the hotplug
+ * lock to ensure that the migration succeeds.
+ */
+ cpus_read_lock();
+ if (!cpumask_available(new_mask))
+ goto out_set_mask;
+
+ if (!restrict_cpus_allowed_ptr(p, new_mask, override_mask))
+ goto out_free_mask;
+
+ /*
+ * We failed to find a valid subset of the affinity mask for the
+ * task, so override it based on its cpuset hierarchy.
+ */
+ cpuset_cpus_allowed(p, new_mask);
+ override_mask = new_mask;
+
+out_set_mask:
+ if (printk_ratelimit()) {
+ printk_deferred("Overriding affinity for process %d (%s) to CPUs %*pbl\n",
+ task_pid_nr(p), p->comm,
+ cpumask_pr_args(override_mask));
+ }
+
+ WARN_ON(set_cpus_allowed_ptr(p, override_mask));
+out_free_mask:
+ cpus_read_unlock();
+ free_cpumask_var(new_mask);
+}
+
+static int
+__sched_setaffinity(struct task_struct *p, const struct cpumask *mask);
+
+/*
+ * Restore the affinity of a task @p which was previously restricted by a
+ * call to force_compatible_cpus_allowed_ptr(). This will clear (and free)
+ * @p->user_cpus_ptr.
+ *
+ * It is the caller's responsibility to serialise this with any calls to
+ * force_compatible_cpus_allowed_ptr(@p).
+ */
+void relax_compatible_cpus_allowed_ptr(struct task_struct *p)
+{
+ struct cpumask *user_mask = p->user_cpus_ptr;
+ unsigned long flags;
+
+ /*
+ * Try to restore the old affinity mask. If this fails, then
+ * we free the mask explicitly to avoid it being inherited across
+ * a subsequent fork().
+ */
+ if (!user_mask || !__sched_setaffinity(p, user_mask))
+ return;
+
+ raw_spin_lock_irqsave(&p->pi_lock, flags);
+ user_mask = clear_user_cpus_ptr(p);
+ raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+
+ kfree(user_mask);
+}
+
void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
{
#ifdef CONFIG_SCHED_DEBUG
/* Look for allowed, online CPU in same node. */
for_each_cpu(dest_cpu, nodemask) {
- if (!cpu_active(dest_cpu))
- continue;
- if (cpumask_test_cpu(dest_cpu, p->cpus_ptr))
+ if (is_cpu_allowed(p, dest_cpu))
return dest_cpu;
}
}
/* No more Mr. Nice Guy. */
switch (state) {
case cpuset:
- if (IS_ENABLED(CONFIG_CPUSETS)) {
- cpuset_cpus_allowed_fallback(p);
+ if (cpuset_cpus_allowed_fallback(p)) {
state = possible;
break;
}
*
* More yuck to audit.
*/
- do_set_cpus_allowed(p, cpu_possible_mask);
+ do_set_cpus_allowed(p, task_cpu_possible_mask(p));
state = fail;
break;
-
case fail:
BUG();
break;
rq_unlock(rq, &rf);
}
+/*
+ * Invoked from try_to_wake_up() to check whether the task can be woken up.
+ *
+ * The caller holds p::pi_lock if p != current or has preemption
+ * disabled when p == current.
+ *
+ * The rules of PREEMPT_RT saved_state:
+ *
+ * The related locking code always holds p::pi_lock when updating
+ * p::saved_state, which means the code is fully serialized in both cases.
+ *
+ * The lock wait and lock wakeups happen via TASK_RTLOCK_WAIT. No other
+ * bits set. This allows to distinguish all wakeup scenarios.
+ */
+static __always_inline
+bool ttwu_state_match(struct task_struct *p, unsigned int state, int *success)
+{
+ if (IS_ENABLED(CONFIG_DEBUG_PREEMPT)) {
+ WARN_ON_ONCE((state & TASK_RTLOCK_WAIT) &&
+ state != TASK_RTLOCK_WAIT);
+ }
+
+ if (READ_ONCE(p->__state) & state) {
+ *success = 1;
+ return true;
+ }
+
+#ifdef CONFIG_PREEMPT_RT
+ /*
+ * Saved state preserves the task state across blocking on
+ * an RT lock. If the state matches, set p::saved_state to
+ * TASK_RUNNING, but do not wake the task because it waits
+ * for a lock wakeup. Also indicate success because from
+ * the regular waker's point of view this has succeeded.
+ *
+ * After acquiring the lock the task will restore p::__state
+ * from p::saved_state which ensures that the regular
+ * wakeup is not lost. The restore will also set
+ * p::saved_state to TASK_RUNNING so any further tests will
+ * not result in false positives vs. @success
+ */
+ if (p->saved_state & state) {
+ p->saved_state = TASK_RUNNING;
+ *success = 1;
+ }
+#endif
+ return false;
+}
+
/*
* Notes on Program-Order guarantees on SMP systems.
*
* - we're serialized against set_special_state() by virtue of
* it disabling IRQs (this allows not taking ->pi_lock).
*/
- if (!(READ_ONCE(p->__state) & state))
+ if (!ttwu_state_match(p, state, &success))
goto out;
- success = 1;
trace_sched_waking(p);
WRITE_ONCE(p->__state, TASK_RUNNING);
trace_sched_wakeup(p);
*/
raw_spin_lock_irqsave(&p->pi_lock, flags);
smp_mb__after_spinlock();
- if (!(READ_ONCE(p->__state) & state))
+ if (!ttwu_state_match(p, state, &success))
goto unlock;
trace_sched_waking(p);
- /* We're going to change ->state: */
- success = 1;
-
/*
* Ensure we load p->on_rq _after_ p->state, otherwise it would
* be possible to, falsely, observe p->on_rq == 0 and get stuck
if (p->core_occupation > dst->idle->core_occupation)
goto next;
- p->on_rq = TASK_ON_RQ_MIGRATING;
deactivate_task(src, p, 0);
set_task_cpu(p, this);
activate_task(dst, p, 0);
- p->on_rq = TASK_ON_RQ_QUEUED;
resched_curr(dst);
queue_balance_callback(rq, &per_cpu(core_balance_head, rq->cpu), sched_core_balance);
}
-static inline void sched_core_cpu_starting(unsigned int cpu)
+static void sched_core_cpu_starting(unsigned int cpu)
{
const struct cpumask *smt_mask = cpu_smt_mask(cpu);
- struct rq *rq, *core_rq = NULL;
- int i;
+ struct rq *rq = cpu_rq(cpu), *core_rq = NULL;
+ unsigned long flags;
+ int t;
- core_rq = cpu_rq(cpu)->core;
+ sched_core_lock(cpu, &flags);
- if (!core_rq) {
- for_each_cpu(i, smt_mask) {
- rq = cpu_rq(i);
- if (rq->core && rq->core == rq)
- core_rq = rq;
+ WARN_ON_ONCE(rq->core != rq);
+
+ /* if we're the first, we'll be our own leader */
+ if (cpumask_weight(smt_mask) == 1)
+ goto unlock;
+
+ /* find the leader */
+ for_each_cpu(t, smt_mask) {
+ if (t == cpu)
+ continue;
+ rq = cpu_rq(t);
+ if (rq->core == rq) {
+ core_rq = rq;
+ break;
}
+ }
- if (!core_rq)
- core_rq = cpu_rq(cpu);
+ if (WARN_ON_ONCE(!core_rq)) /* whoopsie */
+ goto unlock;
- for_each_cpu(i, smt_mask) {
- rq = cpu_rq(i);
+ /* install and validate core_rq */
+ for_each_cpu(t, smt_mask) {
+ rq = cpu_rq(t);
- WARN_ON_ONCE(rq->core && rq->core != core_rq);
+ if (t == cpu)
rq->core = core_rq;
- }
+
+ WARN_ON_ONCE(rq->core != core_rq);
+ }
+
+unlock:
+ sched_core_unlock(cpu, &flags);
+}
+
+static void sched_core_cpu_deactivate(unsigned int cpu)
+{
+ const struct cpumask *smt_mask = cpu_smt_mask(cpu);
+ struct rq *rq = cpu_rq(cpu), *core_rq = NULL;
+ unsigned long flags;
+ int t;
+
+ sched_core_lock(cpu, &flags);
+
+ /* if we're the last man standing, nothing to do */
+ if (cpumask_weight(smt_mask) == 1) {
+ WARN_ON_ONCE(rq->core != rq);
+ goto unlock;
+ }
+
+ /* if we're not the leader, nothing to do */
+ if (rq->core != rq)
+ goto unlock;
+
+ /* find a new leader */
+ for_each_cpu(t, smt_mask) {
+ if (t == cpu)
+ continue;
+ core_rq = cpu_rq(t);
+ break;
}
+
+ if (WARN_ON_ONCE(!core_rq)) /* impossible */
+ goto unlock;
+
+ /* copy the shared state to the new leader */
+ core_rq->core_task_seq = rq->core_task_seq;
+ core_rq->core_pick_seq = rq->core_pick_seq;
+ core_rq->core_cookie = rq->core_cookie;
+ core_rq->core_forceidle = rq->core_forceidle;
+ core_rq->core_forceidle_seq = rq->core_forceidle_seq;
+
+ /* install new leader */
+ for_each_cpu(t, smt_mask) {
+ rq = cpu_rq(t);
+ rq->core = core_rq;
+ }
+
+unlock:
+ sched_core_unlock(cpu, &flags);
}
+
+static inline void sched_core_cpu_dying(unsigned int cpu)
+{
+ struct rq *rq = cpu_rq(cpu);
+
+ if (rq->core != rq)
+ rq->core = rq;
+}
+
#else /* !CONFIG_SCHED_CORE */
static inline void sched_core_cpu_starting(unsigned int cpu) {}
+static inline void sched_core_cpu_deactivate(unsigned int cpu) {}
+static inline void sched_core_cpu_dying(unsigned int cpu) {}
static struct task_struct *
pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
#endif /* CONFIG_SCHED_CORE */
+/*
+ * Constants for the sched_mode argument of __schedule().
+ *
+ * The mode argument allows RT enabled kernels to differentiate a
+ * preemption from blocking on an 'sleeping' spin/rwlock. Note that
+ * SM_MASK_PREEMPT for !RT has all bits set, which allows the compiler to
+ * optimize the AND operation out and just check for zero.
+ */
+#define SM_NONE 0x0
+#define SM_PREEMPT 0x1
+#define SM_RTLOCK_WAIT 0x2
+
+#ifndef CONFIG_PREEMPT_RT
+# define SM_MASK_PREEMPT (~0U)
+#else
+# define SM_MASK_PREEMPT SM_PREEMPT
+#endif
+
/*
* __schedule() is the main scheduler function.
*
*
* WARNING: must be called with preemption disabled!
*/
-static void __sched notrace __schedule(bool preempt)
+static void __sched notrace __schedule(unsigned int sched_mode)
{
struct task_struct *prev, *next;
unsigned long *switch_count;
rq = cpu_rq(cpu);
prev = rq->curr;
- schedule_debug(prev, preempt);
+ schedule_debug(prev, !!sched_mode);
if (sched_feat(HRTICK) || sched_feat(HRTICK_DL))
hrtick_clear(rq);
local_irq_disable();
- rcu_note_context_switch(preempt);
+ rcu_note_context_switch(!!sched_mode);
/*
* Make sure that signal_pending_state()->signal_pending() below
* - ptrace_{,un}freeze_traced() can change ->state underneath us.
*/
prev_state = READ_ONCE(prev->__state);
- if (!preempt && prev_state) {
+ if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
if (signal_pending_state(prev_state, prev)) {
WRITE_ONCE(prev->__state, TASK_RUNNING);
} else {
migrate_disable_switch(rq, prev);
psi_sched_switch(prev, next, !task_on_rq_queued(prev));
- trace_sched_switch(preempt, prev, next);
+ trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next);
/* Also unlocks the rq: */
rq = context_switch(rq, prev, next, &rf);
/* Tell freezer to ignore us: */
current->flags |= PF_NOFREEZE;
- __schedule(false);
+ __schedule(SM_NONE);
BUG();
/* Avoid "noreturn function does return" - but don't continue if BUG() is a NOP: */
sched_submit_work(tsk);
do {
preempt_disable();
- __schedule(false);
+ __schedule(SM_NONE);
sched_preempt_enable_no_resched();
} while (need_resched());
sched_update_worker(tsk);
*/
WARN_ON_ONCE(current->__state);
do {
- __schedule(false);
+ __schedule(SM_NONE);
} while (need_resched());
}
preempt_disable();
}
+#ifdef CONFIG_PREEMPT_RT
+void __sched notrace schedule_rtlock(void)
+{
+ do {
+ preempt_disable();
+ __schedule(SM_RTLOCK_WAIT);
+ sched_preempt_enable_no_resched();
+ } while (need_resched());
+}
+NOKPROBE_SYMBOL(schedule_rtlock);
+#endif
+
static void __sched notrace preempt_schedule_common(void)
{
do {
*/
preempt_disable_notrace();
preempt_latency_start(1);
- __schedule(true);
+ __schedule(SM_PREEMPT);
preempt_latency_stop(1);
preempt_enable_no_resched_notrace();
* an infinite recursion.
*/
prev_ctx = exception_enter();
- __schedule(true);
+ __schedule(SM_PREEMPT);
exception_exit(prev_ctx);
preempt_latency_stop(1);
do {
preempt_disable();
local_irq_enable();
- __schedule(true);
+ __schedule(SM_PREEMPT);
local_irq_disable();
sched_preempt_enable_no_resched();
} while (need_resched());
return -E2BIG;
}
+static void get_params(struct task_struct *p, struct sched_attr *attr)
+{
+ if (task_has_dl_policy(p))
+ __getparam_dl(p, attr);
+ else if (task_has_rt_policy(p))
+ attr->sched_priority = p->rt_priority;
+ else
+ attr->sched_nice = task_nice(p);
+}
+
/**
* sys_sched_setscheduler - set/change the scheduler policy and RT priority
* @pid: the pid in question.
rcu_read_unlock();
if (likely(p)) {
+ if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
+ get_params(p, &attr);
retval = sched_setattr(p, &attr);
put_task_struct(p);
}
kattr.sched_policy = p->policy;
if (p->sched_reset_on_fork)
kattr.sched_flags |= SCHED_FLAG_RESET_ON_FORK;
- if (task_has_dl_policy(p))
- __getparam_dl(p, &kattr);
- else if (task_has_rt_policy(p))
- kattr.sched_priority = p->rt_priority;
- else
- kattr.sched_nice = task_nice(p);
+ get_params(p, &kattr);
+ kattr.sched_flags &= SCHED_FLAG_ALL;
#ifdef CONFIG_UCLAMP_TASK
/*
return retval;
}
-long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
+#ifdef CONFIG_SMP
+int dl_task_check_affinity(struct task_struct *p, const struct cpumask *mask)
{
+ int ret = 0;
+
+ /*
+ * If the task isn't a deadline task or admission control is
+ * disabled then we don't care about affinity changes.
+ */
+ if (!task_has_dl_policy(p) || !dl_bandwidth_enabled())
+ return 0;
+
+ /*
+ * Since bandwidth control happens on root_domain basis,
+ * if admission test is enabled, we only admit -deadline
+ * tasks allowed to run on all the CPUs in the task's
+ * root_domain.
+ */
+ rcu_read_lock();
+ if (!cpumask_subset(task_rq(p)->rd->span, mask))
+ ret = -EBUSY;
+ rcu_read_unlock();
+ return ret;
+}
+#endif
+
+static int
+__sched_setaffinity(struct task_struct *p, const struct cpumask *mask)
+{
+ int retval;
cpumask_var_t cpus_allowed, new_mask;
+
+ if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
+ return -ENOMEM;
+
+ if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
+ retval = -ENOMEM;
+ goto out_free_cpus_allowed;
+ }
+
+ cpuset_cpus_allowed(p, cpus_allowed);
+ cpumask_and(new_mask, mask, cpus_allowed);
+
+ retval = dl_task_check_affinity(p, new_mask);
+ if (retval)
+ goto out_free_new_mask;
+again:
+ retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK | SCA_USER);
+ if (retval)
+ goto out_free_new_mask;
+
+ cpuset_cpus_allowed(p, cpus_allowed);
+ if (!cpumask_subset(new_mask, cpus_allowed)) {
+ /*
+ * We must have raced with a concurrent cpuset update.
+ * Just reset the cpumask to the cpuset's cpus_allowed.
+ */
+ cpumask_copy(new_mask, cpus_allowed);
+ goto again;
+ }
+
+out_free_new_mask:
+ free_cpumask_var(new_mask);
+out_free_cpus_allowed:
+ free_cpumask_var(cpus_allowed);
+ return retval;
+}
+
+long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
+{
struct task_struct *p;
int retval;
retval = -EINVAL;
goto out_put_task;
}
- if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL)) {
- retval = -ENOMEM;
- goto out_put_task;
- }
- if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
- retval = -ENOMEM;
- goto out_free_cpus_allowed;
- }
- retval = -EPERM;
+
if (!check_same_owner(p)) {
rcu_read_lock();
if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
rcu_read_unlock();
- goto out_free_new_mask;
+ retval = -EPERM;
+ goto out_put_task;
}
rcu_read_unlock();
}
retval = security_task_setscheduler(p);
if (retval)
- goto out_free_new_mask;
-
-
- cpuset_cpus_allowed(p, cpus_allowed);
- cpumask_and(new_mask, in_mask, cpus_allowed);
-
- /*
- * Since bandwidth control happens on root_domain basis,
- * if admission test is enabled, we only admit -deadline
- * tasks allowed to run on all the CPUs in the task's
- * root_domain.
- */
-#ifdef CONFIG_SMP
- if (task_has_dl_policy(p) && dl_bandwidth_enabled()) {
- rcu_read_lock();
- if (!cpumask_subset(task_rq(p)->rd->span, new_mask)) {
- retval = -EBUSY;
- rcu_read_unlock();
- goto out_free_new_mask;
- }
- rcu_read_unlock();
- }
-#endif
-again:
- retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK);
+ goto out_put_task;
- if (!retval) {
- cpuset_cpus_allowed(p, cpus_allowed);
- if (!cpumask_subset(new_mask, cpus_allowed)) {
- /*
- * We must have raced with a concurrent cpuset
- * update. Just reset the cpus_allowed to the
- * cpuset's cpus_allowed
- */
- cpumask_copy(new_mask, cpus_allowed);
- goto again;
- }
- }
-out_free_new_mask:
- free_cpumask_var(new_mask);
-out_free_cpus_allowed:
- free_cpumask_var(cpus_allowed);
+ retval = __sched_setaffinity(p, in_mask);
out_put_task:
put_task_struct(p);
return retval;
preempt_schedule_common();
return 1;
}
+ /*
+ * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
+ * whether the current CPU is in an RCU read-side critical section,
+ * so the tick can report quiescent states even for CPUs looping
+ * in kernel context. In contrast, in non-preemptible kernels,
+ * RCU readers leave no in-memory hints, which means that CPU-bound
+ * processes executing in kernel context might never report an
+ * RCU quiescent state. Therefore, the following code causes
+ * cond_resched() to report a quiescent state, but only when RCU
+ * is in urgent need of one.
+ */
#ifndef CONFIG_PREEMPT_RCU
rcu_all_qs();
#endif
*/
if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
static_branch_dec_cpuslocked(&sched_smt_present);
+
+ sched_core_cpu_deactivate(cpu);
#endif
if (!sched_smp_initialized)
calc_load_migrate(rq);
update_max_interval();
hrtick_clear(rq);
+ sched_core_cpu_dying(cpu);
return 0;
}
#endif
atomic_set(&rq->nr_iowait, 0);
#ifdef CONFIG_SCHED_CORE
- rq->core = NULL;
+ rq->core = rq;
rq->core_pick = NULL;
rq->core_enabled = 0;
rq->core_tree = RB_ROOT;
* Prevent race between setting of cfs_rq->runtime_enabled and
* unthrottle_offline_cfs_rqs().
*/
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&cfs_constraints_mutex);
ret = __cfs_schedulable(tg, period, quota);
if (ret)
cfs_bandwidth_usage_dec();
out_unlock:
mutex_unlock(&cfs_constraints_mutex);
- put_online_cpus();
+ cpus_read_unlock();
return ret;
}
}
#endif /* CONFIG_RT_GROUP_SCHED */
+#ifdef CONFIG_FAIR_GROUP_SCHED
+static s64 cpu_idle_read_s64(struct cgroup_subsys_state *css,
+ struct cftype *cft)
+{
+ return css_tg(css)->idle;
+}
+
+static int cpu_idle_write_s64(struct cgroup_subsys_state *css,
+ struct cftype *cft, s64 idle)
+{
+ return sched_group_set_idle(css_tg(css), idle);
+}
+#endif
+
static struct cftype cpu_legacy_files[] = {
#ifdef CONFIG_FAIR_GROUP_SCHED
{
.read_u64 = cpu_shares_read_u64,
.write_u64 = cpu_shares_write_u64,
},
+ {
+ .name = "idle",
+ .read_s64 = cpu_idle_read_s64,
+ .write_s64 = cpu_idle_write_s64,
+ },
#endif
#ifdef CONFIG_CFS_BANDWIDTH
{
.read_s64 = cpu_weight_nice_read_s64,
.write_s64 = cpu_weight_nice_write_s64,
},
+ {
+ .name = "idle",
+ .flags = CFTYPE_NOT_ON_ROOT,
+ .read_s64 = cpu_idle_read_s64,
+ .write_s64 = cpu_idle_write_s64,
+ },
#endif
#ifdef CONFIG_CFS_BANDWIDTH
{
*/
raw_spin_rq_lock(rq);
if (p->dl.dl_non_contending) {
+ update_rq_clock(rq);
sub_running_bw(&p->dl, &rq->dl);
p->dl.dl_non_contending = 0;
/*
dl_se->dl_runtime = attr->sched_runtime;
dl_se->dl_deadline = attr->sched_deadline;
dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
- dl_se->flags = attr->sched_flags;
+ dl_se->flags = attr->sched_flags & SCHED_DL_FLAGS;
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
dl_se->dl_density = to_ratio(dl_se->dl_deadline, dl_se->dl_runtime);
}
attr->sched_runtime = dl_se->dl_runtime;
attr->sched_deadline = dl_se->dl_deadline;
attr->sched_period = dl_se->dl_period;
- attr->sched_flags = dl_se->flags;
+ attr->sched_flags &= ~SCHED_DL_FLAGS;
+ attr->sched_flags |= dl_se->flags;
}
/*
if (dl_se->dl_runtime != attr->sched_runtime ||
dl_se->dl_deadline != attr->sched_deadline ||
dl_se->dl_period != attr->sched_period ||
- dl_se->flags != attr->sched_flags)
+ dl_se->flags != (attr->sched_flags & SCHED_DL_FLAGS))
return true;
return false;
{
int cpu, i;
+ /*
+ * This can unfortunately be invoked before sched_debug_init() creates
+ * the debug directory. Don't touch sd_sysctl_cpus until then.
+ */
+ if (!debugfs_sched)
+ return;
+
if (!cpumask_available(sd_sysctl_cpus)) {
if (!alloc_cpumask_var(&sd_sysctl_cpus, GFP_KERNEL))
return;
SEQ_printf(m, " .%-30s: %d\n", "nr_spread_over",
cfs_rq->nr_spread_over);
SEQ_printf(m, " .%-30s: %d\n", "nr_running", cfs_rq->nr_running);
+ SEQ_printf(m, " .%-30s: %d\n", "h_nr_running", cfs_rq->h_nr_running);
+ SEQ_printf(m, " .%-30s: %d\n", "idle_h_nr_running",
+ cfs_rq->idle_h_nr_running);
SEQ_printf(m, " .%-30s: %ld\n", "load", cfs_rq->load.weight);
#ifdef CONFIG_SMP
SEQ_printf(m, " .%-30s: %lu\n", "load_avg",
}
}
+static int tg_is_idle(struct task_group *tg)
+{
+ return tg->idle > 0;
+}
+
+static int cfs_rq_is_idle(struct cfs_rq *cfs_rq)
+{
+ return cfs_rq->idle > 0;
+}
+
+static int se_is_idle(struct sched_entity *se)
+{
+ if (entity_is_task(se))
+ return task_has_idle_policy(task_of(se));
+ return cfs_rq_is_idle(group_cfs_rq(se));
+}
+
#else /* !CONFIG_FAIR_GROUP_SCHED */
#define for_each_sched_entity(se) \
{
}
+static inline int tg_is_idle(struct task_group *tg)
+{
+ return 0;
+}
+
+static int cfs_rq_is_idle(struct cfs_rq *cfs_rq)
+{
+ return 0;
+}
+
+static int se_is_idle(struct sched_entity *se)
+{
+ return 0;
+}
+
#endif /* CONFIG_FAIR_GROUP_SCHED */
static __always_inline
if (cpu == sibling)
continue;
- if (!idle_cpu(cpu))
+ if (!idle_cpu(sibling))
return false;
}
#endif
dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
+ if (cfs_rq_is_idle(group_cfs_rq(se)))
+ idle_task_delta = cfs_rq->h_nr_running;
+
qcfs_rq->h_nr_running -= task_delta;
qcfs_rq->idle_h_nr_running -= idle_task_delta;
update_load_avg(qcfs_rq, se, 0);
se_update_runnable(se);
+ if (cfs_rq_is_idle(group_cfs_rq(se)))
+ idle_task_delta = cfs_rq->h_nr_running;
+
qcfs_rq->h_nr_running -= task_delta;
qcfs_rq->idle_h_nr_running -= idle_task_delta;
}
task_delta = cfs_rq->h_nr_running;
idle_task_delta = cfs_rq->idle_h_nr_running;
for_each_sched_entity(se) {
+ struct cfs_rq *qcfs_rq = cfs_rq_of(se);
+
if (se->on_rq)
break;
- cfs_rq = cfs_rq_of(se);
- enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
+ enqueue_entity(qcfs_rq, se, ENQUEUE_WAKEUP);
+
+ if (cfs_rq_is_idle(group_cfs_rq(se)))
+ idle_task_delta = cfs_rq->h_nr_running;
- cfs_rq->h_nr_running += task_delta;
- cfs_rq->idle_h_nr_running += idle_task_delta;
+ qcfs_rq->h_nr_running += task_delta;
+ qcfs_rq->idle_h_nr_running += idle_task_delta;
/* end evaluation on encountering a throttled cfs_rq */
- if (cfs_rq_throttled(cfs_rq))
+ if (cfs_rq_throttled(qcfs_rq))
goto unthrottle_throttle;
}
for_each_sched_entity(se) {
- cfs_rq = cfs_rq_of(se);
+ struct cfs_rq *qcfs_rq = cfs_rq_of(se);
- update_load_avg(cfs_rq, se, UPDATE_TG);
+ update_load_avg(qcfs_rq, se, UPDATE_TG);
se_update_runnable(se);
- cfs_rq->h_nr_running += task_delta;
- cfs_rq->idle_h_nr_running += idle_task_delta;
+ if (cfs_rq_is_idle(group_cfs_rq(se)))
+ idle_task_delta = cfs_rq->h_nr_running;
+ qcfs_rq->h_nr_running += task_delta;
+ qcfs_rq->idle_h_nr_running += idle_task_delta;
/* end evaluation on encountering a throttled cfs_rq */
- if (cfs_rq_throttled(cfs_rq))
+ if (cfs_rq_throttled(qcfs_rq))
goto unthrottle_throttle;
/*
* One parent has been throttled and cfs_rq removed from the
* list. Add it back to not break the leaf list.
*/
- if (throttled_hierarchy(cfs_rq))
- list_add_leaf_cfs_rq(cfs_rq);
+ if (throttled_hierarchy(qcfs_rq))
+ list_add_leaf_cfs_rq(qcfs_rq);
}
/* At this point se is NULL and we are at root level*/
* assertion below.
*/
for_each_sched_entity(se) {
- cfs_rq = cfs_rq_of(se);
+ struct cfs_rq *qcfs_rq = cfs_rq_of(se);
- if (list_add_leaf_cfs_rq(cfs_rq))
+ if (list_add_leaf_cfs_rq(qcfs_rq))
break;
}
cfs_rq->h_nr_running++;
cfs_rq->idle_h_nr_running += idle_h_nr_running;
+ if (cfs_rq_is_idle(cfs_rq))
+ idle_h_nr_running = 1;
+
/* end evaluation on encountering a throttled cfs_rq */
if (cfs_rq_throttled(cfs_rq))
goto enqueue_throttle;
cfs_rq->h_nr_running++;
cfs_rq->idle_h_nr_running += idle_h_nr_running;
+ if (cfs_rq_is_idle(cfs_rq))
+ idle_h_nr_running = 1;
+
/* end evaluation on encountering a throttled cfs_rq */
if (cfs_rq_throttled(cfs_rq))
goto enqueue_throttle;
cfs_rq->h_nr_running--;
cfs_rq->idle_h_nr_running -= idle_h_nr_running;
+ if (cfs_rq_is_idle(cfs_rq))
+ idle_h_nr_running = 1;
+
/* end evaluation on encountering a throttled cfs_rq */
if (cfs_rq_throttled(cfs_rq))
goto dequeue_throttle;
cfs_rq->h_nr_running--;
cfs_rq->idle_h_nr_running -= idle_h_nr_running;
+ if (cfs_rq_is_idle(cfs_rq))
+ idle_h_nr_running = 1;
+
/* end evaluation on encountering a throttled cfs_rq */
if (cfs_rq_throttled(cfs_rq))
goto dequeue_throttle;
time = cpu_clock(this);
}
- for_each_cpu_wrap(cpu, cpus, target) {
+ for_each_cpu_wrap(cpu, cpus, target + 1) {
if (has_idle_core) {
i = select_idle_core(p, cpu, cpus, &idle_cpu);
if ((unsigned int)i < nr_cpumask_bits)
/* Check a recently used CPU as a potential idle candidate: */
recent_used_cpu = p->recent_used_cpu;
+ p->recent_used_cpu = prev;
if (recent_used_cpu != prev &&
recent_used_cpu != target &&
cpus_share_cache(recent_used_cpu, target) &&
} else if (wake_flags & WF_TTWU) { /* XXX always ? */
/* Fast path */
new_cpu = select_idle_sibling(p, prev_cpu, new_cpu);
-
- if (want_affine)
- current->recent_used_cpu = cpu;
}
rcu_read_unlock();
static void set_last_buddy(struct sched_entity *se)
{
- if (entity_is_task(se) && unlikely(task_has_idle_policy(task_of(se))))
- return;
-
for_each_sched_entity(se) {
if (SCHED_WARN_ON(!se->on_rq))
return;
+ if (se_is_idle(se))
+ return;
cfs_rq_of(se)->last = se;
}
}
static void set_next_buddy(struct sched_entity *se)
{
- if (entity_is_task(se) && unlikely(task_has_idle_policy(task_of(se))))
- return;
-
for_each_sched_entity(se) {
if (SCHED_WARN_ON(!se->on_rq))
return;
+ if (se_is_idle(se))
+ return;
cfs_rq_of(se)->next = se;
}
}
struct cfs_rq *cfs_rq = task_cfs_rq(curr);
int scale = cfs_rq->nr_running >= sched_nr_latency;
int next_buddy_marked = 0;
+ int cse_is_idle, pse_is_idle;
if (unlikely(se == pse))
return;
return;
find_matching_se(&se, &pse);
- update_curr(cfs_rq_of(se));
BUG_ON(!pse);
+
+ cse_is_idle = se_is_idle(se);
+ pse_is_idle = se_is_idle(pse);
+
+ /*
+ * Preempt an idle group in favor of a non-idle group (and don't preempt
+ * in the inverse case).
+ */
+ if (cse_is_idle && !pse_is_idle)
+ goto preempt;
+ if (cse_is_idle != pse_is_idle)
+ return;
+
+ update_curr(cfs_rq_of(se));
if (wakeup_preempt_entity(se, pse) == 1) {
/*
* Bias pick_next to pick the sched entity that is
static inline int find_new_ilb(void)
{
int ilb;
+ const struct cpumask *hk_mask;
+
+ hk_mask = housekeeping_cpumask(HK_FLAG_MISC);
- for_each_cpu_and(ilb, nohz.idle_cpus_mask,
- housekeeping_cpumask(HK_FLAG_MISC)) {
+ for_each_cpu_and(ilb, nohz.idle_cpus_mask, hk_mask) {
if (ilb == smp_processor_id())
continue;
static DEFINE_MUTEX(shares_mutex);
-int sched_group_set_shares(struct task_group *tg, unsigned long shares)
+static int __sched_group_set_shares(struct task_group *tg, unsigned long shares)
{
int i;
+ lockdep_assert_held(&shares_mutex);
+
/*
* We can't change the weight of the root cgroup.
*/
shares = clamp(shares, scale_load(MIN_SHARES), scale_load(MAX_SHARES));
- mutex_lock(&shares_mutex);
if (tg->shares == shares)
- goto done;
+ return 0;
tg->shares = shares;
for_each_possible_cpu(i) {
rq_unlock_irqrestore(rq, &rf);
}
-done:
+ return 0;
+}
+
+int sched_group_set_shares(struct task_group *tg, unsigned long shares)
+{
+ int ret;
+
+ mutex_lock(&shares_mutex);
+ if (tg_is_idle(tg))
+ ret = -EINVAL;
+ else
+ ret = __sched_group_set_shares(tg, shares);
+ mutex_unlock(&shares_mutex);
+
+ return ret;
+}
+
+int sched_group_set_idle(struct task_group *tg, long idle)
+{
+ int i;
+
+ if (tg == &root_task_group)
+ return -EINVAL;
+
+ if (idle < 0 || idle > 1)
+ return -EINVAL;
+
+ mutex_lock(&shares_mutex);
+
+ if (tg->idle == idle) {
+ mutex_unlock(&shares_mutex);
+ return 0;
+ }
+
+ tg->idle = idle;
+
+ for_each_possible_cpu(i) {
+ struct rq *rq = cpu_rq(i);
+ struct sched_entity *se = tg->se[i];
+ struct cfs_rq *grp_cfs_rq = tg->cfs_rq[i];
+ bool was_idle = cfs_rq_is_idle(grp_cfs_rq);
+ long idle_task_delta;
+ struct rq_flags rf;
+
+ rq_lock_irqsave(rq, &rf);
+
+ grp_cfs_rq->idle = idle;
+ if (WARN_ON_ONCE(was_idle == cfs_rq_is_idle(grp_cfs_rq)))
+ goto next_cpu;
+
+ idle_task_delta = grp_cfs_rq->h_nr_running -
+ grp_cfs_rq->idle_h_nr_running;
+ if (!cfs_rq_is_idle(grp_cfs_rq))
+ idle_task_delta *= -1;
+
+ for_each_sched_entity(se) {
+ struct cfs_rq *cfs_rq = cfs_rq_of(se);
+
+ if (!se->on_rq)
+ break;
+
+ cfs_rq->idle_h_nr_running += idle_task_delta;
+
+ /* Already accounted at parent level and above. */
+ if (cfs_rq_is_idle(cfs_rq))
+ break;
+ }
+
+next_cpu:
+ rq_unlock_irqrestore(rq, &rf);
+ }
+
+ /* Idle groups have minimum weight. */
+ if (tg_is_idle(tg))
+ __sched_group_set_shares(tg, scale_load(WEIGHT_IDLEPRIO));
+ else
+ __sched_group_set_shares(tg, NICE_0_LOAD);
+
mutex_unlock(&shares_mutex);
return 0;
}
+
#else /* CONFIG_FAIR_GROUP_SCHED */
void free_fair_sched_group(struct task_group *tg) { }
*/
#define SCHED_FLAG_SUGOV 0x10000000
+#define SCHED_DL_FLAGS (SCHED_FLAG_RECLAIM | SCHED_FLAG_DL_OVERRUN | SCHED_FLAG_SUGOV)
+
static inline bool dl_entity_is_special(struct sched_dl_entity *dl_se)
{
#ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
struct cfs_rq **cfs_rq;
unsigned long shares;
+ /* A positive value indicates that this is a SCHED_IDLE group. */
+ int idle;
+
#ifdef CONFIG_SMP
/*
* load_avg can be heavily contended at clock tick time, so put
#ifdef CONFIG_FAIR_GROUP_SCHED
extern int sched_group_set_shares(struct task_group *tg, unsigned long shares);
+extern int sched_group_set_idle(struct task_group *tg, long idle);
+
#ifdef CONFIG_SMP
extern void set_task_rq_fair(struct sched_entity *se,
struct cfs_rq *prev, struct cfs_rq *next);
struct list_head leaf_cfs_rq_list;
struct task_group *tg; /* group that "owns" this runqueue */
+ /* Locally cached copy of our task_group's idle value */
+ int idle;
+
#ifdef CONFIG_CFS_BANDWIDTH
int runtime_enabled;
s64 runtime_remaining;
unsigned int core_sched_seq;
struct rb_root core_tree;
- /* shared state */
+ /* shared state -- careful with sched_core_cpu_deactivate() */
unsigned int core_task_seq;
unsigned int core_pick_seq;
unsigned long core_cookie;
#define SCA_CHECK 0x01
#define SCA_MIGRATE_DISABLE 0x02
#define SCA_MIGRATE_ENABLE 0x04
+#define SCA_USER 0x08
#ifdef CONFIG_SMP
if (p->nr_cpus_allowed == 1)
return NULL;
+ if (p->migration_disabled)
+ return NULL;
+
rq->push_busy = true;
return get_task_struct(p);
}
extern const_debug unsigned int sysctl_sched_nr_migrate;
extern const_debug unsigned int sysctl_sched_migration_cost;
+#ifdef CONFIG_SCHED_DEBUG
+extern unsigned int sysctl_sched_latency;
+extern unsigned int sysctl_sched_min_granularity;
+extern unsigned int sysctl_sched_wakeup_granularity;
+extern int sysctl_resched_latency_warn_ms;
+extern int sysctl_resched_latency_warn_once;
+
+extern unsigned int sysctl_sched_tunable_scaling;
+
+extern unsigned int sysctl_numa_balancing_scan_delay;
+extern unsigned int sysctl_numa_balancing_scan_period_min;
+extern unsigned int sysctl_numa_balancing_scan_period_max;
+extern unsigned int sysctl_numa_balancing_scan_size;
+#endif
+
#ifdef CONFIG_SCHED_HRTICK
/*
static int *sched_domains_numa_distance;
static struct cpumask ***sched_domains_numa_masks;
int __read_mostly node_reclaim_distance = RECLAIM_DISTANCE;
+
+static unsigned long __read_mostly *sched_numa_onlined_nodes;
#endif
/*
sched_domains_numa_masks[i][j] = mask;
for_each_node(k) {
+ /*
+ * Distance information can be unreliable for
+ * offline nodes, defer building the node
+ * masks to its bringup.
+ * This relies on all unique distance values
+ * still being visible at init time.
+ */
+ if (!node_online(j))
+ continue;
+
if (sched_debug() && (node_distance(j, k) != node_distance(k, j)))
sched_numa_warn("Node-distance not symmetric");
sched_max_numa_distance = sched_domains_numa_distance[nr_levels - 1];
init_numa_topology_type();
+
+ sched_numa_onlined_nodes = bitmap_alloc(nr_node_ids, GFP_KERNEL);
+ if (!sched_numa_onlined_nodes)
+ return;
+
+ bitmap_zero(sched_numa_onlined_nodes, nr_node_ids);
+ for_each_online_node(i)
+ bitmap_set(sched_numa_onlined_nodes, i, 1);
+}
+
+static void __sched_domains_numa_masks_set(unsigned int node)
+{
+ int i, j;
+
+ /*
+ * NUMA masks are not built for offline nodes in sched_init_numa().
+ * Thus, when a CPU of a never-onlined-before node gets plugged in,
+ * adding that new CPU to the right NUMA masks is not sufficient: the
+ * masks of that CPU's node must also be updated.
+ */
+ if (test_bit(node, sched_numa_onlined_nodes))
+ return;
+
+ bitmap_set(sched_numa_onlined_nodes, node, 1);
+
+ for (i = 0; i < sched_domains_numa_levels; i++) {
+ for (j = 0; j < nr_node_ids; j++) {
+ if (!node_online(j) || node == j)
+ continue;
+
+ if (node_distance(j, node) > sched_domains_numa_distance[i])
+ continue;
+
+ /* Add remote nodes in our masks */
+ cpumask_or(sched_domains_numa_masks[i][node],
+ sched_domains_numa_masks[i][node],
+ sched_domains_numa_masks[0][j]);
+ }
+ }
+
+ /*
+ * A new node has been brought up, potentially changing the topology
+ * classification.
+ *
+ * Note that this is racy vs any use of sched_numa_topology_type :/
+ */
+ init_numa_topology_type();
}
void sched_domains_numa_masks_set(unsigned int cpu)
int node = cpu_to_node(cpu);
int i, j;
+ __sched_domains_numa_masks_set(node);
+
for (i = 0; i < sched_domains_numa_levels; i++) {
for (j = 0; j < nr_node_ids; j++) {
+ if (!node_online(j))
+ continue;
+
+ /* Set ourselves in the remote node's masks */
if (node_distance(j, node) <= sched_domains_numa_distance[i])
cpumask_set_cpu(cpu, sched_domains_numa_masks[i][j]);
}
return sighand;
}
+#ifdef CONFIG_LOCKDEP
+void lockdep_assert_task_sighand_held(struct task_struct *task)
+{
+ struct sighand_struct *sighand;
+
+ rcu_read_lock();
+ sighand = rcu_dereference(task->sighand);
+ if (sighand)
+ lockdep_assert_held(&sighand->siglock);
+ else
+ WARN_ON_ONCE(1);
+ rcu_read_unlock();
+}
+#endif
+
/*
* send signal info to all the members of a group
*/
EXPORT_SYMBOL(smp_call_function_single);
/**
- * smp_call_function_single_async(): Run an asynchronous function on a
+ * smp_call_function_single_async() - Run an asynchronous function on a
* specific CPU.
* @cpu: The CPU to run on.
* @csd: Pre-allocated and setup data structure
*
* NOTE: Be careful, there is unfortunately no current debugging facility to
* validate the correctness of this serialization.
+ *
+ * Return: %0 on success or negative errno value on error
*/
int smp_call_function_single_async(int cpu, struct __call_single_data *csd)
{
* @mask: The set of cpus to run on (only runs on online subset).
* @func: The function to run. This must be fast and non-blocking.
* @info: An arbitrary pointer to pass to the function.
- * @flags: Bitmask that controls the operation. If %SCF_WAIT is set, wait
+ * @wait: Bitmask that controls the operation. If %SCF_WAIT is set, wait
* (atomically) until function has completed on other CPUs. If
* %SCF_RUN_LOCAL is set, the function will also be run locally
* if the local CPU is set in the @cpumask.
EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
/**
- * smp_call_on_cpu - Call a function on a specific cpu
+ * struct smp_call_on_cpu_struct - Call a function on a specific CPU
+ * @work: &work_struct
+ * @done: &completion to signal
+ * @func: function to call
+ * @data: function's data argument
+ * @ret: return value from @func
+ * @cpu: target CPU (%-1 for any CPU)
*
* Used to call a function on a specific cpu and wait for it to return.
* Optionally make sure the call is done on a specified physical cpu via vcpu
unsigned int cpu;
int ret = 0;
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&smpboot_threads_lock);
for_each_online_cpu(cpu) {
ret = __smpboot_create_thread(plug_thread, cpu);
list_add(&plug_thread->list, &hotplug_threads);
out:
mutex_unlock(&smpboot_threads_lock);
- put_online_cpus();
+ cpus_read_unlock();
return ret;
}
EXPORT_SYMBOL_GPL(smpboot_register_percpu_thread);
*/
void smpboot_unregister_percpu_thread(struct smp_hotplug_thread *plug_thread)
{
- get_online_cpus();
+ cpus_read_lock();
mutex_lock(&smpboot_threads_lock);
list_del(&plug_thread->list);
smpboot_destroy_threads(plug_thread);
mutex_unlock(&smpboot_threads_lock);
- put_online_cpus();
+ cpus_read_unlock();
}
EXPORT_SYMBOL_GPL(smpboot_unregister_percpu_thread);
if (ksoftirqd_running(local_softirq_pending()))
return;
- if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
+ if (!force_irqthreads() || !__this_cpu_read(ksoftirqd)) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
/*
* We can safely execute softirq on the current stack if
#include <linux/prandom.h>
#include <linux/cpu.h>
+#include "tick-internal.h"
+
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@kernel.org>");
return (u64)jiffies;
}
-/* Assume HZ > 100. */
-#define JIFFIES_SHIFT 8
-
static struct clocksource clocksource_wdtest_jiffies = {
.name = "wdtest-jiffies",
.rating = 1, /* lowest valid rating*/
return;
cpumask_clear(&cpus_ahead);
cpumask_clear(&cpus_behind);
- get_online_cpus();
+ cpus_read_lock();
preempt_disable();
clocksource_verify_choose_cpus();
if (cpumask_weight(&cpus_chosen) == 0) {
preempt_enable();
- put_online_cpus();
+ cpus_read_unlock();
pr_warn("Not enough CPUs to check clocksource '%s'.\n", cs->name);
return;
}
cs_nsec_min = cs_nsec;
}
preempt_enable();
- put_online_cpus();
+ cpus_read_unlock();
if (!cpumask_empty(&cpus_ahead))
pr_warn(" CPUs %*pbl ahead of CPU %d for clocksource %s.\n",
cpumask_pr_args(&cpus_ahead), testcpu, cs->name);
return __hrtimer_hres_active(this_cpu_ptr(&hrtimer_bases));
}
-/*
- * Reprogram the event source with checking both queues for the
- * next event
- * Called with interrupts disabled and base->lock held
- */
-static void
-hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
+static void __hrtimer_reprogram(struct hrtimer_cpu_base *cpu_base,
+ struct hrtimer *next_timer,
+ ktime_t expires_next)
{
- ktime_t expires_next;
-
- expires_next = hrtimer_update_next_event(cpu_base);
-
- if (skip_equal && expires_next == cpu_base->expires_next)
- return;
-
cpu_base->expires_next = expires_next;
/*
if (!__hrtimer_hres_active(cpu_base) || cpu_base->hang_detected)
return;
- tick_program_event(cpu_base->expires_next, 1);
+ tick_program_event(expires_next, 1);
+}
+
+/*
+ * Reprogram the event source with checking both queues for the
+ * next event
+ * Called with interrupts disabled and base->lock held
+ */
+static void
+hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
+{
+ ktime_t expires_next;
+
+ expires_next = hrtimer_update_next_event(cpu_base);
+
+ if (skip_equal && expires_next == cpu_base->expires_next)
+ return;
+
+ __hrtimer_reprogram(cpu_base, cpu_base->next_timer, expires_next);
}
/* High resolution timer related functions */
return hrtimer_hres_enabled;
}
-/*
- * Retrigger next event is called after clock was set
- *
- * Called with interrupts disabled via on_each_cpu()
- */
-static void retrigger_next_event(void *arg)
-{
- struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
-
- if (!__hrtimer_hres_active(base))
- return;
-
- raw_spin_lock(&base->lock);
- hrtimer_update_base(base);
- hrtimer_force_reprogram(base, 0);
- raw_spin_unlock(&base->lock);
-}
+static void retrigger_next_event(void *arg);
/*
* Switch to high resolution mode
retrigger_next_event(NULL);
}
-static void clock_was_set_work(struct work_struct *work)
-{
- clock_was_set();
-}
+#else
-static DECLARE_WORK(hrtimer_work, clock_was_set_work);
+static inline int hrtimer_is_hres_enabled(void) { return 0; }
+static inline void hrtimer_switch_to_hres(void) { }
+#endif /* CONFIG_HIGH_RES_TIMERS */
/*
- * Called from timekeeping and resume code to reprogram the hrtimer
- * interrupt device on all cpus.
+ * Retrigger next event is called after clock was set with interrupts
+ * disabled through an SMP function call or directly from low level
+ * resume code.
+ *
+ * This is only invoked when:
+ * - CONFIG_HIGH_RES_TIMERS is enabled.
+ * - CONFIG_NOHZ_COMMON is enabled
+ *
+ * For the other cases this function is empty and because the call sites
+ * are optimized out it vanishes as well, i.e. no need for lots of
+ * #ifdeffery.
*/
-void clock_was_set_delayed(void)
+static void retrigger_next_event(void *arg)
{
- schedule_work(&hrtimer_work);
-}
-
-#else
+ struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
-static inline int hrtimer_is_hres_enabled(void) { return 0; }
-static inline void hrtimer_switch_to_hres(void) { }
-static inline void retrigger_next_event(void *arg) { }
+ /*
+ * When high resolution mode or nohz is active, then the offsets of
+ * CLOCK_REALTIME/TAI/BOOTTIME have to be updated. Otherwise the
+ * next tick will take care of that.
+ *
+ * If high resolution mode is active then the next expiring timer
+ * must be reevaluated and the clock event device reprogrammed if
+ * necessary.
+ *
+ * In the NOHZ case the update of the offset and the reevaluation
+ * of the next expiring timer is enough. The return from the SMP
+ * function call will take care of the reprogramming in case the
+ * CPU was in a NOHZ idle sleep.
+ */
+ if (!__hrtimer_hres_active(base) && !tick_nohz_active)
+ return;
-#endif /* CONFIG_HIGH_RES_TIMERS */
+ raw_spin_lock(&base->lock);
+ hrtimer_update_base(base);
+ if (__hrtimer_hres_active(base))
+ hrtimer_force_reprogram(base, 0);
+ else
+ hrtimer_update_next_event(base);
+ raw_spin_unlock(&base->lock);
+}
/*
* When a timer is enqueued and expires earlier than the already enqueued
if (base->cpu_base != cpu_base)
return;
+ if (expires >= cpu_base->expires_next)
+ return;
+
/*
- * If the hrtimer interrupt is running, then it will
- * reevaluate the clock bases and reprogram the clock event
- * device. The callbacks are always executed in hard interrupt
- * context so we don't need an extra check for a running
- * callback.
+ * If the hrtimer interrupt is running, then it will reevaluate the
+ * clock bases and reprogram the clock event device.
*/
if (cpu_base->in_hrtirq)
return;
- if (expires >= cpu_base->expires_next)
- return;
-
- /* Update the pointer to the next expiring timer */
cpu_base->next_timer = timer;
- cpu_base->expires_next = expires;
+
+ __hrtimer_reprogram(cpu_base, timer, expires);
+}
+
+static bool update_needs_ipi(struct hrtimer_cpu_base *cpu_base,
+ unsigned int active)
+{
+ struct hrtimer_clock_base *base;
+ unsigned int seq;
+ ktime_t expires;
/*
- * If hres is not active, hardware does not have to be
- * programmed yet.
+ * Update the base offsets unconditionally so the following
+ * checks whether the SMP function call is required works.
*
- * If a hang was detected in the last timer interrupt then we
- * do not schedule a timer which is earlier than the expiry
- * which we enforced in the hang detection. We want the system
- * to make progress.
+ * The update is safe even when the remote CPU is in the hrtimer
+ * interrupt or the hrtimer soft interrupt and expiring affected
+ * bases. Either it will see the update before handling a base or
+ * it will see it when it finishes the processing and reevaluates
+ * the next expiring timer.
*/
- if (!__hrtimer_hres_active(cpu_base) || cpu_base->hang_detected)
- return;
+ seq = cpu_base->clock_was_set_seq;
+ hrtimer_update_base(cpu_base);
+
+ /*
+ * If the sequence did not change over the update then the
+ * remote CPU already handled it.
+ */
+ if (seq == cpu_base->clock_was_set_seq)
+ return false;
+
+ /*
+ * If the remote CPU is currently handling an hrtimer interrupt, it
+ * will reevaluate the first expiring timer of all clock bases
+ * before reprogramming. Nothing to do here.
+ */
+ if (cpu_base->in_hrtirq)
+ return false;
/*
- * Program the timer hardware. We enforce the expiry for
- * events which are already in the past.
+ * Walk the affected clock bases and check whether the first expiring
+ * timer in a clock base is moving ahead of the first expiring timer of
+ * @cpu_base. If so, the IPI must be invoked because per CPU clock
+ * event devices cannot be remotely reprogrammed.
*/
- tick_program_event(expires, 1);
+ active &= cpu_base->active_bases;
+
+ for_each_active_base(base, cpu_base, active) {
+ struct timerqueue_node *next;
+
+ next = timerqueue_getnext(&base->active);
+ expires = ktime_sub(next->expires, base->offset);
+ if (expires < cpu_base->expires_next)
+ return true;
+
+ /* Extra check for softirq clock bases */
+ if (base->clockid < HRTIMER_BASE_MONOTONIC_SOFT)
+ continue;
+ if (cpu_base->softirq_activated)
+ continue;
+ if (expires < cpu_base->softirq_expires_next)
+ return true;
+ }
+ return false;
}
/*
- * Clock realtime was set
- *
- * Change the offset of the realtime clock vs. the monotonic
- * clock.
+ * Clock was set. This might affect CLOCK_REALTIME, CLOCK_TAI and
+ * CLOCK_BOOTTIME (for late sleep time injection).
*
- * We might have to reprogram the high resolution timer interrupt. On
- * SMP we call the architecture specific code to retrigger _all_ high
- * resolution timer interrupts. On UP we just disable interrupts and
- * call the high resolution interrupt code.
+ * This requires to update the offsets for these clocks
+ * vs. CLOCK_MONOTONIC. When high resolution timers are enabled, then this
+ * also requires to eventually reprogram the per CPU clock event devices
+ * when the change moves an affected timer ahead of the first expiring
+ * timer on that CPU. Obviously remote per CPU clock event devices cannot
+ * be reprogrammed. The other reason why an IPI has to be sent is when the
+ * system is in !HIGH_RES and NOHZ mode. The NOHZ mode updates the offsets
+ * in the tick, which obviously might be stopped, so this has to bring out
+ * the remote CPU which might sleep in idle to get this sorted.
*/
-void clock_was_set(void)
+void clock_was_set(unsigned int bases)
{
-#ifdef CONFIG_HIGH_RES_TIMERS
- /* Retrigger the CPU local events everywhere */
- on_each_cpu(retrigger_next_event, NULL, 1);
-#endif
+ struct hrtimer_cpu_base *cpu_base = raw_cpu_ptr(&hrtimer_bases);
+ cpumask_var_t mask;
+ int cpu;
+
+ if (!__hrtimer_hres_active(cpu_base) && !tick_nohz_active)
+ goto out_timerfd;
+
+ if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) {
+ on_each_cpu(retrigger_next_event, NULL, 1);
+ goto out_timerfd;
+ }
+
+ /* Avoid interrupting CPUs if possible */
+ cpus_read_lock();
+ for_each_online_cpu(cpu) {
+ unsigned long flags;
+
+ cpu_base = &per_cpu(hrtimer_bases, cpu);
+ raw_spin_lock_irqsave(&cpu_base->lock, flags);
+
+ if (update_needs_ipi(cpu_base, bases))
+ cpumask_set_cpu(cpu, mask);
+
+ raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
+ }
+
+ preempt_disable();
+ smp_call_function_many(mask, retrigger_next_event, NULL, 1);
+ preempt_enable();
+ cpus_read_unlock();
+ free_cpumask_var(mask);
+
+out_timerfd:
timerfd_clock_was_set();
}
+static void clock_was_set_work(struct work_struct *work)
+{
+ clock_was_set(CLOCK_SET_WALL);
+}
+
+static DECLARE_WORK(hrtimer_work, clock_was_set_work);
+
+/*
+ * Called from timekeeping code to reprogram the hrtimer interrupt device
+ * on all cpus and to notify timerfd.
+ */
+void clock_was_set_delayed(void)
+{
+ schedule_work(&hrtimer_work);
+}
+
/*
- * During resume we might have to reprogram the high resolution timer
- * interrupt on all online CPUs. However, all other CPUs will be
- * stopped with IRQs interrupts disabled so the clock_was_set() call
- * must be deferred.
+ * Called during resume either directly from via timekeeping_resume()
+ * or in the case of s2idle from tick_unfreeze() to ensure that the
+ * hrtimers are up to date.
*/
-void hrtimers_resume(void)
+void hrtimers_resume_local(void)
{
lockdep_assert_irqs_disabled();
/* Retrigger on the local CPU */
retrigger_next_event(NULL);
- /* And schedule a retrigger for all others */
- clock_was_set_delayed();
}
/*
* remove hrtimer, called with base lock held
*/
static inline int
-remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base, bool restart)
+remove_hrtimer(struct hrtimer *timer, struct hrtimer_clock_base *base,
+ bool restart, bool keep_local)
{
u8 state = timer->state;
if (state & HRTIMER_STATE_ENQUEUED) {
- int reprogram;
+ bool reprogram;
/*
* Remove the timer and force reprogramming when high
debug_deactivate(timer);
reprogram = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
+ /*
+ * If the timer is not restarted then reprogramming is
+ * required if the timer is local. If it is local and about
+ * to be restarted, avoid programming it twice (on removal
+ * and a moment later when it's requeued).
+ */
if (!restart)
state = HRTIMER_STATE_INACTIVE;
+ else
+ reprogram &= !keep_local;
__remove_hrtimer(timer, base, state, reprogram);
return 1;
struct hrtimer_clock_base *base)
{
struct hrtimer_clock_base *new_base;
+ bool force_local, first;
- /* Remove an active timer from the queue: */
- remove_hrtimer(timer, base, true);
+ /*
+ * If the timer is on the local cpu base and is the first expiring
+ * timer then this might end up reprogramming the hardware twice
+ * (on removal and on enqueue). To avoid that by prevent the
+ * reprogram on removal, keep the timer local to the current CPU
+ * and enforce reprogramming after it is queued no matter whether
+ * it is the new first expiring timer again or not.
+ */
+ force_local = base->cpu_base == this_cpu_ptr(&hrtimer_bases);
+ force_local &= base->cpu_base->next_timer == timer;
+
+ /*
+ * Remove an active timer from the queue. In case it is not queued
+ * on the current CPU, make sure that remove_hrtimer() updates the
+ * remote data correctly.
+ *
+ * If it's on the current CPU and the first expiring timer, then
+ * skip reprogramming, keep the timer local and enforce
+ * reprogramming later if it was the first expiring timer. This
+ * avoids programming the underlying clock event twice (once at
+ * removal and once after enqueue).
+ */
+ remove_hrtimer(timer, base, true, force_local);
if (mode & HRTIMER_MODE_REL)
tim = ktime_add_safe(tim, base->get_time());
hrtimer_set_expires_range_ns(timer, tim, delta_ns);
/* Switch the timer base, if necessary: */
- new_base = switch_hrtimer_base(timer, base, mode & HRTIMER_MODE_PINNED);
+ if (!force_local) {
+ new_base = switch_hrtimer_base(timer, base,
+ mode & HRTIMER_MODE_PINNED);
+ } else {
+ new_base = base;
+ }
- return enqueue_hrtimer(timer, new_base, mode);
+ first = enqueue_hrtimer(timer, new_base, mode);
+ if (!force_local)
+ return first;
+
+ /*
+ * Timer was forced to stay on the current CPU to avoid
+ * reprogramming on removal and enqueue. Force reprogram the
+ * hardware by evaluating the new first expiring timer.
+ */
+ hrtimer_force_reprogram(new_base->cpu_base, 1);
+ return 0;
}
/**
base = lock_hrtimer_base(timer, &flags);
if (!hrtimer_callback_running(timer))
- ret = remove_hrtimer(timer, base, false);
+ ret = remove_hrtimer(timer, base, false, false);
unlock_hrtimer_base(timer, &flags);
#include <linux/init.h>
#include "timekeeping.h"
+#include "tick-internal.h"
-/* Since jiffies uses a simple TICK_NSEC multiplier
- * conversion, the .shift value could be zero. However
- * this would make NTP adjustments impossible as they are
- * in units of 1/2^.shift. Thus we use JIFFIES_SHIFT to
- * shift both the nominator and denominator the same
- * amount, and give ntp adjustments in units of 1/2^8
- *
- * The value 8 is somewhat carefully chosen, as anything
- * larger can result in overflows. TICK_NSEC grows as HZ
- * shrinks, so values greater than 8 overflow 32bits when
- * HZ=100.
- */
-#if HZ < 34
-#define JIFFIES_SHIFT 6
-#elif HZ < 67
-#define JIFFIES_SHIFT 7
-#else
-#define JIFFIES_SHIFT 8
-#endif
-
static u64 jiffies_read(struct clocksource *cs)
{
return (u64) jiffies;
struct thread_group_cputimer *cputimer = &tsk->signal->cputimer;
struct posix_cputimers *pct = &tsk->signal->posix_cputimers;
+ lockdep_assert_task_sighand_held(tsk);
+
/* Check if cputimer isn't running. This is accessed without locking. */
if (!READ_ONCE(pct->timers_active)) {
struct task_cputime sum;
return 0;
}
+static struct posix_cputimer_base *timer_base(struct k_itimer *timer,
+ struct task_struct *tsk)
+{
+ int clkidx = CPUCLOCK_WHICH(timer->it_clock);
+
+ if (CPUCLOCK_PERTHREAD(timer->it_clock))
+ return tsk->posix_cputimers.bases + clkidx;
+ else
+ return tsk->signal->posix_cputimers.bases + clkidx;
+}
+
+/*
+ * Force recalculating the base earliest expiration on the next tick.
+ * This will also re-evaluate the need to keep around the process wide
+ * cputime counter and tick dependency and eventually shut these down
+ * if necessary.
+ */
+static void trigger_base_recalc_expires(struct k_itimer *timer,
+ struct task_struct *tsk)
+{
+ struct posix_cputimer_base *base = timer_base(timer, tsk);
+
+ base->nextevt = 0;
+}
+
+/*
+ * Dequeue the timer and reset the base if it was its earliest expiration.
+ * It makes sure the next tick recalculates the base next expiration so we
+ * don't keep the costly process wide cputime counter around for a random
+ * amount of time, along with the tick dependency.
+ *
+ * If another timer gets queued between this and the next tick, its
+ * expiration will update the base next event if necessary on the next
+ * tick.
+ */
+static void disarm_timer(struct k_itimer *timer, struct task_struct *p)
+{
+ struct cpu_timer *ctmr = &timer->it.cpu;
+ struct posix_cputimer_base *base;
+
+ if (!cpu_timer_dequeue(ctmr))
+ return;
+
+ base = timer_base(timer, p);
+ if (cpu_timer_getexpires(ctmr) == base->nextevt)
+ trigger_base_recalc_expires(timer, p);
+}
+
+
/*
* Clean up a CPU-clock timer that is about to be destroyed.
* This is called from timer deletion with the timer already locked.
if (timer->it.cpu.firing)
ret = TIMER_RETRY;
else
- cpu_timer_dequeue(ctmr);
+ disarm_timer(timer, p);
unlock_task_sighand(p, &flags);
}
*/
static void arm_timer(struct k_itimer *timer, struct task_struct *p)
{
- int clkidx = CPUCLOCK_WHICH(timer->it_clock);
+ struct posix_cputimer_base *base = timer_base(timer, p);
struct cpu_timer *ctmr = &timer->it.cpu;
u64 newexp = cpu_timer_getexpires(ctmr);
- struct posix_cputimer_base *base;
-
- if (CPUCLOCK_PERTHREAD(timer->it_clock))
- base = p->posix_cputimers.bases + clkidx;
- else
- base = p->signal->posix_cputimers.bases + clkidx;
if (!cpu_timer_enqueue(&base->tqhead, ctmr))
return;
timer->it_overrun_last = 0;
timer->it_overrun = -1;
- if (new_expires != 0 && !(val < new_expires)) {
+ if (val >= new_expires) {
+ if (new_expires != 0) {
+ /*
+ * The designated time already passed, so we notify
+ * immediately, even if the thread never runs to
+ * accumulate more time on this clock.
+ */
+ cpu_timer_fire(timer);
+ }
+
/*
- * The designated time already passed, so we notify
- * immediately, even if the thread never runs to
- * accumulate more time on this clock.
+ * Make sure we don't keep around the process wide cputime
+ * counter or the tick dependency if they are not necessary.
*/
- cpu_timer_fire(timer);
- }
+ sighand = lock_task_sighand(p, &flags);
+ if (!sighand)
+ goto out;
+
+ if (!cpu_timer_queued(ctmr))
+ trigger_base_recalc_expires(timer, p);
- ret = 0;
+ unlock_task_sighand(p, &flags);
+ }
out:
rcu_read_unlock();
if (old)
}
}
- if (!*newval)
- return;
*newval += now;
}
int posix_timer_event(struct k_itimer *timr, int si_private)
{
enum pid_type type;
- int ret = -1;
+ int ret;
/*
* FIXME: if ->sigq is queued we can race with
* dequeue_signal()->posixtimer_rearm().
else
tick_resume_oneshot();
}
+
+ /*
+ * Ensure that hrtimers are up to date and the clockevents device
+ * is reprogrammed correctly when high resolution timers are
+ * enabled.
+ */
+ hrtimers_resume_local();
}
/**
extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
void timer_clear_idle(void);
+
+#define CLOCK_SET_WALL \
+ (BIT(HRTIMER_BASE_REALTIME) | BIT(HRTIMER_BASE_REALTIME_SOFT) | \
+ BIT(HRTIMER_BASE_TAI) | BIT(HRTIMER_BASE_TAI_SOFT))
+
+#define CLOCK_SET_BOOT \
+ (BIT(HRTIMER_BASE_BOOTTIME) | BIT(HRTIMER_BASE_BOOTTIME_SOFT))
+
+void clock_was_set(unsigned int bases);
+void clock_was_set_delayed(void);
+
+void hrtimers_resume_local(void);
+
+/* Since jiffies uses a simple TICK_NSEC multiplier
+ * conversion, the .shift value could be zero. However
+ * this would make NTP adjustments impossible as they are
+ * in units of 1/2^.shift. Thus we use JIFFIES_SHIFT to
+ * shift both the nominator and denominator the same
+ * amount, and give ntp adjustments in units of 1/2^8
+ *
+ * The value 8 is somewhat carefully chosen, as anything
+ * larger can result in overflows. TICK_NSEC grows as HZ
+ * shrinks, so values greater than 8 overflow 32bits when
+ * HZ=100.
+ */
+#if HZ < 34
+#define JIFFIES_SHIFT 6
+#elif HZ < 67
+#define JIFFIES_SHIFT 7
+#else
+#define JIFFIES_SHIFT 8
+#endif
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
- /* signal hrtimers about time change */
- clock_was_set();
+ /* Signal hrtimers about time change */
+ clock_was_set(CLOCK_SET_WALL);
if (!ret)
audit_tk_injoffset(ts_delta);
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
- /* signal hrtimers about time change */
- clock_was_set();
+ /* Signal hrtimers about time change */
+ clock_was_set(CLOCK_SET_WALL);
return ret;
}
write_seqcount_end(&tk_core.seq);
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
- /* signal hrtimers about time change */
- clock_was_set();
+ /* Signal hrtimers about time change */
+ clock_was_set(CLOCK_SET_WALL | CLOCK_SET_BOOT);
}
#endif
touch_softlockup_watchdog();
+ /* Resume the clockevent device(s) and hrtimers */
tick_resume();
- hrtimers_resume();
+ /* Notify timerfd as resume is equivalent to clock_was_set() */
+ timerfd_resume();
}
int timekeeping_suspend(void)
* timekeeping_advance - Updates the timekeeper to the current time and
* current NTP tick length
*/
-static void timekeeping_advance(enum timekeeping_adv_mode mode)
+static bool timekeeping_advance(enum timekeeping_adv_mode mode)
{
struct timekeeper *real_tk = &tk_core.timekeeper;
struct timekeeper *tk = &shadow_timekeeper;
write_seqcount_end(&tk_core.seq);
out:
raw_spin_unlock_irqrestore(&timekeeper_lock, flags);
- if (clock_set)
- /* Have to call _delayed version, since in irq context*/
- clock_was_set_delayed();
+
+ return !!clock_set;
}
/**
*/
void update_wall_time(void)
{
- timekeeping_advance(TK_ADV_TICK);
+ if (timekeeping_advance(TK_ADV_TICK))
+ clock_was_set_delayed();
}
/**
{
struct timekeeper *tk = &tk_core.timekeeper;
struct audit_ntp_data ad;
- unsigned long flags;
+ bool clock_set = false;
struct timespec64 ts;
+ unsigned long flags;
s32 orig_tai, tai;
int ret;
if (tai != orig_tai) {
__timekeeping_set_tai_offset(tk, tai);
timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET);
+ clock_set = true;
}
tk_update_leap_state(tk);
/* Update the multiplier immediately if frequency was set directly */
if (txc->modes & (ADJ_FREQUENCY | ADJ_TICK))
- timekeeping_advance(TK_ADV_FREQ);
+ clock_set |= timekeeping_advance(TK_ADV_FREQ);
- if (tai != orig_tai)
- clock_was_set();
+ if (clock_set)
+ clock_was_set(CLOCK_REALTIME);
ntp_notify_cmos_timer();
struct shuffle_task *stp;
cpumask_setall(shuffle_tmp_mask);
- get_online_cpus();
+ cpus_read_lock();
/* No point in shuffling if there is only one online CPU (ex: UP) */
if (num_online_cpus() == 1) {
- put_online_cpus();
+ cpus_read_unlock();
return;
}
set_cpus_allowed_ptr(stp->st_t, shuffle_tmp_mask);
mutex_unlock(&shuffle_task_mutex);
- put_online_cpus();
+ cpus_read_unlock();
}
/* Shuffle tasks across CPUs, with the intent of allowing each CPU in the
static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
{
+ bool init_nop = ftrace_need_init_nop();
struct ftrace_page *pg;
struct dyn_ftrace *p;
u64 start, stop;
* Do the initial record conversion from mcount jump
* to the NOP instructions.
*/
- if (!__is_defined(CC_USING_NOP_MCOUNT) &&
- !ftrace_nop_initialize(mod, p))
+ if (init_nop && !ftrace_nop_initialize(mod, p))
break;
update_cnt++;
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
select LOCKDEP
select DEBUG_SPINLOCK
- select DEBUG_MUTEXES
+ select DEBUG_MUTEXES if !PREEMPT_RT
select DEBUG_RT_MUTEXES if RT_MUTEXES
select DEBUG_RWSEMS
select DEBUG_WW_MUTEX_SLOWPATH
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
select LOCKDEP
select DEBUG_SPINLOCK
- select DEBUG_MUTEXES
+ select DEBUG_MUTEXES if !PREEMPT_RT
select DEBUG_RT_MUTEXES if RT_MUTEXES
select DEBUG_LOCK_ALLOC
default n
config DEBUG_MUTEXES
bool "Mutex debugging: basic checks"
- depends on DEBUG_KERNEL
+ depends on DEBUG_KERNEL && !PREEMPT_RT
help
This feature allows mutex semantics violations to be detected and
reported.
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
select DEBUG_LOCK_ALLOC
select DEBUG_SPINLOCK
- select DEBUG_MUTEXES
+ select DEBUG_MUTEXES if !PREEMPT_RT
+ select DEBUG_RT_MUTEXES if PREEMPT_RT
help
This feature enables slowpath testing for w/w mutex users by
injecting additional -EDEADLK wound/backoff cases. Together with
bool "Lock debugging: detect incorrect freeing of live locks"
depends on DEBUG_KERNEL && LOCK_DEBUGGING_SUPPORT
select DEBUG_SPINLOCK
- select DEBUG_MUTEXES
+ select DEBUG_MUTEXES if !PREEMPT_RT
select DEBUG_RT_MUTEXES if RT_MUTEXES
select LOCKDEP
help
feature by default. When enabled, memory and cache locality will
be impacted.
-config DEBUG_BLOCK_EXT_DEVT
- bool "Force extended block device numbers and spread them"
- depends on DEBUG_KERNEL
- depends on BLOCK
- default n
- help
- BIG FAT WARNING: ENABLING THIS OPTION MIGHT BREAK BOOTING ON
- SOME DISTRIBUTIONS. DO NOT ENABLE THIS UNLESS YOU KNOW WHAT
- YOU ARE DOING. Distros, please enable this and fix whatever
- is broken.
-
- Conventionally, block device numbers are allocated from
- predetermined contiguous area. However, extended block area
- may introduce non-contiguous block device numbers. This
- option forces most block device numbers to be allocated from
- the extended space and spreads them to discover kernel or
- userland code paths which assume predetermined contiguous
- device number allocation.
-
- Note that turning on this debug option shuffles all the
- device numbers for all IDE and SCSI devices including libata
- ones, so root partition specified using device number
- directly (via rdev or root=MAJ:MIN) won't work anymore.
- Textual device names (root=/dev/sdXn) will continue to work.
-
- Say N if you are unsure.
-
config CPU_HOTPLUG_STATE_CONTROL
bool "Enable CPU hotplug state control"
depends on DEBUG_KERNEL
config CRYPTO_LIB_SHA256
tristate
+
+config CRYPTO_LIB_SM4
+ tristate
obj-$(CONFIG_CRYPTO_LIB_SHA256) += libsha256.o
libsha256-y := sha256.o
+obj-$(CONFIG_CRYPTO_LIB_SM4) += libsm4.o
+libsm4-y := sm4.o
+
ifneq ($(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS),y)
libblake2s-y += blake2s-selftest.o
libchacha20poly1305-y += chacha20poly1305-selftest.o
}
EXPORT_SYMBOL(blake2s256_hmac);
-static int __init mod_init(void)
+static int __init blake2s_mod_init(void)
{
if (!IS_ENABLED(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS) &&
WARN_ON(!blake2s_selftest()))
return 0;
}
-static void __exit mod_exit(void)
+static void __exit blake2s_mod_exit(void)
{
}
-module_init(mod_init);
-module_exit(mod_exit);
+module_init(blake2s_mod_init);
+module_exit(blake2s_mod_exit);
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("BLAKE2s hash function");
MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
}
EXPORT_SYMBOL(chacha20poly1305_decrypt_sg_inplace);
-static int __init mod_init(void)
+static int __init chacha20poly1305_init(void)
{
if (!IS_ENABLED(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS) &&
WARN_ON(!chacha20poly1305_selftest()))
return 0;
}
-static void __exit mod_exit(void)
+static void __exit chacha20poly1305_exit(void)
{
}
-module_init(mod_init);
-module_exit(mod_exit);
+module_init(chacha20poly1305_init);
+module_exit(chacha20poly1305_exit);
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("ChaCha20Poly1305 AEAD construction");
MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
#include <linux/module.h>
#include <linux/init.h>
-static int __init mod_init(void)
+static int __init curve25519_init(void)
{
if (!IS_ENABLED(CONFIG_CRYPTO_MANAGER_DISABLE_TESTS) &&
WARN_ON(!curve25519_selftest()))
return 0;
}
-static void __exit mod_exit(void)
+static void __exit curve25519_exit(void)
{
}
-module_init(mod_init);
-module_exit(mod_exit);
+module_init(curve25519_init);
+module_exit(curve25519_exit);
MODULE_LICENSE("GPL v2");
MODULE_DESCRIPTION("Curve25519 scalar multiplication");
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * SM4, as specified in
+ * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html
+ *
+ * Copyright (C) 2018 ARM Limited or its affiliates.
+ * Copyright (c) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
+ */
+
+#include <linux/module.h>
+#include <asm/unaligned.h>
+#include <crypto/sm4.h>
+
+static const u32 fk[4] = {
+ 0xa3b1bac6, 0x56aa3350, 0x677d9197, 0xb27022dc
+};
+
+static const u32 __cacheline_aligned ck[32] = {
+ 0x00070e15, 0x1c232a31, 0x383f464d, 0x545b6269,
+ 0x70777e85, 0x8c939aa1, 0xa8afb6bd, 0xc4cbd2d9,
+ 0xe0e7eef5, 0xfc030a11, 0x181f262d, 0x343b4249,
+ 0x50575e65, 0x6c737a81, 0x888f969d, 0xa4abb2b9,
+ 0xc0c7ced5, 0xdce3eaf1, 0xf8ff060d, 0x141b2229,
+ 0x30373e45, 0x4c535a61, 0x686f767d, 0x848b9299,
+ 0xa0a7aeb5, 0xbcc3cad1, 0xd8dfe6ed, 0xf4fb0209,
+ 0x10171e25, 0x2c333a41, 0x484f565d, 0x646b7279
+};
+
+static const u8 __cacheline_aligned sbox[256] = {
+ 0xd6, 0x90, 0xe9, 0xfe, 0xcc, 0xe1, 0x3d, 0xb7,
+ 0x16, 0xb6, 0x14, 0xc2, 0x28, 0xfb, 0x2c, 0x05,
+ 0x2b, 0x67, 0x9a, 0x76, 0x2a, 0xbe, 0x04, 0xc3,
+ 0xaa, 0x44, 0x13, 0x26, 0x49, 0x86, 0x06, 0x99,
+ 0x9c, 0x42, 0x50, 0xf4, 0x91, 0xef, 0x98, 0x7a,
+ 0x33, 0x54, 0x0b, 0x43, 0xed, 0xcf, 0xac, 0x62,
+ 0xe4, 0xb3, 0x1c, 0xa9, 0xc9, 0x08, 0xe8, 0x95,
+ 0x80, 0xdf, 0x94, 0xfa, 0x75, 0x8f, 0x3f, 0xa6,
+ 0x47, 0x07, 0xa7, 0xfc, 0xf3, 0x73, 0x17, 0xba,
+ 0x83, 0x59, 0x3c, 0x19, 0xe6, 0x85, 0x4f, 0xa8,
+ 0x68, 0x6b, 0x81, 0xb2, 0x71, 0x64, 0xda, 0x8b,
+ 0xf8, 0xeb, 0x0f, 0x4b, 0x70, 0x56, 0x9d, 0x35,
+ 0x1e, 0x24, 0x0e, 0x5e, 0x63, 0x58, 0xd1, 0xa2,
+ 0x25, 0x22, 0x7c, 0x3b, 0x01, 0x21, 0x78, 0x87,
+ 0xd4, 0x00, 0x46, 0x57, 0x9f, 0xd3, 0x27, 0x52,
+ 0x4c, 0x36, 0x02, 0xe7, 0xa0, 0xc4, 0xc8, 0x9e,
+ 0xea, 0xbf, 0x8a, 0xd2, 0x40, 0xc7, 0x38, 0xb5,
+ 0xa3, 0xf7, 0xf2, 0xce, 0xf9, 0x61, 0x15, 0xa1,
+ 0xe0, 0xae, 0x5d, 0xa4, 0x9b, 0x34, 0x1a, 0x55,
+ 0xad, 0x93, 0x32, 0x30, 0xf5, 0x8c, 0xb1, 0xe3,
+ 0x1d, 0xf6, 0xe2, 0x2e, 0x82, 0x66, 0xca, 0x60,
+ 0xc0, 0x29, 0x23, 0xab, 0x0d, 0x53, 0x4e, 0x6f,
+ 0xd5, 0xdb, 0x37, 0x45, 0xde, 0xfd, 0x8e, 0x2f,
+ 0x03, 0xff, 0x6a, 0x72, 0x6d, 0x6c, 0x5b, 0x51,
+ 0x8d, 0x1b, 0xaf, 0x92, 0xbb, 0xdd, 0xbc, 0x7f,
+ 0x11, 0xd9, 0x5c, 0x41, 0x1f, 0x10, 0x5a, 0xd8,
+ 0x0a, 0xc1, 0x31, 0x88, 0xa5, 0xcd, 0x7b, 0xbd,
+ 0x2d, 0x74, 0xd0, 0x12, 0xb8, 0xe5, 0xb4, 0xb0,
+ 0x89, 0x69, 0x97, 0x4a, 0x0c, 0x96, 0x77, 0x7e,
+ 0x65, 0xb9, 0xf1, 0x09, 0xc5, 0x6e, 0xc6, 0x84,
+ 0x18, 0xf0, 0x7d, 0xec, 0x3a, 0xdc, 0x4d, 0x20,
+ 0x79, 0xee, 0x5f, 0x3e, 0xd7, 0xcb, 0x39, 0x48
+};
+
+static inline u32 sm4_t_non_lin_sub(u32 x)
+{
+ u32 out;
+
+ out = (u32)sbox[x & 0xff];
+ out |= (u32)sbox[(x >> 8) & 0xff] << 8;
+ out |= (u32)sbox[(x >> 16) & 0xff] << 16;
+ out |= (u32)sbox[(x >> 24) & 0xff] << 24;
+
+ return out;
+}
+
+static inline u32 sm4_key_lin_sub(u32 x)
+{
+ return x ^ rol32(x, 13) ^ rol32(x, 23);
+}
+
+static inline u32 sm4_enc_lin_sub(u32 x)
+{
+ return x ^ rol32(x, 2) ^ rol32(x, 10) ^ rol32(x, 18) ^ rol32(x, 24);
+}
+
+static inline u32 sm4_key_sub(u32 x)
+{
+ return sm4_key_lin_sub(sm4_t_non_lin_sub(x));
+}
+
+static inline u32 sm4_enc_sub(u32 x)
+{
+ return sm4_enc_lin_sub(sm4_t_non_lin_sub(x));
+}
+
+static inline u32 sm4_round(u32 x0, u32 x1, u32 x2, u32 x3, u32 rk)
+{
+ return x0 ^ sm4_enc_sub(x1 ^ x2 ^ x3 ^ rk);
+}
+
+
+/**
+ * sm4_expandkey - Expands the SM4 key as described in GB/T 32907-2016
+ * @ctx: The location where the computed key will be stored.
+ * @in_key: The supplied key.
+ * @key_len: The length of the supplied key.
+ *
+ * Returns 0 on success. The function fails only if an invalid key size (or
+ * pointer) is supplied.
+ */
+int sm4_expandkey(struct sm4_ctx *ctx, const u8 *in_key,
+ unsigned int key_len)
+{
+ u32 rk[4];
+ const u32 *key = (u32 *)in_key;
+ int i;
+
+ if (key_len != SM4_KEY_SIZE)
+ return -EINVAL;
+
+ rk[0] = get_unaligned_be32(&key[0]) ^ fk[0];
+ rk[1] = get_unaligned_be32(&key[1]) ^ fk[1];
+ rk[2] = get_unaligned_be32(&key[2]) ^ fk[2];
+ rk[3] = get_unaligned_be32(&key[3]) ^ fk[3];
+
+ for (i = 0; i < 32; i += 4) {
+ rk[0] ^= sm4_key_sub(rk[1] ^ rk[2] ^ rk[3] ^ ck[i + 0]);
+ rk[1] ^= sm4_key_sub(rk[2] ^ rk[3] ^ rk[0] ^ ck[i + 1]);
+ rk[2] ^= sm4_key_sub(rk[3] ^ rk[0] ^ rk[1] ^ ck[i + 2]);
+ rk[3] ^= sm4_key_sub(rk[0] ^ rk[1] ^ rk[2] ^ ck[i + 3]);
+
+ ctx->rkey_enc[i + 0] = rk[0];
+ ctx->rkey_enc[i + 1] = rk[1];
+ ctx->rkey_enc[i + 2] = rk[2];
+ ctx->rkey_enc[i + 3] = rk[3];
+ ctx->rkey_dec[31 - 0 - i] = rk[0];
+ ctx->rkey_dec[31 - 1 - i] = rk[1];
+ ctx->rkey_dec[31 - 2 - i] = rk[2];
+ ctx->rkey_dec[31 - 3 - i] = rk[3];
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(sm4_expandkey);
+
+/**
+ * sm4_crypt_block - Encrypt or decrypt a single SM4 block
+ * @rk: The rkey_enc for encrypt or rkey_dec for decrypt
+ * @out: Buffer to store output data
+ * @in: Buffer containing the input data
+ */
+void sm4_crypt_block(const u32 *rk, u8 *out, const u8 *in)
+{
+ u32 x[4], i;
+
+ x[0] = get_unaligned_be32(in + 0 * 4);
+ x[1] = get_unaligned_be32(in + 1 * 4);
+ x[2] = get_unaligned_be32(in + 2 * 4);
+ x[3] = get_unaligned_be32(in + 3 * 4);
+
+ for (i = 0; i < 32; i += 4) {
+ x[0] = sm4_round(x[0], x[1], x[2], x[3], rk[i + 0]);
+ x[1] = sm4_round(x[1], x[2], x[3], x[0], rk[i + 1]);
+ x[2] = sm4_round(x[2], x[3], x[0], x[1], rk[i + 2]);
+ x[3] = sm4_round(x[3], x[0], x[1], x[2], rk[i + 3]);
+ }
+
+ put_unaligned_be32(x[3 - 0], out + 0 * 4);
+ put_unaligned_be32(x[3 - 1], out + 1 * 4);
+ put_unaligned_be32(x[3 - 2], out + 2 * 4);
+ put_unaligned_be32(x[3 - 3], out + 3 * 4);
+}
+EXPORT_SYMBOL_GPL(sm4_crypt_block);
+
+MODULE_DESCRIPTION("Generic SM4 library");
+MODULE_LICENSE("GPL v2");
struct debug_obj *obj;
unsigned long flags;
- fill_pool();
+ /*
+ * On RT enabled kernels the pool refill must happen in preemptible
+ * context:
+ */
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT) || preemptible())
+ fill_pool();
db = get_bucket((unsigned long) addr);
}
EXPORT_SYMBOL_GPL(linear_range_get_selector_high);
+/**
+ * linear_range_get_selector_within - return linear range selector for value
+ * @r: pointer to linear range where selector is looked from
+ * @val: value for which the selector is searched
+ * @selector: address where found selector value is updated
+ *
+ * Return selector for which range value is closest match for given
+ * input value. Value is matching if it is equal or lower than given
+ * value. But return maximum selector if given value is higher than
+ * maximum value.
+ */
+void linear_range_get_selector_within(const struct linear_range *r,
+ unsigned int val, unsigned int *selector)
+{
+ if (r->min > val) {
+ *selector = r->min_sel;
+ return;
+ }
+
+ if (linear_range_get_max_value(r) < val) {
+ *selector = r->max_sel;
+ return;
+ }
+
+ if (r->step == 0)
+ *selector = r->min_sel;
+ else
+ *selector = (val - r->min) / r->step + r->min_sel;
+}
+EXPORT_SYMBOL_GPL(linear_range_get_selector_within);
+
MODULE_DESCRIPTION("linear-ranges helper");
MODULE_LICENSE("GPL");
return 0; /* no need to do it */
if (a->d) {
- p = kmalloc_array(nlimbs, sizeof(mpi_limb_t), GFP_KERNEL);
+ p = kcalloc(nlimbs, sizeof(mpi_limb_t), GFP_KERNEL);
if (!p)
return -ENOMEM;
memcpy(p, a->d, a->alloced * sizeof(mpi_limb_t));
#include <linux/errno.h>
#include <linux/slab.h>
+#include <asm/unaligned.h>
#include <asm/byteorder.h>
#include <asm/word-at-a-time.h>
#include <asm/page.h>
const unsigned char *su1, *su2;
int res = 0;
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+ if (count >= sizeof(unsigned long)) {
+ const unsigned long *u1 = cs;
+ const unsigned long *u2 = ct;
+ do {
+ if (get_unaligned(u1) != get_unaligned(u2))
+ break;
+ u1++;
+ u2++;
+ count -= sizeof(unsigned long);
+ } while (count >= sizeof(unsigned long));
+ cs = u1;
+ ct = u2;
+ }
+#endif
for (su1 = cs, su2 = ct; 0 < count; ++su1, ++su2, count--)
if ((res = *su1 - *su2) != 0)
break;
offsetof(spinlock_t, lock.wait_lock.magic),
SPINLOCK_MAGIC) ||
test_magic(lock_rwlock_ptr,
- offsetof(rwlock_t, rtmutex.wait_lock.magic),
+ offsetof(rwlock_t, rwbase.rtmutex.wait_lock.magic),
SPINLOCK_MAGIC) ||
test_magic(lock_mutex_ptr,
- offsetof(struct mutex, lock.wait_lock.magic),
+ offsetof(struct mutex, rtmutex.wait_lock.magic),
SPINLOCK_MAGIC) ||
test_magic(lock_rwsem_ptr,
- offsetof(struct rw_semaphore, rtmutex.wait_lock.magic),
+ offsetof(struct rw_semaphore, rwbase.rtmutex.wait_lock.magic),
SPINLOCK_MAGIC))
return -EINVAL;
#else
offsetof(rwlock_t, magic),
RWLOCK_MAGIC) ||
test_magic(lock_mutex_ptr,
- offsetof(struct mutex, wait_lock.rlock.magic),
+ offsetof(struct mutex, wait_lock.magic),
SPINLOCK_MAGIC) ||
test_magic(lock_rwsem_ptr,
offsetof(struct rw_semaphore, wait_lock.magic),
bdi->capabilities = BDI_CAP_WRITEBACK | BDI_CAP_WRITEBACK_ACCT;
bdi->ra_pages = VM_READAHEAD_PAGES;
bdi->io_pages = VM_READAHEAD_PAGES;
+ timer_setup(&bdi->laptop_mode_wb_timer, laptop_mode_timer_fn, 0);
return bdi;
}
EXPORT_SYMBOL(bdi_alloc);
void bdi_unregister(struct backing_dev_info *bdi)
{
+ del_timer_sync(&bdi->laptop_mode_wb_timer);
+
/* make sure nobody finds us on the bdi_list anymore */
bdi_remove_from_list(bdi);
wb_shutdown(&bdi->wb);
* ->swap_lock (exclusive_swap_page, others)
* ->i_pages lock
*
- * ->i_mutex
- * ->i_mmap_rwsem (truncate->unmap_mapping_range)
+ * ->i_rwsem
+ * ->invalidate_lock (acquired by fs in truncate path)
+ * ->i_mmap_rwsem (truncate->unmap_mapping_range)
*
* ->mmap_lock
* ->i_mmap_rwsem
* ->i_pages lock (arch-dependent flush_dcache_mmap_lock)
*
* ->mmap_lock
- * ->lock_page (access_process_vm)
+ * ->invalidate_lock (filemap_fault)
+ * ->lock_page (filemap_fault, access_process_vm)
*
- * ->i_mutex (generic_perform_write)
+ * ->i_rwsem (generic_perform_write)
* ->mmap_lock (fault_in_pages_readable->do_page_fault)
*
* bdi->wb.list_lock
EXPORT_SYMBOL(__page_cache_alloc);
#endif
+/*
+ * filemap_invalidate_lock_two - lock invalidate_lock for two mappings
+ *
+ * Lock exclusively invalidate_lock of any passed mapping that is not NULL.
+ *
+ * @mapping1: the first mapping to lock
+ * @mapping2: the second mapping to lock
+ */
+void filemap_invalidate_lock_two(struct address_space *mapping1,
+ struct address_space *mapping2)
+{
+ if (mapping1 > mapping2)
+ swap(mapping1, mapping2);
+ if (mapping1)
+ down_write(&mapping1->invalidate_lock);
+ if (mapping2 && mapping1 != mapping2)
+ down_write_nested(&mapping2->invalidate_lock, 1);
+}
+EXPORT_SYMBOL(filemap_invalidate_lock_two);
+
+/*
+ * filemap_invalidate_unlock_two - unlock invalidate_lock for two mappings
+ *
+ * Unlock exclusive invalidate_lock of any passed mapping that is not NULL.
+ *
+ * @mapping1: the first mapping to unlock
+ * @mapping2: the second mapping to unlock
+ */
+void filemap_invalidate_unlock_two(struct address_space *mapping1,
+ struct address_space *mapping2)
+{
+ if (mapping1)
+ up_write(&mapping1->invalidate_lock);
+ if (mapping2 && mapping1 != mapping2)
+ up_write(&mapping2->invalidate_lock);
+}
+EXPORT_SYMBOL(filemap_invalidate_unlock_two);
+
/*
* In order to wait for pages to become available there must be
* waitqueues associated with pages. By using a hash table of
{
int error;
+ if (iocb->ki_flags & IOCB_NOWAIT) {
+ if (!filemap_invalidate_trylock_shared(mapping))
+ return -EAGAIN;
+ } else {
+ filemap_invalidate_lock_shared(mapping);
+ }
+
if (!trylock_page(page)) {
+ error = -EAGAIN;
if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_NOIO))
- return -EAGAIN;
+ goto unlock_mapping;
if (!(iocb->ki_flags & IOCB_WAITQ)) {
+ filemap_invalidate_unlock_shared(mapping);
put_and_wait_on_page_locked(page, TASK_KILLABLE);
return AOP_TRUNCATED_PAGE;
}
error = __lock_page_async(page, iocb->ki_waitq);
if (error)
- return error;
+ goto unlock_mapping;
}
+ error = AOP_TRUNCATED_PAGE;
if (!page->mapping)
- goto truncated;
+ goto unlock;
error = 0;
if (filemap_range_uptodate(mapping, iocb->ki_pos, iter, page))
goto unlock;
error = filemap_read_page(iocb->ki_filp, mapping, page);
- if (error == AOP_TRUNCATED_PAGE)
- put_page(page);
- return error;
-truncated:
- unlock_page(page);
- put_page(page);
- return AOP_TRUNCATED_PAGE;
+ goto unlock_mapping;
unlock:
unlock_page(page);
+unlock_mapping:
+ filemap_invalidate_unlock_shared(mapping);
+ if (error == AOP_TRUNCATED_PAGE)
+ put_page(page);
return error;
}
if (!page)
return -ENOMEM;
+ /*
+ * Protect against truncate / hole punch. Grabbing invalidate_lock here
+ * assures we cannot instantiate and bring uptodate new pagecache pages
+ * after evicting page cache during truncate and before actually
+ * freeing blocks. Note that we could release invalidate_lock after
+ * inserting the page into page cache as the locked page would then be
+ * enough to synchronize with hole punching. But there are code paths
+ * such as filemap_update_page() filling in partially uptodate pages or
+ * ->readpages() that need to hold invalidate_lock while mapping blocks
+ * for IO so let's hold the lock here as well to keep locking rules
+ * simple.
+ */
+ filemap_invalidate_lock_shared(mapping);
error = add_to_page_cache_lru(page, mapping, index,
mapping_gfp_constraint(mapping, GFP_KERNEL));
if (error == -EEXIST)
if (error)
goto error;
+ filemap_invalidate_unlock_shared(mapping);
pagevec_add(pvec, page);
return 0;
error:
+ filemap_invalidate_unlock_shared(mapping);
put_page(page);
return error;
}
pgoff_t max_off;
struct page *page;
vm_fault_t ret = 0;
+ bool mapping_locked = false;
max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
if (unlikely(offset >= max_off))
* Do we have something in the page cache already?
*/
page = find_get_page(mapping, offset);
- if (likely(page) && !(vmf->flags & FAULT_FLAG_TRIED)) {
+ if (likely(page)) {
/*
- * We found the page, so try async readahead before
- * waiting for the lock.
+ * We found the page, so try async readahead before waiting for
+ * the lock.
*/
- fpin = do_async_mmap_readahead(vmf, page);
- } else if (!page) {
+ if (!(vmf->flags & FAULT_FLAG_TRIED))
+ fpin = do_async_mmap_readahead(vmf, page);
+ if (unlikely(!PageUptodate(page))) {
+ filemap_invalidate_lock_shared(mapping);
+ mapping_locked = true;
+ }
+ } else {
/* No page in the page cache at all */
count_vm_event(PGMAJFAULT);
count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
ret = VM_FAULT_MAJOR;
fpin = do_sync_mmap_readahead(vmf);
retry_find:
+ /*
+ * See comment in filemap_create_page() why we need
+ * invalidate_lock
+ */
+ if (!mapping_locked) {
+ filemap_invalidate_lock_shared(mapping);
+ mapping_locked = true;
+ }
page = pagecache_get_page(mapping, offset,
FGP_CREAT|FGP_FOR_MMAP,
vmf->gfp_mask);
if (!page) {
if (fpin)
goto out_retry;
+ filemap_invalidate_unlock_shared(mapping);
return VM_FAULT_OOM;
}
}
* We have a locked page in the page cache, now we need to check
* that it's up-to-date. If not, it is going to be due to an error.
*/
- if (unlikely(!PageUptodate(page)))
+ if (unlikely(!PageUptodate(page))) {
+ /*
+ * The page was in cache and uptodate and now it is not.
+ * Strange but possible since we didn't hold the page lock all
+ * the time. Let's drop everything get the invalidate lock and
+ * try again.
+ */
+ if (!mapping_locked) {
+ unlock_page(page);
+ put_page(page);
+ goto retry_find;
+ }
goto page_not_uptodate;
+ }
/*
* We've made it this far and we had to drop our mmap_lock, now is the
unlock_page(page);
goto out_retry;
}
+ if (mapping_locked)
+ filemap_invalidate_unlock_shared(mapping);
/*
* Found the page and have a reference on it.
if (!error || error == AOP_TRUNCATED_PAGE)
goto retry_find;
+ filemap_invalidate_unlock_shared(mapping);
return VM_FAULT_SIGBUS;
*/
if (page)
put_page(page);
+ if (mapping_locked)
+ filemap_invalidate_unlock_shared(mapping);
if (fpin)
fput(fpin);
return ret | VM_FAULT_RETRY;
*
* If the page does not get brought uptodate, return -EIO.
*
+ * The function expects mapping->invalidate_lock to be already held.
+ *
* Return: up to date page on success, ERR_PTR() on failure.
*/
struct page *read_cache_page(struct address_space *mapping,
*
* If the page does not get brought uptodate, return -EIO.
*
+ * The function expects mapping->invalidate_lock to be already held.
+ *
* Return: up to date page on success, ERR_PTR() on failure.
*/
struct page *read_cache_page_gfp(struct address_space *mapping,
* modification times and calls proper subroutines depending on whether we
* do direct IO or a standard buffered write.
*
- * It expects i_mutex to be grabbed unless we work on a block device or similar
+ * It expects i_rwsem to be grabbed unless we work on a block device or similar
* object which does not need locking at all.
*
* This function does *not* take care of syncing data in case of O_SYNC write.
* A caller has to handle it. This is mainly due to the fact that we want to
- * avoid syncing under i_mutex.
+ * avoid syncing under i_rwsem.
*
* Return:
* * number of bytes written, even for truncated writes
*
* This is a wrapper around __generic_file_write_iter() to be used by most
* filesystems. It takes care of syncing the file in case of O_SYNC file
- * and acquires i_mutex as needed.
+ * and acquires i_rwsem as needed.
* Return:
* * negative error code if no data has been written at all of
* vfs_fsync_range() failed for a synchronous write
#include <linux/tracepoint.h>
#include <trace/events/printk.h>
+#include <asm/kfence.h>
+
#include "kfence.h"
+/* May be overridden by <asm/kfence.h>. */
+#ifndef arch_kfence_test_address
+#define arch_kfence_test_address(addr) (addr)
+#endif
+
/* Report as observed from console. */
static struct {
spinlock_t lock;
/* Check observed report matches information in @r. */
static bool report_matches(const struct expect_report *r)
{
+ unsigned long addr = (unsigned long)r->addr;
bool ret = false;
unsigned long flags;
typeof(observed.lines) expect;
switch (r->type) {
case KFENCE_ERROR_OOB:
cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
+ addr = arch_kfence_test_address(addr);
break;
case KFENCE_ERROR_UAF:
cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
+ addr = arch_kfence_test_address(addr);
break;
case KFENCE_ERROR_CORRUPTION:
cur += scnprintf(cur, end - cur, "Corrupted memory at");
break;
case KFENCE_ERROR_INVALID:
cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
+ addr = arch_kfence_test_address(addr);
break;
case KFENCE_ERROR_INVALID_FREE:
cur += scnprintf(cur, end - cur, "Invalid free of");
break;
}
- cur += scnprintf(cur, end - cur, " 0x%p", (void *)r->addr);
+ cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
spin_lock_irqsave(&observed.lock, flags);
if (!report_available())
+ ((loff_t)vma->vm_pgoff << PAGE_SHIFT);
/*
- * Filesystem's fallocate may need to take i_mutex. We need to
+ * Filesystem's fallocate may need to take i_rwsem. We need to
* explicitly grab a reference because the vma (and hence the
* vma's reference to the file) can go away as soon as we drop
* mmap_lock.
/*
* Truncation is a bit tricky. Enable it per file system for now.
*
- * Open: to take i_mutex or not for this? Right now we don't.
+ * Open: to take i_rwsem or not for this? Right now we don't.
*/
ret = truncate_error_page(p, pfn, mapping);
out:
undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
memory_notify(MEM_CANCEL_OFFLINE, &arg);
failed_removal_pcplists_disabled:
+ lru_cache_enable();
zone_pcp_enable(zone);
failed_removal:
pr_debug("memory offlining [mem %#010llx-%#010llx] failed due to %s\n",
if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
return -EACCES;
- /*
- * Make sure there are no mandatory locks on the file.
- */
- if (locks_verify_locked(file))
- return -EAGAIN;
-
vm_flags |= VM_SHARED | VM_MAYSHARE;
if (!(file->f_mode & FMODE_WRITE))
vm_flags &= ~(VM_MAYWRITE | VM_SHARED);
(file->f_mode & FMODE_WRITE))
return -EACCES;
- if (locks_verify_locked(file))
- return -EAGAIN;
-
if (!(capabilities & NOMMU_MAP_DIRECT))
return -ENODEV;
return ret;
}
-#ifdef CONFIG_BLOCK
void laptop_mode_timer_fn(struct timer_list *t)
{
struct backing_dev_info *backing_dev_info =
rcu_read_unlock();
}
-#endif
/*
* If ratelimit_pages is too high then we can get into dirty-data overload
*/
unsigned int nofs = memalloc_nofs_save();
+ filemap_invalidate_lock_shared(mapping);
/*
* Preallocate as many pages as we will need.
*/
* will then handle the error.
*/
read_pages(ractl, &page_pool, false);
+ filemap_invalidate_unlock_shared(mapping);
memalloc_nofs_restore(nofs);
}
EXPORT_SYMBOL_GPL(page_cache_ra_unbounded);
/*
* Lock ordering in mm:
*
- * inode->i_mutex (while writing or truncating, not reading or faulting)
+ * inode->i_rwsem (while writing or truncating, not reading or faulting)
* mm->mmap_lock
- * page->flags PG_locked (lock_page) * (see huegtlbfs below)
- * hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share)
- * mapping->i_mmap_rwsem
- * hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
- * anon_vma->rwsem
- * mm->page_table_lock or pte_lock
- * swap_lock (in swap_duplicate, swap_info_get)
- * mmlist_lock (in mmput, drain_mmlist and others)
- * mapping->private_lock (in __set_page_dirty_buffers)
- * lock_page_memcg move_lock (in __set_page_dirty_buffers)
- * i_pages lock (widely used)
- * lruvec->lru_lock (in lock_page_lruvec_irq)
- * inode->i_lock (in set_page_dirty's __mark_inode_dirty)
- * bdi.wb->list_lock (in set_page_dirty's __mark_inode_dirty)
- * sb_lock (within inode_lock in fs/fs-writeback.c)
- * i_pages lock (widely used, in set_page_dirty,
- * in arch-dependent flush_dcache_mmap_lock,
- * within bdi.wb->list_lock in __sync_single_inode)
+ * mapping->invalidate_lock (in filemap_fault)
+ * page->flags PG_locked (lock_page) * (see hugetlbfs below)
+ * hugetlbfs_i_mmap_rwsem_key (in huge_pmd_share)
+ * mapping->i_mmap_rwsem
+ * hugetlb_fault_mutex (hugetlbfs specific page fault mutex)
+ * anon_vma->rwsem
+ * mm->page_table_lock or pte_lock
+ * swap_lock (in swap_duplicate, swap_info_get)
+ * mmlist_lock (in mmput, drain_mmlist and others)
+ * mapping->private_lock (in __set_page_dirty_buffers)
+ * lock_page_memcg move_lock (in __set_page_dirty_buffers)
+ * i_pages lock (widely used)
+ * lruvec->lru_lock (in lock_page_lruvec_irq)
+ * inode->i_lock (in set_page_dirty's __mark_inode_dirty)
+ * bdi.wb->list_lock (in set_page_dirty's __mark_inode_dirty)
+ * sb_lock (within inode_lock in fs/fs-writeback.c)
+ * i_pages lock (widely used, in set_page_dirty,
+ * in arch-dependent flush_dcache_mmap_lock,
+ * within bdi.wb->list_lock in __sync_single_inode)
*
- * anon_vma->rwsem,mapping->i_mutex (memory_failure, collect_procs_anon)
+ * anon_vma->rwsem,mapping->i_mmap_rwsem (memory_failure, collect_procs_anon)
* ->tasklist_lock
* pte map lock
*
/*
* shmem_fallocate communicates with shmem_fault or shmem_writepage via
- * inode->i_private (with i_mutex making sure that it has only one user at
+ * inode->i_private (with i_rwsem making sure that it has only one user at
* a time): we would prefer not to enlarge the shmem inode just for that.
*/
struct shmem_falloc {
* Determine (in bytes) how many of the shmem object's pages mapped by the
* given offsets are swapped out.
*
- * This is safe to call without i_mutex or the i_pages lock thanks to RCU,
+ * This is safe to call without i_rwsem or the i_pages lock thanks to RCU,
* as long as the inode doesn't go away and racy results are not a problem.
*/
unsigned long shmem_partial_swap_usage(struct address_space *mapping,
* Determine (in bytes) how many of the shmem object's pages mapped by the
* given vma is swapped out.
*
- * This is safe to call without i_mutex or the i_pages lock thanks to RCU,
+ * This is safe to call without i_rwsem or the i_pages lock thanks to RCU,
* as long as the inode doesn't go away and racy results are not a problem.
*/
unsigned long shmem_swap_usage(struct vm_area_struct *vma)
loff_t oldsize = inode->i_size;
loff_t newsize = attr->ia_size;
- /* protected by i_mutex */
+ /* protected by i_rwsem */
if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
(newsize > oldsize && (info->seals & F_SEAL_GROW)))
return -EPERM;
/*
* Trinity finds that probing a hole which tmpfs is punching can
* prevent the hole-punch from ever completing: which in turn
- * locks writers out with its hold on i_mutex. So refrain from
+ * locks writers out with its hold on i_rwsem. So refrain from
* faulting pages into the hole while it's being punched. Although
* shmem_undo_range() does remove the additions, it may be unable to
* keep up, as each new page needs its own unmap_mapping_range() call,
* we just need to make racing faults a rare case.
*
* The implementation below would be much simpler if we just used a
- * standard mutex or completion: but we cannot take i_mutex in fault,
+ * standard mutex or completion: but we cannot take i_rwsem in fault,
* and bloating every shmem inode for this unlikely case would be sad.
*/
if (unlikely(inode->i_private)) {
struct shmem_inode_info *info = SHMEM_I(inode);
pgoff_t index = pos >> PAGE_SHIFT;
- /* i_mutex is held by caller */
+ /* i_rwsem is held by caller */
if (unlikely(info->seals & (F_SEAL_GROW |
F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))) {
if (info->seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))
/*
* We must evaluate after, since reads (unlike writes)
- * are called without i_mutex protection against truncate
+ * are called without i_rwsem protection against truncate
*/
nr = PAGE_SIZE;
i_size = i_size_read(inode);
return -ENXIO;
inode_lock(inode);
- /* We're holding i_mutex so we can access i_size directly */
+ /* We're holding i_rwsem so we can access i_size directly */
offset = mapping_seek_hole_data(mapping, offset, inode->i_size, whence);
if (offset >= 0)
offset = vfs_setpos(file, offset, MAX_LFS_FILESIZE);
loff_t unmap_end = round_down(offset + len, PAGE_SIZE) - 1;
DECLARE_WAIT_QUEUE_HEAD_ONSTACK(shmem_falloc_waitq);
- /* protected by i_mutex */
+ /* protected by i_rwsem */
if (info->seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) {
error = -EPERM;
goto out;
swap_slot_cache_enabled = false;
if (swap_slot_cache_initialized) {
/* serialize with cpu hotplug operations */
- get_online_cpus();
+ cpus_read_lock();
__drain_swap_slots_cache(SLOTS_CACHE|SLOTS_CACHE_RET);
- put_online_cpus();
+ cpus_read_unlock();
}
}
* @mapping: mapping to truncate
* @lstart: offset from which to truncate
*
- * Called under (and serialised by) inode->i_mutex.
+ * Called under (and serialised by) inode->i_rwsem and
+ * mapping->invalidate_lock.
*
* Note: When this function returns, there can be a page in the process of
* deletion (inside __delete_from_page_cache()) in the specified range. Thus
* truncate_inode_pages_final - truncate *all* pages before inode dies
* @mapping: mapping to truncate
*
- * Called under (and serialized by) inode->i_mutex.
+ * Called under (and serialized by) inode->i_rwsem.
*
* Filesystems have to use this in the .evict_inode path to inform the
* VM that this is the final truncate and the inode is going away.
* setattr function when ATTR_SIZE is passed in.
*
* Must be called with a lock serializing truncates and writes (generally
- * i_mutex but e.g. xfs uses a different lock) and before all filesystem
+ * i_rwsem but e.g. xfs uses a different lock) and before all filesystem
* specific block truncation has been performed.
*/
void truncate_setsize(struct inode *inode, loff_t newsize)
*
* The function must be called after i_size is updated so that page fault
* coming after we unlock the page will already see the new i_size.
- * The function must be called while we still hold i_mutex - this not only
+ * The function must be called while we still hold i_rwsem - this not only
* makes sure i_size is stable but also that userspace cannot observe new
* i_size value before we are prepared to store mmap writes at new inode size.
*/
*/
void all_vm_events(unsigned long *ret)
{
- get_online_cpus();
+ cpus_read_lock();
sum_vm_events(ret);
- put_online_cpus();
+ cpus_read_unlock();
}
EXPORT_SYMBOL_GPL(all_vm_events);
{
int cpu;
- get_online_cpus();
+ cpus_read_lock();
/* Check processors whose vmstat worker threads have been disabled */
for_each_online_cpu(cpu) {
struct delayed_work *dw = &per_cpu(vmstat_work, cpu);
cond_resched();
}
- put_online_cpus();
+ cpus_read_unlock();
schedule_delayed_work(&shepherd,
round_jiffies_relative(sysctl_stat_interval));
if (ret < 0)
pr_err("vmstat: failed to register 'online' hotplug state\n");
- get_online_cpus();
+ cpus_read_lock();
init_cpu_node_state();
- put_online_cpus();
+ cpus_read_unlock();
start_shepherd_timer();
#endif
return err;
if (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD] || tb[IFLA_TARGET_NETNSID]) {
+ const char *pat = ifname && ifname[0] ? ifname : NULL;
struct net *net;
int new_ifindex;
else
new_ifindex = 0;
- err = __dev_change_net_namespace(dev, net, ifname, new_ifindex);
+ err = __dev_change_net_namespace(dev, net, pat, new_ifindex);
put_net(net);
if (err)
goto errout;
if (!doi_def)
return;
- switch (doi_def->type) {
- case CIPSO_V4_MAP_TRANS:
- kfree(doi_def->map.std->lvl.cipso);
- kfree(doi_def->map.std->lvl.local);
- kfree(doi_def->map.std->cat.cipso);
- kfree(doi_def->map.std->cat.local);
- kfree(doi_def->map.std);
- break;
+ if (doi_def->map.std) {
+ switch (doi_def->type) {
+ case CIPSO_V4_MAP_TRANS:
+ kfree(doi_def->map.std->lvl.cipso);
+ kfree(doi_def->map.std->lvl.local);
+ kfree(doi_def->map.std->cat.cipso);
+ kfree(doi_def->map.std->cat.local);
+ kfree(doi_def->map.std);
+ break;
+ }
}
kfree(doi_def);
}
static int gre_handle_offloads(struct sk_buff *skb, bool csum)
{
+ if (csum && skb_checksum_start(skb) < skb->data)
+ return -EINVAL;
return iptunnel_handle_offloads(skb, csum ? SKB_GSO_GRE_CSUM : SKB_GSO_GRE);
}
return oldest;
}
-static inline u32 fnhe_hashfun(__be32 daddr)
+static u32 fnhe_hashfun(__be32 daddr)
{
- static u32 fnhe_hashrnd __read_mostly;
- u32 hval;
+ static siphash_key_t fnhe_hash_key __read_mostly;
+ u64 hval;
- net_get_random_once(&fnhe_hashrnd, sizeof(fnhe_hashrnd));
- hval = jhash_1word((__force u32)daddr, fnhe_hashrnd);
- return hash_32(hval, FNHE_HASH_SHIFT);
+ net_get_random_once(&fnhe_hash_key, sizeof(fnhe_hash_key));
+ hval = siphash_1u32((__force u32)daddr, &fnhe_hash_key);
+ return hash_64(hval, FNHE_HASH_SHIFT);
}
static void fill_route_from_fnhe(struct rtable *rt, struct fib_nh_exception *fnhe)
struct fib6_node *fn = rcu_dereference_protected(rt->fib6_node,
lockdep_is_held(&rt->fib6_table->tb6_lock));
- /* paired with smp_rmb() in rt6_get_cookie_safe() */
+ /* paired with smp_rmb() in fib6_get_cookie_safe() */
smp_wmb();
while (fn) {
fn->fn_sernum = sernum;
static int gre_handle_offloads(struct sk_buff *skb, bool csum)
{
+ if (csum && skb_checksum_start(skb) < skb->data)
+ return -EINVAL;
return iptunnel_handle_offloads(skb,
csum ? SKB_GSO_GRE_CSUM : SKB_GSO_GRE);
}
#include <linux/nsproxy.h>
#include <linux/slab.h>
#include <linux/jhash.h>
+#include <linux/siphash.h>
#include <net/net_namespace.h>
#include <net/snmp.h>
#include <net/ipv6.h>
static u32 rt6_exception_hash(const struct in6_addr *dst,
const struct in6_addr *src)
{
- static u32 seed __read_mostly;
- u32 val;
+ static siphash_key_t rt6_exception_key __read_mostly;
+ struct {
+ struct in6_addr dst;
+ struct in6_addr src;
+ } __aligned(SIPHASH_ALIGNMENT) combined = {
+ .dst = *dst,
+ };
+ u64 val;
- net_get_random_once(&seed, sizeof(seed));
- val = jhash2((const u32 *)dst, sizeof(*dst)/sizeof(u32), seed);
+ net_get_random_once(&rt6_exception_key, sizeof(rt6_exception_key));
#ifdef CONFIG_IPV6_SUBTREES
if (src)
- val = jhash2((const u32 *)src, sizeof(*src)/sizeof(u32), val);
+ combined.src = *src;
#endif
- return hash_32(val, FIB6_EXCEPTION_BUCKET_SIZE_SHIFT);
+ val = siphash(&combined, sizeof(combined), &rt6_exception_key);
+
+ return hash_64(val, FIB6_EXCEPTION_BUCKET_SIZE_SHIFT);
}
/* Helper function to find the cached rt in the hash table
struct qrtr_endpoint ep;
struct mhi_device *mhi_dev;
struct device *dev;
- struct completion ready;
};
/* From MHI to QRTR */
struct qrtr_mhi_dev *qdev = container_of(ep, struct qrtr_mhi_dev, ep);
int rc;
- rc = wait_for_completion_interruptible(&qdev->ready);
- if (rc)
- goto free_skb;
-
if (skb->sk)
sock_hold(skb->sk);
int rc;
/* start channels */
- rc = mhi_prepare_for_transfer(mhi_dev, 0);
+ rc = mhi_prepare_for_transfer(mhi_dev);
if (rc)
return rc;
if (rc)
return rc;
- /* start channels */
- rc = mhi_prepare_for_transfer(mhi_dev, MHI_CH_INBOUND_ALLOC_BUFS);
- if (rc) {
- qrtr_endpoint_unregister(&qdev->ep);
- dev_set_drvdata(&mhi_dev->dev, NULL);
- return rc;
- }
-
- complete_all(&qdev->ready);
dev_dbg(qdev->dev, "Qualcomm MHI QRTR driver probed\n");
return 0;
goto err;
}
- if (len != ALIGN(size, 4) + hdrlen)
+ if (!size || len != ALIGN(size, 4) + hdrlen)
goto err;
if (cb->dst_port != QRTR_PORT_CTRL && cb->type != QRTR_TYPE_DATA &&
sch_tree_lock(sch);
q->nbands = nbands;
+ for (i = nstrict; i < q->nstrict; i++) {
+ INIT_LIST_HEAD(&q->classes[i].alist);
+ if (q->classes[i].qdisc->q.qlen) {
+ list_add_tail(&q->classes[i].alist, &q->active);
+ q->classes[i].deficit = quanta[i];
+ }
+ }
q->nstrict = nstrict;
memcpy(q->prio2band, priomap, sizeof(priomap));
return __sys_listen(fd, backlog);
}
-int __sys_accept4_file(struct file *file, unsigned file_flags,
+struct file *do_accept(struct file *file, unsigned file_flags,
struct sockaddr __user *upeer_sockaddr,
- int __user *upeer_addrlen, int flags,
- unsigned long nofile)
+ int __user *upeer_addrlen, int flags)
{
struct socket *sock, *newsock;
struct file *newfile;
- int err, len, newfd;
+ int err, len;
struct sockaddr_storage address;
- if (flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
- return -EINVAL;
-
- if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
- flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
-
sock = sock_from_file(file);
- if (!sock) {
- err = -ENOTSOCK;
- goto out;
- }
+ if (!sock)
+ return ERR_PTR(-ENOTSOCK);
- err = -ENFILE;
newsock = sock_alloc();
if (!newsock)
- goto out;
+ return ERR_PTR(-ENFILE);
newsock->type = sock->type;
newsock->ops = sock->ops;
*/
__module_get(newsock->ops->owner);
- newfd = __get_unused_fd_flags(flags, nofile);
- if (unlikely(newfd < 0)) {
- err = newfd;
- sock_release(newsock);
- goto out;
- }
newfile = sock_alloc_file(newsock, flags, sock->sk->sk_prot_creator->name);
- if (IS_ERR(newfile)) {
- err = PTR_ERR(newfile);
- put_unused_fd(newfd);
- goto out;
- }
+ if (IS_ERR(newfile))
+ return newfile;
err = security_socket_accept(sock, newsock);
if (err)
}
/* File flags are not inherited via accept() unlike another OSes. */
-
- fd_install(newfd, newfile);
- err = newfd;
-out:
- return err;
+ return newfile;
out_fd:
fput(newfile);
- put_unused_fd(newfd);
- goto out;
+ return ERR_PTR(err);
+}
+
+int __sys_accept4_file(struct file *file, unsigned file_flags,
+ struct sockaddr __user *upeer_sockaddr,
+ int __user *upeer_addrlen, int flags,
+ unsigned long nofile)
+{
+ struct file *newfile;
+ int newfd;
+ if (flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
+ return -EINVAL;
+
+ if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK))
+ flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK;
+
+ newfd = __get_unused_fd_flags(flags, nofile);
+ if (unlikely(newfd < 0))
+ return newfd;
+
+ newfile = do_accept(file, file_flags, upeer_sockaddr, upeer_addrlen,
+ flags);
+ if (IS_ERR(newfile)) {
+ put_unused_fd(newfd);
+ return PTR_ERR(newfile);
+ }
+ fd_install(newfd, newfile);
+ return newfd;
}
/*
rqstp->rq_stime = ktime_get();
rqstp->rq_reserved = serv->sv_max_mesg;
atomic_add(rqstp->rq_reserved, &xprt->xpt_reserved);
- }
+ } else
+ svc_xprt_received(xprt);
out:
trace_svc_handle_xprt(xprt, len);
return len;
xprt_unregister_transport(&xs_bc_tcp_transport);
}
-static int param_set_uint_minmax(const char *val,
- const struct kernel_param *kp,
- unsigned int min, unsigned int max)
-{
- unsigned int num;
- int ret;
-
- if (!val)
- return -EINVAL;
- ret = kstrtouint(val, 0, &num);
- if (ret)
- return ret;
- if (num < min || num > max)
- return -EINVAL;
- *((unsigned int *)kp->arg) = num;
- return 0;
-}
-
static int param_set_portnr(const char *val, const struct kernel_param *kp)
{
return param_set_uint_minmax(val, kp,
fi
cat <<EOF |
-asm-generic/atomic-instrumented.h
-asm-generic/atomic-long.h
-linux/atomic-arch-fallback.h
+linux/atomic/atomic-instrumented.h
+linux/atomic/atomic-long.h
+linux/atomic/atomic-arch-fallback.h
EOF
while read header; do
OLDSUM="$(tail -n 1 ${LINUXDIR}/include/${header})"
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_${pfx}${name}${sfx}_acquire(${params})
+arch_${atomic}_${pfx}${name}${sfx}_acquire(${params})
{
- ${ret} ret = ${arch}${atomic}_${pfx}${name}${sfx}_relaxed(${args});
+ ${ret} ret = arch_${atomic}_${pfx}${name}${sfx}_relaxed(${args});
__atomic_acquire_fence();
return ret;
}
cat <<EOF
/**
- * ${arch}${atomic}_add_negative - add and test if negative
+ * arch_${atomic}_add_negative - add and test if negative
* @i: integer value to add
* @v: pointer of type ${atomic}_t
*
* result is greater than or equal to zero.
*/
static __always_inline bool
-${arch}${atomic}_add_negative(${int} i, ${atomic}_t *v)
+arch_${atomic}_add_negative(${int} i, ${atomic}_t *v)
{
- return ${arch}${atomic}_add_return(i, v) < 0;
+ return arch_${atomic}_add_return(i, v) < 0;
}
EOF
cat << EOF
/**
- * ${arch}${atomic}_add_unless - add unless the number is already a given value
+ * arch_${atomic}_add_unless - add unless the number is already a given value
* @v: pointer of type ${atomic}_t
* @a: the amount to add to v...
* @u: ...unless v is equal to u.
* Returns true if the addition was done.
*/
static __always_inline bool
-${arch}${atomic}_add_unless(${atomic}_t *v, ${int} a, ${int} u)
+arch_${atomic}_add_unless(${atomic}_t *v, ${int} a, ${int} u)
{
- return ${arch}${atomic}_fetch_add_unless(v, a, u) != u;
+ return arch_${atomic}_fetch_add_unless(v, a, u) != u;
}
EOF
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_${pfx}andnot${sfx}${order}(${int} i, ${atomic}_t *v)
+arch_${atomic}_${pfx}andnot${sfx}${order}(${int} i, ${atomic}_t *v)
{
- ${retstmt}${arch}${atomic}_${pfx}and${sfx}${order}(~i, v);
+ ${retstmt}arch_${atomic}_${pfx}and${sfx}${order}(~i, v);
}
EOF
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_${pfx}dec${sfx}${order}(${atomic}_t *v)
+arch_${atomic}_${pfx}dec${sfx}${order}(${atomic}_t *v)
{
- ${retstmt}${arch}${atomic}_${pfx}sub${sfx}${order}(1, v);
+ ${retstmt}arch_${atomic}_${pfx}sub${sfx}${order}(1, v);
}
EOF
cat <<EOF
/**
- * ${arch}${atomic}_dec_and_test - decrement and test
+ * arch_${atomic}_dec_and_test - decrement and test
* @v: pointer of type ${atomic}_t
*
* Atomically decrements @v by 1 and
* cases.
*/
static __always_inline bool
-${arch}${atomic}_dec_and_test(${atomic}_t *v)
+arch_${atomic}_dec_and_test(${atomic}_t *v)
{
- return ${arch}${atomic}_dec_return(v) == 0;
+ return arch_${atomic}_dec_return(v) == 0;
}
EOF
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_dec_if_positive(${atomic}_t *v)
+arch_${atomic}_dec_if_positive(${atomic}_t *v)
{
- ${int} dec, c = ${arch}${atomic}_read(v);
+ ${int} dec, c = arch_${atomic}_read(v);
do {
dec = c - 1;
if (unlikely(dec < 0))
break;
- } while (!${arch}${atomic}_try_cmpxchg(v, &c, dec));
+ } while (!arch_${atomic}_try_cmpxchg(v, &c, dec));
return dec;
}
cat <<EOF
static __always_inline bool
-${arch}${atomic}_dec_unless_positive(${atomic}_t *v)
+arch_${atomic}_dec_unless_positive(${atomic}_t *v)
{
- ${int} c = ${arch}${atomic}_read(v);
+ ${int} c = arch_${atomic}_read(v);
do {
if (unlikely(c > 0))
return false;
- } while (!${arch}${atomic}_try_cmpxchg(v, &c, c - 1));
+ } while (!arch_${atomic}_try_cmpxchg(v, &c, c - 1));
return true;
}
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_${pfx}${name}${sfx}(${params})
+arch_${atomic}_${pfx}${name}${sfx}(${params})
{
${ret} ret;
__atomic_pre_full_fence();
- ret = ${arch}${atomic}_${pfx}${name}${sfx}_relaxed(${args});
+ ret = arch_${atomic}_${pfx}${name}${sfx}_relaxed(${args});
__atomic_post_full_fence();
return ret;
}
cat << EOF
/**
- * ${arch}${atomic}_fetch_add_unless - add unless the number is already a given value
+ * arch_${atomic}_fetch_add_unless - add unless the number is already a given value
* @v: pointer of type ${atomic}_t
* @a: the amount to add to v...
* @u: ...unless v is equal to u.
* Returns original value of @v
*/
static __always_inline ${int}
-${arch}${atomic}_fetch_add_unless(${atomic}_t *v, ${int} a, ${int} u)
+arch_${atomic}_fetch_add_unless(${atomic}_t *v, ${int} a, ${int} u)
{
- ${int} c = ${arch}${atomic}_read(v);
+ ${int} c = arch_${atomic}_read(v);
do {
if (unlikely(c == u))
break;
- } while (!${arch}${atomic}_try_cmpxchg(v, &c, c + a));
+ } while (!arch_${atomic}_try_cmpxchg(v, &c, c + a));
return c;
}
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_${pfx}inc${sfx}${order}(${atomic}_t *v)
+arch_${atomic}_${pfx}inc${sfx}${order}(${atomic}_t *v)
{
- ${retstmt}${arch}${atomic}_${pfx}add${sfx}${order}(1, v);
+ ${retstmt}arch_${atomic}_${pfx}add${sfx}${order}(1, v);
}
EOF
cat <<EOF
/**
- * ${arch}${atomic}_inc_and_test - increment and test
+ * arch_${atomic}_inc_and_test - increment and test
* @v: pointer of type ${atomic}_t
*
* Atomically increments @v by 1
* other cases.
*/
static __always_inline bool
-${arch}${atomic}_inc_and_test(${atomic}_t *v)
+arch_${atomic}_inc_and_test(${atomic}_t *v)
{
- return ${arch}${atomic}_inc_return(v) == 0;
+ return arch_${atomic}_inc_return(v) == 0;
}
EOF
cat <<EOF
/**
- * ${arch}${atomic}_inc_not_zero - increment unless the number is zero
+ * arch_${atomic}_inc_not_zero - increment unless the number is zero
* @v: pointer of type ${atomic}_t
*
* Atomically increments @v by 1, if @v is non-zero.
* Returns true if the increment was done.
*/
static __always_inline bool
-${arch}${atomic}_inc_not_zero(${atomic}_t *v)
+arch_${atomic}_inc_not_zero(${atomic}_t *v)
{
- return ${arch}${atomic}_add_unless(v, 1, 0);
+ return arch_${atomic}_add_unless(v, 1, 0);
}
EOF
cat <<EOF
static __always_inline bool
-${arch}${atomic}_inc_unless_negative(${atomic}_t *v)
+arch_${atomic}_inc_unless_negative(${atomic}_t *v)
{
- ${int} c = ${arch}${atomic}_read(v);
+ ${int} c = arch_${atomic}_read(v);
do {
if (unlikely(c < 0))
return false;
- } while (!${arch}${atomic}_try_cmpxchg(v, &c, c + 1));
+ } while (!arch_${atomic}_try_cmpxchg(v, &c, c + 1));
return true;
}
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_read_acquire(const ${atomic}_t *v)
+arch_${atomic}_read_acquire(const ${atomic}_t *v)
{
return smp_load_acquire(&(v)->counter);
}
cat <<EOF
static __always_inline ${ret}
-${arch}${atomic}_${pfx}${name}${sfx}_release(${params})
+arch_${atomic}_${pfx}${name}${sfx}_release(${params})
{
__atomic_release_fence();
- ${retstmt}${arch}${atomic}_${pfx}${name}${sfx}_relaxed(${args});
+ ${retstmt}arch_${atomic}_${pfx}${name}${sfx}_relaxed(${args});
}
EOF
cat <<EOF
static __always_inline void
-${arch}${atomic}_set_release(${atomic}_t *v, ${int} i)
+arch_${atomic}_set_release(${atomic}_t *v, ${int} i)
{
smp_store_release(&(v)->counter, i);
}
cat <<EOF
/**
- * ${arch}${atomic}_sub_and_test - subtract value from variable and test result
+ * arch_${atomic}_sub_and_test - subtract value from variable and test result
* @i: integer value to subtract
* @v: pointer of type ${atomic}_t
*
* other cases.
*/
static __always_inline bool
-${arch}${atomic}_sub_and_test(${int} i, ${atomic}_t *v)
+arch_${atomic}_sub_and_test(${int} i, ${atomic}_t *v)
{
- return ${arch}${atomic}_sub_return(i, v) == 0;
+ return arch_${atomic}_sub_return(i, v) == 0;
}
EOF
cat <<EOF
static __always_inline bool
-${arch}${atomic}_try_cmpxchg${order}(${atomic}_t *v, ${int} *old, ${int} new)
+arch_${atomic}_try_cmpxchg${order}(${atomic}_t *v, ${int} *old, ${int} new)
{
${int} r, o = *old;
- r = ${arch}${atomic}_cmpxchg${order}(v, o, new);
+ r = arch_${atomic}_cmpxchg${order}(v, o, new);
if (unlikely(r != o))
*old = r;
return likely(r == o);
# SPDX-License-Identifier: GPL-2.0
ATOMICDIR=$(dirname $0)
-ARCH=$2
. ${ATOMICDIR}/atomic-tbl.sh
-#gen_template_fallback(template, meta, pfx, name, sfx, order, arch, atomic, int, args...)
+#gen_template_fallback(template, meta, pfx, name, sfx, order, atomic, int, args...)
gen_template_fallback()
{
local template="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
local order="$1"; shift
- local arch="$1"; shift
local atomic="$1"; shift
local int="$1"; shift
- local atomicname="${arch}${atomic}_${pfx}${name}${sfx}${order}"
+ local atomicname="arch_${atomic}_${pfx}${name}${sfx}${order}"
local ret="$(gen_ret_type "${meta}" "${int}")"
local retstmt="$(gen_ret_stmt "${meta}")"
fi
}
-#gen_proto_fallback(meta, pfx, name, sfx, order, arch, atomic, int, args...)
+#gen_proto_fallback(meta, pfx, name, sfx, order, atomic, int, args...)
gen_proto_fallback()
{
local meta="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
local order="$1"; shift
- local arch="$1"
- local atomic="$2"
+ local atomic="$1"
- local basename="${arch}${atomic}_${pfx}${name}${sfx}"
+ local basename="arch_${atomic}_${pfx}${name}${sfx}"
- printf "#define arch_${basename}${order} ${basename}${order}\n"
+ printf "#define ${basename}${order} ${basename}${order}\n"
}
-#gen_proto_order_variants(meta, pfx, name, sfx, arch, atomic, int, args...)
+#gen_proto_order_variants(meta, pfx, name, sfx, atomic, int, args...)
gen_proto_order_variants()
{
local meta="$1"; shift
local pfx="$1"; shift
local name="$1"; shift
local sfx="$1"; shift
- local arch="$1"
- local atomic="$2"
+ local atomic="$1"
- local basename="${arch}${atomic}_${pfx}${name}${sfx}"
+ local basename="arch_${atomic}_${pfx}${name}${sfx}"
local template="$(find_fallback_template "${pfx}" "${name}" "${sfx}" "${order}")"
- if [ -z "$arch" ]; then
- gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "" "$@"
-
- if meta_has_acquire "${meta}"; then
- gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "_acquire" "$@"
- fi
- if meta_has_release "${meta}"; then
- gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "_release" "$@"
- fi
- if meta_has_relaxed "${meta}"; then
- gen_proto_order_variant "${meta}" "${pfx}" "${name}" "${sfx}" "_relaxed" "$@"
- fi
-
- echo ""
- fi
-
# If we don't have relaxed atomics, then we don't bother with ordering fallbacks
# read_acquire and set_release need to be templated, though
if ! meta_has_relaxed "${meta}"; then
gen_basic_fallbacks "${basename}"
if [ ! -z "${template}" ]; then
- printf "#endif /* ${arch}${atomic}_${pfx}${name}${sfx} */\n\n"
+ printf "#endif /* ${basename} */\n\n"
gen_proto_fallback "${meta}" "${pfx}" "${name}" "${sfx}" "" "$@"
gen_proto_fallback "${meta}" "${pfx}" "${name}" "${sfx}" "_acquire" "$@"
gen_proto_fallback "${meta}" "${pfx}" "${name}" "${sfx}" "_release" "$@"
local order="$1"; shift;
cat <<EOF
-#ifndef ${ARCH}try_cmpxchg${order}
-#define ${ARCH}try_cmpxchg${order}(_ptr, _oldp, _new) \\
+#ifndef arch_try_cmpxchg${order}
+#define arch_try_cmpxchg${order}(_ptr, _oldp, _new) \\
({ \\
typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \\
- ___r = ${ARCH}cmpxchg${order}((_ptr), ___o, (_new)); \\
+ ___r = arch_cmpxchg${order}((_ptr), ___o, (_new)); \\
if (unlikely(___r != ___o)) \\
*___op = ___r; \\
likely(___r == ___o); \\
})
-#endif /* ${ARCH}try_cmpxchg${order} */
+#endif /* arch_try_cmpxchg${order} */
EOF
}
gen_try_cmpxchg_fallbacks()
{
- printf "#ifndef ${ARCH}try_cmpxchg_relaxed\n"
- printf "#ifdef ${ARCH}try_cmpxchg\n"
+ printf "#ifndef arch_try_cmpxchg_relaxed\n"
+ printf "#ifdef arch_try_cmpxchg\n"
- gen_basic_fallbacks "${ARCH}try_cmpxchg"
+ gen_basic_fallbacks "arch_try_cmpxchg"
- printf "#endif /* ${ARCH}try_cmpxchg */\n\n"
+ printf "#endif /* arch_try_cmpxchg */\n\n"
for order in "" "_acquire" "_release" "_relaxed"; do
gen_try_cmpxchg_fallback "${order}"
done
- printf "#else /* ${ARCH}try_cmpxchg_relaxed */\n"
+ printf "#else /* arch_try_cmpxchg_relaxed */\n"
- gen_order_fallbacks "${ARCH}try_cmpxchg"
+ gen_order_fallbacks "arch_try_cmpxchg"
- printf "#endif /* ${ARCH}try_cmpxchg_relaxed */\n\n"
+ printf "#endif /* arch_try_cmpxchg_relaxed */\n\n"
}
cat << EOF
EOF
-for xchg in "${ARCH}xchg" "${ARCH}cmpxchg" "${ARCH}cmpxchg64"; do
+for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64"; do
gen_xchg_fallbacks "${xchg}"
done
gen_try_cmpxchg_fallbacks
grep '^[a-z]' "$1" | while read name meta args; do
- gen_proto "${meta}" "${name}" "${ARCH}" "atomic" "int" ${args}
+ gen_proto "${meta}" "${name}" "atomic" "int" ${args}
done
cat <<EOF
EOF
grep '^[a-z]' "$1" | while read name meta args; do
- gen_proto "${meta}" "${name}" "${ARCH}" "atomic64" "s64" ${args}
+ gen_proto "${meta}" "${name}" "atomic64" "s64" ${args}
done
cat <<EOF
* arch_ variants (i.e. arch_atomic_read()/arch_atomic_cmpxchg()) to avoid
* double instrumentation.
*/
-#ifndef _ASM_GENERIC_ATOMIC_INSTRUMENTED_H
-#define _ASM_GENERIC_ATOMIC_INSTRUMENTED_H
+#ifndef _LINUX_ATOMIC_INSTRUMENTED_H
+#define _LINUX_ATOMIC_INSTRUMENTED_H
#include <linux/build_bug.h>
#include <linux/compiler.h>
gen_proto "${meta}" "${name}" "atomic64" "s64" ${args}
done
+grep '^[a-z]' "$1" | while read name meta args; do
+ gen_proto "${meta}" "${name}" "atomic_long" "long" ${args}
+done
+
+
for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg"; do
for order in "" "_acquire" "_release" "_relaxed"; do
gen_xchg "${xchg}${order}" ""
cat <<EOF
-#endif /* _ASM_GENERIC_ATOMIC_INSTRUMENTED_H */
+#endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
EOF
cat <<EOF
static __always_inline ${ret}
-atomic_long_${name}(${params})
+arch_atomic_long_${name}(${params})
{
- ${retstmt}${atomic}_${name}(${argscast});
+ ${retstmt}arch_${atomic}_${name}(${argscast});
}
EOF
// Generated by $0
// DO NOT MODIFY THIS FILE DIRECTLY
-#ifndef _ASM_GENERIC_ATOMIC_LONG_H
-#define _ASM_GENERIC_ATOMIC_LONG_H
+#ifndef _LINUX_ATOMIC_LONG_H
+#define _LINUX_ATOMIC_LONG_H
#include <linux/compiler.h>
#include <asm/types.h>
cat <<EOF
#endif /* CONFIG_64BIT */
-#endif /* _ASM_GENERIC_ATOMIC_LONG_H */
+#endif /* _LINUX_ATOMIC_LONG_H */
EOF
LINUXDIR=${ATOMICDIR}/../..
cat <<EOF |
-gen-atomic-instrumented.sh asm-generic/atomic-instrumented.h
-gen-atomic-long.sh asm-generic/atomic-long.h
-gen-atomic-fallback.sh linux/atomic-arch-fallback.h arch_
+gen-atomic-instrumented.sh linux/atomic/atomic-instrumented.h
+gen-atomic-long.sh linux/atomic/atomic-long.h
+gen-atomic-fallback.sh linux/atomic/atomic-arch-fallback.h
EOF
while read script header args; do
/bin/sh ${ATOMICDIR}/${script} ${ATOMICTBL} ${args} > ${LINUXDIR}/include/${header}
memcpy(&list, data, sizeof(list));
pr_devel("LIST[%04x] guid=%pUl ls=%x hs=%x ss=%x\n",
offs,
- list.signature_type.b, list.signature_list_size,
+ &list.signature_type, list.signature_list_size,
list.signature_header_size, list.signature_size);
lsize = list.signature_list_size;
* scall32-o32.S in the kernel sources.
* - the system call is performed by calling "syscall"
* - syscall return comes in v0, and register a3 needs to be checked to know
- * if an error occured, in which case errno is in v0.
+ * if an error occurred, in which case errno is in v0.
* - the arguments are cast to long and assigned into the target registers
* which are then simply passed as registers to the asm code, so that we
* don't have to experience issues with register constraints.
return 0;
}
+static __attribute__((unused))
+int msleep(unsigned int msecs)
+{
+ struct timeval my_timeval = { msecs / 1000, (msecs % 1000) * 1000 };
+
+ if (sys_select(0, 0, 0, 0, &my_timeval) < 0)
+ return (my_timeval.tv_sec * 1000) +
+ (my_timeval.tv_usec / 1000) +
+ !!(my_timeval.tv_usec % 1000);
+ else
+ return 0;
+}
+
static __attribute__((unused))
int stat(const char *path, struct stat *buf)
{
cpumask=`awk -v cpus="$cpus" -v me=$me -v n=$n 'BEGIN {
srand(n + me + systime());
ncpus = split(cpus, ca);
- curcpu = ca[int(rand() * ncpus + 1)];
- z = "";
- for (i = 1; 4 * i <= curcpu; i++)
- z = z "0";
- print "0x" 2 ^ (curcpu % 4) z;
+ print ca[int(rand() * ncpus + 1)];
}' < /dev/null`
n=$(($n+1))
- if ! taskset -p $cpumask $$ > /dev/null 2>&1
+ if ! taskset -c -p $cpumask $$ > /dev/null 2>&1
then
- echo taskset failure: '"taskset -p ' $cpumask $$ '"'
+ echo taskset failure: '"taskset -c -p ' $cpumask $$ '"'
exit 1
fi
then
exit 0
fi
-cat $1/*/console.log |
+find $1 -name console.log -exec cat {} \; |
grep "BUG: KCSAN: " |
sed -e 's/^\[[^]]*] //' |
sort |
echo "Cannot copy from $oldrun to $rundir."
usage
fi
-rm -f "$rundir"/*/{console.log,console.log.diags,qemu_pid,qemu-retval,Warnings,kvm-test-1-run.sh.out,kvm-test-1-run-qemu.sh.out,vmlinux} "$rundir"/log
+rm -f "$rundir"/*/{console.log,console.log.diags,qemu_pid,qemu-pid,qemu-retval,Warnings,kvm-test-1-run.sh.out,kvm-test-1-run-qemu.sh.out,vmlinux} "$rundir"/log
touch "$rundir/log"
echo $scriptname $args | tee -a "$rundir/log"
echo $oldrun > "$rundir/re-run"
then
echo ---- Dryrun complete, directory: $rundir | tee -a "$rundir/log"
else
- ( cd "$rundir"; sh $T/runbatches.sh )
+ ( cd "$rundir"; sh $T/runbatches.sh ) | tee -a "$rundir/log"
kvm-end-run-stats.sh "$rundir" "$starttime"
fi
--- /dev/null
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0+
+#
+# Produce awk statements roughly depicting the system's CPU and cache
+# layout. If the required information is not available, produce
+# error messages as awk comments. Successful exit regardless.
+#
+# Usage: kvm-assign-cpus.sh /path/to/sysfs
+
+T=/tmp/kvm-assign-cpus.sh.$$
+trap 'rm -rf $T' 0 2
+mkdir $T
+
+sysfsdir=${1-/sys/devices/system/node}
+if ! cd "$sysfsdir" > $T/msg 2>&1
+then
+ sed -e 's/^/# /' < $T/msg
+ exit 0
+fi
+nodelist="`ls -d node*`"
+for i in node*
+do
+ if ! test -d $i/
+ then
+ echo "# Not a directory: $sysfsdir/node*"
+ exit 0
+ fi
+ for j in $i/cpu*/cache/index*
+ do
+ if ! test -d $j/
+ then
+ echo "# Not a directory: $sysfsdir/$j"
+ exit 0
+ else
+ break
+ fi
+ done
+ indexlist="`ls -d $i/cpu* | grep 'cpu[0-9][0-9]*' | head -1 | sed -e 's,^.*$,ls -d &/cache/index*,' | sh | sed -e 's,^.*/,,'`"
+ break
+done
+for i in node*/cpu*/cache/index*/shared_cpu_list
+do
+ if ! test -f $i
+ then
+ echo "# Not a file: $sysfsdir/$i"
+ exit 0
+ else
+ break
+ fi
+done
+firstshared=
+for i in $indexlist
+do
+ rm -f $T/cpulist
+ for n in node*
+ do
+ f="$n/cpu*/cache/$i/shared_cpu_list"
+ if ! cat $f > $T/msg 2>&1
+ then
+ sed -e 's/^/# /' < $T/msg
+ exit 0
+ fi
+ cat $f >> $T/cpulist
+ done
+ if grep -q '[-,]' $T/cpulist
+ then
+ if test -z "$firstshared"
+ then
+ firstshared="$i"
+ fi
+ fi
+done
+if test -z "$firstshared"
+then
+ splitindex="`echo $indexlist | sed -e 's/ .*$//'`"
+else
+ splitindex="$firstshared"
+fi
+nodenum=0
+for n in node*
+do
+ cat $n/cpu*/cache/$splitindex/shared_cpu_list | sort -u -k1n |
+ awk -v nodenum="$nodenum" '
+ BEGIN {
+ idx = 0;
+ }
+
+ {
+ nlists = split($0, cpulists, ",");
+ for (i = 1; i <= nlists; i++) {
+ listsize = split(cpulists[i], cpus, "-");
+ if (listsize == 1)
+ cpus[2] = cpus[1];
+ for (j = cpus[1]; j <= cpus[2]; j++) {
+ print "cpu[" nodenum "][" idx "] = " j ";";
+ idx++;
+ }
+ }
+ }
+
+ END {
+ print "nodecpus[" nodenum "] = " idx ";";
+ }'
+ nodenum=`expr $nodenum + 1`
+done
+echo "numnodes = $nodenum;"
--- /dev/null
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0+
+#
+# Create an awk script that takes as input numbers of CPUs and outputs
+# lists of CPUs, one per line in both cases.
+#
+# Usage: kvm-get-cpus-script.sh /path/to/cpu/arrays /path/to/put/script [ /path/to/state ]
+#
+# The CPU arrays are output by kvm-assign-cpus.sh, and are valid awk
+# statements initializing the variables describing the system's topology.
+#
+# The optional state is input by this script (if the file exists and is
+# non-empty), and can also be output by this script.
+
+cpuarrays="${1-/sys/devices/system/node}"
+scriptfile="${2}"
+statefile="${3}"
+
+if ! test -f "$cpuarrays"
+then
+ echo "File not found: $cpuarrays" 1>&2
+ exit 1
+fi
+scriptdir="`dirname "$scriptfile"`"
+if ! test -d "$scriptdir" || ! test -x "$scriptdir" || ! test -w "$scriptdir"
+then
+ echo "Directory not usable for script output: $scriptdir"
+ exit 1
+fi
+
+cat << '___EOF___' > "$scriptfile"
+BEGIN {
+___EOF___
+cat "$cpuarrays" >> "$scriptfile"
+if test -r "$statefile"
+then
+ cat "$statefile" >> "$scriptfile"
+fi
+cat << '___EOF___' >> "$scriptfile"
+}
+
+# Do we have the system architecture to guide CPU affinity?
+function gotcpus()
+{
+ return numnodes != "";
+}
+
+# Return a comma-separated list of the next n CPUs.
+function nextcpus(n, i, s)
+{
+ for (i = 0; i < n; i++) {
+ if (nodecpus[curnode] == "")
+ curnode = 0;
+ if (cpu[curnode][curcpu[curnode]] == "")
+ curcpu[curnode] = 0;
+ if (s != "")
+ s = s ",";
+ s = s cpu[curnode][curcpu[curnode]];
+ curcpu[curnode]++;
+ curnode++
+ }
+ return s;
+}
+
+# Dump out the current node/CPU state so that a later invocation of this
+# script can continue where this one left off. Of course, this only works
+# when a state file was specified and where there was valid sysfs state.
+# Returns 1 if the state was dumped, 0 otherwise.
+#
+# Dumping the state for one system configuration and loading it into
+# another isn't likely to do what you want, whatever that might be.
+function dumpcpustate( i, fn)
+{
+___EOF___
+echo ' fn = "'"$statefile"'";' >> $scriptfile
+cat << '___EOF___' >> "$scriptfile"
+ if (fn != "" && gotcpus()) {
+ print "curnode = " curnode ";" > fn;
+ for (i = 0; i < numnodes; i++)
+ if (curcpu[i] != "")
+ print "curcpu[" i "] = " curcpu[i] ";" >> fn;
+ return 1;
+ }
+ if (fn != "")
+ print "# No CPU state to dump." > fn;
+ return 0;
+}
+___EOF___
echo "$configfile -------"
else
title="$configfile ------- $ncs acquisitions/releases"
- dur=`sed -e 's/^.* locktorture.shutdown_secs=//' -e 's/ .*$//' < $i/qemu-cmd 2> /dev/null`
+ dur=`grep -v '^#' $i/qemu-cmd | sed -e 's/^.* locktorture.shutdown_secs=//' -e 's/ .*$//' 2> /dev/null`
if test -z "$dur"
then
:
then
echo "$configfile ------- "
else
- dur="`sed -e 's/^.* scftorture.shutdown_secs=//' -e 's/ .*$//' < $i/qemu-cmd 2> /dev/null`"
+ dur="`grep -v '^#' $i/qemu-cmd | sed -e 's/^.* scftorture.shutdown_secs=//' -e 's/ .*$//' 2> /dev/null`"
if test -z "$dur"
then
rate=""
done
if test -f "$rd/kcsan.sum"
then
- if grep -q CONFIG_KCSAN=y $T
+ if ! test -f $T
+ then
+ :
+ elif grep -q CONFIG_KCSAN=y $T
then
echo "Compiler or architecture does not support KCSAN!"
echo Did you forget to switch your compiler with '--kmake-arg CC=<cc-that-supports-kcsan>'?
--- /dev/null
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+#
+# Periodically scan a directory tree to prevent files from being reaped
+# by systemd and friends on long runs.
+#
+# Usage: kvm-remote-noreap.sh pathname
+#
+# Copyright (C) 2021 Facebook, Inc.
+#
+# Authors: Paul E. McKenney <paulmck@kernel.org>
+
+pathname="$1"
+if test "$pathname" = ""
+then
+ echo Usage: kvm-remote-noreap.sh pathname
+ exit 1
+fi
+if ! test -d "$pathname"
+then
+ echo Usage: kvm-remote-noreap.sh pathname
+ echo " pathname must be a directory."
+ exit 2
+fi
+
+while test -d "$pathname"
+do
+ find "$pathname" -type f -exec touch -c {} \; > /dev/null 2>&1
+ sleep 30
+done
n = $1;
sub(/\./, "", n);
fn = dest "/kvm-remote-" n ".sh"
+ print "kvm-remote-noreap.sh " rundir " &" > fn;
scenarios = "";
for (i = 2; i <= NF; i++)
scenarios = scenarios " " $i;
- print "kvm-test-1-run-batch.sh" scenarios > fn;
+ print "kvm-test-1-run-batch.sh" scenarios >> fn;
+ print "sync" >> fn;
print "rm " rundir "/remote.run" >> fn;
}'
chmod +x $T/bin/kvm-remote-*.sh
do
ssh $1 "test -f \"$2\""
ret=$?
- if test "$ret" -ne 255
+ if test "$ret" -eq 255
then
+ echo " ---" ssh failure to $1 checking for file $2, retry after $sleeptime seconds. `date`
+ elif test "$ret" -eq 0
+ then
+ return 0
+ elif test "$ret" -eq 1
+ then
+ echo " ---" File \"$2\" not found: ssh $1 test -f \"$2\"
+ return 1
+ else
+ echo " ---" Exit code $ret: ssh $1 test -f \"$2\", retry after $sleeptime seconds. `date`
return $ret
fi
- echo " ---" ssh failure to $1 checking for file $2, retry after $sleeptime seconds. `date`
sleep $sleeptime
done
}
do
sleep 30
done
- ( cd "$oldrun"; ssh $i "cd $rundir; tar -czf - kvm-remote-*.sh.out */console.log */kvm-test-1-run*.sh.out */qemu_pid */qemu-retval; rm -rf $T > /dev/null 2>&1" | tar -xzf - )
+ echo " ---" Collecting results from $i `date`
+ ( cd "$oldrun"; ssh $i "cd $rundir; tar -czf - kvm-remote-*.sh.out */console.log */kvm-test-1-run*.sh.out */qemu[_-]pid */qemu-retval */qemu-affinity; rm -rf $T > /dev/null 2>&1" | tar -xzf - )
done
( kvm-end-run-stats.sh "$oldrun" "$starttime"; echo $? > $T/exitcode ) | tee -a "$oldrun/remote-log"
echo ---- System running test: `uname -a`
echo ---- Starting kernels. `date` | tee -a log
$TORTURE_JITTER_START
+kvm-assign-cpus.sh /sys/devices/system/node > $T/cpuarray.awk
for i in "$@"
do
echo ---- System running test: `uname -a` > $i/kvm-test-1-run-qemu.sh.out
echo > $i/kvm-test-1-run-qemu.sh.out
+ export TORTURE_AFFINITY=
+ kvm-get-cpus-script.sh $T/cpuarray.awk $T/cpubatches.awk $T/cpustate
+ cat << ' ___EOF___' >> $T/cpubatches.awk
+ END {
+ affinitylist = "";
+ if (!gotcpus()) {
+ print "echo No CPU-affinity information, so no taskset command.";
+ } else if (cpu_count !~ /^[0-9][0-9]*$/) {
+ print "echo " scenario ": Bogus number of CPUs (old qemu-cmd?), so no taskset command.";
+ } else {
+ affinitylist = nextcpus(cpu_count);
+ if (!(affinitylist ~ /^[0-9,-][0-9,-]*$/))
+ print "echo " scenario ": Bogus CPU-affinity information, so no taskset command.";
+ else if (!dumpcpustate())
+ print "echo " scenario ": Could not dump state, so no taskset command.";
+ else
+ print "export TORTURE_AFFINITY=" affinitylist;
+ }
+ }
+ ___EOF___
+ cpu_count="`grep '# TORTURE_CPU_COUNT=' $i/qemu-cmd | sed -e 's/^.*=//'`"
+ affinity_export="`awk -f $T/cpubatches.awk -v cpu_count="$cpu_count" -v scenario=$i < /dev/null`"
+ $affinity_export
kvm-test-1-run-qemu.sh $i >> $i/kvm-test-1-run-qemu.sh.out 2>&1 &
done
for i in $runfiles
grep '^#' $resdir/qemu-cmd | sed -e 's/^# //' > $T/qemu-cmd-settings
. $T/qemu-cmd-settings
-# Decorate qemu-cmd with redirection, backgrounding, and PID capture
-sed -e 's/$/ 2>\&1 \&/' < $resdir/qemu-cmd > $T/qemu-cmd
-echo 'echo $! > $resdir/qemu_pid' >> $T/qemu-cmd
+# Decorate qemu-cmd with affinity, redirection, backgrounding, and PID capture
+taskset_command=
+if test -n "$TORTURE_AFFINITY"
+then
+ taskset_command="taskset -c $TORTURE_AFFINITY "
+fi
+sed -e 's/^[^#].*$/'"$taskset_command"'& 2>\&1 \&/' < $resdir/qemu-cmd > $T/qemu-cmd
+echo 'qemu_pid=$!' >> $T/qemu-cmd
+echo 'echo $qemu_pid > $resdir/qemu-pid' >> $T/qemu-cmd
+echo 'taskset -c -p $qemu_pid > $resdir/qemu-affinity' >> $T/qemu-cmd
# In case qemu refuses to run...
echo "NOTE: $QEMU either did not run or was interactive" > $resdir/console.log
# Attempt to run qemu
kstarttime=`gawk 'BEGIN { print systime() }' < /dev/null`
-( . $T/qemu-cmd; wait `cat $resdir/qemu_pid`; echo $? > $resdir/qemu-retval ) &
+( . $T/qemu-cmd; wait `cat $resdir/qemu-pid`; echo $? > $resdir/qemu-retval ) &
commandcompleted=0
if test -z "$TORTURE_KCONFIG_GDB_ARG"
then
sleep 10 # Give qemu's pid a chance to reach the file
- if test -s "$resdir/qemu_pid"
+ if test -s "$resdir/qemu-pid"
then
- qemu_pid=`cat "$resdir/qemu_pid"`
- echo Monitoring qemu job at pid $qemu_pid
+ qemu_pid=`cat "$resdir/qemu-pid"`
+ echo Monitoring qemu job at pid $qemu_pid `date`
else
qemu_pid=""
- echo Monitoring qemu job at yet-as-unknown pid
+ echo Monitoring qemu job at yet-as-unknown pid `date`
fi
fi
if test -n "$TORTURE_KCONFIG_GDB_ARG"
fi
while :
do
- if test -z "$qemu_pid" -a -s "$resdir/qemu_pid"
+ if test -z "$qemu_pid" && test -s "$resdir/qemu-pid"
then
- qemu_pid=`cat "$resdir/qemu_pid"`
+ qemu_pid=`cat "$resdir/qemu-pid"`
fi
kruntime=`gawk 'BEGIN { print systime() - '"$kstarttime"' }' < /dev/null`
if test -z "$qemu_pid" || kill -0 "$qemu_pid" > /dev/null 2>&1
break
fi
done
-if test -z "$qemu_pid" -a -s "$resdir/qemu_pid"
+if test -z "$qemu_pid" && test -s "$resdir/qemu-pid"
then
- qemu_pid=`cat "$resdir/qemu_pid"`
+ qemu_pid=`cat "$resdir/qemu-pid"`
fi
-if test $commandcompleted -eq 0 -a -n "$qemu_pid"
+if test $commandcompleted -eq 0 && test -n "$qemu_pid"
then
if ! test -f "$resdir/../STOP.1"
then
- echo Grace period for qemu job at pid $qemu_pid
+ echo Grace period for qemu job at pid $qemu_pid `date`
fi
oldline="`tail $resdir/console.log`"
while :
do
if test -f "$resdir/../STOP.1"
then
- echo "PID $qemu_pid killed due to run STOP.1 request" >> $resdir/Warnings 2>&1
+ echo "PID $qemu_pid killed due to run STOP.1 request `date`" >> $resdir/Warnings 2>&1
kill -KILL $qemu_pid
break
fi
then
last_ts=0
fi
- if test "$newline" != "$oldline" -a "$last_ts" -lt $((seconds + $TORTURE_SHUTDOWN_GRACE))
+ if test "$newline" != "$oldline" && test "$last_ts" -lt $((seconds + $TORTURE_SHUTDOWN_GRACE)) && test "$last_ts" -gt "$TORTURE_SHUTDOWN_GRACE"
then
must_continue=yes
+ if test $kruntime -ge $((seconds + $TORTURE_SHUTDOWN_GRACE))
+ then
+ echo Continuing at console.log time $last_ts \"`tail -n 1 $resdir/console.log`\" `date`
+ fi
fi
- if test $must_continue = no -a $kruntime -ge $((seconds + $TORTURE_SHUTDOWN_GRACE))
+ if test $must_continue = no && test $kruntime -ge $((seconds + $TORTURE_SHUTDOWN_GRACE))
then
- echo "!!! PID $qemu_pid hung at $kruntime vs. $seconds seconds" >> $resdir/Warnings 2>&1
+ echo "!!! PID $qemu_pid hung at $kruntime vs. $seconds seconds `date`" >> $resdir/Warnings 2>&1
kill -KILL $qemu_pid
break
fi
# Tell the script that this run is done.
rm -f $resdir/build.run
-
-parse-console.sh $resdir/console.log $title
echo "# TORTURE_JITTER_START=\"$TORTURE_JITTER_START\"" >> $resdir/qemu-cmd
echo "# TORTURE_JITTER_STOP=\"$TORTURE_JITTER_STOP\"" >> $resdir/qemu-cmd
echo "# TORTURE_TRUST_MAKE=\"$TORTURE_TRUST_MAKE\"; export TORTURE_TRUST_MAKE" >> $resdir/qemu-cmd
+echo "# TORTURE_CPU_COUNT=$cpu_count" >> $resdir/qemu-cmd
if test -n "$TORTURE_BUILDONLY"
then
fi
kvm-test-1-run-qemu.sh $resdir
+parse-console.sh $resdir/console.log $title
git diff HEAD >> $resdir/$ds/testid.txt
fi
___EOF___
-awk < $T/cfgcpu.pack \
- -v TORTURE_BUILDONLY="$TORTURE_BUILDONLY" \
- -v CONFIGDIR="$CONFIGFRAG/" \
- -v KVM="$KVM" \
- -v ncpus=$cpus \
- -v jitter="$jitter" \
- -v rd=$resdir/$ds/ \
- -v dur=$dur \
- -v TORTURE_QEMU_ARG="$TORTURE_QEMU_ARG" \
- -v TORTURE_BOOTARGS="$TORTURE_BOOTARGS" \
-'BEGIN {
+kvm-assign-cpus.sh /sys/devices/system/node > $T/cpuarray.awk
+kvm-get-cpus-script.sh $T/cpuarray.awk $T/dumpbatches.awk
+cat << '___EOF___' >> $T/dumpbatches.awk
+BEGIN {
i = 0;
}
}
# Dump out the scripting required to run one test batch.
-function dump(first, pastlast, batchnum)
+function dump(first, pastlast, batchnum, affinitylist)
{
print "echo ----Start batch " batchnum ": `date` | tee -a " rd "log";
print "needqemurun="
print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` | tee -a " rd "log";
print "mkdir " rd cfr[jn] " || :";
print "touch " builddir ".wait";
+ affinitylist = "";
+ if (gotcpus()) {
+ affinitylist = nextcpus(cpusr[jn]);
+ }
+ if (affinitylist ~ /^[0-9,-][0-9,-]*$/)
+ print "export TORTURE_AFFINITY=" affinitylist;
+ else
+ print "export TORTURE_AFFINITY=";
print "kvm-test-1-run.sh " CONFIGDIR cf[j], rd cfr[jn], dur " \"" TORTURE_QEMU_ARG "\" \"" TORTURE_BOOTARGS "\" > " rd cfr[jn] "/kvm-test-1-run.sh.out 2>&1 &"
print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` | tee -a " rd "log";
print "while test -f " builddir ".wait"
# Dump the last batch.
if (ncpus != 0)
dump(first, i, batchnum);
-}' >> $T/script
+}
+___EOF___
+awk < $T/cfgcpu.pack \
+ -v TORTURE_BUILDONLY="$TORTURE_BUILDONLY" \
+ -v CONFIGDIR="$CONFIGFRAG/" \
+ -v KVM="$KVM" \
+ -v ncpus=$cpus \
+ -v jitter="$jitter" \
+ -v rd=$resdir/$ds/ \
+ -v dur=$dur \
+ -v TORTURE_QEMU_ARG="$TORTURE_QEMU_ARG" \
+ -v TORTURE_BOOTARGS="$TORTURE_BOOTARGS" \
+ -f $T/dumpbatches.awk >> $T/script
echo kvm-end-run-stats.sh "$resdir/$ds" "$starttime" >> $T/script
# Extract the tests and their batches from the script.
do_kvfree=yes
do_kasan=yes
do_kcsan=no
+do_clocksourcewd=yes
# doyesno - Helper function for yes/no arguments
function doyesno () {
echo " --configs-scftorture \"config-file list w/ repeat factor (2*CFLIST)\""
echo " --doall"
echo " --doallmodconfig / --do-no-allmodconfig"
+ echo " --do-clocksourcewd / --do-no-clocksourcewd"
echo " --do-kasan / --do-no-kasan"
echo " --do-kcsan / --do-no-kcsan"
echo " --do-kvfree / --do-no-kvfree"
configs_scftorture="$configs_scftorture $2"
shift
;;
- --doall)
+ --do-all|--doall)
do_allmodconfig=yes
do_rcutorture=yes
do_locktorture=yes
do_kvfree=yes
do_kasan=yes
do_kcsan=yes
+ do_clocksourcewd=yes
;;
--do-allmodconfig|--do-no-allmodconfig)
do_allmodconfig=`doyesno "$1" --do-allmodconfig`
;;
+ --do-clocksourcewd|--do-no-clocksourcewd)
+ do_clocksourcewd=`doyesno "$1" --do-clocksourcewd`
+ ;;
--do-kasan|--do-no-kasan)
do_kasan=`doyesno "$1" --do-kasan`
;;
--do-locktorture|--do-no-locktorture)
do_locktorture=`doyesno "$1" --do-locktorture`
;;
- --do-none)
+ --do-none|--donone)
do_allmodconfig=no
do_rcutorture=no
do_locktorture=no
do_kvfree=no
do_kasan=no
do_kcsan=no
+ do_clocksourcewd=no
;;
--do-rcuscale|--do-no-rcuscale)
do_rcuscale=`doyesno "$1" --do-rcuscale`
# torture_bootargs="[ kernel boot arguments ]"
# torture_set flavor [ kvm.sh arguments ]
#
-# Note that "flavor" is an arbitrary string. Supply --torture if needed.
-# Note that quoting is problematic. So on the command line, pass multiple
-# values with multiple kvm.sh argument instances.
+# Note that "flavor" is an arbitrary string that does not affect kvm.sh
+# in any way. So also supply --torture if you need something other than
+# the default.
function torture_set {
local cur_kcsan_kmake_args=
local kcsan_kmake_tag=
torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 10 --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 1G --trust-make
fi
+if test "$do_clocksourcewd" = "yes"
+then
+ torture_bootargs="rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=30000"
+ torture_set "clocksourcewd-1" tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 45s --configs TREE03 --kconfig "CONFIG_TEST_CLOCKSOURCE_WATCHDOG=y" --trust-make
+
+ torture_bootargs="rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=30000 clocksource.max_cswd_read_retries=1"
+ torture_set "clocksourcewd-2" tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 45s --configs TREE03 --kconfig "CONFIG_TEST_CLOCKSOURCE_WATCHDOG=y" --trust-make
+
+ # In case our work is already done...
+ if test "$do_rcutorture" != "yes"
+ then
+ torture_bootargs="rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=30000"
+ torture_set "clocksourcewd-3" tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 45s --configs TREE03 --trust-make
+ fi
+fi
+
echo " --- " $scriptname $args
echo " --- " Done `date` | tee -a $T/log
ret=0
nfailures="`wc -l "$T/failures" | awk '{ print $1 }'`"
ret=2
fi
+if test "$do_kcsan" = "yes"
+then
+ TORTURE_KCONFIG_KCSAN_ARG=1 tools/testing/selftests/rcutorture/bin/kcsan-collapse.sh tools/testing/selftests/rcutorture/res/$ds > tools/testing/selftests/rcutorture/res/$ds/kcsan.sum
+fi
echo Started at $startdate, ended at `date`, duration `get_starttime_duration $starttime`. | tee -a $T/log
echo Summary: Successes: $nsuccesses Failures: $nfailures. | tee -a $T/log
tdir="`cat $T/successes $T/failures | head -1 | awk '{ print $NF }' | sed -e 's,/[^/]\+/*$,,'`"
CONFIG_SMP=y
-CONFIG_NR_CPUS=2
+CONFIG_NR_CPUS=4
CONFIG_HOTPLUG_CPU=y
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_SMP=y
-CONFIG_NR_CPUS=2
+CONFIG_NR_CPUS=4
CONFIG_HOTPLUG_CPU=y
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_SMP=y
-CONFIG_NR_CPUS=2
+CONFIG_NR_CPUS=4
CONFIG_PREEMPT_NONE=n
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=y