Morten Welinder <welinder@troll.com>
Mythri P K <mythripk@ti.com>
Nguyen Anh Quynh <aquynh@gmail.com>
+Nicolas Pitre <nico@fluxnic.net> <nicolas.pitre@linaro.org>
+Nicolas Pitre <nico@fluxnic.net> <nico@linaro.org>
Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Patrick Mochel <mochel@digitalimplant.org>
Paul Burton <paul.burton@mips.com> <paul.burton@imgtec.com>
still doing productive work. As such, time spent in this subset of the
stall state is tracked separately and exported in the "full" averages.
-The ratios are tracked as recent trends over ten, sixty, and three
-hundred second windows, which gives insight into short term events as
-well as medium and long term trends. The total absolute stall time is
-tracked and exported as well, to allow detection of latency spikes
-which wouldn't necessarily make a dent in the time averages, or to
-average trends over custom time frames.
+The ratios (in %) are tracked as recent trends over ten, sixty, and
+three hundred second windows, which gives insight into short term events
+as well as medium and long term trends. The total absolute stall time
+(in us) is tracked and exported as well, to allow detection of latency
+spikes which wouldn't necessarily make a dent in the time averages,
+or to average trends over custom time frames.
Cgroup2 interface
=================
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.
The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
-for this int. For example, a bitfield struct member has: * btf member bit
-offset 100 from the start of the structure, * btf member pointing to an int
-type, * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
+for this int. For example, a bitfield struct member has:
+ * btf member bit offset 100 from the start of the structure,
+ * btf member pointing to an int type,
+ * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
Then in the struct memory layout, this member will occupy ``4`` bits starting
from bits ``100 + 2 = 102``.
Alternatively, the bitfield struct member can be the following to access the
same bits as the above:
-
* btf member bit offset 102,
* btf member pointing to an int type,
* the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``
Optional node properties:
- - ti,mode: Operation mode (see above).
+ - ti,mode: Operation mode (u8) (see above).
Example (operation mode 2):
adc128d818@1d {
compatible = "ti,adc128d818";
reg = <0x1d>;
- ti,mode = <2>;
+ ti,mode = /bits/ 8 <2>;
};
dictionary which is empty, and that it will always be
invalid at this place.
- 17 : bitstream version. If the first byte is 17, the next byte
- gives the bitstream version (version 1 only). If the first byte
- is not 17, the bitstream version is 0.
+ 17 : bitstream version. If the first byte is 17, and compressed
+ stream length is at least 5 bytes (length of shortest possible
+ versioned bitstream), the next byte gives the bitstream version
+ (version 1 only).
+ Otherwise, the bitstream version is 0.
18..21 : copy 0..3 literals
state = (byte - 17) = 0..3 [ copy <state> literals ]
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+
+==================
+BPF Flow Dissector
+==================
+
+Overview
+========
+
+Flow dissector is a routine that parses metadata out of the packets. It's
+used in the various places in the networking subsystem (RFS, flow hash, etc).
+
+BPF flow dissector is an attempt to reimplement C-based flow dissector logic
+in BPF to gain all the benefits of BPF verifier (namely, limits on the
+number of instructions and tail calls).
+
+API
+===
+
+BPF flow dissector programs operate on an ``__sk_buff``. However, only the
+limited set of fields is allowed: ``data``, ``data_end`` and ``flow_keys``.
+``flow_keys`` is ``struct bpf_flow_keys`` and contains flow dissector input
+and output arguments.
+
+The inputs are:
+ * ``nhoff`` - initial offset of the networking header
+ * ``thoff`` - initial offset of the transport header, initialized to nhoff
+ * ``n_proto`` - L3 protocol type, parsed out of L2 header
+
+Flow dissector BPF program should fill out the rest of the ``struct
+bpf_flow_keys`` fields. Input arguments ``nhoff/thoff/n_proto`` should be
+also adjusted accordingly.
+
+The return code of the BPF program is either BPF_OK to indicate successful
+dissection, or BPF_DROP to indicate parsing error.
+
+__sk_buff->data
+===============
+
+In the VLAN-less case, this is what the initial state of the BPF flow
+dissector looks like::
+
+ +------+------+------------+-----------+
+ | DMAC | SMAC | ETHER_TYPE | L3_HEADER |
+ +------+------+------------+-----------+
+ ^
+ |
+ +-- flow dissector starts here
+
+
+.. code:: c
+
+ skb->data + flow_keys->nhoff point to the first byte of L3_HEADER
+ flow_keys->thoff = nhoff
+ flow_keys->n_proto = ETHER_TYPE
+
+In case of VLAN, flow dissector can be called with the two different states.
+
+Pre-VLAN parsing::
+
+ +------+------+------+-----+-----------+-----------+
+ | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+ +------+------+------+-----+-----------+-----------+
+ ^
+ |
+ +-- flow dissector starts here
+
+.. code:: c
+
+ skb->data + flow_keys->nhoff point the to first byte of TCI
+ flow_keys->thoff = nhoff
+ flow_keys->n_proto = TPID
+
+Please note that TPID can be 802.1AD and, hence, BPF program would
+have to parse VLAN information twice for double tagged packets.
+
+
+Post-VLAN parsing::
+
+ +------+------+------+-----+-----------+-----------+
+ | DMAC | SMAC | TPID | TCI |ETHER_TYPE | L3_HEADER |
+ +------+------+------+-----+-----------+-----------+
+ ^
+ |
+ +-- flow dissector starts here
+
+.. code:: c
+
+ skb->data + flow_keys->nhoff point the to first byte of L3_HEADER
+ flow_keys->thoff = nhoff
+ flow_keys->n_proto = ETHER_TYPE
+
+In this case VLAN information has been processed before the flow dissector
+and BPF flow dissector is not required to handle it.
+
+
+The takeaway here is as follows: BPF flow dissector program can be called with
+the optional VLAN header and should gracefully handle both cases: when single
+or double VLAN is present and when it is not present. The same program
+can be called for both cases and would have to be written carefully to
+handle both cases.
+
+
+Reference Implementation
+========================
+
+See ``tools/testing/selftests/bpf/progs/bpf_flow.c`` for the reference
+implementation and ``tools/testing/selftests/bpf/flow_dissector_load.[hc]``
+for the loader. bpftool can be used to load BPF flow dissector program as well.
+
+The reference implementation is organized as follows:
+ * ``jmp_table`` map that contains sub-programs for each supported L3 protocol
+ * ``_dissect`` routine - entry point; it does input ``n_proto`` parsing and
+ does ``bpf_tail_call`` to the appropriate L3 handler
+
+Since BPF at this point doesn't support looping (or any jumping back),
+jmp_table is used instead to handle multiple levels of encapsulation (and
+IPv6 options).
+
+
+Current Limitations
+===================
+BPF flow dissector doesn't support exporting all the metadata that in-kernel
+C-based implementation can export. Notable example is single VLAN (802.1Q)
+and double VLAN (802.1AD) tags. Please refer to the ``struct bpf_flow_keys``
+for a set of information that's currently can be exported from the BPF context.
netdev-FAQ
af_xdp
batman-adv
+ bpf_flow_dissector
can
can_ucan_protocol
device_drivers/freescale/dpaa2/index
ARM/NUVOTON NPCM ARCHITECTURE
M: Avi Fishman <avifishman70@gmail.com>
M: Tomer Maimon <tmaimon77@gmail.com>
+M: Tali Perry <tali.perry1@gmail.com>
R: Patrick Venture <venture@google.com>
R: Nancy Yuen <yuenn@google.com>
-R: Brendan Higgins <brendanhiggins@google.com>
+R: Benjamin Fair <benjaminfair@google.com>
L: openbmc@lists.ozlabs.org (moderated for non-subscribers)
S: Supported
F: arch/arm/mach-npcm/
F: arch/arm/boot/dts/nuvoton-npcm*
-F: include/dt-bindings/clock/nuvoton,npcm7xx-clks.h
+F: include/dt-bindings/clock/nuvoton,npcm7xx-clock.h
F: drivers/*/*npcm*
F: Documentation/devicetree/bindings/*/*npcm*
F: Documentation/devicetree/bindings/*/*/*npcm*
F: include/linux/cpuidle.h
CRAMFS FILESYSTEM
-M: Nicolas Pitre <nico@linaro.org>
+M: Nicolas Pitre <nico@fluxnic.net>
S: Maintained
F: Documentation/filesystems/cramfs.txt
F: fs/cramfs/
S: Maintained
F: Documentation/ABI/testing/sysfs-bus-mdio
F: Documentation/devicetree/bindings/net/mdio*
-F: Documentation/networking/phy.txt
+F: Documentation/networking/phy.rst
F: drivers/net/phy/
F: drivers/of/of_mdio.c
F: drivers/of/of_net.c
SFC NETWORK DRIVER
M: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
M: Edward Cree <ecree@solarflare.com>
-M: Bert Kenward <bkenward@solarflare.com>
+M: Martin Habets <mhabets@solarflare.com>
L: netdev@vger.kernel.org
S: Supported
F: drivers/net/ethernet/sfc/
*/
static inline void
syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, unsigned long *args)
+ unsigned long *args)
{
unsigned long *inside_ptregs = &(regs->r0);
- inside_ptregs -= i;
-
- BUG_ON((i + n) > 6);
+ unsigned int n = 6;
+ unsigned int i = 0;
while (n--) {
args[i++] = (*inside_ptregs);
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- if (n == 0)
- return;
-
- if (i + n > SYSCALL_MAX_ARGS) {
- unsigned long *args_bad = args + SYSCALL_MAX_ARGS - i;
- unsigned int n_bad = n + i - SYSCALL_MAX_ARGS;
- pr_warn("%s called with max args %d, handling only %d\n",
- __func__, i + n, SYSCALL_MAX_ARGS);
- memset(args_bad, 0, n_bad * sizeof(args[0]));
- n = SYSCALL_MAX_ARGS - i;
- }
-
- if (i == 0) {
- args[0] = regs->ARM_ORIG_r0;
- args++;
- i++;
- n--;
- }
-
- memcpy(args, ®s->ARM_r0 + i, n * sizeof(args[0]));
+ args[0] = regs->ARM_ORIG_r0;
+ args++;
+
+ memcpy(args, ®s->ARM_r0 + 1, 5 * sizeof(args[0]));
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- if (n == 0)
- return;
-
- if (i + n > SYSCALL_MAX_ARGS) {
- pr_warn("%s called with max args %d, handling only %d\n",
- __func__, i + n, SYSCALL_MAX_ARGS);
- n = SYSCALL_MAX_ARGS - i;
- }
-
- if (i == 0) {
- regs->ARM_ORIG_r0 = args[0];
- args++;
- i++;
- n--;
- }
-
- memcpy(®s->ARM_r0 + i, args, n * sizeof(args[0]));
+ regs->ARM_ORIG_r0 = args[0];
+ args++;
+
+ memcpy(®s->ARM_r0 + 1, args, 5 * sizeof(args[0]));
}
static inline int syscall_get_arch(void)
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- if (n == 0)
- return;
-
- if (i + n > SYSCALL_MAX_ARGS) {
- unsigned long *args_bad = args + SYSCALL_MAX_ARGS - i;
- unsigned int n_bad = n + i - SYSCALL_MAX_ARGS;
- pr_warning("%s called with max args %d, handling only %d\n",
- __func__, i + n, SYSCALL_MAX_ARGS);
- memset(args_bad, 0, n_bad * sizeof(args[0]));
- }
-
- if (i == 0) {
- args[0] = regs->orig_x0;
- args++;
- i++;
- n--;
- }
-
- memcpy(args, ®s->regs[i], n * sizeof(args[0]));
+ args[0] = regs->orig_x0;
+ args++;
+
+ memcpy(args, ®s->regs[1], 5 * sizeof(args[0]));
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- if (n == 0)
- return;
-
- if (i + n > SYSCALL_MAX_ARGS) {
- pr_warning("%s called with max args %d, handling only %d\n",
- __func__, i + n, SYSCALL_MAX_ARGS);
- n = SYSCALL_MAX_ARGS - i;
- }
-
- if (i == 0) {
- regs->orig_x0 = args[0];
- args++;
- i++;
- n--;
- }
-
- memcpy(®s->regs[i], args, n * sizeof(args[0]));
+ regs->orig_x0 = args[0];
+ args++;
+
+ memcpy(®s->regs[1], args, 5 * sizeof(args[0]));
}
/*
unsigned long low = (unsigned long)raw_cpu_read(sdei_stack_normal_ptr);
unsigned long high = low + SDEI_STACK_SIZE;
+ if (!low)
+ return false;
+
if (sp < low || sp >= high)
return false;
unsigned long low = (unsigned long)raw_cpu_read(sdei_stack_critical_ptr);
unsigned long high = low + SDEI_STACK_SIZE;
+ if (!low)
+ return false;
+
if (sp < low || sp >= high)
return false;
}
static inline void syscall_get_arguments(struct task_struct *task,
- struct pt_regs *regs, unsigned int i,
- unsigned int n, unsigned long *args)
+ struct pt_regs *regs,
+ unsigned long *args)
{
- switch (i) {
- case 0:
- if (!n--)
- break;
- *args++ = regs->a4;
- case 1:
- if (!n--)
- break;
- *args++ = regs->b4;
- case 2:
- if (!n--)
- break;
- *args++ = regs->a6;
- case 3:
- if (!n--)
- break;
- *args++ = regs->b6;
- case 4:
- if (!n--)
- break;
- *args++ = regs->a8;
- case 5:
- if (!n--)
- break;
- *args++ = regs->b8;
- case 6:
- if (!n--)
- break;
- default:
- BUG();
- }
+ *args++ = regs->a4;
+ *args++ = regs->b4;
+ *args++ = regs->a6;
+ *args++ = regs->b6;
+ *args++ = regs->a8;
+ *args = regs->b8;
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- switch (i) {
- case 0:
- if (!n--)
- break;
- regs->a4 = *args++;
- case 1:
- if (!n--)
- break;
- regs->b4 = *args++;
- case 2:
- if (!n--)
- break;
- regs->a6 = *args++;
- case 3:
- if (!n--)
- break;
- regs->b6 = *args++;
- case 4:
- if (!n--)
- break;
- regs->a8 = *args++;
- case 5:
- if (!n--)
- break;
- regs->a9 = *args++;
- case 6:
- if (!n)
- break;
- default:
- BUG();
- }
+ regs->a4 = *args++;
+ regs->b4 = *args++;
+ regs->a6 = *args++;
+ regs->b6 = *args++;
+ regs->a8 = *args++;
+ regs->a9 = *args;
}
#endif /* __ASM_C6X_SYSCALLS_H */
static inline void
syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, unsigned long *args)
+ unsigned long *args)
{
- BUG_ON(i + n > 6);
- if (i == 0) {
- args[0] = regs->orig_a0;
- args++;
- i++;
- n--;
- }
- memcpy(args, ®s->a1 + i * sizeof(regs->a1), n * sizeof(args[0]));
+ args[0] = regs->orig_a0;
+ args++;
+ memcpy(args, ®s->a1, 5 * sizeof(args[0]));
}
static inline void
syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, const unsigned long *args)
+ const unsigned long *args)
{
- BUG_ON(i + n > 6);
- if (i == 0) {
- regs->orig_a0 = args[0];
- args++;
- i++;
- n--;
- }
- memcpy(®s->a1 + i * sizeof(regs->a1), args, n * sizeof(regs->a0));
+ regs->orig_a0 = args[0];
+ args++;
+ memcpy(®s->a1, args, 5 * sizeof(regs->a1));
}
static inline int
static inline void
syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, unsigned long *args)
+ unsigned long *args)
{
- BUG_ON(i + n > 6);
-
- while (n > 0) {
- switch (i) {
- case 0:
- *args++ = regs->er1;
- break;
- case 1:
- *args++ = regs->er2;
- break;
- case 2:
- *args++ = regs->er3;
- break;
- case 3:
- *args++ = regs->er4;
- break;
- case 4:
- *args++ = regs->er5;
- break;
- case 5:
- *args++ = regs->er6;
- break;
- }
- i++;
- n--;
- }
+ *args++ = regs->er1;
+ *args++ = regs->er2;
+ *args++ = regs->er3;
+ *args++ = regs->er4;
+ *args++ = regs->er5;
+ *args = regs->er6;
}
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- BUG_ON(i + n > 6);
- memcpy(args, &(®s->r00)[i], n * sizeof(args[0]));
+ memcpy(args, &(®s->r00)[0], 6 * sizeof(args[0]));
}
#endif
}
extern void ia64_syscall_get_set_arguments(struct task_struct *task,
- struct pt_regs *regs, unsigned int i, unsigned int n,
- unsigned long *args, int rw);
+ struct pt_regs *regs, unsigned long *args, int rw);
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- BUG_ON(i + n > 6);
-
- ia64_syscall_get_set_arguments(task, regs, i, n, args, 0);
+ ia64_syscall_get_set_arguments(task, regs, args, 0);
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- BUG_ON(i + n > 6);
-
- ia64_syscall_get_set_arguments(task, regs, i, n, args, 1);
+ ia64_syscall_get_set_arguments(task, regs, args, 1);
}
static inline int syscall_get_arch(void)
}
void ia64_syscall_get_set_arguments(struct task_struct *task,
- struct pt_regs *regs, unsigned int i, unsigned int n,
- unsigned long *args, int rw)
+ struct pt_regs *regs, unsigned long *args, int rw)
{
struct syscall_get_set_args data = {
- .i = i,
- .n = n,
+ .i = 0,
+ .n = 6,
.args = args,
.regs = regs,
.rw = rw,
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
+ unsigned int i = 0;
+ unsigned int n = 6;
+
while (n--)
*args++ = microblaze_get_syscall_arg(regs, i++);
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
+ unsigned int i = 0;
+ unsigned int n = 6;
+
while (n--)
microblaze_set_syscall_arg(regs, i++, *args++);
}
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
+ unsigned int i = 0;
+ unsigned int n = 6;
int ret;
/* O32 ABI syscall() */
sd.nr = syscall;
sd.arch = syscall_get_arch();
- syscall_get_arguments(current, regs, 0, 6, args);
+ syscall_get_arguments(current, regs, args);
for (i = 0; i < 6; i++)
sd.args[i] = args[i];
sd.instruction_pointer = KSTK_EIP(current);
* syscall_get_arguments - extract system call parameter values
* @task: task of interest, must be blocked
* @regs: task_pt_regs() of @task
- * @i: argument index [0,5]
- * @n: number of arguments; n+i must be [1,6].
* @args: array filled with argument values
*
- * Fetches @n arguments to the system call starting with the @i'th argument
- * (from 0 through 5). Argument @i is stored in @args[0], and so on.
- * An arch inline version is probably optimal when @i and @n are constants.
+ * Fetches 6 arguments to the system call (from 0 through 5). The first
+ * argument is stored in @args[0], and so on.
*
* It's only valid to call this when @task is stopped for tracing on
* entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
- * It's invalid to call this with @i + @n > 6; we only support system calls
- * taking up to 6 arguments.
*/
#define SYSCALL_MAX_ARGS 6
void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, unsigned long *args)
+ unsigned long *args)
{
- if (n == 0)
- return;
- if (i + n > SYSCALL_MAX_ARGS) {
- unsigned long *args_bad = args + SYSCALL_MAX_ARGS - i;
- unsigned int n_bad = n + i - SYSCALL_MAX_ARGS;
- pr_warning("%s called with max args %d, handling only %d\n",
- __func__, i + n, SYSCALL_MAX_ARGS);
- memset(args_bad, 0, n_bad * sizeof(args[0]));
- memset(args_bad, 0, n_bad * sizeof(args[0]));
- }
-
- if (i == 0) {
- args[0] = regs->orig_r0;
- args++;
- i++;
- n--;
- }
-
- memcpy(args, ®s->uregs[0] + i, n * sizeof(args[0]));
+ args[0] = regs->orig_r0;
+ args++;
+ memcpy(args, ®s->uregs[0] + 1, 5 * sizeof(args[0]));
}
/**
* syscall_set_arguments - change system call parameter value
* @task: task of interest, must be in system call entry tracing
* @regs: task_pt_regs() of @task
- * @i: argument index [0,5]
- * @n: number of arguments; n+i must be [1,6].
* @args: array of argument values to store
*
- * Changes @n arguments to the system call starting with the @i'th argument.
- * Argument @i gets value @args[0], and so on.
- * An arch inline version is probably optimal when @i and @n are constants.
+ * Changes 6 arguments to the system call. The first argument gets value
+ * @args[0], and so on.
*
* It's only valid to call this when @task is stopped for tracing on
* entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
- * It's invalid to call this with @i + @n > 6; we only support system calls
- * taking up to 6 arguments.
*/
void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- if (n == 0)
- return;
-
- if (i + n > SYSCALL_MAX_ARGS) {
- pr_warn("%s called with max args %d, handling only %d\n",
- __func__, i + n, SYSCALL_MAX_ARGS);
- n = SYSCALL_MAX_ARGS - i;
- }
-
- if (i == 0) {
- regs->orig_r0 = args[0];
- args++;
- i++;
- n--;
- }
+ regs->orig_r0 = args[0];
+ args++;
- memcpy(®s->uregs[0] + i, args, n * sizeof(args[0]));
+ memcpy(®s->uregs[0] + 1, args, 5 * sizeof(args[0]));
}
#endif /* _ASM_NDS32_SYSCALL_H */
}
static inline void syscall_get_arguments(struct task_struct *task,
- struct pt_regs *regs, unsigned int i, unsigned int n,
- unsigned long *args)
+ struct pt_regs *regs, unsigned long *args)
{
- BUG_ON(i + n > 6);
-
- switch (i) {
- case 0:
- if (!n--)
- break;
- *args++ = regs->r4;
- case 1:
- if (!n--)
- break;
- *args++ = regs->r5;
- case 2:
- if (!n--)
- break;
- *args++ = regs->r6;
- case 3:
- if (!n--)
- break;
- *args++ = regs->r7;
- case 4:
- if (!n--)
- break;
- *args++ = regs->r8;
- case 5:
- if (!n--)
- break;
- *args++ = regs->r9;
- case 6:
- if (!n--)
- break;
- default:
- BUG();
- }
+ *args++ = regs->r4;
+ *args++ = regs->r5;
+ *args++ = regs->r6;
+ *args++ = regs->r7;
+ *args++ = regs->r8;
+ *args = regs->r9;
}
static inline void syscall_set_arguments(struct task_struct *task,
- struct pt_regs *regs, unsigned int i, unsigned int n,
- const unsigned long *args)
+ struct pt_regs *regs, const unsigned long *args)
{
- BUG_ON(i + n > 6);
-
- switch (i) {
- case 0:
- if (!n--)
- break;
- regs->r4 = *args++;
- case 1:
- if (!n--)
- break;
- regs->r5 = *args++;
- case 2:
- if (!n--)
- break;
- regs->r6 = *args++;
- case 3:
- if (!n--)
- break;
- regs->r7 = *args++;
- case 4:
- if (!n--)
- break;
- regs->r8 = *args++;
- case 5:
- if (!n--)
- break;
- regs->r9 = *args++;
- case 6:
- if (!n)
- break;
- default:
- BUG();
- }
+ regs->r4 = *args++;
+ regs->r5 = *args++;
+ regs->r6 = *args++;
+ regs->r7 = *args++;
+ regs->r8 = *args++;
+ regs->r9 = *args;
}
#endif
static inline void
syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, unsigned long *args)
+ unsigned long *args)
{
- BUG_ON(i + n > 6);
-
- memcpy(args, ®s->gpr[3 + i], n * sizeof(args[0]));
+ memcpy(args, ®s->gpr[3], 6 * sizeof(args[0]));
}
static inline void
syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, const unsigned long *args)
+ const unsigned long *args)
{
- BUG_ON(i + n > 6);
-
- memcpy(®s->gpr[3 + i], args, n * sizeof(args[0]));
+ memcpy(®s->gpr[3], args, 6 * sizeof(args[0]));
}
static inline int syscall_get_arch(void)
static inline unsigned long regs_return_value(struct pt_regs *regs)
{
- return regs->gr[20];
+ return regs->gr[28];
}
static inline void instruction_pointer_set(struct pt_regs *regs,
unsigned long val)
{
- regs->iaoq[0] = val;
+ regs->iaoq[0] = val;
+ regs->iaoq[1] = val + 4;
}
/* Query offset/name of register from its name/offset */
}
static inline void syscall_get_arguments(struct task_struct *tsk,
- struct pt_regs *regs, unsigned int i,
- unsigned int n, unsigned long *args)
+ struct pt_regs *regs,
+ unsigned long *args)
{
- BUG_ON(i);
-
- switch (n) {
- case 6:
- args[5] = regs->gr[21];
- case 5:
- args[4] = regs->gr[22];
- case 4:
- args[3] = regs->gr[23];
- case 3:
- args[2] = regs->gr[24];
- case 2:
- args[1] = regs->gr[25];
- case 1:
- args[0] = regs->gr[26];
- case 0:
- break;
- default:
- BUG();
- }
+ args[5] = regs->gr[21];
+ args[4] = regs->gr[22];
+ args[3] = regs->gr[23];
+ args[2] = regs->gr[24];
+ args[1] = regs->gr[25];
+ args[0] = regs->gr[26];
}
static inline long syscall_get_return_value(struct task_struct *task,
static int __init parisc_idle_init(void)
{
- const char *marker;
-
- /* check QEMU/SeaBIOS marker in PAGE0 */
- marker = (char *) &PAGE0->pad0;
- running_on_qemu = (memcmp(marker, "SeaBIOS", 8) == 0);
-
if (!running_on_qemu)
cpu_idle_poll_ctrl(1);
int ret, cpunum;
struct pdc_coproc_cfg coproc_cfg;
+ /* check QEMU/SeaBIOS marker in PAGE0 */
+ running_on_qemu = (memcmp(&PAGE0->pad0, "SeaBIOS", 8) == 0);
+
cpunum = smp_processor_id();
init_cpu_topology();
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
unsigned long val, mask = -1UL;
-
- BUG_ON(i + n > 6);
+ unsigned int n = 6;
#ifdef CONFIG_COMPAT
if (test_tsk_thread_flag(task, TIF_32BIT))
mask = 0xffffffff;
#endif
while (n--) {
- if (n == 0 && i == 0)
+ if (n == 0)
val = regs->orig_gpr3;
else
- val = regs->gpr[3 + i + n];
+ val = regs->gpr[3 + n];
args[n] = val & mask;
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- BUG_ON(i + n > 6);
- memcpy(®s->gpr[3 + i], args, n * sizeof(args[0]));
+ memcpy(®s->gpr[3], args, 6 * sizeof(args[0]));
/* Also copy the first argument into orig_gpr3 */
- if (i == 0 && n > 0)
- regs->orig_gpr3 = args[0];
+ regs->orig_gpr3 = args[0];
}
static inline int syscall_get_arch(void)
#include <linux/kvm_host.h>
#include <linux/init.h>
#include <linux/export.h>
+#include <linux/kmemleak.h>
#include <linux/kvm_para.h>
#include <linux/slab.h>
#include <linux/of.h>
static __init void kvm_free_tmp(void)
{
+ /*
+ * Inform kmemleak about the hole in the .bss section since the
+ * corresponding pages will be unmapped with DEBUG_PAGEALLOC=y.
+ */
+ kmemleak_free_part(&kvm_tmp[kvm_tmp_index],
+ ARRAY_SIZE(kvm_tmp) - kvm_tmp_index);
free_reserved_area(&kvm_tmp[kvm_tmp_index],
&kvm_tmp[ARRAY_SIZE(kvm_tmp)], -1, NULL);
}
};
#define FIXADDR_SIZE (__end_of_fixed_addresses * PAGE_SIZE)
-#define FIXADDR_TOP (PAGE_OFFSET)
+#define FIXADDR_TOP (VMALLOC_START)
#define FIXADDR_START (FIXADDR_TOP - FIXADDR_SIZE)
#define FIXMAP_PAGE_IO PAGE_KERNEL
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- BUG_ON(i + n > 6);
- if (i == 0) {
- args[0] = regs->orig_a0;
- args++;
- i++;
- n--;
- }
- memcpy(args, ®s->a1 + i * sizeof(regs->a1), n * sizeof(args[0]));
+ args[0] = regs->orig_a0;
+ args++;
+ memcpy(args, ®s->a1, 5 * sizeof(args[0]));
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- BUG_ON(i + n > 6);
- if (i == 0) {
- regs->orig_a0 = args[0];
- args++;
- i++;
- n--;
- }
- memcpy(®s->a1 + i * sizeof(regs->a1), args, n * sizeof(regs->a0));
+ regs->orig_a0 = args[0];
+ args++;
+ memcpy(®s->a1, args, 5 * sizeof(regs->a1));
}
static inline int syscall_get_arch(void)
" .balign 4\n" \
"4:\n" \
" li %0, %6\n" \
- " jump 2b, %1\n" \
+ " jump 3b, %1\n" \
" .previous\n" \
" .section __ex_table,\"a\"\n" \
" .balign " RISCV_SZPTR "\n" \
ifdef CONFIG_FTRACE
CFLAGS_REMOVE_ftrace.o = -pg
-CFLAGS_REMOVE_setup.o = -pg
endif
extra-y += head.o
obj-y += cacheinfo.o
obj-y += vdso/
-CFLAGS_setup.o := -mcmodel=medany
-
obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_SMP) += smpboot.o
obj-$(CONFIG_SMP) += smp.o
{
s32 hi20;
- if (IS_ENABLED(CMODEL_MEDLOW)) {
+ if (IS_ENABLED(CONFIG_CMODEL_MEDLOW)) {
pr_err(
"%s: target %016llx can not be addressed by the 32-bit offset from PC = %p\n",
me->name, (long long)v, location);
};
#endif
-unsigned long va_pa_offset;
-EXPORT_SYMBOL(va_pa_offset);
-unsigned long pfn_base;
-EXPORT_SYMBOL(pfn_base);
-
-unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)] __page_aligned_bss;
-EXPORT_SYMBOL(empty_zero_page);
-
/* The lucky hart to first increment this variable will boot the other cores */
atomic_t hart_lottery;
unsigned long boot_cpu_hartid;
+
+CFLAGS_init.o := -mcmodel=medany
+ifdef CONFIG_FTRACE
+CFLAGS_REMOVE_init.o = -pg
+endif
+
obj-y += init.o
obj-y += fault.o
obj-y += extable.o
#include <asm/pgtable.h>
#include <asm/io.h>
+unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
+ __page_aligned_bss;
+EXPORT_SYMBOL(empty_zero_page);
+
static void __init zone_sizes_init(void)
{
unsigned long max_zone_pfns[MAX_NR_ZONES] = { 0, };
}
}
+unsigned long va_pa_offset;
+EXPORT_SYMBOL(va_pa_offset);
+unsigned long pfn_base;
+EXPORT_SYMBOL(pfn_base);
+
pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
pgd_t trampoline_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
}
}
+/*
+ * setup_vm() is called from head.S with MMU-off.
+ *
+ * Following requirements should be honoured for setup_vm() to work
+ * correctly:
+ * 1) It should use PC-relative addressing for accessing kernel symbols.
+ * To achieve this we always use GCC cmodel=medany.
+ * 2) The compiler instrumentation for FTRACE will not work for setup_vm()
+ * so disable compiler instrumentation when FTRACE is enabled.
+ *
+ * Currently, the above requirements are honoured by using custom CFLAGS
+ * for init.o in mm/Makefile.
+ */
+
+#ifndef __riscv_cmodel_medany
+#error "setup_vm() is called from head.S before relocate so it should "
+ "not use absolute addressing."
+#endif
+
asmlinkage void __init setup_vm(void)
{
extern char _start;
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
unsigned long mask = -1UL;
+ unsigned int n = 6;
- /*
- * No arguments for this syscall, there's nothing to do.
- */
- if (!n)
- return;
-
- BUG_ON(i + n > 6);
#ifdef CONFIG_COMPAT
if (test_tsk_thread_flag(task, TIF_31BIT))
mask = 0xffffffff;
#endif
while (n-- > 0)
- if (i + n > 0)
- args[n] = regs->gprs[2 + i + n] & mask;
- if (i == 0)
- args[0] = regs->orig_gpr2 & mask;
+ if (n > 0)
+ args[n] = regs->gprs[2 + n] & mask;
+
+ args[0] = regs->orig_gpr2 & mask;
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- BUG_ON(i + n > 6);
+ unsigned int n = 6;
+
while (n-- > 0)
- if (i + n > 0)
- regs->gprs[2 + i + n] = args[n];
- if (i == 0)
- regs->orig_gpr2 = args[0];
+ if (n > 0)
+ regs->gprs[2 + n] = args[n];
+ regs->orig_gpr2 = args[0];
}
static inline int syscall_get_arch(void)
struct sh_clk_ops;
-void __init arch_init_clk_ops(struct sh_clk_ops **ops, int idx)
+void __init __weak arch_init_clk_ops(struct sh_clk_ops **ops, int idx)
{
}
-void __init plat_irq_setup(void)
+void __init __weak plat_irq_setup(void)
{
}
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- /*
- * Do this simply for now. If we need to start supporting
- * fetching arguments from arbitrary indices, this will need some
- * extra logic. Presently there are no in-tree users that depend
- * on this behaviour.
- */
- BUG_ON(i);
/* Argument pattern is: R4, R5, R6, R7, R0, R1 */
- switch (n) {
- case 6: args[5] = regs->regs[1];
- case 5: args[4] = regs->regs[0];
- case 4: args[3] = regs->regs[7];
- case 3: args[2] = regs->regs[6];
- case 2: args[1] = regs->regs[5];
- case 1: args[0] = regs->regs[4];
- case 0:
- break;
- default:
- BUG();
- }
+ args[5] = regs->regs[1];
+ args[4] = regs->regs[0];
+ args[3] = regs->regs[7];
+ args[2] = regs->regs[6];
+ args[1] = regs->regs[5];
+ args[0] = regs->regs[4];
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- /* Same note as above applies */
- BUG_ON(i);
-
- switch (n) {
- case 6: regs->regs[1] = args[5];
- case 5: regs->regs[0] = args[4];
- case 4: regs->regs[7] = args[3];
- case 3: regs->regs[6] = args[2];
- case 2: regs->regs[5] = args[1];
- case 1: regs->regs[4] = args[0];
- break;
- default:
- BUG();
- }
+ regs->regs[1] = args[5];
+ regs->regs[0] = args[4];
+ regs->regs[7] = args[3];
+ regs->regs[6] = args[2];
+ regs->regs[5] = args[1];
+ regs->regs[4] = args[0];
}
static inline int syscall_get_arch(void)
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- BUG_ON(i + n > 6);
- memcpy(args, ®s->regs[2 + i], n * sizeof(args[0]));
+ memcpy(args, ®s->regs[2], 6 * sizeof(args[0]));
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- BUG_ON(i + n > 6);
- memcpy(®s->regs[2 + i], args, n * sizeof(args[0]));
+ memcpy(®s->regs[2], args, 6 * sizeof(args[0]));
}
static inline int syscall_get_arch(void)
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
int zero_extend = 0;
unsigned int j;
+ unsigned int n = 6;
#ifdef CONFIG_SPARC64
if (test_tsk_thread_flag(task, TIF_32BIT))
#endif
for (j = 0; j < n; j++) {
- unsigned long val = regs->u_regs[UREG_I0 + i + j];
+ unsigned long val = regs->u_regs[UREG_I0 + j];
if (zero_extend)
args[j] = (u32) val;
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
- unsigned int j;
+ unsigned int i;
- for (j = 0; j < n; j++)
- regs->u_regs[UREG_I0 + i + j] = args[j];
+ for (i = 0; i < 6; i++)
+ regs->u_regs[UREG_I0 + i] = args[i];
}
static inline int syscall_get_arch(void)
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
const struct uml_pt_regs *r = ®s->regs;
- switch (i) {
- case 0:
- if (!n--)
- break;
- *args++ = UPT_SYSCALL_ARG1(r);
- case 1:
- if (!n--)
- break;
- *args++ = UPT_SYSCALL_ARG2(r);
- case 2:
- if (!n--)
- break;
- *args++ = UPT_SYSCALL_ARG3(r);
- case 3:
- if (!n--)
- break;
- *args++ = UPT_SYSCALL_ARG4(r);
- case 4:
- if (!n--)
- break;
- *args++ = UPT_SYSCALL_ARG5(r);
- case 5:
- if (!n--)
- break;
- *args++ = UPT_SYSCALL_ARG6(r);
- case 6:
- if (!n--)
- break;
- default:
- BUG();
- break;
- }
+ *args++ = UPT_SYSCALL_ARG1(r);
+ *args++ = UPT_SYSCALL_ARG2(r);
+ *args++ = UPT_SYSCALL_ARG3(r);
+ *args++ = UPT_SYSCALL_ARG4(r);
+ *args++ = UPT_SYSCALL_ARG5(r);
+ *args = UPT_SYSCALL_ARG6(r);
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
struct uml_pt_regs *r = ®s->regs;
- switch (i) {
- case 0:
- if (!n--)
- break;
- UPT_SYSCALL_ARG1(r) = *args++;
- case 1:
- if (!n--)
- break;
- UPT_SYSCALL_ARG2(r) = *args++;
- case 2:
- if (!n--)
- break;
- UPT_SYSCALL_ARG3(r) = *args++;
- case 3:
- if (!n--)
- break;
- UPT_SYSCALL_ARG4(r) = *args++;
- case 4:
- if (!n--)
- break;
- UPT_SYSCALL_ARG5(r) = *args++;
- case 5:
- if (!n--)
- break;
- UPT_SYSCALL_ARG6(r) = *args++;
- case 6:
- if (!n--)
- break;
- default:
- BUG();
- break;
- }
+ UPT_SYSCALL_ARG1(r) = *args++;
+ UPT_SYSCALL_ARG2(r) = *args++;
+ UPT_SYSCALL_ARG3(r) = *args++;
+ UPT_SYSCALL_ARG4(r) = *args++;
+ UPT_SYSCALL_ARG5(r) = *args++;
+ UPT_SYSCALL_ARG6(r) = *args;
}
/* See arch/x86/um/asm/syscall.h for syscall_get_arch() definition. */
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
- BUG_ON(i + n > 6);
- memcpy(args, ®s->bx + i, n * sizeof(args[0]));
+ memcpy(args, ®s->bx, 6 * sizeof(args[0]));
}
static inline void syscall_set_arguments(struct task_struct *task,
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
# ifdef CONFIG_IA32_EMULATION
- if (task->thread_info.status & TS_COMPAT)
- switch (i) {
- case 0:
- if (!n--) break;
- *args++ = regs->bx;
- case 1:
- if (!n--) break;
- *args++ = regs->cx;
- case 2:
- if (!n--) break;
- *args++ = regs->dx;
- case 3:
- if (!n--) break;
- *args++ = regs->si;
- case 4:
- if (!n--) break;
- *args++ = regs->di;
- case 5:
- if (!n--) break;
- *args++ = regs->bp;
- case 6:
- if (!n--) break;
- default:
- BUG();
- break;
- }
- else
+ if (task->thread_info.status & TS_COMPAT) {
+ *args++ = regs->bx;
+ *args++ = regs->cx;
+ *args++ = regs->dx;
+ *args++ = regs->si;
+ *args++ = regs->di;
+ *args = regs->bp;
+ } else
# endif
- switch (i) {
- case 0:
- if (!n--) break;
- *args++ = regs->di;
- case 1:
- if (!n--) break;
- *args++ = regs->si;
- case 2:
- if (!n--) break;
- *args++ = regs->dx;
- case 3:
- if (!n--) break;
- *args++ = regs->r10;
- case 4:
- if (!n--) break;
- *args++ = regs->r8;
- case 5:
- if (!n--) break;
- *args++ = regs->r9;
- case 6:
- if (!n--) break;
- default:
- BUG();
- break;
- }
+ {
+ *args++ = regs->di;
+ *args++ = regs->si;
+ *args++ = regs->dx;
+ *args++ = regs->r10;
+ *args++ = regs->r8;
+ *args = regs->r9;
+ }
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
# ifdef CONFIG_IA32_EMULATION
- if (task->thread_info.status & TS_COMPAT)
- switch (i) {
- case 0:
- if (!n--) break;
- regs->bx = *args++;
- case 1:
- if (!n--) break;
- regs->cx = *args++;
- case 2:
- if (!n--) break;
- regs->dx = *args++;
- case 3:
- if (!n--) break;
- regs->si = *args++;
- case 4:
- if (!n--) break;
- regs->di = *args++;
- case 5:
- if (!n--) break;
- regs->bp = *args++;
- case 6:
- if (!n--) break;
- default:
- BUG();
- break;
- }
- else
+ if (task->thread_info.status & TS_COMPAT) {
+ regs->bx = *args++;
+ regs->cx = *args++;
+ regs->dx = *args++;
+ regs->si = *args++;
+ regs->di = *args++;
+ regs->bp = *args;
+ } else
# endif
- switch (i) {
- case 0:
- if (!n--) break;
- regs->di = *args++;
- case 1:
- if (!n--) break;
- regs->si = *args++;
- case 2:
- if (!n--) break;
- regs->dx = *args++;
- case 3:
- if (!n--) break;
- regs->r10 = *args++;
- case 4:
- if (!n--) break;
- regs->r8 = *args++;
- case 5:
- if (!n--) break;
- regs->r9 = *args++;
- case 6:
- if (!n--) break;
- default:
- BUG();
- break;
- }
+ {
+ regs->di = *args++;
+ regs->si = *args++;
+ regs->dx = *args++;
+ regs->r10 = *args++;
+ regs->r8 = *args++;
+ regs->r9 = *args;
+ }
}
static inline int syscall_get_arch(void)
__HYPERCALL_DECLS;
__HYPERCALL_5ARG(a1, a2, a3, a4, a5);
+ if (call >= PAGE_SIZE / sizeof(hypercall_page[0]))
+ return -EINVAL;
+
asm volatile(CALL_NOSPEC
: __HYPERCALL_5PARAM
: [thunk_target] "a" (&hypercall_page[call])
return ret;
}
-static int get_num_contig_pages(int idx, struct page **inpages,
- unsigned long npages)
+static unsigned long get_num_contig_pages(unsigned long idx,
+ struct page **inpages, unsigned long npages)
{
unsigned long paddr, next_paddr;
- int i = idx + 1, pages = 1;
+ unsigned long i = idx + 1, pages = 1;
/* find the number of contiguous pages starting from idx */
paddr = __sme_page_pa(inpages[idx]);
static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
{
- unsigned long vaddr, vaddr_end, next_vaddr, npages, size;
+ unsigned long vaddr, vaddr_end, next_vaddr, npages, pages, size, i;
struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
struct kvm_sev_launch_update_data params;
struct sev_data_launch_update_data *data;
struct page **inpages;
- int i, ret, pages;
+ int ret;
if (!sev_guest(kvm))
return -ENOTTY;
struct page **src_p, **dst_p;
struct kvm_sev_dbg debug;
unsigned long n;
- int ret, size;
+ unsigned int size;
+ int ret;
if (!sev_guest(kvm))
return -ENOTTY;
if (copy_from_user(&debug, (void __user *)(uintptr_t)argp->data, sizeof(debug)))
return -EFAULT;
+ if (!debug.len || debug.src_uaddr + debug.len < debug.src_uaddr)
+ return -EINVAL;
+ if (!debug.dst_uaddr)
+ return -EINVAL;
+
vaddr = debug.src_uaddr;
size = debug.len;
vaddr_end = vaddr + size;
dst_vaddr,
len, &argp->error);
- sev_unpin_memory(kvm, src_p, 1);
- sev_unpin_memory(kvm, dst_p, 1);
+ sev_unpin_memory(kvm, src_p, n);
+ sev_unpin_memory(kvm, dst_p, n);
if (ret)
goto err;
}
}
+static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap) {
+ int msr;
+
+ for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
+ unsigned word = msr / BITS_PER_LONG;
+
+ msr_bitmap[word] = ~0;
+ msr_bitmap[word + (0x800 / sizeof(long))] = ~0;
+ }
+}
+
/*
* Merge L0's and L1's MSR bitmap, return false to indicate that
* we do not use the hardware.
return false;
msr_bitmap_l1 = (unsigned long *)kmap(page);
- if (nested_cpu_has_apic_reg_virt(vmcs12)) {
- /*
- * L0 need not intercept reads for MSRs between 0x800 and 0x8ff, it
- * just lets the processor take the value from the virtual-APIC page;
- * take those 256 bits directly from the L1 bitmap.
- */
- for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
- unsigned word = msr / BITS_PER_LONG;
- msr_bitmap_l0[word] = msr_bitmap_l1[word];
- msr_bitmap_l0[word + (0x800 / sizeof(long))] = ~0;
- }
- } else {
- for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
- unsigned word = msr / BITS_PER_LONG;
- msr_bitmap_l0[word] = ~0;
- msr_bitmap_l0[word + (0x800 / sizeof(long))] = ~0;
- }
- }
- nested_vmx_disable_intercept_for_msr(
- msr_bitmap_l1, msr_bitmap_l0,
- X2APIC_MSR(APIC_TASKPRI),
- MSR_TYPE_W);
+ /*
+ * To keep the control flow simple, pay eight 8-byte writes (sixteen
+ * 4-byte writes on 32-bit systems) up front to enable intercepts for
+ * the x2APIC MSR range and selectively disable them below.
+ */
+ enable_x2apic_msr_intercepts(msr_bitmap_l0);
+
+ if (nested_cpu_has_virt_x2apic_mode(vmcs12)) {
+ if (nested_cpu_has_apic_reg_virt(vmcs12)) {
+ /*
+ * L0 need not intercept reads for MSRs between 0x800
+ * and 0x8ff, it just lets the processor take the value
+ * from the virtual-APIC page; take those 256 bits
+ * directly from the L1 bitmap.
+ */
+ for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
+ unsigned word = msr / BITS_PER_LONG;
+
+ msr_bitmap_l0[word] = msr_bitmap_l1[word];
+ }
+ }
- if (nested_cpu_has_vid(vmcs12)) {
- nested_vmx_disable_intercept_for_msr(
- msr_bitmap_l1, msr_bitmap_l0,
- X2APIC_MSR(APIC_EOI),
- MSR_TYPE_W);
nested_vmx_disable_intercept_for_msr(
msr_bitmap_l1, msr_bitmap_l0,
- X2APIC_MSR(APIC_SELF_IPI),
- MSR_TYPE_W);
+ X2APIC_MSR(APIC_TASKPRI),
+ MSR_TYPE_R | MSR_TYPE_W);
+
+ if (nested_cpu_has_vid(vmcs12)) {
+ nested_vmx_disable_intercept_for_msr(
+ msr_bitmap_l1, msr_bitmap_l0,
+ X2APIC_MSR(APIC_EOI),
+ MSR_TYPE_W);
+ nested_vmx_disable_intercept_for_msr(
+ msr_bitmap_l1, msr_bitmap_l0,
+ X2APIC_MSR(APIC_SELF_IPI),
+ MSR_TYPE_W);
+ }
}
if (spec_ctrl)
static inline void syscall_get_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
unsigned long *args)
{
static const unsigned int reg[] = XTENSA_SYSCALL_ARGUMENT_REGS;
- unsigned int j;
+ unsigned int i;
- if (n == 0)
- return;
-
- WARN_ON_ONCE(i + n > SYSCALL_MAX_ARGS);
-
- for (j = 0; j < n; ++j) {
- if (i + j < SYSCALL_MAX_ARGS)
- args[j] = regs->areg[reg[i + j]];
- else
- args[j] = 0;
- }
+ for (i = 0; i < 6; ++i)
+ args[i] = regs->areg[reg[i]];
}
static inline void syscall_set_arguments(struct task_struct *task,
struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args)
{
static const unsigned int reg[] = XTENSA_SYSCALL_ARGUMENT_REGS;
- unsigned int j;
-
- if (n == 0)
- return;
-
- if (WARN_ON_ONCE(i + n > SYSCALL_MAX_ARGS)) {
- if (i < SYSCALL_MAX_ARGS)
- n = SYSCALL_MAX_ARGS - i;
- else
- return;
- }
+ unsigned int i;
- for (j = 0; j < n; ++j)
- regs->areg[reg[i + j]] = args[j];
+ for (i = 0; i < 6; ++i)
+ regs->areg[reg[i]] = args[i];
}
asmlinkage long xtensa_rt_sigreturn(struct pt_regs*);
* at least two nodes.
*/
return !(varied_queue_weights || multiple_classes_busy
-#ifdef BFQ_GROUP_IOSCHED_ENABLED
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
|| bfqd->num_groups_with_pending_reqs > 0
#endif
);
entity->on_st = true;
}
-#ifdef BFQ_GROUP_IOSCHED_ENABLED
+#ifdef CONFIG_BFQ_GROUP_IOSCHED
if (!bfq_entity_to_bfqq(entity)) { /* bfq_group */
struct bfq_group *bfqg =
container_of(entity, struct bfq_group, entity);
*/
blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *rq)
{
- blk_qc_t unused;
-
if (blk_cloned_rq_check_limits(q, rq))
return BLK_STS_IOERR;
* bypass a potential scheduler on the bottom device for
* insert.
*/
- return blk_mq_try_issue_directly(rq->mq_hctx, rq, &unused, true, true);
+ return blk_mq_request_issue_directly(rq, true);
}
EXPORT_SYMBOL_GPL(blk_insert_cloned_request);
* busy in case of 'none' scheduler, and this way may save
* us one extra enqueue & dequeue to sw queue.
*/
- if (!hctx->dispatch_busy && !e && !run_queue_async)
+ if (!hctx->dispatch_busy && !e && !run_queue_async) {
blk_mq_try_issue_list_directly(hctx, list);
- else
- blk_mq_insert_requests(hctx, ctx, list);
+ if (list_empty(list))
+ return;
+ }
+ blk_mq_insert_requests(hctx, ctx, list);
}
blk_mq_run_hw_queue(hctx, run_queue_async);
unsigned int depth;
list_splice_init(&plug->mq_list, &list);
- plug->rq_count = 0;
if (plug->rq_count > 2 && plug->multiple_queues)
list_sort(NULL, &list, plug_rq_cmp);
+ plug->rq_count = 0;
+
this_q = NULL;
this_hctx = NULL;
this_ctx = NULL;
return ret;
}
-blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
struct request *rq,
blk_qc_t *cookie,
- bool bypass, bool last)
+ bool bypass_insert, bool last)
{
struct request_queue *q = rq->q;
bool run_queue = true;
- blk_status_t ret = BLK_STS_RESOURCE;
- int srcu_idx;
- bool force = false;
- hctx_lock(hctx, &srcu_idx);
/*
- * hctx_lock is needed before checking quiesced flag.
+ * RCU or SRCU read lock is needed before checking quiesced flag.
*
- * When queue is stopped or quiesced, ignore 'bypass', insert
- * and return BLK_STS_OK to caller, and avoid driver to try to
- * dispatch again.
+ * When queue is stopped or quiesced, ignore 'bypass_insert' from
+ * blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
+ * and avoid driver to try to dispatch again.
*/
- if (unlikely(blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q))) {
+ if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
run_queue = false;
- bypass = false;
- goto out_unlock;
+ bypass_insert = false;
+ goto insert;
}
- if (unlikely(q->elevator && !bypass))
- goto out_unlock;
+ if (q->elevator && !bypass_insert)
+ goto insert;
if (!blk_mq_get_dispatch_budget(hctx))
- goto out_unlock;
+ goto insert;
if (!blk_mq_get_driver_tag(rq)) {
blk_mq_put_dispatch_budget(hctx);
- goto out_unlock;
+ goto insert;
}
- /*
- * Always add a request that has been through
- *.queue_rq() to the hardware dispatch list.
- */
- force = true;
- ret = __blk_mq_issue_directly(hctx, rq, cookie, last);
-out_unlock:
+ return __blk_mq_issue_directly(hctx, rq, cookie, last);
+insert:
+ if (bypass_insert)
+ return BLK_STS_RESOURCE;
+
+ blk_mq_request_bypass_insert(rq, run_queue);
+ return BLK_STS_OK;
+}
+
+static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+ struct request *rq, blk_qc_t *cookie)
+{
+ blk_status_t ret;
+ int srcu_idx;
+
+ might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
+
+ hctx_lock(hctx, &srcu_idx);
+
+ ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true);
+ if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
+ blk_mq_request_bypass_insert(rq, true);
+ else if (ret != BLK_STS_OK)
+ blk_mq_end_request(rq, ret);
+
+ hctx_unlock(hctx, srcu_idx);
+}
+
+blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last)
+{
+ blk_status_t ret;
+ int srcu_idx;
+ blk_qc_t unused_cookie;
+ struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
+
+ hctx_lock(hctx, &srcu_idx);
+ ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true, last);
hctx_unlock(hctx, srcu_idx);
- switch (ret) {
- case BLK_STS_OK:
- break;
- case BLK_STS_DEV_RESOURCE:
- case BLK_STS_RESOURCE:
- if (force) {
- blk_mq_request_bypass_insert(rq, run_queue);
- /*
- * We have to return BLK_STS_OK for the DM
- * to avoid livelock. Otherwise, we return
- * the real result to indicate whether the
- * request is direct-issued successfully.
- */
- ret = bypass ? BLK_STS_OK : ret;
- } else if (!bypass) {
- blk_mq_sched_insert_request(rq, false,
- run_queue, false);
- }
- break;
- default:
- if (!bypass)
- blk_mq_end_request(rq, ret);
- break;
- }
return ret;
}
void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list)
{
- blk_qc_t unused;
- blk_status_t ret = BLK_STS_OK;
-
while (!list_empty(list)) {
+ blk_status_t ret;
struct request *rq = list_first_entry(list, struct request,
queuelist);
list_del_init(&rq->queuelist);
- if (ret == BLK_STS_OK)
- ret = blk_mq_try_issue_directly(hctx, rq, &unused,
- false,
+ ret = blk_mq_request_issue_directly(rq, list_empty(list));
+ if (ret != BLK_STS_OK) {
+ if (ret == BLK_STS_RESOURCE ||
+ ret == BLK_STS_DEV_RESOURCE) {
+ blk_mq_request_bypass_insert(rq,
list_empty(list));
- else
- blk_mq_sched_insert_request(rq, false, true, false);
+ break;
+ }
+ blk_mq_end_request(rq, ret);
+ }
}
/*
* the driver there was more coming, but that turned out to
* be a lie.
*/
- if (ret != BLK_STS_OK && hctx->queue->mq_ops->commit_rqs)
+ if (!list_empty(list) && hctx->queue->mq_ops->commit_rqs)
hctx->queue->mq_ops->commit_rqs(hctx);
}
plug->rq_count--;
}
blk_add_rq_to_plug(plug, rq);
+ trace_block_plug(q);
blk_mq_put_ctx(data.ctx);
if (same_queue_rq) {
data.hctx = same_queue_rq->mq_hctx;
+ trace_block_unplug(q, 1, true);
blk_mq_try_issue_directly(data.hctx, same_queue_rq,
- &cookie, false, true);
+ &cookie);
}
} else if ((q->nr_hw_queues > 1 && is_sync) || (!q->elevator &&
!data.hctx->dispatch_busy)) {
blk_mq_put_ctx(data.ctx);
blk_mq_bio_to_request(rq, bio);
- blk_mq_try_issue_directly(data.hctx, rq, &cookie, false, true);
+ blk_mq_try_issue_directly(data.hctx, rq, &cookie);
} else {
blk_mq_put_ctx(data.ctx);
blk_mq_bio_to_request(rq, bio);
return 0;
free_fq:
- kfree(hctx->fq);
+ blk_free_flush_queue(hctx->fq);
exit_hctx:
if (set->ops->exit_hctx)
set->ops->exit_hctx(hctx, hctx_idx);
void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);
-blk_status_t blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
- struct request *rq,
- blk_qc_t *cookie,
- bool bypass, bool last);
+/* Used by blk_insert_cloned_request() to issue request directly */
+blk_status_t blk_mq_request_issue_directly(struct request *rq, bool last);
void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
ACPI_FUNCTION_TRACE(ev_enable_gpe);
- /* Enable the requested GPE */
+ /* Clear the GPE status */
+ status = acpi_hw_clear_gpe(gpe_event_info);
+ if (ACPI_FAILURE(status))
+ return_ACPI_STATUS(status);
+ /* Enable the requested GPE */
status = acpi_hw_low_set_gpe(gpe_event_info, ACPI_GPE_ENABLE);
return_ACPI_STATUS(status);
}
return -EINVAL;
}
+ if (g_home_node != NUMA_NO_NODE && g_home_node >= nr_online_nodes) {
+ pr_err("null_blk: invalid home_node value\n");
+ g_home_node = NUMA_NO_NODE;
+ }
+
if (g_queue_mode == NULL_Q_RQ) {
pr_err("null_blk: legacy IO path no longer available\n");
return -EINVAL;
disk->queue = blk_mq_init_sq_queue(&cd->tag_set, &pcd_mq_ops,
1, BLK_MQ_F_SHOULD_MERGE);
if (IS_ERR(disk->queue)) {
+ put_disk(disk);
disk->queue = NULL;
continue;
}
printk("%s: No CD-ROM drive found\n", name);
for (unit = 0, cd = pcd; unit < PCD_UNITS; unit++, cd++) {
+ if (!cd->disk)
+ continue;
blk_cleanup_queue(cd->disk->queue);
cd->disk->queue = NULL;
blk_mq_free_tag_set(&cd->tag_set);
pcd_probe_capabilities();
if (register_blkdev(major, name)) {
- for (unit = 0, cd = pcd; unit < PCD_UNITS; unit++, cd++)
+ for (unit = 0, cd = pcd; unit < PCD_UNITS; unit++, cd++) {
+ if (!cd->disk)
+ continue;
+
+ blk_cleanup_queue(cd->disk->queue);
+ blk_mq_free_tag_set(&cd->tag_set);
put_disk(cd->disk);
+ }
return -EBUSY;
}
int unit;
for (unit = 0, cd = pcd; unit < PCD_UNITS; unit++, cd++) {
+ if (!cd->disk)
+ continue;
+
if (cd->present) {
del_gendisk(cd->disk);
pi_release(cd->pi);
printk("%s: No ATAPI disk detected\n", name);
for (pf = units, unit = 0; unit < PF_UNITS; pf++, unit++) {
+ if (!pf->disk)
+ continue;
blk_cleanup_queue(pf->disk->queue);
pf->disk->queue = NULL;
blk_mq_free_tag_set(&pf->tag_set);
pf_busy = 0;
if (register_blkdev(major, name)) {
- for (pf = units, unit = 0; unit < PF_UNITS; pf++, unit++)
+ for (pf = units, unit = 0; unit < PF_UNITS; pf++, unit++) {
+ if (!pf->disk)
+ continue;
+ blk_cleanup_queue(pf->disk->queue);
+ blk_mq_free_tag_set(&pf->tag_set);
put_disk(pf->disk);
+ }
return -EBUSY;
}
int unit;
unregister_blkdev(major, name);
for (pf = units, unit = 0; unit < PF_UNITS; pf++, unit++) {
+ if (!pf->disk)
+ continue;
+
if (pf->present)
del_gendisk(pf->disk);
return 0;
err_read:
+ /* prevent double queue cleanup */
+ ace->gd->queue = NULL;
put_disk(ace->gd);
err_alloc_disk:
blk_cleanup_queue(ace->queue);
config R3964
tristate "Siemens R3964 line discipline"
- depends on TTY
+ depends on TTY && BROKEN
---help---
This driver allows synchronous communication with devices using the
Siemens R3964 packet protocol. Unless you are dealing with special
const struct x86_cpu_id *id;
int rc;
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
+ return -ENODEV;
+
if (no_load)
return -ENODEV;
} else {
id = x86_match_cpu(intel_pstate_cpu_ids);
if (!id) {
- pr_info("CPU ID not supported\n");
+ pr_info("CPU model not supported\n");
return -ENODEV;
}
struct pci_dev *pdev = adev->pdev;
enum pci_bus_speed cur_speed;
enum pcie_link_width cur_width;
+ u32 ret = 1;
*speed = PCI_SPEED_UNKNOWN;
*width = PCIE_LNK_WIDTH_UNKNOWN;
while (pdev) {
cur_speed = pcie_get_speed_cap(pdev);
cur_width = pcie_get_width_cap(pdev);
+ ret = pcie_bandwidth_available(adev->pdev, NULL,
+ NULL, &cur_width);
+ if (!ret)
+ cur_width = PCIE_LNK_WIDTH_RESRV;
if (cur_speed != PCI_SPEED_UNKNOWN) {
if (*speed == PCI_SPEED_UNKNOWN)
/* disable CG */
WREG32_SOC15(GC, 0, mmRLC_CGCG_CGLS_CTRL, 0);
- adev->gfx.rlc.funcs->reset(adev);
-
gfx_v9_0_init_pg(adev);
if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
void core_link_disable_stream(struct pipe_ctx *pipe_ctx, int option)
{
struct dc *core_dc = pipe_ctx->stream->ctx->dc;
+ struct dc_stream_state *stream = pipe_ctx->stream;
core_dc->hwss.blank_stream(pipe_ctx);
if (pipe_ctx->stream->signal == SIGNAL_TYPE_DISPLAY_PORT_MST)
deallocate_mst_payload(pipe_ctx);
+ if (dc_is_hdmi_signal(pipe_ctx->stream->signal))
+ dal_ddc_service_write_scdc_data(
+ stream->link->ddc, 0,
+ stream->timing.flags.LTE_340MCSC_SCRAMBLE);
+
core_dc->hwss.disable_stream(pipe_ctx, option);
disable_link(pipe_ctx->stream->link, pipe_ctx->stream->signal);
* MP0CLK DS
*/
data->registry_data.disallowed_features = 0xE0041C00;
+ /* ECC feature should be disabled on old SMUs */
+ smum_send_msg_to_smc(hwmgr, PPSMC_MSG_GetSmuVersion);
+ hwmgr->smu_version = smum_get_argument(hwmgr);
+ if (hwmgr->smu_version < 0x282100)
+ data->registry_data.disallowed_features |= FEATURE_ECC_MASK;
+
data->registry_data.od_state_in_dc_support = 0;
data->registry_data.thermal_support = 1;
data->registry_data.skip_baco_hardware = 0;
data->smu_features[GNLD_DS_MP1CLK].smu_feature_id = FEATURE_DS_MP1CLK_BIT;
data->smu_features[GNLD_DS_MP0CLK].smu_feature_id = FEATURE_DS_MP0CLK_BIT;
data->smu_features[GNLD_XGMI].smu_feature_id = FEATURE_XGMI_BIT;
+ data->smu_features[GNLD_ECC].smu_feature_id = FEATURE_ECC_BIT;
for (i = 0; i < GNLD_FEATURES_MAX; i++) {
data->smu_features[i].smu_feature_bitmap =
"FCLK_DS",
"MP1CLK_DS",
"MP0CLK_DS",
- "XGMI"};
+ "XGMI",
+ "ECC"};
static const char *output_title[] = {
"FEATURES",
"BITMASK",
struct vega20_single_dpm_table *dpm_table;
bool vblank_too_short = false;
bool disable_mclk_switching;
+ bool disable_fclk_switching;
uint32_t i, latency;
disable_mclk_switching = ((1 < hwmgr->display_config->num_display) &&
if (hwmgr->display_config->nb_pstate_switch_disable)
dpm_table->dpm_state.hard_min_level = dpm_table->dpm_levels[dpm_table->count - 1].value;
+ if ((disable_mclk_switching &&
+ (dpm_table->dpm_state.hard_min_level == dpm_table->dpm_levels[dpm_table->count - 1].value)) ||
+ hwmgr->display_config->min_mem_set_clock / 100 >= dpm_table->dpm_levels[dpm_table->count - 1].value)
+ disable_fclk_switching = true;
+ else
+ disable_fclk_switching = false;
+
/* fclk */
dpm_table = &(data->dpm_table.fclk_table);
dpm_table->dpm_state.soft_min_level = dpm_table->dpm_levels[0].value;
dpm_table->dpm_state.soft_max_level = VG20_CLOCK_MAX_DEFAULT;
dpm_table->dpm_state.hard_min_level = dpm_table->dpm_levels[0].value;
dpm_table->dpm_state.hard_max_level = VG20_CLOCK_MAX_DEFAULT;
- if (hwmgr->display_config->nb_pstate_switch_disable)
+ if (hwmgr->display_config->nb_pstate_switch_disable || disable_fclk_switching)
dpm_table->dpm_state.soft_min_level = dpm_table->dpm_levels[dpm_table->count - 1].value;
/* vclk */
GNLD_DS_MP1CLK,
GNLD_DS_MP0CLK,
GNLD_XGMI,
+ GNLD_ECC,
GNLD_FEATURES_MAX
};
#define FEATURE_DS_MP1CLK_BIT 30
#define FEATURE_DS_MP0CLK_BIT 31
#define FEATURE_XGMI_BIT 32
-#define FEATURE_SPARE_33_BIT 33
+#define FEATURE_ECC_BIT 33
#define FEATURE_SPARE_34_BIT 34
#define FEATURE_SPARE_35_BIT 35
#define FEATURE_SPARE_36_BIT 36
#define FEATURE_DS_FCLK_MASK (1 << FEATURE_DS_FCLK_BIT )
#define FEATURE_DS_MP1CLK_MASK (1 << FEATURE_DS_MP1CLK_BIT )
#define FEATURE_DS_MP0CLK_MASK (1 << FEATURE_DS_MP0CLK_BIT )
-#define FEATURE_XGMI_MASK (1 << FEATURE_XGMI_BIT )
+#define FEATURE_XGMI_MASK (1ULL << FEATURE_XGMI_BIT )
+#define FEATURE_ECC_MASK (1ULL << FEATURE_ECC_BIT )
#define DPM_OVERRIDE_DISABLE_SOCCLK_PID 0x00000001
#define DPM_OVERRIDE_DISABLE_UCLK_PID 0x00000002
/**
* intel_vgpu_emulate_hotplug - trigger hotplug event for vGPU
* @vgpu: a vGPU
- * @conncted: link state
+ * @connected: link state
*
* This function is used to trigger hotplug interrupt for vGPU
*
default:
gvt_vgpu_err("invalid tiling mode: %x\n", p.tiled);
}
-
- info->size = (((p.stride * p.height * p.bpp) / 8) +
- (PAGE_SIZE - 1)) >> PAGE_SHIFT;
} else if (plane_id == DRM_PLANE_TYPE_CURSOR) {
ret = intel_vgpu_decode_cursor_plane(vgpu, &c);
if (ret)
info->x_hot = UINT_MAX;
info->y_hot = UINT_MAX;
}
-
- info->size = (((info->stride * c.height * c.bpp) / 8)
- + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
} else {
gvt_vgpu_err("invalid plane id:%d\n", plane_id);
return -EINVAL;
}
+ info->size = (info->stride * info->height + PAGE_SIZE - 1)
+ >> PAGE_SHIFT;
if (info->size == 0) {
gvt_vgpu_err("fb size is zero\n");
return -EINVAL;
*/
void intel_vgpu_unpin_mm(struct intel_vgpu_mm *mm)
{
- atomic_dec(&mm->pincount);
+ atomic_dec_if_positive(&mm->pincount);
}
/**
intel_runtime_pm_put_unchecked(dev_priv);
}
- if (ret && (vgpu_is_vm_unhealthy(ret))) {
- enter_failsafe_mode(vgpu, GVT_FAILSAFE_GUEST_ERR);
+ if (ret) {
+ if (vgpu_is_vm_unhealthy(ret))
+ enter_failsafe_mode(vgpu, GVT_FAILSAFE_GUEST_ERR);
intel_vgpu_destroy_workload(workload);
return ERR_PTR(ret);
}
ret = drm_modeset_lock(&dev->mode_config.connection_mutex,
&ctx);
if (ret) {
- ret = -EINTR;
+ if (ret == -EDEADLK && !drm_modeset_backoff(&ctx)) {
+ try_again = true;
+ continue;
+ }
break;
}
crtc = connector->state->crtc;
tristate "Asus"
depends on LEDS_CLASS
depends on ASUS_WMI || ASUS_WMI=n
+ select POWER_SUPPLY
---help---
Support for Asus notebook built-in keyboard and touchpad via i2c, and
the Asus Republic of Gamers laptop keyboard special keys.
u32 hid_field_extract(const struct hid_device *hid, u8 *report,
unsigned offset, unsigned n)
{
- if (n > 32) {
- hid_warn(hid, "hid_field_extract() called with n (%d) > 32! (%s)\n",
+ if (n > 256) {
+ hid_warn(hid, "hid_field_extract() called with n (%d) > 256! (%s)\n",
n, current->comm);
- n = 32;
+ n = 256;
}
return __extract(report, offset, n);
seq_printf(f, "\n\n");
/* dump parsed data and input mappings */
+ if (down_interruptible(&hdev->driver_input_lock))
+ return 0;
+
hid_dump_device(hdev, f);
seq_printf(f, "\n");
hid_dump_input_mapping(hdev, f);
+ up(&hdev->driver_input_lock);
+
return 0;
}
#define USB_DEVICE_ID_SYNAPTICS_HD 0x0ac3
#define USB_DEVICE_ID_SYNAPTICS_QUAD_HD 0x1ac3
#define USB_DEVICE_ID_SYNAPTICS_TP_V103 0x5710
+#define I2C_DEVICE_ID_SYNAPTICS_7E7E 0x7e7e
#define USB_VENDOR_ID_TEXAS_INSTRUMENTS 0x2047
#define USB_DEVICE_ID_TEXAS_INSTRUMENTS_LENOVO_YOGA 0x0855
case 0x1b8: map_key_clear(KEY_VIDEO); break;
case 0x1bc: map_key_clear(KEY_MESSENGER); break;
case 0x1bd: map_key_clear(KEY_INFO); break;
+ case 0x1cb: map_key_clear(KEY_ASSISTANT); break;
case 0x201: map_key_clear(KEY_NEW); break;
case 0x202: map_key_clear(KEY_OPEN); break;
case 0x203: map_key_clear(KEY_CLOSE); break;
kfree(data);
return -ENOMEM;
}
+ data->wq = create_singlethread_workqueue("hidpp-ff-sendqueue");
+ if (!data->wq) {
+ kfree(data->effect_ids);
+ kfree(data);
+ return -ENOMEM;
+ }
+
data->hidpp = hidpp;
data->feature_index = feature_index;
data->version = version;
/* ignore boost value at response.fap.params[2] */
/* init the hardware command queue */
- data->wq = create_singlethread_workqueue("hidpp-ff-sendqueue");
atomic_set(&data->workqueue_size, 0);
/* initialize with zero autocenter to get wheel in usable state */
input_report_rel(mydata->input, REL_Y, v);
v = hid_snto32(data[6], 8);
- hidpp_scroll_counter_handle_scroll(
- &hidpp->vertical_wheel_counter, v);
+ if (v != 0)
+ hidpp_scroll_counter_handle_scroll(
+ &hidpp->vertical_wheel_counter, v);
input_sync(mydata->input);
}
{ HID_USB_DEVICE(USB_VENDOR_ID_DEALEXTREAME, USB_DEVICE_ID_DEALEXTREAME_RADIO_SI4701) },
{ HID_USB_DEVICE(USB_VENDOR_ID_DELORME, USB_DEVICE_ID_DELORME_EARTHMATE) },
{ HID_USB_DEVICE(USB_VENDOR_ID_DELORME, USB_DEVICE_ID_DELORME_EM_LT20) },
- { HID_I2C_DEVICE(USB_VENDOR_ID_ELAN, 0x0400) },
{ HID_USB_DEVICE(USB_VENDOR_ID_ESSENTIAL_REALITY, USB_DEVICE_ID_ESSENTIAL_REALITY_P5) },
{ HID_USB_DEVICE(USB_VENDOR_ID_ETT, USB_DEVICE_ID_TC5UH) },
{ HID_USB_DEVICE(USB_VENDOR_ID_ETT, USB_DEVICE_ID_TC4UM) },
{ }
};
-/**
+/*
* hid_mouse_ignore_list - mouse devices which should not be handled by the hid layer
*
* There are composite devices for which we want to ignore only a certain
if (hdev->product == 0x0401 &&
strncmp(hdev->name, "ELAN0800", 8) != 0)
return true;
+ /* Same with product id 0x0400 */
+ if (hdev->product == 0x0400 &&
+ strncmp(hdev->name, "QTEC0001", 8) != 0)
+ return true;
break;
}
}
if (bl_entry != NULL)
- dbg_hid("Found dynamic quirk 0x%lx for HID device 0x%hx:0x%hx\n",
+ dbg_hid("Found dynamic quirk 0x%lx for HID device 0x%04x:0x%04x\n",
bl_entry->driver_data, bl_entry->vendor,
bl_entry->product);
quirks |= bl_entry->driver_data;
if (quirks)
- dbg_hid("Found squirk 0x%lx for HID device 0x%hx:0x%hx\n",
+ dbg_hid("Found squirk 0x%lx for HID device 0x%04x:0x%04x\n",
quirks, hdev->vendor, hdev->product);
return quirks;
}
static int steam_register(struct steam_device *steam)
{
int ret;
+ bool client_opened;
/*
* This function can be called several times in a row with the
* Unlikely, but getting the serial could fail, and it is not so
* important, so make up a serial number and go on.
*/
+ mutex_lock(&steam->mutex);
if (steam_get_serial(steam) < 0)
strlcpy(steam->serial_no, "XXXXXXXXXX",
sizeof(steam->serial_no));
+ mutex_unlock(&steam->mutex);
hid_info(steam->hdev, "Steam Controller '%s' connected",
steam->serial_no);
}
mutex_lock(&steam->mutex);
- if (!steam->client_opened) {
+ client_opened = steam->client_opened;
+ if (!client_opened)
steam_set_lizard_mode(steam, lizard_mode);
+ mutex_unlock(&steam->mutex);
+
+ if (!client_opened)
ret = steam_input_register(steam);
- } else {
+ else
ret = 0;
- }
- mutex_unlock(&steam->mutex);
return ret;
}
{
struct steam_device *steam = hdev->driver_data;
+ unsigned long flags;
+ bool connected;
+
+ spin_lock_irqsave(&steam->lock, flags);
+ connected = steam->connected;
+ spin_unlock_irqrestore(&steam->lock, flags);
+
mutex_lock(&steam->mutex);
steam->client_opened = false;
+ if (connected)
+ steam_set_lizard_mode(steam, lizard_mode);
mutex_unlock(&steam->mutex);
- if (steam->connected) {
- steam_set_lizard_mode(steam, lizard_mode);
+ if (connected)
steam_input_register(steam);
- }
}
static int steam_client_ll_raw_request(struct hid_device *hdev,
goto cleanup;
}
rc = usb_string(udev, 201, ver_ptr, ver_len);
- if (ver_ptr == NULL) {
- rc = -ENOMEM;
- goto cleanup;
- }
if (rc == -EPIPE) {
*ver_ptr = '\0';
} else if (rc < 0) {
I2C_HID_QUIRK_NO_RUNTIME_PM },
{ USB_VENDOR_ID_ELAN, HID_ANY_ID,
I2C_HID_QUIRK_BOGUS_IRQ },
+ { USB_VENDOR_ID_SYNAPTICS, I2C_DEVICE_ID_SYNAPTICS_7E7E,
+ I2C_HID_QUIRK_NO_RUNTIME_PM },
{ 0, 0 }
};
config SENSORS_W83773G
tristate "Nuvoton W83773G"
depends on I2C
+ select REGMAP_I2C
help
If you say yes here you get support for the Nuvoton W83773G hardware
monitoring chip.
};
static const u32 ntc_temp_config[] = {
- HWMON_T_INPUT, HWMON_T_TYPE,
+ HWMON_T_INPUT | HWMON_T_TYPE,
0
};
s++;
}
}
+
+ s = (sensors->power.num_sensors * 4) + 1;
} else {
for (i = 0; i < sensors->power.num_sensors; ++i) {
s = i + 1;
show_power, NULL, 3, i);
attr++;
}
- }
- if (sensors->caps.num_sensors >= 1) {
s = sensors->power.num_sensors + 1;
+ }
+ if (sensors->caps.num_sensors >= 1) {
snprintf(attr->name, sizeof(attr->name), "power%d_label", s);
attr->sensor = OCC_INIT_ATTR(attr->name, 0444, show_caps, NULL,
0, 0);
/* Init DMA config if supported */
ret = i2c_imx_dma_request(i2c_imx, phy_addr);
if (ret < 0)
- goto clk_notifier_unregister;
+ goto del_adapter;
dev_info(&i2c_imx->adapter.dev, "IMX I2C adapter registered\n");
return 0; /* Return OK */
+del_adapter:
+ i2c_del_adapter(&i2c_imx->adapter);
clk_notifier_unregister:
clk_notifier_unregister(i2c_imx->clk, &i2c_imx->clk_change_nb);
rpm_disable:
struct srcu_struct io_barrier;
};
+void disable_discard(struct mapped_device *md);
void disable_write_same(struct mapped_device *md);
void disable_write_zeroes(struct mapped_device *md);
struct list_head list;
};
-const char *dm_allowed_targets[] __initconst = {
+const char * const dm_allowed_targets[] __initconst = {
"crypt",
"delay",
"linear",
static bool ranges_overlap(struct dm_integrity_range *range1, struct dm_integrity_range *range2)
{
return range1->logical_sector < range2->logical_sector + range2->n_sectors &&
- range2->logical_sector + range2->n_sectors > range2->logical_sector;
+ range1->logical_sector + range1->n_sectors > range2->logical_sector;
}
static bool add_new_range(struct dm_integrity_c *ic, struct dm_integrity_range *new_range, bool check_waiting)
struct dm_integrity_range *last_range =
list_first_entry(&ic->wait_list, struct dm_integrity_range, wait_entry);
struct task_struct *last_range_task;
- if (!ranges_overlap(range, last_range))
- break;
last_range_task = last_range->task;
list_del(&last_range->wait_entry);
if (!add_new_range(ic, last_range, false)) {
journal_watermark = val;
else if (sscanf(opt_string, "commit_time:%u%c", &val, &dummy) == 1)
sync_msec = val;
- else if (!memcmp(opt_string, "meta_device:", strlen("meta_device:"))) {
+ else if (!strncmp(opt_string, "meta_device:", strlen("meta_device:"))) {
if (ic->meta_dev) {
dm_put_device(ti, ic->meta_dev);
ic->meta_dev = NULL;
goto bad;
}
ic->sectors_per_block = val >> SECTOR_SHIFT;
- } else if (!memcmp(opt_string, "internal_hash:", strlen("internal_hash:"))) {
+ } else if (!strncmp(opt_string, "internal_hash:", strlen("internal_hash:"))) {
r = get_alg_and_key(opt_string, &ic->internal_hash_alg, &ti->error,
"Invalid internal_hash argument");
if (r)
goto bad;
- } else if (!memcmp(opt_string, "journal_crypt:", strlen("journal_crypt:"))) {
+ } else if (!strncmp(opt_string, "journal_crypt:", strlen("journal_crypt:"))) {
r = get_alg_and_key(opt_string, &ic->journal_crypt_alg, &ti->error,
"Invalid journal_crypt argument");
if (r)
goto bad;
- } else if (!memcmp(opt_string, "journal_mac:", strlen("journal_mac:"))) {
+ } else if (!strncmp(opt_string, "journal_mac:", strlen("journal_mac:"))) {
r = get_alg_and_key(opt_string, &ic->journal_mac_alg, &ti->error,
"Invalid journal_mac argument");
if (r)
.io_hints = dm_integrity_io_hints,
};
-int __init dm_integrity_init(void)
+static int __init dm_integrity_init(void)
{
int r;
return r;
}
-void dm_integrity_exit(void)
+static void __exit dm_integrity_exit(void)
{
dm_unregister_target(&integrity_target);
kmem_cache_destroy(journal_io_cache);
}
if (unlikely(error == BLK_STS_TARGET)) {
- if (req_op(clone) == REQ_OP_WRITE_SAME &&
- !clone->q->limits.max_write_same_sectors)
+ if (req_op(clone) == REQ_OP_DISCARD &&
+ !clone->q->limits.max_discard_sectors)
+ disable_discard(tio->md);
+ else if (req_op(clone) == REQ_OP_WRITE_SAME &&
+ !clone->q->limits.max_write_same_sectors)
disable_write_same(tio->md);
- if (req_op(clone) == REQ_OP_WRITE_ZEROES &&
- !clone->q->limits.max_write_zeroes_sectors)
+ else if (req_op(clone) == REQ_OP_WRITE_ZEROES &&
+ !clone->q->limits.max_write_zeroes_sectors)
disable_write_zeroes(tio->md);
}
return true;
}
+static int device_requires_stable_pages(struct dm_target *ti,
+ struct dm_dev *dev, sector_t start,
+ sector_t len, void *data)
+{
+ struct request_queue *q = bdev_get_queue(dev->bdev);
+
+ return q && bdi_cap_stable_pages_required(q->backing_dev_info);
+}
+
+/*
+ * If any underlying device requires stable pages, a table must require
+ * them as well. Only targets that support iterate_devices are considered:
+ * don't want error, zero, etc to require stable pages.
+ */
+static bool dm_table_requires_stable_pages(struct dm_table *t)
+{
+ struct dm_target *ti;
+ unsigned i;
+
+ for (i = 0; i < dm_table_get_num_targets(t); i++) {
+ ti = dm_table_get_target(t, i);
+
+ if (ti->type->iterate_devices &&
+ ti->type->iterate_devices(ti, device_requires_stable_pages, NULL))
+ return true;
+ }
+
+ return false;
+}
+
void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
struct queue_limits *limits)
{
dm_table_verify_integrity(t);
+ /*
+ * Some devices don't use blk_integrity but still want stable pages
+ * because they do their own checksumming.
+ */
+ if (dm_table_requires_stable_pages(t))
+ q->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES;
+ else
+ q->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES;
+
/*
* Determine whether or not this queue's I/O timings contribute
* to the entropy pool, Only request-based targets use this.
}
}
+void disable_discard(struct mapped_device *md)
+{
+ struct queue_limits *limits = dm_get_queue_limits(md);
+
+ /* device doesn't really support DISCARD, disable it */
+ limits->max_discard_sectors = 0;
+ blk_queue_flag_clear(QUEUE_FLAG_DISCARD, md->queue);
+}
+
void disable_write_same(struct mapped_device *md)
{
struct queue_limits *limits = dm_get_queue_limits(md);
dm_endio_fn endio = tio->ti->type->end_io;
if (unlikely(error == BLK_STS_TARGET) && md->type != DM_TYPE_NVME_BIO_BASED) {
- if (bio_op(bio) == REQ_OP_WRITE_SAME &&
- !bio->bi_disk->queue->limits.max_write_same_sectors)
+ if (bio_op(bio) == REQ_OP_DISCARD &&
+ !bio->bi_disk->queue->limits.max_discard_sectors)
+ disable_discard(md);
+ else if (bio_op(bio) == REQ_OP_WRITE_SAME &&
+ !bio->bi_disk->queue->limits.max_write_same_sectors)
disable_write_same(md);
- if (bio_op(bio) == REQ_OP_WRITE_ZEROES &&
- !bio->bi_disk->queue->limits.max_write_zeroes_sectors)
+ else if (bio_op(bio) == REQ_OP_WRITE_ZEROES &&
+ !bio->bi_disk->queue->limits.max_write_zeroes_sectors)
disable_write_zeroes(md);
}
return -EINVAL;
}
- /*
- * BIO based queue uses its own splitting. When multipage bvecs
- * is switched on, size of the incoming bio may be too big to
- * be handled in some targets, such as crypt.
- *
- * When these targets are ready for the big bio, we can remove
- * the limit.
- */
- ti->max_io_len = min_t(uint32_t, len, BIO_MAX_PAGES * PAGE_SIZE);
+ ti->max_io_len = (uint32_t) len;
return 0;
}
config MFD_SUN6I_PRCM
bool "Allwinner A31 PRCM controller"
- depends on ARCH_SUNXI
+ depends on ARCH_SUNXI || COMPILE_TEST
select MFD_CORE
help
Support for the PRCM (Power/Reset/Clock Management) unit available
static const struct mfd_cell sprd_pmic_devs[] = {
{
.name = "sc27xx-wdt",
- .of_compatible = "sprd,sc27xx-wdt",
+ .of_compatible = "sprd,sc2731-wdt",
}, {
.name = "sc27xx-rtc",
- .of_compatible = "sprd,sc27xx-rtc",
+ .of_compatible = "sprd,sc2731-rtc",
}, {
.name = "sc27xx-charger",
- .of_compatible = "sprd,sc27xx-charger",
+ .of_compatible = "sprd,sc2731-charger",
}, {
.name = "sc27xx-chg-timer",
- .of_compatible = "sprd,sc27xx-chg-timer",
+ .of_compatible = "sprd,sc2731-chg-timer",
}, {
.name = "sc27xx-fast-chg",
- .of_compatible = "sprd,sc27xx-fast-chg",
+ .of_compatible = "sprd,sc2731-fast-chg",
}, {
.name = "sc27xx-chg-wdt",
- .of_compatible = "sprd,sc27xx-chg-wdt",
+ .of_compatible = "sprd,sc2731-chg-wdt",
}, {
.name = "sc27xx-typec",
- .of_compatible = "sprd,sc27xx-typec",
+ .of_compatible = "sprd,sc2731-typec",
}, {
.name = "sc27xx-flash",
- .of_compatible = "sprd,sc27xx-flash",
+ .of_compatible = "sprd,sc2731-flash",
}, {
.name = "sc27xx-eic",
- .of_compatible = "sprd,sc27xx-eic",
+ .of_compatible = "sprd,sc2731-eic",
}, {
.name = "sc27xx-efuse",
- .of_compatible = "sprd,sc27xx-efuse",
+ .of_compatible = "sprd,sc2731-efuse",
}, {
.name = "sc27xx-thermal",
- .of_compatible = "sprd,sc27xx-thermal",
+ .of_compatible = "sprd,sc2731-thermal",
}, {
.name = "sc27xx-adc",
- .of_compatible = "sprd,sc27xx-adc",
+ .of_compatible = "sprd,sc2731-adc",
}, {
.name = "sc27xx-audio-codec",
- .of_compatible = "sprd,sc27xx-audio-codec",
+ .of_compatible = "sprd,sc2731-audio-codec",
}, {
.name = "sc27xx-regulator",
- .of_compatible = "sprd,sc27xx-regulator",
+ .of_compatible = "sprd,sc2731-regulator",
}, {
.name = "sc27xx-vibrator",
- .of_compatible = "sprd,sc27xx-vibrator",
+ .of_compatible = "sprd,sc2731-vibrator",
}, {
.name = "sc27xx-keypad-led",
- .of_compatible = "sprd,sc27xx-keypad-led",
+ .of_compatible = "sprd,sc2731-keypad-led",
}, {
.name = "sc27xx-bltc",
- .of_compatible = "sprd,sc27xx-bltc",
+ .of_compatible = "sprd,sc2731-bltc",
}, {
.name = "sc27xx-fgu",
- .of_compatible = "sprd,sc27xx-fgu",
+ .of_compatible = "sprd,sc2731-fgu",
}, {
.name = "sc27xx-7sreset",
- .of_compatible = "sprd,sc27xx-7sreset",
+ .of_compatible = "sprd,sc2731-7sreset",
}, {
.name = "sc27xx-poweroff",
- .of_compatible = "sprd,sc27xx-poweroff",
+ .of_compatible = "sprd,sc2731-poweroff",
}, {
.name = "sc27xx-syscon",
- .of_compatible = "sprd,sc27xx-syscon",
+ .of_compatible = "sprd,sc2731-syscon",
},
};
return status;
}
+static int __maybe_unused twl_suspend(struct device *dev)
+{
+ struct i2c_client *client = to_i2c_client(dev);
+
+ if (client->irq)
+ disable_irq(client->irq);
+
+ return 0;
+}
+
+static int __maybe_unused twl_resume(struct device *dev)
+{
+ struct i2c_client *client = to_i2c_client(dev);
+
+ if (client->irq)
+ enable_irq(client->irq);
+
+ return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(twl_dev_pm_ops, twl_suspend, twl_resume);
+
static const struct i2c_device_id twl_ids[] = {
{ "twl4030", TWL4030_VAUX2 }, /* "Triton 2" */
{ "twl5030", 0 }, /* T2 updated */
/* One Client Driver , 4 Clients */
static struct i2c_driver twl_driver = {
.driver.name = DRIVER_NAME,
+ .driver.pm = &twl_dev_pm_ops,
.id_table = twl_ids,
.probe = twl_probe,
.remove = twl_remove,
continue;
}
- if (time_after(jiffies, timeo) && !chip_ready(map, adr))
+ /*
+ * We check "time_after" and "!chip_good" before checking "chip_good" to avoid
+ * the failure due to scheduling.
+ */
+ if (time_after(jiffies, timeo) && !chip_good(map, adr, datum))
break;
if (chip_good(map, adr, datum)) {
static ssize_t perm_hwaddr_show(struct slave *slave, char *buf)
{
- return sprintf(buf, "%pM\n", slave->perm_hwaddr);
+ return sprintf(buf, "%*phC\n",
+ slave->dev->addr_len,
+ slave->perm_hwaddr);
}
static SLAVE_ATTR_RO(perm_hwaddr);
return 0;
lane = mv88e6390x_serdes_get_lane(chip, port);
- if (lane < 0)
+ if (lane < 0 && lane != -ENODEV)
return lane;
- if (chip->ports[port].serdes_irq) {
- err = mv88e6390_serdes_irq_disable(chip, port, lane);
+ if (lane >= 0) {
+ if (chip->ports[port].serdes_irq) {
+ err = mv88e6390_serdes_irq_disable(chip, port, lane);
+ if (err)
+ return err;
+ }
+
+ err = mv88e6390x_serdes_power(chip, port, false);
if (err)
return err;
}
- err = mv88e6390x_serdes_power(chip, port, false);
- if (err)
- return err;
+ chip->ports[port].cmode = 0;
if (cmode) {
err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_STS, ®);
if (err)
return err;
+ chip->ports[port].cmode = cmode;
+
+ lane = mv88e6390x_serdes_get_lane(chip, port);
+ if (lane < 0)
+ return lane;
+
err = mv88e6390x_serdes_power(chip, port, true);
if (err)
return err;
}
}
- chip->ports[port].cmode = cmode;
-
return 0;
}
struct nicvf_cq_poll *cq_poll = NULL;
union nic_mbx mbx = {};
- cancel_delayed_work_sync(&nic->link_change_work);
-
/* wait till all queued set_rx_mode tasks completes */
- drain_workqueue(nic->nicvf_rx_mode_wq);
+ if (nic->nicvf_rx_mode_wq) {
+ cancel_delayed_work_sync(&nic->link_change_work);
+ drain_workqueue(nic->nicvf_rx_mode_wq);
+ }
mbx.msg.msg = NIC_MBOX_MSG_SHUTDOWN;
nicvf_send_msg_to_pf(nic, &mbx);
struct nicvf_cq_poll *cq_poll = NULL;
/* wait till all queued set_rx_mode tasks completes if any */
- drain_workqueue(nic->nicvf_rx_mode_wq);
+ if (nic->nicvf_rx_mode_wq)
+ drain_workqueue(nic->nicvf_rx_mode_wq);
netif_carrier_off(netdev);
/* Send VF config done msg to PF */
nicvf_send_cfg_done(nic);
- INIT_DELAYED_WORK(&nic->link_change_work,
- nicvf_link_status_check_task);
- queue_delayed_work(nic->nicvf_rx_mode_wq,
- &nic->link_change_work, 0);
+ if (nic->nicvf_rx_mode_wq) {
+ INIT_DELAYED_WORK(&nic->link_change_work,
+ nicvf_link_status_check_task);
+ queue_delayed_work(nic->nicvf_rx_mode_wq,
+ &nic->link_change_work, 0);
+ }
return 0;
cleanup:
/* Check if page can be recycled */
if (page) {
ref_count = page_ref_count(page);
- /* Check if this page has been used once i.e 'put_page'
- * called after packet transmission i.e internal ref_count
- * and page's ref_count are equal i.e page can be recycled.
+ /* This page can be recycled if internal ref_count and page's
+ * ref_count are equal, indicating that the page has been used
+ * once for packet transmission. For non-XDP mode, internal
+ * ref_count is always '1'.
*/
- if (rbdr->is_xdp && (ref_count == pgcache->ref_count))
- pgcache->ref_count--;
- else
- page = NULL;
-
- /* In non-XDP mode, page's ref_count needs to be '1' for it
- * to be recycled.
- */
- if (!rbdr->is_xdp && (ref_count != 1))
+ if (rbdr->is_xdp) {
+ if (ref_count == pgcache->ref_count)
+ pgcache->ref_count--;
+ else
+ page = NULL;
+ } else if (ref_count != 1) {
page = NULL;
+ }
}
if (!page) {
while (head < rbdr->pgcnt) {
pgcache = &rbdr->pgcache[head];
if (pgcache->page && page_ref_count(pgcache->page) != 0) {
- if (!rbdr->is_xdp) {
- put_page(pgcache->page);
- continue;
+ if (rbdr->is_xdp) {
+ page_ref_sub(pgcache->page,
+ pgcache->ref_count - 1);
}
- page_ref_sub(pgcache->page, pgcache->ref_count - 1);
put_page(pgcache->page);
}
head++;
ppmax = max;
/* pool size must be multiple of unsigned long */
- bmap = BITS_TO_LONGS(ppmax);
+ bmap = ppmax / BITS_PER_TYPE(unsigned long);
+ if (!bmap)
+ return NULL;
+
ppmax = (bmap * sizeof(unsigned long)) << 3;
alloc_sz = sizeof(*pools) + sizeof(unsigned long) * bmap;
if (reserve_factor) {
ppmax_pool = ppmax / reserve_factor;
pool = ppm_alloc_cpu_pool(&ppmax_pool, &pool_index_max);
+ if (!pool) {
+ ppmax_pool = 0;
+ reserve_factor = 0;
+ }
pr_debug("%s: ppmax %u, cpu total %u, per cpu %u.\n",
ndev->name, ppmax, ppmax_pool, pool_index_max);
/* free desc along with its attached buffer */
static void hnae_free_desc(struct hnae_ring *ring)
{
- hnae_free_buffers(ring);
dma_unmap_single(ring_to_dev(ring), ring->desc_dma_addr,
ring->desc_num * sizeof(ring->desc[0]),
ring_to_dma_dir(ring));
/* fini ring, also free the buffer for the ring */
static void hnae_fini_ring(struct hnae_ring *ring)
{
+ if (is_rx_ring(ring))
+ hnae_free_buffers(ring);
+
hnae_free_desc(ring);
kfree(ring->desc_cb);
ring->desc_cb = NULL;
};
struct hnae_queue {
- void __iomem *io_base;
+ u8 __iomem *io_base;
phys_addr_t phy_base;
struct hnae_ae_dev *dev; /* the device who use this queue */
struct hnae_ring rx_ring ____cacheline_internodealigned_in_smp;
static void hns_mac_param_get(struct mac_params *param,
struct hns_mac_cb *mac_cb)
{
- param->vaddr = (void *)mac_cb->vaddr;
+ param->vaddr = mac_cb->vaddr;
param->mac_mode = hns_get_enet_interface(mac_cb);
ether_addr_copy(param->addr, mac_cb->addr_entry_idx[0].addr);
param->mac_id = mac_cb->mac_id;
/*mac para struct ,mac get param from nic or dsaf when initialize*/
struct mac_params {
char addr[ETH_ALEN];
- void *vaddr; /*virtual address*/
+ u8 __iomem *vaddr; /*virtual address*/
struct device *dev;
u8 mac_id;
/**< Ethernet operation mode (MAC-PHY interface and speed) */
enum mac_mode mac_mode;
u8 mac_id;
struct hns_mac_cb *mac_cb;
- void __iomem *io_base;
+ u8 __iomem *io_base;
unsigned int mac_en_flg;/*you'd better don't enable mac twice*/
unsigned int virt_dev_num;
struct device *dev;
DSAF_TBL_TCAM_KEY_VLAN_S, vlan_id);
dsaf_set_field(mac_key->low.bits.port_vlan, DSAF_TBL_TCAM_KEY_PORT_M,
DSAF_TBL_TCAM_KEY_PORT_S, port);
-
- mac_key->low.bits.port_vlan = le16_to_cpu(mac_key->low.bits.port_vlan);
}
/**
/* default config dvc to 0 */
mac_data.tbl_ucast_dvc = 0;
mac_data.tbl_ucast_out_port = mac_entry->port_num;
- tcam_data.tbl_tcam_data_high = cpu_to_le32(mac_key.high.val);
- tcam_data.tbl_tcam_data_low = cpu_to_le32(mac_key.low.val);
+ tcam_data.tbl_tcam_data_high = mac_key.high.val;
+ tcam_data.tbl_tcam_data_low = mac_key.low.val;
hns_dsaf_tcam_uc_cfg(dsaf_dev, entry_index, &tcam_data, &mac_data);
0xff,
mc_mask);
- mask_key.high.val = le32_to_cpu(mask_key.high.val);
- mask_key.low.val = le32_to_cpu(mask_key.low.val);
-
pmask_key = (struct dsaf_tbl_tcam_data *)(&mask_key);
}
dsaf_dev->ae_dev.name, mac_key.high.val,
mac_key.low.val, entry_index);
- tcam_data.tbl_tcam_data_high = cpu_to_le32(mac_key.high.val);
- tcam_data.tbl_tcam_data_low = cpu_to_le32(mac_key.low.val);
+ tcam_data.tbl_tcam_data_high = mac_key.high.val;
+ tcam_data.tbl_tcam_data_low = mac_key.low.val;
/* config mc entry with mask */
hns_dsaf_tcam_mc_cfg(dsaf_dev, entry_index, &tcam_data,
/* config key mask */
hns_dsaf_set_mac_key(dsaf_dev, &mask_key, 0x00, 0xff, mc_mask);
- mask_key.high.val = le32_to_cpu(mask_key.high.val);
- mask_key.low.val = le32_to_cpu(mask_key.low.val);
-
pmask_key = (struct dsaf_tbl_tcam_data *)(&mask_key);
}
soft_mac_entry += entry_index;
soft_mac_entry->index = DSAF_INVALID_ENTRY_IDX;
} else { /* not zero, just del port, update */
- tcam_data.tbl_tcam_data_high = cpu_to_le32(mac_key.high.val);
- tcam_data.tbl_tcam_data_low = cpu_to_le32(mac_key.low.val);
+ tcam_data.tbl_tcam_data_high = mac_key.high.val;
+ tcam_data.tbl_tcam_data_low = mac_key.low.val;
hns_dsaf_tcam_mc_cfg(dsaf_dev, entry_index,
&tcam_data,
return DSAF_DUMP_REGS_NUM;
}
+static int hns_dsaf_get_port_id(u8 port)
+{
+ if (port < DSAF_SERVICE_NW_NUM)
+ return port;
+
+ if (port >= DSAF_BASE_INNER_PORT_NUM)
+ return port - DSAF_BASE_INNER_PORT_NUM + DSAF_SERVICE_NW_NUM;
+
+ return -EINVAL;
+}
+
static void set_promisc_tcam_enable(struct dsaf_device *dsaf_dev, u32 port)
{
struct dsaf_tbl_tcam_ucast_cfg tbl_tcam_ucast = {0, 1, 0, 0, 0x80};
memset(&temp_key, 0x0, sizeof(temp_key));
mask_entry.addr[0] = 0x01;
hns_dsaf_set_mac_key(dsaf_dev, &mask_key, mask_entry.in_vlan_id,
- port, mask_entry.addr);
+ 0xf, mask_entry.addr);
tbl_tcam_mcast.tbl_mcast_item_vld = 1;
tbl_tcam_mcast.tbl_mcast_old_en = 0;
- if (port < DSAF_SERVICE_NW_NUM) {
- mskid = port;
- } else if (port >= DSAF_BASE_INNER_PORT_NUM) {
- mskid = port - DSAF_BASE_INNER_PORT_NUM + DSAF_SERVICE_NW_NUM;
- } else {
+ /* set MAC port to handle multicast */
+ mskid = hns_dsaf_get_port_id(port);
+ if (mskid == -EINVAL) {
dev_err(dsaf_dev->dev, "%s,pnum(%d)error,key(%#x:%#x)\n",
dsaf_dev->ae_dev.name, port,
mask_key.high.val, mask_key.low.val);
return;
}
+ dsaf_set_bit(tbl_tcam_mcast.tbl_mcast_port_msk[mskid / 32],
+ mskid % 32, 1);
+ /* set pool bit map to handle multicast */
+ mskid = hns_dsaf_get_port_id(port_num);
+ if (mskid == -EINVAL) {
+ dev_err(dsaf_dev->dev,
+ "%s, pool bit map pnum(%d)error,key(%#x:%#x)\n",
+ dsaf_dev->ae_dev.name, port_num,
+ mask_key.high.val, mask_key.low.val);
+ return;
+ }
dsaf_set_bit(tbl_tcam_mcast.tbl_mcast_port_msk[mskid / 32],
mskid % 32, 1);
+
memcpy(&temp_key, &mask_key, sizeof(mask_key));
hns_dsaf_tcam_mc_cfg_vague(dsaf_dev, entry_index, &tbl_tcam_data_mc,
(struct dsaf_tbl_tcam_data *)(&mask_key),
u8 mac_id, u8 port_num);
int hns_dsaf_wait_pkt_clean(struct dsaf_device *dsaf_dev, int port);
+int hns_dsaf_roce_reset(struct fwnode_handle *dsaf_fwnode, bool dereset);
+
#endif /* __HNS_DSAF_MAIN_H__ */
dsaf_set_field(origin, 1ull << 10, 10, en);
dsaf_write_syscon(mac_cb->serdes_ctrl, reg_offset, origin);
} else {
- u8 *base_addr = (u8 *)mac_cb->serdes_vaddr +
+ u8 __iomem *base_addr = mac_cb->serdes_vaddr +
(mac_cb->mac_id <= 3 ? 0x00280000 : 0x00200000);
dsaf_set_reg_field(base_addr, reg_offset, 1ull << 10, 10, en);
}
}
}
-static void __iomem *
+static u8 __iomem *
hns_ppe_common_get_ioaddr(struct ppe_common_cb *ppe_common)
{
return ppe_common->dsaf_dev->ppe_base + PPE_COMMON_REG_OFFSET;
dsaf_dev->ppe_common[comm_index] = NULL;
}
-static void __iomem *hns_ppe_get_iobase(struct ppe_common_cb *ppe_common,
- int ppe_idx)
+static u8 __iomem *hns_ppe_get_iobase(struct ppe_common_cb *ppe_common,
+ int ppe_idx)
{
return ppe_common->dsaf_dev->ppe_base + ppe_idx * PPE_REG_OFFSET;
}
struct hns_ppe_hw_stats hw_stats;
u8 index; /* index in a ppe common device */
- void __iomem *io_base;
+ u8 __iomem *io_base;
int virq;
u32 rss_indir_table[HNS_PPEV2_RSS_IND_TBL_SIZE]; /*shadow indir tab */
u32 rss_key[HNS_PPEV2_RSS_KEY_NUM]; /* rss hash key */
struct ppe_common_cb {
struct device *dev;
struct dsaf_device *dsaf_dev;
- void __iomem *io_base;
+ u8 __iomem *io_base;
enum ppe_common_mode ppe_mode;
mdnum_ppkt = HNS_RCB_RING_MAX_BD_PER_PKT;
} else {
ring = &q->tx_ring;
- ring->io_base = (u8 __iomem *)ring_pair_cb->q.io_base +
+ ring->io_base = ring_pair_cb->q.io_base +
HNS_RCB_TX_REG_OFFSET;
irq_idx = HNS_RCB_IRQ_IDX_TX;
mdnum_ppkt = is_ver1 ? HNS_RCB_RING_MAX_TXBD_PER_PKT :
}
}
-static void __iomem *hns_rcb_common_get_vaddr(struct rcb_common_cb *rcb_common)
+static u8 __iomem *hns_rcb_common_get_vaddr(struct rcb_common_cb *rcb_common)
{
struct dsaf_device *dsaf_dev = rcb_common->dsaf_dev;
#define XGMAC_PAUSE_CTL_RSP_MODE_B 2
#define XGMAC_PAUSE_CTL_TX_XOFF_B 3
-static inline void dsaf_write_reg(void __iomem *base, u32 reg, u32 value)
+static inline void dsaf_write_reg(u8 __iomem *base, u32 reg, u32 value)
{
writel(value, base + reg);
}
#define dsaf_set_bit(origin, shift, val) \
dsaf_set_field((origin), (1ull << (shift)), (shift), (val))
-static inline void dsaf_set_reg_field(void __iomem *base, u32 reg, u32 mask,
+static inline void dsaf_set_reg_field(u8 __iomem *base, u32 reg, u32 mask,
u32 shift, u32 val)
{
u32 origin = dsaf_read_reg(base, reg);
#define dsaf_get_bit(origin, shift) \
dsaf_get_field((origin), (1ull << (shift)), (shift))
-static inline u32 dsaf_get_reg_field(void __iomem *base, u32 reg, u32 mask,
+static inline u32 dsaf_get_reg_field(u8 __iomem *base, u32 reg, u32 mask,
u32 shift)
{
u32 origin;
dsaf_get_reg_field((dev)->io_base, (reg), (1ull << (bit)), (bit))
#define dsaf_write_b(addr, data)\
- writeb((data), (__iomem unsigned char *)(addr))
+ writeb((data), (__iomem u8 *)(addr))
#define dsaf_read_b(addr)\
- readb((__iomem unsigned char *)(addr))
+ readb((__iomem u8 *)(addr))
#define hns_mac_reg_read64(drv, offset) \
- readq((__iomem void *)(((u8 *)(drv)->io_base + 0xc00 + (offset))))
+ readq((__iomem void *)(((drv)->io_base + 0xc00 + (offset))))
#endif /* _DSAF_REG_H */
dsaf_set_bit(val, XGMAC_UNIDIR_EN_B, 0);
dsaf_set_bit(val, XGMAC_RF_TX_EN_B, 1);
dsaf_set_field(val, XGMAC_LF_RF_INSERT_M, XGMAC_LF_RF_INSERT_S, 0);
- dsaf_write_reg(mac_drv, XGMAC_MAC_TX_LF_RF_CONTROL_REG, val);
+ dsaf_write_dev(mac_drv, XGMAC_MAC_TX_LF_RF_CONTROL_REG, val);
}
/**
#define SERVICE_TIMER_HZ (1 * HZ)
-#define NIC_TX_CLEAN_MAX_NUM 256
-#define NIC_RX_CLEAN_MAX_NUM 64
-
#define RCB_IRQ_NOT_INITED 0
#define RCB_IRQ_INITED 1
#define HNS_BUFFER_SIZE_2048 2048
wmb(); /* commit all data before submit */
assert(skb->queue_mapping < priv->ae_handle->q_num);
hnae_queue_xmit(priv->ae_handle->qs[skb->queue_mapping], buf_num);
- ring->stats.tx_pkts++;
- ring->stats.tx_bytes += skb->len;
return NETDEV_TX_OK;
/* issue prefetch for next Tx descriptor */
prefetch(&ring->desc_cb[ring->next_to_clean]);
}
+ /* update tx ring statistics. */
+ ring->stats.tx_pkts += pkts;
+ ring->stats.tx_bytes += bytes;
NETIF_TX_UNLOCK(ring);
hns_nic_tx_fini_pro_v2;
netif_napi_add(priv->netdev, &rd->napi,
- hns_nic_common_poll, NIC_TX_CLEAN_MAX_NUM);
+ hns_nic_common_poll, NAPI_POLL_WEIGHT);
rd->ring->irq_init_flag = RCB_IRQ_NOT_INITED;
}
for (i = h->q_num; i < h->q_num * 2; i++) {
hns_nic_rx_fini_pro_v2;
netif_napi_add(priv->netdev, &rd->napi,
- hns_nic_common_poll, NIC_RX_CLEAN_MAX_NUM);
+ hns_nic_common_poll, NAPI_POLL_WEIGHT);
rd->ring->irq_init_flag = RCB_IRQ_NOT_INITED;
}
# Makefile for the HISILICON network device drivers.
#
-ccflags-y := -Idrivers/net/ethernet/hisilicon/hns3
+ccflags-y := -I $(srctree)/drivers/net/ethernet/hisilicon/hns3
obj-$(CONFIG_HNS3_HCLGE) += hclge.o
hclge-objs = hclge_main.o hclge_cmd.o hclge_mdio.o hclge_tm.o hclge_mbx.o hclge_err.o hclge_debugfs.o
# Makefile for the HISILICON network device drivers.
#
-ccflags-y := -Idrivers/net/ethernet/hisilicon/hns3
+ccflags-y := -I $(srctree)/drivers/net/ethernet/hisilicon/hns3
obj-$(CONFIG_HNS3_HCLGEVF) += hclgevf.o
hclgevf-objs = hclgevf_main.o hclgevf_cmd.o hclgevf_mbx.o
\ No newline at end of file
};
struct hns_mdio_device {
- void *vbase; /* mdio reg base address */
+ u8 __iomem *vbase; /* mdio reg base address */
struct regmap *subctrl_vbase;
struct hns_mdio_sc_reg sc_reg;
};
#define MDIO_SC_CLK_ST 0x531C
#define MDIO_SC_RESET_ST 0x5A1C
-static void mdio_write_reg(void *base, u32 reg, u32 value)
+static void mdio_write_reg(u8 __iomem *base, u32 reg, u32 value)
{
- u8 __iomem *reg_addr = (u8 __iomem *)base;
-
- writel_relaxed(value, reg_addr + reg);
+ writel_relaxed(value, base + reg);
}
#define MDIO_WRITE_REG(a, reg, value) \
mdio_write_reg((a)->vbase, (reg), (value))
-static u32 mdio_read_reg(void *base, u32 reg)
+static u32 mdio_read_reg(u8 __iomem *base, u32 reg)
{
- u8 __iomem *reg_addr = (u8 __iomem *)base;
-
- return readl_relaxed(reg_addr + reg);
+ return readl_relaxed(base + reg);
}
#define mdio_set_field(origin, mask, shift, val) \
#define mdio_get_field(origin, mask, shift) (((origin) >> (shift)) & (mask))
-static void mdio_set_reg_field(void *base, u32 reg, u32 mask, u32 shift,
+static void mdio_set_reg_field(u8 __iomem *base, u32 reg, u32 mask, u32 shift,
u32 val)
{
u32 origin = mdio_read_reg(base, reg);
#define MDIO_SET_REG_FIELD(dev, reg, mask, shift, val) \
mdio_set_reg_field((dev)->vbase, (reg), (mask), (shift), (val))
-static u32 mdio_get_reg_field(void *base, u32 reg, u32 mask, u32 shift)
+static u32 mdio_get_reg_field(u8 __iomem *base, u32 reg, u32 mask, u32 shift)
{
u32 origin;
*/
adapter->state = VNIC_PROBED;
+ reinit_completion(&adapter->init_done);
rc = init_crq_queue(adapter);
if (rc) {
netdev_err(adapter->netdev,
old_num_rx_queues = adapter->req_rx_queues;
old_num_tx_queues = adapter->req_tx_queues;
- init_completion(&adapter->init_done);
+ reinit_completion(&adapter->init_done);
adapter->init_done_rc = 0;
ibmvnic_send_crq_init(adapter);
if (!wait_for_completion_timeout(&adapter->init_done, timeout)) {
adapter->from_passive_init = false;
- init_completion(&adapter->init_done);
adapter->init_done_rc = 0;
ibmvnic_send_crq_init(adapter);
if (!wait_for_completion_timeout(&adapter->init_done, timeout)) {
INIT_WORK(&adapter->ibmvnic_reset, __ibmvnic_reset);
INIT_LIST_HEAD(&adapter->rwi_list);
spin_lock_init(&adapter->rwi_lock);
+ init_completion(&adapter->init_done);
adapter->resetting = false;
adapter->mac_change_pending = false;
/* create driver workqueue */
fm10k_workqueue = alloc_workqueue("%s", WQ_MEM_RECLAIM, 0,
fm10k_driver_name);
+ if (!fm10k_workqueue)
+ return -ENOMEM;
fm10k_dbg_init();
/* VSI specific handlers */
irqreturn_t (*irq_handler)(int irq, void *data);
+
+ unsigned long *af_xdp_zc_qps; /* tracks AF_XDP ZC enabled qps */
} ____cacheline_internodealigned_in_smp;
struct i40e_netdev_priv {
return !!vsi->xdp_prog;
}
-static inline struct xdp_umem *i40e_xsk_umem(struct i40e_ring *ring)
-{
- bool xdp_on = i40e_enabled_xdp_vsi(ring->vsi);
- int qid = ring->queue_index;
-
- if (ring_is_xdp(ring))
- qid -= ring->vsi->alloc_queue_pairs;
-
- if (!xdp_on)
- return NULL;
-
- return xdp_get_umem_from_qid(ring->vsi->netdev, qid);
-}
-
int i40e_create_queue_channel(struct i40e_vsi *vsi, struct i40e_channel *ch);
int i40e_set_bw_limit(struct i40e_vsi *vsi, u16 seid, u64 max_tx_rate);
int i40e_add_del_cloud_filter(struct i40e_vsi *vsi,
return -EOPNOTSUPP;
/* only magic packet is supported */
- if (wol->wolopts && (wol->wolopts != WAKE_MAGIC)
- | (wol->wolopts != WAKE_FILTER))
+ if (wol->wolopts & ~WAKE_MAGIC)
return -EOPNOTSUPP;
/* is this a new value? */
ring->queue_index);
}
+/**
+ * i40e_xsk_umem - Retrieve the AF_XDP ZC if XDP and ZC is enabled
+ * @ring: The Tx or Rx ring
+ *
+ * Returns the UMEM or NULL.
+ **/
+static struct xdp_umem *i40e_xsk_umem(struct i40e_ring *ring)
+{
+ bool xdp_on = i40e_enabled_xdp_vsi(ring->vsi);
+ int qid = ring->queue_index;
+
+ if (ring_is_xdp(ring))
+ qid -= ring->vsi->alloc_queue_pairs;
+
+ if (!xdp_on || !test_bit(qid, ring->vsi->af_xdp_zc_qps))
+ return NULL;
+
+ return xdp_get_umem_from_qid(ring->vsi->netdev, qid);
+}
+
/**
* i40e_configure_tx_ring - Configure a transmit ring context and rest
* @ring: The Tx ring to configure
hash_init(vsi->mac_filter_hash);
vsi->irqs_ready = false;
+ if (type == I40E_VSI_MAIN) {
+ vsi->af_xdp_zc_qps = bitmap_zalloc(pf->num_lan_qps, GFP_KERNEL);
+ if (!vsi->af_xdp_zc_qps)
+ goto err_rings;
+ }
+
ret = i40e_set_num_rings_in_vsi(vsi);
if (ret)
goto err_rings;
goto unlock_pf;
err_rings:
+ bitmap_free(vsi->af_xdp_zc_qps);
pf->next_vsi = i - 1;
kfree(vsi);
unlock_pf:
i40e_put_lump(pf->qp_pile, vsi->base_queue, vsi->idx);
i40e_put_lump(pf->irq_pile, vsi->base_vector, vsi->idx);
+ bitmap_free(vsi->af_xdp_zc_qps);
i40e_vsi_free_arrays(vsi, true);
i40e_clear_rss_config_user(vsi);
static int i40e_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
{
struct i40e_pf *pf = container_of(ptp, struct i40e_pf, ptp_caps);
- struct timespec64 now;
+ struct timespec64 now, then;
+ then = ns_to_timespec64(delta);
mutex_lock(&pf->tmreg_lock);
i40e_ptp_read(pf, &now, NULL);
- timespec64_add_ns(&now, delta);
+ now = timespec64_add(now, then);
i40e_ptp_write(pf, (const struct timespec64 *)&now);
mutex_unlock(&pf->tmreg_lock);
if (err)
return err;
+ set_bit(qid, vsi->af_xdp_zc_qps);
+
if_running = netif_running(vsi->netdev) && i40e_enabled_xdp_vsi(vsi);
if (if_running) {
return err;
}
+ clear_bit(qid, vsi->af_xdp_zc_qps);
i40e_xsk_umem_dma_unmap(vsi, umem);
if (if_running) {
/* enable link status from external LINK_0 and LINK_1 pins */
#define E1000_CTRL_SWDPIN0 0x00040000 /* SWDPIN 0 value */
#define E1000_CTRL_SWDPIN1 0x00080000 /* SWDPIN 1 value */
+#define E1000_CTRL_ADVD3WUC 0x00100000 /* D3 WUC */
+#define E1000_CTRL_EN_PHY_PWR_MGMT 0x00200000 /* PHY PM enable */
#define E1000_CTRL_SDP0_DIR 0x00400000 /* SDP0 Data direction */
#define E1000_CTRL_SDP1_DIR 0x00800000 /* SDP1 Data direction */
#define E1000_CTRL_RST 0x04000000 /* Global reset */
struct e1000_hw *hw = &adapter->hw;
u32 ctrl, rctl, status;
u32 wufc = runtime ? E1000_WUFC_LNKC : adapter->wol;
-#ifdef CONFIG_PM
- int retval = 0;
-#endif
+ bool wake;
rtnl_lock();
netif_device_detach(netdev);
igb_clear_interrupt_scheme(adapter);
rtnl_unlock();
-#ifdef CONFIG_PM
- if (!runtime) {
- retval = pci_save_state(pdev);
- if (retval)
- return retval;
- }
-#endif
-
status = rd32(E1000_STATUS);
if (status & E1000_STATUS_LU)
wufc &= ~E1000_WUFC_LNKC;
}
ctrl = rd32(E1000_CTRL);
- /* advertise wake from D3Cold */
- #define E1000_CTRL_ADVD3WUC 0x00100000
- /* phy power management enable */
- #define E1000_CTRL_EN_PHY_PWR_MGMT 0x00200000
ctrl |= E1000_CTRL_ADVD3WUC;
wr32(E1000_CTRL, ctrl);
wr32(E1000_WUFC, 0);
}
- *enable_wake = wufc || adapter->en_mng_pt;
- if (!*enable_wake)
+ wake = wufc || adapter->en_mng_pt;
+ if (!wake)
igb_power_down_link(adapter);
else
igb_power_up_link(adapter);
+ if (enable_wake)
+ *enable_wake = wake;
+
/* Release control of h/w to f/w. If f/w is AMT enabled, this
* would have already happened in close and is redundant.
*/
static int __maybe_unused igb_suspend(struct device *dev)
{
- int retval;
- bool wake;
- struct pci_dev *pdev = to_pci_dev(dev);
-
- retval = __igb_shutdown(pdev, &wake, 0);
- if (retval)
- return retval;
-
- if (wake) {
- pci_prepare_to_sleep(pdev);
- } else {
- pci_wake_from_d3(pdev, false);
- pci_set_power_state(pdev, PCI_D3hot);
- }
-
- return 0;
+ return __igb_shutdown(to_pci_dev(dev), NULL, 0);
}
static int __maybe_unused igb_resume(struct device *dev)
static int __maybe_unused igb_runtime_suspend(struct device *dev)
{
- struct pci_dev *pdev = to_pci_dev(dev);
- int retval;
- bool wake;
-
- retval = __igb_shutdown(pdev, &wake, 1);
- if (retval)
- return retval;
-
- if (wake) {
- pci_prepare_to_sleep(pdev);
- } else {
- pci_wake_from_d3(pdev, false);
- pci_set_power_state(pdev, PCI_D3hot);
- }
-
- return 0;
+ return __igb_shutdown(to_pci_dev(dev), NULL, 1);
}
static int __maybe_unused igb_runtime_resume(struct device *dev)
struct pci_dev *pdev = adapter->pdev;
struct device *dev = &adapter->netdev->dev;
struct mii_bus *bus;
+ int err = -ENODEV;
- adapter->mii_bus = devm_mdiobus_alloc(dev);
- if (!adapter->mii_bus)
+ bus = devm_mdiobus_alloc(dev);
+ if (!bus)
return -ENOMEM;
- bus = adapter->mii_bus;
-
switch (hw->device_id) {
/* C3000 SoCs */
case IXGBE_DEV_ID_X550EM_A_KR:
*/
hw->phy.mdio.mode_support = MDIO_SUPPORTS_C45 | MDIO_SUPPORTS_C22;
- return mdiobus_register(bus);
+ err = mdiobus_register(bus);
+ if (!err) {
+ adapter->mii_bus = bus;
+ return 0;
+ }
ixgbe_no_mii_bus:
devm_mdiobus_free(dev, bus);
- adapter->mii_bus = NULL;
- return -ENODEV;
+ return err;
}
/**
if (!eproto)
return -EINVAL;
- if (ext != MLX5_CAP_PCAM_FEATURE(dev, ptys_extended_ethernet))
- return -EOPNOTSUPP;
-
err = mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_EN, port);
if (err)
return err;
return err;
}
-/* xoff = ((301+2.16 * len [m]) * speed [Gbps] + 2.72 MTU [B]) */
+/* xoff = ((301+2.16 * len [m]) * speed [Gbps] + 2.72 MTU [B])
+ * minimum speed value is 40Gbps
+ */
static u32 calculate_xoff(struct mlx5e_priv *priv, unsigned int mtu)
{
u32 speed;
int err;
err = mlx5e_port_linkspeed(priv->mdev, &speed);
- if (err) {
- mlx5_core_warn(priv->mdev, "cannot get port speed\n");
- return 0;
- }
+ if (err)
+ speed = SPEED_40000;
+ speed = max_t(u32, speed, SPEED_40000);
xoff = (301 + 216 * priv->dcbx.cable_len / 100) * speed / 1000 + 272 * mtu / 100;
}
static int update_xoff_threshold(struct mlx5e_port_buffer *port_buffer,
- u32 xoff, unsigned int mtu)
+ u32 xoff, unsigned int max_mtu)
{
int i;
}
if (port_buffer->buffer[i].size <
- (xoff + mtu + (1 << MLX5E_BUFFER_CELL_SHIFT)))
+ (xoff + max_mtu + (1 << MLX5E_BUFFER_CELL_SHIFT)))
return -ENOMEM;
port_buffer->buffer[i].xoff = port_buffer->buffer[i].size - xoff;
- port_buffer->buffer[i].xon = port_buffer->buffer[i].xoff - mtu;
+ port_buffer->buffer[i].xon =
+ port_buffer->buffer[i].xoff - max_mtu;
}
return 0;
/**
* update_buffer_lossy()
- * mtu: device's MTU
+ * max_mtu: netdev's max_mtu
* pfc_en: <input> current pfc configuration
* buffer: <input> current prio to buffer mapping
* xoff: <input> xoff value
* Return 0 if no error.
* Set change to true if buffer configuration is modified.
*/
-static int update_buffer_lossy(unsigned int mtu,
+static int update_buffer_lossy(unsigned int max_mtu,
u8 pfc_en, u8 *buffer, u32 xoff,
struct mlx5e_port_buffer *port_buffer,
bool *change)
}
if (changed) {
- err = update_xoff_threshold(port_buffer, xoff, mtu);
+ err = update_xoff_threshold(port_buffer, xoff, max_mtu);
if (err)
return err;
return 0;
}
+#define MINIMUM_MAX_MTU 9216
int mlx5e_port_manual_buffer_config(struct mlx5e_priv *priv,
u32 change, unsigned int mtu,
struct ieee_pfc *pfc,
bool update_prio2buffer = false;
u8 buffer[MLX5E_MAX_PRIORITY];
bool update_buffer = false;
+ unsigned int max_mtu;
u32 total_used = 0;
u8 curr_pfc_en;
int err;
int i;
mlx5e_dbg(HW, priv, "%s: change=%x\n", __func__, change);
+ max_mtu = max_t(unsigned int, priv->netdev->max_mtu, MINIMUM_MAX_MTU);
err = mlx5e_port_query_buffer(priv, &port_buffer);
if (err)
if (change & MLX5E_PORT_BUFFER_CABLE_LEN) {
update_buffer = true;
- err = update_xoff_threshold(&port_buffer, xoff, mtu);
+ err = update_xoff_threshold(&port_buffer, xoff, max_mtu);
if (err)
return err;
}
if (err)
return err;
- err = update_buffer_lossy(mtu, pfc->pfc_en, buffer, xoff,
+ err = update_buffer_lossy(max_mtu, pfc->pfc_en, buffer, xoff,
&port_buffer, &update_buffer);
if (err)
return err;
if (err)
return err;
- err = update_buffer_lossy(mtu, curr_pfc_en, prio2buffer, xoff,
- &port_buffer, &update_buffer);
+ err = update_buffer_lossy(max_mtu, curr_pfc_en, prio2buffer,
+ xoff, &port_buffer, &update_buffer);
if (err)
return err;
}
return -EINVAL;
update_buffer = true;
- err = update_xoff_threshold(&port_buffer, xoff, mtu);
+ err = update_xoff_threshold(&port_buffer, xoff, max_mtu);
if (err)
return err;
}
/* Need to update buffer configuration if xoff value is changed */
if (!update_buffer && xoff != priv->dcbx.xoff) {
update_buffer = true;
- err = update_xoff_threshold(&port_buffer, xoff, mtu);
+ err = update_xoff_threshold(&port_buffer, xoff, max_mtu);
if (err)
return err;
}
if (err)
return err;
+ mutex_lock(&mdev->mlx5e_res.td.list_lock);
list_add(&tir->list, &mdev->mlx5e_res.td.tirs_list);
+ mutex_unlock(&mdev->mlx5e_res.td.list_lock);
return 0;
}
void mlx5e_destroy_tir(struct mlx5_core_dev *mdev,
struct mlx5e_tir *tir)
{
+ mutex_lock(&mdev->mlx5e_res.td.list_lock);
mlx5_core_destroy_tir(mdev, tir->tirn);
list_del(&tir->list);
+ mutex_unlock(&mdev->mlx5e_res.td.list_lock);
}
static int mlx5e_create_mkey(struct mlx5_core_dev *mdev, u32 pdn,
}
INIT_LIST_HEAD(&mdev->mlx5e_res.td.tirs_list);
+ mutex_init(&mdev->mlx5e_res.td.list_lock);
return 0;
{
struct mlx5_core_dev *mdev = priv->mdev;
struct mlx5e_tir *tir;
- int err = -ENOMEM;
+ int err = 0;
u32 tirn = 0;
int inlen;
void *in;
inlen = MLX5_ST_SZ_BYTES(modify_tir_in);
in = kvzalloc(inlen, GFP_KERNEL);
- if (!in)
+ if (!in) {
+ err = -ENOMEM;
goto out;
+ }
if (enable_uc_lb)
MLX5_SET(modify_tir_in, in, ctx.self_lb_block,
MLX5_SET(modify_tir_in, in, bitmask.self_lb_en, 1);
+ mutex_lock(&mdev->mlx5e_res.td.list_lock);
list_for_each_entry(tir, &mdev->mlx5e_res.td.tirs_list, list) {
tirn = tir->tirn;
err = mlx5_core_modify_tir(mdev, tirn, in, inlen);
kvfree(in);
if (err)
netdev_err(priv->netdev, "refresh tir(0x%x) failed, %d\n", tirn, err);
+ mutex_unlock(&mdev->mlx5e_res.td.list_lock);
return err;
}
__ETHTOOL_LINK_MODE_MASK_NBITS);
}
-static void ptys2ethtool_adver_link(struct mlx5_core_dev *mdev,
- unsigned long *advertising_modes,
- u32 eth_proto_cap)
+static void ptys2ethtool_adver_link(unsigned long *advertising_modes,
+ u32 eth_proto_cap, bool ext)
{
unsigned long proto_cap = eth_proto_cap;
struct ptys2ethtool_config *table;
u32 max_size;
int proto;
- mlx5e_ethtool_get_speed_arr(mdev, &table, &max_size);
+ table = ext ? ptys2ext_ethtool_table : ptys2legacy_ethtool_table;
+ max_size = ext ? ARRAY_SIZE(ptys2ext_ethtool_table) :
+ ARRAY_SIZE(ptys2legacy_ethtool_table);
+
for_each_set_bit(proto, &proto_cap, max_size)
bitmap_or(advertising_modes, advertising_modes,
table[proto].advertised,
ethtool_link_ksettings_add_link_mode(link_ksettings, supported, Pause);
}
-static void get_advertising(struct mlx5_core_dev *mdev, u32 eth_proto_cap,
- u8 tx_pause, u8 rx_pause,
- struct ethtool_link_ksettings *link_ksettings)
+static void get_advertising(u32 eth_proto_cap, u8 tx_pause, u8 rx_pause,
+ struct ethtool_link_ksettings *link_ksettings,
+ bool ext)
{
unsigned long *advertising = link_ksettings->link_modes.advertising;
- ptys2ethtool_adver_link(mdev, advertising, eth_proto_cap);
+ ptys2ethtool_adver_link(advertising, eth_proto_cap, ext);
if (rx_pause)
ethtool_link_ksettings_add_link_mode(link_ksettings, advertising, Pause);
struct ethtool_link_ksettings *link_ksettings)
{
unsigned long *lp_advertising = link_ksettings->link_modes.lp_advertising;
+ bool ext = MLX5_CAP_PCAM_FEATURE(mdev, ptys_extended_ethernet);
- ptys2ethtool_adver_link(mdev, lp_advertising, eth_proto_lp);
+ ptys2ethtool_adver_link(lp_advertising, eth_proto_lp, ext);
}
int mlx5e_ethtool_get_link_ksettings(struct mlx5e_priv *priv,
u8 an_disable_admin;
u8 an_status;
u8 connector_type;
+ bool admin_ext;
bool ext;
int err;
eth_proto_capability);
eth_proto_admin = MLX5_GET_ETH_PROTO(ptys_reg, out, ext,
eth_proto_admin);
+ /* Fields: eth_proto_admin and ext_eth_proto_admin are
+ * mutually exclusive. Hence try reading legacy advertising
+ * when extended advertising is zero.
+ * admin_ext indicates how eth_proto_admin should be
+ * interpreted
+ */
+ admin_ext = ext;
+ if (ext && !eth_proto_admin) {
+ eth_proto_admin = MLX5_GET_ETH_PROTO(ptys_reg, out, false,
+ eth_proto_admin);
+ admin_ext = false;
+ }
+
eth_proto_oper = MLX5_GET_ETH_PROTO(ptys_reg, out, ext,
eth_proto_oper);
eth_proto_lp = MLX5_GET(ptys_reg, out, eth_proto_lp_advertise);
ethtool_link_ksettings_zero_link_mode(link_ksettings, advertising);
get_supported(mdev, eth_proto_cap, link_ksettings);
- get_advertising(mdev, eth_proto_admin, tx_pause, rx_pause, link_ksettings);
+ get_advertising(eth_proto_admin, tx_pause, rx_pause, link_ksettings,
+ admin_ext);
get_speed_duplex(priv->netdev, eth_proto_oper, link_ksettings);
eth_proto_oper = eth_proto_oper ? eth_proto_oper : eth_proto_cap;
#define MLX5E_PTYS_EXT ((1ULL << ETHTOOL_LINK_MODE_50000baseKR_Full_BIT) - 1)
- ext_requested = (link_ksettings->link_modes.advertising[0] >
- MLX5E_PTYS_EXT);
+ ext_requested = !!(link_ksettings->link_modes.advertising[0] >
+ MLX5E_PTYS_EXT ||
+ link_ksettings->link_modes.advertising[1]);
ext_supported = MLX5_CAP_PCAM_FEATURE(mdev, ptys_extended_ethernet);
-
- /*when ptys_extended_ethernet is set legacy link modes are deprecated */
- if (ext_requested != ext_supported)
- return -EPROTONOSUPPORT;
+ ext_requested &= ext_supported;
speed = link_ksettings->base.speed;
ethtool2ptys_adver_func = ext_requested ?
mlx5e_ethtool2ptys_ext_adver_link :
mlx5e_ethtool2ptys_adver_link;
- err = mlx5_port_query_eth_proto(mdev, 1, ext_supported, &eproto);
+ err = mlx5_port_query_eth_proto(mdev, 1, ext_requested, &eproto);
if (err) {
netdev_err(priv->netdev, "%s: query port eth proto failed: %d\n",
__func__, err);
if (!an_changes && link_modes == eproto.admin)
goto out;
- mlx5_port_set_eth_ptys(mdev, an_disable, link_modes, ext_supported);
+ mlx5_port_set_eth_ptys(mdev, an_disable, link_modes, ext_requested);
mlx5_toggle_port_link(mdev);
out:
return true;
}
+struct ip_ttl_word {
+ __u8 ttl;
+ __u8 protocol;
+ __sum16 check;
+};
+
+struct ipv6_hoplimit_word {
+ __be16 payload_len;
+ __u8 nexthdr;
+ __u8 hop_limit;
+};
+
+static bool is_action_keys_supported(const struct flow_action_entry *act)
+{
+ u32 mask, offset;
+ u8 htype;
+
+ htype = act->mangle.htype;
+ offset = act->mangle.offset;
+ mask = ~act->mangle.mask;
+ /* For IPv4 & IPv6 header check 4 byte word,
+ * to determine that modified fields
+ * are NOT ttl & hop_limit only.
+ */
+ if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4) {
+ struct ip_ttl_word *ttl_word =
+ (struct ip_ttl_word *)&mask;
+
+ if (offset != offsetof(struct iphdr, ttl) ||
+ ttl_word->protocol ||
+ ttl_word->check) {
+ return true;
+ }
+ } else if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6) {
+ struct ipv6_hoplimit_word *hoplimit_word =
+ (struct ipv6_hoplimit_word *)&mask;
+
+ if (offset != offsetof(struct ipv6hdr, payload_len) ||
+ hoplimit_word->payload_len ||
+ hoplimit_word->nexthdr) {
+ return true;
+ }
+ }
+ return false;
+}
+
static bool modify_header_match_supported(struct mlx5_flow_spec *spec,
struct flow_action *flow_action,
u32 actions,
{
const struct flow_action_entry *act;
bool modify_ip_header;
- u8 htype, ip_proto;
void *headers_v;
u16 ethertype;
+ u8 ip_proto;
int i;
if (actions & MLX5_FLOW_CONTEXT_ACTION_DECAP)
act->id != FLOW_ACTION_ADD)
continue;
- htype = act->mangle.htype;
- if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4 ||
- htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6) {
+ if (is_action_keys_supported(act)) {
modify_ip_header = true;
break;
}
return 0;
}
-static inline int cmp_encap_info(struct ip_tunnel_key *a,
- struct ip_tunnel_key *b)
+struct encap_key {
+ struct ip_tunnel_key *ip_tun_key;
+ int tunnel_type;
+};
+
+static inline int cmp_encap_info(struct encap_key *a,
+ struct encap_key *b)
{
- return memcmp(a, b, sizeof(*a));
+ return memcmp(a->ip_tun_key, b->ip_tun_key, sizeof(*a->ip_tun_key)) ||
+ a->tunnel_type != b->tunnel_type;
}
-static inline int hash_encap_info(struct ip_tunnel_key *key)
+static inline int hash_encap_info(struct encap_key *key)
{
- return jhash(key, sizeof(*key), 0);
+ return jhash(key->ip_tun_key, sizeof(*key->ip_tun_key),
+ key->tunnel_type);
}
struct mlx5_esw_flow_attr *attr = flow->esw_attr;
struct mlx5e_tc_flow_parse_attr *parse_attr;
struct ip_tunnel_info *tun_info;
- struct ip_tunnel_key *key;
+ struct encap_key key, e_key;
struct mlx5e_encap_entry *e;
unsigned short family;
uintptr_t hash_key;
parse_attr = attr->parse_attr;
tun_info = &parse_attr->tun_info[out_index];
family = ip_tunnel_info_af(tun_info);
- key = &tun_info->key;
+ key.ip_tun_key = &tun_info->key;
+ key.tunnel_type = mlx5e_tc_tun_get_type(mirred_dev);
- hash_key = hash_encap_info(key);
+ hash_key = hash_encap_info(&key);
hash_for_each_possible_rcu(esw->offloads.encap_tbl, e,
encap_hlist, hash_key) {
- if (!cmp_encap_info(&e->tun_info.key, key)) {
+ e_key.ip_tun_key = &e->tun_info.key;
+ e_key.tunnel_type = e->tunnel_type;
+ if (!cmp_encap_info(&e_key, &key)) {
found = true;
break;
}
if (hdrs[TCA_PEDIT_KEY_EX_CMD_SET].pedits ||
hdrs[TCA_PEDIT_KEY_EX_CMD_ADD].pedits) {
- err = alloc_tc_pedit_action(priv, MLX5_FLOW_NAMESPACE_KERNEL,
+ err = alloc_tc_pedit_action(priv, MLX5_FLOW_NAMESPACE_FDB,
parse_attr, hdrs, extack);
if (err)
return err;
opcode, MLX5_CMD_OP_MODIFY_NIC_VPORT_CONTEXT);
MLX5_SET(modify_nic_vport_context_in, in, field_select.change_event, 1);
MLX5_SET(modify_nic_vport_context_in, in, vport_number, vport);
- if (vport)
- MLX5_SET(modify_nic_vport_context_in, in, other_vport, 1);
+ MLX5_SET(modify_nic_vport_context_in, in, other_vport, 1);
nic_vport_ctx = MLX5_ADDR_OF(modify_nic_vport_context_in,
in, nic_vport_context);
MLX5_SET(modify_esw_vport_context_in, in, opcode,
MLX5_CMD_OP_MODIFY_ESW_VPORT_CONTEXT);
MLX5_SET(modify_esw_vport_context_in, in, vport_number, vport);
- if (vport)
- MLX5_SET(modify_esw_vport_context_in, in, other_vport, 1);
+ MLX5_SET(modify_esw_vport_context_in, in, other_vport, 1);
return mlx5_cmd_exec(dev, in, inlen, out, sizeof(out));
}
{
int err;
+ memset(&esw->fdb_table.legacy, 0, sizeof(struct legacy_fdb));
+
err = esw_create_legacy_vepa_table(esw);
if (err)
return err;
/* Star rule to forward all traffic to uplink vport */
memset(spec, 0, sizeof(*spec));
+ memset(&dest, 0, sizeof(dest));
dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
dest.vport.num = MLX5_VPORT_UPLINK;
flow_act.action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
{
int err;
+ memset(&esw->fdb_table.offloads, 0, sizeof(struct offloads_fdb));
mutex_init(&esw->fdb_table.offloads.fdb_prio_lock);
err = esw_create_offloads_fdb_tables(esw, nvports);
void *cmd;
int ret;
+ rcu_read_lock();
+ flow = idr_find(&mdev->fpga->tls->rx_idr, ntohl(handle));
+ rcu_read_unlock();
+
+ if (!flow) {
+ WARN_ONCE(1, "Received NULL pointer for handle\n");
+ return -EINVAL;
+ }
+
buf = kzalloc(size, GFP_ATOMIC);
if (!buf)
return -ENOMEM;
cmd = (buf + 1);
- rcu_read_lock();
- flow = idr_find(&mdev->fpga->tls->rx_idr, ntohl(handle));
- rcu_read_unlock();
mlx5_fpga_tls_flow_to_cmd(flow, cmd);
MLX5_SET(tls_cmd, cmd, swid, ntohl(handle));
buf->complete = mlx_tls_kfree_complete;
ret = mlx5_fpga_sbu_conn_sendmsg(mdev->fpga->tls->conn, buf);
+ if (ret < 0)
+ kfree(buf);
return ret;
}
.size = 8,
.limit = 4
},
- .mr_cache[16] = {
- .size = 8,
- .limit = 4
- },
- .mr_cache[17] = {
- .size = 8,
- .limit = 4
- },
- .mr_cache[18] = {
- .size = 8,
- .limit = 4
- },
- .mr_cache[19] = {
- .size = 4,
- .limit = 2
- },
- .mr_cache[20] = {
- .size = 4,
- .limit = 2
- },
},
};
tmp_push_vlan_tci =
FIELD_PREP(NFP_FL_PUSH_VLAN_PRIO, act->vlan.prio) |
- FIELD_PREP(NFP_FL_PUSH_VLAN_VID, act->vlan.vid) |
- NFP_FL_PUSH_VLAN_CFI;
+ FIELD_PREP(NFP_FL_PUSH_VLAN_VID, act->vlan.vid);
push_vlan->vlan_tci = cpu_to_be16(tmp_push_vlan_tci);
}
#define NFP_FLOWER_LAYER2_GENEVE_OP BIT(6)
#define NFP_FLOWER_MASK_VLAN_PRIO GENMASK(15, 13)
-#define NFP_FLOWER_MASK_VLAN_CFI BIT(12)
+#define NFP_FLOWER_MASK_VLAN_PRESENT BIT(12)
#define NFP_FLOWER_MASK_VLAN_VID GENMASK(11, 0)
#define NFP_FLOWER_MASK_MPLS_LB GENMASK(31, 12)
#define NFP_FL_OUT_FLAGS_TYPE_IDX GENMASK(2, 0)
#define NFP_FL_PUSH_VLAN_PRIO GENMASK(15, 13)
-#define NFP_FL_PUSH_VLAN_CFI BIT(12)
#define NFP_FL_PUSH_VLAN_VID GENMASK(11, 0)
#define IPV6_FLOW_LABEL_MASK cpu_to_be32(0x000fffff)
flow_rule_match_vlan(rule, &match);
/* Populate the tci field. */
- if (match.key->vlan_id || match.key->vlan_priority) {
- tmp_tci = FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO,
- match.key->vlan_priority) |
- FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID,
- match.key->vlan_id) |
- NFP_FLOWER_MASK_VLAN_CFI;
- ext->tci = cpu_to_be16(tmp_tci);
- tmp_tci = FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO,
- match.mask->vlan_priority) |
- FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID,
- match.mask->vlan_id) |
- NFP_FLOWER_MASK_VLAN_CFI;
- msk->tci = cpu_to_be16(tmp_tci);
- }
+ tmp_tci = NFP_FLOWER_MASK_VLAN_PRESENT;
+ tmp_tci |= FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO,
+ match.key->vlan_priority) |
+ FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID,
+ match.key->vlan_id);
+ ext->tci = cpu_to_be16(tmp_tci);
+
+ tmp_tci = NFP_FLOWER_MASK_VLAN_PRESENT;
+ tmp_tci |= FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO,
+ match.mask->vlan_priority) |
+ FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID,
+ match.mask->vlan_id);
+ msk->tci = cpu_to_be16(tmp_tci);
}
}
ret = dev_queue_xmit(skb);
nfp_repr_inc_tx_stats(netdev, len, ret);
- return ret;
+ return NETDEV_TX_OK;
}
static int nfp_repr_stop(struct net_device *netdev)
netdev->features &= ~(NETIF_F_TSO | NETIF_F_TSO6);
netdev->gso_max_segs = NFP_NET_LSO_MAX_SEGS;
- netdev->priv_flags |= IFF_NO_QUEUE;
+ netdev->priv_flags |= IFF_NO_QUEUE | IFF_DISABLE_NETPOLL;
netdev->features |= NETIF_F_LLTX;
if (nfp_app_has_tc(app)) {
tp->cp_cmd |= PktCntrDisable | INTT_1;
RTL_W16(tp, CPlusCmd, tp->cp_cmd);
- RTL_W16(tp, IntrMitigate, 0x5151);
+ RTL_W16(tp, IntrMitigate, 0x5100);
/* Work around for RxFIFO overflow. */
if (tp->mac_version == RTL_GIGA_MAC_VER_11) {
/* Specific functions used for Ring mode */
/* Enhanced descriptors */
-static inline void ehn_desc_rx_set_on_ring(struct dma_desc *p, int end)
+static inline void ehn_desc_rx_set_on_ring(struct dma_desc *p, int end,
+ int bfsize)
{
- p->des1 |= cpu_to_le32((BUF_SIZE_8KiB
- << ERDES1_BUFFER2_SIZE_SHIFT)
- & ERDES1_BUFFER2_SIZE_MASK);
+ if (bfsize == BUF_SIZE_16KiB)
+ p->des1 |= cpu_to_le32((BUF_SIZE_8KiB
+ << ERDES1_BUFFER2_SIZE_SHIFT)
+ & ERDES1_BUFFER2_SIZE_MASK);
if (end)
p->des1 |= cpu_to_le32(ERDES1_END_RING);
}
/* Normal descriptors */
-static inline void ndesc_rx_set_on_ring(struct dma_desc *p, int end)
+static inline void ndesc_rx_set_on_ring(struct dma_desc *p, int end, int bfsize)
{
- p->des1 |= cpu_to_le32(((BUF_SIZE_2KiB - 1)
- << RDES1_BUFFER2_SIZE_SHIFT)
- & RDES1_BUFFER2_SIZE_MASK);
+ if (bfsize >= BUF_SIZE_2KiB) {
+ int bfsize2;
+
+ bfsize2 = min(bfsize - BUF_SIZE_2KiB + 1, BUF_SIZE_2KiB - 1);
+ p->des1 |= cpu_to_le32((bfsize2 << RDES1_BUFFER2_SIZE_SHIFT)
+ & RDES1_BUFFER2_SIZE_MASK);
+ }
if (end)
p->des1 |= cpu_to_le32(RDES1_END_RING);
}
static void dwmac4_rd_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
- int mode, int end)
+ int mode, int end, int bfsize)
{
dwmac4_set_rx_owner(p, disable_rx_ic);
}
}
static void dwxgmac2_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
- int mode, int end)
+ int mode, int end, int bfsize)
{
dwxgmac2_set_rx_owner(p, disable_rx_ic);
}
if (unlikely(rdes0 & RDES0_OWN))
return dma_own;
+ if (unlikely(!(rdes0 & RDES0_LAST_DESCRIPTOR))) {
+ stats->rx_length_errors++;
+ return discard_frame;
+ }
+
if (unlikely(rdes0 & RDES0_ERROR_SUMMARY)) {
if (unlikely(rdes0 & RDES0_DESCRIPTOR_ERROR)) {
x->rx_desc++;
* It doesn't match with the information reported into the databook.
* At any rate, we need to understand if the CSUM hw computation is ok
* and report this info to the upper layers. */
- ret = enh_desc_coe_rdes0(!!(rdes0 & RDES0_IPC_CSUM_ERROR),
- !!(rdes0 & RDES0_FRAME_TYPE),
- !!(rdes0 & ERDES0_RX_MAC_ADDR));
+ if (likely(ret == good_frame))
+ ret = enh_desc_coe_rdes0(!!(rdes0 & RDES0_IPC_CSUM_ERROR),
+ !!(rdes0 & RDES0_FRAME_TYPE),
+ !!(rdes0 & ERDES0_RX_MAC_ADDR));
if (unlikely(rdes0 & RDES0_DRIBBLING))
x->dribbling_bit++;
}
static void enh_desc_init_rx_desc(struct dma_desc *p, int disable_rx_ic,
- int mode, int end)
+ int mode, int end, int bfsize)
{
+ int bfsize1;
+
p->des0 |= cpu_to_le32(RDES0_OWN);
- p->des1 |= cpu_to_le32(BUF_SIZE_8KiB & ERDES1_BUFFER1_SIZE_MASK);
+
+ bfsize1 = min(bfsize, BUF_SIZE_8KiB);
+ p->des1 |= cpu_to_le32(bfsize1 & ERDES1_BUFFER1_SIZE_MASK);
if (mode == STMMAC_CHAIN_MODE)
ehn_desc_rx_set_on_chain(p);
else
- ehn_desc_rx_set_on_ring(p, end);
+ ehn_desc_rx_set_on_ring(p, end, bfsize);
if (disable_rx_ic)
p->des1 |= cpu_to_le32(ERDES1_DISABLE_IC);
struct stmmac_desc_ops {
/* DMA RX descriptor ring initialization */
void (*init_rx_desc)(struct dma_desc *p, int disable_rx_ic, int mode,
- int end);
+ int end, int bfsize);
/* DMA TX descriptor ring initialization */
void (*init_tx_desc)(struct dma_desc *p, int mode, int end);
/* Invoked by the xmit function to prepare the tx descriptor */
return dma_own;
if (unlikely(!(rdes0 & RDES0_LAST_DESCRIPTOR))) {
- pr_warn("%s: Oversized frame spanned multiple buffers\n",
- __func__);
stats->rx_length_errors++;
return discard_frame;
}
}
static void ndesc_init_rx_desc(struct dma_desc *p, int disable_rx_ic, int mode,
- int end)
+ int end, int bfsize)
{
+ int bfsize1;
+
p->des0 |= cpu_to_le32(RDES0_OWN);
- p->des1 |= cpu_to_le32((BUF_SIZE_2KiB - 1) & RDES1_BUFFER1_SIZE_MASK);
+
+ bfsize1 = min(bfsize, BUF_SIZE_2KiB - 1);
+ p->des1 |= cpu_to_le32(bfsize & RDES1_BUFFER1_SIZE_MASK);
if (mode == STMMAC_CHAIN_MODE)
ndesc_rx_set_on_chain(p, end);
else
- ndesc_rx_set_on_ring(p, end);
+ ndesc_rx_set_on_ring(p, end, bfsize);
if (disable_rx_ic)
p->des1 |= cpu_to_le32(RDES1_DISABLE_IC);
if (priv->extend_desc)
stmmac_init_rx_desc(priv, &rx_q->dma_erx[i].basic,
priv->use_riwt, priv->mode,
- (i == DMA_RX_SIZE - 1));
+ (i == DMA_RX_SIZE - 1),
+ priv->dma_buf_sz);
else
stmmac_init_rx_desc(priv, &rx_q->dma_rx[i],
priv->use_riwt, priv->mode,
- (i == DMA_RX_SIZE - 1));
+ (i == DMA_RX_SIZE - 1),
+ priv->dma_buf_sz);
}
/**
{
struct stmmac_rx_queue *rx_q = &priv->rx_queue[queue];
struct stmmac_channel *ch = &priv->channel[queue];
- unsigned int entry = rx_q->cur_rx;
+ unsigned int next_entry = rx_q->cur_rx;
int coe = priv->hw->rx_csum;
- unsigned int next_entry;
unsigned int count = 0;
bool xmac;
stmmac_display_ring(priv, rx_head, DMA_RX_SIZE, true);
}
while (count < limit) {
- int status;
+ int entry, status;
struct dma_desc *p;
struct dma_desc *np;
+ entry = next_entry;
+
if (priv->extend_desc)
p = (struct dma_desc *)(rx_q->dma_erx + entry);
else
* ignored
*/
if (frame_len > priv->dma_buf_sz) {
- netdev_err(priv->dev,
- "len %d larger than size (%d)\n",
- frame_len, priv->dma_buf_sz);
+ if (net_ratelimit())
+ netdev_err(priv->dev,
+ "len %d larger than size (%d)\n",
+ frame_len, priv->dma_buf_sz);
priv->dev->stats.rx_length_errors++;
- break;
+ continue;
}
/* ACS is set; GMAC core strips PAD/FCS for IEEE 802.3
dev_warn(priv->device,
"packet dropped\n");
priv->dev->stats.rx_dropped++;
- break;
+ continue;
}
dma_sync_single_for_cpu(priv->device,
} else {
skb = rx_q->rx_skbuff[entry];
if (unlikely(!skb)) {
- netdev_err(priv->dev,
- "%s: Inconsistent Rx chain\n",
- priv->dev->name);
+ if (net_ratelimit())
+ netdev_err(priv->dev,
+ "%s: Inconsistent Rx chain\n",
+ priv->dev->name);
priv->dev->stats.rx_dropped++;
- break;
+ continue;
}
prefetch(skb->data - NET_IP_ALIGN);
rx_q->rx_skbuff[entry] = NULL;
priv->dev->stats.rx_packets++;
priv->dev->stats.rx_bytes += frame_len;
}
- entry = next_entry;
}
stmmac_rx_refill(priv, queue);
wait_queue_head_t wait_drain;
bool destroy;
+ bool tx_disable; /* if true, do not wake up queue again */
/* Receive buffer allocated by us but manages by NetVSP */
void *recv_buf;
init_waitqueue_head(&net_device->wait_drain);
net_device->destroy = false;
+ net_device->tx_disable = false;
net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT;
net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT;
} else {
struct netdev_queue *txq = netdev_get_tx_queue(ndev, q_idx);
- if (netif_tx_queue_stopped(txq) &&
+ if (netif_tx_queue_stopped(txq) && !net_device->tx_disable &&
(hv_get_avail_to_write_percent(&channel->outbound) >
RING_AVAIL_PERCENT_HIWATER || queue_sends < 1)) {
netif_tx_wake_queue(txq);
} else if (ret == -EAGAIN) {
netif_tx_stop_queue(txq);
ndev_ctx->eth_stats.stop_queue++;
- if (atomic_read(&nvchan->queue_sends) < 1) {
+ if (atomic_read(&nvchan->queue_sends) < 1 &&
+ !net_device->tx_disable) {
netif_tx_wake_queue(txq);
ndev_ctx->eth_stats.wake_queue++;
ret = -ENOSPC;
rcu_read_unlock();
}
+static void netvsc_tx_enable(struct netvsc_device *nvscdev,
+ struct net_device *ndev)
+{
+ nvscdev->tx_disable = false;
+ virt_wmb(); /* ensure queue wake up mechanism is on */
+
+ netif_tx_wake_all_queues(ndev);
+}
+
static int netvsc_open(struct net_device *net)
{
struct net_device_context *ndev_ctx = netdev_priv(net);
rdev = nvdev->extension;
if (!rdev->link_state) {
netif_carrier_on(net);
- netif_tx_wake_all_queues(net);
+ netvsc_tx_enable(nvdev, net);
}
if (vf_netdev) {
}
}
+static void netvsc_tx_disable(struct netvsc_device *nvscdev,
+ struct net_device *ndev)
+{
+ if (nvscdev) {
+ nvscdev->tx_disable = true;
+ virt_wmb(); /* ensure txq will not wake up after stop */
+ }
+
+ netif_tx_disable(ndev);
+}
+
static int netvsc_close(struct net_device *net)
{
struct net_device_context *net_device_ctx = netdev_priv(net);
struct netvsc_device *nvdev = rtnl_dereference(net_device_ctx->nvdev);
int ret;
- netif_tx_disable(net);
+ netvsc_tx_disable(nvdev, net);
/* No need to close rndis filter if it is removed already */
if (!nvdev)
/* If device was up (receiving) then shutdown */
if (netif_running(ndev)) {
- netif_tx_disable(ndev);
+ netvsc_tx_disable(nvdev, ndev);
ret = rndis_filter_close(nvdev);
if (ret) {
if (rdev->link_state) {
rdev->link_state = false;
netif_carrier_on(net);
- netif_tx_wake_all_queues(net);
+ netvsc_tx_enable(net_device, net);
} else {
notify = true;
}
if (!rdev->link_state) {
rdev->link_state = true;
netif_carrier_off(net);
- netif_tx_stop_all_queues(net);
+ netvsc_tx_disable(net_device, net);
}
kfree(event);
break;
if (!rdev->link_state) {
rdev->link_state = true;
netif_carrier_off(net);
- netif_tx_stop_all_queues(net);
+ netvsc_tx_disable(net_device, net);
event->event = RNDIS_STATUS_MEDIA_CONNECT;
spin_lock_irqsave(&ndev_ctx->lock, flags);
list_add(&event->list, &ndev_ctx->reconfig_events);
{QMI_FIXED_INTF(0x19d2, 0x2002, 4)}, /* ZTE (Vodafone) K3765-Z */
{QMI_FIXED_INTF(0x2001, 0x7e19, 4)}, /* D-Link DWM-221 B1 */
{QMI_FIXED_INTF(0x2001, 0x7e35, 4)}, /* D-Link DWM-222 */
+ {QMI_FIXED_INTF(0x2020, 0x2031, 4)}, /* Olicard 600 */
{QMI_FIXED_INTF(0x2020, 0x2033, 4)}, /* BroadMobi BM806U */
{QMI_FIXED_INTF(0x0f3d, 0x68a2, 8)}, /* Sierra Wireless MC7700 */
{QMI_FIXED_INTF(0x114f, 0x68a2, 8)}, /* Sierra Wireless MC7750 */
/* default to no qdisc; user can add if desired */
dev->priv_flags |= IFF_NO_QUEUE;
+ dev->priv_flags |= IFF_NO_RX_HANDLER;
dev->min_mtu = 0;
dev->max_mtu = 0;
#define DBG_IRT(x...)
#endif
+#ifdef CONFIG_64BIT
+#define COMPARE_IRTE_ADDR(irte, hpa) ((irte)->dest_iosapic_addr == (hpa))
+#else
#define COMPARE_IRTE_ADDR(irte, hpa) \
- ((irte)->dest_iosapic_addr == F_EXTEND(hpa))
+ ((irte)->dest_iosapic_addr == ((hpa) | 0xffffffff00000000ULL))
+#endif
#define IOSAPIC_REG_SELECT 0x00
#define IOSAPIC_REG_WINDOW 0x10
will be called rtc-s5m.
config RTC_DRV_SD3078
- tristate "ZXW Crystal SD3078"
+ tristate "ZXW Shenzhen whwave SD3078"
help
- If you say yes here you get support for the ZXW Crystal
+ If you say yes here you get support for the ZXW Shenzhen whwave
SD3078 RTC chips.
This driver can also be built as a module. If so, the module
struct cros_ec_rtc *cros_ec_rtc = dev_get_drvdata(&pdev->dev);
if (device_may_wakeup(dev))
- enable_irq_wake(cros_ec_rtc->cros_ec->irq);
+ return enable_irq_wake(cros_ec_rtc->cros_ec->irq);
return 0;
}
struct cros_ec_rtc *cros_ec_rtc = dev_get_drvdata(&pdev->dev);
if (device_may_wakeup(dev))
- disable_irq_wake(cros_ec_rtc->cros_ec->irq);
+ return disable_irq_wake(cros_ec_rtc->cros_ec->irq);
return 0;
}
da9063_data_to_tm(data, &rtc->alarm_time, rtc);
rtc->rtc_sync = false;
+ /*
+ * TODO: some models have alarms on a minute boundary but still support
+ * real hardware interrupts. Add this once the core supports it.
+ */
+ if (config->rtc_data_start != RTC_SEC)
+ rtc->rtc_dev->uie_unsupported = 1;
+
irq_alarm = platform_get_irq_byname(pdev, "ALARM");
ret = devm_request_threaded_irq(&pdev->dev, irq_alarm, NULL,
da9063_alarm_event,
static inline int sh_rtc_read_alarm_value(struct sh_rtc *rtc, int reg_off)
{
unsigned int byte;
- int value = 0xff; /* return 0xff for ignored values */
+ int value = -1; /* return -1 for ignored values */
byte = readb(rtc->regbase + reg_off);
if (byte & AR_ENB) {
* wake up the thread.
*/
spin_lock(&lpfc_cmd->buf_lock);
- if (unlikely(lpfc_cmd->cur_iocbq.iocb_flag & LPFC_DRIVER_ABORTED)) {
- lpfc_cmd->cur_iocbq.iocb_flag &= ~LPFC_DRIVER_ABORTED;
- if (lpfc_cmd->waitq)
- wake_up(lpfc_cmd->waitq);
+ lpfc_cmd->cur_iocbq.iocb_flag &= ~LPFC_DRIVER_ABORTED;
+ if (lpfc_cmd->waitq) {
+ wake_up(lpfc_cmd->waitq);
lpfc_cmd->waitq = NULL;
}
spin_unlock(&lpfc_cmd->buf_lock);
static int qedi_alloc_nvm_iscsi_cfg(struct qedi_ctx *qedi)
{
- struct qedi_nvm_iscsi_image nvm_image;
-
qedi->iscsi_image = dma_alloc_coherent(&qedi->pdev->dev,
- sizeof(nvm_image),
+ sizeof(struct qedi_nvm_iscsi_image),
&qedi->nvm_buf_dma, GFP_KERNEL);
if (!qedi->iscsi_image) {
QEDI_ERR(&qedi->dbg_ctx, "Could not allocate NVM BUF.\n");
static int qedi_get_boot_info(struct qedi_ctx *qedi)
{
int ret = 1;
- struct qedi_nvm_iscsi_image nvm_image;
QEDI_INFO(&qedi->dbg_ctx, QEDI_LOG_INFO,
"Get NVM iSCSI CFG image\n");
ret = qedi_ops->common->nvm_get_image(qedi->cdev,
QED_NVM_IMAGE_ISCSI_CFG,
(char *)qedi->iscsi_image,
- sizeof(nvm_image));
+ sizeof(struct qedi_nvm_iscsi_image));
if (ret)
QEDI_ERR(&qedi->dbg_ctx,
"Could not get NVM image. ret = %d\n", ret);
{"NETAPP", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
{"LSI", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
{"ENGENIO", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
+ {"LENOVO", "Universal Xport", "*", BLIST_NO_ULD_ATTACH},
{"SMSC", "USB 2 HS-CF", NULL, BLIST_SPARSELUN | BLIST_INQUIRY_36},
{"SONY", "CD-ROM CDU-8001", NULL, BLIST_BORKEN},
{"SONY", "TSL", NULL, BLIST_FORCELUN}, /* DDS3 & DDS4 autoloaders */
{"NETAPP", "INF-01-00", "rdac", },
{"LSI", "INF-01-00", "rdac", },
{"ENGENIO", "INF-01-00", "rdac", },
+ {"LENOVO", "DE_Series", "rdac", },
{NULL, NULL, NULL },
};
* This is the end of Protocol specific defines.
*/
-static int storvsc_ringbuffer_size = (256 * PAGE_SIZE);
+static int storvsc_ringbuffer_size = (128 * 1024);
static u32 max_outstanding_req_per_channel;
static int storvsc_vcpus_per_sub_channel = 4;
{
struct device *dev = &device->device;
struct storvsc_device *stor_device;
- int num_cpus = num_online_cpus();
int num_sc;
struct storvsc_cmd_request *request;
struct vstor_packet *vstor_packet;
int ret, t;
- num_sc = ((max_chns > num_cpus) ? num_cpus : max_chns);
+ /*
+ * If the number of CPUs is artificially restricted, such as
+ * with maxcpus=1 on the kernel boot line, Hyper-V could offer
+ * sub-channels >= the number of CPUs. These sub-channels
+ * should not be created. The primary channel is already created
+ * and assigned to one CPU, so check against # CPUs - 1.
+ */
+ num_sc = min((int)(num_online_cpus() - 1), max_chns);
+ if (!num_sc)
+ return;
+
stor_device = get_out_stor_device(device);
if (!stor_device)
return;
rc = pci_add_dynid(&vfio_pci_driver, vendor, device,
subvendor, subdevice, class, class_mask, 0);
if (rc)
- pr_warn("failed to add dynamic id [%04hx:%04hx[%04hx:%04hx]] class %#08x/%08x (%d)\n",
+ pr_warn("failed to add dynamic id [%04x:%04x[%04x:%04x]] class %#08x/%08x (%d)\n",
vendor, device, subvendor, subdevice,
class, class_mask, rc);
else
- pr_info("add [%04hx:%04hx[%04hx:%04hx]] class %#08x/%08x\n",
+ pr_info("add [%04x:%04x[%04x:%04x]] class %#08x/%08x\n",
vendor, device, subvendor, subdevice,
class, class_mask);
}
mutex_unlock(&container->lock);
}
-const struct vfio_iommu_driver_ops tce_iommu_driver_ops = {
+static const struct vfio_iommu_driver_ops tce_iommu_driver_ops = {
.name = "iommu-vfio-powerpc",
.owner = THIS_MODULE,
.open = tce_iommu_open,
MODULE_PARM_DESC(disable_hugepages,
"Disable VFIO IOMMU support for IOMMU hugepages.");
+static unsigned int dma_entry_limit __read_mostly = U16_MAX;
+module_param_named(dma_entry_limit, dma_entry_limit, uint, 0644);
+MODULE_PARM_DESC(dma_entry_limit,
+ "Maximum number of user DMA mappings per container (65535).");
+
struct vfio_iommu {
struct list_head domain_list;
struct vfio_domain *external_domain; /* domain for external user */
struct mutex lock;
struct rb_root dma_list;
struct blocking_notifier_head notifier;
+ unsigned int dma_avail;
bool v2;
bool nesting;
};
vfio_unlink_dma(iommu, dma);
put_task_struct(dma->task);
kfree(dma);
+ iommu->dma_avail++;
}
static unsigned long vfio_pgsize_bitmap(struct vfio_iommu *iommu)
goto out_unlock;
}
+ if (!iommu->dma_avail) {
+ ret = -ENOSPC;
+ goto out_unlock;
+ }
+
dma = kzalloc(sizeof(*dma), GFP_KERNEL);
if (!dma) {
ret = -ENOMEM;
goto out_unlock;
}
+ iommu->dma_avail--;
dma->iova = iova;
dma->vaddr = vaddr;
dma->prot = prot;
INIT_LIST_HEAD(&iommu->domain_list);
iommu->dma_list = RB_ROOT;
+ iommu->dma_avail = dma_entry_limit;
mutex_init(&iommu->lock);
BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
if (!(vma->vm_flags & VM_SHARED))
return -EINVAL;
- vma_priv = kzalloc(sizeof(*vma_priv) + count * sizeof(void *),
- GFP_KERNEL);
+ vma_priv = kzalloc(struct_size(vma_priv, pages, count), GFP_KERNEL);
if (!vma_priv)
return -ENOMEM;
if (xen_store_evtchn == 0)
return -ENOENT;
- nonseekable_open(inode, filp);
-
- filp->f_mode &= ~FMODE_ATOMIC_POS; /* cdev-style semantics */
+ stream_open(inode, filp);
u = kzalloc(sizeof(*u), GFP_KERNEL);
if (u == NULL)
struct file *file;
struct wait_queue_head *head;
__poll_t events;
- bool woken;
+ bool done;
bool cancelled;
struct wait_queue_entry wait;
struct work_struct work;
struct kioctx *ki_ctx;
kiocb_cancel_fn *ki_cancel;
- struct iocb __user *ki_user_iocb; /* user's aiocb */
- __u64 ki_user_data; /* user's data for completion */
+ struct io_event ki_res;
struct list_head ki_list; /* the aio core uses this
* for cancellation */
/* aio_get_req
* Allocate a slot for an aio request.
* Returns NULL if no requests are free.
+ *
+ * The refcount is initialized to 2 - one for the async op completion,
+ * one for the synchronous code that does this.
*/
static inline struct aio_kiocb *aio_get_req(struct kioctx *ctx)
{
if (unlikely(!req))
return NULL;
+ if (unlikely(!get_reqs_available(ctx))) {
+ kfree(req);
+ return NULL;
+ }
+
percpu_ref_get(&ctx->reqs);
req->ki_ctx = ctx;
INIT_LIST_HEAD(&req->ki_list);
- refcount_set(&req->ki_refcnt, 0);
+ refcount_set(&req->ki_refcnt, 2);
req->ki_eventfd = NULL;
return req;
}
return ret;
}
-static inline void iocb_put(struct aio_kiocb *iocb)
-{
- if (refcount_read(&iocb->ki_refcnt) == 0 ||
- refcount_dec_and_test(&iocb->ki_refcnt)) {
- if (iocb->ki_filp)
- fput(iocb->ki_filp);
- percpu_ref_put(&iocb->ki_ctx->reqs);
- kmem_cache_free(kiocb_cachep, iocb);
- }
-}
-
-static void aio_fill_event(struct io_event *ev, struct aio_kiocb *iocb,
- long res, long res2)
+static inline void iocb_destroy(struct aio_kiocb *iocb)
{
- ev->obj = (u64)(unsigned long)iocb->ki_user_iocb;
- ev->data = iocb->ki_user_data;
- ev->res = res;
- ev->res2 = res2;
+ if (iocb->ki_eventfd)
+ eventfd_ctx_put(iocb->ki_eventfd);
+ if (iocb->ki_filp)
+ fput(iocb->ki_filp);
+ percpu_ref_put(&iocb->ki_ctx->reqs);
+ kmem_cache_free(kiocb_cachep, iocb);
}
/* aio_complete
* Called when the io request on the given iocb is complete.
*/
-static void aio_complete(struct aio_kiocb *iocb, long res, long res2)
+static void aio_complete(struct aio_kiocb *iocb)
{
struct kioctx *ctx = iocb->ki_ctx;
struct aio_ring *ring;
ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
event = ev_page + pos % AIO_EVENTS_PER_PAGE;
- aio_fill_event(event, iocb, res, res2);
+ *event = iocb->ki_res;
kunmap_atomic(ev_page);
flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
- pr_debug("%p[%u]: %p: %p %Lx %lx %lx\n",
- ctx, tail, iocb, iocb->ki_user_iocb, iocb->ki_user_data,
- res, res2);
+ pr_debug("%p[%u]: %p: %p %Lx %Lx %Lx\n", ctx, tail, iocb,
+ (void __user *)(unsigned long)iocb->ki_res.obj,
+ iocb->ki_res.data, iocb->ki_res.res, iocb->ki_res.res2);
/* after flagging the request as done, we
* must never even look at it again
* eventfd. The eventfd_signal() function is safe to be called
* from IRQ context.
*/
- if (iocb->ki_eventfd) {
+ if (iocb->ki_eventfd)
eventfd_signal(iocb->ki_eventfd, 1);
- eventfd_ctx_put(iocb->ki_eventfd);
- }
/*
* We have to order our ring_info tail store above and test
if (waitqueue_active(&ctx->wait))
wake_up(&ctx->wait);
- iocb_put(iocb);
+}
+
+static inline void iocb_put(struct aio_kiocb *iocb)
+{
+ if (refcount_dec_and_test(&iocb->ki_refcnt)) {
+ aio_complete(iocb);
+ iocb_destroy(iocb);
+ }
}
/* aio_read_events_ring
file_end_write(kiocb->ki_filp);
}
- aio_complete(iocb, res, res2);
+ iocb->ki_res.res = res;
+ iocb->ki_res.res2 = res2;
+ iocb_put(iocb);
}
static int aio_prep_rw(struct kiocb *req, const struct iocb *iocb)
}
}
-static ssize_t aio_read(struct kiocb *req, const struct iocb *iocb,
+static int aio_read(struct kiocb *req, const struct iocb *iocb,
bool vectored, bool compat)
{
struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
struct iov_iter iter;
struct file *file;
- ssize_t ret;
+ int ret;
ret = aio_prep_rw(req, iocb);
if (ret)
return ret;
}
-static ssize_t aio_write(struct kiocb *req, const struct iocb *iocb,
+static int aio_write(struct kiocb *req, const struct iocb *iocb,
bool vectored, bool compat)
{
struct iovec inline_vecs[UIO_FASTIOV], *iovec = inline_vecs;
struct iov_iter iter;
struct file *file;
- ssize_t ret;
+ int ret;
ret = aio_prep_rw(req, iocb);
if (ret)
static void aio_fsync_work(struct work_struct *work)
{
- struct fsync_iocb *req = container_of(work, struct fsync_iocb, work);
- int ret;
+ struct aio_kiocb *iocb = container_of(work, struct aio_kiocb, fsync.work);
- ret = vfs_fsync(req->file, req->datasync);
- aio_complete(container_of(req, struct aio_kiocb, fsync), ret, 0);
+ iocb->ki_res.res = vfs_fsync(iocb->fsync.file, iocb->fsync.datasync);
+ iocb_put(iocb);
}
static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
return 0;
}
-static inline void aio_poll_complete(struct aio_kiocb *iocb, __poll_t mask)
-{
- aio_complete(iocb, mangle_poll(mask), 0);
-}
-
static void aio_poll_complete_work(struct work_struct *work)
{
struct poll_iocb *req = container_of(work, struct poll_iocb, work);
return;
}
list_del_init(&iocb->ki_list);
+ iocb->ki_res.res = mangle_poll(mask);
+ req->done = true;
spin_unlock_irq(&ctx->ctx_lock);
- aio_poll_complete(iocb, mask);
+ iocb_put(iocb);
}
/* assumes we are called with irqs disabled */
__poll_t mask = key_to_poll(key);
unsigned long flags;
- req->woken = true;
-
/* for instances that support it check for an event match first: */
- if (mask) {
- if (!(mask & req->events))
- return 0;
+ if (mask && !(mask & req->events))
+ return 0;
+
+ list_del_init(&req->wait.entry);
+ if (mask && spin_trylock_irqsave(&iocb->ki_ctx->ctx_lock, flags)) {
/*
* Try to complete the iocb inline if we can. Use
* irqsave/irqrestore because not all filesystems (e.g. fuse)
* call this function with IRQs disabled and because IRQs
* have to be disabled before ctx_lock is obtained.
*/
- if (spin_trylock_irqsave(&iocb->ki_ctx->ctx_lock, flags)) {
- list_del(&iocb->ki_list);
- spin_unlock_irqrestore(&iocb->ki_ctx->ctx_lock, flags);
-
- list_del_init(&req->wait.entry);
- aio_poll_complete(iocb, mask);
- return 1;
- }
+ list_del(&iocb->ki_list);
+ iocb->ki_res.res = mangle_poll(mask);
+ req->done = true;
+ spin_unlock_irqrestore(&iocb->ki_ctx->ctx_lock, flags);
+ iocb_put(iocb);
+ } else {
+ schedule_work(&req->work);
}
-
- list_del_init(&req->wait.entry);
- schedule_work(&req->work);
return 1;
}
add_wait_queue(head, &pt->iocb->poll.wait);
}
-static ssize_t aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
+static int aio_poll(struct aio_kiocb *aiocb, const struct iocb *iocb)
{
struct kioctx *ctx = aiocb->ki_ctx;
struct poll_iocb *req = &aiocb->poll;
struct aio_poll_table apt;
+ bool cancel = false;
__poll_t mask;
/* reject any unknown events outside the normal event mask. */
req->events = demangle_poll(iocb->aio_buf) | EPOLLERR | EPOLLHUP;
req->head = NULL;
- req->woken = false;
+ req->done = false;
req->cancelled = false;
apt.pt._qproc = aio_poll_queue_proc;
INIT_LIST_HEAD(&req->wait.entry);
init_waitqueue_func_entry(&req->wait, aio_poll_wake);
- /* one for removal from waitqueue, one for this function */
- refcount_set(&aiocb->ki_refcnt, 2);
-
mask = vfs_poll(req->file, &apt.pt) & req->events;
- if (unlikely(!req->head)) {
- /* we did not manage to set up a waitqueue, done */
- goto out;
- }
-
spin_lock_irq(&ctx->ctx_lock);
- spin_lock(&req->head->lock);
- if (req->woken) {
- /* wake_up context handles the rest */
- mask = 0;
+ if (likely(req->head)) {
+ spin_lock(&req->head->lock);
+ if (unlikely(list_empty(&req->wait.entry))) {
+ if (apt.error)
+ cancel = true;
+ apt.error = 0;
+ mask = 0;
+ }
+ if (mask || apt.error) {
+ list_del_init(&req->wait.entry);
+ } else if (cancel) {
+ WRITE_ONCE(req->cancelled, true);
+ } else if (!req->done) { /* actually waiting for an event */
+ list_add_tail(&aiocb->ki_list, &ctx->active_reqs);
+ aiocb->ki_cancel = aio_poll_cancel;
+ }
+ spin_unlock(&req->head->lock);
+ }
+ if (mask) { /* no async, we'd stolen it */
+ aiocb->ki_res.res = mangle_poll(mask);
apt.error = 0;
- } else if (mask || apt.error) {
- /* if we get an error or a mask we are done */
- WARN_ON_ONCE(list_empty(&req->wait.entry));
- list_del_init(&req->wait.entry);
- } else {
- /* actually waiting for an event */
- list_add_tail(&aiocb->ki_list, &ctx->active_reqs);
- aiocb->ki_cancel = aio_poll_cancel;
}
- spin_unlock(&req->head->lock);
spin_unlock_irq(&ctx->ctx_lock);
-
-out:
- if (unlikely(apt.error))
- return apt.error;
-
if (mask)
- aio_poll_complete(aiocb, mask);
- iocb_put(aiocb);
- return 0;
+ iocb_put(aiocb);
+ return apt.error;
}
static int __io_submit_one(struct kioctx *ctx, const struct iocb *iocb,
- struct iocb __user *user_iocb, bool compat)
+ struct iocb __user *user_iocb, struct aio_kiocb *req,
+ bool compat)
{
- struct aio_kiocb *req;
- ssize_t ret;
-
- /* enforce forwards compatibility on users */
- if (unlikely(iocb->aio_reserved2)) {
- pr_debug("EINVAL: reserve field set\n");
- return -EINVAL;
- }
-
- /* prevent overflows */
- if (unlikely(
- (iocb->aio_buf != (unsigned long)iocb->aio_buf) ||
- (iocb->aio_nbytes != (size_t)iocb->aio_nbytes) ||
- ((ssize_t)iocb->aio_nbytes < 0)
- )) {
- pr_debug("EINVAL: overflow check\n");
- return -EINVAL;
- }
-
- if (!get_reqs_available(ctx))
- return -EAGAIN;
-
- ret = -EAGAIN;
- req = aio_get_req(ctx);
- if (unlikely(!req))
- goto out_put_reqs_available;
-
req->ki_filp = fget(iocb->aio_fildes);
- ret = -EBADF;
if (unlikely(!req->ki_filp))
- goto out_put_req;
+ return -EBADF;
if (iocb->aio_flags & IOCB_FLAG_RESFD) {
+ struct eventfd_ctx *eventfd;
/*
* If the IOCB_FLAG_RESFD flag of aio_flags is set, get an
* instance of the file* now. The file descriptor must be
* an eventfd() fd, and will be signaled for each completed
* event using the eventfd_signal() function.
*/
- req->ki_eventfd = eventfd_ctx_fdget((int) iocb->aio_resfd);
- if (IS_ERR(req->ki_eventfd)) {
- ret = PTR_ERR(req->ki_eventfd);
- req->ki_eventfd = NULL;
- goto out_put_req;
- }
+ eventfd = eventfd_ctx_fdget(iocb->aio_resfd);
+ if (IS_ERR(eventfd))
+ return PTR_ERR(req->ki_eventfd);
+
+ req->ki_eventfd = eventfd;
}
- ret = put_user(KIOCB_KEY, &user_iocb->aio_key);
- if (unlikely(ret)) {
+ if (unlikely(put_user(KIOCB_KEY, &user_iocb->aio_key))) {
pr_debug("EFAULT: aio_key\n");
- goto out_put_req;
+ return -EFAULT;
}
- req->ki_user_iocb = user_iocb;
- req->ki_user_data = iocb->aio_data;
+ req->ki_res.obj = (u64)(unsigned long)user_iocb;
+ req->ki_res.data = iocb->aio_data;
+ req->ki_res.res = 0;
+ req->ki_res.res2 = 0;
switch (iocb->aio_lio_opcode) {
case IOCB_CMD_PREAD:
- ret = aio_read(&req->rw, iocb, false, compat);
- break;
+ return aio_read(&req->rw, iocb, false, compat);
case IOCB_CMD_PWRITE:
- ret = aio_write(&req->rw, iocb, false, compat);
- break;
+ return aio_write(&req->rw, iocb, false, compat);
case IOCB_CMD_PREADV:
- ret = aio_read(&req->rw, iocb, true, compat);
- break;
+ return aio_read(&req->rw, iocb, true, compat);
case IOCB_CMD_PWRITEV:
- ret = aio_write(&req->rw, iocb, true, compat);
- break;
+ return aio_write(&req->rw, iocb, true, compat);
case IOCB_CMD_FSYNC:
- ret = aio_fsync(&req->fsync, iocb, false);
- break;
+ return aio_fsync(&req->fsync, iocb, false);
case IOCB_CMD_FDSYNC:
- ret = aio_fsync(&req->fsync, iocb, true);
- break;
+ return aio_fsync(&req->fsync, iocb, true);
case IOCB_CMD_POLL:
- ret = aio_poll(req, iocb);
- break;
+ return aio_poll(req, iocb);
default:
pr_debug("invalid aio operation %d\n", iocb->aio_lio_opcode);
- ret = -EINVAL;
- break;
+ return -EINVAL;
}
-
- /*
- * If ret is 0, we'd either done aio_complete() ourselves or have
- * arranged for that to be done asynchronously. Anything non-zero
- * means that we need to destroy req ourselves.
- */
- if (ret)
- goto out_put_req;
- return 0;
-out_put_req:
- if (req->ki_eventfd)
- eventfd_ctx_put(req->ki_eventfd);
- iocb_put(req);
-out_put_reqs_available:
- put_reqs_available(ctx, 1);
- return ret;
}
static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
bool compat)
{
+ struct aio_kiocb *req;
struct iocb iocb;
+ int err;
if (unlikely(copy_from_user(&iocb, user_iocb, sizeof(iocb))))
return -EFAULT;
- return __io_submit_one(ctx, &iocb, user_iocb, compat);
+ /* enforce forwards compatibility on users */
+ if (unlikely(iocb.aio_reserved2)) {
+ pr_debug("EINVAL: reserve field set\n");
+ return -EINVAL;
+ }
+
+ /* prevent overflows */
+ if (unlikely(
+ (iocb.aio_buf != (unsigned long)iocb.aio_buf) ||
+ (iocb.aio_nbytes != (size_t)iocb.aio_nbytes) ||
+ ((ssize_t)iocb.aio_nbytes < 0)
+ )) {
+ pr_debug("EINVAL: overflow check\n");
+ return -EINVAL;
+ }
+
+ req = aio_get_req(ctx);
+ if (unlikely(!req))
+ return -EAGAIN;
+
+ err = __io_submit_one(ctx, &iocb, user_iocb, req, compat);
+
+ /* Done with the synchronous reference */
+ iocb_put(req);
+
+ /*
+ * If err is 0, we'd either done aio_complete() ourselves or have
+ * arranged for that to be done asynchronously. Anything non-zero
+ * means that we need to destroy req ourselves.
+ */
+ if (unlikely(err)) {
+ iocb_destroy(req);
+ put_reqs_available(ctx, 1);
+ }
+ return err;
}
/* sys_io_submit:
}
#endif
-/* lookup_kiocb
- * Finds a given iocb for cancellation.
- */
-static struct aio_kiocb *
-lookup_kiocb(struct kioctx *ctx, struct iocb __user *iocb)
-{
- struct aio_kiocb *kiocb;
-
- assert_spin_locked(&ctx->ctx_lock);
-
- /* TODO: use a hash or array, this sucks. */
- list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
- if (kiocb->ki_user_iocb == iocb)
- return kiocb;
- }
- return NULL;
-}
-
/* sys_io_cancel:
* Attempts to cancel an iocb previously passed to io_submit. If
* the operation is successfully cancelled, the resulting event is
struct aio_kiocb *kiocb;
int ret = -EINVAL;
u32 key;
+ u64 obj = (u64)(unsigned long)iocb;
if (unlikely(get_user(key, &iocb->aio_key)))
return -EFAULT;
return -EINVAL;
spin_lock_irq(&ctx->ctx_lock);
- kiocb = lookup_kiocb(ctx, iocb);
- if (kiocb) {
- ret = kiocb->ki_cancel(&kiocb->rw);
- list_del_init(&kiocb->ki_list);
+ /* TODO: use a hash or array, this sucks. */
+ list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
+ if (kiocb->ki_res.obj == obj) {
+ ret = kiocb->ki_cancel(&kiocb->rw);
+ list_del_init(&kiocb->ki_list);
+ break;
+ }
}
spin_unlock_irq(&ctx->ctx_lock);
tcon->ses->server->echo_interval / HZ);
if (tcon->snapshot_time)
seq_printf(s, ",snapshot=%llu", tcon->snapshot_time);
+ if (tcon->handle_timeout)
+ seq_printf(s, ",handletimeout=%u", tcon->handle_timeout);
/* convert actimeo and display it in seconds */
seq_printf(s, ",actimeo=%lu", cifs_sb->actimeo / HZ);
*/
#define CIFS_MAX_ACTIMEO (1 << 30)
+/*
+ * Max persistent and resilient handle timeout (milliseconds).
+ * Windows durable max was 960000 (16 minutes)
+ */
+#define SMB3_MAX_HANDLE_TIMEOUT 960000
+
/*
* MAX_REQ is the maximum number of requests that WE will send
* on one socket concurrently.
struct nls_table *local_nls;
unsigned int echo_interval; /* echo interval in secs */
__u64 snapshot_time; /* needed for timewarp tokens */
+ __u32 handle_timeout; /* persistent and durable handle timeout in ms */
unsigned int max_credits; /* smb3 max_credits 10 < credits < 60000 */
};
__u32 vol_serial_number;
__le64 vol_create_time;
__u64 snapshot_time; /* for timewarp tokens - timestamp of snapshot */
+ __u32 handle_timeout; /* persistent and durable handle timeout in ms */
__u32 ss_flags; /* sector size flags */
__u32 perf_sector_size; /* best sector size for perf */
__u32 max_chunks;
Opt_cruid, Opt_gid, Opt_file_mode,
Opt_dirmode, Opt_port,
Opt_blocksize, Opt_rsize, Opt_wsize, Opt_actimeo,
- Opt_echo_interval, Opt_max_credits,
+ Opt_echo_interval, Opt_max_credits, Opt_handletimeout,
Opt_snapshot,
/* Mount options which take string value */
{ Opt_rsize, "rsize=%s" },
{ Opt_wsize, "wsize=%s" },
{ Opt_actimeo, "actimeo=%s" },
+ { Opt_handletimeout, "handletimeout=%s" },
{ Opt_echo_interval, "echo_interval=%s" },
{ Opt_max_credits, "max_credits=%s" },
{ Opt_snapshot, "snapshot=%s" },
vol->actimeo = CIFS_DEF_ACTIMEO;
+ /* Most clients set timeout to 0, allows server to use its default */
+ vol->handle_timeout = 0; /* See MS-SMB2 spec section 2.2.14.2.12 */
+
/* offer SMB2.1 and later (SMB3 etc). Secure and widely accepted */
vol->ops = &smb30_operations;
vol->vals = &smbdefault_values;
goto cifs_parse_mount_err;
}
break;
+ case Opt_handletimeout:
+ if (get_option_ul(args, &option)) {
+ cifs_dbg(VFS, "%s: Invalid handletimeout value\n",
+ __func__);
+ goto cifs_parse_mount_err;
+ }
+ vol->handle_timeout = option;
+ if (vol->handle_timeout > SMB3_MAX_HANDLE_TIMEOUT) {
+ cifs_dbg(VFS, "Invalid handle cache timeout, longer than 16 minutes\n");
+ goto cifs_parse_mount_err;
+ }
+ break;
case Opt_echo_interval:
if (get_option_ul(args, &option)) {
cifs_dbg(VFS, "%s: Invalid echo interval value\n",
return 0;
if (tcon->snapshot_time != volume_info->snapshot_time)
return 0;
+ if (tcon->handle_timeout != volume_info->handle_timeout)
+ return 0;
return 1;
}
tcon->snapshot_time = volume_info->snapshot_time;
}
+ if (volume_info->handle_timeout) {
+ if (ses->server->vals->protocol_id == 0) {
+ cifs_dbg(VFS,
+ "Use SMB2.1 or later for handle timeout option\n");
+ rc = -EOPNOTSUPP;
+ goto out_fail;
+ } else
+ tcon->handle_timeout = volume_info->handle_timeout;
+ }
+
tcon->ses = ses;
if (volume_info->password) {
tcon->password = kstrdup(volume_info->password, GFP_KERNEL);
if (oparms->tcon->use_resilient) {
- nr_ioctl_req.Timeout = 0; /* use server default (120 seconds) */
+ /* default timeout is 0, servers pick default (120 seconds) */
+ nr_ioctl_req.Timeout =
+ cpu_to_le32(oparms->tcon->handle_timeout);
nr_ioctl_req.Reserved = 0;
rc = SMB2_ioctl(xid, oparms->tcon, fid->persistent_fid,
fid->volatile_fid, FSCTL_LMR_REQUEST_RESILIENCY,
true /* is_fsctl */,
(char *)&nr_ioctl_req, sizeof(nr_ioctl_req),
- NULL, NULL /* no return info */);
+ CIFSMaxBufSize, NULL, NULL /* no return info */);
if (rc == -EOPNOTSUPP) {
cifs_dbg(VFS,
"resiliency not supported by server, disabling\n");
rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
FSCTL_QUERY_NETWORK_INTERFACE_INFO, true /* is_fsctl */,
NULL /* no data input */, 0 /* no data input */,
- (char **)&out_buf, &ret_data_len);
+ CIFSMaxBufSize, (char **)&out_buf, &ret_data_len);
if (rc == -EOPNOTSUPP) {
cifs_dbg(FYI,
"server does not support query network interfaces\n");
oparms.fid->mid = le64_to_cpu(o_rsp->sync_hdr.MessageId);
#endif /* CIFS_DEBUG2 */
- if (o_rsp->OplockLevel == SMB2_OPLOCK_LEVEL_LEASE)
- oplock = smb2_parse_lease_state(server, o_rsp,
- &oparms.fid->epoch,
- oparms.fid->lease_key);
- else
- goto oshr_exit;
-
-
memcpy(tcon->crfid.fid, pfid, sizeof(struct cifs_fid));
tcon->crfid.tcon = tcon;
tcon->crfid.is_valid = true;
kref_init(&tcon->crfid.refcount);
- kref_get(&tcon->crfid.refcount);
+ if (o_rsp->OplockLevel == SMB2_OPLOCK_LEVEL_LEASE) {
+ kref_get(&tcon->crfid.refcount);
+ oplock = smb2_parse_lease_state(server, o_rsp,
+ &oparms.fid->epoch,
+ oparms.fid->lease_key);
+ } else
+ goto oshr_exit;
qi_rsp = (struct smb2_query_info_rsp *)rsp_iov[1].iov_base;
if (le32_to_cpu(qi_rsp->OutputBufferLength) < sizeof(struct smb2_file_all_info))
goto oshr_exit;
- rc = smb2_validate_and_copy_iov(
+ if (!smb2_validate_and_copy_iov(
le16_to_cpu(qi_rsp->OutputBufferOffset),
sizeof(struct smb2_file_all_info),
&rsp_iov[1], sizeof(struct smb2_file_all_info),
- (char *)&tcon->crfid.file_all_info);
- if (rc)
- goto oshr_exit;
- tcon->crfid.file_all_info_is_valid = 1;
+ (char *)&tcon->crfid.file_all_info))
+ tcon->crfid.file_all_info_is_valid = 1;
oshr_exit:
mutex_unlock(&tcon->crfid.fid_mutex);
rc = SMB2_ioctl(xid, tcon, persistent_fid, volatile_fid,
FSCTL_SRV_REQUEST_RESUME_KEY, true /* is_fsctl */,
- NULL, 0 /* no input */,
+ NULL, 0 /* no input */, CIFSMaxBufSize,
(char **)&res_key, &ret_data_len);
if (rc) {
rc = SMB2_ioctl_init(tcon, &rqst[1],
COMPOUND_FID, COMPOUND_FID,
qi.info_type, true, NULL,
- 0);
+ 0, CIFSMaxBufSize);
}
} else if (qi.flags == PASSTHRU_QUERY_INFO) {
memset(&qi_iov, 0, sizeof(qi_iov));
rc = SMB2_ioctl(xid, tcon, trgtfile->fid.persistent_fid,
trgtfile->fid.volatile_fid, FSCTL_SRV_COPYCHUNK_WRITE,
true /* is_fsctl */, (char *)pcchunk,
- sizeof(struct copychunk_ioctl), (char **)&retbuf,
- &ret_data_len);
+ sizeof(struct copychunk_ioctl), CIFSMaxBufSize,
+ (char **)&retbuf, &ret_data_len);
if (rc == 0) {
if (ret_data_len !=
sizeof(struct copychunk_ioctl_rsp)) {
rc = SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
cfile->fid.volatile_fid, FSCTL_SET_SPARSE,
true /* is_fctl */,
- &setsparse, 1, NULL, NULL);
+ &setsparse, 1, CIFSMaxBufSize, NULL, NULL);
if (rc) {
tcon->broken_sparse_sup = true;
cifs_dbg(FYI, "set sparse rc = %d\n", rc);
true /* is_fsctl */,
(char *)&dup_ext_buf,
sizeof(struct duplicate_extents_to_file),
- NULL,
+ CIFSMaxBufSize, NULL,
&ret_data_len);
if (ret_data_len > 0)
true /* is_fsctl */,
(char *)&integr_info,
sizeof(struct fsctl_set_integrity_information_req),
- NULL,
+ CIFSMaxBufSize, NULL,
&ret_data_len);
}
/* GMT Token is @GMT-YYYY.MM.DD-HH.MM.SS Unicode which is 48 bytes + null */
#define GMT_TOKEN_SIZE 50
+#define MIN_SNAPSHOT_ARRAY_SIZE 16 /* See MS-SMB2 section 3.3.5.15.1 */
+
/*
* Input buffer contains (empty) struct smb_snapshot array with size filled in
* For output see struct SRV_SNAPSHOT_ARRAY in MS-SMB2 section 2.2.32.2
char *retbuf = NULL;
unsigned int ret_data_len = 0;
int rc;
+ u32 max_response_size;
struct smb_snapshot_array snapshot_in;
+ if (get_user(ret_data_len, (unsigned int __user *)ioc_buf))
+ return -EFAULT;
+
+ /*
+ * Note that for snapshot queries that servers like Azure expect that
+ * the first query be minimal size (and just used to get the number/size
+ * of previous versions) so response size must be specified as EXACTLY
+ * sizeof(struct snapshot_array) which is 16 when rounded up to multiple
+ * of eight bytes.
+ */
+ if (ret_data_len == 0)
+ max_response_size = MIN_SNAPSHOT_ARRAY_SIZE;
+ else
+ max_response_size = CIFSMaxBufSize;
+
rc = SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
cfile->fid.volatile_fid,
FSCTL_SRV_ENUMERATE_SNAPSHOTS,
true /* is_fsctl */,
- NULL, 0 /* no input data */,
+ NULL, 0 /* no input data */, max_response_size,
(char **)&retbuf,
&ret_data_len);
cifs_dbg(FYI, "enum snaphots ioctl returned %d and ret buflen is %d\n",
rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
FSCTL_DFS_GET_REFERRALS,
true /* is_fsctl */,
- (char *)dfs_req, dfs_req_size,
+ (char *)dfs_req, dfs_req_size, CIFSMaxBufSize,
(char **)&dfs_rsp, &dfs_rsp_size);
} while (rc == -EAGAIN);
rc = SMB2_ioctl_init(tcon, &rqst[num++], cfile->fid.persistent_fid,
cfile->fid.volatile_fid, FSCTL_SET_ZERO_DATA,
true /* is_fctl */, (char *)&fsctl_buf,
- sizeof(struct file_zero_data_information));
+ sizeof(struct file_zero_data_information),
+ CIFSMaxBufSize);
if (rc)
goto zero_range_exit;
rc = SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
cfile->fid.volatile_fid, FSCTL_SET_ZERO_DATA,
true /* is_fctl */, (char *)&fsctl_buf,
- sizeof(struct file_zero_data_information), NULL, NULL);
+ sizeof(struct file_zero_data_information),
+ CIFSMaxBufSize, NULL, NULL);
free_xid(xid);
return rc;
}
rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
FSCTL_VALIDATE_NEGOTIATE_INFO, true /* is_fsctl */,
- (char *)pneg_inbuf, inbuflen, (char **)&pneg_rsp, &rsplen);
+ (char *)pneg_inbuf, inbuflen, CIFSMaxBufSize,
+ (char **)&pneg_rsp, &rsplen);
if (rc == -EOPNOTSUPP) {
/*
* Old Windows versions or Netapp SMB server can return
}
static struct create_durable_v2 *
-create_durable_v2_buf(struct cifs_fid *pfid)
+create_durable_v2_buf(struct cifs_open_parms *oparms)
{
+ struct cifs_fid *pfid = oparms->fid;
struct create_durable_v2 *buf;
buf = kzalloc(sizeof(struct create_durable_v2), GFP_KERNEL);
(struct create_durable_v2, Name));
buf->ccontext.NameLength = cpu_to_le16(4);
- buf->dcontext.Timeout = 0; /* Should this be configurable by workload */
+ /*
+ * NB: Handle timeout defaults to 0, which allows server to choose
+ * (most servers default to 120 seconds) and most clients default to 0.
+ * This can be overridden at mount ("handletimeout=") if the user wants
+ * a different persistent (or resilient) handle timeout for all opens
+ * opens on a particular SMB3 mount.
+ */
+ buf->dcontext.Timeout = cpu_to_le32(oparms->tcon->handle_timeout);
buf->dcontext.Flags = cpu_to_le32(SMB2_DHANDLE_FLAG_PERSISTENT);
generate_random_uuid(buf->dcontext.CreateGuid);
memcpy(pfid->create_guid, buf->dcontext.CreateGuid, 16);
struct smb2_create_req *req = iov[0].iov_base;
unsigned int num = *num_iovec;
- iov[num].iov_base = create_durable_v2_buf(oparms->fid);
+ iov[num].iov_base = create_durable_v2_buf(oparms);
if (iov[num].iov_base == NULL)
return -ENOMEM;
iov[num].iov_len = sizeof(struct create_durable_v2);
int
SMB2_ioctl_init(struct cifs_tcon *tcon, struct smb_rqst *rqst,
u64 persistent_fid, u64 volatile_fid, u32 opcode,
- bool is_fsctl, char *in_data, u32 indatalen)
+ bool is_fsctl, char *in_data, u32 indatalen,
+ __u32 max_response_size)
{
struct smb2_ioctl_req *req;
struct kvec *iov = rqst->rq_iov;
req->OutputCount = 0; /* MBZ */
/*
- * Could increase MaxOutputResponse, but that would require more
- * than one credit. Windows typically sets this smaller, but for some
+ * In most cases max_response_size is set to 16K (CIFSMaxBufSize)
+ * We Could increase default MaxOutputResponse, but that could require
+ * more credits. Windows typically sets this smaller, but for some
* ioctls it may be useful to allow server to send more. No point
* limiting what the server can send as long as fits in one credit
- * Unfortunately - we can not handle more than CIFS_MAX_MSG_SIZE
- * (by default, note that it can be overridden to make max larger)
- * in responses (except for read responses which can be bigger.
- * We may want to bump this limit up
+ * We can not handle more than CIFS_MAX_BUF_SIZE yet but may want
+ * to increase this limit up in the future.
+ * Note that for snapshot queries that servers like Azure expect that
+ * the first query be minimal size (and just used to get the number/size
+ * of previous versions) so response size must be specified as EXACTLY
+ * sizeof(struct snapshot_array) which is 16 when rounded up to multiple
+ * of eight bytes. Currently that is the only case where we set max
+ * response size smaller.
*/
- req->MaxOutputResponse = cpu_to_le32(CIFSMaxBufSize);
+ req->MaxOutputResponse = cpu_to_le32(max_response_size);
if (is_fsctl)
req->Flags = cpu_to_le32(SMB2_0_IOCTL_IS_FSCTL);
cifs_small_buf_release(rqst->rq_iov[0].iov_base); /* request */
}
+
/*
* SMB2 IOCTL is used for both IOCTLs and FSCTLs
*/
int
SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
u64 volatile_fid, u32 opcode, bool is_fsctl,
- char *in_data, u32 indatalen,
+ char *in_data, u32 indatalen, u32 max_out_data_len,
char **out_data, u32 *plen /* returned data len */)
{
struct smb_rqst rqst;
rqst.rq_iov = iov;
rqst.rq_nvec = SMB2_IOCTL_IOV_SIZE;
- rc = SMB2_ioctl_init(tcon, &rqst, persistent_fid, volatile_fid,
- opcode, is_fsctl, in_data, indatalen);
+ rc = SMB2_ioctl_init(tcon, &rqst, persistent_fid, volatile_fid, opcode,
+ is_fsctl, in_data, indatalen, max_out_data_len);
if (rc)
goto ioctl_exit;
rc = SMB2_ioctl(xid, tcon, persistent_fid, volatile_fid,
FSCTL_SET_COMPRESSION, true /* is_fsctl */,
(char *)&fsctl_input /* data input */,
- 2 /* in data len */, &ret_data /* out data */, NULL);
+ 2 /* in data len */, CIFSMaxBufSize /* max out data */,
+ &ret_data /* out data */, NULL);
cifs_dbg(FYI, "set compression rc %d\n", rc);
extern void SMB2_open_free(struct smb_rqst *rqst);
extern int SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon,
u64 persistent_fid, u64 volatile_fid, u32 opcode,
- bool is_fsctl, char *in_data, u32 indatalen,
+ bool is_fsctl, char *in_data, u32 indatalen, u32 maxoutlen,
char **out_data, u32 *plen /* returned data len */);
extern int SMB2_ioctl_init(struct cifs_tcon *tcon, struct smb_rqst *rqst,
u64 persistent_fid, u64 volatile_fid, u32 opcode,
- bool is_fsctl, char *in_data, u32 indatalen);
+ bool is_fsctl, char *in_data, u32 indatalen,
+ __u32 max_response_size);
extern void SMB2_ioctl_free(struct smb_rqst *rqst);
extern int SMB2_close(const unsigned int xid, struct cifs_tcon *tcon,
u64 persistent_file_id, u64 volatile_file_id);
return 0;
}
-static void debugfs_evict_inode(struct inode *inode)
+static void debugfs_i_callback(struct rcu_head *head)
{
- truncate_inode_pages_final(&inode->i_data);
- clear_inode(inode);
+ struct inode *inode = container_of(head, struct inode, i_rcu);
if (S_ISLNK(inode->i_mode))
kfree(inode->i_link);
+ free_inode_nonrcu(inode);
+}
+
+static void debugfs_destroy_inode(struct inode *inode)
+{
+ call_rcu(&inode->i_rcu, debugfs_i_callback);
}
static const struct super_operations debugfs_super_operations = {
.statfs = simple_statfs,
.remount_fs = debugfs_remount,
.show_options = debugfs_show_options,
- .evict_inode = debugfs_evict_inode,
+ .destroy_inode = debugfs_destroy_inode,
};
static void debugfs_release_dentry(struct dentry *dentry)
umode_t mode, dev_t dev)
{
struct inode *inode;
- struct resv_map *resv_map;
+ struct resv_map *resv_map = NULL;
- resv_map = resv_map_alloc();
- if (!resv_map)
- return NULL;
+ /*
+ * Reserve maps are only needed for inodes that can have associated
+ * page allocations.
+ */
+ if (S_ISREG(mode) || S_ISLNK(mode)) {
+ resv_map = resv_map_alloc();
+ if (!resv_map)
+ return NULL;
+ }
inode = new_inode(sb);
if (inode) {
break;
}
lockdep_annotate_inode_mutex_key(inode);
- } else
- kref_put(&resv_map->refs, resv_map_release);
+ } else {
+ if (resv_map)
+ kref_put(&resv_map->refs, resv_map_release);
+ }
return inode;
}
fput(ctx->user_files[i]);
kfree(ctx->user_files);
+ ctx->user_files = NULL;
ctx->nr_user_files = 0;
return ret;
}
jffs2_kill_fragtree(&f->fragtree, deleted?c:NULL);
- if (f->target) {
- kfree(f->target);
- f->target = NULL;
- }
-
fds = f->dents;
while(fds) {
fd = fds;
static void jffs2_i_callback(struct rcu_head *head)
{
struct inode *inode = container_of(head, struct inode, i_rcu);
- kmem_cache_free(jffs2_inode_cachep, JFFS2_INODE_INFO(inode));
+ struct jffs2_inode_info *f = JFFS2_INODE_INFO(inode);
+
+ kfree(f->target);
+ kmem_cache_free(jffs2_inode_cachep, f);
}
static void jffs2_destroy_inode(struct inode *inode)
}
EXPORT_SYMBOL(nonseekable_open);
+
+/*
+ * stream_open is used by subsystems that want stream-like file descriptors.
+ * Such file descriptors are not seekable and don't have notion of position
+ * (file.f_pos is always 0). Contrary to file descriptors of other regular
+ * files, .read() and .write() can run simultaneously.
+ *
+ * stream_open never fails and is marked to return int so that it could be
+ * directly used as file_operations.open .
+ */
+int stream_open(struct inode *inode, struct file *filp)
+{
+ filp->f_mode &= ~(FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE | FMODE_ATOMIC_POS);
+ filp->f_mode |= FMODE_STREAM;
+ return 0;
+}
+
+EXPORT_SYMBOL(stream_open);
static int proc_pid_syscall(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
- long nr;
- unsigned long args[6], sp, pc;
+ struct syscall_info info;
+ u64 *args = &info.data.args[0];
int res;
res = lock_trace(task);
if (res)
return res;
- if (task_current_syscall(task, &nr, args, 6, &sp, &pc))
+ if (task_current_syscall(task, &info))
seq_puts(m, "running\n");
- else if (nr < 0)
- seq_printf(m, "%ld 0x%lx 0x%lx\n", nr, sp, pc);
+ else if (info.data.nr < 0)
+ seq_printf(m, "%d 0x%llx 0x%llx\n",
+ info.data.nr, info.sp, info.data.instruction_pointer);
else
seq_printf(m,
- "%ld 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx 0x%lx\n",
- nr,
+ "%d 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx 0x%llx\n",
+ info.data.nr,
args[0], args[1], args[2], args[3], args[4], args[5],
- sp, pc);
+ info.sp, info.data.instruction_pointer);
unlock_trace(task);
return 0;
static inline loff_t file_pos_read(struct file *file)
{
- return file->f_pos;
+ return file->f_mode & FMODE_STREAM ? 0 : file->f_pos;
}
static inline void file_pos_write(struct file *file, loff_t pos)
{
- file->f_pos = pos;
+ if ((file->f_mode & FMODE_STREAM) == 0)
+ file->f_pos = pos;
}
ssize_t ksys_read(unsigned int fd, char __user *buf, size_t count)
{
struct inode *inode = container_of(head, struct inode, i_rcu);
struct ubifs_inode *ui = ubifs_inode(inode);
+ kfree(ui->data);
kmem_cache_free(ubifs_inode_slab, ui);
}
static void ubifs_destroy_inode(struct inode *inode)
{
- struct ubifs_inode *ui = ubifs_inode(inode);
-
- kfree(ui->data);
call_rcu(&inode->i_rcu, ubifs_i_callback);
}
* syscall_get_arguments - extract system call parameter values
* @task: task of interest, must be blocked
* @regs: task_pt_regs() of @task
- * @i: argument index [0,5]
- * @n: number of arguments; n+i must be [1,6].
* @args: array filled with argument values
*
- * Fetches @n arguments to the system call starting with the @i'th argument
- * (from 0 through 5). Argument @i is stored in @args[0], and so on.
- * An arch inline version is probably optimal when @i and @n are constants.
+ * Fetches 6 arguments to the system call. First argument is stored in
+* @args[0], and so on.
*
* It's only valid to call this when @task is stopped for tracing on
* entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
- * It's invalid to call this with @i + @n > 6; we only support system calls
- * taking up to 6 arguments.
*/
void syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n, unsigned long *args);
+ unsigned long *args);
/**
* syscall_set_arguments - change system call parameter value
* @task: task of interest, must be in system call entry tracing
* @regs: task_pt_regs() of @task
- * @i: argument index [0,5]
- * @n: number of arguments; n+i must be [1,6].
* @args: array of argument values to store
*
- * Changes @n arguments to the system call starting with the @i'th argument.
- * Argument @i gets value @args[0], and so on.
- * An arch inline version is probably optimal when @i and @n are constants.
+ * Changes 6 arguments to the system call.
+ * The first argument gets value @args[0], and so on.
*
* It's only valid to call this when @task is stopped for tracing on
* entry to a system call, due to %TIF_SYSCALL_TRACE or %TIF_SYSCALL_AUDIT.
- * It's invalid to call this with @i + @n > 6; we only support system calls
- * taking up to 6 arguments.
*/
void syscall_set_arguments(struct task_struct *task, struct pt_regs *regs,
- unsigned int i, unsigned int n,
const unsigned long *args);
/**
#define __constant_bitrev32(x) \
({ \
- u32 __x = x; \
- __x = (__x >> 16) | (__x << 16); \
- __x = ((__x & (u32)0xFF00FF00UL) >> 8) | ((__x & (u32)0x00FF00FFUL) << 8); \
- __x = ((__x & (u32)0xF0F0F0F0UL) >> 4) | ((__x & (u32)0x0F0F0F0FUL) << 4); \
- __x = ((__x & (u32)0xCCCCCCCCUL) >> 2) | ((__x & (u32)0x33333333UL) << 2); \
- __x = ((__x & (u32)0xAAAAAAAAUL) >> 1) | ((__x & (u32)0x55555555UL) << 1); \
- __x; \
+ u32 ___x = x; \
+ ___x = (___x >> 16) | (___x << 16); \
+ ___x = ((___x & (u32)0xFF00FF00UL) >> 8) | ((___x & (u32)0x00FF00FFUL) << 8); \
+ ___x = ((___x & (u32)0xF0F0F0F0UL) >> 4) | ((___x & (u32)0x0F0F0F0FUL) << 4); \
+ ___x = ((___x & (u32)0xCCCCCCCCUL) >> 2) | ((___x & (u32)0x33333333UL) << 2); \
+ ___x = ((___x & (u32)0xAAAAAAAAUL) >> 1) | ((___x & (u32)0x55555555UL) << 1); \
+ ___x; \
})
#define __constant_bitrev16(x) \
({ \
- u16 __x = x; \
- __x = (__x >> 8) | (__x << 8); \
- __x = ((__x & (u16)0xF0F0U) >> 4) | ((__x & (u16)0x0F0FU) << 4); \
- __x = ((__x & (u16)0xCCCCU) >> 2) | ((__x & (u16)0x3333U) << 2); \
- __x = ((__x & (u16)0xAAAAU) >> 1) | ((__x & (u16)0x5555U) << 1); \
- __x; \
+ u16 ___x = x; \
+ ___x = (___x >> 8) | (___x << 8); \
+ ___x = ((___x & (u16)0xF0F0U) >> 4) | ((___x & (u16)0x0F0FU) << 4); \
+ ___x = ((___x & (u16)0xCCCCU) >> 2) | ((___x & (u16)0x3333U) << 2); \
+ ___x = ((___x & (u16)0xAAAAU) >> 1) | ((___x & (u16)0x5555U) << 1); \
+ ___x; \
})
#define __constant_bitrev8x4(x) \
({ \
- u32 __x = x; \
- __x = ((__x & (u32)0xF0F0F0F0UL) >> 4) | ((__x & (u32)0x0F0F0F0FUL) << 4); \
- __x = ((__x & (u32)0xCCCCCCCCUL) >> 2) | ((__x & (u32)0x33333333UL) << 2); \
- __x = ((__x & (u32)0xAAAAAAAAUL) >> 1) | ((__x & (u32)0x55555555UL) << 1); \
- __x; \
+ u32 ___x = x; \
+ ___x = ((___x & (u32)0xF0F0F0F0UL) >> 4) | ((___x & (u32)0x0F0F0F0FUL) << 4); \
+ ___x = ((___x & (u32)0xCCCCCCCCUL) >> 2) | ((___x & (u32)0x33333333UL) << 2); \
+ ___x = ((___x & (u32)0xAAAAAAAAUL) >> 1) | ((___x & (u32)0x55555555UL) << 1); \
+ ___x; \
})
#define __constant_bitrev8(x) \
({ \
- u8 __x = x; \
- __x = (__x >> 4) | (__x << 4); \
- __x = ((__x & (u8)0xCCU) >> 2) | ((__x & (u8)0x33U) << 2); \
- __x = ((__x & (u8)0xAAU) >> 1) | ((__x & (u8)0x55U) << 1); \
- __x; \
+ u8 ___x = x; \
+ ___x = (___x >> 4) | (___x << 4); \
+ ___x = ((___x & (u8)0xCCU) >> 2) | ((___x & (u8)0x33U) << 2); \
+ ___x = ((___x & (u8)0xAAU) >> 1) | ((___x & (u8)0x55U) << 1); \
+ ___x; \
})
#define bitrev32(x) \
#define FMODE_OPENED ((__force fmode_t)0x80000)
#define FMODE_CREATED ((__force fmode_t)0x100000)
+/* File is stream-like */
+#define FMODE_STREAM ((__force fmode_t)0x200000)
+
/* File was opened by fanotify and shouldn't generate fanotify events */
#define FMODE_NONOTIFY ((__force fmode_t)0x4000000)
extern loff_t no_seek_end_llseek(struct file *, loff_t, int);
extern int generic_file_open(struct inode * inode, struct file * filp);
extern int nonseekable_open(struct inode * inode, struct file * filp);
+extern int stream_open(struct inode * inode, struct file * filp);
#ifdef CONFIG_BLOCK
typedef void (dio_submit_t)(struct bio *bio, struct inode *inode,
void __unlock_page_memcg(struct mem_cgroup *memcg);
void unlock_page_memcg(struct page *page);
-/* idx can be of type enum memcg_stat_item or node_stat_item */
+/*
+ * idx can be of type enum memcg_stat_item or node_stat_item.
+ * Keep in sync with memcg_exact_page_state().
+ */
static inline unsigned long memcg_page_state(struct mem_cgroup *memcg,
int idx)
{
if (linkmode_test_bit(ETHTOOL_LINK_MODE_Pause_BIT,
advertising))
lcl_adv |= ADVERTISE_PAUSE_CAP;
- if (linkmode_test_bit(ETHTOOL_LINK_MODE_Pause_BIT,
+ if (linkmode_test_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT,
advertising))
lcl_adv |= ADVERTISE_PAUSE_ASYM;
};
struct mlx5_td {
+ /* protects tirs list changes while tirs refresh */
+ struct mutex list_lock;
struct list_head tirs_list;
u32 tdn;
};
/* Encode hstate index for a hwpoisoned large page */
#define VM_FAULT_SET_HINDEX(x) ((__force vm_fault_t)((x) << 16))
-#define VM_FAULT_GET_HINDEX(x) (((x) >> 16) & 0xf)
+#define VM_FAULT_GET_HINDEX(x) (((__force unsigned int)(x) >> 16) & 0xf)
#define VM_FAULT_ERROR (VM_FAULT_OOM | VM_FAULT_SIGBUS | \
VM_FAULT_SIGSEGV | VM_FAULT_HWPOISON | \
#include <linux/bug.h> /* For BUG_ON. */
#include <linux/pid_namespace.h> /* For task_active_pid_ns. */
#include <uapi/linux/ptrace.h>
+#include <linux/seccomp.h>
+
+/* Add sp to seccomp_data, as seccomp is user API, we don't want to modify it */
+struct syscall_info {
+ __u64 sp;
+ struct seccomp_data data;
+};
extern int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
void *buf, int len, unsigned int gup_flags);
#define current_user_stack_pointer() user_stack_pointer(current_pt_regs())
#endif
-extern int task_current_syscall(struct task_struct *target, long *callno,
- unsigned long args[6], unsigned int maxargs,
- unsigned long *sp, unsigned long *pc);
+extern int task_current_syscall(struct task_struct *target, struct syscall_info *info);
extern void sigaction_compat_abi(struct k_sigaction *act, struct k_sigaction *oact);
#endif
#ifndef __HAVE_ARCH_MEMCMP
extern int memcmp(const void *,const void *,__kernel_size_t);
#endif
+#ifndef __HAVE_ARCH_BCMP
+extern int bcmp(const void *,const void *,__kernel_size_t);
+#endif
#ifndef __HAVE_ARCH_MEMCHR
extern void * memchr(const void *,int,__kernel_size_t);
#endif
unsigned char __user *data, int optlen);
void ip_options_undo(struct ip_options *opt);
void ip_forward_options(struct sk_buff *skb);
-int ip_options_rcv_srr(struct sk_buff *skb);
+int ip_options_rcv_srr(struct sk_buff *skb, struct net_device *dev);
/*
* Functions provided by ip_sockglue.c
*/
spinlock_t rules_mod_lock;
+ u32 hash_mix;
atomic64_t cookie_gen;
struct list_head list; /* list of network namespaces */
#ifndef __NET_NS_HASH_H__
#define __NET_NS_HASH_H__
-#include <asm/cache.h>
-
-struct net;
+#include <net/net_namespace.h>
static inline u32 net_hash_mix(const struct net *net)
{
-#ifdef CONFIG_NET_NS
- return (u32)(((unsigned long)net) >> ilog2(sizeof(*net)));
-#else
- return 0;
-#endif
+ return net->hash_mix;
}
#endif
sch->qstats.overlimits++;
}
+static inline int qdisc_qstats_copy(struct gnet_dump *d, struct Qdisc *sch)
+{
+ __u32 qlen = qdisc_qlen_sum(sch);
+
+ return gnet_stats_copy_queue(d, sch->cpu_qstats, &sch->qstats, qlen);
+}
+
+static inline void qdisc_qstats_qlen_backlog(struct Qdisc *sch, __u32 *qlen,
+ __u32 *backlog)
+{
+ struct gnet_stats_queue qstats = { 0 };
+ __u32 len = qdisc_qlen_sum(sch);
+
+ __gnet_stats_copy_queue(&qstats, sch->cpu_qstats, &sch->qstats, len);
+ *qlen = qstats.qlen;
+ *backlog = qstats.backlog;
+}
+
+static inline void qdisc_tree_flush_backlog(struct Qdisc *sch)
+{
+ __u32 qlen, backlog;
+
+ qdisc_qstats_qlen_backlog(sch, &qlen, &backlog);
+ qdisc_tree_reduce_backlog(sch, qlen, backlog);
+}
+
+static inline void qdisc_purge_queue(struct Qdisc *sch)
+{
+ __u32 qlen, backlog;
+
+ qdisc_qstats_qlen_backlog(sch, &qlen, &backlog);
+ qdisc_reset(sch);
+ qdisc_tree_reduce_backlog(sch, qlen, backlog);
+}
+
static inline void qdisc_skb_head_init(struct qdisc_skb_head *qh)
{
qh->head = NULL;
sch_tree_lock(sch);
old = *pold;
*pold = new;
- if (old != NULL) {
- unsigned int qlen = old->q.qlen;
- unsigned int backlog = old->qstats.backlog;
-
- qdisc_reset(old);
- qdisc_tree_reduce_backlog(old, qlen, backlog);
- }
+ if (old != NULL)
+ qdisc_tree_flush_backlog(old);
sch_tree_unlock(sch);
return old;
TP_fast_assign(
__entry->id = id;
- syscall_get_arguments(current, regs, 0, 6, __entry->args);
+ syscall_get_arguments(current, regs, __entry->args);
),
TP_printk("NR %ld (%lx, %lx, %lx, %lx, %lx, %lx)",
static struct sk_buff *cpu_map_build_skb(struct bpf_cpu_map_entry *rcpu,
struct xdp_frame *xdpf)
{
+ unsigned int hard_start_headroom;
unsigned int frame_size;
void *pkt_data_start;
struct sk_buff *skb;
+ /* Part of headroom was reserved to xdpf */
+ hard_start_headroom = sizeof(struct xdp_frame) + xdpf->headroom;
+
/* build_skb need to place skb_shared_info after SKB end, and
* also want to know the memory "truesize". Thus, need to
* know the memory frame size backing xdp_buff.
* is not at a fixed memory location, with mixed length
* packets, which is bad for cache-line hotness.
*/
- frame_size = SKB_DATA_ALIGN(xdpf->len + xdpf->headroom) +
+ frame_size = SKB_DATA_ALIGN(xdpf->len + hard_start_headroom) +
SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
- pkt_data_start = xdpf->data - xdpf->headroom;
+ pkt_data_start = xdpf->data - hard_start_headroom;
skb = build_skb(pkt_data_start, frame_size);
if (!skb)
return NULL;
- skb_reserve(skb, xdpf->headroom);
+ skb_reserve(skb, hard_start_headroom);
__skb_put(skb, xdpf->len);
if (xdpf->metasize)
skb_metadata_set(skb, xdpf->metasize);
* - RX ring dev queue index (skb_record_rx_queue)
*/
+ /* Allow SKB to reuse area used by xdp_frame */
+ xdp_scrub_frame(xdpf);
+
return skb;
}
}
EXPORT_SYMBOL(bpf_prog_get_type_path);
-static void bpf_evict_inode(struct inode *inode)
-{
- enum bpf_type type;
-
- truncate_inode_pages_final(&inode->i_data);
- clear_inode(inode);
-
- if (S_ISLNK(inode->i_mode))
- kfree(inode->i_link);
- if (!bpf_inode_type(inode, &type))
- bpf_any_put(inode->i_private, type);
-}
-
/*
* Display the mount options in /proc/mounts.
*/
return 0;
}
+static void bpf_destroy_inode_deferred(struct rcu_head *head)
+{
+ struct inode *inode = container_of(head, struct inode, i_rcu);
+ enum bpf_type type;
+
+ if (S_ISLNK(inode->i_mode))
+ kfree(inode->i_link);
+ if (!bpf_inode_type(inode, &type))
+ bpf_any_put(inode->i_private, type);
+ free_inode_nonrcu(inode);
+}
+
+static void bpf_destroy_inode(struct inode *inode)
+{
+ call_rcu(&inode->i_rcu, bpf_destroy_inode_deferred);
+}
+
static const struct super_operations bpf_super_ops = {
.statfs = simple_statfs,
.drop_inode = generic_delete_inode,
.show_options = bpf_show_options,
- .evict_inode = bpf_evict_inode,
+ .destroy_inode = bpf_destroy_inode,
};
enum {
}
frame++;
if (frame >= MAX_CALL_FRAMES) {
- WARN_ONCE(1, "verifier bug. Call stack is too deep\n");
- return -EFAULT;
+ verbose(env, "the call stack of %d frames is too deep !\n",
+ frame);
+ return -E2BIG;
}
goto process_func;
}
sd->nr = syscall_get_nr(task, regs);
sd->arch = syscall_get_arch();
- syscall_get_arguments(task, regs, 0, 6, args);
+ syscall_get_arguments(task, regs, args);
sd->args[0] = args[0];
sd->args[1] = args[1];
sd->args[2] = args[2];
if (unlikely(sig != kinfo.si_signo))
goto err;
+ /* Only allow sending arbitrary signals to yourself. */
+ ret = -EPERM;
if ((task_pid(current) != pid) &&
- (kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL)) {
- /* Only allow sending arbitrary signals to yourself. */
- ret = -EPERM;
- if (kinfo.si_code != SI_USER)
- goto err;
-
- /* Turn this into a regular kill signal. */
- prepare_kill_siginfo(sig, &kinfo);
- }
+ (kinfo.si_code >= 0 || kinfo.si_code == SI_TKILL))
+ goto err;
} else {
prepare_kill_siginfo(sig, &kinfo);
}
static int __maybe_unused one = 1;
static int __maybe_unused two = 2;
static int __maybe_unused four = 4;
+static unsigned long zero_ul;
static unsigned long one_ul = 1;
static unsigned long long_max = LONG_MAX;
static int one_hundred = 100;
.maxlen = sizeof(files_stat.max_files),
.mode = 0644,
.proc_handler = proc_doulongvec_minmax,
- .extra1 = &zero,
+ .extra1 = &zero_ul,
.extra2 = &long_max,
},
{
struct ring_buffer_event *event;
struct ring_buffer *buffer;
unsigned long irq_flags;
+ unsigned long args[6];
int pc;
int syscall_nr;
int size;
entry = ring_buffer_event_data(event);
entry->nr = syscall_nr;
- syscall_get_arguments(current, regs, 0, sys_data->nb_args, entry->args);
+ syscall_get_arguments(current, regs, args);
+ memcpy(entry->args, args, sizeof(unsigned long) * sys_data->nb_args);
event_trigger_unlock_commit(trace_file, buffer, event, entry,
irq_flags, pc);
struct syscall_metadata *sys_data;
struct syscall_trace_enter *rec;
struct hlist_head *head;
+ unsigned long args[6];
bool valid_prog_array;
int syscall_nr;
int rctx;
return;
rec->nr = syscall_nr;
- syscall_get_arguments(current, regs, 0, sys_data->nb_args,
- (unsigned long *)&rec->args);
+ syscall_get_arguments(current, regs, args);
+ memcpy(&rec->args, args, sizeof(unsigned long) * sys_data->nb_args);
if ((valid_prog_array &&
!perf_call_bpf_enter(sys_data->enter_event, regs, sys_data, rec)) ||
{
const unsigned char *ip = in;
unsigned char *op = out;
+ unsigned char *data_start;
size_t l = in_len;
size_t t = 0;
signed char state_offset = -2;
unsigned int m4_max_offset;
- // LZO v0 will never write 17 as first byte,
- // so this is used to version the bitstream
+ // LZO v0 will never write 17 as first byte (except for zero-length
+ // input), so this is used to version the bitstream
if (bitstream_version > 0) {
*op++ = 17;
*op++ = bitstream_version;
m4_max_offset = M4_MAX_OFFSET_V0;
}
+ data_start = op;
+
while (l > 20) {
size_t ll = l <= (m4_max_offset + 1) ? l : (m4_max_offset + 1);
uintptr_t ll_end = (uintptr_t) ip + ll;
if (t > 0) {
const unsigned char *ii = in + in_len - t;
- if (op == out && t <= 238) {
+ if (op == data_start && t <= 238) {
*op++ = (17 + t);
} else if (t <= 3) {
op[state_offset] |= t;
if (unlikely(in_len < 3))
goto input_overrun;
- if (likely(*ip == 17)) {
+ if (likely(in_len >= 5) && likely(*ip == 17)) {
bitstream_version = ip[1];
ip += 2;
- if (unlikely(in_len < 5))
- goto input_overrun;
} else {
bitstream_version = 0;
}
EXPORT_SYMBOL(memcmp);
#endif
+#ifndef __HAVE_ARCH_BCMP
+/**
+ * bcmp - returns 0 if and only if the buffers have identical contents.
+ * @a: pointer to first buffer.
+ * @b: pointer to second buffer.
+ * @len: size of buffers.
+ *
+ * The sign or magnitude of a non-zero return value has no particular
+ * meaning, and architectures may implement their own more efficient bcmp(). So
+ * while this particular implementation is a simple (tail) call to memcmp, do
+ * not rely on anything but whether the return value is zero or non-zero.
+ */
+#undef bcmp
+int bcmp(const void *a, const void *b, size_t len)
+{
+ return memcmp(a, b, len);
+}
+EXPORT_SYMBOL(bcmp);
+#endif
+
#ifndef __HAVE_ARCH_MEMSCAN
/**
* memscan - Find a character in an area of memory.
#include <linux/export.h>
#include <asm/syscall.h>
-static int collect_syscall(struct task_struct *target, long *callno,
- unsigned long args[6], unsigned int maxargs,
- unsigned long *sp, unsigned long *pc)
+static int collect_syscall(struct task_struct *target, struct syscall_info *info)
{
struct pt_regs *regs;
if (!try_get_task_stack(target)) {
/* Task has no stack, so the task isn't in a syscall. */
- *sp = *pc = 0;
- *callno = -1;
+ memset(info, 0, sizeof(*info));
+ info->data.nr = -1;
return 0;
}
return -EAGAIN;
}
- *sp = user_stack_pointer(regs);
- *pc = instruction_pointer(regs);
+ info->sp = user_stack_pointer(regs);
+ info->data.instruction_pointer = instruction_pointer(regs);
- *callno = syscall_get_nr(target, regs);
- if (*callno != -1L && maxargs > 0)
- syscall_get_arguments(target, regs, 0, maxargs, args);
+ info->data.nr = syscall_get_nr(target, regs);
+ if (info->data.nr != -1L)
+ syscall_get_arguments(target, regs,
+ (unsigned long *)&info->data.args[0]);
put_task_stack(target);
return 0;
/**
* task_current_syscall - Discover what a blocked task is doing.
* @target: thread to examine
- * @callno: filled with system call number or -1
- * @args: filled with @maxargs system call arguments
- * @maxargs: number of elements in @args to fill
- * @sp: filled with user stack pointer
- * @pc: filled with user PC
+ * @info: structure with the following fields:
+ * .sp - filled with user stack pointer
+ * .data.nr - filled with system call number or -1
+ * .data.args - filled with @maxargs system call arguments
+ * .data.instruction_pointer - filled with user PC
*
- * If @target is blocked in a system call, returns zero with *@callno
- * set to the the call's number and @args filled in with its arguments.
- * Registers not used for system call arguments may not be available and
- * it is not kosher to use &struct user_regset calls while the system
+ * If @target is blocked in a system call, returns zero with @info.data.nr
+ * set to the the call's number and @info.data.args filled in with its
+ * arguments. Registers not used for system call arguments may not be available
+ * and it is not kosher to use &struct user_regset calls while the system
* call is still in progress. Note we may get this result if @target
* has finished its system call but not yet returned to user mode, such
* as when it's stopped for signal handling or syscall exit tracing.
*
* If @target is blocked in the kernel during a fault or exception,
- * returns zero with *@callno set to -1 and does not fill in @args.
- * If so, it's now safe to examine @target using &struct user_regset
- * get() calls as long as we're sure @target won't return to user mode.
+ * returns zero with *@info.data.nr set to -1 and does not fill in
+ * @info.data.args. If so, it's now safe to examine @target using
+ * &struct user_regset get() calls as long as we're sure @target won't return
+ * to user mode.
*
* Returns -%EAGAIN if @target does not remain blocked.
- *
- * Returns -%EINVAL if @maxargs is too large (maximum is six).
*/
-int task_current_syscall(struct task_struct *target, long *callno,
- unsigned long args[6], unsigned int maxargs,
- unsigned long *sp, unsigned long *pc)
+int task_current_syscall(struct task_struct *target, struct syscall_info *info)
{
long state;
unsigned long ncsw;
- if (unlikely(maxargs > 6))
- return -EINVAL;
-
if (target == current)
- return collect_syscall(target, callno, args, maxargs, sp, pc);
+ return collect_syscall(target, info);
state = target->state;
if (unlikely(!state))
ncsw = wait_task_inactive(target, state);
if (unlikely(!ncsw) ||
- unlikely(collect_syscall(target, callno, args, maxargs, sp, pc)) ||
+ unlikely(collect_syscall(target, info)) ||
unlikely(wait_task_inactive(target, state) != ncsw))
return -EAGAIN;
bool check_target)
{
struct page *page = pfn_to_online_page(pfn);
+ struct page *block_page;
struct page *end_page;
unsigned long block_pfn;
get_pageblock_migratetype(page) != MIGRATE_MOVABLE)
return false;
+ /* Ensure the start of the pageblock or zone is online and valid */
+ block_pfn = pageblock_start_pfn(pfn);
+ block_page = pfn_to_online_page(max(block_pfn, zone->zone_start_pfn));
+ if (block_page) {
+ page = block_page;
+ pfn = block_pfn;
+ }
+
+ /* Ensure the end of the pageblock or zone is online and valid */
+ block_pfn += pageblock_nr_pages;
+ block_pfn = min(block_pfn, zone_end_pfn(zone) - 1);
+ end_page = pfn_to_online_page(block_pfn);
+ if (!end_page)
+ return false;
+
/*
* Only clear the hint if a sample indicates there is either a
* free page or an LRU page in the block. One or other condition
* is necessary for the block to be a migration source/target.
*/
- block_pfn = pageblock_start_pfn(pfn);
- pfn = max(block_pfn, zone->zone_start_pfn);
- page = pfn_to_page(pfn);
- if (zone != page_zone(page))
- return false;
- pfn = block_pfn + pageblock_nr_pages;
- pfn = min(pfn, zone_end_pfn(zone));
- end_page = pfn_to_page(pfn);
-
do {
if (pfn_valid_within(pfn)) {
if (check_source && PageLRU(page)) {
static void __reset_isolation_suitable(struct zone *zone)
{
unsigned long migrate_pfn = zone->zone_start_pfn;
- unsigned long free_pfn = zone_end_pfn(zone);
+ unsigned long free_pfn = zone_end_pfn(zone) - 1;
unsigned long reset_migrate = free_pfn;
unsigned long reset_free = migrate_pfn;
bool source_set = false;
count_compact_events(COMPACTISOLATED, nr_isolated);
} else {
/* If isolation fails, abort the search */
- order = -1;
+ order = cc->search_order + 1;
page = NULL;
}
}
spinlock_t *ptl;
ptl = pmd_lock(mm, pmd);
+ if (!pmd_none(*pmd)) {
+ if (write) {
+ if (pmd_pfn(*pmd) != pfn_t_to_pfn(pfn)) {
+ WARN_ON_ONCE(!is_huge_zero_pmd(*pmd));
+ goto out_unlock;
+ }
+ entry = pmd_mkyoung(*pmd);
+ entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
+ if (pmdp_set_access_flags(vma, addr, pmd, entry, 1))
+ update_mmu_cache_pmd(vma, addr, pmd);
+ }
+
+ goto out_unlock;
+ }
+
entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
if (pfn_t_devmap(pfn))
entry = pmd_mkdevmap(entry);
if (pgtable) {
pgtable_trans_huge_deposit(mm, pmd, pgtable);
mm_inc_nr_ptes(mm);
+ pgtable = NULL;
}
set_pmd_at(mm, addr, pmd, entry);
update_mmu_cache_pmd(vma, addr, pmd);
+
+out_unlock:
spin_unlock(ptl);
+ if (pgtable)
+ pte_free(mm, pgtable);
}
vm_fault_t vmf_insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
spinlock_t *ptl;
ptl = pud_lock(mm, pud);
+ if (!pud_none(*pud)) {
+ if (write) {
+ if (pud_pfn(*pud) != pfn_t_to_pfn(pfn)) {
+ WARN_ON_ONCE(!is_huge_zero_pud(*pud));
+ goto out_unlock;
+ }
+ entry = pud_mkyoung(*pud);
+ entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma);
+ if (pudp_set_access_flags(vma, addr, pud, entry, 1))
+ update_mmu_cache_pud(vma, addr, pud);
+ }
+ goto out_unlock;
+ }
+
entry = pud_mkhuge(pfn_t_pud(pfn, prot));
if (pfn_t_devmap(pfn))
entry = pud_mkdevmap(entry);
}
set_pud_at(mm, addr, pud, entry);
update_mmu_cache_pud(vma, addr, pud);
+
+out_unlock:
spin_unlock(ptl);
}
}
rcu_read_unlock();
- /* data/bss scanning */
- scan_large_block(_sdata, _edata);
- scan_large_block(__bss_start, __bss_stop);
- scan_large_block(__start_ro_after_init, __end_ro_after_init);
-
#ifdef CONFIG_SMP
/* per-cpu sections scanning */
for_each_possible_cpu(i)
}
local_irq_restore(flags);
+ /* register the data/bss sections */
+ create_object((unsigned long)_sdata, _edata - _sdata,
+ KMEMLEAK_GREY, GFP_ATOMIC);
+ create_object((unsigned long)__bss_start, __bss_stop - __bss_start,
+ KMEMLEAK_GREY, GFP_ATOMIC);
+ /* only register .data..ro_after_init if not within .data */
+ if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
+ create_object((unsigned long)__start_ro_after_init,
+ __end_ro_after_init - __start_ro_after_init,
+ KMEMLEAK_GREY, GFP_ATOMIC);
+
/*
* This is the point where tracking allocations is safe. Automatic
* scanning is started during the late initcall. Add the early logged
return &memcg->cgwb_domain;
}
+/*
+ * idx can be of type enum memcg_stat_item or node_stat_item.
+ * Keep in sync with memcg_exact_page().
+ */
+static unsigned long memcg_exact_page_state(struct mem_cgroup *memcg, int idx)
+{
+ long x = atomic_long_read(&memcg->stat[idx]);
+ int cpu;
+
+ for_each_online_cpu(cpu)
+ x += per_cpu_ptr(memcg->stat_cpu, cpu)->count[idx];
+ if (x < 0)
+ x = 0;
+ return x;
+}
+
/**
* mem_cgroup_wb_stats - retrieve writeback related stats from its memcg
* @wb: bdi_writeback in question
struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
struct mem_cgroup *parent;
- *pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
+ *pdirty = memcg_exact_page_state(memcg, NR_FILE_DIRTY);
/* this should eventually include NR_UNSTABLE_NFS */
- *pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
+ *pwriteback = memcg_exact_page_state(memcg, NR_WRITEBACK);
*pfilepages = mem_cgroup_nr_lru_pages(memcg, (1 << LRU_INACTIVE_FILE) |
(1 << LRU_ACTIVE_FILE));
*pheadroom = PAGE_COUNTER_MAX;
* @s: The string to duplicate
* @n: Maximum number of bytes to copy, including the trailing NUL.
*
- * Return: newly allocated copy of @s or %NULL in case of error
+ * Return: newly allocated copy of @s or an ERR_PTR() in case of error
*/
char *strndup_user(const char __user *s, long n)
{
return rc;
}
-static int vlan_dev_fcoe_get_wwn(struct net_device *dev, u64 *wwn, int type)
+static int vlan_dev_fcoe_ddp_target(struct net_device *dev, u16 xid,
+ struct scatterlist *sgl, unsigned int sgc)
{
struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
const struct net_device_ops *ops = real_dev->netdev_ops;
- int rc = -EINVAL;
+ int rc = 0;
+
+ if (ops->ndo_fcoe_ddp_target)
+ rc = ops->ndo_fcoe_ddp_target(real_dev, xid, sgl, sgc);
- if (ops->ndo_fcoe_get_wwn)
- rc = ops->ndo_fcoe_get_wwn(real_dev, wwn, type);
return rc;
}
+#endif
-static int vlan_dev_fcoe_ddp_target(struct net_device *dev, u16 xid,
- struct scatterlist *sgl, unsigned int sgc)
+#ifdef NETDEV_FCOE_WWNN
+static int vlan_dev_fcoe_get_wwn(struct net_device *dev, u64 *wwn, int type)
{
struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
const struct net_device_ops *ops = real_dev->netdev_ops;
- int rc = 0;
-
- if (ops->ndo_fcoe_ddp_target)
- rc = ops->ndo_fcoe_ddp_target(real_dev, xid, sgl, sgc);
+ int rc = -EINVAL;
+ if (ops->ndo_fcoe_get_wwn)
+ rc = ops->ndo_fcoe_get_wwn(real_dev, wwn, type);
return rc;
}
#endif
.ndo_fcoe_ddp_done = vlan_dev_fcoe_ddp_done,
.ndo_fcoe_enable = vlan_dev_fcoe_enable,
.ndo_fcoe_disable = vlan_dev_fcoe_disable,
- .ndo_fcoe_get_wwn = vlan_dev_fcoe_get_wwn,
.ndo_fcoe_ddp_target = vlan_dev_fcoe_ddp_target,
#endif
+#ifdef NETDEV_FCOE_WWNN
+ .ndo_fcoe_get_wwn = vlan_dev_fcoe_get_wwn,
+#endif
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = vlan_dev_poll_controller,
.ndo_netpoll_setup = vlan_dev_netpoll_setup,
ret = cfg80211_get_station(real_netdev, neigh->addr, &sinfo);
- /* free the TID stats immediately */
- cfg80211_sinfo_release_content(&sinfo);
+ if (!ret) {
+ /* free the TID stats immediately */
+ cfg80211_sinfo_release_content(&sinfo);
+ }
dev_put(real_netdev);
if (ret == -ENOENT) {
const u8 *mac, const unsigned short vid)
{
struct batadv_bla_claim search_claim, *claim;
+ struct batadv_bla_claim *claim_removed_entry;
+ struct hlist_node *claim_removed_node;
ether_addr_copy(search_claim.addr, mac);
search_claim.vid = vid;
batadv_dbg(BATADV_DBG_BLA, bat_priv, "%s(): %pM, vid %d\n", __func__,
mac, batadv_print_vid(vid));
- batadv_hash_remove(bat_priv->bla.claim_hash, batadv_compare_claim,
- batadv_choose_claim, claim);
- batadv_claim_put(claim); /* reference from the hash is gone */
+ claim_removed_node = batadv_hash_remove(bat_priv->bla.claim_hash,
+ batadv_compare_claim,
+ batadv_choose_claim, claim);
+ if (!claim_removed_node)
+ goto free_claim;
+ /* reference from the hash is gone */
+ claim_removed_entry = hlist_entry(claim_removed_node,
+ struct batadv_bla_claim, hash_entry);
+ batadv_claim_put(claim_removed_entry);
+
+free_claim:
/* don't need the reference from hash_find() anymore */
batadv_claim_put(claim);
}
struct attribute *attr,
char *buff, size_t count)
{
- struct batadv_priv *bat_priv = batadv_kobj_to_batpriv(kobj);
struct net_device *net_dev = batadv_kobj_to_netdev(kobj);
struct batadv_hard_iface *hard_iface;
+ struct batadv_priv *bat_priv;
u32 tp_override;
u32 old_tp_override;
bool ret;
atomic_set(&hard_iface->bat_v.throughput_override, tp_override);
- batadv_netlink_notify_hardif(bat_priv, hard_iface);
+ if (hard_iface->soft_iface) {
+ bat_priv = netdev_priv(hard_iface->soft_iface);
+ batadv_netlink_notify_hardif(bat_priv, hard_iface);
+ }
out:
batadv_hardif_put(hard_iface);
struct batadv_tt_global_entry *tt_global,
const char *message)
{
+ struct batadv_tt_global_entry *tt_removed_entry;
+ struct hlist_node *tt_removed_node;
+
batadv_dbg(BATADV_DBG_TT, bat_priv,
"Deleting global tt entry %pM (vid: %d): %s\n",
tt_global->common.addr,
batadv_print_vid(tt_global->common.vid), message);
- batadv_hash_remove(bat_priv->tt.global_hash, batadv_compare_tt,
- batadv_choose_tt, &tt_global->common);
- batadv_tt_global_entry_put(tt_global);
+ tt_removed_node = batadv_hash_remove(bat_priv->tt.global_hash,
+ batadv_compare_tt,
+ batadv_choose_tt,
+ &tt_global->common);
+ if (!tt_removed_node)
+ return;
+
+ /* drop reference of remove hash entry */
+ tt_removed_entry = hlist_entry(tt_removed_node,
+ struct batadv_tt_global_entry,
+ common.hash_entry);
+ batadv_tt_global_entry_put(tt_removed_entry);
}
/**
unsigned short vid, const char *message,
bool roaming)
{
+ struct batadv_tt_local_entry *tt_removed_entry;
struct batadv_tt_local_entry *tt_local_entry;
u16 flags, curr_flags = BATADV_NO_FLAGS;
- void *tt_entry_exists;
+ struct hlist_node *tt_removed_node;
tt_local_entry = batadv_tt_local_hash_find(bat_priv, addr, vid);
if (!tt_local_entry)
*/
batadv_tt_local_event(bat_priv, tt_local_entry, BATADV_TT_CLIENT_DEL);
- tt_entry_exists = batadv_hash_remove(bat_priv->tt.local_hash,
+ tt_removed_node = batadv_hash_remove(bat_priv->tt.local_hash,
batadv_compare_tt,
batadv_choose_tt,
&tt_local_entry->common);
- if (!tt_entry_exists)
+ if (!tt_removed_node)
goto out;
- /* extra call to free the local tt entry */
- batadv_tt_local_entry_put(tt_local_entry);
+ /* drop reference of remove hash entry */
+ tt_removed_entry = hlist_entry(tt_removed_node,
+ struct batadv_tt_local_entry,
+ common.hash_entry);
+ batadv_tt_local_entry_put(tt_removed_entry);
out:
if (tt_local_entry)
if (ipv4_is_local_multicast(group))
return 0;
+ memset(&br_group, 0, sizeof(br_group));
br_group.u.ip4 = group;
br_group.proto = htons(ETH_P_IP);
br_group.vid = vid;
own_query = port ? &port->ip4_own_query : &br->ip4_own_query;
+ memset(&br_group, 0, sizeof(br_group));
br_group.u.ip4 = group;
br_group.proto = htons(ETH_P_IP);
br_group.vid = vid;
own_query = port ? &port->ip6_own_query : &br->ip6_own_query;
+ memset(&br_group, 0, sizeof(br_group));
br_group.u.ip6 = *group;
br_group.proto = htons(ETH_P_IPV6);
br_group.vid = vid;
break;
sk_busy_loop(sk, flags & MSG_DONTWAIT);
- } while (!skb_queue_empty(&sk->sk_receive_queue));
+ } while (sk->sk_receive_queue.prev != *last);
error = -EAGAIN;
if (pt_prev->list_func != NULL)
pt_prev->list_func(head, pt_prev, orig_dev);
else
- list_for_each_entry_safe(skb, next, head, list)
+ list_for_each_entry_safe(skb, next, head, list) {
+ skb_list_del_init(skb);
pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
+ }
}
static void __netif_receive_skb_list_core(struct list_head *head, bool pfmemalloc)
WARN_ON_ONCE(!ret);
gstrings.len = ret;
- data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
- if (gstrings.len && !data)
- return -ENOMEM;
- __ethtool_get_strings(dev, gstrings.string_set, data);
+ if (gstrings.len) {
+ data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
+ if (!data)
+ return -ENOMEM;
+
+ __ethtool_get_strings(dev, gstrings.string_set, data);
+ } else {
+ data = NULL;
+ }
ret = -EFAULT;
if (copy_to_user(useraddr, &gstrings, sizeof(gstrings)))
return -EFAULT;
stats.n_stats = n_stats;
- data = vzalloc(array_size(n_stats, sizeof(u64)));
- if (n_stats && !data)
- return -ENOMEM;
- ops->get_ethtool_stats(dev, &stats, data);
+ if (n_stats) {
+ data = vzalloc(array_size(n_stats, sizeof(u64)));
+ if (!data)
+ return -ENOMEM;
+ ops->get_ethtool_stats(dev, &stats, data);
+ } else {
+ data = NULL;
+ }
ret = -EFAULT;
if (copy_to_user(useraddr, &stats, sizeof(stats)))
return -EFAULT;
stats.n_stats = n_stats;
- data = vzalloc(array_size(n_stats, sizeof(u64)));
- if (n_stats && !data)
- return -ENOMEM;
- if (dev->phydev && !ops->get_ethtool_phy_stats) {
- ret = phy_ethtool_get_stats(dev->phydev, &stats, data);
- if (ret < 0)
- return ret;
+ if (n_stats) {
+ data = vzalloc(array_size(n_stats, sizeof(u64)));
+ if (!data)
+ return -ENOMEM;
+
+ if (dev->phydev && !ops->get_ethtool_phy_stats) {
+ ret = phy_ethtool_get_stats(dev->phydev, &stats, data);
+ if (ret < 0)
+ goto out;
+ } else {
+ ops->get_ethtool_phy_stats(dev, &stats, data);
+ }
} else {
- ops->get_ethtool_phy_stats(dev, &stats, data);
+ data = NULL;
}
ret = -EFAULT;
const struct bpf_prog *prog,
struct bpf_insn_access_aux *info)
{
- if (type == BPF_WRITE) {
- switch (off) {
- case bpf_ctx_range_till(struct __sk_buff, cb[0], cb[4]):
- break;
- default:
- return false;
- }
- }
+ if (type == BPF_WRITE)
+ return false;
switch (off) {
case bpf_ctx_range(struct __sk_buff, data):
case bpf_ctx_range_ptr(struct __sk_buff, flow_keys):
info->reg_type = PTR_TO_FLOW_KEYS;
break;
- case bpf_ctx_range(struct __sk_buff, tc_classid):
- case bpf_ctx_range(struct __sk_buff, data_meta):
- case bpf_ctx_range_till(struct __sk_buff, family, local_port):
- case bpf_ctx_range(struct __sk_buff, tstamp):
- case bpf_ctx_range(struct __sk_buff, wire_len):
+ default:
return false;
}
/* Pass parameters to the BPF program */
memset(flow_keys, 0, sizeof(*flow_keys));
cb->qdisc_cb.flow_keys = flow_keys;
+ flow_keys->n_proto = skb->protocol;
flow_keys->nhoff = skb_network_offset(skb);
flow_keys->thoff = flow_keys->nhoff;
/* Restore state */
memcpy(cb, &cb_saved, sizeof(cb_saved));
- flow_keys->nhoff = clamp_t(u16, flow_keys->nhoff, 0, skb->len);
+ flow_keys->nhoff = clamp_t(u16, flow_keys->nhoff,
+ skb_network_offset(skb), skb->len);
flow_keys->thoff = clamp_t(u16, flow_keys->thoff,
flow_keys->nhoff, skb->len);
refcount_set(&net->count, 1);
refcount_set(&net->passive, 1);
+ get_random_bytes(&net->hash_mix, sizeof(u32));
net->dev_base_seq = 1;
net->user_ns = user_ns;
idr_init(&net->netns_ids);
unsigned int delta_truesize;
struct sk_buff *lp;
- if (unlikely(p->len + len >= 65536))
+ if (unlikely(p->len + len >= 65536 || NAPI_GRO_CB(skb)->flush))
return -E2BIG;
lp = NAPI_GRO_CB(p)->last;
if (dccp_feat_clone_sp_val(&fval, sp_val, sp_len))
return -ENOMEM;
- return dccp_feat_push_change(fn, feat, is_local, mandatory, &fval);
+ if (dccp_feat_push_change(fn, feat, is_local, mandatory, &fval)) {
+ kfree(fval.sp.vec);
+ return -ENOMEM;
+ }
+
+ return 0;
}
/**
return skb;
}
+static int qca_tag_flow_dissect(const struct sk_buff *skb, __be16 *proto,
+ int *offset)
+{
+ *offset = QCA_HDR_LEN;
+ *proto = ((__be16 *)skb->data)[0];
+
+ return 0;
+}
+
const struct dsa_device_ops qca_netdev_ops = {
.xmit = qca_tag_xmit,
.rcv = qca_tag_rcv,
+ .flow_dissect = qca_tag_flow_dissect,
.overhead = QCA_HDR_LEN,
};
ip_local_deliver_finish);
}
-static inline bool ip_rcv_options(struct sk_buff *skb)
+static inline bool ip_rcv_options(struct sk_buff *skb, struct net_device *dev)
{
struct ip_options *opt;
const struct iphdr *iph;
- struct net_device *dev = skb->dev;
/* It looks as overkill, because not all
IP options require packet mangling.
}
}
- if (ip_options_rcv_srr(skb))
+ if (ip_options_rcv_srr(skb, dev))
goto drop;
}
}
#endif
- if (iph->ihl > 5 && ip_rcv_options(skb))
+ if (iph->ihl > 5 && ip_rcv_options(skb, dev))
goto drop;
rt = skb_rtable(skb);
}
}
-int ip_options_rcv_srr(struct sk_buff *skb)
+int ip_options_rcv_srr(struct sk_buff *skb, struct net_device *dev)
{
struct ip_options *opt = &(IPCB(skb)->opt);
int srrspace, srrptr;
orefdst = skb->_skb_refdst;
skb_dst_set(skb, NULL);
- err = ip_route_input(skb, nexthop, iph->saddr, iph->tos, skb->dev);
+ err = ip_route_input(skb, nexthop, iph->saddr, iph->tos, dev);
rt2 = skb_rtable(skb);
if (err || (rt2->rt_type != RTN_UNICAST && rt2->rt_type != RTN_LOCAL)) {
skb_dst_drop(skb);
module_param(dctcp_alpha_on_init, uint, 0644);
MODULE_PARM_DESC(dctcp_alpha_on_init, "parameter for initial alpha value");
-static unsigned int dctcp_clamp_alpha_on_loss __read_mostly;
-module_param(dctcp_clamp_alpha_on_loss, uint, 0644);
-MODULE_PARM_DESC(dctcp_clamp_alpha_on_loss,
- "parameter for clamping alpha on loss");
-
static struct tcp_congestion_ops dctcp_reno;
static void dctcp_reset(const struct tcp_sock *tp, struct dctcp *ca)
}
}
-static void dctcp_state(struct sock *sk, u8 new_state)
+static void dctcp_react_to_loss(struct sock *sk)
{
- if (dctcp_clamp_alpha_on_loss && new_state == TCP_CA_Loss) {
- struct dctcp *ca = inet_csk_ca(sk);
+ struct dctcp *ca = inet_csk_ca(sk);
+ struct tcp_sock *tp = tcp_sk(sk);
- /* If this extension is enabled, we clamp dctcp_alpha to
- * max on packet loss; the motivation is that dctcp_alpha
- * is an indicator to the extend of congestion and packet
- * loss is an indicator of extreme congestion; setting
- * this in practice turned out to be beneficial, and
- * effectively assumes total congestion which reduces the
- * window by half.
- */
- ca->dctcp_alpha = DCTCP_MAX_ALPHA;
- }
+ ca->loss_cwnd = tp->snd_cwnd;
+ tp->snd_ssthresh = max(tp->snd_cwnd >> 1U, 2U);
+}
+
+static void dctcp_state(struct sock *sk, u8 new_state)
+{
+ if (new_state == TCP_CA_Recovery &&
+ new_state != inet_csk(sk)->icsk_ca_state)
+ dctcp_react_to_loss(sk);
+ /* We handle RTO in dctcp_cwnd_event to ensure that we perform only
+ * one loss-adjustment per RTT.
+ */
}
static void dctcp_cwnd_event(struct sock *sk, enum tcp_ca_event ev)
case CA_EVENT_ECN_NO_CE:
dctcp_ece_ack_update(sk, ev, &ca->prior_rcv_nxt, &ca->ce_state);
break;
+ case CA_EVENT_LOSS:
+ dctcp_react_to_loss(sk);
+ break;
default:
/* Don't care for the rest. */
break;
{
int cpu;
- module_put(net->ipv4.tcp_congestion_control->owner);
+ if (net->ipv4.tcp_congestion_control)
+ module_put(net->ipv4.tcp_congestion_control->owner);
for_each_possible_cpu(cpu)
inet_ctl_sock_destroy(*per_cpu_ptr(net->ipv4.tcp_sk, cpu));
done:
rhashtable_walk_stop(&iter);
+ rhashtable_walk_exit(&iter);
return ret;
}
inet6_sk(skb->sk) : NULL;
struct ipv6hdr *tmp_hdr;
struct frag_hdr *fh;
- unsigned int mtu, hlen, left, len;
+ unsigned int mtu, hlen, left, len, nexthdr_offset;
int hroom, troom;
__be32 frag_id;
int ptr, offset = 0, err = 0;
goto fail;
hlen = err;
nexthdr = *prevhdr;
+ nexthdr_offset = prevhdr - skb_network_header(skb);
mtu = ip6_skb_dst_mtu(skb);
(err = skb_checksum_help(skb)))
goto fail;
+ prevhdr = skb_network_header(skb) + nexthdr_offset;
hroom = LL_RESERVED_SPACE(rt->dst.dev);
if (skb_has_frag_list(skb)) {
unsigned int first_len = skb_pagelen(skb);
rt = ip_route_output_ports(dev_net(skb->dev), &fl4, NULL,
eiph->daddr, eiph->saddr, 0, 0,
IPPROTO_IPIP, RT_TOS(eiph->tos), 0);
- if (IS_ERR(rt) || rt->dst.dev->type != ARPHRD_TUNNEL) {
+ if (IS_ERR(rt) || rt->dst.dev->type != ARPHRD_TUNNEL6) {
if (!IS_ERR(rt))
ip_rt_put(rt);
goto out;
} else {
if (ip_route_input(skb2, eiph->daddr, eiph->saddr, eiph->tos,
skb2->dev) ||
- skb_dst(skb2)->dev->type != ARPHRD_TUNNEL)
+ skb_dst(skb2)->dev->type != ARPHRD_TUNNEL6)
goto out;
}
!net_eq(tunnel->net, dev_net(tunnel->dev))))
goto out;
+ /* skb can be uncloned in iptunnel_pull_header, so
+ * old iph is no longer valid
+ */
+ iph = (const struct iphdr *)skb_mac_header(skb);
err = IP_ECN_decapsulate(iph, skb);
if (unlikely(err)) {
if (log_ecn_error)
if (err)
goto fail;
- err = sock_register(&kcm_family_ops);
- if (err)
- goto sock_register_fail;
-
err = register_pernet_device(&kcm_net_ops);
if (err)
goto net_ops_fail;
+ err = sock_register(&kcm_family_ops);
+ if (err)
+ goto sock_register_fail;
+
err = kcm_proc_init();
if (err)
goto proc_init_fail;
return 0;
proc_init_fail:
- unregister_pernet_device(&kcm_net_ops);
-
-net_ops_fail:
sock_unregister(PF_KCM);
sock_register_fail:
+ unregister_pernet_device(&kcm_net_ops);
+
+net_ops_fail:
proto_unregister(&kcm_proto);
fail:
static void __exit kcm_exit(void)
{
kcm_proc_exit();
- unregister_pernet_device(&kcm_net_ops);
sock_unregister(PF_KCM);
+ unregister_pernet_device(&kcm_net_ops);
proto_unregister(&kcm_proto);
destroy_workqueue(kcm_wq);
struct sw_flow_actions *acts;
int new_acts_size;
- int req_size = NLA_ALIGN(attr_len);
+ size_t req_size = NLA_ALIGN(attr_len);
int next_offset = offsetof(struct sw_flow_actions, actions) +
(*sfa)->actions_len;
if (req_size <= (ksize(*sfa) - next_offset))
goto out;
- new_acts_size = ksize(*sfa) * 2;
+ new_acts_size = max(next_offset + req_size, ksize(*sfa) * 2);
if (new_acts_size > MAX_ACTIONS_BUFSIZE) {
if ((MAX_ACTIONS_BUFSIZE - next_offset) < req_size) {
list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);
- if (net != c_net || !tc->t_sock)
+ if (net != c_net)
continue;
if (!list_has_conn(&tmp_list, tc->t_cpath->cp_conn)) {
list_move_tail(&tc->t_tcp_node, &tmp_list);
struct nlattr *tb[TCA_SAMPLE_MAX + 1];
struct psample_group *psample_group;
struct tcf_chain *goto_ch = NULL;
+ u32 psample_group_num, rate;
struct tc_sample *parm;
- u32 psample_group_num;
struct tcf_sample *s;
bool exists = false;
int ret, err;
if (err < 0)
goto release_idr;
+ rate = nla_get_u32(tb[TCA_SAMPLE_RATE]);
+ if (!rate) {
+ NL_SET_ERR_MSG(extack, "invalid sample rate");
+ err = -EINVAL;
+ goto put_chain;
+ }
psample_group_num = nla_get_u32(tb[TCA_SAMPLE_PSAMPLE_GROUP]);
psample_group = psample_group_get(net, psample_group_num);
if (!psample_group) {
spin_lock_bh(&s->tcf_lock);
goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch);
- s->rate = nla_get_u32(tb[TCA_SAMPLE_RATE]);
+ s->rate = rate;
s->psample_group_num = psample_group_num;
RCU_INIT_POINTER(s->psample_group, psample_group);
static void *mall_get(struct tcf_proto *tp, u32 handle)
{
+ struct cls_mall_head *head = rtnl_dereference(tp->root);
+
+ if (head && head->handle == handle)
+ return head;
+
return NULL;
}
static u8 cake_handle_diffserv(struct sk_buff *skb, u16 wash)
{
+ int wlen = skb_network_offset(skb);
u8 dscp;
- switch (skb->protocol) {
+ switch (tc_skb_protocol(skb)) {
case htons(ETH_P_IP):
+ wlen += sizeof(struct iphdr);
+ if (!pskb_may_pull(skb, wlen) ||
+ skb_try_make_writable(skb, wlen))
+ return 0;
+
dscp = ipv4_get_dsfield(ip_hdr(skb)) >> 2;
if (wash && dscp)
ipv4_change_dsfield(ip_hdr(skb), INET_ECN_MASK, 0);
return dscp;
case htons(ETH_P_IPV6):
+ wlen += sizeof(struct ipv6hdr);
+ if (!pskb_may_pull(skb, wlen) ||
+ skb_try_make_writable(skb, wlen))
+ return 0;
+
dscp = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2;
if (wash && dscp)
ipv6_change_dsfield(ipv6_hdr(skb), INET_ECN_MASK, 0);
{
struct cbq_sched_data *q = qdisc_priv(sch);
struct cbq_class *cl = (struct cbq_class *)arg;
+ __u32 qlen;
cl->xstats.avgidle = cl->avgidle;
cl->xstats.undertime = 0;
+ qdisc_qstats_qlen_backlog(cl->q, &qlen, &cl->qstats.backlog);
if (cl->undertime != PSCHED_PASTPERFECT)
cl->xstats.undertime = cl->undertime - q->now;
if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch),
d, NULL, &cl->bstats) < 0 ||
gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
- gnet_stats_copy_queue(d, NULL, &cl->qstats, cl->q->q.qlen) < 0)
+ gnet_stats_copy_queue(d, NULL, &cl->qstats, qlen) < 0)
return -1;
return gnet_stats_copy_app(d, &cl->xstats, sizeof(cl->xstats));
{
struct cbq_sched_data *q = qdisc_priv(sch);
struct cbq_class *cl = (struct cbq_class *)arg;
- unsigned int qlen, backlog;
if (cl->filters || cl->children || cl == &q->link)
return -EBUSY;
sch_tree_lock(sch);
- qlen = cl->q->q.qlen;
- backlog = cl->q->qstats.backlog;
- qdisc_reset(cl->q);
- qdisc_tree_reduce_backlog(cl->q, qlen, backlog);
+ qdisc_purge_queue(cl->q);
if (cl->next_alive)
cbq_deactivate_class(cl);
return container_of(clc, struct drr_class, common);
}
-static void drr_purge_queue(struct drr_class *cl)
-{
- unsigned int len = cl->qdisc->q.qlen;
- unsigned int backlog = cl->qdisc->qstats.backlog;
-
- qdisc_reset(cl->qdisc);
- qdisc_tree_reduce_backlog(cl->qdisc, len, backlog);
-}
-
static const struct nla_policy drr_policy[TCA_DRR_MAX + 1] = {
[TCA_DRR_QUANTUM] = { .type = NLA_U32 },
};
sch_tree_lock(sch);
- drr_purge_queue(cl);
+ qdisc_purge_queue(cl->qdisc);
qdisc_class_hash_remove(&q->clhash, &cl->common);
sch_tree_unlock(sch);
struct gnet_dump *d)
{
struct drr_class *cl = (struct drr_class *)arg;
- __u32 qlen = cl->qdisc->q.qlen;
+ __u32 qlen = qdisc_qlen_sum(cl->qdisc);
+ struct Qdisc *cl_q = cl->qdisc;
struct tc_drr_stats xstats;
memset(&xstats, 0, sizeof(xstats));
if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch),
d, NULL, &cl->bstats) < 0 ||
gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
- gnet_stats_copy_queue(d, NULL, &cl->qdisc->qstats, qlen) < 0)
+ gnet_stats_copy_queue(d, cl_q->cpu_qstats, &cl_q->qstats, qlen) < 0)
return -1;
return gnet_stats_copy_app(d, &xstats, sizeof(xstats));
return len;
}
-static void
-hfsc_purge_queue(struct Qdisc *sch, struct hfsc_class *cl)
-{
- unsigned int len = cl->qdisc->q.qlen;
- unsigned int backlog = cl->qdisc->qstats.backlog;
-
- qdisc_reset(cl->qdisc);
- qdisc_tree_reduce_backlog(cl->qdisc, len, backlog);
-}
-
static void
hfsc_adjust_levels(struct hfsc_class *cl)
{
qdisc_class_hash_insert(&q->clhash, &cl->cl_common);
list_add_tail(&cl->siblings, &parent->children);
if (parent->level == 0)
- hfsc_purge_queue(sch, parent);
+ qdisc_purge_queue(parent->qdisc);
hfsc_adjust_levels(parent);
sch_tree_unlock(sch);
list_del(&cl->siblings);
hfsc_adjust_levels(cl->cl_parent);
- hfsc_purge_queue(sch, cl);
+ qdisc_purge_queue(cl->qdisc);
qdisc_class_hash_remove(&q->clhash, &cl->cl_common);
sch_tree_unlock(sch);
{
struct hfsc_class *cl = (struct hfsc_class *)arg;
struct tc_hfsc_stats xstats;
+ __u32 qlen;
- cl->qstats.backlog = cl->qdisc->qstats.backlog;
+ qdisc_qstats_qlen_backlog(cl->qdisc, &qlen, &cl->qstats.backlog);
xstats.level = cl->level;
xstats.period = cl->cl_vtperiod;
xstats.work = cl->cl_total;
if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch), d, NULL, &cl->bstats) < 0 ||
gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
- gnet_stats_copy_queue(d, NULL, &cl->qstats, cl->qdisc->q.qlen) < 0)
+ gnet_stats_copy_queue(d, NULL, &cl->qstats, qlen) < 0)
return -1;
return gnet_stats_copy_app(d, &xstats, sizeof(xstats));
};
__u32 qlen = 0;
- if (!cl->level && cl->leaf.q) {
- qlen = cl->leaf.q->q.qlen;
- qs.backlog = cl->leaf.q->qstats.backlog;
- }
+ if (!cl->level && cl->leaf.q)
+ qdisc_qstats_qlen_backlog(cl->leaf.q, &qlen, &qs.backlog);
+
cl->xstats.tokens = clamp_t(s64, PSCHED_NS2TICKS(cl->tokens),
INT_MIN, INT_MAX);
cl->xstats.ctokens = clamp_t(s64, PSCHED_NS2TICKS(cl->ctokens),
sch_tree_lock(sch);
- if (!cl->level) {
- unsigned int qlen = cl->leaf.q->q.qlen;
- unsigned int backlog = cl->leaf.q->qstats.backlog;
-
- qdisc_reset(cl->leaf.q);
- qdisc_tree_reduce_backlog(cl->leaf.q, qlen, backlog);
- }
+ if (!cl->level)
+ qdisc_purge_queue(cl->leaf.q);
/* delete from hash and active; remainder in destroy_class */
qdisc_class_hash_remove(&q->clhash, &cl->common);
classid, NULL);
sch_tree_lock(sch);
if (parent && !parent->level) {
- unsigned int qlen = parent->leaf.q->q.qlen;
- unsigned int backlog = parent->leaf.q->qstats.backlog;
-
/* turn parent into inner node */
- qdisc_reset(parent->leaf.q);
- qdisc_tree_reduce_backlog(parent->leaf.q, qlen, backlog);
+ qdisc_purge_queue(parent->leaf.q);
qdisc_put(parent->leaf.q);
if (parent->prio_activity)
htb_deactivate(q, parent);
sch = dev_queue->qdisc_sleeping;
if (gnet_stats_copy_basic(&sch->running, d, NULL, &sch->bstats) < 0 ||
- gnet_stats_copy_queue(d, NULL, &sch->qstats, sch->q.qlen) < 0)
+ qdisc_qstats_copy(d, sch) < 0)
return -1;
return 0;
}
sch = dev_queue->qdisc_sleeping;
if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch),
d, NULL, &sch->bstats) < 0 ||
- gnet_stats_copy_queue(d, NULL,
- &sch->qstats, sch->q.qlen) < 0)
+ qdisc_qstats_copy(d, sch) < 0)
return -1;
}
return 0;
for (i = q->bands; i < q->max_bands; i++) {
if (q->queues[i] != &noop_qdisc) {
struct Qdisc *child = q->queues[i];
+
q->queues[i] = &noop_qdisc;
- qdisc_tree_reduce_backlog(child, child->q.qlen,
- child->qstats.backlog);
+ qdisc_tree_flush_backlog(child);
qdisc_put(child);
}
}
qdisc_hash_add(child, true);
if (old != &noop_qdisc) {
- qdisc_tree_reduce_backlog(old,
- old->q.qlen,
- old->qstats.backlog);
+ qdisc_tree_flush_backlog(old);
qdisc_put(old);
}
sch_tree_unlock(sch);
cl_q = q->queues[cl - 1];
if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch),
d, NULL, &cl_q->bstats) < 0 ||
- gnet_stats_copy_queue(d, NULL, &cl_q->qstats, cl_q->q.qlen) < 0)
+ qdisc_qstats_copy(d, cl_q) < 0)
return -1;
return 0;
q->bands = qopt->bands;
memcpy(q->prio2band, qopt->priomap, TC_PRIO_MAX+1);
- for (i = q->bands; i < oldbands; i++) {
- struct Qdisc *child = q->queues[i];
-
- qdisc_tree_reduce_backlog(child, child->q.qlen,
- child->qstats.backlog);
- }
+ for (i = q->bands; i < oldbands; i++)
+ qdisc_tree_flush_backlog(q->queues[i]);
for (i = oldbands; i < q->bands; i++) {
q->queues[i] = queues[i];
cl_q = q->queues[cl - 1];
if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch),
d, NULL, &cl_q->bstats) < 0 ||
- gnet_stats_copy_queue(d, NULL, &cl_q->qstats, cl_q->q.qlen) < 0)
+ qdisc_qstats_copy(d, cl_q) < 0)
return -1;
return 0;
return container_of(clc, struct qfq_class, common);
}
-static void qfq_purge_queue(struct qfq_class *cl)
-{
- unsigned int len = cl->qdisc->q.qlen;
- unsigned int backlog = cl->qdisc->qstats.backlog;
-
- qdisc_reset(cl->qdisc);
- qdisc_tree_reduce_backlog(cl->qdisc, len, backlog);
-}
-
static const struct nla_policy qfq_policy[TCA_QFQ_MAX + 1] = {
[TCA_QFQ_WEIGHT] = { .type = NLA_U32 },
[TCA_QFQ_LMAX] = { .type = NLA_U32 },
sch_tree_lock(sch);
- qfq_purge_queue(cl);
+ qdisc_purge_queue(cl->qdisc);
qdisc_class_hash_remove(&q->clhash, &cl->common);
sch_tree_unlock(sch);
if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch),
d, NULL, &cl->bstats) < 0 ||
gnet_stats_copy_rate_est(d, &cl->rate_est) < 0 ||
- gnet_stats_copy_queue(d, NULL,
- &cl->qdisc->qstats, cl->qdisc->q.qlen) < 0)
+ qdisc_qstats_copy(d, cl->qdisc) < 0)
return -1;
return gnet_stats_copy_app(d, &xstats, sizeof(xstats));
q->flags = ctl->flags;
q->limit = ctl->limit;
if (child) {
- qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen,
- q->qdisc->qstats.backlog);
+ qdisc_tree_flush_backlog(q->qdisc);
old_child = q->qdisc;
q->qdisc = child;
}
qdisc_hash_add(child, true);
sch_tree_lock(sch);
- qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen,
- q->qdisc->qstats.backlog);
+ qdisc_tree_flush_backlog(q->qdisc);
qdisc_put(q->qdisc);
q->qdisc = child;
sch = dev_queue->qdisc_sleeping;
if (gnet_stats_copy_basic(&sch->running, d, NULL, &sch->bstats) < 0 ||
- gnet_stats_copy_queue(d, NULL, &sch->qstats, sch->q.qlen) < 0)
+ qdisc_qstats_copy(d, sch) < 0)
return -1;
return 0;
}
sch_tree_lock(sch);
if (child) {
- qdisc_tree_reduce_backlog(q->qdisc, q->qdisc->q.qlen,
- q->qdisc->qstats.backlog);
+ qdisc_tree_flush_backlog(q->qdisc);
qdisc_put(q->qdisc);
q->qdisc = child;
}
static int sctp_v4_addr_to_user(struct sctp_sock *sp, union sctp_addr *addr)
{
/* No address mapping for V4 sockets */
+ memset(addr->v4.sin_zero, 0, sizeof(addr->v4.sin_zero));
return sizeof(struct sockaddr_in);
}
if (msg->rep_type)
tipc_tlv_init(msg->rep, msg->rep_type);
- if (cmd->header)
- (*cmd->header)(msg);
+ if (cmd->header) {
+ err = (*cmd->header)(msg);
+ if (err) {
+ kfree_skb(msg->rep);
+ msg->rep = NULL;
+ return err;
+ }
+ }
arg = nlmsg_new(0, GFP_KERNEL);
if (!arg) {
if (!bearer)
return -EMSGSIZE;
- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_BEARER_NAME);
+ len = TLV_GET_DATA_LEN(msg->req);
+ len -= offsetof(struct tipc_bearer_config, name);
+ if (len <= 0)
+ return -EINVAL;
+
+ len = min_t(int, len, TIPC_MAX_BEARER_NAME);
if (!string_is_valid(b->name, len))
return -EINVAL;
lc = (struct tipc_link_config *)TLV_DATA(msg->req);
- len = min_t(int, TLV_GET_DATA_LEN(msg->req), TIPC_MAX_LINK_NAME);
+ len = TLV_GET_DATA_LEN(msg->req);
+ len -= offsetof(struct tipc_link_config, name);
+ if (len <= 0)
+ return -EINVAL;
+
+ len = min_t(int, len, TIPC_MAX_LINK_NAME);
if (!string_is_valid(lc->name, len))
return -EINVAL;
return err;
}
+ } else {
+ *zc = false;
}
rxm->full_len -= padding_length(ctx, tls_ctx, skb);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+// Author: Kirill Smelkov (kirr@nexedi.com)
+//
+// Search for stream-like files that are using nonseekable_open and convert
+// them to stream_open. A stream-like file is a file that does not use ppos in
+// its read and write. Rationale for the conversion is to avoid deadlock in
+// between read and write.
+
+virtual report
+virtual patch
+virtual explain // explain decisions in the patch (SPFLAGS="-D explain")
+
+// stream-like reader & writer - ones that do not depend on f_pos.
+@ stream_reader @
+identifier readstream, ppos;
+identifier f, buf, len;
+type loff_t;
+@@
+ ssize_t readstream(struct file *f, char *buf, size_t len, loff_t *ppos)
+ {
+ ... when != ppos
+ }
+
+@ stream_writer @
+identifier writestream, ppos;
+identifier f, buf, len;
+type loff_t;
+@@
+ ssize_t writestream(struct file *f, const char *buf, size_t len, loff_t *ppos)
+ {
+ ... when != ppos
+ }
+
+
+// a function that blocks
+@ blocks @
+identifier block_f;
+identifier wait_event =~ "^wait_event_.*";
+@@
+ block_f(...) {
+ ... when exists
+ wait_event(...)
+ ... when exists
+ }
+
+// stream_reader that can block inside.
+//
+// XXX wait_* can be called not directly from current function (e.g. func -> f -> g -> wait())
+// XXX currently reader_blocks supports only direct and 1-level indirect cases.
+@ reader_blocks_direct @
+identifier stream_reader.readstream;
+identifier wait_event =~ "^wait_event_.*";
+@@
+ readstream(...)
+ {
+ ... when exists
+ wait_event(...)
+ ... when exists
+ }
+
+@ reader_blocks_1 @
+identifier stream_reader.readstream;
+identifier blocks.block_f;
+@@
+ readstream(...)
+ {
+ ... when exists
+ block_f(...)
+ ... when exists
+ }
+
+@ reader_blocks depends on reader_blocks_direct || reader_blocks_1 @
+identifier stream_reader.readstream;
+@@
+ readstream(...) {
+ ...
+ }
+
+
+// file_operations + whether they have _any_ .read, .write, .llseek ... at all.
+//
+// XXX add support for file_operations xxx[N] = ... (sound/core/pcm_native.c)
+@ fops0 @
+identifier fops;
+@@
+ struct file_operations fops = {
+ ...
+ };
+
+@ has_read @
+identifier fops0.fops;
+identifier read_f;
+@@
+ struct file_operations fops = {
+ .read = read_f,
+ };
+
+@ has_read_iter @
+identifier fops0.fops;
+identifier read_iter_f;
+@@
+ struct file_operations fops = {
+ .read_iter = read_iter_f,
+ };
+
+@ has_write @
+identifier fops0.fops;
+identifier write_f;
+@@
+ struct file_operations fops = {
+ .write = write_f,
+ };
+
+@ has_write_iter @
+identifier fops0.fops;
+identifier write_iter_f;
+@@
+ struct file_operations fops = {
+ .write_iter = write_iter_f,
+ };
+
+@ has_llseek @
+identifier fops0.fops;
+identifier llseek_f;
+@@
+ struct file_operations fops = {
+ .llseek = llseek_f,
+ };
+
+@ has_no_llseek @
+identifier fops0.fops;
+@@
+ struct file_operations fops = {
+ .llseek = no_llseek,
+ };
+
+@ has_mmap @
+identifier fops0.fops;
+identifier mmap_f;
+@@
+ struct file_operations fops = {
+ .mmap = mmap_f,
+ };
+
+@ has_copy_file_range @
+identifier fops0.fops;
+identifier copy_file_range_f;
+@@
+ struct file_operations fops = {
+ .copy_file_range = copy_file_range_f,
+ };
+
+@ has_remap_file_range @
+identifier fops0.fops;
+identifier remap_file_range_f;
+@@
+ struct file_operations fops = {
+ .remap_file_range = remap_file_range_f,
+ };
+
+@ has_splice_read @
+identifier fops0.fops;
+identifier splice_read_f;
+@@
+ struct file_operations fops = {
+ .splice_read = splice_read_f,
+ };
+
+@ has_splice_write @
+identifier fops0.fops;
+identifier splice_write_f;
+@@
+ struct file_operations fops = {
+ .splice_write = splice_write_f,
+ };
+
+
+// file_operations that is candidate for stream_open conversion - it does not
+// use mmap and other methods that assume @offset access to file.
+//
+// XXX for simplicity require no .{read/write}_iter and no .splice_{read/write} for now.
+// XXX maybe_steam.fops cannot be used in other rules - it gives "bad rule maybe_stream or bad variable fops".
+@ maybe_stream depends on (!has_llseek || has_no_llseek) && !has_mmap && !has_copy_file_range && !has_remap_file_range && !has_read_iter && !has_write_iter && !has_splice_read && !has_splice_write @
+identifier fops0.fops;
+@@
+ struct file_operations fops = {
+ };
+
+
+// ---- conversions ----
+
+// XXX .open = nonseekable_open -> .open = stream_open
+// XXX .open = func -> openfunc -> nonseekable_open
+
+// read & write
+//
+// if both are used in the same file_operations together with an opener -
+// under that conditions we can use stream_open instead of nonseekable_open.
+@ fops_rw depends on maybe_stream @
+identifier fops0.fops, openfunc;
+identifier stream_reader.readstream;
+identifier stream_writer.writestream;
+@@
+ struct file_operations fops = {
+ .open = openfunc,
+ .read = readstream,
+ .write = writestream,
+ };
+
+@ report_rw depends on report @
+identifier fops_rw.openfunc;
+position p1;
+@@
+ openfunc(...) {
+ <...
+ nonseekable_open@p1
+ ...>
+ }
+
+@ script:python depends on report && reader_blocks @
+fops << fops0.fops;
+p << report_rw.p1;
+@@
+coccilib.report.print_report(p[0],
+ "ERROR: %s: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix." % (fops,))
+
+@ script:python depends on report && !reader_blocks @
+fops << fops0.fops;
+p << report_rw.p1;
+@@
+coccilib.report.print_report(p[0],
+ "WARNING: %s: .read() and .write() have stream semantic; safe to change nonseekable_open -> stream_open." % (fops,))
+
+
+@ explain_rw_deadlocked depends on explain && reader_blocks @
+identifier fops_rw.openfunc;
+@@
+ openfunc(...) {
+ <...
+- nonseekable_open
++ nonseekable_open /* read & write (was deadlock) */
+ ...>
+ }
+
+
+@ explain_rw_nodeadlock depends on explain && !reader_blocks @
+identifier fops_rw.openfunc;
+@@
+ openfunc(...) {
+ <...
+- nonseekable_open
++ nonseekable_open /* read & write (no direct deadlock) */
+ ...>
+ }
+
+@ patch_rw depends on patch @
+identifier fops_rw.openfunc;
+@@
+ openfunc(...) {
+ <...
+- nonseekable_open
++ stream_open
+ ...>
+ }
+
+
+// read, but not write
+@ fops_r depends on maybe_stream && !has_write @
+identifier fops0.fops, openfunc;
+identifier stream_reader.readstream;
+@@
+ struct file_operations fops = {
+ .open = openfunc,
+ .read = readstream,
+ };
+
+@ report_r depends on report @
+identifier fops_r.openfunc;
+position p1;
+@@
+ openfunc(...) {
+ <...
+ nonseekable_open@p1
+ ...>
+ }
+
+@ script:python depends on report @
+fops << fops0.fops;
+p << report_r.p1;
+@@
+coccilib.report.print_report(p[0],
+ "WARNING: %s: .read() has stream semantic; safe to change nonseekable_open -> stream_open." % (fops,))
+
+@ explain_r depends on explain @
+identifier fops_r.openfunc;
+@@
+ openfunc(...) {
+ <...
+- nonseekable_open
++ nonseekable_open /* read only */
+ ...>
+ }
+
+@ patch_r depends on patch @
+identifier fops_r.openfunc;
+@@
+ openfunc(...) {
+ <...
+- nonseekable_open
++ stream_open
+ ...>
+ }
+
+
+// write, but not read
+@ fops_w depends on maybe_stream && !has_read @
+identifier fops0.fops, openfunc;
+identifier stream_writer.writestream;
+@@
+ struct file_operations fops = {
+ .open = openfunc,
+ .write = writestream,
+ };
+
+@ report_w depends on report @
+identifier fops_w.openfunc;
+position p1;
+@@
+ openfunc(...) {
+ <...
+ nonseekable_open@p1
+ ...>
+ }
+
+@ script:python depends on report @
+fops << fops0.fops;
+p << report_w.p1;
+@@
+coccilib.report.print_report(p[0],
+ "WARNING: %s: .write() has stream semantic; safe to change nonseekable_open -> stream_open." % (fops,))
+
+@ explain_w depends on explain @
+identifier fops_w.openfunc;
+@@
+ openfunc(...) {
+ <...
+- nonseekable_open
++ nonseekable_open /* write only */
+ ...>
+ }
+
+@ patch_w depends on patch @
+identifier fops_w.openfunc;
+@@
+ openfunc(...) {
+ <...
+- nonseekable_open
++ stream_open
+ ...>
+ }
+
+
+// no read, no write - don't change anything
$(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN)
$(QUIET_LINK)$(CC) --shared -Wl,-soname,libbpf.so.$(VERSION) \
- -Wl,--version-script=$(VERSION_SCRIPT) $^ -o $@
+ -Wl,--version-script=$(VERSION_SCRIPT) $^ -lelf -o $@
@ln -sf $(@F) $(OUTPUT)libbpf.so
@ln -sf $(@F) $(OUTPUT)libbpf.so.$(VERSION)
install_headers:
$(call QUIET_INSTALL, headers) \
$(call do_install,bpf.h,$(prefix)/include/bpf,644); \
- $(call do_install,libbpf.h,$(prefix)/include/bpf,644);
- $(call do_install,btf.h,$(prefix)/include/bpf,644);
+ $(call do_install,libbpf.h,$(prefix)/include/bpf,644); \
+ $(call do_install,btf.h,$(prefix)/include/bpf,644); \
+ $(call do_install,xsk.h,$(prefix)/include/bpf,644);
install: install_lib
return fwd_kind == real_kind;
}
+ if (cand_kind != canon_kind)
+ return 0;
+
switch (cand_kind) {
case BTF_KIND_INT:
return btf_equal_int(cand_type, canon_type);
#include <cpuid.h>
#include <linux/capability.h>
#include <errno.h>
+#include <math.h>
char *proc_stat = "/proc/stat";
FILE *outf;
unsigned int do_snb_cstates;
unsigned int do_knl_cstates;
unsigned int do_slm_cstates;
-unsigned int do_cnl_cstates;
unsigned int use_c1_residency_msr;
unsigned int has_aperf;
unsigned int has_epb;
#define RAPL_CORES_ENERGY_STATUS (1 << 9)
/* 0x639 MSR_PP0_ENERGY_STATUS */
+#define RAPL_PER_CORE_ENERGY (1 << 10)
+ /* Indicates cores energy collection is per-core,
+ * not per-package. */
+#define RAPL_AMD_F17H (1 << 11)
+ /* 0xc0010299 MSR_RAPL_PWR_UNIT */
+ /* 0xc001029a MSR_CORE_ENERGY_STAT */
+ /* 0xc001029b MSR_PKG_ENERGY_STAT */
#define RAPL_CORES (RAPL_CORES_ENERGY_STATUS | RAPL_CORES_POWER_LIMIT)
#define TJMAX_DEFAULT 100
+/* MSRs that are not yet in the kernel-provided header. */
+#define MSR_RAPL_PWR_UNIT 0xc0010299
+#define MSR_CORE_ENERGY_STAT 0xc001029a
+#define MSR_PKG_ENERGY_STAT 0xc001029b
+
#define MAX(a, b) ((a) > (b) ? (a) : (b))
/*
unsigned long long c7;
unsigned long long mc6_us; /* duplicate as per-core for now, even though per module */
unsigned int core_temp_c;
+ unsigned int core_energy; /* MSR_CORE_ENERGY_STAT */
unsigned int core_id;
unsigned long long counter[MAX_ADDED_COUNTERS];
} *core_even, *core_odd;
struct cpu_topology {
int physical_package_id;
+ int die_id;
int logical_cpu_id;
int physical_node_id;
int logical_node_id; /* 0-based count within the package */
struct topo_params {
int num_packages;
+ int num_die;
int num_cpus;
int num_cores;
int max_cpu_num;
int retval, pkg_no, core_no, thread_no, node_no;
for (pkg_no = 0; pkg_no < topo.num_packages; ++pkg_no) {
- for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
- for (node_no = 0; node_no < topo.nodes_per_pkg;
- node_no++) {
+ for (node_no = 0; node_no < topo.nodes_per_pkg; node_no++) {
+ for (core_no = 0; core_no < topo.cores_per_node; ++core_no) {
for (thread_no = 0; thread_no <
topo.threads_per_core; ++thread_no) {
struct thread_data *t;
{ 0x0, "CPU" },
{ 0x0, "APIC" },
{ 0x0, "X2APIC" },
+ { 0x0, "Die" },
};
#define MAX_BIC (sizeof(bic) / sizeof(struct msr_counter))
#define BIC_CPU (1ULL << 47)
#define BIC_APIC (1ULL << 48)
#define BIC_X2APIC (1ULL << 49)
+#define BIC_Die (1ULL << 50)
#define BIC_DISABLED_BY_DEFAULT (BIC_USEC | BIC_TOD | BIC_APIC | BIC_X2APIC)
outp += sprintf(outp, "%sTime_Of_Day_Seconds", (printed++ ? delim : ""));
if (DO_BIC(BIC_Package))
outp += sprintf(outp, "%sPackage", (printed++ ? delim : ""));
+ if (DO_BIC(BIC_Die))
+ outp += sprintf(outp, "%sDie", (printed++ ? delim : ""));
if (DO_BIC(BIC_Node))
outp += sprintf(outp, "%sNode", (printed++ ? delim : ""));
if (DO_BIC(BIC_Core))
if (DO_BIC(BIC_CPU_c1))
outp += sprintf(outp, "%sCPU%%c1", (printed++ ? delim : ""));
- if (DO_BIC(BIC_CPU_c3) && !do_slm_cstates && !do_knl_cstates && !do_cnl_cstates)
+ if (DO_BIC(BIC_CPU_c3))
outp += sprintf(outp, "%sCPU%%c3", (printed++ ? delim : ""));
if (DO_BIC(BIC_CPU_c6))
outp += sprintf(outp, "%sCPU%%c6", (printed++ ? delim : ""));
if (DO_BIC(BIC_CoreTmp))
outp += sprintf(outp, "%sCoreTmp", (printed++ ? delim : ""));
+ if (do_rapl && !rapl_joules) {
+ if (DO_BIC(BIC_CorWatt) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ outp += sprintf(outp, "%sCorWatt", (printed++ ? delim : ""));
+ } else if (do_rapl && rapl_joules) {
+ if (DO_BIC(BIC_Cor_J) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ outp += sprintf(outp, "%sCor_J", (printed++ ? delim : ""));
+ }
+
for (mp = sys.cp; mp; mp = mp->next) {
if (mp->format == FORMAT_RAW) {
if (mp->width == 64)
if (do_rapl && !rapl_joules) {
if (DO_BIC(BIC_PkgWatt))
outp += sprintf(outp, "%sPkgWatt", (printed++ ? delim : ""));
- if (DO_BIC(BIC_CorWatt))
+ if (DO_BIC(BIC_CorWatt) && !(do_rapl & RAPL_PER_CORE_ENERGY))
outp += sprintf(outp, "%sCorWatt", (printed++ ? delim : ""));
if (DO_BIC(BIC_GFXWatt))
outp += sprintf(outp, "%sGFXWatt", (printed++ ? delim : ""));
} else if (do_rapl && rapl_joules) {
if (DO_BIC(BIC_Pkg_J))
outp += sprintf(outp, "%sPkg_J", (printed++ ? delim : ""));
- if (DO_BIC(BIC_Cor_J))
+ if (DO_BIC(BIC_Cor_J) && !(do_rapl & RAPL_PER_CORE_ENERGY))
outp += sprintf(outp, "%sCor_J", (printed++ ? delim : ""));
if (DO_BIC(BIC_GFX_J))
outp += sprintf(outp, "%sGFX_J", (printed++ ? delim : ""));
outp += sprintf(outp, "c6: %016llX\n", c->c6);
outp += sprintf(outp, "c7: %016llX\n", c->c7);
outp += sprintf(outp, "DTS: %dC\n", c->core_temp_c);
+ outp += sprintf(outp, "Joules: %0X\n", c->core_energy);
for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
outp += sprintf(outp, "cADDED [%d] msr0x%x: %08llX\n",
if (t == &average.threads) {
if (DO_BIC(BIC_Package))
outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
+ if (DO_BIC(BIC_Die))
+ outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
if (DO_BIC(BIC_Node))
outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
if (DO_BIC(BIC_Core))
else
outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
}
+ if (DO_BIC(BIC_Die)) {
+ if (c)
+ outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), cpus[t->cpu_id].die_id);
+ else
+ outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
+ }
if (DO_BIC(BIC_Node)) {
if (t)
outp += sprintf(outp, "%s%d",
if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
goto done;
- if (DO_BIC(BIC_CPU_c3) && !do_slm_cstates && !do_knl_cstates && !do_cnl_cstates)
+ if (DO_BIC(BIC_CPU_c3))
outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * c->c3/tsc);
if (DO_BIC(BIC_CPU_c6))
outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * c->c6/tsc);
}
}
+ /*
+ * If measurement interval exceeds minimum RAPL Joule Counter range,
+ * indicate that results are suspect by printing "**" in fraction place.
+ */
+ if (interval_float < rapl_joule_counter_range)
+ fmt8 = "%s%.2f";
+ else
+ fmt8 = "%6.0f**";
+
+ if (DO_BIC(BIC_CorWatt) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ outp += sprintf(outp, fmt8, (printed++ ? delim : ""), c->core_energy * rapl_energy_units / interval_float);
+ if (DO_BIC(BIC_Cor_J) && (do_rapl & RAPL_PER_CORE_ENERGY))
+ outp += sprintf(outp, fmt8, (printed++ ? delim : ""), c->core_energy * rapl_energy_units);
+
/* print per-package data only for 1st core in package */
if (!(t->flags & CPU_IS_FIRST_CORE_IN_PACKAGE))
goto done;
if (DO_BIC(BIC_SYS_LPI))
outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->sys_lpi / 1000000.0 / interval_float);
- /*
- * If measurement interval exceeds minimum RAPL Joule Counter range,
- * indicate that results are suspect by printing "**" in fraction place.
- */
- if (interval_float < rapl_joule_counter_range)
- fmt8 = "%s%.2f";
- else
- fmt8 = "%6.0f**";
-
if (DO_BIC(BIC_PkgWatt))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_pkg * rapl_energy_units / interval_float);
- if (DO_BIC(BIC_CorWatt))
+ if (DO_BIC(BIC_CorWatt) && !(do_rapl & RAPL_PER_CORE_ENERGY))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_cores * rapl_energy_units / interval_float);
if (DO_BIC(BIC_GFXWatt))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_gfx * rapl_energy_units / interval_float);
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_dram * rapl_dram_energy_units / interval_float);
if (DO_BIC(BIC_Pkg_J))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_pkg * rapl_energy_units);
- if (DO_BIC(BIC_Cor_J))
+ if (DO_BIC(BIC_Cor_J) && !(do_rapl & RAPL_PER_CORE_ENERGY))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_cores * rapl_energy_units);
if (DO_BIC(BIC_GFX_J))
outp += sprintf(outp, fmt8, (printed++ ? delim : ""), p->energy_gfx * rapl_energy_units);
old->core_temp_c = new->core_temp_c;
old->mc6_us = new->mc6_us - old->mc6_us;
+ DELTA_WRAP32(new->core_energy, old->core_energy);
+
for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
if (mp->format == FORMAT_RAW)
old->counter[i] = new->counter[i];
c->c7 = 0;
c->mc6_us = 0;
c->core_temp_c = 0;
+ c->core_energy = 0;
p->pkg_wtd_core_c0 = 0;
p->pkg_any_core_c0 = 0;
average.cores.core_temp_c = MAX(average.cores.core_temp_c, c->core_temp_c);
+ average.cores.core_energy += c->core_energy;
+
for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
if (mp->format == FORMAT_RAW)
continue;
if (!(t->flags & CPU_IS_FIRST_THREAD_IN_CORE))
goto done;
- if (DO_BIC(BIC_CPU_c3) && !do_slm_cstates && !do_knl_cstates && !do_cnl_cstates) {
+ if (DO_BIC(BIC_CPU_c3)) {
if (get_msr(cpu, MSR_CORE_C3_RESIDENCY, &c->c3))
return -6;
}
c->core_temp_c = tcc_activation_temp - ((msr >> 16) & 0x7F);
}
+ if (do_rapl & RAPL_AMD_F17H) {
+ if (get_msr(cpu, MSR_CORE_ENERGY_STAT, &msr))
+ return -14;
+ c->core_energy = msr & 0xFFFFFFFF;
+ }
+
for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
if (get_mp(cpu, mp, &c->counter[i]))
return -10;
return -16;
p->rapl_dram_perf_status = msr & 0xFFFFFFFF;
}
+ if (do_rapl & RAPL_AMD_F17H) {
+ if (get_msr(cpu, MSR_PKG_ENERGY_STAT, &msr))
+ return -13;
+ p->energy_pkg = msr & 0xFFFFFFFF;
+ }
if (DO_BIC(BIC_PkgTmp)) {
if (get_msr(cpu, MSR_IA32_PACKAGE_THERM_STATUS, &msr))
return -17;
/*
* Parse a file containing a single int.
+ * Return 0 if file can not be opened
+ * Exit if file can be opened, but can not be parsed
*/
int parse_int_file(const char *fmt, ...)
{
va_start(args, fmt);
vsnprintf(path, sizeof(path), fmt, args);
va_end(args);
- filep = fopen_or_die(path, "r");
+ filep = fopen(path, "r");
+ if (!filep)
+ return 0;
if (fscanf(filep, "%d", &value) != 1)
err(1, "%s: failed to parse number from file", path);
fclose(filep);
return parse_int_file("/sys/devices/system/cpu/cpu%d/topology/physical_package_id", cpu);
}
+int get_die_id(int cpu)
+{
+ return parse_int_file("/sys/devices/system/cpu/cpu%d/topology/die_id", cpu);
+}
+
int get_core_id(int cpu)
{
return parse_int_file("/sys/devices/system/cpu/cpu%d/topology/core_id", cpu);
filep = fopen_or_die(path, "r");
do {
offset -= BITMASK_SIZE;
- fscanf(filep, "%lx%c", &map, &character);
+ if (fscanf(filep, "%lx%c", &map, &character) != 2)
+ err(1, "%s: failed to parse file", path);
for (shift = 0; shift < BITMASK_SIZE; shift++) {
if ((map >> shift) & 0x1) {
so = shift + offset;
fp = fopen_or_die("/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us", "r");
retval = fscanf(fp, "%lld", &cpuidle_cur_cpu_lpi_us);
- if (retval != 1)
- err(1, "CPU LPI");
+ if (retval != 1) {
+ fprintf(stderr, "Disabling Low Power Idle CPU output\n");
+ BIC_NOT_PRESENT(BIC_CPU_LPI);
+ return -1;
+ }
fclose(fp);
fp = fopen_or_die("/sys/devices/system/cpu/cpuidle/low_power_idle_system_residency_us", "r");
retval = fscanf(fp, "%lld", &cpuidle_cur_sys_lpi_us);
- if (retval != 1)
- err(1, "SYS LPI");
-
+ if (retval != 1) {
+ fprintf(stderr, "Disabling Low Power Idle System output\n");
+ BIC_NOT_PRESENT(BIC_SYS_LPI);
+ return -1;
+ }
fclose(fp);
return 0;
input = fopen(path, "r");
if (input == NULL)
continue;
- fgets(name_buf, sizeof(name_buf), input);
+ if (!fgets(name_buf, sizeof(name_buf), input))
+ err(1, "%s: failed to read file", path);
/* truncate "C1-HSW\n" to "C1", or truncate "C1\n" to "C1" */
sp = strchr(name_buf, '-');
if (!sp)
sp = strchrnul(name_buf, '\n');
*sp = '\0';
-
fclose(input);
sprintf(path, "/sys/devices/system/cpu/cpu%d/cpuidle/state%d/desc",
input = fopen(path, "r");
if (input == NULL)
continue;
- fgets(desc, sizeof(desc), input);
+ if (!fgets(desc, sizeof(desc), input))
+ err(1, "%s: failed to read file", path);
fprintf(outf, "cpu%d: %s: %s", base_cpu, name_buf, desc);
fclose(input);
base_cpu);
input = fopen(path, "r");
if (input == NULL) {
- fprintf(stderr, "NSFOD %s\n", path);
+ fprintf(outf, "NSFOD %s\n", path);
return;
}
- fgets(driver_buf, sizeof(driver_buf), input);
+ if (!fgets(driver_buf, sizeof(driver_buf), input))
+ err(1, "%s: failed to read file", path);
fclose(input);
sprintf(path, "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor",
base_cpu);
input = fopen(path, "r");
if (input == NULL) {
- fprintf(stderr, "NSFOD %s\n", path);
+ fprintf(outf, "NSFOD %s\n", path);
return;
}
- fgets(governor_buf, sizeof(governor_buf), input);
+ if (!fgets(governor_buf, sizeof(governor_buf), input))
+ err(1, "%s: failed to read file", path);
fclose(input);
fprintf(outf, "cpu%d: cpufreq driver: %s", base_cpu, driver_buf);
sprintf(path, "/sys/devices/system/cpu/cpufreq/boost");
input = fopen(path, "r");
if (input != NULL) {
- fscanf(input, "%d", &turbo);
+ if (fscanf(input, "%d", &turbo) != 1)
+ err(1, "%s: failed to parse number from file", path);
fprintf(outf, "cpufreq boost: %d\n", turbo);
fclose(input);
}
sprintf(path, "/sys/devices/system/cpu/intel_pstate/no_turbo");
input = fopen(path, "r");
if (input != NULL) {
- fscanf(input, "%d", &turbo);
+ if (fscanf(input, "%d", &turbo) != 1)
+ err(1, "%s: failed to parse number from file", path);
fprintf(outf, "cpufreq intel_pstate no_turbo: %d\n", turbo);
fclose(input);
}
#define RAPL_POWER_GRANULARITY 0x7FFF /* 15 bit power granularity */
#define RAPL_TIME_GRANULARITY 0x3F /* 6 bit time granularity */
-double get_tdp(unsigned int model)
+double get_tdp_intel(unsigned int model)
{
unsigned long long msr;
}
}
+double get_tdp_amd(unsigned int family)
+{
+ switch (family) {
+ case 0x17:
+ default:
+ /* This is the max stock TDP of HEDT/Server Fam17h chips */
+ return 250.0;
+ }
+}
+
/*
* rapl_dram_energy_units_probe()
* Energy units are either hard-coded, or come from RAPL Energy Unit MSR.
}
}
-
-/*
- * rapl_probe()
- *
- * sets do_rapl, rapl_power_units, rapl_energy_units, rapl_time_units
- */
-void rapl_probe(unsigned int family, unsigned int model)
+void rapl_probe_intel(unsigned int family, unsigned int model)
{
unsigned long long msr;
unsigned int time_unit;
double tdp;
- if (!genuine_intel)
- return;
-
if (family != 6)
return;
rapl_time_units = 1.0 / (1 << (time_unit));
- tdp = get_tdp(model);
+ tdp = get_tdp_intel(model);
rapl_joule_counter_range = 0xFFFFFFFF * rapl_energy_units / tdp;
if (!quiet)
fprintf(outf, "RAPL: %.0f sec. Joule Counter Range, at %.0f Watts\n", rapl_joule_counter_range, tdp);
+}
- return;
+void rapl_probe_amd(unsigned int family, unsigned int model)
+{
+ unsigned long long msr;
+ unsigned int eax, ebx, ecx, edx;
+ unsigned int has_rapl = 0;
+ double tdp;
+
+ if (max_extended_level >= 0x80000007) {
+ __cpuid(0x80000007, eax, ebx, ecx, edx);
+ /* RAPL (Fam 17h) */
+ has_rapl = edx & (1 << 14);
+ }
+
+ if (!has_rapl)
+ return;
+
+ switch (family) {
+ case 0x17: /* Zen, Zen+ */
+ do_rapl = RAPL_AMD_F17H | RAPL_PER_CORE_ENERGY;
+ if (rapl_joules) {
+ BIC_PRESENT(BIC_Pkg_J);
+ BIC_PRESENT(BIC_Cor_J);
+ } else {
+ BIC_PRESENT(BIC_PkgWatt);
+ BIC_PRESENT(BIC_CorWatt);
+ }
+ break;
+ default:
+ return;
+ }
+
+ if (get_msr(base_cpu, MSR_RAPL_PWR_UNIT, &msr))
+ return;
+
+ rapl_time_units = ldexp(1.0, -(msr >> 16 & 0xf));
+ rapl_energy_units = ldexp(1.0, -(msr >> 8 & 0x1f));
+ rapl_power_units = ldexp(1.0, -(msr & 0xf));
+
+ tdp = get_tdp_amd(model);
+
+ rapl_joule_counter_range = 0xFFFFFFFF * rapl_energy_units / tdp;
+ if (!quiet)
+ fprintf(outf, "RAPL: %.0f sec. Joule Counter Range, at %.0f Watts\n", rapl_joule_counter_range, tdp);
+}
+
+/*
+ * rapl_probe()
+ *
+ * sets do_rapl, rapl_power_units, rapl_energy_units, rapl_time_units
+ */
+void rapl_probe(unsigned int family, unsigned int model)
+{
+ if (genuine_intel)
+ rapl_probe_intel(family, model);
+ if (authentic_amd)
+ rapl_probe_amd(family, model);
}
void perf_limit_reasons_probe(unsigned int family, unsigned int model)
int print_rapl(struct thread_data *t, struct core_data *c, struct pkg_data *p)
{
unsigned long long msr;
+ const char *msr_name;
int cpu;
if (!do_rapl)
return -1;
}
- if (get_msr(cpu, MSR_RAPL_POWER_UNIT, &msr))
- return -1;
+ if (do_rapl & RAPL_AMD_F17H) {
+ msr_name = "MSR_RAPL_PWR_UNIT";
+ if (get_msr(cpu, MSR_RAPL_PWR_UNIT, &msr))
+ return -1;
+ } else {
+ msr_name = "MSR_RAPL_POWER_UNIT";
+ if (get_msr(cpu, MSR_RAPL_POWER_UNIT, &msr))
+ return -1;
+ }
- fprintf(outf, "cpu%d: MSR_RAPL_POWER_UNIT: 0x%08llx (%f Watts, %f Joules, %f sec.)\n", cpu, msr,
+ fprintf(outf, "cpu%d: %s: 0x%08llx (%f Watts, %f Joules, %f sec.)\n", cpu, msr_name, msr,
rapl_power_units, rapl_energy_units, rapl_time_units);
if (do_rapl & RAPL_PKG_POWER_INFO) {
case INTEL_FAM6_KABYLAKE_MOBILE:
case INTEL_FAM6_KABYLAKE_DESKTOP:
return INTEL_FAM6_SKYLAKE_MOBILE;
+
+ case INTEL_FAM6_ICELAKE_MOBILE:
+ return INTEL_FAM6_CANNONLAKE_MOBILE;
}
return model;
}
}
do_slm_cstates = is_slm(family, model);
do_knl_cstates = is_knl(family, model);
- do_cnl_cstates = is_cnl(family, model);
+
+ if (do_slm_cstates || do_knl_cstates || is_cnl(family, model))
+ BIC_NOT_PRESENT(BIC_CPU_c3);
if (!quiet)
decode_misc_pwr_mgmt_msr();
int i;
int max_core_id = 0;
int max_package_id = 0;
+ int max_die_id = 0;
int max_siblings = 0;
/* Initialize num_cpus, max_cpu_num */
if (cpus[i].physical_package_id > max_package_id)
max_package_id = cpus[i].physical_package_id;
+ /* get die information */
+ cpus[i].die_id = get_die_id(i);
+ if (cpus[i].die_id > max_die_id)
+ max_die_id = cpus[i].die_id;
+
/* get numa node information */
cpus[i].physical_node_id = get_physical_node_id(&cpus[i]);
if (cpus[i].physical_node_id > topo.max_node_num)
if (!summary_only && topo.cores_per_node > 1)
BIC_PRESENT(BIC_Core);
+ topo.num_die = max_die_id + 1;
+ if (debug > 1)
+ fprintf(outf, "max_die_id %d, sizing for %d die\n",
+ max_die_id, topo.num_die);
+ if (!summary_only && topo.num_die > 1)
+ BIC_PRESENT(BIC_Die);
+
topo.num_packages = max_package_id + 1;
if (debug > 1)
fprintf(outf, "max_package_id %d, sizing for %d packages\n",
if (cpu_is_not_present(i))
continue;
fprintf(outf,
- "cpu %d pkg %d node %d lnode %d core %d thread %d\n",
- i, cpus[i].physical_package_id,
+ "cpu %d pkg %d die %d node %d lnode %d core %d thread %d\n",
+ i, cpus[i].physical_package_id, cpus[i].die_id,
cpus[i].physical_node_id,
cpus[i].logical_node_id,
cpus[i].physical_core_id,
}
void print_version() {
- fprintf(outf, "turbostat version 18.07.27"
+ fprintf(outf, "turbostat version 19.03.20"
" - Len Brown <lenb@kernel.org>\n");
}
input = fopen(path, "r");
if (input == NULL)
continue;
- fgets(name_buf, sizeof(name_buf), input);
+ if (!fgets(name_buf, sizeof(name_buf), input))
+ err(1, "%s: failed to read file", path);
/* truncate "C1-HSW\n" to "C1", or truncate "C1\n" to "C1" */
sp = strchr(name_buf, '-');
input = fopen(path, "r");
if (input == NULL)
continue;
- fgets(name_buf, sizeof(name_buf), input);
+ if (!fgets(name_buf, sizeof(name_buf), input))
+ err(1, "%s: failed to read file", path);
/* truncate "C1-HSW\n" to "C1", or truncate "C1\n" to "C1" */
sp = strchr(name_buf, '-');
if (!sp)
.n_proto = __bpf_constant_htons(ETH_P_IPV6),
};
+#define VLAN_HLEN 4
+
+static struct {
+ struct ethhdr eth;
+ __u16 vlan_tci;
+ __u16 vlan_proto;
+ struct iphdr iph;
+ struct tcphdr tcp;
+} __packed pkt_vlan_v4 = {
+ .eth.h_proto = __bpf_constant_htons(ETH_P_8021Q),
+ .vlan_proto = __bpf_constant_htons(ETH_P_IP),
+ .iph.ihl = 5,
+ .iph.protocol = IPPROTO_TCP,
+ .iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
+ .tcp.urg_ptr = 123,
+ .tcp.doff = 5,
+};
+
+static struct bpf_flow_keys pkt_vlan_v4_flow_keys = {
+ .nhoff = VLAN_HLEN,
+ .thoff = VLAN_HLEN + sizeof(struct iphdr),
+ .addr_proto = ETH_P_IP,
+ .ip_proto = IPPROTO_TCP,
+ .n_proto = __bpf_constant_htons(ETH_P_IP),
+};
+
+static struct {
+ struct ethhdr eth;
+ __u16 vlan_tci;
+ __u16 vlan_proto;
+ __u16 vlan_tci2;
+ __u16 vlan_proto2;
+ struct ipv6hdr iph;
+ struct tcphdr tcp;
+} __packed pkt_vlan_v6 = {
+ .eth.h_proto = __bpf_constant_htons(ETH_P_8021AD),
+ .vlan_proto = __bpf_constant_htons(ETH_P_8021Q),
+ .vlan_proto2 = __bpf_constant_htons(ETH_P_IPV6),
+ .iph.nexthdr = IPPROTO_TCP,
+ .iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
+ .tcp.urg_ptr = 123,
+ .tcp.doff = 5,
+};
+
+static struct bpf_flow_keys pkt_vlan_v6_flow_keys = {
+ .nhoff = VLAN_HLEN * 2,
+ .thoff = VLAN_HLEN * 2 + sizeof(struct ipv6hdr),
+ .addr_proto = ETH_P_IPV6,
+ .ip_proto = IPPROTO_TCP,
+ .n_proto = __bpf_constant_htons(ETH_P_IPV6),
+};
+
void test_flow_dissector(void)
{
struct bpf_flow_keys flow_keys;
err, errno, retval, duration, size, sizeof(flow_keys));
CHECK_FLOW_KEYS("ipv6_flow_keys", flow_keys, pkt_v6_flow_keys);
+ err = bpf_prog_test_run(prog_fd, 10, &pkt_vlan_v4, sizeof(pkt_vlan_v4),
+ &flow_keys, &size, &retval, &duration);
+ CHECK(size != sizeof(flow_keys) || err || retval != 1, "vlan_ipv4",
+ "err %d errno %d retval %d duration %d size %u/%lu\n",
+ err, errno, retval, duration, size, sizeof(flow_keys));
+ CHECK_FLOW_KEYS("vlan_ipv4_flow_keys", flow_keys,
+ pkt_vlan_v4_flow_keys);
+
+ err = bpf_prog_test_run(prog_fd, 10, &pkt_vlan_v6, sizeof(pkt_vlan_v6),
+ &flow_keys, &size, &retval, &duration);
+ CHECK(size != sizeof(flow_keys) || err || retval != 1, "vlan_ipv6",
+ "err %d errno %d retval %d duration %d size %u/%lu\n",
+ err, errno, retval, duration, size, sizeof(flow_keys));
+ CHECK_FLOW_KEYS("vlan_ipv6_flow_keys", flow_keys,
+ pkt_vlan_v6_flow_keys);
+
bpf_object__close(obj);
}
{
struct bpf_flow_keys *keys = skb->flow_keys;
- keys->n_proto = proto;
switch (proto) {
case bpf_htons(ETH_P_IP):
bpf_tail_call(skb, &jmp_table, IP);
SEC("flow_dissector")
int _dissect(struct __sk_buff *skb)
{
- if (!skb->vlan_present)
- return parse_eth_proto(skb, skb->protocol);
- else
- return parse_eth_proto(skb, skb->vlan_proto);
+ struct bpf_flow_keys *keys = skb->flow_keys;
+
+ return parse_eth_proto(skb, keys->n_proto);
}
/* Parses on IPPROTO_* */
{
struct bpf_flow_keys *keys = skb->flow_keys;
struct vlan_hdr *vlan, _vlan;
- __be16 proto;
-
- /* Peek back to see if single or double-tagging */
- if (bpf_skb_load_bytes(skb, keys->thoff - sizeof(proto), &proto,
- sizeof(proto)))
- return BPF_DROP;
/* Account for double-tagging */
- if (proto == bpf_htons(ETH_P_8021AD)) {
+ if (keys->n_proto == bpf_htons(ETH_P_8021AD)) {
vlan = bpf_flow_dissect_get_header(skb, sizeof(*vlan), &_vlan);
if (!vlan)
return BPF_DROP;
if (vlan->h_vlan_encapsulated_proto != bpf_htons(ETH_P_8021Q))
return BPF_DROP;
+ keys->nhoff += sizeof(*vlan);
keys->thoff += sizeof(*vlan);
}
if (!vlan)
return BPF_DROP;
+ keys->nhoff += sizeof(*vlan);
keys->thoff += sizeof(*vlan);
/* Only allow 8021AD + 8021Q double tagging and no triple tagging.*/
if (vlan->h_vlan_encapsulated_proto == bpf_htons(ETH_P_8021AD) ||
vlan->h_vlan_encapsulated_proto == bpf_htons(ETH_P_8021Q))
return BPF_DROP;
+ keys->n_proto = vlan->h_vlan_encapsulated_proto;
return parse_eth_proto(skb, vlan->h_vlan_encapsulated_proto);
}
.dedup_table_size = 1, /* force hash collisions */
},
},
+{
+ .descr = "dedup: void equiv check",
+ /*
+ * // CU 1:
+ * struct s {
+ * struct {} *x;
+ * };
+ * // CU 2:
+ * struct s {
+ * int *x;
+ * };
+ */
+ .input = {
+ .raw_types = {
+ /* CU 1 */
+ BTF_STRUCT_ENC(0, 0, 1), /* [1] struct {} */
+ BTF_PTR_ENC(1), /* [2] ptr -> [1] */
+ BTF_STRUCT_ENC(NAME_NTH(1), 1, 8), /* [3] struct s */
+ BTF_MEMBER_ENC(NAME_NTH(2), 2, 0),
+ /* CU 2 */
+ BTF_PTR_ENC(0), /* [4] ptr -> void */
+ BTF_STRUCT_ENC(NAME_NTH(1), 1, 8), /* [5] struct s */
+ BTF_MEMBER_ENC(NAME_NTH(2), 4, 0),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0s\0x"),
+ },
+ .expect = {
+ .raw_types = {
+ /* CU 1 */
+ BTF_STRUCT_ENC(0, 0, 1), /* [1] struct {} */
+ BTF_PTR_ENC(1), /* [2] ptr -> [1] */
+ BTF_STRUCT_ENC(NAME_NTH(1), 1, 8), /* [3] struct s */
+ BTF_MEMBER_ENC(NAME_NTH(2), 2, 0),
+ /* CU 2 */
+ BTF_PTR_ENC(0), /* [4] ptr -> void */
+ BTF_STRUCT_ENC(NAME_NTH(1), 1, 8), /* [5] struct s */
+ BTF_MEMBER_ENC(NAME_NTH(2), 4, 0),
+ BTF_END_RAW,
+ },
+ BTF_STR_SEC("\0s\0x"),
+ },
+ .opts = {
+ .dont_resolve_fwds = false,
+ .dedup_table_size = 1, /* force hash collisions */
+ },
+},
{
.descr = "dedup: all possible kinds (no duplicates)",
.input = {
.errstr = "call stack",
.result = REJECT,
},
+{
+ "calls: stack depth check in dead code",
+ .insns = {
+ /* main */
+ BPF_MOV64_IMM(BPF_REG_1, 0),
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 1), /* call A */
+ BPF_EXIT_INSN(),
+ /* A */
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 1),
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 2), /* call B */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ /* B */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 1), /* call C */
+ BPF_EXIT_INSN(),
+ /* C */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 1), /* call D */
+ BPF_EXIT_INSN(),
+ /* D */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 1), /* call E */
+ BPF_EXIT_INSN(),
+ /* E */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 1), /* call F */
+ BPF_EXIT_INSN(),
+ /* F */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 1), /* call G */
+ BPF_EXIT_INSN(),
+ /* G */
+ BPF_RAW_INSN(BPF_JMP|BPF_CALL, 0, 1, 0, 1), /* call H */
+ BPF_EXIT_INSN(),
+ /* H */
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_XDP,
+ .errstr = "call stack",
+ .result = REJECT,
+},
{
"calls: spill into caller stack frame",
.insns = {
"$TC actions flush action sample"
]
},
+ {
+ "id": "7571",
+ "name": "Add sample action with invalid rate",
+ "category": [
+ "actions",
+ "sample"
+ ],
+ "setup": [
+ [
+ "$TC actions flush action sample",
+ 0,
+ 1,
+ 255
+ ]
+ ],
+ "cmdUnderTest": "$TC actions add action sample rate 0 group 1 index 2",
+ "expExitCode": "255",
+ "verifyCmd": "$TC actions get action sample index 2",
+ "matchPattern": "action order [0-9]+: sample rate 1/0 group 1.*index 2 ref",
+ "matchCount": "0",
+ "teardown": [
+ "$TC actions flush action sample"
+ ]
+ },
{
"id": "b6d4",
"name": "Add sample action with mandatory arguments and invalid control action",