linux-2.6-microblaze.git
2 years ago  bpf: Dynptr support for ring buffers
Joanne Koong [Mon, 23 May 2022 21:07:09 +0000 (14:07 -0700)]
bpf: Dynptr support for ring buffers

Currently, our only way of writing dynamically-sized data into a ring
buffer is through bpf_ringbuf_output, but this incurs an extra memcpy
cost. bpf_ringbuf_reserve + bpf_ringbuf_submit avoids this extra
memcpy, but it can only safely support reservation sizes that are
statically known, since the verifier cannot guarantee that the bpf
program won't access memory outside the reserved space.

The bpf_dynptr abstraction allows for dynamically-sized ring buffer
reservations without the extra memcpy.

There are 3 new APIs:

long bpf_ringbuf_reserve_dynptr(void *ringbuf, u32 size, u64 flags, struct bpf_dynptr *ptr);
void bpf_ringbuf_submit_dynptr(struct bpf_dynptr *ptr, u64 flags);
void bpf_ringbuf_discard_dynptr(struct bpf_dynptr *ptr, u64 flags);

These closely follow the functionality of the original ringbuf APIs.
For example, all ringbuffer dynptrs that have been reserved must be
either submitted or discarded before the program exits.
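
A minimal usage sketch (not part of this patch; it assumes vmlinux.h and
bpf_helpers.h, and the map/program names are illustrative only):

  struct {
          __uint(type, BPF_MAP_TYPE_RINGBUF);
          __uint(max_entries, 4096);
  } ringbuf SEC(".maps");

  SEC("tp/syscalls/sys_enter_write")
  int write_sample(void *ctx)
  {
          struct bpf_dynptr ptr;
          __u32 size = 64; /* may be computed at runtime */

          if (bpf_ringbuf_reserve_dynptr(&ringbuf, size, 0, &ptr)) {
                  /* reservation failed; release the dynptr and bail */
                  bpf_ringbuf_discard_dynptr(&ptr, 0);
                  return 0;
          }

          /* fill the reservation, e.g. with bpf_dynptr_write() (added
           * later in this series), then publish it
           */
          bpf_ringbuf_submit_dynptr(&ptr, 0);
          return 0;
  }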

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-4-joannelkoong@gmail.com
2 years ago  bpf: Add bpf_dynptr_from_mem for local dynptrs
Joanne Koong [Mon, 23 May 2022 21:07:08 +0000 (14:07 -0700)]
bpf: Add bpf_dynptr_from_mem for local dynptrs

This patch adds a new api bpf_dynptr_from_mem:

long bpf_dynptr_from_mem(void *data, u32 size, u64 flags, struct bpf_dynptr *ptr);

which initializes a dynptr to point to a bpf program's local memory. For
now, only local memory of reg type PTR_TO_MAP_VALUE is supported.
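
A hedged sketch of the intended usage (map and struct names are
illustrative; the backing memory must be a map value for now):

  struct data {
          char buf[64];
  };

  struct {
          __uint(type, BPF_MAP_TYPE_ARRAY);
          __uint(max_entries, 1);
          __type(key, __u32);
          __type(value, struct data);
  } array_map SEC(".maps");

  ...
          __u32 key = 0;
          struct data *val;
          struct bpf_dynptr ptr;

          val = bpf_map_lookup_elem(&array_map, &key);
          if (!val)
                  return 0;
          if (bpf_dynptr_from_mem(val->buf, sizeof(val->buf), 0, &ptr))
                  return 0;
          /* ptr can now be passed to dynptr-aware helpers */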

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-3-joannelkoong@gmail.com
2 years ago  bpf: Add verifier support for dynptrs
Joanne Koong [Mon, 23 May 2022 21:07:07 +0000 (14:07 -0700)]
bpf: Add verifier support for dynptrs

This patch adds the bulk of the verifier work for supporting dynamic
pointers (dynptrs) in bpf.

A bpf_dynptr is opaque to the bpf program. It is a 16-byte structure
defined internally as:

struct bpf_dynptr_kern {
    void *data;
    u32 size;
    u32 offset;
} __aligned(8);

The upper 8 bits of *size* are reserved (they contain extra metadata about
read-only status and dynptr type). Consequently, a dynptr only supports
memory regions smaller than 16 MB.
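
The exact bit layout is an implementation detail, but the 24-bit size limit
implies masks along these lines (illustrative, not necessarily the kernel's
exact defines):

  #define DYNPTR_MAX_SIZE   ((1UL << 24) - 1)  /* ~16 MB */
  #define DYNPTR_SIZE_MASK  0xFFFFFF           /* low 24 bits hold the size */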

There are different types of dynptrs (e.g. malloc, ringbuf, ...). In this
patchset, the most basic one, dynptrs to a bpf program's local memory,
is added. For now, only local memory of reg type PTR_TO_MAP_VALUE is
supported.

In the verifier, dynptr state information will be tracked in stack
slots. When the program passes in an uninitialized dynptr
(ARG_PTR_TO_DYNPTR | MEM_UNINIT), the stack slots corresponding
to the frame pointer where the dynptr resides are marked
STACK_DYNPTR. For helper functions that take in initialized dynptrs (e.g.
bpf_dynptr_read and bpf_dynptr_write, which are added later in this
patchset), the verifier enforces that the dynptr has been initialized
properly by checking that the corresponding stack slots have been
marked as STACK_DYNPTR.

The 6th patch in this patchset adds test cases that the verifier should
successfully reject, for example attempting to use a dynptr after doing
a direct write into it inside the bpf program.
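
One such rejected pattern looks roughly like this (illustrative sketch):

  struct bpf_dynptr ptr;

  bpf_ringbuf_reserve_dynptr(&ringbuf, 64, 0, &ptr);
  *(long *)&ptr = 0;                   /* direct write into the dynptr ... */
  bpf_ringbuf_submit_dynptr(&ptr, 0);  /* ... makes this use invalid */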

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20220523210712.3641569-2-joannelkoong@gmail.com
2 years ago  bpf: Suppress 'passing zero to PTR_ERR' warning
Kumar Kartikeya Dwivedi [Sat, 21 May 2022 13:26:20 +0000 (18:56 +0530)]
bpf: Suppress 'passing zero to PTR_ERR' warning

The kernel test robot complains about passing zero to PTR_ERR on the line
in question; suppress the warning by using PTR_ERR_OR_ZERO.

Fixes: c0a5a21c25f3 ("bpf: Allow storing referenced kptr in map")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220521132620.1976921-1-memxor@gmail.com
2 years ago  bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack
Song Liu [Fri, 20 May 2022 23:57:53 +0000 (16:57 -0700)]
bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack

Introduce bpf_arch_text_invalidate and use it to fill the unused part of
bpf_prog_pack with illegal instructions when a BPF program is freed.
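
The generic interface is a weak hook that architectures can override; its
default is roughly (a sketch, see the patch for the exact form):

  int __weak bpf_arch_text_invalidate(void *dst, size_t len)
  {
          return -ENOTSUPP;
  }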

Fixes: 57631054fae6 ("bpf: Introduce bpf_prog_pack allocator")
Fixes: 33c9805860e5 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220520235758.1858153-4-song@kernel.org
2 years ago  x86/alternative: Introduce text_poke_set
Song Liu [Fri, 20 May 2022 23:57:52 +0000 (16:57 -0700)]
x86/alternative: Introduce text_poke_set

Introduce a memset-like API for text_poke. This will be used to fill the
unused RX memory with illegal instructions.
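
The new interface mirrors memset() on top of the text_poke machinery; its
signature is along the lines of:

  void *text_poke_set(void *addr, int c, size_t len);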

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/bpf/20220520235758.1858153-3-song@kernel.org
2 years ago  bpf: Fill new bpf_prog_pack with illegal instructions
Song Liu [Fri, 20 May 2022 23:57:51 +0000 (16:57 -0700)]
bpf: Fill new bpf_prog_pack with illegal instructions

bpf_prog_pack enables sharing huge pages among multiple BPF programs.
These pages are marked as executable before the JIT engine fills them with
BPF programs. To make these pages safe, fill the whole bpf_prog_pack with
illegal instructions before making it executable.

Fixes: 57631054fae6 ("bpf: Introduce bpf_prog_pack allocator")
Fixes: 33c9805860e5 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220520235758.1858153-2-song@kernel.org
2 years ago  selftests/bpf: Fix spelling mistake: "unpriviliged" -> "unprivileged"
Colin Ian King [Mon, 23 May 2022 11:56:04 +0000 (12:56 +0100)]
selftests/bpf: Fix spelling mistake: "unpriviliged" -> "unprivileged"

There are spelling mistakes in ASSERT messages. Fix these.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220523115604.49942-1-colin.i.king@gmail.com
2 years ago  selftests/bpf: fix btf_dump/btf_dump due to recent clang change
Yonghong Song [Mon, 23 May 2022 15:20:44 +0000 (08:20 -0700)]
selftests/bpf: fix btf_dump/btf_dump due to recent clang change

The latest upstream llvm-project changed its behavior with respect
to qualifiers on function return types ([1]).
This caused the btf_dump/btf_dump selftest to fail.
The following example shows what changed.

  $ cat t.c
  typedef const char * const (* const (* const fn_ptr_arr2_t[5])())(char * (*)(int));
  struct t {
    int a;
    fn_ptr_arr2_t l;
  };
  int foo(struct t *arg) {
    return arg->a;
  }

Compiled with latest upstream llvm15,
  $ clang -O2 -g -target bpf -S -emit-llvm t.c
The related generated debuginfo IR looks like:
  !16 = !DIDerivedType(tag: DW_TAG_typedef, name: "fn_ptr_arr2_t", file: !1, line: 1, baseType: !17)
  !17 = !DICompositeType(tag: DW_TAG_array_type, baseType: !18, size: 320, elements: !32)
  !18 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !19)
  !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
  !20 = !DISubroutineType(types: !21)
  !21 = !{!22, null}
  !22 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !23, size: 64)
  !23 = !DISubroutineType(types: !24)
  !24 = !{!25, !28}
  !25 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !26, size: 64)
  !26 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !27)
  !27 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
You can see that two intermediate const qualifiers on pointers are dropped in the debuginfo IR.

With llvm14, we have the following debuginfo IR:
  !16 = !DIDerivedType(tag: DW_TAG_typedef, name: "fn_ptr_arr2_t", file: !1, line: 1, baseType: !17)
  !17 = !DICompositeType(tag: DW_TAG_array_type, baseType: !18, size: 320, elements: !34)
  !18 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !19)
  !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
  !20 = !DISubroutineType(types: !21)
  !21 = !{!22, null}
  !22 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !23)
  !23 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !24, size: 64)
  !24 = !DISubroutineType(types: !25)
  !25 = !{!26, !30}
  !26 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !27)
  !27 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !28, size: 64)
  !28 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !29)
  !29 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
All const qualifiers are preserved.

To adapt the selftest to both old and new llvm, this patch removes
the intermediate const qualifiers in const-to-ptr types, making the
test succeed again.

  [1] https://reviews.llvm.org/D125919

Reported-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220523152044.3905809-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  s390/bpf: Fix typo in comment
Julia Lawall [Sat, 21 May 2022 11:11:34 +0000 (13:11 +0200)]
s390/bpf: Fix typo in comment

Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/20220521111145.81697-84-Julia.Lawall@inria.fr
2 years ago  libbpf: Fix typo in comment
Julia Lawall [Sat, 21 May 2022 11:11:21 +0000 (13:11 +0200)]
libbpf: Fix typo in comment

Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220521111145.81697-71-Julia.Lawall@inria.fr
2 years ago  MAINTAINERS: Add maintainer to AF_XDP
Magnus Karlsson [Mon, 23 May 2022 08:32:54 +0000 (10:32 +0200)]
MAINTAINERS: Add maintainer to AF_XDP

Maciej Fijalkowski has gracefully accepted to become the third
maintainer for the AF_XDP code. Thank you Maciej!

Signed-off-by: Magnus Karlsson <magnus.karlsson@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20220523083254.32285-1-magnus.karlsson@gmail.com
2 years ago  Merge branch 'bpf: refine kernel.unprivileged_bpf_disabled behaviour'
Alexei Starovoitov [Sat, 21 May 2022 02:48:29 +0000 (19:48 -0700)]
Merge branch 'bpf: refine kernel.unprivileged_bpf_disabled behaviour'

Alan Maguire says:

====================

Unprivileged BPF disabled (kernel.unprivileged_bpf_disabled >= 1)
is the default in most cases now; when set, the BPF system call is
blocked for users without CAP_BPF/CAP_SYS_ADMIN.  In some cases
however, it makes sense to split activities between capability-requiring
ones - such as program load/attach - and those that might not require
capabilities such as reading perf/ringbuf events, reading or
updating BPF map configuration etc.  One example of this sort of
approach is a service that loads a BPF program, and a user-space
program that interacts with it.

Here - rather than blocking all BPF syscall commands - unprivileged
BPF disabled blocks the key object-creating commands (prog load,
map load).  Discussion has alluded to this idea in the past [1],
and Alexei mentioned it was also discussed at LSF/MM/BPF this year.

Changes since v3 [2]:
- added acks to patch 1
- CI was failing on Ubuntu; I suspect the issue was an old capability.h
  file which specified CAP_LAST_CAP as < CAP_BPF, leading to the logic
  disabling all caps not disabling CAP_BPF.  Use CAP_BPF as basis for
  "all caps" bitmap instead as we explicitly define it in cap_helpers.h
  if not already found in capabilities.h
- made global variables arguments to subtests instead (Andrii, patch 2)

Changes since v2 [3]:

- added acks from Yonghong
- clang compilation issue in selftest with bpf_prog_query()
  (Alexei, patch 2)
- disable all capabilities for test (Yonghong, patch 2)
- add assertions that size of perf/ringbuf data matches expectations
  (Yonghong, patch 2)
- add map array size definition, remove unneeded whitespace (Yonghong, patch 2)

Changes since RFC [4]:

- widened scope of commands unprivileged BPF disabled allows
  (Alexei, patch 1)
- removed restrictions on map types for lookup, update, delete
  (Alexei, patch 1)
- removed kernel CONFIG parameter controlling unprivileged bpf disabled
  change (Alexei, patch 1)
- widened test scope to cover most BPF syscall commands, with positive
  and negative subtests

[1] https://lore.kernel.org/bpf/CAADnVQLTBhCTAx1a_nev7CgMZxv1Bb7ecz1AFRin8tHmjPREJA@mail.gmail.com/
[2] https://lore.kernel.org/bpf/1652880861-27373-1-git-send-email-alan.maguire@oracle.com/T/
[3] https://lore.kernel.org/bpf/1652788780-25520-1-git-send-email-alan.maguire@oracle.com/T/#t
[4] https://lore.kernel.org/bpf/20220511163604.5kuczj6jx3ec5qv6@MBP-98dd607d3435.dhcp.thefacebook.com/T/#mae65f35a193279e718f37686da636094d69b96ee
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: add tests verifying unprivileged bpf behaviour
Alan Maguire [Thu, 19 May 2022 14:25:34 +0000 (15:25 +0100)]
selftests/bpf: add tests verifying unprivileged bpf behaviour

The tests load/attach a bpf prog with maps, perfbuf and ringbuf, pinning
them.  Then effective caps are dropped and we verify we can

- pick up the pin
- create ringbuf/perfbuf
- get ringbuf/perfbuf events, carry out map update, lookup and delete
- create a link

Negative testing also ensures

- BPF prog load fails
- BPF map create fails
- get fd by id fails
- get next id fails
- query fails
- BTF load fails

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/1652970334-30510-3-git-send-email-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  bpf: refine kernel.unprivileged_bpf_disabled behaviour
Alan Maguire [Thu, 19 May 2022 14:25:33 +0000 (15:25 +0100)]
bpf: refine kernel.unprivileged_bpf_disabled behaviour

With unprivileged BPF disabled, all cmds associated with the BPF syscall
are blocked to users without CAP_BPF/CAP_SYS_ADMIN.  However, there are
use cases where we may wish to allow interactions with BPF programs
without being able to load and attach them.  So for example, a process
with the required capabilities loads/attaches a BPF program, and a process
with fewer capabilities interacts with it: retrieving perf/ring buffer
events, modifying map-specified config, etc.  With all BPF syscall
commands blocked as a result of unprivileged BPF being disabled,
this mode of interaction becomes impossible for processes without
CAP_BPF.

As Alexei notes

"The bpf ACL model is the same as traditional file's ACL.
The creds and ACLs are checked at open().  Then during file's write/read
additional checks might be performed. BPF has such functionality already.
Different map_creates have capability checks while map_lookup has:
map_get_sys_perms(map, f) & FMODE_CAN_READ.
In other words it's enough to gate FD-receiving parts of bpf
with unprivileged_bpf_disabled sysctl.
The rest is handled by availability of FD and access to files in bpffs."

So key fd creation syscall commands BPF_PROG_LOAD and BPF_MAP_CREATE
are blocked with unprivileged BPF disabled and no CAP_BPF.

And as Alexei notes, even when unprivileged BPF is not disabled, map
creation is blocked for map types other than array, hash and ringbuf maps.

Programs responsible for loading and attaching the BPF program
can still control access to its pinned representation by restricting
permissions on the pin path, as with normal files.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/r/1652970334-30510-2-git-send-email-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  bpf: Allow kfunc in tracing and syscall programs.
Benjamin Tissoires [Wed, 18 May 2022 20:59:08 +0000 (22:59 +0200)]
bpf: Allow kfunc in tracing and syscall programs.

Tracing and syscall BPF program types are a very convenient way to add BPF
capabilities to subsystems that are otherwise not BPF capable.
When we add kfunc capabilities to those program types, we can add
BPF features to subsystems without having to touch the BPF core.

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20220518205924.399291-2-benjamin.tissoires@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: Remove filtered subtests from output
Mykola Lysenko [Fri, 20 May 2022 06:13:03 +0000 (23:13 -0700)]
selftests/bpf: Remove filtered subtests from output

Currently filtered subtests show up in the output as skipped.

Before:
$ sudo ./test_progs -t log_fixup/missing_map
 #94 /1     log_fixup/bad_core_relo_trunc_none:SKIP
 #94 /2     log_fixup/bad_core_relo_trunc_partial:SKIP
 #94 /3     log_fixup/bad_core_relo_trunc_full:SKIP
 #94 /4     log_fixup/bad_core_relo_subprog:SKIP
 #94 /5     log_fixup/missing_map:OK
 #94        log_fixup:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

After:
$ sudo ./test_progs -t log_fixup/missing_map
 #94 /5     log_fixup/missing_map:OK
 #94        log_fixup:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220520061303.4004808-1-mykolal@fb.com
2 years ago  selftests/bpf: Fix subtest number formatting in test_progs
Mykola Lysenko [Fri, 20 May 2022 07:01:44 +0000 (00:01 -0700)]
selftests/bpf: Fix subtest number formatting in test_progs

Remove weird spaces around '/' while preserving proper
indentation.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/bpf/20220520070144.10312-1-mykolal@fb.com
2 years ago  selftests/bpf: Add missing trampoline program type to trampoline_count test
Yuntao Wang [Thu, 19 May 2022 15:06:10 +0000 (23:06 +0800)]
selftests/bpf: Add missing trampoline program type to trampoline_count test

Currently the trampoline_count test doesn't include any fmod_ret bpf
programs; fix it to make the test cover all possible trampoline program
types.

Since fmod_ret bpf programs can't be attached to the __set_task_comm
function, as it's neither whitelisted for error injection nor a security
hook, change it to bpf_modify_return_test.

This patch also does some other cleanups such as removing duplicate code,
dropping inconsistent comments, etc.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220519150610.601313-1-ytcoode@gmail.com
2 years ago  Merge branch 'bpf: mptcp: Support for mptcp_sock'
Andrii Nakryiko [Fri, 20 May 2022 22:29:01 +0000 (15:29 -0700)]
Merge branch 'bpf: mptcp: Support for mptcp_sock'

Mat Martineau says:

====================

This patch set adds BPF access to mptcp_sock structures, along with
associated self tests. You may recognize some of the code from earlier
(https://lore.kernel.org/bpf/20200918121046.190240-6-nicolas.rybowski@tessares.net/)
but it has been reworked quite a bit.

v1 -> v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c,
remove build check for CONFIG_BPF_JIT, add selftest check for CONFIG_MPTCP,
and add a patch to include CONFIG_IKCONFIG/CONFIG_IKCONFIG_PROC for the
BPF self tests.

v2 -> v3: Access sysctl through the filesystem to work around CI use of
the more limited busybox sysctl command.

v3 -> v4: Dropped special case kernel code for tcp_sock is_mptcp, use
existing bpf_tcp_helpers.h, and add check for 'ip mptcp monitor' support.

v4 -> v5: Use BPF test skeleton, more consistent use of ASSERT macros,
drop some unnecessary parameters / checks, and use tracing to acquire
MPTCP token.

Geliang Tang (6):
  bpf: add bpf_skc_to_mptcp_sock_proto
  selftests/bpf: Enable CONFIG_IKCONFIG_PROC in config
  selftests/bpf: test bpf_skc_to_mptcp_sock
  selftests/bpf: verify token of struct mptcp_sock
  selftests/bpf: verify ca_name of struct mptcp_sock
  selftests/bpf: verify first of struct mptcp_sock
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2 years ago  selftests/bpf: Verify first of struct mptcp_sock
Geliang Tang [Thu, 19 May 2022 23:30:16 +0000 (16:30 -0700)]
selftests/bpf: Verify first of struct mptcp_sock

This patch verifies the 'first' struct member of struct mptcp_sock, which
points to the first subflow of msk. Save 'sk' in mptcp_storage, and verify
it with 'first' in verify_msk().

v5:
 - Use ASSERT_EQ() instead of a manual comparison + log (Andrii).

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-8-mathew.j.martineau@linux.intel.com
2 years ago  selftests/bpf: Verify ca_name of struct mptcp_sock
Geliang Tang [Thu, 19 May 2022 23:30:15 +0000 (16:30 -0700)]
selftests/bpf: Verify ca_name of struct mptcp_sock

This patch verifies another member of struct mptcp_sock, ca_name. Add a
new function get_msk_ca_name() to read the sysctl tcp_congestion_control
and verify it in verify_msk().

v3: Access the sysctl through the filesystem to avoid compatibility
    issues with the busybox sysctl command.

v4: use ASSERT_* instead of CHECK_FAIL (Andrii)

v5: use ASSERT_STRNEQ() instead of strncmp() (Andrii)

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-7-mathew.j.martineau@linux.intel.com
2 years ago  selftests/bpf: Verify token of struct mptcp_sock
Geliang Tang [Thu, 19 May 2022 23:30:14 +0000 (16:30 -0700)]
selftests/bpf: Verify token of struct mptcp_sock

This patch verifies the struct member token of struct mptcp_sock. Add a
new member token in struct mptcp_storage to store the token value of the
msk socket got by bpf_skc_to_mptcp_sock(). Trace the kernel function
mptcp_pm_new_connection() by using bpf fentry prog to obtain the msk token
and save it in a global bpf variable. Pass the variable to verify_msk() to
verify it with the token saved in socket_storage_map.

v4:
 - use ASSERT_* instead of CHECK_FAIL (Andrii)
 - skip the test if 'ip mptcp monitor' is not supported (Mat)

v5:
 - Drop 'ip mptcp monitor', trace mptcp_pm_new_connection instead (Martin)
 - Use ASSERT_EQ (Andrii)

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-6-mathew.j.martineau@linux.intel.com
2 years ago  selftests/bpf: Test bpf_skc_to_mptcp_sock
Geliang Tang [Thu, 19 May 2022 23:30:13 +0000 (16:30 -0700)]
selftests/bpf: Test bpf_skc_to_mptcp_sock

This patch extends the MPTCP test base, to test the new helper
bpf_skc_to_mptcp_sock().

Define struct mptcp_sock in bpf_tcp_helpers.h, use bpf_skc_to_mptcp_sock
to get the msk socket in progs/mptcp_sock.c and store the infos in
socket_storage_map.

Get the infos from socket_storage_map in prog_tests/mptcp.c. Add a new
function verify_msk() to verify the infos of MPTCP socket, and rename
verify_sk() to verify_tsk() to verify TCP socket only.

v2: Add CONFIG_MPTCP check for clearer error messages

v4:
 - use ASSERT_* instead of CHECK_FAIL (Andrii)
 - drop bpf_mptcp_helpers.h (Andrii)

v5:
 - some 'ASSERT_*' were replaced in the next commit by mistake.
 - Drop CONFIG_MPTCP (Martin)
 - Use ASSERT_EQ (Andrii)

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-5-mathew.j.martineau@linux.intel.com
2 years ago  selftests/bpf: Add MPTCP test base
Nicolas Rybowski [Thu, 19 May 2022 23:30:12 +0000 (16:30 -0700)]
selftests/bpf: Add MPTCP test base

This patch adds a base for MPTCP specific tests.

It is currently limited to the is_mptcp field in case of plain TCP
connection because there is no easy way to get the subflow sk from a msk
in userspace. This implies that we cannot lookup the sk_storage attached
to the subflow sk in the sockops program.

v4:
 - add copyright 2022 (Andrii)
 - use ASSERT_* instead of CHECK_FAIL (Andrii)
 - drop SEC("version") (Andrii)
 - use is_mptcp in tcp_sock, instead of bpf_tcp_sock (Martin & Andrii)

v5:
 - Drop connect_to_mptcp_fd (Martin)
 - Use BPF test skeleton (Andrii)
 - Use ASSERT_EQ (Andrii)
 - Drop the 'msg' parameter of verify_sk

Co-developed-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/bpf/20220519233016.105670-4-mathew.j.martineau@linux.intel.com
2 years ago  selftests/bpf: Enable CONFIG_IKCONFIG_PROC in config
Geliang Tang [Thu, 19 May 2022 23:30:11 +0000 (16:30 -0700)]
selftests/bpf: Enable CONFIG_IKCONFIG_PROC in config

CONFIG_IKCONFIG_PROC is required by the BPF selftests; otherwise we get
errors like this:

 libbpf: failed to open system Kconfig
 libbpf: failed to load object 'kprobe_multi'
 libbpf: failed to load BPF skeleton 'kprobe_multi': -22

It's because /proc/config.gz is opened in bpf_object__read_kconfig_file()
in tools/lib/bpf/libbpf.c:

        file = gzopen("/proc/config.gz", "r");

So this patch enables CONFIG_IKCONFIG and CONFIG_IKCONFIG_PROC in
tools/testing/selftests/bpf/config.

Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220519233016.105670-3-mathew.j.martineau@linux.intel.com
2 years ago  bpf: Add bpf_skc_to_mptcp_sock_proto
Geliang Tang [Thu, 19 May 2022 23:30:10 +0000 (16:30 -0700)]
bpf: Add bpf_skc_to_mptcp_sock_proto

This patch implements a new struct bpf_func_proto, named
bpf_skc_to_mptcp_sock_proto. Define a new bpf_id BTF_SOCK_TYPE_MPTCP,
and a new helper bpf_skc_to_mptcp_sock(), which invokes another new
helper bpf_mptcp_sock_from_subflow() in net/mptcp/bpf.c to get struct
mptcp_sock from a given subflow socket.
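
Illustrative usage from a sockops-style program (a sketch, not taken from
this patch):

  struct bpf_sock *sk = skops->sk;
  struct mptcp_sock *msk;

  msk = bpf_skc_to_mptcp_sock(sk);
  if (!msk)
          return 1;        /* not an MPTCP subflow socket */
  /* fields such as msk->token or msk->ca_name can now be read via BTF */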

v2: Emit BTF type, add func_id checks in verifier.c and bpf_trace.c,
remove build check for CONFIG_BPF_JIT
v5: Drop EXPORT_SYMBOL (Martin)

Co-developed-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220519233016.105670-2-mathew.j.martineau@linux.intel.com
2 years ago  selftests/bpf: Fix some bugs in map_lookup_percpu_elem testcase
Feng Zhou [Wed, 18 May 2022 02:50:53 +0000 (10:50 +0800)]
selftests/bpf: Fix some bugs in map_lookup_percpu_elem testcase

Address comments from Andrii Nakryiko; details here:
https://lore.kernel.org/lkml/20220511093854.411-1-zhoufeng.zf@bytedance.com/T/

- use /* */ instead of //
- use libbpf_num_possible_cpus() instead of sysconf(_SC_NPROCESSORS_ONLN)
- use 8 bytes for value size
- fix memory leak
- use ASSERT_EQ instead of ASSERT_OK
- add bpf_loop to fetch values on each possible CPU

Fixes: ed7c13776e20 ("selftests/bpf: add test case for bpf_map_lookup_percpu_elem")
Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220518025053.20492-1-zhoufeng.zf@bytedance.com
2 years ago  Merge branch 'Start libbpf 1.0 dev cycle'
Alexei Starovoitov [Thu, 19 May 2022 16:03:31 +0000 (09:03 -0700)]
Merge branch 'Start libbpf 1.0 dev cycle'

Andrii Nakryiko says:

====================

Start preparations for the libbpf 1.0 release and, as a first test, remove
the bpf_create_map*() APIs.
====================

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  libbpf: remove bpf_create_map*() APIs
Andrii Nakryiko [Wed, 18 May 2022 18:59:15 +0000 (11:59 -0700)]
libbpf: remove bpf_create_map*() APIs

To test API removal, get rid of the bpf_create_map*() APIs. Perf defines
a __weak implementation of bpf_map_create() that redirects to the old
bpf_create_map(), and that seems to compile and run fine.
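
For reference, a minimal migration from the removed API looks roughly like
this (key/value types are illustrative):

  /* old, now removed */
  fd = bpf_create_map(BPF_MAP_TYPE_HASH, sizeof(__u32), sizeof(__u64), 100, 0);

  /* replacement */
  fd = bpf_map_create(BPF_MAP_TYPE_HASH, NULL /* name */,
                      sizeof(__u32), sizeof(__u64), 100, NULL /* opts */);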

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  libbpf: start 1.0 development cycle
Andrii Nakryiko [Wed, 18 May 2022 18:59:14 +0000 (11:59 -0700)]
libbpf: start 1.0 development cycle

Start libbpf 1.0 development cycle by adding LIBBPF_1.0.0 section to
libbpf.map file and marking all current symbols as local. As we remove
all the deprecated APIs we'll populate global list before the final 1.0
release.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  libbpf: fix up global symbol counting logic
Andrii Nakryiko [Wed, 18 May 2022 18:59:13 +0000 (11:59 -0700)]
libbpf: fix up global symbol counting logic

Add the same negative ABS filter that we use in VERSIONED_SYM_COUNT to
filter out ABS symbols like LIBBPF_0.8.0.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220518185915.3529475-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: Add missed ima_setup.sh in Makefile
Hangbin Liu [Mon, 16 May 2022 04:00:20 +0000 (12:00 +0800)]
selftests/bpf: Add missed ima_setup.sh in Makefile

When building the bpf tests and installing them to another folder, e.g.

  make -j10 install -C tools/testing/selftests/ TARGETS="bpf" \
SKIP_TARGETS="" INSTALL_PATH=/tmp/kselftests

ima_setup.sh is missing from the target folder, which makes test_ima fail.

Fix it by adding ima_setup.sh to TEST_PROGS_EXTENDED.

Fixes: 34b82d3ac105 ("bpf: Add a selftest for bpf_ima_inode_hash")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220516040020.653291-1-liuhangbin@gmail.com
2 years ago  selftests/bpf: Fix building bpf selftests statically
Yosry Ahmed [Sat, 14 May 2022 00:21:15 +0000 (00:21 +0000)]
selftests/bpf: Fix building bpf selftests statically

bpf selftests can no longer be built with CFLAGS=-static with
liburandom_read.so and its dependent target.

Filter out -static for liburandom_read.so and its dependent target.

When building statically, this leaves urandom_read relying on
system-wide shared libraries.

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220514002115.1376033-1-yosryahmed@google.com
2 years ago  libbpf: fix memory leak in attach_tp for target-less tracepoint program
Andrii Nakryiko [Mon, 16 May 2022 18:45:47 +0000 (11:45 -0700)]
libbpf: fix memory leak in attach_tp for target-less tracepoint program

Fix sec_name memory leak if user defines target-less SEC("tp").

Fixes: 9af8efc45eb1 ("libbpf: Allow "incomplete" basic tracing SEC() definitions")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20220516184547.3204674-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  bpftool: Use sysfs vmlinux when dumping BTF by ID
Larysa Zaremba [Fri, 13 May 2022 12:17:43 +0000 (14:17 +0200)]
bpftool: Use sysfs vmlinux when dumping BTF by ID

Currently, dumping almost all BTFs specified by id requires
using the -B option to pass the base BTF. For kernel module
BTFs, the vmlinux BTF sysfs path should work as the base.

This patch simplifies dump-by-id usage by loading the
vmlinux BTF from sysfs as the base when no base BTF was specified
and the ID corresponds to a kernel module BTF.
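
For example (the ID below is a placeholder for a kernel module BTF id):

  # before: the base BTF had to be passed explicitly
  $ bpftool btf dump id <module_btf_id> -B /sys/kernel/btf/vmlinux

  # after: the vmlinux BTF is picked up from sysfs automatically
  $ bpftool btf dump id <module_btf_id>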

Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Link: https://lore.kernel.org/bpf/20220513121743.12411-1-larysa.zaremba@intel.com
2 years ago  bpf: Add MEM_UNINIT as a bpf_type_flag
Joanne Koong [Mon, 9 May 2022 22:42:52 +0000 (15:42 -0700)]
bpf: Add MEM_UNINIT as a bpf_type_flag

Instead of having uninitialized versions of arguments as separate
bpf_arg_types (eg ARG_PTR_TO_UNINIT_MEM as the uninitialized version
of ARG_PTR_TO_MEM), we can instead use MEM_UNINIT as a bpf_type_flag
modifier to denote that the argument is uninitialized.

Doing so cleans up some of the logic in the verifier. We no longer
need to do two checks against an argument type (eg "if
(base_type(arg_type) == ARG_PTR_TO_MEM || base_type(arg_type) ==
ARG_PTR_TO_UNINIT_MEM)"), since uninitialized and initialized
versions of the same argument type will now share the same base type.

In the near future, MEM_UNINIT will be used by dynptr helper functions
as well.
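
For example, an argument that used to be declared as ARG_PTR_TO_UNINIT_MEM
in a helper proto can now be expressed as (sketch):

  .arg1_type = ARG_PTR_TO_MEM | MEM_UNINIT,
  .arg2_type = ARG_CONST_SIZE_OR_ZERO,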

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20220509224257.3222614-2-joannelkoong@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: Fix usdt_400 test case
Andrii Nakryiko [Fri, 13 May 2022 17:37:03 +0000 (10:37 -0700)]
selftests/bpf: Fix usdt_400 test case

The usdt_400 test case relies on the compiler using the same arg spec for
the usdt_400 USDT. This assumption breaks with Clang (Clang generates
different arg specs with varying offsets relative to %rbp), so simplify
this further and hard-code the constant, which guarantees that the arg
spec is the same across all 400 inlinings.

Fixes: 630301b0d59d ("selftests/bpf: Add basic USDT selftests")
Reported-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220513173703.89271-1-andrii@kernel.org
2 years ago  selftests/bpf: Convert some selftests to high-level BPF map APIs
Andrii Nakryiko [Thu, 12 May 2022 22:07:13 +0000 (15:07 -0700)]
selftests/bpf: Convert some selftests to high-level BPF map APIs

Convert a bunch of selftests to using the newly added high-level BPF map
APIs.

This change exposed that the map_kptr selftest allocated too big a buffer,
which is fixed in this patch as well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220512220713.2617964-2-andrii@kernel.org
2 years ago  libbpf: Add safer high-level wrappers for map operations
Andrii Nakryiko [Thu, 12 May 2022 22:07:12 +0000 (15:07 -0700)]
libbpf: Add safer high-level wrappers for map operations

Add high-level API wrappers for the most common and typical BPF map
operations that work directly on instances of struct bpf_map * (so
you don't have to call bpf_map__fd()) and validate key/value size
expectations.

These helpers require users to specify key (and value, where
appropriate) sizes when performing lookup/update/delete/etc. This forces
users to actually think about and validate those sizes themselves. This is
a good thing, as the kernel expects the user to implicitly provide correct
key/value buffer sizes and will just read/write the necessary amount of
data. If it so happens that the user doesn't set up the buffers correctly
(which has bitten people with per-CPU maps especially), the kernel either
randomly overwrites stack data or returns -EFAULT, depending on the user's
luck and circumstances. These high-level APIs are meant to prevent such
unpleasant and hard to debug bugs.

This patch also adds bpf_map_delete_elem_flags() low-level API and
requires passing flags to bpf_map__delete_elem() API for consistency
across all similar APIs, even though currently kernel doesn't expect
any extra flags for BPF_MAP_DELETE_ELEM operation.

List of map operations that get these high-level APIs:

  - bpf_map_lookup_elem;
  - bpf_map_update_elem;
  - bpf_map_delete_elem;
  - bpf_map_lookup_and_delete_elem;
  - bpf_map_get_next_key.
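
A usage sketch (the skeleton and map names are illustrative):

  __u32 key = 0;
  __u64 value;
  int err;

  /* key/value sizes are validated against the map definition */
  err = bpf_map__lookup_elem(skel->maps.my_map, &key, sizeof(key),
                             &value, sizeof(value), 0);
  if (err)
          /* e.g. an error is returned on a size mismatch instead of
           * silently corrupting memory
           */
          return err;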

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220512220713.2617964-1-andrii@kernel.org
2 years ago  selftests/bpf: Check combination of jit blinding and pointers to bpf subprogs.
Alexei Starovoitov [Fri, 13 May 2022 01:10:25 +0000 (18:10 -0700)]
selftests/bpf: Check combination of jit blinding and pointers to bpf subprogs.

Check that ld_imm64 with src_reg=1 (aka BPF_PSEUDO_FUNC) works
with jit_blinding.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220513011025.13344-2-alexei.starovoitov@gmail.com
2 years ago  bpf: Fix combination of jit blinding and pointers to bpf subprogs.
Alexei Starovoitov [Fri, 13 May 2022 01:10:24 +0000 (18:10 -0700)]
bpf: Fix combination of jit blinding and pointers to bpf subprogs.

The combination of jit blinding and pointers to bpf subprogs causes:
[   36.989548] BUG: unable to handle page fault for address: 0000000100000001
[   36.990342] #PF: supervisor instruction fetch in kernel mode
[   36.990968] #PF: error_code(0x0010) - not-present page
[   36.994859] RIP: 0010:0x100000001
[   36.995209] Code: Unable to access opcode bytes at RIP 0xffffffd7.
[   37.004091] Call Trace:
[   37.004351]  <TASK>
[   37.004576]  ? bpf_loop+0x4d/0x70
[   37.004932]  ? bpf_prog_3899083f75e4c5de_F+0xe3/0x13b

The jit blinding logic didn't recognize that ld_imm64 with the address
of a bpf subprogram is a special instruction and proceeded to randomize it.
By itself it wouldn't have been an issue, but the jit_subprogs() logic
relies on a two-step process to JIT all subprogs and then JIT them
again when the addresses of all subprogs are known.
The blinding process in the first JIT phase caused the second JIT to miss
the adjustment of the special ld_imm64.

Fix this issue by ignoring special ld_imm64 instructions that don't have
user controlled constants and shouldn't be blinded.

Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220513011025.13344-1-alexei.starovoitov@gmail.com
2 years ago  bpf: Fix potential array overflow in bpf_trampoline_get_progs()
Yuntao Wang [Sat, 30 Apr 2022 13:08:03 +0000 (21:08 +0800)]
bpf: Fix potential array overflow in bpf_trampoline_get_progs()

The cnt value in the 'cnt >= BPF_MAX_TRAMP_PROGS' check does not
include BPF_TRAMP_MODIFY_RETURN bpf programs, so the number of
the attached BPF_TRAMP_MODIFY_RETURN bpf programs in a trampoline
can exceed BPF_MAX_TRAMP_PROGS.

When this happens, the assignment '*progs++ = aux->prog' in
bpf_trampoline_get_progs() will cause progs array overflow as the
progs field in the bpf_tramp_progs struct can only hold at most
BPF_MAX_TRAMP_PROGS bpf programs.

Fixes: 88fd9e5352fe ("bpf: Refactor trampoline update code")
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Link: https://lore.kernel.org/r/20220430130803.210624-1-ytcoode@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: make fexit_stress test run in serial mode
Andrii Nakryiko [Wed, 11 May 2022 23:20:12 +0000 (16:20 -0700)]
selftests/bpf: make fexit_stress test run in serial mode

fexit_stress attaches the maximum allowed number of fexit programs to
the bpf_fentry_test1 kernel function, which is used by a bunch of other
parallel tests, thus pretty frequently interfering with their execution.

Given the test assumes nothing else is attaching to bpf_fentry_test1,
mark it serial.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220511232012.609370-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  Merge branch 'Introduce access remote cpu elem support in BPF percpu map'
Alexei Starovoitov [Thu, 12 May 2022 01:16:55 +0000 (18:16 -0700)]
Merge branch 'Introduce access remote cpu elem support in BPF percpu map'

Feng zhou says:

====================

From: Feng Zhou <zhoufeng.zf@bytedance.com>

Tracing some functions, such as enqueue_task_fair, requires accessing the
element of a specific cpu rather than the current one, and
bpf_map_lookup_elem on a percpu map cannot do that. So add
bpf_map_lookup_percpu_elem to accomplish this for percpu_array_map,
percpu_hash_map and lru_percpu_hash_map.

v1->v2: Addressed comments from Alexei Starovoitov.
- add a selftest for bpf_map_lookup_percpu_elem.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: add test case for bpf_map_lookup_percpu_elem
Feng Zhou [Wed, 11 May 2022 09:38:54 +0000 (17:38 +0800)]
selftests/bpf: add test case for bpf_map_lookup_percpu_elem

test_progs:
Tests new ebpf helpers bpf_map_lookup_percpu_elem.

Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Link: https://lore.kernel.org/r/20220511093854.411-3-zhoufeng.zf@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  bpf: add bpf_map_lookup_percpu_elem for percpu map
Feng Zhou [Wed, 11 May 2022 09:38:53 +0000 (17:38 +0800)]
bpf: add bpf_map_lookup_percpu_elem for percpu map

Add a new ebpf helper, bpf_map_lookup_percpu_elem.

The implementation is relatively simple: it follows the implementation of
map_lookup_elem for percpu maps, adds a cpu parameter, and looks up the
value for the specified cpu.
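
A hedged sketch of its use from a BPF program (map, type and variable
names are illustrative):

  __u32 key = 0;
  struct my_value *v;

  v = bpf_map_lookup_percpu_elem(&percpu_array, &key, target_cpu);
  if (!v)
          return 0;   /* invalid cpu or element not found */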

Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Link: https://lore.kernel.org/r/20220511093854.411-2-zhoufeng.zf@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  Merge branch 'Follow ups for kptr series'
Alexei Starovoitov [Wed, 11 May 2022 23:57:27 +0000 (16:57 -0700)]
Merge branch 'Follow ups for kptr series'

Kumar Kartikeya Dwivedi says:

====================

Fix a build time warning, and address comments from Alexei on the merged
version [0].

  [0]: https://lore.kernel.org/bpf/20220424214901.2743946-1-memxor@gmail.com

Changelog:
----------
v1 -> v2
v1: https://lore.kernel.org/bpf/20220510211727.575686-1-memxor@gmail.com

 * Add Fixes tag to patch 1
 * Fix test_progs-noalu32 failure in CI due to different alloc_insn (Alexei)
 * Remove per-CPU struct, use global struct (Alexei)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: Add tests for kptr_ref refcounting
Kumar Kartikeya Dwivedi [Wed, 11 May 2022 19:46:54 +0000 (01:16 +0530)]
selftests/bpf: Add tests for kptr_ref refcounting

Check at runtime how various operations for kptr_ref affect its refcount
and verify against the actual count.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220511194654.765705-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: Add negative C tests for kptrs
Kumar Kartikeya Dwivedi [Wed, 11 May 2022 19:46:53 +0000 (01:16 +0530)]
selftests/bpf: Add negative C tests for kptrs

This uses the newly added SEC("?foo") naming to disable autoload of
programs, and then loads them one by one for the object and verifies
that loading fails and that the error string returned by the verifier
matches the expectation. This is similar to the already existing verifier
tests but provides coverage for BPF C.
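
The pattern is roughly as follows (program and skeleton names are
illustrative):

  /* BPF side: the '?' prefix marks the program as not auto-loaded */
  SEC("?tc")
  int reject_bad_kptr_use(struct __sk_buff *ctx)
  {
          return 0;
  }

  /* userspace side: opt one program back in and expect the load to fail */
  bpf_program__set_autoload(skel->progs.reject_bad_kptr_use, true);
  err = map_kptr_fail__load(skel);  /* illustrative skeleton name */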

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220511194654.765705-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  bpf: Prepare prog_test_struct kfuncs for runtime tests
Kumar Kartikeya Dwivedi [Wed, 11 May 2022 19:46:52 +0000 (01:16 +0530)]
bpf: Prepare prog_test_struct kfuncs for runtime tests

In an effort to actually test the refcounting logic at runtime, add a
refcount_t member to prog_test_ref_kfunc and use it in selftests to
verify and test the whole logic more exhaustively.

The kfunc calls for prog_test_member do not require runtime refcounting,
as they are only used for verifier selftests, not during runtime
execution. Hence, their implementation now has a WARN_ON_ONCE as it is
not meant to be reachable code at runtime. It is strictly used in tests
triggering failure cases in the verifier. bpf_kfunc_call_memb_release is
called from map free path, since prog_test_member is embedded in map
value for some verifier tests, so we skip WARN_ON_ONCE for it.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220511194654.765705-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  bpf: Fix sparse warning for bpf_kptr_xchg_proto
Kumar Kartikeya Dwivedi [Wed, 11 May 2022 19:46:51 +0000 (01:16 +0530)]
bpf: Fix sparse warning for bpf_kptr_xchg_proto

Kernel Test Robot complained about missing static storage class
annotation for bpf_kptr_xchg_proto variable.

sparse: symbol 'bpf_kptr_xchg_proto' was not declared. Should it be static?

This is caused by a missing extern declaration in the header. Add it to
suppress the sparse warning.

Fixes: c0a5a21c25f3 ("bpf: Allow storing referenced kptr in map")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220511194654.765705-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: fix a few clang compilation errors
Yonghong Song [Wed, 11 May 2022 18:47:35 +0000 (11:47 -0700)]
selftests/bpf: fix a few clang compilation errors

With latest clang, I got the following compilation errors:
  .../prog_tests/test_tunnel.c:291:6: error: variable 'local_ip_map_fd' is used uninitialized
     whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
       if (attach_tc_prog(&tc_hook, -1, set_dst_prog_fd))
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  .../bpf/prog_tests/test_tunnel.c:312:6: note: uninitialized use occurs here
        if (local_ip_map_fd >= 0)
            ^~~~~~~~~~~~~~~
  ...
  .../prog_tests/kprobe_multi_test.c:346:6: error: variable 'err' is used uninitialized
      whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (IS_ERR(map))
            ^~~~~~~~~~~
  .../prog_tests/kprobe_multi_test.c:388:6: note: uninitialized use occurs here
        if (err) {
            ^~~

This patch fixed the above compilation errors.

Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20220511184735.3670214-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests/bpf: Enable CONFIG_FPROBE for self tests
Daniel Müller [Wed, 11 May 2022 17:22:49 +0000 (17:22 +0000)]
selftests/bpf: Enable CONFIG_FPROBE for self tests

Some of the BPF selftests are failing when running with a rather bare
bones configuration based on tools/testing/selftests/bpf/config.
Specifically, we see a bunch of failures due to errno 95:

  > test_attach_api:PASS:fentry_raw_skel_load 0 nsec
  > libbpf: prog 'test_kprobe_manual': failed to attach: Operation not supported
  > test_attach_api:FAIL:bpf_program__attach_kprobe_multi_opts unexpected error: -95
  > 79 /6     kprobe_multi_test/attach_api_syms:FAIL

The cause of these is that CONFIG_FPROBE is missing. With this change we
add this configuration value to the BPF selftests config.

Signed-off-by: Daniel Müller <deso@posteo.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20220511172249.4082510-1-deso@posteo.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  Merge branch 'selftests: xsk: add busy-poll testing plus various fixes'
Alexei Starovoitov [Wed, 11 May 2022 15:03:16 +0000 (08:03 -0700)]
Merge branch 'selftests: xsk: add busy-poll testing plus various fixes'

Magnus Karlsson says:

====================

This patch set adds busy-poll testing to the xsk selftests. It runs
exactly the same tests as with regular softirq processing, but with
busy-poll enabled. I have also included a number of fixes to the
selftests that have been bugging me for a while or was discovered
while implementing the busy-poll support. In summary these are:

* Fix the error reporting of failed tests. Each failed test used to be
  reported as both failed and passed, messing up things.

* Added a summary test printout at the end of the test suite so that
  users do not have to scroll up and look at the result of both the
  softirq run and the busy_poll run.

* Added a timeout to the tests, so that if a test locks up, we report
  a fail and still get to run all the other tests.

* Made the stats test just look and feel like all the other
  tests. Makes the code simpler and the test reporting more
  consistent. These are the 3 last commits.

* Replaced zero length packets with packets of 64 byte length. This is so
  that some of the tests will pass after commit 726e2c5929de84 ("veth:
  Ensure eth header is in skb's linear part").

* Added clean-up of the veth pair when terminating the test run.

* Some smaller clean-ups of unused stuff.

Note, to pass the busy-poll tests, commit 8de8b71b787f ("xsk: Fix
l2fwd for copy mode + busy poll combo") needs to be present. It is
present in bpf but not yet in bpf-next.

Thanks: Magnus
====================

Acked-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests: xsk: make stat tests not spin on getsockopt
Magnus Karlsson [Tue, 10 May 2022 11:56:04 +0000 (13:56 +0200)]
selftests: xsk: make stat tests not spin on getsockopt

Convert the stats tests from spinning on the getsockopt to just checking
getsockopt once when the Rx thread has received all the packets. The
actual completion of receiving the last packet forms a natural point
in time when the receiver is ready to call the getsockopt to check the
stats. In the previous version, we just spun on the getsockopt until
we received the right answer. This could take forever, or we could get
the "correct" answer by sheer luck.

The pacing_on variable can now be dropped since all tests can now
handle pacing properly.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-10-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests: xsk: make the stats tests normal tests
Magnus Karlsson [Tue, 10 May 2022 11:56:03 +0000 (13:56 +0200)]
selftests: xsk: make the stats tests normal tests

Make the stats tests look and feel just like normal tests instead of being
bunched under the umbrella of TEST_STATS. This means we will always
run each of them even if one fails. It also gets rid of some special-case
code.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-9-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests: xsk: introduce validation functions
Magnus Karlsson [Tue, 10 May 2022 11:56:02 +0000 (13:56 +0200)]
selftests: xsk: introduce validation functions

Introduce validation functions that can be optionally called by the Rx
and Tx threads. These are then used to replace the Rx and Tx stats
dispatchers. This is so that, in the next commit, we can make the stats
tests proper normal tests rather than a special case, as they are today.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-8-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests: xsk: cleanup veth pair at ctrl-c
Magnus Karlsson [Tue, 10 May 2022 11:56:01 +0000 (13:56 +0200)]
selftests: xsk: cleanup veth pair at ctrl-c

Remove the veth pair when the tests are aborted by pressing
ctrl-c. Currently in this situation, the veth pair is left on the
system polluting the netdev space.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-7-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests: xsk: add timeout to tests
Magnus Karlsson [Tue, 10 May 2022 11:56:00 +0000 (13:56 +0200)]
selftests: xsk: add timeout to tests

Add a timeout to the tests so that if all packets have not been
received within 3 seconds, the ongoing test fails. This keeps a test from
deadlocking if something is wrong.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-6-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests: xsk: fix reporting of failed tests
Magnus Karlsson [Tue, 10 May 2022 11:55:59 +0000 (13:55 +0200)]
selftests: xsk: fix reporting of failed tests

Fix the reporting of failed tests as it was broken in several
ways. First, a failed test was reported as both failed and passed,
messing up the count. Second, tests were not aborted after a failure
and could generate more "failures", messing up the count even
more. Third, the failure reporting from the application to the shell
script was wrong; it always reported pass. And finally, the handling
of the failures in the launch script was not correct.

Correct all this by propagating the failure up through the function
calls to a calling function that can abort the test. A receiver or
sender thread will mark the new variable in the test spec called fail,
if a test has failed. This is then picked up by the main thread when
everyone else has exited and this is then marked and propagated up to
the calling script.

Also add a summary function in the calling script so that a user
does not have to go through the sub tests to see if something has
failed.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-5-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years ago  selftests: xsk: run all tests for busy-poll
Magnus Karlsson [Tue, 10 May 2022 11:55:58 +0000 (13:55 +0200)]
selftests: xsk: run all tests for busy-poll

Execute all xsk selftests for busy-poll mode too. Previously they were
only run in the standard interrupt-driven softirq mode. Replace the
unused option queue-id with the new option busy-poll.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-4-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests: xsk: do not send zero-length packets
Magnus Karlsson [Tue, 10 May 2022 11:55:57 +0000 (13:55 +0200)]
selftests: xsk: do not send zero-length packets

Do not try to send packets of zero length since they are dropped by
veth after commit 726e2c5929de84 ("veth: Ensure eth header is in skb's
linear part"). Replace these two packets with packets of length 60 so
that they are not dropped.

Also clean up the confusing naming. MIN_PKT_SIZE was really
MIN_ETH_PKT_SIZE, and PKT_SIZE was used both as the minimum Ethernet
size and as the default packet size. Make this consistent by using the
right define in the right place.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-3-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests: xsk: cleanup bash scripts
Magnus Karlsson [Tue, 10 May 2022 11:55:56 +0000 (13:55 +0200)]
selftests: xsk: cleanup bash scripts

Remove the spec-file that is not used any longer from the shell
scripts. Also remove an unused option.

Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20220510115604.8717-2-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Add bpf_program__set_insns function
Jiri Olsa [Tue, 10 May 2022 07:46:57 +0000 (09:46 +0200)]
libbpf: Add bpf_program__set_insns function

Adding bpf_program__set_insns that allows to set new instructions
for a BPF program.
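
For reference, the new API has roughly this shape (a sketch based on the
description above; the authoritative prototype is the one in libbpf.h):

  int bpf_program__set_insns(struct bpf_program *prog,
                             struct bpf_insn *new_insns, size_t new_insn_cnt);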

This is a very advanced libbpf API and users need to know what
they are doing. This should be used from prog_prepare_load_fn
callback only.

The instructions may have been changed once the prog_prepare_load_fn
callback has run, so they are reloaded after calling it.

One of the users of this new API will be perf's internal BPF prologue
generation.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510074659.2557731-2-jolsa@kernel.org
2 years agolibbpf: Clean up ringbuf size adjustment implementation
Andrii Nakryiko [Tue, 10 May 2022 18:51:59 +0000 (11:51 -0700)]
libbpf: Clean up ringbuf size adjustment implementation

Drop unused iteration variable, move overflow prevention check into the
for loop.

Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220510185159.754299-1-andrii@kernel.org
2 years agoMerge branch 'Attach a cookie to a tracing program.'
Andrii Nakryiko [Wed, 11 May 2022 00:47:45 +0000 (17:47 -0700)]
Merge branch 'Attach a cookie to a tracing program.'

Kui-Feng Lee says:

====================

Allow users to attach a 64-bits cookie to a bpf_link of fentry, fexit,
or fmod_ret.

This patchset includes several major changes.

 - Define struct bpf_tramp_links to replace struct bpf_tramp_progs.
   struct bpf_tramp_links collects the bpf_links of a trampoline.

 - Generate a trampoline to call bpf_progs of given bpf_links.

 - Trampolines always set/reset bpf_run_ctx before/after
   calling/leaving a tracing program.

 - Attach a cookie to a bpf_link of fentry/fexit/fmod_ret/lsm.  The
   value will be available when running the associated bpf_prog.

The major differences from v6:

 - bpf_link_create() can create links of BPF_LSM_MAC attach type.

 - Add a test for lsm.

 - Add function proto of bpf_get_attach_cookie() for lsm.

 - Check BPF_LSM_MAC in bpf_prog_has_trampoline().

 - Adapt to the changes of LINK_CREATE made by Andrii.

The major differences from v7:

 - Change stack_size instead of pushing/popping run_ctx.

 - Move cookie to bpf_tramp_link from bpf_tracing_link.

v1: https://lore.kernel.org/all/20220126214809.3868787-1-kuifeng@fb.com/
v2: https://lore.kernel.org/bpf/20220316004231.1103318-1-kuifeng@fb.com/
v3: https://lore.kernel.org/bpf/20220407192552.2343076-1-kuifeng@fb.com/
v4: https://lore.kernel.org/bpf/20220411173429.4139609-1-kuifeng@fb.com/
v5: https://lore.kernel.org/bpf/20220412165555.4146407-1-kuifeng@fb.com/
v6: https://lore.kernel.org/bpf/20220416042940.656344-1-kuifeng@fb.com/
v7: https://lore.kernel.org/bpf/20220508032117.2783209-1-kuifeng@fb.com/
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2 years agoselftest/bpf: The test cases of BPF cookie for fentry/fexit/fmod_ret/lsm.
Kui-Feng Lee [Tue, 10 May 2022 20:59:23 +0000 (13:59 -0700)]
selftest/bpf: The test cases of BPF cookie for fentry/fexit/fmod_ret/lsm.

Make sure BPF cookies are correct for fentry/fexit/fmod_ret/lsm.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-6-kuifeng@fb.com
2 years agolibbpf: Assign cookies to links in libbpf.
Kui-Feng Lee [Tue, 10 May 2022 20:59:22 +0000 (13:59 -0700)]
libbpf: Assign cookies to links in libbpf.

Add a cookie field to the attributes of bpf_link_create().
Add bpf_program__attach_trace_opts() to attach a cookie to a link.
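
A minimal usage sketch (the skeleton, program name and cookie value are
illustrative):

  LIBBPF_OPTS(bpf_trace_opts, opts, .cookie = 0x1234ULL);
  struct bpf_link *link;

  link = bpf_program__attach_trace_opts(skel->progs.fentry_test1, &opts);
  if (!link)
          return -errno;  /* libbpf sets errno on failure */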

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-5-kuifeng@fb.com
2 years agobpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm.
Kui-Feng Lee [Tue, 10 May 2022 20:59:21 +0000 (13:59 -0700)]
bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm.

Pass a cookie along with BPF_LINK_CREATE requests.

Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie.
The cookie of a bpf_tracing_link is available by calling
bpf_get_attach_cookie when running the BPF program of the attached
link.

The cookie value is stored in bpf_tramp_run_ctx by the trampoline of
the link.
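
On the BPF side, the cookie of the attached link can then be read
roughly like this (the attach target and return handling are
illustrative):

  SEC("fentry/bpf_fentry_test1")
  int BPF_PROG(fentry_with_cookie, int a)
  {
          __u64 cookie = bpf_get_attach_cookie(ctx);

          /* use the cookie to tell which link this invocation came from */
          return 0;
  }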

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com
2 years agobpf, x86: Create bpf_tramp_run_ctx on the caller thread's stack
Kui-Feng Lee [Tue, 10 May 2022 20:59:20 +0000 (13:59 -0700)]
bpf, x86: Create bpf_tramp_run_ctx on the caller thread's stack

BPF trampolines create a bpf_tramp_run_ctx (a bpf_run_ctx) on the
caller's stack and set/reset the current bpf_run_ctx before/after
calling a bpf_prog.
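
Conceptually the run context carries the per-link cookie next to the
generic bpf_run_ctx; a sketch (the field names are an approximation,
not the exact layout):

  struct bpf_tramp_run_ctx {
          struct bpf_run_ctx run_ctx;
          u64 bpf_cookie;
          struct bpf_run_ctx *saved_run_ctx;
  };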

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-3-kuifeng@fb.com
2 years agobpf, x86: Generate trampolines from bpf_tramp_links
Kui-Feng Lee [Tue, 10 May 2022 20:59:19 +0000 (13:59 -0700)]
bpf, x86: Generate trampolines from bpf_tramp_links

Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect
struct bpf_tramp_link(s) for a trampoline.  struct bpf_tramp_link
extends bpf_link to act as a linked list node.

arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links that
collects all bpf_tramp_link(s) that a trampoline should call.
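
A sketch of what the container looks like:

  struct bpf_tramp_links {
          /* the exact array bound macro is an assumption */
          struct bpf_tramp_link *links[BPF_MAX_TRAMP_LINKS];
          int nr_links;
  };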

Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links
instead of bpf_tramp_progs.

Signed-off-by: Kui-Feng Lee <kuifeng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com
2 years agoMerge branch 'bpf: Speed up symbol resolving in kprobe multi link'
Alexei Starovoitov [Tue, 10 May 2022 21:42:06 +0000 (14:42 -0700)]
Merge branch 'bpf: Speed up symbol resolving in kprobe multi link'

Jiri Olsa says:

====================

hi,
sending additional fix for symbol resolving in kprobe multi link
requested by Alexei and Andrii [1].

This speeds up bpftrace kprobe attachment, when using pure symbols
(3344 symbols) to attach:

Before:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )

After:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )

v6 changes:
  - rewrote patch 1 changelog and fixed the line length [Christoph]

v5 changes:
  - added acks [Masami]
  - workaround in selftest for RCU warning by filtering out several
    functions to attach

v4 changes:
  - fix compile issue [kernel test robot]
  - added acks [Andrii]

v3 changes:
  - renamed kallsyms_lookup_names to ftrace_lookup_symbols
    and moved it to ftrace.c [Masami]
  - added ack [Andrii]
  - couple small test fixes [Andrii]

v2 changes (first version [2]):
  - removed the 2 seconds check [Alexei]
  - moving/forcing symbols sorting out of kallsyms_lookup_names function [Alexei]
  - skipping one array allocation and copy_from_user [Andrii]
  - several small fixes [Masami,Andrii]
  - build fix [kernel test robot]

thanks,
jirka

[1] https://lore.kernel.org/bpf/CAEf4BzZtQaiUxQ-sm_hH2qKPRaqGHyOfEsW96DxtBHRaKLoL3Q@mail.gmail.com/
[2] https://lore.kernel.org/bpf/20220407125224.310255-1-jolsa@kernel.org/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Add attach bench test
Jiri Olsa [Tue, 10 May 2022 12:26:16 +0000 (14:26 +0200)]
selftests/bpf: Add attach bench test

Adding test that reads all functions from ftrace available_filter_functions
file and attach them all through kprobe_multi API.

It also prints stats info with -v option, like on my setup:

  test_bench_attach: found 48712 functions
  test_bench_attach: attached in   1.069s
  test_bench_attach: detached in   0.373s

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220510122616.2652285-6-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Resolve symbols with ftrace_lookup_symbols for kprobe multi link
Jiri Olsa [Tue, 10 May 2022 12:26:15 +0000 (14:26 +0200)]
bpf: Resolve symbols with ftrace_lookup_symbols for kprobe multi link

Use the ftrace_lookup_symbols function (formerly kallsyms_lookup_names)
to speed up symbol lookup in kprobe multi link attachment, replacing
the current kprobe_multi_resolve_syms function with it.

This speeds up bpftrace kprobe attachment:

Before:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )

After:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220510122616.2652285-5-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agofprobe: Resolve symbols with ftrace_lookup_symbols
Jiri Olsa [Tue, 10 May 2022 12:26:14 +0000 (14:26 +0200)]
fprobe: Resolve symbols with ftrace_lookup_symbols

Using ftrace_lookup_symbols to speed up symbols lookup
in register_fprobe_syms API.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220510122616.2652285-4-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoftrace: Add ftrace_lookup_symbols function
Jiri Olsa [Tue, 10 May 2022 12:26:13 +0000 (14:26 +0200)]
ftrace: Add ftrace_lookup_symbols function

Add the ftrace_lookup_symbols function, which resolves an array of
symbols in a single pass over kallsyms.

The user provides an array of string pointers with a count and a
pointer to a preallocated array for the resolved values.

  int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt,
                            unsigned long *addrs)

It iterates over all kallsyms symbols and tries to look up each one in
the provided symbols array with bsearch. The symbols array therefore
needs to be sorted by name.

We also check that each symbol passes ftrace_location, because this
API will be used for fprobe symbol resolving.
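
A minimal in-kernel usage sketch (the symbol names are illustrative):

  static const char *syms[] = { "ksys_read", "ksys_write" }; /* sorted by name */
  unsigned long addrs[ARRAY_SIZE(syms)];
  int err;

  err = ftrace_lookup_symbols(syms, ARRAY_SIZE(syms), addrs);
  if (err)
          return err;  /* a symbol was missing or failed ftrace_location() */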

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220510122616.2652285-3-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agokallsyms: Make kallsyms_on_each_symbol generally available
Jiri Olsa [Tue, 10 May 2022 12:26:12 +0000 (14:26 +0200)]
kallsyms: Make kallsyms_on_each_symbol generally available

Make kallsyms_on_each_symbol generally available, so it can be used
outside the CONFIG_LIVEPATCH option in the following changes.

Rather than adding another ifdef option, let's make the function
generally available (whenever the CONFIG_KALLSYMS option is defined).
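
For reference, the iterator has roughly this shape (a sketch; the exact
prototype is in include/linux/kallsyms.h):

  int kallsyms_on_each_symbol(int (*fn)(void *, const char *,
                                        struct module *, unsigned long),
                              void *data);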

Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220510122616.2652285-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoMerge branch 'bpf: bpf link iterator'
Alexei Starovoitov [Tue, 10 May 2022 18:20:45 +0000 (11:20 -0700)]
Merge branch 'bpf: bpf link iterator'

Dmitrii Dolgov says:

====================

Bpf links seem to be one of the important structures for which no
iterator is provided. Such an iterator could be useful in cases where
the generic 'task/file' iterator is not suitable or better performance
is needed.

The implementation is mostly copied from prog iterator. This time tests were
executed, although I still had to exclude test_bpf_nf (failed to find BTF info
for global/extern symbol 'bpf_skb_ct_lookup') -- since it's unrelated, I hope
it's a minor issue.

Per a suggestion from the previous discussion, there is a new patch for
converting CHECK to the corresponding ASSERT_* macro. Such a replacement is
done only if the final result would be the same, e.g. CHECKs with
important-looking custom format strings are still in place -- from what I
understand, ASSERT_* doesn't allow specifying such a format.

The third small patch fixes what looks like a copy-paste error in the condition
checking.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Add bpf link iter test
Dmitrii Dolgov [Tue, 10 May 2022 15:52:33 +0000 (17:52 +0200)]
selftests/bpf: Add bpf link iter test

Add a simple test for bpf link iterator

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Link: https://lore.kernel.org/r/20220510155233.9815-5-9erthalion6@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Use ASSERT_* instead of CHECK
Dmitrii Dolgov [Tue, 10 May 2022 15:52:32 +0000 (17:52 +0200)]
selftests/bpf: Use ASSERT_* instead of CHECK

Replace usage of CHECK with a corresponding ASSERT_* macro for bpf_iter
tests. This is only done where the final result is equivalent; no changes
are made where the replacement would mean losing some information, e.g.
from a format string.
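
An example of the kind of conversion done (a hypothetical snippet):

  /* before */
  if (CHECK(err, "link_create", "failed: %d\n", err))
          return;

  /* after */
  if (!ASSERT_OK(err, "link_create"))
          return;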

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Link: https://lore.kernel.org/r/20220510155233.9815-4-9erthalion6@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Fix result check for test_bpf_hash_map
Dmitrii Dolgov [Tue, 10 May 2022 15:52:31 +0000 (17:52 +0200)]
selftests/bpf: Fix result check for test_bpf_hash_map

The original condition looks like a typo; verify the skeleton loading
result instead.

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Link: https://lore.kernel.org/r/20220510155233.9815-3-9erthalion6@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Add bpf_link iterator
Dmitrii Dolgov [Tue, 10 May 2022 15:52:30 +0000 (17:52 +0200)]
bpf: Add bpf_link iterator

Implement bpf_link iterator to traverse links via bpf_seq_file
operations. The changeset is mostly shamelessly copied from
commit a228a64fc1e4 ("bpf: Add bpf_prog iterator")
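
A BPF-side sketch of consuming the iterator (modeled on the existing
bpf_prog iterator; field access and output format are illustrative):

  SEC("iter/bpf_link")
  int dump_bpf_link(struct bpf_iter__bpf_link *ctx)
  {
          struct seq_file *seq = ctx->meta->seq;
          struct bpf_link *link = ctx->link;

          if (!link)
                  return 0;

          BPF_SEQ_PRINTF(seq, "link id: %u\n", link->id);
          return 0;
  }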

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220510155233.9815-2-9erthalion6@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoMerge branch 'Add source ip in bpf tunnel key'
Alexei Starovoitov [Tue, 10 May 2022 17:49:03 +0000 (10:49 -0700)]
Merge branch 'Add source ip in bpf tunnel key'

Kaixi Fan says:

====================
From: Kaixi Fan <fankaixi.li@bytedance.com>

Currently, bpf code cannot set the tunnel source ip address of an ip
tunnel, so it cannot fully support flow-based tunnel mode, which sets
the tunnel source address, destination address and tunnel key
simultaneously.

Flow-based tunnels are useful for overlay networks, and by configuring
the tunnel source ip address, users can make their networks more
flexible. For example, the tunnel source ip can be used to select a
different egress nic interface for different flows with the same tunnel
destination ip. As another example, the user can choose one of multiple
ip addresses of the egress nic interface as the packet's tunnel source ip.

Add tunnel and tunnel source test cases in test_progs. Other types of
tunnel test cases will be moved to test_progs step by step in the
future.

v6:
- use libbpf api to attach tc progs and remove some shell commands to reduce
  test runtime based on Alexei Starovoitov's suggestion

v5:
- fix some code format errors
- use bpf kernel code at namespace at_ns0 to set tunnel metadata

v4:
- fix subject error of first patch

v3:
- move vxlan tunnel testcases to test_progs
- replace bpf_trace_printk with bpf_printk
- rename bpf kernel prog section name to tic

v2:
- merge vxlan tunnel and tunnel source ip testcases in test_tunnel.sh
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Replace bpf_trace_printk in tunnel kernel code
Kaixi Fan [Sat, 30 Apr 2022 07:48:44 +0000 (15:48 +0800)]
selftests/bpf: Replace bpf_trace_printk in tunnel kernel code

Replace bpf_trace_printk with bpf_printk in test_tunnel_kern.c. The
bpf_printk function is easier to use and more useful than
bpf_trace_printk.

Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com>
Link: https://lore.kernel.org/r/20220430074844.69214-4-fankaixi.li@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Move vxlan tunnel testcases to test_progs
Kaixi Fan [Sat, 30 Apr 2022 07:48:43 +0000 (15:48 +0800)]
selftests/bpf: Move vxlan tunnel testcases to test_progs

Move the vxlan tunnel test cases from test_tunnel.sh to test_progs,
and also add vxlan tunnel source test cases. Other tunnel test cases
will be moved to test_progs step by step in the future.
Rename the bpf program section name to SEC("tc") because the test_progs
bpf loader cannot load sections with names like SEC("gre_set_tunnel").
Because of this, add bpftool to load the bpf programs in test_tunnel.sh.

Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com>
Link: https://lore.kernel.org/r/20220430074844.69214-3-fankaixi.li@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Add source ip in "struct bpf_tunnel_key"
Kaixi Fan [Sat, 30 Apr 2022 07:48:42 +0000 (15:48 +0800)]
bpf: Add source ip in "struct bpf_tunnel_key"

Add tunnel source ip field in "struct bpf_tunnel_key". Add related code
to set and get tunnel source field.
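
A BPF-side sketch of setting the tunnel source (the field name
local_ipv4 and the addresses are assumptions for illustration):

  struct bpf_tunnel_key key = {};
  int ret;

  key.remote_ipv4 = 0xac100164;  /* 172.16.1.100, tunnel destination */
  key.local_ipv4  = 0xac100101;  /* 172.16.1.1, the new tunnel source */
  key.tunnel_id = 2;
  key.tunnel_ttl = 64;

  ret = bpf_skb_set_tunnel_key(skb, &key, sizeof(key), BPF_F_ZERO_CSUM_TX);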

Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com>
Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpftool: bpf_link_get_from_fd support for LSM programs in lskel
KP Singh [Mon, 9 May 2022 21:49:05 +0000 (21:49 +0000)]
bpftool: bpf_link_get_from_fd support for LSM programs in lskel

bpf_link_get_from_fd currently returns a NULL fd for LSM programs.
LSM programs are similar to tracing programs and can also use
skel_raw_tracepoint_open.

Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220509214905.3754984-1-kpsingh@kernel.org
2 years agoselftests/bpf: Handle batch operations for map-in-map bpf-maps
Takshak Chahande [Tue, 10 May 2022 08:22:21 +0000 (01:22 -0700)]
selftests/bpf: Handle batch operations for map-in-map bpf-maps

This patch adds up test cases that handles 4 combinations:
 a) outer map: BPF_MAP_TYPE_ARRAY_OF_MAPS
    inner maps: BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_HASH
 b) outer map: BPF_MAP_TYPE_HASH_OF_MAPS
    inner maps: BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_HASH

Signed-off-by: Takshak Chahande <ctakshak@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220510082221.2390540-2-ctakshak@fb.com
2 years agobpf: Extend batch operations for map-in-map bpf-maps
Takshak Chahande [Tue, 10 May 2022 08:22:20 +0000 (01:22 -0700)]
bpf: Extend batch operations for map-in-map bpf-maps

This patch extends batch operations support for map-in-map map-types:
BPF_MAP_TYPE_HASH_OF_MAPS and BPF_MAP_TYPE_ARRAY_OF_MAPS

A use case where an outer HASH map holds hundreds of VIP entries, with
the associated reuse-ports per VIP stored in a REUSEPORT_SOCKARRAY type
inner map, needs batch operations for performance gains.

This patch leverages the existing generic functions for most of the batch
operations. As a map-in-map's value contains the actual reference of the
inner map, an extra step is needed for the BPF_MAP_TYPE_HASH_OF_MAPS type
to fetch the map_id from the reference value.

Selftests are added in the next patch (2/2).
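
A userspace sketch of a batched lookup on the outer map (fd, batch size
and variable names are illustrative; the values read back are inner map
ids):

  __u32 keys[16], inner_map_ids[16];
  __u32 count = 16, out_batch;
  LIBBPF_OPTS(bpf_map_batch_opts, opts);
  int err;

  err = bpf_map_lookup_batch(outer_map_fd, NULL /* start */, &out_batch,
                             keys, inner_map_ids, &count, &opts);
  /* on success, count holds how many entries were actually retrieved */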

Signed-off-by: Takshak Chahande <ctakshak@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220510082221.2390540-1-ctakshak@fb.com
2 years agobpf: Print some info if disable bpf_jit_enable failed
Tiezhu Yang [Tue, 10 May 2022 03:35:03 +0000 (11:35 +0800)]
bpf: Print some info if disable bpf_jit_enable failed

A user told me that bpf_jit_enable can be disabled on one system, but he
failed to disable bpf_jit_enable on the other system:

  # echo 0 > /proc/sys/net/core/bpf_jit_enable
  bash: echo: write error: Invalid argument

No useful info is available in the dmesg log; a quick analysis shows
that the issue is related to CONFIG_BPF_JIT_ALWAYS_ON.

When CONFIG_BPF_JIT_ALWAYS_ON is enabled, bpf_jit_enable is permanently set
to 1 and setting any other value than that will return failure.

It is better to print some info to tell the user when disabling
bpf_jit_enable fails.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1652153703-22729-3-git-send-email-yangtiezhu@loongson.cn
2 years agonet: sysctl: Use SYSCTL_TWO instead of &two
Tiezhu Yang [Tue, 10 May 2022 03:35:02 +0000 (11:35 +0800)]
net: sysctl: Use SYSCTL_TWO instead of &two

It is better to use SYSCTL_TWO instead of &two, and then we can
remove the variable "two" in net/core/sysctl_net_core.c.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1652153703-22729-2-git-send-email-yangtiezhu@loongson.cn
2 years agobpf: Remove unused parameter from find_kfunc_desc_btf()
Yuntao Wang [Thu, 5 May 2022 07:01:14 +0000 (15:01 +0800)]
bpf: Remove unused parameter from find_kfunc_desc_btf()

The func_id parameter in find_kfunc_desc_btf() is not used, get rid of it.

Fixes: 2357672c54c3 ("bpf: Introduce BPF support for kernel module function calls")
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20220505070114.3522522-1-ytcoode@gmail.com
2 years agobpftool: Declare generator name
Jason Wang [Mon, 9 May 2022 09:02:47 +0000 (17:02 +0800)]
bpftool: Declare generator name

Most code generators declare their name, so do this for bpftool as well.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220509090247.5457-1-jasowang@redhat.com
2 years agosamples: bpf: Don't fail for a missing VMLINUX_BTF when VMLINUX_H is provided
Jerome Marchand [Sat, 7 May 2022 16:16:35 +0000 (18:16 +0200)]
samples: bpf: Don't fail for a missing VMLINUX_BTF when VMLINUX_H is provided

The samples/bpf build currently always fails if it can't generate
vmlinux.h from vmlinux, even when vmlinux.h is directly provided by the
VMLINUX_H variable, which makes VMLINUX_H pointless.
Only fail when neither method works.

Fixes: 384b6b3bbf0d ("samples: bpf: Add vmlinux.h generation support")
Reported-by: CKI Project <cki-project@redhat.com>
Reported-by: Veronika Kabatova <vkabatov@redhat.com>
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220507161635.2219052-1-jmarchan@redhat.com
2 years agoMerge branch 'bpftool: fix feature output when helper probes fail'
Andrii Nakryiko [Tue, 10 May 2022 00:16:05 +0000 (17:16 -0700)]
Merge branch 'bpftool: fix feature output when helper probes fail'

Milan Landaverde says:

====================

Currently in bpftool's feature probe, we incorrectly tell the user that
all of the helper functions are supported for program types where helper
probing fails or is explicitly unsupported[1]:

$ bpftool feature probe
...
eBPF helpers supported for program type tracing:
- bpf_map_lookup_elem
- bpf_map_update_elem
- bpf_map_delete_elem
...
- bpf_redirect_neigh
- bpf_check_mtu
- bpf_sys_bpf
- bpf_sys_close

This patch adjusts bpftool to relay to the user when helper support
can't be determined:

$ bpftool feature probe
...
eBPF helpers supported for program type lirc_mode2:
    Program type not supported
eBPF helpers supported for program type tracing:
    Could not determine which helpers are available
eBPF helpers supported for program type struct_ops:
    Could not determine which helpers are available
eBPF helpers supported for program type ext:
    Could not determine which helpers are available

Rather than imply that no helpers are available for the program type, we
let the user know that helper function probing failed entirely.

[1] https://lore.kernel.org/bpf/20211217171202.3352835-2-andrii@kernel.org/
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2 years agobpftool: Output message if no helpers found in feature probing
Milan Landaverde [Wed, 4 May 2022 16:13:32 +0000 (12:13 -0400)]
bpftool: Output message if no helpers found in feature probing

Currently in libbpf, we have hardcoded program types that are not
supported for helper function probing (e.g. tracing, ext, lsm).
Due to this (and other legitimate failures), bpftool feature probe returns
an empty helper list for those program types.

Instead of implying to the user that there are no helper functions
available for a program type, we output a message to the user explaining
that helper function probing failed for that program type.

Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220504161356.3497972-3-milan@mdaverde.com
2 years agobpftool: Adjust for error codes from libbpf probes
Milan Landaverde [Wed, 4 May 2022 16:13:31 +0000 (12:13 -0400)]
bpftool: Adjust for error codes from libbpf probes

Originally [1], libbpf's (now deprecated) probe functions returned a bool
to acknowledge support, but the new APIs return an int with a possible
negative error code to reflect probe failure. With this change, bpftool
declares that maps and helpers are not available on probe failures.
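
In other words, the probe result is now interpreted roughly as follows
(a sketch; variable names are illustrative):

  ret = libbpf_probe_bpf_helper(prog_type, helper_id, NULL);
  /* ret == 1: supported, ret == 0: not supported,
   * ret < 0: probe failed, so report the helper as not available
   */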

[1]: https://lore.kernel.org/bpf/20220202225916.3313522-3-andrii@kernel.org/

Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220504161356.3497972-2-milan@mdaverde.com
2 years agoselftests/bpf: Test libbpf's ringbuf size fix up logic
Andrii Nakryiko [Mon, 9 May 2022 00:41:48 +0000 (17:41 -0700)]
selftests/bpf: Test libbpf's ringbuf size fix up logic

Make sure we always exercise libbpf's ringbuf map size adjustment logic
by specifying a non-zero size that's definitely not a page size multiple.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-10-andrii@kernel.org
2 years agolibbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary
Andrii Nakryiko [Mon, 9 May 2022 00:41:47 +0000 (17:41 -0700)]
libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary

Kernel imposes a pretty particular restriction on ringbuf map size. It
has to be a power-of-2 multiple of page size. While generally this isn't
hard for user to satisfy, sometimes it's impossible to do this
declaratively in BPF source code or just plain inconvenient to do at
runtime.

One such example might be BPF libraries that are supposed to work on
different architectures, which might not agree on what the common page
size is.

Let libbpf find the right size for user instead, if it turns out to not
satisfy kernel requirements. If user didn't set size at all, that's most
probably a mistake so don't upsize such zero size to one full page,
though. Also we need to be careful about not overflowing __u32
max_entries.
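
A sketch of the adjustment logic described above (the helper name and
exact rounding behavior approximate what libbpf does):

  static __u32 adjust_ringbuf_sz(__u32 sz)
  {
          __u32 page_sz = sysconf(_SC_PAGESIZE);
          __u32 mul, n = sz / page_sz;

          /* an unset (zero) size is most likely a mistake, leave it alone */
          if (sz == 0)
                  return 0;

          /* already a power-of-2 multiple of page size, keep as is */
          if (sz % page_sz == 0 && (n & (n - 1)) == 0)
                  return sz;

          /* otherwise round up to the smallest satisfying size, taking
           * care not to overflow __u32 max_entries
           */
          for (mul = 1; mul <= UINT_MAX / page_sz; mul <<= 1)
                  if (mul * page_sz >= sz)
                          return mul * page_sz;

          return sz;  /* cannot fix up without overflowing, leave as is */
  }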

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220509004148.1801791-9-andrii@kernel.org