linux-2.6-microblaze.git
13 months ago net: ag71xx: add MODULE_DESCRIPTION
Rosen Penev [Thu, 5 Sep 2024 19:49:33 +0000 (12:49 -0700)]
net: ag71xx: add MODULE_DESCRIPTION

Now that COMPILE_TEST is enabled, the missing MODULE_DESCRIPTION gets
flagged in allmodconfig W=1 builds. The text is taken from the beginning
of the file.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240905194938.8453-3-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago net: ag71xx: add COMPILE_TEST to test compilation
Rosen Penev [Thu, 5 Sep 2024 19:49:32 +0000 (12:49 -0700)]
net: ag71xx: add COMPILE_TEST to test compilation

While this driver is meant for MIPS only, it can be compiled on x86 just
fine. Remove pointless parentheses while at it.

Enables CI building of this driver.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240905194938.8453-2-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago Merge branch 'af_unix-correct-manage_oob-when-oob-follows-a-consumed-oob'
Jakub Kicinski [Tue, 10 Sep 2024 00:14:28 +0000 (17:14 -0700)]
Merge branch 'af_unix-correct-manage_oob-when-oob-follows-a-consumed-oob'

Kuniyuki Iwashima says:

====================
af_unix: Correct manage_oob() when OOB follows a consumed OOB.

Recently syzkaller reported UAF of OOB skb.

The bug was introduced by commit 93c99f21db36 ("af_unix: Don't stop
recv(MSG_DONTWAIT) if consumed OOB skb is at the head.") but uncovered
by another recent commit 8594d9b85c07 ("af_unix: Don't call skb_get()
for OOB skb.").

[0]: https://lore.kernel.org/netdev/00000000000083b05a06214c9ddc@google.com/
====================

Link: https://patch.msgid.link/20240905193240.17565-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago af_unix: Don't return OOB skb in manage_oob().
Kuniyuki Iwashima [Thu, 5 Sep 2024 19:32:40 +0000 (12:32 -0700)]
af_unix: Don't return OOB skb in manage_oob().

syzbot reported use-after-free in unix_stream_recv_urg(). [0]

The scenario is

  1. send(MSG_OOB)
  2. recv(MSG_OOB)
     -> The consumed OOB remains in recv queue
  3. send(MSG_OOB)
  4. recv()
     -> manage_oob() returns the next skb of the consumed OOB
     -> This is also OOB, but unix_sk(sk)->oob_skb is not cleared
  5. recv(MSG_OOB)
     -> unix_sk(sk)->oob_skb is used but already freed

The recent commit 8594d9b85c07 ("af_unix: Don't call skb_get() for OOB
skb.") uncovered the issue.

If the OOB skb is consumed and the next skb is peeked in manage_oob(),
we still need to check if the skb is OOB.
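
The missing check can be pictured with a small userspace model. This is only a sketch under simplifying assumptions: toy_skb, toy_sock and both toy_manage_oob_*() functions are hypothetical names invented here, not kernel code, and the real manage_oob() also deals with MSG_PEEK, SO_OOBINLINE and partially copied reads, which the model ignores.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; not the real struct sk_buff. */
struct toy_skb {
	struct toy_skb *next;
	bool consumed;            /* OOB byte already read, skb left in the queue */
};

struct toy_sock {
	struct toy_skb *head;     /* head of the receive queue */
	struct toy_skb *oob_skb;  /* models unix_sk(sk)->oob_skb */
};

/* Buggy flavour: skip a consumed skb at the head and return whatever follows. */
struct toy_skb *toy_manage_oob_buggy(struct toy_sock *sk)
{
	struct toy_skb *skb = sk->head;

	if (skb && skb->consumed)
		skb = skb->next;
	return skb;               /* oob_skb may still point at this skb */
}

/* Fixed flavour: after skipping, still check whether the skb is the OOB one. */
struct toy_skb *toy_manage_oob_fixed(struct toy_sock *sk)
{
	struct toy_skb *skb = sk->head;

	if (skb && skb->consumed)
		skb = skb->next;
	if (skb && skb == sk->oob_skb)
		sk->oob_skb = NULL;   /* simplification: drop the stale reference */
	return skb;
}
```

With a queue of "consumed OOB, then a second OOB", the buggy flavour hands the second skb to the normal read path while oob_skb still references it, which is the state that step 5 of the scenario later trips over.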

Let's do so by falling back to the following checks in manage_oob()
and add the test case in selftest.

Note that we need to add a similar check for SIOCATMARK.

[0]:
BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa6/0xb0 net/unix/af_unix.c:2959
Read of size 4 at addr ffff8880326abcc4 by task syz-executor178/5235

CPU: 0 UID: 0 PID: 5235 Comm: syz-executor178 Not tainted 6.11.0-rc5-syzkaller-00742-gfbdaffe41adc #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:93 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 unix_stream_read_actor+0xa6/0xb0 net/unix/af_unix.c:2959
 unix_stream_recv_urg+0x1df/0x320 net/unix/af_unix.c:2640
 unix_stream_read_generic+0x2456/0x2520 net/unix/af_unix.c:2778
 unix_stream_recvmsg+0x22b/0x2c0 net/unix/af_unix.c:2996
 sock_recvmsg_nosec net/socket.c:1046 [inline]
 sock_recvmsg+0x22f/0x280 net/socket.c:1068
 ____sys_recvmsg+0x1db/0x470 net/socket.c:2816
 ___sys_recvmsg net/socket.c:2858 [inline]
 __sys_recvmsg+0x2f0/0x3e0 net/socket.c:2888
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5360d6b4e9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff29b3a458 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 00007fff29b3a638 RCX: 00007f5360d6b4e9
RDX: 0000000000002001 RSI: 0000000020000640 RDI: 0000000000000003
RBP: 00007f5360dde610 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007fff29b3a628 R14: 0000000000000001 R15: 0000000000000001
 </TASK>

Allocated by task 5235:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 unpoison_slab_object mm/kasan/common.c:312 [inline]
 __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook mm/slub.c:3988 [inline]
 slab_alloc_node mm/slub.c:4037 [inline]
 kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4080
 __alloc_skb+0x1c3/0x440 net/core/skbuff.c:667
 alloc_skb include/linux/skbuff.h:1320 [inline]
 alloc_skb_with_frags+0xc3/0x770 net/core/skbuff.c:6528
 sock_alloc_send_pskb+0x91a/0xa60 net/core/sock.c:2815
 sock_alloc_send_skb include/net/sock.h:1778 [inline]
 queue_oob+0x108/0x680 net/unix/af_unix.c:2198
 unix_stream_sendmsg+0xd24/0xf80 net/unix/af_unix.c:2351
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg+0x221/0x270 net/socket.c:745
 ____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
 ___sys_sendmsg net/socket.c:2651 [inline]
 __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 5235:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
 poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
 __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
 kasan_slab_free include/linux/kasan.h:184 [inline]
 slab_free_hook mm/slub.c:2252 [inline]
 slab_free mm/slub.c:4473 [inline]
 kmem_cache_free+0x145/0x350 mm/slub.c:4548
 unix_stream_read_generic+0x1ef6/0x2520 net/unix/af_unix.c:2917
 unix_stream_recvmsg+0x22b/0x2c0 net/unix/af_unix.c:2996
 sock_recvmsg_nosec net/socket.c:1046 [inline]
 sock_recvmsg+0x22f/0x280 net/socket.c:1068
 __sys_recvfrom+0x256/0x3e0 net/socket.c:2255
 __do_sys_recvfrom net/socket.c:2273 [inline]
 __se_sys_recvfrom net/socket.c:2269 [inline]
 __x64_sys_recvfrom+0xde/0x100 net/socket.c:2269
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff8880326abc80
 which belongs to the cache skbuff_head_cache of size 240
The buggy address is located 68 bytes inside of
 freed 240-byte region [ffff8880326abc80, ffff8880326abd70)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x326ab
ksm flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xfdffffff(slab)
raw: 00fff00000000000 ffff88801eaee780 ffffea0000b7dc80 dead000000000003
raw: 0000000000000000 00000000800c000c 00000001fdffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 4686, tgid 4686 (udevadm), ts 32357469485, free_ts 28829011109
 set_page_owner include/linux/page_owner.h:32 [inline]
 post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
 prep_new_page mm/page_alloc.c:1501 [inline]
 get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
 alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
 alloc_slab_page+0x5f/0x120 mm/slub.c:2321
 allocate_slab+0x5a/0x2f0 mm/slub.c:2484
 new_slab mm/slub.c:2537 [inline]
 ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
 __slab_alloc+0x58/0xa0 mm/slub.c:3813
 __slab_alloc_node mm/slub.c:3866 [inline]
 slab_alloc_node mm/slub.c:4025 [inline]
 kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4080
 __alloc_skb+0x1c3/0x440 net/core/skbuff.c:667
 alloc_skb include/linux/skbuff.h:1320 [inline]
 alloc_uevent_skb+0x74/0x230 lib/kobject_uevent.c:289
 uevent_net_broadcast_untagged lib/kobject_uevent.c:326 [inline]
 kobject_uevent_net_broadcast+0x2fd/0x580 lib/kobject_uevent.c:410
 kobject_uevent_env+0x57d/0x8e0 lib/kobject_uevent.c:608
 kobject_synth_uevent+0x4ef/0xae0 lib/kobject_uevent.c:207
 uevent_store+0x4b/0x70 drivers/base/bus.c:633
 kernfs_fop_write_iter+0x3a1/0x500 fs/kernfs/file.c:334
 new_sync_write fs/read_write.c:497 [inline]
 vfs_write+0xa72/0xc90 fs/read_write.c:590
page last free pid 1 tgid 1 stack trace:
 reset_page_owner include/linux/page_owner.h:25 [inline]
 free_pages_prepare mm/page_alloc.c:1094 [inline]
 free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
 kasan_depopulate_vmalloc_pte+0x74/0x90 mm/kasan/shadow.c:408
 apply_to_pte_range mm/memory.c:2797 [inline]
 apply_to_pmd_range mm/memory.c:2841 [inline]
 apply_to_pud_range mm/memory.c:2877 [inline]
 apply_to_p4d_range mm/memory.c:2913 [inline]
 __apply_to_page_range+0x8a8/0xe50 mm/memory.c:2947
 kasan_release_vmalloc+0x9a/0xb0 mm/kasan/shadow.c:525
 purge_vmap_node+0x3e3/0x770 mm/vmalloc.c:2208
 __purge_vmap_area_lazy+0x708/0xae0 mm/vmalloc.c:2290
 _vm_unmap_aliases+0x79d/0x840 mm/vmalloc.c:2885
 change_page_attr_set_clr+0x2fe/0xdb0 arch/x86/mm/pat/set_memory.c:1881
 change_page_attr_set arch/x86/mm/pat/set_memory.c:1922 [inline]
 set_memory_nx+0xf2/0x130 arch/x86/mm/pat/set_memory.c:2110
 free_init_pages arch/x86/mm/init.c:924 [inline]
 free_kernel_image_pages arch/x86/mm/init.c:943 [inline]
 free_initmem+0x79/0x110 arch/x86/mm/init.c:970
 kernel_init+0x31/0x2b0 init/main.c:1476
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Memory state around the buggy address:
 ffff8880326abb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880326abc00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff8880326abc80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                           ^
 ffff8880326abd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
 ffff8880326abd80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb

Fixes: 93c99f21db36 ("af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.")
Reported-by: syzbot+8811381d455e3e9ec788@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8811381d455e3e9ec788
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago af_unix: Move spin_lock() in manage_oob().
Kuniyuki Iwashima [Thu, 5 Sep 2024 19:32:39 +0000 (12:32 -0700)]
af_unix: Move spin_lock() in manage_oob().

When the OOB skb has already been consumed, manage_oob() returns the next
skb if one exists.  In such a case, we need to fall back to the else branch
below.

Then, we want to keep holding spin_lock(&sk->sk_receive_queue.lock).

Let's move it out of the if-else branch and add a lightweight check before
spin_lock() for the major use cases without an OOB skb.
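
The "lightweight check before spin_lock()" idea is a common fast-path pattern, and a userspace sketch may make it concrete. Everything here is an assumption for illustration: toy_queue, toy_pick() and the atomic_flag spinlock are invented stand-ins, and the counter exists only so the example can show when the lock was taken.

```c
#include <stdatomic.h>
#include <stddef.h>

struct toy_queue {
	atomic_flag lock;          /* stand-in for sk_receive_queue.lock */
	_Atomic(void *) oob_item;  /* stand-in for the oob_skb pointer */
	int lock_acquisitions;     /* instrumentation for this example only */
};

void *toy_pick(struct toy_queue *q, void *item)
{
	/* Fast path: the item is not the OOB one, so never touch the lock. */
	if (item != atomic_load_explicit(&q->oob_item, memory_order_relaxed))
		return item;

	/* Slow path: take the lock once and keep it for all OOB handling. */
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;                      /* spin until the lock is free */
	q->lock_acquisitions++;

	/* Re-check under the lock before acting on the OOB item. */
	if (item == atomic_load_explicit(&q->oob_item, memory_order_relaxed))
		atomic_store_explicit(&q->oob_item, NULL, memory_order_relaxed);

	atomic_flag_clear_explicit(&q->lock, memory_order_release);
	return item;
}
```

The point of the design is that ordinary reads, which are the common case, pay only for one pointer comparison.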

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago af_unix: Rename unlinked_skb in manage_oob().
Kuniyuki Iwashima [Thu, 5 Sep 2024 19:32:38 +0000 (12:32 -0700)]
af_unix: Rename unlinked_skb in manage_oob().

When the OOB skb has already been consumed, manage_oob() returns the next
skb if one exists.  In such a case, we need to fall back to the else branch
below.

Then, we need to keep two skbs and free them later with consume_skb()
and kfree_skb().

Let's rename unlinked_skb accordingly.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago af_unix: Remove single nest in manage_oob().
Kuniyuki Iwashima [Thu, 5 Sep 2024 19:32:37 +0000 (12:32 -0700)]
af_unix: Remove single nest in manage_oob().

This is a prep for the later fix.

No functional change intended.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20240905193240.17565-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago Merge tag 'linux-can-next-for-6.12-20240909' of git://git.kernel.org/pub/scm/linux...
Jakub Kicinski [Tue, 10 Sep 2024 00:11:05 +0000 (17:11 -0700)]
Merge tag 'linux-can-next-for-6.12-20240909' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2024-09-09

The first patch is by Rob Herring and simplifies the DT parsing in the
cc770 driver.

The next 2 patches both target the rockchip_canfd driver added in the
last pull request to net-next. The first one is by Nathan Chancellor
and fixes the return type of rkcanfd_start_xmit(), the second one is
by me and fixes a 64 bit division on 32 bit platforms in
rkcanfd_timestamp_init().

* tag 'linux-can-next-for-6.12-20240909' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: rockchip_canfd: rkcanfd_timestamp_init(): fix 64 bit division on 32 bit platforms
  can: rockchip_canfd: fix return type of rkcanfd_start_xmit()
  net: can: cc770: Simplify parsing DT properties
====================

Link: https://patch.msgid.link/20240909063657.2287493-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago net: remove dev_pick_tx_cpu_id()
Jakub Kicinski [Fri, 6 Sep 2024 16:10:59 +0000 (09:10 -0700)]
net: remove dev_pick_tx_cpu_id()

dev_pick_tx_cpu_id() was introduced with two users by
commit a4ea8a3dacc3 ("net: Add generic ndo_select_queue functions").
The use in AF_PACKET was removed in 2019 by
commit b71b5837f871 ("packet: rework packet_pick_tx_queue() to use common code selection").
The other user was a Netlogic XLP driver, removed in 2021 by
commit 47ac6f567c28 ("staging: Remove Netlogic XLP network driver").

It's relatively unlikely that any modern driver will need an
.ndo_select_queue implementation which picks purely based on CPU ID
and skips XPS, so delete dev_pick_tx_cpu_id().
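
As a rough illustration of what "picks purely based on CPU ID" means, a selection of this kind boils down to a modulo. The helper below is a hypothetical userspace sketch, not the removed kernel function, and its name and parameters are invented here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map the current CPU number straight onto a TX queue.
 * The caller must pass num_tx_queues > 0.
 */
uint16_t toy_pick_tx_by_cpu(unsigned int cpu, unsigned int num_tx_queues)
{
	/* No XPS map and no flow hash: only the CPU ID decides. */
	return (uint16_t)(cpu % num_tx_queues);
}
```

For example, CPU 5 with four TX queues lands on queue 1, whatever the flow or the XPS configuration.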

Found by code inspection.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240906161059.715546-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago Merge branch 'selftests-mptcp-add-time-per-subtests-in-tap-output'
Jakub Kicinski [Mon, 9 Sep 2024 23:52:06 +0000 (16:52 -0700)]
Merge branch 'selftests-mptcp-add-time-per-subtests-in-tap-output'

Matthieu Baerts says:

====================
selftests: mptcp: add time per subtests in TAP output

Patches here add 'time=<N>ms' in the diagnostic data of the TAP output,
e.g.

  ok 1 - pm_netlink: defaults addr list # time=9ms

This addition is useful to quickly identify which subtests are taking a
longer time than the others, or more than expected.

Note that there are no specific formats to follow to show this time
according to the TAP 13, TAP 14 and KTAP specifications, but we follow
the format being parsed by NIPA [1].

Patch 1 modifies mptcp_lib.sh to add this support to all MPTCP
selftests.

Patch 2 removes the now duplicated info in mptcp_connect.sh

Patch 3 slightly improves the precision of the first subtest in all
MPTCP selftests.

Patches 4 and 5 remove duplicated spaces in TAP output, for the TAP
parsers that cannot handle them properly.

v1: https://lore.kernel.org/20240902-net-next-mptcp-ksft-subtest-time-v1-0-f1ed499a11b1@kernel.org
Link: https://github.com/linux-netdev/nipa/pull/36 [1]
====================

Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-0-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago selftests: mptcp: connect: remove duplicated spaces in TAP output
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:11 +0000 (20:46 +0200)]
selftests: mptcp: connect: remove duplicated spaces in TAP output

It is nice to have a visual alignment in the test output to present the
different results, but it makes less sense in the TAP output that is
there for computers.

It then seems better to remove the duplicated whitespace in the TAP
output, also because it can cause issues with TAP parsers expecting
only one space around the directive delimiter (#).
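
To illustrate the kind of normalization a strict parser expects, here is a minimal sketch that collapses runs of spaces in a TAP line. It is not how the selftest itself does it (the patch changes how the shell script prints the line); toy_squeeze_spaces() is a hypothetical helper, and the sample test name in the usage note is made up.

```c
/* Collapse every run of spaces in 'line' to a single space, in place. */
void toy_squeeze_spaces(char *line)
{
	char *dst = line;
	int prev_space = 0;

	for (const char *src = line; *src; src++) {
		int is_space = (*src == ' ');

		if (is_space && prev_space)
			continue;          /* drop the duplicated space */
		*dst++ = *src;
		prev_space = is_space;
	}
	*dst = '\0';
}
```

Applied to "ok 1 - ping tests      # time=42ms", this leaves exactly one space on each side of the '#' delimiter.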

While at it, change the variable name (result_msg) to something more
explicit.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-5-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago selftests: mptcp: diag: remove trailing whitespace
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:10 +0000 (20:46 +0200)]
selftests: mptcp: diag: remove trailing whitespace

It doesn't need to be there, and it can cause some issues with TAP
parsers expecting only one space around the directive delimiter (#).

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-4-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago selftests: mptcp: reset the last TS before the first test
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:09 +0000 (20:46 +0200)]
selftests: mptcp: reset the last TS before the first test

Just to slightly improve the precision of the duration of the first
test.

In mptcp_join.sh, the last append_prev_results is now done as soon as
the last test is over: this will add the last result in the list, and
get a more precise time for this last test.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-3-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago selftests: mptcp: connect: remove time in TAP output
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:08 +0000 (20:46 +0200)]
selftests: mptcp: connect: remove time in TAP output

It is now added by the MPTCP lib automatically, see the parent commit.

The time in the TAP output might be slightly different from the one
displayed before, but that's OK.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-2-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago selftests: mptcp: lib: add time per subtests in TAP output
Matthieu Baerts (NGI0) [Fri, 6 Sep 2024 18:46:07 +0000 (20:46 +0200)]
selftests: mptcp: lib: add time per subtests in TAP output

It adds 'time=<N>ms' in the diagnostic data of the TAP output, e.g.

  ok 1 - pm_netlink: defaults addr list # time=9ms

This addition is useful to quickly identify which subtests are taking a
longer time than the others, or more than expected.

Note that there are no specific formats to follow to show this time
according to the TAP 13 [1], TAP 14 [2] and KTAP [3] specifications.
Let's then define this one here.
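
A consumer of this output could pull the duration back out of the diagnostic part of the line. The function below is a minimal sketch of such a consumer, assuming only the 'time=<N>ms' shape shown above; toy_tap_time_ms() is a hypothetical name and is not part of the selftests or of NIPA.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return N from "time=<N>ms" found after the '#' delimiter,
 * or -1 if the line carries no such field or it is malformed.
 */
long toy_tap_time_ms(const char *line)
{
	const char *hash = strchr(line, '#');
	const char *p;
	char *end;
	long ms;

	if (!hash)
		return -1;
	p = strstr(hash, "time=");
	if (!p)
		return -1;
	p += strlen("time=");

	ms = strtol(p, &end, 10);
	if (end == p || ms < 0 || strncmp(end, "ms", 2) != 0)
		return -1;
	return ms;
}
```

For the example line above, "ok 1 - pm_netlink: defaults addr list # time=9ms", the function returns 9.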

Link: https://testanything.org/tap-version-13-specification.html [1]
Link: https://testanything.org/tap-version-14-specification.html [2]
Link: https://docs.kernel.org/dev-tools/ktap.html [3]
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-1-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago selftests: return failure when timestamps can't be reported
Jason Xing [Thu, 5 Sep 2024 16:00:35 +0000 (00:00 +0800)]
selftests: return failure when timestamps can't be reported

When I was trying to modify the tx timestamping feature, I found that
running "./txtimestamp -4 -C -L 127.0.0.1" didn't reflect the error:
I succeeded in generating a timestamp stored in the skb but later failed
to report it to userspace (which means failing to put css into cmsg).
This can happen when someone writes buggy code in __sock_recv_timestamp(),
for example.

With the added check, running ./txtimestamp reflects the result
correctly, like this, if there is a bug in the reporting phase:
protocol:     TCP
payload:      10
server port:  9000

family:       INET
test SND
    USR: 1725458477 s 667997 us (seq=0, len=0)
Failed to report timestamps
    USR: 1725458477 s 718128 us (seq=0, len=0)
Failed to report timestamps
    USR: 1725458477 s 768273 us (seq=0, len=0)
Failed to report timestamps
    USR: 1725458477 s 818416 us (seq=0, len=0)
Failed to report timestamps
...

In the future, it will help us detect whether a newly submitted patch
has bugs or not.

Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240905160035.62407-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago Merge branch 'unmask-dscp-part-four'
David S. Miller [Mon, 9 Sep 2024 13:14:54 +0000 (14:14 +0100)]
Merge branch 'unmask-dscp-part-four'

Ido Schimmel says:

====================
Unmask upper DSCP bits - part 4 (last)

tl;dr - This patchset finishes unmasking the upper DSCP bits in the IPv4
flow key in preparation for allowing IPv4 FIB rules to match on DSCP. No
functional changes are expected.

The TOS field in the IPv4 flow key ('flowi4_tos') is used during FIB
lookup to match against the TOS selector in FIB rules and routes.

It is currently impossible for user space to configure FIB rules that
match on the DSCP value as the upper DSCP bits are either masked in the
various call sites that initialize the IPv4 flow key or along the path
to the FIB core.

In preparation for adding a DSCP selector to IPv4 and IPv6 FIB rules, we
need to make sure the entire DSCP value is present in the IPv4 flow key.
This patchset finishes unmasking the upper DSCP bits by adjusting all
the callers of ip_route_output_key() to properly initialize the full
DSCP value in the IPv4 flow key.

No functional changes are expected as commit 1fa3314c14c6 ("ipv4:
Centralize TOS matching") moved the masking of the upper DSCP bits to
the core where 'flowi4_tos' is matched against the TOS selector.
====================
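
A short worked example may help show what "the upper DSCP bits" are. The sketch below uses local constants rather than kernel headers: TOY_DSCP_MASK is the six-bit DSCP part of the DS field, and TOY_LEGACY_MASK is assumed here to correspond to the legacy TOS mask (0x1c) that call sites used to apply. Both names and both helpers are hypothetical.

```c
#include <stdint.h>

/* Local copies for illustration; the kernel defines its own constants. */
#define TOY_DSCP_MASK   0xfcu   /* upper six bits of the DS field */
#define TOY_LEGACY_MASK 0x1cu   /* assumed legacy TOS mask, drops the top three bits */

/* What a masked call site would place in the flow key. */
uint8_t toy_masked_tos(uint8_t dsfield)
{
	return dsfield & TOY_LEGACY_MASK;
}

/* What an unmasked call site preserves: the full DSCP value, ECN bits cleared. */
uint8_t toy_full_dscp(uint8_t dsfield)
{
	return dsfield & TOY_DSCP_MASK;
}
```

For a DS field of 0xb8 (DSCP 46), the legacy mask yields 0x18 while the DSCP mask keeps 0xb8, so a rule matching on DSCP 46 could only ever work with the unmasked value.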

Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago sctp: Unmask upper DSCP bits in sctp_v4_get_dst()
Ido Schimmel [Thu, 5 Sep 2024 16:51:40 +0000 (19:51 +0300)]
sctp: Unmask upper DSCP bits in sctp_v4_get_dst()

Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.

Note that the 'tos' variable holds the full DS field.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ipv4: udp_tunnel: Unmask upper DSCP bits in udp_tunnel_dst_lookup()
Ido Schimmel [Thu, 5 Sep 2024 16:51:39 +0000 (19:51 +0300)]
ipv4: udp_tunnel: Unmask upper DSCP bits in udp_tunnel_dst_lookup()

Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.

Note that callers of udp_tunnel_dst_lookup() pass the entire DS field in
the 'tos' argument.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago netfilter: nf_dup4: Unmask upper DSCP bits in nf_dup_ipv4_route()
Ido Schimmel [Thu, 5 Sep 2024 16:51:38 +0000 (19:51 +0300)]
netfilter: nf_dup4: Unmask upper DSCP bits in nf_dup_ipv4_route()

Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago netfilter: nft_flow_offload: Unmask upper DSCP bits in nft_flow_route()
Ido Schimmel [Thu, 5 Sep 2024 16:51:37 +0000 (19:51 +0300)]
netfilter: nft_flow_offload: Unmask upper DSCP bits in nft_flow_route()

Unmask the upper DSCP bits when calling nf_route() which eventually
calls ip_route_output_key() so that in the future it could perform the
FIB lookup according to the full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ipv4: netfilter: Unmask upper DSCP bits in ip_route_me_harder()
Ido Schimmel [Thu, 5 Sep 2024 16:51:36 +0000 (19:51 +0300)]
ipv4: netfilter: Unmask upper DSCP bits in ip_route_me_harder()

Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_xmit()
Ido Schimmel [Thu, 5 Sep 2024 16:51:35 +0000 (19:51 +0300)]
ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_xmit()

Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.

Note that the 'tos' variable includes the full DS field: either the one
specified as part of the tunnel parameters or the one inherited from the
inner packet.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ipv4: ip_tunnel: Unmask upper DSCP bits in ip_md_tunnel_xmit()
Ido Schimmel [Thu, 5 Sep 2024 16:51:34 +0000 (19:51 +0300)]
ipv4: ip_tunnel: Unmask upper DSCP bits in ip_md_tunnel_xmit()

Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.

Note that the 'tos' variable includes the full DS field: either the one
specified via the tunnel key or the one inherited from the inner packet.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_bind_dev()
Ido Schimmel [Thu, 5 Sep 2024 16:51:33 +0000 (19:51 +0300)]
ipv4: ip_tunnel: Unmask upper DSCP bits in ip_tunnel_bind_dev()

Unmask the upper DSCP bits when initializing an IPv4 flow key via
ip_tunnel_init_flow() before passing it to ip_route_output_key() so that
in the future we could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ipv4: icmp: Unmask upper DSCP bits in icmp_reply()
Ido Schimmel [Thu, 5 Sep 2024 16:51:32 +0000 (19:51 +0300)]
ipv4: icmp: Unmask upper DSCP bits in icmp_reply()

Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago bpf: lwtunnel: Unmask upper DSCP bits in bpf_lwt_xmit_reroute()
Ido Schimmel [Thu, 5 Sep 2024 16:51:31 +0000 (19:51 +0300)]
bpf: lwtunnel: Unmask upper DSCP bits in bpf_lwt_xmit_reroute()

Unmask the upper DSCP bits when calling ip_route_output_key() so that in
the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ipv4: ip_gre: Unmask upper DSCP bits in ipgre_open()
Ido Schimmel [Thu, 5 Sep 2024 16:51:30 +0000 (19:51 +0300)]
ipv4: ip_gre: Unmask upper DSCP bits in ipgre_open()

Unmask the upper DSCP bits when calling ip_route_output_gre() so that in
the future it could perform the FIB lookup according to the full DSCP
value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago netfilter: br_netfilter: Unmask upper DSCP bits in br_nf_pre_routing_finish()
Ido Schimmel [Thu, 5 Sep 2024 16:51:29 +0000 (19:51 +0300)]
netfilter: br_netfilter: Unmask upper DSCP bits in br_nf_pre_routing_finish()

Unmask upper DSCP bits when calling ip_route_output() so that in the
future it could perform the FIB lookup according to the full DSCP value.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: sysfs: Fix weird usage of class's namespace relevant fields
Zijun Hu [Wed, 4 Sep 2024 23:35:38 +0000 (07:35 +0800)]
net: sysfs: Fix weird usage of class's namespace relevant fields

Device class has two namespace relevant fields which are associated by
the following usage:

struct class {
	...
	const struct kobj_ns_type_operations *ns_type;
	const void *(*namespace)(const struct device *dev);
	...
};

if (dev->class && dev->class->ns_type)
	dev->class->namespace(dev);

The usage looks weird since it checks @ns_type but calls namespace().
For all existing class definitions in the current kernel tree, the other
field is also assigned once one is assigned, so fix this weird usage by
checking @namespace before calling namespace().
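
The principle is simply to test the pointer that is about to be called. A self-contained sketch with toy types follows; toy_class, toy_device, toy_net_ns() and toy_dev_namespace() are hypothetical stand-ins rather than the kernel's definitions, and ns_type is reduced to a string for brevity.

```c
#include <stddef.h>

struct toy_device;

struct toy_class {
	const char *ns_type;                                  /* stand-in for the ns_type ops */
	const void *(*ns_fn)(const struct toy_device *dev);   /* stand-in for namespace() */
};

struct toy_device {
	const struct toy_class *cls;
};

/* Sample callback used by a class that does provide a namespace. */
const void *toy_net_ns(const struct toy_device *dev)
{
	(void)dev;
	return "init_net";
}

/* Check the callback itself, not the sibling field, before calling it. */
const void *toy_dev_namespace(const struct toy_device *dev)
{
	if (dev->cls && dev->cls->ns_fn)
		return dev->cls->ns_fn(dev);
	return NULL;
}
```

With this shape, a class that sets only ns_type and leaves the callback NULL yields NULL instead of a call through a null pointer.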

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago Merge branch 'fs_enet-cleanup'
David S. Miller [Mon, 9 Sep 2024 09:29:05 +0000 (10:29 +0100)]
Merge branch 'fs_enet-cleanup'

Maxime Chevallier says:

====================
net: ethernet: fs_enet: Cleanup and phylink conversion

This is V3 of a series that cleans-up fs_enet, with the ultimate goal of
converting it to phylink (patch 8).

The main changes compared to V2 are:
 - Reviewed-by tags from Andrew were gathered
 - Patch 5 now includes the removal of now unused includes, thanks
   Andrew for spotting this
 - Patch 4 is new, it reworks the adjust_link to move the spinlock
   acquisition to a more suitable location. Although this disappears in
   the actual phylink port, it makes the phylink conversion clearer on
   that point
 - Patch 8 includes fixes in the tx_timeout cancellation, to prevent
   taking rtnl twice when canceling a pending tx_timeout. Thanks Jakub
   for spotting this.

Link to V2: https://lore.kernel.org/netdev/20240829161531.610874-1-maxime.chevallier@bootlin.com/
Link to V1: https://lore.kernel.org/netdev/20240828095103.132625-1-maxime.chevallier@bootlin.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: ethernet: fs_enet: phylink conversion
Maxime Chevallier [Wed, 4 Sep 2024 17:18:21 +0000 (19:18 +0200)]
net: ethernet: fs_enet: phylink conversion

fs_enet is a quite old but still used Ethernet driver found on some NXP
devices. It has support for 10/100 Mbps ethernet, with half and full
duplex. Some variants of it can use RMII, while other integrations are
MII-only.

Add phylink support, thus removing custom fixed-link handling.

This also allows removing some internal flags such as the use_rmii flag.

Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ethernet: fs_enet: simplify clock handling with devm accessors
Maxime Chevallier [Wed, 4 Sep 2024 17:18:20 +0000 (19:18 +0200)]
net: ethernet: fs_enet: simplify clock handling with devm accessors

devm_clk_get_enabled() can be used to simplify clock handling for the
PER register clock.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ethernet: fs_enet: use macros for speed and duplex values
Maxime Chevallier [Wed, 4 Sep 2024 17:18:19 +0000 (19:18 +0200)]
net: ethernet: fs_enet: use macros for speed and duplex values

The PHY speed and duplex should be manipulated using the SPEED_XXX and
DUPLEX_XXX macros available. Use them in the fcc, fec and scc MACs for
fs_enet.

Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ethernet: fs_enet: drop unused phy_info and mii_if_info
Maxime Chevallier [Wed, 4 Sep 2024 17:18:18 +0000 (19:18 +0200)]
net: ethernet: fs_enet: drop unused phy_info and mii_if_info

There's no user of the struct phy_info, the 'phy' field and the
mii_if_info in the fs_enet driver, probably dating back to when phylib
wasn't as widely used.  Drop these from the driver code.

As the definition for struct mii_if_info is no longer required, drop the
include for linux/mii.h altogether in the driver.

Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ethernet: fs_enet: only protect the .restart() call in .adjust_link
Maxime Chevallier [Wed, 4 Sep 2024 17:18:17 +0000 (19:18 +0200)]
net: ethernet: fs_enet: only protect the .restart() call in .adjust_link

When .adjust_link() gets called, it runs in thread context, with the
phydev->lock held. We only need to protect the fep->fecp/fccp/sccp
registers that are accessed within the .restart() function from
concurrent access from the interrupts.

These registers are being protected by the fep->lock spinlock, so we can
move the spinlock protection around the .restart() call instead of the
entire adjust_link() call. By doing so, we can simplify further the
.adjust_link() callback and avoid the intermediate helper.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ethernet: fs_enet: drop the .adjust_link custom fs_ops
Maxime Chevallier [Wed, 4 Sep 2024 17:18:16 +0000 (19:18 +0200)]
net: ethernet: fs_enet: drop the .adjust_link custom fs_ops

There's no in-tree user for the fs_ops .adjust_link() function, so we
can always use the generic one in fs_enet-main.

Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ethernet: fs_enet: cosmetic cleanups
Maxime Chevallier [Wed, 4 Sep 2024 17:18:15 +0000 (19:18 +0200)]
net: ethernet: fs_enet: cosmetic cleanups

Due to the age of the driver and the slow recent activity on it, the code
has taken some layers of dust. Clean the main driver file up so that it
passes checkpatch and also conforms with the net coding style.

Changes include :
 - Re-ordering of the variable declarations for RCT
 - Fixing the comment styles to either one-line comments, or net-style
   comments
 - Adding braces around single-statement 'else' clauses
 - Aligning function/macro parameters on the opening parenthesis
 - Simplifying checks for NULL pointers
 - Splitting cascaded assignments into individual assignments
 - Fixing some typos
 - Fixing whitespace issues

This is a cosmetic change and doesn't introduce any change in behaviour.

Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: ethernet: fs_enet: convert to SPDX
Maxime Chevallier [Wed, 4 Sep 2024 17:18:14 +0000 (19:18 +0200)]
net: ethernet: fs_enet: convert to SPDX

The ENET driver has SPDX tags in the header files, but they were missing
in the C files. Change the licence information to SPDX format.

Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agocan: rockchip_canfd: rkcanfd_timestamp_init(): fix 64 bit division on 32 bit platforms
Marc Kleine-Budde [Sun, 8 Sep 2024 15:00:00 +0000 (17:00 +0200)]
can: rockchip_canfd: rkcanfd_timestamp_init(): fix 64 bit division on 32 bit platforms

On some 32-bit platforms (at least on parisc), the compiler generates
a call to __divdi3() from the u32 by 3 division in
rkcanfd_timestamp_init(), which results in the following linker
error:

| ERROR: modpost: "__divdi3" [drivers/net/can/rockchip/rockchip_canfd.ko] undefined!

As this code doesn't run in the hot path, a 64 bit by 32 bit division
is OK, even on 32 bit platforms. Use an explicit call to div_u64() to
fix linking.
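
A rough userspace illustration of the explicit 64 bit by 32 bit
division follows. The kernel helper is div_u64(); div_u64_sketch() and
third_of_scaled_rate() are hypothetical stand-ins, and the arithmetic
is made up for the example rather than taken from the driver:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's div_u64(): 64-bit dividend,
 * 32-bit divisor. In the kernel, calling the helper explicitly avoids
 * a compiler-generated call to a libgcc routine such as __divdi3().
 */
static inline uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* Made-up example: widen first, then divide the 64-bit product by 3. */
static inline uint64_t third_of_scaled_rate(uint32_t rate_hz, uint32_t scale)
{
	uint64_t scaled = (uint64_t)rate_hz * scale;

	return div_u64_sketch(scaled, 3);
}
```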

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409072304.lCQWyNLU-lkp@intel.com/
Link: https://patch.msgid.link/20240909-can-rockchip_canfd-fix-64-bit-division-v1-1-2748d9422b00@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
13 months agocan: rockchip_canfd: fix return type of rkcanfd_start_xmit()
Nathan Chancellor [Fri, 6 Sep 2024 20:26:41 +0000 (13:26 -0700)]
can: rockchip_canfd: fix return type of rkcanfd_start_xmit()

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
warning in clang aims to catch these at compile time, which reveals:

  drivers/net/can/rockchip/rockchip_canfd-core.c:770:20: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
    770 |         .ndo_start_xmit = rkcanfd_start_xmit,
        |                           ^~~~~~~~~~~~~~~~~~

->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
'netdev_tx_t', not 'int' (although the types are ABI compatible). Adjust
the return type of rkcanfd_start_xmit() to match the prototype's to
resolve the warning.
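
A minimal sketch of the matching-prototype requirement, using
simplified stand-in types (netdev_tx_sketch_t, net_device_ops_sketch
and demo_start_xmit() are hypothetical names, not the kernel
definitions):

```c
#include <stddef.h>

struct sk_buff;
struct net_device;

/* Stand-in for netdev_tx_t, which is an enum in the kernel. */
typedef enum netdev_tx_sketch {
	TX_OK_SKETCH = 0x00,
	TX_BUSY_SKETCH = 0x10,
} netdev_tx_sketch_t;

struct net_device_ops_sketch {
	netdev_tx_sketch_t (*ndo_start_xmit)(struct sk_buff *skb,
					     struct net_device *dev);
};

/* The return type is the enum typedef, not int, so it matches the slot. */
static netdev_tx_sketch_t demo_start_xmit(struct sk_buff *skb,
					  struct net_device *dev)
{
	(void)skb;
	(void)dev;
	return TX_OK_SKETCH;
}

static const struct net_device_ops_sketch demo_ops = {
	.ndo_start_xmit = demo_start_xmit,
};
```

Declaring demo_start_xmit() as returning int would still produce the
same machine code on most ABIs, but the function type would no longer
be identical to the pointer type it is assigned to.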

Fixes: ff60bfbaf67f ("can: rockchip_canfd: add driver for Rockchip CAN-FD controller")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20240906-rockchip-canfd-wifpts-v1-1-b1398da865b7@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
13 months agonet: can: cc770: Simplify parsing DT properties
Rob Herring (Arm) [Tue, 3 Sep 2024 13:57:30 +0000 (08:57 -0500)]
net: can: cc770: Simplify parsing DT properties

Use of the typed property accessors is preferred over of_get_property().
The existing code doesn't work on little endian systems either. Replace
the of_get_property() calls with of_property_read_bool() and
of_property_read_u32().
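
To illustrate the endianness point: device tree cells are stored
big-endian, so a raw pointer dereference only yields the right value on
a big-endian host. The helper below is a hypothetical userspace sketch
of the conversion that a typed accessor takes care of, not the kernel
implementation:

```c
#include <stdint.h>

/* Decode one 32-bit big-endian cell, independent of host byte order. */
static uint32_t read_be32_cell(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

For example, the cell bytes 00 f4 24 00 decode to 16000000 on any host.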

Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20240903135731.405635-1-robh@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
13 months agoptp/ioctl: support MONOTONIC{,_RAW} timestamps for PTP_SYS_OFFSET_EXTENDED
Mahesh Bandewar [Wed, 4 Sep 2024 14:13:05 +0000 (07:13 -0700)]
ptp/ioctl: support MONOTONIC{,_RAW} timestamps for PTP_SYS_OFFSET_EXTENDED

The ability to read the PHC (PTP Hardware Clock) alongside
multiple system clocks is currently dependent on the specific
hardware architecture. This limitation restricts the use of
PTP_SYS_OFFSET_PRECISE to certain hardware configurations.

The generic solution which would work across all architectures
is to read the PHC along with the latency to perform PHC-read as
offered by PTP_SYS_OFFSET_EXTENDED which provides pre and post
timestamps.  However, these timestamps are currently limited
to the CLOCK_REALTIME timebase. Since CLOCK_REALTIME is affected
by NTP (or similar time synchronization services), it can
experience significant jumps forward or backward. This hinders
the precise latency measurements that PTP_SYS_OFFSET_EXTENDED
is designed to provide.

This problem could be addressed by supporting MONOTONIC_RAW
timestamps within PTP_SYS_OFFSET_EXTENDED. Unlike CLOCK_REALTIME
or CLOCK_MONOTONIC, the MONOTONIC_RAW timebase is unaffected
by NTP adjustments.

This enhancement can be implemented by utilizing one of the three
reserved words within the PTP_SYS_OFFSET_EXTENDED struct to pass
the clock-id for timestamps.  The current behavior aligns with
clock-id for CLOCK_REALTIME timebase (value of 0), ensuring
backward compatibility of the UAPI.
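
The backward-compatibility argument can be sketched as follows. The
struct and function names are hypothetical and the layout is
simplified; only the numeric clock IDs (0, 1 and 4) are the usual Linux
values for CLOCK_REALTIME, CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW:

```c
/* Values match the Linux clock IDs of the same names. */
enum {
	CLK_REALTIME_SKETCH = 0,
	CLK_MONOTONIC_SKETCH = 1,
	CLK_MONOTONIC_RAW_SKETCH = 4,
};

/* Hypothetical request header: one formerly reserved word carries the ID. */
struct offset_request_sketch {
	unsigned int n_samples;
	int clockid;
	unsigned int rsv[2];
};

/* Returns the clock to sample, or -1 for an unsupported request. */
static int pick_timestamp_clock(const struct offset_request_sketch *req)
{
	if (req->rsv[0] || req->rsv[1])
		return -1;

	switch (req->clockid) {
	case CLK_REALTIME_SKETCH:
	case CLK_MONOTONIC_SKETCH:
	case CLK_MONOTONIC_RAW_SKETCH:
		return req->clockid;
	default:
		return -1;
	}
}

/* An old caller zeroes the reserved words and so keeps CLOCK_REALTIME. */
static int legacy_request_clock(void)
{
	struct offset_request_sketch req = { .n_samples = 5 };

	return pick_timestamp_clock(&req);
}
```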

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agonet: sched: consistently use rcu_replace_pointer() in taprio_change()
Dmitry Antipov [Wed, 4 Sep 2024 11:54:01 +0000 (14:54 +0300)]
net: sched: consistently use rcu_replace_pointer() in taprio_change()

According to Vinicius (and carefully looking through the whole
https://syzkaller.appspot.com/bug?extid=b65e0af58423fc8a73aa
once again), txtime branch of 'taprio_change()' is not going to
race against 'advance_sched()'. But using 'rcu_replace_pointer()'
in the former may be a good idea as well.

Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agoMerge tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilt...
Jakub Kicinski [Sat, 7 Sep 2024 01:39:31 +0000 (18:39 -0700)]
Merge tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

Patch #1 adds ctnetlink support for kernel side filtering for
 deletions, from Changliang Wu.

Patch #2 updates nft_counter support to use u64_stats_t,
 from Sebastian Andrzej Siewior.

Patch #3 uses kmemdup_array() in all xtables frontends,
 from Yan Zhen.

Patch #4 is a oneliner to use ERR_CAST() in nf_conntrack instead
 of open-coded casting, from Shen Lichuan.

Patch #5 removes unused argument in nftables .validate interface,
 from Florian Westphal.

Patch #6 is a oneliner to correct a typo in nftables kdoc,
 from Simon Horman.

Patch #7 fixes missing kdoc in nftables, also from Simon.

Patch #8 updates nftables to handle timeout less than CONFIG_HZ.

Patch #9 rejects element expiration if timeout is zero,
 otherwise it is silently ignored.

Patch #10 disallows element expiration larger than timeout.

Patch #11 removes unnecessary READ_ONCE annotation while mutex is held.

Patch #12 adds missing READ_ONCE/WRITE_ONCE annotation in dynset.

Patch #13 annotates data-races around element expiration.

Patch #14 allocates timeout and expiration in one single set element
  extension, they are tightly coupled, no reason to keep them
  separated anymore.

Patch #15 updates nftables to interpret a zero timeout as an element
  that never times out. Note that it is already possible to declare sets
  with elements that never time out but this generalizes to all
  kinds of sets with timeouts.

Patch #16 adds support for element timeout and expiration updates.

* tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: set element timeout update support
  netfilter: nf_tables: zero timeout means element never times out
  netfilter: nf_tables: consolidate timeout extension for elements
  netfilter: nf_tables: annotate data-races around element expiration
  netfilter: nft_dynset: annotate data-races around set timeout
  netfilter: nf_tables: remove annotation to access set timeout while holding lock
  netfilter: nf_tables: reject expiration higher than timeout
  netfilter: nf_tables: reject element expiration with no timeout
  netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
  netfilter: nf_tables: Add missing Kernel doc
  netfilter: nf_tables: Correct spelling in nf_tables.h
  netfilter: nf_tables: drop unused 3rd argument from validate callback ops
  netfilter: conntrack: Convert to use ERR_CAST()
  netfilter: Use kmemdup_array instead of kmemdup for multiple allocation
  netfilter: nft_counter: Use u64_stats_t for statistic.
  netfilter: ctnetlink: support CTA_FILTER for flush
====================

Link: https://patch.msgid.link/20240905232920.5481-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoptp: ocp: Improve PCIe delay estimation
Vadim Fedorenko [Thu, 5 Sep 2024 14:00:28 +0000 (14:00 +0000)]
ptp: ocp: Improve PCIe delay estimation

The PCIe bus can be pretty busy during boot and the probe function can
see excessive delays. Let's find the minimal value out of several
tests and use it as estimated value.
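
The "minimum of several tests" idea can be sketched in plain C as
below. The function name and the sample values are hypothetical; how
the driver actually takes each measurement is not shown here:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the smallest of n measured delays. Bus contention can only
 * inflate a sample, so the minimum is the least disturbed estimate.
 * Returns UINT64_MAX when n is 0.
 */
static uint64_t min_delay_ns(const uint64_t *samples, size_t n)
{
	uint64_t best = UINT64_MAX;

	for (size_t i = 0; i < n; i++) {
		if (samples[i] < best)
			best = samples[i];
	}

	return best;
}
```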

Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20240905140028.560454-1-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonetpoll: remove netpoll_srcu
Eric Dumazet [Thu, 5 Sep 2024 08:49:09 +0000 (08:49 +0000)]
netpoll: remove netpoll_srcu

netpoll_srcu is currently used from netpoll_poll_disable() and
__netpoll_cleanup()

Both functions run under RTNL, using netpoll_srcu adds confusion
and no additional protection.

Moreover the synchronize_srcu() call in __netpoll_cleanup() is
performed before clearing np->dev->npinfo, which violates RCU rules.

After this patch, netpoll_poll_disable() and netpoll_poll_enable()
simply use rtnl_dereference().

This saves a big chunk of memory (more than 192KB on platforms
with 512 cpus)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20240905084909.2082486-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoMerge branch 'octeontx2-address-some-warnings'
Jakub Kicinski [Sat, 7 Sep 2024 01:23:51 +0000 (18:23 -0700)]
Merge branch 'octeontx2-address-some-warnings'

Simon Horman says:

====================
octeontx2: Address some warnings

This patchset addresses some warnings flagged by Sparse, gcc-14, and
clang-18 in files touched by recent patch submissions.

Although these changes do not alter the functionality of the code,
addressing them means that real problems introduced in the future and
flagged by Sparse will stand out more readily.

v1: https://lore.kernel.org/20240903-octeontx2-sparse-v1-0-f190309ecb0a@kernel.org
====================

Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-0-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoocteontx2-pf: Make iplen __be16 in otx2_sqe_add_ext()
Simon Horman [Wed, 4 Sep 2024 18:29:37 +0000 (19:29 +0100)]
octeontx2-pf: Make iplen __be16 in otx2_sqe_add_ext()

In otx2_sqe_add_ext() iplen is used to hold a 16-bit big-endian value,
but its type is u16, indicating a host byte order integer.

Address this mismatch by changing the type of iplen to __be16.
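
A userspace sketch of the distinction, assuming POSIX htons()/ntohs().
The typedef be16_sketch_t is only a naming convention here; unlike the
kernel's __be16 it is not enforced by any checker:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Annotation-only stand-in for __be16: holds a big-endian 16-bit value. */
typedef uint16_t be16_sketch_t;

/* Convert a host-order length to the big-endian form used in headers. */
static be16_sketch_t payload_len_be(uint16_t host_len)
{
	return htons(host_len);
}

/* Inspect the first byte in memory, which is what goes out first. */
static unsigned char first_wire_byte(be16_sketch_t v)
{
	unsigned char b[2];

	memcpy(b, &v, sizeof(b));
	return b[0];
}
```

Keeping the converted value in a variable whose type says "big-endian"
makes it harder to mix it up with host-order arithmetic later on.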

Flagged by Sparse as:

.../otx2_txrx.c:699:31: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:699:31:    expected unsigned short [usertype] iplen
.../otx2_txrx.c:699:31:    got restricted __be16 [usertype]
.../otx2_txrx.c:701:54: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:701:54:    expected restricted __be16 [usertype] tot_len
.../otx2_txrx.c:701:54:    got unsigned short [usertype] iplen
.../otx2_txrx.c:704:60: warning: incorrect type in assignment (different base types)
.../otx2_txrx.c:704:60:    expected restricted __be16 [usertype] payload_len
.../otx2_txrx.c:704:60:    got unsigned short [usertype] iplen

Introduced in
commit dc1a9bf2c816 ("octeontx2-pf: Add UDP segmentation offload support")

No functional change intended.
Compile tested only by author.

Tested-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-2-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoocteontx2-af: Pass string literal as format argument of alloc_workqueue()
Simon Horman [Wed, 4 Sep 2024 18:29:36 +0000 (19:29 +0100)]
octeontx2-af: Pass string literal as format argument of alloc_workqueue()

Recently I noticed that both gcc-14 and clang-18 report that passing
a non-string literal as the format argument of alloc_workqueue()
is potentially insecure.

E.g. clang-18 says:

.../rvu.c:2493:32: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
 2493 |         mw->mbox_wq = alloc_workqueue(name,
      |                                       ^~~~
.../rvu.c:2493:32: note: treat the string as an argument to avoid this
 2493 |         mw->mbox_wq = alloc_workqueue(name,
      |                                       ^
      |                                       "%s",

In every case, the contents of name are safe to pass as the
format argument. That is, in my understanding, it never contains any
format escape sequences.

But, it seems better to be safe than sorry. And, as a bonus, compiler
output becomes less verbose by addressing this issue as suggested by
clang-18.
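
The same pattern in plain C, using snprintf() as a stand-in for any
printf-style interface (format_name_safely() is a hypothetical helper
for illustration):

```c
#include <stdio.h>
#include <string.h>

/*
 * Pass the caller's string as an argument to a literal "%s" format,
 * so any '%' it contains is copied rather than interpreted.
 */
static int format_name_safely(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s", name);
}

/* A name containing "%d" comes out unchanged. */
static int percent_is_copied_verbatim(void)
{
	char buf[32];

	format_name_safely(buf, sizeof(buf), "mbox%d");
	return strcmp(buf, "mbox%d") == 0;
}
```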

Compile tested only by author.

Tested-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904-octeontx2-sparse-v2-1-14f2305fe4b2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet: phy: qca83xx: use PHY_ID_MATCH_EXACT
Rosen Penev [Wed, 4 Sep 2024 20:56:59 +0000 (13:56 -0700)]
net: phy: qca83xx: use PHY_ID_MATCH_EXACT

No need for the mask when there's already a macro for this.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240904205659.7470-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agosfc: siena: rip out rss-context dead code
Edward Cree [Wed, 4 Sep 2024 18:11:56 +0000 (19:11 +0100)]
sfc: siena: rip out rss-context dead code

Siena hardware does not support custom RSS contexts, but when the
 driver was forked from sfc.ko, some of the plumbing for them was
 copied across from the common code.  Actually trying to use them
 would lead to EOPNOTSUPP as the relevant efx_nic_type methods were
 not populated.
Remove this dead code from the Siena driver.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904181156.1993666-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoMerge branch 'use-functionality-of-irq_get_trigger_type'
Jakub Kicinski [Sat, 7 Sep 2024 01:21:47 +0000 (18:21 -0700)]
Merge branch 'use-functionality-of-irq_get_trigger_type'

Vasileios Amoiridis says:

====================
Use functionality of irq_get_trigger_type()

v1: https://lore.kernel.org/20240902225534.130383-1-vassilisamir@gmail.com
====================

Link: https://patch.msgid.link/20240904151018.71967-1-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet: smc91x: Make use of irq_get_trigger_type()
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:18 +0000 (17:10 +0200)]
net: smc91x: Make use of irq_get_trigger_type()

Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).

Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Link: https://patch.msgid.link/20240904151018.71967-4-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet: dsa: realtek: rtl8366rb: Make use of irq_get_trigger_type()
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:17 +0000 (17:10 +0200)]
net: dsa: realtek: rtl8366rb: Make use of irq_get_trigger_type()

Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).

Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Link: https://patch.msgid.link/20240904151018.71967-3-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet: dsa: realtek: rtl8365mb: Make use of irq_get_trigger_type()
Vasileios Amoiridis [Wed, 4 Sep 2024 15:10:16 +0000 (17:10 +0200)]
net: dsa: realtek: rtl8365mb: Make use of irq_get_trigger_type()

Convert irqd_get_trigger_type(irq_get_irq_data(irq)) cases to the more
simple irq_get_trigger_type(irq).

Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Link: https://patch.msgid.link/20240904151018.71967-2-vassilisamir@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet: tls: wait for async completion on last message
Sascha Hauer [Wed, 4 Sep 2024 12:17:41 +0000 (14:17 +0200)]
net: tls: wait for async completion on last message

When asynchronous encryption is used KTLS sends out the final data at
proto->close time. This becomes problematic when the task calling
close() receives a signal. In this case it can happen that
tcp_sendmsg_locked() called at close time returns -ERESTARTSYS and the
final data is not sent.

The described situation happens when KTLS is used in conjunction with
io_uring, as io_uring uses task_work_add() to add work to the current
userspace task. A discussion of the problem along with a reproducer can
be found in [1] and [2]

Fix this by waiting for the asynchronous encryption to be completed on
the final message. With this there is no data left to be sent at close
time.

[1] https://lore.kernel.org/all/20231010141932.GD3114228@pengutronix.de/
[2] https://lore.kernel.org/all/20240315100159.3898944-1-s.hauer@pengutronix.de/

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://patch.msgid.link/20240904-ktls-wait-async-v1-1-a62892833110@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoMerge branch 'make-use-of-the-helper-macro-list_head'
Jakub Kicinski [Sat, 7 Sep 2024 01:10:26 +0000 (18:10 -0700)]
Merge branch 'make-use-of-the-helper-macro-list_head'

Hongbo Li says:

====================
make use of the helper macro LIST_HEAD()

The macro LIST_HEAD() declares a list variable and
initializes it, which can be used to simplify the steps
of list initialization, thereby simplifying the code.
This series just does some equivalent substitutions,
with no functional modifications.
====================

Link: https://patch.msgid.link/20240904093243.3345012-1-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet/core: make use of the helper macro LIST_HEAD()
Hongbo Li [Wed, 4 Sep 2024 09:32:43 +0000 (17:32 +0800)]
net/core: make use of the helper macro LIST_HEAD()

list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
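
For readers unfamiliar with the two forms, here is a userspace
re-creation of the idea. The *_sketch names are hypothetical stand-ins
modelled on the kernel's list helpers, not the kernel definitions
themselves:

```c
struct list_head_sketch {
	struct list_head_sketch *next, *prev;
};

/* Declare and initialize in one step: the head points at itself. */
#define LIST_HEAD_INIT_SKETCH(name) { &(name), &(name) }
#define LIST_HEAD_SKETCH(name) \
	struct list_head_sketch name = LIST_HEAD_INIT_SKETCH(name)

/* Two-step form: declare first, initialize at run time. */
static inline void init_list_head_sketch(struct list_head_sketch *list)
{
	list->next = list;
	list->prev = list;
}

static inline int list_empty_sketch(const struct list_head_sketch *head)
{
	return head->next == head;
}

/* Both forms produce the same empty list. */
static int both_forms_start_empty(void)
{
	struct list_head_sketch two_step;
	LIST_HEAD_SKETCH(one_step);

	init_list_head_sketch(&two_step);
	return list_empty_sketch(&two_step) && list_empty_sketch(&one_step);
}
```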

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-6-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet/ipv6: make use of the helper macro LIST_HEAD()
Hongbo Li [Wed, 4 Sep 2024 09:32:42 +0000 (17:32 +0800)]
net/ipv6: make use of the helper macro LIST_HEAD()

list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-5-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet/netfilter: make use of the helper macro LIST_HEAD()
Hongbo Li [Wed, 4 Sep 2024 09:32:41 +0000 (17:32 +0800)]
net/netfilter: make use of the helper macro LIST_HEAD()

list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://patch.msgid.link/20240904093243.3345012-4-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet/tipc: make use of the helper macro LIST_HEAD()
Hongbo Li [Wed, 4 Sep 2024 09:32:40 +0000 (17:32 +0800)]
net/tipc: make use of the helper macro LIST_HEAD()

list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-3-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet/ipv4: make use of the helper macro LIST_HEAD()
Hongbo Li [Wed, 4 Sep 2024 09:32:39 +0000 (17:32 +0800)]
net/ipv4: make use of the helper macro LIST_HEAD()

list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-2-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agosfc: convert comma to semicolon
Chen Ni [Wed, 4 Sep 2024 08:49:51 +0000 (16:49 +0800)]
sfc: convert comma to semicolon

Replace comma between expressions with semicolons.

Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it seems best to use ';'
unless ',' is intended.
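
A small made-up example of the kind of side effect meant here (not
taken from the driver): under an unbraced 'if', a comma keeps both
assignments inside the condition, while a semicolon ends the guarded
statement after the first one.

```c
/* Comma: "a = 1, b = 2" is one statement, so both are guarded. */
static int with_comma(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1, b = 2;

	return a * 10 + b;
}

/* Semicolon: only "a = 1" is guarded; "b = 2" always runs. */
static int with_semicolon(int cond)
{
	int a = 0, b = 0;

	if (cond)
		a = 1;
	b = 2;

	return a * 10 + b;
}
```

With cond == 0 the first function returns 0 and the second returns 2,
which is why the two separators are only interchangeable when no such
guard is involved.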

Found by inspection.
No functional change intended.
Compile tested only.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240904084951.1353518-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agosfc/siena: Convert comma to semicolon
Chen Ni [Wed, 4 Sep 2024 08:40:34 +0000 (16:40 +0800)]
sfc/siena: Convert comma to semicolon

Replace comma between expressions with semicolons.

Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it seems best to use ';'
unless ',' is intended.

Found by inspection.
No functional change intended.
Compile tested only.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20240904084034.1353404-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoionic: Convert comma to semicolon
Chen Ni [Wed, 4 Sep 2024 08:17:28 +0000 (16:17 +0800)]
ionic: Convert comma to semicolon

Replace comma between expressions with semicolons.

Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it seems best to use ';'
unless ',' is intended.

Found by inspection.
No functional change intended.
Compile tested only.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20240904081728.1353260-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agonet: atlantic: convert comma to semicolon
Chen Ni [Wed, 4 Sep 2024 08:08:45 +0000 (16:08 +0800)]
net: atlantic: convert comma to semicolon

Replace comma between expressions with semicolons.

Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it seems best to use ';'
unless ',' is intended.

Found by inspection.
No functional change intended.
Compile tested only.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240904080845.1353144-1-nichen@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months agoMerge branch 'rx-sw-tstamp-for-all'
David S. Miller [Fri, 6 Sep 2024 08:34:19 +0000 (09:34 +0100)]
Merge branch 'rx-sw-tstamp-for-all'

Gal Pressman says:

====================
RX software timestamp for all - round 2

Round 1 of drivers conversion was merged [1], this is round 2, more
drivers to follow.

[1] https://lore.kernel.org/netdev/20240901112803.212753-1-gal@nvidia.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agobnx2x: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:22 +0000 (10:49 +0300)]
bnx2x: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agocxgb4: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:21 +0000 (10:49 +0300)]
cxgb4: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agoixgbe: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:20 +0000 (10:49 +0300)]
ixgbe: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agoigc: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:19 +0000 (10:49 +0300)]
igc: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months agoigb: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:18 +0000 (10:49 +0300)]
igb: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago ice: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:17 +0000 (10:49 +0300)]
ice: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago i40e: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:16 +0000 (10:49 +0300)]
i40e: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: netcp: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:15 +0000 (10:49 +0300)]
net: netcp: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: ti: icssg-prueth: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:14 +0000 (10:49 +0300)]
net: ti: icssg-prueth: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: ethernet: ti: cpsw_ethtool: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:13 +0000 (10:49 +0300)]
net: ethernet: ti: cpsw_ethtool: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: ethernet: ti: am65-cpsw-ethtool: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:12 +0000 (10:49 +0300)]
net: ethernet: ti: am65-cpsw-ethtool: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago mlxsw: spectrum: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:11 +0000 (10:49 +0300)]
mlxsw: spectrum: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: sparx5: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:10 +0000 (10:49 +0300)]
net: sparx5: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: lan966x: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:09 +0000 (10:49 +0300)]
net: lan966x: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago lan743x: Remove setting of RX software timestamp
Gal Pressman [Wed, 4 Sep 2024 07:49:08 +0000 (10:49 +0300)]
lan743x: Remove setting of RX software timestamp

The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.

Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago Merge branch 'microchip=ksz8-cleanup'
David S. Miller [Fri, 6 Sep 2024 07:41:36 +0000 (08:41 +0100)]
Merge branch 'microchip=ksz8-cleanup'

Pieter Van Trappen says:

====================
net: dsa: microchip: rename and clean ksz8 series files

The first KSZ8 series implementation was done for a KSZ8795 device but
several other KSZ8 devices have since been added. Rename these files
to adhere to the ksz8 naming convention as already used in most
functions and the existing ksz8.h; add an explanatory note.

In addition, clean the files by removing macros that are defined at
more than one place and remove confusion by renaming the KSZ8830
string which in fact is not an existing KSZ series switch.

Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
---
v4:
 - correct once more Kconfig list of supported switches

v3: https://lore.kernel.org/netdev/20240903072946.344507-1-vtpieter@gmail.com/
 - rename all KSZ8830 to KSZ88X3 only (not KSZ8863)
 - update Kconfig as per Arun's suggestion

v2: https://lore.kernel.org/netdev/20240830141250.30425-1-vtpieter@gmail.com/
 - more finegrained description in Kconfig and ksz8.c header
 - add KSZ8830/ksz8830 to KSZ8863/ksz88x3 renaming

v1: https://lore.kernel.org/netdev/20240828102801.227588-1-vtpieter@gmail.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: dsa: microchip: replace unclear KSZ8830 strings
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:42 +0000 (08:27 +0200)]
net: dsa: microchip: replace unclear KSZ8830 strings

Replace ksz8830 with ksz88x3 for the CHIP_ID definition and other
strings. This is because KSZ8830 is not an actual switch but the Chip
ID shared among KSZ8863/8873 switches, which are impossible to
differentiate by their Chip ID or Revision ID registers.

Now all KSZ*_CHIP_ID macros refer to actual, existing switches which
removes confusion.

Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: dsa: microchip: clean up ksz8_reg definition macros
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:41 +0000 (08:27 +0200)]
net: dsa: microchip: clean up ksz8_reg definition macros

Remove macros that are already defined at more appropriate places.

Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago net: dsa: microchip: rename ksz8 series files
Pieter Van Trappen [Wed, 4 Sep 2024 06:27:40 +0000 (08:27 +0200)]
net: dsa: microchip: rename ksz8 series files

The first KSZ8 series implementation was done for a KSZ8795 device but
several other KSZ8 devices have since been added. Rename these files
to adhere to the ksz8 naming convention as already used in most
functions and the existing ksz8.h; add an explanatory note.

Signed-off-by: Pieter Van Trappen <pieter.van.trappen@cern.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 months ago Merge branch 'add-realtek-automotive-pcie-driver'
Jakub Kicinski [Fri, 6 Sep 2024 05:02:41 +0000 (22:02 -0700)]
Merge branch 'add-realtek-automotive-pcie-driver'

Justin Lai says:

====================
Add Realtek automotive PCIe driver

This series adds the Realtek automotive Ethernet driver and adds the
rtase Ethernet driver entry to the MAINTAINERS file.

This Ethernet device driver is for the PCIe interface of the
Realtek Automotive Ethernet Switch, applicable to
RTL9054, RTL9068, RTL9072, RTL9075, RTL9068, RTL9071.
====================

Link: https://patch.msgid.link/20240904032114.247117-1-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago MAINTAINERS: Add the rtase ethernet driver entry
Justin Lai [Wed, 4 Sep 2024 03:21:14 +0000 (11:21 +0800)]
MAINTAINERS: Add the rtase ethernet driver entry

Add myself and Larry Chiu as maintainers of the rtase ethernet driver.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-14-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago realtek: Update the Makefile and Kconfig in the realtek folder
Justin Lai [Wed, 4 Sep 2024 03:21:13 +0000 (11:21 +0800)]
realtek: Update the Makefile and Kconfig in the realtek folder

1. Add the RTASE entry in the Kconfig.
2. Add the CONFIG_RTASE entry in the Makefile.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-13-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Add a Makefile in the rtase folder
Justin Lai [Wed, 4 Sep 2024 03:21:12 +0000 (11:21 +0800)]
rtase: Add a Makefile in the rtase folder

Add a Makefile in the rtase folder to build the rtase driver.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-12-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Implement ethtool function
Justin Lai [Wed, 4 Sep 2024 03:21:11 +0000 (11:21 +0800)]
rtase: Implement ethtool function

Implement the ethtool functions so that users can obtain network card
information: get various device settings, report whether the physical
link is up, report pause parameters, set pause parameters, and return
extended statistics about the device.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-11-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Implement pci_driver suspend and resume function
Justin Lai [Wed, 4 Sep 2024 03:21:10 +0000 (11:21 +0800)]
rtase: Implement pci_driver suspend and resume function

Implement the pci_driver suspend function to enable the device
to sleep, and implement the resume function to enable the device
to resume operation.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-10-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Implement net_device_ops
Justin Lai [Wed, 4 Sep 2024 03:21:09 +0000 (11:21 +0800)]
rtase: Implement net_device_ops

1. Implement .ndo_set_rx_mode so that the device can change address
list filtering.
2. Implement .ndo_set_mac_address so that mac address can be changed.
3. Implement .ndo_change_mtu so that mtu can be changed.
4. Implement .ndo_tx_timeout to perform related processing when the
transmitter does not make any progress.
5. Implement .ndo_get_stats64 to provide statistics that are called
when the user wants to get network device usage.
6. Implement .ndo_vlan_rx_add_vid to register VLAN ID when the device
supports VLAN filtering.
7. Implement .ndo_vlan_rx_kill_vid to unregister VLAN ID when the device
supports VLAN filtering.
8. Implement .ndo_setup_tc to enable setting any "tc" scheduler,
classifier or action on dev.
9. Implement .ndo_fix_features to enable adjusting requested feature
flags based on device-specific constraints.
10. Implement .ndo_set_features to enable updating the device
configuration to new features.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-9-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Implement a function to receive packets
Justin Lai [Wed, 4 Sep 2024 03:21:08 +0000 (11:21 +0800)]
rtase: Implement a function to receive packets

Implement rx_handler to read the information of the rx descriptor,
thereby checking the packet accordingly and storing the packet
in the socket buffer to complete the reception of the packet.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-8-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Implement .ndo_start_xmit function
Justin Lai [Wed, 4 Sep 2024 03:21:07 +0000 (11:21 +0800)]
rtase: Implement .ndo_start_xmit function

Implement .ndo_start_xmit function to fill the information of the packet
to be transmitted into the tx descriptor, and then the hardware will
transmit the packet using the information in the tx descriptor.
In addition, we also implemented the tx_handler function to enable the
tx descriptor to be reused.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-7-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
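
The interplay between filling tx descriptors and reclaiming them for reuse can be sketched with a small ring. This is not the rtase code: the demo_* names, the four-entry ring and the hw_owned flag are all assumptions chosen for illustration, and hardware completion is simulated by a helper.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_SIZE 4u	/* hypothetical ring size, power of two */

struct demo_tx_desc {
	uint32_t len;		/* length of the queued packet */
	bool hw_owned;		/* true until the hardware has sent it */
};

struct demo_tx_ring {
	struct demo_tx_desc desc[DEMO_RING_SIZE];
	uint32_t cur;		/* next slot to fill */
	uint32_t dirty;		/* oldest slot not yet reclaimed */
};

/* Fill the next free descriptor; fail when the ring is full. */
bool demo_start_xmit(struct demo_tx_ring *r, uint32_t len)
{
	struct demo_tx_desc *d;

	if (r->cur - r->dirty == DEMO_RING_SIZE)
		return false;

	d = &r->desc[r->cur % DEMO_RING_SIZE];
	d->len = len;
	d->hw_owned = true;
	r->cur++;
	return true;
}

/* Simulate the hardware finishing transmission of one slot. */
void demo_hw_complete(struct demo_tx_ring *r, uint32_t idx)
{
	r->desc[idx % DEMO_RING_SIZE].hw_owned = false;
}

/* Reclaim completed descriptors so that they can be reused. */
unsigned int demo_tx_handler(struct demo_tx_ring *r)
{
	unsigned int reclaimed = 0;

	while (r->dirty != r->cur &&
	       !r->desc[r->dirty % DEMO_RING_SIZE].hw_owned) {
		r->dirty++;
		reclaimed++;
	}
	return reclaimed;
}
```

Once the ring is full, demo_start_xmit() keeps failing until demo_tx_handler() has advanced the dirty index past at least one completed descriptor.
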
13 months ago rtase: Implement hardware configuration function
Justin Lai [Wed, 4 Sep 2024 03:21:06 +0000 (11:21 +0800)]
rtase: Implement hardware configuration function

Implement rtase_hw_config to set default hardware settings, including
interrupt mitigation, tx/rx DMA burst, interframe gap time, rx packet
filter and near fifo threshold, to fill in the descriptor ring and
tally counter addresses, and to enable flow control. When filling the
rx descriptor ring, the first group of queues needs to be processed
separately because its positions do not follow the regular layout of
the subsequent groups. The other queues are all newly added features,
but we want to retain the original design, so they were not put
together.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-6-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Implement the interrupt routine and rtase_poll
Justin Lai [Wed, 4 Sep 2024 03:21:05 +0000 (11:21 +0800)]
rtase: Implement the interrupt routine and rtase_poll

1. Implement rtase_interrupt to handle txQ0/rxQ0, txQ4~txQ7 interrupts,
and implement rtase_q_interrupt to handle txQ1/rxQ1, txQ2/rxQ2 and
txQ3/rxQ3 interrupts.
2. Implement rtase_poll to call ring_handler to process the tx or
rx packets of each ring. If the returned value is budget, it means that
some ring still has work that has not yet been completed.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-5-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
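
The budget rule in item 2 can be illustrated with a small model. It is a sketch under assumptions, not the driver's code: the demo_* names are hypothetical, a ring is reduced to a pending counter, and summing the per-ring work before clamping it to the budget is an assumed policy.

```c
#include <assert.h>

#define DEMO_NUM_RINGS 2

/* Hypothetical ring: only tracks how many packets are still pending. */
struct demo_ring {
	int pending;
};

/* Handle at most budget packets on one ring; return how many were handled. */
int demo_ring_handler(struct demo_ring *ring, int budget)
{
	int done = ring->pending < budget ? ring->pending : budget;

	ring->pending -= done;
	return done;
}

/*
 * Poll every ring with the same budget. Returning budget signals that work
 * may remain and the caller should poll again; returning less means the
 * rings were drained and interrupts could be re-enabled.
 */
int demo_poll(struct demo_ring *rings, int budget)
{
	int total = 0;
	int i;

	assert(budget > 0);

	for (i = 0; i < DEMO_NUM_RINGS; i++)
		total += demo_ring_handler(&rings[i], budget);

	return total >= budget ? budget : total;
}
```

With rings holding 10 and 3 pending packets and a budget of 8, the first call returns 8 (poll again) and the second returns 2 (done).
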
13 months ago rtase: Implement the rtase_down function
Justin Lai [Wed, 4 Sep 2024 03:21:04 +0000 (11:21 +0800)]
rtase: Implement the rtase_down function

Implement the rtase_down function to disable the hardware settings
and interrupts and to clear the descriptor ring.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-4-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 months ago rtase: Implement the .ndo_open function
Justin Lai [Wed, 4 Sep 2024 03:21:03 +0000 (11:21 +0800)]
rtase: Implement the .ndo_open function

Implement the .ndo_open function to set default hardware settings
and initialize the descriptor ring and interrupts. When requesting
interrupts, the first group needs separate handling: it processes
more events, so its overall structure and interrupt handler differ
from those of the other groups. The first group's interrupt handler
handles the interrupt status of RXQ0, TXQ0 and TXQ4~7, while the
handlers of the other groups handle the interrupt status of
RXQ1&TXQ1, RXQ2&TXQ2 or RXQ3&TXQ3 according to the interrupt vector.

Signed-off-by: Justin Lai <justinlai0215@realtek.com>
Link: https://patch.msgid.link/20240904032114.247117-3-justinlai0215@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
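
The vector-to-queue assignment described in this commit message can be written down as a lookup. This is an illustrative sketch only: the demo_* names and the bitmask representation (bit n stands for queue n) are assumptions, not the driver's data structures.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NUM_VECTORS 4

/* Bit n set means queue n is served by the vector. */
struct demo_queue_mask {
	uint8_t rx;
	uint8_t tx;
};

/*
 * Vector 0 serves RXQ0, TXQ0 and TXQ4..TXQ7; vectors 1..3 each serve the
 * rx/tx queue pair with the same index.
 */
struct demo_queue_mask demo_vector_queues(unsigned int vector)
{
	struct demo_queue_mask m;

	assert(vector < DEMO_NUM_VECTORS);

	m.rx = (uint8_t)(1u << vector);
	m.tx = (uint8_t)(1u << vector);
	if (vector == 0)
		m.tx |= 0xf0;	/* TXQ4..TXQ7 */

	return m;
}
```

The asymmetry of vector 0 is the reason the first interrupt handler is registered separately from the other three.
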