linux-2.6-microblaze.git
2 years agoMerge tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 9 Jun 2022 19:06:52 +0000 (12:06 -0700)]
Merge tag 'net-5.19-rc2' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - eth: amt: fix possible null-ptr-deref in amt_rcv()

  Previous releases - regressions:

   - tcp: use alloc_large_system_hash() to allocate table_perturb

   - af_unix: fix a data-race in unix_dgram_peer_wake_me()

   - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling

   - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF

  Previous releases - always broken:

   - ipv6: fix signed integer overflow in __ip6_append_data

   - netfilter:
       - nat: really support inet nat without l3 address
       - nf_tables: memleak flow rule from commit path

   - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs

   - openvswitch: fix misuse of the cached connection on tuple changes

   - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred

   - eth: altera: fix refcount leak in altera_tse_mdio_create

  Misc:

   - add Quentin Monnet to bpftool maintainers"

* tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  net: amd-xgbe: fix clang -Wformat warning
  tcp: use alloc_large_system_hash() to allocate table_perturb
  net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY
  net: dsa: mv88e6xxx: correctly report serdes link failure
  net: dsa: mv88e6xxx: fix BMSR error to be consistent with others
  net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete
  net: altera: Fix refcount leak in altera_tse_mdio_create
  net: openvswitch: fix misuse of the cached connection on tuple changes
  net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
  ip_gre: test csum_start instead of transport header
  au1000_eth: stop using virt_to_bus()
  ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg
  ipv6: Fix signed integer overflow in __ip6_append_data
  nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred
  nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION
  nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
  nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
  net: ipv6: unexport __init-annotated seg6_hmac_init()
  net: xfrm: unexport __init-annotated xfrm4_protocol_init()
  net: mdio: unexport __init-annotated mdio_bus_init()
  ...

2 years agodocs: arm: tcm: Fix typo in description of TCM and MMU usage
Simon Horman [Thu, 9 Jun 2022 18:42:30 +0000 (20:42 +0200)]
docs: arm: tcm: Fix typo in description of TCM and MMU usage

Correct a typo in the description of interaction between
the TCM and MMU.

Found by inspection.

Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220609184230.627958-1-simon.horman@corigine.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2 years agoscripts/check-local-export: avoid 'wait $!' for process substitution
Masahiro Yamada [Wed, 8 Jun 2022 01:11:00 +0000 (10:11 +0900)]
scripts/check-local-export: avoid 'wait $!' for process substitution

Bash 4.4, released in 2016, supports 'wait $!' to check the exit status
of a process substitution, but it seems too new.

Some people using older bash versions (on CentOS 7, Ubuntu 16.04, etc.)
reported an error like this:

  ./scripts/check-local-export: line 54: wait: pid 17328 is not a child of this shell

I used the process substitution to avoid a pipeline, which executes each
command in a subshell. If the while-loop is executed in the subshell
context, variable changes within are lost after the subshell terminates.

Fortunately, Bash 4.2, released in 2011, supports the 'lastpipe' option,
which makes the last element of a pipeline run in the current shell process.

Switch to the pipeline with 'lastpipe' solution, and also set 'pipefail'
to catch errors from ${NM}.

Add the bash requirement to Documentation/process/changes.rst.

Fixes: 31cb50b5590f ("kbuild: check static EXPORT_SYMBOL* by script instead of modpost")
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2 years agonetfs: gcc-12: temporarily disable '-Wattribute-warning' for now
Linus Torvalds [Thu, 9 Jun 2022 18:29:36 +0000 (11:29 -0700)]
netfs: gcc-12: temporarily disable '-Wattribute-warning' for now

This is a pure band-aid so that I can continue merging stuff from people
while some of the gcc-12 fallout gets sorted out.

In particular, gcc-12 is very unhappy about the kinds of pointer
arithmetic tricks that netfs does, and that makes the fortify checks
trigger in afs and ceph:

  In function ‘fortify_memset_chk’,
      inlined from ‘netfs_i_context_init’ at include/linux/netfs.h:327:2,
      inlined from ‘afs_set_netfs_context’ at fs/afs/inode.c:61:2,
      inlined from ‘afs_root_iget’ at fs/afs/inode.c:543:2:
  include/linux/fortify-string.h:258:25: warning: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
    258 |                         __write_overflow_field(p_size_field, size);
        |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and the reason is that netfs_i_context_init() is passed a 'struct inode'
pointer, and then it does

        struct netfs_i_context *ctx = netfs_i_context(inode);

        memset(ctx, 0, sizeof(*ctx));

where that netfs_i_context() function just does pointer arithmetic on
the inode pointer, knowing that the netfs_i_context is laid out
immediately after it in memory.

This is all truly disgusting, since the whole "netfs_i_context is laid
out immediately after it in memory" is not actually remotely true in
general, but is just made to be that way for afs and ceph.

See for example fs/cifs/cifsglob.h:

  struct cifsInodeInfo {
        struct {
                /* These must be contiguous */
                struct inode    vfs_inode;      /* the VFS's inode record */
                struct netfs_i_context netfs_ctx; /* Netfslib context */
        };
[...]

and realize that this is all entirely wrong, and the pointer arithmetic
that netfs_i_context() is doing is also very very wrong and wouldn't
give the right answer if netfs_ctx had different alignment rules from a
'struct inode', for example).

Anyway, that's just a long-winded way to say "the gcc-12 warning is
actually quite reasonable, and our code happens to work but is pretty
disgusting".

This is getting fixed properly, but for now I made the mistake of
thinking "the week right after the merge window tends to be calm for me
as people take a breather" and I did a sustem upgrade.  And I got gcc-12
as a result, so to continue merging fixes from people and not have the
end result drown in warnings, I am fixing all these gcc-12 issues I hit.

Including with these kinds of temporary fixes.

Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/all/AEEBCF5D-8402-441D-940B-105AA718C71F@chromium.org/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agogcc-12: disable '-Warray-bounds' universally for now
Linus Torvalds [Thu, 9 Jun 2022 17:11:12 +0000 (10:11 -0700)]
gcc-12: disable '-Warray-bounds' universally for now

In commit 8b202ee21839 ("s390: disable -Warray-bounds") the s390 people
disabled the '-Warray-bounds' warning for gcc-12, because the new logic
in gcc would cause warnings for their use of the S390_lowcore macro,
which accesses absolute pointers.

It turns out gcc-12 has many other issues in this area, so this takes
that s390 warning disable logic, and turns it into a kernel build config
entry instead.

Part of the intent is that we can make this all much more targeted, and
use this conflig flag to disable it in only particular configurations
that cause problems, with the s390 case as an example:

        select GCC12_NO_ARRAY_BOUNDS

and we could do that for other configuration cases that cause issues.

Or we could possibly use the CONFIG_CC_NO_ARRAY_BOUNDS thing in a more
targeted way, and disable the warning only for particular uses: again
the s390 case as an example:

  KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds)

but this ends up just doing it globally in the top-level Makefile, since
the current issues are spread fairly widely all over:

  KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds

We'll try to limit this later, since the gcc-12 problems are rare enough
that *much* of the kernel can be built with it without disabling this
warning.

Cc: Kees Cook <keescook@chromium.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomellanox: mlx5: avoid uninitialized variable warning with gcc-12
Linus Torvalds [Thu, 9 Jun 2022 17:03:28 +0000 (10:03 -0700)]
mellanox: mlx5: avoid uninitialized variable warning with gcc-12

gcc-12 started warning about 'tracker' being used uninitialized:

  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c: In function ‘mlx5_do_bond’:
  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c:786:28: warning: ‘tracker’ is used uninitialized [-Wuninitialized]
    786 |         struct lag_tracker tracker;
        |                            ^~~~~~~

which seems to be because it doesn't track how the use (and
initialization) is bound by the 'do_bond' flag.

But admittedly that 'do_bond' usage is fairly complicated, and involves
passing it around as an argument to helper functions, so it's somewhat
understandable that gcc doesn't see how that all works.

This function could be rewritten to make the use of that tracker
variable more obviously safe, but for now I'm just adding the forced
initialization of it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agogcc-12: disable '-Wdangling-pointer' warning for now
Linus Torvalds [Thu, 9 Jun 2022 16:41:42 +0000 (09:41 -0700)]
gcc-12: disable '-Wdangling-pointer' warning for now

While the concept of checking for dangling pointers to local variables
at function exit is really interesting, the gcc-12 implementation is not
compatible with reality, and results in false positives.

For example, gcc sees us putting things on a local list head allocated
on the stack, which involves exactly those kinds of pointers to the
local stack entry:

  In function ‘__list_add’,
      inlined from ‘list_add_tail’ at include/linux/list.h:102:2,
      inlined from ‘rebuild_snap_realms’ at fs/ceph/snap.c:434:2:
  include/linux/list.h:74:19: warning: storing the address of local variable ‘realm_queue’ in ‘*&realm_27(D)->rebuild_item.prev’ [-Wdangling-pointer=]
     74 |         new->prev = prev;
        |         ~~~~~~~~~~^~~~~~

But then gcc - understandably - doesn't really understand the big
picture how the doubly linked list works, so doesn't see how we then end
up emptying said list head in a loop and the pointer we added has been
removed.

Gcc also complains about us (intentionally) using this as a way to store
a kind of fake stack trace, eg

  drivers/acpi/acpica/utdebug.c:40:38: warning: storing the address of local variable ‘current_sp’ in ‘acpi_gbl_entry_stack_pointer’ [-Wdangling-pointer=]
     40 |         acpi_gbl_entry_stack_pointer = &current_sp;
        |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~

which is entirely reasonable from a compiler standpoint, and we may want
to change those kinds of patterns, but not not.

So this is one of those "it would be lovely if the compiler were to
complain about us leaving dangling pointers to the stack", but not this
way.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agodrm: imx: fix compiler warning with gcc-12
Linus Torvalds [Wed, 8 Jun 2022 23:59:29 +0000 (16:59 -0700)]
drm: imx: fix compiler warning with gcc-12

Gcc-12 correctly warned about this code using a non-NULL pointer as a
truth value:

  drivers/gpu/drm/imx/ipuv3-crtc.c: In function ‘ipu_crtc_disable_planes’:
  drivers/gpu/drm/imx/ipuv3-crtc.c:72:21: error: the comparison will always evaluate as ‘true’ for the address of ‘plane’ will never be NULL [-Werror=address]
     72 |                 if (&ipu_crtc->plane[1] && plane == &ipu_crtc->plane[1]->base)
        |                     ^

due to the extraneous '&' address-of operator.

Philipp Zabel points out that The mistake had no adverse effect since
the following condition doesn't actually dereference the NULL pointer,
but the intent of the code was obviously to check for it, not to take
the address of the member.

Fixes: eb8c88808c83 ("drm/imx: add deferred plane disabling")
Acked-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agodocs: Move the HTE documentation to driver-api/
Jonathan Corbet [Mon, 6 Jun 2022 14:40:55 +0000 (08:40 -0600)]
docs: Move the HTE documentation to driver-api/

The hardware timestamp engine documentation is driver API material, and
really belongs in the driver-API book; move it there.

Cc: Thierry Reding <treding@nvidia.com>
Acked-by: Dipen Patel <dipenp@nvidia.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2 years agodocs: usb: fix literal block marker in usbmon verification example
Justin Swartz [Sat, 4 Jun 2022 15:54:31 +0000 (17:54 +0200)]
docs: usb: fix literal block marker in usbmon verification example

The "Verify that bus sockets are present" example was not properly
formatted due to a typo in the literal block marker.

Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
Link: https://lore.kernel.org/r/20220604155431.23246-1-justin.swartz@risingedge.co.za
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2 years agoDocumentation/features: Update the arch support status files
Zheng Zengkai [Thu, 9 Jun 2022 02:56:56 +0000 (10:56 +0800)]
Documentation/features: Update the arch support status files

The arch support status files don't match reality as of v5.19-rc1,
use the features-refresh.sh to refresh all the arch-support.txt files
in place.  The main effect is to add entries for the new loong
architecture.

Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Link: https://lore.kernel.org/r/20220609025656.143460-1-zhengzengkai@huawei.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2 years agopowerpc/32: Fix overread/overwrite of thread_struct via ptrace
Michael Ellerman [Mon, 6 Jun 2022 14:34:56 +0000 (00:34 +1000)]
powerpc/32: Fix overread/overwrite of thread_struct via ptrace

The ptrace PEEKUSR/POKEUSR (aka PEEKUSER/POKEUSER) API allows a process
to read/write registers of another process.

To get/set a register, the API takes an index into an imaginary address
space called the "USER area", where the registers of the process are
laid out in some fashion.

The kernel then maps that index to a particular register in its own data
structures and gets/sets the value.

The API only allows a single machine-word to be read/written at a time.
So 4 bytes on 32-bit kernels and 8 bytes on 64-bit kernels.

The way floating point registers (FPRs) are addressed is somewhat
complicated, because double precision float values are 64-bit even on
32-bit CPUs. That means on 32-bit kernels each FPR occupies two
word-sized locations in the USER area. On 64-bit kernels each FPR
occupies one word-sized location in the USER area.

Internally the kernel stores the FPRs in an array of u64s, or if VSX is
enabled, an array of pairs of u64s where one half of each pair stores
the FPR. Which half of the pair stores the FPR depends on the kernel's
endianness.

To handle the different layouts of the FPRs depending on VSX/no-VSX and
big/little endian, the TS_FPR() macro was introduced.

Unfortunately the TS_FPR() macro does not take into account the fact
that the addressing of each FPR differs between 32-bit and 64-bit
kernels. It just takes the index into the "USER area" passed from
userspace and indexes into the fp_state.fpr array.

On 32-bit there are 64 indexes that address FPRs, but only 32 entries in
the fp_state.fpr array, meaning the user can read/write 256 bytes past
the end of the array. Because the fp_state sits in the middle of the
thread_struct there are various fields than can be overwritten,
including some pointers. As such it may be exploitable.

It has also been observed to cause systems to hang or otherwise
misbehave when using gdbserver, and is probably the root cause of this
report which could not be easily reproduced:
  https://lore.kernel.org/linuxppc-dev/dc38afe9-6b78-f3f5-666b-986939e40fc6@keymile.com/

Rather than trying to make the TS_FPR() macro even more complicated to
fix the bug, or add more macros, instead add a special-case for 32-bit
kernels. This is more obvious and hopefully avoids a similar bug
happening again in future.

Note that because 32-bit kernels never have VSX enabled the code doesn't
need to consider TS_FPRWIDTH/OFFSET at all. Add a BUILD_BUG_ON() to
ensure that 32-bit && VSX is never enabled.

Fixes: 87fec0514f61 ("powerpc: PTRACE_PEEKUSR/PTRACE_POKEUSER of FPR registers in little endian builds")
Cc: stable@vger.kernel.org # v3.13+
Reported-by: Ariel Miculas <ariel.miculas@belden.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220609133245.573565-1-mpe@ellerman.id.au
2 years agoMerge tag 'intel-gpio-v5.19-2' of gitolite.kernel.org:pub/scm/linux/kernel/git/andy...
Bartosz Golaszewski [Thu, 9 Jun 2022 09:07:54 +0000 (11:07 +0200)]
Merge tag 'intel-gpio-v5.19-2' of gitolite.pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-current

intel-gpio for v5.19-2

* Convert IRQ chips in Diolan and Intel GPIO drivers to be immutable

2 years agodrm/ast: Support multiple outputs
Thomas Zimmermann [Tue, 7 Jun 2022 09:20:04 +0000 (11:20 +0200)]
drm/ast: Support multiple outputs

Systems with AST graphics can have multiple output; typically VGA
plus some other port. Record detected output chips in a bitmask and
initialize each output on its own.

Assume a VGA output by default and use SIL164 and DP501 if available.
For ASTDP assume that it can run in parallel with VGA.

Tested on AST2100.

v3:
* define a macro for each BIT(ast_tx_chip) (Patrik)
v2:
* make VGA/SIL164/DP501 mutually exclusive

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Fixes: a59b026419f3 ("drm/ast: Initialize encoder and connector for VGA in helper function")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220607092008.22123-2-tzimmermann@suse.de
(cherry picked from commit 7f35680ada234ce00828b8ea841ba7ca1e00ff52)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2 years agoMerge tag 'amd-drm-fixes-5.19-2022-06-08' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Thu, 9 Jun 2022 07:22:48 +0000 (17:22 +1000)]
Merge tag 'amd-drm-fixes-5.19-2022-06-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.19-2022-06-08:

amdgpu:
- DCN 3.1 golden settings fix
- eDP fixes
- DMCUB fixes
- GFX11 fixes and cleanups
- VCN fix for yellow carp
- GMC11 fixes
- RAS fixes
- GPUVM TLB flush fixes
- SMU13 fixes
- VCN3 AV1 regression fix
- VCN2 JPEG fix
- Other misc fixes

amdkfd:
- MMU notifier fix
- Support for more GC 10.3.x families
- Pinned BO handling fix
- Partial migration bug fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608203008.6187-1-alexander.deucher@amd.com
2 years agovdpa: make get_vq_group and set_group_asid optional
Jason Wang [Thu, 9 Jun 2022 04:19:01 +0000 (12:19 +0800)]
vdpa: make get_vq_group and set_group_asid optional

This patch makes get_vq_group and set_group_asid optional. This is
needed to unbreak the vDPA parent that doesn't support multiple
address spaces.

Cc: Gautam Dawar <gautam.dawar@xilinx.com>
Fixes: aaca8373c4b1 ("vhost-vdpa: support ASID based IOTLB API")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220609041901.2029-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovirtio: Fix all occurences of the "the the" typo
Bo Liu [Thu, 9 Jun 2022 03:11:06 +0000 (23:11 -0400)]
virtio: Fix all occurences of the "the the" typo

There are double "the" in message in file virtio_mmio.c
and virtio_pci_modern_dev.c, fix it.

Signed-off-by: Bo Liu <liubo03@inspur.com>
Message-Id: <20220609031106.2161-1-liubo03@inspur.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agonet: amd-xgbe: fix clang -Wformat warning
Justin Stitt [Tue, 7 Jun 2022 19:11:19 +0000 (12:11 -0700)]
net: amd-xgbe: fix clang -Wformat warning

see warning:
| drivers/net/ethernet/amd/xgbe/xgbe-drv.c:2787:43: warning: format specifies
| type 'unsigned short' but the argument has type 'int' [-Wformat]
|        netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
|                                      ~~~~~~     ^~~~~~~~~~~~~~~~~~~

Variadic functions (printf-like) undergo default argument promotion.
Documentation/core-api/printk-formats.rst specifically recommends
using the promoted-to-type's format flag.

Also, as per C11 6.3.1.1:
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
`If an int can represent all values of the original type ..., the
value is converted to an int; otherwise, it is converted to an
unsigned int. These are called the integer promotions.`

Since the argument is a u16 it will get promoted to an int and thus it is
most accurate to use the %x format specifier here. It should be noted that the
`#06` formatting sugar does not alter the promotion rules.

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <jstitt007@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220607191119.20686-1-jstitt007@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: use alloc_large_system_hash() to allocate table_perturb
Muchun Song [Tue, 7 Jun 2022 07:02:14 +0000 (15:02 +0800)]
tcp: use alloc_large_system_hash() to allocate table_perturb

In our server, there may be no high order (>= 6) memory since we reserve
lots of HugeTLB pages when booting.  Then the system panic.  So use
alloc_large_system_hash() to allocate table_perturb.

Fixes: e9261476184b ("tcp: dynamically allocate the perturb table used by source ports")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220607070214.94443-1-songmuchun@bytedance.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY
Alvin Šipraga [Tue, 7 Jun 2022 18:46:24 +0000 (20:46 +0200)]
net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY

Since commit a18e6521a7d9 ("net: phylink: handle NA interface mode in
phylink_fwnode_phy_connect()"), phylib defaults to GMII when no phy-mode
or phy-connection-type property is specified in a DSA port node of the
device tree. The same commit caused a regression in rtl8365mb whereby
phylink would fail to connect, because the driver did not advertise
support for GMII for ports with internal PHY.

It should be noted that the aforementioned regression is not because the
blamed commit was incorrect: on the contrary, the blamed commit is
correcting the previous behaviour whereby unspecified phy-mode would
cause the internal interface mode to be PHY_INTERFACE_MODE_NA. The
rtl8365mb driver only worked by accident before because it _did_
advertise support for PHY_INTERFACE_MODE_NA, despite NA being reserved
for internal use by phylink. With one mistake fixed, the other was
exposed.

Commit a5dba0f207e5 ("net: dsa: rtl8365mb: add GMII as user port mode")
then introduced implicit support for GMII mode on ports with internal
PHY to allow a PHY connection for device trees where the phy-mode is not
explicitly set to "internal". At this point everything was working OK
again.

Subsequently, commit 6ff6064605e9 ("net: dsa: realtek: convert to
phylink_generic_validate()") broke this behaviour again by discarding
the usage of rtl8365mb_phy_mode_supported() - where this GMII support
was indicated - while switching to the new .phylink_get_caps API.

With the new API, rtl8365mb_phy_mode_supported() is no longer needed.
Remove it altogether and add back the GMII capability - this time to
rtl8365mb_phylink_get_caps() - so that the above default behaviour works
for ports with internal PHY again.

Fixes: 6ff6064605e9 ("net: dsa: realtek: convert to phylink_generic_validate()")
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220607184624.417641-1-alvin@pqrs.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Jakub Kicinski [Thu, 9 Jun 2022 04:02:22 +0000 (21:02 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-07

This series contains updates to ixgbe driver only.

Olivier Matz resolves an issue so that broadcast packets can still be
received when VF removes promiscuous settings and removes setting of
VLAN promiscuous, in promiscuous mode, to prevent a loop when VFs are
bridged.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ixgbe: fix unexpected VLAN Rx in promisc mode on VF
  ixgbe: fix bcast packets Rx on VF after promisc removal
====================

Link: https://lore.kernel.org/r/20220607181538.748786-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mv88e6xxx-fixes-for-reading-serdes-state'
Jakub Kicinski [Thu, 9 Jun 2022 03:58:32 +0000 (20:58 -0700)]
Merge branch 'mv88e6xxx-fixes-for-reading-serdes-state'

Russell King says:

====================
mv88e6xxx: fixes for reading serdes state

These are some low-priority fixes to the mv88e6xxx serdes code.
Patch 1 fixes the reporting of an_complete, which is used in the
emulation of a conventional C22 PHY. Patch from Marek.

Patch 2 makes one of the error messages in patch 2 to be consistent
with the other error messages in this function.

Patch 3 ensures that we do not miss a link-failure event.
====================

Link: https://lore.kernel.org/r/Yp82TyoLon9jz6k3@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: mv88e6xxx: correctly report serdes link failure
Russell King (Oracle) [Tue, 7 Jun 2022 11:28:52 +0000 (12:28 +0100)]
net: dsa: mv88e6xxx: correctly report serdes link failure

Phylink wants to know if the link has dropped since the last time state
was retrieved, and the BMSR gives us that. Read the BMSR and use it when
deciding the link state. Fill in the an_complete member as well for the
emulated PHY state.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: mv88e6xxx: fix BMSR error to be consistent with others
Russell King (Oracle) [Tue, 7 Jun 2022 11:28:47 +0000 (12:28 +0100)]
net: dsa: mv88e6xxx: fix BMSR error to be consistent with others

Other errors accessing the registers in mv88e6352_serdes_pcs_get_state()
print "PHY " before the register name, except for the BMSR. Make this
consistent with the other error messages.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete
Marek Behún [Tue, 7 Jun 2022 11:28:42 +0000 (12:28 +0100)]
net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete

Commit ede359d8843a ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN
is bypassed") added the ability to link if AN was bypassed, and added
filling of state->an_complete field, but set it to true if AN was
enabled in BMCR, not when AN was reported complete in BMSR.

This was done because for some reason, when I wanted to use BMSR value
to infer an_complete, I was looking at BMSR_ANEGCAPABLE bit (which was
always 1), instead of BMSR_ANEGCOMPLETE bit.

Use BMSR_ANEGCOMPLETE for filling state->an_complete.

Fixes: ede359d8843a ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: altera: Fix refcount leak in altera_tse_mdio_create
Miaoqian Lin [Tue, 7 Jun 2022 04:11:43 +0000 (08:11 +0400)]
net: altera: Fix refcount leak in altera_tse_mdio_create

Every iteration of for_each_child_of_node() decrements
the reference count of the previous node.
When break from a for_each_child_of_node() loop,
we need to explicitly call of_node_put() on the child node when
not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: bbd2190ce96d ("Altera TSE: Add main and header file for Altera Ethernet Driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220607041144.7553-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: openvswitch: fix misuse of the cached connection on tuple changes
Ilya Maximets [Mon, 6 Jun 2022 22:11:40 +0000 (00:11 +0200)]
net: openvswitch: fix misuse of the cached connection on tuple changes

If packet headers changed, the cached nfct is no longer relevant
for the packet and attempt to re-use it leads to the incorrect packet
classification.

This issue is causing broken connectivity in OpenStack deployments
with OVS/OVN due to hairpin traffic being unexpectedly dropped.

The setup has datapath flows with several conntrack actions and tuple
changes between them:

  actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
          set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
          set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
          ct(zone=8),recirc(0x4)

After the first ct() action the packet headers are almost fully
re-written.  The next ct() tries to re-use the existing nfct entry
and marks the packet as invalid, so it gets dropped later in the
pipeline.

Clearing the cached conntrack entry whenever packet tuple is changed
to avoid the issue.

The flow key should not be cleared though, because we should still
be able to match on the ct_state if the recirculation happens after
the tuple change but before the next ct() action.

Cc: stable@vger.kernel.org
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Reported-by: Frode Nordahl <frode.nordahl@canonical.com>
Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
Chen Lin [Wed, 8 Jun 2022 12:46:53 +0000 (20:46 +0800)]
net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag

When rx_flag == MTK_RX_FLAGS_HWLRO,
rx_data_len = MTK_MAX_LRO_RX_LENGTH(4096 * 3) > PAGE_SIZE.
netdev_alloc_frag is for alloction of page fragment only.
Reference to other drivers and Documentation/vm/page_frags.rst

Branch to use __get_free_pages when ring->frag_size > PAGE_SIZE.

Signed-off-by: Chen Lin <chen45464546@163.com>
Link: https://lore.kernel.org/r/1654692413-2598-1-git-send-email-chen45464546@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoip_gre: test csum_start instead of transport header
Willem de Bruijn [Mon, 6 Jun 2022 13:21:07 +0000 (09:21 -0400)]
ip_gre: test csum_start instead of transport header

GRE with TUNNEL_CSUM will apply local checksum offload on
CHECKSUM_PARTIAL packets.

ipgre_xmit must validate csum_start after an optional skb_pull,
else lco_csum may trigger an overflow. The original check was

if (csum && skb_checksum_start(skb) < skb->data)
return -EINVAL;

This had false positives when skb_checksum_start is undefined:
when ip_summed is not CHECKSUM_PARTIAL. A discussed refinement
was straightforward

if (csum && skb->ip_summed == CHECKSUM_PARTIAL &&
    skb_checksum_start(skb) < skb->data)
return -EINVAL;

But was eventually revised more thoroughly:
- restrict the check to the only branch where needed, in an
  uncommon GRE path that uses header_ops and calls skb_pull.
- test skb_transport_header, which is set along with csum_start
  in skb_partial_csum_set in the normal header_ops datapath.

Turns out skbs can arrive in this branch without the transport
header set, e.g., through BPF redirection.

Revise the check back to check csum_start directly, and only if
CHECKSUM_PARTIAL. Do leave the check in the updated location.
Check field regardless of whether TUNNEL_CSUM is configured.

Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
Link: https://lore.kernel.org/all/20210902193447.94039-2-willemdebruijn.kernel@gmail.com/T/#u
Fixes: 8a0ed250f911 ("ip_gre: validate csum_start only on pull")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20220606132107.3582565-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Thu, 9 Jun 2022 03:31:21 +0000 (20:31 -0700)]
Merge https://git./linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2022-06-09

We've added 6 non-merge commits during the last 2 day(s) which contain
a total of 8 files changed, 49 insertions(+), 15 deletions(-).

The main changes are:

1) Fix an illegal copy_to_user() attempt seen by syzkaller through arm64
   BPF JIT compiler, from Eric Dumazet.

2) Fix calling global functions from BPF_PROG_TYPE_EXT programs by using
   the correct program context type, from Toke Høiland-Jørgensen.

3) Fix XSK TX batching invalid descriptor handling, from Maciej Fijalkowski.

4) Fix potential integer overflows in multi-kprobe link code by using safer
   kvmalloc_array() allocation helpers, from Dan Carpenter.

5) Add Quentin as bpftool maintainer, from Quentin Monnet.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  MAINTAINERS: Add a maintainer for bpftool
  xsk: Fix handling of invalid descriptors in XSK TX batching API
  selftests/bpf: Add selftest for calling global functions from freplace
  bpf: Fix calling global functions from BPF_PROG_TYPE_EXT programs
  bpf: Use safer kvmalloc_array() where possible
  bpf, arm64: Clear prog->jited_len along prog->jited
====================

Link: https://lore.kernel.org/r/20220608234133.32265-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMAINTAINERS: add ATA sysfs file documentation to libata entry
Sergey Shtylyov [Wed, 8 Jun 2022 20:37:09 +0000 (23:37 +0300)]
MAINTAINERS: add ATA sysfs file documentation to libata entry

Add the (still missing!) ATA sysfs file documentation to the libata
subsystem entry in the MAINTAINERS file.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2 years agoata: libata-transport: fix {dma|pio|xfer}_mode sysfs files
Sergey Shtylyov [Wed, 8 Jun 2022 19:51:07 +0000 (22:51 +0300)]
ata: libata-transport: fix {dma|pio|xfer}_mode sysfs files

The {dma|pio}_mode sysfs files are incorrectly documented as having a
list of the supported DMA/PIO transfer modes, while the corresponding
fields of the *struct* ata_device hold the transfer mode IDs, not masks.

To match these docs, the {dma|pio}_mode (and even xfer_mode!) sysfs
files are handled by the ata_bitfield_name_match() macro which leads to
reading such kind of nonsense from them:

$ cat /sys/class/ata_device/dev3.0/pio_mode
XFER_UDMA_7, XFER_UDMA_6, XFER_UDMA_5, XFER_UDMA_4, XFER_MW_DMA_4,
XFER_PIO_6, XFER_PIO_5, XFER_PIO_4, XFER_PIO_3, XFER_PIO_2, XFER_PIO_1,
XFER_PIO_0

Using the correct ata_bitfield_name_search() macro fixes that:

$ cat /sys/class/ata_device/dev3.0/pio_mode
XFER_PIO_4

While fixing the file documentation, somewhat reword the {dma|pio}_mode
file doc and add a note about being mostly useful for PATA devices to
the xfer_mode file doc...

Fixes: d9027470b886 ("[libata] Add ATA transport class")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2 years agocert host tools: Stop complaining about deprecated OpenSSL functions
Linus Torvalds [Wed, 8 Jun 2022 20:18:39 +0000 (13:18 -0700)]
cert host tools: Stop complaining about deprecated OpenSSL functions

OpenSSL 3.0 deprecated the OpenSSL's ENGINE API.  That is as may be, but
the kernel build host tools still use it.  Disable the warning about
deprecated declarations until somebody who cares fixes it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agodrm/amdgpu/mes: only invalid/prime icache when finish loading both pipe MES FWs.
Yifan Zhang [Fri, 3 Jun 2022 02:24:31 +0000 (10:24 +0800)]
drm/amdgpu/mes: only invalid/prime icache when finish loading both pipe MES FWs.

invalid/prime icahce operation takes effect both pipes cuconrrently,
therefore CP_MES_IC_BASE_LO/HI and CP_MES_MDBASE_LO/HI both have to be
set before prime icache. Otherwise MES hardware gets garbage data in
above regsters and causes page fault

[  470.873200] amdgpu 0000:33:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:217 vmid:0 pasid:0, for process  pid 0 thread  pid 0)
[  470.873222] amdgpu 0000:33:00.0: amdgpu:   in page starting at address 0x000092cb89b00000 from client 10
[  470.873234] amdgpu 0000:33:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000BB3
[  470.873242] amdgpu 0000:33:00.0: amdgpu:      Faulty UTCL2 client ID: CPC (0x5)
[  470.873247] amdgpu 0000:33:00.0: amdgpu:      MORE_FAULTS: 0x1
[  470.873251] amdgpu 0000:33:00.0: amdgpu:      WALKER_ERROR: 0x1
[  470.873256] amdgpu 0000:33:00.0: amdgpu:      PERMISSION_FAULTS: 0xb
[  470.873260] amdgpu 0000:33:00.0: amdgpu:      MAPPING_ERROR: 0x1
[  470.873264] amdgpu 0000:33:00.0: amdgpu:      RW: 0x0

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 years agonet/mlx5: fs, fail conflicting actions
Mark Bloch [Mon, 30 May 2022 07:46:59 +0000 (10:46 +0300)]
net/mlx5: fs, fail conflicting actions

When combining two steering rules into one check
not only do they share the same actions but those
actions are also the same. This resolves an issue where
when creating two different rules with the same match
the actions are overwritten and one of the rules is deleted
a FW syndrome can be seen in dmesg.

mlx5_core 0000:03:00.0: mlx5_cmd_check:819:(pid 2105): DEALLOC_MODIFY_HEADER_CONTEXT(0x941) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1ab444)

Fixes: 0d235c3fabb7 ("net/mlx5: Add hash table to search FTEs in a flow-group")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Rearm the FW tracer after each tracer event
Feras Daoud [Sat, 19 Mar 2022 19:47:48 +0000 (21:47 +0200)]
net/mlx5: Rearm the FW tracer after each tracer event

The current design does not arm the tracer if traces are available before
the tracer string database is fully loaded, leading to an unfunctional tracer.
This fix will rearm the tracer every time the FW triggers tracer event
regardless of the tracer strings database status.

Fixes: c71ad41ccb0c ("net/mlx5: FW tracer, events handling")
Signed-off-by: Feras Daoud <ferasda@nvidia.com>
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: E-Switch, pair only capable devices
Mark Bloch [Thu, 26 May 2022 05:15:28 +0000 (08:15 +0300)]
net/mlx5: E-Switch, pair only capable devices

OFFLOADS paring using devcom is possible only on devices
that support LAG. Filter based on lag capabilities.

This fixes an issue where mlx5_get_next_phys_dev() was
called without holding the interface lock.

This issue was found when commit
bc4c2f2e0179 ("net/mlx5: Lag, filter non compatible devices")
added an assert that verifies the interface lock is held.

WARNING: CPU: 9 PID: 1706 at drivers/net/ethernet/mellanox/mlx5/core/dev.c:642 mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm ib_uverbs ib_core overlay fuse [last unloaded: mlx5_core]
CPU: 9 PID: 1706 Comm: devlink Not tainted 5.18.0-rc7+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
Code: 02 00 75 48 48 8b 85 80 04 00 00 5d c3 31 c0 5d c3 be ff ff ff ff 48 c7 c7 08 41 5b a0 e8 36 87 28 e3 85 c0 0f 85 6f ff ff ff <0f> 0b e9 68 ff ff ff 48 c7 c7 0c 91 cc 84 e8 cb 36 6f e1 e9 4d ff
RSP: 0018:ffff88811bf47458 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88811b398000 RCX: 0000000000000001
RDX: 0000000080000000 RSI: ffffffffa05b4108 RDI: ffff88812daaaa78
RBP: ffff88812d050380 R08: 0000000000000001 R09: ffff88811d6b3437
R10: 0000000000000001 R11: 00000000fddd3581 R12: ffff88815238c000
R13: ffff88812d050380 R14: ffff8881018aa7e0 R15: ffff88811d6b3428
FS:  00007fc82e18ae80(0000) GS:ffff88842e080000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9630d1b421 CR3: 0000000149802004 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 mlx5_esw_offloads_devcom_event+0x99/0x3b0 [mlx5_core]
 mlx5_devcom_send_event+0x167/0x1d0 [mlx5_core]
 esw_offloads_enable+0x1153/0x1500 [mlx5_core]
 ? mlx5_esw_offloads_controller_valid+0x170/0x170 [mlx5_core]
 ? wait_for_completion_io_timeout+0x20/0x20
 ? mlx5_rescan_drivers_locked+0x318/0x810 [mlx5_core]
 mlx5_eswitch_enable_locked+0x586/0xc50 [mlx5_core]
 ? mlx5_eswitch_disable_pf_vf_vports+0x1d0/0x1d0 [mlx5_core]
 ? mlx5_esw_try_lock+0x1b/0xb0 [mlx5_core]
 ? mlx5_eswitch_enable+0x270/0x270 [mlx5_core]
 ? __debugfs_create_file+0x260/0x3e0
 mlx5_devlink_eswitch_mode_set+0x27e/0x870 [mlx5_core]
 ? mutex_lock_io_nested+0x12c0/0x12c0
 ? esw_offloads_disable+0x250/0x250 [mlx5_core]
 ? devlink_nl_cmd_trap_get_dumpit+0x470/0x470
 ? rcu_read_lock_sched_held+0x3f/0x70
 devlink_nl_cmd_eswitch_set_doit+0x217/0x620

Fixes: dd3fddb82780 ("net/mlx5: E-Switch, handle devcom events only for ports on the same device")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules
Paul Blakey [Tue, 29 Mar 2022 15:37:18 +0000 (18:37 +0300)]
net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules

CT cleanup assumes that all tc rules were deleted first, and so
is free to delete the CT shared resources (e.g the dr_action
fwd_action which is shared for all tuples). But currently for
uplink, this is happens in reverse, causing the below trace.

CT cleanup is called from:
mlx5e_cleanup_rep_tx()->mlx5e_cleanup_uplink_rep_tx()->
mlx5e_rep_tc_cleanup()->mlx5e_tc_esw_cleanup()->
mlx5_tc_ct_clean()

Only afterwards, tc cleanup is called from:
mlx5e_cleanup_rep_tx()->mlx5e_tc_ht_cleanup()
which would have deleted all the tc ct rules, and so delete
all the offloaded tuples.

Fix this reversing the order of init and on cleanup, which
will result in tc cleanup then ct cleanup.

[ 9443.593347] WARNING: CPU: 2 PID: 206774 at drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1882 mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
[ 9443.593349] Modules linked in: act_ct nf_flow_table rdma_ucm(O) rdma_cm(O) iw_cm(O) ib_ipoib(O) ib_cm(O) ib_umad(O) mlx5_core(O-) mlxfw(O) mlxdevm(O) auxiliary(O) ib_uverbs(O) psample ib_core(O) mlx_compat(O) ip_gre gre ip_tunnel act_vlan bonding geneve esp6_offload esp6 esp4_offload esp4 act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel act_mirred act_skbedit act_gact cls_flower sch_ingress nfnetlink_cttimeout nfnetlink xfrm_user xfrm_algo 8021q garp stp ipmi_devintf mrp ipmi_msghandler llc openvswitch nsh nf_conncount nf_nat mst_pciconf(O) dm_multipath sbsa_gwdt uio_pdrv_genirq uio mlxbf_pmc mlxbf_pka mlx_trio mlx_bootctl(O) bluefield_edac sch_fq_codel ip_tables ipv6 crc_ccitt btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 crct10dif_ce i2c_mlxbf gpio_mlxbf2 mlxbf_gige aes_neon_bs aes_neon_blk [last unloaded: mlx5_ib]
[ 9443.593419] CPU: 2 PID: 206774 Comm: modprobe Tainted: G           O      5.4.0-1023.24.gc14613d-bluefield #1
[ 9443.593422] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:143ebaf Jan 11 2022
[ 9443.593424] pstate: 20000005 (nzCv daif -PAN -UAO)
[ 9443.593489] pc : mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
[ 9443.593545] lr : mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core]
[ 9443.593546] sp : ffff8000135dbab0
[ 9443.593548] x29: ffff8000135dbab0 x28: ffff0003a6ab8e80
[ 9443.593550] x27: 0000000000000000 x26: ffff0003e07d7000
[ 9443.593552] x25: ffff800009609de0 x24: ffff000397fb2120
[ 9443.593554] x23: ffff0003975c0000 x22: 0000000000000000
[ 9443.593556] x21: ffff0003975f08c0 x20: ffff800009609de0
[ 9443.593558] x19: ffff0003c8a13380 x18: 0000000000000014
[ 9443.593560] x17: 0000000067f5f125 x16: 000000006529c620
[ 9443.593561] x15: 000000000000000b x14: 0000000000000000
[ 9443.593563] x13: 0000000000000002 x12: 0000000000000001
[ 9443.593565] x11: ffff800011108868 x10: 0000000000000000
[ 9443.593567] x9 : 0000000000000000 x8 : ffff8000117fb270
[ 9443.593569] x7 : ffff0003ebc01288 x6 : 0000000000000000
[ 9443.593571] x5 : ffff800009591ab8 x4 : fffffe000f6d9a20
[ 9443.593572] x3 : 0000000080040001 x2 : fffffe000f6d9a20
[ 9443.593574] x1 : ffff8000095901d8 x0 : 0000000000000025
[ 9443.593577] Call trace:
[ 9443.593634]  mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
[ 9443.593688]  mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core]
[ 9443.593743]  mlx5_tc_ct_clean+0x34/0xa8 [mlx5_core]
[ 9443.593797]  mlx5e_tc_esw_cleanup+0x58/0x88 [mlx5_core]
[ 9443.593851]  mlx5e_rep_tc_cleanup+0x24/0x30 [mlx5_core]
[ 9443.593905]  mlx5e_cleanup_rep_tx+0x6c/0x78 [mlx5_core]
[ 9443.593959]  mlx5e_detach_netdev+0x74/0x98 [mlx5_core]
[ 9443.594013]  mlx5e_netdev_change_profile+0x70/0x180 [mlx5_core]
[ 9443.594067]  mlx5e_netdev_attach_nic_profile+0x34/0x40 [mlx5_core]
[ 9443.594122]  mlx5e_vport_rep_unload+0x15c/0x1a8 [mlx5_core]
[ 9443.594177]  mlx5_eswitch_unregister_vport_reps+0x228/0x298 [mlx5_core]
[ 9443.594231]  mlx5e_rep_remove+0x2c/0x38 [mlx5_core]
[ 9443.594236]  auxiliary_bus_remove+0x30/0x50 [auxiliary]
[ 9443.594246]  device_release_driver_internal+0x108/0x1d0
[ 9443.594248]  driver_detach+0x5c/0xe8
[ 9443.594250]  bus_remove_driver+0x64/0xd8
[ 9443.594253]  driver_unregister+0x38/0x60
[ 9443.594255]  auxiliary_driver_unregister+0x24/0x38 [auxiliary]
[ 9443.594311]  mlx5e_rep_cleanup+0x20/0x38 [mlx5_core]
[ 9443.594365]  mlx5e_cleanup+0x18/0x30 [mlx5_core]
[ 9443.594419]  cleanup+0xc/0x20cc [mlx5_core]
[ 9443.594424]  __arm64_sys_delete_module+0x154/0x2b0
[ 9443.594429]  el0_svc_common.constprop.0+0xf4/0x200
[ 9443.594432]  el0_svc_handler+0x38/0xa8
[ 9443.594435]  el0_svc+0x10/0x26c

Fixes: d1a3138f7913 ("net/mlx5e: TC, Move flow hashtable to be per rep")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agoRevert "net/mlx5e: Allow relaxed ordering over VFs"
Saeed Mahameed [Fri, 3 Jun 2022 21:33:03 +0000 (14:33 -0700)]
Revert "net/mlx5e: Allow relaxed ordering over VFs"

FW is not ready, fix was sent too soon.
This reverts commit f05ec8d9d0d62367b6e1f2cb50d7d2a45e7747cf.

Fixes: f05ec8d9d0d6 ("net/mlx5e: Allow relaxed ordering over VFs")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agoMAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal
Lukas Bulwahn [Wed, 1 Jun 2022 04:57:38 +0000 (06:57 +0200)]
MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal

Commit 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support") removes all
files in the directory drivers/net/ethernet/mellanox/mlx5/core/accel/, but
misses to adjust its reference in MAINTAINERS.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference.

Remove the file entry to the removed directory in MELLANOX ETHERNET INNOVA
DRIVERS.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agoau1000_eth: stop using virt_to_bus()
Arnd Bergmann [Tue, 7 Jun 2022 09:01:46 +0000 (11:01 +0200)]
au1000_eth: stop using virt_to_bus()

The conversion to the dma-mapping API in linux-2.6.11 was incomplete
and left a virt_to_bus() call around. There have been a number of
fixes for DMA mapping API abuse in this driver, but this one always
slipped through.

Change it to just use the existing dma_addr_t pointer, and make it
use the correct types throughout the driver to make it easier to
understand the virtual vs dma address spaces.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Manuel Lauss <manuel.lauss@gmail.com>
Link: https://lore.kernel.org/r/20220607090206.19830-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoblock: remove bioset_init_from_src
Christoph Hellwig [Wed, 8 Jun 2022 06:34:07 +0000 (08:34 +0200)]
block: remove bioset_init_from_src

Unused now, and the interface never really made a whole lot of sense to
start with.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2 years agodm: fix bio_set allocation
Christoph Hellwig [Wed, 8 Jun 2022 06:34:06 +0000 (08:34 +0200)]
dm: fix bio_set allocation

The use of bioset_init_from_src mean that the pre-allocated pools weren't
used for anything except parameter passing, and the integrity pool
creation got completely lost for the actual live mapped_device.  Fix that
by assigning the actual preallocated dm_md_mempools to the mapped_device
and using that for I/O instead of creating new mempools.

Fixes: 2a2a4c510b76 ("dm: use bioset_init_from_src() to copy bio_set")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2 years agoipv6: Fix signed integer overflow in l2tp_ip6_sendmsg
Wang Yufen [Tue, 7 Jun 2022 12:00:28 +0000 (20:00 +0800)]
ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg

When len >= INT_MAX - transhdrlen, ulen = len + transhdrlen will be
overflow. To fix, we can follow what udpv6 does and subtract the
transhdrlen from the max.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/20220607120028.845916-2-wangyufen@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: Fix signed integer overflow in __ip6_append_data
Wang Yufen [Tue, 7 Jun 2022 12:00:27 +0000 (20:00 +0800)]
ipv6: Fix signed integer overflow in __ip6_append_data

Resurrect ubsan overflow checks and ubsan report this warning,
fix it by change the variable [length] type to size_t.

UBSAN: signed-integer-overflow in net/ipv6/ip6_output.c:1489:19
2147479552 + 8567 cannot be represented in type 'int'
CPU: 0 PID: 253 Comm: err Not tainted 5.16.0+ #1
Hardware name: linux,dummy-virt (DT)
Call trace:
  dump_backtrace+0x214/0x230
  show_stack+0x30/0x78
  dump_stack_lvl+0xf8/0x118
  dump_stack+0x18/0x30
  ubsan_epilogue+0x18/0x60
  handle_overflow+0xd0/0xf0
  __ubsan_handle_add_overflow+0x34/0x44
  __ip6_append_data.isra.48+0x1598/0x1688
  ip6_append_data+0x128/0x260
  udpv6_sendmsg+0x680/0xdd0
  inet6_sendmsg+0x54/0x90
  sock_sendmsg+0x70/0x88
  ____sys_sendmsg+0xe8/0x368
  ___sys_sendmsg+0x98/0xe0
  __sys_sendmmsg+0xf4/0x3b8
  __arm64_sys_sendmmsg+0x34/0x48
  invoke_syscall+0x64/0x160
  el0_svc_common.constprop.4+0x124/0x300
  do_el0_svc+0x44/0xc8
  el0_svc+0x3c/0x1e8
  el0t_64_sync_handler+0x88/0xb0
  el0t_64_sync+0x16c/0x170

Changes since v1:
-Change the variable [length] type to unsigned, as Eric Dumazet suggested.
Changes since v2:
-Don't change exthdrlen type in ip6_make_skb, as Paolo Abeni suggested.
Changes since v3:
-Don't change ulen type in udpv6_sendmsg and l2tp_ip6_sendmsg, as
Jakub Kicinski suggested.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/20220607120028.845916-1-wangyufen@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoarm64/sme: Fix SVE/SME typo in ABI documentation
Mark Brown [Wed, 8 Jun 2022 11:59:15 +0000 (12:59 +0100)]
arm64/sme: Fix SVE/SME typo in ABI documentation

Fix a cut'n'paste error.

Reported-by: Luis Machado <luis.machado@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220608115915.251870-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2 years agoarm64/sme: Fix tests for 0b1111 value ID registers
Mark Brown [Tue, 7 Jun 2022 16:51:28 +0000 (17:51 +0100)]
arm64/sme: Fix tests for 0b1111 value ID registers

For both ID_AA64SMFR0_EL1.I16I64 and ID_AA64SMFR0_EL1.I8I32 we check for
the presence of the feature by looking for a specific ID value of 0x4 but
should instead be checking for the value 0xf defined by the architecture.

This had no practical effect since we are looking for values >= our define
and the only valid values in the architecture are 0b0000 and 0b1111 so we
would detect things appropriately with the architecture as it stands even
with the incorrect defines.

Signed-off-by: Mark Brown <broonie@kernel.org>
Fixes: b4adc83b0770 ("arm64/sme: System register and exception syndrome definitions")
Link: https://lore.kernel.org/r/20220607165128.2833157-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2 years agonfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred
Xiaohui Zhang [Tue, 7 Jun 2022 08:32:30 +0000 (16:32 +0800)]
nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred

Similar to the handling of play_deferred in commit 19cfe912c37b
("Bluetooth: btusb: Fix memory leak in play_deferred"), we thought
a patch might be needed here as well.

Currently usb_submit_urb is called directly to submit deferred tx
urbs after unanchor them.

So the usb_giveback_urb_bh would failed to unref it in usb_unanchor_urb
and cause memory leak.

Put those urbs in tx_anchor to avoid the leak, and also fix the error
handling.

Signed-off-by: Xiaohui Zhang <xiaohuizhang@ruc.edu.cn>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220607083230.6182-1-xiaohuizhang@ruc.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'split-nfc-st21nfca-refactor-evt_transaction-into-3'
Jakub Kicinski [Wed, 8 Jun 2022 17:17:30 +0000 (10:17 -0700)]
Merge branch 'split-nfc-st21nfca-refactor-evt_transaction-into-3'

Martin Faltesek says:

====================
Split "nfc: st21nfca: Refactor EVT_TRANSACTION" into 3

v2: https://lore.kernel.org/netdev/20220401180939.2025819-1-mfaltesek@google.com/
v1: https://lore.kernel.org/netdev/20220329175431.3175472-1-mfaltesek@google.com/
====================

Link: https://lore.kernel.org/r/20220607025729.1673212-1-mfaltesek@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION
Martin Faltesek [Tue, 7 Jun 2022 02:57:29 +0000 (21:57 -0500)]
nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION

The transaction buffer is allocated by using the size of the packet buf,
and subtracting two which seem intended to remove the two tags which are
not present in the target structure. This calculation leads to under
counting memory because of differences between the packet contents and the
target structure. The aid_len field is a u8 in the packet, but a u32 in
the structure, resulting in at least 3 bytes always being under counted.
Further, the aid data is a variable length field in the packet, but fixed
in the structure, so if this field is less than the max, the difference is
added to the under counting.

The last validation check for transaction->params_len is also incorrect
since it employs the same accounting error.

To fix, perform validation checks progressively to safely reach the
next field, to determine the size of both buffers and verify both tags.
Once all validation checks pass, allocate the buffer and copy the data.
This eliminates freeing memory on the error path, as those checks are
moved ahead of memory allocation.

Fixes: 26fc6c7f02cb ("NFC: st21nfca: Add HCI transaction event support")
Fixes: 4fbcc1a4cb20 ("nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Faltesek <mfaltesek@google.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
Martin Faltesek [Tue, 7 Jun 2022 02:57:28 +0000 (21:57 -0500)]
nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling

Error paths do not free previously allocated memory. Add devm_kfree() to
those failure paths.

Fixes: 26fc6c7f02cb ("NFC: st21nfca: Add HCI transaction event support")
Fixes: 4fbcc1a4cb20 ("nfc: st21nfca: Fix potential buffer overflows in EVT_TRANSACTION")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Faltesek <mfaltesek@google.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
Martin Faltesek [Tue, 7 Jun 2022 02:57:27 +0000 (21:57 -0500)]
nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION

The first validation check for EVT_TRANSACTION has two different checks
tied together with logical AND. One is a check for minimum packet length,
and the other is for a valid aid_tag. If either condition is true (fails),
then an error should be triggered.  The fix is to change && to ||.

Fixes: 26fc6c7f02cb ("NFC: st21nfca: Add HCI transaction event support")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Faltesek <mfaltesek@google.com>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge v5.19-rc1 into drm-misc-fixes
Maxime Ripard [Wed, 8 Jun 2022 17:11:27 +0000 (19:11 +0200)]
Merge v5.19-rc1 into drm-misc-fixes

Let's kick-off the start of the 5.19 fix cycle

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2 years agoMerge branch 'net-unexport-some-symbols-that-are-annotated-__init'
Jakub Kicinski [Wed, 8 Jun 2022 17:10:17 +0000 (10:10 -0700)]
Merge branch 'net-unexport-some-symbols-that-are-annotated-__init'

Masahiro Yamada says:

====================
net: unexport some symbols that are annotated __init

This patch set fixes odd combinations
of EXPORT_SYMBOL and __init.

Checking this in modpost is a good thing and I really wanted to do it,
but Linus Torvalds imposes a very strict rule, "No new warning".

I'd like the maintainer to kindly pick this up and send a fixes pull request.

Unless I eliminate all the sites of warnings beforehand,
Linus refuses to re-enable the modpost check. [1]

[1]: https://lore.kernel.org/linux-kbuild/CAK7LNATmd0bigp7HQ4fTzHw5ugZMkSb3UXG7L4fxpGbqkRKESA@mail.gmail.com/T/#m5e50cc2da17491ba210c72b5efdbc0ce76e0217f
====================

Link: https://lore.kernel.org/r/20220606045355.4160711-1-masahiroy@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ipv6: unexport __init-annotated seg6_hmac_init()
Masahiro Yamada [Mon, 6 Jun 2022 04:53:55 +0000 (13:53 +0900)]
net: ipv6: unexport __init-annotated seg6_hmac_init()

EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it has been broken for a decade.

Recently, I fixed modpost so it started to warn it again, then this
showed up in linux-next builds.

There are two ways to fix it:

  - Remove __init
  - Remove EXPORT_SYMBOL

I chose the latter for this case because the caller (net/ipv6/seg6.c)
and the callee (net/ipv6/seg6_hmac.c) belong to the same module.
It seems an internal function call in ipv6.ko.

Fixes: bf355b8d2c30 ("ipv6: sr: add core files for SR HMAC support")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: xfrm: unexport __init-annotated xfrm4_protocol_init()
Masahiro Yamada [Mon, 6 Jun 2022 04:53:54 +0000 (13:53 +0900)]
net: xfrm: unexport __init-annotated xfrm4_protocol_init()

EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it has been broken for a decade.

Recently, I fixed modpost so it started to warn it again, then this
showed up in linux-next builds.

There are two ways to fix it:

  - Remove __init
  - Remove EXPORT_SYMBOL

I chose the latter for this case because the only in-tree call-site,
net/ipv4/xfrm4_policy.c is never compiled as modular.
(CONFIG_XFRM is boolean)

Fixes: 2f32b51b609f ("xfrm: Introduce xfrm_input_afinfo to access the the callbacks properly")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: mdio: unexport __init-annotated mdio_bus_init()
Masahiro Yamada [Mon, 6 Jun 2022 04:53:53 +0000 (13:53 +0900)]
net: mdio: unexport __init-annotated mdio_bus_init()

EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it has been broken for a decade.

Recently, I fixed modpost so it started to warn it again, then this
showed up in linux-next builds.

There are two ways to fix it:

  - Remove __init
  - Remove EXPORT_SYMBOL

I chose the latter for this case because the only in-tree call-site,
drivers/net/phy/phy_device.c is never compiled as modular.
(CONFIG_PHYLIB is boolean)

Fixes: 90eff9096c01 ("net: phy: Allow splitting MDIO bus/device support from PHYs")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoSUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
Chuck Lever [Tue, 7 Jun 2022 20:48:18 +0000 (16:48 -0400)]
SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()

To make the code easier to read, remove visual clutter by changing
the declared type of @p.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2 years agoSUNRPC: Clean up xdr_get_next_encode_buffer()
Chuck Lever [Tue, 7 Jun 2022 20:48:11 +0000 (16:48 -0400)]
SUNRPC: Clean up xdr_get_next_encode_buffer()

The value of @p is not used until the "location of the next item" is
computed. Help human readers by moving its initial assignment to the
paragraph where that value is used and by clarifying the antecedents
in the documenting comment.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2 years agoSUNRPC: Clean up xdr_commit_encode()
Chuck Lever [Tue, 7 Jun 2022 20:48:05 +0000 (16:48 -0400)]
SUNRPC: Clean up xdr_commit_encode()

Both the kvec::iov_len field and the third parameter of memcpy() and
memmove() are size_t. There's no reason for the implicit conversion
from size_t to int and back. Change the type of @shift to make the
code easier to read and understand.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2 years agoSUNRPC: Optimize xdr_reserve_space()
Chuck Lever [Tue, 7 Jun 2022 20:47:58 +0000 (16:47 -0400)]
SUNRPC: Optimize xdr_reserve_space()

Transitioning between encode buffers is quite infrequent. It happens
about 1 time in 400 calls to xdr_reserve_space(), measured on NFSD
with a typical build/test workload.

Force the compiler to remove that code from xdr_reserve_space(),
which is a hot path on both the server and the client. This change
reduces the size of xdr_reserve_space() from 10 cache lines to 2
when compiled with -Os.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2 years agoSUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
Chuck Lever [Tue, 7 Jun 2022 20:47:52 +0000 (16:47 -0400)]
SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()

I found that NFSD's new NFSv3 READDIRPLUS XDR encoder was screwing up
right at the end of the page array. xdr_get_next_encode_buffer() does
not compute the value of xdr->end correctly:

 * The check to see if we're on the final available page in xdr->buf
   needs to account for the space consumed by @nbytes.

 * The new xdr->end value needs to account for the portion of @nbytes
   that is to be encoded into the previous buffer.

Fixes: 2825a7f90753 ("nfsd4: allow encoding across page boundaries")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: J. Bruce Fields <bfields@fieldses.org>
2 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Wed, 8 Jun 2022 16:16:31 +0000 (09:16 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:

 - syzkaller NULL pointer dereference

 - TDP MMU performance issue with disabling dirty logging

 - 5.14 regression with SVM TSC scaling

 - indefinite stall on applying live patches

 - unstable selftest

 - memory leak from wrong copy-and-paste

 - missed PV TLB flush when racing with emulation

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: do not report a vCPU as preempted outside instruction boundaries
  KVM: x86: do not set st->preempted when going back to user space
  KVM: SVM: fix tsc scaling cache logic
  KVM: selftests: Make hyperv_clock selftest more stable
  KVM: x86/MMU: Zap non-leaf SPTEs when disabling dirty logging
  x86: drop bogus "cc" clobber from __try_cmpxchg_user_asm()
  KVM: x86/mmu: Check every prev_roots in __kvm_mmu_free_obsolete_roots()
  entry/kvm: Exit to user mode when TIF_NOTIFY_SIGNAL is set
  KVM: Don't null dereference ops->destroy

2 years agoMerge tag 'tpmdd-next-v5.19-rc2-v2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 8 Jun 2022 16:10:11 +0000 (09:10 -0700)]
Merge tag 'tpmdd-next-v5.19-rc2-v2' of git://git./linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fix from Jarkko Sakkinen:
 "A bug fix for migratable (whether or not a key is tied to the TPM chip
  soldered to the machine) handling for TPM2 trusted keys"

* tag 'tpmdd-next-v5.19-rc2-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  KEYS: trusted: tpm2: Fix migratable logic

2 years agocpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE
Peter Zijlstra [Wed, 8 Jun 2022 14:27:27 +0000 (16:27 +0200)]
cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE

Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
Xeons") wrecked intel_idle in two ways:

 - must not have tracing in idle functions
 - must return with IRQs disabled

Additionally, it added a branch for no good reason.

Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ rjw: Moved the intel_idle() kerneldoc comment next to the function ]
Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2 years agodrm/amdgpu/jpeg2: Add jpeg vmid update under IB submit
Mohammad Zafar Ziya [Tue, 7 Jun 2022 03:38:16 +0000 (11:38 +0800)]
drm/amdgpu/jpeg2: Add jpeg vmid update under IB submit

Add jpeg vmid update under IB submit

Signed-off-by: Mohammad Zafar Ziya <Mohammadzafar.ziya@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2 years agodrm/amdgpu: always flush the TLB on gfx8
Christian König [Fri, 3 Jun 2022 13:05:04 +0000 (15:05 +0200)]
drm/amdgpu: always flush the TLB on gfx8

The TLB on GFX8 stores each block of 8 PTEs where any of the valid bits
are set.

Fixes: 5255e146c99a ("drm/amdgpu: rework TLB flushing")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 years agodrm/amdgpu: fix limiting AV1 to the first instance on VCN3
Christian König [Fri, 3 Jun 2022 10:21:06 +0000 (12:21 +0200)]
drm/amdgpu: fix limiting AV1 to the first instance on VCN3

The job is not yet initialized here.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2037
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: cdc7893fc93f ("drm/amdgpu: use job and ib structures directly in CS parsers")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2 years agodrm/amdkfd:Fix fw version for 10.3.6
Jesse Zhang [Tue, 7 Jun 2022 02:44:57 +0000 (10:44 +0800)]
drm/amdkfd:Fix fw version for 10.3.6

fix fw error when loading fw for 10.3.6

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x
2 years agoMAINTAINERS: Add a maintainer for bpftool
Quentin Monnet [Wed, 8 Jun 2022 12:14:28 +0000 (13:14 +0100)]
MAINTAINERS: Add a maintainer for bpftool

I've been contributing and reviewing patches for bpftool for some time,
and I'm taking care of its external mirror. On Alexei, KP, and Daniel's
suggestion, I would like to step forwards and become a maintainer for
the tool. This patch adds a dedicated entry to MAINTAINERS.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220608121428.69708-1-quentin@isovalent.com
2 years agoALSA: hda/realtek: Add quirk for HP Dev One
Jeremy Soller [Wed, 8 Jun 2022 14:01:11 +0000 (08:01 -0600)]
ALSA: hda/realtek: Add quirk for HP Dev One

Enables the audio mute LEDs and limits the mic boost to avoid picking up
noise.

Signed-off-by: Jeremy Soller <jeremy@system76.com>
Signed-off-by: Tim Crawford <tcrawford@system76.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220608140111.23170-1-tcrawford@system76.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 years agoxsk: Fix handling of invalid descriptors in XSK TX batching API
Maciej Fijalkowski [Tue, 7 Jun 2022 14:22:00 +0000 (16:22 +0200)]
xsk: Fix handling of invalid descriptors in XSK TX batching API

xdpxceiver run on a AF_XDP ZC enabled driver revealed a problem with XSK
Tx batching API. There is a test that checks how invalid Tx descriptors
are handled by AF_XDP. Each valid descriptor is followed by invalid one
on Tx side whereas the Rx side expects only to receive a set of valid
descriptors.

In current xsk_tx_peek_release_desc_batch() function, the amount of
available descriptors is hidden inside xskq_cons_peek_desc_batch(). This
can be problematic in cases where invalid descriptors are present due to
the fact that xskq_cons_peek_desc_batch() returns only a count of valid
descriptors. This means that it is impossible to properly update XSK
ring state when calling xskq_cons_release_n().

To address this issue, pull out the contents of
xskq_cons_peek_desc_batch() so that callers (currently only
xsk_tx_peek_release_desc_batch()) will always be able to update the
state of ring properly, as total count of entries is now available and
use this value as an argument in xskq_cons_release_n(). By
doing so, xskq_cons_peek_desc_batch() can be dropped altogether.

Fixes: 9349eb3a9d2a ("xsk: Introduce batched Tx descriptor interfaces")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20220607142200.576735-1-maciej.fijalkowski@intel.com
2 years agovduse: Fix NULL pointer dereference on sysfs access
Xie Yongji [Tue, 26 Apr 2022 07:36:56 +0000 (15:36 +0800)]
vduse: Fix NULL pointer dereference on sysfs access

The control device has no drvdata. So we will get a
NULL pointer dereference when accessing control
device's msg_timeout attribute via sysfs:

[ 132.841881][ T3644] BUG: kernel NULL pointer dereference, address: 00000000000000f8
[ 132.850619][ T3644] RIP: 0010:msg_timeout_show (drivers/vdpa/vdpa_user/vduse_dev.c:1271)
[ 132.869447][ T3644] dev_attr_show (drivers/base/core.c:2094)
[ 132.870215][ T3644] sysfs_kf_seq_show (fs/sysfs/file.c:59)
[ 132.871164][ T3644] ? device_remove_bin_file (drivers/base/core.c:2088)
[ 132.872082][ T3644] kernfs_seq_show (fs/kernfs/file.c:164)
[ 132.872838][ T3644] seq_read_iter (fs/seq_file.c:230)
[ 132.873578][ T3644] ? __vmalloc_area_node (mm/vmalloc.c:3041)
[ 132.874532][ T3644] kernfs_fop_read_iter (fs/kernfs/file.c:238)
[ 132.875513][ T3644] __kernel_read (fs/read_write.c:440 (discriminator 1))
[ 132.876319][ T3644] kernel_read (fs/read_write.c:459)
[ 132.877129][ T3644] kernel_read_file (fs/kernel_read_file.c:94)
[ 132.877978][ T3644] kernel_read_file_from_fd (include/linux/file.h:45 fs/kernel_read_file.c:186)
[ 132.879019][ T3644] __do_sys_finit_module (kernel/module.c:4207)
[ 132.879930][ T3644] __ia32_sys_finit_module (kernel/module.c:4189)
[ 132.880930][ T3644] do_int80_syscall_32 (arch/x86/entry/common.c:112 arch/x86/entry/common.c:132)
[ 132.881847][ T3644] entry_INT80_compat (arch/x86/entry/entry_64_compat.S:419)

To fix it, don't create the unneeded attribute for
control device anymore.

Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Reported-by: kernel test robot <oliver.sang@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Message-Id: <20220426073656.229-1-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2 years agovringh: Fix loop descriptors check in the indirect cases
Xie Yongji [Thu, 5 May 2022 10:09:10 +0000 (18:09 +0800)]
vringh: Fix loop descriptors check in the indirect cases

We should use size of descriptor chain to test loop condition
in the indirect case. And another statistical count is also introduced
for indirect descriptors to avoid conflict with the statistical count
of direct descriptors.

Fixes: f87d0fbb5798 ("vringh: host-side implementation of virtio rings.")
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Signed-off-by: Fam Zheng <fam.zheng@bytedance.com>
Message-Id: <20220505100910.137-1-xieyongji@bytedance.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2 years agovdpa/mlx5: clean up indenting in handle_ctrl_vlan()
Dan Carpenter [Tue, 7 Jun 2022 06:50:09 +0000 (09:50 +0300)]
vdpa/mlx5: clean up indenting in handle_ctrl_vlan()

These lines were supposed to be indented.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <Yp71IYMP+QfuCJ8t@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2 years agovdpa/mlx5: fix error code for deleting vlan
Dan Carpenter [Tue, 7 Jun 2022 06:49:25 +0000 (09:49 +0300)]
vdpa/mlx5: fix error code for deleting vlan

Return success if we were able to delete a vlan.  The current code
always returns failure.

Fixes: baf2ad3f6a98 ("vdpa/mlx5: Add RX MAC VLAN filter support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <Yp709f1g9NcMBCHg@kili>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eli Cohen <elic@nvidia.com>
Acked-by: Si-Wei Liu <si-wei.liu@oracle.com>
2 years agovirtio-mmio: fix missing put_device() when vm_cmdline_parent registration failed
chengkaitao [Thu, 2 Jun 2022 00:55:42 +0000 (08:55 +0800)]
virtio-mmio: fix missing put_device() when vm_cmdline_parent registration failed

The reference must be released when device_register(&vm_cmdline_parent)
failed. Add the corresponding 'put_device()' in the error handling path.

Signed-off-by: chengkaitao <pilgrimtao@gmail.com>
Message-Id: <20220602005542.16489-1-chengkaitao@didiglobal.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2 years agovdpa/mlx5: Fix syntax errors in comments
Xiang wangx [Sat, 4 Jun 2022 14:38:58 +0000 (22:38 +0800)]
vdpa/mlx5: Fix syntax errors in comments

Delete the redundant word 'is'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Message-Id: <20220604143858.16073-1-wangxiang@cdjrlc.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2 years agovirtio-rng: make device ready before making request
Jason Wang [Wed, 8 Jun 2022 06:14:22 +0000 (14:14 +0800)]
virtio-rng: make device ready before making request

Current virtio-rng does a entropy request before DRIVER_OK, this
violates the spec:

virtio spec requires that all drivers set DRIVER_OK
before using devices.

Further, kernel will ignore the interrupt after commit
8b4ec69d7e09 ("virtio: harden vring IRQ").

Fixing this by making device ready before the request.

Cc: stable@vger.kernel.org
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.")
Reported-and-tested-by: syzbot+5b59d6d459306a556f54@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220608061422.38437-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2 years agoscripts/nsdeps: adjust to the format change of *.mod files
Masahiro Yamada [Tue, 7 Jun 2022 15:18:40 +0000 (00:18 +0900)]
scripts/nsdeps: adjust to the format change of *.mod files

Commit 22f26f21774f ("kbuild: get rid of duplication in *.mod files")
changed the format of *.mod files to put one object per line, but missed
to adjust scripts/nsdeps.

Fixes: 22f26f21774f ("kbuild: get rid of duplication in *.mod files")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2 years agoKEYS: trusted: tpm2: Fix migratable logic
David Safford [Tue, 7 Jun 2022 18:07:57 +0000 (14:07 -0400)]
KEYS: trusted: tpm2: Fix migratable logic

When creating (sealing) a new trusted key, migratable
trusted keys have the FIXED_TPM and FIXED_PARENT attributes
set, and non-migratable keys don't. This is backwards, and
also causes creation to fail when creating a migratable key
under a migratable parent. (The TPM thinks you are trying to
seal a non-migratable blob under a migratable parent.)

The following simple patch fixes the logic, and has been
tested for all four combinations of migratable and non-migratable
trusted keys and parent storage keys. With this logic, you will
get a proper failure if you try to create a non-migratable
trusted key under a migratable parent storage key, and all other
combinations work correctly.

Cc: stable@vger.kernel.org # v5.13+
Fixes: e5fb5d2c5a03 ("security: keys: trusted: Make sealed key properly interoperable")
Signed-off-by: David Safford <david.safford@gmail.com>
Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2 years agozonefs: fix zonefs_iomap_begin() for reads
Damien Le Moal [Mon, 23 May 2022 07:29:10 +0000 (16:29 +0900)]
zonefs: fix zonefs_iomap_begin() for reads

If a readahead is issued to a sequential zone file with an offset
exactly equal to the current file size, the iomap type is set to
IOMAP_UNWRITTEN, which will prevent an IO, but the iomap length is
calculated as 0. This causes a WARN_ON() in iomap_iter():

[17309.548939] WARNING: CPU: 3 PID: 2137 at fs/iomap/iter.c:34 iomap_iter+0x9cf/0xe80
[...]
[17309.650907] RIP: 0010:iomap_iter+0x9cf/0xe80
[...]
[17309.754560] Call Trace:
[17309.757078]  <TASK>
[17309.759240]  ? lock_is_held_type+0xd8/0x130
[17309.763531]  iomap_readahead+0x1a8/0x870
[17309.767550]  ? iomap_read_folio+0x4c0/0x4c0
[17309.771817]  ? lockdep_hardirqs_on_prepare+0x400/0x400
[17309.778848]  ? lock_release+0x370/0x750
[17309.784462]  ? folio_add_lru+0x217/0x3f0
[17309.790220]  ? reacquire_held_locks+0x4e0/0x4e0
[17309.796543]  read_pages+0x17d/0xb60
[17309.801854]  ? folio_add_lru+0x238/0x3f0
[17309.807573]  ? readahead_expand+0x5f0/0x5f0
[17309.813554]  ? policy_node+0xb5/0x140
[17309.819018]  page_cache_ra_unbounded+0x27d/0x450
[17309.825439]  filemap_get_pages+0x500/0x1450
[17309.831444]  ? filemap_add_folio+0x140/0x140
[17309.837519]  ? lock_is_held_type+0xd8/0x130
[17309.843509]  filemap_read+0x28c/0x9f0
[17309.848953]  ? zonefs_file_read_iter+0x1ea/0x4d0 [zonefs]
[17309.856162]  ? trace_contention_end+0xd6/0x130
[17309.862416]  ? __mutex_lock+0x221/0x1480
[17309.868151]  ? zonefs_file_read_iter+0x166/0x4d0 [zonefs]
[17309.875364]  ? filemap_get_pages+0x1450/0x1450
[17309.881647]  ? __mutex_unlock_slowpath+0x15e/0x620
[17309.888248]  ? wait_for_completion_io_timeout+0x20/0x20
[17309.895231]  ? lock_is_held_type+0xd8/0x130
[17309.901115]  ? lock_is_held_type+0xd8/0x130
[17309.906934]  zonefs_file_read_iter+0x356/0x4d0 [zonefs]
[17309.913750]  new_sync_read+0x2d8/0x520
[17309.919035]  ? __x64_sys_lseek+0x1d0/0x1d0

Furthermore, this causes iomap_readahead() to loop forever as
iomap_readahead_iter() always returns 0, making no progress.

Fix this by treating reads after the file size as access to holes,
setting the iomap type to IOMAP_HOLE, the iomap addr to IOMAP_NULL_ADDR
and using the length argument as is for the iomap length. To simplify
the code with this change, zonefs_iomap_begin() is split into the read
variant, zonefs_read_iomap_begin() and zonefs_read_iomap_ops, and the
write variant, zonefs_write_iomap_begin() and zonefs_write_iomap_ops.

Reported-by: Jorgen Hansen <Jorgen.Hansen@wdc.com>
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Jorgen Hansen <Jorgen.Hansen@wdc.com>
2 years agoALSA: hda/realtek - Add HW8326 support
huangwenhui [Wed, 8 Jun 2022 08:23:57 +0000 (16:23 +0800)]
ALSA: hda/realtek - Add HW8326 support

Added the support of new Huawei codec HW8326. The HW8326 is developed
by Huawei with Realtek's IP Core, and it's compatible with ALC256.

Signed-off-by: huangwenhui <huangwenhuia@uniontech.com>
Link: https://lore.kernel.org/r/20220608082357.26898-1-huangwenhuia@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 years agoKVM: x86: do not report a vCPU as preempted outside instruction boundaries
Paolo Bonzini [Tue, 7 Jun 2022 14:09:03 +0000 (10:09 -0400)]
KVM: x86: do not report a vCPU as preempted outside instruction boundaries

If a vCPU is outside guest mode and is scheduled out, it might be in the
process of making a memory access.  A problem occurs if another vCPU uses
the PV TLB flush feature during the period when the vCPU is scheduled
out, and a virtual address has already been translated but has not yet
been accessed, because this is equivalent to using a stale TLB entry.

To avoid this, only report a vCPU as preempted if sure that the guest
is at an instruction boundary.  A rescheduling request will be delivered
to the host physical CPU as an external interrupt, so for simplicity
consider any vmexit *not* instruction boundary except for external
interrupts.

It would in principle be okay to report the vCPU as preempted also
if it is sleeping in kvm_vcpu_block(): a TLB flush IPI will incur the
vmentry/vmexit overhead unnecessarily, and optimistic spinning is
also unlikely to succeed.  However, leave it for later because right
now kvm_vcpu_check_block() is doing memory accesses.  Even
though the TLB flush issue only applies to virtual memory address,
it's very much preferrable to be conservative.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agoKVM: x86: do not set st->preempted when going back to user space
Paolo Bonzini [Tue, 7 Jun 2022 14:07:11 +0000 (10:07 -0400)]
KVM: x86: do not set st->preempted when going back to user space

Similar to the Xen path, only change the vCPU's reported state if the vCPU
was actually preempted.  The reason for KVM's behavior is that for example
optimistic spinning might not be a good idea if the guest is doing repeated
exits to userspace; however, it is confusing and unlikely to make a difference,
because well-tuned guests will hardly ever exit KVM_RUN in the first place.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2 years agozonefs: Do not ignore explicit_open with active zone limit
Damien Le Moal [Thu, 2 Jun 2022 12:33:25 +0000 (21:33 +0900)]
zonefs: Do not ignore explicit_open with active zone limit

A zoned device may have no limit on the number of open zones but may
have a limit on the number of active zones it can support. In such
case, the explicit_open mount option should not be ignored to ensure
that the open() system call activates the zone with an explicit zone
open command, thus guaranteeing that the zone can be written.

Enforce this by ignoring the explicit_open mount option only for
devices that have both the open and active zone limits equal to 0.

Fixes: 87c9ce3ffec9 ("zonefs: Add active seq file accounting")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2 years agozonefs: fix handling of explicit_open option on mount
Damien Le Moal [Thu, 2 Jun 2022 14:16:57 +0000 (23:16 +0900)]
zonefs: fix handling of explicit_open option on mount

Ignoring the explicit_open mount option on mount for devices that do not
have a limit on the number of open zones must be done after the mount
options are parsed and set in s_mount_opts. Move the check to ignore
the explicit_open option after the call to zonefs_parse_options() in
zonefs_fill_super().

Fixes: b5c00e975779 ("zonefs: open/close zone on file open/close")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
2 years agonet/mlx4_en: Fix wrong return value on ioctl EEPROM query failure
Gal Pressman [Mon, 6 Jun 2022 11:57:18 +0000 (14:57 +0300)]
net/mlx4_en: Fix wrong return value on ioctl EEPROM query failure

The ioctl EEPROM query wrongly returns success on read failures, fix
that by returning the appropriate error code.

Fixes: 7202da8b7f71 ("ethtool, net/mlx4_en: Cable info, get_module_info/eeprom ethtool support")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220606115718.14233-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: lantiq_gswip: Fix refcount leak in gswip_gphy_fw_list
Miaoqian Lin [Sun, 5 Jun 2022 07:23:34 +0000 (11:23 +0400)]
net: dsa: lantiq_gswip: Fix refcount leak in gswip_gphy_fw_list

Every iteration of for_each_available_child_of_node() decrements
the reference count of the previous node.
when breaking early from a for_each_available_child_of_node() loop,
we need to explicitly call of_node_put() on the gphy_fw_np.
Add missing of_node_put() to avoid refcount leak.

Fixes: 14fceff4771e ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220605072335.11257-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agolibata: fix translation of concurrent positioning ranges
Tyler Erickson [Thu, 2 Jun 2022 22:51:12 +0000 (16:51 -0600)]
libata: fix translation of concurrent positioning ranges

Fixing the page length in the SCSI translation for the concurrent
positioning ranges VPD page. It was writing starting in offset 3
rather than offset 2 where the MSB is supposed to start for
the VPD page length.

Cc: stable@vger.kernel.org
Fixes: fe22e1c2f705 ("libata: support concurrent positioning ranges log")
Signed-off-by: Tyler Erickson <tyler.erickson@seagate.com>
Reviewed-by: Muhammad Ahmad <muhammad.ahmad@seagate.com>
Tested-by: Michael English <michael.english@seagate.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2 years agolibata: fix reading concurrent positioning ranges log
Tyler Erickson [Thu, 2 Jun 2022 22:51:11 +0000 (16:51 -0600)]
libata: fix reading concurrent positioning ranges log

The concurrent positioning ranges log is not a fixed size and may depend
on how many ranges are supported by the device. This patch uses the size
reported in the GPL directory to determine the number of pages supported
by the device before attempting to read this log page.

This resolves this error from the dmesg output:
    ata6.00: Read log 0x47 page 0x00 failed, Emask 0x1

Cc: stable@vger.kernel.org
Fixes: fe22e1c2f705 ("libata: support concurrent positioning ranges log")
Signed-off-by: Tyler Erickson <tyler.erickson@seagate.com>
Reviewed-by: Muhammad Ahmad <muhammad.ahmad@seagate.com>
Tested-by: Michael English <michael.english@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2 years agoLoongArch: Remove MIPS comment about cycle counter
Jason A. Donenfeld [Sun, 5 Jun 2022 08:20:08 +0000 (16:20 +0800)]
LoongArch: Remove MIPS comment about cycle counter

This comment block was taken originally from the MIPS architecture code,
where indeed there are particular assumptions one can make regarding SMP
and !SMP and cycle counters. On LoongArch, however, the rdtime family of
functions is always available. As Xuerui wrote:

    The rdtime family of instructions is in fact guaranteed to be
    available on LoongArch; LoongArch's subsets all contain them, even
    the 32-bit "Primary" subset intended for university teaching -- they
    provide the rdtimeh.w and rdtimel.w pair of instructions that access
    the same 64-bit counter.

So this commit simply removes the incorrect comment block.

Link: https://lore.kernel.org/lkml/e78940bc-9be2-2fe7-026f-9e64a1416c9f@xen0n.name/
Fixes: b738c106f735 ("LoongArch: Add other common headers")
Reviewed-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Fix copy_thread() build errors
Huacai Chen [Sun, 5 Jun 2022 08:20:03 +0000 (16:20 +0800)]
LoongArch: Fix copy_thread() build errors

Commit c5febea0956fd387 ("fork: Pass struct kernel_clone_args into
copy_thread") change the prototype of copy_thread(), while commit
5bd2e97c868a8a44 ("fork: Generalize PF_IO_WORKER handling") change
the structure of kernel_clone_args. They cause build errors, so fix it.

Fixes: 5bd2e97c868a8a44 ("fork: Generalize PF_IO_WORKER handling")
Fixes: c5febea0956fd387 ("fork: Pass struct kernel_clone_args into copy_thread")
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoLoongArch: Fix the !CONFIG_SMP build
Huacai Chen [Sun, 5 Jun 2022 08:19:53 +0000 (16:19 +0800)]
LoongArch: Fix the !CONFIG_SMP build

1, We assume arch/loongarch/include/asm/smp.h be included in include/
   linux/smp.h is valid and the reverse inclusion isn't. So remove the
   <linux/smp.h> in arch/loongarch/include/asm/smp.h.
2, arch/loongarch/include/asm/smp.h is only needed when CONFIG_SMP,
   and setup.c include it only because it need plat_smp_setup(). So,
   reorganize setup.c & smp.h, and then remove <asm/smp.h> in setup.c.
3, Fix cacheinfo.c and percpu.h build error by adding the missing header
   files when !CONFIG_SMP.
4, Fix acpi.c build error by adding CONFIG_SMP guards.
5, Move irq_stat definition from smp.c to irq.c and fix its declaration.
6, Select CONFIG_SMP for CONFIG_NUMA, similar as other architectures do.

Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2 years agoscsi: pmcraid: Fix missing resource cleanup in error case
Chengguang Xu [Sun, 29 May 2022 15:34:55 +0000 (23:34 +0800)]
scsi: pmcraid: Fix missing resource cleanup in error case

Fix missing resource cleanup (when '(--i) == 0') for error case in
pmcraid_register_interrupt_handler().

Link: https://lore.kernel.org/r/20220529153456.4183738-6-cgxu519@mykernel.net
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2 years agoscsi: ipr: Fix missing/incorrect resource cleanup in error case
Chengguang Xu [Sun, 29 May 2022 15:34:53 +0000 (23:34 +0800)]
scsi: ipr: Fix missing/incorrect resource cleanup in error case

Fix missing resource cleanup (when '(--i) == 0') for error case in
ipr_alloc_mem() and skip incorrect resource cleanup (when '(--i) == 0') for
error case in ipr_request_other_msi_irqs() because variable i started from
1.

Link: https://lore.kernel.org/r/20220529153456.4183738-4-cgxu519@mykernel.net
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2 years agoscsi: mpt3sas: Fix out-of-bounds compiler warning
Helge Deller [Tue, 31 May 2022 20:09:27 +0000 (22:09 +0200)]
scsi: mpt3sas: Fix out-of-bounds compiler warning

I'm facing this warning when building for the parisc64 architecture:

drivers/scsi/mpt3sas/mpt3sas_base.c: In function ‘_base_make_ioc_operational’:
drivers/scsi/mpt3sas/mpt3sas_base.c:5396:40: warning: array subscript ‘Mpi2SasIOUnitPage1_t {aka struct _MPI2_CONFIG_PAGE_SASIOUNIT_1}[0]’ is partly outside array bounds of ‘unsigned char[20]’ [-Warray-bounds]
 5396 |             (le16_to_cpu(sas_iounit_pg1->SASWideMaxQueueDepth)) ?
drivers/scsi/mpt3sas/mpt3sas_base.c:5382:26: note: referencing an object of size 20 allocated by ‘kzalloc’
 5382 |         sas_iounit_pg1 = kzalloc(sz, GFP_KERNEL);
      |                          ^~~~~~~~~~~~~~~~~~~~~~~

The problem is, that only 20 bytes are allocated with kmalloc(), which is
sufficient to hold the bytes which are needed.  Nevertheless, gcc complains
because the whole Mpi2SasIOUnitPage1_t struct is 32 bytes in size and thus
doesn't fit into those 20 bytes.

This patch simply allocates all 32 bytes (instead of 20) and thus avoids
the warning. There is no functional change introduced by this patch.

While touching the code I cleaned up to calculation of max_wideport_qd,
max_narrowport_qd and max_sata_qd to make it easier readable.

Test successfully tested on a HP C8000 PA-RISC workstation with 64-bit
kernel.

Link: https://lore.kernel.org/r/YpZ197iZdDZSCzrT@p100
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2 years agoscsi: lpfc: Update lpfc version to 14.2.0.4
James Smart [Fri, 3 Jun 2022 17:43:29 +0000 (10:43 -0700)]
scsi: lpfc: Update lpfc version to 14.2.0.4

Update lpfc version to 14.2.0.4

Link: https://lore.kernel.org/r/20220603174329.63777-10-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2 years agoscsi: lpfc: Allow reduced polling rate for nvme_admin_async_event cmd completion
James Smart [Fri, 3 Jun 2022 17:43:28 +0000 (10:43 -0700)]
scsi: lpfc: Allow reduced polling rate for nvme_admin_async_event cmd completion

NVMe Asynchronous Event Request commands have no command timeout value per
specifications.

Set WQE option to allow a reduced FLUSH polling rate for I/O error
detection specifically for nvme_admin_async_event commands.

Link: https://lore.kernel.org/r/20220603174329.63777-9-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2 years agoscsi: lpfc: Add more logging of cmd and cqe information for aborted NVMe cmds
James Smart [Fri, 3 Jun 2022 17:43:27 +0000 (10:43 -0700)]
scsi: lpfc: Add more logging of cmd and cqe information for aborted NVMe cmds

When an NVMe command is aborted or completes with an ERSP, log the opcode
and command ID fields to help provide more detail on the failed command.

Link: https://lore.kernel.org/r/20220603174329.63777-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>