linux-2.6-microblaze.git
4 years agoRDMA/efa: Expose minimum SQ size
Gal Pressman [Wed, 22 Jul 2020 14:03:10 +0000 (17:03 +0300)]
RDMA/efa: Expose minimum SQ size

The device reports the minimum SQ size required for creation.

This patch queries the min SQ size and reports it back to the userspace
library.

Link: https://lore.kernel.org/r/20200722140312.3651-3-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Shadi Ammouri <sammouri@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/efa: Expose maximum TX doorbell batch
Gal Pressman [Wed, 22 Jul 2020 14:03:09 +0000 (17:03 +0300)]
RDMA/efa: Expose maximum TX doorbell batch

The device reports the maximum number of bytes to be written before
ringing the doorbell (zero means unlimited).

This patch queries the max batch size and reports it back to the userspace
library.

Link: https://lore.kernel.org/r/20200722140312.3651-2-galpress@amazon.com
Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com>
Reviewed-by: Firas JahJah <firasj@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/srpt: use new shared CQ mechanism
Yamin Friedman [Wed, 22 Jul 2020 13:56:29 +0000 (16:56 +0300)]
IB/srpt: use new shared CQ mechanism

Have the driver use shared CQs provided by the rdma core driver.  This
provides the advantage of improved efficiency handling interrupts.

Link: https://lore.kernel.org/r/20200722135629.49467-3-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/isert: use new shared CQ mechanism
Yamin Friedman [Wed, 22 Jul 2020 13:56:28 +0000 (16:56 +0300)]
IB/isert: use new shared CQ mechanism

Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed.

Now there is no reason to allocate very large CQs when the driver is
loaded while gaining the advantage of shared CQs. Previously when a single
connection was opened a CQ was opened for every core with enough space for
eight connections, this is a very large overhead that in most cases will
not be utilized.

Link: https://lore.kernel.org/r/20200722135629.49467-2-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/iser: use new shared CQ mechanism
Yamin Friedman [Wed, 22 Jul 2020 13:56:27 +0000 (16:56 +0300)]
IB/iser: use new shared CQ mechanism

Have the driver use shared CQs provided by the rdma core driver.  Since
this provides similar functionality to iser_comp it has been removed. Now
there is no reason to allocate very large CQs when the driver is loaded
while gaining the advantage of shared CQs.

Link: https://lore.kernel.org/r/20200722135629.49467-1-maxg@mellanox.com
Signed-off-by: Yamin Friedman <yaminf@mellanox.com>
Acked-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Delete unreachable code
Leon Romanovsky [Mon, 27 Jul 2020 09:57:46 +0000 (12:57 +0300)]
RDMA/mlx5: Delete unreachable code

Delete two occurrences of unreachable code discovered by the Coverity.

Link: https://lore.kernel.org/r/20200727095746.495915-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Fix return error value in _ib_modify_qp() to negative
Li Heng [Sat, 25 Jul 2020 02:56:27 +0000 (10:56 +0800)]
RDMA/core: Fix return error value in _ib_modify_qp() to negative

The error codes in _ib_modify_qp() are supposed to be negative errno.

Fixes: 7a5c938b9ed0 ("IB/core: Check for rdma_protocol_ib only after validating port_num")
Link: https://lore.kernel.org/r/1595645787-20375-1-git-send-email-liheng40@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Li Heng <liheng40@huawei.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoMerge branch 'mlx5_uar' into rdma.git /for-next
Jason Gunthorpe [Mon, 27 Jul 2020 14:44:36 +0000 (11:44 -0300)]
Merge branch 'mlx5_uar' into rdma.git /for-next

Meir Lichtinger says:

====================
ConnectX-7 supports setting relaxed ordering read/write mkey attribute by
UMR, indicated by new HCA capabilities, so extend mlx5_ib driver to
configure UMR control segment
====================

Based on the mlx5-next branch at
      git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_uar':
  RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7
  RDMA/mlx5: Use MLX5_SET macro instead of local structure
  RDMA/mlx5: ConnectX-7 new capabilities to set relaxed ordering by UMR

4 years agoRDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7
Meir Lichtinger [Thu, 16 Jul 2020 10:52:48 +0000 (13:52 +0300)]
RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7

Up to ConnectX-7 UMR is not used when user passes relaxed ordering access
flag. ConnectX-7 supports setting relaxed ordering read/write mkey
attribute by UMR, indicated by new HCA capabilities.

With ConnectX-7 driver uses UMR when user set relaxed ordering access
flag, in contrast to previous silicon models. Specifically it includes
setting relvant flags of mkey context mask in UMR control segment, and
relaxed ordering write and read flags in UMR mkey context segment.

Link: https://lore.kernel.org/r/20200716105248.1423452-4-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Use MLX5_SET macro instead of local structure
Meir Lichtinger [Thu, 16 Jul 2020 10:52:47 +0000 (13:52 +0300)]
RDMA/mlx5: Use MLX5_SET macro instead of local structure

Use generic mlx5 structure defined in mlx5_ifc.h to represent ConnectX
device data structures instead of using structure defined specifically for
mlx5_ib module.

Link: https://lore.kernel.org/r/20200716105248.1423452-3-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Fix typo in enum name
Pavel Machek [Fri, 24 Jul 2020 08:41:12 +0000 (10:41 +0200)]
RDMA/mlx5: Fix typo in enum name

Nnothing uses the enum name, so this is harmless.

Fixes: 322694412400 ("IB/mlx5: Introduce driver create and destroy flow methods")
Link: https://lore.kernel.org/r/20200724084112.GC31930@amd
Signed-off-by: Pavel Machek (CIP) <pavel@denx.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/hfi1: Use fallthrough pseudo-keyword
Gustavo A. R. Silva [Tue, 21 Jul 2020 13:34:55 +0000 (08:34 -0500)]
IB/hfi1: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with the
new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7-rc7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Link: https://lore.kernel.org/r/20200721133455.GA14363@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/uverbs: Silence shiftTooManyBitsSigned warning
Leon Romanovsky [Mon, 20 Jul 2020 17:56:27 +0000 (20:56 +0300)]
RDMA/uverbs: Silence shiftTooManyBitsSigned warning

Fix reported by kbuild warning.

   drivers/infiniband/core/uverbs_cmd.c:1897:47: warning: Shifting signed 32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
    BUILD_BUG_ON(IB_USER_LAST_QP_ATTR_MASK == (1 << 31));
                                                 ^
Link: https://lore.kernel.org/r/20200720175627.1273096-3-leon@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/uverbs: Remove redundant assignments
Leon Romanovsky [Mon, 20 Jul 2020 17:56:26 +0000 (20:56 +0300)]
RDMA/uverbs: Remove redundant assignments

The kbuild reported the following warning, so clean whole uverbs_cmd.c
file.

   drivers/infiniband/core/uverbs_cmd.c:1066:6: warning: Variable 'ret' is reassigned a value before the old one has been used. [redundantAssignment]
    ret = uverbs_request(attrs, &cmd, sizeof(cmd));
        ^
   drivers/infiniband/core/uverbs_cmd.c:1064:0: note: Variable 'ret' is reassigned a value before the old one has been used.
    int    ret = -EINVAL;
   ^

Link: https://lore.kernel.org/r/20200720175627.1273096-2-leon@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow
Maor Gottlieb [Sun, 19 Jul 2020 06:57:47 +0000 (09:57 +0300)]
RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow

According to the locking scheme, mlx5_ib_update_xlt() should be called
with srcu_read_lock(dev->odp->srcu). Prefetch missed this. This fixes the
below WARN from lockdep_assert_held():

  WARNING: CPU: 1 PID: 1130 at drivers/infiniband/hw/mlx5/odp.c:132 mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
  Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp scsi_transport_srp rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core mlx5_core mlxfw ptp pps_core
  CPU: 1 PID: 1130 Comm: kworker/u16:11 Tainted: G        W 5.8.0-rc5_for_upstream_debug_2020_07_13_11_04 #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
  Workqueue: events_unbound mlx5_ib_prefetch_mr_work [mlx5_ib]
  RIP: 0010:mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
  Code: 08 e2 85 c0 0f 84 65 ff ff ff 49 8b 87 60 01 00 00 be ff ff ff ff 48 8d b8 b0 39 00 00 e8 93 e0 50 e1 85 c0 0f 85 45 ff ff ff <0f> 0b e9 3e ff ff ff 0f 0b eb c7 0f 1f 44 00 00 48 8b 87 98 0f 00
  RSP: 0018:ffff88840f44fc68 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff88840cc9d000 RCX: ffff88840efcd940
  RDX: 0000000000000000 RSI: ffff88844871b9b0 RDI: ffff88840efce100
  RBP: ffff88840cc9d040 R08: 0000000000000040 R09: 0000000000000001
  R10: ffff88846ced3068 R11: 0000000000000000 R12: 00000000000156ec
  R13: 0000000000000004 R14: 0000000000000004 R15: ffff888439941000
  FS:  0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f8536d12430 CR3: 0000000437a5e006 CR4: 0000000000360ea0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   mlx5_ib_update_xlt+0x37c/0x7c0 [mlx5_ib]
   pagefault_mr+0x315/0x440 [mlx5_ib]
   mlx5_ib_prefetch_mr_work+0x56/0xa0 [mlx5_ib]
   process_one_work+0x215/0x5c0
   worker_thread+0x3c/0x380
   ? process_one_work+0x5c0/0x5c0
   kthread+0x133/0x150
   ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30

Hold the SRCU during prefetch, even though it strictly isn't needed since
prefetch is holding the num_deferred_work it does make it easier to reason
about.

Fixes: 5256edcb98a1 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200719065747.131157-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: ConnectX-7 new capabilities to set relaxed ordering by UMR
Meir Lichtinger [Thu, 4 Jun 2020 05:49:38 +0000 (08:49 +0300)]
RDMA/mlx5: ConnectX-7 new capabilities to set relaxed ordering by UMR

Up to ConnectX-7 setting mkey relaxed ordering read/write attributes
by UMR is not supported. ConnectX-7 supports this option, which is
indicated by two new HCA capabilities - relaxed_ordering_write_umr
and relaxed_ordering_read_umr.

Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoRDMA/core: Update write interface to use automatic object lifetime
Leon Romanovsky [Sun, 19 Jul 2020 05:22:23 +0000 (08:22 +0300)]
RDMA/core: Update write interface to use automatic object lifetime

The automatic object lifetime model allows us to change the write()
interface to have the same logic as the ioctl() path. Update the
create/alloc functions to be in the following format, so the code flow
will be the same:

 * Allocate objects
 * Initialize them
 * Call to the drivers, this is last step that is allowed to fail
 * Finalize object
 * Return response and allow to core code to handle abort/commit
   respectively.

Link: https://lore.kernel.org/r/20200719052223.75245-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Align abort/commit object scheme for write() and ioctl() paths
Leon Romanovsky [Sun, 19 Jul 2020 05:22:22 +0000 (08:22 +0300)]
RDMA/core: Align abort/commit object scheme for write() and ioctl() paths

Create the same logic flow for the write() interface as we have for the
ioctl() path by making sure that the object is committed or aborted
automatically after HW object creation.

Link: https://lore.kernel.org/r/20200719052223.75245-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Allow SQ modification
Maor Gottlieb [Thu, 16 Jul 2020 10:54:16 +0000 (13:54 +0300)]
RDMA/mlx5: Allow SQ modification

Currently the SQ is set to a ready state when the RAW QP is modified to
INIT.  When the TIS is modified, e.g. to change the lag_tx_affinity, then
SQs which are already in the ready state will not be affected.

Open a window to modify the SQ behavior by setting the SQ as ready only
when QP was modified to RTS.

Link: https://lore.kernel.org/r/20200716105416.1423826-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA: rdma_user_ioctl.h: fix a duplicated word + clarify
Randy Dunlap [Sun, 19 Jul 2020 00:32:20 +0000 (17:32 -0700)]
RDMA: rdma_user_ioctl.h: fix a duplicated word + clarify

Change the repeated word "it" in a comment to "it to".
Also insert a dash in the sentence to add clarity.

Link: https://lore.kernel.org/r/20200719003220.21250-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/bnxt_re: Update maintainers for Broadcom rdma driver
Devesh Sharma [Wed, 15 Jul 2020 14:16:59 +0000 (10:16 -0400)]
RDMA/bnxt_re: Update maintainers for Broadcom rdma driver

Adding a new co-maintainer for Broadcom's RDMA driver.

Link: https://lore.kernel.org/r/1594822619-4098-7-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/bnxt_re: Change wr posting logic to accommodate variable wqes
Devesh Sharma [Wed, 15 Jul 2020 14:16:58 +0000 (10:16 -0400)]
RDMA/bnxt_re: Change wr posting logic to accommodate variable wqes

Modifying the post-send and post-recv to initialize the wqes slot by slot
dynamically depending on the number of max sges requested by consumer at
the time of QP creation.

Changed the QP creation logic to determine the size of SQ and RQ in 16B
slots based on the number of wqe and number of SGEs requested by consumer

Link: https://lore.kernel.org/r/1594822619-4098-6-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/bnxt_re: Add helper data structures
Devesh Sharma [Wed, 15 Jul 2020 14:16:57 +0000 (10:16 -0400)]
RDMA/bnxt_re: Add helper data structures

Adding few helper data structure which are useful to initialize hardware
send wqe in variable wqe mode.

Adding a qp flag in HSI to indicate variable wqe is enabled for this qp.

Link: https://lore.kernel.org/r/1594822619-4098-5-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/bnxt_re: Pull psn buffer dynamically based on prod
Devesh Sharma [Wed, 15 Jul 2020 14:16:56 +0000 (10:16 -0400)]
RDMA/bnxt_re: Pull psn buffer dynamically based on prod

Changing the PSN management memory buffers from statically initialized to
dynamic pull scheme.

During create qp only the start pointers are initialized and during
post-send the psn buffer is pulled based on current producer index.

Adjusting post_send code to accommodate dynamic psn-pull and changing
post_recv code to match post-send code wrt pseudo flush wqe generation.

Link: https://lore.kernel.org/r/1594822619-4098-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/bnxt_re: introduce a function to allocate swq
Devesh Sharma [Wed, 15 Jul 2020 14:16:55 +0000 (10:16 -0400)]
RDMA/bnxt_re: introduce a function to allocate swq

The bnxt_re driver now allocates shadow sq and rq to maintain per wqe
wr_id and few other flags required to support variable wqe. Segregated the
allocation of shadow queue in a separate function and adjust the cqe
polling logic. The new polling logic is based on shadow queue indices.

Link: https://lore.kernel.org/r/1594822619-4098-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/bnxt_re: introduce wqe mode to select execution path
Devesh Sharma [Wed, 15 Jul 2020 14:16:54 +0000 (10:16 -0400)]
RDMA/bnxt_re: introduce wqe mode to select execution path

The bnxt_re driver need to decide on how much SQ and RQ memory should to
be allocated and which wqe posting/polling algorithm to use.

Making changes to set the wqe-mode to a default value during device
registration sequence. The wqe-mode is passed to the lower layer driver as
well. Going forward in the lower layer driver wqe-mode will be used to
decide execution path. Initializing the wqe-mode to static wqe type for
now.

Link: https://lore.kernel.org/r/1594822619-4098-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/qedr: Remove the query_pkey callback
Kamal Heib [Tue, 14 Jul 2020 18:34:14 +0000 (21:34 +0300)]
RDMA/qedr: Remove the query_pkey callback

Now that the query_pkey() isn't mandatory by the RDMA core for iwarp
providers, this callback can be removed from the common ops and moved to
the RoCE only ops within the qedr driver.

Link: https://lore.kernel.org/r/20200714183414.61069-8-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/i40iw: Remove the query_pkey callback
Kamal Heib [Tue, 14 Jul 2020 18:34:13 +0000 (21:34 +0300)]
RDMA/i40iw: Remove the query_pkey callback

Now that the query_pkey() isn't mandatory by the RDMA core for iwarp
providers, this callback can be removed.

Link: https://lore.kernel.org/r/20200714183414.61069-7-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/cxgb4: Remove the query_pkey callback
Kamal Heib [Tue, 14 Jul 2020 18:34:12 +0000 (21:34 +0300)]
RDMA/cxgb4: Remove the query_pkey callback

Now that the query_pkey() isn't mandatory by the RDMA core for iwarp
providers, this callback can be removed.

Link: https://lore.kernel.org/r/20200714183414.61069-6-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/siw: Remove the query_pkey callback
Kamal Heib [Tue, 14 Jul 2020 18:34:11 +0000 (21:34 +0300)]
RDMA/siw: Remove the query_pkey callback

Now that the query_pkey() isn't mandatory by the RDMA core for iwarp
providers, this callback can be removed.

Link: https://lore.kernel.org/r/20200714183414.61069-5-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Remove query_pkey from the mandatory ops
Kamal Heib [Tue, 14 Jul 2020 18:34:10 +0000 (21:34 +0300)]
RDMA/core: Remove query_pkey from the mandatory ops

The query_pkey() isn't mandatory for the iwarp providers, so remove this
requirement from the RDMA core.

Link: https://lore.kernel.org/r/20200714183414.61069-4-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Allocate the pkey cache only if the pkey_tbl_len is set
Kamal Heib [Tue, 14 Jul 2020 18:34:09 +0000 (21:34 +0300)]
RDMA/core: Allocate the pkey cache only if the pkey_tbl_len is set

Allocate the pkey cache only if the pkey_tbl_len is set by the provider,
also add checks to avoid accessing the pkey cache when it not initialized.

Link: https://lore.kernel.org/r/20200714183414.61069-3-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Expose pkeys sysfs files only if pkey_tbl_len is set
Kamal Heib [Tue, 14 Jul 2020 18:34:08 +0000 (21:34 +0300)]
RDMA/core: Expose pkeys sysfs files only if pkey_tbl_len is set

Expose the pkeys sysfs files only if the pkey_tbl_len is set by the
providers.

Link: https://lore.kernel.org/r/20200714183414.61069-2-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/rxe: Prevent access to wr->next ptr afrer wr is posted to send queue
Mikhail Malygin [Thu, 16 Jul 2020 19:03:41 +0000 (22:03 +0300)]
RDMA/rxe: Prevent access to wr->next ptr afrer wr is posted to send queue

rxe_post_send_kernel() iterates over linked list of wr's, until the
wr->next ptr is NULL.  However if we've got an interrupt after last wr is
posted, control may be returned to the code after send completion callback
is executed and wr memory is freed.

As a result, wr->next pointer may contain incorrect value leading to
panic. Store the wr->next on the stack before posting it.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200716190340.23453-1-m.malygin@yadro.com
Signed-off-by: Mikhail Malygin <m.malygin@yadro.com>
Signed-off-by: Sergey Kojushev <s.kojushev@yadro.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/qedr: Add EDPM max size to alloc ucontext response
Michal Kalderon [Tue, 7 Jul 2020 06:31:00 +0000 (09:31 +0300)]
RDMA/qedr: Add EDPM max size to alloc ucontext response

User space should receive the maximum edpm size from kernel driver,
similar to other edpm/ldpm related limits.  Add an additional parameter to
the alloc_ucontext_resp structure for the edpm maximum size.

In addition, pass an indication from user-space to kernel
(and not just kernel to user) that the DPM sizes are supported.

This is for supporting backward-forward compatibility between driver and
lib for everything related to DPM transaction and limit sizes.

This should have been part of commit mentioned in Fixes tag.

Link: https://lore.kernel.org/r/20200707063100.3811-3-michal.kalderon@marvell.com
Fixes: 93a3d05f9d68 ("RDMA/qedr: Add kernel capability flags for dpm enabled mode")
Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/qedr: Add EDPM mode type for user-fw compatibility
Michal Kalderon [Tue, 7 Jul 2020 06:30:59 +0000 (09:30 +0300)]
RDMA/qedr: Add EDPM mode type for user-fw compatibility

In older FW versions the completion flag was treated as the ack flag in
edpm messages.  commit ff937b916eb6 ("qed: Add EDPM mode type for user-fw
compatibility") exposed the FW option of setting which mode the QP is in
by adding a flag to the qedr <-> qed API.

This patch adds the qedr <-> libqedr interface so that the libqedr can set
the flag appropriately and qedr can pass it down to FW.  Flag is added for
backward compatibility with libqedr.

For older libs, this flag didn't exist and therefore set to zero.

Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs")
Link: https://lore.kernel.org/r/20200707063100.3811-2-michal.kalderon@marvell.com
Signed-off-by: Yuval Bason <yuval.bason@marvell.com>
Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/usnic: switch from 'pci_' to 'dma_' API
Christophe JAILLET [Sat, 11 Jul 2020 07:31:20 +0000 (09:31 +0200)]
RDMA/usnic: switch from 'pci_' to 'dma_' API

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script bellow.
It has been compile tested.

When memory is allocated, GFP_ATOMIC should be used to be consistent with
the surrounding code.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)

Link: https://lore.kernel.org/r/20200711073120.249146-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/hfi1: Remove unnecessary fall-through markings
Gustavo A. R. Silva [Thu, 9 Jul 2020 23:52:50 +0000 (18:52 -0500)]
IB/hfi1: Remove unnecessary fall-through markings

Reorganize the code a bit in a more standard way[1] and remove
unnecessary fall-through markings.

[1] https://lore.kernel.org/lkml/20200708054703.GR207186@unreal/

Link: https://lore.kernel.org/r/20200709235250.GA26678@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/qedr: SRQ's bug fixes
Yuval Basson [Wed, 8 Jul 2020 19:55:26 +0000 (22:55 +0300)]
RDMA/qedr: SRQ's bug fixes

QP's with the same SRQ, working on different CQs and running in parallel
on different CPUs could lead to a race when maintaining the SRQ consumer
count, and leads to FW running out of SRQs. Update the consumer
atomically.  Make sure the wqe_prod is updated after the sge_prod due to
FW requirements.

Fixes: 3491c9e799fb ("qedr: Add support for kernel mode SRQ's")
Link: https://lore.kernel.org/r/20200708195526.31040-1-ybason@marvell.com
Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
Signed-off-by: Yuval Basson <ybason@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/isert: allocate RW ctxs according to max IO size
Max Gurtovoy [Wed, 8 Jul 2020 09:19:08 +0000 (12:19 +0300)]
IB/isert: allocate RW ctxs according to max IO size

Current iSER target code allocates MR pool budget based on queue size.
Since there is no handshake between iSER initiator and target on max IO
size, we'll set the iSER target to support upto 16MiB IO operations and
allocate the correct number of RDMA ctxs according to the factor of MR's
per IO operation. This would guarantee sufficient size of the MR pool for
the required IO queue depth and IO size.

Link: https://lore.kernel.org/r/20200708091908.162263-1-maxg@mellanox.com
Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Init dest_type when create flow
Daria Velikovsky [Tue, 7 Jul 2020 11:02:59 +0000 (14:02 +0300)]
RDMA/mlx5: Init dest_type when create flow

When using action drop dest_type was never assigned to any value.  Add
initialization of dest_type to -1 since 0 is valid.

Fixes: f29de9eee782 ("RDMA/mlx5: Add support for drop action in DV steering")
Link: https://lore.kernel.org/r/20200707110259.882276-1-leon@kernel.org
Signed-off-by: Daria Velikovsky <daria@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/rxe: Remove rxe_link_layer()
Kamal Heib [Sun, 5 Jul 2020 10:43:13 +0000 (13:43 +0300)]
RDMA/rxe: Remove rxe_link_layer()

Instead of returning IB_LINK_LAYER_ETHERNET from rxe_link_layer, return it
directly from get_link_layer callback and remove rxe_link_layer().

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200705104313.283034-5-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/rxe: Return void from rxe_mem_init_dma()
Kamal Heib [Sun, 5 Jul 2020 10:43:12 +0000 (13:43 +0300)]
RDMA/rxe: Return void from rxe_mem_init_dma()

The return value from rxe_mem_init_dma() is always 0 - change it to be
void and fix the callers accordingly.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200705104313.283034-4-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/rxe: Return void from rxe_init_port_param()
Kamal Heib [Sun, 5 Jul 2020 10:43:11 +0000 (13:43 +0300)]
RDMA/rxe: Return void from rxe_init_port_param()

The return value from rxe_init_port_param() is always 0 - change it to be
void.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200705104313.283034-3-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/rxe: Drop pointless checks in rxe_init_ports
Kamal Heib [Sun, 5 Jul 2020 10:43:10 +0000 (13:43 +0300)]
RDMA/rxe: Drop pointless checks in rxe_init_ports

Both pkey_tbl_len and gid_tbl_len are set in rxe_init_port_param() - so no
need to check if they aren't set.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200705104313.283034-2-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agonet/mlx5: Enable count action for rules with allow action
Michael Guralnik [Wed, 15 Jul 2020 04:28:35 +0000 (21:28 -0700)]
net/mlx5: Enable count action for rules with allow action

Enable the creation of rules with allow and count actions.
This enables using counters on egress flow tables.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Add interface changes required for VDPA
Eli Cohen [Wed, 15 Jul 2020 04:28:34 +0000 (21:28 -0700)]
net/mlx5: Add interface changes required for VDPA

Rename mlx5_ifc_device_virtio_emulation_cap_bits to
mlx5_ifc_virtio_emulation_cap_bits to match names produced by the
tools producing these auto generated files.

In addition missing capabilities that will be required by VDPA
implementation.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Add VDPA interface type to supported enumerations
Eli Cohen [Wed, 15 Jul 2020 04:28:33 +0000 (21:28 -0700)]
net/mlx5: Add VDPA interface type to supported enumerations

VDPA is a new interface that will be added in subsequent patches. It
uses mlx5 core devices and resources. Add an interface type for it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Support setting access rights of dma addresses
Eli Cohen [Wed, 15 Jul 2020 04:28:32 +0000 (21:28 -0700)]
net/mlx5: Support setting access rights of dma addresses

mlx5_fill_page_frag_array() is used to populate dma addresses to
resources that require it, such as QPs, RQs etc. When the resource is
used, PA list permissions are ignored. For resources that use MTT list,
the user is required to provide the access rights. Subsequent patches
use resources that require MTT lists, so modify API and implementation
to support that.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoRDMA/counter: Allow manually bind QPs with different pids to same counter
Mark Zhang [Thu, 2 Jul 2020 08:29:33 +0000 (11:29 +0300)]
RDMA/counter: Allow manually bind QPs with different pids to same counter

In manual mode allow bind user QPs with different pids to same counter,
since this is allowed in auto mode.
Bind kernel QPs and user QPs to the same counter are not allowed.

Fixes: 1bd8e0a9d0fd ("RDMA/counter: Allow manual mode configuration support")
Link: https://lore.kernel.org/r/20200702082933.424537-4-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/counter: Only bind user QPs in auto mode
Mark Zhang [Thu, 2 Jul 2020 08:29:32 +0000 (11:29 +0300)]
RDMA/counter: Only bind user QPs in auto mode

In auto mode only bind user QPs to a dynamic counter, since this feature
is mainly used for system statistic and diagnostic purpose, while there's
no need to counter kernel QPs so far.

Fixes: 99fa331dc862 ("RDMA/counter: Add "auto" configuration mode support")
Link: https://lore.kernel.org/r/20200702082933.424537-3-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/counter: Add PID category support in auto mode
Mark Zhang [Thu, 2 Jul 2020 08:29:31 +0000 (11:29 +0300)]
RDMA/counter: Add PID category support in auto mode

With the "PID" category QPs have same PID will be bound to same counter;
If this category is not set then QPs have different PIDs will be bound
to same counter.

This is implemented for 2 reasons:
1. The counter is a limited resource, while there may be dozens of
   applications, each of which creates several types of QPs, which means
   it may doesn't have enough counter.
2. The system administrator needs all QPs created by all applications
   with same type bound to one counter.

The counter name and PID is only make sense when "PID" category are
configured.

This category can also be used in combine with others, e.g. QP type.

Link: https://lore.kernel.org/r/20200702082933.424537-2-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Remove unused to_mibmr function
Gal Pressman [Sun, 5 Jul 2020 14:11:43 +0000 (17:11 +0300)]
RDMA/mlx5: Remove unused to_mibmr function

The to_mibmr function is unused, remove it.

Link: https://lore.kernel.org/r/20200705141143.47303-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Delete one-time used functions
Leon Romanovsky [Thu, 2 Jul 2020 08:18:09 +0000 (11:18 +0300)]
RDMA/mlx5: Delete one-time used functions

Merge them into their callers, usually the only thing the caller did was
to call the one function, so this is clearer.

Link: https://lore.kernel.org/r/20200702081809.423482-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Cleanup DEVX initialization flow
Leon Romanovsky [Thu, 2 Jul 2020 08:18:08 +0000 (11:18 +0300)]
RDMA/mlx5: Cleanup DEVX initialization flow

Move DEVX initialization and cleanup flows to the devx.c instead of having
almost empty functions in main.c

Link: https://lore.kernel.org/r/20200702081809.423482-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Separate flow steering logic from main.c
Leon Romanovsky [Thu, 2 Jul 2020 08:18:07 +0000 (11:18 +0300)]
RDMA/mlx5: Separate flow steering logic from main.c

Move flow steering logic to be in separate file and rename flow.c to be
fs.c because it is better describe the content.

Link: https://lore.kernel.org/r/20200702081809.423482-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Separate counters from main.c
Leon Romanovsky [Thu, 2 Jul 2020 08:18:06 +0000 (11:18 +0300)]
RDMA/mlx5: Separate counters from main.c

There are number of counters types supported in mlx5_ib: HW counters,
congestion counters, Q-counters and flow counters. Almost all supporting
code was placed in main.c that made almost impossible to maintain the code
anymore. Let's create separate code namespace for the counters to easy
future generalization effort.

Link: https://lore.kernel.org/r/20200702081809.423482-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Separate restrack callbacks initialization from main.c
Leon Romanovsky [Thu, 2 Jul 2020 08:18:05 +0000 (11:18 +0300)]
RDMA/mlx5: Separate restrack callbacks initialization from main.c

The restrack code has separate .c, so move callbacks initialization to
that file to improve code locality.

Link: https://lore.kernel.org/r/20200702081809.423482-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Limit the scope of mlx5_ib_enable_driver function
Leon Romanovsky [Thu, 2 Jul 2020 08:18:04 +0000 (11:18 +0300)]
RDMA/mlx5: Limit the scope of mlx5_ib_enable_driver function

The mlx5_ib_enable_driver() is local function and doesn't need to be
shared in mlx5_ib, so change it's signature to have static keyword in it.

Link: https://lore.kernel.org/r/20200702081809.423482-2-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/hns: Optimize MTR level-0 addressing to access huge page
Xi Wang [Tue, 30 Jun 2020 14:01:36 +0000 (22:01 +0800)]
RDMA/hns: Optimize MTR level-0 addressing to access huge page

If hns ROCEE is set to level-0 addressing, the length of the entire buffer
can be used as the page size. The driver needn't to split the buffer into
small units because all pages are continuous.

Link: https://lore.kernel.org/r/1593525696-12570-1-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/rxe: Skip dgid check in loopback mode
Zhu Yanjun [Tue, 30 Jun 2020 12:36:05 +0000 (15:36 +0300)]
RDMA/rxe: Skip dgid check in loopback mode

In the loopback tests, the following call trace occurs.

 Call Trace:
  __rxe_do_task+0x1a/0x30 [rdma_rxe]
  rxe_qp_destroy+0x61/0xa0 [rdma_rxe]
  rxe_destroy_qp+0x20/0x60 [rdma_rxe]
  ib_destroy_qp_user+0xcc/0x220 [ib_core]
  uverbs_free_qp+0x3c/0xc0 [ib_uverbs]
  destroy_hw_idr_uobject+0x24/0x70 [ib_uverbs]
  uverbs_destroy_uobject+0x43/0x1b0 [ib_uverbs]
  uobj_destroy+0x41/0x70 [ib_uverbs]
  __uobj_get_destroy+0x39/0x70 [ib_uverbs]
  ib_uverbs_destroy_qp+0x88/0xc0 [ib_uverbs]
  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb9/0xf0 [ib_uverbs]
  ib_uverbs_cmd_verbs+0xb16/0xc30 [ib_uverbs]

The root cause is that the actual RDMA connection is not created in the
loopback tests and the rxe_match_dgid will fail randomly.

To fix this call trace which appear in the loopback tests, skip check of
the dgid.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20200630123605.446959-1-leon@kernel.org
Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA: Move XRCD to be under ib_core responsibility
Leon Romanovsky [Tue, 30 Jun 2020 10:18:54 +0000 (13:18 +0300)]
RDMA: Move XRCD to be under ib_core responsibility

Update the code to allocate and free ib_xrcd structure in the
ib_core instead of inside drivers.

Link: https://lore.kernel.org/r/20200630101855.368895-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Create and destroy counters in the ib_core
Leon Romanovsky [Tue, 30 Jun 2020 10:18:52 +0000 (13:18 +0300)]
RDMA/core: Create and destroy counters in the ib_core

Move allocation and destruction of counters under ib_core responsibility

Link: https://lore.kernel.org/r/20200630101855.368895-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/uverbs: Expose UAPI to query MR
Yishai Hadas [Tue, 30 Jun 2020 09:39:16 +0000 (12:39 +0300)]
IB/uverbs: Expose UAPI to query MR

Expose UAPI to query MR, this will let user space application that
didn't allocate the MR but has access to by owning the matching command
FD to retrieve its information.

Link: https://lore.kernel.org/r/20200630093916.332097-8-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Introduce UAPI to query PD attributes
Yishai Hadas [Tue, 30 Jun 2020 09:39:15 +0000 (12:39 +0300)]
RDMA/mlx5: Introduce UAPI to query PD attributes

Introduce UAPI to query PD attributes, this can be used to retrieve PD
attributes by having the PD handle of the created one and owning the
command FD for the ucontxet.

Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Implement the query ucontext functionality
Yishai Hadas [Tue, 30 Jun 2020 09:39:14 +0000 (12:39 +0300)]
RDMA/mlx5: Implement the query ucontext functionality

Implement the query ucontext functionality by returning the original
ucontext data as part of an extra mlx5 attribute that holds the driver
UAPI response.

Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Refactor mlx5_ib_alloc_ucontext() response
Yishai Hadas [Tue, 30 Jun 2020 09:39:13 +0000 (12:39 +0300)]
RDMA/mlx5: Refactor mlx5_ib_alloc_ucontext() response

Refactor mlx5_ib_alloc_ucontext() to set its response fields in a
cleaner way.

It includes,
- Move the relevant code to a self contained function.
- Calculate the response length once and drop redundant code all around.
- Reuse previously set ucontext fields once preparing the response.

The self contained function will be used in next patch as part of
implementing the query ucontext functionality.

Link: https://lore.kernel.org/r/20200630093916.332097-5-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/uverbs: Expose UAPI to query ucontext
Yishai Hadas [Tue, 30 Jun 2020 09:39:12 +0000 (12:39 +0300)]
IB/uverbs: Expose UAPI to query ucontext

Expose UAPI to query ucontext, this will let user space application that
didn't allocate the ucontext but has access to by owning the matching
command FD to retrieve the ucontext information.

Link: https://lore.kernel.org/r/20200630093916.332097-4-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/uverbs: Set IOVA on IB MR in uverbs layer
Yishai Hadas [Tue, 30 Jun 2020 09:39:11 +0000 (12:39 +0300)]
IB/uverbs: Set IOVA on IB MR in uverbs layer

Set IOVA on IB MR in uverbs layer to let all drivers have it, this
includes both reg/rereg MR flows.
As part of this change cleaned-up this setting from the drivers that
already did it by themselves in their user flows.

Fixes: e6f0330106f4 ("mlx4_ib: set user mr attributes in struct ib_mr")
Link: https://lore.kernel.org/r/20200630093916.332097-3-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/uverbs: Enable CQ ioctl commands by default
Yishai Hadas [Tue, 30 Jun 2020 09:39:10 +0000 (12:39 +0300)]
IB/uverbs: Enable CQ ioctl commands by default

Enable CQ ioctl commands by default, this functionality is fully mature
to be used over ioctl, no reason to maintain any more the EXP KCONFIG
entry to enable it.

Link: https://lore.kernel.org/r/20200630093916.332097-2-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Optimize XRC target lookup
Maor Gottlieb [Mon, 6 Jul 2020 12:27:16 +0000 (15:27 +0300)]
RDMA/core: Optimize XRC target lookup

Replace the mutex with read write semaphore and use xarray instead of
linked list for XRC target QPs. This will give faster XRC target
lookup. In addition, when QP is closed, don't insert it back to the xarray
if the destroy command failed.

Link: https://lore.kernel.org/r/20200706122716.647338-4-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Clean ib_alloc_xrcd() and reuse it to allocate XRC domain
Maor Gottlieb [Mon, 6 Jul 2020 12:27:15 +0000 (15:27 +0300)]
RDMA/core: Clean ib_alloc_xrcd() and reuse it to allocate XRC domain

ib_alloc_xrcd() already does the required initialization, so move the
uverbs to call it and save code duplication, while cleaning the function
argument lists of that function.

Link: https://lore.kernel.org/r/20200706122716.647338-3-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Get XRCD number directly for the internal use
Leon Romanovsky [Mon, 6 Jul 2020 12:27:14 +0000 (15:27 +0300)]
RDMA/mlx5: Get XRCD number directly for the internal use

The mlx5_ib creates XRC domain and uses for creating internal SRQ.
However all that is needed is XRCD number and not full blown ib_xrcd
objects.

Update the code to get and store the number only.

Link: https://lore.kernel.org/r/20200706122716.647338-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA: Remove the udata parameter from alloc_mr callback
Gal Pressman [Mon, 6 Jul 2020 12:03:43 +0000 (15:03 +0300)]
RDMA: Remove the udata parameter from alloc_mr callback

Allocating an MR flow can only be initiated by kernel users, and not from
userspace so a udata parameter is redundant.

Link: https://lore.kernel.org/r/20200706120343.10816-4-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Remove ib_alloc_mr_user function
Gal Pressman [Mon, 6 Jul 2020 12:03:42 +0000 (15:03 +0300)]
RDMA/core: Remove ib_alloc_mr_user function

Allocating an MR flow can only be initiated by kernel users, and not from
userspace. As a result, the udata parameter is always being passed as
NULL. Rename ib_alloc_mr_user function to ib_alloc_mr and remove the udata
parameter.

Link: https://lore.kernel.org/r/20200706120343.10816-3-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Check for error instead of success in alloc MR function
Gal Pressman [Mon, 6 Jul 2020 12:03:41 +0000 (15:03 +0300)]
RDMA/core: Check for error instead of success in alloc MR function

The common kernel pattern is to check for error, not success.  Flip the if
statement accordingly and keep the main flow unindented.

Link: https://lore.kernel.org/r/20200706120343.10816-2-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Clean up tracepoint headers
Chuck Lever [Thu, 2 Jul 2020 14:19:46 +0000 (10:19 -0400)]
RDMA/core: Clean up tracepoint headers

There's no need for core/trace.c to include rdma/ib_verbs.h twice.

Link: https://lore.kernel.org/r/20200702141946.3775.51943.stgit@klimt.1015granger.net
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoMerge branch 'mlx5_ipoib_qpn' into rdma.git for-next
Jason Gunthorpe [Mon, 6 Jul 2020 17:29:58 +0000 (14:29 -0300)]
Merge branch 'mlx5_ipoib_qpn' into rdma.git for-next

Michael Guralnik says:

====================
This series handles IPoIB child interface creation with setting
interface's HW address.

In current implementation, lladdr requested by user is ignored and
overwritten. Child interface gets the same GID as the parent interface and
a QP number which is assigned by the underlying drivers.

In this series we fix this behavior so that user's requested address is
assigned to the newly created interface.

As specific QP number request is not supported for all vendors, QP number
requested by user will still be overwritten when this is not supported.

Behavior of creation of child interfaces through the sysfs mechanism or
without specifying a requested address, stays the same.
====================

Based on the mlx5-next branch at
      git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_ipoib_qpn':
  RDMA/ipoib: Handle user-supplied address when creating child
  net/mlx5: Enable QP number request when creating IPoIB underlay QP

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/ipoib: Handle user-supplied address when creating child
Michael Guralnik [Tue, 23 Jun 2020 11:01:05 +0000 (14:01 +0300)]
RDMA/ipoib: Handle user-supplied address when creating child

Use the address supplied by user when creating a child interface.

Previously, the address requested by the user was ignored and overridden
with parent's GID and the random QP number assigned to the child.

Link: https://lore.kernel.org/r/20200623110105.1225750-3-leon@kernel.org
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agonet/mlx5: Enable QP number request when creating IPoIB underlay QP
Michael Guralnik [Wed, 20 May 2020 10:59:06 +0000 (13:59 +0300)]
net/mlx5: Enable QP number request when creating IPoIB underlay QP

If in the process of creating the underlay QP for an IPoIB interface
the user has set the address and specifically the 1st-3rd bytes
representing the QP number, use the requested QP number when creating
the underlay QP.

For a user to be able to request a QP number on QP creation, the MKEY_BY_NAME
NVCONFIG should be set. As mkey_by_name and qp_by_name are coupled in FW.
This requires driver to query the mkey_by_name max cap during initialization
and set the current cap if it was enabled in FW.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
4 years agoRDMA/mlx5: Introduce ODP prefetch counter
Maor Gottlieb [Sun, 21 Jun 2020 10:41:47 +0000 (13:41 +0300)]
RDMA/mlx5: Introduce ODP prefetch counter

For debugging purpose it will be easier to understand if prefetch works
okay if it has its own counter. Introduce ODP prefetch counter and count
per MR the total number of prefetched pages.

In addition remove comment which is not relevant anymore and anyway not in
the correct place.

Link: https://lore.kernel.org/r/20200621104147.53795-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/core: Fix bogus WARN_ON during ib_unregister_device_queued()
Jason Gunthorpe [Fri, 26 Jun 2020 17:49:10 +0000 (14:49 -0300)]
RDMA/core: Fix bogus WARN_ON during ib_unregister_device_queued()

ib_unregister_device_queued() can only be used by drivers using the new
dealloc_device callback flow, and it has a safety WARN_ON to ensure
drivers are using it properly.

However, if unregister and register are raced there is a special
destruction path that maintains the uniform error handling semantic of
'caller does ib_dealloc_device() on failure'. This requires disabling the
dealloc_device callback which triggers the WARN_ON.

Instead of using NULL to disable the callback use a special function
pointer so the WARN_ON does not trigger.

Fixes: d0899892edd0 ("RDMA/device: Provide APIs from the core code to help unregistration")
Link: https://lore.kernel.org/r/0-v1-a36d512e0a99+762-syz_dealloc_driver_jgg@nvidia.com
Reported-by: syzbot+4088ed905e4ae2b0e13b@syzkaller.appspotmail.com
Suggested-by: Hillf Danton <hdanton@sina.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()
Jason Gunthorpe [Thu, 25 Jun 2020 17:42:19 +0000 (20:42 +0300)]
RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()

ipoib_mcast_carrier_on_task() insanely open codes a rtnl_lock() such that
the only time flush_workqueue() can be called is if it also clears
IPOIB_FLAG_OPER_UP.

Thus the flush inside ipoib_flush_ah() will deadlock if it gets unlucky
enough, and lockdep doesn't help us to find it early:

          CPU0               CPU1          CPU2
   __ipoib_ib_dev_flush()
      down_read(vlan_rwsem)

                         ipoib_vlan_add()
                           rtnl_trylock()
                           down_write(vlan_rwsem)

      ipoib_mcast_carrier_on_task()
 while (!rtnl_trylock())
      msleep(20);

      ipoib_flush_ah()
flush_workqueue(priv->wq)

Clean up the ah_reaper related functions and lifecycle to make sense:

 - Start/Stop of the reaper should only be done in open/stop NDOs, not in
   any other places

 - cancel and flush of the reaper should only happen in the stop NDO.
   cancel is only functional when combined with IPOIB_STOP_REAPER.

 - Non-stop places were flushing the AH's just need to flush out dead AH's
   synchronously and ignore the background task completely. It is fully
   locked and harmless to leave running.

Which ultimately fixes the ABBA deadlock by removing the unnecessary
flush_workqueue() from the problematic place under the vlan_rwsem.

Fixes: efc82eeeae4e ("IB/ipoib: No longer use flush as a parameter")
Link: https://lore.kernel.org/r/20200625174219.290842-1-kamalheib1@gmail.com
Reported-by: Kamal Heib <kheib@redhat.com>
Tested-by: Kamal Heib <kheib@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/hfi1: Convert PCIBIOS_* errors to generic -E* errors
Bolarinwa Olayemi Saheed [Mon, 15 Jun 2020 07:32:19 +0000 (09:32 +0200)]
IB/hfi1: Convert PCIBIOS_* errors to generic -E* errors

pcie_speeds() and restore_pci_variables() returns PCIBIOS_ error codes
from PCIe capability accessors.

PCIBIOS_ error codes have positive values. Passing on these values is
inconsistent with functions which return only a negative value on failure.

Before passing on the return value of PCIe capability accessors, call
pcibios_err_to_errno() to convert any positive PCIBIOS_ error codes to
negative generic error values.

Link: https://lore.kernel.org/r/20200615073225.24061-3-refactormyself@gmail.com
Suggested-by: Bjorn Helgaas <bjorn@helgaas.com>
Signed-off-by: Bolarinwa Olayemi Saheed <refactormyself@gmail.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agonet/mlx5: kTLS, Improve TLS params layout structures
Tariq Toukan [Fri, 26 Jun 2020 05:59:43 +0000 (22:59 -0700)]
net/mlx5: kTLS, Improve TLS params layout structures

Add explicit WQE segment structures for the TLS static and progress
params.
According to the HW spec, TISN is not part of the progress params context,
take it out of it.
Rename the control segment tisn field as it could hold either a TIS or
a TIR number.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Avoid eswitch header inclusion in fs core layer
Parav Pandit [Fri, 26 Jun 2020 05:59:42 +0000 (22:59 -0700)]
net/mlx5: Avoid eswitch header inclusion in fs core layer

Flow steering core layer is independent of the eswitch layer.
Hence avoid fs_core dependency on eswitch.

Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Avoid RDMA file inclusion in core driver
Parav Pandit [Fri, 26 Jun 2020 05:59:41 +0000 (22:59 -0700)]
net/mlx5: Avoid RDMA file inclusion in core driver

mlx5 cq.h does not depend on RDMA verbs.
Remove RDMA verbs file inclusion.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agoRDMA/core: Delete not-used create RWQ table function
Leon Romanovsky [Wed, 24 Jun 2020 10:54:21 +0000 (13:54 +0300)]
RDMA/core: Delete not-used create RWQ table function

The RWQ table is used for RSS uverbs and not in used for the kernel
consumers, delete ib_create_rwq_ind_table() routine that is not
called at all.

Link: https://lore.kernel.org/r/20200624105422.1452290-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/mad: Delete RMPP_STATE_CANCELING state
Shay Drory [Sun, 21 Jun 2020 10:47:38 +0000 (13:47 +0300)]
IB/mad: Delete RMPP_STATE_CANCELING state

The cancel_delayed_work can be called under lock since it doesn't sleep.
This makes the RMPP_STATE_CANCELING state not needed anymore, remove it.

Link: https://lore.kernel.org/r/20200621104738.54850-5-leon@kernel.org
Signed-off-by: Shay Drory <shayd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/mad: Change atomics to refcount API
Shay Drory [Sun, 21 Jun 2020 10:47:37 +0000 (13:47 +0300)]
IB/mad: Change atomics to refcount API

The refcount API provides better safety than atomics API.  Therefore,
change atomic functions to refcount functions.

Link: https://lore.kernel.org/r/20200621104738.54850-4-leon@kernel.org
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoIB/mad: Issue complete whenever decrements agent refcount
Shay Drory [Sun, 21 Jun 2020 10:47:36 +0000 (13:47 +0300)]
IB/mad: Issue complete whenever decrements agent refcount

Replace calls of atomic_dec() to mad_agent_priv->refcount with calls to
deref_mad_agent() in order to issue complete. Most likely the refcount is
> 1 at these points, but it is difficult to prove. Performance is not
important on these paths, so be obviously correct.

Link: https://lore.kernel.org/r/20200621104738.54850-3-leon@kernel.org
Signed-off-by: Shay Drory <shayd@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/ipoib: Return void from ipoib_ib_dev_stop()
Kamal Heib [Tue, 23 Jun 2020 10:52:36 +0000 (13:52 +0300)]
RDMA/ipoib: Return void from ipoib_ib_dev_stop()

The return value from ipoib_ib_dev_stop() is always 0 - change it to be
void.

Link: https://lore.kernel.org/r/20200623105236.18683-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoMerge branch 'raw_dumps' into rdma.git for-next
Jason Gunthorpe [Tue, 23 Jun 2020 14:55:31 +0000 (11:55 -0300)]
Merge branch 'raw_dumps' into rdma.git for-next

Maor Gottlieb says:

====================
The following series adds support to get the RDMA resource data in RAW
format. The main motivation for doing this is to enable vendors to return
the entire QP/CQ/MR data without a need from the vendor to set each
field separately.
====================

Based on the mlx5-next branch at
      git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies

* branch 'raw_dumps':
  RDMA/mlx5: Add support to get MR resource in RAW format
  RDMA/mlx5: Add support to get CQ resource in RAW format
  RDMA/mlx5: Add support to get QP resource in RAW format
  RDMA: Add support to dump resource tracker in RAW format
  RDMA: Add dedicated CM_ID resource tracker function
  RDMA: Add dedicated QP resource tracker function
  RDMA: Add a dedicated CQ resource tracker function
  RDMA: Add dedicated MR resource tracker function
  RDMA/core: Don't call fill_res_entry for PD
  net/mlx5: Add support in query QP, CQ and MKEY segments
  net/mlx5: Export resource dump interface

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Add support to get MR resource in RAW format
Maor Gottlieb [Tue, 23 Jun 2020 11:30:43 +0000 (14:30 +0300)]
RDMA/mlx5: Add support to get MR resource in RAW format

Add support to get MR (mkey) resource dump in RAW format.

Link: https://lore.kernel.org/r/20200623113043.1228482-12-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Add support to get CQ resource in RAW format
Maor Gottlieb [Tue, 23 Jun 2020 11:30:42 +0000 (14:30 +0300)]
RDMA/mlx5: Add support to get CQ resource in RAW format

Add support to get CQ resource dump in RAW format.

Link: https://lore.kernel.org/r/20200623113043.1228482-11-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA/mlx5: Add support to get QP resource in RAW format
Maor Gottlieb [Tue, 23 Jun 2020 11:30:41 +0000 (14:30 +0300)]
RDMA/mlx5: Add support to get QP resource in RAW format

Add a generic function to use the resource dump mechanism to get the
QP resource data.

Link: https://lore.kernel.org/r/20200623113043.1228482-10-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA: Add support to dump resource tracker in RAW format
Maor Gottlieb [Tue, 23 Jun 2020 11:30:40 +0000 (14:30 +0300)]
RDMA: Add support to dump resource tracker in RAW format

Add support to get resource dump in raw format. It enable drivers to
return the entire device specific QP/CQ/MR context without a need from the
driver to set each field separately.

The raw query returns only the device specific data, general data is still
returned by using the existing queries.

Example:

$ rdma res show mr dev mlx5_1 mrn 2 -r -j
[{"ifindex":7,"ifname":"mlx5_1",
"data":[0,4,255,254,0,0,0,0,0,0,0,0,16,28,0,216,...]}]

Link: https://lore.kernel.org/r/20200623113043.1228482-9-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA: Add dedicated CM_ID resource tracker function
Maor Gottlieb [Tue, 23 Jun 2020 11:30:39 +0000 (14:30 +0300)]
RDMA: Add dedicated CM_ID resource tracker function

In order to avoid double multiplexing of the resource when it is a cm id,
add a dedicated callback function. In addition remove fill_res_entry which
is not used anymore.

Link: https://lore.kernel.org/r/20200623113043.1228482-8-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA: Add dedicated QP resource tracker function
Maor Gottlieb [Tue, 23 Jun 2020 11:30:38 +0000 (14:30 +0300)]
RDMA: Add dedicated QP resource tracker function

In order to avoid double multiplexing of the resource when it is a QP, add
a dedicated callback function.

Link: https://lore.kernel.org/r/20200623113043.1228482-7-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
4 years agoRDMA: Add a dedicated CQ resource tracker function
Maor Gottlieb [Tue, 23 Jun 2020 11:30:37 +0000 (14:30 +0300)]
RDMA: Add a dedicated CQ resource tracker function

In order to avoid double multiplexing of the resource when it is a CQ, add
a dedicated callback function.

Link: https://lore.kernel.org/r/20200623113043.1228482-6-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>