linux-2.6-microblaze.git
Mike Marciniszyn [Tue, 22 Aug 2017 01:26:20 +0000 (18:26 -0700)]
IB/{qib, hfi1}: Avoid flow control testing for RDMA write operation

Section 9.7.7.2.5 of the 1.3 IBTA spec clearly says that receive
credits should never apply to RDMA write.

qib and hfi1 were doing that.  The following situation will result
in a QP hang:
- A prior SEND or RDMA_WRITE with immediate consumed the last
  credit for a QP using RC receive buffer credits
- The prior op is acked so there are no more acks
- The peer ULP fails to post receive for some reason
- An RDMA write sees that the credits are exhausted and waits
- The peer ULP posts receive buffers
- The ULP posts a send or RDMA write that will be hung

The fix is to avoid the credit test for the RDMA write operation.
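
As a rough illustration of the rule described above, the credit test can be sketched as follows. The enum, struct, and function names are hypothetical and do not match the qib or hfi1 sources; the only thing taken from the text is that receive credits gate SEND and RDMA write with immediate, and never a plain RDMA write.

```c
#include <stdbool.h>

/* Hypothetical opcode set, for illustration only. */
enum wr_opcode {
    WR_SEND,
    WR_RDMA_WRITE,
    WR_RDMA_WRITE_WITH_IMM,
};

/* Hypothetical per-QP state: receive credits advertised by the peer. */
struct qp_state {
    unsigned int credits;
};

/* Only operations that consume a receive buffer at the responder need a credit. */
static bool needs_receive_credit(enum wr_opcode op)
{
    return op == WR_SEND || op == WR_RDMA_WRITE_WITH_IMM;
}

/* Returns true if the request may be sent now. */
static bool may_send(const struct qp_state *qp, enum wr_opcode op)
{
    if (!needs_receive_credit(op))
        return true;            /* plain RDMA write skips the credit test */
    return qp->credits > 0;
}
```

With this shape, a plain RDMA write posted while credits are exhausted goes out immediately instead of waiting on credits.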

Cc: <stable@vger.kernel.org>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Marciniszyn [Tue, 22 Aug 2017 01:26:14 +0000 (18:26 -0700)]
IB/rdmavt: Use rvt_put_swqe() in rvt_clear_mr_ref()

hfi1 and qib were converted in previous patches; do the same for rdmavt.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Fri, 25 Aug 2017 00:25:15 +0000 (20:25 -0400)]
Merge branch 'mellanox' into k.o/for-next

Signed-off-by: Doug Ledford <dledford@redhat.com>
Bodong Wang [Thu, 17 Aug 2017 12:52:35 +0000 (15:52 +0300)]
IB/mlx5: Report mlx5 enhanced multi packet WQE capability

Expose enhanced multi packet WQE capability to user space through
query_device by uhw.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bodong Wang [Thu, 17 Aug 2017 12:52:34 +0000 (15:52 +0300)]
IB/mlx5: Allow posting multi packet send WQEs if hardware supports

Set the field to allow posting multi packet send WQEs if hardware
supports this feature. This doesn't mean the send WQEs will be for
multi packet unless the send WQE was prepared according to multi
packet send WQE format.

User space shall use flag MLX5_IB_ALLOW_MPW to check if hardware
supports MPW and allows MPW in SQ context.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Yishai Hadas [Thu, 17 Aug 2017 12:52:33 +0000 (15:52 +0300)]
IB/mlx5: Add support for multi underlay QP

Set underlay QPN as part of flow rule when it's applicable.

There is one root flow table in the NIC RX namespace and all the
underlay QPs steer the traffic to this flow table.
To prevent a QP from receiving traffic that is not targeted at its
underlay QP, we need to set the underlay QP number as part of
the steering match.

Note:
When multicast traffic is sent, the QPN filtering is done by the firmware
as an early step. Adding the QPN match on the flow table entry is
wrong, because by that time the target QPN holds the multicast address
(e.g. FF(s)) and it won't match.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Ilya Lesokhin [Thu, 17 Aug 2017 12:52:32 +0000 (15:52 +0300)]
IB/mlx5: Fix integer overflow when page_shift == 31

Fix a bug where MR registration fails when mlx5_ib_cont_pages
indicates that the MR can be mapped using 2GB pages (page_shift == 31).

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Kamal Heib [Thu, 17 Aug 2017 12:52:31 +0000 (15:52 +0300)]
IB/mlx5: Fix memory leak in clean_mr error path

In the clean_mr error path, the 'mr' should be freed.

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Ilya Lesokhin [Thu, 17 Aug 2017 12:52:30 +0000 (15:52 +0300)]
IB/mlx5: Decouple MR allocation and population flows

mlx5 compatible devices have two ways of populating the MTT
table of an MKEY: using a FW command and using a UMR WQE.

A UMR is much faster, so it should be used whenever possible.
Unfortunately the code today uses UMR only if the MKEY was allocated
from the MR cache.

Fix the code to use UMR even for MKEYs that were allocated using
a FW command.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Ilya Lesokhin [Thu, 17 Aug 2017 12:52:29 +0000 (15:52 +0300)]
IB/mlx5: Enable UMR for MRs created with reg_create

This patch is the first step in decoupling UMR usage and
allocation from the MR cache. The only functional change
in this patch is to enable UMR for MRs created with
reg_create.

This change fixes a bug where ODP memory regions that
were not allocated from the MR cache did not have UMR
enabled.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Noa Osherovich [Thu, 17 Aug 2017 12:52:28 +0000 (15:52 +0300)]
IB/mlx5: Expose software parsing for Raw Ethernet QP

Software parsing (SWP) is a feature that can be used to instruct the
device to stop using its internal parser and to parse packets on the
transmit path according to offsets set for each packet.

Through this feature, the device allows the handling of checksum and
LSO by the hardware according to the location of IP and TCP/UDP
headers.

Enable SW parsing on Raw Ethernet send queue by default if firmware
supports it and report these capabilities to user space.

Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Yuval Shaia [Thu, 24 Aug 2017 17:11:42 +0000 (20:11 +0300)]
RDMA/i40iw: Remove unused argument

None of the callers of i40iw_netdev_vlan_ipv6 use the mac argument, so
let's remove it from the function's argument list.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Colin Ian King [Thu, 24 Aug 2017 08:25:53 +0000 (09:25 +0100)]
RDMA/qedr: fix spelling mistake: "invlaid" -> "invalid"

Trivial fix to spelling mistake in DP_ERR error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Selvin Xavier [Wed, 23 Aug 2017 08:08:07 +0000 (01:08 -0700)]
IB: Avoid ib_modify_port() failure for RoCE devices

IB CM calls ib_modify_port() irrespective of link layer. If a
failure is returned, the mad agent gets unregistered for those
devices. Recently, the modify_port() hook was removed from some of the
low level drivers as it was always returning success. This breaks
rdma connection establishment over those devices.
For ethernet devices, Qkey violation and port capabilities are not
applicable, so return success for RoCE when the modify_port hook
is not implemented.
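
The behaviour described above can be pictured with a small standalone sketch. The types and the wrapper below are hypothetical stand-ins, not the ib_core definitions, and the -ENOSYS value returned for the non-RoCE case is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical link-layer tag for a device. */
enum link_layer { LINK_LAYER_INFINIBAND, LINK_LAYER_ETHERNET };

struct device;
typedef int (*modify_port_fn)(struct device *dev, int port);

/* Hypothetical device with an optional driver hook. */
struct device {
    enum link_layer link_layer;
    modify_port_fn modify_port;   /* may be NULL if the driver has no hook */
};

/* Core-level wrapper: call the hook if present, otherwise decide by link layer. */
static int modify_port(struct device *dev, int port)
{
    if (dev->modify_port)
        return dev->modify_port(dev, port);

    /* No hook: nothing to modify on Ethernet, so report success for RoCE. */
    return dev->link_layer == LINK_LAYER_ETHERNET ? 0 : -ENOSYS;
}
```

A caller that tears down its agent on any non-zero return then keeps working on RoCE devices whose drivers dropped the hook.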

Cc: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Adit Ranadive [Wed, 23 Aug 2017 06:19:01 +0000 (23:19 -0700)]
RDMA/vmw_pvrdma: Update device query parameters and port caps

Added support for two device caps - max_sge_rd, max_fast_reg_page_list_len
and the IP_BASED_GIDS port cap flag.

Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: Bryan Tan <bryantan@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Bryan Tan [Wed, 23 Aug 2017 06:19:00 +0000 (23:19 -0700)]
RDMA/vmw_pvrdma: Add RoCEv2 support

The driver version is bumped for compatibility purposes. Also, send correct
GID type during register to device. Added compatibility check macros for
the device.

Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
Signed-off-by: Bryan Tan <bryantan@vmware.com>
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Feras Daoud [Wed, 23 Aug 2017 05:37:21 +0000 (08:37 +0300)]
IB/ipoib: Enable ioctl for to IPoIB rdma netdevs

Adds support for ioctl callback in the RDMA netdevs to allow
supporting functions not handled by the generic interface code.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Eitan Rabin <rabin@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:55 +0000 (15:50 +0300)]
RDMA/nes: Remove zeroed parameter from port query callback

There is no need to explicitly zero parameters, because
the structure to be filled is already initialized to zeros.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:54 +0000 (15:50 +0300)]
RDMA/mlx4: Properly annotate link layer variable

rdma_port_get_link_layer() returns enum rdma_link_layer as
its return value, hence it is better to store the return value in
a specially annotated variable and not in an int.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:53 +0000 (15:50 +0300)]
RDMA/mlx5: Limit scope of get vector affinity local function

The mlx5_ib_get_vector_affinity() function is local to the main.c file and
there is no need for it to be declared globally visible.

Fixes: 40b24403f33e ("mlx5: support ->get_vector_affinity")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Kamal Heib [Thu, 17 Aug 2017 12:50:52 +0000 (15:50 +0300)]
IB/rxe: Make rxe_counter_name static

rxe_counter_name is used in rxe_hw_counters.c only. Make it static.

Fixes: 0b1e5b99a48b ('IB/rxe: Add port protocol stats')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Erez Shitrit [Thu, 17 Aug 2017 12:50:50 +0000 (15:50 +0300)]
IB/ipoib: Sync between remove_one to sysfs calls that use rtnl_lock

In order to avoid deadlock between sysfs functions (like create/delete
child) and remove_one (both of them are using the sysfs lock and
rtnl_lock) the driver will use a state mutex for sync.

That will fix traces such as the following:
schedule+0x3e/0x90
kernfs_drain+0x75/0xf0
? wait_woken+0x90/0x90
__kernfs_remove+0x12e/0x1c0
kernfs_remove+0x25/0x40
sysfs_remove_dir+0x57/0x90
kobject_del+0x22/0x60
device_del+0x195/0x230
pm_runtime_set_memalloc_noio+0xac/0xf0
netdev_unregister_kobject+0x71/0x80
rollback_registered_many+0x205/0x2f0
rollback_registered+0x31/0x40
unregister_netdevice_queue+0x58/0xb0
unregister_netdev+0x20/0x30
ipoib_remove_one+0xb7/0x240 [ib_ipoib]
ib_unregister_device+0xbc/0x1b0 [ib_core]
ib_unregister_mad_agent+0x29/0x30 [ib_core]
mlx4_ib_remove+0x67/0x280 [mlx4_ib]
INFO: task echo:24082 blocked for more than 120 seconds.
Tainted: G           OE   4.1.12-37.5.1.el6uek.x86_64 #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
Call Trace:
schedule+0x3e/0x90
schedule_preempt_disabled+0xe/0x10
__mutex_lock_slowpath+0x95/0x110
? _rcu_barrier+0x177/0x220
mutex_lock+0x23/0x40
rtnl_lock+0x15/0x20
netdev_run_todo+0x81/0x1f0
rtnl_unlock+0xe/0x10
ipoib_vlan_delete+0x12f/0x1c0 [ib_ipoib]
delete_child+0x69/0x80 [ib_ipoib]
dev_attr_store+0x20/0x30
sysfs_kf_write+0x41/0x50

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Guy Levi [Thu, 17 Aug 2017 12:50:49 +0000 (15:50 +0300)]
IB/mlx4: Check that reserved fields in mlx4_ib_create_qp_rss are zero

According to mlx4 convention, the command must fail if user data that
is expected to be zero holds a non-zero value.

Fixes: 3078f5f1bd8b ("IB/mlx4: Add support for RSS QP")
Signed-off-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Guy Levi [Thu, 17 Aug 2017 12:50:48 +0000 (15:50 +0300)]
IB/mlx4: Remove redundant attribute in mlx4_ib_create_qp_rss struct

rx_key_len is not in use and needs to be removed.

Fixes: 3078f5f1bd8b ("IB/mlx4: Add support for RSS QP")
Signed-off-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Guy Levi [Thu, 17 Aug 2017 12:50:47 +0000 (15:50 +0300)]
IB/mlx4: Fix struct mlx4_ib_create_wq alignment

The mlx4 ABI requires structures to have an alignment of 64B.

Fixes: 400b1ebcfe31 ("IB/mlx4: Add support for WQ related verbs")
Signed-off-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Guy Levi [Thu, 17 Aug 2017 12:50:46 +0000 (15:50 +0300)]
IB/mlx4: Fix RSS QP type in creation verb

The mlx4 was designed to support QP type of MLX4_IB_QPT_RAW_PACKET.

Fixes: 3078f5f1bd8b ("IB/mlx4: Add support for RSS QP")
Signed-off-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Maor Gottlieb [Thu, 17 Aug 2017 12:50:45 +0000 (15:50 +0300)]
IB/mlx5: Add necessary delay drop assignment

Assign the statistics and configuration structure pointer on success.

Fixes: fe248c3a5837 ('IB/mlx5: Add delay drop configuration and statistics')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Talat Batheesh [Thu, 17 Aug 2017 12:50:44 +0000 (15:50 +0300)]
IB/mlx5: Fix some spelling mistakes

Fix spelling mistakes in remarks
    "retrun"->"return"
    "Decalring"->"Declaring"

Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Talat Batheesh [Thu, 17 Aug 2017 12:50:43 +0000 (15:50 +0300)]
IB/mlx4: Fix some spelling mistakes

Fix spelling mistakes in remarks
    "retrun"->"return"
    "cancell"->"cancel"

Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:42 +0000 (15:50 +0300)]
RDMA/mthca: Make explicit conversion to 64bit value

The "lg" variable is declared as int so in all places where
this variable is used as a shift operand, the output will be
int too.

This produces the following smatch warning:
drivers/infiniband/hw/mthca/mthca_cmd.c:701 mthca_map_cmd() warn:
should '1 << lg' be a 64 bit type?

Simple declaration of "1" to be "1ULL" will fix the issue.
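
In C, a shift expression takes the promoted type of its left operand, so `1 << lg` is evaluated as an int even when the result is stored in a 64-bit variable. A minimal sketch of the corrected form is shown below; the helper name region_size is hypothetical and is not taken from mthca_cmd.c.

```c
#include <stdint.h>

/*
 * Size in bytes of a 2^lg region, computed in 64 bits.
 * Writing "1 << lg" here would shift a plain int, which overflows
 * once lg reaches the width of int; "1ULL" makes the left operand
 * 64 bits wide before the shift happens.
 */
static uint64_t region_size(int lg)
{
    return 1ULL << lg;
}
```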

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:41 +0000 (15:50 +0300)]
RDMA/usnic: Fix remove address space warning

Sparse tool complains with the following error:
drivers/infiniband/hw/usnic/usnic_ib_main.c:445:16: warning: cast removes
address space of expression

Fix it by casting the correct field and converting the helper function
that sets ifaddr to return void, because no caller is interested in the
returned value.

Fixes: c7845bcafe4d ("IB/usnic: Add UDP support in u*verbs.c, u*main.c and u*util.h")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:40 +0000 (15:50 +0300)]
RDMA/mlx4: Remove gfp_mask argument from acquire_group call

All callers of acquire_group() passed the same gfp_mask to it
and it is safe to remove it.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:39 +0000 (15:50 +0300)]
RDMA/core: Refactor get link layer wrapper

The return values from rdma_node_get_transport() are strict
and IB_LINK_LAYER_UNSPECIFIED is unreachable in this flow.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:38 +0000 (15:50 +0300)]
RDMA/core: Delete BUG() from unreachable flow

Remove call to BUG() in case wrong node_type was provided.
This flow is unreachable, because node_types are supplied
from specific enum.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:37 +0000 (15:50 +0300)]
RDMA/core: Cleanup device capability enum

Cleanup patch prior to exporting the ib_device_cap_flags
to user space. In this patch, we align the indentation and
remove the IB_DEVICE_INIT_TYPE and IB_DEVICE_RESERVED
fields, because they are not used in the kernel.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:36 +0000 (15:50 +0300)]
RDMA/(core, ulp): Convert register/unregister event handler to be void

The functions ib_register_event_handler() and
ib_unregister_event_handler() always returned success and they can't fail.

Let's convert those functions to be void, remove redundant checks and
cleanup tons of goto statements.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Maor Gottlieb [Thu, 17 Aug 2017 12:50:35 +0000 (15:50 +0300)]
RDMA/mlx4: Fix create qp command alignment

Avoid extra padding by swapping the order of inl_recv_sz and reserved;
otherwise the 'mlx4_ib_create_qp' structure might be larger than the legacy
user input, leading to a copy of some garbage data from the user space buffer.
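
The effect of field order on structure size can be seen in a small standalone sketch. Both structs below are hypothetical simplifications: only the inl_recv_sz and reserved field names come from the text above, the remaining fields and the reserved array length are assumptions, and the sizes noted in the comments hold on common ABIs where uint32_t is 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * A 4-byte field placed right after three 1-byte fields needs one
 * byte of implicit padding, and the trailing reserved array then
 * pushes the structure past the next alignment boundary.
 */
struct create_qp_padded {
    uint64_t buf_addr;
    uint64_t db_addr;
    uint8_t  log_sq_bb_count;
    uint8_t  log_sq_stride;
    uint8_t  sq_no_prefetch;
    uint32_t inl_recv_sz;   /* offset 20, after 1 byte of padding */
    uint8_t  reserved[5];   /* total size rounds up to 32 bytes */
};

/*
 * Putting the 1-byte reserved field first fills the gap, so the
 * 4-byte field is naturally aligned with no implicit padding.
 */
struct create_qp_reordered {
    uint64_t buf_addr;
    uint64_t db_addr;
    uint8_t  log_sq_bb_count;
    uint8_t  log_sq_stride;
    uint8_t  sq_no_prefetch;
    uint8_t  reserved;
    uint32_t inl_recv_sz;   /* offset 20, total size 24 bytes */
};
```

If the kernel copies sizeof(struct) bytes from a user buffer that was sized for the smaller legacy layout, the extra bytes of the padded variant are whatever happens to follow that buffer.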

Fixes: ea30b966f7dd ('IB/mlx4: Add inline-receive support')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:34 +0000 (15:50 +0300)]
RDMA/mlx4: Don't use uninitialized variable

Avoid usage of uninitialized variable.

Fixes: 3078f5f1bd8b ("IB/mlx4: Add support for RSS QP")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Parav Pandit [Thu, 17 Aug 2017 12:50:33 +0000 (15:50 +0300)]
IB/uverbs: Introduce and use helper functions to copy ah attributes

This patch introduces two helper functions to copy ah attributes
from uverbs to internal ib_ah_attr structure and the other way
during modify qp and query qp respectively.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 17 Aug 2017 12:50:32 +0000 (15:50 +0300)]
IB/cma: Fix erroneous validation of supported default GID type

When rdma_cm is initializing a cma_device it checks if this device
supports the preferred default GID type. This check was done incorrectly,
and therefore rdma_cm sometimes comes up with a default GID type that is
not supported by the device.

Fix that by checking for supported GID type properly.

Fixes: 3c7f67d1880d ("IB/cma: Fix default RoCE type setting")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Doug Ledford [Thu, 24 Aug 2017 19:58:26 +0000 (15:58 -0400)]
Merge branch 'k.o/for-4.13-rc' into k.o/for-next

Pick up -rc fixes.

Signed-off-by: Doug Ledford <dledford@redhat.com>
Majd Dibbiny [Wed, 23 Aug 2017 05:35:42 +0000 (08:35 +0300)]
IB/mlx5: Always return success for RoCE modify port

CM layer calls ib_modify_port() regardless of the link layer.

For the Ethernet ports, qkey violation and Port capabilities
are meaningless. Therefore, always return success for ib_modify_port
calls on the Ethernet ports.

Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Majd Dibbiny [Wed, 23 Aug 2017 05:35:41 +0000 (08:35 +0300)]
IB/mlx5: Fix Raw Packet QP event handler assignment

In case we have SQ and RQ for Raw Packet QP, the SQ's event handler
wasn't assigned.

Fix this by assigning the event handler for each WQ after creation.

[ 1877.145243] Call Trace:
[ 1877.148644] <IRQ>
[ 1877.150580] [<ffffffffa07987c5>] ? mlx5_rsc_event+0x105/0x210 [mlx5_core]
[ 1877.159581] [<ffffffffa0795bd7>] ? mlx5_cq_event+0x57/0xd0 [mlx5_core]
[ 1877.167137] [<ffffffffa079208e>] mlx5_eq_int+0x53e/0x6c0 [mlx5_core]
[ 1877.174526] [<ffffffff8101a679>] ? sched_clock+0x9/0x10
[ 1877.180753] [<ffffffff810f717e>] handle_irq_event_percpu+0x3e/0x1e0
[ 1877.188014] [<ffffffff810f735d>] handle_irq_event+0x3d/0x60
[ 1877.194567] [<ffffffff810f9fe7>] handle_edge_irq+0x77/0x130
[ 1877.201129] [<ffffffff81014c3f>] handle_irq+0xbf/0x150
[ 1877.207244] [<ffffffff815ed78a>] ? atomic_notifier_call_chain+0x1a/0x20
[ 1877.214829] [<ffffffff815f434f>] do_IRQ+0x4f/0xf0
[ 1877.220498] [<ffffffff815e94ad>] common_interrupt+0x6d/0x6d
[ 1877.227025] <EOI>
[ 1877.228967] [<ffffffff814834e2>] ? cpuidle_enter_state+0x52/0xc0
[ 1877.236990] [<ffffffff81483615>] cpuidle_idle_call+0xc5/0x200
[ 1877.243676] [<ffffffff8101bc7e>] arch_cpu_idle+0xe/0x30
[ 1877.249831] [<ffffffff810b4725>] cpu_startup_entry+0xf5/0x290
[ 1877.256513] [<ffffffff815cfee1>] start_secondary+0x265/0x27b
[ 1877.263111] Code: Bad RIP value.
[ 1877.267296] RIP [< (null)>] (null)
[ 1877.273264] RSP <ffff88046fd63df8>
[ 1877.277531] CR2: 0000000000000000

Fixes: 19098df2da78 ("IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Noa Osherovich [Wed, 23 Aug 2017 05:35:40 +0000 (08:35 +0300)]
IB/core: Avoid accessing non-allocated memory when inferring port type

Commit 44c58487d51a ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
introduced the concept of type in ah_attr:
 * During ib_register_device, each port is checked for its type which
   is stored in ib_device's port_immutable array.
 * During uverbs' modify_qp, the type is inferred using the port number
   in ib_uverbs_qp_dest struct (address vector) by accessing the
   relevant port_immutable array and the type is passed on to
   providers.

IB spec (version 1.3) enforces a valid port value only in Reset to
Init. During Init to RTR, the address vector must be valid but the port
number is not mentioned as a field in the address vector, so its
value is not validated, which leads to accesses to non-allocated
memory when inferring the port type.

Save the real port number in ib_qp during modify to Init (when the
comp_mask indicates that the port number is valid) and use this value
to infer the port type.

Avoid copying the address vector fields if the matching bit is not set
in the attr_mask. Address vector can't be modified before the port, so
no valid flow is affected.

Fixes: 44c58487d51a ('IB/core: Define 'ib' and 'roce' rdma_ah_attr types')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jason Gunthorpe [Mon, 14 Aug 2017 20:57:39 +0000 (14:57 -0600)]
rdma: Autoload netlink client modules

If a message comes in and we do not have the client in the table, then
try to load the module supplying that client using MODULE_ALIAS to find
it.

This duplicates the scheme seen in other netlink muxes (eg nfnetlink).

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Jason Gunthorpe [Mon, 14 Aug 2017 20:57:38 +0000 (14:57 -0600)]
rdma: Allow demand loading of NETLINK_RDMA

Provide a module alias so that if userspace opens a netlink
socket for RDMA the kernel support is loaded automatically.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Li Dongyang [Wed, 16 Aug 2017 13:31:23 +0000 (23:31 +1000)]
IB/mlx4: use kvmalloc_array to allocate wrid

We can use kvmalloc_array instead of the
kmalloc and __vmalloc combination.
After this we no longer need to include linux/vmalloc.h.

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Li Dongyang [Wed, 16 Aug 2017 13:31:22 +0000 (23:31 +1000)]
IB/mlx5: use kvmalloc_array for mlx5_ib_wq

We observed multiple times on our Lustre OSS servers that when
the system memory is fragmented, kmalloc() in create_kernel_qp()
could fail order 4/5 allocations while we still have many free pages.

Switch to kvmalloc_array() to allow the operation to continue.

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Selvin Xavier [Thu, 17 Aug 2017 14:58:07 +0000 (07:58 -0700)]
RDMA: Fix return value check for ib_get_eth_speed()

ib_get_eth_speed() returns 0 on success. Fix the condition check
to prevent reporting failure for the query_port verb.

Fixes: d41861942fc5 ("Add generic function to extract IB speed from netdev")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Yuval Shaia [Thu, 10 Aug 2017 21:12:11 +0000 (00:12 +0300)]
IB/pvrdma: Remove unused function

The function pvrdma_idx_ring_is_valid_idx is not used, so let's remove
it.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Shiraz Saleem [Wed, 9 Aug 2017 01:38:45 +0000 (20:38 -0500)]
i40iw: Improve CQP timeout logic

The current timeout logic for Control Queue-Pair (CQP) OPs
does not take into account whether the CQP makes progress but
rather blindly waits for a large timeout value, 100000 jiffies,
for the completion event. Improve this by setting the timeout
based on whether the CQP is making progress or not. If the CQP
is hung, the timeout will happen sooner, in 5000 jiffies. Each
time CQP progress is detected, the timeout extends by 5000
jiffies.
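
One way to picture the progress-based timeout is the sketch below. It is not the i40iw implementation: the struct, the function name, and the idea of calling it once per tick are assumptions, and only the 5000 value comes from the text above. Extending the timeout on progress is modelled here as restarting the idle window.

```c
#include <stdbool.h>
#include <stdint.h>

#define CQP_TIMEOUT_TICKS 5000   /* value taken from the commit text */

/* Hypothetical bookkeeping for one waiter. */
struct cqp_timeout {
    uint64_t last_completed;   /* completion count seen at the last check */
    uint32_t ticks_idle;       /* ticks elapsed since progress was last seen */
};

/*
 * Called once per tick while waiting for a completion.
 * Returns true when the wait should be abandoned, that is, when the
 * completion count has not moved for CQP_TIMEOUT_TICKS consecutive ticks.
 */
static bool cqp_tick(struct cqp_timeout *t, uint64_t completed_now)
{
    if (completed_now != t->last_completed) {
        /* Progress detected: restart the idle window. */
        t->last_completed = completed_now;
        t->ticks_idle = 0;
        return false;
    }
    return ++t->ticks_idle >= CQP_TIMEOUT_TICKS;
}
```

A hung queue is reported after one idle window, while a slow but advancing queue keeps its waiters alive.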

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Christopher N Bednarz <christopher.n.bednarz@intel.com>
Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Sun, 13 Aug 2017 15:09:04 +0000 (08:09 -0700)]
IB/hfi1: Add kernel receive context info to debugfs

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Grzegorz Morys [Sun, 13 Aug 2017 15:08:58 +0000 (08:08 -0700)]
IB/hfi1: Remove HFI1_VERBS_31BIT_PSN option

Remove HFI1_VERBS_31BIT_PSN Kconfig option leaving only 31-bit PSNs
available. The option was implemented in the early days of the driver
and is no longer needed.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Grzegorz Morys <grzegorz.morys@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Remove pstate from hfi1_pportdata
Jakub Byczkowski [Sun, 13 Aug 2017 15:08:52 +0000 (08:08 -0700)]
IB/hfi1: Remove pstate from hfi1_pportdata

Do not track physical state separately from host_link_state.
Deduce physical state from host_link_state when required.
Change cache_physical_state to log_physical_state to make
sure host_link_state reflects hardware's physical state properly.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Stricter bounds checking of MAD trap index
Kamenee Arumugame [Sun, 13 Aug 2017 15:08:46 +0000 (08:08 -0700)]
IB/hfi1: Stricter bounds checking of MAD trap index

The macro size is valid. This change makes it less ambiguous.
Bounds check trap type for better security.
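
A rough illustration of bounds checking an index before it selects a
per-type list. MAX_TRAP_LISTS, the structs and trap_list_for() are
hypothetical names chosen for this sketch and are not taken from the
hfi1 sources.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRAP_LISTS 5 /* hypothetical number of per-type trap lists */

struct trap_list {
	unsigned int list_len;
};

struct port_traps {
	struct trap_list lists[MAX_TRAP_LISTS];
};

/* Return the list for trap_type, or NULL if the index is out of range. */
static struct trap_list *trap_list_for(struct port_traps *p,
				       unsigned int trap_type)
{
	assert(p);
	/* Reject the index before it is ever used to address the array. */
	if (trap_type >= MAX_TRAP_LISTS)
		return NULL;
	return &p->lists[trap_type];
}
```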

Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Load fallback platform configuration per HFI device
Jakub Byczkowski [Sun, 13 Aug 2017 15:08:40 +0000 (08:08 -0700)]
IB/hfi1: Load fallback platform configuration per HFI device

Currently fallback configuration is loaded once per driver instance.
With multiple HFI devices in the same system the current code may not
load the platform config data for the device. Change fallback platform
config data loading to be per device.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Add flag for platform config scratch register read
Jakub Byczkowski [Sun, 13 Aug 2017 15:08:34 +0000 (08:08 -0700)]
IB/hfi1: Add flag for platform config scratch register read

Add flag in pport data structure to determine when platform config was
read from scratch registers. Change conditions in parse_platform_config
and get_platform_config_field to use the new flag.

Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Document phys port state bits not used in IB
Dennis Dalessandro [Sun, 13 Aug 2017 15:08:28 +0000 (08:08 -0700)]
IB/hfi1: Document phys port state bits not used in IB

A couple of bits are used by OPA for link physical state that are not present
as part of InfiniBand. Add a short blurb on what those states mean and remove
an unused state.

Cc: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Todd Rimmer <todd.rimmer@intel.com>
Reviewed-by: Brent Rothermel <brent.r.rothermel@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Check xchg returned value for queuing link down entry
Sebastian Sanchez [Sun, 13 Aug 2017 15:08:22 +0000 (08:08 -0700)]
IB/hfi1: Check xchg returned value for queuing link down entry

Check xchg returned value for queuing link down entry
to guarantee proper atomic value reads.
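
The idea of acting on the value returned by an exchange can be sketched
in userspace C11, where atomic_exchange() plays the role of the kernel's
xchg(). The struct port layout and the function names are assumptions
made for this example only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-port state; not the driver's real structure. */
struct port {
	atomic_int link_down_queued; /* 0 = idle, 1 = entry already queued */
	int queued;                  /* number of entries actually queued */
};

static void port_init(struct port *p)
{
	assert(p);
	atomic_init(&p->link_down_queued, 0);
	p->queued = 0;
}

/* Queue a link down entry at most once; returns true if this call queued it. */
static bool queue_link_down(struct port *p)
{
	assert(p);
	/* atomic_exchange() returns the value held before the store. */
	if (atomic_exchange(&p->link_down_queued, 1) != 0)
		return false; /* another caller already queued the entry */
	p->queued++;
	return true;
}

/* Called once the entry has been handled, allowing a later request. */
static void link_down_handled(struct port *p)
{
	assert(p);
	atomic_store(&p->link_down_queued, 0);
}
```

Because the decision is made on the value the exchange itself returned,
there is no window between a separate read and write in which two
callers could both see the flag as clear.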

Fixes: 626c077c025f ("IB/hfi1: Prevent link down request double queuing")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: fix spelling mistake: "Maximim" -> "Maximum"
Colin Ian King [Sat, 5 Aug 2017 13:11:50 +0000 (14:11 +0100)]
IB/hfi1: fix spelling mistake: "Maximim" -> "Maximum"

Trivial fix to spelling mistake in pr_warn_ratelimited warning message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs
Dasaratharaman Chandramouli [Fri, 4 Aug 2017 20:54:53 +0000 (13:54 -0700)]
IB/hfi1: Enable RDMA_CAP_OPA_AH in hfi driver to support extended LIDs

Enabling this bit helps core components query for extended address
support using the rdma_cap_opa_ah interface.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Enhance PIO/SDMA send for 16B
Don Hiatt [Fri, 4 Aug 2017 20:54:47 +0000 (13:54 -0700)]
IB/hfi1: Enhance PIO/SDMA send for 16B

PIO/SDMA send logic now uses the hdr_type field to determine
the type of packet that has been constructed. Based on the hdr_type,
certain things such as PBC flags, padding count and the LT extra
trailing bytes are determined.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Add 16B RC/UC support
Don Hiatt [Fri, 4 Aug 2017 20:54:41 +0000 (13:54 -0700)]
IB/hfi1: Add 16B RC/UC support

Add 16B bypass packet support for RC/UC traffic types.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids
Dasaratharaman Chandramouli [Fri, 4 Aug 2017 20:54:35 +0000 (13:54 -0700)]
IB/rdmavt, hfi1, qib: Enhance rdmavt and hfi1 to use 32 bit lids

Increase lid used in hfi1 driver to 32 bits. qib continues
to use 16 bit lids.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Add 16B trace support
Don Hiatt [Fri, 4 Aug 2017 20:54:29 +0000 (13:54 -0700)]
IB/hfi1: Add 16B trace support

Add trace support to 16B bypass packets during send and
receive.

Sample input header trace:
<idle>-0     [000] d.h. 271742.509477: input_ibhdr: [0000:05:00.0] (16B)
len:24 sc:0 dlid:0xf0000b slid:0x10002 age:0 becn:0 fecn:0 l4:10 rc:0
sc:0 pkey:0x8001 entropy:0x0000 op:0x65,UD_SEND_ONLY_WITH_IMMEDIATE se:0
m:1 pad:3 tver:0 qpn:0xffffff a:0 psn:0x00000001 hlen:248 deth qkey
0x01234567 sqpn 0x000004

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Add 16B UD support
Don Hiatt [Fri, 4 Aug 2017 20:54:23 +0000 (13:54 -0700)]
IB/hfi1: Add 16B UD support

Add 16B bypass packet support for UD traffic types.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Determine 9B/16B L2 header type based on Address handle
Don Hiatt [Fri, 4 Aug 2017 20:54:16 +0000 (13:54 -0700)]
IB/hfi1: Determine 9B/16B L2 header type based on Address handle

When address handle attributes are initialized, the LIDs are
transformed to be in the 32 bit LID space.
When constructing the header, hfi1 driver will look at the LID
to determine the packet header to be created.
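
A sketch of choosing the header type from the LIDs, assuming the
simplest possible rule: any LID that does not fit in 16 bits needs the
16B header. The names are hypothetical and the driver's real rule may
treat special LID ranges (for example multicast) differently.

```c
#include <assert.h>
#include <stdint.h>

enum l2_hdr_type { L2_HDR_9B, L2_HDR_16B };

/* Hypothetical stand-in for the LIDs kept in address handle attributes. */
struct ah_lids {
	uint32_t slid;
	uint32_t dlid;
};

/* Assumed rule: 9B is usable only when both LIDs fit in 16 bits. */
static enum l2_hdr_type pick_hdr_type(const struct ah_lids *ah)
{
	assert(ah);
	if (ah->slid > UINT16_MAX || ah->dlid > UINT16_MAX)
		return L2_HDR_16B;
	return L2_HDR_9B;
}
```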

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Add support to process 16B header errors
Don Hiatt [Fri, 4 Aug 2017 20:54:10 +0000 (13:54 -0700)]
IB/hfi1: Add support to process 16B header errors

Enhance hdr_rcverr() to also handle errors during
16B bypass packet receive.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Add support to send 16B bypass packets
Don Hiatt [Fri, 4 Aug 2017 20:54:04 +0000 (13:54 -0700)]
IB/hfi1: Add support to send 16B bypass packets

We introduce struct hfi1_opa_header as a union
of ib (9B) and 16B headers.
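
Roughly, a union of two header layouts plus a type tag could look like
the sketch below. The member names and the placeholder word arrays are
assumptions for illustration; the real hfi1 definitions carry the full
header contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HDR_TYPE_9B, HDR_TYPE_16B }; /* hypothetical tag values */

struct hdr_9b {
	uint16_t lrh[4]; /* placeholder for the 9B L2 header words */
};

struct hdr_16b {
	uint32_t lrh[4]; /* placeholder for the 16B L2 header words */
};

/* One buffer that can hold either header, with a tag saying which. */
struct opa_header {
	union {
		struct hdr_9b ibh;
		struct hdr_16b opah;
	};
	uint8_t hdr_type;
};

/* Length of the L2 part, selected by the tag rather than by guessing. */
static size_t l2_hdr_len(const struct opa_header *h)
{
	assert(h);
	return h->hdr_type == HDR_TYPE_16B ? sizeof(h->opah.lrh)
					   : sizeof(h->ibh.lrh);
}
```

Keeping the tag next to the union lets send and receive paths share one
buffer type and branch on hdr_type where the two formats differ.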

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Add support to receive 16B bypass packets
Don Hiatt [Fri, 4 Aug 2017 20:53:58 +0000 (13:53 -0700)]
IB/hfi1: Add support to receive 16B bypass packets

We introduce a struct hfi1_16b_header to support 16B headers.
16B bypass packets are received by the driver and processed
similar to 9B packets. Add basic support to handle 16B packets.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs
Don Hiatt [Fri, 4 Aug 2017 20:53:51 +0000 (13:53 -0700)]
IB/rdmavt, hfi1, qib: Modify check_ah() to account for extended LIDs

rvt_check_ah() delegates LID verification to the underlying
driver. The underlying driver uses different conditions to
check the dlid depending on whether the device supports
extended LIDs.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hf1: User context locking is inconsistent
Michael J. Ruhl [Fri, 4 Aug 2017 20:52:44 +0000 (13:52 -0700)]
IB/hf1: User context locking is inconsistent

A mixture of mutexes and spinlocks protects receive context
(rcd/uctxt) information.  They are not used consistently.

Use the mutex to protect device receive context information only.
Use the spinlock to protect sub context information only.

Protect access to items in the rcd array with a spinlock and
reference count.

Remove spinlock around dd->rcd array cleanup.  Since interrupts are
disabled and cleaned up before this point, this lock is not useful.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Protect context array set/clear with spinlock
Michael J. Ruhl [Fri, 4 Aug 2017 20:52:38 +0000 (13:52 -0700)]
IB/hfi1: Protect context array set/clear with spinlock

The rcd array can be accessed from user context or during interrupts.
Protecting this with a mutex isn't a good idea because the mutex should
not be used from an IRQ.

Protect the allocation and freeing of rcd array elements with a
spinlock.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Use host_link_state to read state when DC is shut down
Bartlomiej Dudek [Fri, 4 Aug 2017 20:52:32 +0000 (13:52 -0700)]
IB/hfi1: Use host_link_state to read state when DC is shut down

When DC is shut down (e.g. by disconnecting the cable), the
driver should use host_link_state to get port's current
physical state. This is due to the fact that physical state
is read from DC's CSRs and when DC is shut down and state is
changed, its registers are not impacted.

Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Bartlomiej Dudek <bartlomiej.dudek@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Remove lstate from hfi1_pportdata
Byczkowski, Jakub [Fri, 4 Aug 2017 20:52:26 +0000 (13:52 -0700)]
IB/hfi1: Remove lstate from hfi1_pportdata

Do not track logical state separately from host_link_state. Deduce
logical state from host_link_state when required. Transitions in
set_link_state and goto_offline already make sure host_link_state
reflects hardware's logical state properly.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jakub Byczkowski <jakub.byczkowski@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Remove pmtu from the QP structure
Sebastian Sanchez [Fri, 4 Aug 2017 20:52:20 +0000 (13:52 -0700)]
IB/hfi1: Remove pmtu from the QP structure

The pmtu field doesn't have to be stored in the QP structure
as it can easily be calculated when needed.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: Revert egress pkey check enforcement
Alex Estrin [Fri, 4 Aug 2017 20:52:13 +0000 (13:52 -0700)]
IB/hfi1: Revert egress pkey check enforcement

Current code has some serious flaws. Disarm the flag
pending an appropriate patch.

Fixes: 53526500f301 ("IB/hfi1: Permanently enable P_Key checking in HFI")
Cc: stable@vger.kernel.org
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/core: Fix input len in multiple user verbs
Amrani, Ram [Tue, 27 Jun 2017 14:04:42 +0000 (17:04 +0300)]
IB/core: Fix input len in multiple user verbs

Most user verbs pass user data to the kernel with the inclusion of the
ib_uverbs_cmd_hdr structure. This is problematic because the vendor has
no idea whether the verb was called by a legacy verb or an extended verb.
Also, the inconsistency between the verbs is confusing.

Fixes: 565197dd8fb1 ("IB/core: Extend ib_uverbs_create_cq")
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agomlx5: Replace PCI pool old API
Romain Perier [Tue, 22 Aug 2017 11:46:59 +0000 (13:46 +0200)]
mlx5: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agomlx4: Replace PCI pool old API
Romain Perier [Tue, 22 Aug 2017 11:46:58 +0000 (13:46 +0200)]
mlx4: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/mthca: Replace PCI pool old API
Romain Perier [Tue, 22 Aug 2017 11:46:56 +0000 (13:46 +0200)]
IB/mthca: Replace PCI pool old API

The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Acked-by: Doug Ledford <dledford@redhat.com>
Tested-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoRDMA/uverbs: Initialize cq_context appropriately
Bharat Potnuri [Tue, 1 Aug 2017 05:28:35 +0000 (10:58 +0530)]
RDMA/uverbs: Initialize cq_context appropriately

Initializing cq_context with ev_queue in create_cq() leads to a NULL pointer
dereference in ib_uverbs_comp_handler() if the application does not use a
completion channel. This patch fixes the cq_context initialization.
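
One plausible shape of the problem and of a fix, sketched with
simplified, hypothetical structs (these are not the uverbs definitions):
the context is only derived from the event file when one exists, and the
handler treats a NULL context as "no completion channel".

```c
#include <assert.h>
#include <stddef.h>

struct ev_queue {
	int events;
};

struct ev_file {
	int fd;
	struct ev_queue ev_queue; /* not the first member in this sketch */
};

struct cq {
	void *cq_context;
};

static void create_cq(struct cq *cq, struct ev_file *ev_file)
{
	assert(cq);
	/*
	 * Taking &ev_file->ev_queue unconditionally would turn a NULL
	 * ev_file into a bogus context pointer, so test ev_file first.
	 */
	cq->cq_context = ev_file ? &ev_file->ev_queue : NULL;
}

/* Returns 1 if an event was recorded, 0 if there is no channel. */
static int comp_handler(struct cq *cq)
{
	struct ev_queue *q;

	assert(cq);
	q = cq->cq_context;
	if (!q)
		return 0;
	q->events++;
	return 1;
}
```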

Fixes: 1e7710f3f65 ("IB/core: Change completion channel to use the reworked")
Cc: stable@vger.kernel.org # 4.12
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
(cherry picked from commit 699a2d5b1b880b4e4e1c7d55fa25659322cf5b51)

6 years agoRDMA/bnxt_re: Implement the alloc/get_hw_stats callback
Somnath Kotur [Wed, 2 Aug 2017 08:46:19 +0000 (01:46 -0700)]
RDMA/bnxt_re: Implement the alloc/get_hw_stats callback

Expose HW counters using the get_hw_stats callback

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoRDMA/bnxt_re: Allocate multiple notification queues
Selvin Xavier [Wed, 2 Aug 2017 08:46:18 +0000 (01:46 -0700)]
RDMA/bnxt_re: Allocate multiple notification queues

Enables multiple interrupt vectors. The driver requests the max
MSIX vectors based on the number of online CPUs and creates up to
9 MSIX vectors (1 for control path and 8 for data path).
A tasklet is created for each of these vectors. NQs are assigned
to CQs in round robin fashion.
This patch also adds IRQ affinity hint for the MSIX vector of each NQ.
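
The vector count and the round robin assignment can be sketched as
follows. The function and struct names are hypothetical, and capping the
data path at min(online CPUs, 8) is an assumed reading of the text
above, not a statement about the driver's code.

```c
#include <assert.h>

#define MAX_DATA_VECTORS 8u /* data-path vectors, per the commit text */

/* One control-path vector plus up to eight data-path vectors. */
static unsigned int msix_vectors_wanted(unsigned int online_cpus)
{
	unsigned int data = online_cpus < MAX_DATA_VECTORS ? online_cpus
							    : MAX_DATA_VECTORS;
	return data + 1;
}

/* Hypothetical pool of notification queues (NQs). */
struct nq_pool {
	unsigned int num_nqs; /* number of data-path NQs */
	unsigned int next;    /* NQ the next CQ will be attached to */
};

/* Hand out NQ indices 0, 1, ..., num_nqs - 1, 0, 1, ... in turn. */
static unsigned int nq_assign(struct nq_pool *pool)
{
	unsigned int nq;

	assert(pool && pool->num_nqs > 0);
	nq = pool->next;
	pool->next = (pool->next + 1) % pool->num_nqs;
	return nq;
}
```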

Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoAdd OPA extended LID support
Hiatt, Don [Mon, 14 Aug 2017 18:17:43 +0000 (14:17 -0400)]
Add OPA extended LID support

This patch series primarily increases sizes of variables that hold
lid values from 16 to 32 bits. Additionally, it adds a check in
the IB mad stack to verify a properly formatted MAD when OPA
extended LIDs are used.

Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoMerge branch 'k.o/for-4.13-rc' into k.o/for-next
Doug Ledford [Fri, 18 Aug 2017 18:12:04 +0000 (14:12 -0400)]
Merge branch 'k.o/for-4.13-rc' into k.o/for-next

Merging our (hopefully) final -rc pull branch into our for-next branch
because some of our pending patches won't apply cleanly without having
the -rc patches in our tree.

Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoMerge branch 'misc' into k.o/for-next
Doug Ledford [Fri, 18 Aug 2017 18:10:23 +0000 (14:10 -0400)]
Merge branch 'misc' into k.o/for-next

Conflicts:
drivers/infiniband/core/iwcm.c - The rdma_netlink patches in
HEAD and the iwarp cm workqueue fix (don't use WQ_MEM_RECLAIM,
we aren't safe for that context) touched the same code.

Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/hfi1: add const to bin_attribute structures
Bhumika Goyal [Wed, 2 Aug 2017 10:01:30 +0000 (15:31 +0530)]
IB/hfi1: add const to bin_attribute structures

Add const to bin_attribute structures as they are only passed to the
functions sysfs_{remove/create}_bin_file. The arguments passed are of
type const, so declare the structures to be const.

Done using Coccinelle.

@m disable optional_qualifier@
identifier s;
position p;
@@
static struct bin_attribute s@p={...};

@okay1@
position p;
identifier m.s;
@@
(
sysfs_create_bin_file(...,&s@p,...)
|
sysfs_remove_bin_file(...,&s@p,...)
)

@bad@
position p!={m.p,okay1.p};
identifier m.s;
@@
s@p

@change depends on !bad disable optional_qualifier@
identifier m.s;
@@
static
+const
struct bin_attribute s={...};

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/qib: add const to bin_attribute structures
Bhumika Goyal [Wed, 2 Aug 2017 10:01:29 +0000 (15:31 +0530)]
IB/qib: add const to bin_attribute structures

Add const to bin_attribute structures as they are only passed to the
functions sysfs_{remove/create}_bin_file. The arguments passed are of
type const, so declare the structures to be const.

Done using Coccinelle.

@m disable optional_qualifier@
identifier s;
position p;
@@
static struct bin_attribute s@p={...};

@okay1@
position p;
identifier m.s;
@@
(
sysfs_create_bin_file(...,&s@p,...)
|
sysfs_remove_bin_file(...,&s@p,...)
)

@bad@
position p!={m.p,okay1.p};
identifier m.s;
@@
s@p

@change depends on !bad disable optional_qualifier@
identifier m.s;
@@
static
+const
struct bin_attribute s={...};

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoRDMA/uverbs: Initialize cq_context appropriately
Bharat Potnuri [Tue, 1 Aug 2017 05:28:35 +0000 (10:58 +0530)]
RDMA/uverbs: Initialize cq_context appropriately

Initializing cq_context with ev_queue in create_cq() leads to a NULL pointer
dereference in ib_uverbs_comp_handler() if the application does not use a
completion channel. This patch fixes the cq_context initialization.

Fixes: 1e7710f3f65 ("IB/core: Change completion channel to use the reworked")
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoinfiniband: avoid overflow warning
Arnd Bergmann [Mon, 31 Jul 2017 06:50:05 +0000 (08:50 +0200)]
infiniband: avoid overflow warning

A sockaddr_in structure on the stack getting passed into rdma_ip2gid
triggers this warning, since we memcpy into a larger sockaddr_in6
structure:

In function 'memcpy',
    inlined from 'rdma_ip2gid' at include/rdma/ib_addr.h:175:3,
    inlined from 'addr_event.isra.4.constprop' at drivers/infiniband/core/roce_gid_mgmt.c:693:2,
    inlined from 'inetaddr_event' at drivers/infiniband/core/roce_gid_mgmt.c:716:9:
include/linux/string.h:305:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter

The warning seems appropriate here, but the code is also clearly
correct, so we really just want to shut up this instance of the
output.

The best way I found so far is to avoid the memcpy() call and instead
replace it with a struct assignment.
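
A userspace sketch of the same idea, using the POSIX socket types rather
than the kernel's rdma_ip2gid(). union gid and ip_to_gid() are
hypothetical names for this example; the point is only that the IPv6
case copies the address with a struct assignment instead of memcpy().

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical 16-byte GID that can also be viewed as an IPv6 address. */
union gid {
	unsigned char raw[16];
	struct in6_addr in6;
};

/* Returns 0 on success, -1 for an unsupported address family. */
static int ip_to_gid(const struct sockaddr *addr, union gid *gid)
{
	assert(addr && gid);
	switch (addr->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *in4 =
			(const struct sockaddr_in *)addr;

		/* Build the IPv4-mapped form ::ffff:a.b.c.d */
		memset(gid->raw, 0, 10);
		gid->raw[10] = 0xff;
		gid->raw[11] = 0xff;
		memcpy(&gid->raw[12], &in4->sin_addr, 4);
		return 0;
	}
	case AF_INET6:
		/* Struct assignment: a typed 16-byte copy, no memcpy() call. */
		gid->in6 = ((const struct sockaddr_in6 *)addr)->sin6_addr;
		return 0;
	default:
		return -1;
	}
}
```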

Fixes: 6974f0c4555e ("include/linux/string.h: add the option of fortified string.h functions")
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoi40iw: fix spelling mistake: "allloc_buf" -> "alloc_buf"
Colin Ian King [Fri, 21 Jul 2017 22:19:33 +0000 (23:19 +0100)]
i40iw: fix spelling mistake: "allloc_buf" -> "alloc_buf"

Trivial fix to spelling mistake in i40iw_debug message and
also split up a couple of lines that are too long and cause
checkpatch warnings.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/rxe: Remove unneeded check
Yuval Shaia [Fri, 21 Jul 2017 19:20:50 +0000 (22:20 +0300)]
IB/rxe: Remove unneeded check

Port validation is performed in ib_core, no need to duplicate it here.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoIB/rxe: Convert pr_info to pr_warn
Yuval Shaia [Fri, 21 Jul 2017 19:14:09 +0000 (22:14 +0300)]
IB/rxe: Convert pr_info to pr_warn

This message is a warning, so let's print it accordingly.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoi40iw: Fixes for static checker warnings
Shiraz Saleem [Wed, 19 Jul 2017 18:55:26 +0000 (13:55 -0500)]
i40iw: Fixes for static checker warnings

Remove NULL check for cm_node->listener in i40iw_accept
as listener is always present at this point.

Remove the check for cm_node->accept_pend and related code
in i40iw_cm_event_connected as the cm_node in this context
is only pertinent to active node and cm_node->accept_pend
is always 0.

This fixes the following smatch warnings:

drivers/infiniband/hw/i40iw/i40iw_cm.c:3691 i40iw_accept()
error: we previously assumed 'cm_node->listener' could be null

drivers/infiniband/hw/i40iw/i40iw_cm.c:4061 i40iw_cm_event_connected()
error: we previously assumed 'cm_node->listener' could be null

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoi40iw: Simplify code
Christophe Jaillet [Sun, 16 Jul 2017 11:09:23 +0000 (13:09 +0200)]
i40iw: Simplify code

Axe a few lines of code and re-use existing error handling path to avoid
code duplication.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoinfiniband: pvrdma: constify pci_device_id.
Arvind Yadav [Sun, 16 Jul 2017 06:30:46 +0000 (12:00 +0530)]
infiniband: pvrdma: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  10774    1872       8   12654    316e infiniband/hw/vmw_pvrdma/pvrdma_main.o

File size after adding 'const':
   text    data     bss     dec     hex filename
  10838    1808       8   12654    316e infiniband/hw/vmw_pvrdma/pvrdma_main.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoinfiniband: nes: constify pci_device_id.
Arvind Yadav [Sun, 16 Jul 2017 06:30:45 +0000 (12:00 +0530)]
infiniband: nes: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  10429     780      33   11242    2bea drivers/infiniband/hw/nes/nes.o

File size after adding 'const':
   text    data     bss     dec     hex filename
  10541     668      33   11242    2bea drivers/infiniband/hw/nes/nes.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoinfiniband: mthca: constify pci_device_id.
Arvind Yadav [Sun, 16 Jul 2017 06:30:44 +0000 (12:00 +0530)]
infiniband: mthca: constify pci_device_id.

pci_device_id are not supposed to change at runtime. All functions
working with pci_device_id provided by <linux/pci.h> work with
const pci_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
  13067     805       4   13876    3634 infiniband/hw/mthca/mthca_main.o

File size after adding 'const':
   text    data     bss     dec     hex filename
  13419     453       4   13876    3634 infiniband/hw/mthca/mthca_main.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
6 years agoPCI/IB: add support for pci driver attribute groups
Greg Kroah-Hartman [Wed, 19 Jul 2017 13:01:06 +0000 (15:01 +0200)]
PCI/IB: add support for pci driver attribute groups

Some drivers (specifically the nes IB driver), want to create a lot of
sysfs driver attributes.  Instead of open-coding the creation and
removal of these files (and getting it wrong btw), it's a better idea to
let the driver core handle all of this logic for us.

So add a new field to the pci driver structure, **groups, that allows
pci drivers to specify an attribute group list it wishes to have created
when it is registered with the driver core.

Big bonus is now the driver doesn't race with userspace when the sysfs
files are created vs. when the kobject is announced, so any script/tool
that actually wanted to use these files will not have to poll waiting
for them to show up.

Cc: Faisal Latif <faisal.latif@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>