aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-05-24RDMA/hns: Increase checking CMQ status timeout valueWei Hu(Xavier)1-1/+1
This patch increases checking CMQ status timeout value and uses the same value with NIC driver to avoid deficiency of time. Signed-off-by: Wei Hu (Xavier) <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24RDMA/hns: Modify uar allocation algorithm to avoid bitmap exhaustWei Hu(Xavier)2-4/+7
This patch modified uar allocation algorithm in hns_roce_uar_alloc function to avoid bitmap exhaust. Signed-off-by: Wei Hu (Xavier) <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24Merge tag 'mlx5-updates-2018-05-17' of ↵Jason Gunthorpe11-17/+62
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into for-next mlx5-updates-2018-05-17 mlx5 core dirver updates for both net-next and rdma-next branches. From Christophe JAILLET, first three patche to use kvfree where needed. From: Or Gerlitz <[email protected]> Next six patches from Roi and Co adds support for merged sriov e-switch which comes to serve cases where both PFs, VFs set on them and both uplinks are to be used in single v-switch SW model. When merged e-switch is supported, the per-port e-switch is logically merged into one e-switch that spans both physical ports and all the VFs. This model allows to offload TC eswitch rules between VFs belonging to different PFs (and hence have different eswitch affinity), it also sets the some of the foundations needed for uplink LAG support. * tag 'mlx5-updates-2018-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5e: Explicitly set source e-switch in offloaded TC rules net/mlx5: Add source e-switch owner net/mlx5e: Explicitly set destination e-switch in FDB rules net/mlx5: Add destination e-switch owner net/mlx5: Properly handle a vport destination when setting FTE net/mlx5: Add merged e-switch cap IB/mlx5: Use 'kvfree()' for memory allocated by 'kvzalloc()' net/mlx5: Eswitch, Use 'kvfree()' for memory allocated by 'kvzalloc()' net/mlx5: Vport, Use 'kvfree()' for memory allocated by 'kvzalloc()'
2018-05-24IB/core: Introduce and use rdma_gid_table()Parav Pandit1-10/+15
There are several places a gid table is accessed. Have a helper tiny function rdma_gid_table() to avoid code duplication at such places. Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24IB/core: Reduce the places that use zgidParav Pandit4-8/+19
Instead of open coding memcmp() to check whether a given GID is zero or not, use a helper function to do so, and replace instances of memcpy(z,&zgid) with memset. Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24IB/mlx5: Fetch soft WQE's on fatal error stateErez Shitrit1-3/+12
On fatal error the driver simulates CQE's for ULPs that rely on completion of all their posted work-request. For the GSI traffic, the mlx5 has its own mechanism that sends the completions via software CQE's directly to the relevant CQ. This should be kept in fatal error too, so the driver should simulate such CQE's with the specified error state in order to complete GSI QP work requests. Without the fix the next deadlock might appears: schedule_timeout+0x274/0x350 wait_for_common+0xec/0x240 mcast_remove_one+0xd0/0x120 [ib_core] ib_unregister_device+0x12c/0x230 [ib_core] mlx5_ib_remove+0xc4/0x270 [mlx5_ib] mlx5_detach_device+0x184/0x1a0 [mlx5_core] mlx5_unload_one+0x308/0x340 [mlx5_core] mlx5_pci_err_detected+0x74/0xe0 [mlx5_core] Cc: <[email protected]> # 4.7 Fixes: 89ea94a7b6c4 ("IB/mlx5: Reset flow support for IB kernel ULPs") Signed-off-by: Erez Shitrit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24RDMA/ucm: Mark UCM interface as BROKENLeon Romanovsky2-2/+13
In commit 357d23c811a7 ("Remove the obsolete libibcm library") in rdma-core [1], we removed obsolete library which used the /dev/infiniband/ucmX interface. Following multiple syzkaller reports about non-sanitized user input in the UCMA module, the short audit reveals the same issues in UCM module too. It is better to disable this interface in the kernel, before syzkaller team invests time and energy to harden this unused interface. [1] https://github.com/linux-rdma/rdma-core/pull/279 Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24IB/core: Remove duplicate declaration of gid_cache_wqParav Pandit1-2/+0
Remove duplicate declaration of gid_cache_wq. Fixes: d41861942 ("IB/core: Add generic function to extract IB speed from netdev") Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24RDMA/mlx5: Remove debug prints of VMA pointersLeon Romanovsky1-12/+3
Remove various prints of VMA pointers. Reported-by: Michal Kalderon <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Michal Kalderon <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24RDMA/hns: Rename the idx field of dboulijun2-4/+4
The lower 15 bit of paramter of db structure means different meanings when db type is sq, rq and srq. Signed-off-by: Lijun Ou <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-24IB/qib: Fix DMA api warning with debug kernelMike Marciniszyn3-13/+20
The following error occurs in a debug build when running MPI PSM: [ 307.415911] WARNING: CPU: 4 PID: 23867 at lib/dma-debug.c:1158 check_unmap+0x4ee/0xa20 [ 307.455661] ib_qib 0000:05:00.0: DMA-API: device driver failed to check map error[device address=0x00000000df82b000] [size=4096 bytes] [mapped as page] [ 307.517494] Modules linked in: [ 307.531584] ib_isert iscsi_target_mod ib_srpt target_core_mod rpcrdma sunrpc ib_srp scsi_transport_srp scsi_tgt ib_iser libiscsi ib_ipoib scsi_transport_iscsi rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_qib intel_powerclamp coretemp rdmavt intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel ipmi_ssif ib_core aesni_intel sg ipmi_si lrw gf128mul dca glue_helper ipmi_devintf iTCO_wdt gpio_ich hpwdt iTCO_vendor_support ablk_helper hpilo acpi_power_meter cryptd ipmi_msghandler ie31200_edac shpchp pcc_cpufreq lpc_ich pcspkr ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci crct10dif_pclmul crct10dif_common drm crc32c_intel libahci tg3 libata serio_raw ptp i2c_core [ 307.846113] pps_core dm_mirror dm_region_hash dm_log dm_mod [ 307.866505] CPU: 4 PID: 23867 Comm: mpitests-IMB-MP Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1 [ 307.911178] Hardware name: HP ProLiant DL320e Gen8, BIOS J05 11/09/2013 [ 307.944206] Call Trace: [ 307.956973] [<ffffffffbd9e915b>] dump_stack+0x19/0x1b [ 307.982201] [<ffffffffbd2a2f58>] __warn+0xd8/0x100 [ 308.005999] [<ffffffffbd2a2fdf>] warn_slowpath_fmt+0x5f/0x80 [ 308.034260] [<ffffffffbd5f667e>] check_unmap+0x4ee/0xa20 [ 308.060801] [<ffffffffbd41acaa>] ? page_add_file_rmap+0x2a/0x1d0 [ 308.090689] [<ffffffffbd5f6c4d>] debug_dma_unmap_page+0x9d/0xb0 [ 308.120155] [<ffffffffbd4082e0>] ? might_fault+0xa0/0xb0 [ 308.146656] [<ffffffffc07761a5>] qib_tid_free.isra.14+0x215/0x2a0 [ib_qib] [ 308.180739] [<ffffffffc0776bf4>] qib_write+0x894/0x1280 [ib_qib] [ 308.210733] [<ffffffffbd540b00>] ? __inode_security_revalidate+0x70/0x80 [ 308.244837] [<ffffffffbd53c2b7>] ? security_file_permission+0x27/0xb0 [ 308.266025] qib_ib0.8006: multicast join failed for ff12:401b:8006:0000:0000:0000:ffff:ffff, status -22 [ 308.323421] [<ffffffffbd46f5d3>] vfs_write+0xc3/0x1f0 [ 308.347077] [<ffffffffbd492a5c>] ? fget_light+0xfc/0x510 [ 308.372533] [<ffffffffbd47045a>] SyS_write+0x8a/0x100 [ 308.396456] [<ffffffffbd9ff355>] system_call_fastpath+0x1c/0x21 The code calls a qib_map_page() which has never correctly tested for a mapping error. Fix by testing for pci_dma_mapping_error() in all cases and properly handling the failure in the caller. Additionally, streamline qib_map_page() arguments to satisfy just the single caller. Cc: <[email protected]> Reviewed-by: Alex Estrin <[email protected]> Tested-by: Don Dutile <[email protected]> Reviewed-by: Don Dutile <[email protected]> Signed-off-by: Mike Marciniszyn <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24IB/isert: Fix for lib/dma_debug check_sync warningAlex Estrin1-9/+17
The following error message occurs on a target host in a debug build during session login: [ 3524.411874] WARNING: CPU: 5 PID: 12063 at lib/dma-debug.c:1207 check_sync+0x4ec/0x5b0 [ 3524.421057] infiniband hfi1_0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x0000000000000000] [size=76 bytes] ......snip ..... [ 3524.535846] CPU: 5 PID: 12063 Comm: iscsi_np Kdump: loaded Not tainted 3.10.0-862.el7.x86_64.debug #1 [ 3524.546764] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.2.6 06/08/2015 [ 3524.555740] Call Trace: [ 3524.559102] [<ffffffffa5fe915b>] dump_stack+0x19/0x1b [ 3524.565477] [<ffffffffa58a2f58>] __warn+0xd8/0x100 [ 3524.571557] [<ffffffffa58a2fdf>] warn_slowpath_fmt+0x5f/0x80 [ 3524.578610] [<ffffffffa5bf5b8c>] check_sync+0x4ec/0x5b0 [ 3524.585177] [<ffffffffa58efc3f>] ? set_cpus_allowed_ptr+0x5f/0x1c0 [ 3524.592812] [<ffffffffa5bf5cd0>] debug_dma_sync_single_for_cpu+0x80/0x90 [ 3524.601029] [<ffffffffa586add3>] ? x2apic_send_IPI_mask+0x13/0x20 [ 3524.608574] [<ffffffffa585ee1b>] ? native_smp_send_reschedule+0x5b/0x80 [ 3524.616699] [<ffffffffa58e9b76>] ? resched_curr+0xf6/0x140 [ 3524.623567] [<ffffffffc0879af0>] isert_create_send_desc.isra.26+0xe0/0x110 [ib_isert] [ 3524.633060] [<ffffffffc087af95>] isert_put_login_tx+0x55/0x8b0 [ib_isert] [ 3524.641383] [<ffffffffa58ef114>] ? try_to_wake_up+0x1a4/0x430 [ 3524.648561] [<ffffffffc098cfed>] iscsi_target_do_tx_login_io+0xdd/0x230 [iscsi_target_mod] [ 3524.658557] [<ffffffffc098d827>] iscsi_target_do_login+0x1a7/0x600 [iscsi_target_mod] [ 3524.668084] [<ffffffffa59f9bc9>] ? kstrdup+0x49/0x60 [ 3524.674420] [<ffffffffc098e976>] iscsi_target_start_negotiation+0x56/0xc0 [iscsi_target_mod] [ 3524.684656] [<ffffffffc098c2ee>] __iscsi_target_login_thread+0x90e/0x1070 [iscsi_target_mod] [ 3524.694901] [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod] [ 3524.705446] [<ffffffffc098ca50>] ? __iscsi_target_login_thread+0x1070/0x1070 [iscsi_target_mod] [ 3524.715976] [<ffffffffc098ca78>] iscsi_target_login_thread+0x28/0x60 [iscsi_target_mod] [ 3524.725739] [<ffffffffa58d60ff>] kthread+0xef/0x100 [ 3524.732007] [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80 [ 3524.739540] [<ffffffffa5fff1b7>] ret_from_fork_nospec_begin+0x21/0x21 [ 3524.747558] [<ffffffffa58d6010>] ? insert_kthread_work+0x80/0x80 [ 3524.755088] ---[ end trace 23f8bf9238bd1ed8 ]--- [ 3595.510822] iSCSI/iqn.1994-05.com.redhat:537fa56299: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION. The code calls dma_sync on login_tx_desc->dma_addr prior to initializing it with dma-mapped address. login_tx_desc is a part of iser_conn structure and is used only once during login negotiation, so the issue is fixed by eliminating dma_sync call for this buffer using a special case routine. Cc: <[email protected]> Reviewed-by: Mike Marciniszyn <[email protected]> Reviewed-by: Don Dutile <[email protected]> Signed-off-by: Alex Estrin <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24IB/{rdmavt,hfi1}: Change hrtimer add to use pinned versionMike Marciniszyn2-2/+2
Given we are dealing with nano-second level timers, when the timer pops, ensure it happens on the CPU which caused the timer to be set in the first place. This avoids excessive jitter from the desired expiration time by avoiding the cost of switching our context to another CPU that is cache cold for this given timer. Reviewed-by: Kaike Wan <[email protected]> Signed-off-by: Mike Marciniszyn <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24IB/hfi1: Set port number for errorinfo MAD responseMichael J. Ruhl1-0/+1
For errorinfo MAD requests, the response has a 0 port number left over from a memset. Instead we should always set the port number in the response. Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Michael J. Ruhl <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24IB/hfi1: Cleanup of exp_rcvMike Marciniszyn4-25/+56
The knowledge of the internal workings of the expect receive is too distributed. Fix by: - right size several rcd fields associated with expect receive - making an init entrance to init all the lists - consolidate all the allocations into an array anchored in the rcd Reviewed-by: Michael J. Ruhl <[email protected]> Reviewed-by: Kaike Wan <[email protected]> Signed-off-by: Mike Marciniszyn <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24IB/hfi1: Add 16B Management Packet trace supportDon Hiatt2-70/+130
Add trace support for 16B Management Packets. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Don Hiatt <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24IB/hfi1: Add support for 16B Management PacketsDon Hiatt4-33/+110
16B Management Packets (L4=0x08) replace the BTH and DETH of normal MAD packet packets with a header containing the the source and destination queue pair numbers; fields that were originally retrieved from the BTH/DETH are now populated from this header as well as from the 16B LRH (e.g. pkey). 16B Management Packets are used as an optimized management format on 16B fabrics. These management packets have an opcode of IB_OPCODE_UD_SEND_ONLY, a fixed 3Byte pad, and a header length of 24Bytes. The decision as to when we send a management packet is based upon either the source or destination queue pair number being 0 or 1. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Don Hiatt <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24IB/hfi1: Define 16B Management PacketsDon Hiatt2-0/+8
Add 16B Management Packet definition. This optimized packet format replaces the ib_other_headers and BTH with a source and destination QP number. To support these packets we introduce struct opa_16b_mgmt into the struct hfi1_16b_header. This packet format is only used for MAD packets using the IB_OPCODE_UD_SEND_ONLY opcode on QP0/1. The original 16B implementation failed to use 16B management packets so now we add their definition. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Don Hiatt <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24iw_cxgb4: provide detailed driver-specific MR informationSteve Wise1-0/+61
Add a table of important fields from the fw_ri_tpte structure to the mr resource tracking table. This is helpful in debugging. Signed-off-by: Steve Wise <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24iw_cxgb4: provide detailed driver-specific CQ informationSteve Wise1-0/+163
Add a table of important fields from the c4iw_cq* structures to the cq resource tracking table. This is helpful in debugging. Signed-off-by: Steve Wise <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-24iw_cxgb4: provide detailed provider-specific CM_ID informationSteve Wise1-0/+84
Add a table of important fields from the c4iw_ep* structures to the cm_id resource tracking table. This is helpful in debugging. Signed-off-by: Steve Wise <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-23RDMA/hns: Move the location for initializing tmp_lenoulijun1-1/+2
When posted work request, it need to compute the length of all sges of every wr and fill it into the msg_len field of send wqe. Thus, While posting multiple wr, tmp_len should be reinitialized to zero. Fixes: 8b9b8d143b46 ("RDMA/hns: Fix the endian problem for hns") Signed-off-by: Lijun Ou <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-23RDMA/hns: Bugfix for cq record db for kerneloulijun1-0/+1
When use cq record db for kernel, it needs to set the hr_cq->db_en to 1 and configure the dma address of record cq db of qp context. Fixes: 86188a8810ed ("RDMA/hns: Support cq record doorbell for kernel space") Signed-off-by: Lijun Ou <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-23IB/uverbs: Fix uverbs_attr_get_objJason Gunthorpe1-5/+5
The err pointer comes from uverbs_attr_get, not from the uobject member, which does not store an ERR_PTR. Fixes: be934cca9e98 ("IB/uverbs: Add device memory registration ioctl support") Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]>
2018-05-23RDMA/qedr: Fix doorbell bar mapping for dpi > 1Kalderon, Michal1-31/+29
Each user_context receives a separate dpi value and thus a different address on the doorbell bar. The qedr_mmap function needs to validate the address and map the doorbell bar accordingly. The current implementation always checked against dpi=0 doorbell range leading to a wrong mapping for doorbell bar. (It entered an else case that mapped the address differently). qedr_mmap should only be used for doorbells, so the else was actually wrong in the first place. This only has an affect on arm architecture and not an issue on a x86 based architecture. This lead to doorbells not occurring on arm based systems and left applications that use more than one dpi (or several applications run simultaneously ) to hang. Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs") Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: Michal Kalderon <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-22RDMA/ipoib: drop skb on path record lookup failureEvgenii Smirnov1-41/+21
In unicast_arp_send function there is an inconsistency in error handling of path_rec_start call. If path_rec_start is called because of an absent ah field, skb will be dropped. But if it is called on a creation of a new path, or if the path is invalid, skb will be added to the tail of path queue. In case of a new path it will be dropped on path_free, but in case of invalid path it can stay in the queue forever. This patch unifies the behavior, dropping skb in all cases of path_rec_start failure. Signed-off-by: Evgenii Smirnov <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-22RDMA/CMA: add rdma_iw_cm_id() and rdma_res_to_id() helpersSteve Wise2-0/+31
Add a helper function for iwarp drivers to be able to map an rdma_cm_id to an iw_cm_id. This is useful for dumping driver specific NLDEV/RESTRACK connection state. Add a helper to return the rdma_cm_id pointer from the rdma_restack pointer. This is needed for rdma drivers to map a res entry back to the public rdma_cm_id struct. Signed-off-by: Steve Wise <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-22iw_cxgb4: always set iw_cm_id.provider_dataSteve Wise1-0/+1
In active side connections, the provider_data field is not getting set. This will be used in a subsequent patch to dump state, so always set it. Signed-off-by: Steve Wise <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-05-22RDMA/ipoib: Update paths on CLIENT_REREG/SM_CHANGE eventsDoug Ledford2-7/+28
We do a light flush on CLIENT_REREG and SM_CHANGE events. This goes through and marks paths invalid. But we weren't always checking for this validity when we needed to, and so we could keep using a path marked invalid. What's more, once we establish a path with a valid ah, we put a pointer to the ah in the neigh struct directly, so even if we mark the path as invalid, as long as the neigh has a direct pointer to the ah, it keeps using the old, outdated ah. To fix this we do several things. 1) Put the valid flag in the ah instead of the path struct, so when we put the ah pointer directly in the neigh struct, we can easily check the validity of the ah on send events. 2) Check the neigh->ah and neigh->ah->valid elements in the needed places, and if we have an ah, but it's invalid, then invoke a refresh of the ah. 3) Fix the various places that check for path, but didn't check for path->valid (now path->ah && path->ah->valid). Reported-by: Evgenii Smirnov <[email protected]> Fixes: ee1e2c82c245 ("IPoIB: Refresh paths instead of flushing them on SM change events") Signed-off-by: Doug Ledford <[email protected]>
2018-05-17net/mlx5e: Explicitly set source e-switch in offloaded TC rulesShahar Klein3-0/+10
Set a specific source e-switch when setting a rule that matches on the ingress port. Signed-off-by: Shahar Klein <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-17net/mlx5: Add source e-switch ownerShahar Klein2-2/+14
The source e-switch owner allows a vport on one e-switch port be associated with a rule defined on the second port e-switch. The role of the source eswitch owner valid bit in the flow group is to allow the firmware fail driver attempts to wild card the source eswitch match field. If this bit is not set, the firmware ignores the source eswitch owner field totally. Signed-off-by: Shahar Klein <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-17net/mlx5e: Explicitly set destination e-switch in FDB rulesRabie Loulou3-0/+8
Set a specific destination e-switch when setting a destination vport. Signed-off-by: Rabie Loulou <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Reviewed-by: Shahar Klein <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-17net/mlx5: Add destination e-switch ownerShahar Klein7-10/+21
The destination e-switch owner allows a rule in namespace of one e-switch owner to point to a vport that is natively associated with another e-switch owner. Signed-off-by: Shahar Klein <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-17net/mlx5: Properly handle a vport destination when setting FTEShahar Klein1-0/+3
When creating FTE, properly distinguish between destination being vport or tir. The previous code just worked accidentally b/c of both dest being in the same offset within a union. Signed-off-by: Shahar Klein <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-17net/mlx5: Add merged e-switch capRoi Dayan1-1/+2
When merged e-switch is supported, the per-port e-switch is logically merged into one e-switch that spans both physical ports and all the VFs. Under merged eswitch, both the matching on source vport and setting destination vport can have a 2nd attribute which is the vhca id of the eswitch owner. For example: esw0: {match: <src vport=1 owner=0> action: fwd to <dst vport=7, owner=1>} is a flow set on eswitch0 matching on source vport=1 from his eswitch and the action being fwd to dest vport=7 of eswitch1. Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Shahar Klein <[email protected]> Reviewed-by: Or Gerlitz Klein <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-17IB/rxe: avoid calling WARN_ON_ONCE twiceZhu Yanjun1-4/+0
In the exit branch, WARN_ON_ONCE is called to show stack. So it is not necessary to call WARN_ON_ONCE before going to exit. Signed-off-by: Zhu Yanjun <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16RDMA/hns: Add 64KB page size support for hip08Yixian Liu3-22/+23
This patch adds the support of 64KB page size for hip08 in kernel. Signed-off-by: Yixian Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/ipoib: replace local_irq_disable() with proper lockingSebastian Andrzej Siewior1-9/+6
In ipoib_mcast_restart_task() the netif_addr_lock() is invoked prior local_irq_save(). netif_addr_lock() should not be invoked in interrupt disabled section, only in BH disabled sections. The priv->lock is always acquired with disabled interrupts. The only place where netif_addr_lock() and priv->lock nest ist ipoib_mcast_restart_task(). Drop the local_irq_save() and acquire priv->lock with spin_lock_irq() inside the netif_addr locked section. It's safe to do so because the caller is either a worker function or __ipoib_ib_dev_flush() which are both calling with interrupts enabled (and since BH is enabled here, too so netif_addr_lock_bh() needs to be used). Cc: Doug Ledford <[email protected]> Cc: Jason Gunthorpe <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/mlx5: Expose MPLS related tunneling offloadsAriel Levkovich4-2/+19
This patch reports the device's capbilities to offload encapsulated MPLS tunnel protocols to user-space: - Capability to offload MPLS over GRE. - Capability to offload MPLS over UDP. Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/mlx5: Add support for MPLS flow specificationAriel Levkovich5-11/+159
This patch introduces support for the MPLS flow spec and allows the creation of rules that are matching on the MPLS label. Applying the rule matching depends on the flow specs order and the location of the MPLS in the spec list as there are different configurations to be made in the device in the cases of MPLSoGRE and MPLSoUDP vs. non-encapsulated MPLS. Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/mlx5: Add support for GRE flow specificationAriel Levkovich1-0/+24
This patch introduces support for the GRE flow spec and allowing the creation of rules based on the protocol and key fields that are part of GRE protocol header. Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/uverbs: Introduce a MPLS steering match filterAriel Levkovich2-0/+27
Add a new MPLS steering match filter that can match against a single MPLS tag field. Since the MPLS header can reside in different locations in the packet's protocol stack as well as be encapsulated with a tunnel protocol, it is required to know the exact location of the header in the protocol stack. Therefore, when including the MPLS protocol spec in the specs list, it is mandatory to provide the list in an ordered manner, so that it represents the actual header order in a matching packet. Drivers that process the spec list and apply the matching rule should treat the position of the MPLS spec in the spec list as the actual location of the MPLS label in the packet's protocol stack. Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/uverbs: Expose MPLS flow spec to the user-kernel ABI headerAriel Levkovich1-0/+23
Add ib_uverbs_flow_spec_mpls to define a rule to match the MPLS protocol. The spec includes the generic specs header, type, size and reserved fields while the filter itself is defined as ib_uverbs_flow_mpls_filter and includes a single 32bit field named 'label' which consists of: Bits 0:19 - The MPLS label. Bits 20:22 - Traffic class field. Bit 23 - Bottom of stack bit. Bits 24:31 - Time to live (TTL) field. Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/uverbs: Introduce a GRE steering match filterAriel Levkovich2-0/+28
Adding a new GRE steering match filter that can match against key and protocol fields. Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/uverbs: Expose GRE flow spec to the user-kernel ABI headerAriel Levkovich1-0/+27
Add ib_uverbs_flow_spec_gre to define a rule to match the GRE encapsulation protocol. The spec includes the generic specs header, type, size and reserved fields while the filter itself is defined as ib_uverbs_flow_gre_filter and includes: 1. Checksum present bit, key present bit and version bits in a single 16bit field. 2. Protocol type field - Indicates the ether protocol type of the encapsulated payload. 3. Key field - present if key bit is set and contains an application specific key value. Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/mlx5: Use 'kvfree()' for memory allocated by 'kvzalloc()'Christophe JAILLET1-1/+1
When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to free it. Fixes: 1cbe6fc86ccfe ("IB/mlx5: Add support for CQE compressing") Signed-off-by: Christophe JAILLET <[email protected]> Acked-by: Jason Gunthorpe <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-16net/mlx5: Eswitch, Use 'kvfree()' for memory allocated by 'kvzalloc()'Christophe JAILLET1-1/+1
When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to free it. Fixes: fed9ce22bf8ae ("net/mlx5: E-Switch, Add API to create vport rx rules") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-16net/mlx5: Vport, Use 'kvfree()' for memory allocated by 'kvzalloc()'Christophe JAILLET1-3/+3
When 'kvzalloc()' is used to allocate memory, 'kvfree()' must be used to free it. Fixes: 9efa75254593d ("net/mlx5_core: Introduce access functions to query vport RoCE fields") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-16IB/cm: Store and restore ah_attr during CM message processingParav Pandit1-2/+14
During CM request processing flow, ah_attr is initialized twice. First based on wc. Secondly based on primary path record. ah_attr initialization from path record can fail, which leads to ah_attr zeroed out. Therefore, always initialize ah_attr on stack during reinitialization phase. If ah_attr init is successful, use the new ah_attry by overwriting the old one. If the ah_attr init fails, continue to use the last ah_attr. Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-05-16IB/cm: Store and restore ah_attr during LAP msg processingParav Pandit1-3/+29
During CM LAP processing, ah_attr is reinitialized on receiving LAP request. First likely during CM request processing. ah_attr might get zero out if LAP processing fails. Therefore, attempt to create new ah_attr for the LAP message. If the initialization fails, continue with older ah_attr. If the initialization passes, consider the new ah_attr by overwriting the older one. Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>