aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-08-02IB/hfi1: Make the cache handler own its rb tree rootDean Luick6-125/+81
The objects which use cache handling should reference their own handler object not the internal data structure it uses to track the nodes. Have the "users" of the mmu notifier code pass opaque objects which can then be properly used in the mmu callbacks depending on the owners needs. This patch has the additional benefit that operations no longer require a look up in a list to find the handlers. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Make use of mm consistentIra Weiny8-31/+50
The hfi1 driver registers a mmu_notifier callback when /dev/hfi1_* is opened, and unregisters it when the device is closed. The driver incorrectly assumes that the close will always happen from the same context as the open. In particular, closes due to SIGKILL or OOM killer activity may happen from a different context. In these cases, the wrong mm is passed to mmu_notifier_unregister(), which causes improper reference counting for the victim mm, and eventual memory corruption. Preserve the mm for all open file descriptors and use this mm rather than current->mm for memory operations for the lifetime of that fd. Note: this patch leaves 1 use of current->mm in place. This use is removed in a follow on patch because other functional changes were required prior to that use being removed. If registration fails, there is no reason to keep the handler object around. Free the handler object rather than add it to the list to prevent any mmu_notifier operations, including unregister, when registration fails. Suggested-by: Jim Foraker <[email protected]> Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Fix user SDMA racy user request claimDean Luick2-13/+20
The user SDMA in-use claim bit is in the structure that gets zeroed out once the claim is made. Move the request in-use flag into its own bit array and use that for atomic claims. This cleans up the claim code and removes any race possibility. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Fix error condition that needs to clean upDean Luick1-1/+2
If input validation fails, properly free the request before returning. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Release node on insert failureDean Luick1-0/+1
If unable to insert node into the RB tree cache, node will be freed before returning from the function. Null out iovec's pointer to node so iovec does not try to free it later. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Validate SDMA user iovector countDean Luick1-2/+22
Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Validate SDMA user request indexDean Luick1-0/+8
Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Use the same capability state for all shared contextsDean Luick4-21/+12
Save the current capability state at user context creation time. Report this saved value for all shared contexts. Also get rid of unnecessary hfi1_get_base_kinfo function. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Prevent null pointer dereferenceIra Weiny1-1/+1
If a context has not been assigned or assignment failed, pq may be NULL. Move the unregister within the protection of the null check. Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Rename TID mmu_rb_* functionsDean Luick1-12/+12
Clarify the names of the TID mmu functions. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Remove unneeded empty check in hfi1_mmu_rb_unregister()Dean Luick1-9/+6
Checking if the rb tree is empty is redundant with the while loop which is emptying the rb tree. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Restructure hfi1_file_openIra Weiny1-4/+10
Rearrange the file open call in prep for new changes. Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Make iovec loop index easy to understandDean Luick1-3/+3
Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Use "false" not 0Ira Weiny1-1/+1
For bool parameters "false" should be used Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Remove unused sub-context parameterIra Weiny1-5/+4
subctxt is not used, just remove it. Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Consolidate __mmu_rb_remove and hfi1_mmu_rb_removeIra Weiny1-17/+9
__mmu_rb_remove was called in only 1 place which was a very simple call site. Combine this function into its caller. Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Always expect ops functionsDean Luick1-14/+6
Remove, insert, and invalidate are always provided. No need to test. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Add parameter names to callback declarationsIra Weiny1-5/+6
This makes it more clear what these functions are operating on. Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Add parameter names to function declarationsIra Weiny1-3/+5
Parameter names to function declarations make it more clear what those parameters do. Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Remove unused function hfi1_mmu_rb_searchDean Luick2-19/+0
Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Remove unused uctxt->subpid and uctxt->pidDean Luick2-7/+0
These are no longer needed. Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Fix minor format errorIra Weiny1-1/+2
Brackets should be on the next line of a function Reviewed-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Expand reported serial numberIra Weiny1-1/+7
Expand the serial number space by using more bits from the GUID. Reviewed-by: Jubin John <[email protected]> Signed-off-by: Dean Luick <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Allow for non-double word multiple message sizes for user SDMAIra Weiny2-10/+23
The driver pads non-double word multiple message sizes but it doesn't account for this padding when the packet length is calculated. Also, the data length is miscalculated for message sizes less than 4 bytes due to the bit representation in LRH. And there's a check for non-double word multiple message sizes that prevents these messages from being sent. This patch fixes length miscalculations and enables the functionality to send non-double word multiple message sizes. Reviewed-by: Harish Chegondi <[email protected]> Signed-off-by: Sebastian Sanchez <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/rdmavt: Eliminate redundant opcode test in mr ref clearIra Weiny2-8/+5
The use of the specific opcode test is redundant since all ack entry users correctly manipulate the mr pointer to selectively trigger the reference clearing. The overly specific test hinders the use of implementation specific operations. The change needs to get rid of the union to insure that an atomic value is not seen as an MR pointer. Reviewed-by: Ashutosh Dixit <[email protected]> Signed-off-by: Mike Marciniszyn <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-02IB/hfi1: Handle kzalloc failure in init_pervl_scsIra Weiny1-3/+16
Checking the return value of the memory allocation call in init_pervl_scs() was missed. Recently the kmalloc() was changed to kzalloc() which identified the problem. While fixing this issue 2 other bugs were noticed. First, the array being allocated is accessed in the nomem path which can be reached before it is allocated. Second, kernel_send_context was not released on error. Fix both of these by creating a more common memory unwind label structure. Fixes: 35f6befc8441 ("staging/rdma/hfi1: Add qp to send context mapping for PIO") Reported-by: Leon Romanovsky <[email protected]> Reviewed-by: Mike Marciniszyn <[email protected]> Signed-off-by: Dennis Dalessandro <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2016-08-03xfs: move (and rename) the deferred bmap-free tracepointsDarrick J. Wong2-2/+7
Rename the deferred bmap-free to extent_free and make them only trigger when we're really running deferred ops. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: collapse single use static functionsDarrick J. Wong2-128/+63
Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: remove unnecessary parentheses from log redo item recovery functionsDarrick J. Wong2-12/+12
Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: remove the extents array from the rmap update done log itemDarrick J. Wong7-89/+15
Nothing ever uses the extent array in the rmap update done redo item, so remove it before it is fixed in the on-disk log format. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: in btree_lshift, only allocate temporary cursor when neededDarrick J. Wong1-17/+17
We only need the temporary cursor in _btree_lshift if we're shifting in an overlapped btree. Therefore, factor that into a single block of code so we avoid unnecessary cursor duplication. Also fix use of the wrong cursor when checking for corruption in xfs_btree_rshift(). Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: remove unnecesary lshift/rshift key initializationDarrick J. Wong1-14/+0
In the lshift/rshift functions we don't use the key variable for anything now, so remove the variable and its initializer. The update_keys functions figure out the key for a block on their own. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: remove the get*keys and update_keys btree ops pointersDarrick J. Wong6-136/+61
These are internal btree functions; we don't need them to be dispatched via function pointers. Make them static again and just check the overlapped flag to figure out what we need to do. The strategy behind this patch was suggested by Christoph. Signed-off-by: Darrick J. Wong <[email protected]> Suggested-by: Christoph Hellwig <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: enable the rmap btree functionalityDarrick J. Wong2-1/+6
Originally-From: Dave Chinner <[email protected]> Add the feature flag to the supported matrix so that the kernel can mount and use rmap btree enabled filesystems Signed-off-by: Dave Chinner <[email protected]> [[email protected]: move the experimental tag] Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: don't update rmapbt when fixing agflDarrick J. Wong2-4/+17
Allow a caller of xfs_alloc_fix_freelist to disable rmapbt updates when fixing the AG freelist. xfs_repair needs this during phase 5 to be able to adjust the freelist while it's reconstructing the rmap btree; the missing entries will be added back at the very end of phase 5 once the AGFL contents settle down. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: disable XFS_IOC_SWAPEXT when rmap btree is enabledDarrick J. Wong1-0/+4
Swapping extents between two inodes requires the owner to be updated in the rmap tree for all the extents that are swapped. This code does not yet exist, so switch off the XFS_IOC_SWAPEXT ioctl until support has been implemented. This will need to be done before the rmap btree code can have the experimental tag removed. This functionality will be provided in a (much) later patch, using some of the reflink deferred block remapping functionality to accomlish extent swapping with rmap updates. Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: add rmap btree block detection to log recoveryDarrick J. Wong1-0/+4
Originally-From: Dave Chinner <[email protected]> So such blocks can be correctly identified and have their operations structures attached to validate recovery has not resulted in a correct block. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: add rmap btree geometry feature flagDarrick J. Wong2-1/+4
Originally-From: Dave Chinner <[email protected]> So xfs_info and other userspace utilities know the filesystem is using this feature. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: propagate bmap updates to rmapbtDarrick J. Wong8-16/+400
When we map, unmap, or convert an extent in a file's data or attr fork, schedule a respective update in the rmapbt. Previous versions of this patch required a 1:1 correspondence between bmap and rmap, but this is no longer true as we now have ability to make interval queries against the rmapbt. We use the deferred operations code to handle redo operations atomically and deadlock free. This plumbs in all five rmap actions (map, unmap, convert extent, alloc, free); we'll use the first three now for file data, and reflink will want the last two. We also add an error injection site to test log recovery. Finally, we need to fix the bmap shift extent code to adjust the rmaps correctly. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: enable the xfs_defer mechanism to process rmaps to updateDarrick J. Wong4-10/+134
Connect the xfs_defer mechanism with the pieces that we'll need to handle deferred rmap updates. We'll wire up the existing code to our new deferred mechanism later. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: log rmap intent itemsDarrick J. Wong6-0/+438
Provide a mechanism for higher levels to create RUI/RUD items, submit them to the log, and a stub function to deal with recovered RUI items. These parts will be connected to the rmapbt in a later patch. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: create rmap update intent log itemsDarrick J. Wong6-2/+662
Create rmap update intent/done log items to record redo information in the log. Because we need to roll transactions between updating the bmbt mapping and updating the reverse mapping, we also have to track the status of the metadata updates that will be recorded in the post-roll transactions, just in case we crash before committing the final transaction. This mechanism enables log recovery to finish what was already started. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: add rmap btree insert and delete helpersDarrick J. Wong2-1/+49
Add a couple of helper functions to encapsulate rmap btree insert and delete operations. Add tracepoints to the update function. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: convert unwritten status of reverse mappingsDarrick J. Wong2-0/+446
Provide a function to convert an unwritten rmap extent to a real one and vice versa. [ dchinner: Note that this algorithm and code was derived from the existing bmapbt unwritten extent conversion code in xfs_bmap_add_extent_unwritten_real(). ] Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: remove an extent from the rmap btreeDarrick J. Wong1-5/+215
Originally-From: Dave Chinner <[email protected]> Now that we have records in the rmap btree, we need to remove them when extents are freed. This needs to find the relevant record in the btree and remove/trim/split it accordingly. [[email protected]: make rmap routines handle the enlarged keyspace] [dchinner: remove remaining unused debug printks] [darrick: fix a bug when growfs in an AG with an rmap ending at EOFS] Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: add an extent to the rmap btreeDarrick J. Wong2-5/+223
Originally-From: Dave Chinner <[email protected]> Now all the btree, free space and transaction infrastructure is in place, we can finally add the code to insert reverse mappings to the rmap btree. Freeing will be done in a separate patch, so just the addition operation can be focussed on here. [darrick: handle owner offsets when adding rmaps] [dchinner: remove remaining debug printk statements] [darrick: move unwritten bit to rm_offset] Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: add tracepoints for the rmap functionsDarrick J. Wong1-2/+49
Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: teach rmapbt to support interval queriesDarrick J. Wong2-0/+52
Now that the generic btree code supports querying all records within a range of keys, use that functionality to allow us to ask for all the extents mapped to a range of physical blocks. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: support overlapping intervals in the rmap btreeDarrick J. Wong2-4/+74
Now that the generic btree code supports overlapping intervals, plug in the rmap btree to this functionality. We will need it to find potential left neighbors in xfs_rmap_{alloc,free} later in the patch set. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03xfs: add rmap btree operationsDarrick J. Wong5-0/+376
Originally-From: Dave Chinner <[email protected]> Implement the generic btree operations needed to manipulate rmap btree blocks. This is very similar to the per-ag freespace btree implementation, and uses the AGFL for allocation and freeing of blocks. Adapt the rmap btree to store owner offsets within each rmap record, and to handle the primary key being redefined as the tuple [agblk, owner, offset]. The expansion of the primary key is crucial to allowing multiple owners per extent. [darrick: adapt the btree ops to deal with offsets] [darrick: remove init_rec_from_key] [darrick: move unwritten bit to rm_offset] Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>