aboutsummaryrefslogtreecommitdiff
path: root/drivers/misc/cxl
AgeCommit message (Collapse)AuthorFilesLines
2017-05-02cxl: Force context lock during EEH flowVaibhav Jain1-2/+17
During an eeh event when the cxl card is fenced and card sysfs attr perst_reloads_same_image is set following warning message is seen in the kernel logs: Adapter context unlocked with 0 active contexts ------------[ cut here ]------------ WARNING: CPU: 12 PID: 627 at ../drivers/misc/cxl/main.c:325 cxl_adapter_context_unlock+0x60/0x80 [cxl] Even though this warning is harmless, it clutters the kernel log during an eeh event. This warning is triggered as the EEH callback cxl_pci_error_detected doesn't obtain a context-lock before forcibly detaching all active context and when context-lock is released during call to cxl_configure_adapter from cxl_pci_slot_reset, a warning in cxl_adapter_context_unlock is triggered. To fix this warning, we acquire the adapter context-lock via cxl_adapter_context_lock() in the eeh callback cxl_pci_error_detected() once all the virtual AFU PHBs are notified and their contexts detached. The context-lock is released in cxl_pci_slot_reset() after the adapter is successfully reconfigured and before the we call the slot_reset callback on slice attached device-drivers. Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists") Cc: [email protected] # v4.9+ Reported-by: Andrew Donnellan <[email protected]> Signed-off-by: Vaibhav Jain <[email protected]> Acked-by: Frederic Barrat <[email protected]> Reviewed-by: Matthew R. Ochs <[email protected]> Tested-by: Uma Krishnan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-19cxl: Enable PCI device IDs for future IBM CXL adaptersMatthew R. Ochs1-0/+2
Add support for future IBM Coherent Accelerator (CXL) devices with an IDs of 0x0623 and 0x0628. Signed-off-by: Matthew R. Ochs <[email protected]> Signed-off-by: Uma Krishnan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-13cxl: Add psl9 specific codeChristophe Lombard9-74/+751
The new Coherent Accelerator Interface Architecture, level 2, for the IBM POWER9 brings new content and features: - POWER9 Service Layer - Registers - Radix mode - Process element entry - Dedicated-Shared Process Programming Model - Translation Fault Handling - CAPP - Memory Context ID If a valid mm_struct is found the memory context id is used for each transaction associated with the process handle. The PSL uses the context ID to find the corresponding process element. Signed-off-by: Christophe Lombard <[email protected]> Acked-by: Frederic Barrat <[email protected]> [mpe: Fixup comment formatting, unsplit long strings] Signed-off-by: Michael Ellerman <[email protected]>
2017-04-13cxl: Isolate few psl8 specific callsChristophe Lombard5-56/+116
Point out the specific Coherent Accelerator Interface Architecture, level 1, registers. Code and functions specific to PSL8 (CAIA1) must be framed. Signed-off-by: Christophe Lombard <[email protected]> Acked-by: Frederic Barrat <[email protected]> [mpe: Don't split long strings, it makes them hard to grep for] Signed-off-by: Michael Ellerman <[email protected]>
2017-04-13cxl: Rename some psl8 specific functionsChristophe Lombard6-54/+54
Rename a few functions, changing the '_psl' suffix to '_psl8', to make clear that the implementation is psl8 specific. Those functions will have an equivalent implementation for the psl9 in a later patch. Signed-off-by: Christophe Lombard <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-13cxl: Update implementation service layerChristophe Lombard6-59/+110
The service layer API (in cxl.h) lists some low-level functions whose implementation is different on PSL8, PSL9 and XSL: - Init implementation for the adapter and the afu. - Invalidate TLB/SLB. - Attach process for dedicated/directed models. - Handle psl interrupts. - Debug registers for the adapter and the afu. - Traces. Each environment implements its own functions, and the common code uses them through function pointers, defined in cxl_service_layer_ops. Signed-off-by: Christophe Lombard <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-13cxl: Keep track of mm struct associated with a contextChristophe Lombard6-90/+61
The mm_struct corresponding to the current task is acquired each time an interrupt is raised. So to simplify the code, we only get the mm_struct when attaching an AFU context to the process. The mm_count reference is increased to ensure that the mm_struct can't be freed. The mm_struct will be released when the context is detached. A reference on mm_users is not kept to avoid a circular dependency if the process mmaps its cxl mmio and forget to unmap before exiting. The field glpid (pid of the group leader associated with the pid), of the structure cxl_context, is removed because it's no longer useful. Signed-off-by: Christophe Lombard <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-13cxl: Remove unused values in bare-metal environment.Christophe Lombard3-24/+7
The two previously fields pid and tid, located in the structure cxl_irq_info, are only used in the guest environment. To avoid confusion, it's not necessary to fill the fields in the bare-metal environment. Pid_tid is now renamed to 'reserved' to avoid undefined behavior on bare-metal. The PSL Process and Thread Identification Register (CXL_PSL_PID_TID_An) is only used when attaching a dedicated process for PSL8 only. This register goes away in CAIA2. Signed-off-by: Christophe Lombard <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-13cxl: Read vsec perst load imageChristophe Lombard1-0/+1
This bit is used to cause a flash image load for programmable CAIA-compliant implementation. If this bit is set to ‘0’, a power cycle of the adapter is required to load a programmable CAIA-com- pliant implementation from flash. This field will be used by the following patches. Signed-off-by: Christophe Lombard <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-03-24treewide: Fix typos in printkMasanari Iida1-1/+1
This patch fix some spelling typos found in printk. [[email protected]: drop arch/arm64/kernel/hibernate.c that was already in place] Signed-off-by: Masanari Iida <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2017-03-20cxl: Route eeh events to all slices for pci_channel_io_perm_failure stateVaibhav Jain1-7/+6
Fix a boundary condition where in some cases an eeh event with state == pci_channel_io_perm_failure wont be passed on to a driver attached to the virtual PCI device associated with a slice. This will happen in case the slice just before (n-1) doesn't have any vPHB bus associated with it, that results in an early return from cxl_pci_error_detected() callback. With state == pci_channel_io_perm_failure, the adapter will be removed irrespective of the return value of cxl_vphb_error_detected(). So we now always return PCI_ERS_RESULT_DISCONNECTED for this case i.e even if the AFU isn't using a vPHB (currently returns PCI_ERS_RESULT_NONE). Fixes: e4f5fc001a6("cxl: Do not create vPHB if there are no AFU configuration records") Signed-off-by: Vaibhav Jain <[email protected]> Reviewed-by: Matthew R. Ochs <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-03-02sched/headers: Prepare to move the get_task_struct()/put_task_struct() and ↵Ingo Molnar1-0/+2
related APIs from <linux/sched.h> to <linux/sched/task.h> But first update usage sites with the new header dependency. Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-02sched/headers: Prepare to move signal wakeup & sigpending methods from ↵Ingo Molnar2-2/+2
<linux/sched.h> into <linux/sched/signal.h> Fix up affected files that include this signal functionality via sched.h. Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar1-0/+1
<linux/sched/mm.h> We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/mm.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. The APIs that are going to be moved first are: mm_alloc() __mmdrop() mmdrop() mmdrop_async_fn() mmdrop_async() mmget_not_zero() mmput() mmput_async() get_task_mm() mm_access() mm_release() Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar1-0/+1
<linux/sched/clock.h> We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up from other headers and .c files. Create a trivial placeholder <linux/sched/clock.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-01Merge tag 'powerpc-4.11-2' of ↵Linus Torvalds4-10/+27
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc updates from Michael Ellerman: "Highlights include: - an update of the disassembly code used by xmon to the latest versions in binutils. We've received permission from all the authors of the relevant binutils changes to relicense their changes to the relevant files from GPLv3 to GPLv2, for inclusion in Linux. Thanks to Peter Bergner for doing the leg work to get permission from everyone. - addition of the "architected" Power9 CPU table entry, allowing us to boot in Power9 architected mode under a hypervisor. - updates to the Power9 PMU code. - implementation of clear_bit_unlock_is_negative_byte() to optimise unlock_page(). - Freescale updates from Scott: "Highlights include 8xx breakpoints and perf, t1042rdb display support, and board updates." Thanks to: Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Balbir Singh, Douglas Miller, Frédéric Weisbecker, Gavin Shan, Madhavan Srinivasan, Michael Roth, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Peter Bergner, Paul E. McKenney, Rashmica Gupta, Russell Currey, Sahil Mehta, Stewart Smith" * tag 'powerpc-4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits) powerpc: Remove leftover cputime_to_nsecs call causing build error powerpc/mm/hash: Always clear UPRT and Host Radix bits when setting up CPU powerpc/optprobes: Fix TOC handling in optprobes trampoline powerpc/pseries: Advertise Hot Plug Event support to firmware cxl: fix nested locking hang during EEH hotplug powerpc/xmon: Dump memory in CPU endian format powerpc/pseries: Revert 'Auto-online hotplugged memory' powerpc/powernv: Make PCI non-optional powerpc/64: Implement clear_bit_unlock_is_negative_byte() powerpc/powernv: Remove unused variable in pnv_pci_sriov_disable() powerpc/kernel: Remove error message in pcibios_setup_phb_resources() powerpc/mm: Fix typo in set_pte_at() pci/hotplug/pnv-php: Disable MSI and PCI device properly pci/hotplug/pnv-php: Disable surprise hotplug capability on conflicts pci/hotplug/pnv-php: Remove WARN_ON() in pnv_php_put_slot() powerpc: Add POWER9 architected mode to cputable powerpc/perf: use is_kernel_addr macro in perf_get_misc_flags() powerpc/perf: Avoid FAB_*_MATCH checks for power9 powerpc/perf: Add restrictions to PMC5 in power9 DD1 powerpc/perf: Use Instruction Counter value ...
2017-02-24mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmfDave Jiang1-1/+2
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to take a vma and vmf parameter when the vma already resides in vmf. Remove the vma parameter to simplify things. [[email protected]: fix ARM build] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dave Jiang <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Ross Zwisler <[email protected]> Cc: Theodore Ts'o <[email protected]> Cc: Darrick J. Wong <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Jan Kara <[email protected]> Cc: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-02-21cxl: fix nested locking hang during EEH hotplugAndrew Donnellan4-10/+27
Commit 14a3ae34bfd0 ("cxl: Prevent read/write to AFU config space while AFU not configured") introduced a rwsem to fix an invalid memory access that occurred when someone attempts to access the config space of an AFU on a vPHB whilst the AFU is deconfigured, such as during EEH recovery. It turns out that it's possible to run into a nested locking issue when EEH recovery fails and a full device hotplug is required. cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on configured_rwsem. When EEH recovery fails, the EEH code calls pci_hp_remove_devices() to remove the device, which in turn calls cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries to grab the writer lock that's already held. Standard rwsem semantics don't express what we really want to do here and don't allow for nested locking. Fix this by replacing the rwsem with an atomic_t which we can control more finely. Allow the AFU to be locked multiple times so long as there are no readers. Fixes: 14a3ae34bfd0 ("cxl: Prevent read/write to AFU config space while AFU not configured") Cc: [email protected] # v4.9+ Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-03cxl: Fix build when CONFIG_DEBUG_FS=nAndrew Donnellan2-5/+57
Stub out the debugfs functions so that the build doesn't break when CONFIG_DEBUG_FS=n. Reported-by: Michael Ellerman <[email protected]> Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-25cxl: Prevent read/write to AFU config space while AFU not configuredAndrew Donnellan4-23/+35
During EEH recovery, we deconfigure all AFUs whilst leaving the corresponding vPHB and virtual PCI device in place. If something attempts to interact with the AFU's PCI config space (e.g. running lspci) after the AFU has been deconfigured and before it's reconfigured, cxl_pcie_{read,write}_config() will read invalid values from the deconfigured struct cxl_afu and proceed to Oops when they try to dereference pointers that have been set to NULL during deconfiguration. Add a rwsem to struct cxl_afu so we can prevent interaction with config space while the AFU is deconfigured. Reported-by: Pradipta Ghosh <[email protected]> Suggested-by: Frederic Barrat <[email protected]> Cc: [email protected] # v4.9+ Signed-off-by: Andrew Donnellan <[email protected]> Signed-off-by: Vaibhav Jain <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-25cxl: Force psl data-cache flush during device shutdownVaibhav Jain1-0/+3
This change adds a force psl data cache flush during device shutdown callback. This should reduce a possibility of psl holding a dirty cache line while the CAPP is being reinitialized, which may result in a UE [load/store] machine check error. Signed-off-by: Vaibhav Jain <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-01-25cxl: Drop unused header asm/pnv-pci.hGreg Kurz1-1/+0
The kernel API does not use anything from this header file. Signed-off-by: Greg Kurz <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-12-16Merge tag 'powerpc-4.10-1' of ↵Linus Torvalds10-49/+160
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights include: - Support for the kexec_file_load() syscall, which is a prereq for secure and trusted boot. - Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN). - Sort the exception tables at build time, to save time at boot, and store them as relative offsets to save space in the kernel image & memory. - Allow building the kernel with thin archives, which should allow us to build an allyesconfig once some other fixes land. - Build fixes to allow us to correctly rebuild when changing the kernel endian from big to little or vice versa. - Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix. - Initial stack protector support (-fstack-protector). - Support for dumping the radix (aka. Linux) and hash page tables via debugfs. - Fix an oops in cxl coredump generation when cxl_get_fd() is used. - Freescale updates from Scott: "Highlights include 8xx hugepage support, qbman fixes/cleanup, device tree updates, and some misc cleanup." - Many and varied fixes and minor enhancements as always. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain" [ And thanks to Michael, who took time off from a new baby to get this pull request done. - Linus ] * tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits) powerpc/fsl/dts: add FMan node for t1042d4rdb powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb powerpc/fsl/dts: add QMan and BMan nodes on t1024 powerpc/fsl/dts: add QMan and BMan nodes on t1023 soc/fsl/qman: test: use DEFINE_SPINLOCK() powerpc/fsl-lbc: use DEFINE_SPINLOCK() powerpc/8xx: Implement support of hugepages powerpc: get hugetlbpage handling more generic powerpc: port 64 bits pgtable_cache to 32 bits powerpc/boot: Request no dynamic linker for boot wrapper soc/fsl/bman: Use resource_size instead of computation soc/fsl/qe: use builtin_platform_driver powerpc/fsl_pmc: use builtin_platform_driver powerpc/83xx/suspend: use builtin_platform_driver powerpc/ftrace: Fix the comments for ftrace_modify_code powerpc/perf: macros for power9 format encoding powerpc/perf: power9 raw event format encoding powerpc/perf: update attribute_group data structure powerpc/perf: factor out the event format field powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown ...
2016-12-14mm: use vmf->address instead of of vmf->virtual_addressJan Kara1-3/+2
Every single user of vmf->virtual_address typed that entry to unsigned long before doing anything with it so the type of virtual_address does not really provide us any additional safety. Just use masked vmf->address which already has the appropriate type. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Dan Williams <[email protected]> Cc: Ross Zwisler <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-11-25cxl: drop duplicate header sched.hGeliang Tang1-1/+0
Drop duplicate header sched.h from native.c. Signed-off-by: Geliang Tang <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-23cxl: Fix coccinelle warningsAndrew Donnellan4-6/+8
Fix the following coccinelle warnings: drivers/misc/cxl/debugfs.c:46:0-23: WARNING: fops_io_x64 should be defined with DEFINE_DEBUGFS_ATTRIBUTE drivers/misc/cxl/guest.c:890:5-26: WARNING: Comparison to bool drivers/misc/cxl/irq.c:107:3-23: WARNING: Assignment of bool to 0/1 drivers/misc/cxl/native.c:57:2-3: Unneeded semicolon drivers/misc/cxl/native.c:170:2-3: Unneeded semicolon Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Reviewed-by: Matthew R. Ochs <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-18cxl: Fix coredump generation when cxl_get_fd() is usedFrederic Barrat4-31/+135
If a process dumps core while owning a cxl file descriptor obtained from an AFU driver (e.g. cxlflash) through the cxl_get_fd() API, the following error occurs: [ 868.027591] Unable to handle kernel paging request for data at address ... [ 868.027778] Faulting instruction address: 0xc00000000035edb0 cpu 0x8c: Vector: 300 (Data Access) at [c000003c688275e0] pc: c00000000035edb0: elf_core_dump+0xd60/0x1300 lr: c00000000035ed80: elf_core_dump+0xd30/0x1300 sp: c000003c68827860 msr: 9000000100009033 dar: c dsisr: 40000000 current = 0xc000003c68780000 paca = 0xc000000001b73200 softe: 0 irq_happened: 0x01 pid = 46725, comm = hxesurelock enter ? for help [c000003c68827a60] c00000000036948c do_coredump+0xcec/0x11e0 [c000003c68827c20] c0000000000ce9e0 get_signal+0x540/0x7b0 [c000003c68827d10] c000000000017354 do_signal+0x54/0x2b0 [c000003c68827e00] c00000000001777c do_notify_resume+0xbc/0xd0 [c000003c68827e30] c000000000009838 ret_from_except_lite+0x64/0x68 --- Exception: 300 (Data Access) at 00003fff98ad2918 The root cause is that the address_space structure for the file doesn't define a 'host' member. When cxl allocates a file descriptor, it's using the anonymous inode to back the file, but allocates a private address_space for each context. The private address_space allows to track memory allocation for each context. cxl doesn't define the 'host' member of the address space, i.e. the inode. We don't want to define it as the anonymous inode, since there's no longer a 1-to-1 relation between address_space and inode. To fix it, instead of using the anonymous inode, we introduce a simple pseudo filesystem so that cxl can allocate its own inodes. So we now have one inode for each file and address_space. The pseudo filesystem is only mounted on the first allocation of a file descriptor by cxl_get_fd(). Tested with cxlflash. Signed-off-by: Frederic Barrat <[email protected]> Reviewed-by: Matthew R. Ochs <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-18cxl: Do adapter fence check before handling afu interruptVaibhav Jain1-3/+12
If an afu interrupt is in flight when an eeh error is triggered the control still reaches the function native_irq_multiplexed and the PE-Handle read from the CXL_PSL_PEHandle_An register is 0xffff. The function then erroneously assumes that the interrupt belonged to a detached context and generates a warning with full stack dump in the kernel log complaining: "Unable to demultiplex CXL PSL IRQ for PE 65535 DSISR ffffffff DAR ffffffff. (Possible AFU HW issue - was a term/remove acked with outstanding transactions" To fix this the patch adds new code to the function native_irq_multiplexed function to compares the read value of register CXL_PSL_PEHandle_An to ~0ULL. If true then logs a warning message saying that the interrupt is being ignored and returns IRQ_HANDLED from the irq handler. Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Vaibhav Jain <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-18cxl: Fix error handling in _cxl_pci_associate_default_context()Christophe Jaillet2-2/+2
'cxl_dev_context_init()' returns an error pointer in case of error, not NULL. So test it with IS_ERR. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-18cxl: Fix error handling in _cxl_cx4_setup_msi_irqs()Christophe Jaillet1-1/+1
'cxl_dev_context_init()' returns an error pointer in case of error, not NULL. So test it with IS_ERR. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-18cxl: Fix memory allocation failure testChristophe Jaillet1-5/+2
'cxl_context_alloc()' does not return an error pointer. It is just a shortcut for a call to 'kzalloc' with 'sizeof(struct cxl_context)' as the size parameter. So its return value should be compared with NULL. While fixing it, simplify a bit the code. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-10-24cxl: Fix leaking pid refs in some error pathsVaibhav Jain2-9/+15
In some error paths in functions cxl_start_context and afu_ioctl_start_work pid references to the current & group-leader tasks can leak after they are taken. This patch fixes these error paths to release these pid references before exiting the error path. Fixes: 7b8ad495d592 ("cxl: Fix DSI misses when the context owning task exits") Cc: [email protected] # v4.5+ Reviewed-by: Andrew Donnellan <[email protected]> Reported-by: Frederic Barrat <[email protected]> Signed-off-by: Vaibhav Jain <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-10-19cxl: Prevent adapter reset if an active context existsVaibhav Jain8-5/+116
This patch prevents resetting the cxl adapter via sysfs in presence of one or more active cxl_context on it. This protects against an unrecoverable error caused by PSL owning a dirty cache line even after reset and host tries to touch the same cache line. In case a force reset of the card is required irrespective of any active contexts, the int value -1 can be stored in the 'reset' sysfs attribute of the card. The patch introduces a new atomic_t member named contexts_num inside struct cxl that holds the number of active context attached to the card , which is checked against '0' before proceeding with the reset. To prevent against a race condition where a context is activated just after reset check is performed, the contexts_num is atomically set to '-1' after reset-check to indicate that no more contexts can be activated on the card anymore. Before activating a context we atomically test if contexts_num is non-negative and if so, increment its value by one. In case the value of contexts_num is negative then it indicates that the card is about to be reset and context activation is error-ed out at that point. Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card") Cc: [email protected] # v4.0+ Acked-by: Frederic Barrat <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Vaibhav Jain <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-10-04cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put()Andrew Donnellan1-5/+3
Rewrite the cxl_guest_init_afu() loop in cxl_of_probe() to use for_each_child_of_node() rather than a hand-coded for loop. Remove the useless of_node_put(afu_np) call after the loop, where it's guaranteed that afu_np == NULL. Reported-by: SF Markus Elfring <[email protected]> Reported-by: Julia Lawall <[email protected]> Signed-off-by: Andrew Donnellan <[email protected]> Reviewed-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-10-04cxl: Flush PSL cache before resetting the adapterFrederic Barrat3-1/+39
If the capi link is going down while the PSL owns a dirty cache line, any access from the host for that data could lead to an Uncorrectable Error. So when resetting the capi adapter through sysfs, make sure the PSL cache is flushed. It won't help if there are any active Process Elements on the card, as the cache would likely get new dirty cache lines immediately, but if resetting an idle adapter, it should avoid any bad surprises from data left over from terminated Process Elements. Signed-off-by: Frederic Barrat <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-09-13cxl: Fix informational messageFrederic Barrat1-2/+2
When set_sl_ops() is called, the adapter data structure is not fully initialized yet. Therefore the device name is not showing up in the trace. Fix is simply to get the device name from the pci_dev structure. Fixes: 6d382616ac22 ("cxl: Abstract the differences between the PSL and XSL") Signed-off-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-08-22cxl: use pcibios_free_controller_deferred() when removing vPHBsAndrew Donnellan1-1/+9
When cxl removes a vPHB, it's possible that the pci_controller may be freed before all references to the devices on the vPHB have been released. This in turn causes an invalid memory access when the devices are eventually released, as pcibios_release_device() attempts to call the phb's release_device hook. In cxl_pci_vphb_remove(), remove the existing call to pcibios_free_controller(). Instead, use pcibios_free_controller_deferred() to free the pci_controller after all devices have been released. Export pci_set_host_bridge_release() so we can do this. Cc: [email protected] Signed-off-by: Andrew Donnellan <[email protected]> Reviewed-by: Matthew R. Ochs <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2016-08-10cxl: Set psl_fir_cntl to production environment valueFrederic Barrat1-3/+6
Switch the setting of psl_fir_cntl from debug to production environment recommended value. It mostly affects the PSL behavior when an error is raised in psl_fir1/2. Tested with cxlflash. Signed-off-by: Frederic Barrat <[email protected]> Reviewed-by: Uma Krishnan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-08-09cxl: Fix sparse warningsAndrew Donnellan2-2/+2
Make native_irq_wait() static and use NULL rather than 0 to initialise phb->cfg_data in cxl_pci_vphb_add() to remove sparse warnings. Signed-off-by: Andrew Donnellan <[email protected]> Reviewed-by: Matthew R. Ochs <[email protected]> Acked-by: Ian Munsie <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-08-09cxl: Fix NULL dereference in cxl_context_init() on PowerVM guestsAndrew Donnellan3-4/+4
Commit f67a6722d650 ("cxl: Workaround PE=0 hardware limitation in Mellanox CX4") added a "min_pe" field to struct cxl_service_layer_ops, to allow us to work around a Mellanox CX-4 hardware limitation. When allocating the PE number in cxl_context_init(), we read from ctx->afu->adapter->native->sl_ops->min_pe to get the minimum PE number. Unsurprisingly, in a PowerVM guest ctx->afu->adapter->native is NULL, and guests don't have a cxl_service_layer_ops struct anywhere. Move min_pe from struct cxl_service_layer_ops to struct cxl so it's accessible in both native and PowerVM environments. For the Mellanox CX-4, set the min_pe value in set_sl_ops(). Fixes: f67a6722d650 ("cxl: Workaround PE=0 hardware limitation in Mellanox CX4") Reported-by: Frederic Barrat <[email protected]> Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Ian Munsie <[email protected]> Reviewed-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-19cxl: fix potential NULL dereference in free_adapter()Andrew Donnellan1-7/+9
If kzalloc() fails when allocating adapter->guest in cxl_guest_init_adapter(), we call free_adapter() before erroring out. free_adapter() in turn attempts to dereference adapter->guest, which in this case is NULL. In free_adapter(), skip the adapter->guest cleanup if adapter->guest is NULL. Fixes: 14baf4d9c739 ("cxl: Add guest-specific code") Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-19cxl: remove dead Kconfig optionsAndrew Donnellan1-10/+0
Remove the CXL_KERNEL_API and CXL_EEH Kconfig options, as they were only needed to coordinate the merging of the cxlflash driver. Also remove the stub implementation of cxl_perst_reloads_same_image() in cxlflash which is only used if CXL_EEH isn't defined (i.e. never). Suggested-by: Ian Munsie <[email protected]> Signed-off-by: Andrew Donnellan <[email protected]> Acked-by: Ian Munsie <[email protected]> Acked-by: Matthew R. Ochs <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cardsAndrew Donnellan2-18/+226
Add a new API, cxl_check_and_switch_mode() to allow for switching of bi-modal CAPI cards, such as the Mellanox CX-4 network card. When a driver requests to switch a card to CAPI mode, use PCI hotplug infrastructure to remove all PCI devices underneath the slot. We then write an updated mode control register to the CAPI VSEC, hot reset the card, and reprobe the card. As the card may present a different set of PCI devices after the mode switch, use the infrastructure provided by the pnv_php driver and the OPAL PCI slot management facilities to ensure that: * the old devices are removed from both the OPAL and Linux device trees * the new devices are probed by OPAL and added to the OPAL device tree * the new devices are added to the Linux device tree and probed through the regular PCI device probe path As such, introduce a new option, CONFIG_CXL_BIMODAL, with a dependency on the pnv_php driver. Refactor existing code that touches the mode control register in the regular single mode case into a new function, setup_cxl_protocol_area(). Co-authored-by: Ian Munsie <[email protected]> Cc: Gavin Shan <[email protected]> Signed-off-by: Andrew Donnellan <[email protected]> Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Workaround PE=0 hardware limitation in Mellanox CX4Ian Munsie3-1/+4
The CX4 card cannot cope with a context with PE=0 due to a hardware limitation, resulting in: [ 34.166577] command failed, status limits exceeded(0x8), syndrome 0x5a7939 [ 34.166580] mlx5_core 0000:01:00.1: Failed allocating uar, aborting Since the kernel API allocates a default context very early during device init that will almost certainly get Process Element ID 0 there is no easy way for us to extend the API to allow the Mellanox to inform us of this limitation ahead of time. Instead, work around the issue by extending the XSL structure to include a minimum PE to allocate. Although the bug is not in the XSL, it is the easiest place to work around this limitation given that the CX4 is currently the only card that uses an XSL. Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Reviewed-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Add support for interrupts on the Mellanox CX4Ian Munsie4-0/+108
The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where interrupts are routed from the networking hardware to the XSL using the MSIX table, and from there will be transformed back into an MSIX interrupt using the cxl style interrupts (i.e. using IVTE entries and ranges to map a PE and AFU interrupt number to an MSIX address). We want to hide the implementation details of cxl interrupts as much as possible. To this end, we use a special version of the MSI setup & teardown routines in the PHB while in cxl mode to allocate the cxl interrupts and configure the IVTE entries in the process element. This function does not configure the MSIX table - the CX4 card uses a custom format in that table and it would not be appropriate to fill that out in generic code. The rest of the functionality is similar to the "Full MSI-X mode" described in the CAIA, and this could be easily extended to support other adapters that use that mode in the future. The interrupts will be associated with the default context. If the maximum number of interrupts per context has been limited (e.g. by the mlx5 driver), it will automatically allocate additional kernel contexts to associate extra interrupts as required. These contexts will be started using the same WED that was used to start the default context. Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Add preliminary workaround for CX4 interrupt limitationIan Munsie5-0/+44
The Mellanox CX4 has a hardware limitation where only 4 bits of the AFU interrupt number can be passed to the XSL when sending an interrupt, limiting it to only 15 interrupts per context (AFU interrupt number 0 is invalid). In order to overcome this, we will allocate additional contexts linked to the default context as extra address space for the extra interrupts - this will be implemented in the next patch. This patch adds the preliminary support to allow this, by way of adding a linked list in the context structure that we use to keep track of the contexts dedicated to interrupts, and an API to simultaneously iterate over the related context structures, AFU interrupt numbers and hardware interrupt numbers. The point of using a single API to iterate these is to hide some of the details of the iteration from external code, and to reduce the number of APIs that need to be exported via base.c to allow built in code to call. Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Frederic Barrat <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Add kernel APIs to get & set the max irqs per contextIan Munsie1-0/+27
These APIs will be used by the Mellanox CX4 support. While they function standalone to configure existing behaviour, their primary purpose is to allow the Mellanox driver to inform the cxl driver of a hardware limitation, which will be used in a future patch. Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Frederic Barrat <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Add support for using the kernel API with a real PHBIan Munsie3-2/+22
This hooks up support for using the kernel API with a real PHB. After the AFU initialisation has completed it calls into the PHB code to pass it the AFU that will be used by other peer physical functions on the adapter. The cxl_pci_to_afu API is extended to work with peer PCI devices, retrieving the peer AFU from the PHB. This API may also now return an error if it is called on a PCI device that is not associated with either a cxl vPHB or a peer PCI device to an AFU, and this error is propagated down. Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Do not create vPHB if there are no AFU configuration recordsIan Munsie2-0/+14
The vPHB model of the cxl kernel API is a hierarchy where the AFU is represented by the vPHB, and it's AFU configuration records are exposed as functions under that vPHB. If there are no AFU configuration records we will create a vPHB with nothing under it, which is a waste of resources and will opt us into EEH handling despite not having anything special to handle. This also does not make sense for cards using the peer model of the cxl kernel API, where the other functions of the device are exposed via additional peer physical functions rather than AFU configuration records. This model will also not work with the existing EEH handling in the cxl driver, as that is designed around the vPHB model. Skip creating the vPHB for AFUs without any AFU configuration records, and opt out of EEH handling for them. Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-07-14cxl: Allow a default context to be associated with an external pci_devIan Munsie6-28/+91
The cxl kernel API has a concept of a default context associated with each PCI device under the virtual PHB. The Mellanox CX4 will also use the cxl kernel API, but it does not use a virtual PHB - rather, the AFU appears as a physical function as a peer to the networking functions. In order to allow the kernel API to work with those networking functions, we will need to associate a default context with them as well. To this end, refactor the corresponding code to do this in vphb.c and export it so that it can be called from the PHB code. Signed-off-by: Ian Munsie <[email protected]> Reviewed-by: Frederic Barrat <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>