aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-07-26powerpc/mm/radix: Workaround prefetch issue with KVMBenjamin Herrenschmidt6-22/+154
There's a somewhat architectural issue with Radix MMU and KVM. When coming out of a guest with AIL (Alternate Interrupt Location, ie, MMU enabled), we start executing hypervisor code with the PID register still containing whatever the guest has been using. The problem is that the CPU can (and will) then start prefetching or speculatively load from whatever host context has that same PID (if any), thus bringing translations for that context into the TLB, which Linux doesn't know about. This can cause stale translations and subsequent crashes. Fixing this in a way that is neither racy nor a huge performance impact is difficult. We could just make the host invalidations always use broadcast forms but that would hurt single threaded programs for example. We chose to fix it instead by partitioning the PID space between guest and host. This is possible because today Linux only use 19 out of the 20 bits of PID space, so existing guests will work if we make the host use the top half of the 20 bits space. We additionally add support for a property to indicate to Linux the size of the PID register which will be useful if we eventually have processors with a larger PID space available. There is still an issue with malicious guests purposefully setting the PID register to a value in the hosts PID range. Hopefully future HW can prevent that, but in the meantime, we handle it with a pair of kludges: - On the way out of a guest, before we clear the current VCPU in the PACA, we check the PID and if it's outside of the permitted range we flush the TLB for that PID. - When context switching, if the mm is "new" on that CPU (the corresponding bit was set for the first time in the mm cpumask), we check if any sibling thread is in KVM (has a non-NULL VCPU pointer in the PACA). If that is the case, we also flush the PID for that CPU (core). This second part is needed to handle the case where a process is migrated (or starts a new pthread) on a sibling thread of the CPU coming out of KVM, as there's a window where stale translations can exist before we detect it and flush them out. A future optimization could be added by keeping track of whether the PID has ever been used and avoid doing that for completely fresh PIDs. We could similarily mark PIDs that have been the subject of a global invalidation as "fresh". But for now this will do. Signed-off-by: Benjamin Herrenschmidt <[email protected]> [mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop unneeded include of kvm_book3s_asm.h] Signed-off-by: Michael Ellerman <[email protected]>
2017-07-25net: ethernet: nb8800: Handle all 4 RGMII modes identicallyMarc Gonzalez1-5/+4
Before commit bf8f6952a233 ("Add blurb about RGMII") it was unclear whose responsibility it was to insert the required clock skew, and in hindsight, some PHY drivers got it wrong. The solution forward is to introduce a new property, explicitly requiring skew from the node to which it is attached. In the interim, this driver will handle all 4 RGMII modes identically (no skew). Fixes: 52dfc8301248 ("net: ethernet: add driver for Aurora VLSI NB8800 Ethernet controller") Signed-off-by: Marc Gonzalez <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25Revert "netvsc: optimize calculation of number of slots"stephen hemminger1-10/+33
The logic for computing page buffer scatter does not take into account the impact of compound pages. Therefore the optimization to compute number of slots was incorrect and could cause stack corruption a skb was sent with lots of fragments from huge pages. This reverts commit 60b86665af0dfbeebda8aae43f0ba451cd2dcfe5. Fixes: 60b86665af0d ("netvsc: optimize calculation of number of slots") Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25ftgmac100: return error in ftgmac100_alloc_rx_bufJoel Stanley1-2/+2
The error paths set err, but it's not returned. I wondered if we should fix all of the callers to check the returned value, but Ben explains why the code is this way: > Most call sites ignore it on purpose. There's nothing we can do if > we fail to get a buffer at interrupt time, so we point the buffer to > the scratch page so the HW doesn't DMA into lalaland and lose the > packet. > > The one call site that tests and can fail is the one used when brining > the interface up. If we fail to allocate at that point, we fail the > ifup. But as you noticed, I do have a bug not returning the error. Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Joel Stanley <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment()Stefano Brivio1-4/+0
RFC 2465 defines ipv6IfStatsOutFragFails as: "The number of IPv6 datagrams that have been discarded because they needed to be fragmented at this output interface but could not be." The existing implementation, instead, would increase the counter twice in case we fail to allocate room for single fragments: once for the fragment, once for the datagram. This didn't look intentional though. In one of the two affected affected failure paths, the double increase was simply a result of a new 'goto fail' statement, introduced to avoid a skb leak. The other path appears to be affected since at least 2.6.12-rc2. Reported-by: Sabrina Dubroca <[email protected]> Fixes: 1d325d217c7f ("ipv6: ip6_fragment: fix headroom tests and skb leak") Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25Merge tag 'scsi-fixes' of ↵Linus Torvalds3-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small fixes. The transfer size fixes are actually correcting some performance drops on the hpsa and smartpqi cards. The cards actually have an internal cache for request speed up but bypass it for transfers > 1MB. Since 4.3 the efficiency of our merges has rendered the cache mostly unused, so limit transfers to under 1MB to recover the cache boost" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sg: fix static checker warning in sg_is_valid_dxfer scsi: smartpqi: limit transfer length to 1MB scsi: hpsa: limit transfer length to 1MB
2017-07-25Merge tag 'uuid-for-4.13-2' of git://git.infradead.org/users/hch/uuidLinus Torvalds5-27/+13
Pull uuid fixes from Christoph Hellwig: - add a missing "!" in the uuid tests - remove the last remaining user of the uuid_be type, and then the type and its helpers * tag 'uuid-for-4.13-2' of git://git.infradead.org/users/hch/uuid: uuid: remove uuid_be thunderbolt: use uuid_t instead of uuid_be uuid: fix incorrect uuid_equal conversion in test_uuid_test
2017-07-25Merge tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds8-81/+180
Pull dma mapping fixes from Christoph Hellwig: "split the global dma coherent pool from the per-device pool. This fixes a regression in the earlier 4.13 pull requests where the global pool would override a per-device CMA pool (Vladimir Murzin)" * tag 'dma-mapping-4.13-2' of git://git.infradead.org/users/hch/dma-mapping: ARM: NOMMU: Wire-up default DMA interface dma-coherent: introduce interface for default DMA pool
2017-07-25MD: fix warnning for UP caseShaohua Li1-1/+1
spin_is_locked always returns 0 for UP case, so ignores it Reported-by: Joshua Kinard <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2017-07-25parisc: Extend disabled preemption in copy_user_pageJohn David Anglin1-1/+1
It's always bothered me that we only disable preemption in copy_user_page around the call to flush_dcache_page_asm. This patch extends this to after the copy. Signed-off-by: John David Anglin <[email protected]> Cc: [email protected] # 4.9+ Signed-off-by: Helge Deller <[email protected]>
2017-07-25parisc: Prevent TLB speculation on flushed pages on CPUs that only support ↵John David Anglin1-20/+14
equivalent aliases Helge noticed that we flush the TLB page in flush_cache_page but not in flush_cache_range or flush_cache_mm. For a long time, we have had random segmentation faults building packages on machines with PA8800/8900 processors. These machines only support equivalent aliases. We don't see these faults on machines that don't require strict coherency. So, it appears TLB speculation sometimes leads to cache corruption on machines that require coherency. This patch adds TLB flushes to flush_cache_range and flush_cache_mm when coherency is required. We only flush the TLB in flush_cache_page when coherency is required. The patch also optimizes flush_cache_range. It turns out we always have the right context to use flush_user_dcache_range_asm and flush_user_icache_range_asm. The patch has been tested for some time on rp3440, rp3410 and A500-44. It's been boot tested on c8000. No random segmentation faults were observed during testing. Signed-off-by: John David Anglin <[email protected]> Cc: [email protected] # 4.9+ Signed-off-by: Helge Deller <[email protected]>
2017-07-25Merge branch 'stable/for-jens-4.13' of ↵Jens Axboe1-9/+12
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus Pull xen-blkfront fixes from Konrad for 4.13.
2017-07-25ALSA: hda - Add mute led support for HP ProBook 440 G4Kai-Heng Feng1-0/+1
Mic mute led does not work on HP ProBook 440 G4. We can use CXT_FIXUP_MUTE_LED_GPIO fixup to support it. BugLink: https://bugs.launchpad.net/bugs/1705586 Signed-off-by: Kai-Heng Feng <[email protected]> Cc: <[email protected]> # v4.12+ Signed-off-by: Takashi Iwai <[email protected]>
2017-07-25drm/amd/powerplay: fix AVFS voltage offset for Vega10Eric Huang1-9/+3
Signed-off-by: Eric Huang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-25drm/amdgpu/gfx9: simplify and fix GRBM index selectionNicolai Hähnle1-11/+13
Copy the approach taken by gfx8, which simplifies the code, and set the instance index properly. The latter is required for debugging, e.g. for reading wave status by UMR. Signed-off-by: Nicolai Hähnle <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-25drm/amdgpu: Fix blocking in RCU critical section(v2)Alex Xie1-3/+7
In RCU read-side critical sections, blocking or sleeping is prohibited. v2: Unlock RCU for the code path where result==NULL. (David Zhou) Update subject Tested-by and reported by: Dave Airlie <[email protected]> [ 141.965723] ============================= [ 141.965724] WARNING: suspicious RCU usage [ 141.965726] 4.12.0-rc7 #221 Not tainted [ 141.965727] ----------------------------- [ 141.965728] /home/airlied/devel/kernel/linux-2.6/include/linux/rcupdate.h:531 Illegal context switch in RCU read-side critical section! [ 141.965730] other info that might help us debug this: [ 141.965731] rcu_scheduler_active = 2, debug_locks = 0 [ 141.965732] 1 lock held by amdgpu_cs:0/1332: [ 141.965733] #0: (rcu_read_lock){......}, at: [<ffffffffa01a0d07>] amdgpu_bo_list_get+0x0/0x109 [amdgpu] [ 141.965774] stack backtrace: [ 141.965776] CPU: 6 PID: 1332 Comm: amdgpu_cs:0 Not tainted 4.12.0-rc7 #221 [ 141.965777] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 R2.0, BIOS 2603 06/26/2015 [ 141.965778] Call Trace: [ 141.965782] dump_stack+0x68/0x92 [ 141.965785] lockdep_rcu_suspicious+0xf7/0x100 [ 141.965788] ___might_sleep+0x56/0x1fc [ 141.965790] __might_sleep+0x68/0x6f [ 141.965793] __mutex_lock+0x4e/0x7b5 [ 141.965817] ? amdgpu_bo_list_get+0xa4/0x109 [amdgpu] [ 141.965820] ? lock_acquire+0x125/0x1b9 [ 141.965844] ? amdgpu_bo_list_set+0x464/0x464 [amdgpu] [ 141.965846] mutex_lock_nested+0x16/0x18 [ 141.965848] ? mutex_lock_nested+0x16/0x18 [ 141.965872] amdgpu_bo_list_get+0xa4/0x109 [amdgpu] [ 141.965895] amdgpu_cs_ioctl+0x4a0/0x17dd [amdgpu] [ 141.965898] ? radix_tree_node_alloc.constprop.11+0x77/0xab [ 141.965916] drm_ioctl+0x264/0x393 [drm] [ 141.965939] ? amdgpu_cs_find_mapping+0x83/0x83 [amdgpu] [ 141.965942] ? trace_hardirqs_on_caller+0x16a/0x186 Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-25nbd: clear disconnected on reconnectJosef Bacik1-0/+2
If our device loses its connection for longer than the dead timeout we will set NBD_DISCONNECTED in order to quickly fail any pending IO's that flood in after the IO's that were waiting during the dead timer. However if we re-connect at some point in the future we'll still see this DISCONNECTED flag set if we then lose our connection again after that, which means we won't get notifications for our newly lost connections. Fix this by just clearing the DISCONNECTED flag on reconnect in order to make sure everything works as it's supposed to. Reported-by: Dan Melnic <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-07-25parisc: Suspend lockup detectors before system haltHelge Deller1-0/+2
Some machines can't power off the machine, so disable the lockup detectors to avoid this watchdog BUG to show up every few seconds: watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [systemd-shutdow:1] Signed-off-by: Helge Deller <[email protected]> Cc: [email protected] # 4.9+
2017-07-25parisc: Show DIMM slot number which holds broken memory moduleHelge Deller1-5/+14
The Page Deallocation Table (PDT) holds the physical addresses of all broken memory addresses. With the physical address we now are able to show which DIMM slot (e.g. 1a, 3c) actually holds the broken memory module so that users are able to replace it. Signed-off-by: Helge Deller <[email protected]>
2017-07-25lib: test_rhashtable: Fix KASAN warningPhil Sutter1-1/+3
I forgot one spot when introducing struct test_obj_val. Fixes: e859afe1ee0c5 ("lib: test_rhashtable: fix for large entry counts") Reported by: kernel test robot <[email protected]> Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25net: phy: Remove trailing semicolon in macro definitionMarc Gonzalez1-1/+1
Commit e5a03bfd873c2 ("phy: Add an mdio_device structure") introduced a spurious trailing semicolon. Remove it. Signed-off-by: Marc Gonzalez <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25dm zoned: remove test for impossible REQ_OP_FLUSH conditionsMikulas Patocka1-2/+2
The value REQ_OP_FLUSH is only used by the block code for request-based devices. Remove the tests for REQ_OP_FLUSH from the bio-based dm-zoned-target. Signed-off-by: Mikulas Patocka <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25dm raid: bump target versionHeinz Mauelshagen2-1/+2
Bumo dm-raid target version to 1.12.1 to reflect that commit cc27b0c78c ("md: fix deadlock between mddev_suspend() and md_write_start()") is available. This version change allows userspace to detect that MD fix is available. Signed-off-by: Heinz Mauelshagen <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25dm raid: avoid mddev->suspended accessHeinz Mauelshagen1-5/+7
Use runtime flag to ensure that an mddev gets suspended/resumed just once. Signed-off-by: Heinz Mauelshagen <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25dm raid: fix activation check in validate_raid_redundancy()Heinz Mauelshagen1-5/+5
During growing reshapes (i.e. stripes being added to a raid set), the new stripe images are not in-sync and not part of the raid set until the reshape is started. LVM2 has to request multiple table reloads involving superblock updates in order to reflect proper size of SubLVs in the cluster. Before a stripe adding reshape starts, validate_raid_redundancy() fails as a result of that because it checks the total number of devices against the number of rebuild ones rather than the actual ones in the raid set (as retrieved from the superblock) thus resulting in failed raid4/5/6/10 redundancy checks. E.g. convert 3 stripes -> 7 stripes raid5 (which only allows for maximum 1 device to fail) requesting +4 delta disks causing 4 devices to rebuild during reshaping thus failing activation. To fix this, move validate_raid_redundancy() to get access to the current raid_set members. Signed-off-by: Heinz Mauelshagen <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25dm raid: remove WARN_ON() in raid10_md_layout_to_format()Heinz Mauelshagen1-2/+3
Signed-off-by: Heinz Mauelshagen <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25workqueue: implicit ordered attribute should be overridableTejun Heo2-5/+12
5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered") automatically enabled ordered attribute for unbound workqueues w/ max_active == 1. Because ordered workqueues reject max_active and some attribute changes, this implicit ordered mode broke cases where the user creates an unbound workqueue w/ max_active == 1 and later explicitly changes the related attributes. This patch distinguishes explicit and implicit ordered setting and overrides from attribute changes if implict. Signed-off-by: Tejun Heo <[email protected]> Fixes: 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
2017-07-25parisc: Add function to return DIMM slot of physical addressHelge Deller2-1/+40
Add a firmware wrapper function, which asks PDC firmware for the DIMM slot of a physical address. This is needed to show users which DIMM module needs replacement in case a broken DIMM was encountered. Signed-off-by: Helge Deller <[email protected]>
2017-07-25udp: preserve head state for IP_CMSG_PASSSECPaolo Abeni2-31/+40
Paul Moore reported a SELinux/IP_PASSSEC regression caused by missing skb->sp at recvmsg() time. We need to preserve the skb head state to process the IP_CMSG_PASSSEC cmsg. With this commit we avoid releasing the skb head state in the BH even if a secpath is attached to the current skb, and stores the skb status (with/without head states) in the scratch area, so that we can access it at skb deallocation time, without incurring in cache-miss penalties. This also avoids misusing the skb CB for ipv6 packets, as introduced by the commit 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing"). Clean a bit the scratch area helpers implementation, to reduce the code differences between 32 and 64 bits build. Reported-by: Paul Moore <[email protected]> Fixes: 0a463c78d25b ("udp: avoid a cache miss on dequeue") Fixes: 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing") Signed-off-by: Paolo Abeni <[email protected]> Tested-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25parisc: Fix crash when calling PDC_PAT_MEM PDT firmware functionHelge Deller2-3/+12
Commit c9c2877d08d9 ("parisc: Add Page Deallocation Table (PDT) support") introduced the pdc_pat_mem_read_pd_pdt() firmware helper function, which crashed the system because it trashed the stack if the pdc_pat_mem_read_pd_retinfo struct was located on the stack (and which is in size less than the required 32 64-bit values). Fix it by using the pdc_result struct instead when calling firmware and copy the return values back into the result struct when finished sucessfully. While debugging this code I noticed that the pdc_type wasn't set correctly either, so let's fix that too. Fixes: c9c2877d08d9 ("parisc: Add Page Deallocation Table (PDT) support") Signed-off-by: Helge Deller <[email protected]>
2017-07-25nvme-pci: fix HMB size calculationChristoph Hellwig1-3/+3
It's possible the preferred HMB size may not be a multiple of the chunk_size. This patch moves len to function scope and uses that in the for loop increment so the last iteration doesn't cause the total size to exceed the allocated HMB size. Based on an earlier patch from Keith Busch. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Dan Carpenter <[email protected]> Reviewed-by: Keith Busch <[email protected]> Fixes: 87ad72a59a38 ("nvme-pci: implement host memory buffer support")
2017-07-25nvme-fc: revise TRADDR parsingJames Smart3-97/+125
The FC-NVME spec hasn't locked down on the format string for TRADDR. Currently the spec is lobbying for "nn-<16hexdigits>:pn-<16hexdigits>" where the wwn's are hex values but not prefixed by 0x. Most implementations so far expect a string format of "nn-0x<16hexdigits>:pn-0x<16hexdigits>" to be used. The transport uses the match_u64 parser which requires a leading 0x prefix to set the base properly. If it's not there, a match will either fail or return a base 10 value. The resolution in T11 is pushing out. Therefore, to fix things now and to cover any eventuality and any implementations already in the field, this patch adds support for both formats. The change consists of replacing the token matching routine with a routine that validates the fixed string format, and then builds a local copy of the hex name with a 0x prefix before calling the system parser. Note: the same parser routine exists in both the initiator and target transports. Given this is about the only "shared" item, we chose to replicate rather than create an interdendency on some shared code. Signed-off-by: James Smart <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2017-07-25nvme-fc: address target disconnect race conditions in fcp io submitJames Smart1-8/+11
There are cases where threads are in the process of submitting new io when the LLDD calls in to remove the remote port. In some cases, the next io actually goes to the LLDD, who knows the remoteport isn't present and rejects it. To properly recovery/restart these i/o's we don't want to hard fail them, we want to treat them as temporary resource errors in which a delayed retry will work. Add a couple more checks on remoteport connectivity and commonize the busy response handling when it's seen. Signed-off-by: James Smart <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2017-07-25nvme: fabrics commands should use the fctype field for data directionJon Derrick1-1/+1
Fabrics commands with opcode 0x7F use the fctype field to indicate data direction. Signed-off-by: Jon Derrick <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Fixes: eb793e2c ("nvme.h: add NVMe over Fabrics definitions")
2017-07-25nvme: also provide a UUID in the WWID sysfs attributeJohannes Thumshirn1-0/+3
The WWID sysfs attribute can provide multiple means of a World Wide ID for a NVMe device. It can either be a NGUID, a EUI-64 or a concatenation of VID, Serial Number, Model and the Namespace ID in this order of preference. If the target also sends us a UUID use the UUID for identification and give it the highest priority. This eases generation of /dev/disk/by-* symlinks. Signed-off-by: Johannes Thumshirn <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2017-07-25Merge tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggyLinus Torvalds3-12/+20
Pull JFS fixes from David Kleikamp. * tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggy: jfs: preserve i_mode if __jfs_set_acl() fails jfs: Don't clear SGID when inheriting ACLs jfs: atomically read inode size
2017-07-25Merge branch 'for-linus' of ↵Linus Torvalds4-8/+16
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID fixes from Jiri Kosina: - regression fix (missing IRQs) for devices that require 'always poll' quirk, from Dmitry Torokhov - new device ID addition to Ortek driver, from Benjamin Tissoires * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: ortek: add one more buggy device HID: usbhid: fix "always poll" quirk
2017-07-25Merge branch 'for-linus' of ↵Linus Torvalds3-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Three bug fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: set change and reference bit on lazy key enablement s390: chp: handle CRW_ERC_INIT for channel-path status change s390/perf: fix problem state detection
2017-07-25xfs: check that dir block entries don't off the end of the bufferDarrick J. Wong1-0/+4
When we're checking the entries in a directory buffer, make sure that the entry length doesn't push us off the end of the buffer. Found via xfs/388 writing ones to the length fields. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2017-07-25xen/blkfront: always allocate grants first from per-queue persistent grantsDongli Zhang1-8/+11
This patch partially reverts 3df0e50 ("xen/blkfront: pseudo support for multi hardware queues/rings"). The xen-blkfront queue/ring might hang due to grants allocation failure in the situation when gnttab_free_head is almost empty while many persistent grants are reserved for this queue/ring. As persistent grants management was per-queue since 73716df ("xen/blkfront: make persistent grants pool per-queue"), we should always allocate from persistent grants first. Acked-by: Roger Pau Monné <[email protected]> Signed-off-by: Dongli Zhang <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2017-07-25xen-blkfront: fix mq start/stop raceJunxiao Bi1-1/+1
When ring buf full, hw queue will be stopped. While blkif interrupt consume request and make free space in ring buf, hw queue will be started again. But since start queue is protected by spin lock while stop not, that will cause a race. interrupt: process: blkif_interrupt() blkif_queue_rq() kick_pending_request_queues_locked() blk_mq_start_stopped_hw_queues() clear_bit(BLK_MQ_S_STOPPED, &hctx->state) blk_mq_stop_hw_queue(hctx) blk_mq_run_hw_queue(hctx, async) If ring buf is made empty in this case, interrupt will never come, then the hw queue will be stopped forever, all processes waiting for the pending io in the queue will hung. Signed-off-by: Junxiao Bi <[email protected]> Reviewed-by: Ankur Arora <[email protected]> Acked-by: Roger Pau Monné <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2017-07-25dm bufio: fix error code in dm_bufio_write_dirty_buffers()Dan Carpenter1-2/+1
We should be returning normal negative error codes here. The "a" variables comes from &c->async_write_error which is a blk_status_t converted to a regular error code. In the current code, the blk_status_t gets propogated back to pool_create() and eventually results in an Oops. Fixes: 4e4cbee93d56 ("block: switch bios to blk_status_t") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25dm integrity: test for corrupted disk format during table loadMikulas Patocka1-0/+5
If the dm-integrity superblock was corrupted in such a way that the journal_sections field was zero, the integrity target would deadlock because it would wait forever for free space in the journal. Detect this situation and refuse to activate the device. Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Fixes: 7eada909bfd7 ("dm: add integrity target") Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25dm integrity: WARN_ON if variables representing journal usage get out of syncMikulas Patocka1-0/+2
If this WARN_ON triggers it speaks to programmer error, and likely implies corruption, but no released kernel should trigger it. This WARN_ON serves to assist DM integrity developers as changes are made/tested in the future. BUG_ON is excessive for catching programmer error, if a user or developer would like warnings to trigger a panic, they can enable that via /proc/sys/kernel/panic_on_warn Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-25virtio-net: fix module unloadingAndrew Jones1-1/+1
Unregister the driver before removing multi-instance hotplug callbacks. This order avoids the warning issued from __cpuhp_remove_state_cpuslocked when the number of remaining instances isn't yet zero. Fixes: 8017c279196a ("net/virtio-net: Convert to hotplug state machine") Cc: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Andrew Jones <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2017-07-25virtio-balloon: coding format cleanupWei Wang1-2/+4
Clean up the comment format. Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2017-07-25virtio-balloon: deflate via a page listLiang Li1-14/+8
This patch saves the deflated pages to a list, instead of the PFN array. Accordingly, the balloon_pfn_to_page() function is removed. Signed-off-by: Liang Li <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2017-07-25virtio_blk: Use sysfs_match_string() helperAndy Shevchenko1-5/+2
Use sysfs_match_string() helper instead of open coded variant. Cc: "Michael S. Tsirkin" <[email protected]> Cc: Jason Wang <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Jason Wang <[email protected]>
2017-07-25KVM: s390: take srcu lock when getting/setting storage keysChristian Borntraeger1-2/+6
The following warning was triggered by missing srcu locks around the storage key handling functions. ============================= WARNING: suspicious RCU usage 4.12.0+ #56 Not tainted ----------------------------- ./include/linux/kvm_host.h:572 suspicious rcu_dereference_check() usage! rcu_scheduler_active = 2, debug_locks = 1 1 lock held by live_migration/4936: #0: (&mm->mmap_sem){++++++}, at: [<0000000000141be0>] kvm_arch_vm_ioctl+0x6b8/0x22d0 CPU: 8 PID: 4936 Comm: live_migration Not tainted 4.12.0+ #56 Hardware name: IBM 2964 NC9 704 (LPAR) Call Trace: ([<000000000011378a>] show_stack+0xea/0xf0) [<000000000055cc4c>] dump_stack+0x94/0xd8 [<000000000012ee70>] gfn_to_memslot+0x1a0/0x1b8 [<0000000000130b76>] gfn_to_hva+0x2e/0x48 [<0000000000141c3c>] kvm_arch_vm_ioctl+0x714/0x22d0 [<000000000013306c>] kvm_vm_ioctl+0x11c/0x7b8 [<000000000037e2c0>] do_vfs_ioctl+0xa8/0x6c8 [<000000000037e984>] SyS_ioctl+0xa4/0xb8 [<00000000008b20a4>] system_call+0xc4/0x27c 1 lock held by live_migration/4936: #0: (&mm->mmap_sem){++++++}, at: [<0000000000141be0>] kvm_arch_vm_ioctl+0x6b8/0x22d0 Signed-off-by: Christian Borntraeger <[email protected]> Reviewed-by: Pierre Morel<[email protected]>
2017-07-25x86/efi: Fix reboot_mode when EFI runtime services are disabledStefan Assmann1-3/+3
When EFI runtime services are disabled, for example by the "noefi" kernel cmdline parameter, the reboot_type could still be set to BOOT_EFI causing reboot to fail. Fix this by checking if EFI runtime services are enabled. Signed-off-by: Stefan Assmann <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Fixed 'not disabled' double negation. ] Signed-off-by: Ingo Molnar <[email protected]>