aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-03-18tcmu: Convert cmd_time_out into backend device attributeNicholas Bellinger1-26/+68
Instead of putting cmd_time_out under ../target/core/user_0/foo/control, which has historically been used by parameters needed for initial backend device configuration, go ahead and move cmd_time_out into a backend device attribute. In order to do this, tcmu_module_init() has been updated to create a local struct configfs_attribute **tcmu_attrs, that is based upon the existing passthrough_attrib_attrs along with the new cmd_time_out attribute. Once **tcm_attrs has been setup, go ahead and point it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core. Also following MNC's previous change, ->cmd_time_out is stored in milliseconds but exposed via configfs in seconds. Also, note this patch restricts the modification of ->cmd_time_out to before + after the TCMU device has been configured, but not while it has active fabric exports. Cc: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18tcmu: make cmd timeout configurableMike Christie1-6/+35
A single daemon could implement multiple types of devices using multuple types of real devices that may not support restarting from crashes and/or handling tcmu timeouts. This makes the cmd timeout configurable, so handlers that do not support it can turn if off for now. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18tcmu: add helper to check if dev was configuredMike Christie1-2/+6
This adds a helper to check if the dev was configured. It will be used in the next patch to prevent updates to some config settings after the device has been setup. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linuxLinus Torvalds4-3/+12
Pull OpenRISC fixes from Stafford Horne: "OpenRISC fixes for build issues that were exposed by kbuild robots after 4.11 merge. All from allmodconfig builds. This includes: - bug in the handling of 8-byte get_user() calls - module build failure due to multile missing symbol exports" * tag 'openrisc-for-linus' of git://github.com/openrisc/linux: openrisc: Export symbols needed by modules openrisc: fix issue handling 8 byte get_user calls openrisc: xchg: fix `computed is not used` warning
2017-03-18target: fix race during implicit transition work flushesMike Christie1-9/+1
This fixes the following races: 1. core_alua_do_transition_tg_pt could have read tg_pt_gp_alua_access_state and gone into this if chunk: if (!explicit && atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == ALUA_ACCESS_STATE_TRANSITION) { and then core_alua_do_transition_tg_pt_work could update the state. core_alua_do_transition_tg_pt would then only set tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would not get updated with the second calls state. 2. core_alua_do_transition_tg_pt could be setting tg_pt_gp_transition_complete while the tg_pt_gp_transition_work is already completing. core_alua_do_transition_tg_pt then waits on the completion that will never be called. To handle these issues, we just call flush_work which will return when core_alua_do_transition_tg_pt_work has completed so there is no need to do the complete/wait. And, if core_alua_do_transition_tg_pt_work was running, instead of trying to sneak in the state change, we just schedule up another core_alua_do_transition_tg_pt_work call. Note that this does not handle a possible race where there are multiple threads call core_alua_do_transition_tg_pt at the same time. I think we need a mutex in target_tg_pt_gp_alua_access_state_store. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18target: allow userspace to set state to transitioningMike Christie1-15/+22
Userspace target_core_user handlers like tcmu-runner may want to set the ALUA state to transitioning while it does implicit transitions. This patch allows that state when set from configfs. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18target: fix ALUA transition timeout handlingMike Christie2-16/+9
The implicit transition time tells initiators the min time to wait before timing out a transition. We currently schedule the transition to occur in tg_pt_gp_implicit_trans_secs seconds so there is no room for delays. If core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata needs to write out info to a remote file, then the initiator can easily time out the operation. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18target: Use system workqueue for ALUA transitionsMike Christie1-5/+3
If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18target: fail ALUA transitions for pscsiMike Christie1-0/+3
We do not setup the LU group for pscsi devices, so if you write a state to alua_access_state that will cause a transition you will get a NULL pointer dereference. This patch will fail attempts to try and transition the path for backend devices that set the TRANSPORT_FLAG_PASSTHROUGH_ALUA flag. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18target: allow ALUA setup for some passthrough backendsMike Christie4-7/+15
This patch allows passthrough backends to use the core/base LIO ALUA setup and state checks, but still handle the execution of commands. This will allow the target_core_user module to execute STPG and RTPG in userspace, and not have to duplicate the ALUA state checks, path information (needed so we can check if command is executable on specific paths) and setup (rtslib sets/updates the configfs ALUA interface like it does for iblock or file). For STPG, the target_core_user userspace daemon, tcmu-runner will still execute the STPG, and to update the core/base LIO state it will use the existing configfs interface. For RTPG, tcmu-runner will loop over configfs and/or cache the state. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18tcmu: return on first Opt parse failureMike Christie1-0/+3
We only were returing failure if the last opt to be parsed failed. This has a return failure when we first detect a failure. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18tcmu: allow hw_max_sectors greater than 128Mike Christie1-19/+35
tcmu hard codes the hw_max_sectors to 128 which is a litle small. Userspace uses the max_sectors to report the optimal IO size and some initiators perform better with larger IOs (open-iscsi seems to do better with 256 to 512 depending on the test). (Fix do not display hw max sectors twice - MNC) Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18target: Drop pointless tfo->check_stop_free checkNicholas Bellinger2-2/+5
All in-tree fabric drivers provide a tfo->check_stop_free(), so there is no need to do the extra check within existing transport_cmd_check_stop_to_fabric() code. Just to be sure, add a check in target_fabric_tf_ops_check() to notify any out-of-tree drivers that might be missing it. Signed-off-by: Nicholas Bellinger <[email protected]>
2017-03-18parisc: Fix system shutdown haltHelge Deller1-0/+2
On those parisc machines which don't provide a software power off function, the system currently kills the init process at the end of a shutdown and unexpectedly restarts insteads of halting. Fix it by adding a loop which will not return. Signed-off-by: Helge Deller <[email protected]> Cc: [email protected] # 4.9+
2017-03-18parisc: perf: Fix potential NULL pointer dereferenceArvind Yadav1-45/+49
Fix potential NULL pointer dereference and clean up coding style errors (code indent, trailing whitespaces). Signed-off-by: Arvind Yadav <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2017-03-18Merge branch 'smp-urgent-for-linus' of ↵Linus Torvalds1-14/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug fix from Thomas Gleixner: "A single fix preventing the concurrent execution of the CPU hotplug callback install/invocation machinery. Long standing bug caused by a massive brain slip of that Gleixner dude, which went unnoticed for almost a year" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Serialize callback invocations proper
2017-03-18x86/mce: Init some CPU features earlyYazen Ghannam1-12/+18
When the MCA banks in __mcheck_cpu_init_generic() are polled for leftover errors logged during boot or from the previous boot, its required to have CPU features detected sufficiently so that the reading out and handling of those early errors is done correctly. If those features are not available, the decoding may miss some information and get incomplete errors logged. For example, on SMCA systems the MCA_IPID and MCA_SYND registers are not logged and MCA_ADDR is not masked appropriately. To cure that, do a subset of the basic feature detection early while the rest happens in its usual place in __mcheck_cpu_init_vendor(). Signed-off-by: Yazen Ghannam <[email protected]> Cc: Tony Luck <[email protected]> Cc: linux-edac <[email protected]> Cc: x86-ml <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Massage commit message and simplify. ] Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2017-03-18USB: serial: qcserial: add Dell DW5811eBjørn Mork1-0/+2
This is a Dell branded Sierra Wireless EM7455. Cc: <[email protected]> Signed-off-by: Bjørn Mork <[email protected]> Signed-off-by: Johan Hovold <[email protected]>
2017-03-17Merge tag 'pm-4.11-rc3' of ↵Linus Torvalds2-36/+36
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix a few more intel_pstate issues and one small issue in the cpufreq core. Specifics: - Fix breakage in the intel_pstate's debugfs interface for PID controller tuning (Rafael Wysocki) - Fix computations related to P-state limits in intel_pstate to avoid excessive rounding errors leading to visible inaccuracies (Srinivas Pandruvada, Rafael Wysocki) - Add a missing newline to a message printed by one function in the cpufreq core and clean up that function (Rafael Wysocki)" * tag 'pm-4.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Fix and clean up show_cpuinfo_cur_freq() cpufreq: intel_pstate: Avoid percentages in limits-related computations cpufreq: intel_pstate: Correct frequency setting in the HWP mode cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set()
2017-03-18cpufreq: intel_pstate: One set of global limits in active modeRafael J. Wysocki1-96/+46
In the active mode intel_pstate currently uses two sets of global limits, each associated with one of the possible scaling_governor settings in that mode: "powersave" or "performance". The driver switches over from one of those sets to the other depending on the scaling_governor setting for the last CPU whose per-policy cpufreq interface in sysfs was last used to change parameters exposed in there. That obviously leads to no end of issues when the scaling_governor settings differ between CPUs. The most recent issue was introduced by commit a240c4aa5d0f (cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy) that eliminated the reinitialization of "performance" limits in intel_pstate_set_policy() preventing the max limit from being set to anything below 100, among other things. Namely, an undesirable side effect of commit a240c4aa5d0f is that now, after setting scaling_governor to "performance" in the active mode, the per-policy limits for the CPU in question go to the highest level and stay there even when it is switched back to "powersave" later. As it turns out, some distributions set scaling_governor to "performance" temporarily for all CPUs to speed-up system initialization, so that change causes them to misbehave later. To fix that, get rid of the performance/powersave global limits split and use just one set of global limits for everything. From the user's persepctive, after this modification, when scaling_governor is switched from "performance" to "powersave" or the other way around on one CPU, the limits settings (ie. the global max/min_perf_pct and per-policy scaling_max/min_freq for any CPUs) will not change. Still, switching from "performance" to "powersave" or the other way around changes the way in which P-states are selected and in particular "performance" causes the driver to always request the highest P-state it is allowed to ask for for the given CPU. Fixes: a240c4aa5d0f (cpufreq: intel_pstate: Do not reinit performance limits in ->setpolicy) Signed-off-by: Rafael J. Wysocki <[email protected]>
2017-03-18Merge branches 'pm-cpufreq-fixes' and 'intel_pstate-fixes'Rafael J. Wysocki1-33/+31
* pm-cpufreq-fixes: cpufreq: Fix and clean up show_cpuinfo_cur_freq() * intel_pstate-fixes: cpufreq: intel_pstate: Avoid percentages in limits-related computations cpufreq: intel_pstate: Correct frequency setting in the HWP mode cpufreq: intel_pstate: Update pid_params.sample_rate_ns in pid_param_set()
2017-03-17Input: ALPS - fix trackstick button handling on V8 devicesMasaki Ota1-4/+2
Alps stick devices always have physical buttons, so we should not check ALPS_BUTTONPAD flag to decide whether we should report them. Fixes: 4777ac220c43 ("Input: ALPS - add touchstick support for SS5 hardware") Signed-off-by: Masaki Ota <[email protected]> Acked-by: Pali Rohar <[email protected]> Tested-by: Paul Donohue <[email protected]> Tested-by: Nick Fletcher <[email protected]> Cc: [email protected] Signed-off-by: Dmitry Torokhov <[email protected]>
2017-03-17Input: ALPS - fix V8+ protocol handling (73 03 28)Masaki Ota2-16/+61
Devices identified as E7="73 03 28" use slightly modified version of V8 protocol, with lower count per electrode, different offsets, and different feature bits in OTP data. Fixes: aeaa881f9b17 ("Input: ALPS - set DualPoint flag for 74 03 28 devices") Signed-off-by: Masaki Ota <[email protected]> Acked-by: Pali Rohar <[email protected]> Tested-by: Paul Donohue <[email protected]> Tested-by: Nick Fletcher <[email protected]> Cc: [email protected] Signed-off-by: Dmitry Torokhov <[email protected]>
2017-03-17Merge tag 'nfs-for-4.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds13-27/+90
Pull NFS client fixes from Anna Schumaker: "We have a handful of stable fixes to fix kernel warnings and other bugs that have been around for a while. We've also found a few other reference counting bugs and memory leaks since the initial 4.11 pull. Stable Bugfixes: - Fix decrementing nrequests in NFS v4.2 COPY to fix kernel warnings - Prevent a double free in async nfs4_exchange_id() - Squelch a kbuild sparse complaint for xprtrdma Other Bugfixes: - Fix a typo (NFS_ATTR_FATTR_GROUP_NAME) that causes a memory leak - Fix a reference leak that causes kernel warnings - Make nfs4_cb_sv_ops static to fix a sparse warning - Respect a server's max size in CREATE_SESSION - Handle errors from nfs4_pnfs_ds_connect - Flexfiles layout shouldn't mark devices as unavailable" * tag 'nfs-for-4.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: pNFS/flexfiles: never nfs4_mark_deviceid_unavailable pNFS: return status from nfs4_pnfs_ds_connect NFSv4.1 respect server's max size in CREATE_SESSION NFS prevent double free in async nfs4_exchange_id nfs: make nfs4_cb_sv_ops static xprtrdma: Squelch kbuild sparse complaint NFS: fix the fault nrequests decreasing for nfs_inode COPY NFSv4: fix a reference leak caused WARNING messages nfs4: fix a typo of NFS_ATTR_FATTR_GROUP_NAME
2017-03-17Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds11-24/+126
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "An assorted pile of fixes along with some hardware enablement: - a fix for a KASAN / branch profiling related boot failure - some more fallout of the PUD rework - a fix for the Always Running Timer which is not initialized when the TSC frequency is known at boot time (via MSR/CPUID) - a resource leak fix for the RDT filesystem - another unwinder corner case fixup - removal of the warning for duplicate NMI handlers because there are legitimate cases where more than one handler can be registered at the last level - make a function static - found by sparse - a set of updates for the Intel MID platform which got delayed due to merge ordering constraints. It's hardware enablement for a non mainstream platform, so there is no risk" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mpx: Make unnecessarily global function static x86/intel_rdt: Put group node in rdtgroup_kn_unlock x86/unwind: Fix last frame check for aligned function stacks mm, x86: Fix native_pud_clear build error x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y x86/platform/intel-mid: Add power button support for Merrifield x86/platform/intel-mid: Use common power off sequence x86/platform: Remove warning message for duplicate NMI handlers x86/tsc: Fix ART for TSC_KNOWN_FREQ x86/platform/intel-mid: Correct MSI IRQ line for watchdog device
2017-03-17Merge branch 'x86-acpi-for-linus' of ↵Linus Torvalds6-152/+79
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 acpi fixes from Thomas Gleixner: "This update deals with the fallout of the recent work to make cpuid/node mappings persistent. It turned out that the boot time ACPI based mapping tripped over ACPI inconsistencies and caused regressions. It's partially reverted and the fragile part replaced by an implementation which makes the mapping persistent when a CPU goes online for the first time" * 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: acpi/processor: Check for duplicate processor ids at hotplug time acpi/processor: Implement DEVICE operator for processor enumeration x86/acpi: Restore the order of CPU IDs Revert"x86/acpi: Enable MADT APIs to return disabled apicids" Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting"
2017-03-17Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds3-19/+63
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A set of perf related fixes: - fix a CR4.PCE propagation issue caused by usage of mm instead of active_mm and therefore propagated the wrong value. - perf core fixes, which plug a use-after-free issue and make the event inheritance on fork more robust. - a tooling fix for symbol handling" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf symbols: Fix symbols__fixup_end heuristic for corner cases x86/perf: Clarify why x86_pmu_event_mapped() isn't racy x86/perf: Fix CR4.PCE propagation to use active_mm instead of mm perf/core: Better explain the inherit magic perf/core: Simplify perf_event_free_task() perf/core: Fix event inheritance on fork() perf/core: Fix use-after-free in perf_release()
2017-03-17btrfs: add missing memset while reading compressed inline extentsZygo Blaxell1-0/+14
This is a story about 4 distinct (and very old) btrfs bugs. Commit c8b978188c ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3). Commit 93c82d5750 ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read. The fix was to add a memset in btrfs_get_extent to zero out the hole. Commit 166ae5a418 ("btrfs: fix inline compressed read err corruption") fixed bug #2: compressed inline extents which contained non-zero bytes might be replaced with zero bytes in some cases. This patch removed an unhelpful memset from uncompress_inline, but the case where memset is required was missed. There is also a memset in the decompression code, but this only covers decompressed data that is shorter than the ram_bytes from the extent ref record. This memset doesn't cover the region between the end of the decompressed data and the end of the page. It has also moved around a few times over the years, so there's no single patch to refer to. This patch fixes bug #3: compressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read (i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/). The fix is the same: zero out the hole in the compressed case too, by putting a memset back in uncompress_inline, but this time with correct parameters. The last and oldest bug, bug #0, is the cause of the offending inline extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk of behavior somewhere in the btrfs write code. In a few special cases, an inline extent and hole are allowed to persist where they normally would be combined with later extents in the file. A fast reproducer for bug #0 is presented below. A few offending extents are also created in the wild during large rsync transfers with the -S flag. A Linux kernel build (git checkout; make allyesconfig; make -j8) will produce a handful of offending files as well. Once an offending file is created, it can present different content to userspace each time it is read. Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y kernel back to v3.5 has this behavior. There are fossil records of this bug's effects in commits all the way back to v2.6.32. I have no reason to believe bug #0 wasn't present at the beginning of btrfs compression support in v2.6.29, but I can't easily test kernels that old to be sure. It is not clear whether bug #0 is worth fixing. A fix would likely require injecting extra reads into currently write-only paths, and most of the exceptional cases caused by bug #0 are already handled now. Whether we like them or not, bug #0's inline extents followed by holes are part of the btrfs de-facto disk format now, and we need to be able to read them without data corruption or an infoleak. So enough about bug #0, let's get back to bug #3 (this patch). An example of on-disk structure leading to data corruption found in the wild: item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160 inode generation 50 transid 50 size 47424 nbytes 49141 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0 flags 0x0(none) item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20 inode ref index 3 namelen 10 name: DB_File.so item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362 inline extent data size 1341 ram 4085 compress(zlib) item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53 extent data disk byte 5367308288 nr 20480 extent data offset 0 nr 45056 ram 45056 extent compression(zlib) Different data appears in userspace during each read of the 11 bytes between 4085 and 4096. The extent in item 63 is not long enough to fill the first page of the file, so a memset is required to fill the space between item 63 (ending at 4085) and item 64 (beginning at 4096) with zero. Here is a reproducer from Liu Bo, which demonstrates another method of creating the same inline extent and hole pattern: Using 'page_poison=on' kernel command line (or enable CONFIG_PAGE_POISONING) run the following: # touch foo # chattr +c foo # xfs_io -f -c "pwrite -W 0 1000" foo # xfs_io -f -c "falloc 4 8188" foo # od -x foo # echo 3 >/proc/sys/vm/drop_caches # od -x foo This produce the following on my box: Correct output: file contains 1000 data bytes followed by zeros: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000 0001760 0000 0000 0000 0000 0000 0000 0000 0000 * 0020000 Actual output: the data after the first 1000 bytes will be different each run: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d 0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400 0002000 435f 0056 5f74 6164 7400 645f 0062 5f74 (...) Signed-off-by: Zygo Blaxell <[email protected]> Reviewed-by: Liu Bo <[email protected]> Reviewed-by: Chris Mason <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2017-03-17Btrfs: fix regression in lock_delalloc_pagesLiu Bo1-1/+2
The bug is a regression after commit (da2c7009f6ca "btrfs: teach __process_pages_contig about PAGE_LOCK operation") and commit (76c0021db8fd "Btrfs: use helper to simplify lock/unlock pages"). So if the dirty pages which are under writeback got truncated partially before we lock the dirty pages, we couldn't find all pages mapping to the delalloc range, and the bug didn't return an error so it kept going on and found that the delalloc range got truncated and got to unlock the dirty pages, and then the ASSERT could caught the error, and showed ----------------------------------------------------------------------------- assertion failed: page_ops & PAGE_LOCK, file: fs/btrfs/extent_io.c, line: 1716 ----------------------------------------------------------------------------- This fixes the bug by returning the proper -EAGAIN. Cc: David Sterba <[email protected]> Reported-by: Dave Jones <[email protected]> Signed-off-by: Liu Bo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-03-17Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds2-14/+69
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "From the scheduler departement: - a bunch of sched deadline related fixes which deal with various buglets and corner cases. - two fixes for the loadavg spikes which are caused by the delayed NOHZ accounting" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Use deadline instead of period when calculating overflow sched/deadline: Throttle a constrained deadline task activated after the deadline sched/deadline: Make sure the replenishment timer fires in the next period sched/loadavg: Use {READ,WRITE}_ONCE() for sample window sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting sched/deadline: Add missing update_rq_clock() in dl_task_timer()
2017-03-17Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds2-14/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Three fixes related to locking: - fix a SIGKILL issue for RWSEM_GENERIC_SPINLOCK which has been fixed for the XCHGADD variant already - plug a potential use after free in the futex code - prevent leaking a held spinlock in an futex error handling code path" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y futex: Add missing error handling to FUTEX_REQUEUE_PI futex: Fix potential use-after-free in FUTEX_REQUEUE_PI
2017-03-17Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-15/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "Just a simple revert of a new sched_clock implementation which turned out to be buggy" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "clocksource/drivers/tcb_clksrc: Use 32 bit tcb as sched_clock"
2017-03-17pNFS/flexfiles: never nfs4_mark_deviceid_unavailableWeston Andros Adamson4-10/+31
The flexfiles layout should never mark a device unavailable. Move nfs4_mark_deviceid_unavailable out of nfs4_pnfs_ds_connect and call directly from files layout where it's still needed. The flexfiles driver still handles marked devices in error paths, but will now print a rate limited warning. Signed-off-by: Weston Andros Adamson <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2017-03-17pNFS: return status from nfs4_pnfs_ds_connectWeston Andros Adamson6-6/+48
The nfs4_pnfs_ds_connect path can call rpc_create which can fail or it can wait on another context to reach the same failure. This checks that the rpc_create succeeded and returns the error to the caller. When an error is returned, both the files and flexfiles layouts will return NULL from _prepare_ds(). The flexfiles layout will also return the layout with the error NFS4ERR_NXIO. Signed-off-by: Weston Andros Adamson <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2017-03-17NFSv4.1 respect server's max size in CREATE_SESSIONOlga Kornievskaia1-2/+2
Currently client doesn't respect max sizes server returns in CREATE_SESSION. nfs4_session_set_rwsize() gets called and server->rsize, server->wsize are 0 so they never get set to the sizes returned by the server. Signed-off-by: Olga Kornievskaia <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2017-03-17NFS prevent double free in async nfs4_exchange_idOlga Kornievskaia1-5/+4
Since rpc_task is async, the release function should be called which will free the impl_id, scope, and owner. Trond pointed at 2 more problems: -- use of client pointer after free in the nfs4_exchangeid_release() function -- cl_count mismatch if rpc_run_task() isn't run Fixes: 8d89bd70bc9 ("NFS setup async exchange_id") Signed-off-by: Olga Kornievskaia <[email protected]> Cc: [email protected] # 4.9 Signed-off-by: Anna Schumaker <[email protected]>
2017-03-17nfs: make nfs4_cb_sv_ops staticJason Yan1-2/+2
Fixes the following sparse warning: fs/nfs/callback.c:235:21: warning: symbol 'nfs4_cb_sv_ops' was not declared. Should it be static? Signed-off-by: Jason Yan <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2017-03-17xprtrdma: Squelch kbuild sparse complaintChuck Lever1-1/+2
New complaint from kbuild for 4.9.y: net/sunrpc/xprtrdma/verbs.c:489:19: sparse: incompatible types in comparison expression (different type sizes) verbs.c: 489 max_sge = min(ia->ri_device->attrs.max_sge, RPCRDMA_MAX_SEND_SGES); I can't reproduce this running sparse here. Likewise, "make W=1 net/sunrpc/xprtrdma/verbs.o" never indicated any issue. A little poking suggests that because the range of its values is small, gcc can make the actual width of RPCRDMA_MAX_SEND_SGES smaller than the width of an unsigned integer. Fixes: 16f906d66cd7 ("xprtrdma: Reduce required number of send SGEs") Signed-off-by: Chuck Lever <[email protected]> Cc: [email protected] Signed-off-by: Anna Schumaker <[email protected]>
2017-03-17NFS: fix the fault nrequests decreasing for nfs_inode COPYKinglong Mee1-2/+4
The nfs_commit_file for NFSv4.2's COPY operation goes through the commit path for normal WRITE, but without increase nrequests, so, the nrequests decreased in nfs_commit_release_pages is fault. After that, the nrequests will be wrong. [ 5670.299881] ------------[ cut here ]------------ [ 5670.300295] WARNING: CPU: 0 PID: 27656 at fs/nfs/inode.c:127 nfs_clear_inode+0x66/0x90 [nfs] [ 5670.300558] Modules linked in: nfsv4(E) nfs(E) fscache(E) tun bridge stp llc fuse ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event ppdev f2fs coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_ens1371 intel_rapl_perf gameport snd_ac97_codec vmw_balloon ac97_bus snd_seq snd_pcm joydev snd_rawmidi snd_timer snd_seq_device snd soundcore nfit parport_pc parport acpi_cpufreq tpm_tis tpm_tis_core tpm i2c_piix4 vmw_vmci shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm e1000 crc32c_intel mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi fjes [last unloaded: fscache] [ 5670.302925] CPU: 0 PID: 27656 Comm: umount.nfs4 Tainted: G W E 4.11.0-rc1+ #519 [ 5670.303292] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 5670.304094] Call Trace: [ 5670.304510] dump_stack+0x63/0x86 [ 5670.304917] __warn+0xcb/0xf0 [ 5670.305276] warn_slowpath_null+0x1d/0x20 [ 5670.305661] nfs_clear_inode+0x66/0x90 [nfs] [ 5670.306093] nfs4_evict_inode+0x61/0x70 [nfsv4] [ 5670.306480] evict+0xbb/0x1c0 [ 5670.306888] dispose_list+0x4d/0x70 [ 5670.307233] evict_inodes+0x178/0x1a0 [ 5670.307579] generic_shutdown_super+0x44/0xf0 [ 5670.307985] nfs_kill_super+0x21/0x40 [nfs] [ 5670.308325] deactivate_locked_super+0x43/0x70 [ 5670.308698] deactivate_super+0x5a/0x60 [ 5670.309036] cleanup_mnt+0x3f/0x90 [ 5670.309407] __cleanup_mnt+0x12/0x20 [ 5670.309837] task_work_run+0x80/0xa0 [ 5670.310162] exit_to_usermode_loop+0x89/0x90 [ 5670.310497] syscall_return_slowpath+0xaa/0xb0 [ 5670.310875] entry_SYSCALL_64_fastpath+0xa7/0xa9 [ 5670.311197] RIP: 0033:0x7f1bb3617fe7 [ 5670.311545] RSP: 002b:00007ffecbabb828 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6 [ 5670.311906] RAX: 0000000000000000 RBX: 0000000001dca1f0 RCX: 00007f1bb3617fe7 [ 5670.312239] RDX: 000000000000000c RSI: 0000000000000001 RDI: 0000000001dc83c0 [ 5670.312653] RBP: 0000000001dc83c0 R08: 0000000000000001 R09: 0000000000000000 [ 5670.312998] R10: 0000000000000755 R11: 0000000000000206 R12: 00007ffecbabc66a [ 5670.313335] R13: 0000000001dc83a0 R14: 0000000000000000 R15: 0000000000000000 [ 5670.313758] ---[ end trace bf4bfe7764e4eb40 ]--- Cc: [email protected] Fixes: 67911c8f18 ("NFS: Add nfs_commit_file()") Signed-off-by: Kinglong Mee <[email protected]> Cc: [email protected] # 4.7+ Signed-off-by: Anna Schumaker <[email protected]>
2017-03-17Merge tag 'afs-20170316' of ↵Linus Torvalds13-222/+269
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS fixes from David Howells: "Fixes to the AFS filesystem in the kernel. They fix a variety of bugs. These include some issues fixed for consistency with other AFS implementations: - handle AFS mode bits better - use the client mtime rather than the server mtime in the protocol - handle the server returning more or less data than was requested in a FetchData call - distinguish mountpoints from symlinks based on the mode bits rather than preemptively reading every symlink to find out what it actually represents One other notable change for the user is that files are now flushed on close analogously with other network filesystems" * tag 'afs-20170316' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (28 commits) afs: Don't wait for page writeback with the page lock held afs: ->writepage() shouldn't call clear_page_dirty_for_io() afs: Fix abort on signal while waiting for call completion afs: Fix an off-by-one error in afs_send_pages() afs: Fix afs_kill_pages() afs: Fix page leak in afs_write_begin() afs: Don't set PG_error on local EINTR or ENOMEM when filling a page afs: Populate and use client modification time afs: Better abort and net error handling afs: Invalid op ID should abort with RXGEN_OPCODE afs: Fix the maths in afs_fs_store_data() afs: Use a bvec rather than a kvec in afs_send_pages() afs: Make struct afs_read::remain 64-bit afs: Fix AFS read bug afs: Prevent callback expiry timer overflow afs: Migrate vlocation fields to 64-bit afs: security: Replace rcu_assign_pointer() with RCU_INIT_POINTER() afs: inode: Replace rcu_assign_pointer() with RCU_INIT_POINTER() afs: Distinguish mountpoints from symlinks by file mode alone afs: Flush outstanding writes when an fd is closed ...
2017-03-17Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds1-0/+1
Pull ARM fix from Russell King: "Just one change to add the statx syscall this time around" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: wire up statx syscall
2017-03-17drm/amd/amdgpu: add POLARIS12 PCI IDEvan Quan1-0/+1
Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Junwei Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2017-03-17Merge tag 'for-linus-4.11b-rc3-tag' of ↵Linus Torvalds1-5/+6
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "A minor fix for using the appropriate refcount_t instead of atomic_t" * tag 'for-linus-4.11b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: drivers, xen: convert grant_map.users from atomic_t to refcount_t
2017-03-17Merge tag 'drm-fixes-for-v4.11-rc3' of ↵Linus Torvalds24-123/+271
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Bunch of fixes across the drivers, in a St Patrick's day pull request (please turn terminal colors to green on black or black on green for full effect). On the arm side, tilcdc, omap and malidp got fixes, while amd has some powermanagement fixes, and intel has a set of fixes across the driver. Nothing seems to bad or scary at this point" * tag 'drm-fixes-for-v4.11-rc3' of git://people.freedesktop.org/~airlied/linux: (27 commits) drm/amd/amdgpu: Fix debugfs reg read/write address width drm/amdgpu/si: add dpm quirk for Oland drm/radeon/si: add dpm quirk for Oland drm: amd: remove broken include path drm/amd/powerplay: fix copy error in smu7_clockpoweragting.c drm/tilcdc: Set framebuffer DMA address to HW only if CRTC is enabled drm/tilcdc: Fix hardcoded fail-return value in tilcdc_crtc_create() drm/i915: Fix forcewake active domain tracking drm/i915: Nuke skl_update_plane debug message from the pipe update critical section drm/i915: use correct node for handling cache domain eviction uapi: fix drm/omap_drm.h userspace compilation errors drm/omap: fix dmabuf mmap for dma_alloc'ed buffers drm/amdgpu: fix parser init error path to avoid crash in parser fini drm/amd/amdgpu: Disable GFX_PG on Carrizo until compute issues solved drm: mali-dp: Fix smart layer not going to composition drm: mali-dp: Remove mclk rate management drm/i915: Drain the freed state from the tail of the next commit drm/i915: Nuke debug messages from the pipe update critical section drm/i915: Use pagecache write to prepopulate shmemfs from pwrite-ioctl drm/i915: Store a permanent error in obj->mm.pages ...
2017-03-17Merge tag 'perf-urgent-for-mingo-4.11-20170317' of ↵Ingo Molnar1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fix from Arnaldo Carvalho de Melo: - Fix symbols__fixup_end heuristic for corner cases, such as JITted eBPF programs, that are loaded at page aligned addresses, just after the kernel proper (Daniel Borkmann) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2017-03-17perf symbols: Fix symbols__fixup_end heuristic for corner casesDaniel Borkmann1-1/+1
The current symbols__fixup_end() heuristic for the last entry in the rb tree is suboptimal as it leads to not being able to recognize the symbol in the call graph in a couple of corner cases, for example: i) If the symbol has a start address (f.e. exposed via kallsyms) that is at a page boundary, then the roundup(curr->start, 4096) for the last entry will result in curr->start == curr->end with a symbol length of zero. ii) If the symbol has a start address that is shortly before a page boundary, then also here, curr->end - curr->start will just be very few bytes, where it's unrealistic that we could perform a match against. Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so that we can catch such corner cases and have a better chance to find that specific symbol. It's still just best effort as the real end of the symbol is unknown to us (and could even be at a larger offset than the current range), but better than the current situation. Alexei reported that he recently run into case i) with a JITed eBPF program (these are all page aligned) as the last symbol which wasn't properly shown in the call graph (while other eBPF program symbols in the rb tree were displayed correctly). Since this is a generic issue, lets try to improve the heuristic a bit. Reported-and-Tested-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Fixes: 2e538c4a1847 ("perf tools: Improve kernel/modules symbol lookup") Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-03-17drm/i915/gvt: Fix gvt scheduler interval timeZhenyu Wang1-2/+2
Fix to correctly assign 1ms for gvt scheduler interval time, as previous code using HZ is pretty broken. And use no delay for start gvt scheduler function. Fixes: 4b63960ebd3f ("drm/i915/gvt: vGPU schedule policy framework") Cc: Zhi Wang <[email protected]> Cc: [email protected] # v4.10+ Acked-by: Chuanxiao Dong <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2017-03-17drm/i915/gvt: GVT pin/unpin shadow contextChuanxiao Dong1-0/+27
When handling guest request, GVT needs to populate/update shadow_ctx with guest context. This behavior needs to make sure the shadow_ctx is pinned. The current implementation is relying on i195 allocate request to pin but this way cannot guarantee the i915 not to unpin the shadow_ctx when GVT update the guest context from shadow_ctx. So GVT should pin/unpin the shadow_ctx by itself. Signed-off-by: Chuanxiao Dong <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2017-03-17drm/i915/gvt: scan shadow indirect context image when validTina Zhang1-3/+6
The shadow indirect context image should be only scanned when valid. So far, Only RCS ring has the shadow indirect context image. This patch limits the scan logic only for RCS ring. v2. refine description of the subject v3. fix alignment. (Zhenyu) Signed-off-by: Tina Zhang <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2017-03-17drm/i915/kvmgt: fix suspicious rcu dereference usageChangbin Du1-1/+3
The srcu read lock must be held while accessing kvm memslots. This patch fix below warning for function kvmgt_rw_gpa(). [ 165.345093] [ ERR: suspicious RCU usage. ] [ 165.416538] Call Trace: [ 165.418989] dump_stack+0x85/0xc2 [ 165.422310] lockdep_rcu_suspicious+0xd7/0x110 [ 165.426769] kvm_read_guest_page+0x195/0x1b0 [kvm] [ 165.431574] kvm_read_guest+0x50/0x90 [kvm] [ 165.440492] kvmgt_rw_gpa+0x43/0xa0 [kvmgt] [ 165.444683] kvmgt_read_gpa+0x11/0x20 [kvmgt] [ 165.449061] gtt_get_entry64+0x4d/0xc0 [i915] [ 165.453438] ppgtt_populate_shadow_page_by_guest_entry+0x380/0xdc0 [i915] [ 165.460254] shadow_mm+0xd1/0x460 [i915] [ 165.472488] intel_vgpu_create_mm+0x1ab/0x210 [i915] [ 165.477472] intel_vgpu_g2v_create_ppgtt_mm+0x5f/0xc0 [i915] [ 165.483154] pvinfo_mmio_write+0x19b/0x1d0 [i915] [ 165.499068] intel_vgpu_emulate_mmio_write+0x3f9/0x600 [i915] [ 165.504827] intel_vgpu_rw+0x114/0x150 [kvmgt] [ 165.509281] intel_vgpu_write+0x16f/0x1a0 [kvmgt] [ 165.513993] vfio_mdev_write+0x20/0x30 [vfio_mdev] [ 165.518793] vfio_device_fops_write+0x24/0x30 [vfio] [ 165.523770] __vfs_write+0x28/0x120 [ 165.540529] vfs_write+0xce/0x1f0 v2: fix Cc format for stable Signed-off-by: Changbin Du <[email protected]> Cc: <[email protected]> # v4.10+ Reviewed-by: Xiao Guangrong <[email protected]> Reviewed-by: Jike Song <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>