aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-02-14ibmvnic: Fix login buffer memory leaksThomas Falcon1-0/+16
During device bringup, the driver exchanges login buffers with firmware. These buffers contain information such number of TX and RX queues alloted to the device, RX buffer size, etc. These buffers weren't being properly freed on device reset or close. We can free the buffer we send to firmware as soon as we get a response. There is information in the response buffer that the driver needs for normal operation so retain it until the next reset or removal. Signed-off-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-14ibmvnic: Wait until reset is complete to set carrier onThomas Falcon1-2/+2
Pushes back setting the carrier on until the end of the reset code. This resolves a bug where a watchdog timer was detecting that a TX queue had stalled before the adapter reset was complete. Signed-off-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-14Revert "net: thunderx: Add support for xdp redirect"Jesper Dangaard Brouer3-94/+31
This reverts commit aa136d0c82fcd6af14535853c30e219e02b2692d. As I previously[1] pointed out this implementation of XDP_REDIRECT is wrong. XDP_REDIRECT is a facility that must work between different NIC drivers. Another NIC driver can call ndo_xdp_xmit/nicvf_xdp_xmit, but your driver patch assumes payload data (at top of page) will contain a queue index and a DMA addr, this is not true and worse will likely contain garbage. Given you have not fixed this in due time (just reached v4.16-rc1), the only option I see is a revert. [1] http://lkml.kernel.org/r/[email protected] Cc: Sunil Goutham <[email protected]> Cc: Christina Jacob <[email protected]> Cc: Aleksey Makarov <[email protected]> Fixes: aa136d0c82fc ("net: thunderx: Add support for xdp redirect") Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-14Merge tag 'gvt-fixes-2018-02-14' of https://github.com/intel/gvt-linux into ↵Rodrigo Vivi3-3/+51
drm-intel-fixes gvt-fixes-2018-02-14 - gtt mmio 8b access fix (Tina) - one KBL required mmio reg for switch (Weinan) - one trace log typo fix (Weinan) Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-14sctp: fix some copy-paste errors for file commentsXin Long2-2/+3
This patch is to fix the file comments in stream.c and stream_interleave.c v1->v2: rephrase the comment for stream.c according to Neil's suggestion. Fixes: a83863174a61 ("sctp: prepare asoc stream for stream reconf") Fixes: 0c3f6f655487 ("sctp: implement make_datafrag for sctp_stream_interleave") Signed-off-by: Xin Long <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-14net: fix race on decreasing number of TX queuesJakub Kicinski1-2/+9
netif_set_real_num_tx_queues() can be called when netdev is up. That usually happens when user requests change of number of channels/rings with ethtool -L. The procedure for changing the number of queues involves resetting the qdiscs and setting dev->num_tx_queues to the new value. When the new value is lower than the old one, extra care has to be taken to ensure ordering of accesses to the number of queues vs qdisc reset. Currently the queues are reset before new dev->num_tx_queues is assigned, leaving a window of time where packets can be enqueued onto the queues going down, leading to a likely crash in the drivers, since most drivers don't check if TX skbs are assigned to an active queue. Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice") Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-14arm64: proc: Set PTE_NG for table entries to avoid traversing them twiceWill Deacon1-5/+9
When KASAN is enabled, the swapper page table contains many identical mappings of the zero page, which can lead to a stall during boot whilst the G -> nG code continually walks the same page table entries looking for global mappings. This patch sets the nG bit (bit 11, which is IGNORED) in table entries after processing the subtree so we can easily skip them if we see them a second time. Tested-by: Mark Rutland <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2018-02-14Merge tag 'gfs2-4.16.rc1.fixes' of ↵Linus Torvalds1-20/+23
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fix from Bob Peterson: "Fix regressions in the gfs2 iomap for block_map implementation we recently discovered in commit 3974320ca6" * tag 'gfs2-4.16.rc1.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fixes to "Implement iomap for block_map"
2018-02-14Merge tag 'powerpc-4.16-2' of ↵Linus Torvalds28-79/+231
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A larger batch of fixes than we'd like. Roughly 1/3 fixes for new code, 1/3 fixes for stable and 1/3 minor things. There's four commits fixing bugs when using 16GB huge pages on hash, caused by some of the preparatory changes for pkeys. Two fixes for bugs in the enhanced IRQ soft masking for local_t, one of which broke KVM in some circumstances. Four fixes for Power9. The most bizarre being a bug where futexes stopped working because a NULL pointer dereference didn't trap during early boot (it aliased the kernel mapping). A fix for memory hotplug when using the Radix MMU, and a fix for live migration of guests using the Radix MMU. Two fixes for hotplug on pseries machines. One where we weren't correctly updating NUMA info when CPUs are added and removed. And the other fixes crashes/hangs seen when doing memory hot remove during boot, which is apparently a thing people do. Finally a handful of build fixes for obscure configs and other minor fixes. Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Balbir Singh, Colin Ian King, Daniel Henrique Barboza, Florian Weimer, Guenter Roeck, Harish, Laurent Vivier, Madhavan Srinivasan, Mauricio Faria de Oliveira, Nathan Fontenot, Nicholas Piggin, Sam Bobroff" * tag 'powerpc-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix to use ucontext_t instead of struct ucontext powerpc/kdump: Fix powernv build break when KEXEC_CORE=n powerpc/pseries: Fix build break for SPLPAR=n and CPU hotplug powerpc/mm/hash64: Zero PGD pages on allocation powerpc/mm/hash64: Store the slot information at the right offset for hugetlb powerpc/mm/hash64: Allocate larger PMD table if hugetlb config is enabled powerpc/mm: Fix crashes with 16G huge pages powerpc/mm: Flush radix process translations when setting MMU type powerpc/vas: Don't set uses_vas for kernel windows powerpc/pseries: Enable RAS hotplug events later powerpc/mm/radix: Split linear mapping on hot-unplug powerpc/64s/radix: Boot-time NULL pointer protection using a guard-PID ocxl: fix signed comparison with less than zero powerpc/64s: Fix may_hard_irq_enable() for PMI soft masking powerpc/64s: Fix MASKABLE_RELON_EXCEPTION_HV_OOL macro powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove
2018-02-14nvme-rdma: fix sysfs invoked reset_ctrl error flowNitzan Carmi2-6/+7
When reset_controller that is invoked by sysfs fails, it enters an error flow which practically removes the nvme ctrl entirely (similar to delete_ctrl flow). It causes the system to hang, since a sysfs attribute cannot be unregistered by one of its own methods. This can be fixed by calling delete_ctrl as a work rather than sequential code. In addition, it should give the ctrl a chance to recover using reconnection mechanism (consistant with FC reset_ctrl error flow). Also, while we're here, return suitable errno in case the reset ended with non live ctrl. Signed-off-by: Nitzan Carmi <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]>
2018-02-14nvmet: Change return code of discard command if not supportedIsrael Rukshin1-2/+5
Execute discard command on block device that doesn't support it should return success. Returning internal error while using multi-path fails the path. Reviewed-by: Max Gurtovoy <[email protected]> Signed-off-by: Israel Rukshin <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]>
2018-02-14virtio/s390: implement PM operations for virtio_ccwChristian Borntraeger1-0/+29
Suspend/Resume to/from disk currently fails. Let us wire up the necessary callbacks. This is mostly just forwarding the requests to the virtio drivers. The only thing that has to be done in virtio_ccw itself is to re-set the virtio revision. Suggested-by: Thomas Huth <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]> Message-Id: <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> [CH: merged <[email protected]> to fix !CONFIG_PM configs] Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2018-02-14ALSA: hda/realtek: PCI quirk for Fujitsu U7x7Jan-Marek Glogowski1-0/+19
These laptops have a combined jack to attach headsets, the U727 on the left, the U757 on the right, but a headsets microphone doesn't work. Using hdajacksensetest I found that pin 0x19 changed the present state when plugging the headset, in addition to 0x21, but didn't have the correct configuration (shown as "Not connected"). So this sets the configuration to the same values as the headphone pin 0x21 except for the device type microphone, which makes it work correctly. With the patch the configured pins for U727 are Pin 0x12 (Internal Mic, Mobile-In): present = No Pin 0x14 (Internal Speaker): present = No Pin 0x19 (Black Mic, Left side): present = No Pin 0x1d (Internal Aux): present = No Pin 0x21 (Black Headphone, Left side): present = No Signed-off-by: Jan-Marek Glogowski <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-02-14mmc: bcm2835: Don't overwrite max frequency unconditionallyPhil Elwell1-1/+2
The optional DT parameter max-frequency could init the max bus frequency. So take care of this, before setting the max bus frequency. Fixes: 660fc733bd74 ("mmc: bcm2835: Add new driver for the sdhost controller.") Signed-off-by: Phil Elwell <[email protected]> Signed-off-by: Stefan Wahren <[email protected]> Cc: <[email protected]> # 4.12+ Signed-off-by: Ulf Hansson <[email protected]>
2018-02-14Revert "mmc: meson-gx: include tx phase in the tuning process"Jerome Brunet1-18/+1
This reverts commit 0a44697627d17a66d7dc98f17aeca07ca79c5c20. This commit was initially intended to fix problems with hs200 and hs400 on some boards, mainly the odroid-c2. The OC2 (Rev 0.2) I have performs well in this modes, so I could not confirm these issues. We've had several reports about the issues being still present on (some) OC2, so apparently, this change does not do what it was supposed to do. Maybe the eMMC signal quality is on the edge on the board. This may explain the variability we see in term of stability, but this is just a guess. Lowering the max_frequency to 100Mhz seems to do trick for those affected by the issue Worse, the commit created new issues (CRC errors and hangs) on other boards, such as the kvim 1 and 2, the p200 or the libretech-cc. According to amlogic, the Tx phase should not be tuned and left in its default configuration, so it is best to just revert the commit. Fixes: 0a44697627d1 ("mmc: meson-gx: include tx phase in the tuning process") Cc: <[email protected]> # 4.14+ Signed-off-by: Jerome Brunet <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2018-02-14ALSA: seq: Fix racy pool initializationsTakashi Iwai1-2/+6
ALSA sequencer core initializes the event pool on demand by invoking snd_seq_pool_init() when the first write happens and the pool is empty. Meanwhile user can reset the pool size manually via ioctl concurrently, and this may lead to UAF or out-of-bound accesses since the function tries to vmalloc / vfree the buffer. A simple fix is to just wrap the snd_seq_pool_init() call with the recently introduced client->ioctl_mutex; as the calls for snd_seq_pool_init() from other side are always protected with this mutex, we can avoid the race. Reported-by: 范龙飞 <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-02-14drm/i915/gvt: fix one typo of render_mmio traceWeinan Li1-1/+1
Fix one typo of render_mmio trace, exchange the mmio value of old and new. Signed-off-by: Weinan Li <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-02-14drm/i915/gvt: Support BAR0 8-byte reads/writesTina Zhang1-2/+49
GGTT is in BAR0 with 8 bytes aligned. With a qemu patch (commit: 38d49e8c1523d97d2191190d3f7b4ce7a0ab5aa3), VFIO can use 8-byte reads/ writes to access it. This patch is to support the 8-byte GGTT reads/writes. Ideally, we would like to support 8-byte reads/writes for the total BAR0. But it needs more work for handling 8-byte MMIO reads/writes. This patch can fix the issue caused by partial updating GGTT entry, during guest booting up. v3: - Use intel_vgpu_get_bar_gpa() stead. (Zhenyu) - Include all the GGTT checking logic in gtt_entry(). (Zhenyu) v2: - Limit to GGTT entry. (Zhenyu) Signed-off-by: Tina Zhang <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-02-14drm/i915/gvt: add 0xe4f0 into gen9 render listWeinan Li1-0/+1
Guest may set this register on KBL platform, it can impact hardware behavior, so add it into the gen9 render list. Otherwise gpu hang issue may happen during different vgpu switch. v2: separate it from patch set. Cc: Zhi Wang <[email protected]> Cc: Zhenyu Wang <[email protected]> Signed-off-by: Weinan Li <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-02-13drm/i915/pmu: Fix building without CONFIG_PMChris Wilson1-10/+23
As we peek inside struct device to query members guarded by CONFIG_PM, so must be the code. Reported-by: kbuild test robot <[email protected]> Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout") Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 05273c950a3c93c5f96be8807eaf24f2cc9f1c1e) Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-13drm/i915/pmu: Fix sleep under atomic in RC6 readoutTvrtko Ursulin2-15/+84
We are not allowed to call intel_runtime_pm_get from the PMU counter read callback since the former can sleep, and the latter is running under IRQ context. To workaround this, we record the last known RC6 and while runtime suspended estimate its increase by querying the runtime PM core timestamps. Downside of this approach is that we can temporarily lose a chunk of RC6 time, from the last PMU read-out to runtime suspend entry, but that will eventually catch up, once device comes back online and in the presence of PMU queries. Also, we have to be careful not to overshoot the RC6 estimate, so once resumed after a period of approximation, we only update the counter once it catches up. With the observation that RC6 is increasing while the device is suspended, this should not pose a problem and can only cause slight inaccuracies due clock base differences. v2: Simplify by estimating on top of PM core counters. (Imre) Signed-off-by: Tvrtko Ursulin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104943 Fixes: 6060b6aec03c ("drm/i915/pmu: Add RC6 residency metrics") Testcase: igt/perf_pmu/rc6-runtime-pm Cc: Tvrtko Ursulin <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Imre Deak <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: David Airlie <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 1fe699e30113ed6f6e853ff44710d256072ea627) Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-13drm/i915/pmu: Fix PMU enable vs execlists tasklet raceTvrtko Ursulin2-87/+52
Commit 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats") added a tasklet_disable call in busy stats enabling, but we failed to understand that the PMU enable callback runs as an hard IRQ (IPI). Consequence of this is that the PMU enable callback can interrupt the execlists tasklet, and will then deadlock when it calls intel_engine_stats_enable->tasklet_disable. To fix this, I realized it is possible to move the engine stats enablement and disablement to PMU event init and destroy hooks. This allows for much simpler implementation since those hooks run in normal context (can sleep). v2: Extract engine_event_destroy. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: 99e48bf98dd0 ("drm/i915: Lock out execlist tasklet while peeking inside for busy-stats") Testcase: igt/perf_pmu/enable-race-* Cc: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: [email protected] Reviewed-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit b2f78cda260bc6a1a2d382b1d85a29e69b5b3724) Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-13drm/i915: Lock out execlist tasklet while peeking inside for busy-statsChris Wilson1-8/+12
In order to prevent a race condition where we may end up overaccounting the active state and leaving the busy-stats believing the GPU is 100% busy, lock out the tasklet while we reconstruct the busy state. There is no direct spinlock guard for the execlists->port[], so we need to utilise tasklet_disable() as a synchronous barrier to prevent it, the only writer to execlists->port[], from running at the same time as the enable. Fixes: 4900727d35bb ("drm/i915/pmu: Reconstruct active state on starting busy-stats") Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Tvrtko Ursulin <[email protected]> (cherry picked from commit 99e48bf98dd036090b480a12c39e8b971731247e) Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-13drm/i915/breadcrumbs: Ignore unsubmitted signalersChris Wilson1-19/+10
When a request is preempted, it is unsubmitted from the HW queue and removed from the active list of breadcrumbs. In the process, this however triggers the signaler and it may see the clear rbtree with the old, and still valid, seqno, or it may match the cleared seqno with the now zero rq->global_seqno. This confuses the signaler into action and signaling the fence. Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue") Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: <[email protected]> # v4.12+ Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit fd10e2ce9905030d922e179a8047a4d50daffd8e) Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-13nvme-pci: Fix timeouts in connecting stateKeith Busch1-1/+5
We need to halt the controller immediately if we haven't completed initialization as indicated by the new "connecting" state. Fixes: ad70062cdb ("nvme-pci: introduce RECONNECTING state to mark initializing procedure") Signed-off-by: Keith Busch <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2018-02-13nvme-pci: Remap CMB SQ entries on every controller resetKeith Busch1-11/+14
The controller memory buffer is remapped into a kernel address on each reset, but the driver was setting the submission queue base address only on the very first queue creation. The remapped address is likely to change after a reset, so accessing the old address will hit a kernel bug. This patch fixes that by setting the queue's CMB base address each time the queue is created. Fixes: f63572dff1421 ("nvme: unmap CMB and remove sysfs file in reset path") Reported-by: Christian Black <[email protected]> Cc: Jon Derrick <[email protected]> Cc: <[email protected]> # 4.9+ Signed-off-by: Keith Busch <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2018-02-13nvme: fix the deadlock in nvme_update_formatsJianchao Wang1-3/+8
nvme_update_formats will invoke nvme_ns_remove under namespaces_mutext. The will cause deadlock because nvme_ns_remove will also require the namespaces_mutext. Fix it by getting the ns entries which should be removed under namespaces_mutext and invoke nvme_ns_remove out of namespaces_mutext. Signed-off-by: Jianchao Wang <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Keith Busch <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]>
2018-02-13gfs2: Fixes to "Implement iomap for block_map"Andreas Gruenbacher1-20/+23
It turns out that commit 3974320ca6 "Implement iomap for block_map" introduced a few bugs that trigger occasional failures with xfstest generic/476: In gfs2_iomap_begin, we jump to do_alloc when we determine that we are beyond the end of the allocated metadata (height > ip->i_height). There, we can end up calling hole_size with a metapath that doesn't match the current metadata tree, which doesn't make sense. After untangling the code at do_alloc, fix this by checking if the block we are looking for is within the range of allocated metadata. In addition, add a BUG() in case gfs2_iomap_begin is accidentally called for reading stuffed files: this is handled separately. Make sure we don't truncate iomap->length for reads beyond the end of the file; in that case, the entire range counts as a hole. Finally, revert to taking a bitmap write lock when doing allocations. It's unclear why that change didn't lead to any failures during testing. Signed-off-by: Andreas Gruenbacher <[email protected]> Signed-off-by: Bob Peterson <[email protected]>
2018-02-13rds: do not call ->conn_alloc with GFP_KERNELSowmini Varadhan1-1/+1
Commit ebeeb1ad9b8a ("rds: tcp: use rds_destroy_pending() to synchronize netns/module teardown and rds connection/workq management") adds an rcu read critical section to __rd_conn_create. The memory allocations in that critcal section need to use GFP_ATOMIC to avoid sleeping. This patch was verified with syzkaller reproducer. Reported-by: [email protected] Fixes: ebeeb1ad9b8a ("rds: tcp: use rds_destroy_pending() to synchronize netns/module teardown and rds connection/workq management") Signed-off-by: Sowmini Varadhan <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-13Merge tag 'mips_4.16_2' of ↵Linus Torvalds3-0/+22
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips Pull MIPS fix from James Hogan: "A single change (and associated DT binding update) to allow the address of the MIPS Cluster Power Controller (CPC) to be chosen by DT, which allows SMP to work on generic MIPS kernels where the bootloader hasn't configured the CPC address (i.e. the new Ranchu platform)" * tag 'mips_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips: MIPS: CPC: Map registers using DT in mips_cpc_default_phys_base() dt-bindings: Document mti,mips-cpc binding
2018-02-13Merge branch 'net-sched-couple-of-fixes'David S. Miller2-18/+32
Jiri Pirko says: ==================== net: sched: couple of fixes This patchset contains couple of fixes following-up the shared block patchsets. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-02-13net: sched: fix tc_u_common lookupJiri Pirko1-4/+20
The offending commit wrongly assumes 1:1 mapping between block and q. However, there are multiple blocks for a single q for classful qdiscs. Since the obscure tc_u_common sharing mechanism expects it to be shared among a qdisc, fix it by storing q pointer in case the block is not shared. Reported-by: Paweł Staszewski <[email protected]> Reported-by: Cong Wang <[email protected]> Fixes: 7fa9d974f3c2 ("net: sched: cls_u32: use block instead of q in tc_u_common") Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-13net: sched: don't set q pointer for shared blocksJiri Pirko1-14/+12
It is pointless to set block->q for block which are shared among multiple qdiscs. So remove the assignment in that case. Do a bit of code reshuffle to make block->index initialized at that point so we can use tcf_block_shared() helper. Reported-by: Cong Wang <[email protected]> Fixes: 4861738775d7 ("net: sched: introduce shared filter blocks infrastructure") Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-13mlxsw: spectrum_router: Fix error path in mlxsw_sp_vr_createJiri Pirko1-14/+18
Since mlxsw_sp_fib_create() and mlxsw_sp_mr_table_create() use ERR_PTR macro to propagate int err through return of a pointer, the return value is not NULL in case of failure. So if one of the calls fails, one of vr->fib4, vr->fib6 or vr->mr4_table is not NULL and mlxsw_sp_vr_is_used wrongly assumes that vr is in use which leads to crash like following one: [ 1293.949291] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c9 [ 1293.952729] IP: mlxsw_sp_mr_table_flush+0x15/0x70 [mlxsw_spectrum] Fix this by using local variables to hold the pointers and set vr->* only in case everything went fine. Fixes: 76610ebbde18 ("mlxsw: spectrum_router: Refactor virtual router handling") Fixes: a3d9bc506d64 ("mlxsw: spectrum_router: Extend virtual routers with IPv6 support") Fixes: d42b0965b1d4 ("mlxsw: spectrum_router: Add multicast routes notification handling functionality") Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-13net: af_unix: fix typo in UNIX_SKB_FRAGS_SZ commentTobias Klauser1-1/+1
Change "minimun" to "minimum". Signed-off-by: Tobias Klauser <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-13powerpc/macio: set a proper dma_coherent_maskChristoph Hellwig1-0/+1
We have expected busses to set up a coherent mask to properly use the common dma mapping code for a long time, and now that I've added a warning macio turned out to not set one up yet. This sets it to the same value as the dma_mask, which seems to be what the drivers expect. Reported-by: Mathieu Malaterre <[email protected]> Tested-by: Mathieu Malaterre <[email protected]> Reported-by: Meelis Roos <[email protected]> Tested-by: Meelis Roos <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2018-02-13uapi/if_ether.h: move __UAPI_DEF_ETHHDR libc defineHauke Mehrtens2-7/+5
This fixes a compile problem of some user space applications by not including linux/libc-compat.h in uapi/if_ether.h. linux/libc-compat.h checks which "features" the header files, included from the libc, provide to make the Linux kernel uapi header files only provide no conflicting structures and enums. If a user application mixes kernel headers and libc headers it could happen that linux/libc-compat.h gets included too early where not all other libc headers are included yet. Then the linux/libc-compat.h would not prevent all the redefinitions and we run into compile problems. This patch removes the include of linux/libc-compat.h from uapi/if_ether.h to fix the recently introduced case, but not all as this is more or less impossible. It is no problem to do the check directly in the if_ether.h file and not in libc-compat.h as this does not need any fancy glibc header detection as glibc never provided struct ethhdr and should define __UAPI_DEF_ETHHDR by them self when they will provide this. The following test program did not compile correctly any more: #include <linux/if_ether.h> #include <netinet/in.h> #include <linux/in.h> int main(void) { return 0; } Fixes: 6926e041a892 ("uapi/if_ether.h: prevent redefinition of struct ethhdr") Reported-by: Guillaume Nault <[email protected]> Cc: <[email protected]> # 4.15 Signed-off-by: Hauke Mehrtens <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-13blk: optimization for classic pollingNitesh Shetty1-0/+1
This removes the dependency on interrupts to wake up task. Set task state as TASK_RUNNING, if need_resched() returns true, while polling for IO completion. Earlier, polling task used to sleep, relying on interrupt to wake it up. This made some IO take very long when interrupt-coalescing is enabled in NVMe. Reference: http://lists.infradead.org/pipermail/linux-nvme/2018-February/015435.html Changes since v2->v3: -using __set_current_state() instead of set_current_state() Changes since v1->v2: -setting task state once in blk_poll, instead of multiple callers. Signed-off-by: Nitesh Shetty <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-02-13x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pagesTony Luck5-18/+26
In the following commit: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages") ... we added code to memory_failure() to unmap the page from the kernel 1:1 virtual address space to avoid speculative access to the page logging additional errors. But memory_failure() may not always succeed in taking the page offline, especially if the page belongs to the kernel. This can happen if there are too many corrected errors on a page and either mcelog(8) or drivers/ras/cec.c asks to take a page offline. Since we remove the 1:1 mapping early in memory_failure(), we can end up with the page unmapped, but still in use. On the next access the kernel crashes :-( There are also various debug paths that call memory_failure() to simulate occurrence of an error. Since there is no actual error in memory, we don't need to map out the page for those cases. Revert most of the previous attempt and keep the solution local to arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when: 1) there is a real error 2) memory_failure() succeeds. All of this only applies to 64-bit systems. 32-bit kernel doesn't map all of memory into kernel space. It isn't worth adding the code to unmap the piece that is mapped because nobody would run a 32-bit kernel on a machine that has recoverable machine checks. Signed-off-by: Tony Luck <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Robert (Persistent Memory) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] #v4.14 Fixes: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages") Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13locking/semaphore: Update the file path in documentationTycho Andersen1-1/+1
While reading this header I noticed that the locking stuff has moved to kernel/locking/*, so update the path in semaphore.h to point to that. Signed-off-by: Tycho Andersen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13locking/atomic/bitops: Document and clarify ordering semantics for failed ↵Will Deacon2-2/+8
test_and_{}_bit() A test_and_{}_bit() operation fails if the value of the bit is such that the modification does not take place. For example, if test_and_set_bit() returns 1. In these cases, follow the behaviour of cmpxchg and allow the operation to be unordered. This also applies to test_and_set_bit_lock() if the lock is found to be be taken already. Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13locking/qspinlock: Ensure node->count is updated before initialising nodeWill Deacon1-0/+8
When queuing on the qspinlock, the count field for the current CPU's head node is incremented. This needn't be atomic because locking in e.g. IRQ context is balanced and so an IRQ will return with node->count as it found it. However, the compiler could in theory reorder the initialisation of node[idx] before the increment of the head node->count, causing an IRQ to overwrite the initialised node and potentially corrupt the lock state. Avoid the potential for this harmful compiler reordering by placing a barrier() between the increment of the head node->count and the subsequent node initialisation. Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13locking/qspinlock: Ensure node is initialised before updating prev->nextWill Deacon1-6/+7
When a locker ends up queuing on the qspinlock locking slowpath, we initialise the relevant mcs node and publish it indirectly by updating the tail portion of the lock word using xchg_tail. If we find that there was a pre-existing locker in the queue, we subsequently update their ->next field to point at our node so that we are notified when it's our turn to take the lock. This can be roughly illustrated as follows: /* Initialise the fields in node and encode a pointer to node in tail */ tail = initialise_node(node); /* * Exchange tail into the lockword using an atomic read-modify-write * operation with release semantics */ old = xchg_tail(lock, tail); /* If there was a pre-existing waiter ... */ if (old & _Q_TAIL_MASK) { prev = decode_tail(old); smp_read_barrier_depends(); /* ... then update their ->next field to point to node. WRITE_ONCE(prev->next, node); } The conditional update of prev->next therefore relies on the address dependency from the result of xchg_tail ensuring order against the prior initialisation of node. However, since the release semantics of the xchg_tail operation apply only to the write portion of the RmW, then this ordering is not guaranteed and it is possible for the CPU to return old before the writes to node have been published, consequently allowing us to point prev->next to an uninitialised node. This patch fixes the problem by making the update of prev->next a RELEASE operation, which also removes the reliance on dependency ordering. Signed-off-by: Will Deacon <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13x86/error_inject: Make just_return_func() globally visibleArnd Bergmann1-0/+1
With link time optimizations enabled, I get a link failure: ./ccLbOEHX.ltrans19.ltrans.o: In function `override_function_with_return': <artificial>:(.text+0x7f3): undefined reference to `just_return_func' Marking the symbol .globl makes it work as expected. Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Josef Bacik <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nicolas Pitre <[email protected]> Cc: Peter Zijlstra <[email protected]> Fixes: 540adea3809f ("error-injection: Separate error-injection from kprobe") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13x86/platform/UV: Fix GAM Range Table entries less than 1GB[email protected]1-3/+12
The latest UV platforms include the new ApachePass NVDIMMs into the UV address space. This has introduced address ranges in the Global Address Map Table that are less than the previous lowest range, which was 2GB. Fix the address calculation so it accommodates address ranges from bytes to exabytes. Signed-off-by: Mike Travis <[email protected]> Reviewed-by: Andrew Banman <[email protected]> Reviewed-by: Dimitri Sivanich <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russ Anderson <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13MIPS: Fix incorrect mem=X@Y handlingMarcin Nowakowski1-4/+12
Commit 73fbc1eba7ff ("MIPS: fix mem=X@Y commandline processing") added a fix to ensure that the memory range between PHYS_OFFSET and low memory address specified by mem= cmdline argument is not later processed by free_all_bootmem. This change was incorrect for systems where the commandline specifies more than 1 mem argument, as it will cause all memory between PHYS_OFFSET and each of the memory offsets to be marked as reserved, which results in parts of the RAM marked as reserved (Creator CI20's u-boot has a default commandline argument 'mem=256M@0x0 mem=768M@0x30000000'). Change the behaviour to ensure that only the range between PHYS_OFFSET and the lowest start address of the memories is marked as protected. This change also ensures that the range is marked protected even if it's only defined through the devicetree and not only via commandline arguments. Reported-by: Mathieu Malaterre <[email protected]> Signed-off-by: Marcin Nowakowski <[email protected]> Fixes: 73fbc1eba7ff ("MIPS: fix mem=X@Y commandline processing") Cc: Ralf Baechle <[email protected]> Cc: [email protected] Cc: <[email protected]> # v4.11+ Tested-by: Mathieu Malaterre <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/18562/ Signed-off-by: James Hogan <[email protected]>
2018-02-13x86/build: Add arch/x86/tools/insn_decoder_test to .gitignoreProgyan Bhattacharya1-0/+1
The file was generated by make command and should not be in the source tree. Signed-off-by: Progyan Bhattacharya <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13sched/cpufreq: Remove unused SUGOV_KTHREAD_PRIORITY macroLeo Yan1-2/+0
Since schedutil kernel thread directly set priority to 0, the macro SUGOV_KTHREAD_PRIORITY is not used. So remove it. Signed-off-by: Leo Yan <[email protected]> Acked-by: Viresh Kumar <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J . Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vikram Mulukutla <[email protected]> Cc: Vincent Guittot <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-13MIPS: BMIPS: Fix section mismatch warningJaedon Shin1-1/+1
Remove the __init annotation from bmips_cpu_setup() to avoid the following warning. WARNING: vmlinux.o(.text+0x35c950): Section mismatch in reference from the function brcmstb_pm_s3() to the function .init.text:bmips_cpu_setup() The function brcmstb_pm_s3() references the function __init bmips_cpu_setup(). This is often because brcmstb_pm_s3 lacks a __init annotation or the annotation of bmips_cpu_setup is wrong. Signed-off-by: Jaedon Shin <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Florian Fainelli <[email protected]> Cc: Kevin Cernekee <[email protected]> Cc: [email protected] Reviewed-by: James Hogan <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Patchwork: https://patchwork.linux-mips.org/patch/18589/ Signed-off-by: James Hogan <[email protected]>
2018-02-13x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a ↵Masayoshi Mizuma1-1/+0
physical CPU When a physical CPU is hot-removed, the following warning messages are shown while the uncore device is removed in uncore_pci_remove(): WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988 uncore_pci_remove+0xf1/0x110 ... CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1 Workqueue: kacpi_hotplug acpi_hotplug_work_fn ... Call Trace: pci_device_remove+0x36/0xb0 device_release_driver_internal+0x145/0x210 pci_stop_bus_device+0x76/0xa0 pci_stop_root_bus+0x44/0x60 acpi_pci_root_remove+0x1f/0x80 acpi_bus_trim+0x54/0x90 acpi_bus_trim+0x2e/0x90 acpi_device_hotplug+0x2bc/0x4b0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x141/0x340 worker_thread+0x47/0x3e0 kthread+0xf5/0x130 When uncore_pci_remove() runs, it tries to get the package ID to clear the value of uncore_extra_pci_dev[].dev[] by using topology_phys_to_logical_pkg(). The warning messesages are shown because topology_phys_to_logical_pkg() returns -1. arch/x86/events/intel/uncore.c: static void uncore_pci_remove(struct pci_dev *pdev) { ... phys_id = uncore_pcibus_to_physid(pdev->bus); ... pkg = topology_phys_to_logical_pkg(phys_id); // returns -1 for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) { if (uncore_extra_pci_dev[pkg].dev[i] == pdev) { uncore_extra_pci_dev[pkg].dev[i] = NULL; break; } } WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // <=========== HERE!! topology_phys_to_logical_pkg() tries to find cpuinfo_x86->phys_proc_id that matches the phys_pkg argument. arch/x86/kernel/smpboot.c: int topology_phys_to_logical_pkg(unsigned int phys_pkg) { int cpu; for_each_possible_cpu(cpu) { struct cpuinfo_x86 *c = &cpu_data(cpu); if (c->initialized && c->phys_proc_id == phys_pkg) return c->logical_proc_id; } return -1; } However, the phys_proc_id was already set to 0 by remove_siblinginfo() when the CPU was offlined. So, topology_phys_to_logical_pkg() cannot find the correct logical_proc_id and always returns -1. As the result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning messages are shown. What is worse is that the bogus 'pkg' index results in two bugs: - We dereference uncore_extra_pci_dev[] with a negative index - We fail to clean up a stale pointer in uncore_extra_pci_dev[][] To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo(). This should not cause any problems, because ->phys_proc_id is not used after it is hot-removed and it is re-set while hot-adding. Signed-off-by: Masayoshi Mizuma <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: <[email protected]> Fixes: 30bb9811856f ("x86/topology: Avoid wasting 128k for package id array") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>