Age | Commit message | Author | Files | Lines
2017-06-15 | firmware: dmi_scan: Check DMI structure length | Jean Delvare | 1 | -7/+16
Before accessing DMI data to record it for later, we should ensure that the DMI structures are large enough to contain the data in question. Signed-off-by: Jean Delvare <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Cc: Dmitry Torokhov <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Linus Walleij <[email protected]>
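The length check described above can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `dmi_byte_at` is a hypothetical helper, not a kernel function. It relies on one fact about the SMBIOS layout, namely that byte 1 of every structure holds the length of its formatted area.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's API): return the byte stored at
 * `offset` inside a DMI structure, or -1 if the structure is too short
 * to contain that offset. Byte 1 of every SMBIOS structure is the length
 * of its formatted area, header included.
 */
int dmi_byte_at(const uint8_t *dm, size_t offset)
{
    uint8_t length = dm[1];

    if (offset >= length)
        return -1;      /* structure too small: do not read past it */
    return dm[offset];
}
```

A caller would test for -1 before recording the value, so that a truncated structure is skipped instead of being read out of bounds.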
2017-06-15 | firmware: dmi: Fix permissions of product_family | Jean Delvare | 1 | -2/+2
This is not sensitive information like serial numbers, we can allow all users to read it. Fix odd alignment while we're here. Signed-off-by: Jean Delvare <[email protected]> Fixes: c61872c9833d ("firmware: dmi: Add DMI_PRODUCT_FAMILY identification string") Reviewed-by: Andy Shevchenko <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Cc: Dmitry Torokhov <[email protected]> Cc: Linus Walleij <[email protected]>
2017-06-15 | firmware: dmi_scan: Make dmi_walk and dmi_walk_early return real error codes | Andy Lutomirski | 2 | -5/+6
Currently they return -1 on error, which will confuse callers if they try to interpret it as a normal negative error code. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Darren Hart (VMware) <[email protected]> Signed-off-by: Jean Delvare <[email protected]>
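To illustrate why a bare -1 is confusing: on Linux, -1 is numerically the same as -EPERM, so a caller that treats the return value as a negative errno would report "Operation not permitted". The sketch below uses hypothetical function names, and the choice of -ENXIO for the "table not available" case is an assumption made for illustration.

```c
#include <errno.h>

/* Hypothetical "before": any failure is reported as a bare -1. */
int walk_sketch_old(int table_available)
{
    return table_available ? 0 : -1;
}

/* Hypothetical "after": failures use a real negative errno value.
 * -ENXIO is an assumed choice for "no table available". */
int walk_sketch_new(int table_available)
{
    return table_available ? 0 : -ENXIO;
}
```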
2017-06-15 | firmware: dmi_scan: Look for SMBIOS 3 entry point first | Jean Delvare | 1 | -1/+16
Since version 3.0.0 of the SMBIOS specification, there can be multiple entry points in memory, pointing to one or two DMI tables. If both a 32-bit ("_SM_") entry point and a 64-bit ("_SM3_") entry point are present, the specification requires that the latter points to a table which is a super-set of the table pointed to by the former. Therefore we should give preference to the 64-bit ("_SM3_") entry point. However, currently the code is picking the first valid entry point it finds. Per specification, we should look for a 64-bit ("_SM3_") entry point first, and if we can't find any, look for a 32-bit ("_SM_" or "_DMI_") entry point. Modify the code to do that. Signed-off-by: Jean Delvare <[email protected]>
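The two-pass search order described above can be sketched like this. It is a simplified user-space illustration, not the kernel code: the names `find_anchor` and `pick_entry_point` are hypothetical, checksum validation is omitted, and the legacy "_DMI_" anchor is left out. The 16-byte step reflects the paragraph alignment of entry points in the scanned memory area.

```c
#include <stddef.h>
#include <string.h>

enum entry_kind { ENTRY_NONE, ENTRY_SM3, ENTRY_SM };

/* Scan `buf` on 16-byte boundaries for `anchor`; return its offset or -1. */
static long find_anchor(const unsigned char *buf, size_t len,
                        const char *anchor)
{
    size_t alen = strlen(anchor);
    size_t off;

    for (off = 0; off + alen <= len; off += 16)
        if (memcmp(buf + off, anchor, alen) == 0)
            return (long)off;
    return -1;
}

/* First pass: 64-bit "_SM3_". Second pass, only if needed: 32-bit "_SM_". */
enum entry_kind pick_entry_point(const unsigned char *buf, size_t len,
                                 long *offset)
{
    long off = find_anchor(buf, len, "_SM3_");

    if (off >= 0) {
        *offset = off;
        return ENTRY_SM3;
    }
    off = find_anchor(buf, len, "_SM_");
    if (off >= 0) {
        *offset = off;
        return ENTRY_SM;
    }
    *offset = -1;
    return ENTRY_NONE;
}
```

Note that the "_SM3_" entry point wins even when a "_SM_" entry point appears at a lower address, which is the behaviour change the commit describes.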
2017-06-15 | fs: don't forget to put old mntns in mntns_install | Andrei Vagin | 1 | -0/+2
Fixes: 4f757f3cbf54 ("make sure that mntns_install() doesn't end up with referral for root") Cc: Al Viro <[email protected]> Signed-off-by: Andrei Vagin <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-15 | Hang/soft lockup in d_invalidate with simultaneous calls | Al Viro | 1 | -6/+4
It's not hard to trigger a bunch of d_invalidate() on the same dentry in parallel. They end up fighting each other - any dentry picked for removal by one will be skipped by the rest and we'll go for the next iteration through the entire subtree, even if everything is being skipped. Moreover, we immediately go back to scanning the subtree. The only thing we really need is to dissolve all mounts in the subtree and as soon as we've nothing left to do, we can just unhash the dentry and bugger off. Signed-off-by: Al Viro <[email protected]>
2017-06-15 | MIPS: .its targets depend on vmlinux | Paul Burton | 1 | -5/+5
The .its targets require information about the kernel binary, such as its entry point, which is extracted from the vmlinux ELF. We therefore require that the ELF is built before the .its files are generated. Declare this requirement in the Makefile such that make will ensure this is always the case, otherwise in corner cases we can hit issues as the .its is generated with an incorrect (either invalid or stale) entry point. Signed-off-by: Paul Burton <[email protected]> Fixes: cf2a5e0bb4c6 ("MIPS: Support generating Flattened Image Trees (.itb)") Cc: [email protected] Cc: stable <[email protected]> # v4.9+ Patchwork: https://patchwork.linux-mips.org/patch/16179/ Signed-off-by: Ralf Baechle <[email protected]>
2017-06-15 | MIPS: Fix bnezc/jialc return address calculation | Paul Burton | 1 | -1/+3
The code handling the pop76 opcode (ie. bnezc & jialc instructions) in __compute_return_epc_for_insn() needs to set the value of $31 in the jialc case, which is encoded with rs = 0. However its check to differentiate bnezc (rs != 0) from jialc (rs = 0) was unfortunately backwards, meaning that if we emulate a bnezc instruction we clobber $31 & if we emulate a jialc instruction it actually behaves like a jic instruction. Fix this by inverting the check of rs to match the way the instructions are actually encoded. Signed-off-by: Paul Burton <[email protected]> Fixes: 28d6f93d201d ("MIPS: Emulate the new MIPS R6 BNEZC and JIALC instructions") Cc: stable <[email protected]> # v4.0+ Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/16178/ Signed-off-by: Ralf Baechle <[email protected]>
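The corrected rs test can be sketched as below. This is a hedged illustration, not the kernel's `__compute_return_epc_for_insn()`: the function names are hypothetical, the rs field is taken from bits 25..21 of the instruction word as in the standard MIPS encoding, and the link value of epc + 4 assumes a compact jump with no delay slot.

```c
#include <stdint.h>

/* In the pop76 encoding, rs == 0 means jialc and rs != 0 means bnezc. */
static int pop76_is_jialc(uint32_t insn)
{
    uint32_t rs = (insn >> 21) & 0x1f;

    return rs == 0;
}

/*
 * Hypothetical emulation step: only jialc links, so only jialc may
 * write $31. The buggy version had this condition inverted.
 */
void pop76_update_link(uint32_t insn, uint32_t epc, uint32_t gpr[32])
{
    if (pop76_is_jialc(insn))
        gpr[31] = epc + 4;
}
```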
2017-06-15 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | Linus Torvalds | 136 | -491/+810
Pull networking fixes from David Miller:
1) The netlink attribute passed in to dev_set_alias() is not necessarily NULL terminated, don't use strlcpy() on it. From Alexander Potapenko.
2) Fix implementation of atomics in arm64 bpf JIT, from Daniel Borkmann.
3) Correct the release of netdevs and driver private data in certain circumstances.
4) Sanitize netlink message length properly in decnet, from Mateusz Jurczyk.
5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From Yuval Mintz.
6) Hash secret is never initialized in ipv6 ILA translation code, from Arnd Bergmann. I guess those clang warnings about unused inline functions are useful for something!
7) Fix endian selection in bpf_endian.h, from Daniel Borkmann.
8) Sanitize sockaddr length before dereferencing any fields in AF_UNIX and CAIF. From Mateusz Jurczyk.
9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario Molitor.
10) Do not leak netdev on dev_alloc_name() errors in mac80211, from Johannes Berg.
11) Fix locking in sctp_for_each_endpoint(), from Xin Long.
12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle.
13) Fix use after free in ip_mc_clear_src(), from WANG Cong.
14) Fix regressions caused by ICMP rate limiting changes in 4.11, from Jesper Dangaard Brouer.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
  i40e: Fix a sleep-in-atomic bug
  net: don't global ICMP rate limit packets originating from loopback
  net/act_pedit: fix an error code
  net: update undefined ->ndo_change_mtu() comment
  net_sched: move tcf_lock down after gen_replace_estimator()
  caif: Add sockaddr length check before accessing sa_family in connect handler
  qed: fix dump of context data
  qmi_wwan: new Telewell and Sierra device IDs
  net: phy: Fix MDIO_THUNDER dependencies
  netconsole: Remove duplicate "netconsole: " logging prefix
  igmp: acquire pmc lock for ip_mc_clear_src()
  r8152: give the device version
  net: rps: fix uninitialized symbol warning
  mac80211: don't send SMPS action frame in AP mode when not needed
  mac80211/wpa: use constant time memory comparison for MACs
  mac80211: set bss_info data before configuring the channel
  mac80211: remove 5/10 MHz rate code from station MLME
  mac80211: Fix incorrect condition when checking rx timestamp
  mac80211: don't look at the PM bit of BAR frames
  i40e: fix handling of HW ATR eviction
  ...
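Item 1 of the pull message above concerns copying a buffer that is not guaranteed to be NUL terminated. strlcpy() computes the length of its source string, so it can read past the end of such a buffer. A minimal user-space sketch of the safer pattern, copying by explicit length and terminating by hand, follows; `copy_alias` is a hypothetical name and this is not the actual dev_set_alias() code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes from a possibly unterminated buffer into `dst` and add
 * the terminator ourselves. Returns the number of bytes copied, or -1 if
 * the data plus terminator would not fit in `dst_size` bytes.
 */
int copy_alias(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len);  /* never reads beyond src[len - 1] */
    dst[len] = '\0';
    return (int)len;
}
```

The length would come from the attribute's own length field, so the source is never scanned for a terminator it might not have.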
2017-06-15 | Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 | Linus Torvalds | 4 | -5/+16
Pull crypto fix from Herbert Xu: "This fixes a bug on sparc where we may dereference freed stack memory"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: Work around deallocated stack frame reference gcc bug on sparc.
2017-06-15 | Merge tag 'acpi-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds | 3 | -18/+39
Pull ACPI fixes from Rafael Wysocki: "These revert an ACPICA commit from the 4.11 cycle that causes problems to happen on some systems and add a protection against possible kernel crashes due to table reference counter imbalance.
Specifics:
- Revert a 4.11 ACPICA change that made assumptions which are not satisfied on some systems and caused the enumeration of resources to fail on them (Rafael Wysocki).
- Add a mechanism to prevent tables from being unmapped prematurely due to reference counter overflows (Lv Zheng)"
* tag 'acpi-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance
  Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
2017-06-15 | Merge tag 'pm-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds | 5 | -9/+16
Pull power management fixes from Rafael Wysocki: "These revert a recent cpufreq schedutil governor change that turned out to be problematic and fix a few minor issues in cpufreq, cpuidle and the Exynos devfreq drivers.
Specifics:
- Revert a recent cpufreq schedutil governor change that caused some systems to behave undesirably (Rafael Wysocki).
- Fix a cpufreq conservative governor issue introduced during the 3.10 cycle that prevents it from working as expected in some situations (Tomasz Wilczyński).
- Fix an error code path in the generic cpuidle driver for DT-based systems (Christophe Jaillet).
- Fix three minor issues in devfreq drivers for Exynos (Arvind Yadav, Krzysztof Kozlowski)"
* tag 'pm-4.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: dt: Add missing 'of_node_put()'
  cpufreq: conservative: Allow down_threshold to take values from 1 to 10
  Revert "cpufreq: schedutil: Reduce frequencies slower"
  PM / devfreq: exynos-ppmu: Staticize event list
  PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
  PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
2017-06-15 | Merge branch 'for-4.12/driver-matching-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid | Linus Torvalds | 1 | -61/+221
Pull HID fix from Jiri Kosina:
- ifdef-based bandaid for a long-standing issue with HID driver matching, avoiding regressions in cases where specific driver is not enabled in kernel .config, from Jiri Kosina
* 'for-4.12/driver-matching-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: let generic driver yield control iff specific driver has been enabled
2017-06-15 | Merge tag 'media/v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media | Linus Torvalds | 8 | -10/+22
Pull media fixes from Mauro Carvalho Chehab:
- some build dependency issues at CEC core with randconfigs
- fix an off by one error at vb2
- a race fix at cec core
- driver fixes at tc358743, sir_ir and rainshadow-cec
* tag 'media/v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] media/cec.h: use IS_REACHABLE instead of IS_ENABLED
  [media] cec: race fix: don't return -ENONET in cec_receive()
  [media] sir_ir: infinite loop in interrupt handler
  [media] cec-notifier.h: handle unreachable CONFIG_CEC_CORE
  [media] cec: improve MEDIA_CEC_RC dependencies
  [media] vb2: Fix an off by one error in 'vb2_plane_vaddr'
  [media] rainshadow-cec: Fix missing spin_lock_init()
  [media] tc358743: fix register i2c_rd/wr function fix
2017-06-15 | ufs_truncate_blocks(): fix the case when size is in the last direct block | Al Viro | 1 | -9/+12
The logic for deciding whether we need to do anything with direct blocks is broken when the new size is within the last direct block. It's better to find the path to the last byte _not_ to be removed and use that instead of the path to the beginning of the first block to be freed... Signed-off-by: Al Viro <[email protected]>
2017-06-15 | ufs: more deadlock prevention on tail unpacking | Al Viro | 1 | -1/+1
->s_lock is not needed for ufs_change_blocknr() Signed-off-by: Al Viro <[email protected]>
2017-06-15 | ufs: avoid grabbing ->truncate_mutex if possible | Al Viro | 2 | -10/+26
tail unpacking is done in a wrong place; the deadlocks galore is best dealt with by doing that in ->write_iter() (and switching to iomap, while we are at it), but that's rather painful to backport. The trouble comes from grabbing pages that cover the beginning of tail from inside of ufs_new_fragments(); ongoing pageout of any of those is going to deadlock on ->truncate_mutex with process that got around to extending the tail holding that and waiting for page to get unlocked, while ->writepage() on that page is waiting on ->truncate_mutex. The thing is, we don't need ->truncate_mutex when the fragment we are trying to map is within the tail - the damn thing is allocated (tail can't contain holes). Let's do a plain lookup and if the fragment is present, we can just pretend that we'd won the race in almost all cases. The only exception is a fragment between the end of tail and the end of block containing tail. Protect ->i_lastfrag with ->meta_lock - read_seqlock_excl() is sufficient. Signed-off-by: Al Viro <[email protected]>
2017-06-14 | i40e: Fix a sleep-in-atomic bug | Jia-Ju Bai | 1 | -0/+2
The driver may sleep under a spin lock, and the function call path is:
i40e_ndo_set_vf_port_vlan (acquire the lock by spin_lock_bh)
  i40e_vsi_remove_pvid
    i40e_vlan_stripping_disable
      i40e_aq_update_vsi_params
        i40e_asq_send_command
          mutex_lock --> may sleep
To fix it, the spin lock is released before "i40e_vsi_remove_pvid", and the lock is acquired again after this function. Signed-off-by: Jia-Ju Bai <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-14 | ufs_get_locked_page(): make sure we have buffer_heads | Al Viro | 1 | -9/+8
callers rely upon that, but find_lock_page() racing with attempt of page eviction by memory pressure might have left us with
* try_to_free_buffers() successfully done
* __remove_mapping() failed, leaving the page in our mapping
* find_lock_page() returning an uptodate page with no buffer_heads attached.
Signed-off-by: Al Viro <[email protected]>
2017-06-15 | Merge branch 'acpica-fixes' | Rafael J. Wysocki | 3 | -18/+39
* acpica-fixes:
  ACPICA: Tables: Mechanism to handle late stage acpi_get_table() imbalance
  Revert "ACPICA: Disassembler: Enhance resource descriptor detection"
2017-06-15 | Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'pm-devfreq' | Rafael J. Wysocki | 12368 | -283946/+1317754
* pm-cpufreq:
  cpufreq: conservative: Allow down_threshold to take values from 1 to 10
  Revert "cpufreq: schedutil: Reduce frequencies slower"
* pm-cpuidle:
  cpuidle: dt: Add missing 'of_node_put()'
* pm-devfreq:
  PM / devfreq: exynos-ppmu: Staticize event list
  PM / devfreq: exynos-ppmu: Handle return value of clk_prepare_enable
  PM / devfreq: exynos-nocp: Handle return value of clk_prepare_enable
2017-06-14 | ufs: fix s_size/s_dsize users | Al Viro | 4 | -24/+19
For UFS2 we need 64bit variants; we even store them in uspi, but use 32bit ones instead. One wrinkle is in handling of reserved space - recalculating it every time had been stupid all along, but now it would become really ugly. Just calculate it once... Signed-off-by: Al Viro <[email protected]>
2017-06-14 | ufs: fix reserved blocks check | Al Viro | 1 | -4/+6
a) honour ->s_minfree; don't just go with default (5)
b) don't bother with capability checks until we know we'll need them
Signed-off-by: Al Viro <[email protected]>
2017-06-14 | ufs: make ufs_freespace() return signed | Al Viro | 1 | -2/+2
as it is, checking that its return value is <= 0 is useless and that's how it's being used. Signed-off-by: Al Viro <[email protected]>
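Why the `<= 0` check is useless on an unsigned return value can be shown with a small sketch. The function names and the simplified "free minus reserved" arithmetic are assumptions for illustration, not the actual ufs_freespace() body.

```c
#include <stdint.h>

/* Unsigned variant: going below zero wraps around to a huge value. */
uint64_t freespace_unsigned(uint64_t free_blocks, uint64_t reserved)
{
    return free_blocks - reserved;
}

/* Signed variant: a shortfall shows up as a negative number. */
int64_t freespace_signed(uint64_t free_blocks, uint64_t reserved)
{
    return (int64_t)free_blocks - (int64_t)reserved;
}
```

With 3 free blocks and 5 reserved, the unsigned variant wraps to a very large positive number, so a caller's `<= 0` test can only ever catch an exact zero. The signed variant returns -2 and the test behaves as intended.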
2017-06-14 | net: don't global ICMP rate limit packets originating from loopback | Jesper Dangaard Brouer | 2 | -3/+7
Florian Weimer seems to have a glibc test-case which requires that loopback interfaces do not get ICMP ratelimited. This was broken by commit c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited"). An ICMP response will usually be routed back out the same incoming interface. Thus, take advantage of this and skip global ICMP ratelimit when the incoming device is loopback. In the unlikely event that the outgoing interface is not loopback, due to strange routing policy rules, ICMP rate limiting still works via peer ratelimiting via icmpv4_xrlim_allow(). Thus, we should still comply with RFC1812 (section 4.3.2.8 "Rate Limiting"). This seems to fix the reproducer given by Florian, while still avoiding an expensive and unneeded outgoing route lookup for rate limited packets (in the non-loopback case). Fixes: c0303efeab73 ("net: reduce cycles spend on ICMP replies that gets rate limited") Reported-by: Florian Weimer <[email protected]> Reported-by: "H.J. Lu" <[email protected]> Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-14 | block: Fix a blk_exit_rl() regression | Bart Van Assche | 2 | -12/+24
Avoid that the following complaint is reported: BUG: sleeping function called from invalid context at kernel/workqueue.c:2790 in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3 1 lock held by rcuop/3/41: #0: (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500 Call Trace: dump_stack+0x86/0xcf ___might_sleep+0x174/0x260 __might_sleep+0x4a/0x80 flush_work+0x7e/0x2e0 __cancel_work_timer+0x143/0x1c0 cancel_work_sync+0x10/0x20 blk_throtl_exit+0x25/0x60 blkcg_exit_queue+0x35/0x40 blk_release_queue+0x42/0x130 kobject_put+0xa9/0x190 This happens since we invoke callbacks that need to block from the queue release handler. Fix this by pushing the final release to a workqueue. Reported-by: Ross Zwisler <[email protected]> Fixes: commit b425e5049258 ("block: Avoid that blk_exit_rl() triggers a use-after-free") Signed-off-by: Bart Van Assche <[email protected]> Tested-by: Ross Zwisler <[email protected]> Updated changelog Signed-off-by: Jens Axboe <[email protected]>
2017-06-14 | rdma/cxgb4: Fix memory leaks during module exit | Raju Rangoju | 1 | -3/+7
Fix memory leaks of iw_cxgb4 module in the exit path Signed-off-by: Raju Rangoju <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14 | net/act_pedit: fix an error code | Dan Carpenter | 1 | -1/+3
I'm reviewing static checker warnings where we do ERR_PTR(0), which is the same as NULL. I'm pretty sure we intended to return ERR_PTR(-EINVAL) here. Sometimes these bugs lead to a NULL dereference but I don't immediately see that problem here. Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers") Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Amir Vadai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
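The point that ERR_PTR(0) is the same as NULL can be checked with a user-space re-implementation of the kernel's error-pointer helpers. The definitions below are a simplified sketch modelled on include/linux/err.h, not a copy of it, and they assume a platform where `long` and pointers have the same width.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
    return (void *)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)ptr;
}

/* Error pointers occupy the top MAX_ERRNO addresses; NULL is not one. */
static inline int IS_ERR(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}
```

A function that returns ERR_PTR(0) therefore hands back NULL, which passes an IS_ERR() check in the caller and can later be dereferenced. Returning ERR_PTR(-EINVAL) makes the failure visible.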
2017-06-14 | ufs: fix logics in "ufs: make fsck -f happy" | Al Viro | 1 | -13/+28
Storing stats _only_ at new locations is wrong for UFS1; old locations should always be kept updated. The check for "has been converted to use of new locations" is also wrong - it should be "->fs_maxbsize is equal to ->fs_bsize". Signed-off-by: Al Viro <[email protected]>
2017-06-14 | IB/ipoib: Fix memory leak in create child syscall | Feras Daoud | 1 | -3/+3
The flow of creating a new child goes through ipoib_vlan_add which allocates a new interface and checks the rtnl_lock. If the lock is taken, restart_syscall will be called to restart the system call again. In this case we are not releasing the already allocated interface, causing a leak. Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support") Signed-off-by: Feras Daoud <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14 | IB/ipoib: Fix access to un-initialized napi struct | Alex Vesker | 1 | -1/+0
There is no need to re-enable napi since we set the initialized flag before calling ipoib_ib_dev_stop which will disable napi, disabling napi twice is harmless in case it was already disabled. One more reason for this fix is that when using IPoIB new device driver napi is not added to priv, this can lead to kernel panic when rn_ops ndo_open fails. [ 289.755840] invalid opcode: 0000 [#1] SMP [ 289.757111] task: ffff880036964440 ti: ffff880178ee8000 task.ti: ffff880178ee8000 [ 289.757111] RIP: 0010:[<ffffffffa05368d6>] [<ffffffffa05368d6>] napi_enable.part.24+0x4/0x6 [ib_ipoib] [ 289.757111] RSP: 0018:ffff880178eeb6d8 EFLAGS: 00010246 [ 289.757111] RAX: 0000000000000000 RBX: ffff880177a80010 RCX: 000000007fffffff [ 289.757111] RDX: ffffffff81d5f118 RSI: 0000000000000000 RDI: ffff880177a80010 [ 289.757111] RBP: ffff880178eeb6d8 R08: 0000000000000082 R09: 0000000000000283 [ 289.757111] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880175a00000 [ 289.757111] R13: ffff880177a80080 R14: 0000000000000000 R15: 0000000000000001 [ 289.757111] FS: 00007fe2ee346880(0000) GS:ffff88017fc00000(0000) knlGS:0000000000000000 [ 289.757111] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 289.757111] CR2: 00007fffca979020 CR3: 00000001792e4000 CR4: 00000000000006f0 [ 289.757111] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 289.757111] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 289.757111] Stack: [ 289.796027] ffff880178eeb6f0 ffffffffa05251f5 ffff880177a80000 ffff880178eeb718 [ 289.796027] ffffffffa0528505 ffff880175a00000 ffff880177a80000 0000000000000000 [ 289.796027] ffff880178eeb748 ffffffffa051f0ab ffff880175a00000 ffffffffa0537d60 [ 289.796027] Call Trace: [ 289.796027] [<ffffffffa05251f5>] napi_enable+0x25/0x30 [ib_ipoib] [ 289.796027] [<ffffffffa0528505>] ipoib_ib_dev_open+0x175/0x190 [ib_ipoib] [ 289.796027] [<ffffffffa051f0ab>] ipoib_open+0x4b/0x160 [ib_ipoib] [ 289.796027] [<ffffffff814fe33f>] _dev_open+0xbf/0x130 [ 
289.796027] [<ffffffff814fe62d>] __dev_change_flags+0x9d/0x170 [ 289.796027] [<ffffffff814fe729>] dev_change_flags+0x29/0x60 [ 289.796027] [<ffffffff8150caf7>] do_setlink+0x397/0xa40 Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14 | IB/ipoib: Delete napi in device uninit default | Alex Vesker | 1 | -0/+3
This patch makes init_default and uninit_default symmetric with a call to delete napi. Additionally, uninit_default gained a delete napi call in case init_default fails. Fixes: 515ed4f3aab4 ('IB/IPoIB: Separate control and data related initializations') Signed-off-by: Alex Vesker <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14 | IB/ipoib: Limit call to free rdma_netdev for capable devices | Alex Vesker | 2 | -3/+8
Limit calls to free_rdma_netdev() for capable devices only. Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14 | IB/ipoib: Fix memory leaks for child interfaces priv | Alex Vesker | 2 | -2/+10
There is a need to free priv explicitly and not just to release the device, child priv is freed explicitly on remove flow and this patch also includes priv free on error flow in P_key creation and also in add_port. Fixes: cd565b4b51e5 ('IB/IPoIB: Support acceleration options callbacks') Signed-off-by: Alex Vesker <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14 | net: update undefined ->ndo_change_mtu() comment | Magnus Damm | 1 | -2/+1
Update ->ndo_change_mtu() callback comment to remove text about returning error in case of undefined callback. This change makes the comment match the existing code behavior. Signed-off-by: Magnus Damm <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-14 | perf tools: Fix build with ARCH=x86_64 | Jiada Wang | 6 | -25/+25
With commit: 0a943cb10ce78 (tools build: Add HOSTARCH Makefile variable) when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of ARCH=x86, so the perf build process searches header files from tools/arch/x86_64/include, which doesn't exist. The following build failure is seen: In file included from util/event.c:2:0: tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory compilation terminated. Fix this issue by using SRCARCH instead of ARCH in perf, just like the main kernel Makefile and tools/objtool's. Signed-off-by: Jiada Wang <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Eugeniu Rosca <[email protected]> Cc: Jan Stancek <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rui Teng <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 0a943cb10ce7 ("tools build: Add HOSTARCH Makefile variable") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-06-14 | perf evsel: Fix probing of precise_ip level for default cycles event | Arnaldo Carvalho de Melo | 2 | -1/+13
Since commit 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done in the routine used to probe the max precise level when no events were passed to 'perf record' or 'perf top', i.e.: perf_evsel__new_cycles() perf_event_attr__set_max_precise_ip() The x86 code, in x86_pmu_hw_config(), which is called all the way from sys_perf_event_open() did, starting with the aforementioned commit: /* There's no sense in having PEBS for non sampling events: */ if (!is_sampling_event(event)) return -EINVAL; Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using just the non precise cycles variant. To make sure that this is the case, I tested it, before this patch, with: # perf probe -L x86_pmu_hw_config <x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0> 0 int x86_pmu_hw_config(struct perf_event *event) 1 { 2 if (event->attr.precise_ip) { <SNIP> 17 if (event->attr.precise_ip > precise) 18 return -EOPNOTSUPP; /* There's no sense in having PEBS for non sampling events: */ 21 if (!is_sampling_event(event)) 22 return -EINVAL; } <SNIP> # perf probe x86_pmu_hw_config:22 Added new events: probe:x86_pmu_hw_config (on x86_pmu_hw_config:22) probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22) You can now use it in all perf tools, such as: perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1 # perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1 0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 
0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 
0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4 41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5 41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6 41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ] # I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times. 
So fix it by just setting attr.sample_period. Now, after this patch:
# perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
[ perf record: Woken up 1 times to write data ]
0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_event_open_cloexec_flag (/home/acme/bin/perf)
0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_evlist__config (/home/acme/bin/perf)
0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_evlist__config (/home/acme/bin/perf)
0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
#
The probe one called from perf_event_attr__set_max_precise_ip() works the first time, with attr.precise_ip = 3, with the next ones being the
per cpu ones for the cycles:ppp event. And here is the text from a report and alternative proposed patch by Thomas-Mich Richter: --- On s390 the counter and sampling facility do not support a precise IP skid level and sometimes return EOPNOTSUPP when structure member precise_ip in struct perf_event_attr is not set to zero. On s390 the command 'perf record -- true' fails with error EOPNOTSUPP. This happens only when no events are specified on the command line. The functions called are ... --> perf_evlist__add_default --> perf_evsel__new_cycles --> perf_event_attr__set_max_precise_ip The last function determines the value of structure member precise_ip by invoking the perf_event_open() system call and checking the return code. The first successful open is the value for precise_ip. However the value is determined without setting member sample_period and indicates no sampling. On s390 the counter facility and sampling facility are different. The above procedure determines a precise_ip value of 3 using the counter facility. Later it uses the sampling facility with a value of 3 and fails with EOPNOTSUPP. --- v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members of unnamed union members in the container struct initialization, so move from: struct perf_event_attr attr = { ... .sample_period = 1, }; to right after it as: struct perf_event_attr attr = { ... }; attr.sample_period = 1; v3: We need to reset .sample_period to 0 so that the users of perf_evsel__new_cycles() can properly set up attr.sample_period or attr.sample_freq. Reported by Ingo Molnar. 
Reported-and-Acked-by: Thomas-Mich Richter <[email protected]> Acked-by: Hendrik Brueckner <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip") Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
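The v2 and v3 notes in the message above can be sketched in isolation. This is only an illustration, not the code from the patch: struct demo_attr is a hypothetical, cut-down stand-in for struct perf_event_attr, and both helper names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct perf_event_attr: only the fields needed here. */
struct demo_attr {
	uint32_t type;
	uint64_t config;
	union {			/* unnamed union, mirroring the layout described in v2 */
		uint64_t sample_period;
		uint64_t sample_freq;
	};
	unsigned int precise_ip;
};

/* Probe-time setup: initialize the named members, then set the union member. */
static struct demo_attr demo_attr_for_probe(void)
{
	struct demo_attr attr = {
		.type   = 0,
		.config = 0,
	};

	/* v2: assigned after the initializer, which older compilers accept. */
	attr.sample_period = 1;
	return attr;
}

/* v3: clear the period so callers can pick sample_period or sample_freq. */
static void demo_attr_after_probe(struct demo_attr *attr)
{
	attr->sample_period = 0;
}
```

Because sample_period and sample_freq share storage, clearing one after the probe leaves both at zero for the caller to fill in.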
2017-06-14net_sched: move tcf_lock down after gen_replace_estimator()WANG Cong1-5/+3
Laura reported a sleep-in-atomic kernel warning inside tcf_act_police_init(), which calls gen_replace_estimator() while holding a spinlock. The spinlock is not necessary in this case: we already hold the RTNL lock here, which is enough to protect against concurrent writers. The reader, i.e. tcf_act_police(), needs to make a decision based on this rate estimator; in the worst case we drop more or fewer packets than necessary while the rate is being changed in parallel, which is still acceptable. Reported-by: Laura Abbott <[email protected]> Reported-by: Nick Huber <[email protected]> Cc: Jamal Hadi Salim <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-14ceph: unify inode i_ctime updateYan, Zheng2-3/+3
Currently __ceph_setattr() can set the inode's i_ctime to current_time(), req->r_stamp or attr->ia_ctime. These time stamps may differ slightly, which can cause problems. Signed-off-by: "Yan, Zheng" <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-06-14ceph: use current_kernel_time() to get request time stampYan, Zheng1-3/+1
ceph uses ktime_get_real_ts() to get the request time stamp. In most other cases, current_kernel_time() is used to get the time stamp for filesystem operations (called by current_time()). There is a granularity difference between ktime_get_real_ts() and current_kernel_time(). The latter can be up to one jiffy behind the former. This can cause an inode's ctime to go backwards. Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
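The granularity problem described above can be modelled with plain arithmetic. The sketch below is not kernel code: the two demo_*_now() helpers are hypothetical models of a fine-grained clock and a tick-granular clock, and the 4 ms tick (HZ=250) is an assumption chosen only for illustration.

```c
#include <stdint.h>

/* Assumed tick length for illustration: HZ=250, i.e. one jiffy = 4 ms. */
#define DEMO_TICK_NS 4000000ULL

/* Fine-grained clock model: reports the exact time it is asked about. */
static uint64_t demo_fine_now(uint64_t real_ns)
{
	return real_ns;
}

/* Coarse clock model: reports the time of the most recent tick. */
static uint64_t demo_coarse_now(uint64_t real_ns)
{
	return real_ns - (real_ns % DEMO_TICK_NS);
}

/*
 * Returns 1 if a stamp taken at later_ns with the coarse clock is older
 * than a stamp taken at earlier_ns with the fine clock, i.e. if mixing
 * the two clocks would make a ctime appear to go backwards.
 */
static int demo_ctime_goes_back(uint64_t earlier_ns, uint64_t later_ns)
{
	return demo_coarse_now(later_ns) < demo_fine_now(earlier_ns);
}
```

In this model the stamp only goes backwards when both updates fall within the same tick, which matches the "up to one jiffy behind" wording in the message.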
2017-06-14ceph: check i_nlink while converting a file handle to dentryLuis Henriques1-0/+4
Converting a file handle to a dentry can be done after the inode has been unlinked. This means that __fh_to_dentry() requires an extra check to verify that the number of links is not 0. The issue can be easily reproduced using xfstest generic/426, which does something like: name_to_handle_at(&fh) echo 3 > /proc/sys/vm/drop_caches unlink() open_by_handle_at(&fh) The call to open_by_handle_at() should fail, as the file doesn't exist anymore. Link: http://tracker.ceph.com/issues/19958 Signed-off-by: Luis Henriques <[email protected]> Reviewed-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-06-14rxe: Fix a sleep-in-atomic bug in post_one_sendJia-Ju Bai1-7/+2
The driver may sleep under a spin lock, and the function call path is: post_one_send (acquire the lock by spin_lock_irqsave) init_send_wqe copy_from_user --> may sleep There is no flow that makes "qp->is_user" true, and copy_from_user may cause a bug when a non-user pointer is used. So the copy_from_user lines and the check of "qp->is_user" are removed. Signed-off-by: Jia-Ju Bai <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Acked-by: Moni Shoua <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14RDMA/qedr: Add 64KB PAGE_SIZE support to user-space queuesRam Amrani2-28/+41
Add 64KB PAGE_SIZE support to user-space CQ, SQ and RQ queues. De-facto it means that code was added to translate 64KB pages to smaller 4KB pages that the FW can handle. Otherwise, the FW would wrap (or jump to the next page) when reaching 4KB while the user space library will continue on the same large page. Note that MR code remains as is since the FW supports larger pages for MRs. Signed-off-by: Ram Amrani <[email protected]> Signed-off-by: Michal Kalderon <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
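The page translation mentioned above can be pictured as splitting one host page into consecutive 4KB chunks. The following is a minimal sketch under that reading, not the qedr implementation; demo_split_host_page() and DEMO_FW_PAGE_SIZE are hypothetical names introduced for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_FW_PAGE_SIZE 4096u	/* page size the FW is described as handling */

/*
 * Write the addresses of the FW-sized pages covering one host page into out.
 * Returns the number of entries written, or 0 if the arguments are unusable.
 */
static size_t demo_split_host_page(uint64_t host_page_addr, uint32_t host_page_size,
				   uint64_t *out, size_t out_len)
{
	size_t n, i;

	/* The host page must be a whole multiple of the FW page size. */
	if (host_page_size < DEMO_FW_PAGE_SIZE || host_page_size % DEMO_FW_PAGE_SIZE)
		return 0;

	n = host_page_size / DEMO_FW_PAGE_SIZE;
	if (n > out_len)
		return 0;

	for (i = 0; i < n; i++)
		out[i] = host_page_addr + (uint64_t)i * DEMO_FW_PAGE_SIZE;

	return n;
}
```

With a 64KB host page this yields 16 entries, so the FW-visible page list advances every 4KB while user space keeps writing into the same large page.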
2017-06-14RDMA/qedr: Initialize byte_len in WC of READ and SEND commandsMichal Kalderon1-0/+4
Initialize byte_len in work completion of RDMA_READ and RDMA_SEND. Exposed by uDAPL application. Signed-off-by: Michal Kalderon <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14RDMA/bnxt_re: Remove FMR supportSelvin Xavier3-106/+2
Some issues were observed with the FMR implementation while running stress traffic, so remove the FMR verbs support for now. Signed-off-by: Selvin Xavier <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14RDMA/bnxt_re: Fix RQE posting logicDevesh Sharma3-1/+20
This patch adds code to ring the RQ Doorbell aggressively so that the adapter can DMA RQ buffers sooner, instead of DMAing all WQEs in the post_recv WR list together at the end of the post_recv verb. Also use a spinlock to serialize RQ posting. Signed-off-by: Kalesh AP <[email protected]> Signed-off-by: Devesh Sharma <[email protected]> Signed-off-by: Selvin Xavier <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPsSomnath Kotur4-0/+24
HW stalls out after 0x800000 WQEs are posted for UD QPs. To work around this problem, the driver will send a modify_qp cmd to the HW at around the halfway mark (0x400000) so that FW can accordingly modify the QP context in the HW to prevent this stall. This workaround needs to be done for UD, QP1 and Raw Ethertype packets. Added a counter to keep track of WQEs posted during post_send. Signed-off-by: Somnath Kotur <[email protected]> Signed-off-by: Selvin Xavier <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
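The counter-based trigger described above might look roughly like the sketch below. It is not the bnxt_re code: struct demo_qp and demo_post_send_account() are hypothetical, and resetting the counter once the workaround is requested is an assumption made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Halfway mark from the commit text; the stall itself is at 0x800000 WQEs. */
#define DEMO_TRIGGER_WQE_COUNT 0x400000u

/* Hypothetical per-QP state: WQEs posted since the last workaround. */
struct demo_qp {
	uint32_t wqe_cnt;
};

/* Count one posted WQE; return true when the modify_qp workaround is due. */
static bool demo_post_send_account(struct demo_qp *qp)
{
	qp->wqe_cnt++;
	if (qp->wqe_cnt >= DEMO_TRIGGER_WQE_COUNT) {
		/* Assumption: counting restarts once the workaround is requested. */
		qp->wqe_cnt = 0;
		return true;
	}
	return false;
}
```

A caller would invoke this once per WQE in its post_send path and issue the modify_qp command whenever it returns true.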
2017-06-14RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_listSelvin Xavier1-2/+6
If the host buffers are freed before destroying MR in HW, HW could try accessing these buffers. This could cause a host crash. Fixing the code to avoid this condition. Signed-off-by: Selvin Xavier <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14RDMA/bnxt_re: HW workarounds for handling specific conditionsEddie Wai7-70/+509
This patch implements the following HW workarounds: 1. The SQ depth needs to be augmented by 128 + 1 to avoid running into an Out of order CQE issue. 2. Workaround to handle the problem where the HW fast path engine continues to access DMA memory in retransmission mode even after the WQE has already been completed. If the HW reports this condition, the driver detects it and posts a Fence WQE. The driver stops reporting completions to the stack until it receives the completion for the Fence WQE. Signed-off-by: Eddie Wai <[email protected]> Signed-off-by: Sriharsha Basavapatna <[email protected]> Signed-off-by: Selvin Xavier <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-06-14drm/radeon: Fix overflow of watermark calcs at > 4k resolutions.Mario Kleiner3-6/+15
Commit e6b9a6c84b93 ("drm/radeon: Make display watermark calculations more accurate") made watermark calculations more accurate, but not for > 4k resolutions on 32-Bit architectures, as it introduced an integer overflow for those setups and resolutions. Fix this by proper u64 casting and division. Signed-off-by: Mario Kleiner <[email protected]> Reported-by: Ben Hutchings <[email protected]> Fixes: e6b9a6c84b93 ("drm/radeon: Make display watermark calculations more accurate") Cc: Ben Hutchings <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Signed-off-by: Alex Deucher <[email protected]>
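The class of overflow described above can be reproduced outside the kernel. This is a generic sketch, not the radeon watermark code: the two helpers are hypothetical and only contrast 32-bit and 64-bit intermediate arithmetic.

```c
#include <stdint.h>

/* 32-bit arithmetic: the product wraps modulo 2^32 before the division. */
static uint32_t demo_scale_u32(uint32_t a, uint32_t b, uint32_t div)
{
	return (a * b) / div;
}

/* 64-bit arithmetic: widen one operand first, then multiply and divide. */
static uint32_t demo_scale_u64(uint32_t a, uint32_t b, uint32_t div)
{
	return (uint32_t)(((uint64_t)a * b) / div);
}
```

For example, with a = 5120, b = 1000000 and div = 1000 the 32-bit version returns 825032 while the widened version returns 5120000. In kernel code on 32-bit architectures the 64-bit division would typically go through a helper such as div_u64() rather than the plain / operator.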