Age | Commit message | Author | Files | Lines
2013-12-20 | IB/uverbs: Check input length in flow steering uverbs | Yann Droneaud | 1 | -0/+6
Since ib_copy_from_udata() does not yet check the available input data length before accessing userspace memory, an explicit check of this length is required to prevent: - reading past the user-provided buffer, - underflow when subtracting the expected command size from the input length. This ensures the newly added flow steering uverbs don't try to process truncated commands. Link: http://marc.info/[email protected]> Signed-off-by: Yann Droneaud <[email protected]> Signed-off-by: Roland Dreier <[email protected]>
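The length check described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the actual uverbs code: `struct demo_cmd` and `demo_payload_len()` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size command header; not the real uverbs layout. */
struct demo_cmd {
    uint32_t comp_mask;
    uint32_t handle;
};

/*
 * Validate the caller-supplied input length before touching the buffer.
 * On success, store the number of bytes that follow the fixed-size
 * command in *payload_len and return 0; otherwise return -EINVAL.
 */
int demo_payload_len(size_t in_len, size_t *payload_len)
{
    assert(payload_len != NULL);

    /* A truncated command is rejected before any copy or subtraction. */
    if (in_len < sizeof(struct demo_cmd))
        return -EINVAL;

    /* Safe: in_len >= sizeof(struct demo_cmd), so this cannot wrap around. */
    *payload_len = in_len - sizeof(struct demo_cmd);
    return 0;
}
```

Doing the comparison first is what prevents the unsigned subtraction from wrapping to a huge value when the input is too short.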
2013-12-20 | IB/uverbs: Set error code when fail to consume all flow_spec items | Yann Droneaud | 1 | -0/+1
If the count of parsed flow_spec items does not match the number of items declared in the flow_attr command, or if not all bytes are used for flow_spec items (e.g. trailing garbage), a log message is reported and the function leaves through the error path. Unfortunately the error code is currently not set. This patch sets the error code to -EINVAL in such cases, so that the error is reported to userspace instead of failing silently. Link: http://marc.info/[email protected]> Signed-off-by: Yann Droneaud <[email protected]> Signed-off-by: Roland Dreier <[email protected]>
2013-12-20 | IB/uverbs: Check reserved fields in create_flow | Yann Droneaud | 1 | -0/+7
As noted by Daniel Vetter in his article "Botching up ioctls"[1] "Check *all* unused fields and flags and all the padding for whether it's 0, and reject the ioctl if that's not the case. Otherwise your nice plan for future extensions is going right down the gutters since someone *will* submit an ioctl struct with random stack garbage in the yet unused parts. Which then bakes in the ABI that those fields can never be used for anything else but garbage." It's important to ensure that reserved fields are set to a known value, so that it will be possible to use them later to extend the ABI. The same reasoning applies to the comp_mask field present in newer uverbs commands: per commit 22878dbc9173 ("IB/core: Better checking of userspace values for receive flow steering"), unsupported values in comp_mask are rejected. [1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html Link: http://marc.info/[email protected]> Signed-off-by: Yann Droneaud <[email protected]> Signed-off-by: Roland Dreier <[email protected]>
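A minimal sketch of the "reject non-zero reserved fields" rule quoted above, written as userspace C. The structure layout, the `DEMO_SUPPORTED_COMP_MASK` value and the function name are assumptions made for illustration; they are not taken from the uverbs ABI.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command with a feature mask and reserved space. */
struct demo_create_cmd {
    uint32_t comp_mask;     /* feature bits; none are defined yet */
    uint16_t num_items;
    uint16_t reserved;      /* must be zero until given a meaning */
    uint8_t  reserved2[4];  /* must be zero until given a meaning */
};

/* Assumption for this sketch: no comp_mask bits are supported yet. */
#define DEMO_SUPPORTED_COMP_MASK 0u

/* Return 0 if the command is acceptable, -EINVAL otherwise. */
int demo_validate_cmd(const struct demo_create_cmd *cmd)
{
    size_t i;

    assert(cmd != NULL);

    /* Unknown feature bits are rejected rather than ignored. */
    if (cmd->comp_mask & ~DEMO_SUPPORTED_COMP_MASK)
        return -EINVAL;

    /* Reserved fields must carry a known value (zero). */
    if (cmd->reserved != 0)
        return -EINVAL;
    for (i = 0; i < sizeof(cmd->reserved2); i++)
        if (cmd->reserved2[i] != 0)
            return -EINVAL;

    return 0;
}
```

Because garbage is refused today, a later ABI revision can assign meaning to `reserved` or to new `comp_mask` bits without breaking callers that passed validation.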
2013-12-20 | IB/uverbs: Check comp_mask in destroy_flow | Yann Droneaud | 1 | -0/+3
Just like the check added to create_flow in 22878dbc9173 ("IB/core: Better checking of userspace values for receive flow steering"), comp_mask must be checked in destroy_flow too. Since only empty comp_mask is currently supported, any other value must be rejected. This check was silently added in a previous patch[1] to move comp_mask in extended command header, part of previous patchset[2] against create/destroy_flow uverbs. The idea of moving comp_mask to the header was discarded for the final patchset[3]. Unfortunately the check added in destroy_flow uverb was not integrated in the final patchset. [1] http://marc.info/?i=40175eda10d670d098204da6aa4c327a0171ae5f.1381510045.git.ydroneaud@opteya.com [2] http://marc.info/[email protected] [3] http://marc.info/[email protected] Cc: Matan Barak <[email protected]> Link: http://marc.info/[email protected]> Signed-off-by: Yann Droneaud <[email protected]> Signed-off-by: Roland Dreier <[email protected]>
2013-12-20 | IB/uverbs: Check reserved field in extended command header | Yann Droneaud | 1 | -0/+3
As noted by Daniel Vetter in his article "Botching up ioctls"[1] "Check *all* unused fields and flags and all the padding for whether it's 0, and reject the ioctl if that's not the case. Otherwise your nice plan for future extensions is going right down the gutters since someone *will* submit an ioctl struct with random stack garbage in the yet unused parts. Which then bakes in the ABI that those fields can never be used for anything else but garbage." It's important to ensure that reserved fields are set to a known value, so that it will be possible to use them later to extend the ABI. The same reasoning applies to the comp_mask field present in newer uverbs commands: per commit 22878dbc9173 ("IB/core: Better checking of userspace values for receive flow steering"), unsupported values in comp_mask are rejected. [1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html Link: http://marc.info/[email protected]> Signed-off-by: Yann Droneaud <[email protected]> Signed-off-by: Roland Dreier <[email protected]>
2013-12-20 | IB/uverbs: New macro to set pointers to NULL if length is 0 in INIT_UDATA() | Roland Dreier | 2 | -11/+16
Trying to have a ternary operator to choose between NULL (or 0) and the real pointer value in invocations leads to an impossible choice between a sparse error about a literal 0 used as a NULL pointer, and a gcc warning about "pointer/integer type mismatch in conditional expression." Rather than clutter the source with more casts, move the ternary operator into a new INIT_UDATA_BUF_OR_NULL() macro, which makes it easier to use and simplifies its callers. Reported-by: Yann Droneaud <[email protected]> Signed-off-by: Roland Dreier <[email protected]>
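The idea of hiding the ternary inside a wrapper macro can be sketched as follows. This is a userspace approximation with hypothetical names (`struct demo_udata`, `demo_init_udata()`, `DEMO_INIT_UDATA_BUF_OR_NULL`); the kernel's real INIT_UDATA() works on `__user` pointers and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's udata descriptor. */
struct demo_udata {
    const void *inbuf;
    void *outbuf;
    size_t inlen;
    size_t outlen;
};

/* Plain initializer: stores exactly what it is given. */
void demo_init_udata(struct demo_udata *udata, const void *ibuf, void *obuf,
                     size_t ilen, size_t olen)
{
    assert(udata != NULL);
    udata->inbuf = ibuf;
    udata->outbuf = obuf;
    udata->inlen = ilen;
    udata->outlen = olen;
}

/*
 * Wrapper that substitutes NULL for any buffer whose length is 0, so
 * callers do not have to write the ternary (and its casts) themselves.
 */
#define DEMO_INIT_UDATA_BUF_OR_NULL(udata, ibuf, obuf, ilen, olen)  \
    demo_init_udata((udata),                                        \
                    (ilen) ? (const void *)(ibuf) : NULL,           \
                    (olen) ? (void *)(obuf) : NULL,                 \
                    (ilen), (olen))
```

Keeping both arms of each conditional as pointer types in one place is what avoids the pointer/integer mismatch at every call site.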
2013-12-20 | Merge tag 'signed-for-3.13' of git://github.com/agraf/linux-2.6 into kvm-master | Paolo Bonzini | 13 | -62/+112
Patch queue for 3.13 - 2013-12-18 This fixes some grave issues we've only found after 3.13-rc1: - Make the modularized HV/PR book3s kvm work well as modules - Fix some race conditions - Fix compilation with certain compilers (booke) - Fix THP for book3s_hv - Fix preemption for book3s_pr Alexander Graf (4): KVM: PPC: Book3S: PR: Don't clobber our exit handler id KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy KVM: PPC: Book3S: PR: Enable interrupts earlier Aneesh Kumar K.V (1): powerpc: book3s: kvm: Don't abuse host r2 in exit path Paul Mackerras (5): KVM: PPC: Book3S HV: Fix physical address calculations KVM: PPC: Book3S HV: Refine barriers in guest entry/exit KVM: PPC: Book3S HV: Make tbacct_lock irq-safe KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call KVM: PPC: Book3S HV: Don't drop low-order page address bits Scott Wood (1): powerpc/kvm/booke: Fix build break due to stack frame size warning pingfan liu (1): powerpc: kvm: fix rare but potential deadlock scene
2013-12-20 | Merge tag 'stable/for-linus-3.13-rc4-tag' of ↵ | Linus Torvalds | 6 | -44/+51
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: - Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use scratch pages. - Fix block API header for ARM32 and ARM64 to have proper layout - On ARM when mapping guests, stick on PTE_SPECIAL - When using SWIOTLB under ARM, don't call swiotlb functions twice - When unmapping guests memory and if we fail, don't return pages which failed to be unmapped. - Grant driver was using the wrong address on ARM. * tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Seperate the auto-translate logic properly (v2) xen/block: Correctly define structures in public headers on ARM32 and ARM64 arm: xen: foreign mapping PTEs are special. xen/arm64: do not call the swiotlb functions twice xen: privcmd: do not return pages which we have failed to unmap XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn
2013-12-20 | Merge tag 'trace-fixes-v3.13-rc2' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fix from Steven Rostedt: "This fixes a long standing bug in the ftrace profiler. The problem is that the profiler only initializes the online CPUs, and not possible CPUs. This causes issues if the user takes CPUs online or offline while the profiler is running. If we online a CPU after starting the profiler, we lose all the trace information on the CPU going online. If we offline a CPU after running a test and start a new test, it will not clear the old data from that CPU. This bug causes incorrect data to be reported to the user if they online or offline CPUs during the profiling" * tag 'trace-fixes-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Initialize the ftrace profiler for each possible cpu
2013-12-20 | Merge tag 'omap-for-v3.13/display-fix' of ↵ | Kevin Hilman | 522 | -2558/+4836
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes I accidentally removed some mux code for omap4 that I thought was dead code as omap4 has been booting with device tree only since v3.10. Turns out I also removed some display related mux code, so let's revert that except for the dead code parts. * tag 'omap-for-v3.13/display-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (439 commits) Revert "ARM: OMAP2+: Remove legacy mux code for display.c" +Linux 3.13-rc4
2013-12-20 | ext4: add explicit casts when masking cluster sizes | Theodore Ts'o | 3 | -17/+25
The missing casts can cause the upper bits of the 64-bit physical block numbers to be lost. Set up new macros which allow us to make sure the right thing happens, even if at some point we end up supporting larger logical block numbers. Thanks to Emese Revfy and the PaX security team for reporting this issue. Reported-by: PaX Team <[email protected]> Reported-by: Emese Revfy <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: [email protected]
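The effect of a missing cast when masking a 64-bit block number can be shown with a small userspace sketch. The function names are hypothetical and this is not the ext4 macro code; it assumes a platform where `unsigned int` is 32 bits wide and a cluster ratio that is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Round pblk down to a cluster boundary WITHOUT a widening cast (buggy). */
uint64_t demo_cluster_start_narrow(uint64_t pblk, uint32_t ratio)
{
    assert(ratio != 0 && (ratio & (ratio - 1)) == 0);  /* power of two */

    /* ~(ratio - 1) is computed in 32 bits and then zero-extended,
     * so the mask clears the upper 32 bits of pblk as well. */
    return pblk & ~(ratio - 1);
}

/* Same operation with the mask built in 64 bits (correct). */
uint64_t demo_cluster_start_wide(uint64_t pblk, uint32_t ratio)
{
    assert(ratio != 0 && (ratio & (ratio - 1)) == 0);  /* power of two */

    return pblk & ~((uint64_t)ratio - 1);
}
```

For a block number above 4G blocks, only the second variant keeps the high half of the value intact.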
2013-12-20 | netfilter: nf_ct_timestamp: Fix BUG_ON after netns deletion | Helmut Schaa | 1 | -1/+0
With nf_conntrack_timestamp enabled, deleting a netns can trigger the following BUG:
[63836.660000] Kernel bug detected[#1]:
[63836.660000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.18 #14
[63836.660000] task: 802d9420 ti: 802d2000 task.ti: 802d2000
[63836.660000] $ 0 : 00000000 00000000 00000000 00000000
[63836.660000] $ 4 : 00000001 00000004 00000020 00000020
[63836.660000] $ 8 : 00000000 80064910 00000000 00000000
[63836.660000] $12 : 0bff0002 00000001 00000000 0a0a0abe
[63836.660000] $16 : 802e70a0 85f29d80 00000000 00000004
[63836.660000] $20 : 85fb62a0 00000002 802d3bc0 85fb62a0
[63836.660000] $24 : 00000000 87138110
[63836.660000] $28 : 802d2000 802d3b40 00000014 871327cc
[63836.660000] Hi : 000005ff
[63836.660000] Lo : f2edd000
[63836.660000] epc : 87138794 __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack]
[63836.660000] Not tainted
[63836.660000] ra : 871327cc nf_conntrack_in+0x31c/0x7b8 [nf_conntrack]
[63836.660000] Status: 1100d403 KERNEL EXL IE
[63836.660000] Cause : 00800034
[63836.660000] PrId : 0001974c (MIPS 74Kc)
[63836.660000] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype xt_owner xt_nat xt_multiport xt_mark xh
[63836.660000] Process swapper (pid: 0, threadinfo=802d2000, task=802d9420, tls=00000000)
[63836.660000] Stack : 802e70a0 871323d4 00000005 87080234 802e70a0 86d2a840 00000000 00000000
[63836.660000] Call Trace:
[63836.660000] [<87138794>] __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack]
[63836.660000] [<871327cc>] nf_conntrack_in+0x31c/0x7b8 [nf_conntrack]
[63836.660000] [<801ff63c>] nf_iterate+0x90/0xec
[63836.660000] [<801ff730>] nf_hook_slow+0x98/0x164
[63836.660000] [<80205968>] ip_rcv+0x3e8/0x40c
[63836.660000] [<801d9754>] __netif_receive_skb_core+0x624/0x6a4
[63836.660000] [<801da124>] process_backlog+0xa4/0x16c
[63836.660000] [<801d9bb4>] net_rx_action+0x10c/0x1e0
[63836.660000] [<8007c5a4>] __do_softirq+0xd0/0x1bc
[63836.660000] [<8007c730>] do_softirq+0x48/0x68
[63836.660000] [<8007c964>] irq_exit+0x54/0x70
[63836.660000] [<80060830>] ret_from_irq+0x0/0x4
[63836.660000] [<8006a9f8>] r4k_wait_irqoff+0x18/0x1c
[63836.660000] [<8009cfb8>] cpu_startup_entry+0xa4/0x104
[63836.660000] [<802eb918>] start_kernel+0x394/0x3ac
[63836.660000]
[63836.660000] Code: 00821021 8c420000 2c440001 <00040336> 90440011 92350010 90560010 2485ffff 02a5a821
[63837.040000] ---[ end trace ebf660c3ce3b55e7 ]---
[63837.050000] Kernel panic - not syncing: Fatal exception in interrupt
[63837.050000] Rebooting in 3 seconds..
Fix this by not unregistering the conntrack extension in the per-netns cleanup code. This bug was introduced in (73f4001 netfilter: nf_ct_tstamp: move initialization out of pernet_operations).
Signed-off-by: Helmut Schaa <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
2013-12-20 | GFS2: Wait for async DIO in glock state changes | Steven Whitehouse | 1 | -2/+8
We need to wait for any outstanding DIO to complete in a couple of situations. Firstly, in case we are changing out of deferred mode (in inode_go_sync) where GLF_DIRTY will not be set. That call could be prefixed with a test for gl_state == LM_ST_DEFERRED but it doesn't seem worth it bearing in mind that the test for outstanding DIO is very quick anyway, in the usual case that there is none. The second case is in inode_go_lock which will catch the cases where we have a cached EX lock, but where we grant deferred locks against it so that there is no glock state transition. We only need to wait if the state is not deferred, since DIO is valid anyway in that state. Signed-off-by: Steven Whitehouse <[email protected]>
2013-12-20 | GFS2: Fix incorrect invalidation for DIO/buffered I/O | Steven Whitehouse | 1 | -0/+30
In patch 209806aba9d540dde3db0a5ce72307f85f33468f we allowed local deferred locks to be granted against a cached exclusive lock. That opened up a corner case which this patch now fixes. The solution to the problem is to check whether we have cached pages each time we do direct I/O and if so to unmap, flush and invalidate those pages. Since the glock state machine normally does that for us, mostly the code will be a no-op. Signed-off-by: Steven Whitehouse <[email protected]>
2013-12-20 | netfilter: nft_exthdr: call ipv6_find_hdr() with explicitly initialized offset | Daniel Borkmann | 1 | -1/+1
In nft's nft_exthdr_eval() routine we process IPv6 extension header through invoking ipv6_find_hdr(), but we call it with an uninitialized offset variable that contains some stack value. In ipv6_find_hdr() we then test if the value of offset != 0 and call skb_header_pointer() on that offset in order to map struct ipv6hdr into it. Fix it up by initializing offset to 0 as it was probably intended to be. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Daniel Borkmann <[email protected]> Cc: Hannes Frederic Sowa <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2013-12-19 | drm/radeon: fix asic gfx values for scrapper asics | Alex Deucher | 1 | -4/+16
Fixes gfx corruption on certain TN/RL parts. bug: https://bugs.freedesktop.org/show_bug.cgi?id=60389 Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2013-12-19 | dccp: catch failed request_module call in dccp_probe init | Wang Weidong | 1 | -12/+7
Check the return value of request_module during dccp_probe initialisation and bail out if that call fails. Signed-off-by: Gerrit Renker <[email protected]> Signed-off-by: Wang Weidong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | Merge branch 'master' of ↵ | David S. Miller | 6 | -8/+22
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates This series contains updates to net, ixgbe and e1000e. David provides compiler fixes for e1000e. Don provides a fix for ixgbe to resolve a compile warning. John provides a fix to net where it is useful to be able to walk all upper devices when bringing a device online where the RTNL lock is held. In this case, it is safe to walk the all_adj_list because the RTNL lock is used to protect the write side as well. This patch adds a check to see if the RTNL lock is held before throwing a warning in netdev_all_upper_get_next_dev_rcu(). ==================== Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | net: mvmdio: fix interrupt timeout handling | Leigh Brown | 1 | -0/+6
This version corrects the whitespace issue. orion_mdio_wait_ready uses wait_event_timeout to wait for the SMI interrupt to fire. wait_event_timeout waits for between "timeout - 1" and "timeout" jiffies. In this case a 1ms timeout when HZ is 1000 results in a wait of 0 to 1 jiffies, causing premature timeouts. This fix ensures a minimum timeout of 2 jiffies, ensuring wait_event_timeout will always wait at least 1 jiffy. Issue reported by Nicolas Schichan. Tested-by: Nicolas Schichan <[email protected]> Signed-off-by: Leigh Brown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
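The arithmetic behind the minimum timeout can be sketched in userspace C. The helper names are hypothetical and the millisecond conversion is a simplified stand-in for the kernel's own conversion; the only point illustrated is the clamp to 2 ticks described above.

```c
#include <assert.h>

/* Convert milliseconds to timer ticks, rounding up (hz = ticks per second). */
unsigned long demo_msecs_to_ticks(unsigned int ms, unsigned int hz)
{
    assert(hz > 0);
    return ((unsigned long)ms * hz + 999UL) / 1000UL;
}

/*
 * A wait of N ticks can end after only N - 1 full ticks, because the
 * current tick is already partly used up. Clamping to 2 guarantees
 * at least one full tick of waiting.
 */
unsigned long demo_wait_ticks(unsigned int ms, unsigned int hz)
{
    unsigned long ticks = demo_msecs_to_ticks(ms, hz);

    return ticks < 2 ? 2 : ticks;
}
```

With a 1 ms timeout and 1000 ticks per second, the unclamped value is 1 tick, which may elapse almost immediately; the clamped value is 2.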
2013-12-19 | atl1c: Check return from pci_find_ext_capability() in atl1c_reset_pcie() | Betty Dall | 1 | -3/+5
The function atl1c_reset_pcie() does not check the return from pci_find_ext_capability() where it is getting the position of the PCI_EXT_CAP_ID_ERR. It is possible for the return to be 0. Signed-off-by: Betty Dall <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | ipv6: always set the new created dst's from in ip6_rt_copy | Li RongQing | 1 | -3/+1
ip6_rt_copy only sets dst.from if ort has both the RTF_ADDRCONF and RTF_DEFAULT flags. However, prefix routes installed by hand locally can also have an expiration, and no flag combination guarantees that a potential from never expires, so we should always set the newly created dst's from. This also fixes the case where the newly created dst is always expired, since the ort, which is created by RA, may have RTF_EXPIRES and RTF_ADDRCONF but not RTF_DEFAULT. Suggested-by: Hannes Frederic Sowa <[email protected]> CC: Gao feng <[email protected]> Signed-off-by: Li RongQing <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | Merge tag 'keystone/maintainer-file' of ↵ | Kevin Hilman | 1 | -0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into fixes From Santosh Shilimkar: Couple of updates to MAINTAINERS file for Keystone - Add git tree information - Add clock drivers entry * tag 'keystone/maintainer-file' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone: MAINTAINERS: Add keystone clock drivers MAINTAINERS: Add keystone git tree information Signed-off-by: Kevin Hilman <[email protected]>
2013-12-19 | net: fec: fix potential use after free | Eric Dumazet | 1 | -2/+2
skb_tx_timestamp(skb) should be called _before_ TX completion has a chance to trigger, otherwise it is too late and we access freed memory. Signed-off-by: Eric Dumazet <[email protected]> Fixes: de5fb0a05348 ("net: fec: put tx to napi poll function to fix dead lock") Cc: Frank Li <[email protected]> Cc: Richard Cochran <[email protected]> Acked-by: Richard Cochran <[email protected]> Acked-by: Frank Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure | Nicholas Bellinger | 1 | -0/+1
This patch fixes a possible scsi_host reference leak in qlt_lport_register(), where a non-zero return from the passed (*callback) does not drop the local reference via scsi_host_put() before returning. This currently does not affect existing tcm_qla2xxx code as the passed callback will never fail, but fix this up regardless for future code. Cc: Chad Dupuis <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2013-12-19 | target: Remove extra percpu_ref_init | Andy Grover | 1 | -7/+1
lun->lun_ref is also initialized in core_tpg_post_addlun, so it doesn't need to be done in core_tpg_setup_virtual_lun0. (nab: Drop left-over percpu_ref_cancel_init in failure path) Signed-off-by: Andy Grover <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2013-12-19 | x86/efi: Don't select EFI from certain special ACPI drivers | Jan Beulich | 5 | -6/+5
Commit 7ea6c6c1 ("Move cper.c from drivers/acpi/apei to drivers/firmware/efi") results in CONFIG_EFI being enabled even when the user doesn't want this. Since ACPI APEI used to build fine without UEFI (and as far as I know also has no functional dependency on it), at least in that case using a reverse dependency is wrong (and a straight one isn't needed). Whether the same is true for ACPI_EXTLOG I don't know - if there is a functional dependency, it should depend on EFI rather than selecting it. It certainly has (currently) no build dependency. Adjust Kconfig and build logic so that the bad dependency gets avoided. Signed-off-by: Jan Beulich <[email protected]> Acked-by: Tony Luck <[email protected]> Cc: Matt Fleming <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-12-19 | bnx2x: downgrade "valid ME register value" message level | Michal Schmidt | 1 | -1/+1
"valid ME register value" is not an error. It should be logged for debugging only. Signed-off-by: Michal Schmidt <[email protected]> Acked-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | hamradio/yam: fix info leak in ioctl | Salva Peiró | 1 | -0/+1
The yam_ioctl() code fails to initialise the cmd field of the struct yamdrv_ioctl_cfg. Add an explicit memset(0) before filling the structure to avoid the 4-byte info leak. Signed-off-by: Salva Peiró <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl() | Wenliang Fan | 1 | -0/+2
The local variable 'bi' comes from userspace. If userspace passed a large number to 'bi.data.calibrate', there would be an integer overflow in the following line: s->hdlctx.calibrate = bi.data.calibrate * s->par.bitrate / 16; Signed-off-by: Wenliang Fan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
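One possible guard against the multiplication overflow described above is to compare the user-supplied value against INT_MAX divided by the bitrate before multiplying. The sketch below is userspace C with a hypothetical function name and assumed `int` operand types; it is not claimed to be the exact check that was merged.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Compute calibrate * bitrate / 16 without overflowing int.
 * Returns 0 and stores the result in *out, or -EINVAL if the
 * inputs are out of range.
 */
int demo_calibrate(int calibrate, int bitrate, int *out)
{
    assert(out != NULL);

    if (calibrate < 0 || bitrate <= 0)
        return -EINVAL;

    /* calibrate * bitrate would exceed INT_MAX: refuse instead of wrapping. */
    if (calibrate > INT_MAX / bitrate)
        return -EINVAL;

    *out = calibrate * bitrate / 16;
    return 0;
}
```

The division-based comparison is used because the overflow has to be detected before the multiplication is performed.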
2013-12-19 | xen-netback: fix some error return code | Wei Yongjun | 1 | -4/+12
'err' is overwritten to 0 after the maybe_pull_tail() call, so the error code was not set if the skb_partial_csum_set() call failed. Fix to return error -EPROTO in those error handling cases instead of 0. Fixes: d52eb0d46f36 ('xen-netback: make sure skb linear area covers checksum field') Signed-off-by: Wei Yongjun <[email protected]> Acked-by: Wei Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-12-19 | net: inet_diag: zero out uninitialized idiag_{src,dst} fields | Daniel Borkmann | 1 | -0/+16
Jakub reported while working with nlmon netlink sniffer that parts of the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6. That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3]. In fact, it seems that we can leak 6 * sizeof(u32) byte of kernel [slab] memory through this. At least, in udp_dump_one(), we allocate a skb in ... rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL); ... and then pass that to inet_sk_diag_fill() that puts the whole struct inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0], r->id.idiag_dst[0] and leave the rest untouched: r->id.idiag_src[0] = inet->inet_rcv_saddr; r->id.idiag_dst[0] = inet->inet_daddr; struct inet_diag_msg embeds struct inet_diag_sockid that is correctly / fully filled out in IPv6 case, but for IPv4 not. So just zero them out by using plain memset (for this little amount of bytes it's probably not worth the extra check for idiag_family == AF_INET). Similarly, fix also other places where we fill that out. Reported-by: Jakub Zawadzki <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
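The "zero the arrays, then fill slot 0" pattern described above can be sketched in userspace C. `struct demo_sockid` and `demo_fill_ipv4()` are hypothetical stand-ins for illustration and do not reproduce the real inet_diag structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical socket id with room for IPv6-sized addresses. */
struct demo_sockid {
    uint32_t src[4];
    uint32_t dst[4];
};

/*
 * Fill an IPv4 address pair. Only element 0 of each array carries data,
 * so the other elements are cleared first; otherwise stale memory would
 * be copied out to the reader along with the reply.
 */
void demo_fill_ipv4(struct demo_sockid *id, uint32_t saddr, uint32_t daddr)
{
    assert(id != NULL);

    memset(id->src, 0, sizeof(id->src));
    memset(id->dst, 0, sizeof(id->dst));

    id->src[0] = saddr;
    id->dst[0] = daddr;
}
```

Clearing unconditionally costs a few bytes of writes and avoids a separate branch on the address family.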
2013-12-19 | x86 idle: Repair large-server 50-watt idle-power regression | Len Brown | 2 | -1/+5
Linux 3.10 changed the timing of how thread_info->flags is touched: x86: Use generic idle loop (7d1a941731fabf27e5fb6edbebb79fe856edb4e5) This caused Intel NHM-EX and WSM-EX servers to experience a large number of immediate MONITOR/MWAIT break wakeups, which caused cpuidle to demote from deep C-states to shallow C-states, which caused these platforms to experience a significant increase in idle power. Note that this issue was already present before the commit above, however, it wasn't seen often enough to be noticed in power measurements. Here we extend an errata workaround from the Core2 EX "Dunnington" to extend to NHM-EX and WSM-EX, to prevent these immediate returns from MWAIT, reducing idle power on these platforms. While only acpi_idle ran on Dunnington, intel_idle may also run on these two newer systems. As of today, there are no other models that are known to need this tweak. Link: http://lkml.kernel.org/r/CAJvTdK=%[email protected] Signed-off-by: Len Brown <[email protected]> Link: http://lkml.kernel.org/r/baff264285f6e585df757d58b17788feabc68918.1387403066.git.len.brown@intel.com Cc: <[email protected]> # 3.12.x, 3.11.x, 3.10.x Signed-off-by: H. Peter Anvin <[email protected]>
2013-12-19 | libata, freezer: avoid block device removal while system is frozen | Tejun Heo | 2 | -0/+27
Freezable kthreads and workqueues are fundamentally problematic in that they effectively introduce a big kernel lock widely used in the kernel and have already been the culprit of several deadlock scenarios. This is the latest occurrence. During resume, libata rescans all the ports and revalidates all pre-existing devices. If it determines that a device has gone missing, the device is removed from the system which involves invalidating block device and flushing bdi while holding driver core layer locks. Unfortunately, this can race with the rest of device resume. Because freezable kthreads and workqueues are thawed after device resume is complete and block device removal depends on freezable workqueues and kthreads (e.g. bdi_wq, jbd2) to make progress, this can lead to deadlock - block device removal can't proceed because kthreads are frozen and kthreads can't be thawed because device resume is blocked behind block device removal. 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue") made this particular deadlock scenario more visible but the underlying problem has always been there - the original forker task and jbd2 are freezable too. In fact, this is highly likely just one of many possible deadlock scenarios given that freezer behaves as a big kernel lock and we don't have any debug mechanism around it. I believe the right thing to do is getting rid of freezable kthreads and workqueues. This is something fundamentally broken. For now, implement a funny workaround in libata - just avoid doing block device hot[un]plug while the system is frozen. Kernel engineering at its finest. :( v2: Add EXPORT_SYMBOL_GPL(pm_freezing) for cases where libata is built as a module. v3: Comment updated and polling interval changed to 10ms as suggested by Rafael. v4: Add #ifdef CONFIG_FREEZER around the hack as pm_freezing is not defined when FREEZER is not configured thus breaking build. Reported by kbuild test robot. 
Signed-off-by: Tejun Heo <[email protected]> Reported-by: Tomaž Šolc <[email protected]> Reviewed-by: "Rafael J. Wysocki" <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=62801 Link: http://lkml.kernel.org/r/[email protected] Cc: Greg Kroah-Hartman <[email protected]> Cc: Len Brown <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: [email protected] Cc: kbuild test robot <[email protected]>
2013-12-19 | arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events | Will Deacon | 1 | -20/+18
Commit 8f34a1da35ae ("arm64: ptrace: use HW_BREAKPOINT_EMPTY type for disabled breakpoints") fixed an issue with GDB trying to zero breakpoint control registers. The problem there is that the arch hw_breakpoint code will attempt to create a (disabled), execute breakpoint of length 0. This will fail validation and report unexpected failure to GDB. To avoid this, we treated disabled breakpoints as HW_BREAKPOINT_EMPTY, but that seems to have broken with recent kernels, causing watchpoints to be treated as TYPE_INST in the core code and returning ENOSPC for any further breakpoints. This patch fixes the problem by prioritising the `enable' field of the breakpoint: if it is cleared, we simply update the perf_event_attr to indicate that the thing is disabled and don't bother changing either the type or the length. This reinforces the behaviour that the breakpoint control register is essentially read-only apart from the enable bit when disabling a breakpoint. Cc: <[email protected]> Reported-by: Aaron Liu <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2013-12-19 | Merge branch 'sched-urgent-for-linus' of ↵ | Linus Torvalds | 2 | -2/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "An RT group-scheduling fix and the sched-domains topology setup fix from Mel" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities sched: Assign correct scheduling domain to 'sd_llc'
2013-12-19 | Merge branch 'perf-urgent-for-linus' of ↵ | Linus Torvalds | 2 | -3/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "An ABI documentation fix, and a mixed-PMU perf-info-corruption fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Document the new transaction sample type perf: Disable all pmus on unthrottling and rescheduling
2013-12-19 | Merge tag 'sound-3.13-rc5' of ↵ | Linus Torvalds | 16 | -45/+115
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "We have a bit more changes than usual in ASoC here, as it slipped from the previous update. There are one minor ASoC PCM code fix and ASoC dmaengine fix, in addition to a collection of small ASoC driver fixes. The rest are a couple of HD-audio stable fixups, and a long-standing fix for the paused stream handling. So, all commits look not scary (and hopefully won't give you disastrous holiday season)" * tag 'sound-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Add Dell headset detection quirk for one more laptop model ASoC: wm8904: fix DSP mode B configuration ASoC: wm_adsp: Add small delay while polling DSP RAM start ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function ASoC: kirkwood: Fix the CPU DAI rates ASoC: wm5110: Correct HPOUT3 DAPM route typo ALSA: hda - Add Dell headset detection quirk for three laptop models ALSA: hda - Add enable_msi=0 workaround for four HP machines ASoC: don't leak on error in snd_dmaengine_pcm_register ASoC: fsl: imx-wm8962: Don't update bias_level in machine driver ASoC: tegra: fix uninitialized variables in set_fmt ASoC: wm8962: Enable SYSCLK provisonally before fetching generated DSPCLK_DIV ASoC: sam9x5_wm8731: change to work in DSP A mode ASoC: atmel_ssc_dai: add dai trigger ops ASoC: soc-pcm: Use valid condition for snd_soc_dai_digital_mute() in hw_free()
2013-12-19 | null_blk: warning on ignored submit_queues param | Matias Bjorling | 1 | -2/+5
Let the user know when the number of submission queues is being ignored. Signed-off-by: Matias Bjorling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2013-12-19 | null_blk: refactor init and init errors code paths | Matias Bjorling | 1 | -25/+38
Simplify the initialization logic of the three block-layers. - The queue initialization is split into two parts. This allows reuse of code when initializing the sq-, bio- and mq-based layers. - Set submit_queues default value to 0 and always set it at init time. - Simplify the init error code paths. Signed-off-by: Matias Bjorling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2013-12-19 | null_blk: documentation | Matias Bjorling | 1 | -0/+71
Add description of module and its parameters. Signed-off-by: Matias Bjorling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2013-12-19 | null_blk: mem garbage on NUMA systems during init | Matias Bjorling | 1 | -4/+4
On NUMA systems, when the blk-mq layer is initialized with per-node hctx, we initialize submit queues to 1, while blk-mq nr_hw_queues is initialized to the number of NUMA nodes. This makes the null_init_hctx function overwrite memory outside of what it allocated. In my case it led to writing garbage into struct request_queue's mq_map. Signed-off-by: Matias Bjorling <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-12-19drivers: block: Mark the functions as static in skd_main.cRashika Kheria1-2/+2
Mark functions skd_skmsg_state_to_str() and skd_skreq_state_to_str() as static in skd_main.c because they are not used outside this file. This eliminates the following warnings in skd_main.c: drivers/block/skd_main.c:5272:13: warning: no previous prototype for ‘skd_skmsg_state_to_str’ [-Wmissing-prototypes] drivers/block/skd_main.c:5284:13: warning: no previous prototype for ‘skd_skreq_state_to_str’ [-Wmissing-prototypes] Signed-off-by: Rashika Kheria <[email protected]> Reviewed-by: Josh Triplett <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
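As a small illustration of why the change silences the warning: a function with internal linkage needs no prior prototype, so -Wmissing-prototypes does not apply to it. The enum and function names below are hypothetical and are not taken from skd_main.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical state enum, for illustration only. */
enum demo_state { DEMO_IDLE, DEMO_BUSY };

/*
 * static gives the function internal linkage: it is visible only in this
 * file, so the compiler does not expect a prototype in a header.
 */
static const char *demo_state_to_str(enum demo_state state)
{
    switch (state) {
    case DEMO_IDLE:
        return "IDLE";
    case DEMO_BUSY:
        return "BUSY";
    default:
        return "???";
    }
}

/* File-local self-check that exercises the helper. */
static void demo_self_check(void)
{
    assert(strcmp(demo_state_to_str(DEMO_IDLE), "IDLE") == 0);
}
```

Dropping the static keyword from demo_state_to_str() without adding a declaration in a header is the kind of definition that -Wmissing-prototypes reports.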
2013-12-19ARC: Allow conditional multiple inclusion of uapi/asm/unistd.hVineet Gupta1-1/+7
Commit 97bc386fc12d "ARC: Add guard macro to uapi/asm/unistd.h" inhibited multiple inclusion of ARCH unistd.h. This, however, hosed the system, since the generic syscall table generator relies on the header being included twice; without the second inclusion, an empty table was emitted by the C preprocessor. Fix that by allowing one exception to the rule for this special case (just like Xtensa). Suggested-by: Chen Gang <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-12-19Merge tag 'asoc-v3.13-rc4' of ↵Takashi Iwai541-2724/+5150
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v3.13 The fixes here are all driver specific ones, none of which particularly stand out but all of which are useful to users of those drivers.
2013-12-19Merge remote-tracking branches 'asoc/fix/adsp', 'asoc/fix/arizona', ↵Mark Brown11-32/+75
'asoc/fix/atmel', 'asoc/fix/fsl', 'asoc/fix/kirkwood', 'asoc/fix/tegra', 'asoc/fix/wm8904' and 'asoc/fix/wm8962' into asoc-linus
2013-12-19Merge remote-tracking branch 'asoc/fix/dma' into asoc-linusMark Brown1-11/+27
2013-12-19Merge remote-tracking branch 'asoc/fix/core' into asoc-linusMark Brown1-2/+3
2013-12-19ARM: shmobile: r8a7790: fix shdi resource sizesBen Dooks1-2/+2
The r8a7790.dtsi file has four sdhi nodes, of which the first two have the wrong resource size for their register block. This causes the sh_mobile_sdhi driver to fail to communicate with the card at all. Change the sdhi{0,1} node size from 0x100 to 0x200 to correct these nodes, as per Kuninori Morimoto's response to the original patch, where all four nodes were changed. sdhi{2,3} are the correct size. This bug has been present since sdhi resources were added to the r8a7790 by 8c9b1aa41853272a ("ARM: shmobile: r8a7790: add MMCIF and SDHI DT templates") in v3.11-rc2. Signed-off-by: Ben Dooks <[email protected]> Tested-by: William Towle <[email protected]> Acked-by: Kuninori Morimoto <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2013-12-19ARM: shmobile: bockw: fixup DMA maskKuninori Morimoto1-1/+1
4dcfa60071b3d23f0181f27d8519f12e37cefbb9 (ARM: DMA-API: better handing of DMA masks for coherent allocations) changed the DMA mask check method. Without this patch, the warning below appears: asoc-simple-card asoc-simple-card.0: \ Coherent DMA mask 0xffffffffffffffff is larger than dma_addr_t allows asoc-simple-card asoc-simple-card.0: \ Driver did not use or check the return value from dma_set_coherent_mask()? Signed-off-by: Kuninori Morimoto <[email protected]> Acked-by: Laurent Pinchart <[email protected]> Signed-off-by: Simon Horman <[email protected]>
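The warning text suggests an all-ones 64-bit mask being compared against a narrower dma_addr_t. The userspace sketch below shows that comparison under two assumptions: dma_addr_t is 32 bits wide on the affected platform, and the check reduces to "does the mask survive truncation". DEMO_DMA_BIT_MASK mirrors the shape of the kernel's DMA_BIT_MASK macro; the other names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n), reproduced for userspace. */
#define DEMO_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Assumption: dma_addr_t is 32 bits wide on the affected platform. */
typedef uint32_t demo_dma_addr_t;

/* Returns 1 if the mask is representable in demo_dma_addr_t, else 0. */
static int demo_mask_fits(uint64_t mask)
{
    assert(sizeof(demo_dma_addr_t) <= sizeof(uint64_t));
    return mask == (uint64_t)(demo_dma_addr_t)mask;
}
```

Here demo_mask_fits(~0ULL) fails, matching the 0xffffffffffffffff value in the warning, while a 32-bit mask built with DEMO_DMA_BIT_MASK(32) passes.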
2013-12-19target/file: Update hw_max_sectors based on current block_sizeNicholas Bellinger4-5/+14
This patch allows FILEIO to update hw_max_sectors based on the current max_bytes_per_io. This is required because vfs_[writev,readv]() can accept a maximum of 2048 iovecs per call, so the enforced hw_max_sectors really needs to be calculated based on block_size. This addresses a >= v3.5 bug where block_size=512 was rejecting > 1M sized I/O requests, because FD_MAX_SECTORS was hardcoded to 2048 for the block_size=4096 case. (v2: Use max_bytes_per_io instead of ->update_hw_max_sectors) Reported-by: Henrik Goldman <[email protected]> Cc: <[email protected]> #3.5+ Signed-off-by: Nicholas Bellinger <[email protected]>
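The arithmetic behind this can be sketched as follows. The 2048-iovec limit and the two block sizes come from the message above; the 4096-byte size per iovec, and therefore the 8 MiB byte budget, is an assumption inferred from the block_size=4096 case, and demo_hw_max_sectors() is a hypothetical helper rather than the target code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_IOVECS  2048u /* iovec limit per call, from the message */
#define DEMO_IOVEC_BYTES 4096u /* assumption: bytes covered by one iovec */

/* Byte budget per I/O under the assumptions above: 8 MiB. */
#define DEMO_MAX_BYTES_PER_IO (DEMO_MAX_IOVECS * DEMO_IOVEC_BYTES)

/* Sector limit derived from the byte budget and the current block size. */
static uint32_t demo_hw_max_sectors(uint32_t max_bytes_per_io,
                                    uint32_t block_size)
{
    assert(block_size != 0);
    return max_bytes_per_io / block_size;
}
```

With block_size=4096 this gives 2048 sectors, the previously hardcoded value. With block_size=512 it gives 16384 sectors, whereas a fixed 2048 sectors of 512 bytes caps a request at 1 MiB, which is the rejection the message describes.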