aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-02-28powerpc/powernv: Fix indirect XSCOM unmanglingBenjamin Herrenschmidt1-9/+12
We need to unmangle the full address, not just the register number, and we also need to support the real indirect bit being set for in-kernel uses. Signed-off-by: Benjamin Herrenschmidt <[email protected]> CC: <[email protected]> [v3.13]
2014-02-28powerpc/powernv: Fix opal_xscom_{read,write} prototypeBenjamin Herrenschmidt1-2/+2
The OPAL firmware functions opal_xscom_read and opal_xscom_write take a 64-bit argument for the XSCOM (PCB) address in order to support the indirect mode on P8. Signed-off-by: Benjamin Herrenschmidt <[email protected]> CC: <[email protected]> [v3.13]
2014-02-28powerpc/powernv: Refactor PHB diag-data dumpGavin Shan1-95/+125
As Ben suggested, the patch prints PHB diag-data with multiple fields in one line and omits the line if the fields of that line are all zero. With the patch applied, the PHB3 diag-data dump looks like: PHB3 PHB#3 Diag-data (Version: 1) brdgCtl: 00000002 RootSts: 0000000f 00400000 b0830008 00100147 00002000 nFir: 0000000000000000 0030006e00000000 0000000000000000 PhbSts: 0000001c00000000 0000000000000000 Lem: 0000000000100000 42498e327f502eae 0000000000000000 InAErr: 8000000000000000 8000000000000000 0402030000000000 0000000000000000 PE[ 8] A/B: 8480002b00000000 8000000000000000 [ The current diag data is so big that it overflows the printk buffer pretty quickly in cases when we get a handful of errors at once which can happen. --BenH ] Signed-off-by: Gavin Shan <[email protected]> CC: <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-28powerpc/powernv: Dump PHB diag-data immediatelyGavin Shan1-53/+43
The PHB diag-data is important to help locating the root cause for EEH errors such as frozen PE or fenced PHB. However, the EEH core enables IO path by clearing part of HW registers before collecting this data causing it to be corrupted. This patch fixes this by dumping the PHB diag-data immediately when frozen/fenced state on PE or PHB is detected for the first time in eeh_ops::get_state() or next_error() backend. Signed-off-by: Gavin Shan <[email protected]> CC: <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-28powerpc: Increase stack redzone for 64-bit userspace to 512 bytesPaul Mackerras3-5/+20
The new ELFv2 little-endian ABI increases the stack redzone -- the area below the stack pointer that can be used for storing data -- from 288 bytes to 512 bytes. This means that we need to allow more space on the user stack when delivering a signal to a 64-bit process. To make the code a bit clearer, we define new USER_REDZONE_SIZE and KERNEL_REDZONE_SIZE symbols in ptrace.h. For now, we leave the kernel redzone size at 288 bytes, since increasing it to 512 bytes would increase the size of interrupt stack frames correspondingly. Gcc currently only makes use of 288 bytes of redzone even when compiling for the new little-endian ABI, and the kernel cannot currently be compiled with the new ABI anyway. In the future, hopefully gcc will provide an option to control the amount of redzone used, and then we could reduce it even more. This also changes the code in arch_compat_alloc_user_space() to preserve the expanded redzone. It is not clear why this function would ever be used on a 64-bit process, though. Signed-off-by: Paul Mackerras <[email protected]> CC: <[email protected]> [v3.13] Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-28powerpc/ftrace: bugfix for test_24bit_addrLiu Ping Fan1-0/+1
The branch target should be the func addr, not the addr of func_descr_t. So using ppc_function_entry() to generate the right target addr. Signed-off-by: Liu Ping Fan <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-28powerpc/crashdump : Fix page frame number check in copy_oldmem_pageLaurent Dufour1-3/+5
In copy_oldmem_page, the current check using max_pfn and min_low_pfn to decide if the page is backed or not, is not valid when the memory layout is not continuous. This happens when running as a QEMU/KVM guest, where RTAS is mapped higher in the memory. In that case max_pfn points to the end of RTAS, and a hole between the end of the kdump kernel and RTAS is not backed by PTEs. As a consequence, the kdump kernel is crashing in copy_oldmem_page when accessing in a direct way the pages in that hole. This fix relies on the memblock's service memblock_is_region_memory to check if the read page is part or not of the directly accessible memory. Signed-off-by: Laurent Dufour <[email protected]> Tested-by: Mahesh Salgaonkar <[email protected]> CC: <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-28powerpc/le: Ensure that the 'stop-self' RTAS token is handled correctlyTony Breeds1-11/+11
Currently we're storing a host endian RTAS token in rtas_stop_self_args.token. We then pass that directly to rtas. This is fine on big endian however on little endian the token is not what we expect. This will typically result in hitting: panic("Alas, I survived.\n"); To fix this we always use the stop-self token in host order and always convert it to be32 before passing this to rtas. Signed-off-by: Tony Breeds <[email protected]> Cc: [email protected] Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-27ipv6: ipv6_find_hdr restore prev functionalityHans Schillstrom1-1/+1
The commit 9195bb8e381d81d5a315f911904cdf0cfcc919b8 ("ipv6: improve ipv6_find_hdr() to skip empty routing headers") broke ipv6_find_hdr(). When a target is specified like IPPROTO_ICMPV6 ipv6_find_hdr() returns -ENOENT when it's found, not the header as expected. A part of IPVS is broken and possible also nft_exthdr_eval(). When target is -1 which it is most cases, it works. This patch exits the do while loop if the specific header is found so the nexthdr could be returned as expected. Reported-by: Art -kwaak- van Breemen <[email protected]> Signed-off-by: Hans Schillstrom <[email protected]> CC:Ansis Atteka <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-27neigh: recompute reachabletime before returning from neigh_periodic_work()Duan Jiong1-3/+3
If the neigh table's entries is less than gc_thresh1, the function will return directly, and the reachabletime will not be recompute, so the reachabletime can be guessed. Signed-off-by: Duan Jiong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-28Merge branches 'pm-cpufreq', 'pm-hibernate' and 'acpi-processor'Rafael J. Wysocki2-37/+33
* pm-cpufreq: intel_pstate: Change busy calculation to use fixed point math. * pm-hibernate: PM / hibernate: Fix restore hang in freeze_processes() * acpi-processor: ACPI / processor: Rework processor throttling with work_on_cpu()
2014-02-27Merge branch 'for-davem' of ↵David S. Miller21-140/+178
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Regarding the mac80211 bits, Johannes says: "This time, I have a fix from Arik for scheduled scan recovery (something that only recently went into the tree), a memory leak fix from Eytan and a small regulatory bugfix from Inbal. The EAPOL change from Felix makes rekeying more stable while lots of traffic is flowing, and there's Emmanuel's and my fixes for a race in the code handling powersaving clients." Regarding the NFC bits, Samuel says: "We only have one candidate for 3.14 fixes, and this is a NCI NULL pointer dereference introduced during the 3.14 merge window." Regarding the iwlwifi bits, Emmanuel says: "This should fix an issue raised in iwldvm when we have lots of association failures. There is a bugzilla for this bug - it hasn't been validated by the user, but I hope it will do the trick." Beyond that... Amitkumar Karwar brings two mwifiex fixes, one to avoid a NULL pointer dereference and another to address an improperly timed interrupt. Arend van Spriel gives us a brcmfmac fix to avoid a crash during scatter-gather packet transfers. Avinash Patila offers an mwifiex to avoid an invalid memory access when a device is removed. Bing Zhao delivers a simple fix to avoid a naming conflict between libertas and mwifiex. Felix Fietkau provides a trio of ath9k fixes that properly account for sequence numbering in ps-poll frames, reduce the rate for false positives during baseband hang detection, and fix a regression related to rx descriptor handling. James Cameron shows us a libertas fix to ignore zero-length IEs when processing scan results. Kirill Tkhai brings a hostap fix to avoid prematurely freeing a timer. Stanislaw Gruszka fixes an ath9k locking problem. Sujith Manoharan addresses ETSI compliance for a device handled by ath9k by adjusting the minimum CCA power threshold values. ==================== Signed-off-by: David S. Miller <[email protected]>
2014-02-27bnx2x: Add missing bit in default Tx switchingYuval Mintz1-1/+8
Commit c14db2025 "bnx2x: Correct default Tx switching behaviour" supposedly changed the default Tx switching behaviour, but was missing the fastpath change required for FW to pass packets from PFs to VFs. Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-27kvm, vmx: Really fix lazy FPU on nested guestPaolo Bonzini1-1/+1
Commit e504c9098ed6 (kvm, vmx: Fix lazy FPU on nested guest, 2013-11-13) highlighted a real problem, but the fix was subtly wrong. nested_read_cr0 is the CR0 as read by L2, but here we want to look at the CR0 value reflecting L1's setup. In other words, L2 might think that TS=0 (so nested_read_cr0 has the bit clear); but if L1 is actually running it with TS=1, we should inject the fault into L1. The effective value of CR0 in L2 is contained in vmcs12->guest_cr0, use it. Fixes: e504c9098ed6acd9e1079c5e10e4910724ad429f Reported-by: Kashyap Chamarty <[email protected]> Reported-by: Stefan Bader <[email protected]> Tested-by: Kashyap Chamarty <[email protected]> Tested-by: Anthoine Bourgeois <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2014-02-27perf tools: fix BFD detection on opensuseAndi Kleen2-2/+2
opensuse libbfd requires -lz -liberty to build. Add those to the BFD feature detection. Signed-off-by: Andi Kleen <[email protected]> Acked-by: David Ahern <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2014-02-27Merge branch 'master' of ↵David S. Miller5-12/+30
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== 1) Build fix for ip_vti when NET_IP_TUNNEL is not set. We need this set to have ip_tunnel_get_stats64() available. 2) Fix a NULL pointer dereference on sub policy usage. We try to access a xfrm_state from the wrong array. 3) Take xfrm_state_lock in xfrm_migrate_state_find(), we need it to traverse through the state lists. 4) Clone states properly on migration, otherwise we crash when we migrate a state with aead algorithm attached. 5) Fix unlink race when between thread context and timer when policies are deleted. ==================== Signed-off-by: David S. Miller <[email protected]>
2014-02-27net: ipv6: ping: Use socket mark in routing lookupLorenzo Colitti1-0/+1
Signed-off-by: Lorenzo Colitti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-27Merge branch 'master' of ↵John W. Linville21-140/+178
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2014-02-27mac80211: fix association to 20/40 MHz VHT networksJohannes Berg1-0/+1
When a VHT network uses 20 or 40 MHz as per the HT operation information, the channel center frequency segment 0 field in the VHT operation information is reserved, so ignore it. This fixes association with such networks when the AP puts 0 into the field, previously we'd disconnect due to an invalid channel with the message wlan0: AP VHT information is invalid, disable VHT Cc: [email protected] Fixes: f2d9d270c15ae ("mac80211: support VHT association") Reported-by: Tim Nelson <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2014-02-27drm/radeon: enable speaker allocation setup on dce3.2Alex Deucher1-3/+0
Now that we disable audio while setting up the audio hw, we should be able to set this up without hangs. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
2014-02-27drm/radeon: change audio enable logicAlex Deucher5-27/+44
Disable audio around audio hw setup. This may avoid hangs on certain asics. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
2014-02-27drm/radeon: fix audio disable on dce6+Alex Deucher1-1/+1
Properly clear the enable bit when audio disable is requested. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Cc: [email protected]
2014-02-27drm/radeon: free uvd ring on unloadJerome Glisse3-2/+4
Need to free the uvd ring. Also reshuffle gart tear down to happen after uvd tear down. Signed-off-by: Jérôme Glisse <[email protected]> Cc: [email protected] Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2014-02-27drm/radeon: disable pll sharing for DP on DCE4.1Alex Deucher1-1/+15
Causes display problems. We had already disabled sharing for non-DP displays. Based on a patch from: Niels Ole Salscheider <[email protected]> bug: https://bugzilla.kernel.org/show_bug.cgi?id=58121 Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2014-02-27drm/radeon: fix missing bo reservationChristian König1-0/+6
Otherwise we might get a crash here. Signed-off-by: Christian König <[email protected]> Cc: [email protected] Signed-off-by: Alex Deucher <[email protected]>
2014-02-27drm/radeon: print the supported atpx function maskAlex Deucher1-1/+2
Print the supported functions mask in addition to the version. This is useful in debugging PX problems since we can see what functions are available. Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2014-02-27Merge tag 'metag-fixes-v3.14' of ↵Linus Torvalds3-3/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull Metag arch and asm-generic fixes from James Hogan: - Add the new sched_setattr/sched_getattr syscalls to the asm-generic syscall list, which is used by arc, arm64, c6x, hexagon, metag, openrisc, score, tile, and unicore32. - An IRQ affinity bug fix for metag to prevent interrupts being vectored to offline CPUs when their affinity is changed via /proc/irq/ (thanks tglx). * tag 'metag-fixes-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: irq-metag*: stop set_affinity vectoring to offline cpus asm-generic: add sched_setattr/sched_getattr syscalls
2014-02-27Merge tag 'pwm/for-3.14-rc5' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm fix from Thierry Reding: "Just a single trivial patch to plug a memory leak in an error path" * tag 'pwm/for-3.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: pwm: lp3943: Fix potential memory leak during request
2014-02-27Merge branch 'for_linus' of ↵Linus Torvalds20-59/+108
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull filesystem fixes from Jan Kara: "Notification, writeback, udf, quota fixes The notification patches are (with one exception) a fallout of my fsnotify rework which went into -rc1 (I've extented LTP to cover these cornercases to avoid similar breakage in future). The UDF patch is a nasty data corruption Al has recently reported, the revert of the writeback patch is due to possibility of violating sync(2) guarantees, and a quota bug can lead to corruption of quota files in ocfs2" * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: Allocate overflow events with proper type fanotify: Handle overflow in case of permission events fsnotify: Fix detection whether overflow event is queued Revert "writeback: do not sync data dirtied after sync start" quota: Fix race between dqput() and dquot_scan_active() udf: Fix data corruption on file type conversion inotify: Fix reporting of cookies for inotify events
2014-02-27Merge tag 'upstream-3.14-rc5' of git://git.infradead.org/linux-ubifsLinus Torvalds1-4/+4
Pull ubifs fix from Artem Bityutskiy: "Just a single fix for the UBI module unload path which makes sure we do not touch freed memory" * tag 'upstream-3.14-rc5' of git://git.infradead.org/linux-ubifs: UBI: fix some use after free bugs
2014-02-27kvm: x86: fix emulator buffer overflow (CVE-2014-0049)Andrew Honig1-1/+1
The problem occurs when the guest performs a pusha with the stack address pointing to an mmio address (or an invalid guest physical address) to start with, but then extending into an ordinary guest physical address. When doing repeated emulated pushes emulator_read_write sets mmio_needed to 1 on the first one. On a later push when the stack points to regular memory, mmio_nr_fragments is set to 0, but mmio_is_needed is not set to 0. As a result, KVM exits to userspace, and then returns to complete_emulated_mmio. In complete_emulated_mmio vcpu->mmio_cur_fragment is incremented. The termination condition of vcpu->mmio_cur_fragment == vcpu->mmio_nr_fragments is never achieved. The code bounces back and fourth to userspace incrementing mmio_cur_fragment past it's buffer. If the guest does nothing else it eventually leads to a a crash on a memcpy from invalid memory address. However if a guest code can cause the vm to be destroyed in another vcpu with excellent timing, then kvm_clear_async_pf_completion_queue can be used by the guest to control the data that's pointed to by the call to cancel_work_item, which can be used to gain execution. Fixes: f78146b0f9230765c6315b2e14f56112513389ad Signed-off-by: Andrew Honig <[email protected]> Cc: [email protected] (3.5+) Signed-off-by: Paolo Bonzini <[email protected]>
2014-02-27arm/arm64: KVM: detect CPU reset on CPU_PM_EXITMarc Zyngier3-4/+37
Commit 1fcf7ce0c602 (arm: kvm: implement CPU PM notifier) added support for CPU power-management, using a cpu_notifier to re-init KVM on a CPU that entered CPU idle. The code assumed that a CPU entering idle would actually be powered off, loosing its state entierely, and would then need to be reinitialized. It turns out that this is not always the case, and some HW performs CPU PM without actually killing the core. In this case, we try to reinitialize KVM while it is still live. It ends up badly, as reported by Andre Przywara (using a Calxeda Midway): [ 3.663897] Kernel panic - not syncing: unexpected prefetch abort in Hyp mode at: 0x685760 [ 3.663897] unexpected data abort in Hyp mode at: 0xc067d150 [ 3.663897] unexpected HVC/SVC trap in Hyp mode at: 0xc0901dd0 The trick here is to detect if we've been through a full re-init or not by looking at HVBAR (VBAR_EL2 on arm64). This involves implementing the backend for __hyp_get_vectors in the main KVM HYP code (rather small), and checking the return value against the default one when the CPU notifier is called on CPU_PM_EXIT. Reported-by: Andre Przywara <[email protected]> Tested-by: Andre Przywara <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Rob Herring <[email protected]> Acked-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-02-27sch_tbf: Fix potential memory leak in tbf_change().Hiroaki SHIMODA1-12/+12
The allocated child qdisc is not freed in error conditions. Defer the allocation after user configuration turns out to be valid and acceptable. Fixes: cc106e441a63b ("net: sched: tbf: fix the calculation of max_size") Signed-off-by: Hiroaki SHIMODA <[email protected]> Cc: Yang Yingliang <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-27dm thin: allow metadata space larger than supported to go unusedMike Snitzer5-19/+37
It was always intended that a user could provide a thin metadata device that is larger than the max supported by the on-disk format. The extra space would just go unused. Unfortunately that never worked. If the user attempted to use a larger metadata device on creation they would get an error like the following: device-mapper: space map common: space map too large device-mapper: transaction manager: couldn't create metadata space map device-mapper: thin metadata: tm_create_with_sm failed device-mapper: table: 252:17: thin-pool: Error creating metadata object device-mapper: ioctl: error adding target to table Fix this by allowing the initial metadata space map creation to cap its size at the max number of blocks supported (DM_SM_METADATA_MAX_BLOCKS). get_metadata_dev_size() must also impose DM_SM_METADATA_MAX_BLOCKS (via THIN_METADATA_MAX_SECTORS), otherwise extending metadata would cap at THIN_METADATA_MAX_SECTORS_WARNING (which is larger than supported). Also, the calculation for THIN_METADATA_MAX_SECTORS didn't account for the sizeof the disk_bitmap_header. So the supported maximum metadata size is a bit smaller (reduced from 33423360 to 33292800 sectors). Lastly, remove the "excess space will not be used" warning message from get_metadata_dev_size(); it resulted in printing the warning multiple times. Factor out warn_if_metadata_device_too_big(), call it from pool_ctr() and maybe_resize_metadata_dev(). Signed-off-by: Mike Snitzer <[email protected]> Acked-by: Joe Thornber <[email protected]>
2014-02-27Merge branch 'liblockdep-fixes' of https://github.com/sashalevin/liblockdep ↵Ingo Molnar5-4/+15
into core/urgent Pull tools/lib/lockdep/ fixes from Sasha Levin. Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar5-19/+40
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: * Fix annotation on stdio/GTK+ interfaces (Namhyung Kim) * Fix file descriptor leaking while searching DSOs for suitable symtab (Namhyung Kim). Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27Merge tag 'asoc-v3.14-rc4-2' of ↵Takashi Iwai3-2/+14
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Updates for v3.14 A few more driver specific bug fixes, all driver specific things that only affect users of those devices.
2014-02-27perf: Fix hotplug splatPeter Zijlstra1-6/+6
Drew Richardson reported that he could make the kernel go *boom* when hotplugging while having perf events active. It turned out that when you have a group event, the code in __perf_event_exit_context() fails to remove the group siblings from the context. We then proceed with destroying and freeing the event, and when you re-plug the CPU and try and add another event to that CPU, things go *boom* because you've still got dead entries there. Reported-by: Drew Richardson <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27perf/x86: Fix event schedulingPeter Zijlstra1-0/+3
Vince "Super Tester" Weaver reported a new round of syscall fuzzing (Trinity) failures, with perf WARN_ON()s triggering. He also provided traces of the failures. This is I think the relevant bit: > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_disable: x86_pmu_disable > pec_1076_warn-2804 [000] d... 147.926153: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926156: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926158: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926159: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926160: x86_pmu_state: n_events: 1, n_added: 0, n_txn: 1 > pec_1076_warn-2804 [000] d... 147.926161: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926162: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926163: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926166: collect_events: Adding event: 1 (ffff880119ec8800) So we add the insn:p event (fd[23]). At this point we should have: n_events = 2, n_added = 1, n_txn = 1 > pec_1076_warn-2804 [000] d... 147.926170: collect_events: Adding event: 0 (ffff8800c9e01800) > pec_1076_warn-2804 [000] d... 147.926172: collect_events: Adding event: 4 (ffff8800cbab2c00) We try and add the {BP,cycles,br_insn} group (fd[3], fd[4], fd[15]). These events are 0:cycles and 4:br_insn, the BP event isn't x86_pmu so that's not visible. group_sched_in() pmu->start_txn() /* nop - BP pmu */ event_sched_in() event->pmu->add() So here we should end up with: 0: n_events = 3, n_added = 2, n_txn = 2 4: n_events = 4, n_added = 3, n_txn = 3 But seeing the below state on x86_pmu_enable(), the must have failed, because the 0 and 4 events aren't there anymore. Looking at group_sched_in(), since the BP is the leader, its event_sched_in() must have succeeded, for otherwise we would not have seen the sibling adds. But since neither 0 or 4 are in the below state; their event_sched_in() must have failed; but I don't see why, the complete state: 0,0,1:p,4 fits perfectly fine on a core2. However, since we try and schedule 4 it means the 0 event must have succeeded! Therefore the 4 event must have failed, its failure will have put group_sched_in() into the fail path, which will call: event_sched_out() event->pmu->del() on 0 and the BP event. Now x86_pmu_del() will reduce n_events; but it will not reduce n_added; giving what we see below: n_event = 2, n_added = 2, n_txn = 2 > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_enable: x86_pmu_enable > pec_1076_warn-2804 [000] d... 147.926177: x86_pmu_state: Events: { > pec_1076_warn-2804 [000] d... 147.926179: x86_pmu_state: 0: state: .R config: ffffffffffffffff ( (null)) > pec_1076_warn-2804 [000] d... 147.926181: x86_pmu_state: 33: state: AR config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926182: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: n_events: 2, n_added: 2, n_txn: 2 > pec_1076_warn-2804 [000] d... 147.926184: x86_pmu_state: Assignment: { > pec_1076_warn-2804 [000] d... 147.926186: x86_pmu_state: 0->33 tag: 1 config: 0 (ffff88011ac99800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: 1->0 tag: 1 config: 1 (ffff880119ec8800) > pec_1076_warn-2804 [000] d... 147.926188: x86_pmu_state: } > pec_1076_warn-2804 [000] d... 147.926190: x86_pmu_enable: S0: hwc->idx: 33, hwc->last_cpu: 0, hwc->last_tag: 1 hwc->state: 0 So the problem is that x86_pmu_del(), when called from a group_sched_in() that fails (for whatever reason), and without x86_pmu TXN support (because the leader is !x86_pmu), will corrupt the n_added state. Reported-and-Tested-by: Vince Weaver <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Dave Jones <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27sched/deadline: Prevent rt_time growth to infinityJuri Lelli2-2/+14
Kirill Tkhai noted: Since deadline tasks share rt bandwidth, we must care about bandwidth timer set. Otherwise rt_time may grow up to infinity in update_curr_dl(), if there are no other available RT tasks on top level bandwidth. RT task were in fact throttled right after they got enqueued, and never executed again (rt_time never again went below rt_runtime). Peter then proposed to accrue DL execution on rt_time only when rt timer is active, and proposed a patch (this patch is a slight modification of that) to implement that behavior. While this solves Kirill problem, it has a drawback. Indeed, Kirill noted again: It looks we may get into a situation, when all CPU time is shared between RT and DL tasks: rt_runtime = n rt_period = 2n | RT working, DL sleeping | DL working, RT sleeping | ----------------------------------------------------------- | (1) duration = n | (2) duration = n | (repeat) |--------------------------|------------------------------| | (rt_bw timer is running) | (rt_bw timer is not running) | No time for fair tasks at all. While this can happen during the first period, if rq is always backlogged, RT tasks won't have the opportunity to execute anymore: rt_time reached rt_runtime during (1), suppose after (2) RT is enqueued back, it gets throttled since rt timer didn't fire, replenishment is from now on eaten up by DL tasks that accrue their execution on rt_time (while rt timer is active - we have an RT task waiting for replenishment). FAIR tasks are not touched after this first period. Ok, this is not ideal, and the situation is even worse! What above (the nice case), practically never happens in reality, where your rt timer is not aligned to tasks periods, tasks are in general not periodic, etc.. Long story short, you always risk to overload your system. This patch is based on Peter's idea, but exploits an additional fact: if you don't have RT tasks enqueued, it makes little sense to continue incrementing rt_time once you reached the upper limit (DL tasks have their own mechanism for throttling). This cures both problems: - no matter how many DL instances in the past, you'll have an rt_time slightly above rt_runtime when an RT task is enqueued, and from that point on (after the first replenishment), the task will normally execute; - you can still eat up all bandwidth during the first period, but not anymore after that, remember that DL execution will increment rt_time till the upper limit is reached. The situation is still not perfect! But, we have a simple solution for now, that limits how much you can jeopardize your system, as we keep working towards the right answer: RT groups scheduled using deadline servers. Reported-by: Kirill Tkhai <[email protected]> Signed-off-by: Juri Lelli <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27sched/deadline: Switch CPU's presence test orderJuri Lelli1-2/+2
Commit 82b9580 ("sched/deadline: Test for CPU's presence explicitly") changed how we check if a CPU returned by cpudeadline machinery is valid. But, we don't want to call cpu_present() if best_cpu is equal to -1. So, switch the order of tests inside WARN_ON(). Signed-off-by: Juri Lelli <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27sched/deadline: Cleanup RT leftovers from {inc/dec}_dl_migrationKirill Tkhai1-2/+0
In deadline class we do not have group scheduling. So, let's remove unnecessary X = X; equations. Signed-off-by: Kirill Tkhai <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Juri Lelli <[email protected]> Link: http://lkml.kernel.org/r/1393343543.4089.5.camel@tkhai Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27sched: Fix double normalization of vruntimeGeorge McCollister1-4/+4
dequeue_entity() is called when p->on_rq and sets se->on_rq = 0 which appears to guarentee that the !se->on_rq condition is met. If the task has done set_current_state(TASK_INTERRUPTIBLE) without schedule() the second condition will be met and vruntime will be incorrectly adjusted twice. In certain cases this can result in the task's vruntime never increasing past the vruntime of other tasks on the CFS' run queue, starving them of CPU time. This patch changes switched_from_fair() to use !p->on_rq instead of !se->on_rq. I'm able to cause a task with a priority of 120 to starve all other tasks with the same priority on an ARM platform running 3.2.51-rt72 PREEMPT RT by writing one character at time to a serial tty (16550 UART) in a tight loop. I'm also able to verify making this change corrects the problem on that platform and kernel version. Signed-off-by: George McCollister <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-02-27Merge remote-tracking branch 'asoc/fix/wm8958' into asoc-linusMark Brown1-1/+1
2014-02-27Merge remote-tracking branches 'asoc/fix/da732x' and 'asoc/fix/sta32x' into ↵Mark Brown2-1/+13
asoc-linus
2014-02-27Merge tag 'asoc-v3.14-rc4' into asoc-linusMark Brown11-208/+317
ASoC: Fixes for v3.14 A somewhat large set of fixes here due to the identification of some systematic problems with hard to use APIs in the subsystem. Takashi did a lot of work to address the enumeration API which uncovered a number of off by one bugs caused by confusing APIs while Charles addressed issues in the locking around DAPM. # gpg: Signature made Sun 23 Feb 2014 13:29:34 KST using RSA key ID 7EA229BD # gpg: Good signature from "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>"
2014-02-27Merge tag 'asoc-v3.14-rc3' into asoc-linusMark Brown15-84/+99
ASoC: Fixes for v3.14 A few fixes, all driver speccific ones. The DaVinci ones aren't as clear as they should be from the subject lines on the commits but they fix issues which will prevent correct operation in some use cases and only affect that particular driver so are reasonably safe. # gpg: Signature made Wed 19 Feb 2014 13:23:13 KST using RSA key ID 7EA229BD # gpg: Good signature from "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>" # gpg: aka "Mark Brown <[email protected]>"
2014-02-27iwlwifi: fix TX status for aggregated packetsJohannes Berg2-14/+18
Only the first packet is currently handled correctly, but then all others are assumed to have failed which is problematic. Fix this, marking them all successful instead (since if they're not then the firmware will have transmitted them as single frames.) This fixes the lost packet reporting. Also do a tiny variable scoping cleanup. Cc: <[email protected]> Signed-off-by: Johannes Berg <[email protected]> [Add the dvm part] Signed-off-by: Emmanuel Grumbach <[email protected]>
2014-02-27ASoC: sta32x: Fix wrong enum for limiter2 release rateTakashi Iwai1-1/+1
There is a typo in the Limiter2 Release Rate control, a wrong enum for Limiter1 is assigned. It must point to Limiter2. Spotted by a compile warning: In file included from sound/soc/codecs/sta32x.c:34:0: sound/soc/codecs/sta32x.c:223:29: warning: ‘sta32x_limiter2_release_rate_enum’ defined but not used [-Wunused-variable] static SOC_ENUM_SINGLE_DECL(sta32x_limiter2_release_rate_enum, ^ include/sound/soc.h:275:18: note: in definition of macro ‘SOC_ENUM_DOUBLE_DECL’ struct soc_enum name = SOC_ENUM_DOUBLE(xreg, xshift_l, xshift_r, \ ^ sound/soc/codecs/sta32x.c:223:8: note: in expansion of macro ‘SOC_ENUM_SINGLE_DECL’ static SOC_ENUM_SINGLE_DECL(sta32x_limiter2_release_rate_enum, ^ Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: <[email protected]>
2014-02-27iwlwifi: mvm: change of listen interval from 70 to 10Max Stepanov1-1/+1
Some APs reject STA association request if a listen interval value exceeds a threshold of 10. Thus, for example, Cisco APs may deny STA associations returning status code 12 (Association denied due to reason outside the scope of 802.11 standard) in the association response frame. Fixing the issue by setting the default IWL_CONN_MAX_LISTEN_INTERVAL value from 70 to 10. Cc: <[email protected]> [3.10+] Signed-off-by: Max Stepanov <[email protected]> Reviewed-by: Alexander Bondar <[email protected]> Signed-off-by: Emmanuel Grumbach <[email protected]>