aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2015-01-28tcp: fix timing issue in CUBIC slope calculationNeal Cardwell1-0/+8
This patch fixes a bug in CUBIC that causes cwnd to increase slightly too slowly when multiple ACKs arrive in the same jiffy. If cwnd is supposed to increase at a rate of more than once per jiffy, then CUBIC was sometimes too slow. Because the bic_target is calculated for a future point in time, calculated with time in jiffies, the cwnd can increase over the course of the jiffy while the bic_target calculated as the proper CUBIC cwnd at time t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp only increases on jiffy tick boundaries. So since the cnt is set to: ca->cnt = cwnd / (bic_target - cwnd); as cwnd increases but bic_target does not increase due to jiffy granularity, the cnt becomes too large, causing cwnd to increase too slowly. For example: - suppose at the beginning of a jiffy, cwnd=40, bic_target=44 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10 - suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai() increases cwnd to 41 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13 So now CUBIC will wait for 13 packets to be ACKed before increasing cwnd to 42, insted of 10 as it should. The fix is to avoid adjusting the slope (determined by ca->cnt) multiple times within a jiffy, and instead skip to compute the Reno cwnd, the "TCP friendliness" code path. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: fix stretch ACK bugs in CUBICNeal Cardwell1-22/+9
Change CUBIC to properly handle stretch ACKs in additive increase mode by passing in the count of ACKed packets to tcp_cong_avoid_ai(). In addition, because we are now precisely accounting for stretch ACKs, including delayed ACKs, we can now remove the delayed ACK tracking and estimation code that tracked recent delayed ACK behavior in ca->delayed_ack. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: fix stretch ACK bugs in RenoNeal Cardwell1-4/+6
Change Reno to properly handle stretch ACKs in additive increase mode by passing in the count of ACKed packets to tcp_cong_avoid_ai(). In addition, if snd_cwnd crosses snd_ssthresh during slow start processing, and we then exit slow start mode, we need to carry over any remaining "credit" for packets ACKed and apply that to additive increase by passing this remaining "acked" count to tcp_cong_avoid_ai(). Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: fix the timid additive increase on stretch ACKsNeal Cardwell1-6/+9
tcp_cong_avoid_ai() was too timid (snd_cwnd increased too slowly) on "stretch ACKs" -- cases where the receiver ACKed more than 1 packet in a single ACK. For example, suppose w is 10 and we get a stretch ACK for 20 packets, so acked is 20. We ought to increase snd_cwnd by 2 (since acked/w = 20/10 = 2), but instead we were only increasing cwnd by 1. This patch fixes that behavior. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: stretch ACK fixes prepNeal Cardwell7-11/+15
LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that cover more than the RFC-specified maximum of 2 packets. These stretch ACKs can cause serious performance shortfalls in common congestion control algorithms that were designed and tuned years ago with receiver hosts that were not using LRO or GRO, and were instead politely ACKing every other packet. This patch series fixes Reno and CUBIC to handle stretch ACKs. This patch prepares for the upcoming stretch ACK bug fix patches. It adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and changes all congestion control algorithms to pass in 1 for the ACKed count. It also changes tcp_slow_start() to return the number of packet ACK "credits" that were not processed in slow start mode, and can be processed by the congestion control module in additive increase mode. In future patches we will fix tcp_cong_avoid_ai() to handle stretch ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start and additive increase mode. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-29ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy buildsMagnus Damm2-0/+27
As of commit 9a1091ef0017c40a ("irqchip: gic: Support hierarchy irq domain."), the APE6EVM legacy board support is known to be broken. The IRQ numbers of the GIC are now virtual, and no longer match the hardcoded hardware IRQ numbers in the legacy platform board code. To fix this issue specific to non-muliplatform r8a73a4 and APE6EVM: 1) Instantiate the GIC from platform board code and also 2) Skip over the DT arch timer as well as 3) Force delay setup based on DT CPU frequency With these 3 fixes in place interrupts on APE6EVM are now unbroken. Partially based on legacy GIC fix by Geert Uytterhoeven, thanks to him for the initial work. Signed-off-by: Magnus Damm <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-01-28Merge tag 'mvebu-fixes-3.19-6' of git://git.infradead.org/linux-mvebu into fixesOlof Johansson1-0/+7
Merge "mvebu-fixes-6" from Andrew Lunn: The previous fix for Armada XP, disabling I/O coherency, broke Armada 375/38x. Only switch the PL310 to I/O coherent mode if I/O coherency is enabled. * tag 'mvebu-fixes-3.19-6' of git://git.infradead.org/linux-mvebu: ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled Signed-off-by: Olof Johansson <[email protected]>
2015-01-28ALSA: ak411x: Fix stall in work callbackTakashi Iwai4-21/+18
When ak4114 work calls its callback and the callback invokes ak4114_reinit(), it stalls due to flush_delayed_work(). For avoiding this, control the reentrance by introducing a refcount. Also flush_delayed_work() is replaced with cancel_delayed_work_sync(). The exactly same bug is present in ak4113.c and fixed as well. Reported-by: Pavel Hofman <[email protected]> Acked-by: Jaroslav Kysela <[email protected]> Tested-by: Pavel Hofman <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2015-01-28ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is ↵Thomas Petazzoni1-0/+7
disabled Since commit f2c3c67f00 (merge commit that adds commit "ARM: mvebu: completely disable hardware I/O coherency"), we disable I/O coherency on Armada EBU platforms. However, we continue to initialize the coherency fabric, because this coherency fabric is needed on Armada XP for inter-CPU coherency. Unfortunately, due to this, we also continued to execute the coherency fabric initialization code for Armada 375/38x, which switched the PL310 into I/O coherent mode. This has the effect of disabling the outer cache sync operation: this is needed when I/O coherency is enabled to work around a PCIe/L2 deadlock. But obviously, when I/O coherency is disabled, having the outer cache sync operation is crucial. Therefore, this commit fixes the armada_375_380_coherency_init() so that the PL310 is switched to I/O coherent mode only if I/O coherency is enabled. Without this fix, all devices using DMA are broken on Armada 375/38x. Signed-off-by: Thomas Petazzoni <[email protected]> Acked-by: Gregory CLEMENT <[email protected]> Tested-by: Gregory CLEMENT <[email protected]> Signed-off-by: Andrew Lunn <[email protected]> Cc: <[email protected]> # v3.8+
2015-01-28dm thin: don't allow messages to be sent to a pool target in READ_ONLY or ↵Joe Thornber1-0/+6
FAIL mode You can't modify the metadata in these modes. It's better to fail these messages immediately than let the block-manager deny write locks on metadata blocks. Otherwise these failed metadata changes will trigger 'needs_check' to get set in the metadata superblock -- requiring repair using the thin_check utility. Signed-off-by: Joe Thornber <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Cc: [email protected]
2015-01-28dm cache: fix missing ERR_PTR returns and handlingJoe Thornber1-4/+5
Commit 9b1cc9f251 ("dm cache: share cache-metadata object across inactive and active DM tables") mistakenly ignored the use of ERR_PTR returns. Restore missing IS_ERR checks and ERR_PTR returns where appropriate. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Joe Thornber <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Cc: [email protected]
2015-01-28Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar7-27/+80
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: " User visible fixes: - Fix probing at function return (Namhyumg Kim) Developer visible fixes: - Symbol processing changes necessary for fixing support for kretprobes in 'perf probe' (Namhyung Kim, Arnaldo Carvalho de Melo) - Annotation memory leaks and instruction parsing fixes (Rabin Vincent) - Fix perl build on ARM64 (Wang Nam) " Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2015-01-28sched: Fix crash if cpuset_cpumask_can_shrink() is passed an empty cpumaskMike Galbraith1-0/+3
While creating an exclusive cpuset, we passed cpuset_cpumask_can_shrink() an empty cpumask (cur), and dl_bw_of(cpumask_any(cur)) made boom with it: CPU: 0 PID: 6942 Comm: shield.sh Not tainted 3.19.0-master #19 Hardware name: MEDIONPC MS-7502/MS-7502, BIOS 6.00 PG 12/26/2007 task: ffff880224552450 ti: ffff8800caab8000 task.ti: ffff8800caab8000 RIP: 0010:[<ffffffff81073846>] [<ffffffff81073846>] cpuset_cpumask_can_shrink+0x56/0xb0 [...] Call Trace: [<ffffffff810cb82a>] validate_change+0x18a/0x200 [<ffffffff810cc877>] cpuset_write_resmask+0x3b7/0x720 [<ffffffff810c4d58>] cgroup_file_write+0x38/0x100 [<ffffffff811d953a>] kernfs_fop_write+0x12a/0x180 [<ffffffff8116e1a3>] vfs_write+0xb3/0x1d0 [<ffffffff8116ed06>] SyS_write+0x46/0xb0 [<ffffffff8159ced6>] system_call_fastpath+0x16/0x1b Signed-off-by: Mike Galbraith <[email protected]> Acked-by: Zefan Li <[email protected]> Fixes: f82f80426f7a ("sched/deadline: Ensure that updates to exclusive cpusets don't break AC") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-01-28rbd: drop parent_ref in rbd_dev_unprobe() unconditionallyIlya Dryomov1-4/+1
This effectively reverts the last hunk of 392a9dad7e77 ("rbd: detect when clone image is flattened"). The problem with parent_overlap != 0 condition is that it's possible and completely valid to have an image with parent_overlap == 0 whose parent state needs to be cleaned up on unmap. The next commit, which drops the "clone image now standalone" logic, opens up another window of opportunity to hit this, but even without it # cat parent-ref.sh #!/bin/bash rbd create --image-format 2 --size 1 foo rbd snap create foo@snap rbd snap protect foo@snap rbd clone foo@snap bar rbd resize --allow-shrink --size 0 bar rbd resize --size 1 bar DEV=$(rbd map bar) rbd unmap $DEV leaves rbd_device/rbd_spec/etc and rbd_client along with ceph_client hanging around. My thinking behind calling rbd_dev_parent_put() unconditionally is that there shouldn't be any requests in flight at that point in time as we are deep into unmap sequence. Hence, even if rbd_dev_unparent() caused by flatten is delayed by in-flight requests, it will have finished by the time we reach rbd_dev_unprobe() caused by unmap, thus turning unconditional rbd_dev_parent_put() into a no-op. Fixes: http://tracker.ceph.com/issues/10352 Cc: [email protected] # 3.11+ Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2015-01-28rbd: fix rbd_dev_parent_get() when parent_overlap == 0Ilya Dryomov1-14/+6
The comment for rbd_dev_parent_get() said * We must get the reference before checking for the overlap to * coordinate properly with zeroing the parent overlap in * rbd_dev_v2_parent_info() when an image gets flattened. We * drop it again if there is no overlap. but the "drop it again if there is no overlap" part was missing from the implementation. This lead to absurd parent_ref values for images with parent_overlap == 0, as parent_ref was incremented for each img_request and virtually never decremented. Fix this by leveraging the fact that refresh path calls rbd_dev_v2_parent_info() under header_rwsem and use it for read in rbd_dev_parent_get(), instead of messing around with atomics. Get rid of barriers in rbd_dev_v2_parent_info() while at it - I don't see what they'd pair with now and I suspect we are in a pretty miserable situation as far as proper locking goes regardless. Cc: [email protected] # 3.11+ Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2015-01-28perf: Tighten (and fix) the grouping conditionPeter Zijlstra2-8/+13
The fix from 9fc81d87420d ("perf: Fix events installation during moving group") was incomplete in that it failed to recognise that creating a group with events for different CPUs is semantically broken -- they cannot be co-scheduled. Furthermore, it leads to real breakage where, when we create an event for CPU Y and then migrate it to form a group on CPU X, the code gets confused where the counter is programmed -- triggered in practice as well by me via the perf fuzzer. Fix this by tightening the rules for creating groups. Only allow grouping of counters that can be co-scheduled in the same context. This means for the same task and/or the same cpu. Fixes: 9fc81d87420d ("perf: Fix events installation during moving group") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-01-28perf/x86/intel: Add model number for AirmontKan Liang1-0/+1
Intel Airmont supports the same architectural and non-architectural performance monitoring events as Silvermont. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-01-28sched/fair: Avoid using uninitialized variable in preferred_group_nid()Jan Beulich1-1/+1
At least some gcc versions - validly afaict - warn about potentially using max_group uninitialized: There's no way the compiler can prove that the body of the conditional where it and max_faults get set/ updated gets executed; in fact, without knowing all the details of other scheduler code, I can't prove this either. Generally the necessary change would appear to be to clear max_group prior to entering the inner loop, and break out of the outer loop when it ends up being all clear after the inner one. This, however, seems inefficient, and afaict the same effect can be achieved by exiting the outer loop when max_faults is still zero after the inner loop. [ mingo: changed the solution to zero initialization: uninitialized_var() needs to die, as it's an actively dangerous construct: if in the future a known-proven-good piece of code is changed to have a true, buggy uninitialized variable, the compiler warning is then supressed... The better long term solution is to clean up the code flow, so that even simple minded compilers (and humans!) are able to read it without getting a headache. ] Signed-off-by: Jan Beulich <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-01-28perf/rapl: Fix crash in rapl_scale()Stephane Eranian1-1/+1
This patch fixes a systematic crash in rapl_scale() due to an invalid pointer. The bug was introduced by commit: 89cbc76768c2 ("x86: Replace __get_cpu_var uses") The fix is simple. Just put the parenthesis where it needs to be, i.e., around rapl_pmu. To my surprise, the compiler was not complaining about passing an integer instead of a pointer. Reported-by: Vince Weaver <[email protected]> Tested-by: Vince Weaver <[email protected]> Fixes: 89cbc76768c2 ("x86: Replace __get_cpu_var uses") Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: [email protected] Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/20150122203834.GA10228@thinkpad Signed-off-by: Ingo Molnar <[email protected]>
2015-01-28perf/x86/intel/uncore: Move uncore_box_init() out of driver initializationKan Liang2-15/+12
There were some issues about the uncore driver tried to access non-existing boxes, which caused boot crashes. These issues have been all fixed. But we should avoid boot failures if that ever happens again. This patch intends to prevent this kind of potential issues. It moves uncore_box_init out of driver initialization. The box will be initialized when it's first enabled. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Yan, Zheng <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2015-01-28x86, microcode: Return error from driver init code when loader is disabledBoris Ostrovsky1-1/+1
Commits 65cef1311d5d ("x86, microcode: Add a disable chicken bit") and a18a0f6850d4 ("x86, microcode: Don't initialize microcode code on paravirt") allow microcode driver skip initialization when microcode loading is not permitted. However, they don't prevent the driver from being loaded since the init code returns 0. If at some point later the driver gets unloaded this will result in an oops while trying to deregister the (never registered) device. To avoid this, make init code return an error on paravirt or when microcode loading is disabled. The driver will then never be loaded. Signed-off-by: Boris Ostrovsky <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Reported-by: James Digwall <[email protected]> Cc: [email protected] # 3.18 Signed-off-by: Borislav Petkov <[email protected]>
2015-01-28quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space unitsJan Kara8-195/+318
Currently ->get_dqblk() and ->set_dqblk() use struct fs_disk_quota which tracks space limits and usage in 512-byte blocks. However VFS quotas track usage in bytes (as some filesystems require that) and we need to somehow pass this information. Upto now it wasn't a problem because we didn't do any unit conversion (thus VFS quota routines happily stuck number of bytes into d_bcount field of struct fd_disk_quota). Only if you tried to use Q_XGETQUOTA or Q_XSETQLIM for VFS quotas (or Q_GETQUOTA / Q_SETQUOTA for XFS quotas), you got bogus results. Hardly anyone tried this but reportedly some Samba users hit the problem in practice. So when we want interfaces compatible we need to fix this. We bite the bullet and define another quota structure used for passing information from/to ->get_dqblk()/->set_dqblk. It's somewhat sad we have to have more conversion routines in fs/quota/quota.c and another copying of quota structure slows down getting of quota information by about 2% but it seems cleaner than overloading e.g. units of d_bcount to bytes. CC: [email protected] Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2015-01-28udf: Release preallocation on last writeable closeJan Kara1-1/+1
Commit 6fb1ca92a640 "udf: Fix race between write(2) and close(2)" changed the condition when preallocation is released. The idea was that we don't want to release the preallocation for an inode on close when there are other writeable file descriptors for the inode. However the condition was written in the opposite way so we released preallocation only if there were other writeable file descriptors. Fix the problem by changing the condition properly. CC: [email protected] Fixes: 6fb1ca92a6409a9d5b0696447cd4997bc9aaf5a2 Reported-by: Fabian Frederick <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2015-01-27Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds29-188/+315
Pull drm fixes from Dave Airlie: "This feels larger than I'd like but its for three reasons. a) amdkfd finalising the API more, this is a new feature introduced last merge window, and I'd prefer to make the tweaks to the API before it first gets into a stable release. b) radeon regression required splitting an internal API to fix properly, so it just changed a few more lines c) vmwgfx fix changes a lock from a mutex->spin lock, this is fallout from the new sleep checking. Otherwise there is just some tda998x fixes" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon: Remove rdev->gart.pages_addr array drm/radeon: Restore GART table contents after pinning it in VRAM v3 drm/radeon: Split off gart_get_page_entry ASIC hook from set_page_entry drm/amdkfd: Fix bug in call to init_pipelines() drm/amdkfd: Fix bug in pipelines initialization drm/radeon: Don't increment pipe_id in kgd_init_pipeline drm/i2c: tda998x: set the CEC I2C address based on the slave I2C address drm/vmwgfx: Replace the hw mutex with a hw spinlock drm/amdkfd: Allow user to limit only queues per device drm/amdkfd: PQM handle queue creation fault drm: tda998x: Fix EDID read timeout on HDMI connect drm: tda998x: Protect the page register
2015-01-27btrfs: fix raid56 scrub failed in xfstests btrfs/072Gui Hecheng1-0/+2
The xfstests btrfs/072 reports uncorrectable read errors in dmesg, because scrub forgets to use commit_root for parity scrub routine and scrub attempts to scrub those extents items whose contents are not fully on disk. To fix it, we just add the @search_commit_root flag back. Signed-off-by: Gui Hecheng <[email protected]> Signed-off-by: Qu Wenruo <[email protected]> Reviewed-by: Miao Xie <[email protected]> Signed-off-by: Chris Mason <[email protected]>
2015-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds42-334/+667
Pull networking fixes from David Miller: 1) Don't OOPS on socket AIO, from Christoph Hellwig. 2) Scheduled scans should be aborted upon RFKILL, from Emmanuel Grumbach. 3) Fix sleep in atomic context in kvaser_usb, from Ahmed S Darwish. 4) Fix RCU locking across copy_to_user() in bpf code, from Alexei Starovoitov. 5) Lots of crash, memory leak, short TX packet et al bug fixes in sh_eth from Ben Hutchings. 6) Fix memory corruption in SCTP wrt. INIT collitions, from Daniel Borkmann. 7) Fix return value logic for poll handlers in netxen, enic, and bnx2x. From Eric Dumazet and Govindarajulu Varadarajan. 8) Header length calculation fix in mac80211 from Fred Chou. 9) mv643xx_eth doesn't handle highmem correctly in non-TSO code paths. From Ezequiel Garcia. 10) udp_diag has bogus logic in it's hash chain skipping, copy same fix tcp diag used. From Herbert Xu. 11) amd-xgbe programs wrong rx flow control register, from Thomas Lendacky. 12) Fix race leading to use after free in ping receive path, from Subash Abhinov Kasiviswanathan. 13) Cache redirect routes otherwise we can get a heavy backlog of rcu jobs liberating DST_NOCACHE entries. From Hannes Frederic Sowa. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits) net: don't OOPS on socket aio stmmac: prevent probe drivers to crash kernel bnx2x: fix napi poll return value for repoll ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too sh_eth: Fix DMA-API usage for RX buffers sh_eth: Check for DMA mapping errors on transmit sh_eth: Ensure DMA engines are stopped before freeing buffers sh_eth: Remove RX overflow log messages ping: Fix race in free in receive path udp_diag: Fix socket skipping within chain can: kvaser_usb: Fix state handling upon BUS_ERROR events can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT can: kvaser_usb: Send correct context to URB completion can: kvaser_usb: Do not sleep in atomic context ipv4: try to cache dst_entries which would cause a redirect samples: bpf: relax test_maps check bpf: rcu lock must not be held when calling copy_to_user() net: sctp: fix slab corruption from use after free on INIT collisions net: mv643xx_eth: Fix highmem support in non-TSO egress path sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers ...
2015-01-27net: don't OOPS on socket aioChristoph Hellwig1-3/+0
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27stmmac: prevent probe drivers to crash kernelAndy Shevchenko1-1/+4
In the case when alloc_netdev fails we return NULL to a caller. But there is no check for NULL in the probe drivers. This patch changes NULL to an error pointer. The function description is amended to reflect what we may get returned. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27spi: spi-fsl-dspi: Remove usage of devm_kzallocBhuvanchandra DV1-2/+12
devm_* API was supposed to be used only in probe function call. Memory is allocated at 'probe' and free automatically at 'remove'. Usage of devm_* functions outside probe sometimes leads to memory leak. Avoid using devm_kzalloc in dspi_setup_transfer and use kzalloc instead. Also add the dspi_cleanup function to free the controller data upon cleanup. Acked-by: Stefan Agner <[email protected]> Signed-off-by: Bhuvanchandra DV <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: [email protected]
2015-01-27ASoC: Intel: Used lock version to update shim registersJie Yang1-2/+2
We need hold lock each time updating shirm registers, otherwise, we may set unexpected values to them when they are set in different thread at different time sequence. The notification work will be scheduled in global work queue, which won't hold this sst->spinlock itself, so here we need change to use the lock version to update shim registers. Signed-off-by: Jie Yang <[email protected]> Signed-off-by: Mark Brown <[email protected]>
2015-01-27ASoC: wm8731: init mutex in i2c init pathManuel Lauss1-0/+2
The I2C init path forgot to init the mutex, leading to an oops when controls are accessed. Signed-off-by: Manuel Lauss <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: [email protected]
2015-01-27Merge tag 'powerpc-3.19-5' of ↵Linus Torvalds2-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc fixes from Michael Ellerman: "Two powerpc fixes" * tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared powerpc/xmon: Fix another endiannes issue in RTAS call from xmon
2015-01-27ASoC: atmel_ssc_dai: fix start event for I2S modeBo Shen1-14/+4
According to the I2S specification information as following: - WS = 0, channel 1 (left) - WS = 1, channel 2 (right) So, the start event should be TF/RF falling edge. Reported-by: Songjun Wu <[email protected]> Signed-off-by: Bo Shen <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: [email protected]
2015-01-27Merge tag 'fixes-for-linus' of ↵Linus Torvalds1-10/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull one more module fix from Rusty Russell: "SCSI was using module_refcount() to figure out when the module was unloading: this broke with new atomic refcounting. The code is still suspicious, but this solves the WARN_ON()" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: scsi: always increment reference count
2015-01-27PCI: designware: Reject MSI-X IRQsLucas Stach1-0/+3
The DesignWare PCIe MSI hardware does not support MSI-X IRQs. Setting those up failed as a side effect of a bug which was fixed by 91f8ae823f2b ("PCI: designware: Setup and clear exactly one MSI at a time"). Now that this bug is fixed, MSI-X IRQs need to be rejected explicitly; otherwise devices trying to use them may end up with incorrectly working interrupts. Fixes: 91f8ae823f2b ("PCI: designware: Setup and clear exactly one MSI at a time") Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Jingoo Han <[email protected]> CC: [email protected] # v3.18+
2015-01-27bnx2x: fix napi poll return value for repollGovindarajulu Varadarajan1-1/+1
With the commit d75b1ade567ffab ("net: less interrupt masking in NAPI") napi repoll is done only when work_done == budget. When in busy_poll is we return 0 in napi_poll. We should return budget. Signed-off-by: Govindarajulu Varadarajan <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27Merge branch 'master' of ↵David S. Miller1-2/+8
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== ipsec 2015-01-26 Just two small fixes for _decode_session6() where we might decode to wrong header information in some rare situations. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-01-27ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos tooHannes Frederic Sowa1-19/+26
Lubomir Rintel reported that during replacing a route the interface reference counter isn't correctly decremented. To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>: | [root@rhel7-5 lkundrak]# sh -x lal | + ip link add dev0 type dummy | + ip link set dev0 up | + ip link add dev1 type dummy | + ip link set dev1 up | + ip addr add 2001:db8:8086::2/64 dev dev0 | + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20 | + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10 | + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20 | + ip link del dev0 type dummy | Message from syslogd@rhel7-5 at Jan 23 10:54:41 ... | kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 | | Message from syslogd@rhel7-5 at Jan 23 10:54:51 ... | kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2 During replacement of a rt6_info we must walk all parent nodes and check if the to be replaced rt6_info got propagated. If so, replace it with an alive one. Fixes: 4a287eba2de3957 ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag") Reported-by: Lubomir Rintel <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Tested-by: Lubomir Rintel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27Merge branch 'sh_eth'David S. Miller1-21/+59
Ben Hutchings says: ==================== Fixes for sh_eth #3 I'm continuing review and testing of Ethernet support on the R-Car H2 chip. This series fixes the last of the more serious issues I've found. These are not tested on any of the other supported chips. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-01-27sh_eth: Fix DMA-API usage for RX buffersBen Hutchings1-11/+23
- Use the return value of dma_map_single(), rather than calling virt_to_page() separately - Check for mapping failue - Call dma_unmap_single() rather than dma_sync_single_for_cpu() Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27sh_eth: Check for DMA mapping errors on transmitBen Hutchings1-0/+4
dma_map_single() may fail if an IOMMU or swiotlb is in use, so we need to check for this. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27sh_eth: Ensure DMA engines are stopped before freeing buffersBen Hutchings1-7/+32
Currently we try to clear EDRRR and EDTRR and immediately continue to free buffers. This is unsafe because: - In general, register writes are not serialised with DMA, so we still have to wait for DMA to complete somehow - The R8A7790 (R-Car H2) manual states that the TX running flag cannot be cleared by writing to EDTRR - The same manual states that clearing the RX running flag only stops RX DMA at the next packet boundary I applied this patch to the driver to detect DMA writes to freed buffers: > --- a/drivers/net/ethernet/renesas/sh_eth.c > +++ b/drivers/net/ethernet/renesas/sh_eth.c > @@ -1098,7 +1098,14 @@ static void sh_eth_ring_free(struct net_device *ndev) > /* Free Rx skb ringbuffer */ > if (mdp->rx_skbuff) { > for (i = 0; i < mdp->num_rx_ring; i++) > + memcpy(mdp->rx_skbuff[i]->data, > + "Hello, world", 12); > + msleep(100); > + for (i = 0; i < mdp->num_rx_ring; i++) { > + WARN_ON(memcmp(mdp->rx_skbuff[i]->data, > + "Hello, world", 12)); > dev_kfree_skb(mdp->rx_skbuff[i]); > + } > } > kfree(mdp->rx_skbuff); > mdp->rx_skbuff = NULL; then ran the loop: while ethtool -G eth0 rx 128 ; ethtool -G eth0 rx 64; do echo -n .; done and 'ping -f' toward the sh_eth port from another machine. The warning fired several times a minute. To fix these issues: - Deactivate all TX descriptors rather than writing to EDTRR - As there seems to be no way of telling when RX DMA is stopped, perform a soft reset to ensure that both DMA enginess are stopped - To reduce the possibility of the reset truncating a transmitted frame, disable egress and wait a reasonable time to reach a packet boundary before resetting - Update statistics before resetting (The 'reasonable time' does not allow for CS/CD in half-duplex mode, but half-duplex no longer seems reasonable!) Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27sh_eth: Remove RX overflow log messagesBen Hutchings1-3/+0
If RX traffic is overflowing the FIFO or DMA ring, logging every time this happens just makes things worse. These errors are visible in the statistics anyway. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27Merge tag 'linux-can-fixes-for-3.19-20150127' of ↵David S. Miller1-13/+15
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2015-01-27 this is another pull request for net/master which consists of 4 patches. All 4 patches are contributed by Ahmed S. Darwish, he fixes more problems in the kvaser_usb driver. David, please merge net/master to net-next/master, as we have more kvaser_usb patches in the queue, that target net-next. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-01-27ping: Fix race in free in receive path[email protected]1-1/+4
An exception is seen in ICMP ping receive path where the skb destructor sock_rfree() tries to access a freed socket. This happens because ping_rcv() releases socket reference with sock_put() and this internally frees up the socket. Later icmp_rcv() will try to free the skb and as part of this, skb destructor is called and which leads to a kernel panic as the socket is freed already in ping_rcv(). -->|exception -007|sk_mem_uncharge -007|sock_rfree -008|skb_release_head_state -009|skb_release_all -009|__kfree_skb -010|kfree_skb -011|icmp_rcv -012|ip_local_deliver_finish Fix this incorrect free by cloning this skb and processing this cloned skb instead. This patch was suggested by Eric Dumazet Signed-off-by: Subash Abhinov Kasiviswanathan <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27udp_diag: Fix socket skipping within chainHerbert Xu1-1/+3
While working on rhashtable walking I noticed that the UDP diag dumping code is buggy. In particular, the socket skipping within a chain never happens, even though we record the number of sockets that should be skipped. As this code was supposedly copied from TCP, this patch does what TCP does and resets num before we walk a chain. Signed-off-by: Herbert Xu <[email protected]> Acked-by: Pavel Emelyanov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-27can: kvaser_usb: Fix state handling upon BUS_ERROR eventsAhmed S. Darwish1-4/+3
While being in an ERROR_WARNING state, and receiving further bus error events with error counters still in the ERROR_WARNING range of 97-127 inclusive, the state handling code erroneously reverts back to ERROR_ACTIVE. Per the CAN standard, only revert to ERROR_ACTIVE when the error counters are less than 96. Moreover, in certain Kvaser models, the BUS_ERROR flag is always set along with undefined bits in the M16C status register. Thus use bitwise operators instead of full equality for checking that register against bus errors. Signed-off-by: Ahmed S. Darwish <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2015-01-27can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUTAhmed S. Darwish1-2/+10
On some x86 laptops, plugging a Kvaser device again after an unplug makes the firmware always ignore the very first command. For such a case, provide some room for retries instead of completely exiting the driver init code. Signed-off-by: Ahmed S. Darwish <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2015-01-27can: kvaser_usb: Send correct context to URB completionAhmed S. Darwish1-1/+1
Send expected argument to the URB completion hander: a CAN netdevice instead of the network interface private context `kvaser_usb_net_priv'. This was discovered by having some garbage in the kernel log in place of the netdevice names: can0 and can1. Signed-off-by: Ahmed S. Darwish <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2015-01-27can: kvaser_usb: Do not sleep in atomic contextAhmed S. Darwish1-6/+1
Upon receiving a hardware event with the BUS_RESET flag set, the driver kills all of its anchored URBs and resets all of its transmit URB contexts. Unfortunately it does so under the context of URB completion handler `kvaser_usb_read_bulk_callback()', which is often called in an atomic context. While the device is flooded with many received error packets, usb_kill_urb() typically sleeps/reschedules till the transfer request of each killed URB in question completes, leading to the sleep in atomic bug. [3] In v2 submission of the original driver patch [1], it was stated that the URBs kill and tx contexts reset was needed since we don't receive any tx acknowledgments later and thus such resources will be locked down forever. Fortunately this is no longer needed since an earlier bugfix in this patch series is now applied: all tx URB contexts are reset upon CAN channel close. [2] Moreover, a BUS_RESET is now treated _exactly_ like a BUS_OFF event, which is the recommended handling method advised by the device manufacturer. [1] http://article.gmane.org/gmane.linux.network/239442 http://www.webcitation.org/6Vr2yagAQ [2] can: kvaser_usb: Reset all URB tx contexts upon channel close 889b77f7fd2bcc922493d73a4c51d8a851505815 [3] Stacktrace: <IRQ> [<ffffffff8158de87>] dump_stack+0x45/0x57 [<ffffffff8158b60c>] __schedule_bug+0x41/0x4f [<ffffffff815904b1>] __schedule+0x5f1/0x700 [<ffffffff8159360a>] ? _raw_spin_unlock_irqrestore+0xa/0x10 [<ffffffff81590684>] schedule+0x24/0x70 [<ffffffff8147d0a5>] usb_kill_urb+0x65/0xa0 [<ffffffff81077970>] ? prepare_to_wait_event+0x110/0x110 [<ffffffff8147d7d8>] usb_kill_anchored_urbs+0x48/0x80 [<ffffffffa01f4028>] kvaser_usb_unlink_tx_urbs+0x18/0x50 [kvaser_usb] [<ffffffffa01f45d0>] kvaser_usb_rx_error+0xc0/0x400 [kvaser_usb] [<ffffffff8108b14a>] ? vprintk_default+0x1a/0x20 [<ffffffffa01f5241>] kvaser_usb_read_bulk_callback+0x4c1/0x5f0 [kvaser_usb] [<ffffffff8147a73e>] __usb_hcd_giveback_urb+0x5e/0xc0 [<ffffffff8147a8a1>] usb_hcd_giveback_urb+0x41/0x110 [<ffffffffa0008748>] finish_urb+0x98/0x180 [ohci_hcd] [<ffffffff810cd1a7>] ? acct_account_cputime+0x17/0x20 [<ffffffff81069f65>] ? local_clock+0x15/0x30 [<ffffffffa000a36b>] ohci_work+0x1fb/0x5a0 [ohci_hcd] [<ffffffff814fbb31>] ? process_backlog+0xb1/0x130 [<ffffffffa000cd5b>] ohci_irq+0xeb/0x270 [ohci_hcd] [<ffffffff81479fc1>] usb_hcd_irq+0x21/0x30 [<ffffffff8108bfd3>] handle_irq_event_percpu+0x43/0x120 [<ffffffff8108c0ed>] handle_irq_event+0x3d/0x60 [<ffffffff8108ec84>] handle_fasteoi_irq+0x74/0x110 [<ffffffff81004dfd>] handle_irq+0x1d/0x30 [<ffffffff81004727>] do_IRQ+0x57/0x100 [<ffffffff8159482a>] common_interrupt+0x6a/0x6a Signed-off-by: Ahmed S. Darwish <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>