aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-02-17clk: tegra: fix host1x clock on Tegra124Mark Zhang1-1/+1
The host1x clock on Tegra124 is a 3-bit wide mux with 6 parents. Change thte id to tegra_clk_host1x_8 so that the correct clock gets registered. Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Andrew Bresticker <[email protected]>
2014-02-17clk: tegra: PLLD2 fixes for hdmiDavid Ung1-8/+7
Set correct pll_d2_out0 divider and correct the p div values for pll_d2. Signed-off-by: David Ung <[email protected]> Signed-off-by: Andrew Bresticker <[email protected]>
2014-02-17clk: tegra: Fix PLLD mnp tableRhyland Klein1-1/+10
PLLD was using the same mnp table as PLLP. Fix it to use its own table which is different from PLLP's. Signed-off-by: Rhyland Klein <[email protected]> Signed-off-by: Andrew Bresticker <[email protected]>
2014-02-17clk: tegra: Fix PLLP rate tableGabe Black1-5/+5
This table had settings for 216MHz, but PLLP is (and is supposed to be) configured at 408MHz. If that table is used and PLLP_BASE_OVRRIDE is not set, the kernel will panic in clk_pll_recalc_rate(). Signed-off-by: Gabe Black <[email protected]> Signed-off-by: Andrew Bresticker <[email protected]>
2014-02-17clk: tegra: Correct clock number for UARTEThierry Reding1-1/+1
UARTE has clock number 66. Number 65 is the right one for UARTD. Signed-off-by: Thierry Reding <[email protected]>
2014-02-17clk: tegra: Add missing Tegra20 fuse clksPeter De Schrijver1-0/+2
Add clocks required for accessing fuses on Tegra20. Signed-off-by: Peter De Schrijver <[email protected]> Acked-by: Stephen Warren <[email protected]>
2014-02-17HID: hid-sensor-hub: quirk for STM Sensor hubArchana Patni2-0/+6
Added STM sensor hub vendor id in HID_SENSOR_HUB_ENUM_QUIRK to fix report descriptors. These devices uses old FW which uses logical 0 as minimum. In these, HID reports are not using proper collection classes. So we need to fix report descriptors,for such devices. This will not have any impact, if the FW uses logical 1 as minimum. We look for usage id for "power and report state", and modify logical minimum value to 1. This is a follow-up patch to commit id 875e36f8. Signed-off-by: Archana Patni <[email protected]> Reviewed-by: Srinivas Pandruvada <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2014-02-17avr32: add generic vga.h to KbuildChen Gang1-0/+1
Need add generic "vga.h", or can not pass building for allmodconfig, the related error: CC [M] drivers/gpu/drm/drm_irq.o In file included from include/linux/vgaarb.h:34, from drivers/gpu/drm/drm_irq.c:42: include/video/vga.h:22:21: error: asm/vga.h: No such file or directory Signed-off-by: Chen Gang <[email protected]> Acked-by: Hans-Christian Egtvedt <[email protected]>
2014-02-17avr32: add generic ioremap_wc() definition in io.hChen Gang1-0/+2
Need generic ioremap_wc(), or can not pass compiling with allmodconfig, the related error: CC [M] drivers/gpu/drm/drm_bufs.o drivers/gpu/drm/drm_bufs.c: In function 'drm_addmap_core': drivers/gpu/drm/drm_bufs.c:217: error: implicit declaration of function 'ioremap_wc' drivers/gpu/drm/drm_bufs.c:218: warning: assignment makes pointer from integer without a cast Signed-off-by: Chen Gang <[email protected]> Acked-by: Hans-Christian Egtvedt <[email protected]>
2014-02-17avr32: Makefile: add '-D__linux__' flag for gcc-4.4.7 useChen Gang1-1/+1
For avr32 cross compiler, do not define '__linux__' internally, so it will cause issue with allmodconfig. The related error: CC [M] fs/coda/psdev.o In file included from include/linux/coda.h:64, from fs/coda/psdev.c:45: include/uapi/linux/coda.h:221: error: expected specifier-qualifier-list before 'u_quad_t' The related toolchain version (which only download, not re-compile): [root@gchen linux-next]# /upstream/toolchain/download/avr32-gnu-toolchain-linux_x86/bin/avr32-gcc -v Using built-in specs. Target: avr32 Configured with: /data2/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/src/gcc/configure --target=avr32 --host=i686-pc-linux-gnu --build=x86_64-pc-linux-gnu --prefix=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-languages=c,c++ --disable-nls --disable-libssp --disable-libstdcxx-pch --with-dwarf2 --enable-version-specific-runtime-libs --disable-shared --enable-doc --with-mpfr-lib=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/lib --with-mpfr-include=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86/include --with-gmp=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --with-mpc=/home/toolsbuild/jenkins-knuth/workspace/avr32-gnu-toolchain/avr32-gnu-toolchain-linux_x86 --enable-__cxa_atexit --disable-shared --with-newlib --with-pkgversion=AVR_32_bit_GNU_Toolchain_3.4.2_435 --with-bugurl=http://www .atmel.com/avr Thread model: single gcc version 4.4.7 (AVR_32_bit_GNU_Toolchain_3.4.2_435) Signed-off-by: Chen Gang <[email protected]> Acked-by: Hans-Christian Egtvedt <[email protected]> Cc: [email protected]
2014-02-17avr32: fix missing module.h causing build failure in mimc200/fram.cPaul Gortmaker1-0/+1
Causing this: In file included from arch/avr32/boards/mimc200/fram.c:13: include/linux/miscdevice.h:51: error: field 'list' has incomplete type include/linux/miscdevice.h:55: error: expected specifier-qualifier-list before 'mode_t' arch/avr32/boards/mimc200/fram.c:42: error: 'THIS_MODULE' undeclared here (not in a function) Reported-by: Fengguang Wu <[email protected]> Cc: Haavard Skinnemoen <[email protected]> Cc: Hans-Christian Egtvedt <[email protected]> Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Sergei Trofimovich <[email protected]> Acked-by: Hans-Christian Egtvedt <[email protected]> Cc: [email protected]
2014-02-17ALSA: usb-audio: work around KEF X300A firmware bugClemens Ladisch1-0/+9
When the driver tries to access Function Unit 10, the KEF X300A speakers' firmware apparently locks up, making even PCM streaming impossible. Work around this by ignoring this FU. Cc: <[email protected]> Signed-off-by: Clemens Ladisch <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2014-02-17packet: check for ndo_select_queue during queue selectionDaniel Borkmann1-3/+21
Mathias reported that on an AMD Geode LX embedded board (ALiX) with ath9k driver PACKET_QDISC_BYPASS, introduced in commit d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option"), triggers a WARN_ON() coming from the driver itself via 066dae93bdf ("ath9k: rework tx queue selection and fix queue stopping/waking"). The reason why this happened is that ndo_select_queue() call is not invoked from direct xmit path i.e. for ieee80211 subsystem that sets queue and TID (similar to 802.1d tag) which is being put into the frame through 802.11e (WMM, QoS). If that is not set, pending frame counter for e.g. ath9k can get messed up. So the WARN_ON() in ath9k is absolutely legitimate. Generally, the hw queue selection in ieee80211 depends on the type of traffic, and priorities are set according to ieee80211_ac_numbers mapping; working in a similar way as DiffServ only on a lower layer, so that the AP can favour frames that have "real-time" requirements like voice or video data frames. Therefore, check for presence of ndo_select_queue() in netdev ops and, if available, invoke it with a fallback handler to __packet_pick_tx_queue(), so that driver such as bnx2x, ixgbe, or mlx4 can still select a hw queue for transmission in relation to the current CPU while e.g. ieee80211 subsystem can make their own choices. Reported-by: Mathias Kretschmer <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Cc: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-17netdevice: move netdev_cap_txqueue for shared usage to headerDaniel Borkmann2-12/+21
In order to allow users to invoke netdev_cap_txqueue, it needs to be moved into netdevice.h header file. While at it, also add kernel doc header to document the API. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-17netdevice: add queue selection fallback handler for ndo_select_queueDaniel Borkmann17-27/+31
Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: David S. Miller <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-17drivers/net: tulip_remove_one needs to call pci_disable_device()Ingo Molnar1-0/+1
Otherwise the device is not completely shut down. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-17net: sctp: Fix a_rwnd/rwnd management to reflect real state of the ↵Matija Glavinic Pecotic5-87/+25
receiver's buffer Implementation of (a)rwnd calculation might lead to severe performance issues and associations completely stalling. These problems are described and solution is proposed which improves lksctp's robustness in congestion state. 1) Sudden drop of a_rwnd and incomplete window recovery afterwards Data accounted in sctp_assoc_rwnd_decrease takes only payload size (sctp data), but size of sk_buff, which is blamed against receiver buffer, is not accounted in rwnd. Theoretically, this should not be the problem as actual size of buffer is double the amount requested on the socket (SO_RECVBUF). Problem here is that this will have bad scaling for data which is less then sizeof sk_buff. E.g. in 4G (LTE) networks, link interfacing radio side will have a large portion of traffic of this size (less then 100B). An example of sudden drop and incomplete window recovery is given below. Node B exhibits problematic behavior. Node A initiates association and B is configured to advertise rwnd of 10000. A sends messages of size 43B (size of typical sctp message in 4G (LTE) network). On B data is left in buffer by not reading socket in userspace. Lets examine when we will hit pressure state and declare rwnd to be 0 for scenario with above stated parameters (rwnd == 10000, chunk size == 43, each chunk is sent in separate sctp packet) Logic is implemented in sctp_assoc_rwnd_decrease: socket_buffer (see below) is maximum size which can be held in socket buffer (sk_rcvbuf). current_alloced is amount of data currently allocated (rx_count) A simple expression is given for which it will be examined after how many packets for above stated parameters we enter pressure state: We start by condition which has to be met in order to enter pressure state: socket_buffer < currently_alloced; currently_alloced is represented as size of sctp packets received so far and not yet delivered to userspace. x is the number of chunks/packets (since there is no bundling, and each chunk is delivered in separate packet, we can observe each chunk also as sctp packet, and what is important here, having its own sk_buff): socket_buffer < x*each_sctp_packet; each_sctp_packet is sctp chunk size + sizeof(struct sk_buff). socket_buffer is twice the amount of initially requested size of socket buffer, which is in case of sctp, twice the a_rwnd requested: 2*rwnd < x*(payload+sizeof(struc sk_buff)); sizeof(struct sk_buff) is 190 (3.13.0-rc4+). Above is stated that rwnd is 10000 and each payload size is 43 20000 < x(43+190); x > 20000/233; x ~> 84; After ~84 messages, pressure state is entered and 0 rwnd is advertised while received 84*43B ~= 3612B sctp data. This is why external observer notices sudden drop from 6474 to 0, as it will be now shown in example: IP A.34340 > B.12345: sctp (1) [INIT] [init tag: 1875509148] [rwnd: 81920] [OS: 10] [MIS: 65535] [init TSN: 1096057017] IP B.12345 > A.34340: sctp (1) [INIT ACK] [init tag: 3198966556] [rwnd: 10000] [OS: 10] [MIS: 10] [init TSN: 902132839] IP A.34340 > B.12345: sctp (1) [COOKIE ECHO] IP B.12345 > A.34340: sctp (1) [COOKIE ACK] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057017] [SID: 0] [SSEQ 0] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057017] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057018] [SID: 0] [SSEQ 1] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057018] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057019] [SID: 0] [SSEQ 2] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057019] [a_rwnd 9914] [#gap acks 0] [#dup tsns 0] <...> IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057098] [SID: 0] [SSEQ 81] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057098] [a_rwnd 6517] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057099] [SID: 0] [SSEQ 82] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057099] [a_rwnd 6474] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057100] [SID: 0] [SSEQ 83] [PPID 0x18] --> Sudden drop IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] At this point, rwnd_press stores current rwnd value so it can be later restored in sctp_assoc_rwnd_increase. This however doesn't happen as condition to start slowly increasing rwnd until rwnd_press is returned to rwnd is never met. This condition is not met since rwnd, after it hit 0, must first reach rwnd_press by adding amount which is read from userspace. Let us observe values in above example. Initial a_rwnd is 10000, pressure was hit when rwnd was ~6500 and the amount of actual sctp data currently waiting to be delivered to userspace is ~3500. When userspace starts to read, sctp_assoc_rwnd_increase will be blamed only for sctp data, which is ~3500. Condition is never met, and when userspace reads all data, rwnd stays on 3569. IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 1505] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057100] [a_rwnd 3010] [#gap acks 0] [#dup tsns 0] IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057101] [SID: 0] [SSEQ 84] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057101] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0] --> At this point userspace read everything, rwnd recovered only to 3569 IP A.34340 > B.12345: sctp (1) [DATA] (B)(E) [TSN: 1096057102] [SID: 0] [SSEQ 85] [PPID 0x18] IP B.12345 > A.34340: sctp (1) [SACK] [cum ack 1096057102] [a_rwnd 3569] [#gap acks 0] [#dup tsns 0] Reproduction is straight forward, it is enough for sender to send packets of size less then sizeof(struct sk_buff) and receiver keeping them in its buffers. 2) Minute size window for associations sharing the same socket buffer In case multiple associations share the same socket, and same socket buffer (sctp.rcvbuf_policy == 0), different scenarios exist in which congestion on one of the associations can permanently drop rwnd of other association(s). Situation will be typically observed as one association suddenly having rwnd dropped to size of last packet received and never recovering beyond that point. Different scenarios will lead to it, but all have in common that one of the associations (let it be association from 1)) nearly depleted socket buffer, and the other association blames socket buffer just for the amount enough to start the pressure. This association will enter pressure state, set rwnd_press and announce 0 rwnd. When data is read by userspace, similar situation as in 1) will occur, rwnd will increase just for the size read by userspace but rwnd_press will be high enough so that association doesn't have enough credit to reach rwnd_press and restore to previous state. This case is special case of 1), being worse as there is, in the worst case, only one packet in buffer for which size rwnd will be increased. Consequence is association which has very low maximum rwnd ('minute size', in our case down to 43B - size of packet which caused pressure) and as such unusable. Scenario happened in the field and labs frequently after congestion state (link breaks, different probabilities of packet drop, packet reordering) and with scenario 1) preceding. Here is given a deterministic scenario for reproduction: >From node A establish two associations on the same socket, with rcvbuf_policy being set to share one common buffer (sctp.rcvbuf_policy == 0). On association 1 repeat scenario from 1), that is, bring it down to 0 and restore up. Observe scenario 1). Use small payload size (here we use 43). Once rwnd is 'recovered', bring it down close to 0, as in just one more packet would close it. This has as a consequence that association number 2 is able to receive (at least) one more packet which will bring it in pressure state. E.g. if association 2 had rwnd of 10000, packet received was 43, and we enter at this point into pressure, rwnd_press will have 9957. Once payload is delivered to userspace, rwnd will increase for 43, but conditions to restore rwnd to original state, just as in 1), will never be satisfied. --> Association 1, between A.y and B.12345 IP A.55915 > B.12345: sctp (1) [INIT] [init tag: 836880897] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 4032536569] IP B.12345 > A.55915: sctp (1) [INIT ACK] [init tag: 2873310749] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3799315613] IP A.55915 > B.12345: sctp (1) [COOKIE ECHO] IP B.12345 > A.55915: sctp (1) [COOKIE ACK] --> Association 2, between A.z and B.12346 IP A.55915 > B.12346: sctp (1) [INIT] [init tag: 534798321] [rwnd: 10000] [OS: 10] [MIS: 65535] [init TSN: 2099285173] IP B.12346 > A.55915: sctp (1) [INIT ACK] [init tag: 516668823] [rwnd: 81920] [OS: 10] [MIS: 10] [init TSN: 3676403240] IP A.55915 > B.12346: sctp (1) [COOKIE ECHO] IP B.12346 > A.55915: sctp (1) [COOKIE ACK] --> Deplete socket buffer by sending messages of size 43B over association 1 IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315613] [SID: 0] [SSEQ 0] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315613] [a_rwnd 9957] [#gap acks 0] [#dup tsns 0] <...> IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315696] [a_rwnd 6388] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315697] [SID: 0] [SSEQ 84] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315697] [a_rwnd 6345] [#gap acks 0] [#dup tsns 0] --> Sudden drop on 1 IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315698] [SID: 0] [SSEQ 85] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315698] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Here userspace read, rwnd 'recovered' to 3698, now deplete again using association 1 so there is place in buffer for only one more packet IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315799] [SID: 0] [SSEQ 186] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315799] [a_rwnd 86] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315800] [SID: 0] [SSEQ 187] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 43] [#gap acks 0] [#dup tsns 0] --> Socket buffer is almost depleted, but there is space for one more packet, send them over association 2, size 43B IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403240] [SID: 0] [SSEQ 0] [PPID 0x18] IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403240] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Immediate drop IP A.60995 > B.12346: sctp (1) [SACK] [cum ack 387491510] [a_rwnd 0] [#gap acks 0] [#dup tsns 0] --> Read everything from the socket, both association recover up to maximum rwnd they are capable of reaching, note that association 1 recovered up to 3698, and association 2 recovered only to 43 IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 1548] [#gap acks 0] [#dup tsns 0] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315800] [a_rwnd 3053] [#gap acks 0] [#dup tsns 0] IP B.12345 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3799315801] [SID: 0] [SSEQ 188] [PPID 0x18] IP A.55915 > B.12345: sctp (1) [SACK] [cum ack 3799315801] [a_rwnd 3698] [#gap acks 0] [#dup tsns 0] IP B.12346 > A.55915: sctp (1) [DATA] (B)(E) [TSN: 3676403241] [SID: 0] [SSEQ 1] [PPID 0x18] IP A.55915 > B.12346: sctp (1) [SACK] [cum ack 3676403241] [a_rwnd 43] [#gap acks 0] [#dup tsns 0] A careful reader might wonder why it is necessary to reproduce 1) prior reproduction of 2). It is simply easier to observe when to send packet over association 2 which will push association into the pressure state. Proposed solution: Both problems share the same root cause, and that is improper scaling of socket buffer with rwnd. Solution in which sizeof(sk_buff) is taken into concern while calculating rwnd is not possible due to fact that there is no linear relationship between amount of data blamed in increase/decrease with IP packet in which payload arrived. Even in case such solution would be followed, complexity of the code would increase. Due to nature of current rwnd handling, slow increase (in sctp_assoc_rwnd_increase) of rwnd after pressure state is entered is rationale, but it gives false representation to the sender of current buffer space. Furthermore, it implements additional congestion control mechanism which is defined on implementation, and not on standard basis. Proposed solution simplifies whole algorithm having on mind definition from rfc: o Receiver Window (rwnd): This gives the sender an indication of the space available in the receiver's inbound buffer. Core of the proposed solution is given with these lines: sctp_assoc_rwnd_update: if ((asoc->base.sk->sk_rcvbuf - rx_count) > 0) asoc->rwnd = (asoc->base.sk->sk_rcvbuf - rx_count) >> 1; else asoc->rwnd = 0; We advertise to sender (half of) actual space we have. Half is in the braces depending whether you would like to observe size of socket buffer as SO_RECVBUF or twice the amount, i.e. size is the one visible from userspace, that is, from kernelspace. In this way sender is given with good approximation of our buffer space, regardless of the buffer policy - we always advertise what we have. Proposed solution fixes described problems and removes necessity for rwnd restoration algorithm. Finally, as proposed solution is simplification, some lines of code, along with some bytes in struct sctp_association are saved. Version 2 of the patch addressed comments from Vlad. Name of the function is set to be more descriptive, and two parts of code are changed, in one removing the superfluous call to sctp_assoc_rwnd_update since call would not result in update of rwnd, and the other being reordering of the code in a way that call to sctp_assoc_rwnd_update updates rwnd. Version 3 corrected change introduced in v2 in a way that existing function is not reordered/copied in line, but it is correctly called. Thanks Vlad for suggesting. Signed-off-by: Matija Glavinic Pecotic <[email protected]> Reviewed-by: Alexander Sverdlin <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-16ipv4: distinguish EHOSTUNREACH from the ENETUNREACHDuan Jiong1-2/+7
since commit 251da413("ipv4: Cache ip_error() routes even when not forwarding."), the counter IPSTATS_MIB_INADDRERRORS can't work correctly, because the value of err was always set to ENETUNREACH. Signed-off-by: Duan Jiong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-16hyperv: Fix the carrier status settingHaiyang Zhang1-15/+38
Without this patch, the "cat /sys/class/net/ethN/operstate" shows "unknown", and "ethtool ethN" shows "Link detected: yes", when VM boots up with or without vNIC connected. This patch fixed the problem. Signed-off-by: Haiyang Zhang <[email protected]> Reviewed-by: K. Y. Srinivasan <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-16dccp: re-enable debug macroGerrit Renker2-1/+2
dccp tfrc: revert This reverts 6aee49c558de ("dccp: make local variable static") since the variable tfrc_debug is referenced by the tfrc_pr_debug(fmt, ...) macro when TFRC debugging is enabled. If it is enabled, use of the macro produces a compilation error. Signed-off-by: Gerrit Renker <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-17dt: Update binding information for mvebu gating clocks with Armada 380/385Thomas Petazzoni1-4/+32
Add the binding information for the gating clocks of the Armada 380 SoCs and the Armada 385 SoCs. Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Gregory CLEMENT <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17dt: Update binding information for mvebu core clocks with Armada 380/385Thomas Petazzoni1-0/+7
Add the binding information for the core clocks of the Armada 380 and Armada 385 SoCs Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Gregory CLEMENT <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17clk: mvebu: add clock support for Armada 380/385Gregory CLEMENT3-0/+172
Add the clock support for the new SoCs Armada 380 and Armada 385: core clocks and gating clocks. Signed-off-by: Gregory CLEMENT <[email protected]> Reviewed-by: Thomas Petazzoni <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17dt: Update binding information for mvebu gating clocks with Armada 375Gregory CLEMENT1-1/+30
Add the binding information for the gating clocks of the Armada 375 SoCs Signed-off-by: Gregory CLEMENT <[email protected]> Reviewed-by: Thomas Petazzoni <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17dt: Update binding information for mvebu core clocks with Armada 375Gregory CLEMENT1-0/+7
Add the binding information for the core clocks of the Armada 375 SoCs Signed-off-by: Gregory CLEMENT <[email protected]> Reviewed-by: Thomas Petazzoni <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17clk: mvebu: add clock support for Armada 375Gregory CLEMENT3-0/+189
Add the clock support for the new SoC Armada 375: core clocks and gating clocks. Signed-off-by: Gregory CLEMENT <[email protected]> Reviewed-by: Thomas Petazzoni <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17clk: mvebu: add Armada 375 support to the corediv clock driverThomas Petazzoni1-0/+19
This commit adds support for the Core Divider clocks of the Armada 375. Compared to Armada 370 and XP the Core Divider clocks of the 375 cannot be gated: only their ratio can be changed. This is reflected by the fact that the enable, disable and is_enabled clock operations are not defined, and that the enable_bit_offset field is also undefined. Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17clk: mvebu: refactor corediv driver to support more SoCThomas Petazzoni1-24/+57
This commit refactors the corediv clock driver so that it is capable of handling various SOCs that have slightly different corediv clock registers and capabilities. It introduces a clk_corediv_soc_desc structure that encapsulates all the SoC specific details. Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17clk: mvebu: add a little bit of documentation about data structuresThomas Petazzoni1-0/+17
Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-17clk: mvebu: do not copy the contents of clk_corediv_descThomas Petazzoni1-8/+8
Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: Jason Cooper <[email protected]>
2014-02-16ext4: don't leave i_crtime.tv_sec uninitializedTheodore Ts'o1-0/+2
If the i_crtime field is not present in the inode, don't leave the field uninitialized. Fixes: ef7f38359 ("ext4: Add nanosecond timestamps") Reported-by: Vegard Nossum <[email protected]> Tested-by: Vegard Nossum <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: [email protected]
2014-02-17powerpc/eeh: Disable EEH on rebootGavin Shan2-1/+22
We possiblly detect EEH errors during reboot, particularly in kexec path, but it's impossible for device drivers and EEH core to handle or recover them properly. The patch registers one reboot notifier for EEH and disable EEH subsystem during reboot. That means the EEH errors is going to be cleared by hardware reset or second kernel during early stage of PCI probe. Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc/eeh: Cleanup on eeh_subsystem_enabledGavin Shan4-10/+27
The patch cleans up variable eeh_subsystem_enabled so that we needn't refer the variable directly from external. Instead, we will use function eeh_enabled() and eeh_set_enable() to operate the variable. Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc/powernv: Rework EEH resetGavin Shan1-25/+4
When doing reset in order to recover the affected PE, we issue hot reset on PE primary bus if it's not root bus. Otherwise, we issue hot or fundamental reset on root port or PHB accordingly. For the later case, we didn't cover the situation where PE only includes root port and it potentially causes kernel crash upon EEH error to the PE. The patch reworks the logic of EEH reset to improve the code readability and also avoid the kernel crash. Cc: [email protected] Reported-by: Thadeu Lima de Souza Cascardo <[email protected]> Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc: Use unstripped VDSO image for more accurate profiling dataAnton Blanchard2-2/+2
We are seeing a lot of hits in the VDSO that are not resolved by perf. A while(1) gettimeofday() loop shows the issue: 27.64% [vdso] [.] 0x000000000000060c 22.57% [vdso] [.] 0x0000000000000628 16.88% [vdso] [.] 0x0000000000000610 12.39% [vdso] [.] __kernel_gettimeofday 6.09% [vdso] [.] 0x00000000000005f8 3.58% test [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18 2.94% [vdso] [.] __kernel_datapage_offset 2.90% test [.] main We are using a stripped VDSO image which means only symbols with relocation info can be resolved. There isn't a lot of point to stripping the VDSO, the debug info is only about 1kB: 4680 arch/powerpc/kernel/vdso64/vdso64.so 5815 arch/powerpc/kernel/vdso64/vdso64.so.dbg By using the unstripped image, we can resolve all the symbols in the VDSO and the perf profile data looks much better: 76.53% [vdso] [.] __do_get_tspec 12.20% [vdso] [.] __kernel_gettimeofday 5.05% [vdso] [.] __get_datapage 3.20% test [.] main 2.92% test [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18 Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc: Link VDSOs at 0x0Anton Blanchard1-3/+3
perf is failing to resolve symbols in the VDSO. A while (1) gettimeofday() loop shows: 93.99% [vdso] [.] 0x00000000000005e0 3.12% test [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18 2.81% test [.] main The reason for this is that we are linking our VDSO shared libraries at 1MB, which is a little weird. Even though this is uncommon, Alan points out that it is valid and we should probably fix perf userspace. Regardless, I can't see a reason why we are doing this. The code is all position independent and we never rely on the VDSO ending up at 1M (and we never place it there on 64bit tasks). Changing our link address to 0x0 fixes perf VDSO symbol resolution: 73.18% [vdso] [.] 0x000000000000060c 12.39% [vdso] [.] __kernel_gettimeofday 3.58% test [.] 00000037.plt_call.gettimeofday@@GLIBC_2.18 2.94% [vdso] [.] __kernel_datapage_offset 2.90% test [.] main We still have some local symbol resolution issues that will be fixed in a subsequent patch. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bitAneesh Kumar K.V4-10/+64
Archs like ppc64 doesn't do tlb flush in set_pte/pmd functions when using a hash table MMU for various reasons (the flush is handled as part of the PTE modification when necessary). ppc64 thus doesn't implement flush_tlb_range for hash based MMUs. Additionally ppc64 require the tlb flushing to be batched within ptl locks. The reason to do that is to ensure that the hash page table is in sync with linux page table. We track the hpte index in linux pte and if we clear them without flushing hash and drop the ptl lock, we can have another cpu update the pte and can end up with duplicate entry in the hash table, which is fatal. We also want to keep set_pte_at simpler by not requiring them to do hash flush for performance reason. We do that by assuming that set_pte_at() is never *ever* called on a PTE that is already valid. This was the case until the NUMA code went in which broke that assumption. Fix that by introducing a new pair of helpers to set _PAGE_NUMA in a way similar to ptep/pmdp_set_wrprotect(), with a generic implementation using set_pte_at() and a powerpc specific one using the appropriate mechanism needed to keep the hash table in sync. Acked-by: Mel Gorman <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17mm: Dirty accountable change only apply to non prot numa caseAneesh Kumar K.V1-14/+7
So move it within the if loop Acked-by: Mel Gorman <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc/mm: Add new "set" flag argument to pte/pmd update functionAneesh Kumar K.V4-18/+24
pte_update() is a powerpc-ism used to change the bits of a PTE when the access permission is being restricted (a flush is potentially needed). It uses atomic operations on when needed and handles the hash synchronization on hash based processors. It is currently only used to clear PTE bits and so the current implementation doesn't provide a way to also set PTE bits. The new _PAGE_NUMA bit, when set, is actually restricting access so it must use that function too, so this change adds the ability for pte_update() to also set bits. We will use this later to set the _PAGE_NUMA bit. Acked-by: Mel Gorman <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc/pseries: Add Gen3 definitions for PCIE link speedKleber Sacilotto de Souza1-0/+6
Rev3 of the PCI Express Base Specification defines a Supported Link Speeds Vector where the bit definitions within this field are: Bit 0 - 2.5 GT/s Bit 1 - 5.0 GT/s Bit 2 - 8.0 GT/s This vector definition is used by the platform firmware to export the maximum and current link speeds of the PCI bus via the "ibm,pcie-link-speed-stats" device-tree property. This patch updates pseries_root_bridge_prepare() to detect Gen3 speed buses (defined by 0x04). Signed-off-by: Kleber Sacilotto de Souza <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc/pseries: Fix regression on PCI link speedKleber Sacilotto de Souza1-7/+9
Commit 5091f0c (powerpc/pseries: Fix PCIE link speed endian issue) introduced a regression on the PCI link speed detection using the device-tree property. The ibm,pcie-link-speed-stats property is composed of two 32-bit integers, the first one being the maxinum link speed and the second the current link speed. The changes introduced by the aforementioned commit are considering just the first integer. Fix this issue by changing how the property is accessed, using the helper functions to properly access the array of values. The explicit byte swapping is not needed anymore here, since it's done by the helper functions. Signed-off-by: Kleber Sacilotto de Souza <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-17powerpc: Set the correct ksp_limit on ppc32 when switching to irq stackKevin Hao1-1/+4
Guenter Roeck has got the following call trace on a p2020 board: Kernel stack overflow in process eb3e5a00, r1=eb79df90 CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4 task: eb3e5a00 ti: c0616000 task.ti: ef440000 NIP: c003a420 LR: c003a410 CTR: c0017518 REGS: eb79dee0 TRAP: 0901 Not tainted (3.13.0-rc8-juniper-00146-g19eca00) MSR: 00029000 <CE,EE,ME> CR: 24008444 XER: 00000000 GPR00: c003a410 eb79df90 eb3e5a00 00000000 eb05d900 00000001 65d87646 00000000 GPR08: 00000000 020b8000 00000000 00000000 44008442 NIP [c003a420] __do_softirq+0x94/0x1ec LR [c003a410] __do_softirq+0x84/0x1ec Call Trace: [eb79df90] [c003a410] __do_softirq+0x84/0x1ec (unreliable) [eb79dfe0] [c003a970] irq_exit+0xbc/0xc8 [eb79dff0] [c000cc1c] call_do_irq+0x24/0x3c [ef441f20] [c00046a8] do_IRQ+0x8c/0xf8 [ef441f40] [c000e7f4] ret_from_except+0x0/0x18 --- Exception: 501 at 0xfcda524 LR = 0x10024900 Instruction dump: 7c781b78 3b40000a 3a73b040 543c0024 3a800000 3b3913a0 7ef5bb78 48201bf9 5463103a 7d3b182e 7e89b92e 7c008146 <3ba00000> 7e7e9b78 48000014 57fff87f Kernel panic - not syncing: kernel stack overflow CPU: 0 PID: 2838 Comm: ssh Not tainted 3.13.0-rc8-juniper-00146-g19eca00 #4 Call Trace: The reason is that we have used the wrong register to calculate the ksp_limit in commit cbc9565ee826 (powerpc: Remove ksp_limit on ppc64). Just fix it. As suggested by Benjamin Herrenschmidt, also add the C prototype of the function in the comment in order to avoid such kind of errors in the future. Cc: [email protected] # 3.12 Reported-by: Guenter Roeck <[email protected]> Tested-by: Guenter Roeck <[email protected]> Signed-off-by: Kevin Hao <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2014-02-16Linux 3.14-rc3Linus Torvalds1-1/+1
2014-02-16Merge branch 'for-linus' of ↵Linus Torvalds7-23/+28
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "We have a small collection of fixes in my for-linus branch. The big thing that stands out is a revert of a new ioctl. Users haven't shipped yet in btrfs-progs, and Dave Sterba found a better way to export the information" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: use right clone root offset for compressed extents btrfs: fix null pointer deference at btrfs_sysfs_add_one+0x105 Btrfs: unset DCACHE_DISCONNECTED when mounting default subvol Btrfs: fix max_inline mount option Btrfs: fix a lockdep warning when cleaning up aborted transaction Revert "btrfs: add ioctl to export size of global metadata reservation"
2014-02-16Merge tag 'dt-fixes-for-3.14' of ↵Linus Torvalds1-34/+54
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: "Fix booting on PPC boards. Changes to of_match_node matching caused the serial port on some PPC boards to stop working. Reverted the change and reimplement to split matching between new style compatible only matching and fallback to old matching algorithm" * tag 'dt-fixes-for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: search the best compatible match first in __of_match_node() Revert "OF: base: match each node compatible against all given matches first"
2014-02-16SUNRPC: Fix a pipe_version reference leakTrond Myklebust1-1/+3
In gss_alloc_msg(), if the call to gss_encode_v1_msg() fails, we want to release the reference to the pipe_version that was obtained earlier in the function. Fixes: 9d3a2260f0f4b (SUNRPC: Fix buffer overflow checking in...) Signed-off-by: Trond Myklebust <[email protected]>
2014-02-16SUNRPC: Ensure that gss_auth isn't freed before its upcall messagesTrond Myklebust1-2/+11
Fix a race in which the RPC client is shutting down while the gss daemon is processing a downcall. If the RPC client manages to shut down before the gss daemon is done, then the struct gss_auth used in gss_release_msg() may have already been freed. Link: http://lkml.kernel.org/r/[email protected] Reported-by: John <[email protected]> Reported-by: Borislav Petkov <[email protected]> Cc: [email protected] # 3.12+ Signed-off-by: Trond Myklebust <[email protected]>
2014-02-16ata: sata_mv: Cleanup only the initialized portsEzequiel Garcia1-2/+7
When an error occurs in the port initialization loop, currently the driver tries to cleanup all the ports. This results in a NULL pointer dereference if the ports were only partially initialized. Fix this by updating only the number of initialized ports (either with failure or successfully), before jumping to the error path and looping over that number in the cleanup loop. Cc: Andrew Lunn <[email protected]> Reported-by: Mikael Pettersson <[email protected]> Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected]
2014-02-15ext4: fix online resize with a non-standard blocks per group settingTheodore Ts'o1-1/+1
The set_flexbg_block_bitmap() function assumed that the number of blocks in a blockgroup was sb->blocksize * 8, which is normally true, but not always! Use EXT4_BLOCKS_PER_GROUP(sb) instead, to fix block bitmap corruption after: mke2fs -t ext4 -g 3072 -i 4096 /dev/vdd 1G mount -t ext4 /dev/vdd /vdd resize2fs /dev/vdd 8G Signed-off-by: "Theodore Ts'o" <[email protected]> Reported-by: Jon Bernard <[email protected]> Cc: [email protected]
2014-02-15ext4: fix online resize with very large inode tablesTheodore Ts'o1-12/+20
If a file system has a large number of inodes per block group, all of the metadata blocks in a flex_bg may be larger than what can fit in a single block group. Unfortunately, ext4_alloc_group_tables() in resize.c was never tested to see if it would handle this case correctly, and there were a large number of bugs which caused the following sequence to result in a BUG_ON: kernel bug at fs/ext4/resize.c:409! ... call trace: [<ffffffff81256768>] ext4_flex_group_add+0x1448/0x1830 [<ffffffff81257de2>] ext4_resize_fs+0x7b2/0xe80 [<ffffffff8123ac50>] ext4_ioctl+0xbf0/0xf00 [<ffffffff811c111d>] do_vfs_ioctl+0x2dd/0x4b0 [<ffffffff811b9df2>] ? final_putname+0x22/0x50 [<ffffffff811c1371>] sys_ioctl+0x81/0xa0 [<ffffffff81676aa9>] system_call_fastpath+0x16/0x1b code: c8 4c 89 df e8 41 96 f8 ff 44 89 e8 49 01 c4 44 29 6d d4 0 rip [<ffffffff81254fa1>] set_flexbg_block_bitmap+0x171/0x180 This can be reproduced with the following command sequence: mke2fs -t ext4 -i 4096 /dev/vdd 1G mount -t ext4 /dev/vdd /vdd resize2fs /dev/vdd 8G To fix this, we need to make sure the right thing happens when a block group's inode table straddles two block groups, which means the following bugs had to be fixed: 1) Not clearing the BLOCK_UNINIT flag in the second block group in ext4_alloc_group_tables --- the was proximate cause of the BUG_ON. 2) Incorrectly determining how many block groups contained contiguous free blocks in ext4_alloc_group_tables(). 3) Incorrectly setting the start of the next block range to be marked in use after a discontinuity in setup_new_flex_group_blocks(). Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: [email protected]