aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-10-14ipv4: fix nexthop attlen check in fib_nh_matchJiri Pirko1-1/+1
fib_nh_match does not match nexthops correctly. Example: ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \ nexthop via 192.168.122.13 dev eth0 ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \ nexthop via 192.168.122.15 dev eth0 Del command is successful and route is removed. After this patch applied, the route is correctly matched and result is: RTNETLINK answers: No such process Please consider this for stable trees as well. Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config") Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14tcp: fix tcp_ack() performance problemEric Dumazet1-9/+27
We worked hard to improve tcp_ack() performance, by not accessing skb_shinfo() in fast path (cd7d8498c9a5 tcp: change tcp_skb_pcount() location) We still have one spurious access because of ACK timestamping, added in commit e1c8a607b281 ("net-timestamp: ACK timestamp for bytestreams") By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set, we can avoid two cache line misses for the common case. While we are at it, add two prefetchw() : One in tcp_ack() to bring skb at the head of write queue. One in tcp_clean_rtx_queue() loop to bring following skb, as we will delete skb from the write queue and dirty skb->next->prev. Add a couple of [un]likely() clauses. After this patch, tcp_ack() is no longer the most consuming function in tcp stack. Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Cc: Van Jacobson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14ceph: fix divide-by-zero in __validate_layout()Yan, Zheng1-1/+1
The 'stripe_unit' field is 64 bits, casting it to 32 bits can result zero. Signed-off-by: Yan, Zheng <[email protected]>
2014-10-14rbd: rbd workqueues need a resque workerIlya Dryomov1-1/+2
Need to use WQ_MEM_RECLAIM for our workqueues to prevent I/O lockups under memory pressure - we sit on the memory reclaim path. Cc: [email protected] # 3.17, needs backporting for 3.16 Signed-off-by: Ilya Dryomov <[email protected]> Tested-by: Micha Krause <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14libceph: ceph-msgr workqueue needs a resque workerIlya Dryomov1-1/+5
Commit f363e45fd118 ("net/ceph: make ceph_msgr_wq non-reentrant") effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq. This is wrong - libceph is very much a memory reclaim path, so restore it. Cc: [email protected] # needs backporting for < 3.12 Signed-off-by: Ilya Dryomov <[email protected]> Tested-by: Micha Krause <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14ceph: fix bool assignmentsFabian Frederick1-13/+13
Fix some coccinelle warnings: fs/ceph/caps.c:2400:6-10: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2401:6-15: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2402:6-17: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2403:6-22: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2404:6-22: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2405:6-19: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2440:4-20: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2469:3-16: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2490:2-18: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2519:3-7: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2549:3-12: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2575:2-6: WARNING: Assignment of bool to 0/1 fs/ceph/caps.c:2589:3-7: WARNING: Assignment of bool to 0/1 Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2014-10-14libceph: separate multiple ops with commas in debugfs outputIlya Dryomov1-1/+2
For requests with multiple ops, separate ops with commas instead of \t, which is a field separator here. Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14libceph: sync osd op definitions in rados.hIlya Dryomov3-228/+137
Bring in missing osd ops and strings, use macros to eliminate multiple points of maintenance. Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14libceph: remove redundant declarationFabian Frederick1-1/+0
ceph_release_page_vector was defined twice in libceph.h Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2014-10-14ceph: additional debugfs outputJohn Spray2-0/+47
MDS session state and client global ID is useful instrumentation when testing. Signed-off-by: John Spray <[email protected]>
2014-10-14ceph: export ceph_session_state_name functionJohn Spray2-7/+9
...so that it can be used from the ceph debugfs code when dumping session info. Signed-off-by: John Spray <[email protected]>
2014-10-14ceph: include the initial ACL in create/mkdir/mknod MDS requestsYan, Zheng4-47/+170
Current code set new file/directory's initial ACL in a non-atomic manner. Client first sends request to MDS to create new file/directory, then set the initial ACL after the new file/directory is successfully created. The fix is include the initial ACL in create/mkdir/mknod MDS requests. So MDS can handle creating file/directory and setting the initial ACL in one request. Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14ceph: use pagelist to present MDS request dataYan, Zheng3-39/+28
Current code uses page array to present MDS request data. Pages in the array are allocated/freed by caller of ceph_mdsc_do_request(). If request is interrupted, the pages can be freed while they are still being used by the request message. The fix is use pagelist to present MDS request data. Pagelist is reference counted. Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14libceph: reference counting pagelistYan, Zheng4-7/+10
this allow pagelist to present data that may be sent multiple times. Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14ceph: fix llistxattr on symlinkYan, Zheng1-2/+1
only regular file and directory have vxattrs. Signed-off-by: Yan, Zheng <[email protected]>
2014-10-14ceph: send client metadata to MDSJohn Spray1-1/+70
Implement version 2 of CEPH_MSG_CLIENT_SESSION syntax, which includes additional client metadata to allow the MDS to report on clients by user-sensible names like hostname. Signed-off-by: John Spray <[email protected]> Reviewed-by: Yan, Zheng <[email protected]>
2014-10-14Merge branch 'isdn'David S. Miller7-81/+265
Tilman Schmidt says: ==================== Coverity patches for drivers/isdn Here's a series of patches for the ISDN CAPI subsystem and the Gigaset ISDN driver. Patches 1 to 7 are specific fixes for Coverity warnings. Patches 8 to 11 fix related problems with the handling of invalid CAPI command codes I noticed while working on this. Patch 12 fixes an unrelated problem I noticed during the subsequent regression tests. It would be great if these could still be merged. ==================== Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/gigaset: fix usb_gigaset write_cmd result raceTilman Schmidt1-1/+3
In usb_gigaset function gigaset_write_cmd(), the length field of the command buffer structure could be cleared by the transmit tasklet before it was used for the function's return value. Fix by copying to a local variable before scheduling the tasklet. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/capi: don't return NULL from capi_cmd2str()Tilman Schmidt1-2/+7
capi_cmd2str() is used in many places to build log messages. None of them is prepared to handle NULL as a result. Change the function to return printable string "INVALID_COMMAND" instead. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/capi: handle CAPI 2.0 message parser failuresTilman Schmidt2-27/+145
Have callers of capi_cmsg2message and capi_message2cmsg handle non-zero return values indicating failure. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/capi: prevent NULL pointer dereference on invalid CAPI commandTilman Schmidt1-0/+7
An invalid CAPI 2.0 command/subcommand combination may retrieve a NULL pointer from the cpars[] array which will later be dereferenced by the parser routines. Fix by adding NULL pointer checks in strategic places. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/capi: refactor command/subcommand table accessesTilman Schmidt1-5/+18
Encapsulate accesses to the CAPI 2.0 command/subcommand name and parameter tables in a single place in preparation for redesign. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/capi: prevent index overrun from command_2_index()Tilman Schmidt1-0/+2
The result of the function command_2_index() is used to index two arrays mnames[] and cpars[] with max. index 0x4e but in its current form that function can produce results up to 3*(0x9+0x9)+0x7f = 0xb5. Fix by clamping all result values potentially overrunning the arrays to zero which is already handled as an invalid value. Re-spotted with Coverity. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/capi: correct capi20_manufacturer argument type mismatchTilman Schmidt2-3/+3
Function capi20_manufacturer() is declared with unsigned int cmd argument but called with unsigned long. Fix by correcting the function prototype since the actual argument is part of the user visible API. Spotted with Coverity. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/gigaset: fix non-heap pointer deallocationTilman Schmidt1-41/+70
at_state structures may be allocated individually or as part of a cardstate or bc_state structure. The disconnect() function handled both cases, creating a risk that it might try to deallocate an at_state structure that had not been allocated individually. Fix by splitting disconnect() into two variants handling cases with and without an associated B channel separately, and adding an explicit check. Spotted with Coverity. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/gigaset: fix NULL pointer dereferenceTilman Schmidt1-0/+5
In do_action, a NULL pointer might be passed to function start_dial which will dereference it. Fix by adding a check for NULL before the call. Spotted with Coverity. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/gigaset: limit raw CAPI message dump lengthTilman Schmidt1-0/+2
In dump_rawmsg, the length field from a received data package was used unscrutinized, allowing an attacker to control the size of the allocated buffer and the number of times the output loop iterates. Fix by limiting to a reasonable value. Spotted with Coverity. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/gigaset: make sure controller name is null terminatedTilman Schmidt1-2/+2
In gigaset_isdn_regdev, the name field may not have a null terminator if the source string's length is equal to the buffer size. Fix by zero filling the structure and excluding the last byte of the name field from the copy. Spotted with Coverity. Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14isdn/gigaset: missing break in do_facility_reqTilman Schmidt1-0/+1
If we take the unsupported supplementary service notification mask path, we end up falling through and overwriting the error code. Insert a break statement to skip the remainder of the switch case and proceed to sending the reply message. Spotted with Coverity. Reported-by: Dave Jones <[email protected]> Signed-off-by: Tilman Schmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14Merge branch 'fec-ptp'David S. Miller3-17/+267
Luwei Zhou says: ==================== Enable FEC pps feather Change from v2 to v3: -Using the default channel 0 to be PPS channel not PTP_PIN_SET/GETFUNC interface. -Using the linux definition of NSEC_PER_SEC. Change from v1 to v2: - Fix the potential 32-bit multiplication overflow issue. - Optimize the hareware adjustment code to improve efficiency as Richard suggested - Use ptp PTP_PIN_SET/GETFUNC interface to set PPS channel not device tree and add PTP_PF_PPS enumeration - Modify comments style ==================== Signed-off-by: David S. Miller <[email protected]>
2014-10-14net: fec: ptp: Enable PPS output based on ptp clockLuwei Zhou3-1/+205
FEC ptp timer has 4 channel compare/trigger function. It can be used to enable pps output. The pulse would be ouput high exactly on N second. The pulse ouput high on compare event mode is used to produce pulse per second. The pulse width would be one cycle based on ptp timer clock source.Since 31-bit ptp hardware timer is used, the timer will wrap more than 2 seconds. We need to reload the compare compare event about every 1 second. Signed-off-by: Luwei Zhou <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14net: fec: ptp: Use hardware algorithm to adjust PTP counter.Luwei Zhou2-12/+56
The FEC IP supports hardware adjustment for ptp timer. Refer to the description of ENET_ATCOR and ENET_ATINC registers in the spec about the hardware adjustment. This patch uses hardware support to adjust the ptp offset and frequency on the slave side. Signed-off-by: Luwei Zhou <[email protected]> Signed-off-by: Frank Li <[email protected]> Signed-off-by: Fugang Duan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14net: fec: ptp: Use the 31-bit ptp timer.Luwei Zhou1-4/+6
When ptp switches from software adjustment to hardware ajustment, linux ptp can't converge. It is caused by the IP limit. Hardware adjustment logcial have issue when ptp counter runs over 0x80000000(31 bit counter). The internal IP reference manual already remove 32bit free-running count support. This patch replace the 32-bit PTP timer with 31-bit. Signed-off-by: Luwei Zhou <[email protected]> Signed-off-by: Frank Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14ipv6: remove aca_lock spinlock from struct ifacaddr6Li RongQing2-2/+0
no user uses this lock. Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14x86: bpf_jit: fix two bugs in eBPF JIT compilerAlexei Starovoitov1-6/+19
1. JIT compiler using multi-pass approach to converge to final image size, since x86 instructions are variable length. It starts with large gaps between instructions (so some jumps may use imm32 instead of imm8) and iterates until total program size is the same as in previous pass. This algorithm works only if program size is strictly decreasing. Programs that use LD_ABS insn need additional code in prologue, but it was not emitted during 1st pass, so there was a chance that 2nd pass would adjust imm32->imm8 jump offsets to the same number of bytes as increase in prologue, which may cause algorithm to erroneously decide that size converged. Fix it by always emitting largest prologue in the first pass which is detected by oldproglen==0 check. Also change error check condition 'proglen != oldproglen' to fail gracefully. 2. while staring at the code realized that 64-byte buffer may not be enough when 1st insn is large, so increase it to 128 to avoid buffer overflow (theoretical maximum size of prologue+div is 109) and add runtime check. Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT") Reported-by: Darrick J. Wong <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Tested-by: Darrick J. Wong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14tcp: fix ooo_okay setting vs Small QueuesEric Dumazet1-2/+6
TCP Small Queues (tcp_tsq_handler()) can hold one reference on sk->sk_wmem_alloc, preventing skb->ooo_okay being set. We should relax test done to set skb->ooo_okay to take care of this extra reference. Minimal truesize of skb containing one byte of payload is SKB_TRUESIZE(1) Without this fix, we have more chance locking flows into the wrong transmit queue. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14skbuff: fix ftrace handling in skb_unshareAlexander Aring1-1/+6
If the skb is not dropped afterwards we should run consume_skb instead kfree_skb. Inside of function skb_unshare we do always a kfree_skb, doesn't depend if skb_copy failed or was successful. This patch switch this behaviour like skb_share_check, if allocation of sk_buff failed we use kfree_skb otherwise consume_skb. Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14fm10k: Add skb->xmit_more supportAlexander Duyck1-31/+34
This change adds support for skb->xmit_more based on the changes that were made to igb to support the feature. The main changes are moving up the check for maybe_stop_tx so that we can check netif_xmit_stopped to determine if we must write the tail because we can add no further buffers. Acked-by: Matthew Vick <[email protected]> Signed-off-by: Alexander Duyck <[email protected]> Acked-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-10-14ceph: remove redundant code for max file size verificationChao Yu2-15/+0
Both ceph_update_writeable_page and ceph_setattr will verify file size with max size ceph supported. There are two caller for ceph_update_writeable_page, ceph_write_begin and ceph_page_mkwrite. For ceph_write_begin, we have already verified the size in generic_write_checks of ceph_write_iter; for ceph_page_mkwrite, we have no chance to change file size when mmap. Likewise we have already verified the size in inode_change_ok when we call ceph_setattr. So let's remove the redundant code for max file size verification. Signed-off-by: Chao Yu <[email protected]> Reviewed-by: Yan, Zheng <[email protected]>
2014-10-14ceph: remove redundant io_iter_advance()Yan, Zheng1-1/+0
ceph_sync_read and generic_file_read_iter() have already advanced the IO iterator. Signed-off-by: Yan, Zheng <[email protected]>
2014-10-14ceph: move ceph_find_inode() outside the s_mutexYan, Zheng2-8/+10
ceph_find_inode() may wait on freeing inode, using it inside the s_mutex may cause deadlock. (the freeing inode is waiting for OSD read reply, but dispatch thread is blocked by the s_mutex) Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2014-10-14ceph: request xattrs if xattr_version is zeroYan, Zheng5-30/+20
Following sequence of events can happen. - Client releases an inode, queues cap release message. - A 'lookup' reply brings the same inode back, but the reply doesn't contain xattrs because MDS didn't receive the cap release message and thought client already has up-to-data xattrs. The fix is force sending a getattr request to MDS if xattrs_version is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so MDS knows client does not have xattr. Signed-off-by: Yan, Zheng <[email protected]>
2014-10-14rbd: set the remaining discard properties to enable supportJosh Durgin1-0/+2
max_discard_sectors must be set for the queue to support discard. Operations implementing discard for rbd zero data, so report that. Signed-off-by: Josh Durgin <[email protected]>
2014-10-14rbd: use helpers to handle discard for layered images correctlyJosh Durgin1-32/+22
Only allocate two osd ops for discard requests, since the preallocation hint is only added for regular writes. Use rbd_img_obj_request_fill() to recreate the original write or discard osd operations, isolating that logic to one place, and change the assert in rbd_osd_req_create_copyup() to accept discard requests as well. Signed-off-by: Josh Durgin <[email protected]>
2014-10-14rbd: extract a method for adding object operationsJosh Durgin1-55/+78
rbd_img_request_fill() creates a ceph_osd_request and has logic for adding the appropriate osd ops to it based on the request type and image properties. For layered images, the original rbd_obj_request is resent with a copyup operation in front, using a new ceph_osd_request. The logic for adding the original operations should be the same as when first sending them, so move it to a helper function. op_type only needs to be checked once, so create a helper for that as well and call it outside the loop in rbd_img_request_fill(). Signed-off-by: Josh Durgin <[email protected]>
2014-10-14rbd: make discard trigger copy-on-writeJosh Durgin1-1/+2
Discard requests are a form of write, so they should go through the same process as plain write requests and trigger copy-on-write for layered images. Signed-off-by: Josh Durgin <[email protected]>
2014-10-14rbd: tolerate -ENOENT for discard operationsJosh Durgin1-0/+3
Discard may try to delete an object from a non-layered image that does not exist. If this occurs, the image already has no data in that range, so change the result to success. Signed-off-by: Josh Durgin <[email protected]>
2014-10-14rbd: fix snapshot context reference count for discardsJosh Durgin1-1/+2
Discards take a reference to the snapshot context of an image when they are created. This reference needs to be cleaned up when the request is done just as it is for regular writes. Signed-off-by: Josh Durgin <[email protected]>
2014-10-14rbd: read image size for discard check safelyJosh Durgin1-6/+12
In rbd_img_request_fill() the image size is only checked to determine whether we can truncate an object instead of zeroing it for discard requests. Take rbd_dev->header_rwsem while reading the image size, and move this read into the discard check, so that non-discard ops don't need to take the semaphore in this function. Signed-off-by: Josh Durgin <[email protected]>
2014-10-14rbd: initial discard bits from Guangliang ZhaoGuangliang Zhao1-15/+89
This patch add the discard support for rbd driver. There are three types operation in the driver: 1. The objects would be removed if they completely contained within the discard range. 2. The objects would be truncated if they partly contained within the discard range, and align with their boundary. 3. Others would be zeroed. A discard request from blkdev_issue_discard() is defined which REQ_WRITE and REQ_DISCARD both marked and no data, so we must check the REQ_DISCARD first when getting the request type. This resolve: http://tracker.ceph.com/issues/190 [ Ilya Dryomov: This is incomplete and somewhat buggy, see follow up commits by Josh Durgin for refinements and fixes which weren't folded in to preserve authorship. ] Signed-off-by: Guangliang Zhao <[email protected]> Reviewed-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>