aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-07-26mm: reorganize SLAB freelist randomizationThomas Garnier4-61/+82
The kernel heap allocators are using a sequential freelist making their allocation predictable. This predictability makes kernel heap overflow easier to exploit. An attacker can careful prepare the kernel heap to control the following chunk overflowed. For example these attacks exploit the predictability of the heap: - Linux Kernel CAN SLUB overflow (https://goo.gl/oMNWkU) - Exploiting Linux Kernel Heap corruptions (http://goo.gl/EXLn95) ***Problems that needed solving: - Randomize the Freelist (singled linked) used in the SLUB allocator. - Ensure good performance to encourage usage. - Get best entropy in early boot stage. ***Parts: - 01/02 Reorganize the SLAB Freelist randomization to share elements with the SLUB implementation. - 02/02 The SLUB Freelist randomization implementation. Similar approach than the SLAB but tailored to the singled freelist used in SLUB. ***Performance data: slab_test impact is between 3% to 4% on average for 100000 attempts without smp. It is a very focused testing, kernbench show the overall impact on the system is way lower. Before: Single thread testing ===================== 1. Kmalloc: Repeatedly allocate then free test 100000 times kmalloc(8) -> 49 cycles kfree -> 77 cycles 100000 times kmalloc(16) -> 51 cycles kfree -> 79 cycles 100000 times kmalloc(32) -> 53 cycles kfree -> 83 cycles 100000 times kmalloc(64) -> 62 cycles kfree -> 90 cycles 100000 times kmalloc(128) -> 81 cycles kfree -> 97 cycles 100000 times kmalloc(256) -> 98 cycles kfree -> 121 cycles 100000 times kmalloc(512) -> 95 cycles kfree -> 122 cycles 100000 times kmalloc(1024) -> 96 cycles kfree -> 126 cycles 100000 times kmalloc(2048) -> 115 cycles kfree -> 140 cycles 100000 times kmalloc(4096) -> 149 cycles kfree -> 171 cycles 2. Kmalloc: alloc/free test 100000 times kmalloc(8)/kfree -> 70 cycles 100000 times kmalloc(16)/kfree -> 70 cycles 100000 times kmalloc(32)/kfree -> 70 cycles 100000 times kmalloc(64)/kfree -> 70 cycles 100000 times kmalloc(128)/kfree -> 70 cycles 100000 times kmalloc(256)/kfree -> 69 cycles 100000 times kmalloc(512)/kfree -> 70 cycles 100000 times kmalloc(1024)/kfree -> 73 cycles 100000 times kmalloc(2048)/kfree -> 72 cycles 100000 times kmalloc(4096)/kfree -> 71 cycles After: Single thread testing ===================== 1. Kmalloc: Repeatedly allocate then free test 100000 times kmalloc(8) -> 57 cycles kfree -> 78 cycles 100000 times kmalloc(16) -> 61 cycles kfree -> 81 cycles 100000 times kmalloc(32) -> 76 cycles kfree -> 93 cycles 100000 times kmalloc(64) -> 83 cycles kfree -> 94 cycles 100000 times kmalloc(128) -> 106 cycles kfree -> 107 cycles 100000 times kmalloc(256) -> 118 cycles kfree -> 117 cycles 100000 times kmalloc(512) -> 114 cycles kfree -> 116 cycles 100000 times kmalloc(1024) -> 115 cycles kfree -> 118 cycles 100000 times kmalloc(2048) -> 147 cycles kfree -> 131 cycles 100000 times kmalloc(4096) -> 214 cycles kfree -> 161 cycles 2. Kmalloc: alloc/free test 100000 times kmalloc(8)/kfree -> 66 cycles 100000 times kmalloc(16)/kfree -> 66 cycles 100000 times kmalloc(32)/kfree -> 66 cycles 100000 times kmalloc(64)/kfree -> 66 cycles 100000 times kmalloc(128)/kfree -> 65 cycles 100000 times kmalloc(256)/kfree -> 67 cycles 100000 times kmalloc(512)/kfree -> 67 cycles 100000 times kmalloc(1024)/kfree -> 64 cycles 100000 times kmalloc(2048)/kfree -> 67 cycles 100000 times kmalloc(4096)/kfree -> 67 cycles Kernbench, before: Average Optimal load -j 12 Run (std deviation): Elapsed Time 101.873 (1.16069) User Time 1045.22 (1.60447) System Time 88.969 (0.559195) Percent CPU 1112.9 (13.8279) Context Switches 189140 (2282.15) Sleeps 99008.6 (768.091) After: Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.47 (0.562732) User Time 1045.3 (1.34263) System Time 88.311 (0.342554) Percent CPU 1105.8 (6.49444) Context Switches 189081 (2355.78) Sleeps 99231.5 (800.358) This patch (of 2): This commit reorganizes the previous SLAB freelist randomization to prepare for the SLUB implementation. It moves functions that will be shared to slab_common. The entropy functions are changed to align with the SLUB implementation, now using get_random_(int|long) functions. These functions were chosen because they provide a bit more entropy early on boot and better performance when specific arch instructions are not available. [[email protected]: fix build] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Garnier <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26fs/fs-writeback.c: inode writeback list tracking tracepointsBrian Foster2-6/+25
The per-sb inode writeback list tracks inodes currently under writeback to facilitate efficient sync processing. In particular, it ensures that sync only needs to walk through a list of inodes that were cleaned by the sync. Add a couple tracepoints to help identify when inodes are added/removed to and from the writeback lists. Piggyback off of the writeback lazytime tracepoint template as it already tracks the relevant inode information. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Dave Chinner <[email protected]> cc: Josef Bacik <[email protected]> Cc: Holger Hoffstätte <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26fs/fs-writeback.c: add a new writeback list for syncDave Chinner6-25/+110
wait_sb_inodes() currently does a walk of all inodes in the filesystem to find dirty one to wait on during sync. This is highly inefficient and wastes a lot of CPU when there are lots of clean cached inodes that we don't need to wait on. To avoid this "all inode" walk, we need to track inodes that are currently under writeback that we need to wait for. We do this by adding inodes to a writeback list on the sb when the mapping is first tagged as having pages under writeback. wait_sb_inodes() can then walk this list of "inodes under IO" and wait specifically just for the inodes that the current sync(2) needs to wait for. Define a couple helpers to add/remove an inode from the writeback list and call them when the overall mapping is tagged for or cleared from writeback. Update wait_sb_inodes() to walk only the inodes under writeback due to the sync. With this change, filesystem sync times are significantly reduced for fs' with largely populated inode caches and otherwise no other work to do. For example, on a 16xcpu 2GHz x86-64 server, 10TB XFS filesystem with a ~10m entry inode cache, sync times are reduced from ~7.3s to less than 0.1s when the filesystem is fully clean. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Jan Kara <[email protected]> Tested-by: Holger Hoffstätte <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26ocfs2/cluster: clean up unnecessary assignment for 'ret'piaojun1-6/+2
Clean up unnecessary assignment for 'ret'. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jun Piao <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26ocfs2: remove obscure BUG_ON in dlmglueJoseph Qi1-9/+0
These BUG_ON(!inode) are obscure because we have already used inode to get osb. And actually we can guarantee here inode is valid in the context. So we can safely remove them. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joseph Qi <[email protected]> Reviewed-by: Eric Ren <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26ocfs2: cleanup implemented prototypesJoseph Qi2-8/+0
Several prototypes in inode.h are just defined but not actually implemented and used, so remove them. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26ocfs2/dlm: fix memory leak of dlm_debug_ctxtJoseph Qi2-25/+2
dlm_debug_ctxt->debug_refcnt is initialized to 1 and then increased to 2 by dlm_debug_get in dlm_debug_init. But dlm_debug_put is called only once in dlm_debug_shutdown during unregister dlm, which leads to dlm_debug_ctxt leaked. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joseph Qi <[email protected]> Reviewed-by: Jiufei Xue <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26ocfs2: cleanup unneeded goto in ocfs2_create_new_inode_locksJoseph Qi1-3/+1
The last goto is unneeded, so remove it. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joseph Qi <[email protected]> Reviewed-by: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26ocfs2: improve recovery performanceJunxiao Bi1-20/+21
Journal replay will be run when performing recovery for a dead node. To avoid the stale cache impact, all blocks of dead node's journal inode were reloaded from disk. This hurts the performance. Check whether one block is cached before reloading it can improve performance a lot. In my test env, the time doing recovery was improved from 120s to 1s. [[email protected]: clean up the for loop p_blkno handling] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Junxiao Bi <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: "Gang He" <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26ocfs2: fix a redundant re-initializationEric Ren1-2/+0
Obviously, memset() has zeroed the whole struct locking_max_version. So, it's no need to zero its two fields individually. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Eric Ren <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Gang He <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26debugobjects.h: fix trivial kernel doc warningRandy Dunlap1-1/+1
Add ':' to fix trivial kernel-doc warning in <linux/debugobjects.h>: ..//include/linux/debugobjects.h:63: warning: No description found for parameter 'is_static_object' Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26m32r: add __ucmpdi2 to fix build failureSudip Mukherjee4-2/+45
We are having build failure with m32r and the error message being: ERROR: "__ucmpdi2" [lib/842/842_decompress.ko] undefined! ERROR: "__ucmpdi2" [fs/btrfs/btrfs.ko] undefined! ERROR: "__ucmpdi2" [drivers/scsi/sd_mod.ko] undefined! ERROR: "__ucmpdi2" [drivers/media/i2c/adv7842.ko] undefined! ERROR: "__ucmpdi2" [drivers/md/bcache/bcache.ko] undefined! ERROR: "__ucmpdi2" [drivers/iio/imu/inv_mpu6050/inv-mpu6050.ko] undefined! __ucmpdi2 is introduced to m32r architecture taking example from other architectures like h8300, microblaze, mips. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Sudip Mukherjee <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26scripts/bloat-o-meter: fix percent on <1% changesRiku Voipio1-2/+2
Python divisions are integer divisions unless at least one parameter is a float. The current bloat-o-meter fails to print sub-percentage changes: Total: Before=10515408, After=10604060, chg 0.000000% Force float division by using one float and pretty the print to two significant decimals: Total: Before=10515408, After=10604060, chg +0.84% Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Riku Voipio <[email protected]> Reviewed-by: Josh Triplett <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Michal Marek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26kbuild: abort build on bad stack protector flagKees Cook2-35/+42
Before, the stack protector flag was sanity checked before .config had been reprocessed. This meant the build couldn't be aborted early, and only a warning could be emitted followed later by the compiler blowing up with an unknown flag. This has caused a lot of confusion over time, so this splits the flag selection from sanity checking and performs the sanity checking after the make has been restarted from a reprocessed .config, so builds can be aborted as early as possible now. Additionally moves the x86-specific sanity check to the same location, since it suffered from the same warn-then-wait-for-compiler-failure problem. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Cc: Michal Marek <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26fbmon: remove unused function argumentArnd Bergmann1-1/+0
When building with "make W=1", we get a warning about an empty stub function that does nothing but reassign its one of its arguments: drivers/video/fbdev/core/fbmon.c: In function 'fb_edid_to_monspecs': drivers/video/fbdev/core/fbmon.c:1497:67: error: parameter 'specs' set but not used [-Werror=unused-but-set-parameter] We can simply make that function completely empty to avoid the warning. This prevents a warning which everyone will see after "CFLAGS: add -Wunused-but-set-parameter" is merged. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]> Cc: Jean-Christophe Plagniol-Villard <[email protected]> Cc: Tomi Valkeinen <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26dma-debug: track bucket lock state for static checkersStephen Boyd1-0/+2
get_hash_bucket() and put_hash_bucket() acquire and release the same spinlock, but this confuses static checkers such as sparse lib/dma-debug.c:254:27: warning: context imbalance in 'get_hash_bucket' - wrong count at exit lib/dma-debug.c:268:13: warning: context imbalance in 'put_hash_bucket' - unexpected unlock Add the appropriate acquire and release statements so that checkers can properly track the lock state. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Stephen Boyd <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26dax: remote unused fault wrappersRoss Zwisler5-71/+21
Remove the unused wrappers dax_fault() and dax_pmd_fault(). After this removal, rename __dax_fault() and __dax_pmd_fault() to dax_fault() and dax_pmd_fault() respectively, and update all callers. The dax_fault() and dax_pmd_fault() wrappers were initially intended to capture some filesystem independent functionality around page faults (calling sb_start_pagefault() & sb_end_pagefault(), updating file mtime and ctime). However, the following commits: 5726b27b09cc ("ext2: Add locking for DAX faults") ea3d7209ca01 ("ext4: fix races between page faults and hole punching") added locking to the ext2 and ext4 filesystems after these common operations but before __dax_fault() and __dax_pmd_fault() were called. This means that these wrappers are no longer used, and are unlikely to be used in the future. XFS has had locking analogous to what was recently added to ext2 and ext4 since DAX support was initially introduced by: 6b698edeeef0 ("xfs: add DAX file operations support") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ross Zwisler <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Andreas Dilger <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Chinner <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26dax: some small updates to dax.txt documentationRoss Zwisler1-2/+4
These are originally from Matthew Wilcox and were part of his huge "mm,fs,dax: Change ->pmd_fault to ->huge_fault" patch that was part of PUD support. I'm breaking these small changes out as they stand on their own and add useful information to Documentation/filesystems/dax.txt. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ross Zwisler <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Andreas Dilger <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Jan Kara <[email protected]> Cc: Jonathan Corbet <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26arm: get rid of superfluous __GFP_REPEATMichal Hocko2-2/+2
__GFP_REPEAT has a rather weak semantic but since it has been introduced around 2.6.12 it has been ignored for low order allocations. PGALLOC_GFP uses __GFP_REPEAT but none of the allocation which uses this flag is for more than order-2. This means that this flag has never been actually useful here because it has always been used only for PAGE_ALLOC_COSTLY requests. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Hocko <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-07-26Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds129-1279/+11613
Pull block driver updates from Jens Axboe: "This branch also contains core changes. I've come to the conclusion that from 4.9 and forward, I'll be doing just a single branch. We often have dependencies between core and drivers, and it's hard to always split them up appropriately without pulling core into drivers when that happens. That said, this contains: - separate secure erase type for the core block layer, from Christoph. - set of discard fixes, from Christoph. - bio shrinking fixes from Christoph, as a followup up to the op/flags change in the core branch. - map and append request fixes from Christoph. - NVMeF (NVMe over Fabrics) code from Christoph. This is pretty exciting! - nvme-loop fixes from Arnd. - removal of ->driverfs_dev from Dan, after providing a device_add_disk() helper. - bcache fixes from Bhaktipriya and Yijing. - cdrom subchannel read fix from Vchannaiah. - set of lightnvm updates from Wenwei, Matias, Johannes, and Javier. - set of drbd updates and fixes from Fabian, Lars, and Philipp. - mg_disk error path fix from Bart. - user notification for failed device add for loop, from Minfei. - NVMe in general: + NVMe delay quirk from Guilherme. + SR-IOV support and command retry limits from Keith. + fix for memory-less NUMA node from Masayoshi. + use UINT_MAX for discard sectors, from Minfei. + cancel IO fixes from Ming. + don't allocate unused major, from Neil. + error code fixup from Dan. + use constants for PSDT/FUSE from James. + variable init fix from Jay. + fabrics fixes from Ming, Sagi, and Wei. + various fixes" * 'for-4.8/drivers' of git://git.kernel.dk/linux-block: (115 commits) nvme/pci: Provide SR-IOV support nvme: initialize variable before logical OR'ing it block: unexport various bio mapping helpers scsi/osd: open code blk_make_request target: stop using blk_make_request block: simplify and export blk_rq_append_bio block: ensure bios return from blk_get_request are properly initialized virtio_blk: use blk_rq_map_kern memstick: don't allow REQ_TYPE_BLOCK_PC requests block: shrink bio size again block: simplify and cleanup bvec pool handling block: get rid of bio_rw and READA block: don't ignore -EOPNOTSUPP blkdev_issue_write_same block: introduce BLKDEV_DISCARD_ZERO to fix zeroout NVMe: don't allocate unused nvme_major nvme: avoid crashes when node 0 is memoryless node. nvme: Limit command retries loop: Make user notify for adding loop device failed nvme-loop: fix nvme-loop Kconfig dependencies nvmet: fix return value check in nvmet_subsys_alloc() ...
2016-07-26ide: missing break statement in set_timings_mdma()Dan Carpenter1-0/+1
There was clearly supposed to be a break statement here. Currently we use the k2 ata timings instead of sh ata ones we intended. Probably no one has this hardware anymore so it likely doesn't make a difference beyond the static checker warning. Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26ide: hpt366: fix incorrect mask when checking at cmd_high_timeColin Ian King1-1/+1
According to the HPT366 data sheet, PCI config space dword 0x40-0x43 bits 11:8 specify the primary drive cmd_high_time, however, currently just 3 bits of the 4 are being used because the mask is 0x07 and not 0x0f. Fix the mask, allowing for the 40MHz clock to be detected. Also add in missing space between switch and parenthesis to clean up a checkpatch warning. Signed-off-by: Colin Ian King <[email protected]> Acked-by: Sergei Shtylyov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26ide-tape: fix misprint in failure handling in idetape_init()Alexey Khoroshilov1-3/+3
If driver_register() failed there is no sense to call driver_unregister(). unregister_chrdev() should be called here. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26cmd640: add __init attributeJulia Lawall1-1/+1
Add __init attribute on a function that is only called from other __init functions and that is not inlined, at least with gcc version 4.8.4 on an x86 machine with allyesconfig. Currently, the function is put in the .text.unlikely segment. Declaring it as __init will cause it to be put in the .init.text and to disappear after initialization. The result of objdump -x on the function before the change is as follows: 0000000000000000 l F .text.unlikely 00000000000000e4 cmd640x_init_one And after the change it is as follows: 00000000000000d2 l F .init.text 00000000000000df cmd640x_init_one Done with the help of Coccinelle. The semantic patch checks for local static non-init functions that are called from an __init function and are not called from any other function. Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26l2tp: Correctly return -EBADF from pppol2tp_getname.[email protected]1-5/+2
If 'tunnel' is NULL we should return -EBADF but the 'end_put_sess' path unconditionally sets 'error' back to zero. Rework the error path so it more closely matches pppol2tp_sendmsg. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Phil Turnbull <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26net/mlx5_core/health: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar1-5/+2
The workqueue health->wq was used as per device private health thread. This was done to perform delayed work. The workqueue has a single workitem(&health->work) and hence doesn't require ordering. It is involved in handling the health of the device and is not being used on a memory reclaim path. Hence, the singlethreaded workqueue has been replaced with the use of system_wq. Work item has been flushed in mlx5_health_cleanup() to ensure that there are no pending tasks while disconnecting the driver. Signed-off-by: Bhaktipriya Shridhar <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26net: ipmr/ip6mr: update lastuse on entry changeNikolay Aleksandrov2-2/+2
Currently lastuse is updated on entry creation and cache hit, but it should also be updated on entry change. Since both on add and update the ttl array is updated we can simply update the lastuse in ipmr_update_thresholds. Signed-off-by: Nikolay Aleksandrov <[email protected]> CC: Roopa Prabhu <[email protected]> CC: Donald Sharp <[email protected]> CC: David S. Miller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-27gcc-plugins: disable under COMPILE_TESTKees Cook2-2/+3
Since adding the gcc plugin development headers is required for the gcc plugin support, we should ease into this new kernel build dependency more slowly. For now, disable the gcc plugins under COMPILE_TEST so that all*config builds will skip it. Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Michal Marek <[email protected]>
2016-07-26Merge branch 'for-4.8/core' of git://git.kernel.dk/linux-blockLinus Torvalds199-1523/+1918
Pull core block updates from Jens Axboe: - the big change is the cleanup from Mike Christie, cleaning up our uses of command types and modified flags. This is what will throw some merge conflicts - regression fix for the above for btrfs, from Vincent - following up to the above, better packing of struct request from Christoph - a 2038 fix for blktrace from Arnd - a few trivial/spelling fixes from Bart Van Assche - a front merge check fix from Damien, which could cause issues on SMR drives - Atari partition fix from Gabriel - convert cfq to highres timers, since jiffies isn't granular enough for some devices these days. From Jan and Jeff - CFQ priority boost fix idle classes, from me - cleanup series from Ming, improving our bio/bvec iteration - a direct issue fix for blk-mq from Omar - fix for plug merging not involving the IO scheduler, like we do for other types of merges. From Tahsin - expose DAX type internally and through sysfs. From Toshi and Yigal * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits) block: Fix front merge check block: do not merge requests without consulting with io scheduler block: Fix spelling in a source code comment block: expose QUEUE_FLAG_DAX in sysfs block: add QUEUE_FLAG_DAX for devices to advertise their DAX support Btrfs: fix comparison in __btrfs_map_block() block: atari: Return early for unsupported sector size Doc: block: Fix a typo in queue-sysfs.txt cfq-iosched: Charge at least 1 jiffie instead of 1 ns cfq-iosched: Fix regression in bonnie++ rewrite performance cfq-iosched: Convert slice_resid from u64 to s64 block: Convert fifo_time from ulong to u64 blktrace: avoid using timespec block/blk-cgroup.c: Declare local symbols static block/bio-integrity.c: Add #include "blk.h" block/partition-generic.c: Remove a set-but-not-used variable block: bio: kill BIO_MAX_SIZE cfq-iosched: temporarily boost queue priority for idle classes block: drbd: avoid to use BIO_MAX_SIZE block: bio: remove BIO_MAX_SECTORS ...
2016-07-26kbuild: Abort build on bad stack protector flagKees Cook2-35/+40
Before, the stack protector flag was sanity checked before .config had been reprocessed. This meant the build couldn't be aborted early, and only a warning could be emitted followed later by the compiler blowing up with an unknown flag. This has caused a lot of confusion over time, so this splits the flag selection from sanity checking and performs the sanity checking after the make has been restarted from a reprocessed .config, so builds can be aborted as early as possible now. Additionally moves the x86-specific sanity check to the same location, since it suffered from the same warn-then-wait-for-compiler-failure problem. Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Michal Marek <[email protected]>
2016-07-26Merge branch 'for-4.8' of ↵Linus Torvalds24-153/+314
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata updates from Tejun Heo: "libata saw quite a bit of activities in this cycle: - SMR drive support still being worked on - bug fixes and improvements to misc SCSI command emulation - some low level driver updates" * 'for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (39 commits) libata-scsi: better style in ata_msense_*() AHCI: Clear GHC.IS to prevent unexpectly asserting INTx ata: sata_dwc_460ex: remove redundant dev_err call ata: define ATA_PROT_* in terms of ATA_PROT_FLAG_* libata: remove ATA_PROT_FLAG_DATA libata: remove ata_is_nodata ata: make lba_{28,48}_ok() use ATA_MAX_SECTORS{,_LBA48} libata-scsi: minor cleanup for ata_scsi_zbc_out_xlat libata-scsi: Fix ZBC management out command translation libata-scsi: Fix translation of REPORT ZONES command ata: Handle ATA NCQ NO-DATA commands correctly libata-eh: decode all taskfile protocols ata: fixup ATA_PROT_NODATA libsas: use ata_is_ncq() and ata_has_dma() accessors libata: use ata_is_ncq() accessors libata: return boolean values from ata_is_* libata-scsi: avoid repeated calculation of number of TRIM ranges libata-scsi: reject WRITE SAME (16) with n_block that exceeds limit libata-scsi: rename ata_msense_ctl_mode() to ata_msense_control() libata-scsi: fix D_SENSE bit relection in control mode page ...
2016-07-26Merge branch 'for-4.8' of ↵Linus Torvalds2-10/+37
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Nothing too exciting. - updates to the pids controller so that pid limit breaches can be noticed and monitored from userland. - cleanups and non-critical bug fixes" * 'for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: remove duplicated include from cgroup.c cgroup: Use lld instead of ld when printing pids controller events_limit cgroup: Add pids controller event when fork fails because of pid limit cgroup: allow NULL return from ss->css_alloc() cgroup: remove unnecessary 0 check from css_from_id() cgroup: fix idr leak for the first cgroup root
2016-07-26macsec: ensure rx_sa is set when validation is disabledBeniamino Galvani1-1/+2
macsec_decrypt() is not called when validation is disabled and so macsec_skb_cb(skb)->rx_sa is not set; but it is used later in macsec_post_decrypt(), ensure that it's always initialized. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Beniamino Galvani <[email protected]> Acked-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26Merge branch 'tipc-netlink-monitor-updates'David S. Miller11-12/+445
Parthasarathy Bhuvaragan says: ==================== tipc: netlink updates for neighbour monitor This series contains the updates to configure and read the attributes for neighbour monitor. v2: rebase on top of net-next ==================== Signed-off-by: David S. Miller <[email protected]>
2016-07-26tipc: dump monitor attributesParthasarathy Bhuvaragan6-0/+260
In this commit, we dump the monitor attributes when queried. The link monitor attributes are separated into two kinds: 1. general attributes per bearer 2. specific attributes per node/peer This style resembles the socket attributes and the nametable publications per socket. Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: Parthasarathy Bhuvaragan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26tipc: add a function to get the bearer nameParthasarathy Bhuvaragan2-0/+22
Introduce a new function to get the bearer name from its id. This is used in subsequent commit. Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: Parthasarathy Bhuvaragan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26tipc: get monitor threshold for the clusterParthasarathy Bhuvaragan6-0/+68
In this commit, we add support to fetch the configured cluster monitoring threshold. Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: Parthasarathy Bhuvaragan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26tipc: make cluster size threshold for monitoring configurableParthasarathy Bhuvaragan7-2/+66
In this commit, we introduce support to configure the minimum threshold to activate the new link monitoring algorithm. Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: Parthasarathy Bhuvaragan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26tipc: introduce constants for tipc address validationParthasarathy Bhuvaragan3-10/+29
In this commit, we introduce defines for tipc address size, offset and mask specification for Zone.Cluster.Node. There is no functional change in this commit. Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: Parthasarathy Bhuvaragan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in ↵He Chunhui1-6/+1
neigh_update() NUD_STALE is used when the caller(e.g. arp_process()) can't guarantee neighbour reachability. If the entry was NUD_VALID and lladdr is unchanged, the entry state should not be changed. Currently the code puts an extra "NUD_CONNECTED" condition. So if old state was NUD_DELAY or NUD_PROBE (they are NUD_VALID but not NUD_CONNECTED), the state can be changed to NUD_STALE. This may cause problem. Because NUD_STALE lladdr doesn't guarantee reachability, when we send traffic, the state will be changed to NUD_DELAY. In normal case, if we get no confirmation (by dst_confirm()), we will change the state to NUD_PROBE and send probe traffic. But now the state may be reset to NUD_STALE again(e.g. by broadcast ARP packets), so the probe traffic will not be sent. This situation may happen again and again, and packets will be sent to an non-reachable lladdr forever. The fix is to remove the "NUD_CONNECTED" condition. After that the "NEIGH_UPDATE_F_WEAK_OVERRIDE" condition (used by IPv6) in that branch will be redundant, so remove it. This change may increase probe traffic, but it's essential since NUD_STALE lladdr is unreliable. To ensure correctness, we prefer to resolve lladdr, when we can't get confirmation, even while remote packets try to set NUD_STALE state. Signed-off-by: Chunhui He <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-26arm64: dts: marvell: Add Aardvark PCIe support for Armada 3700Thomas Petazzoni2-0/+30
Add the SoC-level description of the PCIe controller found on the Marvell Armada 3700 and enable this PCIe controller on the development board for this SoC. Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2016-07-26Fix the Debian packaging script on systems with no codenameMarcin Mielniczuk1-1/+2
When calling `make deb-pkg` on a system with no codename (for example Arch Linux), lsb_release sometimes outputs `n/a` as the codename. This breaks dpkg-parsechangelog, which can't process the changelog correctly. Signed-off-by: Marcin Mielniczuk <[email protected]> Signed-off-by: Michal Marek <[email protected]>
2016-07-26PCI: aardvark: Add Aardvark PCI host controller driverThomas Petazzoni4-0/+1029
Add a driver for the Aardvark PCIe controller used on the Marvell Armada 3700 ARM64 SoC. Based on work done by Hezi Shahmoon <[email protected]> and Marcin Wojtas <[email protected]>. Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2016-07-26dt-bindings: add DT binding for the Aardvark PCIe controllerThomas Petazzoni1-0/+56
Add the documentation for the Device Tree binding for the Aardvark PCIe controller, found on Marvell Armada 3700 ARM64 SoCs. Signed-off-by: Thomas Petazzoni <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2016-07-26Merge branch 'linus' of ↵Linus Torvalds184-4261/+19350
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "Here is the crypto update for 4.8: API: - first part of skcipher low-level conversions - add KPP (Key-agreement Protocol Primitives) interface. Algorithms: - fix IPsec/cryptd reordering issues that affects aesni - RSA no longer does explicit leading zero removal - add SHA3 - add DH - add ECDH - improve DRBG performance by not doing CTR by hand Drivers: - add x86 AVX2 multibuffer SHA256/512 - add POWER8 optimised crc32c - add xts support to vmx - add DH support to qat - add RSA support to caam - add Layerscape support to caam - add SEC1 AEAD support to talitos - improve performance by chaining requests in marvell/cesa - add support for Araneus Alea I USB RNG - add support for Broadcom BCM5301 RNG - add support for Amlogic Meson RNG - add support Broadcom NSP SoC RNG" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits) crypto: vmx - Fix aes_p8_xts_decrypt build failure crypto: vmx - Ignore generated files crypto: vmx - Adding support for XTS crypto: vmx - Adding asm subroutines for XTS crypto: skcipher - add comment for skcipher_alg->base crypto: testmgr - Print akcipher algorithm name crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op crypto: nx - off by one bug in nx_of_update_msc() crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct crypto: scatterwalk - Inline start/map/done crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone crypto: scatterwalk - Fix test in scatterwalk_done crypto: api - Optimise away crypto_yield when hard preemption is on crypto: scatterwalk - add no-copy support to copychunks crypto: scatterwalk - Remove scatterwalk_bytes_sglen crypto: omap - Stop using crypto scatterwalk_bytes_sglen crypto: skcipher - Remove top-level givcipher interface crypto: user - Remove crypto_lookup_skcipher call crypto: cts - Convert to skcipher ...
2016-07-26builddeb: fix file permissions before packagingHenning Schild1-0/+2
Builddep is not very explicit about file permissions. Actually the file permissions in the package are largely influenced by the umask of the user cloning the git and building the package. If that umask does not set go+r the resulting linux-headers package will prevent non-root users from building out-of-tree modules. And that is probably just one unexpected effect. Being a packaging/install tool builddep should make sure the file permissions are set correctly and not just derived from a value that is never checked. This patch sets ugo read permissions for all packaged files and derives the executable bit for directories and executables from the file-owner. Signed-off-by: Henning Schild <[email protected]> Signed-off-by: Michal Marek <[email protected]>
2016-07-26Merge tag 'docs-for-linus' of git://git.lwn.net/linuxLinus Torvalds30-430/+2653
Pull documentation updates from Jonathan Corbet: "Some big changes this month, headlined by the addition of a new formatted documentation mechanism based on the Sphinx system. The objectives here are to make it easier to create better-integrated (and more attractive) documents while (eventually) dumping our one-of-a-kind, cobbled-together system for something that is widely used and maintained by others. There's a fair amount of information what's being done, why, and how to use it in: https://lwn.net/Articles/692704/ https://lwn.net/Articles/692705/ Closer to home, Documentation/kernel-documentation.rst describes how it works. For now, the new system exists alongside the old one; you should soon see the GPU documentation converted over in the DRM pull and some significant media conversion work as well. Once all the docs have been moved over and we're convinced that the rough edges (of which are are a few) have been smoothed over, the DocBook-based stuff should go away. Primary credit is to Jani Nikula for doing the heavy lifting to make this stuff actually work; there has also been notable effort from Markus Heiser, Daniel Vetter, and Mauro Carvalho Chehab. Expect a couple of conflicts on the new index.rst file over the course of the merge window; they are trivially resolvable. That file may be a bit of a conflict magnet in the short term, but I don't expect that situation to last for any real length of time. Beyond that, of course, we have the usual collection of tweaks, updates, and typo fixes" * tag 'docs-for-linus' of git://git.lwn.net/linux: (77 commits) doc-rst: kernel-doc: fix handling of address_space tags Revert "doc/sphinx: Enable keep_warnings" doc-rst: kernel-doc directive, fix state machine reporter docs: deprecate kernel-doc-nano-HOWTO.txt doc/sphinx: Enable keep_warnings Documentation: add watermark_scale_factor to the list of vm systcl file kernel-doc: Fix up warning output docs: Get rid of some kernel-documentation warnings doc-rst: add an option to ignore DocBooks when generating docs workqueue: Fix a typo in workqueue.txt Doc: ocfs: Fix typo in filesystems/ocfs2-online-filecheck.txt Documentation/sphinx: skip build if user requested specific DOCBOOKS Documentation: add cleanmediadocs to the documentation targets Add .pyc files to .gitignore Doc: PM: Fix a typo in intel_powerclamp.txt doc-rst: flat-table directive - initial implementation Documentation: add meta-documentation for Sphinx and kernel-doc Documentation: tiny typo fix in usb/gadget_multi.txt Documentation: fix wrong value in md.txt bcache: documentation formatting, edited for clarity, stripe alignment notes ...
2016-07-26xtensa: Partially Revert "xtensa: Remove unnecessary of_platform_populate ↵Rob Herring2-2/+9
with default match table" This partially reverts commit 69d99e6c0d62 keeping only the main purpose of the original commit which is the removal of of_platform_populate() call. The moving of of_clk_init() caused changes in the initialization order breaking booting. Fixes: 69d99e6c0d621f ("xtensa: Remove unnecessary of_platform_populate with default match table") Cc: Kefeng Wang <[email protected]> Cc: Guenter Roeck <[email protected]> Tested-by: Max Filippov <[email protected]> Signed-off-by: Rob Herring <[email protected]>
2016-07-26PCI: tegra: Program PADS_REFCLK_CFG* registers with per-SoC valuesStephen Warren1-13/+8
The value that should be programmed into the PADS_REFCLK register varies per SoC. Fix the Tegra PCIe driver to program the correct values. Future SoCs will require different values in cfg0/1, so the two values are stored separately in the per-SoC data structures. For reference, the values are all documented in NV bug 1771116 comment 20. The ASIC team has validated all these values, except for the Tegra20 value which is simply left unchanged in this patch. Signed-off-by: Stephen Warren <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Thierry Reding <[email protected]>
2016-07-26PCI: tegra: Program PADS_REFCLK_CFG* always, not just on legacy SoCsStephen Warren1-6/+9
tegra_pcie_phy_power_on() calls tegra_pcie_phy_enable() only for legacy SoCs. However, part of tegra_pcie_phy_enable() needs to happen in all cases. Move that code up one level into tegra_pcie_phy_power_on(). [bhelgaas: changelog] Signed-off-by: Stephen Warren <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Simon Glass <[email protected]>