aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-03-16afs: ->writepage() shouldn't call clear_page_dirty_for_io()David Howells1-3/+3
The ->writepage() op shouldn't call clear_page_dirty_for_io() as that has already been called by the caller. Fix afs_writepage() by moving the call out of afs_write_back_from_locked_page() to afs_writepages_region() where it is needed. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Fix abort on signal while waiting for call completionDavid Howells1-13/+6
Fix the way in which a call that's in progress and being waited for is aborted in the case that EINTR is detected. We should be sending RX_USER_ABORT rather than RX_CALL_DEAD as the abort code. Note that since the only two ways out of the loop are if the call completes or if a signal happens, the kill-the-call clause after the loop has finished can only happen in the case of EINTR. This means that we only have one abort case to deal with, not two, and the "KWC" case can never happen and so can be deleted. Note further that simply aborting the call isn't necessarily the best thing here since at this point: the request has been entirely sent and it's likely the server will do the operation anyway - whether we abort it or not. In future, we should punt the handling of the remainder of the call off to a background thread. Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Fix an off-by-one error in afs_send_pages()David Howells1-1/+1
afs_send_pages() should only put the call into the AFS_CALL_AWAIT_REPLY state if it has sent all the pages - but the check it makes is incorrect and sometimes it will finish the loop early. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Fix afs_kill_pages()David Howells1-3/+7
Fix afs_kill_pages() in two ways: (1) If a writeback has been partially flushed, then if we try and kill the pages it contains, some of them may no longer be undergoing writeback and end_page_writeback() will assert. Fix this by checking to see whether the page in question is actually undergoing writeback before ending that writeback. (2) The loop that scans for pages to kill doesn't increase the first page index, and so the loop may not terminate, but it will try to process the same pages over and over again. Fix this by increasing the first page index to one after the last page we processed. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Fix page leak in afs_write_begin()David Howells1-2/+5
afs_write_begin() leaks a ref and a lock on a page if afs_fill_page() fails. Fix the leak by unlocking and releasing the page in the error path. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Don't set PG_error on local EINTR or ENOMEM when filling a pageDavid Howells1-2/+10
Don't set PG_error on a page if we get local EINTR or ENOMEM when filling a page for writing. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Populate and use client modification timeMarc Dionne2-10/+10
The inode timestamps should be set from the client time in the status received from the server, rather than the server time which is meant for internal server use. Set AFS_SET_MTIME and populate the mtime for operations that take an input status, such as file/dir creation and StoreData. If an input time is not provided the server will set the vnode times based on the current server time. In a situation where the server has some skew with the client, this could lead to the client seeing a timestamp in the future for a file that it just created or wrote. Signed-off-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Better abort and net error handlingDavid Howells1-8/+27
If we receive a network error, a remote abort or a protocol error whilst we're still transmitting data, make sure we return an appropriate error to the caller rather than ESHUTDOWN or ECONNABORTED. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Invalid op ID should abort with RXGEN_OPCODEDavid Howells2-1/+3
When we are given an invalid operation ID, we should abort that with RXGEN_OPCODE rather than RX_INVALID_OPERATION. Also map RXGEN_OPCODE to -ENOTSUPP. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Fix the maths in afs_fs_store_data()David Howells1-1/+1
afs_fs_store_data() works out of the size of the write it's going to make, but it uses 32-bit unsigned subtraction in one place that gets automatically cast to loff_t. However, if to < offset, then the number goes negative, but as the result isn't signed, this doesn't get sign-extended to 64-bits when placed in a loff_t. Fix by casting the operands to loff_t. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Use a bvec rather than a kvec in afs_send_pages()David Howells1-45/+52
Use a bvec rather than a kvec in afs_send_pages() as we don't then have to call kmap() in advance. This allows us to pass the array of contiguous pages that we extracted through to rxrpc in one go rather than passing a single page at a time. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Make struct afs_read::remain 64-bitDavid Howells2-5/+5
Make struct afs_read::remain 64-bit so that it can handle huge transfers if we ever request them or the server decides to give us a bit extra data (the other fields there are already 64-bit). Signed-off-by: David Howells <[email protected]> Tested-by: Marc Dionne <[email protected]>
2017-03-16afs: Fix AFS read bugDavid Howells1-1/+1
Fix a bug in AFS read whereby the request page afs_read::index isn't incremented after calling ->page_done() if ->remain reaches 0, indicating that the data read is complete. Without this a NULL pointer exception happens when ->page_done() is called twice for the last page because the page clearing loop will call it also and afs_readpages_page_done() clears the current entry in the page list. BUG: unable to handle kernel NULL pointer dereference at (null) IP: afs_readpages_page_done+0x21/0xa4 [kafs] PGD 0 Oops: 0002 [#1] SMP Modules linked in: kafs(E) CPU: 2 PID: 3002 Comm: md5sum Tainted: G E 4.10.0-fscache #485 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 task: ffff8804017d86c0 task.stack: ffff8803fc1d8000 RIP: 0010:afs_readpages_page_done+0x21/0xa4 [kafs] RSP: 0018:ffff8803fc1db978 EFLAGS: 00010282 RAX: ffff880405d39af8 RBX: 0000000000000000 RCX: ffff880407d83ed4 RDX: 0000000000000000 RSI: ffff880405d39a00 RDI: ffff880405c6f400 RBP: ffff8803fc1db988 R08: 0000000000000000 R09: 0000000000000001 R10: ffff8803fc1db820 R11: ffff88040cf56000 R12: ffff8804088f1780 R13: ffff8804017d86c0 R14: ffff8804088f1780 R15: 0000000000003840 FS: 00007f8154469700(0000) GS:ffff88041fb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000004016ec000 CR4: 00000000001406e0 Call Trace: afs_deliver_fs_fetch_data+0x5b9/0x60e [kafs] ? afs_make_call+0x316/0x4e8 [kafs] ? afs_make_call+0x359/0x4e8 [kafs] afs_deliver_to_call+0x173/0x2e8 [kafs] ? afs_make_call+0x316/0x4e8 [kafs] afs_make_call+0x37a/0x4e8 [kafs] ? wake_up_q+0x4f/0x4f ? __init_waitqueue_head+0x36/0x49 afs_fs_fetch_data+0x21c/0x227 [kafs] ? afs_fs_fetch_data+0x21c/0x227 [kafs] afs_vnode_fetch_data+0xf3/0x1d2 [kafs] afs_readpages+0x314/0x3fd [kafs] __do_page_cache_readahead+0x208/0x2c5 ondemand_readahead+0x3a2/0x3b7 ? ondemand_readahead+0x3a2/0x3b7 page_cache_async_readahead+0x5e/0x67 generic_file_read_iter+0x23b/0x70c ? __inode_security_revalidate+0x2f/0x62 __vfs_read+0xc4/0xe8 vfs_read+0xd1/0x15a SyS_read+0x4c/0x89 do_syscall_64+0x80/0x191 entry_SYSCALL64_slow_path+0x25/0x25 Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: Marc Dionne <[email protected]>
2017-03-16afs: Prevent callback expiry timer overflowTina Ruchandani3-6/+7
get_seconds() returns real wall-clock seconds. On 32-bit systems this value will overflow in year 2038 and beyond. This patch changes afs_vnode record to use ktime_get_real_seconds() instead, for the fields cb_expires and cb_expires_at. Signed-off-by: Tina Ruchandani <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Migrate vlocation fields to 64-bitTina Ruchandani4-16/+20
get_seconds() returns real wall-clock seconds. On 32-bit systems this value will overflow in year 2038 and beyond. This patch changes afs's vlocation record to use ktime_get_real_seconds() instead, for the fields time_of_death and update_at. Signed-off-by: Tina Ruchandani <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: security: Replace rcu_assign_pointer() with RCU_INIT_POINTER()Andreea-Cristina Bernat1-1/+1
The use of "rcu_assign_pointer()" is NULLing out the pointer. According to RCU_INIT_POINTER()'s block comment: "1. This use of RCU_INIT_POINTER() is NULLing out the pointer" it is better to use it instead of rcu_assign_pointer() because it has a smaller overhead. The following Coccinelle semantic patch was used: @@ @@ - rcu_assign_pointer + RCU_INIT_POINTER (..., NULL) Signed-off-by: Andreea-Cristina Bernat <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: inode: Replace rcu_assign_pointer() with RCU_INIT_POINTER()Andreea-Cristina Bernat1-1/+1
The use of "rcu_assign_pointer()" is NULLing out the pointer. According to RCU_INIT_POINTER()'s block comment: "1. This use of RCU_INIT_POINTER() is NULLing out the pointer" it is better to use it instead of rcu_assign_pointer() because it has a smaller overhead. The following Coccinelle semantic patch was used: @@ @@ - rcu_assign_pointer + RCU_INIT_POINTER (..., NULL) Signed-off-by: Andreea-Cristina Bernat <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Distinguish mountpoints from symlinks by file mode aloneDavid Howells3-68/+15
In AFS, mountpoints appear as symlinks with mode 0644 and normal symlinks have mode 0777, so use this to distinguish them rather than reading the content and parsing it. In the case of a mountpoint, the symlink body is a formatted string indicating the location of the target volume. Note that with this, kAFS no longer 'pre-fetches' the contents of symlinks, so afs_readpage() may fail with an access-denial because when the VFS calls d_automount(), it wraps the call in an credentials override that sets the initial creds - thereby preventing access to the caller's keyrings and the authentication keys held therein. To this end, a patch reverting that change to the VFS is required also. Reported-by: Jeffrey Altman <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Flush outstanding writes when an fd is closedDavid Howells3-0/+16
Flush outstanding writes in afs when an fd is closed. This is what NFS and CIFS do. Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Handle a short write to an AFS pageDavid Howells3-11/+23
Handle the situation where afs_write_begin() is told to expect that a full-page write will be made, but this doesn't happen (EFAULT, CTRL-C, etc.), and so afs_write_end() sees a partial write took place. Currently, no attempt is to deal with the discrepency. Fix this by loading the gap from the server. Reported-by: Al Viro <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Kill struct afs_read::pg_offsetDavid Howells1-1/+0
Kill struct afs_read::pg_offset as nothing uses it. It's unnecessary as pos can be masked off. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Handle better the server returning excess or short dataDavid Howells2-16/+40
When an AFS server is given an FS.FetchData{,64} request to read data from a file, it is permitted by the protocol to return more or less than was requested. kafs currently relies on the latter behaviour in readpage{,s} to handle a partial page at the end of the file (we just ask for a whole page and clear space beyond the short read). However, we don't handle all cases. Add: (1) Handle excess data by discarding it rather than aborting. Note that we use a common static buffer to discard into so that the decryption algorithm advances the PCBC state. (2) Handle a short read that affects more than just the last page. Note that if a read comes up unexpectedly short of long, it's possible that the server's copy of the file changed - in which case the data version number will have been incremented and the callback will have been broken - in which case all the pages currently attached to the inode will be zapped anyway at some point. Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Deal with an empty callback arrayMarc Dionne2-7/+9
Servers may send a callback array that is the same size as the FID array, or an empty array. If the callback count is 0, the code would attempt to read (fid_count * 12) bytes of data, which would fail and result in an unmarshalling error. This would lead to stale data for remotely modified files or directories. Store the callback array size in the internal afs_call structure and use that to determine the amount of data to read. Signed-off-by: Marc Dionne <[email protected]>
2017-03-16afs: Adjust mode bits processingMarc Dionne1-1/+6
Mode bits for an afs file should not be enforced in the usual way. For files, the absence of user bits can restrict file access with respect to what is granted by the server. These bits apply regardless of the owner or the current uid; the rest of the mode bits (group, other) are ignored. Signed-off-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Populate group ID from vnode statusMarc Dionne1-1/+1
The group was hard coded to GLOBAL_ROOT_GID; use the group ID that was received from the server. Signed-off-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]>
2017-03-16afs: Fix page overput in afs_fill_page()David Howells1-0/+1
afs_fill_page() loads the page it wants to fill into the afs_read request without incrementing its refcount - but then calls afs_put_read() to clean up afterwards, which then releases a ref on the page. Fix this by getting a ref on the page before calling afs_vnode_fetch_data(). This causes sync after a write to hang in afs_writepages_region() because find_get_pages_tag() gets confused and doesn't return. Fixes: 196ee9cd2d04 ("afs: Make afs_fs_fetch_data() take a list of pages") Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]> Tested-by: Marc Dionne <[email protected]>
2017-03-16afs: Fix missing put_page()David Howells1-0/+1
In afs_writepages_region(), inside the loop where we find dirty pages to deal with, one of the if-statements is missing a put_page(). Signed-off-by: David Howells <[email protected]>
2017-03-16blk-stat: fix blk_stat_sum() if all samples are batchedOmar Sandoval1-2/+2
We need to flush the batch _before_ we check the number of samples, otherwise we'll miss all of the batched samples. Fixes: cf43e6b ("block: add scalable completion tracking of requests") Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-03-16mmc: ushc: fix NULL-deref at probeJohan Hovold1-0/+3
Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer should a malicious device lack endpoints. Fixes: 53f3a9e26ed5 ("mmc: USB SD Host Controller (USHC) driver") Cc: stable <[email protected]> # 2.6.37 Cc: David Vrabel <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2017-03-16mmc: sdhci-of-at91: Support external regulatorsRomain Izard1-0/+19
The SDHCI controller in the SAMA5D2 chip requires a valid voltage set in the power control register, otherwise commands will fail with a timeout error. When using the regulator framework to specify the regulator used by the mmc device, the voltage is not configured, and it is not possible to use the connected device. Implement a custom 'set_power' function for this specific hardware, that configures the voltage in the register in all cases. Signed-off-by: Romain Izard <[email protected]> Acked-by: Ludovic Desroches <[email protected]> Cc: [email protected] #4.9+ Signed-off-by: Ulf Hansson <[email protected]>
2017-03-16mmc: core: mmc_blk_rw_cmd_err - remove unused variableWinkler, Tomas1-3/+0
Fix compilation warning: drivers/mmc/core/block.c:1563:24: warning: variable ‘mq_rq’ set but not used [-Wunused-but-set-variable] struct mmc_queue_req *mq_rq; Signed-off-by: Tomas Winkler <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2017-03-16drm/amdgpu: fix the clearing wb sizeHuang Rui1-2/+2
The clearing wb size should be the one that it is assigned. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-16drm/amdgpu: reinstate oland workaround for sclkAlex Deucher1-3/+7
Higher sclks seem to be unstable on some boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=100222 Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2017-03-16drm/radeon: reinstate oland workaround for sclkAlex Deucher1-3/+7
Higher sclks seem to be unstable on some boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=100222 Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2017-03-16perf/core: Better explain the inherit magicPeter Zijlstra1-3/+33
While going through the event inheritance code Oleg got confused. Add some comments to better explain the silent dissapearance of orphaned events. So what happens is that at perf_event_release_kernel() time; when an event looses its connection to userspace (and ceases to exist from the user's perspective) we can still have an arbitrary amount of inherited copies of the event. We want to synchronously find and remove all these child events. Since that requires a bit of lock juggling, there is the possibility that concurrent clone()s will create new child events. Therefore we first mark the parent event as DEAD, which marks all the extant child events as orphaned. We then avoid copying orphaned events; in order to avoid getting more of them. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-16perf/core: Simplify perf_event_free_task()Peter Zijlstra1-11/+1
We have ctx->event_list that contains all events; no need to repeatedly iterate the group lists to find them all. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-16perf/core: Fix event inheritance on fork()Peter Zijlstra1-2/+3
While hunting for clues to a use-after-free, Oleg spotted that perf_event_init_context() can loose an error value with the result that fork() can succeed even though we did not fully inherit the perf event context. Spotted-by: Oleg Nesterov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Cc: [email protected] Fixes: 889ff0150661 ("perf/core: Split context's event group list into pinned and non-pinned lists") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-16perf/core: Fix use-after-free in perf_release()Peter Zijlstra1-0/+11
Dmitry reported syzcaller tripped a use-after-free in perf_release(). After much puzzlement Oleg spotted the below scenario: Task1 Task2 fork() perf_event_init_task() /* ... */ goto bad_fork_$foo; /* ... */ perf_event_free_task() mutex_lock(ctx->lock) perf_free_event(B) perf_event_release_kernel(A) mutex_lock(A->child_mutex) list_for_each_entry(child, ...) { /* child == B */ ctx = B->ctx; get_ctx(ctx); mutex_unlock(A->child_mutex); mutex_lock(A->child_mutex) list_del_init(B->child_list) mutex_unlock(A->child_mutex) /* ... */ mutex_unlock(ctx->lock); put_ctx() /* >0 */ free_task(); mutex_lock(ctx->lock); mutex_lock(A->child_mutex); /* ... */ mutex_unlock(A->child_mutex); mutex_unlock(ctx->lock) put_ctx() /* 0 */ ctx->task && !TOMBSTONE put_task_struct() /* UAF */ This patch closes the hole by making perf_event_free_task() destroy the task <-> ctx relation such that perf_event_release_kernel() will no longer observe the now dead task. Spotted-by: Oleg Nesterov <[email protected]> Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Fixes: c6e5b73242d2 ("perf: Synchronously clean up child events") Link: http://lkml.kernel.org/r/20170314155949.GE32474@worktop Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-16mmc: mediatek: Fixed bug where clock frequency could be set wrongyong mao1-2/+2
This patch can fix two issues: Issue 1: In previous code, div may be overflow when setting clock frequency as f_min. We can use DIV_ROUND_UP to fix this boundary related issue. Issue 2: In previous code, we can not set the correct clock frequency when div equals 0xff. Signed-off-by: Yong Mao <[email protected]> Signed-off-by: Chaotian Jing <[email protected]> Reviewed-by: Daniel Kurtz <[email protected]> Signed-off-by: Ulf Hansson <[email protected]>
2017-03-16EDAC, pnd2_edac: Add new EDAC driver for Intel SoC platformsTony Luck5-0/+1859
Initial target for this driver is the Intel Apollo Lake platform and Denverton micro-server, they use the same internal memory controller IP called Pondicherry2. Memory controller registers are not in PCI config space like earlier Intel memory controllers. For Apollo Lake platform they are accessed via a "side-band" interface, for Denverton micro-server they are access via PCI config space and memory map I/O. This driver is for Apollo Lake and Denverton, but only the Denverton is fully enabled while we wait for the sideband driver. Apollo lake driver and initial cut at Denverton driver by Tony Luck. Extensive cleanup, refactoring and basic verification by Qiuxu Zhuo. Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Qiuxu Zhuo <[email protected]> Cc: linux-edac <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov <[email protected]>
2017-03-16powerpc: Wire up statx() syscallChandan Rajendra3-1/+3
Test runs on a ppc64 BE guest succeeded. linux/samples/statx/test-statx program was executed on the following file types, 1. Regular file 2. Directory 3. device file 4. symlink 5. Named pipe The test run also included invoking test-statx with the runtime options provided in the main() function of test-statx.c Signed-off-by: Chandan Rajendra <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-03-16hwrng: geode - Revert managed API changesPrarit Bhargava1-15/+35
After commit e9afc746299d ("hwrng: geode - Use linux/io.h instead of asm/io.h") the geode-rng driver uses devres with pci_dev->dev to keep track of resources, but does not actually register a PCI driver. This results in the following issues: 1. The driver leaks memory because the driver does not attach to a device. The driver only uses the PCI device as a reference. devm_*() functions will release resources on driver detach, which the geode-rng driver will never do. As a result, 2. The driver cannot be reloaded because there is always a use of the ioport and region after the first load of the driver. Revert the changes made by e9afc746299d ("hwrng: geode - Use linux/io.h instead of asm/io.h"). Cc: <[email protected]> Signed-off-by: Prarit Bhargava <[email protected]> Fixes: 6e9b5e76882c ("hwrng: geode - Migrate to managed API") Cc: Matt Mackall <[email protected]> Cc: Corentin LABBE <[email protected]> Cc: PrasannaKumar Muralidharan <[email protected]> Cc: Wei Yongjun <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Herbert Xu <[email protected]>
2017-03-16hwrng: amd - Revert managed API changesPrarit Bhargava1-8/+34
After commit 31b2a73c9c5f ("hwrng: amd - Migrate to managed API"), the amd-rng driver uses devres with pci_dev->dev to keep track of resources, but does not actually register a PCI driver. This results in the following issues: 1. The message WARNING: CPU: 2 PID: 621 at drivers/base/dd.c:349 driver_probe_device+0x38c is output when the i2c_amd756 driver loads and attempts to register a PCI driver. The PCI & device subsystems assume that no resources have been registered for the device, and the WARN_ON() triggers since amd-rng has already do so. 2. The driver leaks memory because the driver does not attach to a device. The driver only uses the PCI device as a reference. devm_*() functions will release resources on driver detach, which the amd-rng driver will never do. As a result, 3. The driver cannot be reloaded because there is always a use of the ioport and region after the first load of the driver. Revert the changes made by 31b2a73c9c5f ("hwrng: amd - Migrate to managed API"). Cc: <[email protected]> Signed-off-by: Prarit Bhargava <[email protected]> Fixes: 31b2a73c9c5f ("hwrng: amd - Migrate to managed API"). Cc: Matt Mackall <[email protected]> Cc: Corentin LABBE <[email protected]> Cc: PrasannaKumar Muralidharan <[email protected]> Cc: Wei Yongjun <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Herbert Xu <[email protected]>
2017-03-16crypto: ccp - Assign DMA commands to the channel's CCPGary R Hook3-2/+6
The CCP driver generally uses a round-robin approach when assigning operations to available CCPs. For the DMA engine, however, the DMA mappings of the SGs are associated with a specific CCP. When an IOMMU is enabled, the IOMMU is programmed based on this specific device. If the DMA operations are not performed by that specific CCP then addressing errors and I/O page faults will occur. Update the CCP driver to allow a specific CCP device to be requested for an operation and use this in the DMA engine support. Cc: <[email protected]> # 4.9.x- Signed-off-by: Gary R Hook <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2017-03-16nl80211: fix dumpit error path RTNL deadlocksJohannes Berg1-71/+56
Sowmini pointed out Dmitry's RTNL deadlock report to me, and it turns out to be perfectly accurate - there are various error paths that miss unlock of the RTNL. To fix those, change the locking a bit to not be conditional in all those nl80211_prepare_*_dump() functions, but make those require the RTNL to start with, and fix the buggy error paths. This also let me use sparse (by appropriately overriding the rtnl_lock/rtnl_unlock functions) to validate the changes. Cc: [email protected] Reported-by: Sowmini Varadhan <[email protected]> Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2017-03-16sched/deadline: Use deadline instead of period when calculating overflowSteven Rostedt (VMware)1-4/+4
I was testing Daniel's changes with his test case, and tweaked it a little. Instead of having the runtime equal to the deadline, I increased the deadline ten fold. Daniel's test case had: attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */ attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */ attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ To make it more interesting, I changed it to: attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */ attr.sched_deadline = 20 * 1000 * 1000; /* 20 ms */ attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ The results were rather surprising. The behavior that Daniel's patch was fixing came back. The task started using much more than .1% of the CPU. More like 20%. Looking into this I found that it was due to the dl_entity_overflow() constantly returning true. That's because it uses the relative period against relative runtime vs the absolute deadline against absolute runtime. runtime / (deadline - t) > dl_runtime / dl_period There's even a comment mentioning this, and saying that when relative deadline equals relative period, that the equation is the same as using deadline instead of period. That comment is backwards! What we really want is: runtime / (deadline - t) > dl_runtime / dl_deadline We care about if the runtime can make its deadline, not its period. And then we can say "when the deadline equals the period, the equation is the same as using dl_period instead of dl_deadline". After correcting this, now when the task gets enqueued, it can throttle correctly, and Daniel's fix to the throttling of sleeping deadline tasks works even when the runtime and deadline are not the same. Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Daniel Bristot de Oliveira <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luca Abeni <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Romulo Silva de Oliveira <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tommaso Cucinotta <[email protected]> Link: http://lkml.kernel.org/r/02135a27f1ae3fe5fd032568a5a2f370e190e8d7.1488392936.git.bristot@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-03-16sched/deadline: Throttle a constrained deadline task activated after the ↵Daniel Bristot de Oliveira1-0/+45
deadline During the activation, CBS checks if it can reuse the current task's runtime and period. If the deadline of the task is in the past, CBS cannot use the runtime, and so it replenishes the task. This rule works fine for implicit deadline tasks (deadline == period), and the CBS was designed for implicit deadline tasks. However, a task with constrained deadline (deadine < period) might be awakened after the deadline, but before the next period. In this case, replenishing the task would allow it to run for runtime / deadline. As in this case deadline < period, CBS enables a task to run for more than the runtime / period. In a very loaded system, this can cause a domino effect, making other tasks miss their deadlines. To avoid this problem, in the activation of a constrained deadline task after the deadline but before the next period, throttle the task and set the replenishing timer to the begin of the next period, unless it is boosted. Reproducer: --------------- %< --------------- int main (int argc, char **argv) { int ret; int flags = 0; unsigned long l = 0; struct timespec ts; struct sched_attr attr; memset(&attr, 0, sizeof(attr)); attr.size = sizeof(attr); attr.sched_policy = SCHED_DEADLINE; attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */ attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */ attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ ts.tv_sec = 0; ts.tv_nsec = 2000 * 1000; /* 2 ms */ ret = sched_setattr(0, &attr, flags); if (ret < 0) { perror("sched_setattr"); exit(-1); } for(;;) { /* XXX: you may need to adjust the loop */ for (l = 0; l < 150000; l++); /* * The ideia is to go to sleep right before the deadline * and then wake up before the next period to receive * a new replenishment. */ nanosleep(&ts, NULL); } exit(0); } --------------- >% --------------- On my box, this reproducer uses almost 50% of the CPU time, which is obviously wrong for a task with 2/2000 reservation. Signed-off-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luca Abeni <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Romulo Silva de Oliveira <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tommaso Cucinotta <[email protected]> Link: http://lkml.kernel.org/r/edf58354e01db46bf42df8d2dd32418833f68c89.1488392936.git.bristot@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-03-16sched/deadline: Make sure the replenishment timer fires in the next periodDaniel Bristot de Oliveira1-2/+7
Currently, the replenishment timer is set to fire at the deadline of a task. Although that works for implicit deadline tasks because the deadline is equals to the begin of the next period, that is not correct for constrained deadline tasks (deadline < period). For instance: f.c: --------------- %< --------------- int main (void) { for(;;); } --------------- >% --------------- # gcc -o f f.c # trace-cmd record -e sched:sched_switch \ -e syscalls:sys_exit_sched_setattr \ chrt -d --sched-runtime 490000000 \ --sched-deadline 500000000 \ --sched-period 1000000000 0 ./f # trace-cmd report | grep "{pid of ./f}" After setting parameters, the task is replenished and continue running until being throttled: f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0 The task is throttled after running 492318 ms, as expected: f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> watchdog/3:32 [0] But then, the task is replenished 500719 ms after the first replenishment: <idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R ==> f:11295 [-1] Running for 490277 ms: f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> swapper/3:0 [120] Hence, in the first period, the task runs 2 * runtime, and that is a bug. During the first replenishment, the next deadline is set one period away. So the runtime / period starts to be respected. However, as the second replenishment took place in the wrong instant, the next replenishment will also be held in a wrong instant of time. Rather than occurring in the nth period away from the first activation, it is taking place in the (nth period - relative deadline). Signed-off-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Luca Abeni <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Juri Lelli <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Romulo Silva de Oliveira <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tommaso Cucinotta <[email protected]> Link: http://lkml.kernel.org/r/ac50d89887c25285b47465638354b63362f8adff.1488392936.git.bristot@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-03-16vmw_vmci: handle the return value from pci_alloc_irq_vectors correctlyChristoph Hellwig1-2/+2
It returns the number of vectors allocated when successful, so check for a negative error only. Fixes: 3bb434cd ("vmw_vmci: switch to pci_irq_alloc_vectors") Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Loïc Yhuel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-03-16ppdev: fix registering same device nameSudip Mukherjee1-2/+9
Usually every parallel port will have a single pardev registered with it. But ppdev driver is an exception. This userspace parallel port driver allows to create multiple parrallel port devices for a single parallel port. And as a result we were having a big warning like: "sysfs: cannot create duplicate filename '/devices/parport0/ppdev0.0'". And with that many parallel port printers stopped working. We have been using the minor number as the id field while registering a parralel port device with a parralel port. But when there are multiple parrallel port device for one single parallel port, they all tried to register with the same name like 'pardev0.0' and everything started failing. Use an incremented index as the id instead of the minor number. Fixes: 8b7d3a9d903e ("ppdev: use new parport device model") Cc: stable <[email protected]> # v4.9+ Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1414656 Bugzilla: https://bugs.archlinux.org/task/52322 Tested-by: James Feeney <[email protected]> Signed-off-by: Sudip Mukherjee <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>