aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-03-22rapidio/rionet: add shutdown event handlingAlexandre Bounine1-0/+38
Add shutdown notification handler which terminates active connections with remote RapidIO nodes. This prevents remote nodes from sending packets to the powered off node and eliminates hardware error events on remote nodes. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/tsi721: add shutdown notification callbackAlexandre Bounine3-0/+48
Add device driver specific shutdown notification callback. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio: add shutdown notification for RapidIO devicesAlexandre Bounine2-0/+14
Add bus-specific callback to stop RapidIO devices during a system shutdown. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/tsi721: add query_mport callbackAlexandre Bounine1-0/+34
Add device-specific implementation of query_mport callback function. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio: add query_mport operationAlexandre Bounine3-0/+69
Add mport query operation to report master port RapidIO capabilities and run time configuration to upper level drivers. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/tsi721_dma: fix pending transaction queue handlingAlexandre Bounine2-30/+32
Fix pending DMA request queue handling to avoid broken ordering during concurrent request submissions. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/tsi721: add option to configure direct mapping of IB windowAlexandre Bounine2-30/+142
Add an option to configure mapping of Inbound Window without RIO-to-PCIe address translation. If a local memory buffer is not properly aligned to meet HW requirements for RapidIO address mapping with address translation, caller can request an inbound window with matching RapidIO address assigned to it. This implementation selects RapidIO base address and size for inbound window that are capable to accommodate the local memory buffer. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/tsi721: add check for overlapped IB window mappingsAlexandre Bounine2-21/+62
Add check for attempts to request mapping of inbound RapidIO address space that overlaps with existing active mapping windows. Tsi721 device does not support overlapped inbound windows and SRIO address decoding behavior is not defined in such cases. This patch is applicable to kernel versions starting from v3.7. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/tsi721: fix hardcoded MRRS settingAlexandre Bounine1-4/+2
Remove use of hardcoded setting for Maximum Read Request Size (MRRS) value and use one set by PCIe bus driver. Using hardcoded value can cause PCIe bus errors on platforms that have tsi721 device on PCIe path that allows only smaller read request sizes. This fix is applicable to kernel versions starting from v3.2. Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/rionet: add capability to change MTUAurelien Jacquiot1-2/+15
These patches are the result of extensive collaboration within the RapidIO.org Software Task Group between Texas Instruments, Freescale, Prodrive Technologies, Nokia Networks, BAE and IDT. Additional input was received from other members of RapidIO.org. The objective was to create a character mode driver interface which exposes the capabilities of RapidIO devices directly to applications, in a manner that allows the numerous and varied RapidIO implementations to interoperate. The Software Task Group has also developed fabric management, Remote Memory Access, and sockets applications which make use of these interfaces in user space. Intensive testing with these applications prompted the RapidIO subsystem updates provided within this set of patches. This patch (of 29): Replace default Ethernet-specific routine by the custom one to allow setting of larger MTU supported by RapidIO messaging (max RIO packet size is 4096 bytes). Signed-off-by: Aurelien Jacquiot <[email protected]> Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Andre van Herk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22rapidio/rionet: fix deadlock on SMPAurelien Jacquiot1-2/+2
Fix deadlocking during concurrent receive and transmit operations on SMP platforms caused by the use of incorrect lock: on transmit 'tx_lock' spinlock should be used instead of 'lock' which is used for receive operation. This fix is applicable to kernel versions starting from v2.15. Signed-off-by: Aurelien Jacquiot <[email protected]> Signed-off-by: Alexandre Bounine <[email protected]> Cc: Matt Porter <[email protected]> Cc: Andre van Herk <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22cpumask: remove incorrect information from commentEric Biggers1-2/+0
Since commit cdfdef75e795 ("cpumask: only allocate nr_cpumask_bits."), this comment above cpumask_size() is no longer relevant. Signed-off-by: Eric Biggers <[email protected]> Cc: Rusty Russell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22fs/coredump: prevent fsuid=0 dumps into user-controlled directoriesJann Horn6-12/+32
This commit fixes the following security hole affecting systems where all of the following conditions are fulfilled: - The fs.suid_dumpable sysctl is set to 2. - The kernel.core_pattern sysctl's value starts with "/". (Systems where kernel.core_pattern starts with "|/" are not affected.) - Unprivileged user namespace creation is permitted. (This is true on Linux >=3.8, but some distributions disallow it by default using a distro patch.) Under these conditions, if a program executes under secure exec rules, causing it to run with the SUID_DUMP_ROOT flag, then unshares its user namespace, changes its root directory and crashes, the coredump will be written using fsuid=0 and a path derived from kernel.core_pattern - but this path is interpreted relative to the root directory of the process, allowing the attacker to control where a coredump will be written with root privileges. To fix the security issue, always interpret core_pattern for dumps that are written under SUID_DUMP_ROOT relative to the root directory of init. Signed-off-by: Jann Horn <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Al Viro <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglockOleg Nesterov1-2/+1
This test-case (simplified version of generated by syzkaller) #include <unistd.h> #include <sys/ptrace.h> #include <sys/wait.h> void test(void) { for (;;) { if (fork()) { wait(NULL); continue; } ptrace(PTRACE_SEIZE, getppid(), 0, 0); ptrace(PTRACE_INTERRUPT, getppid(), 0, 0); _exit(0); } } int main(void) { int np; for (np = 0; np < 8; ++np) if (!fork()) test(); while (wait(NULL) > 0) ; return 0; } triggers the 2nd WARN_ON_ONCE(!signr) warning in do_jobctl_trap(). The problem is that __ptrace_unlink() clears task->jobctl under siglock but task->ptrace is cleared without this lock held; this fools the "else" branch which assumes that !PT_SEIZED means PT_PTRACED. Note also that most of other PTRACE_SEIZE checks can race with detach from the exiting tracer too. Say, the callers of ptrace_trap_notify() assume that SEIZED can't go away after it was checked. Signed-off-by: Oleg Nesterov <[email protected]> Reported-by: Dmitry Vyukov <[email protected]> Cc: Tejun Heo <[email protected]> Cc: syzkaller <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22fat: add config option to set UTF-8 mount option by defaultMaciej S. Szmigiero3-5/+24
FAT has long supported its own default file name encoding config setting, separate from CONFIG_NLS_DEFAULT. However, if UTF-8 encoded file names are desired FAT character set should not be set to utf8 since this would make file names case sensitive even if case insensitive matching is requested. Instead, "utf8" mount options should be provided to enable UTF-8 file names in FAT file system. Unfortunately, there was no possibility to set the default value of this option so on UTF-8 system "utf8" mount option had to be added manually to most FAT mounts. This patch adds config option to set such default value. Signed-off-by: Maciej S. Szmigiero <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22x86/compat: remove is_compat_task()Andy Lutomirski3-3/+4
x86's is_compat_task always checked the current syscall type, not the task type. It has no non-arch users any more, so just remove it to avoid confusion. On x86, nothing should really be checking the task ABI. There are legitimate users for the syscall ABI and for the mm ABI. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22drivers/hid/uhid.c: check write() bitness using in_compat_syscallAndy Lutomirski1-1/+1
uhid changes the format expected in write() depending on bitness. It should check the syscall bitness directly. Signed-off-by: Andy Lutomirski <[email protected]> Cc: David Herrmann <[email protected]> Acked-by: Jiri Kosina <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22input: redefine INPUT_COMPAT_TEST as in_compat_syscall()Andy Lutomirski1-11/+1
The input compat code should work like all other compat code: for 32-bit syscalls, use the 32-bit ABI and for 64-bit syscalls, use the 64-bit ABI. We have a helper for that (in_compat_syscall()): just use it. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Dmitry Torokhov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22drivers/gpu/drm/amd/amdkfd: use in_compat_syscall to check open() caller typeAndy Lutomirski2-2/+2
amdkfd wants to know syscall type, not task type. Check directly. Unfortunately, amdkfd is making nasty assumptions that a process' bitness is a well-defined constant thing. This isn't the case on x86. I don't know how much this matters, but this patch has no effect on generated code on x86, so amdkfd is equally broken with and without this patch. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Oded Gabbay <[email protected]> Cc: David Airlie <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22drivers/firmware/efi/efivars.c: use in_compat_syscall() to check for compat ↵Andy Lutomirski1-1/+1
callers This should make no difference on any architecture, as x86's historical is_compat_task behavior really did check whether the calling syscall was a compat syscall. x86's is_compat_task is going away, though. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Matt Fleming <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22firewire: use in_compat_syscall to check ioctl compatnessAndy Lutomirski1-2/+2
Firewire was using is_compat_task to check whether it was in a compat ioctl or a non-compat ioctl. Use is_compat_syscall instead so it works properly on all architectures. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Clemens Ladisch <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22net/xfrm_user: use in_compat_syscall to deny compat syscallsAndy Lutomirski1-1/+1
The code wants to prevent compat code from receiving messages. Use in_compat_syscall for this. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Steffen Klassert <[email protected]> Cc: Herbert Xu <[email protected]> Cc: "David S. Miller" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22net/sctp: use in_compat_syscall for sctp_getsockopt_connectx3Andy Lutomirski1-1/+1
SCTP unfortunately has a different ABI for SCTP_SOCKOPT_CONNECTX3 for 32-bit and 64-bit callers. Use in_compat_syscall to correctly distinguish them on all architectures. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Vlad Yasevich <[email protected]> Cc: Neil Horman <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ext4: in ext4_dir_llseek, check syscall bitness directlyAndy Lutomirski1-1/+1
ext4 treats directory offsets differently for 32-bit and 64-bit callers. Check the caller type using in_compat_syscall, not is_compat_task. This changes behavior on SPARC slightly. Signed-off-by: Andy Lutomirski <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Andreas Dilger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22staging/lustre: switch from is_compat_task to in_compat_syscallAndy Lutomirski1-1/+1
AFAICT, lustre is trying to determine syscall bitness. Use the new accessor. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Oleg Drokin <[email protected]> Cc: Andreas Dilger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22auditsc: for seccomp events, log syscall compat state using in_compat_syscallAndy Lutomirski1-2/+2
Except on SPARC, this is what the code always did. SPARC compat seccomp was buggy, although the impact of the bug was limited because SPARC 32-bit and 64-bit syscall numbers are the same. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Paul Moore <[email protected]> Cc: Eric Paris <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ptrace: in PEEK_SIGINFO, check syscall bitness, not task bitnessAndy Lutomirski1-1/+1
Users of the 32-bit ptrace() ABI expect the full 32-bit ABI. siginfo translation should check ptrace() ABI, not caller task ABI. This is an ABI change on SPARC. Let's hope that no one relied on the old buggy ABI. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22seccomp: check in_compat_syscall, not is_compat_task, in strict modeAndy Lutomirski1-2/+2
Seccomp wants to know the syscall bitness, not the caller task bitness, when it selects the syscall whitelist. As far as I know, this makes no difference on any architecture, so it's not a security problem. (It generates identical code everywhere except sparc, and, on sparc, the syscall numbering is the same for both ABIs.) Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22sparc/syscall: fix syscall_get_archAndy Lutomirski1-1/+8
Sparc's syscall_get_arch was buggy: it returned the task arch, not the syscall arch. This could confuse seccomp and audit. I don't think this is as bad for seccomp as it looks: sparc's 32-bit and 64-bit syscalls are numbered the same. Signed-off-by: Andy Lutomirski <[email protected]> Cc: David S. Miller <[email protected]> Cc: Sam Ravnborg <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22sparc/compat: provide an accurate in_compat_syscall implementationAndy Lutomirski1-0/+7
On sparc64 compat-enabled kernels, any task can make 32-bit and 64-bit syscalls. is_compat_task returns true in 32-bit tasks, which does not necessarily imply that the current syscall is 32-bit. Provide an in_compat_syscall implementation that checks whether the current syscall is compat. As far as I know, sparc is the only architecture on which is_compat_task checks the compat status of the task and on which the compat status of a syscall can differ from the compat status of the task. On x86, is_compat_task checks the syscall type, not the task type. [[email protected]: add comment, per Sam] [[email protected]: update comment, per Andy] Signed-off-by: Andy Lutomirski <[email protected]> Acked-by: David S. Miller <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Andy Lutomirski <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22compat: add in_compat_syscall to ask whether we're in a compat syscallAndy Lutomirski1-0/+15
A lot of code currently abuses is_compat_task to determine this. Signed-off-by: Andy Lutomirski <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Andreas Dilger <[email protected]> Cc: Clemens Ladisch <[email protected]> Cc: David Airlie <[email protected]> Cc: David Herrmann <[email protected]> Cc: David Miller <[email protected]> Cc: Dmitry Torokhov <[email protected]> Cc: Eric Paris <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ingo Molnar <[email protected]> Acked-by: Jiri Kosina <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Neil Horman <[email protected]> Cc: Oded Gabbay <[email protected]> Cc: Oleg Drokin <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul Moore <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Steffen Klassert <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vlad Yasevich <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22kernel/hung_task.c: use timeout diff when timeout is updatedTetsuo Handa1-8/+13
When new timeout is written to /proc/sys/kernel/hung_task_timeout_secs, khungtaskd is interrupted and again sleeps for full timeout duration. This means that hang task will not be checked if new timeout is written periodically within old timeout duration and/or checking of hang task will be delayed for up to previous timeout duration. Fix this by remembering last time khungtaskd checked hang task. This change will allow other watchdog tasks (if any) to share khungtaskd by sleeping for minimal timeout diff of all watchdog tasks. Doing more watchdog tasks from khungtaskd will reduce the possibility of printk() collisions by multiple watchdog threads. Signed-off-by: Tetsuo Handa <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Aaron Tomlin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22zram: revive swap_slot_free_notifyMinchan Kim1-43/+50
Commit b430e9d1c6d4 ("remove compressed copy from zram in-memory") applied swap_slot_free_notify call in *end_swap_bio_read* to remove duplicated memory between zram and memory. However, with the introduction of rw_page in zram: 8c7f01025f7b ("zram: implement rw_page operation of zram"), it became void because rw_page doesn't need bio. Memory footprint is really important in embedded platforms which have small memory, for example, 512M) recently because it could start to kill processes if memory footprint exceeds some threshold by LMK or some similar memory management modules. This patch restores the function for rw_page, thereby eliminating this duplication. Signed-off-by: Minchan Kim <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: karam.lee <[email protected]> Cc: <[email protected]> Cc: Chan Jeong <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ocfs2: add feature document for online file checkGang He1-0/+94
This document will describe OCFS2 online file check feature. OCFS2 is often used in high-availaibility systems. However, OCFS2 usually converts the filesystem to read-only when encounters an error. This may not be necessary, since turning the filesystem read-only would affect other running processes as well, decreasing availability. Then, a mount option (errors=continue) is introduced, which would return the -EIO errno to the calling process and terminate furhter processing so that the filesystem is not corrupted further. The filesystem is not converted to read-only, and the problematic file's inode number is reported in the kernel log. The user can try to check/fix this file via online filecheck feature. Signed-off-by: Gang He <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ocfs2: check/fix inode block for online file checkGang He2-9/+218
Implement online check or fix inode block during reading a inode block to memory. Signed-off-by: Gang He <[email protected]> Reviewed-by: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ocfs2: create/remove sysfile for online file checkGang He1-0/+5
Create online file check sysfile when ocfs2 mount, remove the related sysfile when ocfs2 umount. Signed-off-by: Gang He <[email protected]> Reviewed-by: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ocfs2: sysfile interfaces for online file checkGang He4-1/+660
Implement online file check sysfile interfaces, e.g. how to create the related sysfile according to device name, how to display/handle file check request from the sysfile. Signed-off-by: Gang He <[email protected]> Reviewed-by: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22ocfs2: export ocfs2_kset for online file checkGang He2-1/+4
When there are errors in the ocfs2 filesystem, they are usually accompanied by the inode number which caused the error. This inode number would be the input to fixing the file. One of these options could be considered: A file in the sys filesytem which would accept inode numbers. This could be used to communication back what has to be fixed or is fixed. You could write: $# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/check or $# echo "<inode>" > /sys/fs/ocfs2/devname/filecheck/fix Compare with second version, I re-design filecheck sysfs interfaces, there are three sysfs files (check, fix and set) under filecheck directory (see above), sysfs will accept only one argument <inode>. Second, I adjust some code in ocfs2_filecheck_repair_inode_block() function according to upstream feedback, we cannot just add VALID_FL flag back as a inode block fix, then we will not fix this field corruption currently until having a complete solution. Compare with first version, I use strncasecmp instead of double strncmp functions. Second, update the source file contribution vendor. This patch (of 4): Export ocfs2_kset object from ocfs2_stackglue kernel module, then online file check code will create the related sysfiles under ocfs2_kset object. We're exporting this because it's built in ocfs2_stackglue.ko. Signed-off-by: Gang He <[email protected]> Reviewed-by: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-22cpufreq: governor: Always schedule work on the CPU running updateRafael J. Wysocki1-1/+1
Modify dbs_irq_work() to always schedule the process-context work on the current CPU which also ran the dbs_update_util_handler() that the irq_work being handled came from. This causes the entire frequency update handling (involving the "ondemand" or "conservative" governors) to be carried out by the CPU whose frequency is to be updated and reduces the overall amount of inter-CPU noise related to cpufreq. Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-03-22cpufreq: Always update current frequency before startig governorRafael J. Wysocki1-11/+3
Make policy->cur match the current frequency returned by the driver's ->get() callback before starting the governor in case they went out of sync in the meantime and drop the piece of code attempting to resync policy->cur with the real frequency of the boot CPU from cpufreq_resume() as it serves no purpose any more (and it's racy and super-ugly anyway). Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
2016-03-22cpufreq: Introduce cpufreq_update_current_freq()Rafael J. Wysocki1-9/+19
Move the part of cpufreq_update_policy() that obtains the current frequency from the driver and updates policy->cur if necessary to a separate function, cpufreq_get_current_freq(). That should not introduce functional changes and subsequent change set will need it. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
2016-03-22cpufreq: Introduce cpufreq_start_governor()Rafael J. Wysocki1-22/+22
Starting a governor in cpufreq always follows the same pattern involving two calls to cpufreq_governor(), one with the event argument set to CPUFREQ_GOV_START and one with that argument set to CPUFREQ_GOV_LIMITS. Introduce cpufreq_start_governor() that will carry out those two operations and make all places where governors are started use it. That slightly modifies the behavior of cpufreq_set_policy() which now also will go back to the old governor if the second call to cpufreq_governor() (the one with event equal to CPUFREQ_GOV_LIMITS) fails, but that really is how it should work in the first place. Also cpufreq_resume() will now pring an error message if the CPUFREQ_GOV_LIMITS call to cpufreq_governor() fails, but that makes it follow cpufreq_add_policy_cpu() and cpufreq_offline() in that respect. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
2016-03-22cpufreq: powernv: Add sysfs attributes to show throttle statsShilpasri G Bhat2-2/+141
Create sysfs attributes to export throttle information in /sys/devices/system/cpu/cpuX/cpufreq/throttle_stats directory. The newly added sysfs files are as follows: 1)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/turbo_stat 2)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/sub-turbo_stat 3)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/unthrottle 4)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/powercap 5)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overtemp 6)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/supply_fault 7)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/overcurrent 8)/sys/devices/system/cpu/cpuX/cpufreq/throttle_stats/occ_reset Detailed explanation of each attribute is added to Documentation/ABI/testing/sysfs-devices-system-cpu Signed-off-by: Shilpasri G Bhat <[email protected]> Acked-by: Viresh Kumar <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-03-22cpufreq: acpi-cpufreq: make Intel/AMD MSR access, io port access staticJisheng Zhang1-6/+6
These frequency register read/write operations' implementations for the given processor (Intel/AMD MSR access or I/O port access) are only used internally in acpi-cpufreq, so make them static. Signed-off-by: Jisheng Zhang <[email protected]> Acked-by: Viresh Kumar <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-03-22PCI: ACPI: IA64: fix IO port generic range checkLorenzo Pieralisi1-1/+13
The [0 - 64k] ACPI PCI IO port resource boundary check in: acpi_dev_ioresource_flags() is currently applied blindly in the ACPI resource parsing to all architectures, but only x86 suffers from that IO space limitation. On arches (ie IA64 and ARM64) where IO space is memory mapped, the PCI root bridges IO resource windows are firstly initialized from the _CRS (in acpi_decode_space()) and contain the CPU physical address at which a root bridge decodes IO space in the CPU physical address space with the offset value representing the offset required to translate the PCI bus address into the CPU physical address. The IO resource windows are then parsed and updated in arch code before creating and enumerating PCI buses (eg IA64 add_io_space()) to map in an arch specific way the obtained CPU physical address range to a slice of virtual address space reserved to map PCI IO space, ending up with PCI bridges resource windows containing IO resources like the following on a working IA64 configuration: PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [io 0x1000000-0x100ffff window] (bus address [0x0000-0xffff]) pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window] pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window] pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window] pci_bus 0000:00: root bus resource [bus 00] This implies that the [0 - 64K] check in acpi_dev_ioresource_flags() leaves platforms with memory mapped IO space (ie IA64) broken (ie kernel can't claim IO resources since the host bridge IO resource is disabled and discarded by ACPI core code, see log on IA64 with missing root bridge IO resource, silently filtered by current [0 - 64k] check in acpi_dev_ioresource_flags()): PCI host bridge to bus 0000:00 pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000fffff window] pci_bus 0000:00: root bus resource [mem 0x80000000-0x8fffffff window] pci_bus 0000:00: root bus resource [mem 0x80004000000-0x800ffffffff window] pci_bus 0000:00: root bus resource [bus 00] [...] pci 0000:00:03.0: [1002:515e] type 00 class 0x030000 pci 0000:00:03.0: reg 0x10: [mem 0x80000000-0x87ffffff pref] pci 0000:00:03.0: reg 0x14: [io 0x1000-0x10ff] pci 0000:00:03.0: reg 0x18: [mem 0x88020000-0x8802ffff] pci 0000:00:03.0: reg 0x30: [mem 0x88000000-0x8801ffff pref] pci 0000:00:03.0: supports D1 D2 pci 0000:00:03.0: can't claim BAR 1 [io 0x1000-0x10ff]: no compatible bridge window For this reason, the IO port resources boundaries check in generic ACPI parsing code should be guarded with a CONFIG_X86 guard so that more arches (ie ARM64) can benefit from the generic ACPI resources parsing interface without incurring in unexpected resource filtering, fixing at the same time current breakage on IA64. This patch factors out IO ports boundary [0 - 64k] check in generic ACPI code and makes the IO space check X86 specific to make sure that IO space resources are usable on other arches too. Fixes: 3772aea7d6f3 (ia64/PCI/ACPI: Use common ACPI resource parsing interface for host bridge) Signed-off-by: Lorenzo Pieralisi <[email protected]> Cc: 4.4+ <[email protected]> # 4.4+ Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-03-22tracing: Record and show NMI statePeter Zijlstra3-3/+9
The latency tracer format has a nice column to indicate IRQ state, but this is not able to tell us about NMI state. When tracing perf interrupt handlers (which often run in NMI context) it is very useful to see how the events nest. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2016-03-22tracing: Fix trace_printk() to print when not using bprintk()Steven Rostedt (Red Hat)2-3/+6
The trace_printk() code will allocate extra buffers if the compile detects that a trace_printk() is used. To do this, the format of the trace_printk() is saved to the __trace_printk_fmt section, and if that section is bigger than zero, the buffers are allocated (along with a message that this has happened). If trace_printk() uses a format that is not a constant, and thus something not guaranteed to be around when the print happens, the compiler optimizes the fmt out, as it is not used, and the __trace_printk_fmt section is not filled. This means the kernel will not allocate the special buffers needed for the trace_printk() and the trace_printk() will not write anything to the tracing buffer. Adding a "__used" to the variable in the __trace_printk_fmt section will keep it around, even though it is set to NULL. This will keep the string from being printed in the debugfs/tracing/printk_formats section as it is not needed. Reported-by: Vlastimil Babka <[email protected]> Fixes: 07d777fe8c398 "tracing: Add percpu buffers for trace_printk()" Cc: [email protected] # v3.5+ Signed-off-by: Steven Rostedt <[email protected]>
2016-03-22Merge branch 'AF_VSOCK-missed-wakeups'David S. Miller1-68/+87
Claudio Imbrenda says: ==================== AF_VSOCK: Shrink the area influenced by prepare_to_wait This patchset applies on net-next. I think I found a problem with the patch submitted by Laura Abbott ( https://lkml.org/lkml/2016/2/4/711 ): we might miss wakeups. Since the condition is not checked between the prepare_to_wait and the schedule(), if a wakeup happens after the condition is checked but before the sleep happens, and we miss it. ( A description of the problem can be found here: http://www.makelinux.net/ldd3/chp-6-sect-2 ). The first patch reverts the previous broken patch, while the second patch properly fixes the sleep-while-waiting issue. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-03-22AF_VSOCK: Shrink the area influenced by prepare_to_waitClaudio Imbrenda1-73/+85
When a thread is prepared for waiting by calling prepare_to_wait, sleeping is not allowed until either the wait has taken place or finish_wait has been called. The existing code in af_vsock imposed unnecessary no-sleep assumptions to a broad list of backend functions. This patch shrinks the influence of prepare_to_wait to the area where it is strictly needed, therefore relaxing the no-sleep restriction there. Signed-off-by: Claudio Imbrenda <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-22Revert "vsock: Fix blocking ops call in prepare_to_wait"Claudio Imbrenda1-6/+13
This reverts commit 5988818008257ca42010d6b43a3e0e48afec9898 ("vsock: Fix blocking ops call in prepare_to_wait") The commit reverted with this patch caused us to potentially miss wakeups. Since the condition is not checked between the prepare_to_wait and the schedule(), if a wakeup happens after the condition is checked but before the sleep happens, we will miss it. ( A description of the problem can be found here: http://www.makelinux.net/ldd3/chp-6-sect-2 ). By reverting the patch, the behaviour is still incorrect (since we shouldn't sleep between the prepare_to_wait and the schedule) but at least it will not miss wakeups. The next patch in the series actually fixes the behaviour. Signed-off-by: Claudio Imbrenda <[email protected]> Signed-off-by: David S. Miller <[email protected]>