Age | Commit message (Collapse) | Author | Files | Lines |
|
When I tried to send a patch to remove it, Andi told me we still need to
keep compabitlies for old libc, so we can't remove this completely. Then
just make it default to n and remove the doc from
feature-removal-schedule.txt.
Signed-off-by: WANG Cong <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Adding support for poll() in sysctl fs allows userspace to receive
notifications of changes in sysctl entries. This adds a infrastructure to
allow files in sysctl fs to be pollable and implements it for hostname and
domainname.
[[email protected]: s/declare/define/ for definitions]
Signed-off-by: Lucas De Marchi <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Kay Sievers <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Update rapidio.txt to reflect changes from recent patch.
See http://marc.info/?l=linux-kernel&m=131285620113589&w=2 for details.
Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Liu Gang <[email protected]>
Cc: Micha Nelissen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Modify Ethernet addess macros to be compatible with BE/LE platforms
Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Chul Kim <[email protected]>
Cc: Kumar Gala <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Li Yang <[email protected]>
Cc: <[email protected]> [2.6.39+]
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The "goto cleanup" path can deference "rswitch" when it is NULL.
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Alexandre Bounine <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Kumar Gala <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Chul Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add RapidIO mport driver for IDT TSI721 PCI Express-to-SRIO bridge device.
The driver provides full set of callback functions defined for mport
devices in RapidIO subsystem. It also is compatible with current version
of RIONET driver (Ethernet over RapidIO messaging services).
This patch is applicable to kernel versions starting from 2.6.39.
Signed-off-by: Alexandre Bounine <[email protected]>
Signed-off-by: Chul Kim <[email protected]>
Cc: Kumar Gala <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Li Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
port failed to initialize
The "struct rio_mport" contains a member of master port I/O memory
resource structure "struct resource iores". This resource will be read
from device tree and be used for rapidio R/W transaction memory space.
Rapidio requests the port I/O memory resource under the root resource
"iomem_resource".
struct rio_mport *port;
port = kzalloc(sizeof(struct rio_mport), GFP_KERNEL);
request_resource(&iomem_resource, &port->iores);
When port failed to initialize, allocated "rio_mport" structure memory
will be freed, and the port I/O memory resource structure pointer
"&port->iores" will be invalid. If other requests resource under
"iomem_resource", "&port->iores" node may be operated in the child
resources list and this will cause the system to crash.
So the requested port I/O memory resource should be released before
freeing allocated "rio_mport" structure.
Signed-off-by: Liu Gang <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Grant Likely <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
complete
The discovered bit in PGCCSR register indicates if the device has been
discovered by system host. In Rapidio systems, some agent devices can also
be master devices. They can issue requests into the system.
Signed-off-by: Liu Gang <[email protected]>
Acked-by: Alexandre Bounine <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Expand root=PARTUUID=UUID syntax to support selecting a root partition by
integer offset from a known, unique partition. This approach provides
similar properties to specifying a device and partition number, but using
the UUID as the unique path prior to evaluating the offset.
For example,
root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1
selects the partition with UUID 99DE.. then select the next
partition.
This change is motivated by a particular usecase in Chromium OS where the
bootloader can easily determine what partition it is on (by UUID) but
doesn't perform general partition table walking.
That said, support for this model provides a direct mechanism for the user
to modify the root partition to boot without specifically needing to
extract each UUID or update the bootloader explicitly when the root
partition UUID is changed (if it is recreated to be larger, for instance).
Pinning to a /boot-style partition UUID allows the arbitrary root
partition reconfiguration/modifications with slightly less ambiguity than
just [dev][partition] and less stringency than the specific root partition
UUID.
[[email protected]: fix init sections warning]
Signed-off-by: Will Drewry <[email protected]>
Cc: Kay Sievers <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Trond Myklebust <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
For the sysvsem undo, each task struct contains a sysv_sem structure with
a pointer to the undo information.
This pointer is only necessary if sysvipc is enabled - thus the pointer
can be made conditional on CONFIG_SYSVIPC.
Signed-off-by: Manfred Spraul <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
include/linux/sem.h contains several structures that are only used within
ipc/sem.c.
The patch moves them into ipc/sem.c - there is no need to expose the
structures to the whole kernel.
No functional changes, only whitespace cleanups and 80-char per line
fixes.
Signed-off-by: Manfred Spraul <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
semtimedop() does not handle spurious wakeups, it returns -EINTR to user
space. Most other schedule() users would just loop and not return to user
space. The patch adds such a loop to semtimedop()
Signed-off-by: Manfred Spraul <[email protected]>
Reported-by: Peter Zijlstra <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
sys_semtimedop() may return -EIDRM although the semaphore operation
completed successfully:
thread 1: thread 2:
semtimedop(), sleeps
semop():
* acquires sem_lock()
semtimedop() woken up due to timeout
sem_lock() loops
* notices that thread 2 could be completed.
* performs the operations that thread 2 is sleeping on.
* marks the semaphore operation as IN_WAKEUP
* drops sem_lock(), does wakeup, sets return code to 0
* thread delayed due to interrupt, whatever
* returns to user space
* thread still delayed
semctl(IPC_RMID)
* acquires sem_lock()
* ipc_rmid(), ipcp->deleted=1
* drops sem_lock()
* thread finally continues - but seem_lock()
now fails due to ipcp->deleted == 1
* returns -EIDRM instead of 0
The fix is trivial: Always use the return code in queue.status.
In real world, the race probably doesn't matter:
If the semaphore array is destroyed, the app is probably not interested
if the last operation succeeded or was already cancelled.
Signed-off-by: Manfred Spraul <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Mike Galbraith <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
It's often convenient to be able to release resource from IRQ context.
Make ida_simple_*() use irqsave/restore spin ops so that they are IRQ
safe.
Signed-off-by: Tejun Heo <[email protected]>
Acked-by: Rusty Russell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
fd* files are restricted to the task's owner, and other users may not get
direct access to them. But one may open any of these files and run any
setuid program, keeping opened file descriptors. As there are permission
checks on open(), but not on readdir() and read(), operations on the kept
file descriptors will not be checked. It makes it possible to violate
procfs permission model.
Reading fdinfo/* may disclosure current fds' position and flags, reading
directory contents of fdinfo/ and fd/ may disclosure the number of opened
files by the target task. This information is not sensible per se, but it
can reveal some private information (like length of a password stored in a
file) under certain conditions.
Used existing (un)lock_trace functions to check for ptrace_may_access(),
but instead of using EPERM return code from it use EACCES to be consistent
with existing proc_pid_follow_link()/proc_pid_readlink() return code. If
they differ, attacker can guess what fds exist by analyzing stat() return
code. Patched handlers: stat() for fd/*, stat() and read() for fdindo/*,
readdir() and lookup() for fd/ and fdinfo/.
Signed-off-by: Vasiliy Kulikov <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
On reading sysctl dirs we should return -EISDIR instead of -EINVAL.
Signed-off-by: Pavel Emelyanov <[email protected]>
Signed-off-by: Cyrill Gorcunov <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
{get,put}_mems_allowed() exist so that general kernel code may locklessly
access a task's set of allowable nodes without having the chance that a
concurrent write will cause the nodemask to be empty on configurations
where MAX_NUMNODES > BITS_PER_LONG.
This could incur a significant delay, however, especially in low memory
conditions because the page allocator is blocking and reclaim requires
get_mems_allowed() itself. It is not atypical to see writes to
cpuset.mems take over 2 seconds to complete, for example. In low memory
conditions, this is problematic because it's one of the most imporant
times to change cpuset.mems in the first place!
The only way a task's set of allowable nodes may change is through cpusets
by writing to cpuset.mems and when attaching a task to a generic code is
not reading the nodemask with get_mems_allowed() at the same time, and
then clearing all the old nodes. This prevents the possibility that a
reader will see an empty nodemask at the same time the writer is storing a
new nodemask.
If at least one node remains unchanged, though, it's possible to simply
set all new nodes and then clear all the old nodes. Changing a task's
nodemask is protected by cgroup_mutex so it's guaranteed that two threads
are not changing the same task's nodemask at the same time, so the
nodemask is guaranteed to be stored before another thread changes it and
determines whether a node remains set or not.
Signed-off-by: David Rientjes <[email protected]>
Cc: Miao Xie <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Paul Menage <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
warning: symbol 'swap_cgroup_ctrl' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <[email protected]>
Cc: Paul Menage <[email protected]>
Cc: Li Zefan <[email protected]>
Acked-by: Balbir Singh <[email protected]>
Cc: Daisuke Nishimura <[email protected]>
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Various code in memcontrol.c () calls this_cpu_read() on the calculations
to be done from two different percpu variables, or does an open-coded
read-modify-write on a single percpu variable.
Disable preemption throughout these operations so that the writes go to
the correct palces.
[[email protected]: added this_cpu to __this_cpu conversion]
Signed-off-by: Johannes Weiner <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: Daisuke Nishimura <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Christoph Lameter <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is a potential race between a thread charging a page and another
thread putting it back to the LRU list:
charge: putback:
SetPageCgroupUsed SetPageLRU
PageLRU && add to memcg LRU PageCgroupUsed && add to memcg LRU
The order of setting one flag and checking the other is crucial, otherwise
the charge may observe !PageLRU while the putback observes !PageCgroupUsed
and the page is not linked to the memcg LRU at all.
Global memory pressure may fix this by trying to isolate and putback the
page for reclaim, where that putback would link it to the memcg LRU again.
Without that, the memory cgroup is undeletable due to a charge whose
physical page can not be found and moved out.
Signed-off-by: Johannes Weiner <[email protected]>
Cc: Ying Han <[email protected]>
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
Cc: Daisuke Nishimura <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Reclaim decides to skip scanning an active list when the corresponding
inactive list is above a certain size in comparison to leave the assumed
working set alone while there are still enough reclaim candidates around.
The memcg implementation of comparing those lists instead reports whether
the whole memcg is low on the requested type of inactive pages,
considering all nodes and zones.
This can lead to an oversized active list not being scanned because of the
state of the other lists in the memcg, as well as an active list being
scanned while its corresponding inactive list has enough pages.
Not only is this wrong, it's also a scalability hazard, because the global
memory state over all nodes and zones has to be gathered for each memcg
and zone scanned.
Make these calculations purely based on the size of the two LRU lists
that are actually affected by the outcome of the decision.
Signed-off-by: Johannes Weiner <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
Cc: Daisuke Nishimura <[email protected]>
Cc: Balbir Singh <[email protected]>
Reviewed-by: Minchan Kim <[email protected]>
Reviewed-by: Ying Han <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
If somebody is touching data too early, it might be easier to diagnose a
problem when dereferencing NULL at mem->info.nodeinfo[node] than trying to
understand why mem_cgroup_per_zone is [un|partly]initialized.
Signed-off-by: Igor Mammedov <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Before calling schedule_timeout(), task state should be changed.
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The memcg code sometimes uses "struct mem_cgroup *mem" and sometimes uses
"struct mem_cgroup *memcg". Rename all mem variables to memcg in source
file.
Signed-off-by: Raghavendra K T <[email protected]>
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When the cgroup base was allocated with kmalloc, it was necessary to
annotate the variable with kmemleak_not_leak(). But because it has
recently been changed to be allocated with alloc_page() (which skips
kmemleak checks) causes a warning on boot up.
I was triggering this output:
allocated 8388608 bytes of page_cgroup
please try 'cgroup_disable=memory' option if you don't want memory cgroups
kmemleak: Trying to color unknown object at 0xf5840000 as Grey
Pid: 0, comm: swapper Not tainted 3.0.0-test #12
Call Trace:
[<c17e34e6>] ? printk+0x1d/0x1f^M
[<c10e2941>] paint_ptr+0x4f/0x78
[<c178ab57>] kmemleak_not_leak+0x58/0x7d
[<c108ae9f>] ? __rcu_read_unlock+0x9/0x7d
[<c1cdb462>] kmemleak_init+0x19d/0x1e9
[<c1cbf771>] start_kernel+0x346/0x3ec
[<c1cbf1b4>] ? loglevel+0x18/0x18
[<c1cbf0aa>] i386_start_kernel+0xaa/0xb0
After a bit of debugging I tracked the object 0xf840000 (and others) down
to the cgroup code. The change from allocating base with kmalloc to
alloc_page() has the base not calling kmemleak_alloc() which adds the
pointer to the object_tree_root, but kmemleak_not_leak() adds it to the
crt_early_log[] table. On kmemleak_init(), the entry is found in the
early_log[] but not the object_tree_root, and this error message is
displayed.
If alloc_page() fails then it defaults back to vmalloc() which still uses
the kmemleak_alloc() which makes us still need the kmemleak_not_leak()
call. The solution is to call the kmemleak_alloc() directly if the
alloc_page() succeeds.
Reviewed-by: Michal Hocko <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Jonathan Nieder <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
If a task has exited to the point it has called cgroup_exit() already,
then we can't migrate it to another cgroup anymore.
This can happen when we are attaching a task to a new cgroup between the
call to ->can_attach_task() on subsystems and the migration that is
eventually tried in cgroup_task_migrate().
In this case cgroup_task_migrate() returns -ESRCH and we don't want to
attach the task to the subsystems because the attachment to the new cgroup
itself failed.
Fix this by only calling ->attach_task() on the subsystems if the cgroup
migration succeeded.
Reported-by: Oleg Nesterov <[email protected]>
Signed-off-by: Ben Blum <[email protected]>
Acked-by: Paul Menage <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Tejun Heo <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix unstable tasklist locking in cgroup_attach_proc.
According to this thread - https://lkml.org/lkml/2011/7/27/243 - RCU is
not sufficient to guarantee the tasklist is stable w.r.t. de_thread and
exit. Taking tasklist_lock for reading, instead of rcu_read_lock, ensures
proper exclusion.
Signed-off-by: Ben Blum <[email protected]>
Acked-by: Paul Menage <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Neil Brown <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Clement Lecigne reports a filesystem which causes a kernel oops in
hfs_find_init() trying to dereference sb->ext_tree which is NULL.
This proves to be because the filesystem has a corrupted MDB extent
record, where the extents file does not fit into the first three extents
in the file record (the first blocks).
In hfs_get_block() when looking up the blocks for the extent file
(HFS_EXT_CNID), it fails the first blocks special case, and falls
through to the extent code (which ultimately calls hfs_find_init())
which is in the process of being initialised.
Hfs avoids this scenario by always having the extents b-tree fitting
into the first blocks (the extents B-tree can't have overflow extents).
The fix is to check at mount time that the B-tree fits into first
blocks, i.e. fail if HFS_I(inode)->alloc_blocks >=
HFS_I(inode)->first_blocks
Note, the existing commit 47f365eb57573 ("hfs: fix oops on mount with
corrupted btree extent records") becomes subsumed into this as a special
case, but only for the extents B-tree (HFS_EXT_CNID), it is perfectly
acceptable for the catalog B-Tree file to grow beyond three extents,
with the remaining extent descriptors in the extents overfow.
This fixes CVE-2011-2203
Reported-by: Clement LECIGNE <[email protected]>
Signed-off-by: Phillip Lougher <[email protected]>
Cc: Jeff Mahoney <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Use mpage_readpages() instead of multiple calls to isofs_readpage() to
reduce the CPU utilization and make performance higher.
Signed-off-by: Namjae Jeon <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Jan Kara <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
One can get this information from minix/inode.c, but adding the
explanations at the definition sites is more appropriate.
Signed-off-by: Sami Kerola <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
.exit.text
The driver is added using platform_driver_probe(), so the callbacks can be
discarded more aggessively.
Signed-off-by: Uwe Kleine-König <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add initial support for the microchip mcp7941x series of real time clocks.
The mcp7941x series is generally compatible with the ds1307 and ds1337 rtc
devices from dallas semiconductor. minor differences include a backup
battery enable bit, and the polarity of the oscillator enable bit.
Signed-off-by: David Anders <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Reviewed-by: Wolfram Sang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This is the one use of an ida that doesn't retry on receiving -EAGAIN.
I'm assuming do so will cause no harm and may help on a rare occasion.
Signed-off-by: Jonathan Cameron <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When a cramfs ramdisk padded with 512 bytes is given to the kernel, the
current identify_ramdisk_image function fails to identify it.
Tested with a padded cramfs image on an ARM based board.
Signed-off-by: Neil Armstrong <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Al Viro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Since ramfs is hard-selected to "y", the module leftovers make no sense.
Signed-off-by: Richard Weinberger <[email protected]>
Reviewed-by: WANG Cong <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The case of address space randomization being disabled in runtime through
randomize_va_space sysctl is not treated properly in load_elf_binary(),
resulting in SIGKILL coming at exec() time for certain PIE-linked binaries
in case the randomization has been disabled at runtime prior to calling
exec().
Handle the randomize_va_space == 0 case the same way as if we were not
supporting .text randomization at all.
Based on original patch by H.J. Lu and Josh Boyer.
Signed-off-by: Jiri Kosina <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Russell King <[email protected]>
Cc: H.J. Lu <[email protected]>
Cc: <[email protected]>
Tested-by: Josh Boyer <[email protected]>
Acked-by: Nicolas Pitre <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This avoids duplicating the function in every arch gup_fast.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: David Gibson <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Up to this point the code assumed old refcounting for hugepages (pre-thp).
This updates the code directly to the thp mapcount tail page refcounting.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: David Gibson <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Acked-by: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
s390 didn't return 0 in that case, if it's rolling back the *nr pointer it
should also return zero to avoid adding pages to the array at the wrong
offset.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: David Gibson <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Up to this point the code assumed old refcounting for hugepages (pre-thp).
This updates the code directly to the thp mapcount tail page refcounting.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: David Gibson <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
powerpc didn't return 0 in that case, if it's rolling back the *nr pointer
it should also return zero to avoid adding pages to the array at the wrong
offset.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Acked-by: David Gibson <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: David Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Up to this point the code assumed old refcounting for hugepages (pre-thp).
This updates the code directly to the thp mapcount tail page refcounting.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: David Gibson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We only taken "refs" pins on the head page not "*nr" pins.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Acked-by: David Gibson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
"page" may have changed to point to the next hugepage after the loop
completed, The references have been taken on the head page, so the
put_page must happen there too.
This is a longstanding issue pre-thp inclusion.
It's totally unclear how these page_cache_add_speculative and
pte_val(pte) != pte_val(*ptep) checks are necessary across all the
powerpc gup_fast code, when x86 doesn't need any of that: there's no way
the page can be freed with irq disabled so we're guaranteed the
atomic_inc will happen on a page with page_count > 0 (so not needing the
speculative check).
The pte check is also meaningless on x86: no need to rollback on x86 if
the pte changed, because the pte can still change a CPU tick after the
check succeeded and it won't be rolled back in that case. The important
thing is we got a reference on a valid page that was mapped there a CPU
tick ago. So not knowing the soft tlb refill code of ppc64 in great
detail I'm not removing the "speculative" page_count increase and the
pte checks across all the code, but unless there's a strong reason for
it they should be later cleaned up too.
If a pte can change from huge to non-huge (like it could happen with
THP) passing a pte_t *ptep to gup_hugepte() would also require to repeat
the is_hugepd in gup_hugepte(), but that shouldn't happen with hugetlbfs
only so I'm not altering that.
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Acked-by: David Gibson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This part of gup_fast doesn't seem capable of handling hugetlbfs ptes,
those should be handled by gup_hugepd only, so these checks are
superfluous.
Plus if this wasn't a noop, it would have oopsed because, the insistence
of using the speculative refcounting would trigger a VM_BUG_ON if a tail
page was encountered in the page_cache_get_speculative().
Signed-off-by: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Acked-by: David Gibson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Michel while working on the working set estimation code, noticed that
calling get_page_unless_zero() on a random pfn_to_page(random_pfn)
wasn't safe, if the pfn ended up being a tail page of a transparent
hugepage under splitting by __split_huge_page_refcount().
He then found the problem could also theoretically materialize with
page_cache_get_speculative() during the speculative radix tree lookups
that uses get_page_unless_zero() in SMP if the radix tree page is freed
and reallocated and get_user_pages is called on it before
page_cache_get_speculative has a chance to call get_page_unless_zero().
So the best way to fix the problem is to keep page_tail->_count zero at
all times. This will guarantee that get_page_unless_zero() can never
succeed on any tail page. page_tail->_mapcount is guaranteed zero and
is unused for all tail pages of a compound page, so we can simply
account the tail page references there and transfer them to
tail_page->_count in __split_huge_page_refcount() (in addition to the
head_page->_mapcount).
While debugging this s/_count/_mapcount/ change I also noticed get_page is
called by direct-io.c on pages returned by get_user_pages. That wasn't
entirely safe because the two atomic_inc in get_page weren't atomic. As
opposed to other get_user_page users like secondary-MMU page fault to
establish the shadow pagetables would never call any superflous get_page
after get_user_page returns. It's safer to make get_page universally safe
for tail pages and to use get_page_foll() within follow_page (inside
get_user_pages()). get_page_foll() is safe to do the refcounting for tail
pages without taking any locks because it is run within PT lock protected
critical sections (PT lock for pte and page_table_lock for
pmd_trans_huge).
The standard get_page() as invoked by direct-io instead will now take
the compound_lock but still only for tail pages. The direct-io paths
are usually I/O bound and the compound_lock is per THP so very
finegrined, so there's no risk of scalability issues with it. A simple
direct-io benchmarks with all lockdep prove locking and spinlock
debugging infrastructure enabled shows identical performance and no
overhead. So it's worth it. Ideally direct-io should stop calling
get_page() on pages returned by get_user_pages(). The spinlock in
get_page() is already optimized away for no-THP builds but doing
get_page() on tail pages returned by GUP is generally a rare operation
and usually only run in I/O paths.
This new refcounting on page_tail->_mapcount in addition to avoiding new
RCU critical sections will also allow the working set estimation code to
work without any further complexity associated to the tail page
refcounting with THP.
Signed-off-by: Andrea Arcangeli <[email protected]>
Reported-by: Michel Lespinasse <[email protected]>
Reviewed-by: Michel Lespinasse <[email protected]>
Reviewed-by: Minchan Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: David Gibson <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
* git://github.com/rustyrussell/linux:
virtio-blk: use ida to allocate disk index
virtio: Add platform bus driver for memory mapped virtio device
virtio: Dont add "config" to list for !per_vq_vector
virtio: console: wait for first console port for early console output
virtio: console: add port stats for bytes received, sent and discarded
virtio: console: make discard_port_data() use get_inbuf()
virtio: console: rename variable
virtio: console: make get_inbuf() return port->inbuf if present
virtio: console: Fix return type for get_inbuf()
virtio: console: Use wait_event_freezable instead of _interruptible
virtio: console: Ignore port name update request if name already set
virtio: console: Fix indentation
virtio: modify vring_init and vring_size to take account of the layout containing *_event_idx
virtio.h: correct comment for struct virtio_driver
virtio-net: Use virtio_config_val() for retrieving config
virtio_config: Add virtio_config_val_len()
virtio-console: Use virtio_config_val() for retrieving config
|
|
Add sys_process_vm_readv and sys_process_vm_writev to ia64
syscall table. Passes tests at http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz
Signed-off-by: Tony Luck <[email protected]>
|
|
Just clean-up what GCC caught.
Signed-off-by: Takashi Iwai <[email protected]>
|
|
When the driver finds multiple ADCs, it tries to create an alternative
capture PCM stream. However, these secondary ADCs might be useless or
in uncontrolled paths in some cases, e.g. when auto-mic or dynamic
ADC-switching is enabled. Also, when only a single capture source is
available, the multi-streams don't make sense, too.
With this patch, the driver checks such condition and skips the alt
stream appropriately.
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
|