|
The kernel perf event creation path shouldn't use find_task_by_vpid()
because a vpid exists in a specific namespace. find_task_by_vpid() uses
current's pid namespace which isn't always the correct namespace to use
for the vpid in all the places perf_event_create_kernel_counter() (and
thus find_get_context()) is called.
The goal is to clean up pid namespace handling and prevent bugs like:
https://bugzilla.kernel.org/show_bug.cgi?id=17281
Instead of using pids, switch find_get_context() to use task struct
pointers directly. The syscall is responsible for resolving the pid to
a task struct. This moves the pid namespace resolution into the syscall
much like every other syscall that takes pid parameters.
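The namespace problem can be illustrated with a small userspace model. This is only a sketch: model_task, model_pid_ns, model_find_task() and model_find_get_context() are hypothetical names invented for the illustration, not kernel APIs. It shows that the same vpid names different tasks in different namespaces, which is why the lookup is done once at the syscall boundary and only a task pointer is passed further down.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_VPID 8	/* hypothetical table size for this model */

struct model_task {
	const char *comm;
};

/* Each namespace has its own vpid -> task table. */
struct model_pid_ns {
	struct model_task *by_vpid[MODEL_MAX_VPID];
};

/* Resolve a vpid in an explicitly chosen namespace (the "syscall" step). */
static struct model_task *model_find_task(const struct model_pid_ns *ns, int vpid)
{
	if (vpid < 0 || vpid >= MODEL_MAX_VPID)
		return NULL;
	return ns->by_vpid[vpid];
}

/* After the change: the context lookup never sees a pid, only a task. */
static const char *model_find_get_context(const struct model_task *task)
{
	assert(task != NULL);
	return task->comm;
}
```

In-kernel callers that already hold a task pointer can then skip the pid lookup entirely.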
Signed-off-by: Matt Helsley <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: Robin Green <[email protected]>
Cc: Prasad <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
LKML-Reference: <a134e5e392ab0204961fd1a62c84a222bf5874a9.1284407763.git.matthltc@us.ibm.com>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
dquot: do full inode dirty in allocating space
|
|
* 'next-spi' of git://git.secretlab.ca/git/linux-2.6:
spi/pl022: move probe call to subsys_initcall()
powerpc/5200: mpc52xx_uart.c: Add of_node_put to avoid memory leak
spi/pl022: fix APB pclk power regression on U300
spi/spi_s3c64xx: Warn if PIO transfers time out
spi/s3c64xx: Fix incorrect reuse of 'val' local variable.
spi/s3c64xx: Fix compilation warning
spi/dw_spi: clean the cs_control code
spi/dw_spi: Allow interrupt sharing
spi/spi_s3c64xx: Increase dead reckoning time in wait_for_xfer()
spi/spi_s3c64xx: Move to subsys_initcall()
spi: free children in spi_unregister_master, not siblings
gpiolib: Add 'struct gpio_chip' forward declaration for !GPIOLIB case
of: Fix missing includes - ll_temac
spi/spi_s3c64xx: Staticise non-exported functions
spi/spi_s3c64xx: Make probe more robust against missing board config
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (28 commits)
ipheth: remove incorrect devtype to WWAN
MAINTAINERS: Add CAIF
sctp: fix test for end of loop
KS8851: Correct RX packet allocation
udp: add rehash on connect()
net: blackhole route should always be recalculated
ipv4: Suppress lockdep-RCU false positive in FIB trie (3)
niu: Fix kernel buffer overflow for ETHTOOL_GRXCLSRLALL
ipvs: fix active FTP
gro: Re-fix different skb headrooms
via-velocity: Turn scatter-gather support back off.
ipv4: Fix reverse path filtering with multipath routing.
UNIX: Do not loop forever at unix_autobind().
PATCH: b44 Handle RX FIFO overflow better (simplified)
irda: off by one
3c59x: Fix deadlock in vortex_error()
netfilter: discard overlapping IPv6 fragment
ipv6: discard overlapping fragment
net: fix tx queue selection for bridged devices implementing select_queue
bonding: Fix jiffies overflow problems (again)
...
Fix up trivial conflicts due to the same cgroup API thinko fix going
through both Andrew and the networking tree. However, there were small
differences between the two, with Andrew's version generally being the
nicer one, and the one I merged first. So pick that one.
Conflicts in: include/linux/cgroup.h and kernel/cgroup.c
|
|
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: Range check cpu in blk_cpu_to_group
scatterlist: prevent invalid free when alloc fails
writeback: Fix lost wake-up shutting down writeback thread
writeback: do not lose wakeup events when forking bdi threads
cciss: fix reporting of max queue depth since init
block: switch s390 tape_block and mg_disk to elevator_change()
block: add function call to switch the IO scheduler from a driver
fs/bio-integrity.c: return -ENOMEM on kmalloc failure
bio-integrity.c: remove dependency on __GFP_NOFAIL
BLOCK: fix bio.bi_rw handling
block: put dev->kobj in blk_register_queue fail path
cciss: handle allocation failure
cfq-iosched: Documentation help for new tunables
cfq-iosched: blktrace print per slice sector stats
cfq-iosched: Implement tunable group_idle
cfq-iosched: Do group share accounting in IOPS when slice_idle=0
cfq-iosched: Do not idle if slice_idle=0
cciss: disable doorbell reset on reset_devices
blkio: Fix return code for mkdir calls
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata-sff: Reenable Port Multiplier after libata-sff remodeling.
libata: skip EH autopsy and recovery during suspend
ahci: AHCI and RAID mode SATA patch for Intel Patsburg DeviceIDs
ata_piix: IDE Mode SATA patch for Intel Patsburg DeviceIDs
libata,pata_via: revert ata_wait_idle() removal from ata_sff/via_tf_load()
ahci: fix hang on failed softreset
pata_artop: Fix device ID parity check
|
|
Keep track of the link on which the current request is in progress.
It allows support of links behind a port multiplier.
Not all of libata-sff is PMP compliant. Code for the native BMDMA controller
does not take PMP into account.
Tested on Marvell 7042 and Sil7526.
Signed-off-by: Gwendal Grignou <[email protected]>
Signed-off-by: Jeff Garzik <[email protected]>
|
|
For some mysterious reason, certain hardware reacts badly to usual EH
actions while the system is going for suspend. As the devices won't
be needed until the system is resumed, ask EH to skip usual autopsy
and recovery and proceed directly to suspend.
Signed-off-by: Tejun Heo <[email protected]>
Tested-by: Stephan Diestelhorst <[email protected]>
Cc: [email protected]
Signed-off-by: Jeff Garzik <[email protected]>
|
|
is low and kswapd is awake
Ordinarily watermark checks are based on the vmstat NR_FREE_PAGES as it is
cheaper than scanning a number of lists. To avoid synchronization
overhead, counter deltas are maintained on a per-cpu basis and drained
both periodically and when the delta is above a threshold. On large CPU
systems, the difference between the estimated and real value of
NR_FREE_PAGES can be very high. If NR_FREE_PAGES is much higher than the
number of real free pages in the buddy allocator, the VM can allocate pages
below the min watermark, at worst reducing the real number of pages to zero. Even if
the OOM killer kills some victim for freeing memory, it may not free
memory if the exit path requires a new page resulting in livelock.
This patch introduces a zone_page_state_snapshot() function (courtesy of
Christoph) that takes a slightly more accurate view of an arbitrary vmstat
counter. It is used to read NR_FREE_PAGES while kswapd is awake to avoid
the watermark being accidentally broken. The estimate is not perfect and
may result in cache line bounces but is expected to be lighter than the
IPI calls necessary to continually drain the per-cpu counters while kswapd
is awake.
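The difference between the cheap read and the snapshot read can be sketched in userspace C. This is a toy model under stated assumptions: counter_model, counter_read() and counter_snapshot() are hypothetical names, the CPU count is fixed, and clamping at zero is a choice of the sketch. It is not the kernel implementation.

```c
#include <assert.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for this model */

/* Toy stand-in for one vmstat counter: a global value plus per-CPU
 * deltas that have not been folded into the global value yet. */
struct counter_model {
	long global;
	long cpu_delta[NR_CPUS_MODEL];
};

/* Cheap read: global value only; may be off by the sum of the deltas. */
static long counter_read(const struct counter_model *c)
{
	return c->global;
}

/* Snapshot read: global value plus every pending per-CPU delta.
 * Touches every CPU's data, hence the possible cache line bounces. */
static long counter_snapshot(const struct counter_model *c)
{
	long x = c->global;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		x += c->cpu_delta[cpu];
	return x < 0 ? 0 : x;
}
```

With a global value of 100 and pending deltas of -20, -30, -10 and -25, the cheap read still reports 100 while the snapshot reports 15, which is the kind of gap that lets allocations slip below the watermark.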
Signed-off-by: Christoph Lameter <[email protected]>
Signed-off-by: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Tests with recent firmware on Intel X25-M 80GB and OCZ Vertex 60GB SSDs
show a shift since I last tested in December: in part because of firmware
updates, in part because of the necessary move from barriers to awaiting
completion at the block layer. While discard at swapon still shows as
slightly beneficial on both, discarding 1MB swap cluster when allocating
is now disadvantageous: adds 25% overhead on Intel, adds 230% on OCZ (YMMV).
Surrender: discard as presently implemented is more hindrance than help
for swap; but might prove useful on other devices, or with improvements.
So continue to do the discard at swapon, but make discard while swapping
conditional on a SWAP_FLAG_DISCARD to sys_swapon() (which has been using
only the lower 16 bits of int flags).
We can add a --discard or -d to swapon(8), and a "discard" to swap in
/etc/fstab: matching the mount option for btrfs, ext4, fat, gfs2, nilfs2.
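As a rough illustration of how the flags word is laid out, the sketch below builds a swapon(2) flags value in userspace. The macro values are believed to match include/linux/swap.h after this change but should be treated as assumptions; make_swapon_flags() is a hypothetical helper, not part of any library.

```c
#include <assert.h>

/* Assumed values; guarded so that a system header could supply them. */
#ifndef SWAP_FLAG_PREFER
#define SWAP_FLAG_PREFER	0x8000	/* priority field is valid */
#endif
#ifndef SWAP_FLAG_PRIO_MASK
#define SWAP_FLAG_PRIO_MASK	0x7fff	/* priority lives in the low bits */
#endif
#ifndef SWAP_FLAG_DISCARD
#define SWAP_FLAG_DISCARD	0x10000	/* first flag above the low 16 bits */
#endif

/* Hypothetical helper: build the flags argument for swapon(2).
 * A negative prio means "no explicit priority". */
static int make_swapon_flags(int prio, int want_discard)
{
	int flags = 0;

	if (prio >= 0)
		flags |= SWAP_FLAG_PREFER | (prio & SWAP_FLAG_PRIO_MASK);
	if (want_discard)
		flags |= SWAP_FLAG_DISCARD;
	return flags;
}
```

Because the discard bit sits above the 16 bits already in use, existing callers that never set it keep their current behavior apart from losing the discard while swapping.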
Signed-off-by: Hugh Dickins <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Nigel Cunningham <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Please revert 2.6.36-rc commit d2997b1042ec150616c1963b5e5e919ffd0b0ebf
"hibernation: freeze swap at hibernation". It complicated matters by
adding a second swap allocation path, just for hibernation; without in any
way fixing the issue that it was intended to address - page reclaim after
fixing the hibernation image might free swap from a page already imaged as
swapcache, letting its swap be reallocated to store a different page of
the image: resulting in data corruption if the imaged page were freed as
clean then swapped back in. Pages freed to si->swap_map were still in
danger of being reallocated by the alternative allocation path.
I guess it inadvertently fixed slow SSD swap allocation for hibernation,
as reported by Nigel Cunningham: by missing out the discards that occur on
the usual swap allocation path; but that was unintentional, and needs a
separate fix.
Signed-off-by: Hugh Dickins <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Ondrej Zary <[email protected]>
Cc: Andrea Gelmini <[email protected]>
Cc: Balbir Singh <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Nigel Cunningham <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There's been some recent confusion about error checking GPIO numbers.
Briefly, it should be handled mostly during setup, when gpio_request() is
called, and NEVER by expecting gpio_is_valid() to report more than
never-usable GPIO numbers.
[[email protected]: terminate unterminated comment]
Signed-off-by: David Brownell <[email protected]>
Cc: "Eric Miao" <[email protected]>
Cc: "Ryan Mallon" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Replace the arbitrary software-reset call from the device-probe
method, because:
- It is defective. To work correctly, it should be two byte writes,
not a single word write. As it stands, it does nothing.
- Some devices with sx150x expanders installed have their NRESET pins
ganged on the same line, so resetting one causes the others to reset -
not a nice thing to do arbitrarily!
- The probe, usually taking place at boot, implies a recent hard-reset,
so a software reset at this point is just a waste of energy anyway.
Therefore, make it optional, defaulting to off, as this will match the
common case of probing at powerup and also matches the current broken
no-op behavior.
Signed-off-by: Gregory Bean <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The pte_same check is reliable only if the swap entry remains pinned (by
the page lock on swapcache). We've also to ensure the swapcache isn't
removed before we take the lock as try_to_free_swap won't care about the
page pin.
One of the possible impacts of this patch is that a KSM-shared page can
point to the anon_vma of another process, which could exit before the page
is freed.
This can leave a page with a pointer to a recycled anon_vma object, or
worse, a pointer to something that is no longer an anon_vma.
[[email protected]: changelog help]
Signed-off-by: Andrea Arcangeli <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add cgroup_attach_task_all()
The existing cgroup_attach_task_current_cg() API is called by a thread to
attach another thread to all of its cgroups; this is unsuitable for cases
where a privileged task wants to attach itself to the cgroups of a less
privileged one, since the call must be made from the context of the target
task.
This patch adds a more generic cgroup_attach_task_all() API that allows
both the source task and to-be-moved task to be specified.
cgroup_attach_task_current_cg() becomes a specialization of the more
generic new function.
[[email protected]: rewrote changelog]
[[email protected]: address reviewer comments]
Signed-off-by: Michael S. Tsirkin <[email protected]>
Tested-by: Alex Williamson <[email protected]>
Acked-by: Paul Menage <[email protected]>
Cc: Li Zefan <[email protected]>
Cc: Ben Blum <[email protected]>
Cc: Sridhar Samudrala <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Some macro parameter references inside the typeof() operator are not enclosed
in parentheses. It is safer to add them.
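A minimal sketch of why the parentheses matter, using the GNU __typeof__ extension. The macros below are hypothetical and are not the ones touched by this patch. When the argument is an expression such as p + 1, the unparenthesized form asks for the type of (*p) + 1 instead of the type of *(p + 1).

```c
#include <assert.h>
#include <stddef.h>

/* Unparenthesized parameter: with ptr = p + 1 this expands to
 * __typeof__(*p + 1), the type of an arithmetic expression. */
#define ELEM_SIZE_UNSAFE(ptr)	sizeof(__typeof__(*ptr))

/* Parenthesized parameter: expands to __typeof__(*(p + 1)),
 * which is the element type the caller meant. */
#define ELEM_SIZE_SAFE(ptr)	sizeof(__typeof__(*(ptr)))

/* For a char pointer, *p + 1 is promoted to int. */
static size_t unsafe_size_of_next_char(char *p)
{
	(void)p;	/* the operand of __typeof__ is not evaluated here */
	return ELEM_SIZE_UNSAFE(p + 1);
}

static size_t safe_size_of_next_char(char *p)
{
	(void)p;
	return ELEM_SIZE_SAFE(p + 1);
}
```

The two macros agree for a plain identifier argument and only diverge once the caller passes a compound expression, which is why such bugs tend to stay hidden.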
Signed-off-by: Huang Ying <[email protected]>
Acked-by: Stefani Seibold <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The introduction of support for SD combo cards breaks the initialization
of all CSR SDIO chips. The GO_IDLE (CMD0) in mmc_sd_get_cid() causes CSR
chips to be reset (this is non-standard behavior).
When initializing an SDIO card check for a combo card by using the memory
present bit in the R4 response to IO_SEND_OP_COND (CMD5). This avoids the
call to mmc_sd_get_cid() on an SDIO-only card.
Signed-off-by: David Vrabel <[email protected]>
Acked-by: Michal Mirolaw <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
asm-generic/hardirq.h needs asm/irq.h which might include
linux/interrupt.h as in the sparc 32 case. At this point
we need irq_cpustat generic definitions, but those are
included later in asm-generic/hardirq.h.
So delay the inclusion of irq.h from asm-generic/hardirq.h a bit;
it doesn't need to be included early.
This fixes:
include/linux/interrupt.h: In function '__raise_softirq_irqoff':
include/linux/interrupt.h:414: error: implicit declaration of function 'local_softirq_pending'
include/linux/interrupt.h:414: error: lvalue required as left operand of assignment
Reported-by: Ingo Molnar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Koki Sanagi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
LKML-Reference: <20100908122557.GA5310@nowhere>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
I missed a perf_event_ctxp user when converting it to an array. Pull this
last user into perf_event.c as well and fix it up.
Signed-off-by: Peter Zijlstra <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Since software events are always schedulable, mixing them up with
hardware events (which are not) can lead to funny scheduling oddities.
Giving them their own context solves this.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Provide the infrastructure for multiple task contexts.
A more flexible approach would have resulted in more pointer chases
in the scheduling hot-paths. This approach has the limitation of a
static number of task contexts.
Since I expect most external PMUs to be system wide, or at least node
wide (as per the intel uncore unit) they won't actually need a task
context.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Allocate per-cpu contexts per pmu.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Give each cpu-context its own timer so that it is a self-contained
entity. This eases the way for per-pmu-per-cpu contexts and
provides the basic infrastructure to allow different rotation
times per pmu.
Things to look at:
- folding the tick and these TICK_NSEC timers
- separate task context rotation
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Separate the swevent hash-table from the cpu_context bits in
preparation for per pmu cpu contexts.
This keeps the swevent hash a global entity.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Neither the overcommit nor the reservation sysfs parameter was
actually working; remove them as they'll only get in the way.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.
The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.
This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).
It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).
The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:
1) We disable the counter:
a) the pmu has per-counter enable bits, we flip that
b) we program a NOP event, preserving the counter state
2) We store the counter state and ignore all read/overflow events
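The add/del/start/stop split can be pictured as a small state machine. The following is a toy userspace model only: model_event, MODEL_EF_START and the model_*() functions are hypothetical names made up for this sketch and do not reproduce the kernel's struct pmu or its flag values.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_EF_START	0x01	/* hypothetical flag: start counting on add */

struct model_event {
	bool scheduled;	/* occupies a counter on the PMU */
	bool stopped;	/* scheduled, but currently not counting */
};

/* start/stop only toggle counting; the event stays on the PMU. */
static void model_start(struct model_event *e, int flags)
{
	(void)flags;
	assert(e->scheduled);
	e->stopped = false;
}

static void model_stop(struct model_event *e, int flags)
{
	(void)flags;
	assert(e->scheduled);
	e->stopped = true;
}

/* add schedules the event; it begins counting only if asked to. */
static void model_add(struct model_event *e, int flags)
{
	e->scheduled = true;
	e->stopped = true;
	if (flags & MODEL_EF_START)
		model_start(e, 0);
}

/* del stops the event and takes it off the PMU. */
static void model_del(struct model_event *e, int flags)
{
	(void)flags;
	model_stop(e, 0);
	e->scheduled = false;
}
```

In this model a throttled counter is simply one that is scheduled and stopped, so no separate throttled state is needed.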
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
Cc: Deng-Cheng Zhu <[email protected]>
Cc: David Miller <[email protected]>
Cc: Michael Cree <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Use hw_perf_event::period_left instead of hw_perf_event::remaining
and win back 8 bytes.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Provide default implementations for the pmu txn methods; this
allows us to remove some conditional code.
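The general pattern is to fill in no-op callbacks when a method is missing so that callers can invoke it unconditionally. The sketch below shows that pattern in userspace C; model_pmu, model_pmu_register() and model_group_sched_in() are hypothetical names for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct model_pmu {
	void (*start_txn)(struct model_pmu *pmu);
	int  (*commit_txn)(struct model_pmu *pmu);
	void (*cancel_txn)(struct model_pmu *pmu);
};

/* Defaults used when a PMU does not implement transactions. */
static void model_nop_void(struct model_pmu *pmu)
{
	(void)pmu;
}

static int model_nop_int(struct model_pmu *pmu)
{
	(void)pmu;
	return 0;	/* "commit succeeded" */
}

/* Fill in any missing txn method at registration time. */
static void model_pmu_register(struct model_pmu *pmu)
{
	if (!pmu->start_txn)
		pmu->start_txn = model_nop_void;
	if (!pmu->commit_txn)
		pmu->commit_txn = model_nop_int;
	if (!pmu->cancel_txn)
		pmu->cancel_txn = model_nop_void;
}

/* Caller side: no "if (pmu->start_txn)" checks are needed any more. */
static int model_group_sched_in(struct model_pmu *pmu)
{
	int ret;

	pmu->start_txn(pmu);
	/* ... the events of the group would be added here ... */
	ret = pmu->commit_txn(pmu);
	if (ret)
		pmu->cancel_txn(pmu);
	return ret;
}
```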
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
Cc: Deng-Cheng Zhu <[email protected]>
Cc: David Miller <[email protected]>
Cc: Michael Cree <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Changes perf_disable() into perf_pmu_disable().
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
Cc: Deng-Cheng Zhu <[email protected]>
Cc: David Miller <[email protected]>
Cc: Michael Cree <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
Cc: Deng-Cheng Zhu <[email protected]>
Cc: David Miller <[email protected]>
Cc: Michael Cree <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Add a simple registration interface for struct pmu. This provides the
infrastructure for removing all the weak functions.
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
Cc: Deng-Cheng Zhu <[email protected]>
Cc: David Miller <[email protected]>
Cc: Michael Cree <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`
Signed-off-by: Peter Zijlstra <[email protected]>
Cc: paulus <[email protected]>
Cc: stephane eranian <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Lin Ming <[email protected]>
Cc: Yanmin <[email protected]>
Cc: Deng-Cheng Zhu <[email protected]>
Cc: David Miller <[email protected]>
Cc: Michael Cree <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
lg_lock_global() currently only acquires spinlocks for online CPUs, but
it's meant to lock all possible CPUs. Lglock-protected resources may be
associated with removed CPUs - and, indeed, that could happen with the
per-superblock open files lists.
At Nick's suggestion, change for_each_online_cpu() to
for_each_possible_cpu() to protect accesses to those resources.
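A toy model of the difference, assuming a fixed number of CPUs represented as bits in a mask. model_lglock, model_lock_mask() and model_covers() are hypothetical names for this sketch; real lglocks use per-CPU spinlocks.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical number of possible CPUs */

struct model_lglock {
	bool locked[MODEL_NR_CPUS];	/* one "lock" per possible CPU */
};

/* Take the per-CPU lock of every CPU whose bit is set in mask. */
static void model_lock_mask(struct model_lglock *lg, unsigned int mask)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (mask & (1u << cpu)) {
			assert(!lg->locked[cpu]);
			lg->locked[cpu] = true;
		}
	}
}

/* Is data that belongs to this CPU protected by the "global" lock? */
static bool model_covers(const struct model_lglock *lg, int cpu)
{
	return lg->locked[cpu];
}
```

If CPU 3 has been removed, locking only the online mask leaves anything still attached to CPU 3 unprotected, while locking the possible mask covers it.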
Cc: Al Viro <[email protected]>
Acked-by: Nick Piggin <[email protected]>
Signed-off-by: Jonathan Corbet <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
So it can be used by all that need to check for that.
Signed-off-by: Stefan Bader <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Alex Shi found a regression when running the ffsb test. The test has several threads,
and each thread creates a small file, writes to it and then deletes it. ffsb
reports about a 20% regression, and Alex bisected it to 43d2932d88e4. The test
calls __mark_inode_dirty 3 times. Without that commit, we take
inode_lock only once, while with it, we take the lock 3 times with flags
(I_DIRTY_SYNC, I_DIRTY_PAGES, I_DIRTY). Perf shows that lock contention increased
too much. The patch below fixes it.
fs is allocating blocks, which usually means file writes and the inode
will be dirtied soon. We fully dirty the inode to reduce some inode_lock
contention in several calls of __mark_inode_dirty.
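The effect on lock traffic can be sketched with a small model. This assumes that the dirtying function returns early, without taking the lock, when all requested bits are already set; the flag values are illustrative, and model_inode and model_mark_inode_dirty() are hypothetical names.

```c
#include <assert.h>

/* Illustrative flag values for this model. */
#define I_DIRTY_SYNC		1
#define I_DIRTY_DATASYNC	2
#define I_DIRTY_PAGES		4
#define I_DIRTY	(I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)

struct model_inode {
	unsigned int state;
	unsigned int lock_acquisitions;	/* stands in for taking inode_lock */
};

static void model_mark_inode_dirty(struct model_inode *inode, unsigned int flags)
{
	/* Fast path: nothing new to record, so no lock is taken. */
	if ((inode->state & flags) == flags)
		return;

	inode->lock_acquisitions++;
	inode->state |= flags;
}
```

Dirtying with I_DIRTY_SYNC, then I_DIRTY_PAGES, then I_DIRTY takes the lock three times in this model. Dirtying fully with I_DIRTY first makes the three later calls hit the fast path, so the lock is taken once.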
Jan Kara: Added comment.
Signed-off-by: Shaohua Li <[email protected]>
Signed-off-by: Alex Shi <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
|
|
master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
|
|
commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
added a secondary hash on UDP, hashed on (local addr, local port).
The problem is that the following sequence:
fd = socket(...)
connect(fd, &remote, ...)
not only selects the remote endpoint (address and port), but also sets the
local address, whereas the UDP stack stored the socket in the secondary hash
table while its local address was still INADDR_ANY (or the IPv6 equivalent).
The sequence is:
- autobind() : choose a random local port, insert socket in hash tables
[while local address is INADDR_ANY]
- connect() : set remote address and port, change local address to IP
given by a route lookup.
When an incoming UDP frame arrives, if more than 10 sockets are found in the
primary hash table, we switch to the secondary table, and fail to find the
socket because its local address changed.
One solution to this problem is to rehash datagram socket if needed.
We add a new rehash(struct socket *) method in "struct proto", and
implement this method for UDP v4 & v6, using a common helper.
This rehashing only takes care of secondary hash table, since primary
hash (based on local port only) is not changed.
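The failure and the fix can be reproduced in a toy model of the secondary hash. Everything here is hypothetical and simplified: model_sock, model_hash2() and the other model_*() functions are invented for the sketch, the hash function is a placeholder, and a socket's bucket is stored as a slot number instead of a list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_BUCKETS	8
#define MODEL_ADDR_ANY	0u

struct model_sock {
	uint32_t laddr;		/* local address */
	uint16_t lport;		/* local port */
	unsigned int slot2;	/* bucket of the secondary hash it sits in */
};

/* Placeholder secondary hash on (local addr, local port). */
static unsigned int model_hash2(uint32_t laddr, uint16_t lport)
{
	return (laddr ^ lport) % MODEL_BUCKETS;
}

/* autobind: the port is chosen while the local address is still ANY. */
static void model_autobind(struct model_sock *sk, uint16_t port)
{
	sk->laddr = MODEL_ADDR_ANY;
	sk->lport = port;
	sk->slot2 = model_hash2(sk->laddr, sk->lport);
}

/* connect: the route lookup picks a local address; optionally rehash. */
static void model_connect(struct model_sock *sk, uint32_t laddr, bool rehash)
{
	sk->laddr = laddr;
	if (rehash)
		sk->slot2 = model_hash2(sk->laddr, sk->lport);
}

/* Lookup: try the exact-address bucket, then the wildcard bucket. */
static bool model_lookup2(const struct model_sock *sk, uint32_t daddr, uint16_t dport)
{
	if (sk->slot2 == model_hash2(daddr, dport) &&
	    sk->laddr == daddr && sk->lport == dport)
		return true;
	return sk->slot2 == model_hash2(MODEL_ADDR_ANY, dport) &&
	       sk->laddr == MODEL_ADDR_ANY && sk->lport == dport;
}
```

Without the rehash, the socket stays in the wildcard bucket but no longer matches as a wildcard socket, so neither bucket yields it.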
Reported-by: Krzysztof Piotr Oledzki <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Tested-by: Krzysztof Piotr Oledzki <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'semaphore-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
semaphore: Add DEFINE_SEMAPHORE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mcheck: Avoid duplicate sysfs links/files for thresholding banks
io-mapping: Fix the address space annotations
x86: Fix the address space annotations of iomap_atomic_prot_pfn()
x86, mm: Fix CONFIG_VMSPLIT_1G and 2G_OPT trampoline
x86, hwmon: Fix unsafe smp_processor_id() in thermal_throttle_add_dev
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
gcc-4.6: kernel/*: Fix unused but set warnings
mutex: Fix annotations to include it in kernel-locking docbook
pid: make setpgid() system call use RCU read-side critical section
MAINTAINERS: Add RCU's public git tree
|
|
- Do not create expectation when forwarding the PORT
command to avoid blocking the connection. The problem is that
nf_conntrack_ftp.c:help() tries to create the same expectation later in
POST_ROUTING and drops the packet with "dropping packet" message after
failure in nf_ct_expect_related.
- Change ip_vs_update_conntrack to alter the conntrack
for related connections from the real server. If we do not alter the reply in
this direction, the next packet from the client sent to vport 20 comes as a NEW
connection. We alter it, but a collision may happen between both
conntracks, and the second conntrack gets destroyed immediately. The
connection gets stuck too.
Signed-off-by: Julian Anastasov <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
commit 052dc7c45i "spi/dw_spi: conditional transfer mode change"
introduced the cs_control code, which has a bug: it uses the bit offset
of the spi mode to set the transfer mode in the control register. It also
forces devices that don't need cs_control to re-configure the
control registers for each spi transfer. This patch fixes both.
Signed-off-by: Feng Tang <[email protected]>
Signed-off-by: Grant Likely <[email protected]>
|
|
The full cleanup of init_MUTEX[_LOCKED] and DECLARE_MUTEX has not been
done. Some of the users are real semaphores and we should name them as
such instead of confusing everyone with "MUTEX".
Provide the infrastructure to finally get rid of init_MUTEX[_LOCKED]
and DECLARE_MUTEX.
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Christoph Hellwig <[email protected]>
LKML-Reference: <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: bus speed strings should be const
PCI hotplug: Fix build with CONFIG_ACPI unset
PCI: PCIe: Remove the port driver module exit routine
PCI: PCIe: Move PCIe PME code to the pcie directory
PCI: PCIe: Disable PCIe port services during port initialization
PCI: PCIe: Ask BIOS for control of all native services at once
ACPI/PCI: Negotiate _OSC control bits before requesting them
ACPI/PCI: Do not preserve _OSC control bits returned by a query
ACPI/PCI: Make acpi_pci_query_osc() return control bits
ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
PCI: PCIe: Introduce commad line switch for disabling port services
PCI: PCIe AER: Introduce pci_aer_available()
x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/pseries: Correct rtas_data_buf locking in dlpar code
powerpc/85xx: Add P1021 PCI IDs and quirks
arch/powerpc/sysdev/qe_lib/qe.c: Add of_node_put to avoid memory leak
arch/powerpc/platforms/83xx/mpc837x_mds.c: Add missing iounmap
fsl_rio: fix compile errors
powerpc/85xx: Fix compile issue with p1022_ds due to lmb rename to memblock
powerpc/85xx: Fix compilation of mpc85xx_mds.c
powerpc: Don't use kernel stack with translation off
powerpc/perf_event: Reduce latency of calling perf_event_do_pending
powerpc/kexec: Adds correct calling convention for kexec purgatory
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: fix a mismatch between code and comment
percpu: fix a memory leak in pcpu_extend_area_map()
percpu: add __percpu notations to UP allocator
percpu: handle __percpu notations in UP accessors
|
|
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: use zalloc_cpumask_var() for gcwq->mayday_mask
workqueue: fix GCWQ_DISASSOCIATED initialization
workqueue: Add a workqueue chapter to the tracepoint docbook
workqueue: fix cwq->nr_active underflow
workqueue: improve destroy_workqueue() debuggability
workqueue: mark lock acquisition on worker_maybe_bind_and_lock()
workqueue: annotate lock context change
workqueue: free rescuer on destroy_workqueue
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
tty: fix tty_line must not be equal to number of allocated tty pointers in tty driver
serial: bfin_sport_uart: restore transmit frame sync fix
serial: fix port type conflict between NS16550A & U6_16550A
MAINTAINERS: orphan isicom
vt: Fix console corruption on driver hand-over.
|