Age | Commit message (Collapse) | Author | Files | Lines |
|
Hugetlb supports multiple page sizes. We use split lock only for PMD
level, but not for PUD.
[[email protected]: coding-style fixes]
Signed-off-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently mm->pmd_huge_pte is protected by the page table lock. This will
not work with a split lock: we need a per-pmd pmd_huge_pte for proper
access serialization.
For now, let's just introduce a wrapper to access mm->pmd_huge_pte.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Alex Thorlton <[email protected]>
Cc: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
With split page table lock we can't know which lock we need to take
before we find the relevant pmd.
Let's move lock taking inside the function.
Signed-off-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
With split ptlock it's important to know which lock
pmd_trans_huge_lock() took. This patch adds one more parameter to the
function to return the lock.
In most places migration to the new API is trivial. The exception is
move_huge_pmd(): we need to take two locks if the pmd tables are different.
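The locking pattern described above can be illustrated with a small userspace sketch. It is not the kernel code: `toy_lock`, `toy_pmd_table`, `toy_trans_huge_lock()` and `toy_move_huge()` are hypothetical names, and the "lock" only records whether it is held. It shows a helper that reports which lock it took through an out-parameter, and a mover that takes a second lock only when the destination table has a different one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a spinlock: it only tracks whether it is held. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

/* Hypothetical PMD table carrying its own lock, as with a split lock. */
struct toy_pmd_table {
	struct toy_lock ptl;
	int is_huge;
};

/*
 * Returns 1 with *ptl pointing at the held lock if the entry is huge.
 * Returns 0 with no lock held and *ptl set to NULL otherwise.
 */
static int toy_trans_huge_lock(struct toy_pmd_table *t, struct toy_lock **ptl)
{
	*ptl = &t->ptl;
	toy_lock_acquire(*ptl);
	if (t->is_huge)
		return 1;
	toy_lock_release(*ptl);
	*ptl = NULL;
	return 0;
}

/*
 * Moves the huge entry from src to dst. Returns the number of locks
 * taken: 0 if src was not huge, 1 if both tables share a lock, 2 if not.
 */
static int toy_move_huge(struct toy_pmd_table *src, struct toy_pmd_table *dst)
{
	struct toy_lock *old_ptl, *new_ptl;
	int taken;

	if (!toy_trans_huge_lock(src, &old_ptl))
		return 0;
	taken = 1;

	new_ptl = &dst->ptl;
	if (new_ptl != old_ptl) {
		toy_lock_acquire(new_ptl);
		taken++;
	}

	src->is_huge = 0;
	dst->is_huge = 1;

	if (new_ptl != old_ptl)
		toy_lock_release(new_ptl);
	toy_lock_release(old_ptl);
	return taken;
}
```

Because the caller gets the lock pointer back, it can release exactly the lock that was taken without having to recompute it.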
Signed-off-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Basic api, backed by mm->page_table_lock for now. Actual implementation
will be added later.
Signed-off-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
With split page table lock for PMD level we can't hold mm->page_table_lock
while updating nr_ptes.
Let's convert it to atomic_long_t to avoid races.
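As a rough userspace analogy (not the kernel change itself), the idea can be sketched with C11 atomics. `toy_mm` and the helper names are hypothetical, and `atomic_long` from `<stdatomic.h>` stands in for the kernel's atomic_long_t. The counter is updated with atomic read-modify-write operations, so updaters do not need a common lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the relevant part of struct mm_struct. */
struct toy_mm {
	atomic_long nr_ptes;
};

static void toy_mm_init(struct toy_mm *mm)
{
	atomic_init(&mm->nr_ptes, 0);
}

/* Called when a page table is allocated; safe without an external lock. */
static void toy_nr_ptes_inc(struct toy_mm *mm)
{
	atomic_fetch_add(&mm->nr_ptes, 1);
}

/* Called when a page table is freed; the counter must not go negative. */
static void toy_nr_ptes_dec(struct toy_mm *mm)
{
	long old = atomic_fetch_sub(&mm->nr_ptes, 1);

	assert(old > 0);
}

static long toy_nr_ptes_read(struct toy_mm *mm)
{
	return atomic_load(&mm->nr_ptes);
}
```

A plain `long` updated as `mm->nr_ptes++` would be a non-atomic load, add and store, which can lose updates once two paths hold different locks.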
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We're going to introduce split page table lock for PMD level. Let's
rename existing split ptlock for PTE level to avoid confusion.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Tested-by: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Alex Thorlton noticed that some massively threaded workloads perform
poorly when THP is enabled. This patchset fixes this by introducing a split
page table lock for PMD tables. hugetlbfs is not covered yet.
This patchset is based on work by Naoya Horiguchi.
: akpm result summary:
:
: THP off, v3.12-rc2: 18.059261877 seconds time elapsed
: THP off, patched: 16.768027318 seconds time elapsed
:
: THP on, v3.12-rc2: 42.162306788 seconds time elapsed
: THP on, patched: 8.397885779 seconds time elapsed
:
: HUGETLB, v3.12-rc2: 47.574936948 seconds time elapsed
: HUGETLB, patched: 19.447481153 seconds time elapsed
THP off, v3.12-rc2:
-------------------
Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
1037072.835207 task-clock # 57.426 CPUs utilized ( +- 3.59% )
95,093 context-switches # 0.092 K/sec ( +- 3.93% )
140 cpu-migrations # 0.000 K/sec ( +- 5.28% )
10,000,550 page-faults # 0.010 M/sec ( +- 0.00% )
2,455,210,400,261 cycles # 2.367 GHz ( +- 3.62% ) [83.33%]
2,429,281,882,056 stalled-cycles-frontend # 98.94% frontend cycles idle ( +- 3.67% ) [83.33%]
1,975,960,019,659 stalled-cycles-backend # 80.48% backend cycles idle ( +- 3.88% ) [66.68%]
46,503,296,013 instructions # 0.02 insns per cycle
# 52.24 stalled cycles per insn ( +- 3.21% ) [83.34%]
9,278,997,542 branches # 8.947 M/sec ( +- 4.00% ) [83.34%]
89,881,640 branch-misses # 0.97% of all branches ( +- 1.17% ) [83.33%]
18.059261877 seconds time elapsed ( +- 2.65% )
THP on, v3.12-rc2:
------------------
Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
3114745.395974 task-clock # 73.875 CPUs utilized ( +- 1.84% )
267,356 context-switches # 0.086 K/sec ( +- 1.84% )
99 cpu-migrations # 0.000 K/sec ( +- 1.40% )
58,313 page-faults # 0.019 K/sec ( +- 0.28% )
7,416,635,817,510 cycles # 2.381 GHz ( +- 1.83% ) [83.33%]
7,342,619,196,993 stalled-cycles-frontend # 99.00% frontend cycles idle ( +- 1.88% ) [83.33%]
6,267,671,641,967 stalled-cycles-backend # 84.51% backend cycles idle ( +- 2.03% ) [66.67%]
117,819,935,165 instructions # 0.02 insns per cycle
# 62.32 stalled cycles per insn ( +- 4.39% ) [83.34%]
28,899,314,777 branches # 9.278 M/sec ( +- 4.48% ) [83.34%]
71,787,032 branch-misses # 0.25% of all branches ( +- 1.03% ) [83.33%]
42.162306788 seconds time elapsed ( +- 1.73% )
HUGETLB, v3.12-rc2:
-------------------
Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):
2588052.787264 task-clock # 54.400 CPUs utilized ( +- 3.69% )
246,831 context-switches # 0.095 K/sec ( +- 4.15% )
138 cpu-migrations # 0.000 K/sec ( +- 5.30% )
21,027 page-faults # 0.008 K/sec ( +- 0.01% )
6,166,666,307,263 cycles # 2.383 GHz ( +- 3.68% ) [83.33%]
6,086,008,929,407 stalled-cycles-frontend # 98.69% frontend cycles idle ( +- 3.77% ) [83.33%]
5,087,874,435,481 stalled-cycles-backend # 82.51% backend cycles idle ( +- 4.41% ) [66.67%]
133,782,831,249 instructions # 0.02 insns per cycle
# 45.49 stalled cycles per insn ( +- 4.30% ) [83.34%]
34,026,870,541 branches # 13.148 M/sec ( +- 4.24% ) [83.34%]
68,670,942 branch-misses # 0.20% of all branches ( +- 3.26% ) [83.33%]
47.574936948 seconds time elapsed ( +- 2.09% )
THP off, patched:
-----------------
Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
943301.957892 task-clock # 56.256 CPUs utilized ( +- 3.01% )
86,218 context-switches # 0.091 K/sec ( +- 3.17% )
121 cpu-migrations # 0.000 K/sec ( +- 6.64% )
10,000,551 page-faults # 0.011 M/sec ( +- 0.00% )
2,230,462,457,654 cycles # 2.365 GHz ( +- 3.04% ) [83.32%]
2,204,616,385,805 stalled-cycles-frontend # 98.84% frontend cycles idle ( +- 3.09% ) [83.32%]
1,778,640,046,926 stalled-cycles-backend # 79.74% backend cycles idle ( +- 3.47% ) [66.69%]
45,995,472,617 instructions # 0.02 insns per cycle
# 47.93 stalled cycles per insn ( +- 2.51% ) [83.34%]
9,179,700,174 branches # 9.731 M/sec ( +- 3.04% ) [83.35%]
89,166,529 branch-misses # 0.97% of all branches ( +- 1.45% ) [83.33%]
16.768027318 seconds time elapsed ( +- 2.47% )
THP on, patched:
----------------
Performance counter stats for './thp_memscale -c 80 -b 512m' (5 runs):
458793.837905 task-clock # 54.632 CPUs utilized ( +- 0.79% )
41,831 context-switches # 0.091 K/sec ( +- 0.97% )
98 cpu-migrations # 0.000 K/sec ( +- 1.66% )
57,829 page-faults # 0.126 K/sec ( +- 0.62% )
1,077,543,336,716 cycles # 2.349 GHz ( +- 0.81% ) [83.33%]
1,067,403,802,964 stalled-cycles-frontend # 99.06% frontend cycles idle ( +- 0.87% ) [83.33%]
864,764,616,143 stalled-cycles-backend # 80.25% backend cycles idle ( +- 0.73% ) [66.68%]
16,129,177,440 instructions # 0.01 insns per cycle
# 66.18 stalled cycles per insn ( +- 7.94% ) [83.35%]
3,618,938,569 branches # 7.888 M/sec ( +- 8.46% ) [83.36%]
33,242,032 branch-misses # 0.92% of all branches ( +- 2.02% ) [83.32%]
8.397885779 seconds time elapsed ( +- 0.18% )
HUGETLB, patched:
-----------------
Performance counter stats for './thp_memscale_hugetlbfs -c 80 -b 512M' (5 runs):
395353.076837 task-clock # 20.329 CPUs utilized ( +- 8.16% )
55,730 context-switches # 0.141 K/sec ( +- 5.31% )
138 cpu-migrations # 0.000 K/sec ( +- 4.24% )
21,027 page-faults # 0.053 K/sec ( +- 0.00% )
930,219,717,244 cycles # 2.353 GHz ( +- 8.21% ) [83.32%]
914,295,694,103 stalled-cycles-frontend # 98.29% frontend cycles idle ( +- 8.35% ) [83.33%]
704,137,950,187 stalled-cycles-backend # 75.70% backend cycles idle ( +- 9.16% ) [66.69%]
30,541,538,385 instructions # 0.03 insns per cycle
# 29.94 stalled cycles per insn ( +- 3.98% ) [83.35%]
8,415,376,631 branches # 21.286 M/sec ( +- 3.61% ) [83.36%]
32,645,478 branch-misses # 0.39% of all branches ( +- 3.41% ) [83.32%]
19.447481153 seconds time elapsed ( +- 2.00% )
This patch (of 11):
CONFIG_GENERIC_LOCKBREAK increases sizeof(spinlock_t) to 8 bytes. This
increases sizeof(struct page) by 4 bytes on 32-bit systems if the split
page table lock is in use, since page->ptl shares space in a union with
longs and pointers.
Let's disable split page table lock on 32-bit systems with
GENERIC_LOCKBREAK enabled.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Alex Thorlton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: "Eric W . Biederman" <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Robin Holt <[email protected]>
Cc: Sedat Dilek <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There's only one caller of do_generic_file_read() and the only actor is
file_read_actor(). No reason to have a callback parameter.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Reviewed-by: Wanpeng Li <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Cc: Maxim Levitsky <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs update from Chris Mason:
"This is our usual merge window set of bug fixes, performance
improvements and cleanups. Miao Xie has some really nice
optimizations for writeback.
Josef also expanded our sanity checks quite a bit; these make up a big
chunk of the new lines"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (98 commits)
Btrfs: rename btrfs_start_all_delalloc_inodes
Btrfs: don't wait for the completion of all the ordered extents
Btrfs: don't wait for all the async delalloc when shrinking delalloc
Btrfs: fix the confusion between delalloc bytes and metadata bytes
Btrfs: pick up the code for the item number calculation in flush_space()
Btrfs: wait for the ordered extent only when we want
Btrfs: remove unnecessary initialization and memory barrior in shrink_delalloc()
Btrfs: avoid unnecessary scrub workers allocation
Btrfs: check file extent type before anything else
btrfs: Remove useless variable in write_ctree_super()
btrfs: Fix checkpatch.pl warning of spacing issues
btrfs: Replace kmalloc with kmalloc_array
btrfs: Enclose macros with complex values within parenthesis
btrfs: Use WARN_ON()'s return value in place of WARN_ON(1)
btrfs: Remove redundant local zero structure
btrfs: Pack struct btrfs_device
btrfs: Replace multiple atomic_inc() with atomic_add()
btrfs: Add helper function for free_root_pointers()
Btrfs: fix a crash when running balance and defrag concurrently
Btrfs: do not run snapshot-aware defragment on error
...
|
|
The driver core clears the driver data to NULL after device_release
or on probe failure, so there is no need to clear the device driver
data to NULL manually.
Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Use module_pci_driver() macro which makes the code smaller and
simpler.
Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Casting the return value which is a void pointer is redundant.
The conversion from void pointer to any other pointer type is
guaranteed by the C programming language.
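A minimal userspace illustration of this rule, with hypothetical names (`toy_priv`, `toy_priv_alloc()`, `toy_priv_id()`) that do not come from the patched driver: a `void *` converts implicitly to any object pointer type in C, both from an allocator's return value and from an opaque data pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical private data structure. */
struct toy_priv {
	int id;
};

static struct toy_priv *toy_priv_alloc(int id)
{
	/* malloc() returns void *; no cast is needed in C. */
	struct toy_priv *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->id = id;
	return p;
}

/* An opaque void * argument also converts without a cast. */
static int toy_priv_id(void *data)
{
	struct toy_priv *p = data;

	assert(p != NULL);
	return p->id;
}
```

Note that this holds for C only; C++ requires an explicit cast for the same conversion.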
Signed-off-by: Jingoo Han <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Mark the places where the system is in user mode or in kernel mode.
This is needed for full dynticks (tickless) operation, which depends on
CONFIG_NO_HZ_FULL.
Signed-off-by: Kirill Tkhai <[email protected]>
CC: David Miller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
CONFIG_NO_HZ_FULL requires that smp_send_reschedule() can be called
for the calling CPU. Currently, this is used only in the inc_nr_running()
scheduler primitive.
Nobody calls smp_send_reschedule() from preemptible context
(furthermore, it looks like it will be safe if anybody uses it
another way in the future). But I add a WARN_ON() here anyway,
so that we come back to this if anything changes.
Signed-off-by: Kirill Tkhai <[email protected]>
CC: David Miller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Simba-bridges
The SIMBA APB Bridges lack the 'ranges' of-property describing the
PCI I/O and memory areas located beneath the bridge. Faking this
information has been performed by reading range registers in the
APB bridge, and calculating the corresponding areas.
In commit 01f94c4a6ced476ce69b895426fc29bfc48c69bd
("Fix sabre pci controllers with new probing scheme.") a bug was
introduced into this calculation, causing the PCI memory areas
to be calculated incorrectly: The shift size was set to be
identical for I/O and MEM ranges, which is incorrect.
This patch sets the shift size of the MEM range back to the
value used before 01f94c4a6ced476ce69b895426fc29bfc48c69bd.
Signed-off-by: Kjetil Oftedal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
While debugging a client issue, I noticed that we export a way too high
value for the maxfilesize attribute. The issue didn't turn out to be
related to it, but I think we should export the correct value, so that
clients can limit what write sizes they accept before hitting
the server.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
|
|
By default, when tasks are specified (i.e. -p, -t or -u options)
per-thread mmaps are created.
Add an option to override that and force per-cpu mmaps.
Further comments by peterz:
So this option allows -t/-p/-u to create one buffer per cpu and attach
all the various thread/process/user tasks' their counters to that one
buffer?
As opposed to the current state where each such counter would have its
own buffer.
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Sukadev Bhattiprolu <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
You can't pass demangled name into "perf probe", because of special chars:
./perf probe -f -x /tmp/a.out 'foo(int)'
Semantic error :There is non-digit char in line number.
And you can't even pass the mangled name (because it searches for the
symbol in the DSO with demangle=true):
./perf probe -f -x /tmp/a.out _Z3fooi
no symbols found in /tmp/a.out, maybe install a debug package?
However:
nm /tmp/a.out | grep foo
000000000040056d T _Z3fooi
After this patch, using the next command:
./perf probe -f --no-demangle -x /tmp/a.out _Z3fooi
probe will be successfully added.
Signed-off-by: Azat Khuzhin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
$ perf record ls
$ perf report
Press 'down enter end'
Result:
Program received signal SIGSEGV, Segmentation fault.
The UI browser, used on an argv array, would access past the end of the
array on SEEK_END because it wasn't using 'nr_entries - 1', fix it.
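The off-by-one can be illustrated with a small sketch. This is not the perf UI browser code: `toy_browser` and `toy_browser_seek()` are hypothetical, and the offset convention (zero or negative offsets for SEEK_END, as with lseek) is an assumption made for the example. The point is that the last valid index of an array with `nr_entries` elements is `nr_entries - 1`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical browser over an argv-style array of strings. */
struct toy_browser {
	const char **entries;
	unsigned int nr_entries;
	unsigned int top;	/* index of the current entry */
};

/* Returns the entry at the new position, or NULL if it is out of range. */
static const char *toy_browser_seek(struct toy_browser *b, long offset, int whence)
{
	long idx;

	assert(b->nr_entries > 0);

	switch (whence) {
	case SEEK_SET:
		idx = offset;
		break;
	case SEEK_CUR:
		idx = (long)b->top + offset;
		break;
	case SEEK_END:
		/* Last valid index is nr_entries - 1, not nr_entries. */
		idx = (long)b->nr_entries - 1 + offset;
		break;
	default:
		return NULL;
	}

	if (idx < 0 || idx >= (long)b->nr_entries)
		return NULL;

	b->top = (unsigned int)idx;
	return b->entries[idx];
}
```

Using `nr_entries + offset` for SEEK_END would index one element past the end when the offset is zero, which matches the kind of crash described above.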
Reported-by: [email protected]
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=59291
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It was affecting only frame-pointer (fp) based callchain processing.
Usage example:
perf top --call-graph dwarf,1024 --max-stack 2
Works for any tool that does callchain resolving and provides a
--max-stack option.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Waiman Long <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Just one use so far, on the hists browser, for completeness since there
we use perf_evlist__{first,last} and perf_evsel__next() for handling the
TAB and UNTAB keys.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In a few remaining places where the equivalent open coded variant was
still being used.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When introducing the PERF_RECORD_MMAP2 in:
5c5e854bc760 perf tools: Add attr->mmap2 support
A check for the number of entries parsed by sscanf was introduced that
assumed all of the 8 fields needed to be correctly parsed so that
particular /proc/pid/maps line would be considered synthesizable.
That broke anon records synthesizing, as it doesn't have the 'execname'
field.
Fix it by keeping the sscanf return check, changing it to not require
that the 'execname' variable be parsed, so that the preexisting logic
can kick in and set it to '//anon'.
This should get things like JIT profiling working again.
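The check described above can be sketched in userspace as follows. This is not the perf source: `toy_map`, `toy_parse_maps_line()` and the exact format string are assumptions made for illustration. The parser accepts a line when at least the seven leading fields convert, and falls back to "//anon" when the optional eighth field (the name) is absent.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for one parsed /proc/<pid>/maps line. */
struct toy_map {
	unsigned long long start, end, pgoff;
	unsigned int maj, min;
	unsigned long ino;
	char prot[5];
	char execname[256];
};

/* Returns 0 on success, -1 if the mandatory fields could not be parsed. */
static int toy_parse_maps_line(const char *line, struct toy_map *m)
{
	int n;

	assert(line != NULL && m != NULL);
	m->execname[0] = '\0';

	n = sscanf(line, "%llx-%llx %4s %llx %x:%x %lu %255s",
		   &m->start, &m->end, m->prot, &m->pgoff,
		   &m->maj, &m->min, &m->ino, m->execname);

	/* Seven fields are mandatory; the name is optional. */
	if (n < 7)
		return -1;

	/* Anonymous mappings have no name field. */
	if (n == 7)
		strcpy(m->execname, "//anon");

	return 0;
}
```

Requiring all eight conversions would reject every anonymous mapping, since sscanf() stops at the end of the line and reports only seven.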
Signed-off-by: Don Zickus <[email protected]>
Cc: Bill Gray <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joe Mario <[email protected]>
Cc: Richard Fowles <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/n/[email protected]
[ commit log message is mine, dzickus reported the problem with a patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add missing newline if the 'uid' is invalid:
hubble:~> perf top --stdio -u help
Error:
Invalid User: helphubble:~>
Fixed by this patch:
comet:~/tip/tools/perf> perf top --stdio -u help
Error:
Invalid User: help
comet:~/tip/tools/perf>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Accidentally ran into these, get rid of them.
Signed-off-by: Davidlohr Bueso <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Tweak the summary output as suggested by Ingo Molnar:
[penberg@localhost ~]$ perf trace -a --duration 10000 --summary -- sleep 1
^C
Summary of events:
Xorg (817), 148 events, 0.0%, 0.000 msec
syscall calls min avg max stddev
(msec) (msec) (msec) (%)
--------------- -------- --------- --------- --------- ------
read 7 0.002 0.004 0.011 32.00%
rt_sigprocmask 40 0.001 0.001 0.002 1.31%
ioctl 6 0.002 0.003 0.005 19.45%
writev 7 0.004 0.018 0.059 43.76%
select 9 0.000 74.513 507.869 74.61%
setitimer 4 0.001 0.002 0.002 10.08%
Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
764369 depends on SMP, so don't select it on !SMP configs.
Signed-off-by: Olof Johansson <[email protected]>
Acked-by: Rob Herring <[email protected]>
|
|
764369 depends on SMP, so don't select it on !SMP configs.
Signed-off-by: Olof Johansson <[email protected]>
Acked-by: Srinivas Kandagatla <[email protected]>
Cc: Stuart Menefy <[email protected]>
|
|
CPU reset handler was set before fuse is initialized, but
tegra_cpu_reset_handler_enable() uses tegra_chip_id, which is set by
tegra_init_fuse(). This patch reorders the calls so the CPU reset
handler code does not read an uninitialized variable.
Signed-off-by: Alexandre Courbot <[email protected]>
Signed-off-by: Stephen Warren <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>
|
|
Add a single-vendor config for vt8500. We can't enable WM8750 in
multi_v7_defconfig since it's a v6-based device, but it's still valuable
to have an in-tree defconfig that is suitable for the hardware.
This is based on multi_v7_defconfig and can be tweaked over time. It
gets us off the ground for now. Naming it vt8500_v6_v7 similar to i.MX
since there are v5-based vt8500 chips as well.
Signed-off-by: Olof Johansson <[email protected]>
Acked-by: Tony Prisk <[email protected]>
|
|
This turns on the internal integrator LCD display(s). It seems that the code
to do this got lost in refactoring of the CLCD driver.
Signed-off-by: Jonathan Austin <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Cc: [email protected]
Signed-off-by: Olof Johansson <[email protected]>
|
|
Correct the SPI node compatible property items to match example code and
match current DTS usage.
Signed-off-by: Eric Witcher <[email protected]>
Acked-by: Sourav Poddar <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
|
|
On OMAPs the IO ring must be rearmed each time the pad wakeup
configuration is changed. So call pcs_soc->rearm() from
pcs_irq_set().
As pinctrl-single is now an interrupt controller in some cases,
we should follow the standards and keep the interrupts enabled
constantly, and not just for wake-up events. The tracking of
runtime vs wake-up interrupts can be handled separately for
the automated runtime PM solution when we have it in the
future.
Signed-off-by: Roger Quadros <[email protected]>
Acked-by: Linus Walleij <[email protected]>
[[email protected]: removed wrong comment, updated description]
Signed-off-by: Tony Lindgren <[email protected]>
|
|
In case of error, the function platform_device_register_resndata()
returns ERR_PTR() and never returns NULL. The NULL test in the return
value check should be replaced with IS_ERR().
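For readers unfamiliar with the convention, here is a userspace re-creation of the idea behind ERR_PTR() and IS_ERR(). The `toy_` helpers and `toy_register()` are hypothetical and only mimic the kernel macros: an error code is encoded in the top 4095 values of the address space, so a failed call returns a non-NULL pointer and a NULL test never fires.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *toy_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the range reserved for error codes. */
static int toy_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

/* Decode the errno value; only valid for error pointers. */
static long toy_ptr_err(const void *p)
{
	assert(toy_is_err(p));
	return (long)(intptr_t)p;
}

static int toy_dev;

/* Hypothetical registration function that never returns NULL. */
static void *toy_register(int fail)
{
	return fail ? toy_err_ptr(-ENOMEM) : (void *)&toy_dev;
}
```

With this convention, `if (!ptr)` after `toy_register(1)` would treat the failure as success, which is why the check has to use the IS_ERR()-style test instead.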
Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Igor Grinberg <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into xxx-dt
Several OMAP2+ DSS-related clock fixes for v3.13 from Tomi Valkeinen.
Basic test logs at:
http://www.pwsan.com/omap/testlogs/clock_fixes_v3.13/20131024090906/
|
|
It's safer to turn on regcache_cache_only before disabling the regulator,
since the driver turns regcache_cache_only off after enabling the regulator.
If cache_only is left false, commands such as 'amixer cset' would fail
if run before wm8962_resume().
Signed-off-by: Nicolin Chen <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]
|
|
Set feature-libunwind-debug-frame. We don't want it in
CORE_FEATURE_TESTS because it's not the generic case, but we
need to set it in the !feature-libunwind case.
Also, because x86 distributions typically don't have
dwarf_find_debug_frame() unwinding method:
test-libunwind-debug-frame.c:(.text+0x31): undefined reference to `_Ux86_64_dwarf_find_debug_frame'
Restrict this new API to ARM for the time being.
With this patch test-all.c works again, so repeat perf builds
are fast again:
comet:~/tip> perf stat --null --repeat 5 make -C tools/perf/
[...]
0,452899660 seconds time elapsed ( +- 0,11% )
While before it was:
comet:~/tip> perf stat --null --repeat 5 make -C tools/perf/
[...]
1,674001829 seconds time elapsed ( +- 0,16% )
[ Includes fix to config/feature-checks/Makefile from Will Deacon. ]
Tested-by: Will Deacon <[email protected]>
Tested-by: Jean Pihet <[email protected]>
Cc: Russell King <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The recent change in sysfs triggers a kernel WARNING when unloading a
sound driver, like
WARNING: CPU: 3 PID: 2247 at fs/sysfs/group.c:214 sysfs_remove_group+0xe8/0xf0()
sysfs group ffffffff81ab7b20 not found for kobject 'event14'
for each jack instance. This is because the jack input device is
unregistered in the dev_free callback, which is called after
snd_card_disconnect(). Since device_unregister(card->card_dev) is
called in snd_card_disconnect(), all sysfs entries belonging to
card->card_dev have already been removed recursively. Thus
input_unregister_device() triggers a warning when it still tries to
unregister the already removed sysfs entry.
To fix this, we need to unregister the jack input device in the
dev_disconnect callback so that it's called before unregistering
card->card_dev.
Reviewed-by: Mark Brown <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Switch virtio-blk from the dual support for old-style requests and bios
to use the block-multiqueue.
Acked-by: Asias He <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
The new blk-mq code added new instances of __cpuinit usage.
We removed this a couple versions ago; we now want to remove
the compat no-op stubs. Introducing new users is not what
we want to see at this point in time, as it will break once
the stubs are gone.
Signed-off-by: Paul Gortmaker <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The current code may access an already freed object. The input
device must be accessed and unregistered before freeing the top level
sound object.
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Unbalanced calls to snd_imx_pcm_trigger() may result in endless
FIQ activity and thus provoke eternal sound. While at first glance
the switch statement looks pretty symmetric, the SUSPEND/RESUME
pair is not: the suspend case comes via snd_pcm_suspend_all(),
which for fsl/imx-pcm-fiq is called only at snd_soc_suspend(),
but the resume case originates straight from SNDRV_PCM_IOCTL_RESUME.
This way userland may provoke an unbalanced resume, which might cause
the fiq_enable counter to increase and never return to zero again,
so imx_pcm_fiq is never disabled.
Simply removing the fiq_enable counter would solve the problem as long
as playback and capture never run simultaneously; with both running at
once, an early TRIGGER_STOP would cut off the other stream prematurely.
So playback and capture are now tracked separately instead of by
counting.
Signed-off-by: Oskar Schirmer <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Cc: [email protected]
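A simplified model shows why a shared counter gets stuck while per-stream flags do not. All names below (counting_trigger, flag_trigger and the enums) are hypothetical and stand in for the driver's trigger handling; this is a sketch of the idea, not the imx-pcm-fiq code.

```c
#include <stdbool.h>

enum stream { PLAYBACK = 0, CAPTURE = 1 };
enum trigger { TRIGGER_START, TRIGGER_STOP, TRIGGER_RESUME, TRIGGER_SUSPEND };

/* Counting model: start/resume increment, stop/suspend decrement. */
struct counting_state { int fiq_enable; };

/* Returns true while the FIQ would stay enabled. */
static bool counting_trigger(struct counting_state *s, enum trigger cmd)
{
	if (cmd == TRIGGER_START || cmd == TRIGGER_RESUME)
		s->fiq_enable++;
	else if (s->fiq_enable > 0)
		s->fiq_enable--;
	return s->fiq_enable > 0;
}

/* Flag model: each direction records only whether it is running. */
struct flag_state { bool active[2]; };

/* Returns true while at least one direction still needs the FIQ. */
static bool flag_trigger(struct flag_state *s, enum stream dir,
			 enum trigger cmd)
{
	s->active[dir] = (cmd == TRIGGER_START || cmd == TRIGGER_RESUME);
	return s->active[PLAYBACK] || s->active[CAPTURE];
}
```

Feeding START, an unpaired RESUME, and STOP into the counting model leaves the counter at 1, so the FIQ never stops. The flag model ends with the FIQ off after the same sequence, and it keeps the FIQ on when one direction stops while the other is still running.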
|
|
'feature_timerfd' is checked all the time and calculated explicitly,
in a serial fashion. Add it to CORE_FEATURE_TESTS which causes it to
be built in parallel, using the newfangled parallel build autodetection
code.
This shaves 137 msecs off the perf build time on my system, which
speeds up the common case cached build by 43%:
Before:
comet:~/tip> perf stat --null --repeat 5 make -C tools/perf/
[...]
0,453771441 seconds time elapsed ( +- 0,09% )
After:
comet:~/tip> perf stat --null --repeat 5 make -C tools/perf/
[...]
0,316290185 seconds time elapsed ( +- 0,24% )
Cc: David Ahern <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
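The two figures quoted above follow from the measured times. A small sketch of the arithmetic, with hypothetical helper names, assuming the 43% is meant as a throughput speedup (before/after - 1) rather than as a reduction in elapsed time:

```c
/* Time saved by the change, in milliseconds. */
static double saved_msecs(double before_s, double after_s)
{
	return (before_s - after_s) * 1000.0;
}

/* Speedup as a percentage: how much more work fits in the same time. */
static double speedup_percent(double before_s, double after_s)
{
	return (before_s / after_s - 1.0) * 100.0;
}

/* Reduction of elapsed time as a percentage of the old time. */
static double time_reduction_percent(double before_s, double after_s)
{
	return (1.0 - after_s / before_s) * 100.0;
}
```

With before = 0.453771441 s and after = 0.316290185 s, saved_msecs() gives about 137 ms and speedup_percent() about 43%, matching the message; the elapsed time itself drops by roughly 30%.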
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 changes from Ted Ts'o:
"Ext4 updates for 3.13. Mostly bug fixes and cleanups"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: add prototypes for macro-generated functions
ext4: return non-zero st_blocks for inline data
ext4: use prandom_u32() instead of get_random_bytes()
ext4: remove unreachable code after ext4_can_extents_be_merged()
ext4: remove unreachable code in ext4_can_extents_be_merged()
ext4: avoid bh leak in retry path of ext4_expand_extra_isize_ea()
ext4: don't count free clusters from a corrupt block group
ext4: fix FITRIM in no journal mode
ext4: drop set but otherwise unused variable from ext4_add_dirent_to_inline()
ext4: change ext4_read_inline_dir() to return 0 on success
ext4: pair trace_ext4_writepages & trace_ext4_writepages_result
ext4: add ratelimiting to ext4 messages
ext4: fix performance regression in ext4_writepages
ext4: fixup kerndoc annotation of mpage_map_and_submit_extent()
ext4: fix assertion in ext4_add_complete_io()
|
|
Pull xfs update from Ben Myers:
"For 3.13-rc1 we have an eclectic assortment of bugfixes, cleanups, and
refactoring. Bugfixes that stand out are the fix for the AGF/AGI
deadlock, incore extent list fixes, verifier fixes for v4 superblocks
and growfs, and memory leaks. There are some asserts, warnings, and
strings that were cleaned up. There was further rearrangement of code
to make libxfs and the kernel sync up more easily, differences between
v2 and v3 directory code were abstracted using an ops vector,
xfs_inactive was reworked, and the preallocation/hole punching code
was refactored.
- simplify kmem_zone_zalloc
- add traces for AGF/AGI read ops
- add additional AIL traces
- fix xfs_remove AGF vs AGI deadlock
- fix the extent count of new incore extent page in the indirection
array
- don't fail bad secondary superblocks verification on v4 filesystems
due to unzeroed bits after v4 fields
- fix possible NULL dereference in xlog_verify_iclog
- remove redundant assert in xfs_dir2_leafn_split
- prevent stack overflows from page cache allocation
- fix some sparse warnings
- fix directory block format verifier to check the leaf entry count
- abstract the differences in dir2/dir3 via an ops vector
- continue process of reorganization to make libxfs/kernel code
merges easier
- refactor the preallocation and hole punching code
- fix for growfs and verifiers
- remove unnecessary scary corruption error when probing non-xfs
filesystems
- remove extra newlines from strings passed to printk
- prevent deadlock trying to cover an active log
- rework xfs_inactive()
- add the inode directory type support to XFS_IOC_FSGEOM
- cleanup (remove) usage of is_bad_inode
- fix miscalculation in xfs_iext_realloc_direct which results in
oversized direct extent list
- remove unnecessary count arg to xfs_iomap_write_allocate
- fix memory leak in xlog_recover_add_to_trans
- check superblock instead of block magic to determine if dtype field
is present
- fix lockdep annotation due to project quotas
- fix regression in xfs_node_toosmall which can lead to incorrect
directory btree node collapse
- make log recovery verify filesystem uuid of recovering blocks
- fix XFS_IOC_FREE_EOFBLOCKS definition
- remove invalid assert in xfs_inode_free
- fix for AIL lock regression"
* tag 'xfs-for-linus-v3.13-rc1' of git://oss.sgi.com/xfs/xfs: (49 commits)
xfs: simplify kmem_{zone_}zalloc
xfs: add tracepoints to AGF/AGI read operations
xfs: trace AIL manipulations
xfs: xfs_remove deadlocks due to inverted AGF vs AGI lock ordering
xfs: fix the extent count when allocating an new indirection array entry
xfs: be more forgiving of a v4 secondary sb w/ junk in v5 fields
xfs: fix possible NULL dereference in xlog_verify_iclog
xfs:xfs_dir2_node.c: pointer use before check for null
xfs: prevent stack overflows from page cache allocation
xfs: fix static and extern sparse warnings
xfs: validity check the directory block leaf entry count
xfs: make dir2 ftype offset pointers explicit
xfs: convert directory vector functions to constants
xfs: convert directory vector functions to constants
xfs: vectorise encoding/decoding directory headers
xfs: vectorise DA btree operations
xfs: vectorise directory leaf operations
xfs: vectorise directory data operations part 2
xfs: vectorise directory data operations
xfs: vectorise remaining shortform dir2 ops
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"A number of fixes:
- Fix segfault on perf trace -i perf.data, from Namhyung Kim.
- Fix segfault with --no-mmap-pages, from David Ahern.
- Don't force a refresh during progress update in the TUI, greatly
reducing startup costs, fix from Patrick Palka.
- Fix sw clock event period test wrt not checking if using >
max_sample_freq.
- Handle throttle events in 'object code reading' test, fix from
Adrian Hunter.
- Prevent condition that all sort keys are elided, fix from Namhyung
Kim.
- Round mmap pages to power 2, from David Ahern.
And a number of late arrival changes:
- Add summary only option to 'perf trace', suppressing the decoding
of events, from David Ahern
- 'perf trace --summary' formatting simplifications, from Pekka
Enberg.
- Beautify fifth argument of mmap() as fd, in 'perf trace', from
Namhyung Kim.
- Add direct access to dynamic arrays in libtraceevent, from Steven
Rostedt.
- Synthesize non-exec MMAP records when --data used, allowing the
resolution of data addresses to symbols (global variables, etc), by
Arnaldo Carvalho de Melo.
- Code cleanups by David Ahern and Adrian Hunter"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
tools lib traceevent: Add direct access to dynamic arrays
perf target: Shorten perf_target__ to target__
perf tests: Handle throttle events in 'object code reading' test
perf evlist: Refactor mmap_pages parsing
perf evlist: Round mmap pages to power 2 - v2
perf record: Fix segfault with --no-mmap-pages
perf trace: Add summary only option
perf trace: Simplify '--summary' output
perf trace: Change syscall summary duration order
perf tests: Compensate lower sample freq with longer test loop
perf trace: Fix segfault on perf trace -i perf.data
perf trace: Separate tp syscall field caching into init routine to be reused
perf trace: Beautify fifth argument of mmap() as fd
perf tests: Use lower sample_freq in sw clock event period test
perf tests: Check return of perf_evlist__open sw clock event period test
perf record: Move existing write_output into helper function
perf record: Use correct return type for write()
perf tools: Prevent condition that all sort keys are elided
perf machine: Simplify synthesize_threads method
perf machine: Introduce synthesize_threads method out of open coded equivalent
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull two x86 fixes from Ingo Molnar.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/amd: Tone down printk(), don't treat a missing firmware file as an error
x86/dumpstack: Fix printk_address for direct addresses
|