Age | Commit message (Collapse) | Author | Files | Lines |
|
Before we split up the dcache_lru_lock, the unused dentry counter needs to
be made independent of the global dcache_lru_lock. Convert it to per-cpu
counters to do this.
Signed-off-by: Dave Chinner <[email protected]>
Signed-off-by: Glauber Costa <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Artem Bityutskiy <[email protected]>
Cc: Arve Hjønnevåg <[email protected]>
Cc: Carlos Maiolino <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Chuck Lever <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Gleb Natapov <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: J. Bruce Fields <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Stultz <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Steven Whitehouse <[email protected]>
Cc: Thomas Hellstrom <[email protected]>
Cc: Trond Myklebust <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
The sysctl knob sysctl_vfs_cache_pressure is used to determine which
percentage of the shrinkable objects in our cache we should actively try
to shrink.
It works great in situations in which we have many objects (at least more
than 100), because the aproximation errors will be negligible. But if
this is not the case, specially when total_objects < 100, we may end up
concluding that we have no objects at all (total / 100 = 0, if total <
100).
This is certainly not the biggest killer in the world, but may matter in
very low kernel memory situations.
Signed-off-by: Glauber Costa <[email protected]>
Reviewed-by: Carlos Maiolino <[email protected]>
Acked-by: KAMEZAWA Hiroyuki <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Artem Bityutskiy <[email protected]>
Cc: Arve Hjønnevåg <[email protected]>
Cc: Carlos Maiolino <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Chuck Lever <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Gleb Natapov <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: J. Bruce Fields <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Stultz <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Steven Whitehouse <[email protected]>
Cc: Thomas Hellstrom <[email protected]>
Cc: Trond Myklebust <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
This series reworks our current object cache shrinking infrastructure in
two main ways:
* Noticing that a lot of users copy and paste their own version of LRU
lists for objects, we put some effort in providing a generic version.
It is modeled after the filesystem users: dentries, inodes, and xfs
(for various tasks), but we expect that other users could benefit in
the near future with little or no modification. Let us know if you
have any issues.
* The underlying list_lru being proposed automatically and
transparently keeps the elements in per-node lists, and is able to
manipulate the node lists individually. Given this infrastructure, we
are able to modify the up-to-now hammer called shrink_slab to proceed
with node-reclaim instead of always searching memory from all over like
it has been doing.
Per-node lru lists are also expected to lead to less contention in the lru
locks on multi-node scans, since we are now no longer fighting for a
global lock. The locks usually disappear from the profilers with this
change.
Although we have no official benchmarks for this version - be our guest to
independently evaluate this - earlier versions of this series were
performance tested (details at
http://permalink.gmane.org/gmane.linux.kernel.mm/100537) yielding no
visible performance regressions while yielding a better qualitative
behavior in NUMA machines.
With this infrastructure in place, we can use the list_lru entry point to
provide memcg isolation and per-memcg targeted reclaim. Historically,
those two pieces of work have been posted together. This version presents
only the infrastructure work, deferring the memcg work for a later time,
so we can focus on getting this part tested. You can see more about the
history of such work at http://lwn.net/Articles/552769/
Dave Chinner (18):
dcache: convert dentry_stat.nr_unused to per-cpu counters
dentry: move to per-sb LRU locks
dcache: remove dentries from LRU before putting on dispose list
mm: new shrinker API
shrinker: convert superblock shrinkers to new API
list: add a new LRU list type
inode: convert inode lru list to generic lru list code.
dcache: convert to use new lru list infrastructure
list_lru: per-node list infrastructure
shrinker: add node awareness
fs: convert inode and dentry shrinking to be node aware
xfs: convert buftarg LRU to generic code
xfs: rework buffer dispose list tracking
xfs: convert dquot cache lru to list_lru
fs: convert fs shrinkers to new scan/count API
drivers: convert shrinkers to new count/scan API
shrinker: convert remaining shrinkers to count/scan API
shrinker: Kill old ->shrink API.
Glauber Costa (7):
fs: bump inode and dentry counters to long
super: fix calculation of shrinkable objects for small numbers
list_lru: per-node API
vmscan: per-node deferred work
i915: bail out earlier when shrinker cannot acquire mutex
hugepage: convert huge zero page shrinker to new shrinker API
list_lru: dynamically adjust node arrays
This patch:
There are situations in very large machines in which we can have a large
quantity of dirty inodes, unused dentries, etc. This is particularly true
when umounting a filesystem, where eventually since every live object will
eventually be discarded.
Dave Chinner reported a problem with this while experimenting with the
shrinker revamp patchset. So we believe it is time for a change. This
patch just moves int to longs. Machines where it matters should have a
big long anyway.
Signed-off-by: Glauber Costa <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Artem Bityutskiy <[email protected]>
Cc: Arve Hjønnevåg <[email protected]>
Cc: Carlos Maiolino <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Chuck Lever <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Gleb Natapov <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: J. Bruce Fields <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Stultz <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Steven Whitehouse <[email protected]>
Cc: Thomas Hellstrom <[email protected]>
Cc: Trond Myklebust <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Dave Jones <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
For a long time no filesystem has been using vfs_follow_link, and as seen
by recent filesystem submissions any new use is accidental as well.
Remove vfs_follow_link, document the replacement in
Documentation/filesystems/porting and also rename __vfs_follow_link
to match its only caller better.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Olof Johansson reports that the tests against HZ_FIXED seem
non-functional. Fix this by using '0' as a sentinel for "not
specified" and test against that instead.
Reported-by: Olof Johansson <[email protected]>
Signed-off-by: Russell King <[email protected]>
|
|
stage of driver init time
Signed-off-by: James Smart <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
deleting linked list
Signed-off-by: James Smart <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
This patch will add big endian architecture support to megaraid_sas
driver. The support added is for LSI MegaRAID all generation controllers-
(3Gb/s, 6Gb/s and 12 Gb/s controllers).
We have done basic sanity test @ppc64 arch and @x86_64. Additional
testing/observations are welcome.
[jejb: fix up rejections]
Signed-off-by: Kashyap Desai <[email protected]>
Signed-off-by: Sumit Saxena <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
Pull CRIS updates from Jesper Nilsson:
"Mostly cleanup and removal of unused configs"
* tag 'cris-for-3.12' of git://jni.nu/cris:
CRIS: drop unused Kconfig symbols
CRIS: Add kvm_para.h which includes generic file
CRIS: remove unused current_regs
CRIS: Remove last traces of legacy RTC drivers
CRIS: remove "config OOM_REBOOT"
|
|
The mn10300 kernel crashes just after starting userspace programs, if
CONFIG_PREEMPT is disabled:
Freeing unused kernel memory: 96K (90286000 - 9029e000)
MISALIGN: 97c33ff9: unsupported instruction f
MISALIGN: 97c33ff9: unsupported instruction f
MISALIGN: 97c33ff9: unsupported instruction f
:
This fixes the problem that was introduced by commit d17fc238ac14
("MN10300: Enable IRQs more in system call exit work path").
Signed-off-by: Akira Takeuchi <[email protected]>
Signed-off-by: Kiyoshi Owada <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The prototype for ahc_9005_subdevinfo_valid shows that the caller has the
arguments in the wrong order.
637 ahc_9005_subdevinfo_valid(uint16_t device, uint16_t vendor,
638 uint16_t subdevice, uint16_t subvendor)
Signed-off-by: Dave Jones <[email protected]>
Acked-by: Hannes Reinecke <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
Changes the version of hpsa so we know something has changed. Please consider
this for inclusion.
Signed-off-by: Mike Miller <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
This patch does a bit of housekeeping for hpsa. Change lowercase alpha hex
digits to uppercase for consistency within the driver. Also moves the P822se
in the tables to keep controllers of each family grouped together.
Signed-off-by: Mike Miller <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
Add the marketing names for HP Smart Array Gen8 controllers. Also removes an
unused ID. Please consider this for inclusion.
Signed-off-by: Mike Miller <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
commit 6d4028c644e (i2c: use dev_get_platdata()) did a bad conversion
of this one case:
drivers/i2c/busses/i2c-davinci.c: In function 'davinci_i2c_probe':
drivers/i2c/busses/i2c-davinci.c:665:2: warning: passing argument 1 of
'dev_get_platdata' from incompatible pointer type [enabled by default]
Reviewed-by: Jingoo Han <[email protected]>
Acked-by: Wolfram Sang <[email protected]>
Signed-off-by: Olof Johansson <[email protected]>
|
|
* pm-cpuidle:
cpuidle: Check the result of cpuidle_get_driver() against NULL
|
|
* acpi-assorted:
ACPI / LPSS: don't crash if a device has no MMIO resources
|
|
* acpica:
ACPICA: Fix for a Store->ArgX when ArgX contains a reference to a field.
|
|
* acpi-pci-hotplug:
ACPI / hotplug / PCI: Avoid parent bus rescans on spurious device checks
ACPI / hotplug / PCI: Use _OST to notify firmware about notify status
ACPI / hotplug / PCI: Avoid doing too much for spurious notifies
ACPI / hotplug / PCI: Don't trim devices before scanning the namespace
|
|
* acpi-hotplug:
PM / hibernate / memory hotplug: Rework mutual exclusion
PM / hibernate: Create memory bitmaps after freezing user space
ACPI / scan: Change ordering of locks for device hotplug
|
|
This patch adds the PCI ID's for HP Smart Array Gen9 controllers. Please
consider this patch for inclusion.
Signed-off-by: Mike Miller <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
|
|
Enable the intel_pstate driver for Haswell CPUs. One missing Ivy Bridge
model (0x3E) is also included. Models referenced from
tools/power/x86/turbostat/turbostat.c:has_nehalem_turbo_ratio_limit
Signed-off-by: Nell Hardcastle <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Acked-by: Dirk Brandewie <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Signed-off-by: Dave Jones <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Signed-off-by: Al Viro <[email protected]>
|
|
For a long time no filesystem has been using vfs_follow_link, and as seen
by recent filesystem submissions any new use is accidental as well.
Remove vfs_follow_link, document the replacement in
Documentation/filesystems/porting and also rename __vfs_follow_link
to match its only caller better.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Pull device tree core updates from Grant Likely:
"Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding
the entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either"
Tim Bird questions whether the boot time cost of the random feeding may
be noticeable. And "add_device_randomness()" is definitely not some
speed deamon of a function.
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of/platform: add error reporting to of_amba_device_create()
irq/of: Fix comment typo for irq_of_parse_and_map
of: Feed entire flattened device tree into the random pool
of/fdt: Clean up casting in unflattening path
of/fdt: Remove duplicate memory clearing on FDT unflattening
gpio: implement gpio-ranges binding document fix
of: call __of_parse_phandle_with_args from of_parse_phandle
of: introduce of_parse_phandle_with_fixed_args
of: move of_parse_phandle()
of: move documentation of of_parse_phandle_with_args
of: Fix missing memory initialization on FDT unflattening
of: consolidate definition of early_init_dt_alloc_memory_arch()
of: Make of_get_phy_mode() return int i.s.o. const int
include: dt-binding: input: create a DT header defining key codes.
of/platform: Staticize of_platform_device_create_pdata()
of: Specify initrd location using 64-bit
dt: Typo fix
OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled
|
|
Pull slave-dmaengine updates from Vinod Koul:
"This pull brings:
- Andy's DW driver updates
- Guennadi's sh driver updates
- Pl08x driver fixes from Tomasz & Alban
- Improvements to mmp_pdma by Daniel
- TI EDMA fixes by Joel
- New drivers:
- Hisilicon k3dma driver
- Renesas rcar dma driver
- New API for publishing slave driver capablities
- Various fixes across the subsystem by Andy, Jingoo, Sachin etc..."
* 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (94 commits)
dma: edma: Remove limits on number of slots
dma: edma: Leave linked to Null slot instead of DUMMY slot
dma: edma: Find missed events and issue them
ARM: edma: Add function to manually trigger an EDMA channel
dma: edma: Write out and handle MAX_NR_SG at a given time
dma: edma: Setup parameters to DMA MAX_NR_SG at a time
dmaengine: pl330: use dma_set_max_seg_size to set the sg limit
dmaengine: dma_slave_caps: remove sg entries
dma: replace devm_request_and_ioremap by devm_ioremap_resource
dma: ste_dma40: Fix potential null pointer dereference
dma: ste_dma40: Remove duplicate const
dma: imx-dma: Remove redundant NULL check
dma: dmagengine: fix function names in comments
dma: add driver for R-Car HPB-DMAC
dma: k3dma: use devm_ioremap_resource() instead of devm_request_and_ioremap()
dma: imx-sdma: Staticize sdma_driver_data structures
pch_dma: Add MODULE_DEVICE_TABLE
dmaengine: PL08x: Add cyclic transfer support
dmaengine: PL08x: Fix reading the byte count in cctl
dmaengine: PL08x: Add support for different maximum transfer size
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
Pull MMC updates from Chris Ball:
"MMC highlights for 3.12:
Core:
- Support Allocation Units 8MB-64MB in SD3.0, previous max was 4MB.
- The slot-gpio helper can now handle GPIO debouncing card-detect.
- Read supported voltages from DT "voltage-ranges" property.
Drivers:
- dw_mmc: Add support for ARC architecture, and support exynos5420.
- mmc_spi: Support CD/RO GPIOs.
- sh_mobile_sdhi: Add compatibility for more Renesas SoCs.
- sh_mmcif: Add DT support for DMA channels"
* tag 'mmc-updates-for-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: (50 commits)
Revert "mmc: tmio-mmc: Remove .set_pwr() callback from platform data"
mmc: dw_mmc: Add support for ARC
mmc: sdhci-s3c: initialize host->quirks2 for using quirks2
mmc: sdhci-s3c: fix the wrong register value, when clock is disabled
mmc: esdhc: add support to get voltage from device-tree
mmc: sdhci: get voltage from sdhc host
mmc: core: parse voltage from device-tree
mmc: omap_hsmmc: use the generic config for omap2plus devices
mmc: omap_hsmmc: clear status flags before starting a new command
mmc: dw_mmc: exynos: Add a new compatible string for exynos5420
mmc: sh_mmcif: revision-specific CLK_CTRL2 handling
mmc: sh_mmcif: revision-specific Command Completion Signal handling
mmc: sh_mmcif: add support for Device Tree DMA bindings
mmc: sh_mmcif: move header include from header into .c
mmc: SDHI: add DT compatibility strings for further SoCs
mmc: dw_mmc-pci: enable bus-mastering mode
mmc: dw_mmc-pci: get resources from a proper BAR
mmc: tmio-mmc: Remove .set_pwr() callback from platform data
mmc: tmio-mmc: Remove .get_cd() callback from platform data
mmc: sh_mobile_sdhi: Remove .set_pwr() callback from platform data
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device-mapper updates from Mike Snitzer:
"Add the ability to collect I/O statistics on user-defined regions of a
device-mapper device. This dm-stats code required the reintroduction
of a div64_u64_rem() helper, but as a separate method that doesn't
slow down div64_u64() -- especially on 32-bit systems.
Allow the error target to replace request-based DM devices (e.g.
multipath) in addition to bio-based DM devices.
Various other small code fixes and improvements to thin-provisioning,
DM cache and the DM ioctl interface"
* tag 'dm-3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm stripe: silence a couple sparse warnings
dm: add statistics support
dm thin: always return -ENOSPC if no_free_space is set
dm ioctl: cleanup error handling in table_load
dm ioctl: increase granularity of type_lock when loading table
dm ioctl: prevent rename to empty name or uuid
dm thin: set pool read-only if breaking_sharing fails block allocation
dm thin: prefix pool error messages with pool device name
dm: allow error target to replace bio-based and request-based targets
math64: New separate div64_u64_rem helper
dm space map: optimise sm_ll_dec and sm_ll_inc
dm btree: prefetch child nodes when walking tree for a dm_btree_del
dm btree: use pop_frame in dm_btree_del to cleanup code
dm cache: eliminate holes in cache structure
dm cache: fix stacking of geometry limits
dm thin: fix stacking of geometry limits
dm thin: add data block size limits to Documentation
dm cache: add data block size limits to code and Documentation
dm cache: document metadata device is exclussive to a cache
dm: stop using WQ_NON_REENTRANT
|
|
Pull md update from Neil Brown:
"Headline item is multithreading for RAID5 so that more IO/sec can be
supported on fast (SSD) devices. Also TILE-Gx SIMD suppor for RAID6
calculations and an assortment of bug fixes"
* tag 'md/3.12' of git://neil.brown.name/md:
raid5: only wakeup necessary threads
md/raid5: flush out all pending requests before proceeding with reshape.
md/raid5: use seqcount to protect access to shape in make_request.
raid5: sysfs entry to control worker thread number
raid5: offload stripe handle to workqueue
raid5: fix stripe release order
raid5: make release_stripe lockless
md: avoid deadlock when dirty buffers during md_stop.
md: Don't test all of mddev->flags at once.
md: Fix apparent cut-and-paste error in super_90_validate
raid6/test: replace echo -e with printf
RAID: add tilegx SIMD implementation of raid6
md: fix safe_mode buglet.
md: don't call md_allow_write in get_bitmap_file.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile 3 (of many) from Al Viro:
"Waiman's conversion of d_path() and bits related to it,
kern_path_mountpoint(), several cleanups and fixes (exportfs
one is -stable fodder, IMO).
There definitely will be more... ;-/"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
split read_seqretry_or_unlock(), convert d_walk() to resulting primitives
dcache: Translating dentry into pathname without taking rename_lock
autofs4 - fix device ioctl mount lookup
introduce kern_path_mountpoint()
rename user_path_umountat() to user_path_mountpoint_at()
take unlazy_walk() into umount_lookup_last()
Kill indirect include of file.h from eventfd.h, use fdget() in cgroup.c
prune_super(): sb->s_op is never NULL
exportfs: don't assume that ->iterate() won't feed us too long entries
afs: get rid of redundant ->d_name.len checks
|
|
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
Move the call to platform_get_resource adjacent to the call to
devm_ioremap_resource to make the connection between them more clear.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Remove unneeded error handling on the result of a call to
platform_get_resource_byname when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,e,e1;
expression ret != 0;
identifier l;
@@
res = platform_get_resource_byname(...);
- if (res == NULL) { ... \(goto l;\|return ret;\) }
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
When I moved the RCU walk termination into unlazy_walk(), I didn't copy
quite all of it: for the successful RCU termination we properly add the
necessary reference counts to our temporary copy of the root path, but
for the failure case we need to make sure that any temporary root path
information is cleared out (since it does _not_ have the proper
reference counts from the RCU lookup).
We could clean up this mess by just always dropping the temporary root
information, but Al points out that that would mean that a single lookup
through symlinks could see multiple different root entries if it races
with another thread doing chroot. Not that I think we should really
care (we had that before too, back before we had a copy of the root path
in the nameidata).
Al says he has a cunning plan. In the meantime, this is the minimal fix
for the problem, even if it's not all that pretty.
Reported-by: Mace Moneta <[email protected]>
Acked-by: Al Viro <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Remove unneeded error handling on the result of a call to
platform_get_resource when the value is passed to devm_ioremap_resource.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression pdev,res,n,e,e1;
expression ret != 0;
identifier l;
@@
- res = platform_get_resource(pdev, IORESOURCE_MEM, n);
... when != res
- if (res == NULL) { ... \(goto l;\|return ret;\) }
... when != res
+ res = platform_get_resource(pdev, IORESOURCE_MEM, n);
e = devm_ioremap_resource(e1, res);
// </smpl>
Signed-off-by: Julia Lawall <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Acked-by: Wan Zongshun <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
This patch adds the driver for the watchdog found in the Allwinner A10 and
A13 SoCs. It has DT-support and uses the new watchdog framework.
Signed-off-by: Carlo Caione <[email protected]>
Acked-by: Maxime Ripard <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
This patch removes the global variables in the driver file and
group them into a structure.
Signed-off-by: Leela Krishna Amudala <[email protected]>
Reviewed-by: Tomasz Figa <[email protected]>
Acked-by: Kukjin Kim <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Let the inode verifier do it's work by returning an error when we
fail to find correct magic numbers in an inode buffer.
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Mark Tinguely <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
|
|
Saw this on generic/270 after a DQALLOC transaction overrun
shutdown:
XFS: Assertion failed: !(bip->bli_item.li_flags & XFS_LI_IN_AIL), file: fs/xfs/xfs_buf_item.c, line: 952
.....
xfs_buf_item_relse+0x4f/0xd0
xfs_buf_item_unlock+0x1b4/0x1e0
xfs_trans_free_items+0x7d/0xb0
xfs_trans_cancel+0x13c/0x1b0
xfs_symlink+0x37e/0xa60
....
When a transaction abort occured.
If we are aborting a transaction and trigger this code path, then
the item may be dirty. If the item is dirty, then it may be in the
AIL. Hence if we are aborting, we need to check if the item is in
the AIL and remove it before freeing it.
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Mark Tinguely <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
|
|
We have quite a few places now where we do:
x = kmem_zalloc(large size)
if (!x)
x = kmem_zalloc_large(large size)
and do a similar dance when freeing the memory. kmem_free() already
does the correct freeing dance, and kmem_zalloc_large() is only ever
called in these constructs, so just factor it all into
kmem_zalloc_large() and kmem_free().
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Mark Tinguely <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
|
|
Ever since increasing the number of supported ACLs from 25 to as
many as can fit in an xattr, there have been reports of order 4
memory allocations failing in the ACL code. Fix it in the same way
we've fixed all the xattr read/write code that has the same problem.
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Mark Tinguely <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
|
|
When splitting the root of the da btree, we shuffled data between
buffers and the structures that track them. At one point, we copy
data and state from one buffer to another, including the ops
associated with the buffer. When we do this, we also need to copy
the buffer type associated with the buf log item so that the buffer
is logged correctly. If we don't do that, log recovery won't
recognise it and hence it won't recalculate the CRC on the buffer
after recovery. This leads to a directory block that can't be read
after recovery has run.
Found by inspection after finding the same problem with remote
symlink buffers.
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Ben Myers <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
|
|
The logging of a remote symlink block does not set the buffer type
being logged, and hence on recovery the type of buffer is not
recognised and hence CRCs are not calculated after replay. This
results in log recoery throwing:
XFS (vdc): Unknown buffer type 0
errors, and subsequent reads of the symlink failing CRC
verification. Found via fsstress + godown.
Reported by: Michael L. Semon <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
|
|
This is the recovery side of the btree block owner change operation
performed by swapext on CRC enabled filesystems. We detect that an
owner change is needed by the flag that has been placed on the inode
log format flag field. Because the inode recovery is being replayed
after the buffers that make up the BMBT in the given checkpoint, we
can walk all the buffers and directly modify them when we see the
flag set on an inode.
Because the inode can be relogged and hence present in multiple
chekpoints with the "change owner" flag set, we could do multiple
passes across the inode to do this change. While this isn't optimal,
we can't directly ignore the flag as there may be multiple
independent swap extent operations being replayed on the same inode
in different checkpoints so we can't ignore them.
Further, because the owner change operation uses ordered buffers, we
might have buffers that are newer on disk than the current
checkpoint and so already have the owner changed in them. Hence we
cannot just peek at a buffer in the tree and check that it has the
correct owner and assume that the change was completed.
So, for the moment just brute force the owner change every time we
see an inode with the flag set. Note that we have to be careful here
because the owner of the buffers may point to either the old owner
or the new owner. Currently the verifier can't verify the owner
directly, so there is no failure case here right now. If we verify
the owner exactly in future, then we'll have to take this into
account.
This was tested in terms of normal operation via xfstests - all of
the fsr tests now pass without failure. however, we really need to
modify xfs/227 to stress v3 inodes correctly to ensure we fully
cover this case for v5 filesystems.
In terms of recovery testing, I used a hacked version of xfs_fsr
that held the temp inode open for a few seconds before exiting so
that the filesystem could be shut down with an open owner change
recovery flags set on at least the temp inode. fsr leaves the temp
inode unlinked and in btree format, so this was necessary for the
owner change to be reliably replayed.
logprint confirmed the tmp inode in the log had the correct flag set:
INO: cnt:3 total:3 a:0x69e9e0 len:56 a:0x69ea20 len:176 a:0x69eae0 len:88
INODE: #regs:3 ino:0x44 flags:0x209 dsize:88
^^^^^
0x200 is set, indicating a data fork owner change needed to be
replayed on inode 0x44. A printk in the revoery code confirmed that
the inode change was recovered:
XFS (vdc): Mounting Filesystem
XFS (vdc): Starting recovery (logdev: internal)
recovering owner change ino 0x44
XFS (vdc): Version 5 superblock detected. This kernel L support enabled!
Use of these features in this kernel is at your own risk!
XFS (vdc): Ending recovery (logdev: internal)
The script used to test this was:
$ cat ./recovery-fsr.sh
#!/bin/bash
dev=/dev/vdc
mntpt=/mnt/scratch
testfile=$mntpt/testfile
umount $mntpt
mkfs.xfs -f -m crc=1 $dev
mount $dev $mntpt
chmod 777 $mntpt
for i in `seq 10000 -1 0`; do
xfs_io -f -d -c "pwrite $(($i * 4096)) 4096" $testfile > /dev/null 2>&1
done
xfs_bmap -vp $testfile |head -20
xfs_fsr -d -v $testfile &
sleep 10
/home/dave/src/xfstests-dev/src/godown -f $mntpt
wait
umount $mntpt
xfs_logprint -t $dev |tail -20
time mount $dev $mntpt
xfs_bmap -vp $testfile
umount $mntpt
$
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Mark Tinguely <[email protected]>
Signed-off-by: Ben Myers <[email protected]>
|
|
The operation status is decoded in decode_op_hdr.
Stop the print_overflow message that is always hit without this patch:
nfs: decode_free_stateid: prematurely hit end of receive buffer. Remaining
buffer length is 0 words.
Signed-off-by: Andy Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Paul Bolle <[email protected]>
Signed-off-by: Jesper Nilsson <[email protected]>
|
|
Copied from frv.
Reviewed-by: David Howells <[email protected]>
Signed-off-by: Jesper Nilsson <[email protected]>
|