aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2012-07-30rbd: preserve snapc->seq in rbd_header_set_snap()Alex Elder1-11/+7
In rbd_header_set_snap(), there is logic to make the snap context's seq field get set to a particular snapshot id, or 0 if there is no snapshot for the rbd image. This seems to be an artifact of how the current snapshot id for an rbd_dev was recorded before the rbd_dev->snap_id field began to be used for that purpose. There's no need to update the value of snapc->seq here any more, so stop doing it. Tidy up a few local variables in that function while we're at it. Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Josh Durgin <[email protected]>
2012-07-30rbd: don't use snapc->seq that wayAlex Elder1-14/+0
In what appears to be an artifact of a different way of encoding whether an rbd image maps a snapshot, __rbd_refresh_header() has code that arranges to update the seq value in an rbd image's snapshot context to point to the first entry in its snapshot array if that's where it was pointing initially. We now use rbd_dev->snap_id to record the snapshot id--using the special value CEPH_NOSNAP to indicate the rbd_dev is not mapping a snapshot at all. There is therefore no need to check for this case, nor to update the seq value, in __rbd_refresh_header(). Just preserve the seq value that rbd_read_header() provides (which, at the moment, is nothing). Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Josh Durgin <[email protected]>
2012-07-30rbd: send header version when notifyingJosh Durgin1-2/+5
Previously the original header version was sent. Now, we update it when the header changes. Signed-off-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30rbd: use reference counting for the snap contextJosh Durgin1-18/+18
This prevents a race between requests with a given snap context and header updates that free it. The osd client was already expecting the snap context to be reference counted, since it get()s it in ceph_osdc_build_request and put()s it when the request completes. Also remove the second down_read()/up_read() on header_rwsem in rbd_do_request, which wasn't actually preventing this race or protecting any other data. Signed-off-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30rbd: set image size when header is updatedJosh Durgin1-0/+1
The image may have been resized. Signed-off-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30rbd: expose the correct size of the device in sysfsJosh Durgin1-3/+8
If an image was mapped to a snapshot, the size of the head version would be shown. Protect capacity with header_rwsem, since it may change. Signed-off-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30rbd: only reset capacity when pointing to headJosh Durgin1-1/+6
Snapshots cannot be resized, and the new capacity of head should not be reflected by the snapshot. Signed-off-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30rbd: return errors for mapped but deleted snapshotJosh Durgin1-2/+30
When a snapshot is deleted, the OSD will return ENOENT when reading from it. This is normally interpreted as a hole by rbd, which will return zeroes. To minimize the time in which this can happen, stop requests early when we are notified that our snapshot no longer exists. [[email protected]: updated __rbd_init_snaps_header() logic] Signed-off-by: Josh Durgin <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30libceph: trivial fix for the incorrect debug outputJiaju Zhang1-1/+1
This is a trivial fix for the debug output, as it is inconsistent with the function name so may confuse people when debugging. [[email protected]: switched to use __func__] Signed-off-by: Jiaju Zhang <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30ceph: fix potential double freeAlan Cox1-0/+1
We re-run the loop but we don't re-set the attrs pointer back to NULL. Signed-off-by: Alan Cox <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30libceph: reset connection retry on successfully negotiationSage Weil1-0/+2
We exponentially back off when we encounter connection errors. If several errors accumulate, we will eventually wait ages before even trying to reconnect. Fix this by resetting the backoff counter after a successful negotiation/ connection with the remote node. Fixes ceph issue #2802. Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30libceph: protect ceph_con_open() with mutexSage Weil1-0/+2
Take the con mutex while we are initiating a ceph open. This is necessary because the may have previously been in use and then closed, which could result in a racing workqueue running con_work(). Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2012-07-30ceph: close old con before reopening on mds reconnectSage Weil1-0/+1
When we detect a mds session reset, close the old ceph_connection before reopening it. This ensures we clean up the old socket properly and keep the ceph_connection state correct. Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]>
2012-07-30libceph: (re)initialize bio_iter on start of message receiveSage Weil1-5/+6
Previously, we were opportunistically initializing the bio_iter if it appeared to be uninitialized in the middle of the read path. The problem is that a sequence like: - start reading message - initialize bio_iter - read half a message - messenger fault, reconnect - restart reading message - ** bio_iter now non-NULL, not reinitialized ** - read past end of bio, crash Instead, initialize the bio_iter unconditionally when we allocate/claim the message for read. Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]>
2012-07-30libceph: resubmit linger ops when pg mapping changesSage Weil1-5/+21
The linger op registration (i.e., watch) modifies the object state. As such, the OSD will reply with success if it has already applied without doing the associated side-effects (setting up the watch session state). If we lose the ACK and resubmit, we will see success but the watch will not be correctly registered and we won't get notifies. To fix this, always resubmit the linger op with a new tid. We accomplish this by re-registering as a linger (i.e., 'registered') if we are not yet registered. Then the second loop will treat this just like a normal case of re-registering. This mirrors a similar fix on the userland ceph.git, commit 5dd68b95, and ceph bug #2796. Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]>
2012-07-30libceph: fix mutex coverage for ceph_con_closeSage Weil1-1/+7
Hold the mutex while twiddling all of the state bits to avoid possible races. While we're here, make not of why we cannot close the socket directly. Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]>
2012-07-30libceph: report socket read/write error messageSage Weil1-2/+6
We need to set error_msg to something useful before calling ceph_fault(); do so here for try_{read,write}(). This is more informative than libceph: osd0 192.168.106.220:6801 (null) Signed-off-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]>
2012-07-30libceph: support crush tunablesSage Weil4-7/+58
The server side recently added support for tuning some magic crush variables. Decode these variables if they are present, or use the default values if they are not present. Corresponds to ceph.git commit 89af369c25f274fe62ef730e5e8aad0c54f1e5a5. Signed-off-by: caleb miles <[email protected]> Reviewed-by: Sage Weil <[email protected]> Reviewed-by: Alex Elder <[email protected]> Reviewed-by: Yehuda Sadeh <[email protected]>
2012-07-30[media] via-camera: pass correct format settings to sensorDaniel Drake1-1/+1
The code attempts to maintain a "user format" and a "sensor format", but in this case it looks like a typo is passing the user format down to the sensor. This was preventing display of video at anything other than 640x480. Signed-off-by: Daniel Drake <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30[media] rtl2832.c: minor cleanupHans-Frieder Vogt1-1/+1
The current formulation of the bw_params loop uses the counter j as an index for the first dimension of the bw_params array which is later incremented by the variable i. It is evaluated correctly only, because j is initialized to 0 at the beginning of the loop. I think that explicitly using the index 0 better reflects the intent of the expression. Signed-off-by: Hans-Frieder Vogt <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30[media] Add support for the IguanaWorks USB IR TransceiverSean Young3-0/+651
Signed-off-by: Sean Young <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30[media] Minor cleanups for MCE USBSean Young1-8/+5
Signed-off-by: Sean Young <[email protected]> Cc: Jarod Wilson <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30[media] drivers/media/dvb/siano/smscoreapi.c: use list_for_each_entryJulia Lawall1-23/+16
Use list_for_each_entry and perform some other induced simplifications. The semantic match that finds the opportunity for this reorganization is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ struct list_head *pos; struct list_head *head; statement S; @@ *for (pos = (head)->next; pos != (head); pos = pos->next) S // </smpl> Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30[media] Use a named union in struct v4l2_ioctl_infoHans Verkuil1-5/+5
Hi Mauro, struct v4l2_ioctl_info uses an anonymous union, which is initialized in the v4l2_ioctls table. Unfortunately gcc < 4.6 uses a non-standard syntax for that, so trying to compile v4l2-ioctl.c with an older gcc will fail. It is possible to work around this by testing the gcc version, but in this case it is easier to make the union named since it is used in only a few places. Signed-off-by: Hans Verkuil <[email protected]> Reported-by: Randy Dunlap <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30SUNRPC: return negative value in case rpcbind client creation errorStanislav Kinsbursky1-2/+2
Without this patch kernel will panic on LockD start, because lockd_up() checks lockd_up_net() result for negative value. From my pow it's better to return negative value from rpcbind routines instead of replacing all such checks like in lockd_up(). Signed-off-by: Stanislav Kinsbursky <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected] [>= 3.0]
2012-07-30[media] mceusb: Add Twisted Melon USB IDsMark Lord1-0/+7
Add USB identifiers for MCE compatible I/R transceivers from Twisted Melon. Signed-off-by: Mark Lord <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30[media] staging/media/solo6x10: use module_pci_driver macroDevendra Naga1-12/+1
the driver duplicates the module_pci_driver code, how? module_pci_driver is used for those drivers whose init and exit paths does only register and unregister to pci API and nothing else. so use the module_pci_driver macro instead Signed-off-by: Devendra Naga <[email protected]> Acked-by: Ismael Luceno <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30[media] staging/media/dt3155v4l: use module_pci_driver macroDevendra Naga1-14/+1
the driver duplicates the module_pci_driver code, remove the duplicate code and use the module_pci_driver macro. Signed-off-by: Devendra Naga <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2012-07-30Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds185-1042/+2662
Merge Andrew's first set of patches: "Non-MM patches: - lots of misc bits - tree-wide have_clk() cleanups - quite a lot of printk tweaks. I draw your attention to "printk: convert the format for KERN_<LEVEL> to a 2 byte pattern" which looks a bit scary. But afaict it's solid. - backlight updates - lib/ feature work (notably the addition and use of memweight()) - checkpatch updates - rtc updates - nilfs updates - fatfs updates (partial, still waiting for acks) - kdump, proc, fork, IPC, sysctl, taskstats, pps, etc - new fault-injection feature work" * Merge emailed patches from Andrew Morton <[email protected]>: (128 commits) drivers/misc/lkdtm.c: fix missing allocation failure check lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table() fault-injection: add tool to run command with failslab or fail_page_alloc fault-injection: add selftests for cpu and memory hotplug powerpc: pSeries reconfig notifier error injection module memory: memory notifier error injection module PM: PM notifier error injection module cpu: rewrite cpu-notifier-error-inject module fault-injection: notifier error injection c/r: fcntl: add F_GETOWNER_UIDS option resource: make sure requested range is included in the root range include/linux/aio.h: cpp->C conversions fs: cachefiles: add support for large files in filesystem caching pps: return PTR_ERR on error in device_create taskstats: check nla_reserve() return sysctl: suppress kmemleak messages ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION ipc: compat: use signed size_t types for msgsnd and msgrcv ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPC ipc: add COMPAT_SHMLBA support ...
2012-07-30drivers/misc/lkdtm.c: fix missing allocation failure checkAlan Cox1-0/+2
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44691 Reported-by: <[email protected]> Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30lib/scatterlist: do not re-write gfp_flags in __sg_alloc_table()Mandeep Singh Baines1-8/+0
We are seeing a lot of sg_alloc_table allocation failures using the new drm prime infrastructure. We isolated the cause to code in __sg_alloc_table that was re-writing the gfp_flags. There is a comment in the code that suggest that there is an assumption about the allocation coming from a memory pool. This was likely true when sg lists were primarily used for disk I/O. Signed-off-by: Mandeep Singh Baines <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Cong Wang <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Rob Clark <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Inki Dae <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Sonny Rao <[email protected]> Cc: Olof Johansson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30fault-injection: add tool to run command with failslab or fail_page_allocAkinobu Mita2-0/+246
This adds tools/testing/fault-injection/failcmd.sh to run a command while injecting slab/page allocation failures via fault injection. Example: Run a command "make -C tools/testing/selftests/ run_tests" with injecting slab allocation failure. # ./tools/testing/fault-injection/failcmd.sh \ -- make -C tools/testing/selftests/ run_tests Same as above except to specify 100 times failures at most instead of one time at most by default. # ./tools/testing/fault-injection/failcmd.sh --times=100 \ -- make -C tools/testing/selftests/ run_tests Same as above except to inject page allocation failure instead of slab allocation failure. # env FAILCMD_TYPE=fail_page_alloc \ ./tools/testing/fault-injection/failcmd.sh --times=100 \ -- make -C tools/testing/selftests/ run_tests Signed-off-by: Akinobu Mita <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30fault-injection: add selftests for cpu and memory hotplugAkinobu Mita5-1/+464
This adds two selftests * tools/testing/selftests/cpu-hotplug/on-off-test.sh is testing script for CPU hotplug 1. Online all hot-pluggable CPUs 2. Offline all hot-pluggable CPUs 3. Online all hot-pluggable CPUs again 4. Exit if cpu-notifier-error-inject.ko is not available 5. Offline all hot-pluggable CPUs in preparation for testing 6. Test CPU hot-add error handling by injecting notifier errors 7. Online all hot-pluggable CPUs in preparation for testing 8. Test CPU hot-remove error handling by injecting notifier errors * tools/testing/selftests/memory-hotplug/on-off-test.sh is doing the similar thing for memory hotplug. 1. Online all hot-pluggable memory 2. Offline 10% of hot-pluggable memory 3. Online all hot-pluggable memory again 4. Exit if memory-notifier-error-inject.ko is not available 5. Offline 10% of hot-pluggable memory in preparation for testing 6. Test memory hot-add error handling by injecting notifier errors 7. Online all hot-pluggable memory in preparation for testing 8. Test memory hot-remove error handling by injecting notifier errors Signed-off-by: Akinobu Mita <[email protected]> Suggested-by: Andrew Morton <[email protected]> Cc: Pavel Machek <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Greg KH <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Dave Jones <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30powerpc: pSeries reconfig notifier error injection moduleAkinobu Mita3-0/+70
This provides the ability to inject artifical errors to pSeries reconfig notifier chain callbacks. It is controlled through debugfs interface under /sys/kernel/debug/notifier-error-inject/pSeries-reconfig If the notifier call chain should be failed with some events notified, write the error code to "actions/<notifier event>/error". Signed-off-by: Akinobu Mita <[email protected]> Cc: Pavel Machek <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Greg KH <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Dave Jones <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30memory: memory notifier error injection moduleAkinobu Mita3-0/+72
This provides the ability to inject artifical errors to memory hotplug notifier chain callbacks. It is controlled through debugfs interface under /sys/kernel/debug/notifier-error-inject/memory If the notifier call chain should be failed with some events notified, write the error code to "actions/<notifier event>/error". Example: Inject memory hotplug offline error (-12 == -ENOMEM) # cd /sys/kernel/debug/notifier-error-inject/memory # echo -12 > actions/MEM_GOING_OFFLINE/error # echo offline > /sys/devices/system/memory/memoryXXX/state bash: echo: write error: Cannot allocate memory Signed-off-by: Akinobu Mita <[email protected]> Cc: Pavel Machek <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Greg KH <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Dave Jones <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30PM: PM notifier error injection moduleAkinobu Mita3-0/+74
This provides the ability to inject artifical errors to PM notifier chain callbacks. It is controlled through debugfs interface under /sys/kernel/debug/notifier-error-inject/pm Each of the files in "error" directory represents an event which can be failed and contains the error code. If the notifier call chain should be failed with some events notified, write the error code to the files. If the notifier call chain should be failed with some events notified, write the error code to "actions/<notifier event>/error". Example: Inject PM suspend error (-12 = -ENOMEM) # cd /sys/kernel/debug/notifier-error-inject/pm # echo -12 > actions/PM_SUSPEND_PREPARE/error # echo mem > /sys/power/state bash: echo: write error: Cannot allocate memory Signed-off-by: Akinobu Mita <[email protected]> Acked-by: "Rafael J. Wysocki" <[email protected]> Cc: Pavel Machek <[email protected]> Cc: Greg KH <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Dave Jones <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30cpu: rewrite cpu-notifier-error-inject moduleAkinobu Mita2-40/+39
Rewrite existing cpu-notifier-error-inject module to use debugfs based new framework. This change removes cpu_up_prepare_error and cpu_down_prepare_error module parameters which were used to specify error code to be injected. We could keep these module parameters for backward compatibility by module_param_cb but it seems overkill for this module. This provides the ability to inject artifical errors to CPU notifier chain callbacks. It is controlled through debugfs interface under /sys/kernel/debug/notifier-error-inject/cpu If the notifier call chain should be failed with some events notified, write the error code to "actions/<notifier event>/error". Example1: inject CPU offline error (-1 == -EPERM) # cd /sys/kernel/debug/notifier-error-inject/cpu # echo -1 > actions/CPU_DOWN_PREPARE/error # echo 0 > /sys/devices/system/cpu/cpu1/online bash: echo: write error: Operation not permitted Example2: inject CPU online error (-2 == -ENOENT) # cd /sys/kernel/debug/notifier-error-inject/cpu # echo -2 > actions/CPU_UP_PREPARE/error # echo 1 > /sys/devices/system/cpu/cpu1/online bash: echo: write error: No such file or directory Signed-off-by: Akinobu Mita <[email protected]> Cc: Pavel Machek <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Greg KH <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Dave Jones <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30fault-injection: notifier error injectionAkinobu Mita5-0/+247
This patchset provides kernel modules that can be used to test the error handling of notifier call chain failures by injecting artifical errors to the following notifier chain callbacks. * CPU notifier * PM notifier * memory hotplug notifier * powerpc pSeries reconfig notifier Example: Inject CPU offline error (-1 == -EPERM) # cd /sys/kernel/debug/notifier-error-inject/cpu # echo -1 > actions/CPU_DOWN_PREPARE/error # echo 0 > /sys/devices/system/cpu/cpu1/online bash: echo: write error: Operation not permitted The patchset also adds cpu and memory hotplug tests to tools/testing/selftests These tests first do simple online and offline test and then do fault injection tests if notifier error injection module is available. This patch: The notifier error injection provides the ability to inject artifical errors to specified notifier chain callbacks. It is useful to test the error handling of notifier call chain failures. This adds common basic functions to define which type of events can be fail and to initialize the debugfs interface to control what error code should be returned and which event should be failed. Signed-off-by: Akinobu Mita <[email protected]> Cc: Pavel Machek <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Greg KH <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Dave Jones <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30c/r: fcntl: add F_GETOWNER_UIDS optionCyrill Gorcunov3-0/+34
When we restore file descriptors we would like them to look exactly as they were at dumping time. With help of fcntl it's almost possible, the missing snippet is file owners UIDs. To be able to read their values the F_GETOWNER_UIDS is introduced. This option is valid iif CONFIG_CHECKPOINT_RESTORE is turned on, otherwise returning -EINVAL. Signed-off-by: Cyrill Gorcunov <[email protected]> Acked-by: "Eric W. Biederman" <[email protected]> Cc: "Serge E. Hallyn" <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Pavel Emelyanov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30resource: make sure requested range is included in the root rangeOctavian Purdila1-1/+23
When the requested range is outside of the root range the logic in __reserve_region_with_split will cause an infinite recursion which will overflow the stack as seen in the warning bellow. This particular stack overflow was caused by requesting the (100000000-107ffffff) range while the root range was (0-ffffffff). In this case __request_resource would return the whole root range as conflict range (i.e. 0-ffffffff). Then, the logic in __reserve_region_with_split would continue the recursion requesting the new range as (conflict->end+1, end) which incidentally in this case equals the originally requested range. This patch aborts looking for an usable range when the request does not intersect with the root range. When the request partially overlaps with the root range, it ajust the request to fall in the root range and then continues with the new request. When the request is modified or aborted errors and a stack trace are logged to allow catching the errors in the upper layers. [ 5.968374] WARNING: at kernel/sched.c:4129 sub_preempt_count+0x63/0x89() [ 5.975150] Modules linked in: [ 5.978184] Pid: 1, comm: swapper Not tainted 3.0.22-mid27-00004-gb72c817 #46 [ 5.985324] Call Trace: [ 5.987759] [<c1039dfc>] ? console_unlock+0x17b/0x18d [ 5.992891] [<c1039620>] warn_slowpath_common+0x48/0x5d [ 5.998194] [<c1031758>] ? sub_preempt_count+0x63/0x89 [ 6.003412] [<c1039644>] warn_slowpath_null+0xf/0x13 [ 6.008453] [<c1031758>] sub_preempt_count+0x63/0x89 [ 6.013499] [<c14d60c4>] _raw_spin_unlock+0x27/0x3f [ 6.018453] [<c10c6349>] add_partial+0x36/0x3b [ 6.022973] [<c10c7c0a>] deactivate_slab+0x96/0xb4 [ 6.027842] [<c14cf9d9>] __slab_alloc.isra.54.constprop.63+0x204/0x241 [ 6.034456] [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38 [ 6.039842] [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38 [ 6.045232] [<c10c7dc9>] kmem_cache_alloc_trace+0x51/0xb0 [ 6.050710] [<c103f78f>] ? kzalloc.constprop.5+0x29/0x38 [ 6.056100] [<c103f78f>] kzalloc.constprop.5+0x29/0x38 [ 6.061320] [<c17b45e9>] __reserve_region_with_split+0x1c/0xd1 [ 6.067230] [<c17b4693>] __reserve_region_with_split+0xc6/0xd1 ... [ 7.179057] [<c17b4693>] __reserve_region_with_split+0xc6/0xd1 [ 7.184970] [<c17b4779>] reserve_region_with_split+0x30/0x42 [ 7.190709] [<c17a8ebf>] e820_reserve_resources_late+0xd1/0xe9 [ 7.196623] [<c17c9526>] pcibios_resource_survey+0x23/0x2a [ 7.202184] [<c17cad8a>] pcibios_init+0x23/0x35 [ 7.206789] [<c17ca574>] pci_subsys_init+0x3f/0x44 [ 7.211659] [<c1002088>] do_one_initcall+0x72/0x122 [ 7.216615] [<c17ca535>] ? pci_legacy_init+0x3d/0x3d [ 7.221659] [<c17a27ff>] kernel_init+0xa6/0x118 [ 7.226265] [<c17a2759>] ? start_kernel+0x334/0x334 [ 7.231223] [<c14d7482>] kernel_thread_helper+0x6/0x10 Signed-off-by: Octavian Purdila <[email protected]> Signed-off-by: Ram Pai <[email protected]> Cc: Jesse Barnes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30include/linux/aio.h: cpp->C conversionsAndrew Morton1-18/+20
Convert init_sync_kiocb() from a nasty macro into a nice C function. The struct assignment trick takes care of zeroing all unmentioned fields. Shrinks fs/read_write.o's .text from 9857 bytes to 9714. Also demacroize is_sync_kiocb() and aio_ring_avail(). The latter fixes an arg-referenced-multiple-times hand grenade. Cc: Junxiao Bi <[email protected]> Cc: Mark Fasheh <[email protected]> Acked-by: Jeff Moyer <[email protected]> Cc: Joel Becker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30fs: cachefiles: add support for large files in filesystem cachingJustin Lecher1-1/+1
Support the caching of large files. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=31182 Signed-off-by: Justin Lecher <[email protected]> Signed-off-by: Suresh Jayaraman <[email protected]> Tested-by: Suresh Jayaraman <[email protected]> Acked-by: David Howells <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30pps: return PTR_ERR on error in device_createEmil Goode1-1/+3
We should return PTR_ERR if the call to the device_create function fails. Without this patch we instead return the value from a successful call to cdev_add if the call to device_create fails. Signed-off-by: Emil Goode <[email protected]> Acked-by: Devendra Naga <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Rodolfo Giometti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30taskstats: check nla_reserve() returnAlan Cox1-0/+5
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=44621 Reported-by: <[email protected]> Signed-off-by: Alan Cox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30sysctl: suppress kmemleak messagesSteven Rostedt1-1/+5
register_sysctl_table() is a strange function, as it makes internal allocations (a header) to register a sysctl_table. This header is a handle to the table that is created, and can be used to unregister the table. But if the table is permanent and never unregistered, the header acts the same as a static variable. Unfortunately, this allocation of memory that is never expected to be freed fools kmemleak in thinking that we have leaked memory. For those sysctl tables that are never unregistered, and have no pointer referencing them, kmemleak will think that these are memory leaks: unreferenced object 0xffff880079fb9d40 (size 192): comm "swapper/0", pid 0, jiffies 4294667316 (age 12614.152s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8146b590>] kmemleak_alloc+0x73/0x98 [<ffffffff8110a935>] kmemleak_alloc_recursive.constprop.42+0x16/0x18 [<ffffffff8110b852>] __kmalloc+0x107/0x153 [<ffffffff8116fa72>] kzalloc.constprop.8+0xe/0x10 [<ffffffff811703c9>] __register_sysctl_paths+0xe1/0x160 [<ffffffff81170463>] register_sysctl_paths+0x1b/0x1d [<ffffffff8117047d>] register_sysctl_table+0x18/0x1a [<ffffffff81afb0a1>] sysctl_init+0x10/0x14 [<ffffffff81b05a6f>] proc_sys_init+0x2f/0x31 [<ffffffff81b0584c>] proc_root_init+0xa5/0xa7 [<ffffffff81ae5b7e>] start_kernel+0x3d0/0x40a [<ffffffff81ae52a7>] x86_64_start_reservations+0xae/0xb2 [<ffffffff81ae53ad>] x86_64_start_kernel+0x102/0x111 [<ffffffffffffffff>] 0xffffffffffffffff The sysctl_base_table used by sysctl itself is one such instance that registers the table to never be unregistered. Use kmemleak_not_leak() to suppress the kmemleak false positive. Signed-off-by: Steven Rostedt <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSIONWill Deacon39-22/+29
Rather than #define the options manually in the architecture code, add Kconfig options for them and select them there instead. This also allows us to select the compat IPC version parsing automatically for platforms using the old compat IPC interface. Reported-by: Andrew Morton <[email protected]> Signed-off-by: Will Deacon <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30ipc: compat: use signed size_t types for msgsnd and msgrcvWill Deacon2-6/+6
The msgsnd and msgrcv system calls use size_t to represent the size of the message being transferred. POSIX states that values of msgsz greater than SSIZE_MAX cause the result to be implementation-defined. On Linux, this equates to returning -EINVAL if (long) msgsz < 0. For compat tasks where !CONFIG_ARCH_WANT_OLD_COMPAT_IPC and compat_size_t is smaller than size_t, negative size values passed from userspace will be interpreted as positive values by do_msg{rcv,snd} and will fail to exit early with -EINVAL. This patch changes the compat prototypes for msg{rcv,snd} so that the message size is represented as a compat_ssize_t, which we cast to the native ssize_t type for the core IPC code. Cc: Arnd Bergmann <[email protected]> Acked-by: Chris Metcalf <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30ipc: allow compat IPC version field parsing if !ARCH_WANT_OLD_COMPAT_IPCWill Deacon2-1/+2
Commit 48b25c43e6ee ("ipc: provide generic compat versions of IPC syscalls") added a new ARCH_WANT_OLD_COMPAT_IPC config option for architectures to select if their compat target requires the old IPC syscall interface. For architectures (such as AArch64) that do not require the internal calling conventions provided by this option, but have a compat target where the C library passes the IPC_64 flag explicitly, compat_ipc_parse_version no longer strips out the flag before calling the native system call implementation, resulting in unknown SHM/IPC commands and -EINVAL being returned to userspace. This patch separates the selection of the internal calling conventions for the IPC syscalls from the version parsing, allowing architectures to select __ARCH_WANT_COMPAT_IPC_PARSE_VERSION if they want to use version parsing whilst retaining the newer syscall calling conventions. Acked-by: Chris Metcalf <[email protected]> Cc: Arnd Bergmann <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30ipc: add COMPAT_SHMLBA supportWill Deacon6-11/+18
If the SHMLBA definition for a native task differs from the definition for a compat task, the do_shmat() function would need to handle both. This patch introduces COMPAT_SHMLBA, which is used by the compat shmat syscall when calling the ipc code and allows architectures such as AArch64 (where the native SHMLBA is 64k but the compat (AArch32) definition is 16k) to provide the correct semantics for compat IPC system calls. Cc: David S. Miller <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Arnd Bergmann <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-07-30kdump: append newline to the last lien of vmcoreinfo noteVivek Goyal1-1/+1
The last line of vmcoreinfo note does not end with \n. Parsing all the lines in note becomes easier if all lines end with \n instead of trying to special case the last line. I know at least one tool, vmcore-dmesg in kexec-tools tree which made the assumption that all lines end with \n. I think it is a good idea to fix it. Signed-off-by: Vivek Goyal <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Atsushi Kumagai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>