aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2013-11-19drm/i915: Replicate BIOS eDP bpp clamping hack for hswDaniel Vetter1-0/+20
Haswell's DDI encoders have their own ->get_config callback and in commit c6cd2ee2d59111a07cd9199564c9bdcb2d11e5cf Author: Jani Nikula <[email protected]> Date: Mon Oct 21 10:52:07 2013 +0300 drm/i915/dp: workaround BIOS eDP bpp clamping issue we've forgotten to replicate this hack. So let's do it that. Note for backporters: The above commit and all it's depencies need to be backported first. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71049 Cc: [email protected] Tested-by: Gökçen Eraslan <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Signed-off-by: Daniel Vetter <[email protected]>
2013-11-19perf/trace: Properly use u64 to hold event_idVince Weaver1-1/+1
The 64-bit attr.config value for perf trace events was being copied into an "int" before doing a comparison, meaning the top 32 bits were being truncated. As far as I can tell this didn't cause any errors, but it did mean it was possible to create valid aliases for all the tracepoint ids which I don't think was intended. (For example, 0xffffffff00000018 and 0x18 both enable the same tracepoint). Signed-off-by: Vince Weaver <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1311151236100.11932@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <[email protected]>
2013-11-19perf: Remove fragile swevent hlist optimizationPeter Zijlstra1-8/+0
Currently we only allocate a single cpu hashtable for per-cpu swevents; do away with this optimization for it is fragile in the face of things like perf_pmu_migrate_context(). The easiest thing is to make sure all CPUs are consistent wrt state. Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-19ftrace, perf: Avoid infinite event generation loopPeter Zijlstra5-0/+44
Vince's perf-trinity fuzzer found yet another 'interesting' problem. When we sample the irq_work_exit tracepoint with period==1 (or PERF_SAMPLE_PERIOD) and we add an fasync SIGNAL handler we create an infinite event generation loop: ,-> <IPI> | irq_work_exit() -> | trace_irq_work_exit() -> | ... | __perf_event_overflow() -> (due to fasync) | irq_work_queue() -> (irq_work_list must be empty) '--------- arch_irq_work_raise() Similar things can happen due to regular poll() wakeups if we exceed the ring-buffer wakeup watermark, or have an event_limit. To avoid this, dis-allow sampling this particular tracepoint. In order to achieve this, create a special perf_perm function pointer for each event and call this (when set) on trying to create a tracepoint perf event. [ roasted: use expr... to allow for ',' in your expression ] Reported-by: Vince Weaver <[email protected]> Tested-by: Vince Weaver <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Dave Jones <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-19netfilter: ebt_ip6: fix source and destination matchingLuís Fernando Cornachioni Estrozi1-3/+5
This bug was introduced on commit 0898f99a2. This just recovers two checks that existed before as suggested by Bart De Schuymer. Signed-off-by: Luís Fernando Cornachioni Estrozi <[email protected]> Signed-off-by: Bart De Schuymer <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2013-11-19tick: Document tick_do_timer_cpuAndrew Morton1-0/+15
Taken straight from a tglx email ;) Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2013-11-19timer: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...)Joe Perches1-3/+2
Use the helper function instead of __GFP_ZERO. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2013-11-19NOHZ: Check for nohz active instead of nohz enabledThomas Gleixner2-14/+11
RCU and the fine grained idle time accounting functions check tick_nohz_enabled. But that variable is merily telling that NOHZ has been enabled in the config and not been disabled on the command line. But it does not tell anything about nohz being active. That's what all this should check for. Matthew reported, that the idle accounting on his old P1 machine showed bogus values, when he enabled NOHZ in the config and did not disable it on the kernel command line. The reason is that his machine uses (refined) jiffies as a clocksource which explains why the "fine" grained accounting went into lala land, because it depends on when the system goes and leaves idle relative to the jiffies increment. Provide a tick_nohz_active indicator and let RCU and the accounting code use this instead of tick_nohz_enable. Reported-and-tested-by: Matthew Whitehead <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected]
2013-11-19tools lib traceevent: Fix use of multiple options in processing fieldSteven Rostedt1-2/+21
Jiri Olsa reported that the scsi_dispatch_cmd_done event failed to parse with: Error: expected type 5 but read 4 Error: expected type 5 but read 4 The problem is with this part of the print_fmt: __print_symbolic(((REC->result) >> 24) & 0xff, ... The __print_symbolic() helper function's first parameter is the field to use to determine what symbol to print based on the value of the result. The parser can handle one operation, but it can not handle multiple operations ('>>' and '&'). Add code to process all operations for the field argument for __print_symbolic() as well as __print_flags(). Reported-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-19perf header: Fix possible memory leaks in process_group_desc()Namhyung Kim1-1/+1
After processing all group descriptors or encountering an error, it frees all descriptors. However, current logic can leak memory since it might not traverse all descriptors. Note that the 'i' can have different value than nr_groups when an error occurred and it's safe to call free(desc[i].name) for every desc since we already make it NULL when it's reused for group names. Signed-off-by: Namhyung Kim <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-19perf header: Fix bogus group nameNamhyung Kim1-1/+3
When processing event group descriptor in perf file header, we reuse an allocated group name but forgot to prevent it from freeing. Reported-by: Stephane Eranian <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-19perf tools: Tag thread comm as overridenFrederic Weisbecker1-6/+5
The problem is that when a thread overrides its default ":%pid" comm, we forget to tag the thread comm as overriden. Hence, this overriden comm is not inherited on future forks. Fix it. Signed-off-by: Frederic Weisbecker <[email protected]> Tested-by: David Ahern <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-19drm/i915: Do not enable package C8 on unsupported hardwareChris Wilson2-0/+16
If the hardware does not support package C8, then do not even schedule work to enable it. Thereby we can eliminate a bunch of dangerous work. Signed-off-by: Chris Wilson <[email protected]> Cc: Paulo Zanoni <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Daniel Vetter <[email protected]>
2013-11-19drm/i915: Hold pc8 lock around toggling pc8.gpu_idleChris Wilson1-2/+6
We need to hold the pc8 lock around toggling the value of gpu_idle. Signed-off-by: Chris Wilson <[email protected]> Cc: Paulo Zanoni <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Daniel Vetter <[email protected]>
2013-11-19ALSA: hda - Also enable mute/micmute LED control for "Lenovo dock" fixupDavid Henningsson1-0/+2
The docking station is a Thinkpad thing, so it makes sense to check for mute/micmute LEDs for that quirk type too. Signed-off-by: David Henningsson <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2013-11-19Revert "HID: wiimote: add LEGO-wiimote VID"Jiri Kosina3-6/+1
This reverts commit 86b84167d4e67372376a57ea9955c5d53dae232f as it introduced a VID/PID conflict with its original owner: hid-wiimote got hid:b0005g*v0000054Cp00000306 added but hid-sony already has this id for the PS3 Remote (and the ID is oficically assigned to Sony). Revert the commit to avoid hid-sony regression. David is working on a bluez patch to force proper ID on the wiimote. Reported-by: David Herrmann <[email protected]> Reported-by: Michel Kraus <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2013-11-19Merge tag 'kvm-arm-fixes-3.13-1' of ↵Gleb Natapov1-6/+28
git://git.linaro.org/people/cdall/linux-kvm-arm into next Fix percpu vmalloc allocations
2013-11-19HID: sony: Send FF commands in non-atomic contextSven Eckelmann1-11/+42
The ff_memless has a timer running which gets run in an atomic context and calls the play_effect callback. The callback function for sony uses the hid_output_raw_report (overwritten by sixaxis_usb_output_raw_report) function to handle differences in the control message format. It is not safe for an atomic context because it may sleep later in usb_start_wait_urb. This "scheduling while atomic" can cause the system to lock up. A workaround is to make the force feedback state update using work_queues and use the play_effect function only to enqueue the work item. Reported-by: Simon Wood <[email protected]> Reported-by: David Herrmann <[email protected]> Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Simon Wood <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2013-11-19ALSA: firewire-lib: include sound/asound.h to refer to snd_pcm_format_tTakashi Sakamoto1-0/+1
'snd_pcm_format_t' is used by amdtp_out_stream_set_pcm_format(). Currently, when just including amdtp.h, compiler cannot find this type because this type is defined in uapi/sound/asound.h and this header is not included by amdtp.h. Signed-off-by: Takashi Sakamoto <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2013-11-19md/raid5: Use conf->device_lock protect changing of multi-thread resources.majianpeng1-24/+39
When we change group_thread_cnt from sysfs entry, it can OOPS. The kernel messages are: [ 135.299021] BUG: unable to handle kernel NULL pointer dereference at (null) [ 135.299073] IP: [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440 [ 135.299107] PGD 0 [ 135.299122] Oops: 0000 [#1] SMP [ 135.299144] Modules linked in: netconsole e1000e ptp pps_core [ 135.299188] CPU: 3 PID: 2225 Comm: md0_raid5 Not tainted 3.12.0+ #24 [ 135.299214] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 11/09/2011 [ 135.299255] task: ffff8800b9638f80 ti: ffff8800b77a4000 task.ti: ffff8800b77a4000 [ 135.299283] RIP: 0010:[<ffffffff815188ab>] [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440 [ 135.299323] RSP: 0018:ffff8800b77a5c48 EFLAGS: 00010002 [ 135.299344] RAX: ffff880037bb5c70 RBX: 0000000000000000 RCX: 0000000000000008 [ 135.299371] RDX: ffff880037bb5cb8 RSI: 0000000000000001 RDI: ffff880037bb5c00 [ 135.299398] RBP: ffff8800b77a5d08 R08: 0000000000000001 R09: 0000000000000000 [ 135.299425] R10: ffff8800b77a5c98 R11: 00000000ffffffff R12: ffff880037bb5c00 [ 135.299452] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880037bb5c70 [ 135.299479] FS: 0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000 [ 135.299510] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 135.299532] CR2: 0000000000000000 CR3: 0000000001c0b000 CR4: 00000000000407e0 [ 135.299559] Stack: [ 135.299570] ffff8800b77a5c88 ffffffff8107383e ffff8800b77a5c88 ffff880037a64300 [ 135.299611] 000000000000ec08 ffff880037bb5cb8 ffff8800b77a5c98 ffffffffffffffd8 [ 135.299654] 000000000000ec08 ffff880037bb5c60 ffff8800b77a5c98 ffff8800b77a5c98 [ 135.299696] Call Trace: [ 135.299711] [<ffffffff8107383e>] ? __wake_up+0x4e/0x70 [ 135.299733] [<ffffffff81518f88>] raid5d+0x4c8/0x680 [ 135.299756] [<ffffffff817174ed>] ? schedule_timeout+0x15d/0x1f0 [ 135.299781] [<ffffffff81524c9f>] md_thread+0x11f/0x170 [ 135.299804] [<ffffffff81069cd0>] ? wake_up_bit+0x40/0x40 [ 135.299826] [<ffffffff81524b80>] ? md_rdev_init+0x110/0x110 [ 135.299850] [<ffffffff81069656>] kthread+0xc6/0xd0 [ 135.299871] [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70 [ 135.299899] [<ffffffff81722ffc>] ret_from_fork+0x7c/0xb0 [ 135.299923] [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70 [ 135.299951] Code: ff ff ff 0f 84 d7 fe ff ff e9 5c fe ff ff 66 90 41 8b b4 24 d8 01 00 00 45 31 ed 85 f6 0f 8e 7b fd ff ff 49 8b 9c 24 d0 01 00 00 <48> 3b 1b 49 89 dd 0f 85 67 fd ff ff 48 8d 43 28 31 d2 eb 17 90 [ 135.300005] RIP [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440 [ 135.300005] RSP <ffff8800b77a5c48> [ 135.300005] CR2: 0000000000000000 [ 135.300005] ---[ end trace 504854e5bb7562ed ]--- [ 135.300005] Kernel panic - not syncing: Fatal exception This is because raid5d() can be running when the multi-thread resources are changed via system. We see need to provide locking. mddev->device_lock is suitable, but we cannot simple call alloc_thread_groups under this lock as we cannot allocate memory while holding a spinlock. So change alloc_thread_groups() to allocate and return the data structures, then raid5_store_group_thread_cnt() can take the lock while updating the pointers to the data structures. This fixes a bug introduced in 3.12 and so is suitable for the 3.12.x stable series. Fixes: b721420e8719131896b009b11edbbd27 Cc: [email protected] (3.12) Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]> Reviewed-by: Shaohua Li <[email protected]>
2013-11-19md/raid5: Before freeing old multi-thread worker, it should flush them.majianpeng1-0/+3
When changing group_thread_cnt from sysfs entry, the kernel can oops. The kernel messages are: [ 740.961389] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 [ 740.961444] IP: [<ffffffff81062570>] process_one_work+0x30/0x500 [ 740.961476] PGD b9013067 PUD b651e067 PMD 0 [ 740.961503] Oops: 0000 [#1] SMP [ 740.961525] Modules linked in: netconsole e1000e ptp pps_core [ 740.961577] CPU: 0 PID: 3683 Comm: kworker/u8:5 Not tainted 3.12.0+ #23 [ 740.961602] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015 11/09/2011 [ 740.961646] task: ffff88013abe0000 ti: ffff88013a246000 task.ti: ffff88013a246000 [ 740.961673] RIP: 0010:[<ffffffff81062570>] [<ffffffff81062570>] process_one_work+0x30/0x500 [ 740.961708] RSP: 0018:ffff88013a247e08 EFLAGS: 00010086 [ 740.961730] RAX: ffff8800b912b400 RBX: ffff88013a61e680 RCX: ffff8800b912b400 [ 740.961757] RDX: ffff8800b912b600 RSI: ffff8800b912b600 RDI: ffff88013a61e680 [ 740.961782] RBP: ffff88013a247e48 R08: ffff88013a246000 R09: 000000000002c09d [ 740.961808] R10: 000000000000010f R11: 0000000000000000 R12: ffff88013b00cc00 [ 740.961833] R13: 0000000000000000 R14: ffff88013b00cf80 R15: ffff88013a61e6b0 [ 740.961861] FS: 0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000 [ 740.961893] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 740.962001] CR2: 00000000000000b8 CR3: 00000000b24fe000 CR4: 00000000000407f0 [ 740.962001] Stack: [ 740.962001] 0000000000000008 ffff8800b912b600 ffff88013b00cc00 ffff88013a61e680 [ 740.962001] ffff88013b00cc00 ffff88013b00cc18 ffff88013b00cf80 ffff88013a61e6b0 [ 740.962001] ffff88013a247eb8 ffffffff810639c6 0000000000012a80 ffff88013a247fd8 [ 740.962001] Call Trace: [ 740.962001] [<ffffffff810639c6>] worker_thread+0x206/0x3f0 [ 740.962001] [<ffffffff810637c0>] ? manage_workers+0x2c0/0x2c0 [ 740.962001] [<ffffffff81069656>] kthread+0xc6/0xd0 [ 740.962001] [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70 [ 740.962001] [<ffffffff81722ffc>] ret_from_fork+0x7c/0xb0 [ 740.962001] [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70 [ 740.962001] Code: 89 e5 41 57 41 56 41 55 45 31 ed 41 54 53 48 89 fb 48 83 ec 18 48 8b 06 4c 8b 67 48 48 89 c1 30 c9 a8 04 4c 0f 45 e9 80 7f 58 00 <49> 8b 45 08 44 8b b0 00 01 00 00 78 0c 41 f6 44 24 10 04 0f 84 [ 740.962001] RIP [<ffffffff81062570>] process_one_work+0x30/0x500 [ 740.962001] RSP <ffff88013a247e08> [ 740.962001] CR2: 0000000000000008 [ 740.962001] ---[ end trace 39181460000748de ]--- [ 740.962001] Kernel panic - not syncing: Fatal exception This can happen if there are some stripes left, fewer than MAX_STRIPE_BATCH. A worker is queued to handle them. But before calling raid5_do_work, raid5d handles those stripes making conf->active_stripe = 0. So mddev_suspend() can return. We might then free old worker resources before the queued raid5_do_work() handled them. When it runs, it crashes. raid5d() raid5_store_group_thread_cnt() queue_work mddev_suspend() handle_strips active_stripe=0 free(old worker resources) process_one_work raid5_do_work To avoid this, we should only flush the worker resources before freeing them. This fixes a bug introduced in 3.12 so is suitable for the 3.12.x stable series. Cc: [email protected] (3.12) Fixes: b721420e8719131896b009b11edbbd27 Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]> Reviewed-by: Shaohua Li <[email protected]>
2013-11-19md/raid5: For stripe with R5_ReadNoMerge, we replace REQ_FLUSH with REQ_NOMERGE.majianpeng1-1/+1
For R5_ReadNoMerge,it mean this bio can't merge with other bios or request.It used REQ_FLUSH to achieve this. But REQ_NOMERGE can do the same work. Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19UAPI: include <asm/byteorder.h> in linux/raid/md_p.hAurelien Jarno1-0/+1
linux/raid/md_p.h is using conditionals depending on endianess and fails with an error if neither of __BIG_ENDIAN, __LITTLE_ENDIAN or __BYTE_ORDER are defined, but it doesn't include any header which can define these constants. This make this header unusable alone. This patch adds a #include <asm/byteorder.h> at the beginning of this header to make it usable alone. This is needed to compile klibc on MIPS. Signed-off-by: Aurelien Jarno <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19raid1: Rewrite the implementation of iobarrier.majianpeng2-13/+129
There is an iobarrier in raid1 because of contention between normal IO and resync IO. It suspends all normal IO when resync/recovery happens. However if normal IO is out side the resync window, there is no contention. So this patch changes the barrier mechanism to only block IO that could contend with the resync that is currently happening. We partition the whole space into five parts. |---------|-----------|------------|----------------|-------| start next_resync start_next_window end_window start + RESYNC_WINDOW = next_resync next_resync + NEXT_NORMALIO_DISTANCE = start_next_window start_next_window + NEXT_NORMALIO_DISTANCE = end_window Firstly we introduce some concepts: 1 - RESYNC_WINDOW: For resync, there are 32 resync requests at most at the same time. A sync request is RESYNC_BLOCK_SIZE(64*1024). So the RESYNC_WINDOW is 32 * RESYNC_BLOCK_SIZE, that is 2MB. 2 - NEXT_NORMALIO_DISTANCE: the distance between next_resync and start_next_window. It also indicates the distance between start_next_window and end_window. It is currently 3 * RESYNC_WINDOW_SIZE but could be tuned if this turned out not to be optimal. 3 - next_resync: the next sector at which we will do sync IO. 4 - start: a position which is at most RESYNC_WINDOW before next_resync. 5 - start_next_window: a position which is NEXT_NORMALIO_DISTANCE beyond next_resync. Normal-io after this position doesn't need to wait for resync-io to complete. 6 - end_window: a position which is 2 * NEXT_NORMALIO_DISTANCE beyond next_resync. This also doesn't need to wait, but is counted differently. 7 - current_window_requests: the count of normalIO between start_next_window and end_window. 8 - next_window_requests: the count of normalIO after end_window. NormalIO will be partitioned into four types: NormIO1: the end sector of bio is smaller or equal the start NormIO2: the start sector of bio larger or equal to end_window NormIO3: the start sector of bio larger or equal to start_next_window. NormIO4: the location between start_next_window and end_window |--------|-----------|--------------------|----------------|-------------| | start | next_resync | start_next_window | end_window | NormIO1 NormIO4 NormIO4 NormIO3 NormIO2 For NormIO1, we don't need any io barrier. For NormIO4, we used a similar approach to the original iobarrier mechanism. The normalIO and resyncIO must be kept separate. For NormIO2/3, we add two fields to struct r1conf: "current_window_requests" and "next_window_requests". They indicate the count of active requests in the two window. For these, we don't wait for resync io to complete. For resync action, if there are NormIO4s, we must wait for it. If not, we can proceed. But if resync action reaches start_next_window and current_window_requests > 0 (that is there are NormIO3s), we must wait until the current_window_requests becomes zero. When current_window_requests becomes zero, start_next_window also moves forward. Then current_window_requests will replaced by next_window_requests. There is a problem which when and how to change from NormIO2 to NormIO3. Only then can sync action progress. We add a field in struct r1conf "start_next_window". A: if start_next_window == MaxSector, it means there are no NormIO2/3. So start_next_window = next_resync + NEXT_NORMALIO_DISTANCE B: if current_window_requests == 0 && next_window_requests != 0, it means start_next_window move to end_window There is another problem which how to differentiate between old NormIO2(now it is NormIO3) and NormIO2. For example, there are many bios which are NormIO2 and a bio which is NormIO3. NormIO3 firstly completed, so the bios of NormIO2 became NormIO3. We add a field in struct r1bio "start_next_window". This is used to record the position conf->start_next_window when the call to wait_barrier() is made in make_request(). In allow_barrier(), we check the conf->start_next_window. If r1bio->stat_next_window == conf->start_next_window, it means there is no transition between NormIO2 and NormIO3. If r1bio->start_next_window != conf->start_next_window, it mean there was a transition between NormIO2 and NormIO3. There can only have been one transition. So it only means the bio is old NormIO2. For one bio, there may be many r1bio's. So we make sure all the r1bio->start_next_window are the same value. If we met blocked_dev in make_request(), it must call allow_barrier and wait_barrier. So the former and the later value of conf->start_next_window will be change. If there are many r1bio's with differnet start_next_window, for the relevant bio, it depend on the last value of r1bio. It will cause error. To avoid this, we must wait for previous r1bios to complete. Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19raid1: Add some macros to make code clearly.majianpeng1-4/+4
In a subsequent patch, we'll use some const parameters. Using macros will make the code clearly. Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19raid1: Replace raise_barrier/lower_barrier with freeze_array/unfreeze_array ↵majianpeng1-5/+6
when reconfiguring the array. We used to use raise_barrier to suspend normal IO while we reconfigure the array. However raise_barrier will soon only suspend some normal IO, not all. So we need something else. Change it to use freeze_array. But freeze_array not only suspends normal io, it also suspends resync io. For the place where call raise_barrier for reconfigure, it isn't a problem. Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19raid1: Add a field array_frozen to indicate whether raid in freeze state.majianpeng2-8/+8
Because the following patch will rewrite the content between normal IO and resync IO. So we used a parameter to indicate whether raid is in freeze array. Signed-off-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19md: Convert use of typedef ctl_table to struct ctl_tableJoe Perches1-3/+3
This typedef is unnecessary and should just be removed. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19md/raid5: avoid deadlock when raid5 array has unack badblocks during ↵NeilBrown1-19/+49
md_stop_writes. When raid5 recovery hits a fresh badblock, this badblock will flagged as unack badblock until md_update_sb() is called. But md_stop will take reconfig lock which means raid5d can't call md_update_sb() in md_check_recovery(), the badblock will always be unack, so raid5d thread enters an infinite loop and md_stop_write() can never stop sync_thread. This causes deadlock. To solve this, when STOP_ARRAY ioctl is issued and sync_thread is running, we need set md->recovery FROZEN and INTR flags and wait for sync_thread to stop before we (re)take reconfig lock. This requires that raid5 reshape_request notices MD_RECOVERY_INTR (which it probably should have noticed anyway) and stops waiting for a metadata update in that case. Reported-by: Jianpeng Ma <[email protected]> Reported-by: Bian Yu <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19md: use MD_RECOVERY_INTR instead of kthread_should_stop in resync thread.NeilBrown3-31/+34
We currently use kthread_should_stop() in various places in the sync/reshape code to abort early. However some places set MD_RECOVERY_INTR but don't immediately call md_reap_sync_thread() (and we will shortly get another one). When this happens we are relying on md_check_recovery() to reap the thread and that only happen when it finishes normally. So MD_RECOVERY_INTR must lead to a normal finish without the kthread_should_stop() test. So replace all relevant tests, and be more careful when the thread is interrupted not to acknowledge that latest step in a reshape as it may not be fully committed yet. Also add a test on MD_RECOVERY_INTR in the 'is_mddev_idle' loop so we don't wait have to wait for the speed to drop before we can abort. Signed-off-by: NeilBrown <[email protected]>
2013-11-19md: fix some places where mddev_lock return value is not checked.NeilBrown1-5/+13
Sometimes we need to lock and mddev and cannot cope with failure due to interrupt. In these cases we should use mutex_lock, not mutex_lock_interruptible. Signed-off-by: NeilBrown <[email protected]>
2013-11-19raid5: Retry R5_ReadNoMerge flag when hit a read error.Bian Yu1-0/+3
Because of block layer merge, one bio fails will cause other bios which belongs to the same request fails, so raid5_end_read_request will record all these bios as badblocks. If retry request with R5_ReadNoMerge flag to avoid bios merge, badblocks can only record sector which is bad exactly. test: hdparm --yes-i-know-what-i-am-doing --make-bad-sector 300000 /dev/sdb mdadm -C /dev/md0 -l5 -n3 /dev/sd[bcd] --assume-clean mdadm /dev/md0 -f /dev/sdd mdadm /dev/md0 -r /dev/sdd mdadm --zero-superblock /dev/sdd mdadm /dev/md0 -a /dev/sdd 1. Without this patch: cat /sys/block/md0/md/rd*/bad_blocks 299776 256 299776 256 2. With this patch: cat /sys/block/md0/md/rd*/bad_blocks 300000 8 300000 8 Signed-off-by: Bian Yu <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-19raid5: relieve lock contention in get_active_stripe()Shaohua Li2-1/+8
track empty inactive list count, so md_raid5_congested() can use it to make decision. Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: NeilBrown <[email protected]>
2013-11-18seq_file: always clear m->count when we free m->bufAl Viro1-1/+2
Once we'd freed m->buf, m->count should become zero - we have no valid contents reachable via m->buf. Reported-by: Charley (Hao Chuan) Chu <[email protected]> Signed-off-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-11-19ARM: 7892/1: Fix warning for V7M buildsOlof Johansson1-1/+1
Fixes a harmless warning when building for V7M (!MMU): arch/arm/kernel/traps.c:859:123: warning: 'kuser_init' defined but not used [-Wunused-function] By making the stub static inline instead of just static. Fixes: f6f91b0d9fd9 ('ARM: allow kuser helpers to be removed from the vector page') Signed-off-by: Olof Johansson <[email protected]> Signed-off-by: Russell King <[email protected]>
2013-11-18ARM: OMAP2+: Remove legacy omap4_twl6030_hsmmc_initTony Lindgren2-58/+0
This is no longer used, omap4 is device tree based now. Signed-off-by: Tony Lindgren <[email protected]>
2013-11-18ARM: OMAP2+: Remove legacy mux code for display.cTony Lindgren1-78/+0
This is all omap4 specific, which is device tree based nowadays and should use pinctrl-single instead. Signed-off-by: Tony Lindgren <[email protected]>
2013-11-19Merge branch 'pm-sleep'Rafael J. Wysocki2-1/+3
* pm-sleep: PM / Hibernate: Do not crash kernel in free_basic_memory_bitmaps()
2013-11-19Merge branch 'pm-cpufreq'Rafael J. Wysocki3-4/+4
* pm-cpufreq: cpufreq: governor: Remove fossil comment in the cpufreq_governor_dbs() cpufreq: OMAP: Fix compilation error 'r & ret undeclared' cpufreq: conservative: set requested_freq to policy max when it is over policy max
2013-11-19Merge branch 'pm-runtime'Rafael J. Wysocki2-8/+9
* pm-runtime: PM / Runtime: Fix error path for prepare PM / Runtime: Update documentation around probe|remove|suspend
2013-11-19Merge branch 'pm-tools'Rafael J. Wysocki2-56/+143
* pm-tools: tools / power turbostat: Support Silvermont
2013-11-19Merge branch 'pm-cpuidle'Rafael J. Wysocki1-1/+23
* pm-cpuidle: intel_idle: Support Intel Atom Processor C2000 Product Family
2013-11-19Merge branch 'acpi-video'Rafael J. Wysocki1-75/+12
* acpi-video: ACPI / video: clean up DMI table for initial black screen problem
2013-11-19Merge branch 'acpi-ec'Rafael J. Wysocki1-1/+2
* acpi-ec: ACPI / EC: Ensure lock is acquired before accessing ec struct members
2013-11-19Merge branch 'acpi-hotplug'Rafael J. Wysocki2-10/+5
* acpi-hotplug: ACPI / scan: Set flags.match_driver in acpi_bus_scan_fixed() ACPI / PCI root: Clear driver_data before failing enumeration ACPI / hotplug: Fix PCI host bridge hot removal ACPI / hotplug: Fix acpi_bus_get_device() return value check
2013-11-19ACPI / scan: Set flags.match_driver in acpi_bus_scan_fixed()Rafael J. Wysocki1-0/+2
Before commit 6931007cc90b (ACPI / scan: Start matching drivers after trying scan handlers) the match_driver flag for all devices was set in acpi_add_single_object(), but now it is set by acpi_bus_device_attach() which is not called for the "fixed" devices added by acpi_bus_scan_fixed(). This means that flags.match_driver is never set for those devices now, so make acpi_bus_scan_fixed() set it before calling device_attach(). Fixes: 6931007cc90b (ACPI / scan: Start matching drivers after trying scan handlers) Signed-off-by: Rafael J. Wysocki <[email protected]>
2013-11-19ACPI / PCI root: Clear driver_data before failing enumerationRafael J. Wysocki1-0/+1
If a PCI host bridge cannot be enumerated due to an error in pci_acpi_scan_root(), its ACPI device object's driver_data field has to be cleared by acpi_pci_root_add() before freeing the object pointed to by that field, or some later acpi_pci_find_root() checks that should fail may succeed and cause quite a bit of confusion to ensue. Fix acpi_pci_root_add() to clear device->driver_data before returning an error code as appropriate. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Toshi Kani <[email protected]>
2013-11-19ACPI / hotplug: Fix PCI host bridge hot removalRafael J. Wysocki1-8/+1
Since the PCI host bridge scan handler does not set hotplug.enabled, the check of it in acpi_bus_device_eject() effectively prevents the root bridge hot removal from working after commit a3b1b1ef78cd (ACPI / hotplug: Merge device hot-removal routines). However, that check is not necessary, because the other acpi_bus_device_eject() users, acpi_hotplug_notify_cb and acpi_eject_store(), do the same check by themselves before executing that function. For this reason, remove the scan handler check from acpi_bus_device_eject() to make PCI hot bridge hot removal work again. Fixes: a3b1b1ef78cd (ACPI / hotplug: Merge device hot-removal routines) Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Toshi Kani <[email protected]>
2013-11-19ACPI / hotplug: Fix acpi_bus_get_device() return value checkRafael J. Wysocki1-2/+1
Since acpi_bus_get_device() returns a plain int and not acpi_status, ACPI_FAILURE() should not be used for checking its return value. Fix that. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Toshi Kani <[email protected]>
2013-11-18Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds103-407/+905
Pull watchdog changes from Wim Van Sebroeck: - addition of MOXA ART watchdog driver (moxart_wdt) - addition of CSR SiRFprimaII and SiRFatlasVI watchdog driver (sirfsoc_wdt) - addition of ralink watchdog driver (rt2880_wdt) - various fixes and cleanups (__user annotation, ioctl return codes, removal of redundant of_match_ptr, removal of unnecessary amba_set_drvdata(), use allocated buffer for usb_control_msg, ...) - removal of MODULE_ALIAS_MISCDEV statements - watchdog related DT bindings - first set of improvements on the w83627hf_wdt driver * git://www.linux-watchdog.org/linux-watchdog: (26 commits) watchdog: w83627hf: Use helper functions to access superio registers watchdog: w83627hf: Enable watchdog device only if not already enabled watchdog: w83627hf: Enable watchdog only once watchdog: w83627hf: Convert to watchdog infrastructure watchdog: omap_wdt: raw read and write endian fix watchdog: sirf: don't depend on dummy value of CLOCK_TICK_RATE watchdog: pcwd_usb: overflow in usb_pcwd_send_command() watchdog: rt2880_wdt: fix return value check in rt288x_wdt_probe() watchdog: watchdog_core: Fix a trivial typo watchdog: dw: Enable OF support for DW watchdog timer watchdog: Get rid of MODULE_ALIAS_MISCDEV statements watchdog: ts72xx_wdt: Propagate return value from timeout_to_regval watchdog: pcwd_usb: Use allocated buffer for usb_control_msg watchdog: sp805_wdt: Remove unnecessary amba_set_drvdata() watchdog: sirf: add watchdog driver of CSR SiRFprimaII and SiRFatlasVI watchdog: Remove redundant of_match_ptr watchdog: ts72xx_wdt: cleanup return codes in ioctl documentation/devicetree: Move DT bindings from gpio to watchdog watchdog: add ralink watchdog driver watchdog: Add MOXA ART watchdog driver ...