Age  Commit message  Author  Files  Lines
2017-07-07  libceph: respect RADOS_BACKOFF backoffs  Ilya Dryomov  8  -0/+737
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: make DEFINE_RB_* helpers more general  Ilya Dryomov  1  -12/+37
Initially for ceph_pg_mapping, ceph_spg_mapping and ceph_hobject_id, compared with ceph_pg_compare(), ceph_spg_compare() and hoid_compare() respectively. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: avoid unnecessary pi lookups in calc_target()  Ilya Dryomov  3  -30/+42
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: use target pi for calc_target() calculations  Ilya Dryomov  1  -1/+8
For luminous and beyond we are encoding the actual spgid, which requires operating with the correct pg_num, i.e. that of the target pool. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: always populate t->target_{oid,oloc} in calc_target()  Ilya Dryomov  1  -11/+4
need_check_tiering logic doesn't make a whole lot of sense. Drop it and apply tiering unconditionally on every calc_target() call instead. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: make sure need_resend targets reflect latest map  Ilya Dryomov  3  -9/+27
Otherwise we may miss events like PG splits, pool deletions, etc. when we get multiple incremental maps at once. Because check_pool_dne() can now be fed an unlinked request, finish_request() needed to be taught to handle unlinked requests. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: delete from need_resend_linger before check_linger_pool_dne()  Ilya Dryomov  1  -0/+1
When processing a map update consisting of multiple incrementals, we may end up running check_linger_pool_dne() on a lingering request that was previously added to need_resend_linger list. If it is concluded that the target pool doesn't exist, the request is killed off while still on need_resend_linger list, which leads to a crash on a NULL lreq->osd in kick_requests(): libceph: linger_id 18446462598732840961 pool does not exist BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: ceph_osdc_handle_map+0x4ae/0x870 Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: resend on PG splits if OSD has RESEND_ON_SPLIT  Ilya Dryomov  3  -11/+19
Note that ceph_osd_request_target fields are updated regardless of RESEND_ON_SPLIT. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: drop need_resend from calc_target()  Ilya Dryomov  1  -7/+11
Replace it with more fine-grained bools to separate updating ceph_osd_request_target fields and the decision to resend. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: MOSDOp v8 encoding (actual spgid + full hash)  Ilya Dryomov  3  -20/+154
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: ceph_connection_operations::reencode_message() method  Ilya Dryomov  2  -2/+7
Give upper layers a chance to reencode the message after the connection is negotiated and ->peer_features is set. OSD client will use this to support both luminous and pre-luminous OSDs (in a single cluster): the former need MOSDOp v8; the latter will continue to be sent MOSDOp v4. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: encode_{pgid,oloc}() helpers  Ilya Dryomov  1  -23/+27
Factor out encode_{pgid,oloc}() and use ceph_encode_string() for oid. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: introduce ceph_spg, ceph_pg_to_primary_shard()  Ilya Dryomov  5  -4/+60
Store both raw pgid and actual spgid in ceph_osd_request_target. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: new pi->last_force_request_resend  Ilya Dryomov  1  -0/+37
The old (v15) pi->last_force_request_resend has been repurposed to make pre-RESEND_ON_SPLIT clients that don't check for PG splits but do obey pi->last_force_request_resend resend on splits. See ceph.git commit 189ca7ec6420 ("mon/OSDMonitor: make pre-luminous clients resend ops on split"). Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: fold [l]req->last_force_resend into ceph_osd_request_target  Ilya Dryomov  2  -13/+12
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: support SERVER_JEWEL feature bits  Ilya Dryomov  2  -1/+9
Only MON_STATEFUL_SUB, really. MON_ROUTE_OSDMAP and OSDSUBOP_NO_SNAPCONTEXT are irrelevant. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: advertise support for OSD_POOLRESEND  Ilya Dryomov  1  -0/+1
The code has been in place since commit 63244fa123a7 ("libceph: introduce ceph_osd_request_target, calc_target()"), and, with the ceph_{oloc,oid}_copy() issue fixed in the previous commit, is now in working order. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: handle non-empty dest in ceph_{oloc,oid}_copy()  Ilya Dryomov  1  -4/+6
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: new features macros  Ilya Dryomov  1  -75/+167
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  libceph: remove ceph_sanitize_features() workaround  Ilya Dryomov  2  -23/+1
Reflects ceph.git commit ff1959282826ae6acd7134e1b1ede74ffd1cc04a. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: update ceph_dentry_info::lease_session when necessary  Yan, Zheng  1  -2/+7
The current code does not update ceph_dentry_info::lease_session once it is set. If the auth mds of the corresponding dentry changes, the dentry lease stays in an invalid state. Signed-off-by: "Yan, Zheng" <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: new mount option that specifies fscache uniquifier  Yan, Zheng  3  -21/+113
Current ceph uses FSID as the primary index key of fscache data. This allows ceph to retain cached data across remounts. But this causes problems (kernel oops, fscache does not support sharing data) when a filesystem gets mounted several times (with fscache enabled, with different mount options). The fix is to add a new mount option that specifies a uniquifier for fscache. Signed-off-by: "Yan, Zheng" <[email protected]> Acked-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: avoid accessing freeing inode in ceph_check_delayed_caps()  Yan, Zheng  1  -2/+9
Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: avoid invalid memory dereference in the middle of umount  Yan, Zheng  2  -4/+6
extra_mon_dispatch() and debugfs' foo_show functions dereference fsc->mdsc. We should clean up fsc->client->extra_mon_dispatch and debugfs before destroying fsc->mdsc. Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: getattr before read on ceph.* xattrs  Yan, Zheng  1  -0/+3
Previously we were returning values for quota, layout xattrs without any kind of update -- the user just got whatever happened to be in our cache. Clearly this extra round trip has a cost, but reads of these xattrs are fairly rare, happening on admin intervention rather than in normal operation. Link: http://tracker.ceph.com/issues/17939 Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: don't re-send interrupted flock request  Yan, Zheng  1  -1/+24
Don't re-send interrupted flock request in cases of mds failover and receiving request forward. Because corresponding 'lock intr' request may have been finished, it won't get re-sent. Link: http://tracker.ceph.com/issues/20170 Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: cleanup writepage_nounlock()  Yan, Zheng  1  -6/+6
Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: redirty page when writepage_nounlock() skips unwritable page  Yan, Zheng  1  -1/+2
Ceph needs to flush dirty pages in the order of the snap contexts they belong to. Dirty pages belonging to older snap contexts should be flushed earlier. If writepage_nounlock() cannot flush a page, it should redirty the page. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: remove useless page->mapping check in writepage_nounlock()  Yan, Zheng  1  -4/+0
Callers of writepage_nounlock() have already ensured non-null page->mapping. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  ceph: update the 'approaching max_size' code  Yan, Zheng  5  -11/+23
The old 'approaching max_size' code expects the MDS to set max_size to '2 * reported_size'. This is no longer true. The new code reports file size when half of the previous max_size increment has been used. Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
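The reporting rule described in this message can be sketched as a small predicate. This is a simplified user-space illustration in Python, not the kernel code: the function name `should_report_size` and its three parameters are hypothetical, and the real client takes more state into account.

```python
def should_report_size(size: int, max_size: int, reported_size: int) -> bool:
    """Return True when the client should report its file size to the MDS.

    Hypothetical helper: `size` is the current file size, `max_size` the
    limit granted by the MDS, `reported_size` the size last reported.
    """
    # The file has already reached the limit granted by the MDS.
    if size >= max_size:
        return True
    # Half of the previous max_size increment (max_size - reported_size)
    # has been used, i.e. size >= reported_size + increment / 2.
    if max_size > reported_size and 2 * size >= max_size + reported_size:
        return True
    return False
```

With `reported_size = 100` and `max_size = 200`, the increment is 100, so the predicate first becomes true at a size of 150.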
2017-07-07  ceph: re-request max size after importing caps  Yan, Zheng  1  -3/+8
The 'wanted max size' could have been sent to the inode's old auth mds; re-send it to the inode's new auth mds if necessary. Otherwise the write syscall may hang. Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07  drm/radeon: Fix eDP for single-display iMac10,1 (v2)  Mario Kleiner  1  -2/+11
The late 2009, 27 inch Apple iMac10,1 has an internal eDP display and an external Mini-Displayport output, driven by a DCE-3.2, RV730 Radeon Mobility HD-4670. The machine worked fine in a dual-display setup with eDP panel + externally connected HDMI or DVI-D digital display sink, connected via MiniDP to DVI or HDMI adapter. However, booting the machine single-display with only eDP panel results in a completely black display - even backlight powering off, as soon as the radeon modesetting driver loads. This patch fixes the single display eDP case by assigning encoders based on dig->linkb, similar to DCE-4+. While this should not be generally necessary (Alex: "...atom on normal boards should be able to handle any mapping."), Apple seems to use some special routing here. One remaining problem not solved by this patch is that an external Minidisplayport->DP sink still does not work on iMac10,1, whereas external DVI and HDMI sinks continue to work. The problem affects at least all tested kernels since Linux 3.13 - didn't test earlier kernels, so backporting to stable probably makes sense. v2: With the original patch from 2016, Alex was worried it would break other DCE3.2 systems. Use dmi_match() to apply this special encoder assignment only for the Apple iMac 10,1 from late 2009. Signed-off-by: Mario Kleiner <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Michel Dänzer <[email protected]> Cc: <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-07  ALSA: msnd: Optimize / harden DSP and MIDI loops  Takashi Iwai  2  -26/+27
The ISA msnd drivers fetch the ring-buffer head, tail and size values inside their loops. Such code is inefficient and fragile. This patch optimizes it and also adds a sanity check to avoid endless loops. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196131 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196133 Signed-off-by: Takashi Iwai <[email protected]>
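The general pattern behind this kind of hardening can be sketched as follows: take the head, tail and size values once, validate them, and bound the loop by the buffer size. This is a user-space Python illustration under stated assumptions, not the driver code; `drain_ring` and its arguments are hypothetical names.

```python
def drain_ring(buf, head, tail, size):
    """Collect entries from `head` up to `tail`, visiting at most `size` slots.

    Hypothetical helper: `head`, `tail` and `size` stand for values that were
    read once from device-shared memory before the loop starts.
    """
    # Sanity check: reject a size or index that lies outside the buffer.
    if size <= 0 or size > len(buf):
        return []
    if not (0 <= head < size and 0 <= tail < size):
        return []
    out = []
    # Bound the loop by `size` so that bad values can never spin forever.
    for _ in range(size):
        if head == tail:
            break
        out.append(buf[head])
        head = (head + 1) % size
    return out
```

Reading the three values once means a device that changes them mid-loop cannot alter the termination condition, and the explicit bound keeps the loop finite even if the validation were ever relaxed.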
2017-07-07  KVM: mark memory slots as rcu  Christian Borntraeger  2  -3/+5
We access the memslots array via srcu. Mark it as such and use the right access functions also for the freeing of memory slots. Found by sparse: ./include/linux/kvm_host.h:565:16: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Christian Borntraeger <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]>
2017-07-07  KVM: mark kvm->busses as rcu protected  Christian Borntraeger  3  -11/+22
Mark kvm->busses as rcu protected and use the correct access function everywhere. Found by sparse: virt/kvm/kvm_main.c:3490:15: error: incompatible types in comparison expression (different address spaces) virt/kvm/kvm_main.c:3509:15: error: incompatible types in comparison expression (different address spaces) virt/kvm/kvm_main.c:3561:15: error: incompatible types in comparison expression (different address spaces) virt/kvm/kvm_main.c:3644:15: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Christian Borntraeger <[email protected]>
2017-07-07  KVM: use rcu access function for irq routing  Christian Borntraeger  1  -1/+1
irq routing is rcu protected. Use the proper access functions. Found by sparse virt/kvm/irqchip.c:233:13: warning: incorrect type in assignment (different address spaces) virt/kvm/irqchip.c:233:13: expected struct kvm_irq_routing_table *old virt/kvm/irqchip.c:233:13: got struct kvm_irq_routing_table [noderef] <asn:4>*irq_routing Signed-off-by: Christian Borntraeger <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]>
2017-07-07  tracing: Attempt to record other information even if some fail  Joel Fernandes  1  -8/+24
In recent patches where we record comm and tgid at the same time, we skip continuing to record if any fail. Fix that by trying to record as many things as we can even if some couldn't be recorded. If any information isn't recorded, then we don't set trace_taskinfo_save as before. Link: http://lkml.kernel.org/r/[email protected] Cc: [email protected] Cc: Ingo Molnar <[email protected]> Signed-off-by: Joel Fernandes <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
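The "keep recording after a failure" idea can be illustrated with a short Python sketch. It is not the tracing code: `record_taskinfo` and the recorder callables are hypothetical, and what the caller does with the result is left open here.

```python
def record_taskinfo(recorders):
    """Run every recorder and report whether all of them succeeded.

    Hypothetical helper: each recorder is a callable returning True on
    success (for example, one for comm and one for tgid).
    """
    all_ok = True
    for record in recorders:
        # Do not return early: a failed recorder must not stop the rest.
        if not record():
            all_ok = False
    return all_ok
```

An early `return False` inside the loop would reproduce the behaviour the commit message describes as the problem, where one failure prevents the remaining information from being recorded.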
2017-07-07  tracing: Treat recording tgid for idle task as a success  Joel Fernandes  1  -1/+5
Currently we stop recording tgid for non-idle tasks when switching from/to the idle task, since we treat that as a record failure. Fix that by treating recording of tgid for the idle task as a success. Link: http://lkml.kernel.org/r/[email protected] Cc: [email protected] Cc: Ingo Molnar <[email protected]> Reported-by: Michael Sartain <[email protected]> Signed-off-by: Joel Fernandes <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-07-07  tracing: Treat recording comm for idle task as a success  Joel Fernandes  1  -1/+5
Currently we stop recording comm for non-idle tasks when switching from/to the idle task, since we treat that as a record failure. Fix that by treating recording of comm for the idle task as a success. Link: http://lkml.kernel.org/r/[email protected] Cc: [email protected] Cc: Ingo Molnar <[email protected]> Reported-by: Michael Sartain <[email protected]> Signed-off-by: Joel Fernandes <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-07-07  rtc: ds1307: remove ds1307_remove  Alexandre Belloni  1  -6/+0
ds1307_remove() is now empty, remove it. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: ds1307: use generic nvmem  Alexandre Belloni  1  -66/+22
Instead of adding a binary sysfs attribute from the driver (which suffers from a race condition as the attribute appears after the device), use the core to register an nvmem device. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: ds1307: switch to rtc_register_device  Alexandre Belloni  1  -2/+7
This removes a possible race condition and crash and allows for further improvement of the driver. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: rv8803: remove rv8803_remove  Alexandre Belloni  1  -6/+0
rv8803_remove() is now empty, remove it. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: rv8803: use generic nvmem support  Alexandre Belloni  1  -31/+20
Instead of adding a binary sysfs attribute from the driver (which suffers from a race condition as the attribute appears after the device), use the core to register an nvmem device. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: rv8803: switch to rtc_register_device  Alexandre Belloni  1  -6/+9
This removes a possible race condition and allows for further improvement of the driver. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: add generic nvmem support  Alexandre Belloni  7  -0/+143
Many RTCs have on-board non-volatile storage. It can be battery-backed RAM or an EEPROM. Use the nvmem subsystem to export it to both userspace and in-kernel consumers. This stays compatible with the previous (undocumented) ABI that used /sys/class/rtc/rtcx/device/nvram to export that memory, but warns about the deprecation. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: at91rm9200: remove race condition  Alexandre Belloni  1  -6/+8
While highly unlikely, it is possible to get an interrupt as soon as it is requested. In that case, at91_rtc_interrupt() will be called with rtc == NULL. Solve that by using devm_rtc_allocate_device/rtc_register_device. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: introduce new registration method  Alexandre Belloni  2  -0/+91
Introduce rtc_register_device() to register an already allocated and initialized struct rtc_device. It automatically sets up the owner, and the two-step allocation/registration makes it possible to remove race conditions in the IRQ handling of some drivers. It also allows the core to be extended properly without adding more arguments to rtc_device_register(). Signed-off-by: Alexandre Belloni <[email protected]>
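Why splitting allocation from registration removes the IRQ race can be shown with a toy model. The Python sketch below is only an illustration under stated assumptions: `allocate_device`, `register_device`, `Driver` and `probe_two_step` are hypothetical stand-ins for the allocate/register steps named in this log, not kernel APIs.

```python
class RtcDevice:
    """Toy stand-in for an RTC device object."""
    def __init__(self):
        self.registered = False
        self.events = 0


def allocate_device():
    # Step 1 (hypothetical): allocate and initialize, but do not register.
    return RtcDevice()


def register_device(rtc):
    # Step 2 (hypothetical): make the device visible to the rest of the system.
    rtc.registered = True


class Driver:
    def __init__(self):
        self.rtc = None

    def interrupt(self):
        # Dereferences self.rtc; this would fail if the handler ran
        # before the device pointer had been set.
        self.rtc.events += 1


def probe_two_step(irq_fires_immediately):
    drv = Driver()
    drv.rtc = allocate_device()      # device pointer is valid from here on
    if irq_fires_immediately:        # models an interrupt arriving right
        drv.interrupt()              # after the handler is requested
    register_device(drv.rtc)         # registration happens last
    return drv
```

Because the device object exists before the handler can run, an interrupt that fires as soon as it is requested finds a valid pointer instead of `None`.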
2017-07-07  rtc: class separate id allocation from registration  Alexandre Belloni  1  -19/+25
Create rtc_device_get_id to allocate the id for an RTC. Signed-off-by: Alexandre Belloni <[email protected]>
2017-07-07  rtc: class separate device allocation from registration  Alexandre Belloni  1  -26/+37
Create rtc_allocate_device to allocate memory for a struct rtc_device and initialize it. Signed-off-by: Alexandre Belloni <[email protected]>