aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-07-07libceph: ceph_decode_skip_* helpersIlya Dryomov2-22/+63
Some of these won't be as efficient as they could be (e.g. ceph_decode_skip_set(... 32 ...) could advance by len * sizeof(u32) once instead of advancing by sizeof(u32) len times), but that's fine and not worth a bunch of extra macro code. Replace skip_name_map() with ceph_decode_skip_map as an example. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: kill __{insert,lookup,remove}_pg_mapping()Ilya Dryomov1-72/+15
Switch to DEFINE_RB_FUNCS2-generated {insert,lookup,erase}_pg_mapping(). Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: introduce and switch to decode_pg_mapping()Ilya Dryomov1-67/+83
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: don't pass pgid by valueIlya Dryomov1-10/+10
Make __{lookup,remove}_pg_mapping() look like their ceph_spg_mapping counterparts: take const struct ceph_pg *. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: respect RADOS_BACKOFF backoffsIlya Dryomov8-0/+737
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: make DEFINE_RB_* helpers more generalIlya Dryomov1-12/+37
Initially for ceph_pg_mapping, ceph_spg_mapping and ceph_hobject_id, compared with ceph_pg_compare(), ceph_spg_compare() and hoid_compare() respectively. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: avoid unnecessary pi lookups in calc_target()Ilya Dryomov3-30/+42
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: use target pi for calc_target() calculationsIlya Dryomov1-1/+8
For luminous and beyond we are encoding the actual spgid, which requires operating with the correct pg_num, i.e. that of the target pool. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: always populate t->target_{oid,oloc} in calc_target()Ilya Dryomov1-11/+4
need_check_tiering logic doesn't make a whole lot of sense. Drop it and apply tiering unconditionally on every calc_target() call instead. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: make sure need_resend targets reflect latest mapIlya Dryomov3-9/+27
Otherwise we may miss events like PG splits, pool deletions, etc when we get multiple incremental maps at once. Because check_pool_dne() can now be fed an unlinked request, finish_request() needed to be taught to handle unlinked requests. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: delete from need_resend_linger before check_linger_pool_dne()Ilya Dryomov1-0/+1
When processing a map update consisting of multiple incrementals, we may end up running check_linger_pool_dne() on a lingering request that was previously added to need_resend_linger list. If it is concluded that the target pool doesn't exist, the request is killed off while still on need_resend_linger list, which leads to a crash on a NULL lreq->osd in kick_requests(): libceph: linger_id 18446462598732840961 pool does not exist BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: ceph_osdc_handle_map+0x4ae/0x870 Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: resend on PG splits if OSD has RESEND_ON_SPLITIlya Dryomov3-11/+19
Note that ceph_osd_request_target fields are updated regardless of RESEND_ON_SPLIT. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: drop need_resend from calc_target()Ilya Dryomov1-7/+11
Replace it with more fine-grained bools to separate updating ceph_osd_request_target fields and the decision to resend. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: MOSDOp v8 encoding (actual spgid + full hash)Ilya Dryomov3-20/+154
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: ceph_connection_operations::reencode_message() methodIlya Dryomov2-2/+7
Give upper layers a chance to reencode the message after the connection is negotiated and ->peer_features is set. OSD client will use this to support both luminous and pre-luminous OSDs (in a single cluster): the former need MOSDOp v8; the latter will continue to be sent MOSDOp v4. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: encode_{pgid,oloc}() helpersIlya Dryomov1-23/+27
Factor out encode_{pgid,oloc}() and use ceph_encode_string() for oid. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: introduce ceph_spg, ceph_pg_to_primary_shard()Ilya Dryomov5-4/+60
Store both raw pgid and actual spgid in ceph_osd_request_target. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: new pi->last_force_request_resendIlya Dryomov1-0/+37
The old (v15) pi->last_force_request_resend has been repurposed to make pre-RESEND_ON_SPLIT clients that don't check for PG splits but do obey pi->last_force_request_resend resend on splits. See ceph.git commit 189ca7ec6420 ("mon/OSDMonitor: make pre-luminous clients resend ops on split"). Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: fold [l]req->last_force_resend into ceph_osd_request_targetIlya Dryomov2-13/+12
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: support SERVER_JEWEL feature bitsIlya Dryomov2-1/+9
Only MON_STATEFUL_SUB, really. MON_ROUTE_OSDMAP and OSDSUBOP_NO_SNAPCONTEXT are irrelevant. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: advertise support for OSD_POOLRESENDIlya Dryomov1-0/+1
The code has been in place since commit 63244fa123a7 ("libceph: introduce ceph_osd_request_target, calc_target()"), and, with the ceph_{oloc,oid}_copy() issue fixed in the previous commit, is now in working order. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: handle non-empty dest in ceph_{oloc,oid}_copy()Ilya Dryomov1-4/+6
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: new features macrosIlya Dryomov1-75/+167
Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: remove ceph_sanitize_features() workaroundIlya Dryomov2-23/+1
Reflects ceph.git commit ff1959282826ae6acd7134e1b1ede74ffd1cc04a. Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: update ceph_dentry_info::lease_session when necessaryYan, Zheng1-2/+7
Current code does not update ceph_dentry_info::lease_session once it is set. If auth mds of corresponding dentry changes, dentry lease keeps in an invalid state. Signed-off-by: "Yan, Zheng" <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: new mount option that specifies fscache uniquifierYan, Zheng3-21/+113
Current ceph uses FSID as primary index key of fscache data. This allows ceph to retain cached data across remount. But this causes problem (kernel opps, fscache does not support sharing data) when a filesystem get mounted several times (with fscache enabled, with different mount options). The fix is adding a new mount option, which specifies uniquifier for fscache. Signed-off-by: "Yan, Zheng" <[email protected]> Acked-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: avoid accessing freeing inode in ceph_check_delayed_caps()Yan, Zheng1-2/+9
Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: avoid invalid memory dereference in the middle of umountYan, Zheng2-4/+6
extra_mon_dispatch() and debugfs' foo_show functions dereference fsc->mdsc. we should clean up fsc->client->extra_mon_dispatch and debugfs before destroying fsc->mds. Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: getattr before read on ceph.* xattrsYan, Zheng1-0/+3
Previously we were returning values for quota, layout xattrs without any kind of update -- the user just got whatever happened to be in our cache. Clearly this extra round trip has a cost, but reads of these xattrs are fairly rare, happening on admin intervention rather than in normal operation. Link: http://tracker.ceph.com/issues/17939 Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: don't re-send interrupted flock requestYan, Zheng1-1/+24
Don't re-send interrupted flock request in cases of mds failover and receiving request forward. Because corresponding 'lock intr' request may have been finished, it won't get re-sent. Link: http://tracker.ceph.com/issues/20170 Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: cleanup writepage_nounlock()Yan, Zheng1-6/+6
Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: redirty page when writepage_nounlock() skips unwritable pageYan, Zheng1-1/+2
Ceph needs to flush dirty page in the order in which in which snap context they belong to. Dirty pages belong to older snap context should be flushed earlier. if writepage_nounlock() can not flush a page, it should redirty the page. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: remove useless page->mapping check in writepage_nounlock()Yan, Zheng1-4/+0
Callers of writepage_nounlock() have already ensured non-null page->mapping. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: update the 'approaching max_size' codeYan, Zheng5-11/+23
The old 'approaching max_size' code expects MDS set max_size to '2 * reported_size'. This is no longer true. The new code reports file size when half of previous max_size increment has been used. Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07ceph: re-request max size after importing capsYan, Zheng1-3/+8
The 'wanted max size' could be sent to inode's old auth mds, re-send it to inode's new auth mds if necessary. Otherwise write syscall may hang. Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07Merge tag 'mac80211-for-davem-2017-07-07' of ↵David S. Miller1-3/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== pull-request: mac80211 2017-07-07 Just got a set of fixes in from Jouni/QCA, all netlink validation fixes. I assume they ran some kind of checker, but I don't know what kind :) Please pull and let me know if there's any problem. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-07-07irqdomain: Allow ACPI device nodes to be used as irqdomain identifiersMarc Zyngier1-0/+16
A number of irqchip implementations are (ab)using the irqdomain allocator by passing a fwnode that is neither a FWNODE_OF or a FWNODE_IRQCHIP. This is pretty bad, but it also feels pretty crap to force these drivers to allocate their own irqchip_fwid when they already have a proper fwnode. Instead, let's teach the irqdomain allocator about ACPI device nodes, and add some lovely name generation code... Tested on an arm64 D05 system. Reported-and-tested-by: John Garry <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Agustin Vega-Frias <[email protected]> Cc: Ma Jun <[email protected]> Cc: Hanjun Guo <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-07-07cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIESSrinivas Dasari1-0/+4
validate_scan_freqs() retrieves frequencies from attributes nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with nla_get_u32(), which reads 4 bytes from each attribute without validating the size of data received. Attributes nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy. Validate size of each attribute before parsing to avoid potential buffer overread. Fixes: 2a519311926 ("cfg80211/nl80211: scanning (and mac80211 update to use it)") Cc: [email protected] Signed-off-by: Srinivas Dasari <[email protected]> Signed-off-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2017-07-07cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODESrinivas Dasari1-0/+1
Buffer overread may happen as nl80211_set_station() reads 4 bytes from the attribute NL80211_ATTR_LOCAL_MESH_POWER_MODE without validating the size of data received when userspace sends less than 4 bytes of data with NL80211_ATTR_LOCAL_MESH_POWER_MODE. Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE to avoid the buffer overread. Fixes: 3b1c5a5307f ("{cfg,nl}80211: mesh power mode primitives and userspace access") Cc: [email protected] Signed-off-by: Srinivas Dasari <[email protected]> Signed-off-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2017-07-07cfg80211: Check if NAN service ID is of expected sizeSrinivas Dasari1-1/+1
nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, cfg80211 may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_NAN_FUNC_SERVICE_ID to make these NLA_UNSPEC and to make sure minimum NL80211_NAN_FUNC_SERVICE_ID_LEN bytes are received from userspace with NL80211_NAN_FUNC_SERVICE_ID. Fixes: a442b761b24 ("cfg80211: add add_nan_func / del_nan_func") Cc: [email protected] Signed-off-by: Srinivas Dasari <[email protected]> Signed-off-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2017-07-07cfg80211: Check if PMKID attribute is of expected sizeSrinivas Dasari1-2/+1
nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, the wireless drivers may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum WLAN_PMKID_LEN bytes are received from userspace with NL80211_ATTR_PMKID. Fixes: 67fbb16be69d ("nl80211: PMKSA caching support") Cc: [email protected] Signed-off-by: Srinivas Dasari <[email protected]> Signed-off-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2017-07-07iov_iter: saner checks on copyin/copyoutAl Viro1-16/+39
* might_fault() is better checked in caller (and e.g. fault-in + kmap_atomic codepath also needs might_fault() coverage) * we have already done object size checks * we have *NOT* done access_ok() recently enough; we rely upon the iovec array having passed sanity checks back when it had been created and not nothing having buggered it since. However, that's very much non-local, so we'd better recheck that. So the thing we want does not match anything in uaccess - we need access_ok + kasan checks + raw copy without any zeroing. Just define such helpers and use them here. Signed-off-by: Al Viro <[email protected]>
2017-07-07arcnet: com20020-pci: Fix an error handling path in 'com20020pci_probe()'Christophe Jaillet1-2/+4
If this memory allocation fails, we should go through the error handling path as done everywhere else in this function before returning. Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-07nfp: flower: add missing clean up call to avoid memory leaksJakub Kicinski1-0/+1
nfp_flower_metadata_cleanup() is defined but never invoked, not calling it will cause us to leak mask and statistics queue memory on the host. Fixes: 43f84b72c50d ("nfp: add metadata to each flow offload") Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-07genirq/debugfs: Remove redundant NULL pointer checkThomas Gleixner1-2/+1
debugfs_remove() can be called with a NULL pointer. Fixes: 087cdfb662ae5 ("genirq/debugfs: Add proper debugfs interface") Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2017-07-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds131-1723/+3093
Merge misc updates from Andrew Morton: - a few hotfixes - various misc updates - ocfs2 updates - most of MM * emailed patches from Andrew Morton <[email protected]>: (108 commits) mm, memory_hotplug: move movable_node to the hotplug proper mm, memory_hotplug: drop CONFIG_MOVABLE_NODE mm, memory_hotplug: drop artificial restriction on online/offline mm: memcontrol: account slab stats per lruvec mm: memcontrol: per-lruvec stats infrastructure mm: memcontrol: use generic mod_memcg_page_state for kmem pages mm: memcontrol: use the node-native slab memory counters mm: vmstat: move slab statistics from zone to node counters mm/zswap.c: delete an error message for a failed memory allocation in zswap_dstmem_prepare() mm/zswap.c: improve a size determination in zswap_frontswap_init() mm/zswap.c: delete an error message for a failed memory allocation in zswap_pool_create() mm/swapfile.c: sort swap entries before free mm/oom_kill: count global and memory cgroup oom kills mm: per-cgroup memory reclaim stats mm: kmemleak: treat vm_struct as alternative reference to vmalloc'ed objects mm: kmemleak: factor object reference updating out of scan_block() mm: kmemleak: slightly reduce the size of some structures on 64-bit architectures mm, mempolicy: don't check cpuset seqlock where it doesn't matter mm, cpuset: always use seqlock when changing task's nodemask mm, mempolicy: simplify rebinding mempolicies when updating cpusets ...
2017-07-06Merge branch 'uaccess.strlen' of ↵Linus Torvalds40-601/+5
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull user access str* updates from Al Viro: "uaccess str...() dead code removal" * 'uaccess.strlen' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: s390 keyboard.c: don't open-code strndup_user() mips: get rid of unused __strnlen_user() get rid of unused __strncpy_from_user() instances kill strlen_user()
2017-07-06Merge branch 'work.probe_kernel_read' of ↵Linus Torvalds3-25/+4
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull probe_kernel_read() uses from Al Viro: "Several open-coded probe_kernel_read()..." * 'work.probe_kernel_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: dio: use probe_kernel_read() hp_sdc: use probe_kernel_read() hpfb: use probe_kernel_read()
2017-07-06Merge branch 'misc.alpha' of ↵Linus Torvalds1-41/+40
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull alpha user access updates from Al Viro: "Several alpha osf_sys.c uaccess cleanups - getdomainname() had insane byte-by-byte copying of string to userland (instead of strnlen + copy_to_user) plus yet another compat variant of timeval/itimerval with associated copyin/copyout primitives" * 'misc.alpha' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: osf_sigstack(): switch to put_user() osf_sys.c: switch handling of timeval32/itimerval32 to copy_{to,from}_user() osf_getdomainname(): use copy_to_user()
2017-07-06Merge branch 'misc.compat' of ↵Linus Torvalds23-1137/+1019
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc compat stuff updates from Al Viro: "This part is basically untangling various compat stuff. Compat syscalls moved to their native counterparts, getting rid of quite a bit of double-copying and/or set_fs() uses. A lot of field-by-field copyin/copyout killed off. - kernel/compat.c is much closer to containing just the copyin/copyout of compat structs. Not all compat syscalls are gone from it yet, but it's getting there. - ipc/compat_mq.c killed off completely. - block/compat_ioctl.c cleaned up; floppy compat ioctls moved to drivers/block/floppy.c where they belong. Yes, there are several drivers that implement some of the same ioctls. Some are m68k and one is 32bit-only pmac. drivers/block/floppy.c is the only one in that bunch that can be built on biarch" * 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: mqueue: move compat syscalls to native ones usbdevfs: get rid of field-by-field copyin compat_hdio_ioctl: get rid of set_fs() take floppy compat ioctls to sodding floppy.c ipmi: get rid of field-by-field __get_user() ipmi: get COMPAT_IPMICTL_RECEIVE_MSG in sync with the native one rt_sigtimedwait(): move compat to native select: switch compat_{get,put}_fd_set() to compat_{get,put}_bitmap() put_compat_rusage(): switch to copy_to_user() sigpending(): move compat to native getrlimit()/setrlimit(): move compat to native times(2): move compat to native compat_{get,put}_bitmap(): use unsafe_{get,put}_user() fb_get_fscreeninfo(): don't bother with do_fb_ioctl() do_sigaltstack(): lift copying to/from userland into callers take compat_sys_old_getrlimit() to native syscall trim __ARCH_WANT_SYS_OLD_GETRLIMIT