aboutsummaryrefslogtreecommitdiff
path: root/include/linux/ceph/libceph.h
AgeCommit message (Collapse)AuthorFilesLines
2020-12-14libceph, ceph: implement msgr2.1 protocol (crc and secure modes)Ilya Dryomov1-2/+7
Implement msgr2.1 wire protocol, available since nautilus 14.2.11 and octopus 15.2.5. msgr2.0 wire protocol is not implemented -- it has several security, integrity and robustness issues and therefore considered deprecated. Signed-off-by: Ilya Dryomov <[email protected]>
2020-12-14libceph: introduce connection modes and ms_mode optionIlya Dryomov1-0/+1
msgr2 supports two connection modes: crc (plain) and secure (on-wire encryption). Connection mode is picked by server based on input from client. Introduce ms_mode option: ms_mode=legacy - msgr1 (default) ms_mode=crc - crc mode, if denied fail ms_mode=secure - secure mode, if denied fail ms_mode=prefer-crc - crc mode, if denied agree to secure mode ms_mode=prefer-secure - secure mode, if denied agree to crc mode ms_mode affects all connections, we don't separate connections to mons like it's done in userspace with ms_client_mode vs ms_mon_client_mode. For now the default is legacy, to be flipped to prefer-crc after some time. Signed-off-by: Ilya Dryomov <[email protected]>
2020-12-14ceph: add new RECOVER mount_state when recovering sessionJeff Layton1-0/+1
When recovering a session (a'la recover_session=clean), we want to do all of the operations that we do on a forced umount, but changing the mount state to SHUTDOWN is can cause queued MDS requests to fail when the session comes back. Most of those can idle until the session is recovered in this situation. Reserve SHUTDOWN state for forced umount, and make a new RECOVER state for the forced reconnect situation. Change several tests for equality with SHUTDOWN to test for that or RECOVER. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Xiubo Li <[email protected]> Reviewed-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2020-08-04ceph: move sb->wb_pagevec_pool to be a global mempoolJeff Layton1-0/+1
When doing some testing recently, I hit some page allocation failures on mount, when creating the wb_pagevec_pool for the mount. That requires 128k (32 contiguous pages), and after thrashing the memory during an xfstests run, sometimes that would fail. 128k for each mount seems like a lot to hold in reserve for a rainy day, so let's change this to a global mempool that gets allocated when the module is plugged in. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2020-06-16libceph: move away from global osd_req_flagsIlya Dryomov1-2/+2
osd_req_flags is overly general and doesn't suit its only user (read_from_replica option) well: - applying osd_req_flags in account_request() affects all OSD requests, including linger (i.e. watch and notify). However, linger requests should always go to the primary even though some of them are reads (e.g. notify has side effects but it is a read because it doesn't result in mutation on the OSDs). - calls to class methods that are reads are allowed to go to the replica, but most such calls issued for "rbd map" and/or exclusive lock transitions are requested to be resent to the primary via EAGAIN, doubling the latency. Get rid of global osd_req_flags and set read_from_replica flag only on specific OSD requests instead. Fixes: 8ad44d5e0d1e ("libceph: read_from_replica option") Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Jeff Layton <[email protected]>
2020-06-01libceph: read_from_replica optionIlya Dryomov1-0/+2
Expose replica reads through read_from_replica=balance and read_from_replica=localize. The default is to read from primary (read_from_replica=no). Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Jeff Layton <[email protected]>
2020-06-01libceph: crush_location infrastructureIlya Dryomov1-0/+1
Allow expressing client's location in terms of CRUSH hierarchy as a set of (bucket type name, bucket name) pairs. The userspace syntax "crush_location = key1=value1 key2=value2" is incompatible with mount options and needed adaptation. Key-value pairs are separated by '|' and we use ':' instead of '=' to separate keys from values. So for: crush_location = host=foo rack=bar one would write: crush_location=host:foo|rack:bar As in userspace, "multipath" locations are supported, so indicating locality for parallel hierarchies is possible: crush_location=rack:foo1|rack:foo2|datacenter:bar Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Jeff Layton <[email protected]>
2020-06-01libceph: add non-asserting rbtree insertion helperIlya Dryomov1-2/+8
Needed for the next commit and useful for ceph_pg_pool_info tree as well. I'm leaving the asserting helper in for now, but we should look at getting rid of it in the future. Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Jeff Layton <[email protected]>
2020-03-30ceph: move to a dedicated slabcache for mds requestsJeff Layton1-0/+1
On my machine (x86_64) this struct is 952 bytes, which gets rounded up to 1024 by kmalloc. Move this to a dedicated slabcache, so we can allocate them without the extra 72 bytes of overhead per. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2020-02-07ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_logAl Viro1-2/+3
... and now errorf() et.al. are never called with NULL fs_context, so we can get rid of conditional in those. Signed-off-by: Al Viro <[email protected]>
2019-11-27libceph, rbd, ceph: convert to use the new mount APIDavid Howells1-4/+6
Convert the ceph filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. [ Numerous string handling, leak and regression fixes; rbd conversion was particularly broken and had to be redone almost from scratch. ] Signed-off-by: David Howells <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2019-09-16libceph: add function that reset client's entity addrYan, Zheng1-0/+1
This function also re-open connections to OSD/MON, and re-send in-flight OSD requests after re-opening connections to OSD. Signed-off-by: "Yan, Zheng" <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2019-07-08libceph: bump CEPH_MSG_MAX_DATA_LEN (again)Ilya Dryomov1-2/+4
This time for rbd object map. Object maps are limited in size to 256000000 objects, two bits per object. Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Dongsheng Yang <[email protected]>
2019-07-08libceph: remove ceph_get_direct_page_vector()Christoph Hellwig1-4/+0
This function is entirely unused. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2019-03-20libceph: wait for latest osdmap in ceph_monc_blacklist_add()Ilya Dryomov1-0/+2
Because map updates are distributed lazily, an OSD may not know about the new blacklist for quite some time after "osd blacklist add" command is completed. This makes it possible for a blacklisted but still alive client to overwrite a post-blacklist update, resulting in data corruption. Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using the post-blacklist epoch for all post-blacklist requests ensures that all such requests "wait" for the blacklist to come into force on their respective OSDs. Cc: [email protected] Fixes: 6305a3b41515 ("libceph: support for blacklisting clients") Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Jason Dillaman <[email protected]>
2019-01-07libceph: allow setting abort_on_full for rbdDongsheng Yang1-2/+4
Introduce a new option abort_on_full, default to false. Then we can get -ENOSPC when the pool is full, or reaches quota. [ Don't show abort_on_full in /proc/mounts. ] Signed-off-by: Dongsheng Yang <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2018-10-22libceph: bump CEPH_MSG_MAX_DATA_LENIlya Dryomov1-1/+7
If the read is large enough, we end up spinning in the messenger: libceph: osd0 192.168.122.1:6801 io error libceph: osd0 192.168.122.1:6801 io error libceph: osd0 192.168.122.1:6801 io error This is a receive side limit, so only reads were affected. Cc: [email protected] Signed-off-by: Ilya Dryomov <[email protected]>
2018-04-02ceph: optimize memory usageChengguang Xu1-0/+1
In current code, regular file and directory use same struct ceph_file_info to store fs specific data so the struct has to include some fields which are only used for directory (e.g., readdir related info), when having plenty of regular files, it will lead to memory waste. This patch introduces dedicated ceph_dir_file_info cache for readdir related thins. So that regular file does not include those unused fields anymore. Signed-off-by: Chengguang Xu <[email protected]> Reviewed-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Philippe Ombredanne <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-09-06ceph: validate correctness of some mount optionsYan, Zheng1-10/+0
Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-09-06ceph: remove unused cap_release_safety mount optionYan, Zheng1-1/+0
Signed-off-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-07-07libceph: make DEFINE_RB_* helpers more generalIlya Dryomov1-12/+37
Initially for ceph_pg_mapping, ceph_spg_mapping and ceph_hobject_id, compared with ceph_pg_compare(), ceph_spg_compare() and hoid_compare() respectively. Signed-off-by: Ilya Dryomov <[email protected]>
2017-05-04libceph: convert ceph_snap_context.nref from atomic_t to refcount_tElena Reshetova1-1/+2
refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows to avoid accidental refcounter overflows that might lead to use-after-free situations. Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: Hans Liljestrand <[email protected]> Signed-off-by: Kees Cook <[email protected]> Signed-off-by: David Windsor <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2017-05-04libceph, ceph: always advertise all supported featuresIlya Dryomov1-4/+1
No reason to hide CephFS-specific features in the rbd case. Recent feature bits mix RADOS and CephFS-specific stuff together anyway. Signed-off-by: Ilya Dryomov <[email protected]>
2017-03-07libceph: osd_request_timeout optionIlya Dryomov1-0/+2
osd_request_timeout specifies how many seconds to wait for a response from OSDs before returning -ETIMEDOUT from an OSD request. 0 (default) means no limit. osd_request_timeout is osdkeepalive-precise -- in-flight requests are swept through every osdkeepalive seconds. With ack vs commit behaviour gone, abort_request() is really simple. This is based on a patch from Artur Molchanov <[email protected]>. Tested-by: Artur Molchanov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2016-08-24rbd: add 'client_addr' sysfs rbd device attributeIlya Dryomov1-0/+1
Export client addr/nonce, so userspace can check if a image is being blacklisted. Signed-off-by: Mike Christie <[email protected]> [[email protected]: ceph_client_addr(), endianess fix] Signed-off-by: Ilya Dryomov <[email protected]>
2016-08-24libceph: rename ceph_client_id() -> ceph_client_gid()Ilya Dryomov1-1/+1
It's gid / global_id in other places. Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Mike Christie <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2016-07-28ceph: fix symbol versioning for ceph_monc_do_statfsArnd Bergmann1-1/+2
The genksyms helper in the kernel cannot parse a type definition like "typeof(((type *)0)->keyfld)" that is used in the DEFINE_RB_FUNCS helper, causing the following EXPORT_SYMBOL() statement to be ignored when computing the crcs, and triggering a warning about this: WARNING: "ceph_monc_do_statfs" [fs/ceph/ceph.ko] has no CRC To work around the problem, we can rewrite the type to reference an undefined 'extern' symbol instead of a NULL pointer. This is evidently ok for genksyms, and it no longer complains about the line when calling it with 'genksyms -w'. I've looked briefly into extending genksyms instead, but it seems really hard to do. Jan Beulich introduced basic support for 'typeof' a while ago in dc53324060f3 ("genksyms: fix typeof() handling"), but that is not sufficient for the expression we have here. Signed-off-by: Arnd Bergmann <[email protected]> Fixes: fcd00b68bbe2 ("libceph: DEFINE_RB_FUNCS macro") Cc: Jan Beulich <[email protected]> Cc: Michal Marek <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2016-07-28libceph: introduce reference counted stringYan, Zheng1-0/+1
The data structure is for storing namesapce string. It allows namespace string to be shared between cephfs inodes with same layout. This data structure can also be referenced by OSD request. Signed-off-by: Yan, Zheng <[email protected]>
2016-05-26libceph: DEFINE_RB_FUNCS macroIlya Dryomov1-0/+57
Given struct foo { u64 id; struct rb_node bar_node; }; generate insert_bar(), erase_bar() and lookup_bar() functions with DEFINE_RB_FUNCS(bar, struct foo, id, bar_node) The key is assumed to be an integer (u64, int, etc), compared with < and >. nodefld has to be initialized with RB_CLEAR_NODE(). Start using it for MDS, MON and OSD requests and OSD sessions. Signed-off-by: Ilya Dryomov <[email protected]>
2016-04-04mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macrosKirill A. Shutemov1-2/+2
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-25libceph: monc hunt rate is 3s with backoff up to 30sIlya Dryomov1-0/+3
Unless we are in the process of setting up a client (i.e. connecting to the monitor cluster for the first time), apply a backoff: every time we want to reopen a session, increase our timeout by a multiple (currently 2); when we complete the connection, reduce that multipler by 50%. Mirrors ceph.git commit 794c86fd289bd62a35ed14368fa096c46736e9a2. Signed-off-by: Ilya Dryomov <[email protected]>
2016-03-25libceph: monc ping rate is 10sIlya Dryomov1-2/+3
Split ping interval and ping timeout: ping interval is 10s; keepalive timeout is 30s. Make monc_ping_timeout a constant while at it - it's not actually exported as a mount option (and the rest of tick-related settings won't be either), so it's got no place in ceph_options. Signed-off-by: Ilya Dryomov <[email protected]>
2015-11-02libceph: add nocephx_sign_messages optionIlya Dryomov1-1/+2
Support for message signing was merged into 3.19, along with nocephx_require_signatures option. But, all that option does is allow the kernel client to talk to clusters that don't support MSG_AUTH feature bit. That's pretty useless, given that it's been supported since bobtail. Meanwhile, if one disables message signing on the server side with "cephx sign messages = false", it becomes impossible to use the kernel client since it expects messages to be signed if MSG_AUTH was negotiated. Add nocephx_sign_messages option to support this use case. Signed-off-by: Ilya Dryomov <[email protected]>
2015-11-02libceph: stop duplicating client fields in messengerIlya Dryomov1-0/+1
supported_features and required_features serve no purpose at all, while nocrc and tcp_nodelay belong to ceph_options::flags. Signed-off-by: Ilya Dryomov <[email protected]>
2015-09-08libceph: use keepalive2 to verify the mon session is aliveYan, Zheng1-0/+2
Signed-off-by: Yan, Zheng <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2015-06-25ceph: pre-allocate data structure that tracks caps flushingYan, Zheng1-0/+1
Signed-off-by: Yan, Zheng <[email protected]>
2015-06-25libceph: store timeouts in jiffies, verify user inputIlya Dryomov1-6/+11
There are currently three libceph-level timeouts that the user can specify on mount: mount_timeout, osd_idle_ttl and osdkeepalive. All of these are in seconds and no checking is done on user input: negative values are accepted, we multiply them all by HZ which may or may not overflow, arbitrarily large jiffies then get added together, etc. There is also a bug in the way mount_timeout=0 is handled. It's supposed to mean "infinite timeout", but that's not how wait.h APIs treat it and so __ceph_open_session() for example will busy loop without much chance of being interrupted if none of ceph-mons are there. Fix all this by verifying user input, storing timeouts capped by msecs_to_jiffies() in jiffies and using the new ceph_timeout_jiffies() helper for all user-specified waits to handle infinite timeouts correctly. Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2015-06-25libceph: nuke time_sub()Ilya Dryomov1-9/+0
Unused since ceph got merged into mainline I guess. Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Alex Elder <[email protected]>
2015-04-20libceph: expose client options through debugfsIlya Dryomov1-0/+1
Add a client_options attribute for showing libceph options. Signed-off-by: Ilya Dryomov <[email protected]>
2015-04-20libceph, ceph: split ceph_show_options()Ilya Dryomov1-0/+1
Split ceph_show_options() into two pieces and move the piece responsible for printing client (libceph) options into net/ceph. This way people adding a libceph option wouldn't have to remember to update code in fs/ceph. Signed-off-by: Ilya Dryomov <[email protected]>
2015-02-19libceph: tcp_nodelay supportChaitanya Huilgol1-1/+2
TCP_NODELAY socket option set on connection sockets, disables Nagle’s algorithm and improves latency characteristics. tcp_nodelay(default)/notcp_nodelay option flags provided to enable/disable setting the socket option. Signed-off-by: Chaitanya Huilgol <[email protected]> [[email protected]: NO_TCP_NODELAY -> TCP_NODELAY, minor adjustments] Signed-off-by: Ilya Dryomov <[email protected]>
2014-12-17libceph: require cephx message signature by defaultYan, Zheng1-0/+1
Signed-off-by: Yan, Zheng <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]>
2014-12-17libceph: nuke ceph_kvfree()Ilya Dryomov1-1/+0
Use kvfree() from linux/mm.h instead, which is identical. Also fix the ceph_buffer comment: we will allocate with kmalloc() up to 32k - the value of PAGE_ALLOC_COSTLY_ORDER, but that really is just an implementation detail so don't mention it at all. Signed-off-by: Ilya Dryomov <[email protected]>
2014-10-14libceph: remove redundant declarationFabian Frederick1-1/+0
ceph_release_page_vector was defined twice in libceph.h Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2014-05-06ceph_sync_read: stop poking into iov_iter gutsAl Viro1-2/+0
Signed-off-by: Al Viro <[email protected]>
2014-01-26libceph: add ceph_kv{malloc,free}() and switch to themIlya Dryomov1-4/+7
Encapsulate kmalloc vs vmalloc memory allocation and freeing logic into two helpers, ceph_kvmalloc() and ceph_kvfree(), and switch to them. ceph_kvmalloc() kmalloc()'s a maximum of 8 pages, anything bigger is vmalloc()'ed with __GFP_HIGHMEM set. This changes the existing behaviour: - for buffers (ceph_buffer_new()), from trying to kmalloc() everything and using vmalloc() just as a fallback - for messages (ceph_msg_new()), from going to vmalloc() for anything bigger than a page - for messages (ceph_msg_new()), from disallowing vmalloc() to use high memory Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2013-12-31libceph: all features fields must be u64Ilya Dryomov1-4/+4
In preparation for ceph_features.h update, change all features fields from unsigned int/u32 to u64. (ceph.git has ~40 feature bits at this point.) Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Sage Weil <[email protected]>
2013-05-01libceph: create source file "net/ceph/snapshot.c"Alex Elder1-25/+5
This creates a new source file "net/ceph/snapshot.c" to contain utility routines related to ceph snapshot contexts. The main motivation was to define ceph_create_snap_context() as a common way to create these structures, but I've moved the definitions of ceph_get_snap_context() and ceph_put_snap_context() there too. (The benefit of inlining those is very small, and I'd rather keep this collection of functions together.) Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Josh Durgin <[email protected]>
2013-05-01libceph: define CEPH_MSG_MAX_MIDDLE_LENAlex Elder1-0/+1
This is probably unnecessary but the code read as if it were wrong in read_partial_message(). Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Josh Durgin <[email protected]>