path: root/drivers/xen
Age | Commit message | Author | Files | Lines
2017-10-31 | xen/pvcalls: implement release command | Stefano Stabellini | 2 | -0/+99
Send PVCALLS_RELEASE to the backend and wait for a reply. Take both in_mutex and out_mutex to avoid concurrent accesses. Then, free the socket. For passive sockets, check whether we have already pre-allocated an active socket for the purpose of being accepted. If so, free that as well. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: implement poll command | Stefano Stabellini | 2 | -9/+138
For active sockets, check the indexes and use the inflight_conn_req waitqueue to wait. For passive sockets, if an accept is outstanding (PVCALLS_FLAG_ACCEPT_INFLIGHT), check if it has been answered by looking at bedata->rsp[req_id]. If so, return POLLIN. Otherwise use the inflight_accept_req waitqueue. If no accepts are inflight, send PVCALLS_POLL to the backend. If we have outstanding POLL requests awaiting a response, use the inflight_req waitqueue: inflight_req is woken up when a new response is received; on wakeup we check whether the POLL response has arrived by looking at the PVCALLS_FLAG_POLL_RET flag. We set the flag from pvcalls_front_event_handler, if the response was for a POLL command. In pvcalls_front_event_handler, get the struct sock_mapping from the poll id (we previously converted struct sock_mapping* to uintptr_t and used it as id). Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: implement recvmsg | Stefano Stabellini | 2 | -0/+115
Implement recvmsg by copying data from the "in" ring. If not enough data is available and the recvmsg call is blocking, then wait on the inflight_conn_req waitqueue. Take the active socket in_mutex so that only one function can access the ring at any given time. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: implement sendmsg | Stefano Stabellini | 2 | -0/+124
Send data to an active socket by copying data to the "out" ring. Take the active socket out_mutex so that only one function can access the ring at any given time. If not enough room is available on the ring, rather than returning immediately or sleep-waiting, spin for up to 5000 cycles. This small optimization turns out to improve performance significantly. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
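The bounded spin described in the sendmsg message above can be sketched in user-space C as follows. This is only an illustration under stated assumptions: struct out_ring, RING_SIZE, ring_free() and wait_for_room() are hypothetical names, not the driver's, and the real code works on the shared PVCalls data ring with proper memory barriers.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE  4096u   /* hypothetical ring size; must be a power of two */
#define SPIN_LIMIT 5000    /* the "up to 5000 cycles" bound from the commit message */

/* Hypothetical stand-in for the "out" ring indexes. */
struct out_ring {
	volatile uint32_t prod;   /* advanced by the writer (frontend) */
	volatile uint32_t cons;   /* advanced by the reader (backend) */
};

/* Free bytes in the ring; the indexes are free-running and wrap naturally. */
static uint32_t ring_free(const struct out_ring *r)
{
	return RING_SIZE - (r->prod - r->cons);
}

/*
 * Poll for room a bounded number of times instead of returning at once
 * or going to sleep. Returns the number of iterations spent spinning,
 * or -1 if there was still not enough room when the bound was reached.
 */
static int wait_for_room(const struct out_ring *r, uint32_t len)
{
	assert(len <= RING_SIZE);
	for (int i = 0; i < SPIN_LIMIT; i++) {
		if (ring_free(r) >= len)
			return i;
	}
	return -1;
}
```

A caller would copy its data to the ring only when wait_for_room() returns a non-negative value; what to do on -1 (retry, sleep, or report an error) is left open in this sketch.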
2017-10-31 | xen/pvcalls: implement accept command | Stefano Stabellini | 2 | -0/+148
Introduce a waitqueue to allow only one outstanding accept command at any given time and to implement polling on the passive socket. Introduce a flags field to keep track of in-flight accept and poll commands. Send PVCALLS_ACCEPT to the backend. Allocate a new active socket. Make sure that only one accept command is executed at any given time by setting PVCALLS_FLAG_ACCEPT_INFLIGHT and waiting on the inflight_accept_req waitqueue. Convert the new struct sock_mapping pointer into a uintptr_t and use it as id for the new socket to pass to the backend. Check if the accept call is non-blocking: in that case, after sending the ACCEPT command to the backend (which will respond only when there is something to accept), store the sock_mapping pointer of the new struct and the inflight req_id, then return -EAGAIN. Next time accept is called, we'll check if the ACCEPT command has been answered; if so we'll pick up where we left off, otherwise we return -EAGAIN again. Note that, unlike the other commands, we can use wait_event_interruptible (instead of wait_event) in the case of accept as we are able to track the req_id of the ACCEPT response that we are waiting for. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
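The non-blocking accept flow and the pointer-to-id conversion from the message above might look roughly like this in plain C. It is a simplified sketch, not the driver code: struct sock_map, FLAG_ACCEPT_INFLIGHT, NO_REQ and the `answered` parameter (standing in for "the response with our req_id is present in the rsp array") are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_ACCEPT_INFLIGHT 0x1u     /* hypothetical stand-in for PVCALLS_FLAG_ACCEPT_INFLIGHT */
#define NO_REQ               UINT32_MAX

/* Hypothetical, much smaller than the driver's struct sock_mapping. */
struct sock_map {
	unsigned int flags;
	uint32_t inflight_req_id;     /* req_id of the ACCEPT we are waiting on */
	struct sock_map *pending;     /* pre-allocated socket for the new connection */
};

/* The pointer itself is used as the socket id passed to the backend. */
static uint64_t map_to_id(struct sock_map *map)
{
	return (uint64_t)(uintptr_t)map;
}

static struct sock_map *id_to_map(uint64_t id)
{
	return (struct sock_map *)(uintptr_t)id;
}

/* One non-blocking accept attempt; returns 0 on success or -EAGAIN. */
static int accept_nonblock(struct sock_map *listener, struct sock_map *fresh,
			   uint32_t req_id, int answered, struct sock_map **out)
{
	assert(listener != NULL && out != NULL);

	if (!(listener->flags & FLAG_ACCEPT_INFLIGHT)) {
		/* First call: the ACCEPT is "sent"; remember what we wait for. */
		listener->flags |= FLAG_ACCEPT_INFLIGHT;
		listener->inflight_req_id = req_id;
		listener->pending = fresh;
		return -EAGAIN;
	}
	if (!answered)
		return -EAGAIN;           /* backend has not responded yet */

	/* Response arrived: pick up where the earlier call left off. */
	*out = listener->pending;
	listener->pending = NULL;
	listener->inflight_req_id = NO_REQ;
	listener->flags &= ~FLAG_ACCEPT_INFLIGHT;
	return 0;
}
```

Keeping the pending socket and req_id on the listener is what lets a later call resume instead of sending a second ACCEPT.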
2017-10-31 | xen/pvcalls: implement listen command | Stefano Stabellini | 2 | -0/+58
Send PVCALLS_LISTEN to the backend. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: implement bind command | Stefano Stabellini | 2 | -0/+69
Send PVCALLS_BIND to the backend. Introduce a new structure, part of struct sock_mapping, to store information specific to passive sockets. Introduce a status field to keep track of the status of the passive socket. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: implement connect command | Stefano Stabellini | 2 | -0/+160
Send PVCALLS_CONNECT to the backend. Allocate a new ring and evtchn for the active socket. Introduce fields in struct sock_mapping to keep track of active sockets. Introduce a waitqueue to allow the frontend to wait on data coming from the backend on the active socket (recvmsg command). Two mutexes (one for reads and one for writes) will be used to protect the active socket in and out rings from concurrent accesses. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: implement socket command and handle events | Stefano Stabellini | 2 | -0/+139
Send a PVCALLS_SOCKET command to the backend, use the masked req_prod_pvt as req_id. This way, req_id is guaranteed to be between 0 and PVCALLS_NR_REQ_PER_RING - 1. We already have a slot in the rsp array ready for the response, and there cannot be two outstanding responses with the same req_id. Wait for the response by waiting on the inflight_req waitqueue and check for the req_id field in rsp[req_id]. Use atomic accesses and barriers to read the field. Note that the barriers are simple smp barriers (as opposed to virt barriers) because they are for internal frontend synchronization, not frontend<->backend communication. Once a response is received, clear the corresponding rsp slot by setting req_id to PVCALLS_INVALID_ID. Note that PVCALLS_INVALID_ID is invalid only from the frontend point of view. It is not part of the PVCalls protocol. pvcalls_front_event_handler is in charge of copying responses from the ring to the appropriate rsp slot. It is done by copying the body of the response first, then by copying req_id atomically. After the copies, wake up anybody waiting on the waitqueue. socket_lock protects accesses to the ring. Convert the pointer to sock_mapping into a uintptr_t and use it as id for the new socket to pass to the backend. The struct will be fully initialized later on connect or bind. sock->sk->sk_send_head is not used for ip sockets: reuse the field to store a pointer to the struct sock_mapping corresponding to the socket. This way, we can easily get the struct sock_mapping from the struct socket. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
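As a rough illustration of the req_id scheme in the message above, the following single-threaded sketch derives the id from a masked producer index and uses one response slot per id. All names and sizes here (struct frontend, NR_REQ_PER_RING, INVALID_ID and the helper functions) are assumptions for the example; the driver itself uses atomic accesses and smp barriers where this sketch uses plain stores.

```c
#include <assert.h>
#include <stdint.h>

#define NR_REQ_PER_RING 16u          /* hypothetical; must be a power of two */
#define INVALID_ID      UINT32_MAX   /* frontend-private "slot is empty" marker */

struct response {
	uint32_t req_id;
	int32_t  ret;
};

struct frontend {
	uint32_t req_prod_pvt;                  /* free-running producer index */
	struct response rsp[NR_REQ_PER_RING];   /* one slot per possible req_id */
};

static void frontend_init(struct frontend *fe)
{
	fe->req_prod_pvt = 0;
	for (uint32_t i = 0; i < NR_REQ_PER_RING; i++)
		fe->rsp[i].req_id = INVALID_ID;
}

/* Use the masked producer index as the request id, then advance the index. */
static uint32_t next_req_id(struct frontend *fe)
{
	uint32_t req_id = fe->req_prod_pvt & (NR_REQ_PER_RING - 1);

	assert(req_id < NR_REQ_PER_RING);
	fe->req_prod_pvt++;
	return req_id;
}

/* Event-handler side: copy the body first and publish req_id last. */
static void deliver_response(struct frontend *fe, uint32_t req_id, int32_t ret)
{
	fe->rsp[req_id].ret = ret;
	fe->rsp[req_id].req_id = req_id;   /* plain store here; atomic in the driver */
}

/* Waiter side: returns 1 and clears the slot if the response has arrived. */
static int try_consume(struct frontend *fe, uint32_t req_id, int32_t *ret)
{
	if (fe->rsp[req_id].req_id != req_id)
		return 0;
	*ret = fe->rsp[req_id].ret;
	fe->rsp[req_id].req_id = INVALID_ID;   /* make the slot reusable */
	return 1;
}
```

Publishing req_id last means a waiter that sees its id in the slot can rely on the rest of the response already being in place, provided the stores are ordered as the commit message describes.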
2017-10-31 | xen/pvcalls: connect to the backend | Stefano Stabellini | 1 | -0/+132
Implement the probe function for the pvcalls frontend. Read the supported versions, max-page-order and function-calls nodes from xenstore. Only one frontend<->backend connection is supported at any given time for a guest. Store the active frontend device to a static pointer. Introduce a stub function for the event handler. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: implement frontend disconnect | Stefano Stabellini | 1 | -0/+71
Introduce a data structure named pvcalls_bedata. It contains pointers to the command ring, the event channel, a list of active sockets and a list of passive sockets. List accesses are protected by a spin_lock. Introduce a waitqueue to allow waiting for a response on commands sent to the backend. Introduce an array of struct xen_pvcalls_response to store command responses. Introduce a new struct sock_mapping to keep track of sockets. In this patch the struct sock_mapping is minimal; the fields will be added by the next patches. pvcalls_refcount is used to keep count of the outstanding pvcalls users. Only remove connections once the refcount is zero. Implement pvcalls frontend removal function. Go through the list of active and passive sockets and free them all, one at a time. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-31 | xen/pvcalls: introduce the pvcalls xenbus frontend | Stefano Stabellini | 1 | -0/+61
Introduce a xenbus frontend for the pvcalls protocol, as defined by https://xenbits.xen.org/docs/unstable/misc/pvcalls.html. This patch only adds the stubs, the code will be added by the following patches. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-27 | Merge tag 'for-linus-4.14c-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip | Linus Torvalds | 2 | -7/+14
Pull xen fixes from Juergen Gross:
- a fix for the Xen gntdev device repairing an issue in case of partial failure of mapping multiple pages of another domain
- a fix of a regression in the Xen balloon driver introduced in 4.13
- a build fix for Xen on ARM which will trigger e.g. for Linux RT
- a maintainers update for pvops (not really Xen, but carrying through this tree just for convenience)
* tag 'for-linus-4.14c-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  maintainers: drop Chris Wright from pvops
  arm/xen: don't inclide rwlock.h directly.
  xen: fix booting ballooned down hvm guest
  xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap()
2017-10-26 | xen: fix booting ballooned down hvm guest | Juergen Gross | 1 | -6/+13
Commit 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't online new memory initially") introduced a regression when booting a HVM domain with memory less than mem-max: instead of ballooning down immediately the system would try to use the memory up to mem-max resulting in Xen crashing the domain. For HVM domains the current size will be reflected in Xenstore node memory/static-max instead of memory/target. Additionally we have to trigger the ballooning process at once. Cc: <[email protected]> # 4.13 Fixes: 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't online new memory initially") Reported-by: Simon Gaiser <[email protected]> Suggested-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-25 | xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap() | Juergen Gross | 1 | -1/+1
In case gntdev_mmap() succeeds only partially in mapping grant pages it will leave some vital information uninitialized needed later for cleanup. This will lead to an out of bounds array access when unmapping the already mapped pages. So just initialize the data needed for unmapping the pages a little bit earlier. Cc: <[email protected]> Reported-by: Arthur Borsboom <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]>
2017-10-11 | xen: don't open-code iov_iter_kvec() | Al Viro | 1 | -12/+4
Signed-off-by: Al Viro <[email protected]>
2017-10-05 | timer: Remove expires and data arguments from DEFINE_TIMER | Kees Cook | 1 | -1/+1
Drop the arguments from the macro and adjust all callers with the following script:
  perl -pi -e 's/DEFINE_TIMER\((.*), 0, 0\);/DEFINE_TIMER($1);/g;' \
    $(git grep DEFINE_TIMER | cut -d: -f1 | sort -u | grep -v timer.h)
Signed-off-by: Kees Cook <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> # for m68k parts Acked-by: Guenter Roeck <[email protected]> # for watchdog parts Acked-by: David S. Miller <[email protected]> # for networking parts Acked-by: Greg Kroah-Hartman <[email protected]> Acked-by: Kalle Valo <[email protected]> # for wireless parts Acked-by: Arnd Bergmann <[email protected]> Cc: [email protected] Cc: Petr Mladek <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Sebastian Reichel <[email protected]> Cc: Kalle Valo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Machek <[email protected]> Cc: [email protected] Cc: Chris Metcalf <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "James E.J. Bottomley" <[email protected]> Cc: Wim Van Sebroeck <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Ursula Braun <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Harish Patil <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Michael Reed <[email protected]> Cc: Manish Chopra <[email protected]> Cc: Len Brown <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: [email protected] Cc: Heiko Carstens <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Julian Wiedmann <[email protected]> Cc: John Stultz <[email protected]> Cc: Mark Gross <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "Martin K. Petersen" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Stefan Richter <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: [email protected] Cc: Martin Schwidefsky <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: Sudip Mukherjee <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-09-29 | Merge tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip | Linus Torvalds | 1 | -1/+10
Pull xen fixes from Juergen Gross:
- avoid a warning when compiling with clang
- consider read-only bits in xen-pciback when writing to a BAR
- fix a boot crash of pv-domains
* tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
  xen-pciback: relax BAR sizing write value check
  x86/xen: clean up clang build warning
2017-09-28 | xen-pciback: relax BAR sizing write value check | Jan Beulich | 1 | -1/+10
Just like done in d2bd05d88d ("xen-pciback: return proper values during BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the written value directly against ~0, but consider the r/o bits at the bottom (if any). Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]>
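The relaxed comparison described above can be illustrated with a small, self-contained check. This is a sketch, not the xen-pciback code: the macro and function names are hypothetical, and the masks simply mirror the standard BAR layout, where the low two bits of an I/O BAR and the low four bits of a memory BAR are read-only type and flag bits.

```c
#include <assert.h>
#include <stdint.h>

/* Read-only low bits of a BAR (hypothetical names, standard PCI layout). */
#define IO_BAR_RO_BITS  0x3u   /* bits 1:0 of an I/O BAR */
#define MEM_BAR_RO_BITS 0xFu   /* bits 3:0 of a memory BAR */

/* Strict check: only an all-ones write counts as a sizing probe. */
static int is_sizing_write_strict(uint32_t value)
{
	return value == UINT32_MAX;
}

/* Relaxed check: the read-only bits are ignored in the comparison. */
static int is_sizing_write(uint32_t value, uint32_t ro_bits)
{
	assert((ro_bits & ~MEM_BAR_RO_BITS) == 0);   /* only low bits may be r/o */
	return (value | ro_bits) == UINT32_MAX;
}
```

With these definitions a guest that writes 0xFFFFFFF0 to a memory BAR is still recognized as probing the BAR size, whereas the strict comparison would treat the same write as an ordinary address update.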
2017-09-22 | Merge tag 'for-linus-4.14b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip | Linus Torvalds | 1 | -63/+67
Pull xen fixes from Juergen Gross:
"A fix for a missing __init annotation and two cleanup patches"
* tag 'for-linus-4.14b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen, arm64: drop dummy lookup_address()
  xen: don't compile pv-specific parts if XEN_PV isn't configured
  xen: x86: mark xen_find_pt_base as __init
2017-09-18 | xen: don't compile pv-specific parts if XEN_PV isn't configured | Juergen Gross | 1 | -63/+67
xenbus_client.c contains some functions specific for pv guests. Enclose them with #ifdef CONFIG_XEN_PV to avoid compiling them when they are not needed (e.g. on ARM). Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]>
2017-09-13 | mm: treewide: remove GFP_TEMPORARY allocation flag | Michal Hocko | 1 | -1/+1
GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. Its primary motivation was to allow users to tell that an allocation is short lived and so the allocator can try to place such allocations close together and prevent long term fragmentation. As much as this sounds like a reasonable semantic it becomes much less clear when to use the high-level GFP_TEMPORARY allocation flag. How long is temporary? Can the context holding that memory sleep? Can it take locks? It seems there is no good answer for those questions. The current implementation of GFP_TEMPORARY is basically GFP_KERNEL | __GFP_RECLAIMABLE which in itself is tricky because basically none of the existing callers provide a way to reclaim the allocated memory. So this is rather misleading and hard to evaluate for any benefits. I have checked some random users and none of them has added the flag with a specific justification. I suspect most of them just copied from other existing users and others just thought it might be a good idea to use without any measuring. This suggests that GFP_TEMPORARY just motivates cargo cult usage without any reasoning. I believe that our gfp flags are quite complex already and especially those with high-level semantic should be clearly defined to prevent confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and replacing all existing users with plain GFP_KERNEL. Please note that SLAB users with shrinkers will still get the __GFP_RECLAIMABLE heuristic and so they will be placed properly for memory fragmentation prevention. I can see reasons we might want some gfp flag to reflect short-term allocations but I propose starting from a clear semantic definition and only then add users with proper justification. This was brought up before LSF this year by Matthew [1] and it turned out that GFP_TEMPORARY really doesn't have a clear semantic. 
It seems to be a heuristic without any measured advantage for most (if not all) its current users. The follow up discussion has revealed that opinions on what might be temporary allocation differ a lot between developers. So rather than trying to tweak existing users into a semantic which they haven't expected I propose to simply remove the flag and start from scratch if we really need a semantic for short term allocations. [1] http://lkml.kernel.org/r/[email protected] [[email protected]: fix typo] [[email protected]: coding-style fixes] [[email protected]: drm/i915: fix up] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Hocko <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Neil Brown <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-09-07 | Merge tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip | Linus Torvalds | 6 | -8/+1262
Pull xen updates from Juergen Gross:
- the new pvcalls backend for routing socket calls from a guest to dom0
- some cleanups of Xen code
- a fix for wrong usage of {get,put}_cpu()
* tag 'for-linus-4.14b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (27 commits)
  xen/mmu: set MMU_NORMAL_PT_UPDATE in remap_area_mfn_pte_fn
  xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests
  xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init()
  xen/pvcalls: use WARN_ON(1) instead of __WARN()
  xen: remove not used trace functions
  xen: remove unused function xen_set_domain_pte()
  xen: remove tests for pvh mode in pure pv paths
  xen-platform: constify pci_device_id.
  xen: cleanup xen.h
  xen: introduce a Kconfig option to enable the pvcalls backend
  xen/pvcalls: implement write
  xen/pvcalls: implement read
  xen/pvcalls: implement the ioworker functions
  xen/pvcalls: disconnect and module_exit
  xen/pvcalls: implement release command
  xen/pvcalls: implement poll command
  xen/pvcalls: implement accept command
  xen/pvcalls: implement listen command
  xen/pvcalls: implement bind command
  xen/pvcalls: implement connect command
  ...
2017-09-05 | Merge tag 'driver-core-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core | Linus Torvalds | 1 | -24/+20
Pull driver core update from Greg KH:
"Here is the "big" driver core update for 4.14-rc1. It's really not all that big, the largest thing here being some firmware tests to help ensure that that crazy api is working properly.
There's also a new uevent for when a driver is bound or unbound from a device, fixing a hole in the driver model that's been there since the very beginning. Many thanks to Dmitry for being persistent and pointing out how wrong I was about this all along :)
Patches for the new uevents are already in the systemd tree, if people want to play around with them.
Otherwise just a number of other small api changes and updates here, nothing major. All of these patches have been in linux-next for a while with no reported issues"
* tag 'driver-core-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits)
  driver core: bus: Fix a potential double free
  Do not disable driver and bus shutdown hook when class shutdown hook is set.
  base: topology: constify attribute_group structures.
  base: Convert to using %pOF instead of full_name
  kernfs: Clarify lockdep name for kn->count
  fbdev: uvesafb: remove DRIVER_ATTR() usage
  xen: xen-pciback: remove DRIVER_ATTR() usage
  driver core: Document struct device:dma_ops
  mod_devicetable: Remove excess description from structured comment
  test_firmware: add batched firmware tests
  firmware: enable a debug print for batched requests
  firmware: define pr_fmt
  firmware: send -EINTR on signal abort on fallback mechanism
  test_firmware: add test case for SIGCHLD on sync fallback
  initcall_debug: add deferred probe times
  Input: axp20x-pek - switch to using devm_device_add_group()
  Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01
  Input: gpio_keys - use devm_device_add_group() for attributes
  driver core: add devm_device_add_group() and friends
  driver core: add device_{add|remove}_group() helpers
  ...
2017-09-04 | Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -4/+2
Pull x86 apic updates from Thomas Gleixner:
"This update provides:
- Cleanup of the IDT management including the removal of the extra tracing IDT. A first step to cleanup the vector management code.
- The removal of the paravirt op adjust_exception_frame. This is a XEN specific issue, but merged through this branch to avoid nasty merge collisions
- Prevent dmesg spam about the TSC DEADLINE bug, when the CPU has disabled the TSC DEADLINE timer in CPUID.
- Adjust a debug message in the ioapic code to print out the information correctly"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  x86/idt: Fix the X86_TRAP_BP gate
  x86/xen: Get rid of paravirt op adjust_exception_frame
  x86/eisa: Add missing include
  x86/idt: Remove superfluous ALIGNment
  x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature
  x86/idt: Remove the tracing IDT leftovers
  x86/idt: Hide set_intr_gate()
  x86/idt: Simplify alloc_intr_gate()
  x86/idt: Deinline setup functions
  x86/idt: Remove unused functions/inlines
  x86/idt: Move interrupt gate initialization to IDT code
  x86/idt: Move APIC gate initialization to tables
  x86/idt: Move regular trap init to tables
  x86/idt: Move IST stack based traps to table init
  x86/idt: Move debug stack init to table based
  x86/idt: Switch early trap init to IDT tables
  x86/idt: Prepare for table based init
  x86/idt: Move early IDT setup out of 32-bit asm
  x86/idt: Move early IDT handler setup to IDT code
  x86/idt: Consolidate IDT invalidation
  ...
2017-08-31 | xen/gntdev: update to new mmu_notifier semantic | Jérôme Glisse | 1 | -8/+0
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Cc: Konrad Rzeszutek Wilk <[email protected]> Cc: Roger Pau Monné <[email protected]> Cc: [email protected] (moderated for non-subscribers) Cc: Kirill A. Shutemov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andrea Arcangeli <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-31 | xen: Don't try to call xen_alloc_p2m_entry() on autotranslating guests | Boris Ostrovsky | 1 | -3/+5
Commit aba831a69632 ("xen: remove tests for pvh mode in pure pv paths") removed XENFEAT_auto_translated_physmap test in xen_alloc_p2m_entry() since it is assumed that the routine is never called by non-PV guests. However, alloc_xenballooned_pages() may make this call on a PVH guest. Prevent this from happening by adding XENFEAT_auto_translated_physmap check there. Signed-off-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Fixes: aba831a69632 ("xen: remove tests for pvh mode in pure pv paths")
2017-08-31 | xen/events: events_fifo: Don't use {get,put}_cpu() in xen_evtchn_fifo_init() | Julien Grall | 1 | -4/+3
When booting Linux as Xen guest with CONFIG_DEBUG_ATOMIC, the following splat appears:
[ 0.002323] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.019717] ASID allocator initialised with 65536 entries
[ 0.020019] xen:grant_table: Grant tables using version 1 layout
[ 0.020051] Grant table initialized
[ 0.020069] BUG: sleeping function called from invalid context at /data/src/linux/mm/page_alloc.c:4046
[ 0.020100] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
[ 0.020123] no locks held by swapper/0/1.
[ 0.020143] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc5 #598
[ 0.020166] Hardware name: FVP Base (DT)
[ 0.020182] Call trace:
[ 0.020199] [<ffff00000808a5c0>] dump_backtrace+0x0/0x270
[ 0.020222] [<ffff00000808a95c>] show_stack+0x24/0x30
[ 0.020244] [<ffff000008c1ef20>] dump_stack+0xb8/0xf0
[ 0.020267] [<ffff0000081128c0>] ___might_sleep+0x1c8/0x1f8
[ 0.020291] [<ffff000008112948>] __might_sleep+0x58/0x90
[ 0.020313] [<ffff0000082171b8>] __alloc_pages_nodemask+0x1c0/0x12e8
[ 0.020338] [<ffff00000827a110>] alloc_page_interleave+0x38/0x88
[ 0.020363] [<ffff00000827a904>] alloc_pages_current+0xdc/0xf0
[ 0.020387] [<ffff000008211f38>] __get_free_pages+0x28/0x50
[ 0.020411] [<ffff0000086566a4>] evtchn_fifo_alloc_control_block+0x2c/0xa0
[ 0.020437] [<ffff0000091747b0>] xen_evtchn_fifo_init+0x38/0xb4
[ 0.020461] [<ffff0000091746c0>] xen_init_IRQ+0x44/0xc8
[ 0.020484] [<ffff000009128adc>] xen_guest_init+0x250/0x300
[ 0.020507] [<ffff000008083974>] do_one_initcall+0x44/0x130
[ 0.020531] [<ffff000009120df8>] kernel_init_freeable+0x120/0x288
[ 0.020556] [<ffff000008c31ca8>] kernel_init+0x18/0x110
[ 0.020578] [<ffff000008083710>] ret_from_fork+0x10/0x40
[ 0.020606] xen:events: Using FIFO-based ABI
[ 0.020658] Xen: initializing cpu0
[ 0.027727] Hierarchical SRCU implementation.
[ 0.036235] EFI services will not be available.
[ 0.043810] smp: Bringing up secondary CPUs ...
This is because get_cpu() in xen_evtchn_fifo_init() will disable preemption, but __get_free_page() might sleep (GFP_ATOMIC is not set). xen_evtchn_fifo_init() will always be called before SMP is initialized, so {get,put}_cpu() could be replaced by a simple smp_processor_id(). This also avoids modifying evtchn_fifo_alloc_control_block(), which is also called in other contexts. Signed-off-by: Julien Grall <[email protected]> Reported-by: Andre Przywara <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Fixes: 1fe565517b57 ("xen/events: use the FIFO-based ABI if available") Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31 | xen/pvcalls: use WARN_ON(1) instead of __WARN() | Arnd Bergmann | 1 | -5/+5
__WARN() is an internal helper that is only available on some architectures, but causes a build error e.g. on ARM64 in some configurations: drivers/xen/pvcalls-back.c: In function 'set_backend_state': drivers/xen/pvcalls-back.c:1097:5: error: implicit declaration of function '__WARN' [-Werror=implicit-function-declaration] Unfortunately, there is no equivalent of BUG() that takes no arguments, but WARN_ON(1) is commonly used in other drivers and works on all configurations. Fixes: 7160378206b2 ("xen/pvcalls: xenbus state handling") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31 | xen-platform: constify pci_device_id. | Arvind Yadav | 1 | -1/+1
pci_device_id are not supposed to change at runtime. All functions working with pci_device_id provided by <linux/pci.h> work with const pci_device_id. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31 | xen: introduce a Kconfig option to enable the pvcalls backend | Stefano Stabellini | 2 | -0/+13
Also add pvcalls-back to the Makefile. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31 | xen/pvcalls: implement write | Stefano Stabellini | 1 | -0/+71
When the other end notifies us that there is data to be written (pvcalls_back_conn_event), increment the io and write counters, and schedule the ioworker. Implement the write function called by ioworker by reading the data from the data ring, writing it to the socket by calling inet_sendmsg. Set out_error on error. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31 | xen/pvcalls: implement read | Stefano Stabellini | 1 | -0/+85
When an active socket has data available, increment the io and read counters, and schedule the ioworker. Implement the read function by reading from the socket, writing the data to the data ring. Set in_error on error. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31 | xen/pvcalls: implement the ioworker functions | Stefano Stabellini | 1 | -0/+26
We have one ioworker per socket. Each ioworker goes through the list of outstanding read/write requests. Once all requests have been dealt with, it returns. We use one atomic counter per socket for "read" operations and one for "write" operations to keep track of the reads/writes to do. We also use one atomic counter ("io") per ioworker to keep track of how many outstanding requests we have in total assigned to the ioworker. The ioworker finishes when there are none. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
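The counter scheme in the message above can be sketched with C11 atomics. This is a single-threaded illustration only: struct conn and the functions below are hypothetical, the "work" is reduced to bumping a tally, and the real backend runs the worker from a workqueue while notifications arrive concurrently.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-socket state; not the backend's struct sock_mapping. */
struct conn {
	atomic_int read;     /* pending "read" work */
	atomic_int write;    /* pending "write" work */
	atomic_int io;       /* total outstanding requests for the ioworker */
	int reads_done;      /* tallies standing in for the real I/O */
	int writes_done;
};

static void conn_init(struct conn *c)
{
	atomic_init(&c->read, 0);
	atomic_init(&c->write, 0);
	atomic_init(&c->io, 0);
	c->reads_done = 0;
	c->writes_done = 0;
}

/* Data became available on the socket: account for one read request. */
static void notify_read(struct conn *c)
{
	atomic_fetch_add(&c->read, 1);
	atomic_fetch_add(&c->io, 1);
}

/* The other end queued data: account for one write request. */
static void notify_write(struct conn *c)
{
	atomic_fetch_add(&c->write, 1);
	atomic_fetch_add(&c->io, 1);
}

/* Keep going while requests are outstanding; return once there are none. */
static void ioworker(struct conn *c)
{
	while (atomic_load(&c->io) > 0) {
		if (atomic_load(&c->read) > 0) {
			c->reads_done++;
			atomic_fetch_sub(&c->read, 1);
		}
		if (atomic_load(&c->write) > 0) {
			c->writes_done++;
			atomic_fetch_sub(&c->write, 1);
		}
		atomic_fetch_sub(&c->io, 1);
	}
	/* Holds in this single-threaded sketch, where nothing notifies concurrently. */
	assert(atomic_load(&c->read) == 0 && atomic_load(&c->write) == 0);
}
```

Because "io" counts every notification, the loop runs at least as many times as there are pending reads or writes, so both per-direction counters reach zero before the worker returns.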
2017-08-31 | xen/pvcalls: disconnect and module_exit | Stefano Stabellini | 1 | -0/+53
Implement backend_disconnect. Call pvcalls_back_release_active on active sockets and pvcalls_back_release_passive on passive sockets. Implement module_exit by calling backend_disconnect on frontend connections. [ boris: fixed long lines ] Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: implement release commandStefano Stabellini1-0/+68
Release both active and passive sockets. For active sockets, make sure to avoid possible conflicts with the ioworker reading/writing to those sockets concurrently. Set map->release to let the ioworker know atomically that the socket will be released soon, then wait until the ioworker finishes (flush_work). Unmap indexes pages and data rings. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: implement poll commandStefano Stabellini1-1/+74
Implement poll on passive sockets by requesting a delayed response with mappass->reqcopy, and reply back when there is data on the passive socket. Poll on active sockets is unimplemented, as per the spec: the frontend should just wait for events and check the indexes on the indexes page. Only support one outstanding poll (or accept) request for every passive socket at any given time. [ boris: fixed long lines ] Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: implement accept commandStefano Stabellini1-0/+113
Implement the accept command by calling inet_accept. To avoid blocking in the kernel, call inet_accept(O_NONBLOCK) from a workqueue, which gets scheduled on sk_data_ready (for a passive socket, it means that there are connections to accept). Use the reqcopy field to store the request. Accept the new socket from the delayed work function, create a new sock_mapping for it, map the indexes page and data ring, and reply to the other end. Allocate an ioworker for the socket. Only support one outstanding blocking accept request for every socket at any time. Add a field to sock_mapping to remember the passive socket from which an active socket was created. [ boris: fixed whitespaces ] Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: implement listen commandStefano Stabellini1-0/+21
Call inet_listen to implement the listen command. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: implement bind commandStefano Stabellini1-0/+79
Allocate a socket. Track the allocated passive sockets with a new data structure named sockpass_mapping. It contains an unbound workqueue to schedule delayed work for the accept and poll commands. It also has a reqcopy field to be used to store a copy of a request for delayed work. Reads/writes to it are protected by a lock (the "copy_lock" spinlock). Initialize the workqueue in pvcalls_back_bind. Implement the bind command with inet_bind. The pass_sk_data_ready event handler will be added later. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
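The reqcopy field described above acts as a one-element slot for a request whose reply is deferred. A minimal single-threaded sketch of that idea follows. `struct request`, `struct passive_sock`, `park_request`, `take_request` and `reqcopy_demo` are hypothetical names, the "cmd == 0 means empty" convention is an assumption of this sketch, and the locking that the commit attributes to copy_lock is deliberately left out.

```c
#include <assert.h>

/* Hypothetical, simplified request: cmd == 0 is reserved for "empty". */
struct request {
    int cmd;
    unsigned long id;
};

struct passive_sock {
    struct request reqcopy; /* at most one parked request */
};

/* Park a request for a delayed reply. Returns 0 on success, -1 if a
 * request is already parked. (Real code would hold a lock here.) */
static int park_request(struct passive_sock *p, const struct request *req)
{
    assert(req->cmd != 0);
    if (p->reqcopy.cmd != 0)
        return -1;
    p->reqcopy = *req;
    return 0;
}

/* Take the parked request, if any, and empty the slot.
 * Returns 1 if a request was copied to out, 0 if the slot was empty. */
static int take_request(struct passive_sock *p, struct request *out)
{
    if (p->reqcopy.cmd == 0)
        return 0;
    *out = p->reqcopy;
    p->reqcopy.cmd = 0;
    return 1;
}

/* Demo: a second request is refused until the first one is taken.
 * Returns 1 if every step behaved as expected. */
static int reqcopy_demo(void)
{
    struct passive_sock p = { { 0, 0 } };
    struct request a = { 1, 100 }, b = { 2, 200 }, got = { 0, 0 };

    if (park_request(&p, &a) != 0)
        return 0;
    if (park_request(&p, &b) != -1)
        return 0;
    if (take_request(&p, &got) != 1 || got.id != 100)
        return 0;
    return park_request(&p, &b) == 0;
}
```

The single slot is what limits each passive socket to one outstanding delayed request at a time.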
2017-08-31xen/pvcalls: implement connect commandStefano Stabellini1-0/+179
Allocate a socket. Keep track of socket <-> ring mappings with a new data structure, called sock_mapping. Implement the connect command by calling inet_stream_connect, and mapping the new indexes page and data ring. Allocate a workqueue and a work_struct, called ioworker, to perform reads and writes to the socket. When an active socket is closed (sk_state_change), set in_error to -ENOTCONN and notify the other end, as specified by the protocol. sk_data_ready and pvcalls_back_ioworker will be implemented later. [ boris: fixed whitespaces ] Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: implement socket commandStefano Stabellini1-0/+27
Just reply with success to the other end for now. Delay the allocation of the actual socket to bind and/or connect. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: handle commands from the frontendStefano Stabellini1-0/+125
When the other end notifies us that there are commands to be read (pvcalls_back_event), wake up the backend thread to parse the command. The command ring works like most other Xen rings, so use the usual ring macros to read and write to it. The functions implementing the commands are empty stubs for now. [ boris: fixed whitespaces ] Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
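The command-ring handling described above follows the usual producer/consumer pattern. Below is a simplified user-space sketch of that pattern, not the Xen ring macros themselves: `struct cmd`, `struct cmd_ring`, `ring_push`, `ring_drain` and `cmd_ring_demo` are hypothetical names, and the memory barriers and event-channel notification a shared ring requires are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define CMD_RING_SIZE 4u /* hypothetical; must be a power of two */

struct cmd {
    uint32_t req_id;
    uint32_t op;
};

struct cmd_ring {
    struct cmd slots[CMD_RING_SIZE];
    uint32_t req_prod; /* advanced by the producer (frontend role) */
    uint32_t req_cons; /* advanced by the consumer (backend role) */
};

static int ring_has_unconsumed(const struct cmd_ring *r)
{
    return r->req_prod != r->req_cons;
}

/* Producer: place one command in the next free slot. */
static void ring_push(struct cmd_ring *r, struct cmd c)
{
    assert(r->req_prod - r->req_cons < CMD_RING_SIZE); /* ring not full */
    r->slots[r->req_prod & (CMD_RING_SIZE - 1)] = c;
    r->req_prod++;
}

/* Consumer: copy each pending command out of the ring, then record its
 * op in ops_seen. Returns the number of commands consumed. */
static int ring_drain(struct cmd_ring *r, uint32_t *ops_seen)
{
    int n = 0;

    while (ring_has_unconsumed(r)) {
        /* Copy first so later checks work on a private snapshot. */
        struct cmd c = r->slots[r->req_cons & (CMD_RING_SIZE - 1)];

        r->req_cons++;
        ops_seen[n++] = c.op;
    }
    return n;
}

/* Demo: two batches of three commands; the second batch wraps around.
 * Returns 1 if both batches came out complete and in order. */
static int cmd_ring_demo(void)
{
    struct cmd_ring r = { .req_prod = 0, .req_cons = 0 };
    uint32_t ops[CMD_RING_SIZE];
    uint32_t i;

    for (i = 0; i < 3; i++)
        ring_push(&r, (struct cmd){ .req_id = i, .op = 10 + i });
    if (ring_drain(&r, ops) != 3 || ops[0] != 10 || ops[2] != 12)
        return 0;
    for (i = 3; i < 6; i++)
        ring_push(&r, (struct cmd){ .req_id = i, .op = 10 + i });
    return ring_drain(&r, ops) == 3 && ops[0] == 13 && ops[2] == 15;
}
```

Copying each slot before acting on it matters when the other end can still write to the shared page; the sketch keeps that habit even though it runs in a single thread.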
2017-08-31xen/pvcalls: connect to a frontendStefano Stabellini1-0/+82
Introduce a per-frontend data structure named pvcalls_fedata. It contains pointers to the command ring, its event channel, a list of active sockets and a tree of passive sockets (passive sockets need to be looked up from the id on listen, accept and poll commands, while active sockets only on release). It also has an unbound workqueue to schedule the work of parsing and executing commands on the command ring. socket_lock protects both the list and the tree. In pvcalls_back_global, keep a list of connected frontends. [ boris: fixed whitespaces/long lines ] Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: xenbus state handlingStefano Stabellini1-0/+155
Introduce the code to handle xenbus state changes. Implement the probe function for the pvcalls backend. Write the supported versions, max-page-order and function-calls nodes to xenstore, as required by the protocol. Introduce stub functions for disconnecting/connecting to a frontend. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: initialize the module and register the xenbus backendStefano Stabellini1-0/+22
Keep a list of connected frontends. Use a semaphore to protect list accesses. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-31xen/pvcalls: introduce the pvcalls xenbus backendStefano Stabellini1-0/+61
Introduce a xenbus backend for the pvcalls protocol, as defined by https://xenbits.xen.org/docs/unstable/misc/pvcalls.html. This patch only adds the stubs; the code will be added by the following patches. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Boris Ostrovsky <[email protected]>
2017-08-29x86/idt: Simplify alloc_intr_gate()Thomas Gleixner1-4/+2
The only users of alloc_intr_gate() are hypervisors, both of which check the used_vectors bitmap to see whether they have already allocated the gate. Move that check into alloc_intr_gate() and simplify the users. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Reviewed-by: K. Y. Srinivasan <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-08-28xen: xen-pciback: remove DRIVER_ATTR() usageGreg Kroah-Hartman1-24/+20
It's better to be explicit and use the DRIVER_ATTR_RW() and DRIVER_ATTR_RO() macros when defining a driver's sysfs file. Bonus is this fixes up a checkpatch.pl warning. This is part of a series to drop DRIVER_ATTR() from the tree entirely. Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-08-24Merge tag 'kbuild-fixes-v4.13' of ↵Linus Torvalds1-3/+0
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- fix linker script regression caused by dead code elimination support
- fix typos and outdated comments
- specify kselftest-clean as a PHONY target
- fix "make dtbs_install" when $(srctree) includes shell special characters like '~'
- Move -fshort-wchar to the global option list because defining it partially emits warnings
* tag 'kbuild-fixes-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: update comments of Makefile.asm-generic
kbuild: Do not use hyphen in exported variable name
Makefile: add kselftest-clean to PHONY target list
Kbuild: use -fshort-wchar globally
fixdep: trivial: typo fix and correction
kbuild: trivial cleanups on the comments
kbuild: linker script do not match C names unless LD_DEAD_CODE_DATA_ELIMINATION is configured