aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2019-08-28Merge branch 'mlx5-odp-dc' into rdma.git for-nextJason Gunthorpe1-1/+3
Michael Guralnik says: ==================== The series adds support for on-demand paging for DC transport. As DC is a mlx-only transport, the capabilities are exposed to the user using DEVX objects and later on through mlx5dv_query_device. ==================== Based on the mlx5-next branch from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for dependencies * branch 'mlx5-odp-dc': IB/mlx5: Add page fault handler for DC initiator WQE IB/mlx5: Remove check of FW capabilities in ODP page fault handling net/mlx5: Set ODP capabilities for DC transport to max
2019-08-28PCI: Move ASPM declarations to linux/pci.hKrzysztof Wilczynski2-36/+18
Move ASPM definitions and function prototypes from include/linux/pci-aspm.h to include/linux/pci.h so users only need to include <linux/pci.h>: PCIE_LINK_STATE_L0S PCIE_LINK_STATE_L1 PCIE_LINK_STATE_CLKPM pci_disable_link_state() pci_disable_link_state_locked() pcie_no_aspm() No functional changes intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Krzysztof Wilczynski <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2019-08-28hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARDSebastian Andrzej Siewior1-0/+2
Add kernel doc annotation for HRTIMER_MODE_HARD. Fixes: ae6683d815895 ("hrtimer: Introduce HARD expiry mode") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28regulator: mt6358: Add support for MT6358 regulatorHsin-Hsiung Wang1-0/+56
The MT6358 is a regulator found on boards based on MediaTek MT8183 and probably other SoCs. It is a so called pmic and connects as a slave to SoC using SPI, wrapped inside the pmic-wrapper. Signed-off-by: Hsin-Hsiung Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2019-08-28usb: gadget: Export recommended BESL valuesThinh Nguyen1-0/+3
Currently there's no option for the controller driver to report the recommended Best Effort Service Latency (BESL) when operating with LPM support. Add new fields in usb_dcd_config_params to export the recommended baseline and deep BESL values for the function drivers to set the proper BESL value in the BOS descriptor. Signed-off-by: Thinh Nguyen <[email protected]> Signed-off-by: Felipe Balbi <[email protected]>
2019-08-28posix-cpu-timers: Utilize timerqueue for storageThomas Gleixner2-16/+59
Using a linear O(N) search for timer insertion affects execution time and D-cache footprint badly with a larger number of timers. Switch the storage to a timerqueue which is already used for hrtimers and alarmtimers. It does not affect the size of struct k_itimer as it.alarm is still larger. The extra list head for the expiry list will go away later once the expiry is moved into task work context. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Move state tracking to struct posix_cputimersThomas Gleixner3-9/+14
Put it where it belongs and clean up the ifdeffery in fork completely. Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Get rid of zero checksThomas Gleixner1-2/+5
Deactivation of the expiry cache is done by setting all clock caches to 0. That requires to have a check for zero in all places which update the expiry cache: if (cache == 0 || new < cache) cache = new; Use U64_MAX as the deactivated value, which allows to remove the zero checks when updating the cache and reduces it to the obvious check: if (new < cache) cache = new; This also removes the weird workaround in do_prlimit() which was required to convert a RLIMIT_CPU value of 0 (immediate expiry) to 1 because handing in 0 to the posix CPU timer code would have effectively disarmed it. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Switch thread group sampling to arrayThomas Gleixner1-1/+1
That allows more simplifications in various places. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Restructure expiry arrayThomas Gleixner1-14/+27
Now that the abused struct task_cputime is gone, it's more natural to bundle the expiry cache and the list head of each clock into a struct and have an array of those structs. Follow the hrtimer naming convention of 'bases' and rename the expiry cache to 'nextevt' and adapt all usage sites. Generates also better code .text size shrinks by 80 bytes. Suggested-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Remove cputime_expiresThomas Gleixner1-7/+2
The last users of the magic struct cputime based expiry cache are gone. Remove the leftovers. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Remove the odd field rename definesThomas Gleixner1-15/+0
The last users of the odd define based renaming of struct task_cputime members are gone. Good riddance. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Provide array based access to expiry cacheThomas Gleixner2-8/+20
Using struct task_cputime for the expiry cache is a pretty odd choice and comes with magic defines to rename the fields for usage in the expiry cache. struct task_cputime is basically a u64 array with 3 members, but it has distinct members. The expiry cache content is different than the content of task_cputime because expiry[PROF] = task_cputime.stime + task_cputime.utime expiry[VIRT] = task_cputime.utime expiry[SCHED] = task_cputime.sum_exec_runtime So there is no direct mapping between task_cputime and the expiry cache and the #define based remapping is just a horrible hack. Having the expiry cache array based allows further simplification of the expiry code. To avoid an all in one cleanup which is hard to review add a temporary anonymous union into struct task_cputime which allows array based access to it. That requires to reorder the members. Add a build time sanity check to validate that the members are at the same place. The union and the build time checks will be removed after conversion. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Move expiry cache into struct posix_cputimersThomas Gleixner3-11/+22
The expiry cache belongs into the posix_cputimers container where the other cpu timers information is. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28sched: Move struct task_cputime to types.hThomas Gleixner2-16/+24
For upcoming posix-timer changes to avoid include recursion hell. Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Create a container structThomas Gleixner4-14/+40
Per task/process data of posix CPU timers is all over the place which makes the code hard to follow and requires ifdeffery. Create a container to hold all this information in one place, so data is consolidated and the ifdeffery can be confined to the posix timer header file and removed from places like fork. As a first step, move the cpu_timers list head array into the new struct and clean up the initializers and simplify fork. The remaining #ifdef in fork will be removed later. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Rename thread_group_cputimer() and make it staticThomas Gleixner1-1/+0
thread_group_cputimer() is a complete misnomer. The function does two things: - For arming process wide timers it makes sure that the atomic time storage is up to date. If no cpu timer is armed yet, then the atomic time storage is not updated by the scheduler for performance reasons. In that case a full summing up of all threads needs to be done and the update needs to be enabled. - Samples the current time into the caller supplied storage. Rename it to thread_group_start_cputime(), make it static and fixup the callsite. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28posix-cpu-timers: Provide quick sample function for itimerThomas Gleixner1-1/+1
get_itimer() needs a sample of the current thread group cputime. It invokes thread_group_cputimer() - which is a misnomer. That function also starts eventually the group cputime accouting which is bogus because the accounting is already active when a timer is armed. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-08-28mtd: spi-nor: Add a ->setup() methodTudor Ambarus1-0/+5
nor->params.setup() configures the SPI NOR memory. Useful for SPI NOR flashes that have peculiarities to the SPI NOR standard, e.g. different opcodes, specific address calculation, page size, etc. Right now the only user will be the S3AN chips, but other manufacturers can implement it if needed. Move spi_nor_setup() related code in order to avoid a forward declaration to spi_nor_default_setup(). Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Add a ->convert_addr() methodBoris Brezillon1-7/+10
In order to separate manufacturer quirks from the core we need to get rid of all the manufacturer specific flags, like the SNOR_F_S3AN_ADDR_DEFAULT one. This can easily be replaced by a ->convert_addr() hook, which when implemented will provide the core with an easy way to convert an absolute address into something the flash understands. Right now the only user are the S3AN chips, but other manufacturers can implement it if needed. Signed-off-by: Boris Brezillon <[email protected]> Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Rework the SPI NOR lock/unlock logicBoris Brezillon1-7/+16
Add the SNOR_F_HAS_LOCK flag and set it when SPI_NOR_HAS_LOCK is set in the flash_info entry or when it's a Micron or ST flash. Move the locking hooks in a separate struct so that we have just one field to update when we change the locking implementation. Signed-off-by: Boris Brezillon <[email protected]> [[email protected]: use ->default_init() hook, introduce spi_nor_late_init_params(), set ops in nor->params] Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Create a ->set_4byte() methodBoris Brezillon1-0/+2
The procedure used to enable 4 byte addressing mode depends on the NOR device, so let's provide a hook so that manufacturer specific handling can be implemented in a sane way. Signed-off-by: Boris Brezillon <[email protected]> [[email protected]: use nor->params.set_4byte() instead of nor->set_4byte()] Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Move erase_map to 'struct spi_nor_flash_parameter'Tudor Ambarus1-3/+5
All flash parameters and settings should reside inside 'struct spi_nor_flash_parameter'. Move the SMPT parsed erase map from 'struct spi_nor' to 'struct spi_nor_flash_parameter'. Please note that there is a roll-back mechanism for the flash parameter and settings, for cases when SFDP parser fails. The SFDP parser receives a Stack allocated copy of nor->params, called sfdp_params, and uses it to retrieve the serial flash discoverable parameters. JESD216 SFDP is a standard and has a higher priority than the default initialized flash parameters, so will overwrite the sfdp_params data when needed. All SFDP code uses the local copy of nor->params, that will overwrite it in the end, if the parser succeds. Saving and restoring the nor->params.erase_map is no longer needed, since the SFDP code does not touch it. Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Drop quad_enable() from 'struct spi-nor'Tudor Ambarus1-2/+0
All flash parameters and settings should reside inside 'struct spi_nor_flash_parameter'. Drop the local copy of quad_enable() and use the one from 'struct spi_nor_flash_parameter'. Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Regroup flash parameter and settingsTudor Ambarus1-75/+164
The scope is to move all [FLASH-SPECIFIC] parameters and settings from 'struct spi_nor' to 'struct spi_nor_flash_parameter'. 'struct spi_nor_flash_parameter' describes the hardware capabilities and associated settings of the SPI NOR flash memory. It includes legacy flash parameters and settings that can be overwritten by the spi_nor_fixups hooks, or dynamically when parsing the JESD216 Serial Flash Discoverable Parameters (SFDP) tables. All SFDP params and settings will fit inside 'struct spi_nor_flash_parameter'. Move spi_nor_hwcaps related code to avoid forward declarations. Add a forward declaration that we can't avoid: 'struct spi_nor' will be used in 'struct spi_nor_flash_parameter'. Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Remove unused macroTudor Ambarus1-1/+0
Remove leftover from nor->cmd_buf. Signed-off-by: Tudor Ambarus <[email protected]>
2019-08-28Merge tag 'v5.3-rc6' into spi-nor/nextTudor Ambarus45-220/+230
Linux 5.3-rc6 Merge back latest release candidate, to include a fix that we depend on for new development: 834de5c1aa76 ("mtd: spi-nor: Fix the disabling of write protection at init")
2019-08-28perf: Allow normal events to output AUX dataAlexander Shishkin1-0/+14
In some cases, ordinary (non-AUX) events can generate data for AUX events. For example, PEBS events can come out as records in the Intel PT stream instead of their usual DS records, if configured to do so. One requirement for such events is to consistently schedule together, to ensure that the data from the "AUX output" events isn't lost while their corresponding AUX event is not scheduled. We use grouping to provide this guarantee: an "AUX output" event can be added to a group where an AUX event is a group leader, and provided that the former supports writing to the latter. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-08-28net/mlx5: Set ODP capabilities for DC transport to maxMichael Guralnik1-1/+3
In mlx5_core initialization, query max ODP capabilities for DC transport from FW and set as current capabilities. Signed-off-by: Michael Guralnik <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2019-08-27net: stmmac: setup higher frequency clk support for EHL & TGLVoon Weifeng1-0/+1
EHL DW EQOS is running on a 200MHz clock. Setting up stmmac-clk, ptp clock and ptp_max_adj to 200MHz. Signed-off-by: Voon Weifeng <[email protected]> Signed-off-by: Ong Boon Leong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-27Add genphy_c45_config_aneg() function to phy-c45.cMarco Hartmann1-0/+1
Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling genphy_config_aneg") introduced a check that aborts phy_config_aneg() if the phy is a C45 phy. This causes phy_state_machine() to call phy_error() so that the phy ends up in PHY_HALTED state. Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg() (analogous to the C22 case) so that the state machine can run correctly. genphy_c45_config_aneg() closely resembles mv3310_config_aneg() in drivers/net/phy/marvell10g.c, excluding vendor specific configurations for 1000BaseT. Fixes: 22b56e827093 ("net: phy: replace genphy_10g_driver with genphy_c45_driver") Signed-off-by: Marco Hartmann <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-28bpf: introduce verifier internal test flagAlexei Starovoitov1-0/+1
Introduce BPF_F_TEST_STATE_FREQ flag to stress test parentage chain and state pruning. Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller7-31/+27
Minor conflict in r8169, bug fix had two versions in net and net-next, take the net-next hunks. Signed-off-by: David S. Miller <[email protected]>
2019-08-27Merge tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-1/+0
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page lock leak in nfs_pageio_resend() - Ensure O_DIRECT reports an error if the bytes read/written is 0 - Don't handle errors if the bind/connect succeeded - Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidat ed" Bugfixes: - Don't refresh attributes with mounted-on-file information - Fix return values for nfs4_file_open() and nfs_finish_open() - Fix pnfs layoutstats reporting of I/O errors - Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort for soft I/O errors when the user specifies a hard mount. - Various fixes to the error handling in sunrpc - Don't report writepage()/writepages() errors twice" * tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: remove set but not used variable 'mapping' NFSv2: Fix write regression NFSv2: Fix eof handling NFS: Fix writepage(s) error handling to not report errors twice NFS: Fix spurious EIO read errors pNFS/flexfiles: Don't time out requests on hard mounts SUNRPC: Handle connection breakages correctly in call_status() Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated" SUNRPC: Handle EADDRINUSE and ENOBUFS correctly pNFS/flexfiles: Turn off soft RPC calls SUNRPC: Don't handle errors if the bind/connect succeeded NFS: On fatal writeback errors, we need to call nfs_inode_remove_request() NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0 NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend() NFSv4: Fix return value in nfs_finish_open() NFSv4: Fix return values for nfs4_file_open() NFS: Don't refresh attributes with mounted-on-file information
2019-08-27Revert "driver core: Add support for linking devices during device addition"Greg Kroah-Hartman1-14/+0
This reverts commit 5302dd7dd0b6d04c63cdce51d1e9fda9ef0be886. Based on a lot of email and in-person discussions, this patch series is being reworked to address a number of issues that were pointed out that needed to be taken care of before it should be merged. It will be resubmitted with those changes hopefully soon. Cc: Frank Rowand <[email protected]> Cc: Saravana Kannan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27Revert "driver core: Add edit_links() callback for drivers"Greg Kroah-Hartman1-20/+0
This reverts commit 134b23eec9e3a3c795a6ceb0efe2fa63e87983b2. Based on a lot of email and in-person discussions, this patch series is being reworked to address a number of issues that were pointed out that needed to be taken care of before it should be merged. It will be resubmitted with those changes hopefully soon. Cc: Frank Rowand <[email protected]> Cc: Saravana Kannan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27Revert "driver core: Add sync_state driver/bus callback"Greg Kroah-Hartman1-26/+0
This reverts commit 8f8184d6bf676a8680d6f441e40317d166b46f73. Based on a lot of email and in-person discussions, this patch series is being reworked to address a number of issues that were pointed out that needed to be taken care of before it should be merged. It will be resubmitted with those changes hopefully soon. Cc: Frank Rowand <[email protected]> Cc: Saravana Kannan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds1-0/+5
Pull networking fixes from David Miller: 1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya Leoshkevich. 2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for SMC from Jason Baron. 3) ipv6_mc_may_pull() should return 0 for malformed packets, not -EINVAL. From Stefano Brivio. 4) Don't forget to unpin umem xdp pages in error path of xdp_umem_reg(). From Ivan Khoronzhuk. 5) Fix sta object leak in mac80211, from Johannes Berg. 6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2 switches. From Florian Fainelli. 7) Revert DMA sync removal from r8169 which was causing regressions on some MIPS Loongson platforms. From Heiner Kallweit. 8) Use after free in flow dissector, from Jakub Sitnicki. 9) Fix NULL derefs of net devices during ICMP processing across collect_md tunnels, from Hangbin Liu. 10) proto_register() memory leaks, from Zhang Lin. 11) Set NLM_F_MULTI flag in multipart netlink messages consistently, from John Fastabend. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) r8152: Set memory to all 0xFFs on failed reg reads openvswitch: Fix conntrack cache with timeout ipv4: mpls: fix mpls_xmit for iptunnel nexthop: Fix nexthop_num_path for blackhole nexthops net: rds: add service level support in rds-info net: route dump netlink NLM_F_MULTI flag missing s390/qeth: reject oversized SNMP requests sock: fix potential memory leak in proto_register() MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode ipv4/icmp: fix rt dst dev null pointer dereference openvswitch: Fix log message in ovs conntrack bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0 bpf: fix use after free in prog symbol exposure bpf: fix precision tracking in presence of bpf2bpf calls flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH Revert "r8169: remove not needed call to dma_sync_single_for_device" ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev net/ncsi: Fix the payload copying for the request coming from Netlink qed: Add cleanup in qed_slowpath_start() ...
2019-08-27staging: greybus: add missing includesRui Miguel Silva11-0/+33
Before moving greybus core out of staging and moving header files to include/linux some greybus header files were missing the necessary includes. This would trigger compilation faillures with some example errors logged bellow for with CONFIG_KERNEL_HEADER_TEST=y. So, add the necessary headers to compile clean before relocating the header files. ./include/linux/greybus/hd.h:23:50: error: unknown type name 'u16' int (*cport_disable)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/greybus_protocols.h:1314:2: error: unknown type name '__u8' __u8 data[0]; ^~~~ ./include/linux/greybus/hd.h:24:52: error: unknown type name 'u16' int (*cport_connected)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/hd.h:25:48: error: unknown type name 'u16' int (*cport_flush)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/hd.h:26:51: error: unknown type name 'u16' int (*cport_shutdown)(struct gb_host_device *hd, u16 cport_id, ^~~ ./include/linux/greybus/hd.h:27:5: error: unknown type name 'u8' u8 phase, unsigned int timeout); ^~ ./include/linux/greybus/hd.h:28:50: error: unknown type name 'u16' int (*cport_quiesce)(struct gb_host_device *hd, u16 cport_id, ^~~ ./include/linux/greybus/hd.h:29:5: error: unknown type name 'size_t' size_t peer_space, unsigned int timeout); ^~~~~~ ./include/linux/greybus/hd.h:29:5: note: 'size_t' is defined in header '<stddef.h>'; did you forget to '#include <stddef.h>'? ./include/linux/greybus/hd.h:1:1: +#include <stddef.h> /* SPDX-License-Identifier: GPL-2.0 */ ./include/linux/greybus/hd.h:29:5: size_t peer_space, unsigned int timeout); ^~~~~~ ./include/linux/greybus/hd.h:30:48: error: unknown type name 'u16' int (*cport_clear)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/hd.h:32:49: error: unknown type name 'u16' int (*message_send)(struct gb_host_device *hd, u16 dest_cport_id, ^~~ ./include/linux/greybus/hd.h:33:32: error: unknown type name 'gfp_t' struct gb_message *message, gfp_t gfp_mask); ^~~~~ ./include/linux/greybus/hd.h:35:55: error: unknown type name 'u16' int (*latency_tag_enable)(struct gb_host_device *hd, u16 cport_id); Reported-by: kbuild test robot <[email protected]> Reported-by: Gao Xiang <[email protected]> Signed-off-by: Rui Miguel Silva <[email protected]> Signed-off-by: Rui Miguel Silva <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27staging: greybus: move core include files to include/linux/greybus/Greg Kroah-Hartman13-0/+3344
With the goal of moving the core of the greybus code out of staging, the include files need to be moved to include/linux/greybus.h and include/linux/greybus/ Cc: Vaibhav Hiremath <[email protected]> Cc: Johan Hovold <[email protected]> Cc: Vaibhav Agarwal <[email protected]> Cc: Rui Miguel Silva <[email protected]> Cc: David Lin <[email protected]> Cc: "Bryan O'Donoghue" <[email protected]> Cc: [email protected] Cc: [email protected] Acked-by: Mark Greer <[email protected]> Acked-by: Viresh Kumar <[email protected]> Acked-by: Alex Elder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27block: split .sysfs_lock into two locksMing Lei1-0/+1
The kernfs built-in lock of 'kn->count' is held in sysfs .show/.store path. Meantime, inside block's .show/.store callback, q->sysfs_lock is required. However, when mq & iosched kobjects are removed via blk_mq_unregister_dev() & elv_unregister_queue(), q->sysfs_lock is held too. This way causes AB-BA lock because the kernfs built-in lock of 'kn-count' is required inside kobject_del() too, see the lockdep warning[1]. On the other hand, it isn't necessary to acquire q->sysfs_lock for both blk_mq_unregister_dev() & elv_unregister_queue() because clearing REGISTERED flag prevents storing to 'queue/scheduler' from being happened. Also sysfs write(store) is exclusive, so no necessary to hold the lock for elv_unregister_queue() when it is called in switching elevator path. So split .sysfs_lock into two: one is still named as .sysfs_lock for covering sync .store, the other one is named as .sysfs_dir_lock for covering kobjects and related status change. sysfs itself can handle the race between add/remove kobjects and showing/storing attributes under kobjects. For switching scheduler via storing to 'queue/scheduler', we use the queue flag of QUEUE_FLAG_REGISTERED with .sysfs_lock for avoiding the race, then we can avoid to hold .sysfs_lock during removing/adding kobjects. [1] lockdep warning ====================================================== WARNING: possible circular locking dependency detected 5.3.0-rc3-00044-g73277fc75ea0 #1380 Not tainted ------------------------------------------------------ rmmod/777 is trying to acquire lock: 00000000ac50e981 (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x59/0x72 but task is already holding lock: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&q->sysfs_lock){+.+.}: __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __mutex_lock+0x14a/0xa9b blk_mq_hw_sysfs_show+0x63/0xb6 sysfs_kf_seq_show+0x11f/0x196 seq_read+0x2cd/0x5f2 vfs_read+0xc7/0x18c ksys_read+0xc4/0x13e do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (kn->count#202){++++}: check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __kernfs_remove+0x237/0x40b kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->sysfs_lock); lock(kn->count#202); lock(&q->sysfs_lock); lock(kn->count#202); *** DEADLOCK *** 2 locks held by rmmod/777: #0: 00000000e69bd9de (&lock){+.+.}, at: null_exit+0x2e/0x95 [null_blk] #1: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b stack backtrace: CPU: 0 PID: 777 Comm: rmmod Not tainted 5.3.0-rc3-00044-g73277fc75ea0 #1380 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180724_192412-buildhw-07.phx4 Call Trace: dump_stack+0x9a/0xe6 check_noncircular+0x207/0x251 ? print_circular_bug+0x32a/0x32a ? find_usage_backwards+0x84/0xb0 check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 ? check_prev_add+0xc45/0xc45 ? mark_lock+0x11b/0x804 ? check_usage_forwards+0x1ca/0x1ca __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 ? kernfs_remove_by_name_ns+0x59/0x72 __kernfs_remove+0x237/0x40b ? kernfs_remove_by_name_ns+0x59/0x72 ? kernfs_next_descendant_post+0x7d/0x7d ? strlen+0x10/0x23 ? strcmp+0x22/0x44 kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa ? disk_events_poll_msecs_store+0x12b/0x12b ? check_flags+0x1ea/0x204 ? mark_held_locks+0x1f/0x7a null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 ? free_module+0x39f/0x39f ? blkcg_maybe_throttle_current+0x8a/0x718 ? rwlock_bug+0x62/0x62 ? __blkcg_punt_bio_submit+0xd0/0xd0 ? trace_hardirqs_on_thunk+0x1a/0x20 ? mark_held_locks+0x1f/0x7a ? do_syscall_64+0x4c/0x295 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fb696cdbe6b Code: 73 01 c3 48 8b 0d 1d 20 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 008 RSP: 002b:00007ffec9588788 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000559e589137c0 RCX: 00007fb696cdbe6b RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559e58913828 RBP: 0000000000000000 R08: 00007ffec9587701 R09: 0000000000000000 R10: 00007fb696d4eae0 R11: 0000000000000206 R12: 00007ffec95889b0 R13: 00007ffec95896b3 R14: 0000559e58913260 R15: 0000559e589137c0 Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Greg KH <[email protected]> Cc: Mike Snitzer <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27block: add helper for checking if queue is registeredMing Lei1-0/+1
There are 4 users which check if queue is registered, so add one helper to check it. Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Greg KH <[email protected]> Cc: Mike Snitzer <[email protected]> Cc: Bart Van Assche <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27block: Remove blk_mq_register_dev()Bart Van Assche1-1/+0
This function has no callers. Hence remove it. Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Hannes Reinecke <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27writeback, memcg: Implement foreign dirty flushingTejun Heo2-0/+40
There's an inherent mismatch between memcg and writeback. The former trackes ownership per-page while the latter per-inode. This was a deliberate design decision because honoring per-page ownership in the writeback path is complicated, may lead to higher CPU and IO overheads and deemed unnecessary given that write-sharing an inode across different cgroups isn't a common use-case. Combined with inode majority-writer ownership switching, this works well enough in most cases but there are some pathological cases. For example, let's say there are two cgroups A and B which keep writing to different but confined parts of the same inode. B owns the inode and A's memory is limited far below B's. A's dirty ratio can rise enough to trigger balance_dirty_pages() sleeps but B's can be low enough to avoid triggering background writeback. A will be slowed down without a way to make writeback of the dirty pages happen. This patch implements foreign dirty recording and foreign mechanism so that when a memcg encounters a condition as above it can trigger flushes on bdi_writebacks which can clean its pages. Please see the comment on top of mem_cgroup_track_foreign_dirty_slowpath() for details. A reproducer follows. write-range.c:: #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/types.h> static const char *usage = "write-range FILE START SIZE\n"; int main(int argc, char **argv) { int fd; unsigned long start, size, end, pos; char *endp; char buf[4096]; if (argc < 4) { fprintf(stderr, usage); return 1; } fd = open(argv[1], O_WRONLY); if (fd < 0) { perror("open"); return 1; } start = strtoul(argv[2], &endp, 0); if (*endp != '\0') { fprintf(stderr, usage); return 1; } size = strtoul(argv[3], &endp, 0); if (*endp != '\0') { fprintf(stderr, usage); return 1; } end = start + size; while (1) { for (pos = start; pos < end; ) { long bread, bwritten = 0; if (lseek(fd, pos, SEEK_SET) < 0) { perror("lseek"); return 1; } bread = read(0, buf, sizeof(buf) < end - pos ? sizeof(buf) : end - pos); if (bread < 0) { perror("read"); return 1; } if (bread == 0) return 0; while (bwritten < bread) { long this; this = write(fd, buf + bwritten, bread - bwritten); if (this < 0) { perror("write"); return 1; } bwritten += this; pos += bwritten; } } } } repro.sh:: #!/bin/bash set -e set -x sysctl -w vm.dirty_expire_centisecs=300000 sysctl -w vm.dirty_writeback_centisecs=300000 sysctl -w vm.dirtytime_expire_seconds=300000 echo 3 > /proc/sys/vm/drop_caches TEST=/sys/fs/cgroup/test A=$TEST/A B=$TEST/B mkdir -p $A $B echo "+memory +io" > $TEST/cgroup.subtree_control echo $((1<<30)) > $A/memory.high echo $((32<<30)) > $B/memory.high rm -f testfile touch testfile fallocate -l 4G testfile echo "Starting B" (echo $BASHPID > $B/cgroup.procs pv -q --rate-limit 70M < /dev/urandom | ./write-range testfile $((2<<30)) $((2<<30))) & echo "Waiting 10s to ensure B claims the testfile inode" sleep 5 sync sleep 5 sync echo "Starting A" (echo $BASHPID > $A/cgroup.procs pv < /dev/urandom | ./write-range testfile 0 $((2<<30))) v2: Added comments explaining why the specific intervals are being used. v3: Use 0 @nr when calling cgroup_writeback_by_id() to use best-effort flushing while avoding possible livelocks. v4: Use get_jiffies_64() and time_before/after64() instead of raw jiffies_64 and arthimetic comparisons as suggested by Jan. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27writeback, memcg: Implement cgroup_writeback_by_id()Tejun Heo1-0/+2
Implement cgroup_writeback_by_id() which initiates cgroup writeback from bdi and memcg IDs. This will be used by memcg foreign inode flushing. v2: Use wb_get_lookup() instead of wb_get_create() to avoid creating spurious wbs. v3: Interpret 0 @nr as 1.25 * nr_dirty to implement best-effort flushing while avoding possible livelocks. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27writeback: Separate out wb_get_lookup() from wb_get_create()Tejun Heo1-0/+2
Separate out wb_get_lookup() which doesn't try to create one if there isn't already one from wb_get_create(). This will be used by later patches. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27bdi: Add bdi->idTejun Heo2-0/+3
There currently is no way to universally identify and lookup a bdi without holding a reference and pointer to it. This patch adds an non-recycling bdi->id and implements bdi_get_by_id() which looks up bdis by their ids. This will be used by memcg foreign inode flushing. I left bdi_list alone for simplicity and because while rb_tree does support rcu assignment it doesn't seem to guarantee lossless walk when walk is racing aginst tree rebalance operations. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27writeback: Generalize and expose wb_completionTejun Heo2-0/+22
wb_completion is used to track writeback completions. We want to use it from memcg side for foreign inode flushes. This patch updates it to remember the target waitq instead of assuming bdi->wb_waitq and expose it outside of fs-writeback.c. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-26firmware: ti_sci: Allow for device shared and exclusive requestsLokesh Vutla1-0/+3
Sysfw provides an option for requesting exclusive access for a device using the flags MSG_FLAG_DEVICE_EXCLUSIVE. If this flag is not used, the device is meant to be shared across hosts. Once a device is requested from a host with this flag set, any request to this device from a different host will be nacked by sysfw. Current tisci driver enables this flag for every device requests. But this may not be true for all the devices. So provide a separate commands in driver for exclusive and shared device requests. Reviewed-by: Nishanth Menon <[email protected]> Signed-off-by: Lokesh Vutla <[email protected]> Signed-off-by: Santosh Shilimkar <[email protected]>
2019-08-26rcu: Don't include <linux/ktime.h> in rcutiny.hChristoph Hellwig1-1/+1
The kbuild reported a built failure due to a header loop when RCUTINY is enabled with my pending riscv-nommu port. Switch rcutiny.h to only include the minimal required header to get HZ instead. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>