aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2019-08-28mtd: spi-nor: Rework the SPI NOR lock/unlock logicBoris Brezillon1-7/+16
Add the SNOR_F_HAS_LOCK flag and set it when SPI_NOR_HAS_LOCK is set in the flash_info entry or when it's a Micron or ST flash. Move the locking hooks in a separate struct so that we have just one field to update when we change the locking implementation. Signed-off-by: Boris Brezillon <[email protected]> [[email protected]: use ->default_init() hook, introduce spi_nor_late_init_params(), set ops in nor->params] Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Create a ->set_4byte() methodBoris Brezillon1-0/+2
The procedure used to enable 4 byte addressing mode depends on the NOR device, so let's provide a hook so that manufacturer specific handling can be implemented in a sane way. Signed-off-by: Boris Brezillon <[email protected]> [[email protected]: use nor->params.set_4byte() instead of nor->set_4byte()] Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Move erase_map to 'struct spi_nor_flash_parameter'Tudor Ambarus1-3/+5
All flash parameters and settings should reside inside 'struct spi_nor_flash_parameter'. Move the SMPT parsed erase map from 'struct spi_nor' to 'struct spi_nor_flash_parameter'. Please note that there is a roll-back mechanism for the flash parameter and settings, for cases when SFDP parser fails. The SFDP parser receives a Stack allocated copy of nor->params, called sfdp_params, and uses it to retrieve the serial flash discoverable parameters. JESD216 SFDP is a standard and has a higher priority than the default initialized flash parameters, so will overwrite the sfdp_params data when needed. All SFDP code uses the local copy of nor->params, that will overwrite it in the end, if the parser succeds. Saving and restoring the nor->params.erase_map is no longer needed, since the SFDP code does not touch it. Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Drop quad_enable() from 'struct spi-nor'Tudor Ambarus1-2/+0
All flash parameters and settings should reside inside 'struct spi_nor_flash_parameter'. Drop the local copy of quad_enable() and use the one from 'struct spi_nor_flash_parameter'. Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Regroup flash parameter and settingsTudor Ambarus1-75/+164
The scope is to move all [FLASH-SPECIFIC] parameters and settings from 'struct spi_nor' to 'struct spi_nor_flash_parameter'. 'struct spi_nor_flash_parameter' describes the hardware capabilities and associated settings of the SPI NOR flash memory. It includes legacy flash parameters and settings that can be overwritten by the spi_nor_fixups hooks, or dynamically when parsing the JESD216 Serial Flash Discoverable Parameters (SFDP) tables. All SFDP params and settings will fit inside 'struct spi_nor_flash_parameter'. Move spi_nor_hwcaps related code to avoid forward declarations. Add a forward declaration that we can't avoid: 'struct spi_nor' will be used in 'struct spi_nor_flash_parameter'. Signed-off-by: Tudor Ambarus <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Reviewed-by: Vignesh Raghavendra <[email protected]>
2019-08-28mtd: spi-nor: Remove unused macroTudor Ambarus1-1/+0
Remove leftover from nor->cmd_buf. Signed-off-by: Tudor Ambarus <[email protected]>
2019-08-28Merge tag 'v5.3-rc6' into spi-nor/nextTudor Ambarus123-494/+531
Linux 5.3-rc6 Merge back latest release candidate, to include a fix that we depend on for new development: 834de5c1aa76 ("mtd: spi-nor: Fix the disabling of write protection at init")
2019-08-28perf: Allow normal events to output AUX dataAlexander Shishkin2-1/+16
In some cases, ordinary (non-AUX) events can generate data for AUX events. For example, PEBS events can come out as records in the Intel PT stream instead of their usual DS records, if configured to do so. One requirement for such events is to consistently schedule together, to ensure that the data from the "AUX output" events isn't lost while their corresponding AUX event is not scheduled. We use grouping to provide this guarantee: an "AUX output" event can be added to a group where an AUX event is a group leader, and provided that the former supports writing to the latter. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-08-28ACPI: cpufreq: Switch to QoS requests instead of cpufreq notifierViresh Kumar1-8/+18
The cpufreq core now takes the min/max frequency constraints via QoS requests and the CPUFREQ_ADJUST notifier shall get removed later on. Switch over to using the QoS request for maximum frequency constraint for acpi driver. Signed-off-by: Viresh Kumar <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-08-28net/mlx5: Set ODP capabilities for DC transport to maxMichael Guralnik1-1/+3
In mlx5_core initialization, query max ODP capabilities for DC transport from FW and set as current capabilities. Signed-off-by: Michael Guralnik <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2019-08-27net: stmmac: setup higher frequency clk support for EHL & TGLVoon Weifeng1-0/+1
EHL DW EQOS is running on a 200MHz clock. Setting up stmmac-clk, ptp clock and ptp_max_adj to 200MHz. Signed-off-by: Voon Weifeng <[email protected]> Signed-off-by: Ong Boon Leong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-27sctp: allow users to set ep ecn flag by sockoptXin Long1-0/+1
SCTP_ECN_SUPPORTED sockopt will be added to allow users to change ep ecn flag, and it's similar with other feature flags. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-27sctp: make ecn flag per netns and endpointXin Long2-1/+5
This patch is to add ecn flag for both netns_sctp and sctp_endpoint, net->sctp.ecn_enable is set 1 by default, and ep->ecn_enable will be initialized with net->sctp.ecn_enable. asoc->peer.ecn_capable will be set during negotiation only when ep->ecn_enable is set on both sides. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-27Add genphy_c45_config_aneg() function to phy-c45.cMarco Hartmann1-0/+1
Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling genphy_config_aneg") introduced a check that aborts phy_config_aneg() if the phy is a C45 phy. This causes phy_state_machine() to call phy_error() so that the phy ends up in PHY_HALTED state. Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg() (analogous to the C22 case) so that the state machine can run correctly. genphy_c45_config_aneg() closely resembles mv3310_config_aneg() in drivers/net/phy/marvell10g.c, excluding vendor specific configurations for 1000BaseT. Fixes: 22b56e827093 ("net: phy: replace genphy_10g_driver with genphy_c45_driver") Signed-off-by: Marco Hartmann <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-27net: dsa: remove bitmap operationsVivien Didelot1-3/+0
The bitmap operations were introduced to simplify the switch drivers in the future, since most of them could implement the common VLAN and MDB operations (add, del, dump) with simple functions taking all target ports at once, and thus limiting the number of hardware accesses. Programming an MDB or VLAN this way in a single operation would clearly simplify the drivers a lot but would require a new get-set interface in DSA. The usage of such bitmap from the stack also raised concerned in the past, leading to the dynamic allocation of a new ds->_bitmap member in the dsa_switch structure. So let's get rid of them for now. This commit nicely wraps the ds->ops->port_{mdb,vlan}_{prepare,add} switch operations into new dsa_switch_{mdb,vlan}_{prepare,add} variants not using any bitmap argument anymore. New dsa_switch_{mdb,vlan}_match helpers have been introduced to make clear which local port of a switch must be programmed with the target object. While the targeted user port is an obvious candidate, the DSA links must also be programmed, as well as the CPU port for VLANs. While at it, also remove local variables that are only used once. Signed-off-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-28bpf: introduce verifier internal test flagAlexei Starovoitov2-0/+4
Introduce BPF_F_TEST_STATE_FREQ flag to stress test parentage chain and state pruning. Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-08-27net_sched: fix a NULL pointer deref in ipt actionCong Wang1-1/+3
The net pointer in struct xt_tgdtor_param is not explicitly initialized therefore is still NULL when dereferencing it. So we have to find a way to pass the correct net pointer to ipt_destroy_target(). The best way I find is just saving the net pointer inside the per netns struct tcf_idrinfo, which could make this patch smaller. Fixes: 0c66dc1ea3f0 ("netfilter: conntrack: register hooks in netns when needed by ruleset") Reported-and-tested-by: [email protected] Cc: Jamal Hadi Salim <[email protected]> Cc: Jiri Pirko <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller16-47/+49
Minor conflict in r8169, bug fix had two versions in net and net-next, take the net-next hunks. Signed-off-by: David S. Miller <[email protected]>
2019-08-27Merge tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds1-1/+0
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Stable fixes: - Fix a page lock leak in nfs_pageio_resend() - Ensure O_DIRECT reports an error if the bytes read/written is 0 - Don't handle errors if the bind/connect succeeded - Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidat ed" Bugfixes: - Don't refresh attributes with mounted-on-file information - Fix return values for nfs4_file_open() and nfs_finish_open() - Fix pnfs layoutstats reporting of I/O errors - Don't use soft RPC calls for pNFS/flexfiles I/O, and don't abort for soft I/O errors when the user specifies a hard mount. - Various fixes to the error handling in sunrpc - Don't report writepage()/writepages() errors twice" * tag 'nfs-for-5.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: remove set but not used variable 'mapping' NFSv2: Fix write regression NFSv2: Fix eof handling NFS: Fix writepage(s) error handling to not report errors twice NFS: Fix spurious EIO read errors pNFS/flexfiles: Don't time out requests on hard mounts SUNRPC: Handle connection breakages correctly in call_status() Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated" SUNRPC: Handle EADDRINUSE and ENOBUFS correctly pNFS/flexfiles: Turn off soft RPC calls SUNRPC: Don't handle errors if the bind/connect succeeded NFS: On fatal writeback errors, we need to call nfs_inode_remove_request() NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0 NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend() NFSv4: Fix return value in nfs_finish_open() NFSv4: Fix return values for nfs4_file_open() NFS: Don't refresh attributes with mounted-on-file information
2019-08-27Revert "driver core: Add support for linking devices during device addition"Greg Kroah-Hartman1-14/+0
This reverts commit 5302dd7dd0b6d04c63cdce51d1e9fda9ef0be886. Based on a lot of email and in-person discussions, this patch series is being reworked to address a number of issues that were pointed out that needed to be taken care of before it should be merged. It will be resubmitted with those changes hopefully soon. Cc: Frank Rowand <[email protected]> Cc: Saravana Kannan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27Revert "driver core: Add edit_links() callback for drivers"Greg Kroah-Hartman1-20/+0
This reverts commit 134b23eec9e3a3c795a6ceb0efe2fa63e87983b2. Based on a lot of email and in-person discussions, this patch series is being reworked to address a number of issues that were pointed out that needed to be taken care of before it should be merged. It will be resubmitted with those changes hopefully soon. Cc: Frank Rowand <[email protected]> Cc: Saravana Kannan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27Revert "driver core: Add sync_state driver/bus callback"Greg Kroah-Hartman1-26/+0
This reverts commit 8f8184d6bf676a8680d6f441e40317d166b46f73. Based on a lot of email and in-person discussions, this patch series is being reworked to address a number of issues that were pointed out that needed to be taken care of before it should be merged. It will be resubmitted with those changes hopefully soon. Cc: Frank Rowand <[email protected]> Cc: Saravana Kannan <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27ASoC: dapm: Expose snd_soc_dapm_new_control_unlocked properlyAmadeusz Sławiński1-0/+3
We use snd_soc_dapm_new_control_unlocked for topology and have local declaration, instead declare it properly in header like already declared snd_soc_dapm_new_control. Signed-off-by: Amadeusz Sławiński <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Pierre-Louis Bossart <[email protected]> Signed-off-by: Mark Brown <[email protected]>
2019-08-27Merge tag 'arc-5.3-rc7' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC updates from Vineet Gupta: - support for Edge Triggered IRQs in ARC IDU intc - other fixes here and there * tag 'arc-5.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: prefer __section from compiler_attributes.h dt-bindings: IDU-intc: Add support for edge-triggered interrupts dt-bindings: IDU-intc: Clean up documentation ARCv2: IDU-intc: Add support for edge-triggered interrupts ARC: unwind: Mark expected switch fall-throughs ARC: [plat-hsdk]: allow to switch between AXI DMAC port configurations ARC: fix typo in setup_dma_ops log message ARCv2: entry: early return from exception need not clear U & DE bits
2019-08-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds7-9/+15
Pull networking fixes from David Miller: 1) Use 32-bit index for tails calls in s390 bpf JIT, from Ilya Leoshkevich. 2) Fix missed EPOLLOUT events in TCP, from Eric Dumazet. Same fix for SMC from Jason Baron. 3) ipv6_mc_may_pull() should return 0 for malformed packets, not -EINVAL. From Stefano Brivio. 4) Don't forget to unpin umem xdp pages in error path of xdp_umem_reg(). From Ivan Khoronzhuk. 5) Fix sta object leak in mac80211, from Johannes Berg. 6) Fix regression by not configuring PHYLINK on CPU port of bcm_sf2 switches. From Florian Fainelli. 7) Revert DMA sync removal from r8169 which was causing regressions on some MIPS Loongson platforms. From Heiner Kallweit. 8) Use after free in flow dissector, from Jakub Sitnicki. 9) Fix NULL derefs of net devices during ICMP processing across collect_md tunnels, from Hangbin Liu. 10) proto_register() memory leaks, from Zhang Lin. 11) Set NLM_F_MULTI flag in multipart netlink messages consistently, from John Fastabend. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) r8152: Set memory to all 0xFFs on failed reg reads openvswitch: Fix conntrack cache with timeout ipv4: mpls: fix mpls_xmit for iptunnel nexthop: Fix nexthop_num_path for blackhole nexthops net: rds: add service level support in rds-info net: route dump netlink NLM_F_MULTI flag missing s390/qeth: reject oversized SNMP requests sock: fix potential memory leak in proto_register() MAINTAINERS: Add phylink keyword to SFF/SFP/SFP+ MODULE SUPPORT xfrm/xfrm_policy: fix dst dev null pointer dereference in collect_md mode ipv4/icmp: fix rt dst dev null pointer dereference openvswitch: Fix log message in ovs conntrack bpf: allow narrow loads of some sk_reuseport_md fields with offset > 0 bpf: fix use after free in prog symbol exposure bpf: fix precision tracking in presence of bpf2bpf calls flow_dissector: Fix potential use-after-free on BPF_PROG_DETACH Revert "r8169: remove not needed call to dma_sync_single_for_device" ipv6: propagate ipv6_add_dev's error returns out of ipv6_find_idev net/ncsi: Fix the payload copying for the request coming from Netlink qed: Add cleanup in qed_slowpath_start() ...
2019-08-27staging: greybus: add missing includesRui Miguel Silva11-0/+33
Before moving greybus core out of staging and moving header files to include/linux some greybus header files were missing the necessary includes. This would trigger compilation faillures with some example errors logged bellow for with CONFIG_KERNEL_HEADER_TEST=y. So, add the necessary headers to compile clean before relocating the header files. ./include/linux/greybus/hd.h:23:50: error: unknown type name 'u16' int (*cport_disable)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/greybus_protocols.h:1314:2: error: unknown type name '__u8' __u8 data[0]; ^~~~ ./include/linux/greybus/hd.h:24:52: error: unknown type name 'u16' int (*cport_connected)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/hd.h:25:48: error: unknown type name 'u16' int (*cport_flush)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/hd.h:26:51: error: unknown type name 'u16' int (*cport_shutdown)(struct gb_host_device *hd, u16 cport_id, ^~~ ./include/linux/greybus/hd.h:27:5: error: unknown type name 'u8' u8 phase, unsigned int timeout); ^~ ./include/linux/greybus/hd.h:28:50: error: unknown type name 'u16' int (*cport_quiesce)(struct gb_host_device *hd, u16 cport_id, ^~~ ./include/linux/greybus/hd.h:29:5: error: unknown type name 'size_t' size_t peer_space, unsigned int timeout); ^~~~~~ ./include/linux/greybus/hd.h:29:5: note: 'size_t' is defined in header '<stddef.h>'; did you forget to '#include <stddef.h>'? ./include/linux/greybus/hd.h:1:1: +#include <stddef.h> /* SPDX-License-Identifier: GPL-2.0 */ ./include/linux/greybus/hd.h:29:5: size_t peer_space, unsigned int timeout); ^~~~~~ ./include/linux/greybus/hd.h:30:48: error: unknown type name 'u16' int (*cport_clear)(struct gb_host_device *hd, u16 cport_id); ^~~ ./include/linux/greybus/hd.h:32:49: error: unknown type name 'u16' int (*message_send)(struct gb_host_device *hd, u16 dest_cport_id, ^~~ ./include/linux/greybus/hd.h:33:32: error: unknown type name 'gfp_t' struct gb_message *message, gfp_t gfp_mask); ^~~~~ ./include/linux/greybus/hd.h:35:55: error: unknown type name 'u16' int (*latency_tag_enable)(struct gb_host_device *hd, u16 cport_id); Reported-by: kbuild test robot <[email protected]> Reported-by: Gao Xiang <[email protected]> Signed-off-by: Rui Miguel Silva <[email protected]> Signed-off-by: Rui Miguel Silva <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27staging: greybus: move core include files to include/linux/greybus/Greg Kroah-Hartman13-0/+3344
With the goal of moving the core of the greybus code out of staging, the include files need to be moved to include/linux/greybus.h and include/linux/greybus/ Cc: Vaibhav Hiremath <[email protected]> Cc: Johan Hovold <[email protected]> Cc: Vaibhav Agarwal <[email protected]> Cc: Rui Miguel Silva <[email protected]> Cc: David Lin <[email protected]> Cc: "Bryan O'Donoghue" <[email protected]> Cc: [email protected] Cc: [email protected] Acked-by: Mark Greer <[email protected]> Acked-by: Viresh Kumar <[email protected]> Acked-by: Alex Elder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27erofs: fix compile warnings when moving out include/trace/events/erofs.hGao Xiang1-0/+3
As Stephon reported [1], many compile warnings are raised when moving out include/trace/events/erofs.h: In file included from include/trace/events/erofs.h:8, from <command-line>: include/trace/events/erofs.h:28:37: warning: 'struct dentry' declared inside parameter list will not be visible outside of this definition or declaration TP_PROTO(struct inode *dir, struct dentry *dentry, unsigned int flags), ^~~~~~ include/linux/tracepoint.h:233:34: note: in definition of macro '__DECLARE_TRACE' static inline void trace_##name(proto) \ ^~~~~ include/linux/tracepoint.h:396:24: note: in expansion of macro 'PARAMS' __DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), \ ^~~~~~ include/linux/tracepoint.h:532:2: note: in expansion of macro 'DECLARE_TRACE' DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~~~~~~~~ include/linux/tracepoint.h:532:22: note: in expansion of macro 'PARAMS' DECLARE_TRACE(name, PARAMS(proto), PARAMS(args)) ^~~~~~ include/trace/events/erofs.h:26:1: note: in expansion of macro 'TRACE_EVENT' TRACE_EVENT(erofs_lookup, ^~~~~~~~~~~ include/trace/events/erofs.h:28:2: note: in expansion of macro 'TP_PROTO' TP_PROTO(struct inode *dir, struct dentry *dentry, unsigned int flags), ^~~~~~~~ That makes me very confused since most original EROFS tracepoint code was taken from f2fs, and finally I found commit 43c78d88036e ("kbuild: compile-test kernel headers to ensure they are self-contained") It seems these warnings are generated from KERNEL_HEADER_TEST feature and ext4/f2fs tracepoint files were in blacklist. Anyway, let's fix these issues for KERNEL_HEADER_TEST feature instead of adding to blacklist... [1] https://lore.kernel.org/lkml/[email protected]/ Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Reviewed-by: Chao Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-08-27block: split .sysfs_lock into two locksMing Lei1-0/+1
The kernfs built-in lock of 'kn->count' is held in sysfs .show/.store path. Meantime, inside block's .show/.store callback, q->sysfs_lock is required. However, when mq & iosched kobjects are removed via blk_mq_unregister_dev() & elv_unregister_queue(), q->sysfs_lock is held too. This way causes AB-BA lock because the kernfs built-in lock of 'kn-count' is required inside kobject_del() too, see the lockdep warning[1]. On the other hand, it isn't necessary to acquire q->sysfs_lock for both blk_mq_unregister_dev() & elv_unregister_queue() because clearing REGISTERED flag prevents storing to 'queue/scheduler' from being happened. Also sysfs write(store) is exclusive, so no necessary to hold the lock for elv_unregister_queue() when it is called in switching elevator path. So split .sysfs_lock into two: one is still named as .sysfs_lock for covering sync .store, the other one is named as .sysfs_dir_lock for covering kobjects and related status change. sysfs itself can handle the race between add/remove kobjects and showing/storing attributes under kobjects. For switching scheduler via storing to 'queue/scheduler', we use the queue flag of QUEUE_FLAG_REGISTERED with .sysfs_lock for avoiding the race, then we can avoid to hold .sysfs_lock during removing/adding kobjects. [1] lockdep warning ====================================================== WARNING: possible circular locking dependency detected 5.3.0-rc3-00044-g73277fc75ea0 #1380 Not tainted ------------------------------------------------------ rmmod/777 is trying to acquire lock: 00000000ac50e981 (kn->count#202){++++}, at: kernfs_remove_by_name_ns+0x59/0x72 but task is already holding lock: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&q->sysfs_lock){+.+.}: __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __mutex_lock+0x14a/0xa9b blk_mq_hw_sysfs_show+0x63/0xb6 sysfs_kf_seq_show+0x11f/0x196 seq_read+0x2cd/0x5f2 vfs_read+0xc7/0x18c ksys_read+0xc4/0x13e do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe -> #0 (kn->count#202){++++}: check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 __kernfs_remove+0x237/0x40b kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->sysfs_lock); lock(kn->count#202); lock(&q->sysfs_lock); lock(kn->count#202); *** DEADLOCK *** 2 locks held by rmmod/777: #0: 00000000e69bd9de (&lock){+.+.}, at: null_exit+0x2e/0x95 [null_blk] #1: 00000000fb16ae21 (&q->sysfs_lock){+.+.}, at: blk_unregister_queue+0x78/0x10b stack backtrace: CPU: 0 PID: 777 Comm: rmmod Not tainted 5.3.0-rc3-00044-g73277fc75ea0 #1380 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180724_192412-buildhw-07.phx4 Call Trace: dump_stack+0x9a/0xe6 check_noncircular+0x207/0x251 ? print_circular_bug+0x32a/0x32a ? find_usage_backwards+0x84/0xb0 check_prev_add+0x5d2/0xc45 validate_chain+0xed3/0xf94 ? check_prev_add+0xc45/0xc45 ? mark_lock+0x11b/0x804 ? check_usage_forwards+0x1ca/0x1ca __lock_acquire+0x95f/0xa2f lock_acquire+0x1b4/0x1e8 ? kernfs_remove_by_name_ns+0x59/0x72 __kernfs_remove+0x237/0x40b ? kernfs_remove_by_name_ns+0x59/0x72 ? kernfs_next_descendant_post+0x7d/0x7d ? strlen+0x10/0x23 ? strcmp+0x22/0x44 kernfs_remove_by_name_ns+0x59/0x72 remove_files+0x61/0x96 sysfs_remove_group+0x81/0xa4 sysfs_remove_groups+0x3b/0x44 kobject_del+0x44/0x94 blk_mq_unregister_dev+0x83/0xdd blk_unregister_queue+0xa0/0x10b del_gendisk+0x259/0x3fa ? disk_events_poll_msecs_store+0x12b/0x12b ? check_flags+0x1ea/0x204 ? mark_held_locks+0x1f/0x7a null_del_dev+0x8b/0x1c3 [null_blk] null_exit+0x5c/0x95 [null_blk] __se_sys_delete_module+0x204/0x337 ? free_module+0x39f/0x39f ? blkcg_maybe_throttle_current+0x8a/0x718 ? rwlock_bug+0x62/0x62 ? __blkcg_punt_bio_submit+0xd0/0xd0 ? trace_hardirqs_on_thunk+0x1a/0x20 ? mark_held_locks+0x1f/0x7a ? do_syscall_64+0x4c/0x295 do_syscall_64+0xa7/0x295 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fb696cdbe6b Code: 73 01 c3 48 8b 0d 1d 20 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 008 RSP: 002b:00007ffec9588788 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 0000559e589137c0 RCX: 00007fb696cdbe6b RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000559e58913828 RBP: 0000000000000000 R08: 00007ffec9587701 R09: 0000000000000000 R10: 00007fb696d4eae0 R11: 0000000000000206 R12: 00007ffec95889b0 R13: 00007ffec95896b3 R14: 0000559e58913260 R15: 0000559e589137c0 Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Greg KH <[email protected]> Cc: Mike Snitzer <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27block: add helper for checking if queue is registeredMing Lei1-0/+1
There are 4 users which check if queue is registered, so add one helper to check it. Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Greg KH <[email protected]> Cc: Mike Snitzer <[email protected]> Cc: Bart Van Assche <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27block: Remove blk_mq_register_dev()Bart Van Assche1-1/+0
This function has no callers. Hence remove it. Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Hannes Reinecke <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27netfilter: nft_dynset: support for element deletionAnder Juaristi2-1/+10
This patch implements the delete operation from the ruleset. It implements a new delete() function in nft_set_rhash. It is simpler to use than the already existing remove(), because it only takes the set and the key as arguments, whereas remove() expects a full nft_set_elem structure. Signed-off-by: Ander Juaristi <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-08-27writeback, memcg: Implement foreign dirty flushingTejun Heo2-0/+40
There's an inherent mismatch between memcg and writeback. The former trackes ownership per-page while the latter per-inode. This was a deliberate design decision because honoring per-page ownership in the writeback path is complicated, may lead to higher CPU and IO overheads and deemed unnecessary given that write-sharing an inode across different cgroups isn't a common use-case. Combined with inode majority-writer ownership switching, this works well enough in most cases but there are some pathological cases. For example, let's say there are two cgroups A and B which keep writing to different but confined parts of the same inode. B owns the inode and A's memory is limited far below B's. A's dirty ratio can rise enough to trigger balance_dirty_pages() sleeps but B's can be low enough to avoid triggering background writeback. A will be slowed down without a way to make writeback of the dirty pages happen. This patch implements foreign dirty recording and foreign mechanism so that when a memcg encounters a condition as above it can trigger flushes on bdi_writebacks which can clean its pages. Please see the comment on top of mem_cgroup_track_foreign_dirty_slowpath() for details. A reproducer follows. write-range.c:: #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <fcntl.h> #include <sys/types.h> static const char *usage = "write-range FILE START SIZE\n"; int main(int argc, char **argv) { int fd; unsigned long start, size, end, pos; char *endp; char buf[4096]; if (argc < 4) { fprintf(stderr, usage); return 1; } fd = open(argv[1], O_WRONLY); if (fd < 0) { perror("open"); return 1; } start = strtoul(argv[2], &endp, 0); if (*endp != '\0') { fprintf(stderr, usage); return 1; } size = strtoul(argv[3], &endp, 0); if (*endp != '\0') { fprintf(stderr, usage); return 1; } end = start + size; while (1) { for (pos = start; pos < end; ) { long bread, bwritten = 0; if (lseek(fd, pos, SEEK_SET) < 0) { perror("lseek"); return 1; } bread = read(0, buf, sizeof(buf) < end - pos ? sizeof(buf) : end - pos); if (bread < 0) { perror("read"); return 1; } if (bread == 0) return 0; while (bwritten < bread) { long this; this = write(fd, buf + bwritten, bread - bwritten); if (this < 0) { perror("write"); return 1; } bwritten += this; pos += bwritten; } } } } repro.sh:: #!/bin/bash set -e set -x sysctl -w vm.dirty_expire_centisecs=300000 sysctl -w vm.dirty_writeback_centisecs=300000 sysctl -w vm.dirtytime_expire_seconds=300000 echo 3 > /proc/sys/vm/drop_caches TEST=/sys/fs/cgroup/test A=$TEST/A B=$TEST/B mkdir -p $A $B echo "+memory +io" > $TEST/cgroup.subtree_control echo $((1<<30)) > $A/memory.high echo $((32<<30)) > $B/memory.high rm -f testfile touch testfile fallocate -l 4G testfile echo "Starting B" (echo $BASHPID > $B/cgroup.procs pv -q --rate-limit 70M < /dev/urandom | ./write-range testfile $((2<<30)) $((2<<30))) & echo "Waiting 10s to ensure B claims the testfile inode" sleep 5 sync sleep 5 sync echo "Starting A" (echo $BASHPID > $A/cgroup.procs pv < /dev/urandom | ./write-range testfile 0 $((2<<30))) v2: Added comments explaining why the specific intervals are being used. v3: Use 0 @nr when calling cgroup_writeback_by_id() to use best-effort flushing while avoding possible livelocks. v4: Use get_jiffies_64() and time_before/after64() instead of raw jiffies_64 and arthimetic comparisons as suggested by Jan. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27writeback, memcg: Implement cgroup_writeback_by_id()Tejun Heo1-0/+2
Implement cgroup_writeback_by_id() which initiates cgroup writeback from bdi and memcg IDs. This will be used by memcg foreign inode flushing. v2: Use wb_get_lookup() instead of wb_get_create() to avoid creating spurious wbs. v3: Interpret 0 @nr as 1.25 * nr_dirty to implement best-effort flushing while avoding possible livelocks. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27writeback: Separate out wb_get_lookup() from wb_get_create()Tejun Heo1-0/+2
Separate out wb_get_lookup() which doesn't try to create one if there isn't already one from wb_get_create(). This will be used by later patches. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27bdi: Add bdi->idTejun Heo2-0/+3
There currently is no way to universally identify and lookup a bdi without holding a reference and pointer to it. This patch adds an non-recycling bdi->id and implements bdi_get_by_id() which looks up bdis by their ids. This will be used by memcg foreign inode flushing. I left bdi_list alone for simplicity and because while rb_tree does support rcu assignment it doesn't seem to guarantee lossless walk when walk is racing aginst tree rebalance operations. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27writeback: Generalize and expose wb_completionTejun Heo2-0/+22
wb_completion is used to track writeback completions. We want to use it from memcg side for foreign inode flushes. This patch updates it to remember the target waitq instead of assuming bdi->wb_waitq and expose it outside of fs-writeback.c. Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-08-27ALSA: hda - Allow runtime PM for controller if component notifier is usedTakashi Iwai1-0/+1
Currently we disallow the runtime PM of the HD-audio controller if it's bound with HDMI/DP on Nvidia / AMD unless it's for dGPU. This is for keeping the link up to get the proper notification for ELD hotplug. As explained in the commit 37a3a98ef601 ("ALSA: hda - Enable runtime PM only for discrete GPU"), this keep-power-up behavior is rather a stop-gap solution until the ELD notification via audio component. And now we finally got the audio component for these graphics drivers via commit ade49db337a9 ("ALSA: hda/hdmi - Allow audio component for AMD/ATI and Nvidia HDMI"), so it's time to change. This patch makes HD-audio controller again runtime-suspendable when the device gets bound with audio component in HDMI codec driver. For making it easier to access from the codec driver, move the flag into the common hda_bus object instead of hda_intel flag. Also rename it to keep_power, to indicate the actual meaning. Signed-off-by: Takashi Iwai <[email protected]>
2019-08-27rxrpc: Use skb_unshare() rather than skb_cow_data()David Howells1-4/+8
The in-place decryption routines in AF_RXRPC's rxkad security module currently call skb_cow_data() to make sure the data isn't shared and that the skb can be written over. This has a problem, however, as the softirq handler may be still holding a ref or the Rx ring may be holding multiple refs when skb_cow_data() is called in rxkad_verify_packet() - and so skb_shared() returns true and __pskb_pull_tail() dislikes that. If this occurs, something like the following report will be generated. kernel BUG at net/core/skbuff.c:1463! ... RIP: 0010:pskb_expand_head+0x253/0x2b0 ... Call Trace: __pskb_pull_tail+0x49/0x460 skb_cow_data+0x6f/0x300 rxkad_verify_packet+0x18b/0xb10 [rxrpc] rxrpc_recvmsg_data.isra.11+0x4a8/0xa10 [rxrpc] rxrpc_kernel_recv_data+0x126/0x240 [rxrpc] afs_extract_data+0x51/0x2d0 [kafs] afs_deliver_fs_fetch_data+0x188/0x400 [kafs] afs_deliver_to_call+0xac/0x430 [kafs] afs_wait_for_call_to_complete+0x22f/0x3d0 [kafs] afs_make_call+0x282/0x3f0 [kafs] afs_fs_fetch_data+0x164/0x300 [kafs] afs_fetch_data+0x54/0x130 [kafs] afs_readpages+0x20d/0x340 [kafs] read_pages+0x66/0x180 __do_page_cache_readahead+0x188/0x1a0 ondemand_readahead+0x17d/0x2e0 generic_file_read_iter+0x740/0xc10 __vfs_read+0x145/0x1a0 vfs_read+0x8c/0x140 ksys_read+0x4a/0xb0 do_syscall_64+0x43/0xf0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by using skb_unshare() instead in the input path for DATA packets that have a security index != 0. Non-DATA packets don't need in-place encryption and neither do unencrypted DATA packets. Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Reported-by: Julian Wollrath <[email protected]> Signed-off-by: David Howells <[email protected]>
2019-08-27rxrpc: Use the tx-phase skb flag to simplify tracingDavid Howells1-29/+22
Use the previously-added transmit-phase skbuff private flag to simplify the socket buffer tracing a bit. Which phase the skbuff comes from can now be divined from the skb rather than having to be guessed from the call state. We can also reduce the number of rxrpc_skb_trace values by eliminating the difference between Tx and Rx in the symbols. Signed-off-by: David Howells <[email protected]>
2019-08-27Merge tag 'drm-next-5.4-2019-08-23' of ↵Dave Airlie1-0/+1
git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.4-2019-08-23: amdgpu: - Enable power features on Navi12 - Enable power features on Arcturus - RAS updates - Initial Renoir APU support - Enable power featyres on Renoir - DC gamma fixes - DCN2 fixes - GPU reset support for Picasso - Misc cleanups and fixes scheduler: - Possible race fix Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-08-26dt-bindings: ti_sci_pm_domains: Add support for exclusive and shared accessLokesh Vutla1-0/+9
TISCI protocol supports for enabling the device either with exclusive permissions for the requesting host or with sharing across the hosts. There are certain devices which are exclusive to Linux context and there are certain devices that are shared across different host contexts. So add support for getting this information from DT by increasing the power-domain cells to 2. Acked-by: Tero Kristo <[email protected]> Acked-by: Rob Herring <[email protected]> Reviewed-by: Nishanth Menon <[email protected]> Signed-off-by: Lokesh Vutla <[email protected]> Signed-off-by: Santosh Shilimkar <[email protected]>
2019-08-26firmware: ti_sci: Allow for device shared and exclusive requestsLokesh Vutla1-0/+3
Sysfw provides an option for requesting exclusive access for a device using the flags MSG_FLAG_DEVICE_EXCLUSIVE. If this flag is not used, the device is meant to be shared across hosts. Once a device is requested from a host with this flag set, any request to this device from a different host will be nacked by sysfw. Current tisci driver enables this flag for every device requests. But this may not be true for all the devices. So provide a separate commands in driver for exclusive and shared device requests. Reviewed-by: Nishanth Menon <[email protected]> Signed-off-by: Lokesh Vutla <[email protected]> Signed-off-by: Santosh Shilimkar <[email protected]>
2019-08-26rcu: Don't include <linux/ktime.h> in rcutiny.hChristoph Hellwig1-1/+1
The kbuild reported a built failure due to a header loop when RCUTINY is enabled with my pending riscv-nommu port. Switch rcutiny.h to only include the minimal required header to get HZ instead. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-26net: sched: copy tunnel info when setting flow_action entry->tunnelVlad Buslov1-0/+17
In order to remove dependency on rtnl lock, modify tc_setup_flow_action() to copy tunnel info, instead of just saving pointer to tunnel_key action tunnel info. This is necessary to prevent concurrent action overwrite from releasing tunnel info while it is being used by rtnl-unlocked driver. Implement helper tcf_tunnel_info_copy() that is used to copy tunnel info with all its options to dynamically allocated memory block. Modify tc_cleanup_flow_action() to free dynamically allocated tunnel info. Signed-off-by: Vlad Buslov <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-26net: sched: take reference to action dev before calling offloadsVlad Buslov1-0/+2
In order to remove dependency on rtnl lock when calling hardware offload API, take reference to action mirred dev when initializing flow_action structure in tc_setup_flow_action(). Implement function tc_cleanup_flow_action(), use it to release the device after hardware offload API is done using it. Signed-off-by: Vlad Buslov <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-26net: sched: take rtnl lock in tc_setup_flow_action()Vlad Buslov1-1/+1
In order to allow using new flow_action infrastructure from unlocked classifiers, modify tc_setup_flow_action() to accept new 'rtnl_held' argument. Take rtnl lock before accessing tc_action data. This is necessary to protect from concurrent action replace. Signed-off-by: Vlad Buslov <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-26net: sched: add API for registering unlocked offload block callbacksVlad Buslov2-0/+2
Extend struct flow_block_offload with "unlocked_driver_cb" flag to allow registering and unregistering block hardware offload callbacks that do not require caller to hold rtnl lock. Extend tcf_block with additional lockeddevcnt counter that is incremented for each non-unlocked driver callback attached to device. This counter is necessary to conditionally obtain rtnl lock before calling hardware callbacks in following patches. Register mlx5 tc block offload callbacks as "unlocked". Signed-off-by: Vlad Buslov <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-26net: sched: notify classifier on successful offload add/deleteVlad Buslov1-0/+4
To remove dependency on rtnl lock, extend classifier ops with new ops->hw_add() and ops->hw_del() callbacks. Call them from cls API while holding cb_lock every time filter if successfully added to or deleted from hardware. Implement the new API in flower classifier. Use it to manage hw_filters list under cb_lock protection, instead of relying on rtnl lock to synchronize with concurrent fl_reoffload() call. Signed-off-by: Vlad Buslov <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-26net: sched: refactor block offloads counter usageVlad Buslov2-32/+16
Without rtnl lock protection filters can no longer safely manage block offloads counter themselves. Refactor cls API to protect block offloadcnt with tcf_block->cb_lock that is already used to protect driver callback list and nooffloaddevcnt counter. The counter can be modified by concurrent tasks by new functions that execute block callbacks (which is safe with previous patch that changed its type to atomic_t), however, block bind/unbind code that checks the counter value takes cb_lock in write mode to exclude any concurrent modifications. This approach prevents race conditions between bind/unbind and callback execution code but allows for concurrency for tc rule update path. Move block offload counter, filter in hardware counter and filter flags management from classifiers into cls hardware offloads API. Make functions tcf_block_offload_{inc|dec}() and tc_cls_offload_cnt_update() to be cls API private. Implement following new cls API to be used instead: tc_setup_cb_add() - non-destructive filter add. If filter that wasn't already in hardware is successfully offloaded, increment block offloads counter, set filter in hardware counter and flag. On failure, previously offloaded filter is considered to be intact and offloads counter is not decremented. tc_setup_cb_replace() - destructive filter replace. Release existing filter block offload counter and reset its in hardware counter and flag. Set new filter in hardware counter and flag. On failure, previously offloaded filter is considered to be destroyed and offload counter is decremented. tc_setup_cb_destroy() - filter destroy. Unconditionally decrement block offloads counter. tc_setup_cb_reoffload() - reoffload filter to single cb. Execute cb() and call tc_cls_offload_cnt_update() if cb() didn't return an error. Refactor all offload-capable classifiers to atomically offload filters to hardware, change block offload counter, and set filter in hardware counter and flag by means of the new cls API functions. Signed-off-by: Vlad Buslov <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>