aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-11-25Merge tag 'fsverity-for-linus' of ↵Linus Torvalds5-6/+21
git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt Pull fsverity updates from Eric Biggers: "Expose the fs-verity bit through statx()" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: docs: fs-verity: mention statx() support f2fs: support STATX_ATTR_VERITY ext4: support STATX_ATTR_VERITY statx: define STATX_ATTR_VERITY docs: fs-verity: document first supported kernel version
2019-11-25Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds14-328/+208
Pull fscrypt updates from Eric Biggers: - Add the IV_INO_LBLK_64 encryption policy flag which modifies the encryption to be optimized for UFS inline encryption hardware. - For AES-128-CBC, use the crypto API's implementation of ESSIV (which was added in 5.4) rather than doing ESSIV manually. - A few other cleanups. * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: f2fs: add support for IV_INO_LBLK_64 encryption policies ext4: add support for IV_INO_LBLK_64 encryption policies fscrypt: add support for IV_INO_LBLK_64 policies fscrypt: avoid data race on fscrypt_mode::logged_impl_name docs: ioctl-number: document fscrypt ioctl numbers fscrypt: zeroize fscrypt_info before freeing fscrypt: remove struct fscrypt_ctx fscrypt: invoke crypto API for ESSIV handling
2019-11-25Merge tag 'affs-for-5.5-tag' of ↵Linus Torvalds2-16/+10
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull AFFS updates from David Sterba: "A minor bugfix and cleanup for AFFS" * tag 'affs-for-5.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: affs: fix a memory leak in affs_remount affs: Replace binary semaphores with mutexes
2019-11-25Merge tag 'for-5.5-tag' of ↵Linus Torvalds66-2619/+2780
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "User visible changes: - new block group profiles: RAID1 with 3- and 4- copies - RAID1 in btrfs has always 2 copies, now add support for 3 and 4 - this is an incompat feature (named RAID1C34) - recommended use of RAID1C3 is replacement of RAID6 profile on metadata, this brings a more reliable resiliency against 2 device loss/damage - support for new checksums - per-filesystem, set at mkfs time - fast hash (crc32c successor): xxhash, 64bit digest - strong hashes (both 256bit): sha256 (slower, FIPS), blake2b (faster) - the blake2b module goes via the crypto tree, btrfs.ko has a soft dependency - speed up lseek, don't take inode locks unnecessarily, this can speed up parallel SEEK_CUR/SEEK_SET/SEEK_END by 80% - send: - allow clone operations within the same file - limit maximum number of sent clone references to avoid slow backref walking - error message improvements: device scan prints process name and PID Core changes: - cleanups - remove unique workqueue helpers, used to provide a way to avoid deadlocks in the workqueue code, now done in a simpler way - remove lots of indirect function calls in compression code - extent IO tree code moved out of extent_io.c - cleanup backup superblock handling at mount time - transaction life cycle documentation and cleanups - locking code cleanups, annotations and documentation - add more cold, const, pure function attributes - removal of unused or redundant struct members or variables - new tree-checker sanity tests - try to detect missing INODE_ITEM, cross-reference checks of DIR_ITEM, DIR_INDEX, INODE_REF, and XATTR_* items - remove own bio scheduling code (used to avoid checksum submissions being stuck behind other IO), replaced by cgroup controller-based code to allow better control and avoid priority inversions in cases where the custom and cgroup scheduling disagreed Fixes: - avoid getting stuck during cyclic writebacks - fix trimming of ranges crossing block group boundaries - fix rename exchange on subvolumes, all involved subvolumes need to be recorded in the transaction" * tag 'for-5.5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits) btrfs: drop bdev argument from submit_extent_page btrfs: remove extent_map::bdev btrfs: drop bio_set_dev where not needed btrfs: get bdev directly from fs_devices in submit_extent_page btrfs: record all roots for rename exchange on a subvol Btrfs: fix block group remaining RO forever after error during device replace btrfs: scrub: Don't check free space before marking a block group RO btrfs: change btrfs_fs_devices::rotating to bool btrfs: change btrfs_fs_devices::seeding to bool btrfs: rename btrfs_block_group_cache btrfs: block-group: Reuse the item key from caller of read_one_block_group() btrfs: block-group: Refactor btrfs_read_block_groups() btrfs: document extent buffer locking btrfs: access eb::blocking_writers according to ACCESS_ONCE policies btrfs: set blocking_writers directly, no increment or decrement btrfs: merge blocking_writers branches in btrfs_tree_read_lock btrfs: drop incompat bit for raid1c34 after last block group is gone btrfs: add incompat for raid1 with 3, 4 copies btrfs: add support for 4-copy replication (raid1c4) btrfs: add support for 3-copy replication (raid1c3) ...
2019-11-25Merge tag 'mtd/for-5.5' of ↵Linus Torvalds48-1023/+4525
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD updates from Miquel Raynal: "MTD core: - drop inactive maintainers, update the repositories and add IRC channel - debugfs functions improvements - initialize more structure parameters - misc fixes reported by robots MTD devices: - spear_smi: Fixed Write Burst mode - new Intel IXP4xx flash probing hook Raw NAND core: - useless extra checks dropped - update the detection of the bad block markers position Raw NAND controller drivers: - Cadence: new driver - Brcmnand: support for flash-dma v0 + fixes - Denali: drop support for the legacy controller/chip DT representation - superfluous dev_err() calls removed SPI NOR core changes: - introduce 'struct spi_nor_controller_ops' - clean the Register Operations methods - use dev_dbg insted of dev_err for low level info - fix retlen handling in sst_write() - fix silent truncations in spi_nor_read and spi_nor_read_raw() - fix the clearing of QE bit on lock()/unlock() - rework the disabling of the block write protection - rework the Quad Enable methods - make sure nor->spimem and nor->controller_ops are mutually exclusive - set default Quad Enable method for ISSI flashes - add support for few flashes SPI NOR controller drivers changes: - intel-spi: - support chips without software sequencer - add support for Intel Cannon Lake and Intel Comet Lake-H flashes CFI core changes: - code cleanups related useless initializers and coding style issues - fix for a possible double free problem in cfi_cmdset_0002 - improved HyperFlash error reporting and handling in cfi_cmdset_0002 core" * tag 'mtd/for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (73 commits) mtd: devices: fix mchp23k256 read and write mtd: no need to check return value of debugfs_create functions mtd: spi-nor: Set default Quad Enable method for ISSI flashes mtd: spi-nor: Add support for is25wp256 mtd: spi-nor: Add support for w25q256jw mtd: spi-nor: Move condition to avoid a NULL check mtd: spi-nor: Make sure nor->spimem and nor->controller_ops are mutually exclusive mtd: spi-nor: Rename Quad Enable methods mtd: spi-nor: Merge spansion Quad Enable methods mtd: spi-nor: Rename CR_QUAD_EN_SPAN to SR2_QUAD_EN_BIT1 mtd: spi-nor: Extend the SR Read Back test mtd: spi-nor: Rework the disabling of block write protection mtd: spi-nor: Fix clearing of QE bit on lock()/unlock() mtd: cfi_cmdset_0002: fix delayed error detection on HyperFlash mtd: cfi_cmdset_0002: only check errors when ready in cfi_check_err_status() mtd: cfi_cmdset_0002: don't free cfi->cfiq in error path of cfi_amdstd_setup() mtd: cfi_cmdset_*: kill useless 'ret' variable initializers mtd: cfi_util: use DIV_ROUND_UP() in cfi_udelay() mtd: spi-nor: Print debug message when the read back test fails mtd: spi-nor: Check all the bits written, not just the BP ones ...
2019-11-25Merge tag 'for-5.5/dm-changes' of ↵Linus Torvalds22-412/+433
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - Fix DM core to disallow stacking request-based DM on partitions. - Fix DM raid target to properly resync raidset even if bitmap needed additional pages. - Fix DM crypt performance regression due to use of WQ_HIGHPRI for the IO and crypt workqueues. - Fix DM integrity metadata layout that was aligned on 128K boundary rather than the intended 4K boundary (removes 124K of wasted space for each metadata block). - Improve the DM thin, cache and clone targets to use spin_lock_irq rather than spin_lock_irqsave where possible. - Fix DM thin single thread performance that was lost due to needless workqueue wakeups. - Fix DM zoned target performance that was lost due to excessive backing device checks. - Add ability to trigger write failure with the DM dust test target. - Fix whitespace indentation in drivers/md/Kconfig. - Various smalls fixes and cleanups (e.g. use struct_size, fix uninitialized variable, variable renames, etc). * tag 'for-5.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (22 commits) Revert "dm crypt: use WQ_HIGHPRI for the IO and crypt workqueues" dm: Fix Kconfig indentation dm thin: wakeup worker only when deferred bios exist dm integrity: fix excessive alignment of metadata runs dm raid: Remove unnecessary negation of a shift in raid10_format_to_md_layout dm zoned: reduce overhead of backing device checks dm dust: add limited write failure mode dm dust: change ret to r in dust_map_read and dust_map dm dust: change result vars to r dm cache: replace spin_lock_irqsave with spin_lock_irq dm bio prison: replace spin_lock_irqsave with spin_lock_irq dm thin: replace spin_lock_irqsave with spin_lock_irq dm clone: add bucket_lock_irq/bucket_unlock_irq helpers dm clone: replace spin_lock_irqsave with spin_lock_irq dm writecache: handle REQ_FUA dm writecache: fix uninitialized variable warning dm stripe: use struct_size() in kmalloc() dm raid: streamline rs_get_progress() and its raid_status() caller side dm raid: simplify rs_setup_recovery call chain dm raid: to ensure resynchronization, perform raid set grow in preresume ...
2019-11-25Merge tag 'for-5.5/disk-revalidate-20191122' of git://git.kernel.dk/linux-blockLinus Torvalds7-156/+135
Pull disk revalidation updates from Jens Axboe: "This continues the work that Jan Kara started to thoroughly cleanup and consolidate how we handle rescans and revalidations" * tag 'for-5.5/disk-revalidate-20191122' of git://git.kernel.dk/linux-block: block: move clearing bd_invalidated into check_disk_size_change block: remove (__)blkdev_reread_part as an exported API block: fix bdev_disk_changed for non-partitioned devices block: move rescan_partitions to fs/block_dev.c block: merge invalidate_partitions into rescan_partitions block: refactor rescan_partitions
2019-11-25Merge tag 'for-5.5/zoned-20191122' of git://git.kernel.dk/linux-blockLinus Torvalds15-702/+427
Pull zoned block device update from Jens Axboe: "Enhancements and improvements to the zoned device support" * tag 'for-5.5/zoned-20191122' of git://git.kernel.dk/linux-block: scsi: sd_zbc: Remove set but not used variable 'buflen' block: rework zone reporting scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer() null_blk: Add zone_nr_conv to features null_blk: clean up report zones null_blk: clean up the block device operations block: Remove partition support for zoned block devices block: Simplify report zones execution block: cleanup the !zoned case in blk_revalidate_disk_zones block: Enhance blk_revalidate_disk_zones()
2019-11-25Merge tag 'for-5.5/drivers-post-20191122' of git://git.kernel.dk/linux-blockLinus Torvalds12-20/+341
Pull additional block driver updates from Jens Axboe: "Here's another block driver update, done to avoid conflicts with the zoned changes coming next. This contains: - Prepare SCSI sd for zone open/close/finish support - Small NVMe pull request - hwmon support (Akinobu) - add new co-maintainer (Christoph) - work-around for a discard issue on non-conformant drives (Eduard) - Small nbd leak fix" * tag 'for-5.5/drivers-post-20191122' of git://git.kernel.dk/linux-block: nbd: prevent memory leak nvme: hwmon: add quirk to avoid changing temperature threshold nvme: hwmon: provide temperature min and max values for each sensor nvmet: add another maintainer nvme: Discard workaround for non-conformant devices nvme: Add hardware monitoring support scsi: sd_zbc: add zone open, close, and finish support
2019-11-25Merge tag 'for-5.5/drivers-20191121' of git://git.kernel.dk/linux-blockLinus Torvalds48-400/+783
Pull block driver updates from Jens Axboe: "Here are the main block driver updates for 5.5. Nothing major in here, mostly just fixes. This contains: - a set of bcache changes via Coly - MD changes from Song - loop unmap write-zeroes fix (Darrick) - spelling fixes (Geert) - zoned additions cleanups to null_blk/dm (Ajay) - allow null_blk online submit queue changes (Bart) - NVMe changes via Keith, nothing major here either" * tag 'for-5.5/drivers-20191121' of git://git.kernel.dk/linux-block: (56 commits) Revert "bcache: fix fifo index swapping condition in journal_pin_cmp()" drivers/md/raid5-ppl.c: use the new spelling of RWH_WRITE_LIFE_NOT_SET drivers/md/raid5.c: use the new spelling of RWH_WRITE_LIFE_NOT_SET bcache: don't export symbols bcache: remove the extra cflags for request.o bcache: at least try to shrink 1 node in bch_mca_scan() bcache: add idle_max_writeback_rate sysfs interface bcache: add code comments in bch_btree_leaf_dirty() bcache: fix deadlock in bcache_allocator bcache: add code comment bch_keylist_pop() and bch_keylist_pop_front() bcache: deleted code comments for dead code in bch_data_insert_keys() bcache: add more accurate error messages in read_super() bcache: fix static checker warning in bcache_device_free() bcache: fix a lost wake-up problem caused by mca_cannibalize_lock bcache: fix fifo index swapping condition in journal_pin_cmp() md/raid10: prevent access of uninitialized resync_pages offset md: avoid invalid memory access for array sb->dev_roles md/raid1: avoid soft lockup under high load null_blk: add zone open, close, and finish support dm: add zone open, close and finish support ...
2019-11-25slip: Fix use-after-free Read in slip_openJouni Hogander1-0/+1
Slip_open doesn't clean-up device which registration failed from the slip_devs device list. On next open after failure this list is iterated and freed device is accessed. Fix this by calling sl_free_netdev in error path. Here is the trace from the Syzbot: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:634 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:132 sl_sync drivers/net/slip/slip.c:725 [inline] slip_open+0xecd/0x11b7 drivers/net/slip/slip.c:801 tty_ldisc_open.isra.0+0xa3/0x110 drivers/tty/tty_ldisc.c:469 tty_set_ldisc+0x30e/0x6b0 drivers/tty/tty_ldisc.c:596 tiocsetd drivers/tty/tty_io.c:2334 [inline] tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2594 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 3b5a39979daf ("slip: Fix memory leak in slip_open error path") Reported-by: syzbot+4d5170758f3762109542@syzkaller.appspotmail.com Cc: David Miller <davem@davemloft.net> Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Jouni Hogander <jouni.hogander@unikie.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25Merge tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-blockLinus Torvalds51-800/+1398
Pull core block updates from Jens Axboe: "Due to more granular branches, this one is small and will be followed with other core branches that add specific features. I meant to just have a core and drivers branch, but external dependencies we ended up adding a few more that are also core. The changes are: - Fixes and improvements for the zoned device support (Ajay, Damien) - sed-opal table writing and datastore UID (Revanth) - blk-cgroup (and bfq) blk-cgroup stat fixes (Tejun) - Improvements to the block stats tracking (Pavel) - Fix for overruning sysfs buffer for large number of CPUs (Ming) - Optimization for small IO (Ming, Christoph) - Fix typo in RWH lifetime hint (Eugene) - Dead code removal and documentation (Bart) - Reduction in memory usage for queue and tag set (Bart) - Kerneldoc header documentation (André) - Device/partition revalidation fixes (Jan) - Stats tracking for flush requests (Konstantin) - Various other little fixes here and there (et al)" * tag 'for-5.5/block-20191121' of git://git.kernel.dk/linux-block: (48 commits) Revert "block: split bio if the only bvec's length is > SZ_4K" block: add iostat counters for flush requests block,bfq: Skip tracing hooks if possible block: sed-opal: Introduce SUM_SET_LIST parameter and append it using 'add_token_u64' blk-cgroup: cgroup_rstat_updated() shouldn't be called on cgroup1 block: Don't disable interrupts in trigger_softirq() sbitmap: Delete sbitmap_any_bit_clear() blk-mq: Delete blk_mq_has_free_tags() and blk_mq_can_queue() block: split bio if the only bvec's length is > SZ_4K block: still try to split bio if the bvec crosses pages blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT blk-cgroup: reimplement basic IO stats using cgroup rstat blk-cgroup: remove now unused blkg_print_stat_{bytes|ios}_recursive() blk-throtl: stop using blkg->stat_bytes and ->stat_ios bfq-iosched: stop using blkg->stat_bytes and ->stat_ios bfq-iosched: relocate bfqg_*rwstat*() helpers block: add zone open, close and finish ioctl support block: add zone open, close and finish operations block: Simplify REQ_OP_ZONE_RESET_ALL handling block: Remove REQ_OP_ZONE_RESET plugging ...
2019-11-25Merge tag 'for-5.5/libata-20191121' of git://git.kernel.dk/linux-blockLinus Torvalds23-83/+138
Pull libata updates from Jens Axboe: "Just a few fixes all over the place, support for the Annapurna SATA controller, and a patchset that cleans up the error defines and ultimately fixes anissue with sata_mv" * tag 'for-5.5/libata-20191121' of git://git.kernel.dk/linux-block: ata: pata_artop: make arrays static const, makes object smaller ata_piix: remove open-coded dmi_match(DMI_OEM_STRING) ata: sata_mv, avoid trigerrable BUG_ON ata: make qc_prep return ata_completion_errors ata: define AC_ERR_OK ata: Documentation, fix function names libata: Ensure ata_port probe has completed before detach ahci: tegra: use regulator_bulk_set_supply_names() ahci: Add support for Amazon's Annapurna Labs SATA controller
2019-11-25net: dsa: sja1105: fix sja1105_parse_rgmii_delays()Oleksij Rempel1-5/+5
This function was using configuration of port 0 in devicetree for all ports. In case CPU port was not 0, the delay settings was ignored. This resulted not working communication between CPU and the switch. Fixes: f5b8631c293b ("net: dsa: sja1105: Error out if RGMII delays are requested in DT") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25macvlan: schedule bc_work even if errorMenglong Dong1-1/+2
While enqueueing a broadcast skb to port->bc_queue, schedule_work() is called to add port->bc_work, which processes the skbs in bc_queue, to "events" work queue. If port->bc_queue is full, the skb will be discarded and schedule_work(&port->bc_work) won't be called. However, if port->bc_queue is full and port->bc_work is not running or pending, port->bc_queue will keep full and schedule_work() won't be called any more, and all broadcast skbs to macvlan will be discarded. This case can happen: macvlan_process_broadcast() is the pending function of port->bc_work, it moves all the skbs in port->bc_queue to the queue "list", and processes the skbs in "list". During this, new skbs will keep being added to port->bc_queue in macvlan_broadcast_enqueue(), and port->bc_queue may already full when macvlan_process_broadcast() return. This may happen, especially when there are a lot of real-time threads and the process is preempted. Fix this by calling schedule_work(&port->bc_work) even if port->bc_work is full in macvlan_broadcast_enqueue(). Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25enetc: add support Credit Based Shaper(CBS) for hardware offloadPo Liu5-2/+138
The ENETC hardware support the Credit Based Shaper(CBS) which part of the IEEE-802.1Qav. The CBS driver was loaded by the sch_cbs interface when set in the QOS in the kernel. Here is an example command to set 20Mbits bandwidth in 1Gbits port for taffic class 7: tc qdisc add dev eth0 root handle 1: mqprio \ num_tc 8 map 0 1 2 3 4 5 6 7 hw 1 tc qdisc replace dev eth0 parent 1:8 cbs \ locredit -1470 hicredit 30 \ sendslope -980000 idleslope 20000 offload 1 Signed-off-by: Po Liu <Po.Liu@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25net: phy: add helpers phy_(un)lock_mdio_busHeiner Kallweit2-14/+24
Add helpers to make locking/unlocking the MDIO bus easier. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25mdio_bus: don't use managed reset-controllerDavid Bauer1-2/+4
Geert Uytterhoeven reported that using devm_reset_controller_get leads to a WARNING when probing a reset-controlled PHY. This is because the device devm_reset_controller_get gets supplied is not actually the one being probed. Acquire an unmanaged reset-control as well as free the reset_control on unregister to fix this. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> CC: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David Bauer <mail@david-bauer.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-25Merge tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-blockLinus Torvalds14-725/+3089
Pull io_uring updates from Jens Axboe: "A lot of stuff has been going on this cycle, with improving the support for networked IO (and hence unbounded request completion times) being one of the major themes. There's been a set of fixes done this week, I'll send those out as well once we're certain we're fully happy with them. This contains: - Unification of the "normal" submit path and the SQPOLL path (Pavel) - Support for sparse (and bigger) file sets, and updating of those file sets without needing to unregister/register again. - Independently sized CQ ring, instead of just making it always 2x the SQ ring size. This makes it more flexible for networked applications. - Support for overflowed CQ ring, never dropping events but providing backpressure on submits. - Add support for absolute timeouts, not just relative ones. - Support for generic cancellations. This divorces io_uring from workqueues as well, which additionally gets us one step closer to generic async system call support. - With cancellations, we can support grabbing the process file table as well, just like we do mm context. This allows support for system calls that create file descriptors, like accept4() support that's built on top of that. - Support for io_uring tracing (Dmitrii) - Support for linked timeouts. These abort an operation if it isn't completed by the time noted in the linke timeout. - Speedup tracking of poll requests - Various cleanups making the coder easier to follow (Jackie, Pavel, Bob, YueHaibing, me) - Update MAINTAINERS with new io_uring list" * tag 'for-5.5/io_uring-20191121' of git://git.kernel.dk/linux-block: (64 commits) io_uring: make POLL_ADD/POLL_REMOVE scale better io-wq: remove now redundant struct io_wq_nulls_list io_uring: Fix getting file for non-fd opcodes io_uring: introduce req_need_defer() io_uring: clean up io_uring_cancel_files() io-wq: ensure free/busy list browsing see all items io-wq: ensure we have a stable view of ->cur_work for cancellations io_wq: add get/put_work handlers to io_wq_create() io_uring: check for validity of ->rings in teardown io_uring: fix potential deadlock in io_poll_wake() io_uring: use correct "is IO worker" helper io_uring: fix -ENOENT issue with linked timer with short timeout io_uring: don't do flush cancel under inflight_lock io_uring: flag SQPOLL busy condition to userspace io_uring: make ASYNC_CANCEL work with poll and timeout io_uring: provide fallback request for OOM situations io_uring: convert accept4() -ERESTARTSYS into -EINTR io_uring: fix error clear of ->file_table in io_sqe_files_register() io_uring: separate the io_free_req and io_free_req_find_next interface io_uring: keep io_put_req only responsible for release and put req ...
2019-11-25Merge tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmddLinus Torvalds22-887/+1371
Pull tpmd updates from Jarkko Sakkinen: - support for Cr50 fTPM - support for fTPM on AMD Zen+ CPUs - TPM 2.0 trusted keys code relocated from drivers/char/tpm to security/keys * tag 'tpmdd-next-20191112' of git://git.infradead.org/users/jjs/linux-tpmdd: KEYS: trusted: Remove set but not used variable 'keyhndl' tpm: Switch to platform_get_irq_optional() tpm_crb: fix fTPM on AMD Zen+ CPUs KEYS: trusted: Move TPM2 trusted keys code KEYS: trusted: Create trusted keys subsystem KEYS: Use common tpm_buf for trusted and asymmetric keys tpm: Move tpm_buf code to include/linux/ tpm: use GFP_KERNEL instead of GFP_HIGHMEM for tpm_buf tpm: add check after commands attribs tab allocation tpm: tpm_tis_spi: Drop THIS_MODULE usage from driver struct tpm: tpm_tis_spi: Cleanup includes tpm: tpm_tis_spi: Support cr50 devices tpm: tpm_tis_spi: Introduce a flow control callback tpm: Add a flag to indicate TPM power is managed by firmware dt-bindings: tpm: document properties for cr50 tpm_tis: override durations for STM tpm with firmware 1.2.8.28 tpm: provide a way to override the chip returned durations tpm: Remove duplicate code from caps_show() in tpm-sysfs.c
2019-11-25vfs: properly and reliably lock f_pos in fdget_pos()Linus Torvalds3-8/+2
fdget_pos() is used by file operations that will read and update f_pos: things like "read()", "write()" and "lseek()" (but not, for example, "pread()/pwrite" that get their file positions elsewhere). However, it had two separate escape clauses for this, because not everybody wants or needs serialization of the file position. The first and most obvious case is the "file descriptor doesn't have a position at all", ie a stream-like file. Except we didn't actually use FMODE_STREAM, but instead used FMODE_ATOMIC_POS. The reason for that was that FMODE_STREAM didn't exist back in the days, but also that we didn't want to mark all the special cases, so we only marked the ones that _required_ position atomicity according to POSIX - regular files and directories. The case one was intentionally lazy, but now that we _do_ have FMODE_STREAM we could and should just use it. With the change to use FMODE_STREAM, there are no remaining uses for FMODE_ATOMIC_POS, and all the code to set it is deleted. Any cases where we don't want the serialization because the driver (or subsystem) doesn't use the file position should just be updated to do "stream_open()". We've done that for all the obvious and common situations, we may need a few more. Quoting Kirill Smelkov in the original FMODE_STREAM thread (see link below for full email): "And I appreciate if people could help at least somehow with "getting rid of mixed case entirely" (i.e. always lock f_pos_lock on !FMODE_STREAM), because this transition starts to diverge from my particular use-case too far. To me it makes sense to do that transition as follows: - convert nonseekable_open -> stream_open via stream_open.cocci; - audit other nonseekable_open calls and convert left users that truly don't depend on position to stream_open; - extend stream_open.cocci to analyze alloc_file_pseudo as well (this will cover pipes and sockets), or maybe convert pipes and sockets to FMODE_STREAM manually; - extend stream_open.cocci to analyze file_operations that use no_llseek or noop_llseek, but do not use nonseekable_open or alloc_file_pseudo. This might find files that have stream semantic but are opened differently; - extend stream_open.cocci to analyze file_operations whose .read/.write do not use ppos at all (independently of how file was opened); - ... - after that remove FMODE_ATOMIC_POS and always take f_pos_lock if !FMODE_STREAM; - gather bug reports for deadlocked read/write and convert missed cases to FMODE_STREAM, probably extending stream_open.cocci along the road to catch similar cases i.e. always take f_pos_lock unless a file is explicitly marked as being stream, and try to find and cover all files that are streams" We have not done the "extend stream_open.cocci to analyze alloc_file_pseudo" as well, but the previous commit did manually handle the case of pipes and sockets. The other case where we can avoid locking f_pos is the "this file descriptor only has a single user and it is us, and thus there is no need to lock it". The second test was correct, although a bit subtle and worth just re-iterating here. There are two kinds of other sources of references to the same file descriptor: file descriptors that have been explicitly shared across fork() or with dup(), and file tables having elevated reference counts due to threading (or explicit file sharing with clone()). The first case would have incremented the file count explicitly, and in the second case the previous __fdget() would have incremented it for us and set the FDPUT_FPUT flag. But in both cases the file count would be greater than one, so the "file_count(file) > 1" test catches both situations. Also note that if file_count is 1, that also means that no other thread can have access to the file table, so there also cannot be races with concurrent calls to dup()/fork()/clone() that would increment the file count any other way. Link: https://lore.kernel.org/linux-fsdevel/20190413184404.GA13490@deco.navytux.spb.ru Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Eic Dumazet <edumazet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marco Elver <elver@google.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Paul McKenney <paulmck@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-25vfs: mark pipes and sockets as stream-like file descriptorsLinus Torvalds2-2/+5
In commit 3975b097e577 ("convert stream-like files -> stream_open, even if they use noop_llseek") Kirill used a coccinelle script to change "nonseekable_open()" to "stream_open()", which changed the trivial cases of stream-like file descriptors to the new model with FMODE_STREAM. However, the two big cases - sockets and pipes - don't actually have that trivial pattern at all, and were thus never converted to FMODE_STREAM even though it makes lots of sense to do so. That's particularly true when looking forward to the next change: getting rid of FMODE_ATOMIC_POS entirely, and just using FMODE_STREAM to decide whether f_pos updates are needed or not. And if they are, we'll always do them atomically. This came up because KCSAN (correctly) noted that the non-locked f_pos updates are data races: they are clearly benign for the case where we don't care, but it would be good to just not have that issue exist at all. Note that the reason we used FMODE_ATOMIC_POS originally is that only doing it for the minimal required case is "safer" in that it's possible that the f_pos locking can cause unnecessary serialization across the whole write() call. And in the worst case, that kind of serialization can cause deadlock issues: think writers that need readers to empty the state using the same file descriptor. [ Note that the locking is per-file descriptor - because it protects "f_pos", which is obviously per-file descriptor - so it only affects cases where you literally use the same file descriptor to both read and write. So a regular pipe that has separate reading and writing file descriptors doesn't really have this situation even though it's the obvious case of "reader empties what a bit writer concurrently fills" But we want to make pipes as being stream-line anyway, because we don't want the unnecessary overhead of locking, and because a named pipe can be (ab-)used by reading and writing to the same file descriptor. ] There are likely a lot of other cases that might want FMODE_STREAM, and looking for ".llseek = no_llseek" users and other cases that don't have an lseek file operation at all and making them use "stream_open()" might be a good idea. But pipes and sockets are likely to be the two main cases. Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Eic Dumazet <edumazet@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Marco Elver <elver@google.com> Cc: Andrea Parri <parri.andrea@gmail.com> Cc: Paul McKenney <paulmck@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-25writeback: fix -Wformat compilation warningsQian Cai1-24/+24
The commit f05499a06fb4 ("writeback: use ino_t for inodes in tracepoints") introduced a lot of GCC compilation warnings on s390, In file included from ./include/trace/define_trace.h:102, from ./include/trace/events/writeback.h:904, from fs/fs-writeback.c:82: ./include/trace/events/writeback.h: In function 'trace_raw_output_writeback_page_template': ./include/trace/events/writeback.h:76:12: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'ino_t' {aka 'unsigned int'} [-Wformat=] TP_printk("bdi %s: ino=%lu index=%lu", ^~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/trace/trace_events.h:360:22: note: in definition of macro 'DECLARE_EVENT_CLASS' trace_seq_printf(s, print); \ ^~~~~ ./include/trace/events/writeback.h:76:2: note: in expansion of macro 'TP_printk' TP_printk("bdi %s: ino=%lu index=%lu", ^~~~~~~~~ Fix them by adding necessary casts where ino_t could be either "unsigned int" or "unsigned long". Fixes: f05499a06fb4 ("writeback: use ino_t for inodes in tracepoints") Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Tejun Heo <tj@kernel.org>
2019-11-25ALSA: usb-audio: Fix Focusrite Scarlett 6i6 gen1 - input handlingJens Verwiebe1-2/+21
The Scarlett 6i6 has no padding on rear inputs 3/4 but a gainstage. This patch introduces this functionality as to be seen in the mac or windows scarlett control. The correct address could already be found in the dump info, but was never used. Without this patch inputs 3/4 are quite unusable else. Signed-off-by: Jens Verwiebe <info@jensverwiebe.de> Link: https://lore.kernel.org/r/384d65cd-5e87-91eb-9fc3-e57226f534c6@jensverwiebe.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-25ALSA: hda/realtek - Enable internal speaker of ASUS UX431FLCJian-Hong Pan1-1/+9
Laptops like ASUS UX431FLC and UX431FL can share the same audio quirks. But UX431FLC needs one more step to enable the internal speaker: Pull the GPIO from CODEC to initialize the AMP. Fixes: 60083f9e94b2 ("ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL") Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191125093405.5702-1-jian-hong@endlessm.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-25Merge tag 'asoc-v5.5-2' of ↵Takashi Iwai1653-11969/+18928
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: More updates for v5.5 Some more development work for v5.5. Highlights include: - More cleanups from Morimoto-san. - Trigger word detection for RT5677. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2019-11-25Merge branch 'for-5.5/system-state' into for-linusPetr Mladek15-21/+901
2019-11-25Merge branch 'for-5.5/selftests' into for-linusPetr Mladek1-0/+1
2019-11-25Merge branch 'sched/rt' into sched/core, to pick up commitIngo Molnar1-1/+1
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25Merge tag 'kvm-ppc-next-5.5-2' of ↵Paolo Bonzini1-15/+29
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD Second KVM PPC update for 5.5 - Two fixes from Greg Kurz to fix memory leak bugs in the XIVE code.
2019-11-25x86/entry/32: Fix FIXUP_ESPFIX_STACK with user CR3Andy Lutomirski1-3/+18
UNWIND_ESPFIX_STACK needs to read the GDT, and the GDT mapping that can be accessed via %fs is not mapped in the user pagetables. Use SGDT to find the cpu_entry_area mapping and read the espfix offset from that instead. Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25lkdtm: Remove references to CONFIG_REFCOUNT_FULLWill Deacon1-2/+1
CONFIG_REFCOUNT_FULL no longer exists, so remove all references to it. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-11-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Remove unused 'refcount_error_report()' functionWill Deacon2-18/+0
'refcount_error_report()' has no callers. Remove it. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-10-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Consolidate implementations of refcount_tWill Deacon11-308/+59
The generic implementation of refcount_t should be good enough for everybody, so remove ARCH_HAS_REFCOUNT and REFCOUNT_FULL entirely, leaving the generic implementation enabled unconditionally. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-9-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitionsWill Deacon1-7/+2
The definitions of REFCOUNT_MAX and REFCOUNT_SATURATED are the same, regardless of CONFIG_REFCOUNT_FULL, so consolidate them into a single pair of definitions. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-8-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Move saturation warnings out of lineWill Deacon2-19/+48
Having the refcount saturation and warnings inline bloats the text, despite the fact that these paths should never be executed in normal operation. Move the refcount saturation and warnings out of line to reduce the image size when refcount_t checking is enabled. Relative to an x86_64 defconfig, the sizes reported by bloat-o-meter are: # defconfig+REFCOUNT_FULL, inline saturation (i.e. before this patch) Total: Before=14762076, After=14915442, chg +1.04% # defconfig+REFCOUNT_FULL, out-of-line saturation (i.e. after this patch) Total: Before=14762076, After=14835497, chg +0.50% A side-effect of this change is that we now only get one warning per refcount saturation type, rather than one per problematic call-site. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-7-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Improve performance of generic REFCOUNT_FULL codeWill Deacon1-56/+75
Rewrite the generic REFCOUNT_FULL implementation so that the saturation point is moved to INT_MIN / 2. This allows us to defer the sanity checks until after the atomic operation, which removes many uses of cmpxchg() in favour of atomic_fetch_{add,sub}(). Some crude perf results obtained from lkdtm show substantially less overhead, despite the checking: $ perf stat -r 3 -B -- echo {ATOMIC,REFCOUNT}_TIMING >/sys/kernel/debug/provoke-crash/DIRECT # arm64 ATOMIC_TIMING: 46.50451 +- 0.00134 seconds time elapsed ( +- 0.00% ) REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 77.57522 +- 0.00982 seconds time elapsed ( +- 0.01% ) REFCOUNT_TIMING (REFCOUNT_FULL, this series): 48.7181 +- 0.0256 seconds time elapsed ( +- 0.05% ) # x86 ATOMIC_TIMING: 31.6225 +- 0.0776 seconds time elapsed ( +- 0.25% ) REFCOUNT_TIMING (!REFCOUNT_FULL, mainline/x86 asm): 31.6689 +- 0.0901 seconds time elapsed ( +- 0.28% ) REFCOUNT_TIMING (REFCOUNT_FULL, mainline): 53.203 +- 0.138 seconds time elapsed ( +- 0.26% ) REFCOUNT_TIMING (REFCOUNT_FULL, this series): 31.7408 +- 0.0486 seconds time elapsed ( +- 0.15% ) Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Tested-by: Jan Glauber <jglauber@marvell.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-6-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the ↵Will Deacon2-246/+229
<linux/refcount.h> header In an effort to improve performance of the REFCOUNT_FULL implementation, move the bulk of its functions into linux/refcount.h. This allows them to be inlined in the same way as if they had been provided via CONFIG_ARCH_HAS_REFCOUNT. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-5-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Remove unused refcount_*_checked() variantsWill Deacon2-43/+36
The full-fat refcount implementation is exposed via a set of functions suffixed with "_checked()", the idea being that code can choose to use the more expensive, yet more secure implementation on a case-by-case basis. In reality, this hasn't happened, so with a grand total of zero users, let's remove the checked variants for now by simply dropping the suffix and predicating the out-of-line functions on CONFIG_REFCOUNT_FULL=y. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-4-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Ensure integer operands are treated as signedWill Deacon2-10/+10
In preparation for changing the saturation point of REFCOUNT_FULL to INT_MIN/2, change the type of integer operands passed into the API from 'unsigned int' to 'int' so that we can avoid casting during comparisons when we don't want to fall foul of C integral conversion rules for signed and unsigned types. Since the kernel is compiled with '-fno-strict-overflow', we don't need to worry about the UB introduced by signed overflow here. Furthermore, we're already making heavy use of the atomic_t API, which operates exclusively on signed types. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-3-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25locking/refcount: Define constants for saturation and max refcount valuesWill Deacon3-26/+29
The REFCOUNT_FULL implementation uses a different saturation point than the x86 implementation, which means that the shared refcount code in lib/refcount.c (e.g. refcount_dec_not_one()) needs to be aware of the difference. Rather than duplicate the definitions from the lkdtm driver, instead move them into <linux/refcount.h> and update all references accordingly. Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Hanjun Guo <guohanjun@huawei.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Elena Reshetova <elena.reshetova@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191121115902.2551-2-will@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25Merge branch 'x86/core' into perf/core, to resolve conflicts and to pick up ↵Ingo Molnar12-13/+128
completed topic tree Conflicts: tools/perf/check-headers.sh Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar195-4088/+3382
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25Merge branch 'x86/build' into x86/asm, to pick up completed topic branchIngo Molnar38-179/+163
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-25x86/pti/32: Calculate the various PTI cpu_entry_area sizes correctly, make ↵Ingo Molnar3-10/+14
the CPU_ENTRY_AREA_PAGES assert precise When two recent commits that increased the size of the 'struct cpu_entry_area' were merged in -tip, the 32-bit defconfig build started failing on the following build time assert: ./include/linux/compiler.h:391:38: error: call to ‘__compiletime_assert_189’ declared with attribute error: BUILD_BUG_ON failed: CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE arch/x86/mm/cpu_entry_area.c:189:2: note: in expansion of macro ‘BUILD_BUG_ON’ In function ‘setup_cpu_entry_area_ptes’, Which corresponds to the following build time assert: BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE); The purpose of this assert is to sanity check the fixed-value definition of CPU_ENTRY_AREA_PAGES arch/x86/include/asm/pgtable_32_types.h: #define CPU_ENTRY_AREA_PAGES (NR_CPUS * 41) The '41' is supposed to match sizeof(struct cpu_entry_area)/PAGE_SIZE, which value we didn't want to define in such a low level header, because it would cause dependency hell. Every time the size of cpu_entry_area is changed, we have to adjust CPU_ENTRY_AREA_PAGES accordingly - and this assert is checking that constraint. But the assert is both imprecise and buggy, primarily because it doesn't include the single readonly IDT page that is mapped at CPU_ENTRY_AREA_BASE (which begins at a PMD boundary). This bug was hidden by the fact that by accident CPU_ENTRY_AREA_PAGES is defined too large upstream (v5.4-rc8): #define CPU_ENTRY_AREA_PAGES (NR_CPUS * 40) While 'struct cpu_entry_area' is 155648 bytes, or 38 pages. So we had two extra pages, which hid the bug. The following commit (not yet upstream) increased the size to 40 pages: x86/iopl: ("Restrict iopl() permission scope") ... but increased CPU_ENTRY_AREA_PAGES only 41 - i.e. shortening the gap to just 1 extra page. Then another not-yet-upstream commit changed the size again: 880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit") Which increased the cpu_entry_area size from 38 to 39 pages, but didn't change CPU_ENTRY_AREA_PAGES (kept it at 40). This worked fine, because we still had a page left from the accidental 'reserve'. But when these two commits were merged into the same tree, the combined size of cpu_entry_area grew from 38 to 40 pages, while CPU_ENTRY_AREA_PAGES finally caught up to 40 as well. Which is fine in terms of functionality, but the assert broke: BUILD_BUG_ON(CPU_ENTRY_AREA_PAGES * PAGE_SIZE < CPU_ENTRY_AREA_MAP_SIZE); because CPU_ENTRY_AREA_MAP_SIZE is the total size of the area, which is 1 page larger due to the IDT page. To fix all this, change the assert to two precise asserts: BUILD_BUG_ON((CPU_ENTRY_AREA_PAGES+1)*PAGE_SIZE != CPU_ENTRY_AREA_MAP_SIZE); BUILD_BUG_ON(CPU_ENTRY_AREA_TOTAL_SIZE != CPU_ENTRY_AREA_MAP_SIZE); This takes the IDT page into account, and also connects the size-based define of CPU_ENTRY_AREA_TOTAL_SIZE with the address-subtraction based define of CPU_ENTRY_AREA_MAP_SIZE. Also clean up some of the names which made it rather confusing: - 'CPU_ENTRY_AREA_TOT_SIZE' wasn't actually the 'total' size of the cpu-entry-area, but the per-cpu array size, so rename this to CPU_ENTRY_AREA_ARRAY_SIZE. - Introduce CPU_ENTRY_AREA_TOTAL_SIZE that _is_ the total mapping size, with the IDT included. - Add comments where '+1' denotes the IDT mapping - it wasn't obvious and took me about 3 hours to decode... Finally, because this particular commit is actually applied after this patch: 880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit") Fix the CPU_ENTRY_AREA_PAGES value from 40 pages to the correct 39 pages. All future commits that change cpu_entry_area will have to adjust this value precisely. As a side note, we should probably attempt to remove CPU_ENTRY_AREA_PAGES and derive its value directly from the structure, without causing header hell - but that is an adventure for another day! :-) Fixes: 880a98c33996: ("x86/cpu_entry_area: Add guard page for entry stack on 32bit") Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: stable@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller50-548/+2031
Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-11-24 The following pull-request contains BPF updates for your *net-next* tree. We've added 27 non-merge commits during the last 4 day(s) which contain a total of 50 files changed, 2031 insertions(+), 548 deletions(-). The main changes are: 1) Optimize bpf_tail_call() from retpoline-ed indirect jump to direct jump, from Daniel. 2) Support global variables in libbpf, from Andrii. 3) Cleanup selftests with BPF_TRACE_x() macro, from Martin. 4) Fix devmap hash, from Toke. 5) Fix register bounds after 32-bit conditional jumps, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-24Merge branch 'for-upstream' of ↵Jakub Kicinski7-7/+42
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2019-11-24 Here's one last bluetooth-next pull request for the 5.5 kernel: - Fix BDADDR_PROPERTY & INVALID_BDADDR quirk handling - Added support for BCM4334B0 and BCM4335A0 controllers - A few other smaller fixes related to locking and memory leaks ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-24ax88179_178a: add ethtool_op_get_ts_info()Andreas K. Besslein1-0/+1
This enables the use of SW timestamping. ax88179_178a uses the usbnet transmit function usbnet_start_xmit() which implements software timestamping. ax88179_178a overrides ethtool_ops but missed to set .get_ts_info. This caused SOF_TIMESTAMPING_TX_SOFTWARE capability to be not available. Signed-off-by: Andreas K. Besslein <besslein.andreas@gmail.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-24Merge branch 'mlxsw-Two-small-updates'Jakub Kicinski1-4/+40
Ido Schimmel says: ==================== mlxsw: Two small updates Patch #1 from Petr handles a corner case in GRE tunnel offload. Patch #2 from Amit fixes a recent issue where the driver was programming the device to use an adjacency index (for a nexthop) that was not properly initialized. ==================== Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-24mlxsw: spectrum_router: Fix use of uninitialized adjacency indexAmit Cohen1-3/+2
When mlxsw_sp_adj_discard_write() is called for the first time, the value stored in 'mlxsw_sp->router->adj_discard_index' is invalid, as indicated by 'mlxsw_sp->router->adj_discard_index_valid' being set to 'false'. In this case, we should not use the value initially stored in 'mlxsw_sp->router->adj_discard_index' (0) and instead use the value allocated later in the function. Fixes: 983db6198f0d ("mlxsw: spectrum_router: Allocate discard adjacency entry when needed") Signed-off-by: Amit Cohen <amitc@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>