aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-03-01Btrfs: fix log replay failure after unlink and link combinationFilipe Manana3-22/+139
If we have a file with 2 (or more) hard links in the same directory, remove one of the hard links, create a new file (or link an existing file) in the same directory with the name of the removed hard link, and then finally fsync the new file, we end up with a log that fails to replay, causing a mount failure. Example: $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/testdir $ touch /mnt/testdir/foo $ ln /mnt/testdir/foo /mnt/testdir/bar $ sync $ unlink /mnt/testdir/bar $ touch /mnt/testdir/bar $ xfs_io -c "fsync" /mnt/testdir/bar <power failure> $ mount /dev/sdb /mnt mount: mount(2) failed: /mnt: No such file or directory When replaying the log, for that example, we also see the following in dmesg/syslog: [71813.671307] BTRFS info (device dm-0): failed to delete reference to bar, inode 258 parent 257 [71813.674204] ------------[ cut here ]------------ [71813.675694] BTRFS: Transaction aborted (error -2) [71813.677236] WARNING: CPU: 1 PID: 13231 at fs/btrfs/inode.c:4128 __btrfs_unlink_inode+0x17b/0x355 [btrfs] [71813.679669] Modules linked in: btrfs xfs f2fs dm_flakey dm_mod dax ghash_clmulni_intel ppdev pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper evdev psmouse i2c_piix4 parport_pc i2c_core pcspkr sg serio_raw parport button sunrpc loop autofs4 ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod ata_generic sd_mod virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel floppy virtio e1000 scsi_mod [last unloaded: btrfs] [71813.679669] CPU: 1 PID: 13231 Comm: mount Tainted: G W 4.15.0-rc9-btrfs-next-56+ #1 [71813.679669] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 [71813.679669] RIP: 0010:__btrfs_unlink_inode+0x17b/0x355 [btrfs] [71813.679669] RSP: 0018:ffffc90001cef738 EFLAGS: 00010286 [71813.679669] RAX: 0000000000000025 RBX: ffff880217ce4708 RCX: 0000000000000001 [71813.679669] RDX: 0000000000000000 RSI: ffffffff81c14bae RDI: 00000000ffffffff [71813.679669] RBP: ffffc90001cef7c0 R08: 0000000000000001 R09: 0000000000000001 [71813.679669] R10: ffffc90001cef5e0 R11: ffffffff8343f007 R12: ffff880217d474c8 [71813.679669] R13: 00000000fffffffe R14: ffff88021ccf1548 R15: 0000000000000101 [71813.679669] FS: 00007f7cee84c480(0000) GS:ffff88023fc80000(0000) knlGS:0000000000000000 [71813.679669] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [71813.679669] CR2: 00007f7cedc1abf9 CR3: 00000002354b4003 CR4: 00000000001606e0 [71813.679669] Call Trace: [71813.679669] btrfs_unlink_inode+0x17/0x41 [btrfs] [71813.679669] drop_one_dir_item+0xfa/0x131 [btrfs] [71813.679669] add_inode_ref+0x71e/0x851 [btrfs] [71813.679669] ? __lock_is_held+0x39/0x71 [71813.679669] ? replay_one_buffer+0x53/0x53a [btrfs] [71813.679669] replay_one_buffer+0x4a4/0x53a [btrfs] [71813.679669] ? rcu_read_unlock+0x3a/0x57 [71813.679669] ? __lock_is_held+0x39/0x71 [71813.679669] walk_up_log_tree+0x101/0x1d2 [btrfs] [71813.679669] walk_log_tree+0xad/0x188 [btrfs] [71813.679669] btrfs_recover_log_trees+0x1fa/0x31e [btrfs] [71813.679669] ? replay_one_extent+0x544/0x544 [btrfs] [71813.679669] open_ctree+0x1cf6/0x2209 [btrfs] [71813.679669] btrfs_mount_root+0x368/0x482 [btrfs] [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 [71813.679669] ? __lockdep_init_map+0x176/0x1c2 [71813.679669] ? mount_fs+0x64/0x10b [71813.679669] mount_fs+0x64/0x10b [71813.679669] vfs_kern_mount+0x68/0xce [71813.679669] btrfs_mount+0x13e/0x772 [btrfs] [71813.679669] ? trace_hardirqs_on_caller+0x14c/0x1a6 [71813.679669] ? __lockdep_init_map+0x176/0x1c2 [71813.679669] ? mount_fs+0x64/0x10b [71813.679669] mount_fs+0x64/0x10b [71813.679669] vfs_kern_mount+0x68/0xce [71813.679669] do_mount+0x6e5/0x973 [71813.679669] ? memdup_user+0x3e/0x5c [71813.679669] SyS_mount+0x72/0x98 [71813.679669] entry_SYSCALL_64_fastpath+0x1e/0x8b [71813.679669] RIP: 0033:0x7f7cedf150ba [71813.679669] RSP: 002b:00007ffca71da688 EFLAGS: 00000206 [71813.679669] Code: 7f a0 e8 51 0c fd ff 48 8b 43 50 f0 0f ba a8 30 2c 00 00 02 72 17 41 83 fd fb 74 11 44 89 ee 48 c7 c7 7d 11 7f a0 e8 38 f5 8d e0 <0f> ff 44 89 e9 ba 20 10 00 00 eb 4d 48 8b 4d b0 48 8b 75 88 4c [71813.679669] ---[ end trace 83bd473fc5b4663b ]--- [71813.854764] BTRFS: error (device dm-0) in __btrfs_unlink_inode:4128: errno=-2 No such entry [71813.886994] BTRFS: error (device dm-0) in btrfs_replay_log:2307: errno=-2 No such entry (Failed to recover log tree) [71813.903357] BTRFS error (device dm-0): cleaner transaction attach returned -30 [71814.128078] BTRFS error (device dm-0): open_ctree failed This happens because the log has inode reference items for both inode 258 (the first file we created) and inode 259 (the second file created), and when processing the reference item for inode 258, we replace the corresponding item in the subvolume tree (which has two names, "foo" and "bar") witht he one in the log (which only has one name, "foo") without removing the corresponding dir index keys from the parent directory. Later, when processing the inode reference item for inode 259, which has a name of "bar" associated to it, we notice that dir index entries exist for that name and for a different inode, so we attempt to unlink that name, which fails because the inode reference item for inode 258 no longer has the name "bar" associated to it, making a call to btrfs_unlink_inode() fail with a -ENOENT error. Fix this by unlinking all the names in an inode reference item from a subvolume tree that are not present in the inode reference item found in the log tree, before overwriting it with the item from the log tree. Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2018-03-01Btrfs: fix log replay failure after linking special file and fsyncFilipe Manana1-1/+1
If in the same transaction we rename a special file (fifo, character/block device or symbolic link), create a hard link for it having its old name then sync the log, we will end up with a log that can not be replayed and at when attempting to replay it, an EEXIST error is returned and mounting the filesystem fails. Example scenario: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt $ mkdir /mnt/testdir $ mkfifo /mnt/testdir/foo # Make sure everything done so far is durably persisted. $ sync # Create some unrelated file and fsync it, this is just to create a log # tree. The file must be in the same directory as our special file. $ touch /mnt/testdir/f1 $ xfs_io -c "fsync" /mnt/testdir/f1 # Rename our special file and then create a hard link with its old name. $ mv /mnt/testdir/foo /mnt/testdir/bar $ ln /mnt/testdir/bar /mnt/testdir/foo # Create some other unrelated file and fsync it, this is just to persist # the log tree which was modified by the previous rename and link # operations. Alternatively we could have modified file f1 and fsync it. $ touch /mnt/f2 $ xfs_io -c "fsync" /mnt/f2 <power failure> $ mount /dev/sdc /mnt mount: mount /dev/sdc on /mnt failed: File exists This happens because when both the log tree and the subvolume's tree have an entry in the directory "testdir" with the same name, that is, there is one key (258 INODE_REF 257) in the subvolume tree and another one in the log tree (where 258 is the inode number of our special file and 257 is the inode for directory "testdir"). Only the data of those two keys differs, in the subvolume tree the index field for inode reference has a value of 3 while the log tree it has a value of 5. Because the same key exists in both trees, but have different index, the log replay fails with an -EEXIST error when attempting to replay the inode reference from the log tree. Fix this by setting the last_unlink_trans field of the inode (our special file) to the current transaction id when a hard link is created, as this forces logging the parent directory inode, solving the conflict at log replay time. A new generic test case for fstests was also submitted. Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2018-03-01Btrfs: send, fix issuing write op when processing hole in no data modeFilipe Manana1-0/+3
When doing an incremental send of a filesystem with the no-holes feature enabled, we end up issuing a write operation when using the no data mode send flag, instead of issuing an update extent operation. Fix this by issuing the update extent operation instead. Trivial reproducer: $ mkfs.btrfs -f -O no-holes /dev/sdc $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdc /mnt/sdc $ mount /dev/sdd /mnt/sdd $ xfs_io -f -c "pwrite -S 0xab 0 32K" /mnt/sdc/foobar $ btrfs subvolume snapshot -r /mnt/sdc /mnt/sdc/snap1 $ xfs_io -c "fpunch 8K 8K" /mnt/sdc/foobar $ btrfs subvolume snapshot -r /mnt/sdc /mnt/sdc/snap2 $ btrfs send /mnt/sdc/snap1 | btrfs receive /mnt/sdd $ btrfs send --no-data -p /mnt/sdc/snap1 /mnt/sdc/snap2 \ | btrfs receive -vv /mnt/sdd Before this change the output of the second receive command is: receiving snapshot snap2 uuid=f6922049-8c22-e544-9ff9-fc6755918447... utimes write foobar, offset 8192, len 8192 utimes foobar BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=f6922049-8c22-e544-9ff9-... After this change it is: receiving snapshot snap2 uuid=564d36a3-ebc8-7343-aec9-bf6fda278e64... utimes update_extent foobar: offset=8192, len=8192 utimes foobar BTRFS_IOC_SET_RECEIVED_SUBVOL uuid=564d36a3-ebc8-7343-aec9-bf6fda278e64... Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2018-03-01btrfs: use proper endianness accessors for super_copyAnand Jain2-13/+15
The fs_info::super_copy is a byte copy of the on-disk structure and all members must use the accessor macros/functions to obtain the right value. This was missing in update_super_roots and in sysfs readers. Moving between opposite endianness hosts will report bogus numbers in sysfs, and mount may fail as the root will not be restored correctly. If the filesystem is always used on a same endian host, this will not be a problem. Fix this by using the btrfs_set_super...() functions to set fs_info::super_copy values, and for the sysfs, use the cached fs_info::nodesize/sectorsize values. CC: [email protected] Fixes: df93589a17378 ("btrfs: export more from FS_INFO to sysfs") Signed-off-by: Anand Jain <[email protected]> Reviewed-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> [ update changelog ] Signed-off-by: David Sterba <[email protected]>
2018-03-01btrfs: alloc_chunk: fix DUP stripe size handlingHans van Kranenburg1-5/+6
In case of using DUP, we search for enough unallocated disk space on a device to hold two stripes. The devices_info[ndevs-1].max_avail that holds the amount of unallocated space found is directly assigned to stripe_size, while it's actually twice the stripe size. Later on in the code, an unconditional division of stripe_size by dev_stripes corrects the value, but in the meantime there's a check to see if the stripe_size does not exceed max_chunk_size. Since during this check stripe_size is twice the amount as intended, the check will reduce the stripe_size to max_chunk_size if the actual correct to be used stripe_size is more than half the amount of max_chunk_size. The unconditional division later tries to correct stripe_size, but will actually make sure we can't allocate more than half the max_chunk_size. Fix this by moving the division by dev_stripes before the max chunk size check, so it always contains the right value, instead of putting a duct tape division in further on to get it fixed again. Since in all other cases than DUP, dev_stripes is 1, this change only affects DUP. Other attempts in the past were made to fix this: * 37db63a400 "Btrfs: fix max chunk size check in chunk allocator" tried to fix the same problem, but still resulted in part of the code acting on a wrongly doubled stripe_size value. * 86db25785a "Btrfs: fix max chunk size on raid5/6" unintentionally broke this fix again. The real problem was already introduced with the rest of the code in 73c5de0051. The user visible result however will be that the max chunk size for DUP will suddenly double, while it's actually acting according to the limits in the code again like it was 5 years ago. Reported-by: Naohiro Aota <[email protected]> Link: https://www.spinics.net/lists/linux-btrfs/msg69752.html Fixes: 73c5de0051 ("btrfs: quasi-round-robin for chunk allocation") Fixes: 86db25785a ("Btrfs: fix max chunk size on raid5/6") Signed-off-by: Hans van Kranenburg <[email protected]> Reviewed-by: David Sterba <[email protected]> [ update comment ] Signed-off-by: David Sterba <[email protected]>
2018-03-01btrfs: Handle btrfs_set_extent_delalloc failure in relocate_file_extent_clusterNikolay Borisov1-2/+16
Essentially duplicate the error handling from the above block which handles the !PageUptodate(page) case and additionally clear EXTENT_BOUNDARY. Signed-off-by: Nikolay Borisov <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2018-03-01btrfs: handle failure of add_pending_csumsNikolay Borisov1-2/+9
add_pending_csums was added as part of the new data=ordered implementation in e6dcd2dc9c48 ("Btrfs: New data=ordered implementation"). Even back then it called the btrfs_csum_file_blocks which can fail but it never bothered handling the failure. In ENOMEM situation this could lead to the filesystem failing to write the checksums for a particular extent and not detect this. On read this could lead to the filesystem erroring out due to crc mismatch. Fix it by propagating failure from add_pending_csums and handling them. Signed-off-by: Nikolay Borisov <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2018-03-01btrfs: use kvzalloc to allocate btrfs_fs_infoJeff Mahoney2-2/+2
The srcu_struct in btrfs_fs_info scales in size with NR_CPUS. On kernels built with NR_CPUS=8192, this can result in kmalloc failures that prevent mounting. There is work in progress to try to resolve this for every user of srcu_struct but using kvzalloc will work around the failures until that is complete. As an example with NR_CPUS=512 on x86_64: the overall size of subvol_srcu is 3460 bytes, fs_info is 6496. Signed-off-by: Jeff Mahoney <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2018-03-01platform/x86: intel-hid: Reset wakeup capable flag on removalRafael J. Wysocki1-0/+1
The intel-hid device will not be able to wake up the system any more after removing the notify handler provided by its driver, so make its sysfs attributes reflect that. Fixes: ef884112e55c (platform: x86: intel-hid: Wake up the system from suspend-to-idle) Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]>
2018-03-01platform/x86: intel-vbtn: Reset wakeup capable flag on removalRafael J. Wysocki1-0/+1
The intel-vbtn device will not be able to wake up the system any more after removing the notify handler provided by its driver, so make its sysfs attributes reflect that. Fixes: 91f9e850d465 (platform: x86: intel-vbtn: Wake up the system from suspend-to-idle) Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]>
2018-03-01x86/cpu_entry_area: Sync cpu_entry_area to initial_page_tableThomas Gleixner6-25/+32
The separation of the cpu_entry_area from the fixmap missed the fact that on 32bit non-PAE kernels the cpu_entry_area mapping might not be covered in initial_page_table by the previous synchronizations. This results in suspend/resume failures because 32bit utilizes initial page table for resume. The absence of the cpu_entry_area mapping results in a triple fault, aka. insta reboot. With PAE enabled this works by chance because the PGD entry which covers the fixmap and other parts incindentally provides the cpu_entry_area mapping as well. Synchronize the initial page table after setting up the cpu entry area. Instead of adding yet another copy of the same code, move it to a function and invoke it from the various places. It needs to be investigated if the existing calls in setup_arch() and setup_per_cpu_areas() can be replaced by the later invocation from setup_cpu_entry_areas(), but that's beyond the scope of this fix. Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap") Reported-by: Woody Suwalski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Woody Suwalski <[email protected]> Cc: William Grant <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-03-01pvcalls-front: 64-bit align flagsStefano Stabellini1-3/+8
We are using test_and_* operations on the status and flag fields of struct sock_mapping. However, these functions require the operand to be 64-bit aligned on arm64. Currently, only status is 64-bit aligned. Make status and flags explicitly 64-bit aligned. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-03-01Merge branch 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie8-33/+32
into drm-fixes A few misc fixes for 4.16. * 'drm-fixes-4.16' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu: skip ECC for SRIOV in gmc late_init drm/amd/amdgpu: Correct VRAM width for APUs with GMC9 drm/amdgpu: fix&cleanups for wb_clear drm/amdgpu: Correct sdma_v4 get_wptr(v2) drm/amd/powerplay: fix power over limit on Fiji drm/amdgpu:Fixed wrong emit frame size for enc drm/amdgpu: move WB_FREE to correct place drm/amdgpu: only flush hotplug work without DC drm/amd/display: check for ipp before calling cursor operations
2018-03-01Merge tag 'drm-misc-fixes-2018-02-28' of ↵Dave Airlie6-8/+23
git://people.freedesktop.org/drm-misc into drm-fixes Two regression fixes here: a fb format regression on nouveau and a 4.16-rc1 regression with on LVDS with one sun4i device. Plus a sun4i and a virtio-gpu fixes. * tag 'drm-misc-fixes-2018-02-28' of git://people.freedesktop.org/drm-misc: virtio-gpu: fix ioctl and expose the fixed status to userspace. drm/sun4i: Protect the TCON pixel clocks drm/sun4i: Enable the output on the pins (tcon0) drm/nouveau: prefer XBGR2101010 for addfb ioctl
2018-03-01Merge tag 'drm-intel-fixes-2018-02-28' of ↵Dave Airlie4-8/+10
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - 2 display fixes: audio av_enc_map overflow check, and Cannonlake PLL related register offset. - 3 gem fixes: Clear for in-fence out-fence, fix for clearing exec_flags on execbuf failure, and add back global seqno to tracepoints that had been removed recently by other fence related patch. * tag 'drm-intel-fixes-2018-02-28' of git://anongit.freedesktop.org/drm/drm-intel: drm/i915: Make global seqno known in i915_gem_request_execute tracepoint drm/i915: Clear the in-use marker on execbuf failure drm/i915/cnl: Fix PORT_TX_DW5/7 register address drm/i915/audio: fix check for av_enc_map overflow drm/i915: Fix rsvd2 mask when out-fence is returned
2018-02-28Merge tag 'armsoc-fixes' of ↵Linus Torvalds45-137/+191
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Arnd Bergmann: "This is the first set of bugfixes for ARM SoCs, fixing a couple of stability problems, mostly on TI OMAP and Rockchips platforms: - OMAP2 hwmod clocks must be enabled in the correct order - OMAP3 Wakeup from resume through PRM IRQ was unreliable - one regression on OMAP5 caused by a kexec fix - Rockchip ethernet needs some settings for stable operation on Rock64 - Rockchip based Chrombook Plus needs another clock setting for stable display suspend/resume - Rockchip based phyCORE-RK3288 was able to run at an invalid CPU clock frequency - Rockchip MMC link was sometimes unreliable - multiple fixes to avoid crashes in the Broadcom STB DPFE driver Other minor changes include: - Devicetree fixes for incorrect hardware description (rockchip, omap, Gemini, amlogic) - some MAINTAINER file updates to correct email and git addresses - some fixes addressing 'make W=1' dtc warnings (broadcom, amlogic, cavium, qualcomm, hisilicon, zx) - fixes for LTO-compilation (orion, davinci, clps711x) - one fix for an incorrect Kconfig errata selection - a memory leak in the OMAP timer driver - a kernel data leak in OMAP1 debugfs files" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits) MAINTAINERS: update entries for ARM/STM32 ARM: dts: bcm283x: Move arm-pmu out of soc node ARM: dts: bcm283x: Fix unit address of local_intc ARM: dts: NSP: Fix amount of RAM on BCM958625HR ARM: dts: Set D-Link DNS-313 SATA to muxmode 0 ARM: omap2: set CONFIG_LIRC=y in defconfig ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS memory: brcmstb: dpfe: support new way of passing data from the DCPU memory: brcmstb: dpfe: fix type declaration of variable "ret" memory: brcmstb: dpfe: properly mask vendor error bits ARM: BCM: dts: Remove leading 0x and 0s from bindings notation ARM: orion: fix orion_ge00_switch_board_info initialization ARM: davinci: mark spi_board_info arrays as const ARM: clps711x: mark clps711x_compat as const arm: zx: dts: Remove leading 0x and 0s from bindings notation arm64: dts: Remove leading 0x and 0s from bindings notation arm64: dts: cavium: fix PCI bus dtc warnings MAINTAINERS: ARM: at91: update my email address soc: imx: gpc: de-register power domains only if initialized ARM: dts: rockchip: Fix DWMMC clocks ...
2018-02-28Merge tag 'riscv-for-linus-4.16-rc4_smp_mb' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux Pull RISC-V fix from Palmer Dabbelt: "This week we have a single fix: replacing smp_mb() with __smp_mb(). We were the only architecture with smp_mb() and it appears to just be clearly wrong, so I think this is a pretty safe patch for an RC" * tag 'riscv-for-linus-4.16-rc4_smp_mb' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux: riscv/barrier: Define __smp_{mb,rmb,wmb}
2018-02-28timers: Forward timer base before migrating timersLingutla Chandrasekhar1-0/+6
On CPU hotunplug the enqueued timers of the unplugged CPU are migrated to a live CPU. This happens from the control thread which initiated the unplug. If the CPU on which the control thread runs came out from a longer idle period then the base clock of that CPU might be stale because the control thread runs prior to any event which forwards the clock. In such a case the timers from the unplugged CPU are queued on the live CPU based on the stale clock which can cause large delays due to increased granularity of the outer timer wheels which are far away from base:;clock. But there is a worse problem than that. The following sequence of events illustrates it: - CPU0 timer1 is queued expires = 59969 and base->clk = 59131. The timer is queued at wheel level 2, with resulting expiry time = 60032 (due to level granularity). - CPU1 enters idle @60007, with next timer expiry @60020. - CPU0 is hotplugged at @60009 - CPU1 exits idle and runs the control thread which migrates the timers from CPU0 timer1 is now queued in level 0 for immediate handling in the next softirq because the requested expiry time 59969 is before CPU1 base->clk 60007 - CPU1 runs code which forwards the base clock which succeeds because the next expiring timer. which was collected at idle entry time is still set to 60020. So it forwards beyond 60007 and therefore misses to expire the migrated timer1. That timer gets expired when the wheel wraps around again, which takes between 63 and 630ms depending on the HZ setting. Address both problems by invoking forward_timer_base() for the control CPUs timer base. All other places, which might run into a similar problem (mod_timer()/add_timer_on()) already invoke forward_timer_base() to avoid that. [ tglx: Massaged comment and changelog ] Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible") Co-developed-by: Neeraj Upadhyay <[email protected]> Signed-off-by: Neeraj Upadhyay <[email protected]> Signed-off-by: Lingutla Chandrasekhar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Anna-Maria Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-02-28Merge tag 'arm-soc/for-4.16/drivers-fixes' of ↵Arnd Bergmann1-14/+60
https://github.com/Broadcom/stblinux into fixes Pull "Broadcom drivers fixes for 4.16" from Florian Fainelli: This pull request contains Broadcom SoCs drivers fixes for 4.16, please pull the following: - Markus provides two minor fixes to the Broadcom STB DPFE driver, one to properly mask bits, and a second one to use the correct type. The third commit is a consequence of a newer DFPE firmware which would unfortunately crash without appropriate kernel changes. * tag 'arm-soc/for-4.16/drivers-fixes' of https://github.com/Broadcom/stblinux: memory: brcmstb: dpfe: support new way of passing data from the DCPU memory: brcmstb: dpfe: fix type declaration of variable "ret" memory: brcmstb: dpfe: properly mask vendor error bits
2018-02-28Merge tag 'arm-soc/for-4.16/devicetree-fixes' of ↵Arnd Bergmann7-14/+14
https://github.com/Broadcom/stblinux into fixes Pull "Broadcom devicetree fixes for 4.16" from Florian Fainelli: This pull request contains Broadcom ARM-based SoCs Device Tree fixes for 4.16, please pull the following: - Mathieu fixes leading 0x and 0's from bindings and Device Tree source files, he has done this treewide and most of his changes are already in 4.16 - Stefan provides two changes to the BCM283x DTS files in order to fix DTC warnings - Florian fixes the amount of RAM on the BCM958625HR reference board to properly limit to what is initialized by the bootloader * tag 'arm-soc/for-4.16/devicetree-fixes' of https://github.com/Broadcom/stblinux: ARM: dts: bcm283x: Move arm-pmu out of soc node ARM: dts: bcm283x: Fix unit address of local_intc ARM: dts: NSP: Fix amount of RAM on BCM958625HR ARM: BCM: dts: Remove leading 0x and 0s from bindings notation
2018-02-28MAINTAINERS: update entries for ARM/STM32Alexandre Torgue1-1/+3
Changes old git repository to the maintained one and adds more patterns. Signed-off-by: Alexandre Torgue <[email protected]> Acked-by: Maxime Coquelin <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2018-02-28Merge tag 'imx-fixes-4.16' of ↵Arnd Bergmann2-2/+10
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes Pull "i.MX fixes for 4.16" from Shawn Guo: - Fix i.MX GPC driver to remove power domains only when they are initialized in imx_gpc_probe(). - Fix the broken Engicam i.CoreM6 DualLite/Solo RQS board DT to include imx6dl.dtsi instead of imx6q.dtsi. * tag 'imx-fixes-4.16' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 DualLite/Solo RQS soc: imx: gpc: de-register power domains only if initialized
2018-02-28Merge tag 'linux-kselftest-4.16-rc4' of ↵Linus Torvalds8-16/+19
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "Fixes for various problems in test output, compile errors, and missing configs" * tag 'linux-kselftest-4.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: vm: update .gitignore with new test selftests: memory-hotplug: silence test command echo selftests/futex: Fix line continuation in Makefile selftests: memfd: add config fragment for fuse selftests: pstore: Adding config fragment CONFIG_PSTORE_RAM=m selftests/android: Fix line continuation in Makefile selftest/vDSO: fix O= selftests: sync: missing CFLAGS while compiling
2018-02-28drm/amdgpu: skip ECC for SRIOV in gmc late_initMonk Liu1-1/+1
Signed-off-by: Monk Liu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-28drm/amd/amdgpu: Correct VRAM width for APUs with GMC9Tom St Denis1-1/+4
DDR4 has a 64-bit width not 128-bits. It was reporting twice the width. Tested with my Ryzen 2400G. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-28drm/amdgpu: fix&cleanups for wb_clearMonk Liu2-3/+4
fix: should do right shift on wb before clearing cleanups: 1,should memset all wb buffer 2,set max wb number to 128 (total 4KB) is big enough Signed-off-by: Monk Liu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-28drm/amdgpu: Correct sdma_v4 get_wptr(v2)Emily Deng1-11/+7
the original method will change the wptr value in wb. v2: furthur cleanup Signed-off-by: Emily Deng <[email protected]> Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-28drm/amd/powerplay: fix power over limit on FijiEric Huang1-7/+0
power containment disabled only on Fiji and compute power profile. It violates PCIe spec and may cause power supply failed. Enabling it will fix the issue, even the fix will drop performance of some compute tests. Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2018-02-28drm/amdgpu:Fixed wrong emit frame size for encJames Zhu1-1/+1
Emit frame size should match with corresponding function, uvd_v6_0_enc_ring_emit_vm_flush has 5 amdgpu_ring_write Signed-off-by: James Zhu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-28drm/amdgpu: move WB_FREE to correct placeMonk Liu1-5/+7
WB_FREE should be put after all engines's hw_fini done, otherwise the invalid wptr/rptr_addr would still be used by engines which trigger abnormal bugs. This fixes couple DMAR reading error in host side for SRIOV after guest kmd is unloaded. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-28drm/amdgpu: only flush hotplug work without DCMonk Liu1-2/+4
since hotplug_work is initialized under the case of no dc support Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-28drm/amd/display: check for ipp before calling cursor operationsShirish S1-2/+4
Currently all cursor related functions are made to all pipes that are attached to a particular stream. This is not applicable to pipes that do not have cursor plane initialised like underlay. Hence this patch allows cursor related operations on a pipe only if ipp in available on that particular pipe. The check is added to set_cursor_position & set_cursor_attribute. Signed-off-by: Shirish S <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2018-02-28Merge tag 'xfs-4.16-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds4-5/+13
Pull xfs fixes from Darrick Wong: - fix some compiler warnings - fix block reservations for transactions created during log recovery - fix resource leaks when respecifying mount options * tag 'xfs-4.16-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix potential memory leak in mount option parsing xfs: reserve blocks for refcount / rmap log item recovery xfs: use memset to initialize xfs_scrub_agfl_info
2018-02-28x86/xen: add tty0 and hvc0 as preferred consoles for dom0Juergen Gross1-2/+4
Today the tty0 and hvc0 consoles are added as a preferred consoles for pv domUs only. As this requires a boot parameter for getting dom0 messages per default, add them for dom0, too. Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-02-28xen-netfront: Fix hang on device removalJason Andryuk1-1/+6
A toolstack may delete the vif frontend and backend xenstore entries while xen-netfront is in the removal code path. In that case, the checks for xenbus_read_driver_state would return XenbusStateUnknown, and xennet_remove would hang indefinitely. This hang prevents system shutdown. xennet_remove must be able to handle XenbusStateUnknown, and netback_changed must also wake up the wake_queue for that state as well. Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module") Signed-off-by: Jason Andryuk <[email protected]> Cc: Eduardo Otubo <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-02-28xen/pirq: fix error path cleanup when binding MSIsRoger Pau Monne1-2/+2
Current cleanup in the error path of xen_bind_pirq_msi_to_irq is wrong. First of all there's an off-by-one in the cleanup loop, which can lead to unbinding wrong IRQs. Secondly IRQs not bound won't be freed, thus leaking IRQ numbers. Note that there's no need to differentiate between bound and unbound IRQs when freeing them, __unbind_from_irq will deal with both of them correctly. Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups") Reported-by: Hooman Mirhadi <[email protected]> Signed-off-by: Roger Pau Monné <[email protected]> Reviewed-by: Amit Shah <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-02-28Merge branch 'for-jens' of git://git.infradead.org/nvme into for-linusJens Axboe7-23/+28
Pull NVMe fixes from Keith for 4.16-rc. * 'for-jens' of git://git.infradead.org/nvme: nvmet: fix PSDT field check in command format nvme-multipath: fix sysfs dangerously created links nvme-pci: Fix nvme queue cleanup if IRQ setup fails nvmet-loop: use blk_rq_payload_bytes for sgl selection nvme-rdma: use blk_rq_payload_bytes instead of blk_rq_bytes nvme-fabrics: don't check for non-NULL module in nvmf_register_transport
2018-02-28Merge tag 'dma-mapping-4.16-3' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds1-5/+5
Pull dma-mapping fix from Christoph Hellwig: "A single fix for a memory leak regression in the dma-debug code" * tag 'dma-mapping-4.16-3' of git://git.infradead.org/users/hch/dma-mapping: dma-debug: fix memory leak in debug_dma_alloc_coherent
2018-02-28drm/i915: Make global seqno known in i915_gem_request_execute tracepointTvrtko Ursulin1-2/+2
Commit fe49789fab97 ("drm/i915: Deconstruct execute fence") re-arranged the code and moved the i915_gem_request_execute tracepoint to before the global seqno is assigned to the request. We need to move the tracepoint a bit later so this information is once again available. Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: fe49789fab97 ("drm/i915: Deconstruct execute fence") Cc: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: [email protected] Reviewed-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 158863fb50968c0ae85e87a401221425c941b9f0) Signed-off-by: Rodrigo Vivi <[email protected]>
2018-02-28drm/i915: Clear the in-use marker on execbuf failureChris Wilson1-0/+2
If we fail to unbind the vma (due to a signal on an active buffer that needs to be moved for the next execbuf), then we need to clear the persistent tracking state we setup for this execbuf. Fixes: c7c6e46f913b ("drm/i915: Convert execbuf to use struct-of-array packing for critical fields") Testcase: igt/gem_fenced_exec_thrash/no-spare-fences-busy* Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: <[email protected]> # v4.14+ Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit ed2f3532321083cf40e4da4e36234880e0136136) Signed-off-by: Rodrigo Vivi <[email protected]>
2018-02-28drm/i915/cnl: Fix PORT_TX_DW5/7 register addressMahesh Kumar1-2/+2
Register Address for CNL_PORT_DW5_LN0_D is 0x162E54, but current code is defining it as 0x162ED4. Similarly for CNL_PORT_DW7_LN0_D register address is defined 0x162EDC instead of 0x162E5C, fix it. Signed-off-by: Mahesh Kumar <[email protected]> Fixes: 04416108ccea ("drm/i915/cnl: Add registers related to voltage swing sequences.") Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit e103962611b2d464be6ab596d7b3495fe7b4c132) Signed-off-by: Rodrigo Vivi <[email protected]>
2018-02-28drm/i915/audio: fix check for av_enc_map overflowJani Nikula1-3/+3
Turns out -1 >= ARRAY_SIZE() is always true. Move the bounds check where we know pipe >= 0 and next to the array indexing where it makes most sense. Fixes: 9965db26ac05 ("drm/i915: Check for fused or unused pipes") Fixes: 0b7029b7e43f ("drm/i915: Check for fused or unused pipes") Cc: <[email protected]> # v4.10+ Cc: Mika Kahola <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: [email protected] Reviewed-by: Dhinakaran Pandiyan <[email protected]> Reviewed-by: Mika Kahola <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit cdb3db8542d854bd678d60cd28861b042e191672) Signed-off-by: Rodrigo Vivi <[email protected]>
2018-02-28drm/i915: Fix rsvd2 mask when out-fence is returnedDaniele Ceraolo Spurio1-1/+1
GENMASK_ULL wants the high bit of the mask first. The current value cancels the in-fence when an out-fence is returned. Fixes: fec0445caa273 ("drm/i915: Support explicit fencing for execbuf") Testcase: igt/gem_exec_fence/keep-in-fence* Cc: Chris Wilson <[email protected]> Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Cc: <[email protected]> # v4.12+ (cherry picked from commit b6a88e4a804cf5a71159906e16df2c1fc7196f92) Signed-off-by: Rodrigo Vivi <[email protected]>
2018-02-28Documentation, x86, resctrl: Make text and sample command matchLi RongQing1-1/+1
The text says "Move the cpus 4-7 over to p1", but the sample command writes to p0/cpus. Signed-off-by: Li RongQing <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-02-28dt-bindings/irqchip/renesas-irqc: Document R-Car M3-N supportGeert Uytterhoeven1-0/+1
Document support for the Interrupt Controller for Externel Devices (INTC-EX) in the Renesas M3-N (r8a77965) SoC. No driver update is needed. Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Simon Horman <[email protected]> Cc: Mark Rutland <[email protected]> Cc: [email protected] Cc: Jason Cooper <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: [email protected] Cc: Rob Herring <[email protected]> Link: https://lkml.kernel.org/r/1519658712-22910-1-git-send-email-geert%[email protected]
2018-02-28ipvs: remove IPS_NAT_MASK check to fix passive FTPJulian Anastasov1-1/+1
The IPS_NAT_MASK check in 4.12 replaced previous check for nfct_nat() which was needed to fix a crash in 2.6.36-rc, see commit 7bcbf81a2296 ("ipvs: avoid oops for passive FTP"). But as IPVS does not set the IPS_SRC_NAT and IPS_DST_NAT bits, checking for IPS_NAT_MASK prevents PASV response to be properly mangled and blocks the transfer. Remove the check as it is not needed after 3.12 commit 41d73ec053d2 ("netfilter: nf_conntrack: make sequence number adjustments usuable without NAT") which changes nfct_nat() with nfct_seqadj() and especially after 3.13 commit b25adce16064 ("ipvs: correct usage/allocation of seqadj ext in ipvs"). Thanks to Li Shuang and Florian Westphal for reporting the problem! Reported-by: Li Shuang <[email protected]> Fixes: be7be6e161a2 ("netfilter: ipvs: fix incorrect conflict resolution") Signed-off-by: Julian Anastasov <[email protected]> Acked-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-02-28ARC: setup cpu possible mask according to possible-cpus dts propertyEugeniy Paltsev1-10/+40
As we have option in u-boot to set CPU mask for running linux, we want to pass information to kernel about CPU cores should be brought up. So we patch kernel dtb in u-boot to set possible-cpus property. This also allows us to have correctly setuped MCIP debug mask. Signed-off-by: Eugeniy Paltsev <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2018-02-28ARC: mcip: update MCIP debug mask when the new cpu came onlineEugeniy Paltsev2-5/+34
As of today we use hardcoded MCIP debug mask, so if we launch kernel via debugger and kick fever cores than HW has all cpus hang at the momemt of setup MCIP debug mask. So update MCIP debug mask when the new cpu came online, instead of use hardcoded MCIP debug mask. Signed-off-by: Eugeniy Paltsev <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2018-02-28ARC: mcip: halt GFRC counter when ARC cores haltEugeniy Paltsev2-0/+40
In SMP systems, GFRC is used for clocksource. However by default the counter keeps running even when core is halted (say when debugging via a JTAG debugger). This confuses Linux timekeeping and triggers flase RCU stall splat such as below: | [ARCLinux]# while true; do ./shm_open_23-1.run-test ; done | Running with 1000 processes for 1000 objects | hrtimer: interrupt took 485060 ns | | create_cnt: 1000 | Running with 1000 processes for 1000 objects | [ARCLinux]# INFO: rcu_preempt self-detected stall on CPU | 2-...: (1 GPs behind) idle=a01/1/0 softirq=135770/135773 fqs=0 | INFO: rcu_preempt detected stalls on CPUs/tasks: | 0-...: (1 GPs behind) idle=71e/0/0 softirq=135264/135264 fqs=0 | 2-...: (1 GPs behind) idle=a01/1/0 softirq=135770/135773 fqs=0 | 3-...: (1 GPs behind) idle=4e0/0/0 softirq=134304/134304 fqs=0 | (detected by 1, t=13648 jiffies, g=31493, c=31492, q=1) Starting from ARC HS v3.0 it's possible to tie GFRC to state of up-to 4 ARC cores with help of GFRC's CORE register where we set a mask for cores which state we need to rely on. We update cpu mask every time new cpu came online instead of using hardcoded one or using mask generated from "possible_cpus" as we want it set correctly even if we run kernel on HW which has fewer cores than expected (or we launch kernel via debugger and kick fever cores than HW has) Note that GFRC halts when all cores have halted and thus relies on programming of Inter-Core-dEbug register to halt all cores when one halts. Signed-off-by: Alexey Brodkin <[email protected]> Signed-off-by: Eugeniy Paltsev <[email protected]> Signed-off-by: Vineet Gupta <[email protected]> [vgupta: rewrote changelog]
2018-02-28ARCv2: boot log: fix HS48 release numberVineet Gupta1-1/+1
Signed-off-by: Vineet Gupta <[email protected]>