aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-09-26aquantia: Setup max_mtu in ndev to enable jumbo framesIgor Russkikh2-10/+3
Although hardware is capable for almost 16K MTU, without max_mtu field correctly set it only allows standard MTU to be used. This patch enables max MTU, calculating it from hardware maximum frame size of 16352 octets (including FCS). Fixes: 5513e16421cb ("net: ethernet: aquantia: Fixes for aq_ndev_change_mtu") Signed-off-by: Pavel Belous <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-09-26netfilter: ipset: pernet ops must be unregistered lastFlorian Westphal1-9/+13
Removing the ipset module leaves a small window where one cpu performs module removal while another runs a command like 'ipset flush'. ipset uses net_generic(), unregistering the pernet ops frees this storage area. Fix it by first removing the user-visible api handlers and the pernet ops last. Fixes: 1785e8f473082 ("netfiler: ipset: Add net namespace for ipset") Reported-by: Li Shuang <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Acked-by: Jozsef Kadlecsik <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-09-26netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addressesJozsef Kadlecsik10-22/+24
Wrong comparison prevented the hash types to add a range with more than 2^31 addresses but reported as a success. Fixes Netfilter's bugzilla id #1005, reported by Oleg Serditov and Oliver Ford. Signed-off-by: Jozsef Kadlecsik <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-09-26netfilter: xt_socket: Restore mark from full sockets onlySubash Abhinov Kasiviswanathan1-2/+2
An out of bounds error was detected on an ARM64 target with Android based kernel 4.9. This occurs while trying to restore mark on a skb from an inet request socket. BUG: KASAN: slab-out-of-bounds in socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248 Read of size 4 at addr ffffffc06a8d824c by task syz-fuzzer/1532 CPU: 7 PID: 1532 Comm: syz-fuzzer Tainted: G W O 4.9.41+ #1 Call trace: [<ffffff900808d2f8>] dump_backtrace+0x0/0x440 arch/arm64/kernel/traps.c:76 [<ffffff900808d760>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226 [<ffffff90085f7dc8>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffff90085f7dc8>] dump_stack+0xe4/0x134 lib/dump_stack.c:51 [<ffffff900830f358>] print_address_description+0x68/0x258 mm/kasan/report.c:248 [<ffffff900830f770>] kasan_report_error mm/kasan/report.c:347 [inline] [<ffffff900830f770>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371 [<ffffff900830fdec>] kasan_report+0x5c/0x70 mm/kasan/report.c:372 [<ffffff900830de98>] check_memory_region_inline mm/kasan/kasan.c:308 [inline] [<ffffff900830de98>] __asan_load4+0x88/0xa0 mm/kasan/kasan.c:740 [<ffffff90097498f8>] socket_match.isra.2+0xc8/0x1f0 net/netfilter/xt_socket.c:248 [<ffffff9009749a5c>] socket_mt4_v1_v2_v3+0x3c/0x48 net/netfilter/xt_socket.c:272 [<ffffff90097f7e4c>] ipt_do_table+0x54c/0xad8 net/ipv4/netfilter/ip_tables.c:311 [<ffffff90097fcf14>] iptable_mangle_hook+0x6c/0x220 net/ipv4/netfilter/iptable_mangle.c:90 ... Allocated by task 1532: save_stack_trace_tsk+0x0/0x2a0 arch/arm64/kernel/stacktrace.c:131 save_stack_trace+0x28/0x38 arch/arm64/kernel/stacktrace.c:215 save_stack mm/kasan/kasan.c:495 [inline] set_track mm/kasan/kasan.c:507 [inline] kasan_kmalloc+0xd8/0x188 mm/kasan/kasan.c:599 kasan_slab_alloc+0x14/0x20 mm/kasan/kasan.c:537 slab_post_alloc_hook mm/slab.h:417 [inline] slab_alloc_node mm/slub.c:2728 [inline] slab_alloc mm/slub.c:2736 [inline] kmem_cache_alloc+0x14c/0x2e8 mm/slub.c:2741 reqsk_alloc include/net/request_sock.h:87 [inline] inet_reqsk_alloc+0x4c/0x238 net/ipv4/tcp_input.c:6236 tcp_conn_request+0x2b0/0xea8 net/ipv4/tcp_input.c:6341 tcp_v4_conn_request+0xe0/0x100 net/ipv4/tcp_ipv4.c:1256 tcp_rcv_state_process+0x384/0x18a8 net/ipv4/tcp_input.c:5926 tcp_v4_do_rcv+0x2f0/0x3e0 net/ipv4/tcp_ipv4.c:1430 tcp_v4_rcv+0x1278/0x1350 net/ipv4/tcp_ipv4.c:1709 ip_local_deliver_finish+0x174/0x3e0 net/ipv4/ip_input.c:216 v1->v2: Change socket_mt6_v1_v2_v3() as well as mentioned by Eric v2->v3: Put the correct fixes tag Fixes: 01555e74bde5 ("netfilter: xt_socket: add XT_SOCKET_RESTORESKMARK flag") Signed-off-by: Subash Abhinov Kasiviswanathan <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-09-26xfs: revert "xfs: factor rmap btree size into the indlen calculations"Darrick J. Wong1-15/+2
In commit fd26a88093ba we added a worst case estimate for rmapbt blocks needed to satisfy the block mapping request. Since then, we added the ability to reserve enough space in each AG such that we should never run out of blocks to grow the rmapbt, which makes this calculation unnecessary. Revert the commit because it makes the extra delalloc indlen accounting unnecessary and incorrect. Reported-by: Eryu Guan <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-09-26xfs: Capture state of the right inode in xfs_iflush_doneCarlos Maiolino1-1/+1
My previous patch: d3a304b6292168b83b45d624784f973fdc1ca674 check for XFS_LI_FAILED flag xfs_iflush done, so the failed item can be properly resubmitted. In the loop scanning other inodes being completed, it should check the current item for the XFS_LI_FAILED, and not the initial one. The state of the initial inode is checked after the loop ends Kudos to Eric for catching this. Signed-off-by: Carlos Maiolino <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-09-26xfs: perag initialization should only touch m_ag_max_usable for AG 0Darrick J. Wong1-2/+10
We call __xfs_ag_resv_init to make a per-AG reservation for each AG. This makes the reservation per-AG, not per-filesystem. Therefore, it is incorrect to adjust m_ag_max_usable for each AG. Adjust it only when we're reserving AG 0's blocks so that we only do it once per fs. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2017-09-26xfs: update i_size after unwritten conversion in dio completionEryu Guan5-19/+28
Since commit d531d91d6990 ("xfs: always use unwritten extents for direct I/O writes"), we start allocating unwritten extents for all direct writes to allow appending aio in XFS. But for dio writes that could extend file size we update the in-core inode size first, then convert the unwritten extents to real allocations at dio completion time in xfs_dio_write_end_io(). Thus a racing direct read could see the new i_size and find the unwritten extents first and read zeros instead of actual data, if the direct writer also takes a shared iolock. Fix it by updating the in-core inode size after the unwritten extent conversion. To do this, introduce a new boolean argument to xfs_iomap_write_unwritten() to tell if we want to update in-core i_size or not. Suggested-by: Brian Foster <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Eryu Guan <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-09-26iomap_dio_rw: Allocate AIO completion queue before submitting dioChandan Rajendra1-7/+7
Executing xfs/104 test in a loop on Linux-v4.13 kernel on a ppc64 machine can cause the following NULL pointer dereference, .queue_work_on+0x4c/0x80 .iomap_dio_bio_end_io+0xbc/0x1f0 .bio_endio+0x118/0x1f0 .blk_update_request+0xd0/0x470 .blk_mq_end_request+0x24/0xc0 .lo_complete_rq+0x40/0xe0 .__blk_mq_complete_request_remote+0x28/0x40 .flush_smp_call_function_queue+0xc4/0x1e0 .smp_ipi_demux_relaxed+0x8c/0x100 .icp_hv_ipi_action+0x54/0xa0 .__handle_irq_event_percpu+0x84/0x2c0 .handle_irq_event_percpu+0x28/0x80 .handle_percpu_irq+0x78/0xc0 .generic_handle_irq+0x40/0x70 .__do_irq+0x88/0x200 .call_do_irq+0x14/0x24 .do_IRQ+0x84/0x130 This occurs due to the following sequence of events, 1. Allocate dio for Direct I/O write. 2. Invoke iomap_apply() until iov_iter_count() bytes have been submitted. - Assume that we have submitted atleast one bio. Hence iomap_dio->ref value will be >= 2. - If during the second iteration, iomap_apply() ends up returning -ENOSPC, we would break out of the loop and since the 'ret' value is a negative number we end up not allocating memory for super_block->s_dio_done_wq. 3. Meanwhile, iomap_dio_bio_end_io() is invoked for bios that have been submitted and here the code ends up dereferencing the NULL pointer stored at super_block->s_dio_done_wq. This commit fixes the bug by allocating memory for super_block->s_dio_done_wq before iomap_apply() is invoked. Reported-by: Eryu Guan <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Tested-by: Eryu Guan <[email protected]> Signed-off-by: Chandan Rajendra <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-09-26xfs: validate bdev support for DAX inode flagRoss Zwisler1-1/+2
Currently only the blocksize is checked, but we should really be calling bdev_dax_supported() which also tests to make sure we can get a struct dax_device and that the dax_direct_access() path is working. This is the same check that we do for the "-o dax" mount option in xfs_fs_fill_super(). This does not fix the race issues that caused the XFS DAX inode option to be disabled, so that option will still be disabled. If/when we re-enable it, though, I think we will want this issue to have been fixed. I also do think that we want to fix this in stable kernels. Signed-off-by: Ross Zwisler <[email protected]> CC: [email protected] Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-09-26l2tp: fix race condition in l2tp_tunnel_deleteSabrina Dubroca2-7/+8
If we try to delete the same tunnel twice, the first delete operation does a lookup (l2tp_tunnel_get), finds the tunnel, calls l2tp_tunnel_delete, which queues it for deletion by l2tp_tunnel_del_work. The second delete operation also finds the tunnel and calls l2tp_tunnel_delete. If the workqueue has already fired and started running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the same tunnel a second time, and try to free the socket again. Add a dead flag to prevent firing the workqueue twice. Then we can remove the check of queue_work's result that was meant to prevent that race but doesn't. Reproducer: ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000 ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000 ip link set l2tp1 up ip l2tp del tunnel tunnel_id 3000 ip l2tp del tunnel tunnel_id 3000 Fixes: f8ccac0e4493 ("l2tp: put tunnel socket release on a workqueue") Reported-by: Jianlin Shi <[email protected]> Signed-off-by: Sabrina Dubroca <[email protected]> Acked-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-09-26vti: fix use after free in vti_tunnel_xmit/vti6_tnl_xmitAlexey Kodanev2-2/+4
When running LTP IPsec tests, KASan might report: BUG: KASAN: use-after-free in vti_tunnel_xmit+0xeee/0xff0 [ip_vti] Read of size 4 at addr ffff880dc6ad1980 by task swapper/0/0 ... Call Trace: <IRQ> dump_stack+0x63/0x89 print_address_description+0x7c/0x290 kasan_report+0x28d/0x370 ? vti_tunnel_xmit+0xeee/0xff0 [ip_vti] __asan_report_load4_noabort+0x19/0x20 vti_tunnel_xmit+0xeee/0xff0 [ip_vti] ? vti_init_net+0x190/0x190 [ip_vti] ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 dev_hard_start_xmit+0x147/0x510 ? icmp_echo.part.24+0x1f0/0x210 __dev_queue_xmit+0x1394/0x1c60 ... Freed by task 0: save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x70/0xc0 kmem_cache_free+0x81/0x1e0 kfree_skbmem+0xb1/0xe0 kfree_skb+0x75/0x170 kfree_skb_list+0x3e/0x60 __dev_queue_xmit+0x1298/0x1c60 dev_queue_xmit+0x10/0x20 neigh_resolve_output+0x3a8/0x740 ip_finish_output2+0x5c0/0xe70 ip_finish_output+0x4ba/0x680 ip_output+0x1c1/0x3a0 xfrm_output_resume+0xc65/0x13d0 xfrm_output+0x1e4/0x380 xfrm4_output_finish+0x5c/0x70 Can be fixed if we get skb->len before dst_output(). Fixes: b9959fd3b0fa ("vti: switch to new ip tunnel code") Fixes: 22e1b23dafa8 ("vti6: Support inter address family tunneling.") Signed-off-by: Alexey Kodanev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-09-26drm/i915/bios: ignore HDMI on port AJani Nikula1-0/+7
The hardware state readout oopses after several warnings when trying to use HDMI on port A, if such a combination is configured in VBT. Filter the combo out already at the VBT parsing phase. v2: also ignore DVI (Ville) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102889 Cc: [email protected] Cc: Imre Deak <[email protected]> Reviewed-by: Ville Syrjälä <[email protected]> Tested-by: Daniel Drake <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit d27ffc1d00327c29b3aa97f941b42f0949f9e99f) Signed-off-by: Rodrigo Vivi <[email protected]>
2017-09-26drm/i915: remove redundant variable hw_checkColin Ian King1-2/+0
hw_check is being assigned and updated but is no longer being read, hence it is redundant and can be removed. Detected by clang scan-build: "warning: Value stored to 'hw_check' during its initialization is never read" Fixes: f6d1973db2d2 ("drm/i915: Move modeset state verifier calls") Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 4babc5e27cfda59e2e257d28628b8d853aea5206) Signed-off-by: Rodrigo Vivi <[email protected]>
2017-09-26drm/i915: always update ELD connector type after get modesJani Nikula2-5/+17
drm_edid_to_eld() initializes the connector ELD to zero, overwriting the ELD connector type initialized in intel_audio_codec_enable(). If userspace does getconnector and thus get_modes after modeset, a subsequent audio component i915_audio_component_get_eld() call will receive an ELD without the connector type properly set. It's fine for HDMI, but screws up audio for DP. Always set the ELD connector type at intel_connector_update_modes() based on the connector type. We can drop the connector type update from intel_audio_codec_enable(). Credits to Joseph Nuzman <[email protected]> for figuring this out. Cc: Ville Syrjälä <[email protected]> Cc: Joseph Nuzman <[email protected]> Reported-by: Joseph Nuzman <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101583 Reviewed-by: Ville Syrjälä <[email protected]> Tested-by: Joseph Nuzman <[email protected]> Cc: [email protected] # v4.10+, maybe earlier Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit d81fb7fd9436e81fda67e5bc8ed0713aa28d3db2) Signed-off-by: Rodrigo Vivi <[email protected]>
2017-09-26percpu: make this_cpu_generic_read() atomic w.r.t. interruptsMark Rutland1-2/+22
As raw_cpu_generic_read() is a plain read from a raw_cpu_ptr() address, it's possible (albeit unlikely) that the compiler will split the access across multiple instructions. In this_cpu_generic_read() we disable preemption but not interrupts before calling raw_cpu_generic_read(). Thus, an interrupt could be taken in the middle of the split load instructions. If a this_cpu_write() or RMW this_cpu_*() op is made to the same variable in the interrupt handling path, this_cpu_read() will return a torn value. For native word types, we can avoid tearing using READ_ONCE(), but this won't work in all cases (e.g. 64-bit types on most 32-bit platforms). This patch reworks this_cpu_generic_read() to use READ_ONCE() where possible, otherwise falling back to disabling interrupts. Signed-off-by: Mark Rutland <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Pranith Kumar <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Tejun Heo <[email protected]>
2017-09-26arm64: dts: rockchip: add the grf clk for dw-mipi-dsi on rk3399Nickey Yang1-2/+2
The clk of grf must be enabled before writing grf register for rk3399. Signed-off-by: Nickey Yang <[email protected]> [the grf clock is already part of the binding since march 2017] Signed-off-by: Heiko Stuebner <[email protected]>
2017-09-26btrfs: log csums for all modified extentsJosef Bacik1-2/+10
Amir reported a bug discovered by his cleaned up version of my dm-log-writes xfstests where we were missing csums at certain replay points. This is because fsx was doing an msync(), which essentially fsync()'s a specific range of a file. We will log all modified extents, but only search for the checksums in the range we are being asked to sync. We cannot simply log the extents in the range we're being asked because we are logging the inode item as it is currently, which if it has had a i_size update before the msync means we will miss extents when replaying. We could possibly get around this by marking the inode with the transaction that extended the i_size to see if we have this case, but this would be racy and we'd have to lock the whole range of the inode to make sure we didn't have an ordered extent outside of our range that was in the middle of completing. Fix this simply by keeping track of the modified extents range and logging the csums for the entire range of extents that we are logging. This makes the xfstest pass. Reported-by: Amir Goldstein <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: fix unexpected result when dio reading corrupted blocksLiu Bo1-5/+2
commit 4246a0b63bd8 ("block: add a bi_error field to struct bio") changed the logic of how dio read endio reports errors. For single stripe dio read, %bio->bi_status reflects the error before verifying checksum, and now we're updating it when data block matches with its checksum, while in the mismatching case, %bio->bi_status is not updated to relfect that. When some blocks in a file have been corrupted on disk, reading such a file ends up with 1) checksum errors are reported in kernel log 2) read(2) returns successfully with some content being 0x01. In order to fix it, we need to report its checksum mismatch error to the upper layer (dio layer in this case) as well. Fixes: 4246a0b63bd8 ("block: add a bi_error field to struct bio") Signed-off-by: Liu Bo <[email protected]> Reported-by: Goffredo Baroncelli <[email protected]> Tested-by: Goffredo Baroncelli <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26btrfs: Report error on removing qgroup if del_qgroup_item failsSargun Dhillon1-0/+2
Previously, we were calling del_qgroup_item, and ignoring the return code resulting in a potential to have divergent in-memory state without an error. Perhaps, it makes sense to handle this error code, and put the filesystem into a read only, or similar state. This patch only adds reporting of the error if the error is fatal, (any error other than qgroup not found). Signed-off-by: Sargun Dhillon <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: skip checksum when reading compressed data if some IO have failedLiu Bo1-1/+8
Currently even if the underlying disk reports failure on IO, compressed read endio still gets to verify checksum and reports it as a checksum error. In fact, if some IO have failed during reading a compressed data extent , there's no way the checksum could match, therefore, we can skip that in order to return error quickly to the upper layer. Please note that we need to do this after recording the failed mirror index so that read-repair in the upper layer's endio can work properly. Signed-off-by: Liu Bo <[email protected]> Tested-by: Paul Jones <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: fix kernel oops while reading compressed dataLiu Bo1-0/+9
The kernel oops happens at kernel BUG at fs/btrfs/extent_io.c:2104! ... RIP: clean_io_failure+0x263/0x2a0 [btrfs] It's showing that read-repair code is using an improper mirror index. This is due to the fact that compression read's endio hasn't recorded the failed mirror index in %cb->orig_bio. With this, btrfs's read-repair can work properly on reading compressed data. Signed-off-by: Liu Bo <[email protected]> Reported-by: Paul Jones <[email protected]> Tested-by: Paul Jones <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: use btrfs_op instead of bio_op in __btrfs_map_blockLiu Bo1-1/+1
This seems to be a leftover of commit cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block"). It should use btrfs_op() helper to provide one of 'enum btrfs_map_op' types. Fixes: cf8cddd38bab ("btrfs: don't abuse REQ_OP_* flags for btrfs_map_block") Signed-off-by: Liu Bo <[email protected]> Reviewed-by: Satoru Takeuchi <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: do not backup tree roots when fsyncLiu Bo1-1/+8
It doesn't make sense to backup tree roots when doing fsync, since during fsync those tree roots have not been consistent on disk. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26btrfs: remove BTRFS_FS_QUOTA_DISABLING flagMisono, Tomohiro2-5/+0
Currently, "btrfs quota enable" would fail after "btrfs quota disable" on the first time with syslog output "qgroup_rescan_init failed with -22", but it would succeed on the second time. When "quota disable" is called, BTRFS_FS_QUOTA_DISABLING flag bit will be set in fs_info->flags in btrfs_quota_disable(), but it will not be droppd in btrfs_run_qgroups() (which is called in btrfs_commit_transaction()) because quota_root has already been freed. If "quota enable" is called after that, both BTRFS_FS_QUOTA_DISABLING and BTRFS_FS_QUOTA_ENABLED flag would be dropped in the btrfs_run_qgroups() since quota_root is not NULL. This leads to the failure of "quota enable" on the first time. BTRFS_FS_QUOTA_DISABLING flag is not used outside of "quota disable" context and is equivalent to whether quota_root is NULL or not. btrfs_run_qgroups() checks whether quota_root is NULL or not in the first place. So, let's remove BTRFS_FS_QUOTA_DISABLING flag. Signed-off-by: Tomohiro Misono <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26btrfs: propagate error to btrfs_cmp_data_prepare callerNaohiro Aota1-1/+1
btrfs_cmp_data_prepare() (almost) always returns 0 i.e. ignoring errors from gather_extent_pages(). While the pages are freed by btrfs_cmp_data_free(), cmp->num_pages still has > 0. Then, btrfs_extent_same() try to access the already freed pages causing faults (or violates PageLocked assertion). This patch just return the error as is so that the caller stop the process. Signed-off-by: Naohiro Aota <[email protected]> Fixes: f441460202cb ("btrfs: fix deadlock with extent-same and readpage") Cc: <[email protected]> # 4.2 Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26btrfs: prevent to set invalid default subvolidsatoru takeuchi1-0/+4
`btrfs sub set-default` succeeds to set an ID which isn't corresponding to any fs/file tree. If such the bad ID is set to a filesystem, we can't mount this filesystem without specifying `subvol` or `subvolid` mount options. Fixes: 6ef5ed0d386b ("Btrfs: add ioctl and incompat flag to set the default mount subvol") Cc: <[email protected]> Signed-off-by: Satoru Takeuchi <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: send: fix error number for unknown inode typesTsutomu Itoh1-1/+1
ENOTSUPP should not be returned to the user program. (cf. include/linux/errno.h) Therefore, EOPNOTSUPP is used instead of ENOTSUPP. Signed-off-by: Tsutomu Itoh <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26btrfs: fix NULL pointer dereference from free_reloc_roots()Naohiro Aota1-1/+1
__del_reloc_root should be called before freeing up reloc_root->node. If not, calling __del_reloc_root() dereference reloc_root->node, causing the system BUG. Fixes: 6bdf131fac23 ("Btrfs: don't leak reloc root nodes on error") Cc: <[email protected]> # 4.9 Signed-off-by: Naohiro Aota <[email protected]> Reviewed-by: Nikolay Borisov <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26btrfs: finish ordered extent cleaning if no progress is foundNaohiro Aota1-0/+8
__endio_write_update_ordered() repeats the search until it reaches the end of the specified range. This works well with direct IO path, because before the function is called, it's ensured that there are ordered extents filling whole the range. It's not the case, however, when it's called from run_delalloc_range(): it is possible to have error in the midle of the loop in e.g. run_delalloc_nocow(), so that there exisits the range not covered by any ordered extents. By cleaning such "uncomplete" range, __endio_write_update_ordered() stucks at offset where there're no ordered extents. Since the ordered extents are created from head to tail, we can stop the search if there are no offset progress. Fixes: 524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang") Cc: <[email protected]> # 4.12 Signed-off-by: Naohiro Aota <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26btrfs: clear ordered flag on cleaning up ordered extentsNaohiro Aota1-0/+12
Commit 524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang") introduced btrfs_cleanup_ordered_extents() to cleanup submitted ordered extents. However, it does not clear the ordered bit (Private2) of corresponding pages. Thus, the following BUG occurs from free_pages_check_bad() (on btrfs/125 with nospace_cache). BUG: Bad page state in process btrfs pfn:3fa787 page:ffffdf2acfe9e1c0 count:0 mapcount:0 mapping: (null) index:0xd flags: 0x8000000000002008(uptodate|private_2) raw: 8000000000002008 0000000000000000 000000000000000d 00000000ffffffff raw: ffffdf2acf5c1b20 ffffb443802238b0 0000000000000000 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x2000(private_2) This patch clears the flag same as other places calling btrfs_dec_test_ordered_pending() for every page in the specified range. Fixes: 524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang") Cc: <[email protected]> # 4.12 Signed-off-by: Naohiro Aota <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: fix incorrect {node,sector}size endianness from BTRFS_IOC_FS_INFOOmar Sandoval1-3/+3
fs_info->super_copy->{node,sector}size are little-endian, but the ioctl should return the values in native endianness. Use the cached values in btrfs_fs_info instead. Found with sparse. Fixes: 80a773fbfc2d ("btrfs: retrieve more info from FS_INFO ioctl") Signed-off-by: Omar Sandoval <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: do not reset bio->bi_ops while writing bioLiu Bo1-3/+0
flush_epd_write_bio() sets bio->bi_opf by itself to honor REQ_SYNC, but it's not needed at all since bio->bi_opf has set up properly in both __extent_writepage() and write_one_eb(), and in the case of write_one_eb(), it also sets REQ_META, which we will lose in flush_epd_write_bio(). This remove this unnecessary bio->bi_opf setting. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26Btrfs: use the new helper wbc_to_write_flagsLiu Bo1-3/+2
This updates btrfs to use the helper wbc_to_write_flags which has been applied in ext4/xfs/f2fs/block. Please note that, with this, btrfs's dirty pages written by a writeback job will carry the flag REQ_BACKGROUND, which is currently used by writeback-throttle to determine whether it should go to get a request or wait. Signed-off-by: Liu Bo <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2017-09-26netfilter: ipvs: full-functionality option for ECN encapsulation in tunnelVadim Fedorenko1-2/+6
IPVS tunnel mode works as simple tunnel (see RFC 3168) copying ECN field to outer header. That's result in packet drops on egress tunnels in case the egress tunnel operates as ECN-capable with Full-functionality option (like ip_tunnel and ip6_tunnel kernel modules), according to RFC 3168 section 9.1.1 recommendation. This patch implements ECN full-functionality option into ipvs xmit code. Cc: [email protected] Cc: [email protected] Signed-off-by: Vadim Fedorenko <[email protected]> Reviewed-by: Konstantin Khlebnikov <[email protected]> Acked-by: Julian Anastasov <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-09-26phy: rockchip-typec: Don't set the aux voltage swing to 400 mVDouglas Anderson1-3/+4
On rk3399-gru-kevin there are some cases where we're seeing AUX CH failures when trying to do DisplayPort over type C. Problems are intermittent and don't reproduce all the time. Problems are often bursty and failures persist for several seconds before going away. The failure case I focused on is: * A particular type C to HDMI adapter. * One orientation (flip mode) of that adapter. * Easier to see failures when something is plugged into the _other type C port at the same time. * Problems reproduce on both type C ports (left and right side). Ironically problems also stop reproducing when I solder wires onto the AUX CH signals on a port (even if no scope is connected to the signals). In this case, problems only stop reproducing on the port with the wires connected. From the above it appears that something about the signaling on the aux channel is marginal and any slight differences can bring us over the edge to failure. It turns out that we can fix our problems by just increasing the voltage swing of the AUX CH, giving us a bunch of extra margin. In DP up to version 1.2 the voltage swing on the aux channel was specced as .29 V to 1.38 V. In DP version 1.3 the aux channel voltage was tightened to be between .29 V and .40 V, but it clarifies that it really only needs the lower voltage when operating at the highest speed (HBR3 mode). So right now we are trying to use a voltage that technically should be valid for all versions of the spec (including version 1.3 when transmitting at HBR3). That would be great to do if it worked reliably. ...but it doesn't seem to. It turns out that if you continue to read through the DP part of the rk3399 TRM and other parts of the type C PHY spec you'll find out that while the rk3399 does support DP 1.3, it doesn't support HBR3. The docs specifically say "RBR, HBR and HBR2 data rates only". Thus there is actually no requirement to support an AUX CH swing of .4 V. Even if there is no actual requirement to support the tighter voltage swing, one could possibly argue that we should support it anyway. The DP spec clarifies that the lower voltage on the AUX CH will reduce cross talk in some cases and that seems like it could be beneficial even at the lower bit rates. At the moment, though, we are seeing problems with the AUX CH and not on the other lines. Also, checking another known working and similar laptop shows that the other laptop runs the AUX channel at a higher voltage. Other notes: * Looking at measurements done on the AUX CH we weren't actually compliant with the DP 1.3 spec anyway. AUX CH peek-to-peek voltage was measured on rk3399-gru-kevin as .466 V which is > .4 V. * With this new patch the AUX channel isn't actually 1.0 V, but it has been confirmed that the signal is better and has more margin. Eye diagram passes. * If someone were truly an expert in the Type C PHY and in DisplayPort signaling they might be able to make things work and keep the voltage at < .4 V. The Type C PHY seems to have a plethora of tuning knobs that could almost certainly improve the signal integrity. Some of these things (like enabling tx_fcm_full_margin) even seem to fix my problems. However, lacking expertise I can't say whether this is a better or worse solution. Tightening signals to give cleaner waveforms can often have adverse affects, like increasing EMI or adding noise to other signals. I'd rather not tune things like this without a healthy application of expertise that I don't have. Signed-off-by: Douglas Anderson <[email protected]> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
2017-09-26phy: rockchip-typec: Set the AUX channel flip state earlierDouglas Anderson1-20/+42
On some DP monitors we found that setting the wrong flip state on the AUX channel could cause the monitor to stop asserting HotPlug Detect (HPD). Setting the right flip state caused these monitors to start asserting HotPlug Detect again. Here's what we believe was happening: * We'd plug in the monitor and we'd see HPD assert * We'd quickly see HPD deassert * The kernel would try to init the type C PHY but would init it in USB mode (because there was a peripheral there but no HPD) * Because the kernel never set the flip mode properly we'd never see the HPD come back. With this change, we'll still see HPD disappear (we don't think there's anything we can do about that), but then it will come back. Overall we can say that it's sane to set the AUX channel flip state even when HPD is not asserted. NOTE: to make this change possible, I needed to do a bit of cleanup to the tcphy_dp_aux_calibration() function so that it doesn't ever clobber the FLIP state. This made it very obvious that a line of code documented as "setting bit 12" also did a bunch of other magic, undocumented stuff. For now I'll just break out the bits and add a comment that this is black magic and we'll try to document tcphy_dp_aux_calibration() better in a future CL. ALSO NOTE: the old function used to write a bunch of hardcoded values in _some_ cases instead of doing a read-modify-write. One could possibly assert that these could have had (beneficial) side effects and thus with this new code (which always does read-modify-write) we could have a bug. We shouldn't need to worry, though, since in the old code tcphy_dp_aux_calibration() was always called following the de-assertion of "reset" the the type C PHY. ...so the type C PHY was always in default state. TX_ANA_CTRL_REG_1 is documented to be 0x0 after reset. This was also confirmed by printk. Suggested-by: Shawn Nematbakhsh <[email protected]> Reviewed-by: Chris Zhong <[email protected]> Signed-off-by: Douglas Anderson <[email protected]> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
2017-09-26phy: mvebu-cp110: checking for NULL instead of IS_ERR()Dan Carpenter1-2/+2
devm_ioremap_resource() never returns NULL, it only returns error pointers so this test needs to be changed. Fixes: d0438bd6aa09 ("phy: add the mvebu cp110 comphy driver") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
2017-09-26phy: mvebu-cp110-comphy: explicitly set the pipe selectorAntoine Tenart1-0/+10
The pipe selector is used to select some modes (such as USB or PCIe). Otherwise it must be set to 0 (or "unconnected"). This patch does this to ensure it is not set to an incompatible value when using the supported modes (SGMII, 10GKR). Signed-off-by: Antoine Tenart <[email protected]> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
2017-09-26phy: mvebu-cp110-comphy: fix mux error checkAntoine Tenart1-2/+2
The mux value is retrieved from the mvebu_comphy_get_mux() function which returns an int. In mvebu_comphy_power_on() this int is stored to a u32 and a check is made to ensure it's not negative. Which is wrong. This fixes it. Fixes: d0438bd6aa09 ("phy: add the mvebu cp110 comphy driver") Signed-off-by: Antoine Tenart <[email protected]> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
2017-09-26phy: phy-mtk-tphy: fix NULL point of chip bankChunfeng Yun1-1/+2
Chip bank of version-1 is initialized as NULL, but it's used by pcie_phy_instance_power_on/off(), so assign it a right address. Signed-off-by: Chunfeng Yun <[email protected]> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
2017-09-26phy: tegra: Handle return value of kasprintfArvind Yadav1-0/+2
kasprintf() can fail and it's return value must be checked. Signed-off-by: Arvind Yadav <[email protected]> Signed-off-by: Kishon Vijay Abraham I <[email protected]>
2017-09-26powerpc: Handle MCE on POWER9 with only DSISR bit 30 setMichael Neuling1-0/+13
On POWER9 DD2.1 and below, it's possible for a paste instruction to cause a Machine Check Exception (MCE) where only DSISR bit 30 (IBM 33) is set. This will result in the MCE handler seeing an unknown event, which triggers linux to crash. We change this by detecting unknown events caused by load/stores in the MCE handler and marking them as handled so that we no longer crash. An MCE that occurs like this is spurious, so we don't need to do anything in terms of servicing it. If there is something that needs to be serviced, the CPU will raise the MCE again with the correct DSISR so that it can be serviced properly. Signed-off-by: Michael Neuling <[email protected]> Reviewed-by: Nicholas Piggin <[email protected] Acked-by: Balbir Singh <[email protected]> [mpe: Expand comment with details from change log, use normal bit #s] Signed-off-by: Michael Ellerman <[email protected]>
2017-09-26drm/tegra: trace: Fix path to includeThierry Reding1-1/+1
The TRACE_INCLUDE_FILE macro needs to specify the path relative to the define_trace.h header rather than relative to the file defining it. Reported-by: Dmitry Osipenko <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-09-26Merge branch 'WIP.x86/fpu' into x86/fpu, because it's readyIngo Molnar15-315/+375
Signed-off-by: Ingo Molnar <[email protected]>
2017-09-26x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVESEric Biggers1-1/+1
This is the canonical method to use. Signed-off-by: Eric Biggers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kevin Hao <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Halcrow <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Yu-cheng Yu <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-09-26x86/fpu: Use validate_xstate_header() to validate the xstate_header in ↵Eric Biggers1-11/+5
copy_user_to_xstate() Tighten the checks in copy_user_to_xstate(). Signed-off-by: Eric Biggers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kevin Hao <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Halcrow <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Yu-cheng Yu <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-09-26x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate()Eric Biggers1-7/+4
We now have this field in hdr.xfeatures. Signed-off-by: Eric Biggers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kevin Hao <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Halcrow <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Yu-cheng Yu <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-09-26x86/fpu: Copy the full header in copy_user_to_xstate()Eric Biggers1-2/+5
This is in preparation to verify the full xstate header as supplied by user-space. Signed-off-by: Eric Biggers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kevin Hao <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Halcrow <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Yu-cheng Yu <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-09-26x86/fpu: Use validate_xstate_header() to validate the xstate_header in ↵Eric Biggers1-10/+2
copy_kernel_to_xstate() Tighten the checks in copy_kernel_to_xstate(). Signed-off-by: Eric Biggers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kevin Hao <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Halcrow <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Yu-cheng Yu <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>