Age | Commit message | Author | Files | Lines
2017-11-28 | Merge branch 'mlxsw-GRE-offloading-fixes' | David S. Miller | 1 | -40/+69
Jiri Pirko says:

====================
mlxsw: GRE offloading fixes

Petr says:

This patchset fixes a couple bugs in offloading GRE tunnels in mlxsw driver.

Patch #1 fixes a problem that local routes pointing at a GRE tunnel device are offloaded even if that netdevice is down.

Patch #2 detects that as a result of moving a GRE netdevice to a different VRF, two tunnels now have a conflict of local addresses, something that the mlxsw driver can't offload.

Patch #3 fixes a FIB abort caused by forming a route pointing at a GRE tunnel that is eligible for offloading but already onloaded.

Patch #4 fixes a problem that next hops migrated to a new RIF kept the old RIF reference, which went dangling shortly afterwards.
====================

Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | mlxsw: spectrum_router: Update nexthop RIF on update | Petr Machata | 1 | -7/+21
The function mlxsw_sp_nexthop_rif_update() walks the list of nexthops associated with a RIF, and updates the corresponding entries in the switch. It is used in particular when a tunnel underlay netdevice moves to a different VRF, and all the nexthops are migrated over to a new RIF. The problem is that each nexthop holds a reference to its RIF, and that is not updated. So after the old RIF is gone, further activity on these nexthops (such as downing the underlay netdevice) dereferences a dangling pointer. Fix the issue by updating rif of impacted nexthops before calling mlxsw_sp_nexthop_rif_update(). Fixes: 0c5f1cd5ba8c ("mlxsw: spectrum_router: Generalize __mlxsw_sp_ipip_entry_update_tunnel()") Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | mlxsw: spectrum_router: Handle encap to demoted tunnels | Petr Machata | 1 | -32/+29
Some tunnels that are offloadable on their own can nonetheless be demoted to slow path if their local address is in conflict with that of another tunnel. When a route is formed for such a tunnel, mlxsw_sp_nexthop_ipip_init() fails to find the corresponding IPIP entry, and that triggers a FIB abort. Resolve the problem by not assuming that a tunnel for which mlxsw_sp_ipip_ops.can_offload() holds also automatically has an IPIP entry. Fixes: af641713e97d ("mlxsw: spectrum_router: Onload conflicting tunnels") Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | mlxsw: spectrum_router: Demote tunnels on VRF migration | Petr Machata | 1 | -0/+18
The mlxsw driver currently doesn't offload GRE tunnels if they have the same local address and use the same underlay VRF. When such a situation arises, the tunnels in conflict are demoted to slow path. However, the current code only verifies this condition on tunnel creation and tunnel change, not when a tunnel is moved to a different VRF. When the tunnel has no bound device, underlay and overlay are the same. Thus moving a tunnel moves the underlay as well, and that can cause local address conflict. So modify mlxsw_sp_netdevice_ipip_ol_vrf_event() to check if there are any conflicting tunnels, and demote them if yes. Fixes: af641713e97d ("mlxsw: spectrum_router: Onload conflicting tunnels") Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | mlxsw: spectrum_router: Offload decap only for up tunnels | Petr Machata | 1 | -1/+1
When a new local route is added, an IPIP entry is looked up to determine whether the route should be offloaded as a tunnel decap or as a trap. That decision should take into account whether the tunnel netdevice in question is actually IFF_UP, and only install a decap offload if it is. Fixes: 0063587d3587 ("mlxsw: spectrum: Support decap-only IP-in-IP tunnels") Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue | David S. Miller | 5 | -8/+13
Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2017-11-27

This series contains updates to e1000, e1000e and i40e.

Gustavo A. R. Silva fixes a sizeof() issue where we were taking the size of the pointer (which is always the size of the pointer).

Sasha does a follow up fix to a previous fix for buffer overrun, to resolve community feedback from David Laight and the use of magic numbers.

Amritha fixes the reporting of error codes for when adding a cloud filter fails.

Ahmad Fatoum brushes the dust off the e1000 driver to fix a code comment and debug message which was incorrect about what the code was really doing.
====================

Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | btrfs: tree-checker: Fix false panic for sanity test | Qu Wenruo | 3 | -8/+43
[BUG]
If we run btrfs with CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y, it will instantly cause a kernel panic like:

------
...
assertion failed: 0, file: fs/btrfs/disk-io.c, line: 3853
...
Call Trace:
 btrfs_mark_buffer_dirty+0x187/0x1f0 [btrfs]
 setup_items_for_insert+0x385/0x650 [btrfs]
 __btrfs_drop_extents+0x129a/0x1870 [btrfs]
...
-----

[Cause]
Btrfs will call btrfs_check_leaf() in btrfs_mark_buffer_dirty() to check whether the leaf is valid when CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y. However, quite a few btrfs_mark_buffer_dirty() callers(*) only initialize the item pointers and leave the item data uninitialized.

This makes tree-checker catch the uninitialized data as an error, causing the panic.

*: These callers include, but are not limited to:
 setup_items_for_insert()
 btrfs_split_item()
 btrfs_expand_item()

[Fix]
Add a new parameter @check_item_data to btrfs_check_leaf(). With @check_item_data set to false, the item data check is skipped and btrfs_check_leaf() falls back to its old behavior.

So we can still get an early warning if we screw up item pointers, and avoid the false panic.

Cc: Filipe Manana <[email protected]>
Reported-by: Lakshmipathi.G <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: Liu Bo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
2017-11-27 | proc: don't report kernel addresses in /proc/<pid>/stack | Linus Torvalds | 1 | -2/+1
This just changes the file to report them as zero, although maybe even that could be removed. I checked, and at least procps doesn't actually seem to parse the 'stack' file at all. And since the file doesn't necessarily even exist (it requires CONFIG_STACKTRACE), possibly other tools don't really use it either. That said, in case somebody parses it with tools, just having that zero there should keep such tools happy. Signed-off-by: Linus Torvalds <[email protected]>
2017-11-27 | e1000: Fix off-by-one in debug message | Ahmad Fatoum | 1 | -2/+4
Signed-off-by: Ahmad Fatoum <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-11-27 | i40e: Fix reporting incorrect error codes | Amritha Nambiar | 1 | -1/+0
Adding cloud filters could fail for a number of reasons, unsupported filter fields for example, which fails during validation of fields itself. This will not result in admin command errors and converting the admin queue status to posix error code using i40e_aq_rc_to_posix would result in incorrect error values. If the failure was due to AQ error itself, reporting that correctly is handled in the inner function. Signed-off-by: Amritha Nambiar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-11-27 | e1000e: fix the use of magic numbers for buffer overrun issue | Sasha Neftin | 2 | -4/+8
This is a follow-on to commit b10effb92e27 ("fix buffer overrun while the I219 is processing DMA transactions") to address David Laight's concerns about the use of "magic" numbers. So define masks as well as add additional code comments to give a better understanding of what needs to be done to avoid a buffer overrun. Signed-off-by: Sasha Neftin <[email protected]> Reviewed-by: Alexander H Duyck <[email protected]> Reviewed-by: Dima Ruinskiy <[email protected]> Reviewed-by: Raanan Avargil <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-11-27 | i40e/virtchnl: fix application of sizeof to pointer | Gustavo A R Silva | 1 | -1/+1
sizeof when applied to a pointer typed expression gives the size of the pointer. The proper fix in this particular case is to code sizeof(*vfres) instead of sizeof(vfres). This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A R Silva <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
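The difference between sizeof(vfres) and sizeof(*vfres) can be seen in a small user-space sketch. The structure and function names below are hypothetical stand-ins chosen for illustration; they are not the driver's actual virtchnl definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver resource structure. */
struct vf_resources {
	unsigned int num_vsis;
	unsigned int num_queue_pairs;
	unsigned char caps[64];
};

/*
 * Buggy pattern: sizeof(vfres) is the size of the pointer itself,
 * so a length computed this way covers only 4 or 8 bytes.
 */
size_t length_buggy(struct vf_resources *vfres)
{
	return sizeof(vfres);
}

/*
 * Fixed pattern: sizeof(*vfres) is the size of the pointed-to object,
 * so the whole structure is cleared.
 */
size_t clear_fixed(struct vf_resources *vfres)
{
	assert(vfres != NULL);
	memset(vfres, 0, sizeof(*vfres));
	return sizeof(*vfres);
}
```

Writing the size as sizeof(*ptr) also keeps the expression correct if the pointer's type is later changed.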
2017-11-27 | lockd: fix "list_add double add" caused by legacy signal interface | Vasily Averin | 2 | -4/+9
restart_grace() uses hardcoded init_net. This can cause a "list_add double add" in the following scenario:

1) nfsd and lockd were started in several net namespaces
2) nfsd in init_net was stopped (lockd was not stopped because it has users from other net namespaces)
3) lockd got a signal, called restart_grace() -> set_grace_period() and enabled lock_manager in the hardcoded init_net.
4) nfsd in init_net is started again, its lockd_up() calls set_grace_period() and tries to add lock_manager into init_net a 2nd time.

Jeff Layton suggests:
"Make it safe to call locks_start_grace multiple times on the same lock_manager. If it's already on the global grace_list, then don't try to add it again. (But we don't intentionally add twice, so for now we WARN about that case.) With this change, we also need to ensure that the nfsd4 lock manager initializes the list before we call locks_start_grace. While we're at it, move the rest of the nfsd_net initialization into nfs4_state_create_net. I see no reason to have it spread over two functions like it is today."

The suggested patch was updated to generate a warning in the described situation.

Suggested-by: Jeff Layton <[email protected]>
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
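The idea of making the grace-period start safe to call more than once can be sketched in user space as follows. This is only an illustration under stated assumptions: the list type imitates the kernel's list_head, and start_grace(), struct lock_manager_sketch and the helper names are hypothetical, not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* A node that points at itself is not queued anywhere. */
static bool list_is_empty(const struct list_node *n)
{
	return n->next == n;
}

static void list_add_front(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Hypothetical lock manager with an embedded list node. */
struct lock_manager_sketch {
	struct list_node list;
};

/*
 * Adds lm to grace_list unless it is already queued.
 * Returns true if it was added, false if it was already on a list
 * (the point where the commit message says a warning is emitted).
 * The node must have been initialized with list_init() beforehand.
 */
bool start_grace(struct list_node *grace_list, struct lock_manager_sketch *lm)
{
	assert(grace_list != NULL && lm != NULL);
	if (!list_is_empty(&lm->list))
		return false;
	list_add_front(&lm->list, grace_list);
	return true;
}
```

The check only works if the node is initialized before the first call, which matches the requirement in the quoted suggestion that the nfsd4 lock manager initialize its list first.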
2017-11-27 | nlm_shutdown_hosts_net() cleanup | Vasily Averin | 1 | -2/+1
nlm_complain_hosts() walks through nlm_server_hosts hlist, which should be protected by nlm_host_mutex. Signed-off-by: Vasily Averin <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | race of nfsd inetaddr notifiers vs nn->nfsd_serv change | Vasily Averin | 3 | -3/+17
nfsd_inet[6]addr_event uses nn->nfsd_serv without taking nfsd_mutex, so nn->nfsd_serv can be changed while the notifiers are executing and crash the host. Moreover, if notifiers were enabled in one net namespace they are enabled in all other net namespaces, from creation until destruction. This patch allows notifiers to access nn->nfsd_serv only after the pointer is correctly initialized and delays cleanup until notifiers are no longer in use. Signed-off-by: Vasily Averin <[email protected]> Tested-by: Scott Mayhew <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | race of lockd inetaddr notifiers vs nlmsvc_rqst change | Vasily Averin | 1 | -2/+14
lockd_inet[6]addr_event use nlmsvc_rqst without taking nlmsvc_mutex, so nlmsvc_rqst can be changed during execution of notifiers and crash the host. The patch enables access to nlmsvc_rqst only when it was correctly initialized and delays its cleanup until notifiers are no longer in use. Note that nlmsvc_rqst can be temporarily set to ERR_PTR, so the "if (nlmsvc_rqst)" check in notifiers is insufficient on its own. Signed-off-by: Vasily Averin <[email protected]> Tested-by: Scott Mayhew <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
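Why a plain NULL test is not enough for a pointer that may hold an ERR_PTR value can be shown with a user-space imitation of the kernel's error-pointer encoding. The helpers below are re-implementations written for this sketch, and naive_check()/safe_check() are hypothetical names; the commit message does not state which mechanism the patch itself uses to gate access.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Error codes are encoded in the last MAX_ERRNO addresses, as in the kernel. */
#define MAX_ERRNO 4095

/* User-space imitations of ERR_PTR(), IS_ERR() and IS_ERR_OR_NULL(). */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static bool is_err_or_null(const void *p)
{
	return p == NULL || is_err(p);
}

/* Insufficient: an encoded error is non-NULL and passes this test. */
bool naive_check(const void *rqst)
{
	return rqst != NULL;
}

/* Rejects both NULL and encoded errors before the pointer is used. */
bool safe_check(const void *rqst)
{
	return !is_err_or_null(rqst);
}
```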
2017-11-27 | SUNRPC: make cache_detail structures const | Bhumika Goyal | 2 | -4/+4
Make these const as they are only getting passed to the function cache_create_net having the argument as const. Signed-off-by: Bhumika Goyal <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | NFSD: make cache_detail structures const | Bhumika Goyal | 2 | -4/+4
Make these const as they are only getting passed to the function cache_create_net having the argument as const. Signed-off-by: Bhumika Goyal <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | sunrpc: make the function arg as const | Bhumika Goyal | 2 | -2/+2
Make the struct cache_detail *tmpl argument of the function cache_create_net const, as it is only passed to kmemdup, whose corresponding argument is const void *. Add const to the prototype too. Signed-off-by: Bhumika Goyal <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: check for use of the closed special stateid | Andrew Elble | 1 | -2/+5
Prevent the use of the closed (invalid) special stateid by clients. Signed-off-by: Andrew Elble <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: fix panic in posix_unblock_lock called from nfs4_laundromat | Naofumi Honda | 1 | -2/+2
Since kernel 4.9, my two nfsv4 servers sometimes suffer from "panic: unable to handle kernel page request" in posix_unblock_lock() called from nfs4_laundromat(). These panics disappear if we revert the commit "nfsd: add a LRU list for blocked locks". The cause appears to be a typo in nfs4_laundromat(), which is also present in nfs4_state_shutdown_net(). Cc: [email protected] Fixes: 7919d0a27f1e "nfsd: add a LRU list for blocked locks" Cc: [email protected] Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | lockd: lost rollback of set_grace_period() in lockd_down_net() | Vasily Averin | 1 | -0/+2
Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect: it removes lockd_manager and disarms grace_period_end for init_net only. If nfsd was started from another net namespace, lockd_up_net() calls set_grace_period(), which adds lockd_manager into the per-netns list and queues the grace_period_end delayed work. These actions should be reverted in lockd_down_net(). Otherwise it can lead to a double list_add after restarting nfsd in the netns, and to a use-after-free if the non-disarmed delayed work is executed after the netns is destroyed. Fixes: efda760fe95e ("lockd: fix lockd shutdown race") Cc: [email protected] Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | lockd: added cleanup checks in exit_net hook | Vasily Averin | 1 | -0/+11
Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | grace: replace BUG_ON by WARN_ONCE in exit_net hook | Vasily Averin | 1 | -1/+3
Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: fix locking validator warning on nfs4_ol_stateid->st_mutex class | Andrew Elble | 1 | -3/+8
The use of the st_mutex has been confusing the validator. Use the proper nested notation so as to not produce warnings. Signed-off-by: Andrew Elble <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | lockd: remove net pointer from messages | Vasily Averin | 4 | -14/+21
Publishing of net pointer is not safe, use net->ns.inum as net ID in debug messages

[ 171.757678] lockd_up_net: per-net data created; net=f00001e7
[ 171.767188] NFSD: starting 90-second grace period (net f00001e7)
[ 300.653313] lockd: nuking all hosts in net f00001e7...
[ 300.653641] lockd: host garbage collection for net f00001e7
[ 300.653968] lockd: nlmsvc_mark_resources for net f00001e7
[ 300.711483] lockd_down_net: per-net data destroyed; net=f00001e7
[ 300.711847] lockd: nuking all hosts in net 0...
[ 300.711847] lockd: host garbage collection for net 0
[ 300.711848] lockd: nlmsvc_mark_resources for net 0

Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: remove net pointer from debug messages | Vasily Averin | 1 | -3/+3
Publishing of net pointer is not safe, replace it in debug messages by net->ns.inum

[ 119.989161] nfsd: initializing export module (net: f00001e7).
[ 171.767188] NFSD: starting 90-second grace period (net f00001e7)
[ 322.185240] nfsd: shutting down export module (net: f00001e7).
[ 322.186062] nfsd: export shutdown complete (net: f00001e7).

Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: Fix races with check_stateid_generation() | Trond Myklebust | 1 | -3/+19
The various functions that call check_stateid_generation() in order to compare a client-supplied stateid with the nfs4_stid state, usually need to atomically check for closed state. Those that perform the check after locking the st_mutex using nfsd4_lock_ol_stateid() should now be OK, but we do want to fix up the others. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: Ensure we check stateid validity in the seqid operation checks | Trond Myklebust | 1 | -9/+3
After taking the stateid st_mutex, we want to know that the stateid still represents valid state before performing any non-idempotent actions. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: Fix race in lock stateid creation | Trond Myklebust | 1 | -28/+42
If we're looking up a new lock state, and the creation fails, then we want to unhash it, just like we do for OPEN. However, in order to do so, we need to ensure that no other LOCK requests can grab the mutex until we have unhashed it (and marked it as closed). Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd4: move find_lock_stateid | Trond Myklebust | 1 | -19/+19
Trivial cleanup to simplify following patch. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: Ensure we don't recognise lock stateids after freeing them | Trond Myklebust | 1 | -11/+8
In order to deal with lookup races, nfsd4_free_lock_stateid() needs to be able to signal to other stateful functions that the lock stateid is no longer valid. Right now, nfsd_lock() will check whether or not an existing stateid is still hashed, but only in the "new lock" path. To ensure the stateid invalidation is also recognised by the "existing lock" path, and also by a second call to nfsd4_free_lock_stateid() itself, we can change the type to NFS4_CLOSED_STID under the stp->st_mutex. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0) | Trond Myklebust | 1 | -0/+8
Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: Fix another OPEN stateid race | Trond Myklebust | 1 | -15/+13
If nfsd4_process_open2() is initialising a new stateid, and yet the call to nfs4_get_vfs_file() fails for some reason, then we must declare the stateid closed, and unhash it before dropping the mutex. Right now, we unhash the stateid after dropping the mutex, and without changing the stateid type, meaning that another OPEN could theoretically look it up and attempt to use it. Reported-by: Andrew W Elble <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected] Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | nfsd: Fix stateid races between OPEN and CLOSE | Trond Myklebust | 1 | -8/+59
Open file stateids can linger on the nfs4_file list of stateids even after they have been closed. In order to avoid reusing such a stateid, and confusing the client, we need to recheck the nfs4_stid's type after taking the mutex. Otherwise, we risk reusing an old stateid that was already closed, which will confuse clients that expect new stateids to conform to RFC7530 Sections 9.1.4.2 and 16.2.5 or RFC5661 Sections 8.2.2 and 18.2.4. Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected] Signed-off-by: J. Bruce Fields <[email protected]>
2017-11-27 | Rename superblock flags (MS_xyz -> SB_xyz) | Linus Torvalds | 111 | -417/+417
This is a pure automated search-and-replace of the internal kernel superblock flags.

The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to.

Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags.

The script to do this was:

    # places to look in; re security/*: it generally should *not* be
    # touched (that stuff parses mount(2) arguments directly), but
    # there are two places where we really deal with superblock flags.
    FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
            include/linux/fs.h include/uapi/linux/bfs_fs.h \
            security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
    # the list of MS_... constants
    SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
          DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
          POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
          I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
          ACTIVE NOUSER"

    SED_PROG=
    for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done

    # we want files that contain at least one of MS_...,
    # with fs/namespace.c and fs/pnode.c excluded.
    L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')

    for f in $L; do sed -i $f $SED_PROG; done

Requested-by: Al Viro <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
2017-11-27 | auxdisplay: img-ascii-lcd: Only build on archs that have IOMEM | Thomas Meyer | 1 | -0/+1
This avoids the MODPOST error: ERROR: "devm_ioremap_resource" [drivers/auxdisplay/img-ascii-lcd.ko] undefined! Signed-off-by: Thomas Meyer <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-27 | mm, thp: Do not make pmd/pud dirty without a reason | Kirill A. Shutemov | 5 | -16/+24
Currently we make page table entries dirty all the time regardless of access type and don't even consider if the mapping is write-protected. The reasoning is that we don't really need dirty tracking on THP and making the entry dirty upfront may save some time on first write to the page. Unfortunately, such an approach may result in a false-positive can_follow_write_pmd() for the huge zero page or a read-only shmem file. Let's make the page dirty only if we are about to write to the page anyway (as we do for small pages). I've restructured the code to make the entry dirty inside maybe_p[mu]d_mkwrite(). It also takes into account whether the vma is write-protected. Signed-off-by: Kirill A. Shutemov <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-27 | mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d() | Kirill A. Shutemov | 1 | -23/+13
Currently, we unconditionally make page table dirty in touch_pmd(). It may result in false-positive can_follow_write_pmd(). We may avoid the situation, if we would only make the page table entry dirty if caller asks for write access -- FOLL_WRITE. The patch also changes touch_pud() in the same way. Signed-off-by: Kirill A. Shutemov <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
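The conditional marking described above can be sketched with plain bit flags. Everything here is a simplified stand-in: the flag values, the entry representation and touch_entry() are assumptions made for illustration and do not reproduce the kernel's pmd/pud helpers. Setting an "accessed" bit unconditionally is likewise an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical request flag standing in for FOLL_WRITE. */
#define REQ_WRITE	0x01u

/* Hypothetical page table entry bits. */
#define ENTRY_ACCESSED	0x1u
#define ENTRY_DIRTY	0x2u

/*
 * Returns the updated entry. The entry is always marked accessed,
 * but it is marked dirty only when the caller asked for write access.
 */
unsigned int touch_entry(unsigned int entry, unsigned int req_flags)
{
	entry |= ENTRY_ACCESSED;
	if (req_flags & REQ_WRITE)
		entry |= ENTRY_DIRTY;
	return entry;
}
```

With this shape, a read-only lookup can no longer leave behind a dirty bit that a later write-permission check might misread.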
2017-11-27 | tipc: eliminate access after delete in group_filter_msg() | Jon Maloy | 1 | -1/+1
KASAN revealed another access after delete in group.c. This time it found that we read the header of a received message after the buffer has been released. Signed-off-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-27 | xen-netfront: remove warning when unloading module | Eduardo Otubo | 1 | -0/+18
v2:
 * Replace busy wait with wait_event()/wake_up_all()
 * Cannot guarantee that at the time xennet_remove is called, the xen_netback state will not be XenbusStateClosed, so added a condition for that
 * There's a small chance that the xen_netback state is XenbusStateUnknown by the time the xen_netfront switches to Closed, so added a condition for that.

When unloading module xen_netfront from guest, dmesg would output warning messages like below:

 [ 105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use!
 [ 105.236839] deferring g.e. 0x903 (pfn 0x35805)

This problem stems from netfront and netback being out of sync. By the time netfront revokes the g.e.'s, netback hasn't had enough time to free all of them, hence the warnings in dmesg.

The trick here is to make netfront wait until netback frees all the g.e.'s and only then continue the cleanup for the module removal, and this is done by manipulating both device states.

Signed-off-by: Eduardo Otubo <[email protected]>
Acked-by: Juergen Gross <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2017-11-27 | Btrfs: fix list_add corruption and soft lockups in fsync | Liu Bo | 2 | -3/+4
Xfstests btrfs/146 revealed this corruption,

[ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read
[ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88).
[ 58.153518] ------------[ cut here ]------------
[ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0
...
[ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0
...
[ 58.161956] Call Trace:
[ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs]
[ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
[ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs]
[ 58.164393] vfs_fsync_range+0x5f/0xd0
[ 58.164898] do_fsync+0x5a/0x90
[ 58.165170] SyS_fsync+0x10/0x20
[ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe
...

It turns out that we could record btrfs_log_ctx:io_err in log_one_extents when IO fails, but make log_one_extents() return '0' instead of -EIO, so the IO error is not acknowledged by the callers, i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated from stack memory, it'd get freed with an object alive on the list. Then a future list_add will throw the above warning.

This returns the correct error in the above case.

Jeff also reported this while testing against his fsync error patch set[1].

[1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html "btrfs list corruption and soft lockups while testing writeback error handling"

Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error")
Signed-off-by: Liu Bo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
2017-11-28 | Merge tag 'mac80211-for-davem-2017-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 | David S. Miller | 10 | -17/+61
Johannes Berg says:

====================
Four fixes:
 * CRYPTO_SHA256 is needed for regdb validation
 * mac80211: mesh path metric was wrong in some frames
 * mac80211: use QoS null-data packets on QoS connections
 * mac80211: tear down RX aggregation sessions first to drop fewer packets in HW restart scenarios
====================

Signed-off-by: David S. Miller <[email protected]>
2017-11-27 | btrfs: Fix wild memory access in compression level parser | Qu Wenruo | 3 | -3/+14
[BUG]
Kernel panic when mounting with "-o compress" mount option. KASAN will report like:

------
==================================================================
BUG: KASAN: wild-memory-access in strncmp+0x31/0xc0
Read of size 1 at addr d86735fce994f800 by task mount/662
...
Call Trace:
 dump_stack+0xe3/0x175
 kasan_report+0x163/0x370
 __asan_load1+0x47/0x50
 strncmp+0x31/0xc0
 btrfs_compress_str2level+0x20/0x70 [btrfs]
 btrfs_parse_options+0xff4/0x1870 [btrfs]
 open_ctree+0x2679/0x49f0 [btrfs]
 btrfs_mount+0x1b7f/0x1d30 [btrfs]
 mount_fs+0x49/0x190
 vfs_kern_mount.part.29+0xba/0x280
 vfs_kern_mount+0x13/0x20
 btrfs_mount+0x31e/0x1d30 [btrfs]
 mount_fs+0x49/0x190
 vfs_kern_mount.part.29+0xba/0x280
 do_mount+0xaad/0x1a00
 SyS_mount+0x98/0xe0
 entry_SYSCALL_64_fastpath+0x1f/0xbe
------

[Cause]
For 'compress' and 'compress_force' options, its token doesn't expect any parameter so its args[0] contains uninitialized data. Accessing args[0] will cause above wild memory access.

[Fix]
For Opt_compress and Opt_compress_force, set compression level to the default.

Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ set the default in advance ]
Signed-off-by: David Sterba <[email protected]>
2017-11-28 | Merge branch 'sctp-stream-reconfig-fixes' | David S. Miller | 1 | -12/+65
Xin Long says: ==================== sctp: a bunch of fixes for stream reconfig This patchset is to make stream reset and asoc reset work more correctly for stream reconfig. Thanks to Marcelo for making them very clear. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | sctp: set sender next_tsn for the old result with ctsn_ack_point plus 1 | Xin Long | 1 | -1/+1
When doing asoc reset, if the sender of the response has already sent some chunk and increased asoc->next_tsn before the duplicate request comes, the response will use the old result with an incorrect sender next_tsn. Unlike asoc->next_tsn, asoc->ctsn_ack_point can't be changed after the sender of the response has performed the asoc reset and before the peer has confirmed it, and its value is still the original value of asoc->next_tsn minus 1. This patch sets sender next_tsn for the old result with ctsn_ack_point plus 1 when processing the duplicate request, to make sure the sender next_tsn value the peer gets will always be right. Fixes: 692787cef651 ("sctp: implement receiver-side procedures for the SSN/TSN Reset Request Parameter") Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
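The arithmetic behind this fix can be illustrated with a small sketch. The struct and function names are hypothetical; only the two field names mirror those mentioned in the commit message, and the numbers in the usage note are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction of an association to the two counters involved. */
struct assoc_sketch {
	uint32_t next_tsn;       /* advances whenever a new chunk is sent */
	uint32_t ctsn_ack_point; /* stable until the peer confirms the reset */
};

/*
 * Value to report as "sender next TSN" when answering a duplicate
 * reset request with the old result: derived from ctsn_ack_point,
 * which has not moved, rather than from next_tsn, which may have.
 * uint32_t arithmetic wraps around as TSNs do.
 */
uint32_t old_result_sender_next_tsn(const struct assoc_sketch *asoc)
{
	assert(asoc != NULL);
	return asoc->ctsn_ack_point + 1;
}
```

For example, if the reset left next_tsn at 1000 (so ctsn_ack_point is 999) and three chunks were sent before the duplicate request arrived, next_tsn would read 1003 while the function above still yields 1000.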
2017-11-28 | sctp: avoid flushing unsent queue when doing asoc reset | Xin Long | 1 | -7/+14
Now when doing asoc reset, it cleans up sacked and abandoned queues by calling sctp_outq_free where it also cleans up unsent, retransmit and transmitted queues. It's safe for the sender of response, as these 3 queues are empty at that time. But when the receiver of response is doing the reset, the users may already have enqueued some chunks into unsent while waiting for the response, and these chunks should not be flushed. To avoid removing those chunks, it moves the unsent queue into a temp list, then gets it back after sctp_outq_free is done. The patch also fixes some incorrect comments in sctp_process_strreset_tsnreq. Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
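The "park the queue in a temporary, flush, then restore" pattern can be sketched as follows. This is a simplified user-space model: the queue types, outq_free_all() and reset_keep_unsent() are hypothetical names, and the flush here only unlinks caller-owned chunks rather than freeing memory as the real sctp_outq_free() would.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk and queue types; chunks are owned by the caller. */
struct chunk {
	int id;
	struct chunk *next;
};

struct queue {
	struct chunk *head;
};

struct outq_sketch {
	struct queue unsent;
	struct queue transmitted;
	struct queue retransmit;
	struct queue sacked;
	struct queue abandoned;
};

/* Stand-in for a flush that empties every queue without distinction. */
static void outq_free_all(struct outq_sketch *q)
{
	q->unsent.head = NULL;
	q->transmitted.head = NULL;
	q->retransmit.head = NULL;
	q->sacked.head = NULL;
	q->abandoned.head = NULL;
}

/* Flush everything except the chunks waiting in the unsent queue. */
void reset_keep_unsent(struct outq_sketch *q)
{
	struct queue saved;

	assert(q != NULL);
	saved = q->unsent;     /* move unsent into a temporary list */
	q->unsent.head = NULL; /* so the flush finds nothing there */
	outq_free_all(q);
	q->unsent = saved;     /* put the pending chunks back */
}
```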
2017-11-28 | sctp: only allow the asoc reset when the asoc outq is empty | Xin Long | 1 | -0/+9
As it says in rfc6525#section5.1.4, before sending the request, C2: The sender has either no outstanding TSNs or considers all outstanding TSNs abandoned. Prior to this patch, it tried to consider all outstanding TSNs abandoned by dropping all chunks in all outqs with sctp_outq_free (even including sacked, retransmit and transmitted queues) when doing this reset, which is too aggressive. To make it work gently, this patch will only allow the asoc reset when the sender has no outstanding TSNs by checking if unsent, transmitted and retransmit are all empty with sctp_outq_is_empty before sending and processing the request. Fixes: 692787cef651 ("sctp: implement receiver-side procedures for the SSN/TSN Reset Request Parameter") Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | sctp: only allow the out stream reset when the stream outq is empty | Xin Long | 1 | -0/+35
Now the out stream reset in sctp stream reconf could be done even if the stream outq is not empty. It means that users cannot be sure from which msg onward the new ssn will be used. To make this more synchronous, it shouldn't allow the out stream reset until all the chunks in the unsent outq have been sent out. This patch checks the corresponding stream outqs when sending and processing the request. If any of them has unsent chunks in outq, it will return -EAGAIN instead or send SCTP_STRRESET_IN_PROGRESS back to the sender. Fixes: 7f9d68ac944e ("sctp: implement sender-side procedures for SSN Reset Request Parameter") Suggested-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-28 | sctp: use sizeof(__u16) for each stream number length instead of magic number | Xin Long | 1 | -4/+6
Now in stream reconf part there are still some places using magic number 2 for each stream number length. To make it more readable, this patch is to replace them with sizeof(__u16). Reported-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>