Age | Commit message (Collapse) | Author | Files | Lines |
|
The commit 19ba1eb15a2a ("Input: psmouse - add a custom serio protocol
to send extra information") introduced usage of the BIT() macro
for SERIO_* flags; this macro is not provided in UAPI headers.
Replace if with similarly defined _BITUL() macro defined
in <linux/const.h>.
Fixes: 19ba1eb15a2a ("Input: psmouse - add a custom serio protocol to send extra information")
Signed-off-by: Eugene Syromiatnikov <[email protected]>
Cc: <[email protected]> # v5.0+
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
|
|
it is sometimes necessary to poll a phy register by phy_read()
function until its value satisfies some condition. introduce
phy_read_poll_timeout() macros that do this.
Suggested-by: Andrew Lunn <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Dejin Zheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
it is sometimes necessary to poll a phy register by phy_read_mmd()
function until its value satisfies some condition. introduce
phy_read_mmd_poll_timeout() macros that do this.
Suggested-by: Andrew Lunn <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Dejin Zheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
redefined readx_poll_timeout macro by read_poll_timeout to
simplify the code.
Signed-off-by: Dejin Zheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
this macro is an extension of readx_poll_timeout macro. the accessor
function op just supports only one parameter in the readx_poll_timeout
macro, but this macro can supports multiple variable parameters for
it. so functions like phy_read(struct phy_device *phydev, u32 regnum)
and phy_read_mmd(struct phy_device *phydev, int devad, u32 regnum) can
also use this poll timeout core. and also expand it can sleep some time
before read operation.
Signed-off-by: Dejin Zheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Previous changes to the IP routing code have removed all the
tests for the DS_HOST route flag.
Remove the flags and all the code that sets it.
Signed-off-by: David Laight <[email protected]>
Acked-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Packet trap groups are now explicitly registered by drivers and not
implicitly registered when the packet traps are registered. Therefore,
there is no need to encode entire group structure the trap is associated
with inside the trap structure.
Instead, only pass the group identifier. Refer to it as initial group
identifier, as future patches will allow user space to move traps
between groups.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently, packet trap groups are implicitly registered by drivers upon
packet trap registration. When the traps are registered, each is
associated with a group and the group is created by devlink, if it does
not exist already.
This makes it difficult for drivers to pass additional attributes for
the groups.
Therefore, as a preparation for future patches that require passing
additional group attributes, add an API to explicitly register /
unregister these groups.
Next patches will convert existing drivers to use this API.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2020-03-21
Implement basic support for the devlink interface in the ice driver.
Additionally pave some necessary changes for adding a devlink region that
exposes the NVM contents.
This series first contains 5 patches for enabling and implementing full NVM
read access via the ETHTOOL_GEEPROM interface. This includes some cleanup of
endian-types, a new function for reading from the NVM and Shadow RAM as a flat
addressable space, a function to calculate the available flash size during
load, and a change to how some of the NVM version fields are stored in the
ice_nvm_info structure.
Following this is 3 patches for implementing devlink support. First, one patch
which implements the basic framework and introduces the ice_devlink.c file.
Second, a patch to implement basic .info_get support. Finally, a patch which
reads the device PBA identifier and reports it as the `board.id` value in the
.info_get response.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Unlike NL_SET_ERR_* macros, nl_set_extack_cookie_u64() and
nl_set_extack_cookie_u32() helpers do not check extack argument for null
and neither do their callers, as syzbot recently discovered for
ethnl_parse_header().
Instead of fixing the callers and leaving the trap in place, add check of
null extack to both helpers to make them consistent with NL_SET_ERR_*
macros.
v2: drop incorrect second Fixes tag
Fixes: 2363d73a2f3e ("ethtool: reject unrecognized request flags")
Reported-by: [email protected]
Signed-off-by: Michal Kubecek <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
So far PHY drivers have to check whether a downshift occurred to be
able to notify the user. To make life of drivers authors a little bit
easier move the downshift notification to phylib. phy_check_downshift()
compares the highest mutually advertised speed with the actual value
of phydev->speed (typically read by the PHY driver from a
vendor-specific register) to detect a downshift.
v2:
- Add downshift hint to phy_print_status
Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 53eca1f3479f ("net: rename flow_action_hw_stats_types* ->
flow_action_hw_stats*") renamed just the flow action types and
helpers. For consistency rename variables, enums, struct members
and UAPI too (note that this UAPI was not in any official release,
yet).
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Make it so that CEPH_MSG_DATA_PAGES data item can own pages,
fixing a bunch of memory leaks for a page vector allocated in
alloc_msg_with_page_vector(). Currently, only watch-notify
messages trigger this allocation, and normally the page vector
is freed either in handle_watch_notify() or by the caller of
ceph_osdc_notify(). But if the message is freed before that
(e.g. if the session faults while reading in the message or
if the notify is stale), we leak the page vector.
This was supposed to be fixed by switching to a message-owned
pagelist, but that never happened.
Fixes: 1907920324f1 ("libceph: support for sending notifies")
Reported-by: Roman Penyaev <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Roman Penyaev <[email protected]>
|
|
CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
per-pool flags as well. Unfortunately the backwards compatibility here
is lacking:
- the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
was guarded by require_osd_release >= RELEASE_LUMINOUS
- it was subsequently backported to luminous in v12.2.2, but that makes
no difference to clients that only check OSDMAP_FULL/NEARFULL because
require_osd_release is not client-facing -- it is for OSDs
Since all kernels are affected, the best we can do here is just start
checking both map flags and pool flags and send that to stable.
These checks are best effort, so take osdc->lock and look up pool flags
just once. Remove the FIXME, since filesystem quotas are checked above
and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.
Cc: [email protected]
Reported-by: Yanhu Cao <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Acked-by: Sage Weil <[email protected]>
|
|
Don't let non-letters inside a literal block without escaping it, as
the toolchain would mis-interpret it:
./include/linux/i2c.h:518: WARNING: Inline strong start-string without end-string.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
Merge misc fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <[email protected]>:
x86/mm: split vmalloc_sync_all()
mm, slub: prevent kmalloc_node crashes and memory leaks
mm/mmu_notifier: silence PROVE_RCU_LIST warnings
epoll: fix possible lost wakeup on epoll_ctl() path
mm: do not allow MADV_PAGEOUT for CoW pages
mm, memcg: throttle allocators based on ancestral memory.high
mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
page-flags: fix a crash at SetPageError(THP_SWAP)
mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
memcg: fix NULL pointer dereference in __mem_cgroup_usage_unregister_event
|
|
Commit 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in
__purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in
the vunmap() code-path. While this change was necessary to maintain
correctness on x86-32-pae kernels, it also adds additional cycles for
architectures that don't need it.
Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported
severe performance regressions in micro-benchmarks because it now also
calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But
the vmalloc_sync_all() implementation on x86-64 is only needed for newly
created mappings.
To avoid the unnecessary work on x86-64 and to gain the performance
back, split up vmalloc_sync_all() into two functions:
* vmalloc_sync_mappings(), and
* vmalloc_sync_unmappings()
Most call-sites to vmalloc_sync_all() only care about new mappings being
synchronized. The only exception is the new call-site added in the
above mentioned commit.
Shile Zhang directed us to a report of an 80% regression in reaim
throughput.
Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Reported-by: kernel test robot <[email protected]>
Reported-by: Shile Zhang <[email protected]>
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Borislav Petkov <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]> [GHES]
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: https://lists.01.org/hyperkitty/list/[email protected]/thread/4D3JPPHBNOSPFK2KEPC6KGKS6J25AIDB/
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped
out") supported writing THP to a swap device but forgot to upgrade an
older commit df8c94d13c7e ("page-flags: define behavior of FS/IO-related
flags on compound pages") which could trigger a crash during THP
swapping out with DEBUG_VM_PGFLAGS=y,
kernel BUG at include/linux/page-flags.h:317!
page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
page:fffff3b2ec3a8000 refcount:512 mapcount:0 mapping:000000009eb0338c index:0x7f6e58200 head:fffff3b2ec3a8000 order:9 compound_mapcount:0 compound_pincount:0
anon flags: 0x45fffe0000d8454(uptodate|lru|workingset|owner_priv_1|writeback|head|reclaim|swapbacked)
end_swap_bio_write()
SetPageError(page)
VM_BUG_ON_PAGE(1 && PageCompound(page))
<IRQ>
bio_endio+0x297/0x560
dec_pending+0x218/0x430 [dm_mod]
clone_endio+0xe4/0x2c0 [dm_mod]
bio_endio+0x297/0x560
blk_update_request+0x201/0x920
scsi_end_request+0x6b/0x4b0
scsi_io_completion+0x509/0x7e0
scsi_finish_command+0x1ed/0x2a0
scsi_softirq_done+0x1c9/0x1d0
__blk_mqnterrupt+0xf/0x20
</IRQ>
Fix by checking PF_NO_TAIL in those places instead.
Fixes: bd4c82c22c36 ("mm, THP, swap: delay splitting THP after swapped out")
Signed-off-by: Qian Cai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: "Huang, Ying" <[email protected]>
Acked-by: Rafael Aquini <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Pull io_uring fixes from Jens Axboe:
"Two different fixes in here:
- Fix for a potential NULL pointer deref for links with async or
drain marked (Pavel)
- Fix for not properly checking RLIMIT_NOFILE for async punted
operations.
This affects openat/openat2, which were added this cycle, and
accept4. I did a full audit of other cases where we might check
current->signal->rlim[] and found only RLIMIT_FSIZE for buffered
writes and fallocate. That one is fixed and queued for 5.7 and
marked stable"
* tag 'io_uring-5.6-20200320' of git://git.kernel.dk/linux-block:
io_uring: make sure accept honor rlimit nofile
io_uring: make sure openat/openat2 honor rlimit nofile
io_uring: NULL-deref for IOSQE_{ASYNC,DRAIN}
|
|
The nfp driver uses ``fw.bundle_id`` to represent a unique identifier of the
entire firmware bundle.
A future change is going to introduce a similar notion in the ice
driver, so promote ``fw.bundle_id`` into a generic version now.
Signed-off-by: Jacob Keller <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Another set of changes:
* HE ranging (fine timing measurement) API support
* hwsim gets virtio support, for use with wmediumd,
to be able to simulate with multiple machines
* eapol-over-nl80211 improvements to exclude preauth
* IBSS reset support, to recover connections from
userspace
* and various others.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Now that we have a nested tunnel info attribute we can add a separate
one for the tunnel command and require it explicitly from user-space. It
must be one of RTM_SETLINK/DELLINK. Only RTM_SETLINK requires a valid
tunnel id, DELLINK just removes it if it was set before. This allows us
to have all tunnel attributes and control in one place, thus removing
the need for an outside vlan info flag.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
While discussing the new API, Roopa mentioned that we'll be adding more
tunnel attributes and options in the future, so it's better to make it a
nested attribute, since this is still in net-next we can easily change it
and nest the tunnel id attribute under BRIDGE_VLANDB_ENTRY_TUNNEL_INFO.
The new format is:
[BRIDGE_VLANDB_ENTRY]
[BRIDGE_VLANDB_ENTRY_TUNNEL_INFO]
[BRIDGE_VLANDB_TINFO_ID]
Any new tunnel attributes can be nested under
BRIDGE_VLANDB_ENTRY_TUNNEL_INFO.
Suggested-by: Roopa Prabhu <[email protected]>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Just like commit 4022e7af86be, this fixes the fact that
IORING_OP_ACCEPT ends up using get_unused_fd_flags(), which checks
current->signal->rlim[] for limits.
Add an extra argument to __sys_accept4_file() that allows us to pass
in the proper nofile limit, and grab it at request prep time.
Acked-by: David S. Miller <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Dmitry reports that a test case shows that io_uring isn't honoring a
modified rlimit nofile setting. get_unused_fd_flags() checks the task
signal->rlimi[] for the limits. As this isn't easily inheritable,
provide a __get_unused_fd_flags() that takes the value instead. Then we
can grab it when the request is prepared (from the original task), and
pass that in when we do the async part part of the open.
Reported-by: Dmitry Kadashev <[email protected]>
Tested-by: Dmitry Kadashev <[email protected]>
Acked-by: David S. Miller <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Drivers that trigger roaming need to know the lifetime of the configured
PMKSA for deciding whether to trigger the full or PMKSA cache based
authentication. The configured PMKSA is invalid after the PMK lifetime
has expired and must not be used after that and the STA needs to
disassociate if the PMK expires. Hence the STA is expected to refresh
the PMK with a full authentication before this happens (e.g., when
reassociating to a new BSS the next time or by performing EAPOL
reauthentication depending on the AKM) to avoid unnecessary
disconnection.
The PMK reauthentication threshold is the percentage of the PMK lifetime
value and indicates to the driver to trigger a full authentication roam
(without PMKSA caching) after the reauthentication threshold time, but
before the PMK timer has expired. Authentication methods like SAE need
to be able to generate a new PMKSA entry without having to force a
disconnection after this threshold timeout. If no roaming occurs between
the reauthentication threshold time and PMK lifetime expiration,
disassociation is still forced.
The new attributes for providing these values correspond to the dot11
MIB variables dot11RSNAConfigPMKLifetime and
dot11RSNAConfigPMKReauthThreshold.
This type of functionality is already available in cases where user
space component is in control of roaming. This commit extends that same
capability into cases where parts or all of this functionality is
offloaded to the driver.
Signed-off-by: Veerendranath Jakkam <[email protected]>
Signed-off-by: Jouni Malinen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Sometimes, userspace is able to detect that a peer silently lost its
state (like, if the peer reboots). wpa_supplicant does this for IBSS-RSN
by registering for auth/deauth frames, but when it detects this, it is
only able to remove the encryption keys of the peer and close its port.
However, the kernel also hold other state about the station, such as BA
sessions, probe response parameters and the like. They also need to be
resetted correctly.
This patch adds the NL80211_EXT_FEATURE_DEL_IBSS_STA feature flag
indicating the driver accepts deleting stations in IBSS mode, which
should send a deauth and reset the state of the station, just like in
mesh point mode.
Signed-off-by: Nicolas Cavallari <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[preserve -EINVAL return]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Add API for telling whether the driver supports protected TWT.
The protected_twt capability in the RSNXE will be based on this.
Signed-off-by: Shaul Triebitz <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Pass the AP's HE operation element to the driver.
Signed-off-by: Shaul Triebitz <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Add support for requesting that the ranging measurement will use
the trigger-based / non trigger-based flow instead of the EDCA based
flow.
Signed-off-by: Avraham Stern <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
The structure member added at some point, but the kernel-doc was not
updated.
Signed-off-by: Qiujun Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
This patch adds support for disabling pre-auth rx over the nl80211 control
port for mac80211.
Signed-off-by: Markus Theil <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[fix indentation slightly, squash feature enablement]
Signed-off-by: Johannes Berg <[email protected]>
|
|
If the nl80211 control port is used before this patch, pre-auth frames
(0x88c7) are send to userspace uncoditionally. While this enables userspace
to only use nl80211 on the station side, it is not always useful for APs.
Furthermore, pre-auth frames are ordinary data frames and not related to
the control port. Therefore it should for example be possible for pre-auth
frames to be bridged onto a wired network on AP side without touching
userspace.
For backwards compatibility to code already using pre-auth over nl80211,
this patch adds a feature flag to disable this behavior, while it remains
enabled by default. An additional ext. feature flag is added to detect this
from userspace.
Thanks to Jouni for pointing out, that pre-auth frames should be handled as
ordinary data frames.
Signed-off-by: Markus Theil <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
This allows communication with external entities.
It also required fixing up the netlink policy, since NLA_UNSPEC
attributes are no longer accepted.
Signed-off-by: Erel Geron <[email protected]>
[port to backports, inline the ID, use 29 as the ID as requested,
drop != NULL checks, reduce ifdefs]
Link: https://lore.kernel.org/r/20200305143212.c6e4c87d225b.I7ce60bf143e863dcdf0fb8040aab7168ba549b99@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
Kernel-doc complains if the line isn't prefixed with an
asterisk, fix that.
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Link: https://lore.kernel.org/r/20200320144110.2786ad5fb234.I369d103d11c71e39e3a3f97ed68a528c5b875f1e@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:
====================
pull request: bluetooth-next 2020-03-19
Here's the main bluetooth-next pull request for the 5.7 kernel.
- Added wideband speech support to mgmt and the ability for HCI drivers
to declare support for it.
- Added initial support for L2CAP Enhanced Credit Based Mode
- Fixed suspend handling for several use cases
- Fixed Extended Advertising related issues
- Added support for Realtek 8822CE device
- Added DT bindings for QTI chip WCN3991
- Cleanups to replace zero-length arrays with flexible-array members
- Several other smaller cleanups & fixes
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The CONFIG_SYSFS declaration of sysfs_group_change_owner() is different
from the !CONFIG_SYSFS version and thus causes build failurs when
!CONFIG_SYSFS is set.
Reported-by: Stephen Rothwell <[email protected]>
Fixes: 303a42769c4c ("sysfs: add sysfs_group{s}_change_owner()")
Signed-off-by: Christian Brauner <[email protected]>
Reported-by: Randy Dunlap <[email protected]>
Acked-by: Randy Dunlap <[email protected]> # build-tested
Signed-off-by: David S. Miller <[email protected]>
|
|
The skbedit action "priority" is used for adjusting SKB priority. Allow
drivers to offload the action by introducing two new skbedit getters and a
new flow action, and initializing appropriately in tc_setup_flow_action().
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The two functions is_tcf_skbedit_mark() and is_tcf_skbedit_ptype() have a
very similar structure. A follow-up patch will add one more such function.
Instead of more cut'n'pasting, extract a helper function that checks
whether a TC action is an skbedit with the required flag. Convert the two
existing functions into thin wrappers around the helper.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc, afs: Interruptibility fixes
Here are a number of fixes for AF_RXRPC and AFS that make AFS system calls
less interruptible and so less likely to leave the filesystem in an
uncertain state. There's also a miscellaneous patch to make tracing
consistent.
(1) Firstly, abstract out the Tx space calculation in sendmsg. Much the
same code is replicated in a number of places that subsequent patches
are going to alter, including adding another copy.
(2) Fix Tx interruptibility by allowing a kernel service, such as AFS, to
request that a call be interruptible only when waiting for a call slot
to become available (ie. the call has not taken place yet) or that a
call be not interruptible at all (e.g. when we want to do writeback
and don't want a signal interrupting a VM-induced writeback).
(3) Increase the minimum delay on MSG_WAITALL for userspace sendmsg() when
waiting for Tx buffer space as a 2*RTT delay is really small over 10G
ethernet and a 1 jiffy timeout might be essentially 0 if at the end of
the jiffy period.
(4) Fix some tracing output in AFS to make it consistent with rxrpc.
(5) Make sure aborted asynchronous AFS operations are tidied up properly
so we don't end up with stuck rxrpc calls.
(6) Make AFS client calls uninterruptible in the Rx phase. If we don't
wait for the reply to be fully gathered, we can't update the local VFS
state and we end up in an indeterminate state with respect to the
server.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The cited commit removed RTNL from tc_setup_flow_action(), but the
function calls two tunnel key action helpers that use rtnl_dereference()
to fetch the action's parameters. This leads to "suspicious RCU usage"
warnings [1][2].
Change the helpers to use rcu_dereference_protected() while requiring
the action's lock to be held. This is safe because the two helpers are
only called from tc_setup_flow_action() which acquires the lock.
[1]
[ 156.950855] =============================
[ 156.955463] WARNING: suspicious RCU usage
[ 156.960085] 5.6.0-rc5-custom-47426-gdfe43878d573 #2409 Not tainted
[ 156.967116] -----------------------------
[ 156.971728] include/net/tc_act/tc_tunnel_key.h:31 suspicious rcu_dereference_protected() usage!
[ 156.981583]
[ 156.981583] other info that might help us debug this:
[ 156.981583]
[ 156.990675]
[ 156.990675] rcu_scheduler_active = 2, debug_locks = 1
[ 156.998205] 1 lock held by tc/877:
[ 157.002187] #0: ffff8881cbf7bea0 (&(&p->tcfa_lock)->rlock){+...}, at: tc_setup_flow_action+0xbe/0x4f78
[ 157.012866]
[ 157.012866] stack backtrace:
[ 157.017886] CPU: 2 PID: 877 Comm: tc Not tainted 5.6.0-rc5-custom-47426-gdfe43878d573 #2409
[ 157.027253] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[ 157.037389] Call Trace:
[ 157.040170] dump_stack+0xfd/0x178
[ 157.044034] lockdep_rcu_suspicious+0x14a/0x153
[ 157.049157] tc_setup_flow_action+0x89f/0x4f78
[ 157.054227] fl_hw_replace_filter+0x375/0x640
[ 157.064348] fl_change+0x28ec/0x4f6b
[ 157.088843] tc_new_tfilter+0x15e2/0x2260
[ 157.176801] rtnetlink_rcv_msg+0x8d6/0xb60
[ 157.190915] netlink_rcv_skb+0x177/0x460
[ 157.208884] rtnetlink_rcv+0x21/0x30
[ 157.212925] netlink_unicast+0x5d0/0x7f0
[ 157.227728] netlink_sendmsg+0x981/0xe90
[ 157.245416] ____sys_sendmsg+0x76d/0x8f0
[ 157.255348] ___sys_sendmsg+0x10f/0x190
[ 157.320308] __sys_sendmsg+0x115/0x1f0
[ 157.342553] __x64_sys_sendmsg+0x7d/0xc0
[ 157.346987] do_syscall_64+0xc1/0x600
[ 157.351142] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[2]
[ 157.432346] =============================
[ 157.436937] WARNING: suspicious RCU usage
[ 157.441537] 5.6.0-rc5-custom-47426-gdfe43878d573 #2409 Not tainted
[ 157.448559] -----------------------------
[ 157.453204] include/net/tc_act/tc_tunnel_key.h:43 suspicious rcu_dereference_protected() usage!
[ 157.463042]
[ 157.463042] other info that might help us debug this:
[ 157.463042]
[ 157.472112]
[ 157.472112] rcu_scheduler_active = 2, debug_locks = 1
[ 157.479529] 1 lock held by tc/877:
[ 157.483442] #0: ffff8881cbf7bea0 (&(&p->tcfa_lock)->rlock){+...}, at: tc_setup_flow_action+0xbe/0x4f78
[ 157.494119]
[ 157.494119] stack backtrace:
[ 157.499114] CPU: 2 PID: 877 Comm: tc Not tainted 5.6.0-rc5-custom-47426-gdfe43878d573 #2409
[ 157.508485] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[ 157.518628] Call Trace:
[ 157.521416] dump_stack+0xfd/0x178
[ 157.525293] lockdep_rcu_suspicious+0x14a/0x153
[ 157.530425] tc_setup_flow_action+0x993/0x4f78
[ 157.535505] fl_hw_replace_filter+0x375/0x640
[ 157.545650] fl_change+0x28ec/0x4f6b
[ 157.570204] tc_new_tfilter+0x15e2/0x2260
[ 157.658199] rtnetlink_rcv_msg+0x8d6/0xb60
[ 157.672315] netlink_rcv_skb+0x177/0x460
[ 157.690278] rtnetlink_rcv+0x21/0x30
[ 157.694320] netlink_unicast+0x5d0/0x7f0
[ 157.709129] netlink_sendmsg+0x981/0xe90
[ 157.726813] ____sys_sendmsg+0x76d/0x8f0
[ 157.736725] ___sys_sendmsg+0x10f/0x190
[ 157.801721] __sys_sendmsg+0x115/0x1f0
[ 157.823967] __x64_sys_sendmsg+0x7d/0xc0
[ 157.828403] do_syscall_64+0xc1/0x600
[ 157.832558] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes: b15e7a6e8d31 ("net: sched: don't take rtnl lock during flow_action setup")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds support for vlan stats to be included when dumping vlan
information. We have to dump them only when explicitly requested (thus the
flag below) because that disables the vlan range compression and will make
the dump significantly larger. In order to request the stats to be
included we add a new dump attribute called BRIDGE_VLANDB_DUMP_FLAGS which
can affect dumps with the following first flag:
- BRIDGE_VLANDB_DUMPF_STATS
The stats are intentionally nested and put into separate attributes to make
it easier for extending later since we plan to add per-vlan mcast stats,
drop stats and possibly STP stats. This is the last missing piece from the
new vlan API which makes the dumped vlan information complete.
A dump request which should include stats looks like:
[BRIDGE_VLANDB_DUMP_FLAGS] |= BRIDGE_VLANDB_DUMPF_STATS
A vlandb entry attribute with stats looks like:
[BRIDGE_VLANDB_ENTRY] = {
[BRIDGE_VLANDB_ENTRY_STATS] = {
[BRIDGE_VLANDB_STATS_RX_BYTES]
[BRIDGE_VLANDB_STATS_RX_PACKETS]
...
}
}
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-03-17
1) Compiler warnings and cleanup for the connection tracking series
2) Bug fixes for the connection tracking series
3) Fix devlink port register sequence
4) Last five patches in the series, By Eli cohen
Add the support for forwarding traffic between two eswitch uplink
representors (Hairpin for eswitch), using mlx5 termination tables
to change the direction of a packet in hw from RX to TX pipeline.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
This reverts the following commits:
8537f78647c0 ("netfilter: Introduce egress hook")
5418d3881e1f ("netfilter: Generalize ingress hook")
b030f194aed2 ("netfilter: Rename ingress hook include file")
>From the discussion in [0], the author's main motivation to add a hook
in fast path is for an out of tree kernel module, which is a red flag
to begin with. Other mentioned potential use cases like NAT{64,46}
is on future extensions w/o concrete code in the tree yet. Revert as
suggested [1] given the weak justification to add more hooks to critical
fast-path.
[0] https://lore.kernel.org/netdev/[email protected]/
[1] https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: David Miller <[email protected]>
Cc: Pablo Neira Ayuso <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Nacked-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Use nf_flow_offload_tuple() to fetch flow stats, from Paul Blakey.
2) Add new xt_IDLETIMER hard mode, from Manoj Basapathi.
Follow up patch to clean up this new mode, from Dan Carpenter.
3) Add support for geneve tunnel options, from Xin Long.
4) Make sets built-in and remove modular infrastructure for sets,
from Florian Westphal.
5) Remove unused TEMPLATE_NULLS_VAL, from Li RongQing.
6) Statify nft_pipapo_get, from Chen Wandun.
7) Use C99 flexible-array member, from Gustavo A. R. Silva.
8) More descriptive variable names for bitwise, from Jeremy Sowden.
9) Four patches to add tunnel device hardware offload to the flowtable
infrastructure, from wenxu.
10) pipapo set supports for 8-bit grouping, from Stefano Brivio.
11) pipapo can switch between nibble and byte grouping, also from
Stefano.
12) Add AVX2 vectorized version of pipapo, from Stefano Brivio.
13) Update pipapo to be use it for single ranges, from Stefano.
14) Add stateful expression support to elements via control plane,
eg. counter per element.
15) Re-visit sysctls in unprivileged namespaces, from Florian Westphal.
15) Add new egress hook, from Lukas Wunner.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Implement helpers for PCS accessed via the MII bus using 802.3 clause
45 cycles for 10GBASE-R. Only link up/down is supported, 10G full
duplex is assumed.
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Implement helpers for PCS accessed via the MII bus using 802.3 clause
22 cycles, conforming to 802.3 clause 37 and Cisco SGMII specifications
for the advertisement word.
Signed-off-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add APIs for modifying a MDIO device register, similar to the existing
phy_modify() group of functions, but at mdiobus level instead. Adapt
__phy_modify_changed() to use the new mdiobus level helper.
Signed-off-by: Russell King <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds support for manipulating vlan/tunnel mappings. The
tunnel ids are globally unique and are one per-vlan. There were two
trickier issues - first in order to support vlan ranges we have to
compute the current tunnel id in the following way:
- base tunnel id (attr) + current vlan id - starting vlan id
This is in line how the old API does vlan/tunnel mapping with ranges. We
already have the vlan range present, so it's redundant to add another
attribute for the tunnel range end. It's simply base tunnel id + vlan
range. And second to support removing mappings we need an out-of-band way
to tell the option manipulating function because there are no
special/reserved tunnel id values, so we use a vlan flag to denote the
operation is tunnel mapping removal.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a new option - BRIDGE_VLANDB_ENTRY_TUNNEL_ID which is used to dump
the tunnel id mapping. Since they're unique per vlan they can enter a
vlan range if they're consecutive, thus we can calculate the tunnel id
range map simply as: vlan range end id - vlan range start id. The
starting point is the tunnel id in BRIDGE_VLANDB_ENTRY_TUNNEL_ID. This
is similar to how the tunnel entries can be created in a range via the
old API (a vlan range maps to a tunnel range).
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|