aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2019-04-10clk: x86: Add system specific quirk to mark clocks as criticalDavid Müller1-0/+3
Since commit 648e921888ad ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL"), the pmc_plt_clocks of the Bay Trail SoC are unconditionally gated off. Unfortunately this will break systems where these clocks are used for external purposes beyond the kernel's knowledge. Fix it by implementing a system specific quirk to mark the necessary pmc_plt_clks as critical. Fixes: 648e921888ad ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL") Signed-off-by: David Müller <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Stephen Boyd <[email protected]>
2019-04-10of: use correct function prototype for of_overlay_fdt_apply()Chris Packham1-1/+2
When CONFIG_OF_OVERLAY is not enabled the fallback stub for of_overlay_fdt_apply() does not match the prototype for the case when CONFIG_OF_OVERLAY is enabled. Update the stub to use the correct function prototype. Fixes: 39a751a4cb7e ("of: change overlay apply input data from unflattened to FDT") Signed-off-by: Chris Packham <[email protected]> Reviewed-by: Frank Rowand <[email protected]> Signed-off-by: Rob Herring <[email protected]>
2019-04-10Merge branch 'mlx5-next' into rdma.git for-nextJason Gunthorpe5-32/+69
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Required for dependencies on the next series * branch 'mlx5-next': net/mlx5: E-Switch, add a new prio to be used by the RDMA side net/mlx5: E-Switch, don't use hardcoded values for FDB prios net/mlx5: Fix false compilation warning net/mlx5: Expose MPEIN (Management PCIE INfo) register layout net/mlx5: Add rate limit print macros net/mlx5: Add explicit bar address field net/mlx5: Replace dev_err/warn/info by mlx5_core_err/warn/info net/mlx5: Use dev->priv.name instead of dev_name net/mlx5: Make mlx5_core messages independent from mdev->pdev net/mlx5: Break load_one into three stages net/mlx5: Function setup/teardown procedures net/mlx5: Move health and page alloc init to mdev_init net/mlx5: Split mdev init and pci init net/mlx5: Remove redundant init functions parameter net/mlx5: Remove spinlock support from mlx5_write64 net/mlx5: Remove unused MLX5_*_DOORBELL_LOCK macros Signed-off-by: Jason Gunthorpe <[email protected]>
2019-04-10keys: safe concurrent user->{session,uid}_keyring accessJann Horn1-0/+7
The current code can perform concurrent updates and reads on user->session_keyring and user->uid_keyring. Add a comment to struct user_struct to document the nontrivial locking semantics, and use READ_ONCE() for unlocked readers and smp_store_release() for writers to prevent memory ordering issues. Fixes: 69664cf16af4 ("keys: don't generate user and user session keyrings unless they're accessed") Signed-off-by: Jann Horn <[email protected]> Signed-off-by: James Morris <[email protected]>
2019-04-10security: don't use RCU accessors for cred->session_keyringJann Horn1-1/+1
sparse complains that a bunch of places in kernel/cred.c access cred->session_keyring without the RCU helpers required by the __rcu annotation. cred->session_keyring is written in the following places: - prepare_kernel_cred() [in a new cred struct] - keyctl_session_to_parent() [in a new cred struct] - prepare_creds [in a new cred struct, via memcpy] - install_session_keyring_to_cred() - from install_session_keyring() on new creds - from join_session_keyring() on new creds [twice] - from umh_keys_init() - from call_usermodehelper_exec_async() on new creds All of these writes are before the creds are committed; therefore, cred->session_keyring doesn't need RCU protection. Remove the __rcu annotation and fix up all existing users that use __rcu. Signed-off-by: Jann Horn <[email protected]> Signed-off-by: James Morris <[email protected]>
2019-04-10Merge branch 'omap-for-v5.2/am4-ddr3' into omap-for-v5.2/am4-pm-v2Tony Lindgren1-0/+3
2019-04-10blk-mq: introduce blk_mq_complete_request_sync()Ming Lei1-0/+1
In NVMe's error handler, follows the typical steps of tearing down hardware for recovering controller: 1) stop blk_mq hw queues 2) stop the real hw queues 3) cancel in-flight requests via blk_mq_tagset_busy_iter(tags, cancel_request, ...) cancel_request(): mark the request as abort blk_mq_complete_request(req); 4) destroy real hw queues However, there may be race between #3 and #4, because blk_mq_complete_request() may run q->mq_ops->complete(rq) remotelly and asynchronously, and ->complete(rq) may be run after #4. This patch introduces blk_mq_complete_request_sync() for fixing the above race. Cc: Sagi Grimberg <[email protected]> Cc: Bart Van Assche <[email protected]> Cc: James Smart <[email protected]> Cc: [email protected] Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-10locking/rwsem: Optimize rwsem structure for uncontended lock acquisitionWaiman Long1-6/+15
For an uncontended rwsem, count and owner are the only fields a task needs to touch when acquiring the rwsem. So they are put next to each other to increase the chance that they will share the same cacheline. On a ThunderX2 99xx (arm64) system with 32K L1 cache and 256K L2 cache, a rwsem locking microbenchmark with one locking thread was run to write-lock and write-unlock an array of rwsems separated 2 cachelines apart in a 1M byte memory block. The locking rates (kops/s) of the microbenchmark when the rwsems are at various "long" (8-byte) offsets from beginning of the cacheline before and after the patch were as follows: Cacheline Offset Pre-patch Post-patch ---------------- --------- ---------- 0 17,449 16,588 1 17,450 16,465 2 17,450 16,460 3 17,453 16,462 4 14,867 16,471 5 14,867 16,470 6 14,853 16,464 7 14,867 13,172 Before the patch, the count and owner are 4 "long"s apart. After the patch, they are only 1 "long" apart. The rwsem data have to be loaded from the L3 cache for each access. It can be seen that the locking rates are more consistent after the patch than before. Note that for this particular system, the performance drop happens whenever the count and owner are at an odd multiples of "long"s apart. No performance drop was observed when only a single rwsem was used (hot cache). So the drop is likely just an idiosyncrasy of the cache architecture of this chip than an inherent problem with the patch. Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Waiman Long <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tim Chen <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-10locking/rwsem: Move rwsem internal function declarations to rwsem-xadd.hWaiman Long1-7/+0
We don't need to expose rwsem internal functions which are not supposed to be called directly from other kernel code. Signed-off-by: Waiman Long <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tim Chen <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-10Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar8-29/+46
Signed-off-by: Ingo Molnar <[email protected]>
2019-04-10OPP: Introduce dev_pm_opp_find_freq_ceil_by_volt()Andrew-sh.Cheng1-0/+8
This patch introduces a new helper routine in the OPP core, which returns the OPP with the highest frequency which has voltage less than or equal to the target voltage passed to the helper. Signed-off-by: Andrew-sh.Cheng <[email protected]> [ Viresh: Massaged the commit log and renamed the helper with some cleanups. ] Signed-off-by: Viresh Kumar <[email protected]>
2019-04-10net/mlx5: E-Switch, add a new prio to be used by the RDMA sideMark Bloch1-0/+1
Create a new prio in the FDB, it will be used when inserting steering rules into the FDB from the RDMA side. We create a new PRIO so rules from the net side and rules from the RDMA side won't be inserted to the same PRIO, each side has it's own sandbox to play in. Signed-off-by: Mark Bloch <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2019-04-10net/mlx5: E-Switch, don't use hardcoded values for FDB priosMark Bloch1-0/+5
When creating the FDB prios, use the enum values already defined and not the hardcoded values. Signed-off-by: Mark Bloch <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2019-04-09bpf: allow for key-less BTF in array mapDaniel Borkmann1-0/+1
Given we'll be reusing BPF array maps for global data/bss/rodata sections, we need a way to associate BTF DataSec type as its map value type. In usual cases we have this ugly BPF_ANNOTATE_KV_PAIR() macro hack e.g. via 38d5d3b3d5db ("bpf: Introduce BPF_ANNOTATE_KV_PAIR") to get initial map to type association going. While more use cases for it are discouraged, this also won't work for global data since the use of array map is a BPF loader detail and therefore unknown at compilation time. For array maps with just a single entry we make an exception in terms of BTF in that key type is declared optional if value type is of DataSec type. The latter LLVM is guaranteed to emit and it also aligns with how we regard global data maps as just a plain buffer area reusing existing map facilities for allowing things like introspection with existing tools. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-04-09bpf: add syscall side map freeze supportDaniel Borkmann1-1/+2
This patch adds a new BPF_MAP_FREEZE command which allows to "freeze" the map globally as read-only / immutable from syscall side. Map permission handling has been refactored into map_get_sys_perms() and drops FMODE_CAN_WRITE in case of locked map. Main use case is to allow for setting up .rodata sections from the BPF ELF which are loaded into the kernel, meaning BPF loader first allocates map, sets up map value by copying .rodata section into it and once complete, it calls BPF_MAP_FREEZE on the map fd to prevent further modifications. Right now BPF_MAP_FREEZE only takes map fd as argument while remaining bpf_attr members are required to be zero. I didn't add write-only locking here as counterpart since I don't have a concrete use-case for it on my side, and I think it makes probably more sense to wait once there is actually one. In that case bpf_attr can be extended as usual with a flag field and/or others where flag 0 means that we lock the map read-only hence this doesn't prevent to add further extensions to BPF_MAP_FREEZE upon need. A map creation flag like BPF_F_WRONCE was not considered for couple of reasons: i) in case of a generic implementation, a map can consist of more than just one element, thus there could be multiple map updates needed to set the map into a state where it can then be made immutable, ii) WRONCE indicates exact one-time write before it is then set immutable. A generic implementation would set a bit atomically on map update entry (if unset), indicating that every subsequent update from then onwards will need to bail out there. However, map updates can fail, so upon failure that flag would need to be unset again and the update attempt would need to be repeated for it to be eventually made immutable. While this can be made race-free, this approach feels less clean and in combination with reason i), it's not generic enough. A dedicated BPF_MAP_FREEZE command directly sets the flag and caller has the guarantee that map is immutable from syscall side upon successful return for any future syscall invocations that would alter the map state, which is also more intuitive from an API point of view. A command name such as BPF_MAP_LOCK has been avoided as it's too close with BPF map spin locks (which already has BPF_F_LOCK flag). BPF_MAP_FREEZE is so far only enabled for privileged users. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-04-09bpf: add program side {rd, wr}only support for mapsDaniel Borkmann1-0/+29
This work adds two new map creation flags BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG in order to allow for read-only or write-only BPF maps from a BPF program side. Today we have BPF_F_RDONLY and BPF_F_WRONLY, but this only applies to system call side, meaning the BPF program has full read/write access to the map as usual while bpf(2) calls with map fd can either only read or write into the map depending on the flags. BPF_F_RDONLY_PROG and BPF_F_WRONLY_PROG allows for the exact opposite such that verifier is going to reject program loads if write into a read-only map or a read into a write-only map is detected. For read-only map case also some helpers are forbidden for programs that would alter the map state such as map deletion, update, etc. As opposed to the two BPF_F_RDONLY / BPF_F_WRONLY flags, BPF_F_RDONLY_PROG as well as BPF_F_WRONLY_PROG really do correspond to the map lifetime. We've enabled this generic map extension to various non-special maps holding normal user data: array, hash, lru, lpm, local storage, queue and stack. Further generic map types could be followed up in future depending on use-case. Main use case here is to forbid writes into .rodata map values from verifier side. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-04-09bpf: implement lookup-free direct value access for mapsDaniel Borkmann2-0/+10
This generic extension to BPF maps allows for directly loading an address residing inside a BPF map value as a single BPF ldimm64 instruction! The idea is similar to what BPF_PSEUDO_MAP_FD does today, which is a special src_reg flag for ldimm64 instruction that indicates that inside the first part of the double insns's imm field is a file descriptor which the verifier then replaces as a full 64bit address of the map into both imm parts. For the newly added BPF_PSEUDO_MAP_VALUE src_reg flag, the idea is the following: the first part of the double insns's imm field is again a file descriptor corresponding to the map, and the second part of the imm field is an offset into the value. The verifier will then replace both imm parts with an address that points into the BPF map value at the given value offset for maps that support this operation. Currently supported is array map with single entry. It is possible to support more than just single map element by reusing both 16bit off fields of the insns as a map index, so full array map lookup could be expressed that way. It hasn't been implemented here due to lack of concrete use case, but could easily be done so in future in a compatible way, since both off fields right now have to be 0 and would correctly denote a map index 0. The BPF_PSEUDO_MAP_VALUE is a distinct flag as otherwise with BPF_PSEUDO_MAP_FD we could not differ offset 0 between load of map pointer versus load of map's value at offset 0, and changing BPF_PSEUDO_MAP_FD's encoding into off by one to differ between regular map pointer and map value pointer would add unnecessary complexity and increases barrier for debugability thus less suitable. Using the second part of the imm field as an offset into the value does /not/ come with limitations since maximum possible value size is in u32 universe anyway. This optimization allows for efficiently retrieving an address to a map value memory area without having to issue a helper call which needs to prepare registers according to calling convention, etc, without needing the extra NULL test, and without having to add the offset in an additional instruction to the value base pointer. The verifier then treats the destination register as PTR_TO_MAP_VALUE with constant reg->off from the user passed offset from the second imm field, and guarantees that this is within bounds of the map value. Any subsequent operations are normally treated as typical map value handling without anything extra needed from verification side. The two map operations for direct value access have been added to array map for now. In future other types could be supported as well depending on the use case. The main use case for this commit is to allow for BPF loader support for global variables that reside in .data/.rodata/.bss sections such that we can directly load the address of them with minimal additional infrastructure required. Loader support has been added in subsequent commits for libbpf library. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-04-09unexport d_alloc_pseudo()Al Viro1-1/+0
No modular uses since introducion of alloc_file_pseudo(), and the only non-modular user not in alloc_file_pseudo() had actually been wrong - should've been d_alloc_anon(). Signed-off-by: Al Viro <[email protected]>
2019-04-09dcache: sort the freeing-without-RCU-delay mess for good.Al Viro1-1/+1
For lockless accesses to dentries we don't have pinned we rely (among other things) upon having an RCU delay between dropping the last reference and actually freeing the memory. On the other hand, for things like pipes and sockets we neither do that kind of lockless access, nor want to deal with the overhead of an RCU delay every time a socket gets closed. So delay was made optional - setting DCACHE_RCUACCESS in ->d_flags made sure it would happen. We tried to avoid setting it unless we knew we need it. Unfortunately, that had led to recurring class of bugs, in which we missed the need to set it. We only really need it for dentries that are created by d_alloc_pseudo(), so let's not bother with trying to be smart - just make having an RCU delay the default. The ones that do *not* get it set the replacement flag (DCACHE_NORCU) and we'd better use that sparingly. d_alloc_pseudo() is the only such user right now. FWIW, the race that finally prompted that switch had been between __lock_parent() of immediate subdirectory of what's currently the root of a disconnected tree (e.g. from open-by-handle in progress) racing with d_splice_alias() elsewhere picking another alias for the same inode, either on outright corrupted fs image, or (in case of open-by-handle on NFS) that subdirectory having been just moved on server. It's not easy to hit, so the sky is not falling, but that's not the first race on similar missed cases and the logics for settinf DCACHE_RCUACCESS has gotten ridiculously convoluted. Cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2019-04-10cpuidle: Export the next timer expiration for CPUsUlf Hansson2-1/+7
To be able to predict the sleep duration for a CPU entering idle, it is essential to know the expiration time of the next timer. Both the teo and the menu cpuidle governors already use this information for CPU idle state selection. Moving forward, a similar prediction needs to be made for a group of idle CPUs rather than for a single one and the following changes implement a new genpd governor for that purpose. In order to support that feature, add a new function called tick_nohz_get_next_hrtimer() that will return the next hrtimer expiration time of a given CPU to be invoked after deciding whether or not to stop the scheduler tick on that CPU. Make the cpuidle core call tick_nohz_get_next_hrtimer() right before invoking the ->enter() callback provided by the cpuidle driver for the given state and store its return value in the per-CPU struct cpuidle_device, so as to make it available to code outside of cpuidle. Note that at the point when cpuidle calls tick_nohz_get_next_hrtimer(), the governor's ->select() callback has already returned and indicated whether or not the tick should be stopped, so in fact the value returned by tick_nohz_get_next_hrtimer() always is the next hrtimer expiration time for the given CPU, possibly including the tick (if it hasn't been stopped). Co-developed-by: Lina Iyer <[email protected]> Co-developed-by: Daniel Lezcano <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-04-10PM / Domains: Add support for CPU devices to genpdUlf Hansson1-0/+13
To enable a CPU device to be attached to a PM domain managed by genpd, make a few changes to it for convenience. To be able to quickly find out what CPUs are attached to a genpd, which typically becomes useful from a genpd governor as subsequent changes are about to show, add a cpumask to struct generic_pm_domain to be updated when a CPU device gets attached to the genpd containing that cpumask. Also, propagate the cpumask changes upwards in the domain hierarchy to the master PM domains. This way, the cpumask for a genpd hierarchically reflects all CPUs attached to the topology below it. Finally, make this an opt-in feature, to avoid having to manage CPUs and the cpumask for a genpd that don't need it. To that end, add a new genpd configuration bit, GENPD_FLAG_CPU_DOMAIN. Co-developed-by: Lina Iyer <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-04-10PM / Domains: Add generic data pointer to struct genpd_power_stateUlf Hansson1-1/+3
Add a data pointer to the genpd_power_state struct, to allow a genpd backend driver to store per-state specific data. To introduce the pointer, change the way genpd deals with freeing of the corresponding allocated data. More precisely, clarify the responsibility of whom that shall free the data, by adding a ->free_states() callback to the generic_pm_domain structure. The one allocating the data will be expected to set the callback, to allow genpd to invoke it from genpd_remove(). Co-developed-by: Lina Iyer <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-04-09docs: Add colon clearing sphinx warningTobin C. Harding1-1/+1
Sphinx emits various warnings all caused by a missing colon before code block: WARNING: Block quote ends without a blank line; unexpected unindent. ERROR: Unexpected indentation. WARNING: Block quote ends without a blank line; unexpected unindent. Add the colon, clearing sphinx warnings. Signed-off-by: Tobin C. Harding <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2019-04-09memory: ti-emif-sram: Add ti_emif_run_hw_leveling for DDR3 hardware levelingDave Gerlach1-0/+3
In certain situations, such as when returning from low power modes, the EMIF must re-run hardware leveling to properly restore DDR3 access. This is accomplished by introducing a new ti-emif-sram-pm call, ti_emif_run_hw_leveling, to check if DDR3 is in use and if so, trigger the full write and read leveling processes. Suggested-by: Brad Griffis <[email protected]> Signed-off-by: Dave Gerlach <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
2019-04-09Merge branches 'consolidate.2019.04.09a', 'doc.2019.03.26b', ↵Paul E. McKenney2-37/+5
'fixes.2019.03.26b', 'srcu.2019.03.26b', 'stall.2019.03.26b' and 'torture.2019.03.26b' into HEAD consolidate.2019.04.09a: Lingering RCU flavor consolidation cleanups. doc.2019.03.26b: Documentation updates. fixes.2019.03.26b: Miscellaneous fixes. srcu.2019.03.26b: SRCU updates. stall.2019.03.26b: RCU CPU stall warning updates. torture.2019.03.26b: Torture-test updates.
2019-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller6-28/+43
2019-04-08lib/string: Add strscpy_pad() functionTobin C. Harding1-0/+4
We have a function to copy strings safely and we have a function to copy strings and zero the tail of the destination (if source string is shorter than destination buffer) but we do not have a function to do both at once. This means developers must write this themselves if they desire this functionality. This is a chore, and also leaves us open to off by one errors unnecessarily. Add a function that calls strscpy() then memset()s the tail to zero if the source string is shorter than the destination buffer. Acked-by: Kees Cook <[email protected]> Signed-off-by: Tobin C. Harding <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2019-04-08net: phy: improve link partner capability detectionHeiner Kallweit1-0/+2
genphy_read_status() so far checks phydev->supported, not the actual PHY capabilities. This can make a difference if the supported speeds have been limited by of_set_phy_supported() or phy_set_max_speed(). It seems that this issue only affects the link partner advertisements as displayed by ethtool. Also this patch wouldn't apply to older kernels because linkmode bitmaps have been introduced recently. Therefore net-next. Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-08Merge tag 'mlx5-updates-2019-04-02' of ↵David S. Miller5-32/+64
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mamameed says: ==================== mlx5-updates-2019-04-02 This series provides misc updates to mlx5 driver 1) Aya Levin (1): Handle event of power detection in the PCIE slot 2) Eli Britstein (6): Some TC VLAN related updates and fixes to the previous VLAN modify action support patchset. Offload TC e-switch rules with egress/ingress VLAN devices 3) Max Gurtovoy (1): Fix double mutex initialization in esiwtch.c 4) Tariq Toukan (3): Misc small updates A write memory barrier is sufficient in EQ ci update Obsolete param field holding a constant value Unify logic of MTU boundaries 5) Tonghao Zhang (4): Misc updates to en_tc.c Make the log friendly when decapsulation offload not supported Remove 'parse_attr' argument in parse_tc_fdb_actions() Deletes unnecessary setting of esw_attr->parse_attr Return -EOPNOTSUPP when attempting to offload an unsupported action ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-08netfilter: make two functions staticFlorian Westphal1-1/+0
They have no external callers anymore. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-08netfilter: nft_osf: Add version option supportFernando Fernandez Mancera1-3/+8
Add version option support to the nftables "osf" expression. Signed-off-by: Fernando Fernandez Mancera <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-08virtio: Honour 'may_reduce_num' in vring_create_virtqueueCornelia Huck1-1/+1
vring_create_virtqueue() allows the caller to specify via the may_reduce_num parameter whether the vring code is allowed to allocate a smaller ring than specified. However, the split ring allocation code tries to allocate a smaller ring on allocation failure regardless of what the caller specified. This may cause trouble for e.g. virtio-pci in legacy mode, which does not support ring resizing. (The packed ring code does not resize in any case.) Let's fix this by bailing out immediately in the split ring code if the requested size cannot be allocated and may_reduce_num has not been specified. While at it, fix a typo in the usage instructions. Fixes: 2a2d1382fe9d ("virtio: Add improved queue allocation API") Cc: [email protected] # v4.6+ Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Halil Pasic <[email protected]> Reviewed-by: Jens Freimann <[email protected]>
2019-04-08netfilter: replace NF_NAT_NEEDED with IS_ENABLED(CONFIG_NF_NAT)Florian Westphal1-1/+1
NF_NAT_NEEDED is true whenever nat support for either ipv4 or ipv6 is enabled. Now that the af-specific nat configuration switches have been removed, IS_ENABLED(CONFIG_NF_NAT) has the same effect. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-08netfilter: nf_tables: merge route type into coreFlorian Westphal1-0/+15
very little code, so it really doesn't make sense to have extra modules or even a kconfig knob for this. Merge them and make functionality available unconditionally. The merge makes inet family route support trivial, so add it as well here. Before: text data bss dec hex filename 835 832 0 1667 683 nft_chain_route_ipv4.ko 870 832 0 1702 6a6 nft_chain_route_ipv6.ko 111568 2556 529 114653 1bfdd nf_tables.ko After: text data bss dec hex filename 113133 2556 529 116218 1c5fa nf_tables.ko Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-08netfilter: optimize nf_inet_addr_cmpLi RongQing1-0/+7
optimize nf_inet_addr_cmp by 64bit xor computation similar to ipv6_addr_equal() Signed-off-by: Yuan Linsi <[email protected]> Signed-off-by: Li RongQing <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-08time: Introduce jiffies64_to_msecs()Li RongQing1-0/+1
there is a similar helper in net/netfilter/nf_tables_api.c, this maybe become a common request someday, so move it to time.c Signed-off-by: Zhang Yu <[email protected]> Signed-off-by: Li RongQing <[email protected]> Acked-by: John Stultz <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-08ARM: OMAP2+: pm33xx: Add support for rtc+ddr in self refresh modeKeerthy1-0/+5
Add support for rtc+ddr in self refresh mode. Add addtional pm hooks for save/restore and rtc suspend/resume. Signed-off-by: Keerthy <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
2019-04-08rtc: OMAP: Add support for rtc-only modeKeerthy1-0/+7
Prepare rtc driver for rtc-only with DDR in self-refresh mode. omap_rtc_power_off now should cater to two features: 1) RTC plus DDR in self-refresh is power a saving mode where in the entire system including the different voltage rails from PMIC are shutdown except the ones feeding on to RTC and DDR. DDR is kept in self-refresh hence the contents are preserved. RTC ALARM2 is connected to PMIC_EN line once we the ALARM2 is triggered we enter the mode with DDR in self-refresh and RTC Ticking. After a predetermined time an RTC ALARM1 triggers waking up the system[1]. The control goes to bootloader. The bootloader then checks RTC scratchpad registers to confirm it was an rtc_only wakeup and follows a different path, configure bare minimal clocks for ddr and then jumps to the resume address in another RTC scratchpad registers and transfers the control to Kernel. Kernel then restores the saved context. omap_rtc_power_off_program does the ALARM2 programming part. [1] http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf Page 2884 2) Power-off: This is usual poweroff mode. omap_rtc_power_off calls the above omap_rtc_power_off_program function and in addition to that programs the OMAP_RTC_PMIC_REG for any external wake ups for PMIC like the pushbutton and shuts off the PMIC. Hence the split in omap_rtc_power_off. Signed-off-by: Keerthy <[email protected]> Acked-by: Alexandre Belloni <[email protected]> [[email protected]: folded in a fix for compile warning] Signed-off-by: Tony Lindgren <[email protected]>
2019-04-08datagram: remove rendundant 'peeked' argumentPaolo Abeni1-3/+3
After commit a297569fe00a ("net/udp: do not touch skb->peeked unless really needed") the 'peeked' argument of __skb_try_recv_datagram() and friends is always equal to !!'flags & MSG_PEEK'. Since such argument is really a boolean info, and the callers have already 'flags & MSG_PEEK' handy, we can remove it and clean-up the code a bit. Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-08dma-mapping: remove leftover NULL device supportChristoph Hellwig1-3/+3
Most dma_map_ops implementations already had some issues with a NULL device, or did simply crash if one was fed to them. Now that we have cleaned up all the obvious offenders we can stop to pretend we support this mode. Signed-off-by: Christoph Hellwig <[email protected]>
2019-04-08block: don't use for-inside-for in bio_for_each_segment_allMing Lei2-12/+22
Commit 6dc4f100c175 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec") changes bio_for_each_segment_all() to use for-inside-for. This way breaks all bio_for_each_segment_all() call with error out branch via 'break', since now 'break' can only break from the inner loop. Fixes this issue by implementing bio_for_each_segment_all() via single 'for' loop, and now the logic is very similar with normal bvec iterator. Cc: Qu Wenruo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Omar Sandoval <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reported-and-Tested-by: Qu Wenruo <[email protected]> Fixes: 6dc4f100c175 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec") Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-08tracing: introduce TRACE_EVENT_NOP()Yafang Shao1-0/+15
Sometimes we want to define a tracepoint as a do-nothing function. So I introduce TRACE_EVENT_NOP, DECLARE_EVENT_CLASS_NOP and DEFINE_EVENT_NOP for this kind of usage. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-04-08Merge tag 'v5.1-rc3' into develLinus Walleij25-65/+137
Linux 5.1-rc3
2019-04-08drivers: Remove explicit invocations of mmiowb()Will Deacon1-2/+0
mmiowb() is now implied by spin_unlock() on architectures that require it, so there is no reason to call it from driver code. This patch was generated using coccinelle: @mmiowb@ @@ - mmiowb(); and invoked as: $ for d in drivers include/linux/qed sound; do \ spatch --include-headers --sp-file mmiowb.cocci --dir $d --in-place; done NOTE: mmiowb() has only ever guaranteed ordering in conjunction with spin_unlock(). However, pairing each mmiowb() removal in this patch with the corresponding call to spin_unlock() is not at all trivial, so there is a small chance that this change may regress any drivers incorrectly relying on mmiowb() to order MMIO writes between CPUs using lock-free synchronisation. If you've ended up bisecting to this commit, you can reintroduce the mmiowb() calls using wmb() instead, which should restore the old behaviour on all architectures other than some esoteric ia64 systems. Acked-by: Linus Torvalds <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-08mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessorsWill Deacon1-1/+10
Removing explicit calls to mmiowb() from driver code means that we must now call into the generic mmiowb_spin_{lock,unlock}() functions from the core spinlock code. In order to elide barriers following critical sections without any I/O writes, we also hook into the asm-generic I/O routines. Acked-by: Linus Torvalds <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-04-08cpufreq: intel_pstate: Update max frequency on global turbo changesRafael J. Wysocki1-0/+10
While the cpuinfo.max_freq value doesn't really matter for intel_pstate in the active mode, in the passive mode it is used by governors as the maximum physical frequency of the CPU and the results of governor computations generally depend on it. Also it is made available to user space via sysfs and it should match the current HW configuration. For this reason, make intel_pstate update cpuinfo.max_freq for all CPUs if it detects a global change of turbo frequency settings from "disable" to "enable" or the other way associated with a _PPC change notification from the platform firmware. Note that policy_is_inactive(), cpufreq_cpu_acquire(), cpufreq_cpu_release(), and cpufreq_set_policy() need to be made available to it for this purpose. Link: https://bugzilla.kernel.org/show_bug.cgi?id=200759 Reported-by: Gabriele Mazzotta <[email protected]> Tested-by: Gabriele Mazzotta <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
2019-04-08gpio: mmio: Drop bgpio_dir_invertedLinus Walleij1-4/+0
The direction inversion semantics are now handled by simply using the registers for in/out available, no need to keep track of inversion semantics exmplicitly anymore. Reviewed-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Jan Kotas <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
2019-04-08mtd: nand: omap: Fix comment in platform data using wrong Kconfig symbolMiquel Raynal1-1/+1
The symbol that is being used in the #if/#endif block is not the one which is mentioned at the bottom. Fixes: 93af53b8633c ("nand: omap2: Remove horrible ifdefs to fix module probe") Signed-off-by: Miquel Raynal <[email protected]>
2019-04-08mtd: rawnand: Get rid of chip->ecc_{strength,step}_dsBoris Brezillon1-8/+0
nand_device embeds a nand_ecc_req object which contains the minimum strength and step-size required by the NAND device. Drop the chip->ecc_{strength,step}_ds fields and use chip->base.eccreq.{strength,step_size} instead. Signed-off-by: Boris Brezillon <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Reviewed-by: Frieder Schrempf <[email protected]>
2019-04-08mtd: rawnand: Get rid of chip->numchipsBoris Brezillon1-4/+3
The same information is provided by nanddev_ntargets(). Signed-off-by: Boris Brezillon <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Reviewed-by: Frieder Schrempf <[email protected]>