Age | Commit message (Collapse) | Author | Files | Lines |
|
As we no longer support a try again if we cannot reenable the trigger
rename the function to reflect this. Also we don't do anything with
the value returned so stop it returning anything. For the few drivers
that didn't already print an error message in this patch, add such
a print.
Signed-off-by: Jonathan Cameron <[email protected]>
Reviewed-by: Lars-Peter Clausen <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Acked-by: Srinivas Pandruvada <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Srinivas Pandruvada <[email protected]>
Cc: Christian Oder <[email protected]>
Cc: Eugen Hristev <[email protected]>
Cc: Nishant Malpani <[email protected]>
Cc: Daniel Baluta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Use a heap allocated memory for the SPI transfer buffer. Using stack memory
can corrupt stack memory when using DMA on some systems.
This change moves the buffer from the stack of the trigger handler call to
the heap of the buffer of the state struct. The size increases takes into
account the alignment for the timestamp, which is 8 bytes.
The 'data' buffer is split into 'tx_buf' and 'rx_buf', to make a clearer
separation of which part of the buffer should be used for TX & RX.
Fixes: af3008485ea03 ("iio:adc: Add common code for ADI Sigma Delta devices")
Signed-off-by: Lars-Peter Clausen <[email protected]>
Signed-off-by: Alexandru Ardelean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
|
|
STEs format for Connect-X5 and Connect-X6DX different. Currently, on
Connext-X6DX the SW steering would break at some point when building STEs
w/o giving a proper error message. Fix this by checking the STE format of
the current device when initializing domain: add mlx5_ifc definitions for
Connect-X6DX SW steering, read FW capability to get the current format
version, and check this version when domain is being created.
Fixes: 26d688e33f88 ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Yevgeny Kliteynik <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The new OST has one global timer and two or four percpu timers, so there
will be three combinations in the upcoming new OST driver: the original
GLOBAL_TIMER + PERCPU_TIMER, the new GLOBAL_TIMER + PERCPU_TIMER0/1 and
GLOBAL_TIMER + PERCPU_TIMER0/1/2/3, For this, add the macro definition
about OST_CLK_PERCPU_TIMER0/1/2/3. And in order to ensure that all the
combinations work normally, the original ABI values of OST_CLK_PERCPU_TIMER
and OST_CLK_GLOBAL_TIMER need to be exchanged to ensure that in any
combinations, the clock can be registered (by calling clk_hw_register())
from index 0.
Before this patch, OST_CLK_PERCPU_TIMER and OST_CLK_GLOBAL_TIMER are only
used in two places, one is when using "assigned-clocks" to configure the
clocks in the DTS file; the other is when registering the clocks in the
sysost driver. When the values of these two ABIs are exchanged, the ABI
value used by sysost driver when registering the clock, and the ABI value
used by DTS when configuring the clock using "assigned-clocks" will also
change accordingly. Therefore, there is no situation that causes the wrong
clock to the configured. Therefore, exchanging ABI values will not cause
errors in the existing codes when registering and configuring the clocks.
Currently, in the mainline, only X1000 and X1830 are using sysost driver,
and the upcoming X2000 will also use sysost driver. This patch has been
tested on all three SoCs and all works fine.
Tested-by: 周正 (Zhou Zheng) <[email protected]>
Signed-off-by: 周琰杰 (Zhou Yanjie) <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
STATX_ATTR_MOUNT_ROOT and STATX_ATTR_DAX got merged with the same value,
so one of them needs fixing. Move STATX_ATTR_DAX.
While we're in here, clarify the value-matching scheme for some of the
attributes, and explain why the value for DAX does not match.
Fixes: 80340fe3605c ("statx: add mount_root")
Fixes: 712b2698e4c0 ("fs/stat: Define DAX statx attribute")
Link: https://lore.kernel.org/linux-fsdevel/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: David Howells <[email protected]>
Signed-off-by: Eric Sandeen <[email protected]>
Reviewed-by: David Howells <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Cc: <[email protected]> # 5.8
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Background:
Broadcast and multicast packages are enqueued for later processing.
This queue was previously hardcoded to 1000.
This proved insufficient for handling very high packet rates.
This resulted in packet drops for multicast.
While at the same time unicast worked fine.
The change:
This patch make the queue length adjustable to accommodate
for environments with very high multicast packet rate.
But still keeps the default value of 1000 unless specified.
The queue length is specified as a request per macvlan
using the IFLA_MACVLAN_BC_QUEUE_LEN parameter.
The actual used queue length will then be the maximum of
any macvlan connected to the same port. The actual used
queue length for the port can be retrieved (read only)
by the IFLA_MACVLAN_BC_QUEUE_LEN_USED parameter for verification.
This will be followed up by a patch to iproute2
in order to adjust the parameter from userspace.
Signed-off-by: Thomas Karlsson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
To complete the MMIO based mechanism, the fields: PASID, bus, device and
function of the Process Element Entry have to be filled. (See
OpenCAPI Power Platform Architecture document)
Hypervisor Process Element Entry
Word
0 1 .... 7 8 ...... 12 13 ..15 16.... 19 20 ........... 31
0 OSL Configuration State (0:31)
1 OSL Configuration State (32:63)
2 PASID | Reserved
3 Bus | Device |Function | Reserved
4 Reserved
5 Reserved
6 ....
Signed-off-by: Christophe Lombard <[email protected]>
Acked-by: Frederic Barrat <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Implement 'unsafe' version of put_compat_sigset()
For the bigendian, use unsafe_put_user() directly
to avoid intermediate copy through the stack.
For the littleendian, use a straight unsafe_copy_to_user().
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/537c7082ee309a0bb9c67a50c5d9dd929aedb82d.1597770847.git.christophe.leroy@csgroup.eu
|
|
We used LockDoc to derive locking rules for each member
of struct transaction_t.
Based on those results, we extended the existing documentation
by more members of struct transaction_t, and updated the existing
documentation.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexander Lochmann <[email protected]>
Signed-off-by: Horst Schirmeier <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
|
|
The handle_inode_event() interface was added as (quoting comment):
"a simple variant of handle_event() for groups that only have inode
marks and don't have ignore mask".
In other words, all backends except fanotify. The inotify backend
also falls under this category, but because it required extra arguments
it was left out of the initial pass of backends conversion to the
simple interface.
This results in code duplication between the generic helper
fsnotify_handle_event() and the inotify_handle_event() callback
which also happen to be buggy code.
Generalize the handle_inode_event() arguments and add the check for
FS_EXCL_UNLINK flag to the generic helper, so inotify backend could
be converted to use the simple interface.
Link: https://lore.kernel.org/r/[email protected]
CC: [email protected]
Fixes: b9a1b9772509 ("fsnotify: create method handle_inode_event() in fsnotify_operations")
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
|
|
The FWHT stateless 'uAPI' was staging and marked explicitly in the
V4L2 specification that it will change and is unstable.
Note that these control IDs were never exported as a public API,
they were only defined in kernel-local headers (fwht-ctrls.h).
Now, the FWHT stateless controls is ready to be part
of the stable uAPI.
While not too late:
- Rename V4L2_CID_MPEG_VIDEO_FWHT_PARAMS to V4L2_CID_STATELESS_FWHT_PARAMS.
- Move the contents of fwht-ctrls.h to v4l2-controls.h.
- Move the public parts of drivers/media/test-drivers/vicodec/codec-fwht.h
to v4l2-controls.h.
- Add V4L2_CTRL_TYPE_FWHT_PARAMS control initialization and validation.
- Add p_fwht_params to struct v4l2_ext_control.
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
The H.264 stateless 'uAPI' was staging and marked explicitly in the
V4L2 specification that it will change and is unstable.
Note that these control IDs were never exported as a public API,
they were only defined in kernel-local headers (h264-ctrls.h).
Now, the H264 stateless controls is ready to be part
of the stable uAPI.
While not too late, let's rename them and re-number their
control IDs, moving them to the newly created stateless
control class, and updating all the drivers accordingly.
Signed-off-by: Ezequiel Garcia <[email protected]>
Tested-by: Jernej Skrabec <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Move the H264 stateless control types out of staging,
and re-number them to avoid any confusion.
Signed-off-by: Ezequiel Garcia <[email protected]>
Tested-by: Jernej Skrabec <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Since we are ready to stabilize the H264 stateless API,
start by first moving the parsed H264 pixel format.
Signed-off-by: Ezequiel Garcia <[email protected]>
Tested-by: Jernej Skrabec <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Add a new control class to hold the stateless codecs controls
that are ready to be moved out of staging.
Signed-off-by: Ezequiel Garcia <[email protected]>
Tested-by: Jernej Skrabec <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Check that all the fields that correspond or are related
to a H264 specification syntax element have legal values.
Signed-off-by: Ezequiel Garcia <[email protected]>
Tested-by: Jernej Skrabec <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Avoid including h264-ctrls.h, vp8-ctrls.h, etc,
and instead just include v4l2-ctrls.h which does the right
thing.
This is in preparation for moving the stateless controls
out of staging, which will mean removing some of these headers.
Signed-off-by: Ezequiel Garcia <[email protected]>
Tested-by: Jernej Skrabec <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
For historical reasons, stateful codec controls are named
as {}_MPEG_{}. While we can't at this point sanely
change all control IDs (such as V4L2_CID_MPEG_VIDEO_VP8_FRAME_HEADER),
we can least change the more meaningful macros such as classes
macros.
Signed-off-by: Ezequiel Garcia <[email protected]>
Tested-by: Jernej Skrabec <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
The kernel-doc markup is wrong: it is asking the tool to document
struct refcount_struct, instead of documenting typedef refcount_t.
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/afb9bb1e675bf5f72a34a55d780779d7d5916b4c.1606823973.git.mchehab+huawei@kernel.org
|
|
Changeset cd8084f91c02 ("locking/lockdep: Apply crossrelease to completions")
added a CONFIG_LOCKDEP_COMPLETE (that was later renamed to
CONFIG_LOCKDEP_COMPLETIONS).
Such changeset renamed the init_completion, and add a macro
that would either run a modified version or the original code.
However, such code reported too many false positives. So, it
ended being dropped later on by
changeset e966eaeeb623 ("locking/lockdep: Remove the cross-release locking checks").
Yet, the define remained there as just:
#define init_completion(x) __init_completion(x)
Get rid of the define, and return __init_completion() function
to its original name.
Fixes: e966eaeeb623 ("locking/lockdep: Remove the cross-release locking checks")
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/e657bfc533545c185b1c3c55926a449ead56a88b.1606823973.git.mchehab+huawei@kernel.org
|
|
More consistent naming should make it easier to untangle the _Generic
token pasting maze called __seqprop().
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
When building with W=2, there is a flood of warnings about the seqlock
macros shadowing local variables:
19806 linux/seqlock.h:331:11: warning: declaration of 'seq' shadows a previous local [-Wshadow]
48 linux/seqlock.h:348:11: warning: declaration of 'seq' shadows a previous local [-Wshadow]
8 linux/seqlock.h:379:11: warning: declaration of 'seq' shadows a previous local [-Wshadow]
Prefix the local variables to make the warning useful elsewhere again.
Fixes: 52ac39e5db51 ("seqlock: seqcount_t: Implement all read APIs as statement expressions")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
A number of architectures have non-pagetable aligned huge/large pages.
For such architectures a leaf can actually be part of a larger entry.
Provide generic helpers to determine the size of a page-table leaf.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
In order to write another lockless page-table walker, we need
gup_get_pte() exposed. While doing that, rename it to
ptep_get_lockless() to match the existing ptep_get() naming.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Add new api devm_drm_irq_install() to register interrupts,
no need to call drm_irq_uninstall() when the drm module is removed.
Signed-off-by: Tian Tao <[email protected]>
Reviewed-by: Thomas Zimmermann <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Added two ioctl to decompress/compress explicitly the compression
enabled file in "compress_mode=user" mount option.
Using these two ioctls, the users can make a control of compression
and decompression of their files.
Signed-off-by: Daeho Jeong <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
These values will be used by various drivers implementing the VP8
stateless API.
This had been suggested by Ezequiel Garcia for the Cedrus VP8 driver.
Signed-off-by: Emmanuel Gil Peyrot <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Since 5.10-rc1 i.MX is a devicetree-only platform, so simplify the code
by removing the unused non-DT support.
Signed-off-by: Fabio Estevam <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Userspace might want to implement a policy to temporarily disregard input
from certain devices, including not treating them as wakeup sources.
An example use case is a laptop, whose keyboard can be folded under the
screen to create tablet-like experience. The user then must hold the laptop
in such a way that it is difficult to avoid pressing the keyboard keys. It
is therefore desirable to temporarily disregard input from the keyboard,
until it is folded back. This obviously is a policy which should be kept
out of the kernel, but the kernel must provide suitable means to implement
such a policy.
This patch adds a sysfs interface for exactly this purpose.
To implement the said interface it adds an "inhibited" property to struct
input_dev, and effectively creates four states a device can be in: closed
uninhibited, closed inhibited, open uninhibited, open inhibited. It also
defers calling driver's ->open() and ->close() to until they are actually
needed, e.g. it makes no sense to prepare the underlying device for
generating events (->open()) if the device is inhibited.
uninhibit
closed <------------ closed
uninhibited ------------> inhibited
| ^ inhibit | ^
1st | | 1st | |
open | | open | |
| | | |
| | last | | last
| | close | | close
v | uninhibit v |
open <------------ open
uninhibited ------------> inhibited
The top inhibit/uninhibit transition happens when users == 0.
The bottom inhibit/uninhibit transition happens when users > 0.
The left open/close transition happens when !inhibited.
The right open/close transition happens when inhibited.
Due to all transitions being serialized with dev->mutex, it is impossible
to have "diagonal" transitions between closed uninhibited and open
inhibited or between open uninhibited and closed inhibited.
No new callbacks are added to drivers, because their open() and close()
serve exactly the purpose to tell the driver to start/stop providing
events to the input core. Consequently, open() and close() - if provided
- are called in both inhibit and uninhibit paths.
Signed-off-by: Patrik Fimml <[email protected]>
Co-developed-by: Andrzej Pietrasiewicz <[email protected]>
Signed-off-by: Andrzej Pietrasiewicz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
|
|
A helper function for drivers to decide if the device is used or not.
A lockdep check is introduced as inspecting ->users should be done under
input device's mutex.
Signed-off-by: Andrzej Pietrasiewicz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
|
|
This patch supports to store chksum value with compressed
data, and verify the integrality of compressed data while
reading the data.
The feature can be enabled through specifying mount option
'compress_chksum'.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This shifts the responsibility of setting up dentry operations from
fscrypt to the individual filesystems, allowing them to have their own
operations while still setting fscrypt's d_revalidate as appropriate.
Most filesystems can just use generic_set_encrypted_ci_d_ops, unless
they have their own specific dentry operations as well. That operation
will set the minimal d_ops required under the circumstances.
Since the fscrypt d_ops are set later on, we must set all d_ops there,
since we cannot adjust those later on. This should not result in any
change in behavior.
Signed-off-by: Daniel Rosenberg <[email protected]>
Acked-by: Theodore Ts'o <[email protected]>
Acked-by: Eric Biggers <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This adds a function to set dentry operations at lookup time that will
work for both encrypted filenames and casefolded filenames.
A filesystem that supports both features simultaneously can use this
function during lookup preparations to set up its dentry operations once
fscrypt no longer does that itself.
Currently the casefolding dentry operation are always set if the
filesystem defines an encoding because the features is toggleable on
empty directories. Unlike in the encryption case, the dentry operations
used come from the parent. Since we don't know what set of functions
we'll eventually need, and cannot change them later, we enable the
casefolding operations if the filesystem supports them at all.
By splitting out the various cases, we support as few dentry operations
as we can get away with, maximizing compatibility with overlayfs, which
will not function if a filesystem supports certain dentry_operations.
Signed-off-by: Daniel Rosenberg <[email protected]>
Reviewed-by: Theodore Ts'o <[email protected]>
Reviewed-by: Eric Biggers <[email protected]>
Reviewed-by: Gabriel Krisman Bertazi <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
Added a new F2FS_IOC_SET_COMPRESS_OPTION ioctl to change file
compression option of a file.
struct f2fs_comp_option {
u8 algorithm; => compression algorithm
=> 0:lzo, 1:lz4, 2:zstd, 3:lzorle
u8 log_cluster_size; => log scale cluster size
=> 2 ~ 8
};
struct f2fs_comp_option option;
option.algorithm = 1;
option.log_cluster_size = 7;
ioctl(fd, F2FS_IOC_SET_COMPRESS_OPTION, &option);
Signed-off-by: Daeho Jeong <[email protected]>
[Chao Yu: remove f2fs_is_compress_algorithm_valid()]
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 features for v5.11:
Highlights:
- Enable big joiner to join two pipes to one port to overcome pipe restrictions
(Manasi, Ville, Maarten)
Display:
- More DG1 enabling (Lucas, Aditya)
- Fixes to cases without display (Lucas, José, Jani)
- Initial PSR state improvements (José)
- JSL eDP vswing updates (Tejas)
- Handle EDID declared max 16 bpc (Ville)
- Display refactoring (Ville)
Other:
- GVT features
- Backmerge
Signed-off-by: Dave Airlie <[email protected]>
From: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Do not use rlimit-based memory accounting for bpf progs. It has been
replaced with memcg-based memory accounting.
Signed-off-by: Roman Gushchin <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Remove rlimit-based accounting infrastructure code, which is not used
anymore.
To provide a backward compatibility, use an approximation of the
bpf map memory footprint as a "memlock" value, available to a user
via map info. The approximation is based on the maximal number of
elements and key and value sizes.
Signed-off-by: Roman Gushchin <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Bpf maps can be updated from an interrupt context and in such
case there is no process which can be charged. It makes the memory
accounting of bpf maps non-trivial.
Fortunately, after commit 4127c6504f25 ("mm: kmem: enable kernel
memcg accounting from interrupt contexts") and commit b87d8cefe43c
("mm, memcg: rework remote charging API to support nesting")
it's finally possible.
To make the ownership model simple and consistent, when the map
is created, the memory cgroup of the current process is recorded.
All subsequent allocations related to the bpf map are charged to
the same memory cgroup. It includes allocations made by any processes
(even if they do belong to a different cgroup) and from interrupts.
This commit introduces 3 new helpers, which will be used by following
commits to enable the accounting of bpf maps memory:
- bpf_map_kmalloc_node()
- bpf_map_kzalloc()
- bpf_map_alloc_percpu()
They are wrapping popular memory allocation functions. They set
the active memory cgroup to the map's memory cgroup and add
__GFP_ACCOUNT to the passed gfp flags. Then they call into
the corresponding memory allocation function and restore
the original active memory cgroup.
These helpers are supposed to use everywhere except the map creation
path. During the map creation when the map structure is allocated by
itself, it cannot be passed to those helpers. In those cases default
memory allocation function will be used with the __GFP_ACCOUNT flag.
Signed-off-by: Roman Gushchin <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
PageKmemcg flag is currently defined as a page type (like buddy, offline,
table and guard). Semantically it means that the page was accounted as a
kernel memory by the page allocator and has to be uncharged on the
release.
As a side effect of defining the flag as a page type, the accounted page
can't be mapped to userspace (look at page_has_type() and comments above).
In particular, this blocks the accounting of vmalloc-backed memory used
by some bpf maps, because these maps do map the memory to userspace.
One option is to fix it by complicating the access to page->mapcount,
which provides some free bits for page->page_type.
But it's way better to move this flag into page->memcg_data flags.
Indeed, the flag makes no sense without enabled memory cgroups and memory
cgroup pointer set in particular.
This commit replaces PageKmemcg() and __SetPageKmemcg() with
PageMemcgKmem() and an open-coded OR operation setting the memcg pointer
with the MEMCG_DATA_KMEM bit. __ClearPageKmemcg() can be simple deleted,
as the whole memcg_data is zeroed at once.
As a bonus, on !CONFIG_MEMCG build the PageMemcgKmem() check will be
compiled out.
Signed-off-by: Roman Gushchin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The lowest bit in page->memcg_data is used to distinguish between struct
memory_cgroup pointer and a pointer to a objcgs array. All checks and
modifications of this bit are open-coded.
Let's formalize it using page memcg flags, defined in enum
page_memcg_data_flags.
Additional flags might be added later.
Signed-off-by: Roman Gushchin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
To gather all direct accesses to struct page's memcg_data field in one
place, let's introduce 3 new helpers to use in the slab accounting code:
struct obj_cgroup **page_objcgs(struct page *page);
struct obj_cgroup **page_objcgs_check(struct page *page);
bool set_page_objcgs(struct page *page, struct obj_cgroup **objcgs);
They are similar to the corresponding API for generic pages, except that
the setter can return false, indicating that the value has been already
set from a different thread.
Signed-off-by: Roman Gushchin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Patch series "mm: allow mapping accounted kernel pages to userspace", v6.
Currently a non-slab kernel page which has been charged to a memory cgroup
can't be mapped to userspace. The underlying reason is simple: PageKmemcg
flag is defined as a page type (like buddy, offline, etc), so it takes a
bit from a page->mapped counter. Pages with a type set can't be mapped to
userspace.
But in general the kmemcg flag has nothing to do with mapping to
userspace. It only means that the page has been accounted by the page
allocator, so it has to be properly uncharged on release.
Some bpf maps are mapping the vmalloc-based memory to userspace, and their
memory can't be accounted because of this implementation detail.
This patchset removes this limitation by moving the PageKmemcg flag into
one of the free bits of the page->mem_cgroup pointer. Also it formalizes
accesses to the page->mem_cgroup and page->obj_cgroups using new helpers,
adds several checks and removes a couple of obsolete functions. As the
result the code became more robust with fewer open-coded bit tricks.
This patch (of 4):
Currently there are many open-coded reads of the page->mem_cgroup pointer,
as well as a couple of read helpers, which are barely used.
It creates an obstacle on a way to reuse some bits of the pointer for
storing additional bits of information. In fact, we already do this for
slab pages, where the last bit indicates that a pointer has an attached
vector of objcg pointers instead of a regular memcg pointer.
This commits uses 2 existing helpers and introduces a new helper to
converts all read sides to calls of these helpers:
struct mem_cgroup *page_memcg(struct page *page);
struct mem_cgroup *page_memcg_rcu(struct page *page);
struct mem_cgroup *page_memcg_check(struct page *page);
page_memcg_check() is intended to be used in cases when the page can be a
slab page and have a memcg pointer pointing at objcg vector. It does
check the lowest bit, and if set, returns NULL. page_memcg() contains a
VM_BUG_ON_PAGE() check for the page not being a slab page.
To make sure nobody uses a direct access, struct page's
mem_cgroup/obj_cgroups is converted to unsigned long memcg_data.
Signed-off-by: Roman Gushchin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Currently it's impossible to delete files that use an unsupported
encryption policy, as the kernel will just return an error when
performing any operation on the top-level encrypted directory, even just
a path lookup into the directory or opening the directory for readdir.
More specifically, this occurs in any of the following cases:
- The encryption context has an unrecognized version number. Current
kernels know about v1 and v2, but there could be more versions in the
future.
- The encryption context has unrecognized encryption modes
(FSCRYPT_MODE_*) or flags (FSCRYPT_POLICY_FLAG_*), an unrecognized
combination of modes, or reserved bits set.
- The encryption key has been added and the encryption modes are
recognized but aren't available in the crypto API -- for example, a
directory is encrypted with FSCRYPT_MODE_ADIANTUM but the kernel
doesn't have CONFIG_CRYPTO_ADIANTUM enabled.
It's desirable to return errors for most operations on files that use an
unsupported encryption policy, but the current behavior is too strict.
We need to allow enough to delete files, so that people can't be stuck
with undeletable files when downgrading kernel versions. That includes
allowing directories to be listed and allowing dentries to be looked up.
Fix this by modifying the key setup logic to treat an unsupported
encryption policy in the same way as "key unavailable" in the cases that
are required for a recursive delete to work: preparing for a readdir or
a dentry lookup, revalidating a dentry, or checking whether an inode has
the same encryption policy as its parent directory.
Reviewed-by: Andreas Dilger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
|
|
Now that fscrypt_get_encryption_info() is only called from files in
fs/crypto/ (due to all key setup now being handled by higher-level
helper functions instead of directly by filesystems), unexport it and
move its declaration to fscrypt_private.h.
Reviewed-by: Andreas Dilger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
|
|
fscrypt_require_key() is now only used by files in fs/crypto/. So
reduce its visibility to fscrypt_private.h. This is also a prerequsite
for unexporting fscrypt_get_encryption_info().
Reviewed-by: Andreas Dilger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
|
|
In preparation for reducing the visibility of fscrypt_require_key() by
moving it to fscrypt_private.h, move the call to it from
fscrypt_prepare_setattr() to an out-of-line function.
Reviewed-by: Andreas Dilger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
|
|
The last remaining use of fscrypt_get_encryption_info() from filesystems
is for readdir (->iterate_shared()). Every other call is now in
fs/crypto/ as part of some other higher-level operation.
We need to add a new argument to fscrypt_get_encryption_info() to
indicate whether the encryption policy is allowed to be unrecognized or
not. Doing this is easier if we can work with high-level operations
rather than direct filesystem use of fscrypt_get_encryption_info().
So add a function fscrypt_prepare_readdir() which wraps the call to
fscrypt_get_encryption_info() for the readdir use case.
Reviewed-by: Andreas Dilger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Eric Biggers <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
mlx5-next-2020-12-02
Low level mlx5 updates required by both netdev and rdma trees.
* tag 'mlx5-next-2020-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Treat host PF vport as other (non eswitch manager) vport
net/mlx5: Enable host PF HCA after eswitch is initialized
net/mlx5: Rename peer_pf to host_pf
net/mlx5: Make API mlx5_core_is_ecpf accept const pointer
net/mlx5: Export steering related functions
net/mlx5: Expose other function ifc bits
net/mlx5: Expose IP-in-IP TX and RX capability bits
net/mlx5: Update the hardware interface definition for vhca state
net/mlx5: Update the list of the PCI supported devices
net/mlx5: Avoid exposing driver internal command helpers
net/mlx5: Add ts_cqe_to_dest_cqn related bits
net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits
net/mlx5: Check dr mask size against mlx5_match_param size
net/mlx5: Add sampler destination type
net/mlx5: Add sample offload hardware bits and structures
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Implementation of support for parameterized testing in KUnit. This
approach requires the creation of a test case using the
KUNIT_CASE_PARAM() macro that accepts a generator function as input.
This generator function should return the next parameter given the
previous parameter in parameterized tests. It also provides a macro to
generate common-case generators based on arrays. Generators may also
optionally provide a human-readable description of parameters, which is
displayed where available.
Note, currently the result of each parameter run is displayed in
diagnostic lines, and only the overall test case output summarizes
TAP-compliant success or failure of all parameter runs. In future, when
supported by kunit-tool, these can be turned into subsubtest outputs.
Signed-off-by: Arpitha Raghunandan <[email protected]>
Co-developed-by: Marco Elver <[email protected]>
Signed-off-by: Marco Elver <[email protected]>
Reviewed-by: David Gow <[email protected]>
Tested-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
I have to now lock/unlock socket for the bind hook execution.
That shouldn't cause any overhead because the socket is unbound
and shouldn't receive any traffic.
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrey Ignatov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|