Age | Commit message | Author | Files | Lines |
|
Per the ongoing standardisation process, when virtual HCAs are present
in a network, traffic is routed based on a destination GID. In order to
access the SA we use the well known SA GID.
We also add a GRH required boolean field to the port attributes which is
used to report to the verbs consumer whether this port is connected to a
virtual network. We use this field to determine whether we need to create
an address vector with GRH to access the subnet administrator. We clear
the port attributes struct before calling the hardware driver to make
sure the default remains that GRH is not required.
Signed-off-by: Eli Cohen <[email protected]>
Reviewed-by: Or Gerlitz <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The subnet prefix is a part of the port_info MAD returned and should be
available at the ib_port_attr struct. We define it here and provide a
default implementation in case the hardware driver does not provide one.
The subnet prefix is required when creating the address vector to access
the SA in networks where GRH must be used.
Signed-off-by: Eli Cohen <[email protected]>
Reviewed-by: Or Gerlitz <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Fix the condition that dictates when MAD_IFC should be used. According
to firmware specifications, MAD_IFC commands must be used only if the
ib_virt capability is off.
Signed-off-by: Eli Cohen <[email protected]>
Reviewed-by: Or Gerlitz <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add two new NLAs to support configuration of Infiniband node or port
GUIDs. New applications can choose to use this interface to configure
GUIDs with iproute2 with commands such as:
ip link set dev ib0 vf 0 node_guid 00:02:c9:03:00:21:6e:70
ip link set dev ib0 vf 0 port_guid 00:02:c9:03:00:21:6e:78
A new ndo, ndo_set_vf_guid, is introduced to notify the net device of the
request to change the GUID.
Signed-off-by: Eli Cohen <[email protected]>
Reviewed-by: Or Gerlitz <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The old device_cap_flags bitmask was a u32, and all 32 of its bits
are already defined. To overcome this limit, convert the
device_cap_flags variable to u64.
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
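
As an illustration of why the wider type is needed, here is a minimal sketch. The flag names EXAMPLE_CAP_LAST_32 and EXAMPLE_CAP_NEW are invented for this example and are not real device_cap_flags values; the helpers are likewise hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags for illustration only; not real device_cap_flags bits. */
#define EXAMPLE_CAP_LAST_32 (1ULL << 31) /* last bit that fits in a u32 */
#define EXAMPLE_CAP_NEW     (1ULL << 32) /* first bit that needs a u64 */

/* Test a capability bit in a 64-bit flags word. */
static bool cap_is_set(uint64_t flags, uint64_t cap)
{
	return (flags & cap) != 0;
}

/* Squeeze the flags through a 32-bit variable: bits 32..63 are dropped. */
static uint32_t truncate_to_u32(uint64_t flags)
{
	return (uint32_t)flags;
}
```

With both example flags set, cap_is_set() sees EXAMPLE_CAP_NEW in the 64-bit word, but not after the value has passed through truncate_to_u32().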
|
|
Zero-initializing variables and structures at declaration eliminates
the need to set them to zero explicitly afterwards.
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Define the necessary hardware structures for the offload
arithmetic capabilities and read/cache them on driver load.
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The device capability function was called in the same way in all places:
twice for every queried parameter, with only the HCA capability mode
differing between the two calls.
This change unifies these calls into one function.
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add caching of maximum device capability of ATOMIC endian mode.
Fixes: f91e6d8941bf ('net/mlx5_core: Add setting ATOMIC endian mode')
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm[64] perf updates from Will Deacon:
"I have another mixed bag of ARM-related perf patches here.
It's about 25% CPU and 75% interconnect, but with drivers/bus/
languishing without an obvious maintainer or tree, Olof and I agreed
to keep all of these PMU patches together. I suspect a whole load of
code from drivers/bus/arm-* can be moved under drivers/perf/, so
that's on the radar for the future.
Summary:
- Initial support for ARMv8.1 CPU PMUs
- Support for the CPU PMU in Cavium ThunderX
- CPU PMU support for systems running 32-bit Linux in secure mode
- Support for the system PMU in ARM CCI-550 (Cache Coherent Interconnect)"
* tag 'arm64-perf' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (26 commits)
drivers/perf: arm_pmu: avoid NULL dereference when not using devicetree
arm64: perf: Extend ARMV8_EVTYPE_MASK to include PMCR.LC
arm-cci: remove unused variable
arm-cci: don't return value from void function
arm-cci: make private functions static
arm-cci: CoreLink CCI-550 PMU driver
arm-cci500: Rearrange PMU driver for code sharing with CCI-550 PMU
arm-cci: CCI-500: Work around PMU counter writes
arm-cci: Provide hook for writing to PMU counters
arm-cci: Add helper to enable PMU without synchornising counters
arm-cci: Add routines to save/restore all counters
arm-cci: Get the status of a counter
arm-cci: write_counter: Remove redundant check
arm-cci: Delay PMU counter writes to pmu::pmu_enable
arm-cci: Refactor CCI PMU enable/disable methods
arm-cci: Group writes to counter
arm-cci: fix handling cpumask_any_but return value
arm-cci: simplify sysfs attr handling
drivers/perf: arm_pmu: implement CPU_PM notifier
arm64: dts: Add Cavium ThunderX specific PMU
...
|
|
The first argument of WARN_ON() is a condition, so it means the warning
message here will just be the name without the ->qp_num information.
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch replaces the obsolete crypto hash interface with shash
and resolves a build failure after the merge of the rdma tree,
which is caused by the removal of the crypto hash interface.
Remove CRYPTO_ALG_ASYNC from crypto_alloc_shash(),
because shash is by definition sync only.
Signed-off-by: Mustafa Ismail <[email protected]>
Signed-off-by: Tatyana Nikolova <[email protected]>
Acked-by: Herbert Xu <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC architecture updates from Vineet Gupta:
- Big Endian io accessors fix [Lada]
- Spellos fixes [Adam]
- Fix for DW GMAC breakage [Alexey]
- Making DMA API 64-bit ready
- Shutting up -Wmaybe-uninitialized noise for ARC
- Other minor fixes here and there, comments update
* tag 'arc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (21 commits)
ARCv2: ioremap: Support dynamic peripheral address space
ARC: dma: reintroduce platform specific dma<->phys
ARC: dma: ioremap: use phys_addr_t consistenctly in code paths
ARC: dma: pass_phys() not sg_virt() to cache ops
ARC: dma: non-coherent pages need V-P mapping if in HIGHMEM
ARC: dma: Use struct page based page allocator helpers
ARC: build: Turn off -Wmaybe-uninitialized for ARC gcc 4.8
ARC: [plat-axs10x] add Ethernet PHY description in .dts
arc: use of_platform_default_populate() to populate default bus
ARC: thp: unbork !CONFIG_TRANSPARENT_HUGEPAGE build
arc: [plat-nsimosci*] use ezchip network driver
ARCv2: LLSC: software backoff is NOT needed starting HS2.1c
ARC: mm: Use virt_to_pfn() for addr >> PAGE_SHIFT pattern
ARC: [plat-nsim] document ranges
ARC: build: Better way to detect ISA compatible toolchain
ARCv2: Allow enabling PAE40 w/o HIGHMEM
ARC: [BE] readl()/writel() to work in Big Endian CPU configuration
ARC: [*defconfig] No need to specify CONFIG_CROSS_COMPILE
ARC: [BE] Select correct CROSS_COMPILE prefix
ARC: bitops: Remove non relevant comments
...
|
|
This commit adds a cache eviction algorithm for the SDMA
user buffer cache.
Besides the interval RB tree used for node lookup, the cache
nodes are also arranged in a doubly-linked list. When a node is
used, it is put at the beginning of the list. Less frequently
used nodes naturally move to the tail of the list.
When the cache limit is reached, the eviction code starts
traversing the linked list in reverse, freeing buffers until
enough space has been freed to fit the new user buffer. This
guarantees that only the least used cache nodes will be removed
from the cache.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
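
The eviction scheme described above can be sketched in plain user-space C. This is a simplified illustration, not the driver's code: the interval RB tree used for lookup is left out, and every name here (lru_node, lru_cache, lru_touch, lru_evict, lru_insert) is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

struct lru_node {
	struct lru_node *prev, *next;
	size_t size;              /* bytes accounted to this buffer */
};

struct lru_cache {
	struct lru_node *head;    /* most recently used */
	struct lru_node *tail;    /* least recently used */
	size_t used;              /* bytes currently cached */
	size_t limit;             /* cache limit in bytes */
};

static void lru_unlink(struct lru_cache *c, struct lru_node *n)
{
	if (n->prev) n->prev->next = n->next; else c->head = n->next;
	if (n->next) n->next->prev = n->prev; else c->tail = n->prev;
	n->prev = n->next = NULL;
}

static void lru_push_front(struct lru_cache *c, struct lru_node *n)
{
	n->prev = NULL;
	n->next = c->head;
	if (c->head) c->head->prev = n; else c->tail = n;
	c->head = n;
}

/* A node that is used moves to the beginning of the list. */
static void lru_touch(struct lru_cache *c, struct lru_node *n)
{
	lru_unlink(c, n);
	lru_push_front(c, n);
}

/* Walk the list from the tail, freeing nodes until `needed` bytes fit. */
static size_t lru_evict(struct lru_cache *c, size_t needed)
{
	size_t evicted = 0;

	while (c->tail && c->used + needed > c->limit) {
		struct lru_node *victim = c->tail;

		lru_unlink(c, victim);
		c->used -= victim->size;
		free(victim);
		evicted++;
	}
	return evicted;
}

/* Insert a new buffer, evicting least recently used nodes if required. */
static struct lru_node *lru_insert(struct lru_cache *c, size_t size)
{
	struct lru_node *n;

	if (size > c->limit)
		return NULL;          /* can never fit */
	lru_evict(c, size);
	n = malloc(sizeof(*n));
	if (!n)
		return NULL;
	n->size = size;
	lru_push_front(c, n);
	c->used += size;
	return n;
}
```

Because lru_touch() always moves a node to the head, the tail holds the node that has gone unused the longest, so evicting from the tail removes the least recently used buffers first.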
|
|
Use the new function to query whether the expected receive
user buffer can be pinned successfully. This requires that
a new variable be added to the hfi1_filedata structure used
to hold the number of pages pinned by the expected receive
code.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This change adds a pointer to the process mm_struct when
calling hfi1_release_user_pages().
Previously, the function used the mm_struct of the current
process to adjust the number of pinned pages. However, in some
cases, namely when unpinning pages due to an MMU notifier call,
we do not want to enter that code block, as it will cause a deadlock
(the MMU notifiers take the process' mmap_sem prior to calling
the callbacks).
By allowing the caller to specify the pointer to the mm_struct,
the caller has finer control over that part of hfi1_release_user_pages().
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
System administrators can use the locked memory
ulimit setting to set the maximum amount of memory
a user can lock/pin. However, this setting alone is not
enough to guarantee good operation of the hfi1 driver
due to the fact that the setting does not have fine
enough granularity to account for the limit being used
by multiple user processes and caches.
Therefore, a better limiting algorithm is needed. This
is where the new hfi1_can_pin_pages() function and the
cache_size module parameter come in.
The function works by looking at the ulimit and cache_size
value to compute a cache size. The algorithm examines the
ulimit value and, if it is not "unlimited", computes a
per-cache limit based on the number of configured user
contexts.
After that, the lower of the two - cache_size and computed
per-cache limit - is used.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
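
The limit computation described above might look roughly like the following sketch. The even split of the ulimit across user contexts is an assumption made for illustration, since the message does not give the exact formula; example_cache_limit(), example_can_pin() and EXAMPLE_UNLIMITED are hypothetical names, and all quantities are counted in pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for an "unlimited" locked-memory ulimit (assumption). */
#define EXAMPLE_UNLIMITED UINT64_MAX

/*
 * Effective per-cache limit in pages: the lower of the cache_size
 * parameter and a per-cache share of the ulimit.
 * Assumption: the share is an even split across user contexts.
 */
static uint64_t example_cache_limit(uint64_t ulimit_pages,
				    uint64_t cache_size_pages,
				    unsigned int nctxts)
{
	uint64_t per_cache;

	if (ulimit_pages == EXAMPLE_UNLIMITED || nctxts == 0)
		return cache_size_pages;

	per_cache = ulimit_pages / nctxts;
	return per_cache < cache_size_pages ? per_cache : cache_size_pages;
}

/* Can `npages` more pages be pinned without exceeding `limit`? */
static bool example_can_pin(uint64_t pinned, uint64_t npages, uint64_t limit)
{
	/* Written to avoid overflow in pinned + npages. */
	return npages <= limit && pinned <= limit - npages;
}
```

For instance, with a ulimit of 1024 pages, a cache_size of 256 pages and 8 contexts, the sketch yields a limit of 128 pages; with an unlimited ulimit, cache_size alone applies.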
|
|
Add support for caching of user buffers used for SDMA
transfers. This change improves performance by
avoiding repeatedly pinning the pages of buffers, which
are being re-used by the application.
While the cost of the pinning operation has been made
heavier by adding the extra code to search the cache tree,
re-allocate pages arrays, and future cache evictions,
that cost will be amortized against the savings when the
same buffer is re-used. It is also worth noting that in
most cases, the cost of pinning should be much lower due
to the buffer already being in the cache.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Last address values for intervals in the interval RB tree
nodes should be non-inclusive in order to avoid confusing
ranges.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
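
A small sketch of the non-inclusive convention, using a hypothetical struct range rather than the driver's RB node type: with `last` pointing one past the final byte, two adjacent buffers do not appear to overlap, and the length is simply last minus start.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open range [start, last): `last` is one past the final byte. */
struct range {
	uint64_t start;
	uint64_t last;
};

/* Build the range covered by a buffer of `len` bytes at `addr`. */
static struct range range_from_buf(uint64_t addr, uint64_t len)
{
	return (struct range){ addr, addr + len };
}

static uint64_t range_len(struct range r)
{
	return r.last - r.start;
}

/* Two half-open ranges overlap only if each starts before the other ends. */
static bool ranges_overlap(struct range a, struct range b)
{
	return a.start < b.last && b.start < a.last;
}
```

Under this convention a buffer at 0x1000 of length 0x1000 and one at 0x2000 are reported as disjoint, whereas a range that straddles 0x2000 overlaps both.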
|
|
This commit adds a filter callback, which can be used to filter
out interval RB nodes matching a certain interval down to a
single one.
This is needed for the upcoming SDMA-side caching where buffers
will need to be filtered by their virtual address.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Interval RB trees provide their own searching function,
which also takes care of determining the path through
the tree that should be taken.
This makes the compare callback unnecessary.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add a new tracepoint type for the MMU functions and calls
to that tracepoint to allow tracing of MMU functionality.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The interval RB trees can handle RB nodes which
hold ranged information. This is exactly the usage
for the buffer cache implemented in the expected
receive code path.
Convert the MMU/RB functions to use the interval RB
tree API. This will help with future users of the
caching API, as well.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Tell the remove MMU/RB callback if it's being called as
part of a memory invalidation or not. This can be important
in preventing a deadlock if the remove callback attempts to
take the mmap_sem semaphore because the kernel's MMU
invalidation functions have already taken it.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The usage of function pointers for RB node insertion
and removal in the expected receive code path was
meant to be a small performance optimization. However,
maintaining it, especially with the new MMU API, would
become more troublesome as the API is extended.
Since the performance optimization is minor, remove the
function pointers and replace with direct calls.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
In order to allow the remove MMU callbacks to free the
RB nodes, it is necessary to prevent any references to
the nodes after the remove callback has been called.
Therefore, remove the node from the tree prior to calling
the callback. In other words, the MMU/RB API now guarantees
that all RB node operations it performs will be done prior
to calling the remove callback and that the RB node will
not be touched afterwards.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
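
The ordering guarantee can be illustrated with a simplified sketch that uses a singly linked list in place of the RB tree. All names (example_node, remove_cb, example_remove, example_free_cb) are hypothetical; the point is only that the node is unlinked before the callback runs and is never touched afterwards.

```c
#include <stddef.h>
#include <stdlib.h>

struct example_node {
	struct example_node *next;
	int key;
};

typedef void (*remove_cb)(struct example_node *node);

/*
 * Unlink `node` from the list first, then invoke the callback.
 * After cb() returns, `node` must not be dereferenced: the callback
 * is allowed to free it.
 */
static void example_remove(struct example_node **head,
			   struct example_node *node, remove_cb cb)
{
	struct example_node **pp = head;

	while (*pp && *pp != node)
		pp = &(*pp)->next;
	if (!*pp)
		return;               /* node is not on the list */

	*pp = node->next;         /* all list manipulation happens here */
	node->next = NULL;
	cb(node);                 /* node may be gone after this call */
}

/* A callback that frees the node, which is safe given the ordering above. */
static void example_free_cb(struct example_node *node)
{
	free(node);
}
```

Had the unlink happened after the callback, reading node->next would be a use-after-free whenever the callback frees the node.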
|
|
Prevent a potential NULL pointer dereference (found
by code inspection) when unregistering an MMU handler.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Future users of the MMU/RB functions might be searching or
manipulating the MMU RB trees in interrupt context. Therefore,
the MMU/RB functions need to be able to run in interrupt
context. This requires that we use the IRQ-aware API for
spin locks.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The MMU notification code added to the
expected receive side has been refactored and
split into its own file. This was done in
order to make the code more general and, therefore,
usable by other parts of the driver.
The caching behavior remains the same. However,
the handling of the RB tree (insertion, deletions,
and searching) as well as the MMU invalidation
processing is now handled by functions in the
mmu_rb.[ch] files.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Dean Luick <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull UDF and quota updates from Jan Kara:
"This contains a rewrite of UDF handling of filename encoding to fix
remaining overflow issues from Andrew Gabbasov and quota changes to
support new Q_[X]GETNEXTQUOTA quotactl for VFS quota formats"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Fix possible GPF due to uninitialised pointers
ext4: Make Q_GETNEXTQUOTA work for quota in hidden inodes
quota: Forbid Q_GETQUOTA and Q_GETNEXTQUOTA for frozen filesystem
quota: Fix possible races during quota loading
ocfs2: Implement get_next_id()
quota_v2: Implement get_next_id() for V2 quota format
quota: Add support for ->get_nextdqblk() for VFS quota
udf: Merge linux specific translation into CS0 conversion function
udf: Remove struct ustr as non-needed intermediate storage
udf: Use separate buffer for copying split names
udf: Adjust UDF_NAME_LEN to better reflect actual restrictions
udf: Join functions for UTF8 and NLS conversions
udf: Parameterize output length in udf_put_filename
quota: Allow Q_GETQUOTA for frozen filesystem
quota: Fixup comments about return value of Q_[X]GETNEXTQUOTA
|
|
The user (or an init script) may set up RShunt via sysfs after the
driver was initialized, for instance based on the EEPROM contents
of a modular probe. The calibration register must be set accordingly.
Signed-off-by: Marc Titinger <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs updates from Dave Chinner:
"There's quite a lot in this request, and there's some cross-over with
ext4, dax and quota code due to the nature of the changes being made.
As for the rest of the XFS changes, there are lots of little things
all over the place, which add up to a lot of changes in the end.
The major changes are that we've reduced the size of the struct
xfs_inode by ~100 bytes (gives an inode cache footprint reduction of
>10%), the writepage code now only does a single set of mapping tree
lookups so uses less CPU, delayed allocation reservations won't
overrun under random write loads anymore, and we added compile time
verification for on-disk structure sizes so we find out when a commit
or platform/compiler change breaks the on disk structure as early as
possible.
Change summary:
- error propagation for direct IO failures fixes for both XFS and
ext4
- new quota interfaces and XFS implementation for iterating all the
quota IDs in the filesystem
- locking fixes for real-time device extent allocation
- reduction of duplicate information in the xfs and vfs inode, saving
roughly 100 bytes of memory per cached inode.
- buffer flag cleanup
- rework of the writepage code to use the generic write clustering
mechanisms
- several fixes for inode flag based DAX enablement
- rework of remount option parsing
- compile time verification of on-disk format structure sizes
- delayed allocation reservation overrun fixes
- lots of little error handling fixes
- small memory leak fixes
- enable xfsaild freezing again"
* tag 'xfs-for-linus-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (66 commits)
xfs: always set rvalp in xfs_dir2_node_trim_free
xfs: ensure committed is initialized in xfs_trans_roll
xfs: borrow indirect blocks from freed extent when available
xfs: refactor delalloc indlen reservation split into helper
xfs: update freeblocks counter after extent deletion
xfs: debug mode forced buffered write failure
xfs: remove impossible condition
xfs: check sizes of XFS on-disk structures at compile time
xfs: ioends require logically contiguous file offsets
xfs: use named array initializers for log item dumping
xfs: fix computation of inode btree maxlevels
xfs: reinitialise per-AG structures if geometry changes during recovery
xfs: remove xfs_trans_get_block_res
xfs: fix up inode32/64 (re)mount handling
xfs: fix format specifier , should be %llx and not %llu
xfs: sanitize remount options
xfs: convert mount option parsing to tokens
xfs: fix two memory leaks in xfs_attr_list.c error paths
xfs: XFS_DIFLAG2_DAX limited by PAGE_SIZE
xfs: dynamically switch modes when XFS_DIFLAG2_DAX is set/cleared
...
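
The compile-time verification of on-disk structure sizes mentioned in the pull message can be sketched with C11 _Static_assert. The structure and the CHECK_STRUCT_SIZE macro below are hypothetical and are not taken from XFS; they only show the general technique.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk record for illustration; not an XFS structure. */
struct example_disk_rec {
	uint32_t magic;
	uint32_t flags;
	uint64_t blkno;
};

/* Fail the build if a structure does not have the expected size. */
#define CHECK_STRUCT_SIZE(type, size) \
	_Static_assert(sizeof(type) == (size), \
		       "unexpected on-disk size for " #type)

CHECK_STRUCT_SIZE(struct example_disk_rec, 16);

/* Field offsets can be pinned down in the same way. */
_Static_assert(offsetof(struct example_disk_rec, blkno) == 8,
	       "unexpected offset of blkno");
```

If a later change or a different compiler or platform alters the layout, the build stops at the assertion instead of producing a binary that misreads the on-disk format.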
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"New Features:
- uplift filesystem encryption into fs/crypto/
- give sysfs entries to control memory consumption
Enhancements:
- aio performance by preallocating blocks in ->write_iter
- use writepages lock for only WB_SYNC_ALL
- avoid redundant inline_data conversion
- enhance foreground GC
- use wait_for_stable_page as possible
- speed up SEEK_DATA and fiemap
Bug Fixes:
- corner case in terms of -ENOSPC for inline_data
- hung task caused by long latency in shrinker
- corruption between atomic write and f2fs_trace_pid
- avoid garbage lengths in dentries
- revoke atomically written pages if an error occurs
In addition, there are various minor bug fixes and clean-ups"
* tag 'for-f2fs-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (81 commits)
f2fs: submit node page write bios when really required
f2fs: add missing argument to f2fs_setxattr stub
f2fs: fix to avoid unneeded unlock_new_inode
f2fs: clean up opened code with f2fs_update_dentry
f2fs: declare static functions
f2fs: use cryptoapi crc32 functions
f2fs: modify the readahead method in ra_node_page()
f2fs crypto: sync ext4_lookup and ext4_file_open
fs crypto: move per-file encryption from f2fs tree to fs/crypto
f2fs: mutex can't be used by down_write_nest_lock()
f2fs: recovery missing dot dentries in root directory
f2fs: fix to avoid deadlock when merging inline data
f2fs: introduce f2fs_flush_merged_bios for cleanup
f2fs: introduce f2fs_update_data_blkaddr for cleanup
f2fs crypto: fix incorrect positioning for GCing encrypted data page
f2fs: fix incorrect upper bound when iterating inode mapping tree
f2fs: avoid hungtask problem caused by losing wake_up
f2fs: trace old block address for CoWed page
f2fs: try to flush inode after merging inline data
f2fs: show more info about superblock recovery
...
|
|
Eric Dumazet says:
====================
net: propagate max_gso_segs and max_gso_size
bridge code does not properly update max_gso_segs and max_gso_size.
Since this was not really obvious, the first patch adds two new rtnetlink
attributes to help debug this kind of issue (ip -d link).
Second patch fixes bridge code.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
It can be useful to lower max_gso_segs on NICs with a very low
number of TX descriptors, like bcmgenet.
However, this is defeated by bridge since it does not propagate
the lower value of max_gso_segs and max_gso_size.
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Petri Gynther <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
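
One plausible way to propagate the lower values is for the bridge to take the minimum over its ports. The sketch below is a user-space illustration under that assumption, not the bridge code itself: the structures, the function example_bridge_update_gso() and the EXAMPLE_* starting values are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed starting values for illustration. */
#define EXAMPLE_GSO_MAX_SIZE 65536u
#define EXAMPLE_GSO_MAX_SEGS 65535u

struct example_port {
	uint32_t gso_max_size;
	uint16_t gso_max_segs;
};

struct example_bridge {
	uint32_t gso_max_size;
	uint16_t gso_max_segs;
};

/* Set the bridge limits to the lowest value found on any of its ports. */
static void example_bridge_update_gso(struct example_bridge *br,
				      const struct example_port *ports,
				      size_t nports)
{
	uint32_t size = EXAMPLE_GSO_MAX_SIZE;
	uint16_t segs = EXAMPLE_GSO_MAX_SEGS;

	for (size_t i = 0; i < nports; i++) {
		if (ports[i].gso_max_size < size)
			size = ports[i].gso_max_size;
		if (ports[i].gso_max_segs < segs)
			segs = ports[i].gso_max_segs;
	}
	br->gso_max_size = size;
	br->gso_max_segs = segs;
}
```

With one port limited to 64 segments, the bridge in this sketch also reports 64, so the upper layers never build a GSO packet that the constrained port cannot handle.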
|
|
It can be useful to report dev->gso_max_segs and dev->gso_max_size
so that "ip -d link" can display them to help debugging.
For the moment, these attributes are read-only.
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Petri Gynther <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
vxlan_remcsum is called after iptunnel_pull_header and thus the skb has
vxlan header already pulled. Don't include vxlan header again in the
calculation.
Signed-off-by: Jiri Benc <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Sparse reports false positives for the header manipulation inlines. Annotate
them correctly.
Tested by sparse on a little endian and big endian machine.
Fixes: 54bfd872bf16d ("vxlan: keep flags and vni in network byte order")
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Jiri Benc <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When the function dev_get_phys_port_name was added, it was missing a
description for its len argument. Add it.
Fixes: db24a9044ee1 ("net: add support for phys_port_name")
Signed-off-by: Luis de Bethencourt <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup namespace support from Tejun Heo:
"These are changes to implement namespace support for cgroup which has
been pending for quite some time now. It is very straight-forward and
only affects what part of cgroup hierarchies are visible.
After unsharing, mounting a cgroup fs will be scoped to the cgroups
the task belonged to at the time of unsharing and the cgroup paths
exposed to userland would be adjusted accordingly"
* 'for-4.6-ns' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix and restructure error handling in copy_cgroup_ns()
cgroup: fix alloc_cgroup_ns() error handling in copy_cgroup_ns()
Add FS_USERNS_FLAG to cgroup fs
cgroup: Add documentation for cgroup namespaces
cgroup: mount cgroupns-root when inside non-init cgroupns
kernfs: define kernfs_node_dentry
cgroup: cgroup namespace setns support
cgroup: introduce cgroup namespaces
sched: new clone flag CLONE_NEWCGROUP for cgroup namespace
kernfs: Add API to generate relative kernfs path
|
|
Only treat a write that goes up to the inode size as an aligned request,
because a write always writes PAGE_CACHE_SIZE, whereas a read uses a dynamic size.
Signed-off-by: Kinglong Mee <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
'is_merge' is a historical name from when only a single lower layer
could exist. With the introduction of multiple lower layers the meaning of
this flag was changed to mean only the "lowest layer" (while all lower
layers were being merged).
So now 'is_merge' is inaccurate; rename it to 'is_lowest'.
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
This patch fixes a newline warning found by the checkpatch.pl tool
Signed-off-by: Sohom-Bhattacharjee <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
In some instances xfs has been created with ftype=0. There, if a file
on the lower fs is removed, overlay leaves a whiteout in the upper fs, but
that whiteout does not get filtered out and is visible to overlayfs users.
The reason it does not get filtered out is that the upper filesystem does
not report the file type of the whiteout as DT_CHR during iterate_dir().
So it seems to be a requirement that the upper filesystem support d_type for
overlayfs to work properly. Do this check during mount and fail if d_type
is not supported.
Suggested-by: Dave Chinner <[email protected]>
Signed-off-by: Vivek Goyal <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Print a warning when overlayfs copies up a file if the process that
triggered the copy up has a R/O fd open to the lower file being copied up.
This can help catch applications that do things like the following:
fd1 = open("foo", O_RDONLY);
fd2 = open("foo", O_RDWR);
where they expect fd1 and fd2 to refer to the same file - which will no
longer be the case post-copy up.
With this patch, the following commands:
bash 5</mnt/a/foo128
6<>/mnt/a/foo128
assuming /mnt/a/foo128 to be an un-copied up file on an overlay will
produce the following warning in the kernel log:
overlayfs: Copying up foo129, but open R/O on fd 5 which will cease
to be coherent [pid=3818 bash]
This is enabled by setting:
/sys/module/overlay/parameters/check_copy_up
to 1.
The warnings are ratelimited and are also limited to one warning per file -
assuming the copy up completes in each case.
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
This patch hides error about missing lowerdir if MS_SILENT is set.
We use mount(NULL, "/", "overlay", MS_SILENT, NULL) for testing support of
overlayfs: syscall returns -ENODEV if it's not supported. Otherwise kernel
automatically loads module and returns -EINVAL because lowerdir is missing.
Signed-off-by: Konstantin Khlebnikov <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Unlink and rename in overlayfs checked the upper dentry for staleness by
verifying upper->d_parent against upperdir. However the dentry can go
stale also by being unhashed, for example.
Expand the verification to actually look up the name again (under parent
lock) and check if it matches the upper dentry. This matches what the VFS
does before passing the dentry to filesystem's unlink/rename methods, which
excludes any inconsistency caused by overlayfs.
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Otherwise we can run into problems with the writeback code.
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit c02196834456f2d5fad334088b70e98ce4967c34.
In the meantime we moved get_user_pages() outside of the reservation lock,
so that shouldn't be an issue any more
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|