Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Madalin Bucur <[email protected]>
Reviewed-by: Camelia Groza <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The fsl/fman drivers will use of_platform_populate() on all
supported platforms. Call of_platform_populate() to probe the
FMan sub-nodes.
Signed-off-by: Igal Liberman <[email protected]>
Signed-off-by: Madalin Bucur <[email protected]>
Acked-by: Scott Wood <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
QSGMII ports were not advertising 1G speed.
Signed-off-by: Madalin Bucur <[email protected]>
Reviewed-by: Camelia Groza <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jerome Brunet says:
====================
phy: Fix integration of eee-broken-modes
The purpose of this series is to fix the integration of the ethernet phy
property "eee-broken-modes" [0]
The v3 of this series has been merged, missing a fix (error reported by
kbuild robot) available in the v4 [1]
More importantly, Florian opposed adding a DT property mapping a device
register this directly [2]. The concern was that the property could be
abused to implement platform configuration policy. After discussing it,
I think we agreed that such information about the HW (defect) should appear
in the platform DT. However, the preferred way is to add a boolean property
for each EEE broken mode.
[0]: http://lkml.kernel.org/r/[email protected]
[1]: http://lkml.kernel.org/r/[email protected]
[2]: http://lkml.kernel.org/r/[email protected]
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The patches regarding eee-broken-modes was merged before all people
involved could find an agreement on the best way to move forward.
While we agreed on having a DT property to mark particular modes as broken,
the value used for eee-broken-modes mapped the phy register in very direct
way. Because of this, the concern is that it could be used to implement
configuration policies instead of describing a broken HW.
In the end, having a boolean property for each mode seems to be preferred
over one bit field value mapping the register (too) directly.
Cc: Florian Fainelli <[email protected]>
Signed-off-by: Jerome Brunet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The patches regarding eee-broken-modes was merged before all people
involved could find an agreement on the best way to move forward.
While we agreed on having a DT property to mark particular modes as broken,
the value used for eee-broken-modes mapped the phy register in very direct
way. Because of this, the concern is that it could be used to implement
configuration policies instead of describing a broken HW.
In the end, having a boolean property for each mode seems to be preferred
over one bit field value mapping the register (too) directly.
Cc: Florian Fainelli <[email protected]>
Signed-off-by: Jerome Brunet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In genphy_config_eee_advert, the return value of phy_read_mmd_indirect is
checked to know if the register could be accessed but the result is
assigned to a 'u32'.
Changing to 'int' to correctly get errors from phy_read_mmd_indirect.
Fixes: d853d145ea3e ("net: phy: add an option to disable EEE advertisement")
Reported-by: Julia Lawall <[email protected]>
Signed-off-by: Jerome Brunet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
s/prink/printk/
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Borislav Petkov <[email protected]>
Cc: Olof Johansson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The macro is to be used similarly as WARN_ON as:
if (WARN_ON_RATELIMIT(condition, state))
do_something();
One would expect only 'condition' to affect the 'if', but
WARN_ON_RATELIMIT does internally only:
WARN_ON((condition) && __ratelimit(state))
So the 'if' is affected by the ratelimiting state too. Fix this by
returning 'condition' in any case.
Note that nobody uses WARN_ON_RATELIMIT yet, so there is nothing to
worry about. But I was about to use it and was a bit surprised.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jiri Slaby <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Subtract KASLR offset from the kernel addresses reported by kcov.
Tested on x86_64 and AArch64 (Hikey LeMaker).
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexander Popov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Jon Masters <[email protected]>
Cc: David Daney <[email protected]>
Cc: Ganapatrao Kulkarni <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: James Morse <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Alexander Popov <[email protected]>
Cc: syzkaller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Introduce kaslr_offset() similar to x86_64 to fix kcov.
[ Updated by Will Deacon ]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alexander Popov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Jon Masters <[email protected]>
Cc: David Daney <[email protected]>
Cc: Ganapatrao Kulkarni <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: James Morse <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Alexander Popov <[email protected]>
Cc: syzkaller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When FADV_DONTNEED cannot drop all pages in the range, it observes that
some pages might still be on per-cpu LRU caches after recent
instantiation and so initiates remote calls to all CPUs to flush their
local caches. However, in most cases, the fadvise happens from the same
context that instantiated the pages, and any pre-LRU pages in the
specified range are most likely sitting on the local CPU's LRU cache,
and so in many cases this results in unnecessary remote calls, which, in
a loaded system, can hold up the fadvise() call significantly.
[ I didn't record it in the extreme case we observed at Facebook,
unfortunately. We had a slow-to-respond system and noticed it
lru_add_drain_all() leading the profile during fadvise calls. This
patch came out of thinking about the code and how we commonly call
FADV_DONTNEED.
FWIW, I wrote a silly directory tree walker/searcher that recurses
through /usr to read and FADV_DONTNEED each file it finds. On a 2
socket 40 ht machine, over 1% is spent in lru_add_drain_all(). With
the patch, that cost is gone; the local drain cost shows at 0.09%. ]
Try to avoid the remote call by flushing the local LRU cache before even
attempting to invalidate anything. It's a cheap operation, and the
local LRU cache is the most likely to hold any pre-LRU pages in the
specified fadvise range.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Johannes Weiner <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Acked-by: Hillf Danton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
For remote attestion it is important for the ima measurement values to
be platform-independent. Therefore integer fields to be hashed must be
converted to canonical format.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andreas Steffen <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Dmitry Kasatkin <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The IMA binary_runtime_measurements list is currently in platform native
format.
To allow restoring a measurement list carried across kexec with a
different endianness than the targeted kernel, this patch defines
little-endian as the canonical format. For big endian systems wanting
to save/restore the measurement list from a system with a different
endianness, a new boot command line parameter named "ima_canonical_fmt"
is defined.
Considerations: use of the "ima_canonical_fmt" boot command line option
will break existing userspace applications on big endian systems
expecting the binary_runtime_measurements list to be in platform native
format.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: Dmitry Kasatkin <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The configured IMA measurement list template format can be replaced at
runtime on the boot command line, including a custom template format.
This patch adds support for restoring a measuremement list containing
multiple builtin/custom template formats.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: Dmitry Kasatkin <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The builtin and single custom templates are currently stored in an
array. In preparation for being able to restore a measurement list
containing multiple builtin/custom templates, this patch stores the
builtin and custom templates as a linked list. This will permit
defining more than one custom template per boot.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: Dmitry Kasatkin <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The TPM PCRs are only reset on a hard reboot. In order to validate a
TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot.
This patch uses the kexec buffer passing mechanism to pass the
serialized IMA binary_runtime_measurements to the next kernel.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: "Eric W. Biederman" <[email protected]>
Acked-by: Dmitry Kasatkin <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The IMA kexec buffer allows the currently running kernel to pass the
measurement list via a kexec segment to the kernel that will be kexec'd.
This is the architecture-specific part of setting up the IMA kexec
buffer for the next kernel. It will be used in the next patch.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Dmitry Kasatkin <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In preparation for serializing the binary_runtime_measurements, this
patch maintains the amount of memory required.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: Dmitry Kasatkin <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Measurements carried across kexec need to be added to the IMA
measurement list, but should not prevent measurements of the newly
booted kernel from being added to the measurement list. This patch adds
support for allowing duplicate measurements.
The "boot_aggregate" measurement entry is the delimiter between soft
boots.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: Dmitry Kasatkin <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The TPM PCRs are only reset on a hard reboot. In order to validate a
TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
list of the running kernel must be saved and restored on boot. This
patch restores the measurement list.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: Dmitry Kasatkin <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "ima: carry the measurement list across kexec", v8.
The TPM PCRs are only reset on a hard reboot. In order to validate a
TPM's quote after a soft reboot (eg. kexec -e), the IMA measurement
list of the running kernel must be saved and then restored on the
subsequent boot, possibly of a different architecture.
The existing securityfs binary_runtime_measurements file conveniently
provides a serialized format of the IMA measurement list. This patch
set serializes the measurement list in this format and restores it.
Up to now, the binary_runtime_measurements was defined as architecture
native format. The assumption being that userspace could and would
handle any architecture conversions. With the ability of carrying the
measurement list across kexec, possibly from one architecture to a
different one, the per boot architecture information is lost and with it
the ability of recalculating the template digest hash. To resolve this
problem, without breaking the existing ABI, this patch set introduces
the boot command line option "ima_canonical_fmt", which is arbitrarily
defined as little endian.
The need for this boot command line option will be limited to the
existing version 1 format of the binary_runtime_measurements.
Subsequent formats will be defined as canonical format (eg. TPM 2.0
support for larger digests).
A simplified method of Thiago Bauermann's "kexec buffer handover" patch
series for carrying the IMA measurement list across kexec is included in
this patch set. The simplified method requires all file measurements be
taken prior to executing the kexec load, as subsequent measurements will
not be carried across the kexec and restored.
This patch (of 10):
The IMA kexec buffer allows the currently running kernel to pass the
measurement list via a kexec segment to the kernel that will be kexec'd.
The second kernel can check whether the previous kernel sent the buffer
and retrieve it.
This is the architecture-specific part which enables IMA to receive the
measurement list passed by the previous kernel. It will be used in the
next patch.
The change in machine_kexec_64.c is to factor out the logic of removing
an FDT memory reservation so that it can be used by remove_ima_buffer.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Mimi Zohar <[email protected]>
Acked-by: "Eric W. Biederman" <[email protected]>
Cc: Andreas Steffen <[email protected]>
Cc: Dmitry Kasatkin <[email protected]>
Cc: Josh Sklar <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stewart Smith <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
__ip_append_data and ip_finish_output
There is an inconsistent conditional judgement in __ip_append_data and
ip_finish_output functions, the variable length in __ip_append_data just
include the length of application's payload and udp header, don't include
the length of ip header, but in ip_finish_output use
(skb->len > ip_skb_dst_mtu(skb)) as judgement, and skb->len include the
length of ip header.
That causes some particular application's udp payload whose length is
between (MTU - IP Header) and MTU were fragmented by ip_fragment even
though the rst->dev support UFO feature.
Add the length of ip header to length in __ip_append_data to keep
consistent conditional judgement as ip_finish_output for ip fragment.
Signed-off-by: Zheng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit e0097cf5f2f1 ("mmc: queue: Fix queue thread wake-up") did not go far
enough. mmc_wait_for_data_req_done() still contains some problems and can
be further simplified. First it should not touch
context_info->is_waiting_last_req because that is a wake-up control used by
the owner of the context. Secondly, it should always return when one of its
wake-up conditions is met because, again, that is contolled by the owner of
the context.
While the current block driver does not have an issue, these problems were
exposed during testing of the Software Command Queue patches.
Fixes: e0097cf5f2f1 ("mmc: queue: Fix queue thread wake-up")
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Harjani Ritesh <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
|
|
Since commit c2c24819b280 ("mmc: core: Don't power off the card when
starting the host"), the power state can still be MMC_POWER_UNDEFINED after
mmc_start_host() is called. That can trigger a warning in SDHCI during
runtime resume as it tries to restore the I/O state. Handle
MMC_POWER_UNDEFINED simply by not updating the I/O state in that case.
Fixes: c2c24819b280 ("mmc: core: Don't power off the card when starting the host")
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
|
|
Add a Socionext SoC specific compatible (suggested by Rob Herring).
No SoC specific data are associated with the compatible strings for
now, but other SoC vendors may use this IP and want to differentiate
IP variants in the future.
Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
|
|
If our DELEGRETURN RPC call is rejected with an EACCES call, then we should
remove the GETATTR call from the compound RPC and retry.
This could potentially happen when there is a conflict between an
ACL denying attribute reads and our use of SP4_MACH_CRED.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If our CLOSE RPC call is rejected with an EACCES call, then we should
remove the GETATTR call from the compound RPC and retry.
This could potentially happen when there is a conflict between an
ACL denying attribute reads and our use of SP4_MACH_CRED.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
In order to benefit from the DENY share lock protection, we should
put the GETATTR operation before the CLOSE. Otherwise, we might race
with a Windows machine that thinks it is now safe to modify the file.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If we're downgrading from a READ+WRITE mode to a READ-only mode, then
ask for cache consistency attributes so that we avoid the revalidation
in nfs_close_context()
Fixes: 3947b74d0f9d ("NFSv4: Don't request a GETATTR on open_downgrade.")
Signed-off-by: Trond Myklebust <[email protected]>
|
|
The NFS_INO_REVAL_FORCED flag now really only has meaning for the
case when we've just been handed a delegation for a file that was already
cached, and we're unsure about that cache.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If the client holds no more writeable open state, and does not hold a
write delegation, then send a layoutreturn as part of the OPEN_DOWNGRADE.
We do this only for writes, since some layout drivers may require you to
also hold a read layout if you are doing a R/W workload.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
While we do not need to return the RW layout when downgrading from a
read/write open state to read-only, we might want to do so in order
to reduce the burden on the metadataserver so that it does not need
to check for changed data when responding to GETATTR requests.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
When an NFS4ERR_BAD_SEQID is received the open-owner is removed from
the ->state_owners rbtree so that it will no longer be used.
If any stateids attached to this open-owner are still in use, and if a
request using one gets an NFS4ERR_BAD_STATEID reply, this can for bad.
The state is marked as needing recovery and the nfs4_state_manager()
is scheduled to clean up. nfs4_state_manager() finds states to be
recovered by walking the state_owners rbtree. As the open-owner is
not in the rbtree, the bad state is not found so nfs4_state_manager()
completes having done nothing. The request is then retried, with a
predicatable result (indefinite retries).
If the stateid is for a delegation, this open_owner will be used
to open files when the delegation is returned. For that to work,
a new open-owner needs to be presented to the server.
This patch changes NFS4ERR_BAD_SEQID handling to leave the open-owner
in the rbtree but updates the 'create_time' so it looks like a new
open-owner. With this the indefinite retries no longer happen.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If a file has both flock locks and OFD locks, then it is possible that
two different nfs4 lock states could apply to file accesses from a
single process.
It is not possible to know, efficiently, which one is "correct".
Presumably the state which represents a lock that covers the region
undergoing IO would be the "correct" one to use, but finding that has
a non-trivial cost and would provide miniscule value.
Currently we just return whichever is first in the list, which could
result in inconsistent behaviour if an application ever put it self in
this position. As consistent behaviour is preferable (when perfectly
correct behaviour is not available), change the search to return a
consistent result in this circumstance.
Specifically: if there is both a flock and OFD lock state, always return
the flock one.
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Various places assume that if nfs4_fl_prepare_ds() turns a non-NULL 'ds',
then ds->ds_clp will also be non-NULL.
This is not necessasrily true in the case when the process received a fatal signal
while nfs4_pnfs_ds_connect is waiting in nfs4_wait_ds_connect().
In that case ->ds_clp may not be set, and the devid may not recently have been marked
unavailable.
So add a test for ds_clp == NULL and return NULL in that case.
Fixes: c23266d532b4 ("NFS4.1 Fix data server connection race")
Signed-off-by: NeilBrown <[email protected]>
Acked-by: Olga Kornievskaia <[email protected]>
Acked-by: Adamson, Andy <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Instead of marking a device inactive, remove it from the cache entirely.
Flexfiles has a way to report errors back to the server, so we don't want
to stop devices from being tried again for 120 seconds.
Signed-off-by: Weston Andros Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
It can be made static.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Signed-off-by: Trond Myklebust <[email protected]>
|
|
The access cache needs to check whether or not the mode bits, ownership,
or ACL has changed or the cache has timed out.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Just like in nfs_check_verifier(), we want to use
nfs_mapping_need_revalidate_inode() to check our knowledge of the
change attribute is up to date.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Consolidate the open-coded checking of NFS_I(inode)->cache_validity
into a couple of helper functions.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If we're holding a delegation, we can skip sending the close-to-open
GETATTR until we're returning that delegation.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
DELEGRETURN will always carry a reference to the inode except when
the latter is being freed, so let's ensure that we always use that
inode information to ensure close-to-open cache consistency, even
when the DELEGRETURN call is asynchronous.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If we successfully updated the change attribute, we should timestamp the
cache. While we do know that the other attributes are not completely up
to date, we have the NFS_INO_INVALID_ATTR flag that let us know that,
so it is valid to say that the cache has not timed out.
We can also clear NFS_INO_REVAL_PAGECACHE, since our change attribute
is now known to be valid.
Conversely, if the change attribute did not match, we should make sure to
also revalidate the access and ACL caches.
Signed-off-by: Trond Myklebust <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota, fsnotify and ext2 updates from Jan Kara:
"Changes to locking of some quota operations from dedicated quota mutex
to s_umount semaphore, a fsnotify fix and a simple ext2 fix"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: Fix bogus warning in dquot_disable()
fsnotify: Fix possible use-after-free in inode iteration on umount
ext2: reject inodes with negative size
quota: Remove dqonoff_mutex
ocfs2: Use s_umount for quota recovery protection
quota: Remove dqonoff_mutex from dquot_scan_active()
ocfs2: Protect periodic quota syncing with s_umount semaphore
quota: Use s_umount protection for quota operations
quota: Hold s_umount in exclusive mode when enabling / disabling quotas
fs: Provide function to get superblock with exclusive s_umount
|
|
Pull KVM fixes from Paolo Bonzini:
"Early fixes for x86.
Instead of the (botched) revert, the lockdep/might_sleep splat has a
real fix provided by Andrea"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: nVMX: Allow L1 to intercept software exceptions (#BP and #OF)
kvm: take srcu lock around kvm_steal_time_set_preempted()
kvm: fix schedule in atomic in kvm_steal_time_set_preempted()
KVM: hyperv: fix locking of struct kvm_hv fields
KVM: x86: Expose Intel AVX512IFMA/AVX512VBMI/SHA features to guest.
kvm: nVMX: Correct a VMX instruction error code for VMPTRLD
|
|
Partitions that are not aligned to the blocksize of a device may cause
invalid I/O requests because the blocklayer cares only about alignment
within the partition when building requests on partitions.
device
|--------4096--------|--------4096--------|--------4096--------|
partition offset 512byte
|-512-|--------4096--------|--------4096--------|--------4096--------|
When reading/writing one 4k block of the partition this maps to
reading/writing with an offset of 512 byte of the device leading to
unaligned requests for the device which in turn may cause unexpected
behavior of the device driver.
For DASD devices we have to translate the block number into a cylinder,
head, record format. The unaligned requests lead to wrong calculation
and therefore to misdirected I/O. In a "good" case this leads to I/O
errors because the underlying hardware detects the wrong addressing.
In a worst case scenario this might destroy data on the device.
To prevent partitions that are not aligned to the physical blocksize
of a device check for the alignment in the blkpg_ioctl.
Signed-off-by: Stefan Haberland <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi fix from Jean Delvare.
* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi_scan: Always show system identification string
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"New Device Support
- Add support for Ricoh RC5T619 PMIC to rn5t618
- Add support for PM8821 PMIC to qcom-pm8xxx
New Functionality:
- Add support for GPIO to lpc_ich
- Add support for GPADC to sun4i
- Add ability for rk808 to shutdown
Fix-ups:
- Simplify/strip unnecessary code; tps65218, palmas, tps65217
- Device Tree binding updates; tps65218, altera-a10sr
- Provide/export device ID info; tps65218, axp20x-i2c, hi655x-pmic,
fsl-imx25-tsadc, intel_soc_pmic_bxtwc
- Use MFD API instead of of_platform_populate(); tps65218
- Generalise name-space; pm8xxx
- Supply/edit regmap configuration; axp20x, cs47l24-tables, axp20x
- Enable compile testing; max77620, max77686, exynos-lpass,
abx500-core
- Coding style issues; wm8994-core, wm5102-tables
- Supply endian support; syscon
- Remove module support; ab3100-core, ab8500-debugfs, ab8500-gpadc,
abx500-core
Bug Fixes:
- Fix ordering issues; wm8994
- Fix dependencies (build-time/run-time); exynos_lpass, sun4i-gpadc
- Fix compiler warnings; sun4i-gpadc
- Fix leaks; mfd-core
- Fix page fault during module unload; tps65217"
* tag 'mfd-for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (49 commits)
mfd: tps65217: Support an interrupt pin as the system wakeup
mfd: tps65217: Make an interrupt handler simpler
mfd: tps65217: Update register interrupt mask bits instead of writing operation
mfd: tps65217: Specify the IRQ name
mfd: tps65217: Fix page fault on unloading modules
mfd: palmas: Remove redundant check in palmas_power_off
mfd: arizona: Disable IRQs during driver remove
mfd: pm8xxx: add support to pm8821
mfd: intel-lpss: Try to enable Memory-Write-Invalidate
mfd: rn5t618: Add Ricoh RC5T619 PMIC support
mfd: axp20x: Add address extension registers for AXP806 regmap
mfd: intel_soc_pmic_bxtwc: Fix a typo in MODULE_DEVICE_TABLE()
mfd: core: Fix device reference leak in mfd_clone_cell
mfd: bcm590xx: Simplify a test
mfd: sun4i-gpadc: Select regmap-irq
mfd: abx500-core: drop unused MODULE_ tags from non-modular code
mfd: ab8500: make sysctrl explicitly non-modular
mfd: ab8500-gpadc: Make it explicitly non-modular
mfd: ab8500-debugfs: Make it explicitly non-modular
mfd: ab8500-core: Make it explicitly non-modular
...
|