Age | Commit message (Collapse) | Author | Files | Lines |
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.0:
- Fix license inconsistency in vkms.
Signed-off-by: Dave Airlie <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
When provisioning a new data block for a virtual block, either because
the block was previously unallocated or because we are breaking sharing,
if the whole block of data is being overwritten the bio that triggered
the provisioning is issued immediately, skipping copying or zeroing of
the data block.
When this bio completes the new mapping is inserted in to the pool's
metadata by process_prepared_mapping(), where the bio completion is
signaled to the upper layers.
This completion is signaled without first committing the metadata. If
the bio in question has the REQ_FUA flag set and the system crashes
right after its completion and before the next metadata commit, then the
write is lost despite the REQ_FUA flag requiring that I/O completion for
this request must only be signaled after the data has been committed to
non-volatile storage.
Fix this by deferring the completion of overwrite bios, with the REQ_FUA
flag set, until after the metadata has been committed.
Cc: [email protected]
Signed-off-by: Nikos Tsironis <[email protected]>
Acked-by: Joe Thornber <[email protected]>
Acked-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
This reverts commit 8099b047ecc431518b9bb6bdbba3549bbecdc343.
It turns out that people do actually depend on the shebang string being
truncated, and on the fact that an interpreter (like perl) will often
just re-interpret it entirely to get the full argument list.
Reported-by: Samuel Dionne-Riel <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This reverts commit 2a5f14f279f59143139bcd1606903f2f80a34241.
This patch causes xfstests generic/311 to fail. Reverting this for
now until we have a proper fix.
Signed-off-by: Abhi Das <[email protected]>
Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently the ethtool_regs version is set to 0 for FEC devices.
Use this field to store the register dump version exposed by the
kernel. The choosen version 2 corresponds to the kernel compile test:
#if defined(CONFIG_M523x) || defined(CONFIG_M527x)
|| defined(CONFIG_M528x) || defined(CONFIG_M520x)
|| defined(CONFIG_M532x) || defined(CONFIG_ARM)
|| defined(CONFIG_ARM64) || defined(CONFIG_COMPILE_TEST)
and version 1 corresponds to the opposite. Binaries of ethtool unaware
of this version will dump the whole set as usual.
Signed-off-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The of_find_device_by_node() takes a reference to the underlying device
structure, we should release that reference.
Signed-off-by: Huang Zijiang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The basic idea behind ->pagecnt_bias is: If we pre-allocate the maximum
number of references that we might need to create in the fastpath later,
the bump-allocation fastpath only has to modify the non-atomic bias value
that tracks the number of extra references we hold instead of the atomic
refcount. The maximum number of allocations we can serve (under the
assumption that no allocation is made with size 0) is nc->size, so that's
the bias used.
However, even when all memory in the allocation has been given away, a
reference to the page is still held; and in the `offset < 0` slowpath, the
page may be reused if everyone else has dropped their references.
This means that the necessary number of references is actually
`nc->size+1`.
Luckily, from a quick grep, it looks like the only path that can call
page_frag_alloc(fragsz=1) is TAP with the IFF_NAPI_FRAGS flag, which
requires CAP_NET_ADMIN in the init namespace and is only intended to be
used for kernel testing and fuzzing.
To test for this issue, put a `WARN_ON(page_ref_count(page) == 0)` in the
`offset < 0` path, below the virt_to_page() call, and then repeatedly call
writev() on a TAP device with IFF_TAP|IFF_NO_PI|IFF_NAPI_FRAGS|IFF_NAPI,
with a vector consisting of 15 elements containing 1 byte each.
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Heiner Kallweit says:
====================
net: phy: fix locking issue
Russell pointed out that the locking used in phy_is_started() isn't
needed and misleading. This locking also contributes to a race fixed
with patch 2.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Russell reported the following race in the phylib state machine
(quoting from his mail):
if (phy_polling_mode(phydev) && phy_is_started(phydev))
phy_queue_state_machine(phydev, PHY_STATE_TIME);
state = PHY_UP
thread 0 thread 1
phy_disconnect()
+-phy_is_started()
phy_is_started() |
`-phy_stop()
+-phydev->state = PHY_HALTED
`-phy_stop_machine()
`-cancel_delayed_work_sync()
phy_queue_state_machine()
`-mod_delayed_work()
At this point, the phydev->state_queue() has been added back onto the
system workqueue despite phy_stop_machine() having been called and
cancel_delayed_work_sync() called on it.
Fix this by protecting the complete operation in thread 0.
Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
Reported-by: Russell King - ARM Linux admin <[email protected]>
Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Russell suggested to remove the locking from phy_is_started() because
the read is atomic anyway and actually the locking may be more
misleading.
Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
Suggested-by: Russell King - ARM Linux admin <[email protected]>
Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The clean target in the makefile conflicts with the generic
kselftests lib.mk, and fails to properly remove the compiled
test programs.
Remove the redundant rule, the TEST_GEN_FILES will be already
removed by the CLEAN macro in lib.mk.
Signed-off-by: Deepa Dinamani <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The value of ->num_ports comes from bcm_sf2_sw_probe() and it is less
than or equal to DSA_MAX_PORTS. The ds->ports[] array is used inside
the dsa_is_user_port() and dsa_is_cpu_port() functions. The ds->ports[]
array is allocated in dsa_switch_alloc() and it has ds->num_ports
elements so this leads to a static checker warning about a potential out
of bounds read.
Fixes: 8cfa94984c9c ("net: dsa: bcm_sf2: add suspend/resume callbacks")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Vivien Didelot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
With many active TCP sockets, fat TCP sockets could fool
__sk_mem_raise_allocated() thanks to an overflow.
They would increase their share of the memory, instead
of decreasing it.
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The GPIO interrupt controller on the espressobin board only supports edge interrupts.
If one enables the use of hardware interrupts in the device tree for the 88E6341, it is
possible to miss an edge. When this happens, the INTn pin on the Marvell switch is
stuck low and no further interrupts occur.
I found after adding debug statements to mv88e6xxx_g1_irq_thread_work() that there is
a race in handling device interrupts (e.g. PHY link interrupts). Some interrupts are
directly cleared by reading the Global 1 status register. However, the device interrupt
flag, for example, is not cleared until all the unmasked SERDES and PHY ports are serviced.
This is done by reading the relevant SERDES and PHY status register.
The code only services interrupts whose status bit is set at the time of reading its status
register. If an interrupt event occurs after its status is read and before all interrupts
are serviced, then this event will not be serviced and the INTn output pin will remain low.
This is not a problem with polling or level interrupts since the handler will be called
again to process the event. However, it's a big problem when using level interrupts.
The fix presented here is to add a loop around the code servicing switch interrupts. If
any pending interrupts remain after the current set has been handled, we loop and process
the new set. If there are no pending interrupts after servicing, we are sure that INTn has
gone high and we will get an edge when a new event occurs.
Tested on espressobin board.
Fixes: dc30c35be720 ("net: dsa: mv88e6xxx: Implement interrupt support.")
Signed-off-by: John David Anglin <[email protected]>
Tested-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
phylib enables interrupts before phy_start() has been called, and if
we receive an interrupt in a non-started state, the interrupt handler
returns IRQ_NONE. This causes problems with at least one Marvell chip
as reported by Andrew.
Fix this by handling interrupts the same as in phy_mac_interrupt(),
basically always running the phylib state machine. It knows when it
has to do something and when not.
This change allows to handle interrupts gracefully even if they
occur in a non-started state.
Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
Reported-by: Andrew Lunn <[email protected]>
Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In sctp_stream_init(), after sctp_stream_outq_migrate() freed the
surplus streams' ext, but sctp_stream_alloc_out() returns -ENOMEM,
stream->outcnt will not be set to 'outcnt'.
With the bigger value on stream->outcnt, when closing the assoc and
freeing its streams, the ext of those surplus streams will be freed
again since those stream exts were not set to NULL after freeing in
sctp_stream_outq_migrate(). Then the invalid-free issue reported by
syzbot would be triggered.
We fix it by simply setting them to NULL after freeing.
Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations")
Reported-by: [email protected]
Signed-off-by: Xin Long <[email protected]>
Acked-by: Neil Horman <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jianlin reported a panic when running sctp gso over gre over vlan device:
[ 84.772930] RIP: 0010:do_csum+0x6d/0x170
[ 84.790605] Call Trace:
[ 84.791054] csum_partial+0xd/0x20
[ 84.791657] gre_gso_segment+0x2c3/0x390
[ 84.792364] inet_gso_segment+0x161/0x3e0
[ 84.793071] skb_mac_gso_segment+0xb8/0x120
[ 84.793846] __skb_gso_segment+0x7e/0x180
[ 84.794581] validate_xmit_skb+0x141/0x2e0
[ 84.795297] __dev_queue_xmit+0x258/0x8f0
[ 84.795949] ? eth_header+0x26/0xc0
[ 84.796581] ip_finish_output2+0x196/0x430
[ 84.797295] ? skb_gso_validate_network_len+0x11/0x80
[ 84.798183] ? ip_finish_output+0x169/0x270
[ 84.798875] ip_output+0x6c/0xe0
[ 84.799413] ? ip_append_data.part.50+0xc0/0xc0
[ 84.800145] iptunnel_xmit+0x144/0x1c0
[ 84.800814] ip_tunnel_xmit+0x62d/0x930 [ip_tunnel]
[ 84.801699] gre_tap_xmit+0xac/0xf0 [ip_gre]
[ 84.802395] dev_hard_start_xmit+0xa5/0x210
[ 84.803086] sch_direct_xmit+0x14f/0x340
[ 84.803733] __dev_queue_xmit+0x799/0x8f0
[ 84.804472] ip_finish_output2+0x2e0/0x430
[ 84.805255] ? skb_gso_validate_network_len+0x11/0x80
[ 84.806154] ip_output+0x6c/0xe0
[ 84.806721] ? ip_append_data.part.50+0xc0/0xc0
[ 84.807516] sctp_packet_transmit+0x716/0xa10 [sctp]
[ 84.808337] sctp_outq_flush+0xd7/0x880 [sctp]
It was caused by SKB_GSO_CB(skb)->csum_start not set in sctp_gso_segment.
sctp_gso_segment() calls skb_segment() with 'feature | NETIF_F_HW_CSUM',
which causes SKB_GSO_CB(skb)->csum_start not to be set in skb_segment().
For TCP/UDP, when feature supports HW_CSUM, CHECKSUM_PARTIAL will be set
and gso_reset_checksum will be called to set SKB_GSO_CB(skb)->csum_start.
So SCTP should do the same as TCP/UDP, to call gso_reset_checksum() when
computing checksum in sctp_gso_segment.
Reported-by: Jianlin Shi <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Acked-by: Neil Horman <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for net:
1) Missing structure initialization in ebtables causes splat with
32-bit user level on a 64-bit kernel, from Francesco Ruggeri.
2) Missing dependency on nf_defrag in IPVS IPv6 codebase, from
Andrea Claudi.
3) Fix possible use-after-free from release path of target extensions.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2019-02-13
This series introduces some fixes to mlx5 driver.
For more information please see tag log below.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently mlx5 driver creates xdp redirect hw queues unconditionally on
netdevice open, This is great until someone starts redirecting XDP traffic
via ndo_xdp_xmit on mlx5 device and changes the device configuration at
the same time, this might cause crashes, since the other device's napi
is not aware of the mlx5 state change (resources un-availability).
To fix this we must synchronize with other devices napi's on the system.
Added a new flag under mlx5e_priv to determine XDP TX resources are
available, set/clear it up when necessary and use synchronize_rcu()
when the flag is turned off, so other napi's are in-sync with it, before
we actually cleanup the hw resources.
The flag is tested prior to committing to transmit on mlx5e_xdp_xmit, and
it is sufficient to determine if it safe to transmit or not. The other
two internal flags (MLX5E_STATE_OPENED and MLX5E_SQ_STATE_ENABLED) become
unnecessary. Thus, they are removed from data path.
Fixes: 58b99ee3e3eb ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
Reported-by: Toke Høiland-Jørgensen <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Eliminate the following compilation warning:
drivers/net/ethernet/mellanox/mlx5/core/events.c: warning: 'error_str'
may be used uninitialized in this function [-Wuninitialized]: => 238:3
Fixes: c2fb3db22d35 ("net/mlx5: Rework handling of port module events")
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Mikhael Goikhman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
When EEH is injected and PCI bus stalls, mlx5's pci error detect
function is called to deactivate the command interface and tear down
the device. The issue is that there can be a thread that already
passed MLX5_DEVICE_STATE_INTERNAL_ERROR check, it will send the command
and stuck in the wait_func.
Solution:
Add function mlx5_cmd_flush to disable command interface and clear all
the pending commands. When device state is set to
MLX5_DEVICE_STATE_INTERNAL_ERROR, call mlx5_cmd_flush to ensure all
pending threads waiting for firmware commands completion are terminated.
Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread")
Signed-off-by: Huy Nguyen <[email protected]>
Reviewed-by: Daniel Jurgens <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
New channels are applied to the priv channels only after they
are successfully opened. Then, the indirection table should be built
according to the new number of channels.
Currently, such build is preformed independently of whether the
channels opening is successful, and is not reverted on failure.
The bug is caused due to removal of rss params from channels struct
and moving it to priv struct. That change cause to independency between
channels and rss params.
This causes a crash on a later point, when accessing rqn of a non
existing channel.
This patch fixes it by moving the indirection table build right before
switching the priv channels to new channels struct, after the new set of
channels was successfully opened.
Fixes: bbeb53b8b2c9 ("net/mlx5e: Move RSS params to a dedicated struct")
Signed-off-by: Maria Pasechnik <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"This fixes kprobes/uprobes dynamic processing of strings, where it
processes the args but does not update the remaining length of the
buffer that the string arguments will be placed in. It constantly
passes in the total size of buffer used instead of passing in the
remaining size of the buffer used.
This could cause issues if the strings are larger than the max size of
an event which could cause the strings to be written beyond what was
reserved on the buffer"
* tag 'trace-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: probeevent: Correctly update remaining space in dynamic area
|
|
Fetch pointer to module before target object is released.
Fixes: 29e3880109e3 ("netfilter: nf_tables: fix use-after-free when deleting compat expressions")
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Don't warn or fail if it's missing.
v2: handle xgmi case more gracefully.
v3: handle older kernels properly
Reviewed-by: Hawking Zhang <[email protected]>
Tested-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Pull single NVMe fix from Christoph
* 'nvme-5.0' of git://git.infradead.org/nvme:
nvme-pci: add missing unlock for reset error
|
|
In the middle of do_exit() there is there is a call
"ptrace_event(PTRACE_EVENT_EXIT, code);" That call places the process
in TACKED_TRACED aka "(TASK_WAKEKILL | __TASK_TRACED)" and waits for
for the debugger to release the task or SIGKILL to be delivered.
Skipping past dequeue_signal when we know a fatal signal has already
been delivered resulted in SIGKILL remaining pending and
TIF_SIGPENDING remaining set. This in turn caused the
scheduler to not sleep in PTACE_EVENT_EXIT as it figured
a fatal signal was pending. This also caused ptrace_freeze_traced
in ptrace_check_attach to fail because it left a per thread
SIGKILL pending which is what fatal_signal_pending tests for.
This difference in signal state caused strace to report
strace: Exit of unknown pid NNNNN ignored
Therefore update the signal handling state like dequeue_signal
would when removing a per thread SIGKILL, by removing SIGKILL
from the per thread signal mask and clearing TIF_SIGPENDING.
Acked-by: Oleg Nesterov <[email protected]>
Reported-by: Oleg Nesterov <[email protected]>
Reported-by: Ivan Delalande <[email protected]>
Cc: [email protected]
Fixes: 35634ffa1751 ("signal: Always notice exiting tasks")
Signed-off-by: "Eric W. Biederman" <[email protected]>
|
|
Commit bb364890323cca ("mmc: meson-gx: Free irq in release() callback")
changed the _probe code to use request_threaded_irq() instead of
devm_request_threaded_irq().
Unfortunately this removes a fallback for the interrupt name:
devm_request_threaded_irq() uses the device name as fallback if the
given IRQ name is NULL. request_threaded_irq() has no such fallback,
thus /proc/interrupts shows "(null)" instead.
Explicitly pass the dev_name() so we get the IRQ name shown in
/proc/interrupts again.
While here, also fix the indentation of the request_threaded_irq()
parameter list.
Fixes: bb364890323cca ("mmc: meson-gx: Free irq in release() callback")
Signed-off-by: Martin Blumenstingl <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
|
|
into drm-fixes
drm/imx: plane, ldb, and ipu-v3 fixes
- Fix CSI register offsets for i.MX51 and i.MX53.
- Fix delayed page flip completion events on i.MX6QP due to unexpected
behaviour of the PRE when issuing NOP buffer updates to the same
buffer address.
- Stop throwing errors for plane updates on disabled CRTCs when a
userspace process is killed while a plane update is pending.
- Add missing of_node_put cleanup in imx_ldb_bind.
Signed-off-by: Dave Airlie <[email protected]>
From: Philipp Zabel <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Merge fixes from Andrew Morton:
"6 fixes"
* emailed patches from Andrew Morton <[email protected]>:
mm: proc: smaps_rollup: fix pss_locked calculation
Rename include/{uapi => }/asm-generic/shmparam.h really
Revert "mm: use early_pfn_to_nid in page_ext_init"
mm/gup: fix gup_pmd_range() for dax
Revert "mm: slowly shrink slabs with a relatively small number of objects"
Revert "mm: don't reclaim inodes with many attached pages"
|
|
The 'pss_locked' field of smaps_rollup was being calculated incorrectly.
It accumulated the current pss everytime a locked VMA was found. Fix
that by adding to 'pss_locked' the same time as that of 'pss' if the vma
being walked is locked.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 493b0e9d945f ("mm: add /proc/pid/smaps_rollup")
Signed-off-by: Sandeep Patil <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Daniel Colascione <[email protected]>
Cc: <[email protected]> [4.14.x, 4.19.x]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit 36c0f7f0f899 ("arch: unexport asm/shmparam.h for all
architectures") is different from the patch I submitted.
My patch is this:
https://lore.kernel.org/lkml/[email protected]/T/#u
The file renaming part:
rename include/{uapi => }/asm-generic/shmparam.h (100%)
was lost when it was picked up.
I think it was an accident because Andrew did not say anything.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 36c0f7f0f899 ("arch: unexport asm/shmparam.h for all architectures")
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This reverts commit fe53ca54270a ("mm: use early_pfn_to_nid in
page_ext_init").
When booting a system with "page_owner=on",
start_kernel
page_ext_init
invoke_init_callbacks
init_section_page_ext
init_page_owner
init_early_allocated_pages
init_zones_in_node
init_pages_in_zone
lookup_page_ext
page_to_nid
The issue here is that page_to_nid() will not work since some page flags
have no node information until later in page_alloc_init_late() due to
DEFERRED_STRUCT_PAGE_INIT. Hence, it could trigger an out-of-bounds
access with an invalid nid.
UBSAN: Undefined behaviour in ./include/linux/mm.h:1104:50
index 7 is out of range for type 'zone [5]'
Also, kernel will panic since flags were poisoned earlier with,
CONFIG_DEBUG_VM_PGFLAGS=y
CONFIG_NODE_NOT_IN_PAGE_FLAGS=n
start_kernel
setup_arch
pagetable_init
paging_init
sparse_init
sparse_init_nid
memblock_alloc_try_nid_raw
It did not handle it well in init_pages_in_zone() which ends up calling
page_to_nid().
page:ffffea0004200000 is uninitialized and poisoned
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
page_owner info is not active (free page?)
kernel BUG at include/linux/mm.h:990!
RIP: 0010:init_page_owner+0x486/0x520
This means that assumptions behind commit fe53ca54270a ("mm: use
early_pfn_to_nid in page_ext_init") are incomplete. Therefore, revert
the commit for now. A proper way to move the page_owner initialization
to sooner is to hook into memmap initialization.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Qian Cai <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
For dax pmd, pmd_trans_huge() returns false but pmd_huge() returns true
on x86. So the function works as long as hugetlb is configured.
However, dax doesn't depend on hugetlb.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yu Zhao <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: "Michael S . Tsirkin" <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This reverts commit 172b06c32b9497 ("mm: slowly shrink slabs with a
relatively small number of objects").
This change changes the agressiveness of shrinker reclaim, causing small
cache and low priority reclaim to greatly increase scanning pressure on
small caches. As a result, light memory pressure has a disproportionate
affect on small caches, and causes large caches to be reclaimed much
faster than previously.
As a result, it greatly perturbs the delicate balance of the VFS caches
(dentry/inode vs file page cache) such that the inode/dentry caches are
reclaimed much, much faster than the page cache and this drives us into
several other caching imbalance related problems.
As such, this is a bad change and needs to be reverted.
[ Needs some massaging to retain the later seekless shrinker
modifications.]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 172b06c32b9497 ("mm: slowly shrink slabs with a relatively small number of objects")
Signed-off-by: Dave Chinner <[email protected]>
Cc: Wolfgang Walter <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Spock <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This reverts commit a76cf1a474d7d ("mm: don't reclaim inodes with many
attached pages").
This change causes serious changes to page cache and inode cache
behaviour and balance, resulting in major performance regressions when
combining worklaods such as large file copies and kernel compiles.
https://bugzilla.kernel.org/show_bug.cgi?id=202441
This change is a hack to work around the problems introduced by changing
how agressive shrinkers are on small caches in commit 172b06c32b94 ("mm:
slowly shrink slabs with a relatively small number of objects"). It
creates more problems than it solves, wasn't adequately reviewed or
tested, so it needs to be reverted.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: a76cf1a474d7d ("mm: don't reclaim inodes with many attached pages")
Signed-off-by: Dave Chinner <[email protected]>
Cc: Wolfgang Walter <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Spock <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix fan detection for NCT6793D"
* tag 'hwmon-for-v5.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (nct6775) Fix fan6 detection for NCT6793D
|
|
Pull MD fix from Song
* 'md-fixes' of https://github.com/liu-song-6/linux:
md/raid1: don't clear bitmap bits on interrupted recovery.
|
|
sync_request_write no longer submits writes to a Faulty device. This has
the unfortunate side effect that bitmap bits can be incorrectly cleared
if a recovery is interrupted (previously, end_sync_write would have
prevented this). This means the next recovery may not copy everything
it should, potentially corrupting data.
Add a function for doing the proper md_bitmap_end_sync, called from
end_sync_write and the Faulty case in sync_request_write.
backport note to 4.14: s/md_bitmap_end_sync/bitmap_end_sync
Cc: [email protected] 4.14+
Fixes: 0c9d5b127f69 ("md/raid1: avoid reusing a resync bio after error handling.")
Reviewed-by: Jack Wang <[email protected]>
Tested-by: Jack Wang <[email protected]>
Signed-off-by: Nate Dailey <[email protected]>
Signed-off-by: Song Liu <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V fixes from Palmer Dabbelt:
"This contains a pair of bug fixes that I'd like to include in 5.0:
- A fix to disambiguate swap from invalid PTEs, which fixes an error
when trying to unmap PROT_NONE pages.
- A revert to an optimization of the size of flat binaries. This is
really a workaround to prevent breaking existing boot flows, but
since the change was introduced as part of the 5.0 merge window I'd
like to have the fix in before 5.0 so we can avoid a regression for
any proper releases.
With these I hope we're out of patches for 5.0 in RISC-V land"
* tag 'riscv-for-linus-5.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Revert "RISC-V: Make BSS section as the last section in vmlinux.lds.S"
riscv: Add pte bit to distinguish swap from invalid
|
|
The current opt_inst_list operations inside team_nl_cmd_options_set()
is too complex to track:
LIST_HEAD(opt_inst_list);
nla_for_each_nested(...) {
list_for_each_entry(opt_inst, &team->option_inst_list, list) {
if (__team_option_inst_tmp_find(&opt_inst_list, opt_inst))
continue;
list_add(&opt_inst->tmp_list, &opt_inst_list);
}
}
team_nl_send_event_options_get(team, &opt_inst_list);
as while we retrieve 'opt_inst' from team->option_inst_list, it could
be added to the local 'opt_inst_list' for multiple times. The
__team_option_inst_tmp_find() doesn't work, as the setter
team_mode_option_set() still calls team->ops.exit() which uses
->tmp_list too in __team_options_change_check().
Simplify the list operations by moving the 'opt_inst_list' and
team_nl_send_event_options_get() into the nla_for_each_nested() loop so
that it can be guranteed that we won't insert a same list entry for
multiple times. Therefore, __team_option_inst_tmp_find() can be removed
too.
Fixes: 4fb0534fb7bb ("team: avoid adding twice the same option to the event list")
Fixes: 2fcdb2c9e659 ("team: allow to send multiple set events in one message")
Reported-by: [email protected]
Reported-by: [email protected]
Cc: Jiri Pirko <[email protected]>
Cc: Paolo Abeni <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Reviewed-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Cong Wang says:
====================
net_sched: some fixes for cls_tcindex
This patchset contains 3 bug fixes for tcindex filter. Please check
each patch for details.
v2: fix a compile error in patch 2
drop netns refcnt in patch 1
====================
Signed-off-by: Cong Wang <[email protected]>
|
|
struct tcindex_filter_result contains two parts:
struct tcf_exts and struct tcf_result.
For the local variable 'cr', its exts part is never used but
initialized without being released properly on success path. So
just completely remove the exts part to fix this leak.
For the local variable 'new_filter_result', it is never properly
released if not used by 'r' on success path.
Cc: Jamal Hadi Salim <[email protected]>
Cc: Jiri Pirko <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When tcindex_destroy() destroys all the filter results in
the perfect hash table, it invokes the walker to delete
each of them. However, results with class==0 are skipped
in either tcindex_walk() or tcindex_delete(), which causes
a memory leak reported by kmemleak.
This patch fixes it by skipping the walker and directly
deleting these filter results so we don't miss any filter
result.
As a result of this change, we have to initialize exts->net
properly in tcindex_alloc_perfect_hash(). For net-next, we
need to consider whether we should initialize ->net in
tcf_exts_init() instead, before that just directly test
CONFIG_NET_CLS_ACT=y.
Cc: Jamal Hadi Salim <[email protected]>
Cc: Jiri Pirko <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
tcindex_destroy() invokes tcindex_destroy_element() via
a walker to delete each filter result in its perfect hash
table, and tcindex_destroy_element() calls tcindex_delete()
which schedules tcf RCU works to do the final deletion work.
Unfortunately this races with the RCU callback
__tcindex_destroy(), which could lead to use-after-free as
reported by Adrian.
Fix this by migrating this RCU callback to tcf RCU work too,
as that workqueue is ordered, we will not have use-after-free.
Note, we don't need to hold netns refcnt because we don't call
tcf_exts_destroy() here.
Fixes: 27ce4f05e2ab ("net_sched: use tcf_queue_work() in tcindex filter")
Reported-by: Adrian <[email protected]>
Cc: Ben Hutchings <[email protected]>
Cc: Jamal Hadi Salim <[email protected]>
Cc: Jiri Pirko <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Arthur Kiyanovski says:
====================
net: ena: race condition bug fix and version update
This patchset includes a fix to a race condition that can cause
kernel panic, as well as a driver version update because of this
fix.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Update driver version due to bug fix.
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix race condition between ena_update_on_link_change() and
ena_restore_device().
This race can occur if link notification arrives while the driver
is performing a reset sequence. In this case link can be set up,
enabling the device, before it is fully restored. If packets are
sent at this time, the driver might access uninitialized data
structures, causing kernel crash.
Move the clearing of ENA_FLAG_ONGOING_RESET and netif_carrier_on()
after ena_up() to ensure the device is ready when link is set up.
Fixes: d18e4f683445 ("net: ena: fix race condition between device reset and link up setup")
Signed-off-by: Arthur Kiyanovski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When calculating rb->frames_per_block * req->tp_block_nr the result
can overflow. Check it for overflow without limiting the total buffer
size to UINT_MAX.
This change fixes support for packet ring buffers >= UINT_MAX.
Fixes: 8f8d28e4d6d8 ("net/packet: fix overflow in check for tp_frame_nr")
Signed-off-by: Kal Conley <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|