Age | Commit message (Collapse) | Author | Files | Lines |
|
Every PCI-PCI bridge window should fit inside an upstream bridge window
because orphaned address space is unreachable from the primary side of the
upstream bridge. If we inherit invalid bridge windows that overlap an
upstream window from firmware, clip them to fit and update the bridge
accordingly.
[bhelgaas: changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <[email protected]>
Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: David Howells <[email protected]>
CC: Paul Gortmaker <[email protected]>
|
|
Every PCI-PCI bridge window should fit inside an upstream bridge window
because orphaned address space is unreachable from the primary side of the
upstream bridge. If we inherit invalid bridge windows that overlap an
upstream window from firmware, clip them to fit and update the bridge
accordingly.
[bhelgaas: changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <[email protected]>
Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: Richard Henderson <[email protected]>
CC: Ivan Kokshaysky <[email protected]>
CC: Matt Turner <[email protected]>
CC: [email protected]
|
|
Every PCI-PCI bridge window should fit inside an upstream bridge window
because orphaned address space is unreachable from the primary side of the
upstream bridge. If we inherit invalid bridge windows that overlap an
upstream window from firmware, clip them to fit and update the bridge
accordingly.
[bhelgaas: changelog]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <[email protected]>
Tested-by: Marek Kordik <[email protected]>
Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: [email protected]
CC: [email protected] # v3.16+
|
|
Add pci_claim_bridge_resource() to claim a PCI-PCI bridge window. This is
like regular pci_claim_resource(), except that if we fail to claim the
window, we check to see if we can reduce the size of the window and try
again.
This is for scenarios like this:
pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff]
pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xcfffffff pref]
The 00:01.0 window is illegal: it starts before the host bridge window, so
we have to assume the [0xbdf00000-0xbfffffff] region is inaccessible. We
can make it legal by clipping it to [mem 0xc0000000-0xddefffff 64bit pref].
Previously we discarded the 00:01.0 window and tried to reassign that part
of the hierarchy from scratch. That is a problem because Linux doesn't
always assign things optimally. For example, in this case, BIOS put the
01:00.0 device in a prefetchable window below 4GB, but after 5b28541552ef,
Linux puts the prefetchable window above 4GB where the 32-bit 01:00.0
device can't use it.
Clipping the 00:01.0 window is less intrusive than completely reassigning
things and is sufficient to let us use most of the BIOS configuration. Of
course, it's possible that devices below 00:01.0 will no longer fit. If
that's the case, we'll have to reassign things. But that's a separate
problem.
[bhelgaas: changelog, split into separate patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <[email protected]>
Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: [email protected] # v3.16+
|
|
Add pci_bus_clip_resource(). If a PCI-PCI bridge window overlaps an
upstream bridge window but is not completely contained by it, this clips
the downstream window so it fits inside the upstream one.
No functional change (this adds the function but no callers).
[bhelgaas: changelog, split into separate patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <[email protected]>
Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: [email protected] # v3.16+
|
|
pci_setup_bridge_io(), pci_setup_bridge_mmio(), and
pci_setup_bridge_mmio_pref() program the windows of PCI-PCI bridges.
Previously they accepted a pointer to the pci_bus of the secondary bus,
then looked up the bridge leading to that bus. Pass the bridge directly,
which will make it more convenient for future callers.
No functional change.
[bhelgaas: changelog, split into separate patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik <[email protected]>
Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: [email protected] # v3.16+
|
|
Reports against the TL-WDN4800 card indicate that PCI bus reset of this
Atheros device cause system lock-ups and resets. I've also been able to
confirm this behavior on multiple systems. The device never returns from
reset and attempts to access config space of the device after reset result
in hangs. Blacklist bus reset for the device to avoid this issue.
[bhelgaas: This regression appeared in v3.14. Andreas bisected it to
425c1b223dac ("PCI: Add Virtual Channel to save/restore support"), but we
don't understand the mechanism by which that commit affects the reset
path.]
[bhelgaas: changelog, references]
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Andreas Hartmann <[email protected]>
Tested-by: Andreas Hartmann <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: [email protected] # v3.14+
|
|
Enable a mechanism for devices to quirk that they do not behave when
doing a PCI bus reset. We require a modest level of spec compliant
behavior in order to do a reset, for instance the device should come
out of reset without throwing errors and PCI config space should be
accessible after reset. This is too much to ask for some devices.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
CC: [email protected] # v3.14+
|
|
to be page aligned"
This patch partially reverts commit 421520ba98290a73b35b7644e877a48f18e06004
(only the arm64 part). There is no guarantee that the boot-loader places other
images like dtb in a different page than initrd start/end, especially when the
kernel is built with 64KB pages. When this happens, such pages must not be
freed. The free_reserved_area() already takes care of rounding up "start" and
rounding down "end" to avoid freeing partially used pages.
Cc: <[email protected]> # 3.17+
Reported-by: Peter Maydell <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
cycles:p and cycles:pp do not work on SLM since commit:
86a04461a99f ("perf/x86: Revamp PEBS event selection")
UOPS_RETIRED.ALL is not a PEBS capable event, so it should not be used
to count cycle number.
Actually SLM calls intel_pebs_aliases_core2() which uses INST_RETIRED.ANY_P
to count the number of cycles. It's a PEBS capable event. But inv and
cmask must be set to count cycles.
Considering SLM allows all events as PEBS with no flags, only
INST_RETIRED.ANY_P, inv=1, cmask=16 needs to handled specially.
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This patch fixes a problem with the initialization of the
sysfs_show() routine for the RAPL PMU.
The current code was wrongly relying on the EVENT_ATTR_STR()
macro which uses the events_sysfs_show() function in the x86
PMU code. That function itself was relying on the x86_pmu data
structure. Yet RAPL and the core PMU (x86_pmu) have nothing to
do with each other. They should therefore not interact with
each other.
The x86_pmu structure is initialized at boot time based on
the host CPU model. When the host CPU is not supported, the
x86_pmu remains uninitialized and some of the callbacks it
contains are NULL.
The false dependency with x86_pmu could potentially cause crashes
in case the x86_pmu is not initialized while the RAPL PMU is. This
may, for instance, be the case in virtualized environments.
This patch fixes the problem by using a private sysfs_show()
routine for exporting the RAPL PMU events.
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: http://lkml.kernel.org/r/20150113225953.GA21525@thinkpad
Cc: [email protected]
Cc: [email protected]
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Linus Torvalds <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
softnet_data.input_pkt_queue is protected by a spinlock that
we must hold when transferring packets from victim queue to an active
one. This is because other cpus could still be trying to enqueue packets
into victim queue.
A second problem is that when we transfert the NAPI poll_list from
victim to current cpu, we absolutely need to special case the percpu
backlog, because we do not want to add complex locking to protect
process_queue : Only owner cpu is allowed to manipulate it, unless cpu
is offline.
Based on initial patch from Prasad Sodagudi & Subash Abhinov
Kasiviswanathan.
This version is better because we do not slow down packet processing,
only make migration safer.
Reported-by: Prasad Sodagudi <[email protected]>
Reported-by: Subash Abhinov Kasiviswanathan <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Tony Lindgren says:
====================
Fixes for davinci_emac
Here's a repost of the fixes for davinci_emac with patches
updated for comments and acks collected.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
On dm816x we have two emac controllers with separate memory
areas.
Cc: Brian Hutchinson <[email protected]>
Cc: Felipe Balbi <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
space
Some devices like dm816x have the MDIO registers within the first EMAC
instance address space. Let's fix the issue by allowing to pass an
optional second IO range for the EMAC control register area.
Cc: Brian Hutchinson <[email protected]>
Cc: Felipe Balbi <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Looks like the phy_id is never set up beyond getting the phandle.
Note that we can remove the ifdef for phy_node as there is a stub
for of_phy_connec() if CONFIG_OF is not set.
Cc: Brian Hutchinson <[email protected]>
Cc: Felipe Balbi <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We only use clk_get() to get the frequency, the rest is done by
the runtime PM calls. Let's free the clock too.
Cc: Brian Hutchinson <[email protected]>
Cc: Felipe Balbi <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 3ba97381343b ("net: ethernet: davinci_emac: add pm_runtime support")
added support for runtime PM, but it causes issues on omap3 related devices
that actually gate the clocks:
Unhandled fault: external abort on non-linefetch (0x1008)
...
[<c04160f0>] (emac_dev_getnetstats) from [<c04d6a3c>] (dev_get_stats+0x78/0xc8)
[<c04d6a3c>] (dev_get_stats) from [<c04e9ccc>] (rtnl_fill_ifinfo+0x3b8/0x938)
[<c04e9ccc>] (rtnl_fill_ifinfo) from [<c04eade4>] (rtmsg_ifinfo+0x68/0xd8)
[<c04eade4>] (rtmsg_ifinfo) from [<c04dd35c>] (register_netdevice+0x3a0/0x4ec)
[<c04dd35c>] (register_netdevice) from [<c04dd4bc>] (register_netdev+0x14/0x24)
[<c04dd4bc>] (register_netdev) from [<c041755c>] (davinci_emac_probe+0x408/0x5c8)
[<c041755c>] (davinci_emac_probe) from [<c0396d78>] (platform_drv_probe+0x48/0xa4)
Let's fix it by moving the pm_runtime_get() call earlier, and also add it to
the emac_dev_getnetstats(). Also note that we want to use pm_runtime_get_sync()
as we don't want to have deferred_resume happen. And let's also check the
return value for pm_runtime_get_sync() as noted by Felipe Balbi <[email protected]>.
Cc: Brian Hutchinson <[email protected]>
Acked-by: Mark A. Greer <[email protected]>
Reviewed-by: Felipe Balbi <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
On davinci_emac, we have pulse interrupts. This means that we need to
clear the EOI bits when disabling interrupts as otherwise the interrupts
keep happening. And we also need to not clear the EOI bits again when
enabling the interrupts as otherwise we will get tons of:
unexpected IRQ trap at vector 00
These errors almost certainly mean that the omap-intc.c is signaling
a spurious interrupt with the reserved irq 127 as we've seen earlier
on omap3.
Let's fix the issue by clearing the EOI bits when disabling the
interrupts. Let's also keep the comment for "Rx Threshold and Misc
interrupts are not enabled" for both enable and disable so people
are aware of this when potentially adding more support.
Note that eventually we should handle the RX and TX interrupts
separately like cpsw is now doing. However, so far I have not seen
any issues with this based on my testing, so it seems to behave a
little different compared to the cpsw that had a similar issue.
Cc: Brian Hutchinson <[email protected]>
Reviewed-by: Felipe Balbi <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"This fixes a regression in the latest fuse update plus a fix for a
rather theoretical memory ordering issue"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: add memory barrier to INIT
fuse: fix LOOKUP vs INIT compat handling
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull fbdev fixes from Tomi Valkeinen:
- broadsheetfb: fix memory leak
- simplefb: fix build failure on sparc
* tag 'fbdev-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
fbdev/broadsheetfb: fix memory leak
simplefb: Fix build failure on Sparc
|
|
Pull MMC bugfix from Ulf Hansson:
"Fix sdhci regulator regression for Qualcomm and Nvidia boards"
* tag 'mmc-v3.19-4' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: sdhci: Set SDHCI_POWER_ON with external vmmc
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k fixlet from Geert Uytterhoeven.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up execveat
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc fixes from Michael Ellerman:
"A few powerpc fixes"
* tag 'powerpc-3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc: Work around gcc bug in current_thread_info()
cxl: Fix issues when unmapping contexts
powernv: Fix OPAL tracepoint code
|
|
The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That
structure is defined and allocated on the stack as
struct {
struct sock_extended_err ee;
struct sockaddr_in(6) offender;
} errhdr;
The second part is only initialized for certain SO_EE_ORIGIN values.
Always initialize it completely.
An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that
would return uninitialized bytes.
Signed-off-by: Willem de Bruijn <[email protected]>
----
Also verified that there is no padding between errhdr.ee and
errhdr.offender that could leak additional kernel data.
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2015-01-15
this is a pull request of 8 patches.
Ahmed S. Darwish contributes 4 fixes for the kvaser_usb driver. The two patches
by Oliver Hartkopp mark the m_can driver as non-ISO, as the CANFD standard was
updated. Roger Quadros's patch for the c_can driver fixes the register access
during RAMINIT. And one patch by my, which updates the MAINTAINERS file, as we
moved the git repos to the kernel.org infrastructure.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Except for VXLAN steering rules, all offloads should work as they were
under plain DMFS mode. Fix that by enabling all the offloads under
DMFS-A0 mode, except for VXLAN steering rules.
Fixes: d57febe1a478 "net/mlx4: Add A0 hybrid steering"
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Just two fixes - one for an uninialized variable and
one for a deadlock in regulatory processing.
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch fixes double kfree() calls at init_rx_ring() because
it causes static checker warning.
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Byungho An <[email protected]>
Signed-off-by: Kukjin Kim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When the MAC address is provided in the device tree file, the
condition is true and kernel crashes due to NULL dereference.
Signed-off-by: Girish K.S <[email protected]>
Signed-off-by: Byungho An <[email protected]>
Signed-off-by: Kukjin Kim <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This was inadvertently dropped from an earlier commit, otherwise
the check against cq_vector == -1 to prevent double free doesn't
make any sense.
Fixes: 2b25d981790b
Signed-off-by: Jens Axboe <[email protected]>
|
|
commit b284fbe3b3ef9cf8 ("sh_eth: Fix access to TRSCER register") wanted
to add a .trscer_err_mask value to the R-Car Gen2 family-specific data
structure (r8a779x_data), but it was accidentally added to the
SH7724-specific data structure (sh7724_data).
Presumably this happened due to a patch conflict with commit
d407bc0203539031 ("sh-eth: Set fdr_value of R-Car SoCs"), which added
another field at the same position.
Move the field setting to fix this.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Fixes: b284fbe3b3ef9cf8 ("sh_eth: Fix access to TRSCER register")
Signed-off-by: David S. Miller <[email protected]>
|
|
while adding vlan in dual EMAC mode, only specific ports should be
subscribed for the vlan, else it will lead to switching mode and
if both ports connected to same switch cpsw will hung as it creates
a network loop. Fixing this by adding only specific ports in case
of dual EMAC.
Signed-off-by: Mugunthan V N <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Whenever Suspend PHY bit is set on DRA7x devices,
USB will not work due to Set EP Configuration command
always failing.
This was only found after a recent commit 2164a47 (usb:
dwc3: set SUSPHY bit for all cores, which will be merged
for v3.19) added a missing *required* step to dwc3
initialization. Synopsys Databook requires that we enable
Suspend PHY bit after initialization but that, unfortunately,
breaks DRA7x.
Note that the same regression was already patched for AM437x.
Reported-by: Roger Quadros <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
|
|
of_get_named_gpiod_flags fails with -EPROBE_DEFER in cases
where the gpio chip is available and the GPIO translation fails.
This causes drivers to be re-probed erroneusly, and hides the
real problem(i.e. the GPIO number being out of range).
Cc: Stable <[email protected]>
Signed-off-by: Hans Holmberg <[email protected]>
Reviewed-by: Alexandre Courbot <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
Fix attribute-creation race with userspace by using the default group
to create also the contingent gpio device attributes.
Fixes: d8f388d8dc8d ("gpio: sysfs interface")
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
The gpio device attributes were never destroyed when the gpio was
unexported (or on export failures).
Use device_create_with_groups() to create the default device attributes
of the gpio class device. Note that this also fixes the
attribute-creation race with userspace for these attributes.
Remove contingent attributes in export error path and on unexport.
Fixes: d8f388d8dc8d ("gpio: sysfs interface")
Cc: stable <[email protected]> # v2.6.27+
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
The gpio-chip device attributes were never destroyed when the device was
removed.
Fix by using device_create_with_groups() to create the device attributes
of the chip class device.
Note that this also fixes the attribute-creation race with userspace.
Fixes: d8f388d8dc8d ("gpio: sysfs interface")
Cc: stable <[email protected]> # v2.6.27+
Signed-off-by: Johan Hovold <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
This was accidently lost in 76a0df859def.
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
We should not touch the packet after a netif_rx: it might
get freed behind our back.
Suggested-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Ahmed S. Darwish <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Recent Leaf firmware versions (>= 3.1.557) do not allow to send
commands for non-existing channels. If a command is sent for a
non-existing channel, the firmware crashes.
Reported-by: Christopher Storah <[email protected]>
Signed-off-by: Olivier Sobrie <[email protected]>
Signed-off-by: Ahmed S. Darwish <[email protected]>
Cc: linux-stable <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in very high frequency (*), closing the CAN channel while
all the transmissions are on (#), opening the device again (@),
then sending a small number of packets would make the driver
enter an almost infinite loop of:
[....]
[15959.853988] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853990] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853991] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853993] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853994] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853995] kvaser_usb 4-3:1.0 can0: cannot find free context
[....]
_dragging the whole system down_ in the process due to the
excessive logging output.
Initially, this has caused random panics in the kernel due to a
buggy error recovery path. That got fixed in an earlier commit.(%)
This patch aims at solving the root cause. -->
16 tx URBs and contexts are allocated per CAN channel per USB
device. Such URBs are protected by:
a) A simple atomic counter, up to a value of MAX_TX_URBS (16)
b) A flag in each URB context, stating if it's free
c) The fact that ndo_start_xmit calls are themselves protected
by the networking layers higher above
After grabbing one of the tx URBs, if the driver noticed that all
of them are now taken, it stops the netif transmission queue.
Such queue is worken up again only if an acknowedgment was received
from the firmware on one of our earlier-sent frames.
Meanwhile, upon channel close (#), the driver sends a CMD_STOP_CHIP
to the firmware, effectively closing all further communication. In
the high traffic case, the atomic counter remains at MAX_TX_URBS,
and all the URB contexts remain marked as active. While opening
the channel again (@), it cannot send any further frames since no
more free tx URB contexts are available.
Reset all tx URB contexts upon CAN channel close.
(*) 50 parallel instances of `cangen0 -g 0 -ix`
(#) `ifconfig can0 down`
(@) `ifconfig can0 up`
(%) "can: kvaser_usb: Don't free packets when tight on URBs"
Signed-off-by: Ahmed S. Darwish <[email protected]>
Cc: linux-stable <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in high frequency caused seemingly-random panics in the
kernel.
On further inspection, it seems the driver erroneously freed the
to-be-transmitted packet upon getting tight on URBs and returning
NETDEV_TX_BUSY, leading to invalid memory writes and double frees
at a later point in time.
Note:
Finding no more URBs/transmit-contexts and returning NETDEV_TX_BUSY
is a driver bug in and out of itself: it means that our start/stop
queue flow control is broken.
This patch only fixes the (buggy) error handling code; the root
cause shall be fixed in a later commit.
Acked-by: Olivier Sobrie <[email protected]>
Signed-off-by: Ahmed S. Darwish <[email protected]>
Cc: linux-stable <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
use of regmap_read() and regmap_write() in c_can_hw_raminit_syscon()
is not safe as the RAMINIT register can be shared between different drivers
at least for TI SoCs.
To make the modification atomic we switch to using regmap_update_bits().
regmap_update_bits() skips writing to the register if it's read content is the
same as what is going to be written. This causes an issue for us when we
need to clear the DONE bit with the initial condition START:0, DONE:1 as
DONE bit must be written with 1 to clear it.
So we defer the clearing of DONE bit to later when we set the START bit.
There we are sure that START bit is changed from 0 to 1 so the write of
1 to already set DONE bit will happen.
Signed-off-by: Roger Quadros <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
During the CAN FD standardization process within the ISO it turned out that
the failure detection capability has to be improved.
The CAN in Automation organization (CiA) defined the already implemented CAN
FD controllers as 'non-ISO' and the upcoming improved CAN FD controllers as
'ISO' compliant. See at http://www.can-cia.com/index.php?id=1937
Finally there will be three types of CAN FD controllers in the future:
1. ISO compliant (fixed)
2. non-ISO compliant (fixed, like the M_CAN IP v3.0.1 in m_can.c)
3. ISO/non-ISO CAN FD controllers (switchable, like the PEAK USB FD)
So the current M_CAN driver for the M_CAN IP v3.0.1 has to expose its non-ISO
implementation by setting the CAN_CTRLMODE_FD_NON_ISO ctrlmode at startup.
As this bit cannot be switched at configuration time CAN_CTRLMODE_FD_NON_ISO
must not be set in ctrlmode_supported of the current M_CAN driver.
Signed-off-by: Oliver Hartkopp <[email protected]>
Cc: linux-stable <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
When changing flags in the CAN drivers ctrlmode the provided new content has to
be checked whether the bits are allowed to be changed. The bits that are to be
changed are given as a bitfield in cm->mask. Therefore checking against
cm->flags is wrong as the content can hold any kind of values.
The iproute2 tool sets the bits in cm->mask and cm->flags depending on the
detected command line options. To be robust against bogus user space
applications additionally sanitize the provided flags with the provided mask.
Cc: Wolfgang Grandegger <[email protected]>
Signed-off-by: Oliver Hartkopp <[email protected]>
Cc: linux-stable <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
The linux-can upstream git repositories are now hosted on kernel.org, update
MAINTAINERS accordingly.
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Commit 5f893b2639b2 "tracing: Move enabling tracepoints to just after
rcu_init()" broke the enabling of system call events from the command
line. The reason was that the enabling of command line trace events
was moved before PID 1 started, and the syscall tracepoints require
that all tasks have the TIF_SYSCALL_TRACEPOINT flag set. But the
swapper task (pid 0) is not part of that. Since the swapper task is the
only task that is running at this early in boot, no task gets the
flag set, and the tracepoint never gets reached.
Instead of setting the swapper task flag (there should be no reason to
do that), re-enabled trace events again after the init thread (PID 1)
has been started. It requires disabling all command line events and
re-enabling them, as just enabling them again will not reset the logic
to set the TIF_SYSCALL_TRACEPOINT flag, as the syscall tracepoint will
be fooled into thinking that it was already set, and wont try setting
it again. For this reason, we must first disable it and re-enable it.
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Michael Ellerman <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
|
|
trace_init() calls init_ftrace_syscalls() and then calls trace_event_init()
which also calls init_ftrace_syscalls(). It makes more sense to only
call it from trace_event_init().
Calling it twice wastes memory, as it allocates the syscall events twice,
and loses the first copy of it.
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Reported-by: Wang Nan <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
|
|
If the function graph tracer traces a jprobe callback, the system will
crash. This can easily be demonstrated by compiling the jprobe
sample module that is in the kernel tree, loading it and running the
function graph tracer.
# modprobe jprobe_example.ko
# echo function_graph > /sys/kernel/debug/tracing/current_tracer
# ls
The first two commands end up in a nice crash after the first fork.
(do_fork has a jprobe attached to it, so "ls" just triggers that fork)
The problem is caused by the jprobe_return() that all jprobe callbacks
must end with. The way jprobes works is that the function a jprobe
is attached to has a breakpoint placed at the start of it (or it uses
ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
will copy the stack frame and change the ip address to return to the
jprobe handler instead of the function. The jprobe handler must end
with jprobe_return() which swaps the stack and does an int3 (breakpoint).
This breakpoint handler will then put back the saved stack frame,
simulate the instruction at the beginning of the function it added
a breakpoint to, and then continue on.
For function tracing to work, it hijakes the return address from the
stack frame, and replaces it with a hook function that will trace
the end of the call. This hook function will restore the return
address of the function call.
If the function tracer traces the jprobe handler, the hook function
for that handler will not be called, and its saved return address
will be used for the next function. This will result in a kernel crash.
To solve this, pause function tracing before the jprobe handler is called
and unpause it before it returns back to the function it probed.
Some other updates:
Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the
code look a bit cleaner and easier to understand (various tries to fix
this bug required this change).
Note, if fentry is being used, jprobes will change the ip address before
the function graph tracer runs and it will not be able to trace the
function that the jprobe is probing.
Link: http://lkml.kernel.org/r/[email protected]
Cc: [email protected] # 2.6.30+
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
|