Age | Commit message (Collapse) | Author | Files | Lines |
|
When the CONFIG_GENERIC_BUG is disabled by disabling CONFIG_BUG, if a
kernel thread is trapped by BUG(), the whole system will be in the
loop that infinitely handles the ebreak exception instead of entering the
die function. To fix this problem, the do_trap_break() will always call
the die() to deal with the break exception as the type of break is
BUG_TRAP_TYPE_BUG.
Signed-off-by: Vincent Chen <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Paul Walmsley <[email protected]>
|
|
Commit 3c1d3f097972 ("MIPS: futex: Emit Loongson3 sync workarounds
within asm") inadvertently removed the newlines following
__WEAK_LLSC_MB, which causes build failures for configurations in which
__WEAK_LLSC_MB expands to a sync instruction:
{standard input}: Assembler messages:
{standard input}:9346: Error: symbol `sync3' is already defined
{standard input}:9380: Error: symbol `sync3' is already defined
...
Fix this by restoring the newlines to separate the sync instruction from
anything following it (such as the 3: label), preventing inadvertent
concatenation.
Signed-off-by: Paul Burton <[email protected]>
Fixes: 3c1d3f097972 ("MIPS: futex: Emit Loongson3 sync workarounds within asm")
|
|
In commit 9f79b78ef744 ("Convert filldir[64]() from __put_user() to
unsafe_put_user()") I made filldir() use unsafe_put_user(), which
improves code generation on x86 enormously.
But because we didn't have a "unsafe_copy_to_user()", the dirent name
copy was also done by hand with unsafe_put_user() in a loop, and it
turns out that a lot of other architectures didn't like that, because
unlike x86, they have various alignment issues.
Most non-x86 architectures trap and fix it up, and some (like xtensa)
will just fail unaligned put_user() accesses unconditionally. Which
makes that "copy using put_user() in a loop" not work for them at all.
I could make that code do explicit alignment etc, but the architectures
that don't like unaligned accesses also don't really use the fancy
"user_access_begin/end()" model, so they might just use the regular old
__copy_to_user() interface.
So this commit takes that looping implementation, turns it into the x86
version of "unsafe_copy_to_user()", and makes other architectures
implement the unsafe copy version as __copy_to_user() (the same way they
do for the other unsafe_xyz() accessor functions).
Note that it only does this for the copying _to_ user space, and we
still don't have a unsafe version of copy_from_user().
That's partly because we have no current users of it, but also partly
because the copy_from_user() case is slightly different and cannot
efficiently be implemented in terms of a unsafe_get_user() loop (because
gcc can't do asm goto with outputs).
It would be trivial to do this using "rep movsb", which would work
really nicely on newer x86 cores, but really badly on some older ones.
Al Viro is looking at cleaning up all our user copy routines to make
this all a non-issue, but for now we have this simple-but-stupid version
for x86 that works fine for the dirent name copy case because those
names are short strings and we simply don't need anything fancier.
Fixes: 9f79b78ef744 ("Convert filldir[64]() from __put_user() to unsafe_put_user()")
Reported-by: Guenter Roeck <[email protected]>
Reported-and-tested-by: Tony Luck <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Max Filippov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
{save,restore}_dsp() internally checks if the cpu has dsp support.
Therefore, explicit check is not required before calling them in
{save,restore}_processor_state()
Signed-off-by: Aurabindo Jayamohanan <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
Since dropping the set-to-gtt-domain in commit a679f58d0510 ("drm/i915:
Flush pages on acquisition"), we no longer mark the contents as dirty on
a write fault. This has the issue of us then not marking the pages as
dirty on releasing the buffer, which means the contents are not written
out to the swap device (should we ever pick that buffer as a victim).
Notably, this is visible in the dumb buffer interface used for cursors.
Having updated the cursor contents via mmap, and swapped away, if the
shrinker should evict the old cursor, upon next reuse, the cursor would
be invisible.
E.g. echo 80 > /proc/sys/kernel/sysrq ; echo f > /proc/sysrq-trigger
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111541
Fixes: a679f58d0510 ("drm/i915: Flush pages on acquisition")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: <[email protected]> # v5.2+
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 5028851cdfdf78dc22eacbc44a0ab0b3f599ee4a)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Force bonded requests to run on distinct engines so that they cannot be
shuffled onto the same engine where timeslicing will reverse the order.
A bonded request will often wait on a semaphore signaled by its master,
creating an implicit dependency -- if we ignore that implicit dependency
and allow the bonded request to run on the same engine and before its
master, we will cause a GPU hang. [Whether it will hang the GPU is
debatable, we should keep on timeslicing and each timeslice should be
"accidentally" counted as forward progress, in which case it should run
but at one-half to one-third speed.]
We can prevent this inversion by restricting which engines we allow
ourselves to jump to upon preemption, i.e. baking in the arrangement
established at first execution. (We should also consider capturing the
implicit dependency using i915_sched_add_dependency(), but first we need
to think about the constraints that requires on the execution/retirement
ordering.)
Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing")
References: ee1136908e9b ("drm/i915/execlists: Virtual engine bonding")
Testcase: igt/gem_exec_balancer/bonded-slice
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit e2144503bf3b22275dd33cef2880e1cb5fb200c5)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The officially validated plane width limit is 4k on skl+, however
we already had people using 5k displays before we started to enforce
the limit. Also it seems Windows allows 5k resolutions as well
(though not sure if they do it with one plane or two).
According to hw folks 5k should work with the possible
exception of the following features:
- Ytile (already limited to 4k)
- FP16 (already limited to 4k)
- render compression (already limited to 4k)
- KVMR sprite and cursor (don't care)
- horizontal panning (need to verify this)
- pipe and plane scaling (need to verify this)
So apart from last two items on that list we are already
fine. We should really verify what happens with those last
two items but I don't have a 5k display on hand atm so it'll
have to wait.
In the meantime let's just bump the limit back up to 5k since
several users have already been using it without apparent issues.
At least we'll be no worse off than we were prior to lowering
the limits.
Cc: [email protected]
Cc: Sean Paul <[email protected]>
Cc: José Roberto de Souza <[email protected]>
Tested-by: Leho Kraav <[email protected]>
Fixes: 372b9ffb5799 ("drm/i915: Fix skl+ max plane width")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111501
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Sean Paul <[email protected]>
(cherry picked from commit bed34ef544f9ab37ab349c04cf4142282c4dcf5d)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
When using virtual engines, the rq->engine is not stable until we hold
the engine->active.lock (as the virtual engine may be exchanged with the
sibling). Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
we may retire a request concurrently with resubmitting it to HW, we need
to be extra careful to verify we are holding the correct lock for the
request's active list. This is similar to the issue we saw with
rescheduling the virtual requests, see sched_lock_engine().
Or else:
<4> [876.736126] list_add corruption. prev->next should be next (ffff8883f931a1f8), but was dead000000000100. (prev=ffff888361ffa610).
<4> [876.736136] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
<4> [876.736137] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
<4> [876.736154] CPU: 2 PID: 21 Comm: ksoftirqd/2 Tainted: G U 5.3.0-CI-CI_DRM_6898+ #1
<4> [876.736156] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4> [876.736157] RIP: 0010:__list_add_valid+0x4d/0x70
<4> [876.736159] Code: c3 48 89 d1 48 c7 c7 20 33 0e 82 48 89 c2 e8 4a 4a bc ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 70 33 0e 82 e8 33 4a bc ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 c0 33 0e 82 e8
<4> [876.736160] RSP: 0018:ffffc9000018bd30 EFLAGS: 00010082
<4> [876.736162] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000104
<4> [876.736163] RDX: 0000000080000104 RSI: 0000000000000000 RDI: 00000000ffffffff
<4> [876.736164] RBP: ffffc9000018bd68 R08: 0000000000000000 R09: 0000000000000001
<4> [876.736165] R10: 00000000aed95de3 R11: 000000007fe927eb R12: ffff888361ffca10
<4> [876.736166] R13: ffff888361ffa610 R14: ffff888361ffc880 R15: ffff8883f931a1f8
<4> [876.736168] FS: 0000000000000000(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000
<4> [876.736169] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [876.736170] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0
<4> [876.736171] PKRU: 55555554
<4> [876.736172] Call Trace:
<4> [876.736226] __i915_request_submit+0x152/0x370 [i915]
<4> [876.736263] __execlists_submission_tasklet+0x6da/0x1f50 [i915]
<4> [876.736293] ? execlists_submission_tasklet+0x29/0x50 [i915]
<4> [876.736321] execlists_submission_tasklet+0x34/0x50 [i915]
<4> [876.736325] tasklet_action_common.isra.5+0x47/0xb0
<4> [876.736328] __do_softirq+0xd8/0x4ae
<4> [876.736332] ? smpboot_thread_fn+0x23/0x280
<4> [876.736334] ? smpboot_thread_fn+0x6b/0x280
<4> [876.736336] run_ksoftirqd+0x2b/0x50
<4> [876.736338] smpboot_thread_fn+0x1d3/0x280
<4> [876.736341] ? sort_range+0x20/0x20
<4> [876.736343] kthread+0x119/0x130
<4> [876.736345] ? kthread_park+0xa0/0xa0
<4> [876.736347] ret_from_fork+0x24/0x50
<4> [876.736353] irq event stamp: 2290145
<4> [876.736356] hardirqs last enabled at (2290144): [<ffffffff8123cde8>] __slab_free+0x3e8/0x500
<4> [876.736358] hardirqs last disabled at (2290145): [<ffffffff819cfb4d>] _raw_spin_lock_irqsave+0xd/0x50
<4> [876.736360] softirqs last enabled at (2290114): [<ffffffff81c0033e>] __do_softirq+0x33e/0x4ae
<4> [876.736361] softirqs last disabled at (2290119): [<ffffffff810b815b>] run_ksoftirqd+0x2b/0x50
<4> [876.736363] WARNING: CPU: 2 PID: 21 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
<4> [876.736364] ---[ end trace 3e58d6c7356c65bf ]---
<4> [876.736406] ------------[ cut here ]------------
<4> [876.736415] list_del corruption. prev->next should be ffff888361ffca10, but was ffff88840ac2c730
<4> [876.736421] WARNING: CPU: 2 PID: 5490 at lib/list_debug.c:53 __list_del_entry_valid+0x79/0x90
<4> [876.736422] Modules linked in: i915(+) amdgpu gpu_sched ttm vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_intel_nhlt snd_hda_codec snd_hwdep snd_hda_core ghash_clmulni_intel e1000e cdc_ether usbnet mii snd_pcm ptp pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
<4> [876.736433] CPU: 2 PID: 5490 Comm: i915_selftest Tainted: G U W 5.3.0-CI-CI_DRM_6898+ #1
<4> [876.736435] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4> [876.736436] RIP: 0010:__list_del_entry_valid+0x79/0x90
<4> [876.736438] Code: 0b 31 c0 c3 48 89 fe 48 c7 c7 30 34 0e 82 e8 ae 49 bc ff 0f 0b 31 c0 c3 48 89 f2 48 89 fe 48 c7 c7 68 34 0e 82 e8 97 49 bc ff <0f> 0b 31 c0 c3 48 c7 c7 a8 34 0e 82 e8 86 49 bc ff 0f 0b 31 c0 c3
<4> [876.736439] RSP: 0018:ffffc900003ef758 EFLAGS: 00010086
<4> [876.736440] RAX: 0000000000000000 RBX: ffff888361ffc840 RCX: 0000000000000002
<4> [876.736442] RDX: 0000000080000002 RSI: 0000000000000000 RDI: 00000000ffffffff
<4> [876.736443] RBP: ffffc900003ef780 R08: 0000000000000000 R09: 0000000000000001
<4> [876.736444] R10: 000000001418e4b7 R11: 000000007f0ea93b R12: ffff888361ffcab8
<4> [876.736445] R13: ffff88843b6d0000 R14: 000000000000217c R15: 0000000000000001
<4> [876.736447] FS: 00007f4e6f255240(0000) GS:ffff88849fd00000(0000) knlGS:0000000000000000
<4> [876.736448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [876.736449] CR2: 00007f093a9173c0 CR3: 00000003bba08005 CR4: 0000000000760ee0
<4> [876.736450] PKRU: 55555554
<4> [876.736451] Call Trace:
<4> [876.736488] i915_request_retire+0x224/0x8e0 [i915]
<4> [876.736521] i915_request_create+0x4b/0x1b0 [i915]
<4> [876.736550] nop_virtual_engine+0x230/0x4d0 [i915]
Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111695
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Matthew Auld <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 37fa0de3c137d5f54f7e64f53495c9d501d42a4d)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
A few times in CI, we have detected a GPU hang on our Haswell GT2
systems with the characteristic IPEHR of 0x780c0000. When the PSMI w/a
was first introducted, it was applied to all Haswell, but later on we
found an erratum that supposedly restricted the issue to GT1 and so
constrained it only be applied on GT1. That may have been a mistake...
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111692
Fixes: 167bc759e823 ("drm/i915: Restrict PSMI context load w/a to Haswell GT1")
References: 2c550183476d ("drm/i915: Disable PSMI sleep messages on all rings around context switches")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Acked-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 56c05de6bd773b96deca379370965c49042b5fbf)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
While srcu may use an integer tag, it does not exclude potential error
codes and so may overlap with our own use of -EINTR. Use a separate
outparam to store the tag, and report the error code separately.
Fixes: 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit eebab60f224fcfd560957715d08c31564d8672ed)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
This allows userspace to use "legacy" mode for push constants, where
they are committed at 3DPRIMITIVE or flush time, rather than being
committed at 3DSTATE_BINDING_TABLE_POINTERS_XS time. Gen6-8 and Gen11
both use the "legacy" behavior - only Gen9 works in the "new" way.
Conflating push constants with binding tables is painful for userspace,
we would like to be able to avoid doing so.
Signed-off-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 0606259e3b3a1220a0f04a92a1654a3f674f47ee)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
As soon as we re-enable the various functions within the HW, they may go
off and read data via a GGTT offset. Hence, if we have not yet restored
the GGTT PTE before then, they may read and even *write* random locations
in memory.
Detected by DMAR faults during resume.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Martin Peres <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: [email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit cec5ca08e36fd18d2939b98055346b3b06f56c6c)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
As we may unwind incomplete requests (for preemption) prior to
processing the CSB and the schedule-out events, we may update rq->engine
(resetting it to point back to the parent virtual engine) prior to
calling execlists_schedule_out(), invalidating the assertion that the
request still points to the inflight engine. (The likelihood of this is
increased if the CSB interrupt processing is pushed to the ksoftirqd for
being too slow and direct submission overtakes it.)
Tvrtko summarised it as:
"So unwind from direct submission resets rq->engine and races with
process_csb from the tasklet which notices request has actually
completed."
Reported-by: Vinay Belgaumkar <[email protected]>
Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Vinay Belgaumkar <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit d810583fc2fcf139cc766eb2303500b2d9cf064d)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
IOC3 chips in SGI system are conntected to a bridge ASIC, which has
a 1-wire prom attached with part number information. This changeset
uses this information to create PCI subsystem information, which
the MFD driver uses for further platform device setup.
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Lee Jones <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Srinivas Kandagatla <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Cc: Alexandre Belloni <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
nvmem_device_find provides a way to search for nvmem devices with
the help of a match function simlair to bus_find_device.
Reviewed-by: Srinivas Kandagatla <[email protected]>
Acked-by: Srinivas Kandagatla <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Lee Jones <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Alessandro Zummo <[email protected]>
Cc: Alexandre Belloni <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
Adding leds and related triggers.
Signed-off-by: Alexandre GRIVEAUX <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
Add IW8103 Wifi + bluetooth module to device tree and related power domain.
Signed-off-by: Alexandre GRIVEAUX <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
Adding missing I2C nodes and some peripheral:
- PMU
- RTC
Signed-off-by: Alexandre GRIVEAUX <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
Add the devicetree nodes for the I2C core of the JZ4780 SoC, disabled
by default.
Signed-off-by: Alexandre GRIVEAUX <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
FORTIFY_SOURCE detects various overflows at compile and run time.
(6974f0c4555e ("include/linux/string.h:
add the option of fortified string.h functions)
ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and
run with CONFIG_FORTIFY_SOURCE.
Since mips can be built and run with that flag,
select ARCH_HAS_FORTIFY_SOURCE as default.
Signed-off-by: Dmitry Korotin <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
|
|
CSR IPI and legacy MMIO use the same infrastructure, but CSR IPI is
faster than legacy MMIO IPI. This patch enable CSR IPI if possible
(except for MailBox, because CSR IPI is too complicated for MailBox).
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Fuxin Zhang <[email protected]>
Cc: Zhangjin Wu <[email protected]>
Cc: Huacai Chen <[email protected]>
|
|
All Loongson-3 CPU family:
Code-name Brand-name PRId
Loongson-3A R1 Loongson-3A1000 0x6305
Loongson-3A R2 Loongson-3A2000 0x6308
Loongson-3A R2.1 Loongson-3A2000 0x630c
Loongson-3A R3 Loongson-3A3000 0x6309
Loongson-3A R3.1 Loongson-3A3000 0x630d
Loongson-3A R4 Loongson-3A4000 0xc000
Loongson-3B R1 Loongson-3B1000 0x6306
Loongson-3B R2 Loongson-3B1500 0x6307
Features of R4 revision of Loongson-3A:
- All R2/R3 features, including SFB, V-Cache, FTLB, RIXI, DSP, etc.
- Support variable ASID bits.
- Support MSA and VZ extensions.
- Support CPUCFG (CPU config) and CSR (Control and Status Register)
extensions.
- 64 entries of VTLB (classic TLB), 2048 entries of FTLB (8-way
set-associative).
Now 64-bit Loongson processors has three types of PRID.IMP: 0x6300 is
the classic one so we call it PRID_IMP_LOONGSON_64C (e.g., Loongson-2E/
2F/3A1000/3B1000/3B1500/3A2000/3A3000), 0x6100 is for some processors
which has reduced capabilities so we call it PRID_IMP_LOONGSON_64R
(e.g., Loongson-2K), 0xc000 is supposed to cover all new processors in
general (e.g., Loongson-3A4000+) so we call it PRID_IMP_LOONGSON_64G.
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Jiaxun Yang <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Fuxin Zhang <[email protected]>
Cc: Zhangjin Wu <[email protected]>
Cc: Huacai Chen <[email protected]>
|
|
Loongson-3A R4+ (Loongson-3A4000 and newer) has CPUCFG (CPU config) and
CSR (Control and Status Register) extensions. This patch add read/write
functionalities for them.
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Jiaxun Yang <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Fuxin Zhang <[email protected]>
Cc: Zhangjin Wu <[email protected]>
Cc: Huacai Chen <[email protected]>
|
|
The memory initialization of SGI-IP27 is already half-way to support
SPARSEMEM. It only had free_bootmem_with_active_regions() left-overs
interfering with sparse_memory_present_with_active_regions().
Replace these calls with simpler memblocks_present() call in prom_meminit()
and adjust arch/mips/Kconfig to enable SPARSEMEM and SPARSEMEM_EXTREME for
SGI-IP27.
Co-developed-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Thomas Bogendoerfer <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
|
|
When Loongson3 LL/SC errata workarounds are enabled (ie.
CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) run a tool to scan through the
compiled kernel & ensure that the workaround is applied correctly. That
is, ensure that:
- Every LL or LLD instruction is preceded by a sync instruction.
- Any branches from within an LL/SC loop to outside of that loop
target a sync instruction.
Reasoning for these conditions can be found by reading the comment above
the definition of __SYNC_loongson3_war in arch/mips/include/asm/sync.h.
This tool will help ensure that we don't inadvertently introduce code
paths that miss the required workarounds.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
In ejtag_debug_handler() we must reload the address of
ejtag_debug_buffer_spinlock if an sc fails, since the address in k0 will
have been clobbered by the result of the sc instruction. In the case
where we simply load a non-zero value (ie. there's contention for the
lock) the address will not be clobbered & we can simply branch back to
repeat the load from memory without reloading the address into k0.
The primary motivation for this change is that it moves the target of
the bnez instruction to an instruction within the LL/SC loop (the LL
itself), which we know contains no other memory accesses & therefore
isn't affected by Loongson3 LL/SC errata.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
In ejtag_debug_handler we use LL & SC instructions to acquire & release
an open-coded spinlock. For Loongson3 systems affected by LL/SC errata
this requires that we insert a sync instruction prior to the LL in order
to ensure correct behavior of the LL/SC loop.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already
emit a full completion barrier as part of the inline assembly containing
LL/SC loops for atomic operations. As such the barrier emitted by
__smp_mb__before_atomic() is redundant, and we can remove it.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
The loongson_llsc_mb() macro is no longer used - instead barriers are
emitted as part of inline asm using the __SYNC() macro. Remove the
now-defunct loongson_llsc_mb() macro.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
When building a kernel configured to support Loongson3 LL/SC workarounds
(ie. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) the inline assembly in
__xchg_asm() & __cmpxchg_asm() already emits completion barriers, and as
such we don't need to emit extra barriers from the xchg() or cmpxchg()
macros. Add compile-time constant checks causing us to omit the
redundant memory barriers.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Use smp_mb__before_atomic() rather than smp_mb__before_llsc() in
test_and_set_bit(), test_and_clear_bit() & test_and_change_bit(). The
_atomic() versions make semantic sense in these cases, and will allow a
later patch to omit redundant barriers for Loongson3 systems that
already include a barrier within __test_bit_op().
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Rather than using custom SZLONG_LOG & SZLONG_MASK macros to shift & mask
a bit index to form word & bit offsets respectively, make use of the
standard BIT_WORD() & BITS_PER_LONG macros for the same purpose.
volatile is added to the definition of pointers to the long-sized word
we'll operate on, in order to prevent the compiler complaining that we
cast away the volatile qualifier of the addr argument. This should have
no effect on generated code, which in the LL/SC case is inline asm
anyway & in the non-LLSC case access is constrained by compiler barriers
provided by raw_local_irq_{save,restore}().
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Introduce __bit_op() & __test_bit_op() macros which abstract away the
implementation of LL/SC loops. This cuts down on a lot of duplicate
boilerplate code, and also allows R10000_LLSC_WAR to be handled outside
of the individual bitop functions.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
The IRQ-disabling non-LLSC fallbacks for bitops on UP systems already
return a zero or one, so there's no need to perform another comparison
against zero. Move these comparisons into the LLSC paths to avoid the
redundant work.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Use the BIT() macro in asm/bitops.h rather than open-coding its
equivalent.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
The logical operations or & xor used in the test_and_set_bit_lock(),
test_and_clear_bit() & test_and_change_bit() functions currently force
the value 1<<bit to be placed in a register. If the bit is compile-time
constant & fits within the immediate field of an or/xor instruction (ie.
16 bits) then we can make use of the ori/xori instruction variants &
avoid the use of an extra register. Add the extra "i" constraints in
order to allow use of these immediate encodings.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
The only difference between test_and_set_bit() & test_and_set_bit_lock()
is memory ordering barrier semantics - the former provides a full
barrier whilst the latter only provides acquire semantics.
We can therefore implement test_and_set_bit() in terms of
test_and_set_bit_lock() with the addition of the extra memory barrier.
Do this in order to avoid duplicating logic.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
The start position for an ins instruction is always encoded as an
immediate, so allowing registers to be used by the inline asm makes no
sense. It should never happen anyway since a bit index should always be
small enough to be treated as an immediate, but remove the nonsensical
"r" for sanity.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Rather than #ifdef on CONFIG_CPU_* to determine whether the ins
instruction is supported we can simply check MIPS_ISA_REV to discover
whether we're targeting MIPSr2 or higher. Do so in order to clean up the
code.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
set_bit() can set bits 0-15 using an ori instruction, rather than
loading the value -1 into a register & then using an ins instruction.
That is, rather than the following:
li t0, -1
ll t1, 0(t2)
ins t1, t0, 4, 1
sc t1, 0(t2)
We can have the simpler:
ll t1, 0(t2)
ori t1, t1, 0x10
sc t1, 0(t2)
The or path already allows immediates to be used, so simply restricting
the ins path to bits that don't fit in immediates is sufficient to take
advantage of this.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Reorder conditions in our various bitops functions that check
kernel_uses_llsc such that they handle the !kernel_uses_llsc case first.
This allows us to avoid the need to duplicate the kernel_uses_llsc check
in all the other cases. For functions that don't involve barriers common
to the various implementations, we switch to returning from within each
if block making each case easier to read in isolation.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Remove the remaining duplication between 32b & 64b in asm/atomic.h by
making use of an ATOMIC_OPS() macro to generate:
- atomic_read()/atomic64_read()
- atomic_set()/atomic64_set()
- atomic_cmpxchg()/atomic64_cmpxchg()
- atomic_xchg()/atomic64_xchg()
This is consistent with the way all other functions in asm/atomic.h are
generated, and ensures consistency between the 32b & 64b functions.
Of note is that this results in the above now being static inline
functions rather than macros.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Unify the definitions of atomic_sub_if_positive() &
atomic64_sub_if_positive() using a macro like we do for most other
atomic functions. This allows us to share the implementation ensuring
consistency between the two. Notably this provides the appropriate
loongson3_war barriers in the atomic64_sub_if_positive() case which were
previously missing.
The code is rearranged a little to handle the !kernel_uses_llsc case
first in order to de-indent the LL/SC case & allow us not to go over 80
characters per line.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Use smp_mb__before_atomic() & smp_mb__after_atomic() in
atomic_sub_if_positive() rather than the equivalent
smp_mb__before_llsc() & smp_llsc_mb(). The former are more standard &
this preps us for avoiding redundant duplicate barriers on Loongson3 in
a later patch.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Generate the sync instructions required to workaround Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|
|
Cut down on duplication by generalizing the ATOMIC_OP(),
ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros to work for both 32b &
64b atomics, and removing the ATOMIC64_ variants. This ensures
consistency between our atomic_* & atomic64_* functions.
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected]
|