Age | Commit message (Collapse) | Author | Files | Lines |
|
GICv4.1 redistributors have a VPE-aware INVALL register. Progress!
We can now emulate a guest-requested INVALL without emiting a
VINVALL command.
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Making a VPE resident on GICv4.1 is pretty simple, as it is just a
single write to the local redistributor. We just need extra information
about which groups to enable, which the KVM code will have to provide.
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
masking/unmasking doorbells on GICv4.1 relies on a new INVDB command,
which broadcasts the invalidation to all RDs.
Implement the new command as well as the masking callbacks, and plug
the whole thing into the v4.1 VPE irqchip.
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The ITS VMAPP command gains some new fields with GICv4.1:
- a default doorbell, which allows a single doorbell to be used for
all the VLPIs routed to a given VPE
- a pointer to the configuration table (instead of having it in a register
that gets context switched)
- a flag indicating whether this is the first map or the last unmap for
this particular VPE
- a flag indicating whether the pending table is known to be zeroed, or not
Plumb in the new fields in the VMAPP builder, and add the map/unmap
refcounting so that the ITS can do the right thing.
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
GICv4.1 defines a new VPE table that is potentially shared between
both the ITSs and the redistributors, following complicated affinity
rules.
To make things more confusing, the programming of this table at
the redistributor level is reusing the GICv4.0 GICR_VPROPBASER register
for something completely different.
The code flow is somewhat complexified by the need to respect the
affinities required by the HW, meaning that tables can either be
inherited from a previously discovered ITS or redistributor.
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
While GICv4.0 mandates 16 bit worth of VPEIDs, GICv4.1 allows smaller
implementations to be built. Add the required glue to dynamically
compute the limit.
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
GICv4.1 supports the RVPEID ("Residency per vPE ID"), which allows for
a much efficient way of making virtual CPUs resident (to allow direct
injection of interrupts).
The functionnality needs to be discovered on each and every redistributor
in the system, and disabled if the settings are inconsistent.
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Zenghui Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
These drivers no longer need it as they are only probed via DT.
crypto_platform_data was allocated but unused, so remove it.
This is a follow up for:
commit 45a536e3a7e0 ("crypto: atmel-tdes - Retire dma_request_slave_channel_compat()")
commit db28512f48e2 ("crypto: atmel-sha - Retire dma_request_slave_channel_compat()")
commit 62f72cbdcf02 ("crypto: atmel-aes - Retire dma_request_slave_channel_compat()")
Signed-off-by: Tudor Ambarus <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2020-01-21
1) Add support for TCP encapsulation of IKE and ESP messages,
as defined by RFC 8229. Patchset from Sabrina Dubroca.
Please note that there is a merge conflict in:
net/unix/af_unix.c
between commit:
3c32da19a858 ("unix: Show number of pending scm files of receive queue in fdinfo")
from the net-next tree and commit:
b50b0580d27b ("net: add queue argument to __skb_wait_for_more_packets and __skb_{,try_}recv_datagram")
from the ipsec-next tree.
The conflict can be solved as done in linux-next.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a new version of phy_do_ioctl that doesn't check whether net_device
is running. It will typically be used if suitable drivers attach the
PHY in probe already.
Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We just added phy_do_ioctl, but it turned out that we need another
version of this function that doesn't check whether net_device is
running. So rename phy_do_ioctl to phy_do_ioctl_running.
Signed-off-by: Heiner Kallweit <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The functions dma_get_slave_channel() and dma_get_any_slave_channel()
are called from DMA engine drivers only. Hence move their declarations
from the public header file <linux/dmaengine.h> to the private header
file drivers/dma/dmaengine.h.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
At its original introduction, dma_request_slave_channel_compat() used a
wrapper, to accommodate filter functions that modify the mask passed.
Filter functions can no longer modify masks, and the mask parameter was
made const in commit a53e28da574a40bc ("dma: Make the 'mask' parameter
of __dma_request_channel const") consecutively.
Hence remove the wrapper, and rename __dma_request_slave_channel_compat()
to dma_request_slave_channel_compat(), to get rid of one more function
name starting with a double underscore.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
Certain users can not use right now the DMAengine API due to missing
features in the core. Prime example is Networking.
These users can use the glue layer interface to avoid misuse of DMAengine
API and when the core gains the needed features they can be converted to
use generic API.
The most prominent features the glue layer clients are depending on:
- most PSI-L native peripheral use extra rflow ranges on a receive channel
and depending on the peripheral's configuration packets from a single
free descriptor ring is going to be received to different receive ring
- it is also possible to have different free descriptor rings per rflow
and an rflow can also support 4 additional free descriptor ring based
on the size of the incoming packet
- out of order completion of descriptors on a channel
- when we have several queues to handle different priority packets the
descriptors will be completed 'out-of-order'
- the notion of prep_slave_sg is not matching with what the streaming type
of operation is demanding for networking
- Streaming type of operation
- Ability to fill the free descriptor ring with descriptors in
anticipation of incoming traffic and when a packet arrives UDMAP will
form a packet and gives it to the client driver
- the descriptors are not backed with exact size data buffers as we don't
know the size of the packet we will receive, but as a generic pool of
buffers to be used by the receive channel
- NAPI type of operation (polling instead of interrupt driven transfer)
- without this we can not sustain gigabit speeds and we need to support NAPI
- not to limit this to networking, but other high performance operations
Signed-off-by: Grygorii Strashko <[email protected]>
Signed-off-by: Peter Ujfalusi <[email protected]>
Tested-by: Keerthy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
In K3 architecture the DMA operates within threads. One end of the thread
is UDMAP, the other is on the peripheral side.
The UDMAP channel configuration depends on the needs of the remote
endpoint and it can be differ from peripheral to peripheral.
This patch adds database for am654 and j721e and small API to fetch the
PSI-L endpoint configuration from the database which should only used by
the DMA driver(s).
Another API is added for native peripherals to give possibility to pass new
configuration for the threads they are using, which is needed to be able to
handle changes caused by different firmware loaded for the peripheral for
example.
Signed-off-by: Peter Ujfalusi <[email protected]>
Tested-by: Keerthy <[email protected]>
Reviewed-by: Grygorii Strashko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
The K3 DMA architecture uses CPPI5 (Communications Port Programming
Interface) specified descriptors over PSI-L bus within NAVSS.
The header provides helpers, macros to work with these descriptors in a
consistent way.
Signed-off-by: Peter Ujfalusi <[email protected]>
Tested-by: Keerthy <[email protected]>
Reviewed-by: Grygorii Strashko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
dmaengine_get_direction_text() can be useful when the direction is printed
out. The text is easier to comprehend than the number.
Signed-off-by: Peter Ujfalusi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
A DMA hardware can have big cache or FIFO and the amount of data sitting in
the DMA fabric can be an interest for the clients.
For example in audio we want to know the delay in the data flow and in case
the DMA have significantly large FIFO/cache, it can affect the latenc/delay
Signed-off-by: Peter Ujfalusi <[email protected]>
Reviewed-by: Tero Kristo <[email protected]>
Tested-by: Keerthy <[email protected]>
Reviewed-by: Grygorii Strashko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
The metadata is best described as side band data or parameters traveling
alongside the data DMAd by the DMA engine. It is data
which is understood by the peripheral and the peripheral driver only, the
DMA engine see it only as data block and it is not interpreting it in any
way.
The metadata can be different per descriptor as it is a parameter for the
data being transferred.
If the DMA supports per descriptor metadata it can implement the attach,
get_ptr/set_len callbacks.
Client drivers must only use either attach or get_ptr/set_len to avoid
misconfiguration.
Client driver can check if a given metadata mode is supported by the
channel during probe time with
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_CLIENT);
dmaengine_is_metadata_mode_supported(chan, DESC_METADATA_ENGINE);
and based on this information can use either mode.
Wrappers are also added for the metadata_ops.
To be used in DESC_METADATA_CLIENT mode:
dmaengine_desc_attach_metadata()
To be used in DESC_METADATA_ENGINE mode:
dmaengine_desc_get_metadata_ptr()
dmaengine_desc_set_metadata_len()
Signed-off-by: Peter Ujfalusi <[email protected]>
Reviewed-by: Tero Kristo <[email protected]>
Tested-by: Keerthy <[email protected]>
Reviewed-by: Grygorii Strashko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
This is for dependency of new TI ringacc dmaengine drivers
Merge tag 'drivers_soc_for_5.6' into topic/ti
SOC: TI Keystone Ring Accelerator driver
The Ring Accelerator (RINGACC or RA) provides hardware acceleration to
enable straightforward passing of work between a producer and a consumer.
There is one RINGACC module per NAVSS on TI AM65x SoCs.
Signed-off-by: Vinod Koul <[email protected]>
|
|
The bitmap allocation did not use full unsigned long sizes
when calculating the required size and that was triggered by KASAN
as slab-out-of-bounds read in several places. The patch fixes all
of them.
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Bitmaps are fairly popular for their space efficiency, but we don't have
generic iterators available. Make percpu's bitmap region iterators
available to everyone.
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Dennis Zhou <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Monitoring tools that want to find out which resctrl control and monitor
groups a task belongs to must currently read the "tasks" file in every
group until they locate the process ID.
Add an additional file /proc/{pid}/cpu_resctrl_groups to provide this
information:
1) res:
mon:
resctrl is not available.
2) res:/
mon:
Task is part of the root resctrl control group, and it is not associated
to any monitor group.
3) res:/
mon:mon0
Task is part of the root resctrl control group and monitor group mon0.
4) res:group0
mon:
Task is part of resctrl control group group0, and it is not associated
to any monitor group.
5) res:group0
mon:mon1
Task is part of resctrl control group group0 and monitor group mon1.
Signed-off-by: Chen Yu <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Tested-by: Jinshi Chen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
A number of network drivers has the same glue code to use phy_mii_ioctl
as ndo_do_ioctl handler. So let's add such a generic ndo_do_ioctl
handler to phylib.
Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a new function irq_domain_translate_onecell() that is to be used as
the translate function in struct irq_domain_ops.
Signed-off-by: Yash Shah <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Dave noticed that when specifying multiple efi_fake_mem= entries only
the last entry was successfully being reflected in the efi memory map.
This is due to the fact that the efi_memmap_insert() is being called
multiple times, but on successive invocations the insertion should be
applied to the last new memmap rather than the original map at
efi_fake_memmap() entry.
Rework efi_fake_memmap() to install the new memory map after each
efi_fake_mem= entry is parsed.
This also fixes an issue in efi_fake_memmap() that caused it to litter
emtpy entries into the end of the efi memory map. An empty entry causes
efi_memmap_insert() to attempt more memmap splits / copies than
efi_memmap_split_count() accounted for when sizing the new map. When
that happens efi_memmap_insert() may overrun its allocation, and if you
are lucky will spill over to an unmapped page leading to crash
signature like the following rather than silent corruption:
BUG: unable to handle page fault for address: ffffffffff281000
[..]
RIP: 0010:efi_memmap_insert+0x11d/0x191
[..]
Call Trace:
? bgrt_init+0xbe/0xbe
? efi_arch_mem_reserve+0x1cb/0x228
? acpi_parse_bgrt+0xa/0xd
? acpi_table_parse+0x86/0xb8
? acpi_boot_init+0x494/0x4e3
? acpi_parse_x2apic+0x87/0x87
? setup_acpi_sci+0xa2/0xa2
? setup_arch+0x8db/0x9e1
? start_kernel+0x6a/0x547
? secondary_startup_64+0xb6/0xc0
Commit af1648984828 "x86/efi: Update e820 with reserved EFI boot
services data to fix kexec breakage" introduced more occurrences where
efi_memmap_insert() is invoked after an efi_fake_mem= configuration has
been parsed. Previously the side effects of vestigial empty entries were
benign, but with commit af1648984828 that follow-on efi_memmap_insert()
invocation triggers efi_memmap_insert() overruns.
Reported-by: Dave Young <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
In preparation for fixing efi_memmap_alloc() leaks, add support for
recording whether the memmap was dynamically allocated from slab,
memblock, or is the original physical memmap provided by the platform.
Given this tracking is established in efi_memmap_alloc() and needs to be
carried to efi_memmap_install(), use 'struct efi_memory_map_data' to
convey the flags.
Some small cleanups result from this reorganization, specifically the
removal of local variables for 'phys' and 'size' that are already
tracked in @data.
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
In preparation for garbage collecting dynamically allocated EFI memory
maps, where the allocation method of memblock vs slab needs to be
recalled, convert the existing 'late' flag into a 'flags' bitmask.
Arrange for the flag to be passed via 'struct efi_memory_map_data'. This
structure grows additional flags in follow-on changes.
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
|
|
Pull networking fixes from David Miller:
1) Fix non-blocking connect() in x25, from Martin Schiller.
2) Fix spurious decryption errors in kTLS, from Jakub Kicinski.
3) Netfilter use-after-free in mtype_destroy(), from Cong Wang.
4) Limit size of TSO packets properly in lan78xx driver, from Eric
Dumazet.
5) r8152 probe needs an endpoint sanity check, from Johan Hovold.
6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from
John Fastabend.
7) hns3 needs short frames padded on transmit, from Yunsheng Lin.
8) Fix netfilter ICMP header corruption, from Eyal Birger.
9) Fix soft lockup when low on memory in hns3, from Yonglong Liu.
10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan.
11) Fix memory leak in act_ctinfo, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
cxgb4: reject overlapped queues in TC-MQPRIO offload
cxgb4: fix Tx multi channel port rate limit
net: sched: act_ctinfo: fix memory leak
bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal.
bnxt_en: Fix ipv6 RFS filter matching logic.
bnxt_en: Fix NTUPLE firmware command failures.
net: systemport: Fixed queue mapping in internal ring map
net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec
net: dsa: sja1105: Don't error out on disabled ports with no phy-mode
net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset
net: hns: fix soft lockup when there is not enough memory
net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()
net/sched: act_ife: initalize ife->metalist earlier
netfilter: nat: fix ICMP header corruption on ICMP errors
net: wan: lapbether.c: Use built-in RCU list checking
netfilter: nf_tables: fix flowtable list del corruption
netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks()
netfilter: nf_tables: remove WARN and add NLA_STRING upper limits
netfilter: nft_tunnel: ERSPAN_VERSION must not be null
netfilter: nft_tunnel: fix null-attribute check
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 E-Switch chains and prios
This series has two parts,
1) A merge commit with mlx5-next branch that include updates for mlx5
HW layouts needed for this and upcoming submissions.
2) From Paul, Increase the number of chains and prios
Currently the Mellanox driver supports offloading tc rules that
are defined on the first 4 chains and the first 16 priorities.
The restriction stems from the firmware flow level enforcement
requiring a flow table of a certain level to point to a flow
table of a higher level. This limitation may be ignored by setting
the ignore_flow_level bit when creating flow table entries.
Use unmanaged tables and ignore flow level to create more tables than
declared by fs_core steering. Manually manage the connections between the
tables themselves.
HW table is instantiated for every tc <chain,prio> tuple. The miss rule
of every table either jumps to the next <chain,prio> table, or continues
to slow_fdb. This logic is realized by following this sequence:
1. Create an auto-grouped flow table for the specified priority with
reserved entries
Reserved entries are allocated at the end of the flow table.
Flow groups are evaluated in sequence and therefore it is guaranteed
that the flow group defined on the last FTEs will be the last to evaluate.
Define a "match all" flow group on the reserved entries, providing
the platform to add table miss actions.
2. Set the miss rule action to jump to the next <chain,prio> table
or the slow_fdb.
3. Link the previous priority table to point to the new table by
updating its miss rule.
Please pull and let me know if there's any problem.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq fixes from Ingo Molnar:
"Two rseq bugfixes:
- CLONE_VM !CLONE_THREAD didn't work properly, the kernel would end
up corrupting the TLS of the parent. Technically a change in the
ABI but the previous behavior couldn't resonably have been relied
on by applications so this looks like a valid exception to the ABI
rule.
- Make the RSEQ_FLAG_UNREGISTER ABI behavior consistent with the
handling of other flags. This is not thought to impact any
applications either"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Unregister rseq for clone CLONE_VM
rseq: Reject unknown flags on rseq unregister
|
|
This function supports iterating over a range of an array. Also add
documentation links for xa_for_each_start().
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
|
|
Some users need to take an xarray lock while holding another xarray lock.
Reported-by: Doug Gilbert <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
|
|
Pull block fixes from Jens Axboe:
"Three fixes that should go into this release:
- The 32-bit segment size fix that I mentioned last week (Ming)
- Use uint for the block size (Mikulas)
- A null_blk zone write handling fix (Damien)"
* tag 'block-5.5-2020-01-16' of git://git.kernel.dk/linux-block:
block: fix an integer overflow in logical block size
null_blk: Fix zone write handling
block: fix get_max_segment_size() overflow on 32bit arch
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull devfreq updates for v5.6 from Chanwoo Choi:
"1. Update devfreq core
- Add new 'name' attribute of sysfs to show the device name
: /sys/class/devfreq/devfreqX/name
- Make 'trans_stat' sysfs reset by entering zero(0)
: echo 0 > /sys/class/devfreq/devfreqX/trans_stat
- Add debugfs support with 'devfreq_summary' to show the summary
: /sys/kernel/debug/devfreq/devfreq_summary
- Change the type of time variable to 64bit to avoid overflows.
- Make separate devfreq_stats including the statistics information.
- Fix minor coding-style like indentation and kernel-doc warnings.
2. Update devfreq drivers
- Add new imx8m-ddrc.c devfreq driver for dynamic scaling of DDR frequency.
It changes the DDR frequency by using ARM SMCCC(SMC Calling Convention)
interface to control TF-A firmware.
- Add COMPILE_TEST dependency for rk3399_dmc.c.
- Clean-up code for exynos-bus.c and rk3399_dmc.c without behavior changes
3. Update devfreq-event drivers
- Fix excessive stack usage of exynos-ppmu.c and clean-up code of
rockchip-dfi.c without behavior changes."
* tag 'devfreq-next-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux: (24 commits)
PM / devfreq: Add debugfs support with devfreq_summary file
PM / devfreq: exynos: Rename Exynos to lowercase
PM / devfreq: imx8m-ddrc: Fix inconsistent IS_ERR and PTR_ERR
PM / devfreq: exynos-bus: Add error log when fail to get devfreq-event
PM / devfreq: exynos-bus: Disable devfreq-event device when fails
PM / devfreq: rk3399_dmc: Disable devfreq-event device when fails
PM / devfreq: imx8m-ddrc: Remove unused defines
PM / devfreq: exynos-bus: Reduce goto statements and remove unused headers
PM / devfreq: rk3399_dmc: Add COMPILE_TEST and HAVE_ARM_SMCCC dependency
PM / devfreq: rockchip-dfi: Convert to devm_platform_ioremap_resource
PM / devfreq: rk3399_dmc: Add missing of_node_put()
PM / devfreq: rockchip-dfi: Add missing of_node_put()
PM / devfreq: Fix multiple kernel-doc warnings
PM / devfreq: exynos-bus: Extract exynos_bus_profile_init_passive()
PM / devfreq: exynos-bus: Extract exynos_bus_profile_init()
PM / devfreq: Move declaration of DEVICE_ATTR_RW(min_freq)
PM / devfreq: Move statistics to separate struct devfreq_stats
PM / devfreq: Add clearing transitions stats
PM / devfreq: Change time stats to 64-bit
PM / devfreq: Add new name attribute for sysfs
...
|
|
We maintain global statistics for an entire MDIO bus, as well as broken
down, per MDIO bus address statistics. Given that it is possible for
MDIO devices such as switches to access MDIO bus addresses for which
there is not a mdio_device instance created (therefore not a a
corresponding device directory in sysfs either), we also maintain
per-address statistics under the statistics folder. The layout looks
like this:
/sys/class/mdio_bus/../statistics/
transfers
errrors
writes
reads
transfers_<addr>
errors_<addr>
writes_<addr>
reads_<addr>
When a mdio_device instance is registered, a statistics/ folder is
created with the tranfers, errors, writes and reads attributes which
point to the appropriate MDIO bus statistics structure.
Statistics are 64-bit unsigned quantities and maintained through the
u64_stats_sync.h helper functions.
Signed-off-by: Florian Fainelli <[email protected]>
Tested-by: Andrew Lunn <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
With commit
bef69dd87828 ("sched/cpufreq: Move the cfs_rq_util_change() call to cpufreq_update_util()")
update_load_avg() has become the central point for calling cpufreq
(not including the update of blocked load). This change helps to
simplify further the number of calls to cpufreq_update_util() and to
remove last redundant ones. With update_load_avg(), we are now sure
that cpufreq_update_util() will be called after every task attachment
to a cfs_rq and especially after propagating this event down to the
util_avg of the root cfs_rq, which is the level that is used by
cpufreq governors like schedutil to set the frequency of a CPU.
The SCHED_CPUFREQ_MIGRATION flag forces an early call to cpufreq when
the migration happens in a cgroup whereas util_avg of root cfs_rq is
not yet updated and this call is duplicated with the one that happens
immediately after when the migration event reaches the root cfs_rq.
The dedicated flag SCHED_CPUFREQ_MIGRATION is now useless and can be
removed. The interface of attach_entity_load_avg() can also be
simplified accordingly.
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Rafael J. Wysocki <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The function stop_cpus() is only used internally by the
stop_machine for stop multiple cpus.
Make it static.
Signed-off-by: Yangtao Li <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Since the bulk queue used by XDP_REDIRECT now lives in struct net_device,
we can re-use the bulking for the non-map version of the bpf_redirect()
helper. This is a simple matter of having xdp_do_redirect_slow() queue the
frame on the bulk queue instead of sending it out with __bpf_tx_xdp().
Unfortunately we can't make the bpf_redirect() helper return an error if
the ifindex doesn't exit (as bpf_redirect_map() does), because we don't
have a reference to the network namespace of the ingress device at the time
the helper is called. So we have to leave it as-is and keep the device
lookup in xdp_do_redirect_slow().
Since this leaves less reason to have the non-map redirect code in a
separate function, so we get rid of the xdp_do_redirect_slow() function
entirely. This does lose us the tracepoint disambiguation, but fortunately
the xdp_redirect and xdp_redirect_map tracepoints use the same tracepoint
entry structures. This means both can contain a map index, so we can just
amend the tracepoint definitions so we always emit the xdp_redirect(_err)
tracepoints, but with the map ID only populated if a map is present. This
means we retire the xdp_redirect_map(_err) tracepoints entirely, but keep
the definitions around in case someone is still listening for them.
With this change, the performance of the xdp_redirect sample program goes
from 5Mpps to 8.4Mpps (a 68% increase).
Since the flush functions are no longer map-specific, rename the flush()
functions to drop _map from their names. One of the renamed functions is
the xdp_do_flush_map() callback used in all the xdp-enabled drivers. To
keep from having to update all drivers, use a #define to keep the old name
working, and only update the virtual drivers in this patch.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Commit 96360004b862 ("xdp: Make devmap flush_list common for all map
instances"), changed devmap flushing to be a global operation instead of a
per-map operation. However, the queue structure used for bulking was still
allocated as part of the containing map.
This patch moves the devmap bulk queue into struct net_device. The
motivation for this is reusing it for the non-map variant of XDP_REDIRECT,
which will be changed in a subsequent commit. To avoid other fields of
struct net_device moving to different cache lines, we also move a couple of
other members around.
We defer the actual allocation of the bulk queue structure until the
NETDEV_REGISTER notification devmap.c. This makes it possible to check for
ndo_xdp_xmit support before allocating the structure, which is not possible
at the time struct net_device is allocated. However, we keep the freeing in
free_netdev() to avoid adding another RCU callback on NETDEV_UNREGISTER.
Because of this change, we lose the reference back to the map that
originated the redirect, so change the tracepoint to always return 0 as the
map ID and index. Otherwise no functional change is intended with this
patch.
After this patch, the relevant part of struct net_device looks like this,
according to pahole:
/* --- cacheline 14 boundary (896 bytes) --- */
struct netdev_queue * _tx __attribute__((__aligned__(64))); /* 896 8 */
unsigned int num_tx_queues; /* 904 4 */
unsigned int real_num_tx_queues; /* 908 4 */
struct Qdisc * qdisc; /* 912 8 */
unsigned int tx_queue_len; /* 920 4 */
spinlock_t tx_global_lock; /* 924 4 */
struct xdp_dev_bulk_queue * xdp_bulkq; /* 928 8 */
struct xps_dev_maps * xps_cpus_map; /* 936 8 */
struct xps_dev_maps * xps_rxqs_map; /* 944 8 */
struct mini_Qdisc * miniq_egress; /* 952 8 */
/* --- cacheline 15 boundary (960 bytes) --- */
struct hlist_head qdisc_hash[16]; /* 960 128 */
/* --- cacheline 17 boundary (1088 bytes) --- */
struct timer_list watchdog_timer; /* 1088 40 */
/* XXX last struct has 4 bytes of padding */
int watchdog_timeo; /* 1128 4 */
/* XXX 4 bytes hole, try to pack */
struct list_head todo_list; /* 1136 16 */
/* --- cacheline 18 boundary (1152 bytes) --- */
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Exclude the last n entries for an autogrouped flow table.
Reserving entries at the end of the FT will ensure that this FG will be
the last to be evaluated. This will be used in the next patch to create
a miss group enabling custom actions on FT miss.
Signed-off-by: Paul Blakey <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
If user sets ignore flow level flag on a rule, that rule can point to
a flow table of any level, including those with levels equal or less
than the level of the flow table it is added on.
This with unamanged tables will be used to create a FDB chain/prio
hierarchy much larger than currently supported level range.
Signed-off-by: Paul Blakey <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Currently, Most of the steering tree is statically declared ahead of time,
with steering prios instances allocated for each fdb chain to assign max
number of levels for each of them. This allows fs_core to manage the
connections and levels of the flow tables hierarcy to prevent loops, but
restricts us with the number of supported chains and priorities.
Introduce unmananged flow tables, allowing the user to manage the flow
table connections. A unamanged table is detached from the fs_core flow
table hierarcy, and is only connected back to the hierarchy by explicit
FTEs forward actions.
This will be used together with firmware that supports ignoring the flow
table levels to increase the number of supported chains and prios.
Signed-off-by: Paul Blakey <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
This merge syncs with mlx5-next latest HW bits and layout updates for next
features, in addition one patch that improves
mlx5_create_auto_grouped_flow_table() API across all mlx5 users.
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Refactor mlx5_create_auto_grouped_flow_table
net/mlx5e: Add discard counters per priority
net/mlx5e: Expose FEC feilds and related capability bit
net/mlx5: Add mlx5_ifc definitions for connection tracking support
net/mlx5: Add copy header action struct layout
net/mlx5: Expose resource dump register mapping
net/mlx5: Add structures and defines for MIRC register
net/mlx5: Read MCAM register groups 1 and 2
net/mlx5: Add structures layout for new MCAM access reg groups
net/mlx5: Expose vDPA emulation device capabilities
net/mlx5: Add Virtio Emulation related device capabilities
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Refactor mlx5_create_auto_grouped_flow_table() to use ft_attr param
which already carries the max_fte, prio and flags memebers, and is
used the same in similar mlx5_create_flow_table() function.
Signed-off-by: Paul Blakey <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
|