Age | Commit message (Collapse) | Author | Files | Lines |
|
on_each_cpu_mask() runs the function on each CPU specified by cpumask,
which may include the local processor.
Replace smp_call_function_many() with on_each_cpu_mask() to simplify
the code.
Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
When creating a new monitoring group, the RMID allocated for it may have
been used by a group which was previously removed. In this case, the
hardware counters will have non-zero values which should be deducted
from what is reported in the new group's counts.
resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
0, causing the initial count to be charged to the new group. Resurrect
__rmid_read() and use it to initialize prev_msr correctly.
Unlike before, __rmid_read() checks for error bits in the MSR read so
that callers don't need to.
Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
Signed-off-by: Peter Newman <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
When the user moves a running task to a new rdtgroup using the task's
file interface or by deleting its rdtgroup, the resulting change in
CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the
task(s) CPUs.
x86 allows reordering loads with prior stores, so if the task starts
running between a task_curr() check that the CPU hoisted before the
stores in the CLOSID/RMID update then it can start running with the old
CLOSID/RMID until it is switched again because __rdtgroup_move_task()
failed to determine that it needs to be interrupted to obtain the new
CLOSID/RMID.
Refer to the diagram below:
CPU 0 CPU 1
----- -----
__rdtgroup_move_task():
curr <- t1->cpu->rq->curr
__schedule():
rq->curr <- t1
resctrl_sched_in():
t1->{closid,rmid} -> {1,1}
t1->{closid,rmid} <- {2,2}
if (curr == t1) // false
IPI(t1->cpu)
A similar race impacts rdt_move_group_tasks(), which updates tasks in a
deleted rdtgroup.
In both cases, use smp_mb() to order the task_struct::{closid,rmid}
stores before the loads in task_curr(). In particular, in the
rdt_move_group_tasks() case, simply execute an smp_mb() on every
iteration with a matching task.
It is possible to use a single smp_mb() in rdt_move_group_tasks(), but
this would require two passes and a means of remembering which
task_structs were updated in the first loop. However, benchmarking
results below showed too little performance impact in the simple
approach to justify implementing the two-pass approach.
Times below were collected using `perf stat` to measure the time to
remove a group containing a 1600-task, parallel workload.
CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads)
# mkdir /sys/fs/resctrl/test
# echo $$ > /sys/fs/resctrl/test/tasks
# perf bench sched messaging -g 40 -l 100000
task-clock time ranges collected using:
# perf stat rmdir /sys/fs/resctrl/test
Baseline: 1.54 - 1.60 ms
smp_mb() every matching task: 1.57 - 1.67 ms
[ bp: Massage commit message. ]
Fixes: ae28d1aae48a ("x86/resctrl: Use an IPI instead of task_work_add() to update PQR_ASSOC MSR")
Fixes: 0efc89be9471 ("x86/intel_rdt: Update task closid immediately on CPU in rmdir and unmount")
Signed-off-by: Peter Newman <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Babu Moger <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass
in a "const *", out of it can comes a non-const pointer unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be
used no matter what the const value is. This prevents every subsystem
from having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
the const value at build time, which having 2 macros would not do
either.
The driver for all of this have been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject,
objects as being "non-mutable". The changes to the kobject and driver
core in this pull request are the result of that, as there are lots of
paths where kobjects and device pointers are not modified at all, so
marking them as "const" allows the compiler to enforce this.
So, a nice side affect of the Rust development effort has been already
to clean up the driver core code to be more obvious about object
rules.
All of this has been bike-shedded in quite a lot of detail on lkml
with different names and implementations resulting in the tiny version
we have in here, much better than my original proposal. Lots of
subsystem maintainers have acked the changes as well.
Other than this change, included in here are smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with
no problems"
* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
device property: Fix documentation for fwnode_get_next_parent()
firmware_loader: fix up to_fw_sysfs() to preserve const
usb.h: take advantage of container_of_const()
device.h: move kobj_to_dev() to use container_of_const()
container_of: add container_of_const() that preserves const-ness of the pointer
driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
driver core: fix up missed scsi/cxlflash class.devnode() conversion.
driver core: fix up some missing class.devnode() conversions.
driver core: make struct class.devnode() take a const *
driver core: make struct class.dev_uevent() take a const *
cacheinfo: Remove of_node_put() for fw_token
device property: Add a blank line in Kconfig of tests
device property: Rename goto label to be more precise
device property: Move PROPERTY_ENTRY_BOOL() a bit down
device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
kernfs: fix all kernel-doc warnings and multiple typos
driver core: pass a const * into of_device_uevent()
kobject: kset_uevent_ops: make name() callback take a const *
kobject: kset_uevent_ops: make filter() callback take a const *
kobject: make kobject_namespace take a const *
...
|
|
msr-index.h should contain all MSRs for easier grepping for MSR numbers
when dealing with unchecked MSR access warnings, for example.
Move the resctrl ones. Prefix IA32_PQR_ASSOC with "MSR_" while at it.
No functional changes.
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The devnode() in struct class should not be modifying the device that is
passed into it, so mark it as a const * and propagate the function
signature changes out into all relevant subsystems that use this
callback.
Cc: Fenghua Yu <[email protected]>
Cc: Reinette Chatre <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: [email protected]
Cc: "H. Peter Anvin" <[email protected]>
Cc: FUJITA Tomonori <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Justin Sanders <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Sumit Semwal <[email protected]>
Cc: Benjamin Gaignard <[email protected]>
Cc: Liam Mark <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Brian Starkey <[email protected]>
Cc: John Stultz <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Thomas Zimmermann <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: Dennis Dalessandro <[email protected]>
Cc: Dmitry Torokhov <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Sean Young <[email protected]>
Cc: Frank Haverkamp <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Alex Williamson <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Anton Vorontsov <[email protected]>
Cc: Colin Cross <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Jaroslav Kysela <[email protected]>
Cc: Takashi Iwai <[email protected]>
Cc: Hans Verkuil <[email protected]>
Cc: Christophe JAILLET <[email protected]>
Cc: Xie Yongji <[email protected]>
Cc: Gautam Dawar <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: Eli Cohen <[email protected]>
Cc: Parav Pandit <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The field arch_has_empty_bitmaps is not required anymore. The field
min_cbm_bits is enough to validate the CBM (capacity bit mask) if the
architecture can support the zero CBM or not.
Suggested-by: Reinette Chatre <[email protected]>
Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Link: https://lore.kernel.org/r/166430979654.372014.615622285687642644.stgit@bmoger-ubuntu
|
|
AMD systems support zero CBM (capacity bit mask) for cache allocation.
That is reflected in rdt_init_res_defs_amd() by:
r->cache.arch_has_empty_bitmaps = true;
However given the unified code in cbm_validate(), checking for:
val == 0 && !arch_has_empty_bitmaps
is not enough because of another check in cbm_validate():
if ((zero_bit - first_bit) < r->cache.min_cbm_bits)
The default value of r->cache.min_cbm_bits = 1.
Leading to:
$ cd /sys/fs/resctrl
$ mkdir foo
$ cd foo
$ echo L3:0=0 > schemata
-bash: echo: write error: Invalid argument
$ cat /sys/fs/resctrl/info/last_cmd_status
Need at least 1 bits in the mask
Initialize the min_cbm_bits to 0 for AMD. Also, remove the default
setting of min_cbm_bits and initialize it separately.
After the fix:
$ cd /sys/fs/resctrl
$ mkdir foo
$ cd foo
$ echo L3:0=0 > schemata
$ cat /sys/fs/resctrl/info/last_cmd_status
ok
Fixes: 316e7f901f5a ("x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps")
Co-developed-by: Stephane Eranian <[email protected]>
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Reviewed-by: James Morse <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
|
|
resctrl_arch_rmid_read() returns a value in chunks, as read from the
hardware. This needs scaling to bytes by mon_scale, as provided by
the architecture code.
Now that resctrl_arch_rmid_read() performs the overflow and corrections
itself, it may as well return a value in bytes directly. This allows
the accesses to the architecture specific 'hw' structure to be removed.
Move the mon_scale conversion into resctrl_arch_rmid_read().
mbm_bw_count() is updated to calculate bandwidth from bytes.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
resctrl_rmid_realloc_threshold can be set by user-space. The maximum
value is specified by the architecture.
Currently max_threshold_occ_write() reads the maximum value from
boot_cpu_data.x86_cache_size, which is not portable to another
architecture.
Add resctrl_rmid_realloc_limit to describe the maximum size in bytes
that user-space can set the threshold to.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
resctrl_cqm_threshold is stored in a hardware specific chunk size,
but exposed to user-space as bytes.
This means the filesystem parts of resctrl need to know how the hardware
counts, to convert the user provided byte value to chunks. The interface
between the architecture's resctrl code and the filesystem ought to
treat everything as bytes.
Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read()
still returns its value in chunks, so this needs converting to bytes.
As all the users have been touched, rename the variable to
resctrl_rmid_realloc_threshold, which describes what the value is for.
Neither r->num_rmid nor hw_res->mon_scale are guaranteed to be a power
of 2, so the existing code introduces a rounding error from resctrl's
theoretical fraction of the cache usage. This behaviour is kept as it
ensures the user visible value matches the value read from hardware
when the rmid will be reallocated.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
resctrl_arch_rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a counter. Currently the function returns
the MBM values in chunks directly from hardware. When reading a bandwidth
counter, get_corrected_mbm_count() must be used to correct the
value read.
get_corrected_mbm_count() is architecture specific, this work should be
done in resctrl_arch_rmid_read().
Move the function calls. This allows the resctrl filesystems's chunks
value to be removed in favour of the architecture private version.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
resctrl_arch_rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a counter. Currently the function returns
the MBM values in chunks directly from hardware. When reading a bandwidth
counter, mbm_overflow_count() must be used to correct for any possible
overflow.
mbm_overflow_count() is architecture specific, its behaviour should
be part of resctrl_arch_rmid_read().
Move the mbm_overflow_count() calls into resctrl_arch_rmid_read().
This allows the resctrl filesystems's prev_msr to be removed in
favour of the architecture private version.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
resctrl_arch_rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a hardware register. Currently the function
returns the MBM values in chunks directly from hardware.
To convert this to bytes, some correction and overflow calculations
are needed. These depend on the resource and domain structures.
Overflow detection requires the old chunks value. None of this
is available to resctrl_arch_rmid_read(). MPAM requires the
resource and domain structures to find the MMIO device that holds
the registers.
Pass the resource and domain to resctrl_arch_rmid_read(). This makes
rmid_dirty() too big. Instead merge it with its only caller, and the
name is kept as a local variable.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
__rmid_read() selects the specified eventid and returns the counter
value from the MSR. The error handling is architecture specific, and
handled by the callers, rdtgroup_mondata_show() and __mon_event_count().
Error handling should be handled by architecture specific code, as
a different architecture may have different requirements. MPAM's
counters can report that they are 'not ready', requiring a second
read after a short delay. This should be hidden from resctrl.
Make __rmid_read() the architecture specific function for reading
a counter. Rename it resctrl_arch_rmid_read() and move the error
handling into it.
A read from a counter that hardware supports but resctrl does not
now returns -EINVAL instead of -EIO from the default case in
__mon_event_count(). It isn't possible for user-space to see this
change as resctrl doesn't expose counters it doesn't support.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
To abstract the rmid counters into a helper that returns the number
of bytes counted, architecture specific per-rmid state is needed.
It needs to be possible to reset this hidden state, as the values
may outlive the life of an rmid, or the mount time of the filesystem.
mon_event_read() is called with first = true when an rmid is first
allocated in mkdir_mondata_subdir(). Add resctrl_arch_reset_rmid()
and call it from __mon_event_count()'s rr->first check.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
A renamed __rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a counter. Currently the function returns
the MBM values in chunks directly from hardware. For bandwidth
counters the resctrl filesystem uses this to calculate the number of
bytes ever seen.
MPAM's scaling of counters can be changed at runtime, reducing the
resolution but increasing the range. When this is changed the prev_msr
values need to be converted by the architecture code.
Add an array for per-rmid private storage. The prev_msr and chunks
values will move here to allow resctrl_arch_rmid_read() to always
return the number of bytes read by this counter without assistance
from the filesystem. The values are moved in later patches when
the overflow and correction calls are moved into __rmid_read().
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
mbm_bw_count() is only called by the mbm_handle_overflow() worker once a
second. It reads the hardware register, calculates the bandwidth and
updates m->prev_bw_msr which is used to hold the previous hardware register
value.
Operating directly on hardware register values makes it difficult to make
this code architecture independent, so that it can be moved to /fs/,
making the mba_sc feature something resctrl supports with no additional
support from the architecture.
Prior to calling mbm_bw_count(), mbm_update() reads from the same hardware
register using __mon_event_count().
Change mbm_bw_count() to use the current chunks value most recently saved
by __mon_event_count(). This removes an extra call to __rmid_read().
Instead of using m->prev_msr to calculate the number of chunks seen,
use the rr->val that was updated by __mon_event_count(). This removes an
extra call to mbm_overflow_count() and get_corrected_mbm_count().
Calculating bandwidth like this means mbm_bw_count() no longer operates
on hardware register values directly.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
update_mba_bw() calculates a new control value for the MBA resource
based on the user provided mbps_val and the current measured
bandwidth. Some control values need remapping by delay_bw_map().
It does this by calling wrmsrl() directly. This needs splitting
up to be done by an architecture specific helper, so that the
remainder can eventually be moved to /fs/.
Add resctrl_arch_update_one() to apply one configuration value
to the provided resource and domain. This avoids the staging
and cross-calling that is only needed with changes made by
user-space. delay_bw_map() moves to be part of the arch code,
to maintain the 'percentage control' view of MBA resources
in resctrl.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The resctrl arch code provides a second configuration array mbps_val[]
for the MBA software controller.
Since resctrl switched over to allocating and freeing its own array
when needed, nothing uses the arch code version.
Remove it.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Updates to resctrl's software controller follow the same path as
other configuration updates, but they don't modify the hardware state.
rdtgroup_schemata_write() uses parse_line() and the resource's
parse_ctrlval() function to stage the configuration.
resctrl_arch_update_domains() then updates the mbps_val[] array
instead, and resctrl_arch_update_domains() skips the rdt_ctrl_update()
call that would update hardware.
This complicates the interface between resctrl's filesystem parts
and architecture specific code. It should be possible for mba_sc
to be completely implemented by the filesystem parts of resctrl. This
would allow it to work on a second architecture with no additional code.
resctrl_arch_update_domains() using the mbps_val[] array prevents this.
Change parse_bw() to write the configuration value directly to the
mbps_val[] array in the domain structure. Change rdtgroup_schemata_write()
to skip the call to resctrl_arch_update_domains(), meaning all the
mba_sc specific code in resctrl_arch_update_domains() can be removed.
On the read-side, show_doms() and update_mba_bw() are changed to read
the mbps_val[] array from the domain structure. With this,
resctrl_arch_get_config() no longer needs to consider mba_sc resources.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
To support resctrl's MBA software controller, the architecture must provide
a second configuration array to hold the mbps_val[] from user-space.
This complicates the interface between the architecture specific code and
the filesystem portions of resctrl that will move to /fs/, to allow
multiple architectures to support resctrl.
Make the filesystem parts of resctrl create an array for the mba_sc
values. The software controller can be changed to use this, allowing
the architecture code to only consider the values configured in hardware.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
To determine whether the mba_MBps option to resctrl should be supported,
resctrl tests the boot CPUs' x86_vendor.
This isn't portable, and needs abstracting behind a helper so this check
can be part of the filesystem code that moves to /fs/.
Re-use the tests set_mba_sc() does to determine if the mba_sc is supported
on this system. An 'alloc_capable' test is added so that support for the
controls isn't implied by the 'delay_linear' property, which is always
true for MPAM. Because mbm_update() only update mba_sc if the mbm_local
counters are enabled, supports_mba_mbps() checks is_mbm_local_enabled().
(instead of using is_mbm_enabled(), which checks both).
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
set_mba_sc() enables the 'software controller' to regulate the bandwidth
based on the byte counters. This can be managed entirely in the parts
of resctrl that move to /fs/, without any extra support from the
architecture specific code. set_mba_sc() is called by rdt_enable_ctx()
during mount and unmount. It currently resets the arch code's ctrl_val[]
and mbps_val[] arrays.
The ctrl_val[] was already reset when the domain was created, and by
reset_all_ctrls() when the filesystem was last unmounted. Doing the work
in set_mba_sc() is not necessary as the values are already at their
defaults due to the creation of the domain, or were previously reset
during umount(), or are about to reset during umount().
Add a reset of the mbps_val[] in reset_all_ctrls(), allowing the code in
set_mba_sc() that reaches in to the architecture specific structures to
be removed.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Because domains are exposed to user-space via resctrl, the filesystem
must update its state when CPU hotplug callbacks are triggered.
Some of this work is common to any architecture that would support
resctrl, but the work is tied up with the architecture code to
free the memory.
Move the monitor subdir removal and the cancelling of the mbm/limbo
works into a new resctrl_offline_domain() call. These bits are not
specific to the architecture. Grouping them in one function allows
that code to be moved to /fs/ and re-used by another architecture.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
domain_add_cpu() and domain_remove_cpu() need to kfree() the child
arrays that were allocated by domain_setup_ctrlval().
As this memory is moved around, and new arrays are created, adjusting
the error handling cleanup code becomes noisier.
To simplify this, move all the kfree() calls into a domain_free() helper.
This depends on struct rdt_hw_domain being kzalloc()d, allowing it to
unconditionally kfree() all the child arrays.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Because domains are exposed to user-space via resctrl, the filesystem
must update its state when CPU hotplug callbacks are triggered.
Some of this work is common to any architecture that would support
resctrl, but the work is tied up with the architecture code to
allocate the memory.
Move domain_setup_mon_state(), the monitor subdir creation call and the
mbm/limbo workers into a new resctrl_online_domain() call. These bits
are not specific to the architecture. Grouping them in one function
allows that code to be moved to /fs/ and re-used by another architecture.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
mon_enabled and mon_capable are always set as a pair by
rdt_get_mon_l3_config().
There is no point having two values.
Merge them together.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
rdt_resources_all[] used to have extra entries for L2CODE/L2DATA.
These were hidden from resctrl by the alloc_enabled value.
Now that the L2/L2CODE/L2DATA resources have been merged together,
alloc_enabled doesn't mean anything, it always has the same value as
alloc_capable which indicates allocation is supported by this resource.
Remove alloc_enabled and its helpers.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Shaopeng Tan <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Xin Hao <[email protected]>
Tested-by: Shaopeng Tan <[email protected]>
Tested-by: Cristian Marussi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
prefetch register
The current pseudo_lock.c code overwrites the value of the
MSR_MISC_FEATURE_CONTROL to 0 even if the original value is not 0.
Therefore, modify it to save and restore the original values.
Fixes: 018961ae5579 ("x86/intel_rdt: Pseudo-lock region creation/removal core")
Fixes: 443810fe6160 ("x86/intel_rdt: Create debugfs files for pseudo-locking testing")
Fixes: 8a2fc0e1bc0c ("x86/intel_rdt: More precise L2 hit/miss measurements")
Signed-off-by: Kohei Tarumizu <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Reinette Chatre <[email protected]>
Link: https://lkml.kernel.org/r/eb660f3c2010b79a792c573c02d01e8e841206ad.1661358182.git.reinette.chatre@intel.com
|
|
In some cases, x86 code calls cpumask_weight() to check if any bit of a
given cpumask is set.
This can be done more efficiently with cpumask_empty() because
cpumask_empty() stops traversing the cpumask as soon as it finds first set
bit, while cpumask_weight() counts all bits unconditionally.
Signed-off-by: Yury Norov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Steve Wahl <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
There is no need to have struct kernfs_root be part of kernfs.h for
the whole kernel to see and poke around it. Move it internal to kernfs
code and provide a helper function, kernfs_root_to_node(), to handle the
one field that kernfs users were directly accessing from the structure.
Cc: Imran Khan <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The variable chunks is being shifted right and re-assinged the shifted
value which is then returned. Since chunks is not being read afterwards
the assignment is redundant and the >>= operator can be replaced with a
shift >> operator instead.
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Commit in Fixes separated the architecture specific and filesystem parts
of the resctrl domain structures.
This left the error paths in domain_add_cpu() kfree()ing the memory with
the wrong type.
This will cause a problem if someone adds a new member to struct
rdt_hw_domain meaning d_resctrl is no longer the first member.
Fixes: 792e0f6f789b ("x86/resctrl: Split struct rdt_domain")
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Reinette Chatre <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
domain_add_cpu() is called whenever a CPU is brought online. The
earlier call to domain_setup_ctrlval() allocates the control value
arrays.
If domain_setup_mon_state() fails, the control value arrays are not
freed.
Add the missing kfree() calls.
Fixes: 1bd2a63b4f0de ("x86/intel_rdt/mba_sc: Add initialization support")
Fixes: edf6fa1c4a951 ("x86/intel_rdt/cqm: Add RMID (Resource monitoring ID) management")
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Reinette Chatre <[email protected]>
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
"A first round of changes towards splitting the arch-specific bits from
the filesystem bits of resctrl, the ultimate goal being to support
ARM's equivalent technology MPAM, with the same fs interface (James
Morse)"
* tag 'x86_cache_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
x86/resctrl: Make resctrl_arch_get_config() return its value
x86/resctrl: Merge the CDP resources
x86/resctrl: Expand resctrl_arch_update_domains()'s msr_param range
x86/resctrl: Remove rdt_cdp_peer_get()
x86/resctrl: Merge the ctrl_val arrays
x86/resctrl: Calculate the index from the configuration type
x86/resctrl: Apply offset correction when config is staged
x86/resctrl: Make ctrlval arrays the same size
x86/resctrl: Pass configuration type to resctrl_arch_get_config()
x86/resctrl: Add a helper to read a closid's configuration
x86/resctrl: Rename update_domains() to resctrl_arch_update_domains()
x86/resctrl: Allow different CODE/DATA configurations to be staged
x86/resctrl: Group staged configuration into a separate struct
x86/resctrl: Move the schemata names into struct resctrl_schema
x86/resctrl: Add a helper to read/set the CDP configuration
x86/resctrl: Swizzle rdt_resource and resctrl_schema in pseudo_lock_region
x86/resctrl: Pass the schema to resctrl filesystem functions
x86/resctrl: Add resctrl_arch_get_num_closid()
x86/resctrl: Store the effective num_closid in the schema
x86/resctrl: Walk the resctrl schema list instead of an arch list
...
|
|
The recent commit
064855a69003 ("x86/resctrl: Fix default monitoring groups reporting")
caused a RHEL build failure with an uninitialized variable warning
treated as an error because it removed the default case snippet.
The RHEL Makefile uses '-Werror=maybe-uninitialized' to force possibly
uninitialized variable warnings to be treated as errors. This is also
reported by smatch via the 0day robot.
The error from the RHEL build is:
arch/x86/kernel/cpu/resctrl/monitor.c: In function ‘__mon_event_count’:
arch/x86/kernel/cpu/resctrl/monitor.c:261:12: error: ‘m’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
m->chunks += chunks;
^~
The upstream Makefile does not build using '-Werror=maybe-uninitialized'.
So, the problem is not seen there. Fix the problem by putting back the
default case snippet.
[ bp: note that there's nothing wrong with the code and other compilers
do not trigger this warning - this is being done just so the RHEL compiler
is happy. ]
Fixes: 064855a69003 ("x86/resctrl: Fix default monitoring groups reporting")
Reported-by: Terry Bowman <[email protected]>
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/162949631908.23903.17090272726012848523.stgit@bmoger-ubuntu
|
|
Creating a new sub monitoring group in the root /sys/fs/resctrl leads to
getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes
on the entire filesystem.
Steps to reproduce:
1. mount -t resctrl resctrl /sys/fs/resctrl/
2. cd /sys/fs/resctrl/
3. cat mon_data/mon_L3_00/mbm_total_bytes
23189832
4. Create sub monitor group:
mkdir mon_groups/test1
5. cat mon_data/mon_L3_00/mbm_total_bytes
Unavailable
When a new monitoring group is created, a new RMID is assigned to the
new group. But the RMID is not active yet. When the events are read on
the new RMID, it is expected to report the status as "Unavailable".
When the user reads the events on the default monitoring group with
multiple subgroups, the events on all subgroups are consolidated
together. Currently, if any of the RMID reads report as "Unavailable",
then everything will be reported as "Unavailable".
Fix the issue by discarding the "Unavailable" reads and reporting all
the successful RMID reads. This is not a problem on Intel systems as
Intel reports 0 on Inactive RMIDs.
Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
Reported-by: Paweł Szulik <[email protected]>
Signed-off-by: Babu Moger <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Reinette Chatre <[email protected]>
Cc: [email protected]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311
Link: https://lkml.kernel.org/r/162793309296.9224.15871659871696482080.stgit@bmoger-ubuntu
|
|
resctrl_arch_get_config() has no return, but does pass a single value
back via one of its arguments.
Return the value instead.
Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
resctrl uses struct rdt_resource to describe the available hardware
resources. The domains of the CDP aliases share a single ctrl_val[]
array. The only differences between the struct rdt_hw_resource aliases
is the name and conf_type.
The name from struct rdt_hw_resource is visible to user-space. To
support another architecture, as many user-visible details should be
handled in the filesystem parts of the code that is common to all
architectures. The name and conf_type go together.
Remove conf_type and the CDP aliases. When CDP is supported and enabled,
schemata_list_create() can create two schemata using the single
resource, generating the CODE/DATA suffix to the schema name itself.
This allows the alloc_ctrlval_array() and complications around free()ing
the ctrl_val arrays to be removed.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
resctrl_arch_update_domains() specifies the one closid that has been
modified and needs copying to the hardware.
resctrl_arch_update_domains() takes a struct rdt_resource and a closid
as arguments, but copies all the staged configurations for that closid
into the ctrl_val[] array.
resctrl_arch_update_domains() is called once per schema, but once the
resources and domains are merged, the second call of a L2CODE/L2DATA
pair will find no staged configurations, as they were previously
applied. The msr_param of the first call only has one index, so would
only have update the hardware for the last staged configuration.
To avoid a second round of IPIs when changing L2CODE and L2DATA in one
go, expand the range of the msr_param if multiple staged configurations
are found.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
When CDP is enabled, rdt_cdp_peer_get() finds the alternative
CODE/DATA resource and returns the alternative domain. This is used
to determine if bitmaps overlap when there are aliased entries
in the two struct rdt_hw_resources.
Now that the ctrl_val[] used by the CODE/DATA resources is the same,
the search for an alternate resource/domain is not needed.
Replace rdt_cdp_peer_get() with resctrl_peer_type(), which returns
the alternative type. This can be passed to resctrl_arch_get_config()
with the same resource and domain.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Each struct rdt_hw_resource has its own ctrl_val[] array. When CDP is
enabled, two resources are in use, each with its own ctrl_val[] array
that holds half of the configuration used by hardware. One uses the odd
slots, the other the even. rdt_cdp_peer_get() is the helper to find the
alternate resource, its domain, and corresponding entry in the other
ctrl_val[] array.
Once the CDP resources are merged there will be one struct
rdt_hw_resource and one ctrl_val[] array for each hardware resource.
This will include changes to rdt_cdp_peer_get(), making it hard to
bisect any issue.
Merge the ctrl_val[] arrays for three CODE/DATA/NONE resources first.
Doing this before merging the resources temporarily complicates
allocating and freeing the ctrl_val arrays. Add a helper to allocate
the ctrl_val array, that returns the value on the L2 or L3 resource if
it already exists. This gets removed once the resources are merged, and
there really is only one ctrl_val[] array.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
resctrl uses cbm_idx() to map a closid to an index in the configuration
array. This is based on a multiplier and offset that are held in the
resource.
To merge the resources, the resctrl arch code needs to calculate the
index from something else, as there will only be one resource.
Decide based on the staged configuration type. This makes the static
mult and offset parameters redundant.
[ bp: Remove superfluous brackets in get_config_index() ]
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
When resctrl comes to copy the CAT MSR values from the ctrl_val[] array
into hardware, it applies an offset adjustment based on the type of
the resource. CODE and DATA resources have their closid mapped into an
odd/even range. This mapping is based on a property of the resource.
This happens once the new control value has been written to the ctrl_val[]
array. Once the CDP resources are merged, there will only be a single
property that needs to cover both odd/even mappings to the single
ctrl_val[] array. The offset adjustment must be applied before the new
value is written to the array.
Move the logic from cat_wrmsr() to resctrl_arch_update_domains(). The
value provided to apply_config() is now an index in the array, not the
closid. The parameters provided via struct msr_param are now indexes
too. As resctrl's use of closid is a u32, struct msr_param's type is
changed to match.
With this, the CODE and DATA resources only use the odd or even
indexes in the array. This allows the temporary num_closid/2 fixes in
domain_setup_ctrlval() and reset_all_ctrls() to be removed.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The CODE and DATA resources report a num_closid that is half the actual
size supported by the hardware. This behaviour is visible to user-space
when CDP is enabled.
The CODE and DATA resources have their own ctrlval arrays which are
half the size of the underlying hardware because num_closid was already
adjusted. One holds the odd configurations values, the other even.
Before the CDP resources can be merged, the 'half the closids' behaviour
needs to be implemented by schemata_list_create(), but this causes the
ctrl_val[] array to be full sized.
Remove the logic from the architecture specific rdt_get_cdp_config()
setup, and add it to schemata_list_create(). Functions that walk all the
configurations, such as domain_setup_ctrlval() and reset_all_ctrls(),
take num_closid directly from struct rdt_hw_resource also have
to halve num_closid as only the lower half of each array is in
use. domain_setup_ctrlval() and reset_all_ctrls() both copy struct
rdt_hw_resource's num_closid to a struct msr_param. Correct the value
here.
This is temporary as a subsequent patch will merge all three ctrl_val[]
arrays such that when CDP is in use, the CODA/DATA layout in the array
matches the hardware. reset_all_ctrls()'s loop over the whole of
ctrl_val[] is not touched as this is harmless, and will be required as
it is once the resources are merged.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The ctrl_val[] array for a struct rdt_hw_resource only holds
configurations of one type. The type is implicit.
Once the CDP resources are merged, the ctrl_val[] array will hold all
the configurations for the hardware resource. When a particular type of
configuration is needed, it must be specified explicitly.
Pass the expected type from the schema into resctrl_arch_get_config().
Nothing uses this yet, but once a single ctrl_val[] array is used for
the three struct rdt_hw_resources that share hardware, the type will be
used to return the correct configuration value from the shared array.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Functions like show_doms() reach into the architecture's private
structure to retrieve the configuration from the struct rdt_hw_resource.
The hardware configuration may look completely different to the
values resctrl gets from user-space. The staged configuration and
resctrl_arch_update_domains() allow the architecture to convert or
translate these values.
Resctrl shouldn't read or write the ctrl_val[] values directly. Add
a helper to read the current configuration. This will allow another
architecture to scale the bitmaps if necessary, and possibly use
controls that don't take the user-space control format at all.
Of the remaining functions that access ctrl_val[] directly,
apply_config() is part of the architecture-specific code, and is
called via resctrl_arch_update_domains(). reset_all_ctrls() will be an
architecture specific helper.
update_mba_bw() manipulates both ctrl_val[], mbps_val[] and the
hardware. The mbps_val[] that matches the mba_sc state of the resource
is changed, but the other is left unchanged. Abstracting this is the
subject of later patches that affect set_mba_sc() too.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
update_domains() merges the staged configuration changes into the arch
codes configuration array. Rename to make it clear it is part of the
arch code interface to resctrl.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Before the CDP resources can be merged, struct rdt_domain will need an
array of struct resctrl_staged_config, one per type of configuration.
Use the type as an index to the array to ensure that a schema
configuration string can't specify the same domain twice. This will
allow two schemata to apply configuration changes to one resource.
Signed-off-by: James Morse <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Jamie Iles <[email protected]>
Reviewed-by: Reinette Chatre <[email protected]>
Tested-by: Babu Moger <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|