author Cosmin Ratiu <[email protected]> 2024-10-01 13:37:06 +0300
committer Jakub Kicinski <[email protected]> 2024-10-04 11:33:46 -0700
commit 918af0219a4d6a89cf02839005ede24e91f13bf6
tree ee41e2b6f542ea61522b1926f9be0742086e4cc5 /tools/perf/scripts/python/intel-pt-events.py
parent 10cd92df833c3f6c35dc5e923651146d41332538
net/mlx5: hw counters: Replace IDR+lists with xarray
Previously, managing counters was a complicated affair involving an IDR, a sorted doubly linked list, two singly linked lists and a complex dance between a non-periodic wq task and users adding/deleting counters.

Adding was done by inserting new counters into the IDR and into a singly linked list, leaving the wq to process the list and actually add the counters into the doubly linked list, which was kept sorted with the help of the IDR.

Deleting involved adding the counter into another singly linked list, leaving the wq to actually unlink the counter from the other structures and release it.

Dumping the counters is done with the bulk query API, which relies on the counter list being sorted and immutable during querying to efficiently retrieve cached counter values.

Finally, the IDR data struct is deprecated.

This commit replaces all of that with an xarray.

Adding is now done directly, by using xa_lock.
Deleting is also done directly, under the xa_lock.

Querying is done from a periodic task running every sampling_interval (default 1s) and uses the bulk query API for efficiency.
It works by iterating over the xarray:
- when a new bulk needs to be started, the bulk information is computed under the xa_lock.
- the xa iteration state is saved and the xa_lock dropped.
- the HW is queried for bulk counter values.
- the xa_lock is reacquired.
- counter caches with ids covered by the bulk response are updated.

Querying always requests the max bulk length, for simplicity.

Counters could be added/deleted while the HW is queried. This is safe, as the HW API simply returns unknown values for counters not in HW, but those values won't be accessed. Only counters present in xarray before bulk query will actually read queried cache values.

This cuts down the size of mlx5_fc by 4 pointers (88->56 bytes), which amounts to ~3MB / 100K counters.
But more importantly, this solves the wq spinlock congestion issue seen happening on high-rate counter insertion+deletion.
Signed-off-by: Cosmin Ratiu <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
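
The querying steps described in the message can be illustrated with a small userspace sketch. This is not the driver code: it assumes a sorted array guarded by a pthread mutex as a stand-in for the xarray and xa_lock, and a fake hw_bulk_query() that returns id * 10 for each id in the bulk. All names here (counter_table, table_add, query_all, BULK_LEN, MAX_COUNTERS) are hypothetical, and the sketch does not model how the real code excludes counters added while the lock is dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define BULK_LEN 4        /* hypothetical max bulk length */
#define MAX_COUNTERS 64   /* hypothetical table capacity */

/* Hypothetical stand-in for a counter with a cached value. */
struct counter {
    uint32_t id;
    uint64_t cache;
};

/* Stand-in for the xarray: ids kept sorted ascending, guarded by lock. */
struct counter_table {
    pthread_mutex_t lock;
    struct counter items[MAX_COUNTERS];
    size_t count;
};

static void table_init(struct counter_table *t)
{
    pthread_mutex_init(&t->lock, NULL);
    t->count = 0;
}

/* Index of the first counter whose id is >= id, or t->count if none. */
static size_t first_at_or_after(const struct counter_table *t, uint64_t id)
{
    size_t i = 0;
    while (i < t->count && t->items[i].id < id)
        i++;
    return i;
}

/* Adding is done directly under the lock (unique ids assumed). */
static void table_add(struct counter_table *t, uint32_t id)
{
    pthread_mutex_lock(&t->lock);
    assert(t->count < MAX_COUNTERS);
    size_t pos = first_at_or_after(t, id);
    for (size_t i = t->count; i > pos; i--)
        t->items[i] = t->items[i - 1];
    t->items[pos] = (struct counter){ .id = id, .cache = 0 };
    t->count++;
    pthread_mutex_unlock(&t->lock);
}

/* Fake hardware: reports id * 10 for every id in [base, base + BULK_LEN). */
static void hw_bulk_query(uint32_t base, uint64_t out[BULK_LEN])
{
    for (uint32_t i = 0; i < BULK_LEN; i++)
        out[i] = ((uint64_t)base + i) * 10;
}

/* One sampling pass; returns the number of bulk queries issued. */
static int query_all(struct counter_table *t)
{
    uint64_t values[BULK_LEN];
    uint64_t next_id = 0;   /* saved iteration state: an id, not an index */
    int bulks = 0;

    pthread_mutex_lock(&t->lock);
    for (;;) {
        /* Compute the bulk information under the lock. */
        size_t pos = first_at_or_after(t, next_id);
        if (pos == t->count)
            break;
        uint32_t base = t->items[pos].id;

        /* Drop the lock while the (fake) hardware is queried. */
        pthread_mutex_unlock(&t->lock);
        hw_bulk_query(base, values);
        bulks++;
        pthread_mutex_lock(&t->lock);

        /* Re-find the position by id, since the table may have changed. */
        pos = first_at_or_after(t, base);
        /* Update caches whose ids are covered by the bulk response. */
        while (pos < t->count && t->items[pos].id - base < BULK_LEN) {
            t->items[pos].cache = values[t->items[pos].id - base];
            pos++;
        }
        next_id = (uint64_t)base + BULK_LEN;
    }
    pthread_mutex_unlock(&t->lock);
    return bulks;
}
```

Saving the iteration state as an id rather than an array index is what lets the lock be dropped safely in this sketch: after reacquiring it, the position is looked up again, so insertions or deletions that happened in between cannot leave the loop pointing at the wrong element.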