aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-10-20cifs: clarify comment about timestamp granularity for old serversSteve French1-1/+7
It could be confusing why we set granularity to 1 seconds rather than 2 seconds (1 second is the max the VFS allows) for these mounts to very old servers ... Signed-off-by: Steve French <[email protected]>
2019-10-20cifs: Handle -EINPROGRESS only when noblockcnt is setPaulo Alcantara (SUSE)1-2/+6
We only want to avoid blocking in connect when mounting SMB root filesystems, otherwise bail out from generic_ip_connect() so cifs.ko can perform any reconnect failover appropriately. This fixes DFS failover/reconnection tests in upstream buildbot. Fixes: 8eecd1c2e5bc ("cifs: Add support for root file systems") Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2019-10-21PM: QoS: Drop frequency QoS types from device PM QoSRafael J. Wysocki2-80/+2
There are no more active users of DEV_PM_QOS_MIN_FREQUENCY and DEV_PM_QOS_MAX_FREQUENCY device PM QoS request types, so drop them along with the code supporting them. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
2019-10-21cpufreq: Use per-policy frequency QoSRafael J. Wysocki10-114/+114
Replace the CPU device PM QoS used for the management of min and max frequency constraints in cpufreq (and its users) with per-policy frequency QoS to avoid problems with cpufreq policies covering more then one CPU. Namely, a cpufreq driver is registered with the subsys interface which calls cpufreq_add_dev() for each CPU, starting from CPU0, so currently the PM QoS notifiers are added to the first CPU in the policy (i.e. CPU0 in the majority of cases). In turn, when the cpufreq driver is unregistered, the subsys interface doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0, and the PM QoS notifiers are only removed when cpufreq_remove_dev() is called for the last CPU in the policy, say CPUx, which as a rule is not CPU0 if the policy covers more than one CPU. Then, the PM QoS notifiers cannot be removed, because CPUx does not have them, and they are still there in the device PM QoS notifiers list of CPU0, which prevents new PM QoS notifiers from being registered for CPU0 on the next attempt to register the cpufreq driver. The same issue occurs when the first CPU in the policy goes offline before unregistering the driver. After this change it does not matter which CPU is the policy CPU at the driver registration time and whether or not it is online all the time, because the frequency QoS is per policy and not per CPU. Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework") Reported-by: Dmitry Osipenko <[email protected]> Tested-by: Dmitry Osipenko <[email protected]> Reported-by: Sudeep Holla <[email protected]> Tested-by: Sudeep Holla <[email protected]> Diagnosed-by: Viresh Kumar <[email protected]> Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29b Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
2019-10-21PM: QoS: Introduce frequency QoSRafael J. Wysocki2-0/+284
Introduce frequency QoS, based on the "raw" low-level PM QoS, to represent min and max frequency requests and aggregate constraints. The min and max frequency requests are to be represented by struct freq_qos_request objects and the aggregate constraints are to be represented by struct freq_constraints objects. The latter are expected to be initialized with the help of freq_constraints_init(). The freq_qos_read_value() helper is defined to retrieve the aggregate constraints values from a given struct freq_constraints object and there are the freq_qos_add_request(), freq_qos_update_request() and freq_qos_remove_request() helpers to manipulate the min and max frequency requests. It is assumed that the the helpers will not run concurrently with each other for the same struct freq_qos_request object, so if that may be the case, their uses must ensure proper synchronization between them (e.g. through locking). In addition, freq_qos_add_notifier() and freq_qos_remove_notifier() are provided to add and remove notifiers that will trigger on aggregate constraint changes to and from a given struct freq_constraints object, respectively. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
2019-10-20Linux 5.4-rc4Linus Torvalds1-1/+1
2019-10-20Merge tag 'kbuild-fixes-v5.4-2' of ↵Linus Torvalds3-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild fixes from Masahiro Yamada: - fix a bashism of setlocalversion - do not use the too new --sort option of tar * tag 'kbuild-fixes-v5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kheaders: substituting --sort in archive creation scripts: setlocalversion: fix a bashism kbuild: update comment about KBUILD_ALLDIRS
2019-10-20perf/x86/intel/pt: Fix base for single entry topaJiri Olsa1-1/+1
Jan reported failing ltp test for PT: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/tracing/pt_test/pt_test.c It looks like the reason is this new commit added in this v5.4 merge window: 38bb8d77d0b9 ("perf/x86/intel/pt: Split ToPA metadata and page layout") which did not keep the TOPA_SHIFT for entry base. Add it back. Reported-by: Jan Stancek <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Fixes: 38bb8d77d0b9 ("perf/x86/intel/pt: Split ToPA metadata and page layout") Link: https://lkml.kernel.org/r/[email protected] [ Minor changelog edits. ] Signed-off-by: Ingo Molnar <[email protected]>
2019-10-20Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds6-38/+84
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of x86 fixes: - Prevent a NULL pointer dereference in the X2APIC code in case of a CPU hotplug failure. - Prevent boot failures on HP superdome machines by invalidating the level2 kernel pagetable entries outside of the kernel area as invalid so BIOS reserved space won't be touched unintentionally. Also ensure that memory holes are rounded up to the next PMD boundary correctly. - Enable X2APIC support on Hyper-V to prevent boot failures. - Set the paravirt name when running on Hyper-V for consistency - Move a function under the appropriate ifdef guard to prevent build warnings" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/acpi: Move get_cmdline_acpi_rsdp() under #ifdef guard x86/hyperv: Set pv_info.name to "Hyper-V" x86/apic/x2apic: Fix a NULL pointer deref when handling a dying cpu x86/hyperv: Make vapic support x2apic mode x86/boot/64: Round memory hole size up to next PMD page x86/boot/64: Make level2_kernel_pgt pages invalid outside kernel area
2019-10-20Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds5-17/+43
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A small set of irq chip driver fixes and updates: - Update the SIFIVE PLIC interrupt driver to use the fasteoi handler to address the shortcomings of the existing flow handling which was prone to lose interrupts - Use the proper limit for GIC interrupt line numbers - Add retrigger support for the recently merged Anapurna Labs Fabric interrupt controller to make it complete - Enable the ATMEL AIC5 interrupt controller driver on the new SAM9X60 SoC" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/sifive-plic: Switch to fasteoi flow irqchip/gic-v3: Fix GIC_LINE_NR accessor irqchip/atmel-aic5: Add support for sam9x60 irqchip irqchip/al-fic: Add support for irq retrigger
2019-10-20Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull hrtimer fixlet from Thomas Gleixner: "A single commit annotating the lockcless access to timer->base with READ_ONCE() and adding the WRITE_ONCE() counterparts for completeness" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hrtimer: Annotate lockless access to timer->base
2019-10-20Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds1-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull stop-machine fix from Thomas Gleixner: "A single fix, amending stop machine with WRITE/READ_ONCE() to address the fallout of KCSAN" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stop_machine: Avoid potential race behaviour
2019-10-20KVM: arm64: pmu: Reset sample period on overflow handlingMarc Zyngier1-0/+20
The PMU emulation code uses the perf event sample period to trigger the overflow detection. This works fine for the *first* overflow handling, but results in a huge number of interrupts on the host, unrelated to the number of interrupts handled in the guest (a x20 factor is pretty common for the cycle counter). On a slow system (such as a SW model), this can result in the guest only making forward progress at a glacial pace. It turns out that the clue is in the name. The sample period is exactly that: a period. And once the an overflow has occured, the following period should be the full width of the associated counter, instead of whatever the guest had initially programed. Reset the sample period to the architected value in the overflow handler, which now results in a number of host interrupts that is much closer to the number of interrupts in the guest. Fixes: b02386eb7dac ("arm64: KVM: Add PMU overflow interrupt routing") Reviewed-by: Andrew Murray <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-10-20KVM: arm64: pmu: Set the CHAINED attribute before creating the in-kernel eventMarc Zyngier1-3/+3
The current convention for KVM to request a chained event from the host PMU is to set bit[0] in attr.config1 (PERF_ATTR_CFG1_KVM_PMU_CHAINED). But as it turns out, this bit gets set *after* we create the kernel event that backs our virtual counter, meaning that we never get a 64bit counter. Moving the setting to an earlier point solves the problem. Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters") Reviewed-by: Andrew Murray <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-10-20arm64: KVM: Handle PMCR_EL0.LC as RES1 on pure AArch64 systemsMarc Zyngier1-0/+4
Of PMCR_EL0.LC, the ARMv8 ARM says: "In an AArch64 only implementation, this field is RES 1." So be it. Fixes: ab9468340d2bc ("arm64: KVM: Add access handler for PMCR register") Reviewed-by: Andrew Murray <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-10-20KVM: arm64: pmu: Fix cycle counter truncationMarc Zyngier1-10/+12
When a counter is disabled, its value is sampled before the event is being disabled, and the value written back in the shadow register. In that process, the value gets truncated to 32bit, which is adequate for any counter but the cycle counter (defined as a 64bit counter). This obviously results in a corrupted counter, and things like "perf record -e cycles" not working at all when run in a guest... A similar, but less critical bug exists in kvm_pmu_get_counter_value. Make the truncation conditional on the counter not being the cycle counter, which results in a minor code reorganisation. Fixes: 80f393a23be6 ("KVM: arm/arm64: Support chained PMU counters") Reviewed-by: Andrew Murray <[email protected]> Reported-by: Julien Thierry <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-10-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds179-892/+1486
Pull networking fixes from David Miller: "I was battling a cold after some recent trips, so quite a bit piled up meanwhile, sorry about that. Highlights: 1) Fix fd leak in various bpf selftests, from Brian Vazquez. 2) Fix crash in xsk when device doesn't support some methods, from Magnus Karlsson. 3) Fix various leaks and use-after-free in rxrpc, from David Howells. 4) Fix several SKB leaks due to confusion of who owns an SKB and who should release it in the llc code. From Eric Biggers. 5) Kill a bunc of KCSAN warnings in TCP, from Eric Dumazet. 6) Jumbo packets don't work after resume on r8169, as the BIOS resets the chip into non-jumbo mode during suspend. From Heiner Kallweit. 7) Corrupt L2 header during MPLS push, from Davide Caratti. 8) Prevent possible infinite loop in tc_ctl_action, from Eric Dumazet. 9) Get register bits right in bcmgenet driver, based upon chip version. From Florian Fainelli. 10) Fix mutex problems in microchip DSA driver, from Marek Vasut. 11) Cure race between route lookup and invalidation in ipv4, from Wei Wang. 12) Fix performance regression due to false sharing in 'net' structure, from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (145 commits) net: reorder 'struct net' fields to avoid false sharing net: dsa: fix switch tree list net: ethernet: dwmac-sun8i: show message only when switching to promisc net: aquantia: add an error handling in aq_nic_set_multicast_list net: netem: correct the parent's backlog when corrupted packet was dropped net: netem: fix error path for corrupted GSO frames macb: propagate errors when getting optional clocks xen/netback: fix error path of xenvif_connect_data() net: hns3: fix mis-counting IRQ vector numbers issue net: usb: lan78xx: Connect PHY before registering MAC vsock/virtio: discard packets if credit is not respected vsock/virtio: send a credit update when buffer size is changed mlxsw: spectrum_trap: Push Ethernet header before reporting trap net: ensure correct skb->tstamp in various fragmenters net: bcmgenet: reset 40nm EPHY on energy detect net: bcmgenet: soft reset 40nm EPHYs before MAC init net: phy: bcm7xxx: define soft_reset for 40nm EPHY net: bcmgenet: don't set phydev->link from MAC net: Update address for MediaTek ethernet driver in MAINTAINERS ipv4: fix race condition between route lookup and invalidation ...
2019-10-19net: reorder 'struct net' fields to avoid false sharingEric Dumazet1-8/+17
Intel test robot reported a ~7% regression on TCP_CRR tests that they bisected to the cited commit. Indeed, every time a new TCP socket is created or deleted, the atomic counter net->count is touched (via get_net(net) and put_net(net) calls) So cpus might have to reload a contended cache line in net_hash_mix(net) calls. We need to reorder 'struct net' fields to move @hash_mix in a read mostly cache line. We move in the first cache line fields that can be dirtied often. We probably will have to address in a followup patch the __randomize_layout that was added in linux-4.13, since this might break our placement choices. Fixes: 355b98553789 ("netns: provide pure entropy for net_hash_mix()") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19net: dsa: fix switch tree listVivien Didelot1-1/+1
If there are multiple switch trees on the device, only the last one will be listed, because the arguments of list_add_tail are swapped. Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation") Signed-off-by: Vivien Didelot <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19net: ethernet: dwmac-sun8i: show message only when switching to promiscMans Rullgard1-1/+2
Printing the info message every time more than the max number of mac addresses are requested generates unnecessary log spam. Showing it only when the hw is not already in promiscous mode is equally informative without being annoying. Signed-off-by: Mans Rullgard <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19net: aquantia: add an error handling in aq_nic_set_multicast_listChenwandun1-0/+2
add an error handling in aq_nic_set_multicast_list, it may not work when hw_multicast_list_set error; and at the same time it will remove gcc Wunused-but-set-variable warning. Signed-off-by: Chenwandun <[email protected]> Reviewed-by: Igor Russkikh <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19Merge branch 'netem-fix-further-issues-with-packet-corruption'David S. Miller1-3/+8
Jakub Kicinski says: ==================== net: netem: fix further issues with packet corruption This set is fixing two more issues with the netem packet corruption. First patch (which was previously posted) avoids NULL pointer dereference if the first frame gets freed due to allocation or checksum failure. v2 improves the clarity of the code a little as requested by Cong. Second patch ensures we don't return SUCCESS if the frame was in fact dropped. Thanks to this commit message for patch 1 no longer needs the "this will still break with a single-frame failure" disclaimer. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-10-19net: netem: correct the parent's backlog when corrupted packet was droppedJakub Kicinski1-0/+2
If packet corruption failed we jump to finish_segs and return NET_XMIT_SUCCESS. Seeing success will make the parent qdisc increment its backlog, that's incorrect - we need to return NET_XMIT_DROP. Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue") Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19net: netem: fix error path for corrupted GSO framesJakub Kicinski1-3/+6
To corrupt a GSO frame we first perform segmentation. We then proceed using the first segment instead of the full GSO skb and requeue the rest of the segments as separate packets. If there are any issues with processing the first segment we still want to process the rest, therefore we jump to the finish_segs label. Commit 177b8007463c ("net: netem: fix backlog accounting for corrupted GSO frames") started using the pointer to the first segment in the "rest of segments processing", but as mentioned above the first segment may had already been freed at this point. Backlog corrections for parent qdiscs have to be adjusted. Fixes: 177b8007463c ("net: netem: fix backlog accounting for corrupted GSO frames") Reported-by: kbuild test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Reported-by: Ben Hutchings <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19macb: propagate errors when getting optional clocksMichael Tretter1-6/+6
The tx_clk, rx_clk, and tsu_clk are optional. Currently the macb driver marks clock as not available if it receives an error when trying to get a clock. This is wrong, because a clock controller might return -EPROBE_DEFER if a clock is not available, but will eventually become available. In these cases, the driver would probe successfully but will never be able to adjust the clocks, because the clocks were not available during probe, but became available later. For example, the clock controller for the ZynqMP is implemented in the PMU firmware and the clocks are only available after the firmware driver has been probed. Use devm_clk_get_optional() in instead of devm_clk_get() to get the optional clock and propagate all errors to the calling function. Signed-off-by: Michael Tretter <[email protected]> Acked-by: Nicolas Ferre <[email protected]> Tested-by: Nicolas Ferre <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19xen/netback: fix error path of xenvif_connect_data()Juergen Gross1-1/+0
xenvif_connect_data() calls module_put() in case of error. This is wrong as there is no related module_get(). Remove the superfluous module_put(). Fixes: 279f438e36c0a7 ("xen-netback: Don't destroy the netdev until the vif is shut down") Cc: <[email protected]> # 3.12 Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Paul Durrant <[email protected]> Reviewed-by: Wei Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19net: hns3: fix mis-counting IRQ vector numbers issueYonglong Liu6-6/+58
Currently, the num_msi_left means the vector numbers of NIC, but if the PF supported RoCE, it contains the vector numbers of NIC and RoCE(Not expected). This may cause interrupts lost in some case, because of the NIC module used the vector resources which belongs to RoCE. This patch adds a new variable num_nic_msi to store the vector numbers of NIC, and adjust the default TQP numbers and rss_size according to the value of num_nic_msi. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Yonglong Liu <[email protected]> Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-19perf trace: Use STUL_STRARRAY_FLAGS with mmapArnaldo Carvalho de Melo1-1/+3
The 'mmap' syscall has special needs so it doesn't use SCA_STRARRAY_FLAGS, see its implementation in syscall_arg__scnprintf_mmap_flags(), related to special handling of MAP_ANONYMOUS, so set ->parm to the strarray__mmap_flags and hook up with strarray__strtoul_flags manually, now we can filter by those or-ed string expressions: # perf trace -e syscalls:sys_enter_mmap sleep 1 0.000 syscalls:sys_enter_mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) 0.026 syscalls:sys_enter_mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) 0.036 syscalls:sys_enter_mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3, off: 0) 0.046 syscalls:sys_enter_mmap(addr: 0x7fae003d9000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000) 0.052 syscalls:sys_enter_mmap(addr: 0x7fae00526000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000) 0.055 syscalls:sys_enter_mmap(addr: 0x7fae00573000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000) 0.062 syscalls:sys_enter_mmap(addr: 0x7fae00579000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) 0.253 syscalls:sys_enter_mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) # # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE" sleep 1 0.000 syscalls:sys_enter_mmap(addr: 0x7f6ab3dcb000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000) 0.010 syscalls:sys_enter_mmap(addr: 0x7f6ab3f18000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000) 0.014 syscalls:sys_enter_mmap(addr: 0x7f6ab3f65000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000) # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|ANONYMOUS" sleep 1 0.000 syscalls:sys_enter_mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) # # perf trace -v -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|ANONYMOUS" sleep 1 |& grep "New filter" New filter for syscalls:sys_enter_mmap: flags==0x22 # Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19perf trace: Wire up strarray__strtoul_flags()Arnaldo Carvalho de Melo2-0/+9
Now anything that uses STRARRAY_FLAGS, like the 'fsmount' syscall will support mapping or-ed strings back to a value that can be used in a filter. In some cases, where STRARRAY_FLAGS isn't used but instead the scnprintf is a special one because of specific needs, like for mmap, then one has to set the ->pars to the strarray. See the next cset. Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libbeauty: Introduce strarray__strtoul_flags()Arnaldo Carvalho de Melo2-1/+45
Counterpart of strarray__scnprintf_flags(), i.e. from a expression like: # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE" I.e. that "flags==PRIVATE|FIXED|DENYWRITE", turn that into # perf trace -e syscalls:sys_enter_mmap --filter=0x812 Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libbeauty: Make the mmap_flags strarray visible outside of its beautifierArnaldo Carvalho de Melo1-2/+2
So that we can later use it with the strarray__strtoul_flags() routine that will be soon introduced. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19perf trace: Use strtoul for the fcntl 'cmd' argumentArnaldo Carvalho de Melo1-1/+2
Since its values are in two ranges of values we ended up codifying it using a 'struct strarrays', so now hook it up with STUL_STRARRAYS so that we can do: # perf trace -e syscalls:*enter_fcntl --filter=cmd==SETLK||cmd==SETLKW 0.000 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4dee0) 1.523 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de90) 1.629 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de90) 2.711 sssd_kcm/19021 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7ffcf0a4de70) ^C# Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libbeauty: Introduce syscall_arg__strtoul_strarrays()Arnaldo Carvalho de Melo2-0/+8
To allow going from string to integer for 'struct strarrays'. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Add pr_err() macroJiri Olsa2-0/+4
And missing include for "perf/core.h" header, which provides LIBPERF_* debug levels and add missing pr_err() support. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Do not export perf_evsel__init()/perf_evlist__init()Jiri Olsa5-5/+2
There's no point in exporting perf_evsel__init()/perf_evlist__init(), it's called from perf_evsel__new()/perf_evlist__new() respectively. It's used only from perf where perf_evsel()/perf_evlist() is embedded perf's evsel/evlist. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Keep count of failed testsJiri Olsa5-7/+21
Keep the count of failed tests, so we get better output with failures, like: # make tests ... running static: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...FAILED test-evlist.c:53 failed to create evsel2 FAILED test-evlist.c:163 failed to create evsel2 FAILED test-evlist.c:287 failed count FAILED (3) - running test-evsel.c...OK running dynamic: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...FAILED test-evlist.c:53 failed to create evsel2 FAILED test-evlist.c:163 failed to create evsel2 FAILED test-evlist.c:287 failed count FAILED (3) - running test-evsel.c...OK ... Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Add tests_mmap_cpus testJiri Olsa1-0/+98
Add mmaping tests that generates prctl call on every cpu validates it gets all the related events in ring buffer. Committer testing: # make -C tools/perf/lib tests make: Entering directory '/home/acme/git/perf/tools/perf/lib' LINK test-cpumap-a LINK test-threadmap-a LINK test-evlist-a LINK test-evsel-a LINK test-cpumap-so LINK test-threadmap-so LINK test-evlist-so LINK test-evsel-so running static: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...OK - running test-evsel.c...OK running dynamic: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...OK - running test-evsel.c...OK make: Leaving directory '/home/acme/git/perf/tools/perf/lib' # Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] [ Added _GNU_SOURCE define for sched.h to get sched_[gs]et_affinity Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Add tests_mmap_thread testJiri Olsa1-0/+119
Add mmaping tests that generates 100 prctl calls in monitored child process and validates it gets 100 events in ring buffer. Committer tests: # make -C tools/perf/lib tests make: Entering directory '/home/acme/git/perf/tools/perf/lib' LINK test-cpumap-a LINK test-threadmap-a LINK test-evlist-a LINK test-evsel-a LINK test-cpumap-so LINK test-threadmap-so LINK test-evlist-so LINK test-evsel-so running static: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...OK - running test-evsel.c...OK running dynamic: - running test-cpumap.c...OK - running test-threadmap.c...OK - running test-evlist.c...OK - running test-evsel.c...OK make: Leaving directory '/home/acme/git/perf/tools/perf/lib' # Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Link static tests with libapi.aJiri Olsa2-3/+4
Both static and dynamic tests needs to link with libapi.a, because it's using its functions. Also include path for libapi includes. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Move mask setup to perf_evlist__mmap_ops()Jiri Olsa2-2/+2
Move the mask setup to perf_evlist__mmap_ops(), because it's the same on both perf and libperf path. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Move mmap allocation to perf_evlist__mmap_ops::getJiri Olsa2-32/+34
Move allocation of the mmap array into perf_evlist__mmap_ops::get, to centralize the mmap allocation. Also move nr_mmap setup to perf_evlist__mmap_ops so it's centralized and shared by both perf and libperf mmap code. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19libperf: Introduce perf_evlist__for_each_mmap()Jiri Olsa7-6/+47
Add the perf_evlist__for_each_mmap() function and export it in the perf/evlist.h header, so that the user can iterate through 'struct perf_mmap' objects. Add a internal perf_mmap__link() function to do the actual linking. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19perf tests: Disable bp_signal testing for arm64Leo Yan1-9/+6
As there are several discussions for enabling perf breakpoint signal testing on arm64 platform: arm64 needs to rely on single-step to execute the breakpointed instruction and then reinstall the breakpoint exception handler. But if we hook the breakpoint with a signal, the signal handler will do the stepping rather than the breakpointed instruction, this causes infinite loops as below: Kernel space | Userspace ---------------------------------|-------------------------------- | __test_function() -> hit | breakpoint breakpoint_handler() | `-> user_enable_single_step() | do_signal() | | sig_handler() -> Step one | instruction and | trap to kernel single_step_handler() | `-> reinstall_suspended_bps() | | __test_function() -> hit | breakpoint again and | repeat up flow infinitely As Will Deacon mentioned [1]: "that we require the overflow handler to do the stepping on arm/arm64, which is relied upon by GDB/ptrace. The hw_breakpoint code is a complete disaster so my preference would be to rip out the perf part and just implement something directly in ptrace, but it's a pretty horrible job". Though Will commented this on arm architecture, but the comment also can apply on arm64 architecture. For complete information, I searched online and found a few years back, Wang Nan sent one patch 'arm64: Store breakpoint single step state into pstate' [2]; the patch tried to resolve this issue by avoiding single stepping in signal handler and defer to enable the signal stepping when return to __test_function(). The fixing was not merged due to the concern for missing to handle different usage cases. Based on the info, the most feasible way is to skip Perf breakpoint signal testing for arm64 and this could avoid the duplicate investigation efforts when people see the failure. This patch skips this case on arm64 platform, which is same with arm architecture. [1] https://lkml.org/lkml/2018/11/15/205 [2] https://lkml.org/lkml/2015/12/23/477 Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Brajeswar Ghosh <[email protected]> Cc: Florian Fainelli <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Souptick Joarder <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19perf tests bp_account: Add dedicated checking helper is_supported()Leo Yan3-1/+18
The arm architecture supports breakpoint accounting but it doesn't support breakpoint overflow signal handling. The current code uses the same checking helper, thus it disables both testings (bp_account and bp_signal) for arm platform. For handling two testings separately, this patch adds a dedicated checking helper is_supported() for breakpoint accounting testing, thus it allows supporting breakpoint accounting testing on arm platform; the old helper test__bp_signal_is_supported() is only used to checking for breakpoint overflow signal testing. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Brajeswar Ghosh <[email protected]> Cc: Florian Fainelli <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Souptick Joarder <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19perf tests: Remove needless headers for bp_accountLeo Yan1-4/+0
A few headers are not needed and were introduced by copying from other test file. This patch removes the needless headers for the breakpoint accounting testing. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Brajeswar Ghosh <[email protected]> Cc: Florian Fainelli <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Souptick Joarder <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19perf list: Hide deprecated events by defaultJin Yao9-19/+55
There are some deprecated events listed by perf list. But we can't remove them from perf list with ease because some old scripts may use them. Deprecated events are old names of renamed events. When an event gets renamed the old name is kept around for some time and marked with Deprecated. The newer Intel event lists in the tree already have these headers. So we need to keep them in the event list, but provide a new option to show them. The new option is "--deprecated". With this patch, the deprecated events are hidden by default but they can be displayed when option "--deprecated" is enabled. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19perf trace: Pass a syscall_arg to syscall_arg_fmt->strtoul()Arnaldo Carvalho de Melo1-1/+5
With just what we need for the STUL_STRARRAY, i.e. the 'struct strarray' pointer to be used, just like with syscall_arg_fmt->scnprintf() for the other direction (number -> string). With this all the strarrays that are associated with syscalls can be used with '-e syscalls:sys_enter_SYSCALLNAME --filter', and soon will be possible as well to use with the strace-like shorter form, with just the syscall names, i.e. something like: -e lseek/whence==END/ For now we have to use the longer form: # perf trace -e syscalls:sys_enter_lseek 0.000 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.031 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 0.046 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.528 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.575 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 5003.593 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.017 pool/2242 syscalls:sys_enter_lseek(fd: 14<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.051 pool/2242 syscalls:sys_enter_lseek(fd: 15<anon_inode:[timerfd]>, offset: 0, whence: CUR) 10002.068 pool/2242 syscalls:sys_enter_lseek(fd: 16<anon_inode:[timerfd]>, offset: 0, whence: CUR) ^C# perf trace -e syscalls:sys_enter_lseek --filter="whence!=CUR" 0.000 sshd/24476 syscalls:sys_enter_lseek(fd: 3, offset: 9032, whence: SET) 0.060 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 9032, whence: SET) 0.187 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET) 0.203 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 118632, whence: SET) 0.349 sshd/24476 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libcrypt.so.2.0.0>, offset: 61936, whence: SET) ^C# And for those curious about what are those lseek(DSO, offset, SET), well, its the loader: # perf trace -e syscalls:sys_enter_lseek/max-stack=16/ --filter="whence!=CUR" 0.000 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.067 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 9032, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object_from_fd (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.198 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) 0.219 sshd/24495 syscalls:sys_enter_lseek(fd: 3</usr/lib64/libgcrypt.so.20.2.5>, offset: 118632, whence: SET) __libc_lseek64 (/usr/lib64/ld-2.29.so) _dl_map_object_from_fd (/usr/lib64/ld-2.29.so) _dl_map_object (/usr/lib64/ld-2.29.so) ^C# :-) With this we can use strings in strarrays in filters, which allows us to reuse all these that are in place for syscalls: $ find tools/perf/trace/beauty/ -name "*.c" | xargs grep -w DEFINE_STRARRAY tools/perf/trace/beauty/fcntl.c: static DEFINE_STRARRAY(fcntl_setlease, "F_"); tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(mmap_flags, "MAP_"); tools/perf/trace/beauty/mmap.c: static DEFINE_STRARRAY(madvise_advices, "MADV_"); tools/perf/trace/beauty/sync_file_range.c: static DEFINE_STRARRAY(sync_file_range_flags, "SYNC_FILE_RANGE_"); tools/perf/trace/beauty/socket.c: static DEFINE_STRARRAY(socket_ipproto, "IPPROTO_"); tools/perf/trace/beauty/mount_flags.c: static DEFINE_STRARRAY(mount_flags, "MS_"); tools/perf/trace/beauty/pkey_alloc.c: static DEFINE_STRARRAY(pkey_alloc_access_rights, "PKEY_"); tools/perf/trace/beauty/sockaddr.c:DEFINE_STRARRAY(socket_families, "PF_"); tools/perf/trace/beauty/tracepoints/x86_irq_vectors.c:static DEFINE_STRARRAY(x86_irq_vectors, "_VECTOR"); tools/perf/trace/beauty/tracepoints/x86_msr.c:static DEFINE_STRARRAY(x86_MSRs, "MSR_"); tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_options, "PR_"); tools/perf/trace/beauty/prctl.c: static DEFINE_STRARRAY(prctl_set_mm_options, "PR_SET_MM_"); tools/perf/trace/beauty/fspick.c: static DEFINE_STRARRAY(fspick_flags, "FSPICK_"); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(ioctl_tty_cmd, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(drm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_pcm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(sndrv_ctl_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(kvm_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(vhost_virtio_ioctl_read_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(perf_ioctl_cmds, ""); tools/perf/trace/beauty/ioctl.c: static DEFINE_STRARRAY(usbdevfs_ioctl_cmds, ""); tools/perf/trace/beauty/fsmount.c: static DEFINE_STRARRAY(fsmount_attr_flags, "MOUNT_ATTR_"); tools/perf/trace/beauty/renameat.c: static DEFINE_STRARRAY(rename_flags, "RENAME_"); tools/perf/trace/beauty/kcmp.c: static DEFINE_STRARRAY(kcmp_types, "KCMP_"); tools/perf/trace/beauty/move_mount.c: static DEFINE_STRARRAY(move_mount_flags, "MOVE_MOUNT_"); $ Well, some, as the mmap flags are like: $ tools/perf/trace/beauty/mmap_flags.sh static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x008000) + 1] = "POPULATE", [ilog2(0x010000) + 1] = "NONBLOCK", [ilog2(0x020000) + 1] = "STACK", [ilog2(0x040000) + 1] = "HUGETLB", [ilog2(0x080000) + 1] = "SYNC", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", }; $ So we'll need a strarray__strtoul_flags() that will break donw the flags into tokens separated by '|' before doing the lookup and then go on reconstructing the value from, say: # perf trace -e syscalls:sys_enter_mmap --filter="flags==PRIVATE|FIXED|DENYWRITE" into: # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x2|0x10|0x0800" and finally into: # perf trace -e syscalls:sys_enter_mmap --filter="flags==0x812" That is what we see if we don't use the augmented view obtained from: # perf trace -e mmap <SNIP> 211792.885 procmail/15393 mmap(addr: 0x7fcd11645000, len: 8192, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 8, off: 0xa000) = 0x7fcd11645000 <SNIP> But plain use tracefs: procmail-15559 [000] .... 54557.178262: sys_mmap(addr: 7f5c9bf7a000, len: 9b000, prot: 1, flags: 812, fd: 3, off: a9000) Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds27-139/+165
Merge misc fixes from Andrew Morton: "Rather a lot of fixes, almost all affecting mm/" * emailed patches from Andrew Morton <[email protected]>: (26 commits) scripts/gdb: fix debugging modules on s390 kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe register mm/thp: allow dropping THP from page cache mm/vmscan.c: support removing arbitrary sized pages from mapping mm/thp: fix node page state in split_huge_page_to_list() proc/meminfo: fix output alignment mm/init-mm.c: include <linux/mman.h> for vm_committed_as_batch mm/filemap.c: include <linux/ramfs.h> for generic_file_vm_ops definition mm: include <linux/huge_mm.h> for is_vma_temporary_stack zram: fix race between backing_dev_show and backing_dev_store mm/memcontrol: update lruvec counters in mem_cgroup_move_account ocfs2: fix panic due to ocfs2_wq is null hugetlbfs: don't access uninitialized memmaps in pfn_range_valid_gigantic() mm: memblock: do not enforce current limit for memblock_phys* family mm: memcg: get number of pages on the LRU list in memcgroup base on lru_zone_size mm/gup: fix a misnamed "write" argument, and a related bug mm/gup_benchmark: add a missing "w" to getopt string ocfs2: fix error handling in ocfs2_setattr() mm: memcg/slab: fix panic in __free_slab() caused by premature memcg pointer release mm/memunmap: don't access uninitialized memmap in memunmap_pages() ...
2019-10-19scripts/gdb: fix debugging modules on s390Ilya Leoshkevich1-1/+7
Currently lx-symbols assumes that module text is always located at module->core_layout->base, but s390 uses the following layout: +------+ <- module->core_layout->base | GOT | +------+ <- module->core_layout->base + module->arch->plt_offset | PLT | +------+ <- module->core_layout->base + module->arch->plt_offset + | TEXT | module->arch->plt_size +------+ Therefore, when trying to debug modules on s390, all the symbol addresses are skewed by plt_offset + plt_size. Fix by adding plt_offset + plt_size to module_addr in load_module_symbols(). Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Jan Kiszka <[email protected]> Cc: Kieran Bingham <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-10-19kernel/events/uprobes.c: only do FOLL_SPLIT_PMD for uprobe registerSong Liu1-2/+11
Attaching uprobe to text section in THP splits the PMD mapped page table into PTE mapped entries. On uprobe detach, we would like to regroup PMD mapped page table entry to regain performance benefit of THP. However, the regroup is broken For perf_event based trace_uprobe. This is because perf_event based trace_uprobe calls uprobe_unregister twice on close: first in TRACE_REG_PERF_CLOSE, then in TRACE_REG_PERF_UNREGISTER. The second call will split the PMD mapped page table entry, which is not the desired behavior. Fix this by only use FOLL_SPLIT_PMD for uprobe register case. Add a WARN() to confirm uprobe unregister never work on huge pages, and abort the operation when this WARN() triggers. Link: http://lkml.kernel.org/r/[email protected] Fixes: 5a52c9df62b4 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT") Signed-off-by: Song Liu <[email protected]> Reviewed-by: Srikar Dronamraju <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: William Kucharski <[email protected]> Cc: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>