aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-11-14tls: Use kzalloc for aead_request allocationIlya Lesokhin1-1/+1
Use kzalloc for aead_request allocation as we don't set all the bits in the request. Fixes: 3c4d7559159b ('tls: kernel TLS support') Signed-off-by: Ilya Lesokhin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14Merge branch 'bpf-improve-verifier-ARG_CONST_SIZE_OR_ZERO-semantics'David S. Miller3-37/+142
Yonghong Song says: ==================== bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics This patch set intends to change verifier ARG_CONST_SIZE_OR_ZERO semantics so that simpler bpf programs can be written with verifier acceptance. Patch #1 comment provided the detailed examples and the patch itself implements the new semantics. Patch #2 changes bpf_probe_read helper arg2 type from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO. Patch #3 fixed a few test cases and added some for better coverage. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-11-14bpf: fix and add test cases for ARG_CONST_SIZE_OR_ZERO semantics changeYonghong Song1-19/+112
Fix a few test cases to allow non-NULL map/packet/stack pointer with size = 0. Change a few tests using bpf_probe_read to use bpf_probe_write_user so ARG_CONST_SIZE arg can still be properly tested. One existing test case already covers size = 0 with non-NULL packet pointer, so add additional tests so all cases of size = 0 and 0 <= size <= legal_upper_bound with non-NULL map/packet/stack pointer are covered. Signed-off-by: Yonghong Song <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZEROYonghong Song1-2/+6
The helper bpf_probe_read arg2 type is changed from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO to permit size-0 buffer. Together with newer ARG_CONST_SIZE_OR_ZERO semantics which allows non-NULL buffer with size 0, this allows simpler bpf programs with verifier acceptance. The previous commit which changes ARG_CONST_SIZE_OR_ZERO semantics has details on examples. Signed-off-by: Yonghong Song <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semanticsYonghong Song1-16/+24
For helpers, the argument type ARG_CONST_SIZE_OR_ZERO permits the access size to be 0 when accessing the previous argument (arg). Right now, it requires the arg needs to be NULL when size passed is 0 or could be 0. It also requires a non-NULL arg when the size is proved to be non-0. This patch changes verifier ARG_CONST_SIZE_OR_ZERO behavior such that for size-0 or possible size-0, it is not required the arg equal to NULL. There are a couple of reasons for this semantics change, and all of them intends to simplify user bpf programs which may improve user experience and/or increase chances of verifier acceptance. Together with the next patch which changes bpf_probe_read arg2 type from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO, the following two examples, which fail the verifier currently, are able to get verifier acceptance. Example 1: unsigned long len = pend - pstart; len = len > MAX_PAYLOAD_LEN ? MAX_PAYLOAD_LEN : len; len &= MAX_PAYLOAD_LEN; bpf_probe_read(data->payload, len, pstart); It does not have test for "len > 0" and it failed the verifier. Users may not be aware that they have to add this test. Converting the bpf_probe_read helper to have ARG_CONST_SIZE_OR_ZERO helps the above code get verifier acceptance. Example 2: Here is one example where llvm "messed up" the code and the verifier fails. ...... unsigned long len = pend - pstart; if (len > 0 && len <= MAX_PAYLOAD_LEN) bpf_probe_read(data->payload, len, pstart); ...... The compiler generates the following code and verifier fails: ...... 39: (79) r2 = *(u64 *)(r10 -16) 40: (1f) r2 -= r8 41: (bf) r1 = r2 42: (07) r1 += -1 43: (25) if r1 > 0xffe goto pc+3 R0=inv(id=0) R1=inv(id=0,umax_value=4094,var_off=(0x0; 0xfff)) R2=inv(id=0) R6=map_value(id=0,off=0,ks=4,vs=4095,imm=0) R7=inv(id=0) R8=inv(id=0) R9=inv0 R10=fp0 44: (bf) r1 = r6 45: (bf) r3 = r8 46: (85) call bpf_probe_read#45 R2 min value is negative, either use unsigned or 'var &= const' ...... The compiler optimization is correct. If r1 = 0, r1 - 1 = 0xffffffffffffffff > 0xffe. If r1 != 0, r1 - 1 will not wrap. r1 > 0xffe at insn #43 can actually capture both "r1 > 0" and "len <= MAX_PAYLOAD_LEN". This however causes an issue in verifier as the value range of arg2 "r2" does not properly get refined and lead to verification failure. Relaxing bpf_prog_read arg2 from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO allows the following simplied code: unsigned long len = pend - pstart; if (len <= MAX_PAYLOAD_LEN) bpf_probe_read(data->payload, len, pstart); The llvm compiler will generate less complex code and the verifier is able to verify that the program is okay. Signed-off-by: Yonghong Song <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14tcp: allow drivers to tweak TSQ logicEric Dumazet3-2/+5
I had many reports that TSQ logic breaks wifi aggregation. Current logic is to allow up to 1 ms of bytes to be queued into qdisc and drivers queues. But Wifi aggregation needs a bigger budget to allow bigger rates to be discovered by various TCP Congestion Controls algorithms. This patch adds an extra socket field, allowing wifi drivers to select another log scale to derive TCP Small Queue credit from current pacing rate. Initial value is 10, meaning that this patch does not change current behavior. We expect wifi drivers to set this field to smaller values (tests have been done with values from 6 to 9) They would have to use following template : if (skb->sk && skb->sk->sk_pacing_shift != MY_PACING_SHIFT) skb->sk->sk_pacing_shift = MY_PACING_SHIFT; Ref: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1670041 Signed-off-by: Eric Dumazet <[email protected]> Cc: Johannes Berg <[email protected]> Cc: Toke Høiland-Jørgensen <[email protected]> Cc: Kir Kolyshkin <[email protected]> Acked-by: Neal Cardwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14Merge tag 'rxrpc-next-20171111' of ↵David S. Miller7-7/+36
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fixes Here are some patches that fix some things in AF_RXRPC: (1) Prevent notifications from being passed to a kernel service for a call that it has ended. (2) Fix a null pointer deference that occurs under some circumstances when an ACK is generated. (3) Fix a number of things to do with call expiration. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-11-14bnx2x: fix slowpath null crashZhu Yanjun1-3/+10
When "NETDEV WATCHDOG: em4 (bnx2x): transmit queue 2 timed out" occurs, BNX2X_SP_RTNL_TX_TIMEOUT is set. In the function bnx2x_sp_rtnl_task, bnx2x_nic_unload and bnx2x_nic_load are executed to shutdown and open NIC. In the function bnx2x_nic_load, bnx2x_alloc_mem allocates dma failure. The message "bnx2x: [bnx2x_alloc_mem:8399(em4)]Can't allocate memory" pops out. The variable slowpath is set to NULL. When shutdown the NIC, the function bnx2x_nic_unload is called. In the function bnx2x_nic_unload, the following functions are executed. bnx2x_chip_cleanup bnx2x_set_storm_rx_mode bnx2x_set_q_rx_mode bnx2x_set_q_rx_mode bnx2x_config_rx_mode bnx2x_set_rx_mode_e2 In the function bnx2x_set_rx_mode_e2, the variable slowpath is operated. Then the crash occurs. To fix this crash, the variable slowpath is checked. And in the function bnx2x_sp_rtnl_task, after dma memory allocation fails, another shutdown and open NIC is executed. CC: Joe Jin <[email protected]> CC: Junxiao Bi <[email protected]> Signed-off-by: Zhu Yanjun <[email protected]> Acked-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14Merge branch 'cxgb4-collect-LE-TCAM-and-SGE-queue-contexts'David S. Miller9-0/+456
Rahul Lakkireddy says: ==================== cxgb4: collect LE-TCAM and SGE queue contexts Collect hardware dumps via ethtool --get-dump facility. Patch 1 collects LE-TCAM dump. Patch 2 collects SGE queue context dumps. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-11-14cxgb4: collect SGE queue context dumpRahul Lakkireddy9-0/+195
Collect SGE freelist queue and congestion manager contexts. Signed-off-by: Rahul Lakkireddy <[email protected]> Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14cxgb4: collect LE-TCAM dumpRahul Lakkireddy6-0/+261
Signed-off-by: Rahul Lakkireddy <[email protected]> Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14vxlan: fix the issue that neigh proxy blocks all icmpv6 packetsXin Long1-18/+13
Commit f1fb08f6337c ("vxlan: fix ND proxy when skb doesn't have transport header offset") removed icmp6_code and icmp6_type check before calling neigh_reduce when doing neigh proxy. It means all icmpv6 packets would be blocked by this, not only ns packet. In Jianlin's env, even ping6 couldn't work through it. This patch is to bring the icmp6_code and icmp6_type check back and also removed the same check from neigh_reduce(). Fixes: f1fb08f6337c ("vxlan: fix ND proxy when skb doesn't have transport header offset") Reported-by: Jianlin Shi <[email protected]> Signed-off-by: Xin Long <[email protected]> Reviewed-by: Vincent Bernat <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14xfrm6_tunnel: exit_net cleanup check addedVasily Averin1-0/+8
Be sure that spi_byaddr and spi_byspi arrays initialized in net_init hook were return to initial state Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14ppp: exit_net cleanup checks addedVasily Averin1-0/+2
Be sure that lists initialized in net_init hook were return to initial state. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14phonet: exit_net cleanup check addedVasily Averin1-0/+3
Be sure that pndevs.list initialized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14l2tp: exit_net cleanup check addedVasily Averin1-0/+4
Be sure that l2tp_session_hlist array initialized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14fib_rules: exit_net cleanup check addedVasily Averin1-0/+6
Be sure that rules_ops list initialized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14fib_notifier: exit_net cleanup check addedVasily Averin1-0/+6
Be sure that fib_notifier_ops list initilized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14netdev: exit_net cleanup check addedVasily Averin1-0/+2
Be sure that dev_base_head list initialized in net_init hook was return to initial state Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14vxlan: exit_net cleanup checks addedVasily Averin1-0/+4
Be sure that sock_list array initialized in net_init hook was return to initial state Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14packet: exit_net cleanup check addedVasily Averin1-0/+1
Be sure that packet.sklist initialized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14geneve: exit_net cleanup check addedVasily Averin1-0/+1
Be sure that sock_list initialized in net_init hook was return to initial state. Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-14af_key: replace BUG_ON on WARN_ON in net_exit hookVasily Averin1-1/+1
Signed-off-by: Vasily Averin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-11-13Merge tag 'usb-4.15-rc1' of ↵Linus Torvalds764-9083/+6609
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB/PHY updates from Greg KH: "Here is the big set of USB and PHY driver updates for 4.15-rc1. There is the usual amount of gadget and xhci driver updates, along with phy and chipidea enhancements. There's also a lot of SPDX tags and license boilerplate cleanups as well, which provide some churn in the diffstat. Other major thing is the typec code that moved out of staging and into the "real" part of the drivers/usb/ tree, which was nice to see happen. All of these have been in linux-next with no reported issues for a while" * tag 'usb-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (263 commits) usb: gadget: f_fs: Fix use-after-free in ffs_free_inst USB: usbfs: compute urb->actual_length for isochronous usb: core: message: remember to reset 'ret' to 0 when necessary USB: typec: Remove remaining redundant license text USB: typec: add SPDX identifiers to some files USB: renesas_usbhs: rcar?.h: add SPDX tags USB: chipidea: ci_hdrc_tegra.c: add SPDX line USB: host: xhci-debugfs: add SPDX lines USB: add SPDX identifiers to all remaining Makefiles usb: host: isp1362-hcd: remove a couple of redundant assignments USB: adutux: remove redundant variable minor usb: core: add a new usb_get_ptm_status() helper usb: core: add a 'type' parameter to usb_get_status() usb: core: introduce a new usb_get_std_status() helper usb: core: rename usb_get_status() 'type' argument to 'recip' usb: core: add Status Type definitions USB: gadget: Remove redundant license text USB: gadget: function: Remove redundant license text USB: gadget: udc: Remove redundant license text USB: gadget: legacy: Remove redundant license text ...
2017-11-14Merge branch 'topic/xilinx' into for-linusVinod Koul2-14/+14
2017-11-14Merge branch 'topic/timer_api' into for-linusVinod Koul4-11/+8
2017-11-14Merge branch 'topic/ti' into for-linusVinod Koul3-4/+14
2017-11-14Merge branch 'topic/sun' into for-linusVinod Koul2-62/+221
2017-11-14Merge branch 'topic/sprd' into for-linusVinod Koul4-0/+1038
Kconfig and Makefile conflicts so put them in right order (sprd ones after stm ones) Signed-off-by: Vinod Koul <[email protected]>
2017-11-13Merge tag 'tty-4.15-rc1' of ↵Linus Torvalds202-1668/+1276
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big tty/serial driver pull request for 4.15-rc1. Lots of serial driver updates in here, some small vt cleanups, and a raft of SPDX and license boilerplate cleanups, messing up the diffstat a bit. Nothing major, with no realy functional changes except better hardware support for some platforms. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (110 commits) tty: ehv_bytechan: fix spelling mistake tty: serial: meson: allow baud-rates lower than 9600 serial: 8250_fintek: Fix crash with baud rate B0 serial: 8250_fintek: Disable delays for ports != 0 serial: 8250_fintek: Return -EINVAL on invalid configuration tty: Remove redundant license text tty: serdev: Remove redundant license text tty: hvc: Remove redundant license text tty: serial: Remove redundant license text tty: add SPDX identifiers to all remaining files in drivers/tty/ tty: serial: jsm: remove redundant pointer ts tty: serial: jsm: add space before the open parenthesis '(' tty: serial: jsm: fix coding style tty: serial: jsm: delete space between function name and '(' tty: serial: jsm: add blank line after declarations tty: serial: jsm: change the type of local variable tty: serial: imx: remove dead code imx_dma_rxint tty: serial: imx: disable ageing timer interrupt if dma in use serial: 8250: fix potential deadlock in rs485-mode serial: m32r_sio: Drop redundant .data assignment ...
2017-11-14Merge branch 'topic/stm' into for-linusVinod Koul7-1/+2213
2017-11-14Merge branch 'topic/sa11x0' into for-linusVinod Koul1-0/+11
2017-11-14Merge branch 'topic/renasas' into for-linusVinod Koul2-3/+4
2017-11-14Merge branch 'topic/qcom' into for-linusVinod Koul1-60/+109
2017-11-14Merge branch 'topic/pl330' into for-linusVinod Koul1-19/+20
2017-11-14Merge branch 'topic/imx' into for-linusVinod Koul1-3/+11
2017-11-14Merge branch 'topic/img' into for-linusVinod Koul1-18/+80
2017-11-14Merge branch 'topic/doc' into for-linusVinod Koul1-13/+17
2017-11-14Merge branch 'topic/dmatest' into for-linusVinod Koul1-0/+1
2017-11-14Merge branch 'topic/bcom' into for-linusVinod Koul2-81/+38
2017-11-14Merge branch 'topic/axi' into for-linusVinod Koul1-20/+55
2017-11-14Merge branch 'topic/print_fixes' into for-linusVinod Koul2-4/+4
2017-11-13Merge tag 'staging-4.15-rc1' of ↵Linus Torvalds1008-8386/+8567
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO updates from Greg KH: "Here is the "big" staging and IIO driver update for 4.15-rc1. Lots and lots of little changes, almost all minor code cleanups as the Outreachy application process happened during this development cycle. Also happened was a lot of IIO driver activity, and the typec USB code moving out of staging to drivers/usb (same commits are in the USB tree on a persistent branch to not cause merge issues.) Overall, it's a wash, I think we added a few hundred more lines than removed, but really only a few thousand were modified at all. All of these have been in linux-next for a while. There might be a merge issue with Al's vfs tree in the pi433 driver (take his changes, they are always better), and the media tree with some of the odd atomisp cleanups (take the media tree's version)" * tag 'staging-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (507 commits) staging: lustre: add SPDX identifiers to all lustre files staging: greybus: Remove redundant license text staging: greybus: add SPDX identifiers to all greybus driver files staging: ccree: simplify ioread/iowrite staging: ccree: simplify registers access staging: ccree: simplify error handling logic staging: ccree: remove dead code staging: ccree: handle limiting of DMA masks staging: ccree: copy IV to DMAable memory staging: fbtft: remove redundant initialization of buf staging: sm750fb: Fix parameter mistake in poke32 staging: wilc1000: Fix bssid buffer offset in Txq staging: fbtft: fb_ssd1331: fix mirrored display staging: android: Fix checkpatch.pl error staging: greybus: loopback: convert loopback to use generic async operations staging: greybus: operation: add private data with get/set accessors staging: greybus: loopback: Fix iteration count on async path staging: greybus: loopback: Hold per-connection mutex across operations staging: greybus/loopback: use ktime_get() for time intervals staging: fsl-dpaa2/eth: Extra headroom in RX buffers ...
2017-11-13Merge tag 'devprop-4.15-rc1' of ↵Linus Torvalds4-6/+15
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull device properties framework updates from Rafael Wysocki: "These make the fwnode_handle_get() function return a pointer to the target fwnode object, which reflects the of_node_get() behavior, and add a macro for iterating over graph endpoints (Sakari Ailus)" * tag 'devprop-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: device property: Add a macro for interating over graph endpoints device property: Make fwnode_handle_get() return the fwnode
2017-11-13Merge tag 'acpi-4.15-rc1' of ↵Linus Torvalds48-629/+1551
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update ACPICA to upstream revision 20170831, fix APEI to use the fixmap instead of ioremap_page_range(), add an operation region driver for TI PMIC TPS68470, add support for PCC subspace IDs to the ACPI CPPC driver, fix a few assorted issues and clean up some code. Specifics: - Update the ACPICA code to upstream revision 20170831 including * PDTT table header support (Bob Moore). * Cleanup and extension of internal string-to-integer conversion functions (Bob Moore). * Support for 64-bit hardware accesses (Lv Zheng). * ACPI PM Timer code adjustment to deal with 64-bit return values of acpi_hw_read() (Bob Moore). * Support for deferred table verification in acpiexec (Lv Zheng). - Fix APEI to use the fixmap instead of ioremap_page_range() which cannot work correctly the way the code in there attempted to use it and drop some code that's not necessary any more after that change (James Morse). - Clean up the APEI support code and make it use 64-bit timestamps (Arnd Bergmann, Dongjiu Geng, Jan Beulich). - Add operation region driver for TI PMIC TPS68470 (Rajmohan Mani). - Add support for PCC subspace IDs to the ACPI CPPC driver (George Cherian). - Fix an ACPI EC driver regression related to the handling of EC events during the "noirq" phases of system suspend/resume (Lv Zheng). - Delay the initialization of the lid state in the ACPI button driver to fix issues appearing on some systems (Hans de Goede). - Extend the KIOX000A "device always present" quirk to cover all affected BIOS versions (Hans de Goede). - Clean up some code in the ACPI core and drivers (Colin Ian King, Gustavo Silva)" * tag 'acpi-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (24 commits) ACPI: Mark expected switch fall-throughs ACPI / LPSS: Remove redundant initialization of clk ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs mailbox: PCC: Move the MAX_PCC_SUBSPACES definition to header file ACPI / sysfs: Make function param_set_trace_method_name() static ACPI / button: Delay acpi_lid_initialize_state() until first user space open ACPI / EC: Fix regression related to triggering source of EC event handling APEI / ERST: use 64-bit timestamps ACPI / APEI: Remove arch_apei_flush_tlb_one() arm64: mm: Remove arch_apei_flush_tlb_one() ACPI / APEI: Remove ghes_ioremap_area ACPI / APEI: Replace ioremap_page_range() with fixmap ACPI / APEI: remove the unused dead-code for SEA/NMI notification type ACPI / x86: Extend KIOX000A quirk to cover all affected BIOS versions ACPI / APEI: adjust a local variable type in ghes_ioremap_pfn_irq() ACPICA: Update version to 20170831 ACPICA: Update acpi_get_timer for 64-bit interface to acpi_hw_read ACPICA: String conversions: Update to add new behaviors ACPICA: String conversions: Cleanup/format comments. No functional changes ACPICA: Restructure/cleanup all string-to-integer conversion functions ...
2017-11-13Merge tag 'pm-4.15-rc1' of ↵Linus Torvalds99-1063/+1758
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "There are no real big ticket items here this time. The most noticeable change is probably the relocation of the OPP (Operating Performance Points) framework to its own directory under drivers/ as it has grown big enough for that. Also Viresh is now going to maintain it and send pull requests for it to me, so you will see this change in the git history going forward (but still not right now). Another noticeable set of changes is the modifications of the PM core, the PCI subsystem and the ACPI PM domain to allow of more integration between system-wide suspend/resume and runtime PM. For now it's just a way to avoid resuming devices from runtime suspend unnecessarily during system suspend (if the driver sets a flag to indicate its readiness for that) and in the works is an analogous mechanism to allow devices to stay suspended after system resume. In addition to that, we have some changes related to supporting frequency-invariant CPU utilization metrics in the scheduler and in the schedutil cpufreq governor on ARM and changes to add support for device performance states to the generic power domains (genpd) framework. The rest is mostly fixes and cleanups of various sorts. Specifics: - Relocate the OPP (Operating Performance Points) framework to its own directory under drivers/ and add support for power domain performance states to it (Viresh Kumar). - Modify the PM core, the PCI bus type and the ACPI PM domain to support power management driver flags allowing device drivers to specify their capabilities and preferences regarding the handling of devices with enabled runtime PM during system suspend/resume and clean up that code somewhat (Rafael Wysocki, Ulf Hansson). - Add frequency-invariant accounting support to the task scheduler on ARM and ARM64 (Dietmar Eggemann). - Fix PM QoS device resume latency framework to prevent "no restriction" requests from overriding requests with specific requirements and drop the confusing PM_QOS_FLAG_REMOTE_WAKEUP device PM QoS flag (Rafael Wysocki). - Drop legacy class suspend/resume operations from the PM core and drop legacy bus type suspend and resume callbacks from ARM/locomo (Rafael Wysocki). - Add min/max frequency support to devfreq and clean it up somewhat (Chanwoo Choi). - Rework wakeup support in the generic power domains (genpd) framework and update some of its users accordingly (Geert Uytterhoeven). - Convert timers in the PM core to use timer_setup() (Kees Cook). - Add support for exposing the SLP_S0 (Low Power S0 Idle) residency counter based on the LPIT ACPI table on Intel platforms (Srinivas Pandruvada). - Add per-CPU PM QoS resume latency support to the ladder cpuidle governor (Ramesh Thomas). - Fix a deadlock between the wakeup notify handler and the notifier removal in the ACPI core (Ville Syrjälä). - Fix a cpufreq schedutil governor issue causing it to use stale cached frequency values sometimes (Viresh Kumar). - Fix an issue in the system suspend core support code causing wakeup events detection to fail in some cases (Rajat Jain). - Fix the generic power domains (genpd) framework to prevent the PM core from using the direct-complete optimization with it as that is guaranteed to fail (Ulf Hansson). - Fix a minor issue in the cpuidle core and clean it up a bit (Gaurav Jindal, Nicholas Piggin). - Fix and clean up the intel_idle and ARM cpuidle drivers (Jason Baron, Len Brown, Leo Yan). - Fix a couple of minor issues in the OPP framework and clean it up (Arvind Yadav, Fabio Estevam, Sudeep Holla, Tobias Jordan). - Fix and clean up some cpufreq drivers and fix a minor issue in the cpufreq statistics code (Arvind Yadav, Bhumika Goyal, Fabio Estevam, Gautham Shenoy, Gustavo Silva, Marek Szyprowski, Masahiro Yamada, Robert Jarzmik, Zumeng Chen). - Fix minor issues in the system suspend and hibernation core, in power management documentation and in the AVS (Adaptive Voltage Scaling) framework (Helge Deller, Himanshu Jha, Joe Perches, Rafael Wysocki). - Fix some issues in the cpupower utility and document that Shuah Khan is going to maintain it going forward (Prarit Bhargava, Shuah Khan)" * tag 'pm-4.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (88 commits) tools/power/cpupower: add libcpupower.so.0.0.1 to .gitignore tools/power/cpupower: Add 64 bit library detection intel_idle: Graceful probe failure when MWAIT is disabled cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq freezer: Fix typo in freezable_schedule_timeout() comment PM / s2idle: Clear the events_check_enabled flag cpufreq: stats: Handle the case when trans_table goes beyond PAGE_SIZE cpufreq: arm_big_little: make cpufreq_arm_bL_ops structures const cpufreq: arm_big_little: make function arguments and structure pointer const cpuidle: Avoid assignment in if () argument cpuidle: Clean up cpuidle_enable_device() error handling a bit ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock PM / Domains: Fix genpd to deal with drivers returning 1 from ->prepare() cpuidle: ladder: Add per CPU PM QoS resume latency support PM / QoS: Fix device resume latency framework PM / domains: Rework governor code to be more consistent PM / Domains: Remove gpd_dev_ops.active_wakeup() callback soc: rockchip: power-domain: Use GENPD_FLAG_ACTIVE_WAKEUP soc: mediatek: Use GENPD_FLAG_ACTIVE_WAKEUP ARM: shmobile: pm-rmobile: Use GENPD_FLAG_ACTIVE_WAKEUP ...
2017-11-13x86 / CPU: Avoid unnecessary IPIs in arch_freq_get_on_cpu()Rafael J. Wysocki1-4/+7
Even though aperfmperf_snapshot_khz() caches the samples.khz value to return if called again in a sufficiently short time, its caller, arch_freq_get_on_cpu(), still uses smp_call_function_single() to run it which may allow user space to trigger an IPI storm by reading from the scaling_cur_freq cpufreq sysfs file in a tight loop. To avoid that, move the decision on whether or not to return the cached samples.khz value to arch_freq_get_on_cpu(). This change was part of commit 941f5f0f6ef5 ("x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo"), but it was not the reason for the revert and it remains applicable. Fixes: 4815d3c56d1e (cpufreq: x86: Make scaling_cur_freq behave more as expected) Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: WANG Chao <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-13Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds7-51/+148
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Thomas Gleixner: "These updates are related to TSC handling: - Support platforms which have synchronized TSCs but the boot CPU has a non zero TSC_ADJUST value, which is considered a firmware bug on normal systems. This applies to HPE/SGI UV platforms where the platform firmware uses TSC_ADJUST to ensure TSC synchronization across a huge number of sockets, but due to power on timings the boot CPU cannot be guaranteed to have a zero TSC_ADJUST register value. - Fix the ordering of udelay calibration and kvmclock_init() - Cleanup the udelay and calibration code" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Mark cyc2ns_init() and detect_art() __init x86/platform/UV: Mark tsc_check_sync as an init function x86/tsc: Make CONFIG_X86_TSC=n build work again x86/platform/UV: Add check of TSC state set by UV BIOS x86/tsc: Provide a means to disable TSC ART x86/tsc: Drastically reduce the number of firmware bug warnings x86/tsc: Skip TSC test and error messages if already unstable x86/tsc: Add option that TSC on Socket 0 being non-zero is valid x86/timers: Move simple_udelay_calibration() past kvmclock_init() x86/timers: Make recalibrate_cpu_khz() void x86/timers: Move the simple udelay calibration to tsc.h
2017-11-13Merge branch 'x86-cache-for-linus' of ↵Linus Torvalds6-32/+170
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache resource updates from Thomas Gleixner: "This update provides updates to RDT: - A diagnostic framework for the Resource Director Technology (RDT) user interface (sysfs). The failure modes of the user interface are hard to diagnose from the error codes. An extra last command status file provides now sensible textual information about the failure so its simpler to use. - A few minor cleanups and updates in the RDT code" * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel_rdt: Fix a silent failure when writing zero value schemata x86/intel_rdt: Fix potential deadlock during resctrl mount x86/intel_rdt: Fix potential deadlock during resctrl unmount x86/intel_rdt: Initialize bitmask of shareable resource if CDP enabled x86/intel_rdt: Remove redundant assignment x86/intel_rdt/cqm: Make integer rmid_limbo_count static x86/intel_rdt: Add documentation for "info/last_cmd_status" x86/intel_rdt: Add diagnostics when making directories x86/intel_rdt: Add diagnostics when writing the cpus file x86/intel_rdt: Add diagnostics when writing the tasks file x86/intel_rdt: Add diagnostics when writing the schemata file x86/intel_rdt: Add framework for better RDT UI diagnostics
2017-11-13Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds43-1317/+1493
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 APIC updates from Thomas Gleixner: "This update provides a major overhaul of the APIC initialization and vector allocation code: - Unification of the APIC and interrupt mode setup which was scattered all over the place and was hard to follow. This also distangles the timer setup from the APIC initialization which brings a clear separation of functionality. Great detective work from Dou Lyiang! - Refactoring of the x86 vector allocation mechanism. The existing code was based on nested loops and rather convoluted APIC callbacks which had a horrible worst case behaviour and tried to serve all different use cases in one go. This led to quite odd hacks when supporting the new managed interupt facility for multiqueue devices and made it more or less impossible to deal with the vector space exhaustion which was a major roadblock for server hibernation. Aside of that the code dealing with cpu hotplug and the system vectors was disconnected from the actual vector management and allocation code, which made it hard to follow and maintain. Utilizing the new bitmap matrix allocator core mechanism, the new allocator and management code consolidates the handling of system vectors, legacy vectors, cpu hotplug mechanisms and the actual allocation which needs to be aware of system and legacy vectors and hotplug constraints into a single consistent entity. This has one visible change: The support for multi CPU targets of interrupts, which is only available on a certain subset of CPUs/APIC variants has been removed in favour of single interrupt targets. A proper analysis of the multi CPU target feature revealed that there is no real advantage as the vast majority of interrupts end up on the CPU with the lowest APIC id in the set of target CPUs anyway. That change was agreed on by the relevant folks and allowed to simplify the implementation significantly and to replace rather fragile constructs like the vector cleanup IPI with straight forward and solid code. Furthermore this allowed to cleanly separate the allocation details for legacy, normal and managed interrupts: * Legacy interrupts are not longer wasting 16 vectors unconditionally * Managed interrupts have now a guaranteed vector reservation, but the actual vector assignment happens when the interrupt is requested. It's guaranteed not to fail. * Normal interrupts no longer allocate vectors unconditionally when the interrupt is set up (IO/APIC init or MSI(X) enable). The mechanism has been switched to a best effort reservation mode. The actual allocation happens when the interrupt is requested. Contrary to managed interrupts the request can fail due to vector space exhaustion, but drivers must handle a fail of request_irq() anyway. When the interrupt is freed, the vector is handed back as well. This solves a long standing problem with large unconditional vector allocations for a certain class of enterprise devices which prevented server hibernation due to vector space exhaustion when the unused allocated vectors had to be migrated to CPU0 while unplugging all non boot CPUs. The code has been equipped with trace points and detailed debugfs information to aid analysis of the vector space" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/vector/msi: Select CONFIG_GENERIC_IRQ_RESERVATION_MODE PCI/MSI: Set MSI_FLAG_MUST_REACTIVATE in core code genirq: Add config option for reservation mode x86/vector: Use correct per cpu variable in free_moved_vector() x86/apic/vector: Ignore set_affinity call for inactive interrupts x86/apic: Fix spelling mistake: "symmectic" -> "symmetric" x86/apic: Use dead_cpu instead of current CPU when cleaning up ACPI/init: Invoke early ACPI initialization earlier x86/vector: Respect affinity mask in irq descriptor x86/irq: Simplify hotplug vector accounting x86/vector: Switch IOAPIC to global reservation mode x86/vector/msi: Switch to global reservation mode x86/vector: Handle managed interrupts proper x86/io_apic: Reevaluate vector configuration on activate() iommu/amd: Reevaluate vector configuration on activate() iommu/vt-d: Reevaluate vector configuration on activate() x86/apic/msi: Force reactivation of interrupts at startup time x86/vector: Untangle internal state from irq_cfg x86/vector: Compile SMP only code conditionally x86/apic: Remove unused callbacks ...