aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-04-16KVM: x86: fix warning Using plain integer as NULL pointerHariprasad Kelam1-1/+1
Changed passing argument as "0 to NULL" which resolves below sparse warning arch/x86/kvm/x86.c:3096:61: warning: Using plain integer as NULL pointer Signed-off-by: Hariprasad Kelam <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16selftests: kvm: add a selftest for SMMVitaly Kuznetsov4-6/+191
Add a simple test for SMM, based on VMX. The test implements its own sync between the guest and the host as using our ucall library seems to be too cumbersome: SMI handler is happening in real-address mode. This patch also fixes KVM_SET_NESTED_STATE to happen after KVM_SET_VCPU_EVENTS, in fact it places it last. This is because KVM needs to know whether the processor is in SMM or not. Signed-off-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16selftests: kvm: fix for compilers that do not support -no-piePaolo Bonzini1-1/+7
-no-pie was added to GCC at the same time as their configuration option --enable-default-pie. Compilers that were built before do not have -no-pie, but they also do not need it. Detect the option at build time. Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16selftests: kvm/evmcs_test: complete I/O before migrating guest statePaolo Bonzini4-16/+17
Starting state migration after an IO exit without first completing IO may result in test failures. We already have two tests that need this (this patch in fact fixes evmcs_test, similar to what was fixed for state_test in commit 0f73bbc851ed, "KVM: selftests: complete IO before migrating guest state", 2019-03-13) and a third is coming. So, move the code to vcpu_save_state, and while at it do not access register state until after I/O is complete. Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernelsSean Christopherson2-4/+16
Invoking the 64-bit variation on a 32-bit kenrel will crash the guest, trigger a WARN, and/or lead to a buffer overrun in the host, e.g. rsm_load_state_64() writes r8-r15 unconditionally, but enum kvm_reg and thus x86_emulate_ctxt._regs only define r8-r15 for CONFIG_X86_64. KVM allows userspace to report long mode support via CPUID, even though the guest is all but guaranteed to crash if it actually tries to enable long mode. But, a pure 32-bit guest that is ignorant of long mode will happily plod along. SMM complicates things as 64-bit CPUs use a different SMRAM save state area. KVM handles this correctly for 64-bit kernels, e.g. uses the legacy save state map if userspace has hid long mode from the guest, but doesn't fare well when userspace reports long mode support on a 32-bit host kernel (32-bit KVM doesn't support 64-bit guests). Since the alternative is to crash the guest, e.g. by not loading state or explicitly requesting shutdown, unconditionally use the legacy SMRAM save state map for 32-bit KVM. If a guest has managed to get far enough to handle SMIs when running under a weird/buggy userspace hypervisor, then don't deliberately crash the guest since there are no downsides (from KVM's perspective) to allow it to continue running. Fixes: 660a5d517aaab ("KVM: x86: save/load state on SMM switch") Cc: [email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: x86: Don't clear EFER during SMM transitions for 32-bit vCPUSean Christopherson1-10/+11
Neither AMD nor Intel CPUs have an EFER field in the legacy SMRAM save state area, i.e. don't save/restore EFER across SMM transitions. KVM somewhat models this, e.g. doesn't clear EFER on entry to SMM if the guest doesn't support long mode. But during RSM, KVM unconditionally clears EFER so that it can get back to pure 32-bit mode in order to start loading CRs with their actual non-SMM values. Clear EFER only when it will be written when loading the non-SMM state so as to preserve bits that can theoretically be set on 32-bit vCPUs, e.g. KVM always emulates EFER_SCE. And because CR4.PAE is cleared only to play nice with EFER, wrap that code in the long mode check as well. Note, this may result in a compiler warning about cr4 being consumed uninitialized. Re-read CR4 even though it's technically unnecessary, as doing so allows for more readable code and RSM emulation is not a performance critical path. Fixes: 660a5d517aaab ("KVM: x86: save/load state on SMM switch") Cc: [email protected] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: x86: clear SMM flags before loading state while leaving SMMSean Christopherson3-16/+10
RSM emulation is currently broken on VMX when the interrupted guest has CR4.VMXE=1. Stop dancing around the issue of HF_SMM_MASK being set when loading SMSTATE into architectural state, e.g. by toggling it for problematic flows, and simply clear HF_SMM_MASK prior to loading architectural state (from SMRAM save state area). Reported-by: Jon Doron <[email protected]> Cc: Jim Mattson <[email protected]> Cc: Liran Alon <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Fixes: 5bea5123cbf0 ("KVM: VMX: check nested state and CR4.VMXE against SMM") Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: x86: Open code kvm_set_hflagsSean Christopherson3-18/+19
Prepare for clearing HF_SMM_MASK prior to loading state from the SMRAM save state map, i.e. kvm_smm_changed() needs to be called after state has been loaded and so cannot be done automatically when setting hflags from RSM. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: x86: Load SMRAM in a single shot when leaving SMMSean Christopherson6-92/+92
RSM emulation is currently broken on VMX when the interrupted guest has CR4.VMXE=1. Rather than dance around the issue of HF_SMM_MASK being set when loading SMSTATE into architectural state, ideally RSM emulation itself would be reworked to clear HF_SMM_MASK prior to loading non-SMM architectural state. Ostensibly, the only motivation for having HF_SMM_MASK set throughout the loading of state from the SMRAM save state area is so that the memory accesses from GET_SMSTATE() are tagged with role.smm. Load all of the SMRAM save state area from guest memory at the beginning of RSM emulation, and load state from the buffer instead of reading guest memory one-by-one. This paves the way for clearing HF_SMM_MASK prior to loading state, and also aligns RSM with the enter_smm() behavior, which fills a buffer and writes SMRAM save state in a single go. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: nVMX: Expose RDPMC-exiting only when guest supports PMULiran Alon1-0/+25
Issue was discovered when running kvm-unit-tests on KVM running as L1 on top of Hyper-V. When vmx_instruction_intercept unit-test attempts to run RDPMC to test RDPMC-exiting, it is intercepted by L1 KVM which it's EXIT_REASON_RDPMC handler raise #GP because vCPU exposed by Hyper-V doesn't support PMU. Instead of unit-test expectation to be reflected with EXIT_REASON_RDPMC. The reason vmx_instruction_intercept unit-test attempts to run RDPMC even though Hyper-V doesn't support PMU is because L1 expose to L2 support for RDPMC-exiting. Which is reasonable to assume that is supported only in case CPU supports PMU to being with. Above issue can easily be simulated by modifying vmx_instruction_intercept config in x86/unittests.cfg to run QEMU with "-cpu host,+vmx,-pmu" and run unit-test. To handle issue, change KVM to expose RDPMC-exiting only when guest supports PMU. Reported-by: Saar Amar <[email protected]> Reviewed-by: Mihai Carabas <[email protected]> Reviewed-by: Jim Mattson <[email protected]> Signed-off-by: Liran Alon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: x86: Raise #GP when guest vCPU do not support PMULiran Alon1-0/+4
Before this change, reading a VMware pseduo PMC will succeed even when PMU is not supported by guest. This can easily be seen by running kvm-unit-test vmware_backdoors with "-cpu host,-pmu" option. Reviewed-by: Mihai Carabas <[email protected]> Signed-off-by: Liran Alon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16x86/kvm: move kvm_load/put_guest_xcr0 into atomic contextWANG Chao4-6/+12
guest xcr0 could leak into host when MCE happens in guest mode. Because do_machine_check() could schedule out at a few places. For example: kvm_load_guest_xcr0 ... kvm_x86_ops->run(vcpu) { vmx_vcpu_run vmx_complete_atomic_exit kvm_machine_check do_machine_check do_memory_failure memory_failure lock_page In this case, host_xcr0 is 0x2ff, guest vcpu xcr0 is 0xff. After schedule out, host cpu has guest xcr0 loaded (0xff). In __switch_to { switch_fpu_finish copy_kernel_to_fpregs XRSTORS If any bit i in XSTATE_BV[i] == 1 and xcr0[i] == 0, XRSTORS will generate #GP (In this case, bit 9). Then ex_handler_fprestore kicks in and tries to reinitialize fpu by restoring init fpu state. Same story as last #GP, except we get DOUBLE FAULT this time. Cc: [email protected] Signed-off-by: WANG Chao <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: x86: svm: make sure NMI is injected after nmi_singlestepVitaly Kuznetsov1-0/+3
I noticed that apic test from kvm-unit-tests always hangs on my EPYC 7401P, the hanging test nmi-after-sti is trying to deliver 30000 NMIs and tracing shows that we're sometimes able to deliver a few but never all. When we're trying to inject an NMI we may fail to do so immediately for various reasons, however, we still need to inject it so enable_nmi_window() arms nmi_singlestep mode. #DB occurs as expected, but we're not checking for pending NMIs before entering the guest and unless there's a different event to process, the NMI will never get delivered. Make KVM_REQ_EVENT request on the vCPU from db_interception() to make sure pending NMIs are checked and possibly injected. Signed-off-by: Vitaly Kuznetsov <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16svm/avic: Fix invalidate logical APIC id entrySuthikulpanit, Suravee1-1/+2
Only clear the valid bit when invalidate logical APIC id entry. The current logic clear the valid bit, but also set the rest of the bits (including reserved bits) to 1. Fixes: 98d90582be2e ('svm: Fix AVIC DFR and LDR handling') Signed-off-by: Suravee Suthikulpanit <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16Revert "svm: Fix AVIC incomplete IPI emulation"Suthikulpanit, Suravee1-4/+15
This reverts commit bb218fbcfaaa3b115d4cd7a43c0ca164f3a96e57. As Oren Twaig pointed out the old discussion: https://patchwork.kernel.org/patch/8292231/ that the change coud potentially cause an extra IPI to be sent to the destination vcpu because the AVIC hardware already set the IRR bit before the incomplete IPI #VMEXIT with id=1 (target vcpu is not running). Since writting to ICR and ICR2 will also set the IRR. If something triggers the destination vcpu to get scheduled before the emulation finishes, then this could result in an additional IPI. Also, the issue mentioned in the commit bb218fbcfaaa was misdiagnosed. Cc: Radim Krčmář <[email protected]> Cc: Paolo Bonzini <[email protected]> Reported-by: Oren Twaig <[email protected]> Signed-off-by: Suravee Suthikulpanit <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16kvm: mmu: Fix overflow on kvm mmu page limit calculationBen Gardon4-16/+15
KVM bases its memory usage limits on the total number of guest pages across all memslots. However, those limits, and the calculations to produce them, use 32 bit unsigned integers. This can result in overflow if a VM has more guest pages that can be represented by a u32. As a result of this overflow, KVM can use a low limit on the number of MMU pages it will allocate. This makes KVM unable to map all of guest memory at once, prompting spurious faults. Tested: Ran all kvm-unit-tests on an Intel Haswell machine. This patch introduced no new failures. Signed-off-by: Ben Gardon <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16KVM: nVMX: always use early vmcs check when EPT is disabledPaolo Bonzini2-2/+21
The remaining failures of vmx.flat when EPT is disabled are caused by incorrectly reflecting VMfails to the L1 hypervisor. What happens is that nested_vmx_restore_host_state corrupts the guest CR3, reloading it with the host's shadow CR3 instead, because it blindly loads GUEST_CR3 from the vmcs01. For simplicity let's just always use hardware VMCS checks when EPT is disabled. This way, nested_vmx_restore_host_state is not reached at all (or at least shouldn't be reached). Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16sc16is7xx: move label 'err_spi' to correct sectionGuoqing Jiang1-0/+2
err_spi is used when SERIAL_SC16IS7XX_SPI is enabled, so make the label only available under SERIAL_SC16IS7XX_SPI option. Otherwise, the below warning appears. drivers/tty/serial/sc16is7xx.c:1523:1: warning: label ‘err_spi’ defined but not used [-Wunused-label] err_spi: ^~~~~~~ Signed-off-by: Guoqing Jiang <[email protected]> Fixes: ac0cdb3d9901 ("sc16is7xx: missing unregister/delete driver on error in sc16is7xx_init()") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-16serial: sh-sci: Fix HSCIF RX sampling point adjustmentGeert Uytterhoeven1-1/+1
The calculation of the sampling point has min() and max() exchanged. Fix this by using the clamp() helper instead. Fixes: 63ba1e00f178a448 ("serial: sh-sci: Support for HSCIF RX sampling point adjustment") Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Ulrich Hecht <[email protected]> Reviewed-by: Wolfram Sang <[email protected]> Acked-by: Dirk Behme <[email protected]> Cc: stable <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-16serial: sh-sci: Fix HSCIF RX sampling point calculationGeert Uytterhoeven1-1/+3
There are several issues with the formula used for calculating the deviation from the intended rate: 1. While min_err and last_stop are signed, srr and baud are unsigned. Hence the signed values are promoted to unsigned, which will lead to a bogus value of deviation if min_err is negative, 2. Srr is the register field value, which is one less than the actual sampling rate factor, 3. The divisions do not use rounding. Fix this by casting unsigned variables to int, adding one to srr, and using a single DIV_ROUND_CLOSEST(). Fixes: 63ba1e00f178a448 ("serial: sh-sci: Support for HSCIF RX sampling point adjustment") Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Mukesh Ojha <[email protected]> Cc: stable <[email protected]> Reviewed-by: Ulrich Hecht <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-16clocksource/drivers/timer-ti-dm: Remove omap_dm_timer_set_load_startNathan Chancellor1-28/+0
Commit 008258d995a6 ("clocksource/drivers/timer-ti-dm: Make omap_dm_timer_set_load_start() static") made omap_dm_time_set_load_start static because its prototype was not defined in a header. Unfortunately, this causes a build warning on multi_v7_defconfig because this function is not used anywhere in this translation unit: drivers/clocksource/timer-ti-dm.c:589:12: error: unused function 'omap_dm_timer_set_load_start' [-Werror,-Wunused-function] In fact, omap_dm_timer_set_load_start hasn't been used anywhere since commit f190be7f39a5 ("staging: tidspbridge: remove driver") and the prototype was removed in commit 592ea6bd1fad ("clocksource: timer-ti-dm: Make unexported functions static"), which is probably where this should have happened. Fixes: 592ea6bd1fad ("clocksource: timer-ti-dm: Make unexported functions static") Fixes: 008258d995a6 ("clocksource/drivers/timer-ti-dm: Make omap_dm_timer_set_load_start() static") Signed-off-by: Nathan Chancellor <[email protected]> Acked-by: Tony Lindgren <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]>
2019-04-16staging: erofs: fix unexpected out-of-bound data accessGao Xiang1-1/+1
Unexpected out-of-bound data will be read in erofs_read_raw_page after commit 07173c3ec276 ("block: enable multipage bvecs") since one iovec could have multiple pages. Let's fix as what Ming's pointed out in the previous email [1]. [1] https://lore.kernel.org/lkml/[email protected]/ Suggested-by: Ming Lei <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Fixes: 07173c3ec276 ("block: enable multipage bvecs") Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-16x86/Kconfig: Fix spelling mistake "effectivness" -> "effectiveness"Colin Ian King1-1/+1
The Kconfig text contains a spelling mistake, fix it. Signed-off-by: Colin Ian King <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16staging: comedi: vmk80xx: Fix possible double-free of ->usb_rx_bufIan Abbott1-3/+1
`vmk80xx_alloc_usb_buffers()` is called from `vmk80xx_auto_attach()` to allocate RX and TX buffers for USB transfers. It allocates `devpriv->usb_rx_buf` followed by `devpriv->usb_tx_buf`. If the allocation of `devpriv->usb_tx_buf` fails, it frees `devpriv->usb_rx_buf`, leaving the pointer set dangling, and returns an error. Later, `vmk80xx_detach()` will be called from the core comedi module code to clean up. `vmk80xx_detach()` also frees both `devpriv->usb_rx_buf` and `devpriv->usb_tx_buf`, but `devpriv->usb_rx_buf` may have already been freed, leading to a double-free error. Fix it by removing the call to `kfree(devpriv->usb_rx_buf)` from `vmk80xx_alloc_usb_buffers()`, relying on `vmk80xx_detach()` to free the memory. Signed-off-by: Ian Abbott <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-16staging: comedi: vmk80xx: Fix use of uninitialized semaphoreIan Abbott1-2/+2
If `vmk80xx_auto_attach()` returns an error, the core comedi module code will call `vmk80xx_detach()` to clean up. If `vmk80xx_auto_attach()` successfully allocated the comedi device private data, `vmk80xx_detach()` assumes that a `struct semaphore limit_sem` contained in the private data has been initialized and uses it. Unfortunately, there are a couple of places where `vmk80xx_auto_attach()` can return an error after allocating the device private data but before initializing the semaphore, so this assumption is invalid. Fix it by initializing the semaphore just after allocating the private data in `vmk80xx_auto_attach()` before any other errors can be returned. I believe this was the cause of the following syzbot crash report <https://syzkaller.appspot.com/bug?extid=54c2f58f15fe6876b6ad>: usb 1-1: config 0 has no interface number 0 usb 1-1: New USB device found, idVendor=10cf, idProduct=8068, bcdDevice=e6.8d usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0 usb 1-1: config 0 descriptor?? vmk80xx 1-1:0.117: driver 'vmk80xx' failed to auto-configure device. INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.1.0-rc4-319354-g9a33b36 #3 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: usb_hub_wq hub_event Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xe8/0x16e lib/dump_stack.c:113 assign_lock_key kernel/locking/lockdep.c:786 [inline] register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095 __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582 lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x44/0x60 kernel/locking/spinlock.c:152 down+0x12/0x80 kernel/locking/semaphore.c:58 vmk80xx_detach+0x59/0x100 drivers/staging/comedi/drivers/vmk80xx.c:829 comedi_device_detach+0xed/0x800 drivers/staging/comedi/drivers.c:204 comedi_device_cleanup.part.0+0x68/0x140 drivers/staging/comedi/comedi_fops.c:156 comedi_device_cleanup drivers/staging/comedi/comedi_fops.c:187 [inline] comedi_free_board_dev.part.0+0x16/0x90 drivers/staging/comedi/comedi_fops.c:190 comedi_free_board_dev drivers/staging/comedi/comedi_fops.c:189 [inline] comedi_release_hardware_device+0x111/0x140 drivers/staging/comedi/comedi_fops.c:2880 comedi_auto_config.cold+0x124/0x1b0 drivers/staging/comedi/drivers.c:1068 usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361 really_probe+0x2da/0xb10 drivers/base/dd.c:509 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 __device_attach+0x223/0x3a0 drivers/base/dd.c:844 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 device_add+0xad2/0x16e0 drivers/base/core.c:2106 usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021 generic_probe+0xa2/0xda drivers/usb/core/generic.c:210 usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266 really_probe+0x2da/0xb10 drivers/base/dd.c:509 driver_probe_device+0x21d/0x350 drivers/base/dd.c:671 __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778 bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454 __device_attach+0x223/0x3a0 drivers/base/dd.c:844 bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514 device_add+0xad2/0x16e0 drivers/base/core.c:2106 usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534 hub_port_connect drivers/usb/core/hub.c:5089 [inline] hub_port_connect_change drivers/usb/core/hub.c:5204 [inline] port_event drivers/usb/core/hub.c:5350 [inline] hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432 process_one_work+0x90f/0x1580 kernel/workqueue.c:2269 worker_thread+0x9b/0xe20 kernel/workqueue.c:2415 kthread+0x313/0x420 kernel/kthread.c:253 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352 Reported-by: [email protected] Signed-off-by: Ian Abbott <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-16Merge tag 'extcon-fixes-for-5.1-rc4' of ↵Greg Kroah-Hartman1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-linus Chanwoo writes: Update extcon for v5.1-rc4 Detailed description for this pull request: 1. Fix the build issue of extcon-ptn5150.c driver by editing the module dependency in Kconfig. * tag 'extcon-fixes-for-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon: extcon: ptn5150: fix COMPILE_TEST dependencies
2019-04-16perf/x86: Fix incorrect PEBS_REGSKan Liang2-20/+20
PEBS_REGS used as mask for the supported registers for large PEBS. However, the mask cannot filter the sample_regs_user/sample_regs_intr correctly. (1ULL << PERF_REG_X86_*) should be used to replace PERF_REG_X86_*, which is only the index. Rename PEBS_REGS to PEBS_GP_REGS, because the mask is only for general purpose registers. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Cc: [email protected] Fixes: 2fe1bc1f501d ("perf/x86: Enable free running PEBS for REGS_USER/INTR") Link: https://lkml.kernel.org/r/[email protected] [ Renamed it to PEBS_GP_REGS - as 'GPRS' is used elsewhere ;-) ] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16perf/ring_buffer: Fix AUX record suppressionAlexander Shishkin1-18/+15
The following commit: 1627314fb54a33e ("perf: Suppress AUX/OVERWRITE records") has an unintended side-effect of also suppressing all AUX records with no flags and non-zero size, so all the regular records in the full trace mode. This breaks some use cases for people. Fix this by restoring "regular" AUX records. Reported-by: Ben Gainey <[email protected]> Tested-by: Ben Gainey <[email protected]> Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Fixes: 1627314fb54a33e ("perf: Suppress AUX/OVERWRITE records") Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16perf/core: Fix the address filtering fixAlexander Shishkin1-16/+21
The following recent commit: c60f83b813e5 ("perf, pt, coresight: Fix address filters for vmas with non-zero offset") changes the address filtering logic to communicate filter ranges to the PMU driver via a single address range object, instead of having the driver do the final bit of math. That change forgets to take into account kernel filters, which are not calculated the same way as DSO based filters. Fix that by passing the kernel filters the same way as file-based filters. This doesn't require any additional changes in the drivers. Reported-by: Adrian Hunter <[email protected]> Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Fixes: c60f83b813e5 ("perf, pt, coresight: Fix address filters for vmas with non-zero offset") Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16KVM: nVMX: allow tests to use bad virtual-APIC page addressPaolo Bonzini3-10/+19
As mentioned in the comment, there are some special cases where we can simply clear the TPR shadow bit from the CPU-based execution controls in the vmcs02. Handle them so that we can remove some XFAILs from vmx.flat. Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-16x86/mm/tlb: Revert "x86/mm: Align TLB invalidation info"Peter Zijlstra1-1/+1
Revert the following commit: 515ab7c41306: ("x86/mm: Align TLB invalidation info") I found out (the hard way) that under some .config options (notably L1_CACHE_SHIFT=7) and compiler combinations this on-stack alignment leads to a 320 byte stack usage, which then triggers a KASAN stack warning elsewhere. Using 320 bytes of stack space for a 40 byte structure is ludicrous and clearly not right. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Linus Torvalds <[email protected]> Acked-by: Nadav Amit <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 515ab7c41306 ("x86/mm: Align TLB invalidation info") Link: http://lkml.kernel.org/r/[email protected] [ Minor changelog edits. ] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16x86/reboot, efi: Use EFI reboot for Acer TravelMate X514-51TJian-Hong Pan2-1/+27
Upon reboot, the Acer TravelMate X514-51T laptop appears to complete the shutdown process, but then it hangs in BIOS POST with a black screen. The problem is intermittent - at some points it has appeared related to Secure Boot settings or different kernel builds, but ultimately we have not been able to identify the exact conditions that trigger the issue to come and go. Besides, the EFI mode cannot be disabled in the BIOS of this model. However, after extensive testing, we observe that using the EFI reboot method reliably avoids the issue in all cases. So add a boot time quirk to use EFI reboot on such systems. Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=203119 Signed-off-by: Jian-Hong Pan <[email protected]> Signed-off-by: Daniel Drake <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Fix !CONFIG_EFI build failure, clarify the code and the changelog a bit. ] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16x86/mm: Prevent bogus warnings with "noexec=off"Thomas Gleixner2-2/+3
Xose Vazquez Perez reported boot warnings when NX is disabled on the kernel command line. __early_set_fixmap() triggers this warning: attempted to set unsupported pgprot: 8000000000000163 bits: 8000000000000000 supported: 7fffffffffffffff WARNING: CPU: 0 PID: 0 at arch/x86/include/asm/pgtable.h:537 __early_set_fixmap+0xa2/0xff because it uses __default_kernel_pte_mask to mask out unsupported bits. Use __supported_pte_mask instead. Disabling NX on the command line also triggers the NX warning in the page table mapping check: WARNING: CPU: 1 PID: 1 at arch/x86/mm/dump_pagetables.c:262 note_page+0x2ae/0x650 .... Make the warning depend on NX set in __supported_pte_mask. Reported-by: Xose Vazquez Perez <[email protected]> Tested-by: Xose Vazquez Perez <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16kprobes: Fix error check when reusing optimized probesMasami Hiramatsu1-4/+2
The following commit introduced a bug in one of our error paths: 819319fc9346 ("kprobes: Return error if we fail to reuse kprobe instead of BUG_ON()") it missed to handle the return value of kprobe_optready() as error-value. In reality, the kprobe_optready() returns a bool result, so "true" case must be passed instead of 0. This causes some errors on kprobe boot-time selftests on ARM: [ ] Beginning kprobe tests... [ ] Probe ARM code [ ] kprobe [ ] kretprobe [ ] ARM instruction simulation [ ] Check decoding tables [ ] Run test cases [ ] FAIL: test_case_handler not run [ ] FAIL: Test andge r10, r11, r14, asr r7 [ ] FAIL: Scenario 11 ... [ ] FAIL: Scenario 7 [ ] Total instruction simulation tests=1631, pass=1433 fail=198 [ ] kprobe tests failed This can happen if an optimized probe is unregistered and next kprobe is registered on same address until the previous probe is not reclaimed. If this happens, a hidden aggregated probe may be kept in memory, and no new kprobe can probe same address. Also, in that case register_kprobe() will return "1" instead of minus error value, which can mislead caller logic. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: David S . Miller <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Naveen N . Rao <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] # v5.0+ Fixes: 819319fc9346 ("kprobes: Return error if we fail to reuse kprobe instead of BUG_ON()") Link: http://lkml.kernel.org/r/155530808559.32517.539898325433642204.stgit@devnote2 Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16locking/lockdep: Make lockdep_unregister_key() honor 'debug_locks' againBart Van Assche1-4/+5
If lockdep_register_key() and lockdep_unregister_key() are called with debug_locks == false then the following warning is reported: WARNING: CPU: 2 PID: 15145 at kernel/locking/lockdep.c:4920 lockdep_unregister_key+0x1ad/0x240 That warning is reported because lockdep_unregister_key() ignores the value of 'debug_locks' and because the behavior of lockdep_register_key() depends on whether or not 'debug_locks' is set. Fix this inconsistency by making lockdep_unregister_key() take 'debug_locks' again into account. Signed-off-by: Bart Van Assche <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Waiman Long <[email protected]> Cc: Will Deacon <[email protected]> Cc: shenghui <[email protected]> Fixes: 90c1cba2b3b3 ("locking/lockdep: Zap lock classes even with lock debugging disabled") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-16x86/build/lto: Fix truncated .bss with -fdata-sectionsSami Tolvanen1-1/+1
With CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y, we compile the kernel with -fdata-sections, which also splits the .bss section. The new section, with a new .bss.* name, which pattern gets missed by the main x86 linker script which only expects the '.bss' name. This results in the discarding of the second part and a too small, truncated .bss section and an unhappy, non-working kernel. Use the common BSS_MAIN macro in the linker script to properly capture and merge all the generated BSS sections. Signed-off-by: Sami Tolvanen <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Extended the changelog. ] Signed-off-by: Ingo Molnar <[email protected]>
2019-04-15bnx2x: fix spelling mistake "dicline" -> "decline"Colin Ian King1-1/+1
There is a spelling mistake in a BNX2X_ERR message, fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-15Merge tag 'libnvdimm-fixes-5.1-rc6' of ↵Linus Torvalds7-71/+117
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "I debated holding this back for the v5.2 merge window due to the size of the "zero-key" changes, but affected users would benefit from having the fixes sooner. It did not make sense to change the zero-key semantic in isolation for the "secure-erase" command, but instead include it for all security commands. The short background on the need for these changes is that some NVDIMM platforms enable security with a default zero-key rather than let the OS specify the initial key. This makes the security enabling that landed in v5.0 unusable for some users. Summary: - Compatibility fix for nvdimm-security implementations with a default zero-key. - Miscellaneous small fixes for out-of-bound accesses, cleanup after initialization failures, and missing debug messages" * tag 'libnvdimm-fixes-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: tools/testing/nvdimm: Retain security state after overwrite libnvdimm/pmem: fix a possible OOB access when read and write pmem libnvdimm/security, acpi/nfit: unify zero-key for all security commands libnvdimm/security: provide fix for secure-erase to use zero-key libnvdimm/btt: Fix a kmemdup failure check libnvdimm/namespace: Fix a potential NULL pointer dereference acpi/nfit: Always dump _DSM output payload
2019-04-15Merge tag 'fsdax-fix-5.1-rc6' of ↵Linus Torvalds1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull fsdax fix from Dan Williams: "A single filesystem-dax fix. It has been lingering in -next for a long while and there are no other fsdax fixes on the horizon: - Avoid a crash scenario with architectures like powerpc that require 'pgtable_deposit' for the zero page" * tag 'fsdax-fix-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: fs/dax: Deposit pagetable even when installing zero page
2019-04-15route: Avoid crash from dereferencing NULL rt->fromJonathan Lemon1-0/+4
When __ip6_rt_update_pmtu() is called, rt->from is RCU dereferenced, but is never checked for null - rt6_flush_exceptions() may have removed the entry. [ 1913.989004] RIP: 0010:ip6_rt_cache_alloc+0x13/0x170 [ 1914.209410] Call Trace: [ 1914.214798] <IRQ> [ 1914.219226] __ip6_rt_update_pmtu+0xb0/0x190 [ 1914.228649] ip6_tnl_xmit+0x2c2/0x970 [ip6_tunnel] [ 1914.239223] ? ip6_tnl_parse_tlv_enc_lim+0x32/0x1a0 [ip6_tunnel] [ 1914.252489] ? __gre6_xmit+0x148/0x530 [ip6_gre] [ 1914.262678] ip6gre_tunnel_xmit+0x17e/0x3c7 [ip6_gre] [ 1914.273831] dev_hard_start_xmit+0x8d/0x1f0 [ 1914.283061] sch_direct_xmit+0xfa/0x230 [ 1914.291521] __qdisc_run+0x154/0x4b0 [ 1914.299407] net_tx_action+0x10e/0x1f0 [ 1914.307678] __do_softirq+0xca/0x297 [ 1914.315567] irq_exit+0x96/0xa0 [ 1914.322494] smp_apic_timer_interrupt+0x68/0x130 [ 1914.332683] apic_timer_interrupt+0xf/0x20 [ 1914.341721] </IRQ> Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") Signed-off-by: Jonathan Lemon <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Martin KaFai Lau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-15MAINTAINERS: normalize Woojung Huh's email addressLukas Bulwahn1-1/+1
MAINTAINERS contains a lower-case and upper-case variant of Woojung Huh' s email address. Only keep the lower-case variant in MAINTAINERS. Signed-off-by: Lukas Bulwahn <[email protected]> Acked-by: Woojung Huh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-15bonding: fix event handling for stacked bondsSabrina Dubroca1-1/+5
When a bond is enslaved to another bond, bond_netdev_event() only handles the event as if the bond is a master, and skips treating the bond as a slave. This leads to a refcount leak on the slave, since we don't remove the adjacency to its master and the master holds a reference on the slave. Reproducer: ip link add bondL type bond ip link add bondU type bond ip link set bondL master bondU ip link del bondL No "Fixes:" tag, this code is older than git history. Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-15Revert "net-sysfs: Fix memory leak in netdev_register_kobject"Wang Hai1-9/+5
This reverts commit 6b70fc94afd165342876e53fc4b2f7d085009945. The reverted bugfix will cause another issue. Reported by [email protected]. See https://syzkaller.appspot.com/x/log.txt?x=1737671b200000 for details. Signed-off-by: Wang Hai <[email protected]> Acked-by: Andy Shevchenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-15Merge tag 'wireless-drivers-for-davem-2019-04-15' of ↵David S. Miller23-215/+123
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 5.1 Second set of fixes for 5.1. iwlwifi * add some new PCI IDs (plus a struct name change they depend on) * fix crypto with new devices, namely 22560 and above * fix for a potential deadlock in the TX path * a fix for offloaded rate-control * support new PCI HW IDs which use a new FW mt76 * fix lock initialisation and a possible deadlock * aggregation fixes rt2x00 * fix sequence numbering during retransmits ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-15KVM: x86/mmu: Fix an inverted list_empty() check when zapping sptesSean Christopherson1-1/+1
A recently introduced helper for handling zap vs. remote flush incorrectly bails early, effectively leaking defunct shadow pages. Manifests as a slab BUG when exiting KVM due to the shadow pages being alive when their associated cache is destroyed. ========================================================================== BUG kvm_mmu_page_header: Objects remaining in kvm_mmu_page_header on ... -------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Slab 0x00000000fc436387 objects=26 used=23 fp=0x00000000d023caee ... CPU: 6 PID: 4315 Comm: rmmod Tainted: G B 5.1.0-rc2+ #19 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: dump_stack+0x46/0x5b slab_err+0xad/0xd0 ? on_each_cpu_mask+0x3c/0x50 ? ksm_migrate_page+0x60/0x60 ? on_each_cpu_cond_mask+0x7c/0xa0 ? __kmalloc+0x1ca/0x1e0 __kmem_cache_shutdown+0x13a/0x310 shutdown_cache+0xf/0x130 kmem_cache_destroy+0x1d5/0x200 kvm_mmu_module_exit+0xa/0x30 [kvm] kvm_arch_exit+0x45/0x60 [kvm] kvm_exit+0x6f/0x80 [kvm] vmx_exit+0x1a/0x50 [kvm_intel] __x64_sys_delete_module+0x153/0x1f0 ? exit_to_usermode_loop+0x88/0xc0 do_syscall_64+0x4f/0x100 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: a21136345cb6f ("KVM: x86/mmu: Split remote_flush+zap case out of kvm_mmu_flush_or_zap()") Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-04-15drivers: power: supply: goldfish_battery: Fix bogus SPDX identifierThomas Gleixner1-1/+1
spdxcheck.py complains: drivers/power/supply/goldfish_battery.c: 1:28 Invalid License ID: GPL which is correct because GPL is not a valid identifier. Of course this could have been caught by checkpatch.pl _before_ submitting or merging the patch. WARNING: 'SPDX-License-Identifier: GPL' is not supported in LICENSES/... #19: FILE: drivers/power/supply/goldfish_battery.c:1: +// SPDX-License-Identifier: GPL Which is absolutely hillarious as the commit introducing this wreckage says in the changelog: There was a checkpatch complain: "Missing or malformed SPDX-License-Identifier tag". Oh well. Replacing a checkpatch warning by a different checkpatch warning is a really useful exercise. Use the proper GPL-2.0 identifier which is what the boiler plate in the file had originally. Fixes: e75e3a125b40 ("drivers: power: supply: goldfish_battery: Put an SPDX tag") Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-15s390/pkey: add one more argument space for debug feature entryHarald Freudenberger1-1/+2
The debug feature entries have been used with up to 5 arguents (including the pointer to the format string) but there was only space reserved for 4 arguemnts. So now the registration does reserve space for 5 times a long value. This fixes a sometime appearing weired value as the last value of an debug feature entry like this: ... pkey_sec2protkey zcrypt_send_cprb (cardnr=10 domain=12) failed with errno -2143346254 Signed-off-by: Harald Freudenberger <[email protected]> Reported-by: Christian Rund <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2019-04-14drm/amd/display: If one stream full updates, full update all planesDavid Francis2-0/+22
[Why] On some compositors, with two monitors attached, VT terminal switch can cause a graphical issue by the following means: There are two streams, one for each monitor. Each stream has one plane current state: M1:S1->P1 M2:S2->P2 The user calls for a terminal switch and a commit is made to change both planes to linear swizzle mode. In atomic check, a new dc_state is constructed with new planes on each stream new state: M1:S1->P3 M2:S2->P4 In commit tail, each stream is committed, one at a time. The first stream (S1) updates properly, triggerring a full update and replacing the state current state: M1:S1->P3 M2:S2->P4 The update for S2 comes in, but dc detects that there is no difference between the stream and plane in the new and current states, and so triggers a fast update. The fast update does not program swizzle, so the second monitor is corrupted [How] Add a flag to dc_plane_state that forces full updates When a stream undergoes a full update, set this flag on all changed planes, then clear it on the current stream Subsequent streams will get full updates as a result Signed-off-by: David Francis <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Reviewed-by: Roman Li <[email protected]> Acked-by: Bhawanpreet Lakha <Bhawanpreet [email protected]> Acked-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-04-14Linux 5.1-rc5Linus Torvalds1-1/+1
2019-04-14Merge branch 'page-refs' (page ref overflow)Linus Torvalds8-28/+92
Merge page ref overflow branch. Jann Horn reported that he can overflow the page ref count with sufficient memory (and a filesystem that is intentionally extremely slow). Admittedly it's not exactly easy. To have more than four billion references to a page requires a minimum of 32GB of kernel memory just for the pointers to the pages, much less any metadata to keep track of those pointers. Jann needed a total of 140GB of memory and a specially crafted filesystem that leaves all reads pending (in order to not ever free the page references and just keep adding more). Still, we have a fairly straightforward way to limit the two obvious user-controllable sources of page references: direct-IO like page references gotten through get_user_pages(), and the splice pipe page duplication. So let's just do that. * branch page-refs: fs: prevent page refcount overflow in pipe_buf_get mm: prevent get_user_pages() from overflowing page refcount mm: add 'try_get_page()' helper function mm: make page ref count overflow check tighter and more explicit