blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2018-06-08	udp: fix rx queue len reported by diag and proc interface	Paolo Abeni	2	-2/+14
	After commit 6b229cf77d68 ("udp: add batching to udp_rmem_release()") the sk_rmem_alloc field does not measure exactly anymore the receive queue length, because we batch the rmem release. The issue is really apparent only after commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty"): the user space can easily check for an empty socket with not-0 queue length reported by the 'ss' tool or the procfs interface. We need to use a custom UDP helper to report the correct queue length, taking into account the forward allocation deficit. Reported-by: [email protected] Fixes: 6b229cf77d68 ("UDP: add batching to udp_rmem_release()") Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-08	Merge branch 'for-4.18/mcsafe' into libnvdimm-for-next	Dan Williams	4	-4/+25

2018-06-08	Merge branch 'for-4.18/dax' into libnvdimm-for-next	Dan Williams	3	-44/+70

2018-06-08	Merge tag 'for-linus-20180608' of git://git.kernel.dk/linux-block	Linus Torvalds	1	-0/+1
	Pull block fixes from Jens Axboe: "A few fixes for this merge window, where some of them should go in sooner rather than later, hence a new pull this week. This pull request contains: - Set of NVMe fixes, mostly follow up cleanups/fixes to the queue changes, but also teardown/removal and misc changes (Christop/Dan/ Johannes/Sagi/Steve). - Two lightnvm fixes for issues that showed up in this window (Colin/Wei). - Failfast/driver flags inheritance for flush requests (Hannes). - The md device put sanitization and fix (Kent). - dm bio_set inheritance fix (me). - nbd discard granularity fix (Josef). - nbd consistency in command printing (Kevin). - Loop recursion validation fix (Ted). - Partition overlap check (Wang)" [ .. and now my build is warning-free again thanks to the md fix - Linus ] * tag 'for-linus-20180608' of git://git.kernel.dk/linux-block: (22 commits) nvme: cleanup double shift issue nvme-pci: make CMB SQ mod-param read-only nvme-pci: unquiesce dead controller queues nvme-pci: remove HMB teardown on reset nvme-pci: queue creation fixes nvme-pci: remove unnecessary completion doorbell check nvme-pci: remove unnecessary nested locking nvmet: filter newlines from user input nvme-rdma: correctly check for target keyed sgl support nvme: don't hold nvmf_transports_rwsem for more than transport lookups nvmet: return all zeroed buffer when we can't find an active namespace md: Unify mddev destruction paths dm: use bioset_init_from_src() to copy bio_set block: add bioset_init_from_src() helper block: always set partition number to '0' in blk_partition_remap() block: pass failfast and driver-specific flags to flush requests nbd: set discard_alignment to the granularity nbd: Consistently use request pointer in debug messages. block: add verifier for cmdline partition lightnvm: pblk: fix resource leak of invalid_bitmap ...
2018-06-08	Merge tag 'regulator-v4.18' of ↵	Linus Torvalds	11	-182/+45
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator updates from Mark Brown: "Quite a lot of core work this time around, though not 100% successful. We gained support for runtime mode changes thanks to David Collins and improved support for write only regulators (ones where we can't read back the configuration) from Douglas Anderson. There's been quite a bit of work from Linus Walleij on converting from specfying GPIOs by numbers to descriptors. Sadly the testing turned out to be less good than we had hoped and so a lot of this had to be reverted. We also have the start of updates to use coupled regulators from Maciej Purski, unfortunately there are further problems there so the last couple of patches have been reverted. We also have new drivers for BD71837 and SY8106A devices, SAW regulators on Qualcomm SPMI and dropped support for some preproduction chips that never made it to market from the AB8500 driver" * tag 'regulator-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (57 commits) regulator: gpio: Revert ARM: pxa, regulator: fix building ezx e680 regulator: Revert coupled regulator support again regulator: wm8994: Fix shared GPIOs regulator: max77686: Fix shared GPIOs regulator: bd71837: BD71837 PMIC regulator driver regulator: bd71837: Devicetree bindings for BD71837 regulators regulator: gpio: Get enable GPIO using GPIO descriptor regulator: fixed: Convert to use GPIO descriptor only regulator: s2mps11: Fix boot on Odroid XU3 dt-bindings: qcom_spmi: Document SAW support regulator: qcom_spmi: Add support for SAW regulator: tps65090: Pass descriptor instead of GPIO number regulator: s5m8767: Pass descriptor instead of GPIO number regulator: pfuze100: Delete reference to ena_gpio regulator: max8952: Pass descriptor instead of GPIO number regulator: lp8788-ldo: Pass descriptor instead of GPIO number regulator: lm363x: Pass descriptor instead of GPIO number regulator: max8973: Pass descriptor instead of GPIO number regulator: mc13xxx-core: Switch to SPDX identifier ...
2018-06-08	Merge tag 'arm64-upstream' of ↵	Linus Torvalds	4	-6/+50
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "Apart from the core arm64 and perf changes, the Spectre v4 mitigation touches the arm KVM code and the ACPI PPTT support touches drivers/ (acpi and cacheinfo). I should have the maintainers' acks in place. Summary: - Spectre v4 mitigation (Speculative Store Bypass Disable) support for arm64 using SMC firmware call to set a hardware chicken bit - ACPI PPTT (Processor Properties Topology Table) parsing support and enable the feature for arm64 - Report signal frame size to user via auxv (AT_MINSIGSTKSZ). The primary motivation is Scalable Vector Extensions which requires more space on the signal frame than the currently defined MINSIGSTKSZ - ARM perf patches: allow building arm-cci as module, demote dev_warn() to dev_dbg() in arm-ccn event_init(), miscellaneous cleanups - cmpwait() WFE optimisation to avoid some spurious wakeups - L1_CACHE_BYTES reverted back to 64 (for performance reasons that have to do with some network allocations) while keeping ARCH_DMA_MINALIGN to 128. cache_line_size() returns the actual hardware Cache Writeback Granule - Turn LSE atomics on by default in Kconfig - Kernel fault reporting tidying - Some #include and miscellaneous cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (53 commits) arm64: Fix syscall restarting around signal suppressed by tracer arm64: topology: Avoid checking numa mask for scheduler MC selection ACPI / PPTT: fix build when CONFIG_ACPI_PPTT is not enabled arm64: cpu_errata: include required headers arm64: KVM: Move VCPU_WORKAROUND_2_FLAG macros to the top of the file arm64: signal: Report signal frame size to userspace via auxv arm64/sve: Thin out initialisation sanity-checks for sve_max_vl arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests arm64: KVM: Add ARCH_WORKAROUND_2 support for guests arm64: KVM: Add HYP per-cpu accessors arm64: ssbd: Add prctl interface for per-thread mitigation arm64: ssbd: Introduce thread flag to control userspace mitigation arm64: ssbd: Restore mitigation status on CPU resume arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation arm64: ssbd: Add global mitigation state accessor arm64: Add 'ssbd' command-line option arm64: Add ARCH_WORKAROUND_2 probing arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2 arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1 ...
2018-06-08	Merge tag 'dmaengine-4.18-rc1' of git://git.infradead.org/users/vkoul/slave-dma	Linus Torvalds	1	-0/+61
	Pull dmaengine updates from Vinod Koul: - updates to sprd, bam_dma, stm drivers - remove VLAs in dmatest - move TI drivers to their own subdir - switch to SPDX tags for ima/mxs dma drivers - simplify getting .drvdata on bunch of drivers by Wolfram Sang * tag 'dmaengine-4.18-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (32 commits) dmaengine: sprd: Add Spreadtrum DMA configuration dmaengine: sprd: Optimize the sprd_dma_prep_dma_memcpy() dmaengine: imx-dma: Switch to SPDX identifier dmaengine: mxs-dma: Switch to SPDX identifier dmaengine: imx-sdma: Switch to SPDX identifier dmaengine: usb-dmac: Document R8A7799{0,5} bindings dmaengine: qcom: bam_dma: fix some doc warnings. dmaengine: qcom: bam_dma: fix invalid assignment warning dmaengine: sprd: fix an NULL vs IS_ERR() bug dmaengine: sprd: Use devm_ioremap_resource() to map memory dmaengine: sprd: Fix potential NULL dereference in sprd_dma_probe() dmaengine: pl330: flush before wait, and add dev burst support. dmaengine: axi-dmac: Request IRQ with IRQF_SHARED dmaengine: stm32-mdma: fix spelling mistake: "avalaible" -> "available" dmaengine: rcar-dmac: Document R-Car D3 bindings dmaengine: sprd: Move DMA request mode and interrupt type into head file dmaengine: sprd: Define the DMA data width type dmaengine: sprd: Define the DMA transfer step type dmaengine: ti: New directory for Texas Instruments DMA drivers dmaengine: shdmac: Change platform check to CONFIG_ARCH_RENESAS ...
2018-06-08	Merge tag 'iommu-updates-v4.18' of ↵	Linus Torvalds	1	-1/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "Nothing big this time. In particular: - Debugging code for Tegra-GART - Improvement in Intel VT-d fault printing to prevent soft-lockups when on fault storms - Improvements in AMD IOMMU event reporting - NUMA aware allocation in io-pgtable code for ARM - Various other small fixes and cleanups all over the place" * tag 'iommu-updates-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/io-pgtable-arm: Make allocations NUMA-aware iommu/amd: Prevent possible null pointer dereference and infinite loop iommu/amd: Fix grammar of comments iommu: Clean up the comments for iommu_group_alloc iommu/vt-d: Remove unnecessary parentheses iommu/vt-d: Clean up pasid quirk for pre-production devices iommu/vt-d: Clean up unused variable in find_or_alloc_domain iommu/vt-d: Fix iotlb psi missing for mappings iommu/vt-d: Introduce __mapping_notify_one() iommu: Remove extra NULL check when call strtobool() iommu/amd: Update logging information for new event type iommu/amd: Update the PASID information printed to the system log iommu/tegra: gart: Fix gart_iommu_unmap() iommu/tegra: gart: Add debugging facility iommu/io-pgtable-arm: Use for_each_set_bit to simplify code iommu/qcom: Simplify getting .drvdata iommu: Remove depends on HAS_DMA in case of platform dependency iommu/vt-d: Ratelimit each dmar fault printing
2018-06-08	Merge tag 'mtd/for-4.18' of git://git.infradead.org/linux-mtd	Linus Torvalds	3	-9/+31
	Pull MTD updates from Boris Brezillon: "Core changes: - Add a sysfs attribute to expose available OOB size Driver changes: - Remove HAS_DMA dependency on various drivers - Use dev_get_drvdata() instead of platform_get_drvdata() in docg3 - Replace msleep by usleep_range() in the dataflash driver - Avoid VLA usage in nftl layers - Remove useless .owner assignment in pismo - Fix various issues in the CFI driver - Improve TRX partition handling expose a DT compat for this part parser - Clarify OFFSET_CONTINUOUS meaning NAND core changes: - Add Miquel as a NAND maintainer - Add access mode to the nand_page_io_req struct - Fix kernel-doc in rawnand.h - Support bit-wise majority to recover from corrupted ONFI parameter pages - Stop checking FAIL bit after a SET_FEATURES, as documented in the ONFI spec Raw NAND Driver changes: - Fix and cleanup the error path of many NAND controller drivers - GPMI: + Cleanup/simplification of a few aspects in the driver + Take ECC setup specified in the DT into account - sunxi: remove support for GPIO-based R/B polling - MTK: + Use of_device_get_match_data() instead of of_match_device() + Add an entry in MAINTAINERS for this driver + Fix nand-ecc-step-size and nand-ecc-strength description in the DT bindings doc - fsl_ifc: fix ->cmdfunc() to read more than one ONFI parameter page OneNAND driver changes: - samsung: use dev_get_drvdata() instead of platform_get_drvdata() SPI NOR core changes: - Add support for a bunch of SPI NOR chips - Clear EAR reg when switching to 3-byte addressing mode on Winbond chips SPI NOR controller driver changes: - cadence: Add DMA support for direct mode reads - hisi: Prefix a few functions with hisi_ - intel: + Mark the driver as "dangerous" in Kconfig + Fix atomic sequence handling + Pass a 40us delay (instead of 0us) to readl_poll_timeout() - fsl: + fix a typo in a function name + add support for IP variants embedded in the ls2080a and ls1080a SoCs - stm32: request exclusive control of the reset line" * tag 'mtd/for-4.18' of git://git.infradead.org/linux-mtd: (66 commits) mtd: nand: Pass mode information to nand_page_io_req mtd: cfi_cmdset_0002: Change erase one block to enable XIP once mtd: cfi_cmdset_0002: Change erase functions to check chip good only mtd: cfi_cmdset_0002: Change erase functions to retry for error mtd: cfi_cmdset_0002: Change definition naming to retry write operation mtd: cfi_cmdset_0002: Change write buffer to check correct value mtd: cmdlinepart: Update comment for introduction of OFFSET_CONTINUOUS mtd: bcm47xxpart: add of_match_table with a new DT binding dt-bindings: mtd: document Broadcom's BCM47xx partitions mtd: spi-nor: Add support for EN25QH32 mtd: spi-nor: Add support for is25wp series chips mtd: spi-nor: Add Winbond w25q32jv support mtd: spi-nor: fsl-quadspi: add support for ls2080a/ls1080a mtd: spi-nor: stm32-quadspi: explicitly request exclusive reset control mtd: spi-nor: intel: provide a range for poll_timout mtd: spi-nor: fsl-quadspi: fix api naming typo _init_ahb_read mtd: spi-nor: intel-spi: Explicitly mark the driver as dangerous in Kconfig mtd: spi-nor: intel-spi: Fix atomic sequence handling mtd: rawnand: Do not check FAIL bit when executing a SET_FEATURES op mtd: rawnand: use bit-wise majority to recover the ONFI param page ...
2018-06-08	Merge tag 'gpio-v4.18-1' of ↵	Linus Torvalds	3	-5/+39
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for the v4.18 development cycle. Core changes: - We have killed off VLA from the core library and all drivers. The background should be clear for everyone at this point: https://lwn.net/Articles/749064/ Also I just don't like VLA's, kernel developers hate it when compilers do things behind their back. It's as simple as that. I'm sorry that they even slipped in to begin with. Kudos to Laura Abbott for exorcising them. - Support GPIO hogs in machines/board files. New drivers and chip support: - R-Car r8a77470 (RZ/G1C) - R-Car r8a77965 (M3-N) - R-Car r8a77990 (E3) - PCA953x driver improvements to accomodate more variants. Improvements and new features: - Support one interrupt per line on port A in the DesignWare dwapb driver. Misc: - Random cleanups, right header files in the drivers, some size optimizations etc" * tag 'gpio-v4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (73 commits) gpio: davinci: fix build warning when !CONFIG_OF gpio: dwapb: Fix rework support for 1 interrupt per port A GPIO gpio: pxa: Include the right header gpio: pl061: Include the right header gpio: pch: Include the right header gpio: pcf857x: Include the right header gpio: pca953x: Include the right header gpio: palmas: Include the right header gpio: omap: Include the right header gpio: octeon: Include the right header gpio: mxs: Switch to SPDX identifier gpio: Remove VLA from stmpe driver gpio: mxc: Switch to SPDX identifier gpio: mxc: add clock operation gpio: Remove VLA from gpiolib gpio: aspeed: Use a cache of output data registers gpio: aspeed: Set output latch before changing direction gpio: pca953x: fix address calculation for pcal6524 gpio: pca953x: define masks for addressing common and extended registers gpio: pca953x: set the PCA_PCAL flag also when matching by DT ...
2018-06-08	Merge branch 'for-linus' of ↵	Linus Torvalds	1	-3/+16
	git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: - Valve Steam Controller support from Rodrigo Rivas Costa - Redragon Asura support from Robert Munteanu - improvement of duplicate usage handling in generic hid-input from Benjamin Tissoires - Win 8.1 precisioun touchpad spec implementation from Benjamin Tissoires - Support for "In Range" flag for Wacom Intuos/Bamboo devices from Jason Gerecke - other various assorted smaller fixes and improvements * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (27 commits) HID: rmi: use HID_QUIRK_NO_INPUT_SYNC HID: multitouch: fix calculation of last slot field in multi-touch reports HID: quirks: remove Delcom Visual Signal Indicator from hid_have_special_driver[] HID: steam: select CONFIG_POWER_SUPPLY HID: i2c-hid: remove i2c_hid_open_mut HID: wacom: Support "in range" for Intuos/Bamboo tablets where possible HID: core: fix hid_hw_open() comment HID: hid-plantronics: Re-resend Update to map button for PTT products HID: multitouch: fix types returned from mt_need_to_apply_feature() HID: i2c-hid: check if device is there before really probing HID: steam: add missing fields in client initialization HID: steam: add battery device. HID: add driver for Valve Steam Controller HID: alps: Fix some style in 't4_read_write_register()' HID: alps: Check errors returned by 't4_read_write_register()' HID: alps: Save a memory allocation in 't4_read_write_register()' when writing data HID: alps: Report an error if we receive invalid data in 't4_read_write_register()' HID: multitouch: implement precision touchpad latency and switches HID: multitouch: simplify the settings of the various features HID: multitouch: make use of HID_QUIRK_INPUT_PER_APP ...
2018-06-08	Merge branch 'work.aio' of ↵	Linus Torvalds	3	-2/+24
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull aio iopriority support from Al Viro: "The rest of aio stuff for this cycle - Adam's aio ioprio series" * 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: aio ioprio use ioprio_check_cap ret val fs: aio ioprio add explicit block layer dependence fs: iomap dio set bio prio from kiocb prio fs: blkdev set bio prio from kiocb prio fs: Add aio iopriority support fs: Convert kiocb rw_hint from enum to u16 block: add ioprio_check_cap function
2018-06-08	Merge tag 'for-linus-4.18-rc1-tag' of ↵	Linus Torvalds	4	-4/+104
	git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "This contains some minor code cleanups (fixing return types of functions), some fixes for Linux running as Xen PVH guest, and adding of a new guest resource mapping feature for Xen tools" * tag 'for-linus-4.18-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/PVH: Make GDT selectors PVH-specific xen/PVH: Set up GS segment for stack canary xen/store: do not store local values in xen_start_info xen-netfront: fix xennet_start_xmit()'s return type xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE xen: Change return type to vm_fault_t
2018-06-08	block: add bioset_init_from_src() helper	Jens Axboe	1	-0/+1
	Add a helper that allows a caller to initialize a new bio_set, using the settings from an existing bio_set. Reported-by: Venkat R.B <[email protected]> Tested-by: Venkat R.B <[email protected]> Tested-by: Li Wang <[email protected]> Reviewed-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-06-08	netfilter: remove include/net/netfilter/nft_dup.h	Corentin Labbe	1	-10/+0
	include/net/netfilter/nft_dup.h was introduced in d877f07112f1 ("netfilter: nf_tables: add nft_dup expression") but was never user since this date. Furthermore, the only struct in this file is unused elsewhere. Signed-off-by: Corentin Labbe <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-06-08	Merge branch 'for-4.18/multitouch' into for-linus	Jiri Kosina	1	-3/+15
	- improvement of duplicate usage handling in hid-input from Benjamin Tissoires - Win 8.1 precisioun touchpad spec implementation from Benjamin Tissoires
2018-06-07	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	24	-476/+437
	Merge updates from Andrew Morton: - a few misc things - ocfs2 updates - v9fs updates - MM - procfs updates - lib/ updates - autofs updates * emailed patches from Andrew Morton <[email protected]>: (118 commits) autofs: small cleanup in autofs_getpath() autofs: clean up includes autofs: comment on selinux changes needed for module autoload autofs: update MAINTAINERS entry for autofs autofs: use autofs instead of autofs4 in documentation autofs: rename autofs documentation files autofs: create autofs Kconfig and Makefile autofs: delete fs/autofs4 source files autofs: update fs/autofs4/Makefile autofs: update fs/autofs4/Kconfig autofs: copy autofs4 to autofs autofs4: use autofs instead of autofs4 everywhere autofs4: merge auto_fs.h and auto_fs4.h fs/binfmt_misc.c: do not allow offset overflow checkpatch: improve patch recognition lib/ucs2_string.c: add MODULE_LICENSE() lib/mpi: headers cleanup lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock lib/idr.c: remove simple_ida_lock lib/bitmap.c: micro-optimization for __bitmap_complement() ...
2018-06-07	autofs4: merge auto_fs.h and auto_fs4.h	Ian Kent	2	-162/+160
	The autofs module has long since been removed so there's no need to have two separate include files for autofs. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ian Kent <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	lib/mpi: headers cleanup	Vasily Averin	1	-61/+0
	MPI headers contain definitions for huge number of non-existing functions. Most part of these functions was removed in 2012 by Dmitry Kasatkin - 7cf4206a99d1 ("Remove unused code from MPI library") - 9e235dcaf4f6 ("Revert "crypto: GnuPG based MPI lib - additional ...") - bc95eeadf5c6 ("lib/mpi: removed unused functions") however headers wwere not updated properly. Also I deleted some unused macros. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Dmitry Kasatkin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	include/linux/types.h: use fixed width types without double-underscore prefix	Masahiro Yamada	1	-14/+14
	This header file is not exported. It is safe to reference types without double-underscore prefix. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Lihao Liang <[email protected]> Cc: Philippe Ombredanne <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	include/linux/types.h: define aligned_ types based on uapi header	Masahiro Yamada	1	-3/+3
	<uapi/linux/types.h> has the same typedefs except that it prefixes them with double-underscore for user space. Use them for the kernel space typedefs. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Lihao Liang <[email protected]> Cc: Philippe Ombredanne <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	int-ll64.h: define u{8,16,32,64} and s{8,16,32,64} based on uapi header	Masahiro Yamada	1	-11/+8
	<uapi/asm-generic/int-ll64.h> has the same typedefs except that it prefixes them with double-underscore for user space. Use them for the kernel space typedefs. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Lihao Liang <[email protected]> Cc: Philippe Ombredanne <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: remove page_is_poisoned() from linux/mm.h	Sahara	1	-2/+0
	When commit bd33ef368135 ("mm: enable page poisoning early at boot") got rid of the PAGE_EXT_DEBUG_POISON, page_is_poisoned in the header left behind. This patch cleans up the leftovers under the table. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Sahara <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mem_cgroup: make sure moving_account, move_lock_task and stat_cpu in the ↵	Aaron Lu	1	-4/+19
	same cacheline The LKP robot found a 27% will-it-scale/page_fault3 performance regression regarding commit e27be240df53("mm: memcg: make sure memory.events is uptodate when waking pollers"). What the test does is: 1 mkstemp() a 128M file on a tmpfs; 2 start $nr_cpu processes, each to loop the following: 2.1 mmap() this file in shared write mode; 2.2 write 0 to this file in a PAGE_SIZE step till the end of the file; 2.3 unmap() this file and repeat this process. 3 After 5 minutes, check how many loops they managed to complete, the higher the better. The commit itself looks innocent enough as it merely changed some event counting mechanism and this test didn't trigger those events at all. Perf shows increased cycles spent on accessing root_mem_cgroup->stat_cpu in count_memcg_event_mm()(called by handle_mm_fault()) and in __mod_memcg_state() called by page_add_file_rmap(). So it's likely due to the changed layout of 'struct mem_cgroup' that either make stat_cpu falling into a constantly modifying cacheline or some hot fields stop being in the same cacheline. I verified this by moving memory_events[] back to where it was: : --- a/include/linux/memcontrol.h : +++ b/include/linux/memcontrol.h : @@ -205,7 +205,6 @@ struct mem_cgroup { : int oom_kill_disable; : : /* memory.events / : - atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS]; : struct cgroup_file events_file; : : / protect arrays of thresholds / : @@ -238,6 +237,7 @@ struct mem_cgroup { : struct mem_cgroup_stat_cpu __percpu stat_cpu; : atomic_long_t stat[MEMCG_NR_STAT]; : atomic_long_t events[NR_VM_EVENT_ITEMS]; : + atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS]; : : unsigned long socket_pressure; And performance restored. Later investigation found that as long as the following 3 fields moving_account, move_lock_task and stat_cpu are in the same cacheline, performance will be good. To avoid future performance surprise by other commits changing the layout of 'struct mem_cgroup', this patch makes sure the 3 fields stay in the same cacheline. One concern of this approach is, moving_account and move_lock_task could be modified when a process changes memory cgroup while stat_cpu is a always read field, it might hurt to place them in the same cacheline. I assume it is rare for a process to change memory cgroup so this should be OK. Link: https://lkml.kernel.org/r/20180528114019.GF9904@yexl-desktop Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Aaron Lu <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	include/linux/gfp.h: fix the annotation of GFP_ZONE_TABLE	Huaisheng Ye	1	-2/+2
	When bit is equal to 0x4, it means OPT_ZONE_DMA32 should be got from GFP_ZONE_TABLE. OPT_ZONE_DMA32 shall be equal to ZONE_DMA32 or ZONE_NORMAL according to the status of CONFIG_ZONE_DMA32. Similarly, when bit is equal to 0xc, that means OPT_ZONE_DMA32 should be got with an allocation policy GFP_MOVABLE. So ZONE_DMA32 or ZONE_NORMAL is the possible result value. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Huaisheng Ye <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Kate Stewart <[email protected]> Cc: "Levin, Alexander (Sasha Levin)" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	userfaultfd: prevent non-cooperative events vs mcopy_atomic races	Mike Rapoport	1	-2/+4
	If a process monitored with userfaultfd changes it's memory mappings or forks() at the same time as uffd monitor fills the process memory with UFFDIO_COPY, the actual creation of page table entries and copying of the data in mcopy_atomic may happen either before of after the memory mapping modifications and there is no way for the uffd monitor to maintain consistent view of the process memory layout. For instance, let's consider fork() running in parallel with userfaultfd_copy(): process \| uffd monitor ---------------------------------+------------------------------ fork() \| userfaultfd_copy() ... \| ... dup_mmap() \| down_read(mmap_sem) down_write(mmap_sem) \| /* create PTEs, copy data / dup_uffd() \| up_read(mmap_sem) copy_page_range() \| up_write(mmap_sem) \| dup_uffd_complete() \| / notify monitor / \| If the userfaultfd_copy() takes the mmap_sem first, the new page(s) will be present by the time copy_page_range() is called and they will appear in the child's memory mappings. However, if the fork() is the first to take the mmap_sem, the new pages won't be mapped in the child's address space. If the pages are not present and child tries to access them, the monitor will get page fault notification and everything is fine. However, if the pages are present*, the child can access them without uffd noticing. And if we copy them into child it'll see the wrong data. Since we are talking about background copy, we'd need to decide whether the pages should be copied or not regardless #PF notifications. Since userfaultfd monitor has no way to determine what was the order, let's disallow userfaultfd_copy in parallel with the non-cooperative events. In such case we return -EAGAIN and the uffd monitor can understand that userfaultfd_copy() clashed with a non-cooperative event and take an appropriate action. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Acked-by: Pavel Emelyanov <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Andrei Vagin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	slub: remove kmem_cache->reserved	Matthew Wilcox	1	-1/+0
	The reserved field was only used for embedding an rcu_head in the data structure. With the previous commit, we no longer need it. That lets us remove the 'reserved' argument to a lot of functions. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: add hmm_data to struct page	Matthew Wilcox	2	-12/+8
	Make hmm_data an explicit member of the struct page union. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: add pt_mm to struct page	Matthew Wilcox	1	-1/+1
	For pgd page table pages, x86 overloads the page->index field to store a pointer to the mm_struct. Rename this to pt_mm so it's visible to other users. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: improve struct page documentation	Matthew Wilcox	1	-21/+19
	Rewrite the documentation to describe what you can use in struct page rather than what you can't. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: combine LRU and main union in struct page	Matthew Wilcox	1	-51/+46
	This gives us five words of space in a single union in struct page. The compound_mapcount moves position (from offset 24 to offset 20) on 64-bit systems, but that does not seem likely to cause any trouble. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: move lru union within struct page	Matthew Wilcox	1	-51/+51
	Since the LRU is two words, this does not affect the double-word alignment of SLUB's freelist. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: combine first three unions in struct page	Matthew Wilcox	1	-33/+33
	By combining these three one-word unions into one three-word union, we make it easier for users to add their own multi-word fields to struct page, as well as making it obvious that SLUB needs to keep its double-word alignment for its freelist & counters. No field moves position; verified with pahole. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: move _refcount out of struct page union	Matthew Wilcox	1	-15/+10
	Keeping the refcount in the union only encourages people to put something else in the union which will overlap with _refcount and eventually explode messily. pahole reports no fields change location. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: move 'private' union within struct page	Matthew Wilcox	1	-31/+25
	By moving page->private to the fourth word of struct page, we can put the SLUB counters in the same word as SLAB's s_mem and still do the cmpxchg_double trick. Now the SLUB counters no longer overlap with the mapcount or refcount so we can drop the call to page_mapcount_reset() and simplify set_page_slub_counters() to a single line. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: switch s_mem and slab_cache in struct page	Matthew Wilcox	1	-2/+2
	This will allow us to store slub's counters in the same bits as slab's s_mem. slub now needs to set page->mapping to NULL as it frees the page, just like slab does. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Christoph Lameter <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: mark pages in use for page tables	Matthew Wilcox	3	-1/+9
	Define a new PageTable bit in the page_type and use it to mark pages in use as page tables. This can be helpful when debugging crashdumps or analysing memory fragmentation. Add a KPF flag to report these pages to userspace and update page-types.c to interpret that flag. Note that only pages currently accounted as NR_PAGETABLES are tracked as PageTable; this does not include pgd/p4d/pud/pmd pages. Those will be the subject of a later patch. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: split page_type out from _mapcount	Matthew Wilcox	2	-24/+34
	We're already using a union of many fields here, so stop abusing the _mapcount and make page_type its own field. That implies renaming some of the machinery that creates PageBuddy, PageBalloon and PageKmemcg; bring back the PG_buddy, PG_balloon and PG_kmemcg names. As suggested by Kirill, make page_type a bitmask. Because it starts out life as -1 (thanks to sharing the storage with _mapcount), setting a page flag means clearing the appropriate bit. This gives us space for probably twenty or so extra bits (depending how paranoid we want to be about _mapcount underflow). Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andrey Ryabinin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: save two stranded bits in gfp_mask	Shakeel Butt	1	-5/+5
	___GFP_COLD and ___GFP_OTHER_NODE were removed but their bits were stranded. Fill the gaps by moving the existing gfp masks around. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Shakeel Butt <[email protected]> Suggested-by: Vlastimil Babka <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Greg Thelen <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: change return type to vm_fault_t	Souptick Joarder	1	-3/+3
	Use new return type vm_fault_t for fault handler in struct vm_operations_struct. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. See commit 1c8f422059ae ("mm: change return type to vm_fault_t") Link: http://lkml.kernel.org/r/20180512063745.GA26866@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Joe Perches <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Dan Williams <[email protected]> Cc: David Rientjes <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: use new return type vm_fault_t	Souptick Joarder	1	-2/+2
	Use new return type vm_fault_t for fault handler in struct vm_operations_struct. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Link: http://lkml.kernel.org/r/20180511190542.GA2412@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Cc: Dan Williams <[email protected]> Cc: Jan Kara <[email protected]> Cc: Ross Zwisler <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	memcg: introduce memory.min	Roman Gushchin	2	-6/+20
	Memory controller implements the memory.low best-effort memory protection mechanism, which works perfectly in many cases and allows protecting working sets of important workloads from sudden reclaim. But its semantics has a significant limitation: it works only as long as there is a supply of reclaimable memory. This makes it pretty useless against any sort of slow memory leaks or memory usage increases. This is especially true for swapless systems. If swap is enabled, memory soft protection effectively postpones problems, allowing a leaking application to fill all swap area, which makes no sense. The only effective way to guarantee the memory protection in this case is to invoke the OOM killer. It's possible to handle this case in userspace by reacting on MEMCG_LOW events; but there is still a place for a fail-safe in-kernel mechanism to provide stronger guarantees. This patch introduces the memory.min interface for cgroup v2 memory controller. It works very similarly to memory.low (sharing the same hierarchical behavior), except that it's not disabled if there is no more reclaimable memory in the system. If cgroup is not populated, its memory.min is ignored, because otherwise even the OOM killer wouldn't be able to reclaim the protected memory, and the system can stall. [[email protected]: s/low/min/ in docs] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Roman Gushchin <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: move is_pageblock_removable_nolock() to mm/memory_hotplug.c	Mathieu Malaterre	1	-1/+0
	is_pageblock_removable_nolock() is not used outside of mm/memory_hotplug.c. Move it next to unique caller is_mem_section_removable() and make it static. Remove prototype in <linux/memory_hotplug.h> to silence gcc warning (W=1): mm/page_alloc.c:7704:6: warning: no previous prototype for `is_pageblock_removable_nolock' [-Wmissing-prototypes] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mathieu Malaterre <[email protected]> Suggested-by: Michal Hocko <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	memcg: writeback: use memcg->cgwb_list directly	Wang Long	1	-1/+0
	mem_cgroup_cgwb_list is a very simple wrapper and it will never be used outside of code under CONFIG_CGROUP_WRITEBACK. so use memcg->cgwb_list directly. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Wang Long <[email protected]> Reviewed-by: Jan Kara <[email protected]> Acked-by: Tejun Heo <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm/ksm: move [set_]page_stable_node from ksm.h to ksm.c	Mike Rapoport	1	-11/+0
	page_stable_node() and set_page_stable_node() are only used in mm/ksm.c and there is no point to keep them in the include/linux/ksm.h [[email protected]: fix SYSFS=n build] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Andrea Arcangeli <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm/ksm: remove unused page_referenced_ksm declaration	Mike Rapoport	1	-6/+0
	Commit 9f32624be943 ("mm/rmap: use rmap_walk() in page_referenced()") removed the declaration of page_referenced_ksm for the case CONFIG_KSM=y, but left one for CONFIG_KSM=n. Remove the unused leftover. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Andrea Arcangeli <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	lockdep: fix fs_reclaim annotation	Omar Sandoval	1	-0/+4
	While revisiting my Btrfs swapfile series [1], I introduced a situation in which reclaim would lock i_rwsem, and even though the swapon() path clearly made GFP_KERNEL allocations while holding i_rwsem, I got no complaints from lockdep. It turns out that the rework of the fs_reclaim annotation was broken: if the current task has PF_MEMALLOC set, we don't acquire the dummy fs_reclaim lock, but when reclaiming we always check this _after_ we've just set the PF_MEMALLOC flag. In most cases, we can fix this by moving the fs_reclaim_{acquire,release}() outside of the memalloc_noreclaim_{save,restore}(), althought kswapd is slightly different. After applying this, I got the expected lockdep splats. 1: https://lwn.net/Articles/625412/ Link: http://lkml.kernel.org/r/9f8aa70652a98e98d7c4de0fc96a4addcee13efe.1523778026.git.osandov@fb.com Fixes: d92a8cfcb37e ("locking/lockdep: Rework FS_RECLAIM annotation") Signed-off-by: Omar Sandoval <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: memory.low hierarchical behavior	Roman Gushchin	2	-2/+8
	This patch aims to address an issue in current memory.low semantics, which makes it hard to use it in a hierarchy, where some leaf memory cgroups are more valuable than others. For example, there are memcgs A, A/B, A/C, A/D and A/E: A A/memory.low = 2G, A/memory.current = 6G //\\ BC DE B/memory.low = 3G B/memory.current = 2G C/memory.low = 1G C/memory.current = 2G D/memory.low = 0 D/memory.current = 2G E/memory.low = 10G E/memory.current = 0 If we apply memory pressure, B, C and D are reclaimed at the same pace while A's usage exceeds 2G. This is obviously wrong, as B's usage is fully below B's memory.low, and C has 1G of protection as well. Also, A is pushed to the size, which is less than A's 2G memory.low, which is also wrong. A simple bash script (provided below) can be used to reproduce the problem. Current results are: A: 1430097920 A/B: 711929856 A/C: 717426688 A/D: 741376 A/E: 0 To address the issue a concept of effective memory.low is introduced. Effective memory.low is always equal or less than original memory.low. In a case, when there is no memory.low overcommittment (and also for top-level cgroups), these two values are equal. Otherwise it's a part of parent's effective memory.low, calculated as a cgroup's memory.low usage divided by sum of sibling's memory.low usages (under memory.low usage I mean the size of actually protected memory: memory.current if memory.current < memory.low, 0 otherwise). It's necessary to track the actual usage, because otherwise an empty cgroup with memory.low set (A/E in my example) will affect actual memory distribution, which makes no sense. To avoid traversing the cgroup tree twice, page_counters code is reused. Calculating effective memory.low can be done in the reclaim path, as we conveniently traversing the cgroup tree from top to bottom and check memory.low on each level. So, it's a perfect place to calculate effective memory low and save it to use it for children cgroups. This also eliminates a need to traverse the cgroup tree from bottom to top each time to check if parent's guarantee is not exceeded. Setting/resetting effective memory.low is intentionally racy, but it's fine and shouldn't lead to any significant differences in actual memory distribution. With this patch applied results are matching the expectations: A: 2147930112 A/B: 1428721664 A/C: 718393344 A/D: 815104 A/E: 0 Test script: #!/bin/bash CGPATH="/sys/fs/cgroup" truncate /file1 --size 2G truncate /file2 --size 2G truncate /file3 --size 2G truncate /file4 --size 50G mkdir "${CGPATH}/A" echo "+memory" > "${CGPATH}/A/cgroup.subtree_control" mkdir "${CGPATH}/A/B" "${CGPATH}/A/C" "${CGPATH}/A/D" "${CGPATH}/A/E" echo 2G > "${CGPATH}/A/memory.low" echo 3G > "${CGPATH}/A/B/memory.low" echo 1G > "${CGPATH}/A/C/memory.low" echo 0 > "${CGPATH}/A/D/memory.low" echo 10G > "${CGPATH}/A/E/memory.low" echo $$ > "${CGPATH}/A/B/cgroup.procs" && vmtouch -qt /file1 echo $$ > "${CGPATH}/A/C/cgroup.procs" && vmtouch -qt /file2 echo $$ > "${CGPATH}/A/D/cgroup.procs" && vmtouch -qt /file3 echo $$ > "${CGPATH}/cgroup.procs" && vmtouch -qt /file4 echo "A: " `cat "${CGPATH}/A/memory.current"` echo "A/B: " `cat "${CGPATH}/A/B/memory.current"` echo "A/C: " `cat "${CGPATH}/A/C/memory.current"` echo "A/D: " `cat "${CGPATH}/A/D/memory.current"` echo "A/E: " `cat "${CGPATH}/A/E/memory.current"` rmdir "${CGPATH}/A/B" "${CGPATH}/A/C" "${CGPATH}/A/D" "${CGPATH}/A/E" rmdir "${CGPATH}/A" rm /file1 /file2 /file3 /file4 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Roman Gushchin <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm: rename page_counter's count/limit into usage/max	Roman Gushchin	2	-8/+8
	This patch renames struct page_counter fields: count -> usage limit -> max and the corresponding functions: page_counter_limit() -> page_counter_set_max() mem_cgroup_get_limit() -> mem_cgroup_get_max() mem_cgroup_resize_limit() -> mem_cgroup_resize_max() memcg_update_kmem_limit() -> memcg_update_kmem_max() memcg_update_tcp_limit() -> memcg_update_tcp_max() The idea behind this renaming is to have the direct matching between memory cgroup knobs (low, high, max) and page_counters API. This is pure renaming, this patch doesn't bring any functional change. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Roman Gushchin <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-07	mm/memblock: introduce PHYS_ADDR_MAX	Stefan Agner	1	-0/+1
	So far code was using ULLONG_MAX and type casting to obtain a phys_addr_t with all bits set. The typecast is necessary to silence compiler warnings on 32-bit platforms. Use the simpler but still type safe approach "~(phys_addr_t)0" to create a preprocessor define for all bits set. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Stefan Agner <[email protected]> Suggested-by: Linus Torvalds <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>