blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2016-10-29	Merge tag 'usb-4.9-rc3' of ↵	Linus Torvalds	23	-84/+296
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of small USB driver fixes for 4.9-rc3. There is the usual number of gadget and xhci patches in here to resolved reported issues, as well as some usb-serial driver fixes and new device ids. All have been in linux-next with no reported issues" * tag 'usb-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (26 commits) usb: chipidea: host: fix NULL ptr dereference during shutdown usb: renesas_usbhs: add wait after initialization for R-Car Gen3 usb: increase ohci watchdog delay to 275 msec usb: musb: Call pm_runtime from musb_gadget_queue usb: musb: Fix hardirq-safe hardirq-unsafe lock order error usb: ehci-platform: increase EHCI_MAX_RSTS to 4 usb: ohci-at91: Set RemoteWakeupConnected bit explicitly. USB: serial: fix potential NULL-dereference at probe xhci: use default USB_RESUME_TIMEOUT when resuming ports. xhci: workaround for hosts missing CAS bit xhci: add restart quirk for Intel Wildcatpoint PCH USB: serial: cp210x: fix tiocmget error handling wusb: fix error return code in wusb_prf() Revert "Documentation: devicetree: dwc2: Deprecate g-tx-fifo-size" Revert "usb: dwc2: gadget: fix TX FIFO size and address initialization" Revert "usb: dwc2: gadget: change variable name to more meaningful" USB: serial: ftdi_sio: add support for Infineon TriBoard TC2X7 wusb: Stop using the stack for sg crypto scratch space usb: dwc3: Fix size used in dma_free_coherent() usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable ...
2016-10-28	Merge tag 'acpi-4.9-rc3' of ↵	Linus Torvalds	9	-56/+54
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix recent ACPICA regressions, an older PCI IRQ management regression, and an incorrect return value of a function in the APEI code. Specifics: - Fix three ACPICA issues related to the interpreter locking and introduced by recent changes in that area (Lv Zheng). - Fix a PCI IRQ management regression introduced during the 4.7 cycle and related to the configuration of shared IRQs on systems with an ISA bus (Sinan Kaya). - Fix up a return value of one function in the APEI code (Punit Agrawal)" * tag 'acpi-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPICA: Dispatcher: Fix interpreter locking around acpi_ev_initialize_region() ACPICA: Dispatcher: Fix an unbalanced lock exit path in acpi_ds_auto_serialize_method() ACPICA: Dispatcher: Fix order issue of method termination ACPI / APEI: Fix incorrect return value of ghes_proc() ACPI/PCI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs ACPI/PCI: pci_link: penalize SCI correctly ACPI/PCI/IRQ: assign ISA IRQ directly during early boot stages
2016-10-28	Merge tag 'pm-4.9-rc3' of ↵	Linus Torvalds	2	-8/+34
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two intel_pstate issues related to the way it works when the scaling_governor sysfs attribute is set to "performance" and fix up messages in the system suspend core code. Specifics: - Fix a missing KERN_CONT in a system suspend message by converting the affected code to using pr_info() and pr_cont() instead of the "raw" printk() (Jon Hunter). - Make intel_pstate set the CPU P-state from its .set_policy() callback when the scaling_governor sysfs attribute is set to "performance" so that it interacts with NOHZ_FULL more predictably which was the case before 4.7 (Rafael Wysocki). - Make intel_pstate always request the maximum allowed P-state when the scaling_governor sysfs attribute is set to "performance" to prevent it from effectively ingoring that setting is some situations (Rafael Wysocki)" * tag 'pm-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Always set max P-state in performance mode PM / suspend: Fix missing KERN_CONT for suspend message cpufreq: intel_pstate: Set P-state upfront in performance mode
2016-10-28	Merge tag 'arc-4.9-rc3' of ↵	Linus Torvalds	20	-273/+203
	git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC updates from Vineet Gupta: - support IDU intc for UP builds - support gz, lzma compressed uImage [Daniel Mentz] - adjust /proc/cpuinfo for non-continuous cpu ids [Noam Camus] - syscall for userspace cmpxchg assist for configs lacking hardware atomics - rework of boot log printing mainly for identifying older arc700 cores - retiring some old code, build toggles * tag 'arc-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: module: print pretty section names ARC: module: elide loop to save reference to .eh_frame ARC: mm: retire ARC_DBG_TLB_MISS_COUNT... ARC: build: retire old toggles ARC: boot log: refactor cpu name/release printing ARC: boot log: remove awkward space comma from MMU line ARC: boot log: don't assume SWAPE instruction support ARC: boot log: refactor printing abt features not captured in BCRs ARCv2: boot log: print IOC exists as well as enabled status ARCv2: IOC: use @ioc_enable not @ioc_exist where intended ARC: syscall for userspace cmpxchg assist ARC: fix build warning in elf.h ARC: Adjust cpuinfo for non-continuous cpu ids ARC: [build] Support gz, lzma compressed uImage ARCv2: intc: untangle SMP, MCIP and IDU
2016-10-29	Merge branches 'acpica-fixes', 'acpi-pci-fixes' and 'acpi-apei-fixes'	Rafael J. Wysocki	4	-18/+24
	* acpica-fixes: ACPICA: Dispatcher: Fix interpreter locking around acpi_ev_initialize_region() ACPICA: Dispatcher: Fix an unbalanced lock exit path in acpi_ds_auto_serialize_method() ACPICA: Dispatcher: Fix order issue of method termination * acpi-pci-fixes: ACPI/PCI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs ACPI/PCI: pci_link: penalize SCI correctly ACPI/PCI/IRQ: assign ISA IRQ directly during early boot stages * acpi-apei-fixes: ACPI / APEI: Fix incorrect return value of ghes_proc()
2016-10-29	ACPICA: Dispatcher: Fix interpreter locking around acpi_ev_initialize_region()	Lv Zheng	5	-19/+11
	In the code path of acpi_ev_initialize_region(), there is namespace modification code unlocked. This patch tunes the code to make sure such modification are always locked. Fixes: 74f51b80a0c4 (ACPICA: Namespace: Fix dynamic table loading issues) Tested-by: Imre Deak <[email protected]> Signed-off-by: Lv Zheng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-10-29	ACPICA: Dispatcher: Fix an unbalanced lock exit path in ↵	Lv Zheng	1	-1/+1
	acpi_ds_auto_serialize_method() There is a lock unbalanced exit path in acpi_ds_initialize_method(), this patch corrects it. Fixes: 441ad11d078f (ACPICA: Dispatcher: Fix a mutex issue for method auto serialization) Tested-by: Imre Deak <[email protected]> Signed-off-by: Lv Zheng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-10-29	ACPICA: Dispatcher: Fix order issue of method termination	Lv Zheng	1	-20/+20
	The last step of the method termination should be the end of the method serialization. Otherwise, the steps happening after it will face the race issues that cannot be protected by the method serialization mechanism. This patch fixes this issue by moving the per-method-object deletion code prior than the end of the method serialization. Otherwise, the possible race issues may result in AE_ALREADY_EXISTS error in a parallel environment. Fixes: 74f51b80a0c4 (ACPICA: Namespace: Fix dynamic table loading issues) Reported-and-tested-by: Imre Deak <[email protected]> Signed-off-by: Lv Zheng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-10-28	Merge tag 'powerpc-4.9-4' of ↵	Linus Torvalds	10	-42/+108
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - Convert cmp to cmpd in idle enter sequence (Segher Boessenkool) - cxl: Fix leaking pid refs in some error paths (Vaibhav Jain) - Re-fix race condition between going idle and entering guest (Paul Mackerras) - Fix race condition in setting lock bit in idle/wakeup code (Paul Mackerras) - radix: Use tlbiel only if we ever ran on the current cpu (Aneesh Kumar K.V) - relocation, register save fixes for system reset interrupt (Nicholas Piggin) Fixes for code merged this cycle: - Fix CONFIG_ALIVEC typo in restore_tm_state() (Valentin Rothberg) - KVM: PPC: Book3S HV: Fix build error when SMP=n (Michael Ellerman)" * tag 'powerpc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: relocation, register save fixes for system reset interrupt powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu powerpc/process: Fix CONFIG_ALIVEC typo in restore_tm_state() powerpc/64: Fix race condition in setting lock bit in idle/wakeup code powerpc/64: Re-fix race condition between going idle and entering guest cxl: Fix leaking pid refs in some error paths powerpc: Convert cmp to cmpd in idle enter sequence KVM: PPC: Book3S HV: Fix build error when SMP=n
2016-10-29	Merge branches 'pm-cpufreq-fixes' and 'pm-sleep-fixes'	Rafael J. Wysocki	10915	-348613/+619663
	* pm-cpufreq-fixes: cpufreq: intel_pstate: Always set max P-state in performance mode cpufreq: intel_pstate: Set P-state upfront in performance mode * pm-sleep-fixes: PM / suspend: Fix missing KERN_CONT for suspend message
2016-10-28	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds	5	-14/+52
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc kernel fixes: a virtualization environment related fix, an uncore PMU driver removal handling fix, a PowerPC fix and new events for Knights Landing" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors perf/powerpc: Don't call perf_event_disable() from atomic context perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic perf/x86/intel/cstate: Add C-state residency events for Knights Landing
2016-10-28	Merge branch 'libnvdimm-fixes' of ↵	Linus Torvalds	5	-11/+17
	git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A build fix, a NULL de-reference found by static analysis, a misuse of the percpu_ref_exit() (tagged for -stable), and notification of failed attempts to clear media errors. These patches have received a build success notification from the 0day- kbuild-robot and appeared in next-20161028" * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: fix percpu_ref_exit ordering nvdimm: make CONFIG_NVDIMM_DAX 'bool' pmem: report error on clear poison failure libnvdimm, namespace: potential NULL deref on allocation error
2016-10-28	Merge tag 'arm64-fixes' of ↵	Linus Torvalds	2	-4/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "Three arm64 fixes for -rc3. They're all pretty straightforward: a couple of NUMA issues from the Huawei folks and a thinko in __page_to_voff that seems to be benign, but is certainly better off fixed. Summary: - couple of NUMA fixes - thinko in __page_to_voff" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: fix __page_to_voff definition arm64/numa: fix incorrect log for memory-less node arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximity
2016-10-28	Merge branch 'x86-urgent-for-linus' of ↵	Linus Torvalds	5	-7/+14
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: three build fixes, an unwinder fix and a microcode loader fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y x86: Fix export for mcount and __fentry__ x86/quirks: Hide maybe-uninitialized warning x86/build: Fix build with older GCC versions x86/unwind: Fix empty stack dereference in guess unwinder
2016-10-28	Merge branch 'timers-urgent-for-linus' of ↵	Linus Torvalds	1	-30/+44
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Fix four timer locking races: two were noticed by Linus while reviewing the code while chasing for a corruption bug, and two from fixing spurious USB timeouts" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Prevent base clock corruption when forwarding timers: Prevent base clock rewind when forwarding clock timers: Lock base for same bucket optimization timers: Plug locking race vs. timer migration
2016-10-28	Merge branches 'core-urgent-for-linus', 'irq-urgent-for-linus' and ↵	Linus Torvalds	3	-4/+3
	'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool, irq and scheduler fixes from Ingo Molnar: "One more objtool fixlet for GCC6 code generation patterns, an irq DocBook fix and an unused variable warning fix in the scheduler" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix rare switch jump table pattern detection * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: doc: Add missing parameter for msi_setup * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Remove unused but set variable 'rq'
2016-10-28	ARC: module: print pretty section names	Vineet Gupta	1	-14/+21
	Now that we have referece to section name string table in apply_relocate_add(), use it to - print the name of section being relocated - print symbol with NULL name (since it refers to a section) before \| Section to fixup 7000a060 \| ========================================================= \| rela->r_off \| rela->addend \| sym->st_value \| ADDR \| VALUE \| ========================================================= \| 1c 0 7000e000 7000a07c 7000e000 [] \| 40 0 7000a000 7000a0a0 7000a000 [] after \| Section to fixup .eh_frame @7000a060 \| ========================================================= \| r_off r_add st_value ADDRESS VALUE \| ========================================================= \| 1c 0 7000e000 7000a07c 7000e000 [.init.text] \| 40 0 7000a000 7000a0a0 7000a000 [.exit.text] Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	ARC: module: elide loop to save reference to .eh_frame	Vineet Gupta	2	-10/+9
	The loop was really needed in .debug_frame regime where wanted make it as SH_ALLOC so that apply_relocate_add() would process it. That's not needed for .eh_frame, so we check this in apply_relocate_add() which gets called for each section. Note that we need to save reference to "section name strings" section in module_frob_arch_sections() since apply_relocate_add() doesn't get that Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	ARC: mm: retire ARC_DBG_TLB_MISS_COUNT...	Vineet Gupta	3	-139/+0
	... given that we have perf counters abel to do the same thing non intrusively Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	ARC: build: retire old toggles	Vineet Gupta	1	-3/+0
	These are really ancient toggles and tools no longer require them to be passed. This paves way for deprecating them in long run. Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	ARC: boot log: refactor cpu name/release printing	Vineet Gupta	3	-24/+34
	The motivation is to identify ARC750 vs. ARC770 (we currently print generic "ARC700"). A given ARC700 release could be 750 or 770, with same ARCNUM (or family identifier which is unfortunate). The existing arc_cpu_tbl[] kept a single concatenated string for core name and release which thus doesn't work for 750 vs. 770 identification. So split this into 2 tables, one with core names and other with release. And while we are at it, get rid of the range checking for family numbers. We just document the known to exist cores running Linux and ditch others. With this in place, we add detection of ARC750 which is - cores 0x33 and before - cores 0x34 and later with MMUv2 Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	ARC: boot log: remove awkward space comma from MMU line	Vineet Gupta	1	-3/+3
	Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	ARC: boot log: don't assume SWAPE instruction support	Vineet Gupta	2	-2/+5
	This came to light when helping a customer with oldish ARC750 core who were getting instruction errors because of lack of SWAPE but boot log was incorrectly printing it as being present Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	ARC: boot log: refactor printing abt features not captured in BCRs	Vineet Gupta	2	-45/+43
	On older arc700 cores, some of the features configured were not present in Build config registers. To print about them at boot, we just use the Kconfig option i.e. whether linux is built to use them or not. So yes this seems bogus, but what else can be done. Moreover if linux is booting with these enabled, then the Kconfig info is a good indicator anyways. Over time these "hacks" accumulated in read_arc_build_cfg_regs() as well as arc_cpu_mumbojumbo(). so refactor and move all of those in a single place: read_arc_build_cfg_regs(). This causes some code redcution too: \| bloat-o-meter2 arch/arc/kernel/setup.o.0 arch/arc/kernel/setup.o.1 \| add/remove: 0/0 grow/shrink: 2/1 up/down: 64/-132 (-68) \| function old new delta \| setup_processor 610 670 +60 \| cpuinfo_arc700 76 80 +4 \| arc_cpu_mumbojumbo 752 620 -132 Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	Merge branch 'for-linus-4.9' of ↵	Linus Torvalds	2	-14/+64
	git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "My patch fixes the btrfs list_head abuse that we tracked down during Dave Jones' memory corruption investigation. With both Jens and my patches in place, I'm no longer able to trigger problems. Filipe is fixing a difficult old bug between snapshots, balance and send. Dave is cooking a few more for the next rc, but these are tested and ready" * 'for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: fix races on root_log_ctx lists btrfs: fix incremental send failure caused by balance
2016-10-28	ARCv2: boot log: print IOC exists as well as enabled status	Vineet Gupta	3	-9/+5
	Previously we would not print the case when IOC existed but was not enabled. And while at it, reduce one line off boot printing by consolidating the Peripheral address space and IO-Coherency which in a way applies to them Signed-off-by: Vineet Gupta <[email protected]>
2016-10-28	Merge tag 'sound-4.9-rc3' of ↵	Linus Torvalds	5	-8/+52
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "This contains the usual stuff -- the fixups and quirks for HD-audio and USB-audio, in addition to a bad regression fix in ALSA sequencer timer since 4.8, and a trivial fix for asihpi PCI driver" * tag 'sound-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: usb-audio: Add quirk for Syntek STK1160 ALSA: seq: Fix time account regression ALSA: hda - Fix surround output pins for ASRock B150M mobo ALSA: hda - Fix headset mic detection problem for two Dell laptops ALSA: asihpi: fix kernel memory disclosure ALSA: hda - Adding a new group of pin cfg into ALC295 pin quirk table ALSA: hda - allow 40 bit DMA mask for NVidia devices
2016-10-28	Merge tag 'drm-x86-pat-regression-fix' of ↵	Linus Torvalds	9	-0/+80
	git://people.freedesktop.org/~airlied/linux Pull drm x86/pat regression fixes from Dave Airlie: "This is a standalone pull request for the fix for a regression introduced in -rc1 by a change to vm_insert_mixed to start using the PAT range tracking to validate page protections. With this fix in place, all the VRAM mappings for GPU drivers ended up at UC instead of WC. There are probably better ways to fix this long term, but nothing I'd considered for -fixes that wouldn't need more settling in time. So I've just created a new arch API that the drivers can reserve all their VRAM aperture ranges as WC" * tag 'drm-x86-pat-regression-fix' of git://people.freedesktop.org/~airlied/linux: drm/drivers: add support for using the arch wc mapping API. x86/io: add interface to reserve io memtype for a resource range. (v1.1)
2016-10-28	Merge tag 'dm-4.9-fixes' of ↵	Linus Torvalds	6	-44/+29
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - a couple DM raid and DM mirror fixes - a couple .request_fn request-based DM NULL pointer fixes - a fix for a DM target reference count leak, on target load error, that prevented associated DM target kernel module(s) from being removed * tag 'dm-4.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm table: fix missing dm_put_target_type() in dm_table_add_target() dm rq: clear kworker_task if kthread_run() returned an error dm: free io_barrier after blk_cleanup_queue call dm raid: fix activation of existing raid4/10 devices dm mirror: use all available legs on multiple failures dm mirror: fix read error on recovery after default leg failure dm raid: fix compat_features validation
2016-10-28	Merge branch 'for-linus' of ↵	Linus Torvalds	3	-29/+34
	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull key fixes from James Morris: - fix a buffer overflow when displaying /proc/keys [CVE-2016-7042]. - fix broken initialisation in the big_key implementation that can result in an oops. - make big_key depend on having a random number generator available in Kconfig. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: security/keys: make BIG_KEYS dependent on stdrng. KEYS: Sort out big_key initialisation KEYS: Fix short sprintf buffer in /proc/keys show function
2016-10-28	perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors	Imre Palik	1	-3/+7
	perf doesn't seem to honour the number of fixed counters specified by CPUID leaf 0xa. It always assumes that Intel CPUs have at least 3 fixed counters. So if some of the fixed counters are masked out by the hypervisor, it still tries to check/set them. This patch makes perf behave nicer when the kernel is running under a hypervisor that doesn't expose all the counters. This patch contains some ideas from Matt Wilson. Signed-off-by: Imre Palik <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Cc: Alexander Kozyrev <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Artyom Kuanbekov <[email protected]> Cc: David Carrillo-Cisneros <[email protected]> Cc: David Woodhouse <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Matt Wilson <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-10-28	perf/powerpc: Don't call perf_event_disable() from atomic context	Jiri Olsa	3	-3/+10
	The trinity syscall fuzzer triggered following WARN() on powerpc: WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278 ... NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0 LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 Call Trace: [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable) [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 Followed by a lockdep warning: =============================== [ INFO: suspicious RCU usage. ] 4.8.0-rc5+ #7 Tainted: G W ------------------------------- ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ls/2998: #0: (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0 #1: (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0 stack backtrace: CPU: 9 PID: 2998 Comm: ls Tainted: G W 4.8.0-rc5+ #7 Call Trace: [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable) [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180 [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0 [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0 [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380 [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60 [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0 [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 While it looks like the first WARN() is probably valid, the other one is triggered by disabling event via perf_event_disable() from atomic context. The event is disabled here in case we were not able to emulate the instruction that hit the breakpoint. By disabling the event we unschedule the event and make sure it's not scheduled back. But we can't call perf_event_disable() from atomic context, instead we need to use the event's pending_disable irq_work method to disable it. Reported-by: Jan Stancek <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Huang Ying <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/20161026094824.GA21397@krava Signed-off-by: Ingo Molnar <[email protected]>
2016-10-28	perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix ↵	Jiri Olsa	1	-4/+9
	CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic CAI Qian reported a crash in the PMU uncore device removal code, enabled by the CONFIG_DEBUG_TEST_DRIVER_REMOVE=y option: https://marc.info/?l=linux-kernel&m=147688837328451 The reason for the crash is that perf_pmu_unregister() tries to remove a PMU device which is not added at this point. We add PMU devices only after pmu_bus is registered, which happens in the perf_event_sysfs_init() call and sets the 'pmu_bus_running' flag. The fix is to get the 'pmu_bus_running' flag state at the point the PMU is taken out of the PMU list and remove the device later only if it's set. Reported-by: CAI Qian <[email protected]> Tested-by: CAI Qian <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/20161020111011.GA13361@krava Signed-off-by: Ingo Molnar <[email protected]>
2016-10-28	x86/microcode/AMD: Fix more fallout from CONFIG_RANDOMIZE_MEMORY=y	Borislav Petkov	1	-1/+1
	We needed the physical address of the container in order to compute the offset within the relocated ramdisk. And we did this by doing __pa() on the virtual address. However, __pa() does checks whether the physical address is within PAGE_OFFSET and __START_KERNEL_map - see __phys_addr() - which fail if we have CONFIG_RANDOMIZE_MEMORY enabled: we feed a virtual address which doesn't have the randomization offset into a function which uses PAGE_OFFSET which does have that offset. This makes this check fire: VIRTUAL_BUG_ON((x > y) \|\| !phys_addr_valid(x)); ^^^^^^ due to the randomization offset. The fix is as simple as using __pa_nodebug() because we do that randomization offset accounting later in that function ourselves. Reported-by: Bob Peterson <[email protected]> Tested-by: Bob Peterson <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Andreas Gruenbacher <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Whitehouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: linux-mm <[email protected]> Cc: [email protected] # 4.9 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-10-27	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds	24	-51/+89
	Merge misc fixes from Andrew Morton: "20 fixes" * emailed patches from Andrew Morton <[email protected]>: drivers/misc/sgi-gru/grumain.c: remove bogus 0x prefix from printk cris/arch-v32: cryptocop: print a hex number after a 0x prefix ipack: print a hex number after a 0x prefix block: DAC960: print a hex number after a 0x prefix fs: exofs: print a hex number after a 0x prefix lib/genalloc.c: start search from start of chunk mm: memcontrol: do not recurse in direct reclaim CREDITS: update credit information for Martin Kepplinger proc: fix NULL dereference when reading /proc/<pid>/auxv mm: kmemleak: ensure that the task stack is not freed during scanning lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB latent_entropy: raise CONFIG_FRAME_WARN by default kconfig.h: remove config_enabled() macro ipc: account for kmem usage on mqueue and msg mm/slab: improve performance of gathering slabinfo stats mm: page_alloc: use KERN_CONT where appropriate mm/list_lru.c: avoid error-path NULL pointer deref h8300: fix syscall restarting kcov: properly check if we are in an interrupt mm/slab: fix kmemcg cache creation delayed issue
2016-10-27	drivers/misc/sgi-gru/grumain.c: remove bogus 0x prefix from printk	Dimitri Sivanich	1	-1/+1
	Would like to have this be a decimal number. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dimitri Sivanich <[email protected]> Reported-by: Uwe Kleine-König <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	cris/arch-v32: cryptocop: print a hex number after a 0x prefix	Uwe Kleine-König	1	-1/+1
	It makes the result hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Cc: Mikael Starvik <[email protected]> Cc: Jesper Nilsson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	ipack: print a hex number after a 0x prefix	Uwe Kleine-König	1	-1/+1
	It makes the result hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Cc: Samuel Iglesias Gonsalvez <[email protected]> Cc: Jens Taprogge <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	block: DAC960: print a hex number after a 0x prefix	Uwe Kleine-König	1	-2/+2
	It makes the message hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	fs: exofs: print a hex number after a 0x prefix	Uwe Kleine-König	1	-1/+1
	It makes the message hard to interpret correctly if a base 10 number is prefixed by 0x. So change to a hex number. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Uwe Kleine-König <[email protected]> Cc: Boaz Harrosh <[email protected]> Cc: Benny Halevy <[email protected]> Cc: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	lib/genalloc.c: start search from start of chunk	Daniel Mentz	1	-1/+2
	gen_pool_alloc_algo() iterates over the chunks of a pool trying to find a contiguous block of memory that satisfies the allocation request. The shortcut if (size > atomic_read(&chunk->avail)) continue; makes the loop skip over chunks that do not have enough bytes left to fulfill the request. There are two situations, though, where an allocation might still fail: (1) The available memory is not contiguous, i.e. the request cannot be fulfilled due to external fragmentation. (2) A race condition. Another thread runs the same code concurrently and is quicker to grab the available memory. In those situations, the loop calls pool->algo() to search the entire chunk, and pool->algo() returns some value that is >= end_bit to indicate that the search failed. This return value is then assigned to start_bit. The variables start_bit and end_bit describe the range that should be searched, and this range should be reset for every chunk that is searched. Today, the code fails to reset start_bit to 0. As a result, prefixes of subsequent chunks are ignored. Memory allocations might fail even though there is plenty of room left in these prefixes of those other chunks. Fixes: 7f184275aa30 ("lib, Make gen_pool memory allocator lockless") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Daniel Mentz <[email protected]> Reviewed-by: Mathieu Desnoyers <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	mm: memcontrol: do not recurse in direct reclaim	Johannes Weiner	2	-0/+11
	On 4.0, we saw a stack corruption from a page fault entering direct memory cgroup reclaim, calling into btrfs_releasepage(), which then tried to allocate an extent and recursed back into a kmem charge ad nauseam: [...] btrfs_releasepage+0x2c/0x30 try_to_release_page+0x32/0x50 shrink_page_list+0x6da/0x7a0 shrink_inactive_list+0x1e5/0x510 shrink_lruvec+0x605/0x7f0 shrink_zone+0xee/0x320 do_try_to_free_pages+0x174/0x440 try_to_free_mem_cgroup_pages+0xa7/0x130 try_charge+0x17b/0x830 memcg_charge_kmem+0x40/0x80 new_slab+0x2d9/0x5a0 __slab_alloc+0x2fd/0x44f kmem_cache_alloc+0x193/0x1e0 alloc_extent_state+0x21/0xc0 __clear_extent_bit+0x2b5/0x400 try_release_extent_mapping+0x1a3/0x220 __btrfs_releasepage+0x31/0x70 btrfs_releasepage+0x2c/0x30 try_to_release_page+0x32/0x50 shrink_page_list+0x6da/0x7a0 shrink_inactive_list+0x1e5/0x510 shrink_lruvec+0x605/0x7f0 shrink_zone+0xee/0x320 do_try_to_free_pages+0x174/0x440 try_to_free_mem_cgroup_pages+0xa7/0x130 try_charge+0x17b/0x830 mem_cgroup_try_charge+0x65/0x1c0 handle_mm_fault+0x117f/0x1510 __do_page_fault+0x177/0x420 do_page_fault+0xc/0x10 page_fault+0x22/0x30 On later kernels, kmem charging is opt-in rather than opt-out, and that particular kmem allocation in btrfs_releasepage() is no longer being charged and won't recurse and overrun the stack anymore. But it's not impossible for an accounted allocation to happen from the memcg direct reclaim context, and we needed to reproduce this crash many times before we even got a useful stack trace out of it. Like other direct reclaimers, mark tasks in memcg reclaim PF_MEMALLOC to avoid recursing into any other form of direct reclaim. Then let recursive charges from PF_MEMALLOC contexts bypass the cgroup limit. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Johannes Weiner <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	CREDITS: update credit information for Martin Kepplinger	Martin Kepplinger	1	-2/+3
	Content and employer changed. Link: http://lkml.kernel.org/r/1477304102-28830-1-git-send-email-martin.kepplinger@ginzinger.com Signed-off-by: Martin Kepplinger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	proc: fix NULL dereference when reading /proc/<pid>/auxv	Leon Yu	1	-0/+3
	Reading auxv of any kernel thread results in NULL pointer dereferencing in auxv_read() where mm can be NULL. Fix that by checking for NULL mm and bailing out early. This is also the original behavior changed by recent commit c5317167854e ("proc: switch auxv to use of __mem_open()"). # cat /proc/2/auxv Unable to handle kernel NULL pointer dereference at virtual address 000000a8 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 3 PID: 113 Comm: cat Not tainted 4.9.0-rc1-ARCH+ #1 Hardware name: BCM2709 task: ea3b0b00 task.stack: e99b2000 PC is at auxv_read+0x24/0x4c LR is at do_readv_writev+0x2fc/0x37c Process cat (pid: 113, stack limit = 0xe99b2210) Call chain: auxv_read do_readv_writev vfs_readv default_file_splice_read splice_direct_to_actor do_splice_direct do_sendfile SyS_sendfile64 ret_fast_syscall Fixes: c5317167854e ("proc: switch auxv to use of __mem_open()") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Leon Yu <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Al Viro <[email protected]> Cc: Kees Cook <[email protected]> Cc: John Stultz <[email protected]> Cc: Mateusz Guzik <[email protected]> Cc: Janis Danisevskis <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	mm: kmemleak: ensure that the task stack is not freed during scanning	Catalin Marinas	1	-2/+5
	Commit 68f24b08ee89 ("sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK") may cause the task->stack to be freed during kmemleak_scan() execution, leading to either a NULL pointer fault (if task->stack is NULL) or kmemleak accessing already freed memory. This patch uses the new try_get_task_stack() API to ensure that the task stack is not freed during kmemleak stack scanning. Addresses https://bugzilla.kernel.org/show_bug.cgi?id=173901. Fixes: 68f24b08ee89 ("sched/core: Free the stack early if CONFIG_THREAD_INFO_IN_TASK") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]> Reported-by: CAI Qian <[email protected]> Tested-by: CAI Qian <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: CAI Qian <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB	Dmitry Vyukov	1	-1/+1
	KASAN uses stackdepot to memorize stacks for all kmalloc/kfree calls. Current stackdepot capacity is 16MB (1024 top level entries x 4 pages on second level). Size of each stack is (num_frames + 3) * sizeof(long). Which gives us ~84K stacks. This capacity was chosen empirically and it is enough to run kernel normally. However, when lots of configs are enabled and a fuzzer tries to maximize code coverage, it easily hits the limit within tens of minutes. I've tested for long a time with number of top level entries bumped 4x (4096). And I think I've seen overflow only once. But I don't have all configs enabled and code coverage has not reached maximum yet. So bump it 8x to 8192. Since we have two-level table, memory cost of this is very moderate -- currently the top-level table is 8KB, with this patch it is 64KB, which is negligible under KASAN. Here is some approx math. 128MB allows us to memorize ~670K stacks (assuming stack is ~200b). I've grepped kernel for kmalloc\|kfree\|kmem_cache_alloc\|kmem_cache_free\| kzalloc\|kstrdup\|kstrndup\|kmemdup and it gives ~60K matches. Most of alloc/free call sites are reachable with only one stack. But some utility functions can have large fanout. Assuming average fanout is 5x, total number of alloc/free stacks is ~300K. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Baozeng Ding <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	latent_entropy: raise CONFIG_FRAME_WARN by default	Kees Cook	1	-0/+1
	When building with the latent_entropy plugin, set the default CONFIG_FRAME_WARN to 2048, since some __init functions have many basic blocks that, when instrumented by the latent_entropy plugin, grow beyond 1024 byte stack size on 32-bit builds. Link: http://lkml.kernel.org/r/20161018211216.GA39687@beast Signed-off-by: Kees Cook <[email protected]> Reported-by: kbuild test robot <[email protected]> Cc: Emese Revfy <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Michal Marek <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Dan Williams <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Nikolay Aleksandrov <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	kconfig.h: remove config_enabled() macro	Masahiro Yamada	3	-7/+6
	The use of config_enabled() is ambiguous. For config options, IS_ENABLED(), IS_REACHABLE(), etc. will make intention clearer. Sometimes config_enabled() has been used for non-config options because it is useful to check whether the given symbol is defined or not. I have been tackling on deprecating config_enabled(), and now is the time to finish this work. Some new users have appeared for v4.9-rc1, but it is trivial to replace them: - arch/x86/mm/kaslr.c replace config_enabled() with IS_ENABLED() because CONFIG_X86_ESPFIX64 and CONFIG_EFI are boolean. - include/asm-generic/export.h replace config_enabled() with __is_defined(). Then, config_enabled() can be removed now. Going forward, please use IS_ENABLED(), IS_REACHABLE(), etc. for config options, and __is_defined() for non-config symbols. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: Nicolas Pitre <[email protected]> Cc: Peter Oberparleiter <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Kees Cook <[email protected]> Cc: Michal Marek <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Garnier <[email protected]> Cc: Paul Bolle <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	ipc: account for kmem usage on mqueue and msg	Aristeu Rozanski	1	-2/+2
	When kmem accounting switched from account by default to only account if flagged by __GFP_ACCOUNT, IPC mqueue and messages was left out. The production use case at hand is that mqueues should be customizable via sysctls in Docker containers in a Kubernetes cluster. This can only be safely allowed to the users of the cluster (without the risk that they can cause resource shortage on a node, influencing other users' containers) if all resources they control are bounded, i.e. accounted for. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Aristeu Rozanski <[email protected]> Reported-by: Stefan Schimanski <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Stefan Schimanski <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-27	mm/slab: improve performance of gathering slabinfo stats	Aruna Ramakrishna	2	-16/+28
	On large systems, when some slab caches grow to millions of objects (and many gigabytes), running 'cat /proc/slabinfo' can take up to 1-2 seconds. During this time, interrupts are disabled while walking the slab lists (slabs_full, slabs_partial, and slabs_free) for each node, and this sometimes causes timeouts in other drivers (for instance, Infiniband). This patch optimizes 'cat /proc/slabinfo' by maintaining a counter for total number of allocated slabs per node, per cache. This counter is updated when a slab is created or destroyed. This enables us to skip traversing the slabs_full list while gathering slabinfo statistics, and since slabs_full tends to be the biggest list when the cache is large, it results in a dramatic performance improvement. Getting slabinfo statistics now only requires walking the slabs_free and slabs_partial lists, and those lists are usually much smaller than slabs_full. We tested this after growing the dentry cache to 70GB, and the performance improved from 2s to 5ms. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Aruna Ramakrishna <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>