kernel-hacking-2024-linux-s.../mm
Johannes Weiner 6a7b95481d vmscan: order evictable rescue in LRU putback
Isolators putting a page back to the LRU do not hold the page lock, and if
the page is mlocked, another thread might munlock it concurrently.

Expecting this, the putback code re-checks the evictability of a page when
it just moved it to the unevictable list in order to correct its decision.

The problem, however, is that ordering is not garuanteed between setting
PG_lru when moving the page to the list and checking PG_mlocked
afterwards:

	#0:				#1

	spin_lock()
					if (TestClearPageMlocked())
					  if (PageLRU())
					    move to evictable list
	SetPageLRU()
	spin_unlock()
	if (!PageMlocked())
	  move to evictable list

The PageMlocked() check may get reordered before SetPageLRU() in #0,
resulting in #0 not moving the still mlocked page, and in #1 failing to
isolate and move the page as well.  The page is now stranded on the
unevictable list.

The race condition is very unlikely.  The consequence currently is one
page falling off the reclaim grid and eventually getting freed with
PG_unevictable set, which triggers a warning in the page allocator.

TestClearPageMlocked() in #1 already provides full memory barrier
semantics.

This patch adds an explicit full barrier to force ordering between
SetPageLRU() and PageMlocked() so that either one of the competitors
rescues the page.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-29 07:39:30 -07:00
..
allocpercpu.c percpu: use dynamic percpu allocator as the default percpu allocator 2009-06-24 15:13:35 +09:00
backing-dev.c writeback: kill space in debugfs item name 2009-10-09 13:01:27 +02:00
bootmem.c kmemleak: Do not report alloc_bootmem blocks as leaks 2009-08-27 14:29:17 +01:00
bounce.c
debug-pagealloc.c
dmapool.c dmapools: protect page_list walk in show_pools() 2009-06-30 18:56:00 -07:00
fadvise.c
failslab.c
filemap.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
filemap_xip.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
fremap.c
highmem.c
hugetlb.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
hwpoison-inject.c HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs 2009-09-16 11:50:17 +02:00
init-mm.c
internal.h mm: move highest_memmap_pfn 2009-09-22 07:17:41 -07:00
Kconfig ksm: more on default values 2009-10-08 07:36:38 -07:00
Kconfig.debug trivial: improve help text for mm debug config options 2009-09-21 15:14:57 +02:00
kmemcheck.c
kmemleak-test.c percpu: clean up percpu variable definitions 2009-06-24 15:13:48 +09:00
kmemleak.c kmemleak: Check for NULL pointer returned by create_object() 2009-10-09 13:28:47 -07:00
ksm.c ksm: more on default values 2009-10-08 07:36:38 -07:00
maccess.c
madvise.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2009-09-24 07:53:22 -07:00
Makefile procfs: disable per-task stack usage on NOMMU 2009-09-24 17:11:24 -07:00
memcontrol.c memcg: reduce check for softlimit excess 2009-10-01 16:11:13 -07:00
memory-failure.c hwpoison: fix oops on ksm pages 2009-10-29 07:39:24 -07:00
memory.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-09-24 08:32:11 -07:00
memory_hotplug.c walk system ram range 2009-09-23 07:39:41 -07:00
mempolicy.c do_mbind(): fix memory leak 2009-10-29 07:39:29 -07:00
mempool.c mm: remove broken 'kzalloc' mempool 2009-09-22 07:17:35 -07:00
migrate.c Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 2009-09-24 07:53:22 -07:00
mincore.c
mlock.c mm: m(un)lock avoid ZERO_PAGE 2009-09-22 07:17:40 -07:00
mm_init.c
mmap.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
mmu_context.c mm: reduce atomic use on use_mm fast path 2009-09-22 07:17:42 -07:00
mmu_notifier.c ksm: add mmu_notifier set_pte_at_notify() 2009-09-22 07:17:31 -07:00
mmzone.c
mprotect.c perf: Do the big rename: Performance Counters -> Performance Events 2009-09-21 14:28:04 +02:00
mremap.c truncate: new helpers 2009-09-24 08:41:47 -04:00
msync.c
nommu.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
oom_kill.c oom: oom_kill doesn't kill vfork parent (or child) 2009-09-22 07:17:39 -07:00
page-writeback.c writeback: account IO throttling wait as iowait 2009-10-09 12:40:42 +02:00
page_alloc.c revert "mm: oom analysis: add buffer cache information to show_free_areas()" 2009-10-29 07:39:27 -07:00
page_cgroup.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
page_io.c mm: remove file argument from swap_readpage() 2009-06-16 19:47:44 -07:00
page_isolation.c
pagewalk.c
percpu.c percpu: fix compile warnings 2009-10-12 17:04:42 +09:00
prio_tree.c
quicklist.c cpumask: use new-style cpumask ops in mm/quicklist. 2009-09-24 09:34:52 +09:30
readahead.c
rmap.c mm/rmap.c: fix comment 2009-10-01 16:11:12 -07:00
shmem.c const: mark struct vm_struct_operations 2009-09-27 11:39:25 -07:00
shmem_acl.c shmfs: use 'check_acl' instead of 'permission' 2009-09-08 11:08:46 -07:00
slab.c mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
slob.c slab: remove duplicate kmem_cache_init_late() declarations 2009-08-06 11:36:25 +03:00
slub.c mm: kmem_cache_create(): make it easier to catch NULL cache names 2009-09-22 07:17:33 -07:00
sparse-vmemmap.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
sparse.c memory hotplug: alloc page from other node in memory online 2009-09-22 07:17:26 -07:00
swap.c mm: replace various uses of num_physpages by totalram_pages 2009-09-22 07:17:38 -07:00
swap_state.c mm: add_to_swap_cache() does not return -EEXIST 2009-09-22 07:17:35 -07:00
swapfile.c swapfile: avoid NULL pointer dereference in swapon when s_bdev is NULL 2009-10-01 21:15:46 +02:00
thrash.c mm: pass mm to grab_swap_token 2009-06-23 12:50:05 -07:00
truncate.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2009-09-24 08:32:11 -07:00
util.c Merge branches 'slab/documentation', 'slab/fixes', 'slob/cleanups' and 'slub/fixes' into for-linus 2009-06-17 08:30:15 +03:00
vmalloc.c headers: remove sched.h from interrupt.h 2009-10-11 11:20:58 -07:00
vmscan.c vmscan: order evictable rescue in LRU putback 2009-10-29 07:39:30 -07:00
vmstat.c mm: vmstat: add isolate pages 2009-09-22 07:17:29 -07:00