twx-linux/mm
Ricardo Cañuelo Navarro 91f0e576f9 mm,madvise,hugetlb: check for 0-length range after end address adjustment
commit 2ede647a6fde3e54a6bfda7cf01c716649655900 upstream.

Add a sanity check to madvise_dontneed_free() to address a corner case in
madvise where a race condition causes the current vma being processed to
be backed by a different page size.

During a madvise(MADV_DONTNEED) call on a memory region registered with a
userfaultfd, there's a period of time where the process mm lock is
temporarily released in order to send a UFFD_EVENT_REMOVE and let
userspace handle the event.  During this time, the vma covering the
current address range may change due to an explicit mmap done concurrently
by another thread.

If, after that change, the memory region, which was originally backed by
4KB pages, is now backed by hugepages, the end address is rounded down to
a hugepage boundary to avoid data loss (see "Fixes" below).  This rounding
may cause the end address to be truncated to the same address as the
start.

Make this corner case follow the same semantics as in other similar cases
where the requested region has zero length (ie.  return 0).

This will make madvise_walk_vmas() continue to the next vma in the range
(this time holding the process mm lock) which, due to the prev pointer
becoming stale because of the vma change, will be the same hugepage-backed
vma that was just checked before.  The next time madvise_dontneed_free()
runs for this vma, if the start address isn't aligned to a hugepage
boundary, it'll return -EINVAL, which is also in line with the madvise
api.

From userspace perspective, madvise() will return EINVAL because the start
address isn't aligned according to the new vma alignment requirements
(hugepage), even though it was correctly page-aligned when the call was
issued.

Link: https://lkml.kernel.org/r/20250203075206.1452208-1-rcn@igalia.com
Fixes: 8ebe0a5eaaeb ("mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs")
Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Florent Revest <revest@google.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-27 04:10:52 -08:00
..
damon mm/damon/vaddr: fix issue in damon_va_evenly_split_region() 2024-12-14 20:00:21 +01:00
kasan kasan: make report_lock a raw spinlock 2024-12-14 19:59:57 +01:00
kfence kfence: skip __GFP_THISNODE allocations on NUMA systems 2025-02-17 09:40:32 +01:00
kmsan kmsan: do not wipe out origin when doing partial unpoisoning 2024-06-16 13:47:41 +02:00
backing-dev.c blk-wbt: Fix detection of dirty-throttled tasks 2024-02-23 09:25:16 +01:00
balloon_compaction.c
bootmem_info.c
cma_debug.c
cma_sysfs.c
cma.c mm/cma: drop incorrect alignment check in cma_init_reserved_mem 2024-06-16 13:47:42 +02:00
cma.h
compaction.c mm, treewide: introduce NR_PAGE_ORDERS 2024-05-02 16:32:41 +02:00
debug_page_alloc.c
debug_page_ref.c
debug_vm_pgtable.c mm/debug_vm_pgtable: drop RANDOM_ORVALUE trick 2024-08-19 06:04:31 +02:00
debug.c
dmapool_test.c
dmapool.c
early_ioremap.c
fadvise.c mm: remove unnecessary pagevec includes 2023-06-23 16:59:31 -07:00
fail_page_alloc.c
failslab.c
filemap.c cachestat: fix page cache statistics permission checking 2025-02-01 18:37:54 +01:00
folio-compat.c filemap: Add fgf_t typedef 2023-07-24 18:04:30 -04:00
gup_test.c Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes. 2023-06-23 16:58:19 -07:00
gup_test.h
gup.c mm: gup: fix infinite loop within __get_longterm_locked 2025-02-21 13:57:26 +01:00
highmem.c mm: ptep_get() conversion 2023-06-19 16:19:25 -07:00
hmm.c mm: enable page walking API to lock vmas during the walk 2023-08-21 13:07:20 -07:00
huge_memory.c mm/thp: fix deferred split unqueue naming and locking 2024-11-17 15:08:59 +01:00
hugetlb_cgroup.c
hugetlb_vmemmap.c mm: hugetlb_vmemmap: fix a race between vmemmap pmd split 2023-08-18 10:12:14 -07:00
hugetlb_vmemmap.h
hugetlb.c mm: hugetlb: independent PMD page table shared count 2025-01-17 13:36:26 +01:00
hwpoison-inject.c
init-mm.c mm: move dummy_vm_ops out of a header 2023-08-21 13:37:46 -07:00
internal.h Rename .data.once to .data..once to fix resetting WARN*_ONCE 2024-12-09 10:32:59 +01:00
interval_tree.c
io-mapping.c
ioremap.c mm: ioremap: remove unneeded ioremap_allowed and iounmap_allowed 2023-08-18 10:12:36 -07:00
Kconfig mm: z3fold: deprecate CONFIG_Z3FOLD 2024-10-10 11:58:02 +02:00
Kconfig.debug
khugepaged.c mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point 2024-11-01 01:58:26 +01:00
kmemleak.c mm: kmemleak: fix upper boundary check for physical address objects 2025-02-17 09:40:35 +01:00
ksm.c mm/ksm: fix ksm_zero_pages accounting 2024-06-16 13:47:41 +02:00
list_lru.c
maccess.c
madvise.c mm,madvise,hugetlb: check for 0-length range after end address adjustment 2025-02-27 04:10:52 -08:00
Makefile - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
mapping_dirty_helpers.c mm: fix clean_record_shared_mapping_range kernel-doc 2023-08-24 16:20:30 -07:00
memblock.c memblock: use numa_valid_node() helper to check for invalid node ID 2025-01-17 13:36:09 +01:00
memcontrol.c memcg: fix soft lockup in the OOM process 2025-02-27 04:10:45 -08:00
memfd.c revert "memfd: improve userspace warnings for missing exec-related flags". 2023-09-05 11:11:52 -07:00
memory_hotplug.c x86/kaslr: Expose and use the end of the physical memory address space 2024-09-12 11:11:25 +02:00
memory-failure.c mm/memory-failure: use raw_spinlock_t in struct memory_failure_cpu 2024-08-29 17:33:15 +02:00
memory-tiers.c memory tier: use helper macro __ATTR_RW() 2023-08-18 10:12:38 -07:00
memory.c mm: don't install PMD mappings when THPs are disabled by the hw/process/vma 2024-11-08 16:28:27 +01:00
mempolicy.c mm/mempolicy: fix migrate_to_node() assuming there is at least one VMA in a MM 2024-12-14 20:00:18 +01:00
mempool.c
memremap.c
memtest.c memtest: use {READ,WRITE}_ONCE in memory scanning 2024-04-03 15:28:33 +02:00
migrate_device.c Add x86 shadow stack support 2023-08-31 12:20:12 -07:00
migrate.c vmscan,migrate: fix page count imbalance on node stats when demoting pages 2024-11-08 16:28:26 +01:00
mincore.c mm: enable page walking API to lock vmas during the walk 2023-08-21 13:07:20 -07:00
mlock.c merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes 2023-08-21 14:26:20 -07:00
mm_init.c efi: disable mirror feature during crashkernel 2024-01-31 16:18:56 -08:00
mm_slot.h
mmap_lock.c mm: mmap_lock: replace get_memcg_path_buf() with on-stack buffer 2024-08-03 08:54:12 +02:00
mmap.c mm: resolve faulty mmap_region() error path behaviour 2024-11-22 15:38:37 +01:00
mmu_gather.c mm: fix kernel-doc warning from tlb_flush_rmaps() 2023-08-24 16:20:30 -07:00
mmu_notifier.c mmu_notifiers: rename invalidate_range notifier 2023-08-18 10:12:41 -07:00
mmzone.c
mprotect.c mm: refactor map_deny_write_exec() 2024-11-22 15:38:37 +01:00
mremap.c mm/mremap: fix move_normal_pmd/retract_page_tables race 2024-10-22 15:46:21 +02:00
msync.c
nommu.c mm: refactor arch_calc_vm_flag_bits() and arm64 MTE handling 2024-11-22 15:38:37 +01:00
oom_kill.c memcg: fix soft lockup in the OOM process 2025-02-27 04:10:45 -08:00
page_alloc.c mm: page_alloc: move mlocked flag clearance into free_pages_prepare() 2024-12-14 19:59:51 +01:00
page_counter.c
page_ext.c mm/page_ext: move functions around for minor cleanups to page_ext 2023-08-18 10:12:31 -07:00
page_idle.c
page_io.c zswap: make zswap_load() take a folio 2023-08-21 13:37:27 -07:00
page_isolation.c mm/hugetlb: get rid of page_hstate() 2023-08-18 10:12:39 -07:00
page_owner.c mm/page_ext: use page_ext_data helper in page_owner 2023-08-21 13:37:27 -07:00
page_poison.c mm/page_poison: remove unused page_ext.h from page_poison 2023-08-21 13:37:30 -07:00
page_reporting.c mm, treewide: introduce NR_PAGE_ORDERS 2024-05-02 16:32:41 +02:00
page_reporting.h
page_table_check.c mm/page_table_check: support userfault wr-protect entries 2024-08-19 06:04:29 +02:00
page_vma_mapped.c mm: correct stale comment of function check_pte 2023-08-18 10:12:13 -07:00
page-writeback.c Revert "mm/writeback: fix possible divide-by-zero in wb_dirty_limits(), again" 2024-07-11 12:49:16 +02:00
pagewalk.c mm/pagewalk: fix bootstopping regression from extra pte_unmap() 2023-09-02 08:39:21 -07:00
percpu-internal.h percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to reduce false sharing 2023-06-19 16:19:29 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c mm: Introduce flush_cache_vmap_early() 2024-02-16 19:10:52 +01:00
pgalloc-track.h
pgtable-generic.c mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-06-16 13:47:40 +02:00
process_vm_access.c
ptdump.c mm: ptdump should use ptep_get_lockless() 2023-06-19 16:19:24 -07:00
readahead.c mm/readahead: fix large folio support in async readahead 2025-01-09 13:32:08 +01:00
rmap.c mm: hugetlb: add huge page size param to set_huge_pte_at() 2023-09-29 17:20:47 -07:00
rodata_test.c
secretmem.c secretmem: disable memfd_secret() if arch cannot set direct map 2024-10-17 15:24:37 +02:00
shmem_quota.c tmpfs: fix race on handling dquot rbtree 2024-04-03 15:28:54 +02:00
shmem.c Revert "libfs: Add simple_offset_empty()" 2025-02-01 18:37:54 +01:00
show_mem.c mm, treewide: introduce NR_PAGE_ORDERS 2024-05-02 16:32:41 +02:00
shrinker_debug.c
shuffle.c
shuffle.h
slab_common.c mm: krealloc: Fix MTE false alarm in __do_krealloc 2024-11-17 15:08:58 +01:00
slab.c Randomized slab caches for kmalloc() 2023-07-18 10:07:47 +02:00
slab.h mm/slub: Avoid list corruption when removing a slab from the full list 2024-12-09 10:33:06 +01:00
slub.c mm/slub: Avoid list corruption when removing a slab from the full list 2024-12-09 10:33:06 +01:00
sparse-vmemmap.c mm/vmemmap: allow architectures to override how vmemmap optimization works 2023-08-18 10:12:53 -07:00
sparse.c x86/kaslr: Expose and use the end of the physical memory address space 2024-09-12 11:11:25 +02:00
swap_cgroup.c
swap_slots.c
swap_state.c mm/swap: inline folio_set_swap_entry() and folio_swap_entry() 2023-08-24 16:20:28 -07:00
swap.c mm: page_alloc: move mlocked flag clearance into free_pages_prepare() 2024-12-14 19:59:51 +01:00
swap.h mm/swap: fix race when skipping swapcache 2024-03-01 13:35:00 +01:00
swapfile.c mm/swapfile: skip HugeTLB pages for unuse_vma 2024-10-22 15:46:21 +02:00
truncate.c mm: Fix missing folio invalidation calls during truncation 2024-09-04 13:28:23 +02:00
usercopy.c
userfaultfd.c userfaultfd: fix checks for huge PMDs 2024-09-12 11:11:27 +02:00
util.c mm: only enforce minimum stack gap size if it's sensible 2024-10-04 16:30:02 +02:00
vmalloc.c vmalloc: fix accounting with i915 2024-12-27 13:58:53 +01:00
vmpressure.c net-memcg: Fix scope of sockmem pressure indicators 2023-08-16 12:21:32 +01:00
vmscan.c mm: vmscan: account for free pages to prevent infinite Loop in throttle_direct_reclaim() 2025-01-09 13:32:09 +01:00
vmstat.c vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event 2024-12-09 10:33:05 +01:00
workingset.c mm: ratelimit stat flush from workingset shrinker 2024-06-16 13:47:31 +02:00
z3fold.c mm/z3fold: remove obsolete comment for struct z3fold_pool 2023-08-21 13:37:51 -07:00
zbud.c mm: zswap: remove shrink from zpool interface 2023-06-19 16:19:27 -07:00
zpool.c mm: zswap: remove shrink from zpool interface 2023-06-19 16:19:27 -07:00
zsmalloc.c merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes 2023-08-21 14:26:20 -07:00
zswap.c mm: zswap: fix missing folio cleanup in writeback race path 2024-03-01 13:35:10 +01:00