diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-19 09:21:03 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-05-19 09:21:03 -0700 |
commit | 61307b7be41a1f1039d1d1368810a1d92cb97b44 (patch) | |
tree | 639e233e177f8618cd5f86daeb7efc6b095890f0 /mm/mmap.c | |
parent | 0450d2083be6bdcd18c9535ac50c55266499b2df (diff) | |
parent | 76edc534cc289308130272a2ac28694fc9b72a03 (diff) |
Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
Diffstat (limited to 'mm/mmap.c')
-rw-r--r-- | mm/mmap.c | 233 |
1 files changed, 138 insertions, 95 deletions
diff --git a/mm/mmap.c b/mm/mmap.c index 3490af70f259..d6d8ab119b72 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1114,21 +1114,21 @@ static struct anon_vma *reusable_anon_vma(struct vm_area_struct *old, struct vm_ */ struct anon_vma *find_mergeable_anon_vma(struct vm_area_struct *vma) { - MA_STATE(mas, &vma->vm_mm->mm_mt, vma->vm_end, vma->vm_end); struct anon_vma *anon_vma = NULL; struct vm_area_struct *prev, *next; + VMA_ITERATOR(vmi, vma->vm_mm, vma->vm_end); /* Try next first. */ - next = mas_walk(&mas); + next = vma_iter_load(&vmi); if (next) { anon_vma = reusable_anon_vma(next, vma, next); if (anon_vma) return anon_vma; } - prev = mas_prev(&mas, 0); + prev = vma_prev(&vmi); VM_BUG_ON_VMA(prev != vma, vma); - prev = mas_prev(&mas, 0); + prev = vma_prev(&vmi); /* Try prev next. */ if (prev) anon_vma = reusable_anon_vma(prev, prev, vma); @@ -1255,18 +1255,6 @@ unsigned long do_mmap(struct file *file, unsigned long addr, if (mm->map_count > sysctl_max_map_count) return -ENOMEM; - /* Obtain the address to map to. we verify (or select) it and ensure - * that it represents a valid section of the address space. - */ - addr = get_unmapped_area(file, addr, len, pgoff, flags); - if (IS_ERR_VALUE(addr)) - return addr; - - if (flags & MAP_FIXED_NOREPLACE) { - if (find_vma_intersection(mm, addr, addr + len)) - return -EEXIST; - } - if (prot == PROT_EXEC) { pkey = execute_only_pkey(mm); if (pkey < 0) @@ -1280,6 +1268,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr, vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) | mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; + /* Obtain the address to map to. we verify (or select) it and ensure + * that it represents a valid section of the address space. + */ + addr = __get_unmapped_area(file, addr, len, pgoff, flags, vm_flags); + if (IS_ERR_VALUE(addr)) + return addr; + + if (flags & MAP_FIXED_NOREPLACE) { + if (find_vma_intersection(mm, addr, addr + len)) + return -EEXIST; + } + if (flags & MAP_LOCKED) if (!can_do_mlock()) return -EPERM; @@ -1516,32 +1516,32 @@ bool vma_needs_dirty_tracking(struct vm_area_struct *vma) * to the private version (using protection_map[] without the * VM_SHARED bit). */ -int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) +bool vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) { /* If it was private or non-writable, the write bit is already clear */ if (!vma_is_shared_writable(vma)) - return 0; + return false; /* The backer wishes to know when pages are first written to? */ if (vm_ops_needs_writenotify(vma->vm_ops)) - return 1; + return true; /* The open routine did something to the protections that pgprot_modify * won't preserve? */ if (pgprot_val(vm_page_prot) != pgprot_val(vm_pgprot_modify(vm_page_prot, vma->vm_flags))) - return 0; + return false; /* * Do we need to track softdirty? hugetlb does not support softdirty * tracking yet. */ if (vma_soft_dirty_enabled(vma) && !is_vm_hugetlb_page(vma)) - return 1; + return true; /* Do we need write faults for uffd-wp tracking? */ if (userfaultfd_wp(vma)) - return 1; + return true; /* Can the mapping track the dirty pages? */ return vma_fs_can_writeback(vma); @@ -1551,14 +1551,14 @@ int vma_wants_writenotify(struct vm_area_struct *vma, pgprot_t vm_page_prot) * We account for memory if it's a private writeable mapping, * not hugepages and VM_NORESERVE wasn't set. */ -static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags) +static inline bool accountable_mapping(struct file *file, vm_flags_t vm_flags) { /* * hugetlb has its own accounting separate from the core VM * VM_HUGETLB may not be set yet so we cannot check for that flag. */ if (file && is_file_hugepages(file)) - return 0; + return false; return (vm_flags & (VM_NORESERVE | VM_SHARED | VM_WRITE)) == VM_WRITE; } @@ -1578,11 +1578,10 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info) unsigned long length, gap; unsigned long low_limit, high_limit; struct vm_area_struct *tmp; - - MA_STATE(mas, ¤t->mm->mm_mt, 0, 0); + VMA_ITERATOR(vmi, current->mm, 0); /* Adjust search length to account for worst case alignment overhead */ - length = info->length + info->align_mask; + length = info->length + info->align_mask + info->start_gap; if (length < info->length) return -ENOMEM; @@ -1591,23 +1590,29 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info) low_limit = mmap_min_addr; high_limit = info->high_limit; retry: - if (mas_empty_area(&mas, low_limit, high_limit - 1, length)) + if (vma_iter_area_lowest(&vmi, low_limit, high_limit, length)) return -ENOMEM; - gap = mas.index; + /* + * Adjust for the gap first so it doesn't interfere with the + * later alignment. The first step is the minimum needed to + * fulill the start gap, the next steps is the minimum to align + * that. It is the minimum needed to fulill both. + */ + gap = vma_iter_addr(&vmi) + info->start_gap; gap += (info->align_offset - gap) & info->align_mask; - tmp = mas_next(&mas, ULONG_MAX); + tmp = vma_next(&vmi); if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */ if (vm_start_gap(tmp) < gap + length - 1) { low_limit = tmp->vm_end; - mas_reset(&mas); + vma_iter_reset(&vmi); goto retry; } } else { - tmp = mas_prev(&mas, 0); + tmp = vma_prev(&vmi); if (tmp && vm_end_gap(tmp) > gap) { low_limit = vm_end_gap(tmp); - mas_reset(&mas); + vma_iter_reset(&vmi); goto retry; } } @@ -1630,10 +1635,10 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info) unsigned long length, gap, gap_end; unsigned long low_limit, high_limit; struct vm_area_struct *tmp; + VMA_ITERATOR(vmi, current->mm, 0); - MA_STATE(mas, ¤t->mm->mm_mt, 0, 0); /* Adjust search length to account for worst case alignment overhead */ - length = info->length + info->align_mask; + length = info->length + info->align_mask + info->start_gap; if (length < info->length) return -ENOMEM; @@ -1642,24 +1647,24 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info) low_limit = mmap_min_addr; high_limit = info->high_limit; retry: - if (mas_empty_area_rev(&mas, low_limit, high_limit - 1, length)) + if (vma_iter_area_highest(&vmi, low_limit, high_limit, length)) return -ENOMEM; - gap = mas.last + 1 - info->length; + gap = vma_iter_end(&vmi) - info->length; gap -= (gap - info->align_offset) & info->align_mask; - gap_end = mas.last; - tmp = mas_next(&mas, ULONG_MAX); + gap_end = vma_iter_end(&vmi); + tmp = vma_next(&vmi); if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */ - if (vm_start_gap(tmp) <= gap_end) { + if (vm_start_gap(tmp) < gap_end) { high_limit = vm_start_gap(tmp); - mas_reset(&mas); + vma_iter_reset(&vmi); goto retry; } } else { - tmp = mas_prev(&mas, 0); + tmp = vma_prev(&vmi); if (tmp && vm_end_gap(tmp) > gap) { high_limit = tmp->vm_start; - mas_reset(&mas); + vma_iter_reset(&vmi); goto retry; } } @@ -1707,7 +1712,7 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr, { struct mm_struct *mm = current->mm; struct vm_area_struct *vma, *prev; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info = {}; const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags); if (len > mmap_end - mmap_min_addr) @@ -1725,12 +1730,9 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr, return addr; } - info.flags = 0; info.length = len; info.low_limit = mm->mmap_base; info.high_limit = mmap_end; - info.align_mask = 0; - info.align_offset = 0; return vm_unmapped_area(&info); } @@ -1755,7 +1757,7 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr, { struct vm_area_struct *vma, *prev; struct mm_struct *mm = current->mm; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info = {}; const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags); /* requested length too big for entire address space */ @@ -1779,8 +1781,6 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr, info.length = len; info.low_limit = PAGE_SIZE; info.high_limit = arch_get_mmap_base(addr, mm->mmap_base); - info.align_mask = 0; - info.align_offset = 0; addr = vm_unmapped_area(&info); /* @@ -1810,12 +1810,41 @@ arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr, } #endif +#ifndef HAVE_ARCH_UNMAPPED_AREA_VMFLAGS +unsigned long +arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) +{ + return arch_get_unmapped_area(filp, addr, len, pgoff, flags); +} + unsigned long -get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, - unsigned long pgoff, unsigned long flags) +arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t vm_flags) +{ + return arch_get_unmapped_area_topdown(filp, addr, len, pgoff, flags); +} +#endif + +unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, struct file *filp, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags, + vm_flags_t vm_flags) +{ + if (test_bit(MMF_TOPDOWN, &mm->flags)) + return arch_get_unmapped_area_topdown_vmflags(filp, addr, len, pgoff, + flags, vm_flags); + return arch_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, vm_flags); +} + +unsigned long +__get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) { unsigned long (*get_area)(struct file *, unsigned long, - unsigned long, unsigned long, unsigned long); + unsigned long, unsigned long, unsigned long) + = NULL; unsigned long error = arch_mmap_check(addr, len, flags); if (error) @@ -1825,7 +1854,6 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, if (len > TASK_SIZE) return -ENOMEM; - get_area = current->mm->get_unmapped_area; if (file) { if (file->f_op->get_unmapped_area) get_area = file->f_op->get_unmapped_area; @@ -1835,16 +1863,22 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, * so use shmem's get_unmapped_area in case it can be huge. */ get_area = shmem_get_unmapped_area; - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) { - /* Ensures that larger anonymous mappings are THP aligned. */ - get_area = thp_get_unmapped_area; } /* Always treat pgoff as zero for anonymous memory. */ if (!file) pgoff = 0; - addr = get_area(file, addr, len, pgoff, flags); + if (get_area) { + addr = get_area(file, addr, len, pgoff, flags); + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) { + /* Ensures that larger anonymous mappings are THP aligned. */ + addr = thp_get_unmapped_area_vmflags(file, addr, len, + pgoff, flags, vm_flags); + } else { + addr = mm_get_unmapped_area_vmflags(current->mm, file, addr, len, + pgoff, flags, vm_flags); + } if (IS_ERR_VALUE(addr)) return addr; @@ -1857,7 +1891,16 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, return error ? error : addr; } -EXPORT_SYMBOL(get_unmapped_area); +unsigned long +mm_get_unmapped_area(struct mm_struct *mm, struct file *file, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + if (test_bit(MMF_TOPDOWN, &mm->flags)) + return arch_get_unmapped_area_topdown(file, addr, len, pgoff, flags); + return arch_get_unmapped_area(file, addr, len, pgoff, flags); +} +EXPORT_SYMBOL(mm_get_unmapped_area); /** * find_vma_intersection() - Look up the first VMA which intersects the interval @@ -1914,12 +1957,12 @@ find_vma_prev(struct mm_struct *mm, unsigned long addr, struct vm_area_struct **pprev) { struct vm_area_struct *vma; - MA_STATE(mas, &mm->mm_mt, addr, addr); + VMA_ITERATOR(vmi, mm, addr); - vma = mas_walk(&mas); - *pprev = mas_prev(&mas, 0); + vma = vma_iter_load(&vmi); + *pprev = vma_prev(&vmi); if (!vma) - vma = mas_next(&mas, ULONG_MAX); + vma = vma_next(&vmi); return vma; } @@ -1973,7 +2016,7 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address) struct vm_area_struct *next; unsigned long gap_addr; int error = 0; - MA_STATE(mas, &mm->mm_mt, vma->vm_start, address); + VMA_ITERATOR(vmi, mm, vma->vm_start); if (!(vma->vm_flags & VM_GROWSUP)) return -EFAULT; @@ -1999,15 +2042,15 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address) } if (next) - mas_prev_range(&mas, address); + vma_iter_prev_range_limit(&vmi, address); - __mas_set_range(&mas, vma->vm_start, address - 1); - if (mas_preallocate(&mas, vma, GFP_KERNEL)) + vma_iter_config(&vmi, vma->vm_start, address); + if (vma_iter_prealloc(&vmi, vma)) return -ENOMEM; /* We must make sure the anon_vma is allocated. */ if (unlikely(anon_vma_prepare(vma))) { - mas_destroy(&mas); + vma_iter_free(&vmi); return -ENOMEM; } @@ -2047,7 +2090,7 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address) anon_vma_interval_tree_pre_update_vma(vma); vma->vm_end = address; /* Overwrite old entry in mtree. */ - mas_store_prealloc(&mas, vma); + vma_iter_store(&vmi, vma); anon_vma_interval_tree_post_update_vma(vma); spin_unlock(&mm->page_table_lock); @@ -2056,7 +2099,7 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address) } } anon_vma_unlock_write(vma->anon_vma); - mas_destroy(&mas); + vma_iter_free(&vmi); validate_mm(mm); return error; } @@ -2069,9 +2112,9 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address) int expand_downwards(struct vm_area_struct *vma, unsigned long address) { struct mm_struct *mm = vma->vm_mm; - MA_STATE(mas, &mm->mm_mt, vma->vm_start, vma->vm_start); struct vm_area_struct *prev; int error = 0; + VMA_ITERATOR(vmi, mm, vma->vm_start); if (!(vma->vm_flags & VM_GROWSDOWN)) return -EFAULT; @@ -2081,7 +2124,7 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address) return -EPERM; /* Enforce stack_guard_gap */ - prev = mas_prev(&mas, 0); + prev = vma_prev(&vmi); /* Check that both stack segments have the same anon_vma? */ if (prev) { if (!(prev->vm_flags & VM_GROWSDOWN) && @@ -2091,15 +2134,15 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address) } if (prev) - mas_next_range(&mas, vma->vm_start); + vma_iter_next_range_limit(&vmi, vma->vm_start); - __mas_set_range(&mas, address, vma->vm_end - 1); - if (mas_preallocate(&mas, vma, GFP_KERNEL)) + vma_iter_config(&vmi, address, vma->vm_end); + if (vma_iter_prealloc(&vmi, vma)) return -ENOMEM; /* We must make sure the anon_vma is allocated. */ if (unlikely(anon_vma_prepare(vma))) { - mas_destroy(&mas); + vma_iter_free(&vmi); return -ENOMEM; } @@ -2140,7 +2183,7 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address) vma->vm_start = address; vma->vm_pgoff -= grow; /* Overwrite old entry in mtree. */ - mas_store_prealloc(&mas, vma); + vma_iter_store(&vmi, vma); anon_vma_interval_tree_post_update_vma(vma); spin_unlock(&mm->page_table_lock); @@ -2149,7 +2192,7 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address) } } anon_vma_unlock_write(vma->anon_vma); - mas_destroy(&mas); + vma_iter_free(&vmi); validate_mm(mm); return error; } @@ -3244,7 +3287,7 @@ void exit_mmap(struct mm_struct *mm) struct mmu_gather tlb; struct vm_area_struct *vma; unsigned long nr_accounted = 0; - MA_STATE(mas, &mm->mm_mt, 0, 0); + VMA_ITERATOR(vmi, mm, 0); int count = 0; /* mm's last user has gone, and its about to be pulled down */ @@ -3253,7 +3296,7 @@ void exit_mmap(struct mm_struct *mm) mmap_read_lock(mm); arch_exit_mmap(mm); - vma = mas_find(&mas, ULONG_MAX); + vma = vma_next(&vmi); if (!vma || unlikely(xa_is_zero(vma))) { /* Can happen if dup_mmap() received an OOM */ mmap_read_unlock(mm); @@ -3266,7 +3309,7 @@ void exit_mmap(struct mm_struct *mm) tlb_gather_mmu_fullmm(&tlb, mm); /* update_hiwater_rss(mm) here? but nobody should be looking */ /* Use ULONG_MAX here to ensure all VMAs in the mm are unmapped */ - unmap_vmas(&tlb, &mas, vma, 0, ULONG_MAX, ULONG_MAX, false); + unmap_vmas(&tlb, &vmi.mas, vma, 0, ULONG_MAX, ULONG_MAX, false); mmap_read_unlock(mm); /* @@ -3276,8 +3319,8 @@ void exit_mmap(struct mm_struct *mm) set_bit(MMF_OOM_SKIP, &mm->flags); mmap_write_lock(mm); mt_clear_in_rcu(&mm->mm_mt); - mas_set(&mas, vma->vm_end); - free_pgtables(&tlb, &mas, vma, FIRST_USER_ADDRESS, + vma_iter_set(&vmi, vma->vm_end); + free_pgtables(&tlb, &vmi.mas, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING, true); tlb_finish_mmu(&tlb); @@ -3286,14 +3329,14 @@ void exit_mmap(struct mm_struct *mm) * enabled, without holding any MM locks besides the unreachable * mmap_write_lock. */ - mas_set(&mas, vma->vm_end); + vma_iter_set(&vmi, vma->vm_end); do { if (vma->vm_flags & VM_ACCOUNT) nr_accounted += vma_pages(vma); remove_vma(vma, true); count++; cond_resched(); - vma = mas_find(&mas, ULONG_MAX); + vma = vma_next(&vmi); } while (vma && likely(!xa_is_zero(vma))); BUG_ON(count != mm->map_count); @@ -3715,7 +3758,7 @@ int mm_take_all_locks(struct mm_struct *mm) { struct vm_area_struct *vma; struct anon_vma_chain *avc; - MA_STATE(mas, &mm->mm_mt, 0, 0); + VMA_ITERATOR(vmi, mm, 0); mmap_assert_write_locked(mm); @@ -3727,14 +3770,14 @@ int mm_take_all_locks(struct mm_struct *mm) * being written to until mmap_write_unlock() or mmap_write_downgrade() * is reached. */ - mas_for_each(&mas, vma, ULONG_MAX) { + for_each_vma(vmi, vma) { if (signal_pending(current)) goto out_unlock; vma_start_write(vma); } - mas_set(&mas, 0); - mas_for_each(&mas, vma, ULONG_MAX) { + vma_iter_init(&vmi, mm, 0); + for_each_vma(vmi, vma) { if (signal_pending(current)) goto out_unlock; if (vma->vm_file && vma->vm_file->f_mapping && @@ -3742,8 +3785,8 @@ int mm_take_all_locks(struct mm_struct *mm) vm_lock_mapping(mm, vma->vm_file->f_mapping); } - mas_set(&mas, 0); - mas_for_each(&mas, vma, ULONG_MAX) { + vma_iter_init(&vmi, mm, 0); + for_each_vma(vmi, vma) { if (signal_pending(current)) goto out_unlock; if (vma->vm_file && vma->vm_file->f_mapping && @@ -3751,8 +3794,8 @@ int mm_take_all_locks(struct mm_struct *mm) vm_lock_mapping(mm, vma->vm_file->f_mapping); } - mas_set(&mas, 0); - mas_for_each(&mas, vma, ULONG_MAX) { + vma_iter_init(&vmi, mm, 0); + for_each_vma(vmi, vma) { if (signal_pending(current)) goto out_unlock; if (vma->anon_vma) @@ -3811,12 +3854,12 @@ void mm_drop_all_locks(struct mm_struct *mm) { struct vm_area_struct *vma; struct anon_vma_chain *avc; - MA_STATE(mas, &mm->mm_mt, 0, 0); + VMA_ITERATOR(vmi, mm, 0); mmap_assert_write_locked(mm); BUG_ON(!mutex_is_locked(&mm_all_locks_mutex)); - mas_for_each(&mas, vma, ULONG_MAX) { + for_each_vma(vmi, vma) { if (vma->anon_vma) list_for_each_entry(avc, &vma->anon_vma_chain, same_vma) vm_unlock_anon_vma(avc->anon_vma); |