diff options
author | Ryan Roberts <ryan.roberts@arm.com> | 2023-10-05 15:07:30 +0100 |
---|---|---|
committer | Catalin Marinas <catalin.marinas@arm.com> | 2023-10-16 18:27:31 +0100 |
commit | 3425cec42c3ce0f65fe74e412756b567b152e61d (patch) | |
tree | 9cc6580d592083f499ed44638959e554704da67b /arch/arm64/kernel | |
parent | 65033574ade97afccba074d837fd269903a83a9a (diff) |
arm64/mm: Hoist synchronization out of set_ptes() loop
set_ptes() sets a physically contiguous block of memory (which all
belongs to the same folio) to a contiguous block of ptes. The arm64
implementation of this previously just looped, operating on each
individual pte. But the __sync_icache_dcache() and mte_sync_tags()
operations can both be hoisted out of the loop so that they are
performed once for the contiguous set of pages (which may be less than
the whole folio). This should result in minor performance gains.
__sync_icache_dcache() already acts on the whole folio, and sets a flag
in the folio so that it skips duplicate calls. But by hoisting the call,
all the pte testing is done only once.
mte_sync_tags() operates on each individual page with its own loop. But
by passing the number of pages explicitly, we can rely solely on its
loop and do the checks only once. This approach also makes it robust for
the future, rather than assuming if a head page of a compound page is
being mapped, then the whole compound page is being mapped, instead we
explicitly know how many pages are being mapped. The old assumption may
not continue to hold once the "anonymous large folios" feature is
merged.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20231005140730.2191134-1-ryan.roberts@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Diffstat (limited to 'arch/arm64/kernel')
-rw-r--r-- | arch/arm64/kernel/mte.c | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index 4edecaac8f91..2fb5e7a7a4d5 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -35,10 +35,10 @@ DEFINE_STATIC_KEY_FALSE(mte_async_or_asymm_mode); EXPORT_SYMBOL_GPL(mte_async_or_asymm_mode); #endif -void mte_sync_tags(pte_t pte) +void mte_sync_tags(pte_t pte, unsigned int nr_pages) { struct page *page = pte_page(pte); - long i, nr_pages = compound_nr(page); + unsigned int i; /* if PG_mte_tagged is set, tags have already been initialised */ for (i = 0; i < nr_pages; i++, page++) { |