diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-07-15 08:55:10 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-07-15 08:55:10 -0700 |
commit | 6a31ffdfed10dc48e6fd1775d50c22429382ab98 (patch) | |
tree | ab78c04fb92e200a8d0182d953bd588cc08e8a58 /arch/arm64 | |
parent | a5819099f601c1af5b86b1f5921a56859e45b19a (diff) | |
parent | f915a3e5b0182dd7376f11337e231500a157e1f4 (diff) |
Merge branch 'word-at-a-time'
Merge minor word-at-a-time instruction choice improvements for x86 and
arm64.
This is the second of four branches that came out of me looking at the
code generation for path lookup on arm64.
The word-at-a-time infrastructure is used to do string operations in
chunks of one word both when copying the pathname from user space (in
strncpy_from_user()), and when parsing and hashing the individual path
components (in link_path_walk()).
In particular, the "find the first zero byte" uses various bit tricks to
figure out the end of the string or path component, and get the length
without having to do things one byte at a time. Both x86-64 and arm64
had less than optimal code choices for that.
The commit message for the arm64 change in particular tries to explain
the exact code flow for the zero byte finding for people who care. It's
made a bit more complicated by the fact that we support big-endian
hardware too, and so we have some extra abstraction layers to allow
different models for finding the zero byte, quite apart from the issue
of picking specialized instructions.
* word-at-a-time:
arm64: word-at-a-time: improve byte count calculations for LE
x86-64: word-at-a-time: improve byte count calculations
Diffstat (limited to 'arch/arm64')
-rw-r--r-- | arch/arm64/include/asm/word-at-a-time.h | 11 |
1 files changed, 3 insertions, 8 deletions
diff --git a/arch/arm64/include/asm/word-at-a-time.h b/arch/arm64/include/asm/word-at-a-time.h index 14251abee23c..824ca6987a51 100644 --- a/arch/arm64/include/asm/word-at-a-time.h +++ b/arch/arm64/include/asm/word-at-a-time.h @@ -27,20 +27,15 @@ static inline unsigned long has_zero(unsigned long a, unsigned long *bits, } #define prep_zero_mask(a, bits, c) (bits) +#define create_zero_mask(bits) (bits) +#define find_zero(bits) (__ffs(bits) >> 3) -static inline unsigned long create_zero_mask(unsigned long bits) +static inline unsigned long zero_bytemask(unsigned long bits) { bits = (bits - 1) & ~bits; return bits >> 7; } -static inline unsigned long find_zero(unsigned long mask) -{ - return fls64(mask) >> 3; -} - -#define zero_bytemask(mask) (mask) - #else /* __AARCH64EB__ */ #include <asm-generic/word-at-a-time.h> #endif |