diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-03-11 09:38:17 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-03-11 09:38:17 -0700 |
commit | 7ea65c89d864f1e9aa892eec85625a96fa9acee6 (patch) | |
tree | 423ab3b73e04a64c37c830ceaa7d6a82748eadbf /fs/overlayfs | |
parent | 97ec9715a84e6f0979242e1ea98b9af1a39acf3b (diff) | |
parent | 09406ad8e5105729291a7639160e0cd51c9e0c6c (diff) |
Merge tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Misc features, cleanups, and fixes for vfs and individual filesystems.
Features:
- Support idmapped mounts for hugetlbfs.
- Add RWF_NOAPPEND flag for pwritev2(). This allows us to fix a bug
where the passed offset is ignored if the file is O_APPEND. The new
flag allows a caller to enforce that the offset is honored to
conform to posix even if the file was opened in append mode.
- Move i_mmap_rwsem in struct address_space to avoid false sharing
between i_mmap and i_mmap_rwsem.
- Convert efs, qnx4, and coda to use the new mount api.
- Add a generic is_dot_dotdot() helper that's used by various
filesystems and the VFS code instead of open-coding it multiple
times.
- Recently we've added stable offsets which allows stable ordering
when iterating directories exported through NFS on e.g., tmpfs
filesystems. Originally an xarray was used for the offset map but
that caused slab fragmentation issues over time. This switches the
offset map to the maple tree which has a dense mode that handles
this scenario a lot better. Includes tests.
- Finally merge the case-insensitive improvement series Gabriel has
been working on for a long time. This cleanly propagates case
insensitive operations through ->s_d_op which in turn allows us to
remove the quite ugly generic_set_encrypted_ci_d_ops() operations.
It also improves performance by trying a case-sensitive comparison
first and then fallback to case-insensitive lookup if that fails.
This also fixes a bug where overlayfs would be able to be mounted
over a case insensitive directory which would lead to all sort of
odd behaviors.
Cleanups:
- Make file_dentry() a simple accessor now that ->d_real() is
simplified because of the backing file work we did the last two
cycles.
- Use the dedicated file_mnt_idmap helper in ntfs3.
- Use smp_load_acquire/store_release() in the i_size_read/write
helpers and thus remove the hack to handle i_size reads in the
filemap code.
- The SLAB_MEM_SPREAD is a nop now. Remove it from various places in
fs/
- It's no longer necessary to perform a second built-in initramfs
unpack call because we retain the contents of the previous
extraction. Remove it.
- Now that we have removed various allocators kfree_rcu() always
works with kmem caches and kmalloc(). So simplify various places
that only use an rcu callback in order to handle the kmem cache
case.
- Convert the pipe code to use a lockdep comparison function instead
of open-coding the nesting making lockdep validation easier.
- Move code into fs-writeback.c that was located in a header but can
be made static as it's only used in that one file.
- Rewrite the alignment checking iterators for iovec and bvec to be
easier to read, and also significantly more compact in terms of
generated code. This saves 270 bytes of text on x86-64 (with
clang-18) and 224 bytes on arm64 (with gcc-13). In profiles it also
saves a bit of time for the same workload.
- Switch various places to use KMEM_CACHE instead of
kmem_cache_create().
- Use inode_set_ctime_to_ts() in inode_set_ctime_current()
- Use kzalloc() in name_to_handle_at() to avoid kernel infoleak.
- Various smaller cleanups for eventfds.
Fixes:
- Fix various comments and typos, and unneeded initializations.
- Fix stack allocation hack for clang in the select code.
- Improve dump_mapping() debug code on a best-effort basis.
- Fix build errors in various selftests.
- Avoid wrap-around instrumentation in various places.
- Don't allow user namespaces without an idmapping to be used for
idmapped mounts.
- Fix sysv sb_read() call.
- Fix fallback implementation of the get_name() export operation"
* tag 'vfs-6.9.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (70 commits)
hugetlbfs: support idmapped mounts
qnx4: convert qnx4 to use the new mount api
fs: use inode_set_ctime_to_ts to set inode ctime to current time
libfs: Drop generic_set_encrypted_ci_d_ops
ubifs: Configure dentry operations at dentry-creation time
f2fs: Configure dentry operations at dentry-creation time
ext4: Configure dentry operations at dentry-creation time
libfs: Add helper to choose dentry operations at mount-time
libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops
fscrypt: Drop d_revalidate once the key is added
fscrypt: Drop d_revalidate for valid dentries during lookup
fscrypt: Factor out a helper to configure the lookup dentry
ovl: Always reject mounting over case-insensitive directories
libfs: Attempt exact-match comparison first during casefolded lookup
efs: remove SLAB_MEM_SPREAD flag usage
jfs: remove SLAB_MEM_SPREAD flag usage
minix: remove SLAB_MEM_SPREAD flag usage
openpromfs: remove SLAB_MEM_SPREAD flag usage
proc: remove SLAB_MEM_SPREAD flag usage
qnx6: remove SLAB_MEM_SPREAD flag usage
...
Diffstat (limited to 'fs/overlayfs')
-rw-r--r-- | fs/overlayfs/params.c | 14 | ||||
-rw-r--r-- | fs/overlayfs/super.c | 52 |
2 files changed, 33 insertions, 33 deletions
diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c index 112b4b12f825..36dcc530ac28 100644 --- a/fs/overlayfs/params.c +++ b/fs/overlayfs/params.c @@ -280,12 +280,20 @@ static int ovl_mount_dir_check(struct fs_context *fc, const struct path *path, { struct ovl_fs_context *ctx = fc->fs_private; - if (ovl_dentry_weird(path->dentry)) - return invalfc(fc, "filesystem on %s not supported", name); - if (!d_is_dir(path->dentry)) return invalfc(fc, "%s is not a directory", name); + /* + * Root dentries of case-insensitive capable filesystems might + * not have the dentry operations set, but still be incompatible + * with overlayfs. Check explicitly to prevent post-mount + * failures. + */ + if (sb_has_encoding(path->mnt->mnt_sb)) + return invalfc(fc, "case-insensitive capable filesystem on %s not supported", name); + + if (ovl_dentry_weird(path->dentry)) + return invalfc(fc, "filesystem on %s not supported", name); /* * Check whether upper path is read-only here to report failures diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index 2eef6c70b2ae..36d4b8b1f784 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -28,41 +28,38 @@ MODULE_LICENSE("GPL"); struct ovl_dir_cache; -static struct dentry *ovl_d_real(struct dentry *dentry, - const struct inode *inode) +static struct dentry *ovl_d_real(struct dentry *dentry, enum d_real_type type) { - struct dentry *real = NULL, *lower; + struct dentry *upper, *lower; int err; - /* - * vfs is only expected to call d_real() with NULL from d_real_inode() - * and with overlay inode from file_dentry() on an overlay file. - * - * TODO: remove @inode argument from d_real() API, remove code in this - * function that deals with non-NULL @inode and remove d_real() call - * from file_dentry(). - */ - if (inode && d_inode(dentry) == inode) - return dentry; - else if (inode) + switch (type) { + case D_REAL_DATA: + case D_REAL_METADATA: + break; + default: goto bug; + } if (!d_is_reg(dentry)) { /* d_real_inode() is only relevant for regular files */ return dentry; } - real = ovl_dentry_upper(dentry); - if (real && (inode == d_inode(real))) - return real; + upper = ovl_dentry_upper(dentry); + if (upper && (type == D_REAL_METADATA || + ovl_has_upperdata(d_inode(dentry)))) + return upper; - if (real && !inode && ovl_has_upperdata(d_inode(dentry))) - return real; + if (type == D_REAL_METADATA) { + lower = ovl_dentry_lower(dentry); + goto real_lower; + } /* - * Best effort lazy lookup of lowerdata for !inode case to return + * Best effort lazy lookup of lowerdata for D_REAL_DATA case to return * the real lowerdata dentry. The only current caller of d_real() with - * NULL inode is d_real_inode() from trace_uprobe and this caller is + * D_REAL_DATA is d_real_inode() from trace_uprobe and this caller is * likely going to be followed reading from the file, before placing * uprobes on offset within the file, so lowerdata should be available * when setting the uprobe. @@ -73,18 +70,13 @@ static struct dentry *ovl_d_real(struct dentry *dentry, lower = ovl_dentry_lowerdata(dentry); if (!lower) goto bug; - real = lower; - /* Handle recursion */ - real = d_real(real, inode); +real_lower: + /* Handle recursion into stacked lower fs */ + return d_real(lower, type); - if (!inode || inode == d_inode(real)) - return real; bug: - WARN(1, "%s(%pd4, %s:%lu): real dentry (%p/%lu) not found\n", - __func__, dentry, inode ? inode->i_sb->s_id : "NULL", - inode ? inode->i_ino : 0, real, - real && d_inode(real) ? d_inode(real)->i_ino : 0); + WARN(1, "%s(%pd4, %d): real dentry not found\n", __func__, dentry, type); return dentry; } |