diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-06-26 10:14:36 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-06-26 10:14:36 -0700 |
commit | 1f2300a7382119a857cc09e95db6e5d6fd813163 (patch) | |
tree | 01f2237be94e29f542a0aec89b9faf1a85bf77ce /fs/namei.c | |
parent | 2eedfa9e27ed7b22d9c06d8d072ad2dbce4fd635 (diff) | |
parent | bc2473c90fca55bf95b2ab6af1dacee26a4f92f6 (diff) |
Merge tag 'v6.5/vfs.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs file handling updates from Christian Brauner:
"This contains Amir's work to fix a long-standing problem where an
unprivileged overlayfs mount can be used to avoid fanotify permission
events that were requested for an inode or superblock on the
underlying filesystem.
Some background about files opened in overlayfs. If a file is opened
in overlayfs @file->f_path will refer to a "fake" path. What this
means is that while @file->f_inode will refer to inode of the
underlying layer, @file->f_path refers to an overlayfs
{dentry,vfsmount} pair. The reasons for doing this are out of scope
here but it is the reason why the vfs has been providing the
open_with_fake_path() helper for overlayfs for very long time now. So
nothing new here.
This is for sure not very elegant and everyone including the overlayfs
maintainers agree. Improving this significantly would involve more
fragile and potentially rather invasive changes.
In various codepaths access to the path of the underlying filesystem
is needed for such hybrid file. The best example is fsnotify where
this becomes security relevant. Passing the overlayfs
@file->f_path->dentry will cause fsnotify to skip generating fsnotify
events registered on the underlying inode or superblock.
To fix this we extend the vfs provided open_with_fake_path() concept
for overlayfs to create a backing file container that holds the real
path and to expose a helper that can be used by relevant callers to
get access to the path of the underlying filesystem through the new
file_real_path() helper. This pattern is similar to what we do in
d_real() and d_real_inode().
The first beneficiary is fsnotify and fixes the security sensitive
problem mentioned above.
There's a couple of nice cleanups included as well.
Over time, the old open_with_fake_path() helper added specifically for
overlayfs a long time ago started to get used in other places such as
cachefiles. Even though cachefiles have nothing to do with hybrid
files.
The only reason cachefiles used that concept was that files opened
with open_with_fake_path() aren't charged against the caller's open
file limit by raising FMODE_NOACCOUNT. It's just mere coincidence that
both overlayfs and cachefiles need to ensure to not overcharge the
caller for their internal open calls.
So this work disentangles FMODE_NOACCOUNT use cases and backing file
use-cases by adding the FMODE_BACKING flag which indicates that the
file can be used to retrieve the backing file of another filesystem.
(Fyi, Jens will be sending you a really nice cleanup from Christoph
that gets rid of 3 FMODE_* flags otherwise this would be the last
fmode_t bit we'd be using.)
So now overlayfs becomes the sole user of the renamed
open_with_fake_path() helper which is now named backing_file_open().
For internal kernel users such as cachefiles that are only interested
in FMODE_NOACCOUNT but not in FMODE_BACKING we add a new
kernel_file_open() helper which opens a file without being charged
against the caller's open file limit. All new helpers are properly
documented and clearly annotated to mention their special uses.
We also rename vfs_tmpfile_open() to kernel_tmpfile_open() to clearly
distinguish it from vfs_tmpfile() and align it the other kernel_*()
internal helpers"
* tag 'v6.5/vfs.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ovl: enable fsnotify events on underlying real files
fs: use backing_file container for internal files with "fake" f_path
fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers
fs: use a helper for opening kernel internal files
fs: rename {vfs,kernel}_tmpfile_open()
Diffstat (limited to 'fs/namei.c')
-rw-r--r-- | fs/namei.c | 24 |
1 files changed, 13 insertions, 11 deletions
diff --git a/fs/namei.c b/fs/namei.c index 6a5e26a529e1..91171da719c5 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3703,7 +3703,7 @@ static int vfs_tmpfile(struct mnt_idmap *idmap, } /** - * vfs_tmpfile_open - open a tmpfile for kernel internal use + * kernel_tmpfile_open - open a tmpfile for kernel internal use * @idmap: idmap of the mount the inode was found from * @parentpath: path of the base directory * @mode: mode of the new tmpfile @@ -3714,24 +3714,26 @@ static int vfs_tmpfile(struct mnt_idmap *idmap, * hence this is only for kernel internal use, and must not be installed into * file tables or such. */ -struct file *vfs_tmpfile_open(struct mnt_idmap *idmap, - const struct path *parentpath, - umode_t mode, int open_flag, const struct cred *cred) +struct file *kernel_tmpfile_open(struct mnt_idmap *idmap, + const struct path *parentpath, + umode_t mode, int open_flag, + const struct cred *cred) { struct file *file; int error; file = alloc_empty_file_noaccount(open_flag, cred); - if (!IS_ERR(file)) { - error = vfs_tmpfile(idmap, parentpath, file, mode); - if (error) { - fput(file); - file = ERR_PTR(error); - } + if (IS_ERR(file)) + return file; + + error = vfs_tmpfile(idmap, parentpath, file, mode); + if (error) { + fput(file); + file = ERR_PTR(error); } return file; } -EXPORT_SYMBOL(vfs_tmpfile_open); +EXPORT_SYMBOL(kernel_tmpfile_open); static int do_tmpfile(struct nameidata *nd, unsigned flags, const struct open_flags *op, |