summaryrefslogtreecommitdiff
path: root/fs/f2fs/file.c
AgeCommit message (Collapse)Author
2024-09-24Merge tag 'f2fs-for-6.12-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "The main changes include converting major IO paths to use folio, and adding various knobs to control GC more flexibly for Zoned devices. In addition, there are several patches to address corner cases of atomic file operations and better support for file pinning on zoned device. Enhancement: - add knobs to tune foreground/background GCs for Zoned devices - convert IO paths to use folio - reduce expensive checkpoint trigger frequency - allow F2FS_IPU_NOCACHE for pinned file - forcibly migrate to secure space for zoned device file pinning - get rid of buffer_head use - add write priority option based on zone UFS - get rid of online repair on corrupted directory Bug fixes: - fix to don't panic system for no free segment fault injection - fix to don't set SB_RDONLY in f2fs_handle_critical_error() - avoid unused block when dio write in LFS mode - compress: don't redirty sparse cluster during {,de}compress - check discard support for conventional zones - atomic: prevent atomic file from being dirtied before commit - atomic: fix to check atomic_file in f2fs ioctl interfaces - atomic: fix to forbid dio in atomic_file - atomic: fix to truncate pagecache before on-disk metadata truncation - atomic: create COW inode from parent dentry - atomic: fix to avoid racing w/ GC - atomic: require FMODE_WRITE for atomic write ioctls - fix to wait page writeback before setting gcing flag - fix to avoid racing in between read and OPU dio write, dio completion - fix several potential integer overflows in file offsets and dir_block_index - fix to avoid use-after-free in f2fs_stop_gc_thread() As usual, there are several code clean-ups and refactorings" * tag 'f2fs-for-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits) f2fs: allow F2FS_IPU_NOCACHE for pinned file f2fs: forcibly migrate to secure space for zoned device file pinning f2fs: remove unused parameters f2fs: fix to don't panic system for no free segment fault injection f2fs: fix to don't set SB_RDONLY in f2fs_handle_critical_error() f2fs: add valid block ratio not to do excessive GC for one time GC f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent f2fs: do FG_GC when GC boosting is required for zoned devices f2fs: increase BG GC migration window granularity when boosted for zoned devices f2fs: add reserved_segments sysfs node f2fs: introduce migration_window_granularity f2fs: make BG GC more aggressive for zoned devices f2fs: avoid unused block when dio write in LFS mode f2fs: fix to check atomic_file in f2fs ioctl interfaces f2fs: get rid of online repaire on corrupted directory f2fs: prevent atomic file from being dirtied before commit f2fs: get rid of page->index f2fs: convert read_node_page() to use folio f2fs: convert __write_node_page() to use folio f2fs: convert f2fs_write_data_page() to use folio ...
2024-09-11f2fs: fix to check atomic_file in f2fs ioctl interfacesChao Yu
Some f2fs ioctl interfaces like f2fs_ioc_set_pin_file(), f2fs_move_file_range(), and f2fs_defragment_range() missed to check atomic_write status, which may cause potential race issue, fix it. Cc: stable@vger.kernel.org Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-06f2fs: convert f2fs_vm_page_mkwrite() to use folioChao Yu
Convert to use folio, so that we can get rid of 'page->index' to prepare for removal of 'index' field in structure page [1]. [1] https://lore.kernel.org/all/Zp8fgUSIBGQ1TN0D@casper.infradead.org/ Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: atomic: fix to forbid dio in atomic_fileChao Yu
atomic write can only be used via buffered IO, let's fail direct IO on atomic_file and return -EOPNOTSUPP. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: compress: don't redirty sparse cluster during {,de}compressYeongjin Gil
In f2fs_do_write_data_page, when the data block is NULL_ADDR, it skips writepage considering that it has been already truncated. This results in an infinite loop as the PAGECACHE_TAG_TOWRITE tag is not cleared during the writeback process for a compressed file including NULL_ADDR in compress_mode=user. This is the reproduction process: 1. dd if=/dev/zero bs=4096 count=1024 seek=1024 of=testfile 2. f2fs_io compress testfile 3. dd if=/dev/zero bs=4096 count=1 conv=notrunc of=testfile 4. f2fs_io decompress testfile To prevent the problem, let's check whether the cluster is fully allocated before redirty its pages. Fixes: 5fdb322ff2c2 ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE") Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com> Tested-by: Jaewook Kim <jw5454.kim@samsung.com> Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: fix to avoid use-after-free in f2fs_stop_gc_thread()Chao Yu
syzbot reports a f2fs bug as below: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 print_report+0xe8/0x550 mm/kasan/report.c:491 kasan_report+0x143/0x180 mm/kasan/report.c:601 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189 instrument_atomic_read_write include/linux/instrumented.h:96 [inline] atomic_fetch_add_relaxed include/linux/atomic/atomic-instrumented.h:252 [inline] __refcount_add include/linux/refcount.h:184 [inline] __refcount_inc include/linux/refcount.h:241 [inline] refcount_inc include/linux/refcount.h:258 [inline] get_task_struct include/linux/sched/task.h:118 [inline] kthread_stop+0xca/0x630 kernel/kthread.c:704 f2fs_stop_gc_thread+0x65/0xb0 fs/f2fs/gc.c:210 f2fs_do_shutdown+0x192/0x540 fs/f2fs/file.c:2283 f2fs_ioc_shutdown fs/f2fs/file.c:2325 [inline] __f2fs_ioctl+0x443a/0xbe60 fs/f2fs/file.c:4325 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f The root cause is below race condition, it may cause use-after-free issue in sbi->gc_th pointer. - remount - f2fs_remount - f2fs_stop_gc_thread - kfree(gc_th) - f2fs_ioc_shutdown - f2fs_do_shutdown - f2fs_stop_gc_thread - kthread_stop(gc_th->f2fs_gc_task) : sbi->gc_thread = NULL; We will call f2fs_do_shutdown() in two paths: - for f2fs_ioc_shutdown() path, we should grab sb->s_umount semaphore for fixing. - for f2fs_shutdown() path, it's safe since caller has already grabbed sb->s_umount semaphore. Reported-by: syzbot+1a8e2b31f2ac9bd3d148@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/0000000000005c7ccb061e032b9b@google.com Fixes: 7950e9ac638e ("f2fs: stop gc/discard thread after fs shutdown") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: atomic: fix to truncate pagecache before on-disk metadata truncationChao Yu
We should always truncate pagecache while truncating on-disk data. Fixes: a46bebd502fe ("f2fs: synchronize atomic write aborts") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: fix to wait page writeback before setting gcing flagChao Yu
Soft IRQ Thread - f2fs_write_end_io - f2fs_defragment_range - set_page_private_gcing - type = WB_DATA_TYPE(page, false); : assign type w/ F2FS_WB_CP_DATA due to page_private_gcing() is true - dec_page_count() w/ wrong type - end_page_writeback() Value of F2FS_WB_CP_DATA reference count may become negative under above race condition, the root cause is we missed to wait page writeback before setting gcing page private flag, let's fix it. Fixes: 2d1fe8a86bf5 ("f2fs: fix to tag gcing flag on page during file defragment") Fixes: 4961acdd65c9 ("f2fs: fix to tag gcing flag on page during block migration") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: Create COW inode from parent dentry for atomic writeYeongjin Gil
The i_pino in f2fs_inode_info has the previous parent's i_ino when inode was renamed, which may cause f2fs_ioc_start_atomic_write to fail. If file_wrong_pino is true and i_nlink is 1, then to find a valid pino, we should refer to the dentry from inode. To resolve this issue, let's get parent inode using parent dentry directly. Fixes: 3db1de0e582c ("f2fs: change the current atomic write way") Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com> Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com> Reviewed-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: Require FMODE_WRITE for atomic write ioctlsJann Horn
The F2FS ioctls for starting and committing atomic writes check for inode_owner_or_capable(), but this does not give LSMs like SELinux or Landlock an opportunity to deny the write access - if the caller's FSUID matches the inode's UID, inode_owner_or_capable() immediately returns true. There are scenarios where LSMs want to deny a process the ability to write particular files, even files that the FSUID of the process owns; but this can currently partially be bypassed using atomic write ioctls in two ways: - F2FS_IOC_START_ATOMIC_REPLACE + F2FS_IOC_COMMIT_ATOMIC_WRITE can truncate an inode to size 0 - F2FS_IOC_START_ATOMIC_WRITE + F2FS_IOC_ABORT_ATOMIC_WRITE can revert changes another process concurrently made to a file Fix it by requiring FMODE_WRITE for these operations, just like for F2FS_IOC_MOVE_RANGE. Since any legitimate caller should only be using these ioctls when intending to write into the file, that seems unlikely to break anything. Fixes: 88b88a667971 ("f2fs: support atomic writes") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-21f2fs: clean up val{>>,<<}F2FS_BLKSIZE_BITSZhiguo Niu
Use F2FS_BYTES_TO_BLK(bytes) and F2FS_BLK_TO_BYTES(blk) for cleanup Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15f2fs: fix to use per-inode maxbytes and cleanupZhiguo Niu
This is a supplement to commit 6d1451bf7f84 ("f2fs: fix to use per-inode maxbytes") for some missed cases, also cleanup redundant code in f2fs_llseek. Cc: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15Revert "f2fs: use flush command instead of FUA for zoned device"Wenjie Cheng
This reverts commit c550e25bca660ed2554cbb48d32b82d0bb98e4b1. Commit c550e25bca660ed2554cbb48d32b82d0bb98e4b1 ("f2fs: use flush command instead of FUA for zoned device") used additional flush command to keep write order. Since Commit dd291d77cc90eb6a86e9860ba8e6e38eebd57d12 ("block: Introduce zone write plugging") has enabled the block layer to handle this order issue, there is no need to use flush command. Signed-off-by: Wenjie Cheng <cwjhust@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15f2fs: get rid of buffer_head useChao Yu
Convert to use folio and related functionality. Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15f2fs: fix to avoid racing in between read and OPU dio writeChao Yu
If lfs mode is on, buffered read may race w/ OPU dio write as below, it may cause buffered read hits unwritten data unexpectly, and for dio read, the race condition exists as well. Thread A Thread B - f2fs_file_write_iter - f2fs_dio_write_iter - __iomap_dio_rw - f2fs_iomap_begin - f2fs_map_blocks - __allocate_data_block - allocated blkaddr #x - iomap_dio_submit_bio - f2fs_file_read_iter - filemap_read - f2fs_read_data_folio - f2fs_mpage_readpages - f2fs_map_blocks : get blkaddr #x - f2fs_submit_read_bio IRQ - f2fs_read_end_io : read IO on blkaddr #x complete IRQ - iomap_dio_bio_end_io : direct write IO on blkaddr #x complete In LFS mode, if there is inflight dio, let's wait for its completion, this policy won't cover all race cases, however it is a tradeoff which avoids abusing lock around IO paths. Fixes: f847c699cff3 ("f2fs: allow out-place-update for direct IO in LFS mode") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15f2fs: fix to wait dio completionChao Yu
It should wait all existing dio write IOs before block removal, otherwise, previous direct write IO may overwrite data in the block which may be reused by other inode. Cc: stable@vger.kernel.org Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15f2fs: reduce expensive checkpoint trigger frequencyChao Yu
We may trigger high frequent checkpoint for below case: 1. mkdir /mnt/dir1; set dir1 encrypted 2. touch /mnt/file1; fsync /mnt/file1 3. mkdir /mnt/dir2; set dir2 encrypted 4. touch /mnt/file2; fsync /mnt/file2 ... Although, newly created dir and file are not related, due to commit bbf156f7afa7 ("f2fs: fix lost xattrs of directories"), we will trigger checkpoint whenever fsync() comes after a new encrypted dir created. In order to avoid such performance regression issue, let's record an entry including directory's ino in global cache whenever we update directory's xattr data, and then triggerring checkpoint() only if xattr metadata of target file's parent was updated. This patch updates to cover below no encryption case as well: 1) parent is checkpointed 2) set_xattr(dir) w/ new xnid 3) create(file) 4) fsync(file) Fixes: bbf156f7afa7 ("f2fs: fix lost xattrs of directories") Reported-by: wangzijie <wangzijie1@honor.com> Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Tested-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reported-by: Yunlei He <heyunlei@hihonor.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-12introduce fd_file(), convert all accessors to it.Al Viro
For any changes of struct fd representation we need to turn existing accesses to fields into calls of wrappers. Accesses to struct fd::flags are very few (3 in linux/file.h, 1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in explicit initializers). Those can be dealt with in the commit converting to new layout; accesses to struct fd::file are too many for that. This commit converts (almost) all of f.file to fd_file(f). It's not entirely mechanical ('file' is used as a member name more than just in struct fd) and it does not even attempt to distinguish the uses in pointer context from those in boolean context; the latter will be eventually turned into a separate helper (fd_empty()). NOTE: mass conversion to fd_empty(), tempting as it might be, is a bad idea; better do that piecewise in commit that convert from fdget...() to CLASS(...). [conflicts in fs/fhandle.c, kernel/bpf/syscall.c, mm/memcontrol.c caught by git; fs/stat.c one got caught by git grep] [fs/xattr.c conflict] Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-08-05f2fs: fix several potential integer overflows in file offsetsNikita Zhandarovich
When dealing with large extents and calculating file offsets by summing up according extent offsets and lengths of unsigned int type, one may encounter possible integer overflow if the values are big enough. Prevent this from happening by expanding one of the addends to (pgoff_t) type. Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE. Fixes: d323d005ac4a ("f2fs: support file defragment") Cc: stable@vger.kernel.org Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-07-23Merge tag 'f2fs-for-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "A pretty small update including mostly minor bug fixes in zoned storage along with the large section support. Enhancements: - add support for FS_IOC_GETFSSYSFSPATH - enable atgc dynamically if conditions are met - use new ioprio Macro to get ckpt thread ioprio level - remove unreachable lazytime mount option parsing Bug fixes: - fix null reference error when checking end of zone - fix start segno of large section - fix to cover read extent cache access with lock - don't dirty inode for readonly filesystem - allocate a new section if curseg is not the first seg in its zone - only fragment segment in the same section - truncate preallocated blocks in f2fs_file_open() - fix to avoid use SSR allocate when do defragment - fix to force buffered IO on inline_data inode And some minor code clean-ups and sanity checks" * tag 'f2fs-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (26 commits) f2fs: clean up addrs_per_{inode,block}() f2fs: clean up F2FS_I() f2fs: use meta inode for GC of COW file f2fs: use meta inode for GC of atomic file f2fs: only fragment segment in the same section f2fs: fix to update user block counts in block_operations() f2fs: remove unreachable lazytime mount option parsing f2fs: fix null reference error when checking end of zone f2fs: fix start segno of large section f2fs: remove redundant sanity check in sanity_check_inode() f2fs: assign CURSEG_ALL_DATA_ATGC if blkaddr is valid f2fs: fix to use mnt_{want,drop}_write_file replace file_{start,end}_wrtie f2fs: clean up set REQ_RAHEAD given rac f2fs: enable atgc dynamically if conditions are met f2fs: fix to truncate preallocated blocks in f2fs_file_open() f2fs: fix to cover read extent cache access with lock f2fs: fix return value of f2fs_convert_inline_inode() f2fs: use new ioprio Macro to get ckpt thread ioprio level f2fs: fix to don't dirty inode for readonly filesystem f2fs: fix to avoid use SSR allocate when do defragment ...
2024-07-10f2fs: clean up F2FS_I()Chao Yu
Use temporary variable instead of F2FS_I() for cleanup. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-07-10f2fs: use meta inode for GC of COW fileSunmin Jeong
In case of the COW file, new updates and GC writes are already separated to page caches of the atomic file and COW file. As some cases that use the meta inode for GC, there are some race issues between a foreground thread and GC thread. To handle them, we need to take care when to invalidate and wait writeback of GC pages in COW files as the case of using the meta inode. Also, a pointer from the COW inode to the original inode is required to check the state of original pages. For the former, we can solve the problem by using the meta inode for GC of COW files. Then let's get a page from the original inode in move_data_block when GCing the COW file to avoid race condition. Fixes: 3db1de0e582c ("f2fs: change the current atomic write way") Cc: stable@vger.kernel.org #v5.19+ Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Reviewed-by: Yeongjin Gil <youngjin.gil@samsung.com> Signed-off-by: Sunmin Jeong <s_min.jeong@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-27vfs: rename parent_ino to d_parent_ino and make it use RCUMateusz Guzik
The routine is used by procfs through dir_emit_dots. The combined RCU and lock fallback implementation is too big for an inline. Given that the routine takes a dentry argument fs/dcache.c seems like the place to put it in. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20240627161152.802567-1-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-25f2fs: Use in_group_or_capable() helperYouling Tang
Use the in_group_or_capable() helper function to simplify the code. Signed-off-by: Youling Tang <tangyouling@kylinos.cn> Link: https://lore.kernel.org/r/20240620032335.147136-2-youling.tang@linux.dev Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-18f2fs: fix to use mnt_{want,drop}_write_file replace file_{start,end}_wrtieZhiguo Niu
mnt_{want,drop}_write_file is more suitable than file_{start,end}_wrtie and also is consistent with other ioctl operations. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to truncate preallocated blocks in f2fs_file_open()Chao Yu
chenyuwen reports a f2fs bug as below: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000011 fscrypt_set_bio_crypt_ctx+0x78/0x1e8 f2fs_grab_read_bio+0x78/0x208 f2fs_submit_page_read+0x44/0x154 f2fs_get_read_data_page+0x288/0x5f4 f2fs_get_lock_data_page+0x60/0x190 truncate_partial_data_page+0x108/0x4fc f2fs_do_truncate_blocks+0x344/0x5f0 f2fs_truncate_blocks+0x6c/0x134 f2fs_truncate+0xd8/0x200 f2fs_iget+0x20c/0x5ac do_garbage_collect+0x5d0/0xf6c f2fs_gc+0x22c/0x6a4 f2fs_disable_checkpoint+0xc8/0x310 f2fs_fill_super+0x14bc/0x1764 mount_bdev+0x1b4/0x21c f2fs_mount+0x20/0x30 legacy_get_tree+0x50/0xbc vfs_get_tree+0x5c/0x1b0 do_new_mount+0x298/0x4cc path_mount+0x33c/0x5fc __arm64_sys_mount+0xcc/0x15c invoke_syscall+0x60/0x150 el0_svc_common+0xb8/0xf8 do_el0_svc+0x28/0xa0 el0_svc+0x24/0x84 el0t_64_sync_handler+0x88/0xec It is because inode.i_crypt_info is not initialized during below path: - mount - f2fs_fill_super - f2fs_disable_checkpoint - f2fs_gc - f2fs_iget - f2fs_truncate So, let's relocate truncation of preallocated blocks to f2fs_file_open(), after fscrypt_file_open(). Fixes: d4dd19ec1ea0 ("f2fs: do not expose unwritten blocks to user by DIO") Reported-by: chenyuwen <yuwen.chen@xjmz.com> Closes: https://lore.kernel.org/linux-kernel/20240517085327.1188515-1-yuwen.chen@xjmz.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12f2fs: fix to force buffered IO on inline_data inodeChao Yu
It will return all zero data when DIO reading from inline_data inode, it is because f2fs_iomap_begin() assign iomap->type w/ IOMAP_HOLE incorrectly for this case. We can let iomap framework handle inline data via assigning iomap->type and iomap->inline_data correctly, however, it will be a little bit complicated when handling race case in between direct IO and buffered IO. So, let's force to use buffered IO to fix this issue. Cc: stable@vger.kernel.org Reported-by: Barry Song <v-songbaohua@oppo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-20Merge tag 'f2fs-for-6.10.rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, we've tried to address some performance issues on zoned storage such as direct IO and write_hints. In addition, we've migrated some IO paths using folio. Meanwhile, there are multiple bug fixes in the compression paths, sanity check conditions, and error handlers. Enhancements: - allow direct io of pinned files for zoned storage - assign the write hint per stream by default - convert read paths and test_writeback to folio - avoid allocating WARM_DATA segment for direct IO Bug fixes: - fix false alarm on invalid block address - fix to add missing iput() in gc_data_segment() - fix to release node block count in error path of f2fs_new_node_page() - compress: - don't allow unaligned truncation on released compress inode - cover {reserve,release}_compress_blocks() w/ cp_rwsem lock - fix error path of inc_valid_block_count() - fix to update i_compr_blocks correctly - fix block migration when section is not aligned to pow2 - don't trigger OPU on pinfile for direct IO - fix to do sanity check on i_xattr_nid in sanity_check_inode() - write missing last sum blk of file pinning section - clear writeback when compression failed - fix to adjust appropirate defragment pg_end As usual, there are several minor code clean-ups, and fixes to manage missing corner cases in the error paths" * tag 'f2fs-for-6.10.rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (50 commits) f2fs: initialize last_block_in_bio variable f2fs: Add inline to f2fs_build_fault_attr() stub f2fs: fix some ambiguous comments f2fs: fix to add missing iput() in gc_data_segment() f2fs: allow dirty sections with zero valid block for checkpoint disabled f2fs: compress: don't allow unaligned truncation on released compress inode f2fs: fix to release node block count in error path of f2fs_new_node_page() f2fs: compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock f2fs: compress: fix error path of inc_valid_block_count() f2fs: compress: fix typo in f2fs_reserve_compress_blocks() f2fs: compress: fix to update i_compr_blocks correctly f2fs: check validation of fault attrs in f2fs_build_fault_attr() f2fs: fix to limit gc_pin_file_threshold f2fs: remove unused GC_FAILURE_PIN f2fs: use f2fs_{err,info}_ratelimited() for cleanup f2fs: fix block migration when section is not aligned to pow2 f2fs: zone: fix to don't trigger OPU on pinfile for direct IO f2fs: fix to do sanity check on i_xattr_nid in sanity_check_inode() f2fs: fix to avoid allocating WARM_DATA segment for direct IO f2fs: remove redundant parameter in is_next_segment_free() ...
2024-05-10f2fs: compress: don't allow unaligned truncation on released compress inodeChao Yu
f2fs image may be corrupted after below testcase: - mkfs.f2fs -O extra_attr,compression -f /dev/vdb - mount /dev/vdb /mnt/f2fs - touch /mnt/f2fs/file - f2fs_io setflags compression /mnt/f2fs/file - dd if=/dev/zero of=/mnt/f2fs/file bs=4k count=4 - f2fs_io release_cblocks /mnt/f2fs/file - truncate -s 8192 /mnt/f2fs/file - umount /mnt/f2fs - fsck.f2fs /dev/vdb [ASSERT] (fsck_chk_inode_blk:1256) --> ino: 0x5 has i_blocks: 0x00000002, but has 0x3 blocks [FSCK] valid_block_count matching with CP [Fail] [0x4, 0x5] [FSCK] other corrupted bugs [Fail] The reason is: partial truncation assume compressed inode has reserved blocks, after partial truncation, valid block count may change w/o .i_blocks and .total_valid_block_count update, result in corruption. This patch only allow cluster size aligned truncation on released compress inode for fixing. Fixes: c61404153eb6 ("f2fs: introduce FI_COMPRESS_RELEASED instead of using IMMUTABLE bit") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-10f2fs: compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem ↵Chao Yu
lock It needs to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock to avoid racing with checkpoint, otherwise, filesystem metadata including blkaddr in dnode, inode fields and .total_valid_block_count may be corrupted after SPO case. Fixes: ef8d563f184e ("f2fs: introduce F2FS_IOC_RELEASE_COMPRESS_BLOCKS") Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-10f2fs: compress: fix typo in f2fs_reserve_compress_blocks()Chao Yu
s/released/reserved. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-10f2fs: compress: fix to update i_compr_blocks correctlyChao Yu
Previously, we account reserved blocks and compressed blocks into @compr_blocks, then, f2fs_i_compr_blocks_update(,compr_blocks) will update i_compr_blocks incorrectly, fix it. Meanwhile, for the case all blocks in cluster were reserved, fix to update dn->ofs_in_node correctly. Fixes: eb8fbaa53374 ("f2fs: compress: fix to check unreleased compressed cluster") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-09f2fs: fix to limit gc_pin_file_thresholdChao Yu
type of f2fs_inode.i_gc_failures, f2fs_inode_info.i_gc_failures, and f2fs_sb_info.gc_pin_file_threshold is __le16, unsigned int, and u64, so it will cause truncation during comparison and persistence. Unifying variable of these three variables to unsigned short, and add an upper boundary limitation for gc_pin_file_threshold. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-09f2fs: remove unused GC_FAILURE_PINChao Yu
After commit 3db1de0e582c ("f2fs: change the current atomic write way"), we removed all GC_FAILURE_ATOMIC usage, let's change i_gc_failures[] array to i_pin_failure for cleanup. Meanwhile, let's define i_current_depth and i_gc_failures as union variable due to they won't be valid at the same time. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-29f2fs: fix to avoid allocating WARM_DATA segment for direct IOChao Yu
If active_log is not 6, we never use WARM_DATA segment, let's avoid allocating WARM_DATA segment for direct IO. Signed-off-by: Yunlei He <heyunlei@oppo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-19f2fs: assign the write hint per stream by defaultJaegeuk Kim
This reverts commit 930e2607638d ("f2fs: remove obsolete whint_mode"), as we decide to pass write hints to the disk. Cc: Hyunchul Lee <cheol.lee@lge.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-14f2fs: allow direct io of pinned files for zoned storageDaeho Jeong
Since the allocation happens in conventional LU for zoned storage, we can allow direct io for that. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-14f2fs: prevent writing without fallocate() for pinned filesDaeho Jeong
In a case writing without fallocate(), we can't guarantee it's allocated in the conventional area for zoned stroage. To make it consistent across storage devices, we disallow it regardless of storage device types. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-12f2fs: add REQ_TIME time update for some user behaviorsZhiguo Niu
some user behaviors requested filesystem operations, which will cause filesystem not idle. Meanwhile adjust some f2fs_update_time(REQ_TIME) positions. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-12f2fs: fix to check pinfile flag in f2fs_move_file_range()Chao Yu
ioctl(F2FS_IOC_MOVE_RANGE) can truncate or punch hole on pinned file, fix to disallow it. Fixes: 5fed0be8583f ("f2fs: do not allow partial truncation on pinned file") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-12f2fs: fix to relocate check condition in f2fs_fallocate()Chao Yu
compress and pinfile flag should be checked after inode lock held to avoid race condition, fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Fixes: 5fed0be8583f ("f2fs: do not allow partial truncation on pinned file") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-12f2fs: compress: fix to relocate check condition in f2fs_ioc_{,de}compress_file()Chao Yu
Compress flag should be checked after inode lock held to avoid racing w/ f2fs_setflags_common() , fix it. Fixes: 5fdb322ff2c2 ("f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE") Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Closes: https://lore.kernel.org/linux-f2fs-devel/CAHJ8P3LdZXLc2rqeYjvymgYHr2+YLuJ0sLG9DdsJZmwO7deuhw@mail.gmail.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-12f2fs: compress: fix to relocate check condition in ↵Chao Yu
f2fs_{release,reserve}_compress_blocks() Compress flag should be checked after inode lock held to avoid racing w/ f2fs_setflags_common(), fix it. Fixes: 4c8ff7095bef ("f2fs: support data compression") Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Closes: https://lore.kernel.org/linux-f2fs-devel/CAHJ8P3LdZXLc2rqeYjvymgYHr2+YLuJ0sLG9DdsJZmwO7deuhw@mail.gmail.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-07fs: claw back a few FMODE_* bitsChristian Brauner
There's a bunch of flags that are purely based on what the file operations support while also never being conditionally set or unset. IOW, they're not subject to change for individual files. Imho, such flags don't need to live in f_mode they might as well live in the fops structs itself. And the fops struct already has that lonely mmap_supported_flags member. We might as well turn that into a generic fop_flags member and move a few flags from FMODE_* space into FOP_* space. That gets us four FMODE_* bits back and the ability for new static flags that are about file ops to not have to live in FMODE_* space but in their own FOP_* space. It's not the most beautiful thing ever but it gets the job done. Yes, there'll be an additional pointer chase but hopefully that won't matter for these flags. I suspect there's a few more we can move into there and that we can also redirect a bunch of new flag suggestions that follow this pattern into the fop_flags field instead of f_mode. Link: https://lore.kernel.org/r/20240328-gewendet-spargel-aa60a030ef74@brauner Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-02f2fs: fix to adjust appropirate defragment pg_endZhiguo Niu
A length that exceeds the real size of the inode may be specified from user, although these out-of-range areas are not mapped, but they still need to be check in while loop, which is unnecessary. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-29f2fs: fix to wait on page writeback in __clone_blkaddrs()Chao Yu
In below race condition, dst page may become writeback status in __clone_blkaddrs(), it needs to wait writeback before update, fix it. Thread A GC Thread - f2fs_move_file_range - filemap_write_and_wait_range(dst) - gc_data_segment - f2fs_down_write(dst) - move_data_page - set_page_writeback(dst_page) - f2fs_submit_page_write - f2fs_up_write(dst) - f2fs_down_write(dst) - __exchange_data_block - __clone_blkaddrs - f2fs_get_new_data_page - memcpy_page Fixes: 0a2aa8fbb969 ("f2fs: refactor __exchange_data_block for speed up") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-26f2fs: support .shutdown in f2fs_sopsChao Yu
Support .shutdown callback in f2fs_sops, then, it can be called to shut down the file system when underlying block device is marked dead. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-14f2fs: fix to avoid use-after-free issue in f2fs_filemap_faultChao Yu
syzbot reports a f2fs bug as below: BUG: KASAN: slab-use-after-free in f2fs_filemap_fault+0xd1/0x2c0 fs/f2fs/file.c:49 Read of size 8 at addr ffff88807bb22680 by task syz-executor184/5058 CPU: 0 PID: 5058 Comm: syz-executor184 Not tainted 6.7.0-syzkaller-09928-g052d534373b7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:377 [inline] print_report+0x163/0x540 mm/kasan/report.c:488 kasan_report+0x142/0x170 mm/kasan/report.c:601 f2fs_filemap_fault+0xd1/0x2c0 fs/f2fs/file.c:49 __do_fault+0x131/0x450 mm/memory.c:4376 do_shared_fault mm/memory.c:4798 [inline] do_fault mm/memory.c:4872 [inline] do_pte_missing mm/memory.c:3745 [inline] handle_pte_fault mm/memory.c:5144 [inline] __handle_mm_fault+0x23b7/0x72b0 mm/memory.c:5285 handle_mm_fault+0x27e/0x770 mm/memory.c:5450 do_user_addr_fault arch/x86/mm/fault.c:1364 [inline] handle_page_fault arch/x86/mm/fault.c:1507 [inline] exc_page_fault+0x456/0x870 arch/x86/mm/fault.c:1563 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570 The root cause is: in f2fs_filemap_fault(), vmf->vma may be not alive after filemap_fault(), so it may cause use-after-free issue when accessing vmf->vma->vm_flags in trace_f2fs_filemap_fault(). So it needs to keep vm_flags in separated temporary variable for tracepoint use. Fixes: 87f3afd366f7 ("f2fs: add tracepoint for f2fs_vm_page_mkwrite()") Reported-and-tested-by: syzbot+763afad57075d3f862f2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/000000000000e8222b060f00db3b@google.com Cc: Ed Tsai <Ed.Tsai@mediatek.com> Suggested-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12f2fs: prevent atomic write on pinned fileDaeho Jeong
Since atomic write way was changed to out-place-update, we should prevent it on pinned files. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12f2fs: unify the error handling of f2fs_is_valid_blkaddrZhiguo Niu
There are some cases of f2fs_is_valid_blkaddr not handled as ERROR_INVALID_BLKADDR,so unify the error handling about all of f2fs_is_valid_blkaddr. Do f2fs_handle_error in __f2fs_is_valid_blkaddr for cleanup. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>