summaryrefslogtreecommitdiff
path: root/fs/xfs/xfs_rtalloc.c
AgeCommit message (Collapse)Author
2024-05-03xfs: simplify iext overflow checking and upgradeChristoph Hellwig
Currently the calls to xfs_iext_count_may_overflow and xfs_iext_count_upgrade are always paired. Merge them into a single function to simplify the callers and the actual check and upgrade logic itself. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-04-30xfs: fix error returns from xfs_bmapi_writeChristoph Hellwig
xfs_bmapi_write can return 0 without actually returning a mapping in mval in two different cases: 1) when there is absolutely no space available to do an allocation 2) when converting delalloc space, and the allocation is so small that it only covers parts of the delalloc extent before the range requested by the caller Callers at best can handle one of these cases, but in many cases can't cope with either one. Switch xfs_bmapi_write to always return a mapping or return an error code instead. For case 1) above ENOSPC is the obvious choice which is very much what the callers expect anyway. For case 2) there is no really good error code, so pick a funky one from the SysV streams portfolio. This fixes the reproducer here: https://lore.kernel.org/linux-xfs/CAEJPjCvT3Uag-pMTYuigEjWZHn1sGMZ0GCjVVCv29tNHK76Cgg@mail.gmail.com0/ which uses reserved blocks to create file systems that are gravely out of space and thus cause at least xfs_file_alloc_space to hang and trigger the lack of ENOSPC handling in xfs_dquot_disk_alloc. Note that this patch does not actually make any caller but xfs_alloc_file_space deal intelligently with case 2) above. Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: 刘通 <lyutoon@gmail.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-04-22xfs: reinstate delalloc for RT inodes (if sb_rextsize == 1)Christoph Hellwig
Commit aff3a9edb708 ("xfs: Use preallocation for inodes with extsz hints") disabled delayed allocation for all inodes with extent size hints due a data exposure problem. It turns out we fixed this data exposure problem since by always creating unwritten extents for delalloc conversions due to more data exposure problems, but the writeback path doesn't actually support extent size hints when converting delalloc these days, which probably isn't a problem given that people using the hints know what they get. However due to the way how xfs_get_extsz_hint is implemented, it always claims an extent size hint for RT inodes even if the RT extent size is a single FSB. Due to that the above commit effectively disabled delalloc support for RT inodes. Switch xfs_get_extsz_hint to return 0 for this case and work around that in a few places to reinstate delalloc support for RT inodes on file systems with an sb_rextsize of 1. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-04-22xfs: refactor realtime inode lockingChristoph Hellwig
Create helper functions to deal with locking realtime metadata inodes. This enables us to maintain correct locking order once we start adding the realtime rmap and refcount btree inodes. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-02-22xfs: report realtime metadata corruption errors to the health systemDarrick J. Wong
Whenever we encounter corrupt realtime metadat blocks, we should report that to the health monitoring system for later reporting. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2024-02-19xfs: Replace xfs_isilocked with xfs_assert_ilockedMatthew Wilcox (Oracle)
To use the new rwsem_assert_held()/rwsem_assert_held_write(), we can't use the existing ASSERT macro. Add a new xfs_assert_ilocked() and convert all the callers. Fix an apparent bug in xfs_isilocked(): If the caller specifies XFS_IOLOCK_EXCL | XFS_ILOCK_EXCL, xfs_assert_ilocked() will check both the IOLOCK and the ILOCK are held for write. xfs_isilocked() only checked that the ILOCK was held for write. xfs_assert_ilocked() is always on, even if DEBUG or XFS_WARN aren't defined. It's a cheap check, so I don't think it's worth defining it away. Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-02-13xfs: convert remaining kmem_free() to kfree()Dave Chinner
The remaining callers of kmem_free() are freeing heap memory, so we can convert them directly to kfree() and get rid of kmem_free() altogether. This conversion was done with: $ for f in `git grep -l kmem_free fs/xfs`; do > sed -i s/kmem_free/kfree/ $f > done $ Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-02-13xfs: convert kmem_free() for kvmalloc users to kvfree()Dave Chinner
Start getting rid of kmem_free() by converting all the cases where memory can come from vmalloc interfaces to calling kvfree() directly. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2024-02-13xfs: convert kmem_alloc() to kmalloc()Dave Chinner
kmem_alloc() is just a thin wrapper around kmalloc() these days. Convert everything to use kmalloc() so we can get rid of the wrapper. Note: the transaction region allocation in xlog_add_to_transaction() can be a high order allocation. Converting it to use kmalloc(__GFP_NOFAIL) results in warnings in the page allocation code being triggered because the mm subsystem does not want us to use __GFP_NOFAIL with high order allocations like we've been doing with the kmem_alloc() wrapper for a couple of decades. Hence this specific case gets converted to xlog_kvmalloc() rather than kmalloc() to avoid this issue. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: fold xfs_rtallocate_extent into xfs_bmap_rtallocChristoph Hellwig
There isn't really much left in xfs_rtallocate_extent now, fold it into the only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: simplify and optimize the RT allocation fallback cascadeChristoph Hellwig
There are currently multiple levels of fall back if an RT allocation can not be satisfied: 1) xfs_rtallocate_extent extends the minlen and reduces the maxlen due to the extent size hint. If that can't be done, it return -ENOSPC and let's xfs_bmap_rtalloc retry, which then not only drops the extent size hint based alignment, but also the minlen adjustment 2) if xfs_rtallocate_extent gets -ENOSPC from the underlying functions, it only drops the extent size hint based alignment and retries 3) if that still does not succeed, xfs_rtallocate_extent drops the extent size hint (which is a complex no-op at this point) and the minlen using the same code as (1) above 4) if that still doesn't success and the caller wanted an allocation near a blkno, drop that blkno hint. The handling in 1 is rather inefficient as we could just drop the alignment and continue, and 2/3 interact in really weird ways due to the duplicate policy. Move aligning the min and maxlen out of xfs_rtallocate_extent and into a helper called directly by xfs_bmap_rtalloc. This allows just continuing with the allocation if we have to drop the alignment instead of going through the retry loop and also dropping the perfectly usable minlen adjustment that didn't cause the problem, and then just use a single retry that drops both the minlen and alignment requirement when we really are out of space, thus consolidating cases (2) and (3) above. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: reorder the minlen and prod calculations in xfs_bmap_rtallocChristoph Hellwig
xfs_bmap_rtalloc is a bit of a mess in terms of calculating the locally need variables. Reorder them a bit so that related code is located next to each other - the raminlen calculation moves up next to where the maximum len is calculated, and all the prod calculation is move into a single place and rearranged so that the real prod calculation only happens when it actually is needed. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: remove XFS_RTMIN/XFS_RTMAXChristoph Hellwig
Use the kernel min/max helpers instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: remove rt-wrappers from xfs_format.hChristoph Hellwig
xfs_format.h has a bunch odd wrappers for helper functions and mount structure access using RT* prefixes. Replace them with their open coded versions (for those that weren't entirely unused) and remove the wrappers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: factor out a xfs_rtalloc_sumlevel helperChristoph Hellwig
xfs_rtallocate_extent_size has two loops with nearly identical logic in them. Split that logic into a separate xfs_rtalloc_sumlevel helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: tidy up xfs_rtallocate_extent_exactChristoph Hellwig
Use common code for both xfs_rtallocate_range calls by moving the !isfree logic into the non-default branch. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: merge the calls to xfs_rtallocate_range in xfs_rtallocate_blockChristoph Hellwig
Use a goto to use a common tail for the case of being able to allocate an extent. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: reflow the tail end of xfs_rtallocate_extent_blockChristoph Hellwig
Change polarity of a check so that the successful case of being able to allocate an extent is in the main path of the function and error handling is on a branch. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: invert a check in xfs_rtallocate_extent_blockChristoph Hellwig
Doing a break in the else side of a conditional is rather silly. Invert the check, break ASAP and unindent the other leg. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: move xfs_rtget_summary to xfs_rtbitmap.cChristoph Hellwig
xfs_rtmodify_summary_int is only used inside xfs_rtbitmap.c and to implement xfs_rtget_summary. Move xfs_rtget_summary to xfs_rtbitmap.c as the exported API and mark xfs_rtmodify_summary_int static. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: cleanup picking the start extent hint in xfs_bmap_rtallocChristoph Hellwig
Clean up the logical in xfs_bmap_rtalloc that tries to find a rtextent to start the search from by using a separate variable for the hint, not calling xfs_bmap_adjacent when we want to ignore the locality and avoid an extra roundtrip converting between block numbers and RT extent numbers. As a side-effect this doesn't pointlessly call xfs_rtpick_extent and increment the start rtextent hint if we are going to ignore the result anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: reflow the tail end of xfs_bmap_rtallocChristoph Hellwig
Reorder the tail end of xfs_bmap_rtalloc so that the successfully allocation is in the main path, and the error handling is on a branch. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: return -ENOSPC from xfs_rtallocate_*Christoph Hellwig
Just return -ENOSPC instead of returning 0 and setting the return rt extent number to NULLRTEXTNO. This is turn removes all users of NULLRTEXTNO, so remove that as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: move xfs_bmap_rtalloc to xfs_rtalloc.cChristoph Hellwig
xfs_bmap_rtalloc is currently in xfs_bmap_util.c, which is a somewhat odd spot for it, given that is only called from xfs_bmap.c and calls into xfs_rtalloc.c to do the actual work. Move xfs_bmap_rtalloc to xfs_rtalloc.c and mark xfs_rtpick_extent xfs_rtallocate_extent and xfs_rtallocate_extent static now that they aren't called from outside of xfs_rtalloc.c. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-22xfs: consider minlen sized extents in xfs_rtallocate_extent_blockChristoph Hellwig
minlen is the lower bound on the extent length that the caller can accept, and maxlen is at this point the maximal available length. This means a minlen extent is perfectly fine to use, so do it. This matches the equivalent logic in xfs_rtallocate_extent_exact that also accepts a minlen sized extent. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-13xfs: recompute growfsrtfree transaction reservation while growing rt volumeDarrick J. Wong
While playing with growfs to create a 20TB realtime section on a filesystem that didn't previously have an rt section, I noticed that growfs would occasionally shut down the log due to a transaction reservation overflow. xfs_calc_growrtfree_reservation uses the current size of the realtime summary file (m_rsumsize) to compute the transaction reservation for a growrtfree transaction. The reservations are computed at mount time, which means that m_rsumsize is zero when growfs starts "freeing" the new realtime extents into the rt volume. As a result, the transaction is undersized and fails. Fix this by recomputing the transaction reservations every time we change m_rsumsize. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-06xfs: don't allow overly small or large realtime volumesDarrick J. Wong
Don't allow realtime volumes that are less than one rt extent long. This has been broken across 4 LTS kernels with nobody noticing, so let's just disable it. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-12-06xfs: make rextslog computation consistent with mkfsDarrick J. Wong
There's a weird discrepancy in xfsprogs dating back to the creation of the Linux port -- if there are zero rt extents, mkfs will set sb_rextents and sb_rextslog both to zero: sbp->sb_rextslog = (uint8_t)(rtextents ? libxfs_highbit32((unsigned int)rtextents) : 0); However, that's not the check that xfs_repair uses for nonzero rtblocks: if (sb->sb_rextslog != libxfs_highbit32((unsigned int)sb->sb_rextents)) The difference here is that xfs_highbit32 returns -1 if its argument is zero. Unfortunately, this means that in the weird corner case of a realtime volume shorter than 1 rt extent, xfs_repair will immediately flag a freshly formatted filesystem as corrupt. Because mkfs has been writing ondisk artifacts like this for decades, we have to accept that as "correct". TBH, zero rextslog for zero rtextents makes more sense to me anyway. Regrettably, the superblock verifier checks created in commit copied xfs_repair even though mkfs has been writing out such filesystems for ages. Fix the superblock verifier to accept what mkfs spits out; the userspace version of this patch will have to fix xfs_repair as well. Note that the new helper leaves the zeroday bug where the upper 32 bits of sb_rextents is ripped off and fed to highbit32. This leads to a seriously undersized rt summary file, which immediately breaks mkfs: $ hugedisk.sh foo /dev/sdc $(( 0x100000080 * 4096))B $ /sbin/mkfs.xfs -f /dev/sda -m rmapbt=0,reflink=0 -r rtdev=/dev/mapper/foo meta-data=/dev/sda isize=512 agcount=4, agsize=1298176 blks = sectsz=512 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=0 = reflink=0 bigtime=1 inobtcount=1 nrext64=1 data = bsize=4096 blocks=5192704, imaxpct=25 = sunit=0 swidth=0 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=16384, version=2 = sectsz=512 sunit=0 blks, lazy-count=1 realtime =/dev/mapper/foo extsz=4096 blocks=4294967424, rtextents=4294967424 Discarding blocks...Done. mkfs.xfs: Error initializing the realtime space [117 - Structure needs cleaning] The next patch will drop support for rt volumes with fewer than 1 or more than 2^32-1 rt extents, since they've clearly been broken forever. Fixes: f8e566c0f5e1f ("xfs: validate the realtime geometry in xfs_validate_sb_common") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-11-08Merge tag 'xfs-6.7-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs updates from Chandan Babu: - Realtime device subsystem: - Cleanup usage of xfs_rtblock_t and xfs_fsblock_t data types - Replace open coded conversions between rt blocks and rt extents with calls to static inline helpers - Replace open coded realtime geometry compuation and macros with helper functions - CPU usage optimizations for realtime allocator - Misc bug fixes associated with Realtime device - Allow read operations to execute while an FICLONE ioctl is being serviced - Misc bug fixes: - Alert user when xfs_droplink() encounters an inode with a link count of zero - Handle the case where the allocator could return zero extents when servicing an fallocate request * tag 'xfs-6.7-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (40 commits) xfs: allow read IO and FICLONE to run concurrently xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space xfs: introduce protection for drop nlink xfs: don't look for end of extent further than necessary in xfs_rtallocate_extent_near() xfs: don't try redundant allocations in xfs_rtallocate_extent_near() xfs: limit maxlen based on available space in xfs_rtallocate_extent_near() xfs: return maximum free size from xfs_rtany_summary() xfs: invert the realtime summary cache xfs: simplify rt bitmap/summary block accessor functions xfs: simplify xfs_rtbuf_get calling conventions xfs: cache last bitmap block in realtime allocator xfs: use accessor functions for summary info words xfs: consolidate realtime allocation arguments xfs: create helpers for rtsummary block/wordcount computations xfs: use accessor functions for bitmap words xfs: create helpers for rtbitmap block/wordcount computations xfs: create a helper to handle logging parts of rt bitmap/summary blocks xfs: convert rt summary macros to helpers xfs: convert open-coded xfs_rtword_t pointer accesses to helper xfs: remove XFS_BLOCKWSIZE and XFS_BLOCKWMASK macros ...
2023-10-19xfs: don't look for end of extent further than necessary in ↵Omar Sandoval
xfs_rtallocate_extent_near() As explained in the previous commit, xfs_rtallocate_extent_near() looks for the end of a free extent when searching backwards from the target bitmap block. Since the previous commit, it searches from the last bitmap block it checked to the bitmap block containing the start of the extent. This may still be more than necessary, since the free extent may not be that long. We know the maximum size of the free extent from the realtime summary. Use that to compute how many bitmap blocks we actually need to check. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-19xfs: don't try redundant allocations in xfs_rtallocate_extent_near()Omar Sandoval
xfs_rtallocate_extent_near() tries to find a free extent as close to a target bitmap block given by bbno as possible, which may be before or after bbno. Searching backwards has a complication: the realtime summary accounts for free space _starting_ in a bitmap block, but not straddling or ending in a bitmap block. So, when the negative search finds a free extent in the realtime summary, in order to end up closer to the target, it looks for the end of the free extent. For example, if bbno - 2 has a free extent, then it will check bbno - 1, then bbno - 2. But then if bbno - 3 has a free extent, it will check bbno - 1 again, then bbno - 2 again, and then bbno - 3. This results in a quadratic loop, which is completely pointless since the repeated checks won't find anything new. Fix it by remembering where we last checked up to and continue from there. This also obviates the need for a check of the realtime summary. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-19xfs: limit maxlen based on available space in xfs_rtallocate_extent_near()Omar Sandoval
xfs_rtallocate_extent_near() calls xfs_rtallocate_extent_block() with the minlen and maxlen that were passed to it. xfs_rtallocate_extent_block() then scans the bitmap block looking for a free range of size maxlen. If there is none, it has to scan the whole bitmap block before returning the largest range of at least size minlen. For a fragmented realtime device and a large allocation request, it's almost certain that this will have to search the whole bitmap block, leading to high CPU usage. However, the realtime summary tells us the maximum size available in the bitmap block. We can limit the search in xfs_rtallocate_extent_block() to that size and often stop before scanning the whole bitmap block. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-19xfs: return maximum free size from xfs_rtany_summary()Omar Sandoval
Instead of only returning whether there is any free space, return the maximum size, which is fast thanks to the previous commit. This will be used by two upcoming optimizations. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-19xfs: invert the realtime summary cacheOmar Sandoval
In commit 355e3532132b ("xfs: cache minimum realtime summary level"), I added a cache of the minimum level of the realtime summary that has any free extents. However, it turns out that the _maximum_ level is more useful for upcoming optimizations, and basically equivalent for the existing usage. So, let's change the meaning of the cache to be the maximum level + 1, or 0 if there are no free extents. For example, if the cache contains: {0, 4} then there are no free extents starting in realtime bitmap block 0, and there are no free extents larger than or equal to 2^4 blocks starting in realtime bitmap block 1. The cache is a loose upper bound, so there may or may not be free extents smaller than 2^4 blocks in realtime bitmap block 1. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-19xfs: cache last bitmap block in realtime allocatorOmar Sandoval
Profiling a workload on a highly fragmented realtime device showed a ton of CPU cycles being spent in xfs_trans_read_buf() called by xfs_rtbuf_get(). Further tracing showed that much of that was repeated calls to xfs_rtbuf_get() for the same block of the realtime bitmap. These come from xfs_rtallocate_extent_block(): as it walks through ranges of free bits in the bitmap, each call to xfs_rtcheck_range() and xfs_rtfind_{forw,back}() gets the same bitmap block. If the bitmap block is very fragmented, then this is _a lot_ of buffer lookups. The realtime allocator already passes around a cache of the last used realtime summary block to avoid repeated reads (the parameters rbpp and rsb). We can do the same for the realtime bitmap. This replaces rbpp and rsb with a struct xfs_rtbuf_cache, which caches the most recently used block for both the realtime bitmap and summary. xfs_rtbuf_get() now handles the caching instead of the callers, which requires plumbing xfs_rtbuf_cache to more functions but also makes sure we don't miss anything. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-18xfs: consolidate realtime allocation argumentsDave Chinner
Consolidate the arguments passed around the rt allocator into a struct xfs_rtalloc_arg similar to how the btree allocator arguments are consolidated in a struct xfs_alloc_arg.... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-18xfs: create helpers for rtsummary block/wordcount computationsDarrick J. Wong
Create helper functions that compute the number of blocks or words necessary to store the rt summary file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-18xfs: create helpers for rtbitmap block/wordcount computationsDarrick J. Wong
Create helper functions that compute the number of blocks or words necessary to store the rt bitmap. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-18xfs: convert to new timestamp accessorsJeff Layton
Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-75-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-10-17xfs: convert the rtbitmap block and bit macros to static inline functionsDarrick J. Wong
Replace these macros with typechecked helper functions. Eventually we're going to add more logic to the helpers and it'll be easier if we don't have to macro it up. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: use shifting and masking when converting rt extents, if possibleDarrick J. Wong
Avoid the costs of integer division (32-bit and 64-bit) if the realtime extent size is a power of two. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: convert do_div calls to xfs_rtb_to_rtx helper callsDarrick J. Wong
Convert these calls to use the helpers, and clean up all these places where the same variable can have different units depending on where it is in the function. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: convert rt extent numbers to xfs_rtxnum_tDarrick J. Wong
Further disambiguate the xfs_rtblock_t uses by creating a new type, xfs_rtxnum_t, to store the position of an extent within the realtime section, in units of rtextents. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: convert rt bitmap/summary block numbers to xfs_fileoff_tDarrick J. Wong
We should use xfs_fileoff_t to store the file block offset of any location within the realtime bitmap or summary files. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: convert xfs_extlen_t to xfs_rtxlen_t in the rt allocatorDarrick J. Wong
In most of the filesystem, we use xfs_extlen_t to store the length of a file (or AG) space mapping in units of fs blocks. Unfortunately, the realtime allocator also uses it to store the length of a rt space mapping in units of rt extents. This is confusing, since one rt extent can consist of many fs blocks. Separate the two by introducing a new type (xfs_rtxlen_t) to store the length of a space mapping (in units of realtime extents) that would be found in a file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: move the xfs_rtbitmap.c declarations to xfs_rtbitmap.hDarrick J. Wong
Move all the declarations for functionality in xfs_rtbitmap.c into a separate xfs_rtbitmap.h header file. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: make sure maxlen is still congruent with prod when rounding downDarrick J. Wong
In commit 2a6ca4baed62, we tried to fix an overflow problem in the realtime allocator that was caused by an overly large maxlen value causing xfs_rtcheck_range to run off the end of the realtime bitmap. Unfortunately, there is a subtle bug here -- maxlen (and minlen) both have to be aligned with @prod, but @prod can be larger than 1 if the user has set an extent size hint on the file, and that extent size hint is larger than the realtime extent size. If the rt free space extents are not aligned to this file's extszhint because other files without extent size hints allocated space (or the number of rt extents is similarly not aligned), then it's possible that maxlen after clamping to sb_rextents will no longer be aligned to prod. The allocation will succeed just fine, but we still trip the assertion. Fix the problem by reducing maxlen by any misalignment with prod. While we're at it, split the assertions into two so that we can tell which value had the bad alignment. Fixes: 2a6ca4baed62 ("xfs: make sure the rt allocator doesn't run off the end") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2023-10-17xfs: prevent rt growfs when quota is enabledDarrick J. Wong
Quotas aren't (yet) supported with realtime, so we shouldn't allow userspace to set up a realtime section when quotas are enabled, even if they attached one via mount options. IOWS, you shouldn't be able to do: # mkfs.xfs -f /dev/sda # mount /dev/sda /mnt -o rtdev=/dev/sdb,usrquota # xfs_growfs -r /mnt Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-11-16xfs: make rtbitmap ILOCKing consistent when scanning the rt bitmap fileDarrick J. Wong
xfs_rtalloc_query_range scans the realtime bitmap file in order of increasing file offset, so this caller can take ILOCK_SHARED on the rt bitmap inode instead of ILOCK_EXCL. This isn't going to yield any practical benefits at mount time, but we'd like to make the locking usage consistent around xfs_rtalloc_query_all calls. Make all the places we do this use the same xfs_ilock lockflags for consistency. Fixes: 4c934c7dd60c ("xfs: report realtime space information via the rtbitmap") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>
2022-11-16xfs: load rtbitmap and rtsummary extent mapping btrees at mount timeDarrick J. Wong
It turns out that GETFSMAP and online fsck have had a bug for years due to their use of ILOCK_SHARED to coordinate their linear scans of the realtime bitmap. If the bitmap file's data fork happens to be in BTREE format and the scan occurs immediately after mounting, the incore bmbt will not be populated, leading to ASSERTs tripping over the incorrect inode state. Because the bitmap scans always lock bitmap buffers in increasing order of file offset, it is appropriate for these two callers to take a shared ILOCK to improve scalability. To fix this problem, load both data and attr fork state into memory when mounting the realtime inodes. Realtime metadata files aren't supposed to have an attr fork so the second step is likely a nop. On most filesystems this is unlikely since the rtbitmap data fork is usually in extents format, but it's possible to craft a filesystem that will by fragmenting the free space in the data section and growfsing the rt section. Fixes: 4c934c7dd60c ("xfs: report realtime space information via the rtbitmap") Also-Fixes: 46d9bfb5e706 ("xfs: cross-reference the realtime bitmap") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>