Age | Commit message (Collapse) | Author |
|
Shaving more cycles.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Return an error instead (still work in progress...)
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
len might fit into a loff_t when aligned_len does not - make sure we use
a u64 for aligned_len. Also, we weren't always extending the inode
correctly.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Discovered by xfstests generic/553
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can assume that usually buffered and O_DIRECT IO won't be mixed, and
the calls to flush the page cache won't be needed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The chain_end field was not initialized before use in
hash_set_chain_start.
Signed-off-by: Justin Husted <sigstop@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In theory we should be able to do (non appending/extending) dio writes
without taking the inode lock at all - but this gets us most of the way
there.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This adds some horrible hacks, but the atomic ops for closures were
getting to be a pretty expensive part of the write path. We don't want
to rip out closures entirely from the write path, because they're used
for e.g. waiting on the allocator, or waiting on the journal flush, and
that stuff would get really ugly without closures.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_extent_ptr_decoded_append() is more general than we need here; we
know we're initializing a new extent so e.g. we're going to need the crc
entry.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This is considerably cheaper than bch2_btree_node_iter_fix(), for cases
where the key was only modified and key ordering isn't changing.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The main optimization here is that if we let
bch2_replicas_delta_list_apply() fail, we can completely skip calling
bch2_bkey_replicas_marked_locked().
And assorted other small optimizations.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This lets us avoid a cache miss in the write path.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Improve a few paper cuts that've shown up during profiling.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Valgrind data indicated that the flags field was only partially
initialized when written to disk.
Signed-off-by: Justin Husted <sigstop@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The previous patch 128cb1a to fix uninitialized data was incorrect and
did not initialize the padding space correctly. Furthermore, several
other cases in this function do not initialize their padding space
correctly.
Move initialization into some helper functions in a more robust way.
Signed-off-by: Justin Husted <sigstop@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Packed bkeys are padded up to 64 bit alignment, but the alloc bkey type
was not clearing the pad bytes after the last data byte. This left the
key possibly containing some random garbage at the end.
This problem was found using valgrind.
This patch also changes a path with the inode bkey to clear in the same
way.
Signed-off-by: Justin Husted <sigstop@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
On IO error, bch2_writepages_io_done() will set the page state to
indicate nothing's already reserved (since the write didn't happen, we
don't know what's already reserved). This can race with the buffered IO
path, in between getting a disk reservation and calling
bch2_set_page_dirty().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can't reuse bios without reinitializing them, and in the retry path
it's safer to just make sure we don't reuse them at all.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This is dead code
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This works around a bug where bio_full() doesn't check for
bio->bi_iter.bi_size overflowing - and, we don't really want to build
bios that are that big anyways.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The iterator counting assumed we're doing an obvious optimization when
only updating the refcount on indirect extents - but we're not doing it
yet.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Previously, we'd go into an infinite loop.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We definitely don't need an exclusive inode lock for readdir.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We have to free the old (in memory) btree node _before_ unlocking the
new nodes - else, some other thread with a read lock on the old node
could see stale data after another thread has already updated the new
node.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The generic IO path now handles inode updates for i_size and i_sectors -
this means we can drop a fair amount of code from fs-io.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
As before - we're moving non Linux specific code out of fs-io.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The next few patches are going to be more moving the logic around
i_size/i_sectors updates to io.c, and better separating the Linux VFS
specific code from core bcachefs code, to better support the fuse port.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Moving bch2_extent_update() to io.c will be greatly simplified if we
no longer have to keep ei_inode.bi_size/bi_sectors up to date.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In bch2_extent_update(), we have to update the inode if i_size is
changing (the file is being extend) or if i_sectors is changing, but we
want to avoid touching the inode if it's not necessary.
Change sum_sector_overwrites() to also check if there's already data
above where we're writing to - this means we're definitely not extending
the file.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
this deserves a unit test
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The tweaks to ctx->pos handling are also to help the fuse port
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can't use the page lock to protect it, because on writeback IO error
we need to access the page state before calling end_page_writeback() and
the page lock semantics are completely insane so that deadlocks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Disk space accounting for erasure coding + compression was completely
broken - we need to calculate the parity sectors delta the same way we
calculate disk_sectors, by calculating the old and new usage and
subtracting to get the difference.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The bkey_s_c returned by btree_iter_(peek|next) points into the btree
iter type, so advancing the iterator and then using the one previously
returned is a bug...
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|