diff options
author | Shaohua Li <shli@fb.com> | 2015-08-13 14:31:59 -0700 |
---|---|---|
committer | NeilBrown <neilb@suse.com> | 2015-10-24 17:16:19 +1100 |
commit | f6bed0ef0a808164f51197de062e0450ce6c1f96 (patch) | |
tree | 9fc2ec276f40ef36eaa14e295f543595472b56a9 /drivers/md/raid5.h | |
parent | b70abcb24711d1327a8a505ab3e931c24cbab0a7 (diff) |
raid5: add basic stripe log
This introduces a simple log for raid5. Data/parity writing to raid
array first writes to the log, then write to raid array disks. If
crash happens, we can recovery data from the log. This can speed up
raid resync and fix write hole issue.
The log structure is pretty simple. Data/meta data is stored in block
unit, which is 4k generally. It has only one type of meta data block.
The meta data block can track 3 types of data, stripe data, stripe
parity and flush block. MD superblock will point to the last valid
meta data block. Each meta data block has checksum/seq number, so
recovery can scan the log correctly. We store a checksum of stripe
data/parity to the metadata block, so meta data and stripe data/parity
can be written to log disk together. otherwise, meta data write must
wait till stripe data/parity is finished.
For stripe data, meta data block will record stripe data sector and
size. Currently the size is always 4k. This meta data record can be made
simpler if we just fix write hole (eg, we can record data of a stripe's
different disks together), but this format can be extended to support
caching in the future, which must record data address/size.
For stripe parity, meta data block will record stripe sector. It's
size should be 4k (for raid5) or 8k (for raid6). We always store p
parity first. This format should work for caching too.
flush block indicates a stripe is in raid array disks. Fixing write
hole doesn't need this type of meta data, it's for caching extension.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Diffstat (limited to 'drivers/md/raid5.h')
-rw-r--r-- | drivers/md/raid5.h | 9 |
1 files changed, 9 insertions, 0 deletions
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h index a42c123d15d2..87fae2b50d87 100644 --- a/drivers/md/raid5.h +++ b/drivers/md/raid5.h @@ -223,6 +223,9 @@ struct stripe_head { struct stripe_head *batch_head; /* protected by stripe lock */ spinlock_t batch_lock; /* only header's lock is useful */ struct list_head batch_list; /* protected by head's batch lock*/ + + struct r5l_io_unit *log_io; + struct list_head log_list; /** * struct stripe_operations * @target - STRIPE_OP_COMPUTE_BLK target @@ -244,6 +247,7 @@ struct stripe_head { struct bio *toread, *read, *towrite, *written; sector_t sector; /* sector of this page */ unsigned long flags; + u32 log_checksum; } dev[1]; /* allocated with extra space depending of RAID geometry */ }; @@ -544,6 +548,7 @@ struct r5conf { struct r5worker_group *worker_groups; int group_cnt; int worker_cnt_per_group; + struct r5l_log *log; }; @@ -618,4 +623,8 @@ extern sector_t raid5_compute_sector(struct r5conf *conf, sector_t r_sector, extern struct stripe_head * raid5_get_active_stripe(struct r5conf *conf, sector_t sector, int previous, int noblock, int noquiesce); +extern int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev); +extern void r5l_exit_log(struct r5l_log *log); +extern int r5l_write_stripe(struct r5l_log *log, struct stripe_head *head_sh); +extern void r5l_write_stripe_run(struct r5l_log *log); #endif |