block: fix null pointer dereference in blk_mq_rq_timed_out()

We got a null pointer deference BUG_ON in blk_mq_rq_timed_out() as following: [ 108.825472] BUG: kernel NULL pointer dereference, address: 0000000000000040 [ 108.827059] PGD 0 P4D 0 [ 108.827313] Oops: 0000 [#1] SMP PTI [ 108.827657] CPU: 6 PID: 198 Comm: kworker/6:1H Not tainted 5.3.0-rc8+ #431 [ 108.829503] Workqueue: kblockd blk_mq_timeout_work [ 108.829913] RIP: 0010:blk_mq_check_expired+0x258/0x330 [ 108.838191] Call Trace: [ 108.838406] bt_iter+0x74/0x80 [ 108.838665] blk_mq_queue_tag_busy_iter+0x204/0x450 [ 108.839074] ? __switch_to_asm+0x34/0x70 [ 108.839405] ? blk_mq_stop_hw_queue+0x40/0x40 [ 108.839823] ? blk_mq_stop_hw_queue+0x40/0x40 [ 108.840273] ? syscall_return_via_sysret+0xf/0x7f [ 108.840732] blk_mq_timeout_work+0x74/0x200 [ 108.841151] process_one_work+0x297/0x680 [ 108.841550] worker_thread+0x29c/0x6f0 [ 108.841926] ? rescuer_thread+0x580/0x580 [ 108.842344] kthread+0x16a/0x1a0 [ 108.842666] ? kthread_flush_work+0x170/0x170 [ 108.843100] ret_from_fork+0x35/0x40 The bug is caused by the race between timeout handle and completion for flush request. When timeout handle function blk_mq_rq_timed_out() try to read 'req->q->mq_ops', the 'req' have completed and reinitiated by next flush request, which would call blk_rq_init() to clear 'req' as 0. After commit 12f5b93145 ("blk-mq: Remove generation seqeunce"), normal requests lifetime are protected by refcount. Until 'rq->ref' drop to zero, the request can really be free. Thus, these requests cannot been reused before timeout handle finish. However, flush request has defined .end_io and rq->end_io() is still called even if 'rq->ref' doesn't drop to zero. After that, the 'flush_rq' can be reused by the next flush request handle, resulting in null pointer deference BUG ON. We fix this problem by covering flush request with 'rq->ref'. If the refcount is not zero, flush_end_io() return and wait the last holder recall it. To record the request status, we add a new entry 'rq_status', which will be used in flush_end_io(). Cc: Christoph Hellwig <hch@infradead.org> Cc: Keith Busch <keith.busch@intel.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: stable@vger.kernel.org # v4.18+ Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Yufen Yu <yuyufen@huawei.com> ------- v2: - move rq_status from struct request to struct blk_flush_queue v3: - remove unnecessary '{}' pair. v4: - let spinlock to protect 'fq->rq_status' v5: - move rq_status after flush_running_idx member of struct blk_flush_queue Signed-off-by: Jens Axboe <axboe@kernel.dk>
author: Yufen Yu <yuyufen@huawei.com> 2019-09-27 16:19:55 +0800
committer: Jens Axboe <axboe@kernel.dk> 2019-09-27 07:01:25 -0600
commit: 8d6996630c03d7ceeabe2611378fea5ca1c3f1b3 (patch)
tree: 2663815e5f56768c84d80685d1c4dc44156a9840 /block/blk-flush.c
parent: 2af2783f2ea4f273598557b247e320509247df80 (diff)
1 files changed, 10 insertions, 0 deletions
diff --git a/block/blk-flush.c b/block/blk-flush.c
index aedd9320e605..1eec9cbe5a0a 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -214,6 +214,16 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 
 	/* release the tag's ownership to the req cloned from */
 	spin_lock_irqsave(&fq->mq_flush_lock, flags);
+
+	if (!refcount_dec_and_test(&flush_rq->ref)) {
+		fq->rq_status = error;
+		spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
+		return;
+	}
+
+	if (fq->rq_status != BLK_STS_OK)
+		error = fq->rq_status;
+
 	hctx = flush_rq->mq_hctx;
 	if (!q->elevator) {
 		blk_mq_tag_set_rq(hctx, flush_rq->tag, fq->orig_rq);
author	Yufen Yu <yuyufen@huawei.com>	2019-09-27 16:19:55 +0800
committer	Jens Axboe <axboe@kernel.dk>	2019-09-27 07:01:25 -0600
commit	8d6996630c03d7ceeabe2611378fea5ca1c3f1b3 (patch)
tree	2663815e5f56768c84d80685d1c4dc44156a9840 /block/blk-flush.c
parent	2af2783f2ea4f273598557b247e320509247df80 (diff)