summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-01-15sbitmap: remove stale comment in sbq_calc_wake_batchKemeng Shi
After commit 106397376c036 ("sbitmap: fix batching wakeup"), we may wake up more than one queue for each batch. Just remove stale comment that we wake up only one queue for each batch. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20240115145626.665562-1-shikemeng@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-15block: Correct a documentation comment in blk-cgroup.cNicky Chorley
Commit 99e603874366 ("blk-cgroup: pass a gendisk to the blkg allocation helpers") changed blkg_alloc() to take a struct gendisk instead of a struct request_queue, but the documentation comment still referred to q. So, update that comment to refer to disk instead and fix a typo. Signed-off-by: Nicky Chorley <ndchorley@gmail.com> Link: https://lore.kernel.org/r/20240114191056.6992-1-ndchorley@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-14null_blk: Remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/bf257b1078475a415cdc3344c6a750842946e367.1705222845.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-12block: ensure we hold a queue reference when using queue limitsJens Axboe
q_usage_counter is the only thing preventing us from the limits changing under us in __bio_split_to_limits, but blk_mq_submit_bio doesn't hold it while calling into it. Move the splitting inside the region where we know we've got a queue reference. Ideally this could still remain a shared section of code, but let's keep the fix simple and defer any refactoring here to later. Reported-by: Christoph Hellwig <hch@lst.de> Fixes: 900e08075202 ("block: move queue enter logic into blk_mq_submit_bio()") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-12blk-mq: rename blk_mq_can_use_cached_rqChristoph Hellwig
blk_mq_can_use_cached_rq doesn't just check if we can use the request, but also performs the work to actually use it. Remove the _can in the naming, and improve the comment describing the function. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240111135705.2155518-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-12block: print symbolic error name instead of error codeChristian Heusel
Utilize the %pe print specifier to get the symbolic error name as a string (i.e "-ENOMEM") in the log message instead of the error code to increase its readablility. This change was suggested in https://lore.kernel.org/all/92972476-0b1f-4d0a-9951-af3fc8bc6e65@suswa.mountain/ Signed-off-by: Christian Heusel <christian@heusel.eu> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20240111231521.1596838-1-christian@heusel.eu Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-12blk-mq: fix IO hang from sbitmap wakeup raceMing Lei
In blk_mq_mark_tag_wait(), __add_wait_queue() may be re-ordered with the following blk_mq_get_driver_tag() in case of getting driver tag failure. Then in __sbitmap_queue_wake_up(), waitqueue_active() may not observe the added waiter in blk_mq_mark_tag_wait() and wake up nothing, meantime blk_mq_mark_tag_wait() can't get driver tag successfully. This issue can be reproduced by running the following test in loop, and fio hang can be observed in < 30min when running it on my test VM in laptop. modprobe -r scsi_debug modprobe scsi_debug delay=0 dev_size_mb=4096 max_queue=1 host_max_queue=1 submit_queues=4 dev=`ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter*/host*/target*/*/block/* | head -1 | xargs basename` fio --filename=/dev/"$dev" --direct=1 --rw=randrw --bs=4k --iodepth=1 \ --runtime=100 --numjobs=40 --time_based --name=test \ --ioengine=libaio Fix the issue by adding one explicit barrier in blk_mq_mark_tag_wait(), which is just fine in case of running out of tag. Cc: Jan Kara <jack@suse.cz> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Reported-by: Changhui Zhong <czhong@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240112122626.4181044-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-10Merge tag 'nvme-6.8-2024-1-10' of git://git.infradead.org/nvme into ↵Jens Axboe
for-6.8/block Pull NVMe changes from Keith: "nvme follow-up updates for Linux 6.8 - tcp, fc, and rdma target fixes (Maurizio, Daniel, Hannes, Christoph) - discard fixes and improvements (Christoph) - timeout debug improvements (Keith, Max) - various cleanups (Daniel, Max, Giuxen) - trace event string fixes (Arnd) - shadow doorbell setup on reset fix (William) - a write zeroes quirk for SK Hynix (Jim)" * tag 'nvme-6.8-2024-1-10' of git://git.infradead.org/nvme: (25 commits) nvmet-rdma: avoid circular locking dependency on install_queue() nvmet-tcp: avoid circular locking dependency on install_queue() nvme-pci: set doorbell config before unquiescing nvmet-tcp: Fix the H2C expected PDU len calculation nvme-tcp: enhance timeout kernel log nvme-rdma: enhance timeout kernel log nvme-pci: enhance timeout kernel log nvme: trace: avoid memcpy overflow warning nvmet: re-fix tracing strncpy() warning nvme: introduce nvme_disk_is_ns_head helper nvme-pci: disable write zeroes for SK Hynix BC901 nvmet-fcloop: Remove remote port from list when unlinking nvmet-trace: avoid dereferencing pointer too early nvmet-fc: remove unnecessary bracket nvme: simplify the max_discard_segments calculation nvme: fix max_discard_sectors calculation nvme: also skip discard granularity updates in nvme_config_discard nvme: update the explanation for not updating the limits in nvme_config_discard nvmet-tcp: fix a missing endianess conversion in nvmet_tcp_try_peek_pdu nvme-common: mark nvme_tls_psk_prio static ...
2024-01-10nvmet-rdma: avoid circular locking dependency on install_queue()Hannes Reinecke
nvmet_rdma_install_queue() is driven from the ->io_work workqueue function, but will call flush_workqueue() which might trigger ->release_work() which in itself calls flush_work on ->io_work. To avoid that check for pending queue in disconnecting status, and return 'controller busy' when we reached a certain threshold. Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-10nvmet-tcp: avoid circular locking dependency on install_queue()Hannes Reinecke
nvmet_tcp_install_queue() is driven from the ->io_work workqueue function, but will call flush_workqueue() which might trigger ->release_work() which in itself calls flush_work on ->io_work. To avoid that check for pending queue in disconnecting status, and return 'controller busy' when we reached a certain threshold. Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-10nvme-pci: set doorbell config before unquiescingWilliam Butler
During resets, if queues are unquiesced first, then the host can submit IOs to the controller using shadow doorbell logic but the controller won't be aware. This can lead to necessary MMIO doorbells from being not issued, causing requests to be delayed and timed-out. Signed-off-by: William Butler <wab@google.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-10block: fix partial zone append completion handling in req_bio_endio()Damien Le Moal
Partial completions of zone append request is not allowed but if a zone append completion indicates a number of completed bytes different from the original BIO size, only the BIO status is set to error. This leads to bio_advance() not setting the BIO size to 0 and thus to not call bio_endio() at the end of req_bio_endio(). Make sure a partially completed zone append is failed and completed immediately by forcing the completed number of bytes (nbytes) to be equal to the BIO size, thus ensuring that bio_endio() is called. Fixes: 297db731847e ("block: fix req_bio_endio append error handling") Cc: stable@kernel.vger.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20240110092942.442334-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-10block/iocost: silence warning on 'last_period' potentially being unusedJens Axboe
If CONFIG_TRACEPOINTS isn't enabled, we assign this variable but then never use it. This can cause the compiler to complain about that: block/blk-iocost.c:1264:6: warning: variable 'last_period' set but not used [-Wunused-but-set-variable] 1264 | u64 last_period, cur_period; | ^ Rather than add ifdefs to guard this, just mark it __maybe_unused. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401102335.GiWdeIo9-lkp@intel.com/ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-09Merge tag 'md-6.8-20240109' of ↵Jens Axboe
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.8/block Pull MD fixes from Song: "1. Sparse warning since v6.0, by Bart; 2. /proc/mdstat regression since v6.7, by Yu Kuai." * tag 'md-6.8-20240109' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md/raid1: Use blk_opf_t for read and write operations md: Fix md_seq_ops() regressions
2024-01-09md/raid1: Use blk_opf_t for read and write operationsBart Van Assche
Use the type blk_opf_t for read and write operations instead of int. This patch does not affect the generated code but fixes the following sparse warning: drivers/md/raid1.c:1993:60: sparse: sparse: incorrect type in argument 5 (different base types) expected restricted blk_opf_t [usertype] opf got int rw Cc: Song Liu <song@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Fixes: 3c5e514db58f ("md/raid1: Use the new blk_opf_t type") Cc: stable@vger.kernel.org # v6.0+ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401080657.UjFnvQgX-lkp@intel.com/ Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240108001223.23835-1-bvanassche@acm.org
2024-01-09md: Fix md_seq_ops() regressionsYu Kuai
Commit cf1b6d4441ff ("md: simplify md_seq_ops") introduce following regressions: 1) If list all_mddevs is emptly, personalities and unused devices won't be showed to user anymore. 2) If seq_file buffer overflowed from md_seq_show(), then md_seq_start() will be called again, hence personalities will be showed to user again. 3) If seq_file buffer overflowed from md_seq_stop(), seq_read_iter() doesn't handle this, hence unused devices won't be showed to user. Fix above problems by printing personalities and unused devices in md_seq_show(). Fixes: cf1b6d4441ff ("md: simplify md_seq_ops") Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240109133957.2975272-1-yukuai1@huaweicloud.com
2024-01-08block: make __get_task_ioprio() easier to readJens Axboe
We don't need to do any gymnastics if we don't have an io_context assigned at all, so just return early with our default priority. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-08block: move __get_task_ioprio() into header fileJens Axboe
We call this once per IO, which can be millions of times per second. Since nobody really uses io priorities, or at least it isn't very common, this is all wasted time and can amount to as much as 3% of the total kernel time. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-08nvmet-tcp: Fix the H2C expected PDU len calculationMaurizio Lombardi
The nvmet_tcp_handle_h2c_data_pdu() function should take into consideration the possibility that the header digest and/or the data digests are enabled when calculating the expected PDU length, before comparing it to the value stored in cmd->pdu_len. Fixes: efa56305908b ("nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU length") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-08nvme-tcp: enhance timeout kernel logMax Gurtovoy
Print the command_id along side blk-mq's tag to help match commands with protocol wire traces and logs. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-08nvme-rdma: enhance timeout kernel logMax Gurtovoy
Print the command_id along side blk-mq's tag to help match commands with protocol wire traces and logs. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-08nvme-pci: enhance timeout kernel logKeith Busch
Kernel configs don't necessarily have opcode decoding, and some opcodes are not even decodable. It is still interesting for debugging SSD issues to know what opcode is timing out, what request type it came from, and the data size (if applicable). Also print the command_id along side blk-mq's tag to help match commands with protocol wire traces and firmware logs, Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-08block: Treat sequential write preferred zone type as invalidDamien Le Moal
With the removal of the support for host-aware zoned devices, blk_revalidate_zone_cb() should never see the zone type BLK_ZONE_TYPE_SEQWRITE_PREF (sequential write preffered zones). Treat this zone type as being invalid. Fixes: 7437bb73f087 ("block: remove support for the host aware zone model") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240107072212.1071080-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-08block: remove disk_clear_zonedChristoph Hellwig
disk_clear_zoned is unused now that the last warts of the host-aware model support in sd are gone. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20231228075141.362560-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-08sd: remove the !ZBC && blk_queue_is_zoned case in sd_read_block_characteristicsChristoph Hellwig
Now that host-aware devices are always treated as conventional this case can't happen. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20231228075141.362560-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-05nvme: trace: avoid memcpy overflow warningArnd Bergmann
A previous patch introduced a struct_group() in nvme_common_command to help stringop fortification figure out the length of the fields, but one function is not currently using them: In file included from drivers/nvme/target/core.c:7: In file included from include/linux/string.h:254: include/linux/fortify-string.h:592:4: error: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror,-Wattribute-warning] __read_overflow2_field(q_size_field, size); ^ Change this one to use the correct field name to avoid the warning. Fixes: 5c629dc9609dc ("nvme: use struct group for generic command dwords") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-05nvmet: re-fix tracing strncpy() warningArnd Bergmann
An earlier patch had tried to address a warning about a string copy with missing zero termination: drivers/nvme/target/trace.h:52:3: warning: ‘strncpy’ specified bound 32 equals destination size [-Wstringop-truncation] The new version causes a different warning with some compiler versions, notably gcc-9 and gcc-10, and also misses the zero padding that was apparently done intentionally in the original code: drivers/nvme/target/trace.h:56:2: error: 'strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=] Change it to use strscpy_pad() with the original length, which will give a properly padded and zero-terminated string as well as avoiding the warning. Fixes: d86481e924a7 ("nvmet: use min of device_path and disk len") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-05nvme: introduce nvme_disk_is_ns_head helperGuixin Liu
We currently rely on gendisk's file operations (fops) to distinguish between a namespace head (ns_head) and a regular namespace. To enhance code readability, introduce a helper function. Additionally, we must ensure that the device is not an ns_head before calling nvme_get_ns_from_dev(). To enforce this, add a WARN_ON check within the nvme_get_ns_from_dev(). Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Liu Song <liusong@linux.alibaba.com> [include fix: https://lore.kernel.org/oe-kbuild-all/202401031943.0N72Tkji-lkp@intel.com/] Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-05nvme-pci: disable write zeroes for SK Hynix BC901Jim.Lin
SK Hynix BC901 drive write zero will cause Chromebook takes more than 20 mins to switch to developer mode "disable write zeroes" can fix this issue and Sk Hynix has been verified. Signed-off-by: Jim.Lin <jim.lin@siliconmotion.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-05nvmet-fcloop: Remove remote port from list when unlinkingDaniel Wagner
The remote port is removed too late from fcloop_nports list. Remove it when port is unregistered. This prevents a busy loop in fcloop_exit, because it is possible the remote port is found in the list and thus we will never progress. The kernel log will be spammed with nvme_fcloop: fcloop_exit: Failed deleting remote port nvme_fcloop: fcloop_exit: Failed deleting target port Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-04drivers/block/xen-blkback/common.h: Fix spelling typo in commentliyouhong
Fix spelling typo in comment. Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: liyouhong <liyouhong@kylinos.cn> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20231226095701.172080-1-liyouhong@kylinos.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-04blk-cgroup: fix rcu lockdep warning in blkg_lookup()Ming Lei
blkg_lookup() is called with either queue_lock or rcu read lock, so use rcu_dereference_check(lockdep_is_held(&q->queue_lock)) for retrieving 'blkg', which way models the check exactly for covering queue lock or rcu read lock. Fix lockdep warning of "block/blk-cgroup.h:254 suspicious rcu_dereference_check() usage!" from blkg_lookup(). Tested-by: Changhui Zhong <czhong@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Fixes: 83462a6c971c ("blkcg: Drop unnecessary RCU read [un]locks from blkg_conf_prep/finish()") Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20231219012833.2129540-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-04blk-cgroup: don't use removal safe list iteratorsDaniel Vacek
Commit f1c006f1c685 moved deletion of the list blkg->q_node from blkg_destroy() to blkg_free_workfn(). Switch to using the list iterators, as we don't need removal protection anymore. Signed-off-by: Daniel Vacek <neelx@redhat.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20240104180031.148148-1-neelx@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-04block: floor the discard granularity to the physical block sizeChristoph Hellwig
Discarding less than a physical block doesn't make sense. This fixes the existing behavior for zram before the recent changes to default the discard granularity to the logical block size, and is also a generally useful sanity check. Fixes: 3753039def5d ("zram: use the default discard granularity") Reported-by: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240103081622.508754-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-01-03nvmet-trace: avoid dereferencing pointer too earlyDaniel Wagner
The first command issued from the host to the target is the fabrics connect command. At this point, neither the target queue nor the controller have been allocated. But we already try to trace this command in nvmet_req_init. Reported by KASAN. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvmet-fc: remove unnecessary bracketDaniel Wagner
There is no need for the bracket around the identifier. Remove it. Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvme: simplify the max_discard_segments calculationChristoph Hellwig
Just stash away the DMRL value in the nvme_ctrl struture, and leave all interpretation to nvme_config_discard, where we know DSM is supported by the time we're configuring the number of segments. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvme: fix max_discard_sectors calculationChristoph Hellwig
ctrl->max_discard_sectors stores a value that is potentially based of the DMRSL field in Identify Controller, which is in units of LBAs and thus dependent on the Format of a namespace. Fix this by moving the calculation of max_discard_sectors entirely into nvme_config_discard and replacing the ctrl->max_discard_sectors value with a local variable so that the calculation is always namespace-specific. Fixes: 1a86924e4f46 ("nvme: fix interpretation of DMRSL") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvme: also skip discard granularity updates in nvme_config_discardChristoph Hellwig
Don't just skip the discard sectors and segments but also the granularity if a value was already set before. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvme: update the explanation for not updating the limits in nvme_config_discardChristoph Hellwig
Expeand the comment a bit to explain what is going on. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvmet-tcp: fix a missing endianess conversion in nvmet_tcp_try_peek_pduChristoph Hellwig
No, a __le32 cast doesn't magically byteswap on big-endian systems.. Fixes: 70525e5d82f6 ("nvmet-tcp: peek icreq before starting TLS") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvme-common: mark nvme_tls_psk_prio staticChristoph Hellwig
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvme: remove unused definitionMax Gurtovoy
There is no users for NVMF_AUTH_HASH_LEN macro. Reviewed-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-03nvme: tcp: remove unnecessary goto statementGuixin Liu
There is no requirement to call nvme_tcp_free_queue() for queue deallocation if the pskid is null or the queue allocation fails, as the NVME_TCP_Q_ALLOCATED flag would not be set in such scenarios. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-02nvmet-tcp: remove boilerplate codeMaurizio Lombardi
Simplify the nvmet_tcp_handle_h2c_data_pdu() function by removing boilerplate code. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-02nvmet-tcp: fix a crash in nvmet_req_complete()Maurizio Lombardi
in nvmet_tcp_handle_h2c_data_pdu(), if the host sends a data_offset different from rbytes_done, the driver ends up calling nvmet_req_complete() passing a status error. The problem is that at this point cmd->req is not yet initialized, the kernel will crash after dereferencing a NULL pointer. Fix the bug by replacing the call to nvmet_req_complete() with nvmet_tcp_fatal_error(). Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Reviewed-by: Keith Busch <kbsuch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-01-02nvmet-tcp: Fix a kernel panic when host sends an invalid H2C PDU lengthMaurizio Lombardi
If the host sends an H2CData command with an invalid DATAL, the kernel may crash in nvmet_tcp_build_pdu_iovec(). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 lr : nvmet_tcp_io_work+0x6ac/0x718 [nvmet_tcp] Call trace: process_one_work+0x174/0x3c8 worker_thread+0x2d0/0x3e8 kthread+0x104/0x110 Fix the bug by raising a fatal error if DATAL isn't coherent with the packet size. Also, the PDU length should never exceed the MAXH2CDATA parameter which has been communicated to the host in nvmet_tcp_handle_icreq(). Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-12-29mtd_blkdevs: use the default discard granularityChristoph Hellwig
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20231228075545.362768-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-29bcache: use the default discard granularityChristoph Hellwig
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-12-29zram: use the default discard granularityChristoph Hellwig
The discard granularity now defaults to a single sector, so don't set that value explicitly. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20231228075545.362768-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>