linux-2.6-microblaze.git
3 years agoblk-iocost: Factor out the active iocgs' state check into a separate function
Baolin Wang [Thu, 26 Nov 2020 08:16:14 +0000 (16:16 +0800)]
blk-iocost: Factor out the active iocgs' state check into a separate function

Factor out the iocgs' state check into a separate function to
simplify the ioc_timer_fn().

No functional change.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblk-iocost: Move the usage ratio calculation to the correct place
Baolin Wang [Thu, 26 Nov 2020 08:16:13 +0000 (16:16 +0800)]
blk-iocost: Move the usage ratio calculation to the correct place

We only use the hweight based usage ratio to calculate the new
hweight_inuse of the iocg to decide if this iocg can donate some
surplus vtime.

Thus move the usage ratio calculation to the correct place to
avoid unnecessary calculation for some vtime shortage iocgs.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblk-iocost: Remove unnecessary advance declaration
Baolin Wang [Thu, 26 Nov 2020 08:16:12 +0000 (16:16 +0800)]
blk-iocost: Remove unnecessary advance declaration

Remove unnecessary advance declaration of struct ioc_gq.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblk-iocost: Fix some typos in comments
Baolin Wang [Thu, 26 Nov 2020 08:16:11 +0000 (16:16 +0800)]
blk-iocost: Fix some typos in comments

Fix some typos in comments.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblktrace: fix up a kerneldoc comment
Christoph Hellwig [Mon, 7 Dec 2020 13:40:48 +0000 (14:40 +0100)]
blktrace: fix up a kerneldoc comment

Fixes: a54895fa057c ("block: remove the request_queue to argument request based tracepoints")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the request_queue to argument request based tracepoints
Christoph Hellwig [Thu, 3 Dec 2020 16:21:39 +0000 (17:21 +0100)]
block: remove the request_queue to argument request based tracepoints

The request_queue can trivially be derived from the request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the request_queue argument to the block_bio_remap tracepoint
Christoph Hellwig [Thu, 3 Dec 2020 16:21:38 +0000 (17:21 +0100)]
block: remove the request_queue argument to the block_bio_remap tracepoint

The request_queue can trivially be derived from the bio.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the request_queue argument to the block_split tracepoint
Christoph Hellwig [Thu, 3 Dec 2020 16:21:37 +0000 (17:21 +0100)]
block: remove the request_queue argument to the block_split tracepoint

The request_queue can trivially be derived from the bio.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: simplify and extend the block_bio_merge tracepoint class
Christoph Hellwig [Thu, 3 Dec 2020 16:21:36 +0000 (17:21 +0100)]
block: simplify and extend the block_bio_merge tracepoint class

The block_bio_merge tracepoint class can be reused for most bio-based
tracepoints.  For that it just needs to lose the superfluous q and rq
parameters.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the unused block_sleeprq tracepoint
Christoph Hellwig [Thu, 3 Dec 2020 16:21:35 +0000 (17:21 +0100)]
block: remove the unused block_sleeprq tracepoint

The block_sleeprq tracepoint was only used by the legacy request code.
Remove it now that the legacy request code is gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblk-throttle: don't check whether or not lower limit is valid if CONFIG_BLK_DEV_THROT...
Yu Kuai [Thu, 26 Nov 2020 03:18:34 +0000 (11:18 +0800)]
blk-throttle: don't check whether or not lower limit is valid if CONFIG_BLK_DEV_THROTTLING_LOW is off

blk_throtl_update_limit_valid() will search for descendants to see if
'LIMIT_LOW' of bps/iops and READ/WRITE is nonzero. However, they're always
zero if CONFIG_BLK_DEV_THROTTLING_LOW is not set, furthermore, a lot of
time will be wasted to iterate descendants.

Thus do nothing in blk_throtl_update_limit_valid() in such situation.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: fix inflight statistics of part0
Jeffle Xu [Wed, 2 Dec 2020 11:11:45 +0000 (19:11 +0800)]
block: fix inflight statistics of part0

The inflight of partition 0 doesn't include inflight IOs to all
sub-partitions, since currently mq calculates inflight of specific
partition by simply camparing the value of the partition pointer.

Thus the following case is possible:

$ cat /sys/block/vda/inflight
       0        0
$ cat /sys/block/vda/vda1/inflight
       0      128

While single queue device (on a previous version, e.g. v3.10) has no
this issue:

$cat /sys/block/sda/sda3/inflight
       0       33
$cat /sys/block/sda/inflight
       0       33

Partition 0 should be specially handled since it represents the whole
disk. This issue is introduced since commit bf0ddaba65dd ("blk-mq: fix
sysfs inflight counter").

Besides, this patch can also fix the inflight statistics of part 0 in
/proc/diskstats. Before this patch, the inflight statistics of part 0
doesn't include that of sub partitions. (I have marked the 'inflight'
field with asterisk.)

$cat /proc/diskstats
 259       0 nvme0n1 45974469 0 367814768 6445794 1 0 1 0 *0* 111062 6445794 0 0 0 0 0 0
 259       2 nvme0n1p1 45974058 0 367797952 6445727 0 0 0 0 *33* 111001 6445727 0 0 0 0 0 0

This is introduced since commit f299b7c7a9de ("blk-mq: provide internal
in-flight variant").

Fixes: bf0ddaba65dd ("blk-mq: fix sysfs inflight counter")
Fixes: f299b7c7a9de ("blk-mq: provide internal in-flight variant")
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[axboe: adapt for 5.11 partition change]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agobio: optimise bvec iteration
Pavel Begunkov [Tue, 24 Nov 2020 17:58:13 +0000 (17:58 +0000)]
bio: optimise bvec iteration

__bio_for_each_bvec(), __bio_for_each_segment() and bio_copy_data_iter()
fall under conditions of bvec_iter_advance_single(), which is a faster
and slimmer version of bvec_iter_advance(). Add
bio_advance_iter_single() and convert them.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: optimise for_each_bvec() advance
Pavel Begunkov [Tue, 24 Nov 2020 17:58:12 +0000 (17:58 +0000)]
block: optimise for_each_bvec() advance

Because of how for_each_bvec() works it never advances across multiple
entries at a time, so bvec_iter_advance() is an overkill. Add
specialised bvec_iter_advance_single() that is faster. It also handles
zero-len bvecs, so can kill bvec_iter_skip_zero_bvec().

   text    data     bss     dec     hex filename
before:
  23977     805       0   24782    60ce lib/iov_iter.o
before, bvec_iter_advance() w/o WARN_ONCE()
  22886     600       0   23486    5bbe ./lib/iov_iter.o
after:
  21862     600       0   22462    57be lib/iov_iter.o

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: stop using bdget_disk for partition 0
Christoph Hellwig [Thu, 26 Nov 2020 09:41:07 +0000 (10:41 +0100)]
block: stop using bdget_disk for partition 0

We can just dereference the point in struct gendisk instead.  Also
remove the now unused export.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: merge struct block_device and struct hd_struct
Christoph Hellwig [Fri, 27 Nov 2020 15:43:51 +0000 (16:43 +0100)]
block: merge struct block_device and struct hd_struct

Instead of having two structures that represent each block device with
different life time rules, merge them into a single one.  This also
greatly simplifies the reference counting rules, as we can use the inode
reference count as the main reference count for the new struct
block_device, with the device model reference front ending it for device
model interaction.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agof2fs: remove a few bd_part checks
Christoph Hellwig [Tue, 24 Nov 2020 08:21:57 +0000 (09:21 +0100)]
f2fs: remove a few bd_part checks

bd_part is never NULL for a block device in use by a file system, so
remove the checks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: switch disk_part_iter_* to use a struct block_device
Christoph Hellwig [Tue, 24 Nov 2020 08:52:59 +0000 (09:52 +0100)]
block: switch disk_part_iter_* to use a struct block_device

Switch the partition iter infrastructure to iterate over block_device
references instead of hd_struct ones mostly used to get at the
block_device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: pass a block_device to invalidate_partition
Christoph Hellwig [Tue, 24 Nov 2020 08:20:37 +0000 (09:20 +0100)]
block: pass a block_device to invalidate_partition

Pass the block_device actually needed instead of looking it up using
bdget_disk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: pass a block_device to blk_alloc_devt
Christoph Hellwig [Tue, 24 Nov 2020 08:17:46 +0000 (09:17 +0100)]
block: pass a block_device to blk_alloc_devt

Pass the block_device actually needed instead of the hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the partno field from struct hd_struct
Christoph Hellwig [Tue, 24 Nov 2020 08:37:14 +0000 (09:37 +0100)]
block: remove the partno field from struct hd_struct

Just use the bd_partno field in struct block_device everywhere.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: switch partition lookup to use struct block_device
Christoph Hellwig [Tue, 24 Nov 2020 08:36:54 +0000 (09:36 +0100)]
block: switch partition lookup to use struct block_device

Use struct block_device to lookup partitions on a disk.  This removes
all usage of struct hd_struct from the I/O path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: Chao Yu <yuchao0@huawei.com> [f2fs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: allocate struct hd_struct as part of struct bdev_inode
Christoph Hellwig [Thu, 26 Nov 2020 17:47:17 +0000 (18:47 +0100)]
block: allocate struct hd_struct as part of struct bdev_inode

Allocate hd_struct together with struct block_device to pre-load
the lifetime rule changes in preparation of merging the two structures.

Note that part0 was previously embedded into struct gendisk, but is
a separate allocation now, and already points to the block_device instead
of the hd_struct.  The lifetime of struct gendisk is still controlled by
the struct device embedded in the part0 hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: move the policy field to struct block_device
Christoph Hellwig [Mon, 23 Nov 2020 15:36:02 +0000 (16:36 +0100)]
block: move the policy field to struct block_device

Move the policy field to struct block_device and rename it to the
more descriptive bd_read_only.  Also turn the field into a bool as it
is used as such.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: move make_it_fail to struct block_device
Christoph Hellwig [Mon, 23 Nov 2020 15:28:47 +0000 (16:28 +0100)]
block: move make_it_fail to struct block_device

Move the make_it_fail flag to struct block_device an turn it into a bool
in preparation of killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: move holder_dir to struct block_device
Christoph Hellwig [Mon, 23 Nov 2020 18:00:13 +0000 (19:00 +0100)]
block: move holder_dir to struct block_device

Move the holder_dir field to struct block_device in preparation for
kill struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: move the partition_meta_info to struct block_device
Christoph Hellwig [Tue, 24 Nov 2020 11:01:45 +0000 (12:01 +0100)]
block: move the partition_meta_info to struct block_device

Move the partition_meta_info to struct block_device in preparation for
killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: move the start_sect field to struct block_device
Christoph Hellwig [Tue, 24 Nov 2020 08:34:24 +0000 (09:34 +0100)]
block: move the start_sect field to struct block_device

Move the start_sect field to struct block_device in preparation
of killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: move disk stat accounting to struct block_device
Christoph Hellwig [Tue, 24 Nov 2020 08:34:00 +0000 (09:34 +0100)]
block: move disk stat accounting to struct block_device

Move the dkstats and stamp field to struct block_device in preparation
of killing struct hd_struct.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the nr_sects field in struct hd_struct
Christoph Hellwig [Thu, 26 Nov 2020 17:43:37 +0000 (18:43 +0100)]
block: remove the nr_sects field in struct hd_struct

Now that the hd_struct always has a block device attached to it, there is
no need for having two size field that just get out of sync.

Additionally the field in hd_struct did not use proper serialization,
possibly allowing for torn writes.  By only using the block_device field
this problem also gets fixed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: Chao Yu <yuchao0@huawei.com> [f2fs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: initialize struct block_device in bdev_alloc
Christoph Hellwig [Mon, 23 Nov 2020 14:41:40 +0000 (15:41 +0100)]
block: initialize struct block_device in bdev_alloc

Don't play tricks with slab constructors as bdev structures tends to not
get reused very much, and this makes the code a lot less error prone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: simplify part_to_disk
Christoph Hellwig [Mon, 23 Nov 2020 15:38:15 +0000 (16:38 +0100)]
block: simplify part_to_disk

Now that struct hd_struct has a block_device pointer use that to
find the disk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: simplify the block device claiming interface
Christoph Hellwig [Wed, 25 Nov 2020 20:20:08 +0000 (21:20 +0100)]
block: simplify the block device claiming interface

Stop passing the whole device as a separate argument given that it
can be trivially deducted and cleanup the !holder debug check.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove ->bd_contains
Christoph Hellwig [Mon, 23 Nov 2020 12:29:55 +0000 (13:29 +0100)]
block: remove ->bd_contains

Now that each hd_struct has a reference to the corresponding
block_device, there is no need for the bd_contains pointer.  Add
a bdev_whole() helper to look up the whole device block_device
struture instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: simplify bdev/disk lookup in blkdev_get
Christoph Hellwig [Thu, 26 Nov 2020 08:23:26 +0000 (09:23 +0100)]
block: simplify bdev/disk lookup in blkdev_get

To simplify block device lookup and a few other upcoming areas, make sure
that we always have a struct block_device available for each disk and
each partition, and only find existing block devices in bdget.  The only
downside of this is that each device and partition uses a little more
memory.  The upside will be that a lot of code can be simplified.

With that all we need to look up the block device is to lookup the inode
and do a few sanity checks on the gendisk, instead of the separate lookup
for the gendisk.  For blk-cgroup which wants to access a gendisk without
opening it, a new blkdev_{get,put}_no_open low-level interface is added
to replace the previous get_gendisk use.

Note that the change to look up block device directly instead of the two
step lookup using struct gendisk causes a subtile change in behavior:
accessing a non-existing partition on an existing block device can now
cause a call to request_module.  That call is harmless, and in practice
no recent system will access these nodes as they aren't created by udev
and static /dev/ setups are unusual.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove i_bdev
Christoph Hellwig [Mon, 23 Nov 2020 12:38:40 +0000 (13:38 +0100)]
block: remove i_bdev

Switch the block device lookup interfaces to directly work with a dev_t
so that struct block_device references are only acquired by the
blkdev_get variants (and the blk-cgroup special case).  This means that
we now don't need an extra reference in the inode and can generally
simplify handling of struct block_device to keep the lookups contained
in the core block layer code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Coly Li <colyli@suse.de> [bcache]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: opencode devcgroup_inode_permission
Christoph Hellwig [Mon, 23 Nov 2020 12:44:44 +0000 (13:44 +0100)]
block: opencode devcgroup_inode_permission

Just call devcgroup_check_permission to avoid various superflous checks
and a double conversion of the access flags.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: move bdput() to the callers of __blkdev_get
Christoph Hellwig [Thu, 26 Nov 2020 08:22:18 +0000 (09:22 +0100)]
block: move bdput() to the callers of __blkdev_get

This will allow for a more symmetric calling convention going forward.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: refactor blkdev_get
Christoph Hellwig [Mon, 23 Nov 2020 09:19:22 +0000 (10:19 +0100)]
block: refactor blkdev_get

Move more code that is only run on the outer open but not the open of
the underlying whole device when opening a partition into blkdev_get,
which leads to a much easier to follow structure.

This allows to simplify the disk and module refcounting so that one
reference is held for each open, similar to what we do with normal
file operations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: refactor __blkdev_put
Christoph Hellwig [Tue, 17 Nov 2020 11:01:26 +0000 (12:01 +0100)]
block: refactor __blkdev_put

Reorder the code to have one big section for the last close, and to use
bdev_is_partition.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoinit: cleanup match_dev_by_uuid and match_dev_by_label
Christoph Hellwig [Sat, 14 Nov 2020 18:41:48 +0000 (19:41 +0100)]
init: cleanup match_dev_by_uuid and match_dev_by_label

Avoid a totally pointless goto label, and use the same style of
comparism for both helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoinit: refactor devt_from_partuuid
Christoph Hellwig [Sat, 14 Nov 2020 18:35:28 +0000 (19:35 +0100)]
init: refactor devt_from_partuuid

The code in devt_from_partuuid is very convoluted.  Refactor a bit by
sanitizing the goto and variable name usage.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoinit: refactor name_to_dev_t
Christoph Hellwig [Sat, 14 Nov 2020 18:19:57 +0000 (19:19 +0100)]
init: refactor name_to_dev_t

Split each case into a self-contained helper, and move the block
dependent code entirely under the pre-existing #ifdef CONFIG_BLOCK.
This allows to remove the blk_lookup_devt stub in genhd.h.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: switch bdgrab to use igrab
Christoph Hellwig [Fri, 27 Nov 2020 15:45:27 +0000 (16:45 +0100)]
block: switch bdgrab to use igrab

All of the current callers already have a reference, but to prepare for
additional users ensure bdgrab returns NULL if the block device is being
freed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: change the hash used for looking up block devices
Christoph Hellwig [Mon, 9 Nov 2020 16:52:02 +0000 (17:52 +0100)]
block: change the hash used for looking up block devices

Adding the minor to the major creates tons of pointless conflicts. Just
use the dev_t itself, which is 32-bits and thus is guaranteed to fit
into ino_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: use put_device in put_disk
Christoph Hellwig [Tue, 10 Nov 2020 06:25:37 +0000 (07:25 +0100)]
block: use put_device in put_disk

Use put_device to put the device instead of poking into the internals
and using kobject_put.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: use disk_part_iter_exit in disk_part_iter_next
Christoph Hellwig [Tue, 10 Nov 2020 05:48:53 +0000 (06:48 +0100)]
block: use disk_part_iter_exit in disk_part_iter_next

Call disk_part_iter_exit in disk_part_iter_next instead of duplicating
the functionality.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: add a bdev_kobj helper
Christoph Hellwig [Tue, 17 Nov 2020 07:18:55 +0000 (08:18 +0100)]
block: add a bdev_kobj helper

Add a little helper to find the kobject for a struct block_device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove a superflous check in blkpg_do_ioctl
Christoph Hellwig [Tue, 24 Nov 2020 08:43:52 +0000 (09:43 +0100)]
block: remove a superflous check in blkpg_do_ioctl

sector_t is now always a u64, so this check is not needed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove a duplicate __disk_get_part prototype
Christoph Hellwig [Fri, 27 Nov 2020 15:21:52 +0000 (16:21 +0100)]
block: remove a duplicate __disk_get_part prototype

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agodm: remove the block_device reference in struct mapped_device
Christoph Hellwig [Fri, 27 Nov 2020 15:21:42 +0000 (16:21 +0100)]
dm: remove the block_device reference in struct mapped_device

Get rid of the long-lasting struct block_device reference in
struct mapped_device.  The only remaining user is the freeze code,
where we can trivially look up the block device at freeze time
and release the reference at thaw time.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agodm: simplify flush_bio initialization in __send_empty_flush
Christoph Hellwig [Sat, 7 Nov 2020 08:30:05 +0000 (09:30 +0100)]
dm: simplify flush_bio initialization in __send_empty_flush

We don't really need the struct block_device to initialize a bio.  So
switch from using bio_set_dev to manually setting up bi_disk (bi_partno
will always be zero and has been cleared by bio_init already).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoloop: do not call set_blocksize
Christoph Hellwig [Mon, 16 Nov 2020 14:38:30 +0000 (15:38 +0100)]
loop: do not call set_blocksize

set_blocksize is used by file systems to use their preferred buffer cache
block size.  Block drivers should not set it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agozram: do not call set_blocksize
Christoph Hellwig [Mon, 16 Nov 2020 14:43:46 +0000 (15:43 +0100)]
zram: do not call set_blocksize

set_blocksize is used by file systems to use their preferred buffer cache
block size.  Block drivers should not set it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agomtip32xx: remove the call to fsync_bdev on removal
Christoph Hellwig [Tue, 30 Jun 2020 14:33:54 +0000 (16:33 +0200)]
mtip32xx: remove the call to fsync_bdev on removal

del_gendisk already calls fsync_bdev for every partition, no need
to do this twice.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agofs: simplify freeze_bdev/thaw_bdev
Christoph Hellwig [Tue, 24 Nov 2020 10:54:06 +0000 (11:54 +0100)]
fs: simplify freeze_bdev/thaw_bdev

Store the frozen superblock in struct block_device to avoid the awkward
interface that can return a sb only used a cookie, an ERR_PTR or NULL.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Chao Yu <yuchao0@huawei.com> [f2fs]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agofs: remove get_super_thawed and get_super_exclusive_thawed
Christoph Hellwig [Mon, 16 Nov 2020 14:21:18 +0000 (15:21 +0100)]
fs: remove get_super_thawed and get_super_exclusive_thawed

Just open code the wait in the only caller of both functions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agofilemap: consistently use ->f_mapping over ->i_mapping
Christoph Hellwig [Mon, 16 Nov 2020 13:33:37 +0000 (14:33 +0100)]
filemap: consistently use ->f_mapping over ->i_mapping

Use file->f_mapping in all remaining places that have a struct file
available to properly handle the case where inode->i_mapping !=
file_inode(file)->i_mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: wbt: Remove unnecessary invoking of wbt_update_limits in wbt_init
Lei Chen [Mon, 30 Nov 2020 02:20:52 +0000 (10:20 +0800)]
block: wbt: Remove unnecessary invoking of wbt_update_limits in wbt_init

It's unnecessary to call wbt_update_limits explicitly within wbt_init,
because it will be called in the following function wbt_queue_depth_changed.

Signed-off-by: Lei Chen <lennychen@tencent.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove unused BIO_SPLIT_ENTRIES
Jeffle Xu [Wed, 25 Nov 2020 06:58:17 +0000 (14:58 +0800)]
block: remove unused BIO_SPLIT_ENTRIES

Since commit 4b1faf931650 ("block: Kill bio_pair_split()"), there's
no user of BIO_SPLIT_ENTRIES anymore.

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: unexport revalidate_disk_size
Christoph Hellwig [Mon, 16 Nov 2020 14:57:14 +0000 (15:57 +0100)]
block: unexport revalidate_disk_size

revalidate_disk_size is now only called from set_capacity_and_notify,
so drop the export.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agovirtio-blk: remove a spurious call to revalidate_disk_size
Christoph Hellwig [Mon, 16 Nov 2020 14:57:13 +0000 (15:57 +0100)]
virtio-blk: remove a spurious call to revalidate_disk_size

revalidate_disk_size just updates the block device size from the disk
size.  Thus calling it from virtblk_update_cache_mode doesn't actually
do anything.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agomd: remove a spurious call to revalidate_disk_size in update_size
Christoph Hellwig [Mon, 16 Nov 2020 14:57:12 +0000 (15:57 +0100)]
md: remove a spurious call to revalidate_disk_size in update_size

None of the ->resize methods updates the disk size, so calling
revalidate_disk_size here won't do anything.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agomd: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:11 +0000 (15:57 +0100)]
md: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agodm-raid: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:10 +0000 (15:57 +0100)]
dm-raid: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agozram: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:09 +0000 (15:57 +0100)]
zram: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agornbd: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:08 +0000 (15:57 +0100)]
rnbd: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agorbd: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:07 +0000 (15:57 +0100)]
rbd: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agodrbd: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:06 +0000 (15:57 +0100)]
drbd: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agonvme: use set_capacity_and_notify in nvme_set_queue_dying
Christoph Hellwig [Mon, 16 Nov 2020 14:57:05 +0000 (15:57 +0100)]
nvme: use set_capacity_and_notify in nvme_set_queue_dying

Use the block layer helper to update both the disk and block device
sizes.  Contrary to the name no notification is sent in this case,
as a size 0 is special cased.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agopktcdvd: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:04 +0000 (15:57 +0100)]
pktcdvd: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agodm: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:03 +0000 (15:57 +0100)]
dm: use set_capacity_and_notify

Use set_capacity_and_notify to set the size of both the disk and block
device.  This also gets the uevent notifications for the resize for free.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoaoe: don't call set_capacity from irq context
Christoph Hellwig [Mon, 16 Nov 2020 14:57:02 +0000 (15:57 +0100)]
aoe: don't call set_capacity from irq context

Updating the block device size from irq context can lead to torn
writes of the 64-bit value, and prevents us from using normal
process context locking primitives to serialize access to the 64-bit
nr_sectors value.  Defer the set_capacity to the already existing
workqueue handler, where it can be merged with the update of the
block device size by using set_capacity_and_notify.  As an extra
bonus this also adds proper uevent notifications for the resize.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agonbd: use set_capacity_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:57:01 +0000 (15:57 +0100)]
nbd: use set_capacity_and_notify

Use set_capacity_and_notify to update the disk and block device sizes and
send a RESIZE uevent to userspace.  Note that blktests relies on uevents
being sent also for updates that did not change the device size, so the
explicit kobject_uevent remains for that case.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agonbd: validate the block size in nbd_set_size
Christoph Hellwig [Mon, 16 Nov 2020 14:57:00 +0000 (15:57 +0100)]
nbd: validate the block size in nbd_set_size

Move the validation of the block from the callers into nbd_set_size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agonbd: refactor size updates
Christoph Hellwig [Mon, 16 Nov 2020 14:56:59 +0000 (15:56 +0100)]
nbd: refactor size updates

Merge nbd_size_set and nbd_size_update into a single function that also
updates the nbd_config fields.  This new function takes the device size
in bytes as the first argument, and the blocksize as the second argument,
simplifying the calculations required in most callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agonbd: move the task_recv check into nbd_size_update
Christoph Hellwig [Mon, 16 Nov 2020 14:56:58 +0000 (15:56 +0100)]
nbd: move the task_recv check into nbd_size_update

nbd_size_update is about to acquire a few more callers, so lift the check
into the function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agonbd: remove the call to set_blocksize
Christoph Hellwig [Mon, 16 Nov 2020 14:56:57 +0000 (15:56 +0100)]
nbd: remove the call to set_blocksize

Block driver have no business setting the file system concept of a
block size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the update_bdev parameter to set_capacity_revalidate_and_notify
Christoph Hellwig [Mon, 16 Nov 2020 14:56:56 +0000 (15:56 +0100)]
block: remove the update_bdev parameter to set_capacity_revalidate_and_notify

The update_bdev argument is always set to true, so remove it.  Also
rename the function to the slighly less verbose set_capacity_and_notify,
as propagating the disk size to the block device isn't really
revalidation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agosd: update the bdev size in sd_revalidate_disk
Christoph Hellwig [Mon, 16 Nov 2020 14:56:55 +0000 (15:56 +0100)]
sd: update the bdev size in sd_revalidate_disk

This avoids the extra call to revalidate_disk_size in sd_rescan and
is otherwise a no-op because the size did not change, or we are in
the probe path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agonvme: let set_capacity_revalidate_and_notify update the bdev size
Christoph Hellwig [Mon, 16 Nov 2020 14:56:54 +0000 (15:56 +0100)]
nvme: let set_capacity_revalidate_and_notify update the bdev size

There is no good reason to call revalidate_disk_size separately.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoloop: let set_capacity_revalidate_and_notify update the bdev size
Christoph Hellwig [Mon, 16 Nov 2020 14:56:53 +0000 (15:56 +0100)]
loop: let set_capacity_revalidate_and_notify update the bdev size

There is no good reason to call revalidate_disk_size separately.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: remove the call to __invalidate_device in check_disk_size_change
Christoph Hellwig [Mon, 16 Nov 2020 14:56:52 +0000 (15:56 +0100)]
block: remove the call to __invalidate_device in check_disk_size_change

__invalidate_device without the kill_dirty parameter just invalidates
various clean entries in caches, which doesn't really help us with
anything, but can cause all kinds of horrible lock orders due to how
it calls into the file system.  The only reason this hasn't been a
major issue is because so many people use partitions, for which no
invalidation was performed anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: fix the kerneldoc comment for __register_blkdev
Christoph Hellwig [Sat, 14 Nov 2020 17:08:21 +0000 (18:08 +0100)]
block: fix the kerneldoc comment for __register_blkdev

Switch the comment to talk about __register_blkdev instead of
register_blkdev and document the new probe parameter.

Fixes: 3da1a61e7046 ("block: add an optional probe callback to major_names")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: switch gendisk lookup to a simple xarray
Christoph Hellwig [Thu, 29 Oct 2020 14:58:41 +0000 (15:58 +0100)]
block: switch gendisk lookup to a simple xarray

Now that bdev_map is only used for finding gendisks, we can use
a simple xarray instead of the regions tracking structure for it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoz2ram: use separate gendisk for the different modes
Christoph Hellwig [Thu, 29 Oct 2020 14:58:40 +0000 (15:58 +0100)]
z2ram: use separate gendisk for the different modes

Use separate gendisks (which share a tag_set) for the different operating
modes instead of redirecting the gendisk lookup using a probe callback.
This avoids potential problems with aliased block_device instances and
will eventually allow for removing the blk_register_region framework.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoz2ram: reindent
Christoph Hellwig [Thu, 29 Oct 2020 14:58:39 +0000 (15:58 +0100)]
z2ram: reindent

reindent the driver using Lident as the code style was far away from
normal Linux code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoataflop: use a separate gendisk for each media format
Christoph Hellwig [Thu, 29 Oct 2020 14:58:38 +0000 (15:58 +0100)]
ataflop: use a separate gendisk for each media format

The Atari floppy driver usually autodetects the media when used with the
ormal /dev/fd? devices, which also are the only nodes created by udev.
But it also supports various aliases that force a given media format.
That is currently supported using the blk_register_region framework
which finds the floppy gendisk even for a 'mismatched' dev_t.  The
problem with this (besides the code complexity) is that it creates
multiple struct block_device instances for the whole device of a
single gendisk, which can lead to interesting issues in code not
aware of that fact.

To fix this just create a separate gendisk for each of the aliases
if they are accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoamiflop: use separate gendisks for Amiga vs MS-DOS mode
Christoph Hellwig [Thu, 29 Oct 2020 14:58:37 +0000 (15:58 +0100)]
amiflop: use separate gendisks for Amiga vs MS-DOS mode

Use separate gendisks (which share a tag_set) for the native Amgiga vs
the MS-DOS mode instead of redirecting the gendisk lookup using a probe
callback.  This avoids potential problems with aliased block_device
instances and will eventually allow for removing the blk_register_region
framework.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agofloppy: use a separate gendisk for each media format
Christoph Hellwig [Thu, 29 Oct 2020 14:58:36 +0000 (15:58 +0100)]
floppy: use a separate gendisk for each media format

The floppy driver usually autodetects the media when used with the
normal /dev/fd? devices, which also are the only nodes created by udev.
But it also supports various aliases that force a given media format.
That is currently supported using the blk_register_region framework
which finds the floppy gendisk even for a 'mismatched' dev_t.  The
problem with this (besides the code complexity) is that it creates
multiple struct block_device instances for the whole device of a
single gendisk, which can lead to interesting issues in code not
aware of that fact.

To fix this just create a separate gendisk for each of the aliases
if they are accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoide: switch to __register_blkdev for command set probing
Christoph Hellwig [Thu, 29 Oct 2020 14:58:35 +0000 (15:58 +0100)]
ide: switch to __register_blkdev for command set probing

ide is the last user of the blk_register_region framework except for the
tracking of allocated gendisk.  Switch to __register_blkdev, even if that
doesn't allow us to trivially find out which command set to probe for.
That means we now always request all modules when a user tries to access
an unclaimed ide device node, but except for a few potentially loaded
modules for a fringe use case of a deprecated and soon to be removed
driver that doesn't make a difference.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agomd: use __register_blkdev to allocate devices on demand
Christoph Hellwig [Thu, 29 Oct 2020 14:58:34 +0000 (15:58 +0100)]
md: use __register_blkdev to allocate devices on demand

Use the simpler mechanism attached to major_name to allocate a md device
when a currently unregistered minor is accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoloop: use __register_blkdev to allocate devices on demand
Christoph Hellwig [Thu, 29 Oct 2020 14:58:33 +0000 (15:58 +0100)]
loop: use __register_blkdev to allocate devices on demand

Use the simpler mechanism attached to major_name to allocate a brd device
when a currently unregistered minor is accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agobrd: use __register_blkdev to allocate devices on demand
Christoph Hellwig [Thu, 29 Oct 2020 14:58:32 +0000 (15:58 +0100)]
brd: use __register_blkdev to allocate devices on demand

Use the simpler mechanism attached to major_name to allocate a brd device
when a currently unregistered minor is accessed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agosd: use __register_blkdev to avoid a modprobe for an unregistered dev_t
Christoph Hellwig [Thu, 29 Oct 2020 14:58:31 +0000 (15:58 +0100)]
sd: use __register_blkdev to avoid a modprobe for an unregistered dev_t

Switch from using blk_register_region to the probe callback passed to
__register_blkdev to disable the request_module call for an unclaimed
dev_t in the SD majors.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoswim: don't call blk_register_region
Christoph Hellwig [Thu, 29 Oct 2020 14:58:30 +0000 (15:58 +0100)]
swim: don't call blk_register_region

The swim driver (unlike various other floppy drivers) doesn't have
magic device nodes for certain modes, and already registers a gendisk
for each of the floppies supported by a device.  Thus the region
registered is a no-op and can be removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoide: remove ide_{,un}register_region
Christoph Hellwig [Thu, 29 Oct 2020 14:58:29 +0000 (15:58 +0100)]
ide: remove ide_{,un}register_region

There is no need to ever register the fake gendisk used for ide-tape.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: add an optional probe callback to major_names
Christoph Hellwig [Thu, 29 Oct 2020 14:58:28 +0000 (15:58 +0100)]
block: add an optional probe callback to major_names

Add a callback to the major_names array that allows a driver to override
how to probe for dev_t that doesn't currently have a gendisk registered.
This will help separating the lookup of the gendisk by dev_t vs probe
action for a not currently registered dev_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: rework requesting modules for unclaimed devices
Christoph Hellwig [Thu, 29 Oct 2020 14:58:27 +0000 (15:58 +0100)]
block: rework requesting modules for unclaimed devices

Instead of reusing the ranges in bdev_map, add a new helper that is
called if no ranges was found.  This is a first step to unpeel and
eventually remove the complex ranges structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoblock: split block_class_lock
Christoph Hellwig [Thu, 29 Oct 2020 14:58:26 +0000 (15:58 +0100)]
block: split block_class_lock

Split the block_class_lock mutex into one each to protect bdev_map
and major_names.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>