Age | Commit message (Collapse) | Author |
|
When we loose a device for whatever reason while (re)scanning zones, we
trip over a NULL pointer in blk_revalidate_zone_cb, like in the following
log:
sd 0:0:0:0: [sda] 3418095616 4096-byte logical blocks: (14.0 TB/12.7 TiB)
sd 0:0:0:0: [sda] 52156 zones of 65536 logical blocks
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 37 00 00 08
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] REPORT ZONES start lba 1065287680 failed
sd 0:0:0:0: [sda] REPORT ZONES: Result: hostbyte=0x00 driverbyte=0x08
sd 0:0:0:0: [sda] Sense Key : 0xb [current]
sd 0:0:0:0: [sda] ASC=0x0 ASCQ=0x6
sda: failed to revalidate zones
sd 0:0:0:0: [sda] 0 4096-byte logical blocks: (0 B/0 B)
sda: detected capacity change from 14000519643136 to 0
==================================================================
BUG: KASAN: null-ptr-deref in blk_revalidate_zone_cb+0x1b7/0x550
Write of size 8 at addr 0000000000000010 by task kworker/u4:1/58
CPU: 1 PID: 58 Comm: kworker/u4:1 Not tainted 5.8.0-rc1 #692
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
Workqueue: events_unbound async_run_entry_fn
Call Trace:
dump_stack+0x7d/0xb0
? blk_revalidate_zone_cb+0x1b7/0x550
kasan_report.cold+0x5/0x37
? blk_revalidate_zone_cb+0x1b7/0x550
check_memory_region+0x145/0x1a0
blk_revalidate_zone_cb+0x1b7/0x550
sd_zbc_parse_report+0x1f1/0x370
? blk_req_zone_write_trylock+0x200/0x200
? sectors_to_logical+0x60/0x60
? blk_req_zone_write_trylock+0x200/0x200
? blk_req_zone_write_trylock+0x200/0x200
sd_zbc_report_zones+0x3c4/0x5e0
? sd_dif_config_host+0x500/0x500
blk_revalidate_disk_zones+0x231/0x44d
? _raw_write_lock_irqsave+0xb0/0xb0
? blk_queue_free_zone_bitmaps+0xd0/0xd0
sd_zbc_read_zones+0x8cf/0x11a0
sd_revalidate_disk+0x305c/0x64e0
? __device_add_disk+0x776/0xf20
? read_capacity_16.part.0+0x1080/0x1080
? blk_alloc_devt+0x250/0x250
? create_object.isra.0+0x595/0xa20
? kasan_unpoison_shadow+0x33/0x40
sd_probe+0x8dc/0xcd2
really_probe+0x20e/0xaf0
__driver_attach_async_helper+0x249/0x2d0
async_run_entry_fn+0xbe/0x560
process_one_work+0x764/0x1290
? _raw_read_unlock_irqrestore+0x30/0x30
worker_thread+0x598/0x12f0
? __kthread_parkme+0xc6/0x1b0
? schedule+0xed/0x2c0
? process_one_work+0x1290/0x1290
kthread+0x36b/0x440
? kthread_create_worker_on_cpu+0xa0/0xa0
ret_from_fork+0x22/0x30
==================================================================
When the device is already gone we end up with the following scenario:
The device's capacity is 0 and thus the number of zones will be 0 as well. When
allocating the bitmap for the conventional zones, we then trip over a NULL
pointer.
So if we encounter a zoned block device with a 0 capacity, don't dare to
revalidate the zones sizes.
Fixes: 6c6b35491422 ("block: set the zone size in blk_revalidate_disk_zones atomically")
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This function is just a tiny wrapper around blk_stack_limits. Open code
it int the two callers.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This function is just a tiny wrapper around blk_stack_limit and has
two callers. Simplify the stack a bit by open coding it in the two
callers.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Lift the code from device mapper into blk_stack_limits to inherity
the stacking limitations. This ensures we do the right thing for
all stacked zoned block devices.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
* for-5.9/drivers: (38 commits)
block: add max_active_zones to blk-sysfs
block: add max_open_zones to blk-sysfs
s390/dasd: Use struct_size() helper
s390/dasd: fix inability to use DASD with DIAG driver
md-cluster: fix wild pointer of unlock_all_bitmaps()
md/raid5-cache: clear MD_SB_CHANGE_PENDING before flushing stripes
md: fix deadlock causing by sysfs_notify
md: improve io stats accounting
md: raid0/linear: fix dereference before null check on pointer mddev
rsxx: switch from 'pci_free_consistent()' to 'dma_free_coherent()'
nvme: remove ns->disk checks
nvme-pci: use standard block status symbolic names
nvme-pci: use the consistent return type of nvme_pci_iod_alloc_size()
nvme-pci: add a blank line after declarations
nvme-pci: fix some comments issues
nvme-pci: remove redundant segment validation
nvme: document quirked Intel models
nvme: expose reconnect_delay and ctrl_loss_tmo via sysfs
nvme: support for zoned namespaces
nvme: support for multiple Command Sets Supported and Effects log pages
...
|
|
* for-5.9/block: (124 commits)
blk-cgroup: show global disk stats in root cgroup io.stat
blk-cgroup: make iostat functions visible to stat printing
block: improve discard bio alignment in __blkdev_issue_discard()
block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
block: defer flush request no matter whether we have elevator
block: make blk_timeout_init() static
block: remove retry loop in ioc_release_fn()
block: remove unnecessary ioc nested locking
block: integrate bd_start_claiming into __blkdev_get
block: use bd_prepare_to_claim directly in the loop driver
block: refactor bd_start_claiming
block: simplify the restart case in __blkdev_get
Revert "blk-rq-qos: remove redundant finish_wait to rq_qos_wait."
block: always remove partitions from blk_drop_partitions()
block: relax jiffies rounding for timeouts
blk-mq: remove redundant validation in __blk_mq_end_request()
blk-mq: Remove unnecessary local variable
writeback: remove bdi->congested_fn
writeback: remove struct bdi_writeback_congested
writeback: remove {set,clear}_wb_congested
...
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into master
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
- Update hashmap.h from libbpf and kvm.h from x86's kernel UAPI.
- Set opt->set in libsubcmd's OPT_CALLBACK_SET(). This fixes
'perf record --switch-output-event event-name' usage"
* tag 'perf-tools-fixes-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
tools arch kvm: Sync kvm headers with the kernel sources
perf tools: Sync hashmap.h with libbpf's
libsubcmd: Fix OPT_CALLBACK_SET()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull x86 fixes from Thomas Gleixner:
"A pile of fixes for x86:
- Fix the I/O bitmap invalidation on XEN PV, which was overlooked in
the recent ioperm/iopl rework. This caused the TSS and XEN's I/O
bitmap to get out of sync.
- Use the proper vectors for HYPERV.
- Make disabling of stack protector for the entry code work with GCC
builds which enable stack protector by default. Removing the option
is not sufficient, it needs an explicit -fno-stack-protector to
shut it off.
- Mark check_user_regs() noinstr as it is called from noinstr code.
The missing annotation causes it to be placed in the text section
which makes it instrumentable.
- Add the missing interrupt disable in exc_alignment_check()
- Fixup a XEN_PV build dependency in the 32bit entry code
- A few fixes to make the Clang integrated assembler happy
- Move EFI stub build to the right place for out of tree builds
- Make prepare_exit_to_usermode() static. It's not longer called from
ASM code"
* tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Don't add the EFI stub to targets
x86/entry: Actually disable stack protector
x86/ioperm: Fix io bitmap invalidation on Xen PV
x86: math-emu: Fix up 'cmp' insn for clang ias
x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV
x86/entry: Add compatibility with IAS
x86/entry/common: Make prepare_exit_to_usermode() static
x86/entry: Mark check_user_regs() noinstr
x86/traps: Disable interrupts in exc_aligment_check()
x86/entry/32: Fix XEN_PV build dependency
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull timer fixes from Thomas Gleixner:
"Two fixes for the timer wheel:
- A timer which is already expired at enqueue time can set the
base->next_expiry value backwards. As a consequence base->clk can
be set back as well. This can lead to timers expiring early. Add a
sanity check to prevent this.
- When a timer is queued with an expiry time beyond the wheel
capacity then it should be queued in the bucket of the last wheel
level which is expiring last.
The code adjusted the expiry time to the maximum wheel capacity,
which is only correct when the wheel clock is 0. Aside of that the
check whether the delta is larger than wheel capacity does not
check the delta, it checks the expiry value itself. As a result
timers can expire at random.
Fix this by checking the right variable and adjust expiry time so
it becomes base->clock plus capacity which places it into the
outmost bucket in the last wheel level"
* tag 'timers-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timer: Fix wheel index calculation on last level
timer: Prevent base->clk from moving backward
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull scheduler fixes from Thomas Gleixner:
"A set of scheduler fixes:
- Plug a load average accounting race which was introduced with a
recent optimization casing load average to show bogus numbers.
- Fix the rseq CPU id initialization for new tasks. sched_fork() does
not update the rseq CPU id so the id is the stale id of the parent
task, which can cause user space data corruption.
- Handle a 0 return value of task_h_load() correctly in the load
balancer, which does not decrease imbalance and therefore pulls
until the maximum number of loops is reached, which might be all
tasks just created by a fork bomb"
* tag 'sched-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: handle case of task_h_load() returning 0
sched: Fix unreliable rseq cpu_id for new tasks
sched: Fix loadavg accounting race
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull irq fixes from Thomas Gleixner:
"Two fixes for the interrupt subsystem:
- Make the handling of the firmware node consistent and do not free
the node after the domain has been created successfully. The core
code stores a pointer to it which can lead to a use after free or
double free.
This used to "work" because the pointer was not stored when the
initial code was written, but at some point later it was required
to store it. Of course nobody noticed that the existing users break
that way.
- Handle affinity setting on inactive interrupts correctly when
hierarchical irq domains are enabled.
When interrupts are inactive with the modern hierarchical irqdomain
design, the interrupt chips are not necessarily in a state where
affinity changes can be handled. The legacy irq chip design allowed
this because interrupts are immediately fully initialized at
allocation time. X86 has a hacky workaround for this, but other
implementations do not.
This cased malfunction on GIC-V3. Instead of playing whack a mole
to find all affected drivers, change the core code to store the
requested affinity setting and then establish it when the interrupt
is allocated, which makes the X86 hack go away"
* tag 'irq-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/affinity: Handle affinity setting on inactive interrupts correctly
irqdomain/treewide: Keep firmware node unconditionally allocated
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb into master
Pull USB fixes from Greg KH:
"Here are a few small USB fixes, and one thunderbolt fix, for 5.8-rc6.
Nothing huge in here, just the normal collection of gadget, dwc2/3,
serial, and other minor USB driver fixes and id additions. Full
details are in the shortlog.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: iuu_phoenix: fix memory corruption
USB: c67x00: fix use after free in c67x00_giveback_urb
usb: gadget: function: fix missing spinlock in f_uac1_legacy
usb: gadget: udc: atmel: fix uninitialized read in debug printk
usb: gadget: udc: atmel: remove outdated comment in usba_ep_disable()
usb: dwc2: Fix shutdown callback in platform
usb: cdns3: trace: fix some endian issues
usb: cdns3: ep0: fix some endian issues
usb: gadget: udc: gr_udc: fix memleak on error handling path in gr_ep_init()
usb: gadget: fix langid kernel-doc warning in usbstring.c
usb: dwc3: pci: add support for the Intel Jasper Lake
usb: dwc3: pci: add support for the Intel Tiger Lake PCH -H variant
usb: chipidea: core: add wakeup support for extcon
USB: serial: option: add Quectel EG95 LTE modem
thunderbolt: Fix path indices used in USB3 tunnel discovery
USB: serial: ch341: add new Product ID for CH340
USB: serial: option: add GosunCn GM500 series
USB: serial: cypress_m8: enable Simply Automated UPB PIM
|
|
git://git.infradead.org/users/hch/dma-mapping into master
Pull dma-mapping fixes from Christoph Hellwig:
"Ensure we always have fully addressable memory in the dma coherent
pool (Nicolas Saenz Julienne)"
* tag 'dma-mapping-5.8-6' of git://git.infradead.org/users/hch/dma-mapping:
dma-pool: do not allocate pool memory from CMA
dma-pool: make sure atomic pool suits device
dma-pool: introduce dma_guess_pool()
dma-pool: get rid of dma_in_atomic_pool()
dma-direct: provide function to check physical memory area validity
|
|
vmlinux-objs-y is added to targets, which currently means that the EFI
stub gets added to the targets as well. It shouldn't be added since it
is built elsewhere.
This confuses Makefile.build which interprets the EFI stub as a target
$(obj)/$(objtree)/drivers/firmware/efi/libstub/lib.a
and will create drivers/firmware/efi/libstub/ underneath
arch/x86/boot/compressed, to hold this supposed target, if building
out-of-tree. [0]
Fix this by pulling the stub out of vmlinux-objs-y into efi-obj-y.
[0] See scripts/Makefile.build near the end:
# Create directories for object files if they do not exist
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200715032631.1562882-1-nivedita@alum.mit.edu
|
|
Some builds of GCC enable stack protector by default. Simply removing
the arguments is not sufficient to disable stack protector, as the stack
protector for those GCC builds must be explicitly disabled. Remove the
argument removals and add -fno-stack-protector. Additionally include
missed x32 argument updates, and adjust whitespace for readability.
Fixes: 20355e5f73a7 ("x86/entry: Exclude low level entry code from sanitizing")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/202006261333.585319CA6B@keescook
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi into master
Pull SCSI fix from James Bottomley:
"One small driver fix. Although the one liner makes it sound like a
cosmetic change, it's a regression fix for the megaraid_sas driver"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: megaraid_sas: Remove undefined ENABLE_IRQ_POLL macro
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging into master
Pull hwmon fixes from Guenter Roeck:
- Using SCT on some Tohsiba drives causes firmware hangs. Disable its
use in the drivetemp driver.
- Handle potential buffer overflows in scmi and aspeed-pwm-tacho
driver.
- Energy reporting does not work well on all AMD CPUs. Restrict
amd_energy to known working models.
- Enable reading the CPU temperature on NCT6798D using undocumented
registers.
- Fix read errors seen if PEC is enabled in adm1275 driver.
- Fix setting the pwm1_enable in emc2103 driver.
* tag 'hwmon-for-v5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (drivetemp) Avoid SCT usage on Toshiba DT01ACA family drives
hwmon: (scmi) Fix potential buffer overflow in scmi_hwmon_probe()
hwmon: (nct6775) Accept PECI Calibration as temperature source for NCT6798D
hwmon: (adm1275) Make sure we are reading enough data for different chips
hwmon: (emc2103) fix unable to change fan pwm1_enable attribute
hwmon: (amd_energy) match for supported models
hwmon: (aspeed-pwm-tacho) Avoid possible buffer overflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into master
Pull RISC-V fixes from Palmer Dabbelt:
"Two fixes:
- 16KiB kernel stacks on rv64, which fixes a lot of crashes.
- Rolling an mmiowb() into the scheduler, which when combined with
Will's fix to the mmiowb()-on-spinlock should fix the PREEMPT
issues we've been seeing"
* tag 'riscv-for-linus-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Upgrade smp_mb__after_spinlock() to iorw,iorw
riscv: use 16KB kernel stack on 64-bit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into master
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.8:
- A fix to the VAS code we merged this cycle, to report the proper
error code to userspace for address translation failures. And a
selftest update to match.
- Another fix for our pkey handling of PROT_EXEC mappings.
- A fix for a crash when booting a "secure VM" under an ultravisor
with certain numbers of CPUs.
Thanks to: Aneesh Kumar K.V, Haren Myneni, Laurent Dufour, Sandipan
Das, Satheesh Rajendran, Thiago Jung Bauermann"
* tag 'powerpc-5.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Use proper error code to check fault address
powerpc/vas: Report proper error code for address translation failure
powerpc/pseries/svm: Fix incorrect check for shared_lppaca_size
powerpc/book3s64/pkeys: Fix pkey_access_permitted() for execute disable pkey
|
|
It has been observed that Toshiba DT01ACA family drives have
WRITE FPDMA QUEUED command timeouts and sometimes just freeze until
power-cycled under heavy write loads when their temperature is getting
polled in SCT mode. The SMART mode seems to be fine, though.
Let's make sure we don't use SCT mode for these drives then.
While only the 3 TB model was actually caught exhibiting the problem let's
play safe here to avoid data corruption and extend the ban to the whole
family.
Fixes: 5b46903d8bf3 ("hwmon: Driver for disk and solid state drives with temperature sensors")
Cc: stable@vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Link: https://lore.kernel.org/r/0cb2e7022b66c6d21d3f189a12a97878d0e7511b.1595075458.git.mail@maciej.szmigiero.name
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
tss_invalidate_io_bitmap() wasn't wired up properly through the pvop
machinery, so the TSS and Xen's io bitmap would get out of sync
whenever disabling a valid io bitmap.
Add a new pvop for tss_invalidate_io_bitmap() to fix it.
This is XSA-329.
Fixes: 22fe5b0439dd ("x86/ioperm: Move TSS bitmap update to exit to user work")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org
|
|
In order to improve consistency and usability in cgroup stat accounting,
we would like to support the root cgroup's io.stat.
Since the root cgroup has processes doing io even if the system has no
explicitly created cgroups, we need to be careful to avoid overhead in
that case. For that reason, the rstat algorithms don't handle the root
cgroup, so just turning the file on wouldn't give correct statistics.
To get around this, we simulate flushing the iostat struct by filling it
out directly from global disk stats. The result is a root cgroup io.stat
file consistent with both /proc/diskstats and io.stat.
Note that in order to collect the disk stats, we needed to iterate over
devices. To facilitate that, we had to change the linkage of a disk_type
to external so that it can be used from blk-cgroup.c to iterate over
disks.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Boris Burkov <boris@bur.io>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Previously, the code which printed io.stat only needed access to the
generic rstat flushing code, but since we plan to write some more
specific code for preparing root cgroup stats, we need to manipulate
iostat structs directly. Since declaring static functions ahead does not
seem like common practice in this file, simply move the iostat functions
up. We only plan to use blkg_iostat_set, but it seems better to keep them
all together.
Signed-off-by: Boris Burkov <boris@bur.io>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
into master
Pull NFS client fixes from Anna Schumaker:
"A few more NFS client bugfixes for Linux 5.8:
NFS:
- Fix interrupted slots by using the SEQUENCE operation
SUNRPC:
- revert d03727b248d0 to fix unkillable IOs
xprtrdma:
- Fix double-free in rpcrdma_ep_create()
- Fix recursion into rpcrdma_xprt_disconnect()
- Fix return code from rpcrdma_xprt_connect()
- Fix handling of connect errors
- Fix incorrect header size calculations"
* tag 'nfs-for-5.8-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO compeletion")
xprtrdma: fix incorrect header size calculations
NFS: Fix interrupted slots by sending a solo SEQUENCE operation
xprtrdma: Fix handling of connect errors
xprtrdma: Fix return code from rpcrdma_xprt_connect()
xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()
xprtrdma: Fix double-free in rpcrdma_ep_create()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc into master
Pull ARM SoC fixes from Arnd Bergmann:
"This time there are a number of actual code fixes, plus a small set of
device tree issues getting addressed:
Renesas:
- one defconfig cleanup to allow a later Kconfig change
Intel socfpga:
- enable QSPI devices on some machines
- fix DTC validation warnings
TI OMAP:
- Two DEBUG_ATOMIC_SLEEP fixes for ti-sysc interconnect target
module driver
- A regression fix for ti-sysc no-idle handling that caused issues
compared to earlier platform data based booting
- A fix for memory leak for omap_hwmod_allocate_module
- Fix d_can driver probe for am437x
NXP i.MX:
- A couple of fixes on i.MX platform device registration code to
stop the use of invalid IRQ 0.
- Fix a regression seen on ls1021a platform, caused by commit
52102a3ba6a61 ("soc: imx: move cpu code to drivers/soc/imx").
- Fix a misconfiguration of audio SSI on imx6qdl-gw551x board.
Amlogic Meson:
- misc DT fixes
- SoC ID fixes to detect all chips correctly"
* tag 'arm-fixes-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: dts: spcfpga: Align GIC, NAND and UART nodenames with dtschema
ARM: dts: socfpga: Align L2 cache-controller nodename with dtschema
arm64: dts: stratix10: increase QSPI reg address in nand dts file
arm64: dts: stratix10: add status to qspi dts node
arm64: dts: agilex: add status to qspi dts node
ARM: dts: Fix dcan driver probe failed on am437x platform
ARM: OMAP2+: Fix possible memory leak in omap_hwmod_allocate_module
arm64: defconfig: Enable CONFIG_PCIE_RCAR_HOST
soc: imx: check ls1021a
ARM: imx: Remove imx_add_imx_dma() unused irq_err argument
ARM: imx: Provide correct number of resources when registering gpio devices
ARM: dts: imx6qdl-gw551x: fix audio SSI
bus: ti-sysc: Do not disable on suspend for no-idle
bus: ti-sysc: Fix sleeping function called from invalid context for RTC quirk
bus: ti-sysc: Fix wakeirq sleeping function called from invalid context
ARM: dts: meson: Align L2 cache-controller nodename with dtschema
arm64: dts: meson-gxl-s805x: reduce initial Mali450 core frequency
arm64: dts: meson: add missing gxl rng clock
soc: amlogic: meson-gx-socinfo: Fix S905X3 and S905D3 ID's
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into master
Pull arm64 fixes from Will Deacon:
"A batch of arm64 fixes.
Although the diffstat is a bit larger than we'd usually have at this
stage, a decent amount of it is the addition of comments describing
our syscall tracing behaviour, and also a sweep across all the modular
arm64 PMU drivers to make them rebust against unloading and unbinding.
There are a couple of minor things kicking around at the moment (CPU
errata and module PLTs for very large modules), but I'm not expecting
any significant changes now for us in 5.8.
- Fix kernel text addresses for relocatable images booting using EFI
and with KASLR disabled so that they match the vmlinux ELF binary.
- Fix unloading and unbinding of PMU driver modules.
- Fix generic mmiowb() when writeX() is called from preemptible
context (reported by the riscv folks).
- Fix ptrace hardware single-step interactions with signal handlers,
system calls and reverse debugging.
- Fix reporting of 64-bit x0 register for 32-bit tasks via
'perf_regs'.
- Add comments describing syscall entry/exit tracing ABI"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
drivers/perf: Prevent forced unbinding of PMU drivers
asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible()
arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP
arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter()
arm64: syscall: Expand the comment about ptrace and syscall(-1)
arm64: ptrace: Add a comment describing our syscall entry/exit trap ABI
arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return
arm64: ptrace: Override SPSR.SS when single-stepping is enabled
arm64: ptrace: Consistently use pseudo-singlestep exceptions
drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling
efi/libstub/arm64: Retain 2MB kernel Image alignment if !KASLR
|
|
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.
X86 has a hacky workaround for that but all other irq chips have not which
causes problems e.g. on GIC V3 ITS.
Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:
- Store it in the irq descriptors affinity mask
- Update the effective affinity to reflect that so user space has
a consistent view
- Don't call into the irq chip driver
This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.
Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.
For hierarchial irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.
Remove the X86 workaround as it is not longer required.
Fixes: 02edee152d6e ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
|
|
When an expiration delta falls into the last level of the wheel, that delta
has be compared against the maximum possible delay and reduced to fit in if
necessary.
However instead of comparing the delta against the maximum, the code
compares the actual expiry against the maximum. Then instead of fixing the
delta to fit in, it sets the maximum delta as the expiry value.
This can result in various undesired outcomes, the worst possible one
being a timer expiring 15 days ahead to fire immediately.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200717140551.29076-2-frederic@kernel.org
|
|
compeletion")
Reverting commit d03727b248d0 "NFSv4 fix CLOSE not waiting for
direct IO compeletion". This patch made it so that fput() by calling
inode_dio_done() in nfs_file_release() would wait uninterruptably
for any outstanding directIO to the file (but that wait on IO should
be killable).
The problem the patch was also trying to address was REMOVE returning
ERR_ACCESS because the file is still opened, is supposed to be resolved
by server returning ERR_FILE_OPEN and not ERR_ACCESS.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
master
Pull io_uring fix from Jens Axboe:
"Fix for a case where, with automatic buffer selection, we can leak the
buffer descriptor for recvmsg"
* tag 'io_uring-5.8-2020-07-17' of git://git.kernel.dk/linux-block:
io_uring: fix recvmsg memory leak with buffer selection
|
|
Pull block fix from Jens Axboe:
"Single NVMe multipath capacity fix"
* tag 'block-5.8-2020-07-17' of git://git.kernel.dk/linux-block:
nvme: explicitly update mpath disk capacity on revalidation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse into master
Pull fuse fixes from Miklos Szeredi:
- two regressions in this cycle caused by the conversion of writepage
list to an rb_tree
- two regressions in v5.4 cause by the conversion to the new mount API
- saner behavior of fsconfig(2) for the reconfigure case
- an ancient issue with FS_IOC_{GET,SET}FLAGS ioctls
* tag 'fuse-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: Fix parameter for FS_IOC_{GET,SET}FLAGS
fuse: don't ignore errors from fuse_writepages_fill()
fuse: clean up condition for writepage sending
fuse: reject options on reconfigure via fsconfig(2)
fuse: ignore 'data' argument of mount(..., MS_REMOUNT)
fuse: use ->reconfigure() instead of ->remount_fs()
fuse: fix warning in tree_insert() and clean up writepage insertion
fuse: move rb_erase() before tree_insert()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into master
Pull overlayfs fixes from Miklos Szeredi:
- fix a regression introduced in v4.20 in handling a regenerated
squashfs lower layer
- two regression fixes for this cycle, one of which is Oops inducing
- miscellaneous issues
* tag 'ovl-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix lookup of indexed hardlinks with metacopy
ovl: fix unneeded call to ovl_change_flags()
ovl: fix mount option checks for nfs_export with no upperdir
ovl: force read-only sb on failure to create index dir
ovl: fix regression with re-formatted lower squashfs
ovl: fix oops in ovl_indexdir_cleanup() with nfs_export=on
ovl: relax WARN_ON() when decoding lower directory file handle
ovl: remove not used argument in ovl_check_origin
ovl: change ovl_copy_up_flags static
ovl: inode reference leak in ovl_is_inuse true case.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi into master
Pull spi fixes from Mark Brown:
"A couple of small driver specific fixes for fairly minor issues"
* tag 'spi-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-sun6i: sun6i_spi_transfer_one(): fix setting of clock rate
spi: mediatek: use correct SPI_CFG2_REG MACRO
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator into master
Pull regulator fixes from Mark Brown:
"The more substantial fix here is the rename of the da903x driver which
fixes a collision with the parent MFD driver name which caused issues
when things were built as modules.
There's also a fix for a mislableled regulator on the pmi8994 which is
quite important for systems with that device"
* tag 'regulator-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
MAINTAINERS: remove obsolete entry after file renaming
regulator: rename da903x to da903x-regulator
regulator: qcom_smd: Fix pmi8994 label
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into master
Pull regmap fixes from Mark Brown:
"A couple of substantial fixes here, one from Doug which fixes the
debugfs code for MMIO regmaps (fortunately not the common case) and
one from Marc fixing lookups of multiple regmaps for the same device
(a very unusual case).
There's also a fix for Kconfig to ensure we enable SoundWire properly"
* tag 'regmap-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: debugfs: Don't sleep while atomic for fast_io regmaps
regmap: add missing dependency on SoundWire
regmap: dev_get_regmap_match(): fix string comparison
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid into master
Pull HID fixes from Jiri Kosina:
- linked list race condition fix in hid-steam driver from Rodrigo Rivas
Costa
- assorted deviceID-specific quirks and other small cosmetic cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: logitech-hidpp: avoid repeated "multiplier = " log messages
HID: logitech: Use HIDPP_RECEIVER_INDEX instead of 0xff
HID: quirks: Ignore Simply Automated UPB PIM
HID: apple: Disable Fn-key key-re-mapping on clone keyboards
MAINTAINERS: update uhid and hid-wiimote entry
HID: steam: fixes race in handling device list.
HID: magicmouse: do not set up autorepeat
HID: alps: support devices with report id 2
HID: quirks: Always poll Obins Anne Pro 2 keyboard
HID: i2c-hid: add Mediacom FlexBook edge13 to descriptor override
|
|
While digging through the recent mmiowb preemption issue it came up that
we aren't actually preventing IO from crossing a scheduling boundary.
While it's a bit ugly to overload smp_mb__after_spinlock() with this
behavior, it's what PowerPC is doing so there's some precedent.
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into arm/fixes
arm/arm64: dts: socfpga: fixes for v5.8
- Add status = "okay" in QSPI
- Increase QSPI size in reg property
- Fix dtschema for SoCFPGA platforms
* tag 'socfpga_fixes_for_v5.8_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
arm64: dts: spcfpga: Align GIC, NAND and UART nodenames with dtschema
ARM: dts: socfpga: Align L2 cache-controller nodename with dtschema
arm64: dts: stratix10: increase QSPI reg address in nand dts file
arm64: dts: stratix10: add status to qspi dts node
arm64: dts: agilex: add status to qspi dts node
Link: https://lore.kernel.org/r/20200717155758.18233-1-dinguyen@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes
Renesas fixes for v5.8
- Replace CONFIG_PCIE_RCAR by CONFIG_PCIE_RCAR_HOST in the defconfig,
to unblock a planned Kconfig change.
* tag 'renesas-fixes-for-v5.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel:
arm64: defconfig: Enable CONFIG_PCIE_RCAR_HOST
Link: https://lore.kernel.org/r/20200717100523.15418-1-geert+renesas@glider.be
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound into master
Pull sound fixes from Takashi Iwai:
"No surprise here, just a few device-specific small fixes: two fixes
for USB LINE6 and one for USB-audio drivers wrt syzkaller fuzzer
issues, while the rest are all HD-audio Realtek quirks"
* tag 'sound-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - fixup for yet another Intel reference board
ALSA: hda/realtek - Enable Speaker for ASUS UX563
ALSA: hda/realtek - Enable Speaker for ASUS UX533 and UX534
ALSA: hda/realtek: Enable headset mic of Acer TravelMate B311R-31 with ALC256
ALSA: hda/realtek: enable headset mic of ASUS ROG Zephyrus G14(G401) series with ALC289
ALSA: hda/realtek - change to suitable link model for ASUS platform
ALSA: usb-audio: Fix race against the error recovery URB submission
ALSA: line6: Sync the pending work cancel at disconnection
ALSA: line6: Perform sanity check for each URB creation
|
|
This patch improves discard bio split for address and size alignment in
__blkdev_issue_discard(). The aligned discard bio may help underlying
device controller to perform better discard and internal garbage
collection, and avoid unnecessary internal fragment.
Current discard bio split algorithm in __blkdev_issue_discard() may have
non-discarded fregment on device even the discard bio LBA and size are
both aligned to device's discard granularity size.
Here is the example steps on how to reproduce the above problem.
- On a VMWare ESXi 6.5 update3 installation, create a 51GB virtual disk
with thin mode and give it to a Linux virtual machine.
- Inside the Linux virtual machine, if the 50GB virtual disk shows up as
/dev/sdb, fill data into the first 50GB by,
# dd if=/dev/zero of=/dev/sdb bs=4096 count=13107200
- Discard the 50GB range from offset 0 on /dev/sdb,
# blkdiscard /dev/sdb -o 0 -l 53687091200
- Observe the underlying mapping status of the device
# sg_get_lba_status /dev/sdb -m 1048 --lba=0
descriptor LBA: 0x0000000000000000 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000000000800 blocks: 16773120 deallocated
descriptor LBA: 0x0000000000fff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000001000000 blocks: 8386560 deallocated
descriptor LBA: 0x00000000017ff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000001800000 blocks: 8386560 deallocated
descriptor LBA: 0x0000000001fff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000002000000 blocks: 8386560 deallocated
descriptor LBA: 0x00000000027ff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000002800000 blocks: 8386560 deallocated
descriptor LBA: 0x0000000002fff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000003000000 blocks: 8386560 deallocated
descriptor LBA: 0x00000000037ff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000003800000 blocks: 8386560 deallocated
descriptor LBA: 0x0000000003fff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000004000000 blocks: 8386560 deallocated
descriptor LBA: 0x00000000047ff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000004800000 blocks: 8386560 deallocated
descriptor LBA: 0x0000000004fff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000005000000 blocks: 8386560 deallocated
descriptor LBA: 0x00000000057ff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000005800000 blocks: 8386560 deallocated
descriptor LBA: 0x0000000005fff800 blocks: 2048 mapped (or unknown)
descriptor LBA: 0x0000000006000000 blocks: 6291456 deallocated
descriptor LBA: 0x0000000006600000 blocks: 0 deallocated
Although the discard bio starts at LBA 0 and has 50<<30 bytes size which
are perfect aligned to the discard granularity, from the above list
these are many 1MB (2048 sectors) internal fragments exist unexpectedly.
The problem is in __blkdev_issue_discard(), an improper algorithm causes
an improper bio size which is not aligned.
25 int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
26 sector_t nr_sects, gfp_t gfp_mask, int flags,
27 struct bio **biop)
28 {
29 struct request_queue *q = bdev_get_queue(bdev);
[snipped]
56
57 while (nr_sects) {
58 sector_t req_sects = min_t(sector_t, nr_sects,
59 bio_allowed_max_sectors(q));
60
61 WARN_ON_ONCE((req_sects << 9) > UINT_MAX);
62
63 bio = blk_next_bio(bio, 0, gfp_mask);
64 bio->bi_iter.bi_sector = sector;
65 bio_set_dev(bio, bdev);
66 bio_set_op_attrs(bio, op, 0);
67
68 bio->bi_iter.bi_size = req_sects << 9;
69 sector += req_sects;
70 nr_sects -= req_sects;
[snipped]
79 }
80
81 *biop = bio;
82 return 0;
83 }
84 EXPORT_SYMBOL(__blkdev_issue_discard);
At line 58-59, to discard a 50GB range, req_sects is set as return value
of bio_allowed_max_sectors(q), which is 8388607 sectors. In the above
case, the discard granularity is 2048 sectors, although the start LBA
and discard length are aligned to discard granularity, req_sects never
has chance to be aligned to discard granularity. This is why there are
some still-mapped 2048 sectors fragment in every 4 or 8 GB range.
If req_sects at line 58 is set to a value aligned to discard_granularity
and close to UNIT_MAX, then all consequent split bios inside device
driver are (almostly) aligned to discard_granularity of the device
queue. The 2048 sectors still-mapped fragment will disappear.
This patch introduces bio_aligned_discard_max_sectors() to return the
the value which is aligned to q->limits.discard_granularity and closest
to UINT_MAX. Then this patch replaces bio_allowed_max_sectors() with
this new routine to decide a more proper split bio length.
But we still need to handle the situation when discard start LBA is not
aligned to q->limits.discard_granularity, otherwise even the length is
aligned, current code may still leave 2048 fragment around every 4GB
range. Therefore, to calculate req_sects, firstly the start LBA of
discard range is checked (including partition offset), if it is not
aligned to discard granularity, the first split location should make
sure following bio has bi_sector aligned to discard granularity. Then
there won't be still-mapped fragment in the middle of the discard range.
The above is how this patch improves discard bio alignment in
__blkdev_issue_discard(). Now with this patch, after discard with same
command line mentiond previously, sg_get_lba_status returns,
descriptor LBA: 0x0000000000000000 blocks: 106954752 deallocated
descriptor LBA: 0x0000000006600000 blocks: 0 deallocated
We an see there is no 2048 sectors segment anymore, everything is clean.
Reported-and-tested-by: Acshai Manoj <acshai.manoj@microfocus.com>
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Xiao Ni <xni@redhat.com>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Enzo Matsumiya <ematsumiya@suse.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL are defined as
even numbers 6 and 8, such zone reset bios are treated as READ bios by
bio_data_dir(), which is obviously misleading.
The macro bio_data_dir() is defined in include/linux/bio.h as,
55 #define bio_data_dir(bio) \
56 (op_is_write(bio_op(bio)) ? WRITE : READ)
And op_is_write() is defined in include/linux/blk_types.h as,
397 static inline bool op_is_write(unsigned int op)
398 {
399 return (op & 1);
400 }
The convention of op_is_write() is when there is data transfer then the
op code should be odd number, and treat as a write op. bio_data_dir()
treats all bio direction as READ if op_is_write() reports false, and
WRITE if op_is_write() reports true.
Because REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL are even numbers,
although they don't transfer data but reporting them as READ bio by
bio_data_dir() is misleading and might be wrong. Because these two
commands will reset the writer pointers of the resetting zones, and all
content after the reset write pointer will be invalid and unaccessible,
obviously they are not READ bios in any means.
This patch changes REQ_OP_ZONE_RESET from 6 to 15, and changes
REQ_OP_ZONE_RESET_ALL from 8 to 17. Now bios with these two op code
can be treated as WRITE by bio_data_dir(). Although they don't transfer
data, now we keep them consistent with REQ_OP_DISCARD and
REQ_OP_WRITE_ZEROES with the ituition that they change on-media content
and should be WRITE request.
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Jens Axboe <axboe@fb.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit 7520872c0cf4 ("block: don't defer flushes on blk-mq + scheduling")
tried to fix deadlock for cycled wait between flush requests and data
request into flush_data_in_flight. The former holded all driver tags
and wait for data request completion, but the latter can not complete
for waiting free driver tags.
After commit 923218f6166a ("blk-mq: don't allocate driver tag upfront
for flush rq"), flush requests will not get driver tag before queuing
into flush queue.
* With elevator, flush request just get sched_tags before inserting
flush queue. It will not get driver tag until issue them to driver.
data request on list fq->flush_data_in_flight will complete in
the end.
* Without elevator, each flush request will get a driver tag when
allocate request. Then data request on fq->flush_data_in_flight
don't worry about lacking driver tag.
In both of these cases, cycled wait cannot be true. So we may allow
to defer flush request.
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The sparse tool complains as follows:
block/blk-timeout.c:93:12: warning:
symbol 'blk_timeout_init' was not declared. Should it be static?
Function blk_timeout_init() is not used outside of blk-timeout.c, so
mark it static.
Fixes: 9054650fac24 ("block: relax jiffies rounding for timeouts")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
To pick up the changes from:
83d31e5271ac ("KVM: nVMX: fixes for preemption timer migration")
That don't entail changes in tooling.
This silences these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up the changes in:
b2f9f1535bb9 ("libbpf: Fix libbpf hashmap on (I)LP32 architectures")
Silencing this warning:
Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h
I'll eventually update the warning to remove the "Kernel ABI" part
and instead state libbpf when noticing that the original is at
"tools/lib/something".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Jakub Bogusz <qboosh@pld-linux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Any option macro with _SET suffix should set opt->set variable which is
not happening for OPT_CALLBACK_SET(). This is causing issues with perf
record --switch-output-event. Fix that.
Before:
# ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
--switch-output-event syscalls:sys_enter_mmap
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.297 MB perf.data (657 samples) ]
After:
$ ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
--switch-output-event syscalls:sys_enter_mmap
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144542 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144608 ]
[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144660 ]
^C[ perf record: dump data: Woken up 1 times ]
[ perf record: Dump perf.data.2020061918144784 ]
[ perf record: Woken up 0 times to write data ]
[ perf record: Dump perf.data.2020061918144803 ]
[ perf record: Captured and wrote 0.419 MB perf.data.<timestamp> ]
Fixes: 636eb4d001b1 ("libsubcmd: Introduce OPT_CALLBACK_SET()")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200619133412.50705-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Forcefully unbinding PMU drivers during perf sampling will lead to
a kernel panic, because the perf upper-layer framework call a NULL
pointer in this situation.
To solve this issue, "suppress_bind_attrs" should be set to true, so
that bind/unbind can be disabled via sysfs and prevent unbinding PMU
drivers during perf sampling.
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1594975763-32966-1-git-send-email-liuqi115@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
|