Age | Commit message (Collapse) | Author |
|
Simple helper functions to quiesce the request queue. These are
currently only used for switching IO schedulers on-the-fly, but
we can use them to properly switch IO accounting on and off as well.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
By default, CFQ will anticipate more IO from a given io context if the
previously completed IO was sync. This used to be fine, since the only
sync IO was reads and O_DIRECT writes. But with more "normal" sync writes
being used now, we don't want to anticipate for those.
Add a bio/request flag that informs the IO scheduler that this is a sync
request that we should not idle for. Introduce WRITE_ODIRECT specifically
for O_DIRECT writes, and make sure that the other sync writes set this
flag.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
For the older SSD devices that don't do command queuing, we do want to
enable plugging to get better merging.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This makes sure that we never wait on async IO for sync requests, instead
of doing the split on writes vs reads.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
tracing, net: fix net tree and tracing tree merge interaction
tracing, powerpc: fix powerpc tree and tracing tree interaction
ring-buffer: do not remove reader page from list on ring buffer free
function-graph: allow unregistering twice
trace: make argument 'mem' of trace_seq_putmem() const
tracing: add missing 'extern' keywords to trace_output.h
tracing: provide trace_seq_reserve()
blktrace: print out BLK_TN_MESSAGE properly
blktrace: extract duplidate code
blktrace: fix memory leak when freeing struct blk_io_trace
blktrace: fix blk_probes_ref chaos
blktrace: make classic output more classic
blktrace: fix off-by-one bug
blktrace: fix the original blktrace
blktrace: fix a race when creating blk_tree_root in debugfs
blktrace: fix timestamp in binary output
tracing, Text Edit Lock: cleanup
tracing: filter fix for TRACE_EVENT_FORMAT events
ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
x86: kretprobe-booster interrupt emulation code fix
...
Fix up trivial conflicts in
arch/parisc/include/asm/ftrace.h
include/linux/memory.h
kernel/extable.c
kernel/module.c
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-cpumask: (36 commits)
cpumask: remove cpumask allocation from idle_balance, fix
numa, cpumask: move numa_node_id default implementation to topology.h, fix
cpumask: remove cpumask allocation from idle_balance
x86: cpumask: x86 mmio-mod.c use cpumask_var_t for downed_cpus
x86: cpumask: update 32-bit APM not to mug current->cpus_allowed
x86: microcode: cleanup
x86: cpumask: use work_on_cpu in arch/x86/kernel/microcode_core.c
cpumask: fix CONFIG_CPUMASK_OFFSTACK=y cpu hotunplug crash
numa, cpumask: move numa_node_id default implementation to topology.h
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
cpumask: remove x86 cpumask_t uses.
cpumask: use cpumask_var_t in uv_flush_tlb_others.
cpumask: remove cpumask_t assignment from vector_allocation_domain()
cpumask: make Xen use the new operators.
cpumask: clean up summit's send_IPI functions
cpumask: use new cpumask functions throughout x86
x86: unify cpu_callin_mask/cpu_callout_mask/cpu_initialized_mask/cpu_sibling_setup_mask
cpumask: convert struct cpuinfo_x86's llc_shared_map to cpumask_var_t
cpumask: convert node_to_cpumask_map[] to cpumask_var_t
x86: unify 32 and 64-bit node_to_cpumask_map
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
s390: remove arch specific smp_send_stop()
panic: clean up kernel/panic.c
panic, smp: provide smp_send_stop() wrapper on UP too
panic: decrease oops_in_progress only after having done the panic
generic-ipi: eliminate WARN_ON()s during oops/panic
generic-ipi: cleanups
generic-ipi: remove CSD_FLAG_WAIT
generic-ipi: remove kmalloc()
generic IPI: simplify barriers and locking
|
|
Conflicts:
include/linux/slub_def.h
lib/Kconfig.debug
mm/slob.c
mm/slub.c
|
|
Conflicts:
arch/x86/kernel/cpu/common.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'percpu-cpumask-x86-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (682 commits)
percpu: fix spurious alignment WARN in legacy SMP percpu allocator
percpu: generalize embedding first chunk setup helper
percpu: more flexibility for @dyn_size of pcpu_setup_first_chunk()
percpu: make x86 addr <-> pcpu ptr conversion macros generic
linker script: define __per_cpu_load on all SMP capable archs
x86: UV: remove uv_flush_tlb_others() WARN_ON
percpu: finer grained locking to break deadlock and allow atomic free
percpu: move fully free chunk reclamation into a work
percpu: move chunk area map extension out of area allocation
percpu: replace pcpu_realloc() with pcpu_mem_alloc() and pcpu_mem_free()
x86, percpu: setup reserved percpu area for x86_64
percpu, module: implement reserved allocation and use it for module percpu variables
percpu: add an indirection ptr for chunk page map access
x86: make embedding percpu allocator return excessive free space
percpu: use negative for auto for pcpu_setup_first_chunk() arguments
percpu: improve first chunk initial area map handling
percpu: cosmetic renames in pcpu_setup_first_chunk()
percpu: clean up percpu constants
x86: un-__init fill_pud/pmd/pte
x86: remove vestigial fix_ioremap prototypes
...
Manually merge conflicts in arch/ia64/kernel/irq_ia64.c
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (119 commits)
[SCSI] scsi_dh_rdac: Retry for NOT_READY check condition
[SCSI] mpt2sas: make global symbols unique
[SCSI] sd: Make revalidate less chatty
[SCSI] sd: Try READ CAPACITY 16 first for SBC-2 devices
[SCSI] sd: Refactor sd_read_capacity()
[SCSI] mpt2sas v00.100.11.15
[SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h
[SCSI] ch: Add scsi type modalias
[SCSI] 3w-9xxx: add power management support
[SCSI] bsg: add linux/types.h include to bsg.h
[SCSI] cxgb3i: fix function descriptions
[SCSI] libiscsi: fix possbile null ptr session command cleanup
[SCSI] iscsi class: remove host no argument from session creation callout
[SCSI] libiscsi: pass session failure a session struct
[SCSI] iscsi lib: remove qdepth param from iscsi host allocation
[SCSI] iscsi lib: have lib create work queue for transmitting IO
[SCSI] iscsi class: fix lock dep warning on logout
[SCSI] libiscsi: don't cap queue depth in iscsi modules
[SCSI] iscsi_tcp: replace scsi_debug/tcp_debug logging with iscsi conn logging
[SCSI] libiscsi_tcp: replace tcp_debug/scsi_debug logging with session/conn logging
...
|
|
Conflicts:
arch/parisc/kernel/irq.c
arch/x86/include/asm/fixmap_64.h
arch/x86/include/asm/setup.h
kernel/irq/handle.c
Semantic merge:
arch/x86/include/asm/fixmap.h
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
bsg submits REQ_TYPE_BLOCK_PC so the right check is max_hw_sectors.
But I've removed this check because right after, bsg proceeds with
calling blk_rq_map_user() which does all the right checks.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Put a WARN_ON in __blk_put_request if it is about to
leak bio(s). This is a serious bug that can happen in error
handling code paths.
For this to work I have fixed a couple of places in block/ where
request->bio != NULL ownership was not honored. And a small cleanup
at sg_io() while at it.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Currently inherited from sg.c bsg will submit asynchronous request
at the head-of-the-queue, (using "at_head" set in the call to
blk_execute_rq_nowait()). This is bad in situation where the queues
are full, requests will execute out of order, and can cause
starvation of the first submitted requests.
The sg_io_v4->flags member is used and a bit is allocated to denote the
Q_AT_TAIL. Zero is to queue at_head as before, to be compatible with old
code at the write/read path. SG_IO code path behavior was changed so to
be the same as write/read behavior. SG_IO was very rarely used and breaking
compatibility with it is OK at this stage.
sg_io_hdr at sg.h also has a flags member and uses 3 bits from the first
nibble and one bit from the last nibble. Even though none of these bits
are supported by bsg, The second nibble is allocated for use by bsg. Just
in case.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
CC: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
It calls blk_queue_make_request(), which sets the identical set of limits.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
cpu_core_map/cpu_sibling_map
Impact: cleanup
This is presumably what those definitions are for, and while all archs
define cpu_core_map/cpu_sibling map, that's changing (eg. x86 wants to
change it to a pointer).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
|
|
This allows it to compile and be used on the ps3 platform that wants
to use the #define values in scsi.h without actually having
CONFIG_SCSI set.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
|
|
into tracing/core
|
|
'x86/urgent', 'linus' and 'core/percpu' into x86/core
|
|
Commit 1e42807918d17e8c93bf14fbb74be84b141334c1 introduced a bug where we
don't get front/back segment sizes in the bio in blk_recount_segments().
Fix this by tracking the back bio as well as the front bio in
__blk_recalc_rq_segments(), this also cleans up the interface by getting
rid of the segment size pointer passing.
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
|
|
|
|
blk_recalc_rq_segments() requires a request structure passed in, which
we don't have from blk_recount_segments(). So the latter allocates one on
the stack, using > 400 bytes of stack for that. This can cause us to spill
over one page of stack from ext4 at least:
0) 4560 400 blk_recount_segments+0x43/0x62
1) 4160 32 bio_phys_segments+0x1c/0x24
2) 4128 32 blk_rq_bio_prep+0x2a/0xf9
3) 4096 32 init_request_from_bio+0xf9/0xfe
4) 4064 112 __make_request+0x33c/0x3f6
5) 3952 144 generic_make_request+0x2d1/0x321
6) 3808 64 submit_bio+0xb9/0xc3
7) 3744 48 submit_bh+0xea/0x10e
8) 3696 368 ext4_mb_init_cache+0x257/0xa6a [ext4]
9) 3328 288 ext4_mb_regular_allocator+0x421/0xcd9 [ext4]
10) 3040 160 ext4_mb_new_blocks+0x211/0x4b4 [ext4]
11) 2880 336 ext4_ext_get_blocks+0xb61/0xd45 [ext4]
12) 2544 96 ext4_get_blocks_wrap+0xf2/0x200 [ext4]
13) 2448 80 ext4_da_get_block_write+0x6e/0x16b [ext4]
14) 2368 352 mpage_da_map_blocks+0x7e/0x4b3 [ext4]
15) 2016 352 ext4_da_writepages+0x2ce/0x43c [ext4]
16) 1664 32 do_writepages+0x2d/0x3c
17) 1632 144 __writeback_single_inode+0x162/0x2cd
18) 1488 96 generic_sync_sb_inodes+0x1e3/0x32b
19) 1392 16 sync_sb_inodes+0xe/0x10
20) 1376 48 writeback_inodes+0x69/0xb3
21) 1328 208 balance_dirty_pages_ratelimited_nr+0x187/0x2f9
22) 1120 224 generic_file_buffered_write+0x1d4/0x2c4
23) 896 176 __generic_file_aio_write_nolock+0x35f/0x393
24) 720 80 generic_file_aio_write+0x6c/0xc8
25) 640 80 ext4_file_write+0xa9/0x137 [ext4]
26) 560 320 do_sync_write+0xf0/0x137
27) 240 48 vfs_write+0xb3/0x13c
28) 192 64 sys_write+0x4c/0x74
29) 128 128 system_call_fastpath+0x16/0x1b
Split the segment counting out into a __blk_recalc_rq_segments() helper
to avoid allocating an onstack request just for checking the physical
segment count.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Add documentation for register_blkdev() function and for the parameters.
Signed-off-by: Márton Németh <nm127@freemail.hu>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework
the code so that we can use CSD_FLAG_LOCK for both purposes.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu
Conflicts:
arch/x86/include/asm/pgtable.h
|
|
This prepares for a real __alloc_percpu, by adding an alignment argument.
Only one place uses __alloc_percpu directly, and that's for a string.
tj: af_inet also uses __alloc_percpu(), update it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
|
|
Conflicts:
block/blktrace.c
Semantic merge:
kernel/trace/blktrace.c
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
blk_abort_queue() iterates the timeout list and aborts each request on the
list, but if the driver error handling readds a request to the timeout list
during this processing, we could be looping forever. Fix this by splicing
current entries to a local list and run over that list instead.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Hi Tejun,
it looks like your commit:
block: don't depend on consecutive minor space
f331c0296f2a9fee0d396a70598b954062603015
broke a particular case for booting from partitioned md/raid devices.
That is the second time this has been broken recently. The previous
time was fixed by
block: do_mounts - accept root=<non-existant partition>
30f2f0eb4bd2c43d10a8b0d872c6e5ad8f31c9a0
Because the data isn't available when an md device is first created
(we add disks and set it up after creation), the initial partition
scan finds nothing. It is not until the device is opened that
another partition scan happens and finds something.
So at the point where the kernel parameter "root=/dev/md_d0p1" is
being parsed, md_d0 exists, but md_d0p1 does not.
However if we let blk_lookup_devt return the correct device number
even though the device doesn't exist, then the attempt to mount it
will successfully find the partition.
I have tried in the past to find a way to get the partition table to
be read as soon as the array is assembled but that proved impossible
(at the time). I don't remember the details, and could possibly
revisit it. However it would be really nice if blk_lookup_devt
could be adjusted to again accept non existant partitions.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
We can't OR shift values, so get rid of BIO_RW_SYNC and use BIO_RW_SYNCIO
and BIO_RW_UNPLUG explicitly. This brings back the behaviour from before
213d9417fec62ef4c3675621b9364a667954d4dd.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
When submitting requests via SG_IO, which does a sync io, a
bsg_command is not allocated. So an in-Kernel sense_buffer was not
set. However when calling blk_execute_rq() with no sense buffer
one is provided from the stack. Now bsg at blk_complete_sgv4_hdr_rq()
would check if rq->sense_len and a sense was requested by sg_io_v4
the rq->sense was copy_user() back, but by now it is already mangled
stack memory.
I have fixed that by forcing a sense_buffer when calling bsg_map_hdr().
The bsg_command->sense is provided in the write/read path like before,
and on-the-stack buffer is provided when doing SG_IO.
I have also fixed a dprintk message to print rq->errors in hex because
of the scsi bit-field use of this member. For other block devices it
does not matter anyway.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
Impact: cleanup
Move blktrace.c to kernel/trace, also move its config entry.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: cleanup
To make it easy for ftrace plugin writers, as this was open coded in
the existing plugins
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: new API
These new functions do what previously was being open coded, reducing
the number of details ftrace plugin writers have to worry about.
It also standardizes the handling of stacktrace, userstacktrace and
other trace options we may introduce in the future.
With this patch, for instance, the blk tracer (and some others already
in the tree) can use the "userstacktrace" /d/tracing/trace_options
facility.
$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
trace_vprintk | -5
trace_graph_return | -22
trace_graph_entry | -26
trace_function | -45
__ftrace_trace_stack | -27
ftrace_trace_userstack | -29
tracing_sched_switch_trace | -66
tracing_stop | +1
trace_seq_to_user | -1
ftrace_trace_special | -63
ftrace_special | +1
tracing_sched_wakeup_trace | -70
tracing_reset_online_cpus | -1
13 functions changed, 2 bytes added, 355 bytes removed, diff: -353
linux-2.6-tip/block/blktrace.c:
__blk_add_trace | -58
1 function changed, 58 bytes removed, diff: -58
linux-2.6-tip/kernel/trace/trace.c:
trace_buffer_lock_reserve | +88
trace_buffer_unlock_commit | +86
2 functions changed, 174 bytes added, diff: +174
/tmp/vmlinux.after:
16 functions changed, 176 bytes added, 413 bytes removed, diff: -237
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: API change, cleanup
>From ring_buffer_{lock_reserve,unlock_commit}.
$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
trace_vprintk | -14
trace_graph_return | -14
trace_graph_entry | -10
trace_function | -8
__ftrace_trace_stack | -8
ftrace_trace_userstack | -8
tracing_sched_switch_trace | -8
ftrace_trace_special | -12
tracing_sched_wakeup_trace | -8
9 functions changed, 90 bytes removed, diff: -90
linux-2.6-tip/block/blktrace.c:
__blk_add_trace | -1
1 function changed, 1 bytes removed, diff: -1
/tmp/vmlinux.after:
10 functions changed, 91 bytes removed, diff: -91
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: cleanup
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: simplification of tracers
As all tracers are doing this we might as well do it in
register_ftrace_event and save one branch each time we call these
callbacks.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
As they actually all return these enumerators.
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: bugfix and cleanup
Some callsites were returning either TRACE_ITER_PARTIAL_LINE if the
trace_seq routines (trace_seq_printf, etc) returned 0 meaning its buffer
was full, or zero otherwise.
But...
/* Return values for print_line callback */
enum print_line_t {
TRACE_TYPE_PARTIAL_LINE = 0, /* Retry after flushing the seq */
TRACE_TYPE_HANDLED = 1,
TRACE_TYPE_UNHANDLED = 2 /* Relay to other output functions */
};
In other cases the return value was not being relayed at all.
Most of the time it didn't hurt because the page wasn't get filled, but
for correctness sake, handle the return values everywhere.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: cleanup
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: new feature
With this and a blkrawverify modified not to verify the sequence numbers
we can start using the userspace tools to verify that the data produced
with the ftrace plugin works as expected.
Example:
[root@f10-1 ~]# echo 1 > /sys/block/sda/sda1/trace/enable
[root@f10-1 ~]# echo bin > /d/tracing/trace_options
[root@f10-1 ~]# echo blk > /d/tracing/current_tracer
[root@f10-1 ~]# cat /d/tracing/trace_pipe > sda1.blktrace.0
^C
[root@f10-1 ~]# ./blkrawverify --noseq sda1
Verifying sda1
CPU 0
Wrote output to sda1.verify.out
[root@f10-1 ~]# cat sda1.verify.out
---------------
Verifying sda1
---------------------
Summary for cpu 0:
1349 valid + 0 invalid (100.0%) processed
[root@f10-1 ~]#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Impact: API change
The trace_seq and trace_entry are in trace_iterator, where there are
more fields that may be needed by tracers, so just pass the
tracer_iterator as is already the case for struct tracer->print_line.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
tracing/core
|
|
Some initial probe requests don't have disk->queue mapped yet, so we
can't rely on a non-NULL queue in blk_queue_io_stat(). Wrap it in
blk_do_io_stat().
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
This patch adds the ability to pre-empt an ongoing BE timeslice when a RT
request is waiting for the current timeslice to complete. This reduces the
wait time to disk for RT requests from an upper bound of 4 (current value
of cfq_quantum) to 1 disk request.
Applied Jens' suggeested changes to avoid the rb lookup and use !cfq_class_rt()
and retested.
Latency(secs) for the RT task when doing sequential reads from 10G file.
| only RT | RT + BE | RT + BE + this patch
small (512 byte) reads | 143 | 163 | 145
large (1Mb) reads | 142 | 158 | 146
Signed-off-by: Divyesh Shah <dpshah@google.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|