qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] trace: timestamps, core IDs, and file creation


From: Hollis Blanchard
Subject: Re: [Qemu-devel] trace: timestamps, core IDs, and file creation
Date: Mon, 8 Feb 2016 11:59:00 -0800
User-agent: Mozilla/5.0 (X11; Linux i686; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

On 02/08/2016 07:30 AM, Stefan Hajnoczi wrote:
On Wed, Jan 13, 2016 at 03:13:02PM -0800, Hollis Blanchard wrote:
The event timestamps are host time (get_clock()). I'm correlating qemu
events with other logs (using icount), so host time is unhelpful. Could we
use cpu_get_clock() instead? (Trace events are used in other tools like
qemu-io, where guest time doesn't exist, and there we could continue to use
get_clock().)
It must be possible to trace code while the guest CPUs are stopped, so
cpu_get_clock() on own breaks existing users.

If the CPU clock is more convenient for you, perhaps you can add an
option to use that clocksource?

"Breaks"? Is that because every trace record is assumed to have a unique timestamp? Who makes that assumption? Qemu's tracing infrastructure works fine without it...

Hmm, an option. I guess if you want to debug Qemu behavior, like "why are block operations taking so long", seeing host time is the most important thing. However, if you want debug visibility into guest behavior, like "how long did I run with IRQs disabled", guest time is the most important thing. I'm not sure when a developer should be forced to make that decision...?

When trying to understand multi-core guest behavior, it's pretty important
to know which core is performing the traced action (e.g. MMIO). Would it
make sense to automatically embed the core index, like the timestamp, or do
you think it should be encoded in each individual tracepoint?
I think that Lluís has some of this stuff automated in his TCG tracing
work.  He has added special trace event types for TCG that can be
planted at code generation time as well as TB execution time.  They can
include the vcpu number automatically IIRC.

Neat.

Regarding I/O emulation, we have to be careful because architecturally
some types of devices (e.g. PCI devices) don't have the concept of which
core is performing an action.  QEMU takes advantage of that with
ioeventfd where a lightweight vmexit simply marks an eventfd file
descriptor readable in the kernel and quickly returns back to vcpu
execution.  Another QEMU thread monitors the eventfd and processes the
notification and there is no concept of the current vcpu at that point.

Yeah, by "MMIO" I was thinking in particular of memory.c instrumentation, which almost always executes in the context of a vcpu.

I'm not sure a vcpu property could work there. cpu_physical_memory_rw() may be executed by a machine init function, before any VCPUs are running. To handle that, I have a hack like this in memory_region_read_accessor():
    int cpu_index = -1;
    if (current_cpu)
        cpu_index = current_cpu->cpu_index;
    trace_memory_region_ops_read(cpu_index, mr, addr, tmp, size);
... and then my analysis tools, which display guest activity, know to ignore events with cpu_index==-1.

I'm not sure if a vcpu property could or should do the same...?

It might be easiest to include the vcpu id in the trace event explicitly
for now.

Mmm hmm. That could cause a lot of churn to the current set of trace events, but they seem to be mostly IO operations that execute outside VCPU context.

Right now, the simple tracepoint backend creates a trace file even when no
events are active and no -trace options are present. Compounding the
situation, omitting "-trace file=foo" results in the directory being
littered with trace-NNNN files... could we create the file only when tracing
is activated? And maybe with a more predictable name?
It should be possible to create the file only if traces are actually
generated.  I guess this wasn't done for simplicity and because the
write-out thread needs to empty the ring buffer as quickly as possible
to avoid dropping trace data.

The names contains the QEMU pid.  Although the pid isn't predictable it
is useful for correlating with system logs, host performance tools, etc.

You can override the name with -trace file=foo.log if you want.

Do you have an idea for another automatic file naming scheme?

"qemu.trace" or something static. If the log is important, the user will copy it anyways to give it a descriptive name, e.g. "out_of_memory.trace". If the log is not important, it should be overwritten by the next run; otherwise the workflow becomes:

qemu-system-... args args args
ls trace-*
rm trace-NNNN
qemu-system-... args args args
ls trace-*
rm trace-MMMM


Hollis Blanchard
Mentor Graphics Emulation Division

reply via email to

[Prev in Thread] Current Thread [Next in Thread]