From: Lluís Vilanova
Subject: Re: [Qemu-devel] [PATCH v3 0/6] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches
Date: Mon, 26 Dec 2016 20:41:07 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)

Richard Henderson writes:

> On 12/23/2016 10:51 AM, Lluís Vilanova wrote:
>>> On 12/22/2016 10:35 AM, Lluís Vilanova wrote:
>>>> To handle both issues, this series replicates the shared physical TB cache,
>>>> creating a separate physical TB cache for every combination of event states
>>>> (those with the 'vcpu' and 'tcg' properties). Then, all vCPUs tracing the same
>>>> events will use the same physical TB cache.
>> 
>>> Why do we need to "split the physical TB cache" as opposed to simply including
>>> the trace state into the TB hash function?
>> 
>> Mmmm, that's an interesting alternative I did not consider. Are you aiming at
>> minimizing the changes, or do you also think it would be more efficient?

> I suspect that it will be more efficient.

I discovered that I had run my tests with "--enable-debug". Without it, the
performance changes for this series are within the noise range (great news:
guest code events now come for free), and the TB hash function approach should
show the same kind of results on the hot path:

* vanilla, statically disabled
real    0m2,259s
user    0m2,252s
sys     0m0,004s

* vanilla, statically enabled (overhead: 2.18x)
real    0m4,921s
user    0m4,912s
sys     0m0,008s

* multi-tb, statically disabled (overhead: 1.00x)
real    0m2,269s
user    0m2,248s
sys     0m0,020s

* multi-tb, statically enabled (overhead: 0.98x)
real    0m2,223s
user    0m2,204s
sys     0m0,016s


The only drawback I see with going for your proposal is a potentially higher TB
collision rate when users set different tracing states on different vCPUs (after
all, my series keeps a separate QHT for each combination of tracing states).
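
For reference, the multi-cache side of my series boils down to something like
this (just a sketch; the field and variable names are illustrative rather than
the exact ones in the patches):

    /* Sketch of the separate-cache approach (names are illustrative):
     * one physical TB cache (a QHT) per combination of per-vCPU tracing
     * states, indexed directly by the vCPU's trace dstate bitmap, so all
     * vCPUs tracing the same events share the same cache. */
    static struct qht *tb_htables;   /* 1 << nr_vcpu_events entries */

    static inline struct qht *tb_cache_for_cpu(CPUState *cpu)
    {
        /* the dstate bitmap is kept within an unsigned long so it can be
         * used as an index into the cache "matrix" */
        return &tb_htables[cpu->trace_dstate];
    }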


>> The dynamic tracing state would then be an arbitrarily long bitmap (defined by
>> the number of events with the 'vcpu' property), so I'm not sure how to fit it
>> into the hashing function with minimal collisions (the bitmap is now limited to
>> an unsigned long to use it as an index to the TB cache "matrix").

> You could consider that index a unique identifier for the tracing state, and
> then only compare and hash that integer.

Yes, I get that. I meant that the bitmap has a theoretically unbounded size. But
we can cap it at some fixed number of bits to keep the hashing function's input
fixed (in your example you are using 32 bits, which should be enough).
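
Just so we are on the same page, this is roughly how I picture it (a sketch
only, not your actual patch; the extra parameter and the tb/cpu field names are
my assumptions):

    /* Sketch: fold a 32-bit tracing-state id into the TB hash, and also
     * compare it explicitly on lookup.  Names here are assumptions. */
    static inline uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc,
                                        uint32_t flags, uint32_t trace_dstate)
    {
        /* xor keeps the sketch short; a real patch would more likely feed
         * trace_dstate to the xxhash as one more 32-bit input word */
        return tb_hash_func5(phys_pc, pc, flags ^ trace_dstate);
    }

    /* ... and the lookup (tb_htable_lookup()/tb_cmp()) would additionally
     * require tb->trace_dstate == cpu->trace_dstate before accepting a
     * candidate TB. */

If that is what you have in mind, then the collision concern I mentioned above
mostly comes down to how well that extra word gets mixed into the hash.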


>> The other drawback I see is that then it would also take longer to compute the
>> hashing function, instead of the simpler array indexing. As a benefit, workloads
>> with a high frequency of TB-flushing operations might be a bit faster (there
>> would be a single QHT).

> I don't see adding one more integer to the hashing function to be significant
> at all.  Certainly not the 15% that you describe in your cover letter.

See results above.


>> If someone can provide me the code for the modified hash lookup function to
>> account for the trace dstate bitmap contents, I will integrate it and measure if
>> there is any important change in performance.

> Something like the following should do it.  There are two /* cpu->??? */
> markers that would need to be filled in.

Thanks for the code! If you think the TB hash function approach is the less
invasive change, I'll send a new series based on it.


> If you can reduce the tracing identifier to 8 bits, that would be excellent.
> I've been wanting to make some other changes to TB hashing, and that would fit
> in well with a second "flags" value.

With all the vCPU events I have in the queue, it should be on the order of 16
bits. But we cannot shrink it down to 8 bits at compile time now, since we no
longer have a define with the number of vCPU events.


Cheers,
  Lluis


