Re: [Qemu-devel] cpufreq and QEMU guests


From: Gleb Natapov
Subject: Re: [Qemu-devel] cpufreq and QEMU guests
Date: Tue, 17 Sep 2013 11:58:38 +0300

On Mon, Sep 16, 2013 at 08:42:58PM +0200, Benoît Canet wrote:
> On Monday 16 Sep 2013 at 18:58:40 (+0300), Gleb Natapov wrote:
> > On Mon, Sep 16, 2013 at 05:46:04PM +0200, Benoît Canet wrote:
> > > On Monday 16 Sep 2013 at 18:32:39 (+0300), Gleb Natapov wrote:
> > > > On Mon, Sep 16, 2013 at 05:05:45PM +0200, Benoît Canet wrote:
> > > > > On Monday 16 Sep 2013 at 09:39:10 (-0500), Alexander Graf wrote:
> > > > > > 
> > > > > > 
> > > > > > > On 16.09.2013 at 07:15, Benoît Canet <address@hidden> wrote:
> > > > > > 
> > > > > > > 
> > > > > > > Hello,
> > > > > > > 
> > > > > > > > I know a cloud provider who is worried that the /proc/cpuinfo of
> > > > > > > > his guests gives a bogus frequency to his customer.
> > > > > > > > 
> > > > > > > > QEMU and the guest kernel currently have no way to reflect host
> > > > > > > > frequency changes to the guests.
> > > > > > > > 
> > > > > > > > The customer's compute-intensive application then reads this
> > > > > > > > information and makes wrong decisions.
> > > > > > 
> > > > > > > Why do they care about the frequency? Is it for scheduling
> > > > > > > workloads? The only other case I can think of would be the TSC,
> > > > > > > and that should be fixed frequency these days.
> > > > > > > 
> > > > > > > If it's scheduling, you could maybe expose the unavailable compute
> > > > > > > time as steal time to the guest. Exposing frequency in a virtual
> > > > > > > environment feels backwards.
> > > > > 
> > > > > > The final customer has a compute-intensive workload.
> > > > > > At startup the code retrieves the cpu cache topology, the cpu model,
> > > > > > and various other information, including the guest cpu frequency,
> > > > > > before starting the compute job.
> > > > > > The QEMU instance typically uses -cpu host.
> > > > > > 
> > > > > > The code inspects the cpu frequency as seen by the guests to choose
> > > > > > the number of vms to instantiate to compute the given task.
> > > > I am not sure I understand. They look at guest cpu frequency to estimate
> > > > guest's performance?
> > > 
> > > > Yes, they take guest cpu count, model and frequency to estimate the
> > > > performance of the guest.
> > > > Next they cluster enough guests to be able to compute the job in a
> > > > given time by using this estimate.
> > > 
> > > They do it wrong. They should take guest cpu count, host cpu model and
> > > frequency, pcpu/vcpu overcommit (if any), guest/host memory overcommit
> > > (if any) and estimate performance based on this. For pure computational
> > > performance, guest core performance should be close to host core
> > > performance if there is no cpu/memory overcommit. With a lot of IO
> > > things become more complicated.
> 
> > I omitted some details of the use case.
> > 
> > The cloud is an Amazon-compatible one, which means there is no guest agent
> > in the guest to help retrieve the host frequency and model.
> > 
> > Also, the AWS APIs don't provide a way to communicate the host CPU info to
> > the program responsible for the vm orchestration.
> > 
> > So the only interface to access the host cpu info is QEMU, and it's started
> > with -cpu host to pass the cpu model through to the guest.
> > 
> 
Why are they sure they are started with "-cpu host"? Do they know whether
the host is overcommitted or the guest's vcpu usage is restricted by any
other means?

> What hurts the final customer badly is that the guest /proc/cpuinfo shows
> the regular max frequency of the host cpu but won't show the turbo frequency
> or a scaled-down one.
> 
What he sees is the host tsc frequency of the cpu the guest was booted on
[1], which should be adequate to estimate performance if the guest is not
migrated. The frequency the host cpu is running at at any given moment is
outside the guest's control and depends on the host frequency governor and
load.

[1] The value comes from the host; for hosts without a constant tsc this is
    the maximum possible frequency.
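
For illustration only, here is a minimal Python sketch of how an application inside the guest might read the value discussed above. It assumes an x86 Linux guest whose /proc/cpuinfo exposes a "cpu MHz" field per logical cpu; the function names parse_cpu_mhz and read_guest_cpu_mhz are hypothetical and do not come from the thread. As the message above explains, the number returned is a boot-time, host-provided figure, not the frequency the host cpu is running at right now.

```python
def parse_cpu_mhz(cpuinfo_text):
    """Return the 'cpu MHz' value of each logical cpu, in file order.

    Assumes the x86 Linux /proc/cpuinfo layout ("key<tabs>: value").
    Other architectures may not expose a "cpu MHz" field at all, in
    which case the result is an empty list.
    """
    values = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            values.append(float(value))
    return values


def read_guest_cpu_mhz(path="/proc/cpuinfo"):
    """Read and parse the guest's own /proc/cpuinfo (Linux only)."""
    with open(path) as f:
        return parse_cpu_mhz(f.read())
```

Calling read_guest_cpu_mhz() in the guest returns one float per vcpu. Because the value does not follow turbo or scaled-down host frequencies, it is only a static input to a performance estimate, alongside cpu count, cpu model and any known overcommit.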

--
                        Gleb.


