qemu-devel

[Qemu-devel] Re: KVM call agenda for Jan 11


From: Juan Quintela
Subject: [Qemu-devel] Re: KVM call agenda for Jan 11
Date: Tue, 11 Jan 2011 14:41:44 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

Kevin Wolf <address@hidden> wrote:
> On 10.01.2011 14:32, Juan Quintela wrote:
>> Juan Quintela <address@hidden> wrote:
>>> Juan Quintela <address@hidden> wrote:
>>>
>>> Now sending it to the right kvm list.  Sorry for the duplicate message.
>>>
>>>> Please send any agenda items you are interested in covering.
>>>>
>>>> - KVM Forum 2011 (Jes).
>>>>
>>>> thanks, Juan.
>> 
>> - migration and block devices: a mess.
>>   * patches I sent last week: only work for root (for some definition of
>>     work)
>>   * qemu is used as non-root user.
>>   * forcing to have cache=none solves the issue
>
> I need to have a look at the specific problem, but it's hard to imagine
> that cache=none fixes anything reliably.

It uses O_DIRECT, which means that we don't have buffering problems.
Let me state the problem again:

machine A reads the 1st block of the device
<and stays without doing anything else>
machine B reads and writes lots of places, including the 1st block

now the guest migrates from machine B to machine A
machine A re-reads the 1st block, and lo and behold, it reads the old
contents, not the new ones.
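
To make the sequence concrete, here is a toy model of it in C.  It is
only a sketch: the structs and the host_* functions are invented for
illustration and have nothing to do with the real kernel buffer cache.
Each "host" keeps a private copy of block 0 and serves reads from it
until told to invalidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Toy shared storage: a single block visible to both hosts. */
struct shared_disk {
    char block0[BLOCK_SIZE];
};

/* Toy per-host buffer cache that can hold block 0. */
struct host {
    struct shared_disk *disk;
    char cached[BLOCK_SIZE];
    bool cache_valid;
};

static void host_init(struct host *h, struct shared_disk *d)
{
    h->disk = d;
    h->cache_valid = false;
    memset(h->cached, 0, BLOCK_SIZE);
}

/* Buffered read: served from the local cache once it is populated. */
static const char *host_read(struct host *h)
{
    if (!h->cache_valid) {
        memcpy(h->cached, h->disk->block0, BLOCK_SIZE);
        h->cache_valid = true;
    }
    return h->cached;
}

/* Write: updates the shared disk and the writer's own cache only. */
static void host_write(struct host *h, const char *data)
{
    snprintf(h->disk->block0, BLOCK_SIZE, "%s", data);
    memcpy(h->cached, h->disk->block0, BLOCK_SIZE);
    h->cache_valid = true;
}

/* What the "solutions" have to achieve on the stale host. */
static void host_invalidate(struct host *h)
{
    h->cache_valid = false;
}
```

In this model, A reading, B writing, and A reading again gives A the
old contents; only after host_invalidate() on A does the read go back
to the shared disk.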

Solutions:
- invalidate all buffers for that block device on machine A after
  migration.
   * with NFS, just close + reopen the file (and pray that nobody else
     also has it open)
   * with block devices: use the BLKFLSBUF ioctl, and pray that nobody
     else is using the device, that the device is not a ramdisk, and
     some more things.  To add insult to injury, you need to be root to
     be able to issue that ioctl (technically, to have CAP_SYS_ADMIN).
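
A rough sketch of both variants, assuming Linux, where the
buffer-flushing ioctl is spelled BLKFLSBUF in <linux/fs.h>.  The helper
names reopen_image() and flush_blockdev_buffers() are made up for
illustration; this is not what qemu does today.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <linux/fs.h>    /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * NFS variant (hypothetical helper): close the descriptor and open the
 * file again, so the client revalidates its cached data on open.
 * Returns 0 on success; on failure returns -1 and leaves *fd at -1.
 */
int reopen_image(int *fd, const char *path, int flags)
{
    assert(fd != NULL && path != NULL);
    if (*fd >= 0)
        close(*fd);
    *fd = open(path, flags);
    return *fd < 0 ? -1 : 0;
}

/*
 * Block device variant (hypothetical helper): ask the kernel to drop
 * the buffered data for the device.  Needs CAP_SYS_ADMIN, so it fails
 * for an unprivileged qemu.  Returns 0 on success, -1 with errno set.
 */
int flush_blockdev_buffers(const char *dev_path)
{
    assert(dev_path != NULL);
    int fd = open(dev_path, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = ioctl(fd, BLKFLSBUF, 0);
    int saved_errno = errno;   /* close() may clobber errno */
    close(fd);
    errno = saved_errno;
    return ret < 0 ? -1 : 0;
}
```

Neither helper does anything about the "pray" part: another process
holding the file or device open is simply not detected here.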

O_DIRECT fixes this problem altogether, because there is no buffering,
and if there are no buffers, they can't be invalid O:-)
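
For reference, a minimal sketch of what an O_DIRECT read looks like
from userspace on Linux.  The helper names and the 4096-byte alignment
are assumptions for the example; real code should ask the device or
filesystem for its alignment requirements.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      /* glibc only exposes O_DIRECT with this */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define DIO_ALIGN 4096   /* assumed alignment, see note above */

/* Allocate a buffer aligned for O_DIRECT I/O; release with free(). */
void *dio_alloc(size_t len)
{
    void *buf = NULL;
    if (posix_memalign(&buf, DIO_ALIGN, len) != 0)
        return NULL;
    return buf;
}

/*
 * Read len bytes at offset, bypassing the host page cache.
 * buf, len and offset must all be multiples of DIO_ALIGN.
 * Returns the number of bytes read, or -1 with errno set.
 */
ssize_t dio_read_block(const char *path, void *buf, size_t len, off_t offset)
{
    assert(((uintptr_t)buf % DIO_ALIGN) == 0);
    assert(len % DIO_ALIGN == 0 && offset % DIO_ALIGN == 0);

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    ssize_t n = pread(fd, buf, len, offset);
    close(fd);
    return n;
}
```

Some filesystems refuse O_DIRECT at open time, in which case the
helper just returns -1.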

Notice the "pray" part in the other solutions: we are basically trying
to do a "poor man's" DLM, and that is not trivial to do (although our
problem is not the general one, the principles are the same).

Later, Juan.



