qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] ivshmem property size should be a size, not a string


From: Markus Armbruster
Subject: Re: [Qemu-devel] ivshmem property size should be a size, not a string
Date: Mon, 23 Nov 2015 14:25:55 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Marc-André Lureau <address@hidden> writes:

> Hi
>
> ----- Original Message -----
>>
>> >> qemu-doc documents role only with chardev.  The code doesn't care.
>> >
>> > yeah, role is only really useful with a server. Another missing warning.
>> 
>> I think it makes sense only when we can migrate the shared memory
>> contents out-of-band.  Vaguely similar to block devices: either you
>> migrate them along (block migration), or you have to ensure they have
>> identical contents on the target in some other way.
>> 
>> Is the latter possible only with a server?
>
> The way ivshmem role works is:
> - master: will handle migration
> - peer: can't migrate
>
> It's not out-of-band, qemu handles the shared memory migration, even if
> the memory is owned by the server. In practice, it means you can only
> migrate one VM, or you migrate them all but you will migrate the shared
> memory N times and will have to synchronize in your guest app. I suppose 
> ivshmem
> "role" was designed to only migrate the master. Ability to migrate a pool of
> peer would be a significant new feature. I am not aware of such request.

I see.  But how is this supposed to work?

Before migration: one master and N peers connected to the server on host
A, N>=0.

After migration: one master and N' of the N peers connected to the
server on host B, N>=N'>=0, and the remaining N-N' peers still on host A
with their ivshmem device unplugged.

How would I do this even for N'==0?  I can't see how I'm supposted to
connect the migrated shared memory to a server on host B.

>> >> This is a byzantine mess even for QEMU standards.
>> >> 
>> >> Questions:
>> >> 
>> >> * Is it okay to have an unmapped BAR?
>> >
>> > I can't say much, though I remember I did some tests and it was ok.
>> 
>> Did you try chardev=...,size=S, where S is larger than what the server
>> provides?
>
> It will fall in check_shm_size().

Yes.  Called from ivshmem_read().  ivshmem_read() will then complain to
stderr, close the file descriptor we got from the server and leave the
BAR unmapped.  My question is how guests deal with that state.  Could be
anything from "detect the device is broken and fence it" to "kernel
panic".

Whatever it is, it could easily also happen if the guest wins the race
with the server and tries to use the device before it successfully got
its shared memory from the server.

>> A guest that fails there probably works for sufficiently small S only
>> when it loses the race with the server.
>> 
>> But my question is actually whether it's okay for a PCI device to
>> advertize a BAR, and then not map anything there.
>> 
>> Should realize wait for the server providing the shared memory?
>> 
>> >> * We actually have two distinct device kinds:
>> >> 
>> >>   - The basic device: shm with parameter size, or memdev
>> >> 
>> >>   - The doorbell-capable device: chardev with parameters size, vectors,
>> >>     ioeventfd=on, msi
>> >> 
>> >>   Common parameters: use64, role, although the latter is only documented
>> >>   for the doorbell-capable device.
>> >> 
>
>> 
>> >> * Are we *sure* this is ready to become ABI?  I doubt it's ABI yet,
>> >>   because before Marc-André's massive rework, it was even worse (*much*
>> >>   worse), to a degree that makes me doubt anybody could use it
>> >>   seriously.
>> >
>> > It's supposed to be the same ABI as before, with the memdev addition.
>> 
>> Well, it's *crap*.  We shouldn't extend and known crap so it can get
>> used more widely, we should deprecate it in favour of something less
>> crappy.
>
> I think you overstate this, if it would be so bad I don't think anyone could 
> use
> it and it wouldn't probably be in qemu. We are mainly talking misconfiguration
> and missing features (that people may not actually need). I think
> before we split things, we can address your comments with more
> options checking and documentation.

We have at least two problems:

1. Unless the guest can reliably detect the doorbell feature, the
   doorbell feature is *broken*.

   As far as I can tell, a device with a doorbell behaves exactly like
   one without a doorbell until it got its shared memory from the
   server.  If that's correct, then doorbell detection is inherently
   racy.

   The only way to fix this in documentation is "broken, do not use".

   The maximally compatible way to fix this in code is to ensure the
   guest can't read register IVPosition before we got the shared memory
   from the server.  We can make realize wait, or the read.  The latter
   is probably an even worse idea.

   An easier way to fix it in code is splitting up the device, so guests
   can simply check the PCI device ID to figure out whether they have
   one with a doorbell.

   An even easier way is dropping the doorbell feature outright.

2. The UI is crap.

   We can fix this by rejecting nonsensical option combinations.
   However, the result will be more complex than splitting the device in
   two so that nonsensical options combinations are simply impossible.

   If we need to split it anyway to fix the doorbell, we can clean up
   the UI at next to no cost.

>> Here's my attempt:
>> 
>> 1. Split ivshmem into ivshmem-plain and ivshmem-doorbell.
>> 
>>    Each one gets its own PCI device ID, so that guests can trivially see
>>    whether the device got a doorbell.
>> 
>>    Both have property use64.
>> 
>>    ivshmem-plain additionally has property memdev.
>> 
>>    ivshmem-doorbell additionally has chardev, size, vectors, ioeventfd,
>>    msi.
>> 
>>    Undecided: property role (see above).
>> 
>>    The only non-sensical property combination is ioeventfd=on,msi=off,
>>    which we cleanly reject.
>> 
>>    Undecided: behavior before the server provides the shared memory, and
>>    what to do when it's less than size (see above).  This is the *only*
>>    part of my proposal that may require code changes beyond
>>    configuration.  If we can't do this before the release, punt
>>    ivshmem-doorbell to the next cycle.
>
>> 2. Drop ivshmem property memdev.
>> 
>>    Use ivshmem-plain instead.
>> 
>> 3. Deprecate ivshmem.
>
> It sounds like a reasonable thing to do, but I don't think the benefit
> is so important.

If we need to split & deprecate to fix the doorbell anyway, the earlier
we do it, the better.  Before your big rework, the device's general
crappiness should've limited its use.  But from now on, it can only get
harder.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]