qemu-devel

[Qemu-devel] Re: [PATCH v3 0/2] Inter-VM shared memory PCI device


From: Cam Macdonell
Subject: [Qemu-devel] Re: [PATCH v3 0/2] Inter-VM shared memory PCI device
Date: Thu, 25 Mar 2010 11:35:39 -0600

On Thu, Mar 25, 2010 at 11:02 AM, Avi Kivity <address@hidden> wrote:
> On 03/25/2010 06:50 PM, Cam Macdonell wrote:
>>
>>> Please put the spec somewhere publicly accessible with a permanent URL.
>>> I suggest a new qemu.git directory specs/.  It's more important than the
>>> code IMO.
>>>
>>
>> Sorry to be pedantic, but do you want a URL, or the spec as part of a
>> patch that adds it as a file in qemu.git/docs/specs/?
>>
>
> I leave it up to you.  If you are up to hosting it independently, then just
> post a URL as part of the patch.  Otherwise, I'm sure qemu.git will be more
> than happy to be the official repository for the memory sharing device
> specification.  In that case, make the spec the first patch in the
> series.

OK, I'll send it as part of the series; that way people can comment
inline easily.

>
>>> Possible later extensions:
>>> - multiple doorbells that trigger different vectors
>>> - multicast doorbells
>>>
>>
>> Since the doorbells are exposed, multicast could be done by the
>> driver.  If multicast is handled by qemu, then we have different
>> behaviour when using ioeventfd/irqfd, since only one eventfd can be
>> triggered by a write.
>>
>
> Multicast by the driver would require one exit per guest signalled.
>  Multicast by the shared memory server needs one exit to signal an eventfd,
> then the shared memory server signals the irqfds of all members of the
> multicast group.
>
>>>> The semantics of the value written to the doorbell depends on whether
>>>> the device is using MSI or a regular pin-based interrupt.
>>>>
>>>>
>>>
>>> I recommend against making the semantics interrupt-style dependent.  It
>>> means the application needs to know whether MSI is in use or not, while
>>> it is generally the OS that is in control of that.
>>>
>>
>> It is basically the use of the status register that is the difference.
>>  The application view of what is happening doesn't need to change,
>> especially with UIO: write to doorbells, block on read until interrupt
>> arrives.  In the MSI case I could set the status register to the
>> vector that is received, and then the two cases would be equivalent
>> from the view of the application.  But if future MSI support in UIO
>> allows MSI
>> information (such as vector number) to be accessible in userspace,
>> then applications would become MSI dependent anyway.
>>
>
> Ah, I see.  You adjusted for the different behaviours in the driver.
>
> Still I recommend dropping the status register: this allows single-msi and
> PIRQ to behave the same way.  Also, it is racy: if two guests signal a
> third, they will overwrite each other's status.

With shared interrupts under PIRQ and no status register, how does a
device know it generated the interrupt?

>
>>> ioeventfd/irqfd are an implementation detail.  The spec should not depend
>>> on it.  It needs to be written as if qemu and kvm do not exist.  Again, I
>>> recommend Rusty's virtio-pci for inspiration.
>>>
>>> Applications should see exactly the same thing whether ioeventfd is
>>> enabled or not.
>>>
>>
>> The challenge I recently encountered with this is one line in the
>> eventfd implementation
>>
>> from kvm/virt/kvm/eventfd.c
>>
>> /* MMIO/PIO writes trigger an event if the addr/val match */
>> static int
>> ioeventfd_write(struct kvm_io_device *this, gpa_t addr, int len,
>>         const void *val)
>> {
>>     struct _ioeventfd *p = to_ioeventfd(this);
>>
>>     if (!ioeventfd_in_range(p, addr, len, val))
>>         return -EOPNOTSUPP;
>>
>>     eventfd_signal(p->eventfd, 1);
>>     return 0;
>> }
>>
>> IIUC, no matter what value is written to an ioeventfd by a guest, a
>> value of 1 is written.  So ioeventfds work differently than eventfds.
>> Can we add a "multivalue" flag to ioeventfds so that the value that
>> the guest writes is written to eventfd?
>>
>
> Eventfd values are a counter, not a register.  A read() on the other side
> returns the sum of all write()s (or eventfd_signal()s).  In the context of
> irqfd it just means the number of interrupts we coalesced.
>
> Multivalue was considered at one time for a different need and rejected.
>  Really, to solve the race you need a queue, and that can only be done in
> the shared memory segment using locked instructions.

I had a hunch it had probably been considered.  That explains why irqfd
doesn't have a datamatch field.  I guess supporting multiple MSI
vectors with one doorbell per guest isn't possible if only 1 bit of
information can be communicated.
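
To make the counter semantics concrete, here is a minimal userspace
sketch, assuming Linux eventfd(2) in its default (non-semaphore) mode.
The helper names are hypothetical and are not taken from kvm or qemu; it
only shows that two writes are summed and the individual values are lost
to the reader.

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Add `value` to the eventfd counter; returns 0 on success, -1 on error. */
static int signal_eventfd(int fd, uint64_t value)
{
    return write(fd, &value, sizeof(value)) == (ssize_t)sizeof(value) ? 0 : -1;
}

/* Read the accumulated counter (which resets it to 0); 0 on success. */
static int drain_eventfd(int fd, uint64_t *out)
{
    return read(fd, out, sizeof(*out)) == (ssize_t)sizeof(*out) ? 0 : -1;
}

/*
 * Two hypothetical senders write 3 and 5 before the receiver reads.
 * The receiver gets the sum, so it cannot tell which values were written.
 * Returns the value seen by the reader, or 0 if eventfd() fails.
 */
static uint64_t demo_coalesced_signals(void)
{
    uint64_t total = 0;
    int fd = eventfd(0, 0);   /* initial count 0, no flags */

    if (fd < 0)
        return 0;
    signal_eventfd(fd, 3);
    signal_eventfd(fd, 5);
    drain_eventfd(fd, &total);
    close(fd);
    return total;
}
```

Calling demo_coalesced_signals() yields a single read covering both
writes, which is why a vector number cannot be carried this way.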

So, ioeventfd/irqfd restricts MSI to 1 vector between guests.  Should
multi-MSI even be supported then in the non-ioeventfd/irqfd case?
Otherwise ioeventfd/irqfd become more than an implementation detail.
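
For reference, the queue in the shared memory segment that Avi mentions
could look roughly like the sketch below.  This is only an illustration
under stated assumptions: the struct layout, names and RING_SLOTS are
hypothetical, it uses C11 atomics for the locked instructions, it assumes
a single consumer, and it does no overflow handling (at most RING_SLOTS
entries may be pending at once).

```c
#include <stdatomic.h>

#define RING_SLOTS 64   /* hypothetical size; must divide 2^32 evenly */

/* Would live in the shared memory segment, zero-initialised. */
struct vec_ring {
    atomic_uint tail;               /* next slot a sender will claim */
    unsigned head;                  /* read index, private to the receiver */
    atomic_uint slot[RING_SLOTS];   /* 0 = empty, otherwise vector + 1 */
};

/* Sender: claim a slot with one atomic add, then publish the vector. */
static void ring_post(struct vec_ring *r, unsigned vector)
{
    unsigned idx = atomic_fetch_add(&r->tail, 1) % RING_SLOTS;

    atomic_store(&r->slot[idx], vector + 1);
}

/*
 * Receiver: take the oldest pending entry, if any.
 * Returns 1 and stores the vector in *vector, or 0 if the next slot
 * is still empty.
 */
static int ring_take(struct vec_ring *r, unsigned *vector)
{
    unsigned idx = r->head % RING_SLOTS;
    unsigned v = atomic_exchange(&r->slot[idx], 0);

    if (v == 0)
        return 0;
    *vector = v - 1;
    r->head++;
    return 1;
}
```

A sender would call ring_post() and then ring the doorbell; the receiver
would call ring_take() in a loop on each interrupt until it returns 0.
With something like this the doorbell only needs to carry 1 bit.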

>
> --
> error compiling committee.c: too many arguments to function
>
>



