qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v8 7/7] virtio-console: Enable port throttling w


From: Amit Shah
Subject: Re: [Qemu-devel] [PATCH v8 7/7] virtio-console: Enable port throttling when chardev is slow to consume data
Date: Wed, 8 Dec 2010 19:55:38 +0530
User-agent: Mutt/1.5.21 (2010-09-15)

On (Wed) Dec 08 2010 [12:56:33], Paul Brook wrote:
> > > > Sure.  My proposal is for qemu_chr_write() to succeed all the time.  If
> > > > the backend can block, and the caller can handle it, it can get a
> > > > -EAGAIN (or WSAEWOULDBLOCK) return.  When the backend becomes writable,
> > > > the chardev can call the ->writes_unblocked() callback for that caller.
> > > > Individual callers need not bother about re-submitting partial writes,
> > > > since the buffering can be done in common code in one place
> > > > (qemu-char.c).
> > > 
> > > That's only OK if you assume it's OK to buffer all but one byte of the
> > > transmit request.
> > 
> > Is that a fair assumption to make?
> 
> No. See below.
> 
> > > > But that's entirely in guest memory, so it's limited to the amount of
> > > > RAM that has been allocated to the guest.
> > > 
> > > Exactly. The guest can cause ram_size * nr_ports of additional host
> > > memory to be allocated.  Not acceptable.
> > 
> > OK -- so this is how it adds up:
> > 
> > - guest vq
> > - virtio-serial-bus converts iov to buf
> 
> This is an unbelievably lame piece of code.

I doubt it's 'unbelievably lame' just because of the copy.  Care to
expand?

> There's absolutely no reason to 
> copy the data into a linear buffer. You should just be iterating over the 
> elements of the sglist.

OK, but that can be done in a separate patch series.

> > - qemu-char stores the buf in case it wasn't able to send
> >
> > but then, since it's all async, we have:
> > 
> > - virtio-serial-bus frees the buf
> > - guest deletes the buf and removes it from the vq
> >
> > So what's left is only the data in qemu-char's buf.  Now this can be
> > (buf_size - 1) * nr_ports in the worst case.
> 
> Add at least another buf_size because you have to allocate the qemu-char 
> buffer before you free the virtio-serial buffer. We can expect that
> buf_size ~= guest ram size [1], so for practical purposes it may as well be 
> unbounded.

Now this only happens when the host chardev is slow or isn't being read
from.  So it's not really a guest causing a host DoS, but a guest
causing itself some harm.  You're right that the allocations happen one
after the other, and the freeing happens later, so there is a time when
2 or 3 times the buf_size is needed.

However, once qemu_chr_write() returns, there could be just one copy
lying around, things are freed elsewhere.

> Worst case the guest could submit a buffer consisting of many large 
> overlapping regions, giving a buf_size many times greater then guest ram 
> size.  
> Technically such code isn't even doing anything wrong. We're only reading 
> from 
> guest RAM, so in principle the same memory can be submitted multiple times 
> without causing a race condition.

Interesting.

> > The alternative would be to keep using the guest buffer till all's done,
> 
> Yes.
> 
> > but then that depends on qemu getting async support - separating out the
> > qemu_chr_write() into a separate thread and allowing vcpu and chr io
> > operations to be run simultaneously.
> 
> You don't need any special async char API or threads.  Normal unix write 
> semantics (i.e. short writes and EAGAIN) plus the unblock hook are sufficient.
> As mentioned above, the virtio-serial code should be iterating over the 
> sglist.  If the host won't accept all the data immediately then just remember 
> how much has been sent, and resume iteration when the unblock hook is called.

Yes I've been thinking about this as well.  But the problem is some
kernel versions spin in the guest code till the buffer is placed back
in the vq (signalling it's done using it).  This is a problem for the
virtio-console (hvc) that does writes with spinlocks held, so allocating
new buffers, etc., isn't really -- possible easily.

                Amit



reply via email to

[Prev in Thread] Current Thread [Next in Thread]