[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 for-4.0 4/7] libvhost-user: Support tracking

From: Jason Wang
Subject: Re: [Qemu-devel] [PATCH v4 for-4.0 4/7] libvhost-user: Support tracking inflight I/O in shared memory
Date: Fri, 18 Jan 2019 17:26:46 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1

On 2019/1/18 下午3:01, Yongji Xie wrote:
On Fri, 18 Jan 2019 at 12:00, Jason Wang <address@hidden> wrote:

On 2019/1/18 上午11:32, Yongji Xie wrote:
On Thu, 17 Jan 2019 at 17:57, Jason Wang <address@hidden> wrote:
On 2019/1/15 下午10:51, Yongji Xie wrote:
Well, this may work but here're my points:

1) The code want to recover from backed crash by introducing extra space
to store inflight data, but it still depends on the backend to set/get
the inflight state

2) Since the backend could be killed at any time, the backend must have
the ability to recover from the partial inflight state

So it looks to me 1) tends to be self-contradictory and 2) tends to be
recursive. The above lines show how tricky could the code looks like.

Solving this at vhost-user level through at backend is probably wrong.
It's time to consider the support from virtio itself.

I agree that supporting this in virtio level may be better. For
example, resubmitting inflight I/O once DEVICE_NEEDS_RESET is set in
Stefan's proposal. But I still think QEMU should be able to provide
this ability too. Supposed that one vhost-user backend need to support
multiple VMs. We can't enable reconnect ability until all VMs' guest
driver support the new feature. It's limited.
That's the way virtio evolves.

    But if QEMU have the
ability to store inflight buffer, the backend could at least have a
chance to support this case.
The problem is, you need a careful designed protocol described somewhere
That's what we should discuss in detail in this series.

Well, I ask some questions for this patch, but it looks like they were
still not answered. No?

Oh, sorry, I missed those questions. Let me try to answer them here.

Q1: If backend get killed in vu_queue_inflight_get() without setting
vq->inflight->desc[desc_idx] to 1, is there any problem?

The entry which stores the head of this inflight descriptor is not
lost in avail ring. So we can still get this inflight descriptor from
avail ring although we didn't set vq->inflight->desc[desc_idx] to 1.

Ok I get this.

void vu_queue_push()
     vq->inflight->elem_idx = elem->idx;
     vq->inflight->desc[elem->idx] = 0;
                                                     <-------- Does
this safe to be killed here?
     vq->inflight->used_idx = vq->vring.used->idx;

Because there are no concurrency between vu_queue_push() and
vu_queue_pop(), I don't see any problem here.

Basically we just need to make sure this two operations
(vq->vring.used->idx++ and vq->inflight->desc[elem->idx] = 0) are
atomic. I think there are some approach to achieve that. I'm not sure
my approach here is good enough, but it should work.

Rethink about this, some findings:

- What you suggest requires strict ordering in some part of vu_queue_push(). E.g it looks to me you need a compiler barrier to make sure used_idx is set before clearing desc[elem->idx]? This a an side effect of introduce new metadata (inflight->elem_idx and inflight->used_idx) to recover from crashing when logging inflgith->desc[] array.

- Modern backends like dpdk do batching aggressively, which means you probably need an array of elem_idx[]. This tends to be more complex since we probably need to negotiate the size of this array and the overhead is probably noticeable.

I don't audit all other places of the codes, but I suspect it would be hard to be 100% correct. And what happens for packed virtqueue? Basically, we don't want to produce tricky codes that is hard to debug in the future. Think in another direction, in order will be supported by virtio 1.1. With that, the inflight descriptors could be much more easier to be deduced like [used_idx, avail_idx)? So it looks to me supporting this from virtio layer is much more easier.



(is vhost-user.txt a good place for this?). And this work will be
(partial) duplicated for the future support from virtio spec itself.

I think the duplicated code is to maintain the inflight descriptor
list which should be in backend. That's not main work in this series.
And backend could choose to include it or not.

You need to have a documentation to describe the protocol. Otherwise, it
would be very hard for other backend to implement.

Yes, actually now I'm working on adding more detail to vhost-user.txt.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]