[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] semantics of qemu_peek_buffer ?
From: |
Dr. David Alan Gilbert |
Subject: |
Re: [Qemu-devel] semantics of qemu_peek_buffer ? |
Date: |
Tue, 18 Mar 2014 13:08:07 +0000 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
* Juan Quintela (address@hidden) wrote:
> "Dr. David Alan Gilbert" <address@hidden> wrote:
> > Hi Juan,
> > What are the semantics of 'qemu_peek_buffer'?
> >
> > - is it supposed to guarantee (if there are no errors) that
> > it will read 'size' bytes? (i.e. it should block)
>
>
> We are talking qemu, specifically migration. "quarantee" is always a
> too strong word O:-)
It would be nice for people to expect it!
> Once told that, qemu_peek_buffer() should always read the amount of
> stuff it has been asked (or terun one error.
OK.
> > There are currently two users of it:
> > * qemu_read_buffer which spins filling it's buffer up
> > with repeated calls to qemu_peek_buffer
>
> <if people are using grep, function in qemu_get_buffer()>
>
> if we ask for "size" bytes, and there are less that that size, we are in
> trouble.
>
>
> > * vmstate_subsection_load that returns if the size read
> > doesn't match what it was expecting
>
> This is look-ahead of "size" chars, and has all the problems that you
> can think of read-ahead of more than one char. How is it used in
> sub-sections:
>
> - we read that the 1st char is a subsection number (but it could not be
> a subsection).
> - we read the size
> - we read the subsection name of that size
>
> - we search for a subsection with that name, if it exist, we then read
> them properly. if it don't exist, we abort the whole subsection idea,
> nad continue as if the subsection number hadn't exist in the 1st place.
>
> > I can't see how both of them can be right.
> >
> > The problem I'm seeing is that in my world I've got a
> > qemu_peek_buffer of 8 bytes, and with a repeated virt-test
> > local tcp migration it's failing about 1 in 8 times;
> > here is some debug:
> >
> > 19:51:15 INFO | [qemu output] qemu_peek_buffer refill case (pre); size=8
> > offset=0 index=32764 pending=4 buf_index=32764 buf_size=32768 pos=23302795
> > 19:51:15 INFO | [qemu output] qemu_fill_buffer got 1
> > 19:51:15 INFO | [qemu output] qemu_peek_buffer refill case (post); size=8
> > offset=0 index=0 pending=5 buf_index=0 buf_size=5 pos=23302796
> > 19:51:15 INFO | [qemu output] qemu_peek_buffer (size>pending); size=8
> > offset=0 index=0 pending=5 buf_index=0 buf_size=5 pos=23302796
> >
> > i.e. I asked for 8 bytes, there were 4 in the buffer, it called fill
> > buffer, which got one
> > more byte, and thus it returned me 5.
>
> Uh, oh. That shouldn't happen. could you try to change the
> "if (pending < size)"
> to
> "while (pending < size)"
>
> remember that you need to handle errors on that loop?
Yes, I think that will work - I think really the loop that is in
qemu_get_buffer()
needs to move up into qemu_peek_buffer() (with qemu_fill_buffer returning the
number of bytes it has read).
> BTW, what are you doing when you get that error? It is vmstate_load or
> qemu_get_buffer()? and for what fiel?
In my migration visitor world, in the compatibility code for the current format,
I'm using qemu_peek_buffer to peek the 8 byte header on ram pages to figure
out which type of page encoding we're using, so this means I'm calling
qemu_peek_buffer
once per page (with 8 bytes peek'd each time), so it's calling it a lot.
> > I think what vmstate_subsection_load wants (and what I want) is something
> > like qemu_read_buffer but which doesn't advance it's pointer, i.e. to read
> > a header, decide it's not for me and let the next function along use it.
>
> Problem was to fix the case when there is less that <size> bytes into
> the buffer. i.e. we start to search for a string that is bigger than
> the remaining (already read) buffer.
>
> > vmstate_subsection_load doesn't look like it flags an error if it
> > doesn't read enough; I guess the effect will be just to fail to
> > load a migration in some interesting way.
>
> Error checking is missing there. If we don't get enough space, we
> really want to fail the subsection read.
Yes, I think it's possible this code could hit the problem I've seen,
but it's much less likely because there are a lot less subsections
than there are pages.
I think the kernel can return only a few bytes, possibly because when
qemu_fill_buffer fills the buffer it can read a weird number of bytes
depending how much buffer space is left; so the kernel could have a few
left for the next read.
Dave
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK