qemu-devel

Re: [Qemu-devel] [PATCH V2 0/3] Introduce COLO-compare


From: Li Zhijian
Subject: Re: [Qemu-devel] [PATCH V2 0/3] Introduce COLO-compare
Date: Thu, 31 Mar 2016 11:01:01 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0



On 03/30/2016 08:05 PM, Dr. David Alan Gilbert wrote:
* Zhang Chen (address@hidden) wrote:
COLO-compare is part of the COLO project. It is used
to compare network packets to help COLO decide
whether to take a checkpoint.

Hi Zhang Chen,
   I've put comments on the individual patches, but some more general things:

   1) Please add a comment giving an example of the command line for the primary
      and secondary use of this module - it helps make it easier to understand
      the patches.

   2) There's no tracing in here - please add some; I found when I tried to get
      COLO working I needed to use lots of tracing and debugging to understand
      the packet flow.

   3) Add comments; e.g. for each function say which thread is using it and
      where the packets are coming from; e.g.
         called from the main thread on the primary for packets arriving over
         the socket from the secondary.

      There are just so many packets going in so many directions that it would
      make it easier to follow.

   4) A more fundamental problem is what happens if the secondary never sends
      anything on the socket: you end up running until the end of the long COLO
      checkpoint without triggering a miscompare. In my world I added a timeout
      (400ms) for an unmatched packet from the primary, where if no matching
      packet was received a checkpoint would be triggered.

   5) I see the packet comparison is still the simple memcmp that you had in
      December; are you planning on doing anything more complicated? You must
      be seeing most packets miscompare?
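
To make point 4 concrete, here is a minimal sketch of the timeout check, assuming each queued primary packet records its arrival time. The names `PendingPacket`, `checkpoint_needed` and `UNMATCHED_TIMEOUT_MS` are hypothetical and do not come from the patches or from the tree mentioned in this thread; only the 400ms figure is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the 400ms value quoted in point 4. */
#define UNMATCHED_TIMEOUT_MS 400

/* Hypothetical record for a primary packet that is still waiting for
 * its counterpart from the secondary. */
typedef struct PendingPacket {
    int64_t arrival_ms;   /* time the primary packet was queued */
    bool matched;         /* set once a secondary packet compared equal */
} PendingPacket;

/* Returns true if any unmatched primary packet has waited at least
 * UNMATCHED_TIMEOUT_MS, i.e. a checkpoint should be requested even
 * though the secondary never sent anything to compare against. */
static bool checkpoint_needed(const PendingPacket *pkts, size_t n,
                              int64_t now_ms)
{
    assert(pkts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!pkts[i].matched &&
            now_ms - pkts[i].arrival_ms >= UNMATCHED_TIMEOUT_MS) {
            return true;
        }
    }
    return false;
}
```

In a real implementation this check would presumably be driven by a periodic timer in the compare thread rather than called by hand.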

You can see my current world at:
https://github.com/orbitfp7/qemu/commits/orbit-wp4-colo-mar16
which has my basic TCP comparison (it's only tracking incoming connections) and
I know it's not complete either.  It mostly works OK, although I've got an
occasional seg (which makes me wonder if I need to add the conn_list_lock I see
you added).  I'm also not doing any TCP reassembly which is probably needed.

Thank you very much for your comments.
I just looked at your tree; you have put in a lot of work (TCP comparison
enhancements, sequence/acknowledgement number rewriting, timeout...).

Actually, this compare module is still at the RFC stage (it only includes the
compare framework), and there is a lot of work still to be done:

1) Integrate into the COLO framework (and let the COLO primary and secondary
   reach the running state)

2) IP fragment defrag

3) comparison based on the sequence number (TCP and UDP), if the packet has
   one, because TCP retransmission is quite common. IIRC, your code compares
   the whole TCP packet (so the sequence number is compared too)

4) packets belonging to the same connection are sorted by sequence number

5) Out-of-order packet handling

6) clean up inactive conn_list entries whose connections may have been closed.
   The simple way is to introduce a timer that records whether a connection
   has received a packet within a timeout; connections idle beyond this
   timeout should be cleaned up.

7) the issue Dave points out in (4) above

8) something I missed...
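
As an illustration of items 3) to 5), here is a minimal sketch of keeping a connection's packets ordered by TCP sequence number, so that out-of-order arrivals end up in place and a repeated sequence number can be spotted. `SeqPacket`, `seq_before` and `seq_list_insert` are hypothetical names, not taken from colo-compare.c. Treating an equal starting sequence number as a retransmission is a simplifying assumption; a real implementation would also need to look at segment lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet node; only the field needed for ordering. */
typedef struct SeqPacket {
    uint32_t seq;               /* TCP sequence number, host byte order */
    struct SeqPacket *next;
} SeqPacket;

/* Wraparound-aware comparison: true if a precedes b modulo 2^32.
 * Relies on the usual two's-complement conversion to int32_t. */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Insert pkt into the list in sequence order.
 * Returns 0 on insertion, -1 if a packet with the same sequence number
 * is already queued (assumed here to be a retransmission). */
static int seq_list_insert(SeqPacket **head, SeqPacket *pkt)
{
    assert(head != NULL && pkt != NULL);
    SeqPacket **pos = head;

    /* Skip every queued packet that precedes the new one. */
    while (*pos != NULL && seq_before((*pos)->seq, pkt->seq)) {
        pos = &(*pos)->next;
    }
    if (*pos != NULL && (*pos)->seq == pkt->seq) {
        return -1;
    }
    pkt->next = *pos;
    *pos = pkt;
    return 0;
}
```

With the queues on both sides ordered this way, the comparison could walk the primary and secondary lists in step instead of comparing packets in arrival order.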

For various reasons, not all of this work can be done immediately, so we hope to
discuss and decide which functions have the highest priority.
Any comments and suggestions are welcome.
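
For item 6) in the list above, a minimal sketch of the timer-driven cleanup might look like the following, assuming each connection stores the time of its last packet. `Conn`, `conn_push` and `conn_list_expire` are hypothetical names and the list layout is an assumption, not the conn_list from the patch; any locking (e.g. conn_list_lock) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical connection entry; only the field needed for expiry. */
typedef struct Conn {
    int64_t last_seen_ms;       /* time the last packet arrived */
    struct Conn *next;
} Conn;

/* Allocate a connection and push it on the front of the list.
 * Returns NULL if the allocation fails. */
static Conn *conn_push(Conn **head, int64_t now_ms)
{
    assert(head != NULL);
    Conn *c = malloc(sizeof(*c));
    if (c == NULL) {
        return NULL;
    }
    c->last_seen_ms = now_ms;
    c->next = *head;
    *head = c;
    return c;
}

/* Unlink and free every connection idle for longer than timeout_ms.
 * Returns the number of connections removed. Intended to be called
 * from a periodic timer. */
static size_t conn_list_expire(Conn **head, int64_t now_ms,
                               int64_t timeout_ms)
{
    assert(head != NULL);
    size_t removed = 0;
    Conn **pos = head;

    while (*pos != NULL) {
        Conn *c = *pos;
        if (now_ms - c->last_seen_ms > timeout_ms) {
            *pos = c->next;     /* unlink before freeing */
            free(c);
            removed++;
        } else {
            pos = &c->next;
        }
    }
    return removed;
}
```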

IMO, a compare framework plus a hack patch for the COLO framework could be simple enough.

Thanks
Li

Dave

v2:
  - add jhash.h

v1:
  - initial patch


Zhang Chen (3):
   colo-compare: introduce colo compare initlization
   colo-compare: track connection and enqueue packet
   colo-compare: introduce packet comparison thread

  include/qemu/jhash.h |  59 ++++
  net/Makefile.objs    |   1 +
  net/colo-compare.c   | 782 +++++++++++++++++++++++++++++++++++++++++++++++++++
  vl.c                 |   3 +-
  4 files changed, 844 insertions(+), 1 deletion(-)
  create mode 100644 include/qemu/jhash.h
  create mode 100644 net/colo-compare.c

--
1.9.1



--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


.


--
Best regards.
Li Zhijian (8555)




