Re: [Qemu-devel] [PATCH 4/6] dirty-bitmaps: clean-up bitmaps loading and migration logic


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH 4/6] dirty-bitmaps: clean-up bitmaps loading and migration logic
Date: Fri, 3 Aug 2018 10:10:34 +0100
User-agent: Mutt/1.10.0 (2018-05-17)

* Denis V. Lunev (address@hidden) wrote:
> On 08/03/2018 11:33 AM, Dr. David Alan Gilbert wrote:
> > * Denis V. Lunev (address@hidden) wrote:
> >> On 08/02/2018 12:50 PM, Dr. David Alan Gilbert wrote:
> >>> * Denis V. Lunev (address@hidden) wrote:
> >>>
> >>>
> >>>>> I don't quite understand the last two paragraphs.
> >>>> we are thinking right now about how to eliminate the delay on regular IO
> >>>> during migration. There are some thoughts and internal work in
> >>>> progress. That is why I am worried.
> >>> What downtime are you typically seeing and what are you aiming for?
> >>>
> >>> It would be good if you could explain what you're planning to
> >>> fix there so we can get a feel for it nearer the start of the work
> >>> rather than at the end of the review!
> >>>
> >>> Dave
> >> The ultimate goal is to reliably reach 100 ms with ongoing IO, and
> >> you are perfectly correct about reviewing :)
> > That would be neat.
> >
> >> Though the problem is that right now we are just trying to
> >> invent something suitable :(
> > OK, some brain-storm level ideas:
> >
> >   a) Throttle the write bandwidth at later stages of migration
> >      (I think that's been suggested before)
> yes
> 
> >   b) Switch to some active-sync like behaviour where the writes
> >      are sent over the network as they happen to the destination
> >      (mreitz has some prototype code for that type of behaviour
> >      for postcopy)
> That will not work. Even with the sync mirror (which we submitted
> 2 years ago, but it was not accepted), the usual downtime will be
> around 1 sec due to requests in flight.

I'm confused why it would be that high; are you saying that would be
longer than the downtime with the current writes near the end of
migration on the source?

> >   c) Put the writes into a buffer that gets migrated over the
> >     migration stream to be committed on the destination side.
> Yes, this is an option, but the buffer can become too big.

Combined with some throttling you should be able to bound the size;
especially since it should be done only for the very last part of the
migration.

> For shared-disk migration the options are the following:
> - writes without metadata updates could simply be allowed to finish; there
>   is no need to wait, but completions should be reported to the destination
> - writes with metadata updates are not allowed on the source; they should
>   be transferred to the destination
> 
> For non-shared disk migration we do not need to wait for local IO; we just
> need to restart it on the target. Alternatively these areas could be marked
> as blocked for IO and re-synced again once the writes are completed.
> 
> These are raw ideas, which should be improved and tweaked.

Nod; it needs some thinking about.
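For what it's worth, an equally rough sketch of the shared-disk split
described above (illustrative only, not QEMU code; all helpers here are
hypothetical): during switch-over, a write that needs no metadata update
finishes locally and only its completion is reported to the destination,
while a write that would update metadata is forwarded to the destination
instead of being applied on the source.

    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        uint64_t offset;
        uint64_t len;
        bool     allocates_cluster;  /* e.g. would trigger qcow2 allocation */
    } GuestWrite;

    /* Hypothetical helpers standing in for the real machinery. */
    static bool needs_metadata_update(const GuestWrite *w)
    {
        return w->allocates_cluster;
    }

    static void complete_locally_and_report(const GuestWrite *w)
    {
        printf("finish on source, report completion: off=%" PRIu64 "\n",
               w->offset);
    }

    static void forward_to_destination(const GuestWrite *w)
    {
        printf("send write to destination: off=%" PRIu64 "\n", w->offset);
    }

    /* Dispatch during the switch-over window of a shared-disk migration. */
    static void handle_write_during_switchover(const GuestWrite *w)
    {
        if (needs_metadata_update(w)) {
            forward_to_destination(w);
        } else {
            complete_locally_and_report(w);
        }
    }

    int main(void)
    {
        GuestWrite plain = { .offset = 0, .len = 4096,
                             .allocates_cluster = false };
        GuestWrite alloc = { .offset = 1 << 20, .len = 4096,
                             .allocates_cluster = true };

        handle_write_during_switchover(&plain);
        handle_write_during_switchover(&alloc);
        return 0;
    }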

Dave

> Den
> 
> > As I say, brainstorm level ideas only!
> >
> > Dave
> >
> >
> >> Den
> >>
> >>>>> However, coming back to my question: it was really saying that
> >>>>> normal guest IO during the end of the migration will cause
> >>>>> a delay. I'm expecting that to be fairly unrelated to the size
> >>>>> of the disk and more to do with workload; so I guess in your case
> >>>>> the worry is very large disks giving very large
> >>>>> bitmaps.
> >>>> exactly!
> >>>>
> >>>> Den
> >>> --
> >>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> > --
> > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


