[Qemu-devel] [PATCH 46/46] Start documenting how postcopy works.
From: Dr. David Alan Gilbert (git)
Subject: [Qemu-devel] [PATCH 46/46] Start documenting how postcopy works.
Date: Fri, 4 Jul 2014 18:41:57 +0100
From: "Dr. David Alan Gilbert" <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>
---
docs/migration.txt | 148 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 148 insertions(+)
diff --git a/docs/migration.txt b/docs/migration.txt
index 0492a45..dbd5e5f 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -294,3 +294,151 @@ save/send this state when we are in the middle of a pio operation
(that is what ide_drive_pio_state_needed() checks). If DRQ_STAT is
not enabled, the values on that fields are garbage and don't need to
be sent.
+
+= Return path =
+
+In most migration scenarios there is only a single data path that runs
+from the source VM to the destination, typically along a single fd (although
+possibly with another fd or similar for some fast way of throwing pages across).
+
+However, some uses need two-way communication; in particular the Postcopy
+destination needs to be able to request pages on demand from the source.
+
+For these scenarios there is a 'return path' from the destination to the source;
+qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
+path.
+
+ Source side
+ Forward path - written by migration thread
+ Return path - opened by main thread, read by fd_handler on main thread
+
+ Destination side
+ Forward path - read by main thread
+ Return path - opened by main thread, written by main thread AND postcopy
+ thread (protected by rp_mutex)
+
+Opening the return path generally sets the fd to be non-blocking so that a
+failed destination can't block the source; since the non-blocking setting seems
+to apply to both directions, it also alters the semantics of the forward path.
+
+= Postcopy =
+'Postcopy' migration is a way to deal with migrations that refuse to converge;
+its plus side is that there is an upper bound on the amount of migration traffic
+and time it takes; the down side is that during the postcopy phase, a failure of
+*either* side or the network connection causes the guest to be lost.
+
+In postcopy the destination CPUs are started before all the memory has been
+transferred, and accesses to pages that are yet to be transferred cause
+a fault that's translated by QEMU into a request to the source QEMU.
+
+Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
+doesn't finish in a given time the switch is automatically made to postcopy.
+
+=== Enabling postcopy ===
+
+To enable pure postcopy:
+
+migrate_set_capability x-postcopy-ram on
+
+To add a period of precopy:
+
+migrate_set_parameter x-postcopy-start-time 500
+
+(time in ms)
+
+=== Postcopy states ===
+Postcopy moves through a series of states (see postcopy_ram_state)
+from ADVISE->LISTEN->RUNNING->END
+
+ Advise: Set at the start of migration if postcopy is enabled, even
+ if it hasn't passed the start-time threshold; here the destination
+ checks that its OS has the support needed for postcopy, and performs
+ setup to ensure the RAM mappings are suitable for later postcopy.
+ (Triggered by reception of POSTCOPY_RAM_ADVISE command)
+
+Normal precopy now carries on as normal, until the point that the source
+hits the start-time threshold and transitions to postcopy. The source
+stops its CPUs and transmits a 'discard bitmap' indicating pages that
+have been previously sent but are now dirty again and hence are out of
+date on the destination.
+
+The migration stream now contains a 'package' containing its own chunk
+of migration stream, followed by a return to a normal stream containing
+page data. The package (sent as CMD_PACKAGED) contains the commands to
+cycle the states on the destination, followed by all of the device
+state excluding RAM. This lets the destination request pages from the
+source in parallel with loading device state; this is required since
+some devices (virtio) access guest memory during device initialisation.
+
+ Listen: The first command in the package, POSTCOPY_RAM_LISTEN, switches
+ the destination state to Listen, and starts a new thread
+ (the 'listen thread') which takes over the job of receiving
+ pages off the migration stream, while the main thread carries
+ on processing the package. With this thread able to process page
+ reception, the destination now 'sensitises' the RAM to detect
+ any access to missing pages (on Linux using the 'userfault'
+ system).
+
+The package now contains all the remaining state data and the command
+to transition to the next state.
+
+ Running: POSTCOPY_RAM_RUN causes the destination to synchronise all
+ state and start the CPUs and IO devices running. The main
+ thread now finishes processing the migration package and
+ now carries on as it would for normal precopy migration
+ (although it can't do the cleanup it would do as it
+ finishes a normal migration).
+
+Page data is sent from the source to the destination as part
+of a linear scan (like normal migration), and is received by the 'listen thread'.
+When the destination tries to use a page it hasn't got, it requests
+it from the source (down the return path) and the source sends this
+page in the same stream. When the source has transmitted all pages
+it sends a POSTCOPY_RAM_END command to transition to the final state.
+
+ End: The listen thread can now quit and perform the cleanup of migration
+ state; the migration is now complete.
+
+=== Source side page maps ===
+The source side keeps two bitmaps during postcopy: the 'migration bitmap'
+and the 'sent map'. The 'migration bitmap' is basically the same as in
+the precopy case, and holds a bit to indicate that a page is 'dirty',
+i.e. needs sending. During the precopy phase this is updated as the CPU
+dirties pages; however, during postcopy the CPUs are stopped and nothing
+should dirty anything any more.
+
+The 'sent map' is used for the transition to postcopy. It is a bitmap that
+has a bit set whenever a page is sent to the destination; however, during
+the transition to postcopy mode it is masked against the migration bitmap
+(sentmap &= migrationbitmap) to generate a bitmap recording pages that
+have previously been sent but are now dirty again. This masked
+sentmap is sent to the destination, which discards those now dirty pages
+before starting the CPUs.
+
+Note that once in postcopy mode, the sent map is still updated; however, its
+contents are not consistent as a local view of what's been sent, since it
+only holds the masked result.
+
+=== Destination side page maps ===
+(Needs to be changed so we can update both easily - at the moment updates are
+ done with a lock)
+The destination keeps a 'requested map' and a 'received map'.
+Both maps are initially 0; as pages are received the bits are set in the
+'received map'.
+Incoming requests from the kernel cause the bit to be set in the 'requested map'.
+When a page is received that is marked as 'requested' the kernel is notified.
+If the kernel requests a page that has already been 'received' the kernel is
+notified without re-requesting.
+
+This leads to three valid page states:
+ missing (!rc,!rq) - page not yet received or requested
+ received (rc,!rq) - page received
+ requested (!rc,rq) - page requested but not yet received
+
+state transitions:
+ received -> missing (only during setup/discard)
+
+ missing -> received (normal incoming page)
+ requested -> received (incoming page previously requested)
+ missing -> requested (userfault request)
+
--
1.9.3