[Qemu-devel] [PATCH 0/1] Exogenous memory management via -mem-path
From: Peter Feiner
Subject: [Qemu-devel] [PATCH 0/1] Exogenous memory management via -mem-path
Date: Tue, 20 Nov 2012 07:10:23 -0500
This patch proposes a framework that lets QEMU implement powerful memory
management techniques at very low cost. The insight is to leverage existing
sub-systems, composing functionality in a traditional UNIX style. The key to
enabling this approach is to expand the -mem-path option to provide more control
over how backing memory files are named, opened, and mapped.
Two seemingly complex memory management techniques can now be achieved quite
simply:
1. Post-copy migration.
A. Boot a VM with RAM backed by MAP_SHARED files.
B. Pause the VM and save device state.
C. Share RAM files using a network file system.
D. Copy saved device state to target host.
E. On the target host, create a VM with the saved device state. Back RAM
with files on the network mount.
As the VM runs on the target, pages are loaded on demand by the
network file system.
2. VM cloning with shared CoW memory
A. Boot a parent VM, run for a while, then pause.
B. Save the parent's device state.
C. Create child VMs from the saved state. Back RAM with MAP_PRIVATE
overlays of the parent's RAM.
Pages are shared by the children until they are CoW'd.
Note that kernel samepage merging (KSM, or TPS - transparent page sharing -
in VMware parlance) is complementary to cloning. KSM discovers pages
that happen to be identical and shares them. Although KSM can find all
redundant pages, doing so comes at a cost. Cloning, on the other hand,
exploits sharing opportunities that it knows about for free. However, as
children live longer and CoW more pages, the memory savings due to
cloning diminish.
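The sharing behaviour that cloning relies on can be tried outside QEMU. The following is a minimal Python sketch, not part of the patch: it assumes a Unix host, uses a two-page temporary file as a stand-in for a RAMBlock backing file, and the names demo_cow_clone and parent.ram are purely illustrative.

```python
import mmap
import os
import tempfile

SIZE = 2 * mmap.PAGESIZE  # tiny stand-in for guest RAM


def demo_cow_clone(num_children=3):
    """Return (file_head, [(before, after), ...]) for MAP_PRIVATE children."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "parent.ram")  # hypothetical file name

        # Parent: MAP_SHARED, so its writes land in the backing file.
        with open(path, "w+b") as f:
            f.truncate(SIZE)
            parent = mmap.mmap(f.fileno(), SIZE, flags=mmap.MAP_SHARED)
            parent[:6] = b"parent"
            parent.flush()
            parent.close()

        # Children: MAP_PRIVATE overlays; the first write to a page copies it.
        results = []
        for i in range(num_children):
            with open(path, "r+b") as f:
                child = mmap.mmap(f.fileno(), SIZE, flags=mmap.MAP_PRIVATE)
            before = bytes(child[:6])           # still the parent's data
            child[:6] = b"child%d" % (i % 10)   # private copy, file untouched
            results.append((before, bytes(child[:6])))
            child.close()

        with open(path, "rb") as f:
            file_head = f.read(6)
    return file_head, results
```

Each child starts out seeing the parent's bytes, and its own writes stay private: the backing file still holds the parent's data afterwards.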
By simply granting more control over the manipulation of backing RAM files, QEMU
can enable these features *with no additional costs or changes to its internal
memory management*.
Thanks to MMU notifiers, KVM is already compatible with all file systems and
with both MAP_SHARED and MAP_PRIVATE mappings. Furthermore, thanks to -mem-path, QEMU is already
wired to back RAMBlocks with files. However, since these backing files are
currently randomly named (i.e., using mkstemp), it's impossible to know the
correspondence between different VMs' backing files.
This patch's essential feature is making the names of the RAMBlocks' backing
files deterministic and unique. Then, given predictable backing file names, this
patch makes QEMU configurable to use existing RAMBlock backing files. These new
features are exposed to the user via arguments to the -mem-path option:
-mem-path \
path[,mode=temp|open|create][,mmap=private|shared][,nofallback][,anyfs]
A detailed explanation of each argument follows.
mode:
Naming of the RAMBlock filenames is controlled by the new mode argument for
-mem-path. When mode is "create", files with deterministic names are created
and opened. When mode is "temp", which is the default, the names are random
and the files are unlinked, as per the current behavior of -mem-path. The
"open" mode is the same as "create" but it fails if the files don't already
exist; the "open" mode is included as a sanity checking measure when
creating children.
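As a rough illustration of how the three modes differ, here is a Python sketch. It is an assumption about the semantics described above, not the patch's C code, and open_backing_file is a hypothetical helper name.

```python
import os
import tempfile


def open_backing_file(directory, name, mode="temp"):
    """Return an open fd for a backing file, following the three modes."""
    path = os.path.join(directory, name)
    if mode == "create":
        # Deterministic name; create the file if it is missing.
        return os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    if mode == "open":
        # Deterministic name; raises FileNotFoundError if the file is absent.
        return os.open(path, os.O_RDWR)
    if mode == "temp":
        # Random suffix, unlinked at once so the file dies with the process.
        fd, tmp_path = tempfile.mkstemp(prefix=name + ".", dir=directory)
        os.unlink(tmp_path)
        return fd
    raise ValueError(f"unknown mode: {mode}")
```

The failure in "open" mode is the sanity check: a child that cannot find the parent's file stops instead of silently starting with empty memory.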
To make backing file names deterministic and unique, a RAMBlock's offset
is included in its backing file's name. The offset is a function of the
order in which RAMBlocks are allocated and their size, which, in turn, is a
function of the order in which devices are initialized. Hence, given the
same QEMU command line, two QEMU processes will have RAMBlocks with the same
offsets. Note that Xen's live migration also relies on RAMBlocks having the
same offsets between QEMU processes (see xen_read_physmap).
In addition to the offset, the size of the RAMBlock and the (non-unique)
name of the RAMBlock's MemoryRegion are included in the file name:
qemu_back_mem.OFFSET+SIZE.NAME[.RANDOM]
The size and name are included for extra sanity checking when the -mem-path
mode is "open". The optional random suffix is added when the mode is "temp".
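The naming scheme can be sketched in a few lines of Python. This is an illustration only: backing_file_names is a hypothetical helper, and it assumes that blocks are laid out back to back in allocation order, which matches the directory listing shown later in this message but may not hold for every allocation pattern.

```python
def backing_file_names(blocks, random_suffix=None):
    """Build qemu_back_mem.OFFSET+SIZE.NAME[.RANDOM] for (name, size) pairs.

    Offsets and sizes are printed in hex without a 0x prefix. The cumulative
    offset is an assumption made for this sketch.
    """
    names = []
    offset = 0
    for name, size in blocks:
        fname = f"qemu_back_mem.{offset:x}+{size:x}.{name}"
        if random_suffix is not None:  # only used when mode is "temp"
            fname += f".{random_suffix}"
        names.append(fname)
        offset += size
    return names


# Block order and sizes taken from the 8 GB example in this message.
example_blocks = [
    ("pc.ram", 8 << 30),
    ("pc.bios", 128 << 10),
    ("pc.rom", 128 << 10),
    ("vga.vram", 8 << 20),
    ("cirrus_vga.rom", 64 << 10),
    ("e1000.rom", 128 << 10),
]
```

Two processes that allocate the same blocks in the same order get the same names, which is what lets a child locate its parent's files.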
mmap:
Whether the RAMBlock files are mmap'd with MAP_PRIVATE or MAP_SHARED is
governed by the mmap argument. As explained above, MAP_PRIVATE is necessary
for sharing memory between children. The motivation for MAP_SHARED is to
make saving the parent's memory state a zero-cost operation during step 2B
of the clone algorithm by using the parent's RAMBlock files in place for the
children. This is best illustrated by an example:
$ mkdir /tmp/mem
$ qemu -mem-path /tmp/mem,mode=create,mmap=shared -m 8g -qmp stdio disk.qcow2
# Warm up the parent (e.g., fill buffer cache, load apps) ...
# In the parent's qmp shell:
{"execute": "qmp_capabilities"}
{"execute": "stop"}
{"execute": "xen-save-devices-state", "arguments": {"filename": "devices"}}
{"execute": "quit"}
# Now we have a tiny file with the parent's device state
# and a bunch of memory backing files:
$ ls -lh devices
... 80K ... devices
$ ls -lh /tmp/mem
... 8.0G ... qemu_back_mem.0+200000000.pc.ram
... 128K ... qemu_back_mem.200000000+20000.pc.bios
... 128K ... qemu_back_mem.200020000+20000.pc.rom
... 8.0M ... qemu_back_mem.200040000+800000.vga.vram
... 64K ... qemu_back_mem.200840000+10000.cirrus_vga.rom
... 128K ... qemu_back_mem.200850000+20000.e1000.rom
# Launch the children (mode=open, mmap=private):
$ for i in $(seq 100); do \
qemu-img create -o backing_file=disk.qcow2 -f qcow2 child$i.qcow2; \
qemu -mem-path /tmp/mem,mode=open,mmap=private \
-incoming "exec:cat devices" \
-m 8g \
child$i.qcow2 & \
done
# Now we have 100 8GB VMs sharing memory!
nofallback:
Abort if a RAMBlock can't be backed by a file. Currently, QEMU falls back to
anonymous VMAs. This fallback only makes sense if you're using -mem-path as
a transparent optimization (as is the case with hugetlbfs).
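The difference nofallback makes can be pictured with the Python sketch below. It is an assumption about the control flow, not QEMU's actual code; map_ram_block is a hypothetical name, and a Unix host is assumed.

```python
import mmap
import os


def map_ram_block(path, size, nofallback=False):
    """Return (mapping, file_backed) for a block of `size` bytes."""
    try:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.ftruncate(fd, size)
            return mmap.mmap(fd, size, flags=mmap.MAP_SHARED), True
        finally:
            os.close(fd)  # the mapping keeps its own duplicate of the fd
    except OSError:
        if nofallback:
            raise  # abort rather than silently lose the backing file
        # Fallback: anonymous memory that no other process can find.
        return mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE), False
```

For cloning and post-copy the silent fallback is harmful, because a guest running on anonymous memory shares nothing with its parent.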
anyfs:
-mem-path currently prints a warning if the specified path isn't on a
hugetlbfs mount. This option squelches that warning.
In conclusion, by constraining -mem-path's implementation with deterministic
file names and control over mmap flags, QEMU enables efficient VM cloning.
Moreover, novel memory-management techniques can be implemented without adding
any complexity to QEMU. For instance, the cloning example above becomes
post-copy live migration if /tmp/mem is shared over NFS and a child is launched
on another host.
Peter Feiner (1):
exec: make -mem-path filenames deterministic
cpu-all.h | 5 ++++
exec.c | 60 ++++++++++++++++++++++++++++++++++--------------------
qemu-config.c | 26 +++++++++++++++++++++++
qemu-options.hx | 24 +++++++++++++++++++--
vl.c | 43 +++++++++++++++++++++++++++++++++++++-
5 files changed, 131 insertions(+), 27 deletions(-)
--
1.7.5.4