Re: [Qemu-devel] [PATCH 1/1] exec: make -mem-path filenames deterministic


From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 1/1] exec: make -mem-path filenames deterministic
Date: Tue, 08 Jan 2013 13:04:34 -0600
User-agent: Notmuch/0.13.2+93~ged93d79 (http://notmuchmail.org) Emacs/23.3.1 (x86_64-pc-linux-gnu)

Peter Feiner <address@hidden> writes:

>> This is not reasonable IMHO.
>>
>> I was okay with sticking a name on a ramblock, but encoding a guest PA
>> offset turns this into a supported ABI which I'm not willing to do.
>>
>> A one line change is one thing, but not a complex new option that
>> introduces an ABI only for a proprietary product that's jumping through
>> hoops to keep from contributing useful logic to QEMU.
>
> Hi Anthony,
>
> Thanks for getting back to me.
>
> Sticking a name on the ramblock file would suit our product just
> fine. Indeed, this is what we had agreed upon at the KVM forum.
> However, I submitted a more complex patch in an attempt to expose a
> more general & easy to use feature; I was trying to make a more useful
> contribution than the simple patch :-)
>
> Perhaps I can assuage your ABI concern and argue the utility of this
> patch vs the one-line version. However, if you aren't satisfied,
> please let me know and I'll resubmit the one-line version.

Yes, please submit the oneliner.

> On ABI: This patch doesn't add a new ABI. QEMU already has this ABI
> due to Xen live migration.
>
> When a Xen domain is booted, a new domain is created with an empty
> physmap. Then QEMU is launched. QEMU creates its ramblocks and, via
> memory callbacks (xen_add_to_physmap), populates Xen's physmap using
> ramblock sizes & offsets.
>
> On incoming migration, the Xen toolstack creates a new domain,
> populates its physmap, and copies RAM from the outgoing migration.
> When QEMU is launched, it populates its Xen memory model (i.e.,
> XenIOState) by reading the domain's existing physmap from xenstore.
> When QEMU creates ramblocks, the callbacks in xen-all.c _ignore_ the
> new ramblocks because their offsets are already in the physmap. If the
> new ramblocks had different sizes & offsets than those from the
> outgoing QEMU process, then QEMU's memory model would be inconsistent
> with Xen's (i.e., the physmap maintained by the hypervisor and the
> XenIOState maintained in userspace). In particular, QEMU would expect
> memory at a particular physmap offset that wouldn't have been
> populated by the Xen toolstack during live migration.

This is an internal detail between Xen and QEMU.  That doesn't mean it's
a general public API.

I'm fairly certain that Xen does not support using arbitrary versions of
QEMU as qemu-dm.

Regards,

Anthony Liguori

>
> On utility: Just adding ramblock names to backing file paths makes
> post-copy migration & cloning possible, but involves some painful VFS
> contortions, which I give a detailed example of below. On the other
> hand, these new -mem-path parameters make post-copy migration &
> cloning simple by leveraging an existing QMP command, existing
> filesystems, and kernel behavior. Put another way, the useful logic
> for memory sharing and post-copy live migration already exists in the
> kernel and a myriad of filesystems.  A fairly small patch (albeit not
> one line) enables that logic in QEMU.
>
> Peter
>
> Detailed example:
>
> Suppose you have a patched QEMU that adds ramblock names to their
> backing files and you want to implement memory sharing via cloning.
> When clones come up, each of their ramblocks' backing files needs to
> contain the same data as the corresponding backing file from the
> parent (obviously you want those new backing files to somehow share
> pages and COW). The basic idea is to save the parent's ramblock files
> and arrange for the clones to open them.
>
> You can see the parent's ramblock files easily enough by looking at
> the unlinked ramblock files (e.g., /proc/pid/fd/10 is a symlink to
> /tmp/qemu_back_mem.pc.ram.WHFZYw (deleted), /proc/pid/fd/11 is a
> symlink to /tmp/qemu_back_mem.vga.vram.WT1yQW (deleted), etc.).
> Unfortunately, since they're all mapped MAP_PRIVATE, these symlinks,
> when opened, will give all zeros. So you can either implement your own
> filesystem that gives you a backdoor to the MAP_PRIVATE pages (fast
> but complicated), or you can use qemu's monitor to dump guest RAM
> (slow but works).
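
To make the MAP_PRIVATE point above concrete, here is a minimal Python
sketch (not QEMU code). It assumes a Unix system; the function name and the
file prefix are made up for illustration. It creates an unlinked temporary
file, maps it MAP_PRIVATE, writes through the mapping, and then reads the
file through its descriptor. The write stays in the process's private
copy-on-write pages, so the file itself still reads back as zeros, which is
the same thing you would see through the /proc/pid/fd symlinks.

```python
import mmap
import os
import tempfile


def private_write_is_invisible(size=4096):
    """Write through a MAP_PRIVATE mapping, then read the backing file.

    Returns (bytes seen through the mapping, bytes seen in the file).
    """
    # Hypothetical prefix, modelled on the paths quoted above.
    fd, path = tempfile.mkstemp(prefix="qemu_back_mem.pc.ram.")
    try:
        os.unlink(path)          # backing file is unlinked but still open
        os.ftruncate(fd, size)   # file now holds `size` zero bytes
        mem = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                        prot=mmap.PROT_READ | mmap.PROT_WRITE)
        mem[:5] = b"guest"       # lands in a private COW page only
        in_mapping = bytes(mem[:5])
        in_file = os.pread(fd, 5, 0)  # read the file, bypassing the mapping
        mem.close()
        return in_mapping, in_file
    finally:
        os.close(fd)
```

Calling private_write_is_invisible() returns b"guest" for the mapping and
five zero bytes for the file.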
>
> When a clone runs and creates a new backing file using mkstemp, you
> need to arrange for that backing file to somehow contain the same data
> as the corresponding file from the parent. There is an obvious
> heuristic for determining this correspondence: parse the ramblock name
> from the child's file and use the matching file from the parent.
> Correctness aside (multiple ramblocks can have the same name,
> e.g., e1000.rom, but this is moot because the _important_ ramblocks,
> i.e., pc.ram and vga.vram, are unique in the emulated system we care
> about), implementing this heuristic is a pain. To see the file being
> created, you need to implement a custom file system. Moreover, to
> share memory with another file that's been opened MAP_PRIVATE, you
> have to implement your own VMA operations. Oy!
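
The name-matching heuristic described above can be sketched in a few lines
of Python. This is only an illustration under stated assumptions: the file
name layout (a "qemu_back_mem." prefix, then the ramblock name, then a dot
and a six-character mkstemp suffix) is inferred from the example paths
quoted above, and ramblock_name() and match_parent() are hypothetical
helpers, not part of QEMU. The sketch covers only the string matching, not
the filesystem or VMA work the paragraph says is the hard part.

```python
import os

PREFIX = "qemu_back_mem."   # assumed from the example paths above
DELETED = " (deleted)"      # appended by /proc/pid/fd for unlinked files


def ramblock_name(path):
    """Return the ramblock name embedded in a backing-file path, or None."""
    if path.endswith(DELETED):
        path = path[:-len(DELETED)]
    base = os.path.basename(path)
    if not base.startswith(PREFIX):
        return None
    # Split off the trailing mkstemp suffix; the name itself may contain dots.
    name, sep, suffix = base[len(PREFIX):].rpartition(".")
    if not sep or not name or len(suffix) != 6:
        return None
    return name


def match_parent(child_path, parent_paths):
    """Return the parent's file for the same ramblock, or None.

    None is also returned when the name is ambiguous, i.e. when several
    parent ramblocks share it (the e1000.rom case mentioned above).
    """
    name = ramblock_name(child_path)
    if name is None:
        return None
    candidates = [p for p in parent_paths if ramblock_name(p) == name]
    return candidates[0] if len(candidates) == 1 else None
```

For example, ramblock_name("/tmp/qemu_back_mem.pc.ram.WHFZYw (deleted)")
gives "pc.ram". Refusing to pick a parent when the name is ambiguous is a
design choice of this sketch; the text only notes that the ambiguity does
not matter for pc.ram and the VGA ramblock.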
