Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion
From: Dan Williams
Subject: Re: [Qemu-devel] KVM "fake DAX" flushing interface - discussion
Date: Sun, 23 Jul 2017 13:10:34 -0700

On Sun, Jul 23, 2017 at 11:10 AM, Rik van Riel <address@hidden> wrote:
> On Sun, 2017-07-23 at 09:01 -0700, Dan Williams wrote:
>> [ adding Ross and Jan ]
>>
>> On Sun, Jul 23, 2017 at 7:04 AM, Rik van Riel <address@hidden>
>> wrote:
>> >
>> > The goal is to increase density of guests, by moving page
>> > cache into the host (where it can be easily reclaimed).
>> >
>> > If we assume the guests will be backed by relatively fast
>> > SSDs, a "whole device flush" from filesystem journaling
>> > code (issued where the filesystem issues a barrier or
>> > disk cache flush today) may be just what we need to make
>> > that work.
>>
>> Ok, apologies, I indeed had some pieces of the proposal confused.
>>
>> However, it still seems like the storage interface is not capable of
>> expressing what is needed, because the operation that is needed is a
>> range flush. In the guest you want the DAX page dirty tracking to
>> communicate range flush information to the host, but there's no
>> readily available block i/o semantic that software running on top of
>> the fake pmem device can use to communicate with the host. Instead
>> you want to intercept the dax_flush() operation and turn it into a
>> queued request on the host.
>>
>> In 4.13 we have turned this dax_flush() operation into an explicit
>> driver call. That seems a better interface to modify than trying to
>> map block-storage flush-cache / force-unit-access commands to this
>> host request.
>>
>> The additional piece you would need to consider is whether to track
>> all writes in addition to mmap writes in the guest as DAX-page-cache
>> dirtying events, or arrange for every dax_copy_from_iter() operation
>> to also queue a sync on the host, but that essentially turns the host
>> page cache into a pseudo write-through mode.
>
> I suspect initially it will be fine to not offer DAX
> semantics to applications using these "fake DAX" devices
> from a virtual machine, because the DAX APIs are designed
> for a much higher performance device than these fake DAX
> setups could ever give.

Right, we don't need DAX, per se, in the guest.

>
> Having userspace call fsync/msync like done normally, and
> having those coarser calls be turned into somewhat efficient
> backend flushes would be perfectly acceptable.
>
> The big question is, what should that kind of interface look
> like?

To me, this looks much like the dirty cache tracking that is done in
the address_space radix tree for the DAX case, but modified to
coordinate queued / page-based flushing when the guest wants to persist
data. The similarity to DAX is that the radix tree would not store
guest-allocated pages, but entries that track dirty guest physical
addresses.
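
To make the shape of that concrete, here is a rough userspace sketch,
not kernel code and not a proposed interface. It assumes a small fixed
region and uses a plain bitmap as a stand-in for the radix tree dirty
tags. Every name in it (dirty_tracker, flush_range, tracker_mark_dirty,
tracker_collect) is hypothetical. Writes mark guest physical pages
dirty, and an fsync/msync-style call coalesces them into page-aligned
ranges that could then be queued to the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define NR_PAGES   1024 /* assumed size of the fake pmem region, in pages */

/* One contiguous run of dirty guest physical memory to hand to the host. */
struct flush_range {
    uint64_t gpa; /* guest physical start address, page aligned */
    uint64_t len; /* length in bytes, a multiple of PAGE_SIZE */
};

/* Stand-in for radix tree dirty tags: one bit per guest page. */
struct dirty_tracker {
    uint8_t bits[NR_PAGES / 8];
};

static void tracker_init(struct dirty_tracker *t)
{
    memset(t->bits, 0, sizeof(t->bits));
}

static int tracker_is_dirty(const struct dirty_tracker *t, uint64_t pfn)
{
    return (t->bits[pfn / 8] >> (pfn % 8)) & 1;
}

/* Record that the guest wrote to [gpa, gpa + len). */
static void tracker_mark_dirty(struct dirty_tracker *t, uint64_t gpa,
                               uint64_t len)
{
    assert(len > 0 && gpa + len <= NR_PAGES * PAGE_SIZE);

    uint64_t first = gpa >> PAGE_SHIFT;
    uint64_t last = (gpa + len - 1) >> PAGE_SHIFT;

    for (uint64_t pfn = first; pfn <= last; pfn++)
        t->bits[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

/*
 * Called when the guest wants to persist data: coalesce adjacent dirty
 * pages into at most `max` ranges, clear the bits that were consumed,
 * and return the number of ranges written to `out`. Pages that did not
 * fit stay dirty for the next call.
 */
static size_t tracker_collect(struct dirty_tracker *t,
                              struct flush_range *out, size_t max)
{
    size_t n = 0;
    uint64_t pfn = 0;

    while (pfn < NR_PAGES && n < max) {
        if (!tracker_is_dirty(t, pfn)) {
            pfn++;
            continue;
        }

        uint64_t start = pfn;
        while (pfn < NR_PAGES && tracker_is_dirty(t, pfn)) {
            t->bits[pfn / 8] &= (uint8_t)~(1u << (pfn % 8));
            pfn++;
        }

        out[n].gpa = start << PAGE_SHIFT;
        out[n].len = (pfn - start) << PAGE_SHIFT;
        n++;
    }
    return n;
}
```

The ranges returned by tracker_collect() are what a driver-level flush
hook would turn into queued host requests. How those requests are
transported to the host is left open here, as it is in this thread.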