Re: [Qemu-devel] Re: [RFC] Disk integrity in QEMU


From: Ryan Harper
Subject: Re: [Qemu-devel] Re: [RFC] Disk integrity in QEMU
Date: Mon, 13 Oct 2008 14:43:28 -0500
User-agent: Mutt/1.5.6+20040907i

* Laurent Vivier <address@hidden> [2008-10-13 13:52]:
> 
> On 13 Oct 2008, at 19:06, Ryan Harper wrote:
> 
> >* Anthony Liguori <address@hidden> [2008-10-09 12:00]:
> >>Read performance should be unaffected by using O_DSYNC.  O_DIRECT will
> >>significantly reduce read performance.  I think we should use O_DSYNC by
> >>default and I have sent out a patch that contains that.  We will follow
> >>up with benchmarks to demonstrate this.
> >
> 
> Hi Ryan,
> 
> as "cache=on" implies a factor (memory) shared by the whole system,  
> you must take into account the size of the host memory and run some  
> applications (several guests ?) to pollute the host cache, for  
> instance you can run 4 guest and run bench in each of them  
> concurrently, and you could reasonably limits the size of the host  
> memory to 5 x the size of the guest memory.
> (for instance 4 guests with 128 MB on a host with 768 MB).

I'm not following you here; the only assumption I see is that we have 1G
of host memory free for caching the write.


> 
> as O_DSYNC implies journal commits, you should run a benchmark on the
> ext3 host file system concurrently with the benchmark in a guest to see
> the impact of the commits on each benchmark.

I understand the goal here, but what sort of host ext3 journaling load
is appropriate?  Additionally, when we're exporting block devices, I
don't believe the ext3 journal is an issue.

> 
> >
> >baremetal baseline (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >type, block size, iface    | MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >write, 16k, lvm, direct=1  | 127.7 |  12   |   11.66      |    9.48    |
> >write, 64k, lvm, direct=1  | 178.4 |   5   |   13.65      |   27.15    |
> >write, 1M,  lvm, direct=1  | 186.0 |   3   |  163.75      |  416.91    |
> >---------------------------+-------+-------+--------------+------------+
> >read , 16k, lvm, direct=1  | 170.4 |  15   |   10.86      |    7.10    |
> >read , 64k, lvm, direct=1  | 199.2 |   5   |   12.52      |   24.31    |
> >read , 1M,  lvm, direct=1  | 202.0 |   3   |  133.74      |  382.67    |
> >---------------------------+-------+-------+--------------+------------+
> >
> 
> Could you recall which benchmark you used?

yeah:

fio --name=guestrun --filename=/dev/vda --rw=write --bs=${SIZE}
--ioengine=libaio --direct=1 --norandommap --numjobs=1 --group_reporting
--thread --size=1g --write_lat_log --write_bw_log --iodepth=74

> 
> >kvm write (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        | 135.0 |  94   |    9.1       |    8.71    |
> >16k,virtio,on ,none        | 184.0 | 100   |   63.69      |   63.48    |
> >16k,virtio,on ,O_DSYNC     | 150.0 |  35   |    6.63      |    8.31    |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        | 169.0 |  51   |   17.10      |   28.00    |
> >64k,virtio,on ,none        | 189.0 |  60   |   69.42      |   24.92    |
> >64k,virtio,on ,O_DSYNC     | 171.0 |  48   |   18.83      |   27.72    |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        | 142.0 |  30   | 7176.00      |  523.00    |
> >1M ,virtio,on ,none        | 190.0 |  45   | 5332.63      |  392.35    |
> >1M ,virtio,on ,O_DSYNC     | 164.0 |  39   | 6444.48      |  471.20    |
> >---------------------------+-------+-------+--------------+------------+
> 
> Given the semantics, I don't understand how O_DSYNC can be better than
> cache=off in this case...

I don't have a good answer either, but O_DIRECT and O_DSYNC are
different paths through the kernel.  This deserves a better reply, but
I don't have one off the top of my head.
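
To illustrate what I mean by different paths: cache=off opens the image
with O_DIRECT, while cache=on plus the patch opens it with O_DSYNC.  Below
is a minimal host-side sketch of the two open modes (hypothetical path and
sizes, not QEMU code): O_DIRECT bypasses the host page cache and requires
aligned buffers, while O_DSYNC still goes through the page cache but
write() doesn't return until the data has reached the device.

#define _GNU_SOURCE             /* needed for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* cache=off: O_DIRECT bypasses the host page cache; the filesystem
     * must support it (it fails with EINVAL on tmpfs, for example). */
    int fd_direct = open("disk.img", O_RDWR | O_CREAT | O_DIRECT, 0600);

    /* cache=on + the O_DSYNC patch: writes still pass through the host
     * page cache, but write() only returns once the data is on the
     * device. */
    int fd_dsync = open("disk.img", O_RDWR | O_CREAT | O_DSYNC, 0600);

    if (fd_direct < 0 || fd_dsync < 0) {
        perror("open");
        return 1;
    }

    /* O_DIRECT needs block-aligned buffers, sizes, and file offsets. */
    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0)
        return 1;
    memset(buf, 0, 4096);

    /* Same write() call, two very different paths through the kernel. */
    if (write(fd_direct, buf, 4096) < 0)
        perror("write O_DIRECT");
    if (write(fd_dsync, buf, 4096) < 0)
        perror("write O_DSYNC");

    free(buf);
    close(fd_direct);
    close(fd_dsync);
    return 0;
}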

> 
> >
> >kvm read (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        | 175.0 |  40   |   22.42      |    6.71    |
> >16k,virtio,on ,none        | 211.0 | 147   |   59.49      |    5.54    |
> >16k,virtio,on ,O_DSYNC     | 212.0 | 145   |   60.45      |    5.47    |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        | 190.0 |  64   |   16.31      |   24.92    |
> >64k,virtio,on ,none        | 546.0 | 161   |  111.06      |    8.54    |
> >64k,virtio,on ,O_DSYNC     | 520.0 | 151   |  116.66      |    8.97    |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        | 182.0 |  32   | 5573.44      |  407.21    |
> >1M ,virtio,on ,none        | 750.0 | 127   | 1344.65      |   96.42    |
> >1M ,virtio,on ,O_DSYNC     | 768.0 | 123   | 1289.05      |   94.25    |
> >---------------------------+-------+-------+--------------+------------+
> 
> OK, but in this case the size of the cache for "cache=off" is the size
> of the guest cache, whereas in the other cases it is the size of the
> guest cache plus the size of the host cache; this is not fair...

It isn't supposed to be fair; cache=off is O_DIRECT, so we're reading from
the device.  We *want* to be able to lean on the host cache to read the
data: pay the cost once and let other guests benefit if possible.

> 
> >
> >--------------------------------------------------------------------------
> >exporting file in ext3 filesystem as block device (1g)
> >--------------------------------------------------------------------------
> >
> >kvm write (1g dataset):
> >---------------------------+-------+-------+--------------+------------+
> >Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
> >block size,iface,cache,sync| MB/s  | usage | latency usec | latency ms |
> >---------------------------+-------+-------+--------------+------------+
> >16k,virtio,off,none        |  12.1 |  15   |    9.1       |    8.71    |
> >16k,virtio,on ,none        | 192.0 |  52   |   62.52      |    6.17    |
> >16k,virtio,on ,O_DSYNC     | 142.0 |  59   |   18.81      |    8.29    |
> >---------------------------+-------+-------+--------------+------------+
> >64k,virtio,off,none        |  15.5 |   8   |   21.10      |  311.00    |
> >64k,virtio,on ,none        | 454.0 | 130   |  113.25      |   10.65    |
> >64k,virtio,on ,O_DSYNC     | 154.0 |  48   |   20.25      |   30.75    |
> >---------------------------+-------+-------+--------------+------------+
> >1M ,virtio,off,none        |  24.7 |   5   | 41736.22     | 3020.08    |
> >1M ,virtio,on ,none        | 485.0 | 100   |  2052.09     |  149.81    |
> >1M ,virtio,on ,O_DSYNC     | 161.0 |  42   |  6268.84     |  453.84    |
> >---------------------------+-------+-------+--------------+------------+
> 
> What file type do you use (qcow2, raw)?

Raw.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
address@hidden



