From: Laurent Vivier
Subject: Re: [Qemu-devel] Re: [RFC] Disk integrity in QEMU
Date: Mon, 13 Oct 2008 22:21:46 +0200
On Oct 13, 2008, at 21:43, Ryan Harper wrote:
> * Laurent Vivier <address@hidden> [2008-10-13 13:52]:
>> On Oct 13, 2008, at 19:06, Ryan Harper wrote:
>>> * Anthony Liguori <address@hidden> [2008-10-09 12:00]:
>>>> Read performance should be unaffected by using O_DSYNC. O_DIRECT
>>>> will significantly reduce read performance. I think we should use
>>>> O_DSYNC by default and I have sent out a patch that contains that.
>>>> We will follow up with benchmarks to demonstrate this.
>>
>> Hi Ryan,
>>
>> As "cache=on" uses a resource (host memory) shared by the whole
>> system, you must take the size of the host memory into account and
>> run some applications (several guests?) to pollute the host cache.
>> For instance, you could run 4 guests with the benchmark running in
>> each of them concurrently, and reasonably limit the host memory to
>> 5x the guest memory (for instance, 4 guests with 128 MB each on a
>> host with 768 MB).
>
> I'm not following you here, the only assumption I see is that we have
> 1g of host mem free for caching the write.
Is this a realistic use case ?
>> As O_DSYNC implies a journal commit, you should run a benchmark on
>> the ext3 host file system concurrently with the benchmark in the
>> guest, to see the impact of the commits on each of them.
>
> I understand the goal here, but what sort of host ext3 journaling
> load is appropriate? Additionally, when we're exporting block
> devices, I don't believe the ext3 journal is an issue.
Yes, it's a comment for the last test case. I think you can run the same benchmark as you do in the guest.
>>> baremetal baseline (1g dataset):
>>> ---------------------------+-------+-------+--------------+------------+
>>> Test scenarios             | bandw | % CPU | ave submit   | ave compl  |
>>> type, block size, iface    | MB/s  | usage | latency usec | latency ms |
>>> ---------------------------+-------+-------+--------------+------------+
>>> write, 16k, lvm, direct=1  | 127.7 |  12   |    11.66     |    9.48    |
>>> write, 64k, lvm, direct=1  | 178.4 |   5   |    13.65     |   27.15    |
>>> write, 1M , lvm, direct=1  | 186.0 |   3   |   163.75     |  416.91    |
>>> ---------------------------+-------+-------+--------------+------------+
>>> read , 16k, lvm, direct=1  | 170.4 |  15   |    10.86     |    7.10    |
>>> read , 64k, lvm, direct=1  | 199.2 |   5   |    12.52     |   24.31    |
>>> read , 1M , lvm, direct=1  | 202.0 |   3   |   133.74     |  382.67    |
>>> ---------------------------+-------+-------+--------------+------------+
>>
>> Could you recall which benchmark you use?
>
> yeah:
>
> fio --name=guestrun --filename=/dev/vda --rw=write --bs=${SIZE} \
>     --ioengine=libaio --direct=1 --norandommap --numjobs=1 \
>     --group_reporting --thread --size=1g --write_lat_log \
>     --write_bw_log --iodepth=74
Thank you...
>>> kvm write (1g dataset):
>>> ----------------------------+-------+-------+--------------+------------+
>>> Test scenarios              | bandw | % CPU | ave submit   | ave compl  |
>>> block size,iface,cache,sync | MB/s  | usage | latency usec | latency ms |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 16k,virtio,off,none         | 135.0 |  94   |      9.1     |    8.71    |
>>> 16k,virtio,on ,none         | 184.0 | 100   |     63.69    |   63.48    |
>>> 16k,virtio,on ,O_DSYNC      | 150.0 |  35   |      6.63    |    8.31    |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 64k,virtio,off,none         | 169.0 |  51   |     17.10    |   28.00    |
>>> 64k,virtio,on ,none         | 189.0 |  60   |     69.42    |   24.92    |
>>> 64k,virtio,on ,O_DSYNC      | 171.0 |  48   |     18.83    |   27.72    |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 1M ,virtio,off,none         | 142.0 |  30   |   7176.00    |  523.00    |
>>> 1M ,virtio,on ,none         | 190.0 |  45   |   5332.63    |  392.35    |
>>> 1M ,virtio,on ,O_DSYNC      | 164.0 |  39   |   6444.48    |  471.20    |
>>> ----------------------------+-------+-------+--------------+------------+
>>
>> According to the semantics, I don't understand how O_DSYNC can be
>> better than cache=off in this case...
>
> I don't have a good answer either, but O_DIRECT and O_DSYNC are
> different paths through the kernel. This deserves a better reply, but
> I don't have one off the top of my head.
The O_DIRECT kernel path should be more "direct" than the O_DSYNC one;
perhaps an oprofile run could help us understand. What is also strange
is the CPU usage with cache=off: it should be lower than the others.
Perhaps an alignment issue due to the LVM?
>>> kvm read (1g dataset):
>>> ----------------------------+-------+-------+--------------+------------+
>>> Test scenarios              | bandw | % CPU | ave submit   | ave compl  |
>>> block size,iface,cache,sync | MB/s  | usage | latency usec | latency ms |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 16k,virtio,off,none         | 175.0 |  40   |     22.42    |    6.71    |
>>> 16k,virtio,on ,none         | 211.0 | 147   |     59.49    |    5.54    |
>>> 16k,virtio,on ,O_DSYNC      | 212.0 | 145   |     60.45    |    5.47    |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 64k,virtio,off,none         | 190.0 |  64   |     16.31    |   24.92    |
>>> 64k,virtio,on ,none         | 546.0 | 161   |    111.06    |    8.54    |
>>> 64k,virtio,on ,O_DSYNC      | 520.0 | 151   |    116.66    |    8.97    |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 1M ,virtio,off,none         | 182.0 |  32   |   5573.44    |  407.21    |
>>> 1M ,virtio,on ,none         | 750.0 | 127   |   1344.65    |   96.42    |
>>> 1M ,virtio,on ,O_DSYNC      | 768.0 | 123   |   1289.05    |   94.25    |
>>> ----------------------------+-------+-------+--------------+------------+
>>
>> OK, but in this case the size of the cache for "cache=off" is the
>> size of the guest cache, whereas in the other cases it is the size
>> of the guest cache + the size of the host cache; this is not fair...
>
> it isn't supposed to be fair, cache=off is O_DIRECT, we're reading
> from the device, we *want* to be able to lean on the host cache to
> read the data, pay once and benefit in other guests if possible.
OK, but if you want to follow this path, I think you must run several guests concurrently to see how the host cache helps each of them. If you want, I can try this tomorrow. Is the O_DSYNC patch the one posted to the mailing list?
And moreover, you should run an endurance test to see how the cache evolves.
>>> --------------------------------------------------------------------------
>>> exporting file in ext3 filesystem as block device (1g)
>>> --------------------------------------------------------------------------
>>>
>>> kvm write (1g dataset):
>>> ----------------------------+-------+-------+--------------+------------+
>>> Test scenarios              | bandw | % CPU | ave submit   | ave compl  |
>>> block size,iface,cache,sync | MB/s  | usage | latency usec | latency ms |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 16k,virtio,off,none         |  12.1 |  15   |      9.1     |    8.71    |
>>> 16k,virtio,on ,none         | 192.0 |  52   |     62.52    |    6.17    |
>>> 16k,virtio,on ,O_DSYNC      | 142.0 |  59   |     18.81    |    8.29    |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 64k,virtio,off,none         |  15.5 |   8   |     21.10    |  311.00    |
>>> 64k,virtio,on ,none         | 454.0 | 130   |    113.25    |   10.65    |
>>> 64k,virtio,on ,O_DSYNC      | 154.0 |  48   |     20.25    |   30.75    |
>>> ----------------------------+-------+-------+--------------+------------+
>>> 1M ,virtio,off,none         |  24.7 |   5   |  41736.22    | 3020.08    |
>>> 1M ,virtio,on ,none         | 485.0 | 100   |   2052.09    |  149.81    |
>>> 1M ,virtio,on ,O_DSYNC      | 161.0 |  42   |   6268.84    |  453.84    |
>>> ----------------------------+-------+-------+--------------+------------+
>>
>> What file type do you use (qcow2, raw?)
>
> Raw.
No comment.

Laurent
-----------------------------
Laurent Vivier
-----------------------------
"The best way to predict the future is to invent it."
- Alan Kay