No, I don't need realtime behavior. Realtime implies determinism, but determinism doesn't imply realtime. Of course, I realize that other sources of non-determinism exist, but those are separate stories. Here I'm just trying to eliminate one of them: asynchronous emulation of I/O inside QEMU. Realtime isn't the solution here.
Firstly, implementing realtime still leaves a dependency on the host machine (its performance, hardware configuration, etc.) and on the number of containers running. Yes, it will be deterministic, but the results are tied to a given host and container count.
Secondly, it's simply overkill for the problem being solved. The problem area is bounded by the guest and the QEMU implementation. Using realtime requires fighting complexity on the host as well (the host kernel must be realtime, the system configuration must be tuned, all possible latencies must be carefully traced, etc.). I understand perfectly how complex it is to design a realtime system in general, and implementing one on Linux makes things even more complex.
Thirdly, it works only for KVM (and possibly other virtualization hypervisors). That's not my case, since my guest runs under TCG with -icount,sleep=off.
It seems you got me wrong. I'll try to explain the problem another way.
The guest virtual clock must run independently of the realtime (host) clock. They may be synchronized only in order to wait for some QEMU/host operation to complete, i.e. guest time is frozen by host performance bottlenecks, but this is transparent to the guest. This is how "-icount,sleep=off" works (or at least should work) in the time domain of CPU emulation. But I/O operations don't seem to respect this policy. When QEMU processes an I/O request from the guest, it allows virtual time to run freely until the backend completes the operation and the result is passed back to the guest. And this is what makes the guest "feel" the speed/latency of I/O. That's the core of the problem.
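For reference, here is a rough sketch of the kind of invocation I mean (the machine type, image path and shift value are placeholders, not my exact setup):

# TCG guest with a deterministic instruction counter; sleep=off
# decouples virtual time from the host clock when the vCPU is idle
# (virtual time warps to the next timer deadline instead of waiting).
qemu-system-x86_64 \
    -accel tcg \
    -icount shift=auto,sleep=off \
    -drive file=guest.img,format=raw,if=virtio \
    -nographic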
To explain the problem even better, I've written a simple script (test_run_multiple_containers.sh) that emulates the execution of multiple containers:
#!/bin/bash
# Usage: ./test_run_multiple_containers.sh N
# Starts N parallel dd writers and prints each one's measured throughput.
N=$1
for i in $(seq 1 "$N"); do
    # dd prints its summary on stderr; keep only the final throughput
    # field of the summary line (e.g. "210 MB/s").
    dd if=/dev/zero of="/tmp/testfile_$i" bs=1K count=100000 2>&1 \
        | sed -n 's/^.*, \(.*\)$/\1/p' &
done
wait                    # let all writers finish
rm -f /tmp/testfile*    # clean up
Here N is the number of containers running in parallel, and /tmp/testfile_$i is a file located in the $i-th container's rootfs (a dedicated mount point, block device or something else).
Running
./test_run_multiple_containers.sh 1
on a real machine should output a value corresponding to the maximum write speed. Let's define it as "max_io_throughput".
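With GNU coreutils, the dd summary line looks something like this (the numbers are invented for illustration):

102400000 bytes (102 MB, 98 MiB) copied, 0.487 s, 210 MB/s

The sed expression in the script keeps only the final field of that line, i.e. "210 MB/s".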
Running this script on a real machine with different N values should give outputs with roughly identical values of about "max_io_throughput / N".
What I need is for this script, run in the guest, to always give identical and constant values, independent of N, the current host load or anything else external to the guest. (No magic: while the running emulation causes at most "max_io_throughput" load on the host (in terms of real time), QEMU throttles the guest virtual clock to run N times slower relative to the realtime clock.)
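To make the throttling idea concrete (numbers invented for illustration):

max_io_throughput = 200 MB/s, N = 4
host-side rate per container (real time):  200 / 4 = 50 MB/s
virtual clock slowdown:                    4x relative to the realtime clock
guest-observed rate (virtual time):        50 * 4 = 200 MB/s

So each container measures the same 200 MB/s regardless of N, while the host never does more than 200 MB/s of real I/O.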
Also, I forgot to mention that the containers' rootfs aren't required to be persistent or to stay on the host while the containers execute. They may be transferred to guest RAM before execution; they're just source images of the rootfs.