qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH v2] block: posix: Handle undetectable alignment


From: Kevin Wolf
Subject: Re: [Qemu-block] [PATCH v2] block: posix: Handle undetectable alignment
Date: Mon, 12 Aug 2019 16:23:52 +0200
User-agent: Mutt/1.11.3 (2019-02-01)

Am 11.08.2019 um 22:50 hat Nir Soffer geschrieben:
> In some cases buf_align or request_alignment cannot be detected:
> 
> - With Gluster, buf_align cannot be detected since the actual I/O is
>   done on Gluster server, and qemu buffer alignment does not matter.

If it doesn't matter, the best value would be buf_align = 1.

> - With local XFS filesystem, buf_align cannot be detected if reading
>   from unallocated area.

Here, we actually do need alignment, but it's unknown whether it would
be 512 or 4096 or something entirely. Failing to align requests
correctly results in I/O errors.

> - With Gluster backed by XFS, request_alignment cannot be detected if
>   reading from unallocated area.

This is like buf_align for XFS: We don't know the right value, but
getting it wrong causes I/O errors.

> - With NFS, the server does not use direct I/O, so both buf_align
>   cannot be detected.

This suggests that byte-aligned requests are fine for NFS, i.e.
buf_align = request_alignment = 1 would be optimal in this case.

> These cases seems to work when storage sector size is 512 bytes, because
> the current code starts checking align=512. If the check succeeds
> because alignment cannot be detected we use 512. But this does not work
> for storage with 4k sector size.
> 
> Practically the alignment requirements are the same for buffer
> alignment, buffer length, and offset in file. So in case we cannot
> detect buf_align, we can use request alignment. If we cannot detect
> request alignment, we can fallback to a safe value.

This makes sense in general.

What the commit message doesn't explain, but probably should do is how
we determine whether we could successfully detect request alignment. The
approach taken here is that a detected alignment of 1 is understood as
failure to detect the real alignment.

With cases 2 and 3 this gives the desird result; however for cases 1 and
4, an alignment of 1 would be the actual correct value, and the new
probing algorithm results in a worse result.

However, since the negative effect of the old algorithm in cases 2 and 3
is I/O errors whereas the effect of the new one in cases 1 and 4 is just
degraded performance for I/O that isn't 4k aligned, the new approch is
still preferable.

I think we need to make this tradeoff clearer in the commit message and
the comment in the code, but the approach is reasonable enough.

> With this change:
> 
> - Provisioning VM and copying disks on local XFS and Gluster with 4k
>   sector size works, resolving bugs [1],[2].
> 
> - With NFS we fallback to buf_align and request_alignment of 4096
>   instead of 512. This may cause unneeded data copying, but so far I see
>   better performance with this change.
> 
> [1] https://bugzilla.redhat.com/1737256
> [2] https://bugzilla.redhat.com/1738657
> 
> Signed-off-by: Nir Soffer <address@hidden>
> ---
> 
> v1 was a minimal hack; this version is a more generic fix that works for
> any storage without requiring users to allocate the first block of an
> image. Allocting the first block of an image is still a good idea since
> it allows detecting the right alignment in some cases.
> 
> v1 could also affect cases when we could detect buf_align to use
> request_alignment instead; v2 will only affect cases when buf_align or
> request_alignment cannot be detected.
> 
> v1 was hare:
> https://lists.nongnu.org/archive/html/qemu-block/2019-08/msg00133.html
> 
>  block/file-posix.c | 40 +++++++++++++++++++++++++++++-----------
>  1 file changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index f33b542b33..511468f166 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -323,6 +323,7 @@ static void raw_probe_alignment(BlockDriverState *bs, int 
> fd, Error **errp)
>      BDRVRawState *s = bs->opaque;
>      char *buf;
>      size_t max_align = MAX(MAX_BLOCKSIZE, getpagesize());
> +    size_t alignments[] = {1, 512, 1024, 2048, 4096};
>  
>      /* For SCSI generic devices the alignment is not really used.
>         With buffered I/O, we don't have any restrictions. */
> @@ -349,25 +350,42 @@ static void raw_probe_alignment(BlockDriverState *bs, 
> int fd, Error **errp)
>      }
>  #endif
>  
> -    /* If we could not get the sizes so far, we can only guess them */
> -    if (!s->buf_align) {
> +    /*
> +     * If we could not get the sizes so far, we can only guess them. First 
> try
> +     * to detect request alignment, since it is more likely to succeed. Then
> +     * try to detect buf_align, which cannot be detected in some cases (e.g.
> +     * Gluster). If buf_align cannot be detected, we fallback to the value of
> +     * request_alignment.
> +     */
> +
> +    if (!bs->bl.request_alignment) {
> +        int i;
>          size_t align;
> -        buf = qemu_memalign(max_align, 2 * max_align);
> -        for (align = 512; align <= max_align; align <<= 1) {
> -            if (raw_is_io_aligned(fd, buf + align, max_align)) {
> -                s->buf_align = align;
> +        buf = qemu_memalign(max_align, max_align);
> +        for (i = 0; i < ARRAY_SIZE(alignments); i++) {
> +            align = alignments[i];
> +            if (raw_is_io_aligned(fd, buf, align)) {
> +                /* Fallback to safe value. */
> +                bs->bl.request_alignment = (align != 1) ? align : max_align;
>                  break;
>              }
>          }
>          qemu_vfree(buf);
>      }
>  
> -    if (!bs->bl.request_alignment) {
> +    if (!s->buf_align) {
> +        int i;
>          size_t align;
> -        buf = qemu_memalign(s->buf_align, max_align);
> -        for (align = 512; align <= max_align; align <<= 1) {
> -            if (raw_is_io_aligned(fd, buf, align)) {
> -                bs->bl.request_alignment = align;
> +        buf = qemu_memalign(max_align, 2 * max_align);
> +        for (i = 0; i < ARRAY_SIZE(alignments); i++) {
> +            align = alignments[i];
> +            if (raw_is_io_aligned(fd, buf + align, max_align)) {
> +                /* Fallback to request_aligment or safe value. */
> +                s->buf_align = (align != 1)
> +                    ? align
> +                    : (bs->bl.request_alignment != 0)
> +                        ? bs->bl.request_alignment
> +                        : max_align;

Nested ternary operators over five lines aren't that readable any more.
I'd suggest using at least one more if:

    if (align != 1) {
        s->buf_align = align;
    } else {
        s->buf_slign = bs->bl.request_alignment ?: max_align;
    }

In fact, is checking bs->bl.request_alignment for 0 even worth it here?
If it's 0, we failed to find any working configuration and will return
an error anyway. Then it doesn't matter if s->buf_align becomes 0, too.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]