qemu-devel

Re: [PATCH v2 00/93] TCI fixes and cleanups


From: Richard Henderson
Subject: Re: [PATCH v2 00/93] TCI fixes and cleanups
Date: Thu, 4 Feb 2021 10:42:17 -1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 2/4/21 10:02 AM, Stefan Weil wrote:
> Is there a Git repository which makes pulling all changes easier?

https://gitlab.com/rth7680/qemu/-/tree/tci-next

> Regarding misaligned bytecode access, there exist two solutions. We could
> either use code which handles that correctly (I had sent a patch using memcpy
> two years ago and recently sent a V2 which uses the QEMU standard functions
> for that). Or we can align the data like it is done in Richard's patches.
> For me it is not obvious which one is better.

I think it is obvious.  If a host requires aligned access, an aligned load
takes a single instruction, while an unaligned load takes many.
E.g., for sparc,

int foo(void *p)
{
    int x;
    __builtin_memcpy(&x, p, 4);
    return x;
}

        ldub    [%i0], %g3
        ldub    [%i0+1], %g2
        stb     %g3, [%fp+2043]
        stb     %g2, [%fp+2044]
        ldub    [%i0+2], %g3
        ldub    [%i0+3], %g2
        stb     %g3, [%fp+2045]
        stb     %g2, [%fp+2046]
        ldsw    [%fp+2043], %i0

Such unaligned accesses are *really* slow.
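
For comparison, here is a minimal sketch of the two ways to read a 32-bit
operand.  The helper names load32_aligned and load32_unaligned are
hypothetical, chosen for illustration only; they are not QEMU functions.
On a strict-alignment host one would expect the first to compile to a single
load and the second to a byte-by-byte sequence like the one above, though the
exact output depends on the compiler and target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: the caller guarantees that p is 4-byte aligned
 * and points at a uint32_t object, so a plain load is valid.
 */
static inline uint32_t load32_aligned(const void *p)
{
    assert(((uintptr_t)p & 3) == 0);
    return *(const uint32_t *)p;
}

/*
 * Hypothetical helper: no alignment assumption.  memcpy lets the
 * compiler pick whatever byte-wise sequence the host needs.
 */
static inline uint32_t load32_unaligned(const void *p)
{
    uint32_t x;
    memcpy(&x, p, sizeof(x));
    return x;
}
```

Both return the same value for an aligned pointer; only the second is safe
when the bytecode stream does not keep its operands aligned.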

> While a single access is faster for aligned
> data, this might be different for sequential access on misaligned data which
> might profit from better caching of smaller bytecode.

I believe you'll find that the rewritten encoding is smaller already.


r~


