
RE: QEMU for Qualcomm Hexagon - KVM Forum talk and code available


From: Taylor Simpson
Subject: RE: QEMU for Qualcomm Hexagon - KVM Forum talk and code available
Date: Wed, 13 Nov 2019 19:31:32 +0000

Responses below ...

Taylor


Taylor Simpson <address@hidden> writes:

> I had discussions with several people at the KVM Forum, and I’ve been 
> thinking about how to divide up the code for community review.  Here is my 
> proposal for the steps.
>
>   1.  linux-user changes + linux-user/hexagon + skeleton of target/hexagon
>       This is the minimum amount to build and run a very simple program.  I
>       have an assembly program that prints “Hello” and exits.  It is
>       constructed to use very few instructions that can be added brute
>       force in the Hexagon back end.

I'm hoping most of the linux-user changes are in the hexagon runloop?
There has been quite a bit of work splitting up and cleaning up the #ifdef mess 
in linux-user over the last few years.

[Taylor] The majority of the linux-user support is in linux-user/hexagon.  
However, there were still a few changes needed in the linux-user directory.
    elfload.c       Needs some code to match some existing #ifdef TARGET_xyz
                    code (e.g., define the init_thread function).
    signal.c        Needs some code to map signal 33 from the guest to
                    something else on the host.  I spoke to Laurent about
                    this at the conference.
    syscall.c       Needs a definition of regpairs_aligned that returns 1.
    syscall_defs.h  Needs some definitions (e.g., TARGET_IOC_SIZEBITS and
                    target_stat) added to the other #ifdef TARGET_xyz blocks.

>   2.  Add the code that is imported from the Hexagon simulator and the
>       qemu helper generator
>       This will allow the scalar ISA to be executed.  This will grow the set
>       of programs that could execute, but there will still be limitations.
>       In particular, there can be no packets, which means the C library won’t
>       work.  We have to build with -nostdlib.

You could run -nostdlib system TCG tests (hello and memory) but that would 
require modelling some sort of hardware and assumes you have a simple serial 
port or semihosting solution. That said a bunch of the MIPS tests are 
linux-user and -nostdlib so that isn't a major problem in getting some of the 
tests running.

When you say code imported from the hexagon simulator I was under the 
impression you were generating code from the instruction description.
Otherwise you'll need to be very clear about your licensing grants.

[Taylor] That is correct, we are generating code from the description.  There 
are two major pieces that are imported:
    - Instruction decode logic
    - Any additional functions that are called from the generated code

[Taylor] All of the code will be licensed the same way.  I want to mark the 
imported code because it does not conform to the qemu coding standards.  I 
prefer not to reformat it in order to easily get bug fixes and enhancements 
going forward.  I also hope it will make the community review easier by 
allowing people to focus on the code that is new for qemu.

>   3.  Add support for packet semantics
>       At this point, we will be able to execute full programs linked with the
>       C library.  This will include the check-tcg tests.

I think the interesting question is whether the roll-back semantics of Hexagon 
are something we might need for other emulated architectures or are a 
solution specific to Hexagon (I'm guessing the latter).

[Taylor] It is currently Hexagon-specific and isolated into the target/hexagon 
directory.  I was thinking the reviewers would have an easier time 
understanding the code if this were broken out.  However, it could also be 
merged together with step 2 if that is preferred.
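
For readers unfamiliar with packet semantics, here is a minimal sketch in 
plain C.  It is not the code in target/hexagon; the names (cpu_state, 
packet_begin, reg_write, packet_commit, packet_abort) are hypothetical and 
chosen only for illustration.  The idea is that every instruction in a packet 
reads the register values from before the packet, writes are buffered, and 
the buffered writes are either committed together or discarded.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NUM_REGS 32

/* Hypothetical CPU state: architectural registers plus pending writes. */
typedef struct {
    uint32_t gpr[NUM_REGS];        /* values visible to reads */
    uint32_t new_value[NUM_REGS];  /* writes buffered during a packet */
    bool written[NUM_REGS];        /* which registers have a pending write */
} cpu_state;

/* Start a packet: no writes are pending yet. */
void packet_begin(cpu_state *s)
{
    memset(s->written, 0, sizeof(s->written));
}

/* Reads always see the state from before the packet started. */
uint32_t reg_read(const cpu_state *s, int n)
{
    return s->gpr[n];
}

/* Writes are buffered rather than applied immediately. */
void reg_write(cpu_state *s, int n, uint32_t value)
{
    s->new_value[n] = value;
    s->written[n] = true;
}

/* End of packet: apply all buffered writes at once. */
void packet_commit(cpu_state *s)
{
    for (int i = 0; i < NUM_REGS; i++) {
        if (s->written[i]) {
            s->gpr[i] = s->new_value[i];
        }
    }
    memset(s->written, 0, sizeof(s->written));
}

/* Roll back: drop the buffered writes, leaving gpr[] untouched. */
void packet_abort(cpu_state *s)
{
    memset(s->written, 0, sizeof(s->written));
}
```

With this model, a packet containing both "r0 = r1" and "r1 = r0" swaps the 
two registers, because both reads happen against the pre-packet state.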

>   4.  Add support for the wide vector extensions
>   5.  Add the helper overrides for performance optimization
>       Some of these will be written by hand, and we’ll work with rev.ng to
>       integrate their flex/bison generator.

One thing to nail down is whether we include the generated code in the source 
tree with a tool to regenerate it (much like we do for linux-headers) or add 
the dependency and regenerate each time from scratch. I don't see including 
flex/bison as a dependency being a major issue (in fact we have it in our 
docker images so I guess something uses it). However, depending on libclang, 
which was also being discussed, might be trickier.

[Taylor] Currently, I have the generator and the generated code sitting in the 
source tree.  I'm flexible on this if the decision is to regenerate it every 
time.

>
> I would love some feedback on this proposal.  Hopefully, that is enough 
> detail so that people can comment.  If anything isn’t clear, please ask 
> questions.
>
>
> Thanks,
> Taylor
>
>
> From: Qemu-devel <qemu-devel-bounces+tsimpson=address@hidden>
> On Behalf Of Taylor Simpson
> Sent: Tuesday, November 5, 2019 10:33 AM
> To: Aleksandar Markovic <address@hidden>
> Cc: Alessandro Di Federico <address@hidden>; address@hidden;
> address@hidden; Niccolò Izzo <address@hidden>
> Subject: RE: QEMU for Qualcomm Hexagon - KVM Forum talk and code
> available
>
> Hi Aleksandar,
>
> Thank you – We’re glad you enjoyed the talk.
>
> One point of clarification on SIMD in Hexagon.  What we refer to as the 
> “scalar” core does have some SIMD operations.  Register pairs are 8 bytes, 
> and there are several SIMD instructions.  The example we showed in the talk 
> included a VADDH instruction.  It treats the register pair as 4 half-words 
> and does a vector add.  Then there are the Hexagon Vector eXtensions (HVX) 
> instructions that operate on 128-byte vectors.  There is a wide variety of 
> instructions in this set.  As you mentioned, some of them are pure SIMD and 
> others are very complex.
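
As an illustration of the half-word vector add described above, here is a 
small reference model in plain C.  It is a sketch rather than QEMU or 
simulator code: the function name vaddh_model is hypothetical, and it assumes 
the non-saturating form, where each 16-bit lane wraps independently.

```c
#include <stdint.h>

/*
 * Reference model (not QEMU code): add two 64-bit register pairs as
 * four independent 16-bit lanes.  Carries never cross a lane boundary.
 * Assumes wrapping (non-saturating) arithmetic in each lane.
 */
uint64_t vaddh_model(uint64_t rss, uint64_t rtt)
{
    uint64_t result = 0;

    for (int i = 0; i < 4; i++) {
        uint16_t a = (uint16_t)(rss >> (16 * i));
        uint16_t b = (uint16_t)(rtt >> (16 * i));
        uint16_t sum = (uint16_t)(a + b);  /* wraps modulo 2^16 */

        result |= (uint64_t)sum << (16 * i);
    }
    return result;
}
```

For example, adding 0x0001000100010001 to 0x0001FFFF00020003 leaves lane 2 at 
zero (0xFFFF + 1 wraps) without disturbing lane 3.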
>
> For the helper generator, the vast majority of these are implemented with 
> helpers.  There are only 2 vector instructions in the scalar core that have a 
> TCG override, and all of the HVX instructions are implemented with helpers.  
> If you are interested in a deeper dive, see below.
>
> Alessandro and Niccolo can comment on the flex/bison implementation.
>
> Thanks,
> Taylor
>
>
> Now for the deeper dive in case anyone is interested.  Look at the genptr.c 
> file in target/hexagon.
>
> The first vector instruction with an override is A6_vminub_RdP.  It 
> does a byte-wise comparison of two register pairs and sets a predicate 
> register indicating whether the byte in the left or right operand is greater.
> Here is the TCG code.
> #define fWRAP_A6_vminub_RdP(GENHLPR, SHORTCODE) \
> { \
>     TCGv BYTE = tcg_temp_new(); \
>     TCGv left = tcg_temp_new(); \
>     TCGv right = tcg_temp_new(); \
>     TCGv tmp = tcg_temp_new(); \
>     int i; \
>     tcg_gen_movi_tl(PeV, 0); \
>     tcg_gen_movi_i64(RddV, 0); \
>     for (i = 0; i < 8; i++) { \
>         fGETUBYTE(i, RttV); \
>         tcg_gen_mov_tl(left, BYTE); \
>         fGETUBYTE(i, RssV); \
>         tcg_gen_mov_tl(right, BYTE); \
>         tcg_gen_setcond_tl(TCG_COND_GT, tmp, left, right); \
>         fSETBIT(i, PeV, tmp); \
>         fMIN(tmp, left, right); \
>         fSETBYTE(i, RddV, tmp); \
>     } \
>     tcg_temp_free(BYTE); \
>     tcg_temp_free(left); \
>     tcg_temp_free(right); \
>     tcg_temp_free(tmp); \
> }
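
To make the intent of the macro above easier to follow, here is a plain C 
model of the same computation.  This is only a sketch: vminub_result and 
vminub_model are hypothetical names, and the operand order (left taken from 
Rtt, right taken from Rss) simply mirrors the macro.

```c
#include <stdint.h>

/* Hypothetical result type: destination pair plus predicate bits. */
typedef struct {
    uint64_t rdd;  /* byte-wise unsigned minimum of the two inputs */
    uint8_t pe;    /* bit i is set when byte i of rtt > byte i of rss */
} vminub_result;

/* Reference model (not QEMU code) of the loop in the TCG override. */
vminub_result vminub_model(uint64_t rtt, uint64_t rss)
{
    vminub_result r = { 0, 0 };

    for (int i = 0; i < 8; i++) {
        uint8_t left = (uint8_t)(rtt >> (8 * i));
        uint8_t right = (uint8_t)(rss >> (8 * i));
        uint8_t min = left < right ? left : right;

        if (left > right) {
            r.pe |= (uint8_t)(1u << i);
        }
        r.rdd |= (uint64_t)min << (8 * i);
    }
    return r;
}
```

The TCG version unrolls this loop at translation time, so the generated code 
contains eight copies of the body and no run-time loop.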
>
> The second instruction is S2_vsplatrb.  It takes the byte from the operand 
> and replicates it 4 times into the destination register.  Here is the TCG 
> code.
> #define fWRAP_S2_vsplatrb(GENHLPR, SHORTCODE) \
> { \
>     TCGv tmp = tcg_temp_new(); \
>     int i; \
>     tcg_gen_movi_tl(RdV, 0); \
>     tcg_gen_andi_tl(tmp, RsV, 0xff); \
>     for (i = 0; i < 4; i++) { \
>         tcg_gen_shli_tl(RdV, RdV, 8); \
>         tcg_gen_or_tl(RdV, RdV, tmp); \
>     } \
>     tcg_temp_free(tmp); \
> }
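
The same operation can be written as a plain C model, which may help when 
checking the TCG sequence.  This is a sketch with a hypothetical function 
name (vsplatrb_model), not code from the tree.

```c
#include <stdint.h>

/*
 * Reference model (not QEMU code): replicate the low byte of rs into
 * all four bytes of the 32-bit result, using the same shift-and-or
 * steps as the TCG override.
 */
uint32_t vsplatrb_model(uint32_t rs)
{
    uint32_t byte = rs & 0xff;
    uint32_t rd = 0;

    for (int i = 0; i < 4; i++) {
        rd = (rd << 8) | byte;
    }
    return rd;
}
```

An input of 0x12345678 gives 0x78787878.  Multiplying the masked byte by 
0x01010101 produces the same result in a single operation.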
>
>
> From: Aleksandar Markovic <address@hidden>
> Sent: Monday, November 4, 2019 6:05 PM
> To: Taylor Simpson <address@hidden>
> Cc: address@hidden; Alessandro Di Federico <address@hidden>;
> address@hidden; Niccolò Izzo <address@hidden>
> Subject: Re: QEMU for Qualcomm Hexagon - KVM Forum talk and code available
>
> On Friday, October 25, 2019, Taylor Simpson <address@hidden> wrote:
> We would like to inform you that we will be doing a talk at the KVM Forum 
> next week on QEMU for Qualcomm Hexagon.  Alessandro Di Federico, Niccolo 
> Izzo, and I have been working independently on implementations of the Hexagon 
> target.  We plan to merge the implementations, have a community review, and 
> ultimately have Hexagon be an official target in QEMU.  Our code is available 
> at the links below.
> https://github.com/revng/qemu-hexagon
> https://github.com/quic/qemu
> If anyone has any feedback on the code as it stands today or guidance on how 
> best to prepare it for review, please let us know.
>
>
> Hi, Taylor, Niccolo (and Alessandro too).
>
> I didn't have a chance to take a look at either the code or the docs, but I 
> did attend your presentation at KVM Forum, and I found it superb and 
> attractive, one of the best at the conference, if not the very best.
>
> I just have a couple of general questions:
>
> - Regarding the code you plan to upstream, are all SIMD instructions 
> implemented via the TCG API, or are some of them still implemented 
> using helpers?
>
> - Most SIMD instructions can be viewed simply as several parallel 
> elementary operations. However, for a given SIMD instruction set, usually not 
> all of them fit into this pattern. For example, "horizontal add" (adding data 
> elements from the same SIMD register), various "pack/unpack/interleave/merge" 
> operations, and more general "shuffle/permute" operations as well (here I am 
> not sure which of these are included in the Hexagon SIMD set, but there must 
> be some). How did you deal with them?
>
> - What were the most challenging Hexagon SIMD instructions you came across 
> while developing your solution?
>
> Sincerely,
> Aleksandar
>
>
>
>
> Thanks,
> Taylor


--
Alex Bennée
