tinycc-devel

Re: [Tinycc-devel] inline assembly and optimization passes


From: Jared Maddox
Subject: Re: [Tinycc-devel] inline assembly and optimization passes
Date: Sun, 22 Sep 2013 19:45:51 -0500

> Date: Sun, 22 Sep 2013 16:39:14 +0200
> From: Sylvain BERTRAND <address@hidden>
> To: address@hidden
> Subject: Re: [Tinycc-devel] inline assembly and optimization passes
> Message-ID: <address@hidden>
> Content-Type: text/plain; charset=us-ascii
>
> :)
>
> If I were to design a new language, I would go the other way,
> less complex, more explicit.
>

I assume that "less complex, more explicit" refers to the garbage
collection, classes, etc. The language would actually have a fairly
simple syntax (where possible, a given piece of syntax would mean
basically the same thing in every context). The reason for several of
the "complexities" is that this would be intended to be an everyday
applications language. Thus, OO, garbage collection, and closures are
all desirable features, because they make it easier to write a program
in a "clean" manner. The presence of interfaces and absence of
inheritance is basically to force cleaner design upon programmers:
people often want things that will actually cause trouble while
ignoring things that should be used instead, so you have to disregard
some requests.

> I would go for a kind of C99- language (C-- is already taken by a
> great evil):
>  - only sized types (u8 s16 f80...), no void pointer

I'm undecided on sized types due to some of the mainframes out there
(e.g. some have 36-bit native words). void, and by extension void
pointers, would be part of the language, due to a desire to share data
types with C, thereby making interop easier (I consider C++'s approach
to interop inherently superior to Java's).

>  - kind of no implicit cast (use of aliasing)

I agree with no implicit casting, but don't know what you mean by aliasing.

>  - no typedef

I would approach this entirely differently: all built-in types would
actually be like C's definition of a struct: you can't use them
directly. Instead, you have to typedef them to the type name that you
WILL be using. None of that silly "struct x y" stuff though: once it's
defined, its name is used for it alone, and thus a struct, union, and
int are treated the same: they're built-ins that can't be accessed
except through a typedef.

I also wouldn't use typedef to create an alternate name for a type
(which it currently does) but to create a new type that happens to be
identical to the old type. I'd also have somewhere between two and
four variations, depending on the details of how I wanted to do it.
Implicit casting could happen between the base-most type and its
typedeffed types (for the sake of initialization), but only explicit
casting would be allowed between typedeffed types.
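
A minimal sketch of how that "typedef makes a new type" rule can be
approximated in present-day C, assuming single-member structs stand in
for the new types. The names user_id, group_id, user_id_from, same_user
and group_id_from_user are made up for illustration and are not part of
any proposal or library.

```c
#include <assert.h>

/* Two types with identical layout that the compiler still treats as
 * distinct: a rough stand-in for "typedef creates a new type". */
typedef struct { int v; } user_id;
typedef struct { int v; } group_id;

/* Initialization from the base-most type (int). In the proposed
 * language this step would be an implicit cast. */
static user_id user_id_from(int v)
{
    assert(v >= 0);            /* illustrative precondition only */
    user_id u = { v };
    return u;
}

/* Accepts user_id only; passing a group_id here fails to compile. */
static int same_user(user_id a, user_id b)
{
    return a.v == b.v;
}

/* The explicit "cast" between two typedeffed types. */
static group_id group_id_from_user(user_id u)
{
    group_id g = { u.v };
    return g;
}
```

Calling same_user() with a group_id argument is rejected by the
compiler, which is the behaviour the paragraph above is after; the
conversion has to be spelled out through group_id_from_user().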

>  - no enum

I would probably have enum, but since I would have OO (and thus could
justify namespacing), and not be trying to be code-compatible with C,
I would likely have enum members namespaced to their particular enum.
I would probably generify it a bit so that enums could be defined from
arbitrary types (e.g. a struct) as well as from ints.

>  - one loop instruction (loop{} you would break or continue)

I would probably have both while() and do-while() since it makes some
things smaller to code.

I would probably have a foreach() or apply() for use on arrays, which
would also be a "discrete" type, just like an int, struct, or pointer.
I picked the basics of this idea up from LPC.

All loops (and ifs, etc.) would also return a value, which for
foreach()/apply() would be an array of the results from each
individual body invocation. For while() and do-while() it would be the
result of the last invocation.
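
A rough C sketch of those two result rules, since C loops are
statements and have no value of their own. The helpers apply_int,
while_value and square are hypothetical, and the value returned when a
while() body never runs (0 here) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Runs `body` once per element and stores each invocation's result,
 * mimicking a foreach()/apply() whose value is an array of results. */
static void apply_int(const int *in, size_t n, int (*body)(int), int *out)
{
    assert(in != NULL && out != NULL && body != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = body(in[i]);
}

/* Mimics a while() whose value is the result of its last body
 * invocation: keeps doubling x while it is below limit. */
static int while_value(int start, int limit)
{
    int result = 0;    /* assumed value when the body never runs */
    int x = start;
    while (x < limit) {
        x *= 2;
        result = x;    /* remember the latest body result */
    }
    return result;
}

/* Example loop body for apply_int(). */
static int square(int x)
{
    return x * x;
}
```

With the input {1, 2, 3} and square as the body, apply_int() fills the
output with {1, 4, 9}.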

>  - arrays to behave like structs

In what sense? I would have arrays (and arrays of arrays, and...)
passed by value instead of by pointer (if you want to pass it by
pointer then take its pointer), but I'm not certain what YOU actually
mean.
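
For comparison, C already gives pass-by-value arrays if the array is
wrapped in a struct, which may be one reading of "arrays to behave like
structs". A small sketch; int_vec4, VEC_LEN, sum_after_zeroing_first
and zero_first are made-up names.

```c
#include <assert.h>
#include <stddef.h>

#define VEC_LEN 4

/* An array inside a struct is copied whenever the struct is passed. */
typedef struct { int item[VEC_LEN]; } int_vec4;

/* Receives a copy, so the change to item[0] is invisible to the caller. */
static int sum_after_zeroing_first(int_vec4 v)
{
    int sum = 0;
    v.item[0] = 0;
    for (size_t i = 0; i < VEC_LEN; i++)
        sum += v.item[i];
    return sum;
}

/* Pass by pointer explicitly when the callee should modify the original. */
static void zero_first(int_vec4 *v)
{
    assert(v != NULL);
    v->item[0] = 0;
}
```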

>  - no bitfield

Losing bitfields would vaguely be a shame, but it isn't like they're
really all that important.

>  - a clean attribute management syntax

I've thought about this, but it would have to be a later addition.
Better to get core language features implemented first, especially
since I would presumably want to use a compile-time-code system for
this, which would need to be designed in conjunction with the template
& concepts system.

>  - no const/volatile/inline/static... that would be done using
>    the attribute
>    syntax

I would probably include at least some of those before implementing an
attribute system. If nothing else, you can move them to your attribute
system later.

>  - clean inline asm.
>

I probably would keep assembly outside of the language; stuff like
that is why you keep interop with C.




At any rate, we really should move back to the optimization stuff. The
VM system that I was thinking of works like this:

1) Four kinds of memory spaces:
i ) the heap
ii ) the stack
iii ) registers, which are an arbitrary number of arbitrarily sized
locations; if the registers of the target machine weren't big enough,
then a static memory space would be allocated to act as the "actual"
registers; note that this was for a VM, not an Intermediate Code
representation, and thus we'd likely want something else
iv ) the stacking heap, which is a "stack" of heaps, where you can
directly access the heap on top, indirectly access the heap below it
(e.g. by copying a value to the top heap, or writing a value from the
top heap), and which is passed to the destination of "call outs",
thereby allowing it to act as a way to pass arguments. The purpose
would be to:
I ) Tell the JIT translator the scoping of some temporary values. This
is to provide it with extra information for optimization. An example
usage would be that if you have a loop, then it might be represented
by its own heap, so that the interactions between normal memory and
the algorithm used inside the loop would be expressed entirely in
terms of copying data to and from the heap that was on the stack. This
should make it easier to optimize, since things are compartmentalized.
II ) Standardize the method of communicating with external code. A
PROPER dynamic library loading system would presumably be implemented,
but that in turn would be accessed via this mechanism. The operating
concept is "store data into a common memory space, stick a
dispatch-target into the correct register for the OS or processor to
see, and issue the dispatch-interrupt opcode".
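
A minimal C sketch of the stacking heap from item iv, assuming
fixed-size heaps and a small nesting limit purely to keep it short.
Every name here (stacking_heap, sh_push, sh_copy_up, and so on) is
hypothetical. Only the top heap is addressed directly; the heap below
is reached only by copying.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEAP_BYTES 64   /* assumed fixed heap size */
#define MAX_HEAPS   8   /* assumed nesting limit */

typedef struct {
    unsigned char heap[MAX_HEAPS][HEAP_BYTES];
    size_t depth;       /* number of heaps currently stacked */
} stacking_heap;

static void sh_init(stacking_heap *s) { s->depth = 0; }

/* Push a fresh heap; new heaps start out cleared. */
static int sh_push(stacking_heap *s)
{
    if (s->depth == MAX_HEAPS) return -1;
    memset(s->heap[s->depth], 0, HEAP_BYTES);
    s->depth++;
    return 0;
}

static int sh_pop(stacking_heap *s)
{
    if (s->depth == 0) return -1;
    s->depth--;
    return 0;
}

/* Direct access is allowed for the top heap only. */
static unsigned char *sh_top(stacking_heap *s)
{
    assert(s->depth > 0);
    return s->heap[s->depth - 1];
}

/* Bounds check shared by both copy directions. */
static int sh_range_ok(size_t off, size_t len)
{
    return off <= HEAP_BYTES && len <= HEAP_BYTES - off;
}

/* Indirect access: copy from the heap below into the top heap. */
static int sh_copy_up(stacking_heap *s, size_t below_off, size_t top_off, size_t len)
{
    if (s->depth < 2 || !sh_range_ok(below_off, len) || !sh_range_ok(top_off, len))
        return -1;
    memcpy(s->heap[s->depth - 1] + top_off, s->heap[s->depth - 2] + below_off, len);
    return 0;
}

/* Indirect access: write from the top heap into the heap below. */
static int sh_copy_down(stacking_heap *s, size_t top_off, size_t below_off, size_t len)
{
    if (s->depth < 2 || !sh_range_ok(below_off, len) || !sh_range_ok(top_off, len))
        return -1;
    memcpy(s->heap[s->depth - 2] + below_off, s->heap[s->depth - 1] + top_off, len);
    return 0;
}
```

A loop body would then look like: push a heap, copy its inputs up, work
only inside the top heap, copy the results down, pop.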

2) Opcodes deal with arbitrary-sized data, not with fixed sizes. The
size is either encoded directly within the opcode, or is stored
somewhere that the opcode says to look. This is to avoid the JVM's
one-time "32-bit only" limitation, though THAT would realistically
need to go in conjunction with an executable format that supported
sufficiently powerful linker/loader scripts to do the job (the only
code in a position to always know how such "nativizations" should go
is the compiler that produced the object files, so that needs to write
the linker/loader scripts). Some opcodes (e.g. "clear", add, compare,
etc.) would probably be provided to act on registers regardless of
size, so that you could use them without worrying about how large the
register REALLY is (unless it's too small, which you should be able to
detect with a specialized opcode).
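
A rough C sketch of size-agnostic "clear", add and compare, assuming a
register is modelled as a byte buffer plus a length, with 8-bit bytes
stored little-endian and treated as unsigned. vm_reg, reg_clear,
reg_add and reg_compare are invented names, and returning a carry flag
is just one way the "too small" case could be detected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A register whose size is data rather than part of the opcode. */
typedef struct {
    unsigned char *bytes;   /* little-endian, unsigned (assumption) */
    size_t size;            /* in bytes */
} vm_reg;

/* "clear" for a register of any size. */
static void reg_clear(vm_reg r)
{
    memset(r.bytes, 0, r.size);
}

/* dst += src for any sizes. Returns nonzero if the result did not fit
 * in dst, i.e. the destination register was too small. */
static unsigned reg_add(vm_reg dst, vm_reg src)
{
    unsigned carry = 0;
    assert(dst.bytes != NULL && src.bytes != NULL);
    for (size_t i = 0; i < dst.size; i++) {
        unsigned s = (i < src.size) ? src.bytes[i] : 0u;
        unsigned sum = dst.bytes[i] + s + carry;
        dst.bytes[i] = (unsigned char)(sum & 0xFFu);
        carry = sum >> 8;
    }
    /* Nonzero source bytes beyond dst's width also count as overflow. */
    for (size_t i = dst.size; i < src.size; i++)
        if (src.bytes[i] != 0) carry = 1;
    return carry;
}

/* Three-way compare (-1, 0, 1), most significant byte first; the
 * shorter register is treated as zero-extended. */
static int reg_compare(vm_reg a, vm_reg b)
{
    size_t n = (a.size > b.size) ? a.size : b.size;
    while (n-- > 0) {
        unsigned x = (n < a.size) ? a.bytes[n] : 0u;
        unsigned y = (n < b.size) ? b.bytes[n] : 0u;
        if (x != y) return (x < y) ? -1 : 1;
    }
    return 0;
}
```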

3) "Native" system call interface to make it cleaner/easier to
interface with external things, like the operating system. This would
consist of a call-out opcode and a call-ret opcode, which conceptually
would be the same as various opcodes that actual processors have to
allow applications to communicate with the operating system. In this
case the actual usage of them would be a little cloudy, because the
JIT system would ultimately determine whether the actual OS, or a
library in the same memory space, were the destination.
i ) The destination would be indicated by placing a value in some
particular register (presumably the lowest-numbered register, or
something like that), which code provided by the JIT would then use to
dispatch to the correct location.
ii ) All data transfer between the source and destination would be
through either the top (and ONLY the top) heap on the stack-heap
(implying that the stack-heap should probably be implemented as a
bunch of memory-mapped files), or whatever means were provided by
different calls to destinations (one destination might provide
allocations of shared memory, for example). If the stacking-heap
didn't have a heap on it at either the interrupt, or on the return
from call-out, then an empty one would be pushed on before the system
proceeded. Empty heaps would default to "cleared".
iii ) One destination would be guaranteed to exist: 0. This
destination would support a number of different functions, indicated
by two values at the beginning of the stack-heap that was passed to
it: the first 8 bits would indicate the size of the second value; the
second value would both be placed on the lowest power-of-two byte
location that did not have an address lower than the size value, and
would be sized in bytes according to the size value. So, if the size
was 60 then the first byte of the value would be at address 63
(because the first byte has address 0), and would be 60 bytes long. If
the length is 0, then the second value is also 0. The actual value of
the second value indicates the function to use.
I ) Function 0 is "quit". Exiting is therefore the simplest call-out
that a piece of code can perform.
II ) Function 1 would probably need to be the first in a series of
call-outs designed to find other destinations. Destinations would
probably have discrete names. This series of functions might be a bad
match for an IL/IC, or it might be useful. Open topic.
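
The layout rule for destination 0 in iii can be read more than one way,
so here is a small C sketch of one reading, chosen because it
reproduces the "size 60 gives address 63" example: take power-of-two
byte positions counted from 1, and pick the first whose 0-based address
is not lower than the size. The names selector_address and
requests_quit are hypothetical, and treating an all-zero value as
function 0 is an assumption that sidesteps byte order.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based address of the second value (the function selector) for a
 * nonzero size byte, under the reading described above. */
static size_t selector_address(unsigned char size)
{
    size_t position = 1;            /* 1-based power-of-two position */
    assert(size > 0);               /* size 0 means there is no value */
    while (position - 1 < size)     /* address must not be below size */
        position *= 2;
    return position - 1;
}

/* Returns 1 if the heap passed to destination 0 selects function 0
 * ("quit"), 0 if it selects some other function, and -1 if the value
 * would not fit inside the heap. */
static int requests_quit(const unsigned char *heap, size_t heap_len)
{
    assert(heap != NULL && heap_len > 0);
    unsigned char size = heap[0];   /* first 8 bits: size of the value */
    if (size == 0)
        return 1;                   /* length 0: the value is also 0 */
    size_t addr = selector_address(size);
    if (addr > heap_len || size > heap_len - addr)
        return -1;
    for (size_t i = 0; i < size; i++)
        if (heap[addr + i] != 0)
            return 0;
    return 1;
}
```

Because new heaps start out cleared, a call-out to destination 0 on a
freshly pushed heap reads as "quit" under this sketch, which fits the
remark that exiting is the simplest call-out.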



There might have been a few other details, but that was the basic
idea. Multiple memory spaces, one with peculiar behavior designed to
be useful for optimization & call-outs. Arbitrary data sizes. An
opcode for the explicit purpose of "calling things that we can't know
about for whatever reason".

Thoughts?


