
Target native layer


From: Dr. Torsten Rupp
Subject: Target native layer
Date: Tue, 10 Aug 2004 13:30:09 +0200

Dear Classpath-developers,

I have read many emails on the list about the TARGET_* layer. I'm not very happy about the discussion, because we already discussed this around a year ago, and at that time it seemed everybody was happy with the TARGET_* abstraction layer. And of course I'm not happy because there is now a discussion about removing it completely without, imho, understanding the idea behind the layer. After exchanging some private emails I would like to post some of my thoughts in public.

1. Advantages and disadvantages

Advantages of TARGET_*:
- efficient code (independent of the compiler)
- easy to port (just override the macros which differ from the
  generic implementation for the specific target)
- macros are usually small (only 3-5 lines)
- functions can also be used (e. g. for complex native code)
- there is no "dead code"
- no need for extensive ifdef-elif-else-endif constructs
- the target-dependent implementation is located in a single file
  (a header file with macros)

Disadvantages of TARGET_*:
- debugging is more complicated
- not type-safe

Advantages of target-layer functions like do_*():
- debugging is easier
- type-safe like other C code

Disadvantages of target-layer functions do_*():
- less efficient (additional function call)
- "dead code" if some function in an object file is not used
  (a problem with the linker; see comments below)
- code cluttered with ifdef-elif-else-endif constructs
- running autoconf for embedded systems is difficult, thus
  many "hard-wired" predefines are needed
- target-dependent implementations are located in several
  files (configure, header file with predefines, C file)

2. Naming and other "cosmetic" things

The naming convention of the TARGET_* is like the following:

TARGET_NATIVE_<module>_<function>

<module> stands for some group of functions, e. g. file functions.
<function> stands for the name of a function, usually the name of the
corresponding OS function from Linux extended by some suffix, e. g.
OPEN_READ, which stands for open(...,O_RDONLY). There is (or at
least: there should be) no exception, thus some names can become
a little bit long. But there are no "arbitrary" abbreviations
which are difficult to understand.
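To illustrate the convention, here is a minimal sketch of what such a macro could look like (this is my illustration, not the actual Classpath header; the macro name follows the convention, and TARGET_NATIVE_OK/TARGET_NATIVE_ERROR stand in for the real result constants):

```c
#include <fcntl.h>
#include <unistd.h>

#define TARGET_NATIVE_OK    1
#define TARGET_NATIVE_ERROR 0

/* Module FILE, function OPEN_READ: maps to the OS function open()
   with the O_RDONLY flag, following TARGET_NATIVE_<module>_<function>. */
#define TARGET_NATIVE_FILE_OPEN_READ(filename,filedescriptor,result) \
  do { \
    filedescriptor = open(filename, O_RDONLY); \
    result = (filedescriptor >= 0) ? TARGET_NATIVE_OK : TARGET_NATIVE_ERROR; \
  } while (0)
```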

For the math-macros the naming is also like this, e. g.

TARGET_NATIVE_MATH_FLOAT_DOUBLE_ISNAN

module: MATH_FLOAT (floating point macros)
function: DOUBLE_ISNAN (check if a double is NaN)

The prefix TARGET_NATIVE_ is always used to avoid naming conflicts with
existing names in the OS includes. E. g. using only OPEN_READ() would be
dangerous, because OPEN_READ could be an already existing constant,
macro or function in the specific target OS.

OF COURSE.... this naming is _not_ fixed and I do _not_ claim it is the
best possible solution (but it is some solution; before that we had many
conflicts with different OSs because of the so "convenient" short names). If needed this can be changed without too much confusion and pain (it would mean some pain for aicas, of course, but that would be acceptable).

Cosmetics are:
 - length of macro names (I usually never type a macro name;
   instead I use cut+paste. Emacs users can use auto-completion,
   and Eclipse users also get some help from the IDE). By the way:
   the longest name is currently 55 characters long.
 - the prefix TARGET_NATIVE: some other prefix would also be fine
 - length of lines

3. Complexity of macros, debugging

It is true that #defines are difficult to debug. They are also difficult
to write, but of course it always depends on the specific macro. Usually
the macros have the following form:

INCLUDES (placed above the macro in the generic header file)

#define TARGET_NATIVE_<name>(...) \
  do { \
    FUNCTION \
    RESULT \
  } while (0)

e. g.:

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define TARGET_NATIVE_FILE_OPEN(filename,filedescriptor,flags,permissions,result) \
  do { \
    filedescriptor=open(filename, \
                        flags, \
                        permissions \
                        ); \
    result=(filedescriptor>=0)?TARGET_NATIVE_OK:TARGET_NATIVE_ERROR; \
  } while (0)

The standard (generic) implementation contains 138 of these macros. Only 9 are more complex, because of different possible implementations (selected by autoconf) or transformations of values. Thus in most cases the macros are only "wrappers" for some OS-specific call, including adaptation of parameters (e. g. types or units, result value).

Imho the complexity of the macros is usually not very high (if a macro becomes complex, a function can be implemented instead; the macro is then only an "alias"). They are multi-lined to make them more readable. The "include" statements are needed in the generic implementation, and the
"do...while" is a construct for safe usage of the macro.
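The "do...while" construct deserves a short illustration (my sketch, with made-up macro names): a multi-statement macro without it is not a single statement, so it misbehaves in an unbraced "if".

```c
/* Unsafe: two statements. Used as "if (cond) SET_RESULT_UNSAFE(v,r); else ...",
   only the first statement would be conditional, and the "else" would
   not even compile. */
#define SET_RESULT_UNSAFE(value,result) value = 0; result = 1

/* Safe: do { ... } while (0) makes the expansion behave like exactly
   one statement, which can be followed by a semicolon or an "else". */
#define SET_RESULT(value,result) \
  do { \
    value = 0; \
    result = 1; \
  } while (0)
```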

Debugging of macros is difficult - if they are complex. If a macro is only a wrapper, then only an OS-specific function is called, plus some additional calculation, e. g. evaluation of the return value. It is a good idea to keep the macros as simple as possible. And it is possible, because the TARGET_* layer does not add additional functionality; it only "maps" existing functionality.

4. autoconf, POSIX - porting

autoconf is a nice tool which is also used heavily at aicas (half of my
time I'm doing "autoconf"). But autoconf is also limited in its usage. For Unix-like systems autoconf is a good solution (that is what it was written for, I assume), but for non-Unix-like systems it can become a problem. We discussed at aicas around a year ago whether we should use autoconf only, but we found that this is not really possible. Especially for embedded systems and "strange" systems (e. g. MinGW or embOS) it is a big challenge to "trim" autoconf so that the right configuration is selected. I ran into some intricacies of autoconf and the specific OSes which make it very difficult to use autoconf only. I will give you a few examples:

- for embedded systems it is not always feasible to check whether a function
exists by compiling and linking a small example program, because sometimes linkage is done only partially on the host. Final linkage is done on the target when loading the program or when creating the system image with the included application. Thus AC_CHECK_FUNC is not feasible. The same problem occurs for other checks, e. g. for constants or datatypes.
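As a sketch of what the "hard-wired predefines" mentioned above could look like (the TARGET_HAVE_* names here are my invention, not the actual Classpath configuration): instead of an autoconf check that cannot run on the target, the per-target header simply states which OS function exists and selects the macro implementation accordingly.

```c
#include <stdio.h>
#include <unistd.h>

/* Hand-set per target, replacing an AC_CHECK_FUNC result that a
   cross-compiled target cannot produce. */
#define TARGET_HAVE_FTRUNCATE 1

#if TARGET_HAVE_FTRUNCATE
  #define TARGET_NATIVE_FILE_TRUNCATE(fd,length,result) \
    do { result = (ftruncate(fd, length) == 0); } while (0)
#else /* a target like MinGW would select chsize() here instead */
  #define TARGET_NATIVE_FILE_TRUNCATE(fd,length,result) \
    do { result = (chsize(fd, length) == 0); } while (0)
#endif
```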

- some systems have very strange and even wrong header files. E. g. for
Windows/MinGW the headers sys/stat.h, io.h, windows.h, winbase.h are
needed for checking chsize() (truncate) or mkdir(). For embOS some
header files cannot even be included, because they are wrong (but cannot
be changed/fixed by aicas). These things make autoconf very complicated to
use: if there are a lot of possible functions which could implement some feature, it is not clear which function autoconf will detect for a
specific system. There can even be very bad
side effects if more than one function is available (e. g. f1() and f2()) and at some point f2() is used instead of f1() (with different behavior or limitations) because of a change made for another target system. E. g. you add some changes for RTEMS, but these also have effects on e. g. embOS; you will not detect this problem until you test all targets again after every change to autoconf. It is a little bit "non-deterministic" which features are detected and whether they are usable.

- some features are not detectable by autoconf at all, e. g. the ordering of
parameters for functions like inb() and outb() (we had that problem), or
additional parameters (which usually only produce a warning, which is
discarded), e. g. gethostbyname_r() under Solaris.
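The inb()/outb() case can be sketched as follows (my illustration with stand-in functions: real inb()/outb() access I/O ports and cannot be demonstrated portably, so two stub functions with swapped parameter order play their role here). The point is that the target header fixes the parameter order once, and callers never see the difference:

```c
#include <stdint.h>

static uint8_t g_port, g_value;

/* Stand-ins for two targets whose out-byte functions take their
   parameters in opposite order. */
static void outb_port_first(uint8_t port, uint8_t value)
{ g_port = port; g_value = value; }

static void outb_value_first(uint8_t value, uint8_t port)
{ g_port = port; g_value = value; }

/* Each target's header resolves the ordering exactly once; callers
   always write TARGET_NATIVE_IO_WRITE8(port, value). */
#ifdef TARGET_USES_VALUE_FIRST
  #define TARGET_NATIVE_IO_WRITE8(port,value) outb_value_first(value, port)
#else
  #define TARGET_NATIVE_IO_WRITE8(port,value) outb_port_first(port, value)
#endif
```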

There are many more difficulties which occur with autoconf. Replacing a target layer (e. g. TARGET_*) by autoconf only will imho make implementations for non-Unix-like systems very difficult and will only shift the so-called "complex" C-macro implementation into "complex"
autoconf-macro implementations (imho M4 is not much better than a
C preprocessor and is difficult to debug).

5. Multiple code - some statistics:

In the current implementation we use at aicas, we have the following
systems. The numbers below count the total number of macros (functions
and constants) which differ from the standard (generic)
implementation:

generic macros: 220

Linux: 0
Solaris: 11
RTEMS: 3
MinGW: 46
embOS: 10 (only partially implemented)

There are 0 (Linux) up to 46 (MinGW) special-case macros. Some have to be implemented because of different OS functions; some are implemented for efficiency (e. g. some OSes offer a POSIX thread interface, but the native thread interface is usually more efficient, and it can be used with the macro technique without any overhead). Thus for some targets 0..20% of the macros have to be reimplemented to cover special cases. For "Unix"-like systems it is usually less than 5%.

Some additional comments:

Efficient code: wrapper functions are nice, but in some cases overkill, e. g. when calling a simple function like sin(). In general C compilers do not optimize away such calls (imho that is one reason why "inline" was introduced). If "inline" can be used, macros are almost not needed anymore.

autoconf: autoconf is a good idea and aicas is also using it heavily, but there are some limitations, especially for embedded systems: because
autoconf cannot run test programs on embedded target systems, some tests
cannot be done with autoconf (see above). I had to replace many
autoconf test functions by special versions which can be used for embedded systems. And still there are many things which are difficult to handle, e. g. which include files have to be included for some native function. Some target systems make it really hard to use autoconf in the right way.

dead code: the standard GNU linker does not remove functions which are not used (dead code). Thus if at least one function is needed from an object file, all other functions from that object file are linked into the application, too. There is only one automatic way to remove dead code (-ffunction-sections), but this has other disadvantages; even the man page does not recommend it. Thus to remove the dead code of a function, some #ifdef-#endif around the function is needed. On the other side: a non-used macro does not produce any dead code.
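The #ifdef-#endif guard mentioned above would look roughly like this (my sketch; TARGET_NEED_FILE_EXISTS and do_file_exists() are invented names): the function is only compiled at all when a target configuration asks for it, so the linker can never drag it in as dead code.

```c
#include <stdio.h>

/* Hypothetical per-target switch: set to 0 for targets that never
   call this function, and no code for it ends up in the object file. */
#define TARGET_NEED_FILE_EXISTS 1

#if TARGET_NEED_FILE_EXISTS
static int do_file_exists(const char *path)
{
  FILE *f = fopen(path, "r");
  if (f != NULL) { fclose(f); return 1; }
  return 0;
}
#endif
```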

Some personal view:

By the way: I like "long" names and I hate uncommon abbreviations, e. g.
"fnctn" instead of "function". I also like prefixes which indicate where something belongs, e. g. "file_open()" instead of "open()". I usually have no problems with long names if the naming is consistent and useful. I also have no problems with lines longer than 80 characters, because my editor does not have an "optimal" line length.



These are my thoughts on this topic. I hope all developers who are interested in a target native layer will reconsider the current discussion. And I hope we will find a solution which can satisfy everybody.

Sincerely,

Torsten




