guile-devel
rtl metadata musings


From: Andy Wingo
Subject: rtl metadata musings
Date: Fri, 10 May 2013 07:07:31 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.2 (gnu/linux)

Hi,

For many days I have been hemming and hawing about how to serialize
debugging information into the new toolchain.  Here's a braindump of a
new plan.

To recap, the new toolchain has the new RTL assembly embedded in ELF.
There are 6 things we need to put in the ELF file somehow:

  (1) Procedure names and bounds.
  (2) Docstrings.
  (3) Generic procedure metadata (for procedure-properties).
  (4) Arity information (see docs for program-arities).
  (5) Information about local variables for the debugger.
  (6) Line numbers.

None of these things are "on the main path" -- loading a module
shouldn't even page any of this information into memory.  But it is all
useful to have, and sometimes you need to be able to access it
efficiently if it is there.

All of this data should be strippable from the .go files (which I guess
we should rename to .so files).  This constraint means there should be
no link from the "main" data out to the "debugging" data -- only the
other way around.  Otherwise stripping debug data could corrupt your
main program.

So those are the design constraints.

For (1) we use the standard ELF .symtab / .strtab mechanism.

For the rest I had considered encoding it all into DWARF, but I think it
makes more sense to let DWARF handle the things that it does best, namely
(5) and (6), and to provide special support for (2), (3), and (4).
You should be able to strip these different pieces separately.

(2): For docstrings, my idea is to make a .guile.docstr section with
entries like this:

  struct guile_docstring
  {
    Elf_Addr pc;
    Elf_Off str;
  };

The "pc" is the rtl-program-code of the procedure, and the "str" is an
offset into the .guile.docstrtab section, which is linked via the
.guile.docstr section's sh_link member.  Searching for a docstring
bisects .guile.docstr for (rtl-program-code prog) and then loads the
string from the table.
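
To make the lookup concrete, here is a minimal sketch of the bisection in
C.  It is only an illustration under assumptions: the 64-bit typedefs
stand in for the ELF types, the entries are assumed to be sorted by pc
(which bisection requires), and lookup_docstring is a hypothetical helper
name, not an existing Guile function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit stand-ins for the ELF types named in the text. */
typedef uint64_t Elf_Addr;
typedef uint64_t Elf_Off;

struct guile_docstring
{
  Elf_Addr pc;   /* rtl-program-code of the procedure */
  Elf_Off str;   /* offset into the .guile.docstrtab string table */
};

/* Hypothetical helper: bisect ENTRIES (N elements, assumed sorted by pc)
   for PC.  Returns a pointer into STRTAB, or NULL if there is no entry,
   i.e. the procedure has no docstring. */
static const char *
lookup_docstring (const struct guile_docstring *entries, size_t n,
                  const char *strtab, Elf_Addr pc)
{
  size_t lo = 0, hi = n;

  assert (n == 0 || entries != NULL);
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (entries[mid].pc == pc)
        return strtab + entries[mid].str;
      else if (entries[mid].pc < pc)
        lo = mid + 1;   /* target is in the upper half */
      else
        hi = mid;       /* target is in the lower half */
    }
  return NULL;
}
```

A miss costs only O(log n) comparisons and touches no string data, which
fits the goal of not paging in debugging information unless it is asked
for.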

(3): Of course it's possible for a procedure's "documentation" property
to not be a string, and procedures can have any number of other
properties:

  (lambda ()
    #((foo . qux)
      (bar . "hi")
      ...)
    10)

Procedures with extended metadata get an entry in .guile.procprops:

  struct guile_procprops
  {
    Elf_Addr pc;
    Elf_Addr data;
  };

Here "data" points to an "absolute" address of the property alist, which
is part of the .data section along with any other program literal data.
(The address is absolute relative to the ELF image; at runtime you have
to add the base address the image is loaded at.)
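
As a small illustration of that last point, the following C sketch turns
the image-relative "data" field into a usable pointer.  The typedef, the
image_base / image_size parameters and the procprops_data helper are all
assumptions made for the example, not part of any existing loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit stand-in for the ELF address type. */
typedef uint64_t Elf_Addr;

struct guile_procprops
{
  Elf_Addr pc;     /* rtl-program-code of the procedure */
  Elf_Addr data;   /* address of the property alist, relative to the image */
};

/* Hypothetical helper: add the load address of the image to the
   image-relative "data" field.  IMAGE_SIZE is only used as a sanity
   check that the address falls inside the mapped image. */
static const void *
procprops_data (const unsigned char *image_base, size_t image_size,
                const struct guile_procprops *entry)
{
  assert (entry->data < image_size);
  return image_base + entry->data;
}
```

Because the link goes from the .guile.procprops entry to .data and never
the other way, dropping the section leaves the alist itself untouched.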

As you might know, literals like conses are statically allocated in the
ELF memory image, but if they contain links to non-immediates like
symbols or other conses, those links need to be patched up when the ELF
is loaded.  In this way, generic procedure metadata does contribute to
runtime cost, because it needs relocation.  But it's not that common,
not too much work, and you don't need a guile_procprops entry if you
don't have extended metadata.

(4): Arity information describes the arities of the various case-lambda
clauses that a function has.  This information is used when printing a
function, to show the formals, and also when compiling, to check
arities.  It would be cleaner to have the compiler emit separate
functions for the different clauses, but that's not what happens now.
Anyway the plan is for another section, .guile.arities:

  struct guile_arity {
    Elf_Addr pc;
    Elf_Off size;
    nreq; // encodings for these not determined yet
    nopt;
    flags; // has-keyword-args, has-rest, is-case-lambda
    Elf_Off offset;
  };

An entry describes how many required and optional arguments a function
has, and whether it takes keyword or rest arguments.  The .guile.arities
section is prefixed by a length indicating how many entries there are,
followed by all the arity structures, sorted by pc.  Note that one arity
may contain another!  In particular, for case-lambda clauses you can
have one arity for the whole function, then a number of other ones for
the cases.
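
Since arities can nest, a lookup by pc has to pick the innermost one.
Here is a rough C sketch of how that could work.  It rests on several
assumptions: the field widths for nreq / nopt / flags are placeholders
(their encodings are not decided), "size" is read as the number of bytes
of code the arity covers starting at pc, and find_arity is a hypothetical
name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-bit stand-ins for the ELF types. */
typedef uint64_t Elf_Addr;
typedef uint64_t Elf_Off;

struct guile_arity
{
  Elf_Addr pc;
  Elf_Off size;      /* assumed: bytes of code covered, starting at pc */
  uint32_t nreq;     /* placeholder encoding */
  uint32_t nopt;     /* placeholder encoding */
  uint32_t flags;    /* has-keyword-args, has-rest, is-case-lambda */
  Elf_Off offset;
};

/* Hypothetical helper: return the smallest arity whose range
   [pc, pc + size) contains PC, or NULL if none does.  ENTRIES must be
   sorted by pc, so the scan can stop at the first entry past PC. */
static const struct guile_arity *
find_arity (const struct guile_arity *entries, size_t n, Elf_Addr pc)
{
  const struct guile_arity *best = NULL;

  assert (n == 0 || entries != NULL);
  for (size_t i = 0; i < n && entries[i].pc <= pc; i++)
    if (pc - entries[i].pc < entries[i].size
        && (best == NULL || entries[i].size < best->size))
      best = &entries[i];
  return best;
}
```

The linear scan is just the simplest thing that handles nesting; a real
implementation would probably bisect to the neighbourhood of pc first.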

After the arities, you have a block of offsets to another string table
to give the names and to give more information on keywords.  So all in
all it looks like this:

  Elf_Off n_arity_entries;
  struct guile_arity foo_arity = { PC, SIZE, 1, 2, 0, OFFSET };
  ...
OFFSET:
  X -> offset into associated .guile.arities_strtab for first req. arg
  Y -> offset into associated .guile.arities_strtab for first opt. arg
  Z -> offset into associated .guile.arities_strtab for second opt. arg
  offsets for next function...

As with procedure metadata, an arity with keyword arguments would carry
an absolute address into the .data section, pointing to the keywords
literal associated with this clause.
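
To show how the offsets block and the string table fit together, here is
a hedged C sketch that fetches the name of the i-th formal of an arity.
It assumes that "offset" is a byte offset from the start of the
.guile.arities section, that each name slot is one Elf_Off, and that the
required names come first, followed by the optional ones.  The field
widths and the arity_formal_name helper are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 64-bit stand-ins for the ELF types. */
typedef uint64_t Elf_Addr;
typedef uint64_t Elf_Off;

struct guile_arity
{
  Elf_Addr pc;
  Elf_Off size;
  uint32_t nreq;     /* placeholder encoding */
  uint32_t nopt;     /* placeholder encoding */
  uint32_t flags;    /* has-keyword-args, has-rest, is-case-lambda */
  Elf_Off offset;    /* assumed: byte offset of this arity's name slots,
                        measured from the start of .guile.arities */
};

/* Hypothetical helper: return the name of formal number I (required
   names first, then optional ones) as a pointer into STRTAB, the
   associated .guile.arities_strtab. */
static const char *
arity_formal_name (const unsigned char *section, const char *strtab,
                   const struct guile_arity *arity, uint32_t i)
{
  Elf_Off name;

  assert (i < arity->nreq + arity->nopt);
  /* memcpy avoids any alignment assumption about the offsets block. */
  memcpy (&name, section + arity->offset + (size_t) i * sizeof name,
          sizeof name);
  return strtab + name;
}
```

Two arities that take the same formal can simply store the same string
table offset, which is where the storage sharing comes from.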

In this way we can share storage for formal parameters, have easy access
to arities without too much searching or consing, and also be able to
strip the arities section if needed without affecting anything else.

(5) and (6): Local variable information and line numbers can go into
.debug_info / .debug_line / .debug_str as usual with DWARF.  DWARF
handles this well.  Not sure if I want to try to encode arity
information into DWARF; at least in the beginning it won't be necessary,
so I'll avoid it.

OK this thought was burning my neuron this morning and I wanted to get
it out.  I'll start working on it shortly.

Andy
-- 
http://wingolog.org/


