tinycc-devel

Re: [Tinycc-devel] TinyCC REPL


From: David Mertens
Subject: Re: [Tinycc-devel] TinyCC REPL
Date: Thu, 14 May 2015 09:35:12 -0400

Hey Sergey,

Thanks for this! I am impressed by your efforts! Unfortunately, I think this is a bit premature. I have more work to do on my exsymtab fork before I bring it up for inclusion in tcc itself. (Indeed, I am not entirely sure that it is appropriate for inclusion in tcc.) Here are a few reasons we should wait a little while longer:
  1. The extended symbol table API is not documented at all.
  2. The symbol table copy takes O(N_sym^2) to run. It might be possible to speed this up, but for the moment I plan to work around it by implementing symbol table caching. I have not yet completed that work, so I consider the exsymtab project to be incomplete at the moment.
  3. This work is mostly of interest to folks using tcc as an in-memory C JIT compiler. In its current form it is nearly useless to those who want a fast compiler that produces binaries. After I implement symbol table caching, it may prove a bit more useful to the fast-compiler crowd, but it's not ready yet. We need to have a wider discussion about the merits of the extension before we include it in mob.
  4. As implemented, this cuts the maximum number of token symbols in half. (I needed to use one of the bits to indicate "extended symbol".)
  5. The current token lookup is based on a compressed trie that explicitly only supports A-Z, a-z, 0-9, and _. It does not support $ in identifiers and would need to be revised in order to do so. I chose to implement Phil Bagwell's Array Mapped Trie in the belief that it would perform better than a hash for lookup. Once I add symbol table caching, I hope to add (or switch to) Array Compressed Tries for even better cache utilization. But currently, I rely on having exactly 63 allowed characters in identifiers.
  6. I know absolutely nothing about how the compilation and relocation stages modify the members of the symbol tables. It is a black box to me. As such, the copy procedure is a pile of ad-hoc data structure tests that is, in all likelihood, subtly broken and quite brittle. Adding this in its current state to tcc's codebase, especially without sufficient tests, could dampen efforts to change the symbol table handling or code generation.
  7. A separate idea that I plan to pursue on my fork is to extend how tcc pulls data in from file handles. I would like to make it hookable so that I could write hooks in my Perl module and have it interact directly with Perl's parser, rather than pulling all of the C code into a temporary buffer. This may go beyond the wishes of the community and merits further discussion.
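
To make point 5 concrete, here is a minimal sketch of an array mapped trie over the 63-character identifier alphabet. It is not the exsymtab code: every name in it (amt_node, amt_char_index, amt_child, amt_insert, amt_lookup) is hypothetical, and the one-character-per-level layout is an assumption made for illustration. It does show one plausible reason the 63-character limit matters: each character maps to one bit of a 64-bit bitmap, so adding $ would use up the last spare bit.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only: hypothetical names, not the exsymtab implementation. */

/* Map an identifier character to an index in 0..62, or -1 if unsupported.
 * Assumes an ASCII-compatible character set (contiguous letter ranges). */
static int amt_char_index(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';          /* 0..25  */
    if (c >= 'a' && c <= 'z') return 26 + (c - 'a');   /* 26..51 */
    if (c >= '0' && c <= '9') return 52 + (c - '0');   /* 52..61 */
    if (c == '_') return 62;
    return -1;                                         /* e.g. '$' */
}

typedef struct amt_node {
    uint64_t bitmap;            /* bit i set => a child exists for index i */
    struct amt_node **children; /* dense array, one entry per set bit */
    void *value;                /* payload if an identifier ends here */
} amt_node;

/* Portable population count (number of set bits). */
static int popcount64(uint64_t x)
{
    int n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

/* Find the child for a character index, or NULL if there is none.
 * The child's slot is the number of set bits below its own bit. */
static amt_node *amt_child(const amt_node *n, int idx)
{
    uint64_t bit = (uint64_t)1 << idx;
    if (!(n->bitmap & bit)) return NULL;
    return n->children[popcount64(n->bitmap & (bit - 1))];
}

static amt_node *amt_new(void) { return calloc(1, sizeof(amt_node)); }

/* Insert name -> value. Returns 0 on success, -1 on an unsupported
 * character or allocation failure. */
int amt_insert(amt_node *root, const char *name, void *value)
{
    amt_node *n = root;
    for (; *name; name++) {
        int idx = amt_char_index((unsigned char)*name);
        if (idx < 0) return -1;
        amt_node *child = amt_child(n, idx);
        if (!child) {
            uint64_t bit = (uint64_t)1 << idx;
            int pos = popcount64(n->bitmap & (bit - 1));
            int count = popcount64(n->bitmap);
            amt_node **grown = realloc(n->children,
                                       (count + 1) * sizeof *grown);
            if (!grown) return -1;
            n->children = grown;
            child = amt_new();
            if (!child) return -1;
            /* Shift later children up to keep the array in bit order. */
            memmove(grown + pos + 1, grown + pos,
                    (count - pos) * sizeof *grown);
            grown[pos] = child;
            n->bitmap |= bit;
        }
        n = child;
    }
    n->value = value;
    return 0;
}

/* Look up name. Returns the stored value, or NULL if it is absent. */
void *amt_lookup(const amt_node *root, const char *name)
{
    const amt_node *n = root;
    for (; *name && n; name++) {
        int idx = amt_char_index((unsigned char)*name);
        if (idx < 0) return NULL;
        n = amt_child(n, idx);
    }
    return n ? n->value : NULL;
}
```

In this sketch, an identifier containing $ is rejected by amt_char_index before the trie is touched, which is the kind of limitation described in point 5.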

For these reasons, I do not believe that the exsymtab fork, in its current state, should be brought into the mob branch. I am more than happy to have help, but let's wait a few more months until most of these issues have been ironed out and we all have had a chance to discuss the merits and drawbacks of extended symbol table support.


David

