Re: Emacs contributions, C and Lisp

From: David Engster
Subject: Re: Emacs contributions, C and Lisp
Date: Mon, 12 Jan 2015 21:41:41 +0100
User-agent: Gnus/5.13001 (Ma Gnus v0.10) Emacs/24.3.91 (gnu/linux)

Helmut Eller writes:
> On Mon, Jan 12 2015, David Engster wrote:
>> The first step would have been to replace our existing C++ parser with
>> the AST that is produced by GCC. The plugin would output the same LISP
>> structures that Semantic uses.
> I'm a bit confused because on the one hand you seem to say that certain
> things are not possible with plugins, but on the other hand you seem to
> think that plugins can dump enough information to make these things
> possible.

I'm not sure how familiar you are with CEDET. We already have the
infrastructure to parse local expressions and calculate completions
based on a database of "tags", which are structures generated from our
own C++ parser. As a first step, I wanted to replace only the parser,
meaning the part which creates the AST. The actual "rules" of C++ are
coded in Semantic (to varying degrees).

>> My work so far was mainly to investigate
>> how C++ types are actually stored in the AST. Especially the template
>> stuff is pretty weird, and documentation is sparse. Fortunately, the
>> headers are pretty well commented, but it still involves a *lot* of
>> trial and error.
> I can imagine that templates are complicated.  I tried to implement a
> find-definition command as a GCC plugin.  My first approach was to
> search the smallest subtree that contains a particular source location.
> That didn't work out because GCC doesn't record "source ranges" so it's
> difficult to know if a tree covers a particular location.  Another
> problem is that identifiers are resolved early, e.g. "x + y" produces a
> PLUS_EXPR (with the source location pointing to the + sign) but the
> arguments are pointers to the VAR_DECLs of x and y and the source
> location of those VAR_DECLs is typically a few lines earlier.
> In a second attempt I made Emacs insert a custom #pragma at the place
> where we want to search for a definition; similar to the gccsense
> approach.  Plugins can register pragmas and that way have access to the
> lexer.  That kinda works but the problem is that pragmas are only
> allowed in certain places (e.g. at the end of a statement) and Emacs has
> to guess where those places are.

Indeed, the main difficulty here is to find the correct location in the
AST when you only have line/column information. But when using CEDET,
your source file will already be parsed, so Semantic has type
information for your symbols, meaning it would already know the types
of "x" and "y". You could directly ask the GCC plugin for the
definition of that actual type (it probably wouldn't even have to call
the plugin, because Semantic has a database for types).
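
To make the location problem concrete: if every node carried a start and
end position (which, as noted in the quoted text, GCC does not record),
finding the node under point would be a simple descent. The following is
only a sketch under that assumption; `Node` and `smallest_enclosing` are
hypothetical names, not part of GCC or Semantic.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Pos = Tuple[int, int]  # (line, column)


@dataclass
class Node:
    """Hypothetical AST node that carries a source range (assumption)."""
    label: str
    start: Pos
    end: Pos  # inclusive
    children: List["Node"] = field(default_factory=list)


def smallest_enclosing(node: Node, pos: Pos) -> Optional[Node]:
    """Return the deepest node whose range contains pos, or None."""
    if not (node.start <= pos <= node.end):
        return None
    # Prefer a child that also contains the position.
    for child in node.children:
        hit = smallest_enclosing(child, pos)
        if hit is not None:
            return hit
    return node
```

Without end positions the containment test in the first line of the
function cannot be written, which is why this approach fails when a node
only has a single location.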

>> The actual "semantic" part of parsing C++ would still be handled by
>> Emacs' Semantic package. For instance, it would calculate
>> completions. So obviously, those completions wouldn't match those from
>> libclang w.r.t. accuracy, but they would be *much* better than they
>> are now, especially because the preprocessor is already handled, which
>> is currently one of Semantic's main problems. Also, type inference would
>> already be done by GCC, so you would see the resulting type from 'auto'
>> and such.
> Is the idea to let GCC output some "global" information like type
> declarations to enable better "local" parsing of function bodies in
> Emacs?  Or do you want to do pretty much all parsing in GCC?

Here's how Semantic currently does it: when you load a file, it first
parses only declarations and function signatures, i.e. something like a
"shallow parse" (with depth=0). When you put your cursor in a function,
it calls the parser again and asks it to parse only that function's
body, which would be a depth=1 parse. I wanted to try a similar thing
with the GCC plugin: by default it would do a "shallow parse", skipping
things like function bodies and outputting only their signatures. Later,
you would pass parameters such as the name of a function for which you'd
like the detailed AST, or request only things like local variables.
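
The shallow/deep split described here could be sketched roughly as
follows. This is an illustration only: `Decl` and `dump` are hypothetical
names, the dictionary layout is an assumption, and nothing here reflects
the actual plugin or Semantic's parser interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Decl:
    """Hypothetical declaration node (not a GCC or Semantic structure)."""
    name: str
    kind: str   # e.g. "function", "variable", "type"
    type: str = ""
    body: List["Decl"] = field(default_factory=list)  # local declarations


def dump(decls, expand=None):
    """Emit signatures only, except for functions named in `expand`."""
    expand = expand or set()
    out = []
    for d in decls:
        entry = {"name": d.name, "kind": d.kind, "type": d.type}
        # Only descend into a body when it was explicitly requested.
        if d.kind == "function" and d.name in expand:
            entry["locals"] = dump(d.body, expand)
        out.append(entry)
    return out
```

Calling `dump(decls)` corresponds to the default shallow pass, while
`dump(decls, {"main"})` would additionally list the locals of a function
named `main`.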

>> My plan was also to make this plugin usable for other tools. That means,
>> it should not only output LISP structures, but alternatively also JSON
>> and possibly XML. For instance, an external tool could build a symbol
>> database for providing references. This could also serve as a starting
>> point for doing refactoring. For more complicated tasks, the plugin
>> could provide an AST matcher which you can query with certain
>> expressions.
> In general I think the heavy lifting should be done in GCC+plugin and
> Emacs should only do the "easy" stuff like displaying the result.  But
> for performance and other reasons it might be necessary to do at least
> some parsing in Emacs too.
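
The multi-format output mentioned in the quoted text (Lisp structures or
JSON from the same data) could look something like the sketch below. The
property-list layout is an assumption chosen for illustration and is not
Semantic's actual tag format; `Sym`, `to_sexp` and `to_json` are
hypothetical helpers.

```python
import json


class Sym(str):
    """Marks a value that should be printed as a Lisp symbol, not a string."""


def to_sexp(value):
    """Render nested dicts/lists/scalars as a Lisp-readable expression."""
    if isinstance(value, Sym):
        return str(value)
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    if value is None or value is False:
        return "nil"
    if value is True:
        return "t"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, dict):
        # Dicts become property lists: (:key value ...)
        parts = [":" + k + " " + to_sexp(v) for k, v in value.items()]
        return "(" + " ".join(parts) + ")"
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(to_sexp(v) for v in value) + ")"
    raise TypeError("cannot serialize " + type(value).__name__)


def to_json(value):
    """Render the same structure as JSON for non-Emacs consumers."""
    return json.dumps(value)
```

For a value such as `{"name": "foo", "class": Sym("function")}`, `to_sexp`
produces a form Emacs could `read`, and `to_json` produces the equivalent
object for an external tool.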

Again, I'm not very familiar with GCC, which is why I wanted to do as
much in Elisp as possible, meaning to re-use as much from CEDET as
possible. My primary goal was to make the C++ parser more accurate and

