Re: operators, datatypes, keywords
Sat, 15 Aug 2009 13:29:29 +0100
2009/8/14 Mark Polesky <address@hidden>:
> I'm trying to get a better understanding of LilyPond the
> "programming language". Can someone look this over? Am I on the
> right track? As an example of some of the confusion I'm having,
> it seems to me that there are 3 categories of datatypes. That
> seems wrong, but I don't know how to resolve it. Can anyone
> suggest better terminology or clarify any misconceptions that I
> seem to have based on this?
It would probably make more sense if you studied lexer.ll and
parser.yy, even if you don't understand what's happening in these
files, since there are comments dotted about which help to clarify
some details. You might even notice some interesting features which
aren't documented; for example, here are two things I didn't know
about until a few days ago:
1. `+' can be used to concatenate strings:
foo = "bar" + "baz"
2. \include can take an identifier as argument:
foo = #"path/to/myfile.ily"
\include \foo
The following comments may be incorrect, so take them with a hefty
pinch of salt. :)
> * the first line is indexed with double-quotes in the grammar
> appendix, and the second line with single-quotes. Does this
> imply any categorical difference?
No, it's just a consequence of how these items are coded:
single-character tokens are a `char' type, so they are enclosed in
single quotes; multi-character tokens are what's called a `C string',
i.e., a null-terminated array of chars.
There is at least one exception: the backslash; I'm not sure why it's
enclosed in double quotes.
> * is "operator" the right term for this category?
They're just called `string tokens' in the parser.
> (datatypes whose type-check predicates are defined in C++)
> * using the method discussed here:
> box context dimension dir dispatcher duration font-metric grob
> grob-array input-location item iterator lily-lexer lily-parser
> listener moment music music-function music-list music-output
> otf-font output-def page-marker pango-font paper-book paper-system
> pitch prob score simple-closure skyline skyline-pair source-file
> spanner stencil stream-event translator translator-group
These are smobs: Guile `small objects', i.e., C++ classes exposed to
Scheme, each with its own type-check predicate (ly:grob?, ly:music?,
etc.).
> (datatypes whose type-check predicates are defined in scheme)
> * I don't know how to determine if there are others in this
> category. Is there a way?
Any important ones (that are used for documenting properties) should
appear in type-p-name-alist.
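If you want to dump its contents, a quick sketch (run from a .ly
file; this assumes type-p-name-alist pairs each predicate with its
printable name, which may vary between versions):

```scheme
#(for-each
  (lambda (entry)
    (display (car entry)) (display " -> ")
    (display (cdr entry)) (newline))
  type-p-name-alist)
```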
> (datatypes derived from the grammar appendix)
> * these are clearly of a different sort than the previous two.
> Can someone explain?
These aren't datatypes, they're integral types representing a
particular token (you can imagine them as being part of an
enumeration, where each item evaluates to a unique integer).
The parser calls them `artificial' tokens, since they're not the
direct result of the lexer matching a series of characters in the
input; rather, they're internal tokens which the lexer generates
itself. A
good example would be the keywords for markup commands: whereas
\markup is a direct keyword match, once the lexer has caught this, it
generates another token based on the signature of the markup command,
which ensures the parser knows what to expect in terms of arguments to
the function (e.g., for \musicglyph, it would return MARKUP_HEAD_SCM0,
to ensure the parser knows to expect a list consisting of the markup
head for this command (a function) followed by a scheme type).
> * why is "error" uncapitalized? should it be ALL-CAPS, or does it
> not belong in this category?
It has nothing to do with the others; it's just the sorting algorithm
that has placed it amongst them. It isn't defined anywhere in
LilyPond's source: `error' is a token Bison reserves internally for
error recovery.
> * a list of the names of all the commands listed in the index of
> the grammar appendix should be equivalent to a list of all the
> LilyPond "keywords", I think. Let me know if this is wrong.
This is true for the quoted escaped keywords (e.g., "\\accepts"), but
not if you're looking at the tokens (in caps), since C[haracter]
isn't a keyword (again, it only appears amongst the keywords because
of the sorting). C[haracter] seems to be a catch-all token for any
invalid escaped character (e.g., \&).
You can get the list of keywords using ly:lexer-keywords:
#(display (ly:lexer-keywords (ly:parser-lexer parser)))
> * is there a proper name for the associated command-set?
> I would call them "core commands"; is there a better term?
lexer keywords/reserved words