pspp-dev

Re: Lexer woes


From: John Darrington
Subject: Re: Lexer woes
Date: Wed, 24 Sep 2008 07:05:16 +0800
User-agent: Mutt/1.5.18 (2008-05-17)

I was using q2c.  I'll see if I can hack it to work like this.

Sometime in the future though, I think we're going to need something
better.  In particular, I think it'd be really great to be able to do
command line completion, both for pspp and psppire (I understand spss
v17 does it already).  So, for example, given a partial syntax like

    DESCRIPTIVES /VARIABLES=

if the user hits tab at this point, she'll get a list of the variable
names in the dictionary (but only those which are appropriate, i.e. not
scratch, string, etc.).  Similarly, for any given command, it'd be good
to have a list of subcommands valid for that command.
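
A minimal sketch of the filtering step such completion would need, under
stated assumptions: struct var_info, is_scratch, has_prefix and
complete_numeric_vars are hypothetical names made up for illustration,
not PSPP's dictionary API, and "width 0 means numeric" is an assumed
convention.  Scratch variables are taken to be those whose names begin
with '#'.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for a dictionary variable; not PSPP's own type. */
struct var_info
  {
    const char *name;
    int width;                  /* Assumed: 0 for numeric, >0 for string. */
  };

/* Scratch variables are the ones whose names start with '#'. */
static int
is_scratch (const struct var_info *v)
{
  return v->name[0] == '#';
}

/* Returns nonzero if NAME begins with PREFIX, ignoring case. */
static int
has_prefix (const char *name, const char *prefix)
{
  for (; *prefix != '\0'; name++, prefix++)
    if (toupper ((unsigned char) *name) != toupper ((unsigned char) *prefix))
      return 0;
  return 1;
}

/* Stores in OUT up to MAX names of the numeric, non-scratch variables
   among VARS[0..N-1] whose names start with PREFIX.
   Returns the number of names stored. */
static size_t
complete_numeric_vars (const struct var_info *vars, size_t n,
                       const char *prefix, const char **out, size_t max)
{
  size_t n_out = 0;
  size_t i;

  assert (prefix != NULL);
  for (i = 0; i < n && n_out < max; i++)
    if (vars[i].width == 0
        && !is_scratch (&vars[i])
        && has_prefix (vars[i].name, prefix))
      out[n_out++] = vars[i].name;
  return n_out;
}
```

With an empty prefix, as after "/VARIABLES=", every numeric non-scratch
variable is offered; once the user has typed a few letters, the same
call narrows the list.  Which variable types are appropriate would in
practice depend on the subcommand, which this sketch does not model.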


But back to the current issue: parsing K-W as three tokens, whilst it
will work for the purpose of syntax verification, obviously falls down
in the bigger picture.  The obvious solution would have been to allow
'-' as a valid character in the T_ID token.  However, this means that
constructs like

 COMPUTE X=Y/K-W.

suddenly get misinterpreted.  But so far as I can see, there are only
a few special places in spss syntax where algebraic expressions like
that can occur (in IF, LOOP, COMPUTE, RECODE and a few others).  I
wonder if it might not be a better solution to throw the lexer into a
different mode when an expression is expected.  Obviously there will
be complications (like when to switch back to non-expression mode).
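
A rough sketch of the mode idea, assuming a hand-written scanner: in
command mode a '-' that is followed by a letter or digit continues an
identifier, while in expression mode it is always an operator.  The
names lex_mode, next_token and tokenize are made up for illustration
and are not PSPP's lexer API; numbers, strings and the question of who
switches modes are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not PSPP's real lexer. */
enum lex_mode { MODE_COMMAND, MODE_EXPRESSION };

/* Copies the next token of *SP into TOK (TOK_SIZE bytes) and advances *SP.
   Returns 0 at end of input, 1 otherwise. */
static int
next_token (const char **sp, enum lex_mode mode, char *tok, size_t tok_size)
{
  const char *s = *sp;
  size_t n = 0;

  assert (tok_size >= 2);
  while (isspace ((unsigned char) *s))
    s++;
  if (*s == '\0')
    {
      *sp = s;
      return 0;
    }

  if (isalpha ((unsigned char) *s))
    {
      /* In command mode only, '-' continues an identifier when a letter
         or digit follows it, so "K-W" stays in one piece. */
      while (n + 1 < tok_size
             && (isalnum ((unsigned char) *s)
                 || (mode == MODE_COMMAND && *s == '-'
                     && isalnum ((unsigned char) s[1]))))
        tok[n++] = *s++;
    }
  else
    tok[n++] = *s++;            /* Single-character punctuation or operator. */

  tok[n] = '\0';
  *sp = s;
  return 1;
}

/* Writes the tokens of S into OUT (OUT_SIZE bytes), separated by " | ". */
static void
tokenize (const char *s, enum lex_mode mode, char *out, size_t out_size)
{
  char tok[64];

  out[0] = '\0';
  while (next_token (&s, mode, tok, sizeof tok))
    {
      if (out[0] != '\0')
        strncat (out, " | ", out_size - strlen (out) - 1);
      strncat (out, tok, out_size - strlen (out) - 1);
    }
}
```

Under these assumptions, tokenize ("/TEST = K-W", MODE_COMMAND, ...)
keeps K-W as a single token, whereas tokenize ("X=Y/K-W.",
MODE_EXPRESSION, ...) splits it into K, '-' and W.  The sketch leaves
the mode to the caller, which is exactly where the complication about
switching back would have to be solved.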

J'


On Tue, Sep 23, 2008 at 08:12:56AM -0700, Ben Pfaff wrote:
     Are you parsing by hand or with q2c?  If you're doing it by hand,
     then you should be able to do something like this:
     
             if (lex_token (lexer) == T_ID
                 && !strcmp (lex_tokid (lexer), "K")
                 && lex_look_ahead (lexer) == '-')
               {
                 /* We know we're at "K-".  The only acceptable
                    follow-on to this is "W". */
                 lex_get (lexer);
                 lex_force_match (lexer, '-');
                 if (!lex_force_match_id (lexer, "W"))
                   {
                      ....abort parsing....
                   }
                 ...got K-W...
               }
             else
               {
                 ...not K-W...
               }
     
     If you are using q2c, we will have to teach it a similar trick.

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.

