Unquoted strings in BASIC
From: Maury Markowitz
Subject: Unquoted strings in BASIC
Date: Sun, 1 Dec 2024 10:07:45 -0500

I seem to have programmed myself into a corner, and I'm hoping someone can
offer some suggestions.
Source code here: https://github.com/maurymarkowitz/RetroBASIC/tree/master/src
The BASIC language has two constant types, numbers and strings. The ~300 line
scanner is basically a list of the keywords and then:
[0-9]*[0-9.][0-9]*([Ee][-+]?[0-9]+)? {
    yylval.d = strtod(yytext, NULL);
    return NUMBER;
}
\"[^"\n]*[\"\n] {
    /* drop the closing quote (or line end), then skip the opening quote */
    yytext[strlen(yytext) - 1] = '\0';
    yylval.s = str_new(yytext + 1);
    return STRING;
}
Over in my ~1700 line parser, I have the concept of an expression, which is
either of these constants along with functions, operators etc. The language
also has list-like operators like:
PRINT A,B,C
Which I implemented as an exprlist:
exprlist:
    expression
    {
        $$ = lst_prepend(NULL, $1);
    }
    |
    exprlist ',' expression
    {
        $$ = lst_append($1, $3);
    }
    ;
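
To make the list-building concrete, here is a minimal C sketch of how those two actions assemble PRINT A,B,C in source order. The names node_t, toy_prepend and toy_append are stand-ins invented for illustration; the real lst_prepend and lst_append in RetroBASIC may well differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical singly linked list node, for illustration only. */
typedef struct node {
    void *data;
    struct node *next;
} node_t;

/* Put a new node in front of head and return the new head. */
node_t *toy_prepend(node_t *head, void *data)
{
    node_t *n = malloc(sizeof *n);
    assert(n != NULL);
    n->data = data;
    n->next = head;
    return n;
}

/* Walk to the tail, hang a new node there, and return the (unchanged) head. */
node_t *toy_append(node_t *head, void *data)
{
    node_t *n = toy_prepend(NULL, data);
    node_t *tail = head;

    if (head == NULL)
        return n;
    while (tail->next != NULL)
        tail = tail->next;
    tail->next = n;
    return head;
}
```

Because the rule is left-recursive, the first expression is reduced first (the prepend onto NULL) and each later one is appended, so walking the list from the head visits A, then B, then C.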
The problem arises in the DATA statement, which is normally something along the
lines of:
DATA 10,20,"HELLO","WORLD!"
I parse this as the statement token and then an exprlist:
DATA exprlist
{
    statement_t *new = make_statement(DATA);
    new->parms.data = $2;
    $$ = new;
}
I can then read the values at runtime by walking down the list. But many
dialects allow strings to be unquoted as long as they do not contain a line
end, colon or comma:
DATA 10,20,HELLO,WORLD!
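
To pin down the rule I am trying to implement, here is a rough sketch in plain C, outside of flex and bison, that pulls items off the text following DATA. The function next_data_item is hypothetical and not part of RetroBASIC. It assumes leading and trailing blanks are dropped from unquoted items, and that a quoted item may end at the line end as in my STRING pattern; dialects differ on both points.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the next DATA item from *cursor into out
   (truncated to outsize - 1 characters) and advance *cursor.
   Returns 1 if an item was read, 0 once the statement is exhausted. */
int next_data_item(const char **cursor, char *out, size_t outsize)
{
    const char *p = *cursor;
    size_t n = 0;

    assert(out != NULL && outsize > 0);
    if (p == NULL)
        return 0;                       /* an earlier call hit the end */

    while (*p == ' ' || *p == '\t')     /* skip leading blanks */
        p++;

    if (*p == '"') {
        /* Quoted item: commas and colons inside are ordinary characters. */
        p++;
        while (*p != '\0' && *p != '\n' && *p != '"') {
            if (n + 1 < outsize)
                out[n++] = *p;
            p++;
        }
        if (*p == '"')
            p++;
        /* Ignore anything between the closing quote and the separator.
           strchr also matches the terminating NUL, so this stops there. */
        while (strchr(",:\n", *p) == NULL)
            p++;
    } else {
        /* Unquoted item: runs up to a comma, colon, line end or NUL. */
        while (strchr(",:\n", *p) == NULL) {
            if (n + 1 < outsize)
                out[n++] = *p;
            p++;
        }
        while (n > 0 && (out[n - 1] == ' ' || out[n - 1] == '\t'))
            n--;                        /* trim trailing blanks */
    }
    out[n] = '\0';

    /* A comma means another item follows; anything else ends the list. */
    *cursor = (*p == ',') ? p + 1 : NULL;
    return 1;
}
```

Called repeatedly on the text after DATA in the line above, this yields 10, 20, HELLO and WORLD! and then reports the end of the list. Numbers come back as text here; deciding whether an item is a number or a string is left to whoever consumes it.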
I am looking for ways to attack this. I tried this in my scanner:
[\,\:\n].*[\,\:\n] {
    yytext[strlen(yytext) - 1] = '\0';
    yylval.s = str_new(yytext + 1);
    return STRING;
}
... but that captures too much and I get errors on every line. Many variations
on this theme either cause errors everywhere or fail to parse the DATA line.
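
My reading of why it over-captures: flex takes the longest match, and . matches anything except a newline, so the rule runs from the first comma, colon or newline it sees to the last delimiter on that line. The function below hand-simulates that in plain C. It is a hypothetical illustration under that reading, not flex itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simulation of the longest match for [,:\n].*[,:\n]
   starting at s[0]. Returns the match length, or 0 if the rule
   cannot match at this position. */
size_t greedy_match_len(const char *s)
{
    size_t i, last = 0;

    assert(s != NULL);
    if (s[0] != ',' && s[0] != ':' && s[0] != '\n')
        return 0;                       /* must start on a delimiter */

    for (i = 1; s[i] != '\0'; i++) {
        if (s[i] == ',' || s[i] == ':' || s[i] == '\n')
            last = i;                   /* latest possible closing delimiter */
        if (s[i] == '\n')
            break;                      /* '.' cannot cross a line end */
    }
    return last != 0 ? last + 1 : 0;
}
```

For PRINT A,B,C followed by a line end, the scanner reaches the first comma and this rule matches the five characters from that comma through the line end, which beats the one-character ',' token and returns a STRING instead. Starting from a newline it swallows the whole following line, which would explain the errors on every line.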
I am not sure where to fix this: in the scanner, with a pattern for an
"unquoted string"; in the parser, as a separate "datalist"; or with a mix of
both. Can someone suggest an approach?