[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Possible improvements to quoted string handling in 2.0.7p3...
From: |
Phil D'Amore |
Subject: |
Possible improvements to quoted string handling in 2.0.7p3... |
Date: |
Thu, 29 May 2003 14:44:09 -0400 |
User-agent: |
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030314 |
This is a bit long-winded. I could not think of a short way to explain it.
As I was trying to move from 2.0.3 to 2.0.7p3, I had the following
editfiles item that started to give me an issue:
editfiles:
{ /tmp/ntpd
LocateLineMatching ".*/usr/sbin/ntpdate -s -b -p 8 -u.*"
ReplaceLineWith " /usr/sbin/ntpdate -s -b -p 8 \"
DefineClasses "ntpconfig"
}
Because quote escaping in 2.0.3, well, wasn't, this worked fine. Moving
to 2.0.7p3 with the quoting improvements and the line continuation, it
was a parse error. I expected that. I'm referring to the \" nonsense
on the ReplaceLineWith line. That should print in the output file as:
/usr/sbin/ntpdate -s -b -p 8 \
as it is a shell line continuation, with the newline immediately after
the \. What this actually caused was a wrap-around to the folowing
line, and the lexer picked up the whole string "
/usr/sbin/ntpdate -s -b -p 8 "
DefineClasses " as the qstring, of course removing the \ in the
\" which happens in RemoveEscapeSequences().
What I also expected was that the following would fix it:
ReplaceLineWith " /usr/sbin/ntpdate -s -b -p 8 \\"
Which *should* have negated the special meaning of the \ and allowed the
" to terminate the string. Didn't happen. Looking at this, I saw that
the regex for a qstring in the lexer no longer looks for the newlines,
most likely to support the line continuation. Trying to figure out a
fix for this, I noticed a number of things that could make qstring
parsing more natural:
in parse.c
--------------
1. In RemoveEscapeSequences(), there was no way to escape an escape
character, thus partially leading to my problem. I have added that.
2. In RemoveEscapeSequences(), there was no way to escape a newline.
This would be really cool to have, so you have the best of both worlds
of line continuation. If you need the newline in your string, say in a
long InsertLine inside editfiles (we would like to use this for keeping
ssh keys updated for example), you can just slap the newline in and know
it will be stored as part of the string. If you just want to break a
long line up for readability in the cfengine config, you can escape it
and the resulant string will be sans newlines. I have added that.
3. I felt that RemoveEscapeSequences was not being used to its fullest
potential in its position to detect the end of a string. Trying to do
this with the regex for a qstring worked before, but at this point I
think this would require look-behind capability which is beyond what a
lex regex provides, to detect the case where the \ before a " character
is itself escaped, and should not modify the ". This is better handled
in code after the fact. More on this in a second, but the net effect is
that I have modified this function to return a value. This function
also now looks for quote characters that signify the true end of the
string, thus relieving the qstring regex of that duty.
in cflex.l.in
----------------
3 major things here, duplicated in both {qstring}s (<INASSIGN>{qstring}
and {qstring}):
1. I needed the code in the lexer to stop smashing yytext with a
terminating null. More on this as I go.
2. I decided to accept the fact that the regex will eat more string
than it should for my original problem case. This can be reversed, but
I decided not to, as I don't see any harm in it. When the copy of the
string is handed off to RemoveEscapeSequences(), that routine hands back
an integer value now. If that value is 0, nothing special happens, the
new string (pointed to by sp) is just handed off the the next thing as
applicable. If it is > 0, we will yyless() over-munched characters back
into the input stream, keeping the number of characters returned. This
is logged with a Debug1 when it happens, which should really only happen
when you have a \\" at the end of the string like my original problem.
We need to give them back to make sure that the next time the lexer
scans, it can see them again since they were useless to us this time.
This is the reason for no longer modifying yytext.
3. I changed the strdup to a malloc() for creating the string pointed
to by sp. RemoveEscapeSequences will do the copy to sp anyhow, so no
need to copy it twice. Also, it seems that sp was not always being
free()d. I added those.
The net effect of this IMHO is that string parsing is probably a bit
more natural now. Quotes can be escaped anywhere, escapes can be
escaped, and newlines can be escaped if you want to, or the newline can
just be a part of the string.
Some examples:
InsertLine "this line would be stored as one line without\
the carriage return because we escaped it"
InsertLine "This line would have the carriage return because we\\
escaped the escape"
InsertLine "This line would be a syntax error because there is no
terminating \"
InsertLine "This line (the one that started this mess for me in the
first place) woulc be the fixed version of the above\\"
All of this works for the right side of assignments in control: and
groups: as well.
I've tried to break this and can't. If anyone agrees this is an issue and
wants to kick the tires on it, I'd be interested to see if I missed something
or if this makes things work better.
Thanks,
--
Phil D'Amore "Sometimes there is a fine line
Senior System Administrator between criminally abusive
Red Hat, Inc behavior and fun."
Office: 919.754.3700 x44395 -- Ted the Generic Guy
Cell: 919.641.3669 (Dilbert 4/19/2003)
diff -Naur cfengine-2.0.7p3.orig/src/cflex.l.in cfengine-2.0.7p3/src/cflex.l.in
--- cfengine-2.0.7p3.orig/src/cflex.l.in 2003-05-21 04:28:04.000000000
-0400
+++ cfengine-2.0.7p3/src/cflex.l.in 2003-05-29 13:57:46.000000000 -0400
@@ -348,15 +348,19 @@
<INASSIGN>{qstring} {
char *sp = NULL;
+ int less = 0;
Debug1("RVAL-QSTRING %s\n",yytext);
- *(yytext+strlen((char *)yytext)-1) = '\0';
- if ((sp = strdup(yytext+1)) == NULL)
+ if ((sp = malloc(strlen(yytext)+1)) == NULL)
{
FatalError("Malloc failure in parsing");
}
- RemoveEscapeSequences(yytext+1,sp);
+ if((less = RemoveEscapeSequences(yytext+1,sp)) > 0)
+ {
+ Debug1("QSTRING-LESS %d\n",less);
+ yyless(less);
+ }
switch (ACTION)
{
@@ -366,6 +370,7 @@
return RVALUE;
case groups:
HandleGroupRValue(sp);
+ free(sp);
return RVALUE;
}
}
@@ -373,16 +378,22 @@
{qstring} {
char *sp = NULL;
+ int less = 0;
Debug1("QSTRING %s\n",yytext);
- *(yytext+strlen((char *)yytext)-1) = '\0';
- if ((sp = strdup(yytext+1)) == NULL)
+
+ if ((sp = malloc(strlen(yytext)+1)) == NULL)
{
FatalError("Malloc failure in parsing");
}
- RemoveEscapeSequences(yytext+1,sp);
+ if((less = RemoveEscapeSequences(yytext+1,sp)) > 0)
+ {
+ Debug1("QSTRING-LESS %d\n",less);
+ yyless(less);
+ }
HandleQuotedString(sp);
+ free(sp);
return QSTRING;
}
diff -Naur cfengine-2.0.7p3.orig/src/parse.c cfengine-2.0.7p3/src/parse.c
--- cfengine-2.0.7p3.orig/src/parse.c 2003-05-21 07:14:56.000000000 -0400
+++ cfengine-2.0.7p3/src/parse.c 2003-05-29 14:12:32.000000000 -0400
@@ -285,7 +285,7 @@
/*******************************************************************/
-void RemoveEscapeSequences(from,to)
+int RemoveEscapeSequences(from,to)
char *from,*to;
@@ -293,16 +293,28 @@
if (strlen(from) == 0)
{
- return;
+ return 0;
}
- for (sp = from; *(sp+1) != '\0'; sp++,cp++)
+ for (sp = from; sp != '\0'; sp++,cp++)
{
+ if ((*sp == '\"')|(*sp == '\''))
+ {
+ *(cp) = '\0';
+ if (*(sp+1) != '\0')
+ {
+ return (2+(sp - from));
+ }
+ return 0;
+ }
if (*sp == '\\')
{
switch (*(sp+1))
{
+ case '\n': sp+=2;
+ break;
+ case '\\':
case '\"':
case '\'': sp++;
break;
@@ -310,9 +322,9 @@
}
*cp = *sp;
}
-
- *cp = *sp;
- *(cp+1) = '\0';
+ /* Should not reach here! */
+ Debug1("End of string passed. Returning, but a bit confused.\n");
+ return 0;
}
/*******************************************************************/
diff -Naur cfengine-2.0.7p3.orig/src/prototypes.h
cfengine-2.0.7p3/src/prototypes.h
--- cfengine-2.0.7p3.orig/src/prototypes.h 2003-05-11 06:10:09.000000000
-0400
+++ cfengine-2.0.7p3/src/prototypes.h 2003-05-29 12:33:51.000000000 -0400
@@ -656,7 +656,7 @@
int ParseBootFiles ARGLIST((void));
void ParseFile ARGLIST((char *f,char *env));
void NewParser ARGLIST((void));
-void RemoveEscapeSequences ARGLIST((char *from,char *to));
+int RemoveEscapeSequences ARGLIST((char *from,char *to));
void DeleteParser ARGLIST((void));
void SetAction ARGLIST((enum actions action));
void HandleLValue ARGLIST((char *id));
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Possible improvements to quoted string handling in 2.0.7p3...,
Phil D'Amore <=