remove length limitation for substituted vars

From: Ralf Wildenhues
Subject: remove length limitation for substituted vars
Date: Tue, 5 Dec 2006 21:49:11 +0100
User-agent: Mutt/1.5.13 (2006-08-11)

The move to awk for config files needlessly reintroduced a limitation
that Dan Manthey's earlier code did not have: the length of a
substituted (multi-line) value is now limited by the size of the sed
pattern space, because the mangle script slurps in the full value.
Dan's approach had this limit only per line, not per variable.
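
To illustrate the per-line idea, here is a minimal sketch, not the code
from status.m4.  The function name quote_lines_for_awk is made up for
this example, and the sketch ignores the variable name, the $ac_delim
handling and the splitting at _AC_AWK_LITERAL_LIMIT.  It only shows that
when sed quotes one input line at a time, the pattern space never holds
more than a single line of the value:

```shell
# Hypothetical helper, not part of Autoconf: turn every input line into
# one awk string literal piece.  sed holds only one line in its pattern
# space at a time, so the total length of the value does not matter.
quote_lines_for_awk() {
  # 1. escape double quotes and backslashes for awk
  # 2. open the string literal
  # 3. append an escaped newline, close the literal, and end the output
  #    line with a backslash so that the next piece continues it
  sed 's/["\\]/\\&/g; s/^/"/; s/$/\\n"\\/'
}

# A three-line value becomes three literals, one per output line.
printf '%s\n' 'first "line"' 'second\line' 'third' | quote_lines_for_awk
```

A real script would additionally have to treat the last line of a value
differently (no trailing \n, no continuation backslash), which this
sketch leaves out.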

The patch below removes this limitation.  Tested on the usual set of
suspect systems.  Testing also revealed that Solaris 10 /bin/sed accepts
at most 7 characters in a branch label, not 8.
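
As a small illustration of the label restriction (a sketch only; the
function name squeeze_a and the label are invented for this example),
a branch loop whose label stays well within 7 characters looks like
this:

```shell
# Hypothetical helper: collapse every run of "a" into a single "a".
# The branch label "again" has 5 characters, so it stays within the
# 7-character limit mentioned above.
squeeze_a() {
  sed -e ':again' -e 's/aa/a/' -e 't again'
}

printf '%s\n' 'baaaad' | squeeze_a
```

The loop repeats the substitution until it no longer matches, so the
input line baaaad comes out as bad.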

It's the last awk-related patch for now, hopefully.  OK to apply?


2006-12-05  Ralf Wildenhues  <address@hidden>

        * lib/autoconf/status.m4 (_AC_OUTPUT_FILES_PREPARE): When
        creating the awk substitution script, handle one input line at a
        time, so that the maximum length of a substituted (multi-line)
        value is not limited by the size of the sed pattern space.
        The trade-off is a slightly repetitive sed script.
        * doc/autoconf.texi (Limitations of Usual Tools): Branch labels
        can only have up to 7 characters, due to Solaris 10 /bin/sed.

Index: doc/autoconf.texi
RCS file: /cvsroot/autoconf/autoconf/doc/autoconf.texi,v
retrieving revision 1.1112
diff -u -r1.1112 autoconf.texi
--- doc/autoconf.texi   28 Nov 2006 03:29:47 -0000      1.1112
+++ doc/autoconf.texi   5 Dec 2006 20:39:48 -0000
@@ -13694,7 +13694,7 @@
 Unicos 9 @command{sed} loops endlessly on patterns like @samp{.*\n.*}.
-Sed scripts should not use branch labels longer than 8 characters and
+Sed scripts should not use branch labels longer than 7 characters and
 should not contain comments.  @acronym{HP-UX} sed has a limit of 99 commands
 (not counting @samp{:} commands) and
 48 labels, which can not be circumvented by using more than one script
Index: lib/autoconf/status.m4
RCS file: /cvsroot/autoconf/autoconf/lib/autoconf/status.m4,v
retrieving revision 1.122
diff -u -r1.122 status.m4
--- lib/autoconf/status.m4      5 Dec 2006 06:00:43 -0000       1.122
+++ lib/autoconf/status.m4      5 Dec 2006 20:39:49 -0000
@@ -420,10 +420,20 @@
 dnl Initialize an awk array of substitutions, keyed by variable name.
-dnl First read a whole (potentially multi-line) substitution,
-dnl and construct `S["VAR"]='.  Then, split it into pieces that fit
-dnl in an awk literal.  Each piece then gets active characters escaped
-dnl (if we escape earlier we risk splitting inside an escape sequence).
+dnl The initial line contains the variable name VAR, then a `!'.
+dnl Construct `S["VAR"]=' from it.
+dnl The rest of the line, and potentially further lines, contain the
+dnl substituted value; the last of those ends with $ac_delim.  We split
+dnl the output both along those substituted newlines and at intervals of
+dnl length _AC_AWK_LITERAL_LIMIT.  The latter is done to comply with awk
+dnl string literal limitations, the former for simplicity in doing so.
+dnl We deal with one input line at a time to avoid sed pattern space
+dnl limitations.  We kill the delimiter $ac_delim before splitting the
+dnl string (otherwise we risk splitting the delimiter).  And we do the
+dnl splitting before the quoting of awk special characters (otherwise we
+dnl risk splitting an escape sequence).
 dnl Output as separate string literals, joined with backslash-newline.
 dnl Eliminate the newline after `=' in a second script, for readability.
@@ -437,31 +447,43 @@
 cat >>"\$tmp/subs.awk" <<\CEOF$ac_eof
-sed '
-t line
-s/'"$ac_delim"'$//; t gotline
-N; b line
+sed -n '
-s/^/S["/; s/!.*/"]=/; p
+s/^/S["/; s/!.*/"]=/
-t more
+t repl
+t delim
+t more1
+s/["\\]/\\&/g; s/^/"/; s/$/\\n"\\/
+b repl
+s/["\\]/\\&/g; s/^/"/; s/$/"\\/
+t nl
-t notlast
-s/["\\]/\\&/g; s/\n/\\n/g
-s/^/"/; s/$/"/
+t more2
+s/["\\]/\\&/g; s/^/"/; s/$/"/
-s/["\\]/\\&/g; s/\n/\\n/g
-s/^/"/; s/$/"\\/
+s/["\\]/\\&/g; s/^/"/; s/$/"\\/
-b more
+t delim
 ' <conf$$subs.awk | sed '
Index: tests/
RCS file: /cvsroot/autoconf/autoconf/tests/,v
retrieving revision 1.76
diff -u -r1.76
--- tests/    5 Dec 2006 18:57:06 -0000       1.76
+++ tests/    5 Dec 2006 20:39:49 -0000
@@ -545,7 +545,8 @@
 # sed dumps core around 8 KiB.  However, POSIX says that sed need not
 # handle lines longer than 2048 bytes (including the trailing newline).
 # So we'll just test a 2000-byte value, and for awk, we test a line with
-# almost 1000 words, and one variable with 4 lines of 500 bytes each.
+# almost 1000 words, and one variable with 5 lines of 2000 bytes each:
+# multi-line values should allow to get around the limitations.
 AT_SETUP([Substitute a 2000-byte string])
@@ -561,7 +562,7 @@
 AC_SUBST([foo], ]m4_for([n], 1, 100,, ....................)[)
 AC_SUBST([bar], "]m4_for([n], 1, 100,, @ @ @ @ @ @ @ @ @ @@)[")
-AC_SUBST([baz], "]m4_for([n], 1, 4,, m4_for([m], 1, 25,, ... ... ... ... ....)
+AC_SUBST([baz], "]m4_for([n], 1, 5,, m4_for([m], 1, 100,, ... ... ... ... ....)
@@ -576,7 +577,7 @@
   AT_CHECK([cat Bar], 0, m4_for([n], 1, 100,, @ @ @ @ @ @ @ @ @ @@)
-  AT_CHECK([cat Baz], 0, m4_for([n], 1, 4,, m4_for([m], 1, 25,, ... ... ... ... ....)
+  AT_CHECK([cat Baz], 0, m4_for([n], 1, 5,, m4_for([m], 1, 100,, ... ... ... ... ....)
