Re: Multi-Line Definitions
From: Eric Blake
Subject: Re: Multi-Line Definitions
Date: Sat, 29 Sep 2007 13:31:38 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070728 Thunderbird/2.0.0.6 Mnenhy/0.7.5.666
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
According to Eric Blake on 9/29/2007 1:04 PM:
> You've given me a good idea - I'll try instrumenting a version of m4 and
> coming up with a good list of the most popular regex patterns in use by
> autoconf (autoconf -t has limitations, since regex patterns tend to mess
> up the quoting of the trace file).
Here's something a bit more telling. With the attached patch, and in the
coreutils directory,
$ M4_TRACE_FILE=~/m4.trace M4=~/m4/src/m4 autoconf
$ wc m4.trace
62207 61314 666720 m4.trace
$ sort -u m4.trace | wc
401 405 5619
$ sort <m4.trace | uniq -c |sort -n -k1,1 |tail -n 15
740 [\\'']
816 (.*)
863 ^[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_][abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789]*$
1163 \(..\)$
1163 \\
1163 ^\(..\)
2244 [^a-zA-Z0-9_]
2324 [ ]+
3242
4306 \\[`""]
5020 [`""]
5020 \\[\\$]
7376 [`$]
11504 @\(<:\|:>\|S|\|%:\)@
11915 @&t@
Wow. Roughly 62 thousand compilations of a regular expression pattern
(62207 trace lines), with only about 400 unique patterns (401 unique
lines). It sounds like definite speedups to m4 are possible if we cache
compiled regular expressions and reuse them, rather than always compiling
from scratch.
Also, some of those most frequent patterns can be done with index() rather
than regexp() in m4sugar, offering some speedups even without m4 improvements.
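
To make the caching idea concrete, here is a minimal sketch of a
compiled-pattern cache. It is not the m4 implementation: it uses the POSIX
regcomp()/regexec() interface rather than the GNU re_compile_pattern() call
that builtin.c uses, and the names cached_regcomp, cache_entry and
CACHE_SLOTS, along with the slot count of 64, are hypothetical choices made
for illustration only.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 64          /* hypothetical size, not tuned */

struct cache_entry
{
  char *pattern;                /* NULL while the slot is empty */
  regex_t compiled;             /* valid only when pattern != NULL */
};

static struct cache_entry cache[CACHE_SLOTS];
static unsigned long cache_hits;
static unsigned long cache_misses;

/* Simple string hash (djb2 style), reduced to a slot index.  */
static unsigned int
hash_pattern (const char *s)
{
  unsigned int h = 5381;
  while (*s)
    h = h * 33 + (unsigned char) *s++;
  return h % CACHE_SLOTS;
}

/* Return a compiled regex for PATTERN, reusing a cached copy when the
   slot already holds the same pattern text.  Return NULL if PATTERN
   does not compile or memory runs out.  The result stays valid until
   a different pattern that hashes to the same slot evicts it.  */
static regex_t *
cached_regcomp (const char *pattern)
{
  struct cache_entry *e = &cache[hash_pattern (pattern)];
  size_t len;

  if (e->pattern && strcmp (e->pattern, pattern) == 0)
    {
      cache_hits++;
      return &e->compiled;
    }

  cache_misses++;
  if (e->pattern)
    {
      /* Direct-mapped cache: evict whatever occupied this slot.  */
      regfree (&e->compiled);
      free (e->pattern);
      e->pattern = NULL;
    }

  /* Flags of 0 select POSIX basic regular expressions.  */
  if (regcomp (&e->compiled, pattern, 0) != 0)
    return NULL;

  len = strlen (pattern) + 1;
  e->pattern = malloc (len);
  if (!e->pattern)
    {
      regfree (&e->compiled);
      return NULL;
    }
  memcpy (e->pattern, pattern, len);
  return &e->compiled;
}
```

With a trace like the one above, where a few hundred distinct patterns
account for tens of thousands of compilations, most calls would be expected
to take the hit path. A real change to m4 would also have to decide how the
cache interacts with changes to the regex syntax or translation tables,
which this sketch ignores.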
- --
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFG/qga84KuGfSFAYARAkTBAKDBUav7ddmiYKminHdxjc9Mc700JACgkaM+
wspVndSXvYUDNNmyXURQq54=
=Bq+8
-----END PGP SIGNATURE-----
diff --git a/src/builtin.c b/src/builtin.c
index dee2276..fa141b0 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -1922,6 +1922,8 @@ Warning: \\0 will disappear, use \\& instead in replacements"));
}
}
+extern FILE *trace_file;
+
/*------------------------------------------.
| Initialize regular expression variables. |
`------------------------------------------*/
@@ -1973,6 +1975,8 @@ m4_regexp (struct obstack *obs, int argc, token_data **argv)
init_pattern_buffer (&buf, &regs);
msg = re_compile_pattern (regexp, strlen (regexp), &buf);
+ if (trace_file)
+ fprintf (trace_file, "%s\n", regexp);
if (msg != NULL)
{
@@ -2033,6 +2037,8 @@ m4_patsubst (struct obstack *obs, int argc, token_data **argv)
init_pattern_buffer (&buf, &regs);
msg = re_compile_pattern (regexp, strlen (regexp), &buf);
+ if (trace_file)
+ fprintf (trace_file, "%s\n", regexp);
if (msg != NULL)
{
diff --git a/src/m4.c b/src/m4.c
index 2d5ced0..9cce4fc 100644
--- a/src/m4.c
+++ b/src/m4.c
@@ -318,6 +318,8 @@ process_file (const char *name)
#define OPTSTRING "-B:D:EF:GH:I:L:N:PQR:S:T:U:d::eil:o:st:"
#endif
+FILE *trace_file;
+
int
main (int argc, char *const *argv, char *const *envp)
{
@@ -338,6 +340,12 @@ main (int argc, char *const *argv, char *const *envp)
retcode = EXIT_SUCCESS;
atexit (close_stdin);
+ {
+ const char *name = getenv ("M4_TRACE_FILE");
+ if (name)
+ trace_file = fopen(name, "a");
+ }
+
include_init ();
debug_init ();
#ifdef USE_STACKOVF
@@ -590,6 +598,8 @@ main (int argc, char *const *argv, char *const *envp)
undivert_all ();
}
output_exit ();
+ if (trace_file)
+ fclose (trace_file);
free_macro_sequence ();
exit (retcode);
}