[Grep-devel] proposed new function for dfa


From: Arnold Robbins
Subject: [Grep-devel] proposed new function for dfa
Date: Tue, 22 Nov 2016 21:05:56 +0200
User-agent: Heirloom mailx 12.5 6/20/10

Hi.

Please see the patch below, which adds a function `dfacopysyntax' to
dfa.h and dfa.c.  The idea is that when we compile many regular
expressions with the same syntax bit settings, we can save considerable
time by computing the syntax-dependent settings once and then just
copying them from the original dfa instance.

grep doesn't do that, but gawk can.
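
To make the pattern concrete, here is a rough, self-contained sketch of
the idea.  It does not use the real struct dfa: `toy_dfa', `toy_syntax',
`toy_copysyntax' and `toy_demo' are hypothetical stand-ins, and the field
names are assumptions chosen only for illustration.  The one thing it
shares with the patch is the shape: zero the per-pattern part of the
struct with memset and offsetof, then copy the shared settings over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct dfa.  Per-pattern state comes first; the
   settings shared by every pattern start at `syntax_bits'.  All names
   are hypothetical and do not reflect the real dfa.c layout.  */
struct toy_dfa
{
  /* Per-pattern state, reset for every new pattern.  */
  int ntokens;
  int nstates;

  /* Shared settings: computed once, then copied.  */
  unsigned long syntax_bits;
  bool fold_case;
  unsigned char eol_byte;
};

/* Hypothetical one-time setup, standing in for the expensive work.  */
static void
toy_syntax (struct toy_dfa *d, unsigned long bits, bool fold_case)
{
  memset (d, 0, sizeof *d);
  d->syntax_bits = bits;
  d->fold_case = fold_case;
  d->eol_byte = '\n';
}

/* Same shape as the proposed function: clear everything before the
   first shared field, then copy the shared fields from FROM.  */
static void
toy_copysyntax (struct toy_dfa *to, const struct toy_dfa *from)
{
  if (to != NULL && from != NULL)
    {
      memset (to, 0, offsetof (struct toy_dfa, syntax_bits));
      to->syntax_bits = from->syntax_bits;
      to->fold_case = from->fold_case;
      to->eol_byte = from->eol_byte;
    }
}

/* Set up one template, then initialize a second instance from it.
   Returns 0 once the checks have passed.  */
int
toy_demo (void)
{
  struct toy_dfa tmpl;
  struct toy_dfa d;

  toy_syntax (&tmpl, 0x1234UL, true);

  /* Simulate leftovers from a previously compiled pattern.  */
  memset (&d, 0xff, sizeof d);
  toy_copysyntax (&d, &tmpl);

  assert (d.ntokens == 0 && d.nstates == 0);
  assert (d.syntax_bits == 0x1234UL);
  assert (d.fold_case && d.eol_byte == '\n');
  return 0;
}
```

In a caller, the template would be set up once and every later instance
would be initialized from it with the copy function before compiling its
own pattern.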

I plan to put this into gawk's dfa, and make use of it. My changes
pass gawk's `make check'.

I hope that y'all can include this in GNULIB dfa as well.

Thanks,

Arnold
-----------------------------------
diff --git a/dfa.c b/dfa.c
index 0267378..7334c6f 100644
--- a/dfa.c
+++ b/dfa.c
@@ -792,6 +792,26 @@ char_context (struct dfa const *dfa, unsigned char c)
   return CTX_NONE;
 }
 
+/* Copy the syntax settings from one dfa instance to another.
+   Saves considerable computation time if compiling many regular expressions
+   based on the same setting.  */
+void dfacopysyntax (struct dfa *to, const struct dfa *from)
+{
+  if (to != NULL && from != NULL)
+    {
+       memset (to, 0, offsetof (struct dfa, dfaexec));
+       to->dfaexec = from->dfaexec;
+       to->simple_locale = from->simple_locale;
+       to->localeinfo = from->localeinfo;
+
+       to->fast = from->fast;
+
+       to->canychar = from->canychar;
+       to->lex.cur_mb_len = from->lex.cur_mb_len;
+       to->syntax = from->syntax;
+    }
+}
+
 /* Set a bit in the charclass for the given wchar_t.  Do nothing if WC
    is represented by a multi-byte sequence.  Even for MB_CUR_MAX == 1,
    this may happen when folding case in weird Turkish locales where
diff --git a/dfa.h b/dfa.h
index 8608b10..52d77d4 100644
--- a/dfa.h
+++ b/dfa.h
@@ -113,6 +113,11 @@ extern struct dfa *dfasuperset (struct dfa const *d) _GL_ATTRIBUTE_PURE;
 /* The DFA is likely to be fast.  */
 extern bool dfaisfast (struct dfa const *) _GL_ATTRIBUTE_PURE;
 
+/* Copy the syntax settings from one dfa instance to another.
+   Saves considerable computation time if compiling many regular expressions
+   based on the same setting.  */
+extern void dfacopysyntax (struct dfa *to, const struct dfa *from);
+
 /* Free the storage held by the components of a struct dfa. */
 extern void dfafree (struct dfa *);
 


