[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Grep-devel] proposed new function for dfa
From: |
arnold |
Subject: |
Re: [Grep-devel] proposed new function for dfa |
Date: |
Thu, 24 Nov 2016 01:31:08 -0700 |
User-agent: |
Heirloom mailx 12.4 7/29/08 |
Hi Paul.
In terms of your technical questions, I don't know; I don't understand
the struct dfa so I treat it as a black box and copy code based on
pattern matching. It usually works to do so, but not always.
(Side question; why doesn't dfaalloc just memset the new struct dfa to
zero immediately? That would be simplest, no?)
Your proposal is more or less what I need. The d1 would be kept around
for the life of the program and used over and over again.
The main thing is that gawk can compile a regex up to twice - once
ignoring case and once paying attention to it. The new dfasyntax struct
and routines need to allow for that; if the case flag only goes into
dfacomp then I haven't been saved as much trouble as I would like.
Thanks,
Arnold
Paul Eggert <address@hidden> wrote:
> On 11/22/2016 11:56 PM, address@hidden wrote:
> > I did some profiling and in this case
> > dfasyntax was using a huge percentage of CPU time. I could try to
> > reproduce it if I really need to convince you.
>
> Sounds like too much work for too little benefit. Unfortunately I'm
> still confused. If Awk needs to reuse a DFA's syntax, presumably the use
> case would be like this:
>
> struct dfa *d1 = dfalloc (), *d2 = dfaalloc ();
> dfasyntax (d1, ...);
> for (...)
> {
> dfacopysyntax (d2, d1);
> dfacomp (..., d2, ...);
> dfaexec (d2, ...);
> }
> dfafree (d1); dfafree (d2); free (d1); free (d2)
>
> I don't see how this could work if dfacopysyntax doesn't clear the
> non-syntax part of the struct dfa, and yet you indicated that it's OK to
> remove that memset.
>
> Also, I'm not following how the copied syntax can start with the same
> value for canychar as the original. Aren't these indexes into uncopied data?
>
> More generally, I'm worried that the copied and the original objects
> might share pointers, and that freeing one might harm the other. Is that
> not possible?
>
>
> Another thought. The intent is that there be a fast way to do what you
> want, and clearly the current dfa.c doesn't do that. How about the
> following idea instead?
>
> * Add an opaque type 'struct dfasyntax' to dfa.h.
>
> * dfacomp accepts a new 'struct dfasyntax *' arg, specifying the syntax.
>
> * dfasyntax no longer accepts a 'struct dfa *' arg. Instead, it returns
> a pointer of type 'struct dfasyntax *' to a newly allocated opaque
> object O that contains all the info that dfasyntax currently computes.
> O can be freed via 'free'; this shouldn't happen as long as anyone is
> using a struct dfa object for which dfacomp has specified O.
>
> The goal is to give Awk a way to efficiently compute the syntax just
> once, while better handling the abovementioned storage-allocation and
> canychar issues. The use case would be like this:
>
> struct dfa *d = dfalloc ();
> struct dfasyntax *s = dfasyntax (...);
> for (...)
> {
> dfacomp (..., d, s, ...);
> dfaexec (d, ...);
> }
> dfafree (d); free (d); free (s);
- [Grep-devel] proposed new function for dfa, Arnold Robbins, 2016/11/22
- Re: [Grep-devel] proposed new function for dfa, Paul Eggert, 2016/11/22
- Re: [Grep-devel] proposed new function for dfa, arnold, 2016/11/23
- Re: [Grep-devel] proposed new function for dfa, Paul Eggert, 2016/11/23
- Re: [Grep-devel] proposed new function for dfa, arnold, 2016/11/23
- Re: [Grep-devel] proposed new function for dfa, Paul Eggert, 2016/11/23
- Re: [Grep-devel] proposed new function for dfa,
arnold <=
- Re: [Grep-devel] proposed new function for dfa, Norihiro Tanaka, 2016/11/24
- Re: [Grep-devel] proposed new function for dfa, Paul Eggert, 2016/11/23
- Re: [Grep-devel] proposed new function for dfa, Arnold Robbins, 2016/11/29