help-flex

Re: Scanner takes a long time to build.


From: Hans Aberg
Subject: Re: Scanner takes a long time to build.
Date: Sat, 27 Jul 2002 20:03:09 +0200

At 11:08 -0500 2002/07/27, Scott A Crosby wrote:
>> It is interesting though that the other fellow wanting the feature
>> was writing on an anti-spam tool. So it could happen that these
>> "natural language pattern recognition" tools require grammars that
>> may blow up exponentially in a straightforward DFA translation.
>>
>
>I created none of those regexp's... I recently started to use
>spamassassin, then noted the performance (.5-1.5 seconds/message). I
>know that flex can do matching of hundreds of regex's in parallel at
>megabytes a second, so I thought I'd test the feasibility and
>performance of flex.

Sorry, the main point is not who wrote the grammar, but that it was to be
used for "natural language pattern recognition", and that in this context
the DFA blew up and became very large. Perhaps this has something to do with
the complexity of natural languages. If so, the problem may become more
frequent in the future, as computers now have the capacity to handle more
voluminous programs than in the past.
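
To make the blow-up concrete, here is a small sketch in Python. It is not
taken from the SpamAssassin rules discussed in this thread; it uses the
textbook pattern (a|b)*a(a|b){n-1}, i.e. "the n-th character from the end
is an a". The function names and the NFA encoding are my own assumptions,
chosen only for illustration. The NFA has n+1 states, but a straightforward
subset construction should reach 2^n DFA states.

```python
from collections import deque


def nfa_step(states, symbol, n):
    """Advance a set of NFA states over one input symbol.

    Hypothetical encoding: state 0 loops on 'a' and 'b' and also moves
    to state 1 on 'a'; state i (1 <= i < n) moves to i + 1 on either
    symbol; state n is accepting and has no outgoing transitions.
    """
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)
            if symbol == "a":
                nxt.add(1)
        elif s < n:
            nxt.add(s + 1)
    return frozenset(nxt)


def dfa_state_count(n):
    """Count the DFA states reachable by subset construction."""
    start = frozenset({0})
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for symbol in "ab":
            nxt = nfa_step(cur, symbol, n)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)
```

Calling dfa_state_count(n) for increasing n shows the count doubling with
each step, while the pattern itself grows by only a few characters. Rules
that must remember which of many recent characters matched behave in a
similar way, which may be part of what happens with such pattern sets.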

(Strictly speaking, I did not say you were writing those rules, only that
you were "writing on an anti-spam tool", which is what you are doing by
making those alterations. :-) )

  Hans Aberg




