Re: [lmi] Horrible std::regex performance


From: Vadim Zeitlin
Subject: Re: [lmi] Horrible std::regex performance
Date: Sat, 16 Jul 2016 23:28:08 +0200

On Sat, 16 Jul 2016 00:47:32 +0000 Greg Chicares <address@hidden> wrote:

GC> On 2016-07-16 00:12, Vadim Zeitlin wrote:
GC> > On Fri, 15 Jul 2016 23:41:03 +0000 Greg Chicares <address@hidden> wrote:
GC> [...]
GC> > GC> What does the PCRE measurement above mean? Did you translate
GC> > GC> 'test_coding_rules.cpp' to use the PCRE C API instead of C++ regex?
GC> > 
GC> >  Yes, exactly. AFAICS I did it correctly, although maybe not in the most
GC> > efficient way as I mostly tried to keep the changes as small as possible.
GC> > I'm almost sure PCRE performance could be improved further with some effort
GC> > (OTOH I'm half sure that xxx::regex performance might also be improved,
GC> > processing ~170000 lines in even 1 second is not really something to be
GC> > proud about on modern hardware).
GC> 
GC> Is your PCRE version ready to share? I'd really like to see it.

 Hi,

 It's not really ready to be committed (e.g. because I didn't touch the
regex_test.cpp file at all, and also because I used the PCRE C++ API for
expediency, whereas if I were doing it for real I would probably use the C
API directly or wrap it in my own C++ API more similar to the std::regex
one), but I think it's ready to be shared now that I've also checked that
it can be built under MSW with the official makefiles too, so here it is:

https://github.com/vadz/lmi/commit/d1a5ae81de41e6b84155cc03d5dad4bca33e5725

(this is also the tip of the "pcre" branch if you prefer to check it out).

 To use this you need PCRE itself, of course, so here are the brief
instructions about what I did to test with it:

0. Get the sources: I used Git just because I like having "git grep" at my
   disposal and comparing different versions can always be useful, but you
   could also get a tarball, of course:

   $ cd /opt/lmi
   $ git clone https://github.com/svn2github/pcre.git

1. Build it: I did it in a separate directory, just because I don't like
   mixing up generated and source files, but you should be able to do it
   directly in the source directory as well and this would probably avoid
   the problem with the symlink below, so you could try to simplify this:

   $ cd pcre
   $ ./autogen.sh
   ...ignore some warnings...
   $ mkdir -p ../build/pcre
   $ cd $_
   $ /opt/lmi/pcre/configure --prefix=/opt/lmi/local
   # This creates a (Cygwin) symlink to a file in the source directory
   # which is not understood by (MinGW) compiler later, so replace the
   # link with a copy of the file:
   $ rm pcre_chartables.c
   $ cp /opt/lmi/pcre/pcre_chartables.c.dist $_
   $ make
   $ make install

2. Now just "make test_coding_rules.exe" as usual and run it (you could
   also check that test_coding_rules.sh passes, as I did).


 Unfortunately, PCRE is not faster than boost::regex under MSW as it takes
1.8s (compared to the fantastic 1.1s taken by the boost::regex version). On
the bright side, it slightly restores my belief in my own sanity because it
at least takes almost the same time on 2 roughly similar machines.

 Please let me know what, if anything, you would like me to do about all
this. Thanks in advance,
VZ

