lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Best way to integrate PCRE


From: Greg Chicares
Subject: Re: [lmi] Best way to integrate PCRE
Date: Wed, 29 Sep 2021 13:56:50 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.13.0

On 9/28/21 11:28 PM, Vadim Zeitlin wrote:
> On Tue, 28 Sep 2021 20:09:03 +0000 Greg Chicares <gchicares@sbcglobal.net> 
> wrote:
[...draft of a discarded commit-message modification...] 
> GC>     IMPORTANT MODIFICATION: This commit doesn't match the original merge
> GC>     request. The 'pcre_regex.hpp' header has been modified, to guard its
> GC>     substance with a macro. Without this modification, the important
> GC>     'check_physical_closure' makefile target failed; that could break any
> GC>     later 'git bisect'. It's possible that 'git bisect' could fail anyway,
> GC>     as this modification had to be committed with '--no-verify'.
> GC> 
> GC> By the last sentence, I realized I had departed the true path.
> 
>  I'm sorry, maybe I'm even more tired than I thought (and, to be honest,
> this convinced me to postpone looking into the physical closure check
> tomorrow), but what exactly is wrong with this change and why did it have
> to be committed with '--no-verify'?

First, I tried
  git cherry-pick --ff ..5e49ea134
  ./nychthemeral_test.sh
and that failed, because the new PCRE C++ header needs macro guards to
keep it from failing 'make check_physical_closure' for msw due to the
(intended) absence of a PCRE header for msw. Therefore, I committed
a change to add macro guards.

Then, I realized this meant that the build was broken between HEAD
and origin/master, in the sense that
  make check_physical_closure
and thus
  ./nychthemeral_test.sh
would fail...but those are predicates I might very well want to use
with 'git bisect'. So I decided I'd better do the following, in three
separate steps just for caution:
 - move my new commit back, so that it followed your first one
   (that succeeded); and
 - 'git rebase interactive' again, marking mine as "fixup" (I think
   this intermediate step succeeded, but I'm not absolutely sure); and
 - rebase again, marking the fixed-up first commit as "reword"...which
   paused, to my surprise:

  /opt/lmi/src/lmi[0]$git rebase --interactive
  checking [toplevel]...You can amend the commit now, with
    git commit --amend 
  Once you are satisfied with your changes, run
    git rebase --continue

And when I tried committing it manually, it returned an error status,
with no explanation (which would be why it paused):

  /opt/lmi/src/lmi[0]$git commit --amend
  checking [toplevel]...%                                                       
               
  /opt/lmi/src/lmi[1]$

The original behavior was that it would say "COMMIT ABORTED" and
show what commands had failed; I don't know why that didn't happen,
but perhaps the 'make' command in 'hooks/pre-commit' failed without
producing any output.

Anyway, I forced the commit with '--no-verify', and added the
"IMPORTANT MODIFICATION" paragraph above. But, as often occurs,
writing the explanation left me thinking this was a bad idea.
And then testing the result, by forcing 'test_coding_rules' to
rebuild with an i686 msw architecture specified in the global
environment, cause the -lws2_32 failure, and that's when I
figured it would be better to raise this set of issues here.

> GC> BTW, I'm not seeing much speed improvement yet, in these
> GC> 'make check_concinnity' timings (all measured with 'test_coding_rules'
> GC> already built):
> GC> 
> GC>                 i686  x86_64
> GC>   before PCRE:  3.93   3.41
> GC>   after  PCRE:  3.24   3.12 [here, both use an ELF binary]
> 
>  The only explanation I see is that your CPU (it doesn't matter that there
> are many of them for this test, as only one is used) is so fast that the
> execution time is dominated by the IO, but this seems very suspicious to me
> because your IO subsystem should be pretty fast too.

When this machine was brand new, about six years ago, it had perhaps
the fastest consumer SSD, and one of the slowest Xeon clock speeds
available, so I don't understand why you would see remarkably
different results.

> GC> but parallelism can increase speed, later, and it's well worthwhile
> GC> to get rid of boost-1.33.1 once and for all.
> 
>  Yes, the initial goal wasn't to optimize anything, but to get rid of Boost
> and the optimization only arose as a way to avoid unacceptable slowdown
> when using std::regex.

Thanks, I had lost sight of that.

>  BTW, things really didn't improve for std::regex since the decision not
> to use it was taken 5 (?) years ago, as now there is talk about deprecating
> it entirely in the standard library, due to having too many problems not
> only with its performance, but also its API (which I agree is pretty bad,
> using option bits selecting the regex dialect to use is a horrible idea)
> and poor Unicode support.

Wow. I hope they would not merely deprecate the old library, but
also add a replacement (perhaps for compile-time regexes) at the
same time. I have to imagine that there's a lot of code out there
that depends on std::regex, so it's not viable just to remove it.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]