Re: Regex library

From: Assaf Gordon
Subject: Re: Regex library
Date: Sun, 27 Jun 2021 00:24:35 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.0


On 2021-06-18 1:42 a.m., Pietro Paolini wrote:
In the sed source code there is a folder called lib/ which seems to include
gnulib, or maybe I am flat wrong and that isn't gnulib

The content of "/lib" in the sed-4.8.tar.gz is indeed a subset of gnulib.

Another question concerns the regex library in use, I can see the code
using regex functions defined as part of gnulib

Yet when I ldd the sed binary I can observe that PCRE is dynamically linked

What library is used for regex in GNU sed? I am inclined to say that PCRE
isn't used; after all, libpthread gets linked too and it is not used.

You're correct - PCRE is not used by gnu sed.

On my system, it is "libselinux" which uses PCRE (and sed does use selinux by default):

   $ ldd /lib/x86_64-linux-gnu/ | grep -i pcre
    => /lib/x86_64-linux-gnu/

As for which regex code is used, the answer is a bit nuanced.

The source code file which does the actual regex matching is "sed/regexp.c".

Inside, two main functions are used: re_compile_pattern() and re_search().

These are defined in gnulib's "regcomp.c" and "regexec.c" files.

These functions are also defined in glibc (although internal).
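
To make the two calls concrete, here is a minimal sketch of how they fit together. It is not sed's actual code: it assumes a glibc system where <regex.h> exposes the GNU interface under _GNU_SOURCE, the helper name gnu_regex_find() is made up for illustration, and the choice of RE_SYNTAX_POSIX_EXTENDED is an assumption (sed selects its own syntax flags).

```c
#define _GNU_SOURCE            /* exposes the GNU regex API in glibc's <regex.h> */
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of sed): compile PATTERN and search TEXT.
   Returns the offset where the first match starts, -1 if nothing matches,
   or -2 on a compile or internal error. */
static int gnu_regex_find(const char *pattern, const char *text)
{
    struct re_pattern_buffer buf;
    /* Zeroed buffer: buffer == NULL and allocated == 0, so the library
       allocates the compiled pattern itself; no fastmap, no translate table. */
    memset(&buf, 0, sizeof buf);

    /* Assumption: POSIX extended syntax. This sets a process-wide global. */
    re_set_syntax(RE_SYNTAX_POSIX_EXTENDED);

    /* re_compile_pattern() returns NULL on success, else an error message. */
    const char *err = re_compile_pattern(pattern, strlen(pattern), &buf);
    if (err != NULL) {
        fprintf(stderr, "compile error: %s\n", err);
        return -2;
    }

    /* Try every start position from 0 to len; no register info requested. */
    int len = (int) strlen(text);
    int pos = (int) re_search(&buf, text, len, 0, len, NULL);

    regfree(&buf);             /* regex_t is a typedef of re_pattern_buffer */
    return pos;
}
```

Calling gnu_regex_find("b+c", "aabbbc") should report the offset where "bbbc" begins, and a pattern that does not occur in the text should give -1.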

glibc's and gnulib's source code are often synchronized, so these
functions might be identical; with an old glibc, the gnulib version
that is bundled with GNU sed might be newer.

During "./configure", if the system's glibc is detected to have a new
enough version of these functions, they will be used.
Otherwise, the gnulib version will be used.

You can force the build to use glibc's version with:
    ./configure --without-included-regex
But that's not recommended, unless you are certain of what you're doing.

To add another layer, GNU sed employs some regex optimizations using a
faster engine (gnulib's DFA engine).
That code is not available in glibc, and so it is always taken from gnulib.

Hope this answers the question.

 - assaf
