Re: bash "extglob" needs to upgrade at least like zsh "kshglob"
From: Chet Ramey
Date: Wed, 16 Nov 2022 16:47:20 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 11/14/22 6:06 AM, Koichi Murase wrote:

> However, I noticed two strange behaviors of the current engine.
> Before adjusting the behavior of the new engine and submitting it for
> review, I would like to confirm the (expected) behavior of the current
> engine in the current devel.
>
> These two behaviors both turned out to be related to the
> matching of bracket expressions by the function `BRACKMATCH'
> (lib/glob/sm_loop.c).
>
> ----------------------------------------------------------------------
> 1. Pattern [[=B=]][c] matches c
>
>   $ bash-devel --norc
>   $ [[ Bc == [[=B=]][c] ]]; echo $?
>   0 <-- OK. This is expected.
>   $ [[ c == [[=B=]][c] ]]; echo $?
>   0 <-- This is unexpected.

This is clearly a problem, and your fix is a good one.
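
For anyone who wants to check a particular build, case 1 can be scripted. This is only a sketch: `pattern_matches` is a helper name made up for this example, not anything in bash, and what gets printed depends on the bash version being run. With the fix applied, `Bc` should be the only one of the four strings that matches.

```shell
#!/usr/bin/env bash
# pattern_matches STRING PATTERN (hypothetical helper)
# Succeeds if STRING matches PATTERN under [[ ... == ... ]].
# The right-hand side is left unquoted so it is treated as a pattern.
pattern_matches() {
  [[ $1 == $2 ]]
}

pattern='[[=B=]][c]'
for s in Bc Bd B c; do
  if pattern_matches "$s" "$pattern"; then
    printf '%-2s matches %s\n' "$s" "$pattern"
  else
    printf '%-2s does not match %s\n' "$s" "$pattern"
  fi
done
```

The pattern is two bracket expressions in a row, so it can only match a two-character string: something equivalent to `B`, followed by `c`.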

> ----------------------------------------------------------------------
> 2. A bracket expression sometimes matches the slash and sometimes does
>    not with FNM_PATHNAME.
>
>   FNM_PATHNAME is only used in two places in the Bash codebase. 1)
>   One is the glob matching of filenames in a directory
>   (lib/glob/glob.c). However, this does not seem to have any
>   effect, because FNM_PATHNAME only changes the matching when the
>   target string contains a slash, and the filenames do not contain
>   any slashes. 2) The other is the filtering of the
>   pathname-expansion results with GLOBIGNORE (pathexp.c). So the only
>   way to test the behavior of Bash's strmatch with FNM_PATHNAME
>   (without writing a C program to directly use the function
>   `strmatch') is to use GLOBIGNORE.

This is true, but the two uses above are not directly analogous to
fnmatch(). The bash implementation, due to its origin as an fnmatch()
implementation, uses the same flag names but uses FNM_PATHNAME to signal
the matching engine that it's matching filenames for pathname expansion.
That triggers the difference. The first part of your patch is correct:
a bracket expression can never match a slash in the string. This is
common to both fnmatch and what POSIX calls "Patterns Used For Filename
Expansion."
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13_03
fnmatch with FNM_PATHNAME only has to avoid matching the slash with a
bracket expression. The shell has an additional constraint: a slash that
appears in a bracket expression renders the bracket expression void and
requires the `[' to be matched explicitly. That's why there have to be
tests for slash in BRACKMATCH. There are two bugs: the off-by-one error
you note and matching the open bracket explicitly.
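
The externally visible side of this distinction can be seen from the shell. The following is a minimal sketch, assuming a scratch directory that holds only the file `a/b`; `string_matches` and `expands_to_ab` are helper names invented for this example. It does not call strmatch with FNM_PATHNAME on a string containing a slash (that still needs GLOBIGNORE or a C program); it only contrasts `[[ ... ]]`, where a slash is an ordinary character, with pathname expansion, where neither `*` nor a bracket expression stands in for the slash.

```shell
#!/usr/bin/env bash
# string_matches STRING PATTERN (hypothetical helper)
# Pattern match as done by [[ ... == ... ]], which does not treat the
# string as a pathname.
string_matches() {
  [[ $1 == $2 ]]
}

# expands_to_ab PATTERN (hypothetical helper)
# Succeeds if pathname expansion of PATTERN produces a/b.
# The function body is a subshell so the scratch directory and cd stay local.
expands_to_ab() (
  scratch=$(mktemp -d) || exit 2
  trap 'rm -rf "$scratch"' EXIT
  cd "$scratch" || exit 2
  mkdir a && : > a/b || exit 2
  set -- $1      # unquoted on purpose so the shell expands the pattern
  [[ $1 == a/b ]]
)

for pattern in 'a/*' 'a*b' 'a?b' 'a[/]b'; do
  string_matches a/b "$pattern" && s=yes || s=no
  expands_to_ab "$pattern" && e=yes || e=no
  printf '%-6s  [[ ]]: %-3s  pathname expansion: %s\n' "$pattern" "$s" "$e"
done
```

With nullglob unset, a pattern that matches nothing is left as typed, which is why the helper compares the first resulting word against `a/b` instead of checking for an empty result.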

> * In fnmatch(3), a bracket expression never matches a / with
>   FNM_PATHNAME.

This is one part of the fix.

> * In Bash's strmatch, a bracket expression sometimes matches `/' and
>   sometimes does not. In the codebase, `/' is excluded only when it
>   explicitly appears after another character in the bracket
>   expression (lib/glob/sm_loop.c:574) even though the comment
>   mentions [/]. This is the reason that only [a/] fails with Bash's
>   implementation in cases #2..#6 in the above result.

That's the consequence of the test being in the wrong place in BRACKMATCH.
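
The GLOBIGNORE route described earlier in this message can be wrapped in a small probe. This is a sketch under stated assumptions: `globignore_filters` is a name invented here, the path under test must have the form dir/file, and the pattern must not contain a colon (GLOBIGNORE is a colon-separated list). No results are claimed for the bracket patterns, since those are exactly what differs between builds.

```shell
#!/usr/bin/env bash
# globignore_filters PATTERN PATH (hypothetical helper)
# PATH must have the form dir/file. Exit status: 0 if PATH is dropped from
# the expansion of */* once GLOBIGNORE=PATTERN is set, 1 if it survives,
# 2 on a setup error. The subshell keeps GLOBIGNORE and cd local.
globignore_filters() (
  pattern=$1 path=$2
  scratch=$(mktemp -d) || exit 2
  trap 'rm -rf "$scratch"' EXIT
  cd "$scratch" || exit 2
  mkdir -p "${path%/*}" && : > "$path" || exit 2
  GLOBIGNORE=$pattern
  results=( */* )
  for r in "${results[@]}"; do
    [[ $r == "$path" ]] && exit 1
  done
  exit 0
)

for pattern in 'a/b' 'a*b' 'a?b' 'a[/]b'; do
  if globignore_filters "$pattern" a/b; then
    printf 'GLOBIGNORE=%-6s filters a/b\n' "$pattern"
  else
    printf 'GLOBIGNORE=%-6s keeps a/b\n' "$pattern"
  fi
done
```

If every expansion result is filtered out, the word `*/*` is left unexpanded, so the loop simply never sees the path and the probe reports it as filtered.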

> * What is happening with Bash's implementation in cases #7..#12 is
>   related to the assumption behind the backtracking trick for `*':
>   the trick for `*' backtracking explained in the code comment
>   lib/glob/sm_loop.c:320---"This is a clever idea from
>   glibc"---relies on the behavior that the bracket expression never
>   matches a slash that `*' cannot match.

These two changes should satisfy that.
The cleverness is due to Russ Cox, a really smart guy who figured this
stuff out first:
https://research.swtch.com/glob
(https://swtch.com/~rsc/regexp/ is a collection of his writing on regular
expressions. It's well worth reading.)
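
To make the single-restart-point idea concrete, here is a toy matcher written as a bash function. It is a sketch in the spirit of the algorithm described at the link above, not the code in sm_loop.c: `glob_match` and its optional `pathname` argument are invented for this example, and it handles only literal characters, `?' and `*'. In pathname mode nothing but a literal slash in the pattern can consume a slash in the name, which is the property that makes it safe to remember only the most recent `*'.

```shell
#!/usr/bin/env bash
# glob_match PATTERN NAME [pathname]   (hypothetical helper)
# Supports only literal characters, `?' and `*' (no brackets, no escapes).
# A non-empty third argument turns on a pathname mode in which neither
# `?' nor `*' may match a slash.
glob_match() {
  local pat=$1 name=$2 pathname=${3:-}
  local px=0 nx=0 next_px=0 next_nx=0 c ch
  while (( px < ${#pat} || nx < ${#name} )); do
    if (( px < ${#pat} )); then
      c=${pat:px:1}
      ch=${name:nx:1}          # empty once the name is exhausted
      case $c in
        '*')
          # Record a single restart point: same star, one character later.
          next_px=$px
          next_nx=$(( nx + 1 ))
          px=$(( px + 1 ))
          continue ;;
        '?')
          if [[ -n $ch && ! ( -n $pathname && $ch == / ) ]]; then
            px=$(( px + 1 )) nx=$(( nx + 1 ))
            continue
          fi ;;
        *)
          if [[ $ch == "$c" ]]; then
            px=$(( px + 1 )) nx=$(( nx + 1 ))
            continue
          fi ;;
      esac
    fi
    # Mismatch: resume at the last star, letting it absorb one more character.
    if (( 0 < next_nx && next_nx <= ${#name} )); then
      if [[ -n $pathname && ${name:next_nx-1:1} == / ]]; then
        return 1               # the star would have to swallow a slash
      fi
      px=$next_px nx=$next_nx
      continue
    fi
    return 1
  done
  return 0
}

glob_match 'a*b' 'a/b' && echo 'plain mode: a*b matches a/b'
glob_match 'a*b' 'a/b' pathname || echo 'pathname mode: a*b does not match a/b'
glob_match 'a/*b' 'a/xb' pathname && echo 'pathname mode: a/*b matches a/xb'
```

If a bracket expression were allowed to match a slash, an earlier `*' might need to be retried after that slash had been consumed, and keeping only one restart point would no longer be enough.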
I attached the patch I applied. I didn't include your fix to issue 1 above.
Chet
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/

Attachment: bracket-slash.patch
Description: Text document