Re: Backreferences in character classes?
From: Assaf Gordon
Subject: Re: Backreferences in character classes?
Date: Sat, 20 Jan 2018 16:25:55 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0

Hello,
On 2018-01-19 02:25 PM, Jack Bates wrote:
> Does sed support backreferences in character classes? The following
> doesn't work for me:
>
>   echo "'foo'\"'\"'bar'" | sed "s/\([\"']\)\([^\1]*\)\1/\2/g"
>
> Expected: foo'bar
> Actual: foo'"'"'bar

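No, back-references do not work inside character classes. Inside the
brackets, \1 is not treated as a back-reference, so [^\1]* also matches
the quote characters and the match runs on to the last quote.
This is not specific to sed; perl behaves the same way: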
$ echo "'foo'\"'\"'bar'" | perl -npe "s/([\"'])([^\1]*)\1/\2/g"
foo'"'"'bar
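
A quick way to see what the bracket expression really does (a small
sketch, assuming a sed that follows the POSIX bracket-expression rules,
where a backslash inside [...] is an ordinary character):

```shell
# Delete every character that is NOT in the set [\1].
# Inside a POSIX bracket expression, \1 is not a back-reference:
# it is just the two literal characters "\" and "1".
lit=$(printf '%s\n' 'a1b\c' | sed 's/[^\1]//g')
printf '%s\n' "$lit"
```

With such a sed, only the "1" and the backslash should survive.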
I would suggest the following:
First,
"sed -E" enables extended regular expressions, so there is no
need to escape the parentheses. This should be supported on all modern
seds (including non-GNU). I will use -E in the examples below.
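
To illustrate the difference, here is the same substitution written both
ways (a minimal sketch; the input "abc" and the variable names are made
up for the example):

```shell
# Basic syntax: grouping parentheses have to be escaped.
bre=$(printf '%s\n' 'abc' | sed 's/\(b\)/[\1]/')
# Extended syntax (-E): plain parentheses group.
ere=$(printf '%s\n' 'abc' | sed -E 's/(b)/[\1]/')
printf '%s\n%s\n' "$bre" "$ere"
```

Both commands should print a[b]c.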
Second,
since your character class contains only two characters (single quote
and double quote), it is rather easy to break it down into a regular
expression with alternation:
$ echo "'foo'\"'\"'bar'" | sed -E "s/(\"([^\"]*)\")|('([^']*)')/\2\4/g"
foo'bar
The "trick" is to replace with two back-references (\2 and \4). For
each match, one of them holds the text between the quotes (e.g. foo or
bar) and the other is empty (because it belongs to the alternative
that didn't match).
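
To see which group is filled for each match, the same expression can be
run with a replacement that keeps the two groups apart (a sketch only;
the <...|...> markers are just for display and not part of the
solution):

```shell
# Print \2 and \4 side by side for every match, separated by "|".
# For each match only one of the two groups is non-empty.
out=$(printf '%s\n' "'foo'\"'\"'bar'" |
      sed -E "s/(\"([^\"]*)\")|('([^']*)')/<\2|\4>/g")
printf '%s\n' "$out"
```

With GNU sed this should print <|foo><'|><|bar>: the single-quoted
parts land in \4 and the double-quoted part lands in \2.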
Hope this helps,
- assaf