sed-devel
Re: bug#47281: sed: problem with hex replace a literal '$'


From: Nora Platiel
Subject: Re: bug#47281: sed: problem with hex replace a literal '$'
Date: Mon, 22 Mar 2021 19:16:24 +0100

> Sent: Monday, March 22, 2021 at 11:13 AM
>
> It seems there is no need to escape '&' in hexmode

(I assume that by "hexmode" you mean the use of \xHH escapes.)
I was not aware of this because I was testing with an old version of sed, with
which I got:

$ echo a | sed 's/\x61/\x26/'
a
$ echo a | sed 's/\x61/\x5C\x26/'
&

With a newer version I get:

$ echo a | sed 's/\x61/\x26/'
&
$ echo a | sed 's/\x61/\x5C\x26/'
\&

This seems to be the result of these two commits:

- fix \x26 on RHS of s command
https://git.savannah.gnu.org/cgit/sed.git/commit/?id=0f968ceb7bc1a65773979ef419872ce43677c790

- sed: treat '\x5c' as literal backslash
https://git.savannah.gnu.org/cgit/sed.git/commit/?id=b5f5236a4b3a2d6c2f89fb99d614486a65a40a24
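
To check which of the two behaviors an installed sed has, the first test above
can be wrapped in a small probe. This is only a sketch: it assumes GNU sed (for
the \xHH escapes) and a POSIX shell, and hex_amp_mode is a made-up helper name,
not something sed provides.

```shell
# Assumes GNU sed and a POSIX shell.
# hex_amp_mode is a hypothetical helper name, not part of sed.
hex_amp_mode() {
    out=$(echo a | sed 's/\x61/\x26/')
    case $out in
        a)   echo old ;;      # \x26 behaved like '&' (the whole match)
        '&') echo new ;;      # \x26 produced a literal '&'
        *)   echo unknown ;;  # neither of the two outputs shown above
    esac
}

hex_amp_mode
```

A sed that does not understand \xHH at all would leave the input unchanged and
therefore also report "old", so the result is only meaningful for GNU sed.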

The change is backward incompatible, and I think the old behavior (albeit less
convenient) was more consistent with how the regexp part behaves:
- in the replacement part \xHH is now always literal (/\x26/ != /&/), while in
the regexp part it is not (/\x2E/ = /./);
- in the replacement part \x5C is no longer an escape (/\x5C1/ != /\1/), while
in the regexp part it still is (/\x5C./ = /\./, /\x5C</ = /\</).

At the very least, this distinction should be documented in the "5.8.1 Escaping
Precedence" section.
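
The comparisons between the regexp part and the replacement part can be checked
side by side with a sketch like the following. It assumes GNU sed and a POSIX
shell; same_output is a hypothetical helper that only reports whether two sed
scripts print the same thing for one input line.

```shell
# same_output is a hypothetical helper: it runs two sed scripts on the same
# input line and reports whether they print the same result.
# Usage: same_output SCRIPT_A SCRIPT_B INPUT
same_output() {
    a=$(printf '%s\n' "$3" | sed "$1")
    b=$(printf '%s\n' "$3" | sed "$2")
    if [ "$a" = "$b" ]; then echo same; else echo differ; fi
}

same_output 's/\x2E/X/'      's/./X/'      'abc'   # regexp part: \x2E versus .
same_output 's/b/\x26/'      's/b/&/'      'abc'   # replacement part: \x26 versus &
same_output 's/\(b\)/\x5C1/' 's/\(b\)/\1/' 'abc'   # replacement part: \x5C1 versus \1
```

Going by the results quoted earlier, a newer sed should report "same" for the
first line and "differ" for the other two, while an older one should report
"same" for all three.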

> I am not able to replace symbols from 0x00 to 0x0a (newline),
> and this is also somehow clear, because of the line-oriented
> inner workings of sed.

I have no problem replacing bytes from 0x00 to 0x0A:

$ printf '\x00\n' | sed 's/\x00/\x61/'
a

The problem is probably that your input file is read as 2 lines:
1) 00 01 02 03 04 05 06 07 08 09
2) 0B 0C 0D 0E 0F 10 11 12 13 14 ... FF

so the string \x00...\xFF is never found, because sed searches line by line and
no single line contains it. If you replace the sequences \x00...\x09 and
\x0B...\xFF using separate 's' commands, they should match.
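
A shortened version of that situation can be reproduced as follows. This is a
sketch assuming GNU sed and a POSIX shell; sample is a hypothetical stand-in
for the input file and contains only the bytes 08 09 0A 0B 0C plus a final
newline.

```shell
# sample is a hypothetical stand-in for the input file: bytes 08 09 0A 0B 0C
# followed by a final newline (written with octal escapes for portability).
sample() { printf '\010\011\012\013\014\n'; }

# sed reads this as two lines, because 0x0A ends the first one.
sample | sed -n '$='

# A single pattern that spans the newline is never found
# (nothing is printed by 'p', so the line count is 0)...
sample | sed -n 's/\x08\x09\x0A\x0B\x0C/X/p' | wc -l

# ...but one 's' command per line matches.
sample | sed -e 's/\x08\x09/A/' -e 's/\x0B\x0C/B/'
```

The last command should print "A" and "B" on separate lines, one for each
input line.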

Because sed is line oriented, the newline that terminates each input line is not
part of the pattern space, so you cannot match it directly. You can, however,
replace a newline once it has been introduced into the pattern space:

$ echo x | sed 's/x/\n/ ; s/\x0A/\x61/'
a
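
Another way a newline ends up in the pattern space is the 'N' command, which
appends the next input line to the current one with a newline in between. A
minimal sketch, assuming GNU sed and a POSIX shell (join_pair is a hypothetical
helper name):

```shell
# join_pair is a hypothetical helper: N appends the next input line to the
# pattern space, separated by a newline, which \x0A can then match.
join_pair() { sed 'N; s/\x0A/-/'; }

printf 'a\nb\n' | join_pair
```

With GNU sed this should print "a-b": the newline between the two input lines
is matched only after 'N' has pulled the second line into the pattern space.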

Regards,
NP


