
bug#34316: sed misbehavior on BRE's


From: Lange, Markus
Subject: bug#34316: sed misbehavior on BRE's
Date: Mon, 4 Feb 2019 13:42:52 +0000

Hi,

I'm currently migrating processes from an old SuSE 9 Linux system to a new
CentOS 7 Linux system and have observed some unexpected changes in sed's
behavior.

First, some information about the systems:

old:~ # cat /etc/SuSE-release 
SuSE Linux 9.0 (i586)
VERSION = 9.0
old:~ # uname -a
Linux biblix 2.4.21-303-smp4G #1 SMP Tue Dec 6 12:33:10 UTC 2005 i686 i686 i386 GNU/Linux
old:~ # sed --version
GNU sed version 4.0.6
...

new:~ # cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
new:~ # uname -a
Linux userWS0.dnb.de 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
new:~ # sed --version
sed (GNU sed) 4.2.2
...

Now let's look at how the behavior has changed, which I think is a bug:

old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*$/\2 \1\3/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00 9783507435339020F a19.04.03
18290030a02544e6a451538b0e44f9e2 9783507435377020F a19.04.03
4c7ff6d790b34470852434f5ee41200b 9783034312189020F a12.12.11

The new system, by contrast, prints nothing for this expression.

Removing the end-of-line anchor ($) from the expression makes the new
system produce output, but it is not the same output:

old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*/\2 \1\3/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00 9783507435339020F a19.04.03
18290030a02544e6a451538b0e44f9e2 9783507435377020F a19.04.03
4c7ff6d790b34470852434f5ee41200b 9783034312189020F a12.12.11

new:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*006V...\(.\{1,32\}\).*\(.020F.*\)021A.*/\2 \1\3/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00 9783507435339020F a19.04.03�208@ a30-01-19bc
18290030a02544e6a451538b0e44f9e2 9783507435377020F a19.04.03�208@ a30-01-19bc
4c7ff6d790b34470852434f5ee41200b 9783034312189020F a12.12.11�208@ a30-01-19bc

To me, this is the first unexpected behavior. The second, which I think
is closely related, is that the first match group gets text from the end
of the line attached. Maybe the first match group consumes the line end?

So I started breaking the expression down, using only the first match
group:

old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*$/\1/p' Fehlerpica.dat 
9783507435339
9783507435377
9783034312189

The new system still doesn't output anything. Leaving the end-of-line
anchor out of the expression does produce output on the new system:

old:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*/\1/p' Fehlerpica.dat 
9783507435339
9783507435377
9783034312189
new:~ # sed -n 's/^.*004K...\([0-9xX]\{13\}\).*/\1/p' Fehlerpica.dat  
9783507435339�208@ a30-01-19bc
9783507435377�208@ a30-01-19bc
9783034312189�208@ a30-01-19bc

However, the output differs and is wrong on the new system: the end of
the line is still appended to the match group.
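
The `�` in the new system's output suggests that the data contains at least one byte outside 7-bit ASCII. The following is a minimal sketch for checking that, not part of the original test runs. The helper name `count_non_ascii` and the sample line with the byte `\376` are made up for illustration; the actual byte in Fehlerpica.dat is not known from the output shown here.

```shell
# Count the bytes on standard input that lie outside the 7-bit ASCII range.
# LC_ALL=C makes tr work on single bytes regardless of the system locale.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' | wc -c
}

# Hypothetical sample line (not real data from Fehlerpica.dat): an ISBN,
# one non-ASCII byte (octal 376), then trailing text.
printf '9783507435339\376208@ a30-01-19bc\n' | count_non_ascii
```

On the real file, the same helper could be run as `count_non_ascii < Fehlerpica.dat`, and `od -c Fehlerpica.dat` would show the byte values themselves.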

If I try using only the second match group, the string is appended
there:
old:~ # sed -n 's/^.*006V...\(.\{1,32\}\).*/\1/p' Fehlerpica.dat
138742c156c1445f8bdc3a7845548c00
18290030a02544e6a451538b0e44f9e2
4c7ff6d790b34470852434f5ee41200b
new:~ # sed -n 's/^.*006V...\(.\{1,32\}\).*/\1/p' Fehlerpica.dat 
138742c156c1445f8bdc3a7845548c00�208@ a30-01-19bc
18290030a02544e6a451538b0e44f9e2�208@ a30-01-19bc
4c7ff6d790b34470852434f5ee41200b�208@ a30-01-19bc

So it seems that the first match group consumes far too much text in a
non-linear way, breaking the match of the line end.
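
One way to test this hypothesis, offered as a sketch and not as a confirmed diagnosis: the GNU sed manual states that, in a multibyte locale, regular expressions do not match invalid multibyte sequences. If the new system runs in a UTF-8 locale (an assumption, since the locale is not shown above) and the byte displayed as `�` is not valid UTF-8, then `.*` would stop in front of that byte, `$` could not match, and the rest of the line would be left in place. Forcing the C locale makes every byte a valid single character. The sample line and the byte `\376` below are hypothetical and are not taken from Fehlerpica.dat.

```shell
# Hypothetical sample line (not real data): a 004K field with a 13-digit
# ISBN, then a byte (octal 376) that is not valid UTF-8, then trailing text.
sample_line() {
    printf 'xx004Kabc9783507435339\376208@ a30-01-19bc\n'
}

# The first-group expression from this report, run in the C locale so that
# "." matches every byte, including the one that cannot be decoded as UTF-8.
extract_isbn_c_locale() {
    LC_ALL=C sed -n 's/^.*004K...\([0-9xX]\{13\}\).*$/\1/p'
}

sample_line | extract_isbn_c_locale   # prints 9783507435339
```

If the same command without `LC_ALL=C` prints nothing on the new system, that would point to locale handling and not to the match group itself.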

I've attached the Fehlerpica.dat for you and hope you can reproduce the
misbehavior.

If I can provide further information please let me know.

Thank you and best regards,
Markus Lange
-- 
***Lesen. Hören. Wissen. Deutsche Nationalbibliothek***

Deutsche Nationalbibliothek               
Fachbereich IT, Informationsinfrastruktur
Adickesallee 1
60322 Frankfurt am Main
Tel: +49 69 1525 -1786
mailto:address@hidden
http://www.dnb.de

Attachment: Fehlerpica.dat
Description: Fehlerpica.dat

