bug-gnu-utils

Bug in GNU awk, RLENGTH fails in locale es_UY.UTF-8


From: Francisco Castro
Subject: Bug in GNU awk, RLENGTH fails in locale es_UY.UTF-8
Date: Tue, 20 Jan 2009 19:52:00 -0200
User-agent: KMail/1.9.9

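I found a bug in GNU awk: under the es_UY.UTF-8 locale, match() sets RLENGTH
to a wrong value when the regexp matches the empty string (it should be 0).
I hope these examples help.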

address@hidden:~% awk --version | sed q
GNU Awk 3.1.5

address@hidden:~% echo $LANG
es_UY.UTF-8

address@hidden:~% echo ae | awk '{print match($0, /e*/); print RSTART, RLENGTH}'
1
1 17

# It works with LANG=C
address@hidden:~% echo ae | LANG=C awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 0
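
For comparison, the expected values can be checked over several inputs in the C locale, where /e*/ matches the empty string at position 1 of every string below. This is only a minimal sketch assuming a POSIX shell and awk; the helper name `empty_match` is hypothetical and not part of the original report.

```shell
# Hypothetical helper: print RSTART and RLENGTH for /e*/ matched against $1.
# LC_ALL=C selects the byte-oriented behaviour that gives the correct result.
empty_match() {
    printf '%s\n' "$1" | LC_ALL=C awk '{match($0, /e*/); print RSTART, RLENGTH}'
}

# Each of these inputs starts with a character other than "e", so the
# leftmost match is the empty string at position 1.
for s in ae aee hello world.; do
    printf '%s: %s\n' "$s" "$(empty_match "$s")"
done
```

Every line of output should end in "1 0"; any other RLENGTH would indicate the behaviour reported here.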

# It also works when RLENGTH should be nonzero:
address@hidden:~% echo ñ | awk '{match($0, /ñ*/); print RSTART, RLENGTH}'
1 1

# Some more examples:

address@hidden:~% echo ae | awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 18

address@hidden:~% echo aee | awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 18

address@hidden:~% echo aeee | awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 26

address@hidden:~% echo hello | awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 26

address@hidden:~% echo world. | awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 34

# The ñ is represented in UTF-8 by the two bytes 0xC3 0xB1.
# "ñññ" and "world." give the same result, which suggests the wrong value
# depends on the size in bytes rather than the length in characters.

address@hidden:~% echo ñññ | awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 34

address@hidden:~% echo -n ñ | od -t x1
0000000 c3 b1
0000002
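
The byte-size observation can be checked directly by counting bytes instead of characters. A small sketch, assuming a POSIX shell; `byte_len` is a hypothetical helper name, and the ñ characters are written as octal escapes so the result does not depend on how the script file is encoded.

```shell
# Hypothetical helper: count the bytes in $1 (wc -c counts bytes, not characters).
byte_len() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# "ñññ" in UTF-8: three characters, each encoded as the bytes 0xC3 0xB1.
byte_len "$(printf '\303\261\303\261\303\261')"

# "world.": six single-byte characters.
byte_len 'world.'
```

Both calls report the same byte count, which matches the identical RLENGTH seen for the two inputs.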

address@hidden:~% echo aaaaaaaaaaaaaaaaaaa | awk '{match($0, /e*/); print RSTART, RLENGTH}'
1 82

address@hidden:~% cat /etc/issue
Debian GNU/Linux lenny/sid \n \l

address@hidden:~% md5sum `which awk`
423835ba1e46c652823021da7d41c4e1  /usr/bin/awk

address@hidden:~# LANG=C apt-get install gawk | grep gawk
gawk is already the newest version.

-- 
Francisco Castro
