Re: [coreutils] tr: case mapping anomaly
From: Eric Blake
Date: Wed, 29 Sep 2010 05:59:03 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100921 Fedora/3.1.4-1.fc13 Mnenhy/0.8.3 Thunderbird/3.1.4
On 09/28/2010 06:23 PM, Pádraig Brady wrote:
I found a few more issues:
This valid translation spec aborted:
LANG=en_US tr '[:upper:]- ' '[:lower:]_'
This misaligned conversion spec was allowed:
LANG=C tr 'A-Y[:lower:]' 'a-z[:upper:]'
This misaligned spec was allowed by extending the class:
LANG=C tr '[:upper:] ' '[:lower:]'
I'll apply the attached soon.
+ /* Note BSD allows extending of classes in string2. For example:
+ tr '[:upper:]0-9' '[:lower:]'
+ That's not portable however, contradicts POSIX and is dependent
+ on your collating sequence. */
That's not portable, however; it contradicts POSIX and is dependent on
your collating sequence.
+
+# Ensure we support translation of case classes with extension
+echo '01234567899999999999999999' > exp
+echo 'abcdefghijklmnopqrstuvwxyz' |
+tr '[:lower:]' '0-9' > out || fail=1
I guess we're guaranteed that [:lower:] has a defined order in the C
locale, so this one looks okay.
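For anyone wanting to reproduce this outside the test suite, here is a minimal sketch. It assumes GNU tr and forces the C locale through LC_ALL; the variable name `out` is only for illustration. Other tr implementations may pad string2 differently or reject the command.

```shell
# Assumes GNU tr; LC_ALL=C pins [:lower:] to a-z in a fixed order.
# string2 ('0-9') is shorter than string1, so GNU tr repeats its last
# character ('9') to cover the remaining lower-case letters.
out=$(echo 'abcdefghijklmnopqrstuvwxyz' | LC_ALL=C tr '[:lower:]' '0-9')
echo "$out"
```

With the patch under review, this should match the 'exp' file above: a through j become 0 through 9, and every later letter becomes 9.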
+tr '[:upper:][:lower:]' 'a-z[:upper:]' < /dev/null || fail=1
+tr '[:upper:][:lower:]' '[:upper:]a-z' < /dev/null || fail=1
Likewise, these two are not required by POSIX, but since they have a
defined order in the C locale, this looks okay.
+
+# Before coreutils 8.6 the trailing space in string1
+# caused the case class in string2 to be extended.
+# However that was not portable, dependent on locale
+# and in contravention of POSIX.
However, that was not portable, dependent on locale, and contrary to POSIX.
+tr '[:upper:] ' '[:lower:]' < /dev/null 2>out && fail=1
+echo 'tr: when translating with string1 longer than string2,
+the latter string must not end with a character class' > exp
+compare out exp || fail=1
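The rejection can also be checked by hand. This is a sketch that assumes GNU tr from coreutils 8.6 or later; `err` and `verdict` are illustrative names, and the exact diagnostic wording may differ between versions.

```shell
# string1 is one element longer than string2, and string2 ends with a
# character class, so there is no single character to extend it with.
if err=$(LC_ALL=C tr '[:upper:] ' '[:lower:]' < /dev/null 2>&1); then
  verdict=accepted
else
  verdict=rejected
fi
printf '%s: %s\n' "$verdict" "$err"
```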
+
+# Before coreutils 8.6 the disparat number of upper and lower
disparate
+ # Ensure when there are a different number of elements
+ # in each string, we validate the case mapping correctly
+ tr 'ab[:lower:]' '0-1[:upper:]' < /dev/null || _fail=1
Nice test; 'ab' and '0-1' are the same size sets of characters, but
different length strings, so [:lower:] and [:upper:] are still aligned.
However, it's only done in the en_US locale; you should probably also
test this POSIX-required feature under the C locale.
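Something along these lines would cover the C locale as well. It is a sketch, not the final test: `fail` and `out` are illustrative names, and the input 'xyz' is chosen so that no input character is listed twice in string1.

```shell
# 'ab' and '0-1' both expand to two characters, so [:lower:] and
# [:upper:] line up with each other even though the strings differ
# in length.
fail=0
LC_ALL=C tr 'ab[:lower:]' '0-1[:upper:]' < /dev/null || fail=1
out=$(echo 'xyz' | LC_ALL=C tr 'ab[:lower:]' '0-1[:upper:]') || fail=1
echo "fail=$fail out=$out"
```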
+
+ # Ensure we extend string2 appropriately
+ tr '[:upper:]- ' '[:lower:]_' < /dev/null || _fail=1
Seems non-portable to have a - in the middle, even though here the left
side is a character class instead of a byte. I think you'd better pick
a different character than -, or move the - to the end.
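A sketch of the second option, with the - moved to the end of string1 so it cannot be read as a range operator. This assumes GNU tr with the patch applied; the sample input is mine.

```shell
# string1: the upper-case class, a space, then a literal '-'.
# string2: the lower-case class, then '_', which GNU tr repeats to
# cover the extra character at the end of string1.
out=$(echo 'A B-C' | LC_ALL=C tr '[:upper:] -' '[:lower:]_')
echo "$out"
```

Both the space and the - should come out as _, giving 'a_b_c'.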
+
+ # Ensure the size of the case classes are accounted
+ # for as a unit.
+ echo 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' |
+ tr '[:upper:]A-B' '[:lower:]0' >out || _fail=1
+ echo '00cdefghijklmnopqrstuvwxyz' > exp
Huh? A and B are both in [:upper:]; when a character is listed more
than once in string1, it is only transliterated according to the first
listing. I think this should be 'abc...' not '00c...' for the expected
results.
+ # Ensure the size of the case classes are accounted
+ # for as a unit.
+ echo 'a' |
+ tr -t '[:lower:]a' '[:upper:]0' >out || _fail=1
+ echo '0' > exp
Likewise, this should be 'A' not '0', since 'a' is part of [:lower:].
+ # Ensure the size of the case classes are accounted
+ # for as a unit.
+ echo 'a' |
+ tr -t '[:lower:][:lower:]a' '[:lower:][:upper:]0' >out || _fail=1
+ echo '0' > exp
Here, 'a' rather than '0' (the leading [:lower:] in both strings means
that all lower-case letters are unchanged).
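Since all three of these hinge on which listing wins when a character appears twice in string1, it is worth probing the tr at hand before fixing the expected output. A minimal probe, with no assumption about the answer:

```shell
# 'a' is listed twice in string1: an 'x' in the output means the first
# listing was used, a 'y' means the second one was.
out=$(printf 'a\n' | LC_ALL=C tr 'aa' 'xy')
echo "this tr maps a to: $out"
```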
--
Eric Blake address@hidden +1-801-349-2682
Libvirt virtualization library http://libvirt.org