Re: multibyte support (round 4) - tr
From: Assaf Gordon
Subject: Re: multibyte support (round 4) - tr
Date: Sat, 23 Dec 2017 18:50:41 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0
Hello,
More progress on tr with multibyte support, available here:
https://files.housegordon.org/src/coreutils-multibyte-2017-12-23.patch.xz
Translation is (mostly) working:
$ echo abcdefg | ./src/tr 'abcd' 'αβγδ'
αβγδefg
$ echo '1234 ABCD ΨΔΩΣ *$%()' \
| ./src/tr -c '[:alpha:][:cntrl:]' 'Ψ'
ΨΨΨΨΨABCDΨΨΔΩΣΨΨΨΨΨΨ
$ echo 'αααββββ' | ./src/tr -s 'β' 'χ'
αααχ
$ echo 'aAbBcC ✀ χΧλΛσΣ' | ./src/tr '[:lower:]' '[:upper:]'
AABBCC ✀ ΧΧΛΛΣΣ
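For readers who want to reason about the examples above at the character level, here is a minimal sketch of translating on wide characters rather than bytes. It is not taken from the patch: the helper names xlate_wc and xlate_wcs are hypothetical, the sets are plain strings (no ranges or classes such as [:alpha:]), and a character repeated in SET1 simply takes its first match, which does not model tr's real handling of duplicates. A short SET2 is extended with its last character, as GNU tr does.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not from the patch): map one wide character
   through SET1 -> SET2.  Characters not in SET1 pass through unchanged.
   If SET2 is shorter than SET1, its last character is reused.  */
wchar_t
xlate_wc (wchar_t wc, const wchar_t *set1, const wchar_t *set2)
{
  size_t len2 = wcslen (set2);
  assert (len2 > 0);            /* SET2 must not be empty */

  /* wcschr would match the terminator for L'\0', so skip that case.  */
  const wchar_t *hit = (wc != L'\0') ? wcschr (set1, wc) : NULL;
  if (hit == NULL)
    return wc;

  size_t idx = (size_t) (hit - set1);
  return set2[idx < len2 ? idx : len2 - 1];
}

/* Translate a NUL-terminated wide string in place.  */
void
xlate_wcs (wchar_t *s, const wchar_t *set1, const wchar_t *set2)
{
  for (; *s != L'\0'; s++)
    *s = xlate_wc (*s, set1, set2);
}
```

Calling xlate_wcs on the wide string "abcdefg" with SET1 "abcd" and SET2 "αβγδ" leaves "efg" untouched and replaces the first four characters, mirroring the first example. Converting input bytes to wide characters and back (for instance with mbrtowc and wcrtomb under the active locale) is left out of this sketch.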
The current implementation could be a starting point for
testing and discussing specific edge-cases (some tests are already
included).
It is not tuned for efficiency, either in the implementation itself or
in run-time performance.
There's a lot of code duplication because the entire current
unibyte code path is kept intact.
Comments welcome.
- assaf