[3.0] UTF-8 and ${#var} or ${var: -1}
From: Stephane Chazelas
Subject: [3.0] UTF-8 and ${#var} or ${var: -1}
Date: Thu, 29 Jul 2004 16:44:41 +0100
User-agent: Mutt/1.5.6i
At
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_02
${#parameter}
String Length. The length in characters of the value
~~~~~~~~~~
of parameter shall be substituted.
If the parameter contains a single two-byte UTF-8 character,
${#parameter} returns 2 (the number of bytes) rather than 1:
bash-3.00$ uname -rsvm
SunOS 5.8 Generic_117000-05 sun4u
bash-3.00$ locale charmap
UTF-8
bash-3.00$ a=$(printf '%b' '\0303\0251')
bash-3.00$ [[ $a = ? ]] && echo yes
yes
bash-3.00$ echo ${#a}
2
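To make the difference between the two counts visible, here is a minimal sketch that prints both the length in characters (as the current locale sees it) and the length in bytes. The helper names char_len and byte_len are made up for this illustration, and the byte count relies on the assumption that in the C locale every byte is treated as one character. In a UTF-8 locale a shell that follows the quoted text would be expected to report 1 character for this value.

```shell
#!/usr/bin/env bash
# Sketch only: char_len and byte_len are hypothetical helper names,
# they are not part of the original report.

# Length in characters, as the shell counts it in the current locale.
char_len() {
  printf '%s\n' "${#1}"
}

# Length in bytes: the function body runs in a subshell where LC_ALL=C,
# so each byte is counted as one character (assumption stated above).
byte_len() (
  LC_ALL=C
  printf '%s\n' "${#1}"
)

a=$(printf '%b' '\0303\0251')   # the same two-byte UTF-8 value as above
echo "chars: $(char_len "$a")"
echo "bytes: $(byte_len "$a")"
```

If the two numbers are equal in a UTF-8 locale, the shell is counting bytes for ${#a}.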
There's also a problem with ${var: -<n>}:
bash-3.00$ a=AeB
bash-3.00$ printf %s "${a: -1}" | od -to1
0000000 102
0000001
bash-3.00$ a=$(printf '%b' 'A\0303\0251B')
bash-3.00$ printf %s "${a: -1}" | od -to1
0000000
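A possible way to get the last character without ${var: -1} is to go through pattern removal instead. This is only a sketch: last_char is a made-up helper name, and it assumes that ${var%?} handles multi-byte characters as well as the ${var#?} form does.

```shell
#!/usr/bin/env bash
# Sketch only: last_char is a hypothetical helper name.

# Print the last character of $1 using only pattern removal:
# ${1%?} is the value without its last character, and removing that
# prefix from the value leaves just the last character.
last_char() {
  printf '%s\n' "${1#"${1%?}"}"
}

a=$(printf '%b' 'A\0303\0251B')
last_char "$a" | od -to1
```

The inner expansion is quoted so that its result is taken literally and not as a pattern.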
It seems OK in other places (with $a set back to the single
two-byte character):
bash-3.00$ printf %s "${a#?}" | od -to1
0000000
bash-3.00$ case $a in ?) echo yes;; esac
yes
bash-3.00$ printf %4s "$a" | od -to1
0000000 040 040 040 303 251
0000005
bash-3.00$ a=$(printf '%b' 'A\0303\0251B')
bash-3.00$ printf %s "${a:2}" | od -to1
0000000 102
0000001
regards,
Stephane