Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects
From: Ralph Corderoy
Subject: Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects
Date: Tue, 17 Jun 2014 18:56:48 +0100
Hi Norm,
> So you are saying that "normal unix commands", such as grep, wc, tr
> etc, do or someday the GNU versions will, know about UTF-8, at least
> for file contents,
Yes, they do, today. And have done for quite a while. You need your
environment variables set up properly so `locale' reports UTF-8 (or
`utf8'). Then...
$ grep -i roman chars
Roman numerals Ⅰ Ⅱ Ⅲ Ⅳ Ⅴ Ⅵ Ⅶ Ⅷ Ⅸ Ⅹ Ⅺ Ⅻ Ⅼ Ⅽ Ⅾ Ⅿ
$ grep £ chars
Currency £ € cent-¢
$ grep -i roman chars | sed -r 's/.*(.)/\1/'
Ⅿ
$ grep -i roman chars | sed -r 's/.*(.)/\1/' | hd
00000000 e2 85 af 0a |....|
00000004
$
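The three bytes `e2 85 af' in that dump are the single character Ⅿ.
Counting it both ways shows the difference the locale makes. This is
only a sketch: the locale name `C.UTF-8' is an assumption and may not
be installed on your system, in which case the second count will also
be 3.

```shell
# U+216F (ROMAN NUMERAL ONE THOUSAND) spelled as its three UTF-8 bytes,
# in octal, so this script itself stays plain ASCII.
m=$(printf '\342\205\257')

# In the C locale every byte is one character, so wc -c sees three.
bytes=$(printf '%s' "$m" | LC_ALL=C wc -c | tr -d ' ')
echo "bytes: $bytes"

# In a UTF-8 locale wc -m decodes the bytes into one character.
# C.UTF-8 is an assumed locale name; substitute one that `locale -a' lists.
chars=$(printf '%s' "$m" | LC_ALL=C.UTF-8 wc -m | tr -d ' ')
echo "chars: $chars"
```

The byte count is the same everywhere; only the character count
depends on what the environment says the bytes mean.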
> if not for file names?
The Unix kernel stores filenames as a run of bytes, not including `/'
and NUL. It places no interpretation on them itself. Userspace is able
to do so, but two users might see different names for the same file just
as they might `see' the same text file differently if they think the
bytes represent different encodings.
$ >pound-£
$ ls
pound-£
$ LC_ALL=C ls
pound-??
$
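To see the bytes the kernel actually stored, independent of any locale,
you can dump the name that a glob hands back. A rough sketch, assuming
`mktemp -d' is available (it is not in POSIX, though most systems have
it); the scratch directory is just for illustration.

```shell
# Work in an empty scratch directory so the glob matches one file only.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# The pound sign is U+00A3, which UTF-8 encodes as the two bytes c2 a3.
name=$(printf 'pound-\302\243')
: >"$name"

# The shell passes the name through as raw bytes; od shows them in hex.
hex=$(printf '%s' * | od -An -tx1 | tr -d ' \n')
echo "$hex"

# Tidy up the scratch directory.
cd / && rm -rf -- "$dir"
```

The last two bytes, c2 and a3, are what `LC_ALL=C ls' prints as the two
question marks.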
But really, these days, the whole world is UTF-8. Unless it's Microsoft
with their backwards backwards-compatibility view of the world, and no
one cares about them.
Cheers, Ralph.