Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects

From:	Ralph Corderoy
Subject:	Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects
Date:	Tue, 17 Jun 2014 18:56:48 +0100

Hi Norm,

> So you are saying that "normal unix commands", such as grep, wc, tr
> etc, do or someday the GNU versions will, know about UTF-8, at least
> for file contents,

Yes, they do, today.  And have done for quite a while.  You need your
environment variables (LANG, LC_ALL, or LC_CTYPE) set up properly so
`locale' reports UTF-8 (or `utf8').  Then...

    $ grep -i roman chars
    Roman numerals Ⅰ Ⅱ Ⅲ Ⅳ Ⅴ Ⅵ Ⅶ Ⅷ Ⅸ Ⅹ Ⅺ Ⅻ Ⅼ Ⅽ Ⅾ Ⅿ
    $ grep £ chars
    Currency £ € cent-¢
    $ grep -i roman chars | sed -r 's/.*(.)/\1/'
    Ⅿ
    $ grep -i roman chars | sed -r 's/.*(.)/\1/' | hd
    00000000  e2 85 af 0a                                       |....|
    00000004
    $ 
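
The byte-versus-character distinction in the transcript above can be
sketched as follows.  This is only an illustration: the variable names
are made up here, and it assumes a locale called C.UTF-8 is installed,
which is common on current Linux systems but not guaranteed.  If that
locale is missing, `wc -m' typically falls back to counting bytes.

```shell
# U+216F (the Roman numeral printed by the sed command above), written as
# octal escapes so the script does not depend on this file's own encoding.
m=$(printf '\342\205\257')

# Byte count: independent of locale, and matches the hd output e2 85 af.
bytes=$(printf '%s' "$m" | LC_ALL=C wc -c | tr -d ' ')

# Character count: assumes the C.UTF-8 locale exists (an assumption, not
# something the original message states).
chars=$(printf '%s' "$m" | LC_ALL=C.UTF-8 wc -m | tr -d ' ')

echo "bytes=$bytes chars=$chars"
```

The same tool gives a different answer depending only on the locale it
runs under, which is why the environment variables have to be right
before grep, sed and friends treat the input as UTF-8.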

> if not for file names?

The Unix kernel stores filenames as a run of bytes, any bytes except `/'
and NUL.  It places no interpretation on them itself.  Userspace is able
to do so, but two users might see different names for the same file just
as they might `see' the same text file differently if they think the
bytes represent different encodings.

    $ >pound-£
    $ ls
    pound-£
    $ LC_ALL=C ls
    pound-??
    $ 
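
A small sketch of the same point, showing the bytes actually stored in
the name.  The temporary directory and variable names are illustrative
only, and it assumes `mktemp -d' and `od' are available, as they are on
typical Linux and BSD systems.

```shell
# Work in a scratch directory so nothing else is touched.
dir=$(mktemp -d)

# "pound-" followed by U+00A3 encoded as UTF-8 (bytes c2 a3), given as
# octal escapes so the script does not depend on this file's encoding.
name=$(printf 'pound-\302\243')
: >"$dir/$name"

# When its output is a pipe rather than a terminal, ls writes the name's
# raw bytes, so od can show what is really stored in the directory.
hex=$(LC_ALL=C ls "$dir" | tr -d '\n' | od -An -tx1 | tr -d ' \n')

rm -r "$dir"
echo "$hex"
```

The two bytes after the hyphen are what `LC_ALL=C ls' renders as `??'
on a terminal: in the C locale neither byte is a printable character,
so each is replaced separately.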

But really, these days, the whole world is UTF-8.  Unless it's Microsoft
with their backwards backwards-compatibility view of the world, and no
one cares about them.

Cheers, Ralph.
