Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects


From: Ken Hornstein
Subject: Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects
Date: Tue, 17 Jun 2014 14:23:57 -0400

>So you are saying that "normal unix commands", such as grep, wc, tr
>etc, do or someday the GNU versions will, know about UTF-8, at least
>for file contents, if not for file names?

Ralph and Jerrad answered you already, but let me expand on this a bit.

There's an implicit assumption in nmh that messages in the message store
are valid RFC 5322 messages and can always be treated as such (see
dist and forw, for starters).

People will point out that mhfixmsg transforms messages into more easily
grep-able forms ... but it does this while still keeping the message in
RFC 5322 format.  This is relatively straightforward.  But there is no
standardized way of storing 8-bit characters in message headers (like I
said before, I'm discounting message/global, which we don't handle and
I think few others do as well).  So we don't really have a way of having
messages available in a format that's easy to deal with using traditional
Unix tools (the encoding used in message headers is not like the encoding
used elsewhere, so it requires special handling and has special semantics).
So you need to have a tool that's smarter, or make some serious changes to
the message store.
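
To make the header problem concrete, here is a minimal sketch in Python
(not nmh code).  It assumes the header uses RFC 2047 encoded-words, which
is the usual way non-ASCII text ends up in a Subject line; the sample
subject and the decode_subject helper are made up for illustration.  A
plain substring match against the stored bytes, which is roughly what grep
does, misses text that a decoding-aware tool finds.

```python
from email.header import decode_header, make_header

# Hypothetical Subject value as it might be stored on disk: an RFC 2047
# encoded-word using the "Q" encoding (note "_" stands for a space).
raw_subject = "=?UTF-8?Q?Caf=C3=A9_menu?="

def decode_subject(value):
    """Decode any RFC 2047 encoded-words in a header value to text."""
    return str(make_header(decode_header(value)))

decoded = decode_subject(raw_subject)
needle = "Caf\u00e9"  # "Cafe" with an acute accent on the e

# A grep-style match on the stored form does not see the word ...
print(needle in raw_subject)
# ... while a match on the decoded form does.
print(needle in decoded)
```

The stored form is still plain ASCII, which is why the message stays a
valid RFC 5322 message, and also why a byte-oriented tool can't search it
for the text a reader actually sees.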

Luckily, we already have a tool; it's called pick :-)

--Ken


